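Abstract (translated from Korean)

This study presents a solution to the problem of classifying spam email using data mining. With the spread of the Internet and computerization, more individuals and organizations use email, which has greatly improved convenience in daily life and efficiency in business. On the other hand, this change has also brought spam email, which causes many problems. Spam email is email delivered to unwilling recipients for advertising or malicious purposes. For individuals, spam causes confusion and can harm their computers; for businesses, it disrupts important work. Many methods for routing such email to a spam folder have been studied, and data mining has shown excellent results, but the need for an effective method remains high. When the available data are incomplete, data mining techniques are not easy to apply. In particular, when class information is available for only one class or is uncertain, general data mining techniques have difficulty finding a classification model. This paper uses PU learning to address this case, with a Support Vector Machine (SVM) as the base data mining technique. The experimental results show that the proposed methodology can produce a good classification model for spam email.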

Multilingual Abstract

This paper proposes a classification model for spam email based on data mining. Personal and business use of email has increased with the growth of the Internet population and computerization, making daily life more convenient and business more efficient. On the other hand, spam email causes many problems. Spam email is email sent to recipients who do not want it; it causes annoyance, can carry computer viruses, and interrupts business processes. Although many studies have proposed ways to protect against spam email, an efficient classification method is still needed. Data mining is a prominent approach to classifying spam email. However, when the data do not carry sufficient information, traditional data mining methods may not be applicable. We therefore propose a PU learning algorithm for classification with insufficient data that consist only of positive-class examples and unlabeled examples. A support vector machine (SVM) is used as the base data mining method. Experimental results show the viability of the proposed classification model.
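
The abstract does not say which PU learning variant the authors used, so the following is only a minimal sketch of one common formulation, often called a biased SVM: every unlabeled email is treated as a tentative negative, and errors on the labeled positives are penalized more heavily than errors on the unlabeled set. Everything here is an assumption made for illustration: the synthetic two-dimensional features, the cluster centers, the 5:1 penalty ratio, the hand-written subgradient trainer, and all function names.

```python
import random

def train_weighted_linear_svm(X, y, sample_weight, lam=0.001, epochs=200, lr0=0.05, seed=0):
    """Linear SVM fitted by stochastic subgradient descent on a weighted hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for epoch in range(epochs):
        lr = lr0 / (1 + epoch)  # decaying step size
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # L2 regularization shrinks the weights on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            # Hinge loss is active only when the margin is below 1.
            if y[i] * score < 1:
                step = lr * sample_weight[i] * y[i]
                w = [wj + step * xj for wj, xj in zip(w, X[i])]
                b += step
    return w, b

def predict(w, b, x):
    """Return +1 (spam) or -1 (not spam) for one feature vector."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

def make_cluster(rng, center, n, std=0.7):
    """Draw n synthetic feature vectors around a center (illustrative data only)."""
    return [[rng.gauss(c, std) for c in center] for _ in range(n)]

rng = random.Random(42)
SPAM_CENTER, HAM_CENTER = (3.0, 3.0), (0.0, 0.0)  # hypothetical feature space

# PU setting: a few emails are known to be spam; the rest carry no label.
labeled_spam = make_cluster(rng, SPAM_CENTER, 30)
unlabeled = make_cluster(rng, SPAM_CENTER, 30) + make_cluster(rng, HAM_CENTER, 60)

X_train = labeled_spam + unlabeled
# Unlabeled emails are provisionally labeled -1, with a smaller penalty,
# because some of them are actually spam.
y_train = [1] * len(labeled_spam) + [-1] * len(unlabeled)
weights = [5.0] * len(labeled_spam) + [1.0] * len(unlabeled)

w, b = train_weighted_linear_svm(X_train, y_train, weights)

# Evaluate on fresh synthetic data whose true classes are known.
test_spam = make_cluster(rng, SPAM_CENTER, 100)
test_ham = make_cluster(rng, HAM_CENTER, 100)
correct = sum(predict(w, b, x) == 1 for x in test_spam)
correct += sum(predict(w, b, x) == -1 for x in test_ham)
accuracy = correct / (len(test_spam) + len(test_ham))
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The asymmetric penalty is the point of the sketch: hidden spam in the unlabeled set pulls the boundary toward the spam cluster, and the heavier weight on known spam pulls it back. In practice the penalty ratio would have to be tuned, and real email features would replace the synthetic clusters.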


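Date	History type	Details
2020	Evaluation scheduled	Eligible to apply for new evaluation (new evaluation)
2019-12-01	Evaluation	Removed from KCI-listed status (other)
2019-01-01	Evaluation	KCI-listed journal status maintained (continuing evaluation)
2016-01-01	Evaluation	Selected as KCI-listed journal (continuing evaluation)
2014-01-01	Evaluation	Selected as KCI candidate journal (new evaluation)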

Base year	WOS-KCI combined IF (2-year)	KCIF (2-year)	KCIF (3-year)	KCIF (4-year)	KCIF (5-year)	Centrality index (3-year)	Immediacy index
2016	0.33	0.33	0.32	0.33	0.32	0.407	0.14

스팸 메일 분류를 위한 데이터 마이닝 응용 = Data Mining for Spam Email Classification
