The Spam Detection Model for Web Forums using Text Mining Techniques | RISS Detail View

텍스트 마이닝을 이용한 웹 포럼 불량글 탐지 모델 = The Spam Detection Model for Web Forums using Text Mining Techniques

https://www.riss.kr/link?id=A101574285

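Author: 우지영 (Korea University)
Publisher: Korea Knowledge Information Technology Society (한국지식정보기술학회)
Journal: Journal of Knowledge Information Technology and Systems (한국지식정보기술학회 논문지)
Volume/Issue: Vol.7 No.1 [2012]
Year of publication: 2012
Language: Korean
Keywords: Web forum ; Social media ; Spam ; Posting quality ; Text mining
Indexing status: KCI-listed
Material type: Academic journal
Pages: 159-166 (8 pages)
KCI citation count: 1
Provider: KCI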

Multilingual Abstract

Spam in web discussion forums inconveniences users and lowers the value of a forum as an open source of user opinion. Because the importance of postings is evaluated by the number of authors involved, spam distorts opinion analysis by adding unnecessary data. We propose a model that automatically detects spam postings in web forums. We extract text features from posting contents using text mining techniques grounded in linguistics, then apply supervised learning to distinguish spam from normal postings. Significant features are derived through the learning process, and the automatic detection model is built on those features. To obtain labeled data for building the model, four evaluators were asked to identify spam postings in advance. We adopted Naive Bayes, Support Vector Machine (SVM), and decision tree classifiers, which are known to perform well in data and text mining tasks. The resulting model extracts the text features that identify spam and automatically detects newly posted spam. We apply the proposed model to the YahooFinance-Walmart forum, which is the world's largest Walmart-related web forum.
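The supervised-learning step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes plain bag-of-words tokens in place of the paper's linguistic features, uses only one of the three named classifier families (Naive Bayes, here with add-one smoothing), and trains on a handful of invented postings. The class name `NaiveBayesSpamFilter`, the `tokenize` helper, and the `TRAIN` data are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real feature extraction)."""
    return text.lower().split()


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, postings, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(postings, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        self.total_docs = len(labels)
        return self

    def log_score(self, text, label):
        # log P(label) + sum over known tokens of log P(token | label)
        score = math.log(self.doc_counts[label] / self.total_docs)
        counts = self.word_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        for token in tokenize(text):
            if token in self.vocab:  # skip words never seen in training
                score += math.log((counts[token] + 1) / denom)
        return score

    def predict(self, text):
        return max(self.classes, key=lambda c: self.log_score(text, c))


# Hypothetical hand-labelled postings, invented for illustration only.
TRAIN = [
    ("buy cheap pills online now", "spam"),
    ("click here for free money now", "spam"),
    ("free offer click now", "spam"),
    ("walmart earnings beat expectations this quarter", "normal"),
    ("i think the stock will rise after earnings", "normal"),
    ("store traffic looked weak this quarter", "normal"),
]

model = NaiveBayesSpamFilter().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(model.predict("free pills click here"))
print(model.predict("earnings this quarter looked weak"))
```

In this toy setting the human-assigned labels play the role of the four evaluators' judgments. A realistic experiment would need far more data, held-out evaluation, and the feature set reported in the paper itself.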

References

1 Hayati P., "Toward spam 2.0: An evaluation of Web 2.0 anti-spam methods" Industrial Informatics 875-880, 2009

2 Buckland M., "The relationship between Recall and Precision" 45 (45): 12-19, 1999

3 Gruhl D., "The predictive power of online chatter" KDD 78-87, 2005

4 Vapnik VN., "The nature of statistical learning theory" Springer-Verlag 1995

5 Gillin P., "The New Influencers, A Marketer’s Guide to the New Social Media" Quill Driver Books/Word Dancer Press 2007

6 Robert F., "Syntax. Critical Concepts in Linguistics" Routledge 2006

7 Dunning T., "Statistical Identification of Language" New Mexico State University 94-273, 1994

8 Lin Y., "Splog detection using self-similarity analysis on blog temporal dynamics" 2007

9 Jindal N., "Opinion Spam and Analysis" WSDM’08 2008

10 Lewis D., "Naive (Bayes) at forty: The independence assumption in information retrieval" Machine Learning 4-15, 1998

11 Morinaga S., "Mining product reputations on the Web" 341 : 2002

12 Zinman A., "Is Britney Spears spam" 2007

13 Quinlan JR., "Induction of decision trees" Machine Learning

14 Benevenuto F., "Identifying Video Spammers in Online Social Networks" AIRWeb 2008

15 Gwet K., "Handbook of Inter-Rater Reliability (Second Edition)" ISBN 2010

16 Sampson S., "Gathering customer feedback via the Internet: instruments and prospects" 98 (98): 71-, 1998

17 Glance N., "Deriving Marketing Intelligence from Online Discussion" KDD 2005

18 Han S., "Collaborative blog spam filtering using adaptive percolation search" WWW 2006

19 Mishne G., "Blocking Blog Spam with Language Model Disagreement" AIRWeb 2005

20 Wanas N., "Automatic Scoring of Online Discussion Posts" WICOW 2008

21 Paul K., "Analyzing Grammar: An Introduction" Cambridge University Press 35-, 2005

22 Wenger A., "Analysis of travel bloggers' characteristics and their communication about Austria as a tourism destination" 14 (14): 2008

23 Liu Y., "ARSA: A Sentiment-Aware Model for Predicting Sales Performance Using Blogs" SIGIR 2007

24 Niu Y., "A Quantitative Study of Forum Spamming Using Context-based Analysis" 2007

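Journal history

Date	Type	Details	Listing status
2028	Evaluation scheduled	Eligible to apply for re-accreditation evaluation (re-accreditation)
2022-01-01	Evaluation	Registered journal status maintained (re-accreditation)
2019-04-09	Society name change	English name: unregistered -> Korea Knowledge Information Technology Society
2019-01-01	Evaluation	Registered journal status maintained (continuing evaluation)
2016-01-01	Evaluation	Registered journal status maintained (continuing evaluation)
2014-03-17	Journal name change	Foreign-language name: Journal of The Korea Knowledge Information Technology Society -> Journal of Knowledge Information Technology and Systems
2012-01-01	Evaluation	Selected as registered journal (registration candidate, 2nd round)
2011-01-01	Evaluation	Registration candidate 1st round PASS (registration candidate, 1st round)
2009-01-01	Evaluation	Selected as registration-candidate journal (new evaluation)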

Journal citation information

Base year	WOS-KCI combined IF (2-year)	KCIF (2-year)	KCIF (3-year)	KCIF (4-year)	KCIF (5-year)	Centrality index (3-year)	Immediacy index
2016	0.39	0.39	0.29	0.25	0.22	0.312	0.07
