Korean Word Sentiment Score Calculation via Word Embedding and Graph-Based Semi-Supervised Learning
Deokseong Seo, Kyoung Hyun Mo, Jaesun Park, Gichang Lee, Pilsung Kang. Korean Institute of Industrial Engineers, 2017, Journal of the Korean Institute of Industrial Engineers, Vol.43 No.5
Sentiment analysis plays an important role in both the public and private sectors in understanding consumers’ responses to products or voters’ reactions to policies. One of its key success factors is building an appropriate sentiment word dictionary. Most existing approaches rely heavily on either the knowledge of domain experts or word co-occurrence statistics; the former is inefficient and costly, while the latter suffers from incomplete data. To resolve these shortcomings, we propose a new domain-specific Korean word sentiment score evaluation method based on word embedding and graph-based semi-supervised learning. First, words are embedded in a lower-dimensional space using the Word2Vec technique. Next, a word relation graph is constructed based on the similarity between words in the embedding space. We then assign sentiments to approximately 1% of the words, using indicators such as centrality measures. The sentiment scores of the remaining unlabeled words are assigned automatically by label propagation, a semi-supervised learning technique. To verify the proposed method, we collected 1.98 million review comments from three movie review websites. Experimental results show that the proposed method achieves about 93% accuracy in polarity classification.
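The pipeline described in this abstract (similarity graph over embedded words, a few seed labels, label propagation) can be illustrated with a minimal sketch. This is not the authors' implementation. The toy two-dimensional vectors stand in for Word2Vec embeddings, and the function names `cosine` and `propagate`, the similarity threshold, the seed words, and the iteration count are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propagate(vectors, seeds, threshold=0.5, iterations=50):
    """Spread seed sentiment scores over a word similarity graph.

    vectors: word -> embedding (hypothetical toy data here)
    seeds:   word -> fixed sentiment score for the few labeled words
    """
    words = list(vectors)
    # Build the graph: connect two words when their similarity passes the threshold.
    neighbors = {w: {} for w in words}
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                neighbors[a][b] = sim
                neighbors[b][a] = sim

    scores = {w: seeds.get(w, 0.0) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            if w in seeds:
                updated[w] = seeds[w]  # labeled words stay clamped
                continue
            total = sum(neighbors[w].values())
            # Unlabeled word: similarity-weighted mean of its neighbors' scores.
            updated[w] = (
                sum(sim * scores[n] for n, sim in neighbors[w].items()) / total
                if total
                else 0.0
            )
        scores = updated
    return scores


# Toy embeddings (assumed): positive words point one way, negative words another.
toy_vectors = {
    "great": [0.9, 0.1],
    "good": [0.8, 0.3],
    "fun": [0.7, 0.2],
    "boring": [0.1, 0.9],
    "awful": [0.2, 0.8],
    "dull": [0.15, 0.95],
}
toy_seeds = {"great": 1.0, "awful": -1.0}
scores = propagate(toy_vectors, toy_seeds, threshold=0.9)
```

In this toy setup the unlabeled words inherit the polarity of the seed they are connected to. A real application would replace `toy_vectors` with trained embeddings and would need a principled way to choose the threshold and the seed words.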
Extraction and Evaluation of Tourist Attraction Satisfaction Factors Using Travel Site Reviews
Suhyoun Cho, Boseop Kim, Minsik Park, Gichang Lee, Pilsung Kang. Korean Institute of Industrial Engineers, 2017, Journal of the Korean Institute of Industrial Engineers, Vol.43 No.1
To attract foreign tourists, it is important to understand which factors of domestic tourist spots visitors consider critical and how they evaluate those factors after a visit. However, most research on the tourism business has collected information by surveying a small number of tourists, which leads to inaccurate and biased conclusions. In this paper, we propose a data-driven methodology to identify tourists’ satisfaction factors and estimate sentiment scores for them. To do so, we collected review comments from a popular website. Latent Dirichlet allocation is employed to extract key factors, and elastic net is used to estimate sentiment scores. An aggregated evaluation score is then generated by combining the factors and the per-topic sentiment scores. The proposed method can be used to recommend themed travel schedules and to discover new spots.
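The abstract does not state how the factors and per-topic sentiment scores are combined. One plausible reading, shown here purely as an assumption, is a weighted average in which each topic's sentiment score is weighted by that topic's share of the reviews for a spot. The function name `aggregate_score`, the topic names, and all numbers below are hypothetical.

```python
def aggregate_score(topic_weights, topic_sentiments):
    """Weighted average of per-topic sentiment scores.

    topic_weights:    topic -> proportion of a spot's reviews about that topic
    topic_sentiments: topic -> estimated sentiment score for that topic
    Both inputs are assumed to come from earlier steps (topic model, regression).
    """
    total = sum(topic_weights.values())
    if total == 0:
        return 0.0
    weighted = sum(
        weight * topic_sentiments.get(topic, 0.0)
        for topic, weight in topic_weights.items()
    )
    return weighted / total


# Hypothetical values for a single tourist spot.
weights = {"scenery": 0.5, "food": 0.3, "transport": 0.2}
sentiments = {"scenery": 0.8, "food": 0.2, "transport": -0.5}
spot_score = aggregate_score(weights, sentiments)
```

Normalizing by the total weight keeps scores comparable across spots even when the topic proportions do not sum to one.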