A Study of Modality-Based Sentiment Analysis Using the Korean Sentiment Analysis Corpus
Hyopil Shin, Munhyong Kim, Suzi Park. The Linguistic Society of Korea, 2016, Eoneohag (언어학), Vol.0 No.74
This study develops a practical application of the language resources in the Korean Sentiment Analysis Corpus (KOSAC) for sentiment analysis research. To that end, we extracted annotated expressions from KOSAC according to their sentiment properties and probabilistic factors, and refined them into a resource for sentiment analysis. The study attempts to move beyond the simple calculation methods of previous research, which depend on the distribution of lexical polarity items. To support more sophisticated sentiment analysis, we also introduce pragmatic information, including modality: we cataloged expressions that carry pragmatic information about the speaker's attitude, based on their relative probability in KOSAC. We then apply this new resource to subjectivity analysis, where it yields an accuracy improvement of around 6%. This result indicates that, in addition to polarity items, this type of research needs to include a variety of aspects and lexical information. Moreover, extracting sentiment expressions according to their semantic and pragmatic properties not only shows an additional use of KOSAC, but also establishes a new resource in the field of sentiment analysis.
KOLON (the KOrean Lexicon mapped onto the Mikrokosmos ONtology): Mapping Korean Words onto the Mikrokosmos Ontology and Combining Language Resources
Hyopil Shin. The Linguistic Society of Korea, 2010, Eoneohag (언어학), Vol.0 No.56
KOLON (the KOrean Lexicon mapped onto the Mikrokosmos ONtology) is the output of our work on mapping Korean words onto the Mikrokosmos ontology with a view to building a Wordnet for Korean. Unlike other Wordnet-related resources, KOLON aims to take full advantage of the properties of a concept, represented in a frame, by letting lexical items inherit them. We have mapped 24,858 Korean words so far, consisting of 7,386 nouns, 13,397 verbs and 4,075 adjectives. Since we keep adding lexical items and cleaning up the original mappings, these numbers are subject to change. Synonyms are grouped together for each concept. The main difference between KOLON and other Korean Wordnet-related resources in terms of synonyms is granularity. While other resources offer fine-grained and very restricted synonym sets, mapping Korean words onto the Mikrokosmos ontology results in synonym sets with wide coverage, because from a cognitive perspective a single concept can cover many lexical items. I described the mapping procedure for each part of speech and pointed out the strengths and weaknesses of the work. I also compared KOLON with another Korean Wordnet, KorLex, and showed the ideological differences between the two efforts. I contend that the work described here can be a useful resource for natural language processing and theoretical linguistic research. All the information and up-to-date lexical items can be checked on the website, http://word.snu.ac.kr/kolon.
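The idea of lexical items inheriting frame properties from a concept, and of synonyms being grouped per concept, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Concept` class, the concept names, the property names, and the three word mappings are hypothetical and are not taken from KOLON or from the Mikrokosmos ontology itself.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A hypothetical ontology concept represented as a frame of properties."""
    name: str
    parent: Optional["Concept"] = None
    properties: dict = field(default_factory=dict)

    def all_properties(self):
        """Collect properties from ancestors first, so a child overrides its parents."""
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}


# Hypothetical concepts and word mappings, for illustration only.
event = Concept("EVENT", properties={"has-agent": True})
ingest = Concept("INGEST", parent=event, properties={"theme": "FOOD"})

word_to_concept = {"먹다": ingest, "섭취하다": ingest, "일어나다": event}


def synonyms(word):
    """Words mapped onto the same concept are treated as synonyms of each other."""
    concept = word_to_concept[word]
    return sorted(w for w, c in word_to_concept.items() if c is concept and w != word)


def word_properties(word):
    """A word inherits every property of the concept it is mapped onto."""
    return word_to_concept[word].all_properties()
```

Because synonymy is decided at the level of the concept rather than the individual word sense, a coarse concept naturally yields a broad synonym set, which is the granularity difference the abstract describes.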
A Movie Review Rating System Using a Sentiment-Word Appraisal Dictionary and Semantic-Segment Computation
Minsu Ko, Hyopil Shin. Korean Society for Cognitive Science, 2010, Korean Journal of Cognitive Science (인지과학), Vol.21 No.4
Assuming that the overall meaning of a document is a composition of the meanings of its parts, this paper proposes a system based on a sentiment-word appraisal dictionary that was constructed semi-automatically in advance. ARSSA (the Automatic Rating System for Sentiment analysis using an Appraisal dictionary) is an effort to model decision making in a manner similar to human decision making. It studies the automatic grading of movie reviews that contain sentiment expressions by computing a semantic value for each review and classifying the data. The aim is to resolve the discontinuity between the two variables, the numerical rating and the review text, in the binary {rating: review} structure of the current rating format. This can be realized through a composition function over the abstract meanings reflected in the process of lexical meaning composition. The system's performance was evaluated with a 10-fold cross-validation experiment on 1000 reviews obtained from the Naver Movie site. Compared against the ratings originally assigned to the reviews, the system achieved an F1 score of 85% when using the appraisal dictionary.
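The evaluation protocol mentioned in the abstract, 10-fold cross-validation over 1000 reviews, can be sketched as follows. This is only an illustration of the generic splitting scheme, not the ARSSA implementation; the function name `k_fold_indices` is hypothetical, and the sketch assumes the reviews have already been shuffled.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train_idx, test_idx) pairs so that each item is tested exactly once."""
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_items))
        yield train_idx, test_idx
        start += size


# With 1000 reviews and k=10, each fold tests on 100 reviews and trains on 900.
folds = list(k_fold_indices(1000, k=10))
```

The reported score would then be an aggregate over the ten held-out folds, so every review contributes to the evaluation exactly once.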
Youngsam Kim, Hyopil Shin. Korean Institute of Information Scientists and Engineers (KIISE), 2018, Journal of KIISE (정보과학회논문지), Vol.45 No.12
Temporal-difference (TD) learning is a core algorithm of reinforcement learning that provides a useful way to estimate the values of states online in a Markov chain model. The Markov model used in TD methods applies a discount factor, so that states closer to the point at which a reward is given receive more weight for that reward. In this paper, we estimate the semantic orientation (sentiment value) of words in texts using TD-based methods, and we aim to show that TD learning is relatively more useful for this task than existing methods based on Bayes probabilities. The reason is that TD learning is inherently incremental, and the discount factor makes it possible to adjust the weight of the sentiment value assigned to each word. Using movie review data, we examined the effectiveness of the proposed methods both indirectly, by comparing them to existing feature selection methods, and directly, by comparing them to Bayes probabilities. TD-based estimation would be useful for social opinion mining tasks, since TD learning is inherently an online method. To show that the approach scales to large data, we also verified that its effectiveness is maintained under asynchronous parallel processing.
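To make the role of the discount factor concrete, here is a minimal TD(0) sketch for word-level sentiment values. It is not the authors' implementation: the function name `td0_word_values`, the reward scheme (a reward of +1 or -1 delivered after the last word of a review), the step size `alpha`, and the toy reviews are all assumptions made for illustration.

```python
from collections import defaultdict


def td0_word_values(reviews, alpha=0.1, gamma=0.9, epochs=20):
    """Estimate a value per word with TD(0).

    reviews: list of (tokens, reward) pairs. Each review is one episode, each
    word is a state, and the reward arrives only after the last word.
    """
    values = defaultdict(float)
    for _ in range(epochs):
        for tokens, reward in reviews:
            for i, word in enumerate(tokens):
                last = i == len(tokens) - 1
                r = reward if last else 0.0
                # The terminal state after the last word has value 0.
                next_value = 0.0 if last else values[tokens[i + 1]]
                td_error = r + gamma * next_value - values[word]
                values[word] += alpha * td_error
    return dict(values)


# Hypothetical toy data: two reviews with opposite labels.
toy_reviews = [
    (["plot", "was", "great"], 1.0),
    (["acting", "was", "awful"], -1.0),
]
values = td0_word_values(toy_reviews)
```

In this toy setting, words that sit right before the reward pick up most of its value, while a word such as "was" that appears in both positive and negative reviews stays close to zero. Lowering `gamma` concentrates the value even more strongly on the words nearest the reward, which is the weighting effect described above.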