Korean Coreference Resolution Using FastText and Self-Matching Attention-Based Pointer Networks
박천음(Cheoneum Park), 황현선(Hyunsun Hwang), 이창기(Changki Lee), 김현기(Hyunki Kim). Korean Institute of Information Scientists and Engineers, 2018. KIISE Transactions on Computing Practices, Vol.24 No.12
The self-matching attention mechanism computes alignment scores of a sequence against itself: the two sequences fed to the attention mechanism are the same sequence, so similar words within a given sequence receive higher alignment scores, which can help coreference resolution. FastText learns word representations by splitting each input word into character (syllable-level) n-grams, which makes it well suited to handling heavily inflected words and unknown words that are missing from the vocabulary. In this paper, we apply pointer networks based on the self-matching attention mechanism to coreference resolution, and we propose using word embeddings pre-trained with FastText to address the unknown-word problem in coreference resolution. Experimental results show that, among the proposed models, the Self_att_ffnn model achieved a CoNLL F1 (test) of 73.55% and the Self_att_gru4 model achieved a CoNLL F1 (test) of 73.60%, improvements of 2.72% and 1.52%, respectively, over a plain pointer network.
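As a rough illustration of the two ingredients above, here is a minimal NumPy sketch; the bilinear scoring form, the n-gram width, and all shapes are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def char_ngrams(word, n=3):
        # FastText-style subword units: pad with boundary symbols, then slide.
        w = "<" + word + ">"
        return [w[i:i + n] for i in range(len(w) - n + 1)]

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    def self_matching_attention(H, W):
        # H: (n, d) encoder states for one sequence; W: (d, d) learned weight.
        scores = H @ W @ H.T          # (n, n): the sequence aligned with itself
        A = softmax(scores, axis=-1)  # row i: attention of word i over all words
        return A @ H                  # (n, d) self-matched context vectors

    rng = np.random.default_rng(0)
    H = rng.normal(size=(6, 8))       # 6 words, hidden size 8
    C = self_matching_attention(H, rng.normal(size=(8, 8)))
    print(char_ngrams("상호참조"), C.shape)

Because every row of the score matrix compares one word with all words of the same sequence, tokens with similar representations (e.g., a mention and its later repetition) naturally receive the higher alignment scores the abstract refers to.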
박천음(Cheoneum Park), 이창기(Changki Lee). Korean Institute of Information Scientists and Engineers, 2017. Journal of KIISE, Vol.44 No.5
Pointer Networks is a deep-learning model based on a recurrent neural network (RNN) that uses an attention mechanism to output a list of elements corresponding to positions in the input sequence. Coreference resolution for pronouns is the natural language processing (NLP) task of forming a single entity by finding the antecedents that correspond to the pronouns in a document. In this paper, a pronoun coreference resolution method that finds the relation between antecedents and pronouns using Pointer Networks is proposed; furthermore, input methods for Pointer Networks, that is, chaining orders between the words in an entity, are proposed. Among the proposed methods, the chaining order Coref2 showed the best performance, with a MUC F1 of 81.40%. For Korean pronouns, this is 31.00%p and 19.28%p higher than the rule-based (50.40%) and statistics-based (62.12%) coreference resolution systems, respectively.
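For reference, a minimal sketch of the pointing step described above: a decoder state attends over the encoder states, and the output is an input position rather than a vocabulary item. The additive scoring form, the random parameters, and using the pronoun's encoder state as the decoder query are assumptions for illustration.

    import numpy as np

    def point(enc, dec, W1, W2, v):
        # enc: (n, d) encoder states; dec: (d,) decoder state.
        u = np.tanh(enc @ W1 + dec @ W2) @ v      # (n,) additive alignment scores
        a = np.exp(u - u.max()); a = a / a.sum()  # softmax over input positions
        return a, int(a.argmax())                 # the "pointer" is an input index

    rng = np.random.default_rng(0)
    n, d = 5, 8                                   # e.g., 5 words in a document
    enc = rng.normal(size=(n, d))
    dec = enc[3]                                  # e.g., the state at a pronoun
    W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    a, idx = point(enc, dec, W1, W2, v=rng.normal(size=d))
    print(idx)                                    # position pointed to as the antecedent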
박천음(Cheoneum Park), 이창기(Changki Lee). Korean Institute of Information Scientists and Engineers, 2017. Journal of KIISE, Vol.44 No.8
A mention takes a noun or noun phrase as its head and forms a chunk of text that carries a self-contained meaning, including any modifiers. Mention detection is the task of extracting such mentions from a document, and coreference resolution is the task of determining which of the detected mentions refer to the same entity. A pointer network is a model based on a recurrent neural network (RNN) encoder-decoder that outputs a list of elements corresponding to positions in the input sequence. In this paper, we propose mention detection using pointer networks. By applying a pointer network to mention detection, the proposed model can detect overlapped mentions, a problem that sequence labeling cannot solve. In our experiments, the proposed mention detection model achieved an F1 of 80.07%, 7.65%p higher than rule-based mention detection; coreference resolution using this mention detection model achieved a CoNLL F1 of 52.67% (mention boundary) and a CoNLL F1 of 60.11% (head boundary), which are 7.68%p and 1.5%p higher, respectively, than coreference resolution using rule-based mention detection.
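A small, hypothetical example of the overlap problem mentioned above: nested mentions cannot share one BIO tag sequence, but a pointer-style decoder that emits one (start, end) pair per step represents them without conflict. The sentence and spans are invented for illustration.

    # Two gold mentions over the same tokens; the second nests the first.
    # One BIO tag sequence cannot encode both spans at once, but emitting
    # boundary pairs (as a pointer decoder can, one pair per step) can.
    tokens = ["김연아", "의", "어머니", "가", "웃었다"]
    mentions = [(0, 0), (0, 2)]     # "김연아" and "김연아 의 어머니"
    for s, e in mentions:
        print(" ".join(tokens[s:e + 1]))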
박천음(Cheoneum Park), 이창기(Changki Lee). Korean Institute of Information Scientists and Engineers, 2017. Journal of KIISE, Vol.44 No.8
In this paper, we propose a Korean dependency parsing model that uses pointer networks based on multi-task learning. Multi-task learning improves performance by learning two or more problems at the same time. We perform dependency parsing with pointer networks based on this method, obtaining the dependency relation and the dependency label of each word simultaneously. We define five input criteria for running the multi-task pointer networks over the morphemes of each word in dependency parsing. We also apply fine-tuning to further improve the performance of the proposed parser. Experimental results show that the proposed model achieves a UAS of 91.79% and an LAS of 89.48%, outperforming conventional Korean dependency parsers.
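A minimal sketch of how one decoding step could produce both outputs from a shared encoder, which is the multi-task part; all shapes, the scoring forms, and the training details are illustrative assumptions, not the paper's exact architecture.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def parse_word(enc, i, W1, W2, v, Wl):
        # Task 1 (pointer): score every position as the head of word i.
        u = np.tanh(enc @ W1 + enc[i] @ W2) @ v    # (n,) head scores
        head = int(softmax(u).argmax())
        # Task 2 (classifier): label the arc from the shared representations.
        label_logits = np.concatenate([enc[i], enc[head]]) @ Wl
        return head, int(softmax(label_logits).argmax())

    rng = np.random.default_rng(0)
    n, d, n_labels = 7, 8, 4                       # 7 words, 4 labels (assumed)
    enc = rng.normal(size=(n, d))                  # shared encoder states
    W1, W2 = rng.normal(size=(d, d)), rng.normal(size=(d, d))
    v, Wl = rng.normal(size=d), rng.normal(size=(2 * d, n_labels))
    print(parse_word(enc, 2, W1, W2, v, Wl))       # (head index, label id)

In training, the pointer loss and the label loss would typically be summed so that the shared encoder parameters are updated by both tasks at once.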