Cho In-Sik, You Hyun-Jo, Shin Hyo-Pil Korean Association for Lexicography (한국사전학회) 2004 Journal of Korean Lexicography (한국사전학) Vol.- No.3
The purpose of this study is to identify the properties of special-words and to show the process of extracting special-words from a large corpus. A special-word corresponds to the notion of an unknown word in Natural Language Processing (NLP), that is, a word not covered by the lexical database. Generally, unknown words cause many ambiguities and thus reduce the accuracy of NLP systems. The special-words in this work include various expressions about current events or fashions, abbreviated words, and naturalized words. We came up with a semi-automatic procedure for constructing a special-word dictionary, based mainly on language-dependent heuristics. We also feel, however, that other statistical considerations, including frequencies and probability distributions, may be required to extract unknown words in a more automatic fashion.
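Unknown Word Estimation and Automatic Correction of Spelling Errors for Preprocessing in Korean Information Processing Systems
Park Bong Rae, Rim Hae Chang Korea Information Processing Society (한국정보처리학회) 1998 Transactions of the Korea Information Processing Society (정보처리학회논문지) Vol.5 No.10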
In this paper, we propose a method of recognizing unknown words and correcting spelling errors (including spacing errors) to increase the performance of Korean information processing systems. Unknown words are recognized through comparative analysis of two or more morphologically similar eojeols (spacing units in Korean) that include the same unknown word candidate. Spacing errors and spelling errors are corrected by using lexicalized rules that are automatically extracted from a very large raw corpus. The extraction of the lexicalized rules is based on morphological and contextual similarities between error eojeols and their correction eojeols, which are confirmed to be used in the corpus. The experimental results show that our system can recognize unknown words with an accuracy of 98.9%, and can correct spacing errors and spelling errors with accuracies of 98.1% and 97.1%, respectively.
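The idea of comparing several eojeols that share the same unknown word candidate can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a simple suffix-stripping model, and the function name `unknown_word_candidates`, the tiny lexicon, and the particle list are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def unknown_word_candidates(eojeols, lexicon, known_suffixes, min_support=2):
    """Return stems missing from the lexicon that recur with known suffixes.

    Simplifying assumption: an eojeol is "stem + suffix", and a stem becomes
    a candidate only when at least `min_support` distinct eojeols share it.
    """
    support = defaultdict(set)
    for eojeol in eojeols:
        for suffix in known_suffixes:
            # Skip empty suffixes and eojeols that would leave no stem.
            if suffix and eojeol.endswith(suffix) and len(eojeol) > len(suffix):
                stem = eojeol[:-len(suffix)]
                if stem not in lexicon:
                    support[stem].add(eojeol)
    return {
        stem: sorted(forms)
        for stem, forms in support.items()
        if len(forms) >= min_support
    }


# Hypothetical toy data: "블로그" is not in the lexicon, "학교" is.
eojeols = ["블로그를", "블로그에서", "블로그는", "학교에서"]
lexicon = {"학교"}
suffixes = {"를", "에서", "는"}
candidates = unknown_word_candidates(eojeols, lexicon, suffixes)
```

Requiring agreement among several eojeols is what keeps a single misspelled or oddly segmented form from being proposed as a new word in this sketch.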
KNE: An Automatic Dictionary Expansion Method Using Use-cases for Morphological Analysis
Nam, Chung-Hyeon, Jang, Kyung-Sik The Korea Institute of Information and Communication 2019 Journal of information and communication convergen Vol.17 No.3
Morphological analysis is used for searching sentences and understanding context. As most morpheme analysis methods are based on predefined dictionaries, the problem of a target word not being registered in the given morpheme dictionary, the so-called unregistered word problem, can be a major cause of reduced performance. The current practical solution to the unregistered word problem is to add such words to the given dictionary by hand. This manual approach restricts the scalability and expandability of dictionaries. To overcome this limitation, we propose a novel method to automatically expand a dictionary by means of use-case analysis, which checks the validity of an unregistered word by exploring its use-cases through web crawling. The results show that the proposed method is feasible in terms of the accuracy of the validation process, the expandability of the dictionary, and, after registration, the fast extraction time of morphemes.
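The validate-then-register step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the KNE method itself: the crawled web text is replaced by an in-memory list of strings, validity is reduced to a document-count threshold, and the names `expand_dictionary` and `min_documents` are hypothetical.

```python
def expand_dictionary(dictionary, candidates, documents, min_documents=2):
    """Return a copy of `dictionary` extended with validated candidates.

    Assumption: a candidate counts as valid when it occurs in at least
    `min_documents` of the collected texts (a stand-in for crawled use-cases).
    """
    expanded = set(dictionary)
    for word in candidates:
        if word in expanded:
            continue  # already registered, nothing to validate
        hits = sum(1 for text in documents if word in text)
        if hits >= min_documents:
            expanded.add(word)
    return expanded


# Hypothetical toy data standing in for crawled use-cases.
documents = [
    "the vlog went viral",
    "she posts a vlog weekly",
    "a typo like vlgo",
]
result = expand_dictionary({"post"}, ["vlog", "vlgo"], documents)
```

Here the threshold keeps a one-off misspelling out of the dictionary while a form attested in several texts is registered.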