RISS Search - Domestic Journal Articles

1
Special Words in the 21st Century Sejong Project Electronic Dictionary

Cho In-Sik, You Hyun-Jo, Shin Hyo-Pil Korean Association for Lexicography (한국사전학회) 2004 Journal of Korean Lexicography (한국사전학) Vol.- No.3
The purpose of this study is to identify the properties of special-words and to show the process of extracting them from a large corpus. A special-word corresponds to the notion of an unknown word, that is, a word not covered by the lexical database used in Natural Language Processing (NLP). Unknown words generally cause many ambiguities and thus reduce the accuracy of NLP systems. The special-words in this work include various expressions about current events or fashions, abbreviated words, and naturalized words. We came up with a semi-automatic procedure for constructing a special-word dictionary, based mainly on language-dependent heuristics. We also find, however, that other statistical considerations, including frequencies and probability distributions, may be required to extract unknown words in a more automatic fashion.
2
Unknown Word Estimation and Automatic Correction of Spelling Errors for the Preprocessing of Korean Information Processing Systems

Park Bong Rae, Rim Hae Chang Korea Information Processing Society (한국정보처리학회) 1998 Transactions of the Korea Information Processing Society (정보처리학회논문지) Vol.5 No.10
In this paper, we propose a method of recognizing unknown words and correcting spelling errors (including spacing errors) to increase the performance of Korean information processing systems. Unknown words are recognized through comparative analysis of two or more morphologically similar eojeols (spacing units in Korean) that include the same unknown word candidate. Spacing errors and spelling errors are corrected using lexicalized rules that are automatically extracted from a very large raw corpus. The extraction of the lexicalized rules is based on morphological and contextual similarities between error eojeols and their corrected eojeols, which are confirmed to be used in the corpus. The experimental results show that our system can recognize unknown words with an accuracy of 98.9%, and can correct spacing errors and spelling errors with accuracies of 98.1% and 97.1%, respectively.
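The idea of recognizing an unknown word by comparing several eojeols that share the same candidate can be illustrated with a minimal sketch. This is not the authors' algorithm: the toy dictionary `KNOWN_WORDS`, the particle list `PARTICLES`, the function names, and the `min_support` threshold are all hypothetical, and real morphological analysis is far richer than plain suffix stripping.

```python
from collections import defaultdict

# Hypothetical toy resources for illustration only (not from the paper).
KNOWN_WORDS = {"학교", "사람"}
PARTICLES = ["에서", "은", "는", "이", "가", "을", "를", "에", "의"]


def candidate_splits(eojeol):
    """Yield (stem, particle) pairs obtained by stripping one known particle."""
    for particle in PARTICLES:
        if eojeol.endswith(particle) and len(eojeol) > len(particle):
            yield eojeol[: -len(particle)], particle


def recognize_unknown_words(eojeols, min_support=2):
    """Return stems absent from the dictionary that occur with at least
    `min_support` distinct particles across the given eojeols."""
    support = defaultdict(set)
    for eojeol in eojeols:
        for stem, particle in candidate_splits(eojeol):
            if stem not in KNOWN_WORDS:
                support[stem].add(particle)
    return {stem for stem, seen in support.items() if len(seen) >= min_support}


eojeols = ["블로그를", "블로그에서", "블로그는", "학교에서", "사람이", "나비가"]
unknown = recognize_unknown_words(eojeols)
```

In this sketch a candidate is accepted only when it is corroborated by two or more eojeols with different endings, which is a crude stand-in for the comparative analysis the abstract describes; a candidate seen in a single eojeol is left undecided.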
3
KNE: An Automatic Dictionary Expansion Method Using Use-cases for Morphological Analysis

Nam, Chung-Hyeon; Jang, Kyung-Sik The Korea Institute of Information and Communication 2019 Journal of information and communication convergen Vol.17 No.3
Morphological analysis is used for searching sentences and understanding context. As most morpheme analysis methods are based on predefined dictionaries, the problem of a target word not being registered in the given morpheme dictionary, the so-called unregistered word problem, can be a major cause of reduced performance. The current practical solution to the unregistered word problem is to add such words to the given dictionary by hand. This manual approach restricts the scalability and expandability of dictionaries. To overcome this limitation, we propose a novel method that automatically expands a dictionary by means of use-case analysis, which checks the validity of an unregistered word by exploring its use-cases through web crawling. The results show that the proposed method is feasible in terms of the accuracy of the validation process, the expandability of the dictionary, and, after registration, the fast extraction time of morphemes.
4
Probabilistic Segmentation and Tagging of Unknown Words

Bogyum Kim, Jae Sung Lee Korean Institute of Information Scientists and Eng 2016 Journal of KIISE (정보과학회논문지) Vol.43 No.4