A Study of Semantic Distribution for the Efficient Expansion of Entries in an Electronic Loanword Dictionary
Nam Jeesun (남지순), The Korean Association of Language Sciences (한국언어과학회), 2012, Journal of Language Sciences (언어과학) Vol.19 No.1
The increasing number of English loanwords in Korean texts is a major obstacle to the automatic analysis of those texts, because existing electronic lexicons lack these words. This study proposes a semantic classification of 2,300 English loanwords selected from the Korean Electronic Loanword Dictionary DECO-ZNF (Nam, 2010). The classification follows the same principle as DSeC-N, the semantic classification of Korean common nouns. On the basis of the results, we expect to predict the semantic domains in which loanword entries should be expanded: loanwords in the semantic classes of new products, food, and art are more numerous. The experiment presented at the end of the paper shows that a corpus of cosmetic products contains far more English loanwords than a corpus of political news. Unless the loanword entries related to this domain are expanded, the performance of automatic analysis will remain far from satisfactory.
Nam Jeesun (남지순), The Association for Korean Linguistics (한국어학회), 2009, Korean Linguistics (한국어학) Vol.42 No.-
This paper describes the lexico-syntactic aspects of Korean time expressions observed in real full texts such as internet newspapers. Time expressions, which usually appear as adverbial forms, have been considered peripheral phenomena in syntactic studies, yet they are in fact among the most important information to obtain in automatic processing domains such as information extraction. However, recognizing the linguistic forms that represent various time concepts is not a simple task without a reliable description of their linguistic characteristics. In this study we present the results of an analysis of real Korean texts based on the LGG (Local Grammar Graph) schema, which can be transformed into finite-state automata in any NLP system.
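The idea of a local grammar graph compiled into a finite-state recognizer can be illustrated with a minimal sketch. It is not taken from the paper: the graph, the state names, and the English date patterns are assumptions chosen for readability, and the paper's actual graphs describe Korean time expressions.

```python
import re

# Hypothetical token tests; the paper's graphs use Korean lexical material instead.
MONTHS = {"January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"}

def is_month(token):
    return token in MONTHS

def is_day(token):
    return token.isdigit() and 1 <= int(token) <= 31

def is_year(token):
    return re.fullmatch(r"\d{4}", token) is not None

# A small directed graph: each state lists (token test, next state) transitions.
# Two paths ("3 May 2012" and "May 3 2012") lead to the same final state.
GRAPH = {
    "start": [(is_day, "day_first"), (is_month, "month_first")],
    "day_first": [(is_month, "need_year")],
    "month_first": [(is_day, "need_year")],
    "need_year": [(is_year, "final")],
    "final": [],
}
FINAL_STATES = {"final"}

def accepts(tokens):
    """Return True if the whole token sequence follows a path to a final state."""
    states = {"start"}
    for token in tokens:
        states = {nxt for s in states for test, nxt in GRAPH[s] if test(token)}
        if not states:
            return False
    return bool(states & FINAL_STATES)
```

For example, `accepts("3 May 2012".split())` and `accepts("May 3 2012".split())` both follow a complete path through the graph, while `accepts("May 2012".split())` does not.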
Nam Jeesun (남지순), Language Research Institute, Seoul National University (서울대학교 어학연구소), 2002, Language Research (語學硏究) Vol.38 No.1
This study presents a Korean Electronic Lexicon of Proper Nouns named DECOR, which is designed to handle the major part of the unknown words that appear in automatic text processing. DECOR is made up of two modules: the module of Encyclopedic Nouns, DECOR-E, and that of Current Nouns, DECOR-C. This paper focuses on the organization of DECOR-E. The lexicon DECOR-E contains about 33,000 proper nouns (i.e. encyclopedic nouns) and 2,200 suffix patterns related to these nouns. The suffix patterns are used to classify proper nouns formally, because they explicitly represent specific semantic features of proper nouns. In this way, we obtained 4 classes of proper nouns, each of which is divided into 3 or 4 sub-classes. The lexicon of Current Nouns, DECOR-C, is being constructed on the basis of DECOR-E. Finally, we discuss a powerful auxiliary tool for an electronic lexicon of proper nouns, named 'Local Grammar'. Local grammars, also called Finite-State Local Automata, are represented as directed acyclic graphs and allow the automatic analyzer to recognize several transformed sequences that carry the same information and consist of the same lexical words, but not in the same syntagmatic order.
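Classification by suffix pattern, as described in the abstract above, might look like the following minimal sketch. The four suffixes and the two class labels are hypothetical illustrations; the abstract does not list the actual DECOR-E patterns or name its classes.

```python
# Hypothetical suffix patterns and class labels (not the actual DECOR-E data).
SUFFIX_PATTERNS = {
    "대학교": "organization",  # "university"
    "은행": "organization",    # "bank"
    "산": "place",             # "mountain"
    "강": "place",             # "river"
}

def classify_proper_noun(noun):
    """Return the class of the longest matching suffix, or None if none matches.

    A noun that consists of the suffix alone is not treated as a match.
    """
    best = None
    for suffix, label in SUFFIX_PATTERNS.items():
        if noun.endswith(suffix) and len(noun) > len(suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, label)
    return best[1] if best else None
```

Preferring the longest suffix is a design choice of this sketch, so that a longer, more specific pattern wins over a shorter one when both match.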
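Nam Jeesun (남지순), International Association for Humanistic Studies in Language (국제언어인문학회), 2013, Lingua Humanitatis (인문언어) Vol.15 No.1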
This study introduces the language processing model of one of the most prominent French computational linguists, Maurice Gross (1934-2001), and discusses its computational applications. The methodology he proposed for describing the syntactic properties of French simple sentences presupposes the exhaustive examination of all predicative entries, which gave it the name Lexicon-Grammar. The results of this approach are presented in the form of a binary matrix: about 75,000 French predicates have been described and classified in syntactic tables. However, Gross realized that there still existed a tremendous number of non-regular expressions, called compound words, frozen phrases or multi-word expressions, which can hardly be described with syntactic and formal devices. He therefore suggested a more processing-oriented framework named Local Grammar (LGG), which is represented as a directed graph like an RTN (Recursive Transition Network). The LGGs are transformed into finite-state automata or transducers by means of FSTgraphEditor in the UNITEX system, which is implemented both for this purpose and for automatic corpus processing based on these finite graphs and on the electronic dictionaries included in the system. In this paper, various areas of LGG application are discussed, such as the construction of linguistic resources, information extraction and opinion mining, and the formalization of frozen expressions for machine translation.
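The binary-matrix presentation mentioned in the abstract above can be pictured with a small sketch, in which rows are predicate entries and columns are syntactic properties. The entry names and property names here are placeholders invented for illustration, not data from the Lexicon-Grammar tables.

```python
# Hypothetical property columns and placeholder entries (not real table data).
PROPERTIES = ["human_subject", "takes_clause_object", "passive"]

TABLE = {
    "verb_a": [1, 1, 0],
    "verb_b": [1, 0, 1],
    "verb_c": [0, 1, 0],
}

def has_property(entry, prop):
    """Return True if the entry is marked 1 for the given property column."""
    return bool(TABLE[entry][PROPERTIES.index(prop)])

def entries_with(prop):
    """Return, in sorted order, all entries marked 1 for the given property."""
    col = PROPERTIES.index(prop)
    return sorted(entry for entry, row in TABLE.items() if row[col])
```

Querying a column, as in `entries_with("takes_clause_object")`, groups entries that share a syntactic property, which is the kind of classification the syntactic tables support.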