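A Search-Result Clustering Technique Based on Word Clustering for Effective Browsing of Paper Search Results
Kyoungman Bae, Jaewon Hwang, Youngjoong Ko, Jonghoon Kim, Korean Institute of Information Scientists and Engineers, 2010, Journal of KIISE: Software and Applications, Vol.37 No.3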
Search-results clustering is the automatic, online grouping of similar documents within the results returned by a search engine. In this paper, we propose a new search-results clustering technique specialized for a paper search service. The proposed system consists of two components: the Category Hierarchy Generation System (CHGS) and the Paper Clustering System (PCS). CHGS first builds a category hierarchy, called the Field Thesaurus, for each research field from an existing research category hierarchy (KOSEF's research category hierarchy), and then expands the keyword set of the Field Thesaurus with a word clustering method based on the K-means algorithm. PCS then determines the category of each paper using top-down and bottom-up methods. The proposed system should be useful for retrieval services in specialized fields, such as a paper search service.
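The keyword expansion step can be pictured with a small sketch. The abstract does not say how words are represented or how clusters are mapped back to the thesaurus, so everything below is an assumption made for illustration: the toy 2-D word vectors, the function names `kmeans` and `expand_keywords`, and the rule that a seed keyword pulls in every word that shares its cluster.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain Lloyd's K-means. The first k points seed the centroids,
    which keeps this toy example deterministic."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


def expand_keywords(seed_keywords, word_vectors, k):
    """Add every word that lands in the same cluster as a seed keyword
    (an assumed expansion rule, not taken from the paper)."""
    words = list(word_vectors)
    assignment = kmeans([word_vectors[w] for w in words], k)
    cluster_of = dict(zip(words, assignment))
    seed_clusters = {cluster_of[w] for w in seed_keywords if w in cluster_of}
    return set(seed_keywords) | {w for w in words if cluster_of[w] in seed_clusters}


# Made-up vectors: the first axis loosely means "computing", the second "biology".
WORD_VECTORS = {
    "clustering": (0.9, 0.1),
    "protein": (0.1, 0.9),
    "k-means": (0.8, 0.2),
    "classification": (0.85, 0.15),
    "genome": (0.2, 0.8),
    "enzyme": (0.15, 0.95),
}

expanded = expand_keywords({"clustering"}, WORD_VECTORS, k=2)
print(sorted(expanded))
```

In a real setting the vectors would come from corpus statistics such as co-occurrence counts, and the number of clusters would need tuning per research field.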
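An Effective Named Entity and Topic Word Recognition Method Using Named Entity Patterns in a Natural Language Interface
Kyoungman Bae, Sunghyun Kim, Youngjoong Ko, Jonghoon Kim, Korean Institute of Information Technology, 2014, Journal of Korean Institute of Information Technology, Vol.12 No.1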
As the number of mobile device users increases, demand for natural language interfaces (NLI) has grown. Efficient named entity recognition is required to analyze queries entered through an NLI. In particular, named entities must be extracted efficiently from natural language queries in a specific domain. In this paper, we propose an efficient named entity recognition method for the schedule and personal information management domain. To this end, we build a hierarchical named entity dictionary by separating named entities into an attribute type and an instance type. We then define named entity patterns to solve the synonym problem for attribute named entities. In our evaluation, the baseline method achieved 72.5% and the proposed method achieved 79.9%, an improvement of 7.4 percentage points over the baseline.
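As a rough illustration of the two ideas in this abstract, the sketch below pairs a two-level dictionary (instance type versus attribute type) with regular-expression patterns that fold synonymous wordings into one attribute. The dictionary entries, the category labels, the patterns, and the function name `recognize` are all hypothetical; the abstract does not describe the actual dictionary contents or pattern format.

```python
import re

# Hypothetical two-level dictionary: entity type -> category -> surface forms.
ENTITY_DICTIONARY = {
    "instance": {"PERSON": ["Alice", "Bob"]},
    "attribute": {"PHONE_NUMBER": ["phone number"], "BIRTHDAY": ["birthday"]},
}

# Hypothetical patterns that map synonymous wordings onto one attribute.
ATTRIBUTE_PATTERNS = {
    "PHONE_NUMBER": re.compile(
        r"\b(?:phone|cell|mobile)(?:\s+number)?\b", re.IGNORECASE
    ),
    "BIRTHDAY": re.compile(r"\b(?:birthday|date\s+of\s+birth)\b", re.IGNORECASE),
}


def recognize(query):
    """Return (entity_type, category) pairs found in the query."""
    found = []
    lowered = query.lower()
    # Dictionary lookup: naive case-insensitive substring matching.
    for entity_type, categories in ENTITY_DICTIONARY.items():
        for category, surface_forms in categories.items():
            if any(form.lower() in lowered for form in surface_forms):
                found.append((entity_type, category))
    # Pattern lookup: catches attribute synonyms the dictionary does not list.
    for category, pattern in ATTRIBUTE_PATTERNS.items():
        if pattern.search(query) and ("attribute", category) not in found:
            found.append(("attribute", category))
    return found


print(recognize("What is Alice's cell number?"))
```

Here "cell number" is not a dictionary entry, yet the pattern still resolves it to the same attribute as "phone number", which is the kind of synonym handling the abstract refers to.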
Effective Korean Speech-Act Classification Using Classification Priority and Post-Correction Rules
Namhoon Song, Kyoungman Bae, Youngjoong Ko, Korean Institute of Information Scientists and Eng, 2016, Journal of KIISE, Vol.43 No.1
A speech act is the behavior a user intends with an utterance. Speech-act classification is important in a dialogue system, and machine learning and rule-based methods have mainly been used for it. In this paper, we propose a speech-act classification method that combines a support vector machine (SVM) with transformation-based learning (TBL). The user's utterance is first classified by the SVM, which is applied preferentially to categories with a low utterance rate in the training data. Next, when an utterance receives negative scores for all categories, it is passed to a rule-based correction phase. Our method achieved higher performance than the baseline system, along with a reduction in errors.
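The two-stage decision described above can be sketched as follows. This is one possible reading of the abstract, not the authors' implementation: the per-category scores are plain numbers standing in for SVM decision values, "applied preferentially" is interpreted as checking the rarest training categories first, and the rules are hand-written placeholders rather than rules learned by TBL. The names `classify_speech_act`, `TRAINING_COUNTS`, and `RULES` are hypothetical.

```python
def classify_speech_act(scores, training_counts, rules, utterance, default="unknown"):
    """Pick a speech-act category from per-category classifier scores.

    scores: category -> decision value (positive means "belongs to category").
    training_counts: category -> number of training utterances.
    rules: list of (predicate, category) pairs used as a fallback.
    """
    # Stage 1: visit rare categories first and accept the first positive score.
    by_rarity = sorted(scores, key=lambda category: training_counts[category])
    for category in by_rarity:
        if scores[category] > 0:
            return category
    # Stage 2: no category scored positively, so fall back to correction rules.
    for predicate, category in rules:
        if predicate(utterance):
            return category
    return default


# Made-up training frequencies and a single placeholder rule.
TRAINING_COUNTS = {"inform": 500, "ask": 300, "promise": 20}
RULES = [(lambda text: text.rstrip().endswith("?"), "ask")]

first = classify_speech_act(
    {"inform": 0.4, "ask": -0.2, "promise": 0.1},
    TRAINING_COUNTS, RULES, "I will call you tomorrow.",
)
second = classify_speech_act(
    {"inform": -0.3, "ask": -0.1, "promise": -0.5},
    TRAINING_COUNTS, RULES, "Where is the meeting?",
)
print(first, second)
```

In the first call the rare category "promise" wins even though "inform" has a higher score, which shows the effect of the priority ordering; in the second call every score is negative, so the rule decides.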