RISS Search - Domestic Journal Articles

1
Evaluation metrics for automatically constructed concept maps

Aliya Nugumanova, Yerzhan Baiburin. Institute of Control, Robotics and Systems, 2021. Institute of Control, Robotics and Systems International Conference Proceedings, Vol.2021 No.10
Concept maps are knowledge visualization tools that represent a text or domain at a conceptual level. They reflect the systemic relations between the key concepts of the text and thereby contribute to a deeper understanding of its ideas and save time spent on reading and analysis. However, the process of creating concept maps is itself laborious and time-consuming. At the same time, with the rapid growth of digital reading services, the automatic construction of concept maps is attracting increasingly intensive research. Against that background, comparing and evaluating methods for the automatic construction of concept maps is of great importance. In this paper, we discuss popular evaluation metrics for automatically created concept maps and propose a new metric based on network centrality analysis. We test all the considered metrics by comparing an automatically constructed concept map with a reference concept map developed manually by experts. Experiments show that our proposed metric complements existing metrics by providing information about the degrees of significance of concepts and relations.
2
Design of a Korean Extractive Document Summarization Baseline Using BERT-Based Pre-trained Language Models

박재언, 김지호, 이홍철. Korean Institute of Information Technology, 2022. Journal of Korean Institute of Information Technology, Vol.20 No.6
In modern society, where the number of digital documents has increased exponentially, efficiently obtaining the important information within documents has become essential. However, the sheer volume of digital documents makes it difficult for humans to identify and condense the important information in individual documents. Document summarization is a field of Natural Language Processing that extracts or generates meaningful sentences shorter than the original document while preserving its key information. However, because no appropriate Korean summarization data exists for benchmarking, research has been conducted without a baseline, and progress in this field has been limited. In this paper, we selected two document datasets that satisfy requirements for accessibility and verification of summarization data and that differ in text characteristics. We also selected, compared, and tested BERT-based multilingual and Korean pre-trained language models. For Korean documents, the Korean pre-trained language models outperformed the multilingual pre-trained language models in ROUGE scores. We analyzed the cause through the extraction ratio of the selected summary sentences.
