RISS Search - Domestic Journal Article Details

Multilingual Abstract

Because a corpus is a large collection of everyday language use, computational tools are essential for collecting, filtering, mining, and exploiting meaningful data from massive amounts of text. In this paper, we introduce two tools for handling large-scale corpora: IMS Corpus Workbench (CWB) and Sketch Engine. Both tools are built on an inverted index model, a type of reference database, which gives corpus users speed and extensibility. A limitation of CWB lies in its Western-language character encoding (ISO-8859), which leads to unsatisfactory handling of Korean at full scale. For large-scale Korean corpora, a more suitable architectural design for searching and storage, together with a user-friendly interface, needs to be considered.
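The two technical points in the abstract can be illustrated with a minimal sketch. It is not the actual implementation of CWB or Sketch Engine; the function names (`build_inverted_index`, `concordance`, `fits_latin1`) and the toy sentence are assumptions made for illustration only. The first part shows the general idea of an inverted index, which maps each token to the corpus positions where it occurs so that a search does not have to scan the whole text. The second part shows why an ISO-8859 encoding is a problem for Korean: Hangul characters are simply outside its repertoire.

```python
from collections import defaultdict


def build_inverted_index(tokens):
    """Map each token to the list of corpus positions where it occurs."""
    index = defaultdict(list)
    for position, token in enumerate(tokens):
        index[token].append(position)
    return dict(index)


def concordance(tokens, index, word, window=2):
    """Return keyword-in-context tuples by looking positions up in the index."""
    lines = []
    for pos in index.get(word, []):
        left = tokens[max(0, pos - window):pos]
        right = tokens[pos + 1:pos + 1 + window]
        lines.append((" ".join(left), word, " ".join(right)))
    return lines


def fits_latin1(text):
    """Return True if the text can be stored in ISO-8859-1 (Latin-1)."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical toy corpus, used only to exercise the functions above.
tokens = "the cat sat on the mat".split()
index = build_inverted_index(tokens)

print(index["the"])                                  # [0, 4]
print(concordance(tokens, index, "sat", window=1))   # [('cat', 'sat', 'on')]
print(fits_latin1("café"), fits_latin1("코퍼스"))      # True False
```

Because lookups go through the index, the cost of a query depends on the number of hits rather than on the size of the corpus, which is the source of the speed mentioned in the abstract.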

Table of Contents

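ABSTRACT
1. Introduction
2. Large-Scale Corpus Tools
2.1. CWB
2.2. Sketch Engine
3. Application Studies
3.1. Corpus Data Visualization
3.2. Comparison with Perl Corpus Search
4. Korean and Large-Scale Corpus Tools
5. Conclusion
References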

References

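1 Kang, S., "Korean Morphological Analysis and Information Retrieval" Hongreung Publishing 2002

2 Kim, D., "Text Visualization and Concordance Search Using Project Gutenberg" Korean Society for Language and Information 22 (22): 81-104, 2018

3 "Sejong Project Corpus"

4 Aiden, "Big Data Humanities: Prelude to the Advance" Sakyejul 2015

5 Lee, M., "Language Research in the Big Data Era: A Search Engine in My Hand" 21st Century Books 2015

6 "Kkokkoma"

7 "NoSketch Engine"

8 "d3.js"

9 Scott, M., "WordSmith Tools" Oxford University Press 1996

10 Bochkarev, V., "Verifying Heaps' law using Google Books Ngram data" 1-8, 2016

11 "VISL Project"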

12 Evert, S., "Twenty-first century corpus workbench" Univ. of Birmingham 2011

13 Davies, M., "The advantage of using relational databases for large corpora" 15 (15): 412-418, 2005

14 Kilgarriff, A., "The Sketch Engine: ten years on" 1 : 7-36, 2014

15 "The Gutenberg Project"

16 "TXM"

17 "TEI"

18 "Sketch Engine"

19 Rychlý, P., "Manatee/Bonito" 65-70, 2007

20 "Leeds CQP"

21 Manning, C., "Introduction to Information Retrieval" Cambridge University Press 1999

22 Evert, S., "Inside the IMS corpus workbench" IULA, Univ. of Pompeu Fabra 2008

23 "Google Ngram Viewer"

24 Gries, S., "Dispersion and adjusted frequencies in corpora" 1 (1): 403-437, 2008

25 Ginsberg, J., "Detecting influenza epidemics using search engine query data" 459 (459): 1012-1014, 2009

26 McEnery, T., "Corpus Linguistics" Edinburgh University Press 1996

27 Christ, O., "Corpus Administrator’s Manual"

28 Anthony, L., "Contemporary Corpus Linguistics" 87-104, 2009

29 "CWB R tool"

30 "CWB"

31 Hardie, A., "CQPweb – combining power, flexibility and usability in a corpus analysis tool" 17 (17): 380-409, 2012

32 Tan, L., "Building and annotating the linguistically diverse NTU-MC" 362-371, 2011

33 Gulordava, K., "A distributional similarity approach to the detection of semantic change in the Google Ngram corpus" 2011 : 67-71, 2011

34 Gries, S., "A Mosaic of Corpus Linguistics: Selected Approaches" 269-291, 2010

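Date	Record Type	Details
2021	Evaluation scheduled	Eligible to apply for new evaluation (new evaluation)
2020-12-01	Evaluation	Dropped from candidate journal status (continuing evaluation)
2018-01-01	Evaluation	Selected as candidate journal (new evaluation)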

A Study on Computational Tools for Large-Scale Corpora
