RISS (Research Information Sharing Service)

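      • KCI-listed

        Construction and Use of a Korean Diachronic Newspaper Corpus

        Kang, Beomil. Institute of Language and Information Studies, Yonsei University, 2021. Language Facts and Perspectives, Vol.54 No.-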

        This paper introduces the process of building a Korean diachronic corpus based on articles in Chosun Ilbo and Donga Ilbo from 1920 to 2019. Newspapers reflect not only the social but also the linguistic reality of their time, as they convey a variety of information and thoughts in the language of ordinary people. To understand this linguistic reality effectively, such data must be processed into a form that can be analyzed quantitatively. To that end, the spacing and notation of some vocabulary items were modified to meet current norms, and vocabulary listed in various dictionaries was added to the dictionary referenced by the morphological analyzer to improve the detection of vocabulary units. After this pre-processing, changes in linguistic form were investigated to demonstrate an application of the corpus. The mean number of syllables per word decreased, and sentence length decreased continuously. In addition, the proportion of Chinese characters in articles dropped, while the use of Hangul and the Latin alphabet increased.
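
        The abstract does not describe how the script proportions were computed. As a rough sketch only, one way to measure the share of Hangul, Chinese characters, and Latin letters in a text is to classify characters by Unicode range. The function name `script_proportions` and the choice of ranges (precomposed Hangul syllables and the basic CJK Unified Ideographs block) are assumptions made for illustration, not the author's method.

```python
def script_proportions(text):
    """Return the share of Hangul, Chinese-character, and Latin letters in text.

    Hypothetical helper: only precomposed Hangul syllables, the basic CJK
    Unified Ideographs block, and ASCII letters are counted; all other
    characters (digits, punctuation, spaces) are ignored.
    """
    counts = {"hangul": 0, "hanja": 0, "latin": 0}
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:      # precomposed Hangul syllables
            counts["hangul"] += 1
        elif 0x4E00 <= code <= 0x9FFF:    # CJK Unified Ideographs (basic block)
            counts["hanja"] += 1
        elif ch.isascii() and ch.isalpha():  # ASCII Latin letters
            counts["latin"] += 1
    total = sum(counts.values())
    if total == 0:
        # No countable letters: report zero shares instead of dividing by zero.
        return {name: 0.0 for name in counts}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    print(script_proportions("韓國 한국 AB"))
```

        Applied to the articles of each decade, a function like this would yield the kind of time series the abstract summarizes; a real corpus would also need to handle extension blocks and compatibility characters.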

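      • KCI-listed

        A Study of Dispersion for Measuring Word Importance

        Kang, Beomil. Institute of Language and Information Studies, Yonsei University, 2023. Language Facts and Perspectives, Vol.59 No.-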

        This study summarizes the development of dispersion measures by reviewing discussions of dispersion in corpus linguistics, and analyzes the dispersion of words in a Korean corpus, focusing on two measures of the DP (Deviation of Proportions) family, which are considered to have high validity among dispersion measures. The results show that DPnofreq, which eliminates the impact of frequency, has the lowest correlation with frequency and yields discriminating values even for low-frequency words, thus providing information distinct from frequency. These results suggest that in corpus-based statistical studies, dispersion should be taken into account alongside frequency when assessing the importance of words and other linguistic units.
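
        The abstract gives no formulas. As a minimal sketch, the basic DP measure as it is commonly defined in the corpus-linguistics literature is half the sum of absolute differences between the share of a word's occurrences in each corpus part and that part's share of the corpus; a normalized variant divides by one minus the smallest part share. The function and parameter names below are hypothetical, and DPnofreq is not reproduced here because its definition does not appear in this listing.

```python
def deviation_of_proportions(word_counts, part_sizes):
    """Return (DP, normalized DP) for one word across corpus parts.

    word_counts: occurrences of the word in each corpus part.
    part_sizes:  total token count of each corpus part.
    Hypothetical helper based on the commonly cited definition of DP;
    0 means the word is spread in proportion to part sizes, and values
    near 1 mean it is concentrated in a small part of the corpus.
    """
    if len(word_counts) != len(part_sizes):
        raise ValueError("word_counts and part_sizes must have equal length")
    if len(part_sizes) < 2:
        raise ValueError("at least two corpus parts are required")
    total_words = sum(word_counts)
    total_size = sum(part_sizes)
    if total_words == 0 or total_size == 0:
        raise ValueError("the word and the corpus must both be non-empty")

    expected = [size / total_size for size in part_sizes]     # part shares
    observed = [count / total_words for count in word_counts]  # word shares
    dp = 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))
    dp_norm = dp / (1 - min(expected))  # rescale so the maximum is 1
    return dp, dp_norm


if __name__ == "__main__":
    # A word spread evenly over four equal parts, then one confined to a single part.
    print(deviation_of_proportions([10, 10, 10, 10], [100, 100, 100, 100]))
    print(deviation_of_proportions([40, 0, 0, 0], [100, 100, 100, 100]))
```

        Both example words have the same frequency (40) but very different dispersion, which is the kind of information the abstract argues should complement frequency.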

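      • KCI-listed

        The Meeting of Corpus Linguistics and Statistics - Vaclav Brezina (2018), Statistics in Corpus Linguistics -

        Kang, Beomil. Institute of Language and Information Studies, Yonsei University, 2022. Language Facts and Perspectives, Vol.57 No.-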

        This article discusses Vaclav Brezina’s book Statistics in Corpus Linguistics: A Practical Guide (2018). Although many linguists find statistics difficult, rigorous statistical procedures must be applied to generalize the findings from a corpus to the language as a whole. This book introduces statistical procedures related to a variety of topics in linguistics. Concepts from basic to complex are explained in easy language, and various learning materials are provided through a companion website. The book also includes the latest statistical methodologies and various visualization examples for linguistics research. Thus, this book can be recommended for linguistic researchers studying statistics for the first time.

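      • KCI-listed

        A Preliminary Quantitative Analysis of the Relationship Between Politics and Language

        Kim, Ha-Soo; Son, Hyunjung; Lee, Jae Yun; Kang, Beomil. The Discourse and Cognitive Linguistics Society of Korea, 2013. Discourse and Cognition, Vol.20 No.1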

        This paper presents a linguistic analysis of the discourse of three Korean politicians who ran for the Korean presidency in 2012. Data on their speech and writing were collected from TV talk shows on which they appeared and from the books they wrote. Several methods are used to analyse the data: word frequency analysis, network analysis, and a measure of lexical repetition. First, we extract frequent words from each politician’s subcorpus and identify statistically distinctive words that each politician used more frequently than the others. Second, we construct networks of co-occurring words and analyse the differences in their structure. Finally, a new method for measuring lexical repetition is used to uncover pragmatic differences in their discourses. Applying these methods to the discourse data allows the linguistic characteristics of the politicians’ discourses to be described more effectively.
