RISS (Research Information Sharing Service)

      Empirical Comparison of Word Similarity Measures Based on Co-Occurrence, Context, and a Vector Space Model

      https://www.riss.kr/link?id=A106959748

      Multilingual Abstract

      Word similarity is often measured to enhance system performance in information retrieval and related areas. This paper reports an experimental comparison of word similarity values computed for 50 intentionally selected words from a Reuters corpus. Three types of measures were compared: (1) co-occurrence-based similarity measures, for which co-occurrence frequency is counted as the number of documents or sentences; (2) context-based distributional similarity measures obtained from latent Dirichlet allocation (LDA), nonnegative matrix factorization (NMF), and the Word2Vec algorithm; and (3) similarity measures computed from the tf-idf weights of each word according to a vector space model (VSM). Among pairs of measures, the Pearson correlation coefficient was highest between the VSM-based similarity measures and the co-occurrence-based similarity measures counted by number of documents. Group-average agglomerative hierarchical clustering was also applied to the similarity matrices computed by the individual measures. An evaluation of the resulting cluster sets against an answer set revealed that the VSM- and LDA-based similarity measures performed best.
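The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the choice of the Dice coefficient as the co-occurrence measure, the plain `tf * log(N / df)` weighting, and all function names are assumptions made for illustration only. The sketch computes a document-level co-occurrence similarity and a VSM-based cosine similarity for each word pair, then the Pearson correlation between the two sets of values.

```python
import math
from itertools import combinations

# Hypothetical toy corpus; the paper itself used a Reuters corpus.
DOCS = [
    "oil price rise market",
    "oil price fall",
    "market stock price",
    "stock market trade",
    "oil export trade",
]


def build_index(tokens):
    """Map each word to the set of document indices that contain it."""
    index = {}
    for i, words in enumerate(tokens):
        for w in words:
            index.setdefault(w, set()).add(i)
    return index


def dice(index, a, b):
    """Co-occurrence similarity counted by documents (Dice is an assumed choice)."""
    both = len(index[a] & index[b])
    return 2 * both / (len(index[a]) + len(index[b]))


def tfidf_vector(tokens, index, word):
    """Represent a word as a vector of tf-idf weights over all documents.

    Note: idf is constant per word, so it only rescales the vector and
    cancels out in the cosine below unless it is zero.
    """
    idf = math.log(len(tokens) / len(index[word]))
    return [words.count(word) * idf for words in tokens]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def compare(docs, words):
    """Return word pairs, both similarity lists, and their Pearson correlation."""
    tokens = [doc.split() for doc in docs]
    index = build_index(tokens)
    pairs = list(combinations(words, 2))
    co = [dice(index, a, b) for a, b in pairs]
    vectors = {w: tfidf_vector(tokens, index, w) for w in words}
    vsm = [cosine(vectors[a], vectors[b]) for a, b in pairs]
    return pairs, co, vsm, pearson(co, vsm)
```

Calling `compare(DOCS, ["oil", "price", "market", "stock", "trade"])` returns the ten word pairs, their two similarity lists, and one correlation value. On a corpus this small the two measures are nearly interchangeable, so the correlation says little; the sketch only shows the mechanics of correlating two similarity measures over the same word pairs.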

      References

      1 Pekar, V., "Word classification based on combined measures of distributional and semantic similarity" Association for Computational Linguistics 2 : 147-150, 2003

      2 Lin, H., "Topic detection from short text: A term-based consensus clustering method" IEEE 1-6, 2016

      3 Peat, H. J., "The limitations of term cooccurrence data for query expansion in document retrieval systems" 42 (42): 378-383, 1991

      4 Li, C. H., "Text categorization algorithms using semantic approaches, corpus-based thesaurus and WordNet" 39 (39): 765-772, 2012

      5 Jo, T., "String vector based AHC as approach to word clustering" Lancaster Centre for Forecasting 133-138, 2016

      6 Chen, L., "Statistical relationship determination in automatic thesaurus construction" Association for Computing Machinery 267-268, 2005

      7 Dagan, I., "Similarity-based models of word cooccurrence probabilities" 34 (34): 43-69, 1999

      8 Lagutina, K., "Sentiment classification of Russian texts using automatically generated thesaurus" FRUCT Oy 217-222, 2018

      9 Zazo, A. F., "Reformulation of queries using similarity thesauri" 41 (41): 1163-1173, 2005

      10 Xu, J., "Query expansion using local and global document analysis" Association for Computing Machinery 4-11, 1996

      11 Hofmann, T., "Probabilistic latent semantic indexing" Association for Computing Machinery 50-57, 1999

      12 Mohsen, G., "On the automatic construction of an Arabic thesaurus" IEEE 243-247, 2018

      13 Waltz, D. L., "Massively parallel parsing : A strongly interactive model of natural language interpretation" 9 (9): 51-74, 1985

      14 Ravikumar, S., "Mapping the intellectual structure of scientometrics : A co-word analysis of the journal Scientometrics(2005-2010)" 102 (102): 929-955, 2015

      15 Mikolov, T., "Linguistic regularities in continuous space word representations" Association for Computational Linguistics 746-751, 2013

      16 Lee, D. D., "Learning the parts of objects by non-negative matrix factorization" 401 : 788-791, 1999

      17 Blei, D. M., "Latent Dirichlet allocation" 3 : 993-1022, 2003

      18 Khasseh, A. A., "Intellectual structure of knowledge in iMetrics : A co-word analysis" 53 (53): 705-720, 2017

      19 Deerwester, S., "Indexing by latent semantic analysis" 41 (41): 391-407, 1990

      20 Gallant, S., "HNC's MatchPlus System" National Institute of Standards and Technology 107-111, 1992

      21 Pennington, J., "GloVe: Global vectors for word representation" Association for Computational Linguistics 1532-1543, 2014

      22 Terra, E. L., "Frequency estimates for statistical word similarity measures" Association for Computational Linguistics 1 : 165-172, 2003

      23 Griffiths, T. L., "Finding scientific topics" 101 (101): 5228-5235, 2004

      24 Toutanova, K., "Feature-rich part-of-speech tagging with a cyclic dependency network" Association for Computational Linguistics 173-180, 2003

      25 Kishida, K., "Empirical comparison of external evaluation measures for document clustering by using synthetic data" IPSJ SIG 1-7, 2014

      26 Kishida, K., "Double-pass clustering technique for multilingual document collections" 37 (37): 304-321, 2011

      27 Kotlerman, L., "Directional distributional similarity for lexical inference" 16 (16): 359-389, 2010

      28 Qiu, Y., "Concept based query expansion" Association for Computing Machinery 160-169, 1993

      29 Mandala, R., "Combining multiple evidence from different types of thesaurus for query expansion" Association for Computing Machinery 191-197, 1999

      30 Zhao, Z., "Cluster-driven model for improved word and text embedding" IOS Press 99-106, 2016

      31 Poostchi, H., "Cluster labeling by word embeddings and WordNet's hypernymy" Association for Computational Linguistics 66-70, 2018

      32 Xu, H., "Automatic thesaurus construction for spam filtering using revised back propagation neural network" 37 (37): 18-23, 2010

      33 Liebeskind, C., "Automatic thesaurus construction for modern Hebrew" European Language Resources Association 1446-1451, 2018

      34 Shunmugam, D. A., "An empirical investigation of word clustering techniques for natural language understanding" 6 (6): 2637-2646, 2016

      35 Jing, Y., "An association thesaurus for information retrieval" Le Centre de Hautes Etudes Internationales d'Informatique Documentaire 1 : 146-160, 1994

      36 Schutze, H., "A cooccurrence-based thesaurus and two applications to information retrieval" 33 (33): 307-318, 1997

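      Journal History

      Date | History type | Details | Registration status
      2024 | Evaluation scheduled | Eligible to apply for overseas database journal evaluation (overseas-indexed journal evaluation) |
      2021-01-01 | Evaluation | Selected as a registered journal (overseas-indexed journal evaluation) | KCI registered
      2014-10-01 | Evaluation | Dropped from registration candidate status (continuing evaluation) |
      2013-01-25 | Journal title change | Korean title: 정보관리연구 -> Journal of Information Science Theory and Practice; foreign-language title: Journal of Information Management -> not registered | KCI registered
      2013-01-01 | Evaluation | Registration, first round FAIL (registration maintained) | KCI registered
      2010-01-01 | Evaluation | Registered journal status maintained (registration maintained) | KCI registered
      2007-01-01 | Evaluation | Selected as a registered journal (registration candidate, second round) | KCI registered
      2006-06-20 | Journal registration | Korean title: 정보관리연구; foreign-language title: Journal of Information Management | KCI registration candidate
      2006-01-01 | Evaluation | Registration candidate, first round PASS (registration candidate, first round) | KCI registration candidate
      2004-07-01 | Evaluation | Selected as a registration candidate journal (new evaluation) | KCI registration candidate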