RISS (Academic Research Information Service)

      KCI-listed journal

      텍스트 분류 기반 기계학습의 정신과 진단 예측 적용 = Application of Text-Classification Based Machine Learning in Predicting Psychiatric Diagnosis

      https://www.riss.kr/link?id=A106985859

      Additional information

      Multilingual Abstract

      Objectives: This study aimed to identify effective vectorization and classification models for predicting a psychiatric diagnosis from text-based medical records.

      Methods: Electronic medical records (n = 494) of the present illness were collected retrospectively from inpatient admission notes carrying one of three diagnoses: major depressive disorder, type 1 bipolar disorder, or schizophrenia. The data were split into 400 training records and 94 independent validation records. The records were vectorized with two models, term frequency-inverse document frequency (TF-IDF) and Doc2vec. Machine learning classifiers, including stochastic gradient descent, logistic regression, support vector classification, and deep learning (DL), were applied to predict the three psychiatric diagnoses. Five-fold cross-validation was used to find an effective model. Accuracy, precision, recall, and F1-score were measured to compare the models.

      Results: Five-fold cross-validation on the training data showed that the DL model with Doc2vec was the most effective at predicting the diagnosis (accuracy = 0.87, F1-score = 0.87). However, these metrics dropped on the independent test data set with the final working DL models (accuracy = 0.79, F1-score = 0.79), while the logistic regression and support vector machine models with Doc2vec performed slightly better (accuracy = 0.80, F1-score = 0.80) than the DL models with Doc2vec and the other models with TF-IDF.

      Conclusions: The results suggest that the choice of vectorization may affect classification performance more than the choice of machine learning model. However, the data set had several limitations, including small sample size, imbalance among the diagnostic categories, and limited generalizability. Research with multiple sites and larger samples is therefore needed to improve the machine learning models.
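To make the evaluation workflow in the abstract more concrete, here is a minimal Python sketch, using only the standard library, of three of its ingredients: TF-IDF weighting, a five-fold split of the 400 training records, and accuracy with F1-score. It is not the authors' code. The function names, the toy tokens, the unsmoothed idf = log(N / df) weighting, the contiguous fold layout, and the macro averaging of F1 are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: tf is the relative term frequency and idf = log(N / df),
    with no smoothing or normalization. Library implementations usually
    differ in these details, so their values will not match exactly.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1_scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical, hand-made tokens standing in for admission notes.
toy_notes = [
    ["depressed", "mood", "insomnia"],
    ["elevated", "mood", "insomnia"],
    ["auditory", "hallucination", "insomnia"],
]
weights = tfidf(toy_notes)

# With 400 training records, five folds hold 80 validation records each.
folds = list(k_fold_indices(400, k=5))

# Hypothetical labels: MDD, BD (type 1 bipolar disorder), SCZ.
acc, macro_f1 = accuracy_and_macro_f1(
    ["MDD", "BD", "SCZ", "MDD"], ["MDD", "BD", "SCZ", "BD"]
)
```

In this toy corpus a term that occurs in every note ("insomnia") gets weight zero, while a term unique to one note gets the largest weight, which is the property that lets TF-IDF emphasize diagnosis-specific vocabulary. Doc2vec and the classifiers named in the abstract need a dedicated library and are left out of the sketch.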

      References

      1 정지수, "문서 유사도를 통한 관련 문서 분류 시스템 연구" 한국방송∙미디어공학회 24 (24): 77-86, 2019

      2 허성완, "낚시성 인터넷 신문기사 검출을 위한 특징 추출" 한국정보과학회 43 (43): 1210-1215, 2016

      3 김정미, "Word2vec을 활용한 RNN기반의 문서 분류에 관한 연구" 한국지능시스템학회 27 (27): 560-565, 2017

      4 Ramos JA, "Using TF-IDF to determine word relevance in document queries" Rutgers 2003

      5 Hastie T, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction" Springer Series in Statistics 2001

      6 Weiss SM, "Text Mining : Predictive Methods for Analyzing Unstructured Information" Springer Science & Business Media 2010

      7 Srivastava AN, "Text Mining : Classification, Clustering, and Applications" Chapman and Hall/CRC 2009

      8 Craddock N, "Psychiatric diagnosis : impersonal, imperfect and important" 204 : 93-95, 2014

      9 Tran T, "Predicting mental conditions based on “history of present illness” in psychiatric notes with deep neural networks" 75 Suppl : S138-S148, 2017

      10 Geman S, "Neural Networks and the Bias/Variance Dilemma" MIT Press 1-58, 1992

      11 Bird S, "Natural Language Processing with Python:Analyzing Text with the Natural Language Toolkit" O’Reilly Media, Inc 2009

      12 Forsting M, "Machine learning will change medicine" 58 : 357-358, 2017

      13 Deo RC, "Machine learning in medicine" 132 : 1920-1930, 2015

      14 He H, "Learning from imbalanced data" 21 : 1263-1284, 2009

      15 Park EL, "KoNLPy: Korean natural language processing in Python" 2014

      16 Sadock BJ, "Kaplan and Sadock’s Synopsis of Psychiatry: Behavioral Sciences/Clinical Psychiatry" Wolters Kluwer Health 192-211, 2014

      17 Liu Y, "Imbalanced text classification: a term weighting approach" 36 : 690-701, 2009

      18 Bellman R, "Dynamic Programming" Princeton University Press 1957

      19 김도우, "Doc2Vec과 Word2Vec을 활용한 Convolutional Neural Network 기반 한국어 신문 기사 분류" 한국정보과학회 44 (44): 742-747, 2017

      20 Le Q, "Distributed representations of sentences and documents" 2014

      21 American Psychiatric Association, "Diagnostic and Statistical Manual of Mental Disorders: DSM-5" American Psychiatric Association 2013

      22 American Psychiatric Association, "Diagnostic and Statistical Manual of Mental Disorders: DSM-IV" American Psychiatric Association 1994

      23 Miotto R, "Deep patient : an unsupervised representation to predict the future of patients from the electronic health records" 6 : 26094-, 2016

      24 Chen MC, "Deep learning to classify radiology free-text reports" 286 : 845-852, 2018

      25 LeCun Y, "Deep learning" 521 : 436-444, 2015

      26 Tang J, "Data Classification: Algorithms and Applications" Chapman & Hall/CRC 37-64, 2015

      27 Kiers HA, "Data Analysis, Classification, and Related Methods" Springer-Verlag 181-186, 2000

      28 Banerjee I, "Comparative effectiveness of convolutional neural network(CNN)and recurrent neural network(RNN)architectures for radiology text report classification" 97 : 79-88, 2019

      29 Giger ML, "Biomedical Information Technology" Elsevier 359-370, 2008

      30 Noorbakhsh-Sabet N, "Artificial intelligence transforms the future of health care" 132 : 795-801, 2019

      31 Forman G, "Apples-to-apples in cross-validation studies : pitfalls in classifier performance measurement" 12 : 49-57, 2010

      32 Guyon I, "An introduction to variable and feature selection" 3 : 1157-1182, 2003

      33 Fawcett T, "An introduction to ROC analysis" 27 : 861-874, 2006

      34 Cataltepe Z, "An improvement of centroid-based classification algorithm for text classification" 2007

      35 Kingma DP, "Adam: a method for stochastic optimization" 2015

      36 Nguyen P, "Deepr : a convolutional net for medical records" 21 : 22-30, 2017

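      Journal history

      Date        Type                  Details                                                        Listing status
      2026        Evaluation scheduled  Eligible for recertification evaluation (recertification)
      2020-01-01  Evaluation            Listed journal status maintained (recertification)             KCI-listed
      2017-01-01  Evaluation            Listed journal status maintained (continuing evaluation)       KCI-listed
      2013-01-01  Evaluation            Listing evaluation, first round: FAIL (listing maintained)     KCI-listed
      2010-01-01  Evaluation            Listed journal status maintained (listing maintained)          KCI-listed
      2007-01-01  Evaluation            Selected as listed journal (candidate, second round)           KCI-listed
      2006-01-01  Evaluation            Listing candidate, first round: PASS (candidate, first round)  KCI candidate
      2004-01-01  Evaluation            Selected as listing-candidate journal (new evaluation)         KCI candidate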

      Journal citation information (reference year 2016)

      Metric                        Value
      WOS-KCI combined IF (2-year)  0.19
      KCIF (2-year)                 0.19
      KCIF (3-year)                 0.19
      KCIF (4-year)                 0.14
      KCIF (5-year)                 0.15
      Centrality index (3-year)     0.475
      Immediacy index               0
