RISS 학술연구정보서비스

검색
다국어 입력

http://chineseinput.net/에서 pinyin(병음)방식으로 중국어를 변환할 수 있습니다.

변환된 중국어를 복사하여 사용하시면 됩니다.

예시)
  • 中文 을 입력하시려면 zhongwen을 입력하시고 space를누르시면됩니다.
  • 北京 을 입력하시려면 beijing을 입력하시고 space를 누르시면 됩니다.
닫기
    인기검색어 순위 펼치기

    RISS 인기검색어

      검색결과 좁혀 보기

      선택해제
      • 좁혀본 항목 보기순서

        • 원문유무
        • 음성지원유무
        • 원문제공처
          펼치기
        • 등재정보
          펼치기
        • 학술지명
          펼치기
        • 주제분류
          펼치기
        • 발행연도
          펼치기
        • 작성언어
        • 저자
          펼치기

      오늘 본 자료

      • 오늘 본 자료가 없습니다.
      더보기
      • 무료
      • 기관 내 무료
      • 유료
      • KCI등재

        오피니언 분류의 감성사전 활용효과에 대한 연구

        김승우(Seungwoo Kim),김남규(Namgyu Kim) 한국지능정보시스템학회 2014 지능정보연구 Vol.20 No.1

        Recently, with the advent of various information channels, the number of has continued to grow. The main cause of this phenomenon can be found in the significant increase of unstructured data, as the use of smart devices enables users to create data in the form of text, audio, images, and video. In various types of unstructured data, the user’s opinion and a variety of information is clearly expressed in text data such as news, reports, papers, and various articles. Thus, active attempts have been made to create new value by analyzing these texts. The representative techniques used in text analysis are text mining and opinion mining. These share certain important characteristics; for example, they not only use text documents as input data, but also use many natural language processing techniques such as filtering and parsing. Therefore, opinion mining is usually recognized as a sub-concept of text mining, or, in many cases, the two terms are used interchangeably in the literature. Suppose that the purpose of a certain classification analysis is to predict a positive or negative opinion contained in some documents. If we focus on the classification process, the analysis can be regarded as a traditional text mining case. However, if we observe that the target of the analysis is a positive or negative opinion, the analysis can be regarded as a typical example of opinion mining. In other words, two methods (i.e., text mining and opinion mining) are available for opinion classification. Thus, in order to distinguish between the two, a precise definition of each method is needed. In this paper, we found that it is very difficult to distinguish between the two methods clearly with respect to the purpose of analysis and the type of results. We conclude that the most definitive criterion to distinguish text mining from opinion mining is whether an analysis utilizes any kind of sentiment lexicon. We first established two prediction models, one based on opinion mining and the other on text mining. Next, we compared the main processes used by the two prediction models. Finally, we compared their prediction accuracy. We then analyzed 2,000 movie reviews. The results revealed that the prediction model based on opinion mining showed higher average prediction accuracy compared to the text mining model. Moreover, in the lift chart generated by the opinion mining based model, the prediction accuracy for the documents with strong certainty was higher than that for the documents with weak certainty. Most of all, opinion mining has a meaningful advantage in that it can reduce learning time dramatically, because a sentiment lexicon generated once can be reused in a similar application domain. Additionally, the classification results can be clearly explained by using a sentiment lexicon. This study has two limitations. First, the results of the experiments cannot be generalized, mainly because the experiment is limited to a small number of movie reviews. Additionally, various parameters in the parsing and filtering steps of the text mining may have affected the accuracy of the prediction models. However, this research contributes a performance and comparison of text mining analysis and opinion mining analysis for opinion classification. In future research, a more precise evaluation of the two methods should be made through intensive experiments.

      • KCI등재

        텍스트마이닝을 활용한 북한 관련 뉴스의 기간별 변화과정 고찰

        박철수 한국지능정보시스템학회 2019 지능정보연구 Vol.25 No.3

        The goal of this paper is to investigate changes in North Korea’s domestic and foreign policies through automated text analysis over North Korea represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to find the topics, methods, and trends of text mining research. We also investigate the characteristics and method of analysis of the text mining techniques, confirmed by analysis of the data. In this study, R program was used to apply the text mining technique. R program is free software for statistical computing and graphics. Also, Text mining methods allow to highlight the most frequently used keywords in a paragraph of texts. One can create a word cloud, also referred as text cloud or tag cloud. This study proposes a procedure to find meaningful tendencies based on a combination of word cloud, and co-occurrence networks. This study aims to more objectively explore the images of North Korea represented in South Korean newspapers by quantitatively reviewing the patterns of language use related to North Korea from 2016. 11. 1 to 2019. 5. 23 newspaper big data. In this study, we divided into three periods considering recent inter - Korean relations. Before January 1, 2018, it was set as a Before Phase of Peace Building. From January 1, 2018 to February 24, 2019, we have set up a Peace Building Phase. The New Year’s message of Kim Jong-un and the Olympics of Pyeong Chang formed an atmosphere of peace on the Korean peninsula. After the Hanoi Pease summit, the third period was the silence of the relationship between North Korea and the United States. Therefore, it was called Depression Phase of Peace Building. This study analyzes news articles related to North Korea of the Korea Press Foundation database(www.bigkinds.or.kr) through text mining, to investigate characteristics of the Kim Jong-un regime’s South Korea policy and unification discourse. The main results of this study show that trends in the North Korean national policy agenda can be discovered based on clustering and visualization algorithms. In particular, it examines the changes in the international circumstances, domestic conflicts, the living conditions of North Korea, the South’s Aid project for the North, the conflicts of the two Koreas, North Korean nuclear issue, and the North Korean refugee problem through the co-occurrence word analysis. It also offers an analysis of South Korean mentality toward North Korea in terms of the semantic prosody. In the Before Phase of Peace Building, the results of the analysis showed the order of 'Missiles’, 'North Korea Nuclear’, 'Diplomacy’, 'Unification’, and ' South–North Korean’. The results of Peace Building Phase are extracted the order of 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. The results of Depression Phase of Peace Building derived the order of ‘North Korea Nuclear’, ‘North and South Korea’, ‘Missile’, ‘State Department’, and ‘International’. There are 16 words adopted in all three periods. The order is as follows: ‘missile’, ‘North Korea Nuclear’, ’Diplomacy’, ‘Unification’, ‘North and South Korea’, ‘Military’, ‘Kaesong Industrial Complex’, ‘Defense’, ‘Sanctions’, ‘Denuclearization’, ‘Peace’, ‘Exchange and Cooperation’, and ‘South Korea’. We expect that the results of this study will contribute to analyze the trends of news content of North Korea associated with North Korea's provocations. And future research on North Korean trends will be conducted based on the results of this study. We will continue to study the model development for North Korea risk measurement that can anticipate and respond to North Korea's behavior in advance. We expect that the text mining analysis method and the scientific data analysis technique will be applied to North Korea and unification research field. Through thes... 북한의 변화와 동향 파악에 대한 연구는 북한관련 정책에 대한 방향을 결정하고 북한의 행위를 예측하여 사전에 대응 할 수 있다는 측면에서 매우 중요하다. 현재까지 북한 동향에 대한 연구는 전문가를 중심으로 과거사례를 서술적으로 분석하여, 향후에 북한의 동향을 분석하고 대응하여 왔다. 이런 전문가 서술 중심의 북한 변화 및 동향 연구에서 비정형데이터를 이용한 텍스트마이닝 분석이 더해지면 보다 과학적인 북한 동향 분석이가능할 것이다. 특히 북한의 동향 파악과 북한의 대남 관련 행위와 연관된 연구는 통일 및 국방 분야에서 매우유용하며 필요한 분야이다. 본 연구에서는 북한의 신문 기사 내용을 활용한 텍스트마이닝 방법으로 북한과 관련한 핵심 단어를 구축하였다. 그리고 본 연구는 김정은 집권 이후 최근의 남북관계의 극적인 관계와 변화들을 기반으로 세 개의 기간을나누고 이 기간 내에 국내 언론에 나타난 북한과 관련성이 높은 단어들을 시계열적으로 분석한 연구이다. 북한과 관련한 주요 단어들을 세 개의 기간별로 분류하고 당시에 북한의 태도와 동향에 따라 해당 단어와 주제들의관련성이 어떻게 변화하였는지를 파악하였다. 본 연구는 텍스트마이닝을 이용한 연구가 남북관계 및 북한의 동향을 이해하고 분석하는 방법론으로서 얼마나 유용한 것이지를 파악하는 것이었다. 앞으로 북한의 동향 분석에 대한 연구는 물론 대북관계 및 정책에 대한방향을 결정하고, 북한의 행위를 사전에 예측하여 대응 할 수 있는 북한 리스크 측정 모델 구축을 위한 연구로진행 될 것이다.

      • KCI등재

        텍스트 마이닝을 이용한 Claim Data 분석

        오효성,조성기,강창완,임동순 한국자료분석학회 2010 Journal of the Korean Data Analysis Society Vol.12 No.1

        Nowadays most of the information in industry and business are stored electronically, in the form of text databases. Data stored in most text databases are nearly unstructured data. Thus studies on the modelling and implementation of unstructured data are needed and text mining has become an increasingly popular and essential theme in data mining. Text mining is data mining (applied to text) combined with natural language processing. In this study, we derive the segmentation and grouping results based on text data, that is, fashion company's customer claim data using text mining. As the results of text mining, we found that the claims for exchanges of products and the claims for service of stores were related. Finally we suggested this method for customer experience management(CEM) based on text mining. 현재까지 기업에서는 정형화된 데이터를 주 대상으로 고객을 세분화하고 관리하고 있는 실정이나, 최근에 들어 텍스트에서 의사결정에 수렴할 만한 가치 있고 유용한 정보를 찾아내는 작업의 중요성이 부각되고 있다. 텍스트 데이터는 수치 데이터와 달리 자연어로 구성된 비구조적 데이터로 이루어져 있어, 텍스트 간의 잠재된 정보를 찾는 방법이 텍스트 마이닝이다. 본 연구에서는 텍스트 마이닝을 사용하여 패션회사 고객의 Claim건에 대해 텍스트 간의 연관성을 토대로 분류 및 군집화한 결과를 나타내었다. 분석 결과에 따르면 제품에 대한 교환과 사이즈에 대한 교환으로 인한 Claim건과 매장의 서비스에 대한 Claim건이 연관성이 있음을 알 수 있고, 이는 정형화된 데이터 분석으로 도출할 수 없는 정보가 나타나고 있다. 그리고 클러스터로 나누어진 단어에서도 특성이 있는 결과가 도출되었으며 이 결과를 바탕으로 고객관계관리(CRM)를 넘어선 고객경험관리(CEM)를 하기 위한 하나의 방법을 제시하였다.

      • Text Mining : Extraction of Interesting Association Rule with Frequent Itemsets Mining for Korean Language from Unstructured Data

        Irfan Ajmal Khan,Junghyun Woo,Ji-Hoon Seo,Jin-Tak Choi 보안공학연구지원센터 2015 International Journal of Multimedia and Ubiquitous Vol.10 No.11

        Text mining is a specific method to extract knowledge from structured and unstructured data. This extracted knowledge from text mining process can be used for further usage and discovery. This paper presents the method for extraction information from unstructured text data and the importance of Association Rules Mining, specifically for of Korean language (text) and also, NLP (Natural Language Processing) tools are explained. Association Rules Mining (ARM) can also be used for mining association between itemsets from unstructured data with some modifications. Which can then, help for generating statistical thesaurus, to mine grammatical rules and to search large data efficiently. Although various association rules mining techniques have successfully used for market basket analysis but very few has applied on Korean text. A proposed Korean language mining method calculates and extracts meaningful patterns (association rules) between words and presents the hidden knowledge. First it cleans and integrates data, select relevant data then transform into transactional database. Then data mining techniques are used on data source to extract hidden patterns. These patterns are evaluated by specific rules until we get the valid and satisfactory result. We have tested on Korean news corpus and results have shown that it has worked well, and the results were adequate enough to research further.

      • KCI등재

        Text-Mining of Online Discourse to Characterize the Nature of Pain in Low Back Pain

        ( Young Uk Ryu ) 대한물리의학회 2019 대한물리의학회지 Vol.14 No.3

        PURPOSE: Text-mining has been shown to be useful for understanding the clinical characteristics and patients’ concerns regarding a specific disease. Low back pain (LBP) is the most common disease in modern society and has a wide variety of causes and symptoms. On the other hand, it is difficult to understand the clinical characteristics and the needs as well as demands of patients with LBP because of the various clinical characteristics. This study examined online texts on LBP to determine of text-mining can help better understand general characteristics of LBP and its specific elements. METHODS: Online data from www.spine-health.com were used for text-mining. Keyword frequency analysis was performed first on the complete text of postings (full-text analysis). Only the sentences containing the highest frequency word, pain, were selected. Next, texts including the sentences were used to re-analyze the keyword frequency (pain-text analysis). RESULTS: Keyword frequency analysis showed that pain is of utmost concern. Full-text analysis was dominated by structural, pathological, and therapeutic words, whereas pain-text analysis was related mainly to the location and quality of the pain. CONCLUSION: The present study indicated that text-mining for a specific element (keyword) of a particular disease could enhance the understanding of the specific aspect of the disease. This suggests that a consideration of the text source is required when interpreting the results. Clinically, the present results suggest that clinicians pay more attention to the pain a patient is experiencing, and provide information based on medical knowledge.

      • KCI등재후보

        Is Text Mining on Trade Claim Studies Applicable? Focused on Chinese Cases of Arbitration and Litigation Applying the CISG

        Cheon Yu,DongOh Choi,Yun-Seop Hwang 한국무역학회 2020 Journal of Korea trade Vol.24 No.8

        Purpose – This is an exploratory study that aims to apply text mining techniques, which computationally extracts words from the large-scale text data, to legal documents to quantify trade claim contents and enables statistical analysis. Design/methodology – This is designed to verify the validity of the application of text mining techniques as a quantitative methodology for trade claim studies, that have relied mainly on a qualitative approach. The subjects are 81 cases of arbitration and court judgments from China published on the website of the UNCITRAL where the CISG was applied. Validation is performed by comparing the manually analyzed result with the automatically analyzed result. The manual analysis result is the cluster analysis wherein the researcher reads and codes the case. The automatic analysis result is an analysis applying text mining techniques to the result of the cluster analysis. Topic modeling and semantic network analysis are applied for the statistical approach. Findings – Results show that the results of cluster analysis and text mining results are consistent with each other and the internal validity is confirmed. And the degree centrality of words that play a key role in the topic is high as the between centrality of words that are useful for grasping the topic and the eigenvector centrality of the important words in the topic is high. This indicates that text mining techniques can be applied to research on content analysis of trade claims for statistical analysis. Originality/value – Firstly, the validity of the text mining technique in the study of trade claim cases is confirmed. Prior studies on trade claims have relied on traditional approach. Secondly, this study has an originality in that it is an attempt to quantitatively study the trade claim cases, whereas prior trade claim cases were mainly studied via qualitative methods. Lastly, this study shows that the use of the text mining can lower the barrier for acquiring information from a large amount of digitalized text.

      • 기후환경 이슈 분석을 위한 텍스트 마이닝 활용방안 연구

        진대용 ( Daeyong Jin Et Al. ),강성원,최희선,한국진,김도연 한국환경정책평가연구원 2018 기후환경정책연구 Vol.2018 No.-

        본 연구는 환경 텍스트 데이터를 활용하여 주요 기후환경 이슈를 분석하기 위한 텍스트 마이닝 방법론의 활용방안을 탐색하였다. 환경 이슈를 분석하기 위해 활용할 수 있는 환경 텍스트들을 파악하고 각 텍스트에 대해 텍스트 마이닝 또는 빅데이터 분석 방법론을 활용하여 어떤 결과를 도출할 수 있는지 파악 및 점검하였다.먼저 텍스트 마이닝의 개념을 정의하고 환경(정책)연구에서 텍스트 마이닝 기법들의 활용 현황을 파악하였다. 텍스트 마이닝은 텍스트 데이터로부터 의미 있는 정보를 추출하는 과정이 다. ICT의 발전과 비정형 텍스트 분석을 위한 다양한 텍스트 마이닝 방법론이 등장함에 따라 대용량의 텍스트 데이터들로부터 과거의 주요 이슈를 파악하고 이들의 동향을 분석하여 미래 주요 이슈들의 동향에 대한 예측하는 연구가 다양한 분야에서 수행되고 있고 의미 있는 결과를 도출하고 있다. 환경(정책)연구에서도 텍스트 마이닝을 활용하여 연구 결과를 도출하고 있다. 하지만 다양한 분석을 통해 여러 관점에서 결과를 도출하는 과정의 중요성보다 결과 분석 및 해석에 초점이 맞춰져 있고, 연구를 수행하는 과정에 활용된 데이터나 소스코드 등은 다시 활용되지 않아 데이터 분석 연구의 장점을 충분히 발휘하지 못한 부분이 있다. 본 연구에서는 텍스트 마이닝의 강점인 데이터 분석의 자동화와 지속적인 활용성 측면을 극대화하기 위해 노력을 하였다. 본 연구에서는 이 목표를 달성하기 위해 다양한 환경 텍스트 데이터 수집 및 분석 기능을 포함시킨 환경 텍스트 분석 프레임워크를 구축하였으며, 모든 소스코드를 공개하고 데이터 분석에 익숙하지 않은 사용자를 위해 주요 기능을 웹 서비스 형태로 구현하였다.다음으로는 구축된 환경 텍스트 분석 프레임워크를 활용하여 환경 텍스트 데이터의 수집 및 분석을 수행하였다. 먼저 네이버 환경뉴스, 환경부 보도자료, 환경부 e-환경뉴스, 환경백서 데이터를 수집하는 알고리즘을 구축하고 주기적으로 크롤링을 수행하여 데이터 서버에 저장하도록 하였다. 또한 이를 바로 데이터 분석에 활용하여 최신 데이터를 분석할 수 있도록 하였다.본 연구에서는 기후환경 이슈에 대한 분석을 집중적으로 수행하였는데, 각 텍스트 데이터를 분석하여 개별 결과를 도출하였다. 환경 전체 분야를 보았을 때 ‘미세먼지’, ‘폭염’, ‘친환경’, 등의 키워드가 상대적으로 증가세를 보이고 있었으며, ‘기후변화’ 키워드의 경우에는 전체적으로 줄어드는 경향을 보이고 있었다. 이는 ‘기후변화’라는 키워드보다는 ‘기후변화’ 중 재난/재 해(폭염, 한파 등)와 같은 세부현상메 대한 기사가 많아졌고, ‘기후변화’ 키워드를 포함하지 않는 문서가 많아진 것에 기인한 것으로 판단된다. 세부적으로 네이버 환경뉴스의 경우 전반적으로 기후변화에 관련 정보 및 피해(폭염, 한파, 홍수 등)에 관련된 이슈들을 많이 포함하고 있어 전반적인 기후환경 이슈 분석에 유용함을 확인할 수 있었다. 네이버 환경뉴스에서 ‘기후 변화’의 근본적인 내용인 지구온난화현상이나 온실가스 감축 등과 같은 내용이 시간이 지날수록 줄어들고 최근에는 ‘폭염’, ‘가뭄’, ‘한파’ 등과 같은 세부현상들의 키워드를 포함하는 문서가 상대적으로 많아지는 추세를 보이고 있었다. 환경부 보도자료 및 e-환경뉴스에서는 기후변화 세부현상(폭염, 한파, 폭설 등) 하나하나에 대해 거의 다루고 있지 않았으며, ‘기후변화’라는 큰 틀에서 정책 논의나 앞으로의 방향에 대한 내용들을 포함하고 있어서 기후변화에 있어 근본적인 내용에 대한 이슈 및 흐름을 파악할 수 있는 장점이 있었다. 환경백서의 경우 키워드의 수는 많지 않았지만 ‘미세먼지’, ‘폭염’ 등 최신 주요 키워드들이 뚜렷하게 나타나고 있고, 다른 문서들과 달리 기후변화 키워드는 계속 증가하는 추세를 보이고 있어 실제 기후변화 문제 해결을 위한 많은 정책 논의가 있는 것으로 보인다.본 연구에서 활용한 LDA, Word2Vec 문장단위 키워드 분석, 문서단위 키워드 분석, 키워드 네트워크 분석, 문서 요약 등의 방법론은 앞으로 다양한 환경 텍스트에 포함된 이슈 발굴 및 분석에 유용하게 활용될 것으로 보인다. 또한 구축된 환경 텍스트 분석 프레임워크 및 웹 서비스를 활용할 수 있는 방안을 기술하였고, 연구 결과를 분석하여 도출된 결과를 활용한 환경 정책 사례를 제시하였다.본 연구의 결과물은 향후 환경 정책연구자들이 관련 정책을 수립할 때 데이터에 기반한 근거로 활용할 수 있으며, 앞으로 보다 다양한 텍스트 분석을 통해 민간, 언론, 환경연구자, 정책 공급자 등 다양한 관점을 고려한 정책 수립에 기여할 것으로 기대한다. In this study, we look at the application of text mining methodology to analyze major climatic environmental issues using environmental text data. We investigate environmental texts that can be used to analyze environmental issues and for each text, we understand and check what results could be derived.First, we define the concept of text mining and understand the usage of it in environment (policy) research. Text mining is the process of extracting meaningful information from text data. With the advance of ICT technology and various text mining methodologies for unstructured text analysis, research to identify trends in major issues from large-scale text data and to analyze trends in order to predict trends in future major issues is being conducted across various fields and has meaningful results. However, the focus is on the results analysis and interpretation rather than on the importance of the process of deriving the results from various perspectives through various analyses. Data and source code used in the process of research are not reused, so some of the advantages of data analysis is not fully demonstrated. In this study, we tried to maximize the automation and continuous utilization of data analysis, which is the strength of text mining. In this study, we constructed an environment text analysis framework that includes various environmental text data collection and analysis functions for all users who are unfamiliar with data analysis. We have released all the source code and implemented the key functions as a web service so that users who are not familiar with data analysis can use it.Next, we collected and analyzed environmental text data using the built environment text analysis framework. We constructed an algorithm to collect data from Naver environment news,Ministry of Environment press releases, Ministry of Environment e-environment news, environmental white papers and periodicals. Its crawls the data and stores it on the data server. In addition,the data is used to enable analysis of the latest data.Next, we constructed algorithms for analyzing the environmental text data, and results of the analysis were derived from this. As a result, keywords such as 'fine dust’,'heat waves’, and ’environmentally friendly1 had relatively increased, while the keyword 'climate change' showed a tendency to decrease overall. This seems to be due to a lot of articles about the detailed phenomena of ’climate change1 such as 'heat waves’,and ’cold waves' rather than the keyword 'climate change’. In detail, Naver’s environmental news includes a lot of issues related to climate change information and detailed phenomena (heat, cold wave, flood, etc.), and is useful for analyzing overall climate environment issues. The content for ’global climate change’,such as the phenomenon of global wanning and greenhouse gas reduction, has decreased over time. On Naver environmental news,the fundamental content for climate change, such as global warming and greenhouse gas reductions, declined over time and in recent years, there have been a relatively large number of documents containing keywords related to detailed phenomena such as 'heat waves’, ’drought’ and ’cold waves’. The Ministry of Environment’s press release and the Ministry of Environment e-environment news did not cover every detail of climate change phenomenon (heat,cold waves, heavy snow, etc.). It includes policy discussions and the future direction on the major trend of climate change, so it has an advantage in understanding the issues and flow of fundamental content in climate change. In the case of environmental white papers, the frequency of keywords is not high, but the latest important keywords such as ’fine dust’ and 'heat waves’ are showing an increasing trend. Unlike other documents, the keyword of ‘climate change9 is also continuously increasing. There appears to be a lot of policy discussion on climate change issues in the environmental white papers.Methodologies utilized in this study such as LDA, Word2Vec, sentence-based keyword analysis, document-based keyword analysis, keyword network analysis, and document summarization can be used to identify and analyze various climate issues in the future. In addition, we described how to utilize the built environment text analysis framework and web service, and presented environmental policy examples using the results of the analysis.Based on this research, environmental policy researchers are expected to be able to establish policies based on data, and contribute to the establishment of policies that take into account various perspectives such as private citizens, the media, environmental researchers, and policy providers through various text analyses.

      • KCI등재

        Text Mining in Biomedical Domain with Emphasis on Document Clustering

        Vinaitheerthan Renganathan 대한의료정보학회 2017 Healthcare Informatics Research Vol.23 No.3

        Objectives: With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. Methods: This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Results: Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Conclusions: Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

      • KCI등재

        Applying Academic Theory with Text Mining to Offer Business Insight : Illustration of Evaluating Hotel Service Quality

        Choong C. Lee,Kun Kim,Haejung Yun 한국경영정보학회 2019 Asia Pacific Journal of Information Systems Vol.29 No.5

        Now is the time for IS scholars to demonstrate the added value of academic theory through its integration with text mining, clearly outline how to implement this for text mining experts outside of the academic field, and move towards establishing this integration as a standard practice. Therefore, in this study we develop a systematic theory-based text-mining framework (TTMF), and illustrate the use and benefits of TTMF by conducting a text-mining project in an actual business case evaluating and improving hotel service quality using a large volume of actual user-generated reviews. A total of 61,304 sentences extracted from actual customer reviews were successfully allocated to SERVQUAL dimensions, and the pragmatic validity of our model was tested by the OLS regression analysis results between the sentiment scores of each SERVQUAL dimension and customer satisfaction (star rates), and showed significant relationships. As a post-hoc analysis, the results of the co-occurrence analysis to define the root causes of positive and negative service quality perceptions and provide action plans to implement improvements were reported.

      • An Ontology Based Text Analytics on Social Media

        Pankajdeep Kaur,Pallavi Sharma,Nikhil Vohra 보안공학연구지원센터 2015 International Journal of Database Theory and Appli Vol.8 No.5

        The amount of digital information that is created and used is progressively rising along with the growth of sophisticated hardware and software. In addition, real-world data come in a diversity of forms and can be tremendously bulky. This has augmented the need for powerful algorithms that can deduce and dig out appealing facts and useful information from these data. Text Mining (TM), which is a very complex process; has been successfully used for this purpose. Text mining alternately referred to as text data mining, more or less equivalent to text analytics, can be defined as the process of extracting high-quality information from text. Text mining involves the process of structuring the input data, deriving patterns within the structured data and lastly interpretation and revelation of the output. This paper provides outline on text analytics and social media analytics. At the end, this paper presents our proposed work based on ontology framework to cope up with excessive social media textual data.

      연관 검색어 추천

      이 검색어로 많이 본 자료

      활용도 높은 자료

      해외이동버튼