RISS Academic Research Information Service

      • KCI-indexed

        Sarcasm Detection in Twitter - Performance Impact While Using Data Augmentation: Word Embeddings

        Alif Tri Handoyo, Hidayaturrahman, Derwin Suhartono, Criscentia Jessica Setiadi. Korean Institute of Intelligent Systems (한국지능시스템학회), 2022. INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.22 No.4

        Sarcasm is the use of words commonly used to ridicule someone or for humorous purposes. Several studies on sarcasm detection have utilized different learning algorithms. However, most of these learning models have focused only on the content of the expression, leaving the contextual information in isolation. As a result, they failed to capture the contextual information in the sarcastic expression. Moreover, some datasets used in several studies are unbalanced, which affects the model results. In this paper, we propose a contextual model for sarcasm identification in Twitter using various pre-trained models and augmenting the dataset by applying Global Vector representation (GloVe) for the construction of word embedding and context learning to generate more sarcastic data, and also perform additional experiments by using the data duplication method. The impact of data augmentation and duplication is tested on various datasets and augmentation sizes. In particular, we achieved the best performance after using the data augmentation method to increase the data labeled as sarcastic by 20%, improving the F1 score by 2.1 percentage points to 40.44%, compared to 38.34% before using data augmentation, on the iSarcasm dataset.
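The abstract does not spell out how the embeddings are used to generate new sarcastic samples. One common reading is nearest-neighbour word substitution, and the sketch below illustrates that idea only. It is not the authors' implementation: `TOY_VECTORS` is a hypothetical hand-made table standing in for pre-trained GloVe vectors, and the helper names (`nearest_word`, `augment`, `augment_minority`) are invented for this example.

```python
import math
import random

# Hypothetical 2-d vectors standing in for pre-trained GloVe embeddings.
TOY_VECTORS = {
    "great": (0.9, 0.1),
    "awesome": (0.88, 0.15),
    "terrible": (-0.9, 0.1),
    "monday": (0.0, 1.0),
    "tuesday": (0.05, 0.98),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_word(word, vectors):
    """Return the most similar other word in the vocabulary, or None if unknown."""
    if word not in vectors:
        return None
    candidates = [w for w in vectors if w != word]
    return max(candidates, key=lambda w: cosine(vectors[word], vectors[w]))


def augment(text, vectors):
    """Replace every in-vocabulary token with its nearest neighbour (lowercased)."""
    out = []
    for tok in text.split():
        repl = nearest_word(tok.lower(), vectors)
        out.append(repl if repl is not None else tok)
    return " ".join(out)


def augment_minority(samples, ratio, vectors, seed=0):
    """Append round(ratio * n_sarcastic) augmented copies of sarcastic samples.

    `samples` is a list of (text, label) pairs, with label 1 meaning sarcastic.
    """
    rng = random.Random(seed)
    minority = [text for text, label in samples if label == 1]
    n_new = round(ratio * len(minority))
    picked = [rng.choice(minority) for _ in range(n_new)]
    return samples + [(augment(text, vectors), 1) for text in picked]
```

With `ratio=0.2`, the function grows the sarcastic class by 20%, which matches the augmentation size the abstract reports as best. Plain duplication, the baseline the abstract also tests, would append the picked texts unchanged instead of calling `augment`.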

      • KCI-indexed

        Personality Prediction Based on Text Analytics Using Bidirectional Encoder Representations from Transformers from English Twitter Dataset

        Joshua Evan Arijanto, Steven Geraldy, Cyrena Tania, Derwin Suhartono. Korean Institute of Intelligent Systems (한국지능시스템학회), 2021. INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.21 No.3

        Personality traits can be inferred from a person’s behavioral patterns. One example is when writing posts on social media. Extracting information about individual personalities can yield enormous benefits for various applications such as recommendation systems, marketing, or hiring employees. The objective of this research is to build a personality prediction system that uses English texts from Twitter as a dataset to predict personality traits. This research uses the Big Five personality traits theory to analyze personality traits, which consist of openness, conscientiousness, extraversion, agreeableness, and neuroticism. Several classifiers were used in this research, such as support vector machine, convolutional neural network, and variants of bidirectional encoder representations from transformers (BERT). To improve the performance, we implemented several feature extraction techniques, such as N-gram, linguistic inquiry and word count (LIWC), word embedding, and data augmentation. The best results were obtained by fine-tuning the BERT model and using it as the main classifier of the personality prediction system. We conclude that the BERT performance could be improved by using individual tweets instead of concatenated ones.

      • KCI-indexed

        Identifying Personality Traits for Indonesian User from Twitter Dataset

        Nicholaus Hendrik Jeremy, Cristian Prasetyo, Derwin Suhartono. Korean Institute of Intelligent Systems (한국지능시스템학회), 2019. INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.19 No.4

        Social media allows users to convey their actual selves and share their life experiences in numerous ways. This behavior in turn reflects the user’s personality. In this paper, we experiment with automatically predicting a user’s personality on Twitter based on the Big Five personality traits. Our focus is on Indonesian users. In addition to word n-grams, Twitter metadata is used in certain combinations to determine the features used to predict personality. Our research also attempts to find the optimum setting based on the number of n-grams, the classifier, and the Twitter metadata. Our experiments yield an F-measure of at most 0.7482. We conclude that, among all scenarios, Twitter metadata is the least impactful feature, while word n-grams have the most impact.
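As a rough illustration of the word n-gram features mentioned above, the following sketch counts every word n-gram up to a chosen length in a single tweet. The function names and the whitespace tokenizer are assumptions made for this example; the abstract does not describe the paper's actual preprocessing or feature pipeline.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text`, using lowercase whitespace tokens."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n):
    """Count all word n-grams for n = 1 .. max_n (a simple bag-of-n-grams)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts
```

For instance, `word_ngrams("saya suka makan nasi", 2)` yields the three bigrams `"saya suka"`, `"suka makan"`, and `"makan nasi"`. The resulting counts could then be combined with metadata fields into one feature vector per user before training a classifier.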
