Sarcasm Detection in Twitter - Performance Impact While Using Data Augmentation: Word Embeddings
Alif Tri Handoyo, Hidayaturrahman, Derwin Suhartono, Criscentia Jessica Setiadi 한국지능시스템학회 (Korean Institute of Intelligent Systems) 2022 INTERNATIONAL JOURNAL of FUZZY LOGIC and INTELLIGE Vol.22 No.4
Sarcasm is the use of words commonly used to ridicule someone or for humorous purposes. Several studies on sarcasm detection have utilized different learning algorithms. However, most of these learning models have focused only on the content of the expression, leaving the contextual information in isolation. As a result, they failed to capture the contextual information in the sarcastic expression. Moreover, some of the datasets used in these studies are unbalanced, which affects the model results. In this paper, we propose a contextual model for sarcasm identification in Twitter using various pre-trained models. We augment the dataset by applying Global Vectors (GloVe) representations for the construction of word embeddings and context learning to generate more sarcastic data, and we also perform additional experiments using the data duplication method. The impact of data augmentation and duplication is tested on various datasets and augmentation sizes. In particular, we achieved the best performance after using the data augmentation method to increase the data labeled as sarcastic by 20%, which improved performance by 2.1 percentage points, with an F1 score of 40.44% compared to 38.34% before data augmentation on the iSarcasm dataset.
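The abstract does not spell out the augmentation procedure, so the following is only a minimal sketch of one common way embedding-based augmentation can be done: replace some tokens with their nearest neighbor in embedding space, and generate enough new samples to grow the sarcastic class by a chosen ratio (20% here). The tiny `TOY_VECTORS` table is a hypothetical stand-in for pre-trained GloVe vectors, and the function names, the `replace_prob` parameter, and the label convention (1 = sarcastic) are assumptions made for illustration, not the authors' implementation.

```python
import math
import random

# Hypothetical toy vectors standing in for pre-trained GloVe embeddings.
TOY_VECTORS = {
    "great": [0.9, 0.1, 0.0],
    "wonderful": [0.85, 0.15, 0.05],
    "love": [0.1, 0.9, 0.0],
    "adore": [0.12, 0.88, 0.02],
    "monday": [0.0, 0.1, 0.9],
    "mondays": [0.02, 0.12, 0.88],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbor(word, vectors):
    """Return the most similar other word in the table, or None if unknown."""
    if word not in vectors:
        return None
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    return max(scored)[1] if scored else None


def augment_text(text, vectors, rng, replace_prob=0.5):
    """Randomly swap known tokens for their nearest embedding neighbor."""
    out = []
    for tok in text.split():
        neighbor = nearest_neighbor(tok.lower(), vectors)
        if neighbor is not None and rng.random() < replace_prob:
            out.append(neighbor)
        else:
            out.append(tok)
    return " ".join(out)


def augment_minority(samples, vectors, ratio=0.2, seed=0, target_label=1):
    """Append augmented copies so the target class grows by `ratio`."""
    rng = random.Random(seed)
    minority = [text for text, label in samples if label == target_label]
    n_new = round(len(minority) * ratio)
    new = [
        (augment_text(rng.choice(minority), vectors, rng), target_label)
        for _ in range(n_new)
    ]
    return samples + new
```

With real GloVe vectors, the lookup table would be loaded from a pre-trained file and the neighbor search would typically be restricted to the top few candidates. The data duplication baseline mentioned in the abstract would correspond to copying minority samples unchanged instead of calling `augment_text`.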