Case-Related News Filtering via Topic-Enhanced Positive-Unlabeled Learning
Guanwen Wang, Zhengtao Yu, Yantuan Xian, Yu Zhang Korea Information Processing Society 2021 Journal of Information Processing Systems Vol.17 No.6
Case-related news filtering is a crucial task in legal text mining: it divides news into case-related and case-unrelated categories. Because case-related news originates from various fields and is written in different styles, it is difficult to establish complete filtering rules or keyword lists for data collection. In addition, labeled corpora for case-related news are sparse, so training a high-performance classification model would require annotating a corpus. To address this challenge, we propose topic-enhanced positive-unlabeled learning, which selects positive and negative samples guided by topics. Specifically, a topic model based on a variational autoencoder (VAE) is trained to extract topics from unlabeled samples. Using these topics in the iterative process of positive-unlabeled (PU) learning improves the accuracy of identifying case-related news. Experimental results show that the F1 value of our method on the test set is 1.8% higher than that of the PU learning baseline model. Our method is also more robust when few initial samples are available and when the number of iterations is high. Compared with advanced PU learning baselines such as nnPU and I-PU, it obtains a 1.1% higher F1 value, which indicates that it can effectively identify case-related news.
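The abstract does not spell out how topics guide sample selection, so the following is only a minimal sketch of one way an iterative, topic-guided PU loop could look. It assumes each document already has a topic vector (for example, produced by a VAE-based topic model), and it replaces the paper's classifier with a simple cosine similarity to the centroid of the current positive set. The function names, the thresholds `hi` and `lo`, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def topic_guided_pu(positives, unlabeled, hi=0.9, lo=0.3, max_iters=5):
    """Iteratively move unlabeled documents into positive or negative sets.

    positives / unlabeled: dict mapping doc id -> topic vector (assumed given).
    hi / lo: hypothetical similarity thresholds for confident selection.
    Returns (positive set, negative set, remaining unlabeled pool).
    """
    pos, neg, pool = dict(positives), {}, dict(unlabeled)
    for _ in range(max_iters):
        # Topic profile of the current positive set; recomputed every round.
        profile = centroid(list(pos.values()))
        moved = False
        for doc_id, vec in list(pool.items()):
            score = cosine(vec, profile)
            if score >= hi:      # confidently case-related
                pos[doc_id] = pool.pop(doc_id)
                moved = True
            elif score <= lo:    # confidently case-unrelated
                neg[doc_id] = pool.pop(doc_id)
                moved = True
        if not moved:            # nothing confident left to select
            break
    return pos, neg, pool


# Toy two-topic vectors: (case-related topic, other topic).
seed = {"p1": [0.9, 0.1], "p2": [0.8, 0.2]}
candidates = {"u1": [0.85, 0.15], "u2": [0.1, 0.9], "u3": [0.5, 0.5]}
pos, neg, pool = topic_guided_pu(seed, candidates)
```

With these toy vectors, `u1` joins the positive set, `u2` becomes a negative sample, and the ambiguous `u3` stays unlabeled. In a real pipeline the similarity score would be combined with, or replaced by, the output of the trained classifier.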
A Method of Chinese and Thai Cross-Lingual Query Expansion Based on Comparable Corpus
Peili Tang, Jing Zhao, Zhengtao Yu, Zhuo Wang, Yantuan Xian Korea Information Processing Society 2017 Journal of Information Processing Systems Vol.13 No.4
Cross-lingual query expansion is usually based on relationships among monolingual words, whereas a bilingual comparable corpus contains relationships among bilingual words. This paper therefore proposes a query expansion method based on these bilingual relationships. First, word vectors that characterize the bilingual words are trained on a Chinese and Thai bilingual comparable corpus. Then, the correlation between Chinese query words and Thai words is computed from these word vectors, and Thai candidate expansion terms are selected according to the correlation value. Next, multiple groups of Thai query expansion sentences are built from the Thai candidate expansion words based on the Chinese query sentence. Finally, the optimal sentence is obtained using the Chinese and Thai query expansion method and is used to perform the Thai query expansion. Experimental results show that the proposed cross-lingual query expansion method effectively improves the accuracy of Chinese and Thai cross-language information retrieval.
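The candidate-selection step can be illustrated with a short sketch. It assumes that Chinese and Thai words already live in one shared vector space and that "correlation" is measured by cosine similarity; the abstract does not state the exact measure, so this is an assumption. The function names, the placeholder tokens (`th_a`, `th_b`, `th_c`), and the toy vectors are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_k_expansions(query_vec, target_vectors, k=2):
    """Return the k target-language words most similar to one query word vector.

    target_vectors: dict mapping a target-language word -> its vector.
    """
    scored = [(cosine(query_vec, vec), word) for word, vec in target_vectors.items()]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:k]]


# Placeholder Thai vocabulary in a toy 2-dimensional shared space.
thai_vectors = {
    "th_a": [1.0, 0.0],
    "th_b": [0.9, 0.1],
    "th_c": [0.0, 1.0],
}
chinese_query_word = [1.0, 0.0]  # vector of one Chinese query word (toy value)
candidates = top_k_expansions(chinese_query_word, thai_vectors, k=2)
```

Here `candidates` contains `th_a` and `th_b`, the two Thai placeholders closest to the Chinese query word. Building and ranking the expanded Thai query sentences, as described in the abstract, would be a further step on top of this selection.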
Aspect-Based Sentiment Analysis with Position Embedding Interactive Attention Network
Yan Xiang, Jiqun Zhang, Zhoubin Zhang, Zhengtao Yu, Yantuan Xian Korea Information Processing Society 2022 Journal of Information Processing Systems Vol.18 No.5
Aspect-based sentiment analysis aims to discover the sentiment polarity towards an aspect in user-generated natural language. So far, most methods use only the implicit position information of the aspect in the context, instead of directly utilizing the position relationship between the aspect and the sentiment terms. In fact, words neighboring the aspect terms should be given more attention than other words in the context. This paper studies the influence of different position embedding methods on the sentiment polarities of given aspects, and proposes a position embedding interactive attention network based on a long short-term memory network. First, it uses the position information of the context simultaneously in the input layer and the attention layer. Second, it mines the importance of different context words for the aspect with the interactive attention mechanism. Finally, it generates a valid representation of the aspect and the context for sentiment classification. The proposed model was evaluated on the datasets of Semantic Evaluation 2014 (SemEval-2014). Compared with other baseline models, the accuracy of our model increases by about 2% on the restaurant dataset and 1% on the laptop dataset.
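The idea that words near the aspect deserve more attention can be made concrete with a small sketch. It computes each token's relative distance to the aspect span, which is the kind of index a position embedding table would be looked up with, and then derives a linearly decaying proximity weight. The linear decay is a commonly used simplification and an assumption here, not the embedding scheme evaluated in the paper; the function names and the example sentence are hypothetical.

```python
def relative_positions(n_tokens, aspect_start, aspect_end):
    """Distance of every token to the aspect span [aspect_start, aspect_end).

    Tokens inside the aspect get distance 0; others get the number of
    steps to the nearest aspect token.
    """
    positions = []
    for i in range(n_tokens):
        if i < aspect_start:
            positions.append(aspect_start - i)
        elif i >= aspect_end:
            positions.append(i - aspect_end + 1)
        else:
            positions.append(0)
    return positions


def proximity_weights(positions):
    """Linearly decaying weights: 1.0 at the aspect, smaller farther away."""
    n = len(positions)
    return [1.0 - distance / n for distance in positions]


tokens = "the food was great but service slow".split()
# Aspect term "food" occupies token index 1 (end index is exclusive).
positions = relative_positions(len(tokens), aspect_start=1, aspect_end=2)
weights = proximity_weights(positions)
```

For this sentence the distances are `[1, 0, 1, 2, 3, 4, 5]`, so "great" (distance 2) receives a larger weight than "slow" (distance 5) with respect to the aspect "food". In a learned model, these distances would index trainable position embeddings rather than a fixed decay.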