MIL-BERT: A Korean Pre-trained Language Model Specialized for the Military Domain
허희순 (Hee-Soon Heo), 윤창민 (Chang-Min Yoon), 유영하 (Young-Ha Ryu), 용석현 (Seok-hyun Yong), 김두영 (Dooyoung Kim). Korea Society for Naval Science & Technology, 2023. Journal of the KNST, Vol. 6, No. 2.
In this paper, we propose a BERT model specialized for the military domain through additional pre-training. Existing BERT models are trained on generic corpora and are not optimized for specific domains. To address this limitation, we collected 1.1 million military sentences and 6,900 military terms from military news to construct a training corpus. We then built a tokenizer and trained the model with the masked language modeling (MLM) objective. To evaluate performance, we conducted military sentence classification experiments comparing MIL-BERT with the existing Korean BERT models KcBERT and KoBERT. The results show that MIL-BERT outperformed both models, achieving roughly 2% higher accuracy.
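The abstract describes a standard domain-adaptive pre-training recipe: build a domain corpus, adapt the tokenizer, and continue MLM training from an existing checkpoint. The paper does not name its training framework, hyperparameters, or data files, so the following is only a minimal sketch using Hugging Face Transformers; the base checkpoint, file paths, batch size, and epoch count are illustrative assumptions, and the vocabulary step approximates the paper's custom tokenizer by adding collected terms to an existing one.

```python
# Minimal sketch of domain-adaptive MLM pre-training with Hugging Face
# Transformers/Datasets. All names below (checkpoint, file paths,
# hyperparameters) are assumptions, not the paper's actual settings.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "beomi/kcbert-base"  # assumed starting checkpoint (the paper compares against KcBERT/KoBERT)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForMaskedLM.from_pretrained(BASE_MODEL)

# One simple way to fold in the 6,900 collected military terms: extend the
# base vocabulary (the paper builds its own tokenizer; this approximates it).
new_terms = ["함대", "기동전단", "대잠전"]  # illustrative entries
tokenizer.add_tokens(new_terms)
model.resize_token_embeddings(len(tokenizer))

# "military_corpus.txt" is a hypothetical file holding the 1.1M collected
# military sentences, one per line.
dataset = load_dataset("text", data_files={"train": "military_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking: 15% of tokens are masked per batch, the standard MLM objective.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="mil-bert",  # hypothetical output path
    per_device_train_batch_size=32,
    num_train_epochs=3,
    learning_rate=5e-5,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
```

The resulting checkpoint would then be fine-tuned on the military sentence classification task (e.g., via a sequence classification head) to obtain the accuracy comparison against KcBERT and KoBERT reported above.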