This study explores the effectiveness of Self-Consistency-based Prompt Tuning for improving performance on Korean MRC (Machine Reading Comprehension) tasks. While LLMs (Large Language Models) such as ChatGPT (Chat Generative Pre-trained Transformer) and Gemini have achieved impressive results in English-based NLP (Natural Language Processing), their performance often degrades in Korean due to linguistic complexity and limited high-quality datasets. To address this, we propose a Hard Prompt Tuning framework combined with a Self-Consistency inference strategy using the lightweight Gemma2 2B (2 billion parameter) model. The method generates multiple answers to a single question and selects the most consistent response, improving reliability and contextual accuracy. Experiments conducted on the KorQuAD (The Korean Question Answering Dataset) 1.0 and KLUE (Korean Language Understanding Evaluation) MRC datasets show significant gains in EM (Exact Match) and F1 score as the number of sampled responses increases. The proposed model outperforms other compact LLMs, including Qwen2 1.5B and Phi-4 mini, and even surpasses LLaMA3.1 8B on key metrics despite having fewer parameters. These results demonstrate that parameter-efficient prompt tuning with Self-Consistency can offer competitive accuracy and robustness for Korean MRC, especially in resource-constrained environments.
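The selection step described above (sample several answers, keep the most consistent one) can be illustrated with a minimal majority-vote sketch. This is not the authors' implementation: the function names `normalize_answer` and `self_consistent_answer` are hypothetical, the normalization rule (lowercasing, dropping ASCII punctuation, collapsing whitespace) is an assumption, and the sampling of candidate answers from the model is not shown.

```python
from collections import Counter
import string


def normalize_answer(text: str) -> str:
    """Assumed normalization: lowercase, drop ASCII punctuation, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def self_consistent_answer(samples: list[str]) -> str:
    """Return the sampled answer whose normalized form occurs most often.

    `samples` holds the answers the model produced for one question
    under stochastic decoding (hypothetical input, generated elsewhere).
    """
    if not samples:
        raise ValueError("at least one sampled answer is required")

    normalized = [normalize_answer(s) for s in samples]
    # Majority vote over the normalized answers.
    best_form, _ = Counter(normalized).most_common(1)[0]

    # Return the first raw sample that maps to the winning form.
    for raw, norm in zip(samples, normalized):
        if norm == best_form:
            return raw.strip()
    raise RuntimeError("unreachable: the winning form comes from the samples")


if __name__ == "__main__":
    candidates = ["1945년", "1945년.", "1950년", " 1945년 "]
    print(self_consistent_answer(candidates))
```

Voting on normalized strings rather than raw outputs keeps trivial differences such as trailing punctuation from splitting the vote. An actual system may aggregate differently, for example by weighting candidates with model likelihoods.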