Beyond Deletion: Improving LLM Reasoning Efficiency via Information-Theoretic Compression and Pruning
Korean title: 삭제를 넘어서: 정보 이론적 압축과 가지치기를 통한 LLM 추론 효율성 향상
Pohang University of Science and Technology

Multilingual Abstract

Chain-of-Thought (CoT) prompting has revolutionized the reasoning capabilities of Large Language Models (LLMs), yet it often incurs significant computational costs due to the generation of redundant, verbose, or irrelevant reasoning steps. While various Process Reward Models (PRMs) have been proposed to evaluate step-by-step reasoning, our analysis reveals that even state-of-the-art methods (specifically ReasonEval, ThinkPRM, and Qwen-Math-2.5-PRM) struggle to effectively distinguish “valid but redundant” steps, such as excessive decomposition or context-aware irrelevant details.

To systematically diagnose this limitation, we introduce RIV-GSM8K, a diagnostic benchmark synthetically injected with five distinct types of inefficiency, including Redundancy, Irrelevance, and Verbosity. Using this benchmark, we demonstrate the blind spots of existing PRMs in detecting these subtle inefficiencies.

Addressing these gaps, we propose CAID (Context-Aware Information Density), a novel reference-free metric that quantifies reasoning efficiency by integrating local novelty, normalized information density, and global goal alignment. Furthermore, we present PACE (Pruning And Compression for Efficiency), a lightweight, training-free post-hoc optimization framework that leverages CAID to dynamically compress trivial intermediate steps and prune irrelevant ones.

Experiments on GSM8K, StrategyQA, and ARC-Challenge show that PACE reduces token consumption by 31–53% while maintaining or even improving reasoning accuracy. These results demonstrate that information-theoretic optimization offers a more robust solution to the over-reasoning problem than existing PRM-based approaches.

Table of Contents

I. Introduction
1.1 Background and Motivation
1.2 The Blind Spot of Existing Evaluators
1.3 Proposed Approach
1.4 Contributions
II. Related Work
2.1 Efficient LLM Inference and Token Pruning
2.2 Evaluation of Reasoning Quality
2.3 Process Reward Models and Redundancy Detection
III. Proposed Method
3.1 RIV-GSM8K: Diagnosing Inefficiency
3.1.1 Benchmark Construction
3.1.2 Taxonomy of Inefficiency
3.1.3 Dataset Statistics
3.2 CAID: Context-Aware Information Density
3.2.1 Local Similarity
3.2.2 Global Goal Alignment
3.2.3 Information Density
3.2.4 Semantic Delta with Decaying Threshold
3.2.5 Decision Logic
3.3 PACE: Pruning And Compression for Efficiency
3.3.1 Step Classification
3.3.2 Sequential Chain Restructuring
3.3.3 Validation via Answer Re-generation
IV. Experimental Setup
4.1 Datasets
4.2 Backbone Models and Baselines
4.3 Implementation Details of CAID
4.4 Hyperparameter Configuration
V. Experiments and Analysis
5.1 Performance on RIV-GSM8K: Metric Validation
5.1.1 Detection of Explicit Inefficiency
5.1.2 Interpretation of Normal Accuracy
5.2 Efficiency Improvement with PACE
5.3 Analysis
5.3.1 Ablation Study of CAID Components
5.3.2 Ablation Study of PACE Optimization
VI. Conclusion
Summary (in Korean)
References
