RISS Academic Research Information Service

Beyond Deletion: Improving LLM Reasoning Efficiency via Information-Theoretic Compression and Pruning
Pohang University of Science and Technology
Korean title: 삭제를 넘어서: 정보 이론적 압축과 가지치기를 통한 LLM 추론 효율성 향상

https://www.riss.kr/link?id=T17416529

Multilingual Abstract

Chain-of-Thought (CoT) prompting has revolutionized the reasoning capabilities of Large Language Models (LLMs), yet it often incurs significant computational costs due to the generation of redundant, verbose, or irrelevant reasoning steps. While various Process Reward Models (PRMs) have been proposed to evaluate step-by-step reasoning, our analysis reveals that even state-of-the-art methods (specifically ReasonEval, ThinkPRM, and Qwen-Math-2.5-PRM) struggle to effectively distinguish “valid but redundant” steps, such as excessive decomposition or context-aware irrelevant details.

To systematically diagnose this limitation, we introduce RIV-GSM8K, a diagnostic benchmark synthetically injected with five distinct types of inefficiency, including Redundancy, Irrelevance, and Verbosity. Using this benchmark, we demonstrate the blind spots of existing PRMs in detecting these subtle inefficiencies.

Addressing these gaps, we propose CAID (Context-Aware Information Density), a novel reference-free metric that quantifies reasoning efficiency by integrating local novelty, normalized information density, and global goal alignment. Furthermore, we present PACE (Pruning And Compression for Efficiency), a lightweight, training-free post-hoc optimization framework that leverages CAID to dynamically compress trivial intermediate steps and prune irrelevant ones.

Experiments on GSM8K, StrategyQA, and ARC-Challenge show that PACE reduces token consumption by 31–53% while maintaining or even improving reasoning accuracy. These results demonstrate that information-theoretic optimization offers a more robust solution to the over-reasoning problem than existing PRM-based approaches.
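
The abstract names three signals (local novelty, information density, goal alignment) and two actions (compress, prune) but does not give formulas. The following is only a toy sketch of how such signals might drive a keep/compress/prune decision. Everything in it is an assumption made for illustration: the bag-of-words cosine similarity, the unique-token ratio used as a density proxy, the function names, the thresholds, and the mapping of low alignment to "prune" and low novelty or density to "compress". It is not the thesis's CAID metric or PACE framework.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word/number tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity (assumed proxy for semantic similarity)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def toy_signals(step, previous, question):
    """Three hypothetical signals loosely mirroring those named in the abstract."""
    toks = tokens(step)
    return {
        # Local novelty: 1 minus the highest similarity to any earlier step.
        "novelty": 1.0 - max((cosine(step, p) for p in previous), default=0.0),
        # Density proxy: share of distinct tokens in the step.
        "density": len(set(toks)) / len(toks) if toks else 0.0,
        # Goal alignment: similarity between the step and the question.
        "alignment": cosine(step, question),
    }


def classify_steps(steps, question, min_alignment=0.05,
                   min_novelty=0.3, min_density=0.5):
    """Label each step 'keep', 'compress', or 'prune' (thresholds are made up)."""
    labels = []
    for i, step in enumerate(steps):
        s = toy_signals(step, steps[:i], question)
        if s["alignment"] < min_alignment:
            labels.append("prune")      # unrelated to the question
        elif s["novelty"] < min_novelty or s["density"] < min_density:
            labels.append("compress")   # repeats earlier content
        else:
            labels.append("keep")
    return labels


question = "Tom has 3 apples and buys 5 more apples. How many apples does Tom have?"
steps = [
    "Tom starts with 3 apples.",
    "Tom starts with 3 apples.",
    "The weather is sunny today.",
    "Tom buys 5 more apples so Tom has 3 + 5 = 8 apples.",
]
print(classify_steps(steps, question))
# → ['keep', 'compress', 'prune', 'keep']
```

In this toy run the repeated step is marked for compression and the off-topic step for pruning. A realistic version would need a proper semantic similarity model and a validation stage that checks the answer still holds after the chain is shortened.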

Table of Contents

• I. Introduction
  • 1.1 Background and Motivation
  • 1.2 The Blind Spot of Existing Evaluators
  • 1.3 Proposed Approach
  • 1.4 Contributions
• II. Related Work
  • 2.1 Efficient LLM Inference and Token Pruning
  • 2.2 Evaluation of Reasoning Quality
  • 2.3 Process Reward Models and Redundancy Detection
• III. Proposed Method
  • 3.1 RIV-GSM8K: Diagnosing Inefficiency
    • 3.1.1 Benchmark Construction
    • 3.1.2 Taxonomy of Inefficiency
    • 3.1.3 Dataset Statistics
  • 3.2 CAID: Context-Aware Information Density
    • 3.2.1 Local Similarity
    • 3.2.2 Global Goal Alignment
    • 3.2.3 Information Density
    • 3.2.4 Semantic Delta with Decaying Threshold
    • 3.2.5 Decision Logic
  • 3.3 PACE: Pruning And Compression for Efficiency
    • 3.3.1 Step Classification
    • 3.3.2 Sequential Chain Restructuring
    • 3.3.3 Validation via Answer Re-generation
• IV. Experimental Setup
  • 4.1 Datasets
  • 4.2 Backbone Models and Baselines
  • 4.3 Implementation Details of CAID
  • 4.4 Hyperparameter Configuration
• V. Experiments and Analysis
  • 5.1 Performance on RIV-GSM8K: Metric Validation
    • 5.1.1 Detection of Explicit Inefficiency
    • 5.1.2 Interpretation of Normal Accuracy
  • 5.2 Efficiency Improvement with PACE
  • 5.3 Analysis
    • 5.3.1 Ablation Study of CAID Components
    • 5.3.2 Ablation Study of PACE Optimization
• VI. Conclusion
• Summary (in Korean)
• References
