Efficient Reasoning Models: A Survey
April 15, 2025
Authors: Sicheng Feng, Gongfan Fang, Xinyin Ma, Xinchao Wang
cs.AI
Abstract
Reasoning models have demonstrated remarkable progress in solving complex and
logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to
arriving at a final answer. Yet, the emergence of this "slow-thinking"
paradigm, with numerous tokens generated in sequence, inevitably introduces
substantial computational overhead. This creates an urgent need
for effective acceleration. This survey aims to provide a comprehensive
overview of recent advances in efficient reasoning. It categorizes existing
works into three key directions: (1) shorter: compressing lengthy CoTs into
concise yet effective reasoning chains; (2) smaller: developing compact
language models with strong reasoning capabilities through techniques such as
knowledge distillation, other model compression techniques, and reinforcement
learning; and (3) faster: designing efficient decoding strategies to
accelerate inference. A curated collection of papers discussed in this survey
is available in our GitHub repository.