Efficient Reasoning Models: A Survey

Abstract

Reasoning models have made remarkable progress on complex, logic-intensive tasks by generating extended Chains-of-Thought (CoTs) before arriving at a final answer. However, this "slow-thinking" paradigm generates a large number of tokens sequentially, which inevitably introduces substantial computational overhead and creates an urgent need for effective acceleration. This survey provides a comprehensive overview of recent advances in efficient reasoning and categorizes existing work into three key directions: (1) shorter: compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller: developing compact language models with strong reasoning capabilities through knowledge distillation, other model compression techniques, and reinforcement learning; and (3) faster: designing efficient decoding strategies to accelerate inference. A curated collection of the papers discussed in this survey is available in our GitHub repository.
