Efficient Reasoning Models: A Survey

Abstract

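Reasoning models have made remarkable progress on complex, logic-intensive tasks by generating extended Chains of Thought (CoTs) before arriving at a final answer. However, this "slow-thinking" paradigm produces long sequences of tokens one after another and therefore incurs substantial computational overhead, which creates an urgent need for effective acceleration. This survey provides a comprehensive overview of recent advances in efficient reasoning and groups existing work into three key directions: (1) shorter: compressing lengthy CoTs into concise yet effective reasoning chains; (2) smaller: developing compact language models with strong reasoning capabilities through knowledge distillation, other model compression techniques, and reinforcement learning; and (3) faster: designing efficient decoding strategies to accelerate inference. A curated collection of the papers discussed in this survey is available in our GitHub repository.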

