Efficient Reasoning Models: A Survey
April 15, 2025
Authors: Sicheng Feng, Gongfan Fang, Xinyin Ma, Xinchao Wang
cs.AI
Abstract
Reasoning models have demonstrated remarkable progress in solving complex and
logic-intensive tasks by generating extended Chain-of-Thoughts (CoTs) prior to
arriving at a final answer. Yet, the emergence of this "slow-thinking"
paradigm, with numerous tokens generated in sequence, inevitably introduces
substantial computational overhead. This overhead creates an urgent need
for effective acceleration. This survey aims to provide a comprehensive
overview of recent advances in efficient reasoning. It categorizes existing
work into three key directions: (1) shorter - compressing lengthy CoTs into
concise yet effective reasoning chains; (2) smaller - developing compact
language models with strong reasoning capabilities through techniques such as
knowledge distillation, other model compression techniques, and reinforcement
learning; and (3) faster - designing efficient decoding strategies to
accelerate inference. A curated collection of papers discussed in this survey
is available in our GitHub repository.