Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Abstract

Recent advances in Large Language Models (LLMs) have significantly improved their ability to perform complex reasoning tasks, marking a shift from fast, intuitive thinking (System 1) to slow, deliberate reasoning (System 2). Although System 2 reasoning improves task accuracy, it often incurs substantial computational cost, both because slow thinking is inherently expensive and because the reasoning itself can be inefficient or unnecessary. System 1 reasoning, by contrast, is computationally efficient but yields suboptimal performance. Balancing performance (benefits) against computational cost (budgets) is therefore critical, which gives rise to the concept of reasoning economy. This survey provides a comprehensive analysis of reasoning economy in both the post-training and test-time inference stages of LLMs, covering i) the causes of reasoning inefficiency, ii) behavior analysis of different reasoning patterns, and iii) potential solutions for achieving reasoning economy. By offering actionable insights and highlighting open challenges, we aim to shed light on strategies for improving the reasoning economy of LLMs and to provide a useful resource for research in this area. We also maintain a public repository that continually tracks developments in this fast-evolving field.
