Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
April 18, 2025
作者: Yule Liu, Jingyi Zheng, Zhen Sun, Zifan Peng, Wenhan Dong, Zeyang Sha, Shiwen Cui, Weiqiang Wang, Xinlei He
cs.AI
Abstract
Recent advancements in large reasoning models (LRMs) have demonstrated the
effectiveness of scaling test-time computation to enhance reasoning
capabilities in multiple tasks. However, LRMs typically suffer from
"overthinking" problems, where models generate significantly redundant
reasoning steps while bringing limited performance gains. Existing work relies
on fine-tuning to mitigate overthinking, but this requires additional data and
unconventional training setups, risks safety misalignment, and generalizes
poorly.
Through empirical analysis, we reveal an important characteristic of LRM
behavior: placing external CoTs generated by smaller models between the
thinking tokens (<think> and </think>) can effectively
manipulate the model into generating fewer thoughts. Building on these insights, we
propose a simple yet efficient pipeline, ThoughtMani, to enable LRMs to bypass
unnecessary intermediate steps and reduce computational costs significantly. We
conduct extensive experiments to validate the utility and efficiency of
ThoughtMani. For instance, when applied to QwQ-32B on the LiveBench/Code
dataset, ThoughtMani keeps the original performance and reduces output token
counts by approximately 30%, with little overhead from the CoT generator.
Furthermore, we find that ThoughtMani enhances safety alignment by an average
of 10%. Since model vendors typically serve models of different sizes
simultaneously, ThoughtMani provides an effective way to construct more
efficient and accessible LRMs for real-world applications.
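The core mechanism the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's released implementation: the function names, model interfaces, and exact prompt layout are assumptions, and only the idea of wrapping an externally generated CoT in the LRM's thinking tokens comes from the abstract.

```python
def build_thoughtmani_prompt(question: str, external_cot: str) -> str:
    """Wrap an externally generated chain of thought in the LRM's
    thinking tokens (<think> ... </think>), so the LRM treats the
    reasoning as already completed and emits fewer of its own thoughts.
    The exact prompt layout here is a hypothetical sketch."""
    return f"{question}\n<think>\n{external_cot}\n</think>\n"


def thoughtmani_answer(question, small_model, large_model):
    """Hypothetical two-model pipeline: `small_model` and `large_model`
    are placeholder callables mapping a prompt string to a string."""
    # 1. A cheaper model drafts a short chain of thought.
    cot = small_model(question)
    # 2. The LRM receives the draft inside its thinking tokens and
    #    produces the final answer.
    return large_model(build_thoughtmani_prompt(question, cot))
```

In practice the two callables would wrap API or inference calls to a small model and an LRM (e.g. models of different sizes from the same vendor, as the abstract suggests); here they are left abstract so the sketch stays self-contained.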