CoT-Valve: Length-Compressible Chain-of-Thought Tuning
February 13, 2025
作者: Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang
cs.AI
Abstract
Chain-of-Thought significantly enhances a model's reasoning capability, but
it also comes with a considerable increase in inference costs due to long
chains. With the observation that the reasoning path can be easily compressed
for easy tasks but struggles on hard tasks, we explore the feasibility of
elastically controlling the length of reasoning paths with only one model,
thereby reducing the inference overhead of reasoning models dynamically based
on task difficulty. We introduce a new tuning and inference strategy named
CoT-Valve, designed to allow models to generate reasoning chains of varying
lengths. To achieve this, we propose to identify a direction in the parameter
space that, when manipulated, can effectively control the length of generated
CoT. Moreover, we show that this property is valuable for compressing the
reasoning chain. We construct datasets with chains from long to short for the
same questions and explore two enhanced strategies for CoT-Valve: (1) a precise
length-compressible CoT tuning method, and (2) a progressive chain length
compression approach. Our experiments show that CoT-Valve successfully enables
controllability and compressibility of the chain and outperforms prompt-based
control. We apply this method to QwQ-32B-Preview,
reducing reasoning chains on GSM8K from 741 to 225 tokens with a minor
performance drop (95.07% to 94.92%) and on AIME from 6827 to 4629 tokens, with
only one additional incorrect answer.
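To make the parameter-space idea in the abstract concrete, the sketch below shows one way such a length-controlling direction could be applied; it is not the authors' released implementation. It assumes the direction is the element-wise delta between a long-chain base model and a short-chain fine-tuned model with identical architectures, and the scaling knob `alpha` and helper `apply_cot_valve_direction` are hypothetical names introduced only for illustration.

```python
# Minimal sketch (assumption-based, not the paper's code): steer chain length by
# scaling a single direction in parameter space.
import copy
import torch

@torch.no_grad()
def apply_cot_valve_direction(long_chain_model, short_chain_model, alpha: float):
    """Blend weights as theta_long + alpha * (theta_short - theta_long).

    alpha = 0.0 reproduces the long-chain model, alpha = 1.0 the short-chain one;
    intermediate values move along the same direction in parameter space, which is
    the kind of knob the abstract describes for modulating reasoning-chain length.
    """
    blended = copy.deepcopy(long_chain_model)
    long_params = dict(long_chain_model.named_parameters())
    short_params = dict(short_chain_model.named_parameters())
    for name, param in blended.named_parameters():
        # Direction in parameter space between the two checkpoints.
        delta = short_params[name] - long_params[name]
        param.copy_(long_params[name] + alpha * delta)
    return blended
```

In use, one would load two checkpoints of the same architecture and sweep `alpha` (e.g. 0.0, 0.5, 1.0) to trade reasoning-chain length against accuracy at inference time, without retraining a separate model per length budget.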