Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models

Abstract

Recent advances in large reasoning models (LRMs) have demonstrated that scaling test-time computation improves reasoning capabilities across multiple tasks. However, LRMs typically suffer from "overthinking": they generate highly redundant reasoning steps that yield only limited performance gains. Existing work relies on fine-tuning to mitigate overthinking, which requires additional data and unconventional training setups, risks safety misalignment, and generalizes poorly. Through empirical analysis, we reveal an important characteristic of LRM behavior: placing external chains of thought (CoTs) generated by smaller models between the thinking tokens (<think> and </think>) can effectively manipulate the model into generating fewer thoughts. Building on this insight, we propose ThoughtMani, a simple yet efficient pipeline that enables LRMs to bypass unnecessary intermediate steps and significantly reduce computational costs. We conduct extensive experiments to validate the utility and efficiency of ThoughtMani. For instance, when applied to QwQ-32B on the LiveBench/Code dataset, ThoughtMani maintains the original performance while reducing output token counts by approximately 30%, with little overhead from the CoT generator. Furthermore, we find that ThoughtMani improves safety alignment by an average of 10%. Since model vendors typically serve models of different sizes simultaneously, ThoughtMani provides an effective way to construct more efficient and accessible LRMs for real-world applications.
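
The central mechanism, placing an externally generated CoT between the thinking tokens, can be illustrated with a minimal sketch. This is not the ThoughtMani implementation: the function name `insert_external_thought`, the plain-text prompt layout, and the example question are assumptions made for illustration, and a real deployment would use the target model's own chat template. Only the <think> and </think> tokens come from the abstract.

```python
# Thinking tokens named in the abstract.
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def insert_external_thought(prompt: str, external_cot: str) -> str:
    """Append an externally generated CoT, wrapped in thinking tokens, to a prompt.

    Hypothetical helper: the large model would continue generating from the
    returned text, seeing a thinking block that is already closed.
    """
    cot = external_cot.strip()
    return f"{prompt}{THINK_OPEN}\n{cot}\n{THINK_CLOSE}\n"


# Illustrative inputs (assumed, not from the paper).
prompt = "Question: What is 17 * 6?\nAnswer: "
small_model_cot = "17 * 6 = 17 * 5 + 17 = 85 + 17 = 102."

prefilled = insert_external_thought(prompt, small_model_cot)
print(prefilled)
```

In this sketch the smaller model's output is treated as an opaque string. How the CoT is generated, and whether it is used for every input, are left open here because the abstract does not specify them.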

