Test-time Computing: from System-1 Thinking to System-2 Thinking
January 5, 2025
Authors: Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang
cs.AI
Abstract
The remarkable performance of the o1 model in complex reasoning demonstrates
that test-time computing scaling can further unlock the model's potential,
enabling powerful System-2 thinking. However, the field still lacks a
comprehensive survey of test-time computing scaling. We trace the concept of
test-time computing back to System-1 models. In System-1 models, test-time
computing addresses distribution shifts and improves robustness and
generalization through parameter updating, input modification, representation
editing, and output calibration. In System-2 models, it enhances the model's
reasoning ability to solve complex problems through repeated sampling,
self-correction, and tree search. We organize this survey according to the
trend of System-1 to System-2 thinking, highlighting the key role of test-time
computing in the transition from System-1 models to weak System-2 models, and
then to strong System-2 models. We also point out a few possible future
directions.
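
To make the repeated-sampling idea concrete, here is a minimal sketch of one common variant: draw several candidate answers for the same question and return the most frequent one (majority voting). It is an illustration only and not the survey's own implementation. The names `sample_answer`, `repeated_sampling`, and `majority_vote` are hypothetical, and `sample_answer` stands in for any stochastic model call.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def repeated_sampling(
    sample_answer: Callable[[str], str],  # hypothetical stand-in for a model call
    question: str,
    n: int = 8,
) -> str:
    """Spend extra test-time compute by sampling n answers, then vote."""
    answers = [sample_answer(question) for _ in range(n)]
    return majority_vote(answers)


if __name__ == "__main__":
    # A canned sampler replaces a real model so the sketch runs offline.
    canned = iter(["42", "41", "42", "42", "17"])
    print(repeated_sampling(lambda q: next(canned), "What is 6 * 7?", n=5))
```

Raising `n` trades more inference-time compute for a more stable vote. Other selection rules, such as scoring candidates with a verifier, would replace `majority_vote` in this sketch.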