Test-time Computing: from System-1 Thinking to System-2 Thinking
January 5, 2025
Authors: Yixin Ji, Juntao Li, Hai Ye, Kaixin Wu, Jia Xu, Linjian Mo, Min Zhang
cs.AI
Abstract
The remarkable performance of the o1 model in complex reasoning demonstrates
that test-time computing scaling can further unlock the model's potential,
enabling powerful System-2 thinking. However, comprehensive surveys of
test-time computing scaling are still lacking. We trace the concept of
test-time computing back to System-1 models. In System-1 models, test-time
computing addresses distribution shifts and improves robustness and
generalization through parameter updating, input modification, representation
editing, and output calibration. In System-2 models, it enhances the model's
reasoning ability to solve complex problems through repeated sampling,
self-correction, and tree search. We organize this survey according to the
trend of System-1 to System-2 thinking, highlighting the key role of test-time
computing in the transition from System-1 models to weak System-2 models, and
then to strong System-2 models. We also point out a few possible future
directions.
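As a rough illustration of one System-2 strategy named in the abstract, repeated sampling, the sketch below draws several answers from a stochastic solver and aggregates them by majority vote. It is not taken from the survey: the function names, the `make_noisy_solver` stand-in for a language model, and the choice of majority voting as the aggregation rule are all assumptions made for this example.

```python
import random
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def repeated_sampling(sample_fn: Callable[[str], str], question: str, n: int = 16) -> str:
    """Spend more test-time compute: draw n answers, then aggregate by vote."""
    answers = [sample_fn(question) for _ in range(n)]
    return majority_vote(answers)


def make_noisy_solver(seed: int = 0, p_correct: float = 0.6) -> Callable[[str], str]:
    """Hypothetical stand-in for a stochastic model (an assumption, not a real API).

    It returns "42" with probability p_correct and otherwise one of three
    wrong answers chosen uniformly at random.
    """
    rng = random.Random(seed)

    def solve(question: str) -> str:
        if rng.random() < p_correct:
            return "42"
        return rng.choice(["41", "43", "44"])

    return solve


if __name__ == "__main__":
    solver = make_noisy_solver(seed=0)
    print(repeated_sampling(solver, "What is 6 * 7?", n=201))
```

A single call to the stand-in solver is wrong a large fraction of the time, while the voted answer over many samples is far more stable. That is the basic intuition behind scaling test-time compute through sampling. A real system would replace the stand-in with model calls and might use a verifier or reward model instead of a plain vote.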