Rethinking Reflection in Pre-Training

April 5, 2025
作者: Essential AI, Darsh J Shah, Peter Rushton, Somanshu Singla, Mohit Parmar, Kurt Smith, Yash Vanjani, Ashish Vaswani, Adarsh Chaluvaraju, Andrew Hojel, Andrew Ma, Anil Thomas, Anthony Polloreno, Ashish Tanwer, Burhan Drak Sibai, Divya S Mansingka, Divya Shivaprasad, Ishaan Shah, Karl Stratos, Khoi Nguyen, Michael Callahan, Michael Pust, Mrinal Iyer, Philip Monk, Platon Mazarakis, Ritvik Kapila, Saurabh Srivastava, Tim Romanski
cs.AI

Abstract

A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually begins to emerge much earlier, during the model's pre-training. To study this, we introduce deliberate errors into chains-of-thought and test whether the model can still arrive at the correct answer by recognizing and correcting these mistakes. By tracking performance across different stages of pre-training, we observe that this self-correcting ability appears early and improves steadily over time. For instance, an OLMo2-7B model pre-trained on 4 trillion tokens displays self-correction on our six self-reflection tasks.

