AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset
April 23, 2025
Authors: Ivan Moshkov, Darragh Hanley, Ivan Sorokin, Shubham Toshniwal, Christof Henkel, Benedikt Schifferer, Wei Du, Igor Gitman
cs.AI
Abstract
This paper presents our winning submission to the AI Mathematical Olympiad -
Progress Prize 2 (AIMO-2) competition. Our recipe for building state-of-the-art
mathematical reasoning models relies on three key pillars. First, we create a
large-scale dataset comprising 540K unique high-quality math problems,
including olympiad-level problems, and their 3.2M long-reasoning solutions.
Second, we develop a novel method to integrate code execution with long
reasoning models through iterative training, generation, and quality filtering,
resulting in 1.7M high-quality Tool-Integrated Reasoning solutions. Third, we
create a pipeline to train models to select the most promising solution from
many candidates. We show that such generative solution selection (GenSelect)
can significantly improve upon the majority voting baseline. Combining these ideas,
we train a series of models that achieve state-of-the-art results on
mathematical reasoning benchmarks. To facilitate further research, we release
our code, models, and the complete OpenMathReasoning dataset under a
commercially permissive license.
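For readers unfamiliar with the majority voting baseline that GenSelect is compared against, the following is a minimal sketch of the general technique, not the authors' implementation. The function name `majority_vote` and the convention that a failed generation is represented by `None` are assumptions made for illustration only.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among sampled solutions.

    `answers` holds one extracted final answer per candidate solution;
    None marks a candidate from which no answer could be extracted
    (a hypothetical convention for this sketch).
    """
    valid = [a for a in answers if a is not None]
    if not valid:
        return None
    # most_common(1) returns a one-element list of (answer, count);
    # ties are resolved in favor of the answer seen first.
    return Counter(valid).most_common(1)[0][0]


# Example: four sampled solutions, one of which produced no answer.
print(majority_vote(["42", "17", "42", None]))
```

Majority voting looks only at the final answers. A generative selector, as described in the abstract, is instead trained to choose among the candidate solutions themselves.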