AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with the OpenMathReasoning Dataset

Abstract

This paper presents our winning submission to the AI Mathematical Olympiad - Progress Prize 2 (AIMO-2) competition. Our recipe for building state-of-the-art mathematical reasoning models relies on three key pillars. First, we create a large-scale dataset comprising 540K unique, high-quality math problems, including olympiad-level problems, and their 3.2M long-reasoning solutions. Second, we develop a novel method to integrate code execution with long-reasoning models through iterative training, generation, and quality filtering, resulting in 1.7M high-quality Tool-Integrated Reasoning solutions. Third, we create a pipeline to train models to select the most promising solution from many candidates. We show that such generative solution selection (GenSelect) can significantly improve upon the majority-voting baseline. Combining these ideas, we train a series of models that achieve state-of-the-art results on mathematical reasoning benchmarks. To facilitate further research, we release our code, models, and the complete OpenMathReasoning dataset under a commercially permissive license.
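For readers unfamiliar with the majority-voting baseline that GenSelect is compared against, the following is a minimal sketch of the general idea: sample several candidate solutions, extract each final answer, and return the most frequent one. The function name `majority_vote` and the sample answers are hypothetical and purely illustrative; this is not the authors' implementation, and GenSelect itself (a trained model that selects among candidates) is not shown.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among candidate solutions.

    `answers` is a list of final answers (as strings), one per sampled
    solution. Ties are broken in favor of the answer seen first.
    """
    if not answers:
        raise ValueError("need at least one candidate answer")
    # Counter.most_common sorts by count; equal counts keep insertion order.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical final answers extracted from five sampled solutions.
    candidates = ["42", "17", "42", "42", "17"]
    print(majority_vote(candidates))
```

Majority voting only looks at answer frequency, so it cannot recover when the correct answer appears in a minority of samples; that is the gap a selection model is meant to address.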
