

AIDE: AI-Driven Exploration in the Space of Code

February 18, 2025
Authors: Zhengyao Jiang, Dominik Schmidt, Dhruv Srikanth, Dixing Xu, Ian Kaplan, Deniss Jacenko, Yuxiang Wu
cs.AI

Abstract

Machine learning, the foundation of modern artificial intelligence, has driven innovations that have fundamentally transformed the world. Yet behind these advancements lies a complex and often tedious process that requires labor- and compute-intensive iteration and experimentation. Engineers and scientists developing machine learning models spend much of their time on trial-and-error tasks instead of conceptualizing innovative solutions or research hypotheses. To address this challenge, we introduce AI-Driven Exploration (AIDE), a machine learning engineering agent powered by large language models (LLMs). AIDE frames machine learning engineering as a code optimization problem and formulates trial and error as a tree search in the space of potential solutions. By strategically reusing and refining promising solutions, AIDE effectively trades computational resources for enhanced performance, achieving state-of-the-art results on multiple machine learning engineering benchmarks, including our Kaggle evaluations, OpenAI's MLE-Bench, and METR's RE-Bench.
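
The idea of treating trial and error as a tree search over candidate solutions can be illustrated with a small sketch. This is not AIDE's implementation: the abstract does not specify the search policy, so the greedy "refine the best node so far" rule, the `Node` class, and the `tree_search`, `draft`, `refine`, and `evaluate` names are all assumptions made for illustration. In a real agent, `draft` and `refine` would presumably call an LLM to write or edit code and `evaluate` would run that code and read a validation metric; here they are toy numeric stand-ins.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """One candidate solution in the search tree (hypothetical structure)."""
    solution: float                      # stand-in for a code script
    score: float                         # stand-in for a validation metric
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    draft: Callable[[], float],
    refine: Callable[[float], float],
    evaluate: Callable[[float], float],
    n_drafts: int = 3,
    n_steps: int = 20,
) -> Tuple[Node, List[Node]]:
    """Greedy tree search: repeatedly refine the best-scoring node so far.

    Every candidate is kept in the tree, so earlier solutions can be
    reused later if a refinement turns out worse than its parent.
    """
    # Root nodes: independent initial drafts.
    nodes = []
    for _ in range(n_drafts):
        solution = draft()
        nodes.append(Node(solution, evaluate(solution)))

    for _ in range(n_steps):
        # Assumed policy: pick the most promising node and refine it.
        parent = max(nodes, key=lambda n: n.score)
        solution = refine(parent.solution)
        child = Node(solution, evaluate(solution), parent=parent)
        parent.children.append(child)
        nodes.append(child)

    best = max(nodes, key=lambda n: n.score)
    return best, nodes


def demo(seed: int = 0) -> Node:
    """Toy problem: find x that maximizes -(x - 3)^2."""
    rng = random.Random(seed)
    best, _ = tree_search(
        draft=lambda: rng.uniform(-10.0, 10.0),
        refine=lambda x: x + rng.gauss(0.0, 1.0),
        evaluate=lambda x: -(x - 3.0) ** 2,
    )
    return best
```

Raising `n_steps` spends more evaluations and can only keep or improve the best score found, which is the compute-for-performance trade the abstract describes, here in its simplest form.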

