Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
November 18, 2024
Authors: Xinyan Guan, Yanjiang Liu, Xinyu Lu, Boxi Cao, Ben He, Xianpei Han, Le Sun, Jie Lou, Bowen Yu, Yaojie Lu, Hongyu Lin
cs.AI
Abstract
The evolution of machine learning has increasingly prioritized the
development of powerful models and more scalable supervision signals. However,
the emergence of foundation models presents significant challenges in providing
effective supervision signals necessary for further enhancing their
capabilities. Consequently, there is an urgent need to explore novel
supervision signals and technical approaches. In this paper, we propose
verifier engineering, a novel post-training paradigm specifically designed for
the era of foundation models. The core of verifier engineering involves
leveraging a suite of automated verifiers to perform verification tasks and
deliver meaningful feedback to foundation models. We systematically categorize
the verifier engineering process into three essential stages: search, verify,
and feedback, and provide a comprehensive review of state-of-the-art research
developments within each stage. We believe that verifier engineering
constitutes a fundamental pathway toward achieving Artificial General
Intelligence.