Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering

November 18, 2024
Authors: Xinyan Guan, Yanjiang Liu, Xinyu Lu, Boxi Cao, Ben He, Xianpei Han, Le Sun, Jie Lou, Bowen Yu, Yaojie Lu, Hongyu Lin
cs.AI

Abstract

The evolution of machine learning has increasingly prioritized the development of powerful models and more scalable supervision signals. However, the emergence of foundation models presents significant challenges in providing effective supervision signals necessary for further enhancing their capabilities. Consequently, there is an urgent need to explore novel supervision signals and technical approaches. In this paper, we propose verifier engineering, a novel post-training paradigm specifically designed for the era of foundation models. The core of verifier engineering involves leveraging a suite of automated verifiers to perform verification tasks and deliver meaningful feedback to foundation models. We systematically categorize the verifier engineering process into three essential stages: search, verify, and feedback, and provide a comprehensive review of state-of-the-art research developments within each stage. We believe that verifier engineering constitutes a fundamental pathway toward achieving Artificial General Intelligence.
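
To make the three stages named in the abstract concrete, here is a minimal Python sketch of one search, verify, and feedback round. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy generator, the two toy verifiers, the averaging of verifier scores, and the choice of a (chosen, rejected) preference pair as the feedback signal are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a verifier maps (prompt, response) to a score in [0, 1].
Verifier = Callable[[str, str], float]


def search(prompt: str, generate: Callable[[str, int], List[str]], n: int) -> List[str]:
    """Search stage: sample n candidate responses from the model."""
    return generate(prompt, n)


def verify(prompt: str, candidates: List[str], verifiers: List[Verifier]) -> List[float]:
    """Verify stage: average the scores of all verifiers for each candidate."""
    return [sum(v(prompt, c) for v in verifiers) / len(verifiers) for c in candidates]


def feedback(candidates: List[str], scores: List[float]) -> Tuple[str, str]:
    """Feedback stage (assumed form): return a (chosen, rejected) pair
    that a preference-based training step could consume."""
    ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0])
    return ranked[-1][1], ranked[0][1]


# Toy stand-ins for a foundation model and two automated verifiers.
def toy_generate(prompt: str, n: int) -> List[str]:
    return ["5", "6", "The answer is 5"][:n]


def correctness_verifier(prompt: str, response: str) -> float:
    return 1.0 if "5" in response else 0.0


def brevity_verifier(prompt: str, response: str) -> float:
    return 1.0 if len(response) <= 2 else 0.5


prompt = "2+3="
candidates = search(prompt, toy_generate, n=3)
scores = verify(prompt, candidates, [correctness_verifier, brevity_verifier])
chosen, rejected = feedback(candidates, scores)
print(chosen, rejected)
```

In a real pipeline the generator would be a foundation model, the verifiers could be rule checkers, reward models, or test suites, and the feedback could update the model or guide inference; the sketch fixes one simple choice for each only to show how the stages connect.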

