Search, Verify and Feedback: Towards Next Generation Post-training Paradigm of Foundation Models via Verifier Engineering
November 18, 2024
Authors: Xinyan Guan, Yanjiang Liu, Xinyu Lu, Boxi Cao, Ben He, Xianpei Han, Le Sun, Jie Lou, Bowen Yu, Yaojie Lu, Hongyu Lin
cs.AI
Abstract
The evolution of machine learning has increasingly prioritized the
development of powerful models and more scalable supervision signals. However,
the emergence of foundation models presents significant challenges in providing
effective supervision signals necessary for further enhancing their
capabilities. Consequently, there is an urgent need to explore novel
supervision signals and technical approaches. In this paper, we propose
verifier engineering, a novel post-training paradigm specifically designed for
the era of foundation models. The core of verifier engineering involves
leveraging a suite of automated verifiers to perform verification tasks and
deliver meaningful feedback to foundation models. We systematically categorize
the verifier engineering process into three essential stages: search, verify,
and feedback, and provide a comprehensive review of state-of-the-art research
developments within each stage. We believe that verifier engineering
constitutes a fundamental pathway toward achieving Artificial General
Intelligence.
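The three stages named in the abstract (search, verify, feedback) can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the averaging of verifier scores, and the choice to treat feedback as simply selecting the top-scoring candidate are assumptions made for illustration only. The candidate generator is a stand-in for sampling from a foundation model.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a verifier maps (prompt, candidate) to a score in [0, 1].
Verifier = Callable[[str, str], float]


def search(prompt: str, generate: Callable[[str, int], List[str]], n: int) -> List[str]:
    """Search stage: collect n candidate responses for the prompt."""
    return generate(prompt, n)


def verify(prompt: str, candidates: List[str], verifiers: List[Verifier]) -> List[float]:
    """Verify stage: score each candidate with every verifier and average (an assumption)."""
    return [sum(v(prompt, c) for v in verifiers) / len(verifiers) for c in candidates]


def feedback(candidates: List[str], scores: List[float]) -> Tuple[str, float]:
    """Feedback stage (simplified): return the highest-scoring candidate and its score."""
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best]


# Toy stand-ins, all hypothetical.
def toy_generate(prompt: str, n: int) -> List[str]:
    """Pretend model: proposes the strings "0", "1", ..., str(n - 1)."""
    return [str(i) for i in range(n)]


def exact_match(prompt: str, candidate: str) -> float:
    """Rule-based verifier that knows the reference answer for the toy prompt."""
    return 1.0 if candidate == "4" else 0.0


def is_numeric(prompt: str, candidate: str) -> float:
    """Format verifier: rewards candidates made only of digits."""
    return 1.0 if candidate.isdigit() else 0.0


def run(prompt: str, n: int = 8) -> Tuple[str, float]:
    candidates = search(prompt, toy_generate, n)
    scores = verify(prompt, candidates, [exact_match, is_numeric])
    return feedback(candidates, scores)


if __name__ == "__main__":
    print(run("2 + 2 = ?"))
```

In a real system the feedback stage could instead pass the verifier scores to a training procedure; the selection step shown here is only the simplest placeholder for that stage.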