

A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis

April 11, 2025
Authors: Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Conghui He, Lijun Wu
cs.AI

Abstract

While data synthesis and distillation are promising strategies for enhancing small language models, current approaches rely heavily on Large Language Models (LLMs), which suffer from high computational costs, environmental inefficiency, and potential biases inherited from monolithic architectures. In contrast, smaller LLMs are more accessible and sustainable, but their individual capabilities often fall short of generating high-quality, diverse, and reliable data. Inspired by collaborative human processes such as peer review, we propose GRA, a framework built from multiple small LLMs that aggregates specialized roles to deliver the iterative refinement and quality control typically achieved by a single large LLM. In this collaborative framework, multiple small LLMs assume distinct roles (Generator, Reviewer, and Adjudicator) to simulate a peer-review-inspired data synthesis pipeline: the Generator proposes initial data samples, the Reviewer critiques their quality and diversity, and the Adjudicator resolves conflicts to finalize the output. By decomposing the synthesis process into specialized sub-tasks, collaborating small LLMs can achieve data-level parity with large-LLM-based distillation. Through experiments across multiple benchmarks, we demonstrate that GRA-produced data matches or exceeds the quality of the output of a single large LLM, e.g., Qwen-2.5-72B-Instruct. Our results challenge the necessity of monolithic large models for high-quality data synthesis, advocating instead for the strategic coordination of smaller agents. Our datasets, models, and code are publicly available at https://github.com/GX-XinGao/GRA.
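To make the role decomposition concrete, below is a minimal, illustrative Python sketch of how a Generator/Reviewer/Adjudicator loop could be wired together. It is not the authors' implementation: the role prompts, the 0.7 acceptance threshold, the `parse_score` helper, and the stubbed `LLMFn` callables are all hypothetical placeholders. In GRA each role would be backed by a distinct small instruction-tuned LLM (see the released code for the actual pipeline).

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# A "model" here is just a text -> text function. In practice each role would
# wrap a different small instruction-tuned LLM; the stubs at the bottom exist
# only so this sketch runs standalone.
LLMFn = Callable[[str], str]


@dataclass
class Review:
    score: float      # quality score in [0, 1], parsed from the reviewer's critique
    verdict: str      # "accept" or "reject"
    rationale: str    # the reviewer's free-text critique


def parse_score(text: str) -> float:
    """Pull the first number out of a reviewer's critique (illustrative only)."""
    match = re.search(r"\d*\.?\d+", text)
    return min(max(float(match.group()), 0.0), 1.0) if match else 0.0


def generate_sample(generator: LLMFn, seed: str) -> str:
    """Generator role: propose an initial training sample from a seed instruction."""
    return generator(f"Write one high-quality training sample for: {seed}")


def review_sample(reviewer: LLMFn, sample: str, threshold: float = 0.7) -> Review:
    """Reviewer role: critique the sample and turn the critique into a verdict."""
    critique = reviewer(f"Score this sample from 0 to 1 and explain why:\n{sample}")
    score = parse_score(critique)
    return Review(score, "accept" if score >= threshold else "reject", critique)


def adjudicate(adjudicator: LLMFn, sample: str, reviews: List[Review]) -> bool:
    """Adjudicator role: invoked only when reviewers disagree; makes the final call."""
    verdicts = {r.verdict for r in reviews}
    if len(verdicts) == 1:
        return verdicts.pop() == "accept"
    rationales = "\n".join(r.rationale for r in reviews)
    decision = adjudicator(
        f"Reviewers disagree on this sample:\n{sample}\n"
        f"Their reviews:\n{rationales}\nAnswer 'accept' or 'reject'."
    )
    return "accept" in decision.lower()


def synthesize(generator: LLMFn, reviewers: List[LLMFn],
               adjudicator: LLMFn, seeds: List[str]) -> List[str]:
    """One pass of the Generator -> Reviewers -> Adjudicator loop over seed instructions."""
    accepted: List[str] = []
    for seed in seeds:
        sample = generate_sample(generator, seed)
        reviews = [review_sample(r, sample) for r in reviewers]
        if adjudicate(adjudicator, sample, reviews):
            accepted.append(sample)
    return accepted


if __name__ == "__main__":
    # Toy stand-ins for three small LLMs, just to exercise the control flow.
    gen = lambda p: "Q: What is 2 + 2?  A: 4."
    rev_a = lambda p: "0.9 - clear and correct."
    rev_b = lambda p: "0.5 - too simple, low diversity."
    adj = lambda p: "accept"
    print(synthesize(gen, [rev_a, rev_b], adj, ["basic arithmetic"]))
```

In this toy run the two reviewers disagree (one score above the threshold, one below), so the sample is escalated to the adjudicator, mirroring the conflict-resolution step described above.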

