CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance
February 4, 2025
Authors: Yongchao Chen, Yilun Hao, Yueying Liu, Yang Zhang, Chuchu Fan
cs.AI
Abstract
Existing methods fail to effectively steer Large Language Models (LLMs) between textual reasoning and code generation, leaving symbolic computing capabilities underutilized. We introduce CodeSteer, an effective method for guiding LLM code/text generation. We construct a comprehensive benchmark SymBench comprising 37 symbolic tasks with adjustable complexity and also synthesize datasets of 12k multi-round guidance/generation trajectories and 5.5k guidance comparison pairs. We fine-tune the Llama-3-8B model with a newly designed multi-round supervised fine-tuning (SFT) and direct preference optimization (DPO). The resulting model, CodeSteerLLM, augmented with the proposed symbolic and self-answer checkers, effectively guides the code/text generation of larger models. Augmenting GPT-4o with CodeSteer raises its average performance score from 53.3 to 86.4, even outperforming the existing best LLM OpenAI o1 (82.7), o1-preview (74.8), and DeepSeek R1 (76.8) across all 37 tasks (28 seen, 9 unseen). Trained for GPT-4o, CodeSteer demonstrates superior generalizability, providing an average 41.8 performance boost on Claude, Mistral, and GPT-3.5. CodeSteer-guided LLMs fully harness symbolic computing to maintain strong performance on highly complex tasks. Models, Datasets, and Codes are available at https://github.com/yongchao98/CodeSteer-v1.0.
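The abstract describes CodeSteerLLM as a small guidance model that repeatedly steers a larger LLM between code generation and textual reasoning, with symbolic and self-answer checkers filtering each attempt. The following is a minimal sketch of such a multi-round steering loop, not the authors' actual implementation (see the linked repository for that); the callables `steer_model`, `task_model`, and `run_symbolic_check`, and the loop structure itself, are illustrative assumptions.

```python
# Hypothetical sketch of a CodeSteer-style multi-round guidance loop.
# steer_model      -- stands in for CodeSteerLLM (produces code/text guidance)
# task_model       -- stands in for the larger LLM being steered (e.g., GPT-4o)
# run_symbolic_check -- stands in for the symbolic checker (e.g., executes code, checks constraints)

from typing import Callable, List, Optional, Tuple

def codesteer_loop(
    question: str,
    steer_model: Callable[[str, List[Tuple[str, str, str]]], str],
    task_model: Callable[[str], str],
    run_symbolic_check: Callable[[str], bool],
    max_rounds: int = 5,
) -> Optional[str]:
    """Steer a task model over several rounds; return the first accepted answer."""
    history: List[Tuple[str, str, str]] = []  # (guidance, answer, failure reason)

    for _ in range(max_rounds):
        # 1. The guidance model reads the question and prior failed attempts,
        #    then tells the task model whether to answer with code or text (plus hints).
        guidance = steer_model(question, history)

        # 2. The larger model answers under that guidance.
        answer = task_model(f"{question}\n\nGuidance: {guidance}")

        # 3. Symbolic check: verify the generated code / symbolic solution.
        if not run_symbolic_check(answer):
            history.append((guidance, answer, "symbolic check failed"))
            continue

        # 4. Self-answer check: ask the task model to confirm its own answer.
        verdict = task_model(
            f"{question}\n\nProposed answer:\n{answer}\n\nIs this correct? Reply YES or NO."
        )
        if "YES" in verdict.upper():
            return answer
        history.append((guidance, answer, "self-answer check failed"))

    return None  # no answer accepted within the round budget
```

In this sketch, the failure history is fed back to the guidance model so later rounds can switch modality (for example, from textual reasoning to code) when earlier attempts do not pass the checkers, which mirrors the multi-round guidance/generation trajectories described in the abstract.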