

Control LLM: Controlled Evolution for Intelligence Retention in LLM

January 19, 2025
Authors: Haichao Wei, Yunxiang Ren, Zhoutong Fu, Aman Lunia, Yi-Lin Chen, Alice Leung, Ya Xu
cs.AI

Abstract

Large Language Models (LLMs) demand significant computational resources, making it essential to enhance their capabilities without retraining from scratch. A key challenge in this domain is catastrophic forgetting (CF), which hampers performance during Continuous Pre-training (CPT) and Continuous Supervised Fine-Tuning (CSFT). We propose Control LLM, a novel approach that leverages parallel pre-trained and expanded transformer blocks, aligning their hidden states through interpolation strategies. This method effectively preserves performance on existing tasks while seamlessly integrating new knowledge. Extensive experiments demonstrate the effectiveness of Control LLM in both CPT and CSFT. On Llama3.1-8B-Instruct, it achieves significant improvements in mathematical reasoning (+14.4% on Math-Hard) and coding performance (+10% on MBPP-PLUS). On Llama3.1-8B, it enhances multilingual capabilities (+10.6% on C-Eval, +6.8% on CMMLU, and +30.2% on CMMLU-0shot-CoT). It surpasses existing methods and achieves SOTA among open-source models tuned from the same base model, using substantially less data and compute. Crucially, these gains are realized while preserving strong original capabilities, with minimal degradation (<4.3% on MMLU) compared to >35% in open-source Math and Coding models. This approach has been successfully deployed in LinkedIn's GenAI-powered job seeker and Ads unit products. To support further research, we release the training and evaluation code (https://github.com/linkedin/ControlLLM) along with models trained on public datasets (https://huggingface.co/ControlLLM) to the community.
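The abstract's core mechanism, blending the hidden states of a frozen pre-trained transformer block with those of a trainable expanded block, can be sketched with a simple linear interpolation. This is a minimal illustration, not the paper's actual architecture: the function name, the fixed blending weight `alpha`, and the toy tensors are all hypothetical; the authors' real alignment strategies are in the linked repository.

```python
import numpy as np

def interpolate_hidden_states(h_frozen, h_expanded, alpha=0.5):
    """Linearly blend the hidden states of a frozen pre-trained block
    (preserving old capabilities) with those of a trainable expanded
    block (carrying new knowledge). alpha controls how much of the
    original model's representation is retained."""
    return alpha * h_frozen + (1.0 - alpha) * h_expanded

# Toy hidden states of shape (seq_len, d_model).
h_old = np.ones((4, 8))   # stand-in for the frozen block's output
h_new = np.zeros((4, 8))  # stand-in for the expanded block's output

# Weight the original representation more heavily to limit forgetting.
h_mix = interpolate_hidden_states(h_old, h_new, alpha=0.75)
```

In practice such a blend would be applied inside the forward pass at each expanded layer, with the weighting chosen (or learned) to trade off retention of existing tasks against integration of new ones.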
