Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models
February 12, 2025
Authors: Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Dinesh Khandelwal, Dinesh Raghu, Sachindra Joshi
cs.AI
Abstract
Fine-tuning Large Language Models (LLMs) on specific datasets is a common
practice to improve performance on target tasks. However, this performance gain
often leads to overfitting, where the model becomes too specialized in either
the task or the characteristics of the training data, resulting in a loss of
generalization. This paper introduces Selective Self-to-Supervised Fine-Tuning
(S3FT), a fine-tuning approach that achieves better performance than
standard supervised fine-tuning (SFT) while improving generalization. S3FT
leverages the existence of multiple valid responses to a query. By utilizing
the model's correct responses, S3FT reduces model specialization during the
fine-tuning stage. S3FT first identifies the correct model responses from the
training set by deploying an appropriate judge. Then, it fine-tunes the model
using these correct model responses, and the gold response (or its paraphrase) for
the remaining samples. The effectiveness of S3FT is demonstrated through
experiments on mathematical reasoning, Python programming, and reading
comprehension tasks. The results show that standard SFT can lead to an average
performance drop of up to 4.4 on multiple benchmarks, such as MMLU and
TruthfulQA. In contrast, S3FT reduces this drop by half, to 2.5, indicating
better generalization capabilities than SFT while performing significantly
better on the fine-tuning tasks.
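As a rough illustration of the procedure described in the abstract, the Python sketch below shows how S3FT-style fine-tuning targets could be assembled: keep the model's own response when a judge deems it correct, otherwise fall back to the gold response or a paraphrase of it. This is a minimal sketch based only on the abstract, not the authors' implementation; model.generate, model.paraphrase, and judge.is_correct are hypothetical stand-ins for the paper's generation, paraphrasing, and judging steps.

# Rough sketch of S3FT-style target construction (not the authors' code).
# model.generate, model.paraphrase, and judge.is_correct are hypothetical helpers.

def build_s3ft_targets(model, judge, train_set):
    """Prefer the model's own correct response as the fine-tuning target;
    otherwise fall back to the gold response (or a paraphrase of it)."""
    targets = []
    for query, gold in train_set:
        prediction = model.generate(query)        # the model's own attempt
        if judge.is_correct(query, prediction, gold):
            target = prediction                   # keep the correct self-response
        else:
            rephrased = model.paraphrase(gold)    # restate gold in the model's own words
            target = rephrased if judge.is_correct(query, rephrased, gold) else gold
        targets.append((query, target))
    return targets  # train with the usual SFT loss on these (query, target) pairs

The resulting (query, target) pairs would then be fed to an ordinary SFT loop; the point of the selection step is that targets stay close to the model's own output distribution whenever the model is already correct, which is what the abstract credits for the reduced specialization.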