Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models
February 12, 2025
Authors: Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Dinesh Khandelwal, Dinesh Raghu, Sachindra Joshi
cs.AI
Abstract
Fine-tuning Large Language Models (LLMs) on specific datasets is a common
practice to improve performance on target tasks. However, this performance gain
often leads to overfitting, where the model becomes too specialized in either
the task or the characteristics of the training data, resulting in a loss of
generalization. This paper introduces Selective Self-to-Supervised Fine-Tuning
(S3FT), a fine-tuning approach that achieves better performance than
standard supervised fine-tuning (SFT) while improving generalization. S3FT
leverages the existence of multiple valid responses to a query. By utilizing
the model's correct responses, S3FT reduces model specialization during the
fine-tuning stage. S3FT first identifies the correct model responses from the
training set by deploying an appropriate judge. Then, it fine-tunes the model
using these correct model responses, and the gold response (or its paraphrase) for
the remaining samples. The effectiveness of S3FT is demonstrated through
experiments on mathematical reasoning, Python programming, and reading
comprehension tasks. The results show that standard SFT can lead to an average
performance drop of up to 4.4 on multiple benchmarks, such as MMLU and
TruthfulQA. In contrast, S3FT reduces this drop by half, to 2.5, indicating
better generalization capabilities than SFT while performing significantly
better on the fine-tuning tasks.
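As a rough illustration of the procedure described in the abstract, the Python sketch below shows how S3FT-style fine-tuning targets could be assembled: keep the model's own response when a judge deems it correct, otherwise fall back to the gold response or a paraphrase of it. This is a minimal sketch based only on the abstract, not the authors' implementation; model.generate, model.paraphrase, and judge.is_correct are hypothetical stand-ins for the paper's generation, paraphrasing, and judging steps.

# Rough sketch of S3FT-style target construction (not the authors' code).
# model.generate, model.paraphrase, and judge.is_correct are hypothetical helpers.

def build_s3ft_targets(model, judge, train_set):
    """Prefer the model's own correct response as the fine-tuning target;
    otherwise fall back to the gold response (or a paraphrase of it)."""
    targets = []
    for query, gold in train_set:
        prediction = model.generate(query)        # the model's own attempt
        if judge.is_correct(query, prediction, gold):
            target = prediction                   # keep the correct self-response
        else:
            rephrased = model.paraphrase(gold)    # restate gold in the model's own words
            target = rephrased if judge.is_correct(query, rephrased, gold) else gold
        targets.append((query, target))
    return targets  # train with the usual SFT loss on these (query, target) pairs

The resulting (query, target) pairs would then be fed to an ordinary SFT loop; the point of the selection step is that targets stay close to the model's own output distribution whenever the model is already correct, which is what the abstract credits for the reduced specialization.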