RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
December 19, 2024
Authors: Junyu Luo, Xiao Luo, Kaize Ding, Jingyang Yuan, Zhiping Xiao, Ming Zhang
cs.AI
Abstract
Supervised fine-tuning (SFT) plays a crucial role in adapting large language
models (LLMs) to specific domains or tasks. However, as demonstrated by
empirical experiments, the collected data inevitably contains noise in
practical applications, which poses significant challenges to model performance
on downstream tasks. Therefore, there is an urgent need for a noise-robust SFT
framework to enhance model capabilities in downstream tasks. To address this
challenge, we introduce a robust SFT framework (RobustFT) that performs noise
detection and relabeling on downstream task data. For noise identification, our
approach employs a multi-expert collaborative system with inference-enhanced
models to achieve superior noise detection. In the denoising phase, we utilize
a context-enhanced strategy, which incorporates the most relevant and confident
knowledge followed by careful assessment to generate reliable annotations.
Additionally, we introduce an effective data selection mechanism based on
response entropy, ensuring only high-quality samples are retained for
fine-tuning. Extensive experiments conducted on multiple LLMs across five
datasets demonstrate RobustFT's exceptional performance in noisy scenarios.
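To make the response-entropy-based selection step concrete, below is a minimal sketch of how such a filter could be scored and applied. It assumes a Hugging Face causal LM, averages per-token predictive entropy over the response span, and keeps samples below a threshold; the threshold value, the averaging choice, and the placeholder model name are illustrative assumptions, not RobustFT's exact procedure.

```python
# Illustrative sketch of entropy-based sample selection (assumptions, not the
# paper's implementation): keep samples whose response has low predictive entropy.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer


def response_entropy(model, tokenizer, prompt: str, response: str) -> float:
    """Average per-token predictive entropy over the response tokens."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits  # (1, seq_len, vocab_size)
    # Logits at position i predict token i+1, so the distributions that
    # generate the response tokens start at (prompt_len - 1).
    start = prompt_ids.shape[1] - 1
    resp_logits = logits[0, start:-1, :]
    probs = F.softmax(resp_logits, dim=-1)
    token_entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1)
    return token_entropy.mean().item()


def select_low_entropy(samples, model, tokenizer, threshold: float = 2.0):
    """Retain only samples whose (re)labeled response scores below the threshold.

    `threshold` is an arbitrary illustrative value; in practice it would be
    tuned or set from the entropy distribution of the dataset.
    """
    kept = []
    for s in samples:  # each sample: {"prompt": ..., "response": ...}
        ent = response_entropy(model, tokenizer, s["prompt"], s["response"])
        if ent < threshold:
            kept.append(s)
    return kept


# Example usage (model name is a placeholder assumption):
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
# model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")
# clean_data = select_low_entropy(relabeled_data, model, tokenizer)
```

The intuition behind such a filter is that a response the model assigns low predictive entropy to is one it is confident about, so high-entropy (uncertain) relabeled samples are withheld from fine-tuning.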