RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
December 19, 2024
Authors: Junyu Luo, Xiao Luo, Kaize Ding, Jingyang Yuan, Zhiping Xiao, Ming Zhang
cs.AI
Abstract
Supervised fine-tuning (SFT) plays a crucial role in adapting large language
models (LLMs) to specific domains or tasks. However, as demonstrated by
empirical experiments, the collected data inevitably contains noise in
practical applications, which poses significant challenges to model performance
on downstream tasks. Therefore, there is an urgent need for a noise-robust SFT
framework to enhance model capabilities in downstream tasks. To address this
challenge, we introduce a robust SFT framework (RobustFT) that performs noise
detection and relabeling on downstream task data. For noise identification, our
approach employs a multi-expert collaborative system with inference-enhanced
models to achieve superior noise detection. In the denoising phase, we utilize
a context-enhanced strategy that incorporates the most relevant and confident
knowledge, followed by careful assessment, to generate reliable annotations.
Additionally, we introduce an effective data selection mechanism based on
response entropy, ensuring only high-quality samples are retained for
fine-tuning. Extensive experiments conducted on multiple LLMs across five
datasets demonstrate RobustFT's exceptional performance in noisy scenarios.
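The response-entropy selection step described in the abstract lends itself to a compact illustration. Below is a minimal Python sketch, assuming access to per-token probability distributions for each relabeled response (e.g., exported from the model's output logits); the data layout, the `keep_ratio` parameter, and the helper names are illustrative assumptions, not the paper's actual interface.

```python
import math

def response_entropy(token_distributions):
    """Mean Shannon entropy (in nats) over a response's token distributions.

    token_distributions: list of probability vectors, one per generated token.
    Lower mean entropy indicates the model is more confident in the response.
    """
    if not token_distributions:
        return float("inf")  # no tokens: treat as maximally uncertain
    total = 0.0
    for dist in token_distributions:
        total += -sum(p * math.log(p) for p in dist if p > 0)
    return total / len(token_distributions)

def select_low_entropy_samples(samples, keep_ratio=0.7):
    """Rank samples by response entropy and keep the most confident fraction.

    samples: list of dicts with a 'token_dists' field (hypothetical layout).
    keep_ratio: fraction of lowest-entropy samples retained for fine-tuning.
    """
    scored = sorted(samples, key=lambda s: response_entropy(s["token_dists"]))
    k = max(1, int(len(scored) * keep_ratio))
    return scored[:k]

# Example: one confident and one uncertain sample.
samples = [
    {"id": "a", "token_dists": [[0.95, 0.05], [0.9, 0.1]]},   # confident
    {"id": "b", "token_dists": [[0.5, 0.5], [0.45, 0.55]]},   # uncertain
]
kept = select_low_entropy_samples(samples, keep_ratio=0.5)
print([s["id"] for s in kept])  # -> ['a']
```

The intuition behind such a filter is that a low mean token entropy marks a confident, internally consistent relabeled response, so ranking by entropy and retaining only the lowest-entropy fraction discards samples the denoising stage is still unsure about.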