

SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation

October 17, 2024
Authors: Junyu Luo, Xiao Luo, Xiusi Chen, Zhiping Xiao, Wei Ju, Ming Zhang
cs.AI

Abstract

Supervised fine-tuning (SFT) is crucial in adapting large language models (LLMs) to a specific domain or task. However, only a limited amount of labeled data is available in practical applications, which poses a severe challenge for SFT in yielding satisfactory results. Therefore, a data-efficient framework that can fully exploit labeled and unlabeled data for LLM fine-tuning is highly anticipated. Towards this end, we introduce a semi-supervised fine-tuning framework named SemiEvol for LLM adaptation from a propagate-and-select manner. For knowledge propagation, SemiEvol adopts a bi-level approach, propagating knowledge from labeled data to unlabeled data through both in-weight and in-context methods. For knowledge selection, SemiEvol incorporates a collaborative learning mechanism, selecting higher-quality pseudo-response samples. We conducted experiments using GPT-4o-mini and Llama-3.1 on seven general or domain-specific datasets, demonstrating significant improvements in model performance on target data. Furthermore, we compared SemiEvol with SFT and self-evolution methods, highlighting its practicality in hybrid data scenarios.
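The propagate-and-select idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation; the `collaborators` lookup tables stand in for fine-tuned model variants sampling pseudo-responses, and the majority-vote agreement threshold is a simplified proxy for SemiEvol's collaborative knowledge-selection mechanism.

```python
from collections import Counter

# Hypothetical stand-ins for LLM calls: each "collaborator" maps a
# query to a candidate answer. In SemiEvol these would be responses
# sampled from model configurations, not static lookup tables.
collaborators = [
    {"q1": "A", "q2": "B"},
    {"q1": "A", "q2": "C"},
    {"q1": "A", "q2": "B"},
]

def propagate(unlabeled):
    """Knowledge propagation (sketch): gather one pseudo-response per
    collaborator for every unlabeled query."""
    return {q: [c[q] for c in collaborators] for q in unlabeled}

def select(pseudo, min_agreement=3):
    """Knowledge selection (sketch): keep a pseudo-response only when
    enough collaborators agree, as a crude proxy for quality."""
    kept = {}
    for q, candidates in pseudo.items():
        answer, votes = Counter(candidates).most_common(1)[0]
        if votes >= min_agreement:
            kept[q] = answer
    return kept

pseudo = propagate(["q1", "q2"])
print(select(pseudo))  # only the unanimous "q1" answer survives
```

The selected pseudo-labeled pairs would then be merged with the labeled data for a further fine-tuning round, which is where the "evolution" in hybrid-data scenarios comes from.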

