ChatPaper.ai

Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

October 25, 2024
Authors: Yujian Liu, Shiyu Chang, Tommi Jaakkola, Yang Zhang
cs.AI

Abstract

Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM to fabricate plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being impacted by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage to learn the necessary knowledge for SFT, allowing subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to enhance the grounding of LLM outputs to their internal knowledge. Experiments show that Prereq-Tune outperforms existing baselines in improving LLM's factuality across short QA and long-form generation tasks. It also opens new possibilities for knowledge-controlled generation in LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/Prereq_tune.git.
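The abstract describes a two-stage recipe: a prerequisite learning stage that first teaches the model the knowledge its SFT data depends on, followed by an SFT stage that trains only the task skill on top of that fixed knowledge. The toy sketch below illustrates the disentanglement idea only; the class and function names are hypothetical, and the real method fine-tunes an LLM (the paper's code is at the linked repository), not a dictionary-backed stand-in.

```python
# Illustrative toy of the two-stage Prereq-Tune idea from the abstract:
# stage 1 absorbs prerequisite knowledge, stage 2 learns task skills
# while leaving that knowledge untouched. All names are assumptions.

from dataclasses import dataclass, field

@dataclass
class ToyLLM:
    # Two disentangled parameter groups: one for knowledge, one for skills.
    knowledge: dict = field(default_factory=dict)
    skill: dict = field(default_factory=dict)

def prerequisite_learning(model: ToyLLM, prereq_data):
    """Stage 1: learn the knowledge the SFT data will rely on."""
    for fact_id, fact in prereq_data:
        model.knowledge[fact_id] = fact  # stand-in for gradient updates

def supervised_fine_tuning(model: ToyLLM, sft_data):
    """Stage 2: learn only task skills; knowledge stays frozen."""
    frozen = dict(model.knowledge)
    for task, behavior in sft_data:
        model.skill[task] = behavior
    # Because the stages are disentangled, SFT never rewrites knowledge,
    # so the model is not pushed to fabricate unfamiliar facts.
    assert model.knowledge == frozen

model = ToyLLM()
prerequisite_learning(model, [("q1", "Paris is the capital of France")])
supervised_fine_tuning(model, [("short_qa", "answer concisely")])
print(model.skill["short_qa"])
```

In the paper's setting the prerequisite data can be fictitious synthetic facts, so the skill learned in stage 2 is to ground answers in whatever the model's internal knowledge holds rather than in surface patterns of the fine-tuning set.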
