Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning

Abstract

Recent studies have identified one aggravating factor of LLM hallucinations as the knowledge inconsistency between pre-training and fine-tuning, where unfamiliar fine-tuning data mislead the LLM into fabricating plausible but wrong outputs. In this paper, we propose a novel fine-tuning strategy called Prereq-Tune to address this knowledge inconsistency and reduce hallucinations. Fundamentally, Prereq-Tune disentangles the learning of skills and knowledge, so the model learns only the task skills without being affected by the knowledge inconsistency. To achieve this, Prereq-Tune introduces an additional prerequisite learning stage to learn the knowledge necessary for SFT, allowing the subsequent SFT to focus only on task skills. Prereq-Tune can also be combined with fictitious synthetic data to strengthen the grounding of LLM outputs in the model's internal knowledge. Experiments show that Prereq-Tune outperforms existing baselines in improving LLM factuality across short QA and long-form generation tasks. It also opens new possibilities for knowledge-controlled generation in LLMs. Our code is available at https://github.com/UCSB-NLP-Chang/Prereq_tune.git.
