
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

February 13, 2025
作者: Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, James Glass, Shang-Wen Li, Wen-tau Yih
cs.AI

Abstract

We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of only relying on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: If a citation is necessary, removing the cited text from the context should prevent the same response; if sufficient, retaining the cited text alone should preserve the same response. This reward can guide the inference-time best-of-N sampling strategy to improve citation quality significantly, as well as be used in preference optimization to directly fine-tune the models for generating better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks.
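The context-ablation reward described above combines two signals: necessity (removing the cited text from the context should make the response less likely) and sufficiency (the cited text alone should keep the response likely). A minimal sketch of this idea follows, assuming only what the abstract states; the `log_prob` function here is a toy word-overlap stand-in for the LLM log-probability scorer the paper actually uses, and all function names are illustrative, not from the paper's code.

```python
def log_prob(response: str, context: str) -> float:
    # Toy stand-in (hypothetical): counts response words found in the context.
    # A real implementation would query the LLM for log p(response | context).
    ctx_words = set(context.split())
    return float(sum(1 for w in response.split() if w in ctx_words))


def ablation_reward(response: str, full_context: str, cited_text: str) -> float:
    """Context-ablation reward from the abstract:
    necessity  = score drop when the cited text is removed from the context;
    sufficiency = score change when only the cited text is kept."""
    context_without = full_context.replace(cited_text, "")
    necessity = log_prob(response, full_context) - log_prob(response, context_without)
    sufficiency = log_prob(response, cited_text) - log_prob(response, full_context)
    return necessity + sufficiency


def best_of_n(response: str, full_context: str, candidate_citations: list[str]) -> str:
    # Inference-time best-of-N: keep the candidate citation with highest reward.
    return max(candidate_citations, key=lambda c: ablation_reward(response, full_context, c))
```

Under this sketch, a citation that actually supports the response scores higher than an irrelevant one, which is the property the best-of-N sampling strategy exploits; the same reward can serve as a preference signal for fine-tuning.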
