Can Knowledge Editing Really Correct Hallucinations?
October 21, 2024
Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu
cs.AI
Abstract
Large Language Models (LLMs) suffer from hallucinations, i.e., non-factual
information in generated content, despite their superior capabilities across
tasks. Meanwhile, knowledge editing has emerged as a popular new paradigm for
correcting erroneous factual knowledge encoded in LLMs, with the advantage of
avoiding retraining from scratch. However, a common issue with existing
evaluation datasets for knowledge editing is that they do not ensure LLMs
actually generate hallucinated answers to the evaluation questions before
editing. When LLMs are evaluated on such datasets after being edited by
different techniques, the measured performance cannot be directly used to
assess how effectively different knowledge editing methods correct
hallucinations. Thus, the fundamental question remains insufficiently
validated: Can knowledge editing really correct hallucinations in LLMs? We
propose HalluEditBench to holistically benchmark knowledge editing methods on
correcting real-world hallucinations. First, we rigorously construct a massive
hallucination dataset spanning 9 domains, 26 topics, and more than 6,000
hallucinations. Then, we assess the performance of knowledge editing methods
holistically on five dimensions: Efficacy, Generalization, Portability,
Locality, and Robustness. Through HalluEditBench, we provide new insights into
the potential and limitations of different knowledge editing methods for
correcting hallucinations, which could inspire future improvements and
facilitate progress in the field of knowledge editing.
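The abstract's central point is that an edit can only be credited with "correcting" a hallucination if the unedited model demonstrably hallucinated on that question first. A minimal sketch of that idea, assuming a simple exact-match setup (the function name `efficacy` and the mock answer lists are hypothetical, not the paper's actual implementation):

```python
# Hypothetical sketch (not HalluEditBench's code): score an edit's "Efficacy"
# as the fraction of confirmed pre-edit hallucinations that the edited model
# now answers correctly. Questions the unedited model already got right are
# excluded, since fixing them tells us nothing about hallucination correction.

def efficacy(pre_answers, post_answers, gold_answers):
    """pre/post/gold are aligned per-question answer lists (exact-match)."""
    # Keep only questions where the unedited model hallucinated.
    hallucinated = [
        i for i, (p, g) in enumerate(zip(pre_answers, gold_answers)) if p != g
    ]
    if not hallucinated:
        return 0.0
    # Count how many of those the edited model now gets right.
    fixed = sum(1 for i in hallucinated if post_answers[i] == gold_answers[i])
    return fixed / len(hallucinated)

# Toy example with mock answers:
pre  = ["Paris", "Mars", "1809"]   # unedited model's answers
post = ["Paris", "Venus", "1869"]  # edited model's answers
gold = ["Paris", "Venus", "1869"]  # ground-truth answers
print(efficacy(pre, post, gold))   # → 1.0 (both confirmed hallucinations fixed)
```

The other dimensions (Generalization, Portability, Locality, Robustness) would score the same edited model on different question sets, e.g. rephrased or neighboring questions, rather than the original evaluation questions.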