Can Knowledge Editing Really Correct Hallucinations?

October 21, 2024
Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu
cs.AI

Abstract

Large Language Models (LLMs) suffer from hallucinations, i.e., non-factual information in generated content, despite their strong capabilities across tasks. Meanwhile, knowledge editing has been developed as a popular new paradigm for correcting erroneous factual knowledge encoded in LLMs, with the advantage of avoiding retraining from scratch. However, a common issue with existing evaluation datasets for knowledge editing is that they do not ensure LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, their performance cannot be directly adopted to assess how effectively different knowledge editing methods correct hallucinations. Thus, a fundamental question remains insufficiently validated: can knowledge editing really correct hallucinations in LLMs? We propose HalluEditBench to holistically benchmark knowledge editing methods on correcting real-world hallucinations. First, we rigorously construct a massive hallucination dataset spanning 9 domains, 26 topics, and more than 6,000 hallucinations. Then, we assess the performance of knowledge editing methods along five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. Through HalluEditBench, we provide new insights into the potential and limitations of different knowledge editing methods in correcting hallucinations, which could inspire future improvements and facilitate progress in the field of knowledge editing.
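The two-step protocol the abstract describes, first verifying that the unedited model genuinely hallucinates on a question and then scoring the edited model along Efficacy, Generalization, Portability, Locality, and Robustness, can be illustrated with a minimal Python sketch. This is not the paper's released code; `query_model`, `apply_edit`, and the fields of `HallucinationCase` are hypothetical placeholders introduced only to show the evaluation structure.

```python
# Minimal illustrative sketch (not the authors' implementation) of the
# HalluEditBench-style protocol described in the abstract.
from dataclasses import dataclass

@dataclass
class HallucinationCase:
    question: str            # Efficacy: the question the unedited model answers wrongly
    correct_answer: str
    paraphrase: str          # Generalization: rephrased form of the question
    multi_hop_question: str  # Portability: reasoning that builds on the edited fact
    unrelated_question: str  # Locality: nearby knowledge that must stay unchanged
    unrelated_answer: str
    adversarial_prompt: str  # Robustness: prompt that tries to undo the edit

def query_model(model, prompt: str) -> str:
    """Hypothetical placeholder: query the (edited or unedited) LLM."""
    raise NotImplementedError

def apply_edit(model, editor, question: str, answer: str):
    """Hypothetical placeholder: apply one knowledge edit and return the edited model."""
    raise NotImplementedError

def hallucinates(model, case: HallucinationCase) -> bool:
    """Keep only cases where the unedited model actually answers incorrectly."""
    return case.correct_answer.lower() not in query_model(model, case.question).lower()

def evaluate_edit(edited_model, case: HallucinationCase) -> dict:
    """Score one edited case on the five dimensions (1.0 = desired behavior)."""
    def correct(prompt: str, answer: str) -> float:
        return float(answer.lower() in query_model(edited_model, prompt).lower())
    return {
        "efficacy": correct(case.question, case.correct_answer),
        "generalization": correct(case.paraphrase, case.correct_answer),
        "portability": correct(case.multi_hop_question, case.correct_answer),
        "locality": correct(case.unrelated_question, case.unrelated_answer),
        "robustness": correct(case.adversarial_prompt, case.correct_answer),
    }

def run_benchmark(model, candidate_cases, editor) -> dict:
    # Step 1: retain only questions the model genuinely hallucinates on before editing.
    cases = [c for c in candidate_cases if hallucinates(model, c)]
    # Step 2: edit the model per case and score each retained case on all dimensions.
    results = [evaluate_edit(apply_edit(model, editor, c.question, c.correct_answer), c)
               for c in cases]
    # Average each dimension over all retained cases.
    dims = ["efficacy", "generalization", "portability", "locality", "robustness"]
    return {d: sum(r[d] for r in results) / max(len(results), 1) for d in dims}
```

The pre-filtering step in `run_benchmark` is what distinguishes this protocol from earlier evaluation datasets: editing methods are only credited for fixing answers the model demonstrably got wrong before the edit.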
