HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
January 14, 2025
Authors: Abhilasha Ravichander, Shrusti Ghela, David Wadden, Yejin Choi
cs.AI
Abstract
Despite their impressive ability to generate high-quality and fluent text,
generative large language models (LLMs) also produce hallucinations: statements
that are misaligned with established world knowledge or provided input context.
However, measuring hallucination can be challenging, as having humans verify
model generations on the fly is both expensive and time-consuming. In this
work, we release HALoGEN, a comprehensive hallucination benchmark consisting
of: (1) 10,923 prompts for generative models spanning nine domains including
programming, scientific attribution, and summarization, and (2) automatic
high-precision verifiers for each use case that decompose LLM generations into
atomic units, and verify each unit against a high-quality knowledge source. We
use this framework to evaluate ~150,000 generations from 14 language models,
finding that even the best-performing models are riddled with hallucinations
(in some domains, up to 86% of generated atomic facts are hallucinated). We
further define a novel error classification for LLM hallucinations based on
whether they likely stem from incorrect recollection of training data (Type A
errors), incorrect knowledge in the training data (Type B errors), or
fabrication (Type C errors). We hope our framework provides a foundation to
enable the principled study of why generative models hallucinate, and advances
the development of trustworthy large language models.
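The decompose-then-verify idea described in the abstract can be illustrated with a minimal sketch. This is not the HALoGEN implementation: the function names, the toy package list, and the import-based decomposition are all assumptions chosen for illustration. It computes the share of atomic units in one generation that a verifier cannot support.

```python
from typing import Callable, List

def hallucination_rate(
    generation: str,
    decompose: Callable[[str], List[str]],
    is_supported: Callable[[str], bool],
) -> float:
    """Return the fraction of atomic units that the verifier does not support."""
    units = decompose(generation)
    if not units:
        # No verifiable units: report 0.0 rather than dividing by zero.
        return 0.0
    unsupported = sum(1 for unit in units if not is_supported(unit))
    return unsupported / len(units)

# Hypothetical stand-in for a knowledge source (e.g. a package index).
KNOWN_PACKAGES = {"numpy", "requests"}

def decompose_imports(generation: str) -> List[str]:
    """Toy decomposer: treat each `import X` line as one atomic unit."""
    units = []
    for line in generation.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "import":
            units.append(parts[1])
    return units

def package_exists(name: str) -> bool:
    """Toy verifier: a unit is supported if the package is in the known set."""
    return name in KNOWN_PACKAGES

rate = hallucination_rate(
    "import numpy\nimport fastjsonx\n", decompose_imports, package_exists
)
print(rate)  # 0.5: one of the two imported packages is not in the known set
```

Passing the decomposer and verifier as arguments mirrors the abstract's description of a separate verifier per use case; a real verifier would query an actual knowledge source instead of a hard-coded set.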