Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base
March 30, 2025
Authors: Linxin Song, Xuwei Ding, Jieyu Zhang, Taiwei Shi, Ryotaro Shimizu, Rahul Gupta, Yang Liu, Jian Kang, Jieyu Zhao
cs.AI
Abstract
Large language models (LLMs) possess impressive linguistic capabilities but
often fail to faithfully retain factual knowledge, leading to hallucinations
and unreliable outputs. Understanding LLMs' knowledge deficiencies by
exhaustively evaluating against full-scale knowledge bases is computationally
prohibitive, especially for closed-weight models. We propose stochastic error
ascent (SEA), a scalable and efficient framework for discovering knowledge
deficiencies (errors) in closed-weight LLMs under a strict query budget. Rather
than naively probing all knowledge candidates, SEA formulates error discovery
as a stochastic optimization process: it iteratively retrieves new high-error
candidates by leveraging semantic similarity to previously observed
failures. To further enhance search efficiency and coverage, SEA employs
hierarchical retrieval across document and paragraph levels, and constructs a
relation directed acyclic graph to model error propagation and identify
systematic failure modes. Empirically, SEA uncovers 40.7x more knowledge errors
than Automated Capability Discovery and 26.7% more than AutoBencher, while
reducing the cost-per-error by 599x and 9x, respectively. Human evaluation
confirms the high quality of generated questions, while ablation and
convergence analyses validate the contribution of each component in SEA.
Further analysis on the discovered errors reveals correlated failure patterns
across LLM families and recurring deficits, highlighting the need for better
data coverage and targeted fine-tuning in future LLM development.
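The core loop the abstract describes (under a fixed query budget, iteratively selecting new knowledge candidates that are semantically similar to previously observed failures) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the cosine-similarity ranking over precomputed embeddings, and the random fallback when no failures have been seen yet are all assumptions for the sketch.

```python
import numpy as np

def _cosine(a, b):
    # Row-wise cosine similarity between two sets of embedding vectors.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def stochastic_error_ascent(kb_embeddings, evaluate, budget, batch_size=4, seed=0):
    """Toy SEA-style error discovery (illustrative, not the paper's code).

    kb_embeddings: (N, d) array, one embedding per knowledge candidate.
    evaluate: callable(index) -> bool, True if the target model errs on
        the candidate (in SEA this would involve generating a question
        and checking the model's answer; here it is abstracted away).
    budget: maximum number of candidates that may be evaluated.
    Returns the indices of candidates on which errors were found.
    """
    rng = np.random.default_rng(seed)
    unseen = set(range(len(kb_embeddings)))
    failures, spent = [], 0
    while spent < budget and unseen:
        if failures:
            # Rank unseen candidates by similarity to any prior failure
            # and probe the closest ones first ("error ascent").
            cand = np.array(sorted(unseen))
            sims = _cosine(kb_embeddings[cand], kb_embeddings[failures]).max(axis=1)
            batch = cand[np.argsort(-sims)][:batch_size]
        else:
            # No failures observed yet: fall back to random exploration.
            batch = rng.choice(sorted(unseen),
                               size=min(batch_size, len(unseen)),
                               replace=False)
        for i in batch:
            if spent >= budget:
                break
            spent += 1
            unseen.discard(int(i))
            if evaluate(int(i)):
                failures.append(int(i))
    return failures
```

The full method additionally retrieves hierarchically (documents before paragraphs) and links discovered errors in a relation DAG to surface systematic failure modes; the sketch above only captures the budgeted similarity-guided search.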