Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations

April 18, 2025
Authors: Chenghao Xiao, Hou Pong Chan, Hao Zhang, Mahani Aljunied, Lidong Bing, Noura Al Moubayed, Yu Rong
cs.AI

Abstract

Understanding the knowledge boundaries of LLMs is crucial for preventing hallucination, yet research on those boundaries has predominantly focused on English. In this work, we present the first study to analyze how LLMs recognize knowledge boundaries across different languages by probing their internal representations when processing known and unknown questions in multiple languages. Our empirical studies reveal three key findings: 1) LLMs' perceptions of knowledge boundaries are encoded in the middle to middle-upper layers across different languages; 2) language differences in knowledge boundary perception follow a linear structure, which motivates our proposal of a training-free alignment method that effectively transfers knowledge boundary perception ability across languages, thereby helping reduce hallucination risk in low-resource languages; 3) fine-tuning on bilingual question pair translation further enhances LLMs' recognition of knowledge boundaries across languages. Given the absence of standard testbeds for cross-lingual knowledge boundary analysis, we construct a multilingual evaluation suite comprising three representative types of knowledge boundary data. Our code and datasets are publicly available at https://github.com/DAMO-NLP-SG/LLM-Multilingual-Knowledge-Boundaries.
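
To make the second finding more concrete, the toy sketch below shows one way a linear language difference could be exploited without training: a linear probe is fit on one language, and vectors from another language are shifted by the difference of language means before being probed. This is an illustration under stated assumptions, not the authors' method. The data are synthetic stand-ins for hidden states, and every name (`make_language`, `fit_probe`, `mean_shift`, the offsets and dimensions) is hypothetical.

```python
import random

def mean_vec(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def make_language(rng, n, direction, offset, noise=0.5, scale=2.0):
    """Synthetic 'hidden states': label signal along `direction` plus a language offset."""
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)  # 1 = known question, 0 = unknown question
        sign = 1.0 if y else -1.0
        x = [sign * scale * d + o + rng.gauss(0.0, noise)
             for d, o in zip(direction, offset)]
        xs.append(x)
        ys.append(y)
    return xs, ys

def fit_probe(xs, ys):
    """Nearest-class-mean linear probe: returns weights w and bias b."""
    mu1 = mean_vec([x for x, y in zip(xs, ys) if y == 1])
    mu0 = mean_vec([x for x, y in zip(xs, ys) if y == 0])
    w = [a - c for a, c in zip(mu1, mu0)]
    b = -sum(wi * (a + c) / 2 for wi, a, c in zip(w, mu1, mu0))
    return w, b

def accuracy(w, b, xs, ys):
    """Fraction of vectors whose probe prediction matches the label."""
    preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

def mean_shift(xs, src_mean, tgt_mean):
    """Translate target-language vectors so their mean matches the source mean."""
    return [[xi - t + s for xi, t, s in zip(x, tgt_mean, src_mean)] for x in xs]

rng = random.Random(0)
dim = 8
direction = [1.0] + [0.0] * (dim - 1)        # assumed known/unknown axis
src_offset = [0.0] * dim                     # stand-in for a high-resource language
tgt_offset = [3.0, 2.0] + [0.0] * (dim - 2)  # stand-in for a low-resource language

src_x, src_y = make_language(rng, 400, direction, src_offset)
tgt_x, tgt_y = make_language(rng, 400, direction, tgt_offset)

# Fit the probe on the source language only, then test it on the target language.
w, b = fit_probe(src_x, src_y)
acc_raw = accuracy(w, b, tgt_x, tgt_y)
aligned = mean_shift(tgt_x, mean_vec(src_x), mean_vec(tgt_x))
acc_aligned = accuracy(w, b, aligned, tgt_y)
print(f"target accuracy without shift: {acc_raw:.2f}, with shift: {acc_aligned:.2f}")
```

In this toy setup the language offset is a pure translation, so matching the means is enough for the source-language probe to transfer. Real hidden states need not behave this cleanly, and the paper's released code should be consulted for the actual procedure.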
