
A Survey on the Honesty of Large Language Models

September 27, 2024
Authors: Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam
cs.AI

Abstract

Honesty is a fundamental principle for aligning large language models (LLMs) with human values, requiring these models to recognize what they know and don't know and to faithfully express their knowledge. Despite their promise, current LLMs still exhibit significant dishonest behaviors, such as confidently presenting wrong answers or failing to express what they know. In addition, research on the honesty of LLMs faces challenges of its own, including varying definitions of honesty, difficulties in distinguishing between known and unknown knowledge, and a lack of comprehensive understanding of related research. To address these issues, we provide a survey on the honesty of LLMs, covering its clarification, evaluation approaches, and strategies for improvement. Moreover, we offer insights for future research, aiming to inspire further exploration in this important area.
