We Can't Understand AI Using our Existing Vocabulary

February 11, 2025
Authors: John Hewitt, Robert Geirhos, Been Kim
cs.AI

Abstract

This position paper argues that, in order to understand AI, we cannot rely on our existing vocabulary of human words. Instead, we should strive to develop neologisms: new words that represent precise human concepts that we want to teach machines, or machine concepts that we need to learn. We start from the premise that humans and machines have differing concepts. This means interpretability can be framed as a communication problem: humans must be able to reference and control machine concepts, and communicate human concepts to machines. Creating a shared human-machine language through developing neologisms, we believe, could solve this communication problem. Successful neologisms achieve a useful amount of abstraction: not too detailed, so they're reusable in many contexts, and not too high-level, so they convey precise information. As a proof of concept, we demonstrate how a "length neologism" enables controlling LLM response length, while a "diversity neologism" allows sampling more variable responses. Taken together, we argue that we cannot understand AI using our existing vocabulary, and expanding it through neologisms creates opportunities for both controlling and understanding machines better.
