Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information

Abstract

While the ability of language models to elicit facts has been widely investigated, how they handle temporally changing facts remains underexplored. Through circuit analysis, we discover Temporal Heads, specific attention heads primarily responsible for processing temporal knowledge. We confirm that these heads are present across multiple models, though their specific locations may vary, and that their responses differ depending on the type of knowledge and its corresponding years. Disabling these heads degrades the model's ability to recall time-specific knowledge while preserving its general capabilities, leaving time-invariant knowledge and question-answering performance intact. Moreover, the heads are activated not only by numeric conditions ("In 2004") but also by textual aliases ("In the year ..."), indicating that they encode a temporal dimension beyond simple numerical representation. Furthermore, we expand the potential of our findings by demonstrating how temporal knowledge can be edited by adjusting the values of these heads.
