

On the Acquisition of Shared Grammatical Representations in Bilingual Language Models

March 5, 2025
Authors: Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen
cs.AI

Abstract

While crosslingual transfer is crucial to contemporary language models' multilingual capabilities, how it occurs is not well understood. In this paper, we ask what happens to a monolingual language model when it begins to be trained on a second language. Specifically, we train small bilingual models for which we control the amount of data for each language and the order of language exposure. To find evidence of shared multilingual representations, we turn to structural priming, a method used to study grammatical representations in humans. We first replicate previous crosslingual structural priming results and find that after controlling for training data quantity and language exposure, there are asymmetrical effects across language pairs and directions. We argue that this asymmetry may shape hypotheses about human structural priming effects. We also find that structural priming effects are less robust for less similar language pairs, highlighting potential limitations of crosslingual transfer learning and shared representations for typologically diverse languages.
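The abstract's central method is structural priming: a grammatical structure (e.g., the double-object dative) becomes more probable after exposure to a sentence sharing that structure. Below is a minimal sketch of how such an effect might be quantified with a causal language model, by comparing a target sentence's log-probability after a structurally congruent versus incongruent prime. The model name (`gpt2`, a stand-in; the paper trains its own small bilingual models) and the sentence pairs are illustrative assumptions, not the authors' materials.

```python
# Sketch: measuring a structural priming effect with a causal LM.
# Assumes the prime's tokenization is a prefix of the concatenated
# sequence's tokenization, which holds for typical BPE tokenizers
# when sentences are joined with a space.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder, not one of the paper's bilingual models
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def target_logprob(prime: str, target: str) -> float:
    """Log-probability of `target` conditioned on `prime` as context."""
    prime_ids = tokenizer(prime, return_tensors="pt").input_ids
    full_ids = tokenizer(prime + " " + target, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Logits at position i predict the token at position i + 1,
    # so score only the positions covering the target tokens.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    target_positions = range(prime_ids.shape[1] - 1, full_ids.shape[1] - 1)
    return sum(
        log_probs[pos, full_ids[0, pos + 1]].item() for pos in target_positions
    )

# Congruent prime shares the target's structure (double object);
# incongruent prime uses the alternation (prepositional object).
congruent = "The teacher gave the student a book."
incongruent = "The teacher gave a book to the student."
target = "The chef handed the waiter a plate."

effect = target_logprob(congruent, target) - target_logprob(incongruent, target)
print(f"Priming effect (log-prob difference): {effect:.4f}")
```

A positive difference indicates priming: the congruent prime raised the target structure's probability. In the paper's crosslingual setting, prime and target come from different languages and effects are aggregated over many sentence pairs in both directions; this monolingual toy example only illustrates the measurement logic.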
