Language Models are Symbolic Learners in Arithmetic
October 21, 2024
Authors: Chunyuan Deng, Zhiqi Li, Roy Xie, Ruidi Chang, Hanjie Chen
cs.AI
Abstract
Large Language Models (LLMs) are thought to struggle with arithmetic learning
due to the inherent differences between language modeling and numerical
computation, but concrete evidence has been lacking. This work responds to this
claim through a two-side experiment. We first investigate whether LLMs leverage
partial products during arithmetic learning. We find that although LLMs can
identify some partial products after learning, they fail to leverage them for
arithmetic tasks. We then explore how LLMs approach arithmetic
symbolically by breaking tasks into subgroups, hypothesizing that difficulties
arise from subgroup complexity and selection. Our results show that when
subgroup complexity is fixed, LLMs treat a collection of different arithmetic
operations similarly. By analyzing position-level accuracy across different
training sizes, we further observe that it follows a U-shaped pattern: LLMs
quickly learn the easiest patterns at the first and last positions, while
progressively learning the more difficult patterns in the middle positions.
This suggests that LLMs select subgroups following an easy-to-hard paradigm
during learning. Our work confirms that LLMs are pure symbolic learners in
arithmetic tasks and underscores the importance of understanding them deeply
through subgroup-level quantification.
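To illustrate the position-level accuracy metric whose U-shaped pattern the abstract describes, here is a minimal Python sketch. It is not from the paper; the function name and example data are hypothetical, and it assumes model outputs and references are digit strings of equal width. In the paper's setting this quantity would be tracked across different training sizes; the sketch only shows how it is computed for a fixed set of predictions.

# Minimal sketch (not from the paper): position-level accuracy for arithmetic
# outputs. Assumes predictions and references are digit strings of equal width;
# the example data below are hypothetical.

def position_level_accuracy(preds, refs):
    """Fraction of examples with the correct digit at each output position."""
    width = len(refs[0])
    correct = [0] * width
    for pred, ref in zip(preds, refs):
        for i in range(width):
            if i < len(pred) and pred[i] == ref[i]:
                correct[i] += 1
    return [c / len(preds) for c in correct]

# Errors concentrated in the middle positions produce the U-shaped pattern
# described above: high accuracy at the first and last digits, lower in between.
refs  = ["12345", "12345", "12345"]
preds = ["12345", "12945", "12195"]
print(position_level_accuracy(preds, refs))  # [1.0, 1.0, 0.33, 0.67, 1.0]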