Physics in Next-token Prediction

November 1, 2024
Authors: Hongjun An, Yiliang Song, Xuelong Li
cs.AI

Abstract

We discovered the underlying physics in Next-token Prediction (NTP). We identified the law of information conservation within NTP and proposed the First Law of Information Capacity (IC-1), demonstrating that the essence of intelligence emergence in auto-regressive models is fundamentally a process of information transfer. We also introduced Landauer's Principle into NTP, formulating the Second Law of Information Capacity (IC-2), which establishes the relationship between auto-regressive model training and energy consumption. Additionally, we presented several corollaries, which hold practical significance for production practices. Finally, we validated the compatibility and complementarity of our findings with existing theories.
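
For readers unfamiliar with Landauer's Principle, which the abstract invokes for IC-2, the following is a minimal sketch of the standard bound only: erasing one bit of information at temperature T dissipates at least k_B T ln 2 of energy. It is not the paper's IC-2 formulation, which this page does not state. The function name `landauer_min_energy_joules` and the 300 K default are illustrative assumptions.

```python
import math

# Boltzmann constant in J/K (exact under the 2019 SI definition).
BOLTZMANN_J_PER_K = 1.380649e-23


def landauer_min_energy_joules(bits: float, temperature_kelvin: float = 300.0) -> float:
    """Landauer lower bound on the energy (in joules) needed to erase `bits` bits.

    Hypothetical helper for illustration; it implements only the textbook
    bound E >= bits * k_B * T * ln(2), not the paper's IC-2.
    """
    if bits < 0:
        raise ValueError("bits must be non-negative")
    if temperature_kelvin <= 0:
        raise ValueError("temperature must be positive (kelvin)")
    return bits * BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    # Energy floor for erasing a single bit at roughly room temperature.
    print(landauer_min_energy_joules(1))
```

At 300 K the bound is about 2.87e-21 J per bit, many orders of magnitude below what current hardware actually dissipates per operation, so it acts as a theoretical floor rather than a practical estimate.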
