

LLM Pretraining with Continuous Concepts

February 12, 2025
Authors: Jihoon Tack, Jack Lanchantin, Jane Yu, Andrew Cohen, Ilia Kulikov, Janice Lan, Shibo Hao, Yuandong Tian, Jason Weston, Xian Li
cs.AI

Abstract

Next token prediction has been the standard training objective used in large language model pretraining. Representations are learned as a result of optimizing for token-level perplexity. We propose Continuous Concept Mixing (CoCoMix), a novel pretraining framework that combines discrete next token prediction with continuous concepts. Specifically, CoCoMix predicts continuous concepts learned from a pretrained sparse autoencoder and mixes them into the model's hidden state by interleaving with token hidden representations. Through experiments on multiple benchmarks, including language modeling and downstream reasoning tasks, we show that CoCoMix is more sample efficient and consistently outperforms standard next token prediction, knowledge distillation and inserting pause tokens. We find that combining both concept learning and interleaving in an end-to-end framework is critical to performance gains. Furthermore, CoCoMix enhances interpretability and steerability by allowing direct inspection and modification of the predicted concept, offering a transparent way to guide the model's internal reasoning process.
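
The abstract describes two mechanisms: predicting continuous concepts (whose targets come from a pretrained sparse autoencoder) alongside the usual next-token objective, and mixing the predicted concept vector back into the token hidden representations. Below is a minimal, self-contained sketch of that idea, not the authors' implementation: the dimensions, the binary cross-entropy concept objective, and the residual-style mixing are illustrative assumptions, and `sae_targets` stands in for concept activations produced by a frozen, pretrained sparse autoencoder.

```python
# Minimal sketch of the CoCoMix idea (illustrative assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

d_model, n_concepts, top_k = 512, 4096, 32

concept_head = nn.Linear(d_model, n_concepts)   # predicts SAE concept activations
concept_embed = nn.Linear(n_concepts, d_model)  # compresses concepts to a continuous vector

def cocomix_step(hidden, sae_targets):
    """hidden: (batch, seq, d_model) intermediate transformer hidden states.
    sae_targets: (batch, seq, n_concepts) concept activations from a frozen,
    pretrained sparse autoencoder (hypothetical tensor for this sketch)."""
    logits = concept_head(hidden)

    # Concept prediction loss: treat the top-k most active SAE latents as
    # multi-label targets (one plausible choice for this sketch).
    topk_idx = sae_targets.topk(top_k, dim=-1).indices
    targets = torch.zeros_like(logits).scatter_(-1, topk_idx, 1.0)
    concept_loss = F.binary_cross_entropy_with_logits(logits, targets)

    # Continuous concept vector: compress the predicted concepts and mix them
    # with the token hidden states. A residual-style addition is used here for
    # simplicity; the abstract describes interleaving the concept vector with
    # the token hidden representations.
    concept_vec = concept_embed(torch.sigmoid(logits))
    mixed_hidden = hidden + concept_vec
    return mixed_hidden, concept_loss
```

In a pretraining loop, `concept_loss` would be added to the standard next-token cross-entropy, so the model is trained end to end on both objectives; `mixed_hidden` would feed the remaining transformer layers. Because the concept predictions are explicit, they can be inspected or overwritten at inference time, which is the interpretability and steerability property the abstract highlights.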

