Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction
April 21, 2025
Authors: Vaishnavh Nagarajan, Chen Henry Wu, Charles Ding, Aditi Raghunathan
cs.AI
Abstract
We design a suite of minimal algorithmic tasks that are a loose abstraction of open-ended real-world tasks. This allows us to cleanly and controllably quantify the creative limits of present-day language models. Much like real-world tasks that require a creative, far-sighted leap of thought, our tasks require an implicit, open-ended stochastic planning step that either (a) discovers new connections in an abstract knowledge graph (as in wordplay, drawing analogies, or research) or (b) constructs new patterns (as in designing math problems or new proteins). On these tasks, we argue empirically and conceptually that next-token learning is myopic and memorizes excessively; in contrast, multi-token approaches, namely teacherless training and diffusion models, excel at producing diverse and original output. Second, we find that to elicit randomness from the Transformer without hurting coherence, it is better to inject noise directly at the input layer (via a method we dub hash-conditioning) than to rely on temperature sampling at the output layer. Thus, our work offers a principled, minimal test-bed for analyzing open-ended creative skills, and provides new arguments for going beyond next-token learning and softmax-based sampling. We make part of the code available at https://github.com/chenwu98/algorithmic-creativity.
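
To make the contrast concrete, below is a minimal, hypothetical sketch of the two sources of randomness the abstract compares: temperature sampling at the output softmax versus prepending random tokens at the input, in the spirit of hash-conditioning. This is not the paper's implementation; the vocabulary size, prefix length, and function names are illustrative assumptions.

# Minimal sketch (not the paper's code) contrasting two ways of injecting
# randomness into autoregressive generation:
#   (a) output-layer randomness: softmax sampling with a temperature;
#   (b) input-layer randomness ("hash-conditioning" in spirit): prepend random
#       prefix tokens to the prompt, so decoding afterwards can be greedy.
# VOCAB_SIZE, HASH_PREFIX_LEN, and the helper names are illustrative.

import torch
import torch.nn.functional as F

VOCAB_SIZE = 64
HASH_PREFIX_LEN = 4  # number of random "hash" tokens prepended to the prompt


def sample_with_temperature(logits: torch.Tensor, temperature: float) -> int:
    """(a) Sample the next token from temperature-scaled output logits."""
    probs = F.softmax(logits / temperature, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))


def hash_conditioned_prompt(prompt: torch.Tensor) -> torch.Tensor:
    """(b) Prefix the prompt with random tokens drawn once per generation."""
    hash_tokens = torch.randint(0, VOCAB_SIZE, (HASH_PREFIX_LEN,))
    return torch.cat([hash_tokens, prompt])


if __name__ == "__main__":
    prompt = torch.tensor([1, 2, 3])
    fake_logits = torch.randn(VOCAB_SIZE)  # stand-in for a model's next-token logits
    print("temperature sample:", sample_with_temperature(fake_logits, temperature=1.0))
    print("hash-conditioned input:", hash_conditioned_prompt(prompt).tolist())
    # With (b), diversity across generations comes from the random prefix,
    # so per-token decoding can be made deterministic (temperature -> 0).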