

Ideas in Inference-time Scaling can Benefit Generative Pre-training Algorithms

March 10, 2025
Authors: Jiaming Song, Linqi Zhou
cs.AI

Abstract

Recent years have seen significant advancements in foundation models through generative pre-training, yet algorithmic innovation in this space has largely stagnated around autoregressive models for discrete signals and diffusion models for continuous signals. This stagnation creates a bottleneck that prevents us from fully unlocking the potential of rich multi-modal data, which in turn limits the progress on multimodal intelligence. We argue that an inference-first perspective, which prioritizes scaling efficiency during inference time across sequence length and refinement steps, can inspire novel generative pre-training algorithms. Using Inductive Moment Matching (IMM) as a concrete example, we demonstrate how addressing limitations in diffusion models' inference process through targeted modifications yields a stable, single-stage algorithm that achieves superior sample quality with over an order of magnitude greater inference efficiency.
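To make the abstract's "inference-first" idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of the kind of few-step generator the abstract alludes to: a network f_theta(x_t, t, s) that jumps directly from noise level t to level s, trained by matching the distribution of its one-step jump against a bootstrap target that takes a smaller jump from an intermediate level r. The moment-matching loss here uses a simple RBF-kernel MMD, and all names (FewStepGenerator, ddim_interpolate, rbf_mmd) and the linear noise schedule are assumptions for illustration; the actual IMM objective and schedule are specified in the paper.

```python
# Illustrative sketch only: a few-step generator trained with a simple
# moment-matching (MMD) objective against a bootstrap target.
# All names, schedules, and hyperparameters are hypothetical.

import torch

def rbf_mmd(x, y, bandwidth=1.0):
    """Biased MMD^2 estimate with an RBF kernel (illustrative only)."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def ddim_interpolate(x, x0, t, r):
    """Move a noisy sample x at level t to a less-noisy level r, assuming a
    simple linear interpolation x_t = (1 - t) * x0 + t * noise."""
    noise = (x - (1 - t) * x0) / t
    return (1 - r) * x0 + r * noise

class FewStepGenerator(torch.nn.Module):
    """f_theta(x_t, t, s): predicts a sample at noise level s given one at level t."""
    def __init__(self, dim=2, hidden=128):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + 2, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, hidden), torch.nn.SiLU(),
            torch.nn.Linear(hidden, dim),
        )

    def forward(self, x_t, t, s):
        ts = torch.stack([t, s], dim=-1)          # (batch, 2) noise-level conditioning
        return self.net(torch.cat([x_t, ts], dim=-1))

# One hypothetical training step on toy 2-D data.
model = FewStepGenerator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x0 = torch.randn(256, 2) * 0.5 + 2.0              # stand-in "data"
t = torch.full((256,), 0.8)                        # current noise level
s = torch.full((256,), 0.1)                        # target noise level
r = torch.full((256,), 0.4)                        # intermediate bootstrap level

noise = torch.randn_like(x0)
x_t = (1 - t[:, None]) * x0 + t[:, None] * noise   # corrupt data to level t
x_r = ddim_interpolate(x_t, x0, t[:, None], r[:, None])

pred = model(x_t, t, s)                            # one large jump t -> s
with torch.no_grad():
    target = model(x_r, r, s)                      # smaller jump r -> s (bootstrap target)

loss = rbf_mmd(pred, target)                       # match distributions, not samples
loss.backward()
opt.step()
```

The point of the sketch is the inference-time contrast: sampling with such a model takes a handful of direct t -> s jumps rather than hundreds of small denoising steps, which is the source of the "order of magnitude" efficiency claim in the abstract.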
