Parallelized Autoregressive Visual Generation
December 19, 2024
Authors: Yuqing Wang, Shuhuai Ren, Zhijie Lin, Yujin Han, Haoyuan Guo, Zhenheng Yang, Difan Zou, Jiashi Feng, Xihui Liu
cs.AI
Abstract
Autoregressive models have emerged as a powerful approach for visual
generation but suffer from slow inference speed due to their sequential
token-by-token prediction process. In this paper, we propose a simple yet
effective approach for parallelized autoregressive visual generation that
improves generation efficiency while preserving the advantages of
autoregressive modeling. Our key insight is that parallel generation depends on
visual token dependencies: tokens with weak dependencies can be generated in
parallel, while strongly dependent adjacent tokens are difficult to generate
together, as their independent sampling may lead to inconsistencies. Based on
this observation, we develop a parallel generation strategy that generates
distant tokens with weak dependencies in parallel while maintaining sequential
generation for strongly dependent local tokens. Our approach can be seamlessly
integrated into standard autoregressive models without modifying the
architecture or tokenizer. Experiments on ImageNet and UCF-101 demonstrate that
our method achieves a 3.6x speedup with comparable quality and up to 9.5x
speedup with minimal quality degradation across both image and video generation
tasks. We hope this work will inspire future research in efficient visual
generation and unified autoregressive modeling. Project page:
https://epiphqny.github.io/PAR-project.