Parallelized Autoregressive Visual Generation
December 19, 2024
Authors: Yuqing Wang, Shuhuai Ren, Zhijie Lin, Yujin Han, Haoyuan Guo, Zhenheng Yang, Difan Zou, Jiashi Feng, Xihui Liu
cs.AI
Abstract
Autoregressive models have emerged as a powerful approach for visual
generation but suffer from slow inference speed due to their sequential
token-by-token prediction process. In this paper, we propose a simple yet
effective approach for parallelized autoregressive visual generation that
improves generation efficiency while preserving the advantages of
autoregressive modeling. Our key insight is that parallel generation depends on
visual token dependencies: tokens with weak dependencies can be generated in
parallel, while strongly dependent adjacent tokens are difficult to generate
together, as their independent sampling may lead to inconsistencies. Based on
this observation, we develop a parallel generation strategy that generates
distant tokens with weak dependencies in parallel while maintaining sequential
generation for strongly dependent local tokens. Our approach can be seamlessly
integrated into standard autoregressive models without modifying the
architecture or tokenizer. Experiments on ImageNet and UCF-101 demonstrate that
our method achieves a 3.6x speedup with comparable quality and up to 9.5x
speedup with minimal quality degradation across both image and video generation
tasks. We hope this work will inspire future research in efficient visual
generation and unified autoregressive modeling. Project page:
https://epiphqny.github.io/PAR-project.