Open-Sora Plan: Open-Source Large Video Generation Model

November 28, 2024
Authors: Bin Lin, Yunyang Ge, Xinhua Cheng, Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan
cs.AI

Abstract

We introduce Open-Sora Plan, an open-source project that aims to contribute a large generation model capable of producing high-resolution, long-duration videos from various user inputs. The project comprises multiple components covering the entire video generation process, including a Wavelet-Flow Variational Autoencoder, a Joint Image-Video Skiparse Denoiser, and various condition controllers. We also design a number of auxiliary strategies for efficient training and inference, and propose a multi-dimensional data curation pipeline for obtaining high-quality data. Thanks to these efficiency-oriented designs, Open-Sora Plan achieves impressive video generation results in both qualitative and quantitative evaluations. We hope our careful design and practical experience can inspire the video generation research community. All our code and model weights are publicly available at https://github.com/PKU-YuanGroup/Open-Sora-Plan.
