Open-Sora Plan: Open-Source Large Video Generation Model
November 28, 2024
Authors: Bin Lin, Yunyang Ge, Xinhua Cheng, Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan
cs.AI
Abstract
We introduce Open-Sora Plan, an open-source project that aims to contribute a
large generative model for producing high-resolution, long-duration videos
from various user inputs. Our project comprises multiple
components for the entire video generation process, including a Wavelet-Flow
Variational Autoencoder, a Joint Image-Video Skiparse Denoiser, and various
condition controllers. Moreover, we design many auxiliary strategies for
efficient training and inference, and propose a multi-dimensional data curation
pipeline for obtaining high-quality data. Benefiting from these
efficiency-oriented designs, Open-Sora Plan achieves impressive video generation
results in both qualitative and quantitative evaluations. We hope our careful
design and practical experience can inspire the video generation research
community. All our code and model weights are publicly available at
https://github.com/PKU-YuanGroup/Open-Sora-Plan.