Open-Sora Plan: Open-Source Large Video Generation Model

Abstract

We introduce Open-Sora Plan, an open-source project that aims to contribute a large generation model for producing high-resolution, long-duration videos from a variety of user inputs. The project comprises components covering the entire video generation process, including a Wavelet-Flow Variational Autoencoder, a Joint Image-Video Skiparse Denoiser, and various condition controllers. In addition, we design a number of auxiliary strategies for efficient training and inference, and propose a multi-dimensional data curation pipeline for obtaining high-quality data. Thanks to these efficiency-oriented designs, Open-Sora Plan achieves impressive video generation results in both qualitative and quantitative evaluations. We hope our careful design and practical experience can inspire the video generation research community. All code and model weights are publicly available at https://github.com/PKU-YuanGroup/Open-Sora-Plan.
