Turbo3D: Ultra-fast Text-to-3D Generation

December 5, 2024
Authors: Hanzhe Hu, Tianwei Yin, Fujun Luan, Yiwei Hu, Hao Tan, Zexiang Xu, Sai Bi, Shubham Tulsiani, Kai Zhang
cs.AI

Abstract

We present Turbo3D, an ultra-fast text-to-3D system capable of generating high-quality Gaussian splatting assets in under one second. Turbo3D employs a rapid 4-step, 4-view diffusion generator and an efficient feed-forward Gaussian reconstructor, both operating in latent space. The 4-step, 4-view generator is a student model distilled through a novel Dual-Teacher approach, which encourages the student to learn view consistency from a multi-view teacher and photo-realism from a single-view teacher. By shifting the Gaussian reconstructor's inputs from pixel space to latent space, we eliminate the extra image decoding time and halve the transformer sequence length for maximum efficiency. Our method demonstrates superior 3D generation results compared to previous baselines, while operating in a fraction of their runtime.
