Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
September 17, 2024
Authors: Zhenwei Wang, Tengfei Wang, Zexin He, Gerhard Hancke, Ziwei Liu, Rynson W. H. Lau
cs.AI
Abstract
In 3D modeling, designers often use an existing 3D model as a reference to
create new ones. This practice has inspired the development of Phidias, a novel
generative model that uses diffusion for reference-augmented 3D generation.
Given an image, our method leverages a retrieved or user-provided 3D reference
model to guide the generation process, thereby enhancing the generation
quality, generalization ability, and controllability. Our model integrates
three key components: 1) meta-ControlNet that dynamically modulates the
conditioning strength, 2) dynamic reference routing that mitigates misalignment
between the input image and 3D reference, and 3) self-reference augmentations
that enable self-supervised training with a progressive curriculum.
Collectively, these designs result in a clear improvement over existing
methods. Phidias establishes a unified framework for 3D generation using text,
image, and 3D conditions with versatile applications.
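The abstract names three architectural components (a meta-ControlNet modulating conditioning strength, dynamic reference routing, and self-reference augmentations with a progressive curriculum). Below is a minimal PyTorch sketch of how such components could be wired together; the module names, feature dimensions, routing threshold, and augmentation form are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of the three components described in the abstract.
# All names, shapes, and heuristics are assumptions for exposition only.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MetaController(nn.Module):
    """Predict a per-sample conditioning strength from image and reference features."""

    def __init__(self, feat_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(2 * feat_dim, 128), nn.SiLU(), nn.Linear(128, 1))

    def forward(self, image_feat: torch.Tensor, ref_feat: torch.Tensor) -> torch.Tensor:
        # Strength in (0, 1): lower when the reference disagrees with the input image.
        return torch.sigmoid(self.mlp(torch.cat([image_feat, ref_feat], dim=-1)))


def route_reference(ref_render: torch.Tensor, misalignment: float, threshold: float = 0.5) -> torch.Tensor:
    """Dynamic reference routing (assumed form): feed a coarser reference rendering
    when its estimated misalignment with the input image is high, so mismatched
    details do not leak into the generation."""
    if misalignment > threshold:
        coarse = F.avg_pool2d(ref_render, kernel_size=4)
        return F.interpolate(coarse, size=ref_render.shape[-2:], mode="bilinear", align_corners=False)
    return ref_render


def self_reference_augment(view: torch.Tensor, severity: float) -> torch.Tensor:
    """Self-reference augmentation (assumed form): corrupt a rendering of the target
    shape itself so it can act as the 'reference' during self-supervised training.
    Increasing `severity` over training yields a progressive curriculum."""
    noisy = view + severity * torch.randn_like(view)
    return F.avg_pool2d(noisy, kernel_size=3, stride=1, padding=1)


if __name__ == "__main__":
    B, C, H, W = 2, 3, 64, 64
    controller = MetaController(feat_dim=256)
    strength = controller(torch.randn(B, 256), torch.randn(B, 256))   # (B, 1) conditioning strength
    ref = route_reference(torch.rand(B, C, H, W), misalignment=0.7)    # coarsened reference views
    aug = self_reference_augment(torch.rand(B, C, H, W), severity=0.2)
    # In a full system, ControlNet residuals would be scaled by `strength`
    # before being added to the diffusion backbone's features (not shown).
    print(strength.shape, ref.shape, aug.shape)
```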