PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models
December 24, 2024
Authors: Minghao Chen, Roman Shapovalov, Iro Laina, Tom Monnier, Jianyuan Wang, David Novotny, Andrea Vedaldi
cs.AI
Abstract
Text- or image-to-3D generators and 3D scanners can now produce 3D assets
with high-quality shapes and textures. These assets typically consist of a
single, fused representation, like an implicit neural field, a Gaussian
mixture, or a mesh, without any useful structure. However, most applications
and creative workflows require assets to be made of several meaningful parts
that can be manipulated independently. To address this gap, we introduce
PartGen, a novel approach that generates 3D objects composed of meaningful
parts starting from text, an image, or an unstructured 3D object. First, given
multiple views of a 3D object, generated or rendered, a multi-view diffusion
model extracts a set of plausible and view-consistent part segmentations,
dividing the object into parts. Then, a second multi-view diffusion model takes
each part separately, fills in the occlusions, and uses those completed views
for 3D reconstruction by feeding them to a 3D reconstruction network. This
completion process considers the context of the entire object to ensure that
the parts integrate cohesively. The generative completion model can make up for
the information missing due to occlusions; in extreme cases, it can hallucinate
entirely invisible parts based on the input 3D asset. We evaluate our method on
generated and real 3D assets and show that it outperforms segmentation and
part-extraction baselines by a large margin. We also showcase downstream
applications such as 3D part editing.
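
The data flow described in the abstract (segment the views into parts, complete each part using the whole object as context, then reconstruct each part in 3D) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_views_by_part` and `generate_parts`, the array layout, and the convention that label 0 means background are all assumptions made for illustration, and the two diffusion models and the reconstruction network are represented only by caller-supplied placeholder callables.

```python
import numpy as np


def split_views_by_part(views, part_ids, background=0):
    """Split multi-view images into per-part masked views.

    views:    float array of shape (V, H, W, 3), one RGB image per view.
    part_ids: int array of shape (V, H, W); the same integer denotes the
              same part in every view (a view-consistent segmentation).
    Label `background` is skipped (hypothetical convention).
    """
    parts = {}
    for pid in np.unique(part_ids):
        if pid == background:
            continue
        mask = part_ids == pid              # (V, H, W) boolean mask
        masked = views * mask[..., None]    # zero out pixels outside the part
        parts[int(pid)] = (masked, mask)
    return parts


def generate_parts(views, part_ids, complete_fn, reconstruct_fn):
    """Run the per-part stage of the pipeline.

    complete_fn(masked_views, mask, context_views) stands in for the
    completion model; the full, unmasked views are passed as context.
    reconstruct_fn(completed_views) stands in for the 3D reconstruction
    network. Both are placeholders supplied by the caller.
    """
    results = {}
    for pid, (masked, mask) in split_views_by_part(views, part_ids).items():
        completed = complete_fn(masked, mask, views)
        results[pid] = reconstruct_fn(completed)
    return results


# Toy input: 2 views of a 4x4 image with two parts and some background.
views = np.ones((2, 4, 4, 3))
part_ids = np.zeros((2, 4, 4), dtype=int)
part_ids[:, :2, :] = 1     # top half is part 1
part_ids[:, 2:, :2] = 2    # bottom-left quarter is part 2

# Identity "completion" and a "reconstruction" that only reports the shape.
out = generate_parts(
    views,
    part_ids,
    complete_fn=lambda masked, mask, context: masked,
    reconstruct_fn=lambda completed: completed.shape,
)
```

Passing the unmasked `views` to `complete_fn` mirrors the abstract's statement that completion considers the context of the entire object; what a real model does with that context is outside the scope of this sketch.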