PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models

December 24, 2024
Authors: Minghao Chen, Roman Shapovalov, Iro Laina, Tom Monnier, Jianyuan Wang, David Novotny, Andrea Vedaldi
cs.AI

Abstract

Text- or image-to-3D generators and 3D scanners can now produce 3D assets with high-quality shapes and textures. These assets typically consist of a single, fused representation, like an implicit neural field, a Gaussian mixture, or a mesh, without any useful structure. However, most applications and creative workflows require assets to be made of several meaningful parts that can be manipulated independently. To address this gap, we introduce PartGen, a novel approach that generates 3D objects composed of meaningful parts starting from text, an image, or an unstructured 3D object. First, given multiple views of a 3D object, generated or rendered, a multi-view diffusion model extracts a set of plausible and view-consistent part segmentations, dividing the object into parts. Then, a second multi-view diffusion model takes each part separately, fills in the occlusions, and uses those completed views for 3D reconstruction by feeding them to a 3D reconstruction network. This completion process considers the context of the entire object to ensure that the parts integrate cohesively. The generative completion model can make up for the information missing due to occlusions; in extreme cases, it can hallucinate entirely invisible parts based on the input 3D asset. We evaluate our method on generated and real 3D assets and show that it outperforms segmentation and part-extraction baselines by a large margin. We also showcase downstream applications such as 3D part editing.
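
The three stages described in the abstract (multi-view part segmentation, per-part completion with whole-object context, and per-part 3D reconstruction) can be read as a simple data flow. The sketch below illustrates only that flow and is not the authors' implementation: `generate_parts`, `toy_segment`, `toy_complete`, and `toy_reconstruct` are hypothetical names, and the toy stand-ins operate on small grids of integer labels in place of the diffusion and reconstruction models.

```python
from typing import Callable, List

View = List[List[int]]    # toy "image": grid of integer part labels, 0 = background
Mask = List[List[bool]]   # per-pixel membership of one part in one view


def generate_parts(
    views: List[View],
    segment: Callable[[List[View]], List[List[Mask]]],
    complete: Callable[[List[View], List[Mask]], List[View]],
    reconstruct: Callable[[List[View]], object],
) -> List[object]:
    """Data flow only; the three callables are hypothetical stand-ins for the models."""
    # Stage 1: view-consistent segmentation, one list of per-view masks per part.
    part_masks = segment(views)
    parts = []
    for masks in part_masks:
        # Stage 2: completion sees the full views (object context) plus this part's masks.
        completed_views = complete(views, masks)
        # Stage 3: each part is reconstructed from its own completed views.
        parts.append(reconstruct(completed_views))
    return parts


def toy_segment(views: List[View]) -> List[List[Mask]]:
    # Assumption: labels are already consistent across views; a real model must enforce this.
    labels = sorted({p for v in views for row in v for p in row if p != 0})
    return [[[[p == label for p in row] for row in v] for v in views] for label in labels]


def toy_complete(views: List[View], masks: List[Mask]) -> List[View]:
    # Toy behaviour: keep only the part's visible pixels. A generative model would
    # additionally fill in the occluded regions, which this stand-in does not do.
    return [
        [[p if keep else 0 for p, keep in zip(row, mask_row)] for row, mask_row in zip(v, m)]
        for v, m in zip(views, masks)
    ]


def toy_reconstruct(completed_views: List[View]) -> int:
    # Toy "3D part": the number of foreground pixels summed over all views.
    return sum(1 for v in completed_views for row in v for p in row if p != 0)


views = [
    [[1, 1, 0], [2, 2, 2]],
    [[0, 1, 0], [0, 2, 2]],
]
parts = generate_parts(views, toy_segment, toy_complete, toy_reconstruct)
print(parts)
```

Passing the stages in as callables keeps the flow independent of any particular model; in this toy setup each part is summarized by a pixel count rather than an actual 3D asset.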
