StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

November 8, 2024
Authors: Yuze He, Yanning Zhou, Wang Zhao, Zhongkai Wu, Kaiwen Xiao, Wei Yang, Yong-Jin Liu, Xiao Han
cs.AI

Abstract

We present StdGEN, an innovative pipeline for generating semantically decomposed, high-quality 3D characters from single images, enabling broad applications in areas such as virtual reality, gaming, and filmmaking. Unlike previous methods, which struggle with limited decomposability, unsatisfactory quality, and long optimization times, StdGEN features decomposability, effectiveness, and efficiency; i.e., it generates intricately detailed 3D characters with separated semantic components such as the body, clothes, and hair in three minutes. At the core of StdGEN is our proposed Semantic-aware Large Reconstruction Model (S-LRM), a transformer-based generalizable model that jointly reconstructs geometry, color, and semantics from multi-view images in a feed-forward manner. A differentiable multi-layer semantic surface extraction scheme is introduced to acquire meshes from the hybrid implicit fields reconstructed by our S-LRM. Additionally, a specialized efficient multi-view diffusion model and an iterative multi-layer surface refinement module are integrated into the pipeline to facilitate high-quality, decomposable 3D character generation. Extensive experiments demonstrate our state-of-the-art performance in 3D anime character generation, surpassing existing baselines by a significant margin in geometry, texture, and decomposability. StdGEN offers ready-to-use, semantically decomposed 3D characters and enables flexible customization for a wide range of applications. Project page: https://stdgen.github.io
