IDOL: Instant Photorealistic 3D Human Creation from a Single Image
December 19, 2024
Authors: Yiyu Zhuang, Jiaxi Lv, Hao Wen, Qing Shuai, Ailing Zeng, Hao Zhu, Shifeng Chen, Yujiu Yang, Xun Cao, Wei Liu
cs.AI
Abstract
Creating a high-fidelity, animatable 3D full-body avatar from a single image
is a challenging task due to the diverse appearance and poses of humans and the
limited availability of high-quality training data. To achieve fast and
high-quality human reconstruction, this work rethinks the task from the
perspectives of dataset, model, and representation. First, we introduce a
large-scale HUman-centric GEnerated dataset, HuGe100K, consisting of 100K
diverse, photorealistic sets of human images. Each set contains 24-view frames
in specific human poses, generated using a pose-controllable
image-to-multi-view model. Next, leveraging the diversity in views, poses, and
appearances within HuGe100K, we develop a scalable feed-forward transformer
model to predict a 3D human Gaussian representation in a uniform space from a
given human image. This model is trained to disentangle human pose, body shape,
clothing geometry, and texture. The estimated Gaussians can be animated without
post-processing. We conduct comprehensive experiments to validate the
effectiveness of the proposed dataset and method. Our model can instantly
reconstruct photorealistic humans at 1K resolution from a single input image
on a single GPU. Additionally, it seamlessly supports various applications,
as well as shape and texture editing tasks.
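
The abstract states that the estimated Gaussians can be animated without post-processing, but it does not say how. One common way to pose points defined in a shared rest space is linear blend skinning, and the sketch below illustrates that idea on Gaussian centers only. It is an assumption for illustration, not the authors' implementation: the function names, the two-joint setup, and the skinning weights are all hypothetical, and a full treatment would also transform each Gaussian's covariance.

```python
import math


def rot_z(theta):
    """Return a 3x3 rotation matrix about the z-axis (angle in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_rigid(rot, trans, point):
    """Apply the rigid transform x -> rot @ x + trans to one 3D point."""
    return [sum(rot[i][k] * point[k] for k in range(3)) + trans[i]
            for i in range(3)]


def skin_centers(centers, weights, rotations, translations):
    """Linear blend skinning on Gaussian centers (hypothetical helper).

    Each posed center is the weighted sum of the rest-space center
    transformed by every joint, using that center's skinning weights.
    """
    posed = []
    for point, point_weights in zip(centers, weights):
        blended = [0.0, 0.0, 0.0]
        for w, rot, trans in zip(point_weights, rotations, translations):
            moved = apply_rigid(rot, trans, point)
            blended = [b + w * m for b, m in zip(blended, moved)]
        posed.append(blended)
    return posed


# Two hypothetical joints: joint 0 stays fixed, joint 1 rotates 90 degrees.
rotations = [rot_z(0.0), rot_z(math.pi / 2)]
translations = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# Two Gaussian centers at the same rest position, bound to different joints.
centers = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]

posed = skin_centers(centers, weights, rotations, translations)
```

In this toy setup the first center stays where it is, while the second follows joint 1 and moves from the x-axis to the y-axis. Because posing is only a per-point transform, no optimization step is needed when the pose changes, which is consistent with the abstract's claim of animation without post-processing.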