LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds

March 13, 2025
Authors: Lingteng Qiu, Xiaodong Gu, Peihao Li, Qi Zuo, Weichao Shen, Junfei Zhang, Kejie Qiu, Weihao Yuan, Guanying Chen, Zilong Dong, Liefeng Bo
cs.AI

Abstract

Animatable 3D human reconstruction from a single image is a challenging problem because of the ambiguity in decoupling geometry, appearance, and deformation. Recent advances in 3D human reconstruction focus mainly on static human modeling, and their reliance on synthetic 3D scans for training limits their generalization ability. Conversely, optimization-based video methods achieve higher fidelity but demand controlled capture conditions and computationally intensive refinement. Motivated by the emergence of large reconstruction models for efficient static reconstruction, we propose LHM (Large Animatable Human Reconstruction Model) to infer high-fidelity avatars represented with 3D Gaussian splatting in a single feed-forward pass. Our model uses a multimodal transformer architecture that encodes human body positional features and image features with an attention mechanism, preserving the details of clothing geometry and texture. To further improve face identity preservation and fine detail recovery, we propose a head feature pyramid encoding scheme that aggregates multi-scale features of the head region. Extensive experiments demonstrate that LHM generates plausible animatable humans in seconds without post-processing for the face and hands, outperforming existing methods in both reconstruction accuracy and generalization ability.
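
The abstract does not specify how body positional features and image features are combined, so the following is only a generic illustration of attention between two token sets, not the LHM architecture. It is a minimal pure-Python sketch of scaled dot-product attention in which hypothetical "body-point" tokens act as queries over hypothetical "image-patch" tokens. The token counts, the feature width, the random features, and the absence of learned projection matrices are all assumptions made for brevity.

```python
import math
import random


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query token mixes the value tokens."""
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        # Weighted average of the value vectors, one channel at a time.
        mixed = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(mixed)
        weights.append(w)
    return outputs, weights


# Hypothetical toy sizes (not from the paper): 4 body-point tokens,
# 6 image-patch tokens, feature width 8. Features are random placeholders.
rng = random.Random(0)
body_tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(4)]
image_tokens = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)]

# Each body-point token gathers appearance information from the image tokens.
fused, attn = cross_attention(body_tokens, image_tokens, image_tokens)
print(len(fused), len(fused[0]))
```

Each row of `attn` sums to 1, so every fused body token is a convex combination of the image tokens. A real model would add learned query, key, and value projections, multiple heads, and stacked layers.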
