HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration

April 4, 2025
Authors: Boyuan Wang, Runqi Ouyang, Xiaofeng Wang, Zheng Zhu, Guosheng Zhao, Chaojun Ni, Guan Huang, Lihong Liu, Xingang Wang
cs.AI

Abstract

Single-image human reconstruction is vital for digital human modeling applications but remains an extremely challenging task. Current approaches rely on generative models to synthesize multi-view images for subsequent 3D reconstruction and animation. However, generating multiple views directly from a single human image suffers from geometric inconsistencies, which produce artifacts such as fragmented or blurred limbs in the reconstructed models. To address these limitations, we introduce HumanDreamer-X, a novel framework that integrates multi-view human generation and reconstruction into a unified pipeline and significantly improves the geometric consistency and visual fidelity of the reconstructed 3D models. In this framework, 3D Gaussian Splatting (3DGS) serves as an explicit 3D representation that provides initial geometry and appearance priors. Building on this foundation, HumanFixer is trained to restore 3DGS renderings so that the results are photorealistic. Furthermore, we examine the inherent challenges of attention mechanisms in multi-view human generation and propose an attention modulation strategy that effectively enhances geometric detail and identity consistency across views. Experimental results demonstrate that our approach improves the PSNR of generation and reconstruction by 16.45% and 12.65%, respectively, achieving a PSNR of up to 25.62 dB, while also generalizing to in-the-wild data and remaining applicable to various human reconstruction backbone models.
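
For readers unfamiliar with the metric quoted above, PSNR is conventionally defined as 10 * log10(MAX^2 / MSE), where MSE is the mean squared error between a reference image and a generated or rendered one. The sketch below illustrates that standard definition only; it is not the authors' evaluation code. The function name `psnr`, the flat pixel lists, and the 8-bit peak value of 255 are assumptions made for illustration.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Hypothetical helper for illustration: inputs are flat sequences of pixel
    intensities, and max_value is the assumed peak intensity (255 for 8-bit).
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - x) ** 2 for r, x in zip(reference, rendered)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [10, 20, 30, 40]
    prediction = [11, 21, 29, 41]
    print(f"PSNR: {psnr(ground_truth, prediction):.2f} dB")
```

Because PSNR is logarithmic, a percentage gain in dB corresponds to a larger proportional reduction in mean squared error, so higher values indicate renderings closer to the reference views.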
