
HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration

April 4, 2025
Authors: Boyuan Wang, Runqi Ouyang, Xiaofeng Wang, Zheng Zhu, Guosheng Zhao, Chaojun Ni, Guan Huang, Lihong Liu, Xingang Wang
cs.AI

Abstract

Single-image human reconstruction is vital for digital human modeling applications but remains an extremely challenging task. Current approaches rely on generative models to synthesize multi-view images for subsequent 3D reconstruction and animation. However, directly generating multiple views from a single human image suffers from geometric inconsistencies, which lead to fragmented or blurred limbs in the reconstructed models. To address these limitations, we introduce HumanDreamer-X, a novel framework that integrates multi-view human generation and reconstruction into a unified pipeline and significantly enhances the geometric consistency and visual fidelity of the reconstructed 3D models. In this framework, 3D Gaussian Splatting (3DGS) serves as an explicit 3D representation that provides initial geometry and appearance priors. Building on this foundation, HumanFixer is trained to restore 3DGS renderings, ensuring photorealistic results. Furthermore, we examine the inherent challenges of attention mechanisms in multi-view human generation and propose an attention modulation strategy that effectively enhances geometric detail and identity consistency across views. Experimental results demonstrate that our approach improves the PSNR of generation and reconstruction by 16.45% and 12.65%, respectively, achieving a PSNR of up to 25.62 dB, while also generalizing to in-the-wild data and remaining applicable to various human reconstruction backbone models.
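Since the reported gains are stated in terms of PSNR, a minimal sketch of how that metric and a relative improvement are commonly computed may help readers interpret the numbers. This uses the standard PSNR definition; the paper's actual evaluation protocol (image resolution, masking, color range) is not given in the abstract, so the 8-bit `max_value` default and the helper names `psnr` and `relative_gain_percent` are assumptions made for illustration only.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Assumes flat sequences of pixel intensities in [0, max_value];
    the 8-bit default is an illustrative assumption, not taken from the paper.
    """
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def relative_gain_percent(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    return 100.0 * (new_score - baseline_score) / baseline_score
```

Because PSNR is logarithmic, a percentage gain in dB corresponds to a multiplicative reduction in mean squared error, so relative PSNR improvements should be read alongside the absolute dB values.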
