CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation

January 16, 2025
Authors: Hwan Heo, Jangyeong Kim, Seongyeong Lee, Jeong A Wi, Junyoung Choi, Sangjun Ahn
cs.AI

Abstract

The synthesis of high-quality 3D assets from textual or visual inputs has become a central objective in modern generative modeling. Despite the proliferation of 3D generation algorithms, they frequently grapple with challenges such as multi-view inconsistency, slow generation times, low fidelity, and surface reconstruction problems. While some studies have addressed some of these issues, a comprehensive solution remains elusive. In this paper, we introduce CaPa, a carve-and-paint framework that generates high-fidelity 3D assets efficiently. CaPa employs a two-stage process, decoupling geometry generation from texture synthesis. Initially, a 3D latent diffusion model generates geometry guided by multi-view inputs, ensuring structural consistency across perspectives. Subsequently, leveraging a novel, model-agnostic Spatially Decoupled Attention, the framework synthesizes high-resolution textures (up to 4K) for a given geometry. Furthermore, we propose a 3D-aware occlusion inpainting algorithm that fills untextured regions, resulting in cohesive results across the entire model. This pipeline generates high-quality 3D assets in less than 30 seconds, providing ready-to-use outputs for commercial applications. Experimental results demonstrate that CaPa excels in both texture fidelity and geometric stability, establishing a new standard for practical, scalable 3D asset generation.
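The two-stage, carve-and-paint decoupling described in the abstract can be sketched as follows. This is a minimal illustrative outline only: every function name, signature, and data structure below is a hypothetical placeholder, not the authors' actual API, and each stage is stubbed rather than implemented.

```python
# Illustrative sketch of CaPa's two-stage pipeline (abstract-level view).
# All names and data shapes are hypothetical placeholders for the real system.

def carve_geometry(multi_view_images):
    """Stage 1 ("carve"): a 3D latent diffusion model generates geometry
    guided by multi-view inputs; stubbed here as a single-triangle mesh."""
    return {"vertices": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
            "faces": [(0, 1, 2)]}

def paint_texture(mesh, resolution=4096):
    """Stage 2 ("paint"): high-resolution texture synthesis (up to 4K) via
    Spatially Decoupled Attention; stubbed as a uniform texture map."""
    return {"size": (resolution, resolution), "fill": (128, 128, 128)}

def inpaint_occlusions(mesh, texture):
    """3D-aware occlusion inpainting fills untextured regions so the final
    asset is cohesive; stubbed as a marker flag."""
    texture["occlusions_filled"] = True
    return texture

def capa_pipeline(multi_view_images):
    # Geometry and texture are generated in decoupled stages, then
    # untextured regions are inpainted for a cohesive final asset.
    mesh = capa_mesh = carve_geometry(multi_view_images)
    texture = paint_texture(capa_mesh)
    texture = inpaint_occlusions(capa_mesh, texture)
    return mesh, texture

mesh, texture = capa_pipeline(["front", "side", "back", "top"])
```

The point of the sketch is the decoupling itself: because geometry is fixed before texturing begins, texture synthesis can target the given surface at full 4K resolution without re-solving multi-view consistency.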
