TEXGen: a Generative Diffusion Model for Mesh Textures
November 22, 2024
Authors: Xin Yu, Ze Yuan, Yuan-Chen Guo, Ying-Tian Liu, JianHui Liu, Yangguang Li, Yan-Pei Cao, Ding Liang, Xiaojuan Qi
cs.AI
Abstract
While high-quality texture maps are essential for realistic 3D asset
rendering, few studies have explored learning directly in the texture space,
especially on large-scale datasets. In this work, we depart from the
conventional approach of relying on pre-trained 2D diffusion models for
test-time optimization of 3D textures. Instead, we focus on the fundamental
problem of learning in the UV texture space itself. For the first time, we
train a large diffusion model capable of directly generating high-resolution
texture maps in a feed-forward manner. To facilitate efficient learning in
high-resolution UV spaces, we propose a scalable network architecture that
interleaves convolutions on UV maps with attention layers on point clouds.
Leveraging this architectural design, we train a 700 million parameter
diffusion model that can generate UV texture maps guided by text prompts and
single-view images. Once trained, our model naturally supports various extended
applications, including text-guided texture inpainting, sparse-view texture
completion, and text-driven texture synthesis. Project page is at
http://cvmi-lab.github.io/TEXGen/.
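To make the interleaved design in the abstract concrete, below is a minimal PyTorch sketch of one hybrid block: a convolution mixes features locally on the 2D UV map, the features are sampled at points on the mesh surface, attention mixes them globally over that point cloud, and the result is scattered back into UV space. All names, shapes, and the gather/scatter scheme (bilinear `grid_sample` over per-point UV coordinates, nearest-texel `scatter_add_`) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class UVPointHybridBlock(nn.Module):
    """Hypothetical conv-on-UV + attention-on-points block (illustrative sketch only)."""

    def __init__(self, channels: int = 256, num_heads: int = 8):
        super().__init__()
        # Local mixing directly on the 2D UV texture map.
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.GroupNorm(8, channels),
            nn.SiLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Global mixing over points sampled on the mesh surface.
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, uv_feat: torch.Tensor, point_uvs: torch.Tensor) -> torch.Tensor:
        # uv_feat:   (B, C, H, W) features laid out on the UV texture map
        # point_uvs: (B, N, 2) UV coordinates of surface points, normalized to [-1, 1]
        B, C, H, W = uv_feat.shape

        # 1) Convolution in UV space (residual).
        uv_feat = uv_feat + self.conv(uv_feat)

        # 2) Gather per-point features from the UV map by bilinear sampling.
        grid = point_uvs.unsqueeze(2)                              # (B, N, 1, 2)
        pts = F.grid_sample(uv_feat, grid, align_corners=False)    # (B, C, N, 1)
        pts = pts.squeeze(-1).transpose(1, 2)                      # (B, N, C)

        # 3) Self-attention over the point cloud for global, 3D-aware mixing.
        pts = self.norm(pts)
        attn_out, _ = self.attn(pts, pts, pts)                     # (B, N, C)

        # 4) Scatter attended features back onto the nearest UV texel (residual add).
        x = ((point_uvs[..., 0] + 1) * 0.5 * (W - 1)).round().long().clamp(0, W - 1)
        y = ((point_uvs[..., 1] + 1) * 0.5 * (H - 1)).round().long().clamp(0, H - 1)
        flat_idx = (y * W + x).unsqueeze(-1).expand(-1, -1, C)     # (B, N, C)
        out = uv_feat.flatten(2).transpose(1, 2).clone()           # (B, H*W, C)
        out.scatter_add_(1, flat_idx, attn_out)
        return out.transpose(1, 2).reshape(B, C, H, W)


if __name__ == "__main__":
    block = UVPointHybridBlock(channels=256)
    uv = torch.randn(1, 256, 128, 128)      # toy UV feature map
    uvs = torch.rand(1, 4096, 2) * 2 - 1    # toy per-point UV coordinates in [-1, 1]
    print(block(uv, uvs).shape)             # torch.Size([1, 256, 128, 128])
```

The 128x128 feature map and 4,096 surface points in the usage stub are placeholders. The design point the abstract describes is that convolutions keep texel-level detail cheap in UV space, while attention over surface points provides consistency across UV islands that are adjacent in 3D but far apart on the texture map.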