

MVPaint: Synchronized Multi-View Diffusion for Painting Anything 3D

November 4, 2024
Authors: Wei Cheng, Juncheng Mu, Xianfang Zeng, Xin Chen, Anqi Pang, Chi Zhang, Zhibin Wang, Bin Fu, Gang Yu, Ziwei Liu, Liang Pan
cs.AI

Abstract
Texturing is a crucial step in the 3D asset production workflow, which enhances the visual appeal and diversity of 3D assets. Despite recent advancements in Text-to-Texture (T2T) generation, existing methods often yield subpar results, primarily due to local discontinuities, inconsistencies across multiple views, and their heavy dependence on UV unwrapping outcomes. To tackle these challenges, we propose a novel generation-refinement 3D texturing framework called MVPaint, which can generate high-resolution, seamless textures while emphasizing multi-view consistency. MVPaint mainly consists of three key modules. 1) Synchronized Multi-view Generation (SMG). Given a 3D mesh model, MVPaint first simultaneously generates multi-view images by employing an SMG model, which leads to coarse texturing results with unpainted parts due to missing observations. 2) Spatial-aware 3D Inpainting (S3I). To ensure complete 3D texturing, we introduce the S3I method, specifically designed to effectively texture previously unobserved areas. 3) UV Refinement (UVR). Furthermore, MVPaint employs a UVR module to improve the texture quality in the UV space, which first performs a UV-space Super-Resolution, followed by a Spatial-aware Seam-Smoothing algorithm for revising spatial texturing discontinuities caused by UV unwrapping. Moreover, we establish two T2T evaluation benchmarks: the Objaverse T2T benchmark and the GSO T2T benchmark, based on selected high-quality 3D meshes from the Objaverse dataset and the entire GSO dataset, respectively. Extensive experimental results demonstrate that MVPaint surpasses existing state-of-the-art methods. Notably, MVPaint can generate high-fidelity textures with minimal Janus issues and highly enhanced cross-view consistency.
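The three-stage structure of the pipeline (SMG → S3I → UVR) can be sketched with toy stand-ins on a small UV grid. Every function body below is an illustrative placeholder, not the authors' actual algorithm: real SMG runs a synchronized multi-view diffusion model, real S3I conditions on 3D spatial context, and real UVR uses a learned super-resolution and seam-smoothing step. Here, NaN marks unpainted texels.

```python
import numpy as np

def smg_paint(uv_size, visible_mask, rng):
    # Stage 1 stand-in: only texels "seen" by some camera get a value;
    # the rest stay NaN, mirroring SMG's coarse result with unpainted
    # parts due to missing observations.
    tex = np.full((uv_size, uv_size), np.nan)
    tex[visible_mask] = rng.random(int(visible_mask.sum()))
    return tex

def s3i_inpaint(tex):
    # Stage 2 stand-in: fill every unobserved texel with the mean of
    # the painted ones, so the 3D texture is complete.
    filled = tex.copy()
    filled[np.isnan(filled)] = np.nanmean(tex)
    return filled

def uvr_refine(tex, scale=2):
    # Stage 3 stand-in: nearest-neighbour "super-resolution" via np.kron,
    # then a 3x3 box blur as a crude seam smoother.
    hi = np.kron(tex, np.ones((scale, scale)))
    pad = np.pad(hi, 1, mode="edge")
    h, w = hi.shape
    return sum(pad[i:i + h, j:j + w]
               for i in range(3) for j in range(3)) / 9.0

rng = np.random.default_rng(0)
mask = np.ones((8, 8), dtype=bool)
mask[:2, :2] = False                 # a 2x2 patch no camera observes
coarse = smg_paint(8, mask, rng)     # stage 1: partial texture with holes
full = s3i_inpaint(coarse)           # stage 2: no unpainted texels remain
final = uvr_refine(full)             # stage 3: 16x16, seams smoothed
```

The point of the sketch is only the data flow: a partial multi-view painting, a hole-filling pass, and a UV-space upsample-and-smooth pass.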

