Pandora3D: A Comprehensive Framework for High-Quality 3D Shape and Texture Generation

February 20, 2025
Authors: Jiayu Yang, Taizhang Shang, Weixuan Sun, Xibin Song, Ziang Cheng, Senbo Wang, Shenzhou Chen, Weizhe Liu, Hongdong Li, Pan Ji
cs.AI

Abstract

This report presents a comprehensive framework for generating high-quality 3D shapes and textures from diverse input prompts, including single images, multi-view images, and text descriptions. The framework consists of two parts: 3D shape generation and texture generation. (1) The 3D shape generation pipeline employs a Variational Autoencoder (VAE) to encode implicit 3D geometries into a latent space and a diffusion network to generate latents conditioned on input prompts, with modifications to enhance model capacity. An alternative Artist-Created Mesh (AM) generation approach is also explored, yielding promising results for simpler geometries. (2) Texture generation is a multi-stage process that starts with frontal image generation, followed by multi-view image generation, RGB-to-PBR texture conversion, and high-resolution multi-view texture refinement. A consistency scheduler is plugged into every stage to enforce pixel-wise consistency among multi-view textures during inference, ensuring seamless integration. The pipeline handles diverse input formats effectively, leveraging advanced neural architectures and novel methodologies to produce high-quality 3D content. This report details the system architecture, experimental results, and potential future directions for improving and expanding the framework. The source code and pretrained weights are released at: https://github.com/Tencent/Tencent-XR-3DGen.
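The stage ordering of the texture pipeline, with a consistency step attached to every stage, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the released code: the names `View`, `Stage`, `enforce_consistency`, and `run_texture_pipeline` are hypothetical, each stage is a placeholder that passes its input through unchanged, and the consistency step is a toy stand-in that averages the values of surface points visible in more than one view. The abstract states only that the scheduler enforces pixel-wise consistency during inference; it does not describe the mechanism, and this sketch simply applies the toy step after each stage.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a view maps a surface-point id to a scalar
# color value. Real views would be images; this keeps the sketch tiny.
View = Dict[int, float]


def enforce_consistency(views: List[View]) -> List[View]:
    """Toy stand-in for a consistency scheduler (an assumption, not the
    paper's method): every surface point gets the mean of its values
    across all views that contain it."""
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for view in views:
        for point, value in view.items():
            sums[point] = sums.get(point, 0.0) + value
            counts[point] = counts.get(point, 0) + 1
    return [{p: sums[p] / counts[p] for p in view} for view in views]


@dataclass
class Stage:
    name: str
    run: Callable[[List[View]], List[View]]


def run_texture_pipeline(
    stages: List[Stage], views: List[View]
) -> Tuple[List[View], List[str]]:
    """Run the stages in order, applying the consistency step after each."""
    executed: List[str] = []
    for stage in stages:
        views = enforce_consistency(stage.run(views))
        executed.append(stage.name)
    return views, executed


def passthrough(views: List[View]) -> List[View]:
    # Placeholder stage body: a real stage would call a generative model.
    return [dict(v) for v in views]


# Stage order as listed in the abstract; the bodies are placeholders.
TEXTURE_STAGES = [
    Stage("frontal image generation", passthrough),
    Stage("multi-view image generation", passthrough),
    Stage("RGB-to-PBR texture conversion", passthrough),
    Stage("high-resolution multi-view texture refinement", passthrough),
]

if __name__ == "__main__":
    # Point 1 is seen in both views with conflicting values.
    result, executed = run_texture_pipeline(
        TEXTURE_STAGES, [{1: 0.0, 2: 1.0}, {1: 1.0}]
    )
    print(executed)
    print(result)
```

Attaching the consistency step inside the pipeline loop, rather than inside each stage, keeps the stages interchangeable; whether the actual framework is organized this way is not stated in the abstract.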
