DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Abstract

Procedural Content Generation (PCG) is powerful for creating high-quality 3D content, yet controlling it to produce a desired shape is difficult and often requires extensive parameter tuning. Inverse Procedural Content Generation aims to find the best parameters automatically for a given input condition. However, existing sampling-based methods require numerous sampling iterations, and existing neural-network-based methods offer limited controllability. In this work, we present DI-PCG, a novel and efficient method for inverse PCG from general image conditions. At its core is a lightweight diffusion transformer model in which the PCG parameters are treated directly as the denoising target and the observed images serve as conditions that control parameter generation. DI-PCG is both efficient and effective: with only 7.6M network parameters and 30 GPU hours of training, it recovers parameters accurately and generalizes well to in-the-wild images. Quantitative and qualitative experimental results validate the effectiveness of DI-PCG on inverse PCG and image-to-3D generation tasks. DI-PCG offers a promising approach to efficient inverse PCG and represents a valuable exploratory step towards a 3D generation path that models how a 3D asset is constructed using parametric models.
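The idea of treating PCG parameters as the denoising target can be illustrated with a minimal sketch. The abstract does not specify the training objective, noise schedule, or network interface, so everything below is an assumption: a standard DDPM-style linear beta schedule, an epsilon-prediction loss, and a hypothetical `predict_noise(noisy_params, t, image_features)` callable standing in for the diffusion transformer. The parameter vector and image features are made-up placeholders.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def add_noise(params, alpha_bar, rng):
    """Forward diffusion on a parameter vector:
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in params]
    noisy = [math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * e
             for p, e in zip(params, eps)]
    return noisy, eps


def denoising_loss(predict_noise, params, image_features, alpha_bars, rng):
    """One-sample epsilon-prediction loss for an image-conditioned denoiser."""
    t = rng.randrange(len(alpha_bars))            # random diffusion timestep
    noisy, eps = add_noise(params, alpha_bars[t], rng)
    pred = predict_noise(noisy, t, image_features)
    return sum((p - e) ** 2 for p, e in zip(pred, eps)) / len(eps)


# Hypothetical inputs: normalized PCG parameters and an image embedding.
rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
pcg_params = [0.3, -0.5, 0.8]
image_features = [0.1, 0.2]

# Placeholder "network" that always predicts zero noise.
zero_model = lambda noisy, t, cond: [0.0] * len(noisy)
loss = denoising_loss(zero_model, pcg_params, image_features, alpha_bars, rng)
```

In a real system the placeholder model would be replaced by a trained network, and sampling would run the reverse process from pure noise to a parameter vector that is then passed to the procedural generator.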

