DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation
December 19, 2024
Authors: Wang Zhao, Yan-Pei Cao, Jiale Xu, Yuejiang Dong, Ying Shan
cs.AI
Abstract
Procedural Content Generation (PCG) is powerful in creating high-quality 3D
content, yet controlling it to produce desired shapes is difficult and often
requires extensive parameter tuning. Inverse Procedural Content Generation aims
to automatically find the best parameters for a given input condition. However,
existing sampling-based and neural-network-based methods still require numerous
sampling iterations or offer only limited controllability. In this work, we
present DI-PCG, a novel and efficient method for Inverse PCG from general image
conditions. At its core is a lightweight diffusion transformer model, in which
PCG parameters are directly treated as the denoising target and the observed
images serve as conditions to control parameter generation. DI-PCG is efficient
and effective. With only 7.6M network parameters and 30 GPU hours of training,
it demonstrates superior performance in recovering parameters accurately and
generalizes well to in-the-wild images. Quantitative and qualitative
experimental results validate the effectiveness of DI-PCG in inverse PCG and
image-to-3D generation tasks. DI-PCG offers a promising approach for efficient
inverse PCG and represents a valuable exploration step toward a 3D generation
path that models how to construct a 3D asset using parametric models.
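The abstract's core idea, treating the PCG parameter vector as the denoising target of an image-conditioned diffusion transformer, can be illustrated with a minimal sketch. The PyTorch code below is an illustrative assumption, not the authors' implementation: the parameter count, feature dimensions, choice of image encoder, and block layout are hypothetical placeholders.

```python
# Minimal sketch (assumed, not the DI-PCG codebase): a tiny transformer that
# denoises a flat vector of PCG parameters, conditioned on image features.
import torch
import torch.nn as nn

class ParamDiT(nn.Module):
    """Predicts the noise added to a PCG parameter vector, given image features."""
    def __init__(self, num_params=32, img_feat_dim=768, d_model=256,
                 n_heads=4, n_layers=4):
        super().__init__()
        # Each scalar PCG parameter becomes one token.
        self.param_embed = nn.Linear(1, d_model)
        self.time_embed = nn.Sequential(
            nn.Linear(1, d_model), nn.SiLU(), nn.Linear(d_model, d_model))
        # Project image features (e.g., from a frozen pretrained image
        # encoder) into the token space for cross-attention conditioning.
        self.cond_proj = nn.Linear(img_feat_dim, d_model)
        layer = nn.TransformerDecoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True)
        self.blocks = nn.TransformerDecoder(layer, n_layers)
        self.out = nn.Linear(d_model, 1)  # predicted noise, one value per parameter

    def forward(self, noisy_params, t, img_feats):
        # noisy_params: (B, P), t: (B,), img_feats: (B, N, img_feat_dim)
        tokens = self.param_embed(noisy_params.unsqueeze(-1))          # (B, P, D)
        tokens = tokens + self.time_embed(t[:, None].float())[:, None, :]
        cond = self.cond_proj(img_feats)                               # (B, N, D)
        h = self.blocks(tokens, cond)                                  # cross-attend to image
        return self.out(h).squeeze(-1)                                 # (B, P)

# Standard DDPM-style training step on (image, parameter) pairs, which a
# procedural generator can supply in unlimited quantity by rendering its
# own random samples.
def training_step(model, params, img_feats, alphas_cumprod):
    t = torch.randint(0, len(alphas_cumprod), (params.shape[0],))
    noise = torch.randn_like(params)
    a = alphas_cumprod[t][:, None]
    noisy = a.sqrt() * params + (1 - a).sqrt() * noise
    pred = model(noisy, t, img_feats)
    return nn.functional.mse_loss(pred, noise)
```

At inference time, under the same assumptions, one would run the standard reverse diffusion process from Gaussian noise, conditioned on features of the input image, and feed the denoised parameter vector to the procedural generator to construct the final 3D asset.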