DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

Abstract

Most 3D object generators focus on aesthetic quality and often neglect the physical constraints that applications require. One such constraint is that a 3D object should be self-supporting, i.e., remain balanced under gravity. Prior approaches to generating stable 3D objects used differentiable physics simulators to optimize geometry at test time, which is slow, unstable, and prone to local optima. Inspired by the literature on aligning generative models to external feedback, we propose Direct Simulation Optimization (DSO), a framework that uses feedback from a (non-differentiable) simulator to increase the likelihood that the 3D generator directly outputs stable 3D objects. We construct a dataset of 3D objects, each labeled with a stability score obtained from the physics simulator. We then fine-tune the 3D generator using the stability score as the alignment metric, via either direct preference optimization (DPO) or direct reward optimization (DRO), a novel objective that we introduce to align diffusion models without requiring pairwise preferences. Our experiments show that the fine-tuned feed-forward generator, using either the DPO or the DRO objective, is much faster and more likely to produce stable objects than test-time optimization. Notably, the DSO framework works even without any ground-truth 3D objects for training, allowing the 3D generator to self-improve by automatically collecting simulation feedback on its own outputs.
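
As a rough illustration of the DPO route described above, the sketch below pairs stable and unstable generations for the same prompt and evaluates the generic DPO loss on one pair. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the `(prompt_id, object_id, stability_score)` tuple layout, the 0.5 threshold, and `beta=0.1` are all hypothetical, and the loss is the standard log-likelihood form of DPO rather than the diffusion-specific variant a 3D generator would need. The DRO objective is not sketched because the abstract does not define it.

```python
import math
from itertools import product


def build_preference_pairs(samples, threshold=0.5):
    """Group simulator-scored samples by prompt and pair stable with unstable ones.

    samples: iterable of (prompt_id, object_id, stability_score) tuples.
    The tuple layout and the threshold are illustrative assumptions.
    Returns a list of (prompt_id, preferred_id, rejected_id).
    """
    by_prompt = {}
    for prompt_id, object_id, score in samples:
        by_prompt.setdefault(prompt_id, []).append((object_id, score))

    pairs = []
    for prompt_id, items in by_prompt.items():
        stable = [obj for obj, score in items if score >= threshold]
        unstable = [obj for obj, score in items if score < threshold]
        # Every stable object is preferred over every unstable one.
        for winner, loser in product(stable, unstable):
            pairs.append((prompt_id, winner, loser))
    return pairs


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    logp_*     : log-likelihoods under the model being fine-tuned.
    ref_logp_* : log-likelihoods under the frozen reference model.
    beta is an assumed temperature, not a value from the paper.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    scored = [
        ("chair", "chair_0", 0.9),  # stays upright in simulation
        ("chair", "chair_1", 0.1),  # topples over
        ("lamp", "lamp_0", 0.8),    # no unstable sample, so no pair
    ]
    print(build_preference_pairs(scored))
    print(dpo_loss(logp_w=-1.0, logp_l=-3.0, ref_logp_w=-2.0, ref_logp_l=-2.0))
```

Prompts whose samples are all stable or all unstable contribute no pairs in this sketch, which is one reason an objective that does not need pairwise preferences can be attractive.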
