DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
March 28, 2025
Authors: Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
cs.AI
Abstract
Most 3D object generators focus on aesthetic quality, often neglecting
physical constraints necessary in applications. One such constraint is that the
3D object should be self-supporting, i.e., remain balanced under gravity.
Prior approaches to generating stable 3D objects used differentiable physics
simulators to optimize geometry at test-time, which is slow, unstable, and
prone to local optima. Inspired by the literature on aligning generative models
to external feedback, we propose Direct Simulation Optimization (DSO), a
framework that uses feedback from a (non-differentiable) simulator to directly
increase the likelihood that the 3D generator outputs stable 3D objects. We
construct a dataset of 3D objects labeled with a stability score obtained from
the physics simulator. We can then fine-tune the 3D generator using the
stability score as the alignment metric, via direct preference optimization
(DPO) or direct reward optimization (DRO), a novel objective we introduce that
aligns diffusion models without requiring pairwise preferences.
Our experiments show that the fine-tuned feed-forward generator, using either
the DPO or the DRO objective, is much faster and more likely to produce stable objects
than test-time optimization. Notably, the DSO framework works even without any
ground-truth 3D objects for training, allowing the 3D generator to self-improve
by automatically collecting simulation feedback on its own outputs.
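
As a rough illustration of the DPO route described in the abstract, the sketch below turns simulator-scored samples into (preferred, rejected) pairs and evaluates the generic DPO loss on one pair. It is not the authors' implementation. The names `build_preference_pairs`, `dpo_loss`, `condition_id`, and `threshold`, the assumption that a higher stability score means a more stable object, and the use of scalar log-likelihood ratios (rather than the denoising-error terms a diffusion model would use) are all assumptions made here for illustration. The DRO objective is not sketched because the abstract does not define it.

```python
import math
from itertools import product


def build_preference_pairs(samples, threshold=0.5):
    """Group scored samples by conditioning input and pair stable with unstable ones.

    samples: iterable of (condition_id, object_id, stability_score) tuples.
    threshold: hypothetical cutoff; scores >= threshold count as "stable".
    Returns a list of (condition_id, preferred_id, rejected_id) tuples.
    """
    by_condition = {}
    for condition_id, object_id, score in samples:
        by_condition.setdefault(condition_id, []).append((object_id, score))

    pairs = []
    for condition_id, items in by_condition.items():
        stable = [oid for oid, s in items if s >= threshold]
        unstable = [oid for oid, s in items if s < threshold]
        # Every stable sample is preferred over every unstable one
        # generated from the same conditioning input.
        pairs.extend((condition_id, w, l) for w, l in product(stable, unstable))
    return pairs


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def dpo_loss(logratio_preferred, logratio_rejected, beta=1.0):
    """Generic DPO loss for one pair: -log(sigmoid(beta * (r_w - r_l))).

    Each logratio is log p_model(x) - log p_reference(x) for that sample.
    """
    margin = beta * (logratio_preferred - logratio_rejected)
    return _softplus(-margin)


if __name__ == "__main__":
    scored = [
        ("chair", "a", 0.9),
        ("chair", "b", 0.1),
        ("chair", "c", 0.8),
        ("lamp", "d", 0.2),
    ]
    print(build_preference_pairs(scored))
    print(dpo_loss(0.0, 0.0))
```

In this sketch a conditioning input with no stable (or no unstable) samples contributes no pairs, which is one reason an objective that does not need pairwise preferences can make use of more of the scored data.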