Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion
April 15, 2025
Authors: An Zhao, Shengyuan Zhang, Ling Yang, Zejian Li, Jiale Wu, Haoran Xu, Anyang Wei, Perry Pengyun Gu, Lingyun Sun
cs.AI
Abstract
The application of diffusion models in 3D LiDAR scene completion is limited by diffusion's slow sampling speed. Score distillation accelerates diffusion sampling but degrades performance, while post-training with direct preference optimization (DPO) boosts performance using preference data. This paper proposes Distillation-DPO, a novel diffusion distillation framework for LiDAR scene completion with preference alignment. First, the student model generates paired completion scenes from different initial noises. Second, using LiDAR scene evaluation metrics as the preference signal, we construct winning and losing sample pairs. This construction is reasonable because most LiDAR scene metrics are informative but non-differentiable, so they cannot be optimized directly. Third, Distillation-DPO optimizes the student model by exploiting the difference in score functions between the teacher and student models on the paired completion scenes. This procedure is repeated until convergence. Extensive experiments demonstrate that, compared to state-of-the-art LiDAR scene completion diffusion models, Distillation-DPO achieves higher-quality scene completion while accelerating completion by more than 5-fold. To the best of our knowledge, our method is the first to explore adopting preference learning in distillation, and it provides insights into preference-aligned distillation. Our code is publicly available at https://github.com/happyw1nd/DistillationDPO.
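The abstract describes a three-step loop: paired generation from different noises, metric-based ranking into winning/losing pairs, and optimization via the teacher-student score gap. The sketch below illustrates how one such training step might be organized. It is a minimal PyTorch-style sketch under our own assumptions; the methods `student.sample`, `student.score`, `teacher.score`, the `lidar_metric` callable, and the `beta`-scaled loss form are hypothetical placeholders, not the authors' implementation from the linked repository.

```python
# Illustrative Distillation-DPO-style training step (assumed API, not the paper's code).
import torch
import torch.nn.functional as F

def distillation_dpo_step(student, teacher, partial_scan, lidar_metric, beta=1.0):
    # 1) Student generates a pair of completions from two different initial noises.
    noise_a, noise_b = torch.randn(2, *partial_scan.shape)
    comp_a = student.sample(partial_scan, noise_a)
    comp_b = student.sample(partial_scan, noise_b)

    # 2) Rank the pair with a (non-differentiable) LiDAR scene metric
    #    to obtain winning / losing samples.
    if lidar_metric(comp_a) >= lidar_metric(comp_b):
        win, lose = comp_a, comp_b
    else:
        win, lose = comp_b, comp_a

    # 3) Measure the gap between teacher and student score predictions
    #    at a randomly drawn diffusion timestep.
    t = torch.randint(0, student.num_timesteps, (1,))

    def score_gap(x):
        with torch.no_grad():
            s_teacher = teacher.score(x, t)   # frozen teacher
        s_student = student.score(x, t)       # trainable student
        return F.mse_loss(s_student, s_teacher)

    # 4) DPO-style objective: prefer the winning completion over the losing one,
    #    i.e. shrink the student-teacher gap more on the winner than on the loser.
    loss = -F.logsigmoid(-beta * (score_gap(win) - score_gap(lose)))
    return loss
```

In this sketch the non-differentiable metric only decides the pair ordering, while gradients flow through the student's score predictions, which matches the abstract's point that the metrics guide preference construction rather than being optimized directly.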