

Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion

April 15, 2025
Authors: An Zhao, Shengyuan Zhang, Ling Yang, Zejian Li, Jiale Wu, Haoran Xu, AnYang Wei, Perry Pengyun GU, Lingyun Sun
cs.AI

Abstract

The application of diffusion models to 3D LiDAR scene completion is limited by their slow sampling speed. Score distillation accelerates diffusion sampling but degrades performance, while post-training with direct preference optimization (DPO) boosts performance using preference data. This paper proposes Distillation-DPO, a novel diffusion distillation framework for LiDAR scene completion with preference alignment. First, the student model generates paired completion scenes from different initial noises. Second, using LiDAR scene evaluation metrics as the preference signal, we construct winning and losing sample pairs. This construction is reasonable because most LiDAR scene metrics are informative but non-differentiable and thus cannot be optimized directly. Third, Distillation-DPO optimizes the student model by exploiting the difference between the teacher and student models' score functions on the paired completion scenes. This procedure is repeated until convergence. Extensive experiments demonstrate that, compared to state-of-the-art LiDAR scene completion diffusion models, Distillation-DPO achieves higher-quality scene completion while accelerating completion by more than 5-fold. To the best of our knowledge, our method is the first to explore preference learning in distillation, and it provides insights into preference-aligned distillation. Our code is publicly available at https://github.com/happyw1nd/DistillationDPO.
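The abstract outlines a three-step training loop: paired generation, metric-based preference ranking, and a DPO-style update driven by the teacher-student score-function gap. The sketch below illustrates one plausible shape of such a step in PyTorch. It is only an assumption-laden illustration: `ScoreNet`, `voxel_metric`, `distillation_dpo_step`, and the specific loss form are hypothetical placeholders, not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScoreNet(nn.Module):
    """Toy denoising/score network over flattened scene tensors (placeholder)."""

    def __init__(self, dim=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 256), nn.SiLU(), nn.Linear(256, dim))

    def forward(self, x, t):
        # A real model would also condition on the timestep t and the partial scan.
        return self.net(x)


def voxel_metric(scene):
    """Stand-in for a non-differentiable LiDAR scene metric (e.g. an IoU-style score)."""
    g = torch.Generator().manual_seed(0)
    w = torch.randn(scene.shape[-1], generator=g)
    return float(scene.detach() @ w)


def distillation_dpo_step(student, teacher, partial, optimizer, beta=0.1):
    # 1) The student completes the same partial scan twice from different initial
    #    noises (one forward call stands in for its few-step sampler).
    scene_a = student(partial + torch.randn_like(partial), t=None)
    scene_b = student(partial + torch.randn_like(partial), t=None)

    # 2) Rank the pair with the non-differentiable metric to get winning/losing samples.
    if voxel_metric(scene_a) >= voxel_metric(scene_b):
        win, lose = scene_a.detach(), scene_b.detach()
    else:
        win, lose = scene_b.detach(), scene_a.detach()

    # 3) DPO-style objective on the denoising-error gap between the student and the
    #    frozen teacher, evaluated on both completions at a random noise level. This is
    #    one plausible reading of "difference in score functions", not the paper's loss.
    t = torch.rand(())

    def denoise_gap(model, scene):
        noisy = scene + t * torch.randn_like(scene)
        return F.mse_loss(model(noisy, t), scene)

    gap_win = denoise_gap(student, win) - denoise_gap(teacher, win).detach()
    gap_lose = denoise_gap(student, lose) - denoise_gap(teacher, lose).detach()

    # Push the student to model the winning scene better than the losing one,
    # relative to the teacher.
    loss = -F.logsigmoid(-beta * (gap_win - gap_lose))

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage: random tensors stand in for (flattened) partial LiDAR scans.
student, teacher = ScoreNet(), ScoreNet()
for p in teacher.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)
print(distillation_dpo_step(student, teacher, torch.randn(1024), opt))
```

The point the sketch tries to mirror is that the non-differentiable scene metric is used only to rank the pair, while gradients flow through a differentiable teacher-student gap on the ranked samples.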
