Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
December 4, 2024
Authors: Shengyuan Zhang, An Zhao, Ling Yang, Zejian Li, Chenye Meng, Haoran Xu, Tianrun Chen, AnYang Wei, Perry Pengyun GU, Lingyun Sun
cs.AI
Abstract
Diffusion models have been applied to 3D LiDAR scene completion due to their
strong training stability and high completion quality. However, the slow
sampling speed limits the practical application of diffusion-based scene
completion models since autonomous vehicles require an efficient perception of
surrounding environments. This paper proposes a novel distillation method
tailored for 3D LiDAR scene completion models, dubbed ScoreLiDAR,
which achieves efficient yet high-quality scene completion. ScoreLiDAR enables
the distilled model to sample in significantly fewer steps after distillation.
To improve completion quality, we also introduce a novel Structural
Loss, which encourages the distilled model to capture the geometric structure
of the 3D LiDAR scene. The loss contains a scene-wise term constraining the
holistic structure and a point-wise term constraining the key landmark points
and their relative configuration. Extensive experiments demonstrate that
ScoreLiDAR significantly accelerates the completion time from 30.55 to 5.37
seconds per frame (>5×) on SemanticKITTI and achieves superior
performance compared to state-of-the-art 3D LiDAR scene completion models. Our
code is publicly available at https://github.com/happyw1nd/ScoreLiDAR.
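
The two-part Structural Loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: the scene-wise term is stood in for by a symmetric Chamfer distance, the point-wise term compares the pairwise-distance matrices of key points, and the function names, the `key_idx` index array (which assumes key points correspond by index in both clouds), and the `weight` parameter are hypothetical. It is not taken from the ScoreLiDAR repository.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distance matrix between point sets a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def scene_wise_term(pred, target):
    """Symmetric Chamfer distance (assumed stand-in for the scene-wise term)."""
    d = pairwise_dist(pred, target)
    # Mean nearest-neighbour distance in both directions.
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def point_wise_term(pred_keys, target_keys):
    """Mismatch between the relative configurations of key points (assumed form).

    Compares the distance matrix among predicted key points with the
    distance matrix among the corresponding ground-truth key points.
    """
    d_pred = pairwise_dist(pred_keys, pred_keys)
    d_target = pairwise_dist(target_keys, target_keys)
    return np.abs(d_pred - d_target).mean()


def structural_loss(pred, target, key_idx, weight=1.0):
    """Scene-wise term plus a weighted point-wise term (hypothetical weighting)."""
    scene = scene_wise_term(pred, target)
    point = point_wise_term(pred[key_idx], target[key_idx])
    return scene + weight * point
```

Under these assumptions, shifting the whole predicted cloud rigidly leaves the point-wise term at (numerically) zero, because the relative configuration of the key points is unchanged, while the scene-wise term becomes positive. That is one way to see why the two terms constrain different aspects of the geometry.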