Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

Abstract

Diffusion models have been applied to 3D LiDAR scene completion because of their strong training stability and high completion quality. However, their slow sampling speed limits the practical use of diffusion-based scene completion models, since autonomous vehicles require efficient perception of their surroundings. This paper proposes ScoreLiDAR, a novel distillation method tailored to 3D LiDAR scene completion models that achieves efficient yet high-quality scene completion. After distillation, the distilled model can sample in significantly fewer steps. To improve completion quality, we also introduce a novel Structural Loss, which encourages the distilled model to capture the geometric structure of the 3D LiDAR scene. The loss contains a scene-wise term that constrains the holistic structure and a point-wise term that constrains the key landmark points and their relative configuration. Extensive experiments demonstrate that ScoreLiDAR reduces the completion time from 30.55 to 5.37 seconds per frame (a speedup of more than 5×) on SemanticKITTI and achieves superior performance compared to state-of-the-art 3D LiDAR scene completion models. Our code is publicly available at https://github.com/happyw1nd/ScoreLiDAR.
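To make the two-part structure of the loss concrete, the following is a minimal NumPy sketch and not the authors' implementation. The abstract does not specify how either term is computed, so the choices here are assumptions: the scene-wise term is taken to be a symmetric Chamfer distance, the point-wise term compares landmark-to-landmark distance matrices as a stand-in for "relative configuration", and landmarks are matched by nearest neighbour. All function names, the `landmark_idx` argument, and the weights `w_scene` and `w_point` are hypothetical.

```python
import numpy as np


def pairwise_dist(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Euclidean distances between every row of a (N, 3) and every row of b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def scene_term(pred: np.ndarray, target: np.ndarray) -> float:
    """Assumed scene-wise term: symmetric Chamfer distance between the two clouds."""
    d = pairwise_dist(pred, target)
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())


def point_term(pred_kp: np.ndarray, target_kp: np.ndarray) -> float:
    """Assumed point-wise term: mean absolute difference between the
    landmark-to-landmark distance matrices of the two landmark sets."""
    dp = pairwise_dist(pred_kp, pred_kp)
    dt = pairwise_dist(target_kp, target_kp)
    return float(np.abs(dp - dt).mean())


def structural_loss(pred, target, landmark_idx, w_scene=1.0, w_point=1.0) -> float:
    """Weighted sum of the two assumed terms.

    landmark_idx selects landmark points in the target cloud; each one is
    matched to its nearest predicted point (a simplifying assumption).
    """
    target_kp = target[landmark_idx]
    nearest = pairwise_dist(target_kp, pred).argmin(axis=1)
    pred_kp = pred[nearest]
    return w_scene * scene_term(pred, target) + w_point * point_term(pred_kp, target_kp)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=(256, 3))            # stand-in for a ground-truth scene
    pred = target + 0.05 * rng.normal(size=target.shape)  # stand-in for a completion
    landmarks = np.arange(0, 256, 16)             # hypothetical landmark selection
    print("structural loss:", structural_loss(pred, target, landmarks))
```

In this sketch the point-wise term is invariant to a rigid shift of the landmarks, so only the scene-wise term penalises global misplacement; whether the paper's loss behaves this way is not stated in the abstract.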
