Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator
February 26, 2025
Authors: Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang
cs.AI
Abstract
Monocular depth estimation (MDE) aims to predict scene depth from a single
RGB image and plays a crucial role in 3D scene understanding. Recent advances
in zero-shot MDE leverage normalized depth representations and
distillation-based learning to improve generalization across diverse scenes.
However, current depth normalization methods for distillation, relying on
global normalization, can amplify noisy pseudo-labels, reducing distillation
effectiveness. In this paper, we systematically analyze the impact of different
depth normalization strategies on pseudo-label distillation. Based on our
findings, we propose Cross-Context Distillation, which integrates global and
local depth cues to enhance pseudo-label quality. Additionally, we introduce a
multi-teacher distillation framework that leverages complementary strengths of
different depth estimation models, leading to more robust and accurate depth
predictions. Extensive experiments on benchmark datasets demonstrate that our
approach significantly outperforms state-of-the-art methods, both
quantitatively and qualitatively.
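The abstract's central observation is that normalizing pseudo-labels over the whole image lets noisy teacher regions dominate the shared scale and shift, whereas normalizing within local crops supervises local relative structure directly. The PyTorch sketch below illustrates that contrast with a MiDaS-style affine-invariant normalization and a simple combined loss; the function names, crop strategy, and equal loss weighting are illustrative assumptions, not the authors' implementation of Cross-Context Distillation.

```python
# Minimal sketch (assumed names, not the paper's code) of global vs. local
# normalization for pseudo-label distillation.
import torch

def affine_invariant_norm(depth: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """MiDaS-style normalization: subtract the per-image median, divide by the
    mean absolute deviation, so supervision is scale- and shift-invariant."""
    b = depth.shape[0]
    d = depth.reshape(b, -1)
    t = d.median(dim=1, keepdim=True).values           # per-image shift
    s = (d - t).abs().mean(dim=1, keepdim=True) + eps  # per-image scale
    return ((d - t) / s).reshape_as(depth)

def cross_context_loss(student_depth: torch.Tensor,
                       teacher_depth: torch.Tensor,
                       crop_size: int = 128) -> torch.Tensor:
    """Combine a globally normalized loss with a locally normalized loss
    computed on a shared random crop (hypothetical weighting: simple sum)."""
    # Global context: noisy teacher regions can skew the global scale/shift,
    # which is the failure mode the paper analyzes.
    loss_global = (affine_invariant_norm(student_depth)
                   - affine_invariant_norm(teacher_depth)).abs().mean()

    # Local context: normalize within the same crop of student and teacher,
    # so the supervision reflects local relative depth structure.
    _, _, h, w = student_depth.shape
    y = torch.randint(0, h - crop_size + 1, (1,)).item()
    x = torch.randint(0, w - crop_size + 1, (1,)).item()
    s_crop = student_depth[:, :, y:y + crop_size, x:x + crop_size]
    t_crop = teacher_depth[:, :, y:y + crop_size, x:x + crop_size]
    loss_local = (affine_invariant_norm(s_crop)
                  - affine_invariant_norm(t_crop)).abs().mean()

    return loss_global + loss_local
```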
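For the multi-teacher framework, the abstract only states that complementary strengths of different depth estimators are combined. One plausible form, sketched below under that assumption, samples one frozen teacher per training image to generate the pseudo-label; the sampling scheme and all names are hypothetical.

```python
# Minimal sketch (assumptions, not the paper's implementation) of a
# multi-teacher pseudo-labeling step.
import random
import torch

@torch.no_grad()
def sample_pseudo_label(image: torch.Tensor, teachers: list) -> torch.Tensor:
    """Pick a random frozen teacher (e.g. different zero-shot MDE models)
    and use its prediction as the pseudo-label for this image."""
    teacher = random.choice(teachers)
    teacher.eval()
    return teacher(image)

def distillation_step(student, teachers, image, loss_fn):
    """One student update against a sampled teacher's pseudo-label,
    e.g. with the cross-context loss sketched above."""
    pseudo_depth = sample_pseudo_label(image, teachers)
    pred = student(image)
    return loss_fn(pred, pseudo_depth)
```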