
Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator

February 26, 2025
Authors: Xiankang He, Dongyan Guo, Hongji Li, Ruibo Li, Ying Cui, Chi Zhang
cs.AI

Abstract

Monocular depth estimation (MDE) aims to predict scene depth from a single RGB image and plays a crucial role in 3D scene understanding. Recent advances in zero-shot MDE leverage normalized depth representations and distillation-based learning to improve generalization across diverse scenes. However, current depth normalization methods for distillation, relying on global normalization, can amplify noisy pseudo-labels, reducing distillation effectiveness. In this paper, we systematically analyze the impact of different depth normalization strategies on pseudo-label distillation. Based on our findings, we propose Cross-Context Distillation, which integrates global and local depth cues to enhance pseudo-label quality. Additionally, we introduce a multi-teacher distillation framework that leverages complementary strengths of different depth estimation models, leading to more robust and accurate depth predictions. Extensive experiments on benchmark datasets demonstrate that our approach significantly outperforms state-of-the-art methods, both quantitatively and qualitatively.
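
To make the distillation setup in the abstract more concrete, here is a minimal PyTorch sketch of affine-invariant (median/MAD) depth normalization and a cross-context distillation loss that re-normalizes pseudo-labels both over the whole image (global context) and over random local crops (local context). The function names, crop sampling scheme, crop size, and equal loss weighting are illustrative assumptions for exposition, not the authors' released implementation or exact settings.

```python
import torch

def normalize_depth(d: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Affine-invariant normalization: shift by the median and scale by the
    mean absolute deviation, computed over the given region."""
    t = d.median()
    s = (d - t).abs().mean().clamp_min(eps)
    return (d - t) / s

def distill_loss(student: torch.Tensor, teacher: torch.Tensor) -> torch.Tensor:
    """L1 distance between the normalized student prediction and the
    normalized teacher pseudo-label."""
    return (normalize_depth(student) - normalize_depth(teacher)).abs().mean()

def cross_context_loss(student: torch.Tensor, teacher: torch.Tensor,
                       num_crops: int = 4, crop: int = 128) -> torch.Tensor:
    """Combine a whole-image distillation term with terms computed on random
    local crops, so the pseudo-label is re-normalized within each context.
    Crop count and size here are illustrative, not the paper's settings."""
    h, w = student.shape[-2:]
    loss = distill_loss(student, teacher)  # global context
    for _ in range(num_crops):             # local contexts
        y = torch.randint(0, h - crop + 1, (1,)).item()
        x = torch.randint(0, w - crop + 1, (1,)).item()
        loss = loss + distill_loss(student[..., y:y + crop, x:x + crop],
                                   teacher[..., y:y + crop, x:x + crop])
    return loss / (num_crops + 1)

# Toy usage with a single dense depth map per tensor.
student_pred = torch.rand(1, 1, 256, 256, requires_grad=True)
teacher_pseudo = torch.rand(1, 1, 256, 256)
print(cross_context_loss(student_pred, teacher_pseudo))
```

The intuition behind the local terms is that re-normalizing within a crop limits how far a noisy pseudo-label in one region can skew the shift and scale applied to the rest of the image, which is the failure mode the abstract attributes to purely global normalization.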
