DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
January 5, 2025
Authors: Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang
cs.AI
Abstract
Monocular depth estimation within the diffusion-denoising paradigm
demonstrates impressive generalization ability but suffers from low inference
speed. Recent methods adopt a single-step deterministic paradigm to improve
inference efficiency while maintaining comparable performance. However, they
overlook the gap between generative and discriminative features, leading to
suboptimal results. In this work, we propose DepthMaster, a single-step
diffusion model designed to adapt generative features for the discriminative
depth estimation task. First, to mitigate overfitting to texture details
introduced by generative features, we propose a Feature Alignment module, which
incorporates high-quality semantic features to enhance the denoising network's
representation capability. Second, to address the lack of fine-grained details
in the single-step deterministic framework, we propose a Fourier Enhancement
module to adaptively balance low-frequency structure and high-frequency
details. We adopt a two-stage training strategy to fully leverage the potential
of the two modules. In the first stage, we focus on learning the global scene
structure with the Feature Alignment module, while in the second stage, we
exploit the Fourier Enhancement module to improve the visual quality. Through
these efforts, our model achieves state-of-the-art performance in terms of
generalization and detail preservation, outperforming other diffusion-based
methods across various datasets. Our project page can be found at
https://indu1ge.github.io/DepthMaster_page.
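
To make the idea of balancing low-frequency structure against high-frequency details concrete, here is a minimal NumPy sketch. It is not the paper's Fourier Enhancement module, which the abstract describes as adaptive; this version uses a fixed radial cutoff and a fixed gain purely for illustration. The function names `split_frequencies` and `rebalance`, and the parameters `cutoff` and `high_gain`, are hypothetical.

```python
import numpy as np


def split_frequencies(depth, cutoff=0.1):
    """Split a 2-D map into low- and high-frequency parts.

    Illustrative assumption: an ideal radial mask in the Fourier domain,
    with `cutoff` given in cycles per pixel. Not the paper's module.
    """
    spectrum = np.fft.fft2(depth)
    # Frequency coordinates along each axis, in cycles per pixel.
    fy = np.fft.fftfreq(depth.shape[0])[:, None]
    fx = np.fft.fftfreq(depth.shape[1])[None, :]
    low_mask = np.sqrt(fx ** 2 + fy ** 2) <= cutoff
    # The two masks are complementary, so low + high reconstructs the input.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


def rebalance(depth, high_gain=1.5, cutoff=0.1):
    """Re-weight fine detail relative to coarse structure (fixed gain)."""
    low, high = split_frequencies(depth, cutoff)
    return low + high_gain * high
```

With `high_gain=1.0` the input is returned unchanged up to floating-point error; values above 1 sharpen fine detail, and values below 1 smooth it. A learned module would presumably predict this weighting from the features rather than fix it by hand.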