DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
January 5, 2025
Authors: Ziyang Song, Zerong Wang, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang, Tianzhu Zhang
cs.AI
Abstract
Monocular depth estimation within the diffusion-denoising paradigm
demonstrates impressive generalization ability but suffers from low inference
speed. Recent methods adopt a single-step deterministic paradigm to improve
inference efficiency while maintaining comparable performance. However, they
overlook the gap between generative and discriminative features, leading to
suboptimal results. In this work, we propose DepthMaster, a single-step
diffusion model designed to adapt generative features for the discriminative
depth estimation task. First, to mitigate overfitting to texture details
introduced by generative features, we propose a Feature Alignment module, which
incorporates high-quality semantic features to enhance the denoising network's
representation capability. Second, to address the lack of fine-grained details
in the single-step deterministic framework, we propose a Fourier Enhancement
module to adaptively balance low-frequency structure and high-frequency
details. We adopt a two-stage training strategy to fully leverage the potential
of the two modules. In the first stage, we focus on learning the global scene
structure with the Feature Alignment module, while in the second stage, we
exploit the Fourier Enhancement module to improve the visual quality. Through
these efforts, our model achieves state-of-the-art performance in terms of
generalization and detail preservation, outperforming other diffusion-based
methods across various datasets. Our project page can be found at
https://indu1ge.github.io/DepthMaster_page.
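The abstract describes balancing low-frequency structure against high-frequency details but gives no implementation. As a rough illustration of that idea only, the sketch below splits a 2D map into low- and high-frequency parts with a fixed radial mask in the Fourier domain and recombines them with a scalar weight. It is not the paper's Fourier Enhancement module, which is described as adaptive; the function names `split_frequencies` and `rebalance` and the parameters `cutoff` and `high_weight` are hypothetical and chosen here for illustration.

```python
import numpy as np


def split_frequencies(x, cutoff=0.1):
    """Split a 2D array into low- and high-frequency parts.

    Assumption: an ideal (hard) radial mask with a fixed cutoff,
    given in cycles per sample. A learned or adaptive filter would
    replace this mask in a real model.
    """
    h, w = x.shape
    fy = np.fft.fftfreq(h)[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(w)[None, :]   # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(x)
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


def rebalance(x, high_weight=1.5, cutoff=0.1):
    """Recombine the two bands, scaling the high-frequency part.

    high_weight > 1 emphasizes fine detail; high_weight < 1 smooths.
    """
    low, high = split_frequencies(x, cutoff)
    return low + high_weight * high


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = rng.random((64, 64))      # stand-in for a predicted depth map
    sharpened = rebalance(depth, high_weight=1.5)
    print(sharpened.shape)
```

Because the two masks are complementary, the bands sum back to the input, so `high_weight=1.0` leaves the map unchanged and the weight alone controls how much detail is emphasized.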