Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images

Abstract

Cloud segmentation is a critical challenge in remote sensing image interpretation, because its accuracy directly affects the effectiveness of subsequent data processing and analysis. Recently, vision foundation models (VFMs) have demonstrated strong generalization across a variety of visual tasks. In this paper, we present a parameter-efficient adaptation approach, termed Cloud-Adapter, designed to improve the accuracy and robustness of cloud segmentation. Our method leverages a VFM pretrained on general-domain data; the VFM remains frozen, so the backbone itself requires no additional training. Cloud-Adapter incorporates a lightweight spatial perception module that first uses a convolutional neural network (ConvNet) to extract dense spatial representations. These multi-scale features are then aggregated and fed as contextual inputs to an adapting module, which modulates the frozen transformer layers within the VFM. Experimental results demonstrate that Cloud-Adapter achieves substantial performance gains with trainable parameters amounting to only 0.6% of the frozen backbone's parameters. Cloud-Adapter consistently attains state-of-the-art (SOTA) performance across a wide variety of cloud segmentation datasets spanning multiple satellite sources, sensor series, data processing levels, land cover scenarios, and annotation granularities. We have released the source code and pretrained models at https://github.com/XavierJiezou/Cloud-Adapter to support further research.
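To make the parameter-efficiency figure concrete, the following minimal sketch shows how a trainable-to-frozen parameter ratio can be computed. It is an illustration only and does not reproduce the Cloud-Adapter architecture. The layer sizes (`DIM`, `DEPTH`, `ADAPTER_DIM`), the helper names, and the simple bottleneck adapter used as a stand-in for the trainable modules are all assumptions introduced here, not values or designs taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """A named group of parameters with a trainable flag."""
    name: str
    num_params: int
    trainable: bool


def linear_params(d_in: int, d_out: int) -> int:
    """Weights plus bias of a fully connected layer."""
    return d_in * d_out + d_out


def transformer_layer_params(dim: int, mlp_ratio: int = 4) -> int:
    """Parameter count of one standard transformer encoder layer."""
    # Fused QKV projection plus the attention output projection.
    attention = linear_params(dim, 3 * dim) + linear_params(dim, dim)
    # Two-layer MLP with hidden width mlp_ratio * dim.
    mlp = linear_params(dim, mlp_ratio * dim) + linear_params(mlp_ratio * dim, dim)
    # Two LayerNorms, each with a scale and a shift vector.
    layer_norms = 2 * (2 * dim)
    return attention + mlp + layer_norms


def trainable_ratio(modules) -> float:
    """Trainable parameters divided by frozen (backbone) parameters."""
    trainable = sum(m.num_params for m in modules if m.trainable)
    frozen = sum(m.num_params for m in modules if not m.trainable)
    return trainable / frozen


# Hypothetical sizes, chosen only for illustration (not the paper's configuration).
DIM, DEPTH, ADAPTER_DIM = 1024, 24, 16

# Frozen backbone: a stack of identical transformer layers.
modules = [
    Module(f"backbone.layer{i}", transformer_layer_params(DIM), trainable=False)
    for i in range(DEPTH)
]
# Assumed stand-in for the trainable part: one small bottleneck
# (down-projection then up-projection) per frozen layer.
modules += [
    Module(
        f"adapter.layer{i}",
        linear_params(DIM, ADAPTER_DIM) + linear_params(ADAPTER_DIM, DIM),
        trainable=True,
    )
    for i in range(DEPTH)
]

ratio = trainable_ratio(modules)
print(f"trainable / frozen = {ratio:.2%}")
```

Because the backbone's parameter count grows quadratically with `DIM` while a bottleneck grows only linearly, a ratio well below 1% is plausible for this kind of design. The actual 0.6% figure depends on the specific modules described in the paper and released in the repository.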
