Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images

Abstract

Global contextual information and local detail features are both essential for haze removal. Deep learning models perform well on small, low-resolution images, but GPU memory limits make large, high-resolution images difficult to process. As a compromise, image slicing or downsampling is often adopted. Slicing reduces the global information available to the model, while downsampling discards high-frequency details. To address these challenges, we propose DehazeXL, a haze removal method that effectively balances global context and local feature extraction, enabling end-to-end modeling of large images on mainstream GPU hardware. In addition, to evaluate how efficiently global context is used for haze removal, we design a visual attribution method tailored to the characteristics of haze removal tasks. Finally, recognizing the lack of benchmark datasets for haze removal in large images, we have developed an ultra-high-resolution haze removal dataset (8KDehaze) to support model training and testing. It includes 10,000 pairs of clear and hazy remote sensing images, each 8192 × 8192 pixels. Extensive experiments demonstrate that DehazeXL can run inference on images of up to 10240 × 10240 pixels with only 21 GB of memory, achieving state-of-the-art results among all evaluated methods. The source code and experimental dataset are available at https://github.com/CastleChen339/DehazeXL.
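The trade-off between slicing and downsampling described in the abstract can be made concrete with a small sketch. This is not DehazeXL code: the helper names `tile_boxes` and `average_pool_1d`, the 1024-pixel tile size, and the pooling factor of 2 are illustrative assumptions. It only shows how little of an 8192 × 8192 image a single slice covers, and how averaging during downsampling erases a fine stripe pattern.

```python
def tile_boxes(height, width, tile):
    """Row-major (top, left, bottom, right) boxes covering the image.

    Tiles on the bottom and right edges are clipped to the image bounds.
    """
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in range(0, height, tile)
        for left in range(0, width, tile)
    ]


def average_pool_1d(signal, factor):
    """Downsample by averaging non-overlapping windows of `factor` samples."""
    return [
        sum(signal[i:i + factor]) / factor
        for i in range(0, len(signal) - factor + 1, factor)
    ]


# Slicing: cut an 8192 x 8192 image into 1024 x 1024 tiles
# (the tile size is an assumption chosen for illustration).
boxes = tile_boxes(8192, 8192, 1024)
# Share of the full image that a model processing one tile can see.
tile_area_fraction = (1024 * 1024) / (8192 * 8192)

# Downsampling: a one-pixel stripe pattern stands in for high-frequency detail.
stripes = [0.0, 1.0] * 4
pooled = average_pool_1d(stripes, 2)

print(len(boxes), tile_area_fraction, pooled)
```

With these assumed settings, each slice covers 1/64 of the image, so a model that processes slices independently has no access to the other 63 tiles. The pooled stripe signal is constant, which means the original pattern can no longer be recovered from it.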
