
SARChat-Bench-2M: A Multi-Task Vision-Language Benchmark for SAR Image Interpretation

February 12, 2025
Authors: Zhiming Ma, Xiayang Xiao, Sihao Dong, Peidong Wang, HaiPeng Wang, Qingyun Pan
cs.AI

Abstract

In the field of synthetic aperture radar (SAR) remote sensing image interpretation, although vision-language models (VLMs) have made remarkable progress in natural language processing and image understanding, their application in specialized domains remains limited by insufficient domain expertise. This paper proposes SARChat-2M, the first large-scale multimodal dialogue dataset for SAR images, containing approximately 2 million high-quality image-text pairs that cover diverse scenarios with detailed target annotations. The dataset not only supports key tasks such as visual understanding and object detection, but also offers a distinctive contribution: it establishes a vision-language dataset and benchmark for the SAR domain, enabling the evaluation of VLMs' capabilities in SAR image interpretation and providing a paradigmatic framework for constructing multimodal datasets in other remote sensing vertical domains. Experiments on 16 mainstream VLMs fully verify the effectiveness of the dataset and establish the first multi-task dialogue benchmark in the SAR field. The project will be released at https://github.com/JimmyMa99/SARChat, aiming to promote the in-depth development and wide application of SAR vision-language models.
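To make the "multimodal dialogue dataset" concrete, below is a minimal illustrative sketch of what a single SARChat-2M record could look like. The abstract only states that the data consists of image-text pairs with target annotations spanning multiple tasks (e.g., visual understanding, object detection); the field names, file layout, and conversation format shown here are assumptions for illustration, not the official schema from the repository.

```python
# Hypothetical sketch of one SARChat-2M-style record (schema assumed, not official).
import json

sample_record = {
    "image": "sar/scene_000001.png",   # assumed path to a SAR image chip
    "task": "object_detection",        # assumed task tag; the paper lists several tasks
    "conversations": [
        {"role": "user",
         "content": "Detect all ships in this SAR image and give their bounding boxes."},
        {"role": "assistant",
         "content": "There are 2 ships: [120, 45, 180, 90] and [300, 210, 352, 260]."},
    ],
}

# Pretty-print the record to inspect its structure.
print(json.dumps(sample_record, indent=2))
```

A conversation-style layout like this is one common way such instruction-tuning datasets are packaged; the released dataset may organize its annotations differently.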
