Global and Dense Embeddings of Earth: Major TOM Floating in the Latent Space
December 7, 2024
Authors: Mikolaj Czerkawski, Marcin Kluczek, Jędrzej S. Bojanowski
cs.AI
Abstract
With the ever-increasing volume of Earth observation data in the
archives of large programmes such as Copernicus, there is a growing need for
efficient vector representations of the underlying raw data. Extracting
feature representations from pretrained deep neural networks is a powerful
approach that can provide semantic abstractions of the input data.
However, the way this is done for imagery archives containing geospatial data
has not yet been defined. In this work, an extension is proposed to an existing
community project, Major TOM, focused on the provision and standardization of
open and free AI-ready datasets for Earth observation. Furthermore, four global
and dense embedding datasets are released openly and for free along with the
publication of this manuscript, resulting in the most comprehensive global open
dataset of geospatial visual embeddings in terms of Earth's surface coverage.