MegaLoc: One Retrieval to Place Them All

February 24, 2025
Authors: Gabriele Berton, Carlo Masone
cs.AI

Abstract

Retrieving images from the same location as a given query is an important component of multiple computer vision tasks, such as Visual Place Recognition, Landmark Retrieval, Visual Localization, 3D reconstruction, and SLAM. However, existing solutions are built specifically for one of these tasks, and are known to fail when the requirements change slightly or when they encounter out-of-distribution data. In this paper we combine a variety of existing methods, training techniques, and datasets to train a retrieval model, called MegaLoc, that performs well on multiple tasks. We find that MegaLoc (1) achieves state of the art on a large number of Visual Place Recognition datasets, (2) achieves impressive results on common Landmark Retrieval datasets, and (3) sets a new state of the art for Visual Localization on the LaMAR datasets, where we only changed the retrieval method of the existing localization pipeline. The code for MegaLoc is available at https://github.com/gmberton/MegaLoc
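
To make the retrieval step concrete, here is a minimal sketch of nearest-neighbor image retrieval over global descriptors, assuming each image has already been encoded as a fixed-length vector. It is not MegaLoc's code or API: the function names `l2_normalize` and `retrieve_top_k`, the three-dimensional descriptors, and the use of cosine similarity are all illustrative assumptions.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve_top_k(query_desc, database_descs, k=1):
    """Return indices of the k database descriptors most similar to the query.

    Similarity is cosine similarity, computed as the dot product of
    L2-normalized vectors. Ties are broken by the lower index.
    """
    q = l2_normalize(query_desc)
    scored = []
    for idx, desc in enumerate(database_descs):
        d = l2_normalize(desc)
        similarity = sum(a * b for a, b in zip(q, d))
        scored.append((similarity, idx))
    # Highest similarity first; lower index first on ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


# Toy descriptors (hypothetical values, not produced by any real model).
database = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query = [0.9, 0.1, 0.0]
top2 = retrieve_top_k(query, database, k=2)
print(top2)
```

In a localization pipeline of the kind the abstract mentions, the retrieved database images would then be handed to later stages such as local feature matching and pose estimation; swapping the retrieval model changes only how the descriptors are produced, not this lookup.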
