Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

April 12, 2025
Authors: You Wu, Xucheng Wang, Xiangyang Yang, Mengyuan Liu, Dan Zeng, Hengzhou Ye, Shuiwang Li
cs.AI

Abstract

Single-stream architectures built on Vision Transformer (ViT) backbones have recently shown great potential for real-time UAV tracking. However, frequent occlusions from obstacles such as buildings and trees expose a major drawback: these models often lack strategies for handling occlusion effectively. New methods are therefore needed to improve the occlusion resilience of single-stream ViT models in aerial tracking. In this work, we propose to learn Occlusion-Robust Representations (ORR) based on ViTs for UAV tracking by enforcing invariance of the target's feature representation with respect to random masking operations modeled by a spatial Cox process. This random masking approximately simulates target occlusion, enabling us to learn ViTs that are robust to target occlusion for UAV tracking. The framework is termed ORTrack. Additionally, to facilitate real-time applications, we propose an Adaptive Feature-Based Knowledge Distillation (AFKD) method to create a more compact tracker that adaptively mimics the behavior of the teacher model, ORTrack, according to the task's difficulty. This student model, dubbed ORTrack-D, retains much of ORTrack's performance while offering higher efficiency. Extensive experiments on multiple benchmarks validate the effectiveness of our method and demonstrate its state-of-the-art performance. Code is available at https://github.com/wuyou3474/ORTrack.
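
To make the masking idea concrete, here is a minimal sketch of how a Cox-process mask over a ViT patch grid and an invariance loss might look. It is an illustration under stated assumptions, not the authors' implementation: the function names, the gamma-distributed per-patch intensity, the zero-fill masking, and the mean-squared-error loss are all hypothetical choices, and the intensities here are drawn independently per patch rather than from a spatially correlated field.

```python
import math
import random


def cox_mask(rows, cols, mean_rate=0.3, shape=2.0, seed=None):
    """Sample a binary patch mask from a discretised Cox process.

    Assumption: each patch gets its own random intensity, gamma-distributed
    with mean `mean_rate`. Given that intensity, the number of points in the
    patch is Poisson, so at least one point lands there (patch masked) with
    probability 1 - exp(-intensity).
    """
    rng = random.Random(seed)
    mask = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            # gammavariate(alpha, beta) has mean alpha * beta
            intensity = rng.gammavariate(shape, mean_rate / shape)
            p_masked = 1.0 - math.exp(-intensity)
            row.append(rng.random() < p_masked)
        mask.append(row)
    return mask


def apply_mask(patches, mask):
    """Zero out masked patch embeddings (patches: rows x cols x dim lists)."""
    return [
        [[0.0] * len(vec) if masked else list(vec)
         for vec, masked in zip(patch_row, mask_row)]
        for patch_row, mask_row in zip(patches, mask)
    ]


def invariance_loss(feat_clean, feat_masked):
    """Mean squared error between two equally sized flat feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(feat_clean, feat_masked)) / len(feat_clean)
```

In training, one would pass both the clean and the masked target patches through the same backbone and penalise the difference between the resulting target features with a loss of this kind, alongside the usual tracking loss.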
