Correlation of Object Detection Performance with Visual Saliency and Depth Estimation

November 5, 2024
Authors: Matthias Bartolo, Dylan Seychell
cs.AI

Abstract

As object detection techniques continue to evolve, understanding their relationships with complementary visual tasks becomes crucial for optimising model architectures and computational resources. This paper investigates the correlations between object detection accuracy and two fundamental visual tasks: depth prediction and visual saliency prediction. Through comprehensive experiments using state-of-the-art models (DeepGaze IIE, Depth Anything, DPT-Large, and Itti's model) on COCO and Pascal VOC datasets, we find that visual saliency shows consistently stronger correlations with object detection accuracy (mAρ up to 0.459 on Pascal VOC) compared to depth prediction (mAρ up to 0.283). Our analysis reveals significant variations in these correlations across object categories, with larger objects showing correlation values up to three times higher than smaller objects. These findings suggest incorporating visual saliency features into object detection architectures could be more beneficial than depth information, particularly for specific object categories. The observed category-specific variations also provide insights for targeted feature engineering and dataset design improvements, potentially leading to more efficient and accurate object detection systems.
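
The abstract reports a mean correlation figure across object categories but does not define how it is computed. The sketch below is one plausible reading, not the authors' method: it assumes a Pearson correlation is computed per category between a detection-quality score and a saliency-derived (or depth-derived) score, and that the per-category values are then averaged. The function names `pearson` and `mean_category_correlation` and the input layout are hypothetical, introduced only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def mean_category_correlation(per_category):
    """Average the per-category Pearson correlations.

    per_category maps a category name to a pair
    (detection_scores, auxiliary_scores), one value per sample.
    This layout is an assumption, not taken from the paper.
    Returns (mean correlation, dict of per-category correlations).
    """
    if not per_category:
        raise ValueError("no categories supplied")
    rhos = {
        category: pearson(detection, auxiliary)
        for category, (detection, auxiliary) in per_category.items()
    }
    return sum(rhos.values()) / len(rhos), rhos
```

Keeping the per-category values alongside the mean matters here, since the abstract stresses that the correlations vary considerably between categories and a single average would hide that spread.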

