NL-Eye: Abductive NLI for Images
October 3, 2024
Authors: Mor Ventura, Michael Toker, Nitay Calderon, Zorik Gekhman, Yonatan Bitton, Roi Reichart
cs.AI
Abstract
Will a Visual Language Model (VLM)-based bot warn us about slipping if it
detects a wet floor? Recent VLMs have demonstrated impressive capabilities, yet
their ability to infer outcomes and causes remains underexplored. To address
this, we introduce NL-Eye, a benchmark designed to assess VLMs' visual
abductive reasoning skills. NL-Eye adapts the abductive Natural Language
Inference (NLI) task to the visual domain, requiring models to evaluate the
plausibility of hypothesis images based on a premise image and explain their
decisions. NL-Eye consists of 350 carefully curated triplet examples (1,050
images) spanning diverse reasoning categories: physical, functional, logical,
emotional, cultural, and social. The data curation process involved two steps:
writing textual descriptions and generating images using text-to-image models,
both requiring substantial human involvement to ensure high-quality and
challenging scenes. Our experiments show that VLMs struggle significantly on
NL-Eye, often performing at random baseline levels, while humans excel in both
plausibility prediction and explanation quality. This demonstrates a deficiency
in the abductive reasoning capabilities of modern VLMs. NL-Eye represents a
crucial step toward developing VLMs capable of robust multimodal reasoning for
real-world applications, including accident-prevention bots and generated video
verification.
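To make the plausibility-prediction setup concrete, the following is a minimal Python sketch of how a triplet and a chance-level baseline might be represented. It is not the authors' evaluation code. It assumes that each triplet consists of one premise image and two hypothesis images (consistent with 350 triplets giving 1,050 images) and that the task is a two-way choice of the more plausible hypothesis. The names `Triplet`, `accuracy`, and `random_predictor`, and all field names, are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Sequence

# Reasoning categories listed in the abstract.
CATEGORIES = ("physical", "functional", "logical", "emotional", "cultural", "social")


@dataclass(frozen=True)
class Triplet:
    """One benchmark example (hypothetical layout, not the official schema)."""
    premise: str         # identifier or path of the premise image
    hypothesis_a: str    # identifier or path of the first hypothesis image
    hypothesis_b: str    # identifier or path of the second hypothesis image
    more_plausible: str  # assumed gold label: "a" or "b"
    category: str        # one of CATEGORIES


def accuracy(triplets: Sequence[Triplet], predict: Callable[[Triplet], str]) -> float:
    """Fraction of triplets for which `predict` returns the gold label."""
    if not triplets:
        raise ValueError("need at least one triplet")
    correct = sum(predict(t) == t.more_plausible for t in triplets)
    return correct / len(triplets)


def random_predictor(seed: int = 0) -> Callable[[Triplet], str]:
    """Chance baseline: picks "a" or "b" uniformly, ignoring the images."""
    rng = random.Random(seed)
    return lambda _triplet: rng.choice(("a", "b"))
```

Under these assumptions, a model whose accuracy is close to that of `random_predictor` (about 0.5 in expectation for a two-way choice) would correspond to the "random baseline levels" mentioned in the abstract. The paper's actual scoring protocol may differ from this sketch.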