Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

November 24, 2024
Authors: Nasrin Imanpour, Shashwat Bajpai, Subhankar Ghosh, Sainath Reddy Sankepally, Abhilekh Borah, Hasnat Md Abdullah, Nishoak Kosaraju, Shreyas Dixit, Ashhar Aziz, Shwetangshu Biswas, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das
cs.AI

Abstract

The proliferation of AI techniques for image generation, coupled with their increasing accessibility, has raised significant concerns about the potential misuse of these images to spread misinformation. Recent AI-generated image detection (AGID) methods include CNNDetection, NPR, DM Image Detection, Fake Image Detection, DIRE, LASTED, GAN Image Detection, AIDE, SSP, DRCT, RINE, OCC-CLIP, De-Fake, and Deep Fake Detection. However, we argue that the current state-of-the-art AGID techniques are inadequate for effectively detecting contemporary AI-generated images and advocate for a comprehensive reevaluation of these methods. We introduce the Visual Counter Turing Test (VCT^2), a benchmark comprising ~130K images generated by contemporary text-to-image models (Stable Diffusion 2.1, Stable Diffusion XL, Stable Diffusion 3, DALL-E 3, and Midjourney 6). VCT^2 includes two sets of prompts sourced from tweets by the New York Times Twitter account and captions from the MS COCO dataset. We also evaluate the performance of the aforementioned AGID techniques on the VCT^2 benchmark, highlighting their ineffectiveness in detecting AI-generated images. As image-generative AI models continue to evolve, the need for a quantifiable framework to evaluate these models becomes increasingly critical. To meet this need, we propose the Visual AI Index (V_AI), which assesses generated images from various visual perspectives, including texture complexity and object coherence, setting a new standard for evaluating image-generative AI models. To foster research in this domain, we make our COCO_AI (https://huggingface.co/datasets/anonymous1233/COCO_AI) and twitter_AI (https://huggingface.co/datasets/anonymous1233/twitter_AI) datasets publicly available.
