Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)
November 24, 2024
Authors: Nasrin Imanpour, Shashwat Bajpai, Subhankar Ghosh, Sainath Reddy Sankepally, Abhilekh Borah, Hasnat Md Abdullah, Nishoak Kosaraju, Shreyas Dixit, Ashhar Aziz, Shwetangshu Biswas, Vinija Jain, Aman Chadha, Amit Sheth, Amitava Das
cs.AI
Abstract
The proliferation of AI techniques for image generation, coupled with their
increasing accessibility, has raised significant concerns about the potential
misuse of these images to spread misinformation. Recent AI-generated image
detection (AGID) methods include CNNDetection, NPR, DM Image Detection, Fake
Image Detection, DIRE, LASTED, GAN Image Detection, AIDE, SSP, DRCT, RINE,
OCC-CLIP, De-Fake, and Deep Fake Detection. However, we argue that the current
state-of-the-art AGID techniques are inadequate for effectively detecting
contemporary AI-generated images and advocate for a comprehensive reevaluation
of these methods. We introduce the Visual Counter Turing Test (VCT^2), a
benchmark comprising ~130K images generated by contemporary text-to-image
models (Stable Diffusion 2.1, Stable Diffusion XL, Stable Diffusion 3, DALL-E
3, and Midjourney 6). VCT^2 includes two sets of prompts sourced from tweets by
the New York Times Twitter account and captions from the MS COCO dataset. We
also evaluate the performance of the aforementioned AGID techniques on the
VCT^2 benchmark, highlighting their ineffectiveness in detecting AI-generated
images. As image-generative AI models continue to evolve, the need for a
quantifiable framework to evaluate these models becomes increasingly critical.
To meet this need, we propose the Visual AI Index (V_AI), which assesses
generated images from various visual perspectives, including texture complexity
and object coherence, setting a new standard for evaluating image-generative AI
models. To foster research in this domain, we make our COCO_AI
(https://huggingface.co/datasets/anonymous1233/COCO_AI) and twitter_AI
(https://huggingface.co/datasets/anonymous1233/twitter_AI) datasets publicly
available.