Infecting Generative AI With Viruses

January 9, 2025
Authors: David Noever, Forrest McKee
cs.AI

Abstract

This study demonstrates a novel approach to testing the security boundaries of vision-large language models (VLM/LLM) using the EICAR test file embedded within JPEG images. We successfully executed four distinct protocols across multiple LLM platforms, including OpenAI GPT-4o, Microsoft Copilot, Google Gemini 1.5 Pro, and Anthropic Claude 3.5 Sonnet. The experiments validated that a modified JPEG file containing the EICAR signature could be uploaded, manipulated, and potentially executed within LLM virtual workspaces. Key findings include: 1) a consistent ability to mask the EICAR string in image metadata without detection, 2) successful extraction of the test file using Python-based manipulation within LLM environments, and 3) demonstration of multiple obfuscation techniques, including base64 encoding and string reversal. This research extends Microsoft Research's "Penetration Testing Rules of Engagement" framework to evaluate the security boundaries of cloud-based generative AI and LLMs, focusing in particular on file handling and execution capabilities within containerized environments.
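The embedding and obfuscation steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' exact protocol: it hides the standard (harmless) EICAR test string inside a JPEG comment (COM) segment after applying the two obfuscation techniques the abstract mentions, string reversal and base64 encoding. The helper names and the choice of the COM segment as the carrier are assumptions for illustration.

```python
import base64

# The standard 68-byte EICAR antivirus test string (harmless by design).
EICAR = (b"X5O!P%@AP[4\\PZX54(P^)7CC)7}$"
         b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*")

def embed_in_jpeg_comment(payload: bytes) -> bytes:
    """Wrap an obfuscated payload in a minimal JPEG stream via a COM segment."""
    obfuscated = base64.b64encode(payload[::-1])  # reverse, then base64-encode
    length = len(obfuscated) + 2                  # JPEG segment length field counts its own 2 bytes
    return (b"\xff\xd8"                                             # SOI marker
            + b"\xff\xfe" + length.to_bytes(2, "big") + obfuscated  # COM segment
            + b"\xff\xd9")                                          # EOI marker

def extract_from_jpeg_comment(jpeg: bytes) -> bytes:
    """Locate the COM segment and undo the obfuscation."""
    idx = jpeg.index(b"\xff\xfe")
    length = int.from_bytes(jpeg[idx + 2: idx + 4], "big")
    obfuscated = jpeg[idx + 4: idx + 2 + length]
    return base64.b64decode(obfuscated)[::-1]     # decode, then un-reverse

jpeg = embed_in_jpeg_comment(EICAR)
assert extract_from_jpeg_comment(jpeg) == EICAR
```

A real JPEG would carry image data between the markers; the point here is only that a comment segment survives naive handling while keeping the signature out of plain-text view, which mirrors the paper's masking-then-extraction workflow.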

