CLEAR: Character Unlearning in Textual and Visual Modalities
October 23, 2024
Authors: Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Y. Rogov, Ivan Oseledets, Elena Tutubalina
cs.AI
Abstract
Machine Unlearning (MU) is critical for enhancing privacy and security in
deep learning models, particularly in large multimodal language models (MLLMs),
by removing specific private or hazardous information. While MU has made
significant progress in textual and visual modalities, multimodal unlearning
(MMU) remains significantly underexplored, partially due to the absence of a
suitable open-source benchmark. To address this, we introduce CLEAR, a new
benchmark designed to evaluate MMU methods. CLEAR contains 200 fictitious
individuals and 3,700 images linked with corresponding question-answer pairs,
enabling a thorough evaluation across modalities. We assess 10 MU methods,
adapting them for MMU, and highlight new challenges specific to multimodal
forgetting. We also demonstrate that simple ℓ1 regularization on LoRA
weights significantly mitigates catastrophic forgetting, preserving model
performance on retained data. The dataset is available at
https://huggingface.co/datasets/therem/CLEAR
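To make the ℓ1-regularization idea concrete, below is a minimal PyTorch sketch of adding an ℓ1 penalty on LoRA adapter weights to a training objective. It assumes a PEFT-style model whose LoRA parameters are identifiable by the substring "lora_" in their names; the coefficient `l1_lambda` and the placeholder `unlearning_loss` are illustrative assumptions, not details from the paper.

```python
import torch

def l1_lora_penalty(model, l1_lambda=1e-4):
    """Return l1_lambda times the sum of absolute values of all LoRA weights.

    Assumes LoRA parameters carry 'lora_' in their names, as in common
    PEFT implementations; the naming filter and coefficient are
    illustrative, not taken from the paper.
    """
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, param in model.named_parameters():
        if "lora_" in name and param.requires_grad:
            penalty = penalty + param.abs().sum()
    return l1_lambda * penalty

# Hypothetical training step: the penalty is added to whatever unlearning
# objective is being optimized, keeping adapter updates sparse so the
# model drifts less from its behavior on retained data.
#
#   loss = unlearning_loss(model, forget_batch, retain_batch)
#   loss = loss + l1_lora_penalty(model, l1_lambda=1e-4)
#   loss.backward()
```

Because the penalty touches only the low-rank adapter weights, the base model parameters are untouched and the sparsity pressure acts purely on the unlearning update, which is consistent with the abstract's claim that this mitigates catastrophic forgetting on retained data.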