

SEAL: Entangled White-box Watermarks on Low-Rank Adaptation

January 16, 2025
Authors: Giyeong Oh, Saejin Kim, Woohyun Cho, Sangkyu Lee, Jiwan Chung, Dokyung Song, Youngjae Yu
cs.AI

Abstract
Recently, LoRA and its variants have become the de facto strategy for training and sharing task-specific versions of large pretrained models, thanks to their efficiency and simplicity. However, copyright protection for LoRA weights, especially through watermark-based techniques, remains underexplored. To address this gap, we propose SEAL (SEcure wAtermarking on LoRA weights), a universal white-box watermarking technique for LoRA. SEAL embeds a secret, non-trainable matrix between the trainable LoRA weights, which serves as a passport for claiming ownership. SEAL then entangles the passport with the LoRA weights through training, without any extra entanglement loss, and distributes the fine-tuned weights after hiding the passport. When applying SEAL, we observed no performance degradation across commonsense reasoning, textual/visual instruction tuning, and text-to-image synthesis tasks. We demonstrate that SEAL is robust against a variety of known attacks: removal, obfuscation, and ambiguity attacks.
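The core idea in the abstract — a secret, non-trainable matrix inserted between the trainable LoRA factors — can be illustrated with a minimal numpy sketch. This is an assumption-laden illustration, not the paper's implementation: it assumes the passport `C` enters the low-rank update as `B @ C @ A` (standard LoRA uses `B @ A`), with `C` fixed while `A` and `B` remain trainable. All names and shapes here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 64, 8  # hypothetical model dimension and LoRA rank

# Trainable LoRA factors (standard LoRA update: delta_W = B @ A)
A = rng.normal(size=(r, d)) * 0.01  # down-projection, small random init
B = np.zeros((d, r))                # up-projection, zero init

# Sketch of the passport: a secret matrix that is fixed (non-trainable)
# and sits between the trainable factors.
C = rng.normal(size=(r, r))

def delta_w(A, B, C):
    """Low-rank update with the passport inserted between the factors."""
    return B @ C @ A  # (d, d)

# Because gradients to A and B always flow through the fixed C during
# fine-tuning, the learned factors become entangled with the passport
# without any additional loss term.
print(delta_w(A, B, C).shape)
```

A plausible consequence of this design is that an owner who holds `C` can verify it against distributed weights, while a party without `C` cannot cleanly factor the update back into independent LoRA weights.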

