Robust Multi-bit Text Watermark with LLM-based Paraphrasers

Abstract

We propose an imperceptible multi-bit text watermark embedded through LLM paraphrasing. We fine-tune two LLM paraphrasers designed to behave differently, so that the differences between their paraphrases, reflected in the text semantics, can be identified by a trained decoder. To embed the multi-bit watermark, we switch between the two paraphrasers to encode a predefined binary code at the sentence level. A text classifier then serves as the decoder, recovering each bit of the watermark. Through extensive experiments, we show that our watermarks achieve over 99.99% detection AUC with small (1.1B-parameter) text paraphrasers while preserving the semantic information of the original sentence. Moreover, our pipeline is robust to word substitution and sentence paraphrasing perturbations, and generalizes well to out-of-distribution data. We also demonstrate the stealthiness of our watermark with LLM-based evaluation. The code is open source: https://github.com/xiaojunxu/multi-bit-text-watermark.

English

We propose an imperceptible multi-bit text watermark embedded by paraphrasing with LLMs. We fine-tune a pair of LLM paraphrasers that are designed to behave differently, so that the differences in their paraphrases, reflected in the text semantics, can be identified by a trained decoder. To embed our multi-bit watermark, we switch between the two paraphrasers to encode the predefined binary code at the sentence level. We then use a text classifier as the decoder to recover each bit of the watermark. Through extensive experiments, we show that our watermarks achieve over 99.99% detection AUC with small (1.1B) text paraphrasers while preserving the semantic information of the original sentence. More importantly, our pipeline is robust to word substitution and sentence paraphrasing perturbations and generalizes well to out-of-distribution data. We also show the stealthiness of our watermark with LLM-based evaluation. We open-source the code: https://github.com/xiaojunxu/multi-bit-text-watermark.
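The embedding and decoding flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `embed_bits`, `decode_bits`, `toy_paraphraser_0`, `toy_paraphraser_1`, and `toy_decoder` are hypothetical, the toy functions merely stand in for the fine-tuned LLM paraphrasers and the trained text classifier, and the sketch assumes exactly one bit per sentence.

```python
from typing import Callable, List

Paraphraser = Callable[[str], str]
BitDecoder = Callable[[str], int]


def embed_bits(sentences: List[str], bits: List[int],
               paraphraser_0: Paraphraser,
               paraphraser_1: Paraphraser) -> List[str]:
    """Rewrite each sentence with the paraphraser selected by its bit.

    Assumption for this sketch: one bit is embedded per sentence.
    """
    if len(sentences) != len(bits):
        raise ValueError("this sketch expects one bit per sentence")
    watermarked = []
    for sentence, bit in zip(sentences, bits):
        paraphrase = paraphraser_1 if bit else paraphraser_0
        watermarked.append(paraphrase(sentence))
    return watermarked


def decode_bits(sentences: List[str], decoder: BitDecoder) -> List[int]:
    """Classify each sentence to recover the bit it carries."""
    return [decoder(sentence) for sentence in sentences]


# Toy stand-ins (hypothetical): real paraphrasers would be fine-tuned LLMs
# and the decoder a trained text classifier. Letter case is used here only
# to make the two "styles" trivially separable.
def toy_paraphraser_0(sentence: str) -> str:
    return sentence.lower()


def toy_paraphraser_1(sentence: str) -> str:
    return sentence.upper()


def toy_decoder(sentence: str) -> int:
    return int(sentence.isupper())


if __name__ == "__main__":
    text = ["The cat sat.", "It rained.", "We left."]
    code = [1, 0, 1]
    marked = embed_bits(text, code, toy_paraphraser_0, toy_paraphraser_1)
    print(marked)
    print(decode_bits(marked, toy_decoder))
```

In the actual method the two paraphrasers differ in ways reflected in the text semantics rather than in a surface feature such as letter case, so the decoder has to be learned rather than hand-written.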
