Robust and Fine-Grained Detection of AI Generated Texts
April 16, 2025
Authors: Ram Mohan Rao Kadiyala, Siddartha Pullakhandam, Kanwal Mehreen, Drishti Sharma, Siddhant Gupta, Jebish Purbey, Ashay Srivastava, Subhasya TippaReddy, Arvind Reddy Bobbili, Suraj Telugara Chandrashekhar, Modabbir Adeeb, Srinadh Vura, Hamza Farooq
cs.AI
Abstract
An ideal detection system for machine-generated content should work well on
any generator, especially as more advanced LLMs continue to emerge. Existing
systems often struggle to accurately identify AI-generated content in shorter
texts. Further, not all texts are entirely authored by a human or an LLM, so
we focus on the partial case: human-LLM co-authored texts. Our paper
introduces a set of models built for token classification, trained on an
extensive collection of human-machine co-authored texts, which perform well
on texts from unseen domains, texts from unseen generators, texts by
non-native speakers, and adversarial inputs. We also introduce a new dataset
of over 2.4M such texts, mostly co-authored by several popular proprietary
LLMs across 23 languages. We also present our models' performance on texts
from each domain and each generator. Additional findings include comparisons
of performance against each adversarial method, the effect of input text
length, and characteristics of generated texts compared to the original
human-authored texts.
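To make the token-classification framing concrete, here is a minimal sketch (not the paper's actual model or dataset): each token receives a binary label (0 = human, 1 = AI), and contiguous runs of AI labels are merged into predicted machine-authored spans. The tokenizer, labels, and helper function below are illustrative assumptions, standing in for the trained classifier's output.

```python
# Illustrative sketch of the token-classification framing for detecting
# AI-authored spans in human-LLM co-authored text. A real system would
# produce the per-token labels with a trained model; here they are given.

def spans_from_token_labels(tokens, labels):
    """Merge per-token 0/1 labels into (start, end, text) AI-authored spans."""
    spans, start = [], None
    for i, lab in enumerate(labels):
        if lab == 1 and start is None:
            start = i                      # an AI span begins here
        elif lab != 1 and start is not None:
            spans.append((start, i, " ".join(tokens[start:i])))
            start = None                   # the AI span just ended
    if start is not None:                  # span runs to the end of the text
        spans.append((start, len(labels), " ".join(tokens[start:])))
    return spans

# Hypothetical co-authored text: the opening is human, the rest machine.
tokens = "I wrote this intro . The model finished the rest .".split()
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(spans_from_token_labels(tokens, labels))
# → [(5, 11, 'The model finished the rest .')]
```

Recovering character- or token-level spans, rather than a single document-level verdict, is what allows such a detector to handle the partially co-authored texts the paper targets.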