People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Abstract

In this paper, we study how well humans can detect text generated by commercial LLMs (GPT-4o, Claude, o1). We hire annotators to read 300 non-fiction English articles, label them as either human-written or AI-generated, and provide paragraph-length explanations for their decisions. Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such "expert" annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated, even in the presence of evasion tactics like paraphrasing and humanization. Qualitative analysis of the experts' free-form explanations shows that while they rely heavily on specific lexical clues ('AI vocabulary'), they also pick up on more complex phenomena within the text (e.g., formality, originality, clarity) that are challenging to assess for automatic detectors. We release our annotated dataset and code to spur future research into both human and automated detection of AI-generated text.
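The majority-vote aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the label names (`"human"`, `"ai"`), the function names, and the toy votes below are assumptions made purely for illustration.

```python
from collections import Counter

LABELS = {"human", "ai"}  # assumed label names, not taken from the released dataset


def majority_vote(votes):
    """Return the label chosen by most annotators for one article."""
    if len(votes) % 2 == 0:
        raise ValueError("use an odd number of votes so a binary vote cannot tie")
    unknown = set(votes) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return Counter(votes).most_common(1)[0][0]


def count_misclassified(votes_per_article, gold_labels):
    """Count articles whose majority label disagrees with the gold label."""
    if len(votes_per_article) != len(gold_labels):
        raise ValueError("need one gold label per article")
    return sum(
        majority_vote(votes) != gold
        for votes, gold in zip(votes_per_article, gold_labels)
    )


# Toy data for illustration only; these are not the paper's annotations.
toy_votes = [
    ["ai", "ai", "human", "ai", "ai"],
    ["human", "human", "human", "ai", "human"],
    ["ai", "human", "human", "ai", "human"],
]
toy_gold = ["ai", "human", "ai"]

print(count_misclassified(toy_votes, toy_gold))
```

With five annotators and two labels, a tie is impossible, which is why the sketch rejects an even number of votes rather than choosing a tie-breaking rule.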
