

Understanding and Predicting Derailment in Toxic Conversations on GitHub

March 4, 2025
Authors: Mia Mohammad Imran, Robert Zita, Rebekah Copeland, Preetha Chatterjee, Rahat Rizvi Rahman, Kostadin Damevski
cs.AI

Abstract

Software projects thrive on the involvement and contributions of individuals from different backgrounds. However, toxic language and negative interactions can hinder the participation and retention of contributors and alienate newcomers. Proactive moderation strategies aim to prevent toxicity from occurring by addressing conversations that have derailed from their intended purpose. This study aims to understand and predict conversational derailment leading to toxicity on GitHub. To facilitate this research, we curate a novel dataset comprising 202 toxic conversations from GitHub with annotated derailment points, along with 696 non-toxic conversations as a baseline. Based on this dataset, we identify unique characteristics of toxic conversations and derailment points, including linguistic markers such as second-person pronouns, negation terms, and tones of Bitter Frustration and Impatience, as well as patterns in conversational dynamics between project contributors and external participants. Leveraging these empirical observations, we propose a proactive moderation approach to automatically detect and address potentially harmful conversations before escalation. By utilizing modern LLMs, we develop a conversation trajectory summary technique that captures the evolution of discussions and identifies early signs of derailment. Our experiments demonstrate that LLM prompts tailored to provide summaries of GitHub conversations achieve 69% F1-Score in predicting conversational derailment, strongly improving over a set of baseline approaches.
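The abstract's core pipeline (summarize a conversation's trajectory, then ask an LLM whether it shows early signs of derailment) can be sketched as below. This is a minimal illustration only: the prompt wording, the comment schema, and the stubbed `query_llm` function are assumptions for the sketch, not the authors' actual prompts or code. A keyword heuristic over second-person pronouns and negation terms (two of the linguistic markers the paper reports) stands in for the real LLM call so the example runs end to end.

```python
# Sketch of the proactive-moderation idea from the abstract: build a
# trajectory-summary prompt for a GitHub thread, then query a model for a
# YES/NO derailment judgment. query_llm is a hypothetical placeholder.

def build_trajectory_prompt(comments):
    """Format a GitHub issue thread as a trajectory-summary prompt."""
    turns = "\n".join(
        f"[{i + 1}] {c['author']} ({c['role']}): {c['body']}"
        for i, c in enumerate(comments)
    )
    return (
        "Summarize how the tone of this GitHub discussion evolves, "
        "then answer YES or NO: does it show early signs of derailing "
        "into toxicity?\n\n" + turns
    )

def query_llm(prompt):
    """Placeholder for a real LLM call. A trivial heuristic counting
    second-person pronouns and negation terms stands in here."""
    markers = ("you ", "you're", "never", "don't", "won't")
    hits = sum(prompt.lower().count(m) for m in markers)
    return "YES" if hits >= 3 else "NO"

def predict_derailment(comments):
    """True if the (stubbed) model flags the thread as derailing."""
    return query_llm(build_trajectory_prompt(comments)) == "YES"

# Hypothetical thread: a contributor's terse close followed by a heated
# reply from an external participant.
thread = [
    {"author": "alice", "role": "contributor",
     "body": "Closing as duplicate of an earlier report."},
    {"author": "bob", "role": "external",
     "body": "You never read the report. You're wrong and you don't care."},
]
print(predict_derailment(thread))
```

In the paper's actual setup, `query_llm` would be a call to a modern LLM with a prompt tailored to GitHub conversations; the 69% F1-score reported above refers to that LLM-based predictor, not to any simple heuristic like the stand-in here.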
