"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services
April 12, 2025
Authors: Shira Michel, Sufi Kaur, Sarah Elizabeth Gillespie, Jeffrey Gleason, Christo Wilson, Avijit Ghosh
cs.AI
Abstract
Recent advances in artificial intelligence (AI) speech generation and voice
cloning technologies have produced naturalistic speech and accurate voice
replication, yet their influence on sociotechnical systems across diverse
accents and linguistic traits is not fully understood. This study evaluates two
synthetic AI voice services (Speechify and ElevenLabs) through a mixed methods
approach using surveys and interviews to assess technical performance and
uncover how users' lived experiences influence their perceptions of accent
variations in these speech technologies. Our findings reveal technical
performance disparities across five regional, English-language accents and
demonstrate how current speech generation technologies may inadvertently
reinforce linguistic privilege and accent-based discrimination, potentially
creating new forms of digital exclusion. Overall, our study highlights the need
for inclusive design and regulation by providing actionable insights for
developers, policymakers, and organizations to ensure equitable and socially
responsible AI speech technologies.