"It's not a representation of me": Examining Accent Bias and Digital Exclusion in Synthetic AI Voice Services
April 12, 2025
Authors: Shira Michel, Sufi Kaur, Sarah Elizabeth Gillespie, Jeffrey Gleason, Christo Wilson, Avijit Ghosh
cs.AI
Abstract
Recent advances in artificial intelligence (AI) speech generation and voice
cloning technologies have produced naturalistic speech and accurate voice
replication, yet their influence on sociotechnical systems across diverse
accents and linguistic traits is not fully understood. This study evaluates two
synthetic AI voice services (Speechify and ElevenLabs) through a mixed-methods
approach using surveys and interviews to assess technical performance and
uncover how users' lived experiences influence their perceptions of accent
variations in these speech technologies. Our findings reveal technical
performance disparities across five regional English-language accents and
demonstrate how current speech generation technologies may inadvertently
reinforce linguistic privilege and accent-based discrimination, potentially
creating new forms of digital exclusion. Overall, our study highlights the need
for inclusive design and regulation by providing actionable insights for
developers, policymakers, and organizations to ensure equitable and socially
responsible AI speech technologies.