Histoires Morales: A French Dataset for Assessing Moral Alignment

January 28, 2025
Authors: Thibaud Leteno, Irina Proskurina, Antoine Gourru, Julien Velcin, Charlotte Laclau, Guillaume Metzler, Christophe Gravier
cs.AI

Abstract

Aligning language models with human values is crucial, especially as they become more integrated into everyday life. While models are often adapted to user preferences, it is equally important to ensure they align with moral norms and behaviours in real-world social situations. Despite significant progress in languages such as English and Chinese, French has received little attention in this area, leaving a gap in our understanding of how LLMs handle moral reasoning in this language. To address this gap, we introduce Histoires Morales, a French dataset derived from Moral Stories. It was created through translation and subsequently refined with the assistance of native speakers to guarantee grammatical accuracy and adaptation to the French cultural context. We also rely on annotations of the moral values within the dataset to ensure their alignment with French norms. Histoires Morales covers a wide range of social situations, including differences in tipping practices, expressions of honesty in relationships, and responsibilities toward animals. To foster future research, we also conduct preliminary experiments on the alignment of multilingual models on French and English data and on the robustness of that alignment. We find that while LLMs are generally aligned with human moral norms by default, they can be easily influenced by user-preference optimization on both moral and immoral data.
