Survey of Cultural Awareness in Language Models: Text and Beyond

Abstract

Large-scale deployment of large language models (LLMs) in applications such as chatbots and virtual assistants requires LLMs to be culturally sensitive to their users to ensure inclusivity. Culture has been widely studied in psychology and anthropology, and a recent surge of research on making LLMs more culturally inclusive goes beyond multilinguality and builds on findings from those fields. In this paper, we survey efforts towards incorporating cultural awareness into text-based and multimodal LLMs. We start by defining cultural awareness in LLMs, taking the definitions of culture from anthropology and psychology as a point of departure. We then examine the methodologies adopted for creating cross-cultural datasets, strategies for cultural inclusion in downstream tasks, and methodologies used for benchmarking cultural awareness in LLMs. Further, we discuss the ethical implications of cultural alignment, the role of Human-Computer Interaction in driving cultural inclusion in LLMs, and the role of cultural alignment in driving social science research. Finally, we provide pointers to future research based on the gaps we identified in the literature.
