Measuring Human and AI Values based on Generative Psychometrics with Large Language Models

Abstract

Human values and their measurement have long been a subject of interdisciplinary inquiry. Recent advances in AI have sparked renewed interest in this area, with large language models (LLMs) emerging as both tools and subjects of value measurement. This work introduces Generative Psychometrics for Values (GPV), an LLM-based, data-driven value measurement paradigm, theoretically grounded in text-revealed selective perceptions. We begin by fine-tuning an LLM for accurate perception-level value measurement and verifying the capability of LLMs to parse texts into perceptions, which together form the core of the GPV pipeline. Applying GPV to human-authored blogs, we demonstrate its stability, validity, and superiority over prior psychological tools. Then, extending GPV to LLM value measurement, we advance the state of the art with 1) a psychometric methodology that measures LLM values based on their scalable and free-form outputs, enabling context-specific measurement; 2) a comparative analysis of measurement paradigms, indicating response biases of prior methods; and 3) an attempt to bridge LLM values and their safety, revealing the predictive power of different value systems and the impacts of various values on LLM safety. Through interdisciplinary efforts, we aim to leverage AI for next-generation psychometrics and psychometrics for value-aligned AI.
