Evaluating the role of 'Constitutions' for learning from AI feedback

November 15, 2024
Authors: Saskia Redgate, Andrew M. Bean, Adam Mahdi
cs.AI

Abstract

The growing capabilities of large language models (LLMs) have led to their use as substitutes for human feedback for training and assessing other LLMs. These methods often rely on 'constitutions', written guidelines which a critic model uses to provide feedback and improve generations. We investigate how the choice of constitution affects feedback quality by using four different constitutions to improve patient-centered communication in medical interviews. In pairwise comparisons conducted by 215 human raters, we found that detailed constitutions led to better results regarding emotive qualities. However, none of the constitutions outperformed the baseline in learning more practically-oriented skills related to information gathering and provision. Our findings indicate that while detailed constitutions should be prioritised, there are possible limitations to the effectiveness of AI feedback as a reward signal in certain areas.
