Evaluating the role of `Constitutions' for learning from AI feedback

November 15, 2024
Authors: Saskia Redgate, Andrew M. Bean, Adam Mahdi
cs.AI

Abstract

The growing capabilities of large language models (LLMs) have led to their use as substitutes for human feedback for training and assessing other LLMs. These methods often rely on `constitutions', written guidelines which a critic model uses to provide feedback and improve generations. We investigate how the choice of constitution affects feedback quality by using four different constitutions to improve patient-centered communication in medical interviews. In pairwise comparisons conducted by 215 human raters, we found that detailed constitutions led to better results regarding emotive qualities. However, none of the constitutions outperformed the baseline in learning more practically-oriented skills related to information gathering and provision. Our findings indicate that while detailed constitutions should be prioritised, there are possible limitations to the effectiveness of AI feedback as a reward signal in certain areas.
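The abstract describes a critique-and-revise setup in which a critic model reads a constitution and uses it to give feedback that improves a draft response. The sketch below is a minimal illustration of that general pattern, not the authors' implementation: the generate function, the critique_and_revise helper, and the example constitution text are all assumptions introduced here for clarity.

```python
# Minimal sketch (not the paper's code) of a constitution-guided
# critique-and-revise step for a medical-interview reply.

# Hypothetical constitution text; the paper compares four constitutions
# of varying detail, and swapping this string is the only change needed.
CONSTITUTION = (
    "The doctor's reply should acknowledge the patient's emotions, "
    "use plain language, and invite follow-up questions."
)

def generate(prompt: str) -> str:
    """Placeholder for a call to any LLM completion API."""
    raise NotImplementedError

def critique_and_revise(draft: str, transcript: str) -> str:
    # 1. A critic model checks the draft against the constitution.
    critique = generate(
        f"Constitution:\n{CONSTITUTION}\n\n"
        f"Interview so far:\n{transcript}\n\n"
        f"Draft reply:\n{draft}\n\n"
        "Identify where the draft violates the constitution."
    )
    # 2. The same (or another) model revises the draft using that critique.
    revision = generate(
        f"Constitution:\n{CONSTITUTION}\n\n"
        f"Draft reply:\n{draft}\n\n"
        f"Critique:\n{critique}\n\n"
        "Rewrite the reply so it follows the constitution."
    )
    return revision
```

The revised outputs from such a loop can then serve as training or evaluation signal, which is the role of AI feedback examined in the paper.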
