Evaluating the role of 'Constitutions' for learning from AI feedback

Abstract

The growing capabilities of large language models (LLMs) have led to their use as substitutes for human feedback for training and assessing other LLMs. These methods often rely on 'constitutions', written guidelines which a critic model uses to provide feedback and improve generations. We investigate how the choice of constitution affects feedback quality by using four different constitutions to improve patient-centered communication in medical interviews. In pairwise comparisons conducted by 215 human raters, we found that detailed constitutions led to better results regarding emotive qualities. However, none of the constitutions outperformed the baseline in learning more practically oriented skills related to information gathering and provision. Our findings indicate that while detailed constitutions should be prioritised, there are possible limitations to the effectiveness of AI feedback as a reward signal in certain areas.
