LearnLM: Improving Gemini for Learning

December 21, 2024
Authors: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Brett Wiltshire, Brian Veprek, Daniel Gillick, Daniel Kasenberg, Derek Ahmed, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin McKee, Lisa Wang, Markus Kunesch, Mike Schaekermann, Miruna Pîslar, Nikhil Joshi, Parsa Mahmoudieh, Paul Jhun, Sara Wiltberger, Shakir Mohamed, Shashank Agarwal, Shubham Milind Phal, Sun Jae Lee, Theofilos Strinopoulos, Wei-Jen Ko, Amy Wang, Ankit Anand, Avishkar Bhoopchand, Dan Wild, Divya Pandya, Filip Bar, Garth Graham, Holger Winnemoeller, Mahvish Nagda, Prateek Kolhar, Renee Schneider, Shaojian Zhu, Stephanie Chan, Steve Yadlowsky, Viknesh Sounderajah, Yannis Assael
cs.AI

Abstract

Today's generative AI systems are tuned to present information by default rather than engage users in service of learning as a human tutor would. To address the wide range of potential education use cases for these systems, we reframe the challenge of injecting pedagogical behavior as one of pedagogical instruction following, where training and evaluation examples include system-level instructions describing the specific pedagogy attributes present or desired in subsequent model turns. This framing avoids committing our models to any particular definition of pedagogy, and instead allows teachers or developers to specify desired model behavior. It also clears a path to improving Gemini models for learning, by enabling the addition of our pedagogical data to post-training mixtures, alongside their rapidly expanding set of capabilities. Both represent important changes from our initial tech report. We show how training with pedagogical instruction following produces a LearnLM model (available on Google AI Studio) that is preferred substantially by expert raters across a diverse set of learning scenarios, with average preference strengths of 31% over GPT-4o, 11% over Claude 3.5, and 13% over the Gemini 1.5 Pro model LearnLM was based on.
