LearnLM:改善用於學習的Gemini
LearnLM: Improving Gemini for Learning
December 21, 2024
作者: LearnLM Team, Abhinit Modi, Aditya Srikanth Veerubhotla, Aliya Rysbek, Andrea Huber, Brett Wiltshire, Brian Veprek, Daniel Gillick, Daniel Kasenberg, Derek Ahmed, Irina Jurenka, James Cohan, Jennifer She, Julia Wilkowski, Kaiz Alarakyia, Kevin McKee, Lisa Wang, Markus Kunesch, Mike Schaekermann, Miruna Pîslar, Nikhil Joshi, Parsa Mahmoudieh, Paul Jhun, Sara Wiltberger, Shakir Mohamed, Shashank Agarwal, Shubham Milind Phal, Sun Jae Lee, Theofilos Strinopoulos, Wei-Jen Ko, Amy Wang, Ankit Anand, Avishkar Bhoopchand, Dan Wild, Divya Pandya, Filip Bar, Garth Graham, Holger Winnemoeller, Mahvish Nagda, Prateek Kolhar, Renee Schneider, Shaojian Zhu, Stephanie Chan, Steve Yadlowsky, Viknesh Sounderajah, Yannis Assael
cs.AI
摘要
當今的生成式人工智慧系統通常被調整為默認呈現資訊,而非像人類導師那樣與使用者互動以促進學習。為了應對這些系統在教育領域中的廣泛應用,我們重新定義了注入教學行為的挑戰,將其視為教學指導跟隨的挑戰,其中訓練和評估範例包括系統級指導,描述後續模型轉換中存在或期望的具體教學特徵。這種框架避免了將我們的模型限定於任何特定的教學定義,反而允許教師或開發人員指定期望的模型行為。這也為改進 Gemini 模型的學習能力鋪平了道路,通過將我們的教學數據添加到訓練後的混合中,與其快速擴展的功能集相結合。這兩者都代表了我們最初技術報告的重要變化。我們展示了如何使用教學指導跟隨進行訓練,產生了一個 LearnLM 模型(可在 Google AI Studio 上使用),在各種學習情境中,專家評分者明顯偏好該模型,平均偏好強度比 GPT-4o 高出 31%,比 Claude 3.5 高出 11%,比基於 Gemini 1.5 Pro 模型的 LearnLM 高出 13%。
English
Today's generative AI systems are tuned to present information by default
rather than engage users in service of learning as a human tutor would. To
address the wide range of potential education use cases for these systems, we
reframe the challenge of injecting pedagogical behavior as one of
pedagogical instruction following, where training and evaluation
examples include system-level instructions describing the specific pedagogy
attributes present or desired in subsequent model turns. This framing avoids
committing our models to any particular definition of pedagogy, and instead
allows teachers or developers to specify desired model behavior. It also clears
a path to improving Gemini models for learning -- by enabling the addition of
our pedagogical data to post-training mixtures -- alongside their rapidly
expanding set of capabilities. Both represent important changes from our
initial tech report. We show how training with pedagogical instruction
following produces a LearnLM model (available on Google AI Studio) that is
preferred substantially by expert raters across a diverse set of learning
scenarios, with average preference strengths of 31\% over GPT-4o, 11\% over
Claude 3.5, and 13\% over the Gemini 1.5 Pro model LearnLM was based on.Summary
AI-Generated Summary