Learning the Latent Rules of a Game from Data: A Chess Story

Abstract

We demonstrate that small pretrained foundational generative language models with millions of parameters can learn the latent rules of a process from data associated with that process. Inspired by Stefan Zweig's novella "Schachnovelle," known in English as "The Royal Game," we show that pretrained foundational small language models (SLMs) with 28M and 125M parameters can be instruction fine-tuned on 1,000 to 1,000,000 examples to learn the rules of chess, propose legal moves, and accurately solve chess problems. We also explore how successive fine-tuning epochs affect results and show that increasing the number of instruction fine-tuning examples reduces model hallucinations.

