On Memorization of Large Language Models in Logical Reasoning

Abstract

Large language models (LLMs) achieve good performance on challenging reasoning benchmarks, yet can also make basic reasoning mistakes. This contrasting behavior is puzzling when it comes to understanding the mechanisms behind LLMs' reasoning capabilities. One hypothesis is that the increasingly high and nearly saturated performance on common reasoning benchmarks could be due to the memorization of similar problems. In this paper, we systematically investigate this hypothesis with a quantitative measurement of memorization in reasoning tasks, using a dynamically generated logical reasoning benchmark based on Knights and Knaves (K&K) puzzles. We find that LLMs can interpolate the training puzzles (achieving near-perfect accuracy) after fine-tuning, yet fail when those puzzles are slightly perturbed, suggesting that the models rely heavily on memorization to solve those training puzzles. On the other hand, we show that while fine-tuning leads to heavy memorization, it also consistently improves generalization performance. In-depth analyses with perturbation tests, cross-difficulty-level transferability, probing of model internals, and fine-tuning with wrong answers suggest that the LLMs learn to reason on K&K puzzles despite training data memorization. This phenomenon indicates that LLMs exhibit a complex interplay between memorization and genuine reasoning abilities. Finally, our analysis with a per-sample memorization score sheds light on how LLMs switch between reasoning and memorization in solving logical puzzles. Our code and data are available at https://memkklogic.github.io.
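To make the perturbation idea concrete, here is a minimal sketch of a brute-force Knights and Knaves solver. It is not the authors' benchmark generator. It assumes only the standard puzzle rule (knights always tell the truth, knaves always lie), and the two-person puzzle, the `solve` helper, and the "and" to "or" perturbation are hypothetical illustrations.

```python
from itertools import product


def solve(names, statements):
    """Return every knight/knave assignment consistent with all statements.

    `statements` maps each speaker to a predicate over the assignment
    (True = knight, False = knave). A statement must be true exactly
    when its speaker is a knight.
    """
    solutions = []
    for values in product([True, False], repeat=len(names)):
        world = dict(zip(names, values))
        if all(world[s] == claim(world) for s, claim in statements.items()):
            solutions.append(world)
    return solutions


names = ["A", "B"]

# Hypothetical puzzle: A says "B is a knave"; B says "A and I are both knights".
original = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] and w["B"],
}

# Hypothetical perturbation: B now says "A or I am a knight" (one word changed).
perturbed = {
    "A": lambda w: not w["B"],
    "B": lambda w: w["A"] or w["B"],
}

print(solve(names, original))   # [{'A': True, 'B': False}]
print(solve(names, perturbed))  # [{'A': False, 'B': True}]
```

A one-word change flips the unique solution, so a model that has only memorized the answer to the original puzzle would get the perturbed one wrong, while a model that reasons over the statements would not.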
