On Memorization of Large Language Models in Logical Reasoning
October 30, 2024
Authors: Chulin Xie, Yangsibo Huang, Chiyuan Zhang, Da Yu, Xinyun Chen, Bill Yuchen Lin, Bo Li, Badih Ghazi, Ravi Kumar
cs.AI
Abstract
Large language models (LLMs) achieve good performance on challenging
reasoning benchmarks, yet could also make basic reasoning mistakes. This
contrasting behavior is puzzling when it comes to understanding the mechanisms
behind LLMs' reasoning capabilities. One hypothesis is that the increasingly
high and nearly saturated performance on common reasoning benchmarks could be
due to the memorization of similar problems. In this paper, we systematically
investigate this hypothesis with a quantitative measurement of memorization in
reasoning tasks, using a dynamically generated logical reasoning benchmark
based on Knights and Knaves (K&K) puzzles. We find that LLMs can interpolate
the training puzzles (achieving near-perfect accuracy) after fine-tuning, yet
fail when those puzzles are slightly perturbed, suggesting that the models
heavily rely on memorization to solve those training puzzles. On the other
hand, we show that while fine-tuning leads to heavy memorization, it also
consistently improves generalization performance. In-depth analyses with
perturbation tests, cross-difficulty-level transferability, probing model
internals, and fine-tuning with wrong answers suggest that the LLMs learn to
reason on K&K puzzles despite training data memorization. This phenomenon
indicates that LLMs exhibit a complex interplay between memorization and
genuine reasoning abilities. Finally, our analysis with per-sample memorization
score sheds light on how LLMs switch between reasoning and memorization in
solving logical puzzles. Our code and data are available at
https://memkklogic.github.io.
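
To ground the setup, below is a minimal sketch of a brute-force checker for K&K-style puzzles of the kind the benchmark builds on, together with one plausible per-sample memorization score based on failures under local perturbation. The `solve` helper, the example statements, and the `memorization_score` definition are illustrative assumptions, not the paper's released code or its exact metric.

```python
from itertools import product

# In a Knights-and-Knaves puzzle, each person is a knight (always truthful)
# or a knave (always lying). Each statement is a predicate over the full
# role assignment (True = knight); the answer is the set of assignments
# under which every knight's statement is true and every knave's is false.

def solve(names, statements):
    """Enumerate all role assignments consistent with the speakers' statements."""
    solutions = []
    for roles in product([True, False], repeat=len(names)):
        a = dict(zip(names, roles))
        if all(a[speaker] == pred(a) for speaker, pred in statements.items()):
            solutions.append(a)
    return solutions

# Example 2-person puzzle: A says "B is a knave"; B says "A and I are the same kind".
names = ["A", "B"]
statements = {
    "A": lambda a: not a["B"],
    "B": lambda a: a["A"] == a["B"],
}
print(solve(names, statements))  # [{'A': True, 'B': False}] -- unique solution

# Hypothetical per-sample memorization score (an assumption, not necessarily
# the paper's metric): a correctly solved training puzzle counts as memorized
# to the extent the model fails its locally perturbed variants.
def memorization_score(correct_on_original, correct_on_perturbed):
    if not correct_on_original or not correct_on_perturbed:
        return 0.0
    return sum(not c for c in correct_on_perturbed) / len(correct_on_perturbed)
```

Because the puzzles are generated and solved programmatically, perturbed variants (e.g., negating one statement or swapping roles) can be produced with ground-truth answers at no labeling cost, which is what makes the perturbation-based memorization analysis feasible at scale.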