CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Abstract

Reasoning is a fundamental capability of Large Language Models. While prior research predominantly focuses on enhancing narrow skills such as math or code generation, improving performance on many other reasoning tasks remains challenging because training data is sparse and fragmented. To address this issue, we propose CodeI/O, a novel approach that systematically condenses the diverse reasoning patterns inherently embedded in contextually grounded code by transforming the original code into a code input-output prediction format. By training models to predict inputs or outputs given code and test cases, entirely in natural language as Chain-of-Thought (CoT) rationales, we expose them to universal reasoning primitives (such as logic flow planning, state-space searching, decision tree traversal, and modular decomposition) while decoupling structured reasoning from code-specific syntax and preserving procedural rigor. Experimental results demonstrate that CodeI/O leads to consistent improvements across symbolic, scientific, logic, math and numerical, and commonsense reasoning tasks. By matching the existing ground-truth outputs or re-executing the code with predicted inputs, we can verify each prediction and further enhance the CoTs through multi-turn revision, resulting in CodeI/O++, which achieves higher performance. Our data and models are available at https://github.com/hkust-nlp/CodeIO.
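The verification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the keyword-argument input format, and the toy reference function `count_even` are all hypothetical. The sketch also assumes that a predicted input counts as correct whenever re-executing the code on it reproduces the target output.

```python
from typing import Any, Callable


def verify_output_prediction(predicted_output: Any, true_output: Any) -> bool:
    """Output prediction: compare against the stored ground-truth output."""
    return predicted_output == true_output


def verify_input_prediction(
    func: Callable[..., Any], predicted_input: dict, true_output: Any
) -> bool:
    """Input prediction: re-execute the code on the predicted input.

    Assumption: any input that reproduces the target output is accepted,
    so the prediction need not equal the input used to build the test case.
    """
    try:
        return func(**predicted_input) == true_output
    except Exception:
        # A predicted input that makes the code raise is treated as incorrect.
        return False


# Hypothetical reference function standing in for a transformed code sample.
def count_even(numbers: list) -> int:
    return sum(1 for n in numbers if n % 2 == 0)


output_ok = verify_output_prediction(2, count_even([1, 2, 3, 4]))
input_ok = verify_input_prediction(count_even, {"numbers": [10, 20]}, 2)
input_bad = verify_input_prediction(count_even, {"numbers": [1, 3]}, 2)
```

In a multi-turn revision loop of the kind the abstract mentions, a failed check such as `input_bad` would presumably be fed back to the model as a signal to revise its rationale; that feedback step is not shown here.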

