PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

Abstract

Recent agent frameworks and inference-time algorithms often struggle with complex planning problems, owing to limitations in verifying generated plans or reasoning and to the varying complexity of instances within a single task. Many existing methods for these tasks either perform task-level verification without considering constraints or apply inference-time algorithms without adapting to instance-level complexity. To address these limitations, we propose PlanGEN, a model-agnostic and easily scalable agent framework with three key components: constraint, verification, and selection agents. Specifically, our approach proposes constraint-guided iterative verification to enhance the performance of inference-time algorithms: Best of N, Tree-of-Thought, and REBASE. In the PlanGEN framework, the selection agent optimizes algorithm choice based on instance complexity, ensuring better adaptability to complex planning problems. Experimental results demonstrate significant improvements over the strongest baseline across multiple benchmarks, achieving state-of-the-art results on NATURAL PLAN (~8%↑), OlympiadBench (~4%↑), DocFinQA (~7%↑), and GPQA (~1%↑). Our key finding is that constraint-guided iterative verification improves inference-time algorithms, and adaptive selection further boosts performance on complex planning and reasoning problems.
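
To make the idea of constraint-guided verification concrete, the following is a minimal sketch of Best of N selection driven by a constraint-based score. It is an illustration under stated assumptions, not the PlanGEN implementation: the names `verify` and `best_of_n`, the scoring rule (fraction of constraints satisfied), and the toy scheduling constraints are all hypothetical, and simple string predicates stand in for the constraint and verification agents described in the abstract.

```python
from typing import Callable, Dict, List

# Hypothetical type: a constraint is a predicate over a candidate plan.
Constraint = Callable[[str], bool]


def verify(plan: str, constraints: Dict[str, Constraint]) -> float:
    """Toy stand-in for a verification agent.

    Scores a plan as the fraction of constraints it satisfies.
    (Assumption: the abstract does not specify the scoring rule.)
    """
    if not constraints:
        return 0.0
    satisfied = sum(1 for check in constraints.values() if check(plan))
    return satisfied / len(constraints)


def best_of_n(candidates: List[str], constraints: Dict[str, Constraint]) -> str:
    """Return the candidate with the highest verification score.

    On ties, the earliest candidate is kept.
    """
    return max(candidates, key=lambda plan: verify(plan, constraints))


# Hypothetical constraints for a toy meeting-scheduling instance.
constraints = {
    "on_monday": lambda plan: "Monday" in plan,
    "in_morning": lambda plan: "9am" in plan or "10am" in plan,
}

# In a real system these candidates would be sampled from a language model.
candidates = [
    "Meet on Tuesday at 9am",
    "Meet on Monday at 10am",
    "Meet on Monday at 8pm",
]

best = best_of_n(candidates, constraints)
print(best)
```

A selection agent in this picture would sit one level above, choosing between `best_of_n` and other inference-time algorithms according to how hard the instance appears; that step is omitted here because the abstract does not describe how the choice is made.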
