MPO: Boosting LLM Agents with Meta Plan Optimization
March 4, 2025
Authors: Weimin Xiong, Yifan Song, Qingxiu Dong, Bingchan Zhao, Feifan Song, Xun Wang, Sujian Li
cs.AI
Abstract
Recent advancements in large language models (LLMs) have enabled LLM-based
agents to successfully tackle interactive planning tasks. However, despite
their successes, existing approaches often suffer from planning hallucinations
and require retraining for each new agent. To address these challenges, we
propose the Meta Plan Optimization (MPO) framework, which enhances agent
planning capabilities by directly incorporating explicit guidance. Unlike
previous methods that rely on complex knowledge, which either require
significant human effort or lack quality assurance, MPO leverages high-level
general guidance through meta plans to assist agent planning and enables
continuous optimization of the meta plans based on feedback from the agent's
task execution. Our experiments conducted on two representative tasks
demonstrate that MPO significantly outperforms existing baselines. Moreover,
our analysis indicates that MPO provides a plug-and-play solution that enhances
both task completion efficiency and generalization capabilities in previously
unseen scenarios.
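
For intuition only, here is a minimal sketch of the loop the abstract describes: a meta plan supplies high-level guidance, the agent executes the task under that guidance, and feedback from the execution drives refinement of the meta plan. All names, the callable-based interface, and the reward-based selection below are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch (not the authors' code) of a meta-plan optimization loop.
# The callables generate, agent_run, score, and refine are hypothetical stand-ins
# for LLM calls and the task environment.

def optimize_meta_plan(task, generate, agent_run, score, refine, n_rounds=3):
    """Iteratively refine a high-level meta plan using feedback from
    the agent's task execution."""
    meta_plan = generate(task)                   # initial high-level guidance
    best_plan, best_score = meta_plan, float("-inf")

    for _ in range(n_rounds):
        trajectory = agent_run(task, meta_plan)  # agent plans and acts under the meta plan
        reward = score(trajectory)               # feedback signal from task execution
        if reward > best_score:                  # keep the best guidance seen so far
            best_plan, best_score = meta_plan, reward
        meta_plan = refine(meta_plan, trajectory, reward)  # optimize the meta plan
    return best_plan
```

Because the guidance is expressed as a reusable meta plan rather than agent-specific parameters, a refined plan can in principle be handed to a different agent without retraining, which is the plug-and-play property the abstract highlights.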