PaSa: An LLM Agent for Comprehensive Academic Paper Search

Abstract

We introduce PaSa, an advanced Paper Search agent powered by large language models. PaSa autonomously makes a series of decisions, including invoking search tools, reading papers, and selecting relevant references, to obtain comprehensive and accurate results for complex scholarly queries. We optimize PaSa using reinforcement learning on a synthetic dataset, AutoScholarQuery, which contains 35k fine-grained academic queries and their corresponding papers, sourced from top-tier AI conference publications. We also develop RealScholarQuery, a benchmark of real-world academic queries, to assess PaSa's performance in more realistic scenarios. Despite being trained on synthetic data, PaSa significantly outperforms existing baselines on RealScholarQuery, including Google, Google Scholar, Google with GPT-4o for paraphrased queries, ChatGPT (search-enabled GPT-4o), GPT-o1, and PaSa-GPT-4o (PaSa implemented by prompting GPT-4o). Notably, PaSa-7B surpasses the best Google-based baseline, Google with GPT-4o, by 37.78% in recall@20 and 39.90% in recall@50. It also exceeds PaSa-GPT-4o by 30.36% in recall and 4.25% in precision. The model, datasets, and code are available at https://github.com/bytedance/pasa.
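The comparisons above are reported in recall@k and precision. As a minimal sketch of how such metrics are commonly computed, assuming the standard set-based definitions (this is not taken from the PaSa codebase; the function names and paper IDs below are hypothetical):

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant papers that appear among the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def precision(returned, relevant):
    """Fraction of the returned papers that are actually relevant."""
    returned = set(returned)
    if not returned:
        return 0.0
    return len(returned & set(relevant)) / len(returned)


# Hypothetical example: five ranked results, three ground-truth papers.
ranked = ["p3", "p1", "p7", "p2", "p9"]
relevant = {"p1", "p2", "p4"}

print(recall_at_k(ranked, relevant, 2))  # one of three relevant papers in the top 2
print(recall_at_k(ranked, relevant, 5))  # two of three relevant papers in the top 5
print(precision(ranked, relevant))       # two of five returned papers are relevant
```

Under these definitions, recall@20 and recall@50 differ only in the cutoff `k`. For an agent that returns an unranked set of papers, recall and precision are computed over the whole returned set.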
