
IPBench: Benchmarking the Knowledge of Large Language Models in Intellectual Property

April 22, 2025
Authors: Qiyao Wang, Guhong Chen, Hongbo Wang, Huaren Liu, Minghui Zhu, Zhifei Qin, Linwei Li, Yilin Yue, Shiqiang Wang, Jiayan Li, Yihang Wu, Ziqiang Liu, Longze Chen, Run Luo, Liyang Fan, Jiaming Li, Lei Zhang, Kan Xu, Hongfei Lin, Hamid Alinejad-Rokny, Shiwen Ni, Yuan Lin, Min Yang
cs.AI

Abstract

Intellectual Property (IP) is a unique domain that integrates technical and legal knowledge, making it inherently complex and knowledge-intensive. As large language models (LLMs) continue to advance, they show great potential for processing IP tasks, enabling more efficient analysis, understanding, and generation of IP-related content. However, existing datasets and benchmarks either focus narrowly on patents or cover limited aspects of the IP field, lacking alignment with real-world scenarios. To bridge this gap, we introduce the first comprehensive IP task taxonomy and a large, diverse bilingual benchmark, IPBench, covering 8 IP mechanisms and 20 tasks. This benchmark is designed to evaluate LLMs in real-world intellectual property applications, encompassing both understanding and generation. We benchmark 16 LLMs, ranging from general-purpose to domain-specific models, and find that even the best-performing model achieves only 75.8% accuracy, revealing substantial room for improvement. Notably, open-source IP and law-oriented models lag behind closed-source general-purpose models. We publicly release all data and code of IPBench and will continue to update it with additional IP-related tasks to better reflect real-world challenges in the intellectual property domain.
