IPBench: Benchmarking the Knowledge of Large Language Models in Intellectual Property

April 22, 2025
Authors: Qiyao Wang, Guhong Chen, Hongbo Wang, Huaren Liu, Minghui Zhu, Zhifei Qin, Linwei Li, Yilin Yue, Shiqiang Wang, Jiayan Li, Yihang Wu, Ziqiang Liu, Longze Chen, Run Luo, Liyang Fan, Jiaming Li, Lei Zhang, Kan Xu, Hongfei Lin, Hamid Alinejad-Rokny, Shiwen Ni, Yuan Lin, Min Yang
cs.AI

Abstract

Intellectual Property (IP) is a unique domain that integrates technical and legal knowledge, making it inherently complex and knowledge-intensive. As large language models (LLMs) continue to advance, they show great potential for processing IP tasks, enabling more efficient analysis, understanding, and generation of IP-related content. However, existing datasets and benchmarks either focus narrowly on patents or cover limited aspects of the IP field, lacking alignment with real-world scenarios. To bridge this gap, we introduce the first comprehensive IP task taxonomy and a large, diverse bilingual benchmark, IPBench, covering 8 IP mechanisms and 20 tasks. This benchmark is designed to evaluate LLMs in real-world intellectual property applications, encompassing both understanding and generation. We benchmark 16 LLMs, ranging from general-purpose to domain-specific models, and find that even the best-performing model achieves only 75.8% accuracy, revealing substantial room for improvement. Notably, open-source IP and law-oriented models lag behind closed-source general-purpose models. We publicly release all data and code of IPBench and will continue to update it with additional IP-related tasks to better reflect real-world challenges in the intellectual property domain.
