Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA
April 14, 2025
Authors: Michał Turski, Mateusz Chiliński, Łukasz Borchmann
cs.AI
Abstract
Checkboxes are critical in real-world document processing where the presence
or absence of ticks directly informs data extraction and decision-making
processes. Yet, despite the strong performance of Large Vision and Language
Models across a wide range of tasks, they struggle with interpreting checkable
content. This challenge becomes particularly pressing in industries where a
single overlooked checkbox may lead to costly regulatory or contractual
oversights. To address this gap, we introduce the CheckboxQA dataset, a
targeted resource designed to evaluate and improve model performance on
checkbox-related tasks. It reveals the limitations of current models and serves
as a valuable tool for advancing document comprehension systems, with
significant implications for applications in sectors such as legal tech and
finance.
The dataset is publicly available at:
https://github.com/Snowflake-Labs/CheckboxQA