Unchecked and Overlooked: Addressing the Checkbox Blind Spot in Large Language Models with CheckboxQA
April 14, 2025
Authors: Michał Turski, Mateusz Chiliński, Łukasz Borchmann
cs.AI
Abstract
Checkboxes are critical in real-world document processing where the presence
or absence of ticks directly informs data extraction and decision-making
processes. Yet, despite the strong performance of Large Vision and Language
Models across a wide range of tasks, they struggle with interpreting checkable
content. This challenge becomes particularly pressing in industries where a
single overlooked checkbox may lead to costly regulatory or contractual
oversights. To address this gap, we introduce the CheckboxQA dataset, a
targeted resource designed to evaluate and improve model performance on
checkbox-related tasks. It reveals the limitations of current models and serves
as a valuable tool for advancing document comprehension systems, with
significant implications for applications in sectors such as legal tech and
finance.
The dataset is publicly available at:
https://github.com/Snowflake-Labs/CheckboxQA