ChartCitor: Multi-Agent Framework for Fine-Grained Chart Visual Attribution
February 3, 2025
Authors: Kanika Goswami, Puneet Mathur, Ryan Rossi, Franck Dernoncourt
cs.AI
Abstract
Large Language Models (LLMs) can perform chart question-answering tasks but
often generate unverified hallucinated responses. Existing answer attribution
methods struggle to ground responses in source charts due to limited
visual-semantic context, complex visual-text alignment requirements, and
difficulties in bounding box prediction across complex layouts. We present
ChartCitor, a multi-agent framework that provides fine-grained bounding box
citations by identifying supporting evidence within chart images. The system
orchestrates LLM agents to perform chart-to-table extraction, answer
reformulation, table augmentation, evidence retrieval through pre-filtering and
re-ranking, and table-to-chart mapping. ChartCitor outperforms existing
baselines across different chart types. Qualitative user studies show that
ChartCitor helps increase user trust in Generative AI by providing enhanced
explainability for LLM-assisted chart QA and enables professionals to be more
productive.
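
The retrieval and mapping stages named in the abstract (pre-filtering, re-ranking, then mapping table evidence back to chart regions) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes LLM agents, whereas the sketch swaps in plain token overlap so that it runs without any model. The names `Cell`, `prefilter`, `rerank`, and `cite`, and the bounding-box layout, are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Set, Tuple

# (x0, y0, x1, y1) in image pixels; an assumed layout, not taken from the paper.
BBox = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Cell:
    """One extracted table cell plus the chart region it came from (hypothetical)."""
    row: str
    column: str
    value: float
    bbox: BBox


def tokens(text: str) -> Set[str]:
    # Lowercased words and numbers; decimals such as "5.1" stay in one token.
    return set(re.findall(r"[a-z0-9]+(?:\.[0-9]+)?", text.lower()))


def describe(cell: Cell) -> Set[str]:
    # Token set for a cell: its row label, column header, and value.
    return tokens(f"{cell.row} {cell.column} {cell.value:g}")


def prefilter(claim: str, cells: List[Cell], min_overlap: int = 1) -> List[Cell]:
    # Cheap first pass: keep cells sharing at least min_overlap tokens with the claim.
    claim_tokens = tokens(claim)
    return [c for c in cells if len(claim_tokens & describe(c)) >= min_overlap]


def rerank(claim: str, cells: List[Cell], top_k: int = 1) -> List[Cell]:
    # Second pass: order the surviving cells by Jaccard similarity to the claim.
    claim_tokens = tokens(claim)

    def jaccard(cell: Cell) -> float:
        cell_tokens = describe(cell)
        return len(claim_tokens & cell_tokens) / len(claim_tokens | cell_tokens)

    return sorted(cells, key=jaccard, reverse=True)[:top_k]


def cite(claim: str, cells: List[Cell], top_k: int = 1) -> List[BBox]:
    # Table-to-chart mapping: return the chart regions of the best-matching cells.
    return [c.bbox for c in rerank(claim, prefilter(claim, cells), top_k)]


if __name__ == "__main__":
    table = [
        Cell("2021", "Revenue", 4.2, (10, 20, 30, 80)),
        Cell("2022", "Revenue", 5.1, (40, 10, 60, 80)),
        Cell("2022", "Cost", 3.0, (40, 40, 60, 80)),
    ]
    print(cite("Revenue in 2022 was 5.1", table))
```

Splitting retrieval into a cheap filter followed by a more careful ranking keeps the expensive step (an LLM call in the described system, a similarity score here) limited to a small candidate set.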