ChartCitor: Multi-Agent Framework for Fine-Grained Chart Visual Attribution

Abstract

Large Language Models (LLMs) can perform chart question-answering (QA) tasks, but they often generate unverified, hallucinated responses. Existing answer attribution methods struggle to ground responses in the source chart because of limited visual-semantic context, complex visual-text alignment requirements, and the difficulty of predicting bounding boxes across complex layouts. We present ChartCitor, a multi-agent framework that provides fine-grained bounding box citations by identifying supporting evidence within chart images. The system orchestrates LLM agents to perform chart-to-table extraction, answer reformulation, table augmentation, evidence retrieval through pre-filtering and re-ranking, and table-to-chart mapping. ChartCitor outperforms existing baselines across different chart types. Qualitative user studies show that ChartCitor increases user trust in Generative AI by making LLM-assisted chart QA more explainable, and that it helps professionals be more productive.
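The retrieval and mapping stages named in the abstract can be illustrated with a minimal sketch. It is not ChartCitor's implementation: the `Cell` schema, the function names, and the token-overlap scoring are all assumptions chosen to keep the example self-contained, and each function stands in for a step that the abstract assigns to an LLM agent. Chart-to-table extraction and answer reformulation are assumed to have already produced the `cells` list and the single-claim string.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One extracted table cell plus the chart region it came from (hypothetical schema)."""
    row: str
    column: str
    value: float
    bbox: tuple  # (x_min, y_min, x_max, y_max) in image pixels


def tokens(text):
    """Lowercase word set; a crude stand-in for an LLM's semantic matching."""
    return {w.strip(".,?!") for w in text.lower().split()}


def describe(cell):
    """Table augmentation stand-in: turn a cell into a short text description."""
    return f"{cell.column} for {cell.row} is {cell.value:g}"


def prefilter(claim, cells):
    """Cheap first pass: keep cells sharing at least one token with the claim."""
    claim_tokens = tokens(claim)
    return [c for c in cells if claim_tokens & tokens(describe(c))]


def rerank(claim, cells, top_k=1):
    """Finer second pass: order surviving cells by token-overlap count."""
    claim_tokens = tokens(claim)
    ranked = sorted(
        cells,
        key=lambda c: len(claim_tokens & tokens(describe(c))),
        reverse=True,
    )
    return ranked[:top_k]


def cite(claim, cells, top_k=1):
    """Table-to-chart mapping stand-in: return bounding boxes of the best cells."""
    return [c.bbox for c in rerank(claim, prefilter(claim, cells), top_k)]


# Illustrative data only; the values and boxes are made up for the example.
cells = [
    Cell("2021", "Revenue", 10.0, (40, 120, 60, 200)),
    Cell("2022", "Revenue", 14.0, (80, 90, 100, 200)),
    Cell("2022", "Cost", 9.0, (105, 140, 125, 200)),
]
citations = cite("Revenue in 2022 was 14", cells)
```

Here `citations` holds the bounding box of the 2022 revenue cell, since that cell's description shares the most tokens with the claim. Splitting retrieval into a cheap filter followed by a costlier ranking step mirrors the pre-filtering and re-ranking order described in the abstract; in a real system both scoring functions would presumably be replaced by model calls.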
