SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images
December 23, 2024
Authors: Risa Shinoda, Kuniaki Saito, Shohei Tanaka, Tosho Hirasawa, Yoshitaka Ushiku
cs.AI
Abstract
Building a large-scale figure QA dataset requires a considerable amount of
work, from gathering and selecting figures to extracting attributes like text,
numbers, and colors, and generating QAs. Although recent developments in LLMs
have led to efforts to synthesize figures, most of these focus primarily on QA
generation. Additionally, creating figures directly using LLMs often encounters
issues such as code errors, similar-looking figures, and repetitive content in
figures. To address these issues, we present SBSFigures (Stage-by-Stage Synthetic
Figures), a dataset for pre-training figure QA. Our proposed pipeline enables
the creation of chart figures with complete annotations of the visualized data
and dense QA annotations without any manual annotation process. Our
stage-by-stage pipeline makes it possible to efficiently create figures with
diverse topics and appearances while minimizing code errors. SBSFigures
demonstrates a strong pre-training effect, making it possible to train
efficiently on a limited amount of real-world chart data starting from
our pre-trained weights.
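
The abstract does not spell out the individual stages, so the following is only an illustrative sketch of what a stage-by-stage synthesis could look like, not the authors' implementation. It assumes three stages: structured chart data (hard-coded here as a stand-in for LLM output), rendering with fixed, pre-tested code rather than per-figure generated code, and QA pairs derived directly from the data so that every answer is backed by a complete annotation. All names (`CHART_JSON`, `render_bar_chart_svg`, `generate_qa`) and the example data are hypothetical.

```python
import json

# Stage 1 (assumed): structured chart data. In a real pipeline this JSON
# would be produced by an LLM; here it is hard-coded, hypothetical data.
CHART_JSON = """
{"title": "Units sold per quarter",
 "categories": ["Q1", "Q2", "Q3", "Q4"],
 "values": [12, 30, 18, 24],
 "color": "#4477aa"}
"""


def render_bar_chart_svg(chart, width=400, height=240):
    """Stage 2 (assumed): draw the chart with fixed code, so no
    per-figure generated code has to run."""
    pad = 30
    top = max(chart["values"])
    slot = (width - 2 * pad) / len(chart["values"])
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<text x="{pad}" y="18">{chart["title"]}</text>',
    ]
    for i, (name, value) in enumerate(zip(chart["categories"], chart["values"])):
        bar_h = (height - 2 * pad) * value / top  # scale to the tallest bar
        x = pad + i * slot
        y = height - pad - bar_h
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot * 0.8:.1f}" '
            f'height="{bar_h:.1f}" fill="{chart["color"]}"/>'
        )
        parts.append(f'<text x="{x:.1f}" y="{height - 10}">{name}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


def generate_qa(chart):
    """Stage 3 (assumed): derive QA pairs from the data itself, so the
    answers are exact and need no manual annotation."""
    pairs = list(zip(chart["categories"], chart["values"]))
    best = max(pairs, key=lambda p: p[1])
    worst = min(pairs, key=lambda p: p[1])
    qa = [{"question": f"What is the value for {n}?", "answer": str(v)}
          for n, v in pairs]
    qa.append({"question": "Which category has the highest value?",
               "answer": best[0]})
    qa.append({"question": "What is the difference between the highest "
                           "and lowest values?",
               "answer": str(best[1] - worst[1])})
    return qa


chart = json.loads(CHART_JSON)
svg = render_bar_chart_svg(chart)   # figure image (as SVG text)
qa_pairs = generate_qa(chart)       # dense QA annotations for that figure
```

Keeping the data, the rendering code, and the QA generation separate means a failure in one stage does not invalidate the others, and the same data record can be re-rendered with a different appearance without touching its annotations.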