SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images

Abstract

Building a large-scale figure QA dataset requires a considerable amount of work, from gathering and selecting figures to extracting attributes such as text, numbers, and colors, and then generating QAs. Although recent developments in LLMs have led to efforts to synthesize figures, most of these focus primarily on QA generation. Additionally, creating figures directly with LLMs often runs into issues such as code errors, similar-looking figures, and repetitive content within figures. To address these issues, we present SBSFigures (Stage-by-Stage Synthetic Figures), a dataset for pre-training figure QA. Our proposed pipeline enables the creation of chart figures with complete annotations of the visualized data and dense QA annotations, without any manual annotation process. Because the pipeline works stage by stage, it can efficiently create figures with diverse topics and appearances while minimizing code errors. SBSFigures demonstrates a strong pre-training effect, making it possible to train efficiently on a limited amount of real-world chart data when starting from our pre-trained weights.
