SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images
December 23, 2024
Authors: Risa Shinoda, Kuniaki Saito, Shohei Tanaka, Tosho Hirasawa, Yoshitaka Ushiku
cs.AI
Abstract
Building a large-scale figure QA dataset requires a considerable amount of
work, from gathering and selecting figures to extracting attributes like text,
numbers, and colors, and generating QAs. Although recent developments in LLMs
have led to efforts to synthesize figures, most of these focus primarily on QA
generation. Additionally, creating figures directly using LLMs often encounters
issues such as code errors, similar-looking figures, and repetitive content in
figures. To address these issues, we present SBSFigures (Stage-by-Stage Synthetic
Figures), a dataset for pre-training figure QA. Our proposed pipeline enables
the creation of chart figures with complete annotations of the visualized data
and dense QA annotations without any manual annotation process. Our
stage-by-stage pipeline makes it possible to efficiently create figures with
diverse topics and appearances while minimizing code errors. SBSFigures
demonstrates a strong pre-training effect, making it possible to achieve
efficient training with a limited amount of real-world chart data starting from
our pre-trained weights.