BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

October 29, 2024
Authors: Yang Zhou, Tan Li Hui Faith, Yanyu Xu, Sicong Leng, Xinxing Xu, Yong Liu, Rick Siow Mong Goh
cs.AI

Abstract

Medical Vision-Language Pretraining (MedVLP) shows promise in learning generalizable and transferable visual representations from paired and unpaired medical images and reports. MedVLP can provide useful features to downstream tasks and facilitate adapting task-specific models to new setups using fewer examples. However, existing MedVLP methods often differ in their datasets, preprocessing, and finetuning implementations. Without a unified, standardized, and comprehensive benchmark, this makes it very difficult to evaluate how well a MedVLP method generalizes to various clinically relevant tasks. To fill this gap, we propose BenchX, a unified benchmark framework that enables head-to-head comparison and systematic analysis of MedVLP methods using public chest X-ray datasets. Specifically, BenchX is composed of three components: 1) a comprehensive collection covering nine datasets and four medical tasks; 2) benchmark suites that standardize data preprocessing, train-test splits, and parameter selection; 3) unified finetuning protocols that accommodate heterogeneous MedVLP methods for consistent task adaptation in classification, segmentation, and report generation. Using BenchX, we establish baselines for nine state-of-the-art MedVLP methods and find that the performance of some early MedVLP methods can be enhanced to surpass more recent ones, prompting a revisiting of the developments and conclusions of prior MedVLP work. Our code is available at https://github.com/yangzhou12/BenchX.

November 13, 2024