
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays

October 29, 2024
Authors: Yang Zhou, Tan Li Hui Faith, Yanyu Xu, Sicong Leng, Xinxing Xu, Yong Liu, Rick Siow Mong Goh
cs.AI

Abstract

Medical Vision-Language Pretraining (MedVLP) shows promise in learning generalizable and transferable visual representations from paired and unpaired medical images and reports. MedVLP can provide useful features for downstream tasks and facilitate adapting task-specific models to new setups using fewer examples. However, existing MedVLP methods often differ in their datasets, preprocessing, and finetuning implementations. These differences make it difficult to evaluate how well a MedVLP method generalizes to various clinically relevant tasks, because no unified, standardized, and comprehensive benchmark exists. To fill this gap, we propose BenchX, a unified benchmark framework that enables head-to-head comparison and systematic analysis of MedVLP methods using public chest X-ray datasets. Specifically, BenchX is composed of three components: 1) comprehensive datasets covering nine datasets and four medical tasks; 2) benchmark suites to standardize data preprocessing, train-test splits, and parameter selection; 3) unified finetuning protocols that accommodate heterogeneous MedVLP methods for consistent task adaptation in classification, segmentation, and report generation. Using BenchX, we establish baselines for nine state-of-the-art MedVLP methods and find that the performance of some early MedVLP methods can be enhanced to surpass that of more recent ones, prompting a revisiting of the developments and conclusions of prior MedVLP work. Our code is available at https://github.com/yangzhou12/BenchX.
