CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

April 18, 2025
Authors: Yang Yue, Yulin Wang, Chenxin Tao, Pan Liu, Shiji Song, Gao Huang
cs.AI

Abstract

Humans can develop internal world models that encode common-sense knowledge, telling them how the world works and predicting the consequences of their actions. This concept has emerged in recent preliminary work as a promising direction for building general-purpose machine-learning models, e.g., for visual representation learning. In this paper, we present CheXWorld, the first effort towards a self-supervised world model for radiographic images. Specifically, our work develops a unified framework that simultaneously models three aspects of medical knowledge essential for qualified radiologists: 1) local anatomical structures, which describe the fine-grained characteristics of local tissues (e.g., architectures, shapes, and textures); 2) global anatomical layouts, which describe the global organization of the human body (e.g., layouts of organs and skeletons); and 3) domain variations, which encourage CheXWorld to model the transitions across different appearance domains of radiographs (e.g., varying clarity, contrast, and exposure caused by collecting radiographs from different hospitals, devices, or patients). Empirically, we design tailored qualitative and quantitative analyses, revealing that CheXWorld successfully captures these three dimensions of medical knowledge. Furthermore, transfer learning experiments across eight medical image classification and segmentation benchmarks show that CheXWorld significantly outperforms existing self-supervised learning (SSL) methods and large-scale medical foundation models. Code and pre-trained models are available at https://github.com/LeapLabTHU/CheXWorld.
