Latent Diffusion Autoencoders: Toward Efficient and Meaningful Unsupervised Representation Learning in Medical Imaging
April 11, 2025
Authors: Gabriele Lozupone, Alessandro Bria, Francesco Fontanella, Frederick J. A. Meijer, Claudio De Stefano, Henkjan Huisman
cs.AI
Abstract
This study presents Latent Diffusion Autoencoder (LDAE), a novel
encoder-decoder diffusion-based framework for efficient and meaningful
unsupervised learning in medical imaging, focusing on Alzheimer disease (AD)
using brain MR from the ADNI database as a case study. Unlike conventional
diffusion autoencoders operating in image space, LDAE applies the diffusion
process in a compressed latent representation, improving computational
efficiency and making 3D medical imaging representation learning tractable. To
validate the proposed approach, we explore two key hypotheses: (i) LDAE
effectively captures meaningful semantic representations on 3D brain MR
associated with AD and ageing, and (ii) LDAE achieves high-quality image
generation and reconstruction while being computationally efficient.
Experimental results support both hypotheses: (i) linear-probe evaluations
demonstrate promising diagnostic performance for AD (ROC-AUC: 90%, ACC: 84%)
and age prediction (MAE: 4.1 years, RMSE: 5.2 years); (ii) the learned semantic
representations enable attribute manipulation, yielding anatomically plausible
modifications; (iii) semantic interpolation experiments show strong
reconstruction of missing scans, with SSIM of 0.969 (MSE: 0.0019) for a 6-month
gap. Even for longer gaps (24 months), the model maintains robust performance
(SSIM > 0.93, MSE < 0.004), indicating an ability to capture temporal
progression trends; (iv) compared to conventional diffusion autoencoders, LDAE
significantly increases inference throughput (20x faster) while also enhancing
reconstruction quality. These findings position LDAE as a promising framework
for scalable medical imaging applications, with the potential to serve as a
foundation model for medical image analysis. Code available at
https://github.com/GabrieleLozupone/LDAE
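The semantic interpolation result in the abstract (reconstructing a missing scan between two acquired time points) can be illustrated with a minimal sketch. It assumes the semantic code of the missing scan is estimated by time-weighted linear interpolation between the codes of the two neighbouring scans; the abstract does not state the exact scheme, so this is an illustration rather than the authors' implementation. The names `lerp`, `interpolation_weight`, and `mse`, and the toy three-dimensional vectors, are hypothetical. The sketch stops at the semantic code and omits decoding back to an image.

```python
def lerp(z_a, z_b, alpha):
    """Linearly interpolate between two equal-length vectors (alpha in [0, 1])."""
    if len(z_a) != len(z_b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]


def interpolation_weight(t_start, t_end, t_missing):
    """Fraction of the interval [t_start, t_end] elapsed at t_missing."""
    if not t_start < t_missing < t_end:
        raise ValueError("t_missing must lie strictly between t_start and t_end")
    return (t_missing - t_start) / (t_end - t_start)


def mse(x, y):
    """Mean squared error between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Toy semantic codes (assumed values) for scans at month 0 and month 12.
z_month_0 = [0.0, 1.0, 2.0]
z_month_12 = [1.0, 1.0, 0.0]

# Estimate the code of the missing month-6 scan.
alpha = interpolation_weight(0, 12, 6)
z_month_6 = lerp(z_month_0, z_month_12, alpha)
```

In a full pipeline, the interpolated code would condition the diffusion decoder, and the decoded volume would be compared with the held-out scan using SSIM and MSE, as reported in the abstract.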