MedAgent-Pro: Towards Multi-modal Evidence-based Medical Diagnosis via Reasoning Agentic Workflow

Abstract

Developing reliable AI systems to assist human clinicians in multi-modal medical diagnosis has long been a key objective for researchers. Recently, Multi-modal Large Language Models (MLLMs) have gained significant attention and achieved success across various domains. With strong reasoning capabilities and the ability to perform diverse tasks based on user instructions, they hold great potential for enhancing medical diagnosis. However, directly applying MLLMs to the medical domain still presents challenges. First, they lack detailed perception of visual inputs, which limits their ability to perform the quantitative image analysis that medical diagnosis depends on. Second, MLLMs often exhibit hallucinations and inconsistencies in reasoning, whereas clinical diagnoses must adhere strictly to established criteria. To address these challenges, we propose MedAgent-Pro, an evidence-based reasoning agentic system designed to achieve reliable, explainable, and precise medical diagnoses. This is accomplished through a hierarchical workflow. At the task level, knowledge-based reasoning generates reliable diagnostic plans for specific diseases following retrieved clinical criteria. At the case level, multiple tool agents process multi-modal inputs, analyze different indicators according to the plan, and provide a final diagnosis based on both quantitative and qualitative evidence. Comprehensive experiments on both 2D and 3D medical diagnosis tasks demonstrate the superiority and effectiveness of MedAgent-Pro, while case studies further highlight its reliability and interpretability. The code is available at https://github.com/jinlab-imvr/MedAgent-Pro.
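To make the two-level workflow concrete, here is a minimal Python sketch. It is not taken from the MedAgent-Pro repository: the names `Indicator`, `make_plan`, `diagnose`, `CRITERIA`, and `TOOLS`, the toy thresholds, and the majority-vote decision rule are all assumptions made for illustration. Retrieval of clinical criteria is replaced by a dictionary lookup, and the tool agents are replaced by simple functions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Indicator:
    """One step of a diagnostic plan (hypothetical structure)."""
    name: str         # what is being assessed
    tool: str         # key of the tool agent that measures it
    threshold: float  # toy cut-off: value >= threshold counts as abnormal


def make_plan(disease: str, criteria: Dict[str, List[Indicator]]) -> List[Indicator]:
    """Task level: turn retrieved criteria for a disease into an ordered plan.

    The dictionary lookup stands in for knowledge-based reasoning over
    retrieved clinical criteria.
    """
    if disease not in criteria:
        raise KeyError(f"no criteria retrieved for {disease!r}")
    return list(criteria[disease])


def diagnose(case: Dict[str, Any], plan: List[Indicator],
             tools: Dict[str, Callable[[Dict[str, Any]], float]]) -> Dict[str, Any]:
    """Case level: run each tool agent on the case and collect evidence."""
    evidence = {}
    for step in plan:
        value = tools[step.tool](case)
        evidence[step.name] = {"value": value, "abnormal": value >= step.threshold}
    abnormal = sum(item["abnormal"] for item in evidence.values())
    # Assumed decision rule: positive when more than half the indicators are abnormal.
    return {"positive": abnormal * 2 > len(plan), "evidence": evidence}


# Toy data, invented for illustration only.
CRITERIA = {"disease_x": [Indicator("indicator_a", "measure_a", 0.6),
                          Indicator("indicator_b", "measure_b", 1.0)]}
TOOLS = {"measure_a": lambda case: case["a"],
         "measure_b": lambda case: case["b"]}

if __name__ == "__main__":
    plan = make_plan("disease_x", CRITERIA)
    print(diagnose({"a": 0.7, "b": 1.2}, plan, TOOLS))
```

The plan is built once per disease and reused for every case, which mirrors the task-level and case-level split described in the abstract. The returned evidence dictionary keeps each measured value next to the decision, so the final diagnosis can be traced back to individual indicators.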

