Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
January 16, 2025
Authors: Zhaocheng Liu, Quan Tu, Wen Ye, Yu Xiao, Zhishou Zhang, Hengfu Cui, Yalun Zhu, Qiang Ju, Shizheng Li, Jian Xie
cs.AI
Abstract
Online medical consultation (OMC) restricts doctors to gathering patient
information solely through inquiries, making the already complex sequential
decision-making process of diagnosis even more challenging. Recently, the rapid
advancement of large language models has demonstrated significant potential
to transform OMC. However, most studies have primarily focused on improving
diagnostic accuracy under conditions of relatively sufficient information,
while paying limited attention to the "inquiry" phase of the consultation
process. This lack of focus has left the relationship between "inquiry" and
"diagnosis" insufficiently explored. In this paper, we first extract real
patient interaction strategies from authentic doctor-patient conversations and
use these strategies to guide the training of a patient simulator that closely
mirrors real-world behavior. By inputting medical records into our patient
simulator to simulate patient responses, we conduct extensive experiments to
explore the relationship between "inquiry" and "diagnosis" in the consultation
process. Experimental results demonstrate that inquiry and diagnosis adhere to
Liebig's law of the minimum: poor inquiry quality limits the effectiveness of diagnosis,
regardless of diagnostic capability, and vice versa. Furthermore, the
experiments reveal significant differences in the inquiry performance of
various models. To investigate this phenomenon, we categorize the inquiry
process into four types: (1) chief complaint inquiry; (2) specification of
known symptoms; (3) inquiry about accompanying symptoms; and (4) gathering
family or medical history. We analyze the distribution of inquiries across the
four types for different models to explore the reasons behind their significant
performance differences. We plan to open-source the weights and related code of
our patient simulator at https://github.com/LIO-H-ZEN/PatientSimulator.
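
As a rough illustration of the two ideas in the abstract, the Python sketch below models the Liebig-style bottleneck as a simple minimum and tallies how a model's questions spread across the four inquiry types. It is a toy under stated assumptions, not the authors' evaluation code: the names `bottleneck_score`, `inquiry_distribution`, and the type labels are hypothetical, and the `min` form is only one simple way to express "the weaker factor limits the outcome".

```python
from collections import Counter

# The four inquiry types named in the abstract (label strings are hypothetical).
INQUIRY_TYPES = (
    "chief_complaint",
    "symptom_specification",
    "accompanying_symptoms",
    "family_or_medical_history",
)


def bottleneck_score(inquiry_quality: float, diagnosis_capability: float) -> float:
    """Toy Liebig-style model: the weaker of the two factors caps the outcome.

    This is an illustrative assumption, not the metric used in the paper.
    """
    return min(inquiry_quality, diagnosis_capability)


def inquiry_distribution(labels):
    """Return the fraction of inquiries that fall into each of the four types."""
    labels = list(labels)
    unknown = set(labels) - set(INQUIRY_TYPES)
    if unknown:
        raise ValueError(f"unknown inquiry types: {sorted(unknown)}")
    counts = Counter(labels)
    total = len(labels)
    return {t: (counts[t] / total if total else 0.0) for t in INQUIRY_TYPES}


if __name__ == "__main__":
    # A strong diagnoser paired with weak inquiry is capped by the inquiry.
    print(bottleneck_score(0.3, 0.9))
    print(inquiry_distribution(
        ["chief_complaint", "chief_complaint", "accompanying_symptoms"]
    ))
```

Comparing the dictionaries returned by `inquiry_distribution` for two models gives a quick view of where their questioning habits differ, which is the kind of comparison the abstract describes.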