Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
January 16, 2025
Authors: Zhaocheng Liu, Quan Tu, Wen Ye, Yu Xiao, Zhishou Zhang, Hengfu Cui, Yalun Zhu, Qiang Ju, Shizheng Li, Jian Xie
cs.AI
Abstract
Online medical consultation (OMC) restricts doctors to gathering patient
information solely through inquiries, making the already complex sequential
decision-making process of diagnosis even more challenging. Recently, the rapid
advancement of large language models has demonstrated significant potential
to transform OMC. However, most studies have primarily focused on improving
diagnostic accuracy under conditions of relatively sufficient information,
while paying limited attention to the "inquiry" phase of the consultation
process. This lack of focus has left the relationship between "inquiry" and
"diagnosis" insufficiently explored. In this paper, we first extract real
patient interaction strategies from authentic doctor-patient conversations and
use these strategies to guide the training of a patient simulator that closely
mirrors real-world behavior. By inputting medical records into our patient
simulator to simulate patient responses, we conduct extensive experiments to
explore the relationship between "inquiry" and "diagnosis" in the consultation
process. Experimental results demonstrate that inquiry and diagnosis adhere to
Liebig's law: poor inquiry quality limits the effectiveness of diagnosis,
regardless of diagnostic capability, and vice versa. Furthermore, the
experiments reveal significant differences in the inquiry performance of
various models. To investigate this phenomenon, we categorize the inquiry
process into four types: (1) chief complaint inquiry; (2) specification of
known symptoms; (3) inquiry about accompanying symptoms; and (4) gathering
family or medical history. We analyze the distribution of inquiries across the
four types for different models to explore the reasons behind their significant
performance differences. We plan to open-source the weights and related code of
our patient simulator at https://github.com/LIO-H-ZEN/PatientSimulator.
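
The two analytical ideas in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper or its repository: the names `InquiryType`, `inquiry_distribution`, and `liebig_bound` are hypothetical, and modeling Liebig's law as a simple `min` over two quality scores in [0, 1] is an idealizing assumption rather than the metric the authors use.

```python
from collections import Counter
from enum import Enum


class InquiryType(Enum):
    """The four inquiry categories named in the abstract (names are illustrative)."""
    CHIEF_COMPLAINT = 1
    SYMPTOM_SPECIFICATION = 2
    ACCOMPANYING_SYMPTOMS = 3
    HISTORY = 4


def inquiry_distribution(labels):
    """Return the share of each inquiry type among a model's labeled questions."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in InquiryType}


def liebig_bound(inquiry_quality, diagnosis_quality):
    """Idealized Liebig's law (assumption): the scarcer factor caps the outcome."""
    return min(inquiry_quality, diagnosis_quality)


if __name__ == "__main__":
    # Hypothetical labels for one model's questions in a single consultation.
    labels = [
        InquiryType.CHIEF_COMPLAINT,
        InquiryType.SYMPTOM_SPECIFICATION,
        InquiryType.CHIEF_COMPLAINT,
        InquiryType.HISTORY,
    ]
    for inquiry_type, share in inquiry_distribution(labels).items():
        print(f"{inquiry_type.name}: {share:.2f}")

    # A strong diagnoser paired with weak inquiry is capped by the inquiry.
    print(liebig_bound(inquiry_quality=0.3, diagnosis_quality=0.9))
```

Comparing the per-type shares across models is one simple way to see where their inquiry behavior diverges; the `min` model only captures the qualitative claim that improving the stronger component alone does not help.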