
Boosting Healthcare LLMs Through Retrieved Context

September 23, 2024
Authors: Jordi Bayarri-Planas, Ashwin Kumar Gururajan, Dario Garcia-Gasulla
cs.AI

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities in natural language processing, and yet their factual inaccuracies and hallucinations limit their application, particularly in critical domains like healthcare. Context retrieval methods, by introducing relevant information as input, have emerged as a crucial approach for enhancing LLM factuality and reliability. This study explores the boundaries of context retrieval methods within the healthcare domain, optimizing their components and benchmarking their performance against open and closed alternatives. Our findings reveal how open LLMs, when augmented with an optimized retrieval system, can achieve performance comparable to the biggest private solutions on established healthcare benchmarks (multiple-choice question answering). Recognizing the lack of realism of including the possible answers within the question (a setup only found in medical exams), and after observing a strong degradation in LLM performance in the absence of those options, we extend the context retrieval system in that direction. In particular, we propose OpenMedPrompt, a pipeline that improves the generation of more reliable open-ended answers, moving this technology closer to practical application.
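
As a rough illustration of the context-retrieval idea the abstract describes, the sketch below builds an LLM prompt by prepending the passages most similar to the question. It uses TF-IDF similarity over a toy corpus as a stand-in for the paper's actual retriever; the corpus, the function names (retrieve, build_prompt), and the parameters are hypothetical and not taken from the paper or from OpenMedPrompt.

# Minimal sketch of retrieval-augmented prompting for medical QA.
# The corpus and all names here are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Metformin is a first-line oral agent for type 2 diabetes.",
    "Beta-blockers reduce mortality after myocardial infarction.",
    "Amoxicillin is commonly used for uncomplicated otitis media.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the question (TF-IDF proxy)."""
    vec = TfidfVectorizer().fit(corpus + [question])
    doc_mat = vec.transform(corpus)
    q_vec = vec.transform([question])
    scores = cosine_similarity(q_vec, doc_mat)[0]
    top = scores.argsort()[::-1][:k]
    return [corpus[i] for i in top]

def build_prompt(question: str) -> str:
    """Prepend retrieved context to the question before sending it to an LLM."""
    context = "\n".join(retrieve(question))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

print(build_prompt("What drug is first-line therapy for type 2 diabetes?"))

In the paper's setting, the retriever and generator components of such a pipeline are the parts being optimized and benchmarked; this snippet only shows the overall prompt-construction pattern.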
