MedMobile: A mobile-sized language model with expert-level clinical capabilities

Abstract

Language models (LMs) have demonstrated expert-level reasoning and recall abilities in medicine. However, computational costs and privacy concerns are mounting barriers to wide-scale implementation. We introduce MedMobile, a parsimonious adaptation of phi-3-mini: a 3.8 billion parameter LM for medical applications that is capable of running on a mobile device. We demonstrate that MedMobile scores 75.7% on MedQA (USMLE), surpassing the passing mark for physicians (~60%) and approaching the scores of models 100 times its size. We subsequently perform a careful set of ablations and demonstrate that chain of thought, ensembling, and fine-tuning lead to the greatest performance gains, while, unexpectedly, retrieval-augmented generation fails to demonstrate significant improvements.
