Closing the gap between open source and commercial large language models for medical evidence summarization.

Title: Closing the gap between open source and commercial large language models for medical evidence summarization.
Publication Type: Journal Article
Year of Publication: 2024
Authors: Zhang G, Jin Q, Zhou Y, Wang S, Idnay B, Luo Y, Park E, Nestor JG, Spotnitz ME, Soroush A, Campion TR, Lu Z, Weng C, Peng Y
Journal: NPJ Digit Med
Volume: 7
Issue: 1
Pagination: 239
Date Published: 2024 Sep 09
ISSN: 2398-6352
Abstract

Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short of that of proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance. Using a benchmark dataset, MedReview, consisting of 8161 pairs of systematic reviews and summaries, we fine-tuned three widely used open-source LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the performance of all three open-source models improved after fine-tuning. The performance of fine-tuned LongT5 is close to that of GPT-3.5 in zero-shot settings. Furthermore, smaller fine-tuned models sometimes even outperformed larger zero-shot models. These trends of improvement were observed in both a human evaluation and a larger-scale GPT-4-simulated evaluation.

DOI: 10.1038/s41746-024-01239-w
Alternate Journal: NPJ Digit Med
PubMed ID: 39251804
PubMed Central ID: PMC11383939
Grant List
R01 LM014306 / LM / NLM NIH HHS / United States
T15 LM007079 / LM / NLM NIH HHS / United States
R01 LM009886 / LM / NLM NIH HHS / United States
R01 HG012655 / HG / NHGRI NIH HHS / United States
UL1 TR002384 / TR / NCATS NIH HHS / United States
UL1 TR001873 / TR / NCATS NIH HHS / United States
UL1TR001873 / / U.S. Department of Health & Human Services | National Institutes of Health (NIH) /
R01LM014344 / / U.S. Department of Health & Human Services | NIH | U.S. National Library of Medicine (NLM) /
R01 LM014344 / LM / NLM NIH HHS / United States