Title | Closing the gap between open source and commercial large language models for medical evidence summarization. |
Publication Type | Journal Article |
Year of Publication | 2024 |
Authors | Zhang G, Jin Q, Zhou Y, Wang S, Idnay B, Luo Y, Park E, Nestor JG, Spotnitz ME, Soroush A, Campion TR, Lu Z, Weng C, Peng Y |
Journal | NPJ Digit Med |
Volume | 7 |
Issue | 1 |
Pagination | 239 |
Date Published | 2024 Sep 09 |
ISSN | 2398-6352 |
Abstract | Large language models (LLMs) hold great promise in summarizing medical evidence. Most recent studies focus on the application of proprietary LLMs. Using proprietary LLMs introduces multiple risk factors, including a lack of transparency and vendor dependency. While open-source LLMs allow better transparency and customization, their performance falls short compared to the proprietary ones. In this study, we investigated to what extent fine-tuning open-source LLMs can further improve their performance. Utilizing a benchmark dataset, MedReview, consisting of 8161 pairs of systematic reviews and summaries, we fine-tuned three broadly-used, open-sourced LLMs, namely PRIMERA, LongT5, and Llama-2. Overall, the performance of open-source models was all improved after fine-tuning. The performance of fine-tuned LongT5 is close to GPT-3.5 with zero-shot settings. Furthermore, smaller fine-tuned models sometimes even demonstrated superior performance compared to larger zero-shot models. The above trends of improvement were manifested in both a human evaluation and a larger-scale GPT4-simulated evaluation. |
DOI | 10.1038/s41746-024-01239-w |
Alternate Journal | NPJ Digit Med |
PubMed ID | 39251804 |
PubMed Central ID | PMC11383939 |
Grant List | R01 LM014306 / LM / NLM NIH HHS / United States; T15 LM007079 / LM / NLM NIH HHS / United States; R01 LM009886 / LM / NLM NIH HHS / United States; R01 HG012655 / HG / NHGRI NIH HHS / United States; UL1 TR002384 / TR / NCATS NIH HHS / United States; UL1 TR001873 / TR / NCATS NIH HHS / United States; UL1TR001873 / / U.S. Department of Health & Human Services, National Institutes of Health (NIH) /; R01LM014344 / / U.S. Department of Health & Human Services, NIH, U.S. National Library of Medicine (NLM) /; R01 LM014344 / LM / NLM NIH HHS / United States |