Reinforcement learning improves LLM accuracy and reasoning in disease classification from radiology reports.

Title: Reinforcement learning improves LLM accuracy and reasoning in disease classification from radiology reports.
Publication Type: Journal Article
Year of Publication: 2026
Authors: Wei Y, Lin Y, Flanders A, Shih G, Peng Y
Journal: NPJ Digit Med
Date Published: 2026 Apr 30
ISSN: 2398-6352
Abstract

Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
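
The GRPO stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `<think>`/`<answer>` output format, the reward weights, and the function names `reward` and `group_relative_advantages` are all assumptions made for illustration. The sketch scores each sampled completion on format and label accuracy only (no reasoning supervision), then normalizes the rewards within a group of completions sampled for the same report, which is the group-relative advantage that gives GRPO its name.

```python
import re
from statistics import mean, pstdev

# Hypothetical output format (an assumption, not taken from the paper):
# free-text reasoning in <think>...</think>, comma-separated labels in <answer>...</answer>.
FORMAT_RE = re.compile(r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL)


def reward(completion: str, gold: set[str]) -> float:
    """Format reward (1.0) plus exact-match accuracy reward (1.0).

    The reasoning text itself is never scored; only the format and the
    predicted label set contribute. `gold` is expected in lowercase.
    """
    m = FORMAT_RE.match(completion.strip())
    if m is None:
        return 0.0  # malformed output earns nothing
    predicted = {x.strip().lower() for x in m.group(1).split(",") if x.strip()}
    accuracy = 1.0 if predicted == gold else 0.0
    return 1.0 + accuracy


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    gold = {"pneumonia"}
    group = [
        "<think>Opacity in the left lower lobe.</think><answer>pneumonia</answer>",
        "<think>Interstitial markings.</think><answer>edema</answer>",
        "pneumonia",  # correct label but wrong format
    ]
    rewards = [reward(c, gold) for c in group]
    print(rewards)
    print(group_relative_advantages(rewards))
```

Completions that beat the group average receive a positive advantage and are reinforced; those below it are penalized. When every completion in a group earns the same reward, all advantages are zero and the group contributes no learning signal.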

DOI: 10.1038/s41746-026-02685-4
Alternate Journal: NPJ Digit Med
PubMed ID: 42062541
Grant List: 75N920202D00021 / EB / NIBIB NIH HHS / United States
2145640 / / NSF CAREER Award /