Reinforcement learning improves LLM accuracy and reasoning in disease classification from radiology reports.

Submitted by yip4002 on June 5, 2026 - 4:45pm

Title	Reinforcement learning improves LLM accuracy and reasoning in disease classification from radiology reports.
Publication Type	Journal Article
Year of Publication	2026
Authors	Wei Y, Lin Y, Flanders A, Shih G, Peng Y
Journal	NPJ Digit Med
Date Published	2026 Apr 30
ISSN	2398-6352
Abstract	Accurate disease classification from radiology reports is essential for many applications. While supervised fine-tuning (SFT) of lightweight LLMs improves accuracy, it can degrade reasoning. We propose a two-stage approach: SFT on disease labels followed by Group Relative Policy Optimization (GRPO) to refine predictions by optimizing accuracy and format without reasoning supervision. Across three radiologist-annotated datasets, SFT outperformed baselines and GRPO further improved classification and enhanced reasoning recall and comprehensiveness.
DOI	10.1038/s41746-026-02685-4
Alternate Journal	NPJ Digit Med
PubMed ID	42062541
Grant List	75N920202D00021 / EB / NIBIB NIH HHS / United States; 2145640 / / NSF CAREER Award /
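
Note on the method described in the abstract: the GRPO stage optimizes accuracy and format rewards without supervising the reasoning text. The sketch below illustrates how such a reward and the group-relative advantage could be computed. It is not the authors' implementation. The `<think>`/`<answer>` output template, the exact-match accuracy rule, the equal reward weights, and all function names are assumptions made for illustration; the record does not specify any of them.

```python
import re
from statistics import mean, pstdev

# Hypothetical output template (assumption): reasoning in <think>, labels in <answer>.
TEMPLATE = re.compile(r"^<think>(.+?)</think>\s*<answer>(.*?)</answer>$", re.DOTALL)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the assumed template, else 0.0."""
    return 1.0 if TEMPLATE.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, gold: set) -> float:
    """Return 1.0 if the predicted label set equals the gold set (assumed rule)."""
    match = TEMPLATE.match(completion.strip())
    if match is None:
        return 0.0
    predicted = {p.strip().lower() for p in match.group(2).split(",") if p.strip()}
    return 1.0 if predicted == gold else 0.0


def total_reward(completion: str, gold: set) -> float:
    """Sum of accuracy and format rewards; equal weighting is an assumption."""
    return accuracy_reward(completion, gold) + format_reward(completion)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by the mean and std of its own sampling group.

    This is the group-relative baseline used by GRPO in place of a learned
    value function. Population std is used here; some implementations use
    the sample std instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: three sampled completions for one report whose gold label is pneumonia.
gold_labels = {"pneumonia"}
group = [
    "<think>Consolidation in the right lower lobe.</think><answer>pneumonia</answer>",
    "<think>No acute findings.</think><answer>no finding</answer>",
    "pneumonia",  # violates the template, so both rewards are zero
]
rewards = [total_reward(c, gold_labels) for c in group]
advantages = group_relative_advantages(rewards)
```

Because the reward only inspects the final labels and the template, the content of the reasoning span is never scored directly, which matches the abstract's "without reasoning supervision".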