EvidenceOutcomes: A Dataset of Clinical Trial Publications with Clinically Meaningful Outcomes.

Title: EvidenceOutcomes: A Dataset of Clinical Trial Publications with Clinically Meaningful Outcomes.
Publication Type: Journal Article
Year of Publication: 2025
Authors: Zhou Y, Newbury AM, Zhang G, Idnay BRoss, Liu H, Weng C, Peng Y
Journal: Stud Health Technol Inform
Volume: 329
Pagination: 723-727
Date Published: 2025 Aug 07
ISSN: 1879-8365
Keywords: Clinical Trials as Topic; Data Mining; Evidence-Based Medicine; Humans; Machine Learning; Natural Language Processing; Outcome Assessment, Health Care
Abstract

The fundamental process of evidence extraction in evidence-based medicine relies on identifying PICO elements, with Outcomes being the most complex and often overlooked. To address this, we introduce EvidenceOutcomes, a large annotated corpus of clinically meaningful outcomes. A robust annotation guideline was developed in collaboration with clinicians and NLP experts, and three annotators annotated the Results and Conclusions of 500 PubMed abstracts and 140 EBM-NLP abstracts, achieving an inter-rater agreement of 0.76. A fine-tuned PubMedBERT model achieved F1 scores of 0.69 (entity level) and 0.76 (token level). EvidenceOutcomes offers a benchmark for advancing machine learning algorithms in extracting clinically meaningful outcomes.
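
The gap between the entity-level and token-level F1 scores reported above is easier to see with a small worked example. The sketch below is illustrative only and is not the authors' evaluation code: it assumes a single Outcome entity type encoded with untyped BIO tags ("B", "I", "O"), exact-match scoring for entities, and binary outcome/non-outcome scoring for tokens. The function names and the toy tag sequences are hypothetical.

```python
from typing import List, Set, Tuple


def bio_spans(tags: List[str]) -> Set[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each B/I span."""
    spans, start = set(), None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and tag != "I":
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
    return spans


def f1(tp: int, n_pred: int, n_gold: int) -> float:
    """Harmonic mean of precision (tp / n_pred) and recall (tp / n_gold)."""
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


def token_f1(gold: List[str], pred: List[str]) -> float:
    """Token level: a token is correct if both sequences mark it as non-O."""
    tp = sum(g != "O" and p != "O" for g, p in zip(gold, pred))
    return f1(tp, sum(p != "O" for p in pred), sum(g != "O" for g in gold))


def entity_f1(gold: List[str], pred: List[str]) -> float:
    """Entity level: a span is correct only if both boundaries match exactly."""
    g, p = bio_spans(gold), bio_spans(pred)
    return f1(len(g & p), len(p), len(g))


# Toy example (made up): the prediction cuts the first outcome span short.
gold = ["O", "B", "I", "I", "O", "B", "O"]
pred = ["O", "B", "I", "O", "O", "B", "O"]

print(f"token-level F1:  {token_f1(gold, pred):.3f}")
print(f"entity-level F1: {entity_f1(gold, pred):.3f}")
```

In this toy case the truncated span still earns credit for two of its three tokens, but counts as a complete miss at the entity level, so the token-level score is higher. This is one common reason token-level F1 exceeds entity-level F1; the paper's exact scoring rules may differ from the assumptions made here.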

DOI: 10.3233/SHTI250935
Alternate Journal: Stud Health Technol Inform
PubMed ID: 40775953