Extracting chemical-protein relations with ensembles of SVM and deep learning models

Title: Extracting chemical-protein relations with ensembles of SVM and deep learning models
Publication Type: Journal Article
Year of Publication: 2018
Authors: Peng Y, Rios A, Kavuluru R, Lu Z
Journal: Database (Oxford)
Volume: 2018
Date Published: 2018 01 01
ISSN: 1758-0463
Keywords: Data Curation; Databases, Chemical; Databases, Protein; Machine Learning; Models, Theoretical; Neural Networks, Computer; Proteins; Reproducibility of Results; Support Vector Machine
Abstract

Mining relations between chemicals and proteins from the biomedical literature is an increasingly important task. The CHEMPROT track at BioCreative VI aims to promote the development and evaluation of systems that can automatically detect chemical-protein relations in running text (PubMed abstracts). This work describes our CHEMPROT track entry, an ensemble of three systems: a support vector machine, a convolutional neural network, and a recurrent neural network. Their outputs are combined using majority voting or stacking to produce the final predictions. During the challenge, our CHEMPROT system obtained a precision of 0.7266 and a recall of 0.5735, for an F-score of 0.6410, the highest performance in the task during the 2017 challenge. This result demonstrates the effectiveness of machine learning-based approaches for automatic relation extraction from biomedical literature.

Database URL: http://www.biocreative.org/tasks/biocreative-vi/track-5/
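
The majority-voting step and the reported F-score can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the label strings, the per-model predictions, and the rule of falling back to a negative label when no label wins a strict majority are all assumptions made for illustration. Stacking, the other combination method named in the abstract, is not shown.

```python
from collections import Counter


def majority_vote(predictions, fallback="NONE"):
    """Return the label predicted by more than half of the models.

    The fallback label used when there is no strict majority is an
    assumption of this sketch, not a detail taken from the abstract.
    """
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else fallback


def f_score(precision, recall):
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions from three models for four candidate
# chemical-protein pairs; the label strings are placeholders.
svm = ["CPR:3", "CPR:4", "NONE", "CPR:9"]
cnn = ["CPR:3", "NONE", "NONE", "CPR:4"]
rnn = ["CPR:4", "CPR:4", "CPR:3", "CPR:5"]

ensemble = [majority_vote(votes) for votes in zip(svm, cnn, rnn)]
print(ensemble)  # ['CPR:3', 'CPR:4', 'NONE', 'NONE']

# Precision and recall as reported in the abstract.
print(round(f_score(0.7266, 0.5735), 4))  # 0.641
```

With three voters, a strict majority means at least two models agree; the last pair above shows the three-way disagreement case, where the sketch returns the fallback label. The F-score computed from the reported precision and recall agrees with the reported 0.6410.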

DOI: 10.1093/database/bay073
Alternate Journal: Database (Oxford)
PubMed ID: 30020437
PubMed Central ID: PMC6051439
Grant List: R21 LM012274 / LM / NLM NIH HHS / United States