TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays

Title: TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-Rays
Publication Type: Conference Proceedings
Year of Conference: 2018
Authors: Wang X, Peng Y, Lu L, Lu Z, Summers RM
Conference Name: IEEE/CVF Conference on Computer Vision and Pattern Recognition
Pagination: 9049-9058
Date Published: 06/2018
Publisher: IEEE
Conference Location: Salt Lake City, UT, USA
ISBN Number: 978-1-5386-6420-9
Abstract

Chest X-rays are one of the most common radiological examinations in daily clinical routines. Reporting thorax diseases using chest X-rays is often an entry-level task for radiologist trainees. Yet, reading a chest X-ray image remains a challenging job for learning-oriented machine intelligence, due to (1) shortage of large-scale machine-learnable medical image datasets, and (2) lack of techniques that can mimic the high-level reasoning of human radiologists that requires years of knowledge accumulation and professional training. In this paper, we show that the clinical free-text radiological reports can be utilized as a priori knowledge for tackling these two key problems. We propose a novel Text-Image Embedding network (TieNet) for extracting the distinctive image and text representations. Multi-level attention models are integrated into an end-to-end trainable CNN-RNN architecture for highlighting the meaningful text words and image regions. We first apply TieNet to classify the chest X-rays by using both image features and text embeddings extracted from associated reports. The proposed auto-annotation framework achieves high accuracy (over 0.9 on average in AUCs) in assigning disease labels for our hand-labeled evaluation dataset. Furthermore, we transform the TieNet into a chest X-ray reporting system. It simulates the reporting process and can output disease classification and a preliminary report together. The classification results are significantly improved (6% increase on average in AUCs) compared to the state-of-the-art baseline on an unseen and hand-labeled dataset (OpenI).

DOI: 10.1109/CVPR.2018.00943
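
The abstract describes an end-to-end CNN-RNN architecture that fuses image features with an attention-pooled text embedding of the report for multi-label disease classification. The following is a minimal illustrative sketch of that general idea in PyTorch, not the authors' published TieNet: the backbone choice (ResNet-50), layer sizes, single word-level attention, and concatenation-based fusion are all assumptions made here for clarity.

# Minimal sketch (assumptions noted above): CNN image encoder + LSTM report
# encoder with word-level attention, fused for multi-label classification.
import torch
import torch.nn as nn
import torchvision.models as models


class JointImageTextClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=256, num_classes=14):
        super().__init__()
        # Image branch: ResNet-50 backbone without its final fc layer (2048-d feature).
        backbone = models.resnet50(weights=None)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])
        # Text branch: word embeddings + bidirectional LSTM over the report tokens.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        # Word-level attention: score each time step, softmax-pool the hidden states.
        self.attn = nn.Linear(2 * hidden_dim, 1)
        # Joint classifier over the concatenated image and text features.
        self.classifier = nn.Linear(2048 + 2 * hidden_dim, num_classes)

    def forward(self, images, reports):
        img_feat = self.cnn(images).flatten(1)              # (B, 2048)
        states, _ = self.rnn(self.embed(reports))           # (B, T, 2H)
        weights = torch.softmax(self.attn(states), dim=1)   # (B, T, 1) attention over words
        txt_feat = (weights * states).sum(dim=1)            # (B, 2H) pooled text embedding
        return self.classifier(torch.cat([img_feat, txt_feat], dim=1))  # class logits


# Usage with random tensors standing in for a batch of chest X-rays and
# tokenized reports; real inputs would come from an image/report dataset.
model = JointImageTextClassifier(vocab_size=10000)
logits = model(torch.randn(2, 3, 224, 224), torch.randint(1, 10000, (2, 50)))
print(logits.shape)  # torch.Size([2, 14])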