UK researchers have demonstrated a machine learning-based "red dot" model that classifies chest radiographs as normal or abnormal with 94.6% accuracy. According to the researchers, applying the model to real-world datasets may help optimise clinician workload in the face of expanding demand. Their work is published in the journal Clinical Radiology.
Despite radiological advances, chest radiography (CXR) remains the most commonly requested imaging technique in the UK. Consequently, timely radiologist reporting of every film is not always possible, leading to a backlog of unreported studies. The "red dot" system is a longstanding method of flagging abnormal radiographs: the modern system digitally superimposes the words "red dot" on such images, a nod to the traditional practice of affixing a circular red sticker to the abnormal plain film. Although radiographers can be trained in this prospective image-prioritisation method, it does not tackle the retrospective burden of the imaging backlog.
Machine learning (ML), specifically the field of image recognition using neural networks, could provide one such avenue of pictorial classification. Deep convolutional neural networks (CNNs), an architecture loosely modelled on the biological organisation of the human brain, represent one such ML approach, and have historically driven breakthroughs in computer vision and speech recognition. The aim of the present study was to develop an ML-based model for the binary classification of chest radiograph abnormalities, to serve as a retrospective tool for guiding clinician reporting prioritisation.
In this study, the open-source machine learning library TensorFlow was used to retrain the final layer of the deep convolutional neural network Inception to perform binary normality classification on two anonymised, public image datasets. Retraining was performed on 47,644 images using commodity hardware, with validation testing on 5,505 previously unseen radiographs. Confusion-matrix analysis was performed to derive diagnostic utility metrics.
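The retraining approach described above can be sketched in outline. The study's actual code is not published in this article, so the following is a minimal transfer-learning illustration using the Keras API bundled with TensorFlow: the pretrained Inception v3 convolutional layers are frozen, and only a new final classification layer is trained for the binary normal/abnormal decision. The function name, image size, and optimiser settings are illustrative assumptions, not details from the study.

```python
# Sketch of final-layer retraining ("transfer learning") on InceptionV3,
# assuming the tf.keras API; not the study's actual implementation.
import tensorflow as tf

def build_binary_cxr_model(image_size=(299, 299), weights="imagenet"):
    # Load InceptionV3 without its original 1000-class ImageNet classifier.
    base = tf.keras.applications.InceptionV3(
        include_top=False,
        weights=weights,
        input_shape=image_size + (3,),
        pooling="avg")          # global average pooling -> one feature vector
    base.trainable = False      # freeze every pretrained convolutional layer

    # The only trainable layer: probability that the radiograph is abnormal.
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(base.output)

    model = tf.keras.Model(base.input, outputs)
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy", tf.keras.metrics.AUC(name="auc")])
    return model
```

In use, such a model would be fitted with `model.fit(...)` on labelled radiographs; because only the final dense layer is trainable, retraining of this kind is feasible on commodity hardware, consistent with the study's setup.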
According to the researchers, the final model achieved an accuracy of 94.6% (95% confidence interval [CI]: 94.3–94.7%) on an unseen testing subset (n=5,505). This yielded a sensitivity of 94.6% (95% CI: 94.4–94.7%), a specificity of 93.4% (95% CI: 87.2–96.9%), a positive predictive value (PPV) of 99.8% (95% CI: 99.7–99.9%), and an area under the curve (AUC) of 0.98 (95% CI: 0.97–0.99).
"The model’s false-positive rate (FPR) was between 3% and 13%. Previous meta-analyses of radiographer 'red dot' usage in combined chest and abdominal radiograph interpretation in the emergency department setting identified FPRs between 7% and 12%, which suggests reasonable concordance with existing literature from real-world settings," the researchers point out. "Additionally of note, model abnormality detection gives neither weight to the severity of the recognised disease process, nor any pointers to the suspected pathology."
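All of the diagnostic utility metrics quoted above follow directly from the four confusion-matrix counts. The short sketch below shows the standard definitions; the counts in the example are made up for illustration and are not the study's actual confusion matrix.

```python
# Standard diagnostic metrics derived from confusion-matrix counts
# (tp = true positives, fp = false positives, fn = false negatives,
#  tn = true negatives). Example counts are illustrative only.
def diagnostic_metrics(tp, fp, fn, tn):
    """Derive diagnostic utility metrics from confusion-matrix counts."""
    return {
        "accuracy":    (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # true-positive rate (recall)
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv":         tp / (tp + fp),   # positive predictive value
        "fpr":         fp / (fp + tn),   # false-positive rate = 1 - specificity
    }

# Made-up example: 900 true positives, 10 false positives,
# 50 false negatives, 140 true negatives.
metrics = diagnostic_metrics(tp=900, fp=10, fn=50, tn=140)
```

Note that the FPR is simply the complement of specificity, which is why a specificity CI of 87.2–96.9% corresponds to an FPR range of roughly 3–13%.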
The model has important limitations, including overfitting, a problem whereby images sharing the same label also share features that a human would know are unrelated to that label. Although the model has been internally tested on more than 5,000 images, the researchers say, this does not represent real-world validation. As such, further research is required to validate the application of such models to non-publicly available datasets.
"Although further work is required to validate the application of such models to real-world datasets, the present study adds to existing literature in proposing the application of deep machine learning to the rapid automatic detection of abnormality in radiological imaging," the researchers add.