Chronic obstructive pulmonary disease (COPD) is often detected late, when symptoms have already progressed and options to slow decline are more limited. Chest radiographs are common in everyday care yet are not used to diagnose COPD. A deep learning model called CXR-Lung-Risk, originally trained to estimate long-term lung-related mortality from a single radiograph, was tested to see whether it can highlight adults at higher risk of developing COPD using images captured during routine outpatient visits. The evaluation compared the model with a simple electronic record-based clinical score and explored how the imaging signal relates to lung function and blood proteins in an independent cohort. The findings suggest an opportunity to surface at-risk individuals from scans already present in clinical workflows.
Tested in Everyday Imaging
The main analysis drew on a large group of outpatients in mid to later life who had standard posterior–anterior chest radiographs within a defined two-year window at a single hospital system. None had a prior record of COPD, emphysema or lung cancer when their radiograph was taken. To reflect real-world variation, results were examined separately for people who had ever smoked and for those who had never smoked. Individuals were followed for several years to see who subsequently received a COPD diagnosis.
CXR-Lung-Risk was used exactly as released, without any retraining, and applied to the earliest available radiograph for each person. Performance was set against TargetCOPD, a clinical score assembled from routine information such as age, smoking status, breathlessness and selected prescriptions. Across the full outpatient cohort, the imaging-based score improved discrimination for future COPD compared with the clinical score used alone. The combined use of both approaches performed best, with consistent gains seen when looking at shorter time frames and across subgroups by sex and self-identified race and ethnicity.
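Discrimination here means how well a score ranks people who later develop COPD above those who do not. It is commonly summarised as the area under the ROC curve (AUC). The following is a minimal sketch of that idea, not the study's analysis code: the function name `auc` and the toy numbers are illustrative assumptions, and the article does not specify how the published comparison was computed.

```python
def auc(scores, outcomes):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen non-case, counting ties as half (pairwise AUC)."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Toy, made-up data: 1 = later diagnosed with COPD, 0 = not diagnosed.
    outcomes = [1, 0, 1, 0, 0, 1]
    imaging_score = [0.8, 0.3, 0.6, 0.4, 0.2, 0.7]   # hypothetical values
    clinical_score = [0.5, 0.6, 0.7, 0.2, 0.3, 0.4]  # hypothetical values
    print("imaging AUC:", auc(imaging_score, outcomes))
    print("clinical AUC:", auc(clinical_score, outcomes))
```

An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect ranking. Comparing two scores on the same people, as above, is the basic shape of an "improved discrimination" claim.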
Beyond headline accuracy, the evaluation also looked at clinical utility using decision curves. These showed that the imaging score delivered higher net benefit than the clinical score across common decision thresholds for smokers and within a pragmatic range for non-smokers, indicating that it could help identify more people who would go on to develop COPD while avoiding unnecessary follow-up for those at low risk.
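For readers unfamiliar with decision curves, the quantity plotted is net benefit: the share of true positives minus the share of false positives, with false positives weighted by the odds of the chosen risk threshold. The sketch below uses this standard definition. The function names and toy data are illustrative assumptions and do not reproduce the study's analysis.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of following up everyone whose predicted risk is at
    or above `threshold`: TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: follow up everyone regardless of score."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    # Toy, made-up predicted risks and outcomes (1 = developed COPD).
    risks = [0.9, 0.8, 0.3, 0.1]
    outcomes = [1, 0, 1, 0]
    for t in (0.2, 0.5):
        print(t, net_benefit(risks, outcomes, t), net_benefit_treat_all(outcomes, t))
```

A decision curve repeats this calculation across a range of thresholds for each score. A score is preferred at a given threshold when its net benefit is higher than that of the alternatives, including the follow-up-everyone and follow-up-no-one (net benefit zero) strategies.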
Stratifying Risk for Earlier Case Finding
To understand how the model might be used in practice, individuals were grouped into low, moderate and high risk based on their imaging score. These categories showed clear separation in outcomes over time after accounting for available clinical variables and radiologist-reported findings. Among people who had ever smoked, those in the high-risk group were substantially more likely to be diagnosed with COPD during follow-up than those in the low-risk group, with the moderate group falling in between. The same graded pattern was seen among never-smokers, albeit with lower absolute event rates.
When the imaging score was considered alongside the clinical score, the two together sharpened risk stratification. People labelled high risk by both approaches experienced the highest rates of subsequent COPD, while those labelled low risk by both had the lowest rates. Importantly, even a single chest radiograph from routine care contained enough information for the model to contribute meaningfully to this separation, without any special imaging protocol or additional patient burden.
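The joint stratification described here can be pictured as a cross-tabulation: each person is assigned a category by each score, and the observed COPD rate is computed within every combination. The sketch below is purely illustrative. The cut-points, function names and records are hypothetical, since the article does not report the study's thresholds, and these are raw rates without the adjustment for clinical variables and radiologist-reported findings mentioned above.

```python
from collections import defaultdict


def risk_group(score, low_cut, high_cut):
    """Map a score to 'low', 'moderate' or 'high' using two cut-points."""
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "moderate"
    return "high"


def event_rates_by_joint_group(records, imaging_cuts, clinical_cuts):
    """records: iterable of (imaging_score, clinical_score, developed_copd).
    Returns {(imaging_group, clinical_group): observed event rate}."""
    counts = defaultdict(lambda: [0, 0])  # group -> [events, total]
    for imaging, clinical, developed in records:
        key = (risk_group(imaging, *imaging_cuts),
               risk_group(clinical, *clinical_cuts))
        counts[key][0] += int(developed)
        counts[key][1] += 1
    return {key: events / total for key, (events, total) in counts.items()}


if __name__ == "__main__":
    # Hypothetical cut-points and made-up records, for illustration only.
    cuts = (0.3, 0.7)
    records = [(0.9, 0.9, 1), (0.8, 0.85, 0), (0.1, 0.1, 0), (0.05, 0.2, 0)]
    for group, rate in sorted(event_rates_by_joint_group(records, cuts, cuts).items()):
        print(group, rate)
```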
The researchers also examined where the model appeared to focus within the image. Saliency maps frequently highlighted structures in the mediastinum and around the aortic knob in both lower- and higher-risk predictions, with additional attention in the upper lung zones among individuals flagged as high risk. Some of these areas overlap with regions where radiographic correlates of lung pathology may be seen, though the relevance of certain highlighted structures to COPD risk remains to be clarified.
Signals in Lung Function and Blood
To probe what the imaging signal might represent biologically, the team turned to an independent cohort within the Project Baseline Health Study, where participants had chest radiographs, spirometry and, for many, blood proteomics. Here, higher imaging risk scores aligned with lower performance on several measures of pulmonary function. This relationship was strongest in people who had ever smoked, but meaningful associations were also observed in those who had never smoked for key measures of airflow and gas transfer.
Blood protein analysis added further context. A small set of proteins showed positive associations with the imaging-derived risk. Across the overall group, secretoglobin family 3A member 2 and lysozyme were linked to higher risk scores. Among smokers, surfactant protein B and leucine-rich α-2 glycoprotein 1 were associated with higher risk scores, while the patterns for the other proteins were directionally similar but did not reach statistical significance. In never-smokers, the association for lysozyme resembled that in the full analysis, and signals for the other markers were weaker. Together, these findings suggest that what the model detects on routine radiographs corresponds to measurable differences in lung physiology and specific protein signatures related to lung health.
Applying deep learning to standard chest radiographs taken in everyday care identified adults at increased risk of developing COPD over subsequent years, beyond what a simple clinical score could provide. Risk categories based on the imaging score separated outcomes in both smokers and never-smokers, and combining the imaging and clinical approaches further sharpened who was most and least likely to receive a diagnosis. The imaging signal aligned with poorer lung function and with a small set of blood proteins linked to lung biology in an independent cohort. As an externally validated, open-source tool, CXR-Lung-Risk could help prompt earlier confirmation testing such as spirometry and closer follow-up within existing care pathways. Further work on diverse populations and integration into clinical systems will help define how best to use this signal to support earlier case finding and more proactive management.
Source: The Lancet Digital Health