Radiology departments continue to balance diagnostic accuracy, reporting efficiency and workflow sustainability as imaging demand grows. Bedside chest radiography remains one of the most frequently performed examinations in hospital settings, requiring consistent interpretation under time pressure. Structured reporting (SR) has been promoted as a way to standardise documentation and reduce variability in radiology reports, while artificial intelligence tools promise to support detection and streamline reporting tasks. Questions remain about how these approaches influence radiologists’ interaction with medical images, particularly visual attention during interpretation. Eye-tracking analysis provides an opportunity to examine how reporting interfaces shape reading behaviour, decision-making and diagnostic performance during chest radiograph interpretation across different levels of clinical experience.
Reporting Modes and Reader Workflow
Three reporting approaches were evaluated using bedside chest radiographs: conventional free-text reporting, SR templates and AI-prefilled structured reporting (AI-SR). Eight readers participated in the evaluation, divided into novice and non-novice groups based on radiography training experience. Novice readers included medical students and residents without formal radiography rotation experience, while non-novice readers were radiology residents who had completed radiography training but were not yet board certified.
Each reader interpreted the same set of 35 bedside chest radiographs during three reading sessions separated by washout periods of at least two weeks. Radiographs were presented in a different order in each session to minimise recall effects. The dataset included examinations from 35 patients and reflected a clinically representative distribution of findings, with predominantly normal or mildly abnormal radiographs and fewer moderate or severe abnormalities.
Image interpretation and reporting were performed on a workstation displaying the radiograph on the right side of the screen and the reporting interface on the left. Eye-tracking technology recorded gaze behaviour throughout each session. Two areas of interest were defined for analysis: the radiograph display field and the report display field. Free-text reporting required narrative typing supported by text-expansion software. SR required readers to complete predefined template fields using selectable options describing the presence, severity and distribution of radiographic findings. AI-SR used the same template, prefilled with automated suggestions generated by a convolutional neural network trained on bedside chest radiographs, allowing readers to confirm or modify entries.
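At its simplest, the AOI-based gaze analysis described above reduces to summing fixation durations that fall inside rectangular screen regions. The sketch below uses hypothetical fixation records and AOI bounds (the study's actual screen layout and eye-tracker output format are not specified here):

```python
# Hypothetical AOI bounds (x0, y0, x1, y1) on a 1920x1080 screen;
# the study's actual pixel layout is not reported here.
AOIS = {
    "report_field": (0, 0, 960, 1080),        # reporting interface, left half
    "radiograph_field": (960, 0, 1920, 1080)  # radiograph display, right half
}

def fixation_duration_by_aoi(fixations, aois):
    """Sum fixation duration (seconds) falling inside each area of interest."""
    totals = {name: 0.0 for name in aois}
    for x, y, dur in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x < x1 and y0 <= y < y1:
                totals[name] += dur
                break  # AOIs here do not overlap
    return totals

# Toy fixation samples: (x, y, duration_s)
fixations = [(1200, 500, 0.25), (300, 400, 0.5), (1500, 700, 0.75)]
print(fixation_duration_by_aoi(fixations, AOIS))
# → {'report_field': 0.5, 'radiograph_field': 1.0}
```

The same per-AOI aggregation extends naturally to visit duration, fixation count and saccade count by accumulating different fields of the fixation record.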
Diagnostic Accuracy and Reporting Efficiency
Diagnostic performance was measured as agreement with a reference standard established by majority vote among six expert radiologists. Accuracy was similar for free-text reporting and SR, with quadratic-weighted Cohen κ values of 0.58 and 0.60 respectively. AI-SR raised agreement to κ = 0.71, exceeding both other reporting modes, while the neural network operating independently achieved κ = 0.81 against the same reference standard.
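Quadratic-weighted Cohen κ penalises disagreements by the squared distance between ordinal severity categories, so a one-step error (e.g. mild vs moderate) costs far less than a normal-vs-severe error. A minimal pure-Python sketch using toy severity ratings (not the study's data):

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Quadratic-weighted Cohen's kappa for two ordinal rating series."""
    n = len(rater_a)
    # Observed joint distribution of rating pairs
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    # Marginal distributions of each rater
    pa = [sum(row) for row in observed]
    pb = [sum(observed[i][j] for i in range(n_categories))
          for j in range(n_categories)]
    # Quadratic disagreement weights: w_ij = (i - j)^2 / (k - 1)^2
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = (i - j) ** 2 / (n_categories - 1) ** 2
            num += w * observed[i][j]           # observed weighted disagreement
            den += w * pa[i] * pb[j]            # chance-expected disagreement
    return 1.0 - num / den

# Toy severity ratings on a 4-point scale (0 = normal … 3 = severe)
reader    = [0, 0, 1, 2, 1, 0, 3, 2, 0, 1]
reference = [0, 1, 1, 2, 2, 0, 3, 1, 0, 1]
print(round(quadratic_weighted_kappa(reader, reference, 4), 2))  # → 0.84
```

Perfect agreement yields κ = 1, and chance-level agreement yields κ ≈ 0, which gives context to the reported 0.58–0.81 range.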
Performance improvements associated with AI-SR were observed across both experience levels, although the magnitude differed. Novice readers showed larger improvements in diagnostic agreement when AI-SR replaced free-text reporting, whereas non-novice readers demonstrated smaller gains. These findings indicate that AI assistance may provide greater diagnostic support for less experienced readers while still improving performance among readers with prior radiography training.
Reporting time differed substantially across reporting modes. Mean reporting time per radiograph decreased from 88.1 seconds during free-text reporting to 37.3 seconds with SR and to 25.0 seconds with AI-SR. Novice readers demonstrated the largest reductions in reporting time when structured workflows were introduced. Non-novice readers remained faster overall across all reporting modes, with clear time reductions from free-text reporting to SR. Additional time savings with AI-SR compared with SR were primarily observed among novice readers, while differences between SR and AI-SR were smaller for non-novice readers.
Gaze Patterns and Visual Attention
Eye-tracking measurements revealed consistent differences in visual interaction across reporting modes. SR and AI-SR both reduced visual attention directed toward the reporting interface compared with free-text reporting. Total fixation duration in the report display field decreased from 11.4 seconds during free-text reporting to 4.8 seconds with SR and 3.6 seconds with AI-SR. Reductions in visit duration, fixation count and saccade count were also observed in the report display field when structured templates replaced narrative typing.
Changes in gaze distribution differed by experience level. Novice readers showed increased visual focus on the radiograph when using SR and AI-SR compared with free-text reporting. Their fixation patterns indicated reduced interaction with the reporting interface and greater attention directed toward image interpretation. Non-novice readers maintained relatively stable radiograph fixation duration across all reporting modes, suggesting that reporting interface changes had less influence on their visual search behaviour.
Heatmap visualisations demonstrated more concentrated fixation clusters during SR and AI-SR compared with free-text reporting. Visual attention was more strongly directed toward anatomical regions referenced by the structured template. Central and basal lung zones received the highest fixation density, while lung apices and extrapulmonary structures received less attention overall. These patterns reflected a more focused visual search strategy aligned with template-driven reporting fields. No meaningful differences in gaze metrics were observed between SR and AI-SR, indicating that AI-prefilled templates did not substantially alter visual search behaviour compared with SR alone.
User experience ratings supported these findings. SR and AI-SR were rated more positively than free-text reporting across measures of satisfaction, usability and perceived efficiency. AI-SR achieved the highest satisfaction ratings overall. Cognitive burden ratings were also more favourable for structured reporting modes than for free-text reporting. Despite these positive perceptions, readers frequently reported limited trust in AI-generated suggestions even when rating the technology as useful.
Structured reporting and AI-assisted templates significantly influenced chest radiography interpretation workflows, affecting reporting efficiency, diagnostic agreement and visual interaction patterns. SR reduced reporting time and decreased attention directed toward the reporting interface, while AI-prefilled templates further improved diagnostic agreement with an expert-derived reference standard and produced additional efficiency gains, particularly among less experienced readers. Eye-tracking analysis showed that structured workflows encouraged greater visual focus on radiographs for novice readers without substantially altering gaze behaviour among more experienced readers. User feedback indicated strong acceptance of structured reporting approaches alongside persistent caution regarding AI-generated suggestions, highlighting the importance of interface design and user trust in the integration of AI-supported reporting systems.
Source: Radiology
Image Credit: iStock