The integration of artificial intelligence in radiology has rapidly advanced, offering significant potential for predictive diagnostics and personalised care. However, the clinical reliability of these models hinges on their transparency and reproducibility, particularly through independent external validation. Despite growing emphasis on open science, many AI models remain unavailable, limiting scientific progress and patient benefit. A comprehensive meta-research study analysed AI development studies published in 2022 across five leading radiology journals, evaluating how frequently models and their supporting materials were shared to enable replication.


 
Low Model Availability and Its Implications
Among the 268 radiology AI development studies assessed, only 39.9% made their models available for replication. This availability was defined by the presence of sufficient technical information, including model architecture and weights for deep learning (DL) models or formulae and coefficients for regression models. Traditional regression-based models were the most transparent, with a 73.3% availability rate, while DL models had the lowest at just 11.5%. The limited availability of DL models is concerning, given their complexity and reliance on large, often proprietary datasets. Even when training code was shared, critical components like inference code or trained model weights were frequently absent, undermining the potential for external validation.
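To make concrete what sharing a DL model entails, the sketch below shows one common pattern in PyTorch: publishing the architecture definition together with a trained weights file so that external validators can reload the model for inference. The class name, layer sizes and file name are illustrative assumptions, not details from any of the reviewed studies.

```python
import torch
import torch.nn as nn

# Hypothetical architecture definition; for availability, this code must be
# published alongside the weights so outside groups can rebuild the network.
class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 512, n_classes: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Developers save and share the trained parameters as a weights file...
model = TinyClassifier()
torch.save(model.state_dict(), "model_weights.pt")

# ...and an external validator reconstructs the model and loads them for inference.
validator_model = TinyClassifier()
validator_model.load_state_dict(torch.load("model_weights.pt"))
validator_model.eval()
```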
Model availability also varied by journal and imaging modality. Radiologia Medica showed the highest transparency rate (65.0%), while studies using projection-based imaging and those in musculoskeletal imaging exhibited particularly low availability. Diagnostic models were less likely to be shared than prognostic ones, and studies that included external testing showed a slightly higher tendency to provide model access. These patterns suggest that both editorial policies and model type influence transparency. However, even in journals with code-disclosure policies, such as Radiology, model sharing remained low unless it was explicitly required, highlighting a gap between policy and practice.


Influence of Preprocessing Tools and Software Accessibility
Beyond model architecture and weights, reproducibility in AI research also depends on the availability of preprocessing tools and feature-extraction software. In a secondary analysis, the study applied stricter criteria by examining whether open-source or commercial software was used during preprocessing. When only studies using open-source software were considered, overall model availability dropped to 23.5%. For DL studies, this figure fell even lower, to 9.7%, reflecting the compounded challenge of replicating models that rely on proprietary tools or software developed in-house.

 

This finding underscores the importance of not only sharing models but also ensuring that the entire workflow, including preprocessing pipelines, is accessible and standardised. Radiomics packages, commonly used for feature extraction, were associated with lower model availability in multivariable analyses, possibly because their complexity or proprietary nature limits accessibility and adds barriers to full replication. Segmentation tools, although generally more robust and therefore often excluded from stringent assessments, still play a critical role in image preparation and should not be overlooked when assessing full transparency.
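As an illustration of what an accessible feature-extraction step can look like, the minimal sketch below uses the open-source pyradiomics package with fixed, documented settings; the file paths and parameter values are placeholders rather than details drawn from the reviewed studies.

```python
from radiomics import featureextractor  # open-source pyradiomics package

# Fixing and reporting preprocessing settings (e.g. resampling and bin width)
# is part of making a radiomics pipeline replicable by external groups.
settings = {"resampledPixelSpacing": [1, 1, 1], "binWidth": 25}
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)

# Placeholder inputs: an image volume and its segmentation mask.
features = extractor.execute("patient001_ct.nii.gz", "patient001_mask.nii.gz")

for name, value in features.items():
    print(name, value)
```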


Factors Associated with Greater Availability
The study conducted logistic regression analyses to determine which factors were associated with increased or decreased model availability. Traditional regression-based models stood out, with an odds ratio (OR) of 17.11 for being available, suggesting a strong positive association. External validation also emerged as a significant factor, with studies conducting external testing more likely to share their models. Conversely, the use of radiomics packages showed a negative association with model availability (OR 0.27), reflecting potential challenges tied to licensing or technical complexity.
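As a reminder of how such figures arise, an odds ratio is simply the exponentiated coefficient of a logistic regression. The sketch below fits such a model on synthetic data shaped loosely like the study's predictors; the variable names, effect sizes and resulting ORs are purely illustrative and are not the study's data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 268  # same order of magnitude as the number of reviewed studies

# Synthetic binary predictors standing in for study characteristics.
df = pd.DataFrame({
    "regression_model": rng.integers(0, 2, n),
    "external_testing": rng.integers(0, 2, n),
    "radiomics_package": rng.integers(0, 2, n),
})

# Synthetic outcome: whether the model was made available.
linear = -1.0 + 2.0 * df["regression_model"] + 0.5 * df["external_testing"] - 1.3 * df["radiomics_package"]
y = rng.binomial(1, 1 / (1 + np.exp(-linear)))

fit = sm.Logit(y, sm.add_constant(df)).fit(disp=0)

# Odds ratios are the exponentiated coefficients: OR > 1 indicates higher
# odds of model availability, OR < 1 indicates lower odds.
print(np.exp(fit.params))
```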


Interestingly, larger training sets (over 1,000 individuals) were associated with lower availability in univariable analyses. This counterintuitive finding may reflect concerns about data privacy, resource constraints or commercial motivations behind large-scale DL projects. Contrary to expectations, factors such as country of origin or imaging modality had limited predictive power in the multivariable model. The overarching insight is that simpler models and more rigorous validation practices promote transparency, while technological and institutional complexities reduce the likelihood of model sharing.

 

AI models in radiology, particularly deep learning systems, remain largely inaccessible to the broader research community, hindering reproducibility and external validation. Although traditional regression models fare better, the overall landscape reveals significant gaps in transparency. These limitations are further compounded when preprocessing software and radiomics tools are not readily accessible. Editorial mandates for code sharing have yet to fully address this issue, and regulatory, commercial or ethical barriers often restrict the release of trained models.


Improving the availability of AI models requires a multifaceted approach. Stricter model-sharing policies by journals, the use of accessible software throughout the development pipeline, and mechanisms for providing demo interfaces when full sharing is not feasible can help bridge the transparency gap. Encouraging researchers to disclose reasons for non-availability would also promote accountability. As radiology continues to evolve with AI, ensuring that these tools are not just innovative but also replicable and transparent will be essential for advancing patient care and maintaining scientific integrity.
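Where full model release is not possible, one pragmatic option is a hosted demo that lets reviewers and external researchers probe a model without receiving its weights. The sketch below, assuming the Gradio library, shows the general shape of such an interface; the prediction function is a placeholder rather than an actual radiology model.

```python
import gradio as gr
import numpy as np

def predict(image: np.ndarray) -> dict:
    # Placeholder: a real demo would run the trained model's inference here.
    score = float(np.clip(image.mean() / 255.0, 0.0, 1.0))
    return {"finding present": score, "finding absent": 1.0 - score}

demo = gr.Interface(
    fn=predict,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=2),
    title="Illustrative model demo",
)

if __name__ == "__main__":
    demo.launch()
```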

 

Source: European Radiology

Image Credit: Freepik


References:

Lee T, Lee JH, Yoon SH et al. (2025) Availability and transparency of artificial intelligence models in radiology: a meta-research study. Eur Radiol. 


