Healthcare professionals around the world are facing increasing pressures due to rising patient loads, staff shortages, and escalating costs. These systemic challenges are straining the capacity to deliver consistent, high-quality care and are expected to worsen without effective intervention. In this context, artificial intelligence (AI) offers promising solutions to optimise workflows, support clinical decision-making, and improve patient outcomes, particularly in data-intensive settings like the ICU. 

As a result, research on AI in healthcare, especially within the ICU, has expanded rapidly. However, despite promising developments, clinical integration of AI remains limited. One major barrier is the lack of external validation and prospective implementation studies. For instance, only 9% of FDA-approved AI tools undergo prospective postmarket surveillance, and fewer than 2% are supported by robust publications on safety and efficacy.

Other challenges to AI implementation include poor integration with existing clinical workflows, regulatory and compliance hurdles, resistance to adoption in clinical practice, and unclear financial returns. Moreover, research funding is predominantly funnelled towards developing new AI models rather than implementing and evaluating those already developed.

To address these gaps, continuous scientific evaluation of AI applications is essential. Traditional systematic reviews provide comprehensive overviews but often lag behind the fast-paced developments in AI, rendering them outdated soon after publication. In contrast, living systematic reviews (LSRs) offer a more agile approach by continuously updating evidence, highlighting changes and ongoing barriers over time. This is especially valuable in dynamic fields like AI and in critical settings like the ICU, where timely, evidence-based updates are vital for clinical decision-making.

A 2021 systematic review called for shifting focus from AI development to clinical implementation in the ICU. Building on that foundation, a recent review examines the progress of AI models in this setting, evaluating their technical maturity, adherence to reporting standards, risk of bias, and global distribution. It also covers newer AI approaches such as large language models (LLMs) and reinforcement learning. The goal is to assess the readiness of AI applications for real-world clinical use and to identify the key barriers that must be addressed to support their effective integration into patient care.

The review reveals that despite rapid growth in AI research for the ICU, much of this increase is driven by retrospective studies focused on model development rather than clinical use. Although the proportion of studies with high risk of bias has decreased somewhat, the number of studies with unclear risk has risen, and adherence to reporting standards remains low.

The transition from AI development to clinical implementation is limited across all AI types, including generative models. This gap represents missed opportunities for patient care improvement and inefficient resource use. Key barriers include lack of external validation, limited prospective evaluation, poor workflow integration, regulatory complexity, slow adoption, weak business cases, and insufficient funding. The authors call for coordinated, comprehensive strategies prioritising funding, collaboration on validation, prospective studies, and ongoing monitoring to enable sustainable clinical adoption. LSRs, like this one, are crucial for tracking progress and identifying challenges over time.

Their findings align with prior reviews highlighting scarce real-world evaluations and uneven geographic and specialty representation, but they add dynamic insights by comprehensively assessing technical maturity, measured as technology readiness level (TRL), and risk of bias. The review also highlights systemic hurdles, such as the mismatch between the large number of registered AI clinical trials and the low rate of approvals by agencies like the FDA, pointing to difficulties in translating research into practice.

Ethical checklists and implementation science frameworks are emphasised as key tools to bridge the gap between development and application. Collaborative networks and consensus frameworks focused on transparency and maturity in AI integration are essential for establishing responsible practices. The review stresses the importance of clear reporting standards, noting that despite modest improvements, major gaps persist, and the rising share of studies with unclear risk of bias undermines confidence and reproducibility.

As AI, including generative models, continues to evolve rapidly, uniform and dynamic evaluation approaches such as LSRs are vital to maintain relevant, comparable assessments over time. This approach also reduces duplication of effort and aligns with the system-wide transformative potential of AI, similar to the rapid innovation seen during the COVID-19 pandemic. Bias and lack of diversity remain major challenges: AI models are often trained on datasets that underrepresent marginalised groups, perpetuating health inequities such as underdiagnosis and stereotyping.

Overall, this review shows a rapid rise in AI research in intensive care, mainly through retrospective studies, but clinical implementation, including for generative AI, remains limited. It underscores the urgent need for a shift toward operationalising and prospectively testing AI applications to ensure real clinical benefits. 

Source: JAMA Network Open

References:

Berkhout WEM, van Wijngaarden JJ, Workum JD et al. (2025) Operationalization of Artificial Intelligence Applications in the Intensive Care Unit: A Systematic Review. JAMA Netw Open. 8(7):e2522866.