A recent review advocates for a fundamental shift in how artificial intelligence is integrated into intensive care units, moving away from a narrow focus on model development towards comprehensive human-centred AI system design. The discussion is based on the World Health Organization’s 2021 ethical framework, which establishes six core principles for AI in healthcare: protecting human autonomy, promoting well-being and safety, ensuring transparency and explainability, fostering responsibility and accountability, ensuring inclusiveness and equity, and promoting responsive and sustainable systems.
In ICU settings, where decisions carry life-or-death consequences, AI must augment rather than replace clinical judgment. Unlike clinicians, machines take no oath and bear no moral responsibility. AI outputs should therefore be explanatory rather than prescriptive, displaying confidence levels and allowing clinician override without friction. A significant concern is automation bias, in which clinicians over-rely on algorithmic recommendations; this is particularly problematic in high-pressure ICU environments with elevated cognitive load. The authors note that some AI systems can degrade human performance even when the AI's output is correct, highlighting the need for systems designed to prompt critical engagement rather than blind acceptance.
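As one illustration of what an explanatory, non-prescriptive output might look like in practice, the sketch below structures a recommendation as evidence plus a confidence value, and records a clinician override without any confirmation friction. This is a hypothetical design, not one proposed by the authors; all field names and values are illustrative.

```python
# A minimal sketch of an explanatory, non-prescriptive output object.
# Hypothetical structure: field names and values are illustrative,
# not taken from the review.
from dataclasses import dataclass

@dataclass
class AdvisoryOutput:
    finding: str              # what the model observed, phrased as evidence
    confidence: float         # calibrated probability shown to the clinician
    evidence: list[str]       # inputs driving the output
    overridden: bool = False
    override_reason: str = ""

    def override(self, reason: str) -> None:
        """Record a clinician override without blocking or confirmation steps."""
        self.overridden, self.override_reason = True, reason

advice = AdvisoryOutput(
    finding="Pattern consistent with early sepsis",
    confidence=0.72,
    evidence=["rising lactate", "MAP trend", "temperature instability"],
)
advice.override("Lactate rise explained by recent seizure")
```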
Any AI system operating in critical clinical environments must prioritise patient safety, clinician support, and broader public health benefits. AI should align with how clinicians think and work, supporting rather than displacing decision-making. Rigorous clinical trials should be used to compare standard care to AI-supported care. Real-world deployment requires usability studies and stress-testing, with AI co-designed alongside end users. Poorly integrated tools can increase cognitive burden, introduce delays, and diminish care quality. For example, adding AI-driven alarms to already overloaded ICU environments could reduce the signal-to-noise ratio and further erode responsiveness to truly critical alerts.
The authors distinguish between transparency (how a system was built and behaves), explainability (the rationale for specific outputs), and interpretability (whether outputs can meaningfully inform clinical decisions). Explainability is particularly crucial in critical care, where clinicians must quickly assess whether recommendations are trustworthy and contextually relevant. The review echoes Rudin's argument that interpretability, rather than post-hoc explainability, should be the goal in high-stakes machine learning, since post-hoc rationales can be unreliable or misleading. Systems should provide well-calibrated probabilities with explicit uncertainty measures, incorporate mechanisms to abstain when confidence is low, and offer counterfactual guidance showing how recommendations might vary under different interventions.
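To make the calibration-and-abstention idea concrete, the sketch below shows one possible pattern using scikit-learn: raw scores are calibrated so predicted probabilities track observed frequencies, and the system abstains inside an uncertainty band rather than issuing a recommendation. The model, synthetic features, and thresholds are assumptions for illustration; the review does not prescribe an implementation.

```python
# A minimal sketch of calibrated risk prediction with an abstention band.
# The model, synthetic features, and thresholds are illustrative assumptions,
# not an implementation prescribed by the review.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))               # stand-in vital-sign features
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)

# Calibrate raw scores so predicted probabilities track observed frequencies.
model = CalibratedClassifierCV(LogisticRegression(), method="isotonic", cv=5)
model.fit(X, y)

def predict_or_abstain(x, low=0.35, high=0.65):
    """Return a calibrated risk estimate, or abstain when confidence is low."""
    p = float(model.predict_proba(x.reshape(1, -1))[0, 1])
    if low < p < high:                       # too uncertain to advise
        return {"risk": p, "advice": "abstain: defer to clinician"}
    return {"risk": p, "advice": "high risk" if p >= high else "low risk"}

print(predict_or_abstain(X[0]))
```

Counterfactual guidance of the kind the authors describe would sit on top of such a model, but is beyond the scope of this sketch.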
Clear ethical and legal responsibilities must be assigned for AI systems deployed in ICUs. While some countries have established data governance frameworks, many jurisdictions, particularly in low- and middle-income countries, lack necessary infrastructure and legislation. The authors stress that regulatory approval does not guarantee prospective validation for safety and effectiveness. They highlight the need for post-deployment monitoring, continuous auditing, and a multidisciplinary AI Safety Committee comprising clinical leads, machine learning operations specialists, and quality representatives. Routine audits covering calibration, drift, alert quality, and subgroup performance should be automated and continuously logged.
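A continuous audit of this kind could be automated along the following lines. The metric choices (Brier score for calibration, a population stability index for drift, precision of high-risk alerts, subgroup AUROC) and all thresholds are illustrative assumptions, not requirements stated by the authors.

```python
# Illustrative sketch of an automated audit pass over recent predictions.
# Metric choices (Brier score for calibration, PSI for drift, alert precision,
# subgroup AUROC) and all thresholds are assumptions, not the authors' spec.
import logging
import numpy as np
from sklearn.metrics import brier_score_loss, roc_auc_score

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ai_safety_audit")

def population_stability_index(expected, actual, bins=10):
    """PSI between training-era and recent score distributions (drift check)."""
    cuts = np.quantile(expected, np.linspace(0, 1, bins + 1))
    e, _ = np.histogram(expected, cuts)
    a, _ = np.histogram(actual, cuts)
    e = np.clip(e / e.sum(), 1e-6, None)
    a = np.clip(a / a.sum(), 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

def run_audit(y_true, y_prob, train_scores, recent_scores, groups):
    log.info("calibration Brier score=%.3f", brier_score_loss(y_true, y_prob))
    log.info("drift PSI=%.3f", population_stability_index(train_scores, recent_scores))
    alerts = y_prob >= 0.8                   # assumed alert threshold
    if alerts.any():
        log.info("alert precision=%.3f", y_true[alerts].mean())
    for g in np.unique(groups):              # subgroup performance
        mask = groups == g
        if len(np.unique(y_true[mask])) == 2:
            log.info("AUROC[%s]=%.3f", g, roc_auc_score(y_true[mask], y_prob[mask]))

# Synthetic demonstration data
rng = np.random.default_rng(1)
y_prob = rng.uniform(size=400)
y_true = (rng.uniform(size=400) < y_prob).astype(int)
groups = rng.choice(["A", "B"], size=400)
run_audit(y_true, y_prob, rng.normal(size=400), rng.normal(0.3, 1.0, 400), groups)
```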
Bias in AI has profound implications for critical care. Models trained on non-representative data risk producing recommendations that are less effective, or even harmful, for marginalised populations, including racial minorities, women, children, and patients from low-resource settings. ICU-specific documentation practices exacerbate these risks, as records may be incomplete or reconstructed from memory, missing important qualitative data. While promoting fairness is crucial, operationalising these principles remains challenging, as fairness attributes are seldom recorded and widely accepted metrics have yet to be defined.
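As one illustration of how a fairness check might be operationalised where a group attribute happens to be recorded, the sketch below computes the gap in sensitivity (true-positive rate) across subgroups. This is one of many competing fairness definitions, consistent with the authors' point that no single metric is agreed upon; the data and attribute here are entirely synthetic.

```python
# One candidate fairness check, assuming a recorded group attribute:
# the gap in true-positive rate (equal opportunity) across subgroups.
# Purely illustrative; the review notes no single metric is agreed upon.
import numpy as np

def tpr_gap(y_true, y_pred, groups):
    """Max difference in sensitivity (TPR) between any two subgroups."""
    tprs = []
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)
        if mask.any():
            tprs.append(y_pred[mask].mean())   # share of true positives caught
    return max(tprs) - min(tprs)

y_true = np.array([1, 1, 0, 1, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 0, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(f"TPR gap: {tpr_gap(y_true, y_pred, groups):.2f}")
```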
Human-centred AI requires ongoing commitment, with systems evolving based on feedback, updated evidence, and changing clinical practices. The authors highlight challenges in workforce readiness and digital infrastructure, particularly in the global south, emphasising that investment in human capacity through training and support is central to sustainability. They also address AI’s environmental footprint as large-scale model training and deployment contribute substantially to carbon emissions. This concern directly intersects with social justice, as climate change disproportionately affects the populations most at risk of being left behind in digital health transformations.
Hence, an operational framework would integrate the WHO principles across three layers: the AI lifecycle, clinical workflow, and governance structures. Moving beyond model development requires designing comprehensive human-AI systems rooted in ethics, contextual relevance, and stakeholder engagement.
Source: Critical Care