Artificial intelligence and big data analytics are moving clinical decision support systems from early experimentation into real-world clinical practice. These tools combine clinical, imaging, patient-generated and administrative data to support diagnostic accuracy, prognostic estimation, risk stratification and more timely decisions. A viewpoint recently published in the Journal of Medical Internet Research links these gains to a persistent obstacle: access to large, high-quality and granular data. AI-informed systems can assist clinicians facing symptom complexity and uncertainty, yet their performance depends on data that are representative, interoperable, usable and protected. The central challenge is therefore not only algorithmic capability, but the ability of health systems to build data environments that make trustworthy information available while respecting privacy, governance and ethical constraints.

 

Clinical Promise Across Care Settings

AI-enabled clinical decision support already spans several high-value areas of care. In oncology, models use clinical variables, imaging, genomic profiles and longitudinal outcomes to support cancer subtype classification, survival prediction and individualised risk stratification. Explainable methods can show which clinical or molecular factors drive predictions, supporting clinician trust and shared decision-making. In organ transplantation, predictive analytics and optimisation algorithms analyse donor and recipient data to support matching that balances equity, urgency and expected benefit while seeking better graft use and post-transplant survival. Mismatches can lead to graft failure, patient morbidity and wasted organs.

 

Diabetic retinopathy screening shows how AI can extend access where ophthalmology capacity is limited. Deep learning models trained on large retinal image datasets can detect disease with accuracy comparable to expert clinicians, enabling point-of-care screening and earlier referral in rural and underserved settings. Other applications include post-traumatic epilepsy risk prediction after traumatic brain injury, rare disease characterisation using longitudinal electronic health record data and emergency department tools that analyse free-text notes to identify patients at risk of unplanned return visits. In rehabilitation planning after spinal cord injury, models can forecast independence scores, gait recovery and impairment scale changes. Across these examples, AI supports diagnosis, prevention, risk stratification and resource allocation, but implementation still depends on data quality, interpretability and workflow fit.

 

Data Access Remains the Main Bottleneck

Access to clinically rich data remains one of the most significant constraints on AI-enabled decision support. Health care data are sensitive, fragmented across institutions, governed by complex regulatory frameworks and costly to curate. Public and semi-public datasets have supported important progress, including cancer registry data from the Surveillance, Epidemiology, and End Results Program, transplant data from the United Network for Organ Sharing and administrative claims data from the Centers for Medicare and Medicaid Services. These resources offer scale, but their structure and access rules may limit the granularity needed for advanced clinical models.

 

Institution-managed and commercial real-world data platforms add clinical detail through electronic health records, laboratory results, medication histories, imaging metadata, clinical notes and care pathway information. Oracle Real-World Data and Epic Cosmos show how large EHR-based platforms can support cohort identification, model development and validation. Their secure, cloud-based environments can reduce some privacy risks by limiting direct data downloads. However, subscription fees, partnership requirements, contractual restrictions, approval processes and governance rules can restrict participation to well-resourced organisations.

 

National repositories also offer important resources. The Veterans Health Administration provides longitudinal patient data through VINCI and related registries, while the UK Biobank and All of Us combine multiple data types with structured access models. Even these initiatives face limitations linked to data quality, eligibility, institutional affiliation, strict use agreements and global accessibility.

 

Privacy-Preserving Models and Governance

Practical and ethical constraints shape every stage of AI development for clinical use. Electronic health record data may be incomplete, inconsistently documented or affected by diagnostic coding practices that do not always reflect confirmed conditions. Models trained on historical data may miss new treatments, changing standards of care or shifting disease patterns. Underrepresentation of some groups can also affect fairness, while opaque systems may weaken clinician trust when predictions influence high-stakes decisions. The safe use of AI therefore requires safeguards, transparency, accountability and continuing involvement of clinicians and patients in design and deployment.

 

Synthetic data and federated learning offer routes to wider model development without unrestricted movement of sensitive records. Synthetic datasets can preserve statistical properties and clinical relationships while reducing reidentification risk, supporting algorithm development, benchmarking and education. Their value depends on the quality, timeliness and representativeness of the underlying data, and they require ongoing governance to avoid drift, hidden bias and misleading outputs.

 

Federated learning sends models to local sites, where institutions train them without centralising raw patient records. Shared parameter updates refine a global model while maintaining institutional control over data. This approach may suit rare diseases, multicentre work and regulated AI development, although it adds technical complexity, coordination demands and pressure on existing infrastructure. Hybrid ecosystems that combine curated real-world data, synthetic data and federated architectures also need quality standards, common vocabularies, lineage tracking and guarded self-service access.
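The federated training loop described above can be illustrated with a minimal sketch. It is not the method of the viewpoint's authors: it assumes a simple weighted parameter averaging scheme (commonly called federated averaging), a two-parameter linear model and randomly generated toy records standing in for site data. All function names, site sizes and settings are hypothetical.

```python
import random


def local_update(weights, records, lr=0.1, epochs=5):
    """Train a copy of the global linear model on one site's own records."""
    w = list(weights)
    for _ in range(epochs):
        for features, target in records:
            prediction = sum(wi * xi for wi, xi in zip(w, features))
            error = prediction - target
            # Gradient step for squared error on a single record.
            w = [wi - lr * error * xi for wi, xi in zip(w, features)]
    return w


def federated_average(site_weights, site_sizes):
    """Combine site parameters, weighting each site by its record count."""
    total = sum(site_sizes)
    n_params = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(n_params)
    ]


def make_site(n, rng):
    """Hypothetical toy records: target = 2 * x + 1 plus small noise."""
    records = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        # The second feature is a constant 1.0 so the model learns an intercept.
        records.append(((x, 1.0), 2.0 * x + 1.0 + rng.gauss(0, 0.01)))
    return records


def run_federated_training(sites, rounds=20):
    """Each round, only parameters leave a site; raw records never do."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, site) for site in sites]
        global_weights = federated_average(updates, [len(s) for s in sites])
    return global_weights


rng = random.Random(0)
sites = [make_site(n, rng) for n in (30, 50, 80)]  # three sites of unequal size
final_weights = run_federated_training(sites)
print(final_weights)  # should land close to slope 2 and intercept 1
```

In this sketch the coordinator only ever sees parameter lists and record counts. Real deployments would add the elements the paragraph flags as costs, such as secure communication, coordination across sites and handling of heterogeneous local data.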

 

AI-enabled clinical decision support can improve diagnostic accuracy, risk stratification, resource use and patient outcomes across diverse clinical domains. Its progress now depends on more than model performance. High-quality data, interoperable systems, privacy protection, ethical governance, explainability and workflow alignment all determine whether tools can be trusted and used safely. Hybrid data ecosystems that combine curated real-world data, synthetic data and federated architectures offer a practical direction. Sustained benefit requires data environments that are secure, transparent, monitored and designed around clinical needs without compromising patient rights, dignity and equity.

 

Source: Journal of Medical Internet Research


References:

Daly JE, Delen D, Han Z et al. (2026) AI in Clinical Decision Support Systems: Promising Applications and Strategies for Managing Data Challenges. J Med Internet Res;28:e71532.



