The rapid rise of large language models (LLMs) and other generative foundation models has prompted a familiar question in healthcare: can general-purpose systems replace task-specific tools? Generative models can draft notes, produce trial summaries, assist with diagnostic reasoning and even model protein structures, which makes endpoint-driven medical algorithms look dated. Yet the current landscape shows a more nuanced reality. Specialised medical algorithms, validated for specific endpoints and embedded into workflows, remain the regulatory-tested pathway to consistent quality, performance and trust, while generative approaches redefine what might be possible next. The direction of travel is integration, applying each approach where it performs best.
Generative Models: Pace, Scale and Limits
Two 2025 programmes highlight both momentum and constraint. CoMET, developed by Epic Systems and Microsoft Research, was trained on more than 300 million patient records and 16 billion encounters to anticipate medical events by simulating future health timelines. Its performance matched or outperformed traditional specialised models in some disease areas, yet it fell short in others, and its efficacy depended on model scale. In parallel, Delphi-2M, a modified Generative Pre-trained Transformer published in Nature by researchers from the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI), learned the natural history of human disease from 400,000 UK Biobank participants and was externally validated in 1.93 million individuals from Denmark. It predicts the rates of more than 1,000 diseases (1,258 disease states covering the full ICD-10 list) and estimates the timing of multiple conditions up to 20 years before onset in a multimorbidity context.
Both initiatives also set clear boundaries. Generative models are not yet ready for point-of-care use, quite apart from the absence of suitable regulatory frameworks. Delphi-2M reported selection and immortality bias, missingness in its data sources and limits to achieved prediction performance. Both CoMET and Delphi-2M showed inconsistent accuracy across diseases, underscoring that scale and sophistication do not yet guarantee uniform, clinically dependable outputs.
Why Specialised Algorithms Still Anchor Trust
Trust in clinical AI hinges on accuracy and reproducibility, which specialised medical algorithms continue to deliver. These systems are validated against defined clinical endpoints and measurable outcomes, and are explainable, auditable and regulator-approved. Crucially, they are embedded into clinical workflows so that benefits apply automatically and equitably across patients. Where head-to-head comparisons are available, specialised models currently retain an edge in diagnosis and risk prediction for diabetes, and in forecasting specific chronic disease outcomes such as the 1–3 year risk of chronic heart failure, stroke, heart attack or atherosclerotic cardiovascular disease in hyperlipidaemia.
Generative models add complementary value by simulating potential trajectories in a patient’s health journey. Rather than focusing on a single event, they propose quantitative, data-driven scenarios for what may happen next, which supports proactive planning. The present reality is therefore a division of labour: specialised algorithms deliver dependable decisions at the point of care, while generative systems explore broader contextual possibilities that can inform future care pathways.
Hybrid Intelligence for Safer, Smarter Decision Support
A practical path forward pairs the two approaches in a hybrid architecture. At the frontline, endpoint-driven specialised algorithms continue to predict, interpret and decide. In the background, generative models learn continuously from real-world data, identify subtle population patterns and surface insights that can refine future tools. Over time, specialised models may benefit from generative systems that consider wider clinical context, while fine-tuning generative models can improve task-specific performance. In operational terms, a foundational model could trigger a specialised algorithm when appropriate, combining breadth with precision.
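The trigger pattern described above can be sketched in code. This is a purely illustrative Python outline, not taken from CoMET, Delphi-2M or any real system: the function names, scores and threshold are all hypothetical stand-ins for a foundation model's broad screen handing off to a validated, endpoint-specific algorithm.

```python
# Hypothetical sketch of the hybrid "trigger" pattern: a broad
# foundation-model screen hands off to a validated, endpoint-driven
# algorithm when a relevant risk surfaces. All names, scores and
# thresholds are illustrative, not from any real clinical system.

from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    patient_id: str
    features: dict = field(default_factory=dict)  # e.g. labs, vitals, history


def foundation_model_screen(record: PatientRecord) -> dict:
    """Stand-in for a generative model scoring broad disease trajectories."""
    # Illustrative fixed output: per-condition scenario scores.
    return {"heart_failure": 0.72, "stroke": 0.18}


def specialised_hf_risk(record: PatientRecord) -> float:
    """Stand-in for a validated, endpoint-driven 1-3 year HF risk model."""
    return 0.65  # would be a regulator-approved algorithm in practice


TRIGGER_THRESHOLD = 0.5  # hypothetical hand-off cut-off


def hybrid_decision(record: PatientRecord) -> dict:
    """Breadth first, then precision: trigger the specialised model
    only for conditions the foundation model flags above threshold."""
    scenarios = foundation_model_screen(record)
    decisions = {}
    for condition, score in scenarios.items():
        if score >= TRIGGER_THRESHOLD and condition == "heart_failure":
            decisions[condition] = specialised_hf_risk(record)
    return decisions
```

The design point is the division of labour: the foundation model supplies breadth (many candidate trajectories), while the point-of-care decision itself always comes from the specialised, endpoint-validated component.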
Strengthening healthcare-specific foundation models by connecting them to trusted clinical knowledge sources helps to improve reliability. Linking to knowledge graphs or applying retrieval-augmented generation (RAG) grounds outputs in facts and supports higher-quality reasoning. These measures also dampen fears of hallucinations and create mechanisms for validation and audit, which are essential for trust and adoption in health systems. For enterprise leaders, progress rests on parallel tracks: investing in validated, regulator-approved specialised algorithms that deliver performance today; creating safe sandboxes to learn from real-world data, benchmark against clinical ground truth and mitigate bias with RAG or knowledge graphs; and building literacy and organisational readiness through governance that treats AI as a continuously learning ecosystem in which accountability, validation and equity are integral.
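To make the RAG grounding idea concrete, here is a minimal sketch assuming a small in-memory store of trusted guideline snippets. The keyword-overlap scoring and the snippets themselves are illustrative only; a production system would use vector embeddings, a curated clinical knowledge base or a knowledge graph.

```python
# Minimal retrieval-augmented generation (RAG) sketch: retrieve the most
# relevant trusted snippets, then build a prompt that constrains the model
# to answer only from that evidence. Scoring is naive keyword overlap,
# purely for illustration.

def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    """Rank snippets by shared words with the query (illustrative scoring)."""
    query_terms = set(query.lower().split())
    scored = sorted(
        knowledge_base,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def grounded_prompt(query: str, knowledge_base: list[str]) -> str:
    """Build a prompt that forces the model to cite retrieved evidence."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, knowledge_base))
    return (
        "Answer using ONLY the sources below; say 'unknown' otherwise.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical trusted snippets, for illustration only.
kb = [
    "Guideline: annual HbA1c screening is recommended for patients with prediabetes.",
    "Guideline: statin therapy reduces ASCVD risk in hyperlipidaemia.",
    "Note: knowledge graphs can supply structured relations for grounding.",
]
prompt = grounded_prompt("When should HbA1c screening occur for prediabetes?", kb)
```

Because the retrieved snippets appear verbatim in the prompt, every answer can be audited back to a named source, which is the validation-and-audit mechanism the paragraph above refers to.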
Clinical AI is evolving toward models that understand the language of disease across individual and population journeys, bringing the concept of a virtual patient closer to practical use. Foundational models reveal latent structures in longitudinal health data, while specialised algorithms provide dependable, regulator-approved performance at the bedside. A hybrid intelligence approach reconciles these strengths, enabling safe implementation now and better personalisation over time. For healthcare professionals and leaders, the imperative is to combine present-day reliability with structured innovation so that decision support becomes both accountable and increasingly capable without compromising patient safety.
Source: Healthcare Transformers