Artificial intelligence–generated content (AIGC) is advancing rapidly across nuclear medicine imaging (NMI), promising software-based gains in denoising, motion correction, attenuation correction and cross-modality translation. These approaches may streamline workflows, reduce radiation exposure and improve quantitative accuracy. Alongside these benefits, a critical risk has emerged: AIGC can fabricate realistic yet false image content that misrepresents anatomy or function, undermining diagnostic confidence and clinical safety. A domain-specific, shared framework is needed to name, detect and mitigate these failures. The DREAM report addresses this need by proposing a focused definition, illustrating representative failure modes, outlining multi-level evaluation approaches and mapping root causes to practical safeguards so that AIGC in NMI can be deployed more safely.
What Hallucinations Mean in Nuclear Medicine
Definitions of hallucination vary widely across the literature. For NMI, the DREAM report recommends a narrow, operational meaning: AI-fabricated abnormalities or artifacts that look plausible and realistic yet are factually false, deviating from anatomic or functional truth or lacking support from measurement when ground-truth images are unavailable. This scope excludes general artifacts from traditional workflows and distinguishes hallucinations from other AI errors, such as lesion omission or uniform intensity shifts, which are treated as illusions rather than fabrications.
Representative risks span common AIGC tasks in NMI. During image enhancement, visually impressive SPECT or PET denoising can introduce false perfusion or lesion-like signals. In AI-based attenuation correction, synthetic maps derived from emission data may embed subtle but consequential false structures despite good visual agreement with references. Cross-modality translation is particularly vulnerable when attempting to infer functional abnormalities from structural data or vice versa, because pathophysiology may precede or bypass visible morphologic change. While such synthesis holds value for PET/MRI attenuation correction or dataset augmentation, its direct diagnostic substitution remains prone to misleading fabrications. By anchoring the definition in fabricated, realistic-looking abnormalities, the framework targets the errors most likely to deceive readers and propagate clinical harm.
How to Measure and Monitor Hallucinations
Robust deployment requires dedicated detection and evaluation beyond conventional image quality metrics. Image-based approaches include the hallucination index, which compares AIGC output against a zero-hallucination reference constructed to match signal-to-noise, and radiomics analyses that probe whether clinically relevant features in regions of interest remain statistically consistent with references. Both can reveal subtle divergences that plain visual scoring may miss, though they may also capture non-hallucinatory discrepancies and need tailoring to isolate fabrications.
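To make the reference-based idea concrete, the Python sketch below scores region-wise discrepancy between an AIGC output and a noise-matched, hallucination-free reference using simple first-order statistics. The function name, the statistics chosen and the synthetic data are illustrative assumptions, not the published hallucination index or a validated radiomics pipeline.

```python
import numpy as np

def hallucination_score(ai_img, ref_img, roi_masks):
    """Illustrative region-wise discrepancy between an AIGC output and a
    noise-matched reference. A real hallucination index would use a more
    principled statistical distance; this is a minimal sketch."""
    scores = {}
    for name, mask in roi_masks.items():
        ai_vals, ref_vals = ai_img[mask], ref_img[mask]
        # Normalised gaps in mean and spread within the region of interest
        mean_gap = abs(ai_vals.mean() - ref_vals.mean()) / (abs(ref_vals.mean()) + 1e-8)
        std_gap = abs(ai_vals.std() - ref_vals.std()) / (ref_vals.std() + 1e-8)
        scores[name] = mean_gap + std_gap
    return scores

# Synthetic example: the AI output contains a fabricated focal "lesion"
rng = np.random.default_rng(0)
ref = rng.normal(1.0, 0.1, size=(64, 64, 64))
ai = ref.copy()
ai[30:34, 30:34, 30:34] += 0.8
roi = {"myocardium": np.zeros_like(ref, dtype=bool)}
roi["myocardium"][28:36, 28:36, 28:36] = True
print(hallucination_score(ai, ref, roi))
```

In practice, such region-wise checks would be tuned per task so that ordinary noise differences are not flagged as fabrications.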
When paired references are unavailable, dataset-level strategies become useful. Neural hallucination detection quantifies deviations in feature space relative to a calibration bank (sketched below), while no-gold-standard evaluation adapts quantitative imaging methodology to compare precision across models without assuming ground truth, acknowledging that it may reflect general error rather than hallucination alone. Clinically focused assessment remains essential: downstream task performance, expert Likert scoring augmented with bounding boxes and concise descriptors, and sampled case review to balance feasibility with granularity. Automation can ease the burden, yet NMI lacks benchmark datasets annotated specifically for hallucinations, limiting the training of reliable detectors. Building multi-institutional, expandable repositories with standardised criteria would enable scalable, clinically aligned monitoring.
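As a rough illustration of the calibration-bank idea, the sketch below measures how far a test image's feature vector sits from a bank of features extracted from images judged hallucination-free, using a Mahalanobis-style distance. The feature extractor, dimensionality and any decision threshold are assumptions for illustration only.

```python
import numpy as np

def feature_deviation(test_feats, calib_feats):
    """Distance of a test image's feature vector from a calibration bank
    of reference features (Mahalanobis-style). Hypothetical stand-in for
    the feature-space deviation the report describes."""
    mu = calib_feats.mean(axis=0)
    cov = np.cov(calib_feats, rowvar=False)
    cov += 1e-6 * np.eye(cov.shape[0])          # regularise for invertibility
    inv_cov = np.linalg.inv(cov)
    diff = test_feats - mu
    return float(np.sqrt(diff @ inv_cov @ diff))

# Calibration bank of 200 reference feature vectors (16-dimensional here)
rng = np.random.default_rng(1)
bank = rng.normal(0.0, 1.0, size=(200, 16))
in_dist = rng.normal(0.0, 1.0, size=16)          # consistent with the bank
shifted = rng.normal(3.0, 1.0, size=16)          # suspect case drifting away
print(feature_deviation(in_dist, bank), feature_deviation(shifted, bank))
```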
Regulatory and postmarket perspectives also matter. Cleared commercial tools exist, and draft device guidance recognises that erroneous outputs erode reliability and trust, advocating lifecycle approaches and rigorous validation. Routine clinical monitoring is not universally mandated, creating a gap that professional initiatives seek to address. Within such frameworks, hallucinations warrant explicit tracking, including thresholds that balance dose-reduction claims against fabrication risk, so visual gains do not mask inaccuracies.
Where Hallucinations Come From and How to Reduce Them
Hallucinations arise when the learned mapping from source to target images diverges from the true relationship. Data, learning and model factors each contribute, and mitigations should match the cause. Domain shift is a major driver: if training distributions overrepresent specific patterns or underrepresent rare pathologies, models may hallucinate familiar features or misbehave on out-of-distribution cases. Mitigations include clearly defining intended use and limits, improving data quality, quantity and diversity across scanners, protocols and populations, leveraging federated learning and applying domain adaptation when broad datasets are not feasible. Transfer learning and continuous updates can strike a balance between generalisation and specialisation, while retrieval-augmented workflows face constraints in NMI due to limited structured visual knowledge sources.
Data nondeterminism introduces aleatoric uncertainty from acquisition noise and ill-posed inverse problems, yielding one-to-many plausible outputs. Better acquisition, systematic cleaning and rigorous preprocessing can reduce variability, though practical constraints remain. Even with strong data, input perturbations or suboptimal prompts can trigger failures; structured prompts that encode organs, noise levels or anatomic expectations have improved fidelity in denoising and translation tasks (see the sketch below).
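A structured prompt of this kind may be no more than a small, well-defined metadata record passed to the model alongside the image. The schema below is purely hypothetical and is meant only to show the sort of organ, noise and anatomy hints the report describes, not a published interface.

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class DenoisingPrompt:
    """Hypothetical structured prompt for an AIGC denoising model;
    field names are assumptions, not a published schema."""
    organ: str
    tracer: str
    acquired_counts: str          # e.g. "low-dose, ~25% of standard"
    noise_level: str              # coarse descriptor the model was trained with
    anatomic_expectations: list   # structures that must not be altered

prompt = DenoisingPrompt(
    organ="myocardium",
    tracer="99mTc-sestamibi",
    acquired_counts="low-dose, ~25% of standard",
    noise_level="high",
    anatomic_expectations=["left ventricular wall", "papillary muscles"],
)
# The serialised prompt would accompany the image at inference, constraining
# the solution space and discouraging fabricated structure.
print(json.dumps(asdict(prompt), indent=2))
```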
From a learning perspective, underspecification means multiple models may meet validation targets yet differ in faithfulness. Ensemble or feature averaging across runs can suppress spurious signals at computational cost (see the sketch after this paragraph). Human-in-the-loop alignment allows experts to steer models toward plausible solutions, complemented by automated fact-checking layers that flag suspect content based on rules, heuristics or learned detectors. At the model level, limited perceptual understanding can be addressed by integrating auxiliary priors and constraints. Multimodal conditioning with demographic and disease-specific biomarkers has preserved pathological features in PET synthesis. Anatomically and metabolically informed diffusion models, or task-specific loss functions aligned with clinical endpoints such as perfusion defect detection, have reduced fabrications by guiding feature extraction toward medically relevant structure and function.
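The following sketch shows one way ensemble averaging across independently trained or independently seeded runs could suppress a run-specific fabrication while flagging voxels where the runs disagree. The disagreement threshold and the synthetic example are assumptions, not parameters from the report.

```python
import numpy as np

def ensemble_average(outputs, voxelwise_std_thresh=None):
    """Average outputs from several independent runs of a generative model.
    Spurious, run-specific signals tend to cancel, consistent structure is
    preserved, and high voxel-wise disagreement can be flagged for review."""
    stack = np.stack(outputs, axis=0)
    mean_img = stack.mean(axis=0)
    std_img = stack.std(axis=0)
    flag = std_img > voxelwise_std_thresh if voxelwise_std_thresh is not None else None
    return mean_img, std_img, flag

# Five runs agree everywhere except one fabricated hotspot in a single run
rng = np.random.default_rng(2)
base = rng.normal(1.0, 0.05, size=(32, 32))
runs = [base + rng.normal(0, 0.05, size=base.shape) for _ in range(5)]
runs[0][10:12, 10:12] += 1.0
avg, disagreement, suspect = ensemble_average(runs, voxelwise_std_thresh=0.2)
print(suspect.sum(), "voxels flagged for review")
```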
AIGC is reshaping NMI workflows yet introduces a distinctive safety risk when models fabricate realistic but false abnormalities. The DREAM report frames a practical, NMI-specific definition, demonstrates where fabrications emerge and recommends layered evaluation spanning image statistics, dataset-level analysis and clinically grounded assessment, with an emphasis on building annotated benchmarks. Mitigation should target data diversity and stability, address underspecification through averaging and expert alignment and embed anatomic, functional and task-based priors into model design. With explicit monitoring and thoughtful safeguards, healthcare organisations can realise AIGC efficiency gains while constraining hallucination risk in routine practice.
Source: Journal of Nuclear Medicine