Thyroid nodules are common and often benign, yet their evaluation frequently leads to overdiagnosis and unnecessary invasive procedures. Traditional diagnostic approaches such as ultrasound imaging and fine-needle aspiration biopsy are limited by their dependence on the radiologist’s experience and subjective judgement. Although artificial intelligence has emerged as a promising aid, many models lack transparency and are poorly integrated into clinical practice. ThyGPT, a multimodal generative pre-trained transformer, offers a novel solution. It aims to improve diagnostic accuracy while enabling explainable interaction between AI and clinicians, thereby transforming thyroid nodule assessment and management.

 

Limitations of Traditional CAD and the Need for ThyGPT 
Conventional computer-aided diagnosis (CAD) systems using ultrasound images have improved risk stratification for thyroid nodules, but they come with critical drawbacks. Most models operate as opaque "black boxes," unable to explain their reasoning or provide rationale for their outputs. Their communication is limited to static labels or scores, reducing clinician trust and hindering meaningful interaction. These limitations have stymied clinical adoption and led many radiologists to abandon such tools. Addressing this gap, ThyGPT introduces the concept of AI-generated content-enhanced CAD (AIGC-CAD), combining large language models with image analysis. Unlike earlier models, ThyGPT can engage in question-and-answer sessions, justify its predictions and adapt to the clinical reasoning process. This interaction is essential to regain trust in AI and enable its integration into medical workflows. 

 

Diagnostic Performance and Clinical Integration 
ThyGPT was trained on an extensive dataset of over 511,000 ultrasound images and nearly 50,000 reports from 59,406 patients across nine hospitals. Evaluated on two external test sets, the model significantly improved diagnostic outcomes. When junior and senior radiologists used ThyGPT, their sensitivity and specificity increased across the board, surpassing the model’s performance alone. Specifically, average sensitivity rose from 0.802 to 0.893 and specificity from 0.809 to 0.922. Moreover, ThyGPT enabled nuanced decision-making through a three-tier scoring strategy: high-risk nodules prompted surgery without biopsy, moderately suspicious ones followed standard guidelines and low-risk nodules warranted observation. This approach reduced unnecessary biopsies from 64.2% to 23.3% while halving the rate of missed malignancies. Importantly, the AI’s predictions triggered beneficial diagnostic revisions in over 10% of cases, particularly among junior radiologists. The system demonstrated high accuracy when radiologists chose to alter their initial judgment, with an error rate as low as 0.2%. The added clarity and collaborative format of ThyGPT promoted its role as a clinical copilot, facilitating safer and more informed diagnostic decisions. 
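The three-tier strategy described above can be illustrated with a minimal sketch. It assumes the model emits a malignancy score between 0 and 1; the two cut-off values and the function name are hypothetical, chosen only for illustration, and are not the thresholds reported by the study.

```python
# Hypothetical cut-offs for illustration only; the source article does not
# report the thresholds that separate the three tiers.
LOW_RISK_MAX = 0.2
HIGH_RISK_MIN = 0.9


def triage(malignancy_score: float) -> str:
    """Map a model score in [0, 1] to one of the three management tiers."""
    if not 0.0 <= malignancy_score <= 1.0:
        raise ValueError("malignancy_score must lie between 0 and 1")
    if malignancy_score >= HIGH_RISK_MIN:
        # High-risk nodules: surgery without a prior biopsy.
        return "surgery without biopsy"
    if malignancy_score <= LOW_RISK_MAX:
        # Low-risk nodules: observation only.
        return "observation"
    # Moderately suspicious nodules: follow standard guidelines.
    return "follow standard guidelines"


if __name__ == "__main__":
    for score in (0.05, 0.5, 0.95):
        print(score, "->", triage(score))
```

Only the middle tier remains subject to the usual guideline-based work-up in this scheme, which is consistent with the reported fall in unnecessary biopsies.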

 

Real-Time Error Detection and Future Applications 
Beyond diagnosis, ThyGPT excels in error detection within ultrasound reports. In a test set of 1,263 reports, including 157 with known errors, ThyGPT identified inaccuracies with a 90.5% detection rate, outperforming all human radiologists. It processed reports 1,610 times faster than humans, highlighting its potential for real-time quality control. The system was especially effective in catching side confusions and inconsistencies between text and images, critical issues in radiology reporting. When combined with human oversight, detection accuracy rose even higher, particularly benefiting junior staff. The model’s cross-modal capability enables it to align visual data with textual descriptions, ensuring semantic consistency. Additionally, multilingual validation showed no significant performance drop, indicating the system’s global applicability. These strengths position ThyGPT not only as a diagnostic assistant but also as a tool for standardising and improving reporting practices across diverse clinical environments. It supports overburdened clinicians and could help fill gaps in regions with radiologist shortages.
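To make the idea of a side confusion concrete, the toy check below flags a report whose text names a different side from the one recorded for the image. It is a rule-based illustration only, not ThyGPT's cross-modal method; the function names and the assumption that each image carries a "left" or "right" label are hypothetical.

```python
import re


def report_sides(report_text: str) -> set[str]:
    """Return the sides ("left", "right") mentioned in the report text."""
    matches = re.findall(r"\b(left|right)\b", report_text, flags=re.IGNORECASE)
    return {m.lower() for m in matches}


def has_side_conflict(report_text: str, image_side: str) -> bool:
    """Flag a report that mentions a side but never the side of the image.

    image_side is a hypothetical label ("left" or "right") assumed to be
    available for the image; reports that mention no side are not flagged.
    """
    sides = report_sides(report_text)
    return bool(sides) and image_side.lower() not in sides


if __name__ == "__main__":
    print(has_side_conflict("Hypoechoic nodule in the left lobe.", "right"))
    print(has_side_conflict("Hypoechoic nodule in the right lobe.", "right"))
```

A keyword rule like this catches only explicit wording mismatches; detecting inconsistencies between the described findings and the image content itself requires a model that reads both, which is the capability the article attributes to ThyGPT.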

 

ThyGPT represents a major advancement in AI-assisted radiology, combining the interpretability of large language models with the precision of medical imaging analysis. By facilitating interactive diagnosis, improving accuracy, reducing unnecessary procedures and detecting report errors in real time, it redefines the role of CAD systems in clinical practice. Its success in both junior and senior radiologist workflows underscores its broad utility. While limitations remain, particularly in subtype detection and device variation, the model's transparent and adaptive nature provides a robust foundation for future CAD development. ThyGPT marks a shift towards a more collaborative, efficient and trustworthy integration of AI in medical diagnostics. 

 

Source: npj Digital Medicine



References:

Yao J, Wang Y, Lei Z et al. (2025) Multimodal GPT model for assisting thyroid nodule diagnosis and management. npj Digit. Med. 8:245.


