Physician fatigue poses a serious risk to decision-making quality in emergency departments, where high stakes and intense workloads are routine. Traditionally, fatigue has been measured indirectly through shift length, overnight duty or work frequency, which do not fully capture the nuanced, real-time condition of physicians during patient encounters. A new approach proposes using clinical notes as a source of insight into physician fatigue, capitalising on the linguistic patterns that emerge under cognitive stress. By applying machine learning techniques to these notes, researchers aim to create a more precise and actionable measure of fatigue, potentially improving both clinician well-being and patient outcomes. 

 

Quantifying Fatigue Using Textual Signals 

Using data from over 129,000 emergency department visits at a single academic medical centre, researchers developed a machine learning model to classify physician notes according to the authoring physician’s recent workload. Specifically, physicians who had worked at least four of the previous seven days were considered “high-workload” and presumably fatigued. The model was trained on a balanced dataset using features such as note length, readability, word predictability, and the frequency of specific linguistic elements, including cognitive and affective word categories. 
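
A minimal sketch of this kind of classifier is given below, assuming a handful of hand-built features and a standard scikit-learn setup; the word lists, feature definitions and data conventions are illustrative placeholders rather than the authors’ actual pipeline.

```python
# A minimal sketch of a workload-based note classifier; the lexicons and
# features below are illustrative placeholders, not the authors' pipeline.
import re
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Tiny placeholder lexicons standing in for LIWC-style word categories.
COGNITIVE_WORDS = {"think", "know", "consider", "because", "therefore"}
AFFECTIVE_WORDS = {"worried", "concerned", "relieved", "frustrated"}

def note_features(text):
    """Crude per-note features: length, readability proxies, category rates."""
    words = re.findall(r"[a-zA-Z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = max(len(words), 1)
    avg_sentence_len = n_words / max(len(sentences), 1)   # readability proxy
    avg_word_len = sum(len(w) for w in words) / n_words   # readability proxy
    cog_rate = sum(w in COGNITIVE_WORDS for w in words) / n_words
    aff_rate = sum(w in AFFECTIVE_WORDS for w in words) / n_words
    return [n_words, avg_sentence_len, avg_word_len, cog_rate, aff_rate]

def fit_fatigue_classifier(notes, high_workload):
    """Fit a classifier of high-workload (proxy-fatigue) notes."""
    X = np.array([note_features(n) for n in notes])
    y = np.array(high_workload)  # 1 if the physician worked >= 4 of the last 7 days
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X_tr, y_tr)
    print("held-out AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    return clf
```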

 


Interestingly, one of the strongest indicators of fatigue was the predictability of language, measured through the perplexity score of a fine-tuned language model. Lower perplexity—more predictable text—correlated with greater fatigue, suggesting that fatigued physicians rely on formulaic expressions, potentially reflecting decreased cognitive engagement. Other markers included reduced use of first-person pronouns and insight words, and an increased use of certainty terms and anger-related words. These patterns point to a measurable shift in linguistic behaviour under fatigue, forming the basis for an interpretable model. 
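
The sketch below shows one common way to compute note-level perplexity with a causal language model from the Hugging Face transformers library; the study fine-tuned its own model on clinical text, so the off-the-shelf gpt2 checkpoint here is only a stand-in.

```python
# A hedged sketch of note-level perplexity scoring; "gpt2" is a stand-in for
# the study's fine-tuned clinical language model.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def note_perplexity(text):
    """Perplexity of a note; lower values indicate more predictable language."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())  # exp of mean token cross-entropy

# Under the reported pattern, notes written by presumably fatigued physicians
# would tend toward lower perplexity (more formulaic wording).
```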

 

Validation Across Contexts and Correlation with Decision Quality 

To validate the model’s capacity to detect fatigue, researchers examined whether its predictions correlated with settings known to be fatiguing. Notes written during overnight shifts and those associated with high patient volumes on a single shift had significantly higher predicted fatigue scores, despite these variables not being part of the model’s training data. Additionally, greater variability in a physician’s recent shift start times—a measure of circadian disruption—was associated with increased predicted fatigue. 
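
As an illustration of this kind of check, the sketch below computes the variability of recent shift start times per physician and correlates it with mean predicted fatigue; the records, column names and scores are hypothetical, not the study’s data.

```python
# Illustrative circadian-disruption check on hypothetical per-shift records.
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical data: physician id, shift start hour (0-23), and the mean
# predicted fatigue score of that shift's notes.
shifts = pd.DataFrame({
    "physician_id":      [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "start_hour":        [7, 8, 7, 7, 15, 23, 8, 8, 9],
    "predicted_fatigue": [0.31, 0.28, 0.33, 0.52, 0.61, 0.66, 0.35, 0.30, 0.37],
})

# Standard deviation of recent start hours as a rough circadian-disruption
# proxy (ignoring wrap-around at midnight for simplicity).
per_physician = shifts.groupby("physician_id").agg(
    start_time_sd=("start_hour", "std"),
    mean_fatigue=("predicted_fatigue", "mean"),
)

rho, p = spearmanr(per_physician["start_time_sd"], per_physician["mean_fatigue"])
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```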

 

Crucially, predicted fatigue was not only correlated with conditions of fatigue but also linked to decision-making outcomes. In cases where physicians opted to test for acute coronary syndrome, higher predicted fatigue was associated with a significant decrease in test yield. While coarse measures like total days worked showed no meaningful relationship with test outcomes, the fine-grained, note-based fatigue measure revealed a 19% decrease in diagnostic yield per standard deviation increase in predicted fatigue. This link between linguistic cues and clinical decision quality provides compelling support for the model’s utility. 
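
One way to probe such a relationship, sketched below on simulated data, is a logistic regression of test positivity on the standardised predicted-fatigue score; the simulated effect size and the resulting odds ratio are illustrative, not the study’s estimates.

```python
# Illustrative yield-versus-fatigue regression on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
fatigue_z = rng.standard_normal(n)  # standardised predicted fatigue

# Simulated test results whose positivity rate declines as fatigue rises
# (the coefficient -0.2 is an arbitrary illustrative choice).
p_positive = 1 / (1 + np.exp(-(-1.8 - 0.2 * fatigue_z)))
positive = rng.binomial(1, p_positive)

X = sm.add_constant(fatigue_z)
fit = sm.Logit(positive, X).fit(disp=False)
odds_ratio = np.exp(fit.params[1])
print(f"odds ratio per SD of predicted fatigue: {odds_ratio:.2f}")
# An odds ratio below 1 corresponds to lower diagnostic yield at higher
# predicted fatigue, the direction of the reported ~19% decrease.
```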

 

Implications for Large Language Models in Clinical Practice 

The study further examined the characteristics of clinical notes generated by large language models (LLMs). Since LLMs rely on next-word prediction, their generated notes tend to be highly predictable, mimicking the language patterns found in fatigued human-authored notes. When real physician notes and LLM-generated continuations were compared, the latter exhibited 74% higher predicted fatigue scores. This raises critical concerns about the potential quality and clinical value of AI-generated documentation, particularly if such notes mirror those written under fatigue. 
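
A simple sketch of this comparison appears below, using hypothetical predicted-fatigue scores for matched pairs of physician-written notes and LLM-generated continuations; the numbers are placeholders, not the study’s results.

```python
# Comparing mean predicted fatigue of matched note pairs (hypothetical scores).
import numpy as np

human_scores = np.array([0.30, 0.42, 0.25, 0.38, 0.33])  # physician-written notes
llm_scores   = np.array([0.55, 0.70, 0.48, 0.66, 0.58])  # LLM continuations

relative_increase = (llm_scores.mean() - human_scores.mean()) / human_scores.mean()
print(f"LLM continuations score {relative_increase:.0%} higher predicted fatigue")
# The study reports roughly 74% higher predicted fatigue for LLM-generated
# continuations, consistent with their more predictable, formulaic wording.
```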

 

These findings highlight a dual challenge: ensuring that AI tools used to support documentation do not inadvertently introduce fatigue-like linguistic patterns, and recognising that automation may reduce the reflective thinking embedded in the act of note writing. As such, there is a need to consider more nuanced roles for LLMs in healthcare, such as assisting in information solicitation rather than full automation of documentation. Used appropriately, LLMs could enhance the note-writing process without undermining physician agency or the richness of clinical documentation. 

 

By analysing the text of clinical notes, a machine learning model can infer physician fatigue with greater precision than traditional metrics based on work schedules. This innovative method captures subtle but significant linguistic shifts associated with cognitive strain, offering a practical and interpretable way to monitor clinician well-being. The association between predicted fatigue and clinical decision outcomes underscores the model’s potential impact on both patient care and physician performance. At the same time, the study raises important questions about the role of LLMs in clinical documentation, particularly regarding their unintended resemblance to fatigued writing. This approach offers a promising direction for enhancing quality and safety through language-informed insights. 

 

Source: Nature Communications

Image Credit: iStock


References:

Hsu CC, Obermeyer Z & Tan C (2025) A machine learning model using clinical notes to identify physician fatigue. Nat Commun, 16:5791.


