Disorders of the median nerve are a frequent cause of upper-limb symptoms and functional limitation, with carpal tunnel syndrome among the most recognised clinical presentations. Ultrasound imaging is widely used to support diagnosis, guide interventions and monitor recovery, yet consistent identification of the nerve during live scanning remains challenging. Image appearance can change rapidly with probe movement, anatomical variation and artefacts, making reliable interpretation difficult during dynamic examinations. These challenges are particularly relevant during procedures that require continuous visual feedback, as well as during rehabilitation assessments that depend on observing nerve motion. A deep learning framework known as UltraMN has been developed to address these issues by enabling real-time recognition of standard imaging planes together with automated segmentation of the median nerve in ultrasound video sequences.
Creating a Dynamic Ultrasound Dataset for Model Development
Ultrasound data were collected over a one-year period from several hundred adult participants with clinically normal median nerves. Ethical approval and informed consent were obtained before image acquisition. Individuals with previous wrist or forearm surgery or major traumatic injury were excluded to ensure anatomical consistency. The cohort included a balanced representation of sexes and a broad adult age range. Demographic information was recorded but not incorporated into model training.
Image acquisition followed established European guidance for musculoskeletal ultrasound. Four standard imaging planes were used, combining transverse and longitudinal views at defined anatomical landmarks along the forearm and wrist. Rather than capturing single still images, operators scanned dynamically, recording a short video clip for each plane on both sides. Each clip contained dozens of consecutive frames acquired at a rate consistent with routine clinical ultrasound, so adjacent frames were highly similar, reflecting the continuous nature of scanning, while still preserving meaningful frame-to-frame variation.
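To make the structure of such data concrete, the sketch below models a single dynamic clip as a small Python data structure. The plane labels, field names and frame dimensions are illustrative assumptions; the article describes four standard planes combining transverse and longitudinal views but does not name them as listed here.

```python
from dataclasses import dataclass
from enum import Enum

import numpy as np


class Plane(Enum):
    """Hypothetical labels for the four standard planes; the study
    combines transverse and longitudinal views at defined landmarks,
    but the specific names below are assumptions."""
    WRIST_TRANSVERSE = 0
    WRIST_LONGITUDINAL = 1
    FOREARM_TRANSVERSE = 2
    FOREARM_LONGITUDINAL = 3


@dataclass
class Clip:
    """One short dynamic recording: dozens of consecutive frames
    from one side of one participant, labelled with its plane."""
    participant_id: str
    side: str            # "left" or "right"
    plane: Plane
    frames: np.ndarray   # shape (T, H, W), e.g. (48, 256, 256)


# Example: a synthetic 48-frame transverse wrist clip.
clip = Clip(
    participant_id="P0001",
    side="left",
    plane=Plane.WRIST_TRANSVERSE,
    frames=np.zeros((48, 256, 256), dtype=np.float32),
)
```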
The resulting dataset consisted of several thousand videos and a large number of individual frames. To prevent participant-specific appearance from leaking across subsets and inflating apparent performance, data were divided at the participant level into separate training, validation and test sets. Augmentation techniques were applied to increase variability, including basic geometric transformations and additive noise, and regularisation strategies were incorporated during training to improve generalisability. All labelling was performed by experienced radiologists, with strong agreement reported for both plane identification and nerve boundary annotation.
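A participant-level split of this kind can be implemented with grouped splitting utilities. The sketch below uses scikit-learn's GroupShuffleSplit together with simple geometric and noise augmentations of the kind described; the cohort size, split ratios and augmentation parameters are illustrative assumptions, not the study's actual settings.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Toy stand-ins: one row per clip, grouped by participant so that
# no participant contributes clips to more than one subset.
clip_ids = np.arange(1000)
participants = np.repeat(np.arange(200), 5)   # 5 clips per participant

outer = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
trainval_idx, test_idx = next(outer.split(clip_ids, groups=participants))

inner = GroupShuffleSplit(n_splits=1, test_size=0.125, random_state=0)
tr, va = next(inner.split(trainval_idx, groups=participants[trainval_idx]))
train_idx, val_idx = trainval_idx[tr], trainval_idx[va]

# No participant appears in both training and test sets.
assert not set(participants[train_idx]) & set(participants[test_idx])


def augment(frames: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Basic geometric transforms plus additive noise (illustrative)."""
    if rng.random() < 0.5:
        frames = frames[:, :, ::-1]                         # horizontal flip
    frames = np.roll(frames, rng.integers(-8, 9), axis=2)   # small shift
    frames = frames + rng.normal(0.0, 0.02, frames.shape)   # Gaussian noise
    return frames.astype(np.float32)


aug = augment(np.zeros((48, 256, 256), np.float32), np.random.default_rng(0))
```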
A Multitask Architecture for Plane Recognition and Segmentation
UltraMN is built as an end-to-end convolutional neural network based on a U-Net architecture adapted for video data. The design integrates two related tasks within a single framework. One component focuses on recognising the standard imaging plane, while the other performs segmentation of the median nerve itself. Unlike conventional approaches that rely on two-dimensional analysis, the model uses three-dimensional convolutions to capture both spatial structure and temporal relationships across consecutive frames.
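This general design can be pictured as a small 3D U-Net with two output heads sharing one encoder. The PyTorch code below is a minimal sketch of that pattern, not the published architecture: the depth, channel sizes and pooling scheme are assumptions chosen only to show how a shared 3D encoder can feed both a segmentation decoder and a plane-classification head.

```python
import torch
import torch.nn as nn


def block(c_in: int, c_out: int) -> nn.Sequential:
    """Two 3D convolutions, each with normalisation and ReLU."""
    return nn.Sequential(
        nn.Conv3d(c_in, c_out, 3, padding=1), nn.BatchNorm3d(c_out), nn.ReLU(inplace=True),
        nn.Conv3d(c_out, c_out, 3, padding=1), nn.BatchNorm3d(c_out), nn.ReLU(inplace=True),
    )


class MultitaskUNet3D(nn.Module):
    """Toy 3D U-Net with a segmentation decoder and a plane-classification
    head on the deepest shared features."""

    def __init__(self, n_planes: int = 4):
        super().__init__()
        self.enc1 = block(1, 16)
        self.enc2 = block(16, 32)
        self.pool = nn.MaxPool3d(2)
        self.bottleneck = block(32, 64)
        self.up1 = nn.ConvTranspose3d(64, 32, 2, stride=2)
        self.dec1 = block(64, 32)
        self.up2 = nn.ConvTranspose3d(32, 16, 2, stride=2)
        self.dec2 = block(32, 16)
        self.seg_head = nn.Conv3d(16, 1, 1)        # per-pixel nerve mask
        self.cls_head = nn.Linear(64, n_planes)    # standard-plane label

    def forward(self, x):                          # x: (B, 1, T, H, W)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d = self.dec1(torch.cat([self.up1(b), e2], dim=1))
        d = self.dec2(torch.cat([self.up2(d), e1], dim=1))
        seg_logits = self.seg_head(d)                        # (B, 1, T, H, W)
        cls_logits = self.cls_head(b.mean(dim=(2, 3, 4)))    # (B, n_planes)
        return seg_logits, cls_logits


model = MultitaskUNet3D()
seg, cls = model(torch.randn(2, 1, 8, 64, 64))
```

The key design point the sketch reproduces is that the 3D convolutions mix information across consecutive frames as well as within each frame, so both heads benefit from temporal context.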
The classification branch draws on features extracted from deeper layers of the network to distinguish between the predefined imaging planes. In parallel, the segmentation branch generates a pixel-level outline of the median nerve. The two tasks are optimised together using a combined loss function, allowing shared features to support both recognition and delineation. Training was performed over an extended number of epochs using a modern gradient-based optimiser and dedicated graphics processing hardware.
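The article does not restate the exact loss, but a common pattern for this kind of joint optimisation combines an overlap-based segmentation term with a cross-entropy classification term. The sketch below assumes a soft Dice loss and illustrative weights; in practice the combined loss would be minimised with an optimiser such as Adam over many epochs, as the training regime described suggests.

```python
import torch
import torch.nn.functional as F


def dice_loss(seg_logits: torch.Tensor, mask: torch.Tensor,
              eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss between predicted probabilities and a binary mask."""
    p = torch.sigmoid(seg_logits)
    inter = (p * mask).sum(dim=(1, 2, 3, 4))
    denom = p.sum(dim=(1, 2, 3, 4)) + mask.sum(dim=(1, 2, 3, 4))
    return (1.0 - (2.0 * inter + eps) / (denom + eps)).mean()


def combined_loss(seg_logits, cls_logits, mask, plane,
                  w_seg: float = 1.0, w_cls: float = 0.5) -> torch.Tensor:
    """Weighted sum of a Dice term and a cross-entropy term; the
    weights here are illustrative, not the published values."""
    return (w_seg * dice_loss(seg_logits, mask)
            + w_cls * F.cross_entropy(cls_logits, plane))


# Synthetic outputs and targets stand in for the network's predictions.
seg_logits = torch.randn(2, 1, 8, 64, 64, requires_grad=True)
cls_logits = torch.randn(2, 4, requires_grad=True)
mask = torch.randint(0, 2, (2, 1, 8, 64, 64)).float()
plane = torch.randint(0, 4, (2,))

loss = combined_loss(seg_logits, cls_logits, mask, plane)
loss.backward()   # gradients flow to both heads through the shared loss
```

Because both terms backpropagate through the shared encoder, features useful for locating the nerve also inform plane recognition, which is the stated rationale for the multitask design.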
Performance was assessed using standard metrics for both classification and segmentation. These included accuracy-based measures for plane recognition and overlap-based measures for nerve segmentation. Comparisons were reported against other deep learning models designed for similar tasks. Overall, the multitask approach demonstrated stronger performance than the reference methods, reflecting the benefit of integrating temporal information and shared representations within a single model.
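As a concrete illustration of such metrics, the snippet below computes classification accuracy alongside Dice and intersection-over-union, two widely used overlap measures; the paper's exact metric set may differ.

```python
import numpy as np


def plane_accuracy(pred_labels: np.ndarray, true_labels: np.ndarray) -> float:
    """Fraction of clips whose standard plane was identified correctly."""
    return float((pred_labels == true_labels).mean())


def dice_coefficient(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """Dice overlap between binary masks: 2|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    return 2.0 * inter / (pred_mask.sum() + true_mask.sum() + 1e-6)


def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """Intersection over union, a related overlap measure."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return float(inter / (union + 1e-6))


# Toy check on synthetic masks.
rng = np.random.default_rng(0)
pred = rng.random((64, 64)) > 0.5
true = rng.random((64, 64)) > 0.5
print(dice_coefficient(pred, true), iou(pred, true))
```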
Real-Time Performance and Current Scope of Application
Evaluation on a held-out test set showed that the system achieved high accuracy in identifying standard imaging planes and strong agreement between automated and expert-defined nerve contours. Errors were most commonly associated with lower-quality images, such as those affected by noise, suboptimal probe contact or unclear anatomical boundaries. Segmentation inaccuracies were also more likely in regions where the median nerve closely adjoined surrounding structures.
Real-time feasibility was assessed using a high-performance workstation representative of advanced clinical or research environments. Input frames were standardised before processing, and total processing time included preparation, inference and visualisation steps. The reported inference speed exceeded the typical frame rate of diagnostic ultrasound systems, with overall latency remaining well below commonly cited thresholds for real-time clinical use. These results suggest that the framework can operate alongside live scanning without disrupting workflow.
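A simple way to reproduce this style of evaluation is to time the full per-clip pipeline, preparation, inference and visualisation, and compare it against the scanner's frame budget. The sketch below uses a placeholder network and an assumed 30 Hz frame rate; real timings depend entirely on the actual model and hardware.

```python
import time

import numpy as np
import torch
import torch.nn as nn

# Placeholder network standing in for the real model; timings depend
# entirely on the actual architecture and hardware used.
model = nn.Conv3d(1, 1, 3, padding=1).eval()

frame_rate_hz = 30                      # assumed B-mode frame rate
budget_ms = 1000.0 / frame_rate_hz      # time available per frame


def process_clip(frames: np.ndarray) -> float:
    """Total latency in ms for one clip: prepare + infer + visualise."""
    t0 = time.perf_counter()
    x = torch.from_numpy(frames).float()[None, None]    # preparation
    with torch.no_grad():
        logits = model(x)                               # inference
    overlay = (logits.sigmoid() > 0.5).numpy()          # visualisation step
    return (time.perf_counter() - t0) * 1000.0


latencies = [process_clip(np.zeros((8, 256, 256), np.float32))
             for _ in range(20)]
print(f"median clip latency: {np.median(latencies):.1f} ms "
      f"(per-frame budget: {budget_ms:.1f} ms)")
```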
The work focused exclusively on individuals with normal median nerve anatomy, and the authors identified this as a key limitation. Generalisation to pathological conditions, including carpal tunnel syndrome, has not yet been fully evaluated. Ongoing data collection across multiple centres aims to address this gap by incorporating cases with confirmed pathology and by extending validation to different ultrasound systems. Longitudinal assessment in patients undergoing treatment has also been proposed to explore the potential for monitoring changes in nerve morphology over time.
UltraMN demonstrates the feasibility of combining standard plane recognition and median nerve segmentation within a single deep learning framework capable of real-time ultrasound analysis. By leveraging dynamic video data and a multitask design, the system delivers stable outputs that align with routine scanning conditions. Its relevance lies in the potential to enhance consistency during ultrasound examinations and to support future applications in procedural guidance and follow-up assessment. Further validation in pathological populations and across a wider range of imaging platforms will be essential to define its clinical role.
Source: Ultrasound in Medicine and Biology