Hip fractures are a major concern in clinical practice, particularly among older patients, as they are associated with significant levels of morbidity, mortality and long-term disability. Although recent years have seen a reduction in the one-year mortality rate following hip fractures, these injuries remain a key contributor to loss of functional independence, often requiring surgical intervention. Prompt and accurate diagnosis is essential for improving outcomes. However, a notable proportion of clinically relevant fractures are not initially detected on X-rays. 

Differences in diagnostic accuracy between less experienced clinicians and specialists highlight the challenge of perceptual errors. Traditional AI methods have demonstrated high performance in detecting fractures, but most are limited to binary outputs and lack specificity or interpretability. A newly developed AI approach addresses these limitations by detecting, classifying, grading and segmenting femoral fractures on hip X-rays, offering enhanced diagnostic granularity and transparency.

Enhanced Fracture Detection and Classification 
Earlier AI methods concentrated on classifying whether a hip fracture was present or not, achieving strong results in terms of area under the curve (AUC). More recent developments introduced multiclass classification, identifying general fracture locations, such as femoral neck or pertrochanteric regions. Nevertheless, these approaches typically returned a single label for the whole image and did not highlight the regions used by the network for diagnosis. Techniques such as Grad-CAM were employed to improve interpretability, but they lacked precision. The new model builds on this work by detecting, classifying, grading and segmenting fractures within a single framework. It includes femoral neck fractures graded by the Garden scale, pertrochanteric fractures graded by the Evans scale and subtrochanteric fractures. Segmentations generated by the model provide clear indications of the fracture location, giving radiologists visual cues about the evidence used by the AI. This integration of grading and segmentation is unique in the current literature and promises greater clinical utility and confidence.


Comprehensive Data Collection and Methodology 
The study involved a retrospective analysis approved by the Institutional Review Board, using over 10,000 hip X-rays from 2,618 patients. From this pool, 986 studies were selected for annotation. Radiologists with varying experience levels segmented fractures and classified them using standard scales. Each femur was labelled as either healthy or fractured, with six distinct fracture categories. The YOLOv8 network, known for its efficiency in object detection and segmentation, was trained on cropped femur regions of interest, which were preprocessed to enhance contrast and standardise dimensions. The network architecture consisted of a backbone for feature extraction, a neck for feature refinement and a decoupled head for generating linked outputs in classification, detection and segmentation. Various combinations of model complexity and image resolution were tested across multiple runs to determine the optimal setup. The configuration with the highest accuracy and AUC on the validation set featured a “small” model and an image resolution of 1024 × 1024 pixels.
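
As a rough illustration of the preprocessing step described above (contrast enhancement and standardising cropped femur regions of interest to 1024 × 1024 pixels), the sketch below shows one plausible approach in numpy. The function name, the min–max contrast stretch and the nearest-neighbour resize are assumptions for illustration, not the authors' exact pipeline.

```python
import numpy as np

def preprocess_roi(roi: np.ndarray, size: int = 1024) -> np.ndarray:
    """Contrast-stretch a cropped femur ROI and resize it to size x size.

    Hypothetical stand-in for the study's preprocessing: the paper reports
    contrast enhancement and dimension standardisation, but the exact
    methods are not detailed in this summary.
    """
    roi = roi.astype(np.float32)
    lo, hi = roi.min(), roi.max()
    # Min-max contrast stretch to [0, 1] (guard against flat images).
    stretched = (roi - lo) / (hi - lo) if hi > lo else np.zeros_like(roi)
    # Nearest-neighbour resize to the target resolution via index mapping.
    rows = np.arange(size) * roi.shape[0] // size
    cols = np.arange(size) * roi.shape[1] // size
    return stretched[np.ix_(rows, cols)]

# Example: a synthetic 8-bit ROI resized to the reported 1024 x 1024 input.
roi = np.random.randint(0, 256, size=(600, 400), dtype=np.uint8)
out = preprocess_roi(roi)
print(out.shape)  # (1024, 1024)
```

In practice a library resize with interpolation (e.g. OpenCV or PIL) would be preferred over the index-mapping trick shown here; the sketch only avoids extra dependencies.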

Benchmarking and Clinical Potential 
Evaluation of the YOLOv8 model on a held-out test set demonstrated strong performance across all metrics. It achieved an AUC of 0.981 for distinguishing fractured from healthy femurs, with a classification accuracy of 86.2% across the six fracture categories. The segmentation output had a Dice similarity coefficient of 0.77, showing close alignment with manual annotations. Most classification errors occurred within the same anatomical region, suggesting the model generally preserved location-specific accuracy. Comparisons with baseline networks such as DenseNet121, InceptionV3 and ResNet50V2 showed that although AUCs were similar across models, the YOLOv8-based method consistently produced higher classification accuracy. Importantly, it was the only method capable of segmenting fracture lines directly. Limitations included a relatively small number of subtrochanteric and undisplaced fractures in the training data, which reduced model performance for these subtypes. Although the data came from a single institution, the variety of imaging devices and modalities improves the likelihood of generalisability. Planned future work includes assessing the model’s clinical performance and extending its training set through semi-automated methods to balance fracture subtype representation.
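
For readers unfamiliar with the metric, the Dice similarity coefficient quoted above (0.77) measures the overlap between a predicted segmentation and a manual annotation: 2·|A∩B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (identical masks). A minimal numpy implementation (illustrative, not taken from the paper) looks like this:

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Toy example: two overlapping square "fracture" masks on a 100x100 grid.
a = np.zeros((100, 100), dtype=bool)
b = np.zeros((100, 100), dtype=bool)
a[20:60, 20:60] = True   # 1600 pixels
b[30:70, 30:70] = True   # 1600 pixels, sharing a 900-pixel overlap
print(dice_coefficient(a, b))  # 2*900 / 3200 = 0.5625
```

A score of 0.77 therefore indicates that the model's fracture-line masks overlap substantially, though not perfectly, with the radiologists' annotations.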

The study presents a robust AI system capable of detecting, grading and segmenting femoral fractures in X-ray images. Unlike previous methods, it unifies classification and segmentation into a single workflow, offering greater interpretability and diagnostic clarity. With performance metrics comparable or superior to existing models and the unique ability to highlight the regions influencing its predictions, this system represents a significant advance in AI-assisted radiology. Future efforts will focus on validating its effectiveness in clinical practice, expanding its dataset and exploring its educational value for radiologists. 

Source: Insights into Imaging 

Image Credit: iStock


References:

González G, Galant J, Salinas JM et al. (2025) Classification and segmentation of hip fractures in X-rays: highlighting fracture regions for interpretable diagnosis. Insights Imaging, 16:86.  
