Radiomics has emerged as a transformative approach in medical imaging, offering the potential to extract high-dimensional data from scans for enhanced disease characterisation and treatment prediction. In breast cancer, particularly triple-negative breast cancer (TNBC), radiomics derived from magnetic resonance imaging (MRI) holds promise for predicting outcomes such as pathologic complete response (pCR). However, the reproducibility of radiomics features remains a challenge because of the variability introduced during image acquisition and pre-processing. Among pre-processing steps, intensity normalisation has become a focal point, given its significant influence on feature stability and model performance. A recent study published in the European Journal of Radiology explores how different image normalisation techniques affect the robustness of MRI radiomics features and the predictive power of machine learning models built to forecast pCR in TNBC.

 

Normalisation Methods and Feature Robustness 
The study utilised MRI scans from the MAMA-MIA dataset and the PARTNER trial, focusing on patients with TNBC. Four commonly used normalisation methods were examined: N4 bias field correction, Min-Max scaling, Z-score standardisation and Piecewise Linear Histogram Equalisation (PLHE). Spatial normalisation to a uniform voxel size was also applied to address differences in image resolution. Radiomics features were then extracted using a standard pipeline and assessed for robustness across 16 different combinations of these normalisation techniques.
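
The paper does not include code, but the two linear methods mentioned above are straightforward to illustrate: Min-Max scaling rescales each volume to the [0, 1] range, while Z-score standardisation centres intensities at zero mean and unit variance. The NumPy sketch below is illustrative only and is not taken from the study.

```python
import numpy as np


def min_max_normalise(image: np.ndarray) -> np.ndarray:
    """Rescale voxel intensities linearly to the [0, 1] range."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo) if hi > lo else np.zeros_like(image, dtype=float)


def z_score_normalise(image: np.ndarray) -> np.ndarray:
    """Centre intensities at zero mean and scale to unit standard deviation."""
    mean, std = image.mean(), image.std()
    return (image - mean) / std if std > 0 else image - mean


# Synthetic stand-in for an MRI volume; the study applied these steps to breast MRI scans
volume = np.random.default_rng(0).normal(loc=300.0, scale=50.0, size=(32, 64, 64))
print(min_max_normalise(volume).min(), min_max_normalise(volume).max())   # ~0.0 and 1.0
print(z_score_normalise(volume).mean(), z_score_normalise(volume).std())  # ~0.0 and 1.0

# N4 bias field correction is usually applied with a dedicated tool, e.g. SimpleITK:
#   img = sitk.Cast(sitk.ReadImage("scan.nii.gz"), sitk.sitkFloat32)  # hypothetical path
#   corrected = sitk.N4BiasFieldCorrection(img)
```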

 


 

Robustness was quantified using the Concordance Correlation Coefficient (CCC). Linear methods such as Z-score and Min-Max normalisation mainly affected first-order features, whereas PLHE, a non-linear method, altered a broader range of texture-based features. Dataset characteristics also influenced how features responded to normalisation: the PARTNER dataset, acquired under consistent imaging protocols, showed higher feature stability than the more heterogeneous cohorts within MAMA-MIA. Both the choice of normalisation method and the consistency of data acquisition are therefore critical to radiomics reproducibility.
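
For readers unfamiliar with the metric, Lin's CCC compares paired measurements of the same quantity, here the same feature extracted under two different normalisation settings, and equals 1 only for perfect agreement. A minimal sketch, not taken from the study's code:

```python
import numpy as np


def concordance_correlation_coefficient(x, y) -> float:
    """Lin's CCC between two paired sets of feature values."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    mean_x, mean_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()  # population variances
    covariance = np.mean((x - mean_x) * (y - mean_y))
    return 2.0 * covariance / (var_x + var_y + (mean_x - mean_y) ** 2)


feature_a = np.array([1.0, 2.0, 3.0, 4.0])
print(concordance_correlation_coefficient(feature_a, feature_a))        # 1.0 (perfect agreement)
print(concordance_correlation_coefficient(feature_a, feature_a + 0.5))  # < 1.0 (systematic shift)
```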

 

Predictive Modelling and the Effect of Normalisation 
Machine learning models were trained using radiomics features processed under different normalisation schemes to predict pCR. Logistic regression with ElasticNet regularisation was employed, and a range of feature selection techniques were tested. The highest predictive performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), was achieved using a combination of three normalisation steps: bias field correction, spatial normalisation and Z-score standardisation.
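
As an illustration of this modelling setup, a comparable ElasticNet-regularised logistic regression scored by ROC-AUC can be assembled with scikit-learn; the feature matrix below is synthetic and merely stands in for extracted radiomics features.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Rows = patients, columns = radiomics features, y = binary pCR label (synthetic here)
X, y = make_classification(n_samples=200, n_features=100, n_informative=10, random_state=0)

model = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5, max_iter=5000),
)
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"ROC-AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```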

 

However, the study also found that the benefit of normalisation varied by dataset and training size. Smaller training sets, such as those from ISPY1 and DUKE, showed greater variability in model performance depending on the normalisation applied, whereas models trained on the larger ISPY2 dataset were more stable, suggesting that the impact of normalisation decreases as training data volume increases. Additionally, the same technique, PLHE, had opposing effects on different datasets: it enhanced predictive accuracy in PARTNER while reducing it in ISPY1 and DUKE. This variability emphasises the importance of tailoring pre-processing strategies to the characteristics of the dataset.

 

Dataset Heterogeneity and Voxel Size Considerations 
Differences in voxel dimensions between datasets also contributed to the variation in normalisation effects. For instance, the DUKE dataset had consistent voxel spacing, which likely contributed to its high robustness under spatial normalisation. In contrast, ISPY1 and ISPY2 exhibited more variability, particularly in the z-dimension, potentially leading to greater distortion during resampling. These spatial factors underscore the need to consider voxel size heterogeneity when designing pre-processing workflows for radiomics studies.
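
Spatial normalisation of this kind is typically implemented by resampling every volume to a fixed voxel spacing. The SimpleITK sketch below is illustrative; the target spacing and file name are assumptions, not values from the study.

```python
import SimpleITK as sitk


def resample_to_spacing(image: sitk.Image, new_spacing=(1.0, 1.0, 1.0)) -> sitk.Image:
    """Resample a volume to a uniform voxel spacing (spatial normalisation)."""
    old_spacing, old_size = image.GetSpacing(), image.GetSize()
    new_size = [int(round(osz * ospc / nspc))
                for osz, ospc, nspc in zip(old_size, old_spacing, new_spacing)]
    return sitk.Resample(image, new_size, sitk.Transform(), sitk.sitkBSpline,
                         image.GetOrigin(), new_spacing, image.GetDirection(),
                         0.0, image.GetPixelID())


# Hypothetical usage:
#   resampled = resample_to_spacing(sitk.ReadImage("dce_mri.nii.gz"))
```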

 

The findings suggest that while image normalisation can enhance the predictive power of radiomics models, it is not a one-size-fits-all solution. Instead, normalisation should be treated as a model hyperparameter, with the optimal combination determined through cross-validation. This approach allows for more personalised and data-sensitive pre-processing strategies, which can improve generalisability across different cohorts and imaging protocols.
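
In practice, treating normalisation as a hyperparameter can be as simple as exposing the pre-processing step of a modelling pipeline to a cross-validated search. A minimal scikit-learn sketch with a synthetic feature matrix (the candidate normalisers shown are illustrative, not the study's exact set):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

pipe = Pipeline([
    ("normalise", StandardScaler()),  # placeholder; replaced by the grid below
    ("clf", LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, max_iter=5000)),
])

# Cross-validation picks whichever normalisation option maximises ROC-AUC
grid = GridSearchCV(pipe,
                    param_grid={"normalise": [StandardScaler(), MinMaxScaler(), "passthrough"]},
                    scoring="roc_auc", cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```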

 

This investigation demonstrates the pivotal role that image normalisation plays in breast MRI radiomics, particularly when predicting treatment response in TNBC. The study reveals that while the combined application of bias field correction, spatial normalisation and Z-score standardisation yields the best predictive performance overall, the effectiveness of normalisation strategies is highly dataset-dependent. Smaller datasets benefit more from rigorous pre-processing, while larger datasets offer greater resilience to variability. Furthermore, differences in image acquisition and voxel dimensions between datasets contribute to the differential impact of normalisation techniques. These insights highlight the necessity of adapting normalisation methods to the specific context of each study and reinforce the importance of robust pre-processing in achieving reliable radiomics-based clinical tools.

 

Source: European Journal of Radiology 

Image Credit: Freepik


References:

Schwarzhans F, George G, Escudero Sanchez L et al. (2025) Image normalization techniques and their effect on the robustness and predictive power of breast MRI radiomics. European Journal of Radiology: In Press. 


