Breast lesion classification can combine mammography and contrast-enhanced imaging within a multimodal deep learning workflow, as reported in a recent study published in the Journal of Biomedical Informatics. The workflow brings together Full-Field Digital Mammography and Contrast-Enhanced Spectral Mammography in craniocaudal and mediolateral oblique views within a single classification pipeline. It also includes a generative artificial intelligence component for cases where contrast-enhanced images are not available. In those cases, synthetic Contrast-Enhanced Spectral Mammography images are generated from mammography inputs and used alongside the available imaging data. Performance is assessed with real multimodal images and in experiments where real contrast-enhanced images are progressively replaced with synthetic ones.

 

Multimodal Classification and Fusion Strategy

The classification workflow processes mammography and contrast-enhanced images separately for craniocaudal and mediolateral oblique views. A dedicated convolutional neural network is trained for each modality–view combination, producing malignancy probabilities that are subsequently combined through a late-fusion process. The fusion procedure first integrates predictions across views within each modality and then merges modality-specific probabilities into a single malignancy score. Weighted averaging is used in both stages, with weights derived from Matthews Correlation Coefficient values calculated during validation.
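A minimal sketch of this two-stage fusion, assuming Python with NumPy, is shown below. The clipping of non-positive Matthews Correlation Coefficient values, the equal-weight fallback and the 0.5 decision threshold are assumptions made for illustration, and all probability and MCC values are invented:

```python
import numpy as np

def mcc_weighted_fusion(probs, mcc_scores):
    """Weighted average of malignancy probabilities.

    Weights come from validation MCC values. Non-positive MCCs are
    clipped to zero (an assumption: the study does not state how
    models no better than chance are weighted).
    """
    weights = np.clip(np.asarray(mcc_scores, dtype=float), 0.0, None)
    if weights.sum() == 0.0:
        weights = np.ones_like(weights)  # fall back to a plain average
    return float(np.dot(np.asarray(probs, dtype=float), weights) / weights.sum())

# Stage 1: fuse the CC and MLO views within each modality.
ffdm_prob = mcc_weighted_fusion([0.62, 0.71], [0.55, 0.60])  # FFDM: CC, MLO
cesm_prob = mcc_weighted_fusion([0.80, 0.76], [0.68, 0.64])  # CESM: CC, MLO

# Stage 2: fuse the modality-level probabilities into one malignancy score.
final_prob = mcc_weighted_fusion([ffdm_prob, cesm_prob], [0.58, 0.66])
label = "malignant" if final_prob >= 0.5 else "benign"  # threshold assumed
print(final_prob, label)
```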

 


Three convolutional neural network architectures are evaluated: ResNet18, ResNet50 and VGG16. Model training uses stratified five-fold cross-validation to maintain balanced distributions of malignant and benign cases across folds. Image-level data are divided into training, validation and testing subsets while ensuring that images from the same patient remain within a single subset. Data augmentation, comprising small rotations, zoom adjustments and horizontal or vertical shifts, is applied during training to improve model generalisation. Final classification decisions are produced by mapping the fused malignancy probability to a benign or malignant category.
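One standard way to realise patient-grouped, stratified five-fold splitting is scikit-learn's StratifiedGroupKFold. The study does not name its splitting tool, so the snippet below is a sketch under that assumption, with toy metadata standing in for the real dataset:

```python
import numpy as np
from sklearn.model_selection import StratifiedGroupKFold

# Toy metadata: one row per image, with its patient ID and lesion label
# (invented for illustration; roughly four images per patient).
rng = np.random.default_rng(0)
patient_ids = np.repeat(np.arange(50), 4)
labels = np.repeat(rng.integers(0, 2, 50), 4)  # 0 = benign, 1 = malignant
X = np.arange(len(labels)).reshape(-1, 1)      # image indices stand in for pixels

# StratifiedGroupKFold keeps label proportions roughly balanced across folds
# while guaranteeing that all images from one patient land in the same fold.
cv = StratifiedGroupKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, test_idx) in enumerate(cv.split(X, labels, groups=patient_ids)):
    overlap = set(patient_ids[train_idx]) & set(patient_ids[test_idx])
    assert not overlap, "patient leakage across folds"
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test images")
```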

 

Dataset Preparation and Synthetic CESM Generation

The dataset consists of Contrast-Enhanced Spectral Mammography examinations collected from just over 200 patients, with ages ranging from early adulthood to advanced age and an average in the mid-fifties. Imaging was performed using a dedicated mammography system and produced more than 2,000 images across modalities and views. Low-energy images are treated as equivalent to Full-Field Digital Mammography, while dual-energy subtracted images represent Contrast-Enhanced Spectral Mammography because they show contrast uptake within breast tissue.

To maintain consistent imaging conditions, some acquisitions were excluded, resulting in a refined dataset used for virtual biopsy experiments. The classification subset includes paired mammography and contrast-enhanced images for malignant and benign lesions in both craniocaudal and mediolateral oblique views. The generative modelling task uses a larger set of paired images from both modalities to learn cross-modality translation.

 

Image pre-processing includes padding images to a square format, applying contrast stretching, normalising intensity values to the range 0 to 1, and resizing images to 256 by 256 pixels. Left-breast images are flipped horizontally to standardise orientation across samples. Synthetic Contrast-Enhanced Spectral Mammography images are generated using a CycleGAN model trained independently for the craniocaudal and mediolateral oblique views. Training combines adversarial learning with cycle-consistency and identity constraints to preserve anatomical structure while translating between modalities.
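A compact sketch of this pre-processing chain, assuming NumPy and scikit-image, is given below. The percentile cut-offs used for contrast stretching and the corner placement of the zero padding are assumptions, since the article does not specify them:

```python
import numpy as np
from skimage.transform import resize

def preprocess(image: np.ndarray, is_left_breast: bool, size: int = 256) -> np.ndarray:
    """Pad to square, contrast-stretch, normalise to [0, 1] and resize."""
    h, w = image.shape
    side = max(h, w)
    padded = np.zeros((side, side), dtype=np.float32)
    padded[:h, :w] = image  # zero-pad to square (corner placement assumed)

    # Contrast stretching between the 2nd and 98th percentiles; the
    # cut-offs are an assumption, as the article does not give them.
    lo, hi = np.percentile(padded, (2, 98))
    stretched = np.clip((padded - lo) / max(hi - lo, 1e-8), 0.0, 1.0)

    if is_left_breast:
        stretched = stretched[:, ::-1]  # horizontal flip to standardise orientation

    return resize(stretched, (size, size), anti_aliasing=True)
```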

 

Performance with Real and Synthetic Modalities

Image-generation performance is evaluated using reconstruction and similarity metrics across multiple validation folds. Reported values include mean squared error close to zero, peak signal-to-noise ratios above approximately 24 dB and structural similarity index values greater than about 0.8 for both imaging views. These results indicate visual and structural correspondence between generated and real contrast-enhanced images.
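These three metrics can be computed per image pair with scikit-image, as in the sketch below; images are assumed to be normalised to the range 0 to 1, matching the pre-processing described above:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def generation_metrics(real: np.ndarray, fake: np.ndarray) -> dict:
    """MSE, PSNR and SSIM for one real/synthetic CESM pair in [0, 1]."""
    mse = float(np.mean((real - fake) ** 2))
    psnr = peak_signal_noise_ratio(real, fake, data_range=1.0)  # in dB
    ssim = structural_similarity(real, fake, data_range=1.0)
    return {"mse": mse, "psnr_db": psnr, "ssim": ssim}
```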

 

Classification performance is examined under several experimental configurations, including mammography alone, contrast-enhanced imaging alone, synthetic contrast-enhanced imaging alone and multimodal combinations using real or generated contrast-enhanced images. Across the evaluated neural network architectures, multimodal configurations that include real contrast-enhanced images achieve stronger performance across metrics such as AUC, G-mean and Matthews Correlation Coefficient compared with mammography alone. Configurations combining mammography with synthetic contrast-enhanced images also demonstrate improvements in several conditions.
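For reference, the reported metrics can be computed from ground-truth labels and fused probabilities as follows. The 0.5 decision threshold is an assumption, and G-mean is taken here as the geometric mean of sensitivity and specificity, its usual definition:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, matthews_corrcoef, confusion_matrix

def classification_metrics(y_true, y_prob, threshold=0.5):
    """AUC, G-mean and MCC for binary malignancy predictions."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "auc": roc_auc_score(y_true, y_prob),
        "g_mean": (sensitivity * specificity) ** 0.5,
        "mcc": matthews_corrcoef(y_true, y_pred),
    }
```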

 

Robustness to missing modality data is assessed by progressively replacing real contrast-enhanced images with synthetic ones, in increments from 10% up to full replacement. Patient sampling is repeated multiple times for each configuration, and results are averaged across repetitions. Performance decreases gradually as more synthetic images replace real ones, yet synthetic contrast-enhanced images remain beneficial compared with relying on mammography alone in several experimental settings. Additional analysis groups test images by breast density categories defined by Breast Imaging Reporting and Data System classifications. Multimodal combinations using real contrast-enhanced imaging achieve the strongest results across density groups, while improvements associated with synthetic images are most evident in denser breast categories.
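The replacement protocol can be approximated with the sketch below which, as a simplification, swaps per-case contrast-enhanced branch probabilities rather than re-running the full pipeline on synthetic images; the number of repetitions and the sampling scheme are assumptions for illustration:

```python
import numpy as np

def replacement_curve(real_probs, fake_probs, y_true, evaluate,
                      fractions=np.arange(0.1, 1.01, 0.1), repeats=20, seed=0):
    """Average a metric while replacing real CESM outputs with synthetic ones.

    `evaluate(y_true, probs)` returns the metric of interest (e.g. AUC).
    For each replacement fraction, a random subset of cases is swapped to
    its synthetic-CESM probability; results are averaged over repetitions.
    """
    rng = np.random.default_rng(seed)
    n = len(y_true)
    results = {}
    for frac in fractions:
        scores = []
        for _ in range(repeats):
            swap = rng.choice(n, size=int(round(frac * n)), replace=False)
            probs = np.array(real_probs, dtype=float)
            probs[swap] = np.asarray(fake_probs)[swap]
            scores.append(evaluate(y_true, probs))
        results[round(float(frac), 1)] = float(np.mean(scores))
    return results
```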

 

A multimodal and multi-view deep learning framework has been designed to support virtual biopsy classification of breast lesions using mammography and contrast-enhanced imaging. The workflow incorporates generative modelling to synthesise contrast-enhanced images when they are unavailable, allowing classification models to operate in missing-modality scenarios. Quantitative evaluation shows that synthetic images can approximate structural characteristics of real contrast-enhanced images and can support classification performance when combined with mammography data. Multimodal configurations using real contrast-enhanced imaging achieve the strongest results overall, while experiments with partial and complete modality replacement demonstrate gradual performance changes as synthetic images are introduced. The framework illustrates how generative artificial intelligence can be integrated into multimodal diagnostic pipelines to maintain classification capability when imaging information is incomplete.

 

Source: Journal of Biomedical Informatics

Image Credit: iStock


References:

Rofena A, Piccolo CL, Beomonte Zobel B et al. (2026) Augmented intelligence for multimodal virtual biopsy in breast cancer using generative artificial intelligence. Journal of Biomedical Informatics; 174:104971.



