Deep learning methods for electrocardiogram (ECG) interpretation continue to expand across diagnostic classification and prediction tasks, yet practical implementation in clinical research environments remains technically demanding. ECG datasets are stored in heterogeneous formats, machine learning workflows require specialised expertise, and pretrained models are not always easily transferable between institutions. ExChanGeAI is a containerised, web-based platform designed to bring ECG data preparation, visualisation, prediction and model training into a single workflow. The system supports local computation and model sharing while aiming to simplify experimentation with ECG deep learning. By integrating processing steps that are typically separated across multiple software environments, the platform enables healthcare researchers and technical teams to explore model development and evaluation within a unified interface.
Data Processing and Visual ECG Exploration
The platform is designed to accommodate heterogeneous ECG datasets commonly encountered in clinical research and development. Multiple input formats are supported, including waveform data stored in research and clinical standards such as DICOM, DAT and XML. During ingestion, ECG recordings are automatically resampled to a unified frequency, with a default configuration of 100 Hz, and signals are standardised toward conventional 12-lead recordings with approximately ten seconds of waveform data. When recordings differ in duration or sampling characteristics, automated cropping or expansion ensures consistency across datasets. Signal scaling is harmonised to millivolt units where necessary.
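The ingestion steps described above can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not the platform's implementation: it assumes a lead-major `(leads, samples)` array, uses simple linear interpolation for resampling, and zero-pads short recordings; the function and constant names are hypothetical.

```python
import numpy as np

TARGET_HZ = 100          # platform's default unified sampling rate
TARGET_SECONDS = 10      # conventional resting-ECG duration
TARGET_SAMPLES = TARGET_HZ * TARGET_SECONDS

def standardise_ecg(signal, source_hz, scale_to_mv=1.0):
    """Resample a (leads, samples) array to 100 Hz, then crop or
    zero-pad each lead to exactly 10 s, scaling amplitudes to mV."""
    signal = np.asarray(signal, dtype=float) * scale_to_mv
    n_leads, n_samples = signal.shape
    duration = n_samples / source_hz
    # Linear-interpolation resampling to the unified frequency.
    src_t = np.linspace(0.0, duration, n_samples, endpoint=False)
    dst_t = np.arange(int(round(duration * TARGET_HZ))) / TARGET_HZ
    resampled = np.vstack([np.interp(dst_t, src_t, lead) for lead in signal])
    # Crop long recordings, zero-pad short ones.
    out = np.zeros((n_leads, TARGET_SAMPLES))
    n = min(TARGET_SAMPLES, resampled.shape[1])
    out[:, :n] = resampled[:, :n]
    return out

# A 12-lead recording at 500 Hz lasting 8 s: resampled and padded to 10 s.
raw = np.random.randn(12, 500 * 8)
ecg = standardise_ecg(raw, source_hz=500)
print(ecg.shape)  # (12, 1000)
```

A production pipeline would use an anti-aliasing resampler rather than plain interpolation, but the shape and unit harmonisation logic is the same.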
Interactive visualisation tools allow users to inspect ECG waveforms and derived features within the same environment used for model development. QRS complexes, fiducial points and aligned median beats can be examined alongside the original signals. R-peak alignment and transformation functions are integrated into the workflow, enabling inspection of processed signals before model training or prediction. Resting ECGs are displayed in a grid layout representing the standard clinical presentation of leads, with synchronised zooming across channels to support comparative inspection. Exploratory analysis features provide visual summaries of dataset characteristics and model outputs, including confusion matrices and receiver operating characteristic curves, helping users interpret classification behaviour across diagnostic categories.
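The confusion matrices surfaced in the exploratory views summarise agreement between true and predicted diagnostic classes. A minimal sketch of how such a matrix is assembled (plain Python/NumPy, not the platform's own code; the class labels are illustrative):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Three illustrative classes: 0=normal, 1=MI, 2=conduction disturbance.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)
# [[1 1 0]
#  [0 2 0]
#  [1 0 2]]
```

Off-diagonal counts immediately show which diagnostic categories are being confused, which is exactly what the grid visualisation is meant to expose.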
Model Exchange and Training Accessibility
Interoperability is a central design principle of the platform, which supports model exchange through the Open Neural Network Exchange (ONNX) format. Models compatible with ONNX specifications can be deployed across different installations, allowing collaboration between research groups without transferring underlying patient data. PyTorch models can also be used when architecture definitions are available, enabling integration of externally developed models into the workflow.
A repository interface known as Model ExChanGe enables synchronisation of pretrained models through a WebDAV-based system. Baseline architectures available within the platform include XceptionTime, InceptionTime and a model originating from the PhysioNet/Computing in Cardiology Challenge. These models can be used for prediction or adapted through fine-tuning to new datasets. Fine-tuning options include updating only the classification head or training all model layers, with automated handling of parameter freezing and output adaptation.
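The head-only fine-tuning mode described above amounts to freezing the feature-extraction layers and replacing the output layer to match the new task. A minimal PyTorch sketch of that mechanic, using a toy stand-in model rather than any of the platform's architectures:

```python
import torch.nn as nn

# Toy stand-in: a feature extractor followed by a classification head.
model = nn.Sequential(
    nn.Conv1d(12, 16, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(16, 5),        # original 5-class classification head
)

def freeze_backbone(model, head):
    """Freeze every parameter except those of the classification head."""
    for p in model.parameters():
        p.requires_grad = False
    for p in head.parameters():
        p.requires_grad = True

freeze_backbone(model, head=model[-1])

# Output adaptation: swap in a head sized for the new task (2 classes).
model[-1] = nn.Linear(16, 2)   # fresh layers are trainable by default

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # 16*2 weights + 2 biases = 34
```

Training all layers is the same code with the freezing step skipped; the platform automates the choice between the two.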
Training workflows are designed to reduce the need for manual configuration. ONNX models are converted into PyTorch representations to support training operations when required. Default optimisation uses AdamW together with a learning-rate scheduler, and a learning-rate finder automatically identifies a suitable starting value. Data are divided into training and evaluation subsets using stratified sampling, typically following an 80/20 split. Training runs are limited to a predefined number of epochs and include checkpointing and early stopping based on validation performance. After training, summary outputs include dataset statistics, loss progression and evaluation metrics such as F1-scores, allowing users to review model behaviour within the same environment used for training.
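Two of the automated steps above, the stratified 80/20 split and early stopping on validation loss, can be sketched without any framework code. This is an illustrative reimplementation, not the platform's internals; the AdamW optimiser, scheduler and learning-rate finder are framework components omitted here.

```python
import numpy as np

def stratified_split(labels, test_frac=0.2, seed=0):
    """Per-class index split, preserving label proportions."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_test = max(1, int(round(test_frac * len(idx))))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return np.array(train_idx), np.array(test_idx)

class EarlyStopping:
    """Stop when validation loss fails to improve for `patience` epochs."""
    def __init__(self, patience=3):
        self.patience, self.best, self.bad = patience, float("inf"), 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best, self.bad = val_loss, 0
        else:
            self.bad += 1
        return self.bad >= self.patience   # True -> stop training

labels = [0] * 50 + [1] * 50
train, test = stratified_split(labels)
print(len(train), len(test))  # 80 20
```

Stratification matters for ECG work because diagnostic classes are often heavily imbalanced; a naive random split can leave rare classes absent from the evaluation set entirely.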
Cross-Dataset Evaluation and Performance Patterns
The platform’s workflow was evaluated across several ECG datasets representing different clinical and research contexts. PTB-XL served as the primary dataset for model development and internal testing, with fold-based splits used to simulate both limited-data and larger-data scenarios. External validation datasets included MIMIC-IV-ECG, the Yang et al. dataset and a clinical dataset collected in an emergency department setting. Differences in annotation methods across datasets required mapping between descriptive ECG labels and diagnostic coding systems such as ICD-10. Some datasets relying on administrative coding were noted to contain label noise, reflecting real-world conditions in which diagnostic codes may incorporate clinical information beyond ECG findings.
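The label-mapping step described above reduces, in its simplest form, to a lookup table from descriptive ECG statements to ICD-10 codes, with unmapped labels flagged for review. The sketch below is illustrative only; the codes shown are standard ICD-10 categories, but any real mapping would need clinical validation and is far larger than this.

```python
# Illustrative mapping from descriptive ECG labels to ICD-10 codes.
LABEL_TO_ICD10 = {
    "acute myocardial infarction": "I21",
    "left bundle branch block": "I44.7",
    "right bundle branch block": "I45.0",
    "atrial fibrillation": "I48",
}

def map_labels(descriptive_labels):
    """Translate descriptive labels to codes, flagging anything unmapped."""
    mapped, unmapped = [], []
    for label in descriptive_labels:
        code = LABEL_TO_ICD10.get(label.lower().strip())
        (mapped if code else unmapped).append(code or label)
    return mapped, unmapped

mapped, unmapped = map_labels(["Atrial fibrillation", "sinus tachycardia"])
print(mapped, unmapped)  # ['I48'] ['sinus tachycardia']
```

Keeping the unmapped residue explicit is what surfaces the label noise the evaluation describes: administrative codes and descriptive ECG statements rarely align one-to-one.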
Classification targets included broad diagnostic superclasses such as myocardial infarction, conduction disturbances and hypertrophy, as well as more specific diagnostic distinctions. Results indicated that models trained from scratch within the platform, particularly XceptionTime and InceptionTime, frequently achieved strong weighted F1-scores across tasks. Fine-tuning pretrained models improved performance but did not consistently exceed results obtained through de novo training in limited-data scenarios. One pretrained architecture demonstrated relatively stable performance across external datasets, indicating robustness when applied to heterogeneous data sources.
Increasing the amount of training data improved model performance across architectures, demonstrating the platform’s ability to scale with dataset size. An additional evaluation examined prediction of revascularisation using a balanced subset of emergency department ECG records, with equal numbers of positive and negative cases. Transfer learning was applied by adapting a pretrained model originally trained on diagnostic classification tasks. Computational comparisons highlighted differences in model complexity, with a foundation model containing more than 90 million parameters requiring greater computational resources than smaller architectures that achieved competitive performance.
ExChanGeAI integrates ECG data processing, visual analysis, prediction and model training within a single containerised platform intended to simplify deep learning workflows for ECG research. Support for multiple waveform formats, interoperable model exchange and automated training configuration enables experimentation without extensive programming requirements while maintaining local data control. Evaluation across several datasets demonstrates that both de novo training and transfer learning approaches can be explored within the same environment, with performance influenced by dataset size, annotation quality and model architecture. The unified workflow provides healthcare researchers with a practical framework for developing and comparing ECG deep learning models under conditions that reflect the variability of real clinical data.
Source: Journal of Medical Internet Research
References:
Bickmann L, Plagwitz L, Büscher A et al. (2026) End-to-End Platform for Electrocardiogram Analysis and Model Fine-Tuning: Development and Validation Study. J Med Internet Res 28:e81116.