Abstract
Background: Data silos and privacy constraints limit the centralized development of machine learning models for cardiovascular disease. Federated learning enables multi-institutional training without sharing raw patient records, but in cardiology both the evidence base and deployment-grade evaluation remain uneven.
Objective: To synthesize how federated learning has been implemented for cardiovascular disease prediction and to identify factors that determine clinical translation readiness, including heterogeneity handling, validation quality, privacy and security safeguards, and operational feasibility.
Methods: A systematic literature review was conducted in PubMed, Web of Science, and IEEE Xplore, with supplementary searches in arXiv and Google Scholar, covering studies published from January 2022 through December 2025. Studies applying federated learning to clinically meaningful cardiovascular disease prediction tasks were included. We extracted data on clinical tasks, modalities, federation types, training strategies, evaluation designs, and deployment considerations, and synthesized the findings qualitatively.
Results: Twenty-two studies were included, spanning early screening, clinical diagnosis, prognostic evaluation, and emerging treatment-related decision-support applications. Modalities included electronic health records, electrocardiograms, phonocardiograms, echocardiography, cardiac imaging, and wearable data. Most studies used horizontal federated learning with FedAvg baselines, with variants targeting non-IID heterogeneity, personalization, and efficiency. Performance often approached that of centralized training and exceeded single-site baselines. However, the evidence base was predominantly retrospective and frequently relied on public datasets or simulated client splits, while reporting of site-held-out validation, calibration, subgroup performance, privacy safeguards, robustness, and system costs was inconsistent.
Conclusion: Federated learning is a promising paradigm for privacy-preserving, multi-institutional cardiovascular disease prediction, yet the evidence remains mainly retrospective, with heterogeneous validation and limited deployment-grade reporting. This review synthesizes task, modality, federation design, and operational constraints, and highlights adoption priorities: site-held-out and external validation with calibration and subgroup fairness monitoring, communication-efficient personalization under drift, auditable privacy and security safeguards, and clinical-grade MLOps for monitoring, rollback, and continual updating.