A novel federated learning framework for medical imaging: Resource‐efficient approach combining PCA with early stopping

Negin Piran Nanekaran; Eranga Ukwatta

doi:10.1002/mp.18064

. 2025 Sep 3;52(8):e18064. doi: 10.1002/mp.18064


PMCID: PMC12409104 PMID: 40903928

Abstract

Background

Federated learning (FL) facilitates collaborative model training across multiple institutions while preserving privacy by avoiding the sharing of raw data, a critical consideration in medical imaging applications. Despite its potential, FL faces challenges such as high‐dimensional data, heterogeneity among datasets from different centers, and resource constraints, which limit its efficiency and effectiveness in healthcare settings.

Purpose

This study aims to present a novel adaptive FL framework to address the challenges of data heterogeneity and resource constraints in medical imaging. The proposed framework is designed to optimize computational efficiency, enhance training processes, improve model performance, and ensure robustness against non‐independent and identically distributed (non‐IID) data across decentralized data sources.

Methods

The proposed adaptive FL framework addresses the challenges of high‐dimensional data and heterogeneity in nonuniform, decentralized data sources through a key innovation: federated incremental principal component analysis (FIPCA). FIPCA achieves privacy‐preserving dimensionality reduction by aggregating local scatter matrices and means from participating centers, enabling the computation of a global PCA model. This process aligns data across centers, mitigates heterogeneity, and significantly reduces computational complexity. We evaluated the framework's ability to generalize across institutions in a cross‐site classification task distinguishing clinically significant prostate cancer (csPCa) from non‐csPCa. This assessment used 1500 T2‐weighted (T2W) prostate MRI images from three institutions, where two centers (800 + 350 cases) were used for training and validation, and one center (350 cases) served as an independent test site.

Results

The proposed method reduced the number of global training rounds from 200 to 38 and achieved a 98% reduction in energy consumption compared to the standard FedAvg algorithm. FIPCA‐based dimensionality reduction accelerated convergence and enhanced generalizability, while the adaptive early‐stopping mechanism optimized resource utilization and prevented overfitting. Together, these improved model performance: the area under the curve (AUC) on the unseen test center increased from 0.68 to 0.73 (95% CI 0.70–0.77). The method also demonstrated improved sensitivity and specificity, indicating better classification performance than FedAvg.

Conclusions

Our adaptive FL approach efficiently handles large, heterogeneous medical imaging data, reducing training time and computational overhead, while improving model accuracy. The substantial reduction in energy consumption and accelerated convergence make it suitable for real‐world healthcare settings.

Keywords: data privacy, data heterogeneity, federated learning, medical image processing, principal component analysis, prostate cancer imaging

1. INTRODUCTION

AI‐driven models have achieved advances in medical imaging analysis, enhancing patient outcomes, and improving the efficiency of diagnostic processes across various imaging modalities. However, realizing this potential hinges on access to high‐quality, large, and diverse training datasets. ¹ Federated learning (FL) has emerged as a promising solution to this challenge, particularly in the healthcare sector, where data privacy is a critical concern. ² , ³ Traditional machine learning approaches rely on centralized data collection, which poses compliance challenges with data protection regulations such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA). FL addresses these limitations by enabling decentralized learning across multiple institutions. This approach ensures that sensitive data remains securely localized at each client site while facilitating collaborative model training. ⁴

FL, introduced by McMahan et al. in 2017, ⁵ enables decentralized machine learning by aggregating local model updates through the federated averaging (FedAvg) algorithm, eliminating the need to share raw data. While FedAvg remains a benchmark for performance comparisons, ⁶ its effectiveness diminishes in non‐independent and identically distributed (non‐IID) data scenarios, which are prevalent in medical settings. Moreover, it often requires numerous global training rounds to converge, escalating computational costs and increasing the risk of overfitting. ⁷ The inherent diversity of medical data, which spans various modalities, dimensions, and feature types, is further compounded by differences in acquisition protocols, equipment, and demographic factors across institutions. This heterogeneity significantly impacts FL model convergence and accuracy, posing a critical challenge in decentralized FL environments. ⁸

Techniques like FedMax and Federated disentanglement (FedDis) have been developed to tackle non‐IID data issues. FedMax reduces activation divergence across sources, while FedDis focuses on disentangling anatomical features, leveraging the consistency of anatomical structures across patients to improve anomaly detection and address data heterogeneity. ⁹ , ¹⁰ , ¹¹ Additionally, FedProx ¹² extends FedAvg by introducing a proximal term to reduce model divergence in heterogeneous data, enhancing alignment across local models. Similarly, HarmoFL ¹³ addresses data heterogeneity through amplitude normalization, outperforming state‐of‐the‐art methods in tasks such as breast cancer histology, nuclei segmentation, and prostate MRI.

Beyond these, optimization‐based methods such as SCAFFOLD (control variates), ¹⁴ MOON (contrastive learning), ¹⁵ FedDyn (dynamic regularization), ¹⁶ and AFL (agnostic updates) ¹⁷ aim to stabilize client updates and align local objectives with the global goal. Complementing these approaches, personalization strategies like FedPer, ¹⁸ Per‐FedAvg, ¹⁹ pFedMe, ²⁰ Ditto, ²¹ and FedRep ²² enable client‐specific adaptations by decoupling shared representation learning from personalized heads. Further, clustering‐based approaches including IFCA ²³ and FedEM ²⁴ assume latent client groupings or model data as a mixture of experts to better serve diverse distributions.

In parallel, several optimization‐based strategies have been proposed to enhance FL performance under non‐IID conditions. FedAvgM introduces a momentum‐based extension to the standard FedAvg, effectively improving convergence stability in heterogeneous environments. ²⁵ Similarly, adaptive optimization techniques such as FedAdam, FedAdagrad, and FedYogi adapt popular centralized optimizers to the federated setting, allowing for better handling of client drift and communication efficiency. ²⁶ These methods dynamically adjust learning rates or gradients across rounds to mitigate the impact of data heterogeneity on global model convergence.

Building on these advances, several studies have applied FL directly to medical imaging problems under real‐world non‐IID constraints. For instance, Kades et al. ²⁷ integrated FL into the Kaapana platform for prostate MRI segmentation, achieving robust cross‐site performance by training nnU‐Net models across hospitals. Alekseenko et al. ²⁸ proposed a distance‐aware clustering approach for brain and prostate MRI segmentation, improving personalization and generalization. In CT‐based tasks, Darzi et al. ²⁹ used Vision Transformers in a federated setting to better capture minority patterns in lung CT, while Yang et al. ³⁰ introduced semi‐supervised FL to segment COVID lesions from partially labeled CT data. SplitAVG ³¹ and FedBN ³² addressed feature shift and distribution skew in X‐ray and fundus imaging. In histopathology, Adnan et al. ³³ demonstrated that FL with differential privacy preserved accuracy under subtype‐skewed whole slide image (WSI) datasets for lung cancer classification. Multi‐modal FL studies, such as Borazjani et al., ³⁴ designed architectures accommodating clients with missing modalities, tackling challenges in cancer staging across heterogeneous data sources. Collectively, these works highlight the adaptability of FL in overcoming practical hurdles in real‐world medical imaging, especially under non‐IID constraints.

While FedAvg remains the standard in FL, other distributed optimization methods capable of handling distributional differences are also being investigated. Methods such as Federated Dropout, ³⁵ compression‐based FL techniques such as FedSZ, ³⁶ and FedGAN (which combines FL with Generative Adversarial Networks) offer solutions for managing imbalanced and non‐IID data, particularly in resource‐limited settings like Internet of Things (IoT) and Internet of Medical Things (IoMT). ³⁷ , ³⁸ However, despite these innovations, challenges such as communication overhead, data heterogeneity, and bias in model aggregation remain significant. ³⁹

Our work advances FL by introducing a novel approach that combines federated incremental PCA (FIPCA) and adaptive early stopping mechanisms to address the challenges of data heterogeneity and computational efficiency in medical imaging. While early stopping based on client validation loss aggregation has been previously explored, ⁴⁰ , ⁴¹ our contribution lies in how it is harmonized with privacy‐preserving dimensionality reduction to improve convergence speed, generalizability, and energy efficiency in a realistic healthcare setting. To the best of our knowledge, this is the first study to apply FL for dimensionality reduction using FIPCA, leveraging the aggregation of local means and scatter matrices from individual centers to compute a global variance. This approach enables consistent and privacy‐preserving dimensionality reduction across centers with diverse data distributions, effectively aligning these distributions while retaining essential features. By reducing computational complexity and preserving critical variations, FIPCA not only addresses the challenge of heterogeneity but also enhances model performance, as reflected in improved AUC values. Additionally, we incorporate adaptive early stopping mechanisms at both client and server levels to further optimize resource utilization. Client‐side early stopping halts local training when performance improvements plateau, reducing unnecessary computations and preventing overfitting on local datasets. Meanwhile, server‐side adaptive early stopping monitors aggregated client validation losses to determine when global training should cease, ensuring efficiency and preventing overfitting in the global model. Together, these innovations significantly improve training efficiency and robustness to non‐IID data, addressing key challenges in federated learning for high‐dimensional medical data.

2. METHODOLOGY

In this study, we describe a method for prostate cancer classification using T2W MRI images, preserving the full 3D context since lesions frequently span multiple slices. We propose a novel FL algorithm that enhances the standard FedAvg, improving both training efficiency and model performance in FL, particularly for medical imaging applications involving high‐dimensional data. This study used 1500 T2W prostate MRI scans from three medical centers. Radboud University Medical Center (RUMC, 800 cases) and Prostate Cancer Neuroendocrine Network (PCNN, 350 cases) provided data for training and validation, while Ziekenhuisgroep Twente (ZGT, 350 cases) served as an independent test site. These data were used to evaluate the generalizability of the proposed framework in a cross‐institution federated classification setting.

Figure 1 illustrates the complete process. Our adaptive FL pipeline comprises three key components: (i) a novel method based on principal component analysis (PCA) for privacy‐preserving dimensionality reduction and harmonization of inter‐center variation, (ii) client‐side training with a custom loss function that balances penalties for false negatives and false positives while incorporating an AUC‐oriented regularization term, and (iii) an early‐stopping strategy that minimizes unnecessary computation and enhances training efficiency. The following sections detail the key algorithms and processes underlying our proposed method.

FIGURE 1. Our proposed adaptive process in an FL setup, illustrating federated data preprocessing, client‐side adaptive training, and server‐side coordination with an early stopping mechanism. Steps 5 and 6 repeat as rounds between clients and the server until the convergence criteria are met. FL, federated learning.

2.1. Preprocessing via FIPCA

FIPCA is designed to harmonize feature distributions across institutions and to project data into a lower‐dimensional space, all while preserving data privacy. In the first step of our preprocessing method, local statistics are computed at each center. Each center calculates its own sample count $N_{i}$ , local mean $μ_{i}$ , and local scatter matrix $S_{i}$ from its data. To compute these statistics, the segmented 3D T2W MRI images ( $X_{i}$ ) are first flattened into voxel‐level vectors. Throughout this paper, the term feature refers to this voxel‐intensity vector obtained by flattening the segmented 3D T2W image. This vectorization transforms each image into a point in a common high‐dimensional space, allowing FIPCA to compute covariance structures across centers. This step is essential, as the scatter matrices used by FIPCA are defined over vectorized feature spaces and must be linearly aggregable. The centers then send these local statistics ( $μ_{i}$ , $S_{i}$ , and $N_{i}$ ) to the server for aggregation, ensuring that no raw data is shared.

In the second step, the global FIPCA is performed at the server. The global covariance matrix is derived by normalizing the aggregated scatter matrix received from all centers. An eigen‐decomposition is then carried out to extract the top $k$ principal components, capturing the most significant directions of variance across the data from all centers. In the third and final step, each center applies the FIPCA model to its local data. By subtracting the global mean and projecting onto the principal components, the data is transformed into a unified, lower‐dimensional space. The transformed datasets are subsequently split into training and testing sets for the federated learning process.
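The server‐side aggregation in the second step can be written out explicitly. As a minimal sketch, assuming each $S_{i}$ is the scatter of center $i$ about its own local mean $μ_{i}$ (the text does not state the centering convention), the standard pooled‐scatter identity gives:

$$
\begin{aligned}
N &= \sum_{i=1}^{M} N_{i}, \qquad μ_{global} = \frac{1}{N} \sum_{i=1}^{M} N_{i} \, μ_{i}, \\
S_{global} &= \sum_{i=1}^{M} \left[ S_{i} + N_{i} \, (μ_{i} - μ_{global})(μ_{i} - μ_{global})^{T} \right], \qquad Cov_{global} = \frac{S_{global}}{N}.
\end{aligned}
$$

Under this convention the second term inside the sum carries the between‐center mean shift. If it were omitted, the result would be the pooled within‐center covariance rather than the covariance of the combined data.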

Algorithm 1 outlines the step‐by‐step process of loading, preprocessing, and splitting the data for FL. The number of principal components ( $k$ ) and the batch size ( $b$ ) are specified to guide the dimensionality reduction process. The value of ( $k$ ) can be selected based on desired variance retention or validated empirically. Batching is used solely to reduce memory usage during computation of local scatter matrices; all batches are fully aggregated, and thus batch size does not affect the resulting PCA components or model performance. The output of this algorithm includes the derived FIPCA components ( $W$ ) and the transformed datasets for each center, which are projected into a lower‐dimensional space for subsequent training.

ALGORITHM 1. FIPCA for 3D medical images.

Input:
  $X_{i}$ : Data from center $i$ (in 3D format)
  $M$ : Number of centers
  $N_{i}$ : Number of samples at center $i$
  $d$ : Flattened data dimension ( $d = \text{resolution}^{3}$ )
  $k$ : Number of principal components
  $b$ : Batch size
Output:
  PCA components $W \in \mathbb{R}^{d \times k}$
  Transformed datasets $X_{train}^{(i)}, X_{test}^{(i)}$ for each center
Initialize:
  $S_{global} = 0$ : Global scatter matrix
  $N_{total} = 0$ : Total number of samples
Step 1: Local Statistics at Each Center
for each center $i = 1$ to $M$ do
  Load and flatten $X_{i}$
  for each batch $B_{i}$ of size $b$ do
    Compute batch mean: $μ_{B_{i}} = \frac{1}{b} \sum_{x \in B_{i}} x$
    Center data: $X_{B_{i}}^{'} = B_{i} - μ_{B_{i}}$
    Update global scatter matrix: $S_{global} \leftarrow S_{global} + X_{B_{i}}^{'T} X_{B_{i}}^{'}$
    Update total sample count: $N_{total} \leftarrow N_{total} + b$
  end for
end for
Step 2: Global PCA
Normalize scatter matrix: $Cov_{global} = \frac{S_{global}}{N_{total}}$
Compute eigenvectors: $W \in \mathbb{R}^{d \times k}$ from $Cov_{global}$
Step 3: Apply PCA to Each Center
for each center $i = 1$ to $M$ do
  Load $X_{i}$ and subtract global mean: $X_{i}^{'} = X_{i} - μ_{global}$
  Apply PCA transformation: $X_{i}^{'PCA} = X_{i}^{'} W$
  Split into training and testing sets and save
end for


This preprocessing method offers several key advantages. First, it maintains privacy by ensuring that no raw data is exchanged between centers. Second, it is efficient, as the FIPCA processes data in memory‐efficient batches. Finally, it ensures consistency by projecting the data from all centers into a unified lower‐dimensional space, mitigating the effects of diverse data distributions across centers.
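The three FIPCA steps can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names (`local_statistics`, `federated_pca`, `project`) and the toy data are hypothetical, and the sketch assumes that local scatter matrices are centered on local means and combined with a between‐center mean‐shift term so that the pooled covariance matches the covariance of the combined data.

```python
import numpy as np


def local_statistics(X):
    """Client side: return (N_i, mu_i, S_i) for one center's flattened data."""
    mu = X.mean(axis=0)
    centered = X - mu
    return X.shape[0], mu, centered.T @ centered


def federated_pca(stats, k):
    """Server side: pool local statistics and return (global mean, W)."""
    n_total = sum(n for n, _, _ in stats)
    mu_global = sum(n * mu for n, mu, _ in stats) / n_total
    d = mu_global.shape[0]
    scatter = np.zeros((d, d))
    for n, mu, s in stats:
        shift = (mu - mu_global)[:, None]
        # Within-center scatter plus the between-center mean-shift term
        # (assumed convention, see lead-in).
        scatter += s + n * (shift @ shift.T)
    cov = scatter / n_total
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    return mu_global, eigvecs[:, order]


def project(X, mu_global, W):
    """Client side: subtract the global mean and project onto W."""
    return (X - mu_global) @ W


# Toy data standing in for flattened volumes from two centers (hypothetical).
rng = np.random.default_rng(0)
X_a = rng.normal(0.0, 1.0, size=(40, 6))
X_b = rng.normal(0.5, 2.0, size=(25, 6))

stats = [local_statistics(X_a), local_statistics(X_b)]   # only statistics leave a center
mu_global, W = federated_pca(stats, k=3)
Z_a = project(X_a, mu_global, W)
```

In this sketch only `(N_i, mu_i, S_i)` crosses the client boundary. For real volumes $d$ is large, so a $d \times d$ scatter matrix would be costly to form; the toy dimension here is chosen only to keep the example small.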

The PI‐CAI dataset exhibits known heterogeneity in image acquisition across institutions, including differences in MRI vendors, resolution, and csPCa prevalence. Site‐specific factors such as scanner type (e.g., Siemens at RUMC vs. Philips at ZGT), resolution settings, and csPCa distribution (29.5% at RUMC, 31.1% at PCNN, and 22.8% at ZGT) introduce statistical non‐IID characteristics, and the variations in scanner hardware and image quality affect T2W‐derived features. To mitigate this inter‐site variability, we adopted FIPCA as a central preprocessing step. This produced a more aligned and harmonized feature space before training, which was critical for improving model convergence and generalization under federated non‐IID conditions.

2.2. Client‐side training

On the client side, a custom loss function is employed to address class imbalance and to improve classification performance. The loss function includes terms that balance penalties for false negatives and false positives, and incorporates an AUC‐oriented regularization component. This design is particularly important for medical imaging tasks, where both false negatives and false positives can have significant consequences. The loss function is given in Equation 1. It combines the standard cross‐entropy loss, $L_{CE}$ , with additional terms for false negatives, false positives, and AUC‐oriented regularization. The false negative penalty, ${FN}_{penalty}$ , is defined as $y_{true} \cdot (1 - y_{pred})$ , while the false positive penalty, ${FP}_{penalty}$ , is given by $(1 - y_{true}) \cdot y_{pred}$ . Additionally, the AUC‐oriented regularization term, ${AUC}_{reg}$ , is calculated as ${(y_{pred} - y_{true})}^{2}$ . The coefficients $λ_{FN}, λ_{FP}, λ_{AUC}$ determine the relative weight of each of these components in the final loss function. For the present use case, we set $λ_{FN} = 2$ , $λ_{FP} = 1$ , and $λ_{AUC} = 0.1$ . These values were selected via grid search and reflect the class imbalance (csPCa prevalence $\approx 28\%$ ) by assigning a greater penalty to false negatives.

$$
\begin{aligned}
L_{custom} &= L_{CE} + λ_{FN} \cdot {FN}_{penalty} + λ_{FP} \cdot {FP}_{penalty} + λ_{AUC} \cdot {AUC}_{reg} \\
&= L_{CE} + λ_{FN} \cdot y_{true} \cdot (1 - y_{pred}) + λ_{FP} \cdot (1 - y_{true}) \cdot y_{pred} + λ_{AUC} \cdot {(y_{pred} - y_{true})}^{2}
\end{aligned}
\tag{1}
$$
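Equation 1 can be evaluated directly on a batch of predictions. The NumPy sketch below is illustrative only: it assumes `y_pred` is the predicted probability of the csPCa class, uses binary cross‐entropy for $L_{CE}$, and averages over the batch, none of which is specified in the text. The function name `custom_loss` is hypothetical; the default coefficients are the values reported above.

```python
import numpy as np


def custom_loss(y_true, y_pred, lam_fn=2.0, lam_fp=1.0, lam_auc=0.1, eps=1e-7):
    """Batch mean of Equation 1 (assumed reduction: mean over samples)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities so the logarithms stay finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    ce = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    fn_penalty = y_true * (1.0 - y_pred)
    fp_penalty = (1.0 - y_true) * y_pred
    auc_reg = (y_pred - y_true) ** 2
    per_sample = ce + lam_fn * fn_penalty + lam_fp * fp_penalty + lam_auc * auc_reg
    return per_sample.mean()


# A missed positive is penalized more than an equally confident false alarm.
missed_positive = custom_loss([1], [0.2])
false_alarm = custom_loss([0], [0.8])
```

With $λ_{FN} = 2$ and $λ_{FP} = 1$, the two cases share the same cross‐entropy and squared‐error terms, so the gap between them comes only from the asymmetric penalty weights.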

The neural network used for client‐side training comprises three fully connected layers. The input layer accepts data with dimensions equal to the number of principal components derived from FIPCA preprocessing. The first dense layer consists of 128 units with ReLU activation, followed by batch normalization. The second dense layer has 64 units, also with ReLU activation and batch normalization. A dropout layer with a rate of 0.5 is applied before the output layer, which contains 2 units with softmax activation for binary classification. The total number of trainable parameters depends on the number of principal components ( $k$ ) provided by FIPCA. The model is optimized using the Adam optimizer with a learning rate adjusted dynamically according to the server round number $t$ , as given in Equation 2, where $lr_{0}$ is the initial learning rate, $\text{decay\_rate}$ is the decay rate, and $\text{decay\_steps}$ determines how often the decay is applied (every 25 rounds). Additionally, an early stopping mechanism that monitors validation loss with a patience of five epochs is implemented to prevent overfitting. Algorithm 2 provides an overview of the client‐side training process.

$$
lr = lr_{0} \times {(\text{decay\_rate})}^{\left\lfloor \frac{t}{\text{decay\_steps}} \right\rfloor}
\tag{2}
$$
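The step‐wise schedule in Equation 2 can be sketched as a small function. The initial learning rate and decay rate below are hypothetical placeholders, since the text does not report them; only `decay_steps=25` is taken from the description above.

```python
def learning_rate(server_round, lr0=1e-3, decay_rate=0.9, decay_steps=25):
    """Equation 2: decay the learning rate once every `decay_steps` server rounds.

    lr0 and decay_rate are illustrative values, not the paper's settings.
    """
    return lr0 * decay_rate ** (server_round // decay_steps)


# The rate is constant within each block of 25 rounds and drops at rounds 25, 50, ...
schedule = [learning_rate(t) for t in (1, 24, 25, 38, 50)]
```

Because the exponent uses integer division, a run that stops at round 38 would see exactly one decay step under these assumptions.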

ALGORITHM 2. Client training with custom loss function and early stopping.

Initialize seed $s \leftarrow 12345$
Load $(X_{train}, y_{train}), (X_{val}, y_{val})$ from local dataset
Define neural network model $Model$ with specified architecture
Compile $Model$ with optimizer, loss function $L_{CE}$ , and evaluation metrics
Apply class weights to handle class imbalance
Train $Model$ on $X_{train}$ with early stopping based on validation loss
Save initial training history
while training not converged do
  Receive global weights $w_{g}$ and server round $t$ from the server
  Set model weights $w \leftarrow w_{g}$
  Compute learning rate $lr$ using Equation 2
  Define custom loss function $L_{custom}$ as in Equation 1
  Compile $Model$ with optimizer (learning rate $lr$ ), loss function $L_{custom}$ , and metrics
  Train $Model$ on $X_{train}$ with early stopping
  Evaluate $Model$ on validation data to obtain $L_{val}$ , $A_{val}$
  Save training history
  Send updated weights $w$ , number of samples, and metrics $\{L_{val}, A_{val}\}$ to the server
end while
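Since the number of trainable parameters depends on $k$, the dependence can be made concrete with a short calculation. This sketch assumes dense layers with bias terms and batch normalization layers with one learnable scale and one learnable shift per unit (the usual convention); dropout contributes no parameters. The function name is hypothetical.

```python
def trainable_parameters(k):
    """Trainable parameter count for the k -> 128 -> 64 -> 2 network described above."""
    dense1 = k * 128 + 128      # weights + biases
    bn1 = 2 * 128               # scale and shift (assumed learnable)
    dense2 = 128 * 64 + 64
    bn2 = 2 * 64
    output = 64 * 2 + 2
    return dense1 + bn1 + dense2 + bn2 + output


params_k10 = trainable_parameters(10)   # 10 principal components, as used in the experiments
```

Under these assumptions only the first dense layer scales with $k$, so each additional principal component adds 128 weights.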


2.3. Server‐side coordination

The server initializes the global model parameters $w^{(0)}$ and coordinates the training process over multiple global rounds. At each round $t$ , a subset of clients $S_{t}$ is selected to participate in training. Each client $k \in S_{t}$ trains the global model on its local dataset using the custom loss function and client‐side early stopping, and returns updated local model parameters $w_{k}^{(t)}$ along with their validation loss $L_{k}^{(t)}$ and validation accuracy $a_{k}^{(t)}$ .

The server aggregates the local model updates using a weighted average, where the weights are based on both the number of training samples and the clients' validation accuracy, as shown in Equation 3, where $n_{k} = | D_{k} |$ is the number of training samples on client $k$ , $a_{k}$ is the validation accuracy of client $k$ , and $n = \sum_{k \in S_{t}} n_{k} \cdot a_{k}$ is the total weighted samples across all participating clients. This weighted aggregation gives higher importance to clients with better validation performance, which can improve the overall model convergence.

Then the server computes the aggregated validation loss $L_{t}$ using Equation 4 and applies a practical early stopping mechanism, informed by aggregated client validation losses, to halt training once improvements plateau beyond a defined patience and threshold. While not novel on its own, this mechanism plays a vital role in the overall efficiency of our framework, especially in conjunction with FIPCA. Algorithm 3 outlines the server coordination process.

$$
w^{(t)} = \frac{\sum_{k \in S_{t}} n_{k} \cdot a_{k} \cdot w_{k}^{(t)}}{\sum_{k \in S_{t}} n_{k} \cdot a_{k}}
\tag{3}
$$

$$
L_{t} = \frac{\sum_{k \in S_{t}} n_{val, k} \cdot L_{k}^{(t)}}{\sum_{k \in S_{t}} n_{val, k}}
\tag{4}
$$
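Equations 3 and 4 amount to two weighted averages, which the following NumPy sketch illustrates. The function names and the client weights, accuracies, and losses are hypothetical; the sample counts are the RUMC and PCNN split sizes from Table 2. Model weights are represented as a list of arrays, one per layer, which is an assumption about the data layout.

```python
import numpy as np


def aggregate_weights(client_weights, n_train, val_acc):
    """Equation 3: average client weights with coefficients n_k * a_k."""
    coeffs = np.asarray(n_train, dtype=float) * np.asarray(val_acc, dtype=float)
    coeffs = coeffs / coeffs.sum()
    n_layers = len(client_weights[0])
    return [
        sum(c * w[layer] for c, w in zip(coeffs, client_weights))
        for layer in range(n_layers)
    ]


def aggregate_val_loss(val_losses, n_val):
    """Equation 4: validation loss weighted by validation-set size."""
    return float(np.average(val_losses, weights=n_val))


# Two hypothetical clients, each holding a one-layer model.
weights_rumc = [np.array([1.0, 1.0])]
weights_pcnn = [np.array([3.0, 5.0])]
w_global = aggregate_weights(
    [weights_rumc, weights_pcnn], n_train=[640, 280], val_acc=[0.75, 0.5]
)
loss_global = aggregate_val_loss([0.5, 0.75], n_val=[160, 70])
```

Note that the two equations use different weights: model parameters are weighted by training size times validation accuracy, whereas the aggregated loss is weighted by validation size only.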

ALGORITHM 3. Federated learning server with adaptive early stopping.

Initialize global weights $w^{(0)}$ , best loss $L_{best} \leftarrow \infty$ , wait counter $c \leftarrow 0$ , patience $p$ , tolerance $ε$ , delta $δ$ , minimum rounds $t_{\min}$
for each round $t = 1$ to $T$ do
  if training converged then
    Break (Early stopping)
  end if
  Select a subset of clients $S_{t}$
  Broadcast global weights $w^{(t - 1)}$ and round $t$ to clients in $S_{t}$
  Receive updated weights $\{w_{k}^{(t)}\}$ , validation losses $\{L_{k}^{(t)}\}$ , and validation accuracies $\{a_{k}^{(t)}\}$ from clients
  Aggregate global weights using Equation 3
  Compute aggregated validation loss: $L_{t} = \frac{\sum_{k \in S_{t}} n_{val, k} \cdot L_{k}^{(t)}}{\sum_{k \in S_{t}} n_{val, k}}$
  if $L_{t} < L_{best} - δ$ then
    $L_{best} \leftarrow L_{t}$
    $c \leftarrow 0$
  else if $L_{t} \leq L_{best} + ε$ then
    $c \leftarrow 0$
  else
    $c \leftarrow c + 1$
  end if
  if $t \geq t_{\min}$ and $c \geq p$ then
    Save $w^{(t)}$ as final model
    Break (Early stopping)
  end if
end for
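The stopping rule in Algorithm 3 can be traced on a sequence of aggregated validation losses. The sketch below isolates only the counter logic; the function name, the loss values, and the settings for patience, delta, tolerance, and minimum rounds are hypothetical, since the text does not report the values used.

```python
import math


def early_stopping_round(round_losses, patience, delta, epsilon, min_rounds):
    """Return the 1-based round at which training stops, or None if it never does."""
    best = math.inf   # L_best
    wait = 0          # wait counter c
    for t, loss in enumerate(round_losses, start=1):
        if loss < best - delta:
            best = loss          # meaningful improvement: update best, reset counter
            wait = 0
        elif loss <= best + epsilon:
            wait = 0             # within tolerance of the best loss: reset counter
        else:
            wait += 1            # worse than best by more than the tolerance
        if t >= min_rounds and wait >= patience:
            return t
    return None


losses = [0.70, 0.65, 0.645, 0.66, 0.67, 0.68, 0.69]
stop = early_stopping_round(losses, patience=3, delta=0.01, epsilon=0.005, min_rounds=5)
```

In this trace the best loss is set at round 2, round 3 falls inside the tolerance band, and rounds 4 to 6 each increment the counter, so training stops at round 6.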


3. RESULTS

We conducted a series of experiments to evaluate the effectiveness of our proposed method. The Prostate Imaging Cancer AI (PI‐CAI) dataset, ⁴² consisting of 1500 patient cases, was utilized for this purpose. These cases were sourced from three distinct medical institutions. Data from two centers were used for training and validation, while the third center was exclusively used for testing the model's generalization performance. The distribution of total cases, as well as csPCa cases across the centers, is summarized in Table 1. This setup allowed us to evaluate the method's performance in a realistic federated learning scenario where data is distributed across institutions. To provide further granularity, Table 2 presents a breakdown of each center's data by training, validation, and test set splits, along with corresponding class distributions.

TABLE 1.

Distribution of total cases and csPCa cases across centers. PCNN and RUMC are used for training and validation, while ZGT is reserved for testing.

Center	Total Cases	csPCa Cases	Purpose
RUMC	800	236	Train/validation
PCNN	350	109	Train/validation
ZGT	350	80	Test
Total	1500	425


Abbreviations: csPCa, clinically significant prostate cancer; PCNN, Prostate Cancer Neuroendocrine Network; RUMC, Radboud University Medical Center; ZGT, Ziekenhuisgroep Twente.

TABLE 2.

Distribution of total cases and csPCa cases across centers and dataset splits. Training and validation sets are drawn from RUMC and PCNN. ZGT is reserved for testing.

Center	Set	Total Cases	csPCa Cases	Non‐csPCa Cases
RUMC	Train	640	190	450
RUMC	Validation	160	46	114
PCNN	Train	280	87	193
PCNN	Validation	70	22	48
ZGT	Test	350	80	270


Abbreviations: csPCa, clinically significant prostate cancer; PCNN, Prostate Cancer Neuroendocrine Network; RUMC, Radboud University Medical Center; ZGT, Ziekenhuisgroep Twente.

Figure 2 illustrates the FL setup used in this study, with RUMC and PCNN serving as the training and validation centers. The local models are trained separately on these centers and are aggregated on the global server using the Flower framework. The global model is then tested at ZGT to evaluate its generalization performance in an unseen dataset.

FIGURE 2. Federated learning topology using the Flower framework with two centers (RUMC and PCNN) for training/validation and one center (ZGT) for testing.

The dataset used for the experiments provided 3D prostate segmentation labels for each patient, allowing us to utilize the 3D prostate structure in defining the input space. Feature vectors were obtained by flattening the voxel intensities of the prostate‐segmented 3D T2W images into high‐dimensional vectors, which were then reduced federatively using FIPCA prior to classification. We applied FIPCA to reduce the dimensionality to 10 features, retaining the essential structure of the data while improving computational efficiency and preserving data privacy. Each center computed local scatter matrices and means from its data, and these local statistics were aggregated at the server to derive the global FIPCA model without sharing raw data. The neural network architecture was adjusted to match the input size of the FIPCA‐transformed data. Multiple clients, each with different local data distributions, were incorporated into the experiment, simulating the heterogeneity commonly found in FL scenarios. To assess the model's convergence and generalization capabilities, validation loss was monitored throughout the global training rounds, while the final evaluation was performed on the test center's data, which remained entirely unseen during training.

To evaluate the impact of FIPCA on cross‐site feature alignment, we computed the relative distance between site centroids before and after dimensionality reduction. Relative distance measures inter‐site separation normalized by each site's internal variability, providing a more interpretable metric of heterogeneity than absolute distances, especially in federated settings. As illustrated in Figure 3, the relative distance decreased by 96% to 99% across all center pairs following the FIPCA transformation. This substantial reduction, observed consistently for the PCNN–RUMC, PCNN–ZGT, and RUMC–ZGT pairs, demonstrates effective alignment of site‐level feature distributions. These results confirm that FIPCA not only preserves privacy and reduces dimensionality but also harmonizes feature distributions across institutions, a critical factor for robust federated learning in medical imaging.

FIGURE 3. Relative distance between site centroids before and after applying FIPCA. Blue bars represent distances computed on standardized raw features, while orange bars show distances after FIPCA transformation. FIPCA, federated incremental principal component analysis.
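The exact normalization behind the relative distance is not given in the text. One plausible formalization, shown here purely as an assumption, divides the Euclidean distance between two site centroids by the average within‐site spread, where spread is the root‐mean‐square distance of a site's samples from its own centroid. The function name and toy data are hypothetical.

```python
import numpy as np


def relative_centroid_distance(X_a, X_b):
    """Centroid separation divided by the mean within-site spread (assumed definition)."""
    c_a, c_b = X_a.mean(axis=0), X_b.mean(axis=0)
    spread_a = np.sqrt(((X_a - c_a) ** 2).sum(axis=1).mean())
    spread_b = np.sqrt(((X_b - c_b) ** 2).sum(axis=1).mean())
    return np.linalg.norm(c_a - c_b) / (0.5 * (spread_a + spread_b))


# Two toy sites with unit spread whose centroids are four units apart.
site_a = np.array([[0.0, 0.0], [2.0, 0.0]])
site_b = np.array([[4.0, 0.0], [6.0, 0.0]])
rel = relative_centroid_distance(site_a, site_b)
```

A value well above 1 indicates that sites are separated by more than their internal variability; the same function applied to raw and FIPCA‐transformed features would give the kind of before/after comparison the figure reports.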

The total variance before and after applying FIPCA for each center is also illustrated in Figure 4. The FIPCA approach effectively reduced dimensionality while retaining between 45% and 68% of the total variance across centers. This demonstrates the method's ability to align data distributions across centers, addressing the challenges posed by heterogeneity in FL and enhancing overall model performance by improving consistency. While increasing the number of principal components (PCs) beyond 10 would preserve more variance, it did not yield improvements in the AUC for downstream tasks. Therefore, 10 PCs were selected as an optimal balance between variance retention and computational efficiency.

FIGURE 4. Total variance before and after FIPCA for each center. FIPCA, federated incremental principal component analysis.
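The per‐center retained variance can be computed as the ratio of total variance after projection to total variance before it. The sketch below assumes that variance is measured about each center's own mean and that the columns of $W$ are orthonormal; the function name and toy data are hypothetical.

```python
import numpy as np


def retained_variance(X, W):
    """Fraction of a center's total variance kept after projecting onto W."""
    total = X.var(axis=0).sum()          # sum of per-feature variances
    kept = (X @ W).var(axis=0).sum()     # same quantity in the projected space
    return kept / total


# Toy center: variance 0.5 on axis 0, 2.0 on axis 1, none on axis 2.
X_toy = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, -2.0, 0.0]])
W_toy = np.array([[0.0], [1.0], [0.0]])  # keep only axis 1
fraction = retained_variance(X_toy, W_toy)
```

Because the global components are fitted to the pooled data rather than to any single site, the retained fraction can differ from center to center, which is consistent with the 45% to 68% range reported above.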

3.1. Efficiency evaluation

Table 3 presents a comparison between the standard FedAvg algorithm and our proposed method. The results indicate that our FL framework substantially reduced the number of global rounds required for convergence while also improving AUC. Specifically, the standard FedAvg algorithm required 200 global rounds to achieve an AUC of 0.68. The number of global rounds for FedAvg was set to 200 based on Moradi et al., ⁴³ who found this configuration optimal for the PI‐CAI dataset. In contrast, our proposed adaptive approach required only 38 global rounds and achieved a higher AUC of 0.73. This demonstrates the effectiveness of the proposed method in enhancing both training speed and model performance.

TABLE 3.

Performance comparison between standard FedAvg and adaptive early stopping FedAvg.

Method	Global Rounds	AUC on Test center
Standard FedAvg	200	0.68
Our Proposed Method	38	0.73


Abbreviation: FedAvg, federated averaging.

According to Camajori et al.'s findings, ⁴⁴ FL models typically range from 30 to 150 MB per learning round. Reducing the rounds from 200 to 38 significantly eases the communication load in large‐scale FL networks. This optimization not only saves computational resources but also prevents overfitting by avoiding unnecessary training rounds.
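The communication saving can be estimated with simple arithmetic. This sketch assumes one model transfer of the quoted size per round and ignores the number of clients and the upload/download split, so it illustrates scale only; the function name is hypothetical.

```python
def communication_mb(rounds, model_mb):
    """Total traffic in MB, assuming one transfer of `model_mb` per round."""
    return rounds * model_mb


# Lower and upper ends of the 30 to 150 MB range quoted above.
baseline = [communication_mb(200, size) for size in (30, 150)]
adaptive = [communication_mb(38, size) for size in (30, 150)]
round_reduction = 1 - 38 / 200
```

Under these assumptions the traffic falls from 6000 to 30000 MB down to 1140 to 5700 MB, an 81% reduction that follows from the round count alone.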

The value of decentralized learning in FL is further underscored by comparing AUC results from models trained solely on individual centers and evaluated on a different center's test set. For example, a model trained on the PCNN center (one of the training centers) and evaluated on the ZGT test set achieved an AUC of 0.627. Similarly, a model trained on the RUMC center and evaluated on the ZGT test set resulted in a lower AUC of 0.596. These findings highlight the limitations of single‐center training: because of data heterogeneity, such models struggle to generalize effectively across centers.

Figure 5 further supports these results, presenting the ROC curve comparison for the test center ZGT. The proposed Adaptive FL achieves an AUC of 0.733, close to that of Central Learning (AUC = 0.749), while standard Federated Learning lags behind with an AUC of 0.685. These results demonstrate that the proposed method outperforms standard FL in handling data heterogeneity and varying data distributions across centers, and that it approaches the performance of centralized training.

ROC curve comparison for the test center ZGT. ZGT, Ziekenhuisgroep Twente.

In addition to the global rounds and AUC comparison in Table 3, Table 4 compares the proposed adaptive FL method with the standard FedAvg method and with central learning. The evaluated metrics include specificity, sensitivity (recall), and AUC. The results indicate that our proposed adaptive method outperforms the standard FedAvg approach. To statistically validate the observed improvement in model performance, we conducted a DeLong test ⁴⁵ comparing the AUCs of the proposed Adaptive FL method and standard FedAvg. The test yielded a p‐value of $< 0.01$, indicating that the improvement from an AUC of 0.685 (FedAvg) to 0.733 (Adaptive FL) on the independent ZGT test set is statistically significant. Similarly, the Central Learning method achieved an AUC of 0.749, which was also significantly higher than that of FedAvg, with a p‐value of $< 0.007$. Additionally, 95% confidence intervals (CIs) for all AUC estimates were computed using bootstrapping with 1,000 resamples to quantify uncertainty. The CI was [0.70, 0.77] for Adaptive FL and [0.61, 0.73] for FedAvg; the narrower interval for Adaptive FL further supports the observed performance difference.

TABLE 4.

Performance comparison between adaptive federated learning, central learning, and standard federated learning methods, including AUC confidence intervals and statistical significance versus FedAvg

Model	AUC	95% CI	p‐value versus FedAvg	Sensitivity	Specificity
Adaptive federated learning	0.733	[0.70, 0.77]	$< 0.01$	0.784	0.786
Standard federated learning	0.685	[0.61, 0.73]	—	0.526	0.784
Central learning	0.749	[0.71, 0.78]	$< 0.007$	0.895	0.705

Abbreviation: FedAvg, federated averaging.
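The bootstrap confidence intervals in Table 4 can be illustrated with a minimal percentile‐bootstrap sketch. This is an assumption‐laden illustration rather than the authors' implementation: the data are synthetic, the helper names `auc` and `bootstrap_auc_ci` are hypothetical, and the DeLong test itself is not implemented here.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and
    recompute the AUC on each resample."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if 0 < sum(y) < n:  # skip resamples that contain only one class
            stats.append(auc(y, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic scores (not study data): positives score higher on average.
data_rng = random.Random(42)
labels = [1] * 40 + [0] * 80
scores = [data_rng.gauss(1.0, 1.0) for _ in range(40)] + \
         [data_rng.gauss(0.0, 1.0) for _ in range(80)]

ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc(labels, scores):.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Resampling whole cases keeps each label paired with its score, which is what makes the interval reflect sampling variability in the test set.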

In addition to evaluating the model on the independent ZGT test center, we also report performance on the internal validation sets from the RUMC and PCNN centers. Table 5 summarizes the AUC, sensitivity, and specificity scores for the RUMC and PCNN validation sets alongside the ZGT test results. The model achieved strong validation performance at both training centers (AUC = 0.76 for RUMC and 0.74 for PCNN), while maintaining high generalization to the unseen ZGT center (AUC = 0.733), confirming the model's robustness across different institutions.

TABLE 5.

Performance comparison across validation sets from training centers (RUMC and PCNN) and the independent test center (ZGT).

Dataset	AUC	Sensitivity	Specificity
RUMC (Validation)	0.76	0.79	0.75
PCNN (Validation)	0.74	0.77	0.73
ZGT (Test)	0.733	0.784	0.786

Abbreviations: PCNN, Prostate Cancer Neuroendocrine Network; RUMC, Radboud University Medical Center; ZGT, Ziekenhuisgroep Twente.

We also analyzed the relationship between the number of PCs used in FIPCA and the number of global rounds required to stop training in our proposed adaptive method, as shown in Figure 6. When the number of PCs is low (near 0), the number of rounds required for training to stop is at its highest, reaching approximately 200 rounds. As the number of PCs increases, the number of stopping rounds decreases rapidly, stabilizing at around 25 rounds once the number of PCs reaches approximately 50. This suggests that a moderate number of PCs allows the model to converge much faster, leading to earlier stopping. Dimensionality reduction through FIPCA therefore accelerates training by reducing the number of rounds required for convergence while preserving key information in the dataset.

Effect of FIPCA component count on training rounds. FIPCA, federated incremental principal component analysis.

3.2. Resource consumption analysis

We analyzed resource consumption using an Apple M1 MacBook Pro (2023), featuring an 8‐core CPU and GPU. The device's power consumption ranges between 20 and 30 W during full‐load operation. We evaluated the energy consumption (summarized in Table 6) across different setups. The energy consumption can be estimated using Equation 5.

$$\text{Energy (Wh)} = \text{Power (W)} \times \text{Time (h)} \tag{5}$$

TABLE 6.

Comparison of training time and energy consumption across methods.

Method	Training time (min)	Estimated energy consumption (Wh)
Standard FedAvg (without FIPCA)	300 (5 h)	100 – 150
FedAvg with FIPCA	15	5 – 7.5
Our adaptive FL approach	5	1.6 – 2.5

Abbreviations: FedAvg, federated averaging; FIPCA, Federated Incremental Principal Component Analysis; FL, federated learning.
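The estimates in Table 6 follow from Equation 5 applied to the stated 20 to 30 W full‐load power range. The short sketch below reproduces that arithmetic; the function name `energy_wh` and the dictionary layout are illustrative choices, not part of the published code.

```python
def energy_wh(power_w: float, minutes: float) -> float:
    """Equation 5: energy (Wh) = power (W) x time (h)."""
    return power_w * minutes / 60.0


POWER_RANGE_W = (20.0, 30.0)  # full-load power range stated in the text

# Training times in minutes, as listed in Table 6.
training_minutes = {
    "Standard FedAvg (without FIPCA)": 300,
    "FedAvg with FIPCA": 15,
    "Our adaptive FL approach": 5,
}

for method, minutes in training_minutes.items():
    low, high = (energy_wh(power, minutes) for power in POWER_RANGE_W)
    print(f"{method}: {low:.1f} to {high:.1f} Wh")

# Because power is assumed constant, the saving equals the time saving.
reduction_vs_standard = 1 - 5 / 300
print(f"Reduction versus standard FedAvg: {reduction_vs_standard:.1%}")
```

With a constant power draw, the relative energy saving depends only on the ratio of training times, which gives the roughly 98% reduction relative to standard FedAvg. The lower bound for the adaptive approach evaluates to about 1.67 Wh, which Table 6 lists as 1.6.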

The proposed Adaptive FL method demonstrates a drastic reduction in training time, memory, and energy consumption. Compared to the standard FedAvg method, it reduces energy consumption by 98%, and by 50% compared to FedAvg with FIPCA. This makes it highly suitable for resource‐constrained federated learning environments. These visualizations collectively confirm the advantages of the adaptive early‐stopping strategy. The method not only reduced computational overhead by minimizing the number of global training rounds but also maintained or enhanced model accuracy, particularly when applied to FIPCA‐transformed datasets. The adaptive approach allowed the model to converge faster, demonstrating its potential for deployment in resource‐constrained FL environments.

To further investigate the contribution of each proposed component, we conducted an ablation study (Table 7). Applying FIPCA alone increased the AUC from 0.685 to 0.726, demonstrating its significant impact on model generalization. In contrast, applying early stopping alone extended training slightly beyond 200 rounds (to 206) with only a marginal AUC improvement (to 0.687), indicating limited standalone benefit. When combined, the full adaptive setup achieved the highest AUC (0.733) and the most efficient training, requiring just 38 rounds and 5 min. These results suggest that FIPCA primarily drives performance gains, while early stopping improves computational efficiency.

TABLE 7.

Ablation study results showing the contribution of FIPCA and early stopping.

Method	AUC (ZGT Test Set)	Global Rounds	Training Time (minutes)
Standard FedAvg	0.685	200 (fixed)	300
FedAvg + FIPCA only	0.726	200 (fixed)	15
FedAvg + Early stopping only	0.687	206	310
Full Adaptive (FIPCA + Early stopping)	0.733	38	5

4. DISCUSSION

The experimental results demonstrate that the proposed FL approach substantially improves both training efficiency and computational resource usage. The proposed strategy reduces the number of global rounds and helps mitigate overfitting, leading to improved generalization performance, as reflected by an increase in AUC of about 0.05 (from 0.685 to 0.733) compared to standard FedAvg.
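To make the round‐reduction mechanism concrete, the following is a minimal sketch of early stopping driven by an aggregated client validation loss. It assumes a generic patience rule with sample‐weighted averaging; the parameters `patience` and `min_delta`, the function names, and the example values are hypothetical and may differ from the stopping criterion defined in the Methods section.

```python
def aggregate_val_loss(client_losses, client_sizes):
    """Sample-weighted mean of the clients' validation losses."""
    total = sum(client_sizes)
    return sum(loss * n for loss, n in zip(client_losses, client_sizes)) / total


def stopping_round(global_losses, patience=5, min_delta=1e-3):
    """Return the 1-based round at which training stops.

    Training stops once the aggregated loss has failed to improve by at
    least `min_delta` for `patience` consecutive rounds; if that never
    happens, all rounds are run.
    """
    best = float("inf")
    stale = 0
    for rnd, loss in enumerate(global_losses, start=1):
        if loss < best - min_delta:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                return rnd
    return len(global_losses)


# Illustrative values only: two clients with different validation-set sizes.
round_loss = aggregate_val_loss([0.5, 1.0], [800, 200])
print(f"Aggregated validation loss: {round_loss:.2f}")

# A loss curve that plateaus after round 3 stops at round 6 with patience 3.
history = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
print("Stopped at round", stopping_round(history, patience=3))
```

Weighting by validation‐set size keeps a small center from dominating the stopping decision, which is one reasonable choice when cohort sizes differ as much as they do here.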

Moreover, the integration of FIPCA effectively reduces data dimensionality, accelerating convergence and optimizing resource efficiency. By aligning data distributions across centers through FIPCA, we also reduce the risk of model divergence due to data heterogeneity, enhancing the overall robustness of the global model.

Our results indicate that the proposed method consumes up to 98% less energy than standard FedAvg without FIPCA and 50% less than FedAvg with FIPCA. This energy‐saving aspect is particularly critical in medical settings, where hardware limitations or energy constraints often restrict the deployment of large‐scale federated learning models.

Techniques like partial model sharing, which selectively shares portions of model parameters or gradients, help mitigate privacy risks by reducing data exposure. Approaches such as FLOP keep final layers private while sharing other parts, ensuring privacy and personalization in applications such as COVID‐19 detection. ⁴⁶ , ⁴⁷ Other methods like cyclic and single weight transfer algorithms aim to improve FL performance but come with trade‐offs, such as biases towards recent clients and increased communication overhead. ⁴⁸ , ⁴⁹ , ⁵⁰ , ⁵¹ Alternative strategies like ensemble learning, which combines multiple model predictions, and split learning, which divides model layers between clients and a central server, enhance generalization and privacy. ⁵² , ⁵³ However, these approaches can increase communication costs and privacy risks, as they often rely on sharing model outputs instead of weights. ⁵⁴

The improvements demonstrated by our method not only highlight its practical viability but also show promise for deployment in real‐world, resource‐constrained environments such as healthcare institutions. By minimizing communication rounds and computational demands while preserving privacy, this approach bridges the gap between high‐performance federated learning and energy‐efficient applications in distributed healthcare systems.

Additionally, our method enhances data privacy beyond standard FL approaches by ensuring that only statistical summaries (means and scatter matrices) are shared during FIPCA computation, without any raw data or gradients exchanged. This mitigates critical privacy concerns in medical data sharing and ensures compliance with regulations such as GDPR and HIPAA. By aligning data distributions across centers through federated FIPCA, we also reduce the risk of model inversion attacks that could reconstruct sensitive patient information from shared gradients or model parameters.

Although the results are promising, our experiments were conducted on a specific prostate cancer imaging dataset, and the generalizability of the method to other medical imaging tasks or datasets with different characteristics still requires validation.

Future work should focus on extending this approach to other types of medical imaging data, such as computed tomography (CT) scans or histopathological images, which may present additional challenges related to data dimensionality and heterogeneity. Exploring the applicability of the proposed method in these contexts could further demonstrate its versatility and robustness.

This study focused on evaluating the proposed adaptive framework within the widely adopted FedAvg baseline to demonstrate its generalization and efficiency benefits. While FedAvg remains a strong reference point in federated learning, more recent aggregation strategies such as FedProx, SCAFFOLD, and FedDyn have been proposed to better handle data heterogeneity and client drift. Expanding our evaluation to include comparisons against these methods would provide a broader validation of the proposed improvements. We plan to explore these comparisons as part of future work.

FIPCA effectively reduces data dimensionality and aligns distributions across centers in prostate MR imaging. However, its linear nature may discard subtle features crucial for detecting complex patterns, potentially limiting sensitivity to nuanced abnormalities. The effectiveness of this method should be validated across other dataset types and imaging modalities to ensure robust diagnostic accuracy. Incorporating nonlinear dimensionality reduction techniques, such as federated kernel PCA or federated autoencoders, might therefore capture more complex patterns and relationships in the data, potentially further improving model performance.

In our study, the independent test center (ZGT, 350 cases) operated in inference‐only mode; its images were projected onto the global PCA basis learned from the two training centers (RUMC, PCNN) without contributing local statistics. We recommend the same strategy for sites with very small cohorts (fewer than approximately 50 cases, where scatter‐matrix estimates become noisy), late joiners that arrive after the global basis is fixed (with updates introduced in planned releases), centers with persistent data‐quality issues, and institutions whose strict privacy policies or unreliable connectivity prevent even summary‐statistic exchange.

Although our experiments use voxel‐level features, the FIPCA step is mathematically agnostic to feature type. It operates solely on first‐ and second‐order statistics—namely, the mean vector and scatter matrix—computed from any numerical feature space that is consistent across clients. As such, radiomic descriptors, handcrafted texture features, or neural network embeddings could be used interchangeably without altering the algorithm, provided the feature dimensionality is uniform across sites. However, radiomic features may require additional harmonization steps to ensure consistency across institutions due to their sensitivity to variations in imaging protocol, segmentation, and preprocessing. Extending this framework to alternative feature representations represents a promising direction for future research.
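The reliance on first‐ and second‐order statistics can be sketched as follows. This is a simplified one‐shot illustration using NumPy on synthetic features, not the authors' implementation: it omits the incremental updates implied by the method's name, and the function names `local_summary`, `global_pca`, and `project` are hypothetical.

```python
import numpy as np


def local_summary(X):
    """Client side: sample count, mean vector, and scatter matrix
    computed about the local mean."""
    mu = X.mean(axis=0)
    centered = X - mu
    return X.shape[0], mu, centered.T @ centered


def global_pca(summaries, n_components):
    """Server side: pool the summaries and eigendecompose the
    resulting global covariance matrix."""
    n_total = sum(n for n, _, _ in summaries)
    mu = sum(n * m for n, m, _ in summaries) / n_total
    # Pooled scatter = within-client scatter + between-client mean offsets.
    scatter = sum(S + n * np.outer(m - mu, m - mu) for n, m, S in summaries)
    eigvals, eigvecs = np.linalg.eigh(scatter / (n_total - 1))
    order = np.argsort(eigvals)[::-1][:n_components]
    return mu, eigvecs[:, order], eigvals[order]


def project(X, mu, components):
    """Any site, including an inference-only one, projects its features
    onto the shared basis."""
    return (X - mu) @ components


# Synthetic feature matrices standing in for two training centers.
rng = np.random.default_rng(0)
site_a = rng.normal(0.0, 1.0, size=(80, 6))
site_b = rng.normal(0.5, 2.0, size=(35, 6))

mu, components, variances = global_pca(
    [local_summary(site_a), local_summary(site_b)], n_components=3
)
reduced_b = project(site_b, mu, components)
print(reduced_b.shape)
```

The between‐client term in the pooled scatter is what makes the result identical to PCA on the concatenated data, even though no raw samples leave a site. Any feature type works as long as every site supplies vectors of the same length.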

Finally, the number of principal components is a critical hyperparameter that affects both computational efficiency and the model's ability to capture essential data variance. While our study selected 10 principal components to balance these factors, adaptive methods for selecting the optimal number could further enhance performance. Techniques such as explained‐variance thresholds or cross‐validation could help automatically determine the most informative number of components for different datasets.
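An explained‐variance threshold of the kind mentioned above could be sketched as follows. The function name `components_for_variance`, the threshold values, and the eigenvalues are illustrative assumptions, not values from this study.

```python
def components_for_variance(eigenvalues, threshold=0.90):
    """Smallest k such that the k largest eigenvalues explain at least
    `threshold` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


# Illustrative eigenvalues of a global covariance matrix (not study data).
example_eigenvalues = [5.0, 3.0, 1.0, 0.5, 0.5]
print(components_for_variance(example_eigenvalues, threshold=0.75))
print(components_for_variance(example_eigenvalues, threshold=0.85))
```

Because retaining more variance did not improve AUC in our experiments, a threshold chosen this way would still need to be checked against downstream performance, for example through cross‐validation.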

5. CONCLUSION

The proposed method improves the standard FedAvg algorithm by applying FIPCA for dimensionality reduction, aggregating local means and scatter matrices to compute a global model while maintaining privacy. This approach reduces data heterogeneity, enhances computational efficiency, and preserves privacy by sharing only statistical summaries. Additionally, an adaptive early stopping mechanism based on aggregated client validation loss minimizes global rounds, speeding up convergence without sacrificing accuracy.

Experimental results on the PI‐CAI dataset show a 98% reduction in energy consumption, with global rounds reduced from 200 to 38 and AUC improved from 0.68 to 0.73 (95% CI 0.70 – 0.77). The method also improves robustness to data heterogeneity, making it suitable for resource‐constrained environments, where computational efficiency and resource optimization are crucial for large‐scale FL approaches.

CONFLICT OF INTEREST STATEMENT

The authors have no conflicts to disclose.

ACKNOWLEDGMENTS

This research was backed by the NSERC Discovery grant and the Ontario Graduate Scholarship (OGS).

Nanekaran NP, Ukwatta E. A novel federated learning framework for medical imaging: Resource‐efficient approach combining PCA with early stopping. Med Phys. 2025;52:e18064. 10.1002/mp.18064

DATA AVAILABILITY STATEMENT

The dataset used in this study is publicly available through the Prostate Imaging Cancer AI (PI‐CAI) challenge ⁴² and can be accessed online. The code for our FL framework is available on GitHub at https://github.com/npiran/federated_learning for reproducibility.

REFERENCES

1. Shilo S, Rossman H, Segal E. Axes of a revolution: challenges and promises of big data in healthcare. Nat Med. 2020;26:29‐38.
2. Peiffer‐Smadja N, Maatoug R, Lescure F‐X, D'ortenzio E, Pineau J, King J‐R. Machine learning for COVID‐19 needs global collaboration and data‐sharing. Nat Mach Intell. 2020;2:293‐294.
3. Dhruva SS, Ross JS, Akar JG, et al. Aggregating multiple real‐world data sources using a patient‐centered health‐data‐sharing platform. NPJ Digital Med. 2020;3:60.
4. Dou Q, So Tiffany Y, Jiang M, et al. Federated deep learning for detecting COVID‐19 lung abnormalities in CT: a privacy‐preserving multinational validation study. NPJ Digital Med. 2021;4:60.
5. McMahan B, Moore E, Ramage D, Hampson S, Arcas BA. Communication‐efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics. PMLR; 2017:1273‐1282.
6. Nilsson A, Smith S, Ulm G, Gustavsson E, Jirstrand M. A performance evaluation of federated learning algorithms. In: Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning, New York, USA, 1–8. Association for Computing Machinery; 2018.
7. Kairouz P, McMahan HB, Avent B, et al. Advances and open problems in federated learning. Found Trends Mach Lear. 2021;14:1‐210.
8. Babar M, Qureshi B, Koubaa A. Investigating the impact of data heterogeneity on the performance of federated learning algorithm using medical imaging. PLoS One. 2024;19:e0302539.
9. Qu L, Balachandar N, Rubin DL. An experimental study of data heterogeneity in federated learning methods for medical imaging. 2021. arXiv preprint arXiv:2105.05929.
10. Chen W, Bhardwaj K, Marculescu R. FedMax: Mitigating activation divergence for accurate and communication‐efficient federated learning. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Berlin, Germany, 563–579. Springer; 2020.
11. Bercea CI, Wiestler B, Rueckert D, Albarqouni S. Federated disentangled representation learning for unsupervised brain anomaly detection. Nat Mach Intell. 2022;4:685‐695.
12. Li T, Sahu AK, Talwalkar A, Smith V. Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag. 2020;37:50‐60.
13. Jiang M, Wang Z, Dou Q. Harmofl: Harmonizing local and global drifts in federated learning on heterogeneous medical images. In: Proceedings of the AAAI Conference on Artificial Intelligence 2022;36:1087‐1095.
14. Karimireddy SP, Kale S, Mohri M, Reddi SJ, Stich SU, Suresh AT. SCAFFOLD: stochastic Controlled Averaging for Federated Learning. Proc Mach Learn Res. 2020;119:5132‐5143.
15. Li Q, He B, Song D. Model‐contrastive federated learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 10713–10722. IEEE; 2021.
16. Acar D, Zhao Y, Navarro R, Mattina M, Whatmough P, Saligrama V. Federated learning based on dynamic regularization (FedDyn). In: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). 2021.
17. Mohri M, Sivek G, Suresh AT. Agnostic federated learning. In: International Conference on Machine Learning (ICML) 2019.
18. Arivazhagan MK, Aggarwal V, Singh AT, Choudhary S. Federated learning with personalization layers. In: NeurIPS Workshop on Federated Learning for Data Privacy and Confidentiality 2019.
19. Fallah A, Mokhtari A, Ozdaglar A. Personalized federated learning with theoretical guarantees: a model‐agnostic meta‐learning approach. In: Advances in Neural Information Processing Systems (NeurIPS) 2020.
20. Dinh CT, Tran NH, Nguyen T, Nguyen C, Dinh L, Nguyen DN. Personalized federated learning with moreau envelopes. In: Advances in Neural Information Processing Systems (NeurIPS) 2020.
21. Li T, Sahu AK, Talwalkar A, Smith V. Ditto: Fair and robust federated learning through personalization. In: International Conference on Machine Learning (ICML) 2021.
22. Collins L, Hassani HS, Rangwala H. Exploiting shared representations for personalized federated learning. In: International Conference on Machine Learning (ICML) 2021.
23. Ghosh AK, Chung J, Yin D, Ramchandran K. Efficient model aggregation in federated learning via probabilistic clustering. In: International Conference on Learning Representations (ICLR) 2020.
24. Marfoq O, Bellet A, Tommasi T. Federated learning under fairness constraints. In: Advances in Neural Information Processing Systems (NeurIPS) 2021.
25. Hsu T‐MH, Qi H, Brown Matthew. Measuring the effects of non‐identical data distribution for federated visual classification. arXiv preprint arXiv:1909.06335. 2019.
26. Reddi SJ, Charles Z, Zaheer M, Garrett Z, Rush K, Konečný J, et al. Adaptive federated optimization. In: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). 2021. arXiv:2003.00295.
27. Kades F, Maier‐Hein KH, Zimmerer D, Isensee F. Towards real‐world federated learning in medical image analysis using Kaapana. In: Bildverarbeitung für die Medizin 2022. Springer; 2022:132‐137.
28. Alekseenko V, Sun T, Bruell J, Zhang Y, Li H, Rueckert D. Distance‐Aware Non‐IID Federated Learning for Generalization and Personalization in Medical Imaging Segmentation. In: Proceedings of Machine Learning Research (MIDL 2024). 2024;250.
29. Darzi S, Chen X, Zhang X, Xia Y, Xu D. Tackling heterogeneity in medical federated learning via aligning vision transformers. 2024. arXiv preprint arXiv:2310.09444.
30. Yang L, Zhang H, Yu L, et al. Federated semi‐supervised learning for COVID region segmentation in chest CT. Med Image Anal. 2021;70:102240.
31. Zhang H, Wang S, Jiang F, Yin H, Huang Y, Xie Y. SplitAVG: a heterogeneity‐aware federated deep learning method for medical imaging. IEEE J Biomed Health Inf. 2022;26:4044–4055.
32. Li X, He Y, Song Y, Wang C, Sun Y, Liu T. FedBN: federated learning on non‐IID features via local batch normalization. In: Proceedings of ICLR 2021.
33. Adnan M, Kalra S, Lu Y, et al. Federated learning and differential privacy for medical image analysis. Sci Rep. 2022;12:1‐12.
34. Borazjani S, Haghshenas H, Alamdari A, Kalra S, Martel AL. Multi‐Modal Federated Learning for Cancer Staging over Non‐IID Datasets with Unbalanced Modalities. IEEE Trans Med Imaging. 2024;43(8):2491‐2502. doi: 10.1109/TMI.2024.3387639
35. Wen D, Jeon K‐J, Huang K. Federated dropout–a simple approach for enabling federated learning on resource constrained devices. IEEE Wireless Commun Lett. 2022;11:923‐927.
36. Wilkins G, Di S, Calhoun JC, et al. FedSZ: leveraging error‐bounded lossy compression for federated learning communications. In: 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS). IEEE; 2024:577‐588.
37. Dai M, Li Y, Li P, et al. A survey on integrated sensing, communication, and computing networks for smart oceans. J Sens Actuator Netw. 2022;11:70.
38. Nguyen DC, Ding M, Pathirana PN, Seneviratne A, Zomaya AY. Federated learning for COVID‐19 detection with generative adversarial networks in edge cloud computing. IEEE Internet Things J. 2021;9:10257‐10271.
39. Abadi M, Chu A, Goodfellow I, et al. Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS ’16). New York, NY: ACM; 2016:308‐318. doi: 10.1145/2976749.2978318
40. Linardos A, Tragardh E, et al. Federated learning for multi‐center imaging diagnostics: a simulation study in cardiovascular disease. Sci Rep. 2022;12:3551.
41. Tavakoli F, Wu C, Liu J, Ma C, Xu M. A comprehensive view of personalized federated learning on heterogeneous clinical datasets. 2023. arXiv preprint arXiv:2309.16825.
42. Saha A, Bosma J, Futterer JJ, et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI‐CAI): an international, paired, non‐inferiority, confirmatory study. Lancet Oncol. 2024;25(5):529‐539. doi: 10.1016/S1470-2045(24)00123-4
43. Moradi A, Zerka F, Bosma JS, et al. Federated learning for prostate cancer detection in biparametric MRI: optimization of rounds, epochs, and aggregation strategy. In: Medical Imaging 2024: Computer‐Aided Diagnosis. Vol. 12927. SPIE; 2024:412‐421.
44. Tedeschini BC, Savazzi S, Stoklasa R, et al. Decentralized federated learning for healthcare networks: a case study on tumor segmentation. IEEE Access. 2022;10:8693‐8708.
45. DeLong ER, DeLong DM, Clarke‐Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837‐845.
46. Suk H‐Il, Liu M, Yan P, Lian C. Privacy‐preserving federated brain tumour segmentation. In: International Workshop on Machine Learning in Medical Imaging. Springer; 2019:92‐100.
47. Yang Q, Zhang J, Hao W, Spell G, Carin L. FLOP: Federated learning on medical datasets using partial networks. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. ACM; 2021:3845‐3853.
48. Chang K, Balachandar N, Lam C, et al. Distributed deep learning networks among institutions for medical imaging. J Am Med Inform Assoc. 2018;25:945‐954.
49. Sheller MJ, Edwards B, Reina GA, et al. Federated learning in medicine: facilitating multi‐institutional collaborations without sharing patient data. Sci Rep. 2020;10:12598.
50. Wang S, Tuor T, Salonidis T, et al. Adaptive federated learning in resource‐constrained edge computing systems. IEEE J Sel Areas Commun. 2019;37:1205–1221.
51. Konečný J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D. Federated learning: strategies for improving communication efficiency. 2016. arXiv preprint arXiv:1610.05492.
52. Shokri R, Shmatikov V. Privacy‐preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM; 2015:1310‐1321.
53. Vepakomma P, Gupta O, Swedish T, Raskar R. Split learning for health: distributed deep learning without sharing raw patient data. 2018. arXiv preprint arXiv:1812.00564. (Presented at NeurIPS 2018 Workshop).
54. Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng. 2009;22:1345–1359.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

[mp18064-bib-0001] 1. Shilo S, Rossman H, Segal E. Axes of a revolution: challenges and promises of big data in healthcare. Nat Med. 2020;26:29‐38. [DOI] [PubMed] [Google Scholar]

[mp18064-bib-0002] 2. Peiffer‐Smadja N, Maatoug R, Lescure F‐X, D'ortenzio E, Pineau J, King J‐R. Machine learning for COVID‐19 needs global collaboration and data‐sharing. Nat Mach Intell. 2020;2:293‐294. [Google Scholar]

[mp18064-bib-0003] 3. Dhruva SS, Ross JS, Akar JG, et al. Aggregating multiple real‐world data sources using a patient‐centered health‐data‐sharing platform. NPJ digital medicine. 2020;3:60. [DOI] [PMC free article] [PubMed] [Google Scholar]

[mp18064-bib-0004] 4. Dou Q, So Tiffany Y, Jiang M, et al. Federated deep learning for detecting COVID‐19 lung abnormalities in CT: a privacy‐preserving multinational validation study. NPJ Digital Med. 2021;4:60. [DOI] [PMC free article] [PubMed] [Google Scholar]

[mp18064-bib-0005] 5. McMahan B, Moore E, Ramage D, Hampson S, Arcas BA. Communication‐efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics. PMLR; 2017:1273‐1282. [Google Scholar]

[mp18064-bib-0006] 6. Nilsson A, Smith S, Ulm G, Gustavsson E, Jirstrand M. A performance evaluation of federated learning algorithms. In: Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning , New York, USA, 1–8. Association for Computing Machinery; 2018. [Google Scholar]

[mp18064-bib-0007] 7. Kairouz P, McMahan HB, Avent B, et al. Advances and open problems in federated learning. Found Trends Mach Lear. 2021;14:1‐210. [Google Scholar]

[mp18064-bib-0008] 8. Babar M, Qureshi B, Koubaa A. Investigating the impact of data heterogeneity on the performance of federated learning algorithm using medical imaging. PLoS One. 2024;19:e0302539. [DOI] [PMC free article] [PubMed] [Google Scholar]

[mp18064-bib-0009] 9. Qu L, Balachandar N, Rubin DL. An experimental study of data heterogeneity in federated learning methods for medical imaging. 2021. arXiv preprint arXiv:2105.05929.

[mp18064-bib-0010] 10. Chen W, Bhardwaj K, Marculescu R. FedMax: Mitigating activation divergence for accurate and communication‐efficient federated learning. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Berlin, Germany, 563–579. Springer; 2020. [Google Scholar]

[mp18064-bib-0011] 11. Bercea CI, Wiestler B, Rueckert D, Albarqouni S. Federated disentangled representation learning for unsupervised brain anomaly detection. Nat Mach Intell. 2022;4:685‐695. [Google Scholar]

[mp18064-bib-0012] 12. Li T, Sahu AK, Talwalkar A, Smith V. Federated learning: challenges, methods, and future directions. IEEE Signal Process Mag. 2020;37:50‐60. [Google Scholar]

[mp18064-bib-0013] 13. Jiang M, Wang Z, Dou Q. Harmofl: Harmonizing local and global drifts in federated learning on heterogeneous medical images. In: Proceedings of the AAAI Conference on Artificial Intelligence 2022;36:1087‐1095. [Google Scholar]

[mp18064-bib-0014] 14. Karimireddy SP, Kale S, Mohri M, Reddi SJ, Stich SU, Suresh AT. SCAFFOLD: stochastic Controlled Averaging for Federated Learning. Proc Mach Learn Res. 2020;119:5132‐5143. [Google Scholar]

[mp18064-bib-0015] 15. Li Q, He B, Song D. Model‐contrastive federated learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, USA, 10713–10722. IEEE; 2021. [Google Scholar]

[mp18064-bib-0016] 16. Acar D, Zhao Y, Navarro R, Mattina M, Whatmough P, Saligrama V, Federated learning based on dynamic regularization (FedDyn). In: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021) . 2021.

[mp18064-bib-0017] 17. Mohri M, Sivek G, Suresh AT. Agnostic federated learning. In: International Conference on Machine Learning (ICML) 2019.

[mp18064-bib-0018] 18. Arivazhagan MK, Aggarwal V, Singh AT, Choudhary S. Federated learning with personalization layers. In: NeurIPS Workshop on Federated Learning for Data Privacy and Confidentiality 2019.

[mp18064-bib-0019] 19. Fallah A, Mokhtari A, Ozdaglar A. Personalized federated learning with theoretical guarantees: a model‐agnostic meta‐learning approach. In: Advances in Neural Information Processing Systems (NeurIPS) 2020.

[mp18064-bib-0020] 20. Dinh CT, Tran NH, Nguyen T, Nguyen C, Dinh L, Nguyen DN. Personalized federated learning with moreau envelopes. In: Advances in Neural Information Processing Systems (NeurIPS) 2020.

[mp18064-bib-0021] 21. Li T, Sahu AK, Talwalkar A, Smith V. Ditto: Fair and robust federated learning through personalization. In: International Conference on Machine Learning (ICML) 2021.

[mp18064-bib-0022] 22. Collins L, Hassani HS, Rangwala H. Exploiting shared representations for personalized federated learning. In: International Conference on Machine Learning (ICML) 2021.

[mp18064-bib-0023] 23. Ghosh AK, Chung J, Yin D, Ramchandran K. Efficient model aggregation in federated learning via probabilistic clustering. In: International Conference on Learning Representations (ICLR) 2020.

[mp18064-bib-0024] 24. Marfoq O, Bellet A, Tommasi T. Federated learning under fairness constraints. In: Advances in Neural Information Processing Systems (NeurIPS) 2021.

[mp18064-bib-0025] 25. Hsu T‐MH, Qi H, Brown M. Measuring the effects of non‐identical data distribution for federated visual classification. arXiv preprint arXiv:1909.06335. 2019.

[mp18064-bib-0026] 26. Reddi SJ, Charles Z, Zaheer M, Garrett Z, Rush K, Konečný J, et al. Adaptive federated optimization. In: Proceedings of the 9th International Conference on Learning Representations (ICLR 2021). 2021. arXiv:2003.00295.

[mp18064-bib-0027] 27. Kades F, Maier‐Hein KH, Zimmerer D, Isensee F. Towards real‐world federated learning in medical image analysis using Kaapana. In: Bildverarbeitung für die Medizin 2022. Springer; 2022:132‐137.

[mp18064-bib-0028] 28. Alekseenko V, Sun T, Bruell J, Zhang Y, Li H, Rueckert D. Distance‐aware non‐IID federated learning for generalization and personalization in medical imaging segmentation. In: Proceedings of Machine Learning Research (MIDL 2024). 2024;250.

[mp18064-bib-0029] 29. Darzi S, Chen X, Zhang X, Xia Y, Xu D. Tackling heterogeneity in medical federated learning via aligning vision transformers. 2024. arXiv preprint arXiv:2310.09444.

[mp18064-bib-0030] 30. Yang L, Zhang H, Yu L, et al. Federated semi‐supervised learning for COVID region segmentation in chest CT. Med Image Anal. 2021;70:102240.

[mp18064-bib-0031] 31. Zhang H, Wang S, Jiang F, Yin H, Huang Y, Xie Y. SplitAVG: a heterogeneity‐aware federated deep learning method for medical imaging. IEEE J Biomed Health Inf. 2022;26:4044‐4055.

[mp18064-bib-0032] 32. Li X, He Y, Song Y, Wang C, Sun Y, Liu T. FedBN: federated learning on non‐IID features via local batch normalization. In: Proceedings of ICLR 2021.

[mp18064-bib-0033] 33. Adnan M, Kalra S, Lu Y, et al. Federated learning and differential privacy for medical image analysis. Sci Rep. 2022;12:1‐12.

[mp18064-bib-0034] 34. Borazjani S, Haghshenas H, Alamdari A, Kalra S, Martel AL. Multi‐modal federated learning for cancer staging over non‐IID datasets with unbalanced modalities. IEEE Trans Med Imaging. 2024;43(8):2491‐2502. doi: 10.1109/TMI.2024.3387639

[mp18064-bib-0035] 35. Wen D, Jeon K‐J, Huang K. Federated dropout–a simple approach for enabling federated learning on resource constrained devices. IEEE Wireless Commun Lett. 2022;11:923‐927.

[mp18064-bib-0036] 36. Wilkins G, Di S, Calhoun JC, et al. FedSZ: leveraging error‐bounded lossy compression for federated learning communications. In: 2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS). IEEE; 2024:577‐588.

[mp18064-bib-0037] 37. Dai M, Li Y, Li P, et al. A survey on integrated sensing, communication, and computing networks for smart oceans. J Sens Actuator Netw. 2022;11:70.

[mp18064-bib-0038] 38. Nguyen DC, Ding M, Pathirana PN, Seneviratne A, Zomaya AY. Federated learning for COVID‐19 detection with generative adversarial networks in edge cloud computing. IEEE Internet Things J. 2021;9:10257‐10271.

[mp18064-bib-0039] 39. Abadi M, Chu A, Goodfellow I, et al. Deep learning with differential privacy. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (CCS ’16). New York, NY: ACM; 2016:308‐318. doi: 10.1145/2976749.2978318

[mp18064-bib-0040] 40. Linardos A, Tragardh E, et al. Federated learning for multi‐center imaging diagnostics: a simulation study in cardiovascular disease. Sci Rep. 2022;12:3551.

[mp18064-bib-0041] 41. Tavakoli F, Wu C, Liu J, Ma C, Xu M. A comprehensive view of personalized federated learning on heterogeneous clinical datasets. 2023. arXiv preprint arXiv:2309.16825.

[mp18064-bib-0042] 42. Saha A, Bosma J, Futterer JJ, et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI‐CAI): an international, paired, non‐inferiority, confirmatory study. Lancet Oncol. 2024;25(5):529‐539. doi: 10.1016/S1470-2045(24)00123-4

[mp18064-bib-0043] 43. Moradi A, Zerka F, Bosma JS, et al. Federated learning for prostate cancer detection in biparametric MRI: optimization of rounds, epochs, and aggregation strategy. In: Medical Imaging 2024: Computer‐Aided Diagnosis. Vol. 12927. SPIE; 2024:412‐421.

[mp18064-bib-0044] 44. Tedeschini BC, Savazzi S, Stoklasa R, et al. Decentralized federated learning for healthcare networks: a case study on tumor segmentation. IEEE Access. 2022;10:8693‐8708.

[mp18064-bib-0045] 45. DeLong ER, DeLong DM, Clarke‐Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837‐845.

[mp18064-bib-0046] 46. Suk H‐Il, Liu M, Yan P, Lian C. Privacy‐preserving federated brain tumour segmentation. In: International Workshop on Machine Learning in Medical Imaging. Springer; 2019:92‐100.

[mp18064-bib-0047] 47. Yang Q, Zhang J, Hao W, Spell G, Carin L. FLOP: federated learning on medical datasets using partial networks. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. ACM; 2021:3845‐3853.

[mp18064-bib-0048] 48. Chang K, Balachandar N, Lam C, et al. Distributed deep learning networks among institutions for medical imaging. J Am Med Inform Assoc. 2018;25:945‐954.

[mp18064-bib-0049] 49. Sheller MJ, Edwards B, Reina GA, et al. Federated learning in medicine: facilitating multi‐institutional collaborations without sharing patient data. Sci Rep. 2020;10:12598.

[mp18064-bib-0050] 50. Wang S, Tuor T, Salonidis T, et al. Adaptive federated learning in resource‐constrained edge computing systems. IEEE J Sel Areas Commun. 2019;37:1205‐1221.

[mp18064-bib-0051] 51. Konečný J, McMahan HB, Yu FX, Richtárik P, Suresh AT, Bacon D. Federated learning: strategies for improving communication efficiency. 2016. arXiv preprint arXiv:1610.05492.

[mp18064-bib-0052] 52. Shokri R, Shmatikov V. Privacy‐preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (CCS). ACM; 2015:1310‐1321.

[mp18064-bib-0053] 53. Vepakomma P, Gupta O, Swedish T, Raskar R. Split learning for health: distributed deep learning without sharing raw patient data. 2018. arXiv preprint arXiv:1812.00564. (Presented at NeurIPS 2018 Workshop).

[mp18064-bib-0054] 54. Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowl Data Eng. 2009;22:1345‐1359.
