Digital Health
. 2025 Sep 11;11:20552076251358531. doi: 10.1177/20552076251358531

Federated learning and differential privacy: Machine learning and deep learning for biomedical image data classification

Sobia Wassan 1, Liudajun 1, Han Ying 1, Hu Dongyan 1, Pan Fei 1
PMCID: PMC12426403  PMID: 40949674

Abstract

Background

The integration of differential privacy and federated learning in healthcare is key for maintaining patient confidentiality while ensuring accurate predictive modeling. With increasing concerns about privacy, it is essential to explore methods that protect data privacy without compromising model performance.

Objective

This study evaluates the effectiveness of feedforward neural networks (FNNs), Gaussian processes (GPs), and a subset of deep learning neural networks (MLP) in classifying biomedical image data, incorporating federated learning to enhance privacy preservation.

Method

We implemented FNN, GP, and MLP models using federated learning and differential privacy techniques. Models were evaluated based on training and validation accuracy, correlation coefficients, mean absolute error (MAE), root mean squared error (RMSE), and relative errors, including relative absolute error (RAE) and relative root squared error (RRSE).

Results

The FNN achieved 86.49% training accuracy and 82.08% overall accuracy but showed potential overfitting, with 68.75% validation accuracy. The GP model had a correlation coefficient of 0.9741, an MAE of 108.38, and an RMSE of 173.49. The MLP outperformed the other models with a correlation coefficient of 0.9980, an MAE of 36.80, and an RMSE of 51.01. Federated learning improved privacy while maintaining model performance.

Conclusion

Federated learning with differential privacy offers a promising solution for secure and accurate biomedical image classification, supporting privacy-preserving machine learning in medical diagnostics without compromising performance.

Keywords: Federated learning, machine learning, healthcare data privacy, cryptographic techniques, deep learning, biomedical image classification

Introduction

The healthcare industry is witnessing a revolution driven by advances in machine learning, as the technology now aids in predicting diseases, performing diagnostic operations, and customizing treatments. However, the sensitivity of patients’ information raises serious privacy risks. Federated learning and differential privacy are two practical approaches that can help protect information privacy without compromising a model's predictive capabilities. The use of machine learning, combined with the expansion of big data and the integration of globalized systems, has enabled previously unseen collaborative training capacities for machine learning models between organizations and countries.

However, healthcare organizations face data privacy restrictions as the primary barrier to implementing such collaborations, which therefore remain challenging.1–3 Privacy-preserving technologies such as federated learning, blockchain, and generative adversarial networks (GANs) are receiving increasing attention as ways to address these challenges. Google introduced federated learning (FL) in 2016 as a distributed machine-learning platform that protects user privacy through multi-party collaboration. In medical applications, federated learning helps maintain the confidentiality of information through decentralized training: data is trained at a localized level without exchanging its actual contents, and only model updates are conveyed. 4 It has since evolved into a powerful distributed machine-learning environment that preserves confidentiality while enabling multi-peer cooperation, and it has been increasingly utilized in medicine as a more secure alternative to traditional centralized training, since models are trained on localized data sources while conveying only model updates, such as gradients, rather than raw information. 5 Forest image data, for instance, has been processed with CNNs, 6 and banking institutions recommend various machine learning and deep learning techniques for analyzing data in secure banking operations. 7 Beyond genomics for personalized medicine, 8 federated learning has also shown promise in the internet of medical things (IoMT) for real-time health monitoring. 9 With its resilience and adaptability across healthcare applications, federated learning proves to be a versatile tool for medical innovation and the protection of patient privacy.
By addressing the data privacy concerns associated with collaborative learning approaches, the incorporation of federated learning opens the door to global, fair, and secure healthcare solutions. First, global model parameters are distributed from a centralized server within the federated learning system to chosen participating sites. Local data is then used to train each site's local model, which has the same framework as the global model. The sites conduct the training using the modified model parameters, not the raw data, and send them back to the central server. The restriction of traditional centralized learning, which requires raw data to be sent to a central organization for processing, is addressed by this configuration, ensuring that sensitive data remains on the local device or site. Machine learning techniques in the federated learning model are shown in Figure 1.

Figure 1.

Machine learning techniques in the federated learning model.

The central server performs an aggregation process that combines the local modifications sent from participating sites to improve the weights of the global model. Another set of participating organizations then obtains this most recent model and commences the next round of local training. This iterative procedure continues until the global model converges, achieving optimal performance. The core advantage of federated learning is that it enables better model performance through the collaborative processing of extensive, multi-jurisdictional data, as the limitations of individual data sources are mitigated. The approach is implemented with strong privacy measures in place, as raw data remains at its source location.
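The distribute-train-aggregate round described above can be sketched in a few lines. This is a minimal, illustrative FedAvg round, not the study's implementation; the linear model, learning rate, and toy client data are assumptions for demonstration only.

```python
# Minimal FedAvg sketch. Each client trains locally on its own data and
# returns updated weights plus its sample count; the server computes a
# sample-weighted average to update the global model. Raw data never leaves
# the client: only weights are exchanged.

def local_update(global_weights, client_data, lr=0.1):
    """One local pass of gradient steps on a linear model (squared loss)."""
    w = list(global_weights)
    for x, y in client_data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, len(client_data)

def fedavg(updates):
    """Server aggregation: weighted mean of client weights by sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]

# One communication round over two simulated clients
clients = [
    [((1.0, 0.0), 2.0), ((0.0, 1.0), 3.0)],  # client A's local data
    [((1.0, 1.0), 5.0)],                     # client B's local data
]
global_w = [0.0, 0.0]
updates = [local_update(global_w, c) for c in clients]
global_w = fedavg(updates)  # new global weights after aggregation
```

In a real deployment the same loop repeats for many rounds, with a fresh subset of sites selected each time, until the global model converges.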

The healthcare domain values federated learning because of its ability to preserve sensitive patient information during operations. The framework demonstrates excellent benefits when researchers need to work in specialized fields whose data resources are limited due to regulatory or ethical constraints. Through federated learning, doctors can establish research ties among medical facilities, scientific institutions, and national research bodies, thereby advancing fields such as rare disease analysis, customized treatment development, and algorithmic prediction capabilities.

Despite its substantial benefits, federated learning has not yet become standard clinical practice. Its complete incorporation into clinical applications faces resistance due to issues related to data distribution heterogeneity, communication performance limitations, and the need for reliable privacy protection methods. The successful clinical application of federated learning depends on ongoing efforts to address these implementation hurdles, enabling the healthcare and research fields to benefit from the technology. Federated learning stands as a promising approach to uniting data privacy solutions with collaborative innovation, as it creates a bridge toward better healthcare equity and security, along with effective solutions. 10 This study aims to provide strategies for IT specialists and e-government services to enhance privacy and security measures continuously. 11

The privacy-preservation technique of federated learning enables different healthcare organizations to train their models collaboratively while shielding patient data from disclosure, thereby building enhanced diagnostic models for dental research and healthcare facilities.12,13 The decentralized system trains models independently at local sites before combining them into a collective model, allowing clients to maintain privacy and reduce communication costs for multi-institution projects.14,15 Federated learning protects patient privacy during model training across distributed medical databases for rare disease studies, just as it succeeded in protecting data confidentiality among divergent Alzheimer's disease diagnostic centers.16,17 Elliptic curve cryptography (ECC) teams up with Serpent encryption to secure data access by protecting sensitive information during federated learning operations, particularly in healthcare domains and other protected sites. 18 Similarly, federated learning ensures the secure processing of energy data for non-intrusive load monitoring in the Power IoT sector, providing complete user privacy protection and enabling precise evaluation of energy consumption without compromising data integrity. 19

Additionally, the optimization of neural network training methods, including the exploration of activation functions and hidden layers, has led to federated learning models that provide more efficient and accurate solutions in various fields. 20 Federated learning provides an effective framework that many industries can utilize for secure, collaborative AI model training while upholding privacy protection and delivering operational benefits. The research topic of this study is the privacy-preserving classification of biomedical data by different machine learning systems. Combining federated learning with differential privacy, we evaluate feedforward neural networks (FNNs) and Gaussian processes (GPs), alongside deep learning neural networks, for disease class determination through performance assessment.

Related studies

Data augmentation

The training dataset can be made artificially larger through data augmentation. Small image adjustments, such as scaling, rotating, flipping, and applying various filters, are examples of traditional image processing techniques. 21 These strategies may fail to account for fluctuations in dataset distributions; although they help maintain the original data distribution, they may not improve model generalization. Advanced approaches, such as GANs, can create new data while preserving the underlying distribution, thus enhancing model performance across several datasets.22–24 Two such techniques are examined in the literature review: conditional GANs 25 and dual GANs.26–29 Advanced solutions such as AI-driven calibration, multilayer sensing, and enhanced light source technologies are overcoming current limitations in optical glucose sensing, paving the way for noninvasive, real-time, and personalized diabetes management. 30 Evolutionary algorithms offer another avenue. One study presents a CIDN for cardiac vascular segmentation, incorporating BAB and MIB modules to enhance the acquisition of spatial and edge information. 31 The client's model is updated with both synthesized and original datasets, allowing FL clients to operate more consistently and broadly. However, aligning client models with the server model imposes a significant computational burden, requiring substantial resources and coordination among multiple users, which can hinder the synthesis of high-quality images.

Generative adversarial networks

In 2014, Ian Goodfellow presented GANs to the world as a class of machine learning models. A GAN consists of two neural networks, a generator and a discriminator, which are trained together in competition. The generator produces data from random noise, and the discriminator checks whether its input is original. The discriminator aims to enhance its ability to distinguish between real and fake data, while the generator attempts to fool the discriminator's decision-making. Adversarial training enhances each network while enabling the generator to produce highly realistic-looking data. GAN models are highly successful in creating visually appealing images, videos, and text, as demonstrated by applications including image processing, data augmentation, and design replication. One GAN variant, the conditional generative adversarial network (CGAN), enables the delivery of high-frequency textural information directly useful for medical imaging. The CGAN achieves adversarial learning through two primary components: a generator and a discriminator network. The generator creates synthetic target-contrast pictures from the source-contrast images it receives, while the discriminator distinguishes them. The CGAN minimizes an adversarial and pixel-wise loss function to learn image translation.32,33 In healthcare, GANs can be employed to generate synthetic medical data while preserving privacy and creating new datasets for training other machine learning models. This can help overcome limited access to medical data due to privacy concerns without compromising the quality of model training. The strength of GANs lies in their ability to generate high-quality, realistic data that closely mimics real-world distributions, making them valuable in many fields, including medical imaging, drug discovery, and personalized medicine.
However, challenges such as mode collapse, training instability, and evaluation of generated outputs remain areas of active research. GANs have also been examined in cybersecurity, where they play roles in threat detection, malware identification, behavioral analysis, user authentication, adversarial attack mitigation, and ethical considerations, ultimately highlighting their potential to enhance security and privacy in digital networks. 34

Using a CGAN, specificity-preserving federated learning (SPFL-Trans) was developed to overcome some of the difficulties in federated learning. The term “specificity” refers to features or information unique to a specific client's dataset, such as disease prevalence, processing power, dataset size, and quality. With PatchGAN's assistance, 35 the SPFL-Trans model operates within an adversarial framework that dynamically adapts and defines feature maps in the generator using latent variables specific to each client's dataset. These latent variables are not explicitly given but are inferred by the model. SPFL-Trans consists of nine residual blocks and a specialized latent parameter space that includes six dense layers for generating these latent variables. The normalization layer utilizes both image-specific and client-specific latent variables to compute learnable scale and bias vectors, which adjust the data's mean and standard deviation.

Additionally, SPFL-Trans integrates first- and second-order statistical indicators of the data distribution to further refine the output. 36 Evolutionary algorithms often struggle to scale for complex or real-time tasks as objectives and population size grow. 37 Previously, we proposed IMCFN, a CNN-based malware detection method that converts raw binaries into color images for improved classification.38,39

Graph neural networks (GNNs) are well suited to relational datasets because their structure captures patterns across linked records.40,41 One approach trains a GNN together with a server model 42 on clients to identify differences in the distributions of datasets. A distinct GNN is constructed and optimized for each client to capture client-specific prevalence and co-occurrence patterns of medical conditions, even though the server model's weights are learned and shared across all clients. However, additional research is needed to determine how this approach handles different class distributions. The method used the CheXpert dataset to evaluate its performance. 43 Model extraction in federated learning is a data-private collaborative approach in which participating models utilize the available data to generate knowledge from aggregated prediction scores. The federated conditional mutual learning (FedCM) architecture, proposed by Huang et al., 44 employs extraction techniques to adapt models for client-specific datasets. FedCM utilizes 3D-CNN and Visual Geometry Group (VGG) networks 45 as backbones and enables clients to share public subsets of their data within a federated network, allowing the server model to access and learn from aggregated knowledge across diverse datasets.

Feedforward neural networks with federated learning

FNNs play a crucial role in healthcare by enabling tasks like disease prediction, medical image analysis, and patient risk assessment. However, conventional machine learning techniques often require centralized data storage, which raises privacy and security concerns. To address this, federated learning enhances FNNs by allowing multiple hospitals to collaboratively train models without sharing sensitive patient data. Instead of transferring raw data, FL ensures that only model updates are shared, thereby improving privacy protection and generalization across diverse datasets. To further enhance robustness and improve performance in real-world scenarios, techniques such as differential privacy, secure aggregation, personalized federated learning, regularization, adaptive optimization (e.g., Adam, RMSprop), and anomaly detection are employed. These techniques help prevent overfitting, secure patient information, and stabilize the learning process, making FL with FNNs highly effective in healthcare applications. Federated learning faces a crucial problem in handling non-IID data, which can deteriorate model performance. Research has progressed to federated fuzzy neural networks coupled with evolutionary rule learning (ERL) as a response to this problem. ERL borrows from biological evolution to actively select superior rules while eliminating inferior ones, thereby enhancing adaptability to data uncertainties and non-IID distributions. The solution demonstrates superior performance compared to established distributed fuzzy neural networks, making it a strategic medical AI application for decentralized networks. 46

Our FNN with FL is designed for privacy-preserving and robust healthcare applications, ensuring high accuracy while protecting patient data. The model consists of three hidden, fully connected (dense) layers with Leaky ReLU (alpha = 0.1) activation, which prevents the vanishing gradient problem and ensures stable learning. Each layer is regularized with L2 (λ = 0.01) to prevent overfitting, while batch normalization enhances training stability and Dropout (0.4) improves generalization by reducing dependency on specific neurons. The final softmax output layer provides the class probability distribution for multi-class classification tasks. With the inclusion of FL, our model benefits from distributed training across various healthcare institutions while preserving data confidentiality. Combining powerful activation functions, regularization, and FL integration, the model is highly generalizable, robust, and suitable for real-world medical AI applications. This enables better early detection of diseases, personalized treatments, and enhanced patient care, making it a potent, privacy-enhancing AI for the healthcare sector.
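As a rough illustration of the forward pass just described, the sketch below runs a toy input through a dense layer with Leaky ReLU (alpha = 0.1) and a softmax head. The layer sizes and weights are invented for the example and are not the trained model; batch normalization, dropout, and L2 regularization are omitted for brevity.

```python
# Toy forward pass: dense layer -> Leaky ReLU (alpha = 0.1) -> dense -> softmax.
# Weights here are arbitrary illustrative values, not learned parameters.

import math

def leaky_relu(x, alpha=0.1):
    # Negative inputs keep a small slope, which avoids dead units and
    # mitigates the vanishing gradient problem mentioned in the text.
    return x if x > 0 else alpha * x

def softmax(logits):
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def dense(inputs, weights, biases, activation=None):
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

# A 2-feature input through one hidden layer into a 3-class softmax head
x = [0.5, -1.2]
h = dense(x, [[1.0, 0.2], [-0.4, 0.9]], [0.1, 0.0], leaky_relu)
probs = dense(h, [[0.3, -0.1], [0.8, 0.5], [-0.2, 0.4]], [0.0, 0.0, 0.0])
probs = softmax(probs)  # class probabilities, summing to 1
```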

Gaussian processes with federated learning

The probabilistic modeling framework of GPs is beneficial for quantifying uncertainty, in addition to tasks such as regression and classification; as such, it finds common use in medical applications. GPs coupled with FL enable hospitals in different locations to learn from each other while the safety and security of their information remain intact. The high-dimensional medical information in the Breast Tissue Impedance Measurements (BTIM) dataset is well suited to evaluating healthcare models. Standard GPs are difficult to scale because exact inference is computationally expensive, and institutions find it hard to share their data in publicly acceptable ways. In FL, machine learning models achieve good performance on non-identically distributed (non-IID) data without sharing any individual's highly identifiable information. The system draws its power from three approaches: sparse Gaussian processes (SGPs), variational inference (VI), and kernel adaptation. These approaches minimize computational requirements, enabling healthcare models to perform distributed tasks effectively in medical contexts.

The combination of GP and FL on BTIM data provides improved uncertainty assessment, dependable predictions, and privacy safeguards during learning, which establishes a robust AI system for medical individualization and diagnostic evaluation. FedG-WC presents a new clustered FL approach that enhances model resilience through data distribution-based client grouping, operated by Gaussian rewards in combination with a Wasserstein-adjusted score for improved cluster quality and accuracy. 47 Another study proposes an optimized image recognition technique that utilizes convolutional neural networks with differential privacy, incorporating Gaussian noise, gradient layer clipping, and the analytical Gaussian mechanism, achieving high accuracy (96.46% for MNIST and 61.10% for CIFAR-10). 48
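The differential privacy ingredients named in that study, per-example gradient clipping and Gaussian noise, can be sketched as follows. This is a generic illustration under assumed parameter values (clip norm, sigma), not the cited paper's code, and it omits the privacy accounting needed to choose sigma for a target (epsilon, delta).

```python
# Hedged sketch of DP gradient sanitization: clip each per-example gradient
# to an L2 bound C, then add Gaussian noise scaled to sigma * C. The clip
# bound fixes the sensitivity of the summed gradients, which is what makes
# the Gaussian mechanism applicable.

import math
import random

def clip_gradient(grad, clip_norm):
    """Scale grad down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    return [g * scale for g in grad]

def gaussian_mechanism(grads, clip_norm=1.0, sigma=1.0, rng=None):
    """Sum clipped per-example gradients and add N(0, (sigma*C)^2) noise."""
    rng = rng or random.Random(0)
    dim = len(grads[0])
    clipped = [clip_gradient(g, clip_norm) for g in grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    return [s + rng.gauss(0.0, sigma * clip_norm) for s in summed]

per_example_grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
# With sigma = 0 the output is just the clipped sum, which makes the
# clipping effect visible: [0.6 + 0.3, 0.8 + 0.4]
noisy = gaussian_mechanism(per_example_grads, clip_norm=1.0, sigma=0.0)
```

In practice sigma is strictly positive and is chosen by a privacy accountant so that the cumulative privacy loss over all training rounds stays within budget.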

Our GPs with FL model is perfectly suited for the BTIM dataset due to its ability to handle high-dimensional, complex biomedical time-series and imaging data while ensuring privacy, scalability, and reliability. Our model supports various healthcare institutions in building a predictive GP model through federated learning mechanisms that maintain patient information privacy. A linear kernel that calculates dot products between x and y (K(x,y) = <x,y>) supports efficient training combined with interpretability, making it suitable for health diagnostics and personalized medical practices. The process of data normalization leads to improved model performance and stability, resulting in more precise predictions across various medical datasets. Strong consistency, along with numerical stability, is evident in the inverted covariance matrix values, which confirm the robust operation of our model. Our approach utilizes SGP and VI methods to minimize the processing demands of GP models, offering scalable capabilities for medical systems that require distributed applications. The model achieves an accurate uncertainty margin and precise predictions by using an average target value of 0.2474 to detect primary patterns in the BTIM dataset. Our model, which combines GPs with FL, enables non-IID data generalization while providing privacy-aware, high-accuracy healthcare analytics, making it the most suitable AI-powered solution for superior medical diagnostics and clinical decision support, as depicted in Figure 2.

Figure 2.

Multilayer perceptron with federated learning.

Multilayer perceptron with federated learning

A multilayer perceptron with federated learning provides institutions with an efficient method for developing distributed machine-learning models that operate across distinct, decentralized locations without compromising data confidentiality. The feedforward MLP can effectively manage complex patterns of data when used for classification, as well as for time-series forecasting and regression prediction. Federated learning enables institutions or organizations to train collaborative models together while preserving privacy standards by keeping sensitive data unshared; thus, it proves essential in healthcare applications. FL distributes model updates, consisting of weights and gradients, between clients, enabling organizations to develop unique, customized models that understand specific client information while preserving their generalization capabilities. While training the MLP model, information flows forward from weighted inputs, is processed by ReLU or sigmoid activation functions to produce model outputs, and the weights are then adjusted through backpropagation. This technique can be applied to multiple clients in parallel, enabling wide-scale use while maintaining system resilience and preventing data homogenization. Hospitals can use it to make disease predictions while processing medical images and assessing patient risks, all while adhering to confidentiality guidelines. By coupling regularized parameters with adaptive optimization techniques and anomaly detection, the MLP can perform optimally in real-world applications, handling sparse information while also accounting for anomalous signals and sample noise. Through the combination of FedSec with federated learning, the system both extends accuracy rates and enhances privacy standards, thereby enabling better early disease diagnosis and saving patient lives. 49
Additionally, our MLP model, which analyses the BTIM dataset, employs linear nodes to compute weighted attribute sums and multiple sigmoid nodes for tissue type classification through successive layers, resulting in improved precision of correct classifications based on established attribute-class relationships. The integration of MLP and FL delivers an effective, privacy-enhanced, and scalable AI solution for industries dealing with sensitive data, such as healthcare and finance. Literature on privacy attacks in the healthcare domain is shown in Table 1.
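The forward-then-backpropagate flow described above can be illustrated with a single sigmoid unit trained by gradient descent on a squared-error loss. The weights, input, and learning rate are invented for the example; a real MLP stacks many such units and propagates deltas layer by layer.

```python
# One sigmoid unit: weighted attribute sum -> sigmoid activation -> gradient
# update. Repeating the step drives the output toward the target label.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr=0.5):
    """Forward pass, then one backpropagation update for a sigmoid unit."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted attribute sum
    out = sigmoid(z)                                # sigmoid activation
    # Squared-error loss: dL/dz = (out - target) * sigmoid'(z),
    # with sigmoid'(z) = out * (1 - out)
    delta = (out - target) * out * (1.0 - out)
    w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
    b = b - lr * delta
    return w, b, out

w, b = [0.2, -0.1], 0.0
out = sigmoid(0.2 * 1.0 + (-0.1) * 2.0)   # initial output: sigmoid(0) = 0.5
for _ in range(50):
    w, b, out = train_step(w, b, [1.0, 2.0], 1.0)  # push output toward 1.0
```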

Table 1.

Presents literature related to privacy attacks in the healthcare domain.

Reference Attack type Attack method Performance Datasets Target zone
50 2021 Data RI The individual was identified using a multimodal Siamese neural network that sequentially captured spatial and temporal information. The approach can re-identify an individual with 65% accuracy across all datasets. Datasets for older adults, healthy adults, gamers, and restaurants. Re-identification of individuals in publicly accessible health data sets.
51 2018 Data RI In certain contexts, a probability-based re-identification framework is employed to assess the risk of violations, utilizing explicit assumptions. The disclosure of demographic characteristics was more common than that of medical condition data. The most frequently reported condition in social media data was pneumonia, whereas anemia was the least common. De-identified HCUP National (NIS) medical research dataset. Sharing demographic and medical information on social media.
52 2021 Model AD The RL framework employs hierarchical position selection to determine attacked positions and utilizes a score-based technique to refine decision-making. The most popular attack method received 3.08%, 2.20%, and 1.74% more votes than the others. Dementia, kidney disorders, and heart failure. Electronic medical system.
53 2020 Model AD Machine learning (ML)–based adversarial attacks with limited knowledge of data distribution models can alter patient status in healthcare systems. Whitebox and BlackBox attacks: 32.27% highest accuracy drop using HopSkipJump; 15.68% attack success rate. Social media health data. EHR
54 2020 Model AD Adversarial attacks on COVID-19 detection models using DL approaches. DL models without protection compared to adversarial perturbations remain susceptible to attacks. COVID-19 data. Medical field.
55 2022 Model MI analyzing the risk of exposure when releasing health data publicly. When compared to fully synthetic data, partial synthetic data is far more susceptible. Drawn from a variety of health data sources. EHR.
56 2021 Model MI A realistic inference attack on a 3D brain imaging-trained DL model. Tested if model training employed an MRI scan with a 60% to 80% success rate. MRI images. 3D neuroimaging.
57 2022 Model MI DL model trained on 3D brain imaging. Create a model HIOT Data sources for model evaluation (e.g., real-world medical records and open-source datasets like MIMIC-III). EHR
58 2021 Model MI To test empirical privacy leaks, use a membership inference framework. Membership inference attacks on CLMs reveal 7% of complex privacy issues. MIMIC III, UMM, VHA. (CLMs).
59 2019 Model MI MIMIC model behavior comparison with a public model to identify prediction discrepancies. Using XGBoost machine learning models deployed on cloud platforms, the inference accuracy and precision averaged 73% and 84%, respectively, with peak values reaching up to 91%. Weibo dataset. Health data.
60 2022 Model MI Framework for inversion attacks based on gradient-based models. Performs better both quantitatively and qualitatively than current gradient-based methods. BraTS dataset. Brain tumor segmentation.
Proposed (this study) FNN, GPs, and MLP Integration of differential privacy into the MLP for secure classification of biomedical data Achieved a 0.9980 correlation coefficient, 51.01 RMSE, and 36.80 MAE while preserving privacy Breast Tissue Impedance Measurements (BTIM) dataset Biomedical data classification (breast tissue)

Remedies and advancements

The literature review identifies key limitations in existing privacy-preserving machine learning approaches for healthcare. Many prior models, including those using FL or differential privacy (DP), struggle to balance privacy protection against model accuracy. Common issues include vulnerability to re-identification (RI), adversarial attacks (AD), and membership inference (MI), particularly in high-dimensional biomedical datasets. Additionally, earlier models often rely on centralized architectures, which necessitate data sharing and risk compromising patient confidentiality. Some also lack the deep learning capabilities needed to capture complex, nonlinear patterns in data, which limits their predictive performance. We address these problems by incorporating FL and DP into advanced models, namely MLP, GP, and FNN, using the BTIM dataset. Our system maintains high model accuracy because performance is not unduly affected by the noise added for data security. Its multilayer design allows it to represent connections in data that other architectures cannot easily capture. Compared to alternative methods, our technique demonstrates that excellent model results and high privacy can be achieved simultaneously. The advancement offers a suitable and secure alternative for actual clinical use, supports accurate diagnosis, and avoids leaking patients' details, a marked improvement over similar past studies.

Main contribution

The key contribution of this research is the integration of differential privacy techniques into machine learning models for biomedical data classification, specifically focusing on the BTIM dataset. Three state-of-the-art machine learning models, including FNNs, GPs, and multilayer perceptrons (MLPs), have been evaluated in this research to demonstrate the effective application of these models for protecting healthcare data privacy. We developed this work because healthcare organizations require more effective methods to protect patient confidentiality in medical applications while maintaining accurate predictive modeling results. Healthcare institutions require robust, extended machine learning approaches that ensure data privacy protection during medical diagnostic applications, given the growing adoption of artificial intelligence in healthcare.

Findings

Our proposed MLP model achieves superior results, ranking as the most accurate and reliable model for breast tissue impedance classification, with a correlation coefficient of 0.9980, a root mean squared error of 51.01, and a mean absolute error of 36.80. The multilayer approach of the MLP yields better performance outcomes due to its complex nonlinear capabilities and deep architecture structure. The implementation of differential privacy methods allows the model to deliver strong performance results while maintaining patient confidentiality. Neither privacy protection nor predictive output accuracy suffers from our approach, which establishes itself as an optimal solution for real healthcare applications requiring extensive security and accurate diagnosis.

Data collection and methodology

In this study, we used the BTIM dataset available on Kaggle at https://www.kaggle.com/datasets/tarktunataalt/breast-tissue-impedance-measurements. The dataset comprises 106 instances of freshly excised breast tissue, with electrical impedance measurements recorded at seven electrode frequencies: 15.625, 31.25, 62.5, 125, 250, 500, and 1000 kHz. The impedance spectrum generated from these measurements, plotted in the real and imaginary planes, yielded features suitable for analysis. The dataset supports categorizing breast tissue samples into either the original six classes or four combined classes, which merge the fibro-adenoma, mastopathy, and glandular categories. The research implements three machine learning models, namely, FNNs, GPs, and MLPs, for biomedical data classification, integrating differential privacy methods to preserve patient privacy. These models performed well, as indicated by accuracy and correlation coefficient measurements, together with mean absolute error (MAE) and root mean squared error (RMSE) results, which assess their ability to protect privacy while producing strong predictions. Additional regularization techniques, such as L2 regularization, Dropout, and batch normalization, were applied to combat overfitting, primarily in the FNN model, while the GP model relied on its linear kernel and interpretability features to support medical transparency. The methodology covering the three machine learning and deep learning techniques is shown in Figure 3.

Figure 3.


Methodology of the analysis using the three machine and deep learning techniques.

This table presents representative rows for two tissue classes from the BTIM dataset, "car" (carcinoma) and "adi" (adipose tissue). Each row is a single tissue sample with its impedance-derived attributes (I0, PA500, HFS, DA, Area, A.DA, Max. IP, DR, and P), offering a detailed breakdown of how these features differ between the two classes. The data are useful for analyzing and comparing the characteristic feature ranges of each tissue class. A sample of the data is shown in Table 2.

Table 2.

Presents sample rows of the breast tissue impedance measurements dataset.

Index Class I0 PA500 HFS DA Area A.DA Max. IP DR P
0 car 524.79 0.19 0.03 228.8 6843.6 29.91 60.2 220.74 556.83
1 car 330 0.23 0.27 121.15 3163.24 26.11 69.72 99.08 400.23
2 car 551.88 0.23 0.06 264.8 11888.39 44.89 77.79 253.79 656.77
3 car 380 0.24 0.29 137.64 5402.17 39.25 88.76 105.2 493.7
4 car 362.83 0.2 0.24 124.91 3290.46 26.34 69.39 103.87 424.8
… … … … … … … … … … …
101 adi 2000 0.11 0.11 520.22 40087.92 77.06 204.09 478.52 2088.65
102 adi 2600 0.2 0.21 1063.44 174480.5 164.07 418.69 977.55 2664.58
103 adi 1600 0.07 −0.07 436.94 12655.34 28.96 103.73 432.13 1475.37
104 adi 2300 0.05 0.14 185.45 5086.29 27.43 178.69 49.59 2480.59
105 adi 2600 0.07 0.05 745.47 39845.77 53.45 154.12 729.37 2545.42

Equation model

Breast tissue classification using impedance measurements typically involves features such as electrical conductivity and permittivity at different frequencies. The goal is to train a global model across multiple hospitals or devices while ensuring patient privacy and confidentiality.

Global objective function

Minimize the weighted average of local loss functions:

$\min_{\omega} f(\omega) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(\omega)$ (1)

where:

  • $\omega$ represents the model parameters (weights) for breast tissue classification.

  • $K$ is the number of participating medical centers or devices.

  • $n_k$ is the number of patient samples at hospital/device $k$.

  • $n = \sum_{k=1}^{K} n_k$ is the total number of samples across all hospitals.

  • $F_k(\omega) = \frac{1}{n_k} \sum_{i=1}^{n_k} \ell(\omega, x_i, y_i)$ is the local loss function for hospital $k$, computed over breast tissue impedance data $x_i$ and corresponding labels $y_i$, where $\ell(\omega, x_i, y_i)$ is the classification loss (e.g., cross-entropy) for input $x_i$ with label $y_i$.
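As a minimal sketch, Equation (1) can be evaluated directly once each client reports its local loss and sample count; the three hospital sizes and loss values below are illustrative, not from the study:

```python
import numpy as np

def global_objective(local_losses, sample_counts):
    """Weighted average of local losses F_k(w) with weights n_k / n (Equation 1)."""
    n_k = np.asarray(sample_counts, dtype=float)
    F_k = np.asarray(local_losses, dtype=float)
    return float(np.sum(n_k / n_k.sum() * F_k))

# Three hypothetical hospitals holding 50, 30, and 26 of the 106 samples
f_omega = global_objective([0.40, 0.55, 0.35], [50, 30, 26])
```

Clients with more patient samples contribute proportionally more to the global objective, which is what the $n_k/n$ weighting encodes.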

Local model update

Each hospital trains locally using SGD:

$\omega_k^{(t+1)} = \omega_k^{(t)} - \eta \nabla F_k(\omega_k^{(t)})$ (2)

η is the learning rate.
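A minimal sketch of the local update in Equation (2), using a toy least-squares objective in place of the paper's classification loss (the data and learning rate are illustrative):

```python
import numpy as np

def local_sgd_step(w, grad_fn, lr=0.05):
    """One local update: w <- w - eta * grad F_k(w) (Equation 2)."""
    return w - lr * grad_fn(w)

# Toy client data: least-squares loss F_k(w) = mean((Xw - y)^2)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
y = np.array([1.0, 0.0, 1.0])
grad = lambda w: 2.0 / len(X) * X.T @ (X @ w - y)  # gradient of the local loss

w = np.zeros(2)
for _ in range(100):
    w = local_sgd_step(w, grad)
```

Each hospital runs such steps on its own data only; the raw samples never leave the client.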

Differential privacy (DP) in federated learning for breast tissue data

To protect patient privacy, Gaussian noise is added before updates are sent:

$\tilde{\omega}_k^{(t+1)} = \omega_k^{(t+1)} + \mathcal{N}(0, \sigma^2 I)$ (3)

where:

$\mathcal{N}(0, \sigma^2 I)$ is Gaussian noise with covariance $\sigma^2 I$, i.e., the noise variance $\sigma^2$ times the identity matrix, used to ensure differential privacy.

  • $\sigma$ controls the privacy strength: larger $\sigma$ yields stronger privacy at the cost of accuracy.
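The noise injection of Equation (3) is a one-line operation; the noise scale below is illustrative:

```python
import numpy as np

def privatize_update(w, sigma, seed=None):
    """Add N(0, sigma^2 I) noise to a local update before release (Equation 3).
    `seed` may be an int or a numpy Generator."""
    rng = np.random.default_rng(seed)
    return w + rng.normal(0.0, sigma, size=w.shape)
```

Because the noise is added on the client, the server (and any eavesdropper) only ever sees the perturbed parameters.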


Federated averaging (FedAvg) update

The server aggregates the differentially private updates:

$\omega^{(t+1)} = \sum_{k=1}^{K} \frac{n_k}{n} \tilde{\omega}_k^{(t+1)}$ (4)
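The FedAvg aggregation in Equation (4) is a sample-size-weighted average of the noisy client parameters; the client sizes below are illustrative:

```python
import numpy as np

def fedavg(updates, sample_counts):
    """Sample-size-weighted average of noisy client updates (Equation 4)."""
    n_k = np.asarray(sample_counts, dtype=float)
    weights = n_k / n_k.sum()
    return sum(w_i * u for w_i, u in zip(weights, updates))
```

Note that the server never needs the clients' raw data, only their (already privatized) parameter vectors.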

Privacy guarantee with (ε, δ) differential privacy

The Gaussian mechanism ensures $(\epsilon, \delta)$-differential privacy, where:

$\epsilon \propto \frac{\Delta}{\sigma}, \quad \text{with } (\epsilon, \delta)\text{-DP}$ (5)

DP refers to differential privacy, a technique that ensures the model update does not reveal information about any individual sample.

  • $\Delta$ is the sensitivity of the update (the maximum change a single patient's record can cause).

  • $\epsilon, \delta$ are the privacy parameters.
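Consistent with Equation (5), where $\epsilon$ scales with $\Delta/\sigma$, the standard Gaussian-mechanism calibration (valid for $\epsilon < 1$) sets $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)}/\epsilon$; the sketch below assumes that standard formula, which the paper does not state explicitly:

```python
import math

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale for (epsilon, delta)-DP under the classic Gaussian
    mechanism: sigma = Delta * sqrt(2 ln(1.25/delta)) / epsilon."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

sigma = gaussian_sigma(1.0, 0.5, 1e-5)  # tighter epsilon -> larger sigma
```

Halving $\epsilon$ doubles the required noise, which is the accuracy cost of a stronger privacy guarantee.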

The architecture of a federated learning differential privacy system

The architecture of a federated learning with differential privacy (FL-DP) system is designed for breast cancer diagnosis using the BTIM dataset. At the top, the BTIM dataset contains features such as electrical conductivity, permittivity, and frequency spectrum, which are distributed across multiple clients, including hospitals and medical devices. Each client performs local training using its own dataset without sharing raw data. During training, the local loss function is computed and optimized using the stochastic gradient descent (SGD) algorithm. To preserve patient privacy, a privacy engine at each client adds Gaussian noise to the locally trained model parameters before sending updates. These noisy updates obscure individual contributions, protecting them through (ε, δ)-differential privacy; here σ denotes the standard deviation of the noise applied at each local client to achieve differential privacy. The architecture of the FL-DP system is shown in Figure 4.

Figure 4.


The architecture of a federated learning system with differential privacy for breast tissue classification.

Differentially private model updates are securely transmitted to a central server via a secure communication channel. The server applies federated averaging (FedAvg) to aggregate these noise-perturbed updates, weighting each contribution based on the corresponding client's sample size. The result is an updated global model, which is subsequently distributed back to the clients. The process iterates over multiple rounds, allowing the model to improve while strictly adhering to privacy constraints. The equations embedded in the diagram formalize the local loss computation, noise injection, update aggregation, and privacy budget estimation, thereby demonstrating the mathematical foundation of the FL-DP framework. This system enables scalable, privacy-preserving machine learning across decentralized medical data sources, aligning with regulations such as HIPAA and GDPR for sensitive healthcare data.
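The full round described above (local training, noise injection, weighted aggregation, redistribution) can be sketched end to end; the least-squares objective, client sizes, and constants below are a toy simulation, not the paper's actual configuration:

```python
import numpy as np

def fl_dp_round(global_w, client_data, lr=0.05, sigma=0.01, seed=None):
    """One FL-DP round: each client takes a local SGD step on its own (X, y)
    data, adds Gaussian noise, and the server averages the noisy models
    weighted by sample count (Equations 2-4)."""
    rng = np.random.default_rng(seed)
    noisy_updates, counts = [], []
    for X, y in client_data:
        grad = 2.0 / len(X) * X.T @ (X @ global_w - y)  # local least-squares gradient
        local_w = global_w - lr * grad                   # local SGD step
        noisy_updates.append(local_w + rng.normal(0.0, sigma, size=local_w.shape))
        counts.append(len(X))
    weights = np.asarray(counts, dtype=float) / sum(counts)
    return sum(w * u for w, u in zip(weights, noisy_updates))

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
clients = []
for n in (40, 25, 35):                 # three hypothetical hospitals
    X = rng.normal(size=(n, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(200):
    w = fl_dp_round(w, clients, seed=rng)  # a Generator is accepted as seed
```

Despite the per-round noise, the weighted averaging lets the global model converge close to the shared signal, which mirrors the paper's claim that privacy and accuracy can coexist.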

Experimental results

Model 1: feedforward neural network model

The provided architecture illustrates a four-layer FNN: three dense layers and an output layer. Each dense layer uses a LeakyReLU activation with alpha set to 0.1, which prevents dead neurons while allowing nonlinear signals. An L2 regularization factor of 0.01 is added to each layer to prevent the model from overfitting. Batch normalization follows each dense layer to stabilize training across batches, and Dropout (0.4) randomly deactivates 40% of the neurons, further regularizing the learning process. The final layer applies the SoftMax function, standard for classification problems with more than two classes, to produce probabilities for each target class. This architecture aims to strike a balance between capacity and regularization for effective learning, as shown in Table 3.

Table 3.

Presents a model architecture of a feedforward neural network classification model.

Layer no. Layer type Output shape Activation function Regularization Other features
1 Dense (fully connected) (256) LeakyReLU (alpha = 0.1) L2 (λ=0.01) Batch Normalization, Dropout (0.4)
2 Dense (fully connected) (512) Leaky ReLU (alpha = 0.1) L2 (λ=0.01) Batch Normalization, Dropout(0.4)
3 Dense (fully connected) (256) LeakyReLU (alpha = 0.1) L2 (λ=0.01) Batch Normalization, Dropout (0.4)
4 Output layer (dense) (3) SoftMax L2 (λ=0.01) Batch Normalization, Dropout (0.4)

Compilation details

  • Optimizer: Nadam (learning rate = 0.003)

  • Loss function: Sparse categorical cross-entropy
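As an illustrative, untrained sketch of the layer stack in Table 3 (random weights, NumPy only; the L2 penalty and the Nadam optimizer are omitted because they act only during training, and the input width of 9 features is an assumption based on the dataset's attributes):

```python
import numpy as np

rng = np.random.default_rng(0)

def leaky_relu(x, alpha=0.1):
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def dense(x, n_out):
    """Randomly initialized dense layer (untrained; shows shapes/flow only)."""
    W = rng.normal(0.0, np.sqrt(2.0 / x.shape[1]), size=(x.shape[1], n_out))
    return x @ W

def forward(x, train=True, p_drop=0.4):
    """Forward pass mirroring Table 3: three LeakyReLU blocks, then softmax."""
    for width in (256, 512, 256):
        x = leaky_relu(dense(x, width), alpha=0.1)
        x = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-5)      # batch normalization
        if train:                                              # dropout: drop 40% of units
            x = x * (rng.random(x.shape) > p_drop) / (1 - p_drop)
    return softmax(dense(x, 3))

probs = forward(rng.normal(size=(8, 9)))  # 8 samples, 9 impedance-derived features
```

The SoftMax output guarantees a valid probability distribution over the target classes for every sample.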

These density plots represent the distributions of the different classes (car, fad, mas, gla, con, adi) across the input variables. Each plot visualizes the distribution of the respective variable's values for each class, as shown in Figure 5.

Figure 5.


Accuracy distribution of the feedforward neural network classification model on the breast tissue impedance measurements dataset across training, validation, and testing splits.

Table 4 summarizes the model's performance across training, validation, testing, and overall evaluation phases. The results show that the model learned effectively from the training data. However, its performance on the validation set was comparatively lower, indicating a degree of overfitting and a need for better generalization to unseen data. The test accuracy reflects a balanced performance, signifying the model can handle new data reasonably well. Overall, the metrics highlight the model's solid performance, with some scope for improvement in generalization. Accuracy metrics for the model evaluated on the BTIM dataset, including training, validation, and testing, are presented in Table 4.

Table 4.

Presents accuracy metrics for a model evaluated on the breast tissue impedance measurements dataset.

Metric Value
Train accuracy 0.8649
Validation accuracy 0.6875
Test accuracy 0.7500
Overall accuracy 0.8208

Figures 6 and 7 collectively show the model's training and validation losses across 100 epochs, providing a comprehensive view of its learning dynamics and generalization ability. In both figures, the blue and orange lines represent the training and validation losses, respectively; both decline steadily over time, which indicates effective learning and a decreasing error. The small gap between the two curves during training suggests good generalization with minimal overfitting. Notably, both figures show a convergence region in the final epochs (80-100), where the loss values stabilize at low levels, confirming that the model has converged. Figure 7 further interprets the small loss gap as an indication of robust generalization. Together, these diagrams confirm that the model has been trained effectively, achieving balanced performance on both seen and unseen data.

Figure 6.


Training and validation loss curves of the feedforward neural network model on the breast tissue impedance measurements dataset.

Figure 7.


Convergence behavior of the feedforward neural network model, showing training and validation loss trends on the breast tissue impedance measurements dataset.

The observed gap between training and validation accuracy can be explained by the trends shown in Figures 6 and 7. Both figures show a stable and consistent decline in training and validation loss across 100 epochs, with a slight yet stable gap between the two curves, which indicates effective learning and good generalization. While the discrepancy in accuracy may indicate mild overfitting, it likely reflects the model capturing specific patterns in the training data that do not fully generalize to the validation set. Notably, the convergence of both losses in the later epochs confirms that the model has reached a stable state and is no longer overfitting further. This suggests that the difference in accuracy is not an indication of poor performance but rather a natural variation arising from differences in data distribution or complexity between the training and validation sets. Overall, the model demonstrates strong learning capability, with controlled overfitting and reliable generalization.

Model 2: the Gaussian processes model BTIM dataset

Key parameters for a GP model used on the BTIM dataset are shown in Table 5. The GP model estimates the target variable with a linear kernel, and the data are normalized before training. The average target value is approximately 0.2474. The inverted covariance matrix, which encodes feature relationships and their variances, ranges from −0.1627 (lowest) to 0.9539 (highest). Multiplying the inverted covariance matrix by the target-value vector yields values from −0.1343 to 0.1432, demonstrating the model's sensitivity to both feature relationships and target values. The use of a linear kernel reflects a basic assumption of a linear relationship between the features and the target variable in this dataset.

Table 5.

Present the performance matrix of the Gaussian processes model.

Metric Value
Model type Gaussian Processes
Kernel used Linear Kernel: K(x, y) = <x, y>
Data normalization Yes
Average target value 0.24738728390307013
Inverted covariance matrix (lowest value) −0.1627271095860521
Inverted covariance matrix (highest value) 0.9539897083630564
Inverted covariance matrix * target-value vector (lowest value) −0.1343157400044548
Inverted covariance matrix * target-value vector (highest value) 0.14321101752744442
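A minimal sketch of linear-kernel GP prediction of the kind Table 5 describes; the normalization scheme, target centering, and noise level are assumptions, and the synthetic data are illustrative:

```python
import numpy as np

def gp_linear_predict(X_train, y_train, X_test, noise=1e-2):
    """GP regression with linear kernel K(x, y) = <x, y> on normalized inputs.
    The predictive mean is k_* @ (K + noise*I)^{-1} (y - ybar) + ybar, where
    (K + noise*I)^{-1} (y - ybar) is the 'inverted covariance matrix * target
    vector' quantity reported in Table 5."""
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0) + 1e-12
    Xtr = (X_train - mu) / sd          # normalize features (train statistics)
    Xte = (X_test - mu) / sd
    K = Xtr @ Xtr.T                    # linear kernel Gram matrix
    ybar = y_train.mean()
    alpha = np.linalg.solve(K + noise * np.eye(len(Xtr)), y_train - ybar)
    return (Xte @ Xtr.T) @ alpha + ybar

# Illustrative synthetic check: a linear target is recovered (toy values)
rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 4))
w_true = np.array([1.0, 2.0, -1.0, 0.5])
X_test = rng.normal(size=(5, 4))
pred = gp_linear_predict(X_train, X_train @ w_true, X_test)
```

Because the kernel is linear, the posterior mean is itself a linear function of the inputs, which is why the model is highly interpretable but cannot capture nonlinear feature interactions.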

The model performs very well, as indicated by a correlation coefficient of 0.9741, which suggests a high degree of agreement between the predicted and actual values. The MAE of 108.3815 shows the average magnitude of prediction errors, suggesting a reasonably good fit. The RMSE of 173.4865 highlights the magnitude of errors in the model's predictions, where a smaller value would indicate better performance. The relative absolute error (RAE) of 16.9394% and root relative squared error (RRSE) of 22.675% indicate how the model's errors compare to those of a simple baseline model. Overall, these results indicate that the model exhibits good predictive performance with a moderate error margin. The cross-validation results of the parameters are presented in Table 6.

Table 6.

Presents the cross-validation results of parameters.

Metric Value
Correlation coefficient 0.9741
Mean absolute error (MAE) 108.3815
Root mean squared error (RMSE) 173.4865
Relative absolute error (RAE) 16.9394%
Root relative squared error (RRSE) 22.675%
Total number of instances 106
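The metrics in Table 6 can be computed as follows; the baseline for RAE and RRSE is assumed to be the mean predictor, following the standard (Weka-style) definitions:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, RMSE, and baseline-relative errors RAE and RRSE (in %), where
    the baseline predictor always outputs the mean of y_true."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    base = y_true.mean() - y_true      # errors of the mean-only baseline
    return {
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "RAE %": float(100 * np.sum(np.abs(err)) / np.sum(np.abs(base))),
        "RRSE %": float(100 * np.sqrt(np.sum(err ** 2) / np.sum(base ** 2))),
    }
```

By construction, a model that merely predicts the mean scores 100% on both relative metrics, so the GP's RAE of 16.94% and RRSE of 22.68% indicate errors well below that baseline.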

Model 3: MLP neural network model

This MLP model, applied to the BTIM dataset, consists of a linear output node fed by multiple sigmoid hidden nodes. The linear node (Node 0) computes a weighted sum over the outputs of the sigmoid nodes (Nodes 1-7). Each sigmoid node transforms its inputs nonlinearly, weighting attributes such as I0, PA500, HFS, DA, and Area together with the class indicators (car, fad, mas, gla, con, adi); the magnitude of a node's weights reflects how strongly those attributes influence the choice between classes. Each node applies its own threshold, and together the layers predict whether a case belongs to a given class based on the relationships discovered between the attributes and the classes. Varying the node weights and thresholds allows the findings about breast tissue impedance to be more detailed and accurate. The neural network model trained on the BTIM dataset is shown in Table 7.

Table 7.

Presents a neural network model on the breast tissue impedance measurements dataset.

Node type Node ID Threshold Weights
Linear node Node 0 0.6803 Node 1: −0.8843, Node 2: −1.0951, Node 3: 1.3441, Node 4: −0.2543, Node 5: 0.0278, Node 6: −0.5699, Node 7: −0.0744
Sigmoid node Node 1 −0.3895 Class = car: 0.1289, fad: 0.3111, mas: 0.4251, gla: 0.1600, con: 0.2427, adi: 0.3705, I0: −1.4514, PA500: −0.1107, HFS: −0.2046, DA: −0.2682, Area: 0.3096
Sigmoid node Node 2 0.0257 Class = car: 0.2770, fad: 0.0608, mas: −0.2513, gla: 0.0687, con: −0.2277, adi: −0.0092, I0: −1.7922, PA500: 0.2244, HFS: 0.1374, DA: 0.3190
Sigmoid node Node 3 −0.5055 Class = car: 0.4600, fad: 0.3377, mas: 0.5855, gla: 0.3868, con: −0.1678, adi: 0.1212, I0: 1.7076, PA500: −0.2330, HFS: 0.2875, DA: 0.2189
Sigmoid node Node 4 −0.2899 Class = car: 0.1388, fad: 0.2185, mas: 0.3625, gla: 0.2841, con: 0.3709, adi: 0.0568, I0: −0.3892, PA500: 0.0947, HFS: 0.0058, DA: 0.1184
Sigmoid node Node 5 −0.3276 Class = car: 0.1820, fad: 0.3088, mas: 0.2910, gla: 0.3253, con: 0.2121, adi: −0.0182, I0: 0.0167, PA500: 0.0865, HFS: 0.0305, DA: 0.0657
Sigmoid node Node 6 −0.3800 Class = car: 0.1211, fad: 0.0997, mas: 0.4214, gla: 0.2554, con: 0.3974, adi: 0.1363, I0: −0.8778, PA500: −0.0290, HFS: 0.0171, DA: −0.0088
Sigmoid node Node 7 −0.3589 Class = car: 0.1748, fad: 0.2430, mas: 0.3366, gla: 0.3067, con: 0.2142, adi: −0.0054, I0: −0.1475, PA500: 0.1024, HFS: 0.0649, DA: 0.1397
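As an illustrative sketch of how a single hidden node in Table 7 produces its output, the function below computes a sigmoid over a weighted sum of named attributes plus the node's threshold; treating the threshold as an additive bias is an assumption (Weka-style MLPs store it that way), and the input values are hypothetical, not rows of the dataset:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_node(threshold, weights, inputs):
    """Output of one hidden sigmoid node: sigmoid(threshold + sum_i w_i * x_i).
    `weights` and `inputs` are dicts keyed by attribute name."""
    z = threshold + sum(w * inputs[name] for name, w in weights.items())
    return sigmoid(z)

# Hypothetical evaluation using a few of Node 1's attribute weights from Table 7
out = sigmoid_node(-0.3895,
                   {"I0": -1.4514, "PA500": -0.1107, "HFS": -0.2046},
                   {"I0": 0.5, "PA500": 1.2, "HFS": -0.3})
```

The sigmoid squashes every node output into (0, 1), so the linear output node effectively combines graded "votes" from the hidden layer.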

Table 8 presents a comparison of various studies in the healthcare domain, focusing on the integration of machine learning (ML), deep learning (DL), and privacy-preserving techniques. The studies span the years 2018-2022 and cover a wide range of applications, including big data, medical imaging, biomedicine, and healthcare. Notably, most studies from 2019 onward incorporate both ML and DL, with a strong emphasis on privacy concerns, including cryptographic and non-cryptographic techniques as well as federated learning. Our proposed model, introduced in 2025, builds upon the lessons from these previous works by integrating both privacy-preserving techniques and robust classification methods for biomedical image data, marking a significant step toward a comprehensive, privacy-aware approach to healthcare diagnostics. The table highlights the growing complexity of challenges in healthcare machine learning, particularly in balancing model accuracy, privacy, and security, while also underscoring the need for future research into more efficient and scalable privacy-preserving solutions. The comparative result of our model design is presented in Table 8.

Table 8.

Presents a comparative overview of our model against prior studies.

Comparison criteria: healthcare focus; ML; DL; privacy attacks (data); privacy attacks (model); cryptographic techniques; non-cryptographic techniques; hybrid approaches; federated learning; challenges; future directions; insights and pitfalls.

Reference Year Scope
61 2018 Big data related
62 2019 All domains
63 2019 Data publishing
64 2020 Medical imaging
65 2022 Biomedicine
66 2022 Healthcare (most comprehensive)
Our proposed model 2025 Biomedical image data classification

Discussion

The integration of differential privacy techniques in healthcare data analysis is becoming increasingly important, as it strikes a delicate balance between preserving patient confidentiality and enabling effective predictive modeling. In this study, we evaluated three machine learning models (FNNs, GPs, and MLPs) for classifying biomedical data while ensuring privacy protection. These models were evaluated on their ability to classify the BTIM dataset, measured using performance metrics including accuracy, correlation, MAE, and RMSE.

The FNN model showed solid results in terms of training accuracy (86.49%) and overall accuracy (82.08%), though it exhibited a lower validation accuracy (68.75%). This discrepancy suggests the presence of overfitting, meaning the model performed well on the training data but struggled to generalize to unseen data. However, the application of regularization techniques, such as L2 regularization, batch normalization, and Dropout (0.4), helped mitigate some of the overfitting effects. By introducing LeakyReLU activations and a SoftMax output layer, the model was able to adapt to the multi-class nature of the problem, providing reliable predictions for the various tissue classes (car, fad, mas, gla, con, adi).

The GP model demonstrated strong predictive performance with a correlation coefficient of 0.9741, a MAE of 108.38, and a RMSE of 173.49. This suggests that the model can effectively capture relationships in the data, though it has moderate error margins compared to the more complex models. Its linear kernel, which assumes a linear relationship between features and the target variable, can restrict the model's ability to represent more intricate, nonlinear relations. However, the GP model is valuable because it is interpretable, and interpretability plays a crucial role in medical applications, enabling a deeper understanding of the model's behavior. The MLP model was highly accurate, with the highest correlation coefficient (0.9980) and the lowest MAE (36.80) and RMSE (51.01), indicating high reliability in predicting the impedance values of breast tissue. The deep structure of the MLP model, with multiple layers of sigmoid nodes, provides greater flexibility in capturing nonlinearities in the data. Using both linear and nonlinear transformations across layers lets the model accommodate a wide range of input features, making it a highly effective option for classifying intricate biomedical datasets.

Conclusion and future work

This study demonstrates that integrating federated learning with differential privacy is a feasible and effective approach for the biomedical image classification problem, enabling secure model training while preserving the confidentiality of sensitive patient information. Among the models evaluated, our MLP model demonstrated superior performance, achieving the highest correlation and the lowest error rates. Its ability to handle complex relationships in the data makes it the most suitable for classifying the BTIM dataset with high accuracy. Compared to previous research, the MLP model's deeper architecture and regularization strategies allow it to capture complex patterns in the dataset that earlier approaches missed.

Additionally, the integration of differential privacy techniques through the modelling process ensures that patient confidentiality is maintained, making it suitable for real-world healthcare applications. However, future research could focus on further improving the model's generalization ability to address overfitting issues, particularly in models like FNN. This could be accomplished through advanced regularization methods or by exploring ensemble learning methods. Additionally, incorporating nonlinear kernels into the GP model can improve its performance by enabling the model to capture complex relationships in the data more effectively. Future work could also extend the analysis to larger and more diverse datasets to validate the robustness and scalability of the model, ensuring it can be applied across a wide range of healthcare applications.

Our study highlights the potential of machine learning models, particularly MLP, in privacy-preserving healthcare diagnostics. By continually refining and optimizing these models, we can further enhance their performance, ensuring accurate, secure, and reliable medical predictions in sensitive environments.

Limitations

While the proposed models demonstrate high classification accuracy and privacy preservation, several limitations remain. First, the BTIM dataset is relatively small, which may limit generalizability. Second, although differential privacy ensures confidentiality, it introduces a trade-off between model accuracy and noise, especially in complex models. Finally, the current implementation lacks real-time evaluation of clinical deployment scenarios. Future work should explore larger, multi-center datasets, advanced noise calibration techniques, and real-world testing in healthcare environments.

Acknowledgment

We sincerely acknowledge the support and contributions of our respective institutions and colleagues in the completion of this research.

Footnotes

Ethical considerations: As a member of my community, school, workplace, and country, I will abide by the laws, rules, and regulations of these institutions.

Author contributions: Conceptualization: SW; methodology: SW; supervision: Liudajun, SW; software analysis: SW; validation: SW; formal analysis: SW; and investigation: HY, PF, and SW; resources: HY and HD; data curation: SW; writing original draft preparation, review and editing: SW; visualization: SW; project administration: SW; funding acquisition: SW. All authors have read and agreed to the published version of the manuscript.

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

List of abbreviations

AI artificial intelligence

AD adversarial attack

BTIM Breast Tissue Impedance Measurements

CGAN conditional generative adversarial network

CLM clinical language model

CNN convolutional neural network

DP differential privacy

DNN deep neural network

EHR electronic health record

EMR electronic medical record

FedAvg federated averaging

FL federated learning

FNN feedforward neural network

GAN generative adversarial network

GP Gaussian process

IoMT internet of medical things

MAE mean absolute error

MI membership inference

ML machine learning

MLP multilayer perceptron

MRI magnetic resonance imaging

RAE relative absolute error

RI re-identification

RMSE root mean squared error

RRSE root relative squared error

SGP Sparse Gaussian process

SPFL-Trans specificity-preserving federated learning transformer

VAT virtual adversarial training

VI variational inference

BAB boundary attention block

CIDN context interactive deep network

FedCM federated credential management

VGG visual geometry group

GCN graph convolutional network

MIB multi-scale information block

References

  • 1.He J, Baxter SL, Xu J, et al. The practical implementation of artificial intelligence technologies in medicine. Nat Med 2019; 25: 30–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Wiens J, Saria S, Sendak M, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med 2019; 25: 1337–1340. [DOI] [PubMed] [Google Scholar]
  • 3.Price WN, Cohen IG. Privacy in the age of medical big data. Nat Med 2019; 25: 37–43. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.McMahan B, Moore E, Ramage D, et al. Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics (AISTATS). PMLR, 2017, pp.1273–1282. [Google Scholar]
  • 5.Sadilek A, Liu L, Nguyen D, et al. Privacy-first health research with federated learning. NPJ Digit Med 2021; 4: 132. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Wassan S, Xi C, Jhanjhi N, et al. Effect of frost on plants, leaves, and forecast of frost events using convolutional neural networks. Concur Eng 2021; 17: 15501477211053777. [Google Scholar]
  • 7.Wassan S, Xi C, Jhanjhi NJ. A smart comparative analysis for secure electronic websites. J Inf Assur 2021; 30. [Google Scholar]
  • 8.Wang Q, Zhou Y. FedSPL: federated self-paced learning for privacy-preserving disease diagnosis. Brief Bioinform 2022; 23: bbab498. [DOI] [PubMed] [Google Scholar]
  • 9.Arikumar K, Prathiba SB, Alazab M, et al. FL-PMI: federated learning-based person movement identification through wearable devices in smart healthcare systems. IEEE Access 2022; 22: 1377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Zhang A, Xing L, Zou J, et al. Shifting machine learning for healthcare from development to deployment and from models to data. NPJ Digit Med 2022; 6: 1330–1345. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Shah IA, Wassan S, Usmani MH. E-Government security and privacy issues: challenges and preventive approaches. Cybersecurity Management for E-Government Frameworks 2022; 61. [Google Scholar]
  • 12.Loftus TJ, Ruppert MM, Shickel B, et al. Federated learning for preserving data privacy in collaborative healthcare research. NPJ Digit Med 2022; 8: 20552076221134455. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Rischke R, Schneider L, Müller K, et al. Federated learning in dentistry: chances and challenges. Clin Oral Investig 2022; 101: 1269–1273. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Bharati S, Mondal MRH, Podder P, et al. Federated learning: applications, challenges and future directions. Future Gener Comput Syst 2022; 18: 19–35. [Google Scholar]
  • 15.Patel M, Dayan I, Fishman EK, et al. Accelerating artificial intelligence: how federated learning can protect privacy, facilitate collaboration, and improve outcomes. NPJ Digit Med 2023; 29: 14604582231207744. [DOI] [PubMed] [Google Scholar]
  • 16.Süwer S, Ullah MS, Probul N, et al. Privacy-by-design with federated learning will drive future rare disease research. NPJ Digit Med 2024: 22143602241296276. [DOI] [PubMed] [Google Scholar]
  • 17.Sahid MA, Uddin MP, Saha H, et al. Towards privacy-preserving Alzheimer’s disease classification: federated learning on T1-weighted magnetic resonance imaging data. NPJ Digit Med 2024; 10: 20552076241295577. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Srivenkateswaran C, Jaya Mabel Rani A, Senthil Kumaran R, et al. Securing healthcare data: a federated learning framework with hybrid encryption in cluster environments. J King Saud Univ Comput Inf Sci 2024: 09287329241291397. [DOI] [PubMed] [Google Scholar]
  • 19.Cao H, Liu S, Zhao R, et al. IFed: a novel federated learning framework for local differential privacy in power internet of things. Concur Eng 2020; 16: 1550147720919698. [Google Scholar]
  • 20.Wassan S, Dongyan H, Suhail B, et al. Deep convolutional neural network and IoT technology for healthcare. NPJ Digit Med 2024; 10: 20552076231220123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Nguyen DC, Pham Q-V, Pathirana PN, et al. Federated learning for smart healthcare: a survey. Comput Methods Programs Biomed 2022; 55: 1–37. [Google Scholar]
  • 22.Wu H, Zhang B, Chen C, et al. Federated semi-supervised medical image segmentation via prototype-based pseudo-labeling and contrastive learning. Med Image Anal 2023.
  • 23.Yan Z, Wicaksana J, Wang Z, et al. Variation-aware federated learning with multi-source decentralized medical image data. IEEE J Biomed Health Inform 2020; 25: 2615–2628.
  • 24.Zhu W, Luo J. Federated medical image analysis with virtual sample synthesis. In: MICCAI 2022. Springer, pp.728–738.
  • 25.Dalmaz O, Mirza U, Elmas G, et al. A specificity-preserving generative model for federated MRI translation. In: Distributed, collaborative, and federated learning workshop, MICCAI 2022. Springer, pp.79–88.
  • 26.Yi Z, Zhang H, Tan P, et al. DualGAN: unsupervised dual learning for image-to-image translation. In: Proceedings of the IEEE ICCV, 2017, pp.2849–2857.
  • 27.Cai X, Lan Y, Zhang Z, et al. A many-objective optimization based federal deep generation model for enhancing data processing capability in IoT. IEEE Internet Things J 2021; 19: 561–569.
  • 28.Zhang X, Tian Y, Jin Y. A knee point-driven evolutionary algorithm for many-objective optimization. IEEE Trans Evol Comput 2014; 19: 761–776.
  • 29.Liu X, Zhao J, Li J, et al. Federated neural architecture search for medical data security. IEEE Trans Ind Inform 2022; 18: 5628–5636.
  • 30.Wassan S, Latif RMA, Liudajun, et al. Advancements in optical glucose sensing for diabetes diagnostics and monitoring. Sens Actuators B 2025; 67.
  • 31.Zhang M, Wang H, Wang L, et al. CIDN: a context interactive deep network with edge-aware for X-ray angiography images segmentation. Comput Med Imaging Graph 2024; 87: 201–212.
  • 32.Beers A, Brown J, Chang K, et al. High-resolution medical image synthesis using progressively grown generative adversarial networks. arXiv preprint, 2018.
  • 33.Lee D, Kim J, Moon W-J, et al. CollaGAN: collaborative GAN for missing image data imputation. In: CVPR 2019, pp.2487–2496.
  • 34.Khan A, Jhanjhi N, Hamid DHH, et al. Future trends and challenges in cybersecurity and generative AI. Handbook of AI and Cybersecurity 2025: 491–522.
  • 35.Isola P, Zhu J-Y, Zhou T, et al. Image-to-image translation with conditional adversarial networks. In: CVPR 2017, pp.1125–1134.
  • 36.Huang X, Belongie S. Arbitrary style transfer in real-time with adaptive instance normalization. In: ICCV 2017, pp.1501–1510.
  • 37.Codella NC, Gutman D, Celebi ME, et al. Skin lesion analysis toward melanoma detection: ISIC 2017 challenge. In: ISBI 2018. IEEE, pp.168–172.
  • 38.Vasan D, Alazab M, Wassan S, et al. IMCFN: image-based malware classification using fine-tuned convolutional neural network architecture. Comput Netw 2020; 171: 107138.
  • 39.Vasan D, Alazab M, Wassan S, et al. Image-based malware classification using ensemble of CNN architectures (IMCEC). Comput Secur 2020; 92: 101748.
  • 40.Xu K, Hu W, Leskovec J, et al. How powerful are graph neural networks? arXiv preprint, 2018.
  • 41.Chakravarty A, Kar A, Sethuraman R, et al. Federated learning for site aware chest radiograph screening. In: IEEE ISBI 2021, pp.1077–1081.
  • 42.Chakravarty A, Sarkar T, Ghosh N, et al. Learning decision ensemble using a graph neural network for comorbidity aware chest radiograph screening. In: IEEE EMBC 2020, pp.1234–1237.
  • 43.Irvin J, Rajpurkar P, Ko M, et al. CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. In: AAAI 2019, pp.590–597.
  • 44.Huang Y-L, Yang H-C, Lee C-C. Federated learning via conditional mutual learning for Alzheimer’s disease classification on T1w MRI. In: IEEE EMBC 2021, pp.2427–2432.
  • 45.Liu Z, Sun M, Zhou T, et al. Rethinking the value of network pruning. arXiv preprint, 2018.
  • 46.Zhang L, Shi Y, Chang Y-C, et al. Federated fuzzy neural network with evolutionary rule learning. IEEE Trans Fuzzy Syst 2022; 31: 1653–1664.
  • 47.Licciardi A, Leo D, Faní E, et al. Interaction-aware Gaussian weighting for clustered federated learning. arXiv preprint, 2025.
  • 48.Zhang J, Xue N, Li X, et al. Federated learning under differential privacy with effective clipping and analytic Gaussian mechanism. In: ICNCIS 2024. SPIE, pp.314–322.
  • 49.Maurya A, Verma K. FedSec+: an advanced privacy-enhanced cardiovascular disease prediction model using federated learning. In: AIP Conference Proceedings. AIP Publishing, 2025.
  • 50.Alam MAU. Person re-identification attack on wearable sensing. arXiv preprint, 2021.
  • 51.Karmaker Santu SK, Bindschadler V, Zhai C, et al. NRF: a naive re-identification framework. In: ACM Workshop on Privacy in the Electronic Society (WPES). 2018, pp.121–132.
  • 52.Ye M, Luo J, Zheng G, et al. MedAttacker: exploring black-box adversarial attacks on risk prediction models in healthcare. In: IEEE BIBM 2022, pp.1777–1780.
  • 53.Newaz AI, Haque NI, Sikder AK, et al. Adversarial attacks to machine learning-based smart healthcare systems. In: IEEE GLOBECOM 2020, pp.1–6.
  • 54.Rahman A, Hossain MS, Alrajeh NA, et al. Adversarial examples—security threats to COVID-19 deep learning systems in medical IoT devices. IEEE Access 2020; 8: 9603–9610.
  • 55.Zhang Z, Yan C, Malin BA. Membership inference attacks against synthetic health data. J Biomed Inform 2022; 125: 103977.
  • 56.Gupta U, Stripelis D, Lam PK, et al. Membership inference attacks on deep regression models for neuroimaging. In: Medical imaging with deep learning (MIDL). PMLR, 2021, pp.228–251.
  • 57.Wassan S, Suhail B, Mubeen R, et al. Gradient boosting for health IoT federated learning. Sustainability 2022; 14: 16842.
  • 58.Jagannatha A, Rawat BPS, Yu H. Membership inference attack susceptibility of clinical language models. arXiv preprint, 2021.
  • 59.Liu G, Wang C, Peng K, et al. SocInf: membership inference attacks on social media health data with machine learning. IEEE Trans Comput Soc Syst 2019; 6: 907–921.
  • 60.Usynin D, Rueckert D, Kaissis G, et al. Beyond gradients: exploiting adversarial priors in model inversion attacks. ACM Trans Priv Secur 2023; 26: 1–30.
  • 61.Rasool RU, Ahmad HF, Rafique W, et al. Security and privacy of internet of medical things: a contemporary review in the age of surveillance, botnets, and adversarial ML. Comput Netw 2022; 201: 103332.
  • 62.Tanuwidjaja HC, Choi R, Kim K. A survey on deep learning techniques for privacy-preserving. In: Machine learning for cyber security (ML4CS), Springer. 2019; pp.29–46.
  • 63.Churi PP, Pawar AV. A systematic review on privacy preserving data publishing techniques. J Eng Sci Technol Rev 2019; 12.
  • 64.Kaissis GA, Makowski MR, Rückert D, et al. Secure, privacy-preserving and federated machine learning in medical imaging. Nat Mach Intell 2020; 2: 305–311.
  • 65.Torkzadehmahani R, Nasirigerdeh R, Blumenthal DB, et al. Privacy-preserving artificial intelligence techniques in biomedicine. Methods Inf Med 2022; 61: e12–e27.
  • 66.Khalid N, Qayyum A, Bilal M, et al. Privacy-preserving artificial intelligence in healthcare: techniques and applications. Comput Biol Med 2023; 158: 106848.

Articles from Digital Health are provided here courtesy of SAGE Publications