Abstract
In recent years, deep learning techniques have been widely used to diagnose diseases. However, in some tasks, such as COVID-19 diagnosis, insufficient data prevents the model from being trained properly, and as a result its generalizability decreases. For example, a model trained on one CT scan dataset and tested on another often produces near-random predictions. To address this, data from several different sources can be combined using transfer learning, taking into account the intrinsic and natural differences among datasets acquired with different medical imaging tools and approaches. In this paper, to improve the transfer learning technique and achieve better generalizability across multiple data sources, we propose a multi-source adversarial transfer learning model, namely AMTLDC. In AMTLDC, the learned representations are shared among the sources; in other words, the extracted representations are general and do not depend on any particular dataset domain. We apply AMTLDC to predict COVID-19 from medical images using a convolutional neural network. We show that accuracy can be improved using the AMTLDC framework, surpassing the results of current successful transfer learning approaches. In particular, we show that AMTLDC works well when the dataset domains differ or when data is insufficient.
Keywords: Diagnose diseases, COVID-19 diagnosis, Deep learning, Multi-source adversarial domain adaptation, Coronavirus pneumonia
Introduction
By November 2021, nearly 251 million people worldwide had officially been infected with COVID-19, and more than 5 million had died (Worldometer 2021; Ghaderzadeh et al. 2021a) since the epidemic was declared in March 2020. These figures underline the importance of rapid, highly reliable diagnosis of COVID-19 in its early stages, not only to save human lives but also to reduce the social and economic burden on affected communities. Although the RT-PCR (real-time polymerase chain reaction) test is the reference standard for confirming COVID-19, some studies show that this laborious method cannot diagnose the disease in its early stages (Ai et al. 2020; Alshazly et al. 2021; Jokandan et al. 2007), and others report a high false-negative rate (Long et al. 2020; Ghaderzadeh and Aria 2020).
One standard way to identify morphological patterns of lung lesions associated with COVID-19 is to use chest scan images. There are two common techniques for scanning the chest: X-rays and computed tomography (CT). Detection of COVID-19 from chest images by a radiologist is time-consuming, and the accuracy of COVID-19 diagnosis depends strongly on the radiologist's opinion (Ng et al. 2020; Ghaderzadeh et al. 2021b). Moreover, manually checking every image may not be feasible in emergency cases. Recently, deep learning-based methods (Ghaderzadeh et al. 2021a; Hemdan et al. 2003; Farooq and Hafeez 2003; Luz et al. 2021; Li et al. 2020c; Wang et al. 2020a) have been applied to help the medical community diagnose COVID-19 quickly, accurately, and automatically.
The use of deep learning in various fields of machine vision has shown promising results. In particular, deep learning is widely used in medical imaging, e.g., for the diagnosis of diabetic retinopathy (Gulshan et al. 2016), skin cancer (Esteva et al. 2017), breast cancer (Wang et al. 2016), and other tasks (Bayani et al. 2022; Bayani et al. 2022; Aria et al. 2022a). However, deep learning faces many challenges. Some of these challenges stem from the intrinsic nature of deep models; for example, a great deal of data is needed for a deep learning model to succeed. Deep learning-based medical applications owe much of their success to the data that has been collected over the years. In most applications, however, it is difficult to collect sufficient medical data to train a model, because labeling is costly and requires domain experts (Wang et al. 2019). The lack of a sufficient dataset is also a major challenge in the COVID-19 diagnostic task with medical imaging. To address this problem, various methods have been proposed (Altae-Tran et al. 2017; Christodoulidis et al. 2017; Dhungel et al. 2017; Bar et al. 2015). One of these is data augmentation, in which the data is expanded using transformations such as zooming, image rotation, horizontal or vertical shifting, and horizontal or vertical flipping (Dhungel et al. 2017). Other methods, such as few-shot learning, increase data efficiency (Altae-Tran et al. 2017). Still other methods instill knowledge learned on sufficient data into target deep models so that the model can be trained with insufficient data. This knowledge can be obtained by training the model on a semi-related dataset and then fine-tuning it on the target dataset (Bar et al. 2015). This technique is known as transfer learning.
The transfer learning technique is well suited to the variety of medical datasets available. For example, we can transfer knowledge between different datasets. Multi-source transfer learning can also be used to combine multiple sources and extract knowledge from them (Christodoulidis et al. 2016). Transferring knowledge between these datasets and learning common features can improve model generalizability. The advantage of multi-source transfer learning is that it allows the use of several different datasets, each of which may not be sufficient to train and generalize the model alone. However, the source datasets can be very different in nature, and transfer efficiency depends strongly on the similarity between the source tasks and the target task; in some cases, transfer learning hurts the model instead of helping it train better. There is also a risk that the model will learn features specific to each dataset instead of the features common among the datasets, which harms the generalization of the learned model. This is especially true for COVID-19 detection, because medical datasets are often collected with different medical imaging tools and methods.
Problem statement: Most existing methods for classifying COVID-19 are trained and evaluated with images from the same dataset. Relying on a single dataset reduces the generalizability of these methods, so the results of training and testing the network on the same dataset are much better than those of training and testing the network on different datasets. In other words, in the feature extraction stage, most of the proposed models depend heavily on the domain of the training dataset and do not perform well on unseen datasets. For this reason, they cannot be trusted in real-world applications, where the data encountered is new and independent of the training data. Numerous studies demonstrate that the most recent approaches in the literature are unreliable (Tartaglione et al. 2020; Tabik et al. 2020). For example, two well-known studies (Wang et al. 2020a; Afshar et al. 2020) in this field show performance close to random classification on unseen data (i.e., datasets on which the model has not been trained). The classification accuracy in Silva et al. (2020) drops from 98.5% on the test set to 59.12% on unseen datasets. The cause is the structural and inherent differences among images in the available datasets, which arise from different tools and medical imaging methods.
Method: To solve the above problem, in this research we propose an Adversarial Multi-source Transfer Learning framework for COVID-19 diagnosis from CT (computed tomography) images, namely AMTLDC. We use two separate datasets to learn common representations that are independent of the domain of each dataset: the source dataset is used for network training, and the target dataset is used to increase transferability and generalizability. Deep models traditionally used in transfer learning (e.g., convolutional neural networks) generally implement two modules: a feature extractor that extracts knowledge from the inputs, and a predictor that uses this knowledge to make predictions. In the AMTLDC setting, a new module infers the source of the input data from its extracted features. By making the feature extractor compete against this objective, the learned feature representation generalizes better across the sources. Our hypothesis is that the feature representation, being more general, will then transfer better to an unknown target. This idea is particularly well suited to COVID-19 diagnosis because of the structural and inherent differences among images in the available datasets, which arise from different tools and medical imaging methods. AMTLDC can perform correct classification regardless of the specific features of each input data domain; in other words, the learned representations are shared among both data domains and do not depend on a particular dataset domain.
Contribution: The contributions of this research are threefold:
The effect of intrinsic and natural differences in existing datasets obtained with different medical imaging tools and approaches is minimized as a result of the proposed adversarial multi-source transfer learning framework.
An efficient deep framework is developed to make COVID-19 detection more accurate.
Extensive experiments show that the AMTLDC has high generalizability on unseen data.
In the remainder of the paper: the related works are reviewed in Sect. 2; the proposed AMTLDC framework is introduced in Sect. 3; experiments performed are explained in Sects. 4 and 5; and in the last section, the conclusion is presented.
Related work
Various deep learning methods have been introduced to detect COVID-19. These methods can be divided into three general categories. The first category includes methods that have developed customized architectures for COVID-19 detection, such as COVID-Net (Wang et al. 2020a) and CVR-Net (Hasan et al. 2020). The second category includes methods that use common architectures [such as ResNet (Residual Network) (He et al. 2016)] and transfer learning. The last category includes the few studies that have employed handcrafted feature extraction approaches and conventional classifiers. Each of these categories is reviewed below.
Customized models
Some methods introduce a customized architecture for COVID-19 detection. COVID-Net (Wang et al. 2020a) is one of the pioneering methods, introducing a new convolutional architecture for identifying COVID-19; it is trained and evaluated on X-ray images. An improved version of COVID-Net was presented in Wang et al. (2020b), where the authors developed a novel joint learning model that detects COVID-19 by learning effectively from heterogeneous datasets with distribution discrepancies. In this model, the generated representations and the network performance are computationally improved.
In Hasan et al. (2020), a robust CNN (convolutional neural network)-based model, called CVR-Net, was proposed. In this framework, both CT and X-ray images are used to train and test the model. The proposed end-to-end CVR-Net is a multi-scale, multi-encoder ensemble model.
To increase the efficiency of coronavirus detection from CT images, the authors of CovidCTNet (Javaheri et al. 2005) proposed a set of deep models that successfully distinguish COVID-19 from other lung diseases. CovidCTNet is designed to work with small sample sizes and heterogeneous datasets.
In Amyar et al. (2020), a multitask deep learning-based model was proposed. The model improves on the state-of-the-art U-Net model by leveraging useful information contained in multiple related tasks, aiming both to improve segmentation and classification performance and to deal with the problem of small datasets.
Pre-trained models based on transfer learning
Various methods based on transfer learning have been proposed to detect coronavirus from medical images. In Singh et al. (2020a), the authors used convolutional networks to detect negative and positive cases of coronavirus in CT scan samples. In Apostolopoulos and Mpesiana (2020), common convolutional architectures, such as VGG19, MobileNet v2, Inception, Xception, and Inception ResNet v2, were used along with transfer learning to classify samples into three categories: normal, bacterial pneumonia, and COVID-19. A common transfer learning technique with fine-tuning is used in Minaee et al. (2020) to identify COVID-19; the authors used convolutional neural network architectures such as DenseNet-121, ResNet50, SqueezeNet, and ResNet18, tested on a dataset of 5000 X-ray images. In Hasan et al. (2020), as in the methods mentioned, transfer learning on a trained VGG-16 model is used to diagnose COVID-19. In Brunese et al. (2020), like other similar methods, the pre-trained VGG-16 network is used to detect COVID-19. In Li (2020), an efficient 3D deep learning framework called CONVNet is introduced; CONVNet uses the pre-trained ResNet architecture to extract two-dimensional and three-dimensional features. In Song (2021), the authors developed a method called DeepPneumonia for diagnosing bacterial pneumonia, COVID-19, and healthy cases. This model achieved 86.5% and 94% accuracy for distinguishing COVID-19 from bacterial pneumonia and from healthy cases, respectively. Other similar methods are introduced in Zhou et al. (2020) and Jaiswal et al. (2020).
Methods based on handcrafted feature extraction
Some COVID-19 detection methods use handcrafted feature extraction approaches. In Pereira et al. (2020), different texture features are first extracted from the images by popular texture descriptors, and these texture features are then combined with features extracted by the pre-trained InceptionV3 (Szegedy et al. 2016) model. In Al-Karawi et al. (2020a), a method for classifying positive and negative COVID-19 cases from CT scan images was proposed: different texture features were extracted from CT images using the Gabor filter, and the SVM method was then used to classify the images. In Hasan et al. (2020), a preprocessing step was applied to CT slices to reduce intensity variations between slices, and a long short-term memory (LSTM) classifier was then used to discriminate between COVID-19, pneumonia, and healthy cases. Other related methods based on combining feature extraction approaches and deep learning models are introduced in Farid et al. (2020a).
Recently, some new methods for the segmentation or classification of corona images have been introduced. In Abd Elaziz et al. (2021), the goal is to present an efficient image segmentation method for COVID-19 CT images. The method improves density peaks clustering (DPC) using the generalized extreme value (GEV) distribution. DPC is faster than other clustering methods and provides more stable results; however, it is difficult to determine the optimal number of cluster centers automatically without visualization. Therefore, GEV is used to determine a suitable threshold value that yields the optimal number of cluster centers and thereby improves the segmentation process. The proposed model is applied to a set of twelve COVID-19 CT images.
In Elaziz et al. (2020), the authors proposed a hybrid swarm intelligence (SI) approach, called MPAMFO, that combines the features of two SI methods: the marine predators algorithm (MPA) and moth-flame optimization (MFO). In MPAMFO, MFO is utilized as a local search method for MPA to avoid trapping at local optima. MPAMFO was proposed as an MLT (multilevel thresholding) approach for image segmentation and showed excellent performance in all experiments. The authors also tested MPAMFO on a real-world application, CT images of COVID-19, using thirteen CT images to evaluate its performance.
To distinguish COVID-19 cases from other normal and abnormal cases, the authors in Yousri et al. (2021) proposed a method that extracts informative features from X-ray images, leveraging a new feature selection method to determine the relevant features. In this method, an enhanced cuckoo search (CS) optimization algorithm was proposed using fractional-order calculus (FO) and four different heavy-tailed distributions in place of the Lévy flight to strengthen the algorithm's performance on the COVID-19 multi-class classification task. The classification covered three classes: normal patients, COVID-19-infected patients, and pneumonia patients. The distributions used are the Mittag–Leffler, Cauchy, Pareto, and Weibull distributions. Two COVID-19 X-ray image datasets were used to test the proposed method.
Most of the mentioned methods depend heavily on the image domain of the datasets on which they were trained. If the test set comes from the same domain as the training set, model performance is acceptable; when the domain of the evaluation dataset differs, however, performance drops significantly. In real-world applications, the domain of the inference image is not always the same as that of the training set. In other words, unseen data is often independent of the training set, so the results are not reliable.
Proposed framework for COVID-19 detection
The steps of the AMTLDC framework are shown graphically in Fig. 1. As shown in this figure, the COVID-19 detection model uses two separate datasets to learn common representations that are independent of the domain of each dataset: the source dataset is used for network training, and the target dataset is used to increase transferability and generalizability. The next step is preprocessing, in which the input data is decoded, resized, normalized, and finally transformed by data augmentation techniques. The final step is the AMTLDC architecture, which consists of three parts: a CNN-based feature extractor, a classifier, and a discriminator. These blocks are responsible for extracting features, classifying data into the two classes (COVID-19 or non-COVID-19), and distinguishing source data from target data, respectively.
Fig. 1.
AMTLDC framework
The purpose of the AMTLDC is to learn general features that are useful for both datasets so that the correct classification can be done regardless of the input source and the specific aspects of each input distribution.
Preprocessing
The preprocessing steps are described below:
Step 1: Decode: CT images are often saved in DICOM format. These files must be converted to a common image format; in this research, images are converted to PNG format.
Step 2: Resize: CT images are collected from different sources and may not all be the same size. Therefore, all images are resized to fit the input layer of the proposed network.
Step 3: Normalize: Pixel intensities are typically in the range 0 to 255 and should be normalized for network training, so we normalize the images to the [0, 1] range.
Step 4: Data Augmentation: Because the data is insufficient for network training, we use data augmentation to generate new data. The transformations applied are brightness adjustment (factor 0.2), contrast adjustment (factor 0.2), rotation (in the range [−20°, +20°]), and horizontal flipping.
Some images after preprocessing step are shown in Fig. 2.
Fig. 2.

Some images after preprocessing step
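The resize, normalization, and part of the augmentation steps above can be sketched as follows. This is a minimal illustration using plain `numpy` on grayscale arrays: a real pipeline would use a library such as OpenCV or Pillow for DICOM decoding and interpolation, and the contrast and rotation transforms are omitted here; the function names are ours.

```python
import numpy as np

def preprocess(img: np.ndarray, size: int = 224) -> np.ndarray:
    # Step 2: resize with nearest-neighbor sampling (a crude stand-in
    # for a proper library resizer)
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    img = img[rows][:, cols]
    # Step 3: normalize 0-255 intensities into the [0, 1] range
    return img.astype(np.float32) / 255.0

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    # Step 4 (partial sketch): brightness jitter of +/-0.2 and a
    # random horizontal flip
    img = np.clip(img + rng.uniform(-0.2, 0.2), 0.0, 1.0)
    if rng.random() < 0.5:
        img = img[:, ::-1]
    return img
```

In practice the augmented copies are generated on the fly during training rather than stored, so each epoch sees slightly different versions of the same scans.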
AMTLDC framework
The AMTLDC architecture consists of three parts: a CNN-based feature extractor, a classifier, and a discriminator. Figure 3 shows these modules. As shown in the figure, common convolutional architectures such as VGG16 and ResNet can be used with transfer learning techniques in the feature extraction block. In AMTLDC, we use the ResNet50 (He et al. 2016) architecture pre-trained on the ImageNet dataset. To classify data into the two classes, COVID-19 and non-COVID-19, we pass the representations extracted by the feature extraction block through two consecutive modules, each consisting of Dense, Batch-normalization, ReLU, and Dropout layers; on top of these two blocks, a sigmoid activation function is applied. The output of this module determines the probability of assigning each sample to each class. The domain classifier is responsible for identifying and distinguishing source data from target data. The goal is to learn representations that are common among datasets; in other words, the model avoids learning representations specific to one dataset, which may have structural and inherent differences from other datasets. The architecture of this block is the same as that of the classification block, except that its output estimates the probability of assigning each image to each dataset (source or target). The purpose of the AMTLDC architecture is to increase model transferability and generalizability simultaneously, so that the learned representations are general and independent of the input domain, i.e., suitable for both the source and target datasets.
Fig. 3.
CNN-based Feature Extractor, Classification, and Discrimination Blocks
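A minimal PyTorch sketch of one such head follows, for both the classifier and the domain discriminator, which share this structure. The hidden width of 256 and dropout rate of 0.5 are our assumptions (the text does not state them), and the input dimension of 2048 matches the pooled ResNet50 feature vector.

```python
import torch.nn as nn

def make_head(in_dim: int = 2048, hidden: int = 256, p: float = 0.5) -> nn.Sequential:
    """Two Dense -> BatchNorm -> ReLU -> Dropout blocks topped with a
    sigmoid output, as described for the classification and
    discrimination blocks."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(), nn.Dropout(p),
        nn.Linear(hidden, hidden), nn.BatchNorm1d(hidden), nn.ReLU(), nn.Dropout(p),
        nn.Linear(hidden, 1), nn.Sigmoid(),
    )
```

The classifier head interprets the output as P(COVID-19), while the discriminator head interprets it as P(source dataset); only the training signal differs.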
Training phase
To avoid over-specialization of the trained model on multiple datasets, and to increase its generalizability and transferability, inspired by Bois et al. (2021) we train the model with two loss functions. Figure 4 shows a graphical view of the training approach. The model is trained with an efficient adversarial training approach in a multi-source transfer learning environment. The classification and discrimination blocks use the features extracted by the feature extraction block to classify the input data and the domain the data comes from, respectively. Both blocks are trained by backpropagating their respective losses, each computed with the binary cross-entropy loss function. When arriving at the feature extractor block, the loss gradient of the discrimination block is reversed by a gradient reversal layer. Thus, the feature extractor block learns common, general representations of both sources that are useful for classifying the input while being uninformative about the domain the data comes from.
Fig. 4.
Efficient adversarial training approach in a multi-source transfer learning environment
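The gradient reversal described above is an identity function in the forward pass whose gradient is negated (and optionally scaled) in the backward pass. A minimal PyTorch sketch (the function and coefficient names are ours, not the paper's):

```python
import torch
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; negates and scales the gradient in
    the backward pass, so the feature extractor is pushed to fool the
    domain discriminator."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # reverse the gradient flowing back from the discriminator;
        # lam receives no gradient
        return -ctx.lam * grad_output, None

def grad_reverse(x: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    return GradReverse.apply(x, lam)
```

Placing `grad_reverse` between the feature extractor and the discriminator leaves the discriminator's own update unchanged while steering the extractor toward domain-invariant features.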
AMTLDC is trained simultaneously with two loss functions: the classification loss and the discrimination loss. Equation (1) shows the loss function used in this method. This loss combines a classification loss $L_c$ and a discrimination loss $L_d$; $\alpha$ and $\beta$ are coefficients controlling the bias-vs-variance tradeoff of the generalization:

$$L = \alpha L_c + \beta L_d \quad (1)$$
We use the cross-entropy loss function to calculate both the discriminator domain loss and the classification loss. The classification loss $L_c$ in this algorithm is defined by Eq. (2):

$$L_c = -\left[\, y \log(\hat{y}) + (1-y)\log(1-\hat{y}) \,\right] \quad (2)$$

where $y$ indicates the correct class and $\hat{y}$ indicates the model prediction.
The discrimination domain loss $L_d$ in this algorithm is defined by Eq. (3):

$$L_d = -\left[\, d \log(\hat{d}) + (1-d)\log(1-\hat{d}) \,\right] \quad (3)$$

where $d$ indicates the correct domain class and $\hat{d}$ indicates the model prediction.
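Under this reading of Eqs. (1)–(3), a single-sample sketch in plain Python, using the α = 1, β = 4 weighting reported in Table 1 (the function and symbol names are ours):

```python
import math

def bce(y: float, y_hat: float) -> float:
    # binary cross-entropy for one sample, as in Eqs. (2) and (3)
    return -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))

def total_loss(y, y_hat, d, d_hat, alpha=1.0, beta=4.0) -> float:
    # Eq. (1): weighted sum of the classification loss (true class y,
    # prediction y_hat) and the domain loss (true domain d, prediction d_hat)
    return alpha * bce(y, y_hat) + beta * bce(d, d_hat)
```

In training, the domain term reaches the feature extractor only through the gradient reversal layer, so minimizing this sum trains the discriminator normally while the extractor is pushed in the opposite direction.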
Experiments
In this section, the efficiency of the AMTLDC method is evaluated and compared with the following groups of methods:
Methods based on customized network
Methods based on pre-trained networks and transfer learning
For the AMTLDC, the parameters and their values are described in Table 1. In the proposed method, the two parameters that have a significant effect on the results are the $\alpha$ and $\beta$ coefficients. We tested these parameters over a range of values; according to this test, the best results were obtained with $\alpha$ = 1 and $\beta$ = 4.
Table 1.
AMTLDC parameters
| Parameter | Value |
|---|---|
| Dropout rate | |
| Coefficient $\alpha$ | 1 |
| Coefficient $\beta$ | 4 |
| Batch size | 32 |
| Maximum number of iterations | |
| Learning rate (Adam optimizer) | |
In the proposed method and the conventional architectures (the second group of compared methods), such as VGG-16 and ResNet, all common parameters such as the learning rate and batch size are set identically. The number of layers and neurons in the classification module is also similar, so the comparisons are fair. For methods whose source code is not available or that have their own specific parameters, the best results are reported directly from the relevant papers; most of these methods introduce a customized architecture for the COVID classification task.
Evaluation criteria
In the experiments, as in (Luz et al. 2021; Yousri et al. 2021; Hashemzadeh et al. 2019; Golzari Oskouei et al. 2021a, 2021b, 2022; Aria et al. 2022b; Golzari Oskouei and Hashemzadeh 2022; Wang et al. 2021; Ghaderzadeh et al. 2022), we use the Accuracy, Precision, Recall, F1, and Specificity criteria to evaluate the algorithms. These evaluation criteria are shown in Eqs. (4)–(8), where TP, FP, TN, and FN represent True Positive, False Positive, True Negative, and False Negative, respectively.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (4)$$

$$\text{Precision} = \frac{TP}{TP + FP} \quad (5)$$

$$\text{Recall} = \frac{TP}{TP + FN} \quad (6)$$

$$F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (7)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \quad (8)$$
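These five criteria follow directly from the confusion-matrix counts; a small self-contained sketch (the function name is ours):

```python
def evaluate(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the criteria of Eqs. (4)-(8) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }
```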
Dataset
In recent research, three datasets, SARS-CoV-2 CT (Angelov and Almeida Soares 2020), COVID19-CT (He et al. 2020), and COVID19-CT_v2 (Zhao et al. 2020) are often used to evaluate model performance. We also test the performance of the AMTLDC method on these datasets. The SARS-CoV-2 dataset contains 1252 corona images and 1230 non-corona images. The COVID19-CT dataset contains 349 corona images and 397 non-corona images. The COVID19-CT_v2 dataset contains 349 corona images and 463 non-corona images.
Results
Tables 2, 3, 4, 5, 6, 7 show the results of different evaluation criteria. The results of other methods are reported directly from the relevant articles.
Table 2.
Performance comparison of different models on the SARS-CoV-2 dataset
| Model/method | Evaluation metrics | |||
|---|---|---|---|---|
| Accuracy | Precision | Recall | F1 | |
| Decision tree | 79.4 | 76.8 | 83.1 | 79.8 |
| GoogleNet | 91.7 | 90.2 | 93.5 | 91.8 |
| AlexNet | 93.7 | 94.9 | 92.2 | 93.6 |
| ResNet50V2 | 94.2 | 92.8 | 96.7 | 94.1 |
| ResNet50 | 94.9 | 93.0 | 97.1 | 95.0 |
| VGG-16 | 94.9 | 94.0 | 95.4 | 94.9 |
| AdaBoost | 95.1 | 93.6 | 96.7 | 95.1 |
| SqueezeNet | 95.1 | 94.2 | 96.2 | 95.2 |
| ShuffleNet | 97.5 | 96.1 | 99.0 | 97.5 |
| EfficientNetB0 | 98.9 | 99.1 | 98.9 | 99.0 |
| Xception | 98.8 | 99.0 | 98.6 | 98.8 |
| COVID CT-Net (Yazdani et al. 2020) | 90.7 | 88.5 | 85.0 | 90.0 |
| Contrastive learning (Wang et al. 2020b) | 90.8 | 95.7 | 85.8 | 90.8 |
| Modified VGG19 (Panwar et al. 2020) | 95.0 | 95.3 | 94.0 | 94.3 |
| DenseNet201-based (Jaiswal et al. 2020) | 96.2 | 96.2 | 96.2 | 96.2 |
| xDNN (Soares et al. 2020) | 97.3 | 99.1 | 95.5 | 97.3 |
| AMTLDC | 99.8 | 99.8 | 99.7 | 99.7 |
The best performance for the used dataset and methods is shown in bold
Table 3.
Performance comparison of different models on the COVID19-CT dataset
| Model/method | Evaluation metrics | |||
|---|---|---|---|---|
| Accuracy | Recall | Specificity | F1 | |
| AlexNet | 74.5 | 70.4 | 79.0 | 75.0 |
| SqueezeNet | 78.5 | 86.5 | 63.8 | 82.0 |
| VGG-16 | 78.5 | 74.6 | 82.8 | 76.0 |
| GoogleNet | 78.9 | 75.9 | 82.3 | 79.0 |
| VGG-19 | 83.2 | 90.7 | 74.7 | 85.0 |
| NasNet-mobile | 83.4 | 84.8 | 81.9 | 85.0 |
| NasNet-large | 85.2 | 79.3 | 91.9 | 84.0 |
| Xception | 85.6 | 88.3 | 80.6 | 87.7 |
| ShuffleNet | 86.1 | 83.5 | 89.0 | 86.0 |
| Inception-ResNet-v2 | 86.3 | 88.1 | 84.2 | 87.0 |
| MobileNet-v2 | 87.2 | 93.2 | 77.6 | 89.0 |
| DenseNet-121 | 88.9 | 88.8 | 88.9 | 88.2 |
| Inception-v3 | 89.4 | 90.0 | 88.9 | 88.8 |
| ResNet-101 | 89.7 | 82.2 | 89.2 | 89.0 |
| ResNet-18 | 90.1 | 89.4 | 90.9 | 91.0 |
| ResNeXt-50 | 90.6 | 93.4 | 88.2 | 90.3 |
| ResNeXt-101 | 90.9 | 93.1 | 88.9 | 90.6 |
| DenseNet-169 | 91.2 | 93.3 | 88.9 | 90.8 |
| DenseNet-201 | 91.7 | 88.6 | 94.1 | 91.9 |
| Contrastive learning (Wang et al. 2020b) | 78.6 | 78.0 | 77.0 | 78.8 |
| ResNet-101-based (Saqib et al. 2020) | 80.3 | 85.7 | 86.0 | 81.8 |
| DenseNet-169-based (He et al. 2020) | 83.0 | 84.8 | 85.5 | 81.0 |
| DenseNet-121 + SVM (Jokandan et al. 2007) | 85.9 | 84.9 | 86.8 | 86.2 |
| DenseNet-169-based (Martinez 2009) | 87.7 | 85.6 | 86.9 | 87.8 |
| Decision function (Mishra et al. 2020) | 88.3 | 87.0 | 87.9 | 86.7 |
| AMTLDC | 95.1 | 94.6 | 95.8 | 94.1 |
The best performance for the used dataset and methods is shown in bold
Table 4.
Cross-dataset evaluation
| Method | Training dataset | Test dataset | Evaluation metrics | ||
|---|---|---|---|---|---|
| Accuracy | Recall | Precision | |||
| Without multi-source transfer learning | SARS-CoV-2 | COVID19-CT (train set) | 64.11 | 65.05 | 64.10 |
| SARS-CoV-2 | COVID19-CT (test set) | 62.01 | 63.21 | 61.89 | |
| SARS-CoV-2 | COVID19-CT (all data) | 64.21 | 66.45 | 65.04 | |
| COVID19-CT | SARS-CoV-2 | 56.92 | 59.28 | 56.47 | |
| SARS-CoV-2 | COVID19-CT-v2 (train set) | 60.34 | 61.15 | 60.10 | |
| SARS-CoV-2 | COVID19-CT-v2 (test set) | 58.78 | 59.41 | 57.23 | |
| SARS-CoV-2 | COVID19-CT-v2 (all data) | 60.13 | 62.91 | 61.16 | |
| COVID19-CT-v2 | SARS-CoV-2 | 52.87 | 55.41 | 52.32 | |
| With multi-source transfer learning | SARS-CoV-2 | COVID19-CT (train set) | 91.00 | 92.55 | 92.01 |
| SARS-CoV-2 | COVID19-CT (test set) | 90.44 | 90.89 | 91.81 | |
| SARS-CoV-2 | COVID19-CT (all data) | 92.37 | 93.42 | 92.37 | |
| COVID19-CT | SARS-CoV-2 | 82.97 | 87.37 | 84.09 | |
| SARS-CoV-2 | COVID19-CT-v2 (train set) | 89.32 | 90.50 | 90.35 | |
| SARS-CoV-2 | COVID19-CT-v2 (test set) | 88.12 | 88.10 | 89.68 | |
| SARS-CoV-2 | COVID19-CT-v2 (all data) | 90.41 | 91.42 | 90.75 | |
| COVID19-CT-v2 | SARS-CoV-2 | 80.64 | 85.74 | 82.42 | |
The best performance for the used dataset and methods is shown in bold
Table 5.
AMTLDC vs. pre-trained models
| References | Data sources | No. of samples | Model | Performance |
|---|---|---|---|---|
| Ardakani et al. (2020b) | Real-time data from the hospital environment | Total: 1,020; COVID-19: 510; Non-COVID-19: 510 | AlexNet, VGG-16, VGG-19, … | Accuracy: 99.51, Recall: 100, Specificity: 99.02 |
| Chen et al. (2020) | Renmin Hospital of Wuhan University | Total: 35,355 | UNet++ | Accuracy: 98.85, Recall: 94.34, Specificity: 99.16 |
| Cifci (2022) | Kaggle benchmark dataset (Kaggle 2020) | Total: 5,800 | AlexNet, Inception-V4 | Accuracy: 94.74, Recall: 87.37, Specificity: 87.45 |
| Javaheri et al. (2005) | Five medical centers in Iran, SPIE-AAPM-NCI (Armato et al. 2015), LUNGx (Armato et al. 2016) | Total: 89,145; COVID-19: 32,230; Non-COVID-19: 56,915 | BCDU-Net (U-Net) | Accuracy: 91.66, Recall: 87.5, Specificity: 94 |
| Jin et al. (2020) | Wuhan Union Hospital, LIDC-IDRI (Armato et al. 2011), ILD-HUG (Depeursinge et al. 2012) | Total: 1,881; COVID-19: 496; Non-COVID-19: 1,385 | ResNet152 | Accuracy: 94.98, Recall: 94.06, Specificity: 95.47, F1: 92.78 |
| Jin et al. (2020) | Five different hospitals of China | Total: 1,391; COVID-19: 850; Non-COVID-19: 541 | DPN-92, Inception-v3, ResNet-50 | Recall: 97.04, Specificity: 92.2 |
| Dadário et al. (2020) | Multiple hospitals environment | Total: 4,536; COVID-19: 1,296; Non-COVID-19: 1,325 | ResNet50 | Recall: 90, Specificity: 96 |
| Wu et al. (2020) | China Medical University, Beijing Youan Hospital | Total: 495; COVID-19: 368; Non-COVID-19: 127 | ResNet50 | Accuracy: 76, Recall: 81.1, Specificity: 61.5 |
| Xu et al. (2019) | Zhejiang University, Hospital of Wenzhou, Hospital of Wenling | Total: 618; COVID-19: 219; Non-COVID-19: 399 | ResNet18 | Accuracy: 86.7, Recall: 81.5, F1: 81.1 |
| Yousefzadeh et al. (2020) | Real-time data from the hospital environment | Total: 2,124; COVID-19: 706; Non-COVID-19: 1,418 | DenseNet, ResNet, Xception, EfficientNetB0 | Accuracy: 96.4, Recall: 92.4, Specificity: 98.3, F1: 95.3 |
| AMTLDC | SARS-CoV-2 CT-scan dataset | Total: 2,482; COVID-19: 1,252; Non-COVID-19: 1,229 | ResNet50 | Accuracy: 99.96, Recall: 99.80, Specificity: 99.80, F1: 99.90 |
The best performace for the used dataset and methods are in bold
Table 6.
AMTLDC vs. customized models

| Reference | Data sources | No. of samples | Model | Performance |
|---|---|---|---|---|
| Liu et al. (2020) | Ten designated COVID-19 hospitals in China | Total: 1,993; COVID-19: 920; Non-COVID-19: 1,073 | Modified DenseNet-264 | Accuracy: 94.3, Recall: 93.1, Specificity: 95.1 |
| Amyar et al. (2020) | COVID-CT (Zhao et al. 2020), COVID-19 CT segmentation dataset (2020), Henri Becquerel Center | Total: 1,044; COVID-19: 449; Non-COVID-19: 595 | Encoder-decoder with multi-layer perceptron | Accuracy: 86.0, Recall: 94.0, Specificity: 79.0 |
| Elghamrawy and Hassanien (2020) | COVID-19 Database (Italian Society of Medical and Interventional Radiology: COVID-19 Database 2020), COVID-CT (Zhao et al. 2020) | Total: 583; COVID-19: 432; Non-COVID-19: 151 | WOA-CNN | Accuracy: 96.40, Recall: 97.25, Precision: 97.3 |
| Farid et al. (2020b) | Kaggle benchmark dataset (Kaggle Benchmark dataset 2020) | Total: 102; COVID-19: 51; Non-COVID-19: 51 | CNN | Accuracy: 94.11, Precision: 99.4, F1: 94.0 |
| Hasan et al. (2020) | COVID-19 (2020), SPIE-AAPM-NCI Lung Nodule Classification Challenge Dataset (Armato et al. 2015) | Total: 321; COVID-19: 118; Non-COVID-19: 203 | QDE–DF | Accuracy: 99.68 |
| Singh et al. (2020b) | COVID-19 patient chest CT images (Li et al. 2020a) | Total: 150; COVID-19: 75; Non-COVID-19: 75 | MODE-CNN | Accuracy: 93.25, Recall: 90.70, Specificity: 90.72 |
| Wang et al. (2020c) | Xi'an Jiaotong University, Nanchang University, Xi'an Medical College | Total: 1,065; COVID-19: 740; Non-COVID-19: 325 | Modified Inception | Accuracy: 79.3, Recall: 83.0, Specificity: 67.0 |
| Song et al. (2020) | Hospital of Wuhan University, Third Affiliated Hospital | Total: 1,990; COVID-19: 777; Non-COVID-19: 1,213 | DRE-Net | Accuracy: 94.3, Recall: 93.0, Precision: 96.0 |
| Zheng et al. (2020) | Union Hospital, Tongji Medical College, Huazhong University of Science and Technology | Total: 630 | DeCoVNet | Accuracy: 90.1, Recall: 90.7, Specificity: 91.1 |
| AMTLDC | SARS-CoV-2 CT-scan dataset | Total: 2,482; COVID-19: 1,252; Non-COVID-19: 1,229 | ResNet50 | Accuracy: 99.86, Recall: 99.80, Specificity: 99.70, F1: 99.70 |

The best performance for the used dataset and methods is in bold
Table 7.
Number of parameters and runtime of different methods
| Model | No. of parameters | Runtime (seconds) |
|---|---|---|
| MobileNet-v2 | 6,444,417 (~ 6 M) | 4 s |
| DenseNet-121 | 10,253,057 (~ 10 M) | 11 s |
| VGG-16 | 16,324,609 (~ 16 M) | 22 s |
| DenseNet-169 | 17,865,473 (~ 17 M) | 23 s |
| VGG-19 | 21,634,305 (~ 21 M) | 30 s |
| DenseNet-201 | 24,347,393 (~ 24 M) | 34 s |
| InceptionV3 | 25,083,873 (~ 25 M) | 36 s |
| Xception | 27,288,297 (~ 27 M) | 39 s |
| ResNet50V2 | 29,991,617 (~ 30 M) | 43 s |
| ResNet-50 | 30,014,529 (~ 30 M) | 43 s |
| ResNet101V2 | 49,053,377 (~ 49 M) | 71 s |
| ResNet-101 | 49,084,993 (~ 49 M) | 71 s |
| InceptionResNetV2 | 56,798,625 (~ 57 M) | 83 s |
| ResNet152V2 | 64,758,465 (~ 64 M) | 90 s |
| ResNet152 | 64,797,761 (~ 65 M) | 93 s |
| AMTLDC | 36,441,346 (~ 36 M) | 49 s |
Experiment 1: evaluation of the SARS-CoV-2 dataset
In this section, we evaluate the AMTLDC method on the SARS-CoV-2 dataset and compare it with other successful methods. The results of the AMTLDC and other methods are stated in Table 2.
As Table 2 shows, our method achieves the best performance among all compared methods, outperforming the other advanced approaches in this research field. After our method, the EfficientNetB0 and xDNN methods perform best, respectively, while the Decision Tree method yields the worst result. The confusion matrix of the AMTLDC evaluation on the test set of the SARS-CoV-2 dataset is shown in Fig. 5. From Table 2 and Fig. 5, it is evident that AMTLDC performs better than the other methods. The average Accuracy, Precision, Recall, and F1 of AMTLDC are 99.8%, 99.8%, 99.7%, and 99.7%, respectively. A Recall of 99.7% indicates that, on average, only two COVID-19 images are incorrectly predicted as non-COVID-19. Moreover, AMTLDC diagnoses the non-COVID-19 cases with only two false positives. After AMTLDC, the EfficientNetB0, xDNN, DenseNet201-based, and ShuffleNet methods show relatively good performance, respectively. With the EfficientNetB0 architecture, on average, three COVID-19 images are incorrectly predicted as non-COVID-19.
Fig. 5.

Confusion matrix of evaluation on the test set of the SARS-CoV-2 dataset
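The per-class error counts discussed above follow directly from the standard binary-classification metrics. A minimal sketch, using hypothetical confusion-matrix counts for illustration (the exact test-set split is not restated here):

```python
def binary_metrics(tp, fn, tn, fp):
    """Evaluation metrics used in this paper, from a 2x2 confusion matrix
    (positive class = COVID-19)."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity: COVID-19 cases caught
    specificity = tn / (tn + fp)     # non-COVID-19 cases correctly ruled out
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity, "f1": f1}

# Hypothetical counts: 2 false negatives and 2 false positives on a test split
m = binary_metrics(tp=248, fn=2, tn=247, fp=2)
print({k: round(100 * v, 1) for k, v in m.items()})
```

With these illustrative counts, Recall is 248/250 = 99.2%, showing how "two false negatives" maps onto the reported Recall figure.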
Comparing AMTLDC with the other ResNet-based methods shows that, despite having a smaller network, AMTLDC achieves much better results. The Accuracy, Precision, Recall, and F1 of the AMTLDC approach improve on the other ResNet-based methods by an average of 5.04%, 6.27%, 3.24%, and 5.90%, respectively.
Experiment 2: evaluation of the COVID19-CT dataset
In this section, we evaluate the AMTLDC on the COVID19-CT dataset and compare it with other successful methods. The results of the AMTLDC and other methods are stated in Table 3.
As Table 3 shows, AMTLDC performs best among all methods, outperforming the other successful approaches in this field. After our method, the ResNet-based methods perform relatively well, while the AlexNet method yields the worst result.
For AMTLDC, the average Accuracy, Recall, Specificity, and F1 are 95.1%, 94.6%, 95.8%, and 94.1%, respectively. After AMTLDC, the ResNet-based methods again perform relatively well. For the second-best method (DenseNet-169), the average Accuracy, Recall, Specificity, and F1 are 91.2%, 93.3%, 88.9%, and 90.8%, respectively. For DenseNet-169, the average Recall of 93.3% indicates that, on average, eight COVID-19 images are incorrectly predicted as non-COVID-19, and the average Specificity of 88.9% indicates that the non-COVID-19 cases are detected with more than ten false-positive samples. For AMTLDC, the average Recall of 94.6% indicates that, on average, seven COVID-19 images are incorrectly predicted as non-COVID-19, and the average Specificity of 95.8% indicates that the non-COVID-19 cases are detected with only six false-positive samples.
Comparing AMTLDC with other ResNet-based methods, such as the ResNeXt-50, ResNeXt-101, and ResNet-50 models, shows that AMTLDC achieves much better results. The Accuracy, Recall, Specificity, and F1 of the proposed approach improve on the ResNeXt-101 architecture by an average of 5.28%, 1.82%, 7.98%, and 4.85%, respectively. Similarly, they improve on the ResNeXt-50 architecture by an average of 5.62%, 1.49%, 8.84%, and 5.20%, respectively.
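The improvement percentages quoted in these comparisons are consistent with relative gains. Assuming they are computed as 100 · (new − baseline) / baseline (the formula is not spelled out in the text), a sketch with illustrative numbers:

```python
def relative_improvement(new, baseline):
    """Percentage improvement of `new` over `baseline`, assuming the
    relative-gain convention: 100 * (new - baseline) / baseline."""
    return 100.0 * (new - baseline) / baseline

# Illustrative: AMTLDC accuracy 95.1% vs. a hypothetical baseline of 90.3%
print(round(relative_improvement(95.1, 90.3), 2))  # ~5.32, close to the 5.28% reported
```

Under this convention a 5.28% accuracy gain over ResNeXt-101 implies a baseline accuracy near 90.3%, which matches the range of the ResNet-family results.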
Experiment 3: cross-dataset evaluation
In this section, we evaluate the transferability and generalizability of AMTLDC. To investigate whether AMTLDC prevents negative transfer, we test it once with the proposed multi-source transfer learning and once without it. In both modes, we train the network on the SARS-CoV-2 dataset and evaluate it on the COVID19-CT and COVID19-CT–v2 datasets, and vice versa. Table 4 shows the results of this evaluation. According to this table, the results of AMTLDC improve by about 30% compared to the mode without the multi-source transfer technique, confirming that AMTLDC improves generalizability. A closer look reveals that training the network on the COVID19-CT and COVID19-CT–v2 datasets generalizes less well than training on the SARS-CoV-2 dataset. The reason is clear: the SARS-CoV-2 dataset is richer (in size and variety of collected data) than the other two datasets, whereas the data in the two smaller datasets were collected from different sources, with different contrasts and different visual features, making them less suitable for model training.
The Accuracy, Recall, and Precision of the proposed approach with multi-source transfer learning improve on the version without it by an average of 42.37%, 41.15%, and 43.24%, respectively.
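The adversarial multi-source mechanism evaluated here is commonly implemented with a gradient-reversal layer (in the style of Ganin and Lempitsky): the domain discriminator's gradient is flipped before it reaches the feature extractor, which pushes the extractor toward domain-invariant representations. A toy, framework-free sketch of that layer follows; it illustrates the general technique, not the exact AMTLDC implementation:

```python
import numpy as np

class GradientReversal:
    """Toy gradient-reversal layer. The forward pass is the identity, so the
    domain discriminator sees the features unchanged; the backward pass flips
    the gradient and scales it by `lam`, so the feature extractor is trained
    to *confuse* the discriminator rather than help it."""

    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # identity in the forward direction

    def backward(self, grad_output):
        return -self.lam * grad_output  # reversed, scaled gradient

grl = GradientReversal(lam=0.5)
features = np.array([1.0, -2.0, 3.0])
print(grl.forward(features))                       # unchanged features
print(grl.backward(np.array([0.2, -0.4, 0.6])))    # sign-flipped gradients
```

In a full framework this would be a custom autograd operation placed between the shared feature extractor and the discriminator block mentioned in Experiment 6.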
Experiment 4: AMTLDC vs. pre-trained models
We compare the AMTLDC method with methods whose models are pre-trained on the ImageNet dataset. As shown in Table 5, the proposed algorithm achieves higher performance than the other successful methods in this field. The critical point is that the proposed method is trained on the small SARS-CoV-2 CT-scan dataset, while the other methods are often trained on large datasets. Therefore, apart from the qualitative contributions and the proposed innovations, which offer a low-cost and practical solution to the shortcut learning problem (Geirhos et al. 2020), the proposed method achieves significant improvements using only a small set of training samples without suffering from overfitting.
The method presented by Ardakani et al. (2020a) performs slightly better than AMTLDC on the Recall metric; however, it suffers from low reliability: when faced with unseen data, its performance decreases dramatically.
Experiment 5: AMTLDC vs. customized models
This section compares the proposed method with methods that have developed customized architectures specifically to detect COVID-19. In these methods, transfer learning is not used, and the network is trained from scratch.
Table 6 shows the results for AMTLDC and the other compared approaches. As shown in this table, the AMTLDC method has the best results on all metrics. After AMTLDC, Elghamrawy and Hassanien (2020) has the second-best results, and Hasan et al. (2020) also performs relatively well. Among the reported results, Wang et al. (2020c) has the worst performance.
Experiment 6: AMTLDC runtime and comparison with the baseline method
In this section, the runtime of AMTLDC is compared with baseline CNN-based models. The experiments are conducted on a computer with an Intel Core i7-4700HQ CPU at 2.40 GHz and 8 GB of RAM. Table 7 shows a time-complexity analysis in terms of batch training time (seconds). As shown in this table, the runtime of the proposed method is less than that of the larger ResNet-based models, such as ResNet152, ResNet152V2, InceptionResNetV2, ResNet-101, and ResNet101V2. The lowest runtime belongs to MobileNet-v2 and the highest to ResNet152. The runtime of the proposed method is longer than that of some methods because it uses an additional block (the discriminator module) for domain classification; compared to a similar method such as ResNet-50, it therefore has about 6 million more parameters, and the discriminator block increases training time by approximately 7 s. Comparing the proposed method with methods that have lower runtimes, such as VGG-16, shows that although the training time of AMTLDC is not optimal, its accuracy improves significantly: AMTLDC improves accuracy by more than 15% on average compared to VGG-16. The same holds for the other models with lower runtimes.
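The parameter overhead attributed to the discriminator block can be checked directly from the counts in Table 7:

```python
# Parameter counts from Table 7
amtldc_params = 36_441_346    # AMTLDC (ResNet-50 backbone + discriminator block)
resnet50_params = 30_014_529  # plain ResNet-50

# Extra parameters introduced by the domain-classification discriminator
overhead = amtldc_params - resnet50_params
print(f"{overhead:,} (~{overhead / 1e6:.1f} M)")  # 6,426,817 (~6.4 M)
```

This confirms the "about 6 million more parameters" figure stated above; the corresponding runtime gap in Table 7 (49 s vs. 43 s per batch) is the cost of that block.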
Conclusion
In this research, we proposed a multi-source adversarial transfer learning model for the diagnosis of COVID-19, a task in which the generalizability of models is greatly reduced by the lack of data. For the same reason, existing methods perform poorly on unseen data and are therefore unreliable in real-world applications. Thanks to the use of two different sources in the proposed COVID-19 detection framework, AMTLDC ensures that the learned representations are common across datasets and not specific to the domain of a particular dataset. In other words, AMTLDC improves the generalizability and transferability of the model and yields excellent results on unseen data. The performance of AMTLDC was compared with many advanced models. The results showed that AMTLDC has high generalizability and transferability, improving results by up to 50% compared to similar methods. The obtained results also indicate that AMTLDC achieves classification improvements of at least 2%, 18%, 18%, and 9% in Accuracy, Precision, Recall, and F1, respectively, compared with the best results of its competitors, even without directly training on the same data.
Although many methods have been proposed for diagnosing COVID-19 from medical images, and some even report 100% accuracy, they are not highly effective in real-world applications. The main reason is that the datasets used to train the networks do not come from a single imaging device: positive samples were mostly collected from one imaging device, while negative samples were collected from different devices. As a result, such methods learn the specific patterns of an imaging device rather than the patterns and structures of COVID-19 images. Therefore, we believe that the main challenge in this field is still the collection of high-quality, standardized data.
The proposed method uses common loss functions. As a future direction, it would be useful to improve AMTLDC with more discriminative loss functions, such as triplet loss or center loss. These loss functions reduce the distance between samples of the same class in the embedded space, so better-separated representations are learned for classification.
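As an illustration of this suggested direction, here is a minimal triplet-loss sketch in its standard formulation (not part of the current AMTLDC implementation):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: pull same-class embeddings together and push
    different-class embeddings at least `margin` apart (squared Euclidean)."""
    d_pos = np.sum((anchor - positive) ** 2)  # distance to same-class sample
    d_neg = np.sum((anchor - negative) ** 2)  # distance to other-class sample
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])   # anchor embedding
p = np.array([0.1, 0.0])   # same class, already close
n = np.array([2.0, 0.0])   # other class, already far beyond the margin
print(triplet_loss(a, p, n))  # 0.0: this triplet is already well separated
```

During training, minimizing this loss over sampled triplets shrinks within-class distances and enforces a margin between classes, which is exactly the separation property motivating the future-work suggestion.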
Author contributions
All authors have participated in (a) conception and design, or analysis and interpretation of the data; (b) drafting the article or revising it critically for important intellectual content; and (c) approval of the final version.
Data availability
The authors make all datasets underlying the findings described in their manuscript available without restriction.
Declarations
Conflict of interest
The authors have no affiliation with any organization with a direct or indirect financial interest in the subject matter discussed in the manuscript.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- Abd Elaziz M, Al-qaness MAA, Abo Zaid EO, Lu S, Ali Ibrahim R, Ewees AA (2021) Automatic clustering method to segment COVID-19 CT images. PLoS ONE 16(1):e0244416. 10.1371/journal.pone.0244416 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Afshar P, Heidarian S, Naderkhani F, Oikonomou A, Plataniotis KN, Mohammadi A (2020) COVID-CAPS: a capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern Recogn Lett 138:638–643. 10.1016/j.patrec.2020.09.010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ai T et al (2020) Correlation of chest CT and RT-PCR testing for coronavirus disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology 296(2):E32–E40 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Al-Karawi D, Al-Zaidi S, Polus N, Jassim S (2020a) Machine learning analysis of chest CT scan images as a complementary digital test of coronavirus (COVID-19) patients. MedRxiv. 10.1101/2020.04.13.20063479 [Google Scholar]
- Alshazly H, Linse C, Barth E, Martinetz T (2021) Explainable COVID-19 detection using chest ct scans and deep learning. Sensors 21(2):455 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Altae-Tran H, Ramsundar B, Pappu AS, Pande V (2017) Low data drug discovery with one-shot learning. ACS Cent Sci 3(4):283–293 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Amyar A, Modzelewski R, Li H, Ruan S (2020) Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: classification and segmentation. Comput Biol Med 126:104037 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Angelov P, Almeida-Soares E (2020) SARS-CoV-2 CT-scan dataset: a large dataset of real patients CT scans for SARS-CoV-2 identification. MedRxiv. 10.1101/2020.04.24.20078584 [Google Scholar]
- Apostolopoulos ID, Mpesiana TA (2020) COVID-19: automatic detection from X-ray images utilizing transfer learning with convolutional neural networks. Phys Eng Sci Med 43(2):635–640. 10.1007/s13246-020-00865-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ardakani AA, Kanafi AR, Acharya UR, Khadem N, Mohammadi A (2020a) Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: results of 10 convolutional neural networks. Comput Biol Med 121:103795 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ardakani AA, Kanafi AR, Acharya UR, Khadem N, Mohammadi A (2020b) Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: results of 10 convolutional neural networks. Comput Biol Med 121:103795 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Aria M, Hashemzadeh M, Farajzadeh N (2022a) QDL-CMFD: a quality-independent and deep learning-based copy-move image forgery detection method. Neurocomputing 511:213–236. 10.1016/j.neucom.2022.09.017 [Google Scholar]
- Aria M, Nourani E, Golzari Oskouei A (2022b) ADA-COVID: adversarial deep domain adaptation-based diagnosis of COVID-19 from lung CT scans using triplet embeddings. Comput Intell Neurosci 2022:2564022. 10.1155/2022/2564022 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Armato SG III et al (2011) The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med Phys 38(2):915–931 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Armato S III et al (2015) SPIE-AAPM-NCI lung nodule classification challenge dataset. Cancer Imaging Arch 10:K9 [Google Scholar]
- Armato SG III et al (2016) LUNGx Challenge for computerized lung nodule classification. J Med Imaging 3(4):044506 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bar Y, Diamant I, Wolf L, Greenspan H (2015) Deep learning with non-medical training used for chest pathology identification. In: Medical Imaging 2015: Computer-Aided Diagnosis, 2015, vol. 9414: International Society for Optics and Photonics, p. 94140V
- Bayani A et al (2022) Identifying predictors of varices grading in patients with cirrhosis using ensemble learning. Clin Chem Lab Med (CCLM) 60:1938–1945 [DOI] [PubMed] [Google Scholar]
- Bayani A et al (2022) Performance of machine learning techniques on prediction of esophageal varices grades among patients with cirrhosis. Clin Chem Lab Med (CCLM). 10.1515/cclm-2022-0623 [DOI] [PubMed] [Google Scholar]
- Brunese L, Mercaldo F, Reginelli A, Santone A (2020) Explainable deep learning for pulmonary disease and coronavirus COVID-19 detection from X-rays. Comput Methods Prog Biomed 196:105608. 10.1016/j.cmpb.2020.105608 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen J et al (2020) Deep learning-based model for detecting 2019 novel coronavirus pneumonia on high-resolution computed tomography: a prospective study. MedRxiv 10:1–11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Christodoulidis S, Anthimopoulos M, Ebner L, Christe A, Mougiakakou S (2016) Multisource transfer learning with convolutional neural networks for lung pattern analysis. IEEE J Biomed Health Inform 21(1):76–84 [DOI] [PubMed] [Google Scholar]
- Christodoulidis S, Anthimopoulos M, Ebner L, Christe A, Mougiakakou S (2017) Multisource transfer learning with convolutional neural networks for lung pattern analysis. IEEE J Biomed Health Inform 21(1):76–84. 10.1109/JBHI.2016.2636929 [DOI] [PubMed] [Google Scholar]
- Cifci MA (2020) Deep learning model for diagnosis of corona virus disease from CT images
- COVID-19. [Online]. Available: https://radiopaedia.org/. Accessed: 9 Apr 2020
- COVID-19 CT Segmentation Dataset. [Online]. Available: http://medicalsegmentation.com/covid19/. Accessed: 15 Apr 2020
- Dadário AMV, Paiva JPQ, Chate RC, Machado BS, Szarf G (2020) Regarding Artificial Intelligence Distinguishes COVID-19 from Community Acquired Pneumonia on Chest CT. Radiology, p. 201178
- De Bois M, El Yacoubi MA, Ammi M (2021) Adversarial multi-source transfer learning in healthcare: application to glucose prediction for diabetic people. Comput Methods Prog Biomed 199:105874. 10.1016/j.cmpb.2020.105874 [DOI] [PubMed] [Google Scholar]
- Depeursinge A, Vargas A, Platon A, Geissbuhler A, Poletti P-A, Müller H (2012) Building a reference multimedia database for interstitial lung diseases. Comput Med Imaging Graph 36(3):227–238 [DOI] [PubMed] [Google Scholar]
- Dhungel N, Carneiro G, Bradley AP (2017) A deep learning approach for the analysis of masses in mammograms with minimal user intervention. Med Image Anal 37:114–128. 10.1016/j.media.2017.01.009 [DOI] [PubMed] [Google Scholar]
- Elaziz MA et al (2020) An improved marine predators algorithm with fuzzy entropy for multi-level thresholding: real world example of COVID-19 CT image segmentation. IEEE Access 8:125306–125330. 10.1109/access.2020.3007928 [DOI] [PMC free article] [PubMed] [Google Scholar]
- El-Ghamrawy SM (2020) Diagnosis and Prediction Model for COVID19 Patients response to treatment based on convolutional neural networks and whale optimization algorithm using CT images. MedRxiv. 10.1101/2020.04.16.20063990 [Google Scholar]
- Esteva A et al (2017) Dermatologist-level classification of skin cancer with deep neural networks. Nature 542(7639):115–118. 10.1038/nature21056 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Farid AA, Selim GI, Khater HAA (2020a) A novel approach of CT images feature analysis and prediction to screen for corona virus disease (COVID-19)
- Farid AA, Selim GI, Awad H, Khater A (2020b) A novel approach of CT images feature analysis and prediction to screen for corona virus disease (COVID-19). Int J Sci Eng Res 11(3):1–9 [Google Scholar]
- Farooq M, Hafeez A (2020) COVID-ResNet: a deep learning framework for screening of COVID19 from radiographs. arXiv preprint arXiv:2003.14395
- Geirhos R et al (2020) Shortcut learning in deep neural networks. Nat Mach Intell 2(11):665–673 [Google Scholar]
- Ghaderzadeh M, Aria M (2021) Management of COVID-19 detection using artificial intelligence in 2020 pandemic. Presented at the 2021 5th International Conference on Medical and Health Informatics, Kyoto, Japan, 2021. 10.1145/3472813.3472820
- Ghaderzadeh M, Asadi F, Jafari R, Bashash D, Abolghasemi H, Aria M (2021a) Deep convolutional neural network–based computer-aided detection system for COVID-19 using multiple lung scans: design and implementation study. J Med Internet Res 23(4):e27468 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ghaderzadeh M, Aria M, Asadi F (2021b) X-ray equipped with artificial intelligence: changing the COVID-19 diagnostic paradigm during the pandemic. BioMed Res Int 2021:9942873. 10.1155/2021/9942873 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ghaderzadeh M, Aria M, Hosseini A, Asadi F, Bashash D, Abolghasemi H (2022) A fast and efficient CNN model for B-ALL diagnosis and its subtypes classification using peripheral blood smear images. Int J Intell Syst 37(8):5113–5133 [Google Scholar]
- Golzari Oskouei A, Hashemzadeh M (2022) CGFFCM: a color image segmentation method based on cluster-weight and feature-weight learning. Softw Impacts 11:100228. 10.1016/j.simpa.2022.100228 [Google Scholar]
- Golzari Oskouei A, Hashemzadeh M, Asheghi B, Balafar MA (2021a) CGFFCM: cluster-weight and group-local feature-weight learning in Fuzzy C-means clustering algorithm for color image segmentation. Appl Soft Comput 113:108005. 10.1016/j.asoc.2021.108005 [Google Scholar]
- Golzari Oskouei A, Balafar MA, Motamed C (2021b) FKMAWCW: categorical fuzzy k-modes clustering with automated attribute-weight and cluster-weight learning. Chaos Solitons Fractals 153:111494. 10.1016/j.chaos.2021.111494 [Google Scholar]
- Golzari Oskouei A, Balafar MA, Motamed C (2022) EDCWRN: efficient deep clustering with the weight of representations and the help of neighbors. Appl Intell. 10.1007/s10489-022-03895-5 [Google Scholar]
- Gulshan V et al (2016) Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316(22):2402–2410. 10.1001/jama.2016.17216 [DOI] [PubMed] [Google Scholar]
- Hasan AM, Al-Jawad MM, Jalab HA, Shaiba H, Ibrahim RW, Al-Shamasneh AAR (2020) Classification of COVID-19 coronavirus, pneumonia and healthy lungs in CT scans using Q-deformed entropy and deep learning features. Entropy 22(5):517 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hasan M, Alam M, Elahi M, Toufick E, Roy S, Wahid SR (2020) CVR-Net: a deep convolutional neural network for coronavirus recognition from chest radiography images. arXiv preprint arXiv:2007.11993
- Hashemzadeh M, Golzari Oskouei A, Farajzadeh N (2019) New fuzzy C-means clustering method based on feature-weight and cluster-weight learning. Appl Soft Comput 78:324–345. 10.1016/j.asoc.2019.02.038 [Google Scholar]
- He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp 770–778.
- He X et al (2020) Sample-efficient deep learning for COVID-19 diagnosis based on CT scans. Medrxiv 3:034501 [Google Scholar]
- Hemdan EED, Shouman MA, Karar ME (2020) Covidx-net: a framework of deep learning classifiers to diagnose COVID-19 in X-ray images. arXiv preprint arXiv:2003.11055
- Italian Society of Medical and Interventional Radiology : COVID-19 Database. [Online]. Available: https://www.sirm.org. Accessed: 28 Mar 2020
- Jaiswal A, Gianchandani N, Singh D, Kumar V, Kaur M (2020) Classification of the COVID-19 infected patients using DenseNet201 based deep transfer learning. J Biomol Struct Dyn. 10.1080/07391102.2020.1788642 [DOI] [PubMed] [Google Scholar]
- Javaheri T et al (2020) CovidCTNet: an open-source deep learning approach to identify COVID-19 using CT image. arXiv preprint arXiv:2005.03059
- Jin C et al (2020) Development and evaluation of an AI system for COVID-19 diagnosis. medRxiv 2:1 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jin S et al (2020) AI-assisted CT imaging analysis for COVID-19 screening: building and deploying a medical AI system in four weeks. medRxiv. 10.1101/2020.03.19.20039354 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Jokandan AS et al (2020) An uncertainty-aware transfer learning-based framework for COVID-19 diagnosis. arXiv preprint arXiv:2007.14846 [DOI] [PMC free article] [PubMed]
- Kaggle Benchmark Dataset. [Online] Available: https://www.kaggle.com/andrewmvd/covid19-ct-scans. Accessed: 1 Mar 2020
- Li X, Zeng X, Liu B, Yu Y (2020a) COVID-19 infection presenting with CT halo sign. Radiol Cardiothor Imaging 2(1):e200026 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li L et al (2020b) Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT. Radiology
- Li T, Han Z, Wei B, Zheng Y, Hong Y, Cong J (2020c) Robust screening of COVID-19 from chest X-ray via discriminative cost-sensitive learning. arXiv preprint arXiv:2004.12592
- Liu et al (2020) Assisting scalable diagnosis automatically via CT images in the combat against COVID-19. medRxiv [DOI] [PMC free article] [PubMed]
- Long C et al (2020) Diagnosis of the Coronavirus disease (COVID-19): rRT-PCR or CT? Eur J Radiol 126:108961. 10.1016/j.ejrad.2020.108961 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Luz E et al (2021) Towards an effective and efficient deep learning model for COVID-19 patterns detection in X-ray images. Res Biomed Eng 38:1–14 [Google Scholar]
- Martinez AR (2020) Classification of COVID-19 in CT scans using multi-source transfer learning. arXiv preprint arXiv:2009.10474
- Minaee S, Kafieh R, Sonka M, Yazdani S, Jamalipour-Soufi G (2020) Deep-COVID: predicting COVID-19 from chest X-ray images using deep transfer learning. Med Image Anal 65:101794. 10.1016/j.media.2020.101794 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mishra AK, Das SK, Roy P, Bandyopadhyay S (2020) Identifying COVID19 from chest CT images: a deep convolutional neural networks based approach. J Healthc Eng. 10.1155/2020/8843664 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ng M-Y et al (2020) Imaging profile of the COVID-19 infection: radiologic findings and literature review. Radiol Cardiothor Imaging 2(1):e200034 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Panwar H, Gupta P, Siddiqui MK, Morales-Menendez R, Bhardwaj P, Singh V (2020) A deep learning and grad-CAM based color visualization approach for fast detection of COVID-19 cases using chest X-ray and CT-Scan images. Chaos, Solitons Fractals 140:110190 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pereira RM, Bertolini D, Teixeira LO, Silla CN, Costa YMG (2020) COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios. Comput Methods Prog Biomed 194:105532. 10.1016/j.cmpb.2020.105532 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Saqib M, Anwar S, Anwar A, and Blumenstein M (2020) COVID19 detection from Radiographs: Is Deep Learning able to handle the crisis? TechRxiv
- Silva P et al (2020) COVID-19 detection in CT images with deep learning: a voting-based scheme and cross-datasets analysis. Inform Med Unlocked 20:100427. 10.1016/j.imu.2020.100427 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Singh D, Kumar V, Vaishali V, Kaur M (2020a) Classification of COVID-19 patients from chest CT images using multi-objective differential evolution–based convolutional neural networks. Eur J Clin Microbiol Infect Dis 39(7):1379–1389. 10.1007/s10096-020-03901-z [DOI] [PMC free article] [PubMed] [Google Scholar]
- Singh D, Kumar V, Kaur M (2020b) Classification of COVID-19 patients from chest CT images using multi-objective differential evolution–based convolutional neural networks. Eur J Clin Microbiol Infect Dis 39:1–11 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Soares E, Angelov P, Biaso S, Froes MH, Abe DK (2020) SARS-CoV-2 CT-scan dataset: a large dataset of real patients CT scans for SARS-CoV-2 identification. medRxiv. 10.1101/2020.04.24.20078584 [Google Scholar]
- Song Y et al. (2020) Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images. medRxiv [DOI] [PMC free article] [PubMed]
- Song Y et al (2021) Deep learning enables accurate diagnosis of novel coronavirus (COVID-19) with CT images. IEEE/ACM Trans Comput Biol Bioinform. 10.1148/radiol.2020200905 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the Inception Architecture for Computer Vision. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 27–30 June 2016, pp 2818–2826, doi: 10.1109/CVPR.2016.308
- Tabik S et al (2020) COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on chest X-ray images. IEEE J Biomed Health Inform 24(12):3595–3605 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tartaglione E, Barbano CA, Berzovini C, Calandri M, Grangetto M (2020) Unveiling COVID-19 from chest X-ray with deep learning: a hurdles race with small data. Int J Environ Res Public Health 17(18):6933 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang D, Khosla A, Gargeya R, Irshad H, Beck AH (2016) Deep learning for identifying metastatic breast cancer. arXiv preprint arXiv:1606.05718
- Wang F, Casalino LP, Khullar D (2019) Deep learning in medicine—promise, progress, and challenges. JAMA Intern Med 179(3):293–294. 10.1001/jamainternmed.2018.7117 [DOI] [PubMed] [Google Scholar]
- Wang L, Lin ZQ, Wong A (2020a) COVID-net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci Rep 10(1):1–12 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang Z, Liu Q, Dou Q (2020b) Contrastive cross-site learning with redesigned net for COVID-19 ct classification. IEEE J Biomed Health Inform 24(10):2806–2813 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wang et al (2020c) A deep learning algorithm using CT images to screen for corona virus disease (COVID-19). medRxiv
- Wang S et al (2021) A deep learning algorithm using CT images to screen for corona virus disease (COVID-19). Eur Radiol 31(8):6096–6104. 10.1007/s00330-021-07715-1
- Worldometer (2021) Real time world statistics. https://www.worldometers.info/. Accessed 13 Aug 2021
- Wu X et al (2020) Deep learning-based multi-view fusion model for screening 2019 novel coronavirus pneumonia: a multicentre study. Eur J Radiol 128:109041
- Xu X et al (2020) A deep learning system to screen novel coronavirus disease 2019 pneumonia. Engineering 6:1122–1129
- Yazdani S, Minaee S, Kafieh R, Saeedizadeh N, Sonka M (2020) COVID CT-net: predicting COVID-19 from chest CT images using attentional convolutional network. arXiv preprint arXiv:2009.05096
- Yousefzadeh M et al (2020) AI-corona: radiologist-assistant deep learning framework for COVID-19 diagnosis in chest CT scans. medRxiv
- Yousri D, Abd Elaziz M, Abualigah L, Oliva D, Al-Qaness MAA, Ewees AA (2021) COVID-19 X-ray images classification based on enhanced fractional-order cuckoo search optimizer using heavy-tailed distributions. Appl Soft Comput 101:107052. 10.1016/j.asoc.2020.107052
- Zhao J, Zhang Y, He X, Xie P (2020) COVID-CT-Dataset: a CT scan dataset about COVID-19. arXiv preprint arXiv:2003.13865
- Zheng C et al (2020) Deep learning-based detection for COVID-19 from chest CT using weak label. medRxiv
- Zhou M et al (2020) Improved deep learning model for differentiating novel coronavirus pneumonia and influenza pneumonia. medRxiv
Data Availability Statement
All datasets underlying the findings described in this manuscript are available without restriction.