Abstract
(Aim) The COVID-19 pandemic has caused a massive death toll to date. Chest CT is an effective imaging modality for making an accurate diagnosis. (Method) This article proposes a novel seven-layer convolutional neural network based smart diagnosis model for COVID-19 diagnosis (7L-CNN-CD). We propose a 14-way data augmentation to enhance the training set, and introduce stochastic pooling to replace traditional pooling methods. (Results) Ten runs of 10-fold cross validation show that our 7L-CNN-CD approach achieves a sensitivity of 94.44±0.73%, a specificity of 93.63±1.60%, and an accuracy of 94.03±0.80%. (Conclusion) The proposed 7L-CNN-CD is effective in diagnosing COVID-19 in chest CT images and gives better performance than several state-of-the-art algorithms. The data augmentation and stochastic pooling methods are shown to be effective.
Keywords: Convolutional neural network, data augmentation, deep learning, stochastic pooling, COVID-19
I. Introduction
COVID-19 (also known as coronavirus disease 2019) was declared a Public Health Emergency of International Concern on 30 January 2020 and a pandemic on 11 March 2020.
As of 2 September 2020, the COVID-19 pandemic had caused 25.8 million confirmed cases and 858.2 thousand deaths (US 187.4k deaths, Brazil 122.5k, India 66.3k, Mexico 65.2k, UK 41.5k, etc.).
The global economy has suffered negative effects from COVID-19. For example, Balsalobre-Lorente, et al. [1] analyzed the consequences of COVID-19 for the social isolation of the Chinese economy. Chaudhary, et al. [2] presented reflections for policy and programme design on the effect of COVID-19 on the economy in India.
Two prevailing diagnosis methods are available. One is viral testing via a nasopharyngeal swab for the presence of viral RNA fragments [3]. The other is imaging, among which chest computed tomography (CCT) [4] is one of the imaging modalities providing the highest sensitivity. A CCT scanner uses an X-ray generator and X-ray sensors that rotate around the subject.
The main biomarkers in CCT differentiating COVID-19 from healthy people are asymmetric peripheral ground-glass opacities (GGOs) without pleural effusions [5]. This study collects such CCT slices.
However, manual interpretation by radiologists is tedious and easily influenced by inter-expert and intra-expert factors (such as fatigue and emotion). Smart diagnosis systems based on computer vision and artificial intelligence can benefit patients, radiologists, experts, and hospitals. Traditional artificial intelligence (AI) and modern deep learning (DL) methods have achieved excellent results in analyzing medical images. For example, Lu [6] proposed a radial-basis-function neural network (RBFNN) to detect pathological brains. Yang [7] presented a kernel-based extreme learning classifier (K-ELM) to create a novel pathological brain detection system; their method was robust and effective. Lu [8] proposed a novel extreme learning machine trained by the bat algorithm (ELM-BA). Li and Liu [9] introduced real-coded biogeography-based optimization (RCBBO) to detect diseased brains. Jiang [10] used a six-layer convolutional neural network (6L-CNN) to recognize sign-language fingerspelling. Szegedy, et al. [11] presented GoogLeNet. Yu and Wang [12] suggested the use of ResNet18 for mammogram abnormality detection. Furthermore, some smart health systems have gained success in emotion-aware security [13], authentication [14], and IoT [15].
We propose a novel seven-layer convolutional neural network for COVID-19 diagnosis (7L-CNN-CD). To improve its performance, two improvements are introduced in this study: (i) a 14-way data augmentation (DA-14) is proposed; (ii) stochastic pooling is introduced to replace traditional pooling methods.
II. Dataset
Image acquisition configuration and method: a Philips Ingenuity 64-row spiral CT machine was used (tube voltage 120 kV, 240 mAs, layer thickness 3 mm, layer spacing 3 mm, pitch 1.5; lung window W: 1500 HU, L: −500 HU; mediastinum window W: 350 HU, L: 60 HU). Thin-layer reconstruction was performed according to the lesion display, with layer thickness and layer spacing of 1 mm for the lung window images. Patients were placed in a supine position, held their breath after deep inspiration, and were conventionally scanned from the lung apex to the costophrenic angle.
For each subject, 1-4 slices were chosen using a slice level selection (SLS) method: for COVID-19 pneumonia patients, the slice showing the largest size and number of lesions was selected; for normal subjects, any slice level could be selected. The resolution of all images is 1024 × 1024 × 3 (cf. Table II). Table I shows the demographics, where HC means healthy control.
TABLE I. Demographics of Subjects Used in This Study.
| Class | No. of subjects (m/f) | No. of images | Age range |
|---|---|---|---|
| COVID-19 | 142 (95/47) | 320 | 22-91 |
| HC | 142 (88/54) | 320 | 21-76 |
When there were differences between the two analyses, a superior doctor was consulted to reach a consensus. Suppose $C$ denotes a CCT image scan, $B(j)$ denotes the labelling of the $j$-th individual expert, and the final labelling $b$ is obtained by

$$b(C) = \mathrm{MV}\{B(j \mid C),\ j = 1, 2, 3\} \tag{1.a}$$

where MV denotes majority voting, and $\{B(j), j = 1, 2, 3\}$ represents the labellings of all three experts.
III. Methodology
Table VIII shows the abbreviations and their full names, for ease of understanding our methodology.
TABLE VIII. Abbreviation List.
| Abbreviation | Meaning |
|---|---|
| MV | majority voting |
| SLS | slice level selection |
| HC | healthy control |
| CCT | chest computed tomography |
| DS | downsampling |
| HS | histogram stretching |
| CR | compression ratio |
| DA | data augmentation |
| (A)(M)(S)(L2)P | (average) (max) (stochastic) ($\ell_2$-norm) pooling |
| MCC | Matthews correlation coefficient |
| FMI | Fowlkes–Mallows index |
A. Preprocessing
The original dataset, containing 320 COVID-19 images and 320 HC images, is symbolized as $S$, in which each image is symbolized as $s(i)$, $i = 1, \ldots, 640$. We have

$$S = \{s(i)\}, \quad i = 1, \ldots, 640$$

Figure 1(a) shows a raw COVID-19 CCT image. Figure 1(b) shows the flowchart of our preprocessing procedure. First, we converted all color images to grayscale by reserving only the luminance information: there is no need to store a grayscale image in three color channels, and directly inputting original RGB images to the neural network would increase the computational burden. Thus, we get the grayscale image set $S_G = \{g(i)\}$ as

$$g(i) = \mathcal{G}[s(i)] \tag{3}$$

where $\mathcal{G}$ means the grayscale operation.
Fig. 1.
Preprocessing on raw dataset.
Second, the histogram stretching (HS) method was used to increase every slice's contrast. For the $i$-th image $g(i)$, we first calculate its minimum grayscale value $g_{\min}(i)$ and maximum grayscale value $g_{\max}(i)$, respectively, by

$$g_{\min}(i) = \min_{x, y} g(i \mid x, y), \qquad g_{\max}(i) = \max_{x, y} g(i \mid x, y)$$

where $(x, y)$ means the coordinates of a pixel of the image $g(i)$. The new histogram-stretched image $h(i)$ is obtained by

$$h(i \mid x, y) = \frac{g(i \mid x, y) - g_{\min}(i)}{g_{\max}(i) - g_{\min}(i)} \tag{5.a}$$

In all, we get the histogram-stretched image set $S_H = \{h(i)\}$ as above.
Third, we crop the images to remove the text in the margin areas and the checkup bed in the bottom area. Thus, we get the cropped dataset $S_C = \{c(i)\}$ as

$$c(i) = \mathrm{Crop}[h(i) \mid \text{top}, \text{bottom}, \text{left}, \text{right}] \tag{6.a}$$

where Crop represents the crop operation, and the four crop variables top, bottom, left, and right mean the numbers of pixels to be removed on each side. In this study, all their values equal 150, so the size of each image is reduced from 1024 × 1024 to 724 × 724.
Fourth, we downsampled each image to size 256 × 256, and we now get the resized image set $S_D = \{d(i)\}$ as

$$d(i) = \mathrm{DS}[c(i) \mid 256, 256] \tag{7}$$

where DS means the downsampling function and $d(i)$ is the downsampled version of the image $c(i)$.
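The four preprocessing steps can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the luminance weights and the nearest-neighbour downsampling are our assumptions.

```python
import numpy as np

def preprocess(img_rgb, crop=150, out_size=256):
    """Sketch of the four steps: grayscale, histogram stretching,
    crop, and downsampling (nearest-neighbour for brevity)."""
    # 1) Grayscale: keep only luminance (ITU-R BT.601 weights assumed).
    g = (0.299 * img_rgb[..., 0] + 0.587 * img_rgb[..., 1]
         + 0.114 * img_rgb[..., 2])
    # 2) Histogram stretching: map [g_min, g_max] linearly onto [0, 1].
    lo, hi = g.min(), g.max()
    h = (g - lo) / (hi - lo)
    # 3) Crop: remove `crop` pixels from top, bottom, left, and right.
    c = h[crop:-crop, crop:-crop]
    # 4) Downsample to out_size x out_size by index subsampling.
    rows = np.linspace(0, c.shape[0] - 1, out_size).astype(int)
    cols = np.linspace(0, c.shape[1] - 1, out_size).astype(int)
    return c[np.ix_(rows, cols)]

x = np.random.rand(1024, 1024, 3)   # stand-in for one raw 1024x1024 RGB slice
y = preprocess(x)
print(y.shape)                      # (256, 256)
# Compression ratio: 256*256 vs 1024*1024*3 pixels, about 2.08 %
print(round(256 * 256 / (1024 * 1024 * 3) * 100, 2))
```

The 2.08% figure matches the compression ratio reported for Table II.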
Table II compares the size and storage of each image $s(i)$, $g(i)$, $h(i)$, $c(i)$, $d(i)$ at every preprocessing step. After the whole preprocessing procedure, each image costs only about 2.08% of its original storage. The compression ratio (CR) of the $i$-th image from the final stage $d(i)$ to the original stage $s(i)$ is calculated by the following equation:

$$\mathrm{CR}(i) = \frac{|d(i)|}{|s(i)|} = \frac{256 \times 256}{1024 \times 1024 \times 3} \approx 2.08\%$$
TABLE II. Image Size and Storage per Image at Each Preprocessing Step.
| Preprocess | Symbol | Size (per image) | Storage (per image) |
|---|---|---|---|
| Original | $s(i)$ | 1024 × 1024 × 3 | 12,582,912 |
| Grayscale | $g(i)$ | 1024 × 1024 | 4,194,304 |
| HS | $h(i)$ | 1024 × 1024 | 4,194,304 |
| Crop | $c(i)$ | 724 × 724 | 2,096,704 |
| DS | $d(i)$ | 256 × 256 | 262,144 |
We can see that the storage CR equals the size CR for any $i$-th image. Figure 2 shows two samples from the preprocessed dataset $S_D$.
Fig. 2.
Two samples of the preprocessed dataset $S_D$.
B. Improvement I: Data Augmentation
Generally, the CCT image set faces the small-size dataset (SSD) and lack-of-generalization (LoG) problems. To break the curse of SSD and LoG, there are four possible types of solutions: (i) data generation (DG); (ii) regularization approaches (RA); (iii) ensemble approaches (EA); and (iv) data augmentation (DA). All of DG, RA, EA, and DA are effective in handling the SSD and LoG problems.
We propose a 14-way DA method, as shown in Figure 3. We use the 10-fold cross validation technique: the preprocessed CCT image set $S_D$ is split into ten folds, nine of which form the training set and the remaining one the test set,

$$|S_{\text{train}}| = \frac{9}{10} |S_D|, \qquad |S_{\text{test}}| = \frac{1}{10} |S_D|$$

where $|\cdot|$ means the cardinality of a set. For ease of reading, we ignore the run index in the following text and assume the training set contains $N$ images:

$$S_{\text{train}} = \{d(n)\}, \quad n = 1, \ldots, N$$

For each image $d(n)$, we define the 14 different DA operations below.
Fig. 3.
Illustration of our DA-14.
1). Rotation:
The rotation angle takes values from −30° to 30° in steps of 2°, skipping 0°, since 0° corresponds to the original image $d(n)$:

$$d^{1}(n \mid k) = \mathrm{Rot}[d(n) \mid \theta_k], \quad k = 1, \ldots, 30$$

where the rotation factor vector $\theta$ is defined as

$$\theta = (-30°, -28°, \ldots, -2°, 2°, \ldots, 28°, 30°)$$
2). Scaling:
All training CCT images were scaled with scaling factors varying from 0.7 to 1.3 in steps of 0.02, skipping the value 1:

$$d^{2}(n \mid k) = \mathrm{Scale}[d(n) \mid f_k], \quad k = 1, \ldots, 30$$

where the scaling factor vector $f$ is defined as

$$f = (0.70, 0.72, \ldots, 0.98, 1.02, \ldots, 1.28, 1.30)$$
3). Noise Injection (NI):
Zero-mean, 0.01-variance Gaussian noise was added to all CCT training images to produce 30 new noised images:

$$d^{3}(n \mid k) = d(n) + n_k, \quad n_k \sim \mathcal{N}(0, 0.01), \quad k = 1, \ldots, 30$$

The values 0 and 0.01 are the default mean and variance of Gaussian noise, respectively.
4). Random Translation (RT):
Each CCT image $d(n)$ was translated 30 times with a random horizontal shift $t_x$ and a random vertical shift $t_y$:

$$d^{4}(n \mid k) = \mathrm{Trans}[d(n) \mid t_x(k), t_y(k)], \quad k = 1, \ldots, 30$$

where the values of $t_x$ and $t_y$ lie in the range [−15, 15] and obey the uniform distribution $\mathcal{U}(-15, 15)$.
5). Gamma Correction (GC):
GC helps adjust the contrast of the original image [16]. The GC factor varied from 0.4 to 1.6 in steps of 0.04, skipping the value 1:

$$d^{5}(n \mid k) = [d(n)]^{\gamma_k}, \quad k = 1, \ldots, 30$$

where the values of $\gamma$ are chosen as

$$\gamma = (0.40, 0.44, \ldots, 0.96, 1.04, \ldots, 1.56, 1.60)$$
6). Horizontal Shear Transform (HST):
We generate 30 horizontal shear transform (HST) images as

$$d^{6}(n \mid k) = \mathrm{HST}[d(n) \mid a_k], \quad k = 1, \ldots, 30$$

where the HST factors are assigned from −0.15 to 0.15 in steps of 0.01, skipping the value 0:

$$a = (-0.15, -0.14, \ldots, -0.01, 0.01, \ldots, 0.14, 0.15)$$
7). Vertical Shear Transform (VST):
Similarly, we generate 30 vertical shear transform (VST) images as below; the values of the VST factor vector are the same as those of the HST factor vector $a$:

$$d^{7}(n \mid k) = \mathrm{VST}[d(n) \mid a_k], \quad k = 1, \ldots, 30$$
8). Mirror:
The original image $d(n)$ is mirrored and we obtain a new image $d^{M}(n)$. Suppose $\mathrm{Mir}$ is the mirror function; we have

$$d^{M}(n) = \mathrm{Mir}[d(n)]$$

and we define the following operations on the augmented images:

$$d^{7+k'}(n) = \mathrm{Mir}[d^{k'}(n)], \quad k' = 1, \ldots, 7$$
9). Concatenation:
All of the first seven DA results are concatenated, and we have

$$D(n) = \mathrm{Cat}[d^{1}(n), d^{2}(n), \ldots, d^{7}(n)]$$

where Cat means concatenation. The size of $D(n)$ is $7 \times 30 = 210$ images; the results of DA techniques 8–14 are then their mirrored versions:

$$D^{M}(n) = \mathrm{Mir}[D(n)]$$

Finally, one original image $d(n)$ yields $1 + 210 + 210 = 421$ images (containing itself) in the enhanced training set:

$$S^{E}(n) = \{d(n)\} \cup D(n) \cup D^{M}(n) \tag{27}$$
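A minimal sketch of the bookkeeping behind DA-14, plus two of the base operations (noise injection and gamma correction) in plain NumPy. Rotation, scaling, translation, and shear would need an image-processing library and are omitted; the 421-per-image count follows from the Table III figures.

```python
import numpy as np

# Each training image yields itself plus 7 DA ways x 30 variants, and the
# mirrored version of each of those 210 variants (ways 8-14).
per_image = 1 + 2 * 7 * 30
print(per_image, 576 * per_image)   # 421 images each; 242,496 in total

rng = np.random.default_rng(0)
img = rng.random((256, 256))        # stand-in for one preprocessed slice

# Noise injection: 0-mean, 0.01-variance Gaussian noise, clipped to [0, 1].
noised = np.clip(img + rng.normal(0.0, np.sqrt(0.01), img.shape), 0.0, 1.0)

# Gamma correction with the 30 factors 0.4..1.6 (step 0.04, skipping 1).
gammas = [g for g in np.round(np.arange(0.4, 1.61, 0.04), 2) if g != 1.0]
corrected = [img ** g for g in gammas]
print(len(gammas))                  # 30

# Ways 8-14: the horizontal mirror of any augmented image.
mirrored = noised[:, ::-1]
```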
C. Improvement 2: Stochastic Pooling
In traditional CNNs, the activation maps (AMs) are usually too large [17] (i.e., contain too many features), which causes (i) overfitting of the training and (ii) large computational costs. Thus, pooling layers (PLs) are frequently used to reduce the size of AMs. Besides, PLs help guarantee invariance-to-translation characteristics. There exist three generally used pooling techniques: (i) $\ell_2$-norm pooling (L2P); (ii) average pooling (AP); and (iii) max pooling (MP). Assume pooling is a function that maps a region $R$ of the activation map to a single output value.

L2P calculates the $\ell_2$ norm [18] of a given region $R$. Suppose $R$ contains activations $a_m$, $m = 1, \ldots, |R|$. In this study, we add a constant 1/4 (the inverse of the region size for 2 × 2 pooling) under the square root to make the output easier to compare with other pooling methods; this constant does not influence training and inference. The L2P output $y_{\mathrm{L2P}}$ is defined as

$$y_{\mathrm{L2P}} = \sqrt{\frac{1}{4} \sum_{m=1}^{|R|} a_m^2}$$
The AP [19] calculates the mean value of region $R$:

$$y_{\mathrm{AP}} = \frac{1}{|R|} \sum_{m=1}^{|R|} a_m$$

Finally, MP picks out the maximal value from region $R$:

$$y_{\mathrm{MP}} = \max_{m} a_m$$
Figure 4 showcases the differences among the four pooling methods, where we assume both pooling size and pooling stride equal 2. Observe the top-left region $R$: its vectorization gives the four activations on which L2P, AP, and MP are calculated as defined above.
Fig. 4.
A toy example of four pooling techniques (L2P = $\ell_2$-norm pooling; AP = average pooling; MP = max pooling; SP = stochastic pooling).
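For a quick numerical illustration of L2P, AP, and MP, consider a hypothetical 2 × 2 region (these values are invented for illustration; Figure 4's numbers are not reproduced here):

```python
import numpy as np

# Hypothetical 2x2 pooling region, vectorized (pooling size = stride = 2).
R = np.array([2.0, 4.0, 6.0, 8.0])

l2p = np.sqrt(np.mean(R ** 2))  # 1/4 under the root for a 2x2 region
ap = np.mean(R)                 # average pooling
mp = np.max(R)                  # max pooling
print(round(l2p, 2), ap, mp)    # 5.48 5.0 8.0
```

Note that L2P (≈5.48) sits between AP (5.0) and MP (8.0): the squaring emphasizes strong activations without discarding the rest.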
SP was invented to conquer the problems caused by the aforementioned three pooling methods. L2P and AP do not work well because all pixels in $R$ are considered, so they can dilute strong activations amid surrounding near-zero pixels. On the other hand, MP avoids this obstruction, but it easily overfits the training set and causes the LoG problem.
Instead of computing the $\ell_2$ norm, average value, or max value, the output of SP, $y_{\mathrm{SP}}$, is attained via sampling from a multinomial distribution [20] formed from the activations of each element in region $R$ [21]:

- (1) Reckon the probability $p_m$ of each element $a_m$ in $R$:

$$p_m = \frac{a_m}{\sum_{j=1}^{|R|} a_j} \tag{34}$$

- (2) Select a location $l$ within $R$ in accordance with the probabilities $p$, scanning $R$ from top to bottom and left to right [22].
- (3) The output is the value at location $l$: $y_{\mathrm{SP}} = a_l$.
We use the first block $R$ in Figure 4 as an instance. The calculation procedure of SP is as follows: first compute the sum of the activations in $R$, then the probability map $p$ by Eq. (34); using the probability map $p$, we randomly select a position $l$ with its associated probability. In Figure 4, the output $y_{\mathrm{SP}}$ of SP at region $R$ is 6. Instead of considering only the max value or all the elements in the region, SP can randomly select non-maximal activations within the region $R$.
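The three SP steps above can be sketched in NumPy. The 2 × 2 region values here are hypothetical, not those of Figure 4:

```python
import numpy as np

def stochastic_pool(region, rng):
    """Stochastic pooling over one region: sample a location with
    probability proportional to its (non-negative) activation."""
    a = region.ravel()                 # scan top-to-bottom, left-to-right
    p = a / a.sum()                    # step 1: probability map
    loc = rng.choice(len(a), p=p)      # step 2: sample a location
    return a[loc]                      # step 3: output that activation

rng = np.random.default_rng(0)
R = np.array([[2.0, 4.0], [6.0, 8.0]])  # hypothetical 2x2 region
samples = [stochastic_pool(R, rng) for _ in range(10000)]
# Empirically, activation 8 is drawn with probability ~ 8/20 = 0.4.
print(round(samples.count(8.0) / len(samples), 2))
```

At inference time, stochastic pooling is usually replaced by its probability-weighted average so the output is deterministic.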
D. Measures and Indicators
We set a 10-fold cross validation on the whole dataset $S_D$. Each fold contains 32 COVID-19 images and 32 HC images. Within each trial, the training set contains 288 + 288 = 576 images, and the test set contains 32 + 32 = 64 images. After combining all 10 trials, the test set covers all 640 images. The above 10-fold cross validation is run 10 times, so the final report is based on 10 × 640 = 6,400 test images. Table III shows the split setting of our dataset.
TABLE III. Split Setting of Our Dataset.
| Set | Percentage | COVID-19 | HC | Total |
|---|---|---|---|---|
| Training | 90% | 288 | 288 | 576 |
| DA Training | | 121,248 | 121,248 | 242,496 |
| Test | 10% | 32 | 32 | 64 |
| Total | 100% | 320 | 320 | 640 |
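A stratified split matching Table III can be sketched as follows. The fold-construction details are our assumption; the paper does not specify its splitting code:

```python
import numpy as np

# One run of the 10-fold split: 320 COVID-19 and 320 HC images,
# stratified so every fold holds 32 + 32 = 64 images.
rng = np.random.default_rng(0)
covid = rng.permutation(np.arange(320))        # COVID-19 image indices
hc = rng.permutation(np.arange(320, 640))      # HC image indices
folds = [np.concatenate(pair)
         for pair in zip(np.array_split(covid, 10), np.array_split(hc, 10))]

for t in range(10):                            # t is the trial index
    test_idx = folds[t]                        # 64 test images
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != t])
    assert len(test_idx) == 64 and len(train_idx) == 576
```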
The proposed seven-layer convolutional neural network for COVID-19 diagnosis (7L-CNN-CD) is tested by 10 runs of 10-fold cross validation. Suppose the ideal confusion matrix over the test set at the $t$-th trial of the $r$-th run is

$$E^{\mathrm{ideal}}(r, t) = \begin{bmatrix} 32 & 0 \\ 0 & 32 \end{bmatrix}$$

where the value 32 can be found in the test row of Table III: it is the number of COVID-19 cases and the number of HC cases in the test set. After running through trials 1–10, we get the ideal confusion matrix of a one-run 10-fold CV as

$$E^{\mathrm{ideal}}(r) = \sum_{t=1}^{10} E^{\mathrm{ideal}}(r, t) = \begin{bmatrix} 320 & 0 \\ 0 & 320 \end{bmatrix}$$

In realistic inference, we cannot get this perfect diagonal matrix in which all off-diagonal elements are zero. Suppose the confusion matrix at the $r$-th run is

$$E(r) = \begin{bmatrix} \mathrm{TP}(r) & \mathrm{FN}(r) \\ \mathrm{FP}(r) & \mathrm{TN}(r) \end{bmatrix} \tag{38}$$

Note $\mathrm{TP}(r) + \mathrm{FN}(r) = 320$ and $\mathrm{FP}(r) + \mathrm{TN}(r) = 320$ in this study. Here TP and TN represent true positives and true negatives, respectively; the positive class (P) is COVID-19, and the negative class (N) is healthy control. FN and FP represent false negatives and false positives, respectively. We can define four simple measures as

$$\mathrm{Sen} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \quad \mathrm{Spc} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \quad \mathrm{Prc} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}}, \quad \mathrm{Acc} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}} \tag{39.a}$$
Three advanced measures are defined below. The F1 score is

$$F_1 = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \tag{40}$$
The Matthews correlation coefficient (MCC) is defined as

$$\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{41}$$
The Fowlkes–Mallows index (FMI) is defined as

$$\mathrm{FMI} = \sqrt{\mathrm{Prc} \times \mathrm{Sen}} \tag{42}$$
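As a concrete check of these definitions, the sketch below computes all seven measures from a hypothetical single-run confusion matrix (the TP/FN/FP/TN values are invented for illustration; they are not the paper's results):

```python
import numpy as np

# Hypothetical confusion matrix for one run: 320 COVID-19 (positive)
# and 320 HC (negative) test images across the 10 trials.
TP, FN, FP, TN = 302, 18, 20, 300

sen = TP / (TP + FN)                      # sensitivity (recall)
spc = TN / (TN + FP)                      # specificity
prc = TP / (TP + FP)                      # precision
acc = (TP + TN) / (TP + TN + FP + FN)     # accuracy
f1 = 2 * TP / (2 * TP + FP + FN)          # F1 score
mcc = (TP * TN - FP * FN) / np.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
fmi = np.sqrt(prc * sen)                  # Fowlkes-Mallows index
print([round(v * 100, 2) for v in (sen, spc, prc, acc, f1, mcc, fmi)])
```

For a balanced test set like this one, FMI and F1 track each other closely, while MCC penalizes errors in both classes more sharply.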
After combining the 10 runs, we can calculate the mean and standard deviation (SD) of each of the seven measures as

$$\mu(m) = \frac{1}{10} \sum_{r=1}^{10} m(r), \qquad \sigma(m) = \sqrt{\frac{1}{10} \sum_{r=1}^{10} \left[ m(r) - \mu(m) \right]^2} \tag{43.a}$$
E. Proposed 7L-CNN-CD Algorithm
Figure 5 presents the structure of the proposed 7-layer CNN (7L-CNN). After training, the network used to diagnose COVID-19 is called 7L-CNN-CD. The sizes of the activation maps are labelled at each cube in Figure 5. Table IV shows the pseudocode of our 7L-CNN-CD model. We divide the algorithm into two phases: (I) preprocessing and (II) 10 runs of 10-fold cross validation.
Fig. 5.
Structure of proposed 7-layer CNN.
TABLE IV. Pseudocode of Our 7L-CNN-CD Model.
| Input: original image set $S$ |
|---|
| Ground truth: $b$, obtained from two junior and one senior radiologists. See Eq. (1.a). |
| Phase I: Preprocessing |
| Grayscale: $S \to S_G$. See Eq. (3). |
| Histogram stretching: $S_G \to S_H$. See Eq. (5.a). |
| Image crop: $S_H \to S_C$. See Eq. (6.a). |
| Downsampling: $S_C \to S_D$. See Eq. (7). |
| Phase II: 10 runs of 10-fold cross validation |
| for $r = 1:10$ % $r$ is the run index |
| Randomly split the preprocessed set $S_D$ into 10 folds. |
| for $t = 1:10$ % $t$ is the trial index |
| Step II.A: Training & test set. The test set is chosen as the $t$-th fold; the training set is chosen as the other nine folds. |
| Enhanced training set: apply DA-14 to the training set. See Eq. (27). |
| Step II.B: Create initial CNN model. Create an initial deep network via the 7L-CNN model; use SP to replace all pooling layers in the 7L-CNN model. See Eq. (34). |
| Step II.C: Trained 7L-CNN-CD model. Train the 7L-CNN network using the enhanced training set and ground truth $b$, yielding the trained model. |
| Step II.D: Confusion matrix performance. Obtain the test prediction of the trained model; the confusion matrix $E(r, t)$ is obtained by comparing the test prediction and the ground truth. |
| end |
| Summarize all 10 trials to get $E(r)$. See Eq. (38). |
| Calculate the seven measures. See Eqs. (39.a)–(42). |
| end |
| Output the mean and SD of all measures. See Eq. (43.a). |
IV. Results and Discussion
A. Result of Data Augmentation
Suppose $d(n)$ is Figure 2(a); Figure 6 shows the DA(1-7) results. Due to the page limit, their horizontally mirrored results DA(8-14) are not presented in this article. In particular, we only show 15 of the 30 newly generated images per DA technique.
Fig. 6.
Half of the DA(1–7) results.
Figure 6(a) presents the 15 rotated new images. Figure 6(b-e) present 15 scaled, 15 noise-injected, 15 randomly translated, and 15 Gamma-corrected images, respectively. Figure 6(f-g) present the 15 HST and 15 VST new images, respectively.
B. SP Compared With Other Three Pooling Methods
The results of SP against the other three pooling methods are presented in Table V, which indicates that SP obtained the best sensitivity, accuracy, F1, MCC, and FMI. The definitions of the seven measures can be found in Eqs. (39.a)–(42).
TABLE V. Ten Runs of Different Pooling Methods.
| L2P | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| 1 | 90.63 | 94.69 | 94.46 | 92.66 | 92.50 | 85.38 | 92.52 |
| 2 | 91.56 | 92.81 | 92.72 | 92.19 | 92.14 | 84.38 | 92.14 |
| 3 | 92.50 | 94.06 | 93.97 | 93.28 | 93.23 | 86.57 | 93.23 |
| 4 | 93.13 | 93.75 | 93.71 | 93.44 | 93.42 | 86.88 | 93.42 |
| 5 | 92.19 | 93.44 | 93.35 | 92.81 | 92.77 | 85.63 | 92.77 |
| 6 | 91.56 | 93.44 | 93.31 | 92.50 | 92.43 | 85.01 | 92.43 |
| 7 | 93.13 | 94.38 | 94.30 | 93.75 | 93.71 | 87.51 | 93.71 |
| 8 | 93.75 | 91.56 | 91.74 | 92.66 | 92.74 | 85.33 | 92.74 |
| 9 | 93.13 | 95.31 | 95.21 | 94.22 | 94.15 | 88.46 | 94.16 |
| 10 | 93.75 | 93.75 | 93.75 | 93.75 | 93.75 | 87.50 | 93.75 |
| M+SD | 92.53±1.04 | 93.72±1.04 | 93.65±0.96 | 93.13±0.66 | 93.08±0.67 | 86.27±1.31 | 93.09±0.66 |
| AP | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
| 1 | 91.25 | 94.38 | 94.19 | 92.81 | 92.70 | 85.67 | 92.71 |
| 2 | 91.88 | 94.06 | 93.93 | 92.97 | 92.89 | 85.96 | 92.90 |
| 3 | 92.50 | 92.19 | 92.21 | 92.34 | 92.36 | 84.69 | 92.36 |
| 4 | 92.81 | 94.69 | 94.59 | 93.75 | 93.69 | 87.52 | 93.70 |
| 5 | 92.81 | 95.00 | 94.89 | 93.91 | 93.84 | 87.83 | 93.84 |
| 6 | 91.25 | 92.50 | 92.41 | 91.88 | 91.82 | 83.76 | 91.83 |
| 7 | 92.50 | 92.50 | 92.50 | 92.50 | 92.50 | 85.00 | 92.50 |
| 8 | 93.44 | 95.31 | 95.22 | 94.38 | 94.32 | 88.77 | 94.33 |
| 9 | 92.81 | 94.38 | 94.29 | 93.59 | 93.54 | 87.20 | 93.55 |
| 10 | 95.63 | 94.38 | 94.44 | 95.00 | 95.03 | 90.01 | 95.03 |
| M+SD | 92.69±1.25 | 93.94±1.12 | 93.87±1.09 | 93.31±0.98 | 93.27±0.99 | 86.64±1.96 | 93.27±0.99 |
| MP | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
| 1 | 94.69 | 95.31 | 95.28 | 95.00 | 94.98 | 90.00 | 94.98 |
| 2 | 92.19 | 92.81 | 92.77 | 92.50 | 92.48 | 85.00 | 92.48 |
| 3 | 94.69 | 94.69 | 94.69 | 94.69 | 94.69 | 89.38 | 94.69 |
| 4 | 93.75 | 92.81 | 92.88 | 93.28 | 93.31 | 86.57 | 93.31 |
| 5 | 92.50 | 94.38 | 94.27 | 93.44 | 93.38 | 86.89 | 93.38 |
| 6 | 95.31 | 91.56 | 91.87 | 93.44 | 93.56 | 86.94 | 93.57 |
| 7 | 94.38 | 93.44 | 93.50 | 93.91 | 93.93 | 87.82 | 93.94 |
| 8 | 95.00 | 94.69 | 94.70 | 94.84 | 94.85 | 89.69 | 94.85 |
| 9 | 94.06 | 93.13 | 93.19 | 93.59 | 93.62 | 87.19 | 93.62 |
| 10 | 94.38 | 94.38 | 94.38 | 94.38 | 94.38 | 88.75 | 94.38 |
| M+SD | 94.09±1.03 | 93.72±1.15 | 93.75±1.08 | 93.91±0.80 | 93.92±0.80 | 87.82±1.60 | 93.92±0.80 |
| SP (Ours) | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
| 1 | 95.00 | 90.63 | 91.02 | 92.81 | 92.97 | 85.71 | 92.99 |
| 2 | 93.13 | 92.50 | 92.55 | 92.81 | 92.83 | 85.63 | 92.84 |
| 3 | 94.69 | 93.13 | 93.23 | 93.91 | 93.95 | 87.82 | 93.96 |
| 4 | 94.69 | 95.31 | 95.28 | 95.00 | 94.98 | 90.00 | 94.98 |
| 5 | 95.31 | 92.81 | 92.99 | 94.06 | 94.14 | 88.15 | 94.14 |
| 6 | 94.06 | 95.31 | 95.25 | 94.69 | 94.65 | 89.38 | 94.66 |
| 7 | 93.75 | 95.00 | 94.94 | 94.38 | 94.34 | 88.76 | 94.34 |
| 8 | 94.69 | 92.19 | 92.38 | 93.44 | 93.52 | 86.90 | 93.53 |
| 9 | 93.75 | 94.69 | 94.64 | 94.22 | 94.19 | 88.44 | 94.19 |
| 10 | 95.31 | 94.69 | 94.72 | 95.00 | 95.02 | 90.00 | 95.02 |
| M+SD | 94.44±0.73 | 93.63±1.60 | 93.70±1.47 | 94.03±0.80 | 94.06±0.76 | 88.08±1.59 | 94.06±0.76 |
For the specificity and precision indicators, the AP achieved the best performance. If we consider all the indicators, SP wins five out of seven indicators. Hence, SP gives the best performance compared to other three pooling methods.
C. Effect of DA
To explore the effect of our 14-way DA strategy, we compared using it (DA14) against not using DA (symbolized as DA0). The comparison is presented in Table VI.
TABLE VI. Comparison of DA0 and DA14.
| DA | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| DA0 | 92.06±0.85 | 91.59±1.60 | 91.65±1.48 | 91.83±0.96 | 91.85±0.92 | 83.67±1.92 | 91.85±0.92 |
| DA14 (Ours) | 94.44±0.73 | 93.63±1.60 | 93.70±1.47 | 94.03±0.80 | 94.06±0.76 | 88.08±1.59 | 94.06±0.76 |
We can observe that training with DA14 provides significantly better performance than DA0 in terms of all seven indicators. Furthermore, the SDs of the DA14 results are slightly smaller than those of DA0.
D. Comparison to State-of-the-art Methods
Our 7L-CNN-CD method was compared with five state-of-the-art approaches: RBFNN [6], K-ELM [7], ELM-BA [8], GoogLeNet [11], and ResNet18 [12].
All performances were compared on the test set and are presented in Table VII. Omitting the SD information, the comparison plot is presented in Figure 7, with the measurement indicators chosen from sensitivity to FMI.
TABLE VII. Comparison to State-of-the-art Approaches.
| Approach | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| RBFNN[6] | 67.08 | 74.48 | 72.52 | 70.78 | 69.64 | 41.74 | 69.64 |
| K-ELM[7] | 57.29 | 61.46 | 59.83 | 59.38 | 58.46 | 18.81 | 58.46 |
| ELM-BA [8] | 57.08±3.86 | 72.40±3.03 | 67.48±1.65 | 64.74±1.26 | 61.75±2.24 | 29.90±2.45 | 61.76±2.24 |
| GoogLeNet [11] | 76.88±3.92 | 83.96±2.29 | 82.84±1.58 | 80.42±1.40 | 79.65±1.92 | 61.10±2.62 | 79.65±1.91 |
| ResNet18 [12] | 78.96±2.90 | 89.48±1.64 | 88.30±1.50 | 84.22±1.23 | 83.31±1.53 | 68.89±2.33 | 83.32±1.53 |
| 7L-CNN-CD (Ours) | 94.44±0.73 | 93.63±1.60 | 93.70±1.47 | 94.03±0.80 | 94.06±0.76 | 88.08±1.59 | 94.06±0.76 |
Fig. 7.
Bar plot of performances of six different methods.
V. Conclusion
In this COVID-19 diagnosis study, a novel 7L-CNN-CD was proposed, using a seven-layer standard convolutional neural network as its backbone and integrating data augmentation and stochastic pooling methods.
Experimental results showed that our 7L-CNN-CD algorithm obtained excellent test performance: a sensitivity of 94.44±0.73%, a specificity of 93.63±1.60%, a precision of 93.70±1.47%, an accuracy of 94.03±0.80%, an F1 score of 94.06±0.76%, an MCC of 88.08±1.59%, and an FMI of 94.06±0.76%. These results are better than those of five state-of-the-art algorithms for COVID-19 diagnosis.
In our future studies, we shall attempt to (i) test more advanced data augmentation techniques; (ii) collect more COVID-19 data to test our algorithm; and (iii) move our algorithm to a cloud computing platform to benefit radiologists.
Biographies

Yudong Zhang (Senior Member, IEEE) received the B.E. degree in information sciences and the M.Phil. degree in communication and information engineering from the Nanjing University of Aeronautics and Astronautics in 2004 and 2007, respectively, and the Ph.D. degree in signal and information processing from Southeast University in 2010. He serves as a Professor with the University of Leicester.

Suresh Chandra Satapathy (Senior Member, IEEE) is currently pursuing the Ph.D. degree in computer science engineering with the School of Computer Engineering, KIIT, Bhubaneshwar, India. He is working as a Professor with the School of Computer Engineering and the Dean Research with KIIT. He has developed two new optimization algorithms, social group optimization (SGO) published in Springer Journal and social evolution and learning algorithm (SELO) published in Elsevier.

Li-Yao Zhu received the bachelor’s degree in clinical medicine from Xuzhou Medical University in 1989. He has presided over five municipal scientific research projects, published more than 50 articles, including seven SCI articles, and co-edited a monograph. He has won three Municipal Scientific and Technological Progress Awards and three Municipal New Technology Introduction Awards.

Juan Manuel Górriz received the B.Sc. degree in physics and the B.Sc. degree in electronic engineering from the University of Granada, Spain, in 2000 and 2001, respectively, the Ph.D. degree from the University of Cádiz, Spain, in 2003, and the Ph.D. degree from the University of Granada in 2006. He is currently a Full Professor with the University of Granada.

Shuihua Wang (Senior Member, IEEE) received the bachelor’s degree in information sciences from Southeast University in 2008, the master’s degree in electrical engineering from the City College of New York in 2012, and the Ph.D. degree in electrical engineering from Nanjing University in 2017. She is working as a Research Associate with the University of Leicester, U.K.
Appendix A
(See Table VIII.)
Funding Statement
This work was supported in part by the Natural Science Foundation of China under Grant 61602250; in part by the Henan Key Research and Development Project under Grant 182102310629; in part by the Fundamental Research Funds for the Central Universities under Grant CDLS-2020-03; in part by the Key Laboratory of Child Development and Learning Science (Southeast University), Ministry of Education; in part by the Royal Society International Exchanges Cost Share Award, U.K., under Grant RP202G0230; in part by the Medical Research Council Confidence in Concept Award, U.K., under Grant MC_PC_17171; and in part by the Hope Foundation for Cancer Research, U.K., under Grant RM60G0680.
Contributor Information
Yudong Zhang, Email: yudongzhang@ieee.org.
Suresh Chandra Satapathy, Email: sureshsatapathy@ieee.org.
Li-Yao Zhu, Email: zhu_liyao@126.com.
Juan Manuel Górriz, Email: gorriz@ugr.es.
Shuihua Wang, Email: shuihuawang@ieee.org.
References
- [1].Balsalobre-Lorente D., Driha O. M., Bekun F. V., Sinha A., and Adedoyin F. F., “Consequences of COVID-19 on the social isolation of the chinese economy: Accounting for the role of reduction in carbon emissions,” Air Qual., Atmos. Health, vol. 13, pp. 1–13, Aug. 2020. [Google Scholar]
- [2].Chaudhary M., Sodani P. R., and Das S., “Effect of COVID-19 on economy in India: Some reflections for policy and programme,” J. Health Manage., vol. 22, no. 2, pp. 169–180, Jun. 2020. [Google Scholar]
- [3].Campos G. S.et al. , “Ion torrent-based nasopharyngeal swab metatranscriptomics in COVID-19,” J. Virol. Methods, vol. 282, Aug. 2020, Art. no. 113888. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Mahdavi A.et al. , “The role of repeat chest CT scan in the COVID-19 pandemic,” Academic Radiol., vol. 27, no. 7, pp. 1049–1050, Jul. 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Li Y. and Xia L., “Coronavirus disease 2019 (COVID-19): Role of chest CT in diagnosis and management,” Amer. J. Roentgenol., vol. 214, no. 6, pp. 1280–1286, Jun. 2020. [DOI] [PubMed] [Google Scholar]
- [6].Lu Z., Lu S., Liu G., Zhang Y., Yang J., and Phillips P., “A pathological brain detection system based on radial basis function neural network,” J. Med. Imag. Health Informat., vol. 6, no. 5, pp. 1218–1222, Sep. 2016. [Google Scholar]
- [7].Lu S., Lu Z., Yang J., Yang M., and Wang S., “A pathological brain detection system based on kernel based ELM,” Multimedia Tools Appl., vol. 77, no. 3, pp. 3715–3728, Feb. 2018. [Google Scholar]
- [8].Lu S.et al. , “A pathological brain detection system based on extreme learning machine optimized by bat algorithm,” CNS Neurol. Disorders—Drug Targets, vol. 16, no. 1, pp. 23–29, Jan. 2017. [DOI] [PubMed] [Google Scholar]
- [9].Wang S.et al. , “Pathological brain detection via wavelet packet tsallis entropy and real-coded biogeography-based optimization,” Fundamenta Informaticae, vol. 151, nos. 1–4, pp. 275–291, Mar. 2017. [Google Scholar]
- [10].Jiang X., “Chinese Sign Language Fingerspelling Recognition via Six-Layer Convolutional Neural Network with Leaky Rectified Linear Units for Therapy and Rehabilitation,” J. Med. Imag. Health Informat., vol. 9, pp. 2031–2038, 2019. [Google Scholar]
- [11].Szegedy C.et al. , “Going deeper with convolutions,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2015, pp. 1–9. [Google Scholar]
- [12].Yu X. and Wang S.-H., “Abnormality diagnosis in mammograms by transfer learning based on ResNet18,” Fundamenta Informaticae, vol. 168, nos. 2–4, pp. 219–230, Sep. 2019. [Google Scholar]
- [13].Zhang Y., Qian Y., Wu D., Hossain M. S., Ghoneim A., and Chen M., “Emotion-aware multimedia systems security,” IEEE Trans. Multimedia, vol. 21, no. 3, pp. 617–624, Mar. 2019. [Google Scholar]
- [14].Zhang Y., Gravina R., Lu H., Villari M., and Fortino G., “PEA: Parallel electrocardiogram-based authentication for smart healthcare systems,” J. Netw. Comput. Appl., vol. 117, pp. 10–16, Sep. 2018. [Google Scholar]
- [15].Zhang Y., Ma X., Zhang J., Hossain M. S., Muhammad G., and Amin S. U., “Edge intelligence in the cognitive Internet of Things: Improving sensitivity and interactivity,” IEEE Netw., vol. 33, no. 3, pp. 58–64, May 2019. [Google Scholar]
- [16].Veluchamy M. and Subramani B., “Fuzzy dissimilarity contextual intensity transformation with gamma correction for color image enhancement,” Multimedia Tools Appl., vol. 79, pp. 19945–19961, Apr. 2020. [Google Scholar]
- [17].Górriz J. M., “Artificial intelligence within the interplay between natural and artificial computation: Advances in data science, trends and applications,” Neurocomputing, vol. 410, pp. 237–270, 2020. [Google Scholar]
- [18].Rezaei M., Yang H., and Meinel C., “Deep Neural Network with l2-Norm Unit for Brain Lesions Detection,” in Proc. Int. Conf. Neural Inf. Process. (ICNIP), Cham, Switzerland, 2017, pp. 798–807. [Google Scholar]
- [19].Ghosh A., Singh S., and Sheet D., “Simultaneous localization and classification of acute lymphoblastic leukemic cells in peripheral blood smears using a deep convolutional network with average pooling layer,” in Proc. IEEE Int. Conf. Ind. Inf. Syst. (ICIIS), Dec. 2017, pp. 529–534. [Google Scholar]
- [20].Wang S.-H.et al. , “Multiple sclerosis identification by 14-layer convolutional neural network with batch normalization, dropout, and stochastic pooling,” Frontiers Neurosci., vol. 12, Nov. 2018. Art. no. 818. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [21].Jiang X., Lu M., and Wang S.-H., “An eight-layer convolutional neural network with stochastic pooling, batch normalization and dropout for fingerspelling recognition of chinese sign language,” Multimedia Tools Appl., vol. 79, nos. 21–22, pp. 15697–15715, Jun. 2020. [Google Scholar]
- [22].Sun S., Hu B., Yu Z., and Song X., “A stochastic max pooling strategy for convolutional neural network trained by noisy samples,” Int. J. Comput. Commun. Control, vol. 15, no. 1, Feb. 2020, Art. no. 1007. [Google Scholar]