Abstract
With the global COVID-19 pandemic, the number of confirmed patients has increased rapidly, straining the world's limited medical resources. Fast diagnosis and monitoring of COVID-19 are therefore among the world's most critical challenges today. Artificial intelligence-based CT image classification models can quickly and accurately distinguish infected patients from healthy populations. Our research proposes a deep learning model (WE-SAJ) that uses wavelet entropy for feature extraction, a two-layer feedforward neural network (FNN) for classification, and the self-adaptive Jaya algorithm as the training algorithm. It achieves superior performance compared to the Jaya-based model, with a sensitivity of 85.47±1.84, specificity of 87.23±1.67, precision of 87.03±1.34, accuracy of 86.35±0.70, F1 score of 86.23±0.77, Matthews correlation coefficient of 72.75±1.38, and Fowlkes-Mallows index of 86.24±0.76. Our experiments demonstrate the potential of artificial intelligence techniques for COVID-19 diagnosis and the effectiveness of the self-adaptive Jaya algorithm compared to the Jaya algorithm for medical image classification tasks.
Keywords: COVID-19, Diagnosis, Deep Learning, Wavelet Entropy, Self-adaptive Jaya, Jaya
1. Introduction
COVID-19 is a respiratory disease caused by the novel coronavirus SARS-CoV-2. Since the first cases appeared in 2019, COVID-19 has spread to most countries and territories worldwide [1]. Although several effective vaccines against the disease exist, they have not been able to definitively stop the spread of COVID-19 due to its high variability. Depending on the severity of the disease, patients with COVID-19 may suffer from mild respiratory symptoms such as cough and fever, severe pneumonia, multi-organ failure, or even death [2]. As the global pandemic progresses, the number of severe cases and deaths from COVID-19 continues to rise, dealing a significant blow to human life and the global economy. Scholars in various fields have become highly concerned about the potentially severe consequences of COVID-19 and have maintained a constant interest in possible solutions. An important and persistent issue is the shortage of medical resources caused by the rapid increase in patients. The current standard method of COVID-19 detection is Reverse Transcription Polymerase Chain Reaction (RT-PCR) [3], which suffers from a high proportion of false negatives and often requires multiple tests to produce reliable results, making the process highly time-consuming. Diagnostic methods using chest CT and X-ray images have the additional advantage of assessing the extent of a patient's disease [3]. However, such methods require a large number of medical experts to perform the diagnosis.
Meanwhile, the number of confirmed cases of COVID-19 is increasing rapidly, and it is difficult to diagnose and monitor the vast number of COVID-19 patients promptly by manual means. Therefore, finding a quick and accurate diagnostic method has become one of the most critical tasks for stopping the spread of COVID-19.
Artificial intelligence has been a popular field of research in recent years, attracting many researchers to solve complex problems in areas such as medicine, economics, and cyber security [4–9]. A significant advantage of AI is that machines can be trained to take over repetitive and complex tasks from humans [10–13], which makes it well suited to addressing the diagnostic difficulties associated with the rapid increase in patients. Many scholars believe that machine learning techniques applied to medical images can effectively diagnose COVID-19 patients [14]. Many studies on machine learning techniques for chest X-ray and CT images of COVID-19 patients have emerged, some of which have achieved relatively good performance, and a significant number of them have introduced innovative and inspiring image processing methods for COVID-19.
Szegedy, Liu [15] proposed a 22-layer deep network for image classification and detection tasks. Their model is suitable for COVID-19 diagnosis, but their research did not mention which optimisation algorithm was used, which could be a direction for further improving model performance. Lu [16] proposed a radial basis function (RBF) based model for brain disease diagnosis. Their model can be generalised to the COVID-19 diagnostic task, but did not achieve stable and promising performance. Chen [17] combined the grey-level co-occurrence matrix (GLCM) and support vector machine (SVM) to classify COVID-19 chest CT images and demonstrated the effectiveness of this method. Yao and Han [18] addressed the COVID-19 chest CT image classification task using a method based on wavelet entropy (WE) and biogeography-based optimisation (BBO), known as WE-BBO, and revealed the possible performance gains of combining an optimisation algorithm with wavelet entropy. Building on WE-BBO, Wang [19] proposed a combined wavelet entropy and Jaya method for the same task and achieved an improvement, but their method requires the population size to be set manually, which may lead to local optima. The Jaya algorithm they used was proposed by Rao [20]; it solves constrained and unconstrained optimisation problems by moving as close to the optimal solution as possible while avoiding the worst solution. Moreover, this algorithm is parameter-free and straightforward to use.
Rao and More [21] proposed a modified version of the Jaya algorithm, called the self-adaptive Jaya algorithm, which removes the need to set a fixed population size. This algorithm can automatically adjust the population size based on the current and previous population sizes and the current solution. Because of these advantages, self-adaptive Jaya can bring higher performance and application value to the model. We built this experiment on the research of Wang [19] by replacing the Jaya algorithm, proposing a WE-SAJ model that combines the self-adaptive Jaya algorithm with wavelet entropy, and made considerable progress compared to the previous research.
Our contributions are as follows: (i) we propose a method combining the self-adaptive Jaya algorithm and wavelet entropy (WE-SAJ) for COVID-19 diagnosis; (ii) we demonstrate the performance improvement of the self-adaptive Jaya algorithm over the Jaya algorithm for medical image classification models; (iii) we further demonstrate the value of AI technology for the COVID-19 diagnostic task.
The rest of the paper is organised as follows: Section 2 presents the dataset used for the experiments; Section 3 describes the main methods involved in the experiments; Section 4 presents and discusses the experimental results; and Section 5 concludes the paper.
2. Dataset
We used a chest CT image dataset for the experiment. The dataset consists of chest CT scans from 77 men and 55 women, for a total of 132 subjects; each sample consists of one complete chest CT image and the corresponding nucleic acid test result. The dataset is divided into two groups, the COVID-19-infected group and the healthy group, and each group includes 148 chest CT slices. The COVID-19-infected group consists of chest CT images from 66 COVID-19 patients from the Fourth People's Hospital in Huai'an, China, while the healthy group consists of chest CT images from 66 healthy subjects [19]. Table 1 shows the statistics of the dataset, and Figure 1 shows two samples from it.
Table 1. Statistics of the dataset.
| Class | Ratio | No. of Images |
|---|---|---|
| Covid-19 | 50% | 148 |
| Healthy Control | 50% | 148 |
Figure 1. Data samples from the dataset.
3. Methodology
3.1. Wavelet Entropy
Wavelet Transform
The Fourier transform is widely used in many areas of signal analysis as a method that can transform a signal from the time domain to the frequency domain, and the form of the transform is shown in Equation (1).
$$F(\omega)=\int_{-\infty}^{+\infty} f(t)\,e^{-i\omega t}\,dt \qquad (1)$$
where ω refers to the frequency, t refers to time, f(t) is the time-domain signal, and F(ω) is its frequency-domain representation.
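As a concrete illustration of Equation (1), the short sketch below computes a discrete Fourier transform of a toy two-tone signal with NumPy; the signal, sampling rate, and the use of `np.fft.rfft` are illustrative choices, not part of the original study.

```python
import numpy as np

# A stationary test signal composed of 5 Hz and 20 Hz tones (arbitrary choices).
fs = 200                      # sampling rate in Hz
t = np.arange(0, 2, 1 / fs)   # 2 seconds of samples
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

# Discrete analogue of Equation (1): transform from the time domain to the frequency domain.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(signal.size, d=1 / fs)

# The magnitude spectrum peaks at 5 Hz and 20 Hz, but carries no information about
# *when* each frequency occurs -- the limitation discussed in the next paragraph.
peak_idx = np.argsort(np.abs(spectrum))[-2:]
print(np.sort(freqs[peak_idx]))   # -> [ 5. 20.]
```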
Although the Fourier transform can analyse the spectrum of a signal and has high application value, it has certain limitations when dealing with non-stationary signals. The Fourier transform can only capture which frequencies a section of the signal consists of overall and cannot reflect the moments when these frequencies occur, so two non-stationary signals that differ in the time domain may appear identical in the frequency domain [22]. Many signals in nature are non-stationary, and biological and medical signal analysis problems can rarely be solved using the straightforward Fourier transform alone. A simple and feasible way to solve such problems is to decompose the entire time-domain signal into many short-time segments, each of which is approximately stationary, and then apply the Fourier transform to each segment to determine the moment at which each frequency occurs. This decomposition process is known as windowing, and the Fourier transform based on this signal decomposition is known as the short-time Fourier transform [23]. However, this method is limited by the width of the window: a window that is too wide results in low temporal resolution and a lack of refinement in the time domain, while a window that is too narrow results in poor frequency resolution and a lack of precision in frequency analysis. Furthermore, the window's width does not change during a short-time Fourier transform, so the short-time Fourier transform is also not the best solution for non-stationary signal analysis.
The wavelet transform replaces the infinitely long trigonometric basis of the Fourier transform with a finite decaying wavelet basis to locate the moment when the frequency occurs while obtaining the frequency [24]. The transformation equation is shown in Equation (2).
$$WT_f(a,\tau)=\frac{1}{\sqrt{a}}\int_{-\infty}^{+\infty} f(t)\,\psi\!\left(\frac{t-\tau}{a}\right)dt \qquad (2)$$
where a represents scale, τ represents translation, ψ represents the mother wavelet function, and t represents time.
A wavelet is a wave that is more concentrated in the time domain than the sine wave of the Fourier transform: its energy is finite and concentrated around a point. Wavelets can be used to efficiently extract information from a signal and to analyse functions or signals at multiple scales of refinement through operations such as scaling and translation. The essence of the wavelet transform is similar to that of the Fourier transform in that a carefully selected basis represents the signal function. Each wavelet transform has a mother wavelet and a scaling function, and the basis functions of any wavelet transform are the set of scalings and translations of the mother wavelet and scaling function. This scaling and translation correspond to the two variables in the wavelet transform function: the scale, which controls the scaling of the wavelet function and corresponds to frequency, and the translation, which controls the shifting of the wavelet function and corresponds to time [25]. The wavelet transform can therefore capture which frequency components the signal contains at different moments, thus overcoming the shortcomings of the Fourier transform when analysing non-stationary signals.
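To make the roles of scale and translation concrete, the following sketch applies a continuous wavelet transform to a simple non-stationary signal. It assumes the PyWavelets package (`pywt`); the test signal and the Morlet wavelet are illustrative choices and not the configuration used in this paper.

```python
import numpy as np
import pywt

# A non-stationary test signal: 5 Hz in the first second, 25 Hz in the second.
fs = 200
t = np.arange(0, 2, 1 / fs)
signal = np.where(t < 1, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 25 * t))

# Continuous wavelet transform: rows of `coeffs` correspond to scales (frequency),
# columns to translations (time), so each frequency component is localised in time.
scales = np.arange(1, 64)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)

print(coeffs.shape)   # (63, 400): a scale-by-time coefficient matrix
```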
Wavelet Entropy
Wavelet entropy is a novel tool for analysing the instantaneous characteristics of non-stationary signals based on the wavelet transform combining wavelet decomposition and entropy. The original Shannon entropy was proposed to provide a valuable criterion for quantifying and comparing the energy distribution in wavelet subbands. Thus, it can be defined as Equation (3) [26].
$$S(g)=-\sum_{i=1}^{g} P_i \log_2 P_i \qquad (3)$$
where g refers to the grey levels and P_i refers to the probability of grey level i occurring. Thus, S(g) can represent the energy distribution in the wavelet sub-bands according to the probabilities of the grey levels.
However, most of the research on Shannon entropy has focused on engineering applications, and its physical meaning and principles have not been discussed in depth. Moreover, the shortcomings of Shannon entropy make it prone to wavelet mixing and energy leakage when dealing with non-stationary signals, which may lead to inaccurate or even incorrect results. Given this, many new solutions to these problems have emerged, such as relative wavelet entropy [27] and Tsallis wavelet entropy [28]. Our research uses a 4-level decomposition with biorthogonal wavelets. Compared to orthogonal wavelet bases, biorthogonal wavelet bases resolve the incompatibility between symmetry and exact signal reconstruction. Biorthogonal wavelets consist of a dual pair of wavelets that decompose and reconstruct the signal separately; they resolve the contradiction between the linear-phase and orthogonality requirements and are widely used in signal and image reconstruction. In this research, wavelet entropy is used for feature extraction, and the extracted features are then fed into a two-layer feedforward neural network for classification, as sketched below.
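A minimal sketch of this feature-extraction step is given below. It assumes the PyWavelets package (`pywt`) and an image already loaded as a 2-D array; the specific biorthogonal filter (`bior3.3`), the normalisation of coefficients into probabilities, and the helper name `wavelet_entropy_features` are illustrative assumptions rather than the paper's exact implementation.

```python
import numpy as np
import pywt

def wavelet_entropy_features(image, wavelet="bior3.3", level=4):
    """4-level 2-D biorthogonal wavelet decomposition followed by the Shannon
    entropy of each sub-band (Equation (3)), giving 3 * level + 1 features."""
    coeffs = pywt.wavedec2(image, wavelet=wavelet, level=level)
    # Flatten: approximation sub-band + (horizontal, vertical, diagonal) per level.
    subbands = [coeffs[0]] + [band for detail in coeffs[1:] for band in detail]

    features = []
    for band in subbands:
        # Normalise |coefficients| into a probability distribution, then apply
        # Shannon entropy: S = -sum(p * log2 p).
        p = np.abs(band).ravel()
        p = p / (p.sum() + 1e-12)
        p = p[p > 0]
        features.append(-np.sum(p * np.log2(p)))
    return np.array(features)

# Example usage on a dummy 256x256 "CT slice" (random data for illustration only).
ct_slice = np.random.rand(256, 256)
print(wavelet_entropy_features(ct_slice).shape)   # (13,) = 3*4 + 1 sub-band entropies
```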
3.2. Feedforward Neural Network
The Feedforward Neural Network (FNN) [29] is a typical deep learning model consisting of multiple layers of logistic regression models (continuous nonlinear functions); it is also known as a single- or multi-layer perceptron, depending on the number of network layers [30]. Each network layer contains a different number of neurons (perceptrons). The structure of a perceptron is defined in Equation (4) [31].
$$y=\sum_{i=1}^{m} w_i x_i + b \qquad (4)$$
where w_i are the weights, x_i are the inputs, m is the number of inputs to the perceptron, and b is the bias added to the weighted sum of the inputs. Learning the values of w_i and b enables the model to cope with different tasks.
An FNN usually consists of an input layer, several hidden layers, and an output layer [32]. In an FNN, the first network layer is the input layer, the last layer is the output layer, and the layers between them are the hidden layers. Although a wide variety of classifiers already exist, when faced with classification tasks involving closely similar classes, neural networks can add non-linearity and change the representation of the data through hidden layers, allowing the model to generalise better [33].
In this study, a two-layer FNN is used with mean squared error (MSE) as the loss function. A sample FNN structure is shown in Figure 2, where m is a hyperparameter representing the number of hidden layers, and the number of perceptrons in each hidden layer is also a hyperparameter. c represents the number of inputs (x_1, …, x_c), and d represents the number of outputs (y_1, …, y_d), which depends on the number of output classes; for example, for a 5-class classification task, d would equal 5.
Figure 2. The basic structure of an example FNN.
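The sketch below shows one possible NumPy implementation of such a two-layer FNN, consistent with the perceptron form of Equation (4) and the MSE loss. The layer sizes, the tanh activation, and the weight-vector helpers are illustrative assumptions, included so that a population-based optimiser such as the Jaya variants of Section 3.3 could train the network.

```python
import numpy as np

class TwoLayerFNN:
    """A minimal two-layer feedforward network (one hidden layer + output layer),
    matching the perceptron form of Equation (4): y = sum(w_i * x_i) + b."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0, 0.1, (n_inputs, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_hidden, n_outputs))
        self.b2 = np.zeros(n_outputs)

    def forward(self, X):
        hidden = np.tanh(X @ self.W1 + self.b1)      # hidden layer with nonlinearity
        return hidden @ self.W2 + self.b2            # output layer (linear)

    def mse_loss(self, X, Y):
        return np.mean((self.forward(X) - Y) ** 2)   # Mean Squared Error loss

    def get_weights(self):
        # Flatten all weights into one vector so a population-based optimiser
        # (e.g. the Jaya variants in Section 3.3) can treat them as one candidate solution.
        return np.concatenate([p.ravel() for p in (self.W1, self.b1, self.W2, self.b2)])

    def set_weights(self, vector):
        shapes = [self.W1.shape, self.b1.shape, self.W2.shape, self.b2.shape]
        sizes = [int(np.prod(s)) for s in shapes]
        parts = np.split(vector, np.cumsum(sizes)[:-1])
        self.W1, self.b1, self.W2, self.b2 = [p.reshape(s) for p, s in zip(parts, shapes)]

# Example: 13 inputs (matching the wavelet-entropy sketch above; the paper's exact
# feature dimension is not stated) and 2 output classes (COVID-19 vs. healthy).
net = TwoLayerFNN(n_inputs=13, n_hidden=10, n_outputs=2)
X = np.random.rand(5, 13)
print(net.forward(X).shape)   # (5, 2)
```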
3.3. Self-adaptive Jaya algorithm
Jaya algorithm
The Jaya algorithm is a population-based heuristic algorithm [34]. In the Jaya algorithm, each search agent updates its current value based on the best and worst known values, continuously approaching the optimal solution while avoiding the worst one; this helps the population converge to the globally optimal solution [20, 35]. Compared to traditional evolutionary algorithms, the Jaya algorithm is parameter-free, effectively avoiding the situation where traditional evolutionary algorithms produce locally optimal solutions after incorrect algorithmic tuning [36]. Moreover, it has greater reliability and generalisation ability than other heuristics and is widely used in optimisation applications and research in several fields. The basic update equation of the Jaya algorithm is shown in Equation (5) [37].
$$Y'_{q,i}=Y_{q,i}+r_1\left(Y_{q,\mathrm{best},i}-\left|Y_{q,i}\right|\right)-r_2\left(Y_{q,\mathrm{worst},i}-\left|Y_{q,i}\right|\right) \qquad (5)$$
where r1 and r2 are two independent random numbers drawn uniformly from [0, 1], Y'_{q,i} is the updated value of Y_{q,i}, Y_{q,best,i} refers to the corresponding value in the best solution, Y_{q,worst,i} refers to the corresponding value in the worst solution, q = 1, 2, …, n, and i = 1, 2, …, d.
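The sketch below implements the update of Equation (5) together with a greedy selection step in NumPy. The population size, iteration count, and the placeholder sphere objective are illustrative assumptions; in this paper the objective would be the FNN's MSE loss.

```python
import numpy as np

def jaya_step(population, fitness, rng):
    """One Jaya iteration (Equation (5)): move every candidate towards the best
    solution and away from the worst, using uniform random numbers in [0, 1]."""
    best = population[np.argmin(fitness)]
    worst = population[np.argmax(fitness)]
    r1 = rng.random(population.shape)
    r2 = rng.random(population.shape)
    return population + r1 * (best - np.abs(population)) - r2 * (worst - np.abs(population))

def jaya_optimise(objective, dim, pop_size=20, iterations=200, seed=0):
    rng = np.random.default_rng(seed)
    population = rng.uniform(-1, 1, (pop_size, dim))
    fitness = np.array([objective(p) for p in population])
    for _ in range(iterations):
        candidates = jaya_step(population, fitness, rng)
        cand_fitness = np.array([objective(p) for p in candidates])
        improved = cand_fitness < fitness            # greedy selection: keep improvements only
        population[improved] = candidates[improved]
        fitness[improved] = cand_fitness[improved]
    return population[np.argmin(fitness)], fitness.min()

# Placeholder objective (sphere function); for WE-SAJ this would be the FNN's MSE loss.
best, best_fit = jaya_optimise(lambda w: np.sum(w ** 2), dim=5)
print(best_fit)   # close to 0
```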
Self-adaptive Jaya algorithm
This study uses a modified Jaya algorithm called the self-adaptive Jaya algorithm [21]. This method divides the solutions into groups based on quality distinctions distributed across the search space to obtain the best solution. The most important feature of this algorithm is that the population size is determined automatically; its population-size update is shown in Equation (6) [38].
$$n_{\mathrm{new}}=\mathrm{round}\left(n_{\mathrm{old}}+r\cdot n_{\mathrm{old}}\right) \qquad (6)$$
where n_new represents the new population size, n_old represents the old population size, and r is the relative population growth rate, a random value drawn from [-0.5, 0.5]. Since r is random, the new population may be larger or smaller than the old population, leading to three cases:
- When the new population is smaller than the old population, only the current best candidates enter the next generation.
- When the new population equals the old population, no change occurs.
- When the new population is larger than the old population, all current candidates enter the next generation.
The flow chart of the Self-adaptive Jaya algorithm is shown in Figure 3. This research uses the Self-adaptive Jaya algorithm as the training algorithm.
Figure 3. The flow chart of the self-adaptive Jaya algorithm.
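The following sketch illustrates the population-size adaptation of Equation (6) and the three cases listed above; how newly added candidates are generated when the population grows is an implementation assumption, not a detail taken from the original algorithm description.

```python
import numpy as np

def adapt_population(population, fitness, rng):
    """Population-size update of the self-adaptive Jaya algorithm (Equation (6)):
    n_new = round(n_old + r * n_old), with r drawn uniformly from [-0.5, 0.5]."""
    n_old = len(population)
    r = rng.uniform(-0.5, 0.5)
    n_new = max(2, int(round(n_old + r * n_old)))   # keep at least a minimal population

    order = np.argsort(fitness)                     # best (lowest fitness) first
    if n_new < n_old:
        # Shrinking: only the current best candidates enter the next generation.
        keep = order[:n_new]
        return population[keep], fitness[keep]
    if n_new == n_old:
        # Unchanged: the population passes through as-is.
        return population, fitness
    # Growing: all current candidates survive; new random candidates fill the gap
    # (how the extra candidates are generated is an assumption of this sketch).
    extra = rng.uniform(population.min(), population.max(),
                        (n_new - n_old, population.shape[1]))
    extra_fitness = np.full(n_new - n_old, np.inf)  # evaluated on the next iteration
    return np.vstack([population, extra]), np.concatenate([fitness, extra_fitness])

# Example usage with a random population of 20 candidates in 5 dimensions.
rng = np.random.default_rng(0)
pop = rng.uniform(-1, 1, (20, 5))
fit = np.sum(pop ** 2, axis=1)
new_pop, new_fit = adapt_population(pop, fit, rng)
print(len(pop), "->", len(new_pop))
```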
3.4. K-fold Cross-Validation
In machine learning, researchers usually divide the dataset into a training set, which is used for model training, and a test set, which is used to evaluate model performance and thus the generalisation of the model. Machine learning is a data-driven science: the size of the dataset has a significant impact on model performance, with larger amounts of data tending to produce higher-performance models. However, many studies face data scarcity, and dividing the dataset into training and test sets further reduces the amount of data available for training, thus affecting model performance. The core idea of cross-validation is to reuse data to increase the amount available for training while still providing a test set. The K-fold cross-validation [39] used in this study is a widely used cross-validation method. It divides the dataset into K pre-specified groups, takes each group in turn (without repetition) as the test set, uses all the other data as the training set, and evaluates the model performance on the test set. Training is repeated K times, each time with a different group held out as the test set. Finally, the overall performance (P_final) is obtained from the model performances over the K tests. Figure 4 illustrates a concrete form of K-fold cross-validation, and the sketch that follows it shows the same procedure in code. To obtain a more reliable and robust result, we used 10-fold cross-validation to divide the dataset.
Figure 4. Illustration of dataset divide and performance calculation using K-fold Cross-Validation.
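The sketch below reproduces the K-fold procedure with scikit-learn's `KFold`; the random features, labels, and the placeholder k-nearest-neighbours classifier stand in for the wavelet-entropy features and the WE-SAJ pipeline and are purely illustrative.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsClassifier  # placeholder classifier for illustration

# Dummy data standing in for wavelet-entropy features and labels (0 = healthy, 1 = COVID-19).
rng = np.random.default_rng(0)
X = rng.random((296, 13))
y = rng.integers(0, 2, 296)

kfold = KFold(n_splits=10, shuffle=True, random_state=0)
fold_accuracies = []

for train_idx, test_idx in kfold.split(X):
    model = KNeighborsClassifier()
    model.fit(X[train_idx], y[train_idx])                          # train on K-1 folds
    fold_accuracies.append(model.score(X[test_idx], y[test_idx]))  # evaluate on the held-out fold

# Final performance P_final reported as mean +/- standard deviation over the K folds.
print(f"{np.mean(fold_accuracies):.4f} +/- {np.std(fold_accuracies):.4f}")
```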
3.5. Evaluation
In this study, we evaluated model performance using multiple metrics: accuracy, sensitivity, precision, specificity, F-score, area under the ROC curve (AUC), and the confusion matrix.
Confusion Matrix
The confusion matrix is a visualisation tool used to see how the model performs on each class; it is represented as a matrix of n rows and n columns, where n is the number of classes in the classification task. The confusion matrix consists of four central values: True Positive (TP, the number of positive samples correctly predicted as positive), True Negative (TN, the number of negative samples correctly predicted as negative), False Positive (FP, the number of negative samples incorrectly predicted as positive), and False Negative (FN, the number of positive samples incorrectly predicted as negative), as shown in Figure 5.
Figure 5. Confusion Matrix sample.
Accuracy
Accuracy is a widely used indicator to assess the performance of machine learning models, and it represents the proportion of correctly determined samples to the total number of samples. In general, the higher the accuracy, the better the classifier. Equation (7) is the formula of accuracy:
$$\mathrm{Accuracy}=\frac{TP+TN}{TP+TN+FP+FN} \qquad (7)$$
Precision
Precision is the proportion of samples predicted as positive that are actually positive. The calculation of precision is shown in Equation (8).
$$\mathrm{Precision}=\frac{TP}{TP+FP} \qquad (8)$$
Specificity
Specificity, also known as the true negative rate, represents the ability of the model to identify negative samples. The calculation of specificity is shown in Equation (9).
$$\mathrm{Specificity}=\frac{TN}{TN+FP} \qquad (9)$$
Sensitivity
Sensitivity, also known as the true positive rate, represents the ability of the model to identify positive samples. The calculation of sensitivity is shown in Equation (10).
$$\mathrm{Sensitivity}=\frac{TP}{TP+FN} \qquad (10)$$
F-Score
The F-score is a benchmark generated by combining precision and sensitivity. The higher the F1-score, the more stable the classification model. The calculation of the F1-score is shown in Equation (11).
$$F_1=\frac{2\times\mathrm{Precision}\times\mathrm{Sensitivity}}{\mathrm{Precision}+\mathrm{Sensitivity}} \qquad (11)$$
AUC
AUC, a nonparametric statistic that is not affected by the class distribution [40], is one of the most common indicators used to evaluate binary classification models. The AUC evaluates model performance by calculating the area under the ROC curve, which plots the true positive rate (vertical axis) against the false positive rate (horizontal axis) as the classification threshold varies. Since the AUC considers the model's classification ability for both positive and negative cases, it can still evaluate the classifier reasonably well even when the classes are imbalanced. Assuming the ROC curve is constructed from m points, (x1, y1), (x2, y2), …, (xm, ym), the AUC can be estimated as in Equation (12) [41]:
$$\mathrm{AUC}=\sum_{i=1}^{m-1}\frac{\left(y_i+y_{i+1}\right)\left(x_{i+1}-x_i\right)}{2} \qquad (12)$$
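For completeness, the sketch below computes the metrics of Equations (7)–(11) from confusion-matrix counts and estimates the AUC of Equation (12) by the trapezoidal rule; the numerical values are illustrative and unrelated to the results reported later.

```python
import numpy as np

def classification_metrics(tp, tn, fp, fn):
    """Metrics of Equations (7)-(11) computed from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, specificity, sensitivity, f1

def auc_trapezoid(fpr, tpr):
    """AUC estimate of Equation (12): trapezoidal area under the ROC points."""
    order = np.argsort(fpr)
    fpr, tpr = np.asarray(fpr)[order], np.asarray(tpr)[order]
    return np.sum((tpr[:-1] + tpr[1:]) * np.diff(fpr) / 2)

# Illustrative counts and ROC points (not the paper's results).
print(classification_metrics(tp=50, tn=45, fp=5, fn=10))
print(auc_trapezoid(fpr=[0.0, 0.1, 0.3, 1.0], tpr=[0.0, 0.7, 0.9, 1.0]))   # 0.86
```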
4. Experiment Results and Discussions
4.1. WE Results
Figure 6 illustrates a sample 4-level biorthogonal wavelet decomposition. Figure 6(a) shows the result of the first-level wavelet transform: the input map's low-frequency sub-band (upper left corner) and high-frequency sub-bands (upper right, lower left, and lower right corners). Since the wavelet transform introduces downsampling, the edge lengths of the four sub-band maps are half those of the input. In the second-level wavelet transform, Figure 6(b), the low-frequency and high-frequency sub-bands are obtained by applying the same transformation to the low-frequency sub-band from the first-level transform. The results of the third-level wavelet transform, shown in Figure 6(c), and the fourth-level wavelet transform, shown in Figure 6(d), are obtained by repeating this operation recursively. Note that the images in the sample are shown in pseudo-colour; the underlying data are greyscale.
Figure 6. A sample of 4-level decomposition of biorthogonal wavelets result.
4.2. Statistical Results
WE-SAJ used wavelet entropy as the feature extraction method, a two-layer FNN as the classifier, the self-adaptive Jaya algorithm as the training algorithm, and K-fold cross-validation to report unbiased performance. The experiments achieved strong performance (shown in Table 2), with an average sensitivity of 85.47±1.84, specificity of 87.23±1.67, precision of 87.03±1.34, accuracy of 86.35±0.70, F1 score of 86.23±0.77, Matthews correlation coefficient of 72.75±1.38, and Fowlkes-Mallows index of 86.24±0.76.
Table 2. Results of 10 runs of 10-fold cross-validation.
| Run | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| 1 | 85.14 | 86.49 | 86.30 | 85.81 | 85.71 | 71.63 | 85.72 |
| 2 | 83.11 | 90.54 | 89.78 | 86.82 | 86.32 | 73.85 | 86.38 |
| 3 | 85.81 | 85.14 | 85.23 | 85.47 | 85.52 | 70.95 | 85.52 |
| 4 | 81.76 | 88.51 | 87.68 | 85.14 | 84.62 | 70.43 | 84.67 |
| 5 | 85.81 | 87.84 | 87.59 | 86.82 | 86.69 | 73.66 | 86.69 |
| 6 | 86.49 | 87.84 | 87.67 | 87.16 | 87.07 | 74.33 | 87.08 |
| 7 | 87.84 | 85.14 | 85.53 | 86.49 | 86.67 | 73.00 | 86.67 |
| 8 | 87.16 | 85.81 | 86.00 | 86.49 | 86.58 | 72.98 | 86.58 |
| 9 | 85.14 | 87.16 | 86.90 | 86.15 | 86.01 | 72.31 | 86.01 |
| 10 | 86.49 | 87.84 | 87.67 | 87.16 | 87.07 | 74.33 | 87.08 |
| Mean ± SD | 85.47±1.84 | 87.23±1.67 | 87.03±1.34 | 86.35±0.70 | 86.23±0.77 | 72.75±1.38 | 86.24±0.76 |
(Sen = Sensitivity; Spc = Specificity; Prc = Precision; Acc = Accuracy; F1 = F1 Score; MCC = Matthews correlation coefficient; FMI = Fowlkes-Mallows Index)
4.3. Self-adaptive Jaya compared to Jaya
Table 3 shows the specific numerical performance comparison between the Self-adaptive Jaya algorithm-based model (WE-SAJ) and the previous research (WE-Jaya) based on the Jaya algorithm. Compared to WE-Jaya, WE-SAJ has achieved significant improvements in all performance metrics.
Table 3. Performance comparison between WE-Jaya and WE-SAJ.
| Method | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| WE-Jaya [19] | 73.31±2.26 | 78.11±1.92 | 77.03±1.35 | 75.71±1.04 | 75.10±1.23 | 51.51±2.07 | 75.14±1.22 |
| WE-SAJ (Ours) | 85.47±1.84 | 87.23±1.67 | 87.03±1.34 | 86.35±0.70 | 86.23±0.77 | 72.75±1.38 | 86.24±0.76 |
(Sen = Sensitivity; Spc = Specificity; Prc = Precision; Acc = Accuracy; F1 = F1 Score; MCC = Matthews correlation coefficient; FMI = Fowlkes-Mallows Index)
WE-SAJ improves accuracy by more than ten percentage points, which gives our method higher practical value. More detailed improvements can be seen in the other performance indicators: WE-SAJ improves sensitivity by more than 12 percentage points and specificity by more than 11 percentage points. This suggests that WE-SAJ can ensure that more infected patients are correctly identified, effectively detecting COVID-19 patients, while misdiagnosing as few healthy people as possible and thereby reducing unnecessary wastage of healthcare resources. The improvement in the F1-score also demonstrates that the model achieves better overall performance with equal weighting of precision and sensitivity. The increase of over 11 percentage points in the FMI indicates a higher relevance of the extracted data features to the data labels, meaning that the model has improved its ability to extract useful features.
This series of improvements can be attributed to the automatic population-sizing feature of the self-adaptive Jaya algorithm, which can effectively set the most appropriate population size, thus enhancing the tracking of the optimal solution and ultimately helping the model achieve better performance.
Figure 7 illustrates the ROC curves and AUC values of WE-Jaya (a) and WE-SAJ (b), respectively. Each point on the curve corresponds to a threshold. When the threshold is at its maximum, the True Positive Rate (TPR) and False Positive Rate (FPR) are both 0, corresponding to the origin (0, 0) of the graph; when the threshold is at its minimum, TPR = FPR = 1, corresponding to the point (1, 1) in the upper right corner. As the threshold decreases, both the TPR and the FPR increase. The ROC curves show that WE-SAJ achieves a lower false positive rate and a higher true positive rate than the Jaya-based model at most thresholds, which leads to a higher AUC for WE-SAJ than for the Jaya-based model. This indicates that WE-SAJ has greater diagnostic value than WE-Jaya.
Figure 7. ROC curve and AUC comparison between WE-Jaya and WE-SAJ.
4.4. Comparison to State-of-the-art Approaches
Compared to other state-of-the-art approaches to COVID-19 CT image classification, WE-SAJ shows significant improvement in all respects; a numerical comparison is given in Table 4. The model obtains this overall improvement over these SOTA models because of the advantages of the self-adaptive Jaya algorithm: it can automatically adjust the population size based on the current and previous population sizes and the current solution, which helps avoid local optima and increases the possibility of finding the globally optimal solution. The results also demonstrate that our method is promising for the COVID-19 CT image classification task and still has great scope for improvement.
Table 4. Performance comparison to State-of-the-art Approaches.
| Method | Sen | Spc | Prc | Acc | F1 | MCC | FMI |
|---|---|---|---|---|---|---|---|
| RBFNN [16] | 66.89±2.43 | 75.47±2.53 | 73.23±1.48 | 71.18±0.80 | 69.88±1.08 | 42.56±1.61 | 69.97±1.04 |
| WE-BBO [18] | 72.97±2.96 | 74.93±2.39 | 74.48±1.34 | 73.95±0.98 | 73.66±0.98 | 47.99±2.00 | 73.66±1.33 |
| GLCM-SVM [17] | 72.03±2.94 | 78.04±1.72 | 76.66±1.07 | 75.03±1.12 | 74.24±1.57 | 50.20±2.17 | 74.29±1.53 |
| WE-JAYA [19] | 73.31±2.26 | 78.11±1.92 | 77.03±1.35 | 75.71±1.04 | 75.10±1.23 | 51.51±2.07 | 75.14±1.22 |
| GoogLeNet [15] | 77.64±2.22 | 83.85±2.00 | 82.82±1.54 | 80.74±0.91 | 80.12±1.07 | 61.65±1.81 | 80.17±1.05 |
| WE-SAJ | 85.47±1.84 | 87.23±1.67 | 87.03±1.34 | 86.35±0.70 | 86.23±0.77 | 72.75±1.38 | 86.24±0.76 |
(Sen = Sensitivity; Spc = Specificity; Prc = Precision; Acc = Accuracy; F1 = F1 Score; MCC = Matthews correlation coefficient; FMI = Fowlkes-Mallows Index)
5. Conclusions
Artificial intelligence-based medical image analysis techniques have significant applications in the fight against COVID-19. They can help address the diagnostic challenges caused by the shortage of medical resources. This experiment validates the feasibility of the wavelet entropy and self-adaptive Jaya algorithm-based model for the COVID-19 chest CT image classification task and achieved promising performance. The approach is highly generalisable and theoretically applicable to other types of medical image classification tasks, which needs to be validated in future studies. Based on the experimentally obtained performance, we have reason to believe that, with further optimisation and improvement, we can obtain better-performing models for the diagnosis and recognition of COVID-19 and other diseases, and solve more medical challenges.
Acknowledgement
The paper was partially supported by: Royal Society International Exchanges Cost Share Award, UK (RP202G0230); Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); Hope Foundation for Cancer Research, UK (RM60G0680); British Heart Foundation Accelerator Award, UK (AA/18/3/34220); Sino-UK Industrial Fund, UK (RP202G0289); Global Challenges Research Fund (GCRF), UK (P202PF11).
Contributor Information
Wei Wang, Email: ww152@leicester.ac.uk.
Xin Zhang, Email: 973306782@qq.com.
References
- 1. Hotez PJ, Fenwick A, Molyneux D. The new COVID-19 poor and the neglected tropical diseases resurgence. Infectious Diseases of Poverty. 2021;10(1):3. doi: 10.1186/s40249-020-00784-2.
- 2. Struyf T, et al. Signs and symptoms to determine if a patient presenting in primary care or hospital outpatient settings has COVID-19 disease. Cochrane Database of Systematic Reviews. 2020;(7). doi: 10.1002/14651858.CD013665.
- 3. Fang Y, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115–E117. doi: 10.1148/radiol.2020200432.
- 4. Du J-X, et al. Shape recognition based on neural networks trained by differential evolution algorithm. Neurocomputing. 2007;70(4-6):896–903.
- 5. Wang X-F, Huang D-S, Xu H. An efficient local Chan–Vese model for image segmentation. Pattern Recognition. 2010;43(3):603–618.
- 6. Chen W-S, et al. Kernel machine-based one-parameter regularized Fisher discriminant method for face recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics). 2005;35(4):659–669. doi: 10.1109/tsmcb.2005.844596.
- 7. Wang X-F, Huang D-S. A novel density-based clustering framework by using level set method. IEEE Transactions on Knowledge and Data Engineering. 2009;21(11):1515–1531.
- 8. Han F, Ling Q-H, Huang D-S. Modified constrained learning algorithms incorporating additional functional constraints into neural networks. Information Sciences. 2008;178(3):907–919.
- 9. Han F, Huang D-S. Improved extreme learning machine for function approximation by encoding a priori information. Neurocomputing. 2006;69(16-18):2369–2373.
- 10. Meskó B, Hetényi G, Győrffy Z. Will artificial intelligence solve the human resource crisis in healthcare? BMC Health Services Research. 2018;18(1):1–4. doi: 10.1186/s12913-018-3359-4.
- 11. Huang D-S. Radial basis probabilistic neural networks: Model and application. International Journal of Pattern Recognition and Artificial Intelligence. 1999;13(07):1083–1101.
- 12. Huang D-S, Du J-X. A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks. IEEE Transactions on Neural Networks. 2008;19(12):2099–2115. doi: 10.1109/TNN.2008.2004370.
- 13. Du J-X, et al. A novel full structure optimization algorithm for radial basis probabilistic neural networks. Neurocomputing. 2006;70(1-3):592–596.
- 14. Wehbe RM, et al. DeepCOVID-XR: An artificial intelligence algorithm to detect COVID-19 on chest radiographs trained and tested on a large US clinical data set. Radiology. 2021;299(1):E167–E176. doi: 10.1148/radiol.2020203511.
- 15. Szegedy C, et al. Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR); Boston, MA, USA; 2015.
- 16. Lu Z. A pathological brain detection system based on radial basis function neural network. Journal of Medical Imaging and Health Informatics. 2016;6(5):1218–1222.
- 17. Chen Y. Covid-19 classification based on gray-level co-occurrence matrix and support vector machine. In: Santosh KC, Joshi A, editors. COVID-19: Prediction, Decision-Making, and its Impacts. Singapore: Springer Singapore; 2020. pp. 47–55.
- 18. Yao X, Han J. COVID-19 detection via wavelet entropy and biogeography-based optimization. In: COVID-19: Prediction, Decision-Making, and its Impacts. Springer; 2020. pp. 69–76.
- 19. Wang W. Covid-19 detection by wavelet entropy and Jaya. Lecture Notes in Computer Science. 2021;12836:499–508.
- 20. Rao R. Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems. International Journal of Industrial Engineering Computations. 2016;7(1):19–34.
- 21. Rao R, More K. Design optimization and analysis of selected thermal devices using self-adaptive Jaya algorithm. Energy Conversion and Management. 2017;140:24–35.
- 22. Saravanan N, Ramachandran K. Incipient gear box fault diagnosis using discrete wavelet transform (DWT) for feature extraction and classification using artificial neural network (ANN). Expert Systems with Applications. 2010;37(6):4168–4181.
- 23. Allen JB, Rabiner LR. A unified approach to short-time Fourier analysis and synthesis. Proceedings of the IEEE. 1977;65(11):1558–1564.
- 24. Quiroga RQ, et al. Wavelet entropy in event-related potentials: a new method shows ordering of EEG oscillations. Biological Cybernetics. 2001;84(4):291–299. doi: 10.1007/s004220000212.
- 25. Saritha M, Joseph KP, Mathew AT. Classification of MRI brain images using combined wavelet entropy based spider web plots and probabilistic neural network. Pattern Recognition Letters. 2013;34(16):2151–2156.
- 26. Yildiz A, et al. Application of adaptive neuro-fuzzy inference system for vigilance level estimation by using wavelet-entropy feature extraction. Expert Systems with Applications. 2009;36(4):7390–7399.
- 27. Rosso OA, et al. Wavelet entropy: a new tool for analysis of short duration brain electrical signals. Journal of Neuroscience Methods. 2001;105(1):65–75. doi: 10.1016/s0165-0270(00)00356-3.
- 28. Chen J, Li G. Tsallis wavelet entropy and its application in power signal analysis. Entropy. 2014;16(6):3009–3025.
- 29. Jansen-Winkeln B, et al. Feedforward artificial neural network-based colorectal cancer detection using hyperspectral imaging: A step towards automatic optical biopsy. Cancers. 2021;13(5):13. doi: 10.3390/cancers13050967.
- 30. Han F, Huang D-S. A new constrained learning algorithm for function approximation by encoding a priori information into feedforward neural networks. Neural Computing and Applications. 2008;17(5):433–439.
- 31. Venkata RR. Abnormal breast detection in mammogram images by feed-forward neural network trained by Jaya algorithm. Fundamenta Informaticae. 2017;151(1-4):191–211.
- 32. Han F, Ling Q-H, Huang D-S. An improved approximation approach incorporating particle swarm optimization and a priori information into neural networks. Neural Computing and Applications. 2010;19(2):255–261.
- 33. Rudolph S. On topology, size and generalization of non-linear feed-forward neural networks. Neurocomputing. 1997;16(1):1–22.
- 34. Cheng H. Multiple sclerosis identification based on fractional Fourier entropy and a modified Jaya algorithm. Entropy. 2018;20(4). doi: 10.3390/e20040254.
- 35. Degertekin SO, Bayar GY, Lamberti L. Parameter free Jaya algorithm for truss sizing-layout optimization under natural frequency constraints. Computers & Structures. 2021;245:29.
- 36. Han L. Identification of alcoholism based on wavelet Renyi entropy and three-segment encoded Jaya algorithm. Complexity. 2018;2018.
- 37. Zhao G. Smart pathological brain detection by synthetic minority oversampling technique, extreme learning machine, and Jaya algorithm. Multimedia Tools and Applications. 2018;77(17):22629–22648.
- 38. Ravipudi JL, Neebha M. Synthesis of linear antenna arrays using Jaya, self-adaptive Jaya and chaotic Jaya algorithms. AEU-International Journal of Electronics and Communications. 2018;92:54–63.
- 39. Rajasekaran S, Rajwade A. Analyzing cross-validation in compressed sensing with Poisson noise. Signal Processing. 2021;182:9.
- 40. Wu S, Flach P. A scored AUC metric for classifier evaluation and selection. Second Workshop on ROC Analysis in ML; Bonn, Germany; 2005.
- 41. Lee J-S. AUC4.5: AUC-based C4.5 decision tree algorithm for imbalanced data classification. IEEE Access. 2019;7:106034–106042.







