Abstract
In this study, we integrate a deep neural network (DNN) with hybrid approaches (feature selection and instance clustering) to build prediction models for predicting mortality risk in patients with COVID-19. In addition, we use cross-validation to evaluate the performance of these prediction models, including the feature-based DNN, cluster-based DNN, DNN, and neural network (multi-layer perceptron). The COVID-19 dataset with 12,020 instances and 10-fold cross-validation are used to evaluate the prediction models. The experimental results showed that the proposed feature-based DNN model, with Recall (98.62%), F1-score (91.99%), Accuracy (91.41%), and False Negative Rate (1.38%), outperforms the original prediction model (neural network). Furthermore, the proposed approach uses only the Top 5 features to build a DNN prediction model with high prediction performance, performing as well as the model built with all 57 features. The novelty of this study is that we integrate feature selection, instance clustering, and DNN techniques to improve prediction performance. Moreover, the proposed approach, which is built with fewer features, performs much better than the original prediction models on many metrics while still maintaining high prediction performance.
Keywords: COVID-19, Mortality risk, Deep learning, Feature-based DNN, Feature selection
1. Introduction
A new coronavirus disease known as COVID-19 has spread worldwide and has been declared a pandemic by the World Health Organization (Covid et al., 2020). Although the most important clinical symptoms are fever and cough, symptoms such as fatigue, headache and shortness of breath can also be seen. However, diagnostic tests are needed because these symptoms are not specific to the disease and the disease can progress rapidly to severe pneumonia (Akçay et al., 2020, Chen et al., 2020). Conghy et al. (2020) considered that once a coronavirus outbreak starts, it will take less than four weeks to overwhelm the healthcare system, and once hospital capacity is overwhelmed, the death rate jumps. Therefore, how to predict mortality risk in patients with COVID-19 using machine learning (ML) techniques is an interesting research issue.
There are numerous studies in the literature on COVID-19 disease detection by analyzing images. For example, Ozturk et al. (2020) proposed the DarkCovidNet model to classify COVID-19 in X-ray images. Hemdan et al. (2020) proposed a deep learning model called COVIDX-Net to analyze 25 COVID-19 and 25 healthy images. Wang et al. (2021) proposed a new architecture called M-inception, which modifies the classical Inception network, to diagnose COVID-19 from 1119 CT (computed tomography) images.
However, although ML algorithms provide data-driven capabilities, their performance and reliability are often limited by the quality of the data representation used to train and test them (Ellefsen et al., 2019). Moreover, datasets with many variables (high dimensionality) and redundant variables (features) degrade ML algorithm performance (Aremu et al., 2020, Russell and Norvig, 2016). Specifically, high-dimensional datasets highlight the limitations of ML algorithms (Laurence et al., 2019, Russell and Norvig, 2016). A dataset that contains a high amount of redundancy and low information content can result in poor ML performance and increased computation time (Cai et al., 2018, Li et al., 2019).
Pourhomayoun and Shakibi (2021) used several machine learning algorithms, including support vector machine (SVM), artificial neural networks (ANN), random forest, decision tree, logistic regression, and k-nearest neighbor (KNN), to predict the mortality rate in patients with COVID-19. How to evaluate better feature selection methods and integrate them with a deep neural network (DNN) to build more accurate prediction models is therefore an interesting research issue.
Different from COVID-19 disease detection by deep learning techniques in analyzing images, we attempt to integrate several approaches (feature selection, instance clustering, and DNN) to predict mortality risk in patients with COVID-19. We focus on enabling the proposed model to achieve high performance using fewer features. Therefore, how to use fewer, significant features to build a parsimonious prediction model with higher prediction performance is the main objective of this study. In addition, two feature selection approaches (filter and wrapper) are integrated into the DNN prediction models.
2. Related work
In this section, we review the issues and techniques related to COVID-19 disease detection and the deep learning methods used for COVID-19 disease prediction. We also review feature selection techniques.
2.1. COVID-19 disease detection by deep learning techniques in analyzing images
An increasing number of cases of novel coronavirus (2019-nCoV)-infected pneumonia (NCIP) have been identified since December 2019. The World Health Organization (WHO) defined an official name, COVID-19, for the infectious disease caused by the novel coronavirus. COVID-19 has spread worldwide and has been declared a pandemic by the WHO (Covid et al., 2020).
There are numerous studies in the literature on COVID-19 disease detection by analyzing images. Ozturk et al. (2020) proposed the DarkCovidNet model to classify X-ray images into COVID-19, healthy, and pneumonia classes, achieving an 87.02% accuracy rate. Hemdan et al. (2020) proposed a deep learning model called COVIDX-Net to analyze 25 COVID-19 and 25 healthy images, obtaining a 90% success rate. Wang et al. (2021) proposed a new architecture called M-inception, which modifies the classical Inception network, to diagnose COVID-19 from 1119 CT images, obtaining an accuracy of 89.5%, specificity of 88%, and sensitivity of 87%. Zhao et al. (2020) integrated transfer learning and data augmentation with deep learning to diagnose COVID-19 from 275 CT images, achieving an accuracy of 84.7%. Moreover, Loey et al. (2020) integrated a deep transfer learning model with classical data augmentation and a conditional generative adversarial network (CGAN) for detecting COVID-19 from chest CT images.
2.2. Feature selection techniques
The success of ML algorithms depends upon the quality of the data used to obtain a generalized predictive model of the classification problem. A dataset that contains large amounts of redundancy and low information content would result in poor performance of ML algorithms and increased computation time (Li et al., 2019). Therefore, the importance of feature selection (FS) for improving data quality, and subsequently the performance of ML algorithms, has been presented in many studies.
The classification of surface electromyography (sEMG) signals plays an important role in man-machine interfaces for properly controlling prosthetic devices with multiple degrees of freedom. Mukhopadhyay and Samui (2020) presented a detailed empirical exploration of a DNN-based classification system for upper limb position-invariant myoelectric signals. In their study, the DNN-based system outperformed other existing classifiers.
Uneven environmental conditions, such as branch and leaf occlusion, illumination variation, clusters of tomatoes, and shading, make fruit detection very challenging. Lawal (2021) therefore proposed a modified YOLOv3 model called the YOLO-Tomato model to detect tomatoes in complex environmental conditions. Because inter-subject variability, inherent complex properties, and the low signal-to-noise ratio (SNR) of electroencephalogram (EEG) signals are major challenges, Roy (2022) proposed an efficient transfer learning (TL)-based multi-scale feature fused convolutional neural network (MSFFCNN) that can capture the distinguishable features of various non-overlapping canonical frequency bands of EEG signals at different convolutional scales for multi-class MI classification. Automatic analysis, recognition, and prediction of the behavior of large-scale crowds in video-surveillance data is a research field of paramount importance for the security of modern societies. Matkovic et al. (2022) proposed a novel method for generating meta-tracklets and recognizing dominant motion patterns as a basis for automatic crowd behavior analysis at the macroscopic level, where a crowd is treated as an entity.
Considering Chi-square feature selection, which ranks features by a statistical significance test and retains only those features that are dependent on the class label, Thaseen et al. (2019) developed an intrusion detection model utilizing feature selection (Chi-square) and an ensemble of classifiers, including SVM, modified Naive Bayes (MNB), and LPBoost.
To achieve a higher classification accuracy, Ozyurt et al. (2021) proposed two basic feature generation functions (FRDEPFGN and RFINCA), which are used to extract statistical and textural features. The selected most informative features are forwarded to ANN and DNN for classification.
Similar to FS, feature extraction (FeExt) involves the derivation of new attributes from the prevailing attributes. Shastry and Sanjay (2021) proposed a hybrid FS and FeExt strategy, modified-Genetic Algorithm (m-GA) and weighted principal component analysis (wgt-PCA), for selecting features from the agricultural data set to achieve a higher classification accuracy.
Yuvaraj et al. (2021) developed a novel deep decision tree classifier that utilizes the hidden layers of a DNN as its tree nodes to process the input elements. In their study, three feature selection methods (information gain, χ², and Pearson correlation) are used to avoid a failure in classifying with limited features.
In the context of the COVID-19 outbreak, Akçay et al. (2020) stated that diagnostic tests are needed because the symptoms are not specific to the disease and the disease can progress rapidly to severe pneumonia. Besides, Conghy et al. (2020) considered that once a coronavirus outbreak starts, it will take less than four weeks to overwhelm the healthcare system; once hospital capacity is overwhelmed, the death rate jumps. Motivated by recent advances and applications of artificial intelligence (AI) and big data in various areas, Pham et al. (2020) emphasized their importance in responding to the COVID-19 outbreak and preventing the severe effects of the COVID-19 pandemic. They also provided researchers and communities with new insights into the ways that AI and big data can improve the COVID-19 situation and drive further studies on stopping the COVID-19 outbreak.
Cai et al. (2018) considered that feature selection methods can be broadly classified into three categories: filter, wrapper, and embedded methods. In embedded methods, feature selection is integrated into the learning algorithm itself. Wrapper methods evaluate feature importance based on the predictor algorithm's performance using various feature subsets. Filter methods select features by ranking them according to various criteria, such as feature variance and independence.
Different from COVID-19 disease detection by deep learning techniques in analyzing images, we attempt to integrate feature selection methods and DNN to predict mortality risk in patients with COVID-19. Two feature selection approaches (filter and wrapper) are integrated into the DNN to build prediction models with high prediction performance. How to use fewer features to build a prediction model with higher prediction performance is the main objective of this study.
2.3. Comparison of our research and literature
To describe the differences between our study and prior studies in terms of techniques, including machine learning, feature selection, and instance clustering, we provide a comparison in Table 1. Furthermore, we describe the differences between our study and prior studies on diagnosing or predicting COVID-19, as shown in Table 2.
Table 1.
Comparison of our research and literature.
| Studies | Machine learning | Feature selection | Instance clustering |
|---|---|---|---|
| Lawal (2021) | YOLOv3 | N | N |
| Mukhopadhyay and Samui (2020) | DNN | N | N |
| Ozyurt et al. (2021) | ANN, DNN | Functions (FRDEPFGN, RFINCA) | N |
| Roy (2022) | MSFFCNN | N | N |
| Shastry and Sanjay (2021) | m-GA, wgt-PCA | N | N |
| Thaseen et al. (2019) | SVM, MNB, LPBoost | χ² | N |
| Yuvaraj et al. (2021) | DNN | Information gain, χ², Pearson correlation | N |
| This study | DNN | χ², Pearson correlation, information gain, DT, LR, RF | Y |
ANN: Artificial Neural Networks; CGAN: Conditional Generative Adversarial Network; DNN: Deep Neural Networks; DT: Decision Tree; LR: Logistic Regression; KNN: k-Nearest Neighbor; m-GA: modified-Genetic Algorithm; MNB: Modified Naive Bayes; MSFFCNN: multi-scale feature fused CNN; RF: Random Forest; SVM: Support Vector Machine; wgt-PCA: weighted principal component analysis.
Table 2.
Comparison of our research and literature in analyzing COVID-19.
| Studies | Aim | Machine learning | Feature selection | Instance clustering |
|---|---|---|---|---|
| Hemdan et al. (2020) | Detect COVID-19 disease by analyzing images | COVIDX-Net | N | N |
| Loey et al. (2020) | Detect COVID-19 by analyzing chest CT images | CGAN | N | N |
| Ozturk et al. (2020) | Detect COVID-19 disease by analyzing images | DarkCovidNet | N | N |
| Wang et al. (2021) | Diagnose COVID-19 from CT images | M-inception | N | N |
| Pourhomayoun and Shakibi (2021) | Predict mortality risk in patients with COVID-19 | SVM, ANN, RF, DT, LR, KNN | Correlation | N |
| This study | Predict mortality risk in patients with COVID-19 | DNN | χ², Pearson correlation, information gain, DT, LR, RF | Y |
CGAN: Conditional Generative Adversarial Network; DNN: Deep Neural Networks; DT: Decision Tree; LR: Logistic Regression; KNN: k-Nearest Neighbor; MNB: Modified Naive Bayes; ANN: Artificial Neural Networks; RF: Random Forest; SVM: Support Vector Machine.
3. Research methodology
In order to improve the accuracy and performance of prediction models, we develop a new framework which integrates feature selection, clustering methods and deep learning to build prediction models. The proposed framework, development of prediction models, and assessment metrics are illustrated as follows.
3.1. The proposed framework for prediction models
In this section, we describe the four steps (data pre-processing, feature selection, instance clustering, and prediction model construction) of the proposed framework, as shown in Fig. 1. The four steps are introduced as follows.
Fig. 1.
The proposed framework.
Step 1: Data pre-processing
Data pre-processing is a technique that transforms the raw data into a useful format for applying machine learning techniques. At the data pre-processing stage, useless and redundant data are removed, as are unlabeled data instances.
Step 2: Feature selection
Two feature selection strategies (filter and wrapper) are used to select the most important features to build prediction models. Filter methods (χ², Pearson correlation, and information gain) select features by ranking them according to various criteria, such as feature variance and independence. After that, wrapper methods are used to evaluate feature importance based on the predictor algorithm's performance on various feature subsets.
Step 3: Instance clustering
We use specific attributes to cluster the instances of the dataset into sub-datasets. The attributes can be determined by experts or by clustering methods, such as k-means, EM (Expectation-Maximization), and DBSCAN. After that, specific clusters of instances can be used to build prediction models in the next step.
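As an illustration, the following is a minimal sketch of this step in Python (pandas/scikit-learn). It assumes the processed dataset is loaded as a DataFrame with a `country` column, and it shows k-means as the alternative when no expert-defined attribute is available; the file name, label column name, and number of clusters are hypothetical.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Expert-defined clustering: split instances by a chosen attribute (e.g., country).
df = pd.read_csv("covid19_processed.csv")            # hypothetical file name
country_subsets = {c: g for c, g in df.groupby("country")}

# Alternative: unsupervised clustering (k-means) on the feature columns.
feature_cols = [c for c in df.columns if c != "outcome"]   # "outcome" is a hypothetical label column
kmeans = KMeans(n_clusters=5, random_state=0, n_init=10)
df["cluster"] = kmeans.fit_predict(df[feature_cols])
cluster_subsets = {k: g for k, g in df.groupby("cluster")}
```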
Step 4: Prediction models building
We use the DNN to build prediction models with the important features selected by the above two feature selection strategies (filter and wrapper).
After the prediction models are built, we use popular assessment metrics to evaluate their prediction performance. The assessment metrics are described in Section 3.4.
3.2. Feature selection techniques
Two feature selection strategies (filter and wrapper) are used to select the most important features to build prediction models. Filter methods (χ², Pearson correlation, and information gain) select features by ranking them according to various criteria. Wrapper methods, namely LR (logistic regression), DT (decision tree), and RF (random forest), are used to evaluate feature importance based on the predictor algorithm's performance on various feature subsets. These selection methods are illustrated as follows.
3.2.1. Feature filter methods (χ², Pearson correlation, and information gain)
This study applies three feature filter methods, namely Chi-square (Bahassine et al., 2020, Forman, 2003, Thaseen et al., 2019), Pearson correlation (Tan et al., 2006, Yuvaraj et al., 2021), and information gain (Quinlan, 1986, Quinlan, 1987), to filter features. These three methods are described as follows.
Chi-Square (Forman, 2003)
The χ² statistic determines the level of independence between a feature (t) and the class label (c), and the result is compared to the χ² distribution with one degree of freedom. The Chi-square statistic is defined as:

$$\chi^{2}(t,c)=\frac{N(AD-CB)^{2}}{(A+C)(B+D)(A+B)(C+D)} \tag{1}$$

where

A: frequency with which feature (t) and class label (c) co-occur in the dataset.

B: frequency with which feature (t) appears without class label (c) in the dataset.

C: frequency with which class label (c) appears without feature (t) in the dataset.

D: frequency with which neither class label (c) nor feature (t) appears in the dataset.

N: total number of records.
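For illustration, a minimal NumPy sketch of Eq. (1) for a binary (0/1) feature and class label, written directly from the counts A, B, C, D, and N defined above (variable names are illustrative, not from the original code):

```python
import numpy as np

def chi_square(feature: np.ndarray, label: np.ndarray) -> float:
    """Chi-square statistic of Eq. (1) for binary (0/1) feature/label vectors."""
    A = int(np.sum((feature == 1) & (label == 1)))   # feature present, class present
    B = int(np.sum((feature == 1) & (label == 0)))   # feature present, class absent
    C = int(np.sum((feature == 0) & (label == 1)))   # feature absent, class present
    D = int(np.sum((feature == 0) & (label == 0)))   # neither present
    N = A + B + C + D
    denom = (A + C) * (B + D) * (A + B) * (C + D)
    return float(N * (A * D - C * B) ** 2 / denom) if denom else 0.0

# Example ranking (binary features assumed):
# scores = {col: chi_square(X[col].values, y.values) for col in X.columns}
```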
Pearson Correlation (Tan et al., 2006, Yuvaraj et al., 2021)
The Pearson correlation coefficient is used in the present study to estimate optimal features by calculating the degree of linear correlation between the extracted feature and the original class. The Pearson correlation coefficient between two data objects, x and y, is defined as follows:

$$corr(x,y)=\frac{\sum_{k=1}^{n}(x_{k}-\bar{x})(y_{k}-\bar{y})}{\sqrt{\sum_{k=1}^{n}(x_{k}-\bar{x})^{2}}\sqrt{\sum_{k=1}^{n}(y_{k}-\bar{y})^{2}}} \tag{2}$$

where

$\bar{x}$: the mean of x.

$\bar{y}$: the mean of y.
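A minimal NumPy sketch of Eq. (2), equivalent to `np.corrcoef` (names are illustrative):

```python
import numpy as np

def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation coefficient of Eq. (2)."""
    xc = x - x.mean()                       # x_k - x_bar
    yc = y - y.mean()                       # y_k - y_bar
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Example ranking of features by their coefficient with the class label (cf. Table 8):
# ranking = sorted(X.columns, key=lambda c: pearson(X[c].values, y.values), reverse=True)
```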
Information gain (Quinlan, 1986, Quinlan, 1987)
Let D be a set of class-labeled instances. Suppose the class label attribute has m distinct values defining m distinct classes $C_i$ (for $i = 1, 2, \ldots, m$). Let $C_{i,D}$ be the set of instances of class $C_i$ in D, and let $|D|$ and $|C_{i,D}|$ denote the number of instances in D and $C_{i,D}$, respectively. The expected information needed to classify an instance in D is given by

$$Info(D)=-\sum_{i=1}^{m}p_{i}\log_{2}(p_{i}) \tag{3}$$

where

$p_i$: the probability that an arbitrary instance in D belongs to class $C_i$, estimated by $|C_{i,D}|/|D|$.

A feature (also called an attribute) A can be used to split D into v partitions or subsets $\{D_1, D_2, \ldots, D_v\}$, where $D_j$ contains those instances in D that have outcome $a_j$ of A. The expected information needed to classify instances in D after partitioning by attribute A is given by

$$Info_{A}(D)=\sum_{j=1}^{v}\frac{|D_{j}|}{|D|}\times Info(D_{j}) \tag{4}$$

where

$|D_j|/|D|$: the weight of the jth partition.

$Info_A(D)$: the expected information required to classify an instance from D based on the partitioning by attribute A.

Information gain is defined as the difference between the original information requirement (i.e., based on just the proportion of classes) and the new requirement (i.e., obtained after partitioning by attribute A), that is, $Gain(A) = Info(D) - Info_A(D)$. Gain(A) tells us how much would be gained by branching on A. If attribute A has the highest information gain, it is chosen as the splitting attribute at node N. That is, we partition on attribute A to obtain the "best classification", so as to minimize the amount of information still required to finish classifying the instances, i.e., minimum $Info_A(D)$.
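A short sketch of Eqs. (3), (4), and Gain(A) for a categorical attribute, assuming NumPy arrays of attribute values and class labels (names illustrative):

```python
import numpy as np

def info(labels: np.ndarray) -> float:
    """Expected information Info(D), Eq. (3)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def info_gain(attribute: np.ndarray, labels: np.ndarray) -> float:
    """Gain(A) = Info(D) - Info_A(D), with Info_A(D) as in Eq. (4)."""
    total = len(labels)
    info_a = 0.0
    for value in np.unique(attribute):
        subset = labels[attribute == value]      # partition D_j with outcome a_j of A
        info_a += (len(subset) / total) * info(subset)
    return info(labels) - info_a
```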
3.2.2. Feature wrapper methods
This study applies three feature wrapper methods, namely logistic regression (LR) (Sperandei, 2014), decision tree (DT) (Quinlan, 1979, Quinlan, 2014), and random forest (RF) (Breiman, 2001), to rank features. These three methods are described as follows.
LR
Logistic regression works very similarly to linear regression, but with a binomial response variable. A logistic regression models the chance of an outcome based on individual characteristics (Sperandei, 2014). Because chance is a ratio, what is actually modeled is the logarithm of the chance (the log odds), given by:

$$\log\left(\frac{p}{1-p}\right)=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\cdots+\beta_{k}x_{k} \tag{5}$$

where

p: the probability of an event.

$\beta_0, \beta_1, \ldots, \beta_k$: the regression coefficients associated with the reference group and the explanatory variables $x_1, \ldots, x_k$.
DT
A decision tree is a flowchart-like tree structure, where each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a class label. The topmost node in a tree is the root node. Quinlan (1979) developed the ID3 decision tree algorithm and later presented C4.5, a successor of ID3 (Quinlan, 2014), which has become a benchmark against which newer supervised learning algorithms are often compared. ID3 uses information gain as its attribute selection measure; however, this measure is biased toward tests with many outcomes. C4.5 uses the gain ratio as its attribute selection measure, which attempts to overcome this bias. The two measures, information gain and gain ratio, can be formalized as follows:

$$Gain(A)=Info(D)-Info_{A}(D) \tag{6}$$

$$SplitInfo_{A}(D)=-\sum_{j=1}^{v}\frac{|D_{j}|}{|D|}\times\log_{2}\left(\frac{|D_{j}|}{|D|}\right) \tag{7}$$

$$GainRatio(A)=\frac{Gain(A)}{SplitInfo_{A}(D)} \tag{8}$$

where

Info(D): the average amount of information needed to identify the class label of a tuple in D.

$Info_A(D)$: the expected information required to classify a tuple from D based on the partitioning by attribute A.
RF
Random forest is a class of ensemble methods specially designed for decision tree classifiers. It combines the predictions made by multiple decision trees, where each tree is generated based on the values of an independent set of random vectors (Breiman, 2001). The strength of a set of classifiers refers to the average performance of the classifiers, where performance is measured probabilistically in terms of the classifier's margin:

$$margin(X,Y)=P\left(Y_{\theta}(X)=Y\right)-\max_{Z\neq Y}P\left(Y_{\theta}(X)=Z\right) \tag{9}$$

where

$Y_{\theta}(X)$: the predicted class of X according to a classifier built from some random vector $\theta$.
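A minimal scikit-learn sketch of the wrapper-style rankings, using model-derived feature importances for DT and RF and coefficient magnitudes for LR; the hyper-parameters shown are illustrative, not the settings used in this study:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def rank_features(X, y, feature_names):
    """Return one feature ranking (best first) per wrapper method (DT, LR, RF)."""
    rankings = {}
    dt = DecisionTreeClassifier(random_state=0).fit(X, y)
    rankings["DT"] = np.argsort(dt.feature_importances_)[::-1]
    rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    rankings["RF"] = np.argsort(rf.feature_importances_)[::-1]
    lr = LogisticRegression(max_iter=1000).fit(X, y)
    rankings["LR"] = np.argsort(np.abs(lr.coef_[0]))[::-1]
    return {m: [feature_names[i] for i in idx] for m, idx in rankings.items()}
```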
3.3. Development of prediction models
Two classification techniques, namely the multi-layer perceptron (MLP) and the DNN, are used for constructing prediction models. These techniques are briefly described below.
- (1) MLP: An ANN is an abstract computational model of a human brain. The architecture of an artificial neural network is defined by the characteristics of a node and the characteristics of the node's connectivity in the network (Haykin and Lippmann, 1994). The perceptron is the simplest model in the ANN family. MLPs can learn powerful non-linear transformations: in fact, with enough hidden units they can represent arbitrarily complex but smooth functions. In a perceptron, each input node is connected via a weighted link to the output node. The output of a perceptron model can be expressed as follows (Tan et al., 2006):

  $$\hat{y}=sign\left(w_{1}x_{1}+w_{2}x_{2}+\cdots+w_{d}x_{d}\right) \tag{10}$$

  where $w_1, w_2, \ldots, w_d$ are the weights of the input links, $x_1, x_2, \ldots, x_d$ are the input attribute values, w is the weight vector, and x is the input vector. The sign function, which acts as an activation function for the output neuron, outputs +1 if its argument is positive and −1 if its argument is negative. An artificial neural network has a more complex structure than a perceptron model. The goal of the MLP learning algorithm is to determine a set of weights w that minimize the total sum of squared errors:

  $$E(\mathbf{w})=\frac{1}{2}\sum_{i=1}^{N}\left(y_{i}-\hat{y}_{i}\right)^{2} \tag{11}$$

  where $y_{i}-\hat{y}_{i}$ is the prediction error. The weight update formula of the gradient descent method can be written as follows:

  $$w_{j}\leftarrow w_{j}-\lambda\frac{\partial E(\mathbf{w})}{\partial w_{j}} \tag{12}$$

  where $\lambda$ is the learning rate.
- (2) DNN: A DNN can be considered as a conventional MLP with many hidden layers (hence "deep"). The DNN parameters are optimized with back propagation using stochastic gradient descent. A DNN, i.e., an (L + 1)-layer MLP, is used to model the posterior probability of a hidden Markov model (HMM) tied state s given an observation vector o. The first L layers, $l = 0, \ldots, L-1$, are hidden layers that model the posterior probability of hidden nodes given the input vector from the previous layer, while the top layer L computes the posterior probability over all tied states using softmax (Pan et al., 2012):

  $$\mathbf{v}^{l}=\sigma\left(\mathbf{z}^{l}\right)=\sigma\left(\mathbf{W}^{l}\mathbf{v}^{l-1}+\mathbf{b}^{l}\right),\quad 0\leq l<L \tag{13}$$

  $$P(s\mid\mathbf{o})=softmax_{s}\left(\mathbf{z}^{L}\right)=\frac{\exp\left(z_{s}^{L}\right)}{\sum_{s'}\exp\left(z_{s'}^{L}\right)} \tag{14}$$

  where $\mathbf{W}^{l}$ and $\mathbf{b}^{l}$ denote the weight matrix and bias vector for hidden layer l, and $z_{j}^{l}$ and $v_{j}^{l}$ denote the jth component of the hidden node vector $\mathbf{z}^{l}$ and its activation $\mathbf{v}^{l}$, respectively, with the sigmoid activation

  $$\sigma(z)=\frac{1}{1+e^{-z}} \tag{15}$$

  A small forward-pass sketch of these equations is given after this list.
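The following NumPy sketch illustrates the forward pass of Eqs. (13)-(15); the layer sizes and random parameters are purely illustrative and are not the hyper-parameters used later in this study.

```python
import numpy as np

def sigmoid(z):                                    # Eq. (15)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):                                    # Eq. (14), top layer
    e = np.exp(z - z.max())
    return e / e.sum()

def dnn_forward(o, weights, biases):
    """Forward pass: hidden layers follow Eq. (13); the top layer applies softmax."""
    v = o
    for W, b in zip(weights[:-1], biases[:-1]):    # hidden layers l = 0 .. L-1
        v = sigmoid(W @ v + b)                     # v^l = sigma(W^l v^(l-1) + b^l)
    return softmax(weights[-1] @ v + biases[-1])   # posterior over tied states

# Illustrative random parameters for a 10-20-20-3 network.
rng = np.random.default_rng(0)
sizes = [10, 20, 20, 3]
weights = [rng.standard_normal((m, n)) * 0.1 for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
probs = dnn_forward(rng.standard_normal(10), weights, biases)
```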
3.4. Assessment metrics
There are six metrics, Precision, Recall, F1-score, Accuracy, FPR (false positive rate), and FNR (false negative rate), that are commonly used to evaluate the machine learning algorithms proposed in this study. These six metrics (Tan et al., 2006) are defined as follows.

$$Precision=\frac{TP\#}{TP\#+FP\#} \tag{16}$$

$$Recall=\frac{TP\#}{TP\#+FN\#} \tag{17}$$

$$F1\text{-}score=\frac{2\times Precision\times Recall}{Precision+Recall} \tag{18}$$

$$Accuracy=\frac{TP\#+TN\#}{TP\#+TN\#+FP\#+FN\#} \tag{19}$$

$$FPR=\frac{FP\#}{FP\#+TN\#} \tag{20}$$

$$FNR=\frac{FN\#}{FN\#+TP\#} \tag{21}$$

where TP# and TN# denote the numbers of instances correctly predicted as positive and negative, respectively, and FP# and FN# denote the numbers of instances incorrectly predicted as positive and negative, respectively.
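For reference, a short sketch that computes Eqs. (16)-(21) from the confusion-matrix counts, assuming scikit-learn and treating label 1 as the positive class (function and variable names are illustrative):

```python
from sklearn.metrics import confusion_matrix

def evaluate(y_true, y_pred):
    """Return the six metrics of Eqs. (16)-(21) as a dictionary."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "Precision": precision,
        "Recall": recall,
        "F1-score": 2 * precision * recall / (precision + recall),
        "Accuracy": (tp + tn) / (tp + tn + fp + fn),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }
```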
4. Experimental results
The dataset used to evaluate the prediction models and the hyper-parameters of the DNN model are described in Section 4.1. In Section 4.2, we first compare the prediction performance of two methods, the neural network (ANN with MLP) and the DNN, using the COVID-19 dataset provided by Pourhomayoun and Shakibi (2021) to understand the performance difference between the two methods on several metrics. In Section 4.3, we investigate the impact of the important features on prediction performance when building DNN models. Two feature selection strategies (filter and wrapper) are used to choose important features for building prediction models. In Section 4.4, we divide the COVID-19 dataset into sub-datasets according to the country attribute and build country-based DNN prediction models. Finally, the experimental results are summarized in Section 4.5. The prediction models were implemented in Python and tested on a PC running Windows 10.
Table 3.
The performance difference of prediction models in metrics.
| Measures | DNN (This study) | ANN (Pourhomayoun and Shakibi, 2021) | ANN* (Pourhomayoun and Shakibi, 2021) |
|---|---|---|---|
| Recall | 98.62% | 94.20% | 95.49% |
| Precision | 86.20% | 86.86% | 85.57% |
| F1-score | 91.99% | 90.38% | 90.44% |
| Accuracy | 91.41% | 89.98% | 89.91% |
| FPR | 15.79% | 14.24% | 15.67% |
| FNR | 1.38% | 5.79% | 4.51% |
ANN*: We re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer that was used to evaluate the performance of the DNN (this study).
4.1. The dataset description and hyper-parameters of DNN model
The original dataset consists of more than 2,670,000 laboratory-confirmed COVID-19 patients from 146 countries around the world, including 307,382 labeled samples containing both male and female patients with an average age of 44.75. At the data cleaning stage, Pourhomayoun and Shakibi (2021) removed useless and redundant data elements and the unlabeled data samples. After that, data imputation techniques, including mean/median/mode value replacement and the KNN technique, were used to handle missing values. Moreover, equal numbers of recovered and deceased patients were included by Pourhomayoun and Shakibi (2021) to make sure the dataset is balanced. Finally, 57 features were chosen out of 112 features, and the dataset contains 12,020 instances. Pourhomayoun and Shakibi (2021) have provided this processed dataset in the hope of benefiting the research community; we refer to it as the COVID-19 dataset in this study.
The hyper-parameters of the DNN model are specified as follows. First, we set the learning rate to 0.0005. Second, the network structure of the DNN is designed as follows: in the input layer, we use "ReLU" as the activation function and set the number of cells to 200. We then incorporate four hidden layers, each using "ReLU" as the activation function; hidden layer #1 contains 300 cells, hidden layer #2 contains 200 cells, hidden layer #3 contains 500 cells, and hidden layer #4 contains 250 cells. Finally, in the output layer, we set the activation function to "sigmoid" and the number of cells to 1 to obtain the desired model output.
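A minimal Keras sketch of this architecture is shown below. It assumes TensorFlow/Keras, the Adam optimizer, and binary cross-entropy loss; the optimizer, loss, and input dimension handling are not stated above and are assumptions for illustration only.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_dnn(n_features: int = 57) -> tf.keras.Model:
    """DNN with the hyper-parameters described above (optimizer/loss assumed)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        layers.Dense(200, activation="relu"),   # input layer, 200 cells
        layers.Dense(300, activation="relu"),   # hidden layer #1
        layers.Dense(200, activation="relu"),   # hidden layer #2
        layers.Dense(500, activation="relu"),   # hidden layer #3
        layers.Dense(250, activation="relu"),   # hidden layer #4
        layers.Dense(1, activation="sigmoid"),  # output layer, 1 cell
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```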
4.2. Performance of prediction models (ANN and DNN)
We first investigate the performance of the two methods (ANN and DNN) on the following metrics, Precision, Recall, F1-score, Accuracy, FPR, and FNR, to understand the difference in prediction performance on the COVID-19 dataset provided by Pourhomayoun and Shakibi (2021). Table 3 reports the prediction performance metrics of the two methods on the COVID-19 dataset. Compared to the prediction performance reported by Pourhomayoun and Shakibi (2021), DNN outperforms ANN (MLP) on Recall, F1-score, Accuracy, and FNR, whereas ANN (MLP) outperforms DNN only on Precision and FPR. In addition, we re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer used to evaluate the DNN, and the DNN again outperforms ANN (MLP) on Recall, F1-score, Accuracy, and FNR.
Second, we investigate the performance of the two methods (ANN and DNN) in terms of the area under the receiver operating characteristic curve (AUC of ROC) on the COVID-19 dataset. Table 4 reports the AUC of ROC of the two methods. Compared to the value reported by Pourhomayoun and Shakibi (2021), ANN (MLP) with an AUC of ROC of 92.76% outperforms DNN with an AUC of ROC of 91.41%. However, when we re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer used to evaluate the DNN, DNN with an AUC of ROC of 91.41% outperforms ANN (MLP) with an AUC of ROC of 89.91%.
Table 4.
The performance difference of prediction models in the area of ROC curve.
| DNN (This study) | ANN (Pourhomayoun and Shakibi, 2021) | ANN* (Pourhomayoun and Shakibi, 2021) |
|---|---|---|
| 91.41% | 92.76% | 89.91% |
ANN*: We re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer that was used to evaluate the performance of the DNN.
Finally, we investigate the performance of the two methods (ANN and DNN) in terms of the area under the precision–recall curve (AUC of PRC) on the COVID-19 dataset. Table 5 reports the AUC of PRC of the two methods. Compared to the value reported by Pourhomayoun and Shakibi (2021), DNN with an AUC of PRC of 92.75% outperforms ANN (MLP) with an AUC of PRC of 91.99%. When we re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer used to evaluate the DNN, DNN with an AUC of PRC of 92.75% still outperforms ANN (MLP) with an AUC of PRC of 91.82%. Therefore, DNN outperforms ANN (MLP) in terms of AUC of PRC.
Table 5.
The performance difference of prediction models in the area of PRC curve.
| DNN (This study) | ANN (Pourhomayoun and Shakibi, 2021) | ANN* (Pourhomayoun and Shakibi, 2021) |
|---|---|---|
| 92.75% | 91.99% | 91.82% |
ANN*: We re-executed the Python code provided by Pourhomayoun and Shakibi (2021) on the same computer that was used to evaluate the performance of the DNN.
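For completeness, a hedged sketch of how AUC of ROC and AUC of PRC can be computed from predicted probabilities with scikit-learn; this is illustrative and not the authors' evaluation code, and the variable names are assumptions.

```python
from sklearn.metrics import auc, precision_recall_curve, roc_auc_score

def auc_scores(y_true, y_prob):
    """AUC of the ROC curve and AUC of the precision-recall curve."""
    roc_auc = roc_auc_score(y_true, y_prob)
    precision, recall, _ = precision_recall_curve(y_true, y_prob)
    prc_auc = auc(recall, precision)
    return roc_auc, prc_auc
```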
4.3. The impact of the important features
4.3.1. Feature filter methods (χ², Pearson correlation, and information gain)
We compare the prediction performance of DNN models built with different features recommended by the feature filter methods (χ², Pearson correlation, and information gain) using the criteria Precision, Recall, F1-score, Accuracy, ROC, PRC, FPR, and FNR.
First, we determine the level of independence between each feature and the class label using the χ² filter method. The Chi-square values of all 57 features are shown in Table 6. After that, we choose the Top N features, according to the Chi-square values, to build DNN prediction models. From the results in Table 7, we know that the DNN model built with the Top 25 features performs very well. That is, we could use the Top 25 features to build a prediction model with prediction performance comparable to the model built with all 57 features. A sketch of this Top-N procedure is given below.
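The following sketch outlines the Top-N procedure used throughout Section 4.3. It assumes a ranking list `ranked_features` (e.g., ordered by the χ² values in Table 6), pandas DataFrames for the train/test split, and the `build_dnn` and `evaluate` helpers sketched in Sections 4.1 and 3.4; the epoch count, batch size, and 0.5 decision threshold are assumptions, not reported settings.

```python
def top_n_experiment(X_train, y_train, X_test, y_test, ranked_features):
    """Train a DNN on the Top-N ranked features and collect the six metrics."""
    results = {}
    for n in (5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 57):
        cols = ranked_features[:n]
        model = build_dnn(n_features=len(cols))           # sketched in Section 4.1
        model.fit(X_train[cols], y_train, epochs=50, batch_size=64, verbose=0)
        y_pred = (model.predict(X_test[cols]) > 0.5).astype(int).ravel()
        results[n] = evaluate(y_test, y_pred)             # Eqs. (16)-(21)
    return results
```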
Table 6.
The Chi-square values of all features.
| No | Feature | chi | No | Feature | chi |
|---|---|---|---|---|---|
| 1 | city | 0.000000 | 30 | chronic_disease_HIV | 0.157299 |
| 2 | province | 0.000000 | 31 | chronic_disease_Parkinson | 0.157299 |
| 3 | country | 0.000000 | 32 | anorexia | 0.157299 |
| 4 | age | 0.000000 | 33 | expectoration | 0.157299 |
| 5 | travel_history_location | 0.000000 | 34 | lesions on chest radiographs | 0.157299 |
| 6 | chronic_disease_binary | 0.000000 | 35 | hypertension | 0.157299 |
| 7 | chronic_disease_Hypertension | 0.000000 | 36 | cardiac disease | 0.157299 |
| 8 | sex | 0.000000 | 37 | hypoxia | 0.157299 |
| 9 | pneumonia | 0.000000 | 38 | chronic_disease_prostate | 0.179712 |
| 10 | respiratory distress | 0.000000 | 39 | chronic_disease_TB | 0.317311 |
| 11 | chronic_disease_Diabetes | 0.000000 | 40 | chronic_disease_cereberal | 0.317311 |
| 12 | septic shock | 0.000013 | 41 | conjunctivitis | 0.317311 |
| 13 | chronic_disease_kidney | 0.000162 | 42 | dizziness | 0.317311 |
| 14 | Heart attack | 0.000311 | 43 | emesis | 0.317311 |
| 15 | rhinorrhea | 0.001565 | 44 | eye irritation | 0.317311 |
| 16 | sore throat | 0.004509 | 45 | obnubilation | 0.317311 |
| 17 | kidney failure | 0.004678 | 46 | myelofibrosis | 0.317311 |
| 18 | chronic_disease_heart | 0.008151 | 47 | somnolence | 0.317311 |
| 19 | chronic_disease_cardiac | 0.014306 | 48 | cough | 0.324756 |
| 20 | dyspnea | 0.014306 | 49 | Myalgia | 0.479500 |
| 21 | gasp | 0.014306 | 50 | chronic_disease_hypothyroidism | 0.563703 |
| 22 | headache | 0.019631 | 51 | diarrhea | 0.563703 |
| 23 | chronic_disease_COPD | 0.025347 | 52 | sputum | 0.563703 |
| 24 | fever | 0.048815 | 53 | cold | 0.563703 |
| 25 | chronic_disease_asthma | 0.058782 | 54 | shortness of breath | 0.654721 |
| 26 | chest pain | 0.058782 | 55 | chronic_disease_cancer | 1.000000 |
| 27 | chronic_disease_bronchitis | 0.083265 | 56 | chronic_disease_dyslipidemia | 1.000000 |
| 28 | chills | 0.102470 | 57 | fatigue | 1.000000 |
| 29 | chronic_disease_Hepatitis | 0.157299 |
Table 7.
The prediction performance of Top N features by the χ² method.
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.8636 | 0.9745 | 0.9157 | 0.9103 | 0.9103 | 0.9254 | 0.1539 | 0.0255 |
| 10 | 0.8601 | 0.9699 | 0.9117 | 0.9061 | 0.9061 | 0.9225 | 0.1577 | 0.0301 |
| 15 | 0.8665 | 0.9647 | 0.9130 | 0.9081 | 0.9081 | 0.9244 | 0.1486 | 0.0353 |
| 20 | 0.8627 | 0.9814 | 0.9182 | 0.9126 | 0.9126 | 0.9267 | 0.1562 | 0.0186 |
| 25 | 0.8608 | 0.9815 | 0.9172 | 0.9114 | 0.9114 | 0.9258 | 0.1587 | 0.0185 |
| 30 | 0.8629 | 0.9732 | 0.9148 | 0.9093 | 0.9093 | 0.9248 | 0.1546 | 0.0268 |
| 35 | 0.8633 | 0.9784 | 0.9172 | 0.9117 | 0.9117 | 0.9262 | 0.1549 | 0.0216 |
| 40 | 0.8632 | 0.9872 | 0.9211 | 0.9154 | 0.9154 | 0.9284 | 0.1564 | 0.0128 |
| 45 | 0.8663 | 0.9689 | 0.9147 | 0.9097 | 0.9097 | 0.9254 | 0.1496 | 0.0311 |
| 50 | 0.8636 | 0.9775 | 0.9170 | 0.9116 | 0.9116 | 0.9262 | 0.1544 | 0.0225 |
| 54 | 0.8674 | 0.9607 | 0.9117 | 0.9069 | 0.9069 | 0.9239 | 0.1469 | 0.0393 |
| 57 | 0.8620 | 0.9862 | 0.9199 | 0.9141 | 0.9141 | 0.9275 | 0.1579 | 0.0138 |
Second, we calculate the Pearson correlation coefficient values of all 57 features using the Pearson correlation filter method, as shown in Table 8. After that, we choose the Top N features, according to the Pearson correlation coefficient values, to build DNN prediction models. From the results in Table 9, we know that only the DNN model built with the Top 55 features filtered by the Pearson method performs well. That is, we would need the Top 55 features to build a prediction model with prediction performance comparable to the model built with all 57 features.
Table 8.
The Pearson correlation coefficient values of all features.
| No | Feature | Pearson | No | Feature | Pearson |
|---|---|---|---|---|---|
| 1 | country | 0.502119 | 30 | chronic_disease_hypothyroidism | 0.0053 |
| 2 | age | 0.126401 | 31 | diarrhea | 0.0053 |
| 3 | sex | 0.114197 | 32 | cold | 0.0053 |
| 4 | chronic_disease_binary | 0.089623 | 33 | fatigue | 0.0000 |
| 5 | chronic_disease_Hypertension | 0.077358 | 34 | chronic_disease_dyslipidemia | −0.0000 |
| 6 | chronic_disease_Diabetes | 0.059286 | 35 | chronic_disease_cancer | −0.0000 |
| 7 | chronic_disease_kidney | 0.034424 | 36 | shortness of breath | −0.0041 |
| 8 | rhinorrhea | 0.028855 | 37 | dizziness | −0.0091 |
| 9 | sore throat | 0.025922 | 38 | emesis | −0.0091 |
| 10 | chronic_disease_heart | 0.024139 | 39 | obnubilation | −0.0091 |
| 11 | chronic_disease_cardiac | 0.022348 | 40 | myelofibrosis | −0.0091 |
| 12 | headache | 0.021291 | 41 | somnolence | −0.0091 |
| 13 | chronic_disease_COPD | 0.020400 | 42 | anorexia | −0.0129 |
| 14 | fever | 0.018040 | 43 | expectoration | −0.0129 |
| 15 | chronic_disease_asthma | 0.017242 | 44 | hypertension | −0.0129 |
| 16 | chronic_disease_bronchitis | 0.015800 | 45 | cardiac disease | −0.0129 |
| 17 | chills | 0.014898 | 46 | hypoxia | −0.0129 |
| 18 | chronic_disease_Hepatitis | 0.012900 | 47 | chest pain | −0.0172 |
| 19 | chronic_disease_HIV | 0.012900 | 48 | dyspnea | −0.0223 |
| 20 | chronic_disease_Parkinson | 0.012900 | 49 | gasp | −0.0223 |
| 21 | lesions on chest radiographs | 0.012900 | 50 | kidney failure | −0.0258 |
| 22 | chronic_disease_prostate | 0.012240 | 51 | Heart attack | −0.0329 |
| 23 | chronic_disease_cereberal | 0.009123 | 52 | septic shock | −0.0398 |
| 24 | chronic_disease_TB | 0.009121 | 53 | city | −0.0456 |
| 25 | conjunctivitis | 0.009121 | 54 | respiratory distress | −0.0650 |
| 26 | eye irritation | 0.009121 | 55 | pneumonia | −0.0716 |
| 27 | cough | 0.009007 | 56 | province | −0.1025 |
| 28 | Myalgia | 0.006452 | 57 | travel_history_location | −0.1265 |
| 29 | sputum | 0.005267 |
Table 9.
The prediction performance of Top N features by Pearson method.
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.9370 | 0.7975 | 0.8617 | 0.8720 | 0.8720 | 0.9179 | 0.0536 | 0.2025 |
| 10 | 0.8888 | 0.8085 | 0.8467 | 0.8537 | 0.8537 | 0.8965 | 0.1012 | 0.1915 |
| 15 | 0.9306 | 0.8098 | 0.8660 | 0.8747 | 0.8747 | 0.9178 | 0.0604 | 0.1902 |
| 20 | 0.9303 | 0.8090 | 0.8654 | 0.8742 | 0.8742 | 0.9174 | 0.0606 | 0.1910 |
| 25 | 0.9307 | 0.8108 | 0.8666 | 0.8752 | 0.8752 | 0.9180 | 0.0604 | 0.1892 |
| 30 | 0.9299 | 0.8116 | 0.8667 | 0.8752 | 0.8752 | 0.9178 | 0.0612 | 0.1884 |
| 35 | 0.9292 | 0.8125 | 0.8669 | 0.8753 | 0.8753 | 0.9177 | 0.0619 | 0.1875 |
| 40 | 0.9292 | 0.8103 | 0.8657 | 0.8743 | 0.8743 | 0.9172 | 0.0617 | 0.1897 |
| 45 | 0.9296 | 0.8105 | 0.8660 | 0.8745 | 0.8745 | 0.9174 | 0.0614 | 0.1895 |
| 50 | 0.9290 | 0.8118 | 0.8665 | 0.8749 | 0.8749 | 0.9174 | 0.0621 | 0.1882 |
| 55 | 0.8661 | 0.9742 | 0.9170 | 0.9118 | 0.9118 | 0.9266 | 0.1506 | 0.0258 |
| 57 | 0.8620 | 0.9862 | 0.9199 | 0.9141 | 0.9141 | 0.9275 | 0.1579 | 0.0138 |
Finally, the information gain values of all 57 features calculated by the information gain filter method are shown in Table 10. After that, we choose the Top N features, according to the information gain values, to build DNN prediction models. From the results in Table 11, we know that the DNN model built with the Top 5 features performs very well. That is, we can use only the Top 5 features to build a prediction model with high prediction performance, performing as well as the model built with all 57 features, while saving redundant computation cost.
Table 10.
The information gain (info) values of all features.
| No | Feature | Info | No | Feature | Info |
|---|---|---|---|---|---|
| 1 | city | 0.411180 | 30 | cardiac disease | 0.0001 |
| 2 | province | 0.409138 | 31 | chronic_disease_Hepatitis | 0.0001 |
| 3 | age | 0.326288 | 32 | chronic_disease_HIV | 0.0001 |
| 4 | country | 0.280728 | 33 | chronic_disease_Parkinson | 0.0001 |
| 5 | travel_history_location | 0.017563 | 34 | expectoration | 0.0001 |
| 6 | sex | 0.006536 | 35 | hypertension | 0.0001 |
| 7 | chronic_disease_binary | 0.004766 | 36 | hypoxia | 0.0001 |
| 8 | chronic_disease_Hypertension | 0.003733 | 37 | lesions on chest radiographs | 0.0001 |
| 9 | pneumonia | 0.003241 | 38 | chronic_disease_prostate | 0.0001 |
| 10 | respiratory distress | 0.002587 | 39 | chronic_disease_TB | 0.0001 |
| 11 | chronic_disease_Diabetes | 0.002259 | 40 | conjunctivitis | 0.0001 |
| 12 | septic shock | 0.001097 | 41 | dizziness | 0.0001 |
| 13 | Heart attack | 0.000750 | 42 | emesis | 0.0001 |
| 14 | chronic_disease_kidney | 0.000718 | 43 | eye irritation | 0.0001 |
| 15 | rhinorrhea | 0.000577 | 44 | myelofibrosis | 0.0001 |
| 16 | kidney failure | 0.000462 | 45 | obnubilation | 0.0001 |
| 17 | chronic_disease_heart | 0.000404 | 46 | somnolence | 0.0001 |
| 18 | sore throat | 0.000375 | 47 | chronic_disease_cereberal | 0.0000 |
| 19 | chronic_disease_cardiac | 0.000346 | 48 | cough | 0.0000 |
| 20 | dyspnea | 0.000346 | 49 | Myalgia | 0.0000 |
| 21 | gasp | 0.000346 | 50 | chronic_disease_hypothyroidism | 0.0000 |
| 22 | chronic_disease_COPD | 0.000288 | 51 | cold | 0.0000 |
| 23 | headache | 0.000258 | 52 | diarrhea | 0.0000 |
| 24 | chronic_disease_bronchitis | 0.000173 | 53 | sputum | 0.0000 |
| 25 | chest pain | 0.000165 | 54 | shortness of breath | 0.0000 |
| 26 | chronic_disease_asthma | 0.000165 | 55 | chronic_disease_cancer | 0.0000 |
| 27 | fever | 0.000164 | 56 | chronic_disease_dyslipidemia | 0.0000 |
| 28 | chills | 0.000121 | 57 | fatigue | 0.0000 |
| 29 | anorexia | 0.000115 |
Table 11.
The prediction performance of Top N features by information gain method.
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.8586 | 0.9822 | 0.9163 | 0.9102 | 0.9102 | 0.9249 | 0.1617 | 0.0178 |
| 10 | 0.8495 | 0.9852 | 0.9123 | 0.9053 | 0.9053 | 0.9210 | 0.1745 | 0.0148 |
| 15 | 0.8506 | 0.9855 | 0.9131 | 0.9062 | 0.9062 | 0.9217 | 0.1730 | 0.0145 |
| 20 | 0.8525 | 0.9875 | 0.9150 | 0.9083 | 0.9083 | 0.9231 | 0.1709 | 0.0125 |
| 25 | 0.8545 | 0.9814 | 0.9136 | 0.9072 | 0.9072 | 0.9226 | 0.1671 | 0.0186 |
| 30 | 0.8512 | 0.9920 | 0.9162 | 0.9093 | 0.9093 | 0.9236 | 0.1734 | 0.0080 |
| 35 | 0.8516 | 0.9920 | 0.9165 | 0.9096 | 0.9096 | 0.9238 | 0.1729 | 0.0080 |
| 40 | 0.8506 | 0.9925 | 0.9161 | 0.9091 | 0.9091 | 0.9234 | 0.1744 | 0.0075 |
| 45 | 0.8520 | 0.9925 | 0.9169 | 0.9101 | 0.9101 | 0.9241 | 0.1724 | 0.0075 |
| 50 | 0.8514 | 0.9940 | 0.9172 | 0.9102 | 0.9102 | 0.9242 | 0.1735 | 0.0060 |
| 54 | 0.8506 | 0.9935 | 0.9165 | 0.9095 | 0.9095 | 0.9237 | 0.1745 | 0.0065 |
| 57 | 0.8620 | 0.9862 | 0.9199 | 0.9141 | 0.9141 | 0.9275 | 0.1579 | 0.0138 |
From the above discussion, we find that the information gain filter method performs best. Therefore, we consider the information gain filter method a better way to select features for building DNN prediction models.
4.3.2. Feature wrapper methods (DT, LR, and RF)
We compare the prediction performance of DNN models built with different features ranked by the feature wrapper methods (DT, LR, and RF) using the criteria Precision, Recall, F1-score, Accuracy, ROC, PRC, FPR, and FNR.
First, we rank the 57 features using the DT wrapper method. Then, we build DNN models with the Top N features according to their rankings. The DNN prediction performance with the Top N features selected by the DT wrapper method is shown in Table 12. From the results in Table 12, we know that the Top 5 features selected by the DT wrapper method perform well. That is, we could use the Top 5 features to build a DNN prediction model with prediction performance comparable to the model built with all 57 features.
Table 12.
The prediction performance of Top N features by wrapper method (DT).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.8611 | 0.9819 | 0.9175 | 0.9117 | 0.9117 | 0.9260 | 0.1584 | 0.0181 |
| 10 | 0.8644 | 0.9755 | 0.9166 | 0.9112 | 0.9112 | 0.9261 | 0.1531 | 0.0245 |
| 15 | 0.8630 | 0.9854 | 0.9201 | 0.9145 | 0.9145 | 0.9278 | 0.1564 | 0.0146 |
| 20 | 0.8603 | 0.9870 | 0.9193 | 0.9134 | 0.9134 | 0.9269 | 0.1602 | 0.0130 |
| 25 | 0.8643 | 0.9789 | 0.9180 | 0.9126 | 0.9126 | 0.9268 | 0.1537 | 0.0211 |
| 30 | 0.8599 | 0.9879 | 0.9195 | 0.9135 | 0.9135 | 0.9269 | 0.1609 | 0.0121 |
| 35 | 0.8620 | 0.9875 | 0.9205 | 0.9147 | 0.9147 | 0.9279 | 0.1581 | 0.0125 |
| 40 | 0.8614 | 0.9832 | 0.9183 | 0.9125 | 0.9125 | 0.9265 | 0.1582 | 0.0168 |
| 45 | 0.8644 | 0.9789 | 0.9181 | 0.9126 | 0.9126 | 0.9269 | 0.1536 | 0.0211 |
| 50 | 0.8761 | 0.9516 | 0.9123 | 0.9085 | 0.9085 | 0.9259 | 0.1346 | 0.0484 |
Second, we rank the 57 features using the LR wrapper method. Then, we build DNN models with the Top N features according to their rankings. The DNN prediction performance with the Top N features selected by the LR wrapper method is shown in Table 13. From the results in Table 13, we know that the Top 50 features selected by the LR wrapper method perform well. That is, we could use the Top 50 features to build a DNN prediction model with prediction performance comparable to the model built with all 57 features.
Table 13.
The prediction performance of Top N features by wrapper method (LR).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.4962 | 0.4013 | 0.4437 | 0.4969 | 0.4969 | 0.5984 | 0.4075 | 0.5987 |
| 10 | 0.5017 | 0.5050 | 0.5033 | 0.5017 | 0.5017 | 0.6271 | 0.5017 | 0.4950 |
| 15 | 0.5037 | 0.4065 | 0.4499 | 0.5030 | 0.5030 | 0.6035 | 0.4005 | 0.5935 |
| 20 | 0.4965 | 0.5953 | 0.5415 | 0.4958 | 0.4958 | 0.6471 | 0.6037 | 0.4047 |
| 25 | 0.5577 | 0.6276 | 0.5906 | 0.5649 | 0.5649 | 0.6857 | 0.4978 | 0.3724 |
| 30 | 0.5566 | 0.6303 | 0.5911 | 0.5641 | 0.5641 | 0.6859 | 0.5022 | 0.3697 |
| 35 | 0.5573 | 0.6303 | 0.5916 | 0.5648 | 0.5648 | 0.6862 | 0.5007 | 0.3697 |
| 40 | 0.9303 | 0.7413 | 0.8251 | 0.8428 | 0.8428 | 0.9004 | 0.0556 | 0.2587 |
| 45 | 0.9286 | 0.7556 | 0.8332 | 0.8488 | 0.8488 | 0.9032 | 0.0581 | 0.2444 |
| 50 | 0.9305 | 0.8128 | 0.8677 | 0.8760 | 0.8760 | 0.9184 | 0.0607 | 0.1872 |
Finally, we rank the 57 features using the RF wrapper method. Then, we build DNN models with the Top N features according to their rankings. The DNN prediction performance with the Top N features selected by the RF wrapper method is shown in Table 14. From the results in Table 14, we know that the Top 10 features selected by the RF wrapper method perform well. That is, we could use the Top 10 features to build a DNN prediction model with prediction performance comparable to the model built with all 57 features.
Table 14.
The prediction performance of Top N features by wrapper method (RF).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.8694 | 0.9634 | 0.9140 | 0.9093 | 0.9093 | 0.9255 | 0.1448 | 0.0366 |
| 10 | 0.8629 | 0.9809 | 0.9181 | 0.9125 | 0.9125 | 0.9266 | 0.1559 | 0.0191 |
| 15 | 0.8695 | 0.9679 | 0.9161 | 0.9113 | 0.9113 | 0.9267 | 0.1453 | 0.0321 |
| 20 | 0.8722 | 0.9686 | 0.9178 | 0.9133 | 0.9133 | 0.9282 | 0.1419 | 0.0314 |
| 25 | 0.8590 | 0.9832 | 0.9169 | 0.9109 | 0.9109 | 0.9253 | 0.1614 | 0.0168 |
| 30 | 0.8661 | 0.9772 | 0.9183 | 0.9131 | 0.9131 | 0.9273 | 0.1511 | 0.0228 |
| 35 | 0.8632 | 0.9792 | 0.9175 | 0.9120 | 0.9120 | 0.9264 | 0.1552 | 0.0208 |
| 40 | 0.8654 | 0.9699 | 0.9146 | 0.9095 | 0.9095 | 0.9251 | 0.1509 | 0.0301 |
| 45 | 0.8607 | 0.9857 | 0.9189 | 0.9131 | 0.9131 | 0.9268 | 0.1596 | 0.0143 |
| 50 | 0.8611 | 0.9725 | 0.9134 | 0.9078 | 0.9078 | 0.9237 | 0.1569 | 0.0275 |
From the above discussion, we find that the DT wrapper method performs best among the wrapper methods. Therefore, we consider the DT wrapper method a better way to select features for building DNN prediction models.
4.4. Performance of prediction models (ANN and DNN) in different countries
There are 12,020 instances in the COVID-19 dataset. We first cluster the instances according to the country attribute and then select the top 5 countries with more than 100 instances: China (139), Ethiopia (113), India (7309), Philippines (4058), and Singapore (111). We compare the prediction performance of ANN/DNN models built with the instances of each country using the criteria Precision, Recall, F1-score, Accuracy, ROC, PRC, FPR, and FNR.
From the results in Table 15, we know that the DNN model outperforms ANN (MLP) on most prediction performance metrics (e.g., Precision, F1-score, and Accuracy) for China (139), India (7309), and the Philippines (4058), and performs as well as the ANN model for the other two countries, Ethiopia (113) and Singapore (111). That is, we consider DNN a good method for building COVID-19 prediction models for predicting mortality risk in patients.
Table 15.
The prediction performance of ANN and DNN (Country-based instances).
| Country | Method | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|---|
| China (139) | DNN | 0.9084 | 0.9835 | 0.9444 | 0.8995 | 0.6584 | 0.9531 | 0.6667 | 0.0165 |
| China (139) | ANN | 0.8705 | 1.0000 | 0.9308 | 0.8702 | 0.5000 | 0.9353 | 1.0000 | 0.0000 |
| Ethiopia (113) | DNN | 0.9646 | 1.0000 | 0.9820 | 0.9649 | 0.5000 | 0.9823 | 1.0000 | 0.0000 |
| Ethiopia (113) | ANN | 0.9646 | 1.0000 | 0.9820 | 0.9649 | 0.5000 | 0.9823 | 1.0000 | 0.0000 |
| India (7309) | DNN | 0.7021 | 0.9739 | 0.8159 | 0.9033 | 0.9286 | 0.8409 | 0.1167 | 0.0261 |
| India (7309) | ANN | 0.6983 | 0.8285 | 0.7578 | 0.8834 | 0.8637 | 0.7822 | 0.1011 | 0.1715 |
| Philippines (4058) | DNN | 0.9423 | 0.9932 | 0.9671 | 0.9367 | 0.5506 | 0.9709 | 0.8919 | 0.0068 |
| Philippines (4058) | ANN | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| Singapore (111) | DNN | 0.9640 | 1.0000 | 0.9817 | 0.9640 | 0.5000 | 0.9820 | 1.0000 | 0.0000 |
| Singapore (111) | ANN | 0.9640 | 1.0000 | 0.9817 | 0.9640 | 0.5000 | 0.9820 | 1.0000 | 0.0000 |
Furthermore, the information gain values of all 57 features are calculated by the information gain filter method. After that, we choose the Top N features, according to the information gain values, to build prediction models (ANN and DNN) for two countries (India and the Philippines). We select the top 37 features from the India dataset and the top 27 features from the Philippines dataset to investigate the prediction performance.
From the results in Table 16, Table 17, we know that the models (ANN and DNN) built with the Top 10 features of the India dataset perform very well. That is, we could use the Top 10 features to build a prediction model with prediction performance comparable to the model built with all features. Besides, from the results in Table 18, Table 19, we know that the models (ANN and DNN) built with the Top 5 features of the Philippines dataset perform very well. That is, we could use the Top 5 features to build a prediction model with high prediction performance.
Table 16.
The prediction performance of ANN with Top N features (India).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.6999 | 0.7582 | 0.7279 | 0.8752 | 0.8332 | 0.7557 | 0.0918 | 0.2418 |
| 10 | 0.6799 | 0.8173 | 0.7423 | 0.8751 | 0.8543 | 0.7687 | 0.1086 | 0.1827 |
| 15 | 0.6995 | 0.7551 | 0.7262 | 0.8747 | 0.8318 | 0.7543 | 0.0916 | 0.2449 |
| 20 | 0.6838 | 0.8291 | 0.7494 | 0.8780 | 0.8604 | 0.7752 | 0.1082 | 0.1709 |
| 25 | 0.6806 | 0.8198 | 0.7437 | 0.8756 | 0.8556 | 0.7700 | 0.1086 | 0.1802 |
| 30 | 0.6951 | 0.7837 | 0.7368 | 0.8767 | 0.8433 | 0.7632 | 0.0970 | 0.2163 |
| 35 | 0.6949 | 0.8707 | 0.7730 | 0.8874 | 0.8814 | 0.7971 | 0.1079 | 0.1293 |
| 37 | 0.6900 | 0.8745 | 0.7714 | 0.8859 | 0.8818 | 0.7961 | 0.1109 | 0.1255 |
Table 17.
The prediction performance of DNN with Top N features (India).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.6940 | 0.9316 | 0.7954 | 0.8945 | 0.9078 | 0.8203 | 0.1160 | 0.0684 |
| 10 | 0.6949 | 0.9739 | 0.8111 | 0.9001 | 0.9266 | 0.8373 | 0.1207 | 0.0261 |
| 15 | 0.6871 | 0.9633 | 0.8021 | 0.8953 | 0.9197 | 0.8292 | 0.1239 | 0.0367 |
| 20 | 0.6957 | 0.9764 | 0.8125 | 0.9008 | 0.9279 | 0.8387 | 0.1205 | 0.0236 |
| 25 | 0.6950 | 0.9671 | 0.8087 | 0.8993 | 0.9236 | 0.8346 | 0.1198 | 0.0329 |
| 30 | 0.6944 | 0.9789 | 0.8125 | 0.9005 | 0.9286 | 0.8390 | 0.1216 | 0.0211 |
| 35 | 0.6918 | 0.9739 | 0.8090 | 0.8988 | 0.9257 | 0.8357 | 0.1225 | 0.0261 |
| 37 | 0.6869 | 0.9764 | 0.8065 | 0.8968 | 0.9254 | 0.8343 | 0.1256 | 0.0236 |
Table 18.
The prediction performance of ANN with Top N features (Philippines).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| 10 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| 15 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| 20 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| 25 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
| 27 | 0.9362 | 1.0000 | 0.9670 | 0.9362 | 0.5000 | 0.9681 | 1.0000 | 0.0000 |
Table 19.
The prediction performance of DNN with Top N features (Philippines).
| Top N | Precision | Recall | F1-score | Accuracy | ROC | PRC | FPR | FNR |
|---|---|---|---|---|---|---|---|---|
| 5 | 0.9394 | 0.9958 | 0.9668 | 0.9359 | 0.5269 | 0.9696 | 0.9421 | 0.0042 |
| 10 | 0.9409 | 0.9926 | 0.9661 | 0.9347 | 0.5388 | 0.9702 | 0.9151 | 0.0074 |
| 15 | 0.9398 | 0.9942 | 0.9662 | 0.9349 | 0.5299 | 0.9697 | 0.9344 | 0.0058 |
| 20 | 0.9397 | 0.9929 | 0.9656 | 0.9337 | 0.5293 | 0.9696 | 0.9344 | 0.0071 |
| 25 | 0.9393 | 0.9939 | 0.9659 | 0.9342 | 0.5259 | 0.9695 | 0.9421 | 0.0061 |
| 27 | 0.9399 | 0.9955 | 0.9669 | 0.9362 | 0.5306 | 0.9698 | 0.9344 | 0.0045 |
From the above experimental results, we find that the proposed approach, which integrates the deep learning method (DNN) with hybrid methods (feature selection and instance clustering), performs very well in predicting mortality risk in patients with COVID-19 using fewer features.
4.5. Summary of experimental results
To investigate the difference in prediction performance, we first build a DNN model on the COVID-19 dataset provided by Pourhomayoun and Shakibi (2021). We find that the built DNN model outperforms the ANN model of Pourhomayoun and Shakibi (2021) on the criteria Recall, F1-score, Accuracy, ROC, and PRC.
In addition, we investigate the impact of the important features using three feature filter methods (χ², Pearson correlation, and information gain). From the experimental results, we find that the information gain filter method performs best; its top 5 features are sufficient to build a DNN prediction model that performs as well as the original DNN model built with all 57 features. Therefore, we consider the information gain filter method a better way to select features for building DNN prediction models.
Furthermore, we investigate the impact of the important features using three feature wrapper methods (DT, LR, and RF). From the experimental results, we find that the DT wrapper method performs best among the three wrapper methods. That is, the DT wrapper method selects the top 10 important features to build a DNN prediction model that performs as well as the original DNN model built with all 57 features.
Finally, we study the difference in prediction performance between ANN (MLP) and DNN across different countries: China (139), Ethiopia (113), India (7309), Philippines (4058), and Singapore (111). From the experimental results, we find that the DNN model outperforms ANN (MLP) on most prediction performance metrics (e.g., Precision, F1-score, and Accuracy) for China (139), India (7309), and the Philippines (4058). That is, the DNN model seems to be a better method for predicting mortality risk in patients with COVID-19. Moreover, the proposed hybrid approach (instance clustering, feature selection, and deep learning) performs very well in predicting mortality risk in patients with COVID-19 using fewer features.
5. Conclusion
In this study, we integrate deep learning with hybrid approaches (instance clustering and feature selection) to build prediction models for predicting mortality risk in patients with COVID-19.
The experimental results showed that the proposed feature-based DNN model, with Recall (98.62%), F1-score (91.99%), Accuracy (91.41%), and FNR (1.38%), outperforms the original prediction model (ANN). Furthermore, the proposed approach uses only the Top 5 features to build a DNN prediction model whose performance matches that of the model built with all 57 features. We also find that the information gain filter performs better than the other two feature filter methods (χ² and Pearson correlation). Therefore, the proposed feature-based DNN framework can use fewer features to build a prediction model with higher prediction performance for predicting mortality risk in patients with COVID-19.
The weaknesses and limitations of the proposed model are as follows. First, we use only one dataset to evaluate the proposed model. Second, we do not evaluate the performance of integrating the feature filter or wrapper methods with other machine learning algorithms, such as support vector machine, artificial neural networks, random forest, decision tree, logistic regression, and k-nearest neighbor.
Several issues remain to be addressed in the future. First, other deep learning techniques could be considered. Second, it would be interesting to apply other data normalization methods to build more accurate prediction models. Finally, exploring other feature selection methods to further increase prediction performance remains an interesting issue. Incorporating these issues would be a promising direction for future research.
CRediT authorship contribution statement
Thing-Yuan Chang: Conceptualization, Supervision. Cheng-Kui Huang: Methodology, Writing – review & editing. Cheng-Hsiung Weng: Conceptualization, Methodology, Writing – review & editing, Software. Jing-Yuan Chen: Writing – original draft, Software.
Declaration of Competing Interest
The authors declare that they have no conflict of interest.
Data availability
Data will be made available on request.
References
- Akçay M.Ş., Özlü T., Yilmaz A. Radiological approaches to COVID-19 pneumonia. Turk. J. Med. Sci. 2020;50(SI-1):604–610. doi: 10.3906/sag-2004-160.
- Aremu O.O., Cody R.A., Hyland-Wood D., McAree P.R. A relative entropy based feature selection framework for asset data in predictive maintenance. Comput. Ind. Eng. 2020;145.
- Bahassine S., Madani A., Al-Sarem M., Kissi M. Feature selection using an improved Chi-square for Arabic text classification. J. King Saud Univ.-Comput. Inf. Sci. 2020;32(2):225–231.
- Breiman L. Random forests. Mach. Learn. 2001;45(1):5–32.
- Cai J., Luo J., Wang S., Yang S. Feature selection in machine learning: A new perspective. Neurocomputing. 2018;300:70–79.
- Chen Y.C., Lu P.E., Chang C.S., Liu T.H. A time-dependent SIR model for COVID-19 with undetectable infected persons. IEEE Trans. Netw. Sci. Eng. 2020;7(4):3279–3294. doi: 10.1109/TNSE.2020.3024723.
- Conghy T., Pon B., Anderson E. 2020. When does hospital capacity get overwhelmed in USA. https://medium.com/@trentmc0/when-does-hospital-capacity-get-overwhelmed-in-usa-germany-a06cf2835f89.
- Covid C.D.C., Team R., COVID C., Team R., COVID C., Team … R., Sauber-Schatz E. Severe outcomes among patients with coronavirus disease 2019 (COVID-19)—United States, February 12–March 16, 2020. Morb. Mortal. Wkly. Rep. 2020;69(12):343. doi: 10.15585/mmwr.mm6912e2.
- Ellefsen A.L., Bjørlykhaug E., Æsøy V., Ushakov S., Zhang H. Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture. Reliab. Eng. Syst. Saf. 2019;183:240–251.
- Forman G. An extensive empirical study of feature selection metrics for text classification. J. Mach. Learn. Res. 2003;3(Mar):1289–1305.
- Haykin S., Lippmann R. Neural networks, a comprehensive foundation. Int. J. Neural Syst. 1994;5(4):363–364.
- Hemdan E.E.D., Shouman M.A., Karar M.E. 2020. Covidx-net: A framework of deep learning classifiers to diagnose covid-19 in x-ray images. arXiv preprint arXiv:2003.11055.
- Laurence E., Doyon N., Dubé L.J., Desrosiers P. Spectral dimension reduction of complex dynamical networks. Phys. Rev. X. 2019;9(1).
- Lawal M.O. Tomato detection based on modified YOLOv3 framework. Sci. Rep. 2021;11(1):1–11. doi: 10.1038/s41598-021-81216-5.
- Li X., Zhang W., Ding Q. Deep learning-based remaining useful life estimation of bearings using multi-scale feature extraction. Reliab. Eng. Syst. Saf. 2019;182:208–218.
- Loey M., Manogaran G., Khalifa N.E.M. A deep transfer learning model with classical data augmentation and cgan to detect covid-19 from chest ct radiography digital images. Neural Comput. Appl. 2020:1–13. doi: 10.1007/s00521-020-05437-x.
- Matkovic F., Ivasic-Kos M., Ribaric S. A new approach to dominant motion pattern recognition at the macroscopic crowd level. Eng. Appl. Artif. Intell. 2022;116.
- Mukhopadhyay A.K., Samui S. An experimental study on upper limb position invariant EMG signal classification based on deep neural network. Biomed. Signal Process. Control. 2020;55.
- Ozturk T., Talo M., Yildirim E.A., Baloglu U.B., Yildirim O., Acharya U.R. Automated detection of COVID-19 cases using deep neural networks with X-ray images. Comput. Biol. Med. 2020;121. doi: 10.1016/j.compbiomed.2020.103792.
- Ozyurt F., Tuncer T., Subasi A. An automated COVID-19 detection based on fused dynamic exemplar pyramid feature extraction and hybrid feature selection using deep learning. Comput. Biol. Med. 2021;132. doi: 10.1016/j.compbiomed.2021.104356.
- Pan J., Liu C., Wang Z., Hu Y., Jiang H. Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling. In: 2012 8th International Symposium on Chinese Spoken Language Processing. IEEE; 2012. pp. 301–305.
- Pham Q.V., Nguyen D.C., Huynh-The T., Hwang W.J., Pathirana P.N. 2020. Artificial intelligence (AI) and big data for coronavirus (COVID-19) pandemic: A survey on the state-of-the-arts.
- Pourhomayoun M., Shakibi M. Predicting mortality risk in patients with COVID-19 using machine learning to help medical decision-making. Smart Health. 2021;20. doi: 10.1016/j.smhl.2020.100178.
- Quinlan J.R. Discovering rules by induction from large collections of examples. In: Expert Systems in the Micro Electronics Age. 1979.
- Quinlan J.R. Induction of decision trees. Mach. Learn. 1986;1(1):81–106.
- Quinlan J.R. Simplifying decision trees. Int. J. Man-Mach. Stud. 1987;27(3):221–234.
- Quinlan J.R. C4.5: Programs for Machine Learning. Elsevier; 2014.
- Roy A.M. Adaptive transfer learning-based multiscale feature fused deep convolutional neural network for EEG MI multiclassification in brain–computer interface. Eng. Appl. Artif. Intell. 2022;116.
- Russell S.J., Norvig P. Artificial Intelligence: A Modern Approach. Pearson Education Limited; Malaysia: 2016.
- Shastry K.A., Sanjay H.A. A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture. Knowl.-Based Syst. 2021;232.
- Sperandei S. Understanding logistic regression analysis. Biochem. Med. 2014;24(1):12–18. doi: 10.11613/BM.2014.003.
- Tan P.N., Steinbach M., Kumar V. Introduction to Data Mining, Vol. 1. Pearson Addison Wesley; Boston: 2006.
- Thaseen I.S., Kumar C.A., Ahmad A. Integrated intrusion detection model using chi-square feature selection and ensemble of classifiers. Arab. J. Sci. Eng. 2019;44(4):3357–3368.
- Wang S., Kang B., Ma J., Zeng X., Xiao M., Guo … J., Xu B. A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19). Eur. Radiol. 2021:1–9. doi: 10.1007/s00330-021-07715-1.
- Yuvaraj N., Chang V., Gobinathan B., Pinagapani A., Kannan S., Dhiman G., Rajan A.R. Automatic detection of cyberbullying using multi-feature based artificial intelligence with deep decision tree classification. Comput. Electr. Eng. 2021;92.
- Zhao J., Zhang Y., He X., Xie P. 2020. Covid-ct-dataset: a ct scan dataset about covid-19. arXiv preprint arXiv:2003.13865, 490.