Diagnosis of Obstructive Sleep Apnea Using Feature Selection, Classification Methods, and Data Grouping Based Age, Sex, and Race

Alaa Sheta; Thaer Thaher; Salim R Surani; Hamza Turabieh; Malik Braik; Jingwei Too; Noor Abu-El-Rub; Majdi Mafarjah; Hamouda Chantar; Shyam Subramanian

doi:10.3390/diagnostics13142417

. 2023 Jul 20;13(14):2417. doi: 10.3390/diagnostics13142417

Diagnosis of Obstructive Sleep Apnea Using Feature Selection, Classification Methods, and Data Grouping Based Age, Sex, and Race

Alaa Sheta ¹, Thaer Thaher ^2,^*, Salim R Surani ³, Hamza Turabieh ⁴, Malik Braik ⁵, Jingwei Too ⁶, Noor Abu-El-Rub ⁷, Majdi Mafarjah ⁸, Hamouda Chantar ⁹, Shyam Subramanian ¹⁰

Editor: Dechang Chen

¹Computer Science Department, Southern Connecticut State University, New Haven, CT 06514, USA; shetaa1@southernct.edu

²Department of Computer Systems Engineering, Arab American University, Jenin P.O. Box 240, Palestine

³Department of Pulmonary, Critical Care & Sleep Medicine, Texas A&M University, College Station, TX 77843, USA; surani@tamu.edu

⁴Health Management and Informatics Department, School of Medicine, University of Missouri, Columbia, MO 65212, USA; hit8zp@health.missouri.edu

⁵Department of Computer Science, Al-Balqa Applied University, Salt 19117, Jordan; mbraik@bau.edu.jo

⁶Faculty of Electrical Engineering, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal 76100, Melaka, Malaysia; jamesjames868@gmail.com

⁷Center of Medical Informatics and Enterprise Analytics, University of Kansas Medical Center, Kansas City, KS 66160, USA; nabuelrub@kumc.edu

⁸Department of Computer Science, Birzeit University, Birzeit P.O. Box 14, Palestine; mmafarja@birzeit.edu

⁹Faculty of Information Technology, Sebha University, Sebha 18758, Libya; ham.chantar@sebhau.edu.ly

¹⁰Pulmonary, Critical Care & Sleep Medicine, Sutter Health, Tracy, CA 95376, USA; subramanian@sutterhealth.org

Correspondence: thaer.thaher@aaup.edu

Roles

Alaa Sheta: Conceptualization, Methodology, Data curation, Writing – original draft, Writing – review & editing, Supervision

Thaer Thaher: Methodology, Software, Validation, Investigation, Writing – review & editing, Visualization

Salim R Surani: Conceptualization, Methodology, Writing – review & editing, Supervision

Hamza Turabieh: Methodology, Investigation, Writing – original draft, Writing – review & editing

Malik Braik: Writing – original draft, Writing – review & editing

Jingwei Too: Writing – original draft, Writing – review & editing

Noor Abu-El-Rub: Writing – original draft, Writing – review & editing

Majdi Mafarjah: Methodology, Writing – original draft, Writing – review & editing

Hamouda Chantar: Writing – original draft, Writing – review & editing

Shyam Subramanian: Validation, Writing – review & editing

Dechang Chen: Academic Editor

PMCID: PMC10377846 PMID: 37510161

Abstract

Obstructive sleep apnea (OSA) is a prevalent sleep disorder that affects approximately 3–7% of males and 2–5% of females. In the United States alone, 50–70 million adults suffer from various sleep disorders. OSA is characterized by recurrent episodes of breathing cessation during sleep, thereby leading to adverse effects such as daytime sleepiness, cognitive impairment, and reduced concentration. It also contributes to an increased risk of cardiovascular conditions and adversely impacts patient overall quality of life. As a result, numerous researchers have focused on developing automated detection models to identify OSA and address these limitations effectively and accurately. This study explored the potential benefits of utilizing machine learning methods based on demographic information for diagnosing the OSA syndrome. We gathered a comprehensive dataset from the Torr Sleep Center in Corpus Christi, Texas, USA. The dataset comprises 31 features, including demographic characteristics such as race, age, sex, BMI, Epworth score, M. Friedman tongue position, snoring, and more. We devised a novel process encompassing pre-processing, data grouping, feature selection, and machine learning classification methods to achieve the research objectives. The classification methods employed in this study encompass decision tree (DT), naive Bayes (NB), k-nearest neighbor (kNN), support vector machine (SVM), linear discriminant analysis (LDA), logistic regression (LR), and subspace discriminant (Ensemble) classifiers. Through rigorous experimentation, the results indicated the superior performance of the optimized kNN and SVM classifiers for accurately classifying sleep apnea. Moreover, significant enhancements in model accuracy were observed when utilizing the selected demographic variables and employing data grouping techniques. For instance, the accuracy percentage demonstrated an approximate improvement of 4.5%, 5%, and 10% with the feature selection approach when applied to the grouped data of Caucasians, females, and individuals aged 50 or below, respectively. Furthermore, a comparison with prior studies confirmed that effective data grouping and proper feature selection yielded superior performance in OSA detection when combined with an appropriate classification method. Overall, the findings of this research highlight the importance of leveraging demographic information, employing proper feature selection techniques, and utilizing optimized classification models for accurate and efficient OSA diagnosis.

Keywords: obstructive sleep apnea, grouping, feature selection, machine learning

1. Introduction

Obstructive sleep apnea (OSA) is a severe respiratory disorder that was first introduced in 1837 by Charles Dickens [1]. The foremost common symptoms of OSA are loud snoring, dry mouth upon awakening, morning headaches, and concentration difficulties [2,3]. There are over 100 million patients who suffer from sleep apnea, and it can affect both adults and children [4,5,6]. Moreover, it is estimated that nearly 22 million Americans suffer from a type of apnea that varies from moderate to severe [7]. Typically, the apnea–hypopnea index (AHI) is used to measure the severity of the apnea. For example, with nearly 326 million people living in the USA, it’s reported that 10% of the US population have mild OSA with AHI scores larger than 5, 3.5% have moderate OSA with AHI scores larger than 15, and 4% have severe OSA syndrome (i.e., apnea/hypopnea) [7].

The publication titled “Hidden health crisis costing America billions” by the American Academy of Sleep Medicine (AASM) presents a new analysis that sheds light on the considerable economic consequences of undiagnosed OSA [8]. Neglecting sleep apnea significantly raises the likelihood of expensive health complications such as hypertension, heart disease, diabetes, and depression [9]. By examining 506 patients diagnosed with OSA, the study showcases the potential improvements in their quality of life following treatment, including enhanced sleep quality, increased productivity, and a notable 40% reduction in workplace absences. A substantial 78% of patients regarded their treatment as a significant investment. Frost & Sullivan, a leading market research firm, has estimated the annual economic burden of undiagnosed sleep apnea among adults in the United States to be approximately $149.6 billion. This staggering amount encompasses $86.9 billion in lost productivity, $26.2 billion in motor vehicle accidents, and $6.5 billion in workplace accidents. Sleep apnea can be categorized into three distinct types:

Obstructive sleep apnea (OSA): The most common type of apnea is known as obstructive sleep apnea (OSA), which is identified by two primary characteristics. The first is a continuous reduction in airflow of at least 30% for a duration of 10 seconds, which is accompanied by a minimum oxygen desaturation of 4%. The second is a decrease in airflow of at least 50% for 10 seconds, coupled with a 3% reduction in oxygen saturation [10].
Central sleep apnea (CSA): CSA occurs when the brain fails to send appropriate signals to the muscles responsible for breathing. Unlike OSA, which stems from mechanical issues, CSA arises due to impaired communication between the brain and muscles [11,12].
Mixed sleep apnea (MSA): MSA, also known as complex sleep apnea, represents a combination of obstructive and central sleep apnea disorders, thus presenting a more complex pattern of symptoms and characteristics.

Detecting OSA using an electrocardiogram (ECG) is an expensive process that is inaccessible to a large number of the world’s population. The attributes of the ECG signal differ in the case of awake and sleep intervals [13]. Hence, using a combined signal of awake and sleep stages reduces the overall reliability of the detection process. Several researchers recommend examining the ECG signal based on minutes [14]. In general, to detect OSA, the signal length should be at least 10 seconds in length. The diagnosis of OSA from ECG signals using various machine learning methods is a commonly used approach in the literature. For example, artificial neural networks (ANN) and convolutional neural networks (CNN) were introduced to detect and classify OSA. Wang et al. [15] used the CNN model to detect OSA based on ECG signals. The authors extracted a set of features from each signal and then trained a three-layered CNN model. The obtained results showed an acceptable performance and the ability to apply the proposed method over wearable devices.

Erdenebayar et al. [16] provided an automated detection method for OSA using a single-lead ECG and a CNN. The CNN model proposed in their study was meticulously constructed, featuring six convolution layers that were carefully optimized. These layers incorporated activation functions, pooling operations, and dropout layers. The research findings demonstrated that the proposed CNN model exhibited remarkable accuracy in detecting OSA solely by analyzing a single-lead ECG signal. Faust et al. [17] introduced the use of a long short-term memory (LSTM) neural network to detect sleep apnea based on RR intervals signal. Their results showed the ability of the LSTM network to detect sleep apnea with an accuracy equal to $99.80 %$ . Schwartz et al. [18] employed several machine learning methods to detect four types of abbreviated digital sleep questionnaires (DSQs). The authors showed the ability of machine learning in detecting sleep disturbances with high accuracy. Lakhan et al. [19] proposed a dramatic involvement of a deep learning approach to detect multiple sleep apnea–hypopnea syndrome (SAHS). Two types of classifications were employed in their paper: binary classification with three cutoff indices (i.e., AHI = 5, 15, and 30 events/hour) and multiclass classification (i.e., no SAHS, mild SAHS, moderate SAHS, and severe SAHS). The obtained results for the binary classification showed that an AHI with 30 events/hour outperformed other cutoffs with an accuracy of 92.69%. For multiclass classification problems, the obtained accuracy was 63.70%. Banluesombatkul et al. [20] employed a novel deep learning method to detect OSA (i.e., normal and severe patients). The proposed method used three different deep learning methods: (i) one-dimensional CNN (1-D CNN) for feature extraction; (ii) deep recurrent neural networks (DRNNs) with an LSTM network for temporal information extraction; and (iii) fully connected neural networks (DNNs) for feature encoding. The proposed method showed acceptable results compared to the literature.

There have been several efforts to identify the relation between the snoring sound and OSA in the literature. In general, loud snoring is one of the indicators of OSA, and it is commonly thought that the frequency and amplitude of snoring are associated with the severity of the OSA [21]. Alshaer et al. [22] employed an acoustic analysis of breath sounds to detect OSA. The previous research suggests that OSA can be detected using snoring attributes. However, clinicians should pay attention to the possibility of missing an OSA diagnosis for patients with minimal snoring. Kang et al. [23] applied linear predict coding (LPC) and Mel-frequency cepstral coefficient (MFCC) features to detect OSA based on the amplitude of the snoring signal. The proposed method was able to classify three different events, namely, snoring, apnea, and silence, from sleep recordings with accuracies of 90.65%, 90.99%, and 90.30%, respectively.

Feature extraction and feature selection are the most commonly used techniques for data dimensionality reduction. Several papers have been published that highlight the importance of feature selection in OSA detection. Various features are extracted from the ECG signals; then, feature selection is used to reduce the number of extracted features and to determine the most valuable features related to OSA. In the stage of feature extraction, a set of features is extracted from the time series data, which aims to reveal the hidden information within the ECG signal. However, a feature set may contain redundant and irrelevant information, and feature selection is adopted to resolve this issue. A feature selection algorithm can help find the nearly optimal combination of features. Although feature selection is an expensive method, it can produce better classification performance, and high accuracy is significantly important in OSA detection. There are different classification methods that are used to select important features, such as support vector machine (SVM) networks, k-nearest neighbor (kNN) algorithms, artificial neural networks (ANN), linear discriminant analysis (LDA), and logistic regression (LR).

Many researchers have used demographic data to identify OSA. Sheta et al. [24,25] applied LR and ANN models to detect OSA based on demographic data. A real dataset was used that consists of several demographic features (i.e., weight, height, hip, waist, BMI, neck size, age, snoring, the modified Friedman (MF) score, the Epworth sleepiness scale, sex, and daytime sleepiness). The obtained results suggested that the proposed method could detect OSA with an acceptable accuracy. Surani et al. [26] applied the AdaBoost method as a machine learning classifier to detect OSA based on demographic data. The obtained results were promising. Surani et al. [27] applied a wrapper feature selection method based on binary particle swarm optimization (BPSO) with an ANN to detect OSA. The obtained results illustrated that the use of BPSO with an ANN can detect OSA with high accuracy. Haberfeld et al. [28] proposed a mobile application called Sleep Apnea Screener (SAS) to detect OSA based on demographic data. The authors used nine demographic features (i.e., height, weight, waist, hip, BMI, age, neck, M. Friedman, Epworth, snoring, gender, and daytime sleepiness). The application had two machine learning methods: LR and SVM. Moreover, the authors studied the performance of each classifier based on gender. The reported results showed that the proposed application can help patients detect OSA easily compared to an overnight test for OSA diagnosis.

There are many screening approaches for OSA, including tools such as the Berlin Questionnaire, the STOP-BANG Questionnaire, Epworth Sleepiness Scale (ESS), clinical assessment, and population-specific screening tools [29]. These approaches aim to identify individuals at a higher risk of OSA based on symptoms, risk factors, and questionnaire responses. Positive screening results prompt further evaluation using diagnostic tests such as polysomnography (PSG) or home sleep apnea testing (HSAT). Screening helps prioritize resources and directs individuals toward comprehensive sleep assessments. Subramanian et al. [30] introduced a novel screening approach known as the NAMES, which employs statistical methods to identify OSA. The NAMES assessment combines various factors, including neck circumference, airway classification, comorbidities, the Epworth scale, and snoring, to create a comprehensive evaluation that incorporates medical records, current symptoms, and physical examination findings. Experimental findings demonstrated the efficacy of the NAMES assessment in detecting OSA. Furthermore, the inclusion of BMI and gender in the assessment improved its screening capabilities.

This work proposes an efficient classification framework for the early detection of OSA. In specific, it is an extension of the NAMES work machine learning classification method and utilizes a metaheuristic-based feature selection scheme. The main contributions are summarized as follows:

The OSA data was grouped based on age, sex, and race variables for performance improvement. This type of grouping is novel and has never been presented in this area of research before.
Various types of the most well-known machine learning algorithms were assessed to determine the best-performing one for the OSA problem. These methods included twelve predefined (fixed) parameter classifiers and two optimized classifiers (using hyperparameter optimization).
A wrapper feature selection approach using particle swarm optimization (PSO) was employed to determine the most valuable features related to the OSA.
Experimental results from the actual data (collected from Torr Sleep Center, Texas, USA) confirmed that the proposed method improved the overall performance of the OSA prediction.

The rest of this paper is organized as follows: Section 2 presents the proposed method used in this work. Section 3 gives a brief description of the dataset used in the experiment. Section 5 discusses the experimental results and simulations. Finally, the conclusion and future work are presented in Section 8.

2. Proposed Diagnosis Process

The proposed OSA diagnosis process is illustrated in Figure 1. We suggest collecting data from patients who have undergone demographic, anthropometric measurements, and polysomnographic studies from a community-based sleep laboratory. An expert from the Torr Sleep Center (Corpus Christi, TX, USA) controlled the collection process for the polysomnography (PSG) evaluation of suspected OSA between 5 February 2007, and 21 April 2008. We processed the data to make the data more suitable for the analysis process. All missing data were handled, and a normalization technique was employed to transform the data into a standard scale. The next step was classification-based grouped data, where the classification model was implemented based on two types of learning methods—fixed parameter setting and adaptive parameter setting—through the training process. The benefit of using two kinds of learning methods is to learn more about the dataset and find the optimal parameter settings. After that, we applied a wrapper feature selection using the best performing classifier to identify the most valuable features related to OSA. This step can reveal useful information to physicians and doctors to understand the demographic characteristics of OSA patients. Finally, we used a set of evaluation criteria (i.e., accuracy, TPR, TNR, AUC, precision, F-score, and G-mean) to evaluate the performance of each classifier.

The proposed methodology. The figure illustrates the step-by-step process of the proposed methodology, which involved five steps: data collection, data preprocessing, data grouping, classification, feature selection, and evaluation.

3. Sleep Apnea Dataset

The initial dataset employed in this study encompasses 620 patients, comprising 366 males and 254 females. The age range for males spans from 19 to 88 years, while for females, it ranges from 20 to 96 years. Notably, the prevalence of snoring was 92.6% among males and 91.7% among females. Each patient underwent comprehensive full-night monitoring as part of the study. The dataset comprises 31 input features and a binary output, represented by either 0 or 1, thus indicating the presence or absence of obstructive sleep apnea (OSA) (see Table 1 for a detailed presentation of these features). Additionally, the study recorded each individual’s Friedman tongue position (FTP), which encompasses four distinct positions, as depicted in Figure 2. Additionally, the Epworth scale, which is used to assess sleepiness, was collected. The scale details are presented in Table 2. Notably, the dataset is imbalanced, with 357 patients identified as positive cases with OSA and 263 individuals identified without OSA. Table 3 provides a comprehensive overview of the dataset’s characteristics.

Table 1.

List of dataset features.

Attributes		Data Type
f1	Race	Categorical
f2	Age	Numeric
f3	Sex	Categorical
f4	BMI	Categorical
f5	Epworth	Numeric
f6	Wast	Numeric
f7	Hip	Numeric
f8	RDI	Numeric
f9	Neck	Numeric
f10	M.Friedman	Numeric
f11	Co-morbid	Categorical
f12	Snoring	Categorical
f13	Daytime sleepiness	Categorical
f14	DM	Categorical
f15	HTN	Categorical
f16	CAD	Categorical
f17	CVA	Categorical
f18	TST	Numeric
f19	Sleep Effic	Numeric
f20	REM AHI	Numeric
f21	NREM AHI	Numeric
f22	Supine AHI	Numeric
f23	Apnea Index	Numeric
f24	Hypopnea Index	Numeric
f25	Berlin Q	Categorical
f26	Arousal index	Numeric
f27	Awakening Index	Numeric
f28	PLM Index	Numeric
f29	Mins. SaO $_{2}$	Numeric
f30	Mins. SaO $_{2}$ Desats	Numeric
f31	Lowest SaO $_{2}$	Numeric
class	Witnessed apnea	Categorical

Range	Description
0–5	Lower normal daytime sleepiness
6–10	Higher normal daytime sleepiness
11–12	Mild level of sleepiness experienced during the daytime
13–15	Moderate level of sleepiness experienced during the daytime
16–24	Significant level of sleepiness experienced during the daytime

Datasets		No. Features	No. Samples	Negative	Positive
Original Dataset		31	274	149	125
Race	Caucasian	30	151	92	59
	Hispanic	30	123	57	66
Gender	Females	30	118	85	33
	Males	30	156	64	92
Age	Age ≤ 50	31	109	55	54
	Age > 50	31	165	94	71

Name	Transfer Function Formula
S1	$S (x) = \frac{1}{1 + e^{- 2 x}}$
S2	$S (x) = \frac{1}{1 + e^{- x}}$
S3	$S (x) = \frac{1}{1 + e^{(- x / 2)}}$
S4	$S (x) = \frac{1}{1 + e^{(- x / 3)}}$

Preset Classifier	Description	Parameter	Value
FDT	Fine Decision Tree	Maximum number of splits	100
		Split criterion	Gini’s diversity index
CDT	Coarse Decision Tree	Maximum number of splits	100
		Split criterion	Gini’s diversity index
LDA	Linear Discrimenant Analaysis	Discriminant type	linear
LR	Logistic Regression	-	-
GNB	Gaussian Naïve Bayes	Distribution name	Gaussian
KNB	Kernel Naïve Bayes	Distribution name	Kernel
		Kernel type	Gaussian
LSVM	Linear Support Vector Machine	Kernel function	Linear
		Kernel scale	Automatic
		Box contraint level	1
		standardize data	TRUE
MGSVM	Medium Gaussian SVM	Kernel function	Gaussian
		Kernel scale	5.6
		Box contraint level	1
		Standardize data	TRUE
CGSVM	Coarse Gaussian SVM	Kernel function	Gaussian
		Kernel scale	22
		Box contraint level	1
		Standardize data	TRUE
CKNN	Cosine K-Nearest Neighbor	Number of neighbors	10
		Distance metric	cosine
		Distance weight	equal
		Standardize data	TRUE
WKNN	Weighted kNN	Number of neighbors	10
		Distance metric	Euclidean
		Distance weight	Squared Inverse
		Standardize data	TRUE
Ensemble	Subspace Discriminant	Ensemble method	Subspace
		Learner type	Discriminant
		Number of learners	30
		Subspace dimension	16

Dataset	Number of Neighbors	Distance Metric	Distance Weight	Standardize Data
All	16	Spearman	Inverse	TRUE
Caucasian	6	Correlation	Squared Inverse	TRUE
Hispanic	32	Cityblock	Squared Inverse	TRUE
Females	4	Hamming	Squared Inverse	FALSE
Males	22	Cityblock	Equal	FALSE
Age ≤ 50	14	Cosine	Squared Inverse	TRUE
Age > 50	31	16	Squared Inverse	TRUE

Dataset	Kernel Function	Kernel Scale	Box Contraint Level	Standardized Data
All	Polynomial (degree = 2)	1	0.002351927	TRUE
Caucasian	Linear	1	0.18078	TRUE
Hispanic	Linear	1	0.01115743	TRUE
Females	Gaussian	2.990548535	122.3491994	FALSE
Males	Linear	1	0.001000015	FALSE
Age ≤ 50	Gaussian	415.5625146	341.7329909	FALSE
Age > 50	Gaussian	26.42211158	7.503025335	TRUE

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
DT	0.5876	0.6309	0.5360	0.5834	0.6184	0.6246	0.5815	8.97
LDA	0.6861	0.7584	0.6000	0.6792	0.6933	0.7244	0.6746	5.79
LR	0.6861	0.7383	0.6240	0.6811	0.7006	0.7190	0.6787	5.57
NB	0.6642	0.7785	0.5280	0.6533	0.6629	0.7160	0.6411	7.71
SVM	0.6898	0.8188	0.5360	0.6774	0.6778	0.7416	0.6625	5.71
kNN	0.6934	0.7517	0.6240	0.6878	0.7044	0.7273	0.6849	4.21
Ensemble	0.7044	0.8188	0.5680	0.6934	0.6932	0.7508	0.6820	3.93
SVM*	0.7226	0.7919	0.6400	0.7160	0.7239	0.7564	0.7119	2.14
kNN*	0.7409	0.8322	0.6320	0.7321	0.7294	0.7774	0.7252	1.14

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
FDT	0.7285	0.7935	0.6271	0.7103	0.7684	0.7807	0.7054	3.29
LDA	0.6908	0.7957	0.5254	0.6606	0.7255	0.7590	0.6466	6.71
LR	0.6887	0.7609	0.5763	0.6686	0.7368	0.7487	0.6622	6.21
GNB	0.6887	0.8261	0.4746	0.6503	0.7103	0.7638	0.6261	7.93
MGSVM	0.7219	0.8913	0.4576	0.6745	0.7193	0.7961	0.6387	5.79
CKNN	0.7483	0.9457	0.4407	0.6932	0.7250	0.8208	0.6455	4.36
Ensemble	0.7219	0.8696	0.4915	0.6805	0.7273	0.7921	0.6538	5.07
SVM*	0.7351	0.8478	0.5593	0.7036	0.7500	0.7959	0.6886	3.36
kNN*	0.7483	0.8804	0.5424	0.7114	0.7500	0.8100	0.6910	2.29

Data	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
CDT	0.6260	0.5263	0.7121	0.6192	0.6122	0.5660	0.6122	8.29
LDA	0.6423	0.5614	0.7121	0.6368	0.6275	0.5926	0.6323	6.50
LR	0.6260	0.5614	0.6818	0.6216	0.6038	0.5818	0.6187	8.00
GNB	0.6504	0.6140	0.6818	0.6479	0.6250	0.6195	0.6470	5.64
MGSVM	0.6829	0.5614	0.7879	0.6746	0.6957	0.6214	0.6651	2.86
WKNN	0.6585	0.5439	0.7576	0.6507	0.6596	0.5962	0.6419	5.07
Ensemble	0.6585	0.5965	0.7121	0.6543	0.6415	0.6182	0.6517	4.57
SVM*	0.6748	0.6316	0.7121	0.6719	0.6545	0.6429	0.6706	3.00
kNN*	0.7724	0.6316	0.8939	0.7628	0.8372	0.7200	0.7514	1.07

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
CDT	0.7119	0.8824	0.2727	0.5775	0.7576	0.8152	0.4906	4.86
LDA	0.6695	0.8118	0.3030	0.5574	0.7500	0.7797	0.4960	5.57
LR	0.6102	0.7529	0.2424	0.4977	0.7191	0.7356	0.4272	8.00
KNB	0.6356	0.7176	0.4242	0.5709	0.7625	0.7394	0.5518	5.00
MGSVM	0.7203	1.0000	0.0000	0.5000	0.7203	0.8374	0.0000	5.79
WKNN	0.7458	0.9765	0.1515	0.5640	0.7477	0.8469	0.3846	4.36
Ensemble	0.7458	0.8824	0.3939	0.6381	0.7895	0.8333	0.5896	2.43
SVM*	0.7203	1.0000	0.0000	0.5000	0.7203	0.8374	0.0000	5.79
kNN*	0.7373	0.9176	0.2727	0.5952	0.7647	0.8342	0.5003	3.21

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
CDT	0.5256	0.2969	0.6848	0.4908	0.3958	0.3393	0.4509	8.64
LDA	0.6538	0.5938	0.6957	0.6447	0.5758	0.5846	0.6427	4.07
LR	0.6474	0.5938	0.6848	0.6393	0.5672	0.5802	0.6376	5.14
KNB	0.6234	0.6406	0.6111	0.6259	0.5395	0.5857	0.6257	5.86
LSVM	0.6346	0.5313	0.7065	0.6189	0.5574	0.5440	0.6126	6.79
CKNN	0.6282	0.6094	0.6413	0.6253	0.5417	0.5735	0.6251	6.50
Ensemble	0.6603	0.5469	0.7391	0.6430	0.5932	0.5691	0.6358	4.29
SVM*	0.6667	0.6094	0.7065	0.6579	0.5909	0.6000	0.6562	2.57
kNN*	0.6987	0.6250	0.7500	0.6875	0.6349	0.6299	0.6847	1.14

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
FDT	0.7064	0.7091	0.7037	0.7064	0.7091	0.7091	0.7064	4.00
LDA	0.6514	0.6727	0.6296	0.6512	0.6491	0.6607	0.6508	7.29
LR	0.6697	0.7091	0.6296	0.6694	0.6610	0.6842	0.6682	6.29
KNB	0.6239	0.5636	0.6852	0.6244	0.6458	0.6019	0.6214	8.00
LSVM	0.7064	0.8000	0.6111	0.7056	0.6769	0.7333	0.6992	4.93
CKNN	0.6514	0.8182	0.4815	0.6498	0.6164	0.7031	0.6276	7.07
Ensemble	0.7156	0.7818	0.6481	0.7150	0.6935	0.7350	0.7119	3.57
SVM*	0.7523	0.7818	0.7222	0.7520	0.7414	0.7611	0.7514	1.64
kNN*	0.7431	0.8364	0.6481	0.7423	0.7077	0.7667	0.7363	2.21

Classifier	Accuracy	TPR	TNR	AUC	Precision	F-Score	G-Mean	Mean Rank
CDT	0.6061	0.6702	0.5211	0.5957	0.6495	0.6597	0.5910	8.57
LDA	0.6667	0.7234	0.5915	0.6575	0.7010	0.7120	0.6542	5.36
LR	0.6606	0.7234	0.5775	0.6504	0.6939	0.7083	0.6463	6.43
KNB	0.6788	0.7979	0.5211	0.6595	0.6881	0.7389	0.6448	5.93
CGSVM	0.6727	0.9149	0.3521	0.6335	0.6515	0.7611	0.5676	6.29
CKNN	0.6909	0.7766	0.5775	0.6770	0.7087	0.7411	0.6697	3.43
Ensemble	0.6909	0.7979	0.5493	0.6736	0.7009	0.7463	0.6620	4.29
SVM*	0.7091	0.8511	0.5211	0.6861	0.7018	0.7692	0.6660	3.14
KNN*	0.7333	0.8617	0.5634	0.7125	0.7232	0.7864	0.6968	1.57

Classifier	All	Caucasion	Hispanic	Females	Males	Age ≤ 50	Age > 50	Average Rank
DT	1.7018	0.1440	0.1093	0.1185	0.1123	0.0911	0.1280	1.29
LDA	1.6479	0.1742	0.1670	0.1720	0.1729	0.1400	0.1654	2.14
LR	0.3802	0.3067	0.2393	0.2837	0.2654	0.2357	0.2433	4.00
NB	3.6635	2.1350	3.0485	2.7358	3.8645	4.1244	3.2748	6.86
SVM	2.9095	0.2939	0.2926	0.2804	0.2520	0.2577	0.2710	4.57
KNN	1.7064	0.1768	0.1940	0.1648	0.2247	0.1761	0.1751	3.00
Ensemble	4.2913	1.9599	2.1277	1.8031	1.9468	2.0706	2.0714	6.14

Classifier	All	Caucasian	Hispanic	Females	Males	Age ≤ 50	Age > 50
DT	0.5876	0.7285	0.6260	0.7119	0.5256	0.7064	0.6061
LDA	0.6861	0.6908	0.6423	0.6695	0.6538	0.6514	0.6667
LR	0.6861	0.6887	0.6260	0.6102	0.6474	0.6697	0.6606
NB	0.6642	0.6887	0.6504	0.6356	0.6234	0.6239	0.6788
SVM	0.6898	0.7219	0.6829	0.7203	0.6346	0.7064	0.6727
kNN	0.6934	0.7483	0.6585	0.7458	0.6282	0.6514	0.6909
Ensemble	0.7044	0.7219	0.6585	0.7458	0.6603	0.7156	0.6909
SVM*	0.7226	0.7351	0.6748	0.7203	0.6667	0.7523	0.7091
kNN*	0.7409	0.7483	0.7724	0.7373	0.6987	0.7431	0.7333
mean Rank	3.44	1.33	5.00	3.44	6.44	3.78	4.56

Dataset	Measure	BPSO1	BPSO2	BPSO3	BPSO4
All	AVG	0.7511	0.7518	0.7464	0.7449
	STD	0.0075	0.0045	0.0039	0.0040
Caucasian	AVG	0.7967	0.7954	0.7841	0.7808
	STD	0.0136	0.0114	0.0084	0.0091
Females	AVG	0.8034	0.7941	0.7949	0.7907
	STD	0.0078	0.0070	0.0067	0.0098
Age > 50	AVG	0.7836	0.7782	0.7752	0.7655
	STD	0.0111	0.0111	0.0092	0.0064
Hispanic	AVG	0.7870	0.7862	0.7797	0.7740
	STD	0.0064	0.0067	0.0071	0.0107
Age ≤ 50	AVG	0.8321	0.8211	0.8092	0.7945
	STD	0.0168	0.0099	0.0149	0.0131
Males	AVG	0.7513	0.7205	0.7186	0.7180
	STD	0.0159	0.0081	0.0047	0.0043
Mean Rank	F-test	1.14	2.00	2.86	4.00

Dataset	Measure	BPSO1	BPSO2	BPSO3	BPSO4
All	AVG	17.6	16.4	17.7	16.9
	STD	2.5473	3.0984	2.8304	2.6854
Caucasian	AVG	13.0	13.7	14.9	14.5
	STD	3.2660	2.9458	3.4785	2.7988
Females	AVG	14.1	14.4	14.8	13.2
	STD	1.8529	2.2211	2.0440	3.1903
Age > 50	AVG	14.8	15.0	15.4	16.0
	STD	2.6583	2.4944	1.8974	3.4641
Hispanic	AVG	14.6	14.7	15.4	16.2
	STD	0.6992	2.0575	2.5473	2.8206
Age ≤ 50	AVG	14.8	15.6	15.4	15.5
	STD	1.3166	1.7764	2.3190	2.5927
Males	AVG	12.5	12.8	14.8	13.6
	STD	2.8771	3.1903	2.4404	1.3499
Mean Rank	F-test	1.43	2.29	3.43	2.86

Dataset	Measure	BPSO1	BPSO2	BPSO3	BPSO4
All	AVG	464.0	481.7	466.2	467.8
	STD	4.8617	3.6674	3.3655	3.2812
caucasion	AVG	368.1	374.5	371.8	374.0
	STD	4.3402	4.2569	3.6641	3.5733
females	AVG	245.8	247.9	247.9	248.4
	STD	2.1698	2.1417	1.7136	2.1340
age > 50	AVG	266.9	270.2	268.7	270.0
	STD	2.5395	2.2341	2.4543	1.5417
hispanic	AVG	260.5	262.9	262.1	263.9
	STD	2.9500	2.1613	1.8231	2.4382
age ≤ 50	AVG	261.4	262.1	264.8	263.7
	STD	2.3115	1.7837	2.1478	2.0247
males	AVG	382.7	251.8	250.3	250.9
	STD	46.5774	1.8658	1.8342	1.7393
mean rank	F-test	1.43	3.21	2.21	3.14

Dataset	Measure	BPSO1	BHHO	BGSA	BWOA	BGWO	BBA	BALO	BMFO
All	AVG	0.7511	0.7515	0.7245	0.7507	0.7372	0.6624	0.7474	0.7504
	STD	0.0075	0.0095	0.0086	0.0042	0.0064	0.0472	0.0038	0.0039
Caucasian	AVG	0.7967	0.7821	0.7430	0.7815	0.7623	0.7060	0.7781	0.7821
	STD	0.0136	0.0049	0.0061	0.0054	0.0049	0.0356	0.0056	0.0058
Females	AVG	0.8034	0.8068	0.7517	0.8093	0.7864	0.6949	0.8017	0.8059
	STD	0.0078	0.0067	0.0106	0.0072	0.0088	0.0344	0.0091	0.0063
Age > 50	AVG	0.7836	0.7703	0.7236	0.7691	0.7473	0.6945	0.7649	0.7721
	STD	0.0111	0.0078	0.0122	0.0097	0.0081	0.0192	0.0110	0.0077
Hispanic	AVG	0.7870	0.7805	0.7382	0.7789	0.7683	0.6805	0.7813	0.7772
	STD	0.0064	0.0094	0.0120	0.0064	0.0103	0.0617	0.0071	0.0042
Age ≤ 50	AVG	0.8321	0.7991	0.7376	0.7991	0.7661	0.6661	0.7835	0.7853
	STD	0.0168	0.0110	0.0189	0.0091	0.0108	0.0604	0.0151	0.0064
Males	AVG	0.7513	0.7224	0.6949	0.7154	0.7090	0.6430	0.7160	0.7160
	STD	0.0159	0.0053	0.0106	0.0033	0.0033	0.0327	0.0031	0.0031
Mean Rank	F-test	1.57	2.29	7.00	3.36	6.00	8.00	4.36	3.43

Dataset	BPSO (the Best Performaing Method) vs.
Dataset	BHHO	BGSA	BWOA	BGWO	BBA	BALO	BMFO
All	$2.79 \times 10^{- 1}$	$2.62 \times 10^{- 4}$	$5.05 \times 10^{- 1}$	$1.54 \times 10^{- 3}$	$1.69 \times 10^{- 4}$	$4.82 \times 10^{- 2}$	$5.55 \times 10^{- 1}$
Caucasian	$8.58 \times 10^{- 3}$	$1.51 \times 10^{- 4}$	$1.12 \times 10^{- 2}$	$2.27 \times 10^{- 4}$	$1.66 \times 10^{- 4}$	$5.03 \times 10^{- 3}$	$1.13 \times 10^{- 2}$
Females	$3.53 \times 10^{- 1}$	$1.50 \times 10^{- 4}$	$1.16 \times 10^{- 1}$	$1.15 \times 10^{- 3}$	$1.60 \times 10^{- 4}$	$4.69 \times 10^{- 1}$	$5.13 \times 10^{- 1}$
Age > 50	$8.05 \times 10^{- 3}$	$1.74 \times 10^{- 4}$	$8.19 \times 10^{- 3}$	$1.62 \times 10^{- 4}$	$1.73 \times 10^{- 4}$	$2.93 \times 10^{- 3}$	$1.83 \times 10^{- 2}$
Hispanic	$1.04 \times 10^{- 1}$	$1.56 \times 10^{- 4}$	$1.89 \times 10^{- 2}$	$4.39 \times 10^{- 4}$	$1.62 \times 10^{- 4}$	$1.18 \times 10^{- 1}$	$2.24 \times 10^{- 3}$
Age ≤ 50	$3.85 \times 10^{- 4}$	$1.64 \times 10^{- 4}$	$3.28 \times 10^{- 4}$	$1.58 \times 10^{- 4}$	$1.71 \times 10^{- 4}$	$1.61 \times 10^{- 4}$	$1.45 \times 10^{- 4}$
Males	$9.68 \times 10^{- 1}$	$1.51 \times 10^{- 4}$	$8.81 \times 10^{- 3}$	$2.72 \times 10^{- 4}$	$1.57 \times 10^{- 4}$	$1.26 \times 10^{- 2}$	$1.26 \times 10^{- 2}$

Dataset	Measure	BPSO1	BHHO	BGSA	BWOA	BGWO	BBA	BALO	BMFO
All	AVG	17.6	22.8	18.1	22.2	27.4	13	25.1	24.2
	STD	2.55	2.10	2.23	2.66	1.17	2.11	1.37	1.81
Caucasian	AVG	13.0	19.9	14.6	21.5	26.2	12.1	24.6	21.7
	STD	3.27	3.28	2.63	2.32	0.79	2.33	1.35	1.89
Females	AVG	14.1	23	15	22.8	27	13.4	24	24.8
	STD	1.85	1.56	1.89	2.39	1.56	2.17	1.33	1.69
Age > 50	AVG	14.8	18.6	16.9	21.3	27.3	15.1	25.6	22.5
	STD	2.66	5.87	2.88	3.47	1.64	2.42	1.65	2.12
Hispanic	AVG	14.6	19.6	17.3	20.7	25.2	13	23.3	21.1
	STD	0.70	3.78	3.53	3.59	1.23	2.45	1.89	2.38
Age ≤ 50	AVG	14.8	18.3	16.4	19	25.1	12.6	23.4	19.5
	STD	1.32	2.11	2.80	1.83	1.20	2.46	1.51	1.18
Males	AVG	12.5	19.1	14.9	18.6	25.5	11.3	22.5	21.4
	STD	2.88	5.24	1.60	3.75	1.84	4.57	1.96	2.27
Mean Rank	F-test	1.86	4.43	3.00	4.57	8.00	1.14	6.86	6.14

Dataset	Measure	BPSO1	BHHO	BGSA	BWOA	BGWO	BBA	BALO	BMFO
all	AVG	464.05	798.02	465.45	476.24	475.84	468.76	474.47	468.78
	STD	4.862	9.414	4.992	4.813	5.720	5.622	7.119	6.179
caucasion	AVG	368.14	613.92	376.20	378.73	377.41	376.77	375.30	374.28
	STD	4.340	5.295	3.015	3.928	4.045	3.191	5.114	4.507
females	AVG	245.75	401.38	248.90	248.18	249.38	250.27	247.90	247.08
	STD	2.170	4.390	2.088	2.839	2.664	1.802	2.385	2.445
age > 50	AVG	266.88	441.47	267.54	272.05	272.21	269.97	269.77	269.41
	STD	2.540	4.207	2.024	2.318	2.357	2.169	3.116	2.618
hispanic	AVG	260.48	431.94	264.26	261.83	264.53	265.01	262.17	261.13
	STD	2.950	4.529	1.983	2.527	2.019	3.110	2.884	1.943
age ≤ 50	AVG	261.38	429.46	266.24	263.29	266.10	265.21	262.29	262.48
	STD	2.311	3.038	2.525	1.785	2.330	2.237	3.266	1.700
males	AVG	382.68	409.66	250.95	249.72	251.26	255.21	249.02	249.54
	STD	46.577	4.520	1.540	2.240	1.781	3.173	2.732	2.101
mean rank	F-test	1.86	8.00	4.14	4.86	6.00	5.43	3.14	2.57

Dataset	Accuracy	#Features	f1	f2	f3	f4	f5	f6	f7	f8	f9	f10	f11	f12	f13	f14	f15	f17	f18	f19	f20	f21	f22	f23	f24	f25	f26	f27	f28	f29	f30	f31
all	0.7628	18	0	0	1	0	1	1	0	1	0	1	1	0	1	1	0	0	1	1	0	1	1	1	0	1	1	1	0	0	1	1
caucasian	0.8080	13	-	0	1	0	1	1	0	1	0	1	1	0	1	1	0	0	0	0	0	1	1	0	0	0	0	0	1	1	0	1
Hispanic	0.7968	14	-	1	0	1	0	0	0	1	1	0	1	1	0	1	0	1	0	0	0	1	1	0	1	0	1	0	1	0	1	0
females	0.8136	13	1	0	-	0	1	1	0	0	0	1	1	0	1	1	0	0	1	1	0	1	0	0	1	0	0	0	0	1	0	1
males	0.7885	11	0	0	-	0	1	0	0	0	0	1	1	0	1	0	0	0	0	1	0	0	1	1	1	0	1	0	0	1	0	1
age<=50	0.8624	15	0	0	1	0	0	1	0	1	1	0	0	0	1	1	1	1	0	0	1	0	1	1	1	0	1	0	0	0	1	1
age>50	0.8061	17	1	0	0	0	0	1	1	1	1	1	1	1	0	0	1	1	0	1	0	0	1	1	0	0	1	1	0	1	0	1

Dataset	CNN		MLP		kNN*		BPSO-kNN
Dataset	Accuracy	Time	Accuracy	Time	Accuracy	Time	Accuracy	Time
All	0.6105	291.656	0.5438	0.789	0.7409	1.706	0.7628	464.050
Caucasion	0.7283	204.591	0.6159	3.282	0.7483	0.177	0.8080	368.142
Hispanic	0.6513	180.510	0.5285	2.341	0.7724	0.194	0.7968	245.754
Females	0.7023	208.605	0.6102	3.398	0.7373	0.165	0.8136	266.882
Males	0.6263	174.349	0.5769	2.969	0.6987	0.225	0.7885	260.481
Age ≤ 50	0.6427	177.557	0.5780	2.917	0.7431	0.176	0.8624	261.377
Age > 50	0.6629	219.354	0.5455	3.483	0.7333	0.175	0.8061	382.676

PERMALINK

Diagnosis of Obstructive Sleep Apnea Using Feature Selection, Classification Methods, and Data Grouping Based Age, Sex, and Race

Alaa Sheta

Thaer Thaher

Salim R Surani

Hamza Turabieh

Malik Braik

Jingwei Too

Noor Abu-El-Rub

Majdi Mafarjah

Hamouda Chantar

Shyam Subramanian

Roles

Abstract

1. Introduction

2. Proposed Diagnosis Process

Figure 1.

3. Sleep Apnea Dataset

Table 1.

Figure 2.

Table 2.

Table 3.

4. Data Preprocessing

4.1. Missing Data

Figure 3.

4.2. Data Normalization

4.3. Role of Grouping in OSA Diagnosis

Figure 4.

4.4. Wrapper Feature Selection

Table 4.

Figure 5.

4.5. Formulation of Feature Selection Problem

5. Experimental Setup

Table 5.

Table 6.

Table 7.

6. Experimental Results

6.1. Results with All Data

Table 8.

6.2. Data Grouping with Race

Table 9.

Table 10.

6.3. Data Grouping with Gender

Table 11.

Table 12.

6.4. Data Grouping with Age

Table 13.

Table 14.

Table 15.

Figure 6.

6.5. Summary Performance with Data Grouping

Table 16.

Table 17.

7. Feature Selection

7.1. Evaluation of BPSO Using Different TFs

Table 18.

Table 19.

Table 20.

7.2. Comparison of BPSO with Well-Known Algorithms

Table 21.

Table 22.

Table 23.

Table 24.

Table 25.

Figure 7.

7.3. Relevant Features Selected by BPSO

Table 26.

Table 27.

Figure 8.

7.4. Comparison of the BPSO-kNN with CNN, MLP, and kNN*

Table 28.

Figure 9.

7.5. Comparison Study

Table 29.

8. Conclusions and Future Works

Author Contributions

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Results of NAMES [30]		Proposed (BPSO-KNN)		Haberfeld et al. [28]		Surani et al. [27]
Combination	AUC	Dataset	Average AUC	SVM	LR	LR	ANN
NC + MF + CM + ESS + S + BMI	0.6577	all	0.7438
NC + MF + CM + ESS + S + M	0.6572	caucasion	0.7690
NC + MF + CM + ESS + S + BMI + M (NAMES2)	0.6690	hispanic	0.7811
NC + MF + M + ESS + S	0.6583	females	0.6707	0.6220	0.6080	0.7030	0.5830
BMI + MF + CM + ESS + S + M	0.6436	males	0.7318	0.6070	0.6070	0.7130	0.6360
(NC + MF) × 2 + CM + ESS + S	0.6661	age ≤ 50	0.8320
(NC + BMI) × 2 + M + ESS + S	0.6433	ag > 50	0.7684
(NC + MF) × 2 + M + ESS + S	0.6484
(NC + BMI) × 2 + CM + ESS + S	0.6426
(NC + MF + BMI)×2 + CM + ESS + S + M	0.6478