Author manuscript; available in PMC: 2022 Sep 1.

Published in final edited form as: IEEE Trans Biomed Eng. 2022 May 19;69(6):2094–2104. doi: 10.1109/TBME.2021.3136753

TABLE III.

Overall per-event performance. Experiment 1: all values are mean percentages with 95% confidence intervals. Experiment 2: values are the mean performance across the individual event types, i.e., obstructive apnea, central apnea, respiratory effort-related arousal (RERA), and hypopnea.

Metric (%)	Experiment 1: MGH dataset (binary)	Experiment 1: SHHS dataset (binary)	Experiment 2: MGH dataset (multiclass)
Accuracy	95.7 [95.7–95.7]	94.0 [94.0–94.1]	99.1
Sensitivity	67.7 [67.6–67.8]	70.9 [70.7–71.0]	49.3
Specificity	97.6 [97.6–97.6]	95.1 [95.1–95.2]	99.5
Precision	65.4 [65.3–65.5]	40.7 [40.5–40.8]	37.4
F1-score	66.5 [66.4–66.6]	51.7 [51.6–51.8]	40.6
Cohen’s kappa	64.2 [64.1–64.3]	48.7 [48.6–48.9]	36.5 [36.4–36.6]

Note that for experiment 2, no performance metric except Cohen’s kappa is given with a 95% confidence interval, because these values are means over the individual event types shown in Table IV.
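As a reading aid, the relationship between the precision, sensitivity, and F1-score rows can be checked with a minimal sketch. It assumes the standard definition of F1 as the harmonic mean of precision and sensitivity. The function name `f1_score` and the dictionary `table_iii` are illustrative and not taken from the paper, and the inputs are the rounded values printed in Table III, so agreement is only expected to one decimal place.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (both given in percent)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision, sensitivity, reported F1) copied from Table III, in percent.
# The keys are illustrative labels, not identifiers used by the authors.
table_iii = {
    "Exp. 1, MGH (binary)": (65.4, 67.7, 66.5),
    "Exp. 1, SHHS (binary)": (40.7, 70.9, 51.7),
    "Exp. 2, MGH (multiclass)": (37.4, 49.3, 40.6),
}

# Recompute F1 from the rounded precision and sensitivity of each column.
recomputed = {
    name: round(f1_score(p, s), 1) for name, (p, s, _) in table_iii.items()
}

for name, (_, _, reported) in table_iii.items():
    print(f"{name}: recomputed {recomputed[name]:.1f}, reported {reported:.1f}")
```

With these rounded inputs, the two binary columns reproduce the reported F1-scores (66.5 and 51.7). The multiclass column gives 42.5 rather than the reported 40.6, which is what one would expect if the reported value is an average of per-event-type F1-scores: the mean of several harmonic means generally differs from the harmonic mean of the averaged precision and sensitivity.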