Opportune warning of COVID-19 in a Mexican health care worker cohort: Discrete beta distribution entropy of smartwatch physiological records

Alejandro Aguado-García; América Arroyo-Valerio; Galileo Escobedo; Nallely Bueno-Hernández; PV Olguín-Rodríguez; Markus F Müller; José Damián Carrillo-Ruiz; Gustavo Martínez-Mekler

doi:10.1016/j.bspc.2023.104975

2023 Apr 21;84:104975.

PMCID: PMC10121132 PMID: 37125410

Abstract

We present a statistical study of heart rate, step cadence, and sleep stage registers of health care workers in the Hospital General de México “Dr. Eduardo Liceaga” (HGM), monitored continuously and non-invasively during the COVID-19 contingency from May to October 2020, using the Fitbit Charge 3® Smartwatch device. The HGM-COVID cohort consisted of 115 participants assigned to areas of COVID-19 exposure. We introduce a novel biomarker for an opportune signal for the likelihood of SARS-CoV-2 infection based on the Shannon Entropy of the Discrete Generalized Beta Distribution fit of rank ordered smartwatch registers. Our statistical test indicated infection for 94% of patients confirmed by a positive polymerase chain reaction (PCR+) test, 47% before the test and 47% in coincidence with it. These results required innovative data preprocessing for the definition of a new biomarker index. The statistical method parameters are data-driven, and confidence estimates were calibrated based on sensitivity tests using appropriately derived surrogate data as a benchmark. Our surrogate tests can also provide a benchmark for comparing results from other anomaly detection methods (ADMs). Biomarker comparison of the negative Immunoglobulin G Antibody (IgG-) subgroup with the PCR+ subgroup showed a statistically significant difference (p < 0.01, effect size = 1.44). The distribution of the uninfected population had a lower median and less dispersion than the PCR+ population. A retrospective study of our results confirmed that the biomarker index provides an early warning of the likelihood of COVID-19, even several days before the onset of symptoms or the PCR+ test request. The method can be calibrated for the analysis of different SARS-CoV-2 strains, the effect of vaccination, and previous infections. Furthermore, our biomarker screening could be implemented to provide general health profiles for other population sectors based on physiological signals from smartwatch wearable devices.

Keywords: Physiological signals, Heart rate, Biomedical signal processing, Time series analysis, Shannon entropy, Early warning, Sleep fragmentation

1. Introduction

Health care workers (HCW) are at higher risk of contracting coronavirus disease (COVID-19) than the overall population. Beyond symptomatic cases, HCW can also develop asymptomatic COVID-19 and become a source of infection without being aware of it. During the period under study, from May to October 2020, the average incubation period after contact with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was around 4–7 days until the appearance of the first COVID-19 symptoms, when the transmission peak occurs [1], [2]. In this scenario, opportune disease warning is of vital importance, among other reasons, to interrupt disease transmission by isolation, decrease case fatality rates, and avoid the collapse of the medical care system. Safety of HCW starts with the necessary personal protective equipment (PPE) and regulations [3]. We propose that the protection of HCW can be increased with the opportune warning of the disease by continuous monitoring of physiological variables through a smartwatch.

Our attention to smartwatch registers should not come as a surprise since they have been used, among other purposes, for the warning of viral influenza infection in association with an increase in the resting heart rate [4]. Also, heart rate and body temperature are interrelated, since for every degree that body temperature rises the heart rate tends to increase by approximately 8.5 beats per minute [5]. On the other hand, sleep is known to affect numerous physiological functions in humans. Sleep increases when viral infections such as influenza, rhinovirus, and Epstein-Barr virus occur [6]. In the case of influenza A or B, an increase in sleep duration in the symptomatic period and a reduction of sleep during the incubation period have been reported [5], [6]. With respect to COVID-19, infection can be identified through changes in physiological characteristics, such as heart rate variability [7], oxygen saturation [8], respiration rate [9], and arrhythmia [10].

Some wearable studies have focused on detection and early warning by means of artificial intelligence (machine and deep learning) [11] and statistical analysis [12]. For an extensive review of both methods see [13]. Other studies have contributed to research geared to the understanding of the infection [9], [14], [15]. Herein we introduce a novel statistical analysis of Fitbit® smartwatch data that may help timely recognition of SARS-CoV-2 infection in our cohort study. Our study focuses on the calculation of the so-called Beta Entropy (BE) [16], [17] of the Fitbit registers (which we define further on) as an aid in the identification of a possible SARS-CoV-2 infection. It consisted of a retrospective data analysis of the cohort, identifying patients with high BE fluctuation values for subsequent corroboration with a positive polymerase chain reaction test (PCR+). During our study, using PCR+ and anti-SARS-CoV-2 serum immunoglobulin G (IgG) as gold standards to confirm SARS-CoV-2 infection, we determined three groups: PCR+ with IgG+, IgG-, and IgG+ in the absence of PCR+. We then proceeded to determine statistically relevant physiological parameters for the first group and we statistically differentiated the first and second groups. All in all, through our BE biomarker study of the Fitbit Charge 3® Smartwatch registers we produce a general health profile of the cohort.

We would like to point out that, to our knowledge, entropy arguments have not previously been considered in the literature for this type of study, nor has surrogate data analysis been implemented to gauge the sensitivity of Anomaly Detection Methods (ADMs). We also remark that surrogate calculations additionally provide a means for establishing a benchmark for ADMs. Our data preprocessing and definition of the detector index are innovative methodological features.

In the methods section we include a description of the cohort, an explanation of the BE calculation, and a presentation of the warning method and surrogate data generation. The implementation of BE on the HGM-COVID cohort data and the determination of its efficacy and significance in the warning of COVID-19 cases is shown in the results section. Also, statistical differences among the first two above-mentioned cohort subgroups are shown. The paper ends with a discussion and perspectives section.

2. Methods

2.1. Ethics declarations

A prospective cohort study on health workers, at the General Hospital of Mexico “Dr. Eduardo Liceaga”, was conducted in Mexico City, approved by the Ethics and Clinical Research committees of the hospital (Approval No: DI/20/501/04/32) in compliance with the principles of the Declaration of Helsinki. All health workers that agreed to take part in the study signed an informed consent and received a full explanation of the purposes and procedures of the study.

2.2. Description of the HGM-COVID cohort

For our study, a cohort of 115 HCW (medical doctors, nurses, researchers, psychologists, rehabilitators, and administrative staff) was constituted at the General Hospital of Mexico “Dr. Eduardo Liceaga” in Mexico City, a hospital designated for treating COVID-19 patients. Only volunteers with no previous diagnosis or symptoms of SARS-CoV-2 infection were considered. A detailed account of the cohort constitution with criteria related to the selection of participants, clinical evaluation of volunteers and chemical laboratory analysis protocols is presented in [18]. Here we summarize the main cohort procedure considerations: (i) All participants had both physical and clinical medical examinations. (ii) Laboratory biochemical analyses of blood samples were performed. (iii) PCR tests on nasopharyngeal exudates were performed when symptoms arose; 21 subjects were infected and detected by a positive PCR test during the study. (iv) Specific IgG antibody levels against SARS-CoV-2 Spike S1 proteins were measured by flow cytometric suspension assays at the beginning of the study and six months later; 64 subjects were IgG positive at the end of the study. (v) A Fitbit Charge 3® smartwatch was worn continuously to capture and analyze physiological signals such as heart rate, steps, and sleep. (vi) The date of PCR positive detection and the presence of IgG antibodies were reported.

2.3. Shannon entropy of the generalized discrete beta distribution

The biomarker is based on an analysis of a discrete generalized beta distribution that has been shown to fit very well samples with unimodal probability density functions when they are organized according to their rank (e.g., the size of the quantities under study, the frequency of their appearance, or some other relevant property). The ubiquity of this fit has motivated the search for universality class properties that lead to the classification of a variety of qualitatively different systems like textbooks, musical scores, genetic regulation networks, social networks, populations, urban planning, and physiological time series [19]. Studies of sequences generated from mathematical models have pointed out that the parameters of the rank-frequency distribution capture non-trivial dynamical and statistical features of the system that provide some understanding of the mechanisms behind their emergence [20], [21].

In the present study we rank order daily Fitbit registers according to their magnitude in descending order, viz.

\{r_{1}, r_{2}, r_{3}, \dots, r_{M}\},

(1)

with $r_{1} \geq r_{2} \geq \dots \geq r_{M}$.

This rank-size ordered sequence is then log–log fitted with the Discrete Generalized Beta Distribution (DGBD) given by:

f_{(α, β)} (r_{i}) = A \frac{{(N + 1 - r_{i})}^{β}}{r_{i}^{α}}, i = 1, \dots, N

(2)

which depends on a set of fitting parameters:

(A, α, β)

(3)

where $A$ is an overall scale parameter of the rank ordered sequence, while the exponents $α$ and $β$ are the main defining features of the function $f$. In the following we evaluate the goodness of the log–log linear fit with the squared Pearson correlation $R^{2}$.

We now define a Discrete Generalized Beta probability function (DGB) as $\hat{f} = A^{'} f$ , with the normalizing factor $A^{'}$ :

A^{'} = {[A \sum_{i = 1}^{N} \frac{{(N + 1 - r_{i})}^{β}}{r_{i}^{α}}]}^{- 1}

(4)

This ensures that the sum over the ranks is equal to 1:

\sum_{i = 1}^{N} {\hat{f}}_{(α, β)} (r_{i}) = 1

(5)

Hereafter, we shall refer to $\hat{f}$ as the Beta Probability. Next, following [16], [17], we calculate the Shannon entropy of $\hat{f}$ , namely:

S (\hat{f}) = - \sum_{i = 1}^{N} {\hat{f}}_{(α, β)} (r_{i}) \log {\hat{f}}_{(α, β)} (r_{i})

(6)

Finally, since we will need to compare this entropy for the different Fitbit physiological records, we define the Beta Entropy $BE (d)$ per day as a normalization of $S (\hat{f})$ , i.e.:

BE (d) = \frac{S_{d} (\hat{f}) - μ}{σ}

(7)

where $μ$ and $σ$ are the mean value and the standard deviation of all $S_{d} (\hat{f})$ values over 6 months and d is the day the data was registered. Note that with this definition the standard deviation for the $BE (d)$ is 1 and the mean value is 0. Based on this quantity, we propose in the following a biomarker to detect possible infections of COVID-19.
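The chain from Eq. (2) to Eq. (7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the argument of the distribution is taken to be the rank $r = 1, \dots, N$, the logarithm is natural, and $σ$ is computed as a population standard deviation. The exponents in the usage lines are arbitrary illustrative values, not fitted results from the study.

```python
import math
from statistics import mean, pstdev


def dgbd_probability(alpha, beta, n):
    """Beta Probability over ranks r = 1..n (Eqs. 2, 4 and 5).

    The scale A cancels against the normalizing factor A', so it is omitted.
    """
    weights = [(n + 1 - r) ** beta / r ** alpha for r in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def shannon_entropy(probs):
    """Shannon entropy of a discrete distribution (Eq. 6), natural logarithm."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def beta_entropy(daily_entropies):
    """Standardize daily entropies S_d to zero mean and unit deviation (Eq. 7)."""
    mu = mean(daily_entropies)
    sigma = pstdev(daily_entropies)  # population deviation: an assumption
    return [(s - mu) / sigma for s in daily_entropies]


# Illustrative exponents only; these are not fitted values from the study.
daily_s = [shannon_entropy(dgbd_probability(a, b, 100))
           for a, b in [(0.2, 0.5), (0.3, 0.4), (0.6, 0.1)]]
print(beta_entropy(daily_s))
```

Because $A$ cancels in the normalization, the entropy depends only on $α$, $β$ and $N$, which is why the sketch never needs the fitted scale.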

2.4. Smartwatch data collection

Fitbit® granted permission to store all the participant records provided by the smartwatch in a private database for the study, with the final purpose of downloading and analyzing all the information automatically. Only the records of heart rate, step cadence and sleep states of each participant in “CSV” format were used. Heart rate (HR) data are obtained at a rate of approximately 5 samples per minute, and steps per minute (StM) and sleep states (SlS) at 1 sample per minute. Overall, the smartwatch was worn all day long (24 h) for about six months by all participants.

2.5. Data fitting

We estimate the $A$, $α$ and $β$ parameters of the DGBD using Matlab® software to perform a multiple linear regression (regress command) of $\ln [f_{(α, β)} (r)] = \ln (A) - α \ln (r) + β \ln (N + 1 - r)$ on the basis functions $\ln (r)$ and $\ln (N + 1 - r)$. In order to calculate the BE, the above fit is performed over rank ordered properties of the different physiological registers. For heart rate as well as for physical activity we rank order the size of the fluctuations of the empirical recordings with respect to 0. In contrast, sleep stages are not rank ordered by fluctuation size but by the duration of the sleep stages. We call this characteristic “sleep fragmentation”.
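The log–log regression can be illustrated without Matlab. The sketch below is a standard-library Python stand-in for the regress call, not the authors' code: it solves the ordinary least-squares normal equations for the three coefficients, and the names `solve_linear` and `fit_dgbd` are hypothetical. It assumes strictly positive input values (so that the logarithm is defined) and at least three data points.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_dgbd(values):
    """Fit ln f = ln A - alpha ln r + beta ln(N + 1 - r) to rank-ordered values.

    Returns (A, alpha, beta). Values must be strictly positive.
    """
    ordered = sorted(values, reverse=True)  # rank 1 is the largest value
    n = len(ordered)
    rows = [[1.0, math.log(r), math.log(n + 1 - r)] for r in range(1, n + 1)]
    y = [math.log(v) for v in ordered]
    # Normal equations (X^T X) c = X^T y for the three regression coefficients.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(3)]
    c0, c1, c2 = solve_linear(xtx, xty)
    return math.exp(c0), -c1, c2  # the sign flip recovers alpha from Eq. (2)
```

Note the sign convention: the regression coefficient of $\ln (r)$ is $-α$, so the exponent of Eq. (2) is recovered by negating it.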

2.6. Detection of risk bands

We performed the fit of the DGBD to daily registers of physiological variables: sleep states (SlS) (during night), cadence of steps per minute (StM) (during day) and heart rate (HR) (during day), throughout the 6-month records, for each subject of the cohort. With this data we determined the daily evolution of the corresponding beta entropy ${BE}_{i} (d)$, where $i \in \{SlS, StM, HR\}$ labels the physiological quantity.

To determine a potential biomarker, we implemented the following 3 steps:

(1) Given that fluctuations may surface with both positive and negative large values, we work with a rectified absolute value of ${BE}_{i} (d)$:

R_{i} (d) = |{BE}_{i} (d)|

(8)

(2) We introduce a daily detector $w_{i} (d)$ for each register:

w_{i} (d) = \begin{cases} R_{i} (d) & \text{if } R_{i} (d) > u \\ 0 & \text{otherwise} \end{cases}

(9)

where $u$ is the cut-off threshold in units of the standard deviation σ, calculated over six months. The choice of the threshold size $u$ depends on a statistical treatment presented in the results section.

(3) Finally, for an overall detector that embraces the behavior of the three physiological variables we define:

D (d) = [w_{SlS} (d) + w_{StM} (d) + w_{HR} (d)] / 3

(10)

Recapitulating, the three steps aim to integrate information from the daily BE of the three physiological variables in a single index that is restricted to the absolute values of fluctuations larger than a threshold $u$ .
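The three steps translate directly into code. The following Python sketch is illustrative only; the function names are hypothetical, and it assumes the three BE series are aligned day by day and contain no missing days.

```python
def daily_detector(be_value, u):
    """Eqs. (8) and (9): rectify the BE and keep it only above the threshold u."""
    rectified = abs(be_value)
    return rectified if rectified > u else 0.0


def overall_detector(be_sls, be_stm, be_hr, u=0.75):
    """Eq. (10): daily average of the three thresholded detectors."""
    return [
        (daily_detector(s, u) + daily_detector(m, u) + daily_detector(h, u)) / 3
        for s, m, h in zip(be_sls, be_stm, be_hr)
    ]


# Two days of made-up BE values for SlS, StM and HR, for illustration only.
print(overall_detector([0.5, -1.5], [1.0, 0.2], [-0.8, 0.0]))
```

On the first illustrative day only StM and HR exceed the threshold in absolute value, so the SlS term contributes zero to the average.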

One of our main results is the identification of an infection risk band pattern, represented in a color-coded fashion of $D (d)$ in Fig. 3, where columns refer to days and rows to participants. We show that the identification of high-risk bands (periods of time with large fluctuations in the BE during at least 3 consecutive days where $D (d)$ is above the threshold $u$) provides an opportune detector of symptomatic infection. Band perception depends on the choice of the free parameters of the method, namely the length of the local warning windows $Δ d$ and the value of the risk threshold $u$.
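Locating such bands amounts to a run-length scan of $D (d)$. The sketch below is one possible reading of the definition, with a hypothetical function name; it assumes a gap-free daily series and leaves the handling of missing days to the caller.

```python
def find_risk_bands(d_values, u=0.75, min_days=3):
    """Return (first_day, last_day) index pairs of runs where D(d) > u
    for at least `min_days` consecutive days."""
    bands = []
    start = None
    for day, value in enumerate(d_values):
        if value > u:
            if start is None:
                start = day  # a new run above the threshold begins
        else:
            if start is not None and day - start >= min_days:
                bands.append((start, day - 1))
            start = None
    # Close a run that extends to the end of the record.
    if start is not None and len(d_values) - start >= min_days:
        bands.append((start, len(d_values) - 1))
    return bands
```

Runs shorter than `min_days` are discarded, so isolated one-day or two-day excursions above $u$ do not count as bands.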

2.7. Random warnings

To put our proposal on a solid statistical footing we estimate the number of random warnings using appropriate surrogate data. To this end we generate for each subject 1000 random shuffles of risk band periods (yellow, orange and red stripes in Fig. 3) and non-risk band periods (green colored time windows in Fig. 3), and we estimate the probability that a risk band overlaps just by chance with the date of the PCR test. The shuffling destroys any possible causal coincidence, so the ensemble of surrogate data represents the null hypothesis of purely random warnings. Furthermore, as we shall see below, we use these surrogates for an appropriate adjustment of the two parameters of the method, namely the length of the warning windows $Δ d$ and the height of the risk threshold $u$, by maximizing sensitivity as well as the discrepancy with respect to random warnings.
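One way to realize such surrogates is sketched below in Python. The details are assumptions, since the text does not fix them: the daily record is reduced to a list of booleans (inside a risk band or not), contiguous periods are shuffled as whole blocks, and a random warning is counted when any risk-band day falls within $Δ d$ days on either side of the PCR date. The function name and the seed parameter are hypothetical.

```python
import random
from itertools import groupby


def random_warning_probability(in_band, pcr_day, delta_d=7,
                               n_surrogates=1000, seed=0):
    """Estimate how often a shuffled risk band hits the PCR window by chance.

    `in_band` holds one boolean per day (True inside a risk band).
    """
    rng = random.Random(seed)
    # Split the record into contiguous risk-band and non-risk-band periods.
    segments = [list(group) for _, group in groupby(in_band)]
    lo = max(0, pcr_day - delta_d)
    hi = min(len(in_band), pcr_day + delta_d + 1)
    hits = 0
    for _ in range(n_surrogates):
        rng.shuffle(segments)  # reorder whole periods, keeping their lengths
        surrogate = [flag for segment in segments for flag in segment]
        if any(surrogate[lo:hi]):
            hits += 1
    return hits / n_surrogates
```

Shuffling whole periods preserves the number and length of each subject's bands while removing their timing relative to the PCR date, which is the property the null hypothesis requires.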

3. Results

3.1. On cohort data registers

Of the 115 subjects in the cohort, 21 tested PCR positive for SARS-CoV-2, of whom only 18 had Fitbit registers with at least two months of data, the minimum cut-off for our statistical study. Fig. 1 shows an example of the three physiological variables of a subject recorded by the smartwatch over a one-month period. Panel (a) corresponds to the recording of SlS throughout the nights, where state 1 corresponds to deep sleep, 2 to light sleep and 3 to awake. The recordings of physical activity via the cadence of steps per minute (Panel (b)) and of the heart rate (Panel (c)) are shown during day and night.

Fig. 1. Example of the physiological variables of a subject from the HGM-COVID cohort recorded over a month by the smartwatch, from 2020/09/01 to 2020/09/30. (a) Sleep states per minute (SlS). (b) Cadence of steps per minute (StM). (c) Heart rate (HR) per minute.

The lack of signal on the 13th day is because the subject did not synchronize his smartwatch with the database that day. For this subject, one day of data was lost because all three variables were absent simultaneously (red arrow) and another one because the sleep recording was absent (day 29, black arrow). For the remaining subjects of the HGM-COVID cohort, the average number of lost registration days per month is 5.5.

3.2. Beta Entropy estimations

Fig. 2a shows an example of ${BE}_{i} (d)$ for SlS, StM and HR. Note that approximately between day 110 (first thin vertical pink solid line) and day 120 (beginning of December, thick vertical pink solid line) two of the BEs fluctuate beyond one standard deviation, and approximately until day 150 (second thin vertical pink solid line) all BEs fluctuate beyond one standard deviation. In panel b, the temporal evolution of $D (d)$ for the six-month record is shown after applying a 3-point moving average to diminish rapid fluctuations and to improve the visualization of global trends. We identified time intervals with $D (d)$ values above $u$ = 0.75 standard deviations as episodes outside “normal” behavior. The choice of the threshold value results from the statistical significance analysis presented in Section 3.4. The first thin vertical pink line indicates the beginning of the most pronounced episode outside normal behavior. These episodes were defined as “risk bands” in Section 2.6. The positive result of the PCR test was delivered around day 120 (thick vertical pink line).

3.3. Performance of the BE biomarker

Fig. 3 is a graphical representation of our results and constitutes not only a source for our findings, but also a starting point for further research. It displays the daily integrated fluctuations of the $BE (d)$, viz. the temporal evolution of $D (d)$ over six months for the 18 PCR positive subjects. The distance from the average value is color coded from green to red.

Notice that risk bands often overlap with the date of the positive PCR result. In this figure we marked with blue horizontal lines 14-day intervals around the positive PCR detection and in white 10-day windows. These events will be called BE-warnings in the sequel.

The performance of the beta entropy in the analysis of the Fitbit registers is summarized in Table 1. Here we highlight some important aspects. We observe: 1) a high coincidence of high-risk bands with the date of the positive PCR test for the 18 PCR+ subjects with processable data, 17/18 = 94.4%; 2) opportune coincidence, i.e. a warning before or together with the PCR+ result, for 16/18 = 88.8% of these subjects; 3) considerable robustness of these observations despite having few data points per day and even with an occasional lack of full-day data (no synchronization of the smartwatch by the participants); 4) high-risk bands based on the BE biomarker that occur even shortly before the date of the positive PCR test in 8/18 = 44.4% of the PCR+ subjects.

Table 1.

Coincidences of high-risk bands with the date of positive PCR test.

Infected detected until January 2021	Number of subjects	Percentage (%)
BE warnings of PCR+	17	94.4
BE warnings after PCR+ date	1	5.5
BE warnings with PCR+ results	8	44.4
Early BE warnings before PCR+ date	8	44.4
Average risk bands during 6 months	3 ± 1	–
BE warnings using a window of 10 days	14	77.7
BE warnings using a window of 14 days	17	94.4
BE warnings using a window of 20 days	18	100

3.4. Statistical significance of the biomarker and parameter selection

In Fig. 4 we display results for the surrogate data described in Section 2.7. For each PCR+ subject we estimated the probability of random warnings over 1000 surrogates, such that for each parameter set we obtained a small sample of 18 probability estimates. The cumulative probability distributions of these samples for different parameter values are shown in Fig. 4a.

We probe 5 different threshold heights and two different warning windows of 10 and 14 days; warning windows of 20 days and longer are not reported because of the increased number of false positives. If a randomly distributed risk band falls by chance within the warning window $Δ d$ centered at the PCR+ date, it is counted as a random warning. As expected, the probability of random coincidences with the PCR+ date, and hence of random warnings, increases with longer windows $Δ d$ and lower thresholds $u$.

In panel (b) we provide a comparison between the warnings obtained for the surrogates (RWP) and the original data (TWP). The abscissa corresponds to the median values from the cumulative distributions of panel (a) and the ordinate denotes the probability of true warnings obtained for the original data with the same parameters $Δ d$ and $u$ as used for the surrogates. Note that the probability of true warnings means in this context the probability that a risk band from the original data coincides with the PCR+ date within the window $Δ d$. The dots of each curve are obtained for the five threshold values chosen in this study, from $u = 1$ to $u = 0$ in steps of 0.25, from left to right. Within this degree of resolution the best parameter values are those that lead to a maximum number of true warnings and at the same time a minimum number of random warnings, in combination with a high threshold value to minimize false alarms.

We find that results obtained for the BE (red) are systematically above the estimates for the usual binning-based Shannon Entropy (blue). These results clearly justify the alternative BE magnitude fitting procedure. Furthermore, we observe that all curves almost saturate between $u = 0.75$ and $u = 0.5$ , viz. a further lowering of the threshold would not increase the number of true warnings but would lead to an increased number of random warnings. Thus, the comparison with surrogates provides more objective criteria to define appropriate values for the length of the warning window $Δ d$ and the threshold $u$ .

To find an appropriate choice for the set of parameters that simultaneously minimizes random warnings and maximizes true warnings, we display in panel (c) the difference between the probabilities of true and random warnings, derived from the original and the surrogate data respectively, as a function of the threshold $u$. Again, values for the BE are systematically larger than those of the Shannon Entropy, which constitutes a further hint that the BE is the better choice. For the BE we encounter estimates above 50% for thresholds $u$ larger than 0.5. Furthermore, for a warning window of 14 days we observe a clear maximum at $u = 0.75$. That is to say, for this setting we find the largest discrepancy between true warnings and random warnings. Consequently, we consider the corresponding parameters, the 14-day warning window ($Δ d = 7$) and the threshold $u = 0.75$, as the appropriate choice and we perform the present study with this setting.
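The selection rule of panel (c) reduces to picking the parameter pair with the largest gap between true and random warning probabilities. A minimal Python sketch follows, assuming the probabilities have already been estimated for each pair; the function name and the dictionary layout are hypothetical, and ties are broken in favor of the higher threshold, in line with the stated preference for high thresholds.

```python
def best_parameters(results):
    """Pick the (window_days, u) pair maximizing TWP - RWP.

    `results` maps (window_days, u) -> (true_warning_prob, random_warning_prob).
    Ties in the difference are broken in favor of the larger threshold u.
    """
    return max(
        results,
        key=lambda params: (results[params][0] - results[params][1], params[1]),
    )
```

The function only ranks estimates supplied by the caller; the quality of the choice rests entirely on the surrogate and true warning probabilities fed into it.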

3.5. Differences between IgG- and PCR+ groups

We performed Fitbit register studies on the complete cohort of health care workers and obtained patterns like those shown in Fig. 3 for the whole population. Considering IgG and PCR tests, several groups can be identified. The aim of this section is to search for statistical differences between the IgG- group and the PCR+, IgG+ group.

The number of subjects in each group is different, as is the amount of data registered by each subject. To avoid problems concerning different data sizes we build random ensembles from each group. To this end we randomly choose 15 subjects from each group and a window of length 15 days ($Δ d = 7$) of BE fluctuations, which is selected without overlap across the recordings. This is repeated 200 times to generate the ensembles. The resulting cumulative distribution functions (CDFs) are shown in the upper left inset of Fig. 5. The most pronounced differences between the two groups are observed within the range 0.75 < BE < 1.5. This section of the CDFs is displayed in Fig. 5, and the bottom right inset shows the corresponding boxplots of the median values of the CDFs. Differences are significant at the 1% level according to the Kolmogorov-Smirnov test as well as the nonparametric Mann-Whitney-Wilcoxon rank test. In comparison with IgG- subjects, subjects with a positive PCR test tend strongly towards larger BE values, with a more disperse distribution of the combined version of the three physiological variables considered in the present study.

Fig. 5. CDFs of the BE fluctuations of the ensembles of the PCR+ group (dotted red lines) and the ensembles of the IgG- group (dotted blue lines) in the fluctuation region [0.75, 1.5]. The upper left inset displays the complete CDFs. The bottom right inset shows the boxplots of the median values obtained from the PCR+ and IgG- ensembles. The Kolmogorov–Smirnov and Mann-Whitney-Wilcoxon rank tests result in p < 0.01 and the effect size is 1.44.
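The ensemble comparison can be sketched as follows in standard-library Python. This is a simplified illustration with hypothetical names, not the procedure used for Fig. 5: each draw takes one random 15-day window from each of 15 randomly chosen subjects and pools the absolute BE values, and the sketch does not enforce the non-overlap condition across repeated draws. Only the two-sample Kolmogorov-Smirnov statistic is computed; p-values would in practice come from a statistics library, for example `scipy.stats.ks_2samp` and `scipy.stats.mannwhitneyu`.

```python
import random
from bisect import bisect_right


def draw_ensemble(records, n_subjects=15, window=15, rng=random):
    """Pool |BE| values from one random window per randomly chosen subject.

    `records` maps a subject id to that subject's list of daily BE values;
    every series is assumed to hold at least `window` days.
    """
    pooled = []
    for subject in rng.sample(sorted(records), n_subjects):
        series = records[subject]
        start = rng.randrange(len(series) - window + 1)
        pooled.extend(abs(value) for value in series[start:start + window])
    return pooled


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest distance
    between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )
```

Calling `draw_ensemble` 200 times per group and comparing the pooled values with `ks_statistic` mirrors the structure of the comparison, under the simplifications stated above.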

4. Discussion and perspectives

Our work is a retrospective, innovative statistical analysis of heart rate, step cadence and sleep Fitbit® registers that is sensitive to SARS-CoV-2 infection, eventually corroborated by a PCR+ test, either as an early warning or during ongoing infection.

The ensuing biomarker relies on entropy determinations based on a recent application [17], [21] of the discrete generalized beta distribution fit of rank ordered registers. For the calculations an ad-hoc data preprocessing procedure is essential. Based on the Beta Entropy, we define a new biomarker index with a 94.4% success rate, which has a p < 0.01 statistical significance with respect to random warnings derived from appropriate surrogate data. With this procedure, data-driven parameter choices can be carried out, namely for the window size $Δ d$ and the warning threshold $u$, which are in good accordance with clinical determinations. Our surrogate analysis provides a benchmark; moreover, given its easy implementation, it can also be used for comparison with alternative anomaly detection models [13]. We should also point out that our analysis of the BE fluctuations statistically distinguishes between the IgG- and PCR+ subgroups of the cohort. We show that their difference lies essentially in the appearance of more risk bands with longer duration and higher intensity; the BE median values of these bands are 0.95 for IgG- and 1.03 for PCR+, with a Wilcoxon test value of p < 0.01.

A limitation of our work is insufficient specificity. Some studies such as [8], [9] have pointed out considerable overlaps of COVID-19 manifestations with symptoms of other diseases such as influenza or common colds. In the literature, general specificity considerations are commonly not dealt with [13]; so far, laboratory tests such as PCR are the main confirmation of COVID-19 infection. Although our work does not provide a discriminating tool, two features worth highlighting are continuous monitoring and high sensitivity (level of coincidence with PCR+ tests). Recall that in only one case was a PCR+ test not foreseen by our index. In this respect, our findings suggest that it would make sense to do a PCR test when we encounter a statistically anomalous measurement. Hence, though the absence of specificity should be dealt with, our study provides a reasonable warning of the need for further checking and may therefore contribute to the opportune treatment of health care personnel and the containment of disease transmission.

The small population size in our study is another drawback that could be improved with more data. For a proper detection method, the question of whether PCR+ tests always entail anomalous statistical behavior remains unsolved. In general, to deal with the above limitations, more physiological information (e.g. body temperature), as well as further analysis of additional univariate and multivariate statistical properties such as nonlinear auto-correlations and stochastic determinations, are called for.

Potential research areas for the statistical scheme introduced here are: (1) In the COVID-19 context, the study of recovery or deterioration, reinfection and vaccine follow-ups, and infection by SARS-CoV-2 mutations, given that the method can be recalibrated for their analysis. (2) For general health issues, since the registers provided by smartwatches are relevant to other diseases and physiological malfunctions, the incorporation of a wider selection of data may lead to the screening of individual and collective health pattern behaviors, detected by generalizations of our marker, that could help to identify and classify population health issues for an ample range of population sectors. For further applications, we recall that after the BE calculation, an appropriate processing procedure is required in order to facilitate the integration of the diverse registers in the definition of a sensitive health anomaly index.

The main asset of our work is that it provides a general health screening of hospital personnel (HCW) and a means of classification according to health profiles. Besides being conceptually straightforward, an advantage of the BE biomarker is that it requires minimal computing effort, hence it can be easily applied to relatively big populations. To our knowledge, our approach is absent in the scientific literature and may be complementary to and revealing for artificial intelligence treatments such as machine learning. A note of encouragement is that expected technological advances and increased accessibility of smart devices will refine and improve the analysis presented here.

Funding

This work was supported by CONACYT grants CF-263377, CF-610285 and 312512. AAG thanks CONACYT for the postdoctoral fellowship.

CRediT authorship contribution statement

Alejandro Aguado-García: Conceptualization, Software, Writing – original draft, Writing – review & editing, Investigation, Methodology, Visualization. América Arroyo-Valerio: Conceptualization, Writing – original draft, Writing – review & editing, Funding acquisition, Project administration, Resources, Supervision. Galileo Escobedo: Writing – review & editing. Nallely Bueno-Hernández: Writing – review & editing. P.V. Olguín-Rodríguez: Writing – review & editing. Markus F. Müller: Writing – original draft, Writing – review & editing, Investigation, Methodology. José Damián Carrillo-Ruiz: Funding acquisition, Project administration, Resources. Gustavo Martínez-Mekler: Conceptualization, Writing – original draft, Writing – review & editing, Investigation, Methodology, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was carried out within the HGM-COVID-Research-Collective. We acknowledge the participation of: Yoshio Hikotae Tomita-Cruz, Edna Márquez, Rubén Fossion, Octavio Lecona, Rene Márquez, Luis Ruelas, Ana Leonor Rivera, Antonieta Martínez, Arlex Oscar Marín García, Wady Alexander Ríos Herrera. We are highly indebted to them for their discussions, proposals, suggestions and shared ideas.

References

1. Elizalde González J.J. SARS-CoV-2 y COVID-19. Una revisión de la pandemia. Med. Crítica. 2020;33(1):53–67. doi: 10.35366/93281.
2. Leung K., Wu J.T., Liu D., Leung G.M. First-wave COVID-19 transmissibility and severity in China outside Hubei after control measures, and second-wave scenario planning: a modelling impact assessment. Lancet. 2020 doi: 10.1016/S0140-6736.
3. Barranco R., Ventura F. Covid-19 and infection in health-care workers. An emerging problem, Med. Leg. J. 2020;88(2):65–66. doi: 10.35366/93281.
4. J.M. Radin, N.E. Wineinger, E.J. Topol, S.R. Steinhubl, Harnessing wearable device data to improve state-level real-time surveillance of influenza-like illness in the USA: a population-based study 2(2) (2021). https://doi.org/10.1016/S2589-7500(19)30222-5.
5. Pakhomov S.V.S., Thuras P.D., Finzel R., Eppel J., Kotlyar M., Cabiati M. Using consumer-wearable technology for remote assessment of physiological response to stress in the naturalistic environment. PLoS One. 2020;15(3) doi: 10.1371/journal.pone.0229942.
6. Cohen S., Doyle W.J., Alper C.M., Janicki-Deverts D., Turner R.B. Sleep habits and susceptibility to the common cold. Arch Int. Med. 2009;169(1):62–67. doi: 10.1001/archinternmed.2008.505.
7. Frederick Hasty, MD, Guillermo García, MD, Héctor Dávila, MD, MSS, MC, USAR (Ret.), S Howard Wittels, MD, Stephanie Hendricks, BA, Stephanie Chong, DNP, CRNA, ARNP. Heart Rate Variability as a Possible Predictive Marker for Acute Inflammatory Response in COVID-19 Patients, Military Medicine, Volume 186, Issue 1-2, January-February 2021, Pages e34–e38, https://doi.org/10.1093/milmed/usaa405.
8. Shapiro A., Marinsek N., Clay I., Bradshaw B., Ramirez E., Min J., Trister A., Wang Y., Althoff T., Foschini L. Characterizing COVID-19 and influenza illnesses in the real world via person-generated health data. Patterns. 2021;2(1):100188. doi: 10.1016/j.patter.2020.100188.
9. Natarajan A., Su H.-W., Heneghan C., Blunt L., O’Connor C., Niehaus L. Measurement of respiratory rate using wearable devices and applications to COVID-19 detection. NPJ Digit. Med. 2021;4(1) doi: 10.1038/s41746-021-00493-6.
10. Öztürk F., Karaduman M., Çoldur R., İncecik Ş., Güneş Y., Tuncer M. Interpretation of arrhythmogenic effects of COVID-19 disease through ECG. Aging Male. 2020;23(5):1362–1365. doi: 10.1080/13685538.2020.1769058.
11. Shuo Liu, et al., Fitbeat: COVID-19 estimation based on wristband heart rate using a contrastive convolutional auto-encoder, Pattern Recogn. 123 (2022), 108403, ISSN 0031-3203, https://doi.org/10.1016/j.patcog.2021.108403.
12. Mishra T., Wang M., Metwally A.A., Bogu G.K., Brooks A.W., Bahmani A., Alavi A., Celli A., Higgs E., Dagan-Rosenfeld O., Fay B., Kirkpatrick S., Kellogg R., Gibson M., Wang T., Hunting E.M., Mamic P., Ganz A.B., Rolnik B., Li X., Snyder M.P. Pre-symptomatic detection of COVID-19 from smartwatch data. Nat. Biomed. Eng. 2020;4(12):1208–1220. doi: 10.1038/s41551-020-00640-6.
13. Shing Hui, Reina Cheong, et al., Wearable technology for early detection of COVID-19: A systematic scoping review, Prevent. Med. 162 (2022) 107170, ISSN 0091-7435. https://doi.org/10.1016/j.ypmed.2022.107170.
14. Quer G., Radin J.M., Gadaleta M., Baca-Motes K., Ariniello L., Ramos E., Kheterpal V., Topol E.J., Steinhubl S.R. Wearable sensor data and self-reported symptoms for COVID-19 detection. Nat. Med. 2021;27(1):73–77. doi: 10.1038/s41591-020-1123-x.
15. Rezaei N., Grandner M.A. Changes in sleep duration, timing, and variability during the COVID-19 pandemic: large-scale fitbit data from 6 major US cities. Sleep Health. 2021;7(3):303–313. doi: 10.1016/j.sleh.2021.02.008. ISSN 2352–7218.
16. Ausloos M., Cerqueti R., Zhou W.-X. A universal rank-size law. PLOS ONE. 2016;11(11):e0166011. doi: 10.1371/journal.pone.0166011.
17. Ghosh A., Basu B. Universal City-size distributions through rank ordering. Physica A. 2019;558 doi: 10.1016/j.physa.2020.125433.
18. Bueno-Hernández N., Carrillo-Ruíz J.D., Méndez-García L.A., Rizo-Téllez S.A., Viurcos-Sanabria R., Santoyo-Chávez A., Márquez-Franco R., Aguado-García A., Baltazar-López N., Tomita-Cruz Y., Barrón E.V., Sánchez A.L., Márquez E., Fossion R., Rivera A.L., Ruelas L., Lecona O.A., Martínez-Mekler G., Müller M., Arroyo-Valerio A.G., Escobedo G. High incidence rate of SARS-CoV-2 infection in health care workers at a dedicated COVID-19 hospital: experiences of the pandemic from a large mexican hospital. Healthcare. 2022;10(5):896. doi: 10.3390/healthcare10050896.
19. Martínez-Mekler G., Martínez R.A., del Río M.B., Mansilla R., Miramontes P., Cocho G., Costa M. Universality of rank-ordering distributions in the arts and sciences. PLoS ONE. 2009;4(3):e4791. doi: 10.1371/journal.pone.0004791.
20. Alvarez-Martinez R., Martinez-Mekler G., Cocho G., Martínez-Mekler G.Y., Cocho G. Order-disorder transition in conflicting dynamics leading to rank-frequency generalized beta distributions. Physica A. 2011;390(1):120–130. doi: 10.1016/j.physa.2010.07.037.
21. Abhik Ghosh, Preety Shreya, Banasri Basu, Maximum entropy framework for a universal rank order distribution with socio-economic applications, Physica A. 563 (2021) 125433. ISSN 0378-4371. https://doi.org/10.1016/j.physa.2020.125433.

