Diabetes Technology & Therapeutics. 2023 Aug 23;25(9):631–642. doi: 10.1089/dia.2023.0064

Predicting Immunological Risk for Stage 1 and Stage 2 Diabetes Using a 1-Week CGM Home Test, Nocturnal Glucose Increments, and Standardized Liquid Mixed Meal Breakfasts, with Classification Enhanced by Machine Learning

Eslam Montaser 1, Marc D Breton 1, Sue A Brown 1,2, Mark D DeBoer 1,3, Boris Kovatchev 1, Leon S Farhy 1,2
PMCID: PMC10460684  PMID: 37184602

Abstract

Background:

Predicting the risk for type 1 diabetes (T1D) is a significant challenge. We use a 1-week continuous glucose monitoring (CGM) home test to characterize differences in glycemia in at-risk healthy individuals based on autoantibody presence and develop a machine-learning technology for CGM-based islet autoantibody classification.

Methods:

Sixty healthy relatives of people with T1D with mean ± standard deviation age of 23.7 ± 10.7 years, HbA1c of 5.3% ± 0.3%, and body mass index of 23.8 ± 5.6 kg/m2 with zero (n = 21), one (n = 18), and ≥2 (n = 21) autoantibodies were enrolled in a National Institutes of Health TrialNet ancillary study. Participants wore a CGM for a week and consumed three standardized liquid mixed meals (SLMM) instead of three breakfasts. Glycemic outcomes were computed from weekly, overnight (12:00 am–06:00 am), and post-SLMM CGM traces, compared across groups, and used in four supervised machine-learning autoantibody status classifiers. Classifiers were evaluated through 10-fold cross-validation using the receiver operating characteristic area under the curve (AUC-ROC) to select the best classification model.

Results:

Among all computed glycemia metrics, only three were different across the autoantibodies groups: percent time >180 mg/dL (T180) weekly (P = 0.04), overnight CGM incremental AUC (P = 0.005), and T180 for 75 min post-SLMM CGM traces (P = 0.004). Once overnight and post-SLMM features are incorporated in machine-learning classifiers, a linear support vector machine model achieved the best performance of classifying autoantibody positive versus autoantibody negative participants with AUC-ROC ≥0.81.

Conclusion:

A new technology combining machine learning with a potentially self-administered 1-week CGM home test can help improve T1D risk detection without the need to visit a hospital or use a medical laboratory.

Trial registration:

ClinicalTrials.gov registration no. NCT02663661.

Keywords: Continuous glucose monitoring, Immunological risk, Islet autoantibodies, Machine learning, Prediabetes, Type 1 diabetes

Introduction

The progression toward clinical type 1 diabetes (T1D) can be categorized into three stages: the first stage is characterized by the presence of ≥2 islet autoantibodies with normoglycemia, the second stage progresses to dysglycemia (i.e., at-risk), and finally the third stage is defined by the onset of symptomatic (i.e., clinical) T1D.1 Therefore, the presence of autoantibodies is related to the immunological risk of developing diabetes in the future, and is a key biomarker of the pathogenic processes leading to clinical diagnosis.2 This and other biomarkers could be used to screen a much broader population than only individuals at increased genetic risk of T1D (e.g., first-degree relatives).3

Early identification and screening of individuals at increased T1D risk can reduce the rates of diabetic ketoacidosis (DKA) at diagnosis,4–6 improve the quality of glycemic control,7,8 reduce other future poor health outcomes,6–9 and complications.10 Overall, as monitoring of high-risk individuals in natural history studies markedly reduces DKA rates at diagnosis, research participation in these studies is critical to finding means of preventing or delaying T1D11 and justifies the development of efficient screening methods to identify individuals at high T1D risk, which appears influenced by immunological and genetic factors.

Screening for genetic T1D risk can be performed as a self-administered at-home test, but this test does not directly account for the immunological risk (i.e., presence of autoantibodies). Screening individuals for genetic risk, followed by autoantibody testing may improve the predictive power of a positive autoantibody test but will still miss many individuals that will develop T1D in the future.12 Although ∼90% of those who develop T1D have no family history of the disease, this genetic predisposition puts individuals with first-degree relatives at a 20-fold higher risk of developing T1D.13 The current approach for T1D risk detection includes testing for presence of autoantibodies with the presence of multiple autoantibodies being more predictive of future T1D than a single autoantibody.14

Recently, several studies employed continuous glucose monitoring (CGM) devices not only in people with diabetes15 but also in obese individuals16 and in individuals at different stages of prediabetes.17 Studies have suggested that CGM devices can be used for detecting early hyperglycemia in children with multiple autoantibodies,18 and for predicting progression to diabetes in autoantibody positive (Ab+) children.19,20 Steck et al.20 suggested that “CGM should be included in the ongoing monitoring of high-risk children (Ab+),” where they used home-based CGM wear without any additional metabolic testing (e.g., mixed meal tolerance test [MMTT]).

In addition, in type 2 diabetes (T2D), CGM was able to detect impaired glycemia in certain categories of participants,21 earlier than other standard biomarkers used for the diagnosis and classification of diabetes.22 Recently, a 1 week CGM test has been investigated for its ability to be used for identifying individuals at higher risk for rapid progression to Stage 3 T1D, including in those with a normal oral glucose tolerance test (OGTT).23 This study has identified several CGM-derived metrics of hyperglycemia associated with progression to Stage 3 disease.

Machine-learning techniques have been utilized in the field of diabetes, especially in applications using CGM data to develop predictive models that could help clinicians improve screening and treatment. A logistic regression (LR) model with glycemic variability features extracted from CGM signals was used to classify individuals with and without diabetes.24 Several established machine-learning models for binary classification were used to classify the quality of overnight glycemic control in T1D.25 The proposed machine-learning methodology in this study for using CGM-based glycemic features to predict if a healthy individual is autoantibody positive or autoantibody negative (Ab+ or Ab–) as an alternative to the standard test for islet autoantibodies has not been explored.

At-home testing for disease risk could help address many of the challenges regarding whom to screen for T1D risk using autoantibody testing. The objective of this study is to characterize the CGM traces in healthy individuals with different numbers of islet autoantibodies and use CGM-based metrics to develop a (pre)screening technology to classify participants' autoantibody status (Ab+ vs. Ab–). The new technology uses a dedicated machine-learning methodology and a potentially self-administered 1-week CGM home test that includes up to three standardized liquid mixed meal (SLMM) challenges.

Materials and Methods

Study design and data overview

The National Institutes of Health (NIH)-funded TrialNet ancillary study (ClinicalTrials.gov registration no. NCT02663661) was conducted from 2015 to 2019 at the University of Virginia (Institutional Review Board protocol ID #18568). The study enrolled healthy relatives of people with T1D with different numbers of islet autoantibodies (zero, one, or two or more), recruited from participants with known autoantibody status in the TrialNet Pathway to Prevention study (https://www.trialnet.org/our-research/risk-screening). Major inclusion criteria included individuals 12–45 years old who had a brother, sister, child, or parent with T1D, or individuals 12–20 years old who had a cousin, aunt, uncle, niece, nephew, half-brother, half-sister, or grandparent with T1D.

Among the major exclusion criteria were diagnosis of diabetes (i.e., T1D or T2D), a relevant medical condition (e.g., gastroparesis), or being treated with medications that might interfere with the study. All participants signed an informed consent. Participants were asked to come to the Clinical Research Unit (CRU) at the University of Virginia for a 10-h inpatient visit (a single 10-h clinical test consisting of an MMTT followed by insulin-induced hypoglycemia).

At the end of the hospital visit, the participants were given a blinded Dexcom G4 Platinum CGM, which they wore for the next 7 days at home. They were asked to calibrate their CGMs according to the manufacturer's instructions. During this period, they consumed SLMM (Boost, Nestlé, Switzerland) over 1–5 min on three occasions to replace their breakfasts (6 mL/kg body weight to a maximum of 360 mL) and recorded its timing to link the start of the SLMMs with the CGM profiles. In this study, we focus solely on the CGM home study.

CGM-based glycemia metrics and group comparison

The CGM-based metrics and the characterization of glycemia in the different autoantibody groups were computed under three different scenarios: OVERALL (based on all 7 days), OVERNIGHT (based on the overnight periods of all 7 days), and PostSLMM (based only on the post-SLMM CGM traces), as described below.

OVERALL

CGM data from the participants were collected and glycemic features/metrics were extracted and computed, including mean glucose (MG), percent time of glucose >180 mg/dL (T180), >160 mg/dL (T160), >140 mg/dL (T140), <70 mg/dL (T70), <54 mg/dL (T54), coefficient of variation (CV), standard deviation (SD), range, low blood glucose index (LBGI, measures the frequency and magnitude of hypoglycemia), high blood glucose index (HBGI, measures the frequency and magnitude of hyperglycemia), and the average daily risk range (ADRR, the sum of the daily peak risks for hypo- and hyperglycemia) [see Kovatchev,26 Table 1].
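The simpler metrics in this list can be illustrated with a minimal sketch. It assumes equally spaced CGM readings, so that the percent of readings beyond a threshold approximates the percent of time, and it uses the sample standard deviation; the paper does not state either choice. The function names and the example trace are hypothetical and are not study data.

```python
from statistics import mean, stdev


def percent_time_above(glucose, threshold):
    """Percent of readings strictly above `threshold` (mg/dL)."""
    return 100.0 * sum(g > threshold for g in glucose) / len(glucose)


def percent_time_below(glucose, threshold):
    """Percent of readings strictly below `threshold` (mg/dL)."""
    return 100.0 * sum(g < threshold for g in glucose) / len(glucose)


def basic_metrics(glucose):
    """Summary metrics for one CGM trace given as a list of mg/dL values."""
    mg = mean(glucose)
    sd = stdev(glucose)  # assumption: sample SD (n - 1 denominator)
    return {
        "MG": mg,
        "SD": sd,
        "CV": 100.0 * sd / mg,  # coefficient of variation in percent
        "range": max(glucose) - min(glucose),
        "T180": percent_time_above(glucose, 180),
        "T160": percent_time_above(glucose, 160),
        "T140": percent_time_above(glucose, 140),
        "T70": percent_time_below(glucose, 70),
        "T54": percent_time_below(glucose, 54),
    }


# Hypothetical trace for illustration only (not study data).
example_trace = [95, 110, 150, 185, 200, 160, 120, 100, 68, 90]
example_metrics = basic_metrics(example_trace)
```

With real CGM data, gaps in the signal would need to be handled before these percentages can be read as percent time.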

Table 1.

Clinical and Demographic Characteristics of 60 Participants in the Three Different Classes of Islet Autoantibodies (Ab) Used for Analysis

Characteristic | Negative (zero Ab) | 1 Ab | 2 or more Ab
Number of subjects (n) | 21 | 18 | 21
Gender, % female | 66.7 | 50.0 | 57.1
Age (years) | 27.0 (9.9) | 23.5 (11.9) | 20.7 (10.2)
Race, % White/Caucasian | 100 | 100 | 100
HbA1c (%) | 5.3 (0.3) | 5.3 (0.3) | 5.3 (0.3)
BMI (kg/m2) | 23.7 (5.3) | 25.3 (6.5) | 22.8 (5.1)

Statistics presented as n, mean (SD), or (%).

Ab, autoantibody; BMI, body mass index; SD, standard deviation.

In more detail, ADRR is a variability metric based on “risk” values obtained from glucose levels that are mathematically transformed to give equal weight to hyperglycemic and hypoglycemic excursions. LBGI and HBGI are based on the same normalizing transformation as the ADRR but are designed to be sensitive to hypoglycemia or hyperglycemia, respectively. These metrics were used to characterize the glycemic responses of participants in different autoantibody classes.
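The risk transformation behind LBGI, HBGI, and ADRR can be sketched as follows. The constants (1.509, 1.084, 5.381) and the factor 10 are the values commonly cited for glucose in mg/dL in the risk-analysis literature; they are an assumption here and should be checked against reference 26. The function names are hypothetical.

```python
import math


def risk_components(bg):
    """Return (low_risk, high_risk) for one glucose value in mg/dL."""
    # Symmetrizing transform: f is about 0 near 112.5 mg/dL,
    # negative below it and positive above it.
    f = 1.509 * (math.log(bg) ** 1.084 - 5.381)
    r = 10.0 * f * f
    return (r, 0.0) if f < 0 else (0.0, r)


def lbgi_hbgi(glucose):
    """LBGI and HBGI as the mean low and high risk over all readings."""
    pairs = [risk_components(g) for g in glucose]
    lbgi = sum(low for low, _ in pairs) / len(pairs)
    hbgi = sum(high for _, high in pairs) / len(pairs)
    return lbgi, hbgi


def adrr(days):
    """ADRR: per-day peak low risk plus peak high risk, averaged over days.

    `days` is a list of daily traces, each a list of mg/dL values.
    """
    daily = []
    for day in days:
        pairs = [risk_components(g) for g in day]
        daily.append(max(low for low, _ in pairs) + max(high for _, high in pairs))
    return sum(daily) / len(daily)
```

Because the transform is roughly symmetric around 112.5 mg/dL, a hypoglycemic and a hyperglycemic excursion of comparable clinical severity contribute comparable risk values, which is the "equal weight" property described above.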

OVERNIGHT

Twelve glycemic features were extracted and computed from the overnight (12:00 am–06:00 am) CGM traces. These features include MG, T180, T160, T140, T70, T54, CV, SD, range, LBGI, HBGI, and the AUC above the baseline value at midnight (the overnight CGM incremental area under the curve [IAUC]), and were used to characterize the glycemic responses in the different autoantibody groups.
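One way to compute an incremental AUC above the midnight baseline is the trapezoidal rule, sketched below. Several details are assumptions not stated in the paper: a fixed 5-min sampling interval, the first reading taken as the baseline, and increments below baseline clipped to zero before integration (which slightly approximates segments that cross the baseline). The function name is hypothetical.

```python
def incremental_auc(glucose, step_min=5.0, positive_only=True):
    """Trapezoidal area between the trace and its first value.

    `glucose` is a list of mg/dL readings spaced `step_min` minutes apart;
    the result is in mg*min/dL.
    """
    baseline = glucose[0]
    increments = [g - baseline for g in glucose]
    if positive_only:
        # Assumption: only excursions above baseline count toward the IAUC.
        increments = [max(d, 0.0) for d in increments]
    return sum((a + b) / 2.0 * step_min
               for a, b in zip(increments, increments[1:]))
```

For a full overnight trace at 5-min sampling, the list would hold 73 readings covering 6 h.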

PostSLMM

We examined post-SLMM windows of up to 2 h to find the window length t at which the post-SLMM excursions differed significantly between the autoantibody groups, using nine glycemic features: CV, T140, T160, T180, the AUC above the baseline value at t = 0 (IAUC), glucose level at t min post-SLMM (Gt), maximal glucose amplitude (Gmax), time to Gmax (Tmax), and slope of glucose over 0–t min (S). These features capture the dynamic characteristics of the post-SLMM CGM data for each participant in the three autoantibody groups.
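The dynamic post-SLMM features can be sketched as below. The definitions are one plausible reading rather than the authors' exact formulas: Gmax is taken as the highest reading in the window, Tmax as the time of its first occurrence, and S as the end-to-end slope (Gt minus the starting value, divided by t). A 5-min sampling interval is also assumed, and the function name is hypothetical.

```python
def post_meal_features(glucose, step_min=5, t_min=75):
    """Gt, Gmax, Tmax, and S from a trace that starts at the meal (t = 0)."""
    n = t_min // step_min + 1  # number of readings from 0 to t_min inclusive
    if len(glucose) < n:
        raise ValueError("trace is shorter than the requested window")
    window = glucose[:n]
    g0, gt = window[0], window[-1]
    gmax = max(window)
    tmax = window.index(gmax) * step_min  # minutes to the first peak reading
    slope = (gt - g0) / t_min             # (mg/dL)/min, end-to-end slope
    return {"Gt": gt, "Gmax": gmax, "Tmax": tmax, "S": slope}


# Hypothetical 75-min trace for illustration only (not study data).
example = post_meal_features(
    [100, 105, 115, 130, 145, 160, 170, 175, 172, 165, 155, 148, 140, 135, 130, 125]
)
```

A regression slope over the whole window would be a reasonable alternative to the end-to-end slope used here.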

Statistical procedures

All statistical analyses were performed using R Statistical Software 4.0.2 (R Foundation for Statistical Computing). The Shapiro–Wilk test was used to check if glycemic features follow a normal distribution. For normally distributed continuous variables, a one-way analysis of variance (ANOVA) was used to compare the means between autoantibodies groups. For non-normally distributed variables, a Wilcoxon signed-rank test and Kruskal–Wallis test were used to determine whether there are statistically significant differences between the glycemic features in different autoantibodies groups. Bonferroni correction was used for multiple comparisons correction to reduce the chances of obtaining false-positive (FP) results. A P-value <0.05 was considered significant. Pearson's correlation matrix between the glycemic features was computed, to assess the collinearity between glycemic features.
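The Bonferroni step can be illustrated with a short sketch. The study's analyses were run in R; this Python version only shows the arithmetic of the correction, and the P-values in the example are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P-values for three pairwise group comparisons.
adjusted = bonferroni([0.01, 0.04, 0.5])
```

After adjustment, only comparisons whose corrected P-value remains below 0.05 would be reported as significant.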

Autoantibodies classification

The extracted glycemic features from the three different scenarios were used to define different classifier models based on the autoantibodies class. Then, these features were aggregated per participant and each feature was mean-centered and scaled before entering the classification procedure.
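Mean-centering and scaling of a single feature can be sketched as follows, assuming scaling by the sample standard deviation (the paper does not specify the scaling). The function name is hypothetical.

```python
from statistics import mean, stdev


def standardize(column):
    """Return the z-scores of one feature column (one value per participant)."""
    mu = mean(column)
    sd = stdev(column)
    if sd == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sd for x in column]
```

When scaling is combined with cross-validation, a common precaution is to estimate the mean and standard deviation on the training folds only and reuse them on the test fold; whether the study did so is not stated.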

We merged the 1-autoantibody and ≥2-autoantibody groups into a single autoantibody-positive class (“Ab+”), contrasted with the autoantibody-negative class (“Ab–”). Two options for using glycemic features in the classifier models were investigated: using only the significant features (i.e., glycemic features that differed significantly between the autoantibody groups) or using all the glycemic features in the three scenarios.

Classification models

Four different classification models were used to develop an autoantibodies classifier and define the best classifier model: linear discriminant analysis (LDA), linear support vector machine (SVM), LR, and K-nearest neighbors.27,28

Classification strategy

A 10-fold cross-validation technique was implemented. The entire data set of glycemic features, aggregated per participant, was randomly shuffled and then subdivided into 10 approximately equal-sized folds. In each iteration, one of the 10 folds was used as a test set to evaluate classification performance, whereas the remaining nine folds were used to train the classifier models. The procedure was repeated 10 times (iterations) to estimate the mean performance of the different classifier models. This procedure guarantees that data from each participant appear either in the training or in the test set (but not both), which guards against overly optimistic performance estimates and improves the generalizability of the results.
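The fold construction can be sketched with participant indices, one row per participant. The shuffling seed and the way folds are cut (every k-th shuffled index) are illustrative choices, not the study's implementation, and the function name is hypothetical.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test
```

With 60 participants and k = 10, each test fold holds 6 participants and each training set holds the remaining 54. Because each row is a single participant, no participant can appear on both sides of a split.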

Class imbalance

Class imbalance refers to a classification predictive modeling problem in which the class distribution in the training data set is not equal or close to equal (here, a larger proportion of Ab+ than Ab– participants) and is instead biased or skewed. This can result in biased predictions and misleading accuracies. We address the class imbalance (i.e., unbalanced samples) by up-sampling the minority class (oversampling) only in the training data set. In our experiments, oversampling was performed within rather than before the 10-fold cross-validation to ensure that no participant appears in both the training and test sets, and thereby to avoid overestimating the model performance.
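Oversampling inside a cross-validation iteration can be sketched as random duplication of minority-class training rows until the classes are balanced. Random duplication with replacement is an assumption; the paper does not describe the exact up-sampling scheme. The function name is hypothetical.

```python
import random


def oversample_minority(train_idx, labels, seed=0):
    """Return training indices with smaller classes up-sampled to the largest.

    Only indices from `train_idx` are ever duplicated, so the test fold
    stays untouched.
    """
    rng = random.Random(seed)
    by_class = {}
    for i in train_idx:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Draw extra rows with replacement until this class reaches `target`.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced
```

Calling this on the training indices of each fold, and never on the full data set, keeps duplicated participants out of the test fold.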

Classification performance assessment

To assess the performances of classifier models, a confusion matrix was used to report the four possible outcomes of the comparison between the true and the predicted class, that is, true negative (TN), false negative (FN), true positive (TP), and FP. The receiver operating characteristic area-under-the-curve (AUC-ROC) was used to select the best-performing classifier models. AUC-ROC is a numerical index that depicts the trade-off between the Sensitivity (i.e., TP rate) and (1-Specificity) (i.e., FP rate) across a series of different cutoff points, which are given by

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$1 - \text{Specificity} = \frac{FP}{FP + TN}.$$

The closer to 1 the AUC-ROC, the better the classifier model at distinguishing between Ab+ and Ab– participants.
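The sensitivity and false-positive-rate definitions, and one standard way of obtaining AUC-ROC, can be sketched as follows. The AUC here is computed through its rank interpretation (the probability that a randomly chosen Ab+ participant receives a higher classifier score than a randomly chosen Ab– participant, with ties counted as one half), which equals the area under the empirical ROC curve. The function names and example scores are hypothetical.

```python
def sensitivity(tp, fn):
    """True-positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """1 - specificity: FP / (FP + TN)."""
    return fp / (fp + tn)


def auc_roc(scores, labels):
    """AUC-ROC from classifier scores and binary labels (1 = Ab+, 0 = Ab-)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores for illustration only.
example_auc = auc_roc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

An AUC-ROC of 0.5 corresponds to chance-level ranking, and 1.0 to perfect separation of the two classes.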

Results

Seventy-three participants were recruited for this study and stratified into three groups with zero (n = 25), one (n = 21), and ≥2 (n = 27) autoantibodies. One participant was diagnosed with diabetes, five failed screening, and seven withdrew from the study (the screen failures and withdrawals were not related to the CGM study). Sixty participants completed the CGM study and were included in the analysis. Of these participants, 21, 18, and 21 had zero, one, and more than one autoantibody, respectively. They had mean ± SD age of 23.7 ± 10.7 years (range 12–42 years), HbA1c of 5.3% ± 0.3%, and body mass index of 23.8 ± 5.6 kg/m2 (Table 1). There were no statistically significant differences between the three groups regarding these characteristics.

Overall CGM-based glycemia dynamics

The average CGM profiles in the three scenarios were not significantly different between the three groups, which illustrates the difficulty of using those profiles to characterize the glycemic responses of participants in different autoantibody groups: the ambulatory glucose profile displays for the three scenarios are not visibly distinct (Fig. 1), except that panel c (post-SLMM) does appear distinct for the ≥2 Ab group. There, the height of the peak and the distribution of CGM traces appear different from those of the other two groups (i.e., negative and 1 Ab).

FIG. 1.

Three panels of CGM traces, each aggregated into a single AGP as a visual display for the different autoantibody (Ab) groups (i.e., negative, 1 Ab, and ≥2 Ab). (a) CGM traces of the entire 7 days for 60 participants in the three Ab groups. (b) CGM traces of the overnight periods (i.e., 12:00 am–06:00 am) for 60 participants in the three Ab groups (blow-up of the first 6 h plotted in [a]). (c) CGM traces of the 2 h post-SLMM for 53 participants in the three Ab groups. The solid line in each Ab group in the three scenarios is the median or 50% line; half of all CGM values are above and half are below this value. The 25th and 75th percentile curves shaded in dark blue/red/black represent the interquartile range or 50% of all CGM values. The dashed outer lines (the 5th and 95th percentile curves) indicate that only 5% of CGM readings were above or below these values in the three scenarios. Ab, autoantibody; AGP, ambulatory glucose profile; CGM, continuous glucose monitoring; N, number of participants in each group; SLMM, standardized liquid mixed meals.

Characterization of glycemia of the three autoantibodies groups based on the complete 7-day CGM data

Twelve glycemic features were extracted and computed as described in Materials and Methods section (OVERALL). T140, T160, T180, SD, Range, and HBGI were highly correlated (r ≥ 0.83). None of these 12 glycemic features differed significantly between the 3 groups except T180 (P = 0.040; negative vs. 1 autoantibody, P = 0.352; negative vs. ≥2 autoantibodies, P = 0.012; 1 autoantibody vs. ≥2 autoantibodies, P = 0.144), as shown in Figure 2a. Therefore, weekly CGM traces revealed different glycemic patterns among the autoantibody groups only through T180.

FIG. 2.

Characterization of CGM data through different glycemic features in different scenarios. (a) Boxplots for 12 different glycemic features extracted from the entire 7 days of CGM traces for 60 participants in the three different groups of autoantibodies (Ab). (b) Boxplots for 12 features extracted from overnight (i.e., 12:00 am–06:00 am) CGM traces for 60 participants in the three different groups of Ab. (c) Boxplots for 9 features extracted from 75-min post-SLMM CGM traces for 53 participants in the three different groups of Ab. A significance level of 5% (P-value <0.05) was considered to be significant to distinguish between the different groups of Ab (P-value highlighted in red). ADRR, average daily risk range; CV, coefficient of variation; G75, glucose level at 75 min post-SLMM; Gmax, maximal glucose amplitude; HBGI, high blood glucose index; IAUC, incremental area under the curve (mg/min/dL); LBGI, low blood glucose index; MG, mean glucose; S, slope of glucose 0–75 min (mg/dL)/min; SD, standard deviation; SLMM, standardized liquid mixed meals; T140, percent time >140 mg/dL; T160, percent time >160 mg/dL; T180, percent time >180 mg/dL; T54, percent time <54 mg/dL; T70, percent time <70 mg/dL; Tmax, corresponding time to Gmax (Time [min]).

Characterization of glycemia of the three autoantibodies groups based on 7-day overnight CGM data

Overnight CGM traces with a 6-h duration, from 12:00 am to 6:00 am, were extracted from 60 participants. A total of 406 overnight CGM traces were extracted, and the set of 12 glycemic features mentioned earlier was computed from them. Fifty participants had 7 days, six participants had 6 days, and four participants had 5 days of overnight traces. IAUC was the only glycemic feature that differed significantly between the three autoantibody groups, with higher IAUC for those with ≥2 autoantibodies (negative vs. 1 autoantibody, P = 0.012; negative vs. ≥2 autoantibodies, P = 0.005; 1 autoantibody vs. ≥2 autoantibodies, P = 0.012), as shown in Figure 2b.

In addition, T180 and Range with P = 0.060, P = 0.087, respectively, almost reached significance, as shown in Figure 2b. Several metrics appear highly correlated. For example, the correlation between HBGI and T140, T160, and T180 was r ≥ 0.89, and the correlation between Range and SD, and CV was r ≥ 0.93. Notably, the correlation between IAUC and all other features was weak except with SD (r = 0.48).

Characterization of glycemia of the three autoantibodies groups based on post-SLMM data

Post-SLMM CGM traces were extracted from 53 participants. We excluded seven participants from the analysis (three from the negative group, three from the 1-autoantibody group, and one from the ≥2-autoantibody group): six of them had breakfast after the SLMM, and one had breakfast 30 min before the SLMM. CGM traces after the SLMM were first processed, and the window length at which post-SLMM excursions differed significantly between groups was found to be t = 75 min. Post-SLMM CGM traces (n = 142) of 75 min (47 traces from the zero-autoantibody group, 40 from the 1-autoantibody group, and 55 from the ≥2-autoantibody group) were extracted from the 53 participants, where 75.6% of those participants completed all three SLMM loads, 16.9% completed only two sessions, and 7.5% completed only one session.

Then, the set of nine glycemic features mentioned in Materials and Methods section (PostSLMM) was computed. The only glycemic feature that differed significantly between the three autoantibody groups was T180 (P = 0.004; negative vs. 1 autoantibody, P = 1.000; negative vs. ≥2 autoantibodies, P = 0.012; 1 autoantibody vs. ≥2 autoantibodies, P = 0.018), as shown in Figure 2c. In addition, Tmax almost reached significance (P = 0.053), with higher Tmax for those with 1 autoantibody and ≥2 autoantibodies, as shown in Figure 2c. T140, T160, IAUC, and Gmax were highly correlated features (r ≥ 0.71), whereas the correlation between Tmax and all other features was very weak, except with the slope S and G75 (r = 0.53 and r = 0.45, respectively).

Characterization of glycemia of the Ab+ versus Ab− participants

OVERALL

Sixty observations and 12 glycemic features are contained in the entire 7 days of CGM traces data set, including 65% of all participants in the Ab+ class and the remaining 35% in the Ab– class (39 Ab+ vs. 21 Ab–). T180 of the 12 glycemic features was the only statistically significant difference between both classes with P = 0.041 (Fig. 3a).

FIG. 3.

Characterization of CGM data through different glycemic features in different scenarios. (a) Boxplots for 12 different glycemic features extracted from the entire 7 days of CGM traces for 60 participants in two different groups of autoantibodies (Ab+/Ab−). (b) Boxplot for 12 features extracted from overnight (i.e., 12:00–06:00) CGM traces for 60 participants in two different groups of Ab. (c) Boxplots for 9 features extracted from 75-min post-SLMM CGM traces for 53 participants in two different groups of Ab. A significance level of 5% (P-value <0.05) was considered significant to distinguish between the different groups of Ab (P-value highlighted in red).

OVERNIGHT

Sixty observations and 12 glycemic features are contained in the overnight CGM traces data set, with the same proportion of participants in both autoantibody classes as in OVERALL. The overnight CGM IAUC and T180 differed significantly between the Ab+ and Ab– classes, with P = 0.001 and P = 0.019, respectively (Fig. 3b).

PostSLMM

Fifty-three observations and nine glycemic features are contained in the post-SLMM CGM traces data set, including 66% of participants in the Ab+ class and the remaining 34% in the Ab– class (35 Ab+ vs. 18 Ab–). Tmax was the only statistically significant difference between both classes of autoantibodies with P = 0.026 (Fig. 3c).

Defining classifier models based on the Ab+ versus Ab− groups

As the data sets in the three scenarios described earlier were “imbalanced” according to the autoantibody class distribution, we followed the balancing procedure described in Materials and Methods section before applying any of the classifier models. The four binary classifier models with a 10-fold cross-validation technique and oversampling were implemented first with only the significant features, and then with all the glycemic features from the three different scenarios, to classify participants in terms of presence (Ab+) or absence (Ab–) of autoantibodies.

OVERALL

The linear SVM classifier model outperforms the other classifier models with a mean AUC-ROC of 0.67, when using T180 as a significant feature, to classify those participants in different autoantibodies classes, as shown in the first column of Table 2. Using the 12 extracted features in the four binary classifier models did not improve the classification accuracy, as shown in the first column of Table 3, where the LR classifier model outperforms the other classifier models with a mean AUC-ROC of 0.69.

Table 2.

Comparison of Classification Performance of Four Models with Oversampling Technique in Terms of Receiver Operating Characteristic Area Under the Curve Based on Different Groups of Autoantibodies (i.e., Ab+ vs. Ab–) in Different Scenarios (i.e., Using Glycemic Features Extracted from the Entire 7 Days of Continuous Glucose Monitoring [CGM] Traces vs. Features Extracted from Overnight [i.e., 12:00–06:00] CGM Traces vs. Features Extracted from 75-Min Post-Standardized Liquid Mixed Meals [SLMM] CGM Traces vs. Mixing Overnight Features and SLMM Features), When We Defined the Four Models by Using Only the Significant Features for Each Scenario

Classification models | AUC-ROC Overall CGM (one feature; T180) | AUC-ROC Overnight (two features; IAUC, T180) | AUC-ROC SLMM (one feature; Tmax) | AUC-ROC Overnight and SLMM (three features; IAUC, T180, Tmax)
LDA | 0.627 | 0.754 | 0.789 | 0.778
SVM + Linear Kernel | 0.671 | 0.758 | 0.777 | 0.811
LR | 0.657 | 0.794 | 0.789 | 0.786
KNN | 0.627 | 0.661 | 0.728 | 0.777

AUC-ROC, receiver operating characteristic area under the curve; CGM, continuous glucose monitoring; IAUC, incremental area under the curve; KNN, K-nearest neighbors; LDA, linear discriminant analysis; LR, logistic regression; SLMM, standardized liquid mixed meals; SVM, support vector machine. AUC-ROC values in boldface indicate the best performance.

Table 3.

Comparison of Classification Performance of Four Models with Oversampling Technique in Terms of Receiver Operating Characteristic Area Under the Curve Based on Different Groups of Autoantibodies (i.e., Ab+ vs. Ab–) in Different Scenarios, When We Defined the Four Models by Using All the Features for Each Scenario

Classification models | AUC-ROC Overall CGM (12 features) | AUC-ROC Overnight (12 features) | AUC-ROC SLMM (9 features) | AUC-ROC Overnight and SLMM (21 features)
LDA | 0.679 | 0.679 | 0.804 | 0.693
SVM + Linear Kernel | 0.672 | 0.812 | 0.825 | 0.776
LR | 0.692 | 0.765 | 0.778 | 0.715
KNN | 0.639 | 0.621 | 0.776 | 0.760

AUC-ROC, receiver operating characteristic area under the curve; CGM, continuous glucose monitoring; IAUC, incremental area under the curve; KNN, K-nearest neighbors; LDA, linear discriminant analysis; LR, logistic regression; SLMM, standardized liquid mixed meals; SVM, support vector machine. AUC-ROC values in boldface indicate the best performance.

OVERNIGHT

Using IAUC and T180 as significant features leads to a noticeable improvement, as shown in the second column of Table 2, where the LR classifier model outperforms the other classifier models with a mean AUC-ROC of 0.79. Using all 12 extracted features also leads to a notable improvement in classification accuracy: a linear SVM classifier model outperforms the other classifier models with a mean AUC-ROC of 0.81, as shown in the second column of Table 3.

PostSLMM

Using Tmax only as a significant feature leads also to a noticeable improvement, as shown in the third column of Table 2. LR and LDA classifier models outperform the other classifier models with a mean AUC-ROC of 0.79. More improvement was achieved when using the nine features, where a linear SVM classifier model outperforms the other classifier models with a mean AUC-ROC of 0.83, as shown in the third column of Table 3.

In addition, using the significant features from OVERNIGHT and PostSLMM together (i.e., T180, IAUC, and Tmax), improved the classification accuracy, and a linear SVM classifier model outperforms the other classifier models with a mean AUC-ROC of 0.81, as shown in the fourth column of Table 2. However, mixing all the extracted features from both scenarios did not improve the accuracy of classification, as shown in the fourth column of Table 3.

Discussion

In this study, we used data from a recent NIH-funded TrialNet ancillary study involving relatives of people with T1D aged 12–42 years to characterize the extent to which features derived from a 1-week CGM home test can stratify individuals with different numbers of T1D-specific autoantibodies. Standard metrics, such as MG, SD, and CV, were unable to stratify the different autoantibody groups in the overall 7-day or overnight CGM traces. In contrast, T180 based on the overall 7-day CGM traces distinguished between the three autoantibody groups, as did the CGM IAUC based on the overnight CGM traces, where IAUC was lower in the Ab– group than in the Ab+ group.

In addition, in the post-SLMM periods, T180 differed significantly between the three autoantibody groups, and Tmax approached significance. Therefore, the highest glucose excursions (T180) appear as a metric that differentiates between the three autoantibody groups, likely driven by different meal responses. This is in line with what was observed previously for children with median age 11.5 years,20 with the caveat that in our study T140 was not as predictive as T180. In contrast, the ability of overnight IAUC to distinguish between the different groups suggests that participants with a lower number of autoantibodies reach their baseline glucose values faster.

The data collected during the home CGM study and the glycemia metrics/features derived from it allowed the use of machine-learning methodology to develop an autoantibodies status classifier. Notably, features based on the complete 7-day CGM traces were unable to classify with sufficient accuracy the Ab+ versus Ab− participants, but the overnight and post-SLMM CGM traces were able to better capture the differences between the groups. Glycemic features extracted from the overnight and post-SLMM CGM traces were able to distinguish the Ab+ versus Ab− participants, and predict the autoantibodies status with only a small number of significant features such as T180 and IAUC from the overnight traces, and Tmax from the post-SLMM traces.

The proposed autoantibody classification methodology, which combines the CGM home test data with a linear SVM-based classifier, was able to predict the presence or absence of autoantibodies in a participant with good discrimination (AUC-ROC ≥0.81). Overall, these results support the notion that adding the SLMM intervention to the home CGM test improves our ability to use the test to distinguish Ab+ versus Ab− participants with a small number of features that have different but complementary physiological meaning. We also note that the proposed technology allows addressing not only the question of classifying Ab+ versus Ab−, but also exploring the option of classifying low-risk (zero and one autoantibody) versus high-risk (two or more autoantibodies; Stage 1 and 2) individuals.
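For readers less familiar with the AUC-ROC figure quoted above, it can be read as the probability that a randomly chosen Ab+ participant receives a higher classifier score than a randomly chosen Ab− participant. The sketch below computes it directly from that definition. The function name and the example scores are hypothetical and are not taken from the study.

```python
def auc_roc(scores, labels):
    """AUC-ROC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier scores (e.g., signed distances from an SVM hyperplane).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]  # 0 = Ab-, 1 = Ab+
print(auc_roc(example_scores, example_labels))  # 3 of 4 pairs ranked correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.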

As mentioned in the introduction, a recent study of participants in the TrialNet Pathway to Prevention study has identified several CGM-derived metrics of hyperglycemia that are significantly associated with rapid progression to Stage 3 disease, including in those with normal OGTT results.23 These metrics are based on spending a selected percentage of time (5% or 8%) with glucose above different thresholds (e.g., over 120, 140, and 160 mg/dL). Even though our technology is not tailored to stratify progressors to Stage 3 from nonprogressors, it identifies new metrics derived from the overnight and post-SLMM CGM periods that can be explored to estimate the imminent risk for progression to Stage 3 T1D.
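The kind of rule described above (a percentage of time spent above a glucose threshold) can be sketched as a small grid of checks. This is only an illustration of the idea, not the cited study's method: the function name is hypothetical, the trace is made up, equally spaced readings are assumed, and whether the comparison should be "at least" or "more than" the cutoff is an assumption.

```python
def time_above_flags(glucose_mg_dl, thresholds=(120, 140, 160),
                     cutoffs_pct=(5.0, 8.0)):
    """For each (threshold, cutoff) pair, report whether the percent of
    readings above the threshold reaches the cutoff.

    Assumption: 'reaches' is implemented as >=; a cited rule may differ.
    """
    n = len(glucose_mg_dl)
    if n == 0:
        raise ValueError("empty CGM trace")
    flags = {}
    for thr in thresholds:
        pct = 100.0 * sum(1 for g in glucose_mg_dl if g > thr) / n
        for cut in cutoffs_pct:
            flags[(thr, cut)] = pct >= cut
    return flags


# Made-up trace of 100 readings: 10% above 120 mg/dL, 6% above 140 mg/dL,
# none above 160 mg/dL.
trace = [100] * 90 + [130] * 4 + [150] * 6
for key, value in sorted(time_above_flags(trace).items()):
    print(key, value)
```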

The proposed CGM home test can be self-administered after a carefully designed interactive online teaching session and would not require a visit to a health care facility or the use of a medical laboratory. Therefore, it could be used as an alternative or in addition to current home screening methods such as the GTT@home (https://www.digostics.com) and the self-collected capillary blood autoantibodies test currently employed by TrialNet.29 It can provide additional information on the level of dysglycemia that cannot be obtained from a single finger stick for autoantibody presence or from a genetic test.

Future studies will need to determine whether it can also complement other T1D risk biomarkers (including genetic ones) to better estimate autoantibody status and the overall risk of developing T1D, and/or to separate progressors from nonprogressors among autoantibody-positive individuals. Ultimately, this could provide insight into when to initiate therapy, potentially avoiding cases of DKA and highlighting individuals who could benefit from future immune-modulatory interventions such as teplizumab.30

This study benefited from prospectively collected data from relatives of people with T1D who had known autoantibody status and were involved in TrialNet studies. Its limitations include the relatively small number of participants; the exclusion from the post-SLMM analysis of 7 of the 60 participants, who had breakfast around the time of the SLMM; the small number of CGM days available for CGM-based characterization of glycemia; and the lack of more detailed information on the autoantibodies (type, confirmation, persistence, etc.). As such, we were not able to perform a meaningful comparison of Stage 1 versus Stage 2 participants within the ≥2 autoantibodies group, as would be consistent with the current understanding of the pathophysiology of T1D. A future study using an independent sample will be needed to confirm the performance of the tested machine-learning methods and the predictive power of the selected features.

In addition, the data used for developing the autoantibodies classifier originated from a limited population of volunteers who have relatives with T1D and are between 12 and 42 years of age. Finally, in this study we used data collected with the Dexcom G4 Platinum CGM rather than the more advanced G6 model typically used in recent studies. We cannot objectively assess the implications of using an older CGM, but we have no reason to expect the outcomes to be sensor-specific. Moreover, newer sensors have many advantages, including improved usability and longer duration of use, and are better candidates for providing data for the methodology proposed in this study.

Conclusions

In conclusion, in the early stages of progression to T1D, a CGM-based test can reveal increasing levels of dysglycemia that may initially be too subtle to cause any visible symptoms, but whose progression over time, if monitored, could lead to early diagnosis and avoidance of DKA and hospital admissions. In the very early stages of the disease, glycemia metrics derived from a 1-week CGM home test were able to differentiate between individuals with different autoantibody status across different scenarios. Using machine learning further allowed us to develop a method to distinguish CGM patterns between individuals without versus with T1D autoantibodies, based on an assessment performed at home. If applied broadly, this approach could help improve T1D risk detection, potentially alerting individuals for early diagnosis or prevention.

Acknowledgments

The authors would like to thank the UVA Center for Diabetes Technology Data Team for organizing the CGM home study data and the TrialNet Coordinating Center for assistance with recruitment of the autoantibody positive participants.

Authors' Contributions

E.M. analyzed and interpreted data, developed the classifier theory, performed the computations, conducted the statistical analyses, and wrote the article. L.S.F. obtained funding, designed the study and its concepts, participated in the analysis, methodology development, and interpretation of the data, and edited and revised the article. S.A.B. and M.D.D. carried out the clinical study and participated in its design. M.D.B. participated in the analyses, the development of the analytical methodology, and the original study design, and reviewed and edited the article. B.K. reviewed the article and participated in the development of the analytical methodology and in the original study design. All authors contributed to the revision of the article and approved the final version. L.S.F. is the guarantor of this study and takes full responsibility for the integrity of the study.

Disclaimer

The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the NIH or the Juvenile Diabetes Research Foundation (JDRF).

Author Disclosure Statement

E.M. has nothing to declare. M.D.B. declares research support handled by the University of Virginia by Dexcom, Novo Nordisk, and Tandem. S.A.B. declares research support handled by the University of Virginia by Dexcom, Insulet Corporation, Roche Diagnostics USA, Tandem Diabetes Care, and Tolerion. M.D.D. declares research support handled by the University of Virginia by Dexcom, Tandem Diabetes Care. B.K. declares research support handled by the University of Virginia by Dexcom, Novo Nordisk, Tandem Diabetes Care; patent royalties handled by the University of Virginia by Dexcom, Johnson & Johnson, Novo Nordisk, Sanofi; presentation honoraria by Tandem Diabetes Care. L.S.F. declares research support handled by the University of Virginia by Dexcom and Novo Nordisk.

Funding Information

This study was supported by NIH grant DP3DK106907 (NIH-funded TrialNet ancillary study), Commonwealth Research Commercialization Fund (CRCF) Award MF20-007-LS, Helmsley Charitable Trust (grant no. 2204-05134), and JDRF Award 2-SRA-2022-1260-S-B. The Type 1 Diabetes TrialNet Study Group is a clinical trials network funded through a cooperative agreement by the NIH through the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), the National Institute of Allergy and Infectious Diseases (NIAID), and the Eunice Kennedy Shriver National Institute of Child Health and Human Development, and by JDRF.

References

  • 1. Insel RA, Dunne JL, Atkinson MA, et al. Staging presymptomatic type 1 diabetes: A scientific statement of JDRF, the Endocrine Society, and the American Diabetes Association. Diabetes Care 2015;38(10):1964–1974; doi: 10.2337/dc15-1419
  • 2. Ziegler AG, Rewers M, Simell O, et al. Seroconversion to multiple islet autoantibodies and risk of progression to diabetes in children. JAMA 2013;309(23):2473–2479; doi: 10.1001/jama.2013.6285
  • 3. Wherrett DK, Chiang JL, Delamater AM, et al.; Type 1 Diabetes TrialNet Study Group. Defining pathways for development of disease-modifying therapies in children with type 1 diabetes: A consensus report. Diabetes Care 2015;38(10):1975–1985; doi: 10.2337/dc15-1429
  • 4. Elding Larsson H, Vehik K, Bell R, et al.; TEDDY Study Group; SEARCH Study Group; Swediabkids Study Group; DPV Study Group; Finnish Diabetes Registry Study Group. Reduced prevalence of diabetic ketoacidosis at diagnosis of type 1 diabetes in young children participating in longitudinal follow-up. Diabetes Care 2011;34(11):2347–2352; doi: 10.2337/dc11-1026
  • 5. Hekkala AM, Ilonen J, Toppari J, et al. Ketoacidosis at diagnosis of type 1 diabetes: Effect of prospective studies with newborn genetic screening and follow up of risk children. Pediatr Diabetes 2018;19(2):314–319; doi: 10.1111/pedi.12541
  • 6. Ghetti S, Kuppermann N, Rewers A, et al. Cognitive function following diabetic ketoacidosis in children with new-onset or previously diagnosed type 1 diabetes. Diabetes Care 2020;43(11):2768–2775; doi: 10.2337/dc20-0187
  • 7. Fredheim S, Johannesen J, Johansen A, et al.; Danish Society for Diabetes in Childhood and Adolescence. Diabetic ketoacidosis at the onset of type 1 diabetes is associated with future HbA1c levels. Diabetologia 2013;56(5):995–1003; doi: 10.1007/s00125-013-2850-z
  • 8. Duca LM, Wang B, Rewers M, et al. Diabetic ketoacidosis at diagnosis of type 1 diabetes predicts poor long-term glycemic control. Diabetes Care 2017;40(9):1249–1255; doi: 10.2337/dc17-0558
  • 9. Aye T, Mazaika PK, Mauras N, et al.; Diabetes Research in Children Network (DirecNet) Study Group. Impact of early diabetic ketoacidosis on the developing brain. Diabetes Care 2019;42(3):443–449; doi: 10.2337/dc18-1405
  • 10. Lundgren M, Jonsdottir B, Elding Larsson H. Effect of screening for type 1 diabetes on early metabolic control: The DiPiS study. Diabetologia 2019;62(1):53–57; doi: 10.1007/s00125-018-4706-z
  • 11. Jacobsen LM, Haller MJ, Schatz DA. Understanding pre-type 1 diabetes: The key to prevention. Front Endocrinol (Lausanne) 2018;9:70; doi: 10.3389/fendo.2018.00070
  • 12. Greenbaum CJ. A key to T1D prevention: Screening and monitoring relatives as part of clinical care. Diabetes 2021;70(5):1029–1037; doi: 10.2337/db20-1112
  • 13. Tillil H, Kobberling J. Age-corrected empirical genetic risk estimates for first degree relatives of IDDM patients. Diabetes 1987;36(1):93–99; doi: 10.2337/diab.36.1.93
  • 14. Sims EK, Besser REJ, Dayan C, et al. Screening for type 1 diabetes in the general population: A status report and perspective. Diabetes 2022;71(4):610–623; doi: 10.2337/dbi20-0054
  • 15. Vigersky R, Shrivastav M. Role of continuous glucose monitoring for type 2 in diabetes management and research. J Diabetes Complications 2017;31(1):280–287; doi: 10.1016/j.jdiacomp.2016.10.007
  • 16. Zou CC, Liang L, Hong F, et al. Glucose metabolism disorder in obese children assessed by continuous glucose monitoring system. World J Pediatr 2008;4(1):26–30; doi: 10.1007/s12519-008-0005-y
  • 17. Ehrhardt N, Al Zaghal E. Behavior modification in prediabetes and diabetes: Potential use of real-time continuous glucose monitoring. J Diabetes Sci Technol 2019;13(2):271–275; doi: 10.1177/1932296818790994
  • 18. Steck AK, Dong F, Taki I, et al. Early hyperglycemia detected by continuous glucose monitoring in children at risk for type 1 diabetes. Diabetes Care 2014;37(7):2031–2033; doi: 10.2337/dc13-2965
  • 19. Steck AK, Dong F, Taki I, et al. Continuous glucose monitoring predicts progression to diabetes in autoantibody positive children. J Clin Endocrinol Metab 2019;104(8):3337–3344; doi: 10.1210/jc.2018-02196
  • 20. Steck AK, Dong F, Geno Rasmussen C, et al. CGM metrics predict imminent progression to type 1 diabetes: Autoimmunity screening for kids (ASK) study. Diabetes Care 2022;45(2):365–371; doi: 10.2337/dc21-0602
  • 21. Madhu SV, Muduli SK, Avasthi R. Abnormal glycemic profiles by CGMS in obese first-degree relatives of type 2 diabetes mellitus patients. Diabetes Technol Ther 2013;15(6):461–465; doi: 10.1089/dia.2012.0333
  • 22. Chon S, Lee YJ, Fraterrigo G, et al. Evaluation of glycemic variability in well-controlled type 2 diabetes mellitus. Diabetes Technol Ther 2013;15(6):455–460; doi: 10.1089/dia.2012.0315
  • 23. Wilson DM, Pietropaolo SL, Acevedo-Calado M, et al. CGM metrics identify dysglycemic states in participants from the TrialNet pathway to prevention study. Diabetes Care 2023;46(3):526–534; doi: 10.2337/dc22-1297
  • 24. Acciaroli G, Sparacino G, Hakaste L, et al. Diabetes and prediabetes classification using glycemic variability indices from continuous glucose monitoring data. J Diabetes Sci Technol 2018;12(1):105–113; doi: 10.1177/1932296817710478
  • 25. Güemes A, Cappon G, Hernandez B, et al. Predicting quality of overnight glycaemic control in type 1 diabetes using binary classifiers. IEEE J Biomed Health Inform 2020;24(5):1439–1446; doi: 10.1109/JBHI.2019.2938305
  • 26. Kovatchev BP. Metrics for glycaemic control—From HbA1c to continuous glucose monitoring. Nat Rev Endocrinol 2017;13(7):425–436; doi: 10.1038/nrendo.2017.3
  • 27. Hastie T, Tibshirani R, Friedman JH. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer: New York, NY, USA; 2009;2; pp. 1–758; doi: 10.1007/978-0-387-21606-5
  • 28. Sharma S. Applied Multivariate Techniques. John Wiley & Sons, Inc., New York, NY, USA; 1996; doi: 10.5555/225519
  • 29. Liu Y, Rafkin LE, Matheson D, et al.; Type 1 Diabetes TrialNet Study Group. Use of self-collected capillary blood samples for islet autoantibody screening in relatives: A feasibility and acceptability study. Diabet Med 2017;34(7):934–937; doi: 10.1111/dme.13338
  • 30. Herold KC, Bundy BN, Long SA, et al.; Type 1 Diabetes TrialNet Study Group. An anti-CD3 antibody, teplizumab, in relatives at risk for type 1 diabetes. N Engl J Med 2019;381(7):603–613; doi: 10.1056/NEJMoa1902226

Articles from Diabetes Technology & Therapeutics are provided here courtesy of Mary Ann Liebert, Inc.
