Abstract
Objective:
Hyperglycemia is a feature of worse brain injury after acute ischemic stroke, but the underlying metabolic changes and the link to cytotoxic brain injury are not fully understood. In this observational study, we applied regression and machine learning classification analyses to identify metabolites associated with hyperglycemia and a neuroimaging proxy for cytotoxic brain injury.
Methods:
Metabolomics and lipidomics were carried out using liquid chromatography-tandem mass spectrometry in admission plasma samples from 381 patients presenting with an acute stroke. Glucose was measured by a central clinical laboratory, and a subgroup of patients (n=201) had apparent diffusion coefficient (ADC) imaging quantified on magnetic resonance imaging (MRI) to estimate cytotoxic injury.
Results:
Uric acid was the leading metabolite in univariate analysis of both hyperglycemia (OR 19.6, 95% CI 8.6 – 44.7, P=1.44x10−12) and ADC (OR 5.3, 95% CI 2.2 – 13.0, P=2.42x10−4). To further prioritize model features and account for nonlinear correlation structure, a random forest machine learning algorithm was applied to separately model hyperglycemia and ADC. Across these statistical techniques, uric acid and gluconic acid emerged as the leading candidate markers common to all models (R2=68%, P=2.2 x 10−10 for uric acid; R2=15%, P=8.09 x 10−10 for gluconic acid).
Conclusion:
Both uric acid and gluconic acid were associated with hyperglycemia and cytotoxic brain injury. Both metabolites are linked to oxidative stress, which highlights two candidate targets for limiting brain injury after stroke.
Keywords: Machine learning, metabolomics, stroke, hyperglycemia, neuroimaging
INTRODUCTION
Hyperglycemia in stroke patients is an important marker of mortality and outcome after stroke in both diabetic and nondiabetic populations [2,3]. Even moderately elevated glucose levels are associated with both a higher risk of short-term mortality and an increased risk of poor functional recovery compared with lower glucose levels [4]. Hyperglycemia is further linked to the degree of apparent diffusion coefficient (ADC) restriction [5], an imaging measure which has been found in animal studies to be associated with increasing severity of cytotoxic injury [5-7]. Observational human studies [4,8,9] and preclinical stroke models [10,11] support a causal relationship between blood glucose and outcome. However, reduction of glucose level with insulin infusion does not appear to be a successful treatment strategy [12,13], suggesting a more complex relationship.
Several mechanisms have been postulated to account for hyperglycemia-related ischemic brain injury [3]. These include activation of the stress response with increased cortisol and sympathetic catecholamine release [14], impaired perfusion of ischemic brain [15,16], increased anaerobic glycolysis with lactic acidosis [17,18], and exacerbation of oxidative stress with increased production of reactive oxygen species [19,20]. Resolving these potential mechanisms can have important implications for stroke-related hyperglycemia, as well as other critical care conditions associated with impaired glucose metabolism.
To investigate these possible mechanisms, we studied the Specialized Program of Translational Research in Acute Stroke (SPOTRIAS) study, which enrolled patients who presented with symptoms consistent with ischemic stroke within 9 hours of stroke onset. We performed metabolic phenotyping of plasma samples using liquid chromatography-tandem mass spectrometry (LC-MS/MS) and quantitative analysis of ADC maps as a radiographic proxy for the degree of cytotoxic injury. We analyzed the data using logistic regression and a random forest (RF) machine learning algorithm, which prioritized leading candidate markers in relation to hyperglycemia and cytotoxic brain injury. The value of RF analysis stems from its ability to handle complex non-linear and non-normally distributed factors, although consideration of non-linear interactions and the use of concordant statistical methods may be underutilized [21]. We hypothesized that the analysis of circulating metabolites coupled with magnetic resonance imaging (MRI) features would provide insight into metabolic-mediated hyperglycemic injury and brain cytotoxic injury.
MATERIALS AND METHODS
Study population
The design of the SPOTRIAS trial has been described in detail elsewhere [22,23]. Briefly, the trial enrolled patients age 18 years or greater who presented with symptoms consistent with ischemic stroke within 9 hours of stroke onset between January 2007 and April 2010. Patients were enrolled at two sites (Massachusetts General and Brigham and Women’s Hospitals) and were eligible if the National Institutes of Health Stroke Scale (NIHSS) score was ≥1. All patients (N=522) or their surrogates provided informed consent, and the study was approved by both participating institutional review boards and was performed in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.
In the current study, patients who had plasma available for metabolic phenotyping (N=381) were included, and a subgroup also had acute brain MRI (N=201) (Table 1). EDTA blood samples were collected at admission (corresponding to 7.1 ± 3.3 hours after stroke onset, referenced from the time when they were last-seen-well) and immediately centrifuged to separate cellular material from plasma. Aliquots of plasma were frozen on dry ice and stored at −80 °C until analysis.
Table 1: Study cohort characteristics.
Out of 522 enrolled patients, 381 had plasma metabolomics and, within this group, 201 patients had magnetic resonance imaging (MRI) data. The patients with plasma samples were comparable to the MRI subgroup in all assessed baseline characteristics.
| | Metabolomics cohort (n=381) | MRI subgroup (n=201) | P |
|---|---|---|---|
| Age (years), mean ± SD | 69±15 | 69±15 | 0.86 |
| Sex, male, N (%) | 223 (59%) | 127 (63%) | 0.28 |
| Hyperlipidemia, N (%) | 185 (49%) | 91 (45%) | 0.45 |
| Hyperglycemia (glucose≥130 mg/dL), N (%) | 135 (35%) | 66 (33%) | 0.53 |
| IV tPA, N (%) | 173 (46%) | 98 (50%) | 0.41 |
| Admission NIHSS, median [IQR] | 5 [2.5, 12] | 6 [3, 14] | 0.57 |
| Time from LSW to blood draw (h), mean±SD | 7.0±3.3 | 7.2±3.4 | 0.36 |
| Baseline blood glucose (mg/dL), mean±SD | 136.4±63.5 | 132.4±51.7 | 0.63 |
| ADCr, median [IQR] | | 0.69 [0.64, 0.76] | |
| Time to MRI (h), mean±SD | | 5.6±3.8 | |
Plasma glucose values were measured in the central clinical laboratory as part of standard of care and these values were abstracted from the medical record. Dichotomization was utilized to highlight the clinically relevant information of high and low glucose concentrations based on a cutoff value of ≥130 mg/dL for hyperglycemia defined by thresholds used in previous studies [25].
Imaging analysis
ADC images were acquired as part of standard clinical care, and thus were obtained on multiple scanners. In general, diffusion sequences were obtained using b values of 0 and 1000 s/mm2, and ADC maps were then calculated using the relationship ADC = −ln(S1000 / S0) / 1000, where S1000 and S0 represent the signal intensities of the B1000 and B0 images, respectively. ADC analysis was conducted using a semi-automated method in Analyze 11.0 (Biomedical Imaging Resource, Mayo Clinic, Rochester, MN, USA), as previously reported [23,25]. The apparent diffusion coefficient ratio (ADCr) was calculated by taking the ratio of the mean signal intensity in the stroke region-of-interest to the mean signal intensity of the contralateral hemisphere. Patients with bilateral stroke were excluded from analysis. Patients with other brain lesions, including hemorrhage or mass, were not included in the original study cohort. Standardization against the contralateral hemisphere allowed us to minimize any variation introduced by acquiring images on different scanners [26,27]. Values were presented as a percent of baseline ADC signal, and were dichotomized into high and low ADC values based on a previously established threshold of 64% [23], which was associated with clinical outcome.
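The two calculations described above can be illustrated with a minimal Python sketch. It assumes the standard mono-exponential diffusion model, S_b = S_0 · exp(−b · ADC), from which ADC = −ln(S_b / S_0) / b. The function names and the voxel signal values are hypothetical and are not taken from the study, and treating an ADCr strictly below 0.64 as severe injury is an assumption about how the boundary case is handled.

```python
import math

def adc_from_signals(s_b, s_0, b=1000.0):
    """ADC (mm^2/s) from the mono-exponential model S_b = S_0 * exp(-b * ADC)."""
    if s_b <= 0 or s_0 <= 0:
        raise ValueError("signal intensities must be positive")
    return -math.log(s_b / s_0) / b

def adc_ratio(lesion_adc, contralateral_adc):
    """ADCr: mean ADC in the stroke region over mean ADC in the contralateral hemisphere."""
    lesion_mean = sum(lesion_adc) / len(lesion_adc)
    contra_mean = sum(contralateral_adc) / len(contralateral_adc)
    return lesion_mean / contra_mean

def is_severe(adcr, threshold=0.64):
    """Dichotomize ADCr; values below the threshold are treated as severe injury (assumed)."""
    return adcr < threshold

# Hypothetical b=1000 voxel signals, each paired with a b=0 signal of 1000 (arbitrary units).
lesion = [adc_from_signals(s, 1000.0) for s in (620.0, 600.0, 640.0)]
contra = [adc_from_signals(s, 1000.0) for s in (450.0, 440.0, 460.0)]
adcr = adc_ratio(lesion, contra)
print(f"ADCr = {adcr:.2f}, severe = {is_severe(adcr)}")
```

Because ADCr is a ratio of two ADC values from the same scan, any scanner-specific scaling that affects both regions equally cancels out.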
Polar Metabolite Phenotyping
Polar metabolites were extracted using protein precipitation from 30 μL of EDTA plasma. The extracted metabolites from the supernatant were separated on Xbridge Amide columns (2.1×100 mm 3.5 μm, Waters) using our previously described methods [28,29]. With this method, 139 transitions were monitored (Supplementary Table 1). Peak integration for metabolite quantification was carried out using MassHunter QQQ Quantitative Analysis software (Agilent).
Lipid Metabolite phenotyping
Lipid metabolites were extracted from 50 μL of plasma by liquid-liquid extraction. Samples were analyzed using an ACE C18-PFP, 3 μm, Ultra-Inert HPLC Column, 150 x 2.1 mm for the detection of free fatty acids and endocannabinoids in negative and positive ionization, respectively. This part of the analysis quantified 16 metabolites (Supplementary Table 1).
For open profiling lipidomic analysis, 1 μL of each sample was injected and analyzed. Chromatographic separation was performed with a Kinetex C18 (2.1 mm × 100 mm, 2.6 μm) [30]. MassHunter QQQ Quantitative Analysis software (Agilent) was used for peak integration of 135 transitions (Supplementary Table 1).
Data analysis
Descriptive statistics of baseline variables and outcomes were performed, reported as mean ± standard deviation (for normally distributed continuous data), median with interquartile range ([IQR]; for non-normal or ordinal data), and proportions (for binary data).
The cohort was divided into those with and without hyperglycemia (defined as glucose ≥130 mg/dL) [24], and with mild vs. severe cytotoxic injury (defined as ADCr greater than or less than 64%, respectively) [23]. Univariate ordered logistic regression was applied to identify associated metabolites, with Bonferroni adjustment for multiple hypothesis testing. As a sensitivity analysis for dichotomization of hyperglycemia, linear regression analysis using glucose as a continuous variable was also carried out for the final metabolites, uric acid and gluconic acid. Variables were log-transformed to achieve data normality prior to linear regression. Statistical analyses were performed using STATA 14.2 (STATA Corporation, College Station, TX).
Random Forest Machine learning
The presented sample and data processing pipeline (Figure 1) summarizes the computational approach we applied in the study of acute stroke patients. To account for the non-linear, complex correlation structure, we used the Random Forest (RF) [21,31] machine learning ensemble method to model the metabolomics data and obtain predictive performance. Each decision tree was fitted to a bootstrapped sample of patients (the training samples used for model building), and the resulting trees were used to identify associations. Every bootstrapped sample had a corresponding left-out or 'out-of-bag' (OOB) sample, which served as the test set to assess the performance and robustness of the algorithm. Final associations were made using the average of 1000 predictions from the trees whose bootstrap sample did not contain the sample being predicted (test samples).
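The bootstrap and out-of-bag split described above can be sketched in a few lines of Python. This is an illustration of the general idea rather than the authors' R implementation; the function name, the random seed, and the number of trees used here are hypothetical, while the cohort size of 381 comes from the text.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; return (in_bag, out_of_bag) index lists."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag

rng = random.Random(0)   # fixed seed so the sketch is repeatable
N_PATIENTS = 381         # metabolomics cohort size
N_TREES = 200            # hypothetical; chosen only to keep the demo fast

# Count, for every patient, the number of trees in which that patient was out-of-bag.
oob_counts = [0] * N_PATIENTS
for _ in range(N_TREES):
    _, oob = bootstrap_split(N_PATIENTS, rng)
    for i in oob:
        oob_counts[i] += 1

mean_oob_fraction = sum(oob_counts) / (N_PATIENTS * N_TREES)
print(f"mean OOB fraction per tree: {mean_oob_fraction:.3f}")
```

On average roughly 37% of patients are left out of any one bootstrap sample, so every patient is scored by a sizeable subset of trees that never saw that patient during fitting.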
Figure 1: Data analysis pipeline.

Clinical phenotyping of eligible acute stroke patients was performed at enrollment. This involved magnetic resonance imaging (MRI) and collection of blood samples for clinical chemistry analysis. The imaging data were used to calculate the apparent diffusion coefficient ratio (ADCr), and plasma samples were used for metabolic phenotyping. Plasma samples were extracted using either protein precipitation or liquid-liquid extraction for metabolomics and lipidomics analysis, respectively. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used for quantitative analysis. The detected compounds were manually quality checked before statistical analysis. Linear regression and non-linear machine learning approaches were used to identify candidate markers associated with hyperglycemia and cytotoxic cell death.
We used RF as a classification method, with glucose and ADCr each labeled as a binary response variable. For the classification model, RF required some parameters to be set a priori: the number of trees (ntree) and the number of variables/features (e.g., lipids and/or metabolites) randomly sampled as candidates at each split (mtry). We used ntree=500 and mtry equal to the square root of the number of variables in our models [21].
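To make the mtry setting concrete, the following Python sketch computes the per-split candidate count for the 290 measured compounds and draws one random candidate set. Rounding the square root down is an assumption (the text states only "square root"), and the feature names and helper functions are placeholders.

```python
import math
import random

def default_mtry(n_features):
    """Candidate features per split: floor(sqrt(p)), never fewer than 1 (rounding assumed)."""
    return max(1, math.isqrt(n_features))

def candidates_at_split(feature_names, rng):
    """Randomly choose mtry features, without replacement, to evaluate at one split."""
    return rng.sample(feature_names, default_mtry(len(feature_names)))

N_TREES = 500                                        # ntree as stated in the text
features = [f"metabolite_{i}" for i in range(290)]   # placeholder names for 290 compounds
mtry = default_mtry(len(features))
split_candidates = candidates_at_split(features, random.Random(0))
print(mtry, split_candidates[:3])
```

Restricting each split to a small random subset of features decorrelates the trees, which is what lets averaging across the forest reduce variance.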
Stability analysis with marker selection
To select variables (metabolites), we iteratively fitted random forests, at each iteration building a new forest after discarding the 20% of metabolites with the smallest variable importance. The selected set of metabolites was then used as independent predictors to fit the model and check the OOB error rate. This procedure was carried out using the varSelRF function from the varSelRF package in the R statistical program. The selected markers were further used for generalized linear modeling (GLM), and the area under the curve (AUC) was estimated. Two distributions were produced using GLM, showing the difference between random AUC values and the actual AUC values. The plots were generated by permuting the labels and iterating over 1000 bootstrap samples, to confirm that the selected combination of features affects the outcome variable.
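The two ideas in this paragraph, dropping a fixed fraction of features per iteration and comparing an observed AUC with a label-permuted null distribution, can be sketched in Python as follows. This is a simplified illustration and not the varSelRF or GLM code used in the study: the model scores are held fixed instead of refitting a model on each permutation, the minimum of two retained features is an assumption, and the scores and labels are synthetic.

```python
import random

def elimination_schedule(n_features, drop_fraction=0.2, min_features=2):
    """Feature counts per iteration when the least important fraction is dropped each time."""
    counts = [n_features]
    while counts[-1] > min_features:
        n_drop = max(1, int(counts[-1] * drop_fraction))
        counts.append(max(min_features, counts[-1] - n_drop))
    return counts

def auc(scores, labels):
    """AUC as the chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_null(scores, labels, n_perm, rng):
    """AUC values obtained after randomly shuffling the class labels n_perm times."""
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(auc(scores, shuffled))
    return null

schedule = elimination_schedule(290)  # 290 compounds, 20% dropped per iteration

# Synthetic example: 30 negatives scored 0..29 and 30 positives scored 10..39.
labels = [0] * 30 + [1] * 30
scores = list(range(30)) + [10 + j for j in range(30)]
observed = auc(scores, labels)
null = permutation_null(scores, labels, 1000, random.Random(0))
null_mean = sum(null) / len(null)
p_value = (1 + sum(a >= observed for a in null)) / (1 + len(null))
print(schedule[:4], round(observed, 3), round(null_mean, 3), p_value)
```

A null distribution centered near 0.5 with the observed AUC far in its upper tail is the pattern that indicates the selected features carry real class information.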
We further used Generalized Additive Models (GAMs) [32] to capture the nonlinear structure of the data, using the gam library in R.
Linear regression and Mediation analysis
The association between glucose as a continuous variable and gluconic acid or uric acid was further investigated using linear regression models. The regression model was evaluated by the percentage of variation in glucose explained (R2) by each metabolite. We used lm() in R to estimate the coefficients and assess the significance of each metabolite. Mediation analysis was conducted to determine whether uric acid or gluconic acid influenced the association between glucose and ADCr. A four-step procedure was used [33,34], and the percent difference in the coefficient was measured after introducing the metabolite mediator. The Sobel test was applied to determine significance of the mediation effect, where p<0.05 was considered significant.
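As a rough illustration of the mediation arithmetic, the Python sketch below computes the Sobel statistic for an indirect effect and the percent change in the exposure coefficient, 1 − c'/c. It uses the common first-order Sobel standard error, which is an assumption about the variant applied in the study, and all coefficient values are hypothetical.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided normal p value for the indirect effect a*b."""
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_ab
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

def percent_mediated(c_total, c_direct):
    """Percent reduction in the exposure coefficient after adding the mediator: 1 - c'/c."""
    return 100.0 * (1.0 - c_direct / c_total)

# Hypothetical coefficients: a = exposure -> mediator, b = mediator -> outcome,
# c = total effect of exposure on outcome, c' = direct effect with the mediator included.
z, p = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
reduction = percent_mediated(c_total=1.0, c_direct=0.38)
print(f"Sobel z = {z:.2f}, p = {p:.4f}, mediated = {reduction:.0f}%")
```

Here a and b correspond to the second and third regression steps, and c and c' to the exposure coefficient before and after the mediator enters the model.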
Logistic regression analysis was further used to determine the relationship of uric acid, gluconic acid and glucose to long-term clinical outcome based on modified Rankin scale (mRS) dichotomized into good outcome (mRS 0-2) and bad outcome (mRS 3-6). Multivariable logistic regression was used to adjust for age, sex, creatinine, tPA treatment, and admission NIHSS score dichotomized at 7.
RESULTS
Study cohort
Of the 522 patients in the original acute stroke cohort, 381 had plasma samples available for analysis and the MRI subgroup included 201 participants. Metabolomics data was measured for 290 compounds from baseline blood samples, collected at 7.1±3.3 hours from stroke onset. MRI scans were acquired at 5.6±3.8 hours from stroke onset. Additional cohort characteristics are presented in Table 1. The average age of the study population was 69±15 years. The overall rate of hyperglycemia was 35% (n=135). Of those patients with admission hyperglycemia, 26% had a past medical history of diabetes mellitus and 36% had a HbA1c ≥6.5%. There were no differences in baseline characteristics between the patients with plasma samples and the MRI subgroup.
Univariate analysis
We first conducted univariate analysis using logistic regression to identify which metabolites are associated with hyperglycemia and ADCr. The summary of associations across all metabolites is shown in the volcano plots in Figure 2. Five metabolites were associated with admission hyperglycemia and met the threshold for significance after correction for multiple hypothesis testing (unadjusted P value < 1.72x10−4, red dots, Figure 2a), including uric acid (OR 19.6, 95% CI 8.6 – 44.7, P=1.44x10−12), pyruvic acid (OR 4.9, 95% CI 2.9 – 8.3, P=1.74x10−9), lactic acid (OR 9.1, 95% CI 4.2 – 19.9, P=2.41x10−8), trimethylamine-N-oxide (OR 4.1, 95% CI 2.2 – 7.6, P=9.00x10−6), and xanthurenic acid (OR 3.1, 95% CI 1.8 – 5.5, P=8.86x10−5). One metabolite was associated with ADCr [Figure 2b, glucose (as measured by LC-MS), OR 4.8, 95% CI 2.2 – 10.7, P=1.23x10−4], although uric acid was nearly significant after Bonferroni correction (OR 5.3, 95% CI 2.2 – 13.0, P=2.42x10−4).
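The significance threshold quoted above is consistent with a Bonferroni correction of a 0.05 family-wise level over the 290 measured compounds. The Python sketch below reproduces that arithmetic and applies it to the P values reported in this paragraph; the 0.05 level is an assumption inferred from the reported threshold, and the dictionary keys are labels chosen for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

N_COMPOUNDS = 290   # number of measured compounds
ALPHA = 0.05        # assumed family-wise error rate
threshold = bonferroni_threshold(ALPHA, N_COMPOUNDS)

# Unadjusted P values as reported in the text.
reported_p = {
    "uric acid vs hyperglycemia": 1.44e-12,
    "xanthurenic acid vs hyperglycemia": 8.86e-5,
    "glucose (LC-MS) vs ADCr": 1.23e-4,
    "uric acid vs ADCr": 2.42e-4,
}
significant = {name: p < threshold for name, p in reported_p.items()}
print(f"threshold = {threshold:.3g}")
for name, passed in significant.items():
    print(f"{name}: {'passes' if passed else 'does not pass'}")
```

This makes explicit why uric acid narrowly misses the threshold for ADCr while glucose passes it.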
Figure 2: Univariate comparisons among metabolites in stress hyperglycemia and cytotoxic brain injury.

a) The geometric mean ratio of each metabolite in hyperglycemic cases versus non-hyperglycemic cases is plotted against the respective P value. The dotted line represents the Bonferroni-corrected threshold, P<1.72x10−4. The red dots highlight metabolites exceeding the Bonferroni-corrected threshold. b) The geometric mean ratio of each metabolite in patients with severe cytotoxic brain injury versus those without, plotted against the respective P value.
Random Forest analysis of hyperglycemia
To account for the non-linear correlation structure of the metabolomics data and reveal important metabolite features, we used RF machine learning analysis to further prioritize candidate metabolomic and lipidomic markers of hyperglycemia. Using a backward selection process, we identified three features: uric acid, gluconic acid and glucose. Using those features, the model error was 23%. The identification of glucose by the RF model served as a positive control; we therefore also developed an additional model in which the glucose value was excluded.
The variable importance plot for this model included two metabolites, uric acid and gluconic acid, as shown in Figure 3a. The error rate was 25% using these two metabolites, whereas it was 31% using all the metabolites, confirming that these two metabolites are the key features that distinguish hyperglycemia from normoglycemia (Supplementary Table 2). Box plots for uric and gluconic acids are shown in Figure 3b, comparing patients with and without hyperglycemia. The association between glucose (as a continuous variable) and gluconic acid or uric acid was also evaluated using linear regression models. The percentage variation explained (R2) by uric acid for glucose was 68% (P<0.0001), and the R2 for gluconic acid was 15% (P<0.0001; Figure 3c). The metabolites were further evaluated by generalized linear modelling (GLM), which was performed 1000 times with bootstrapping by permuting the dichotomized labels. The mean value of a random permuted model provided an area under the curve (AUC) of 0.50, consistent with a model that provides no class separation. In contrast, the predictive accuracy of the actual model had an AUC=0.73 (Figure 3d).
Figure 3: Metabolites identified in association with hyperglycemia using random forest analysis.

a) Variable importance plot of uric acid and gluconic acid, which were identified from the RF analysis as associated with glucose. b) Box and whisker plots for gluconic acid and uric acid. Groups “Yes” and “No” indicate the presence or absence of hyperglycemia on admission, respectively. The p values for the difference are indicated above the box plots. c) Scatter plot showing the association of uric acid (right panel) and gluconic acid (left panel) with glucose as the outcome. Turquoise dots indicate the presence of admission hyperglycemia and red dots indicate its absence, based on the central clinical lab value. d) Two distributions were produced using a generalized linear model showing differences between random AUC values vs. original AUC values. The plots were generated by permuting the original labels and iterating 1000 bootstrapping samples.
Linking ADCr with metabolomics data using Random Forest method
Similar to the analysis of hyperglycemia, we used ADCr as an outcome variable to explore non-linear associations. The metabolomics data were used as predictor variables and the RF was trained in classification mode on the baseline samples. Using a backward selection process, we identified six features: allantoin, ATP, glucose, gluconic acid, uric acid, and a triglyceride C56:2. Using these features, the model error was 20%, whereas using all the metabolites the error rate was 24% (Supplementary Table 2).
GLM modeling with bootstrapping was used to test the ability of the metabolites to distinguish the dichotomized groups of ADCr. The model performance generated an AUC of 0.59. Univariate box plots for the individual metabolites are shown in Figure 4, with glucose and uric acid identified as the leading metabolites (P=0.0093 and P=0.0098, respectively). To further understand the relationship between hyperglycemia and ADCr, Pearson correlation analysis was carried out and revealed a weak inverse association between glucose and ADCr (r=−0.20, P=0.002); the percentage variation in glucose explained by ADCr was R2=3% (Supplementary Figure 1).
Figure 4: Random forest analysis of cytotoxic brain injury (ADCr) after stroke.

Box and whisker plots for selected metabolites. Yellow boxes correspond to the values in patients with low ADC values (i.e., worse cytotoxic brain injury) and blue boxes correspond to values in patients with high ADC (less cytotoxic injury). The p value for each metabolite is shown above the respective box plots.
To complement these findings, we performed a Generalized Additive Model (GAM) analysis with the selected features from the RF analysis for both glucose and ADCr as a binary outcome variable. For glucose as the outcome variable, we found uric acid was non-linearly associated at P=1.97x10−4, which was consistent with our RF analysis. For ADCr as the binary outcome variable, we found consistent associations with uric acid (P=0.01), glucose (P=0.03), triglyceride C56:2 (P=0.04), and allantoin (P=0.01; see Supplementary Figure 2).
Mediation analysis demonstrated that uric acid mediated 62% of the relationship between glucose and ADCr (Figure 5a; P=0.003). For gluconic acid, the mediation effect of 26% was not significant (Figure 5b; P=0.068).
Figure 5: Mediation analysis with uric acid and gluconic acid as mediators.

a) Uric acid is associated with both hyperglycemia (P<0.0001) and ADCr (P=0.0008). The strength of association between hyperglycemia and ADCr was reduced by 62% with the addition of uric acid (Sobel P=0.003). b) Gluconic acid satisfies the first two steps of mediation analysis, demonstrating an association with both hyperglycemia (P<0.0001) and ADCr (P=0.043). The addition of gluconic acid reduced the association between hyperglycemia and ADCr by 26%, which was not significant (Sobel P=0.068). The coefficient and P values are shown for each step (steps a, b, c, and c’), together with the percentage difference of the coefficients (1–c’/c).
Analysis of long-term clinical outcome
Logistic regression analysis was used to determine the relationship of uric acid and gluconic acid metabolites to good versus poor clinical outcome. Uric acid was associated with poor outcome in univariate analysis (OR 2.2, 95% CI 1.2 – 4.1, P=0.009), and remained significant after adjusting for age, sex, NIHSS, creatinine and tPA (OR 2.3, 95% CI 1.02 – 5.3, P=0.043). Gluconic acid was also associated with poor outcome (unadjusted OR 1.4, 95% CI 1.1 – 1.8, P=0.005); however, the association did not persist after adjusting for age, sex, NIHSS, creatinine and tPA (OR 0.99, 95% CI 0.5 – 1.9, P=0.994). Consistent with prior results, elevated glucose was associated with poor outcome in the unadjusted regression model (OR 2.84, 95% CI 1.4 – 6.0, P=0.006), although the association was not statistically significant in the adjusted model (OR 1.7, 95% CI 0.9 – 3.5, P=0.13).
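For readers who want to relate the odds ratios, confidence intervals, and P values above, the Python sketch below shows the usual Wald relationships on the log-odds scale, using the unadjusted uric acid estimate as input. Whether the study's P values are Wald-based is an assumption, and because the published figures are rounded, the recovered P value can only approximate the reported one.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper, z=Z_95):
    """Standard error of the log odds ratio recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def wald_p(odds_ratio, se):
    """Two-sided Wald P value for the null hypothesis OR = 1."""
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and 95% CI from a logistic regression coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Unadjusted uric acid estimate from the text: OR 2.2, 95% CI 1.2 to 4.1.
se = se_from_ci(1.2, 4.1)
p_approx = wald_p(2.2, se)
or_check, lo_check, hi_check = odds_ratio_ci(math.log(2.2), se)
print(f"SE(log OR) = {se:.3f}, approximate P = {p_approx:.3f}")
```

The interval is symmetric around the estimate only on the log scale, which is why the reported bounds are not equidistant from the odds ratio itself.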
DISCUSSION
In this study, we used a combination of metabolomics, lipidomics, and MRI to elucidate the relationships between ischemic brain injury and hyperglycemia after stroke. We applied both traditional logistic regression and machine learning algorithms, which are increasingly being utilized for large scale -omics datasets [21,35], to uncover changes which might be missed using traditional statistical approaches. We investigated the relationship between hyperglycemia, cytotoxic injury (as estimated by ADC imaging), and metabolic phenotypes through the application of RF analysis and iterative feature selection. We found uric acid to be linked to both hyperglycemia and ADCr, and to a lesser extent gluconic acid. Findings were further validated using stability analysis through multiple permutations of AUC models and generalized additive model analyses, supporting the predictive accuracy of the results.
The main finding of our study was that uric acid was strongly associated with both hyperglycemia and ADCr, and that it mediated the association between these two outcomes. Although uric acid has antioxidant activity extracellularly, plasma uric acid levels have been shown to increase in response to high glucose levels [43] and have been linked to increased stroke severity, hypertension, cardiovascular disease and other conditions associated with oxidative stress [44,45]. Uric acid is the end product of purine metabolism in humans, and accordingly, we found that ADCr was also associated with allantoin, which is a metabolic end product of uric acid. Since allantoin is produced exclusively from the hydroxyl radical reactions of uric acid in human metabolism, the increase in its plasma concentration suggests relevance as an indicator of free radicals and supports the interpretation that it reflects oxidative stress [46].
There is an acknowledged link between high uric acid levels and worse stroke outcome [47,48], but in animal models neuroprotection has been reported [49]. Several clinical observational studies have addressed this question [47,50] and most recently the URICO-ICTUS trial reported a neutral result with uric acid therapy [51]. However, predefined subgroup analysis identified a possible protective effect of uric acid in those patients with the highest tertile of admission glucose level [51]. Taken together with our study, elevated uric acid may serve as a compensatory, neuroprotective response to hyperglycemia-related oxidative injury. Indeed, lowering glucose with insulin infusion after stroke [12,13] did not lead to improved outcome, suggesting a complex relationship between hyperglycemia, hyperuricemia, and clinical outcome. Future clinical trials of exogenous uric acid may warrant targeting hyperglycemic patients as a strategy to limit the secondary consequences of oxidative injury.
Gluconic acid was also identified as a candidate metabolite, although less consistently than uric acid. Gluconic acid is formed enzymatically from glucose through the activity of glucose oxidase, which liberates hydrogen peroxide. Therefore, gluconic acid can also be considered a marker of oxidative stress. In addition, five further metabolites were identified using RF that were not found using univariate methods. The likely explanation is that cytotoxic injury (as reflected by ADCr) is itself a complex phenotype and other, non-metabolomic factors contribute to ADCr, including the duration of ischemia, cerebrovascular collateral status, occlusion location, and timing of imaging [6,25]. Nevertheless, the identification of gluconic and uric acids points towards oxidative injury as an important pathological step in hyperglycemia-mediated injury after ischemic stroke.
Strengths of our study include the detailed phenotyping, the large number of patients and the breadth of measured metabolites. Moreover, both uric and gluconic acids were validated using AUC models, lending further confidence to the robustness of the metabolomic associations reported here. It must also be acknowledged that although the threshold of ADCr was identified to predict outcome in a previous study [23], and is within the range identified in preclinical studies to be associated with cytotoxicity [5-7], it may not necessarily reflect the degree of tissue injury, and other thresholds may identify additional metabolites. It is also important to acknowledge that our analyses do not imply a causal relationship nor prove a biological mechanism. Future scientific experiments and/or population-based genetic approaches are needed to establish mechanism. Another limitation is that we studied samples from stroke patients in the acute setting; thus, neither gluconic acid nor uric acid is generalizable as a metabolic marker in a population-based study [52,53]. Future work should focus on both validating our current findings and examining the effects of uric acid administration on glucose levels. In addition, the inclusion of a non-stroke cohort could improve future studies and further help distinguish between factors related to hyperglycemia and brain injury.
CONCLUSIONS
RF machine learning algorithms identified uric acid and gluconic acid as metabolites correlated with hyperglycemia and ADCr. These findings not only offer new insights for the ongoing evaluation of uric acid therapy, but also suggest that these metabolites might be used to risk stratify patients who present with acute ischemic stroke.
Supplementary Material
Supplementary Figure 1: Scatter plot demonstrating an inverse association between glucose and ADCr. The variation explained was 3% at P=0.004. Grey shading around the straight line identifies the 95% confidence interval.
Supplementary Figure 2: Generalized Additive Model (GAM) fits of selected metabolites from RF with binary outcome of Glucose and ADCr separately. a) Generalized Additive Model (GAM) fits of uric acid with dichotomized glucose as the outcome. The solid line represents the predicted value of the dependent variable. The dotted line represents two times the standard errors of the estimates. The small lines along the x axis are the "rug", showing the location of the sample points. The y axis is in linear units, which in this case is logits, so that the values are centered on “0” and extend to both positive and negative values. b) A panel of figures selected from RF analysis that illustrate the fits of uric acid and allantoin using GAM with dichotomized ADCr as the outcome.
Acknowledgments
Funding
This work was supported by the NIH R01 NS099209 (W.T.K.), AHA 17CSA33550004 (W.T.K.), AAN CRTS AI18-0000000062 (M.B.B.) and the Andrew David Heitman Neurovascular Foundation (W.T.K and M.B.B). A. A. was supported by National Institute for Health Research (NIHR) Surgical Reconstruction and Microbiology Research Centre (SRMRC), Birmingham, UK. The views expressed in this publication are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, the Medical Research Council or the Department of Health, UK.
Footnotes
Publisher's Disclaimer: This Author Accepted Manuscript is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication but has not been copyedited or corrected. The official version of record that is published in the journal is kept up to date and so may therefore differ from this version.
Competing interests
The authors declare that they have no conflicts of interest.
Ethics approval
Patients were enrolled at two sites (Massachusetts General and Brigham and Women’s Hospitals) and the study was approved by both participating institutional review boards and was performed in accordance with the ethical standards as laid down in the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards.
Consent to participate
All patients (N=522) or their surrogates provided informed consent.
Consent for publication
Not applicable.
Availability of data and material
Metabolomics and lipidomics data are available in the figshare repository https://figshare.com/s/3867fc9baed29154579b.
Code availability
Random Forest analysis, feature selection and stability analyses were done using R software (v3.4.3). The following R packages were used: randomForest and varSelRF [1] for Random Forest analysis, and the gam package for Generalized Additive Model analysis. All code used is available in the figshare repository https://figshare.com/s/3867fc9baed29154579b.
REFERENCES
- 1. Díaz-Uriarte R, Alvarez de Andrés S. Gene selection and classification of microarray data using random forest. BMC Bioinformatics. 2006;7:3.
- 2. Scott JF, Robinson GM, French JM, O’Connell JE, Alberti KG, Gray CS. Prevalence of admission hyperglycaemia across clinical subtypes of acute stroke. Lancet. 1999;353:376–7.
- 3. Kruyt ND, Biessels GJ, Devries JH, Roos YB. Hyperglycemia in acute ischemic stroke: Pathophysiology and clinical management. Nat Rev Neurol. 2010;6:145–55.
- 4. Williams LS, Rotich J, Qi R, Fineberg N, Espay A, Bruno A, et al. Effects of admission hyperglycemia on mortality and costs in acute ischemic stroke. Neurology. 2002;59:67–71.
- 5. Back T, Hoehn-Berlage M, Kohno K, Hossmann KA. Diffusion nuclear magnetic resonance imaging in experimental stroke. Correlation with cerebral metabolites. Stroke. 1994;25:494–500.
- 6. Hoehn-Berlage M, Norris DG, Kohno K, Mies G, Leibfritz D, Hossmann KA. Evolution of regional changes in apparent diffusion coefficient during focal ischemia of rat brain: the relationship of quantitative diffusion NMR imaging to reduction in cerebral blood flow and metabolic disturbances. J Cereb Blood Flow Metab. 1995;15:1002–11.
- 7. Olah L, Wecker S, Hoehn M. Relation of apparent diffusion coefficient changes and metabolic disturbances after 1 hour of focal cerebral ischemia and at different reperfusion phases in rats. J Cereb Blood Flow Metab. 2001;21:430–9.
- 8. Luitse MJ, Velthuis BK, Kappelle LJ, van der Graaf Y, Biessels GJ, DUST Study Group. Chronic hyperglycemia is related to poor functional outcome after acute ischemic stroke. Int J Stroke. 2017;12:180–6.
- 9. Zonneveld TP, Nederkoorn PJ, Westendorp WF, Brouwer MC, van de Beek D, Kruyt ND, et al. Hyperglycemia predicts poststroke infections in acute ischemic stroke. Neurology. 2017;88:1415–21.
- 10. Tarr D, Graham D, Roy LA, Holmes WM, McCabe C, Macrae IM, et al. Hyperglycemia accelerates apparent diffusion coefficient-defined lesion growth after focal cerebral ischemia in rats with and without features of metabolic syndrome. J Cereb Blood Flow Metab. 2013;33:1556–63.
- 11. Yip PK, He YY, Hsu CY, Garg N, Marangos P, Hogan EL. Effect of plasma glucose on infarct size in focal cerebral ischemia-reperfusion. Neurology. 1991;41:899–905.
- 12. Gray CS, Hildreth AJ, Sandercock PA, O’Connell JE, Johnston DE, Cartlidge NEF, et al. Glucose-potassium-insulin infusions in the management of post-stroke hyperglycaemia: the UK Glucose Insulin in Stroke Trial (GIST-UK). Lancet Neurol. 2007;6:397–406.
- 13. Johnston KC, Bruno A, Pauls Q, Hall CE, Barrett KM, Barsan W, et al. Intensive vs Standard Treatment of Hyperglycemia and Functional Outcome in Patients with Acute Ischemic Stroke: The SHINE Randomized Clinical Trial. JAMA. 2019.
- 14. O’Neill PA, Davies I, Fullerton KJ, Bennett D. Stress hormone and blood glucose response following acute stroke in the elderly. Stroke. 1991;22:842–7.
- 15. Duckrow RB, Beard DC, Brennan RW. Regional cerebral blood flow decreases during chronic and acute hyperglycemia. Stroke. 1987;18:52–8.
- 16. Quast MJ, Wei J, Huang NC, Brunder DG, Sell SL, Gonzalez JM, et al. Perfusion Deficit Parallels Exacerbation of Cerebral Ischemia/Reperfusion Injury in Hyperglycemic Rats. J Cereb Blood Flow Metab. 1997;17:553–9.
- 17. Anderson RE, Tan WK, Martin HS, Meyer FB. Effects of glucose and PaO2 modulation on cortical intracellular acidosis, NADH redox state, and infarction in the ischemic penumbra. Stroke. 1999;30:160–70.
- 18. Katsura K, Asplund B, Ekholm A, Siesjö BK. Extra- and Intracellular pH in the Brain During Ischaemia, Related to Tissue Lactate Content in Normo- and Hypercapnic rats. Eur J Neurosci. 1992;4:166–76.
- 19. Kamada H, Yu F, Nito C, Chan PH. Influence of hyperglycemia on oxidative stress and matrix metalloproteinase-9 activation after focal cerebral ischemia/reperfusion in rats: relation to blood-brain barrier dysfunction. Stroke. 2007;38:1044–9.
- 20. Suh SW, Shin BS, Ma H, Van Hoecke M, Brennan AM, Yenari MA, et al. Glucose and NADPH oxidase drive neuronal superoxide formation in stroke. Ann Neurol. 2008;64:654–63.
- 21. Acharjee A, Ament Z, West JA, Stanley E, Griffin JL. Integration of metabolomics, lipidomics and clinical data using a machine learning method. BMC Bioinformatics. 2016;17:37–49.
- 22. Wolcott Z, Batra A, Bevers MB, Sastre C, Khoury J, Sperling M, et al. Soluble ST2 predicts outcome and hemorrhagic transformation after acute stroke. Ann Clin Transl Neurol. 2017;4:553–63.
- 23. Bevers MB, Vaishnav NH, Pham L, Battey TWK, Kimberly WT. Hyperglycemia is associated with more severe cytotoxic injury after stroke. J Cereb Blood Flow Metab. 2017;37:2577–83.
- 24. Stead LG, Gilmore RM, Bellolio MF, Mishra S, Bhagra A, Vaidyanathan L, et al. Hyperglycemia as an independent predictor of worse outcome in non-diabetic patients presenting with acute ischemic stroke. Neurocrit Care. 2009;10:181–6.
- 25. Bevers MB, Battey TWK, Ostwaldt A-C, Jahan R, Saver JL, Kimberly WT, et al. Apparent Diffusion Coefficient Signal Intensity Ratio Predicts the Effect of Revascularization on Ischemic Cerebral Edema. Cerebrovasc Dis. 2018.
- 26. Bevers MB, Kimberly WT. Critical Care Management of Acute Ischemic Stroke. Curr Treat Options Cardiovasc Med. 2017;19:41.
- 27. Tsujita N, Kai N, Fujita Y, Hiai Y, Hirai T, Kitajima M, et al. Interimager Variability in ADC Measurement of the Human Brain. Magn Reson Med Sci. 2014;13:81–7.
- 28. Kimberly WT, O’Sullivan JF, Nath AK, Keyes M, Shi X, Larson MG, et al. Metabolite profiling identifies anandamide as a biomarker of nonalcoholic steatohepatitis. JCI Insight. 2017;2:e92989.
- 29. Nelson SE, Ament Z, Wolcott Z, Gerszten RE, Kimberly WT. Succinate links atrial dysfunction and cardioembolic stroke. Neurology. 2019;92:e802–10.
- 30. Xie T, Zhou X, Wang S, Lu Y, Zhu H, Kang A, et al. Development and application of a comprehensive lipidomic analysis to investigate Tripterygium wilfordii-induced liver injury. Anal Bioanal Chem. 2016;408:4341–55.
- 31. Xi B, Gu H, Baniasadi H, Raftery D. Statistical analysis and modeling of mass spectrometry-based metabolomics data. Methods Mol Biol. 2014;1198:333–53.
- 32. Hastie TJ, Tibshirani RJ. Generalized additive models. Gen. Addit. Model. 2017.
- 33. MacKinnon DP, Fairchild AJ, Fritz MS. Mediation Analysis. Annu Rev Psychol. 2007;58:593–614.
- 34. Baron RM, Kenny DA. The Moderator-Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations. J Pers Soc Psychol. 1986;51:1173–82.
- 35. Diao JA, Kohane IS, Manrai AK. Biomedical informatics and machine learning for clinical genomics. Hum Mol Genet. 2018;27:R29–34.
- 36. Liang D, Bhatta S, Volodymyr G, Gerzanich V, Simard JM. Cytotoxic edema: mechanisms of pathological cell swelling. Neurosurg Focus. 2007;22:1–9.
- 37. Manzanero S, Santro T, Arumugam TV. Neuronal oxidative stress in acute ischemic stroke: sources and contribution to cell injury. Neurochem Int. 2013;62:712–8.
- 38. Cherubini A, Ruggiero C, Polidori MC, Mecocci P. Potential markers of oxidative stress in stroke. Free Radic Biol Med. 2005;39:841–52.
- 39. Khoshnam SE, Winlow W, Farzaneh M, Farbood Y, Moghaddam HF. Pathogenic mechanisms following ischemic stroke. Neurol Sci. 2017;38:1167–86.
- 40. Li P, Stetler RA, Leak RK, Shi Y, Li Y, Yu W, et al. Oxidative stress and DNA damage after cerebral ischemia: Potential therapeutic targets to repair the genome and improve stroke recovery. Neuropharmacology. 2018;134:208–17.
- 41. Li W, Yang S. Targeting oxidative stress for the treatment of ischemic stroke: Upstream and downstream therapeutic strategies. Brain Circ. 2016;2:153–63.
- 42. Rink C, Khanna S. Significance of brain tissue oxygenation and the arachidonic acid cascade in stroke. Antioxid Redox Signal. 2011;14:1889–903.
- 43. Facchini F, Chen Y-DI, Hollenbeck CB, Reaven GM. Relationship Between Resistance to Insulin-Mediated Glucose Uptake, Urinary Uric Acid Clearance, and Plasma Uric Acid Concentration. JAMA. 1991;266:3008.
- 44. Kuwabara M, Hisatome I, Niwa K, Hara S, Roncal-Jimenez CA, Bjornstad P, et al. Uric Acid Is a Strong Risk Marker for Developing Hypertension From Prehypertension. Hypertension. 2018;71:78–86.
- 45. Corry DB, Eslami P, Yamamoto K, Nyby MD, Makino H, Tuck ML. Uric acid stimulates vascular smooth muscle cell proliferation and oxidative stress via the vascular renin-angiotensin system. J Hypertens. 2008.
- 46. Chung WY, Benzie IFF. Plasma allantoin measurement by isocratic liquid chromatography with tandem mass spectrometry: Method evaluation and application in oxidative stress biomonitoring. Clin Chim Acta. 2013;424:237–44.
- 47. Weir CJ, Muir SW, Walters MR, Lees KR. Serum urate as an independent predictor of poor outcome and future vascular events after acute stroke. Stroke. 2003;34:1951–6.
- 48. Li M, Hou W, Zhang X, Hu L, Tang Z. Hyperuricemia and risk of stroke: a systematic review and meta-analysis of prospective studies. Atherosclerosis. 2014;232:265–70.
- 49. Yu ZF, Bruce-Keller AJ, Goodman Y, Mattson MP. Uric acid protects neurons against excitotoxic and metabolic insults in cell culture, and against focal ischemic brain injury in vivo. J Neurosci Res. 1998;53:613–25.
- 50. Amaro S, Urra X, Gómez-Choco M, Obach V, Cervera A, Vargas M, et al. Uric acid levels are relevant in patients with stroke treated with thrombolysis. Stroke. 2011;42:S28–32.
- 51. Chamorro Á, Amaro S, Castellanos M, Segura T, Arenillas J, Martí-Fábregas J, et al. Safety and efficacy of uric acid in patients with acute stroke (URICO-ICTUS): a randomised, double-blind phase 2b/3 trial. Lancet Neurol. 2014;13:453–60.
- 52. Floegel A, Kühn T, Sookthai D, Johnson T, Prehn C, Rolle-Kampczyk U, et al. Serum metabolites and risk of myocardial infarction and ischemic stroke: a targeted metabolomic approach in two German prospective cohorts. Eur J Epidemiol. 2018;33:55–66.
- 53. Sun D, Tiedt S, Yu B, Jian X, Gottesman RF, Mosley TH, et al. A prospective study of serum metabolites and risk of ischemic stroke. Neurology. 2019;92:e1890–8.
