Abstract
The present study characterized associations among brain metabolite levels, applying bivariate and multivariate (i.e., factor analysis) statistical methods to tCr-referenced estimates of the major PRESS 1H-MRS metabolites (i.e., tNAA/tCr, tCho/tCr, mI/tCr, Glx/tCr), acquired at 3 Tesla from medial parietal lobe in a large (n=299), well-characterized international cohort of healthy volunteers (Povazan et al., 2020). Results supported the hypothesis that 1H-MRS-measured metabolite estimates are moderately intercorrelated (Mr = 0.42, SDr = 0.11, ps < 0.001), with more than half (i.e., 57%) of the total variability in metabolite estimates common to (i.e., shared by) all metabolites. Older age was significantly associated with lower levels of the identified common metabolite variance (CMV) factor (β = −0.09, p = 0.048), despite not being associated with levels of any individual metabolite. Holding CMV factor levels constant, females had significantly lower levels of total choline (i.e., unique metabolite variance or UMV; β = −0.19, p < 0.001), mirroring significant bivariate correlations between sex and total choline reported previously. Supplementary analysis of water-referenced metabolite estimates (i.e., including tCr/water) demonstrated lower, though still substantial, intercorrelations among metabolites, with 37% of total metabolite variance common to all metabolites. If replicated, these results would suggest that applied 1H-MRS researchers shift their analytical framework from examining bivariate associations between individual metabolites and specialty-dependent (e.g., clinical, research) variables of interest (e.g., using t-tests) to examining multi-variable (i.e., covariate) associations between multiple metabolites and specialty-dependent variables of interest (e.g., using multiple regression).
Keywords: 1H-MRS, proton magnetic resonance spectroscopy, factor analysis, age, sex, covariance, common metabolite variance, unique metabolite variance
1. Introduction
Proton MR spectroscopy (1H-MRS) is a widely used research and clinical diagnostic tool that combines standard MRI hardware with specialized pulse sequences to estimate the localized concentrations of products of cellular metabolism, or metabolites, in vivo. Most 1H-MRS data are acquired from brain tissue using single-voxel (i.e., one volume, or “voxel,” at a time) localization pulse sequences like STEAM, semi-LASER or PRESS, and implementations of these are included with most commercially available MRI systems.1 With the help of specialized post-processing (e.g., FID-A)2 and linear-combination modeling (e.g., LCModel)3 software, PRESS 1H-MRS in brain tissue at the commonest clinical MR field strengths (i.e., 1.5 T and 3 T) can reliably resolve five major metabolite signals from overlapping metabolite and nuisance signals (e.g., water, lipids): N-acetylaspartate plus N-acetylaspartylglutamate (i.e., total NAA or tNAA), glycerophosphocholine plus phosphocholine (i.e., total choline or tCho), creatine plus phosphocreatine (i.e., total creatine or tCr), myo-inositol (mI), and glutamate plus glutamine (Glx). The amplitude of each resolved metabolite signal is proportional to the number of signal-generating molecules in the voxel but is also influenced by idiographic characteristics of MRI hardware properties and the measurement process itself (e.g., coil loading, receiver gain, amplifier non-linearities). 
Signals are therefore typically normalized to an internal spectroscopic reference signal that is affected by the same characteristics (i.e., most often tCr or unsuppressed water) to allow inference across instruments.1 Once metabolites are quantified, applied research studies typically focus on bivariate associations of individual metabolites with specialty-dependent (e.g., clinical, behavioral) variables of interest, and interpret such associations by referencing accepted literature definitions of each metabolite: tNAA as a marker of neuronal viability or density, tCho of membrane/phospholipid turnover, tCr of bioenergetics (i.e., assuming tCr was not instead used as an internal reference), mI of osmotic regulation, and Glx of excitatory neurotransmission or neuronal metabolism.4 For example, a recent meta-analysis, reporting elevated anterior cingulate cortex levels of Glx in individuals with bipolar disorder (i.e., a severe and chronic mental illness), summarized extant interpretations of this common finding as, “increased glutamatergic neurotransmission [causing] supra-activation of glutamatergic receptors, increasing calcium post-synaptic influx, resulting in excitotoxicity, cell damage or even neuronal death.”5 Interpretations of bivariate associations like these, which are common in the fields of psychiatry, neurology, neuroscience, and beyond, rely on several unevaluated assumptions, including that the metabolite of interest (e.g., Glx) is statistically independent from other metabolites (e.g., tNAA) and/or is uniquely (e.g., Glx but not tNAA) associated with the specialty-dependent variable of interest (e.g., bipolar disorder). As such, it is important that we ask, ‘Are these assumptions appropriate?’
There are many potential sources of inter-metabolite (i.e., metabolite-metabolite) covariance (e.g., shared biological pathways of synthesis and uptake, but also spectroscopic signal overlap, normalization, and differential likelihood of being present in particular tissue types). Inter-metabolite covariance has nonetheless been largely ignored in the 1H-MRS literature; we were unable, for example, to find a single published study characterizing associations among each of the major 1H-MRS metabolites within a reasonably large sample of healthy volunteers (e.g., n ≥ 50). Even so, several application studies have reported bivariate correlations between theoretically selected pairs of metabolites and have generally demonstrated at least moderate inter-metabolite covariance (e.g., Glx and NAA,6,7 Glx and GABA,8,9 GABA and mI10). Furthermore, the influence of some specific sources of inter-metabolite covariance (e.g., normalization) has been well-characterized for similar applications of NMR spectroscopy, for example, in the nutritional and food sciences.11 Together, these lines of evidence provide support for the hypothesis that major 1H-MRS metabolites are moderately intercorrelated, due to some combination of substantive (e.g., common underlying biology) and methodological factors (e.g., shared localization in tissue, signal normalization, signal overlap).
Full characterization of inter-metabolite covariance should include analyses that go beyond estimating bivariate correlations between pairs of 1H-MRS metabolites, since bivariate correlations cannot identify and isolate sources of common (i.e., shared) metabolite variance (CMV) that affect more than two metabolite variables (e.g., normalization). These can only be determined and quantified using multivariate statistical methods, for example, factor analysis, an unsupervised learning algorithm that reduces the dimensionality of a metabolite dataset by identifying linear combinations of variables (or “factors”) that maximally explain observed inter-variable covariance.12,13 Such multivariate methods have been productively used in NMR research for over forty years,14 and have been recommended for 1H-MRS research and clinical applications,15 but have not been widely adopted to date. Only a few studies that we are aware of, each conducted in relatively small clinical samples (i.e., individuals with HIV-associated dementia), have used factor analysis to identify and isolate CMV factors (e.g., “inflammation,” marked by CMV in Cho/Cr and mI/Cr across multiple brain regions).16,17
Since multiple-regression research has demonstrated that observed bivariate associations are often spurious, arising from shared covariance with confounding ‘third variable(s)’,18 it is imperative to not only obtain robust bivariate and multivariate estimates of inter-metabolite covariance in 1H-MRS data, but to also evaluate the extent to which associations of specialty-dependent (e.g., clinical, behavioral) variables of interest with individual metabolites are driven by CMV versus variance unique to those individual metabolites (i.e., unique metabolite variance, UMV), for example, using statistical covariate analyses (e.g., ANCOVA, multiple regression). Unfortunately, the vast majority of applied 1H-MRS studies to date have only evaluated bivariate associations of one or a few theoretically selected metabolites with specialty-dependent variables of interest, ignoring the potential influence of inter-metabolite covariance on these associations. As a result, it is possible that some, many, most, or all reported associations between 1H-MRS metabolites and specialty-dependent variables of interest have been driven by CMV, thereby potentially undermining authors’ interpretations of these associations as reflecting something particular about their chosen 1H-MRS metabolites (i.e., UMV).
The primary goal of the present study was to characterize inter-metabolite covariance in a large (n=299), well-characterized international cohort of healthy volunteers,19 applying both bivariate and multivariate (i.e., factor analysis) statistical methods to tCr-normalized estimates of the major PRESS 1H-MRS metabolites (tNAA/tCr, tCho/tCr, mI/tCr, Glx/tCr). The secondary goal of this study was to evaluate whether observed associations of the major metabolites with available covariates (i.e., age, sex) were driven by CMV and/or UMV in the same cohort of volunteers. Since 1H-MRS metabolite estimates require normalization by a common stable reference signal,11 a final goal of this study was to evaluate the impact of the choice of internal reference (i.e., tCr vs. unsuppressed water) on the observed inter-metabolite covariance.
2. Methods
2.1. Participants and data (acquisition, preprocessing, and modeling)
Data for the present study, along with acquisition details, have been published previously.19 Briefly, 299 healthy volunteers aged 18-35 years from 26 sites (mean [SD] n per site = 11.19 [1.80]) contributed 1H-MRS data from 3x3x3 cm3 medial parietal lobe voxels (Figure 1a), acquired using a vendor-native short-TE PRESS sequence on 3 Tesla scanners from 3 vendors (Philips: 10 sites, total n = 119; Siemens: 8 sites, total n = 89; GE: 8 sites, total n = 91), with the following parameters: TR/TE = 2000/35 ms; 64 averages; 2, 4, or 5 kHz spectral bandwidth; 2048-4096 data points; 140, 50, or 150 Hz water suppression pulse bandwidth for Philips, Siemens, or GE, respectively; acquisition time = 2.13 min. Eight to sixteen averages of water reference signal were acquired with similar parameters, but without water suppression. Institutional review board approval and participant written informed consent were obtained at each study site. Data were preprocessed following expert-consensus-recommended practices (e.g., frequency and phase correction of individual transients in the time domain,20,21 eddy current correction,22 and HSVD water removal23) using the open-source MATLAB toolbox Osprey.24 Vendor-specific basis sets were created for linear-combination modeling using the MATLAB toolbox FID-A2 as detailed in Zöllner et al.25 The LCModel version 6.3 algorithm3 was used to fit the pre-processed data with these basis sets and default settings (Figure 1b). Quantitative metabolite estimates were derived from the LCModel-generated model amplitudes in Osprey and subsequently referenced to (i.e., divided by) estimated tCr levels,19 resulting in four metabolite variables (i.e., tNAA/tCr, tCho/tCr, mI/tCr, and Glx/tCr), as well as to unsuppressed water (W), resulting in five metabolite variables (i.e., tNAA/W, tCho/W, mI/W, Glx/W, tCr/W).
Water reference signals were fully tissue-corrected for within-voxel cerebrospinal fluid (CSF) fraction and for tissue-specific water densities and signal relaxation times in gray matter (GM), white matter (WM), and CSF,26 based on observed within-voxel tissue fractions that were estimated from participants’ segmented T1-weighted images using SPM12.27 Data quality metrics (signal-to-noise ratio [SNR] and linewidth of the NAA singlet) were calculated for the pre-processed data. SNR was defined as the ratio of the peak height to the standard deviation of the frequency-domain spectrum between −2 and 0 ppm. Linewidth was defined as the average of the simple FWHM and the FWHM of a Lorentzian peak model. A summary table following the minimum reporting standards recommended by a recent MRS consensus paper28 can be found in Supplementary Table 1.
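The two data quality definitions above can be sketched in a few lines. This is a minimal illustration, not the Osprey implementation: the function names, the 1.8 to 2.2 ppm search window for the NAA singlet (which resonates near 2.0 ppm), and the use of the sample standard deviation are assumptions made for the example, and the toy spectrum is synthetic.

```python
import statistics

def naa_snr(ppm, spectrum, peak_window=(1.8, 2.2), noise_window=(-2.0, 0.0)):
    """Peak height inside peak_window divided by the SD of the points in noise_window.

    peak_window is an assumed search range for the NAA singlet; the noise
    window (-2 to 0 ppm) follows the definition given in the text.
    """
    peak = max(s for p, s in zip(ppm, spectrum)
               if peak_window[0] <= p <= peak_window[1])
    noise = [s for p, s in zip(ppm, spectrum)
             if noise_window[0] <= p <= noise_window[1]]
    return peak / statistics.stdev(noise)  # sample SD (an assumption)

def combined_linewidth(simple_fwhm_hz, lorentzian_fwhm_hz):
    """Average of the simple FWHM and the FWHM of a Lorentzian peak model."""
    return (simple_fwhm_hz + lorentzian_fwhm_hz) / 2.0

# Synthetic toy spectrum: alternating +/-1 "noise" from -2 to 0 ppm, one peak at 2.0 ppm.
ppm = [i / 10 for i in range(-20, 41)]
spectrum = [(1.0 if i % 2 == 0 else -1.0) if i <= 0 else 0.0
            for i in range(-20, 41)]
spectrum[ppm.index(2.0)] = 100.0
snr = naa_snr(ppm, spectrum)
```

In practice the noise region is often detrended before its standard deviation is taken; whether that was done here is not stated, so the sketch leaves it out.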
Figure 1.
a. Sample 3x3x3 cm3 medial parietal lobe voxel position. b. Detailed schematic of the analysis workflow used in this study.
2.2. Statistical analyses
All statistical analyses were performed on group- (i.e., site-) mean centered metabolite variables (i.e., to provide clearly interpretable and unbiased estimates of inter-metabolite covariance)29 in Mplus version 8.7 software,30 using a maximum likelihood estimator with robust standard errors (MLR) and with “type=complex” specified to adjust model-fit indices and standard errors for the complex sampling design of the data (i.e., with participants clustered by site [“cluster = site”] and sites stratified by vendor [“stratification = vendor”]).31,32 The analysis plan detailed below was first completed on tCr-normalized metabolite estimates and then repeated on water-normalized estimates, to evaluate the impact of the choice of quantitative reference signal (i.e., tCr vs. water) on inter-metabolite covariance.
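Group-mean centering, as described above, subtracts each site's mean from the values of the participants scanned at that site, so that between-site offsets do not contribute to the estimated inter-metabolite covariance. The following is a minimal sketch of that step only; the function name and toy values are hypothetical, and the cluster and stratification adjustments made in Mplus are not reproduced.

```python
from collections import defaultdict

def center_within_site(values, sites):
    """Subtract each site's mean from the values belonging to that site."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, site in zip(values, sites):
        totals[site] += value
        counts[site] += 1
    site_means = {site: totals[site] / counts[site] for site in totals}
    return [value - site_means[site] for value, site in zip(values, sites)]

# Hypothetical tNAA/tCr values from two sites with different overall levels.
tnaa_tcr = [1.0, 3.0, 10.0, 14.0]
site_ids = ["site_a", "site_a", "site_b", "site_b"]
centered = center_within_site(tnaa_tcr, site_ids)  # [-1.0, 1.0, -2.0, 2.0]
```

After centering, every site has a mean of zero on each metabolite variable, so any remaining covariance reflects within-site variation.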
Pearson correlation coefficients were calculated for each pair of metabolite variables, as well as for each available combination of covariate and metabolite variable (e.g., age and Glx/tCr); for correlation coefficients involving the binary variable, sex, resulting values were point-biserial coefficients with p-values equal to those derived from equivalent independent samples t-tests (i.e., with sex as the independent, and metabolite as the dependent, variable).18
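The stated equivalence between a point-biserial correlation and an independent samples t-test can be checked numerically. The sketch below uses the textbook identity t = r·sqrt((n − 2)/(1 − r²)) and compares it with a pooled-variance t statistic; the function names and toy data are hypothetical, and the design-based adjustments applied in Mplus are not included.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 variable in x this is the point-biserial r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) implied by a correlation r."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def pooled_t(group1, group0):
    """Independent samples t statistic with pooled variance (group1 minus group0)."""
    n1, n0 = len(group1), len(group0)
    m1, m0 = sum(group1) / n1, sum(group0) / n0
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss0 = sum((v - m0) ** 2 for v in group0)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))

# Hypothetical data: sex coded 0/1 and a metabolite value per participant.
sex = [0, 0, 0, 1, 1, 1]
metabolite = [1.0, 2.0, 3.0, 2.0, 2.5, 4.5]
t_corr = t_from_r(pearson_r(sex, metabolite), len(sex))
t_test = pooled_t(metabolite[3:], metabolite[:3])
```

Because the two statistics coincide and share n − 2 degrees of freedom, their p-values coincide as well.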
Exploratory factor analysis (EFA) was conducted to obtain eigenvalues associated with each extracted factor, representing the total amount of CMV explained by a given factor, with eigenvalues < 1 typically interpreted as denoting ignorable factors.13 Only a single CMV factor was ultimately modeled using confirmatory factor analysis (CFA), because the observed data consisted of only 4 dependent variables, and each modeled factor in a factor analysis should feature ≥ 3 highly associated (or “loading”) dependent variables to avoid estimation problems and to ensure model reliability.12 Factor variances were fixed to “1” (i.e., standardized) for model identification, allowing all factor loadings (i.e., metabolite on factor regression coefficients) to be freely estimated.12 Correlations among model residuals were fixed to “0” to impart the standard, conservative baseline assumption of local independence33 (i.e., that the factor model fully explains the observed inter-metabolite covariance). Goodness of model fit to the data was evaluated using the comparative fit index (CFI, cutoff ≥ 0.95), Tucker Lewis index (TLI, cutoff ≥ 0.95), and standardized root mean square residual (SRMR).34,35 Modification indices (MIs) were generated for constrained model parameters (i.e., correlations among model residuals), reflecting the improvement in model fit (χ2) that would be achieved if a given constrained parameter were to be freely estimated.30 Constraints associated with very high MIs (i.e., > 10, corresponding to p-values ≤ 0.001) were considered for free estimation, one at a time.
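A modification index for a single constrained parameter is conventionally read as a chi-square statistic with 1 degree of freedom. As a minimal sketch of how an MI threshold maps onto a p-value under that assumption, the upper-tail probability can be computed with the standard library alone (the function name is hypothetical):

```python
import math

def chi2_pvalue_1df(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))

p_at_conventional_cutoff = chi2_pvalue_1df(3.841)  # about 0.05
p_at_mi_threshold = chi2_pvalue_1df(10.0)          # about 0.0016
p_at_exact_cutoff = chi2_pvalue_1df(10.828)        # about 0.001
```

Under this reading, an MI of 10 sits just above p = 0.001 and an MI of roughly 10.83 sits at it, so the threshold used here is a deliberately stringent one compared with the conventional 3.84.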
From the final CFA model, a Multiple Indicator Multiple Causes (MIMIC)36 model was constructed to evaluate associations of age and sex with the identified CMV factor and each of its constituent metabolite variables. The standard, stepwise forward modeling procedure was used to construct the final MIMIC model:12 first regressing the CMV factor on covariates (i.e., while simultaneously constraining each regression path from covariates to individual metabolite variables to “0”), then using MIs to unconstrain only those paths that would result in significant improvement to model fit if freely estimated (i.e., > 10, corresponding to p-values ≤ 0.001), one at a time. Regression paths unconstrained using this method, if any, represented associations of covariates with individual metabolites that were statistically significant at equal levels of (i.e., statistically controlling for) the CMV factor, allowing for dissociation of covariate effects driven by CMV (i.e., CMV factor regressed on covariates) versus UMV (i.e., individual metabolite variables simultaneously regressed on covariates).
Fully standardized model parameters are provided for final CFA and MIMIC models, including factor loadings, estimated covariate regression coefficients, UMV (i.e., variance in each metabolite unaccounted for by the CMV factor and/or covariates, along with its complement [i.e., r2 = 1 − UMV]), and estimated residual correlation coefficients, if any.
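In a fully standardized single-factor model with local independence, each metabolite's explained variance is its squared loading, its UMV is one minus that value, and the model-implied correlation between two metabolites is the product of their loadings. The sketch below illustrates these relationships with hypothetical loadings; it is not the Mplus estimation routine, and the function name is an assumption.

```python
def one_factor_implied(loadings):
    """Return (unique variances, implied correlation matrix) for a
    standardized one-factor model with uncorrelated residuals."""
    unique_variances = [1.0 - loading ** 2 for loading in loadings]
    k = len(loadings)
    implied_corr = [
        [1.0 if i == j else loadings[i] * loadings[j] for j in range(k)]
        for i in range(k)
    ]
    return unique_variances, implied_corr

# Hypothetical standardized loadings for two metabolites.
umv, implied = one_factor_implied([0.8, 0.5])
# umv is approximately [0.36, 0.75]; implied[0][1] is approximately 0.40
```

A freed residual correlation, as considered through the MIs, adds covariance between two metabolites beyond the loading product, which is what partially relaxing local independence means in practice.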
3. Results
3.1. Participants and data acquisition, preprocessing, and modeling
Data were generally of excellent quality (NAA SNR: M = 213.12, SD = 68.74; Water FWHM: M = 6.83 Hz, SD = 0.79 Hz). However, data from 8 participants were excluded from all analyses after visual inspection of the modeling outputs due to incorrectly placed voxels (n = 2, vendor = Siemens), unacceptably high lipid peaks (n = 4, vendor = Siemens; n = 1, vendor = Philips), or frequency and phase correction failure (n = 1, vendor = GE). Data from an additional 3 participants were excluded from primary analyses following metabolite quantification because their tNAA/tCr (n = 1, vendor = GE) or Glx/tCr (n = 1, vendor = GE; n = 1, vendor = Philips) values were identified as extreme outliers (i.e., ≥ 3rd quartile + 3 * interquartile range), leaving 288 participants’ data available for primary analyses. Figure 2 shows mean data, model, and residual, while Supplementary Figure 1 shows participant data, model, and residual. LCModel fits of these data were excellent, as indicated by their relative Cramér-Rao Lower Bound (CRLB), a measure of model parameter estimation uncertainty (tNAA M = 1.41 %, SD = 0.50 %; tCho M = 2.67 %, SD = 1.00 %; tCr M = 1.36 %, SD = 0.53 %; mI M = 3.76 %, SD = 0.65 %; Glx M = 4.61 %, SD = 0.67 %). Age (M = 26.46, SD = 4.62) and sex (female n [%] = 147 [51%]) were not significantly associated with one another (β = −0.12, p = 0.15) and, with the exception of sex with tCho (r = −0.23), bivariate correlations of age and sex with metabolite variables were very small (i.e., all |r| < 0.10).
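The extreme-outlier rule quoted above (values at or beyond the 3rd quartile plus 3 times the interquartile range) can be sketched as follows. The quartile interpolation method is not specified in the text, so the "inclusive" method of the standard library is an assumption here, as are the function name and toy values.

```python
import statistics

def extreme_high_outliers(values):
    """Return values at or above Q3 + 3 * IQR (upper tail only, as stated in the text)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    threshold = q3 + 3 * (q3 - q1)
    return [v for v in values if v >= threshold]

# Hypothetical metabolite ratios with one clearly aberrant value.
flagged = extreme_high_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])  # [100]
```

With these toy values the quartiles are 3 and 7, so the threshold is 19 and only the value 100 is flagged.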
Figure 2.
Mean spectra (black) +/− SD (gray ribbons) and mean fit (red); mean residual (above) and mean contribution from individual metabolites (i.e., tNAA total NAA, Glx glutamate+glutamine, tCho total choline, tCr total creatine, mI myo-inositol) +/− SD (gray ribbons) (below).
3.2. Statistical analysis
3.2.1. Primary analysis of tCr-normalized estimates
Figure 3 shows scatterplots and linear regression lines, along with Pearson correlation coefficients, for each pair of metabolite variables. Inter-metabolite correlations were generally moderate in strength (Mr = 0.42, SDr = 0.11) and were uniformly positive and significantly different from zero (ps < 0.001).
Figure 3.
Scatterplot matrix, featuring linear fit lines, depicting bivariate associations between each pair of tCr-normalized metabolites (i.e., tNAA total NAA, Glx glutamate+glutamine, tCho total choline, tCr total creatine, mI myo-inositol).
EFA-estimated eigenvalues (1: 2.26, 2: 0.75, 3: 0.63, 4: 0.36) strongly supported extraction of a single factor that explained 57% of total metabolite variance, and models with > 1 extracted factor failed to converge, as predicted. As a result, a single factor CFA model was constructed that provided good (CFI = 0.98, SRMR = 0.03) to adequate (TLI = 0.93) fit to the data; no MIs exceeded the pre-defined threshold of p ≤ 0.001, and so the baseline assumption of local independence (i.e., that the model fully explains all inter-metabolite covariance) was retained. Figure 4 shows a diagram of the final CFA model, labeled by its standardized parameter estimates and their standard errors. Factor loadings were moderate-strong (Mβ = 0.64, SDβ = 0.18) and universally statistically significant (ps < 0.001), with the CMV factor explaining between 22% (mI/tCr) and 76% (Glx/tCr) of the variance in each metabolite (Mr2 = 0.56, SDr2 = 0.24).
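The link between the eigenvalues and the reported share of variance follows from the fact that the eigenvalues of a correlation matrix sum to the number of variables. A short sketch using the eigenvalues reported above (variable and function names are hypothetical):

```python
def variance_share(eigenvalues):
    """Proportion of total standardized variance attributable to each factor."""
    total = sum(eigenvalues)  # equals the number of variables for a correlation matrix
    return [value / total for value in eigenvalues]

tcr_eigenvalues = [2.26, 0.75, 0.63, 0.36]   # as reported for the tCr-normalized data
shares = variance_share(tcr_eigenvalues)     # first share is 0.565
n_retained = sum(1 for value in tcr_eigenvalues if value >= 1.0)  # eigenvalue >= 1 rule
```

The first share of 0.565 is consistent with the reported 57% once rounding of the published eigenvalue is taken into account, and only one eigenvalue meets the retention rule.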
Figure 4.
Final confirmatory factor analysis model of tCr-normalized metabolites, labeled with fully standardized (ranging between “0” and “1”) parameter estimates (standard errors for these estimates appear alongside them in brackets). Parameter estimates on arrowed lines from the common metabolite variance (CMV) factor to metabolite variables represent the strength of the relationship between each metabolite and the CMV factor. Parameter estimates adjacent to arrowed lines pointing to each metabolite represent the unique metabolite variance (UMV) that was not explained by the CMV factor in that metabolite (and, its inverse, the amount of variance that was explained by the CMV factor). tNAA = total NAA, Glx = glutamate+glutamine, tCho = total choline, tCr = total creatine, mI = myo-inositol.
The baseline MIMIC model, with the CMV factor regressed on age and sex but with all paths from covariates to metabolite variables constrained to “0,” provided adequate (CFI = 0.92, SRMR = 0.05) to inadequate (TLI = 0.86) fit to the data. The constrained path from sex to tCho/tCr was associated with the largest suprathreshold MI value (i.e., 13.95), supporting removal of the constraint and re-estimation of the model. Following this modification, one additional suprathreshold MI value (i.e., 10.38) was provided, supporting removal of the constrained residual correlation between tCho/tCr and mI/tCr (i.e., partially relaxing the assumption of local independence). The final MIMIC model, which included these two modifications to the baseline model, provided excellent fit to the data (CFI/TLI = 1.00, SRMR = 0.02). Figure 5 shows a diagram of the final MIMIC model, labeled by its standardized parameter estimates and their standard errors. The regression path from age (β = −0.09, p = 0.048), but not sex (β = −0.11, p = 0.066), to the CMV factor was statistically significant. However, the variance explained in the CMV factor by age was small (i.e., 2%), likely owing to the restricted age range of the sample (i.e., 18-35 years). The freed regression path from sex to tCho/tCr was statistically significant (β = −0.19, p < 0.001), as anticipated by its associated suprathreshold MI value.
Figure 5.
Final Multiple Indicator Multiple Causes (MIMIC) model of tCr-normalized metabolites, labeled with fully standardized parameter estimates (standard errors for these estimates appear alongside them in brackets). All parameter estimates except for those marked with a “*” were statistically significant at p < 0.05. Parameter estimates on top of single arrowed lines are regression coefficients representing the strength of the relationship between a given pair of variables or factors, while those on top of double arrowed lines are correlation coefficients representing the strength of the relationship between the unexplained variances of a given pair of metabolites. Parameter estimates adjacent to arrowed lines pointing to each metabolite or factor represent the variance in that metabolite or factor that was not explained by the model. Sex was coded: 0 = Male; 1 = Female. CMV = common metabolite variance, tNAA = total NAA, Glx = glutamate+glutamine, tCho = total choline, tCr = total creatine, mI = myo-inositol.
In summary, bivariate correlations among tCr-normalized metabolite variables were generally moderate in strength and uniformly positive and significantly different from zero. A single factor CFA model provided good fit to the data, with the CMV factor explaining 57% of the variance in each metabolite, on average. MIMIC modeling found that older participants had lower levels of CMV, and that, controlling for CMV levels, females had lower levels of total choline (i.e., tCho/tCr) than males.
3.2.2. Supplementary analysis of water-normalized estimates
As detailed in section 3.1, data from 8 participants were excluded from all analyses following visual inspection. Data from an additional 7 participants (n = 7, vendor = GE) were excluded from supplementary analyses of water-normalized data following metabolite quantification because their Glx/W (n = 3), mI/W (n = 2), tCho/W (n = 1), or tCr/W (n = 1) values were identified as extreme outliers (i.e., ≥ 3rd quartile + 3 * interquartile range), leaving 284 participants’ data available for supplementary analyses.
Figure 6 shows scatterplots and linear regression lines, along with Pearson correlation coefficients, for each pair of metabolite variables. Inter-metabolite correlations were small (e.g., Glx/W with tCr/W, r = 0.04) to moderate (e.g., Glx/W with tNAA/W, r = 0.38) in strength (Mr = 0.20, SDr = 0.11) and were uniformly positive. Six of the 10 estimated correlations were significantly different from zero (all p < 0.05).
Figure 6.
Scatterplot matrix, featuring linear fit lines, depicting bivariate associations between each pair of water-normalized metabolites (i.e., tNAA total NAA, Glx glutamate+glutamine, tCho total choline, tCr total creatine, mI myo-inositol).
EFA-estimated eigenvalues (1: 1.83, 2: 1.01, 3: 0.97, 4: 0.71, 5: 0.50) supported extraction of a single factor that explained 37% of total metabolite variance, and models with > 1 extracted factor failed to converge, as predicted. As a result, a single factor CFA model was constructed, which provided adequate (SRMR = 0.06) to poor (CFI = 0.70, TLI = 0.40) fit to the data. The constrained residual correlation between Glx/W and tCr/W was associated with the largest, suprathreshold MI value (i.e., 12.33), supporting removal of the constraint (i.e., partially relaxing the assumption of local independence) and re-estimation of the model. The re-estimated model provided adequate (TLI = 0.92) to good (CFI = 0.97, SRMR = 0.04) fit to the data, with no further suprathreshold MIs provided. Figure 7 shows a diagram of the final CFA model, labeled by its standardized parameter estimates and standard errors. Factor loadings were moderate-strong (Mβ = 0.49, SDβ = 0.18) and universally statistically significant (ps ≤ 0.001), with the CMV factor explaining between 11% (mI/W) and 67% (Glx/W) of the variance in each metabolite (Mr2 = 0.27, SDr2 = 0.23).
Figure 7.
Final confirmatory factor analysis model of water-normalized metabolites, labeled with fully standardized (ranging between “0” and “1”) parameter estimates (standard errors for these estimates appear alongside them in brackets). Parameter estimates on arrowed lines from the common metabolite variance (CMV) factor to metabolite variables represent the strength of the relationship between each metabolite and the CMV factor. Parameter estimates adjacent to arrowed lines pointing to each metabolite represent the unique metabolite variance (UMV) that was not explained by the CMV factor in that metabolite (and, its inverse, the amount of variance that was explained by the CMV factor). tNAA = total NAA, Glx = glutamate+glutamine, tCho = total choline, tCr = total creatine, mI = myo-inositol, W = water.
The baseline MIMIC model, with the CMV factor regressed on age and sex but with all paths from covariates to metabolite variables constrained to “0,” provided adequate (SRMR = 0.05) to inadequate (CFI = 0.86, TLI = 0.77) fit to the data. The constrained residual correlation between tCho/W and mI/W was associated with the largest, suprathreshold MI value (i.e., 10.62), supporting removal of the constraint (i.e., partial relaxation of the assumption of local independence) and re-estimation of the model. Following this modification, one additional suprathreshold MI value (i.e., 10.93) was found, supporting removal of the constrained path from sex to tCho/W. The final MIMIC model, which included these two modifications to the baseline model, provided excellent fit to the data (CFI/TLI = 1.00, SRMR = 0.02), but contained a Heywood case involving Glx/W (i.e., a small, negative residual variance of −0.05).37 To produce a final MIMIC model with interpretable parameter estimates, we constrained the residual variance of Glx/W to “0,” thereby resolving the Heywood case; results from this model should nonetheless be regarded with caution. Figure 8 shows a diagram of the final MIMIC model, labeled by its standardized parameter estimates and standard errors. Neither the regression path from age (β = 0.01, p = 0.891) nor that from sex (β = −0.04, p = 0.422) to the CMV factor was statistically significant. As with the tCr-normalized data, the freed regression path from sex to tCho/W was statistically significant (β = −0.20, p < 0.001), as anticipated by its associated suprathreshold MI value.
Figure 8.
Final Multiple Indicator Multiple Causes (MIMIC) model of water-normalized metabolites, labeled with fully standardized parameter estimates (standard errors for these estimates appear alongside them in brackets). All parameter estimates except for those marked with a “*” were statistically significant at p < 0.05. Parameter estimates on top of single arrowed lines are regression coefficients representing the strength of the relationship between a given pair of variables or factors, while those on top of double arrowed lines are correlation coefficients representing the strength of the relationship between the unexplained variances of a given pair of metabolites. Parameter estimates adjacent to arrowed lines pointing to each metabolite or factor represent the variance in that metabolite or factor that was not explained by the model. Sex was coded, 0 = Male, 1 = Female. CMV = common metabolite variance, tNAA = total NAA, Glx = glutamate+glutamine, tCho = total choline, tCr = total creatine, mI = myo-inositol, W = water.
In summary, bivariate correlations among water-normalized metabolite variables were generally small-to-moderate in strength and uniformly positive. A single CMV factor explained 37% of the total metabolite variance (i.e., ~20% less variance than was explained by the CMV factor from the tCr-normalized data), and MIMIC modeling found that, controlling for CMV levels, females had lower levels of total choline than males. Unlike analyses involving the tCr-normalized metabolite data, analyses involving the water-normalized data presented estimation problems, likely caused by the excess between-vendor variability contained in the water-normalized data.38
4. Discussion
The present study characterized associations among brain metabolite levels, applying bivariate and multivariate (i.e., factor analysis) statistical methods to tCr-referenced estimates of the major PRESS 1H-MRS metabolites (i.e., tNAA/tCr, tCho/tCr, mI/tCr, Glx/tCr), acquired at 3 Tesla from medial parietal lobe in a large (n=288), well-characterized international cohort of healthy volunteers.19 Results supported the hypothesis that 1H-MRS-measured metabolite estimates are moderately intercorrelated. A single factor, capturing variability in metabolite estimates common to (i.e., shared by) all metabolites, explained 57% of the variance in each metabolite (on average). Older age was significantly associated with lower levels of this common metabolite variance (CMV) factor, despite not being associated with levels of any individual metabolite. Holding CMV factor levels constant, females had significantly lower levels of total choline (i.e., unique metabolite variance or UMV), mirroring the significant bivariate correlation between sex and total choline reported previously.19 Supplementary analysis of water-normalized metabolite estimates demonstrated lower, though still substantial, intercorrelations among metabolites, with more than a third of total metabolite variance common to all metabolites. If replicated, these findings have broad and substantial implications for the analysis and interpretation of 1H-MRS data in applied studies.
Though both the present study and several applied studies6–10,16,17 have demonstrated that 1H-MRS acquired metabolite estimates are moderately intercorrelated, the vast majority of 1H-MRS applications have focused on one or a few theoretically selected metabolite estimates and their bivariate associations with specialty-dependent (e.g., clinical, behavioral) variables of interest, ignoring the potential influence of inter-metabolite covariance on these associations in their analyses and interpretations. If, however, as the results from the present study suggest, more than half of the total metabolite variance is common to all metabolites, such interpretations are better supported if the associations are driven by the other half of the variance in the selected metabolite(s). Because most applied studies have only evaluated one or a few metabolites using bivariate statistical tests, it is possible that some, many, or most reported associations of 1H-MRS metabolites with specialty-dependent variables of interest have been driven by CMV. Results from the present study suggest that applied researchers broaden their analytical approach to include more metabolites, statistically controlling for CMV using appropriate covariance modeling methods (e.g., multiple regression) in place of more commonly used bivariate methods (e.g., t-test).
What exactly is this CMV factor that appears to account for over half of the variance in PRESS 1H-MRS metabolite estimates? Does it represent one or more common underlying biological and/or methodological phenomena? The first common factor extracted from biological data is typically interpreted as a measure of the overall biological process (e.g., “cellular metabolism,” in the case of 1H-MRS data), with the relative strength of associations between this factor and its constituent metabolites providing additional interpretational clues.39 For example, although the CMV factor was significantly associated with each metabolite in the present study, it associated ~2x more strongly with, thereby explaining >3x more variance in, Glx/tCr versus mI/tCr (with tNAA/tCr > tCho/tCr intermediate), and unlike with Glx/tCr and tNAA/tCr, the CMV factor did not adequately explain the covariance between mI/tCr and tCho/tCr; similar results were obtained in the water-normalized data. Based on this pattern of findings, one could hypothesize that the observed CMV factor reflects variability in Glx and tNAA (each commonly referred to as “neuronal” metabolites) more than it reflects variability in tCho and mI (each commonly referred to as “glial” metabolites).4
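The step from "about twice as strongly" to "more than three times the variance" follows from a general property of standardized single-factor models: the share of a metabolite's variance explained by the factor equals its squared standardized loading. Writing \lambda_j for the standardized loading of metabolite j (notation introduced here for illustration only), the relationship can be sketched as:

```latex
R^2_j = \lambda_j^2,
\qquad
\frac{R^2_{\mathrm{Glx}}}{R^2_{\mathrm{mI}}}
  = \left( \frac{\lambda_{\mathrm{Glx}}}{\lambda_{\mathrm{mI}}} \right)^{2}
```

A loading ratio of about 1.8, for instance (an illustrative value, not an estimate from this study), would give a variance ratio of about 3.2.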
Because common factors are empirically derived, however, their meaning can arguably only be ascertained through the process of “construct validation”,40 whereby opposing predictions concerning the nature of CMV are systematically tested. For example, one way to determine whether CMV reflects biological (i.e., versus methodological) phenomena is to evaluate associations of CMV with substantive biological variables of interest (e.g., disease, age). In the present study, older age was significantly associated with lower levels of the CMV factor in tCr-normalized metabolite estimates, lending preliminary support to the hypothesis that CMV reflects biological processes, at least in part. Unfortunately, however, because participants in the present study fell within a restricted age range of 18–35 years,19 age could account for only a small amount of the variance in CMV (approximately 2%). Future studies featuring less restricted samples will be necessary to determine the true relationship between age and CMV. In contrast, sex, which was not restricted in the present sample, was not associated with the CMV factor, but was significantly associated with tCho UMV.
Conversely, one way to determine whether CMV reflects methodological phenomena is to evaluate the impact of reference signal choice (i.e., tCr vs. unsuppressed water) on inter-metabolite covariance, as the process of normalization itself has been shown to induce inter-metabolite covariance in NMR spectroscopy research.11 In the present study, supplementary analysis of water-normalized metabolite estimates demonstrated lower, though still substantial, inter-metabolite covariance, with 37% of total metabolite variance common to all metabolites. These supplementary findings demonstrate that inter-metabolite covariance is partially determined by reference signal choice (i.e., normalization), with roughly 20 percentage points of the common variance observed in the tCr-normalized data (57% vs. 37% of total metabolite variance) likely attributable to tCr specifically. Further supporting this interpretation, while CMV in tCr-normalized metabolite estimates was significantly associated with age, CMV in water-normalized metabolite estimates was not. Unfortunately, however, these interpretations must be tempered somewhat by the model estimation problems that were encountered with the water-, but not tCr-, normalized data. As noted in previous publications involving these data, Siemens unsuppressed water estimates differed substantially and systematically from GE and Philips estimates, introducing a problematically high level of between-vendor variability into the data, and resulting in calls for the wider development and adoption of “universal” (across-vendor) acquisition sequences to facilitate future multi-site 1H-MRS efforts.38 Overall, though not definitive by any means, these preliminary construct validity findings demonstrate the power of the statistical approach taken in the present study to distinguish between associations of specialty-dependent variables of interest with CMV versus UMV.
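The point that normalization can itself induce inter-metabolite covariance can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not an analysis of the study data: the three signals, their means and spreads, and all variable names are hypothetical, and the signals are generated to be mutually independent so that any correlation between the ratios arises only from the shared denominator.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical, mutually independent signals in arbitrary units (not study data).
signal_a = rng.normal(10.0, 1.0, n)   # stand-in for one metabolite signal
signal_b = rng.normal(12.0, 1.2, n)   # stand-in for a second metabolite signal
reference = rng.normal(8.0, 0.8, n)   # stand-in for a shared reference signal

# Correlation between the raw signals (independent by construction).
r_raw = np.corrcoef(signal_a, signal_b)[0, 1]

# Correlation after dividing both signals by the same reference.
r_ratio = np.corrcoef(signal_a / reference, signal_b / reference)[0, 1]

print(f"raw signals:          r = {r_raw:.2f}")
print(f"reference-normalized: r = {r_ratio:.2f}")
```

In this toy setting the raw signals are essentially uncorrelated, whereas the two ratios are clearly positively correlated, because variability in the shared reference enters both ratios. How much of the observed CMV reflects this mechanism in real data remains an empirical question.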
Another method for determining to what degree CMV reflects methodological versus biological phenomena, which was not possible to explore within the present dataset, is to examine its generalizability across MR field strengths and 1H-MRS acquisition methods. For example, if CMV reflects shared metabolite variance due to the relatively poor resolvability of metabolite signals at common clinical field strengths (1.5 or 3 T), one would predict it to be substantially reduced in metabolite data acquired at higher field strengths (e.g., 7 T) or with advanced acquisition techniques designed to minimize overlap (e.g., J-difference editing,41 two-dimensional J-resolved PRESS42). These techniques have the added advantage of potentially providing more metabolite variables for factor analyses, allowing for evaluation of more complicated models with multiple factors.12,13
Luckily, much of the work needed to replicate and validate the CMV factor found in the present study can be done using data that already exist. Authors of published applied 1H-MRS studies could “kill two birds with one stone” by separating CMV from UMV in their published data and then examining whether their reported bivariate associations were indeed driven by UMV or CMV, thereby simultaneously evaluating the validity of the CMV factor and disambiguating interpretation of their published findings. Analyses for such re-examinations could be accomplished either directly, within the factor model itself (i.e., provided “adequate” sample sizes, often defined in the hundreds) using specialized software like Mplus30 or the lavaan R toolbox,43 or indirectly, by first calculating factor and residual scores (i.e., the latter reflecting metabolite variance not attributable to the common factor plus random error) for each participant using software like IBM SPSS Statistics [IBM Corp., Armonk, N.Y., USA] or the psych R toolbox;44 the latter approach has the benefit of being relatively less sensitive to sample size (i.e., executable in samples of n ≥ ~50). In even smaller samples (e.g., n < 50), multiple regression (i.e., with one or more “metabolites of no interest” entered as covariates in the regression model alongside the focal metabolite) can still be used to examine the degree to which bivariate associations of metabolites of interest with specialty-dependent variables of interest are driven by UMV or CMV.18
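The multiple-regression option described above can be sketched with simulated data. This is a minimal illustration under stated assumptions rather than the analysis used in the present study: the loadings, sample size, outcome model, and the helper `ols_slopes` are all hypothetical, and the outcome is constructed to depend only on the common factor, so that the focal metabolite's bivariate association is driven entirely by CMV.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical common factor (CMV) and standardized loadings (illustrative values,
# not the estimates reported in this study).
cmv = rng.standard_normal(n)
loadings = np.array([0.8, 0.7, 0.6, 0.5])

# Unique parts (UMV plus error), scaled so each metabolite has unit variance.
unique = rng.standard_normal((n, 4)) * np.sqrt(1.0 - loadings**2)

# Column 0 is the focal metabolite; columns 1-3 are "metabolites of no interest".
metabolites = cmv[:, None] * loadings + unique

# Hypothetical outcome that depends only on the common factor.
outcome = 0.5 * cmv + rng.standard_normal(n)


def ols_slopes(X, y):
    """Return ordinary least squares slopes (intercept excluded)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1:]


# Bivariate model: outcome regressed on the focal metabolite alone.
bivariate_slope = ols_slopes(metabolites[:, [0]], outcome)[0]

# Covariate-adjusted model: the other metabolites are entered alongside the focal one.
adjusted_slope = ols_slopes(metabolites, outcome)[0]

print(f"bivariate slope for focal metabolite: {bivariate_slope:.2f}")
print(f"adjusted slope for focal metabolite:  {adjusted_slope:.2f}")
```

In this simulation the focal metabolite's slope shrinks once the other metabolites are entered as covariates, which is the pattern expected when a bivariate association is driven by CMV. The adjusted slope does not fall to zero here, because a handful of covariate metabolites are only imperfect proxies for the common factor; the factor-model and factor-score approaches described above address this more directly.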
In conclusion, the present study demonstrated in a large sample of healthy volunteers that tCr-normalized 1H-MRS metabolite measures are moderately intercorrelated and that over half of the variance contained in these measures is CMV, common to all metabolites. Preliminary construct validity results further demonstrated that older age was associated with lower CMV levels, and that controlling for CMV levels, female sex was associated with lower tCho/tCr UMV levels. Supplementary analysis of water-normalized metabolite estimates demonstrated lower, though still substantial, intercorrelations among metabolites, with over one third of total metabolite variance common to all metabolites. If replicated, these results would suggest that applied 1H-MRS researchers should shift their analytical framework from examining bivariate associations between individual metabolites and specialty-dependent (e.g., clinical) variables of interest (e.g., using t-tests) to examining multi-variable (i.e., covariate) associations between multiple metabolites and specialty-dependent variables of interest (e.g., using multiple regression).
Supplementary Material
Funding sources.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Dr. Prisciandaro was supported by NIH grants, R01 AA025365, R01 DA054275, and P50 AA010761 (PI: Howard Becker). Dr. Oeltzschner was supported by NIH grants, R00 AG062230 and R21 EB033516. Dr. Edden was supported by NIH grants, R01 EB016089, R01 EB023963, and P41 EB031771 (PI: Peter van Zijl).
List of abbreviations:
- CFA: confirmatory factor analysis
- CFI: comparative fit index
- CMV: common metabolite variance
- EFA: exploratory factor analysis
- Glx: glutamate plus glutamine
- MI: modification indices
- mI: myo-Inositol
- MIMIC: Multiple Indicators Multiple Causes
- MLR: maximum likelihood estimator with robust standard errors
- SRMR: standardized root mean square residual
- tCho: glycerophosphocholine plus phosphocholine
- tCr: creatine plus phosphocreatine
- TLI: Tucker Lewis Index
- tNAA: N-acetylaspartate plus N-acetylaspartylglutamate
- UMV: unique metabolite variance
- W: water
Footnotes
Declarations of interest. None.
Data availability statement.
The de-identified neuroimaging data and related materials are available publicly through NITRC at the following URL: https://www.nitrc.org/projects/biggaba/
References
- 1. de Graaf RA. In Vivo NMR Spectroscopy: Principles and Techniques. 3rd ed. Hoboken, NJ: Wiley; 2019.
- 2. Simpson R, Devenyi GA, Jezzard P, Hennessy TJ, Near J. Advanced processing and simulation of MRS data using the FID appliance (FID-A)-An open source, MATLAB-based toolkit. Magn Reson Med. 2017;77(1):23–33.
- 3. Provencher SW. Estimation of metabolite concentrations from localized in vivo proton NMR spectra. Magn Reson Med. 1993;30(6):672–679.
- 4. Rae CD. A guide to the metabolic pathways and function of metabolites observed in human brain 1H magnetic resonance spectra. Neurochem Res. 2014;39(1):1–36.
- 5. Scotti-Muzzi E, Umla-Runge K, Soeiro-de-Souza MG. Anterior cingulate cortex neurometabolites in bipolar disorder are influenced by mood state and medication: A meta-analysis of (1)H-MRS studies. Eur Neuropsychopharmacol. 2021;47:62–73.
- 6. Kraguljac NV, Reid MA, White DM, den Hollander J, Lahti AC. Regional decoupling of N-acetyl-aspartate and glutamate in schizophrenia. Neuropsychopharmacology. 2012;37(12):2635–2642.
- 7. Oeltzschner G, Wijtenburg SA, Mikkelsen M, et al. Neurometabolites and associations with cognitive deficits in mild cognitive impairment: a magnetic resonance spectroscopy study at 7 Tesla. Neurobiology of Aging. 2019;73:211–218.
- 8. He JL, Oeltzschner G, Mikkelsen M, et al. Region-specific elevations of glutamate + glutamine correlate with the sensory symptoms of autism spectrum disorders. Translational Psychiatry. 2021;11(1):411.
- 9. Steel A, Mikkelsen M, Edden RAE, Robertson CE. Regional balance between glutamate+glutamine and GABA+ in the resting human brain. NeuroImage. 2020;220:117112.
- 10. Oeltzschner G, Butz M, Baumgarten TJ, Hoogenboom N, Wittsack HJ, Schnitzler A. Low visual cortex GABA levels in hepatic encephalopathy: links to blood ammonia, critical flicker frequency, and brain osmolytes. Metab Brain Dis. 2015;30(6):1429–1438.
- 11. Capozzi F, Ciampa A, Picone G, Placucci G, Savorani F. Normalization is a Necessary Step in NMR Data Processing: Finding the Right Scaling Factors. In: Magnetic Resonance in Food Science: An Exciting Future. The Royal Society of Chemistry; 2011:147–160.
- 12. Brown TA. Confirmatory factor analysis for applied research. New York, NY, US: The Guilford Press; 2006.
- 13. Watkins MW. Exploratory Factor Analysis: A Guide to Best Practice. Journal of Black Psychology. 2018;44(3):219–246.
- 14. Malinowski ER. Theory of error for target factor analysis with applications to mass spectrometry and nuclear magnetic resonance spectrometry. Analytica Chimica Acta. 1978;103(4):339–354.
- 15. El-Deredy W. Pattern recognition approaches in biomedical and clinical magnetic resonance spectroscopy: a review. NMR in Biomedicine. 1997;10(3):99–124.
- 16. Mohamed MA, Lentz MR, Lee V, et al. Factor analysis of proton MR spectroscopic imaging data in HIV infection: metabolite-derived factors help identify infection and dementia. Radiology. 2010;254(2):577–586.
- 17. Yiannoutsos CT, Ernst T, Chang L, et al. Regional patterns of brain metabolites in AIDS dementia complex. NeuroImage. 2004;23(3):928–935.
- 18. Cohen J, Cohen P, West SG, Aiken LS. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. 3rd ed. New York, NY: Routledge; 2002.
- 19. Považan M, Mikkelsen M, Berrington A, et al. Comparison of Multivendor Single-Voxel MR Spectroscopy Data Acquired in Healthy Brain at 26 Sites. Radiology. 2020;295(1):171–180.
- 20. Mikkelsen M, Tapper S, Near J, Mostofsky SH, Puts NAJ, Edden RAE. Correcting frequency and phase offsets in MRS data using robust spectral registration. NMR in Biomedicine. 2020;33(10):e4368.
- 21. Near J, Harris AD, Juchem C, et al. Preprocessing, analysis and quantification in single-voxel magnetic resonance spectroscopy: experts’ consensus recommendations. NMR in Biomedicine. 2021;34(5):e4257.
- 22. Barkhuijsen H, de Beer R, van Ormondt D. Improved algorithm for noniterative time-domain model fitting to exponentially damped magnetic resonance signals. Journal of Magnetic Resonance (1969). 1987;73(3):553–557.
- 23. Klose U. In vivo proton spectroscopy in presence of eddy currents. Magnetic resonance in medicine. 1990;14(1):26–30.
- 24. Oeltzschner G, Zöllner HJ, Hui SCN, et al. Osprey: Open-source processing, reconstruction & estimation of magnetic resonance spectroscopy data. J Neurosci Methods. 2020;343:108827.
- 25. Zöllner HJ, Považan M, Hui SCN, Tapper S, Edden RAE, Oeltzschner G. Comparison of different linear-combination modeling algorithms for short-TE proton spectra. NMR in Biomedicine. 2021;34(4):e4482.
- 26. Gasparovic C, Song T, Devier D, et al. Use of tissue water as a concentration reference for proton spectroscopic imaging. Magnetic Resonance in Medicine. 2006;55(6):1219–1226.
- 27. Friston KJ, Ashburner J, Kiebel SJ, Nichols TE, Penny WD. Statistical Parametric Mapping: The Analysis of Functional Brain Images. 2007.
- 28. Lin A, Andronesi O, Bogner W, et al. Minimum Reporting Standards for in vivo Magnetic Resonance Spectroscopy (MRSinMRS): Experts’ consensus recommendations. NMR in Biomedicine. 2021;34(5):e4484.
- 29. Enders CK, Tofighi D. Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods. 2007;12(2):121–138.
- 30. Muthen LK, Muthen BO. Mplus: Statistical Analysis with Latent Variables: User’s Guide (Version 8). Los Angeles, CA: Authors; 2017.
- 31. Muthén BO, Satorra A. Complex Sample Data in Structural Equation Modeling. Sociological Methodology. 1995;25:267–316.
- 32. Stapleton LM. Variance Estimation Using Replication Methods in Structural Equation Modeling With Complex Sample Data. Structural Equation Modeling: A Multidisciplinary Journal. 2008;15(2):183–210.
- 33. Henning G. Meanings and implications of the principle of local independence. Language Testing. 1989;6(1):95–108.
- 34. Hooper D, Coughlan J, Mullen MR. Structural Equation Modelling: Guidelines for Determining Model Fit. The Electronic Journal of Business Research Methods. 2008;6:53–60.
- 35. Taasoobshirazi G, Wang S. The performance of the SRMR, RMSEA, CFI, and TLI: An examination of sample size, path size, and degrees of freedom. Journal of Applied Quantitative Methods. 2016;11:31+.
- 36. Joreskog KG, Sorbom D. LISREL8: User’s reference guide. Mooresville: Scientific Software; 1996.
- 37. Heywood HB, Filon LNG. On finite sequences of real numbers. Proceedings of the Royal Society of London Series A, Containing Papers of a Mathematical and Physical Character. 1931;134(824):486–501.
- 38. Mikkelsen M, Rimbault DL, Barker PB, et al. Big GABA II: Water-referenced edited MR spectroscopy at 25 research sites. NeuroImage. 2019;191:537–548.
- 39. Ripley BD. Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press; 1996.
- 40. Cronbach LJ, Meehl PE. Construct validity in psychological tests. Psychological Bulletin. 1955;52(4):281–302.
- 41. Rothman DL, Petroff OA, Behar KL, Mattson RH. Localized 1H NMR measurements of gamma-aminobutyric acid in human brain in vivo. Proc Natl Acad Sci U S A. 1993;90(12):5662–5666.
- 42. Ryner LN, Sorenson JA, Thomas MA. 3D localized 2D NMR spectroscopy on an MRI scanner. J Magn Reson B. 1995;107(2):126–137.
- 43. Rosseel Y. lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software. 2012;48(2):1–36.
- 44. Revelle W. psych: Procedures for Psychological, Psychometric, and Personality Research. Evanston, Illinois: Northwestern University; 2022.