Abstract
Background:
Most pediatric leukemia forms early in life, but early life biology and biomarkers for screening remain undefined. We performed direct molecular measurements in neonatal dried blood spots (DBS) and maternal pregnancy serum to identify early life biological signatures of pediatric acute lymphoblastic leukemia (ALL).
Methods:
In a nested case-control study design, we obtained mother-infant paired samples from second trimester pregnancy serum and neonatal DBS from 122 children diagnosed with pediatric ALL and 122 matched cancer-free controls. Using liquid chromatography–high resolution mass spectrometry, we performed untargeted metabolomics. The data-driven Reactomics approach was used to identify quantitative paired mass differences (qPMDs) that represent molecular changes in the samples. We identified qPMDs associated with ALL risk, and assessed linolenic and linoleic acid as potential ALL biomarkers.
Results:
Overall, the nine selected qPMDs in DBS were more strongly associated with ALL than the 16 qPMDs in maternal serum. Several of the selected qPMDs were highly correlated, suggesting that these qPMDs may represent biological reactivity hubs of metabolic pathways important in leukemogenesis. We also observed a suggestive positive, but not statistically significant, association of linolenic and linoleic acid in DBS with ALL among children diagnosed at ages 5 years or older (N=13) compared with matched controls (N=13).
Conclusions:
While biological interpretation of Reactomics analysis for clinical intervention is currently limited, our study supports the presence of molecular reaction changes in early life associated with later pediatric ALL.
Impact:
Reactomics analysis revealed potential biomarkers in neonatal samples linked with later diagnosis of ALL.
Keywords: metabolomics, Reactomics, pediatric leukemia, dried blood spots, discovery
1. Introduction
Most pediatric acute lymphoblastic leukemia (ALL) forms in utero and during early life (1,2). Besides well-established risk factors such as genetic predisposition syndromes and ionizing radiation, epidemiological studies of childhood ALL have identified candidate pre- and post-natal environmental, immune, and dietary risk factors including (but not limited to) exposures to pesticides and organic solvents, delayed exposure to infectious agents, lack of prenatal folate supplementation, and no/short duration of breastfeeding (3-5). While these studies provided invaluable information on the etiology of childhood ALL, they mostly relied on indirect exposure assessment using parental interviews and linkages of publicly available databases, and are therefore limited in their ability to precisely investigate underlying biological pathways. Direct untargeted molecular measures in pre-diagnosis samples are thus needed to discover biomarkers and the early life biology of pediatric ALL.
Metabolomics, the global profiling of circulating small molecules, has emerged as a powerful tool to discover biomarkers and biological pathways underlying the metabolic dependency of cancer initiation and cell growth (6,7). Studies comparing metabolomics profiles in biofluids and tissues of diagnosed leukemia patients have informed potential therapeutic targets, treatments, and biomarkers to aid diagnosis (8-14). Metabolomics has also been used in prospective studies of common adult cancers to identify biomarkers for early diagnosis and to study etiology. For example, choline, ethanolamine glycerophospholipids, and several amino acids were consistently associated with breast cancer risk, while tryptophan, 3-hydroxybutyric acid, and sebacic acid, among others, were consistently associated with colorectal cancer risk (15). However, the limited availability of pre-diagnosis biosamples in childhood health cohort studies (because of the rarity of the disease) has limited prospective metabolomics studies of pediatric leukemia.
In addition to traditional metabolite profiling, the high-resolution mass spectrometry data collected as part of a metabolomics experiment can be mined using a novel data-driven Reactomics approach (16). This discovery approach is not limited by individual metabolites and pathways, but instead, investigates differences in small molecules on a reaction level. In this approach, paired ions that have a mass difference commonly observed in biological processes in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (16) are identified, and only paired ions that are changing at a consistent rate across pairs are retained as they represent molecular reactions. Because the intensities of the molecular reaction ions are continuous variables for each sample, they can then be used in statistical analysis as potential biomarkers of a disease outcome.
Neonatal dried blood spots (DBS) are collected and archived as part of the California neonatal screening programs. They are a valuable resource for direct biological measures, such as metabolomics (17-19), because a newborn DBS represents a snapshot of circulating small molecules during neonatal life that can aid in the discovery of early biomarkers and etiological factors of pediatric cancers. A study in DBS suggested that early life alterations in inflammation and energy metabolism are associated with the risk of retinoblastoma (20). Our team used metabolomics to identify a set of sex-specific metabolites in DBS associated with later pediatric acute myeloid leukemia (21). Finally, our team identified unique DBS metabolite profiles, including linoleic and linolenic acid, associated with the risk of ALL in children 6-14 years old (22).
Replication of our previous ALL findings in an independent study is needed, and it is unknown whether biological signatures of ALL in newborn DBS reflect, or are independent of, signatures in paired second trimester maternal pregnancy blood. Using pre-diagnostic biospecimens of 122 pediatric ALL cases and 122 cancer-free controls, this study investigated metabolomics in paired DBS and maternal pregnancy serum to provide insights into early biomarkers of pediatric ALL. We used a novel Reactomics approach to identify reaction-level differences in circulating maternal pregnancy blood (serum) and neonatal blood (DBS) that are associated with later diagnosis of pediatric ALL as potential biomarkers. Further, to validate previous findings, we also used a hypothesis-driven approach to evaluate whether linolenic and linoleic acid in neonatal DBS and maternal serum were positively associated with pediatric ALL.
2. Materials and Methods
2.1. Study population
Children diagnosed with ALL at age 0–14 years (International Classification of Diseases for Oncology, 3rd edition, codes 9820, 9823, 9826, 9827, 9831–9837, 9940, 9948) (23) and controls matched 1:1 to cases on sex, year and month of birth, and race/ethnicity (Hispanic, non-Hispanic White, and other) were selected through probabilistic record linkage between electronic records from the California Cancer Registry (CCR) (1988–2011), birth data (1978–2009) maintained by the Center for Health Statistics and Informatics (California Department of Public Health [CDPH]), and archived biospecimens from the California Biobank Program (CBP), forming a case-control study nested within the California birth cohort. For this study, out of ~1,300 ALL cases born from 2000 to 2009, 137 had paired neonatal DBS (collected as part of the California statewide newborn screening program) and maternal pregnancy blood samples (collected as part of the prenatal screening program in five California counties) (24). Of those, 122 cases were randomly selected, and healthy controls with available biospecimens were matched to ALL cases by birth records, ensuring similar distributions of birth year, sex, and race/ethnicity. This subset of 122 cases and 122 controls was fairly representative of the overall childhood cancer linkage with respect to socio-demographic and birth characteristics (25,26). Since the timing of ALL initiation in early life is unknown, paired newborn DBS and second trimester maternal serum represent two different early-life time points for identifying biomarkers of ALL. Paired samples from the same participants were then analyzed for the current discovery study. The participants in the current study are also independent of those in our previous study of early life nutrition and development of ALL, in which we identified linolenic and linoleic acid as potential biomarkers (22), providing an opportunity for independent validation of these potential biomarkers.
Four to five 14-mm diameter blood spot specimens were collected from newborns on S&S filter paper (also known as Guthrie cards) by heel-stick after birth in accordance with the newborn screening program. Venous blood was collected between 15 and 20 weeks of gestation, and leftover maternal specimens (1–2 mL) were refrigerated for 1–2 days and then processed for long-term storage. The current analysis includes a total of 488 individual samples for 244 mother-child pairs (122 case pairs and 122 control pairs), including case/control status and potential confounding variables (Table S1). The study was approved by Institutional Review Boards for the California Health and Human Services and University of California, Berkeley and conducted according to the Declaration of Helsinki. Section 6505 of Title 17 of the California Administrative Code states that blood collected pursuant to the Newborn Screening Program may be used for research purposes without maternal consent if “the person or persons from whom these results were obtained” is not identified. In addition, we obtained a HIPAA waiver as part of our human subjects research protocol for the request of biospecimens from the CBP.
2.2. Metabolite extraction and liquid chromatography-high resolution mass spectrometry (LC-HRMS) analysis
Neonatal DBS were prepared for the present analysis using a methodology developed by our team (17,21,26,27). An aliquot of prepared DBS extract (26) was used for the current study. In short, 4.7-mm diameter DBS punches were extracted with 100 μL of water at room temperature (15 min, 1400 rpm), and a 5 μL aliquot was reserved for hemoglobin measurements to adjust for original blood volume. Next, 400 μL of acetonitrile containing isotopically labeled internal standards (IS) was added to the remaining aqueous solution containing the DBS punch, samples were agitated (1400 rpm, 37 °C, 1 h), and protein was precipitated at −20 °C for 30 min. Samples were centrifuged, and the supernatant was separated into two aliquots, evaporated to dryness, and stored at −80 °C until analysis. Stored maternal serum samples were thawed on ice, and 50 μL aliquots were combined with 150 μL of methanol containing IS. Each sample was vortexed, incubated at −80 °C for 30 min, evaporated to dryness, and stored at −80 °C until analysis. Immediately prior to analysis, samples were reconstituted and analyzed by LC-HRMS with reverse-phase chromatography in negative ionization mode using an Agilent 6550 iFunnel Q-TOF HRMS equipped with an Agilent 1290 Infinity II ultra-high-performance LC system (RRID: SCR_019433). For LC-HRMS analysis, the autosampler was held at 5 °C. A 2 μL volume of serum extract (or 10 μL volume of DBS extract), sandwiched between 10 μL of water, was injected onto a Zorbax SB-Aq analytical column (1.8 μm, 2.1 x 50 mm) with a Zorbax-SB-C8 cartridge (3.5 μm, 2.1 x 30 mm). Separation occurred using a gradient with buffer A (100% water with 0.2% acetic acid) and buffer B (100% methanol with 0.2% acetic acid). The gradient was set at 2% B and increased to 98% B in 13 minutes, followed by a 6-minute hold at 98% B, before dropping back to 2% B for a 2-minute post-run equilibration. Data were acquired in the range of 50-1000 m/z at 1.67 spectra/s in negative mode.
DBS and serum samples were analyzed separately. Pooled quality control (QC) samples were prepared by combining aliquots of the sample extracts (i.e. a QC pool of DBS extracts and a separate QC pool of serum which were extracted the same way as the samples) that were injected routinely throughout the run. These were used to monitor instrument stability and facilitate batch and run order correction.
Semi-quantitative measures for alpha-linolenic acid/gamma-linolenic acid (co-eluting) and linoleic acid were extracted from the DBS and maternal serum using Profinder software (RRID: SCR_017026) considering retention time, accurate mass, isotope distribution, and MS/MS fragmentation pattern matching with reference standards analyzed under the same conditions (matching retention time < 5 sec and delta m/z < 5 ppm).
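The accurate-mass and retention-time criteria above can be illustrated with a minimal Python sketch. The function names, the dictionary layout, and the example m/z and retention-time values are hypothetical and are not taken from Profinder; the isotope-distribution and MS/MS matching steps are not covered.

```python
def ppm_error(observed_mz, reference_mz):
    """Absolute mass error in parts per million relative to the reference m/z."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


def matches_standard(feature, standard, max_ppm=5.0, max_rt_diff_s=5.0):
    """True when a feature lies within the m/z (ppm) and retention-time (s)
    tolerances of a reference standard analyzed under the same conditions."""
    mass_ok = ppm_error(feature["mz"], standard["mz"]) < max_ppm
    rt_ok = abs(feature["rt_s"] - standard["rt_s"]) < max_rt_diff_s
    return mass_ok and rt_ok


# Hypothetical reference standard and two candidate features (illustrative values).
standard = {"mz": 279.2330, "rt_s": 600.0}
close_feature = {"mz": 279.2335, "rt_s": 602.0}  # about 1.8 ppm, 2 s away
far_feature = {"mz": 279.2400, "rt_s": 601.0}    # about 25 ppm away

is_close_match = matches_standard(close_feature, standard)
is_far_match = matches_standard(far_feature, standard)
```

Both tolerances are applied as strict upper bounds, mirroring the "< 5 sec" and "< 5 ppm" thresholds stated in the text.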
2.3. Data processing for untargeted data
Data extraction and preprocessing of the two data sets for neonatal and maternal specimens followed methods previously described (21,22,26,28). Raw data were converted into mzXML format and analyzed using the R programming platform (version 4.0.4, RRID:SCR_001905). The xcms package was used to extract peaks, align the peaks, group peaks into features, and fill the baseline for grouped peaks to produce a final peak table, using parameters optimized by the IPO package (29,30). We used the scone package (31) to select a data-driven optimized normalization for each of the DBS and maternal serum datasets.
For the 244 neonatal DBS, 2,142 metabolite features were measured. Metabolite features detected above the noise (determined by mean-difference plots), with low analytical variance (intra-class correlation cutoff of 0.75 determined by visual inspection of the empirical data), and with < 30% missingness were retained resulting in 1,479 metabolite features. k-nearest-neighbor was used for imputation (number of neighbors set to four) (32). Normalization included variables for case/control status, batch, DBS duration of storage, age at DBS collection, run order, hemoglobin, and trimmed mean of M values (TMM) scaling.
For the 244 maternal pregnancy samples, 3,802 metabolite features were originally measured. After filtering for metabolites above noise, low analytical variance (intra-class correlation cutoff of 0.7 determined by visual inspection of the empirical data), and low missingness, there were 2,459 metabolites remaining for statistical analysis (300 of which appeared in both neonatal and maternal datasets). Imputation was done with k-nearest neighbor (with the number of neighbors set to four) and the data were normalized using identity scaling (no scaling) and the variables: case/control status, batch, and the first three factors of unwanted variation using removing unwanted variation (RUV) analysis (31). Empirically derived negative controls were provided as inputs for the RUV analysis by taking the intersection of metabolites that showed the least evidence of differential abundance by case/control status in each batch using the edgeR package (33). Normalization included variables for maternal sample duration of storage, run order, and gestational age at collection.
2.4. Reactomics
The datasets contain a table of mass spectral features (identified by m/z and retention time) and their respective peak intensities measured by area under the peak for every sample. In a typical metabolomics experiment, these spectral features are annotated to metabolite and chemical names by matching to commercial and in-house libraries and databases (see section 2.2. example for linolenic and linoleic acid). However, Reactomics analysis uses the spectral data before annotation.
Reactomics was performed separately on the normalized untargeted metabolomics datasets for neonatal DBS and maternal serum. The GlobalStd algorithm (34) was used to remove redundant features such as isotopologues, adducts, and multi-charged ions. Paired mass distances (PMDs) in Daltons (Da) were computed by determining the mass difference between every spectral feature combination. These differences were then compared to a library of mass differences from the KEGG reaction database (see pmd package (16)). This library was built from the mass difference (in Da) associated with every biological reaction in KEGG. We then filtered the PMDs in the current dataset to retain only those also present in the KEGG reaction database, which we considered biologically relevant. We then calculated the quantitative PMDs (qPMDs), which are the PMDs that represent reaction-level changes across samples. PMDs from paired ions with consistent ratios across samples (i.e., static PMDs, residual standard deviation < 30%) (16) were considered qPMDs. For each sample, the peak intensities for both ions in the pair were summed and used as the value for that qPMD for statistical analysis. qPMDs were labelled with the number that represents the mass difference between the paired ions (in Da). This resulted in 78 and 269 qPMDs for neonatal DBS and maternal pregnancy serum, respectively, with 56 qPMDs found in both neonatal DBS and maternal pregnancy serum (Figure S1).
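The pairing, filtering, and summing steps described above can be sketched in plain Python. This is an illustration under stated assumptions, not the pmd package implementation: the function name `find_qpmds`, the mass tolerance `tol`, the two-entry reference list, and the intensity values are hypothetical, and the stability criterion is assumed to be the relative standard deviation of the pairwise intensity ratio across samples.

```python
from itertools import combinations
from statistics import mean, stdev


def find_qpmds(mz, intensities, reference_pmds, tol=0.01, max_rsd=30.0):
    """Return one record per ion pair whose mass difference matches a reference
    PMD and whose intensity ratio is stable across samples (RSD < max_rsd %)."""
    records = []
    for i, j in combinations(range(len(mz)), 2):
        diff = abs(mz[i] - mz[j])
        matched = [p for p in reference_pmds if abs(diff - p) <= tol]
        if not matched:
            continue  # mass difference is not in the reference library
        ratios = [a / b for a, b in zip(intensities[i], intensities[j])]
        rsd = 100.0 * stdev(ratios) / mean(ratios)
        if rsd < max_rsd:
            # The qPMD value for each sample is the summed intensity of both ions.
            summed = [a + b for a, b in zip(intensities[i], intensities[j])]
            records.append({"pmd": round(matched[0], 2), "pair": (i, j),
                            "rsd": rsd, "values": summed})
    return records


# Three features (m/z values as listed in Table 1) measured in three samples.
mz = [383.1888, 481.1866, 519.2271]
intensities = [
    [100.0, 200.0, 300.0],  # feature 0
    [50.0, 100.0, 150.0],   # feature 1: constant 2:1 ratio with feature 0
    [10.0, 300.0, 20.0],    # feature 2: unstable ratio with feature 0
]
reference_pmds = [98.00, 136.04]  # hypothetical stand-in for the KEGG-derived library

qpmds = find_qpmds(mz, intensities, reference_pmds)
```

In this toy input, both the 98.00 and 136.04 mass differences match the reference list, but only the pair with a stable intensity ratio is retained as a qPMD.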
2.5. Statistical analysis
Covariate distributions were compared between case-control maternal samples and case-control child samples using a Wilcoxon rank-sum test for continuous variables, a Chi-squared test for categorical variables, and Fisher’s exact test for binary covariates.
For all qPMDs, Spearman rank correlation coefficients were calculated for pairwise qPMDs in neonatal and maternal samples, respectively. Empirical cumulative distribution functions were generated for each set of correlation values (neonatal and maternal) separately and visualized. In addition, Spearman rank correlation coefficients were calculated for the matrix of pairwise correlations between the neonatal qPMDs for the cases and then the equivalent for the controls. For a given pair of qPMDs, the difference of the correlation coefficients (cases - controls) was plotted as a function of the mean of the two correlation coefficients (cases and controls) in mean-difference plots. The equivalent plot was generated for maternal qPMDs, except that it shows a random sample of 10% of the points to aid in visualization.
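The coordinates of such a mean-difference plot can be computed as in the following minimal sketch. The helper names and the toy values are hypothetical; Spearman correlation is implemented as the Pearson correlation of average ranks.

```python
from itertools import combinations


def ranks(values):
    """Average 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            out[order[k]] = avg
        pos = end + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def mean_difference(cases, controls):
    """For every qPMD pair, return (pair, mean of the two r values, r_cases - r_controls).
    cases / controls map a qPMD label to its values in that group."""
    points = []
    for a, b in combinations(sorted(cases), 2):
        r_case = spearman(cases[a], cases[b])
        r_ctrl = spearman(controls[a], controls[b])
        points.append(((a, b), (r_case + r_ctrl) / 2.0, r_case - r_ctrl))
    return points


# Toy data: the pair is perfectly concordant in cases and discordant in controls.
cases = {"q1": [1, 2, 3, 4], "q2": [2, 4, 6, 8]}
controls = {"q1": [1, 2, 3, 4], "q2": [4, 3, 2, 1]}
points = mean_difference(cases, controls)
```

With k qPMDs the function returns k(k − 1)/2 points, one per unordered pair.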
We used an ensemble feature selection method for data-driven discovery without hypothesis testing (22) to select qPMDs associated with case/control status. Briefly, a linear regression was fit for each qPMD. For neonatal DBS, the qPMD value was regressed on case/control status, adjusting for sex, race/ethnicity, mother’s education, delivery mode, and gestational birthweight category (binary based on being above or below the 90th percentile according to the INTERGROWTH 21 standards (35)). For maternal serum, the qPMD value was regressed on case/control status, adjusting for sex, race/ethnicity, and mother’s education. For both datasets, qPMDs were ranked by the nominal p-value associated with the t-test for the beta coefficient associated with case/control status. Next, a bootstrapped regularized regression (LASSO) was fit where case/control status was regressed on all the qPMDs along with the previously mentioned covariates. In this analysis, the qPMDs were ranked based on the proportion of the 500 bootstrapped iterations in which their associated regression coefficient in the model was non-zero. The same input was given to Random Forests (36) and variables were ranked by the mean decrease in Gini index. The top ranked qPMDs were selected based on whether a separation of at least 0.5 in mean decrease in Gini index existed between them and other qPMDs further down the ranking. The concordance of the rankings from the regression methods was used to select a set of features, which was then supplemented by the features selected by Random Forests to allow for non-linear relationships and interactions.
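The bootstrapped LASSO ranking can be illustrated with a small, dependency-free sketch. Several simplifications are assumptions made for illustration only: the outcome is modeled with a squared-error (linear) LASSO fitted by coordinate descent rather than whichever LASSO variant the authors used, covariates are omitted, and the penalty `lam`, the function names, and the toy data are hypothetical.

```python
import random


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate-descent LASSO minimizing (1/2n)||y - Xb||^2 + lam*||b||_1
    on mean-centered data. X is a list of rows; returns the coefficients."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    resid = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            col = cols[j]
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # constant column in this resample
            # Covariance of feature j with the partial residual (feature j added back).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta


def selection_frequency(X, y, lam, n_boot=500, seed=0):
    """Proportion of bootstrap resamples in which each coefficient is non-zero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j in range(p):
            if beta[j] != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Toy data: feature 0 tracks case status exactly, feature 1 is unrelated to it.
y = [float(i % 2) for i in range(40)]
X = [[float(i % 2), float((i // 2) % 2)] for i in range(40)]
full_fit = lasso_cd(X, y, lam=0.1)
freq = selection_frequency(X, y, lam=0.1, n_boot=50, seed=1)
```

Ranking features by `freq` mirrors the ranking by proportion of non-zero coefficients described above; the text uses 500 resamples, and the toy call uses 50 only to keep the example fast.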
For both neonatal and maternal qPMD data, descriptive analysis was performed using hierarchical clustering of pairwise Spearman correlation values using the R package superheat as well as PAM clustering (37) using 1 - r as the distance measure and selecting the number of clusters (between 2 and 15) that maximized the silhouette width. This analysis was performed with and without covariates.
In addition to the individual analyses detailed above, we also repeated the descriptive and statistical analysis on the data set generated by combining neonatal and maternal features, concatenating them for the mother-child pairs. To examine whether differences in feature variances between the neonatal and maternal data were influencing the results, feature selection of the joint data was repeated after mean-centering and scaling each qPMD by its standard deviation to create a normalized dataset.
We previously reported a positive association between alpha/gamma-linolenic acid and linoleic acid measured in newborn DBS with risk of pediatric ALL in children ≥ 6 years old, but not younger children aged 1–5 years (22). Using a similar analytical method, we tested this hypothesis in the current study population using univariate linear regression where, for each metabolite, its log abundance was regressed on case/control status adjusting for batch, run order, DBS duration of storage, hemoglobin, sex, and race/ethnicity. For each variable, the estimated regression coefficient associated with case/control status ($\hat{\beta}$) was exponentiated to acquire an estimate of the fold-change and a nominal 95% confidence interval was constructed using $\exp(\hat{\beta} \pm 1.96\,\widehat{\mathrm{SE}})$, where $\widehat{\mathrm{SE}}$ is the estimated standard error of $\hat{\beta}$. Hierarchical clustering was performed on pairwise Spearman correlation values for the selected qPMDs with the log abundance values of linoleic and linolenic acid both with and without adjusting for run order, DBS duration of storage, batch, and hemoglobin.
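As a small worked illustration of the fold-change estimate and its nominal 95% interval, the sketch below assumes natural-log abundances and a normal (Wald-type) interval with multiplier 1.96. The function name and the input values are hypothetical and are not results from this study.

```python
import math


def fold_change_ci(beta_hat, se, z=1.96):
    """Exponentiate a log-scale coefficient and the ends of its Wald interval.
    Returns (fold_change, lower, upper)."""
    return (math.exp(beta_hat),
            math.exp(beta_hat - z * se),
            math.exp(beta_hat + z * se))


# Hypothetical coefficient and standard error on the natural-log scale.
fold_change, lower, upper = fold_change_ci(beta_hat=0.20, se=0.15)
```

If abundances were instead log2-transformed, the same construction would apply with base 2 in place of the exponential.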
2.6. Data Availability
The data presented in this study are available upon request from the corresponding author and after approval from the California Health and Human Services. The data are not publicly available due to California Health and Human Services restrictions. The Reactomics database is freely available in the pmd R package.
3. Results
This study included 122 pediatric leukemia cases, 122 matched controls, and their mothers. Birth characteristics and other factors related to the collection of neonatal and maternal samples are presented in Table S1. Cases were diagnosed with ALL between one and seven years of age and were born between 2000 and 2008. There were more males than females (138 and 106, respectively) and more Hispanic children than non-Hispanic white children (138 and 73, respectively) in our study. Leukemia cases and controls at birth were similar in birth weight, gestational age at birth, delivery mode, and numbers of infants born large for gestational age (LGA). The neonatal DBS was collected four hours earlier in cases compared to controls (mean of 35 versus 32 hours after birth, respectively, p value = 0.049). Mothers of the child cases and controls were similar in age, week of gestation, and highest level of education at blood sample collection.
3.1. Data-driven discovery with untargeted data to identify possible reactive molecular biomarkers in the neonatal DBS and maternal serum samples
After filtering and applying the Reactomics algorithm, 78 and 269 qPMDs for neonatal DBS and maternal pregnancy serum, respectively, were used for analysis. Of these, 56 qPMDs overlapped between the neonatal DBS and maternal serum datasets (Figure S1). Pairwise associations between qPMDs were assessed separately in neonatal and maternal samples using Spearman rank correlation coefficients. Compared to maternal serum, DBS had a larger proportion of highly correlated qPMD pairs (r ≥ 0.5; Figure 1). This suggests that there might be overall tighter metabolic regulation in the neonates.
Figure 1.

Empirical cumulative distribution function (ECDF) of the upper-diagonal of the Spearman correlation matrix between pairs of qPMDs in neonatal DBS samples (78 qPMDs, green) and maternal samples (269 qPMDs, purple), where cases and controls are pooled. The value of the ECDF at r=0.5 is indicated with a dashed vertical line.
Next, Spearman correlation coefficients between pairs of qPMDs were compared between cases and controls in the DBS and maternal serum samples using mean-difference plots (Figure 2 a and b). For highly correlated qPMDs in the DBS (mean r from 0.5–1.0), there was a higher number of qPMD pairs below the zero-difference line, indicating overall higher correlation between qPMDs in the controls compared to cases. For moderately correlated qPMDs in the DBS (mean r from 0.25–0.5 on the x-axis), there was a higher number of qPMD pairs above the zero-difference line, indicating lower correlation in the controls compared to cases. Individual heatmaps of DBS qPMDs in the cases and controls show similar findings, with fewer qPMDs in the upper-right correlation cluster of the cases compared to controls (Figure S2).
Figure 2.

Spearman correlation coefficients of qPMDs in cases and controls suggest biological differences. A) Mean-difference plot of Spearman pairwise correlation coefficients for qPMDs in neonatal DBS cases and controls, where each point corresponds to one of 3,003 correlation pairs (for 78 individual qPMDs, ((78 x 78) - 78)/2 = 3,003). B) Same as in A, for maternal qPMDs, where, to aid visualization, we only plot means and differences for a random subset of 10% of the 36,046 correlation pairs of qPMDs (for the 269 qPMDs, ((269 x 269) - 269)/2 = 36,046, and 36,046 x 10% ≈ 3,605). The x-axis shows the average correlation coefficient for cases and controls and the y-axis shows the correlation coefficient difference (cases – controls). Blue line shows a difference of zero.
In the maternal serum mean-difference plot between cases and controls, highly correlated qPMDs (mean r from 0.5–1) were above the zero-difference line indicating more correlation in the cases than the controls (Figure 2b, Figure S3), opposite to what is observed in the newborns (Figure 2a). In the maternal serum, there were no observable differences between cases and controls for moderately correlated qPMDs (mean r from 0.25-0.5), with a similar distribution above and below the zero-difference line. However, in the maternal serum, slightly inversely correlated qPMDs (mean r from −0.25–0) were below the zero-difference line. Similar results can be seen in individual heat maps of case and control maternal serum, where there are dark bands of slightly negatively correlated qPMDs in controls but not in cases (Figure S4).
Hierarchical clustering results for the joint analysis of neonatal and maternal qPMDs together indicate only partial overlap between the qPMDs in DBS and serum (Figure S5), suggesting that both the DBS and maternal serum contribute unique reaction-level data.
3.2. Reaction-level differences (qPMDs) in relation to risk of pediatric ALL
To identify qPMDs in neonates associated with later ALL diagnosis, we performed an ensemble feature selection on the DBS data. We identified nine qPMDs associated with case/control status (Figure 3a, Table 1). Six qPMDs, 149.08, 206.08, 5.99, 72.02, 64.02, and 41.06, were in higher abundance in DBS of the cases than the controls. Three qPMDs, 136.04, 98.00, and 30.01, were in lower abundance in the DBS of the cases than the controls. These nine selected qPMDs formed three correlation clusters (Figure S6). qPMDs 64.02 and 72.02 were moderately correlated (r = 0.46), while 30.01, 136.04, and 98.00 were strongly correlated (r = 0.56–0.91) and 149.08, 206.08, 5.99, and 41.06 were weakly to strongly correlated (r = 0.15–0.87).
Figure 3.

Measured qPMDs are associated with leukemia. Numbered qPMDs were selected by the ensemble feature selection method for A) DBS, B) maternal serum, and C) joint analysis with DBS and maternal serum. Estimated case-control fold-change from the linear regression is plotted on the x-axis and the negative log of the associated nominal p-value is plotted on the y-axis. Red points indicate the qPMD was selected through the concordance of the regression methods. For the joint analysis, orange points use a more lenient concordance cutoff point. Blue points are qPMDs that were selected only in Random Forests. The dashed horizontal line corresponds to a nominal p-value of 0.05 for reference. In the joint analysis, qPMDs labeled with a “b” are from the neonatal data and qPMDs labeled with an “m” are from the maternal data. (Note the different x- and y-axis scales.)
Table 1.
Unique qPMDs measured in neonatal DBS or maternal second trimester serum are associated with pediatric ALL. qPMDs are described by m/z, elemental composition of reaction, potential enzyme involved, and KEGG reaction class ID to aid interpretation.
| Selected qPMDs | m/z involved | Elemental composition of reaction* | Potential enzyme involved via KEGG | KEGG reaction class ID |
|---|---|---|---|---|
| Neonatal DBS | | | | |
| 98.00 | 481.1866; 383.1888 | +4C2H3O | 6.3.2.46; 3.7.1.2 | RC00096; RC00957; RC00326; RC00446 |
| 136.04 | 519.2271; 383.1888 | +4C8H5O | | RC02065 |
| 30.01 | 511.1969; 481.1866; and 493.3885; 463.3772 | +C2HO | 2.2.1.1; 2.2.1.2; 2.2.1.5; 1.14.13.0; 1.14.14.42 | RC00001; RC02295; RC00014; RC02295 |
| 149.08 | 164.0354; 313.1193 | +9C11HON | | RC01859; RC02353 |
| 206.08 | 329.1137; 123.0323 | +9C10H2O4N | 6.1.2.2 | RC00041 |
| 72.02 | 311.2223; 383.2415 | +3C4H2O | 1.14.11.43; 1.14.11.44 | RC03192; RC03193 |
| 41.06 | 662.3637; 621.3033 | +3C7HN/−O | 3.5.4.42 | RC00458; RC01422 |
| 64.02 | 323.2585; 259.2430 | +C4H3O | | RC02682 |
| 5.99 | 335.1008; 329.1137; and 529.4246; 523.4347 | +3N/−4H2O | | RC02945 |
| 29.97† | 603.1687; 573.1961; and 509.3834; 479.4093 | +2O/−2H | 1.7.1.0 | RC00001; RC01333 |
| Maternal Pregnancy Serum | | | | |
| 118.08 | 317.2502; 199.1702 | +9C10H | 1.13.11.70 | RC01329; RC01330 |
| 90.05 | 385.2794; 295.2274 | +7C6H | 1.3.0.0; 2.8.3.15; 3.1.1.70 | RC01576; RC01643; RC00014; RC00137; RC00020; RC00041 |
| 236.05 | 825.4777; 589.4314 | +8C12H8O | | RC02202 |
| 0.08 | 335.2949; 335.2196 | +9C2H/−3O2P | 2.5.1.138 | RC01869 |
| 147.07 | 414.3073; 267.2327 | +9C9HON | 2.3.2.0 | RC0004; RC00055 |
| 52.07 | 183.1026; 131.0347 | +5C8H/−O | 1.13.11.59 | RC00912; RC01690 |
| 129.06 | 729.4411; 600.3806; and 260.0994; 131.0347; and 957.5775; 828.5150 | +9C7HN | 6.3.2.0 | |
| 84.12 | 344.2155; 260.0994; and 624.4864; 540.3659 | +4C16H5O/−2NS | 2.1.1.0 | RC03407 |
| 149.08 | 409.1837; 260.0994 | +9C11HON | | RC01859; RC02353 |
| 127.10 | 258.1392; 131.0347; and 387.2022; 260.0994 | +7C13HON | | RC01908 |
| 89.00 | 336.1164; 247.1184 | +3C3H4O/−N | 4.1.3.27 | RC02148; RC02414 |
| 21.00 | 336.1164; 315.1123 | +8CH3S/−6ON2P | 2.4.1.0 | RC00005; RC00049 |
| 64.03 | 425.1785; 361.1499 | +5C4H | 2.3.1.246 | RC00004; RC02933 |
| 99.05 | 699.4321; 600.3806 | +8C5HN/−O | 4.2.1.20; 4.2.1.122 | RC00209; RC00210 |
| 41.91 | 489.1018; 447.1948; and 527.2169; 485.3111 | +6O2P/−5C4N | 2.4.2.8 | RC00063 |
| 266.26 | 926.6089; 660.3488 | +18C34HO | 2.3.1.288; 6.2.1.57 | RC00004; RC00055 |
* Elemental composition changes are written with the number of elements preceding the element.
† qPMD 29.97 was selected in the joint analysis combining qPMDs from the maternal and neonatal datasets to test for associations with pediatric leukemia.
Feature selection in the second-trimester serum of case mothers and control mothers resulted in 16 qPMDs associated with case/control status (Figure 3b, Table 1). One of them, qPMD 149.08, was also selected in the DBS. qPMDs 21.00, 147.07, 127.10, 0.08, 149.08, 236.05, 41.91, 129.06, 84.12, 52.07, 64.03, 99.05, 89.00, and 118.08 were positively associated with leukemia status, while qPMDs 266.26 and 90.05 were negatively associated with leukemia status. These 16 qPMDs formed four correlation clusters (Figure S7). qPMDs 236.05, 0.08, and 147.07 were highly correlated (r > 0.72), while qPMDs 52.07, 129.06, 84.12, 149.08, 21.00, 89.00, 64.03, and 127.10 formed a cluster with low to high correlation (r = 0.25–0.95). qPMDs 266.26 and 41.91 did not cluster with any of the other qPMDs (Figure S7).
Feature selection in the joint datasets resulted in a total of 11 qPMDs, with seven positively associated with leukemia and four negatively associated with leukemia (Figure 3c, Table 1). Selected qPMDs in the DBS had larger fold-changes and smaller p-values than the selected qPMDs from the maternal serum. Only one new feature from the neonatal data (29.97) was selected in the joint analysis but not in the individual DBS analysis, while two qPMDs selected from the individual DBS analysis (41.06 and 64.02) and 13 qPMDs from the individual maternal serum analysis (90.05, 127.10, 0.08, 41.91, 236.05, 129.06, 84.12, 52.07, 64.03, 89.00, 118.08, 149.08, and 99.05) were not selected in the joint analysis, indicating that the qPMDs in the neonatal data capture most of the leukemia-related biological information relative to the qPMDs in the maternal data. Indeed, qPMD 149.08 in the maternal serum was dropped from the joint feature selection while qPMD 149.08 from the neonatal blood was retained. While this suggests that these qPMDs represent similar biological information in both samples, correlation between these qPMDs is low (r = 0.14, Figure S8). Overall, selected qPMDs associated with leukemia from maternal serum have only moderate to low correlation with qPMDs associated with leukemia from DBS (Figure S8, r ≤ 0.21). Therefore, qPMDs from the maternal serum provide less information related to leukemia, although new information is provided by both matrices.
Repeated analysis on the joint dataset that was centered and scaled showed equivalent results.
The selected qPMDs associated with leukemia in DBS and maternal serum are summarized in Table 1. In DBS, m/z metabolite feature 383.1888 was measured in both qPMDs 98.00 and 136.04 and may be a product involved in both reactions. In serum, m/z feature 131.0347 was measured as a potential product of qPMDs 52.07, 129.06, and 127.10, while m/z feature 336.1164 was measured as a potential reactant of both qPMDs 89.00 and 21.00. Since the levels of these m/z metabolite features change with multiple qPMDs, they may be particularly relevant to leukemogenesis and are potential targets for future investigations.
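As a rough illustration of how one m/z feature can take part in more than one paired mass difference, the following minimal Python sketch groups feature pairs by their rounded mass difference and then looks for features shared across the differences of interest. It is not the pmd R package workflow: it ignores retention time, adduct and isotope filtering, and the quantitative step that defines a qPMD. The feature 383.1888 and the differences 98.00 and 136.04 come from the text; the two partner masses and all function names are hypothetical and chosen only so the arithmetic works.

```python
from collections import defaultdict
from itertools import combinations


def paired_mass_differences(mz_values, digits=2):
    """Group every pair of m/z features by their rounded mass difference."""
    pairs_by_pmd = defaultdict(list)
    for low, high in combinations(sorted(mz_values), 2):
        pmd = round(high - low, digits)
        pairs_by_pmd[pmd].append((low, high))
    return dict(pairs_by_pmd)


def shared_features(pairs_by_pmd, pmds_of_interest):
    """Return the features that take part in every listed mass difference."""
    feature_sets = []
    for pmd in pmds_of_interest:
        members = set()
        for low, high in pairs_by_pmd.get(pmd, []):
            members.update((low, high))
        feature_sets.append(members)
    return set.intersection(*feature_sets) if feature_sets else set()


# 383.1888 is reported in the text; the two lighter masses are hypothetical
# partners picked so that the differences equal 136.04 and 98.00.
mz = [247.1488, 285.1888, 383.1888]
pmds = paired_mass_differences(mz)
hubs = shared_features(pmds, [98.0, 136.04])
print(hubs)
```

In this toy input the heavier member of both pairs is the same feature, which mirrors the reading in the text that a feature appearing in several mass differences may be a common product.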
The element composition changes of neonatal DBS qPMDs involved addition of 1, 3, 4 and 9 carbon atoms, while the maternal pregnancy serum qPMDs involved addition of carbon chains of 3, 4, 5, 7, 8, 9 or 18. In particular, compositional changes involving 8 and 9 carbon chains were enriched in the selected qPMDs compared to the KEGG database (Figure S9), indicating that these may be particularly important in the development of pediatric leukemia. Indeed, the qPMD of 149.08 linked with the elemental +9C11HON product was selected in both maternal serum and DBS. While the m/z features of this qPMD were different for the two sample types (consistent with low correlation, r = 0.14), this further points to the potential importance of nine carbon reactions in early leukemia development which may transfer from mothers to fetus.
3.3. Correlations between selected qPMDs and covariates
To investigate factors that could potentially confound relationships between qPMDs and risk of childhood ALL, the continuous covariates of maternal age, birthweight, and gestational age at birth, as well as the categorical covariates of child’s sex, race/ethnicity, LGA, and maternal education, were evaluated because they have been suggested as risk factors for ALL. The relative predictive ability of these covariates compared to the selected qPMDs was assessed with mean decrease in Gini Index using Random Forests. All selected qPMDs in the neonatal samples were better predictors of ALL than the available covariates, although birthweight, gestational age, and maternal age showed a much more substantial mean decrease in Gini Index than the other covariates (Figure S10). The low correlation of maternal age, birthweight, and gestational age with the selected neonatal qPMDs (Figure S11) suggests that these covariates have independent relationships with ALL and did not confound the neonatal qPMDs.
For the maternal pregnancy serum, 14 out of the 16 selected qPMDs were better predictors of childhood ALL than covariates. Like the DBS, gestational age, birthweight and maternal age showed more substantial mean decrease in Gini Index than the other covariates (Figure S12). While gestational age and birthweight were only moderately correlated and clustered together (Figure S13), maternal age clustered with qPMD 99.05 and 41.91, albeit with low correlation (r = 0.16 and 0.1, respectively). Therefore, maternal age may confound relationships between serum qPMDs and ALL in infants.
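To make the importance metric used in this section more concrete, the sketch below computes the Gini impurity decrease produced by the best single threshold split on one variable. This is only the building block: a Random Forest averages such decreases over every split that uses the variable across many bootstrapped trees, which this sketch does not do. The data are invented for illustration, and the names `qpmd_like` and `covariate_like` are hypothetical.

```python
def gini(labels):
    """Gini impurity for binary labels coded 0/1."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable with one threshold split."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try each observed value (except the largest) as a split threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


# Invented toy data: 1 = case, 0 = control.
status = [1, 1, 1, 0, 0, 0]
qpmd_like = [2.1, 1.9, 2.4, 0.8, 1.1, 0.7]       # separates cases from controls
covariate_like = [3.2, 2.9, 3.4, 3.3, 3.0, 3.1]  # overlaps heavily

print(best_gini_decrease(qpmd_like, status))
print(best_gini_decrease(covariate_like, status))
```

A variable that separates cases from controls cleanly yields a larger decrease than one whose values overlap, which is the sense in which the selected qPMDs are described as better predictors than the covariates.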
3.4. Hypothesis testing of linolenic and linoleic acid in relation to the risk of pediatric ALL
Based on our previous finding that linolenic and linoleic acid measured in DBS were positively associated with ALL (22), we assessed the association between these metabolites measured in newborn DBS and maternal serum in the current population. Linolenic and linoleic acid were highly correlated within the DBS and maternal samples (r = 0.86 and 0.84, respectively), but correlation between DBS and maternal serum was low for linolenic acid (r = 0.13) and linoleic acid (r = 0.18) (Figure S14).
When all cases were included in the analysis, we found slightly higher levels of linoleic acid in DBS of children who went on to develop leukemia than in DBS of children who did not develop ALL (Table 2). There was no difference in levels of alpha/gamma-linolenic acid or linoleic acid measured in maternal serum or DBS in the subset of cases who developed leukemia at the age of 1-5 years. However, imprecise positive associations were found between linolenic acid and linoleic acid in DBS and ALL in the subset of children diagnosed at ages 6 to 7, as well as between linoleic acid in maternal serum and ALL in the same subset. Nevertheless, wide confidence intervals were observed for all tests (Table 2).
Table 2.
Log linolenic acid and linoleic acid metabolite levels and pediatric ALL risk.
| Population | Neonatal DBS: fold-change* for alpha/gamma-linolenic acid [95% CI] | Neonatal DBS: fold-change* for linoleic acid [95% CI] | Maternal 2nd trimester serum: fold-change* for alpha/gamma-linolenic acid [95% CI] | Maternal 2nd trimester serum: fold-change* for linoleic acid [95% CI] |
|---|---|---|---|---|
| All cases diagnosed at age 1–7 years (N=122) and controls (N=122) | 0.985 [0.883, 1.010] | 1.010 [0.931, 1.095] | 0.999 [0.871, 1.147] | 0.995 [0.946, 1.047] |
| Cases diagnosed at age 6–7 years (N=13) and controls (N=13) | 1.099 [0.726, 1.665] | 1.024 [0.757, 1.385] | 0.928 [0.506, 1.703] | 1.032 [0.830, 1.283] |
| Cases diagnosed at age 1–5 years (N=109) and controls (N=109) | 0.973 [0.866, 1.094] | 1.006 [0.922, 1.097] | 1.001 [0.865, 1.158] | 0.997 [0.943, 1.053] |
*Fold-change is $e^{\beta}$, where $\beta$ is the multivariable linear regression coefficient estimate for case/control status, adjusted for batch, run order, DBS duration of storage, hemoglobin, sex, race/ethnicity, maternal education level, delivery mode, LGA, and age at blood collection in DBS analyses, and for batch, run order, sample duration of storage, gestational age at collection, sex, race/ethnicity, and maternal education level in maternal serum analyses.
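The relationship between a regression coefficient on the log scale and the fold-changes in Table 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: it assumes metabolite levels were natural-log transformed, that fold-change is the exponentiated coefficient, and that the 95% confidence interval is symmetric on the log scale with a normal quantile of 1.96. The function names are hypothetical; the three input numbers are the DBS linolenic acid values for cases diagnosed at age 6–7 years in Table 2.

```python
import math


def fold_change_with_ci(beta, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald interval."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


def beta_and_se_from_ci(fold_change, lower, upper, z=1.96):
    """Back-calculate the log-scale coefficient and standard error."""
    beta = math.log(fold_change)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Table 2, neonatal DBS, alpha/gamma-linolenic acid, cases diagnosed at 6-7 years.
beta, se = beta_and_se_from_ci(1.099, 0.726, 1.665)
fc, lower, upper = fold_change_with_ci(beta, se)
print(round(beta, 3), round(se, 3))
print(round(fc, 3), round(lower, 3), round(upper, 3))
```

Because the reported interval spans 1, the corresponding log-scale coefficient is within about two standard errors of zero, which is consistent with the text describing these associations as imprecise.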
We also assessed whether linoleic and linolenic acid in DBS were correlated with the DBS qPMDs associated with leukemia. Both linoleic and linolenic acid were moderately correlated with qPMD 72.02 (r = 0.56 and 0.62 respectively) (Figure S15).
4. Discussion
The majority of childhood ALL initiates in-utero (1). However, the early life pathophysiology remains undefined and robust biomarkers for neonatal screening remain elusive. Untargeted metabolomics, which provides an unbiased measure of circulating metabolites, enables discovery of underlying biological pathways and potential biomarkers. However, most metabolomics studies have been performed only in children already diagnosed with leukemia (8,10,11), which masks predictive markers. In contrast, our team has previously used archived newborn DBS to interrogate neonatal metabolomics profiles associated with later diagnosis of ALL and identified a positive association between linolenic and linoleic acid in DBS and ALL diagnosed in children ≥ 6 years, but not in children who were diagnosed at the age of 1-5 years. In this previous study, linolenic acid and linoleic acid were also inversely correlated with breastfeeding duration and positively correlated with maternal pre-pregnancy body mass index (22). Because these fatty acids are present at higher levels in formula than in colostrum (38), the breastmilk available to newborns when DBS are typically collected (24-48 h postdelivery), and because newborn linoleic acid levels were positively correlated with maternal obesity and gestational weight gain in other studies (39), we hypothesized that linoleic and linolenic acid may be related to early life nutritional factors as risk factors of childhood ALL. Therefore, in the current study, we performed untargeted metabolomics on an independent set of DBS from 122 newborns who developed pediatric ALL and 122 matched controls, which included mostly children diagnosed early (89% diagnosed between 1-5 years old). The levels of linolenic and linoleic acid in DBS of newborns who went on to develop ALL at 6-7 years of age (N=13) appeared to be somewhat higher compared to controls (N=13).
However, low overall correlation between 2nd trimester serum and newborn DBS levels of both linolenic and linoleic acid suggests that associations between these fatty acids and ALL may reflect neonatal as opposed to early prenatal etiological factors. Both observations need to be interpreted cautiously given the small sample size.
We performed metabolomics analysis on the newborn DBS as well as on paired maternal second trimester serum from all 244 participants (122 DBS cases and 122 DBS controls). To our knowledge, this is the first metabolomics study of maternal pregnancy samples and pediatric leukemia. We used Reactomics analysis for discovery of molecular reactions (i.e., qPMDs) in both the maternal serum and paired DBS samples. Overall, we found fewer qPMDs in the DBS than in the maternal serum, likely due to decreased detection in the DBS because of less starting material and overall lower concentrations in the 5-mm punch of DBS (10-μL whole blood equivalent) compared to 50 μL of serum. Correlation analysis was performed, as correlated features are likely co-regulated or functionally related to members of the same biological processes (40). Therefore, differences in correlation between qPMDs in cases and controls may reflect depletion or accumulation of metabolites, alterations in enzyme activity, and the remodeling of metabolic pathways. Maternal second trimester serum and newborn DBS reflect different developmental periods, with different metabolic profiles (41). In the current study, qPMDs measured in both DBS and maternal blood captured biologically relevant reactions with correlation structures. Overall, the qPMDs in the DBS were more highly correlated than those in second trimester serum. While this suggests that there may be tighter metabolic regulation in neonates compared to pregnant mothers, it may also be a result of differences in the qPMDs measured in each sample type.
Overall, qPMDs measured in the DBS were more highly correlated in the newborns who developed ALL than in controls, while qPMDs measured in the maternal serum were more correlated in the controls than in mothers of newborns who developed ALL. Using differential analysis, we identified several qPMDs that discriminated ALL cases and controls in this study, indicating that reaction level differences in metabolism observed in utero and around birth are linked to later development of pediatric leukemia. While more qPMDs were selected in the mothers (16 qPMDs) than in the neonates (9 qPMDs), the qPMDs in the neonates had larger effect sizes and stronger associations with ALL than the qPMDs selected in the serum. Therefore, the ALL-associated qPMDs from the maternal serum provide less (albeit unique) information related to leukemogenesis than the ALL-associated qPMDs from the DBS. This was surprising, since the overall sample volume was smaller in newborn DBS, resulting in fewer total qPMDs compared to maternal serum. Indeed, this suggests that biological signals of leukemogenesis may be more pronounced around the time of birth than in the second trimester of gestation. A broader range of molecular analysis, including lipidomics, as well as replication in an independent cohort, will provide further insights into whether maternal blood is a viable sample for biomarker discovery in the context of pediatric ALL.
Selected qPMDs in DBS and maternal serum were both positively and negatively associated with ALL, suggesting a higher abundance and lower abundance of biological reactions that occur, respectively, in cases compared to controls. Since several of the qPMDs associated with ALL also showed moderate to high correlation, these qPMDs may represent biological reactivity hubs of metabolic pathways that may be important in ALL development, similar to the correlated networks observed to characterize tumor metabolism (42). Interestingly, the covariates birthweight, maternal age, and gestational age were not correlated with selected qPMDs, suggesting that early life biological reactions associated with ALL are independent of these risk factors. However, unlike our previous study of metabolomics in newborn DBS and ALL (22), the data linkages for the current study participants lacked information on breastfeeding, maternal body mass index, and socioeconomic status. In addition, controls were not matched on geographic location or other potential exposure risk factors, which may lead to confounding or selection biases.
Biological interpretation of the qPMDs associated with ALL is currently limited by the novelty of the method and availability of databases. To aid in interpretation, we identified potential enzymes and biological reactions from the KEGG database that are consistent with the qPMDs that discriminate ALL cases from controls. The potential enzymes involved in the qPMDs associated with leukemia are broad and include oxidoreductases, transferases, hydrolases, lyases, and ligases. In particular, ligases may be important enzyme reactions in DBS associated with pediatric leukemia, while transferases may be particularly important enzyme reactions in maternal serum associated with leukemia in offspring. Transferases are involved in many biological processes including glutathione reactions involved in cellular detoxification of endogenous compounds (43) and of toxic carcinogens (44) such as polycyclic aromatic hydrocarbons that are also associated with increased pediatric ALL (45). Transferases are also involved in glutaminolysis (46), which may initiate cancer and alter central carbon metabolism (11). Ligases are involved with cancer initiation and progression (47). Given the lack of correlation of qPMDs with measured covariates (i.e., birthweight, maternal age, gestational age) and the activities of the KEGG reactions, it is likely that other, unmeasured risk factors such as environmental exposures are related to the qPMDs associated with ALL risk discovered here. This hypothesis is further supported by the observation that more qPMDs were associated with increased risk of ALL than decreased risk of ALL in the maternal serum, including potential transferases (qPMDs 64.03, 41.91, and 0.08). While linolenic and linoleic acid in DBS were moderately correlated with qPMD 72.02, the potential KEGG reactions of RC03192 or RC03193 for this qPMD do not directly indicate involvement of linoleic or linolenic acid, so the interpretation of this relationship remains unclear.
Unfortunately, specific enzymes cannot be elucidated from these data to further aid interpretation of the qPMD findings. However, our discovery study provides evidence in support of additional proteomics studies to characterize these reactions.
Our study supports a possible association between neonatal levels of linolenic and linoleic acid and childhood ALL, but the sample size is limited. In the discovery study, we identified new dysregulated metabolic reactions during the second trimester of pregnancy and at birth that were associated with later development of pediatric ALL. In particular, biological reaction markers (i.e., qPMDs) were better predictors of ALL than covariates, and qPMDs in newborn DBS were better predictors of ALL than qPMDs in maternal serum. Apart from qPMD 149.08, whose underlying m/z features differed between the two sample types, we did not observe overlap between qPMDs selected in DBS and maternal serum. The results highlight the importance of direct measurements for identifying biomarkers and risk factors of pediatric ALL. Reactomics analysis is a novel approach to investigate overall biological changes linked with pediatric cancer. However, interpretation of the biological changes for clinical intervention is currently limited, and the utility of Reactomics analysis may be limited to biomarker discovery for early cancer screening.
Acknowledgements:
Research reported in this publication was supported by the National Institute of Environmental Health Sciences of the National Institutes of Health under award # P01ES018172 and P50ES018172 and by the United States Environmental Protection Agency under assistance agreement # RD83451101 and RD83615901 (C. Metayer and J.L. Wiemels). Additional salary support was provided by U2CES030859, P30ES023515, UL1TR004419, and UH2CA248974 (L.M. Petrick). Resource acquisition was funded by the National Cancer Institute (award # R01CA175737, J.L. Wiemels). The collection of cancer incidence data used in this study was supported by the California Department of Public Health pursuant to California Health and Safety Code Section 103885; Centers for Disease Control and Prevention’s (CDC) National Program of Cancer Registries, under cooperative agreement 5NU58DP006344; the National Cancer Institute’s Surveillance, Epidemiology and End Results Program under contract HHSN261201800032I awarded to the University of California, San Francisco, contract HHSN261201800015I awarded to the University of Southern California, and contract HHSN261201800009I awarded to the Public Health Institute, Cancer Registry of Greater California. The ideas and opinions expressed herein are those of the author(s) and do not necessarily reflect the opinions of the State of California, Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors. The biospecimens and/or data used in this study were obtained from the California Biobank Program (CBP request number 600), Section 6555(b), 17 CCR. The California Department of Public Health is not responsible for the results or conclusions drawn by the authors of this publication. We would like to thank staff at the California Biobank Program (Robin Cooley, Steve Graham, and Matin Kharrazi) for their logistic support.
Footnotes
Authors' Disclosures: The authors declare no potential conflicts of interest.
References
- 1. Wiemels J. Chromosomal translocations in childhood leukemia: natural history, mechanisms, and epidemiology. J Natl Cancer Inst Monogr. 2008;(39):87–90.
- 2. Greaves M. A causal mechanism for childhood acute lymphoblastic leukaemia. Nat Rev Cancer. 2018;18(8):471–84.
- 3. Whitehead TP, Metayer C, Wiemels JL, Singer AW, Miller MD. Childhood Leukemia and Primary Prevention. Curr Probl Pediatr Adolesc Health Care. 2016 Oct 1;46(10):317–52.
- 4. Onyije FM, Olsson A, Baaken D, Erdmann F, Stanulla M, Wollschläger D, et al. Environmental Risk Factors for Childhood Acute Lymphoblastic Leukemia: An Umbrella Review. Cancers (Basel). 2022 Jan 13;14(2):382.
- 5. Wiemels J. Perspectives on the causes of childhood leukemia. Chem Biol Interact. 2012 Apr 5;196(3):59–67.
- 6. Wojcicki AV, Kasowski MM, Sakamoto KM, Lacayo N. Metabolomics in acute myeloid leukemia. Molec Genet Metab. 2020 Aug 1;130(4):230–8.
- 7. Aung MMK, Mills ML, Bittencourt-Silvestre J, Keeshan K. Insights into the molecular profiles of adult and paediatric acute myeloid leukaemia. Mol Oncol. 2021 Sep;15(9):2253–72.
- 8. Papadopoulou MT, Panagopoulou P, Paramera E, Pechlivanis A, Virgiliou C, Papakonstantinou E, et al. Metabolic Fingerprint in Childhood Acute Lymphoblastic Leukemia. Diagnostics (Basel). 2024 Mar 24;14(7):682.
- 9. Hlozkova K, Pecinova A, Alquezar-Artieda N, Pajuelo-Reguera D, Simcikova M, Hovorkova L, et al. Metabolic profile of leukemia cells influences treatment efficacy of L-asparaginase. BMC Cancer. 2020 Jun 5;20(1):526.
- 10. Bai Y, Zhang H, Sun X, Sun C, Ren L. Biomarker identification and pathway analysis by serum metabolomics of childhood acute lymphoblastic leukemia. Clin Chim Acta. 2014 Sep 25;436:207–16.
- 11. Schraw JM, Junco JJ, Brown AL, Scheurer ME, Rabin KR, Lupo PJ. Metabolomic profiling identifies pathways associated with minimal residual disease in childhood acute lymphoblastic leukaemia. EBioMedicine. 2019 Oct;48:49–57.
- 12. Fu J, Zhang A, Liu Q, Li D, Wang X, Si L. Metabolic profiling reveals metabolic features of consolidation therapy in pediatric acute lymphoblastic leukemia. Cancer Metab. 2023 Jan 23;11(1):2.
- 13. Tiziani S, Kang Y, Harjanto R, Axelrod J, Piermarocchi C, Roberts W, et al. Metabolomics of the tumor microenvironment in pediatric acute lymphoblastic leukemia. PLoS One. 2013;8(12):e82859.
- 14. Yuan Y, Wu Q, Zhao J, Feng Z, Dong J, An M, et al. Investigation of pathogenesis and therapeutic targets of acute myeloid leukemia based on untargeted plasma metabolomics and network pharmacology approach. J Pharm Biomed Anal. 2021 Feb 20;195:113824.
- 15. Vidman L, Zheng R, Bodén S, Ribbenstedt A, Gunter MJ, Palmqvist R, et al. Untargeted plasma metabolomics and risk of colorectal cancer—an analysis nested within a large-scale prospective cohort. Cancer Metab. 2023 Oct 17;11:17.
- 16. Yu M, Petrick L. Untargeted high-resolution paired mass distance data mining for retrieving general chemical relationships. Commun Chem. 2020;3(1):157.
- 17. Petrick L, Edmands W, Schiffman C, Grigoryan H, Perttula K, Yano Y, et al. An untargeted metabolomics method for archived newborn dried blood spots in epidemiologic studies. Metabolomics. 2017 Mar;13(3).
- 18. Petrick LM, Arora M, Niedzwiecki MM. Minimally Invasive Biospecimen Collection for Exposome Research in Children’s Health. Curr Environ Health Rep. 2020 Sep;7(3):198–210.
- 19. Ottosson F, Russo F, Abrahamsson A, MacSween N, Courraud J, Nielsen ZK, et al. Effects of Long-Term Storage on the Biobanked Neonatal Dried Blood Spot Metabolome. J Am Soc Mass Spectrom. 2023 Apr 5;34(4):685–94.
- 20. Yan Q, He D, Walker DI, Uppal K, Wang X, Orimoloye HT, et al. The neonatal blood spot metabolome in retinoblastoma. EJC Paediatr Oncol. 2023 Dec;2:100123.
- 21. Petrick L, Imani P, Perttula K, Yano Y, Whitehead T, Metayer C, et al. Untargeted metabolomics of newborn dried blood spots reveals sex-specific associations with pediatric acute myeloid leukemia. Leukemia Research. 2021 Jul;106:106585.
- 22. Petrick LM, Schiffman C, Edmands WMB, Yano Y, Perttula K, Whitehead T, et al. Metabolomics of neonatal blood spots reveal distinct phenotypes of pediatric acute lymphoblastic leukemia and potential effects of early-life nutrition. Cancer Lett. 2019 Mar 20;452:71–8.
- 23. Steliarova-Foucher E, Stiller C, Lacour B, Kaatsch P. International Classification of Childhood Cancer, third edition. Cancer. 2005 Apr 1;103(7):1457–67.
- 24. Kharrazi M, Pearl M, Yang J, DeLorenze GN, Bean CJ, Callaghan WM, et al. California Very Preterm Birth Study: design and characteristics of the population- and biospecimen bank-based nested case-control study. Paediatr Perinat Epidemiol. 2012 May;26(3):250–63.
- 25. Wang R, Wiemels JL, Metayer C, Morimoto L, Francis SS, Kadan-Lottick N, et al. Cesarean Section and Risk of Childhood Acute Lymphoblastic Leukemia in a Population-Based, Record-Linkage Study in California. Am J Epidemiol. 2017 Jan 15;185(2):96–105.
- 26. Metayer C, Imani P, Dudoit S, Morimoto L, Ma X, Wiemels JL, et al. One-Carbon (Folate) Metabolism Pathway at Birth and Risk of Childhood Acute Lymphoblastic Leukemia: A Biomarker Study in Newborns. Cancers. 2023 Jan;15(4):1011.
- 27. Yu M, Dolios G, Yong-Gonzalez V, Björkqvist O, Colicino E, Halfvarson J, et al. Untargeted metabolomics profiling and hemoglobin normalization for archived newborn dried blood spots from a refrigerated biorepository. J Pharm Biomed Anal. 2020 Nov 30;191:113574.
- 28. Schiffman C, Petrick L, Perttula K, Yano Y, Carlsson H, Whitehead T, et al. Filtering procedures for untargeted LC-MS metabolomics data. BMC Bioinformatics. 2019 Jun 14;20(1):334.
- 29. Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal Chem. 2006 Feb 1;78(3):779–87.
- 30. Libiseller G, Dvorzak M, Kleb U, Gander E, Eisenberg T, Madeo F, et al. IPO: a tool for automated optimization of XCMS parameters. BMC Bioinformatics. 2015 Apr 16;16:118.
- 31. Risso D, Ngai J, Speed TP, Dudoit S. Normalization of RNA-seq data using factor analysis of control genes or samples. Nat Biotechnol. 2014 Sep;32(9):896–902.
- 32. Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, et al. Missing value estimation methods for DNA microarrays. Bioinformatics. 2001 Jun 1;17(6):520–5.
- 33. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010 Jan 1;26(1):139–40.
- 34. Yu M, Olkowicz M, Pawliszyn J. Structure/reaction directed analysis for LC-MS based untargeted analysis. Anal Chim Acta. 2019 Mar 7;1050:16–24.
- 35. Papageorghiou AT, Kennedy SH, Salomon LJ, Altman DG, Ohuma EO, Stones W, et al. The INTERGROWTH-21st fetal growth standards: toward the global integration of pregnancy and pediatric care. Am J Obstet Gynecol. 2018 Feb;218(2S):S630–40.
- 36. Breiman L. Random Forests. In: Machine Learning. Netherlands: Kluwer Academic Publishing; 2001 Oct 1;45(1):5–32.
- 37. Maechler M, Rousseeuw P, Struyf A, Hubert H, Hornik K. cluster: Cluster Analysis Basics and Extensions [Internet]. 2024 [cited 2025 Mar 2]. Available from: https://cran.r-project.org/web/packages/cluster/citation.html
- 38. Sinanoglou VJ, Cavouras D, Boutsikou T, Briana DD, Lantzouraki DZ, Paliatsiou S, et al. Factors affecting human colostrum fatty acid profile: A case study. PLoS One. 2017;12(4):e0175817.
- 39. Cinelli G, Fabrizi M, Ravà L, Ciofi Degli Atti M, Vernocchi P, Vallone C, et al. Influence of Maternal Obesity and Gestational Weight Gain on Maternal and Foetal Lipid Profile. Nutrients. 2016 Jun 15;8(6):368.
- 40. Rosato A, Tenori L, Cascante M, De Atauri Carulla PR, Martins dos Santos VAP, Saccenti E. From correlation to causation: analysis of metabolomics data using systems biology approaches. Metabolomics. 2018;14(4):37.
- 41. Parenti M, Schmidt RJ, Tancredi DJ, Hertz-Picciotto I, Walker CK, Slupsky CM. Neurodevelopment and Metabolism in the Maternal-Placental-Fetal Unit. JAMA Netw Open. 2024 May 28;7(5):e2413399.
- 42. Benedetti E, Liu EM, Tang C, Kuo F, Buyukozkan M, Park T, et al. A multimodal atlas of tumour metabolism reveals the architecture of gene–metabolite covariation. Nat Metab. 2023 Jun;5(6):1029–44.
- 43. Allocati N, Masulli M, Di Ilio C, Federici L. Glutathione transferases: substrates, inihibitors and pro-drugs in cancer and neurodegenerative diseases. Oncogenesis. 2018 Jan 24;7(1):1–15.
- 44. Dasari S, Ganjayi MS, Yellanurkonda P, Basha S, Meriga B. Role of glutathione S-transferases in detoxification of a polycyclic aromatic hydrocarbon, methylcholanthrene. Chem Biol Interact. 2018 Oct 1;294:81–90.
- 45. Deziel NC, Rull RP, Colt JS, Reynolds P, Whitehead TP, Gunier RB, et al. Polycyclic aromatic hydrocarbons in residential dust and risk of childhood acute lymphoblastic leukemia. Environ Res. 2014 Aug;133:388–95.
- 46. Wang Z, Liu F, Fan N, Zhou C, Li D, Macvicar T, et al. Targeting Glutaminolysis: New Perspectives to Understand Cancer Development and Novel Strategies for Potential Target Therapies. Front Oncol. 2020 Oct 26;10:589508.
- 47. Duan S, Pagano M. Ubiquitin ligases in cancer: Functions and clinical potentials. Cell Chem Biol. 2021 Jul 15;28(7):918–33.
Data Availability Statement
The data presented in this study are available upon request from the corresponding author and after approval from the California Health and Human Services. The data are not publicly available due to California Health and Human Services restrictions. The Reactomics database is freely available in the pmd R package.
