Abstract
Background
Children of Hispanic ancestry have a higher incidence of acute lymphoblastic leukemia (ALL) than other ethnic groups, but the genetic basis for racial disparities remains incompletely understood. Genome-wide association studies (GWAS) of childhood ALL to date have focused on inherited genetic effects; however, maternal genetic effects (the role of maternal genotype on offspring phenotype development) may also play a role in ALL susceptibility.
Methods
We conducted a family-based exome-wide association study (EXWAS) of maternal genetic effects among Hispanics with childhood B-cell ALL (B-ALL) using the Illumina Human Exome BeadChip. We used a discovery cohort of 312 Guatemalan and Hispanic American families and an independent replication cohort of 152 Hispanic American families.
Results
Three maternal SNPs approached our threshold for significance after correction for multiple testing (P<1.0×10−6): MTL5 rs12365708 (RR=2.62, 95% CI=1.61-4.27, P=1.9×10−5); ALKBH1 rs6494 (RR=3.77, 95% CI=1.84-7.74, P=3.7×10−5); NEUROG3 rs4536103 (RR=1.75, 95% CI=1.30-2.37, P=1.2×10−4). Although the effects were in the same direction, these SNPs were not nominally significant in our replication cohort. In a meta-analysis of the discovery and replication cohorts, these SNPs were still not statistically significant after correction for multiple comparisons (rs12365708: pooled RR=2.27, 95% CI=1.48-3.50, P=1.99×10−4; rs6494: pooled RR=2.31, 95% CI=1.38-3.85, P=0.001; rs4536103: pooled RR=1.67, 95% CI=1.29-2.16, P=9.23×10−5).
Conclusions
In the first family-based EXWAS to investigate maternal genotype effects associated with childhood ALL, our results did not implicate a strong role of maternal genotype on disease risk among Hispanics; however, we identified three maternal SNPs that may play a modest role in susceptibility.
Keywords: Acute lymphoblastic leukemia, childhood ALL, genetic epidemiology, exome, maternal genetics, Hispanics
Introduction
Acute lymphoblastic leukemia (ALL) is the most common type of cancer in children, accounting for about 25% of all cancers in those <20 years of age.1 The majority of childhood ALL cases (approximately 80%) are B-cell acute lymphoblastic leukemia (B-ALL).2 A notable risk factor for childhood ALL is Hispanic ethnicity. Children of Hispanic ethnicity have an incidence of ALL that is 10% to 30% higher than that of non-Hispanic whites and almost two times greater than that of non-Hispanic blacks.3 In addition, Hispanic children with ALL have a higher incidence of relapse and a lower 5-year survival rate than do non-Hispanic whites.4,5 It is hypothesized that differences in ALL incidence and outcomes among Hispanics are due to differences in the frequency of known or novel genetic risk factors that are unique to this population. For example, there is evidence that genetic risk factors associated with Amerindian ancestry could account for increased ALL incidence and decreased survival rates among Hispanics.5
Previous genome-wide association studies (GWAS) have identified a number of inherited genetic variants associated with ALL risk. Specifically, single nucleotide polymorphisms (SNPs) in ARID5B, IKZF1, CEBPE, CDKN2A, PIP4K2A, and GATA3 genes have been associated with ALL susceptibility.6–13 Furthermore, risk alleles in ARID5B, CEBPE, and GATA3 were found to be more frequent among Hispanics compared to those of non-Hispanic ethnicity.9,10 However, much work remains to identify additional risk factors responsible for the heritability of ALL susceptibility among Hispanics.
Most genome-wide studies of susceptibility to childhood ALL have focused on inherited genetic effects; however, other genetic and biological mechanisms may also play a role. In particular, little is known about whether and to what extent the maternal genotype might influence the risk of the development of ALL in offspring. It has been hypothesized that maternal genotype could have a detrimental effect on a fetus by affecting the intrauterine environment independent of the inherited genotype.14,15 Additionally, several assessments have pointed to the role of environmental exposures on the risk of childhood ALL.16,17 This is particularly important as the evaluation of maternal genetic effects may serve as a proxy for maternal environmental exposures. For example, the maternal genome regulates the in utero environment in relation to various exposures (e.g., nutrients, toxicants), which could affect the developing fetus. Therefore, understanding the role of maternal genetic variation on childhood ALL risk may inform studies evaluating the role of environmental exposures on susceptibility. Notably, the incidence of ALL peaks in early childhood (2 to 5 years of age), suggesting an etiology early in development, perhaps even in utero.1 Furthermore, identical twins with concordant B-ALL have leukemic cells with shared unique clonal chromosomal rearrangements, even when the diagnosis of B-ALL occurred years apart, which is consistent with these molecular changes being transferred via shared circulation in utero.18 Chromosomal abnormalities (e.g., translocations) can also be detected in the newborn dried blood spots of many childhood B-ALL patients, indicating that leukemogenesis begins during fetal development when maternal genetic effects are likely to play a role in disease susceptibility.19 It is possible that variation in the maternal genome (maternal genetic effects) could affect the intrauterine environment in ways that affect hematopoiesis.20
While there have been candidate gene assessments of the role of maternal genetic effects on childhood ALL risk,15,21–23 as well as one whole-exome sequencing analysis suggesting that maternal rare variants in PRDM9 might play a role in development of B-ALL in offspring,24 very few exome- or genome-wide assessments have been conducted, unlike for other phenotypes that arise in utero (e.g., congenital anomalies).25,26 Furthermore, according to some assessments, Hispanics are diagnosed with ALL at a younger age than non-Hispanic whites, suggesting maternal genetic effects may play a particularly important role in this population.27,28
Therefore, to determine whether maternal genetic effects are associated with childhood ALL, we conducted a family-based (case-parent trio) exome-wide association study (EXWAS) among Hispanics recruited in Guatemala and the Southwestern United States. Our study focused on Hispanic B-ALL cases in order to enhance our understanding of genetic risk factors for the most common ALL subtype among this high-risk ethnic group.3
Materials and Methods
Study subjects and samples
The discovery cohort included Hispanic B-ALL case-parent trios recruited from 2012 to 2015 from Unidad Nacional de Oncología Pediátrica (UNOP), a pediatric cancer treatment center located in Guatemala City, Guatemala. Hyperdiploid status for UNOP B-ALL cases was determined as those with a DNA index >1.16. Additionally, Hispanic case-parent trios recruited between 2003 and 2011 from the Texas Children's Cancer Center (part of Texas Children's Hospital (TCH) in Houston, Texas) were included in the discovery cohort. Children and adolescents (ages 0 to 19 years) diagnosed with B-ALL and treated at either UNOP or TCH during the facilities’ respective timeframes were eligible for this study. All participating Guatemalan B-ALL cases and parents were of either self-reported Amerindian indigenous ancestry (Mayan ancestry, primarily K'iche’, Mam, and Q'anjob’al) or were Ladino (population admixed with Amerindian ancestry, also known as Mestizo). All parents in the discovery cohort were of the same ethnicity as their children (either Amerindian, Ladino, or U.S. Hispanic). Germline genomic DNA was extracted from peripheral blood samples obtained during clinical remission for all B-ALL cases. Peripheral blood samples were also obtained from participating Guatemalan parents (at any time of convenience), whereas saliva samples were collected from parents of TCH cases. Puregene DNA isolation kits by Gentra Systems were used to extract DNA. A total of 321 families (706 individuals) were eligible for inclusion in the discovery cohort; this included 276 families (601 individuals) recruited from UNOP, as well as 45 families (105 individuals) from TCH.
The replication cohort consisted of Hispanic childhood B-ALL case-parent trios recruited in 2015 from TCH. Peripheral blood samples were collected on cases during clinical remission, and saliva samples were obtained from parents. The replication cohort consisted of 282 individuals from 152 families where cases and both parents self-reported as Hispanic.
The overall study was approved by the University of Texas Health Science Center (UTHSC) Institutional Review Board (IRB), and the relevant components were approved by the Baylor College of Medicine and St. Jude Children's Research Hospital IRBs in the United States and by the Bioethics Committee at Facultad de Medicina, Universidad Francisco Marroquin in Guatemala.
Genotyping and quality control (QC)
DNA for the discovery cohort was submitted for exome-wide genotyping using the Illumina Human Exome BeadChip, v1.1 (Illumina, San Diego, CA). All SNPs (N=237,436) were from exonic regions of the genome. Within the discovery cohort, Guatemalan DNA samples were extracted, hybridized, and SNP-called in a separate batch from the TCH DNA samples. Genotype calls were made using Illumina GenomeStudio Software, and genotypes were coded as 0, 1, or 2 to indicate the number of minor alleles (assuming an additive genetic model). Case-parent trios (or duos) with a Mendelian error rate (i.e., Mendelian inconsistencies) >0.5% and individuals with a genotype call rate <95% were excluded from the analyses. In addition, non-autosomal SNPs (this approach does not allow for the evaluation of SNPs on sex chromosomes) and SNPs with a minor allele frequency (MAF) <1% in the discovery cohort or with poor genotyping quality (<95% call rate) were excluded from the analysis. For those SNPs with an MAF of 1%-5%, a more stringent set of QC criteria determined by call rate was implemented (Supplementary Figure 1). SNPs that did not meet these criteria were also excluded from the analyses. The three SNPs with the lowest P-values in the discovery cohort were selected for evaluation in the replication cohort. These SNPs were assayed in the replication cohort using the TaqMan method (Life Technologies Corporation, Carlsbad, CA), and genotypes were read and discriminated on the ABI PRISM® 7900HT Sequence Detection System (Life Technologies Corporation, Carlsbad, CA).
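The SNP-level filters described above can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded 0/1/2 with `None` for a missing call, and the stricter call-rate criteria applied to SNPs with an MAF of 1%-5% (Supplementary Figure 1) are not reproduced here.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # 0, 1, or 2 minor alleles; None means no call


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Frequency of the less common allele among called genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def snp_passes_qc(
    genotypes: Sequence[Genotype],
    autosomal: bool,
    min_maf: float = 0.01,        # MAF <1% excluded
    min_call_rate: float = 0.95,  # call rate <95% excluded
) -> bool:
    """Apply the basic SNP-level filters: autosomal, call rate, and MAF."""
    return (
        autosomal
        and call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )
```

A real analysis would also apply the sample-level filters (Mendelian error rate, per-individual call rate) before computing these SNP-level quantities.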
Statistical analysis
We evaluated the role of maternal SNPs on the risk of childhood B-ALL in the discovery cohort using the EMIM program.29 This analysis compares whether observed genotype distributions match what is expected, assuming Mendelian transmission of the minor allele and symmetry of maternal and paternal genotypes in the source population.14,30,31 Multinomial modeling allows for the estimation of maternal genetic effects while adjusting for the influence of the child's genotype (i.e., inherited genetic effects). This type of modeling is mathematically equivalent to a log-linear approach,30 which has been used in several case-parent trio GWAS.25,26 However, the fitting of a regression model by directly maximizing the multinomial likelihood allows the direct estimation of effects even when one or more individuals are missing from a case-parent trio, rather than having to use an expectation-maximization (EM) algorithm to estimate missing genotypes, as is the case with the log-linear approach.30 More specifically, case-parent trios, case-mother or case-father duos, parents alone, or cases alone can all be analyzed simultaneously in EMIM to estimate maternal genetic effects.
Because genotype distributions are expected to be similar for mothers and fathers, a significant result for maternal genetic effects indicates that observed maternal and paternal genotype distributions are different from each other.31 Analyses accounted (adjusted) for inherited genetic effects. Statistical significance for each SNP was evaluated using a likelihood ratio test (with 1 degree of freedom (df)), comparing a full model (with terms for both maternal and inherited effects) with a reduced model (with only inherited effects; the reduced model excluded the parameter of interest).
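The intuition behind this test can be sketched numerically. The example below is an illustration only, not the EMIM implementation: it assumes a log-additive model in which the probability of each (mother, father, child) genotype combination among cases is proportional to the source-population mating frequency, the Mendelian transmission probability, and relative risks raised to the number of minor alleles carried by the child and by the mother. For simplicity it also assumes Hardy-Weinberg proportions and random mating in the source population, which the actual mating-type-stratified analysis does not require. All function names are hypothetical.

```python
from itertools import product


def transmission_prob(mother: int, father: int, child: int) -> float:
    """P(child genotype | parental genotypes) under Mendelian transmission.

    Genotypes are counts of the minor allele (0, 1, or 2).
    """
    pm = mother / 2.0  # chance the mother transmits the minor allele
    pf = father / 2.0  # chance the father transmits the minor allele
    by_child = {
        0: (1 - pm) * (1 - pf),
        1: pm * (1 - pf) + (1 - pm) * pf,
        2: pm * pf,
    }
    return by_child[child]


def hwe_freq(genotype: int, maf: float) -> float:
    """Genotype frequency under Hardy-Weinberg proportions (an assumption)."""
    return {0: (1 - maf) ** 2, 1: 2 * maf * (1 - maf), 2: maf ** 2}[genotype]


def case_trio_distribution(maf: float, rr_child: float, rr_mother: float) -> dict:
    """Expected distribution of (mother, father, child) genotypes among cases."""
    weights = {}
    for m, f, c in product((0, 1, 2), repeat=3):
        t = transmission_prob(m, f, c)
        if t == 0.0:
            continue  # combination impossible under Mendelian inheritance
        weights[(m, f, c)] = (
            hwe_freq(m, maf) * hwe_freq(f, maf) * t * rr_child ** c * rr_mother ** m
        )
    total = sum(weights.values())
    return {trio: w / total for trio, w in weights.items()}


def mean_minor_alleles(dist: dict, position: int) -> float:
    """Mean minor-allele count for the mother (0), father (1), or child (2)."""
    return sum(trio[position] * prob for trio, prob in dist.items())
```

With `rr_mother=1.0`, mothers and fathers of cases carry the same expected number of minor alleles whatever the inherited effect; with `rr_mother>1.0`, mothers carry more than fathers, which is the asymmetry the likelihood ratio test is designed to detect.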
For the discovery cohort, estimated relative risks (RR) for each SNP, as well as corresponding 95% confidence intervals (CIs) and chi-square values, were obtained using EMIM, which was originally developed for the analysis of high-throughput SNP array data. Analyses for the three SNPs genotyped in the replication cohort were conducted using LEM, a program developed for smaller-scale analyses of individual SNPs.32 The methodology behind these programs is analogous.29 Log-additive models were used for all analyses, based on their use in previous genome-wide assessments;10–12 this model provides greater power and requires fewer comparisons than an unrestricted model, and is thought to be most appropriate when evaluating the role of common variants on disease risk. Because analyses were stratified by parental mating type, estimates obtained were inherently adjusted for effects of population stratification.30,31 P-values for each SNP were calculated from corresponding chi-square values using R, version 3.1.2. Manhattan plots and Q-Q plots (with corresponding lambda values) were also created using R.
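The conversion from a 1-df chi-square value to a P-value can be sketched as follows. The study performed this step in R; the Python version below is only an equivalent illustration, and the function names are hypothetical. It relies on the identity that the upper tail of a chi-square distribution with 1 df at x equals erfc(sqrt(x/2)).

```python
import math


def lrt_statistic(loglik_full: float, loglik_reduced: float) -> float:
    """Likelihood ratio statistic: twice the gain in maximized log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def lrt_p_value_1df(chi_square: float) -> float:
    """Upper-tail P-value for a chi-square statistic with 1 degree of freedom."""
    if chi_square < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(chi_square / 2.0))
```

For example, a statistic of about 3.84 corresponds to P=0.05, whereas a SNP would need a statistic of roughly 24 to reach the exome-wide threshold used here.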
To account for multiple testing, a threshold of P<1.0×10−6 was used to designate statistical significance, based on the total number of comparisons. We had sufficient power (>0.8) to detect a RR≥2.5 in the discovery cohort (α=1.0×10−6) and RR≥2.0 in the replication cohort (α=0.05) for SNPs where MAF≥0.1. Genetic ancestry was determined using STRUCTURE 5,33 on the basis of genotypes at 30,000 randomly selected SNPs, with HapMap samples (CEU, YRI, CHB/JPT) and indigenous Amerindian references34 as ancestral populations. Europeans, Africans, and East Asians were defined as having ≥90% European genetic ancestry, ≥70% African ancestry, and ≥60% East Asian ancestry, respectively. Hispanic Americans were defined by genotype as individuals for whom the proportion of Native American genetic ancestry was ≥10% and greater than the proportion of African ancestry.9
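The ancestry definitions above amount to a small set of threshold rules, sketched below. The function name, the order in which the rules are checked, and the "Other" fallback label are assumptions made for illustration; the text gives the thresholds but not how overlapping cases were resolved. Ancestry proportions are assumed to be the fractions estimated by STRUCTURE.

```python
def classify_ancestry(
    european: float,
    african: float,
    east_asian: float,
    native_american: float,
) -> str:
    """Assign a genetic ancestry group from estimated ancestry proportions.

    Thresholds follow the definitions in the text; the precedence of the
    checks and the "Other" label are illustrative assumptions.
    """
    if european >= 0.90:
        return "European"
    if african >= 0.70:
        return "African"
    if east_asian >= 0.60:
        return "East Asian"
    # Hispanic American: at least 10% Native American ancestry, and more
    # Native American than African ancestry.
    if native_american >= 0.10 and native_american > african:
        return "Hispanic American"
    return "Other"
```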
A meta-analysis was conducted for those SNPs with suggestive evidence of association that were analyzed using the replication cohort. Two independent results were evaluated in the meta-analysis: those of the discovery cohort and the replication cohort. Meta-analyses were conducted using Stata/IC Release 12 (StataCorp LP, College Station, TX). Fixed-effects models were used for SNPs with a non-significant test for heterogeneity, and random-effects models were used if the test for heterogeneity was significant (P<0.05).
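A fixed-effects meta-analysis of two cohorts can be sketched with inverse-variance weighting on the log(RR) scale. The sketch below is not the Stata procedure used in the study: it recovers each standard error from the reported 95% CI (assuming the CI is symmetric on the log scale), omits the heterogeneity test and the random-effects alternative, and uses hypothetical function names.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_rr_and_se(rr: float, ci_low: float, ci_high: float) -> tuple:
    """Recover log(RR) and its standard error from an RR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    return math.log(rr), se


def fixed_effects_pool(estimates: list) -> tuple:
    """Inverse-variance pooled RR, 95% CI, and two-sided P-value.

    `estimates` is a list of (rr, ci_low, ci_high) tuples, one per cohort.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rr, ci_low, ci_high in estimates:
        log_rr, se = log_rr_and_se(rr, ci_low, ci_high)
        weight = 1.0 / se ** 2  # more precise estimates get more weight
        weighted_sum += weight * log_rr
        total_weight += weight
    pooled = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
        p_value,
    )


# Discovery and replication estimates for rs12365708 as reported in Table 2.
rs12365708 = [(2.62, 1.61, 4.27), (1.36, 0.54, 3.44)]
pooled_rr, pooled_low, pooled_high, pooled_p = fixed_effects_pool(rs12365708)
```

Applied to the rounded rs12365708 estimates, this gives a pooled RR of about 2.27 with a 95% CI of roughly 1.48-3.50, in line with Table 2; the P-value differs slightly from the reported one because the inputs are rounded.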
Results
The discovery cohort used in the final analysis included 312 families with 683 individuals; nine families were excluded from the analysis due to Mendelian inconsistencies. For 24 of the 312 families, parental genetic data were used in the analysis, but no genetic information was available for the B-ALL cases. A summary of demographic characteristics of the 288 B-ALL cases included in the discovery cohort is shown in Table 1. The 312 families in the discovery cohort comprised 108 case-parent trios, 148 case-parent duos, 32 cases alone, 7 mother-father pairs, and 17 parent singletons. Of the 237,436 SNPs available for analysis, 32,537 SNPs met MAF and call rate criteria in the discovery cohort and were used in analyses. The majority (99.6%) of excluded SNPs were omitted from analyses because of a MAF <1%. For SNPs included in the analysis, the mean call rate was 99.95%.
Table 1.
Characteristics of B-ALL cases from discovery and replication cohorts.
Characteristic | Discovery cohort, N | Discovery cohort, % | Replication cohort, N | Replication cohort, % |
---|---|---|---|---|
Race/Ethnicity | | | | |
Amerindian | 95 | 33.0 | 0 | 0.0 |
Ladino (Guatemalan Hispanic Mestizo) | 152 | 52.8 | 0 | 0.0 |
Hispanic (U.S.) | 41 | 14.2 | 152 | 100.0 |
Sex | | | | |
Male | 164 | 56.9 | 81 | 53.3 |
Female | 124 | 43.1 | 71 | 46.7 |
Age (years) | | | | |
<1 | 0 | 0.0 | 7 | 4.6 |
1 – 4 | 62 | 21.5 | 77 | 50.7 |
5 – 9 | 132 | 45.8 | 45 | 29.6 |
10 – 14 | 59 | 20.5 | 17 | 11.2 |
15 – 19 | 35 | 12.2 | 6 | 3.9 |
After correction for multiple testing, none of the SNPs were statistically significant in the discovery exome-wide association analysis (Supplementary Table 1). However, we further analyzed the three maternal SNPs with the lowest P-values: MTL5 rs12365708 (RR=2.62, 95% CI=1.61-4.27, P=1.9×10−5); ALKBH1 rs6494 (RR=3.77, 95% CI=1.84-7.74, P=3.7×10−5); NEUROG3 rs4536103 (RR=1.75, 95% CI=1.30-2.37, P=1.2×10−4) (Figure 1 and Table 2). When these SNPs were evaluated among hyperdiploid B-ALL cases only, estimates were similar (data not shown). These three SNPs were then assessed in our replication cohort, which consisted of 15 case-parent trios, 96 parent-child duos, 25 cases alone, 4 mother-father pairs, and 12 singleton parents. None of the three SNPs were nominally significant: MTL5 rs12365708: RR=1.36, 95% CI=0.54-3.44, P=0.50; ALKBH1 rs6494: RR=1.38, 95% CI=0.67-2.87, P=0.40; NEUROG3 rs4536103: RR=1.46, 95% CI=0.89-2.41, P=0.10. The allele frequencies for the minor alleles among the discovery and replication cohorts (and reference populations) are shown in Supplementary Table 2.
Figure 1.
Manhattan plot of maternal genetic effects in the discovery cohort.
Table 2.
Discovery cohort, replication cohort, and fixed-effects meta-analysis pooled results for maternal SNPs rs12365708 (MTL5, 11q13.2-q13.3), rs6494 (ALKBH1, 14q24.3), and rs4536103 (NEUROG3, 10q21.3).
SNP | Discovery¹ MAF | Discovery¹ RR (95% CI) | Discovery¹ P | Replication² MAF | Replication² RR (95% CI) | Replication² P | Pooled RR (95% CI) | Pooled P |
---|---|---|---|---|---|---|---|---|
rs12365708 | 0.13 | 2.62 (1.61-4.27) | 1.86×10−5 | 0.06 | 1.36 (0.54-3.44) | 0.52 | 2.27 (1.48-3.50) | 1.99×10−4 |
rs6494 | 0.06 | 3.77 (1.84-7.74) | 3.73×10−5 | 0.09 | 1.38 (0.67-2.87) | 0.38 | 2.31 (1.38-3.85) | 0.001 |
rs4536103 | 0.38 | 1.75 (1.30-2.37) | 1.17×10−4 | 0.41 | 1.46 (0.89-2.41) | 0.14 | 1.67 (1.29-2.16) | 9.23×10−5 |
¹ The discovery cohort comprised 312 families: 108 case-parent trios, 148 case-parent duos, 32 cases alone, 7 mother-father pairs, and 17 parent singletons.
² The replication cohort comprised 152 families: 15 case-parent trios, 96 parent-child duos, 25 cases alone, 4 mother-father pairs, and 12 singleton parents.
Not surprisingly given the origin of the two cohorts, principal components analysis (PCA) results for the discovery dataset showed significant differences in genetic ancestry between Guatemalan Amerindians, Guatemalan Ladinos, and U.S. Hispanics (P<0.0001 for all comparisons) (Supplementary Figure 2). This was true when using the SNPs included in the genetic association analysis (n=32,537) or all SNPs on the BeadChip (n=237,436), suggesting differences were due to population differences rather than processing differences (i.e., batch effects). Additionally, the Q-Q plot of the maternal genetic effects results for the discovery dataset (Figure 2) showed very little deviation from expectation (lambda=1.01). Because of the lambda value, the fact that we restricted our analysis to those of Hispanic ancestry, and because family-based genetic association studies are less prone to bias due to population stratification,35 we did not attempt to reduce the genomic inflation factor. As the replication cohort had targeted analysis and was not included in the exome-wide assessment, we were not able to evaluate differences within this group based on PCA.
Figure 2.
Q-Q plot for maternal genetic effects, discovery cohort (lambda=1.01).
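The lambda value reported with the Q-Q plot is the genomic inflation factor. A common way to compute it, sketched below, is to divide the median observed 1-df chi-square statistic by the median expected under the null (about 0.455). The study generated this value in R; the Python version and its function names are illustrative assumptions, and it expects P-values strictly between 0 and 1.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 df: the squared 75th percentile
# of the standard normal distribution (approximately 0.455).
EXPECTED_MEDIAN_CHI2_1DF = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2_1df(p_value: float) -> float:
    """Convert a two-sided P-value (0 < p < 1) to a 1-df chi-square statistic."""
    return _STD_NORMAL.inv_cdf(1.0 - p_value / 2.0) ** 2


def genomic_inflation_factor(p_values: list) -> float:
    """Lambda: median observed chi-square divided by the expected null median."""
    observed = [p_to_chi2_1df(p) for p in p_values]
    return median(observed) / EXPECTED_MEDIAN_CHI2_1DF
```

A value close to 1, such as the 1.01 reported here, indicates that the bulk of the test statistics follow the null distribution.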
In a meta-analysis of the discovery and replication cohorts, the pooled results for rs12365708 (MTL5), rs6494 (ALKBH1), and rs4536103 (NEUROG3) again showed suggestive evidence of association (rs12365708, pooled RR=2.27, 95% CI=1.48-3.50, P=2.0×10−4; rs6494, pooled RR=2.31, 95% CI=1.38-3.85, P=0.001; and rs4536103, pooled RR=1.67, 95% CI=1.29-2.16, P=9.2×10−5; Table 2).
Discussion
Overall, we did not find definitive evidence that maternal genetic effects played a major role in susceptibility to childhood B-ALL among Hispanics. While none of the analyzed variants were statistically significant after correcting for multiple comparisons, we believe three regions merit further investigation. Specifically, maternal SNPs in MTL5 (rs12365708, P=1.9×10−5), ALKBH1 (rs6494, P=3.7×10−5), and NEUROG3 (rs4536103, P=1.2×10−4) demonstrated suggestive evidence of association with childhood ALL in the discovery cohort. In our meta-analysis of the discovery and replication cohorts together, MTL5 rs12365708 and NEUROG3 rs4536103 also had pooled estimates with suggestive evidence of association (P=2.0×10−4 and P=9.2×10−5, respectively), although these SNPs still did not meet our statistical significance threshold.
Our strongest pooled finding was in NEUROG3, a gene on 10q21.3 that encodes a basic helix-loop-helix (bHLH) protein, which is a transcription factor involved in neurogenesis.36 This gene plays a role in the development of enteroendocrine cells.37 The NEUROG3 transcription factor is expressed in neuroendocrine prostate tumors and invasive neuroendocrine cancer,38 and has been associated with development of pancreatic acinar carcinoma.39 In addition, induced inhibition of Neurog3 in mice has been found to impair differentiation of spermatogonial stem cells into progenitor cells;40 it is possible that this could also be the case for other types of stem cells. The rs4536103 minor allele (G) results in a missense mutation (Phe to Ser).41 Based on RegulomeDB, there is evidence that rs4536103 may also be a regulatory element.42 The role of this variant suggests a plausible association between maternal rs4536103 genotype and the risk of ALL in offspring. In addition, the rs4536103 risk allele is more frequent in Hispanic populations compared to those of European ancestry (Supplementary Table 1).
MTL5, a gene on chromosome 11 (11q13.2-q13.3), encodes a metallothionein-like protein which is involved in regulating cell growth and differentiation.43 MTL5 is likely involved in both male and female meiosis.44 The minor allele of rs12365708 (G) results in a missense mutation (Cys to Arg),41 which is predicted to have a deleterious effect on the function of the resulting protein.45 Additionally, MTL5 acts as a co-activator for aldosterone-induced transactivation of target genes by the mineralocorticoid receptor (MR).46 Aldosterone and the MR mediate the regulation of sodium transport in epithelial cells, but the MR is also expressed in many different tissues throughout the body, including reproductive tissues.46,47 Leukocytes also contain MR, and both MR and epithelial sodium channel genes are expressed in several leukemic cell lines, which is unusual because the sodium transport functionality of MR is generally thought to be restricted to epithelial cells.47
ALKBH1 (14q24.3) is a human homologue to the E. coli alkB gene, which is involved in DNA alkylation damage repair.36 It is a mitochondrial protein that repairs single-stranded DNA and RNA by demethylating 3-methylcytosine.48 The minor allele of rs6494 (A) results in a missense mutation (Met to Leu), and defects in various DNA repair genes, including other alkB homologues, have been found to contribute to the development of various types of cancer.49,50 In addition, ALKBH1 is involved in placental and in utero embryonic development,51 which is consistent with the possibility that this maternal gene could play a role in risk of ALL in offspring.
Maternal genetic effects for childhood ALL have been previously evaluated in several candidate gene studies.15,21–23 In these previous studies, significant maternal genetic effects were observed for SNPs within three folate metabolic genes (BHMT, MTR, and TYMS),15 for two xenobiotic metabolism haplotypes (GSTM3/GSTM4 and GSTP1),22 and for one SNP within a cell-cycle gene (CDKN2A rs36228834).23 Only one of the SNPs identified in these other studies was directly included on the BeadChip used in the current assessment (GSTP1 rs1695);22 we did not observe a significant association for this variant in our analysis (RR=1.22, 95% CI=0.91-1.62, P=0.18). Notably, CDKN2A rs3731249, which was included in our assessment, was in strong LD with rs36228834 (r2=1.0, D'=1.00) and had a similar effect as previously reported (rs3731249: RR=2.69, 95% CI=0.98-7.39, P=0.04 vs. rs36228834: OR=2.56, 95% CI=1.54-4.26, P=0.017).23 The top three SNPs identified in this study were not analyzed in previous candidate gene studies. There has been one whole-exome assessment of rare parental variants and childhood B-ALL in offspring, in which a significant excess of rare PRDM9 alleles was observed for mothers of children with B-ALL when compared to controls.24 While several PRDM9 variants were included on the Illumina Human Exome BeadChip, all had a MAF <1%. Therefore, we were unable to estimate the effects associated with the rare alleles.
As noted, our top three SNPs in the discovery cohort were not nominally significant in the replication cohort. These differences could be due to several factors. First, our hits may have been false positives. However, the effect sizes in the replication cohort were in the same direction as the discovery cohort, suggesting there could be association between our top variants and B-ALL susceptibility. Second, while we did attempt to evaluate risk among individuals of Hispanic (primarily Native American) ancestry, differences in population substructure between the discovery and validation cohorts may have obscured findings, and due to sample size limitations, we were not able to stratify based on sub-ethnicity. Specifically, if our top hits in the discovery population were merely in linkage disequilibrium (LD) with true causal variants, and the LD patterns were different between the discovery and replication populations, this may have led to differences in statistical significance between these populations for the SNPs in question.
Our study must be considered in the light of certain limitations. For instance, we were not able to evaluate interactions between maternal and inherited genetic effects or the role of rare maternal variants on B-ALL susceptibility due to our sample size. Additionally, as noted, differences in population substructure between the discovery and validation cohorts may have led to inability to validate findings from the discovery cohort. An important strength of this study was our focus on Hispanic childhood B-ALL cases. Populations of non-European ancestry are severely under-represented in exome- and genome-wide studies.52 Since the incidence of childhood B-ALL differs considerably by race and ethnicity,3 it is especially important to conduct exome- and genome-wide assessments for this disease that focus on other ethnic or racial groups, as was done in this study.
To our knowledge, this is the first EXWAS to evaluate the role of maternal genetic variants on the risk of childhood B-ALL. We identified three novel maternal SNPs (rs12365708, rs6494, and rs4536103) that merit further analysis in more highly powered replication studies (with larger sample sizes). Accounting for maternal genetic effects (i.e., the in utero environment) could potentially be important in gaining a better understanding of the genetic risk factors for childhood ALL. This study also begins to address an important gap in information about genetic risk factors for Hispanics, a group at higher risk for ALL than other races/ethnicities. Since childhood B-ALL is a multifactorial disease, involving both genetic and environmental factors, next steps for analyzing maternal genetic effects could include whole-genome sequencing as well as investigation of epigenetic and other environmental interactions. Future assessments will also evaluate the role of inherited genetic effects, as well as interactions between maternal genetic effects and inherited genetic effects.
Supplementary Material
Acknowledgements
We thank Larry Archer, Austin Brown, and Spiridon Tsavachidis for their technical assistance, and Anna Wilkinson, Michael Swartz, and Nalini Ranjit for their advice. We also thank the families who participated in this study.
Funding: This study was funded in part by the Cancer Prevention Research Institute of Texas (grant RP101089, PI: Plon; grant RP140258, PI: Lupo), The American Lebanese Syrian Associated Charities (PI: Yang), and the National Institutes of Health (grant U01CA176063, PI: Yang; grant P30CA125123, PI: Scheurer; grant R25CA160078, PI: Scheurer).
Footnotes
Author contributions: Conceptualization: P.J.L. and J.J.Y.; Methodology: M.E.S., P.J.L., and N.P.A.; Formal Analysis: N.P.A., V.P.-A., and P.J.L.; Resources: E.C.P.-G.; Funding Acquisition: S.E.P.; Data Curation: R.C.Z., P.A.D.A., K.S.F., F. A.-K., and C.R.N.; Supervision: P.J.L., J.J.Y., M.E.S., and K.R.R.; Writing – Original Draft: N.P.A. and P.J.L.; Writing – Review & Editing: P.J.L., V.P.-A., J.J.Y., K.R.R.
None of the authors have any conflicts of interest associated with this manuscript.
References
- 1.Wartenberg D, Groves FD, Adelman AS. Acute lymphoblastic leukemia: epidemiology and etiology. In: Estey EH, Faderl SH, Kantarjian HM, editors. Hematologic Malignancies: Acute Leukemias. Springer; Berlin: 2008. pp. 77–93. [Google Scholar]
- 2.Mullighan CG. The molecular genetic makeup of acute lymphoblastic leukemia. Hematology Am Soc Hematol Educ Program. 2012;2012:389–396. doi: 10.1182/asheducation-2012.1.389. [DOI] [PubMed] [Google Scholar]
- 3.Yamamoto JF, Goodman MT. Patterns of leukemia incidence in the United States by subtype and demographic characteristics, 1997-2002. Cancer Causes Control. 2008;19:379–390. doi: 10.1007/s10552-007-9097-2.
- 4.Kadan-Lottick NS, Ness KK, Bhatia S, Gurney JG. Survival variability by race and ethnicity in childhood acute lymphoblastic leukemia. JAMA. 2003;290:2008–2014. doi: 10.1001/jama.290.15.2008.
- 5.Yang JJ, Cheng C, Devidas M, et al. Ancestry and pharmacogenomics of relapse in acute lymphoblastic leukemia. Nat Genet. 2011;43:237–241. doi: 10.1038/ng.763.
- 6.Papaemmanuil E, Hosking FJ, Vijayakrishnan J, et al. Loci on 7p12.2, 10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat Genet. 2009;41:1006–1010. doi: 10.1038/ng.430.
- 7.Sherborne AL, Hosking FJ, Prasad RB, et al. Variation in CDKN2A at 9p21.3 influences childhood acute lymphoblastic leukemia risk. Nat Genet. 2010;42:492–494. doi: 10.1038/ng.585.
- 8.Trevino LR, Yang W, French D, et al. Germline genomic variants associated with childhood acute lymphoblastic leukemia. Nat Genet. 2009;41:1001–1005. doi: 10.1038/ng.432.
- 9.Xu H, Yang W, Perez-Andreu V, et al. Novel susceptibility variants at 10p12.31-12.2 for childhood acute lymphoblastic leukemia in ethnically diverse populations. J Natl Cancer Inst. 2013;105:733–742. doi: 10.1093/jnci/djt042.
- 10.Perez-Andreu V, Roberts KG, Harvey RC, et al. Inherited GATA3 variants are associated with Ph-like childhood acute lymphoblastic leukemia and risk of relapse. Nat Genet. 2013;45:1494–1498. doi: 10.1038/ng.2803.
- 11.Perez-Andreu V, Roberts KG, Xu H, et al. A genome-wide association study of susceptibility to acute lymphoblastic leukemia in adolescents and young adults. Blood. 2015;125:680–686. doi: 10.1182/blood-2014-09-595744.
- 12.Migliorini G, Fiege B, Hosking FJ, et al. Variation at 10p12.2 and 10p14 influences risk of childhood B-cell acute lymphoblastic leukemia and phenotype. Blood. 2013;122:3298–3307. doi: 10.1182/blood-2013-03-491316.
- 13.Xu H, Zhang H, Yang W, et al. Inherited coding variants at the CDKN2A locus influence susceptibility to acute lymphoblastic leukaemia in children. Nat Commun. 2015;6:7553. doi: 10.1038/ncomms8553.
- 14.Wilcox AJ, Weinberg CR, Lie RT. Distinguishing the effects of maternal and offspring genes through studies of “case-parent triads.” Am J Epidemiol. 1998;148:893–901. doi: 10.1093/oxfordjournals.aje.a009715.
- 15.Lupo PJ, Nousome D, Kamdar KY, Okcu MF, Scheurer ME. A case-parent triad assessment of folate metabolic genes and the risk of childhood acute lymphoblastic leukemia. Cancer Causes Control. 2012;23:1797–1803. doi: 10.1007/s10552-012-0058-z.
- 16.Soldin OP, Nsouli-Maktabi H, Genkinger JM, et al. Pediatric acute lymphoblastic leukemia and exposure to pesticides. Ther Drug Monit. 2009;31:495–501. doi: 10.1097/FTD.0b013e3181aae982.
- 17.Yan K, Xu X, Liu X, et al. The associations between maternal factors during pregnancy and the risk of childhood acute lymphoblastic leukemia: a meta-analysis. Pediatr Blood Cancer. 2015;62:1162–1170. doi: 10.1002/pbc.25443.
- 18.Marshall GM, Carter DR, Cheung BB, et al. The prenatal origins of cancer. Nat Rev Cancer. 2014;14:277–289. doi: 10.1038/nrc3679.
- 19.Gruhn B, Taub JW, Ge Y, et al. Prenatal origin of childhood acute lymphoblastic leukemia, association with birth weight and hyperdiploidy. Leukemia. 2008;22:1692–1697. doi: 10.1038/leu.2008.152.
- 20.Santure AW, Spencer HG. Influence of mom and dad: quantitative genetic models for maternal effects and genomic imprinting. Genetics. 2006;173:2297–2316. doi: 10.1534/genetics.105.049494.
- 21.Lightfoot TJ, Johnston WT, Painter D, et al. Genetic variation in the folate metabolic pathway and risk of childhood leukemia. Blood. 2010;115:3923–3929. doi: 10.1182/blood-2009-10-249722.
- 22.Nousome D, Lupo PJ, Okcu MF, Scheurer ME. Maternal and offspring xenobiotic metabolism haplotypes and the risk of childhood acute lymphoblastic leukemia. Leuk Res. 2013;37:531–535. doi: 10.1016/j.leukres.2013.01.020.
- 23.Healy J, Bourgey M, Richer C, Sinnett D, Roy-Gagnon MH. Detection of fetomaternal genotype associations in early-onset disorders: evaluation of different methods and their application to childhood leukemia. J Biomed Biotechnol. 2010;2010:369534. doi: 10.1155/2010/369534.
- 24.Hussin J, Sinnett D, Casals F, et al. Rare allelic forms of PRDM9 associated with childhood leukemogenesis. Genome Res. 2013;23:419–430. doi: 10.1101/gr.144188.112.
- 25.Shi M, Murray JC, Marazita ML, et al. Genome wide study of maternal and parent-of-origin effects on the etiology of orofacial clefts. Am J Med Genet A. 2012;158A:784–794. doi: 10.1002/ajmg.a.35257.
- 26.Mitchell LE, Agopian AJ, Bhalla A, et al. Genome-wide association study of maternal and inherited effects on left-sided cardiac malformations. Hum Mol Genet. 2015;24:265–273. doi: 10.1093/hmg/ddu420.
- 27.Glazer ER, Perkins CI, Young JLJ, Schlag RD, Campleman SL, Wright WE. Cancer among Hispanic children in California, 1988-1994: comparison with non-Hispanic white children. Cancer. 1999;86:1070–1079.
- 28.Goggins WB, Lo FF. Racial and ethnic disparities in survival of US children with acute lymphoblastic leukemia: evidence from the SEER database 1988-2008. Cancer Causes Control. 2012;23:737–743. doi: 10.1007/s10552-012-9943-8.
- 29.Howey R, Cordell HJ. PREMIM and EMIM: tools for estimation of maternal, imprinting and interaction effects using multinomial modeling. BMC Bioinformatics. 2012;13:149. doi: 10.1186/1471-2105-13-149.
- 30.Ainsworth HF, Unwin J, Jamison DL, Cordell HJ. Investigation of maternal effects, maternal-fetal interactions and parent-of-origin effects (imprinting), using mothers and their offspring. Genet Epidemiol. 2011;35:19–45. doi: 10.1002/gepi.20547.
- 31.Weinberg CR, Wilcox AJ, Lie RT. A log-linear approach to case-parent-triad data: assessing effects of disease genes that act either directly or through maternal effects and that may be subject to parental imprinting. Am J Hum Genet. 1998;62:969–978. doi: 10.1086/301802.
- 32.Van den Oord EJCG, Vermunt JK. Testing for linkage disequilibrium, maternal effects, and imprinting with (in)complete case-parent triads, by use of the computer program LEM. Am J Hum Genet. 2000;66(1):335–338. doi: 10.1086/302708.
- 33.Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–959. doi: 10.1093/genetics/155.2.945.
- 34.Mao X, Bigham AW, Mei R, et al. A genomewide admixture mapping panel for Hispanic/Latino populations. Am J Hum Genet. 2007;80:1171–1178. doi: 10.1086/518564.
- 35.Cardon LR, Palmer LJ. Population stratification and spurious allelic association. Lancet. 2003;361:598–604. doi: 10.1016/S0140-6736(03)12520-2.
- 36.Pruitt KD, Brown GR, Hiatt SM, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;42:D756–D763. doi: 10.1093/nar/gkt1114.
- 37.Kokubu H, Ohtsuka T, Kageyama R. Mash1 is required for neuroendocrine cell development in the glandular stomach. Genes Cells. 2008;13(1):41–51. doi: 10.1111/j.1365-2443.2007.01146.x.
- 38.Gupta A, Wang Y, Browne C, et al. Neuroendocrine differentiation in the 12T-10 transgenic prostate mouse model mimics endocrine differentiation of pancreatic beta cells. Prostate. 2008;68(1):50–60. doi: 10.1002/pros.20650.
- 39.Ding L, Han L, Zhao J, He P, Zhang W. Neurogenin 3-directed cre deletion of Tsc1 gene causes pancreatic acinar carcinoma. Neoplasia. 2014;16(11):909–917. doi: 10.1016/j.neo.2014.08.010.
- 40.Kaucher AV, Oatley MJ, Oatley JM. NEUROG3 is a critical downstream effector for STAT3-regulated differentiation of mammalian stem and progenitor spermatogonia. Biol Reprod. 2012;86:164, 1–11. doi: 10.1095/biolreprod.111.097386.
- 41.Database of Single Nucleotide Polymorphisms (dbSNP) National Center for Biotechnology Information, National Library of Medicine; Bethesda, MD: 2015. http://www.ncbi.nlm.nih.gov/SNP/
- 42.Boyle AP, Hong EL, Hariharan M, et al. Annotation of functional variation in personal genomes using RegulomeDB. Genome Res. 2012;22:1790–1797. doi: 10.1101/gr.137323.112.
- 43.Brown GR, Hem V, Katz KS, et al. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res. 2015;43:D36–D42. doi: 10.1093/nar/gku1055.
- 44.Sugihara T, Wadhwa R, Kaul SC, Mitsui Y. A novel testis-specific metallothionein-like protein, tesmin, is an early marker of male germ cell differentiation. Genomics. 1999;57:130–136. doi: 10.1006/geno.1999.5756.
- 45.Choi Y, Sims GE, Murphy S, Miller JR, Chan AP. Predicting the functional effect of amino acid substitutions and indels. PLoS One. 2012;7:e46688. doi: 10.1371/journal.pone.0046688.
- 46.Fuller PJ. Novel interactions of the mineralocorticoid receptor. Mol Cell Endocrinol. 2015;408:33–37. doi: 10.1016/j.mce.2015.01.027.
- 47.Mirshahi M, Golestaneh N, Valamanesh F, Agarwal MK. Paradoxical effects of mineralocorticoids on the ion gated sodium channel in embryologically diverse cells. Biochem Biophys Res Commun. 2000;270:811–815. doi: 10.1006/bbrc.2000.2501.
- 48.Westbye MP, Feyzi E, Aas PA, et al. Human AlkB homolog 1 is a mitochondrial protein that demethylates 3-methylcytosine in DNA and RNA. J Biol Chem. 2008;283(36):25046–25056. doi: 10.1074/jbc.M803776200.
- 49.Konishi N, Nakamura M, Ishida E, et al. High expression of a new marker PCA-1 in human prostate carcinoma. Clin Cancer Res. 2005;11(14):5090–5097. doi: 10.1158/1078-0432.CCR-05-0195.
- 50.Hotta K, Sho M, Fujimoto K, et al. Clinical significance and therapeutic potential of prostate cancer antigen-1/ALKBH3 in human renal cell carcinoma. Oncol Rep. 2015;34(2):648–654. doi: 10.3892/or.2015.4017.
- 51.Safran M, Dalah I, Alexander J, Rosen N. GeneCards Version 3: the human gene integrator. doi: 10.1093/database/baq020. www.genecards.org.
- 52.Bustamante CD, Burchard EG, De la Vega FM. Genomics for the world. Nature. 2011;475:163–165. doi: 10.1038/475163a.