Abstract
Nonsyndromic orofacial clefts (OFCs) are a common birth defect and are phenotypically heterogeneous, both in the structures affected by the cleft (cleft lip [CL] versus cleft lip and palate [CLP]) and in other features, such as the severity of the cleft. Here, we focus on bilateral and unilateral clefts as one dimension of OFC severity, because the genetic architecture of these subtypes is not well understood. We tested for subtype-specific genetic associations in 44 bilateral CL (BCL) cases, 434 unilateral CL (UCL) cases, 530 bilateral CLP (BCLP) cases, 1123 unilateral CLP (UCLP) cases, and unrelated controls (N = 1626), using a mixed-model approach. While no novel loci were found, the genetic architecture of UCL was distinct from that of BCL, with 44.03% of suggestive loci having different effects between the two subtypes. To further understand the subtype-specific genetic risk factors, we performed a genome-wide scan for modifiers and found a significant modifier locus on 20p11 (p=7.53×10−9), 300kb downstream of PAX1, that was associated with higher odds of BCL vs. UCL and replicated in an independent cohort (p=0.0018), with no effect in BCLP (p>0.05). We further found that this locus was associated with normal human nasal shape. Taken together, these results suggest bilateral and unilateral clefts may have different genetic architectures. Moreover, our results suggest BCL, the rarest form of OFC, may be genetically distinct from the other OFC subtypes. This expands our understanding of modifiers for OFC subtypes and further elucidates the genetic mechanisms behind the phenotypic heterogeneity in OFCs.
Introduction
Orofacial clefts (OFCs) are common, complex birth defects [MIM: 608864]. Affecting 1 in 700 births worldwide, they are caused when one or more of the developmental programs that determine the form of the face during the first seven weeks of pregnancy do not occur properly1. While some OFCs present in conjunction with other congenital abnormalities, a majority of OFCs are classified as isolated, nonsyndromic (nsOFC); these are caused by a complex combination of genetic and environmental factors and have been the focus of numerous genome-wide association studies2–13. OFCs also have striking phenotypic heterogeneity. OFCs are typically categorized into three subtypes: cleft lip only (CL), cleft lip and palate (CLP), and cleft palate only (CP), where CL includes clefts confined to the lip and primary palate, CLP includes clefts that affect the lip and extend into the secondary palate (or roof of the mouth), and CP affects the secondary palate only. CL and CLP are often combined into a more general category of cleft lip with or without cleft palate (CL/P) based on the shared defect of the primary palate. OFCs affecting the primary palate can also be further subdivided based on morphological details to capture severity, including the laterality (unilateral or bilateral), the side of unilateral clefts (left or right), or the completeness of the cleft.
Population-based studies estimating recurrence risks have focused on different classifications of OFCs, and the resulting estimates can inform genetic models and the design of association studies. For example, among CLP cases, there is no difference in the risk of either CL or CLP among their first-degree relatives, suggesting a shared genetic etiology14; 15. This contributes to the rationale for studying CL/P in genetic association studies, and many of the known risk loci show similar effects between CL and CLP2–4; 16; 17. However, less is known about severity in CL and CLP or whether there is a separate genetic component to cleft lip severity. Recurrence risk estimates based on severity are limited by sample size and have yielded mixed results. Semi-quantitative measures of completeness showed no effect of severity on estimated recurrence risks15. However, the recurrence risk for bilateral clefts is higher than for unilateral clefts, indicating this more severe cleft type tends to recur more often in family members14; 18 and suggesting a potentially distinct genetic etiology. Previous studies examining genetic factors associated with bilateral vs. unilateral clefts have been limited to targeted sequencing of a few selected candidate loci17, although this work has suggested the presence of a genetic contribution to the different subtypes of cleft lip. Therefore, we set out to perform a genome-wide association study (GWAS) to determine if there are additional genetic variants that are either associated with cleft severity or are genetic modifiers for the cleft subtype that forms, focusing on bilateral and unilateral clefting in CL and CLP cases.
Materials and Methods
Sample collection and SNP quality control
This study used samples from the Pittsburgh Orofacial Cleft (POFC) Study. The details of the sample collection and genotype quality control (QC) have been described previously2; 19–21. Briefly, these samples came from 18 sites in 13 countries, including in the continental United States, Guatemala, Argentina, Colombia, Puerto Rico, China, Philippines, Denmark, Turkey, and Spain. All sites had Institutional Review Board (IRB) approval, both locally and at the University of Pittsburgh or University of Iowa, with written informed consent for genomic studies and data sharing. The original study recruited individuals with OFCs, their unaffected relatives, and unrelated controls (individuals with no known family history of OFCs or other craniofacial anomalies; N = 1626). For the current study, affected individuals were classified as either having a bilateral cleft lip (BCL; N = 44), a bilateral cleft lip and palate (BCLP; N = 530), a unilateral cleft lip (UCL; N = 434), or a unilateral cleft lip and palate (UCLP; N = 1123). Although this sample was not recruited with a population-based approach, the relative frequencies of these cleft types in the POFC study are consistent with epidemiological reports of subtypes22. Each cleft subtype was present in each ancestry group, as defined by principal components (PCs) of genetic markers (Table S1, Figure S1). Subjects where the specific subtype of cleft was not known were excluded from this study. Related, affected individuals were retained in this study and a genetic relatedness matrix (GRM) was used to adjust for relationships within and across families (see below).
Samples were genotyped for approximately 580,000 single nucleotide polymorphic (SNP) markers from the Illumina HumanCore+Exome array, of which approximately 539,000 SNPs passed quality control filters recommended by the Center for Inherited Disease Research (CIDR) and the Genetics Coordinating Center (GCC) at the University of Washington2. This data was then phased with SHAPEIT223 and imputed with IMPUTE224 to the 1000 Genomes Project Phase 3 release (September 2014) reference panel. The most-likely imputed genotypes were selected for statistical analysis if the highest probability (r2) > 0.9. SNP markers showing deviation from Hardy-Weinberg equilibrium in European controls, a minor allele frequency or MAF < 5%, or imputation INFO scores < 0.5 were filtered out of all subsequent analyses. The information for the genotyped markers was retained after imputation and the imputed values for these variants were only used to assess concordance. A GRM was calculated from a set of LD-pruned genotyped SNPs as defined by GCTA using the package SNPRelate25.
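The marker-level filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the `Variant` record and its field names are hypothetical, and the Hardy-Weinberg cutoff `HWE_P_MIN` is an assumed placeholder because the text does not state the threshold that was applied. The MAF and INFO cutoffs are the ones given in the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical per-SNP summary record (field names are illustrative)."""
    rsid: str
    maf: float    # minor allele frequency
    info: float   # imputation INFO score
    hwe_p: float  # Hardy-Weinberg p-value in European controls


MAF_MIN = 0.05    # from the text: MAF < 5% is filtered out
INFO_MIN = 0.5    # from the text: INFO < 0.5 is filtered out
HWE_P_MIN = 1e-4  # ASSUMPTION: the text does not state the HWE cutoff


def passes_qc(v: Variant) -> bool:
    """Return True if the variant survives all three marker filters."""
    return v.maf >= MAF_MIN and v.info >= INFO_MIN and v.hwe_p >= HWE_P_MIN


def filter_variants(variants):
    """Keep only the variants that pass QC, preserving input order."""
    return [v for v in variants if passes_qc(v)]
```

Applied to a list of such records, `filter_variants` would return the subset carried forward to association testing.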
Statistical analyses
Subtype-specific genome-wide association study (GWAS):
Single-subtype genome-wide tests were done by comparing cases from each subtype to a group of unrelated controls to test for genetic variants associated with each cleft subtype. The association between every genetic variant and laterality type was tested using the generalized linear mixed model (GMMAT)26 as implemented in the GENESIS software package27. Sex and the estimated GRM were adjusted for under the null model to account for both population substructure and relatedness. The control group was the same for all analyses. SNPs with association p-values less than 5 × 10−8 were considered genome-wide significant and those with p-values less than 1 × 10−5 were considered ‘suggestive’ and were used for downstream enrichment and comparison analyses. The unadjusted odds ratio (OR) for each SNP was estimated for the additive model using the minor allele frequency in cases compared to controls28; 29. Regional association plots were made with LocusZoom, where the LD blocks and recombination rates were estimated from European populations30.
Modifier GWAS:
We identified genetic modifiers (genetic variants that are associated with phenotypic heterogeneity or expressivity) using case-case group comparisons by directly comparing allele frequencies at each SNP between unilateral and bilateral cleft cases. Thus, this approach has high power to identify genetic risk factors that differ between two subtypes, but no power to find factors important in both groups (i.e., SNPs detected in previous GWAS of CL, CLP, or the combined CL/P group)25. Therefore, this test has the potential to identify new loci for which there is an effect in only one subtype or where the effects are different between two groups. Such loci may be masked in an overall scan when the two groups are combined. We performed modifier analyses for severity separately in the CL and CLP subtypes (UCL vs. BCL and UCLP vs. BCLP) and combined as CL/P (UCL/P vs. BCL/P). Similar to the subtype-specific analyses above, these tests were done using GMMAT26 as implemented in GENESIS27, adjusting for sex and the GRM to account for both population substructure and relatedness. The OR for each SNP was estimated using the minor allele frequency in bilateral cleft cases compared to unilateral cleft cases28; 29. Regional association plots were made with LocusZoom30.
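To make the case-case odds ratio concrete, the following Python sketch computes an allelic OR for the minor allele in bilateral versus unilateral cases, with a 95% confidence interval. It is a minimal illustration, not the estimator used in the study: the Woolf (log-scale) interval is an assumed choice, the function name is hypothetical, and the allele counts in the usage note are invented for demonstration.

```python
import math


def allelic_odds_ratio(minor_a, major_a, minor_b, major_b, z=1.96):
    """Allelic odds ratio of group A vs. group B with a Woolf (log) CI.

    Arguments are allele counts (two alleles per person), for example
    A = bilateral cases and B = unilateral cases.
    Returns (odds_ratio, ci_lower, ci_upper).
    """
    counts = (minor_a, major_a, minor_b, major_b)
    if min(counts) <= 0:
        raise ValueError("all allele counts must be positive")
    odds_ratio = (minor_a * major_b) / (major_a * minor_b)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

For example, with hypothetical counts of 30 minor and 58 major alleles among 44 bilateral cases, and 120 minor and 748 major alleles among 434 unilateral cases, `allelic_odds_ratio(30, 58, 120, 748)` gives an OR above 1, meaning the minor allele is more frequent in the bilateral group.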
Comparisons between CL and CLP analyses:
The estimated ORs for suggestive SNPs (i.e., those with p < 1 × 10−5) in the subtype-specific analyses were compared both within a single severity subtype across cleft type (e.g., BCL vs. BCLP), and across severity types within a single cleft type (e.g., UCL vs. BCL). To determine whether the SNPs associated with individual subtypes were novel compared to what has already been reported in previous GWAS of CL, CLP, or CL/P, the SNPs in these analyses within 50kb of previously associated risk SNPs2; 5; 6; 21 were also identified. A similar approach was used for the modifier analysis: the suggestive loci from either the CL or CLP modifier analyses were compared to see if they either had overlapping 95% confidence intervals (CIs) or gave estimated effects in the same direction. A chi-square test was used to determine whether the number of SNPs that both had similar CIs and were previously reported in the literature was greater than expected by chance.
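The two comparison steps described above, checking whether two confidence intervals overlap and testing a 2×2 table of counts with a chi-square test, can be sketched as follows. This is an illustrative sketch under assumptions: the function names are hypothetical, the test shown is a Pearson chi-square without continuity correction (the text does not say which variant was used), and the p-value uses the closed form for one degree of freedom.

```python
import math


def intervals_overlap(ci_a, ci_b):
    """True if two closed intervals (lower, upper) share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction).

    Table layout (counts of SNPs):
                        previously reported   not reported
        overlapping CI          a                  b
        non-overlapping CI      c                  d
    Returns (chi2_statistic, p_value).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP pair with ORs of (1.2, 2.0) and (1.8, 3.1) would count as overlapping, whereas (1.2, 2.0) and (2.4, 3.1) would not.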
Replication cohort
To replicate the statistically significant results from our modifier analysis, data from the GENEVA consortium was used, which was described previously2; 4; 21. Briefly, this cohort recruited case-parent trios, where the affected individual had an oral cleft. The samples were genotyped for approximately 589K SNPs using the Illumina Human610-Quadv.1_B BeadChip, phased using SHAPEIT, and imputed to the 1000 Genomes Project Phase I (June 2011) reference panel using IMPUTE2. Imputed genotype probabilities were converted to most-likely genotype calls with GTOOL. This dataset was subsequently filtered to only include common SNPs with a MAF > 5%. A subset of individuals was included in both the POFC study and the GENEVA consortium, and these were removed from the replication analysis so that the two groups would be independent. Only the cases from this GENEVA cohort were selected, and they were classified as BCL (N = 28), UCL (N = 326), BCLP (N = 301), UCLP (N = 678). PCs of ancestry were calculated using PLINK (v1.9)31, and a majority of the cohort was of Asian (71.6%) or European (26.3%) ancestry (Figure S2). Because the replication cohort did not include related individuals, the modifier analyses (comparing BCL vs. UCL and BCLP vs. UCLP) were conducted using logistic regression models in PLINK (v1.9), with sex and the first 4 PCs as quantitative covariates, instead of the mixed-model approach that adjusts for relatedness implemented in GENESIS. Because of the small sample sizes in the replication cohort and the differences in genotyping arrays and imputation panels, only regions that were significant in the original modifier analysis were tested in this replication strategy. P-values less than a Bonferroni correction for the number of SNPs in the region (0.05/the number of SNPs tested) were considered to be evidence of significant replication.
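The region-wide replication threshold is simple arithmetic, sketched below in Python with hypothetical function names. The usage note takes the 8 SNPs and p = 0.0018 that the Results report for the CL modifier analysis at 20p11.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def replicates(p_value, n_tests, alpha=0.05):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_tests, alpha)
```

With 8 SNPs tested in the region, the threshold is 0.05 / 8 = 0.00625, so a SNP with p = 0.0018 would meet the replication criterion under this rule.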
Association with normal facial variation
The genome-wide significant modifier locus was further examined in relation to normal facial variation by reviewing the association results of SNPs in this locus in a GWAS meta-analysis of facial shape in two large cohorts (n= 8,246) from the US (MetaUS) and UK (MetaUK)32. To analyze normal facial variation, the original study used a data-driven global-to-local facial segmentation approach, and a multivariate GWAS was then performed in each of the resulting 63 hierarchically arranged facial segments. More information on the analysis pipeline and the cohorts can be found in the initial study33.
Epigenomic context of results
Topologically associated domains (TADs) were defined for significantly associated loci using the H1-ESC cell line in 3D Genome Browser34. Functional enrichment was tested by first annotating all of the SNPs to the craniofacial functional regions defined by Wilderman et al.35 for human embryos at CS13, CS14, CS15, CS17, and CS20 (4.5–8 weeks post conception). Enrichment tests were done using a chi-square test with the top SNPs (p < 1 × 10−3) for both modifier analyses and each subtype analysis, and estimated ORs and their 95% CIs were calculated.
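The annotation step that precedes the enrichment test can be sketched as an interval lookup that produces the 2×2 counts passed to a chi-square test. This is a simplified illustration, not the annotation procedure used in the study: it assumes a single chromosome, closed and non-overlapping regions sorted by start position, and hypothetical function names and input layouts.

```python
from bisect import bisect_right


def in_any_region(pos, regions):
    """True if pos lies in one of the sorted, non-overlapping (start, end) regions.

    Regions are treated as closed intervals on a single chromosome.
    """
    starts = [start for start, _ in regions]
    i = bisect_right(starts, pos) - 1  # last region starting at or before pos
    return i >= 0 and pos <= regions[i][1]


def enrichment_table(snps, regions, p_cutoff=1e-3):
    """Count SNPs by (top vs. rest) and (inside vs. outside the annotation).

    snps: iterable of (position, p_value) pairs.
    Returns (top_in, top_out, rest_in, rest_out).
    """
    top_in = top_out = rest_in = rest_out = 0
    for pos, p_value in snps:
        inside = in_any_region(pos, regions)
        if p_value < p_cutoff:  # "top" SNPs, p < 1e-3 as in the text
            if inside:
                top_in += 1
            else:
                top_out += 1
        elif inside:
            rest_in += 1
        else:
            rest_out += 1
    return top_in, top_out, rest_in, rest_out
```

An OR above 1 computed from these four counts would indicate enrichment of top SNPs in the annotation, and an OR below 1 would indicate depletion.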
Results
Subtype-specific analysis
We performed a subtype-specific genome-wide analysis for bilateral cleft lip (BCL), unilateral cleft lip (UCL), bilateral cleft lip and palate (BCLP), and unilateral cleft lip and palate (UCLP) cases by comparing cases of each subtype to unaffected controls. This approach can detect variants associated with increased risk for an OFC in general, but also has the potential to identify variants that increase the risk for one or more subtypes of OFC. A single SNP in chromosome 3q28 achieved genome-wide significance in the analysis of BCL (rs72439195; p = 3.69 × 10−8), and 90 regions yielded suggestive evidence, most of which have not been previously implicated in OFC formation. However, some of these regions, like 14q32.33 (lead SNP: rs61996057; p = 8.07 × 10−8; Figure S3A, Figure S4A, Figure S5, Table S2), have been implicated in syndromes with facial dysmorphisms36–38. In the analysis of UCL, two loci reached genome-wide significance (8q24 and 1q32), both of which are recognized genetic risk loci for CL/P (Figure S3B, Figure S4B, Table S3)2; 4; 7–10. Among the 21 suggestive loci, 17 have not been previously associated with OFCs, which may reflect a lack of GWAS focused specifically on CL. Some of these loci, such as 2q13 (lead SNP: rs6542368; p = 1.06 × 10−7; Figure S6), are plausible candidates for craniofacial dysmorphism39. For both BCLP and UCLP, multiple recognized genes/regions, including 8q24 and 17p13, reached genome-wide significance (Figure S3C–D, Figure S4C–D, Table S4, Table S5), and 35 and 41 loci, respectively, reached suggestive significance in this analysis2; 4; 5; 8; 10; 17.
Because of the apparent differences in suggestive and significant loci in the subtype-specific GWASs, we wanted to characterize similarity or dissimilarity of the overall genetic architectures of UCL, UCLP, BCL, and BCLP. Therefore, we performed pairwise analyses comparing the odds ratios and 95% CIs for SNPs identified as suggestive in the GWAS for each subtype being compared. In the comparison of BCL and UCL SNPs, we found a striking difference in estimated ORs in which 44.03% of 738 SNPs did not have overlapping CIs. A majority of these SNPs originating from the BCL analysis had an OR near 1 in the UCL analysis (Figure 1), indicating substantial differences in the genetic architecture of BCL, the more severe group. This was also seen, although to a lesser degree, when the BCL subtype was compared to BCLP, where the 95% CIs for the estimated ORs did not overlap for 34.1% of 1178 suggestive SNPs (Figure S7). In contrast, BCLP and UCLP were quite similar, with 94.7% of their 1093 SNPs showing overlapping OR confidence intervals (Figure 1). We also found SNPs with different effects in the subtype-specific analyses were less likely to have been previously reported in analyses of the combined group CL/P, suggesting these may be masked in traditional analyses that combine subtypes (Figure 1). For example, in the BCL-UCL comparison, 26.8% of SNPs with overlapping estimated effect sizes were recognized CL/P risk SNPs, indicating these SNPs may predispose to OFC risk but have no effect on specific subtypes. However, only 1.8% of SNPs differing in their effect sizes were previously reported, significantly less than expected by chance alone (p = 2.41 × 10−20). This pattern held for all comparison groups (Table S6). We reasoned that SNPs predisposing to any type of bilateral cleft could be identified by first selecting SNPs that had non-overlapping CIs between BCL and UCL that also had overlapping CIs between BCL and BCLP. However, only 4 SNPs met these criteria and all of them also showed nominal significance in UCLP and had overlapping CIs. We employed the same strategy to identify SNPs predisposing to any type of unilateral cleft, but were similarly unsuccessful, supporting the notion that subtype-specific risk factors are not shared between CL and CLP in this sample.
Modifier analysis
To disentangle the effects of SNPs on specific subtypes from more general effects on OFC risk, we performed a genome-wide bilateral vs. unilateral modifier analysis in CL and CLP cases. Because this is a case-to-case group comparison, this analysis would not be able to detect variants generally important for both CL and CLP risk, but would detect variants important for the formation of one severity subtype compared to the other. In the modifier analysis of CL, one locus on chromosome 20p11 reached genome-wide significance (lead SNP: rs143865354; p = 7.53×10−9) and 47 other SNPs yielded suggestive significance (Figure 2A; Table S7; Figure S8A). In the modifier analysis for CLP, no loci reached genome-wide significance, but 19 loci yielded suggestive significance (Figure 2B; Table S8; Figure S8B). Interestingly, when CL and CLP were combined (as is typical in genetic analyses of OFCs), no loci reached genome-wide significance, and only 3 loci gave suggestive significance (Figure S9; Table S9), raising the possibility that these modifiers may not be shared between CL and CLP.
The associated SNPs on 20p11 lie within LINC01432, and are within the same topologically associated domain as PAX1 [MIM: 167411] (Figure 3A; Figure S10). This locus was not significant (p > 0.05) in the modifier analysis of CLP (Figure 3B). Additionally, when the OR for the lead SNP in this region was compared between CL and CLP cases, the direction of effect was not consistent (with either a 95% CI or a 99% CI; Figure 3C). We replicated the 20p11 region in an independent sample of 28 BCL cases, 329 UCL cases, 306 BCLP cases, and 685 UCLP cases. In this 20p11 region, there were 8 SNPs passing filtering in the CL modifier analysis. While none of these SNPs were the same as those in the original analysis, one SNP (rs28970569) was also a significant modifier in the replication cohort (OR = 3.83, 95% CI = 1.64–8.95, p = 0.0018; Table S10). In the CLP modifier analysis, 9 SNPs passed our filters but none were significant modifiers, consistent with the results for 20p11 in our discovery sample (p > 0.05; Table S11). Additionally, we wanted to determine the extent to which the genetic modifiers in CL were similar to the genetic modifiers in the CLP genome-wide analysis. To test this, we compared SNPs that were suggestive (p < 1 × 10−5) in either the CL or CLP modifier analyses. Notably, there was no overlap between the list of suggestive SNPs in CL and the list of SNPs suggestive in CLP. Moreover, the estimated ORs were not positively correlated and all of the suggestive SNPs in the analysis of CL had no effect in CLP and vice versa (Figure 4), and a majority of the SNPs in each analysis were not near regions previously associated with CL/P (Table S12). Cumulatively, these results suggest the 20p11 modifier for bilateral vs. unilateral OFCs is specific to CL.
Although the 20p11 locus had not previously been associated with risk to OFCs, it has been associated with variation in normal facial structures. Therefore, we next investigated whether the BCL modifier SNPs were also associated with normal facial variation as that could give insights into how these SNPs might influence cleft severity. We found that rs6036034, a SNP in the 20p11 region in LD with rs143865354 (R2 = 0.522; p= 4.75× 10−8 in BCL vs. UCL), was associated with normal variation in nose morphology (p = 2.63× 10−11), specifically projection of the nasal tip and columella and breadth of the nasal alae (Figure 5). These are the same structures disrupted by cleft lip and are derived from the lateral nasal processes where PAX1 is expressed40. Moreover, rs143865354 shows modest evidence of being an eQTL for PAX1 in skin (p=2.9× 10−5) in GTEx.
Functional enrichment
We were also interested in testing whether differences in genetic architecture in BCL, UCL, BCLP, and UCLP at the SNP level were also reflected in functional elements involved in facial development. Therefore, we tested whether SNPs associated with each subtype were enriched in similar functional regions defined by epigenetic marks in human embryonic craniofacial tissues35. For some elements, the apparent enrichment or depletion was consistent across subtypes. For example, BCL, UCL, BCLP, and UCLP SNPs were similarly depleted in heterochromatin regions, and most were enriched in regions of strong transcription. However, there were some regions showing opposite enrichments in the different subtypes. For example, zinc finger repeat regions were enriched in both BCLP and UCLP, but were depleted in BCL (Figure 6). Interestingly, the severity modifiers for both CL and CLP were depleted in regions of weak transcription, and enriched in regions of low activity. Some of the suggestive modifier loci for CLP were enriched in bivalent transcription start sites, but none of the putative modifiers for risk to CL were enriched in any functional domain. These enrichments and depletions were consistent throughout craniofacial development (4.5–8 weeks post conception; Figure S11; Table S13). These observations, while not definitive, lend some support to the idea that although the genetic underpinnings for cleft subtypes are distinct at the SNP level, this may not extend entirely to gross differences in functional element enrichments. Deciphering the true underlying mechanism(s) resulting in bilateral and unilateral CL and CLP will require a locus-by-locus investigation.
Discussion
While there have been many studies identifying genetic variants that influence overall risk to cleft lip with or without cleft palate (CL/P) and cleft palate only (CP), the genetic underpinnings of specific phenotypic subtypes of cleft lip are less studied. This report furthers our understanding of genetic variants associated with specific subtypes of OFC: bilateral cleft lip (BCL), unilateral cleft lip (UCL), bilateral cleft lip and palate (BCLP), and unilateral cleft lip and palate (UCLP). We used a modifier analysis, which provides more power to find genetic loci differing between two groups, and found one locus on 20p11 that replicated in an independent cohort as significantly associated with the formation of a BCL over a UCL. The associated SNPs were located in several long, noncoding RNAs and within the same TAD (300kb downstream) as the PAX1 gene. While PAX1 has not been associated with OFC like its paralog PAX941, they both are transcription factors with similar DNA-binding domains regulating chondrocyte differentiation and the formation of intervertebral discs, and knock-out mouse models show skeletal abnormalities42–44. There is also evidence that PAX1 is upregulated by SHH, and in turn, upregulates SOX5 and BMP443–45. There is only limited literature describing PAX1 expression in the developing face40 and it has not been previously associated with risk to nonsyndromic OFCs, but PAX1 is in a pathway with other genes known to be associated with nonsyndromic OFCs46–51. Additionally, recent studies have shown mutations in PAX1 cause otofaciocervical syndrome (OTFCS [MIM: 615560]), which presents with facial dysmorphisms52; 53, and studies of normal facial variation have found this locus to be associated with nasal width (the distance between left and right cartilaginous nasal ala) in people of European descent54, Latin American descent55, and Korean descent56.
The link between SNPs at the PAX1 locus and normal facial shape was further substantiated in our analysis, with effects observed in the nasal tip, columella and alae. These anatomical structures are derived from the lateral and medial nasal processes in the embryo, which form the primary palate. Thus, it is biologically plausible that PAX1 could affect the development of specific types of craniofacial abnormalities; however, more work is needed to investigate the underlying mechanisms.
While 20p11 was the only genome-wide significant modifier found in this study, this may partly be due to limited sample size in some of the OFC subtypes. It is important to note that when a modifier analysis was conducted on all combined CL and CLP cases, fewer loci reached even suggestive significance, suggesting CL and CLP may have distinct modifiers. Consistent with this, the suggestive modifiers for risk in CL and CLP showed no overlap in estimated effect on risk. This suggests that the lack of overlap is not entirely due to a difference in sample size, but instead that there is a biological difference in the genetics of laterality in CL compared to CLP.
This study tested for severity modifiers at a genome-wide level, but we previously tested for modifiers in 13 recognized GWAS regions known to be associated with OFCs57 and found SNPs in IRF6 [MIM: 607199] were associated with the formation of a unilateral CL/P compared to bilateral CL/P17. In our study, no SNPs in IRF6 reached suggestive significance. Our study was larger than the previous study (2339 cases vs. 1001 cases); therefore, this difference may reflect effects of modifiers for cleft subtypes in regions of the genome not recognized by previous GWAS of OFCs. This is not surprising given OFC subtypes are typically combined for GWAS, which maximizes statistical power to detect loci associated with overall risk, but would mask loci with different effects in subtypes.
We also conducted analyses comparing each subtype to unrelated controls. This analysis should find loci associated with either overall risk or one particular cleft subtype, but would have less statistical power to detect loci that differ between two subtypes. Most loci achieving genome-wide significance in these analyses were those already recognized to be associated with risk to OFCs2; 5; 6; 21. There were, however, some loci yielding suggestive evidence of association in several of the subtype-specific analyses that were not previously reported but could be in the causal pathway for syndromes with facial dysmorphisms. For example, SNPs in 14q32.33 gave suggestive evidence of association for BCL, with a distinct effect only seen in BCL, and 2q13 yielded suggestive evidence of association for UCL. Microdeletions in both of these regions have been associated with syndromes that include facial dysmorphisms36–39. The 14q32.33 region also contains JAG2, which is part of the Notch signaling pathway and is important for craniofacial development58–60.
Overall, our analyses demonstrated that BCL was most distinct from the other three subtypes analyzed and that these modifiers were not shared between CL and CLP. We found that the associated SNPs in all four OFC subtypes were enriched in regions associated with transcription and depleted in heterochromatin regions. This was expected because nonsyndromic OFCs form from the disruption of one of the processes involved in facial development and thus variants associated with any OFC subtype should be enriched in regions active during facial development. It is also consistent with the study defining the functional regions, which showed enrichment in active states for SNPs involved in overall OFC risk35. Importantly, there were some differences in functional enrichment by subtype. For example, SNPs associated with BCLP and UCLP were enriched in zinc finger repeat regions, whereas SNPs showing some evidence of association with BCL were depleted in this same region. This further emphasizes the possibility of a distinct genetic architecture associated with risk to BCL. Additionally, the modifiers for both CL and CLP were depleted in regions associated with active transcription and strongly enriched in regions of low activity. This result is somewhat surprising given it is the opposite of what would be expected for an analysis involving craniofacial development. However, the biological mechanism by which modifiers could affect a phenotype is not known. Therefore, this highlights the need for more studies that test how modifiers mechanistically act.
The findings from this study should also be considered in the context of its limitations. Many of the subtypes of clefting, particularly BCL, had small sample sizes. With small sample sizes, it is likely that other subtype-specific genetic loci and modifiers exist that we were unable to detect in this statistical analysis. Additionally, because the subtype-specific analyses were not independent due to the shared control group and the related individuals, a formal test for heterogeneity could not be conducted. The confidence intervals in our analyses are less precise in the comparisons involving smaller groups, and so it is likely that the estimates for different genetic effects are conservative, and that the genetic heterogeneity between these subtypes is larger than we see with our current population. We were also unable to test for heterogeneity across ancestry groups while testing for subtype-specific genetic risk loci and severity modifiers. This cohort is multiethnic, including people of European, Asian, and Latin American ancestry, and previous studies have shown ancestry-specific association with risk for OFCs2. Studies with larger sample sizes for these clefting subtypes could lead to the discovery of more associated genetic loci and test for differences in associated loci between different ancestry populations.
In summary, we conducted a genome-wide scan for severity modifiers in a case-case and case-control design focused on nonsyndromic CL and CLP and found a significant modifier in 20p11 downstream from PAX1 associated with increased risk for BCL over UCL. We also showed these modifiers for CL and CLP were distinct, with the modifiers of one cleft subtype having little to no genetic effect in the other subtypes. Furthermore, in the subtype-specific GWASs, we found several suggestive loci that had not been identified in previous GWASs that combined cleft subtypes. We also found loci associated with BCL were the most distinct from those associated with other cleft subtypes, suggesting the etiology of this rarest subtype of cleft may be unique. Overall, this study expands our understanding of the genetic underpinnings of the genetic and phenotypic heterogeneity of OFCs and suggests new areas of research on cleft lip subtypes.
Supplementary Material
Acknowledgments:
The authors thank the dedicated field staff, collaborators, and participating families for their important contributions to this study. This work was supported by grants from the National Institutes of Health (NIH) including: R00-DE025060 [EJL], X01-HG007485 [MLM, EF], R01-DE016148 [MLM, SMW], U01-DE024425 [MLM], R37-DE008559 [JCM, MLM], R01-DE009886 [MLM], R21-DE016930 [MLM], R01-DE012472 [MLM], R01-DE014581 [TB], U01-DE018993 [TB]. Additional support came from the National Institute of Dental and Craniofacial Research (U01-DE020078; R01-DE027023 [SMW]). Genotyping was funded by the National Human Genome Research Institute (X01-HG007821), and funding for initial genomic data cleaning by the University of Washington was provided by contract HHSN268201200008I from the National Institute for Dental and Craniofacial Research awarded to the Center for Inherited Disease Research.
Footnotes
Declaration of Interests: None of the authors report any conflicts of interest.
Data Availability
Original data used in this study are available at dbGaP (phs000774.v2.p1, phs000094.v1.p1, https://www.ncbi.nlm.nih.gov/gap/).
References
- 1. Leslie EJ, and Marazita ML (2013). Genetics of cleft lip and cleft palate. Am J Med Genet C Semin Med Genet 163C, 246–258.
- 2. Leslie EJ, Carlson JC, Shaffer JR, Feingold E, Wehby G, Laurie CA, Jain D, Laurie CC, Doheny KF, McHenry T, et al. (2016). A multi-ethnic genome-wide association study identifies novel loci for non-syndromic cleft lip with or without cleft palate on 2p24.2, 17q23 and 19q13. Hum Mol Genet 25, 2862–2872.
- 3. Leslie EJ, Carlson JC, Shaffer JR, Butali A, Buxo CJ, Castilla EE, Christensen K, Deleyiannis FW, Leigh Field L, Hecht JT, et al. (2017). Genome-wide meta-analyses of nonsyndromic orofacial clefts identify novel associations between FOXE1 and all orofacial clefts, and TP63 and cleft lip with or without cleft palate. Hum Genet 136, 275–286.
- 4. Beaty TH, Murray JC, Marazita ML, Munger RG, Ruczinski I, Hetmanski JB, Liang KY, Wu T, Murray T, Fallin MD, et al. (2010). A genome-wide association study of cleft lip with and without cleft palate identifies risk variants near MAFB and ABCA4. Nat Genet 42, 525–529.
- 5. Yu Y, Zuo X, He M, Gao J, Fu Y, Qin C, Meng L, Wang W, Song Y, Cheng Y, et al. (2017). Genome-wide analyses of non-syndromic cleft lip with palate identify 14 novel loci and genetic heterogeneity. Nat Commun 8, 14364.
- 6. Huang L, Jia Z, Shi Y, Du Q, Shi J, Wang Z, Mou Y, Wang Q, Zhang B, Wang Q, et al. (2019). Genetic factors define CPO and CLO subtypes of nonsyndromic orofacial cleft. PLoS Genet 15, e1008357.
- 7. Birnbaum S, Ludwig KU, Reutter H, Herms S, de Assis NA, Diaz-Lacava A, Barth S, Lauster C, Schmidt G, Scheer M, et al. (2009). IRF6 gene variants in Central European patients with non-syndromic cleft lip with or without cleft palate. Eur J Oral Sci 117, 766–769.
- 8. Birnbaum S, Ludwig KU, Reutter H, Herms S, Steffens M, Rubini M, Baluardo C, Ferrian M, Almeida de Assis N, Alblas MA, et al. (2009). Key susceptibility locus for nonsyndromic cleft lip with or without cleft palate on chromosome 8q24. Nat Genet 41, 473–477.
- 9. Mangold E, Ludwig KU, Birnbaum S, Baluardo C, Ferrian M, Herms S, Reutter H, de Assis NA, Chawa TA, Mattheisen M, et al. (2010). Genome-wide association study identifies two susceptibility loci for nonsyndromic cleft lip with or without cleft palate. Nat Genet 42, 24–26.
- 10. Nikopensius T, Ambrozaityte L, Ludwig KU, Birnbaum S, Jagomagi T, Saag M, Matuleviciene A, Linkeviciene L, Herms S, Knapp M, et al. (2009). Replication of novel susceptibility locus for nonsyndromic cleft lip with or without cleft palate on chromosome 8q24 in Estonian and Lithuanian patients. Am J Med Genet A 149A, 2551–2553.
- 11. Bureau A, Parker MM, Ruczinski I, Taub MA, Marazita ML, Murray JC, Mangold E, Noethen MM, Ludwig KU, Hetmanski JB, et al. (2014). Whole exome sequencing of distant relatives in multiplex families implicates rare variants in candidate genes for oral clefts. Genetics 197, 1039–1044.
- 12. Fu J, Beaty TH, Scott AF, Hetmanski J, Parker MM, Wilson JE, Marazita ML, Mangold E, Albacha-Hejazi H, Murray JC, et al. (2017). Whole exome association of rare deletions in multiplex oral cleft families. Genet Epidemiol 41, 61–69.
- 13. Ludwig KU, Bohmer AC, Bowes J, Nikolic M, Ishorst N, Wyatt N, Hammond NL, Golz L, Thieme F, Barth S, et al. (2017). Imputation of orofacial clefting data identifies novel risk loci and sheds light on the genetic background of cleft lip +/− cleft palate and cleft palate only. Hum Mol Genet 26, 829–842.
- 14. Grosen D, Chevrier C, Skytthe A, Bille C, Molsted K, Sivertsen A, Murray JC, and Christensen K (2010). A cohort study of recurrence patterns among more than 54,000 relatives of oral cleft cases in Denmark: support for the multifactorial threshold model of inheritance. J Med Genet 47, 162–168.
- 15. Sivertsen A, Wilcox AJ, Skjaerven R, Vindenes HA, Abyholm F, Harville E, and Lie RT (2008). Familial risk of oral clefts by morphological type and severity: population based cohort study of first degree relatives. BMJ 336, 432–434.
- 16. Leslie EJ, Carlson JC, Shaffer JR, Buxo CJ, Castilla EE, Christensen K, Deleyiannis FWB, Field LL, Hecht JT, Moreno L, et al. (2017). Association studies of low-frequency coding variants in nonsyndromic cleft lip with or without cleft palate. Am J Med Genet A 173, 1531–1538.
- 17. Carlson JC, Taub MA, Feingold E, Beaty TH, Murray JC, Marazita ML, and Leslie EJ (2017). Identifying Genetic Sources of Phenotypic Heterogeneity in Orofacial Clefts by Targeted Sequencing. Birth Defects Res 109, 1030–1038.
- 18. Mitchell LE, and Risch N (1993). Correlates of genetic risk for non-syndromic cleft lip with or without cleft palate. Clin Genet 43, 255–260.
- 19. Leslie EJ, Liu H, Carlson JC, Shaffer JR, Feingold E, Wehby G, Laurie CA, Jain D, Laurie CC, Doheny KF, et al. (2016). A Genome-wide Association Study of Nonsyndromic Cleft Palate Identifies an Etiologic Missense Variant in GRHL3. Am J Hum Genet 98, 744–754.
- 20. Carlson JC, Standley J, Petrin A, Shaffer JR, Butali A, Buxo CJ, Castilla E, Christensen K, Deleyiannis FW, Hecht JT, et al. (2017). Identification of 16q21 as a modifier of nonsyndromic orofacial cleft phenotypes. Genet Epidemiol 41, 887–897.
- 21. Carlson JC, Anand D, Butali A, Buxo CJ, Christensen K, Deleyiannis F, Hecht JT, Moreno LM, Orioli IM, Padilla C, et al. (2019). A systematic genetic analysis and visualization of phenotypic heterogeneity among orofacial cleft GWAS signals. Genet Epidemiol 43, 704–716.
- 22. Gundlach KK, and Maus C (2006). Epidemiological studies on the frequency of clefts in Europe and world-wide. J Craniomaxillofac Surg 34 Suppl 2, 1–2.
- 23. Delaneau O, Zagury JF, and Marchini J (2013). Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods 10, 5–6.
- 24. Howie BN, Donnelly P, and Marchini J (2009). A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 5, e1000529.
- 25. Yang J, Lee SH, Goddard ME, and Visscher PM (2011). GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 88, 76–82.
- 26. Chen H, Wang C, Conomos MP, Stilp AM, Li Z, Sofer T, Szpiro AA, Chen W, Brehm JM, Celedon JC, et al. (2016). Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models. Am J Hum Genet 98, 653–666.
- 27. Gogarten SM, Sofer T, Chen H, Yu C, Brody JA, Thornton TA, Rice KM, and Conomos MP (2019). Genetic association testing using the GENESIS R/Bioconductor package. Bioinformatics 35, 5346–5348.
- 28. Bland JM, and Altman DG (2000). Statistics notes. The odds ratio. BMJ 320, 1468.
- 29. Lloyd-Jones LR, Robinson MR, Yang J, and Visscher PM (2018). Transformation of Summary Statistics from Linear Mixed Model Association on All-or-None Traits to Odds Ratio. Genetics 208, 1397–1408.
- 30. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, Boehnke M, Abecasis GR, and Willer CJ (2010). LocusZoom: regional visualization of genome-wide association scan results. Bioinformatics 26, 2336–2337.
- 31. Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, and Lee JJ (2015). Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7.
- 32. White JD, Indencleef K, Naqvi S, Eller RJ, Hoskens H, Roosenboom J, Lee MK, Li J, Mohammed J, Richmond S, et al. (2020). Insights into the genetic architecture of the human face. Nat Genet.
- 33. White JD, Indencleef K, Naqvi S, Eller RJ, Roosenboom J, Lee MK, Li J, Mohammed J, Richmond S, Quillen EE, et al. (2020). Insights into the genetic architecture of the human face. bioRxiv, 2020.2005.2012.090555.
- 34. Wang Y, Song F, Zhang B, Zhang L, Xu J, Kuang D, Li D, Choudhary MNK, Li Y, Hu M, et al. (2018). The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol 19, 151.
- 35. Wilderman A, VanOudenhove J, Kron J, Noonan JP, and Cotney J (2018). High-Resolution Epigenomic Atlas of Human Embryonic Craniofacial Development. Cell Rep 23, 1581–1597.
- 36. Geier CB, Piller A, Eibl MM, Ciznar P, Ilencikova D, and Wolf HM (2017). Terminal 14q32.33 deletion as a novel cause of agammaglobulinemia. Clin Immunol 183, 41–45.
- 37. Holder JL Jr., Lotze TE, Bacino C, and Cheung SW (2012). A child with an inherited 0.31 Mb microdeletion of chromosome 14q32.33: further delineation of a critical region for the 14q32 deletion syndrome. Am J Med Genet A 158A, 1962–1966.
- 38. Maurin ML, Brisset S, Le Lorc’h M, Poncet V, Trioche P, Aboura A, Labrune P, and Tachdjian G (2006). Terminal 14q32.33 deletion: genotype-phenotype correlation. Am J Med Genet A 140, 2324–2329.
- 39. Hladilkova E, Baroy T, Fannemel M, Vallova V, Misceo D, Bryn V, Slamova I, Prasilova S, Kuglik P, and Frengen E (2015). A recurrent deletion on chromosome 2q13 is associated with developmental delay and mild facial dysmorphisms. Mol Cytogenet 8, 57.
- 40. Tang LS, and Finnell RH (2003). Neural and orofacial defects in Folp1 knockout mice [corrected]. Birth Defects Res A Clin Mol Teratol 67, 209–218.
- 41. Peters H, Neubuser A, Kratochwil K, and Balling R (1998). Pax9-deficient mice lack pharyngeal pouch derivatives and teeth and exhibit craniofacial and limb abnormalities. Genes Dev 12, 2735–2747.
- 42. Wilm B, Dahl E, Peters H, Balling R, and Imai K (1998). Targeted disruption of Pax1 defines its null phenotype and proves haploinsufficiency. Proc Natl Acad Sci U S A 95, 8692–8697.
- 43. Takimoto A, Mohri H, Kokubu C, Hiraki Y, and Shukunami C (2013). Pax1 acts as a negative regulator of chondrocyte maturation. Exp Cell Res 319, 3128–3139.
- 44. Rodrigo I, Hill RE, Balling R, Munsterberg A, and Imai K (2003). Pax1 and Pax9 activate Bapx1 to induce chondrogenic differentiation in the sclerotome. Development 130, 473–482.
- 45. Sivakamasundari V, Kraus P, Sun W, Hu X, Lim SL, Prabhakar S, and Lufkin T (2017). A developmental transcriptomic analysis of Pax1 and Pax9 in embryonic intervertebral disc development. Biol Open 6, 187–199.
- 46. Lamb AN, Rosenfeld JA, Neill NJ, Talkowski ME, Blumenthal I, Girirajan S, Keelean-Fuller D, Fan Z, Pouncey J, Stevens C, et al. (2012). Haploinsufficiency of SOX5 at 12p12.1 is associated with developmental delays with prominent language delay, behavior problems, and mild dysmorphic features. Hum Mutat 33, 728–740.
- 47. Li YH, Yang J, Zhang JL, Liu JQ, Zheng Z, and Hu DH (2017). BMP4 rs17563 polymorphism and nonsyndromic cleft lip with or without cleft palate: A meta-analysis. Medicine (Baltimore) 96, e7676.
- 48. Saket M, Saliminejad K, Kamali K, Moghadam FA, Anvar NE, and Khorram Khorshid HR (2016). BMP2 and BMP4 variations and risk of non-syndromic cleft lip and palate. Arch Oral Biol 72, 134–137.
- 49. Hammond NL, Brookes KJ, and Dixon MJ (2018). Ectopic Hedgehog Signaling Causes Cleft Palate and Defective Osteogenesis. J Dent Res 97, 1485–1493.
- 50. Lipinski RJ, Song C, Sulik KK, Everson JL, Gipp JJ, Yan D, Bushman W, and Rowland IJ (2010). Cleft lip and palate results from Hedgehog signaling antagonism in the mouse: Phenotypic characterization and clinical implications. Birth Defects Res A Clin Mol Teratol 88, 232–240.
- 51. Dworkin S, Boglev Y, Owens H, and Goldie SJ (2016). The Role of Sonic Hedgehog in Craniofacial Patterning, Morphogenesis and Cranial Neural Crest Survival. J Dev Biol 4.
- 52. Paganini I, Sestini R, Capone GL, Putignano AL, Contini E, Giotti I, Gensini F, Marozza A, Barilaro A, Porfirio B, et al. (2017). A novel PAX1 null homozygous mutation in autosomal recessive otofaciocervical syndrome associated with severe combined immunodeficiency. Clin Genet 92, 664–668.
- 53. Pohl E, Aykut A, Beleggia F, Karaca E, Durmaz B, Keupp K, Arslan E, Palamar M, Yigit G, Ozkinay F, et al. (2013). A hypofunctional PAX1 mutation causes autosomal recessively inherited otofaciocervical syndrome. Hum Genet 132, 1311–1320.
- 54. Shaffer JR, Orlova E, Lee MK, Leslie EJ, Raffensperger ZD, Heike CL, Cunningham ML, Hecht JT, Kau CH, Nidey NL, et al. (2016). Genome-Wide Association Study Reveals Multiple Loci Influencing Normal Human Facial Morphology. PLoS Genet 12, e1006149.
- 55. Adhikari K, Fuentes-Guajardo M, Quinto-Sanchez M, Mendoza-Revilla J, Camilo Chacon-Duque J, Acuna-Alonzo V, Jaramillo C, Arias W, Lozano RB, Perez GM, et al. (2016). A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat Commun 7, 11616.
- 56. Cha S, Lim JE, Park AY, Do JH, Lee SW, Shin C, Cho NH, Kang JO, Nam JM, Kim JS, et al. (2018). Identification of five novel genetic loci related to facial morphology by genome-wide association studies. BMC Genomics 19, 481.
- 57. Leslie EJ, Taub MA, Liu H, Steinberg KM, Koboldt DC, Zhang Q, Carlson JC, Hetmanski JB, Wang H, Larson DE, et al. (2015). Identification of functional variants for cleft lip with or without cleft palate in or near PAX7, FGFR2, and NOG by targeted sequencing of GWAS loci. Am J Hum Genet 96, 397–411.
- 58. Hooper JE, Feng W, Li H, Leach SM, Phang T, Siska C, Jones KL, Spritz RA, Hunter LE, and Williams T (2017). Systems biology of facial development: contributions of ectoderm and mesenchyme. Dev Biol 426, 97–114.
- 59. Jiang R, Lan Y, Chapman HD, Shawber C, Norton CR, Serreze DV, Weinmaster G, and Gridley T (1998). Defects in limb, craniofacial, and thymic development in Jagged2 mutant mice. Genes Dev 12, 1046–1057.
- 60. Xu J, Krebs LT, and Gridley T (2010). Generation of mice with a conditional null allele of the Jagged2 gene. Genesis 48, 390–393.
Web Resources
- Online Mendelian Inheritance in Man (http://www.omim.org)
- GTOOL (http://www.well.ox.ac.uk/~cfreeman/software/gwas/gtool.html)