Epigenetics. 2019 Mar 1;14(2):109–117. doi: 10.1080/15592294.2019.1581592

Don’t brush off buccal data heterogeneity

Andrei L Turinsky a,b, Darci T Butcher c, Sanaa Choufani a, Rosanna Weksberg a,d,e,f,g, Michael Brudno a,b,h
PMCID: PMC6557604  PMID: 30821575

ABSTRACT

Buccal epithelial cells are among the most clinically accessible tissues and are increasingly being used to identify epigenetic disease patterns. However, substantial variation in buccal DNA methylation patterns indicates heterogeneity of cell types within and between samples, raising questions of data quality. We systematically estimated cell-type composition for a large collection of buccal and saliva samples from 11 published studies of DNA methylation. In these we identified numerous cases of buccal samples with questionable purity, which may be affected by sampling from individuals with neurodevelopmental disorders, and by the brushes used for sample collection. Further challenges arise in comparisons with tissues such as saliva, in which the buccal component varies widely. We propose a reference-based method of correcting for buccal purity that reduces unwanted variation while preserving cross-tissue differences. Our work demonstrates the wide variation of buccal quality in epigenetic studies and suggests a possible approach to overcome this issue.

KEYWORDS: Epigenetics, DNA methylation, buccal epithelial cells, tissue heterogeneity


Epigenetic disruptions play a key role in many genetic disorders [1–4]. The search for robust, reproducible epigenetic biomarkers associated with disease is often impeded by various factors: from biological covariates such as age, sex and ethnic makeup, to technical factors related to batch effect, tissue accessibility, cell type heterogeneity and sample purity [5–7]. Blood, saliva and buccal epithelial cells are among the most easily accessible tissues in a clinical setting. As a result, a large number of clinical epigenetic datasets for a broad range of diseases have been collected using these tissue sources. For example, neurodevelopmental diseases present a challenge as the primary affected tissue – the brain – is not easily accessible. Nevertheless, significant epigenetic patterns associated with disease states have been found in secondary tissues such as blood or buccal [8–11].

As whole blood is the most commonly used tissue in such epigenetic analyses, it is not surprising that the methodology for analysing epigenetic data for DNA derived from blood samples has been prioritized by the scientific community. In particular, it has become a standard practice to account for blood cell type heterogeneity when analysing epigenetic data derived from blood, with statistical methods and software packages developed to estimate the cell type composition of each blood sample [12–14]. Furthermore, studies detail the dangers of not accounting properly for blood cell type heterogeneity [6]. Although a number of cell type decomposition methods have been developed, both reference-based and reference-free [15], they are most commonly applied to epigenetic data for DNA extracted from blood. The prioritization of this tissue is also reflected in the available software packages.

Clinically accessible tissues other than blood, such as buccal tissue, have not received the same degree of scrutiny, even though the number of epigenetic studies using buccal data is rapidly growing [10,11,16–18]. The problem of estimating epithelial content in buccal tissue arose more than two decades ago [19] but it has not yet become a generally recognized concern in the epigenetic community. Buccal has been viewed until recently as a relatively homogeneous tissue, despite several reports of cell type variability in buccal tissues [20–23]. Practical challenges of buccal sample collection in a clinical setting introduce additional complications. For example, contamination by the person’s blood during the sample collection process, e.g., due to damage to the tissue by the brush used for collection, may become a significant contributor to the overall cell type composition of the collected sample and affect the epigenetic patterns found through statistical analysis.

In this study we demonstrate that high variation between buccal samples suggests a potential heterogeneity of the samples on the buccal-to-blood spectrum and raises a question of sample purity during data collection. We use publicly available data from several independent studies of genome-wide DNA methylation in buccal epithelial cells, blood and saliva, obtained from the NCBI Gene Expression Omnibus repository [24,25] (GEO accession IDs listed in Table 1). All datasets were generated on the Illumina HumanMethylation450 microarray platform. DNA methylation was represented as either beta-values or M-values [26]. Detailed descriptions of data extraction and pre-processing methods are provided in the Supplemental Text. We used two reference-based approaches to estimate cell-type composition of all samples [12]. First, following the methods already used in several studies [27,28] we extracted a reference DNA methylation (DNAm) profile for buccal tissue from Lowe et al. [29] and reference DNAm profiles for six blood cell types from Reinius et al. [13]. Each data sample was then represented as a combination of cell-specific profiles for buccal cells and six blood cell types: granulocytes, CD4+ T cells, CD14+ monocytes, CD19+ B cells, CD8+ T cells, CD56+ NK cells. This defined a system of constrained linear equations, one for every CpG. The seven unknown cell proportions were then estimated using R software based on CpGs associated with blood vs. buccal tissue differences [30]. Secondly, we also used a recently developed method EpiFibIC [20] and the R package EpiDISH [23], in which a generic epithelial DNAm profile is derived from a range of cell lines for estimating the epithelial content. EpiFibIC confirmed the high quality of the buccal samples from which we derived the DNAm reference profiles, such as the selected samples from Lowe et al. and Jones et al. datasets (see Supplemental Text). We then applied EpiFibIC to derive independent estimates of buccal composition in all datasets we examined.
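The constrained system described above can be illustrated with a small numerical sketch. This is not the authors' R implementation: it swaps in a projected-gradient least-squares solver written with NumPy, assumes the proportions are non-negative and sum to exactly one (a stricter constraint than the text states), and runs on synthetic reference profiles. The function names `project_to_simplex` and `estimate_proportions` are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                       # values sorted in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]   # last index satisfying the condition
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def estimate_proportions(reference, sample, n_iter=5000):
    """Constrained least-squares cell proportions for one sample.

    reference: (n_cpgs, n_cell_types) matrix of cell-specific DNAm profiles
    sample:    (n_cpgs,) vector of DNAm values for the mixed sample
    """
    n_types = reference.shape[1]
    w = np.full(n_types, 1.0 / n_types)               # start from equal proportions
    step = 1.0 / np.linalg.norm(reference, 2) ** 2    # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        gradient = reference.T @ (reference @ w - sample)
        w = project_to_simplex(w - step * gradient)
    return w

# Synthetic demonstration only (assumed values, not real methylation data)
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 1.0, size=(700, 7))      # 700 CpGs; buccal plus six blood cell types
true_w = np.array([0.80, 0.10, 0.03, 0.02, 0.02, 0.02, 0.01])
sample = reference @ true_w + rng.normal(0.0, 0.01, size=700)
estimated_w = estimate_proportions(reference, sample)
```

In practice the reference matrix would hold the published buccal and blood profiles restricted to the chosen CpG set, and the first entry of the estimate would be read as the buccal content of the sample.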

Table 1.

Sources of DNA methylation data used in the study.

Study | GEO Accession | Description | Buccal or Saliva Collection
Berko et al., 2014 | GSE50759 | 48 buccal autism spectrum disorder (ASD), 48 controls | Gentra Puregene Buccal cell kit
Jessen et al., 2018 | GSE94876 | 120 buccal samples related to tobacco usage | Mouth rinse and swishing
Jones et al., 2013 | GSE50586 | 10 buccal Down Syndrome samples, 10 buccal controls | Isohelix Buccal DNA isolation kit
Langie et al., 2016 | GSE73745 | 12 blood-saliva pairs: 6 controls and 6 respiratory allergy samples | Oragene saliva kit (DNA Genotek)
Lowe et al., 2013 | GSE46573 | 3 buccal control replicates | Gentra Puregene Buccal cell kit
Lussier et al., 2018 | GSE109042 | 24 buccal fetal alcohol spectrum disorder (FASD) subjects and 24 buccal controls, with additional replicates | Isohelix Buccal DNA isolation kit
Martino et al., 2013 | GSE42700 | 53 buccal samples from normal mono- and dizygotic twins | Catch-All Sample Collection Swabs
Portales-Casamar et al., 2016 | GSE80261 | 110 buccal fetal alcohol spectrum disorder (FASD) samples and 96 buccal controls | Isohelix Buccal DNA isolation kit
Reinius et al., 2012 | GSE35069 | Purified blood cell subtypes from 6 control subjects | Not applicable
Slieker et al., 2013 | GSE48472 | Matched buccal, blood and saliva from 5 control subjects | Buccal: Chloroform/isoamyl alcohol protocol. Saliva: Oragene saliva kit (DNA Genotek)
Smith et al., 2015 | GSE61653 | 64 matched blood-saliva pairs | Oragene saliva kit (DNA Genotek)
Souren et al., 2013 | GSE39560 | 34 saliva samples from 17 twin pairs discordant for birth weight | Oragene saliva kit (DNA Genotek)

Lower-purity buccal samples occur commonly

Using the seven cell-specific reference profiles, we derived the cell type composition for buccal datasets listed in Table 1. Most of the examined datasets contained a fraction of buccal samples in which buccal content was lower than in the rest of the buccal group (Figure 1), although in one of these studies [10] the presence of such lower-purity samples was previously recognized and adjusted for. To explore robustness, we defined the system of equations using several alternative sets of CpGs, each capturing a different aspect of tissue-specific variability in DNA methylation (see Supplemental Text): variation among all seven cell types involved (top 700/7000), variation only between buccal and blood (buccal 100/1000/full), variation only among the six blood cell subtypes (jaffe model), or all available CpGs. The lower-purity buccal samples were identified regardless of the CpG set used in the decomposition. After establishing the consistency of results across the CpG sets (also discussed below), we used the top 700 CpG set in subsequent analyses; this set reflects differential DNAm patterns among all seven cell types and follows an established approach [6]. The EpiFibIC results presented a similar pattern of buccal composition.

Figure 1.

Distribution of buccal composition in buccal DNAm samples from six datasets listed in Table 1. The buccal content was estimated using different subsets of CpGs, reflecting patterns of cell-specific DNAm variation among all seven available cell types (top 700/7000); in buccal vs. blood cells (buccal 100/1000/full); among six blood cell types only (jaffe model, extracted from Bioconductor package FlowSorted.Blood.450k); and using all available array probes (all CpGs). Independent estimates using the EpiFibIC/EpiDISH method are also shown. For each dataset the estimates were consistent across all methods used.

Intriguingly, there were no low-purity samples in the Martino et al. dataset of buccal samples from newborn and 18-month-old twins [31]. This study used a different swab type (Catch-All Sample Collection Swabs) with softer sponge-like brushes, possibly associated with less bleeding and thus less contamination. On the other hand, a recent dataset of 120 samples related to tobacco usage from Jessen et al. [32] was listed as originating from buccal cells, but showed an unusually low buccal content more typical of saliva datasets [28,33], as shown in Supplemental Figure S1. The authors reported that these buccal samples ‘were collected in water by vigorously swishing’ while the subjects rinsed their mouth, which may explain why the samples actually resemble saliva. PCA of these 120 samples together with other datasets listed in Table 1 also shows that the Jessen et al. data correspond much more closely to the saliva cluster than to the buccal cluster (Supplemental Figure S2).

Comparison of buccal, blood and saliva DNAm samples across multiple datasets revealed a persistent pattern in which the first principal component PC1 was associated with the differences between buccal and blood tissues (Figure 2(a); Supplemental Figure S2). Buccal samples were spread widely along PC1 forming a comet-like shape, with the comet’s ‘head’ at the opposite end from the cluster of blood samples, and its ‘tail’ pointing towards the blood cluster. PC1 was strongly associated with the estimated buccal content (Pearson correlation >95% for the examined datasets), which exhibited similar ‘tail’ patterns in its distribution boxplots (Figure 1). In contrast, samples collected from blood had much lower variability along PC1. Saliva samples varied widely along PC1 between the blood cluster and the buccal cluster, which corresponds well with the expectation that saliva consists of a mixture of leukocytes and epithelial cells.
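The link between PC1 and buccal content can be reproduced on simulated mixtures. The sketch below is an illustration under stated assumptions, not a re-analysis of the published data: the two profiles are random stand-ins for a pure buccal and a pure blood profile, the purity range and noise level are invented, and PCA is computed with NumPy's SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cpgs = 500
buccal_profile = rng.uniform(0.0, 1.0, n_cpgs)   # hypothetical pure buccal profile
blood_profile = rng.uniform(0.0, 1.0, n_cpgs)    # hypothetical pure blood profile

# 40 buccal samples of varying purity plus a tight cluster of 10 blood samples
buccal_fraction = np.concatenate([rng.uniform(0.6, 1.0, 40), np.zeros(10)])
samples = (np.outer(buccal_fraction, buccal_profile)
           + np.outer(1.0 - buccal_fraction, blood_profile)
           + rng.normal(0.0, 0.02, size=(50, n_cpgs)))

# PCA via SVD of the centred data matrix (rows are samples)
centred = samples - samples.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
pc1_scores = u[:, 0] * s[0]

# The sign of a principal component is arbitrary, so compare the absolute correlation
pc1_vs_purity = abs(np.corrcoef(pc1_scores, buccal_fraction)[0, 1])
```

Because every simulated sample lies on the line between the two profiles, the buccal samples spread along PC1 according to their purity while the blood samples stay in a compact cluster, which is the ‘comet’ geometry described above.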

Figure 2.

Comparison of Slieker et al. DNA methylation samples together with cell type reference profiles. (a) PCA of buccal (green), blood (red) and saliva (blue) samples from 5 subjects (numbers 1–5) in Slieker et al. data. Samples were analyzed together with buccal reference profile (green B) and six blood cell references: granulocytes (dark red G), CD14+ monocytes (M), CD56+ NK cells (N), CD19+ B cells (dark red B), CD4+ and CD8+ T cells (dark red C). (b) PCA after purity correction was applied to the five buccal samples. (c) Initial clustering of the data from 5 subjects (PT1-5) along with the cell-type profiles, using Pearson correlation metric and Ward clustering. Buccal and saliva samples were mixed in the same cluster. (d) After purity correction applied to the five buccal samples, the buccal group is now well separated from the other tissues.

Lower-purity buccal samples may be over-represented in disease datasets

The PCA pattern noted above was previously reported in Figure 1(c) from the study of Down syndrome (DS) DNAm data [10]. The authors noted that the buccal outliers were associated with the DS phenotype potentially reflecting blood contamination, and possibly due to an increased risk of periodontal disease in DS. Similarly, we observed a statistically significant drop (p = 0.047, Mann-Whitney U test) in buccal purity among ASD subjects compared to controls in the data of Berko et al. [11] (Supplemental Figure S3). The significance diminished somewhat (p = 0.068) after using linear regression analysis that accounted for sex and age of the subjects; this regression model estimated an average drop of 4.4% in buccal content in ASD. However, the statistical significance was lost (p = 0.179 for an average 3.5% drop in buccal content) when the subjects’ estimated ancestry as the percent of European (CEU) and of African (YRI) alleles genome-wide were also added as confounders.
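A covariate-adjusted comparison of this kind can be sketched with ordinary least squares. The example is illustrative only: the data are simulated, the effect size of -0.04 is an assumed value chosen for the demonstration, and the helper name `adjusted_case_effect` is hypothetical. It returns only the adjusted coefficient; p-values such as those reported above would come from a statistics package.

```python
import numpy as np

def adjusted_case_effect(buccal_content, is_case, age, is_male):
    """OLS coefficient of case status on buccal content, adjusted for age and sex."""
    design = np.column_stack([np.ones(len(buccal_content)), is_case, age, is_male])
    coefficients, *_ = np.linalg.lstsq(design, buccal_content, rcond=None)
    return coefficients[1]   # column 1 of the design matrix is case status

# Simulated cohort (assumed values, not taken from any of the published datasets)
rng = np.random.default_rng(2)
n = 200
is_case = rng.integers(0, 2, n).astype(float)
is_male = rng.integers(0, 2, n).astype(float)
age = rng.uniform(3.0, 18.0, n)
simulated_drop = -0.04                                  # assumed drop in buccal content for cases
buccal_content = (0.90 + simulated_drop * is_case
                  + 0.001 * age + rng.normal(0.0, 0.02, n))

effect = adjusted_case_effect(buccal_content, is_case, age, is_male)
```

Additional confounders, such as the ancestry proportions mentioned above, would enter as further columns of the design matrix.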

There was also an average drop of 2.8% buccal content for cases of fetal alcohol spectrum disorder (FASD) in the Portales-Casamar et al. dataset [17], which was statistically significant (p = 0.028) after accounting for age and sex confounders. In this case we also observed an even more significant drop of 4.0% buccal content in males (p = 0.0019). However, in a smaller dataset from a follow-up study [16] the effect of the FASD status was much less pronounced (a drop of 0.14% buccal content) and not significant (p = 0.97), nor was the sex effect present. It may be worth noting that the follow-up dataset was found to be of lower quality than the data from the original study during data pre-processing (see Supplemental Text).

Taken collectively these results suggest that, although the magnitude of the disease-associated drop in buccal content varied among the four examined datasets, the pattern of lower buccal purity was often present in neurodevelopmental disease samples. In contrast, neither the subject’s age nor sex nor ethnicity (where available) showed any significant association with the change in buccal composition, except for one case noted above (males in the Portales-Casamar et al. data). Our analysis supports previous observations that the challenges of buccal tissue collection in patients with neurodevelopmental conditions may cause systematic differences in buccal purity between disease and control groups, which may inflate apparent differences in DNAm between the groups. This suggests that buccal purity may be an important confounder that should be routinely assessed and corrected for in epigenomic studies of disease.

Correction for buccal impurity

We developed a modified method of correction for observed buccal impurities. This approach removes the variance due to data impurity and preserves cross-tissue differences based on biologically meaningful reference profiles, which makes the resulting data better suited for follow-up analysis. Our method is based on the original approach from Jones et al. [10]. These authors observed that for a combined set of buccal and blood samples, the first principal component (PC1) is associated with cell type differences and may be used as an indicator of buccal impurity. The authors then subtracted the effects of the buccal-blood PC1 from their buccal data, which eliminated the variance due to tissue contamination while preserving other differences between buccal samples.

Our new method proposes two improvements to this approach. First, we apply the principal component analysis (PCA) not to the dataset that is being analyzed but to the seven reference profiles of buccal and blood cell types, which defines the same PCA space for all subsequent analysis. Other datasets are then mapped onto the same coordinate space using the formulas for each principal component as a weighted combination of CpGs. This modification makes our approach independent of the dataset under investigation.
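A minimal sketch of this first modification follows, assuming NumPy and random stand-in data in place of the seven published reference profiles. PCA is fitted once on the references, and any other dataset is then mapped into the same space through the per-component CpG weights. The names `fit_reference_pca` and `project` are hypothetical.

```python
import numpy as np

def fit_reference_pca(reference_profiles):
    """Fit PCA on reference profiles of shape (n_cell_types, n_cpgs).

    Returns the mean profile and the components; each row of the components
    holds the CpG weights that define one principal component.
    """
    mean = reference_profiles.mean(axis=0)
    _, _, vt = np.linalg.svd(reference_profiles - mean, full_matrices=False)
    return mean, vt

def project(samples, mean, components):
    """Map samples of shape (n_samples, n_cpgs) onto the reference PCA space."""
    return (samples - mean) @ components.T

# Synthetic stand-ins for the seven reference profiles and for a new dataset
rng = np.random.default_rng(3)
references = rng.uniform(0.0, 1.0, size=(7, 1000))
mean, components = fit_reference_pca(references)

reference_scores = project(references, mean, components)
new_dataset = rng.uniform(0.0, 1.0, size=(20, 1000))
new_scores = project(new_dataset, mean, components)
```

With seven centred references at most six components carry variance, so the last row of the returned components can be ignored. Because the mean and the weights depend only on the references, scores from different studies share one coordinate system.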

Secondly, in the original method [10] each data sample $x_i$ was represented using PCA as $x_i = \bar{x} + \sum_j a_{ij} v_j$, where $\bar{x}$ is the mean profile, $v_j$ is the $j$-th principal component and $a_{ij}$ is the coefficient of sample $i$ on that component. The effects of buccal-blood PC1 were subtracted by effectively setting the corresponding coefficient $a_{i1}$ to zero. Instead we propose to set $a_{i1}$ to a value derived from the buccal reference, so that all buccal samples are moved to the same PC1 level as the buccal DNAm reference profile. This modification not only eliminates the variance due to tissue contamination from buccal samples, but also preserves the inter-tissue differences among buccal, blood and saliva samples, allowing meaningful comparisons of data cohorts across different tissues.
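The second modification amounts to shifting each buccal sample along PC1 until its PC1 coefficient matches that of the buccal reference. The sketch below is one way to write this with NumPy under stated assumptions: the reference profiles are random stand-ins, row 0 plays the role of the buccal reference, and the function name `correct_buccal_purity` is hypothetical. With random stand-ins PC1 is not a true buccal-versus-blood axis, so the example only demonstrates the arithmetic.

```python
import numpy as np

def correct_buccal_purity(buccal_samples, mean, pc1, buccal_reference):
    """Move each buccal sample along PC1 to the PC1 level of the buccal reference.

    buccal_samples:   (n_samples, n_cpgs) data to correct
    mean, pc1:        mean profile and unit-length PC1 weights of the reference PCA
    buccal_reference: (n_cpgs,) buccal reference profile
    """
    sample_scores = (buccal_samples - mean) @ pc1     # current PC1 coefficient of each sample
    target_score = (buccal_reference - mean) @ pc1    # PC1 coefficient of the buccal reference
    return buccal_samples + np.outer(target_score - sample_scores, pc1)

# Synthetic demonstration (assumed data; row 0 stands in for the buccal reference)
rng = np.random.default_rng(4)
references = rng.uniform(0.0, 1.0, size=(7, 1000))
mean = references.mean(axis=0)
_, _, vt = np.linalg.svd(references - mean, full_matrices=False)
pc1 = vt[0]

purity = rng.uniform(0.6, 1.0, size=15)
contaminated = (np.outer(purity, references[0])
                + np.outer(1.0 - purity, references[1]))
corrected = correct_buccal_purity(contaminated, mean, pc1, references[0])
```

Only the PC1 coefficient changes; the coefficients on all other components are left as they were, which is why differences between buccal samples that are unrelated to purity survive the correction.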

Our modified corrective procedure not only substantially reduces the unwanted variation among buccal samples, but also improves the separation among the tissues (Figure 2). The original correction method of removing the PC1 term from the PCA decomposition would have a similar effect of ‘flattening’ the buccal cluster. However, such purity-corrected data could not be used in cross-tissue comparisons: as can be clearly seen in Figure 2(a) and Supplemental Figure S2, setting PC1 = 0 for all buccal data samples in the presence of other tissues would re-position the flattened buccal cluster too close to the saliva and blood clusters. Instead, our modified correction method uses a buccal DNAm reference to anchor all buccal samples at a meaningful, biologically justified distance from other tissues. Another option could be to set PC1 to a value derived from the dataset itself, such as the median PC1 over the buccal cluster. However, unlike with the reference-based approach, such values would depend on the buccal purity of each dataset, which may vary substantially across different studies (Figure 1), making the results from different data sources harder to compare.

Similarly, when applied to the DS dataset from Jones et al., our purification approach not only moved the 5 ‘compromised’ buccal outliers towards the main buccal group (as was also done in the original study), but also enhanced the separation between the buccal group and blood cell types. The increased separation between buccal and blood groups is also evident from the hierarchical clustering of the data (Supplemental Figure S4).

Another potential benefit of purity correction can be seen in the improved separation between different sample groups within the buccal cluster. For example, when the correction was applied to the Berko et al. ASD dataset [11], it led to a statistically significant improvement in the separation between different buccal groups: the purified data fall into two clear clusters, one of which is dominated by ASD samples and the other one by controls (Supplemental Figure S5). The improvement was confirmed by the regression analysis of differentially methylated loci between ASD and controls, which found a much larger number of significantly different CpGs (p-value ≤ 5% after Benjamini-Hochberg adjustment for FDR) after the purity correction. On the other hand, purity correction also led to a smaller magnitude of DNAm differences between the ASD and control groups after the variation presumed to be tissue-specific was removed (Supplemental Figure S6 A-B). A similar combination of higher statistical significance and lower effect size was observed in the DS dataset from Jones et al. (Supplemental Figure S6 C-D). Furthermore, applying the independent method EpiFibIC to estimate the buccal proportion in samples before and after correction validated the substantial improvement in their purity (Supplemental Figure S7).
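For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, a minimal pure-Python sketch is given here. It implements the standard step-up procedure and is not taken from the authors' code; the function name and the example p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0                                    # also caps adjusted values at 1
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Illustrative p-values for four CpGs
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [q <= 0.05 for q in example_adjusted]
```

A CpG would be called differentially methylated when its adjusted value is at or below the chosen threshold, here 5%.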

Patterns of cell type composition are robust

We examined how using an alternative buccal DNAm reference profile affected the estimated cell type proportions (see Supplemental Text). The results show a high degree of correlation between estimates obtained using different buccal reference profiles (Pearson correlation of 99.995% between buccal content estimates) and the presence of lower-purity buccal samples was consistently detected in each case. The alternative buccal reference based on the Jones et al. dataset gave slightly higher estimates of buccal percentage, especially for high-purity buccal samples (Supplemental Figure S8a). However, samples with lower buccal content remained at the low end of the spectrum regardless of the reference used, suggesting that their presence is a real phenomenon and not a technical artefact of the analysis. We also note that this remarkable degree of correlation was achieved despite clear differences between the two buccal reference profiles (Supplemental Figure S8b).

The effect of choosing a specific CpG set to estimate buccal content was more pronounced, likely because these sets reflected different tissue-specific DNAm patterns, but low-purity outliers were clearly present regardless of the alternative CpG set used (Figure 1, Supplemental Figure S1). For example, the estimated buccal content was higher when we used smaller sets of CpGs associated with variability between buccal and blood cell types (top 700 or buccal 100), and lower when using all available CpGs. The independent estimates from the EpiFibIC method were rather similar, even though the EpiFibIC epithelial DNAm reference was derived from non-buccal epithelial cell lines. Interestingly, buccal composition estimated using the jaffe model CpG set [6] was comparable to other methods, even though these 600 CpGs were derived from only the six blood cell subtypes without using the buccal reference. Nevertheless, the pattern of differences and the relative ranking of the buccal samples by purity remained consistent for all choices of CpGs used (Figure 1). Similar to buccal, the estimates of cell type proportions in saliva samples varied somewhat depending on the CpG set but showed consistent overall patterns (Supplemental Figure S1).

Comparison to other data correction approaches

The main benefit of the purity correction procedure for buccal datasets is to remove the unwanted influence of a suspected blood contamination. Another approach to achieve similar goals may be standard regression analysis, in which the estimated buccal purity is modelled explicitly as one of the confounding factors. However, regression modelling does not directly generate a set of corrected data from which the influence of a confounder has been removed. In this regard our purity-correction approach may be compared to using batch-correction methods such as ComBat and others [34,35]: while it is possible to add batch variables as additional confounders into a regression model, it is often beneficial to generate a batch-corrected version of the data, which can be used in further analysis steps (including regression analysis).

Furthermore, the regression modelling approach derives the influences of each confounder from the dataset being analyzed, thus the model can be swayed by the presence of lower-purity samples within the buccal group affecting the overall estimates. Although methods of robust statistics and robust regression analysis may be recruited to handle a few outliers, this opportunity may be thwarted if low-purity samples appear in relatively large numbers, e.g., due to the systematic difficulties of buccal tissue collection within a study. In contrast, our modified correction method uses the same stable, externally derived buccal DNAm reference to anchor all buccal datasets in the same coordinate system and is thus unaffected by the presence or by the number of low-purity outliers within each dataset.

Computational approaches such as ours are complementary to experimental studies that quantify the complexity of saliva and buccal cellular content [22]. As the number of epigenetic studies using these tissues is growing, we provide systematic evidence to help the research community appreciate the extent of potential buccal heterogeneity in methylome-wide studies, just as it has become common to do so with respect to blood heterogeneity. As such, our modified method of purity correction, as well as future modifications of this method, may become one of the readily available tools in a researcher’s arsenal.

Funding Statement

This research was supported by the Canadian Institutes of Health Research (MOP-126054 and MOP-287680) and the Ontario Brain Institute’s Province of Ontario Brain Neurodevelopment Network (IDS 11 02). Bioinformatics analyses were supported in part by Genome Canada through Ontario Genomics; and by the Canadian Centre for Computational Genomics (C3G), part of the Genome Technology Platform (GTP) funded by Genome Canada through Genome Quebec and Ontario Genomics.

Disclosure statement

No potential conflict of interest was reported by the authors.

Supplementary material

Supplemental data for this article can be accessed online.

References

  • [1] Egger G, Liang G, Aparicio A, et al. Epigenetics in human disease and prospects for epigenetic therapy. Nature. 2004 May 27;429(6990):457–463. PubMed PMID: 15164071.
  • [2] Esteller M, Herman JG. Cancer as an epigenetic disease: DNA methylation and chromatin alterations in human tumours. J Pathol. 2002 January;196(1):1–7. doi: 10.1002/path.1024. PubMed PMID: 11748635.
  • [3] Martin DIK, Ward R, Suter CM. Germline epimutation: A basis for epigenetic disease in humans. Ann N Y Acad Sci. 2005;1054:68–77. PubMed PMID: 16339653.
  • [4] Lappalainen T, Greally JM. Associating cellular epigenetic models with human phenotypes. Nat Rev Genet. 2017 July;18(7):441–451. PubMed PMID: 28555657.
  • [5] Harper KN, Peters BA, Gamble MV. Batch effects and pathway analysis: two potential perils in cancer studies involving DNA methylation array analysis. Cancer Epidemiol Biomarkers Prev. 2013 June;22(6):1052–1060. PubMed PMID: 23629520; PubMed Central PMCID: PMC3687782.
  • [6] Jaffe AE, Irizarry RA. Accounting for cellular heterogeneity is critical in epigenome-wide association studies. Genome Biol. 2014 February 4;15(2):R31. PubMed PMID: 24495553; PubMed Central PMCID: PMC4053810.
  • [7] Davies MN, Volta M, Pidsley R, et al. Functional annotation of the human brain methylome identifies tissue-specific epigenetic variation across brain and blood. Genome Biol. 2012 June 15;13(6):R43. PubMed PMID: 22703893; PubMed Central PMCID: PMC3446315.
  • [8] Butcher DT, Cytrynbaum C, Turinsky AL, et al. CHARGE and kabuki syndromes: gene-specific DNA methylation signatures identify epigenetic mechanisms linking these clinically overlapping conditions. Am J Hum Genet. 2017 May 4;100(5):773–788. PubMed PMID: 28475860; PubMed Central PMCID: PMC5420353.
  • [9] Choufani S, Cytrynbaum C, Chung BH, et al. NSD1 mutations generate a genome-wide DNA methylation signature. Nat Commun. 2015 December 22;6:10207. PubMed PMID: 26690673; PubMed Central PMCID: PMC4703864.
  • [10] Jones MJ, Farre P, McEwen LM, et al. Distinct DNA methylation patterns of cognitive impairment and trisomy 21 in down syndrome. BMC Med Genomics. 2013 December 27;6:58. PubMed PMID: 24373378; PubMed Central PMCID: PMC3879645.
  • [11] Berko ER, Suzuki M, Beren F, et al. Mosaic epigenetic dysregulation of ectodermal cells in autism spectrum disorder. PLoS Genet. 2014;10(5):e1004402. PubMed PMID: 24875834; PubMed Central PMCID: PMC4038484.
  • [12] Houseman EA, Accomando WP, Koestler DC, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics. 2012 May 8;13:86. PubMed PMID: 22568884; PubMed Central PMCID: PMC3532182.
  • [13] Reinius LE, Acevedo N, Joerink M, et al. Differential DNA methylation in purified human blood cells: implications for cell lineage and studies on disease susceptibility. PloS one. 2012;7(7):e41361. PubMed PMID: 22848472; PubMed Central PMCID: PMC3405143.
  • [14] Koestler DC, Christensen B, Karagas MR, et al. Blood-based profiles of DNA methylation predict the underlying distribution of cell types: a validation analysis. Epigenetics. 2013 August;8(8):816–826. PubMed PMID: 23903776; PubMed Central PMCID: PMC3883785.
  • [15] McGregor K, Bernatsky S, Colmegna I, et al. An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies. Genome Biol. 2016 May 3;17:84. PubMed PMID: 27142380; PubMed Central PMCID: PMC4855979.
  • [16] Lussier AA, Morin AM, MacIsaac JL, et al. DNA methylation as a predictor of fetal alcohol spectrum disorder. Clin Epigenetics. 2018;10:5. PubMed PMID: 29344313; PubMed Central PMCID: PMC5767049.
  • [17] Portales-Casamar E, Lussier AA, Jones MJ, et al. DNA methylation signature of human fetal alcohol spectrum disorder. Epigenetics Chromatin. 2016;9:25. PubMed PMID: 27358653; PubMed Central PMCID: PMC4926300.
  • [18] Teschendorff AE, Yang Z, Wong A, et al. Correlation of smoking-associated DNA methylation changes in Buccal cells with DNA methylation changes in epithelial cancer. JAMA Oncol. 2015 July;1(4):476–485. PubMed PMID: 26181258.
  • [19] Endler G, Greinix H, Winkler K, et al. Genetic fingerprinting in mouthwashes of patients after allogeneic bone marrow transplantation. Bone Marrow Transplant. 1999 July;24(1):95–98. PubMed PMID: 10435742.
  • [20] Zheng SC, Webster AP, Dong D, et al. A novel cell-type deconvolution algorithm reveals substantial contamination by immune cells in saliva, buccal and cervix. Epigenomics. 2018 July;10(7):925–940. PubMed PMID: 29693419.
  • [21] Eipel M, Mayer F, Arent T, et al. Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures. Aging (Albany NY). 2016 May;8(5):1034–1048. PubMed PMID: 27249102; PubMed Central PMCID: PMC4931852.
  • [22] Theda C, Hwang SH, Czajko A, et al. Quantitation of the cellular content of saliva and buccal swab samples. Sci Rep. 2018 May 2;8(1):6944. PubMed PMID: 29720614; PubMed Central PMCID: PMC5932057.
  • [23] Teschendorff AE, Breeze CE, Zheng SC, et al. A comparison of reference-based algorithms for correcting cell-type heterogeneity in epigenome-wide association studies. BMC Bioinformatics. 2017 February 13;18(1):105. PubMed PMID: 28193155; PubMed Central PMCID: PMC5307731.
  • [24] Edgar R, Domrachev M, Lash AE. Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002 January 1;30(1):207–210. PubMed PMID: 11752295; PubMed Central PMCID: PMC99122.
  • [25] Barrett T, Wilhite SE, Ledoux P, et al. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013 January;41(Database issue):D991–5. PubMed PMID: 23193258; PubMed Central PMCID: PMC3531084.
  • [26] Du P, Zhang X, Huang CC, et al. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis. BMC Bioinformatics. 2010;11:587. PubMed PMID: 21118553; PubMed Central PMCID: PMC3012676.
  • [27] Langie SA, Szarc Vel Szic K, Declerck K, et al. Whole-genome saliva and blood DNA methylation profiling in individuals with a respiratory allergy. PloS one. 2016;11(3):e0151109. PubMed PMID: 26999364; PubMed Central PMCID: PMC4801358.
  • [28] Smith AK, Kilaru V, Klengel T, et al. DNA extracted from saliva for methylation studies of psychiatric traits: evidence tissue specificity and relatedness to brain. Am J Med Genet B Neuropsychiatr Genet. 2015 January;168B(1):36–44. PubMed PMID: 25355443; PubMed Central PMCID: PMC4610814.
  • [29] Lowe R, Gemma C, Beyan H, et al. Buccals are likely to be a more informative surrogate tissue than blood for epigenome-wide association studies. Epigenetics. 2013 April;8(4):445–454. PubMed PMID: 23538714; PubMed Central PMCID: PMC3674053.
  • [30] Slieker RC, Bos SD, Goeman JJ, et al. Identification and systematic annotation of tissue-specific differentially methylated regions using the Illumina 450k array. Epigenetics Chromatin. 2013 August 6;6(1):26. PubMed PMID: 23919675; PubMed Central PMCID: PMC3750594.
  • [31].Martino D, Loke YJ, Gordon L, et al. Longitudinal, genome-scale analysis of DNA methylation in twins from birth to 18 months of age reveals rapid epigenetic change in early life and pair-specific effects of discordance. Genome Biol. 2013. May 22;14(5):R42 PubMed PMID: 23697701; PubMed Central PMCID: PMCPMC4054827. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [32]. Jessen WJ, Borgerding MF, Prasad GL. Global methylation profiles in buccal cells of long-term smokers and moist snuff consumers. Biomarkers. 2018. July 3:1–15. DOI: 10.1080/1354750X.2018.1466367. PubMed PMID: 29771158.
  • [33]. Souren NY, Lutsik P, Gasparoni G, et al. Adult monozygotic twins discordant for intra-uterine growth have indistinguishable genome-wide DNA methylation profiles. Genome Biol. 2013. May 26;14(5):R44. PubMed PMID: 23706164; PubMed Central PMCID: PMC4054831.
  • [34]. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007. January;8(1):118–127. PubMed PMID: 16632515.
  • [35]. Chen C, Grennan K, Badner J, et al. Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods. PLoS One. 2011. February 28;6(2):e17238. PubMed PMID: 21386892; PubMed Central PMCID: PMC3046121.

Supplementary Materials

Supplemental Material

Articles from Epigenetics are provided here courtesy of Taylor & Francis