Abstract
Atherosclerosis involves phenotypic modulation and transdifferentiation of vascular smooth muscle cells (SMCs). Data are given in tabular or figure format that illustrate genome-wide DNA methylation alterations in atherosclerotic vs. control aorta (athero DMRs). Data based upon publicly available chromatin state profiles are also shown for normal aorta, monocyte, and skeletal muscle tissue-specific DMRs and for aorta-specific chromatin features (enhancer chromatin, promoter chromatin, repressed chromatin, actively transcribed chromatin). Athero hypomethylated and hypermethylated DMRs as well as epigenetic and transcription profiles are described for the following genes: ACTA2, MYH10, MYH11 (SMC-associated genes); SMAD3 (a signaling gene for SMCs and other cell types); CD79B and SH3BP2 (leukocyte-associated genes); and TBX20 and genes in the HOXA, HOXB, HOXC, and HOXD clusters (T-box and homeobox developmental genes). The data reveal strong correlations between athero hypermethylated DMRs and regions of enhancer chromatin in aorta, which are discussed in the linked research article “Atherosclerosis-associated differentially methylated regions can reflect the disease phenotype and are often at enhancers” (M. Lacey et al., 2019).
Keywords: Atherosclerosis, DNA methylation, Smooth muscle, Monocytes, Enhancers, Differentially methylated regions (DMRs)
Abbreviations: DMR, differentially methylated region; athero hypermeth DMR, atherosclerosis-associated hypermethylated DMR; athero hypometh DMR, atherosclerosis-associated hypomethylated DMR; GO, gene ontology; PMD, percent methylation difference; ctl, control; mod, moderate; SkM, skeletal muscle; PBMC, peripheral blood mononuclear cells; enh, enhancer chromatin; prom, promoter chromatin; repr, repressed; txn chromatin, chromatin with the histone marks of actively transcribed chromatin
Subject area | Biology |
More specific subject area | Molecular biology, gene regulation, epigenetics |
Type of data | Figures, tables, text file |
How data was acquired | Downloaded publicly available data from the UCSC Genome Browser |
Data format | filtered, analyzed |
Experimental factors | Bisulfite-seq profiles for atherosclerotic and control aorta were analyzed for differentially methylated regions (DMRs) |
Experimental features | Original data obtained from UCSC genome browser and various hubs |
Data source location | Data were collected and the DMRs determined at Tulane University, New Orleans, LA, USA (Latitude 29.951065, Longitude -90.071533) |
Data accessibility | Data provided in this article and as Supplementary Tables |
Related research article | The associated research article is available [1]: M. Lacey et al., Atherosclerosis-associated differentially methylated regions can reflect the disease phenotype and are often at enhancers, Atherosclerosis, 280 (1) (2019) 183-191. https://doi.org/10.1016/j.atherosclerosis.2018.11.031 |
Value of the data
1. Data
Tables 1–5 (included as Supplementary files) show statistically significant atherosclerosis-associated DMRs (athero DMRs), their over-representation among enhancers and super-enhancers, the functional associations of their linked genes based on gene ontology (GO) analysis, a literature survey of the involvement of DMR-linked genes in atherosclerosis, and RNA-seq data for the tissue-specific expression of these genes. In addition, data are presented for specific examples of genes in which athero DMRs only partially overlap leukocyte-associated DMRs (Fig. 1, Fig. 2) or overlap aorta-related enhancer and super-enhancer chromatin in genes important for different aspects of proper smooth muscle cell function (Fig. 3, Fig. 4, Fig. 5, Fig. 6). Lastly, the very strong relationship of athero DMRs to the developmental transcription factor-encoding HOX and TBX genes is illustrated (Fig. 7, Fig. 8, Fig. 9, Fig. 10, Fig. 11).
2. Experimental design, materials and methods
2.1. Bioinformatics
For the atherosclerotic and control aorta samples from the same individual (88 yo female; athero aorta A, aortic arch, and control aorta A, thoracic aorta, respectively), the whole genome bisulfite sequencing (bisulfite-seq) data from Zaina et al. [2] were used. In addition, we used bisulfite-seq profiles from two further control aorta samples from the Roadmap Epigenetics Project [3], [4]: control aorta B, from a 34 yo male, and control aorta C, from a 30 yo female. The subsection of aorta for control aorta C is not known, but control aorta B, which was used for the bisulfite-seq profiles and for the analyzed Roadmap histone modification and chromatin segmentation profiles, was intra-abdominal aorta from below the renal arteries and before the iliac bifurcation. The Roadmap databases, including the bisulfite-seq methylomes and chromatin state segmentation (chromHMM, AuxilliaryHMM) profiles, are available at hubs for the UCSC Genome Browser hg19 [5] and are as previously described [6]. Chromatin state segmentation is based upon histone modifications (histone H3 lysine-4 mono- or trimethylation; H3 lysine-27 acetylation or trimethylation; H3 lysine-36 trimethylation; and H3 lysine-9 trimethylation). For the DNA methylation analysis, we found similar coverage in the DNA-seq analysis of the bisulfite-treated matched atherosclerotic and control samples. Note that the Roadmap sample labeled “macrophage” in the bisulfite-seq track at the UCSC Genome Browser [5] is actually primary CD14+ monocytes from blood, like the sample for the corresponding chromatin segmentation track [3]. The skeletal muscle (SkM) sample for bisulfite-seq and chromatin state was psoas muscle. The color code for the 18-state chromatin state segmentation was slightly simplified from the original [3].
Quantification of RNA-seq for tissues was from the GTEx database, using transcripts per million (TPM) values (Table 5) from more than 100 samples for each tissue type [7]; the aorta tissue used for these RNA-seq analyses was from the thoracic region. Functional associations of DMRs were determined with the Genomic Regions Enrichment of Annotations Tool, GREAT [8], and associations of the genes themselves with the Database for Annotation, Visualization and Integrated Discovery, DAVID [9]. Aorta super-enhancers were obtained from the dbSUPER database [10].
2.2. Determination of athero and tissue-specific DMRs
Bisulfite-seq data from the athero aorta A and control aorta A samples were initially merged and analyzed on a site-by-site basis by applying Fisher's exact test to the counts of methylated and unmethylated CpG reads in each sample to produce site-specific p-values p_i. Based on these results, candidate DMRs were then identified by determining the joint probability of a sequence of five or more consecutive p-values (p_i, p_{i+1}, …, p_{i+k}) according to the Uniform Product (UP) distribution as described in our previous study [11], where each candidate DMR was required to begin and end with a statistically significant site. This analysis identified regions that were statistically significant at the 0.05 level, and these were subsequently filtered to include only those regions with an average percent methylation difference (PMD) of at least 20% in either direction, a length greater than 250 bp, and no gaps >200 bp between consecutive sites. Next, these samples were merged with control aorta samples B and C, using custom scripts to correct for single-bp shifts due to variable pre-processing routines. For all sites present in all four samples, logistic regression models were fit to the counts of methylated and unmethylated reads at each CpG site to determine the statistical significance of the difference in percent methylation between athero aorta A and the three aorta controls. The associated p-values for the comparison of athero aorta vs. the three controls were then analyzed using our UP method to identify candidate DMRs, which were subsequently filtered as above for length, PMD, and gaps. Our final set of athero-associated DMRs consisted of those DMRs identified both in the comparison of athero A vs. control aorta A and in the more general comparison of athero A vs. control aortas A, B, and C. PMD values are reported based on the differences observed for athero A vs. control aorta A.
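The site-level test, the region-level joint probability, and the filtering criteria described above can be illustrated with a short sketch. This is not the authors' code (their analysis used custom R scripts); it is a minimal Python illustration under stated assumptions. In particular, it assumes that the UP distribution is the distribution of the product of independent Uniform(0,1) variables, that a gap is the coordinate difference between consecutive CpG sites, and that region length is the span from the first to the last site. All function names are hypothetical.

```python
from math import comb, exp, factorial, log


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are samples (e.g. athero, control); columns are methylated and
    unmethylated read counts at one CpG site.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x methylated reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def uniform_product_cdf(pvalues):
    """P(product of n independent Uniform(0,1) values <= observed product).

    Assumption: this closed form is used as the joint probability of a run
    of site-level p-values; -ln(product) follows a Gamma(n, 1) distribution.
    """
    n = len(pvalues)
    t = -sum(log(max(p, 1e-300)) for p in pvalues)  # clamp to avoid log(0)
    return exp(-t) * sum(t ** k / factorial(k) for k in range(n))


def is_candidate(pvalues, alpha=0.05, min_sites=5):
    """At least five sites, significant first and last site, significant run."""
    return (
        len(pvalues) >= min_sites
        and pvalues[0] < alpha
        and pvalues[-1] < alpha
        and uniform_product_cdf(pvalues) < alpha
    )


def passes_filters(positions, pmd_values, min_pmd=20.0, min_length=250, max_gap=200):
    """Apply the PMD, length, and gap filters to one candidate region.

    positions: sorted CpG coordinates; pmd_values: per-site percent
    methylation differences (athero minus control).
    """
    length = positions[-1] - positions[0] + 1
    mean_pmd = sum(pmd_values) / len(pmd_values)
    gaps_ok = all(q - p <= max_gap for p, q in zip(positions, positions[1:]))
    return abs(mean_pmd) >= min_pmd and length > min_length and gaps_ok
```

For example, `fisher_exact_two_sided(8, 2, 1, 5)` compares a site with 8 methylated and 2 unmethylated reads in one sample against 1 methylated and 5 unmethylated reads in the other. Exact integer arithmetic keeps the sketch dependency-free, at the cost of speed at genome scale.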
To determine tissue-specific effects among the selected Roadmap samples (aorta, left ventricle, SkM, lung, adipose, and monocyte), “one-to-many” comparisons were run in which logistic regression models were fit to determine the percent methylation differences associated with a selected sample relative to the others as a group. Because not all sites were present for all six samples, we required that any tested site be contained in the target sample and at least four of the five non-target samples. DMR identification and filtering were done using the same approach as for the athero DMR analyses. All preprocessing and analysis were performed in R version 3.4.4 [12] using custom scripts. To identify overlaps of athero DMRs with normal tissue DMRs, we used an R script for each of four normal tissues (aorta, monocytes, SkM, and heart) that checked each athero DMR for an overlap of ≥50 bp with a DMR of the given tissue.
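The site eligibility rule and the ≥50 bp overlap criterion above can be sketched as follows. This is a hedged Python illustration, not the authors' R script; the function names, the dictionary layout for site coverage, and the half-open `(chrom, start, end)` interval convention are assumptions made for the example.

```python
def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of shared bases between two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def site_is_testable(covered, target, min_non_target=4):
    """Check the one-to-many eligibility rule for a single CpG site.

    covered: dict mapping sample name -> True if the site is present in
    that sample. The site must be present in the target sample and in at
    least min_non_target of the remaining samples.
    """
    if not covered.get(target, False):
        return False
    others = sum(1 for name, ok in covered.items() if name != target and ok)
    return others >= min_non_target


def athero_dmrs_overlapping_tissue(athero_dmrs, tissue_dmrs, min_overlap=50):
    """Return athero DMRs that overlap any tissue DMR by >= min_overlap bp.

    Both inputs are lists of (chrom, start, end) tuples.
    """
    hits = []
    for chrom, start, end in athero_dmrs:
        for t_chrom, t_start, t_end in tissue_dmrs:
            if chrom == t_chrom and overlap_bp(start, end, t_start, t_end) >= min_overlap:
                hits.append((chrom, start, end))
                break  # one qualifying tissue DMR is enough
    return hits
```

The nested loop is quadratic and is only meant to make the criterion explicit; for genome-wide DMR sets, an interval-tree or sorted-sweep approach would be the usual choice.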
2.3. Mapping DMRs to genes and enhancer chromatin
Scripts in the R language [12] with adaptations [13], [14] were used to determine which genes and enhancer chromatin (enh chromatin) segments were associated with athero DMRs. To determine the gene isoform associated with a given DMR, the refFlat table [5] for the RefSeq hg19 genome was used (accessed January 7, 2018). The protein-coding gene or, secondarily, the non-protein-coding gene with the largest overlap was selected, considering the gene regions in the following order of precedence, with distances given relative to the transcription start site (TSS) or the mRNA-determined transcription end sequence (TES): TSS – 5 kb to TSS + 5 kb; TSS + 5 kb to the TES; intergenic (other sequences). To determine overlaps with enhancer chromatin segments, we used DMRs that had a total overlap of at least 50 bp with enh chromatin from any of the following enhancer states in the 18-state chromatin segmentation model [3], [5]: state numbers 3, 8, 9 or 10.
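The region precedence and the enhancer-overlap rule described above can be sketched in a few lines. This is one possible reading in Python, not the authors' R implementation. The function names are hypothetical, the handling of minus-strand genes (mirroring the gene-body interval) is an assumption, and the preference for protein-coding over non-coding genes and the choice among several genes are not modeled.

```python
ENHANCER_STATES = {3, 8, 9, 10}  # enhancer states of the 18-state model


def overlap_bp(a_start, a_end, b_start, b_end):
    """Number of shared bases between two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def classify_dmr(dmr_start, dmr_end, tss, tes, strand="+", flank=5000):
    """Assign a DMR to a gene region following the stated order of precedence.

    1. TSS - 5 kb to TSS + 5 kb, 2. TSS + 5 kb to TES, 3. intergenic.
    Assumption: for minus-strand genes the TES lies at a lower genomic
    coordinate than the TSS, so the gene-body interval is mirrored.
    """
    proximal = (tss - flank, tss + flank)
    body = (tss + flank, tes) if strand == "+" else (tes, tss - flank)
    if overlap_bp(dmr_start, dmr_end, *proximal) > 0:
        return "TSS-proximal"
    if overlap_bp(dmr_start, dmr_end, *body) > 0:
        return "gene body"
    return "intergenic"


def enhancer_overlap_total(dmr_start, dmr_end, segments):
    """Total bp of a DMR covered by enhancer-state segments.

    segments: list of (start, end, state) on the same chromosome.
    """
    return sum(
        overlap_bp(dmr_start, dmr_end, s, e)
        for s, e, state in segments
        if state in ENHANCER_STATES
    )


def overlaps_enhancer(dmr_start, dmr_end, segments, min_total=50):
    """True if the summed enhancer-state overlap is at least min_total bp."""
    return enhancer_overlap_total(dmr_start, dmr_end, segments) >= min_total
```

Summing across segments reflects the "at least a 50-bp overlap total" wording, so two short enhancer segments can jointly qualify a DMR even if neither reaches 50 bp alone.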
Acknowledgements
This research was supported in part by grants from the National Institutes of Health (National Center for Advancing Translational Sciences of the National Institutes of Health, award number UL1TR001417, and NS04885) and the Louisiana Cancer Center to ME and by high performance computing resources and services provided by Technology Services at Tulane University. We thank Dr. Shin Lin (Cardiac Clinic & Services at University of Washington Medical Center, Seattle, Washington) for locating the records showing that the NIH Roadmap aorta sample (Ctl aorta B) was from intra-abdominal aorta.
Footnotes
Transparency document associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2019.103812.
Supplementary data to this article can be found online at https://doi.org/10.1016/j.dib.2019.103812.
Contributor Information
Michelle Lacey, Email: mlacey1@tulane.edu.
Carl Baribault, Email: cbaribault@tulane.edu.
Kenneth C. Ehrlich, Email: kehrlich@tulane.edu.
Melanie Ehrlich, Email: ehrlich@tulane.edu.
References
1. Lacey M. Atherosclerosis-associated differentially methylated regions can reflect the disease phenotype and are often at enhancers. Atherosclerosis. 2019;280(1):183–191. doi: 10.1016/j.atherosclerosis.2018.11.031.
2. Zaina S. DNA methylation map of human atherosclerosis. Circ. Cardiovasc. Genet. 2014;7(5):692–700. doi: 10.1161/CIRCGENETICS.113.000441.
3. Kundaje A. Integrative analysis of 111 reference human epigenomes. Nature. 2015;518(7539):317–330. doi: 10.1038/nature14248.
4. Song Q. A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics. PLoS One. 2013;8(12). doi: 10.1371/journal.pone.0081148.
5. Rosenbloom K.R. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res. 2015;43(Database issue):D670–D681. doi: 10.1093/nar/gku1177.
6. Baribault C. Developmentally linked human DNA hypermethylation is associated with down-modulation, repression, and upregulation of transcription. Epigenetics. 2018:1–15. doi: 10.1080/15592294.2018.1445900. Epub ahead of print.
7. GTEx Consortium. Genetic effects on gene expression across human tissues. Nature. 2017;550(7675):204–213. doi: 10.1038/nature24277.
8. McLean C.Y. GREAT improves functional interpretation of cis-regulatory regions. Nat. Biotechnol. 2010;28(5):495–501. doi: 10.1038/nbt.1630.
9. Huang D.W., Sherman B.T., Lempicki R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4(1):44–57. doi: 10.1038/nprot.2008.211.
10. Khan A., Zhang X. dbSUPER: a database of super-enhancers in mouse and human genome. Nucleic Acids Res. 2016;44(D1):D164–D171. doi: 10.1093/nar/gkv1002.
11. Lacey M.R., Baribault C., Ehrlich M. Modeling, simulation and analysis of methylation profiles from reduced representation bisulfite sequencing experiments. Stat. Appl. Genet. Mol. Biol. 2013;12(6):723–742. doi: 10.1515/sagmb-2013-0027.
12. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. http://www.R-project.org/
13. Lawrence M., Gentleman R., Carey V. rtracklayer: an R package for interfacing with genome browsers. Bioinformatics. 2009;25(14):1841–1842. doi: 10.1093/bioinformatics/btp328.
14. Bhasin J.M., Hu B., Ting A.H. MethylAction: detecting differentially methylated regions that distinguish biological subtypes. Nucleic Acids Res. 2016;44(1):106–116. doi: 10.1093/nar/gkv1461.