Data in Brief. 2019 Dec 12;28:104983. doi: 10.1016/j.dib.2019.104983

Data of epigenomic profiling of histone marks and CTCF binding sites in bovine rumen epithelial primary cells before and after butyrate treatment

Xiaolong Kang a,b, Shuli Liu a,c, Lingzhao Fang d, Shudai Lin e, Mei Liu f, Ransom L Baldwin a, George E Liu a,∗, Cong-jun Li a,∗∗
PMCID: PMC6933192  PMID: 31890818

Abstract

Discovering the regulatory elements of livestock genomes is essential for understanding the basic biology of livestock and for genomic improvement programs. Previous studies showed that butyrate mediates epigenetic modifications in bovine cells. To explore bovine functional genomic elements and the roles of butyrate in the epigenetic modification of bovine genomic activity, we generated and deposited genome-wide datasets of transcription factor binding sites of CTCF (CCCTC-binding factor, an insulator binding protein), histone methylation (H3K27me3, H3K4me1, H3K4me3) and histone acetylation (H3K27ac) from bovine rumen epithelial primary cells (REPC) before and after butyrate treatment (doi: 10.1186/s12915-019-0687-8 [1]). Here, we provide detailed information on experimental design, data generation, data quality assessment and guidelines for data re-use. Our data will be a valuable resource for the systematic annotation of regulatory elements in cattle and for studying the functional role of butyrate in bovine epigenetic modifications, as well as for studies of nutritional regulation and metabolism in farm animals and humans.

Keywords: Butyrate, Histone marks, CTCF, Bovine rumen


Specifications Table

Subject area: Biochemistry, Genetics and Molecular Biology
More specific subject area: Genetics
Type of data: Table and figures
How data was acquired: ChIP-seq assay (NextSeq 500) and bioinformatics
Data format: Raw, filtered and analyzed
Experimental factors: Bovine rumen epithelial primary cells before and after butyrate treatment
Experimental features: Rumen epithelial tissue was collected from a two-week-old Holstein bull calf fed milk replacer only. The epithelial layer of the rumen tissue was manually separated from the muscular layer and rinsed in water to remove residual feed particles. Rumen epithelial fragments generally underwent 5–6 cycles of digestion with fresh trypsin solution. Butyrate (5 mM) was added to the culture for 24 h before harvest. Chromatin immunoprecipitation was performed for the transcription factor binding sites of CTCF (CCCTC-binding factor, an insulator binding protein), histone methylation (H3K27me3, H3K4me1, H3K4me3) and histone acetylation (H3K27ac); immunoprecipitated DNA was isolated and sequenced on an Illumina NextSeq 500 platform.
Data source location: Animal Genomics and Improvement Laboratory, BARC, Agricultural Research Service, USDA, Beltsville, Maryland, USA
Data accessibility: Raw read data were deposited in the NCBI Gene Expression Omnibus under accession GSE129423 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE129423), and data are in the related article [1].
Related research article: Fang, L., Liu, S., Liu, M., Kang, X., Lin, S. et al. Functional annotation of the cattle genome through systematic discovery and characterization of chromatin states and butyrate-induced variations. BMC Biology, 2019, 17(1), 1–16. DOI: https://doi.org/10.1186/s12915-019-0687-8

Value of the Data
  • A number of studies have revealed the significant roles of butyrate in diverse molecular functions and biological processes in bovine cells [2,3]. The dataset, with its detailed table and figure information, can be used to characterize the roles of butyrate in the epigenetic modifications of bovine cells.

  • The dataset shows a complex interplay between the genome and specialized functional proteins such as CTCF, a multifunctional protein [4], as well as the post-translationally modified histone marks H3K4me1, H3K4me3, H3K27ac, and H3K27me3 [5,6].

  • The datasets in our article are useful to researchers interested in butyrate function and in studies of nutritional regulation and metabolism in farm animals and humans.

  • These data provide a valuable resource for the systematic annotation of regulatory elements in cattle and for studying the functional role of butyrate in bovine epigenetic modifications.

1. Data

The rumen is an important organ mediating feed fermentation, digestion and nutrient uptake in ruminants. Nutrients from dietary supplements have been shown to influence the function of enzymes that participate in the methylation process [7,8]. Butyrate, one of the short-chain fatty acids (SCFA), can activate epigenetically silenced genes by increasing global histone acetylation [9], and can also induce cell-cycle arrest and apoptosis [10].

The data in this article profile the genome-wide binding sites of CTCF and four histone marks (H3K4me1, H3K4me3, H3K27ac, and H3K27me3) in bovine rumen epithelial primary cells before and after butyrate treatment, obtained by chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq). CTCF is a DNA-binding factor with defined functions in the regulation of gene expression (transcriptional activation and repression), RNA splicing, and enhancer/promoter insulation [4].

A total of 468,849,656 raw reads were generated by Illumina sequencing (NextSeq 500), with an average of 42,622,696 ± 2,881,720 per sample (Table 1). The raw read files (fastq format) of each sample have been uploaded to the NCBI Gene Expression Omnibus (accession GSE129423; 11 samples in total, sample IDs GSM3712486–GSM3712496). Sequencing and alignment statistics for each dataset are summarized in Table 1. After trimming of raw reads, an average of about 25 million reads per sample mapped uniquely to the bovine genome, and 25,110,642 ± 4,603,827 final tags per sample were generated for further analysis. After tag normalization for each library, 19,627,913 input tags were used for peak calling for each sample (Table 1).

Detailed information on sequence quality control is summarized in Fig. 1 and Supplementary Fig. 1. In Fig. 1A, the boxed area represents the central two quartiles (the middle line marks the median), and the whiskers show the top and bottom quartiles without outliers. A heatmap shows the Pearson correlation coefficients (r) of all pairwise comparisons (Fig. 1B).

Three different charts were generated to compare peak sizes and strength between butyrate-treated and untreated samples (Fig. 1, Fig. 2 and Supplementary Fig. 2). For each pairwise comparison, a scatter plot was generated by plotting the tag numbers of sample 1 against those of sample 2 for each merged region (Supplementary Fig. 2 and Supplementary Table S1). The slope is a measure of the average ratio in tag numbers between butyrate-treated and untreated samples. A peak size boxplot was used to compare the distribution of peak tag numbers between samples. The metric used for these charts was the number of tags in the merged peak regions of the assay. The number of tags was calculated from the average values by taking into account the length of the merged regions, the bin size and the in-silico extension (Supplementary Table S1).
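The two pairwise comparison metrics described above, the scatter-plot slope and the Pearson correlation coefficient, can be illustrated with a minimal sketch. The text does not state how the slope was fitted, so a least-squares fit through the origin is assumed here; the function names and the tag counts are hypothetical and are not taken from the dataset.

```python
import math


def slope_through_origin(x, y):
    """Least-squares slope k of y = k * x (no intercept; an assumed fit)."""
    sxx = sum(a * a for a in x)
    if sxx == 0:
        raise ValueError("x contains no non-zero values")
    return sum(a * b for a, b in zip(x, y)) / sxx


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


# Hypothetical tag counts per merged region for one mark (illustrative only).
untreated = [120, 340, 80, 560, 210]
treated = [150, 400, 95, 700, 260]

k = slope_through_origin(untreated, treated)
r = pearson_r(untreated, treated)
print(f"slope (treated vs. untreated) = {k:.3f}, Pearson r = {r:.3f}")
```

Under this assumption, a slope above 1 would indicate more tags on average in the butyrate-treated sample across merged regions, and a slope below 1 fewer.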

Table 1. Sequencing read alignment statistics for the ChIP-seq data set.

| Sample | Total number of reads | Total number of alignments | Unique alignments (without duplicate reads) | Unique alignments (%) | Final number of tags | Normalized tags | Input tags used for peak calling | FRIP (%) |
|---|---|---|---|---|---|---|---|---|
| PC_CTCF | 39,610,540 | 34,652,442 | 23,328,713 | 67.3 | 23,205,192 | 23,205,192 | 19,627,913 | 13.9 |
| BT_CTCF | 40,990,109 | 37,448,576 | 23,331,071 | 62.3 | 23,225,349 | 23,205,192 | 19,627,913 | 19.4 |
| PC_H3K27ac | 42,565,192 | 36,737,369 | 24,488,622 | 66.7 | 24,412,621 | 20,565,887 | 19,627,913 | 39.3 |
| BT_H3K27ac | 48,050,969 | 44,040,540 | 20,627,722 | 46.8 | 20,565,887 | 20,565,887 | 19,627,913 | 23.2 |
| PC_H3K27me3 | 43,699,377 | 40,031,049 | 26,146,058 | 65.3 | 26,053,236 | 26,053,236 | 19,627,913 | 49.0 |
| BT_H3K27me3 | 42,961,510 | 40,131,792 | 28,969,252 | 72.2 | 28,861,259 | 26,053,236 | 19,627,913 | 45.6 |
| PC_H3K4me1 | 43,243,959 | 40,767,839 | 33,001,733 | 81.0 | 32,915,813 | 32,783,103 | 19,627,913 | 27.2 |
| BT_H3K4me1 | 46,973,975 | 43,593,617 | 32,865,978 | 75.4 | 32,783,103 | 32,783,103 | 19,627,913 | 20.0 |
| PC_H3K4me3 | 41,860,309 | 38,598,738 | 23,586,847 | 61.1 | 23,473,852 | 21,092,832 | 19,627,913 | 60.7 |
| BT_H3K4me3 | 38,952,658 | 35,316,748 | 21,164,467 | 59.9 | 21,092,832 | 21,092,832 | 19,627,913 | 68.8 |
| Input | 39,941,058 | 37,593,762 | 19,832,143 | 52.8 | 19,627,913 | 19,627,913 | | |
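The per-sample summary statistics can be recomputed directly from the "Total number of reads" column of Table 1. The sketch below is one way to do so; it assumes that the reported spread is a sample standard deviation (n − 1 denominator) taken over all 11 libraries including the input, which the text does not state explicitly. Variable names are illustrative.

```python
from statistics import mean, stdev

# "Total number of reads" column of Table 1, in row order
# (ten ChIP libraries followed by the input library).
total_reads = [
    39_610_540, 40_990_109,  # CTCF: untreated, butyrate-treated
    42_565_192, 48_050_969,  # H3K27ac
    43_699_377, 42_961_510,  # H3K27me3
    43_243_959, 46_973_975,  # H3K4me1
    41_860_309, 38_952_658,  # H3K4me3
    39_941_058,              # Input
]

n_samples = len(total_reads)
grand_total = sum(total_reads)
avg_reads = mean(total_reads)
sd_reads = stdev(total_reads)  # sample SD; assumed to match the reported "±"

print(f"samples: {n_samples}")
print(f"total raw reads: {grand_total:,}")
print(f"mean ± SD per sample: {avg_reads:,.0f} ± {sd_reads:,.0f}")
```

The same pattern applies to the other numeric columns, for example the final number of tags.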

Fig. 1. Quality assessment of reads and ChIP signal. (A) Distribution of peak tag numbers. (B) The Pearson correlation coefficients of all pairwise comparisons. Rumen-primC (PC): rumen-primary epithelial cells; Rumen-BT (BT): rumen primary epithelial cells treated with butyrate.

Fig. 2. Cumulative read coverage. A specific and strong ChIP enrichment was indicated by a steep rise of the cumulative sum towards the highest rank. x-axis: percentage rank of signal enriched. y-axis: fraction of cumulative tag density.
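The idea behind a cumulative coverage (fingerprint) curve can be sketched in a few lines: bins are ranked by tag count, and the cumulative fraction of tags is tracked against the rank. This is a simplified illustration of the concept, not the deepTools implementation; the function name and the bin counts are hypothetical.

```python
from itertools import accumulate


def fingerprint(bin_counts):
    """Return (rank fraction, cumulative tag fraction) pairs.

    Bins are sorted from lowest to highest tag count, so an enriched
    sample stays low for most ranks and rises steeply at the end.
    """
    counts = sorted(bin_counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("no tags in any bin")
    n = len(counts)
    return [((i + 1) / n, c / total) for i, c in enumerate(accumulate(counts))]


# Hypothetical tag counts per genomic bin (illustrative only).
uniform = [10] * 10         # input-like library: even coverage
enriched = [1] * 9 + [91]   # ChIP-like library: most tags in one bin

for name, counts in (("uniform", uniform), ("enriched", enriched)):
    curve = fingerprint(counts)
    rank, cumulative = curve[8]  # the lowest-ranked 90% of bins
    print(f"{name}: {cumulative:.2f} of tags in the lowest {rank:.0%} of bins")
```

In this toy case the evenly covered library accumulates tags along the diagonal, whereas the enriched library holds most of its tags in the top-ranked bin, which is the steep late rise described in the caption.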

The cumulative read coverage for each sample, plotted with the fingerprint program from deepTools (v3.3.0) [11], is provided in Fig. 2. Peak distributions across genomic regions are displayed as pie plots (Supplementary Fig. 3). Tag distributions (using bigWig metrics) across all merged regions (i.e., all peak regions), transcription start sites (TSS) or gene bodies were determined and are presented either as average plots (average of values for all target regions) (Supplementary Fig. 4) or as heatmaps (values in z-axis/color, regions in y-axis) (Fig. 3). Overlapping intervals were grouped into “Merged Regions” to compare peak metrics between two or more samples (Supplementary Table S2). Super-enhancers were identified using a proprietary algorithm as described previously [12]. First, MACS [13] or SICER [13] peaks generated by the standard ChIP-seq analyses were merged if their inner distance was less than or equal to 12,500 bp. Then, the merged peak regions with the strongest signals (top 5%) were identified as super-enhancers (Fig. 4).
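The two-step super-enhancer procedure (merging peaks within 12,500 bp, then keeping the top 5% by signal) can be illustrated as follows. The actual algorithm is described as proprietary, so this is only a sketch of the stated rules: the tuple layout, the function names, the choice to sum peak signals within a merged region, and the example peaks are all assumptions.

```python
import math


def merge_peaks(peaks, max_gap=12_500):
    """Merge peaks on the same chromosome whose inner distance is <= max_gap.

    peaks: iterable of (chrom, start, end, signal) tuples.
    The signal of a merged region is the sum of its peaks (an assumption).
    """
    merged = []
    for chrom, start, end, signal in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            _, prev_start, prev_end, prev_signal = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end), prev_signal + signal)
        else:
            merged.append((chrom, start, end, signal))
    return merged


def top_fraction(regions, fraction=0.05):
    """Return the regions with the strongest signal (top `fraction`, at least one)."""
    ranked = sorted(regions, key=lambda region: region[3], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]


# Hypothetical peaks (illustrative only).
peaks = [
    ("chr1", 1_000, 2_000, 50.0),
    ("chr1", 10_000, 11_000, 30.0),   # 8,000 bp from the previous peak: merged
    ("chr1", 40_000, 41_000, 20.0),   # 29,000 bp away: kept separate
    ("chr2", 500, 1_500, 10.0),
]

regions = merge_peaks(peaks)
super_enhancers = top_fraction(regions)
print(regions)
print(super_enhancers)
```

With real data, `top_fraction` would be applied to the full set of merged H3K27ac or H3K4me1 regions for each sample.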

Fig. 3. Genome-wide enrichment of peaks for histone marks and CTCF. (A) Heatmap of tag distributions across promoters (TSS, Transcription Start Sites) (default = 5 clusters; indicated by C1–C5, values in z-axis/color, regions in y-axis). (B) Heatmap of tag distributions across merged regions. The gradient blue-to-white color indicates high-to-low count in the corresponding region. Rumen-primC (PC): rumen-primary epithelial cells; Rumen-BT (BT): rumen primary epithelial cells treated with butyrate.

Fig. 4. Identification of super-enhancers. Enhancers are plotted in decreasing order based on ChIP-seq peak intensity (tag count). X-axis: number of merged peak regions. Y-axis: tag counts in merged peak regions. Super-enhancers for both H3K27ac and H3K4me1 before and after butyrate treatment are shown in panels a–d, respectively. primC-: rumen-primary epithelial cells; BT-: rumen primary epithelial cells treated with butyrate.

2. Experimental design, materials, and methods

2.1. Animal and tissue collection

Animal care and tissue isolation work were approved by the Beltsville Area Animal Care and Use Committee (Protocol Number 07-025). The methods for epithelial cell isolation and culture were described in an earlier report [14]. Rumen epithelial tissue was collected from a two-week-old Holstein bull calf fed milk replacer only. At sacrifice, rumen epithelial tissue was photographed and collected from the anterior portion of the ventral sac of the rumen beneath the reticulum and below the rumen fluid layer. The epithelial layer of the rumen tissue was manually separated from the muscular layer and rinsed in water to remove residual feed particles. Samples were further rinsed in ice-cold saline. The tissue was added to 50 ml of digestion solution (2% trypsin and 1.15 mmol CaCl2 in phosphate-buffered saline) and incubated in a 37 °C incubator for 15 min.

Rumen epithelial fragments generally underwent 5–6 cycles of digestion with fresh trypsin solution. The first two rounds of digestion were discarded, and the third, fourth and fifth rounds were collected. After trypsin digestion of the epithelial tissue, the solution was filtered through a 300-μm nylon mesh. Following filtration, cell fractions were centrifuged at 60×g for 5 min at 4 °C to pellet the rumen cells. Cells were then subjected to three wash cycles with sterile PBS containing antibiotic-antimycotic (100 units/ml of Penicillin G sodium, streptomycin sulfate, 0.25 μg amphotericin B as Fungizone). Cells were counted using a hemacytometer, and cell viability was estimated by trypan blue dye exclusion assays. Cells were plated in a 25 cm plate at a density of 1 million cells/dish in DMEM with antibiotic-antimycotic and 5% fetal bovine serum (DMEM-FBS). After 24 h in culture, the cell media were removed and replaced with fresh DMEM-FBS. Cell media were changed every 48 h until the cells reached confluence (4–7 days). Cells were then removed from the dish by trypsinization, quantified and reseeded for treatment or frozen in liquid nitrogen for further culture. To test the response of the primary rumen epithelial cells to butyrate treatment, 5 mM butyrate was added to the culture for 24 h before harvest.

2.2. ChIP sequencing preparation

ChIP-seq of rumen epithelial tissue was performed as reported in our earlier publication [15]. In short, DNA recovered from a conventional ChIP procedure was quantified using the QuantiFluor fluorometer (Promega, Madison, WI, USA). DNA integrity was verified using the Agilent Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA). The DNA was then processed, including end repair, adaptor ligation, and size selection, using an Illumina sample prep kit following the manufacturer's instructions (Illumina, San Diego, CA, USA). Final DNA libraries were validated and sequenced at 75 nt per read on an Illumina NextSeq 500 platform.

2.3. Read mapping and quality control

The quality of base calling for raw reads generated by the Illumina sequencer was assessed using the FastQC program (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/, v0.11.4) to ensure that there were no biases or problems in our raw data. The trimmed reads were aligned to the bovine reference genome (BosTau_UMD3.1) using the BWA algorithm with default settings [16]. After de-duplication, only reads that passed Illumina's purity filter, aligned with no more than 2 mismatches, and mapped uniquely to the genome were used in the subsequent analysis. To identify the density of fragments (extended tags) along the genome, the genome was divided into 32-nt bins and the number of fragments in each bin was determined. To compare peak metrics between two or more samples, overlapping intervals were grouped into “Merged Regions” by Samtools (v1.9) [13]. deepTools (v3.3.0) [11] was used to plot the cumulative read coverage for each sample. We used the default versions of code to process our datasets. Peaks were detected by MACS (v2.1.0) [13] (CTCF, H3K27ac, H3K4me1, H3K4me3) and SICER (v1.1) [13] (H3K27me3). Graphics were generated using the seqplot R Bioconductor package and deepTools [11].
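The fragment density step can be sketched as follows: each aligned tag is extended in silico to an assumed fragment length, and every 32-nt bin that the extended fragment overlaps is incremented. The 200-nt extension length, the tag representation and the function name are assumptions made for illustration; the text specifies only the 32-nt bin size.

```python
from collections import Counter

BIN_SIZE = 32  # nt, as stated in the text


def fragment_density(tags, fragment_length=200, bin_size=BIN_SIZE):
    """Count extended fragments overlapping each genomic bin.

    tags: iterable of (chrom, pos, strand), where pos is the 0-based
    5' end of an aligned read and strand is "+" or "-".
    fragment_length: assumed in-silico extension length (hypothetical).
    Returns a Counter keyed by (chrom, bin_index).
    """
    density = Counter()
    for chrom, pos, strand in tags:
        if strand == "+":
            start, end = pos, pos + fragment_length              # extend downstream
        else:
            start, end = max(0, pos - fragment_length + 1), pos + 1  # extend upstream
        # Half-open interval [start, end) covers these bins.
        for bin_index in range(start // bin_size, (end - 1) // bin_size + 1):
            density[(chrom, bin_index)] += 1
    return density


# Hypothetical tags (illustrative only).
tags = [("chr1", 0, "+"), ("chr1", 100, "+"), ("chr1", 299, "-")]
density = fragment_density(tags)
for (chrom, bin_index), count in sorted(density.items()):
    print(chrom, bin_index * BIN_SIZE, (bin_index + 1) * BIN_SIZE, count)
```

Bins covered by several extended fragments receive higher counts, which is the signal that peak callers and the bigWig-based plots operate on.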

Acknowledgments

We thank Reuben Anderson, Mary Bowman, Donald Carbaugh, Christina Clover, Cecelia Niland, and Sara McQueeney for technical assistance and sample collection. We thank the anonymous reviewers for many helpful comments. This work was supported in part by AFRI grant numbers 2013-67015-20951, 2016-67015-24886, and 2019-67015-29321 from the USDA National Institute of Food and Agriculture Animal Genome and Reproduction Programs and BARD grant number US-4997-17 from the US-Israel Binational Agricultural Research and Development (BARD) Fund. G.E. L. was supported by appropriated project 8042-31000-001-00-D, “Enhancing Genetic Merit of Ruminants Through Improved Genome Assembly, Annotation, and Selection”, and C-J L. was supported by appropriated project 8042-31310-078-00-D, “Improving Feed Efficiency and Environmental Sustainability of Dairy Cattle through Genomics and Novel Technologies” of the Agricultural Research Service of the United States Department of Agriculture. Mention of trade names or commercial products in this article is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the US Department of Agriculture. The USDA is an equal opportunity provider and employer.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.dib.2019.104983.

Contributor Information

George E. Liu, Email: george.liu@usda.gov.

Cong-jun Li, Email: congjun.li@usda.gov.

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A. Supplementary data

The following are the Supplementary data to this article:

Multimedia component 1
mmc1.pdf (416.6KB, pdf)
Multimedia component 2
mmc2.pdf (477.3KB, pdf)
Multimedia component 3
mmc3.pdf (323.4KB, pdf)
Multimedia component 4
mmc4.pdf (195.3KB, pdf)
Multimedia component 5
mmc5.pdf (195.3KB, pdf)
Multimedia component 6
mmc6.xlsx (47.8MB, xlsx)

References

  • 1. Fang L., Liu S., Liu M., Kang X., Lin S., Li B., Connor E.E., Baldwin R.L., Tenesa A., Ma L. Functional annotation of the cattle genome through systematic discovery and characterization of chromatin states and butyrate-induced variations. BMC Biol. 2019;17:1–16. doi: 10.1186/s12915-019-0687-8.
  • 2. Li C.-j., Li R.W. Butyrate induced cell cycle arrest in bovine cells through targeting gene expression relevant to DNA replication apparatus. Gene Regul. Syst. Biol. 2008;2 doi: 10.4137/grsb.s465. GRSB. S465.
  • 3. Li C.-J., Li R.W., Baldwin R.L., Blomberg L.A., Wu S., Li W. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification. Gene Regul. Syst. Biol. 2016;10 doi: 10.4137/GRSB.S35607. GRSB. S35607.
  • 4. Kim S., Yu N.-K., Kaang B.-K. CTCF as a multifunctional protein in genome regulation and gene expression. Exp. Mol. Med. 2015;47:e166. doi: 10.1038/emm.2015.33.
  • 5. McVicker G., van de Geijn B., Degner J.F., Cain C.E., Banovich N.E., Raj A., Lewellen N., Myrthil M., Gilad Y., Pritchard J.K. Identification of genetic variants that affect histone modifications in human cells. Science. 2013;342:747–749. doi: 10.1126/science.1242429.
  • 6. Rao S.S., Huang S.-C., St Hilaire B.G., Engreitz J.M., Perez E.M., Kieffer-Kwon K.-R., Sanborn A.L., Johnstone S.E., Bascom G.D., Bochkov I.D. Cohesin loss eliminates all loop domains. Cell. 2017;171:305–320. doi: 10.1016/j.cell.2017.09.026. e324.
  • 7. Choi S.-W., Friso S. Epigenetics: a new bridge between nutrition and health. Adv. Nutr. 2010;1:8–16. doi: 10.3945/an.110.1004.
  • 8. Murdoch B.M., Murdoch G.K., Greenwood S., McKay S. Nutritional influence on epigenetic marks and effect on livestock production. Front. Genet. 2016;7:182. doi: 10.3389/fgene.2016.00182.
  • 9. Berger S.L. The complex language of chromatin regulation during transcription. Nature. 2007;447:407. doi: 10.1038/nature05915.
  • 10. Li C.-J., Li R.W. Bioinformatic dissecting of TP53 regulation pathway underlying butyrate-induced histone modification in epigenetic regulation. Genet. Epigenet. 2014;6 doi: 10.4137/GEG.S14176. GEG. S14176.
  • 11. Ramírez F., Ryan D.P., Grüning B., Bhardwaj V., Kilpert F., Richter A.S., Heyne S., Dündar F., Manke T. deepTools 2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res. 2016;44:W160–W165. doi: 10.1093/nar/gkw257.
  • 12. Whyte W.A., Orlando D.A., Hnisz D., Abraham B.J., Lin C.Y., Kagey M.H., Rahl P.B., Lee T.I., Young R.A. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013;153:307–319. doi: 10.1016/j.cell.2013.03.035.
  • 13. Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.
  • 14. Baldwin R. The proliferative actions of insulin, insulin-like growth factor-I, epidermal growth factor, butyrate and propionate on ruminal epithelial cells in vitro. Small Rumin. Res. 1999;32:261–268.
  • 15. Shin J.H., Li R.W., Gao Y., Baldwin R.t., Li C.J. Genome-wide ChIP-seq mapping and analysis reveal butyrate-induced acetylation of H3K9 and H3K27 correlated with transcription activity in bovine cells. Funct. Integr. Genom. 2012;12:119–130. doi: 10.1007/s10142-012-0263-6.
  • 16. Li H., Durbin R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.


Articles from Data in Brief are provided here courtesy of Elsevier
