Abstract
The data in this article are related to the research article entitled "Genome-wide transcriptome analysis of human epidermal melanocytes". This data article contains a complete list of gene and transcript isoform expression in human epidermal melanocytes. Transcript isoforms that are differentially expressed in lightly versus darkly pigmented melanocytes are identified. We also provide data showing the gene expression profiles of cell signaling gene families (receptors, ion channels, and transcription factors) in melanocytes. The raw sequencing data used to perform this transcriptome analysis is located in the NCBI Sequence Read Archive under Accession No. SRP039354 (http://dx.doi.org/10.7301/Z0MW2F2N).
Specifications table
| Subject area | Biology |
| More specific subject area | Human epidermal melanocytes transcriptome |
| Type of data | Tables (Excel spreadsheets), Text (FASTQ sequence files) |
| How data was acquired | High-throughput RNA sequencing using the Illumina HiSeq 2000 |
| Data format | Raw and analyzed |
| Experimental factors | Melanocyte mRNA libraries were prepared using the Illumina TruSeq RNA Sample Preparation Kit |
| Experimental features | Samples were derived from human foreskin |
| Data source location | Providence, RI |
| Data accessibility | Data is within this article and in the NCBI Sequence Read Archive under Accession No. SRP039354 (http://dx.doi.org/10.7301/Z0MW2F2N) |
Value of the data.
Data, experimental design, materials and methods
1. Human epidermal melanocyte gene and transcript isoform expression data
Supplementary Table 1: A complete data set of transcript abundances, sizes, and alternative splicing for all four HEM cDNA libraries.
Supplementary Table 2: Data set of GPCR, receptor kinase, and ion channel gene expression in HEMs.
Supplementary Table 3: A complete data set of transcription factor gene expression in HEMs.
Supplementary Table 4: Analysis of differentially expressed isoforms in lightly versus darkly pigmented HEMs.
The raw FASTQ sequencing data used to perform this transcriptome analysis is located in the NCBI Sequence Read Archive under Accession No. SRP039354 (http://dx.doi.org/10.7301/Z0MW2F2N).
2. Sample collection, library preparation and sequencing
Lightly and darkly pigmented primary human epidermal melanocytes (HEMs) from neonatal foreskin (Life Technologies/Gibco) were cultured in Medium 254 and Human Melanocyte Growth Supplement (HMGS2, Life Technologies/Gibco). The four HEM lines used for this study (HEM-D1, -D2, -L1, -L2) were each derived from a different donor; all lines were propagated in culture under identical conditions for ≤15 population doublings. Total HEM RNA was isolated using the mirVana miRNA Isolation Kit (Ambion) and its quality was assessed using an Agilent 2100 Bioanalyzer: the RNA Integrity Number (RIN) was ≥8.7 for all samples, in agreement with the Illumina-recommended RIN ≥8. cDNA libraries were prepared from 4 µg of total RNA using standard Illumina protocols (TruSeq RNA Sample Preparation Kit), resulting in cDNA fragments with an average size of 229 bp. Each cDNA library was sequenced with 50 bp single-read chemistry using the Illumina HiSeq 2000 system.
3. Data analysis and annotation
The computational pipeline used for data analysis is shown in Supplemental Figure S1 in [1]. The quality of the raw sequencing data was checked using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/); the quality scores of the reads for each library and for each position in the read were above 30, which is defined as high quality (see Supplemental Figure S2 in [1]). The sequencing data in FASTQ format was then mapped against NCBI build 37.2 of the human genome using Bowtie 2 (version 2.1.0.0) [2]. RNA sequencing metrics were obtained using Picard tools (http://picard.sourceforge.net/) (see Table 1 in [1]). Splice junctions were identified using TopHat (version 2.0.8) [3] and transcript abundances were estimated using Cufflinks (version 2.1.1) [4]. EdgeR was used to perform differential gene expression analysis between samples with different pigmentation levels [5,6] (see Table 5 in [1]). Differentially expressed genes were considered significant if they had an FDR-adjusted p-value <0.05 [7]. For the isoform analysis we imported the Cufflinks isoform expression data of lightly and darkly pigmented HEMs into Microsoft Excel. Using Excel functions we identified the isoforms present only in HEM-L or only in HEM-D and excluded those with a high degree of variability (standard error of the mean >35% of the average FPKM).
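The isoform filter described above was carried out in Microsoft Excel; the following Python sketch is only an illustration of the same idea and not the authors' procedure. It assumes that "present" means FPKM > 0 in every replicate of one pigmentation group and FPKM = 0 in every replicate of the other, and that the standard error of the mean is computed from the sample standard deviation. The function names, the table layout, and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# From the text: isoforms whose SEM exceeds 35% of the average FPKM are excluded.
SEM_FRACTION_CUTOFF = 0.35


def sem(values):
    """Standard error of the mean, using the sample standard deviation."""
    return stdev(values) / sqrt(len(values))


def group_specific_isoforms(fpkm_table, cutoff=SEM_FRACTION_CUTOFF):
    """Return {isoform_id: 'HEM-L' or 'HEM-D'} for isoforms seen in one group only.

    fpkm_table maps an isoform ID to {'L': [fpkm, ...], 'D': [fpkm, ...]}.
    Assumption (not stated in the article): "present" means FPKM > 0 in every
    replicate of a group, "absent" means FPKM == 0 in every replicate.
    """
    result = {}
    for isoform_id, groups in fpkm_table.items():
        light, dark = groups["L"], groups["D"]
        if all(v > 0 for v in light) and all(v == 0 for v in dark):
            label, expressed = "HEM-L", light
        elif all(v > 0 for v in dark) and all(v == 0 for v in light):
            label, expressed = "HEM-D", dark
        else:
            continue  # expressed in both groups, or inconsistently in one
        if sem(expressed) > cutoff * mean(expressed):
            continue  # too variable between replicates
        result[isoform_id] = label
    return result


# Made-up FPKM values for illustration only.
example = {
    "iso_A": {"L": [10.0, 12.0], "D": [0.0, 0.0]},  # light only, consistent
    "iso_B": {"L": [0.0, 0.0], "D": [1.0, 9.0]},    # dark only, too variable
    "iso_C": {"L": [5.0, 6.0], "D": [4.0, 5.0]},    # present in both groups
}
selected = group_specific_isoforms(example)
print(selected)
```

With two replicates per group, as in this data set, the SEM reduces to half the absolute difference between the two FPKM values, so the 35% cutoff is equivalent to requiring that the two replicates differ by no more than 35% of their sum.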
Footnotes
Supplementary materials associated with this article can be found in the online version at doi:10.1016/j.dib.2014.09.002.
Supplementary materials
Supplementary data
References
- 1. Haltaufderhyde K.D., Oancea E. Genome-wide transcriptome analysis of human epidermal melanocytes. Genomics. 2014, in press. doi: 10.1016/j.ygeno.2014.09.010.
- 2. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
- 3. Trapnell C., Pachter L., Salzberg S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25:1105–1111. doi: 10.1093/bioinformatics/btp120.
- 4. Trapnell C., Williams B.A., Pertea G., Mortazavi A., Kwan G., van Baren M.J., Salzberg S.L., Wold B.J., Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 2010;28:511–515. doi: 10.1038/nbt.1621.
- 5. McCarthy D.J., Chen Y., Smyth G.K. Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res. 2012;40:4288–4297. doi: 10.1093/nar/gks042.
- 6. Robinson M.D., McCarthy D.J., Smyth G.K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 7. Benjamini Y., Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 2001;29:1165–1188.
