Abstract
Cell-type specific gene regulation is a key to gaining a full understanding of how the distinct phenotypes of differentiated cells are achieved and maintained. Here we examined how changes in transcriptional activation during alveolar epithelial cell (AEC) differentiation determine phenotype. We performed transcriptomic profiling using in vitro differentiation of human and rat primary AEC. This model recapitulates in vitro an in vivo process in which AEC transition from alveolar type 2 (AT2) cells to alveolar type 1 (AT1) cells during normal maintenance and regeneration following lung injury. Here we describe in detail the quality control, preprocessing, and normalization of microarray data presented within the associated study (Marconett et al., 2013). We also include R code for reproducibility of the referenced data and easily accessible processed data tables.
Keywords: Alveolar epithelial cells, Differentiation, Transcriptomic analysis
Direct link to deposited data SuperSeries (containing both datasets)
Experimental design, materials & methods
Human remnant lung selection and alveolar epithelial type 2 cell purification
Remnant human transplant lungs were obtained in compliance with Institutional Review Board-approved protocols for the use of human source material in research (HS-07-00660) and processed within 3 days of death. Rat AT2 cells were isolated from male Sprague-Dawley rats in compliance with IACUC protocol #11360. Lungs were accepted from donors between 18 and 75 years of age with no history of smoking, negative serologies and negative cultures with the exception of CMV, EBV, and hepatitis (with confirmed vaccination record), no current drug use, and a pO2 > 200 on 100% FiO2. Additionally, donor lungs were rejected for heavy marijuana usage, any cancer present in the donor, or a chest X-ray indicating pneumonia, asthma, emphysema, or chronic obstructive pulmonary disease (COPD). Also rejected were lungs from donors on a ventilator for more than 4 days, with any bacterial or viral meningitis, or with MRSA. Human lung tissue was processed as previously described [2], and detailed cell purification techniques have also been described previously [1].
RNA isolation
One microgram of RNA was converted into cRNA using the Illumina TotalPrep RNA amplification kit (Life Technologies, USA) and used for human (Illumina HT12v4) or rat (Rat-Ref-12) expression analysis at the Southern California Genotyping Consortium, University of California Los Angeles.
Basic analysis
BeadStudio was used to convert images to raw signal data. Data files from BeadStudio were analyzed in R (version 2.11.1). Code for the human expression analysis is included in Appendix A, and code for the rat expression analysis is included in Appendix B. Briefly, the data were compiled into an eSet using the lumi package [3] with metadata from Table 1 (human) and Table 5 (rat). Unique lumiIDs based on probe hybridization sequence were assigned to each probe using lumiHumanIDMapping. For the rat expression data, probes were filtered based on quality using the ReMOAT reannotation pipeline, available online at: http://www.compbio.group.cam.ac.uk/Resources/Annotation/ [4]. Raw data were checked for enrichment of p-values below 0.05, indicating significance above background false discovery, using a matrix design (Table 2 for human and Table 6 for rat). Variance stabilization and normalization (VSN) was performed using the vsn package [5] to allow for a large number of differentially expressed genes. Statistical analyses were performed using limma [6] with the technical replicates removed (Table 3; the rat data had no technical replicates). A linear regression model was fitted over the time-course of differentiation using lmFit, and t-tests were performed between D0 and D8. False-discovery rate was controlled using the Benjamini–Hochberg (BH) correction [7]. R was used for principal component analysis and heatmap generation. Heatmaps were generated using heatmap.plus in R by selecting the 5% of probes with the highest variance across the whole dataset and clustering with the “average” linkage method. Clustering with different linkages, for example “ward” (Fig. 1) and “complete” (Fig. 2), resulted in comparable sample dendrograms. The list of significant differentially expressed genes is included in Table 4 (human) and Table 7 (rat). Pathway analysis was performed on genes with statistically significant differences in expression using IPA (Ingenuity Systems, www.ingenuity.com) or DAVID [8], [9].
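The Benjamini–Hochberg adjustment named above can be illustrated with a short sketch. The published analysis was run in R through limma; the Python function below only illustrates the adjustment itself and is not the authors' code. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Visit the p-values from largest to smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank when sorted ascending
        candidate = p_values[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds the
        # adjusted value of any larger raw p-value.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four probes.
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(example_p)
print(adjusted_p)
```

A probe would then be called significant when its adjusted value falls below the chosen false-discovery threshold.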
Correlation of human and rat gene expression was performed using Entrez identifiers and the Mouse Genome Informatics (MGI) Web database [10], and the correlated microarrays (Table 8) were plotted against each other to reveal genes which were differentially expressed in human, rat, or both. Unique gene symbols were used to calculate overall numbers of genes significantly differentially expressed (Table 9).
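One way to picture the cross-species comparison is as a join of the two result sets through an ortholog table, followed by a per-pair call of "human only", "rat only", or "both". The sketch below assumes adjusted p-values keyed by gene identifier and a list of (human, rat) ortholog pairs. The identifiers, the 0.05 cutoff, and the function name are all placeholders and are not taken from the study.

```python
ADJ_P_CUTOFF = 0.05  # assumed threshold; the text does not state one for this step


def classify_ortholog_pairs(human_adj_p, rat_adj_p, orthologs, cutoff=ADJ_P_CUTOFF):
    """Label each ortholog pair by the species in which it is significant."""
    calls = {}
    for human_id, rat_id in orthologs:
        # Skip pairs that were not measured on both arrays.
        if human_id not in human_adj_p or rat_id not in rat_adj_p:
            continue
        human_sig = human_adj_p[human_id] < cutoff
        rat_sig = rat_adj_p[rat_id] < cutoff
        if human_sig and rat_sig:
            label = "both"
        elif human_sig:
            label = "human only"
        elif rat_sig:
            label = "rat only"
        else:
            label = "neither"
        calls[(human_id, rat_id)] = label
    return calls


# Placeholder identifiers and adjusted p-values, for illustration only.
orthologs = [("H1", "R1"), ("H2", "R2"), ("H3", "R3"), ("H4", "R4"), ("H5", "R5")]
human_adj_p = {"H1": 0.001, "H2": 0.02, "H3": 0.60, "H4": 0.90}
rat_adj_p = {"R1": 0.003, "R2": 0.40, "R3": 0.01, "R4": 0.75, "R5": 0.01}

calls = classify_ortholog_pairs(human_adj_p, rat_adj_p, orthologs)
for pair, label in calls.items():
    print(pair, label)
```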
Table 1.
Target | Sex | Prepdate | Day | Race | Age | Ter | Smoker |
---|---|---|---|---|---|---|---|
5626686051_A | Female | 2010Dec | D6 | Caucasian | 49 | Yes | No |
5626686051_C | Female | 2010Dec | D4 | Caucasian | 49 | Yes | No |
5626686051_F | Female | 2010Dec | D0 | Caucasian | 49 | Yes | No |
5626686051_H | Female | 2010Dec | D2 | Caucasian | 49 | Yes | No |
5626686051_L | Female | 2010Dec | D8 | Caucasian | 49 | Yes | No |
5626686013_A | Female | 2010Nov | D0 | Caucasian | 61 | Yes | No |
5626686013_B | Female | 2010Nov | D6 | Caucasian | 61 | Yes | No |
5626686013_C | Female | 2009Dec | D4 | Caucasian | 66 | Yes | No |
5626686013_D | Female | 2010Nov | D4 | Caucasian | 61 | Yes | No |
5626686013_E | Female | 2009Dec | D6 | Caucasian | 66 | Yes | No |
5626686013_F | Female | 2009Dec | D2 | Caucasian | 66 | Yes | No |
5626686013_G | Female | 2010Nov | D0 | Caucasian | 61 | Yes | No |
5626686013_H | Female | 2010Nov | D2 | Caucasian | 61 | Yes | No |
5626686013_I | Female | 2009Dec | D8 | Caucasian | 66 | Yes | No |
5626686013_J | Female | 2010Nov | D4 | Caucasian | 61 | Yes | No |
5626686013_K | Female | 2009Dec | D0 | Caucasian | 66 | Yes | No |
5626686013_L | Female | 2010Nov | D8 | Caucasian | 61 | Yes | No |
Table 5.
Sample | Chip lane | Chip | Sample name | Prepdate | DAY |
---|---|---|---|---|---|
5665175063_A | A | RAT v1.0 | AEC TII D6 | Round3 | D6 |
5665175063_B | B | RAT v1.0 | AEC TII D6 | Round2 | D6 |
5665175063_C | C | RAT v1.0 | AEC TII D2 | Round3 | D2 |
5665175063_D | D | RAT v1.0 | AEC TII D2 | Round2 | D2 |
5665175063_E | E | RAT v1.0 | AEC TII D4 | Round3 | D4 |
5665175063_G | G | RAT v1.0 | AEC TII D8 | Round3 | D8 |
5665175063_H | H | RAT v1.0 | AEC TII D0 | Round3 | D0 |
5665175063_I | I | RAT v1.0 | AEC TII D0 | Round2 | D0 |
5665175063_J | J | RAT v1.0 | AEC TII D8 | Round2 | D8 |
5665175063_K | K | RAT v1.0 | AEC TII D4 | Round2 | D4 |
5700760018_A | A | RAT v1.0 | AEC TII D2 | Round1 | D2 |
5700760018_F | F | RAT v1.0 | AEC TII D4 | Round1 | D4 |
5700760021_A | A | RAT v1.0 | AEC TII D0 | Round1 | D0 |
5700760021_D | D | RAT v1.0 | AEC TII D6 | Round1 | D6 |
5700760021_I | I | RAT v1.0 | AEC TII D8 | Round1 | D8 |
Table 2.
Target | D0 | D2 | D4 | D6 | D8 |
---|---|---|---|---|---|
5626686051_A | 0 | 0 | 0 | 1 | 0 |
5626686051_C | 0 | 0 | 1 | 0 | 0 |
5626686051_F | 1 | 0 | 0 | 0 | 0 |
5626686051_H | 0 | 1 | 0 | 0 | 0 |
5626686051_L | 0 | 0 | 0 | 0 | 1 |
5626686013_A | 1 | 0 | 0 | 0 | 0 |
5626686013_B | 0 | 0 | 0 | 1 | 0 |
5626686013_C | 0 | 0 | 1 | 0 | 0 |
5626686013_D | 0 | 0 | 1 | 0 | 0 |
5626686013_E | 0 | 0 | 0 | 1 | 0 |
5626686013_F | 0 | 1 | 0 | 0 | 0 |
5626686013_G | 1 | 0 | 0 | 0 | 0 |
5626686013_H | 0 | 1 | 0 | 0 | 0 |
5626686013_J | 0 | 0 | 1 | 0 | 0 |
5626686013_I | 0 | 0 | 0 | 0 | 1 |
5626686013_K | 1 | 0 | 0 | 0 | 0 |
5626686013_L | 0 | 0 | 0 | 0 | 1 |
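Each row of Table 2 is an indicator (one-hot) encoding of the Day column in Table 1. As a minimal sketch of that encoding, assuming the metadata are available as (Target, Day) pairs, the first five arrays of Table 1 can be encoded as follows. The helper name is hypothetical, and the published analysis built its design in R rather than Python.

```python
DAYS = ["D0", "D2", "D4", "D6", "D8"]  # column order used in Table 2

# (Target, Day) pairs copied from the first five rows of Table 1.
samples = [
    ("5626686051_A", "D6"),
    ("5626686051_C", "D4"),
    ("5626686051_F", "D0"),
    ("5626686051_H", "D2"),
    ("5626686051_L", "D8"),
]


def design_matrix(samples, levels=DAYS):
    """Return {target: indicator row} with one column per time point."""
    return {
        target: [1 if day == level else 0 for level in levels]
        for target, day in samples
    }


design = design_matrix(samples)
for target, row in design.items():
    print(target, row)
```

The rows produced here match the first five rows of Table 2.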
Table 6.
Target | D0 | D2 | D4 | D6 | D8 |
---|---|---|---|---|---|
5665175063_A | 0 | 0 | 0 | 1 | 0 |
5665175063_B | 0 | 0 | 0 | 1 | 0 |
5665175063_C | 0 | 1 | 0 | 0 | 0 |
5665175063_D | 0 | 1 | 0 | 0 | 0 |
5665175063_E | 0 | 0 | 1 | 0 | 0 |
5665175063_G | 0 | 0 | 0 | 0 | 1 |
5665175063_H | 1 | 0 | 0 | 0 | 0 |
5665175063_I | 1 | 0 | 0 | 0 | 0 |
5665175063_J | 0 | 0 | 0 | 0 | 1 |
5665175063_K | 0 | 0 | 1 | 0 | 0 |
5700760018_A | 0 | 1 | 0 | 0 | 0 |
5700760018_F | 0 | 0 | 1 | 0 | 0 |
5700760021_A | 1 | 0 | 0 | 0 | 0 |
5700760021_D | 0 | 0 | 0 | 1 | 0 |
5700760021_I | 0 | 0 | 0 | 0 | 1 |
Table 3.
Target | D0 | D2 | D4 | D6 | D8 |
---|---|---|---|---|---|
5626686051_A | 0 | 0 | 0 | 1 | 0 |
5626686051_C | 0 | 0 | 1 | 0 | 0 |
5626686051_F | 1 | 0 | 0 | 0 | 0 |
5626686051_H | 0 | 1 | 0 | 0 | 0 |
5626686051_L | 0 | 0 | 0 | 0 | 1 |
5626686013_A | 1 | 0 | 0 | 0 | 0 |
5626686013_B | 0 | 0 | 0 | 1 | 0 |
5626686013_C | 0 | 0 | 1 | 0 | 0 |
5626686013_D | 0 | 0 | 1 | 0 | 0 |
5626686013_E | 0 | 0 | 0 | 1 | 0 |
5626686013_F | 0 | 1 | 0 | 0 | 0 |
5626686013_H | 0 | 1 | 0 | 0 | 0 |
5626686013_I | 0 | 0 | 0 | 0 | 1 |
5626686013_K | 1 | 0 | 0 | 0 | 0 |
5626686013_L | 0 | 0 | 0 | 0 | 1 |
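Table 3 differs from Table 2 only in that arrays 5626686013_G and 5626686013_J are absent. In Table 1 these two arrays share a prep date and day with 5626686013_A and 5626686013_D, respectively. The sketch below reproduces that reduction for the 2010Nov arrays by keeping the first array seen for each (Prepdate, Day) combination. The keep-first rule is an assumption that happens to match Table 3; the text does not say how the retained replicate was chosen, and the function name is hypothetical.

```python
# (Target, Prepdate, Day) copied from the 2010Nov rows of Table 1.
arrays = [
    ("5626686013_A", "2010Nov", "D0"),
    ("5626686013_B", "2010Nov", "D6"),
    ("5626686013_D", "2010Nov", "D4"),
    ("5626686013_G", "2010Nov", "D0"),
    ("5626686013_H", "2010Nov", "D2"),
    ("5626686013_J", "2010Nov", "D4"),
    ("5626686013_L", "2010Nov", "D8"),
]


def drop_technical_replicates(arrays):
    """Keep the first array for each (prep date, day) combination."""
    seen = set()
    kept = []
    for target, prepdate, day in arrays:
        key = (prepdate, day)
        if key in seen:
            continue  # assumed technical replicate of an earlier array
        seen.add(key)
        kept.append(target)
    return kept


kept_targets = drop_technical_replicates(arrays)
print(kept_targets)
```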
The following are the supplementary data related to this article.
References
- 1. Marconett C.N., Zhou B., Rieger M.E., Selamat S.A., Dubourd M., Fang X., Lynch S.K., Stueve T.R., Siegmund K.D., Berman B.P., Borok Z., Laird-Offringa I.A. Integrated transcriptomic and epigenomic analysis of primary human lung epithelial cell differentiation. PLoS Genet. 2013;9:e1003513. doi: 10.1371/journal.pgen.1003513.
- 2. Ballard P.L., Lee J.W., Fang X., Chapin C., Allen L., Segal M.R., Fischer H., Illek B., Gonzales L.W., Kolla V. Regulated gene expression in cultured type II cells of adult human lung. Am. J. Physiol. Lung Cell. Mol. Physiol. 2010;299:L36–L50. doi: 10.1152/ajplung.00427.2009.
- 3. Du P., Kibbe W.A., Lin S.M. Lumi: a pipeline for processing Illumina microarray. Bioinformatics. 2008;24:1547–1548. doi: 10.1093/bioinformatics/btn224.
- 4. Barbosa-Morais N.L., Dunning M.J., Samarajiwa S.A., Darot J.F., Ritchie M.E., Lynch A.G., Tavaré S. A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data. Nucleic Acids Res. 2010;38:e17. doi: 10.1093/nar/gkp942.
- 5. Huber W., von Heydebreck A., Sültmann H., Poustka A., Vingron M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002;18:S96–S104. doi: 10.1093/bioinformatics/18.suppl_1.s96.
- 6. Smyth G.K. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 2004;3:Article3. doi: 10.2202/1544-6115.1027.
- 7. Benjamini Y., Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 1995;57:289–300.
- 8. Huang D.W., Sherman B.T., Lempicki R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009;4:44–57. doi: 10.1038/nprot.2008.211.
- 9. Huang D.W., Sherman B.T., Lempicki R.A. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37:1–13. doi: 10.1093/nar/gkn923.
- 10. Mouse Genome Informatics (MGI) Web, the Jackson Laboratory, Bar Harbor, Maine. http://www.informatics.jax.org [retrieved 2/2011]