Abstract
Hematopoietic differentiation is strictly regulated by a complex network of transcription factors that are controlled by ligands binding to cell surface receptors. Disruptions of the intricate sequences of transcriptional activation and suppression of multiple genes cause hematological diseases, such as leukemias, myelodysplastic syndromes, or myeloproliferative syndromes. From a clinical standpoint, deciphering the pattern of gene expression during hematopoiesis may help unravel disease-specific mechanisms in hematopoietic malignancies. Herein, we describe a human in vitro hematopoietic model system where lineage-specific differentiation of CD34+ cells was accomplished using specific cytokines. Microarray and RNAseq-based whole transcriptome and exome analysis was performed on the differentiated erythropoietic, granulopoietic, and megakaryopoietic cells to delineate changes in expression of whole transcripts and exons. Analysis on the Human 1.0 ST exon arrays indicated differential expression of 172 genes (P < 0.0000001) and significant alternative splicing of 86 genes during differentiation. Pathway analysis identified these genes to be involved in Rac/RhoA signaling, Wnt/β-catenin signaling, and alanine/aspartate metabolism. Comparison of the microarray data to next generation RNAseq analysis during erythroid differentiation demonstrated a high degree of correlation in gene (R = 0.72) and exon (R = 0.62) expression. Our data provide a molecular portrait of events that regulate differentiation of hematopoietic cells. Knowledge of molecular processes by which the cells acquire their cell-specific fate would be beneficial in developing cell-based therapies for human diseases.
Keywords: hematopoiesis, differentiation, microarrays, gene expression, splicing, qPCR, RNAseq
Recent developments in stem cell biology have generated much excitement about the potential for regenerative medicine and cell-based therapies in a variety of clinical applications, such as treating Parkinson's disease, leukemia, and spinal cord injuries (23). Crucial to the success of these applications is a detailed understanding of how the cells remain stem cells and the cues that they require to differentiate and commit themselves to specific cell fates. Given that hematopoietic stem cells are a particularly interesting class of stem cells and a well-characterized cellular differentiation system, a number of studies have recently been undertaken to decipher their genetic program both in culture and in vivo (22).
Hematopoiesis is the process by which all the different cell lineages that form the blood and immune system are generated from a common pluripotent stem cell (28, 35, 36). A complex interplay between the intrinsic genetic processes of hematopoietic cells and their environment, including the effects of specific cytokines such as interleukins and granulocyte/monocyte stimulating factors, determines whether stem cells, lineage-specified progenitors, and mature blood cells self-renew, remain quiescent, proliferate, differentiate, or undergo apoptosis (1, 6–8, 11, 16–18, 24, 34, 37). Catastrophic consequences of aberrant hematopoiesis have been described in diseases such as leukemia and lymphoma (10, 12, 19, 21, 25, 26, 29). Hence, understanding the nature of the hematopoietic stem cells, as well as the molecular process by which these cells acquire their specific cell fate, is crucial for understanding disease pathogenesis and for the success of cell-based therapies.
As the phenotype of any given cell is ultimately the product of the genes, it is critical to identify gene expression patterns during lineage-specific differentiation. It is also believed that an important source of diversity in the transcriptome of differentiated cells is due to the splicing process in multiexon genes. Alternative splicing is thought to regulate differentiation through coordination of gene networks where each network coordinates a different cell function (4). Current studies of hematopoiesis have mostly examined gene level changes in expression and have not been extended to understand alternative splicing events.
The advancement of genomic technologies has now provided us with a platform, in the form of GeneChip Human Exon 1.0 ST arrays and massively parallel sequencing, to study the exome profile of cells. The current study exploits these advances in microarray and sequencing technologies to elucidate the global changes in whole transcriptome and exome expression during ex vivo lineage-specific hematopoietic cell differentiation.
MATERIALS AND METHODS
Human Granulocyte Colony-stimulating Factor-mobilized CD34+ Peripheral Blood Cells
Human granulocyte colony-stimulating factor (G-CSF)-mobilized CD34+ peripheral blood cells (CD34+ PBCs) were collected by apheresis from healthy volunteers who were given 5 days of G-CSF (10 μg/kg per day). After CD34+ antigen-mediated selection with immunomagnetic beads (ISOLEX300i system; Baxter Healthcare, Deerfield, IL), purified CD34+ PBCs were collected and preserved in liquid nitrogen until use.
Suspension Cultures and Growth Factors
CD34+ PBCs were cultured in X-VIVO10 (BioWhittaker, Walkersville, MD) supplemented with 1% human serum albumin. At least 1 × 10^6 CD34+ cells were assayed in six-well plates and incubated at 37°C and 5% CO2 in a fully humidified atmosphere in air. Lineage-specific differentiation was induced on CD34+ cells using the method described in Komor et al. (15). In brief, lineage-specific differentiation was induced by the addition of the following growth factors (R&D Systems, Wiesbaden-Nordenstadt, Germany): stem cell factor (SCF, 50 ng/ml), Flt3-ligand (50 ng/ml), IL-3 (10 ng/ml), and erythropoietin (10 U/ml) for erythropoietic differentiation; SCF (50 ng/ml), Flt3-ligand (50 ng/ml), IL-3 (10 ng/ml), G-CSF, and granulocyte/macrophage colony-stimulating factor (each, 10 ng/ml) for granulopoietic differentiation; and SCF (50 ng/ml), Flt3-ligand (50 ng/ml), and thrombopoietin (20 ng/ml) for megakaryopoietic differentiation. Differentiated cells were harvested on day 11 and purified with immunomagnetic beads on the MACS system, using CD71+ microbeads for erythropoietic (E) cells, CD15+ microbeads for granulopoietic (G) cells, and CD61+ microbeads for megakaryopoietic (M) cells (15).
Flow Cytometry
Uninduced CD34+ and cultured cells were characterized by dual-color immunofluorescence using a BD FACSCanto flow cytometer. E cells were characterized by staining with an anti-CD71 FITC antibody. Megakaryocytic cells were determined with an anti-CD61 FITC antibody, and G cells were analyzed with anti-CD15 FITC antibodies. Isotype-matched nonspecific antibodies were used as controls. Analysis gates were set to exclude dead cells and debris, with 10,000 viable cells analyzed per sample. Morphology of the flow-sorted differentiated cells was examined by Diff-Quik stain (Dade Behring, Newark, DE) following the manufacturer's protocols. Micrographs were taken with a Leica microscope at ×10 objective lens.
RNA Isolation
Total RNA was extracted using RNeasy mini kit (Qiagen, Valencia, CA) following the manufacturer's directions. Genomic DNA was removed by using the gDNA eliminator spin columns. The concentration of the isolated RNA was determined using the Nanodrop ND-100 spectrophotometer (Nanodrop Technologies, Wilmington, DE). Quality and integrity of the total RNA isolated were assessed on the Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA).
Target Preparation and Hybridization to Human Exon 1.0 ST Arrays
Labeling of samples for hybridization to the Exon array was performed according to the Affymetrix GeneChip Whole Transcript Sense Target Labeling Assay. One microgram of total RNA was subjected to ribosomal RNA reduction following Invitrogen's RiboMinus reduction procedure. Double-stranded cDNA was synthesized using random hexamers incorporating a T7 promoter sequence to generate cRNA by in vitro transcription. Labeled single-stranded cDNA in the sense direction was generated from cRNA and used for array hybridization. Hybridization was performed at 45°C overnight, followed by washing and staining using the FS450 Fluidics station. Scanning was carried out using the 7G GCS3000 scanner.
Microarray Data Collection and Annotation
Exon-level core robust multiarray average (RMA)-sketch intensity values for each of 12 chips were collected using Affymetrix Expression Console (EC) Software (Affymetrix, Santa Clara, CA). The 287,329 core probe-sets from EC were mapped to genomic address for human genome build hg18 using the Affymetrix probe-set selection region (downloaded from EC, March 2008). These probe-sets were then mapped to genes and exons using a reduced RefSeq gene model (Hg19, downloaded from http://genome.ucsc.edu). A gene model consists of all possible exons for a set of RefSeq transcripts for a single gene. The gene model is necessary to allow for testing of alternative splicing using a statistical approach, described below. Specifically, the reduced gene model consists of a set of nonoverlapping intervals that represent the possible exons within a gene. When more than one core probe-set maps to the same RefSeq exon interval, the average RMA intensity is calculated. This process yields 174,792 distinct exons grouped into 17,457 genes.
The data have been deposited in the Gene Expression Omnibus (GSE29989).
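The probe-set averaging step described above can be pictured with a short sketch. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the toy identifiers (`ps1`, `GENE_A`) are hypothetical and are not taken from Expression Console or from the pipeline used in the study.

```python
from collections import defaultdict

def average_by_exon(probeset_values, probeset_to_exon):
    """Average the RMA intensities of probe-sets mapping to the same exon interval.

    probeset_values: dict, probe-set id -> RMA intensity (log scale)
    probeset_to_exon: dict, probe-set id -> (gene, exon interval id)
    Probe-sets without a mapping are ignored (an assumption of this sketch).
    """
    grouped = defaultdict(list)
    for ps_id, value in probeset_values.items():
        exon = probeset_to_exon.get(ps_id)
        if exon is not None:
            grouped[exon].append(value)
    return {exon: sum(vals) / len(vals) for exon, vals in grouped.items()}

# Hypothetical toy input: ps1 and ps2 fall in the same exon interval,
# ps4 has no RefSeq exon mapping.
values = {"ps1": 6.0, "ps2": 8.0, "ps3": 5.0, "ps4": 9.0}
mapping = {"ps1": ("GENE_A", 1), "ps2": ("GENE_A", 1), "ps3": ("GENE_A", 2)}
exon_values = average_by_exon(values, mapping)
```

Keying the result by (gene, exon interval) keeps the exon-within-gene grouping that the later splicing test relies on.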
Principal Component Analysis
Data from 12 samples, after reduction to the RefSeq modeled genes and exons, were subjected to a principal component analysis (PCA) for detection of outliers. The four conditions (CD34+, E cells, granulocytes, megakaryocytes) were studied in triplicate. The bi-plot of PC1 vs. PC2, which together accounted for 70% of the variability, revealed four distinct clusters corresponding to the four conditions.
Analysis of Exon Arrays by ExonSVD
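The Affymetrix Exon arrays offer the capability for alternative splicing detection, unlike older 3′ IVT arrays, because each exon of the gene is separately probed. The ExonANOVA, a three-factor, nested mixed-effect ANOVA, has been the widely used model for the analysis of exon arrays (http://www.partek.com, http://www.partek.com/html/products/pdf/Brochures/). The ExonANOVA model fits the following formula to the data:
$$y_{ijk} = \mu + A_i + \beta_{j(i)} + C_k + (AC)_{ik} + \varepsilon_{ijk}$$
where $y_{ijk}$ is the log-scale intensity of exon $k$ in sample $j$ of treatment $i$, $\mu$ is the overall mean, and $\varepsilon_{ijk}$ is the error term.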
The two fixed factors are the treatment effect (A) and exon effect (C). The random factor (β) is the sample within treatment effect. The fixed interaction between the treatment and exon effects (AC) determines if an alternative splicing event has occurred. This model makes a strong assumption of additivity, which may fail if the probe-set “sensitivity” is low and the data fall in the background range, or if the values become “saturated” at the high end of the data range. These two situations describe a dead or unresponsive probe-set, respectively, but there may be other causes of nonadditivity. Probe-sets exhibiting this behavior can strongly affect the alternative splicing signal and thus can falsely increase the number of detected alternative splicing events. Other assumptions required by the design of the ANOVA model are independence of cases and variance homogeneity. These last two assumptions are generally accounted for by proper experimental design and data transformations.
We introduce a new analytical method, termed the ExonSVD model (30), which aims to overcome the limitations inherent in the ExonANOVA. The ExonSVD model
has three new parameters compared with the ExonANOVA. The A′ and AC terms of the ExonANOVA model have been combined and then split into the three new parameters where A′ represents the treatment effect, D represents the probe sensitivity, and E represents the residual or deviation from the simple model. The E factor is tested for significance to determine if alternative splicing has occurred. This new model alleviates the need to detect and remove dead and unresponsive probe-sets. The P (P–E) values for the E term were generated by numerical simulation, fitting rational polynomials to the sum-of-squares curves, as estimates of degrees of freedom, so that one can generate a statistic with approximately an F distribution, and thereby obtain approximate P values.
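As a rough picture of how a treatment effect, a probe sensitivity, and a residual can be separated, the sketch below fits a rank-1 product to a small treatment-by-exon table and returns what is left over. It rests on assumptions that are not stated in the text: that the treatment and sensitivity terms enter as a product A′_i × D_k, that exon main effects have already been removed from the table, and that alternating least squares is an acceptable stand-in for an SVD. The function name and toy numbers are hypothetical, and the P value simulation described above is not reproduced.

```python
def rank1_fit(table, n_iter=50):
    """Fit table[i][k] ~ a[i] * d[k] by alternating least squares.

    table: rows = treatments, columns = exons of one gene.
    Returns (a, d, resid) with resid[i][k] = table[i][k] - a[i] * d[k].
    Only the product a[i] * d[k] is identified, not the scale of a or d.
    """
    n_rows, n_cols = len(table), len(table[0])
    d = [1.0] * n_cols   # starting guess: equal sensitivity for all exons
    a = [0.0] * n_rows
    for _ in range(n_iter):
        dd = sum(x * x for x in d)
        a = [sum(table[i][k] * d[k] for k in range(n_cols)) / dd
             for i in range(n_rows)]
        aa = sum(x * x for x in a)
        if aa == 0.0:    # nothing to fit (for example an all-zero table)
            break
        d = [sum(table[i][k] * a[i] for i in range(n_rows)) / aa
             for k in range(n_cols)]
    resid = [[table[i][k] - a[i] * d[k] for k in range(n_cols)]
             for i in range(n_rows)]
    return a, d, resid

# Hypothetical toy gene: three treatments, three exons, middle exon half as sensitive.
simple = [[0.0, 0.0, 0.0], [1.0, 0.5, 1.0], [2.0, 1.0, 2.0]]
_, _, resid_simple = rank1_fit(simple)

# Same gene with one exon deviating in one treatment (a splicing-like signal).
spliced = [[0.0, 0.0, 0.0], [1.0, 0.5, 1.0], [2.0, 2.5, 2.0]]
_, _, resid_spliced = rank1_fit(spliced)
```

In the first table a less sensitive exon is absorbed by `d` and leaves essentially no residual, whereas the second table cannot be written as a single product, so a residual remains.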
Selection of Lineage-specific and Differentially Expressed Genes
Using the ExonSVD, differentially expressed genes were detected using the P value for the A′ parameter (differential expression) and a fold-change cutoff. P values ≤10^−7 and fold changes greater than two for any of the three comparisons (E vs. CD34+, M vs. CD34+, G vs. CD34+) were required. Additionally, a gene was defined to be “specifically upregulated” in a particular cell type (E, M, or G) compared with CD34+ if it was significant at P < 10^−4, and the largest observed change was in that cell type, and the largest change was greater than twofold, and neither of the other cell types showed an upward change >1.4-fold.
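The selection rules above can be written down compactly. The sketch below is a hedged illustration, not the code used in the study: the function names are hypothetical, fold changes are assumed to be supplied on a log2 scale, and "largest observed change" is read as the largest absolute change, which must be upward for a gene to count as specifically upregulated.

```python
import math

def is_differentially_expressed(p_value, log2_fc, p_cut=1e-7, fold_cut=2.0):
    """True if P <= p_cut and any lineage changes more than fold_cut-fold vs. CD34+."""
    limit = math.log2(fold_cut)
    return p_value <= p_cut and any(abs(v) > limit for v in log2_fc.values())

def specifically_upregulated_in(p_value, log2_fc, p_cut=1e-4,
                                fold_cut=2.0, other_cut=1.4):
    """Return the lineage in which a gene is specifically upregulated, else None."""
    if p_value >= p_cut:
        return None
    # Lineage with the largest absolute change relative to CD34+.
    top = max(log2_fc, key=lambda lineage: abs(log2_fc[lineage]))
    if log2_fc[top] <= math.log2(fold_cut):
        return None   # largest change is not an increase of more than fold_cut
    others_quiet = all(v <= math.log2(other_cut)
                       for lineage, v in log2_fc.items() if lineage != top)
    return top if others_quiet else None

# Fold changes as listed in Table 2 (assumed log2); the P values are hypothetical.
slc4a1 = {"E": 3.42, "G": -0.22, "M": -0.56}
gypa = {"E": 3.71, "G": 0.76, "M": 0.00}
slc4a1_call = specifically_upregulated_in(1e-9, slc4a1)
gypa_call = specifically_upregulated_in(1e-9, gypa)
```

Under these assumptions a gene can pass the differential expression filter and still fail the stricter specificity rule, because a second lineage shows an upward change above 1.4-fold.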
Identification of Alternatively Spliced Genes
Alternatively spliced genes were defined as having a P-E (P value for alternative splicing) of ≤10^−7 and a largest deviation of twofold or more, where the largest deviation is $\max_{i,k} |E_{i,k} - E_{\mathrm{CD34},k}|$, $i$ ranges over E, G, and M, and $k$ ranges over the exons in a gene.
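The largest-deviation statistic can be sketched as follows. The function name and the toy E-term values are hypothetical, and the values are assumed to be on a log2 scale, so that "twofold or more" corresponds to a deviation of at least 1.

```python
def largest_deviation(e_terms, reference="CD34"):
    """Maximum over lineages i and exons k of |E[i][k] - E[reference][k]|.

    e_terms: dict, cell type -> list of E-term values, one per exon of the gene.
    """
    ref = e_terms[reference]
    return max(abs(value - ref[k])
               for lineage, row in e_terms.items() if lineage != reference
               for k, value in enumerate(row))

# Hypothetical E terms for a three-exon gene.
e_terms = {
    "CD34": [0.0, 0.1, -0.1],
    "E":    [0.2, 1.4, -0.1],
    "G":    [0.0, 0.0, 0.1],
    "M":    [-0.3, 0.2, 0.0],
}
deviation = largest_deviation(e_terms)
passes_twofold = deviation >= 1.0   # log2 scale assumed
```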
Next Generation Sequencing RNA Transcript Analysis on SOLiD Sequencer
Library preparation.
Total RNA (2 μg) from CD34+ and E cells was depleted of rRNA and enzymatically fragmented using 1 unit of RNase III (Ambion) by incubation at 37°C for 10 min. The fragmented RNA was size selected using the flashPAGE fractionator (Ambion) to collect RNA fragments ranging in size from ∼50 to 150 nucleotides in length. The RNA fragments were then ligated to adaptors, converted into cDNA, and amplified by 15 cycles of PCR using the SOLiD RNA Expression Kit (Ambion). The PCR reactions were purified using the Qiagen MinElute PCR purification kit and separated on a native Novex 6% TBE polyacrylamide gel (Invitrogen). PCR products ranging in size from ∼150 to 200 bp (corresponding to RNA fragment insert sizes of ∼60–110 nucleotides) were cut out of the gel, and the products were eluted overnight and precipitated. The gel-purified material was quantitated by Nanodrop and prepared for emulsion PCR and sequencing on an Applied Biosystems SOLiD sequencer (version 3.0; Applied Biosystems, Carlsbad, CA).
Sequence read processing and alignment.
mRNA-Seq sequencing reads were analyzed using Applied Biosystems' whole transcriptome software tools (http://www.solidsoftwaretools.com/). Reads of length 50 bases originating from each sample were first aligned to the human genome (US National Center for Biotechnology Information Build 36.3) using Applied Biosystems' SOLiD System Analysis Pipeline Tool (Bioscope). The aligned reads were mapped to RefSeq exons downloaded from the UCSC Genome Browser (Human genome build hg18), and reads per kilobase per million reads (RPKM) values were obtained for each RefSeq exon. The RPKM calculations were adapted at the exon, gene, and transcript level and can be thought of as a normalized expression level (E) based on the read count across the region of interest.
The calculation for RPKM is as follows: RPKM = 10^9 × C/(N × L), where C = number of reads falling in the region of interest (exon, gene, or transcript), N = total number of mapped reads, and L = length of that region in bases.
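A worked example of the RPKM formula, as a minimal sketch: the function name and the read counts are hypothetical, and the region length is assumed to be given in bases, which is what the factor of 10^9 (10^3 for kilobases times 10^6 for millions of reads) implies.

```python
def rpkm(read_count, total_mapped_reads, region_length_bases):
    """Reads per kilobase of region per million mapped reads."""
    return 1e9 * read_count / (total_mapped_reads * region_length_bases)

# Hypothetical exon: 500 reads on a 250-base exon in a library of 20 million mapped reads.
exon_rpkm = rpkm(500, 20_000_000, 250)
```

Here 500 reads over 0.25 kb is 2,000 reads per kilobase, and dividing by 20 (million mapped reads) gives an RPKM of 100.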
RPKM values for each of two CD34+ (CD34+_1, CD34+_2) and two erythropoietic (E_1, E_2) samples were obtained for further analysis with the microarray data.
RNAseq and Microarray Data Correlation Analysis
To compare the RNAseq data to the microarray data at the exon level, it was necessary to match each RefSeq exon RPKM value with the corresponding reduced model exon microarray RMA value (see Microarray Data Collection and Annotation). A total of 149,765 RNAseq-microarray value pairs were obtained. Correlations between the two methods on exon-level fold changes were computed. Gene-level fold changes were determined by averaging the log fold-change values across exons of each gene.
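The comparison can be sketched in a few lines. This is an illustration only: a Pearson correlation is assumed (the text does not name the coefficient), the helper names are hypothetical, and the toy fold changes are invented for the example.

```python
import math
from collections import defaultdict

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def gene_level_log_fc(exon_log_fc, exon_to_gene):
    """Average exon-level log fold changes within each gene."""
    per_gene = defaultdict(list)
    for exon, value in exon_log_fc.items():
        per_gene[exon_to_gene[exon]].append(value)
    return {gene: sum(vals) / len(vals) for gene, vals in per_gene.items()}

# Hypothetical matched exon-level log fold changes (E vs. CD34+) from the two platforms.
array_fc = {"e1": 1.0, "e2": 3.0, "e3": -1.0}
seq_fc = {"e1": 2.0, "e2": 6.0, "e3": -2.0}
exon_to_gene = {"e1": "GENE_A", "e2": "GENE_A", "e3": "GENE_B"}

exons = sorted(array_fc)   # same exon order for both platforms
exon_r = pearson_r([array_fc[e] for e in exons], [seq_fc[e] for e in exons])
array_gene_fc = gene_level_log_fc(array_fc, exon_to_gene)
```

Matching on a shared exon key before correlating mirrors the pairing of RPKM and RMA values described above; gene-level values are then simple averages of the paired exon values.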
QPCR Analysis to Determine Gene Expression Values
First-strand cDNA was synthesized using 500 ng of RNA and random primers in a 20 μl reverse transcriptase reaction mixture using Invitrogen's Superscript cDNA synthesis kit (Invitrogen, Carlsbad, CA) following the manufacturer's directions. Quantitative real-time PCR assays were carried out with the use of gene-specific double fluorescently labeled probes in a 7900 Sequence Detector (PE Applied Biosystems, Norwalk, CT). In brief, PCR amplification was performed in a 384-well plate with a 20-μl reaction mixture containing 300 nM of each primer, 200 nM probe, 200 nM dNTP in 1× real-time PCR buffer and passive reference (ROX) fluorochrome. The thermal cycling conditions were 2 min at 50°C and 10 min at 95°C, followed by 40 cycles of 15 s denaturation at 95°C and 1 min annealing and extension at 60°C. Samples were analyzed in duplicate, and the CT values obtained were normalized to the housekeeping gene β-actin. The comparative CT (ΔΔCT) method (33), which compares the differences in CT values between groups, was used to obtain the relative fold change in gene expression between the four groups in the study.
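The comparative CT calculation can be illustrated as follows. The sketch assumes perfect amplification efficiency (a doubling per cycle), which is the usual assumption behind the 2^(−ΔΔCT) form; the function name and the CT values are hypothetical.

```python
def ddct_fold_change(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative fold change by the comparative CT method, 2 ** -(ddCT)."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCT, differentiated cells
    delta_control = ct_target_control - ct_ref_control   # dCT, CD34+ cells
    return 2.0 ** -(delta_sample - delta_control)

# Hypothetical CT values: target gene vs. beta-actin in E cells and in CD34+ cells.
fold = ddct_fold_change(ct_target_sample=22.0, ct_ref_sample=18.0,
                        ct_target_control=35.0, ct_ref_control=18.0)
```

With these numbers ΔCT is 4 in E cells and 17 in CD34+ cells, so ΔΔCT is −13 and the fold change is 2^13.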
RESULTS
In Vitro Differentiation of CD34+ Cells
Lineage-specific cells were characterized by immunophenotyping as shown in Fig. 1A. The dot plots illustrate the expression of lineage-specific cell surface markers that define the undifferentiated CD34+ and differentiated E, G, and M cells, respectively. Figure 1A, top left, shows the unstained CD34+ cells on day 11 as a control for panels at top right and bottom with erythroid, granulocytic, and megakaryocytic cells. After 11 days in culture, out of the stained cells, 98% stained positive for CD71 representing erythropoietic differentiation, 98% stained positive for CD15 representing granulopoiesis, and 78% stained positive for CD61 representing megakaryopoietic differentiation. Flow-sorted differentiated cells were analyzed for cell morphology using the Diff-Quik stain set. Figure 1B shows the micrographs taken with a Leica microscope at ×10 objective lens, thereby confirming the purity of the cells selected for transcriptome analysis.
QPCR Analysis to Determine the Expression Levels of a Few Lineage-specific Genes
QPCR was carried out on undifferentiated CD34+ and purified differentiated cells (E, M, and G) to determine the expression levels of genes known to be specific for each cell type. Table 1 shows the fold change in expression of selected genes: ANK1 and GYPA (specific for E cells), FCGR2A and MIP2 (specific for G cells), and GPIBA and PF4 (specific for M cells). We observed a significant increase in the expression of genes that are specific for each cell type, confirming the cytokine-induced differentiation of CD34+ cells to their specific lineages. The E group of cells showed a significant increase in the expression of ANK1 and GYPA compared with the CD34+ cells. Genes known to be specific or abundant for the G and M groups, FCGR2A and MIP2 for the G group and GPIBA and PF4 for the M group, likewise showed a significant increase in expression in the respective cell types, confirming the lineage-specific differentiation.
Table 1.
Symbol | Gene Title | FC ± SD |
---|---|---|
E-specific genes | E vs. CD34 | |
ANK1 | ankyrin 1 | 8,192.44 ± 204.81 |
GYPA | glycophorin A | 78,612.44 ± 2,358.88 |
G-specific genes | G vs. CD34 | |
FCGR2A | Fc fragment of IgG, low affinity receptor 2A | 4.79 ± 0.09 |
MIP2 | macrophage inflammatory protein 2 | 14.24 ± 0.23 |
M-specific genes | M vs. CD34 | |
GPIBA | glycoprotein 1B-alpha | 19.16 ± 0.38 |
PF4 | platelet factor 4 | 44.46 ± 0.88 |
Fold changes (FC) were calculated by comparing the ΔCT values of differentiated group to the ΔCT values of undifferentiated CD34. Values are given as means ± SD (n = 3) in each group.
Abbreviations: E, erythropoietic cells; G, granulopoietic cells; M, megakaryopoietic cells.
Microarray-based Confirmation of Differentiated Cells by Lineage-specific Gene Expression
PCA was first used to identify outliers within groups and to characterize the stem cells and differentiated cells based on their expression profile on microarrays. Principal component 1 (PC1) vs. principal component 2 (PC2) as plotted in Fig. 2A showed a clear segregation and clustering between the four groups of cells. We observed 60% variability in PC1, which accounted for the difference between the M cells and the other three cell types.
In an effort to confirm established lineage-specific differentiation of CD34+ cells, we examined the microarray-based expression of differentiated cell-specific genes (Fig. 2B). As depicted in the figure, when CD34+ cells differentiated into erythrocytic cells, a 2- to 14-fold (log) increase was observed in the expression of ankyrin, glycophorin A, and tropomodulin 1 (ANK1, GYPA, TMOD1), which code for erythrocyte membrane proteins. Similarly, the granulocyte-specific genes CD300A and MIP2 and the genes for low-affinity immunoglobulin gamma Fc region receptors II-A and II-B (FCGR2A and FCGR2B) showed the highest fold change in G cells. Finally, the megakaryocyte-specific genes prostaglandin-endoperoxide synthase 1, platelet factor 4, glycoprotein 1B alpha, and serine (or cysteine) proteinase inhibitor, clade E (PTGS1, PF4, GPIBA, SERPINE1) had the highest expression in megakaryocytic cells, confirming the lineage-specific differentiation of CD34+ cells.
Gene Level Analysis to Identify Differentially Expressed Transcripts During Hematopoietic Differentiation
Global gene-level analysis of transcripts comparing CD34+ progenitor cells to the three differentiated lineages identified several genes with significant differences in expression between each of the differentiated cell groups. As mentioned in materials and methods, statistical filters were applied to find a total of 172 differentially expressed genes between any of the three cell types compared with CD34+ progenitor cells. Selecting at a P-A ≤ 1e−7 and a twofold or more change, the following lists were generated: MvsCD34, 148 genes; GvsCD34, 52 genes; and EvsCD34, 70 genes. These lists show some overlap across the groups. Hierarchical clustering showed the differential expression pattern of these 172 transcripts across the lineage-specific cell types, as depicted in Fig. 3. Comparison of these significantly altered transcripts by Venn diagram (Fig. 4A) shows that 26 transcripts are differentially expressed in all three differentiated groups. Furthermore, 24 transcripts are common to both the M and E groups, 14 are common to G and E, and eight are common to G and M, consistent with the per-lineage totals above. Six of the top-ranking genes form a unique signature for erythrocyte differentiation: SLC4A1, ALAS2, EPPB9, HBB, SELENBP1, and GYPA. The unique granulocyte differentiation signature has only four genes, C20orf112, C19orf10, PRG2, and CEACAM6, while alterations in 90 transcripts appear to be unique for megakaryopoiesis. Table 2 lists all 172 transcripts that are common and unique to each cell lineage.
Table 2.
Gene Symbol | Gene Title | E vs. CD34 | G vs. CD34 | M vs. CD34 | Groups |
---|---|---|---|---|---|
ARHGEF17 | Rho guanine nucleotide exchange factor (GEF) 17 | −2.38 | −1.79 | −2.41 | E, G, M |
CD177 | CD177 molecule | −2.79 | −2.71 | −2.73 | E, G, M |
CDT1 | chromatin licensing and DNA replication factor 1 | 1.50 | 1.46 | −1.07 | E, G, M |
CMTM5 | CKLF-like MARVEL transmembrane domain containing 5 | 3.30 | 1.64 | 5.79 | E, G, M |
DNTT | deoxynucleotidyltransferase, terminal | −3.94 | −3.88 | −4.12 | E, G, M |
FCGR3A | Fc fragment of IgG, low affinity IIIa, receptor (CD16a) | −2.71 | −2.84 | −2.77 | E, G, M |
GPSM1 | G protein signaling modulator 1 (AGS3-like, C. elegans) | −1.40 | −1.33 | −2.52 | E, G, M |
HBG1 | hemoglobin, gamma A | 6.37 | 2.73 | 1.77 | E, G, M |
HBG2 | hemoglobin, gamma G | 7.06 | 3.84 | 2.62 | E, G, M |
ID1 | inhibitor of DNA binding 1 | −2.11 | −2.10 | −3.18 | E, G, M |
ITGB3 | integrin, beta 3 (platelet glycoprotein IIIa, antigen CD61) | 3.49 | 1.58 | 7.26 | E, G, M |
KCNA3 | potassium voltage-gated channel, member 3 | −2.03 | −2.05 | 2.46 | E, G, M |
LAT | linker for activation of T cells | 3.50 | 2.01 | 4.58 | E, G, M |
LSP1 | lymphocyte-specific protein 1 | −2.46 | −1.72 | −3.89 | E, G, M |
MDK | midkine (neurite growth-promoting factor 2) | −2.06 | −1.98 | −2.65 | E, G, M |
MGC29671 | | −1.12 | 1.49 | −2.69 | E, G, M |
MGC35402 | | −3.03 | −2.93 | −2.81 | E, G, M |
MN1 | meningioma (disrupted in balanced translocation) 1 | −2.93 | −2.78 | −3.15 | E, G, M |
NPTX2 | neuronal pentraxin II | −1.55 | −1.94 | −1.67 | E, G, M |
P2RY11 | purinergic receptor P2Y, G-protein coupled, 11 | −1.47 | −1.47 | −2.87 | E, G, M |
PCDH21 | protocadherin 21 | 1.78 | 1.52 | 4.51 | E, G, M |
RNASE2 | ribonuclease, RNase A family, 2 | 3.01 | 4.56 | −1.54 | E, G, M |
S100A12 | S100 calcium binding protein A12 | −4.64 | −4.53 | −4.92 | E, G, M |
SH2D3C | SH2 domain containing 3C | −1.70 | −1.19 | −2.50 | E, G, M |
SLC11A1 | solute carrier family 11 | −2.45 | −2.55 | −2.75 | E, G, M |
UBE2C | ubiquitin-conjugating enzyme E2C | 3.66 | 3.23 | 3.19 | E, G, M |
SLC4A1 | solute carrier family 4, anion exchanger, member 1 | 3.42 | −0.22 | −0.56 | E |
ALAS2 | aminolevulinate, delta-, synthase 2 | 3.26 | 0.41 | −0.32 | E |
EPPB9 | | 1.06 | 0.94 | −0.31 | E |
HBB | hemoglobin, beta | 3.66 | 0.76 | −0.20 | E |
SELENBP1 | selenium binding protein 1 | 2.13 | 0.36 | −0.09 | E |
GYPA | glycophorin A (MNS blood group) | 3.71 | 0.76 | 0.00 | E |
PRKG1 | protein kinase, cGMP-dependent, type I | −1.84 | −2.15 | −0.82 | E, G |
CA1 | carbonic anhydrase I | 4.39 | 1.40 | −0.48 | E, G |
H2AFX | H2A histone family, member X | 1.59 | 1.18 | −0.43 | E, G |
ERAF | erythroid associated factor | 4.51 | 1.07 | −0.08 | E, G |
ELA2 | | 1.19 | 5.17 | −0.05 | E, G |
PRTN3 | proteinase 3 | 1.17 | 4.69 | 0.13 | E, G |
PRG1 | | 1.14 | 2.65 | 0.49 | E, G |
IL9R | interleukin 9 receptor | 3.00 | 2.24 | 0.69 | E, G |
NFKBIA | nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, alpha | −2.18 | −1.52 | 0.75 | E, G |
MYL4 | myosin, light chain 4, alkali; atrial, embryonic | 3.30 | 1.12 | 0.80 | E, G |
CST7 | cystatin F (leukocystatin) | 1.46 | 4.08 | 0.83 | E, G |
EBP | emopamil binding protein (sterol isomerase) | 2.08 | 2.44 | 0.94 | E, G |
H3/o | | 4.21 | 3.69 | 0.99 | E, G |
HIST2H3C | histone cluster 2, H3c | 4.21 | 3.69 | 0.99 | E, G |
LITAF | lipopolysaccharide-induced TNF factor | −1.21 | −0.11 | −4.20 | E, M |
LAIR1 | leukocyte-associated immunoglobulin-like receptor 1 | −1.56 | 0.15 | −3.97 | E, M |
TNFRSF1B | tumor necrosis factor receptor superfamily, member 1B | −1.67 | −0.80 | −3.65 | E, M |
CECR1 | cat eye syndrome chromosome region, candidate 1 | −1.95 | 0.51 | −3.19 | E, M |
PRAM1 | PML-RARA regulated adaptor molecule 1 | −1.85 | 0.07 | −2.69 | E, M |
KLF4 | Kruppel-like factor 4 (gut) | −1.72 | −0.86 | −2.28 | E, M |
PTPRCAP | protein tyrosine phosphatase, receptor type, C-associated protein | −1.41 | −0.67 | −2.02 | E, M |
BEX2 | brain expressed X-linked 2 | −1.25 | 0.18 | −1.98 | E, M |
HDC | histidine decarboxylase | 1.45 | 0.93 | −1.98 | E, M |
HBA2 | hemoglobin, alpha 2 | 2.16 | −0.78 | −1.58 | E, M |
HBA1 | hemoglobin, alpha 1 | 2.04 | −0.70 | −1.41 | E, M |
HIST1H2BL | histone cluster 1, H2bl | 1.04 | 0.64 | −1.09 | E, M |
FAM89A | family with sequence similarity 89, member A | 1.41 | 0.59 | −1.07 | E, M |
PPP1R3B | protein phosphatase 1, regulatory (inhibitor) subunit 3B | −1.35 | −0.67 | 1.37 | E, M |
PPP1R14A | protein phosphatase 1, regulatory (inhibitor) subunit 14A | 2.00 | 0.59 | 1.89 | E, M |
GATA1 | GATA binding protein 1 (globin transcription factor 1) | 2.31 | 0.62 | 2.30 | E, M |
CCND3 | cyclin D3 | 1.04 | 0.35 | 2.33 | E, M |
THBS1 | thrombospondin 1 | 1.44 | −0.74 | 4.65 | E, M |
PF4 | platelet factor 4 | 1.28 | 0.16 | 4.72 | E, M |
ITGA2B | integrin, alpha 2b | 2.83 | 0.99 | 5.14 | E, M |
ARHGAP6 | Rho GTPase activating protein 6 | 2.11 | 0.63 | 5.22 | E, M |
CTTN | cortactin | 1.52 | 0.36 | 5.29 | E, M |
PPBP | proplatelet basic protein (chemokine (C-X-C motif) ligand 7) | 1.24 | 0.07 | 5.67 | E, M |
RGS6 | regulator of G protein signaling 6 | 1.61 | 0.38 | 6.27 | E, M |
C20orf112 | chromosome 20 open reading frame 112 | −0.75 | −1.32 | −0.70 | G |
C19orf10 | chromosome 19 open reading frame 10 | 0.91 | 1.85 | −0.44 | G |
PRG2 | proteoglycan 2, bone marrow | 0.40 | 2.15 | −0.24 | G |
CEACAM6 | carcinoembryonic antigen-related cell adhesion molecule 6 | 0.42 | 4.00 | −0.11 | G |
FLJ11151 | | −0.89 | −1.09 | −2.82 | G, M |
NKG7 | natural killer cell group 7 sequence | −0.81 | 2.01 | −2.31 | G, M |
IGLL1 | immunoglobulin lambda-like polypeptide 1 | 0.75 | 1.99 | −2.13 | G, M |
CEBPA | CCAAT/enhancer binding protein (C/EBP), alpha | 0.22 | 1.25 | −1.59 | G, M |
TRIM58 | tripartite motif-containing 58 | 0.05 | −1.06 | 1.87 | G, M |
SEPT5 | septin 5 | −0.55 | −1.83 | 3.14 | G, M |
ADCY6 | adenylate cyclase 6 | −0.24 | −1.24 | 3.70 | G, M |
KIAA0513 | KIAA0513 | −0.01 | −1.11 | 4.23 | G, M |
ZFP36L2 | zinc finger protein 36, C3H type-like 2 | −0.97 | −0.87 | −3.22 | M |
PRSSL1 | protease, serine-like 1 | −0.77 | 0.76 | −3.18 | M |
SPI1 | spleen focus forming virus (SFFV) proviral integration oncogene | −0.95 | 0.20 | −2.68 | M |
TRIM14 | tripartite motif-containing 14 | −0.53 | 0.25 | −2.61 | M |
UNQ501 | | −0.35 | 0.44 | −2.34 | M |
GSTP1 | glutathione S-transferase pi 1 | −0.18 | 0.42 | −2.19 | M |
RAB3D | RAB3D, member RAS oncogene family | −0.90 | −0.32 | −2.09 | M |
TIMM13 | translocase of inner mitochondrial membrane 13 | 0.15 | −0.04 | −2.02 | M |
ITPA | inosine triphosphatase | 0.24 | 0.14 | −1.96 | M |
IFITM1 | interferon induced transmembrane protein 1 (9–27) | 0.35 | 0.14 | −1.96 | M |
NHP2L1 | NHP2 nonhistone chromosome protein 2-like 1 | 0.41 | 0.49 | −1.93 | M |
ASB13 | ankyrin repeat and SOCS box-containing 13 | 0.27 | 0.73 | −1.89 | M |
SYNGR1 | synaptogyrin 1 | 0.42 | 0.53 | −1.83 | M |
ARL6IP4 | ADP-ribosylation-like factor 6 interacting protein 4 | 0.26 | 0.25 | −1.72 | M |
POFUT1 | protein O-fucosyltransferase 1 | 0.55 | 0.68 | −1.66 | M |
ZXDB | zinc finger, X-linked, duplicated B | −0.88 | −0.79 | −1.59 | M |
CEBPD | CCAAT/enhancer binding protein (C/EBP), delta | −0.59 | 0.89 | −1.54 | M |
UBE2L6 | ubiquitin-conjugating enzyme E2L 6 | −0.16 | −0.32 | −1.49 | M |
SLIC1 | | −0.83 | −0.41 | −1.46 | M |
TSNARE1 | t-SNARE domain containing 1 | −0.49 | −0.25 | −1.42 | M |
ECHS1 | enoyl Coenzyme A hydratase, short chain, 1, mitochondrial | 0.54 | 0.59 | −1.41 | M |
RPUSD1 | RNA pseudouridylate synthase domain containing 1 | 0.08 | 0.12 | −1.37 | M |
MRPS16 | mitochondrial ribosomal protein S16 | 0.45 | 0.50 | −1.35 | M |
CUEDC2 | CUE domain containing 2 | 0.06 | 0.21 | −1.28 | M |
PDXP | pyridoxal (pyridoxine, vitamin B6) phosphatase | 0.54 | 0.24 | −1.24 | M |
CKAP4 | cytoskeleton-associated protein 4 | 0.78 | 0.83 | −1.22 | M |
STK32C | serine/threonine kinase 32C | 0.56 | 0.24 | −1.21 | M |
ENDOG | endonuclease G | 0.05 | −0.09 | −1.20 | M |
NDUFB7 | NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 7 | 0.36 | 0.38 | −1.16 | M |
PALM | paralemmin | −0.14 | 0.26 | −1.15 | M |
DCI | dodecenoyl-Coenzyme A delta isomerase | 0.54 | 0.49 | −1.13 | M |
ATP5D | ATP synthase, H+ transporting, delta subunit | 0.47 | 0.44 | −1.11 | M |
ABHD14A | abhydrolase domain containing 14A | 0.36 | 0.73 | −1.08 | M |
LOC92154 | | −0.16 | −0.42 | 1.35 | M |
SNN | stannin | −0.70 | −0.47 | 1.37 | M |
BCL2L2 | BCL2-like 2 | −0.32 | −0.34 | 1.37 | M |
LIPC | lipase, hepatic | 0.16 | 0.19 | 1.52 | M |
IRS2 | insulin receptor substrate 2 | −0.48 | 0.44 | 1.60 | M |
GLYATL1 | glycine-N-acyltransferase-like 1 | −0.14 | −0.03 | 1.66 | M |
RGS10 | regulator of G protein signaling 10 | 0.52 | 0.27 | 1.97 | M |
SRC | v-src sarcoma (Schmidt-Ruppin A-2) viral oncogene homolog (avian) | −0.74 | −0.85 | 2.00 | M |
CDKN1A | cyclin-dependent kinase inhibitor 1A (p21, Cip1) | −0.63 | −0.38 | 2.32 | M |
TULP2 | tubby like protein 2 | 0.21 | 0.31 | 2.37 | M |
ADRA2A | adrenergic, alpha-2A, receptor | 0.73 | −0.47 | 2.37 | M |
MGC13057 | | 0.94 | −0.31 | 2.38 | M |
TAL1 | T-cell acute lymphocytic leukemia 1 | 0.62 | −0.78 | 2.38 | M |
C20orf175 | | 0.45 | −0.89 | 2.48 | M |
HTR2A | 5-hydroxytryptamine (serotonin) receptor 2A | 0.10 | −0.08 | 2.52 | M |
NRGN | neurogranin (protein kinase C substrate, RC3) | 0.57 | 0.18 | 2.53 | M |
BCL2L1 | BCL2-like 1 | 0.69 | −0.34 | 2.53 | M |
SEC14L5 | SEC14-like 5 (S. cerevisiae) | 0.15 | 0.16 | 2.57 | M |
TPM1 | tropomyosin 1 (alpha) | 0.78 | −0.56 | 2.71 | M |
C20orf32 | | 0.14 | −0.21 | 2.75 | M |
C6orf188 | | −0.07 | 0.08 | 2.83 | M |
HEXIM1 | hexamethylene bis-acetamide inducible 1 | 0.07 | −0.12 | 2.95 | M |
ENDOD1 | endonuclease domain containing 1 | 0.65 | −0.57 | 3.00 | M |
GP5 | glycoprotein V (platelet) | 0.22 | −0.01 | 3.05 | M |
PLA2G4C | phospholipase A2, group IVC (cytosolic, calcium-independent) | 0.03 | 0.22 | 3.10 | M |
TUBA8 | tubulin, alpha 8 | 0.97 | 0.13 | 3.10 | M |
IL21R | interleukin 21 receptor | 0.90 | 0.08 | 3.16 | M |
PANX1 | pannexin 1 | 0.57 | 0.06 | 3.22 | M |
C5orf4 | chromosome 5 open reading frame 4 | 0.58 | −0.45 | 3.29 | M |
PF4V1 | platelet factor 4 variant 1 | 0.91 | 0.19 | 3.36 | M |
SERPINE1 | serpin peptidase inhibitor, clade E | 0.46 | 0.02 | 3.38 | M |
CXCL3 | chemokine (C-X-C motif) ligand 3 | 0.16 | 0.16 | 3.52 | M |
SLC22A17 | solute carrier family 22, member 17 | 0.69 | −0.54 | 3.59 | M |
PDGFB | platelet-derived growth factor beta polypeptide | 0.05 | 0.29 | 3.63 | M |
F2RL2 | coagulation factor II (thrombin) receptor-like 2 | 0.41 | −0.28 | 3.65 | M |
MFAP3L | microfibrillar-associated protein 3-like | 0.16 | −0.08 | 3.66 | M |
SDPR | serum deprivation response | 0.72 | −0.40 | 3.66 | M |
EGLN3 | egl nine homolog 3 (C. elegans) | 0.88 | 0.14 | 3.86 | M |
CD9 | CD9 molecule | −0.09 | −0.73 | 3.87 | M |
GABRE | gamma-aminobutyric acid (GABA) A receptor, epsilon | 0.24 | 0.36 | 3.96 | M |
ESAM | endothelial cell adhesion molecule | 0.81 | −0.44 | 4.22 | M |
CCR4 | chemokine (C-C motif) receptor 4 | 0.85 | 0.26 | 4.32 | M |
GNAZ | guanine nucleotide binding protein (G protein), alpha z polypeptide | 0.37 | −0.01 | 4.52 | M |
GP9 | glycoprotein IX (platelet) | 0.81 | −0.06 | 4.55 | M |
LRRC32 | leucine rich repeat containing 32 | 0.06 | −0.49 | 4.56 | M |
C6orf21 | | 0.82 | 0.10 | 4.82 | M |
TRPC6 | transient receptor potential cation channel, subfamily C, member 6 | 0.88 | 0.12 | 4.84 | M |
TSPAN9 | tetraspanin 9 | 0.50 | 0.26 | 4.88 | M |
PCSK6 | proprotein convertase subtilisin/kexin type 6 | 0.30 | 0.19 | 4.97 | M |
PDE5A | phosphodiesterase 5A, cGMP-specific | 0.10 | −0.06 | 5.10 | M |
ABCC3 | ATP-binding cassette, subfamily C (CFTR/MRP), member 3 | −0.08 | −0.30 | 5.25 | M |
ITGB5 | integrin, beta 5 | 0.47 | 0.06 | 5.40 | M |
HPSE | heparanase | 0.62 | −0.32 | 5.69 | M |
CD226 | CD226 molecule | 0.10 | −0.24 | 5.75 | M |
VWF | von Willebrand factor | 0.44 | −0.47 | 5.98 | M |
PDE3A | phosphodiesterase 3A, cGMP-inhibited | 0.63 | −0.11 | 6.18 | M |
CLEC1B | C-type lectin domain family 1, member B | 0.66 | 0.67 | 6.62 | M |
FC values are given as log 2 means from an n of 3 per group. P < 0.0000001.
Identification of Differentially Regulated Genes During Lineage-specific Differentiation to E, G, and M Cells
In an effort to identify transcripts that are altered during cell differentiation, we specifically examined “upregulated” genes (see materials and methods) compared with CD34+ cells. We identified three patterns (Fig. 4B) of upregulation. Erythropoietic differentiation resulted in upregulation of 32 transcripts. When CD34+ cells differentiated into G cells, 30 genes were found to be significantly upregulated, and megakaryopoietic differentiation resulted in a much larger modulation of genes, as identified by the upregulation of 269 transcripts. A partial list of these genes is tabulated in Tables 3, 4, and 5. (A complete list of these altered genes is shown as Supplemental Table S1.)
Table 3.
Gene Symbol | Gene Title | FC - E vs. CD34 |
---|---|---|
TMEM56 | transmembrane protein 56 | 3.60 |
SLC4A1 | solute carrier family 4, anion exchanger, member 1 | 3.42 |
ALAS2 | aminolevulinate, delta-, synthase 2 | 3.26 |
IRF6 | interferon regulatory factor 6 | 2.51 |
DNAJA4 | DnaJ (Hsp40) homolog, subfamily A, member 4 | 2.49 |
ADD2 | adducin 2 (beta) | 2.47 |
GYPB | glycophorin B (MNS blood group) | 2.36 |
GALNT5 | UDP-N-acetyl-alpha-d-galactosamine:polypeptide5 | 2.21 |
HBA2 | hemoglobin, alpha 2 | 2.16 |
SELENBP1 | selenium binding protein 1 | 2.13 |
HBA1 | hemoglobin, alpha 1 | 2.04 |
ATP7B | ATPase, Cu2+ transporting, beta polypeptide | 1.95 |
SLC14A1 | solute carrier family 14 (urea transporter), member 1 | 1.85 |
LBH | limb bud and heart development homolog | 1.83 |
GYPE | glycophorin E | 1.81 |
SEC14L4 | SEC14-like 4 | 1.73 |
ACSBG1 | acyl-CoA synthetase bubblegum family member 1 | 1.67 |
PCSK9 | proprotein convertase subtilisin/kexin type 9 | 1.60 |
TGM2 | transglutaminase 2 | 1.55 |
SLC25A21 | solute carrier family 25 | 1.42 |
ITGB4 | integrin, beta 4 | 1.30 |
LEF1 | lymphoid enhancer-binding factor 1 | 1.29 |
PISD | phosphatidylserine decarboxylase | 1.18 |
NMNAT3 | nicotinamide nucleotide adenylyltransferase 3 | 1.14 |
ICAM4 | intercellular adhesion molecule 4 | 1.08 |
TRIB2 | tribbles homolog 2 | 1.05 |
S100A4 | S100 calcium binding protein A4 | 1.04 |
TRIM29 | tripartite motif-containing 29 | 1.03 |
ZYX | zyxin | −1.07 |
BTG1 | B-cell translocation gene 1, anti-proliferative | −1.08 |
ZBTB34 | zinc finger and BTB domain containing 34 | −1.10 |
TCEA2 | transcription elongation factor A (SII), 2 | −1.18 |
CXCR4 | chemokine (C-X-C motif) receptor 4 | −1.21 |
TSC22D3 | TSC22 domain family, member 3 | −1.37 |
List of genes highly specific for E cells. FC values are given as log 2 means for an n of 3 per group.
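Because Tables 3 to 5 report fold changes on a log 2 scale, it can help to convert them back to linear ratios when comparing them with other measurements. The following is a minimal sketch of that conversion, using a few values copied from Table 3; the function name `to_linear` is hypothetical and not part of the study's analysis pipeline. As a rough consistency check, 2^3.42 is about 10.7, which is close to the linear exon array value of 10.68 listed for SLC4A1 (E vs. CD34) in Table 6.

```python
# Convert log2 fold changes (as reported in Table 3) to linear fold changes.
# The gene values below are copied from Table 3; to_linear is a hypothetical helper.
log2_fc = {"TMEM56": 3.60, "SLC4A1": 3.42, "ALAS2": 3.26, "TSC22D3": -1.37}


def to_linear(log2_value):
    """Return the linear ratio (differentiated / CD34+) for a log2 fold change."""
    return 2.0 ** log2_value


linear_fc = {gene: to_linear(value) for gene, value in log2_fc.items()}

for gene, ratio in linear_fc.items():
    direction = "up" if ratio > 1 else "down"
    print(f"{gene}: log2 FC {log2_fc[gene]:+.2f} -> {ratio:.2f}-fold ({direction})")
```

A negative log 2 value corresponds to a linear ratio below 1, so a log 2 FC of −1 means the transcript is at half its CD34+ level.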
Table 4.
Gene Symbol | Gene Title | FC - G vs. CD34 |
---|---|---|
BPI | bactericidal/permeability-increasing protein | 5.23 |
CEACAM6 | carcinoembryonic antigen-related cell adhesion molecule 6 | 4.00 |
CD24 | CD24 molecule | 3.67 |
CLEC5A | C-type lectin domain family 5, member A | 2.88 |
IL2RA | interleukin 2 receptor, alpha | 2.58 |
BEX1 | brain expressed, X-linked 1 | 2.42 |
FAM107B | family with sequence similarity 107, member B | 2.17 |
PRG2 | plasticity-related gene 2 | 2.15 |
PRG2 | proteoglycan 2, bone marrow | 2.15 |
ELOVL3 | elongation of very long chain fatty acids | 2.10 |
SLPI | secretory leukocyte peptidase inhibitor | 2.04 |
NKG7 | natural killer cell group 7 sequence | 2.01 |
GLT25D2 | glycosyltransferase 25 domain containing 2 | 1.84 |
RHOU | ras homolog gene family, member U | 1.75 |
P2RY2 | purinergic receptor P2Y, G protein-coupled, 2 | 1.75 |
GPR97 | G protein-coupled receptor 97 | 1.72 |
JDP2 | Jun dimerization protein 2 | 1.68 |
RBP4 | retinol binding protein 4, plasma | 1.68 |
SERPINB2 | serpin peptidase inhibitor, clade B, member 2 | 1.63 |
CD48 | CD48 molecule | 1.56 |
CDA | cytidine deaminase | 1.47 |
HIP1 | huntingtin interacting protein 1 | 1.47 |
LPO | lactoperoxidase | 1.45 |
ANPEP | alanyl (membrane) aminopeptidase | 1.37 |
CILP2 | cartilage intermediate layer protein 2 | 1.36 |
LYZ | lysozyme (renal amyloidosis) | 1.31 |
NT5DC3 | 5′-nucleotidase domain containing 3 | 1.27 |
CLEC11A | C-type lectin domain family 11, member A | 1.23 |
SERPINB8 | serpin peptidase inhibitor, clade B, member 8 | 1.21 |
FGFR1 | fibroblast growth factor receptor 1 | 1.20 |
PADI2 | peptidyl arginine deiminase, type II | 1.16 |
DERL3 | Der1-like domain family, member 3 | 1.16 |
MFAP4 | microfibrillar-associated protein 4 | 1.12 |
FUT4 | fucosyltransferase 4 alpha (1,3) fucosyltransferase | 1.05 |
GRAMD1B | GRAM domain containing 1B | 1.01 |
CBFA2T3 | core-binding factor, runt domain, alpha subunit 2; | −1.03 |
ARHGEF12 | Rho guanine nucleotide exchange factor (GEF) 12 | −1.10 |
List of genes highly specific for G cells. FC values are given as log 2 means for an n of 3 per group.
Table 5.
Gene Symbol | Gene Title | FC - M vs. CD34 |
---|---|---|
VWF | von Willebrand factor | 5.98 |
CD226 | CD226 molecule | 5.75 |
PKHD1L1 | polycystic kidney and hepatic disease 1 | 5.49 |
ITGB5 | integrin, beta 5 | 5.40 |
ABCC3 | ATP-binding cassette, subfamily C | 5.25 |
PDE5A | phosphodiesterase 5A, cGMP-specific | 5.10 |
PCSK6 | proprotein convertase subtilisin/kexin type 6 | 4.97 |
TUBB1 | tubulin, beta 1 | 4.93 |
TSPAN9 | tetraspanin 9 | 4.88 |
LRRC32 | leucine rich repeat containing 32 | 4.56 |
GNAZ | guanine nucleotide binding protein (G protein) | 4.52 |
PTPRJ | protein tyrosine phosphatase, receptor type, J | 4.47 |
MMD | monocyte to macrophage differentiation-associated | 4.45 |
EGF | epidermal growth factor (beta-urogastrone) | 4.43 |
VEGFC | vascular endothelial growth factor C | 4.42 |
MYOM1 | myomesin 1, 185 kDa | 4.37 |
KIAA0513 | KIAA0513 | 4.23 |
TMEM40 | transmembrane protein 40 | 4.13 |
GP1BA | glycoprotein Ib (platelet), alpha polypeptide | 3.99 |
GABRE | gamma-aminobutyric acid (GABA) | 3.96 |
ARHGAP21 | Rho GTPase activating protein 21 | 3.96 |
TSPAN18 | tetraspanin 18 | 3.93 |
DAB2 | disabled homolog 2, | 3.90 |
SLC9A9 | solute carrier family 9 | 3.90 |
CD9 | CD9 molecule | 3.87 |
LIPH | lipase, member H | 3.80 |
CCL5 | chemokine (C-C motif) ligand 5 | 3.77 |
LGMN | legumain | 3.77 |
MFAP3L | microfibrillar-associated protein 3-like | 3.66 |
F2RL2 | coagulation factor II (thrombin) receptor-like 2 | 3.65 |
PDGFB | platelet-derived growth factor beta polypeptide | 3.63 |
DENND2C | DENN/MADD domain containing 2C | 3.57 |
CXCL3 | chemokine (C-X-C motif) ligand 3 | 3.52 |
C1orf71 | chromosome 1 open reading frame 71 | 3.47 |
GNG11 | guanine nucleotide binding protein gamma 11 | 3.42 |
SERPINE1 | serpin peptidase inhibitor, clade E | 3.38 |
GRIK4 | glutamate receptor, ionotropic, kainate 4 | 3.37 |
OR2G3 | olfactory receptor 2, subfamily G, member 3 | 3.36 |
SLC6A4 | solute carrier family 6 | 3.36 |
DGKD | diacylglycerol kinase, delta 130 kDa | 3.35 |
AQP10 | aquaporin 10 | 3.34 |
IRAK2 | interleukin-1 receptor-associated kinase 2 | 3.31 |
ABLIM3 | actin binding LIM protein family, member 3 | 3.26 |
MRVI1 | murine retrovirus integration site 1 homolog | 3.25 |
CD40LG | CD40 ligand | 3.24 |
NEXN | nexilin (F actin binding protein) | 3.22 |
SEPT5 | septin 5 | 3.14 |
PAPSS2 | 3′-phosphoadenosine 5′-phosphosulfate synthase 2 | 3.14 |
PLA2G4C | phospholipase A2, group IVC | 3.10 |
ITPK1 | inositol 1,3,4-triphosphate 5/6 kinase | −2.00 |
UCK2 | uridine-cytidine kinase 2 | −2.01 |
ALDH1B1 | aldehyde dehydrogenase 1 family, member B1 | −2.02 |
TIMM13 | translocase of inner mitochondrial membrane 13 | −2.02 |
NSMCE1 | nonSMC element 1 homolog (S. cerevisiae) | −2.03 |
ANKH | ankylosis, progressive homolog (mouse) | −2.04 |
MRPL48 | mitochondrial ribosomal protein L48 | −2.05 |
HIST1H1D | histone cluster 1, H1d | −2.07 |
TIMM50 | translocase of inner mitochondrial membrane 50 | −2.07 |
DIP | interstitial pneumonitis, desquamative, familial | −2.07 |
NIPSNAP3A | nipsnap homolog 3A (C. elegans) | −2.07 |
UBL7 | ubiquitin-like 7 (bone marrow stromal cell-derived) | −2.08 |
EBPL | emopamil binding protein-like | −2.09 |
DENND1A | DENN/MADD domain containing 1A | −2.09 |
NT5C3L | 5′-nucleotidase, cytosolic III-like | −2.09 |
COMMD9 | COMM domain containing 9 | −2.10 |
NDUFA12 | NADH dehydrogenase 1 alpha subcomplex, 12 | −2.11 |
POLE4 | polymerase (DNA-directed), epsilon 4 | −2.12 |
LIG3 | ligase III, DNA, ATP-dependent | −2.12 |
KIAA0664 | KIAA0664 | −2.13 |
IGLL1 | immunoglobulin lambda-like polypeptide 1 | −2.13 |
MRPL16 | mitochondrial ribosomal protein L16 | −2.14 |
DFFA | DNA fragmentation factor, 45 kDa, alpha | −2.16 |
RPUSD4 | RNA pseudouridylate synthase domain 4 | −2.16 |
TFAP4 | transcription factor AP-4 | −2.19 |
GSTP1 | glutathione S-transferase pi 1 | −2.19 |
GLT25D1 | glycosyltransferase 25 domain containing 1 | −2.19 |
C9orf123 | chromosome 9 open reading frame 123 | −2.20 |
MRPL38 | mitochondrial ribosomal protein L38 | −2.21 |
MFHAS1 | malignant fibrous histiocytoma sequence 1 | −2.22 |
MRPL21 | mitochondrial ribosomal protein L21 | −2.22 |
MYBBP1A | MYB binding protein (P160) 1a | −2.23 |
MPO | myeloperoxidase | −2.23 |
NANS | N-acetylneuraminic acid synthase | −2.25 |
ALDH3A2 | aldehyde dehydrogenase 3 family, member A2 | −2.27 |
SLC35F2 | solute carrier family 35, member F2 | −2.29 |
ZNF259 | zinc finger protein 259 | −2.29 |
ALOX5AP | arachidonate 5-lipoxygenase-activating protein | −2.33 |
TGFBR1 | transforming growth factor, beta receptor 1 | −2.34 |
SPN | sialophorin | −2.35 |
FBL | fibrillarin | −2.37 |
POLR1E | polymerase (RNA) I polypeptide E, 53 kDa | −2.37 |
SFXN4 | sideroflexin 4 | −2.38 |
GYPC | glycophorin C | −2.40 |
CAT | catalase | −2.40 |
COMMD2 | COMM domain containing 2 | −2.61 |
IGFBP7 | insulin-like growth factor binding protein 7 | −2.78 |
RCN1 | reticulocalbin 1, EF-hand calcium binding domain | −2.81 |
SPG21 | spastic paraplegia 21 | −2.86 |
B3GNT5 | UDP beta-1,3-N-acetylglucosaminyltransferase 5 | −3.21 |
IL2RG | interleukin 2 receptor, gamma | −3.29 |
OSM | oncostatin M | −3.34 |
HINT1 | histidine triad nucleotide binding protein 1 | −3.42 |
ANXA2 | annexin A2 | −3.58 |
List of genes highly specific for M cells. FC values are given as log 2 means for an n of 3 per group.
Gene Ontology Analysis
Gene lists specific for each differentiated group were subjected to gene ontology analysis to determine their molecular functions. This analysis yielded seven major functional groups: binding, catalytic, signal transducer, transcription regulator, structural molecule, enzyme regulator, and transporter activity, as represented in Fig. 5. The percentage of genes with binding function was comparable among the three hematopoietic lineages. Genes with catalytic activity were found at a higher percentage in the E group; the percentage of genes with signal transduction function was highest in the G group; and genes with functions such as enzyme activity and transporter activity were found at a higher percentage in the M group.
Validation of Microarray Data by QPCR
To confirm the expression data obtained from the exon microarray studies, we analyzed the expression of a selection of significantly upregulated genes in each group by real-time PCR. Table 6 shows the fold changes in the expression of transcripts between E, G, and M vs. CD34+ from microarray and QPCR analyses. A high degree of correlation was observed between the microarray data and the QPCR data.
Table 6.
Gene Symbol | Gene Title | Method | E vs. CD34 | G vs. CD34 | M vs. CD34 | P Value |
---|---|---|---|---|---|---|
SLC4A1 | Solute carrier family 4, anion exchanger, member 1 | exon array | 10.68 | 0.86 | 0.68 | |
SLC4A1 | | QPCR | 117.26 | 1.14 | 0.03 | < 0.001 |
PRG2 | Proteoglycan 2, bone marrow | exon array | 1.32 | 4.45 | 0.85 | |
PRG2 | | QPCR | 1.28 | 49.70 | 1.23 | < 0.05 |
CLEC1B | C-type lectin domain family 1, member B | exon array | 1.58 | 1.59 | 98.22 | |
CLEC1B | | QPCR | 19.08 | 5.21 | 222.86 | < 0.05 |
PDE3A | Phosphodiesterase 3A, cGMP-inhibited | exon array | 1.55 | 0.93 | 72.41 | |
PDE3A | | QPCR | 1.49 | 0.52 | 6.96 | < 0.05 |
SPI1 | Spleen focus forming virus proviral integration oncogene | exon array | 0.52 | 1.14 | 0.16 | |
SPI1 | | QPCR | 0.38 | 1.21 | 0.42 | < 0.001 |
TIMM13 | Translocase of inner mitochondrial membrane 13 | exon array | 1.11 | 0.97 | 0.25 | |
TIMM13 | | QPCR | 1.14 | 0.51 | 0.13 | < 0.005 |
VWF | von Willebrand factor | exon array | 1.35 | 0.72 | 63.13 | |
VWF | | QPCR | 0.26 | 0.04 | 29.16 | < 0.05 |
ELA2 | Elastase 2 | exon array | 2.28 | 35.91 | 0.97 | |
ELA2 | | QPCR | 47.50 | 192.67 | 104.69 | < 0.001 |
ERAF | Erythroid-associated factor | exon array | 22.84 | 2.09 | 0.95 | |
ERAF | | QPCR | 315.73 | 109.89 | 0.13 | < 0.001 |
PRTN3 | Proteinase 3 | exon array | 2.25 | 25.79 | 1.09 | |
PRTN3 | | QPCR | 33.59 | 84.44 | 57.28 | < 0.001 |
ARHGEF17 | Rho guanine nucleotide exchange factor (GEF) 17 | exon array | 0.19 | 0.29 | 0.19 | |
ARHGEF17 | | QPCR | 0.03 | 0.04 | 0.05 | < 0.001 |
FCGR3A | Fc fragment of IgG, low affinity IIIa, receptor | exon array | 0.15 | 0.14 | 0.15 | |
FCGR3A | | QPCR | 0.01 | 0.06 | 0.01 | < 0.001 |
ITGB3 | Integrin, beta 3 (platelet glycoprotein IIIa, antigen CD61) | exon array | 11.22 | 2.98 | 152.96 | |
ITGB3 | | QPCR | 3.65 | 1.01 | 42.22 | < 0.001 |
SLC11A1 | Solute carrier family 11 member 1 | exon array | 0.18 | 0.17 | 0.15 | |
SLC11A1 | | QPCR | 0.01 | 0.06 | 0.01 | < 0.05 |
LAT | Linker for activation of T lymphocytes | exon array | 11.16 | 4.01 | 23.92 | |
LAT | | QPCR | 11.47 | 6.77 | 15.24 | < 0.001 |
Validation of microarray data by QPCR. FC in expression for selected differentially expressed genes is given as the mean of 3 samples per group from exon array analysis and QPCR.
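The text does not state how the agreement between the two methods was quantified. One plausible way to summarize it, shown here only as a sketch, is a Pearson correlation of log 2 transformed fold changes. The values below are the E vs. CD34 columns copied from Table 6; the log 2 transform, the choice of Pearson correlation, and the helper name `pearson` are assumptions made for illustration, not the authors' stated procedure.

```python
import math

# (exon array FC, QPCR FC) for E vs. CD34 on a linear scale, copied from Table 6.
table6_e_vs_cd34 = {
    "SLC4A1": (10.68, 117.26),
    "PRG2": (1.32, 1.28),
    "CLEC1B": (1.58, 19.08),
    "PDE3A": (1.55, 1.49),
    "SPI1": (0.52, 0.38),
    "TIMM13": (1.11, 1.14),
    "VWF": (1.35, 0.26),
    "ELA2": (2.28, 47.50),
    "ERAF": (22.84, 315.73),
    "PRTN3": (2.25, 33.59),
    "ARHGEF17": (0.19, 0.03),
    "FCGR3A": (0.15, 0.01),
    "ITGB3": (11.22, 3.65),
    "SLC11A1": (0.18, 0.01),
    "LAT": (11.16, 11.47),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumption: compare on a log2 scale so that large fold changes do not dominate.
array_log2 = [math.log2(array_fc) for array_fc, _ in table6_e_vs_cd34.values()]
qpcr_log2 = [math.log2(qpcr_fc) for _, qpcr_fc in table6_e_vs_cd34.values()]

r = pearson(array_log2, qpcr_log2)
print(f"Pearson r on log2 scale (E vs. CD34, n={len(array_log2)}): {r:.2f}")
```

The same calculation can be repeated for the G and M columns. QPCR generally has a wider dynamic range than arrays, so agreement in direction and rank is often more informative than agreement in absolute magnitude.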
Alternative Splicing Events During Lineage-specific Hematopoietic Differentiation
We identified 86 alternatively spliced genes among the differentiated cells, as shown in Tables 7, 8, and 9. An alternatively spliced gene based on this analysis is a gene with at least one exon whose behavior deviates by a certain magnitude relative to the other exons within the gene. In comparing across the three lineages, we observed 31 alternatively spliced genes in common, including CAST, EPHX1, FANCA, CLCN7, PDE4D, and PDE4DIP. E and G groups showed an overlap of 14 alternatively spliced genes, while 11 genes were common to E and M groups. G and M groups showed the lowest overlap, with five spliced genes. Splicing was unique to 13 genes (ALDOA, ASS1, ATP5H, CLEC12A, ELMO1, NUMA1, PLK4CA, PTPN6, PTPRA, SLC39A4, SMG1, UBE2D3, XYLT1) in the M group, two genes (CD97, RGR4275) in the G group, and 10 genes (CMTM5, COL6A2, CYB5R3, DNMT3B, LPHN1, PHC2, SLC2A14, SORL1, STAB1, TPM1) in the E group. These gene lists also include the previously identified alternatively spliced genes CAST, CLMN, EPHX1, GAB1, and VWF.
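The overlap counts in the paragraph above can be checked with simple arithmetic. The sketch below assumes the seven categories are mutually exclusive (that is, the pairwise overlaps exclude the 31 genes shared by all three lineages); that reading is an assumption, though it is the one under which the counts sum to the reported total of 86. The per-lineage totals it prints are implied by those counts and are not figures reported by the authors.

```python
# Counts of alternatively spliced genes per lineage combination, as stated in the text.
# Assumption: the categories are mutually exclusive.
counts = {
    ("E", "G", "M"): 31,
    ("E", "G"): 14,
    ("E", "M"): 11,
    ("G", "M"): 5,
    ("M",): 13,
    ("G",): 2,
    ("E",): 10,
}

total = sum(counts.values())

# Implied number of spliced genes per lineage (derived here, not reported in the text).
per_lineage = {
    lineage: sum(n for groups, n in counts.items() if lineage in groups)
    for lineage in ("E", "G", "M")
}

print("total alternatively spliced genes:", total)
print("implied per-lineage totals:", per_lineage)
```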
Table 7.
Gene Symbol | Gene Title | E vs. CD34 |
---|---|---|
GAB1 | GRB2-associated binding protein 1 | 2.73 |
AKR1C2 | aldo-keto reductase family 1, member C2 | 2.62 |
COL24A1 | collagen, type XXIV, alpha 1 | 2.62 |
SLC2A14 | solute carrier family 2 member 14 | 2.46 |
EPHX1 | epoxide hydrolase 1, microsomal | 2.43 |
PARL | presenilin associated, rhomboid-like | 2.34 |
GRAP2 | GRB2-related adaptor protein 2 | 2.20 |
ORC4L | origin recognition complex, subunit 4-like | 2.18 |
RBBP6 | retinoblastoma binding protein 6 | 2.01 |
TYRO3 | TYRO3 protein tyrosine kinase | 1.82 |
CTNND1 | catenin (cadherin-associated protein), delta 1 | 1.82 |
STAB1 | stabilin 1 | 1.66 |
VWF | von Willebrand factor | 1.57 |
CLCN7 | chloride channel 7 | 1.55 |
COL6A2 | collagen, type VI, alpha 2 | 1.55 |
NID2 | nidogen 2 (osteonidogen) | 1.44 |
RPL21 | ribosomal protein L21 | 1.36 |
ATP5O | ATP synthase, H+ transporting, O subunit | 1.26 |
RARA | retinoic acid receptor, alpha | 1.24 |
RABGAP1L | RAB GTPase activating protein 1-like | 1.22 |
TSC22D3 | TSC22 domain family, member 3 | 1.22 |
SORL1 | sortilin-related receptor, L | 1.17 |
LDB1 | LIM domain binding 1 | 1.13 |
HMHA1 | histocompatibility (minor) HA-1 | 1.07 |
BOLA2 | bolA homolog 2 (E. coli) | 1.06 |
C13orf3 | chromosome 13 open reading frame 3 | 1.06 |
COL18A1 | collagen, type XVIII, alpha 1 | 0.89* |
ATP5H | ATP synthase, H+ transporting, subunit d | 0.89* |
ALDOA | aldolase A, fructose-bisphosphate | 0.81* |
SMG1 | SMG1 homolog, phosphatidylinositol 3-kinase-related kinase | 0.78* |
AKAP13 | A kinase (PRKA) anchor protein 13 | 0.75* |
UBE2D3 | ubiquitin-conjugating enzyme E2D 3 | 0.63* |
CLEC12A | C-type lectin domain family 12, member A | 0.58* |
PTPRA | protein tyrosine phosphatase, receptor type, A | 0.56* |
ELMO1 | engulfment and cell motility 1 | 0.50* |
HADHB | hydroxyacyl-Coenzyme A dehydrogenase beta subunit | 0.45* |
PTPN6 | protein tyrosine phosphatase, nonreceptor type 6 | 0.26* |
XYLT1 | xylosyltransferase I | −0.31* |
MPO | myeloperoxidase | −0.31* |
NUMA1 | nuclear mitotic apparatus protein 1 | −0.44* |
CD97 | CD97 molecule | −0.45* |
IL16 | interleukin 16 (lymphocyte chemoattractant factor) | −0.58* |
ASS1 | argininosuccinate synthetase 1 | −0.75* |
SLC39A4 | solute carrier family 39 (zinc transporter), member 4 | −0.78* |
CMTM5 | CKLF-like MARVEL transmembrane domain containing 5 | −1.04 |
ADAMTS13 | ADAM metallopeptidase with thrombospondin, motif 13 | −1.12 |
BPI | bactericidal/permeability-increasing protein | −1.16 |
PDE4D | phosphodiesterase 4D, cAMP-specific | −1.21 |
AARSD1 | alanyl-tRNA synthetase domain containing 1 | −1.23 |
CYB5R3 | cytochrome b5 reductase 3 | −1.31 |
RTN4 | reticulon 4 | −1.34 |
PHC2 | polyhomeotic homolog 2 (Drosophila) | −1.35 |
PDE4DIP | phosphodiesterase 4D interacting protein | −1.39 |
RGS3 | regulator of G-protein signaling 3 | −1.40 |
LPHN1 | latrophilin 1 | −1.42 |
UBAP2 | ubiquitin associated protein 2 | −1.47 |
TPM1 | tropomyosin 1 (alpha) | −1.53 |
WIPF1 | WAS/WASL interacting protein family, member 1 | −1.57 |
FANCA | Fanconi anemia, complementation group A | −1.58 |
ATP2A3 | ATPase, Ca2+ transporting, ubiquitous | −1.60 |
TRIM16 | tripartite motif-containing 16 | −1.63 |
EPB49 | erythrocyte membrane protein band 4.9 | −1.69 |
CR1 | complement component (3b/4b) receptor 1 | −1.73 |
SIGLEC12 | sialic acid binding Ig-like lectin 12 | −1.77 |
DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | −1.79 |
MYLK | myosin light chain kinase | −1.79 |
TMEM49 | transmembrane protein 49 | −1.84 |
C1orf113 | chromosome 1 open reading frame 113 | −1.85 |
TLE1 | transducin-like enhancer of split 1 | −1.86 |
PTK2 | PTK2 protein tyrosine kinase 2 | −1.88 |
CD44 | CD44 molecule (Indian blood group) | −1.97 |
CLMN | calmin (calponin-like, transmembrane) | −2.04 |
TPCN2 | two pore segment channel 2 | −2.11 |
SLC25A3 | solute carrier family 25 member 3 | −2.30 |
CAST | calpastatin | −2.60 |
SMARCAD1 | SWI/SNF matrix-associated regulator of chromatin | −2.85 |
TPD52L2 | tumor protein D52-like 2 | −2.90 |
List of alternatively spliced genes. The largest deviation of 2-fold over the exons in a gene for the E group is shown (log base 2 scale). The boldfaced genes are unique to the E group comparison.
*Genes alternatively spliced in one of the other 2 groups.
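The splicing criterion described in the text (at least one exon deviating from the other exons of the same gene) can be illustrated with a small sketch. The exact algorithm used in the study is not given here, so everything below is an assumption: the gene's median exon log 2 FC is used as the reference, the cutoff is 1.0 on the log 2 scale (twofold), and the exon values are invented for two hypothetical genes. The twofold cutoff is at least consistent with Tables 7 to 9, where all unstarred values have an absolute value of 1 or more.

```python
from statistics import median


def max_exon_deviation(exon_log2_fc):
    """Signed largest deviation of any exon from the gene's median log2 FC.

    Assumption: the median over exons stands in for the gene-level behavior.
    """
    center = median(exon_log2_fc)
    deviations = [value - center for value in exon_log2_fc]
    return max(deviations, key=abs)


def is_alternatively_spliced(exon_log2_fc, threshold=1.0):
    """Flag a gene when one exon deviates by at least `threshold` (log2 units)."""
    return abs(max_exon_deviation(exon_log2_fc)) >= threshold


# Hypothetical exon-level log2 fold changes (not data from the study).
gene_a = [0.2, 0.1, 0.3, -2.4, 0.2]  # one exon drops sharply
gene_b = [1.1, 0.9, 1.0, 1.2, 0.8]   # whole gene shifts together

for name, exons in (("gene_a", gene_a), ("gene_b", gene_b)):
    print(name, round(max_exon_deviation(exons), 2), is_alternatively_spliced(exons))
```

In this sketch, a gene whose exons all change together (gene_b) is treated as differentially expressed rather than alternatively spliced, because no single exon departs from the rest.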
Table 8.
Gene Symbol | Gene Title | G vs. CD34 |
---|---|---|
PARL | presenilin associated, rhomboid-like | 2.23 |
RBBP6 | retinoblastoma binding protein 6 | 2.11 |
COL24A1 | collagen, type XXIV, alpha 1 | 1.81 |
RGR4275 | | 1.75 |
GAB1 | GRB2-associated binding protein 1 | 1.68 |
TYRO3 | TYRO3 protein tyrosine kinase | 1.61 |
HADHB | hydroxyacyl-Coenzyme A dehydrogenase/beta | 1.60 |
GRAP2 | GRB2-related adaptor protein 2 | 1.56 |
ATP5O | ATP synthase, H+ transporting, | 1.50 |
PDE4D | phosphodiesterase 4D, cAMP-specific | 1.50 |
ORC4L | origin recognition complex, subunit 4-like | 1.39 |
VWF | von Willebrand factor | 1.34 |
MPO | myeloperoxidase | 1.32 |
RPL21 | ribosomal protein L21 | 1.27 |
C13orf3 | chromosome 13 open reading frame 3 | 1.27 |
EPHX1 | epoxide hydrolase 1, microsomal | 1.17 |
RARA | retinoic acid receptor, alpha | 1.15 |
AKAP13 | A kinase (PRKA) anchor protein 13 | 1.14 |
CTNND1 | catenin (cadherin-associated protein), delta 1 | 1.08 |
AKR1C2 | aldo-keto reductase family 1, member C2 | 0.96* |
STAB1 | stabilin 1 | 0.94* |
ADAMTS13 | ADAM metallopeptidase with thrombospondin 13 | 0.88* |
BOLA2 | bolA homolog 2 (E. coli) | 0.87* |
ELMO1 | engulfment and cell motility 1 | 0.84* |
ATP5H | ATP synthase, H+ transporting, mitochondrial | 0.78* |
SLC2A14 | solute carrier family 2 member 14 | 0.76* |
SMG1 | SMG1 homolog, phosphatidylinositol 3-kinase | 0.76* |
ALDOA | aldolase A, fructose-bisphosphate | 0.71* |
DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | 0.68* |
LDB1 | LIM domain binding 1 | 0.68* |
SORL1 | sortilin-related receptor, L (DLR class) | 0.63* |
PTPRA | protein tyrosine phosphatase, receptor type, A | 0.62* |
TLE1 | transducin-like enhancer of split 1 | 0.57* |
SLC39A4 | solute carrier family 39 member 4 | 0.50* |
UBE2D3 | ubiquitin-conjugating enzyme E2D 3 | 0.48* |
TSC22D3 | TSC22 domain family, member 3 | 0.44* |
NUMA1 | nuclear mitotic apparatus protein 1 | 0.43* |
CLEC12A | C-type lectin domain family 12, member A | 0.39* |
HMHA1 | histocompatibility (minor) HA-1 | 0.25* |
PTPN6 | protein tyrosine phosphatase, 6 | −0.22* |
XYLT1 | xylosyltransferase I | −0.34* |
WIPF1 | WAS/WASL interacting protein family, member 1 | −0.41* |
LPHN1 | latrophilin 1 | −0.50* |
PHC2 | polyhomeotic homolog 2 (Drosophila) | −0.61* |
TPM1 | tropomyosin 1 (alpha) | −0.66* |
CYB5R3 | cytochrome b5 reductase 3 | −0.74* |
COL6A2 | collagen, type VI, alpha 2 | −0.75* |
CMTM5 | CKLF-like MARVEL transmembrane domain 5 | −0.82* |
ASS1 | argininosuccinate synthetase 1 | −0.88* |
CLMN | calmin (calponin-like, transmembrane) | −0.91* |
BPI | bactericidal/permeability-increasing protein | −0.96* |
SIGLEC12 | sialic acid binding Ig-like lectin 12 | −1.04 |
CD97 | CD97 molecule | −1.20 |
FANCA | Fanconi anemia, complementation group A | −1.22 |
TRIM16 | tripartite motif-containing 16 | −1.30 |
CR1 | complement component (3b/4b) receptor 1 | −1.31 |
IL16 | interleukin 16 (lymphocyte chemoattractant factor) | −1.32 |
COL18A1 | collagen, type XVIII, alpha 1 | −1.39 |
TMEM49 | transmembrane protein 49 | −1.41 |
CAST | calpastatin | −1.42 |
UBAP2 | ubiquitin associated protein 2 | −1.44 |
RTN4 | reticulon 4 | −1.44 |
MYLK | myosin light chain kinase | −1.45 |
RGS3 | regulator of G protein signaling 3 | −1.48 |
PTK2 | PTK2 protein tyrosine kinase 2 | −1.56 |
EPB49 | erythrocyte membrane protein band 4.9 | −1.57 |
RABGAP1L | RAB GTPase activating protein 1-like | −1.58 |
PDE4DIP | phosphodiesterase 4D interacting protein | −1.63 |
NID2 | nidogen 2 (osteonidogen) | −1.65 |
CD44 | CD44 molecule (Indian blood group) | −1.68 |
TPCN2 | two pore segment channel 2 | −1.71 |
ATP2A3 | ATPase, Ca2+ transporting, ubiquitous | −1.84 |
AARSD1 | alanyl-tRNA synthetase domain containing 1 | −1.90 |
TPD52L2 | tumor protein D52-like 2 | −2.16 |
C1orf113 | chromosome 1 open reading frame 113 | −2.22 |
SLC25A3 | solute carrier family 25 member 3 | −2.79 |
SMARCAD1 | SWI/SNF matrix-associated regulator of chromatin | −3.44 |
List of alternatively spliced genes. The largest deviation of 2-fold over the exons in a gene for the G group is shown (log base 2 scale). The boldfaced genes are unique to the G group comparison.
*Genes alternatively spliced in one of the other 2 groups.
Table 9.
Gene Symbol | Gene Title | M vs. CD34 |
---|---|---|
PARL | presenilin associated, rhomboid-like | 3.26 |
ATP5O | ATP synthase, H+ transporting, O subunit | 2.69 |
TYRO3 | TYRO3 protein tyrosine kinase | 2.55 |
COL24A1 | collagen, type XXIV, alpha 1 | 2.27 |
CR1 | complement component (3b/4b) receptor 1 | 2.26 |
XYLT1 | xylosyltransferase I | 2.18 |
PDE4DIP | phosphodiesterase 4D interacting protein | 2.12 |
CLEC12A | C-type lectin domain family 12, member A | 2.12 |
C13orf3 | chromosome 13 open reading frame 3 | 2.10 |
CTNND1 | catenin (cadherin-associated protein), delta 1 | 1.99 |
AARSD1 | alanyl-tRNA synthetase domain containing 1 | 1.94 |
ASS1 | argininosuccinate synthetase 1 | 1.93 |
AKR1C2 | aldo-keto reductase family 1, member C2 | 1.88 |
NUMA1 | nuclear mitotic apparatus protein 1 | 1.79 |
EPHX1 | epoxide hydrolase 1, microsomal (xenobiotic) | 1.77 |
RGS3 | regulator of G protein signaling 3 | 1.71 |
BPI | bactericidal/permeability-increasing protein | 1.60 |
HMHA1 | histocompatibility (minor) HA-1 | 1.52 |
HADHB | hydroxyacyl-Coenzyme A dehydrogenase, beta | 1.50 |
COL18A1 | collagen, type XVIII, alpha 1 | 1.46 |
IL16 | interleukin 16 | 1.44 |
PTPN6 | protein tyrosine phosphatase, nonreceptor 6 | 1.40 |
MPO | myeloperoxidase | 1.37 |
RBBP6 | retinoblastoma binding protein 6 | 1.35 |
SMG1 | SMG1 homolog, phosphatidylinositol 3-kinase | 1.31 |
ORC4L | origin recognition complex, subunit 4-like | 1.29 |
ATP5H | ATP synthase, H+ transporting, subunit d | 1.28 |
TSC22D3 | TSC22 domain family, member 3 | 1.21 |
CLCN7 | chloride channel 7 | 1.15 |
BOLA2 | bolA homolog 2 (E. coli) | 1.10 |
UBE2D3 | ubiquitin-conjugating enzyme E2D 3 | 1.10 |
SIGLEC12 | sialic acid binding Ig-like lectin 12 | 1.10 |
SLC39A4 | solute carrier family 39 zinc transporter 4 | 1.10 |
RARA | retinoic acid receptor, alpha | 1.03 |
ELMO1 | engulfment and cell motility 1 | 1.01 |
LDB1 | LIM domain binding 1 | 1.01 |
VWF | von Willebrand factor | 0.95* |
RPL21 | ribosomal protein L21 | 0.87* |
DNMT3B | DNA (cytosine-5-)-methyltransferase 3 beta | 0.87* |
LPHN1 | latrophilin 1 | 0.81* |
COL6A2 | collagen, type VI, alpha 2 | 0.66* |
STAB1 | stabilin 1 | 0.66* |
GRAP2 | GRB2-related adaptor protein 2 | 0.62* |
GAB1 | GRB2-associated binding protein 1 | 0.57* |
TPM1 | tropomyosin 1 (alpha) | 0.44* |
SLC2A14 | solute carrier family 2 member 14 | 0.41* |
SORL1 | sortilin-related receptor, L (DLR class) | 0.30* |
EPB49 | erythrocyte membrane protein band 4.9 | 0.23* |
PHC2 | polyhomeotic homolog 2 | −0.09* |
CMTM5 | CKLF-like MARVEL transmembrane domain 5 | −0.10* |
TMEM49 | transmembrane protein 49 | −0.38* |
CD44 | CD44 molecule (Indian blood group) | −0.41* |
CD97 | CD97 molecule | −0.43* |
SLC25A3 | solute carrier family 25, member 3 | −0.76* |
CYB5R3 | cytochrome b5 reductase 3 | −0.78* |
RTN4 | reticulon 4 | −0.82* |
NID2 | nidogen 2 (osteonidogen) | −0.85* |
MYLK | myosin light chain kinase | −0.91* |
PTK2 | PTK2 protein tyrosine kinase 2 | −0.93* |
UBAP2 | ubiquitin associated protein 2 | −1.03 |
ALDOA | aldolase A, fructose-bisphosphate | −1.08 |
ADAMTS13 | ADAM metallopeptidase with thrombospondin | −1.10 |
AKAP13 | A kinase (PRKA) anchor protein 13 | −1.25 |
PTPRA | protein tyrosine phosphatase, receptor type, A | −1.39 |
CLMN | calmin (calponin-like, transmembrane) | −1.47 |
TPCN2 | two pore segment channel 2 | −1.51 |
FANCA | Fanconi anemia, complementation group A | −1.70 |
TPD52L2 | tumor protein D52-like 2 | −1.72 |
RABGAP1L | RAB GTPase activating protein 1-like | −1.76 |
C1orf113 | chromosome 1 open reading frame 113 | −1.78 |
ATP2A3 | ATPase, Ca2+ transporting, ubiquitous | −1.83 |
CAST | calpastatin | −1.89 |
WIPF1 | WAS/WASL interacting protein family, 1 | −1.97 |
PDE4D | phosphodiesterase 4D, cAMP-specific | −2.06 |
TRIM16 | tripartite motif-containing 16 | −2.27 |
SMARCAD1 | chromatin DEAD/H box 1 | −2.89 |
TLE1 | transducin-like enhancer of split 1 | −3.24 |
List of alternatively spliced genes. The largest deviation of 2-fold over the exons in a gene for the M group is shown (log base 2 scale). The boldfaced genes are unique to the M group comparison.
*Genes alternatively spliced in one of the other 2 groups.
Important Pathways and Networks Affected by Alternative Splicing
We analyzed these 86 lineage-specific alternatively spliced genes using Ingenuity Pathway Analysis software (http://www.ingenuity.com) for further insights into their potential functional roles during hematopoietic differentiation. Not surprisingly, these analyses showed preferential enrichment of biological processes related to hematological system development and function. The Ingenuity Pathway Analysis implicated alternative splicing in aspects of cellular development, molecular transport, cellular function and maintenance, cell-to-cell signaling and interaction, hematological system development and function, immune cell trafficking, cell-mediated immune response, antigen presentation, and protein trafficking.
The top five canonical pathways were found to be Rac and RhoA signaling, leukocyte extravascular signaling, Wnt/β-catenin signaling, alanine and aspartate metabolism, and regulation of actin-based motility by Rho.
Alternative Splicing During Erythroid Differentiation by RNA Sequence Analysis
Along with microarray analysis, we conducted a massively parallel sequencing study of the transcriptome using the SOLiD next generation sequencing platform on the CD34+ and differentiated E cells. Data generated from this study allowed for direct comparison and validation of the microarray data at both the gene and exon levels and offered the opportunity to assess the reliability of this emerging sequencing technology for transcriptome analysis. Using RPKM measurements for the level of expression of an exon or transcript, we observed a strong correlation between microarrays and RNA sequencing. At the gene level we observed a correlation of R = 0.72 (Fig. 6A), and at the exon level we observed an R value of 0.62 (Fig. 6B). We further interrogated several genes that had high correlations (R values range: 0.83–0.91) between the two platforms, including TPD52L2, GAB1, SLC25A3, and CAST. The fold-change measurements for the exons of these genes using both technologies are illustrated in Fig. 7, A–D. As shown in the figure panels, each gene tends to have at least one exon that deviates strongly from the behavior of the other exons, suggesting an alternative splicing event. Strikingly, both platforms show the same exon deviating by a large magnitude, indicating that these deviations are very likely due to alternative splicing. This is further supported by the RefSeq intron/exon isoforms plotted directly below the relative abundance for each transcript, which show the inclusion (Fig. 7, B and D) or exclusion (Fig. 7, A, C, and D) of a specific exon corresponding to an alternatively spliced RefSeq isoform. In the gene TPD52L2, as illustrated in Fig. 7A, Exon 3 shows a deviation in the microarray data only, but its magnitude is modest (less than twofold), and it does not correspond to a known splice variant. Thus it is likely due to experimental noise of the microarray. Similarly, for the gene GAB1, as shown in Fig. 7B, Exon 2 shows a deviation in the RNAseq data only, with a somewhat smaller magnitude than for Exon 8. This may be evidence for a novel transcript (exclusion in E), but as it is not confirmed by another method, it remains speculative. Thus, the RNAseq data confirm our microarray findings for statistically significant gene expression changes and alternatively spliced genes observed during erythropoietic differentiation.
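The cross-platform comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in this study: RPKM follows its standard definition (reads per kilobase of exon per million mapped reads), while the function names, the use of log2 fold changes, and the cutoff of one log2 unit for calling an exon "deviating" are hypothetical choices made for the example.

```python
from math import sqrt
from statistics import median


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def deviating_exons(log2_fc, cutoff=1.0):
    """Indices of exons whose log2 fold change differs from the gene's
    median exon fold change by more than `cutoff` (hypothetical threshold)."""
    centre = median(log2_fc)
    return [i for i, fc in enumerate(log2_fc) if abs(fc - centre) > cutoff]


def concordant_candidates(array_fc, rnaseq_fc, cutoff=1.0):
    """Exons flagged on both platforms: candidate alternative splicing events."""
    flagged_array = set(deviating_exons(array_fc, cutoff))
    flagged_rnaseq = set(deviating_exons(rnaseq_fc, cutoff))
    return sorted(flagged_array & flagged_rnaseq)
```

Requiring an exon to be flagged on both platforms mirrors the reasoning in the text: a deviation seen with only one technology is treated as possible noise or as speculative, whereas a deviation seen with both is taken as a likely splicing event.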
DISCUSSION
Blood cells share numerous functional properties (cell motility, immune functions) that distinguish them from differentiated cells of solid tissues. These specific functions are acquired during hematopoietic cell differentiation, and the differentiated cells become fully operative the moment they leave the bone marrow or other organs of the immune system for the peripheral circulation.
Previous reports suggest that the self-renewal and differentiation of hematopoietic cells are unlikely to be governed by a single factor or a few factors but rather by the integration of many signal inputs affecting gene transcription, including chromatin regulation, transcription factors, alternative splicing, and posttranslational modification. In particular, alternative splicing of exons is believed to contribute extensively to transcript and protein complexity in differentiated stem cells. While it is acknowledged that alternative splicing is a major determinant of global transcript and protein complexity, this has not been examined in functional studies of hematopoietic cell differentiation. Hence, we undertook this study to gain insight into the transcriptome and exome of the hematopoietic process by using an in vitro human hematopoietic model system that permits analysis of CD34+ differentiation into the major blood cell lineages.
Using qPCR and microarray technology, we analyzed the cytokine-induced differentiation of CD34+ progenitors to identify gene signatures and to determine the degree to which alternative splicing might regulate this process. To confirm and validate this in vitro model, we analyzed the parent and differentiated cells by flow cytometry for cell surface markers such as CD71, CD15 and CD66, specific for E, G, and M, respectively, and for lineage-specific expression of transcripts such as PTGS1, SERPINE1, GPIBA, PF4 (specific for M), FCGRA, MIP2, FCGR2B, and CD300A (specific for G), and GYPA, TMOD1, ANK1 (specific for E). These analyses showed increased expression of lineage-specific transcripts and proteins in the differentiated groups compared with CD34+ cells, indicating that the day 11 cultures are indeed differentiated cells comparable to those found in the peripheral blood. Our data also agree with other published reports on lineage-specific gene expression, confirming the identity of the cells studied here.
Initial microarray analysis detected 172 genes that are significantly modulated during differentiation. Ingenuity Pathway Analysis showed these transcripts to be involved in cell motility, immune system development, and cell signaling, as would be expected for developing hematopoietic cells. CD34+ cells, on the other hand, showed upregulated expression of genes such as ARHGEF17, CD177, GPSM1, ID1, S100A12, and SLC11A1, and these genes appear to participate in cell signaling processes. In E cells, genes such as SLC4A1, ALAS2, HBB, GYPA, and SELENBP1 are overexpressed. While this red cell signature is expected for SLC4A1 (band 3 red cell membrane protein), HBB (hemoglobin subunit), and GYPA (glycophorin A membrane protein), SELENBP1 has only been implicated in the pathogenesis of cancer and neuronal disorders. The G cells showed very few overexpressed genes upon differentiation; these included PRG2 and CEACAM6. Interestingly, differentiation of CD34+ cells into megakaryocytes produced significant modulation of a diverse set of genes, 90 in total, comprising 32 downregulated and 58 upregulated transcripts. These genes, as shown in Table 2, are associated with regulation of cell proliferation, cell cycle signaling, and immune system development. Whether individual genes within these signatures play a functional role in hematopoietic differentiation will require additional studies.
We further analyzed the dataset to generate highly selective gene lists that would predict the nature of the differentiated cell types and the parent stem cell. For a gene to be considered selectively upregulated in any one of the three comparisons (E vs. CD34+, M vs. CD34+, G vs. CD34+), it had to have P < 0.0001 and be at least twofold up in the comparison of interest, and be unchanged or upregulated <1.4-fold in the other two comparisons. Applying these criteria, we generated three clusters containing 30 upregulated transcripts highly selective for the E group, 32 for the G group, and 269 for the M group. Gene ontology analysis classified these highly selective genes into seven major functions: binding, catalytic, signal transducer, transcription regulator, structural molecule, enzyme regulator, and transporter activity. However, it should be noted that the functional classification of a gene may be redundant, and a few genes could be classified into several functional categories.
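The selectivity criteria above amount to a simple filter, sketched here in Python. This is an illustrative sketch rather than the analysis code used in the study: the function names, the data layout, and the example genes and values are hypothetical, and fold changes are assumed to be expressed as ratios relative to CD34+ cells. The text does not say how genes downregulated in the other two comparisons were handled; the sketch assumes that any fold change below 1.4 passes.

```python
def is_selective(comparisons, lineage, p_cutoff=1e-4,
                 min_fold=2.0, max_other_fold=1.4):
    """Return True if a gene is selectively upregulated in `lineage`.

    `comparisons` maps each lineage ("E", "G", "M") to a tuple
    (p_value, fold_change) for that lineage versus CD34+ cells, with
    fold_change expressed as a ratio (2.0 means twofold up).
    """
    p_value, fold = comparisons[lineage]
    if p_value >= p_cutoff or fold < min_fold:
        return False
    # Assumption: any fold change below 1.4 in the other comparisons passes,
    # which also admits genes that are downregulated there.
    return all(other_fold < max_other_fold
               for name, (_, other_fold) in comparisons.items()
               if name != lineage)


def selective_genes(table, lineage, **cutoffs):
    """Names of genes in `table` that are selective for `lineage`."""
    return sorted(gene for gene, comparisons in table.items()
                  if is_selective(comparisons, lineage, **cutoffs))


# Hypothetical values for illustration only (not data from this study).
example_table = {
    "geneA": {"E": (1e-6, 3.0), "G": (0.50, 1.1), "M": (0.30, 1.2)},
    "geneB": {"E": (1e-6, 3.0), "G": (1e-5, 2.5), "M": (0.30, 1.0)},
    "geneC": {"E": (1e-3, 4.0), "G": (0.40, 0.9), "M": (0.20, 1.0)},
}
```

In this toy table only geneA is selective for E: geneB is also strongly upregulated in G, and geneC fails the P-value cutoff.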
Of particular interest, we observed that during erythropoietic differentiation, the expression of GTPase activator proteins is upregulated. These GTPase activator proteins are known to be involved in actin cytoskeleton organization, membrane trafficking, gene expression, and cell proliferation. We also observed and validated increased expression of several genes associated with homeostasis and platelets during megakaryopoietic differentiation, including CD44, TPM1, vWF, GP5, PDGFB, F2RL2, and ELMO1.
Having defined the transcriptome of the differentiated hematopoietic cells, we then examined the differences in exon expression that would represent putative alternative splicing. Using the ExonSVD model, we identified 86 known genes as alternatively spliced in the differentiated cells compared with progenitors. These genes include 31 transcripts that are common to the E, G, and M groups. The E and G groups showed an overlap of 14 alternatively spliced genes, while the G and M groups showed the lowest overlap, with five spliced genes. Eleven spliced genes were common to the E and M groups; these are of potential functional importance because erythrocytes and megakaryocytes share a common progenitor. Lineage-specific splicing was observed in 13 genes in the megakaryocytes, while only two genes were specific for the granulocytic group. Red cell differentiation identified 10 genes with differentially expressed exon transcripts. This is the first report showing alternative splicing events during lineage-specific in vitro hematopoietic differentiation. Our gene list also includes alternatively spliced genes (CAST, CLMN, EPHX1, GAB1, vWF) that have previously been identified by others using PCR, SAGE, and sequencing technologies. Importantly, we have identified additional novel spliced genes in the current study, as shown in Tables 7, 8, and 9. Finally, we were able to validate our microarray-based detection of alternative splicing with whole transcriptome sequencing on a next generation massively parallel sequencing platform, with a significant degree of correlation. It is likely that sequence-based expression profiling will become a widely used platform for future transcriptome studies.
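The overlap counts above can be checked with a short Python sketch. It reads the reported numbers as mutually exclusive Venn regions, an interpretation supported by the fact that they sum to the 86 genes reported. The variable and function names are hypothetical, and the per-lineage totals the sketch can compute are derived from the reported regions rather than stated in the text.

```python
# Reported region sizes, keyed by the exact set of lineages sharing the genes.
REPORTED_REGIONS = {
    frozenset("EGM"): 31,  # common to all three lineages
    frozenset("EG"): 14,   # E and G only
    frozenset("GM"): 5,    # G and M only
    frozenset("EM"): 11,   # E and M only
    frozenset("M"): 13,    # megakaryocyte-specific
    frozenset("G"): 2,     # granulocyte-specific
    frozenset("E"): 10,    # erythroid-specific
}


def total_genes(regions):
    """Total number of genes across all disjoint Venn regions."""
    return sum(regions.values())


def lineage_total(regions, lineage):
    """Number of spliced genes detected in one lineage (all regions containing it)."""
    return sum(n for members, n in regions.items() if lineage in members)


def venn_regions(sets_by_group):
    """Count genes per disjoint Venn region from per-lineage gene sets."""
    counts = {}
    for gene in set().union(*sets_by_group.values()):
        members = frozenset(g for g, s in sets_by_group.items() if gene in s)
        counts[members] = counts.get(members, 0) + 1
    return counts
```

Calling `total_genes(REPORTED_REGIONS)` recovers the 86 alternatively spliced genes, and `venn_regions` shows how such a table could be built from per-lineage gene lists.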
Functional pathway analysis of these 86 alternatively spliced genes showed preferential enrichment of biological processes related to hematological system development and function, molecular transport, cellular function and maintenance, cell-to-cell signaling and interaction, immune cell trafficking, cell-mediated immune response, antigen presentation, and protein trafficking. Most importantly, the top five canonical pathways were Rac and RhoA signaling, leukocyte extravasation signaling, Wnt/β-catenin signaling, alanine and aspartate metabolism, and regulation of actin-based motility by Rho.
The Rac signaling pathway has a role in many cellular functions including cell motility and adhesion, cell growth and proliferation, and cell survival and apoptosis (5, 13, 20, 27). Rac proteins constitute a subgroup of the Rho family of small GTPases and include Rac1, Rac2, Rac3, and the splice variant of Rac1, Rac1b. By acting as molecular switches, they control a variety of signal pathways that are essential for cell functions (5, 13, 20, 27). Rac GTPases are key regulators of the actin cytoskeleton, cell-cycle progression and gene transcription, cell survival and apoptosis, and the NADPH oxidase for producing reactive oxygen species. Aberrant Rac signaling is found in some human cancers as a result of changes in the GTPase itself or in its regulation loops (38).
The Rho signaling pathway, another pathway found to be modulated during hematopoietic differentiation, orchestrates cellular processes as diverse as cell migration, cell-cycle progression and cytokinesis, microbial killing (through phagocytosis and NADPH oxidase activity), and agonist-regulated gene transcription. In particular, Rac- and Rho-induced effects, which correlate with membrane protrusion and contractility, respectively, antagonize each other in a variety of cell types.
Wnt proteins are a family of highly conserved signaling factors that control cell fate and differentiation during development, including the signals that govern the balance between self-renewal and differentiation in hematopoietic stem cells (31). Alterations of the Wnt/β-catenin signaling pathway are known to be associated with tumorigenesis in tissues with a high renewal potential, such as bone marrow hematopoietic tissue (2, 3, 9, 14, 32). Moreover, different human Wnt genes show a complex organization and pattern of expression, with alternative promoters and RNA splicing responsible for the expression of isoforms. Therefore, although less attention has thus far been paid to the regulation of Wnt expression, such an analysis appears to be required to understand and define the respective roles of individual Wnt proteins, if not individual Wnt isoforms, in the control of cell fate, differentiation, and tissue regeneration. Further studies are needed to determine whether alternatively spliced isoforms of Wnt pathway genes play a functional role in hematopoietic cell differentiation.
In summary, we used an in vitro model to analyze differentiating hematopoietic stem cells and applied microarray and sequencing technologies to generate detailed expression and exon profiles during lineage-specific differentiation of cells. Findings for erythroid differentiation have been validated and extended with next generation sequencing technology. Knowledge of the specific transcriptional programs active during normal hematopoiesis may contribute to further understanding of the complex process of hematopoietic stem cell development and help define new pathophysiological pathways that could be used for target-specific treatment strategies in the near future.
GRANTS
Funding by Intramural Research, NHLBI and CIT, NIH.
DISCLOSURES
No conflicts of interest, financial or otherwise, are declared by the author(s).
ACKNOWLEDGMENTS
We acknowledge Intramural Research, NHLBI, for the funding and Edge Biosciences and Applied Biosystems for generous help in RNAseq analysis using the SOLiD platform. We appreciate the help of Drs. Harry L. Malech and Uimook Choi, in the Genetic Immunotherapy Section, Laboratory of Host Defenses, National Institute of Allergy and Infectious Diseases, NIH, for providing the CD34+ cells. We acknowledge Dr. Phil Mccoy and Ms. Leigh Samsel in the NHLBI Flow Cytometry Core for help with the flow characterization of cells. We gratefully acknowledge the help of Dr. Zu Xi Yu in the Pathology Core Facility-NHLBI for help in the staining of cells.
Footnotes
The online version of this article contains supplemental material.
REFERENCES
- 1. Adams GB. Deconstructing the hematopoietic stem cell niche: revealing the therapeutic potential. Regen Med 3: 523–530, 2008.
- 2. Arnsdorf EJ, Tummala P, Jacobs CR. Non-canonical Wnt signaling and N-cadherin related beta-catenin signaling play a role in mechanically induced osteogenic cell fate. PLoS One 4: e5388, 2009.
- 3. Auclair BA, Benoit YD, Rivard N, Mishina Y, Perreault N. Bone morphogenetic protein signaling is essential for terminal differentiation of the intestinal secretory cell lineage. Gastroenterology 133: 887–896, 2007.
- 4. Blencowe BJ. Alternative splicing: new insights from global analyses. Cell 126: 37–47, 2006.
- 5. Cancelas JA, Jansen M, Williams DA. The role of chemokine activation of Rac GTPases in hematopoietic stem cell marrow homing, retention, and peripheral mobilization. Exp Hematol 34: 976–985, 2006.
- 6. Chasis JA, Mohandas N. Erythroblastic islands: niches for erythropoiesis. Blood 112: 470–478, 2008.
- 7. Christofi T, Raptis DA, Kallis A, Ambasakoor F. True trilineage haematopoiesis in excised heterotopic ossification from a laparotomy scar: report of a case and literature review. Ann R Coll Surg Engl 90: W12–W14, 2008.
- 8. Dent AL, Kaplan MH. T cell regulation of hematopoiesis. Front Biosci 13: 6229–6236, 2008.
- 9. Eisenmann DM. Wnt signaling. WormBook: 1–17, 2005.
- 10. Elefanty AG, Robb L, Begley CG. Factors involved in leukaemogenesis and haemopoiesis. Baillieres Clin Haematol 10: 589–614, 1997.
- 11. Giebel B, Punzel M. Lineage development of hematopoietic stem and progenitor cells. Biol Chem 389: 813–824, 2008.
- 12. Greaves MF. Differentiation-linked leukemogenesis in lymphocytes. Science 234: 697–704, 1986.
- 13. Hall A. Rho GTPases and the control of cell behaviour. Biochem Soc Trans 33: 891–895, 2005.
- 14. Katoh M. WNT signaling pathway and stem cell signaling network. Clin Cancer Res 13: 4042–4045, 2007.
- 15. Komor M, Guller S, Baldus CD, de Vos S, Hoelzer D, Ottmann OG, Hofmann WK. Transcriptional profiling of human hematopoiesis during in vitro lineage-specific differentiation. Stem Cells 23: 1154–1169, 2005.
- 16. Kosaki G. Platelet production by megakaryocytes: protoplatelet theory justifies cytoplasmic fragmentation model. Int J Hematol 88: 255–267, 2008.
- 17. Mikhail A, Covic A, Goldsmith D. Stimulating erythropoiesis: future perspectives. Kidney Blood Press Res 31: 234–246, 2008.
- 18. Muller-Sieburg C, Sieburg HB. Stem cell aging: survival of the laziest? Cell Cycle 7: 3798–3804, 2008.
- 19. Nissen-Druey C, Tichelli A, Meyer-Monard S. Human hematopoietic colonies in health and disease. Acta Haematol 113: 5–96, 2005.
- 20. Nowak JM, Grzanka A, Zuryn A, Stepien A. (The Rho protein family and its role in the cellular cytoskeleton). Postepy Hig Med Dosw (Online) 62: 110–117, 2008.
- 21. Olsson I, Bergh G, Ehinger M, Gullberg U. Cell differentiation in acute myeloid leukemia. Eur J Haematol 57: 1–16, 1996.
- 22. Orkin SH. Diversification of haematopoietic stem cells to specific lineages. Nat Rev Genet 1: 57–64, 2000.
- 23. Orkin SH. Stem cell alchemy. Nat Med 6: 1212–1213, 2000.
- 24. Orlovskaya I, Schraufstatter I, Loring J, Khaldoyanidi S. Hematopoietic differentiation of embryonic stem cells. Methods 45: 159–167, 2008.
- 25. Palis J, Segel GB. Developmental biology of erythropoiesis. Blood Rev 12: 106–114, 1998.
- 26. Passegue E, Weisman IL. Leukemic stem cells: where do they come from? Stem Cell Rev 1: 181–188, 2005.
- 27. Pernis AB. Rho GTPase-mediated pathways in mature CD4+ T cells. Autoimmun Rev 8: 199–203, 2009.
- 28. Pohar TT, Sun H, Davuluri RV. HemoPDB: Hematopoiesis Promoter Database, an information resource of transcriptional regulation in blood cell development. Nucleic Acids Res 32: D86–D90, 2004.
- 29. Scandura JM. Advances in the molecular genetics of acute leukemia. Curr Oncol Rep 7: 323–332, 2005.
- 30. Solier S, Barb J, Zeeberg BR, Varma S, Ryan MC, Kohn KW, Weinstein JN, Munson PJ, Pommier Y. Genome-wide analysis of novel splice variants induced by topoisomerase I poisoning shows preferential occurrence in genes encoding splicing factors. Cancer Res 70: 8055–8065, 2010.
- 31. Staal FJ, Luis TC. Wnt signaling in hematopoiesis: crucial factors for self-renewal, proliferation, and cell fate decisions. J Cell Biochem 109: 844–849, 2010.
- 32. Sumi T, Tsuneyoshi N, Nakatsuji N, Suemori H. Defining early lineage specification of human embryonic stem cells by the orchestrated balance of canonical Wnt/beta-catenin, Activin/Nodal and BMP signaling. Development 135: 2969–2979, 2008.
- 33. Thomsen R, Solvsten CA, Linnet TE, Blechingberg J, Nielsen AL. Analysis of qPCR data by converting exponentially related Ct values into linearly related X0 values. J Bioinform Comput Biol 8: 885–900, 2010.
- 34. Warren LA, Rossi DJ. Stem cells and aging in the hematopoietic system. Mech Ageing Dev 130: 46–53, 2009.
- 35. Watkins NA, Gusnanto A, de Bono B, De S, Miranda-Saavedra D, Hardie DL, Angenent WG, Attwood AP, Ellis PD, Erber W, Foad NS, Garner SF, Isacke CM, Jolley J, Koch K, Macaulay IC, Morley SL, Rendon A, Rice KM, Taylor N, Thijssen-Timmer DC, Tijssen MR, van der Schoot CE, Wernisch L, Winzer T, Dudbridge F, Buckley CD, Langford CF, Teichmann S, Gottgens B, Ouwehand WH. A HaemAtlas: characterizing gene expression in differentiated human blood cells. Blood 113: e1–e9, 2009.
- 36. Weih F, Carrasco D, Durham SK, Barton DS, Rizzo CA, Ryseck RP, Lira SA, Bravo R. Multiorgan inflammation and hematopoietic abnormalities in mice with a targeted disruption of RelB, a member of the NF-kappa B/Rel family. Cell 80: 331–340, 1995.
- 37. Weiss MJ, dos Santos CO. Chaperoning erythropoiesis. Blood 113: 2136–2144, 2009.
- 38. Yamazaki D, Kurisu S, Takenawa T. Regulation of cancer cell motility through actin reorganization. Cancer Sci 96: 379–386, 2005.