Abstract
Hepatocyte-like cells (HLCs) can be derived from pluripotent stem cells (PSCs) by sequential treatment with chemical cues that mimic the microenvironment of embryonic liver development. However, these HLCs do not reach the full maturity level of primary hepatocytes. In this study, we carried out a meta-analysis of cross-species transcriptome data of in vitro differentiation of human PSCs to HLCs and in vivo mouse embryonic liver development to identify the developmental stage at which HLC maturation is blocked. Systematic variations associated with the data source were found and removed by batch correction. Using principal component analysis, HLCs from different stages of differentiation were aligned chronologically with mouse embryonic liver development. A “unified developmental time” (DT) scale was developed after aligning in vitro HLC differentiation and in vivo embryonic liver development. HLCs were found to cease further maturation at a stage equivalent to mouse embryonic day (E)13–15. Genes with discordant time dynamics were identified by aligning the in vivo and in vitro data sets onto a common DT scale. These genes may be targets of genetic intervention for enhancing the maturity of PSC-derived HLCs.
Keywords: hepatic differentiation, meta-analysis, gene expression, stem cells
Introduction
Hepatocytes derived from pluripotent stem cells (PSCs) hold the potential of transforming the treatment of many liver diseases. They may also serve as a virtually unlimited cell source for drug toxicity and pharmacokinetics studies [1]. By culturing PSCs in different combinations of growth factors that mimic the biochemical environment during embryonic liver development, they can be directed to differentiate toward the hepatic lineage [2,3]. These differentiated cells exhibit many functional characteristics of primary hepatocytes, including albumin secretion, certain cytochrome P450 activities, and urea synthesis. Engraftment of cells into animals has also been observed when transplanted under selective conditions [4–8].
However, despite the capability of expressing some hepatic genes, the differentiated cells are regarded as “hepatocyte-like” cells (HLCs), as many mature liver markers are either not expressed or expressed at very low levels. The general lack of maturity of HLCs was further revealed in a microarray transcriptome study showing that HLCs derived from three separate laboratories were clustered distinctively from primary hepatocytes [9]. Through functional activities and proteomic analysis, it was clearly demonstrated that HLCs were closer to fetal liver hepatocytes than primary hepatocytes [9,10].
Many approaches have been explored to enhance maturation of HLCs, including the formation of tissue-like three-dimensional (3D) structure [11,12], coculture of HLCs and endothelial cells [12–14], the addition of small molecules uncovered by screening [15–17], and overexpression of certain transcription factors (TFs) [18–20]. Nevertheless, the derivation of mature hepatocytes from PSCs is still elusive. Although the lack of maturity of HLCs has been well recognized, the stage of embryonic liver development that HLCs are equivalent to is not known. By identifying the developmental stage that HLC maturation is arrested at, one can begin to devise strategies to overcome the genetic roadblocks.
Human gestation occurs over 280 days compared to 20 days for mouse. Although these processes differ in time scales, they share similar progression through the different developmental stages (Supplementary Fig. S1A; Supplementary Data are available online at www.liebertpub.com/scd). A typical directed hepatocyte differentiation for both human and mouse PSCs (hPSCs and mPSCs) lasts for about 20 days [21].
We hypothesize that the transcriptome data of in vitro hepatic differentiation bear the characteristics of embryonic liver development. We set out to align the transcriptome data of different stages of HLC differentiation to those of various gestation stages of embryonic liver development. We will then identify the corresponding embryo development stage that HLC differentiation is blocked at, and uncover the discordant gene expression between the in vitro HLC differentiation and in vivo liver development that may contribute to the differentiation block.
To better understand the genetic hurdles that need to be addressed, we performed a comprehensive meta-analysis of publicly available gene expression data sets along with our own transcriptome data generated for this study. However, transcriptome data across different stages of liver development are available only for rodents, not for humans. We hypothesized that the transcriptome dynamics of embryonic liver development in rodents and humans are highly similar, and performed cross-species and cross-in vitro/in vivo analysis with the aim of aligning the human hepatocyte-like cell (hHLC) differentiation and the mouse embryonic liver development process onto a common scale, a “unified developmental time” (DT).
It has been reported that data obtained with different measurement platforms can often bear systematic variations and random errors [22–24]. The ComBat algorithm employs empirical Bayes method to perform a location and scale adjustment by pooling information across each gene in every batch to center the data to an overall grand mean [22]. This technique can be applied to multiple batches of data sets and is robust for small sample sizes to overcome the biasing that is typically observed [22]. Other methods of batch correction have also been reported. Some, such as the limma package in R, were reported to give less “exaggerated” findings [25].
However, ComBat has been used often and shown to outperform the common batch correction techniques, including distance-weighted discrimination, which uses a support vector machine-based approach [26], mean-centering (PAMR), which relies on gene-based analysis of variance [23], and surrogate variable analysis, which uses a combination of singular value decomposition and linear model analysis [24,27]. Hence, we used the ComBat algorithm to integrate transcriptome data at different time points of hHLC differentiations and mouse embryonic liver development.
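The location and scale adjustment at the core of this kind of batch correction can be illustrated with a minimal sketch. It is not the ComBat implementation used in this study: it omits the empirical Bayes shrinkage that pools information across genes, and it does not protect biological covariates. The function name `location_scale_adjust` and the toy values are hypothetical and serve only to show how per-batch means and variances are brought onto a common center and scale for a single gene.

```python
from statistics import mean, pstdev

def location_scale_adjust(values, batches):
    """Adjust one gene's expression values so that every batch shares the
    overall mean and a pooled within-batch standard deviation.

    Simplified illustration only: no empirical Bayes shrinkage and no
    covariate handling, unlike the full ComBat algorithm.
    """
    grand_mean = mean(values)

    # Group the values of this gene by batch label.
    by_batch = {}
    for v, b in zip(values, batches):
        by_batch.setdefault(b, []).append(v)

    # Pooled within-batch spread, used as the common target scale.
    residuals = [v - mean(by_batch[b]) for v, b in zip(values, batches)]
    pooled_sd = pstdev(residuals) or 1.0

    adjusted = []
    for v, b in zip(values, batches):
        batch_mean = mean(by_batch[b])
        batch_sd = pstdev(by_batch[b]) or 1.0  # guard against zero spread
        adjusted.append(grand_mean + pooled_sd * (v - batch_mean) / batch_sd)
    return adjusted


# Toy example: the same trend measured in two batches with an offset of 10.
toy_values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
toy_batches = ["A", "A", "A", "B", "B", "B"]
toy_adjusted = location_scale_adjust(toy_values, toy_batches)
```

In the toy example the additive offset between the two batches disappears after adjustment while the within-batch trend is preserved. Applying such a centering blindly would also remove genuine biological differences that coincide with batch, which is one reason a model-based method is preferred in practice.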
In this study, we sought to identify the equivalent embryonic liver development stage of HLCs. We found that, regardless of the source and the protocol used, HLCs stop maturing at the same stage. We also identified genes that have different expression dynamics between in vitro differentiation and embryonic liver development. To the best of our knowledge, this is the first described meta-analysis of highly heterogeneous transcriptome data from human and mouse. This study gives further clues for advancing directed differentiation of PSCs to hepatocytes.
Materials and Methods
Human embryonic stem cell culture
The human embryonic stem cell (hESC) line H9 was cultured using 80% Knockout™ Dulbecco's modified Eagle's medium (DMEM) (Gibco/BRL, USA), supplemented with 20% Knockout Serum Replacement (Gibco), 2.0 mM glutamine, 1 × nonessential amino acids (Gibco), 55 μM β-mercaptoethanol, and 10 ng/mL basic fibroblast growth factor (FGF2). The cells were cultured on irradiated E13–14 CF-1 mouse embryonic fibroblasts (Charles River Laboratories, Wilmington, MA) at 37°C in 10% CO2. The cells were passaged after they reached 70%–80% confluency using 0.1% (w/v) collagenase type IV (Gibco) in Knockout DMEM.
Hepatocyte differentiation
H9 and mouse induced pluripotent stem cells (miPSCs) were passaged onto 2% Matrigel®-coated plates in mTESR™ medium (Stem Cell Technologies) for 24 h or until 50%–70% confluency. Differentiation was initiated by switching to a differentiation medium, which consisted of a 60/40 (v/v) mixture of low glucose DMEM (Gibco) and MCDB-201 (Sigma), supplemented with 26 μg/mL ascorbic acid 3-phosphate (Sigma), 2.5 μg/mL linoleic acid, 0.25 mg/mL bovine serum albumin, insulin-transferrin-selenium (Sigma) (2.5 μg/mL insulin, 1.38 μg/mL transferrin, and 1.25 ng/mL sodium selenite), 0.4 μg/mL dexamethasone (Sigma), 4.3 μg/mL β-mercaptoethanol (Hyclone), 100 IU/mL penicillin, and 100 μg/mL streptomycin (Gibco). Two percent (v/v) fetal bovine serum albumin (Hyclone) was added to the media for the first 6 days and then 0.5% (v/v) for the remaining 14 days. The differentiation medium was supplemented with stage-specific growth factors, and cells were cultured at 21% O2 and 5% CO2 following a previously published study [2].
Data sets
The sources of the human and mouse data sets are listed in Supplementary Tables S1 and S2, respectively. The human data set consisted of transcriptomes of different hESCs differentiating to hepatocytes, including HLCs generated using various protocols, two time course data sets from different protocols (Hu/Duncan) [3,28–31], HLCs in 3D spheroids [11], liver organoids generated through coculture of induced pluripotent stem cell (iPSC)-derived HLCs and mesenchymal and endothelial cells [12], and induced hepatocytes (iHEPs) generated through direct reprogramming of fibroblasts to HLCs by hepatic TF transduction [32]. Also included were primary human hepatocytes (PHHs) (pooled messenger RNA after 1 day of culture), fetal liver (18 weeks of gestation), and adult liver. Transcriptome data of mouse embryonic liver development (E9.5 to postnatal) include CD45− Ter119− liver cells from C57/BL6/Tg mice embryos and whole livers from C57/B6 mice embryos [12,33].
Microarray sample processing
Total RNA was extracted at different stages during hPSC and mPSC differentiation to HLCs using the RNeasy Mini Kit (Qiagen). Differentiating hESC (H9, HSF6) samples were hybridized to the Illumina HT12 bead array v3 (Illumina, Inc.), HES3 samples were hybridized to Human Genome U133 v2 array (Affymetrix), and the miPSC-derived samples were hybridized to the WG-6 v2 (Illumina) array.
Data analysis
Global analysis to elucidate trends in the data was performed using hierarchical clustering, principal component analysis (PCA), and non-negative matrix factorization (NMF). Before subjecting the data to batch correction, genes with static expression profiles in the data set were removed. Only those genes changing by more than fourfold across any timewise comparison were retained in the global analysis. All analyses were carried out in the statistical software R. Briefly, the VirtualArray package in R was used to obtain probe IDs among different samples from different microarray platforms [34]. The BioMart package in R was used to convert probe IDs to common ENSEMBL IDs among the different platforms in the expression data set [35]. Further details regarding each statistical analysis, as well as the treatment of the hHLC/mouse hepatocyte-like cell (mHLC) and mouse embryonic liver data sets in the meta-analysis, are described in the Supplementary Materials and Methods section.
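The filtering of static genes can be sketched as follows. This is an illustration rather than the code used in the study, which was written in R. It assumes that expression values are log2 intensities, so that a fourfold change corresponds to a difference of 2, and that "any timewise comparison" means any pair of samples within a gene's profile. The function name `filter_dynamic_genes` and the toy values are hypothetical.

```python
import math

def filter_dynamic_genes(log2_expr, min_fold=4.0):
    """Keep genes whose expression differs by more than `min_fold`
    between at least one pair of samples.

    `log2_expr` maps a gene identifier to a list of log2 expression values
    (assumption: data are already log2-transformed).
    """
    threshold = math.log2(min_fold)  # fourfold -> 2 log2 units
    return {
        gene: values
        for gene, values in log2_expr.items()
        if max(values) - min(values) > threshold
    }


# Toy profiles (invented values): one changing gene, one static gene, and
# one gene that changes by exactly fourfold and is therefore not retained.
toy_expr = {
    "ALB": [5.0, 7.5, 9.0],
    "static_gene": [10.0, 10.3, 10.1],
    "edge_gene": [4.0, 6.0],
}
dynamic = filter_dynamic_genes(toy_expr)
```

Using the range of each profile (maximum minus minimum) is equivalent to testing every pairwise comparison, because the largest pairwise difference is always between the extreme values.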
Results
Compilation of human in vitro hepatic differentiation data
hESCs were differentiated toward the hepatic lineage using a four-stage hepatic differentiation protocol (Supplementary Fig. S1A). The transcript of hepatic markers, HNF4A, AFP, ALB, CYP3A4, AAT, and TAT, increased significantly by the end of the differentiation (Supplementary Fig. S1B). Immunohistochemistry reveals that a majority of cells were positive for endodermal marker FOXA2 by day 6. Hepatic markers HNF4A, AAT, and ALB were prominent in cells by day 20 (Supplementary Fig. S1C). The high transcript expression of AFP, which is low in primary hepatocytes, suggests the fetal nature of hHLCs derived from PSCs. Samples over the course of the differentiation on days 6, 10, 14, and 20 (labeled D6_H, D10_H, D14_H, and D20_H, respectively) were collected for transcriptome assay.
The transcriptome data, along with another set of hHLC differentiation [36], and additional data from the public domain were combined together to increase the data diversity (the sources and references along with sample annotations are listed in Supplementary Table S1). One data set consisted of multiple time points over the course of differentiation from hPSCs to hHLCs (Duncan) [28], while the others consisted of hHLC samples obtained using various differentiation methods. This compilation resulted in a total of 17,683 genes that were common among all the data sets based on ENSEMBL ID.
Batch correction of human in vitro data
Batch correction was performed on the hHLC in vitro data set to remove the systematic bias caused by the heterogeneous samples and platform sources. The effect of batch correction is visible when hierarchical clustering of the samples is compared before and after correction. Without batch correction, samples obtained from each study clustered together, as seen in Supplementary Figure S2. After batch correction, samples clustered according to their differentiation stage, regardless of the study they came from, suggesting the removal of batch effects (Fig. 1A).
Samples from a similar differentiation stage for both of the time series studies (Hu/Duncan) clustered together, indicating a similar progression through the differentiation stages in different protocols. However, the hHLC data from all sources consistently clustered separately from tissue samples of fetal and adult liver samples, confirming the immaturity of these cells.
NMF analysis was performed to separate the hHLC data set into three clusters (Fig. 1B). The optimal number of clusters, three, was determined by the highest cophenetic coefficient obtained (Supplementary Fig. S3). The smallest cluster consisted of mature hepatic samples from liver tissue and primary hepatocytes, and the other two clusters consisted of samples from early (D4–10) and late differentiation (D14–20) stages. Only two hHLC samples [11,31] were grouped separately in the mature cluster. The grouping of early and late differentiation samples was remarkably similar between NMF and hierarchical clustering. Notably, samples in similar differentiation stages from the two differentiation time series studies (Hu/Duncan) were clustered together.
We used Database for Annotation, Visualization, and Integrated Discovery (DAVID) to probe the functional class of the 967 metagenes identified from the NMF analysis (Supplementary Table S3) [37]. Genes associated with the clustering of early differentiating cell states (D4–10) were involved mainly in developmental processes, while those associated with late differentiation states (D14–20) were mainly involved in cell differentiation, adhesion, extracellular matrix reorganization, epithelial specification, and drug response. The metagenes classifying fetal and adult liver tissue were involved in mature liver functions, including CYP450 drug detoxification, electron carrier activity, and carbohydrate metabolism, among many others.
Thus, the two unsupervised classification methods, hierarchical clustering and NMF, classified the batch corrected human transcriptome data into similar groups based on the functional relevance of gene expression patterns. This gives credence to the data processing method we adopted for batch correction to further carry out the meta-analysis.
Alignment of human in vitro data along a differentiation scale
The batch corrected transcriptome data of the hHLC data set was subjected to PCA. Two principal components, PC1 and PC2, captured 90% of the data variance, suggesting that two components were sufficient to display the variability within the transcriptome data of the samples. When plotted, all samples, starting from endodermal cells to hHLCs, were observed to align chronologically along an arc, while primary hepatocytes and adult liver were in the high PC2 region (Fig 1C). The samples from the time series studies (Hu/Duncan) both aligned from left to right in chronological order based on the differentiation stages. All hHLC samples, regardless of the laboratory and protocol used, aligned within a narrow region in the principal component space, suggesting that they all have a similar degree of hepatocyte maturity.
Alignment of the transcriptome of rodent fetal liver development
The data processing pipeline described for the human data sets was used to process the data set of mouse embryonic liver development. The data set compiled from 47 samples across the development of 2 strains of mice was batch corrected [12,33]. A total of 16,415 genes were common across samples based on ENSEMBL ID.
Hierarchical clustering, NMF, and PCA were conducted on the data set. Similar to the observation made before, batch correction removed the bias arising from different platforms and sources (Fig. 2A; Supplementary Fig. S4). Clustering of the samples into two clusters was optimal, as assessed by the cophenetic coefficient (Supplementary Fig. S5). NMF classified the batch corrected data set into two groups, early and late development, which was in line with the hierarchical clustering results (Fig. 2A, B). When the data were subjected to PCA, the first two principal components captured >90% of the data variance. As was seen for the hPSC-HLC time series, we found that the mouse embryonic liver development samples lined up chronologically in order of their developmental stages (Fig. 2C).
From the NMF and PCA analysis, the samples seemed to be grouped into an early (E9.5–E14.5) cluster and a late (E15.5–E19.5) cluster. This classification agrees with our understanding of liver development, where the E14.5 stage represents a transition from primarily hematopoiesis to hepatocyte maturation [38]. Functional analysis on the 129 metagenes obtained from the NMF grouping confirmed that genes related to liver functions were key in the grouping of samples (Supplementary Table S4).
Integration of mouse in vitro differentiation and in vivo development data
Transcriptome data of a time course analysis of miPSCs differentiating to HLCs [21] and murine fibroblasts being reprogrammed to iHEPs [39,40] were also compiled and integrated with the mouse embryonic development data sets by batch correction. Again, batch correction removed the bias of platforms, sources and in vitro/in vivo contrast. These combined data were then subjected to PCA. On a plot of PC1 versus PC2, the in vitro differentiation data and in vivo development data can be seen to line up along an arc based on the maturity of the samples, just like the hHLC data (Fig. 3).
The miPSC differentiation data aligned chronologically along E9.5–E15.5 of mouse liver development, while the data points for mouse embryonic liver development beyond E15 continue to spread to the region with higher values of PC2. The results suggest that the miPSC-HLCs and mouse induced hepatocytes (miHEPs) acquired a differentiation stage equivalent of ∼E13.5–E15.5 stage and still lack the maturity of fully developed prenatal E19 and postnatal liver.
Alignment of mouse in vivo and human in vitro developmental paths
We next integrated the hHLC differentiation data with mouse embryo development data by orthologous genes through common homologous identifiers using the BioMart database package in the statistical program, R. This resulted in a master data set with 14,312 genes. Batch correction was performed on the combined data set to eliminate the batch effects within the samples from different species and in vitro and in vivo samples.
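The merging of two species' expression tables through orthologous genes can be sketched as below. This is a simplified illustration, not the BioMart-based procedure used in the study: it assumes that an ortholog table has already been retrieved as a mapping from human to mouse gene identifiers, and it keeps only one-to-one pairs present in both data sets. The function name `merge_by_orthologs` and all identifiers in the toy example are hypothetical placeholders.

```python
def merge_by_orthologs(human_expr, mouse_expr, orthologs):
    """Combine two gene -> values tables using one-to-one ortholog pairs.

    `orthologs` maps a human gene ID to a mouse gene ID. Pairs are dropped
    when the mouse gene is claimed by several human genes, or when either
    gene is missing from its expression table. The merged table is keyed
    by the human ID, with human samples followed by mouse samples.
    """
    # Count how many human genes point at each mouse gene.
    counts = {}
    for mouse_id in orthologs.values():
        counts[mouse_id] = counts.get(mouse_id, 0) + 1

    merged = {}
    for human_id, mouse_id in orthologs.items():
        if counts[mouse_id] != 1:
            continue  # not a one-to-one relationship
        if human_id in human_expr and mouse_id in mouse_expr:
            merged[human_id] = human_expr[human_id] + mouse_expr[mouse_id]
    return merged


# Placeholder identifiers and values, for illustration only.
toy_human = {"hG1": [1.0, 2.0], "hG2": [3.0, 4.0], "hG3": [5.0, 6.0]}
toy_mouse = {"mG1": [1.5, 2.5, 3.5], "mG2": [0.5, 0.6, 0.7]}
toy_orthologs = {"hG1": "mG1", "hG2": "mG2", "hG3": "mG2", "hG4": "mG9"}
toy_master = merge_by_orthologs(toy_human, toy_mouse, toy_orthologs)
```

Restricting the merge to unambiguous pairs is one plausible reason the master data set contains fewer genes than either single-species compilation; how many-to-many orthologs were actually handled is not stated here.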
PCA was performed on the integrated data set of hHLC differentiation and mouse embryonic liver development. The data points of mouse and human aligned along the same arc, and lined up in order of their development or hHLC differentiation stages, respectively (Fig. 4). From the alignment, the corresponding stages of hHLC differentiation and mouse embryo development can be deduced. hPSC-derived endodermal cells were aligned with E9.5–E10.5 samples, and the majority of fully differentiated hHLCs were aligned with E13.5–E15.5 of mouse development. These results strongly indicate that hHLCs derived from hPSCs are most similar to the mouse ∼E14–E15 stage of in vivo development.
Comparison of HLCs with mature cells to find differentially expressed genes
The transcriptome of our day 20 hHLCs and mHLCs was separately compared to their mature counterparts (PHHs and E19.5 mouse liver, respectively) using Significance Analysis of Microarray (SAM-R) in R (where the criteria of q < 0.05 and fold change >4 were imposed) to identify differentially expressed genes. One hundred twenty-seven genes were identified as differentially expressed between hHLCs and PHHs. Functional analysis on the 127 differentially expressed genes using DAVID revealed that processes involved in CYP450 drug metabolism, carboxylic acid, amine and lipid metabolism, and complement and coagulation cascades were enriched [37].
Among the 127 differentially expressed genes, 42 were common between the human and mouse analyses, which are shown in Figure 5 and listed in Supplementary Table S5. Mature hepatocyte genes that were expressed at low levels in HLCs of both species included metabolic genes, such as G6PC and FBP1, and the cytochrome P450 enzymes, CYP3A4 and CYP2C9, indicating the lack of maturity in the metabolic profile of HLCs.
Expression profile comparison on a DT scale
The alignment of hHLC differentiation and mouse embryonic liver development data on a common coordinate grid allowed us to compare the two series of developmental events along a common “time” scale. We treated the arc formed by those data points as a common developmental path. Using the first point (E9.5) as a reference point, the distance of a sample from the reference point along the developmental path can be taken as a “unified developmental time (DT).” A higher value of DT indicates a higher maturity for that sample. The PC2 value of the two time series data of hHLC differentiation (Hu/Duncan) and all the mouse development data (from Fig. 3) was plotted on the DT scale (Fig. 6A). The hHLCs showed a progressive movement along the DT to ∼0.4 (equivalent to ∼E15.5) by the end of the differentiation.
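One way to turn a position along the developmental path into a single number is sketched below. The exact construction of DT is not given in this section, so the sketch makes several assumptions: the path is approximated by a polyline through ordered anchor points in the PC1/PC2 plane (for example, mouse samples in chronological order), each sample is projected onto its nearest segment, and DT is the arc length from the first anchor to that projection, normalized by the total path length. The function name `developmental_time` and the toy coordinates are hypothetical.

```python
import math

def developmental_time(path, sample):
    """Return the normalised arc-length position (0 to 1) of `sample`
    projected onto the polyline `path`.

    `path` is an ordered list of (pc1, pc2) anchor points with at least two
    distinct points; `sample` is a (pc1, pc2) pair.
    """
    segments = list(zip(path, path[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    total = sum(lengths)

    best_dist, best_arc = float("inf"), 0.0
    travelled = 0.0
    for (a, b), length in zip(segments, lengths):
        if length > 0:
            # Fraction along a->b of the closest point, clamped to the segment.
            t = ((sample[0] - a[0]) * (b[0] - a[0]) +
                 (sample[1] - a[1]) * (b[1] - a[1])) / length ** 2
            t = max(0.0, min(1.0, t))
            proj = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            d = math.dist(sample, proj)
            if d < best_dist:
                best_dist, best_arc = d, travelled + t * length
        travelled += length
    return best_arc / total


# Toy path with a right-angle bend, standing in for the developmental arc.
toy_path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
dt_early = developmental_time(toy_path, (0.5, 0.2))
dt_late = developmental_time(toy_path, (1.2, 0.5))
```

With this construction, a sample lying off the path is assigned the DT of the nearest point on the path, so samples from different data sources can be placed on the same scale as long as they share the principal component coordinates.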
With hHLC differentiation and mouse embryonic liver development on the same DT scale, the similarity or difference in the expression dynamics of any gene can be systematically compared. While some hepatic markers (eg, AFP) showed similar trends between human in vitro and mouse in vivo data, others showed inconsistent trends.
Dynamic differential expression analysis using Pearson correlation, Spearman's coefficient, and Euclidean distance (PSE) was performed to identify the dynamically differentially expressed genes between mouse development and in vitro differentiation. The Pearson correlation coefficient measures the linear correlation, while the Spearman's coefficient measures the monotonic relationship between two variables. Briefly, the threshold criteria were set at ≤−0.8 for the Pearson and Spearman coefficients and at ≥μ+2σ for the Euclidean distance; further details on the criteria imposed are described in the Supplementary Materials and Methods section. The genes fulfilling the PSE criteria were plotted and visually inspected for both human in vitro differentiation and mouse development (Supplementary Fig. S6).
Functional analysis of dynamically differentially expressed genes
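The three PSE measures can be illustrated with a short sketch. It is not the procedure from the Supplementary Materials and Methods: it assumes that each gene's in vitro and in vivo profiles have already been sampled at the same DT values, that μ and σ are the mean and standard deviation of the Euclidean distances across all genes, and it reports the correlation and distance criteria as separate flags because how they are combined is not specified here. All function names and toy profiles are hypothetical.

```python
import math
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0

def ranks(values):
    """Average ranks (1-based), so tied values share the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))

def pse_flags(in_vitro, in_vivo, corr_cut=-0.8, n_sd=2.0):
    """Flag genes with opposite trends and/or unusually distant profiles.

    Both inputs map gene -> profile on a shared DT grid (assumption).
    """
    genes = sorted(set(in_vitro) & set(in_vivo))
    dist = {g: math.dist(in_vitro[g], in_vivo[g]) for g in genes}
    all_dist = list(dist.values())
    cut = mean(all_dist) + n_sd * pstdev(all_dist)  # mu + 2*sigma
    flags = {}
    for g in genes:
        opposite = (pearson(in_vitro[g], in_vivo[g]) <= corr_cut and
                    spearman(in_vitro[g], in_vivo[g]) <= corr_cut)
        flags[g] = {"opposite_trend": opposite, "far_apart": dist[g] >= cut}
    return flags


# Invented profiles: one gene rising in vitro but falling in vivo.
toy_vitro = {"gene_a": [1, 2, 3, 4], "gene_b": [1, 2, 3, 4]}
toy_vivo = {"gene_a": [4, 3, 2, 1], "gene_b": [1, 2, 3, 4.5]}
toy_flags = pse_flags(toy_vitro, toy_vivo)
```

A strongly negative correlation captures genes whose trends run in opposite directions, whereas the distance criterion captures genes whose profiles differ greatly in magnitude even when their trends agree, which is why the two kinds of measure are complementary.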
One hundred ninety-seven genes were found to be dynamically differentially expressed between hHLC differentiation and mouse embryonic liver development (Supplementary Table S6), including 11 TFs (Fig. 6B), 7 transporters, and 33 other cell surface markers. The major functional classes/pathways involved from the 197 genes identified by the DAVID tool [37] included developmental process, organ development, and cell adhesion as shown in Supplementary Figure S7.
Using the same PSE criteria as before, we identified 141 dynamically differentially expressed genes between mouse embryo liver development and mHLC differentiation (Supplementary Table S7). One hundred one of the 141 genes were common with the 197 genes found in the previous comparison between hHLC differentiation and mouse embryo development (these genes are marked with * in Supplementary Table S7). The major functional classes/pathways represented in the subset of 101 genes also included developmental process, embryonic morphogenesis, and cell–cell adhesion. This is consistent with the results found previously between hHLC differentiation and mouse embryo development.
Among the 101 consistently discordant genes between mHLC and hHLC are a number of TFs regulating organ development, including HAND1, PITX2, ALX1, CDX2, TSHZ1, and SNAI2. They were found to be highly expressed in the early stages of mouse liver development and to subside after E15, but their transcript levels during hPSC differentiation followed an opposite trend and remained high in hHLCs (Fig. 6B).
We also confirmed the dynamics of the identified differentially expressed TFs, using quantitative reverse transcription-polymerase chain reaction (qRT-PCR) over the course of differentiation (Supplementary Fig. S8). ALX1 is involved in forebrain development [41], and CDX2 plays a role in lineage segregation of the inner cell mass and trophectoderm [41,42], as well as intestinal fate and epithelial mesenchymal signaling [43]. HAND1 regulates embryonic cardiac development [44], PITX2 participates in limb development [45,46], SOX9 in pancreatic and bile duct development [47], SNAI2 in epithelial–mesenchymal transition (EMT) [48,49], and TSHZ1 in pancreatic cell development and maturation [50]. We examined the TF binding sites in the promoter regions of the dynamically differentially expressed genes (http://genome.ucsc.edu/ENCODE/). The TFs that have their binding site present in the promoter region of the most differentially expressed genes are listed in Supplementary Table S8.
Several cell surface proteins also had opposite trends of transcript dynamics between HLC differentiation and mouse liver development (Supplementary Fig. S6), including SPARC, an extracellular matrix secreted factor, and CDH3 (P-cadherin). Their transcript levels increased during differentiation, but decreased in the developing embryonic mouse liver. Comparison of the expression levels of the hepatic markers, alpha-fetoprotein (AFP) and albumin (ALB), for mouse development and our time series in vitro human differentiation (Hu) (Supplementary Fig. S9A–C) revealed consistent trends and similar magnitudes of gene expression dynamics before and after batch correction. The profile of all 197 dynamically differentially expressed genes was also visually inspected after plotting with TimeVIEW to affirm that treatment with ComBat did not distort the data (Supplementary Fig. S10) [51].
Discussion
In this study, we used highly heterogeneous transcriptome data from hPSC and mPSC in vitro differentiation and mouse in vivo embryo development for meta-analysis. Batch corrections were performed on the data to remove systematic variations biasing our observations. Previous studies have shown that the empirical Bayes-based ComBat algorithm can be relied on in removing system bias caused by a mixture of transcriptome data from different populations of cells of different sources [52–57]. ComBat had previously been shown to integrate transcriptome data from multiple studies of different vascular endothelial growth factor (VEGF) isoform mice to identify the role VEGF plays in neural stem cell fate [52].
Recently, ComBat was used for treating the transcriptome data of different PSC populations generated from multiple studies to determine the similarities between genetically matched iPSCs, nuclear transfer stem cells, and embryonic stem cells [56]. We extended its use to combining data from different species and across the in vitro/in vivo boundary to understand stem cell hepatocyte differentiation. Through successive data processing with increasing levels of data complexity, we successfully integrated hPSC-HLC differentiation data from different studies with mouse in vivo liver development data.
In each level of analysis, after batch correction, unsupervised clustering grouped the different samples by their biological similarities instead of the source of the sample (Figs. 1 and 2; Supplementary Figs. S2 and S4). The clustering and functional analysis of the batch-corrected data reveal the biological similarities between the different samples after data processing. In addition, the functional classes of the metagene obtained in NMF were found to be related to development and liver functions, confirming the biological relevance from clustering of the samples. Importantly, the genes we identified as dynamically differentially expressed retained the same dynamics before and after batch correction.
Identification of similar stages of development across different species that have vastly different time scales often relies on common morphological, biochemical, and genetic hallmarks. For example, in human embryo, development days 22 and 52 are considered to be equivalent to mouse E9.5 and E14.5, respectively [58]. In an earlier study, neural development events were used to generate a regression model for predicting a time table of corresponding stages across nine species [59]. Recently, transcriptome data of nematode species were compared to morphological markers to establish embryo developmental milestones in different species of Caenorhabditis [60]. In another study, different feature measurements of leaves for different tomato species were subjected to PCA to establish a developmental trajectory [61].
Similarly, RNA-seq data of a species' developing embryo were subjected to PCA to align the developmental stage in the PC plane [62]. However, in many cases, the maturity of HLCs is measured by comparing it to primary hepatocytes at the functional, proteome, and transcriptome level. To our knowledge, this is the first study to compare genes across species and across the in vitro/in vivo border in a dynamic manner to identify the developmental stage at which HLCs are arrested.
Our meta-analysis revealed that HLCs were closer to mid-liver development (∼E15.5) than to primary adult hepatocytes. This indicates that, irrespective of the species of origin or the protocol used, differentiating HLCs appear to encounter a common, fundamental block that prevents them from maturing further.
Using the common scale that was developed, we can align the transcript dynamics of human in vitro PSC-HLC differentiation with mouse in vivo development to identify genes whose expression was progressing in the wrong direction compared to the in vivo processes. We also identified differentially expressed genes between h/mHLCs and mature cells (E19.5, adult liver), including key CYP450 enzymes (CYP2C8, CYP2C9, CYP2E1, CYP2D6, and CYP3A4) and metabolic isozymes (ALDOB, GLUT2, G6PC, and FBP1), which are all hallmarks of mature liver metabolism.
The 197 dynamically differentially expressed genes between hHLC differentiation and mouse liver embryo development similarly pointed to the lack of maturity in HLCs. An example is the expression of genes involved in glucose metabolism. As hepatocytes become more mature, a new set of transporters, enzymes, and their isoforms in glucose metabolism is expressed to give the liver its capability in maintaining the homeostasis of glucose and gluconeogenesis. Notably, the increased expression of the glucose transporter GLUT2 and aldolase B (ALDOB) seen in mouse liver development and human adult hepatocytes did not occur in HLCs (Supplementary Fig. S11). Not surprisingly, the increased expression of the gluconeogenesis enzymes, glucose 6-phosphatase (G6PC) and fructose 1,6-bisphosphatase (FBP1), seen in mouse development was also not seen in HLCs.
Since both hHLCs and mHLCs ceased further maturation at about the E14–E15 stage, it is plausible that they face the same genetic barriers. The “discordant” behavior of some of the common dynamically differentially expressed genes identified may have contributed to the lack of further maturation. One hundred one genes were identified as having discordant behavior in both species of HLC differentiation compared to mouse embryo liver development. Some of the common TFs identified are not known to play a role in liver development, but are involved in the development of heart (HAND1), eye, or lung (SOX9) [63]. Although this may not be surprising, as hepatocytes are derived from a mesendodermal precursor early during development, persistent expression of these TFs may prevent final maturation of HLCs to the level of primary hepatocytes.
Furthermore, TGFβ signaling in conjunction with upregulation of SNAI2 is consistent with epithelial-mesenchymal transition (EMT), which has previously been described as aberrant in HLCs compared with primary human hepatocytes (PHHs) [48]. The opposing trends in the transcript dynamics of these TFs may indicate discordant gene regulation between mouse liver development and HLC differentiation.
The gene expression profile and metabolic activities have led to the notion that HLCs are closer to fetal than to adult hepatocytes [10]. Another study comparing HLCs and human hepatocytes at the transcriptome level revealed 4,000 differentially expressed genes [9]. Our study demonstrates for the first time that HLCs obtained using various protocols and from different sources are all equivalent to E13–E15.5 of mouse development. We could pinpoint the stage of the developmental block by developing a “unified developmental time” that spans species as well as in vitro and in vivo samples.
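One simple way to picture how a sample could be assigned an equivalent embryonic stage is sketched below in Python. It is not the construction of the DT scale used in this study. It assumes that a single summary score (for example, a principal component score) increases monotonically with embryonic day in the reference time course, and it interpolates linearly between reference points. The function name `equivalent_stage` and every number are invented for illustration.

```python
def equivalent_stage(score, ref_days, ref_scores):
    """Map a sample score to a reference time by linear interpolation.

    Assumes ref_scores increases strictly with ref_days. Scores outside the
    reference range are clamped to the first or last reference time.
    """
    if score <= ref_scores[0]:
        return ref_days[0]
    if score >= ref_scores[-1]:
        return ref_days[-1]
    for i in range(1, len(ref_scores)):
        if score <= ref_scores[i]:
            lo_s, hi_s = ref_scores[i - 1], ref_scores[i]
            lo_d, hi_d = ref_days[i - 1], ref_days[i]
            frac = (score - lo_s) / (hi_s - lo_s)
            return lo_d + frac * (hi_d - lo_d)


# Invented reference curve: embryonic day versus summary score.
ref_days = [9.5, 11.5, 13.5, 15.5, 17.5, 19.5]
ref_scores = [-10.0, -6.0, -2.0, 2.0, 6.0, 10.0]

# Invented scores for in vitro samples at successive differentiation days.
sample_scores = {"day 0": -12.0, "day 10": -4.0, "day 20": 0.0, "day 28": 0.5}

stages = {name: equivalent_stage(s, ref_days, ref_scores)
          for name, s in sample_scores.items()}
for name, stage in stages.items():
    print(f"{name}: E{stage:.2f}")
```

With such a mapping, samples whose equivalent stage stops advancing despite longer culture would appear as a plateau on the reference axis. Clamping at the ends of the reference range is a design choice of this sketch; extrapolation beyond the sampled embryonic days would need separate justification.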
Despite the difference in methodology, a large fraction of the genes identified as differentially expressed in the previous studies [9,10] was also identified as dynamically differentially expressed in our study (Supplementary Fig. S12). However, differences among the studies are also seen. Given the different reference of comparison used (time course of embryonic liver development in this study versus primary hepatocytes in [9,10]), some differences in the lists of differentially expressed genes are not surprising. Instead of measuring the maturity of HLCs by the expression level of mature markers, we used the dynamic profile of the differentiation transcriptome. We identified not only the liver markers that should be expressed at high levels but also certain genes that might need to be suppressed.
This study reaffirms that HLCs are in an immature hepatocyte state and, for the first time, identifies the “corresponding stage” during embryonic liver development. Many genes whose expression followed a different pattern from that in the developing embryo might contribute to this block in further maturation. Whether inhibition of mis-expressed TFs and activation of missing TFs will enhance maturation of stem cell-derived hepatocytes remains to be determined. In addition, this meta-analysis pipeline should be applicable to studies of PSC differentiation to other lineages, provided that time course transcriptome data are available for both in vitro differentiation and in vivo development.
Supplementary Material
Acknowledgments
D.C. was supported by the NIH Biotechnology Training Grant (GM08347). C.M.V. was supported through IWT-SBO-HEPSTEM, IWT-SBO-HILIM-3D, and EC-SEURAT-1-HEMIBIO.
Author Disclosure Statement
No competing financial interests exist.
References
- 1. Haridass D, Narain N. and Ott M. (2008). Hepatocyte transplantation: waiting for stem cells. Curr Opin Organ Transplant 13:627–632.
- 2. Roelandt P, Pauwelyn KA, Sancho-Bru P, Subramanian K, Bose B, Ordovas L, Vanuytsel K, Geraerts M, Firpo M, et al. (2010). Human embryonic and rat adult stem cells with primitive endoderm-like phenotype can be fated to definitive endoderm, and finally hepatocyte-like cells. PLoS One 5:e12101.
- 3. Jozefczuk J, Prigione A, Chavez L. and Adjaye J. (2011). Comparative analysis of human embryonic stem cell and induced pluripotent stem cell-derived hepatocyte-like cells reveals current drawbacks and possible strategies for improved differentiation. Stem Cells Dev 20:1259–1275.
- 4. Chamberlain J, Yamagami T, Colletti E, Theise ND, Desai J, Frias A, Pixley J, Zanjani ED, Porada CD. and Almeida-Porada G. (2007). Efficient generation of human hepatocytes by the intrahepatic delivery of clonal human mesenchymal stem cells in fetal sheep. Hepatology 46:1935–1945.
- 5. Qian H, Wang J, Wang S, Gong Z, Chen M, Ren Z. and Huang S. (2006). In utero transplantation of human hematopoietic stem/progenitor cells partially repairs injured liver in mice. Int J Mol Med 18:633–642.
- 6. Basma H, Soto-Gutiérrez A, Yannam GR, Liu L, Ito R, Yamamoto T, Ellis E, Carson SD, Sato S, et al. (2009). Differentiation and transplantation of human embryonic stem cell-derived hepatocytes. Gastroenterology 136:990–999.
- 7. Avital I, Inderbitzin D, Aoki T, Tyan DB, Cohen AH, Ferraresso C, Rozga J, Arnaout WS. and Demetriou AA. (2001). Isolation, characterization, and transplantation of bone marrow-derived hepatocyte stem cells. Biochem Biophys Res Commun 288:156–164.
- 8. Moreno R, Martínez-González I, Rosal M, Nadal M, Petriz J, Gratacós E. and Aran JM. (2012). Fetal liver-derived mesenchymal stem cell engraftment after allogeneic in utero transplantation into rabbits. Stem Cells Dev 21:284–295.
- 9. Godoy P, Schmidt-Heck W, Natarajan K, Lucendo-Villarin B, Szkolnicka D, Asplund A, Björquist P, Widera A, Stöber R, et al. (2015). Gene networks and transcription factor motifs defining the differentiation of stem cells into hepatocyte-like cells. J Hepatol 63:934–942.
- 10. Baxter M, Withey S, Harrison S, Segeritz CP, Zhang F, Atkinson-Dell R, Rowe C, Gerrard DT, Sison-Young R, et al. (2015). Phenotypic and functional analyses show stem cell-derived hepatocyte-like cells better mimic fetal rather than adult hepatocytes. J Hepatol 62:581–589.
- 11. Subramanian K, Owens DJ, Raju R, Firpo M, O'Brien TD, Verfaillie CM. and Hu WS. (2014). Spheroid culture for enhanced differentiation of human embryonic stem cells to hepatocyte-like cells. Stem Cells Dev 23:124–131.
- 12. Takebe T, Sekine K, Enomura M, Koike H, Kimura M, Ogaeri T, Zhang RR, Ueno Y, Zheng YW, et al. (2013). Vascularized and functional human liver from an iPSC-derived organ bud transplant. Nature 499:481–484.
- 13. Soto-Gutiérrez A, Navarro-Alvarez N, Zhao D, Rivas-Carrillo JD, Lebkowski J, Tanaka N, Fox IJ. and Kobayashi N. (2007). Differentiation of mouse embryonic stem cells to hepatocyte-like cells by co-culture with human liver nonparenchymal cell lines. Nat Protoc 2:347–356.
- 14. Du C, Narayanan K, Leong MF. and Wan AC. (2014). Induced pluripotent stem cell-derived hepatocytes and endothelial cells in multi-component hydrogel fibers for liver tissue engineering. Biomaterials 35:6006–6014.
- 15. Shan J, Schwartz RE, Ross NT, Logan DJ, Thomas D, Duncan SA, North TE, Goessling W, Carpenter AE. and Bhatia SN. (2013). Identification of small molecules for human hepatocyte expansion and iPS differentiation. Nat Chem Biol 9:514–520.
- 16. Tasnim F, Phan D, Toh YC. and Yu H. (2015). Cost-effective differentiation of hepatocyte-like cells from human pluripotent stem cells using small molecules. Biomaterials 70:115–125.
- 17. Siller R, Greenhough S, Naumovska E. and Sullivan GJ. (2015). Small-molecule-driven hepatocyte differentiation of human pluripotent stem cells. Stem Cell Rep 4:939–952.
- 18. Takayama K, Inamura M, Kawabata K, Katayama K, Higuchi M, Tashiro K, Nonaka A, Sakurai F, Hayakawa T, Furue MK. and Mizuguchi H. (2012). Efficient generation of functional hepatocytes from human embryonic stem cells and induced pluripotent stem cells by HNF4alpha transduction. Mol Ther 20:127–137.
- 19. Takayama K, Inamura M, Kawabata K, Sugawara M, Kikuchi K, Higuchi M, Nagamoto Y, Watanabe H, Tashiro K, et al. (2012). Generation of metabolically functioning hepatocytes from human pluripotent stem cells by FOXA2 and HNF1alpha transduction. J Hepatol 57:628–636.
- 20. Du Y, Wang J, Jia J, Song N, Xiang C, Xu J, Hou Z, Su X, Liu B, et al. (2014). Human hepatocytes with drug metabolic function induced from fibroblasts by lineage reprogramming. Cell Stem Cell 14:394–403.
- 21. Sancho-Bru P, Roelandt P, Narain N, Pauwelyn K, Notelaers T, Shimizu T, Ott M. and Verfaillie C. (2011). Directed differentiation of murine-induced pluripotent stem cells to functional hepatocyte-like cells. J Hepatol 54:98–107.
- 22. Johnson WE, Li C. and Rabinovic A. (2007). Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 8:118–127.
- 23. Sims AH, Smethurst GJ, Hey Y, Okoniewski MJ, Pepper SD, Howell A, Miller CJ. and Clarke RB. (2008). The removal of multiplicative, systematic bias allows integration of breast cancer gene expression datasets—improving meta-analysis and prediction of prognosis. BMC Med Genomics 1:42.
- 24. Chen C, Grennan K, Badner J, Zhang D, Gershon E, Jin L. and Liu C. (2011). Removing batch effects in analysis of expression microarray data: an evaluation of six batch adjustment methods. PLoS One 6:e17238.
- 25. Nygaard V, Rødland EA. and Hovig E. (2016). Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses. Biostatistics 17:29–39.
- 26. Benito M, Parker J, Du Q, Wu J, Xiang D, Perou CM. and Marron JS. (2004). Adjustment of systematic microarray data biases. Bioinformatics 20:105–114.
- 27. Luo J, Schumacher M, Scherer A, Sanoudou D, Megherbi D, Davison T, Shi T, Tong W, Shi L, et al. (2010). A comparison of batch effect removal methods for enhancement of prediction performance using MAQC-II microarray gene expression data. Pharmacogenomics J 10:278–291.
- 28. DeLaForest A, Nagaoka M, Si-Tayeb K, Noto FK, Konopka G, Battle MA. and Duncan SA. (2011). HNF4A is essential for specification of hepatic progenitors from human pluripotent stem cells. Development 138:4143–4153.
- 29. Roelandt P, Obeid S, Paeshuyse J, Vanhove J, Van Lommel A, Nahmias Y, Nevens F, Neyts J. and Verfaillie CM. (2012). Human pluripotent stem cell-derived hepatocytes support complete replication of hepatitis C virus. J Hepatol 57:246–251.
- 30. Touboul T, Hannan NR, Corbineau S, Martinez A, Martinet C, Branchereau S, Mainot S, Strick-Marchand H, Pedersen R, et al. (2010). Generation of functional hepatocytes from human embryonic stem cells under chemically defined conditions that recapitulate liver development. Hepatology 51:1754–1765.
- 31. Si-Tayeb K, Noto FK, Nagaoka M, Li J, Battle MA, Duris C, North PE, Dalton S. and Duncan SA. (2010). Highly efficient generation of human hepatocyte-like cells from induced pluripotent stem cells. Hepatology 51:297–305.
- 32. Huang P, Zhang L, Gao Y, He Z, Yao D, Wu Z, Cen J, Chen X, Liu C, et al. (2014). Direct reprogramming of human fibroblasts to functional and expandable hepatocytes. Cell Stem Cell 14:370–384.
- 33. Li T, Huang J, Jiang Y, Zeng Y, He F, Zhang MQ, Han Z. and Zhang X. (2009). Multi-stage analysis of gene expression and transcription regulation in C57/B6 mouse liver development. Genomics 93:235–242.
- 34. Heider A. and Alt R. (2013). virtualArray: a R/bioconductor package to merge raw data from different microarray platforms. BMC Bioinformatics 14:75.
- 35. Durinck S, Spellman PT, Birney E. and Huber W. (2009). Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc 4:1184–1191.
- 36. Raju R, Chau D, Cho DS, Park Y, Verfaillie CM. and Hu WS. (2017). Cell expansion during directed differentiation of stem cells toward the hepatic lineage. Stem Cells Dev 26:274–284.
- 37. Huang W, Sherman BT. and Lempicki RA. (2009). Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res 37:1–13.
- 38. Hegedűs T, Gyimesi G, Gáspár ME, Szalay KZ, Gangal R. and Csermely P. (2013). Potential application of network descriptions for understanding conformational changes and protonation states of ABC transporters. Curr Pharm Des 19:4155–4172.
- 39. Huang P, He Z, Ji S, Sun H, Xiang D, Liu C, Hu Y, Wang X. and Hui L. (2011). Induction of functional hepatocyte-like cells from mouse fibroblasts by defined factors. Nature 475:386–389.
- 40. Sekiya S. and Suzuki A. (2011). Direct conversion of mouse fibroblasts to hepatocyte-like cells by defined factors. Nature 475:390–393.
- 41. Nelms BL. and Labosky PA. (2010). Transcriptional Control of Neural Crest Development. Morgan & Claypool Life Sciences, San Rafael, CA.
- 43. Gao N, White P. and Kaestner KH. (2009). Establishment of intestinal identity and epithelial-mesenchymal signaling by Cdx2. Dev Cell 16:588–599.
- 44. McFadden DG, Barbosa AC, Richardson JA, Schneider MD, Srivastava D. and Olson EN. (2005). The Hand1 and Hand2 transcription factors regulate expansion of the embryonic cardiac ventricles in a gene dosage-dependent manner. Development 132:189–201.
- 45. Duboc V. and Logan MP. (2011). Pitx1 is necessary for normal initiation of hindlimb outgrowth through regulation of Tbx4 expression and shapes hindlimb morphologies via targeted growth control. Development 138:5301–5309.
- 46. Bensoussan-Trigano V, Lallemand Y, Saint Cloment C. and Robert B. (2011). Msx1 and Msx2 in limb mesenchyme modulate digit number and identity. Dev Dyn 240:1190–1202.
- 47. Belo J, Krishnamurthy M, Oakie A. and Wang R. (2013). The role of SOX9 transcription factor in pancreatic and duodenal development. Stem Cells Dev 22:2935–2943.
- 48. Casas E, Kim J, Bendesky A, Ohno-Machado L, Wolfe CJ. and Yang J. (2011). Snail2 is an essential mediator of Twist1-induced epithelial mesenchymal transition and metastasis. Cancer Res 71:245–254.
- 49. Fenouille N, Tichet M, Dufies M, Pottier A, Mogha A, Soo JK, Rocchi S, Mallavialle A, Galibert MD, et al. (2012). The epithelial-mesenchymal transition (EMT) regulatory factor SLUG (SNAI2) is a downstream target of SPARC and AKT in promoting melanoma cell invasion. PLoS One 7:e40378.
- 50. Raum JC, Soleimanpour SA, Groff DN, Coré N, Fasano L, Garratt AN, Dai C, Powers AC. and Stoffers DA. (2015). Tshz1 Regulates Pancreatic β-Cell Maturation. Diabetes 64:2905–2914.
- 51. Gadgil M, Mehra S, Kapur V. and Hu WS. (2006). TimeView: for comparative gene expression analysis. Appl Bioinformatics 5:41–44.
- 52. Cain JT, Berosik MA, Snyder SD, Crawford NF, Nour SI, Schaubhut GJ. and Darland DC. (2014). Shifts in the vascular endothelial growth factor isoforms result in transcriptome changes correlated with early neural stem cell proliferation and differentiation in mouse forebrain. Dev Neurobiol 74:63–81.
- 53. Gentles AJ, Newman AM, Liu CL, Bratman SV, Feng W, Kim D, Nair VS, Xu Y, Khuong A, et al. (2015). The prognostic landscape of genes and infiltrating immune cells across human cancers. Nat Med 21:938–945.
- 54. Müller C, Schillert A, Röthemeier C, Trégouët DA, Proust C, Binder H, Pfeiffer N, Beutel M, Lackner KJ, et al. (2016). Removing batch effects from longitudinal gene expression—quantile normalization plus ComBat as best approach for microarray transcriptome data. PLoS One 11:e0156594.
- 55. Miller JA, Ding SL, Sunkin SM, Smith KA, Ng L, Szafer A, Ebbert A, Riley ZL, Royall JJ, et al. (2014). Transcriptional landscape of the prenatal human brain. Nature 508:199–206.
- 56. Ma H, Morey R, O'Neil RC, He Y, Daughtry B, Schultz MD, Hariharan M, Nery JR, Castanon R, et al. (2014). Abnormalities in human pluripotent cells due to reprogramming mechanisms. Nature 511:177–183.
- 57. Rempel E, Hoelting L, Waldmann T, Balmer NV, Schildknecht S, Grinberg M, Das Gaspar JA, Shinde V, Stöber R, et al. (2015). A transcriptome-based classifier to identify developmental toxicants by stem cell testing: design, validation and optimization for histone deacetylase inhibitors. Arch Toxicol 89:1599–1618.
- 58. Downs KM. and Davies T. (1993). Staging of gastrulating mouse embryos by morphological landmarks in the dissecting microscope. Development 118:1255–1266.
- 59. Clancy B, Darlington RB. and Finlay BL. (2001). Translating developmental time across mammalian species. Neuroscience 105:7–17.
- 60. Levin M, Hashimshony T, Wagner F. and Yanai I. (2012). Developmental milestones punctuate gene expression in the Caenorhabditis embryo. Dev Cell 22:1101–1108.
- 61. Chitwood DH, Headland LR, Kumar R, Peng J, Maloof JN. and Sinha NR. (2012). The developmental trajectory of leaflet morphology in wild tomato species. Plant Physiol 158:1230–1240.
- 62. Anavy L, Levin M, Khair S, Nakanishi N, Fernandez-Valverde SL, Degnan BM. and Yanai I. (2014). BLIND ordering of large-scale transcriptomic developmental time courses. Development 141:1161–1166.
- 63. Rockich BE, Hrycaj SM, Shih HP, Nagy MS, Ferguson MA, Kopp JL, Sander M, Wellik DM. and Spence JR. (2013). Sox9 plays multiple roles in the lung epithelium during branching morphogenesis. Proc Natl Acad Sci U S A 110:E4456–E4464.