Abstract
Thousands of transcriptome data sets are available, but approaches for their use in dynamic cell response modelling are few, especially for processes affected simultaneously by two orthogonal influencing variables. We approached this problem for neuroepithelial development of human pluripotent stem cells (differentiation variable), in the presence or absence of valproic acid (signaling variable). Using few basic assumptions (sequential differentiation states of cells; discrete on/off states for individual genes in these states), and time-resolved transcriptome data, a comprehensive model of spontaneous and perturbed gene expression dynamics was developed. The model made reliable predictions (average correlation of 0.85 between predicted and subsequently tested expression values). Even regulations predicted to be non-monotonic were successfully validated by PCR in new sets of experiments. Transient patterns of gene regulation were identified from model predictions. They pointed towards activation of Wnt signaling as a candidate pathway leading to a redirection of differentiation away from neuroepithelial cells towards neural crest. Intervention experiments, using a Wnt/beta-catenin antagonist, led to a phenotypic rescue of this disturbed differentiation. Thus, our broadly applicable model allows the analysis of transcriptome changes in complex time/perturbation matrices.
INTRODUCTION
Early cell fate commitment towards the neural lineage is accompanied by highly dynamic and large-scale transcriptome changes (1–6). Disturbances during this process can alter developmental trajectories permanently. Accordingly, key transcription factors as well as large transcriptome modules, such as gene ontologies and constituents of regulatory circuits, are strongly affected (7–12).
Pluripotent stem cells (PSC) are good models to mimic the in vivo processes during neurodevelopment. Human embryonic stem cells (hESC) and induced pluripotent stem cells (iPSC) are of particular interest, as they allow experimental access to the development of neuroectodermal and later nervous system structures in man. Such neurodevelopmental processes have many species-specific features (13–16), and they could hardly be studied in a systematic way prior to the advent of stem cell technology. Examples of human-specific aspects of brain development range from transposon activity (17) to largely differing time scales and relative steps of wave-like gene activation patterns (18–21).
Prior to the classical neurodevelopmental stages, leading from neural stem cells to brain tissue formation and regional specialization (16,22), early differentiation processes lead from initially pluripotent cells to neuroepithelial progenitor cells (NEP) and various types of neural stem cells (23). Such steps occur in fetuses during the formation and initial folding/closure of the neural plate (24,25). In vitro, the NEP cells form neural rosettes, a correlate of the neural tube (26). A highly standardized method (STOP-tox(UKN) assay) has been developed to track early neural development and its disturbances via analysis of transcriptome changes, epigenome alterations and the capacity to self-organize in rosettes (8,9,12,23,27–29).
Disturbances of the regulatory networks that control the spatio-temporal expression of genes can lead to severe congenital malformations (e.g. spina bifida, anencephaly) (30,31), and they may contribute to neuropsychiatric disorders such as attention deficit/hyperactivity disorder (ADHD), autism spectrum disorders (ASD) or schizophrenia (32,33). Such disturbances can be induced by genetics (34), stress experiences during fetal and early childhood development (35) and environmental influences (drugs, environmental and industrial chemicals, malnutrition) (33,34). Little is known about the biochemical mechanisms that underlie these neurodevelopmental disorders, and how and when transient perturbations, such as by drugs, alter differentiation in a pathological way.
Valproic acid (VPA) may be used to establish bona fide models of clinically-relevant neurodevelopmental disorders. It is an anticonvulsant drug that is administered in epilepsy and mood disorders and it is well known to increase the risk for congenital malformation such as neural tube closure defects and craniofacial malformations (30,31,36–38). Extensive clinical data exist on VPA’s developmental neurotoxicity effects (39), and the relevant plasma concentrations in fetal and maternal blood are known (40,41).
The impact of developmental disturbances depends strongly on the time of onset and duration of the disturbance. In this sense, thalidomide (best known under the brand name Contergan) gained notoriety as there were very small time windows of in utero exposure identified that determined the kind of malformation (42). Therefore, in order to investigate the molecular mechanisms of such disturbances by using differentiating human PSC, it is important to get a better understanding of how these perturbations propagate to dysregulated gene activities and cellular phenotypes.
Mathematical models have been instrumental in disentangling the complexity of regulatory networks in general, and of gene regulation in the context of stem cell differentiation more specifically. Models have been applied to analyze networks on multiple layers. For stem cells, most models focused on the network of key regulators and the long-term dynamics of cell fate decisions (43,44). Those models were able to disentangle the network topology of small networks of key pluripotency factors and have helped to define the logic that governs the stability of and exit from pluripotency or other cell identity transitions (45–47). Such detailed mechanistic models build on extensive mechanistic knowledge and often integrate detailed data from different sources and studies. In contrast, a number of data-driven modeling approaches have been used to model transcriptome changes on a genome-wide scale. Broadly, these models were either used to infer quantitative information about mRNA processing and turnover to describe the response of the transcriptome to acute stimulations, or they have been used to infer hidden variables like transcription factor activity from transcriptome time series data (48–52). So far, these models were only applicable to describe the transcriptome dynamics in response to immediate stimuli or the effect of individual transcription factors on gene expression. Neither detailed mechanistic approaches nor the current transcriptome-wide modeling approaches are optimally suited to disentangle the transcriptome dynamics of a differentiation program, such as the transition from pluripotency to neural progenitor cells. As these transitions occur in a cascade of steps and lead to waves of gene expression, models are needed that consider the dynamic changes under control conditions to evaluate modulations by stressors/toxicants. Moreover, they need to be simple enough to be parameterized from sparse data, such as transcriptome time series.
We therefore developed a coherent modeling framework that allows us to describe and model transcript dynamics during differentiation and to disentangle the effects of perturbation on the transcriptome dynamics. Our model describes time-dependent gene expression in differentiating cells, and it is sufficiently coarse-grained to be applied transcriptome-wide. Using this model, we analyze differences between untreated and disturbed differentiation, where we study the influence of VPA on transcript dynamics as an example. We show that our kinetic model correctly predicts gene expression kinetics at concentrations of VPA not used for training. Additionally, the model provides a mechanistic basis to explain experimentally observed non-monotonic concentration-response relations that were non-intuitive at first. Finally, the model allows us to identify perturbed regulatory modules with high resolution in the time/compound concentration space. With this approach, we find that treatment with VPA leads to a transient upregulation of Wnt signaling, which in turn causes the upregulation of neural crest genes. Our work shows that genome-wide models of transcriptome dynamics during stem cell differentiation may not only help to understand organogenesis/embryogenesis, but will also shed new light on developmental disturbances due to genetic and environmental stressors.
MATERIALS AND METHODS
Materials
Gelatine, putrescine, sodium selenite, progesterone, apotransferrin, glucose, insulin, ascorbic acid, valproic acid and iCRT3 were obtained from Sigma (Steinheim, Germany). Accutase was from PAA (Pasching, Austria). FGF-2 (basic fibroblast growth factor), FGF-8b, Sonic hedgehog and noggin were obtained from R&D Systems (Minneapolis, MN, USA). Y-27632, SB-431542, CHIR99021 and dorsomorphin dihydrochloride were from Tocris Bioscience (Bristol, UK). Matrigel™ was from BD Biosciences (Massachusetts, USA). All cell culture reagents were from Gibco/Invitrogen (Darmstadt, Germany) unless otherwise specified.
Neuroepithelial and rosette differentiation
Human embryonic stem cells (hESC) (H9 from WiCell, Madison, WI, USA) were differentiated according to the protocol published by Chambers and colleagues (24) with modifications established in (8,25). Instead of using 500 μM noggin we used the combination of 35 μM noggin and 600 nM dorsomorphin together with 10 μM SB-431542 for dual SMAD inhibition as described earlier. This was used to prevent BMP and TGF-β signaling, and thus to achieve a highly selective neuroectodermal lineage commitment. For handling details, see supplemental material as described by Balmer et al. (8). Beginning on day of differentiation 4 (DoD4), KSR medium was gradually replaced by N2 medium (DMEM/F12 medium with 1% Glutamax, 0.1 mg/ml apotransferrin, 1.55 mg/ml glucose, 25 μg/ml insulin, 100 μM putrescine, 30 nM selenium and 20 nM progesterone), supplemented with the same amounts of noggin, dorsomorphin and SB-431542 as KSR. All differentiations were performed in six-well plates containing 2 ml of medium per well. For some experiments, cells were further differentiated until they formed neural rosettes. In detail, cells were differentiated until DoD10 as described above or in Chambers et al. (24). On DoD11, cells were detached using Accutase and seeded at a density of 150 000 cells/cm² onto Matrigel-coated 96-well plates or glass cover slips. Cells were grown in N2S medium supplemented with 20 ng/ml FGF2, 100 ng/ml FGF8, 20 ng/ml sonic hedgehog, 20 μM ascorbic acid and 10 μM ROCK inhibitor Y-27632. On DoD13, the ROCK inhibitor was removed. Cells were fixed for immunostaining at DoD15 after rosettes had formed (26).
Experimental exposure
Developing cells were treated from DoD0–DoD6 with different concentrations of valproic acid (0.025, 0.15, 0.35, 0.45, 0.55, 0.65, 0.8 and 1 mM) or the Wnt activator CHIR99021 (2 μM). For rescue experiments with the β-catenin inhibitor iCRT3 (Sigma), cells were (co)treated from DoD2–DoD6.
Immunostaining
For immunostaining, cells were fixed on DoD6 or DoD15 in 4% paraformaldehyde and 2% sucrose prior to permeabilization in 0.3% Triton X-100 in PBS. After blocking in PBS containing 5% bovine serum albumin and 0.1% Tween-20 for 1 h, primary antibodies (Supplementary Figure S1) were incubated for 1 h, at room temperature (RT). After washing, secondary antibodies were applied for 45 min at RT. DNA was stained with Hoechst H-33342, and cover slips were mounted in FluorSaveTM reagent (Calbiochem, Merck).
Rosette formation assay
The rosette formation assay was performed in 96-well plates. The number of rosettes was assessed in six technical replicates for control and three technical replicates for treatments. Immunocytochemical staining of golgi matrix protein (GM130) and zona occludens 1 (ZO1), together with DNA staining by Hoechst H-33342, was used to identify rosettes. ZO1 is located in the centre of a rosette, surrounded by GM130. Images of the whole well were taken with a Cellomics (Thermo Fisher Scientific) automated microscope. Using Konstanz information miner software (KNIME) (53), rosettes were detected and counted. The number of rosettes per well was normalized to the total nuclear area. Data are always displayed relative to an untreated control (described in detail in (26)).
Sample preparation for microarray analysis and quantitative PCR
Cells were lysed in Trizol reagent (Qiagen) at the indicated time points. mRNA was isolated as advised by the manufacturer's protocol, and an aliquot of 1 μg was reverse transcribed in a reaction mix of 20 μl (iScript, Biorad; the cDNA synthesis kit works with high sensitivity in this range: standard curve r = 0.998, efficiency = 96.5%). The reaction mix was diluted 1:5 in water. Quantitative PCR was performed in 10 μl samples using EVAGreen SsoFast™ mix (5 μl EVAGreen mix, 3.6 μl water, 0.4 μl primer (final concentration: 200 nM), 1 μl cDNA (0.01 μg)) on a BioRad Light Cycler (Biorad, Germany). For quantification, qPCR threshold cycles were normalized to reference genes [TATA-box binding protein (TBP) and ribosomal protein L13a (RPL13A)] (Primer list: Supplementary Figure S2A). Data were displayed either relative to expression in stem cells or, for cells treated with chemicals, relative to transcript levels of untreated control cells (which had been grown and differentiated for the same amount of time). For this normalization, the 2^−ΔΔC(T) method was used (54). Primer specificity was assessed by agarose gel electrophoresis, and melting curves are provided (Supplementary Figure S2B). A standard curve was generated for each pair of primers and efficiency was measured (Supplementary Figure S2 A and C).
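The normalization described above can be illustrated with a minimal Python sketch. It assumes a PCR efficiency of 100% (a doubling per cycle) and that the two reference genes are combined by averaging their threshold cycles; both are simplifying assumptions for illustration, and the function names and Ct values are hypothetical.

```python
from statistics import mean


def delta_ct(ct_target, ct_references):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return ct_target - mean(ct_references)


def fold_change_ddct(ct_target_sample, ct_refs_sample,
                     ct_target_control, ct_refs_control):
    """Relative expression of sample versus control as 2^(-ddCt)."""
    ddct = (delta_ct(ct_target_sample, ct_refs_sample)
            - delta_ct(ct_target_control, ct_refs_control))
    return 2.0 ** (-ddct)


# Hypothetical threshold cycles: target gene plus two reference genes
# (standing in for TBP and RPL13A) in treated and untreated cells.
fold_change = fold_change_ddct(24.0, [20.0, 22.0], 26.0, [20.0, 22.0])
print(fold_change)
```

In this toy case the target crosses the threshold two cycles earlier in the treated sample while the reference genes are unchanged, which corresponds to a four-fold higher transcript level.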
Affymetrix chip-based DNA microarray analysis (Human Genome U133 plus 2.0 arrays) was performed exactly as described earlier (12,55). Expression kinetics for unperturbed differentiation were measured in quadruplicates at the following 12 time points: 0, 6, 10, 16, 24, 36, 48, 60, 72, 96, 120, 144 h. Expression kinetics at 0.6 mM VPA were obtained in quadruplicates at the time points 72 and 144 h and in duplicates at 24, 48 and 96 h. Transcriptomes at 144 h were obtained for the window treatments with 0.6 mM VPA as follows: treatment from 24 to 72 h in triplicates, treatment from 24 to 96 h in quadruplicates and treatment from 72 to 144 h in triplicates. One sample for the window treatment from 72 to 144 h had to be removed because of apparent mislabeling. Concentration-dependent transcriptomes were obtained at 144 h in triplicates at eight VPA concentrations (0.025, 0.15, 0.35, 0.45, 0.55, 0.65, 0.8 and 1 mM) and were published earlier (27).
Data preparation
Gene expression data were normalised using Bioconductor and RMA as described in (12). For each condition, we computed the mean expression of each gene. Empirical Bayesian estimates of the standard deviations were computed using the ebayes function in the R package limma for each condition separately. These estimates were subsequently divided by the square root of the number of replicates for each condition to obtain the standard error of the mean.
Batch effect correction
As the untreated kinetics and the kinetics obtained from cells treated with 0.6 mM VPA were obtained in different batches (first and second batch), we performed batch effect corrections. For this, we used untreated samples at the time points 0, 72 and 144 h, where we computed the difference in expression between batches for each gene at the three time points. These differences were linearly interpolated to obtain b(t), the time-dependent batch correction. Finally, for each gene and time point, expression in the VPA-treated kinetic samples was transformed according to
expr′(t) = expr(t) − b(t),
where expr(t) denotes the measured gene expression at time point t and expr′(t) is the batch corrected gene expression.
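The interpolation step can be sketched in Python as follows. The sketch assumes that b(t) is defined as expression in the second batch minus expression in the first batch and that the correction is applied by subtraction; this sign convention, the function names and the offset values are assumptions made for illustration.

```python
def interpolate_batch_offset(t, anchor_times, anchor_offsets):
    """Piecewise-linear b(t) through the per-gene batch differences
    measured at the anchor time points (here 0, 72 and 144 h)."""
    if t <= anchor_times[0]:
        return anchor_offsets[0]
    if t >= anchor_times[-1]:
        return anchor_offsets[-1]
    for i in range(len(anchor_times) - 1):
        t0, t1 = anchor_times[i], anchor_times[i + 1]
        if t0 <= t <= t1:
            b0, b1 = anchor_offsets[i], anchor_offsets[i + 1]
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)


def batch_correct(expr, t, anchor_times, anchor_offsets):
    """Assumed convention: subtract the interpolated offset b(t)."""
    return expr - interpolate_batch_offset(t, anchor_times, anchor_offsets)


# Hypothetical offsets (second batch minus first batch) for one gene.
times = [0.0, 72.0, 144.0]
offsets = [0.3, 0.1, -0.1]
corrected = batch_correct(5.0, 36.0, times, offsets)
print(corrected)
```

Time points of the treated kinetics that lie between the anchors (24, 48 and 96 h) receive an interpolated offset, while 72 and 144 h use the measured difference directly.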
Selection of regulated genes
First, genes present on the microarray were restricted to protein coding genes using the Ensembl database (Ensembl 84 from March 2016) through the R package biomaRt. To identify regulated genes, the variation of the replicates of single time points was compared to the variation with respect to time using the Hotelling T2 statistic obtained with the mb.long function from the R package timecourse (www.bioconductor.org). For a description of the Hotelling T2 statistic see Tai and Speed (56). The four replicates for each of the 12 time points of the untreated kinetic served as input to mb.long. The 1500 genes with the top T2 statistic were labeled as regulated.
Model and cost function
The dynamic model consists of a linear chain of n states with transition rates gon and goff. The rate gon is the on-rate of each state and the off-rate of its preceding state. Only the final state n has a different off-rate, goff. This model can be analytically solved (see supplemental text S1 for details) to obtain the quantity Act(t,n,gon,goff), the fraction of cells in the active state. To model gene expression kinetics, we also introduce a time shift dt and the parameters A and B as the expression in states 0…n − 1 and n, respectively. Then the gene expression Ep is given by Ep(t, A, B, gon, goff, n, dt) = A + B Act(t + dt, n, gon, goff). To prevent non-defined values when computing the logarithm, 1 + Ep is used instead of Ep in the cost function. This amounts to a redefinition of the parameter A.
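As an illustration of how such a chain behaves, the following Python sketch evaluates the active fraction under two assumptions that are not taken from supplemental text S1: n is an integer number of inactive states, and all cells occupy the first state at t = 0. The waiting time to reach the active state is then Erlang-distributed, and its convolution with exponential decay is evaluated as a power series with positive terms. The function names active_fraction and expression are hypothetical.

```python
import math


def active_fraction(t, n, g_on, g_off):
    """Fraction of cells in the active state at time t.

    Assumes integer n >= 1 inactive states, each left at rate g_on, one
    active state left at rate g_off, and all cells in the first state at
    t = 0.
    """
    if t <= 0:
        return 0.0
    c = g_on - g_off
    x = abs(c) * t
    # Series with positive terms only, to avoid numerical cancellation:
    # c >= 0: exp(-g_on*t)  * (g_on*t)^n * sum_j x^j / (n+j)!
    # c <  0: exp(-g_off*t) * (g_on*t)^n * sum_j x^j / ((n-1)! (n+j) j!)
    term = 1.0 / math.factorial(n)  # j = 0 term, identical in both cases
    total = 0.0
    j = 0
    while True:
        total += term
        j += 1
        if c >= 0:
            term *= x / (n + j)
        else:
            term *= x * (n + j - 1) / ((n + j) * j)
        if term < 1e-17 * total:
            break
    decay = g_on if c >= 0 else g_off
    return math.exp(-decay * t) * (g_on * t) ** n * total


def expression(t, A, B, g_on, g_off, n, dt):
    """Ep = A + B * Act(t + dt, n, g_on, g_off)."""
    return A + B * active_fraction(t + dt, n, g_on, g_off)


print(expression(36.0, 2.0, 5.0, 0.1, 0.05, 3, 0.0))
```

For n = 1 the series reduces to the familiar two-exponential expression g_on/(g_on − g_off)·(exp(−g_off·t) − exp(−g_on·t)), which provides a simple check. Non-integer n, as permitted by the parameter bounds used for fitting, is not covered by this sketch.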
To model the dependency of the rates gon(VPA) and goff(VPA) on VPA, we use Hill functions of the form g(VPA, k, n) = g + (g′ − g)·VPA^n/(VPA^n + k^n) for both rates. The full model thus contains the additional parameters gon′, goff′, k1, k2, n1 and n2.
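A short Python sketch of this Hill-type dependency is given below. The function name hill_rate and the parameter values are hypothetical; the formula is the one stated above, with g the rate without VPA and g′ the rate at saturating VPA.

```python
def hill_rate(vpa, g, g_prime, k, n):
    """Rate as a function of VPA concentration.

    g: rate without VPA; g_prime: rate at saturating VPA;
    k: concentration of half-maximal change; n: Hill exponent.
    """
    if vpa <= 0:
        return g
    x = vpa ** n
    return g + (g_prime - g) * x / (x + k ** n)


# Hypothetical parameters for one gene; concentrations in mM.
for conc in (0.0, 0.15, 0.6, 1.0):
    print(conc, hill_rate(conc, g=0.10, g_prime=0.40, k=0.5, n=2))
```

Because the Hill function is monotonic in the VPA concentration, any non-monotonic concentration-response of the expression level has to arise from the interplay of the on-rate and off-rate curves rather than from a single rate.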
For each gene and each condition i, we compute the mean Em,i of the measured expression and the standard error of the mean Si as well as Ep,i, the expression predicted by the model for condition i. The cost function used for fitting the parameters is obtained by summing (Em,i − Ep,i)^2/Si^2 over all data points from the untreated and treated kinetic indexed by i. For the concentration-response data at t = 144 h the sum (Fm,i − Fp,i)^2/Si^2 over all data points indexed by i is computed. Here, Fm,i is the measured fold change with respect to the time point 144 h in the untreated kinetic and Fp,i is the predicted fold change with respect to the predicted expression for time 144 h, untreated. The overall cost function is the sum of both contributions.
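The structure of this cost function can be sketched in a few lines of Python. The sketch takes measured values, predictions and standard errors as plain lists and leaves any transformation of the values (for instance to a logarithmic scale) to the caller; the function names and numbers are hypothetical.

```python
def chi_square(measured, predicted, sem):
    """Sum of squared residuals, each weighted by its standard error."""
    return sum(((m - p) / s) ** 2 for m, p, s in zip(measured, predicted, sem))


def total_cost(kin_measured, kin_predicted, kin_sem,
               fc_measured, fc_predicted, fc_sem):
    """Kinetic contribution plus concentration-response contribution."""
    return (chi_square(kin_measured, kin_predicted, kin_sem)
            + chi_square(fc_measured, fc_predicted, fc_sem))


# Hypothetical values for one gene.
cost = total_cost([1.0, 2.0], [1.0, 3.0], [1.0, 0.5],  # kinetics
                  [2.0], [1.0], [1.0])                 # fold changes at 144 h
print(cost)
```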
Parameter estimation
We estimated the parameters by optimizing the difference between data and model simulations weighted by the standard-deviation (see above). Because of the sensitivity of optimization results with respect to starting parameters, the often shallow cost landscape and the influence of the details of the optimization algorithm, optimization is performed in three iterations. In the first iteration, the microarray data with the original expression standard deviation computed with the limma package is used. In the second and third iteration, the standard deviation is corrected by a factor determined from the residuals of the first iteration fits. The first and second iteration each start with fitting only the data from untreated cells and subsequently fitting the full data set to improve convergence of the optimizer in the 12-dimensional parameter space of the full model. The third iteration employs different optimization algorithms to improve on the fit determined in the second iteration. We used the analytical solution computed for the model for all optimization algorithms and the analytical gradient function for algorithms that make use of a gradient (see supplemental text S1 and S2 for details).
For all fits, optimization parameters are restricted to the intervals 10^−5 < gon′ < 10, 10^−5 < goff′ < 10, 10^−2 < k1 < 2, 1 < n1 < 4, 10^−2 < k2 < 2, 1 < n2 < 4, 1 < A < 6 × 10^4, 1 < B < 10^5, 10^−5 < gon < 3, 10^−5 < goff < 3, 1 < n < 20, 10^−2 < dt < 100. In the following we list the optimization algorithms and starting values used for the different iterations.
First iteration, fit of parameters A, B, n, gon, goff, dt to untreated data: optimizer GenSA from the R package GenSA with initial values A = min(e), B = max(e), gon = 0.1, goff = 0.01, n = 1, dt = 0, where e is the vector of expression values in linear units for the respective gene, and optimizer option max.time = 200.
First iteration, fit of all parameters to the full data set: optimizer L-BFGS-B from the R function optim with 1000 sets of random starting parameters and optimizer options fnscale = 1, factr = 1e7, maxit = 1000. Random parameters are generated in the interval (1/5⋅par, 5⋅par), where par is the optimized parameter from the untreated fits, using the Latin hypercube sampling implemented in the function randomLHS from the package lhs. The parameters gon′, goff′, k1, k2, n1 and n2, which are not determined in the fit to untreated data, are either randomly drawn from the entire allowed interval (k1, k2, n1, n2) or drawn from the interval determined by gon and goff, respectively (gon′, goff′).
Second iteration, fit of parameters A, B, n, gon, goff, dt to untreated data with rescaled standard deviation: as in the first iteration but with option max.time = 1000.
Second iteration, fit of all parameters to the full data set with rescaled standard deviation: as in the first iteration with 1000 sets of random start parameters but with options fnscale = 1e6, factr = 1e2, maxit = 2e4.
Third iteration, fit of all parameters to the full data set with rescaled standard deviation: optimization algorithms GenSA and L-BFGS-B as described above as well as LM (package minpack.lm) and NMKB (package dfoptim) are used with 100 sets of random starting parameters. Random parameters are picked from the interval (0.999 par, 1.001 par). Optimizer options are set as follows. GenSA: max.time = 200; L-BFGS-B: fnscale = 1e6, factr = 1e2, maxit = 2e4; LM: ftol = 1e–25, ptol = 1e–16, gtol = 1e–15, maxiter = 2e4; and standard options for NMKB.
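The multi-start strategy relies on Latin hypercube sampling of starting values around a previous optimum. The following Python sketch shows the idea with a hand-written sampler rather than the lhs package; it assumes that the unit-interval samples are mapped linearly onto (par/factor, factor·par), which is one possible reading of the procedure. All names and values are hypothetical.

```python
import random


def latin_hypercube(n_samples, n_dims, rng):
    """Unit-cube sample with exactly one point per stratum and dimension."""
    columns = []
    for _ in range(n_dims):
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


def start_values(best, n_samples, factor=5.0, seed=0):
    """Starting parameter sets in (par / factor, factor * par) per parameter."""
    rng = random.Random(seed)
    names = list(best)
    starts = []
    for row in latin_hypercube(n_samples, len(names), rng):
        starts.append({
            name: best[name] / factor
            + u * (best[name] * factor - best[name] / factor)
            for name, u in zip(names, row)
        })
    return starts


# Hypothetical optimum from a fit to untreated data.
best_fit = {"g_on": 0.1, "g_off": 0.01, "dt": 2.0}
starts = start_values(best_fit, n_samples=10)
print(starts[0])
```

Each starting set would then be handed to a bounded local optimizer, and the fit with the lowest cost retained. Choosing factor = 1.001 gives the narrow interval used for the final refinement.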
Finally, if a set of parameters yielding a better fit of the model to the data was encountered for a gene during the profile likelihood analysis (see below), then this set was adopted.
Parameter identifiability
To analyse parameter identifiability, we performed a profile likelihood (PL) analysis as described in Kreutz et al. (57). First, optimal parameter sets were reoptimized using the L-BFGS-B method to ensure a local optimum as the start of the PL procedure. Parameters were increased and decreased starting from the optimal value with a fixed step size until the cost had increased by more than the 95% quantile of the χ2-distribution with one degree of freedom or until the upper or lower parameter boundary was reached. The minimal and maximal parameter values determined in this way yield the 95% parameter confidence interval. Further details of the PL procedure are described in the supplemental information (supplemental text S3).
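The stepping rule can be sketched in Python as follows. The sketch assumes that the caller supplies a function profile_cost(value) returning the cost with the profiled parameter fixed at value and all other parameters re-optimized; that re-optimization is not shown. The function names and the quadratic toy cost are hypothetical.

```python
# 95% quantile of the chi-square distribution with one degree of freedom.
CHI2_95_DF1 = 3.841458820694124


def profile_interval(profile_cost, best_value, step, lower, upper):
    """Scan away from the optimum in fixed steps until the cost has risen
    by more than the threshold or a parameter boundary is reached."""
    base = profile_cost(best_value)

    def scan(direction):
        k = 0
        while True:
            candidate = best_value + direction * (k + 1) * step
            if candidate < lower:
                return lower
            if candidate > upper:
                return upper
            if profile_cost(candidate) - base > CHI2_95_DF1:
                # Last value that was still below the threshold.
                return best_value + direction * k * step
            k += 1

    return scan(-1), scan(+1)


def toy_cost(value):
    """Hypothetical profile: optimum at 2.0 with standard error 0.5."""
    return ((value - 2.0) / 0.5) ** 2


low, high = profile_interval(toy_cost, 2.0, 0.01, 0.0, 10.0)
print(low, high)
```

For the quadratic toy cost the interval approximates 2.0 ± 1.96 × 0.5, as expected for a Gaussian likelihood. A profile that never exceeds the threshold before a boundary is reached indicates a parameter that is not identifiable in that direction.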
GSEA
GSEA was performed using the HTSAnalyzeR package for R. As reference gene sets for the GSEA we used gene sets from the MSigDB v6.0 (gsea-msigdb.org), containing manually curated gene sets from pathway databases as well as published experiments (all curated sets, c2.all.v6.0.entrez.gmt and KEGG sets, c2.cp.kegg.v6.0.entrez.gmt). For all curated sets and KEGG sets, we used a minimum gene set overlap of 20 genes for computing significance. The FDR for each gene set was computed from 10 000 permutations. The GSEA exponent was set to 1.
Pathway gene sets
Gene sets for Figure 4C–F were obtained as follows. Neural crest signature genes were taken from the MSigDB (see above) gene set named ‘LEE_NEURAL_CREST_STEM_CELL_UP’. TGFbeta pathway genes were taken from the MSigDB gene set named ‘KEGG_TGF_BETA_SIGNALING_PATHWAY’.
Wnt pathway target genes were obtained from (58) through the GEO database (GSE45223). Robust estimates of the standard deviation were derived using an error model that was parameterised as follows. The empirical standard deviation for each gene and time point was computed using replicates. The data was then divided into 15 equally sized bins based on mean expression, and the average empirical standard deviation was computed within each of the 15 bins. For bins with expression <10, the standard deviation was set to the maximal value. The resulting values were then interpolated using the R function loess with a span of 0.75. Wnt induced genes were obtained using the fold change between samples ‘WA09 embryonic stem cells, NC, day 3’ and ‘WA09 embryonic stem cells, DSi, day 3’ with a z-value threshold of 3.5. Wnt repressed genes were obtained using the fold change between samples ‘WA09 embryonic stem cells, DSi, day 3’ and ‘WA09 embryonic stem cells, NC, day 3’ with a z-value threshold of 3.5. The z-value was defined as mean expression divided by the loess interpolated standard deviation.
Gene ordering in heatmaps
In order to sort genes by the time of peak expression, gene expression values were interpolated to a resolution of 0.1 h using the R function loess with a span of 0.8. Subsequently, the time point of the maximum value of the interpolated expression was determined for each gene.
UMAP
Dimensional reduction of the transcriptome data for the 1500 top regulated genes was performed using the umap R package with parameters n_neighbors = 4, n_epochs = 4000 and random_state = 798395.
RESULTS
Heterogeneity of gene regulation during early neural differentiation
We used an in vitro system where human pluripotent stem cells (hPSC) are differentiated according to the dual SMAD inhibition neural induction protocol to analyze gene expression dynamics. We first obtained transcriptome profiles at twelve time points during the first six days of differentiation. To trigger a defined developmental disturbance (8,9,12,59), we treated cells with 0.6 mM VPA and obtained the transcriptome samples after 1, 2, 3, 4 and 6 days (Figure 1A). A 2D Uniform Manifold Approximation and Projection (UMAP) representation was chosen to visualize transcriptome changes over time. The progressive gene expression changes over time were clearly visualized by this approach. Compared to the trajectory of unperturbed cells, VPA-treated cells were clearly offset (Figure 1B). As a first overview of gene expression dynamics at the individual gene level, the 1500 top regulated transcripts were ordered by the time of their peak expression. The gene expression heatmap clearly showed that many genes had their highest expression at the start or end of differentiation, but also that a sizeable number peaked at intermediate times (n = 810) (Figure 1C). The sequential peaking of groups of genes is typical for a highly coordinated sequence of neurodevelopmental steps. It may be explained by waves of gene activation programs.
The observed pattern is also consistent with the sequential occurrence of cell states that express particular genes. We observed that the timing of peak expression remained broadly unaltered in VPA-treated cells, although a detailed comparison of the patterns showed individual genes that shifted their time course of expression (Figure 1C). To follow up on the latter finding, we examined whether VPA-affected genes generally peaked later or earlier than genes not significantly affected by VPA. However, transcripts most strongly affected by the drug (log fold changes > 2) showed a peak time distribution that did not differ from little-affected genes (Figure 1D). This suggests that VPA does not selectively affect genes induced at a specific time point in development. The above analysis of the overall expression pattern showed that VPA-affected genes do not peak at a particular point in time and that the number of genes peaking at a given time is not affected by the drug. However, for individual genes we noticed that VPA treatment can cause dramatic changes in expression, concerning the timing as well as the extent of regulation (Figure 1E).
The complexity of the transcriptome modulation by VPA treatment was our incentive to develop a mathematical model of gene expression dynamics that would reflect the observed expression changes over time. Our model assumes that each cell transits through different states during differentiation, and that each gene is expressed only in one specific state (Figure 1F). Application of the model to cell population averages implies that the gene expression data it predicts are determined by how many cells are in a state where the gene is in the ‘switched-on’ state. To make the model mathematically tractable, we based it on three assumptions. (i) For each gene, the cell has to transit through n prior states before it enters the state n + 1 where the gene is activated. (ii) Cells transit through the first n states with a rate gon, and they leave the active state with a rate goff. (iii) For each gene, the parameters n, gon and goff can be different, i.e. each gene is modeled separately (Figure 1F).
The basic working of the model may be exemplified for an n = 3 situation, in which the gene of interest is expressed in state 4 (final state) (Figure 1F, bottom, left). On the population level, an increase of gon in this situation would lead to earlier and more peaked gene expression (Figure 1F, top, right). An increase of goff would lead to an overall attenuated expression (Figure 1F, bottom, right). An intuitive explanation for this is that few cells ever occupy state 4 at the same time, because the off-rate goff is high compared to the on-rate gon. To allow the model to also fit genes that decrease monotonically in expression, we additionally introduced a time shift parameter dt. This parameter shifts the peak to the left so that the data can exclusively be fitted to the decreasing flank of the peak.
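The qualitative behaviour described for the n = 3 example can be reproduced with a small numerical simulation. The Python sketch below integrates the occupancy of a linear chain with a simple forward Euler scheme; the rate values, the step size and the function names are hypothetical choices for illustration and do not correspond to fitted parameters.

```python
def simulate_active_fraction(n, g_on, g_off, t_end, h=0.01):
    """Forward Euler integration of the linear chain.

    States 0..n-1 are inactive and left at rate g_on; state n is the
    active state and is left at rate g_off. All cells start in state 0.
    Returns the time points and the fraction of cells in the active state.
    """
    p = [1.0] + [0.0] * n
    times, active = [0.0], [0.0]
    for i in range(1, int(round(t_end / h)) + 1):
        new = p[:]
        new[0] -= h * g_on * p[0]
        for s in range(1, n):
            new[s] += h * g_on * (p[s - 1] - p[s])
        new[n] += h * (g_on * p[n - 1] - g_off * p[n])
        p = new
        times.append(i * h)
        active.append(p[n])
    return times, active


def peak(times, active):
    """Time and height of the maximal active fraction."""
    i = max(range(len(active)), key=active.__getitem__)
    return times[i], active[i]


base = peak(*simulate_active_fraction(3, 0.10, 0.05, 144.0))
faster_on = peak(*simulate_active_fraction(3, 0.20, 0.05, 144.0))
faster_off = peak(*simulate_active_fraction(3, 0.10, 0.20, 144.0))
print(base, faster_on, faster_off)
```

With these values, doubling the on-rate moves the peak to an earlier time and raises it, whereas a higher off-rate lowers the peak, in line with the behaviour described for Figure 1F.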
Modeling of dynamic transcriptome disturbances by VPA
In order to extend the model to VPA-treated cells, additional data were required. We used transcriptome data published earlier (28), in which the effect of eight additional VPA concentrations (from 25 μM to 1 mM) was measured after 6 days of differentiation (Figure 2A). When we inspected concentration-response curves for different genes in this data set, we noticed a high heterogeneity: for example gene expression was perturbed at different VPA concentration thresholds (NEFL is very sensitive while HPGD is only perturbed at high concentrations of VPA). Also, we found the deregulation to reach a saturation (SIX3) or not (KLF5). Finally, we noticed that some genes showed a non-monotonic concentration-response (e.g. GAD1 or NR2F2) (Figure 2B).
In order to set up a model for the VPA effect that reflects the molecular features of the neurodevelopmental program, we reasoned that VPA was unlikely to affect the number of cell states. We assumed the drug to alter the parameters gon and goff. Therefore, we made gon and goff monotonic functions of the VPA concentration (modelled using a Hill function). The resulting model was then fitted (i) to the expression data at t = 144 h for different VPA concentrations and (ii) to the time series data for unperturbed differentiation and differentiations at 0.6 mM VPA (Figure 2C). This procedure was applied for each gene, i.e. training each model with 12 parameters on data from 86 samples for 25 conditions. Example fits are shown for the genes PAX6 and LGI1 (Figure 2D). The plots for the input functions for gon and goff illustrate how different concentration dependencies of the rates gon and goff can lead to distinct concentration-response behavior at the endpoint. We found that the multi-step model with VPA concentration-dependent rates correctly described the observed changes in gene expression amplitude and timing caused by VPA treatment (Figure 2D).
To judge the general ability of the model to fit the kinetic and concentration-response data, we computed the correlation between data and fits. We found a correlation coefficient larger than 0.9 for 88% of the 1500 fitted genes for the unperturbed differentiation. For the concentration-response data, we compared the fits to the data for genes with a minimum log2 fold change of 1 at the endpoint. For this group of 557 genes, 95% had a correlation coefficient higher than 0.9. We concluded that the model correctly describes expression patterns for the large majority of regulated genes in the presence and absence of VPA.
Since the model describes peaked gene expression kinetics, parameter identifiability will in general depend on the peak location. For monotonically decreasing or increasing genes, where the peak is outside of the observed time frame, we expect gon and goff to be less identifiable than for genes peaking at intermediate times. To analyse parameter identifiability, we performed a profile likelihood analysis, which allows the computation of parameter confidence intervals (57), exemplified for the gene POU4F1 in Supplementary Figure S6. As expected, gon, goff, dt and n are best defined for genes peaking at intermediate times (Supplementary Figure S7). The considerable uncertainty in these parameters even for genes with a clearly defined peak is also due to structural non-identifiabilities. For instance, a shift in the peak location to earlier times by increasing gon can be offset by increasing n. Peak times, which are computed from a combination of parameters, are better identifiable (Supplementary Figure S8). For genes peaking at intermediate times, the uncertainty in the peak time is comparable to the time resolution determined by the transcriptome samples.
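The principle of a profile likelihood can be shown on a deliberately simple toy problem. The sketch below does not use the multi-step model; it assumes a single exponential decay y = a·exp(−b·t) with Gaussian errors of known size, profiles the rate b on a grid, and re-optimizes the amplitude a analytically at each grid point. All data and parameter values are hypothetical.

```python
import math


def profile_chi2(b, times, values, sigma):
    """Chi-square at fixed rate b, minimized analytically over amplitude a.

    For Gaussian errors, chi-square equals -2 log-likelihood up to a constant.
    """
    basis = [math.exp(-b * t) for t in times]
    a_best = sum(y * e for y, e in zip(values, basis)) / sum(e * e for e in basis)
    return sum(((y - a_best * e) / sigma) ** 2 for y, e in zip(values, basis))


# Hypothetical data: y = 2 * exp(-0.1 * t) plus small fixed deviations.
times = [2.0 * i for i in range(11)]
deviations = [0.05, -0.03, 0.02, -0.04, 0.01, 0.03, -0.02, 0.0, 0.02, -0.01, 0.01]
values = [2.0 * math.exp(-0.1 * t) + d for t, d in zip(times, deviations)]
sigma = 0.05  # assumed known measurement error

# Profile the rate b on a grid from 0.05 to 0.20.
grid = [0.05 + 0.001 * i for i in range(151)]
profile = [profile_chi2(b, times, values, sigma) for b in grid]
chi2_min = min(profile)
b_best = grid[profile.index(chi2_min)]

# 95% interval: grid values whose profile lies within 3.84 of the minimum
# (0.95 quantile of the chi-square distribution with one degree of freedom).
inside = [b for b, c in zip(grid, profile) if c - chi2_min <= 3.84]
ci_low, ci_high = min(inside), max(inside)
print(b_best, ci_low, ci_high)
```

A flat profile that never crosses the threshold on one or both sides would indicate a non-identifiable parameter, which is the situation expected for rates of genes whose peak lies outside the observed time frame.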
In order to characterize the overall effect of VPA on regulated genes, we computed the change in peak-to-trough expression (amplitude shift) and the change in peak timing (time shift) (Figure 2E) from the model fits. We observed a bias towards a decreased amplitude upon VPA treatment (68% of all regulated genes) (Figure 2F). This bias was retained when restricting the analysis to genes with a significant change in amplitude (Supplementary Figure S9). When plotting the change in peak timing against the average of the peak time in the untreated and VPA-treated conditions, we noticed an average peak time shift towards later times for genes that peak intermediately (Figure 2G). When restricting the analysis to genes with a significant peak time shift, the average delay became more pronounced (Supplementary Figure S10). The model fits thus indicate that expression changes over time are mostly attenuated and delayed by VPA. Taken together, we established computational models for hundreds of genes that capture gene expression kinetics during differentiation and model the influence of VPA on the kinetics.
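The two summary quantities can be sketched in a few lines of Python. The sketch assumes that both curves are sampled on the same time grid, that amplitude is the peak-to-trough difference, and that shifts are reported as simple differences (treated minus untreated); the exact definitions used for Figure 2E–G may differ, and the curves below are hypothetical.

```python
def peak_and_amplitude(times, expression):
    """Return the time of maximal expression and the peak-to-trough amplitude."""
    peak_index = max(range(len(expression)), key=expression.__getitem__)
    return times[peak_index], max(expression) - min(expression)


def vpa_shifts(times, untreated, treated):
    """Time shift and amplitude shift of a treated curve relative to control."""
    t_ctrl, amp_ctrl = peak_and_amplitude(times, untreated)
    t_vpa, amp_vpa = peak_and_amplitude(times, treated)
    return {"time_shift_h": t_vpa - t_ctrl, "amplitude_shift": amp_vpa - amp_ctrl}


# Hypothetical simulated curves (log2 expression) on a 24 h grid.
times_h = [0, 24, 48, 72, 96, 120, 144]
untreated = [0.0, 1.0, 2.5, 3.0, 2.0, 1.0, 0.5]
treated = [0.0, 0.5, 1.2, 1.8, 2.0, 1.5, 1.0]

print(vpa_shifts(times_h, untreated, treated))
```

In this toy case the treated curve peaks 24 h later and with a smaller amplitude, i.e. the pattern described above as attenuated and delayed.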
Explanation of non-monotonic concentration-response curves by the gene expression model
We set out to validate the gene expression kinetics predicted by the model by generating additional kinetic data at VPA concentrations that have not been used for model fitting. Gene expression kinetics for 12 genes were measured at 0.35, 0.6 and 0.8 mM VPA using quantitative RT-PCR. We selected these 12 genes to cover a broad range of qualitatively different concentration-response curves with respect to VPA at t = 144 h (Figure 3A). Saturating concentration-responses were exemplified by three genes (PAX6, DAZL, OTX2) that are down-regulated and two genes (BMP5, SOX9) that are up-regulated by VPA. POU4F1 was included as a gene that shows a slight increase in expression at low concentrations, and a pronounced decrease at high concentrations. Most importantly, we also selected genes that show a pronounced non-monotonic behavior, where expression is either reduced or increased at intermediate, but not at high concentration (Figure 3A, right column). Using the model fits, we simulated expression kinetics for the three different VPA concentrations (0.35, 0.6 and 0.8 mM VPA). The model predicted that treatment with VPA can dramatically alter the amplitude of peak expression (see e.g. OTX2), as well as timing of peak expression (see e.g. DAZL). BMP5 exemplified the case of changes in both peak expression and peak timing. When comparing the peak timing and changes in amplitude observed in the RT-PCR data, we observed good qualitative agreement with the model predictions (Figure 3A, rightmost panel for each gene). Quantitatively, the 12 tested genes have an average correlation between predicted and measured expression values of 0.85 (Pearson's r). Our observation that non-monotonic concentration-response curves were predicted with high accuracy was surprising, given that the model represents the effect of VPA as monotonic functions (gon(VPA) and goff(VPA)). We therefore analyzed the model simulations to understand how such behavior can arise.
An intuitive understanding of this model feature can be obtained from a detailed examination of the GAD1 gene (Figure 3B/C). In this example, gon increases with increasing concentration and goff decreases with increasing concentration. This leads to an increase in amplitude and earlier peaking times as the VPA concentration is increased (Figure 3C). However, because goff is more sensitive to changes in concentration than gon, the increase in amplitude precedes the change in timing. This explains why, at t = 144 h, expression is first increased, then decreased. We concluded that the gene expression model we developed correctly predicts differentiation kinetics at VPA concentrations on which the model was not trained. It furthermore helps to explain non-monotonic concentration-response curves, as monotonic changes in activation and de-activation kinetics can in combination lead to a non-monotonic fold change at a given time point (e.g. t = 144 h), while peak time and maximal expression change monotonically.
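That a monotonic rate change can produce a non-monotonic endpoint response can be reproduced with a strongly simplified stand-in. The sketch below does not implement the multi-step model; it assumes a two-rate Bateman-type curve, varies only the activation rate via a Hill function and keeps the de-activation rate constant for simplicity. All parameter values are hypothetical.

```python
import math


def hill(conc, base, saturating, k_half, hill_coeff):
    """Monotonic Hill-type dependence of a rate on concentration."""
    frac = conc**hill_coeff / (k_half**hill_coeff + conc**hill_coeff)
    return base + (saturating - base) * frac


def bateman(t, g_on, g_off):
    """Peaked kinetics from one activation and one de-activation rate (g_on != g_off)."""
    return g_on / (g_off - g_on) * (math.exp(-g_on * t) - math.exp(-g_off * t))


def peak_time(g_on, g_off):
    """Time at which the Bateman curve reaches its maximum."""
    return math.log(g_off / g_on) / (g_off - g_on)


G_OFF = 0.05  # per hour; held constant here, an assumption of this sketch
concentrations_mM = [0.0, 0.1, 0.2, 0.35, 0.6, 0.8, 1.0]
g_on_values = [hill(c, 0.002, 0.03, 0.3, 2.0) for c in concentrations_mM]

endpoint = [bateman(144.0, g, G_OFF) for g in g_on_values]
peak_times = [peak_time(g, G_OFF) for g in g_on_values]
peak_values = [bateman(tp, g, G_OFF) for tp, g in zip(peak_times, g_on_values)]

for c, e, tp, pv in zip(concentrations_mM, endpoint, peak_times, peak_values):
    print(f"{c:4.2f} mM  x(144 h)={e:.4f}  peak at {tp:5.1f} h  peak value={pv:.3f}")
```

In this toy setting the peak moves monotonically to earlier times and the peak value rises monotonically, whereas the value read out at 144 h first increases and then decreases with concentration.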
Identification of Wnt signaling as major driver for divergent differentiation
Due to non-monotonic and transient effects of VPA, analyses of single time points or at single concentrations might be misleading. We therefore used our gene expression model to predict gene expression for various times and VPA concentrations, and used these predictions to ask which cellular processes are perturbed by VPA during differentiation. Specifically, we predicted fold changes for regulated genes at densely spaced time points (0–144 h) and VPA concentrations (0–600 μM). Subsequently, we performed gene set enrichment analysis (GSEA) on the predicted fold changes. The term with the maximum enrichment score was ‘Neural Crest Stem Cell up (NCSC)’. For the VPA concentration of 0.6 mM, a concentration well within the therapeutic range of VPA as an anticonvulsant drug (60,61), and associated with clinical birth defects (28,60–63), neural crest stem cell (NCSC) signature genes were enriched from about 50 h onwards. Even at much lower drug concentrations (down to 0.1 mM), enrichment was found, but it occurred at later time points (Figure 4A).
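For readers unfamiliar with GSEA, the core statistic can be sketched as a weighted running sum over a ranked gene list. The GSEA implementation and settings used for Figure 4 are not specified in this section; the sketch below assumes the standard running-sum enrichment score, omits the permutation-based significance estimate, and uses placeholder genes and fold changes.

```python
def enrichment_score(scores, gene_set, weight=1.0):
    """Running-sum enrichment score of gene_set in a list ranked by score.

    Genes are ranked from highest to lowest score. Each set member raises
    the running sum in proportion to |score|**weight, each non-member
    lowers it by a constant step; the score is the largest deviation from zero.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = [g for g in ranked if g in gene_set]
    n_miss = len(ranked) - len(hits)
    hit_norm = sum(abs(scores[g]) ** weight for g in hits)
    if not hits or n_miss == 0 or hit_norm == 0:
        raise ValueError("need set members with non-zero scores and at least one non-member")
    running, best = 0.0, 0.0
    for gene in ranked:
        if gene in gene_set:
            running += abs(scores[gene]) ** weight / hit_norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical predicted log2 fold changes at one time point and concentration.
predicted_log2fc = {"g1": 3.0, "g2": 2.0, "g3": 1.0, "g4": -1.0, "g5": -2.0, "g6": -3.0}

print(enrichment_score(predicted_log2fc, {"g1", "g2"}))  # set at the top of the ranking
print(enrichment_score(predicted_log2fc, {"g5", "g6"}))  # set at the bottom of the ranking
```

Applying such a function to the predicted fold changes at every point of a time–concentration grid would give an enrichment matrix of the kind described above.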
The model predictions were further used to identify possible mechanisms by which VPA induces NCSC genes. We found only 7 KEGG pathways showing a sufficient overlap with our set of regulated genes to compute meaningful p-values. Three of them were significantly enriched: neuroactive ligand receptor interaction, MAPK signaling and Wnt signaling. The time-concentration matrix of the enrichment scores of neuroactive ligand receptor interaction indicated activation of this pathway at minimal VPA concentrations, and increasing the concentration did not enhance this effect. For MAPK signaling, 300 μM VPA led to an up-regulation, which persisted until at least 144 h. The regulation of Wnt-associated genes was particularly interesting, with only a transient up-regulation that ended between 100 and 120 h. Increased VPA concentrations shifted the time frame of up-regulation towards earlier times, so that the pathway was no longer enriched at the 144 h time point (Figure 4B).
The predicted up-regulation of Wnt signaling is remarkable, because together with TGFβ signaling, it is known to play a role in neural crest development (64,65). To further investigate the connection between Wnt/TGFβ signaling and NCSC induction, we derived a Wnt target gene signature (Supplementary Table S1) from a transcriptome study in human embryonic stem cells (58). When examining the VPA-induced fold changes for these genes at the relevant time period (48–96 h), we observed that Wnt-induced genes tended to be up-regulated by VPA, while Wnt-repressed genes were rather down-regulated. Additionally, the majority of TGFβ pathway genes were up-regulated by VPA (Supplementary Figure S11). This trend is consistent with an activation of Wnt signaling by VPA (Figure 4C).
To confirm this activation hypothesis, we used a more robust approach: average fold changes caused by VPA treatment for Wnt-activated genes, for TGFβ pathway genes, and for NCSC signature genes were calculated for measured time points and VPA concentrations. These data showed that VPA treatment indeed led to a transient response for Wnt-induced genes. In contrast, NCSC signature genes and TGFβ pathway genes showed increased up-regulation over time (Figure 4D).
Predictions from our model suggested that NCSC signature genes were induced already at very low VPA concentrations (Figure 4E). A steep increase in regulation by VPA was observed in the 200–400 μM range, while the average fold change hardly increased at concentrations above 0.5 mM. Such data are of high interest for systems pharmacology models and toxicity prediction algorithms. With respect to our model, it was also very interesting to observe that Wnt signaling target genes were not significantly altered, when analyzed at the usual model endpoint (144 h) (Figure 4E). They were only clearly identified by modeling of data for earlier time points. This re-emphasizes the need to provide and analyze kinetic data.
The transient activation of Wnt signaling targets could mean that differentiating cells are susceptible to altered Wnt signaling only for a limited duration. This would explain the frequent observation in developmental toxicology of particular windows-of-sensitivity (9,26). To investigate this further, we treated cells with VPA at different times and with variable treatment duration and reanalyzed a published dataset (26,27) (Figure 4F). GSEA for NCSC genes showed a marked difference between early and late treatment windows, with late treatments failing to efficiently upregulate NCSC genes. Brief early treatments led to a very similar up-regulation as continuous treatment. This implies that differentiation disturbances by VPA are most efficient/toxic during early neurodevelopmental periods (Figure 4F). Taken together, these applicability tests of our mathematical gene expression model show that this form of data analysis can unveil complex regulation patterns, exemplified here by a transient dysregulation in Wnt signaling associated with a permanent up-regulation of the NCSC fate.
Altered VPA-induced neural crest formation by interference with Wnt signaling
Finally, we tested to what extent the toxicant-induced shift of differentiation from neuroepithelial cells towards neural crest cells may be prevented by interference with Wnt signaling. First, a diverse set of Wnt target genes relevant for early neurodevelopment (DACT, SP5, SIX3, GAD) was analyzed to compare the response to VPA with the one triggered by the well-established Wnt activator CHIR (CHIR99021; a specific inhibitor of glycogen synthase kinase-3) (9,66). The analysis was performed after three days of differentiation (DoD3), a time period relevant to fate switching. Both VPA and CHIR triggered similar response patterns (Supplementary Figure S3A). Moreover, analysis of the established neuroepithelial test endpoints (expression of PAX6 and of OTX2) on DoD6 confirmed that Wnt activation during the differentiation process by CHIR indeed disturbed the correct differentiation. The drug attenuated the up-regulation of PAX6 and of OTX2; in other words, CHIR triggered a relative down-regulation of the neuroepithelial markers compared to untreated control cells (Supplementary Figure S3B).
Having established that Wnt signaling may disturb neuroepithelial differentiation in our experimental system, we used the beta-catenin inhibitor ICRT3 as a tool to attenuate this pathway. In a proof-of-concept experiment we found that ICRT3 attenuated the expression of typical neural crest genes (TFAP2, PAX3, MSX1) in cells treated with CHIR (Supplementary Figure S4A). Thus, we proceeded to a series of experiments investigating the effect of ICRT3 on cells treated with VPA (Figure 5A). On the gene expression level, 30 μM of ICRT3 already attenuated the VPA-induced up-regulation of TFAP2 and MSX1, but not PAX3 (Figure 5B); at higher concentrations (60 μM), ICRT3 had pronounced attenuating effects on all three neural crest marker genes (Supplementary Figure S4B). More importantly, the beta-catenin inhibitor also prevented protein expression of the neural crest master regulator SOX10 in the presence of either 400 μM (Figure 5C) or 600 μM VPA (Supplementary Figure S5A). This phenotypic effect was persistent, as the intermittent treatment with ICRT3 (from DoD2–DoD6) also prevented the up-regulation of neural crest marker proteins ISL1 and p75 (also named TNFRSF16, CD271, p75NTR) at DoD15 (Figure 5D, Supplementary Figure S5B). These data strongly suggest that the intermittent up-regulation of Wnt pathway genes contributes to the differentiation switch away from neuroepithelial cells and towards neural crest. The rescue from disturbed neuro-differentiation by ICRT3 was further investigated in a rosette formation assay (26,27). Self-organized formation of rosettes is a hallmark of the neuroepithelial phenotype, and has been considered as a functional correlate of neural tube formation (8,67). The cells generated by our standard protocol (26) showed extensive formation of neural rosettes, which was completely disturbed in cells exposed to VPA during the pivotal differentiation period of DoD0–DoD6 (Figure 5F). Co-treatment with ICRT3 (from DoD2–DoD6) fully restored the rosette formation capacity of the cells (Figure 5F, G). Thus, interference with Wnt signaling at a critical period of differentiation prevented the neurodevelopmental phenotype triggered by VPA.
DISCUSSION
Analyzing perturbations of differentiating cells is particularly difficult compared to similar approaches in more or less static cell cultures. For instance, a matrix of time- and concentration-resolved data needs to be analyzed. Here, we used interpolation of a limited amount of data by a kinetic model to make such an analysis feasible. The modeling principle may be applied to any other differentiation process in cell biology and organism development that is affected by an additional factor (signaling molecule, drug or gene dose). Previous genome-wide modeling was mainly focused on the description of gene expression dynamics in terms of RNA production and turnover rates (e.g. (52)). This allowed excellent modeling of transcriptome changes triggered by defined signaling events in static cell cultures. Extensions of these models allowed very precise predictions of kinetic rates, particularly when combined with metabolic labeling (68–70). Recent progress also allowed definition of cell trajectories in single cell data (71). A major principle that distinguishes our current work from previous transcriptome descriptions is our model assumption of distinct cellular states in which particular genes are expressed, and that the transition between these states is slow relative to gene expression kinetics. This feature allowed us to model long time series as typically observed during differentiation of human cells. An important finding of the model applications was that a small set of simple assumptions (e.g. monotonic modeling of gon and goff) can lead to predictions of complex behavior, such as non-monotonic concentration-response curves. Another important conclusion is that few experimental samples may be used to predict unobserved points in a relatively large data matrix. Vice versa, the model may be used in the future to predict the most efficient experimental design to allow robust predictions in particular sub-areas of the differentiation-disturbance matrix.
With the help of model simulations, we studied the effect of VPA on differentiating stem cells during the first six days of a well-established neural induction protocol. This model was chosen to exemplify processes where dynamic natural transcriptome changes are overlaid with those triggered by an external stimulus (2-dimensional response matrix). This particular biological model was also used as it is supported by robust clinical data (72,73). The use of VPA during pregnancy can lead to a complex phenotype, often described as fetal valproate syndrome. Notably, different windows of exposure need to be distinguished. If the drug is given during the period of neural network formation (pregnancy months 2–3 onwards), an autism-related phenotype may arise. This is associated with the endophenotype of altered neurite connectivity, and it has recently been shown to be related to the down-regulation of the cytoskeletal/microfilament regulator MARCKSL1 (59). Drug exposure during earlier phases of development (about 4 weeks after conception) is better documented, and it leads to neural tube defects and alterations related to neural crest function (e.g. cleft palate) (72,73). Spina bifida and hydrocephalus are the clinically best-documented malformations in this case, and the test system used here (STOP-tox(UKN)) refers to this neuro-teratological problem (8). Interestingly, the upstream signaling disturbances triggered by VPA may be similar in early and late phases: HDAC inhibition and block of GSK-3 activity. The activation of the Wnt pathway, and a related shift from central neural development to neural crest development was found here as a main pathological pathway for early-stage drug exposure.
In exposed rats, it was shown that VPA induced beta-catenin and phospho-GSK3 in brain tissue (74,75). Although such studies suggest that VPA treatment affects Wnt signaling, an exact molecular explanation is still lacking. VPA itself is a poor inhibitor of GSK3, and other molecules that trigger responses similar to those of VPA (e.g. trichostatin A) do not affect this kinase at all. Possibly, the primary effect of VPA is a broad-spectrum inhibition of HDACs (excluding HDAC6), which then leads indirectly to a Wnt response (40,76). It has been shown that the direct effects of VPA on histone acetylation have an indirect, but permanent effect on methylations (9). In line with this, it has been suggested that VPA increases the expression of Wnt target genes in prenatally exposed rats by altering the demethylation state of Wnt-activators (77). Dissecting the transcriptional changes by which VPA affects Wnt signaling is complicated by the dynamic nature of the problem. The initial signaling phase is quickly followed by secondary effects, and signaling pathways often contain feedbacks that limit signal duration. Thus, the timing of the observations after a perturbation can be critical when trying to identify direct effects on signaling, and concentration-dependent effects complicate the analysis. Our approach tries to overcome these limitations by augmenting the limited experimental resolution in time–concentration space using a dynamic model.
Further insights could be gained in the future by combining the model with single cell transcriptome data. This would allow anchoring of the model states to transcriptome states that are observed in cells during differentiation (78–81). Under such conditions, we would be able to use the same states and their transition rates for all genes, instead of using independent state models for each gene. This would further constrain the parameters and thus improve the robustness and interpretability of the model. Consequently, the transition rate parameters could yield typical half-lives of transcriptome states that are currently inaccessible by models using a single cell snapshot. It would also complement pseudotime reconstruction algorithms that lack absolute time scales (82,83).
Taken together, our modeling framework allowed us to model the effect of a perturbation on a complex developmental transition and helped us to unveil the molecular pathway that mediates dysregulation due to the perturbation. We believe that this framework is broadly applicable to the analysis of transcriptome changes in complex differentiation processes that are subjected to perturbations.
DATA AVAILABILITY
The R scripts to generate Figures 1–4 are publicly available in the GitHub repository https://github.com/johannesmg/kinetic_modeling_stem_cell_transcriptome_dynamics. The R scripts used for fitting the dynamic model are also publicly available in the GitHub repository https://github.com/johannesmg/BatemanDiff.
The raw CEL files of the Affymetrix chips that have been used are publicly available in the Gene Expression Omnibus (GEO) database under the accession number GSE147270.
ACKNOWLEDGEMENTS
We thank the BIH High Performance Computing Unit for support.
Notes
Present address: Tanja Waldmann, trenzyme GmbH, Byk-Gulden-Str.2, D-78467 Konstanz.
Contributor Information
Johannes Meisig, Institute of Pathology, Charité-Universitätsmedizin, 10117 Berlin, Germany; IRI Life Sciences, Humboldt-Universität zu Berlin, 10117 Berlin, Germany.
Nadine Dreser, In Vitro Toxicology and Biomedicine, Dept inaugurated by the Doerenkamp-Zbinden Chair foundation, University of Konstanz, 78457 Konstanz, Germany.
Marion Kapitza, In Vitro Toxicology and Biomedicine, Dept inaugurated by the Doerenkamp-Zbinden Chair foundation, University of Konstanz, 78457 Konstanz, Germany.
Margit Henry, Faculty of Medicine, Institute of Neurophysiology, University of Cologne, 50931 Cologne, Germany; Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50931 Cologne, Germany.
Tamara Rotshteyn, Faculty of Medicine, Institute of Neurophysiology, University of Cologne, 50931 Cologne, Germany; Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50931 Cologne, Germany.
Jörg Rahnenführer, Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.
Jan G Hengstler, Leibniz Research Centre for Working Environment and Human Factors (IfADo), TU Dortmund University, 44139 Dortmund, Germany.
Agapios Sachinidis, Faculty of Medicine, Institute of Neurophysiology, University of Cologne, 50931 Cologne, Germany; Center for Molecular Medicine Cologne (CMMC), University of Cologne, 50931 Cologne, Germany.
Tanja Waldmann, In Vitro Toxicology and Biomedicine, Dept inaugurated by the Doerenkamp-Zbinden Chair foundation, University of Konstanz, 78457 Konstanz, Germany.
Marcel Leist, In Vitro Toxicology and Biomedicine, Dept inaugurated by the Doerenkamp-Zbinden Chair foundation, University of Konstanz, 78457 Konstanz, Germany.
Nils Blüthgen, Institute of Pathology, Charité-Universitätsmedizin, 10117 Berlin, Germany; IRI Life Sciences, Humboldt-Universität zu Berlin, 10117 Berlin, Germany.
SUPPLEMENTARY DATA
Supplementary Data are available at NAR Online.
FUNDING
BMBF [031L0117A-D]; EFSA; DK-EPA [MST-667-00205]; DFG (Konstanz Research School of Chemical Biology; KoRS-CB); European Union's Horizon 2020 research and innovation programme [681002 to EU-ToxRisk, 825759 to ENDpoiNTs]. Funding for open access charge: Funding provided by the German Research Foundation (DFG) and the Open Access Publication Fund of Charité – Universitätsmedizin Berlin.
Conflict of interest statement. None declared.
REFERENCES
- 1. Gaspard N., Bouschet T., Hourez R., Dimidschstein J., Naeije G., van den Ameele J., Espuny-Camacho I., Herpoel A., Passante L., Schiffmann S.N. et al. An intrinsic mechanism of corticogenesis from embryonic stem cells. Nature. 2008; 455:351.
- 2. Abranches E., Silva M., Pradier L., Schulz H., Hummel O., Henrique D., Bekman E. Neural differentiation of embryonic stem cells in vitro: a road map to neurogenesis in the embryo. PLoS One. 2009; 4:e6286.
- 3. Gaspard N., Vanderhaeghen P. Mechanisms of neural specification from embryonic stem cells. Curr. Opin. Neurobiol. 2010; 20:37–43.
- 4. Zimmer B., Kuegler P.B., Baudis B., Genewsky A., Tanavde V., Koh W., Tan B., Waldmann T., Kadereit S., Leist M. Coordinated waves of gene expression during neuronal differentiation of embryonic stem cells as basis for novel approaches to developmental neurotoxicity testing. Cell Death Differ. 2011; 18:383–395.
- 5. van de Leemput J., Boles N.C., Kiehl T.R., Corneo B., Lederman P., Menon V., Lee C., Martinez R.A., Levi B.P., Thompson C.L. et al. CORTECON: a temporal transcriptome analysis of in vitro human cerebral cortex development from human embryonic stem cells. Neuron. 2014; 83:51–68.
- 6. Suzuki I.K., Vanderhaeghen P. Is this a brain which I see before me? Modeling human neural development with pluripotent stem cells. Development. 2015; 142:3138–3150.
- 7. Yoo Y.D., Huang C.T., Zhang X.Q., Lavaute T.M., Zhang S.C. Fibroblast growth factor regulates human neuroectoderm specification through ERK1/2-PARP-1 pathway. Stem Cells. 2011; 29:1975–1982.
- 8. Balmer N.V., Weng M.K., Zimmer B., Ivanova V.N., Chambers S.M., Nikolaeva E., Jagtap S., Sachinidis A., Hescheler J., Waldmann T. et al. Epigenetic changes and disturbed neural development in a human embryonic stem cell-based model relating to the fetal valproate syndrome. Hum. Mol. Genet. 2012; 21:4104–4114.
- 9. Balmer N.V., Klima S., Rempel E., Ivanova V.N., Kolde R., Weng M.K., Meganathan K., Henry M., Sachinidis A., Berthold M.R. et al. From transient transcriptome responses to disturbed neurodevelopment: role of histone acetylation and methylation as epigenetic switch between reversible and irreversible drug effects. Arch. Toxicol. 2014; 88:1451–1468.
- 10. Boles N.C., Hirsch S.E., Le S., Corneo B., Najm F., Minotti A.P., Wang Q.J., Lotz S., Tesar P.J., Fasano C.A. NPTX1 regulates neural lineage specification from human pluripotent stem cells. Cell Rep. 2014; 6:724–736.
- 11. Cho A., Tang Y.T., Davila J., Deng S.H., Chen L., Miller E., Wernig M., Graef I.A. Calcineurin signaling regulates neural induction through antagonizing the BMP pathway. Neuron. 2014; 82:109–124.
- 12. Rempel E., Hoelting L., Waldmann T., Balmer N.V., Schildknecht S., Grinberg M., Das Gaspar J.A., Shinde V., Stober R., Marchan R. et al. A transcriptome-based classifier to identify developmental toxicants by stem cell testing: design, validation and optimization for histone deacetylase inhibitors. Arch. Toxicol. 2015; 89:1599–1618.
- 13. Zhang X., Huang C.T., Chen J., Pankratz M.T., Xi J., Li J., Yang Y., Lavaute T.M., Li X.J., Ayala M. et al. Pax6 is a human neuroectoderm cell fate determinant. Cell Stem Cell. 2010; 7:90–100.
- 14. Ou J.X., Ball J.M., Luan Y.Z., Zhao T.T., Miyagishima K.J., Xu Y.F., Zhou H.Z., Chen J.G., Merriman D.K., Xie Z. et al. iPSCs from a hibernator provide a platform for studying cold adaptation and its potential medical applications. Cell. 2018; 173:851–863.
- 15. López-Tobón A., Villa C.E., Cheroni C., Trattaro S., Caporale N., Conforti P., Iennaco R., Lachgar M., Rigoli M.T., Marcó de la Cruz B. et al. Human cortical organoids expose a differential function of GSK3 on cortical neurogenesis. Stem Cell Rep. 2019; 13:847–861.
- 16. Walker R.L., Ramaswami G., Hartl C., Mancuso N., Gandal M.J., De La Torre-Ubieta L., Pasaniuc B., Stein J.L., Geschwind D.H. Genetic control of expression and splicing in developing human brain informs disease mechanisms. Cell. 2019; 179:750–771.
- 17. Marchetto M.C.N., Narvaiza I., Denli A.M., Benner C., Lazzarini T.A., Nathanson J.L., Paquola A.C.M., Desai K.N., Herai R.H., Weitzman M.D. et al. Differential L1 regulation in pluripotent stem cells of humans and apes. Nature. 2013; 503:525–529.
- 18. Dragunow M. Opinion - The adult human brain in preclinical drug development. Nat. Rev. Drug Discov. 2008; 7:659.
- 19. Dolmetsch R., Geschwind D.H. The human brain in a Dish: the promise of iPSC-Derived neurons. Cell. 2011; 145:831–834.
- 20. Marchetto M.C., Hrvoj-Mihic B., Kerman B.E., Yu D.X., Vadodaria K.C., Linker S.B., Narvaiza I., Santos R., Denli A.M., Mendes A.P.D. et al. Species-specific maturation profiles of human, chimpanzee and bonobo neural cells. Elife. 2019; 8:e37527.
- 21. Bhinge A., Poschmann J., Namboori S.C., Tian X.F., Loh S.J.H., Traczyk A., Prabhakar S., Stanton L.W. MiR-135b is a direct PAX6 target and specifies human neuroectoderm by inhibiting TGF-beta/BMP signaling. EMBO J. 2014; 33:1271–1283.
- 22. Chandran V., Coppola G., Nawabi H., Omura T., Versano R., Huebner E.A., Zhang A., Costigan M., Yekkirala A., Barrett L. et al. A systems-level analysis of the peripheral nerve intrinsic axonal growth program. Neuron. 2016; 89:956–970.
- 23. Conti L., Cattaneo E. Neural stem cell systems: physiological players or in vitro entities. Nat. Rev. Neurosci. 2010; 11:176–187.
- 24. Chambers S.M., Fasano C.A., Papapetrou E.P., Tomishima M., Sadelain M., Studer L. Highly efficient neural conversion of human ES and iPS cells by dual inhibition of SMAD signaling. Nat. Biotechnol. 2009; 27:275–280.
- 25. Chambers S.M., Mica Y., Studer L., Tomishima M.J. Converting human pluripotent stem cells to neural tissue and neurons to model neurodegeneration. Methods Mol. Biol. 2011; 793:87–97.
- 26. Dreser N., Madjar K., Holzer A.K., Kapitza M., Scholz C., Kranaster P., Gutbier S., Klima S., Kolb D., Dietz C. et al. Development of a neural rosette formation assay (RoFA) to identify neurodevelopmental toxicants and to characterize their transcriptome disturbances. Arch. Toxicol. 2020; 94:151–171.
- 27. Waldmann T., Grinberg M., Konig A., Rempel E., Schildknecht S., Henry M., Holzer A.K., Dreser N., Shinde V., Sachinidis A. et al. Stem cell transcriptome responses and corresponding biomarkers that indicate the transition from adaptive responses to cytotoxicity. Chem. Res. Toxicol. 2017; 30:905–922.
- 28. Waldmann T., Rempel E., Balmer N.V., Konig A., Kolde R., Gaspar J.A., Henry M., Hescheler J., Sachinidis A., Rahnenfuhrer J. et al. Design principles of concentration-dependent transcriptome deviations in drug-exposed differentiating stem cells. Chem. Res. Toxicol. 2014; 27:408–420.
- 29. Shinde V., Hoelting L., Srinivasan S.P., Meisig J., Meganathan K., Jagtap S., Grinberg M., Liebing J., Bluethgen N., Rahnenfuhrer J. et al. Definition of transcriptome-based indices for quantitative characterization of chemically disturbed stem cell development: introduction of the STOP-Tox(UKN) and STOP-Tox(UKK) tests. Arch. Toxicol. 2016; 91:839–864.
- 30. Nau H. Valproic acid-induced neural-tube defects. Ciba F Symp. 1994; 181:144–152.
- 31. Ornoy A. Valproic acid in pregnancy: how much are we endangering the embryo and fetus. Reprod. Toxicol. 2009; 28:1–10.
- 32. Tebbenkamp A.T., Willsey A.J., State M.W., Sestan N. The developmental transcriptome of the human brain: implications for neurodevelopmental disorders. Curr. Opin. Neurol. 2014; 27:149–156.
- 33. Heyer D.B., Meredith R.M. Environmental toxicology: sensitive periods of development and neurodevelopmental disorders. Neurotoxicology. 2017; 58:23–41.
- 34. Kondo A., Kamihira O., Ozawa H. Neural tube defects: prevalence, etiology and prevention. Int. J. Urol. 2009; 16:49–57.
- 35. Herzog J.I., Schmahl C. Adverse childhood experiences and the consequences on neurobiological, psychosocial, and somatic conditions across the lifespan. Front Psychiatry. 2018; 9:420.
- 36. Alsdorf R., Wyszynski D.F. Teratogenicity of sodium valproate. Expert Opin. Drug Saf. 2005; 4:345–353.
- 37. Tomson T., Battino D., Perucca E. Valproic acid after five decades of use in epilepsy: time to reconsider the indications of a time-honoured drug. Lancet Neurol. 2016; 15:210–218.
- 38. Chateauvieux S., Morceau F., Dicato M., Diederich M. Molecular and therapeutic potential and toxicity of valproic acid. J. Biomed. Biotechnol. 2010; 2010:479364.
- 39. Koren G., Berkovitch M., Ornoy A. Dose-dependent teratology in humans: clinical implications for prevention. Pediatr Drugs. 2018; 20:331–335.
- 40. Wiltse J. Mode of action: Inhibition of histone deacetylase, altering WNT-dependent gene expression, and regulation of beta-catenin - developmental effects of valproic acid. Crit. Rev. Toxicol. 2005; 35:727–738.
- 41. Sztajnkrycer M.D. Valproic acid toxicity: overview and management. J. Toxicol. Clin. Toxicol. 2002; 40:789–801.
- 42. Ito T., Ando H., Handa H. Teratogenic effects of thalidomide: molecular mechanisms. Cell. Mol. Life Sci. 2011; 68:1569–1579.
- 43. Herberg M., Roeder I. Computational modelling of embryonic stem-cell fate control. Development. 2015; 142:2250–2260.
- 44. Spector A.A., Grayson W.L. Stem cell fate decision making: modeling approaches. ACS Biomater Sci Eng. 2017; 3:2702–2711.
- 45. Chickarmane V., Peterson C. A computational model for understanding stem cell, trophectoderm and endoderm lineage determination. PLoS One. 2008; 3:e3478.
- 46. Bessonnard S., De Mot L., Gonze D., Barriol M., Dennis C., Goldbeter A., Dupont G., Chazaud C. Gata6, nanog and Erk signaling control cell fate in the inner cell mass through a tristable regulatory network. Development. 2014; 141:3637–3648.
- 47. Dunn S.J., Martello G., Yordanov B., Emmott S., Smith A.G. Defining an essential transcription factor program for naive pluripotency. Science. 2014; 344:1156–1160.
- 48. Barenco M., Brewer D., Papouli E., Tomescu D., Callard R., Stark J., Hubank M. Dissection of a complex transcriptional response using genome-wide transcriptional modelling. Mol. Syst. Biol. 2009; 5:327.
- 49. Honkela A., Girardot C., Gustafson E.H., Liu Y.H., Furlong E.E.M., Lawrence N.D., Rattray M. Model-based method for transcription factor target identification with limited data. Proc. Natl. Acad. Sci. U.S.A. 2010; 107:7793–7798.
- 50. Miller C., Schwalb B., Maier K., Schulz D., Dumcke S., Zacher B., Mayer A., Sydow J., Marcinowski L., Dolken L. et al. Dynamic transcriptome analysis measures rates of mRNA synthesis and decay in yeast. Mol. Syst. Biol. 2011; 7:458.
- 51. Honkela A., Peltonen J., Topa H., Charapitsa I., Matarese F., Grote K., Stunnenberg H.G., Reid G., Lawrence N.D., Rattray M. Genome-wide modeling of transcription kinetics reveals patterns of RNA production delays. Proc. Natl. Acad. Sci. U.S.A. 2015; 112:13115–13120.
- 52. Uhlitz F., Sieber A., Wyler E., Fritsche-Guenther R., Meisig J., Landthaler M., Klinger B., Bluthgen N. An immediate-late gene expression module decodes ERK signal duration. Mol. Syst. Biol. 2017; 13:928.
- 53. Berthold M.R., Cebron N., Dill F., Gabriel T.R., Kötter T., Meinl T., Ohl P., Sieb C., Thiel K., Wiswedel B. In: Preisach C., Burkhardt H., Schmidt-Thieme L., Decker R. (eds). Data Analysis, Machine Learning and Applications. 2007; Berlin Heidelberg: Springer; 319–326.
- 54. Livak K.J., Schmittgen T.D. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔC_T) method. Methods. 2001; 25:402–408.
- 55. Krug A.K., Kolde R., Gaspar J.A., Rempel E., Balmer N.V., Meganathan K., Vojnits K., Baquie M., Waldmann T., Ensenat-Waser R. et al. Human embryonic stem cell-derived test systems for developmental neurotoxicity: a transcriptomics approach. Arch. Toxicol. 2013; 87:123–143.
- 56. Tai Y.C., Speed T.P. A multivariate empirical Bayes statistic for replicated microarray time course data. Ann Stat. 2006; 34:2387–2412.
- 57. Kreutz C., Raue A., Kaschek D., Timmer J. Profile likelihood in systems biology. FEBS J. 2013; 280:2564–2571.
- 58. Mica Y., Lee G., Chambers S.M., Tomishima M.J., Studer L. Modeling neural crest induction, melanocyte specification, and disease-related pigmentation defects in hESCs and patient-specific iPSCs. Cell Rep. 2013; 3:1140–1152.
- 59. Chanda S., Ang C.E., Lee Q.Y., Ghebrial M., Haag D., Shibuya Y., Wernig M., Sudhof T.C. Direct reprogramming of human neurons identifies MARCKSL1 as a pathogenic mediator of valproic acid-induced teratogenicity. Cell Stem Cell. 2019; 25:103–119.
- 60. Turnbull D.M., Rawlins M.D., Weightman D., Chadwick D.W. Plasma concentrations of sodium valproate: their clinical value. Ann. Neurol. 1983; 14:38–42.
- 61. Bentué-Ferrer D., Tribut O., Verdier M.C. Therapeutic drug monitoring of valproate. Therapie. 2010; 65:233–240.
- 62. Vajda F.J., O’Brien T.J., Hitchcock A., Graham J., Cook M., Lander C., Eadie M.J. Critical relationship between sodium valproate dose and human teratogenicity: results of the Australian register of anti-epileptic drugs in pregnancy. J. Clin. Neurosci. 2004; 11:854–858.
- 63. Omtzigt J.G., Los F.J., Grobbee D.E., Pijpers L., Jahoda M.G., Brandenburg H., Stewart P.A., Gaillard H.L., Sachs E.S., Wladimiroff J.W. et al. The risk of spina bifida aperta after first-trimester exposure to valproate in a prenatal cohort. Neurology. 1992; 42:119–125.
- 64. Patthey C., Edlund T., Gunhaga L. Wnt-regulated temporal control of BMP exposure directs the choice between neural plate border and epidermal fate. Development. 2009; 136:73–83.
- 65. Leung A.W., Murdoch B., Salem A.F., Prasad M.S., Gomez G.A., Garcia-Castro M.I. WNT/beta-catenin signaling mediates human neural crest induction via a pre-neural border intermediate. Development. 2016; 143:398–410.
- 66. Selenica M.L., Jensen H.S., Larsen A.K., Pedersen M.L., Helboe L., Leist M., Lotharius J. Efficacy of small-molecule glycogen synthase kinase-3 inhibitors in the postnatal rat model of tau hyperphosphorylation. Brit J Pharmacol. 2007; 152:959–979.
- 67. Elkabetz Y., Panagiotakos G., Al Shamy G., Socci N.D., Tabar V., Studer L. Human ES cell-derived neural rosettes reveal a functionally distinct early neural stem cell stage. Genes Dev. 2008; 22:152–165.
- 68. de Pretis S., Kress T., Morelli M.J., Melloni G.E., Riva L., Amati B., Pelizzola M. INSPEcT: a computational tool to infer mRNA synthesis, processing and degradation dynamics from RNA- and 4sU-seq time course experiments. Bioinformatics. 2015; 31:2829–2835.
- 69. Rabani M., Raychowdhury R., Jovanovic M., Rooney M., Stumpo D.J., Pauli A., Hacohen N., Schier A.F., Blackshear P.J., Friedman N. et al. High-resolution sequencing and modeling identifies distinct dynamic RNA regulatory strategies. Cell. 2014; 159:1698–1710.
- 70. Jurges C., Dolken L., Erhard F. Dissecting newly transcribed and old RNA using GRAND-SLAM. Bioinformatics. 2018; 34:i218–i226.
- 71. La Manno G., Soldatov R., Zeisel A., Braun E., Hochgerner H., Petukhov V., Lidschreiber K., Kastriti M.E., Lonnerberg P., Furlan A. et al. RNA velocity of single cells. Nature. 2018; 560:494–498.
- 72. Blotiere P.O., Raguideau F., Weill A., Elefant E., Perthus I., Goulet V., Rouget F., Zureik M., Coste J., Dray-Spira R. Risks of 23 specific malformations associated with prenatal exposure to 10 antiepileptic drugs. Neurology. 2019; 93:e167–e180.
- 73. Bromley R.L., Weston J., Marson A.G. Maternal use of antiepileptic agents during pregnancy and major congenital malformations in children. JAMA. 2017; 318:1700–1701.
- 74. Gould T.D., Chen G., Manji H.K. In vivo evidence in the brain for lithium inhibition of glycogen synthase kinase-3. Neuropsychopharmacol. 2004; 29:32–38.
- 75. Qin L.Y., Dai X.F., Yin Y.H. Valproic acid exposure sequentially activates Wnt and mTOR pathways in rats. Mol. Cell. Neurosci. 2016; 75:27–35.
- 76. De Sarno P., Li X., Jope R.S. Regulation of Akt and glycogen synthase kinase-3 beta phosphorylation by sodium valproate and lithium. Neuropharmacology. 2002; 43:1158–1164.
- 77. Wang Z.P., Xu L., Zhu X.P., Cui W.G., Sun Y., Nishijo H., Peng Y.W., Li R.X. Demethylation of specific Wnt/beta-Catenin pathway genes and its upregulation in rat brain induced by prenatal valproate exposure. Anat. Rec. 2010; 293:1947–1953.
- 78. Jang S., Choubey S., Furchtgott L., Zou L.N., Doyle A., Menon V., Loew E.B., Krostag A.R., Martinez R.A., Madisen L. et al. Dynamics of embryonic stem cell differentiation inferred from single-cell transcriptomics show a series of transitions through discrete cell states. Elife. 2017; 6:e20487.
- 79. Trapnell C. Defining cell types and states with single-cell genomics. Genome Res. 2015; 25:1491–1498.
- 80. Saelens W., Cannoodt R., Todorov H., Saeys Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 2019; 37:547–554.
- 81. Farrell J.A., Wang Y., Riesenfeld S.J., Shekhar K., Regev A., Schier A.F. Single-cell reconstruction of developmental trajectories during zebrafish embryogenesis. Science. 2018; 360:eaar3131.
- 82. Haghverdi L., Buttner M., Wolf F.A., Buettner F., Theis F.J. Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods. 2016; 13:845–848.
- 83. Trapnell C., Cacchiarelli D., Grimsby J., Pokharel P., Li S., Morse M., Lennon N.J., Livak K.J., Mikkelsen T.S., Rinn J.L. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 2014; 32:381–386.
Data Availability Statement
The R scripts to generate Figures 1–4 are publicly available in the GitHub repository https://github.com/johannesmg/kinetic_modeling_stem_cell_transcriptome_dynamics. The R scripts used for fitting the dynamic model are also publicly available in the GitHub repository https://github.com/johannesmg/BatemanDiff.
The raw CEL files of the Affymetrix chips used in this study are publicly available in the Gene Expression Omnibus (GEO) database under accession number GSE147270.