Abstract
Purpose
This study was conducted to evaluate the applicability of SPLINDID, a semiparametric, model-based approach for obtaining transcription rates from the pharmacodynamics of mRNA expression.
Methods
A nonparametric exponential cubic spline function was used to obtain the transcription rate profile, and the dynamics of mRNA expression were fitted using compartmental approaches. The transcription rate profile and the mRNA degradation parameter were estimated using the maximum likelihood method in the ADAPT II software.
Results
Noisy mRNA data sets were simulated for four diverse, pharmaceutically relevant conditions, including receptor nonlinearity, a model in which variant mRNAs differing in degradation rate constants were transcribed, and a minimal model of the cell cycle. SPLINDID was able to fit the data sets and accurately recapitulate the transcription rate profiles normalized to the mRNA degradation rate constants. The model was also challenged using experimental data containing time profiles of cell-cycle-regulated genes.
Conclusions
The SPLINDID approach is flexible enough to capture the complex mRNA profiles that are encountered in many experimental data sets.
Keywords: exponential splines, microarray, pharmacodynamics, pharmacokinetics, pharmacogenomic modeling
INTRODUCTION
Techniques such as microarrays and real-time quantitative polymerase chain reaction now allow simultaneous measurements of the levels of many RNA transcript species and are being extensively used in biomedical research. Although these techniques provide a data-rich snapshot of cellular RNA species levels, mechanistic delineation of the underlying gene regulatory processes requires considerable additional analysis, in which modeling techniques adapted from pharmacokinetics and pharmacodynamics (PK/PD) can potentially play an important role.
In a previous paper, we reported on SPLINDID, a semiparametric, model-based approach capable of extracting several gene regulation parameters that are difficult to access experimentally, such as the transcription rate profile and the translation and protein degradation rate constants, from data sets in which mRNA and cognate protein time courses are available (1). The overall strategy and its implementation are referred to as SPLINDID because the nonparametric component of the modeling process uses flexible, relatively “nonparametric” functions based on splines to describe the transcription profiles in combination with the deterministic Hargrove–Schmidt model for describing gene dynamics (1–3). However, because proteomic time profiles are frequently unavailable, the primary focus of this report is to determine whether the SPLINDID approach could be adapted to obtain transcription rate profiles given mRNA dynamics alone. Such an extension of SPLINDID would also allow it to be used for genes (e.g., transfer RNA) whose end products are RNA transcripts that are not translated to protein.
METHODS
Pharmacodynamic Model for mRNA
The dynamics of messenger RNA were parsimoniously defined by the differential equation in Eq. (1) (1,3):
$$\frac{dM}{dt} = R(t) - k_M M \qquad (1)$$
where M is the mRNA concentration, kM is the first-order degradation rate constant of mRNA, R(t) is the rate of mRNA transcription per unit volume, and dM/dt is the rate of change of mRNA concentration.
The ratio R(t)/kM is referred to as the normalized transcription rate profile. It is important to note that the normalized transcription rate has the same units as M and is not, therefore, nondimensional. The normalized transcription rate profile has an intuitive interpretation: it represents the hypothetical time profile of mRNA levels that would be obtained if each instantaneous value of R(t) were to be maintained for a sufficient period of time to approximate steady state in the system.
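The steady-state interpretation can be checked numerically. The sketch below is illustrative only: it assumes Eq. (1) has the form dM/dt = R(t) − kM·M, as implied by the definitions above, and integrates it with a forward Euler scheme while holding R constant. The function name `simulate_mrna`, the step size, and the constant rate of 6 concentration units per hour are assumptions made for illustration and are not taken from the study.

```python
def simulate_mrna(rate_fn, k_m, m0=0.0, t_end=10.0, dt=0.001):
    """Forward-Euler integration of dM/dt = R(t) - k_m * M (assumed form of Eq. 1)."""
    m = m0
    for i in range(int(round(t_end / dt))):
        m += dt * (rate_fn(i * dt) - k_m * m)
    return m

k_m = 1.2        # h^-1, the degradation rate constant used in the simulations
r_const = 6.0    # hypothetical constant transcription rate (illustration only)

m_final = simulate_mrna(lambda t: r_const, k_m)
normalized_rate = r_const / k_m   # has the same units as M

print(m_final, normalized_rate)
```

After roughly ten hours (about twelve degradation time constants), the simulated mRNA level has essentially reached R/kM, which is the value that the normalized transcription rate reports for that instantaneous R.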
The SPLINDID modeling process (described below) accurately provides the dynamics of R(t)/kM. However, the individual components of the normalized transcription rate profile, R(t) and kM, cannot be accurately obtained from modeling with mRNA time profiles as inputs because of covariation. The mathematical cause of this covariation can be seen on closer inspection of Eq. (1): a multitude of R(t) and kM value combinations can yield any given value of dM/dt because increases (or decreases) in R(t) can be offset by corresponding increases (or decreases) in the value of kM.
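The covariation argument can be illustrated with a small numerical sketch. Assuming Eq. (1) has the form dM/dt = R(t) − kM·M, any chosen kM can be paired with the input R(t) = dM/dt + kM·M to reproduce the same mRNA profile. The target profile M(t) = t·exp(−t), the two kM values, and the helper names below are hypothetical choices for illustration only.

```python
import math

def euler_path(rate_fn, k_m, t_end=6.0, dt=0.001):
    """Return M at each step for dM/dt = R(t) - k_m * M with M(0) = 0."""
    m, path = 0.0, [0.0]
    for i in range(int(round(t_end / dt))):
        m += dt * (rate_fn(i * dt) - k_m * m)
        path.append(m)
    return path

# Hypothetical target profile M(t) = t * exp(-t); its derivative is (1 - t) * exp(-t).
def rate_for(k_m):
    """R(t) = dM/dt + k_m * M, the input that reproduces the target for this k_m."""
    return lambda t: (1.0 - t + k_m * t) * math.exp(-t)

path_a = euler_path(rate_for(1.2), 1.2)   # one (R, kM) combination
path_b = euler_path(rate_for(2.4), 2.4)   # a different combination
max_gap = max(abs(a - b) for a, b in zip(path_a, path_b))
print(max_gap)
```

The two simulated mRNA profiles agree to within the discretization error even though the underlying R(t) and kM differ, which is why mRNA data alone cannot separate the two quantities.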
Nonparametric Cubic Spline Approximation
The nonparametric component of the modeling process uses flexible, relatively “nonparametric” functions based on splines to describe the transcription profiles in the differential equation, Eq. (1). In SPLINDID, the transcription rate profile, R(t), is modeled as an exponential function of splines:
$$R(t) = e^{\mathrm{Spline}(t)} - 1 \qquad (2)$$
Cubic splines were used for modeling the Spline(t) term in the exponential of Eq. (2). A cubic spline is a piecewise polynomial of order 4 (degree 3) that can be represented by (4,5):
| (3) |
where Cij are coefficients and Si(t) are piecewise polynomials of order 4 that are defined to be nonzero only between n breakpoints xj that are strictly increasing.
The functional form of Eq. (2) was selected after several numerical experiments indicated its usefulness in the modeling process. Because cubic splines are flexible functions, they can contain undesirable inflections, some of which can take on physically inadmissible negative values; the exponential term in Eq. (2) constrains the values to positive and also dampens the undesirable inflections and oscillations by imposing a steeply increasing penalty during the fitting procedures. However, because the exponential term approaches zero only when the Spline(t) function in the exponent approaches negative infinity, it is numerically infeasible to provide zero-valued initial conditions that could be needed. The term containing −1 included in Eq. (2) allows an initial condition of R(0) = 0 to be imposed whenever appropriate for modeling.
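The role of the −1 term can be sketched in a few lines. The example assumes that Eq. (2) has the form R(t) = exp(Spline(t)) − 1, which is inferred from the description above rather than confirmed. A Catmull-Rom piecewise cubic is used as a simple stand-in for the B-spline interpolant; it is not the IMSL implementation, and the knots and ordinates are hypothetical.

```python
import math

def cubic_interp(knots, ordinates, t):
    """Catmull-Rom piecewise cubic through equally spaced ordinates.

    A simple stand-in for the B-spline interpolant; not the IMSL routines."""
    n = len(knots)
    h = knots[1] - knots[0]
    i = min(max(int((t - knots[0]) / h), 0), n - 2)   # segment index
    u = (t - knots[i]) / h                            # local coordinate in [0, 1]
    p1, p2 = ordinates[i], ordinates[i + 1]
    p0 = ordinates[i - 1] if i > 0 else 2 * p1 - p2       # linear end extension
    p3 = ordinates[i + 2] if i + 2 < n else 2 * p2 - p1
    return 0.5 * (2 * p1 + (p2 - p0) * u
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * u ** 2
                  + (3 * p1 - p0 - 3 * p2 + p3) * u ** 3)

def transcription_rate(knots, ordinates, t):
    """R(t) = exp(Spline(t)) - 1, the assumed form of Eq. (2)."""
    return math.exp(cubic_interp(knots, ordinates, t)) - 1.0

knots = [0.0, 1.5, 3.0, 4.5, 6.0]          # hypothetical, equally spaced (h)
ordinates = [0.0, 1.2, 2.0, 1.1, 0.3]      # hypothetical spline ordinates

print(transcription_rate(knots, ordinates, 0.0))   # zero because Spline(0) = 0
```

Under this assumed form, setting the first ordinate to zero gives R(0) = 0 with a finite spline value, whereas a pure exponential would need the spline to tend to negative infinity.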
The cubic spline was implemented by setting the order of the spline in a B-spline basis function formulation to 4. The DBSINT, DBSOPK, and DBSVAL subroutines from the Fortran version of the International Mathematics and Statistics Library (IMSL, Visual Numerics Inc., San Ramon, CA, USA) for the Unix platform were used for spline calculations. The positions of the knots of the B-splines were optimized with the DBSOPK subroutine; the spline coefficients were computed using the DBSINT subroutine. Outputs from the DBSINT subroutine were provided as input to DBSVAL for interpolation.
Implementation of Modeling Strategy
We explicitly integrated Eq. (1) using the integrating factor method, which results in the following equation for M:
$$M(t) = M_0\, e^{-k_M t} + e^{-k_M t} \int_0^t R(s)\, e^{k_M s}\, ds \qquad (4)$$
where M0 is the constant of integration and represents the initial value of M.
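As a consistency check on the integrated expression, the sketch below compares a quadrature evaluation of the integrating-factor solution with a direct numerical integration of the differential equation. It assumes Eq. (1) is dM/dt = R(t) − kM·M and that Eq. (4) is the corresponding solution M(t) = M0·exp(−kM·t) + exp(−kM·t)·∫ R(s)·exp(kM·s) ds over [0, t]. The input R(t) and the function names are hypothetical.

```python
import math

def m_closed_form(rate_fn, k_m, m0, t, n=4000):
    """Assumed Eq. (4): M0*exp(-kM*t) + exp(-kM*t) * integral_0^t R(s)*exp(kM*s) ds,
    with the integral evaluated by the trapezoidal rule."""
    h = t / n
    total = 0.5 * (rate_fn(0.0) + rate_fn(t) * math.exp(k_m * t))
    for j in range(1, n):
        s = j * h
        total += rate_fn(s) * math.exp(k_m * s)
    return math.exp(-k_m * t) * (m0 + h * total)

def m_rk4(rate_fn, k_m, m0, t, n=4000):
    """Direct RK4 integration of dM/dt = R(t) - kM*M for comparison."""
    h, m = t / n, m0
    f = lambda s, y: rate_fn(s) - k_m * y
    for j in range(n):
        s = j * h
        k1 = f(s, m)
        k2 = f(s + h / 2, m + h * k1 / 2)
        k3 = f(s + h / 2, m + h * k2 / 2)
        k4 = f(s + h, m + h * k3)
        m += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return m

rate = lambda s: 5.0 * s * math.exp(-0.5 * s)   # hypothetical R(t)
closed = m_closed_form(rate, 1.2, 0.0, 6.0)
direct = m_rk4(rate, 1.2, 0.0, 6.0)
print(closed, direct)
```

The two values agree closely, which is the behavior expected if the integrated form is a correct solution of the differential equation.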
The integrated expression for M (Eq. 4) was incorporated into the right-hand side of Eq. (1). The system consisting of the integral term in Eq. (4) and the resultant forms of Eq. (1) was integrated within the ADAPT pharmacokinetics/pharmacodynamics systems analysis software (6) in the first stage of the heuristic estimation procedure.
The input data consisted of values of the mRNA time profile, M(t), and the model output represented estimates for the normalized transcription rate profile, i.e., the ratio R(t)/kM. The value of M0 was set to zero in the simulations and during fitting.
Several nested modeling runs that varied the number of estimated ordinates for interpolation by the spline function were conducted; the time points corresponding to the estimated ordinates were equally spaced over the interval of the data. Model selection was user-driven and based on a combination of graphical visualization by the user and the Akaike Information Criterion (7). The models identified by visual inspection and Akaike Information Criteria were generally the same or only differed incrementally in complexity.
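The Akaike-based part of the selection step can be sketched as follows. The negative log-likelihood values are hypothetical, and the parameter count (spline ordinates plus kM plus two variance parameters) is an assumption about how the fits were parameterized; the graphical inspection described above is not represented.

```python
def aic(neg_log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2*p - 2*ln(L) = 2*p + 2*NLL."""
    return 2.0 * n_params + 2.0 * neg_log_likelihood

# Hypothetical nested runs: number of spline ordinates -> minimized negative log-likelihood.
# Each run is assumed to estimate kM and two variance parameters as well (3 extra parameters).
runs = {4: 61.8, 6: 48.3, 8: 47.9, 10: 47.6}
scores = {n_ord: aic(nll, n_ord + 3) for n_ord, nll in runs.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In this made-up example, adding ordinates beyond six improves the likelihood only marginally, so the penalty term dominates and the six-ordinate model has the lowest criterion value.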
From the selected model, parameter estimates for the ordinates for interpolating the spline [which describes the transcription rate profile R(t)] and the parameter kM were obtained from ADAPT. The estimated values of R(t) and kM were then used to compute the R(t)/kM ratio.
The parameter estimation procedures employed the maximum likelihood method in the ADAPT software package for the UNIX platform (6). The variance model assumed that the residual error standard deviation, σi, was related to the true value of each output Yi, as approximated by its fitted value Ŷi, via the relationship σi = SDIntercept + SDSlope · Ŷi. SDSlope is a measure of precision, whereas SDIntercept is a measure of sensitivity.
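A minimal sketch of the likelihood under this variance model is given below. It assumes the linear form σi = SDIntercept + SDSlope·Ŷi, which is inferred from the parameter names, and Gaussian residuals. The function names and data values are hypothetical, and the code is not the ADAPT implementation.

```python
import math

def residual_sd(y_hat, sd_intercept, sd_slope):
    """Assumed linear variance model: sigma_i = SDIntercept + SDSlope * Yhat_i."""
    return sd_intercept + sd_slope * y_hat

def neg_log_likelihood(y_obs, y_fit, sd_intercept, sd_slope):
    """Gaussian negative log-likelihood with a fitted-value-dependent SD."""
    nll = 0.0
    for y, y_hat in zip(y_obs, y_fit):
        sigma = residual_sd(y_hat, sd_intercept, sd_slope)
        nll += 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - y_hat) ** 2 / (2.0 * sigma ** 2)
    return nll

y_fit = [1.0, 4.0, 2.5]        # hypothetical fitted mRNA levels
y_obs = [1.1, 3.6, 2.7]        # hypothetical observations
print(neg_log_likelihood(y_obs, y_fit, sd_intercept=0.05, sd_slope=0.2))
```

An optimizer would minimize this quantity jointly over the spline ordinates, kM, and the two variance parameters.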
Generation of Simulated Data Sets from Signaling Models
The ground truth in gene expression experiments is rarely known with certainty and we therefore employed simulated data to assess the performance of the method. Parameter values used for the simulation and results from simulations without noise were used as reference or “true” values to which the performance of the proposed method was compared.
The overall model used for three of the four simulation experiments, shown in Fig. 1A, consisted of modules to describe drug pharmacokinetics, transcriptional signaling, and mRNA dynamics. The drug dose was a bolus input capable of producing an initial drug concentration of 100 concentration units; drug pharmacokinetics were described by a one-compartment model with a first-order elimination rate constant, kel, of 0.5 h−1. The mRNA dynamics are described by Eq. (1); unless otherwise noted, the mRNA degradation rate constant, kM, was set to 1.2 h−1.
Fig. 1.

A schematic of the overall model consisting of the drug pharmacokinetics, transcriptional signaling and mRNA modules is shown in panel (A). (B) Stochastic model-based transcriptional signaling module used in (C). Panel (B) was combined with (A) to generate simulated data for the effect of receptor nonlinearity. (C) The model used to assess the effect of variable mRNA degradation rate constant on the performance of SPLINDID; the three alternatively spliced mRNAs shown (mRNA1, mRNA2, mRNA3) comprised fractions f1, f2, and f3, respectively, of the total transcription and differed in mRNA degradation constants (kM1, kM2, kM3). (D) Minimal cell cycle model involving a substrate A, with zero-order infusion input that activates a cascade consisting of a kinase and a protease. The protease degrades the substrate. Dashed lines represent information flows and the symbols are described in the text.
As shown in Fig. 1B, the transcriptional signaling module consisted of receptor interactions coupled to a stochastic model for signal transduction (8). The drug interacted with its receptor reversibly with a second-order on rate, k1, of 0.5 (concentration units·h)−1 and a first-order off rate, k−1, of 20 h−1. The initial receptor concentration was set at 10 concentration units; all remaining initial conditions were set at zero. The drug–receptor interactions were described by the following equation for free receptor:
$$\frac{dR}{dt} = -k_1\, C(t)\, R + k_{-1}\, DR \qquad (5)$$
where C(t), R, and DR are drug, free receptor, and drug-bound receptor, respectively. The stochastic model or tanks-in-series model (Fig. 1B) (8) was used to represent transcriptional signal transduction; three identical tanks each with time constant, τ, of 0.75 h were employed.
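The chain from drug concentration to mRNA can be sketched with the parameter values given in the text. Several details are assumptions made only for this illustration: the free-receptor equation is taken as dR/dt = −k1·C(t)·R + k−1·DR, drug concentration is treated as mono-exponential and unaffected by binding, total receptor is conserved, the first tank is driven by the drug-bound receptor, and the transcription rate is set equal to the output of the third tank. The function name and the Euler step are hypothetical.

```python
import math

def simulate_signaling(c0=100.0, k_el=0.5, k_on=0.5, k_off=20.0, r_total=10.0,
                       tau=0.75, k_m=1.2, t_end=18.0, dt=0.001):
    """Forward-Euler sketch of PK -> receptor binding -> three tanks -> mRNA."""
    r_free, tanks, m = r_total, [0.0, 0.0, 0.0], 0.0
    history = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        conc = c0 * math.exp(-k_el * t)               # one-compartment bolus PK
        bound = r_total - r_free                      # DR, by receptor conservation (assumed)
        d_r = -k_on * conc * r_free + k_off * bound   # assumed form of Eq. (5)
        # Tanks in series: each tank relaxes toward the signal feeding it.
        inputs = [bound, tanks[0], tanks[1]]
        d_tanks = [(inputs[j] - tanks[j]) / tau for j in range(3)]
        d_m = tanks[2] - k_m * m                      # assumption: R(t) equals last tank output
        history.append((t, r_free, tanks[2], m))
        r_free += dt * d_r
        tanks = [tanks[j] + dt * d_tanks[j] for j in range(3)]
        m += dt * d_m
    return history

history = simulate_signaling()
peak_time, peak_m = max(((t, m) for t, _, _, m in history), key=lambda p: p[1])
print(peak_time, peak_m)
```

The three tanks delay and smooth the receptor occupancy signal, so the simulated mRNA peak lags the drug concentration peak. Higher initial concentrations make the binding step faster and would need a smaller step size than the one used here.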
Noisy data with CV (defined as the ratio of standard deviation to mean) of 0.2 were obtained in triplicate at time points 0, 1.5, 3, 4.5, 6, 7.5, 9, 10.5, 12, 13.5, 15, 16.5, and 18 h.
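The sampling design can be sketched as follows. The text specifies the CV, the triplicate sampling, and the time points, but not the error distribution; Gaussian noise with a standard deviation proportional to the noise-free value is assumed here. The noise-free profile and the function name are hypothetical placeholders.

```python
import random

def add_noise(time_points, mean_fn, cv=0.2, replicates=3, seed=0):
    """Return (t, value) pairs with Gaussian noise of SD = cv * mean (assumed error model)."""
    rng = random.Random(seed)
    data = []
    for t in time_points:
        mean = mean_fn(t)
        for _ in range(replicates):
            data.append((t, rng.gauss(mean, cv * mean)))
    return data

time_points = [1.5 * i for i in range(13)]          # 0, 1.5, ..., 18 h
noise_free = lambda t: 2.0 * t / (1.0 + t)          # hypothetical noise-free mRNA profile
data = add_noise(time_points, noise_free)
print(len(data), data[:3])
```

With a proportional error model, points whose noise-free value is zero (here, t = 0) carry no noise, which is consistent with fixing M0 at zero.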
Receptor Nonlinearity
For the receptor nonlinearity experiments, the overall model in Fig. 1A and B was used; in the simulations four dose levels were employed, capable of producing input drug concentrations of 10, 100, 1,000, and 10,000 concentration units. The stochastic model was used as the signaling module and other parameters were as previously described.
Differing mRNA Degradation Rate Constants
The model used for simulating the transcription of three alternatively spliced mRNAs (mRNA1, mRNA2, and mRNA3) is shown in Fig. 1C. The model incorporates a common transcription rate R(t) process for all splice variants with a fraction f1 being processed to mRNA1, a fraction f2 being processed to mRNA2, and the remaining fraction f3 = (1 − f1 − f2) yielding variant mRNA3. The value f = f1 = f2 = f3 = 1/3 was used; the mRNAs were assumed to decay independently with degradation rate constants of kM1 = 1.2 h−1, kM2 = 0.6 h−1, and kM3 = 0.3 h−1, respectively. The stochastic model was used as the signaling module and the other parameters were as previously described.
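The splice-variant model can be sketched by giving each variant its own copy of the mRNA equation, dMi/dt = f·R(t) − kMi·Mi, which is the form assumed here from the description. The shared input R(t) below is a hypothetical pulse rather than the output of the signaling module, and the function name is illustrative.

```python
import math

def variant_levels(k_m_values, fraction, rate_fn, t_end=18.0, dt=0.001):
    """Euler integration of dM_i/dt = f * R(t) - kM_i * M_i for each variant, M_i(0) = 0."""
    levels = [0.0 for _ in k_m_values]
    for i in range(int(round(t_end / dt))):
        r = rate_fn(i * dt)
        levels = [m + dt * (fraction * r - k * m) for m, k in zip(levels, k_m_values)]
    return levels

shared_rate = lambda t: 10.0 * t * math.exp(-t)     # hypothetical common R(t)
m1, m2, m3 = variant_levels([1.2, 0.6, 0.3], 1.0 / 3.0, shared_rate)
print(m1, m2, m3)
```

Because the three variants receive identical inputs, any difference among their profiles comes from the degradation rate constants alone; the slowest-decaying variant retains the highest level at the end of the interval.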
Minimal Model for the Cell Cycle
The zero-order ultrasensitivity model of Goldbeter (9), which exhibits sharp oscillatory behavior, was used as a prototypical example of a model containing a feedback loop. The model is shown schematically in Fig. 1D and was represented by the following system of differential equations:
| (6a) |
| (6b) |
| (6c) |
| (6d) |
The drug pharmacokinetic compartment and the stochastic signaling modules were not used for this model. The proportionality constant between the level of X and mRNA production, α, was set to 10. In Eq. (6a–d), Ki (i = 1–6) and Vi (i = 1–5) indicate Michaelis–Menten-type dissociation constants and maximal velocities, respectively; X and Y are the fractions of kinase and protease in their respective activated forms. Parameter values for the model were based on those used by Goldbeter (9), which match the experimentally observed period and waveform in in vitro models of the mitotic cell cycle. Initial values were A = 0.01 μM and X = Y = 0.01.
RESULTS
Evaluation of the SPLINDID Semiparametric Approach
Performance with Receptor Nonlinearity
The pharmacodynamics of many drugs is frequently nonlinear because of receptor saturation. To determine whether the SPLINDID approach was capable of handling receptor nonlinearities, we evaluated its ability to fit noisy mRNA data generated by simulations with nonlinear receptor kinetics. Simulated data were obtained using the model in Fig. 1A and B with four drug doses that resulted in initial concentrations of either 10, 100, 1,000, or 10,000 concentration units. Nonlinearities were apparent on visual inspection at the two highest concentrations. The performance of the SPLINDID approach was assessed by its ability to fit the simulated mRNA data and the accuracy of its predictions of the normalized transcription rate. Results from noise-free simulations were used as the “true values” to assess the predictions of the SPLINDID method.
Figure 2A shows the fit of the SPLINDID approach to the mRNA data for doses corresponding to the initial drug concentrations of 10, 100, and 10,000 concentration units. The SPLINDID fit, represented by solid lines, satisfactorily fits the mRNA data and provides a good approximation of the true values (shown in dashed lines). The normalized transcription profiles obtained using SPLINDID also provided satisfactory approximation of the corresponding true values obtained from noise-free simulations (Fig. 2B). The normalized transcription profile corresponding to the initial concentration of 10 concentration units was time-shifted compared to the reference curves; the exact reasons for the modest shift are unclear but are being investigated.
Fig. 2.
Performance of SPLINDID for the receptor nonlinearity model. Simulated mRNA data points used as inputs are shown in filled circles in (A) for each of the three initial drug concentrations indicated. The solid line summarizes the fit of the SPLINDID approach to the simulated mRNA (A) and normalized transcription profile data (B); the dashed lines are the corresponding profiles in the absence of added noise.
Performance with a Model with Differing mRNA Degradation Rate Constants
The normalized transcriptional rate profile, R(t)/kM, contains individual contributions from its R(t) and kM terms, and the primary goal of this numerical experiment was to demonstrate that the SPLINDID approach was capable of detecting alterations in the normalized transcriptional rate profile caused by kM changes upon controlling for R(t). We used the model in Fig. 1C that produces three variant mRNAs: mRNA1, mRNA2, and mRNA3 comprising fractions f1, f2, and f3 = 1 − f1 − f2 of the total transcription, respectively. By setting identical values of f = f1 = f2 = f3 = 1/3 and with differing kM values, the model allows us to test this specific aspect of SPLINDID performance. This model can be considered a minimal representation of alternative splicing (10) and can be extended to more splice variants by assigning additional parameters analogous to f for all but one of the variants.
Results from SPLINDID in Fig. 3A and B show the fits to mRNA input data used and the predicted normalized transcription profiles, respectively, relative to the true values from noise-free simulations. The results demonstrate that the SPLINDID approach is capable of satisfactorily approximating the true normalized transcription rate profiles when the mRNA degradation rate constant is changed.
Fig. 3.
Performance of SPLINDID with the alternative splicing model. The simulated mRNA data points used as inputs are shown in filled circles for kM = 1.2 h−1, in open circles for kM = 0.6 h−1, and in open squares for kM = 0.3 h−1 in (A). The solid line summarizes the fit of the SPLINDID approach to the simulated mRNA (A) and normalized transcription profile data (B); the dashed lines are the corresponding profiles in the absence of added noise. mRNA degradation rate constant values, kM, are also indicated in each figure.
Many chemotherapeutic agents act selectively on specific stages of the cell cycle. In the next stage of the analysis, the SPLINDID approach was challenged with the minimal model for the cell cycle (Fig. 1D), which contains additional structural and pharmacodynamic complexities (9). This model exhibits an unusual nonlinearity that has been termed zero-order ultrasensitivity; it contains a feedback loop and exhibits oscillatory behavior of the limit cycle type (9). Despite these added complexities, the SPLINDID approach fits the mRNA data satisfactorily (Fig. 4A), and also provides accurate estimates of the normalized transcription rate (Fig. 4B). There were modest deviations at sharp corner-like points between peaks in the normalized transcription profile, but the overall approach of SPLINDID to the data was satisfactory.
Fig. 4.
Performance of SPLINDID with the minimal model for the cell cycle. Simulated mRNA data points used as inputs are shown in filled circles in (A). The solid line summarizes the fit of the SPLINDID approach to the simulated mRNA (A) and normalized transcription profile data (B); the dashed lines are the corresponding profiles in the absence of added noise.
Performance with an Experimental Gene Expression Data Set
Several groups have used gene expression profiling to investigate the cell cycle, a fundamental biological process that is often the target of anticancer and immunosuppressive drugs (11,12). We assessed the performance of SPLINDID using publicly available data (available at http://cellcycle-www.stanford.edu) from the cell cycle gene expression profiling experiments of Spellman et al. (12). The downloaded data, which were in log-normalized form, were linearized prior to modeling with SPLINDID. Figure 5 shows the mRNA profiles (Fig. 5A, C) and normalized transcription rate profiles (Fig. 5B, D) for two representative, cell-cycle-dependent mRNAs, CDC5 and SLT2; both mRNAs code for serine/threonine kinases and belong to the mitogen-activated protein (MAP) kinase family. The fit of the SPLINDID model to mRNA data was satisfactory, as assessed from the approach of the fitted line to the data and by the relative absence of bias.
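The linearization step amounts to inverting the log transform before fitting. The base of the logarithm is not stated in the text; base 2 is assumed in this sketch, and the function name and input values are hypothetical.

```python
def linearize(log_values, base=2.0):
    """Convert log-transformed expression ratios back to the linear scale."""
    return [base ** v for v in log_values]

log_ratios = [-1.0, 0.0, 0.5, 1.0]      # hypothetical log-ratio values
linear = linearize(log_ratios)
print(linear)
```

Working on the linear scale keeps the data consistent with the mRNA equation, which is written in terms of concentrations rather than their logarithms.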
Fig. 5.
Performance of SPLINDID with experimental results from the cell cycle gene expression profiling data set of Spellman et al. (12). Experimental data for mRNAs of two representative genes, CDC5 and SLT2, that were used as inputs are shown in filled circles in (A) and (C), respectively. The solid line summarizes the fit of the SPLINDID approach to the experimental mRNA data (A and C) and the corresponding normalized transcription profiles (B and D).
DISCUSSION
In this report, we demonstrated that SPLINDID, a novel semiparametric, model-based approach, previously used for genomic–proteomic time series (1), is effective for extracting normalized transcription rate profiles from gene expression profiles containing mRNA dynamics alone. We challenged the underlying SPLINDID approach extensively to determine whether it was capable of providing accurate normalized transcription profiles when nonlinear signaling occurred. In each challenge, the SPLINDID approach performed satisfactorily.
The SPLINDID approach provides the normalized transcription rate profile, which is difficult, and sometimes virtually impossible, to obtain by experiment. The experimental method of choice for obtaining the transcription rate is the nuclear run-on assay, which is challenging because it requires subcellular fractionation and ex vivo reconstitution with cofactors that often cause loss of viability (13). Likewise, measurement of mRNA half-life requires transcription blockers, such as actinomycin D, that have broad specificity and alter a multitude of other cellular processes in addition to the RNA of interest (14,15).
Our approach was motivated by Wagner–Nelson deconvolution (16) and by the nonparametric, spline-based modeling techniques proposed for input deconvolution for oral dosage forms in pharmacokinetics (17,18). However, these deconvolution techniques have not been systematically investigated for pharmacogenomic modeling. From the experimental standpoint, data collection for the pharmacogenomic problem presents some specific challenges that are not encountered in human and animal pharmacokinetics studies. For example, the unit impulse response, which is easily obtained by intravenous bolus dosing in pharmacokinetics, is difficult to obtain for the pharmacogenomic problem. Furthermore, the polyexponential-type constraints that are commonly used in pharmacokinetics are more difficult to justify in pharmacogenomics because of the nonlinearities in biological signaling cascades and the relative paucity of quantitative information on them.
The SPLINDID approach has certain advantages as well as limitations relative to the compartmental models frequently used in pharmacokinetics/pharmacodynamic modeling. Compartmental modeling requires individualized models to be identified for each gene time profile, and because it is resource- and time-intensive, it can be difficult to use for modeling large numbers of genomic profiles. SPLINDID, on the other hand, is semiparametric and does not require gene-specific models. However, a possible disadvantage of SPLINDID is that it only provides the normalized transcription rate profile, R(t)/kM, which is a composite variable containing contributions from both the mRNA input and output processes. In systems with first-order mRNA degradation, the normalized transcription rate provides complete information on the shape of the transcription rate profile, but the presence of the mRNA degradation rate constant (kM) scaling factor precludes determination of the absolute transcription rate profile in the absence of additional experimental data.
In conclusion, our results demonstrate the capabilities and versatility of SPLINDID, and indicate that it is capable of fitting mRNA profiles obtained in a variety of pharmaceutically important contexts.
Acknowledgments
This work was supported in part by grants from the Kapoor Foundation, National Science Foundation (Research Grant 0234895) and the National Institutes of Health (P20-GM 067650).
References
1. Bhasi K, Forrest A, Ramanathan M. SPLINDID, a semiparametric, model-based method for obtaining transcription rates and gene regulation parameters from genomic and proteomic expression profiles. Bioinformatics. 2005. doi: 10.1093/bioinformatics/bti624.
2. Hargrove JL, Schmidt FH. The role of mRNA and protein stability in gene expression. FASEB J. 1989;3:2360–2370. doi: 10.1096/fasebj.3.12.2676679.
3. Ramanathan M, MacGregor RD, Hunt CA. Predictions of effect for intracellular antisense oligodeoxyribonucleotides from a kinetic model. Antisense Res Dev. 1993;3:3–18. doi: 10.1089/ard.1993.3.3.
4. deBoor C. A Practical Guide to Splines. Springer-Verlag; New York, NY: 1978.
5. Schumaker LL. Spline Functions: Basic Theory. Wiley; New York, NY: 1981.
6. D’Argenio DZ, Schlumitzky A. Biomedical Simulations Resource. University of Southern California; Los Angeles, CA: 1997. Users Guide to Release 4: Adapt II Pharmacokinetic/Pharmacodynamic Systems Analysis Software.
7. Akaike H. A new look at the statistical model identification. IEEE Trans Automat Contr. 1974;AC-19:716–723.
8. Sun YN, Jusko WJ. Transit compartments versus gamma distribution function to model signal transduction processes in pharmacodynamics. J Pharm Sci. 1998;87:732–737. doi: 10.1021/js970414z.
9. Goldbeter A. A minimal cascade model for the mitotic oscillator involving cyclin and cdc2 kinase. Proc Natl Acad Sci USA. 1991;88:9107–9111. doi: 10.1073/pnas.88.20.9107.
10. Stamm S, Ben-Ari I, Rafalska Y, Tang Z, Zhang D, Toiber TA, Thanaraj H. Function of alternative splicing. Gene. 2005;344:1–20. doi: 10.1016/j.gene.2004.10.022.
11. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol Cell. 1998;2:65–73. doi: 10.1016/s1097-2765(00)80114-8.
12. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998;9:3273–3297. doi: 10.1091/mbc.9.12.3273.
13. Ausubel FM, Brent R, Kingston RE, Moore DD, Seidman JG, Smith JA, Struhl K. Current Protocols in Molecular Biology. Wiley-Interscience; New York: 2005.
14. Aktipis S, Panayotatos N. A kinetic study on the mechanism of inhibition of RNA synthesis catalyzed by DNA-dependent RNA polymerase. Differences in inhibition by ethidium bromide, 3,8-diamino-6-ethylphenanthridinium bromide and actinomycin d. Biochim Biophys Acta. 1981;655:278–290. doi: 10.1016/0005-2787(81)90038-1.
15. Glynn JM, Cotter TG, Green DR. Apoptosis induced by Actinomycin D, Camptothecin or Aphidicolin can occur in all phases of the cell cycle. Biochem Soc Trans. 1992;20:84S. doi: 10.1042/bst020084s.
16. Wagner JG. Application of the Wagner-Nelson absorption method to the two-compartment open model. J Pharmacokinet Biopharm. 1974;2:469–486. doi: 10.1007/BF01070942.
17. Fattinger KE, Verotta D. A nonparametric subject-specific population method for deconvolution: I. Description, internal validation, and real data examples. J Pharmacokinet Biopharm. 1995;23:581–610. doi: 10.1007/BF02353463.
18. Gillespie WR, Veng-Pedersen P. A polyexponential deconvolution method. Evaluation of the “gastrointestinal bio-availability” and mean in vivo dissolution time of some ibuprofen dosage forms. J Pharmacokinet Biopharm. 1985;13:289–307. doi: 10.1007/BF01065657.




