Author manuscript; available in PMC: 2020 Jan 2.
Published in final edited form as: Stat Med. 2018 May 2;37(17):2630–2644. doi: 10.1002/sim.7669

Independence screening for high dimensional nonlinear additive ODE models with applications to dynamic gene regulatory networks

Hongqi Xue 1, Shuang Wu 2, Yichao Wu 3, Juan C Ramirez Idarraga 4, Hulin Wu 5
PMCID: PMC6940146  NIHMSID: NIHMS1050616  PMID: 29722041

Abstract

Mechanism-driven low-dimensional ordinary differential equation (ODE) models are often used to model viral dynamics at cellular levels and epidemics of infectious diseases. However, low-dimensional mechanism-based ODE models are limited for modeling infectious diseases at molecular levels such as transcriptomic or proteomic levels, which is critical to understand pathogenesis of diseases. Although linear ODE models have been proposed for gene regulatory networks (GRNs), nonlinear regulations are common in GRNs. The reconstruction of large-scale nonlinear networks from time-course gene expression data remains an unresolved issue. Here, we use high-dimensional nonlinear additive ODEs to model GRNs and propose a 4-step procedure to efficiently perform variable selection for nonlinear ODEs. To tackle the challenge of high dimensionality, we couple the 2-stage smoothing-based estimation method for ODEs and a nonlinear independence screening method to perform variable selection for the nonlinear ODE models. We have shown that our method possesses the sure screening property and it can handle problems with non-polynomial dimensionality. Numerical performance of the proposed method is illustrated with simulated data and a real data example for identifying the dynamic GRN of Saccharomyces cerevisiae.

Keywords: high-dimensional data, independence screening, nonlinear network, ordinary differential equation, time course gene expressions

1 |. INTRODUCTION

Although mechanism-based ordinary differential equation (ODE) models have been widely used to describe dynamics of infectious agents in a host and epidemics of transmission between hosts for infectious diseases, data-driven high-dimensional ODE models for infectious diseases, in particular at molecular levels, are still scarce. For example, the gene regulatory network (GRN), governing the gene transcriptions, is a collection of interacting genes and their products in the cell. It plays a fundamental role in the development of living cells and host response to infectious agents. Recent advancements in modern high-throughput technologies, such as DNA microarray and next generation RNA-Seq, have allowed the collection of time course gene expression profiles at an affordable cost. It is of increasing interest to reconstruct the GRNs from such experimental data using mathematical models, especially dynamic network models that aim to capture the complex phenomena of biological systems and host response to infections by modeling the time course gene expression data. Commonly used network models include information theory models,1,2 Boolean networks,3,4 Bayesian networks,5-8 vector autoregressive and state space models,9-11 latent variable models,12 and differential equation models.13-17 Recently, Xing and his associates have introduced temporal exponential random graph models and time-varying networks to capture dynamics of networks.18-20 Excellent overviews on diverse data-driven modeling schemes and related topics can be found in De Jong,21 Filkov,22 and Hecker et al.23

We focus on the ODE modeling approach, which offers a description of the gene network as a continuous time dynamical system. Gene regulation is modeled by rate equations, quantifying the change rate (derivative) of the expression of one gene in the system as a function of expression levels of all related genes. The general form of the equations can be written as

X'(t) = F\{t, X(t), \theta\}, \qquad (1)

where X(t) = {X_1(t), …, X_p(t)}^T is a vector representing the expression levels of genes 1, …, p at time t, t ∈ [t0, T], 0 ≤ t0 < T < ∞, and X'(t) is the first-order derivative of X(t). The link function F describes the regulatory effects of other genes on the expression change of gene i, i = 1, …, p, with a vector of parameters θ.

The identification of the parameters in (1) requires some constraints on the model structure, for instance, specification of the form of the function F. Many previous works have assumed that F is linear, due to the simplicity of linear functions.13,14,16,17 However, complex dynamic behaviors and regulations in a GRN usually cannot be explained by simple linear systems.24 A number of nonlinear ODE models have been proposed for GRNs, such as the sigmoid function model15 and S-systems.25 Compared with linear models, the structure identification and parameter estimation of the nonlinear differential equation models are computationally more intensive and may require more data. Thus, existing efforts on reconstructing GRNs based on nonlinear ODE models have mostly been limited to small-scale systems.

In this article, we consider the following high-dimensional nonlinear additive ODE model for reconstructing GRNs:

X_k'(t) = \beta_{k0} + \sum_{j=1}^{p} \beta_{kj} f_{kj}\{X_j(t), \alpha_{kj}\}, \quad k = 1, \ldots, p, \qquad (2)

where βk0 is the intercept, the fkj are known nonlinear functions with parameters αkj, and the coefficients βkj represent the regulation effects of the genes in the network. The nonlinear parameter αkj usually controls the shape of the function fkj. For example, in the sigmoid function,

f(x) = \frac{1}{1 + e^{-\alpha x}}, \qquad (3)

which is usually used to model nonlinear systems that exhibit “saturation” phenomena, the parameter α > 0 indicates the transition rate, that is, how fast the regulation effect of a gene saturates. For a small α, the sigmoid function (3) is close to a linear function, while for a large α, the function (3) is close to a step function. Similar sigmoid functions have been used to model nonlinear gene regulatory effects in the literature.14,15 For different application scenarios, different nonlinear functions f(x) can be chosen based on biological mechanisms or knowledge, or through initial data exploration, but the results under different assumed nonlinear functions are likely to differ, which is a limitation of nonlinear models. Some sensitivity analysis may be performed under different assumptions. Notice that this nonlinear additive model is different from the additive nonparametric model,26 which can be converted into a linear model with a spline-basis approximation. Thus, the proposed nonlinear additive model is more challenging.
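As a quick illustration of how α shapes the sigmoid (3), the following sketch (not from the paper) evaluates it at a small and a large transition rate:

```python
import numpy as np

def sigmoid(x, alpha):
    """Sigmoid regulatory function (3); alpha > 0 controls the transition rate."""
    return 1.0 / (1.0 + np.exp(-alpha * x))

x = np.linspace(-3, 3, 7)
# Small alpha: nearly linear around 0; large alpha: close to a step function.
print(np.round(sigmoid(x, 0.5), 3))
print(np.round(sigmoid(x, 10.0), 3))
```

For α = 0.5, the values change gradually across the range; for α = 10, they jump from near 0 to near 1 around x = 0.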

Since biological systems are seldom fully connected and most nodes are only directly connected to a small number of other nodes resulting in sparse networks,27 we assume that the number of significant nonlinear effects, ie, nonzero βkj in model (2), is small, although the total number of genes p may be large. In practice, the gene expression profiles Xk(t) are never measured continuously, but rather at some discretized time points, and possibly with measurement errors. Denote

Y_{ki} = X_k(t_i) + \varepsilon_{ki}, \quad i = 1, \ldots, n_k, \; k = 1, \ldots, p, \qquad (4)

where the εki are assumed to be i.i.d. random variables with mean 0 and variance σ_k^2. Without loss of generality, we assume that all the nk's are the same, referred to as n hereafter. The challenging question is how to perform variable selection to identify the significant edges M* = {1 ≤ k, j ≤ p : βkj ≠ 0} in the GRN from the noisy experimental data.

Very little statistical research has been dedicated to parameter estimation and variable selection for nonlinear ODE models. Because nonlinear ODEs have no closed-form solutions in general, the standard statistical methodologies developed for nonlinear regression28 and generalized nonlinear regression models29 cannot be applied directly. The existing parameter estimation methods for ODE models include the nonlinear least-squares (NLS) method,30,31 the 2-stage smoothing-based estimation method,32-35 the Bayesian approaches,36,37 the principal differential analysis,38 and the generalized profiling approach.39,40 Among these methods, we are most interested in the 2-stage smoothing-based estimation method, which decouples the system of differential equations into a set of pseudo-regression models by substituting the differentials with estimated derivatives from the observed data. This method not only reduces the computational burden dramatically but also has some desirable statistical properties.33,34 Moreover, this parameter estimation method conveniently facilitates the incorporation of variable selection techniques for ODEs. For each pseudo-regression model, we can apply variable selection methods, such as the adaptive lasso,41 SCAD,42 and the nonnegative garrote (NNG),43 to determine the important network edges.

The number of genes (covariates) p in the nonlinear ODE system (2) usually grows with the sample size n, and possibly p ≫ n. Moreover, since the total number of unknown parameters in model (2) is 2p^2 + p, the model complexity and parameter dimensionality grow much faster than the sample size n as both p and n increase. Because of the nonlinearity of model (2), it is very challenging, if not impossible, to directly apply the variable selection methods developed for small or moderate numbers of covariates to model (2). To overcome this challenge, we adopt the idea of independence screening introduced by Fan and Lv44 and extend it to the nonlinear additive ODE models. The sure independence screening method performs dimension reduction efficiently via marginal correlation learning for high-dimensional feature selection. It has been shown to have a sure screening property; that is, with probability tending to 1, the independence screening technique retains all of the important features in the model. The sure independence screening was later extended to high-dimensional generalized linear models,45 nonparametric additive models,46 varying coefficient models,47 and single index hazard rate models.48 The idea of using marginal information to deal with high dimensionality was also adopted in other works.49-54

We propose a 4-step variable selection procedure for the high-dimensional nonlinear additive ODE model (2). In step I, we use a nonparametric smoothing-based estimation approach to obtain the estimates for both the state variables and their derivatives based on the measurement model (4), and then in step II, we substitute the estimated state variables X^k(t) and their derivatives X^k(t) from step I into the ODE model (2) to form a “pseudo” nonlinear additive model. In step III, we propose a nonlinear independence screening (NLIS) method based on the pseudo-residual sum of squares of the marginal modeling. For every fixed 1≤ kp, we fit p marginal pseudo nonlinear regressions of the response X^k(t) against each covariate X^j(t) separately and select covariates with the pseudo-residual sum of squares smaller than a prespecified threshold value. Finally in step IV, we use the NNG to select important covariates in the reduced nonlinear additive model conditional on those selected covariates from Step III. The NLIS procedure in step III can significantly reduce the dimension of covariates, and more importantly, under some mild conditions, it has a sure screening property with the same parametric sure screening rate as that for generalized linear models derived by Fan and Song,45 which ensures that all the important covariates are retained in step III with probability tending to 1. To the best of our knowledge, this is the first attempt to propose a variable selection method for a high-dimensional nonlinear ODE model and to establish the screening properties of the NLIS for ODE models under the “large p, small n” setting. We expect that the proposed nonlinear ODE model and method can be widely used to identify nonlinear dynamic GRNs for different biological systems with significant impacts in understanding pathogenesis of various diseases.

2 |. DYNAMIC GRN RECONSTRUCTION VIA A 4-STEP VARIABLE SELECTION PROCEDURE

The proposed variable selection procedure for the high-dimensional nonlinear ODE model (2) can be summarized into the following 4 steps. To simplify the presentation, we assume that for all k, j = 1, …, p, E[fkj{Xj(t), αkj}] = 0 and βk0 = 0. We use the GRN modeling as an example to present the proposed methodologies.

2.1 |. Step I: nonparametric smoothing to estimate state variables and their derivatives

First, we can apply one of the nonparametric smoothing approaches, such as smoothing splines, regression splines, penalized splines, or local polynomial smoothing, to estimate the state variables and their derivatives, X_k(t) and X_k'(t). In this article, we adopt penalized splines34,55 to obtain the estimates X̂_k(t) and X̂_k'(t), k = 1, …, p. That is, for 1 ≤ k ≤ p, we approximate X_k(t) by

X_k(t) \approx \sum_{j=-\nu}^{K_k} \delta_{k,j} N_{k,j,\nu+1}(t) = N_{k,\nu+1}^{T}(t)\,\delta_k,

where δ_k = (δ_{k,−ν}, …, δ_{k,K_k})^T is the unknown coefficient vector to be estimated from the data and N_{k,ν+1}(t) = {N_{k,−ν,ν+1}(t), …, N_{k,K_k,ν+1}(t)}^T is the B-spline basis function vector of degree ν and dimension K_k + ν + 1 at a sequence of knots t_0 = τ_{k,−ν} = τ_{k,−ν+1} = ⋯ = τ_{k,−1} = τ_{k,0} < τ_{k,1} < ⋯ < τ_{k,K_k} < τ_{k,K_k+1} = τ_{k,K_k+2} = ⋯ = τ_{k,K_k+ν+1} = T on [t0, T]. Further, let S_{λ_k} be the “hat” matrix that maps the observations Y_k = (Y_{k1}, …, Y_{kn})^T into X̂_k, that is, X̂_k = S_{λ_k} Y_k. The knots of the P-splines are generally placed on a grid of equally spaced sample quantiles; the number of knots and the penalty parameter are then jointly determined by the generalized cross-validation (GCV) described below,55 although more sophisticated adaptive approaches have been proposed to place the knots at sharp change points.

The penalized spline (P-spline) objective function is a sum of squared residuals plus a roughness penalty term. To determine the penalization (smoothing) parameter λk and the number of knots, we can use the standard GCV method,56 ie, λk is chosen by minimizing the GCV score:

\mathrm{GCV}(\lambda_k) = \frac{n^{-1}\sum_{i=1}^{n}\{Y_{ki} - \hat{X}_k(t_i)\}^2}{\{1 - n^{-1}\,\mathrm{tr}(S_{\lambda_k})\}^2}.

It has been shown in the literature55,57 that the GCV performs similarly to the classic cross-validation in nonparametric smoothing parameter selection but is much faster in computation. Some other methods such as Mallows' Cp criterion and AIC perform similarly to the cross-validation and GCV methods.57 Note that other nonparametric smoothing approaches such as regression splines and local polynomial smoothing can also be used in this step. Each of these smoothing methods has its pros and cons, but they should produce similar results if appropriate smoothing parameters are chosen.
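As a rough sketch of step I, one can estimate a trajectory and its derivative with an off-the-shelf smoothing spline. The paper uses penalized B-splines with knots and λk selected by GCV; the example below instead uses scipy's UnivariateSpline with a fixed smoothing level, so it is an illustration only:

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50)
y = np.sin(t) + rng.normal(0, 0.05, t.size)   # noisy observations Y_ki

# Smoothing level s set to n * sigma^2, a common heuristic; the paper
# instead selects the penalty parameter by GCV.
spl = UnivariateSpline(t, y, k=4, s=t.size * 0.05**2)
x_hat = spl(t)                  # estimated state X-hat_k(t)
dx_hat = spl.derivative()(t)    # estimated derivative X-hat_k'(t)
print(x_hat.shape, dx_hat.shape)
```

Away from the boundaries, dx_hat tracks the true derivative cos(t) closely; boundary derivative estimates are typically less reliable, which is one motivation for the boundary-vanishing weight function used later.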

2.2 |. Step II: pseudo nonlinear additive models

Similar to Brunel32 and Liang and Wu,33 we substitute the estimated state variables X^k(t) and their derivatives X^k(t) from step I into the ODE model (2) to form a set of pseudo nonlinear additive models:

Z_{ki} = \sum_{j=1}^{p} \beta_{kj} f_{kj}(\hat{X}_{ji}, \alpha_{kj}) + e_{ki}, \quad k = 1, \ldots, p; \; i = 1, \ldots, n, \qquad (5)

where Z_ki = X̂_k'(t_i) and X̂_ji = X̂_j(t_i); e_ki represents the aggregated measurement error and estimation error of X̂_k(t) and X̂_k'(t) from step I. The p-dimensional ODE model (2) is then decoupled into p pseudo nonlinear regression models, which can be fitted using the least squares approach. For nonlinear ODE models, p usually cannot be too large; otherwise, computing and convergence problems may be encountered. In model (5), the response variables and the covariates are derived from the nonparametric smoothing estimates of the state variables and their derivatives, respectively. Moreover, the resulting error terms e_ki are not i.i.d., but dependent. Thus, this is not a standard nonlinear model or generalized nonlinear model studied in the literature.28,29 That is why we refer to it as a pseudo nonlinear model. Since X̂_k(t) and X̂_k'(t) are estimated continuously at every time point t in step I, we may augment more time points than the original observation times for the next step of the analysis. This data augmentation strategy has also been used by other investigators for ODE parameter estimation.16,17

2.3 |. Step III: NLIS

We extend the idea of independence screening for linear and generalized linear regression models44,45 to the nonlinear additive ODE models. Now, we rewrite p as pn to emphasize that p may increase as n increases. For every fixed 1 ≤ k ≤ pn, we fit pn marginal pseudo nonlinear regression models for the response Z_k = (Z_k1, …, Z_kn)^T against each covariate X̂_j = (X̂_j1, …, X̂_jn)^T separately and rank their importance to the joint model according to a measure of the goodness of fit of the marginal model. One screening strategy is to use the magnitude of the marginal least squares estimate (MLSE) β̂_kj^M as the marginal utility for the independence screening. We refer to this strategy as MLSE screening, where we select the set of variables with the magnitude of β̂_kj^M greater than a prespecified threshold ζk:

\hat{\mathcal{N}}_k = \{1 \le j \le p_n : |\hat{\beta}_{kj}^{M}| \ge \zeta_k\}. \qquad (6)

Another natural screening strategy is to rank the variables according to the pseudo residual sum of squares of the component-wise nonlinear regression, where we select a set of variables:

\hat{\mathcal{M}}_k = \{1 \le j \le p_n : \mathrm{RSS}_{kj} \le \xi_k\}, \qquad (7)

with \mathrm{RSS}_{kj} = \min_{\beta_{kj}, \alpha_{kj}} \sum_{i=1}^{n} w_k(t_i)\,[Z_{ki} - \beta_{kj} f_{kj}(\hat{X}_{ji}, \alpha_{kj})]^2 the pseudo residual sum of squares of the jth marginal fit, ξk a prespecified threshold value, and wk(t) a prescribed weight function on [t0, T] with boundary constraints wk(t0) = wk(T) = 0. More discussion on how to select the weight function can be found in Brunel32 and Wu et al.34 A data-driven threshold ξk can be determined by using random permutations as in Fan et al.46 An alternative thresholding scheme is to choose the d covariates with the smallest RSSkj. This strategy is analogous to the likelihood ratio screening proposed in Fan et al,58 and we refer to it as RSS screening. We show that it is asymptotically equivalent to the MLSE screening in preserving the important variables of the joint model (see Supporting Information). We refer to the aforementioned 2 feature screening approaches as NLIS approaches. In the case of large pn, the NLIS approach can save a great deal of computing effort. In this article, we adopt the RSS screening.
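A minimal sketch of the RSS screen in (7), assuming the centered sigmoid regulatory function and using scipy's least_squares; the toy data and names are illustrative, not the paper's implementation:

```python
import numpy as np
from scipy.optimize import least_squares

def marginal_rss(z, xj, w):
    """Weighted pseudo-RSS of the marginal fit z ~ beta * f(xj, alpha)."""
    def resid(theta):
        beta, alpha = theta
        f = 1.0 / (1.0 + np.exp(-alpha * xj)) - 0.5   # centered sigmoid
        return np.sqrt(w) * (z - beta * f)
    fit = least_squares(resid, x0=[1.0, 3.0])
    return np.sum(fit.fun ** 2)

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 100)
w = np.sin(np.pi * t / 10)              # weight vanishing at the boundaries
X = rng.normal(size=(100, 6))           # smoothed covariates X-hat_j (toy)
z = 1.5 * (1 / (1 + np.exp(-3 * X[:, 2])) - 0.5) + rng.normal(0, 0.01, 100)

rss = np.array([marginal_rss(z, X[:, j], w) for j in range(6)])
top = np.argsort(rss)[:2]               # keep the d = 2 best covariates
print(top)
```

Because z here is driven only by covariate 2, its marginal fit attains a much smaller weighted RSS than the others, so the screen retains it.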

To establish sure screening properties, we need to characterize the minimum distinguishable signal level.44,45 In our case, the minimum distinguishable signal in the screening step is related not only to the stochastic error in estimating the nonlinear parameters but also to the approximation error in smoothing the state variables Xk(t) and their derivatives Xk'(t). We show that our screening procedure has a sure screening property, as defined by Fan and Lv44 (see Supporting Information). The NLIS reduces the dimensionality of covariates from pn to a possibly much smaller space of model size |M̂k|, to which the NNG for nonlinear regressions can be applied to perform more refined variable selection for the submodel conditional on the covariates selected by the NLIS.

2.4 |. Step IV: NNG for refining variable selection

Given initial estimates α̂_kj^(init), the nonlinear additive regression (5) reduces to a linear additive regression model:

Z_{ki} = \sum_{j=1}^{p} \beta_{kj}\,\tilde{f}_{kji} + e_{ki}, \qquad (8)

with the predictors f̃_kji = f_kj(X̂_ji, α̂_kj^(init)). One can then use variable selection techniques for linear regressions to shrink unnecessary βkj to zero. In this article, we adopt the NNG43 to find a set of nonnegative scaling factors ckj that minimize

\frac{1}{2}\sum_{i=1}^{n} w_k(t_i)\Big[Z_{ki} - \sum_{j=1}^{p} c_{kj}\,\hat{\beta}_{kj}^{(\mathrm{init})}\tilde{f}_{kji}\Big]^2 + n\lambda_k^{*}\sum_{j=1}^{p} c_{kj}, \quad \text{s.t. } c_{kj} \ge 0, \qquad (9)

where the β̂_kj^(init) are initial estimates of βkj and wk(t) is a prescribed weight function on [t0, T] with boundary constraints wk(t0) = wk(T) = 0. The garrote estimates are given by β̂_kj = ĉ_kj β̂_kj^(init). An appropriately chosen λk* can shrink some ckj to exactly 0 and thus produces a more parsimonious model. We use the GCV to select λk* in our simulations and real data analysis.
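The garrote step can be sketched by minimizing (9) directly under the constraint ckj ≥ 0. This toy example (data and names are illustrative) uses L-BFGS-B with bounds; a QP or coordinate-descent solver would work equally well:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
n, p = 100, 5
Ftilde = rng.normal(size=(n, p))             # predictors f~_kji at alpha^(init)
beta_true = np.array([1.2, 0.0, -0.8, 0.0, 0.5])
z = Ftilde @ beta_true + rng.normal(0, 0.05, n)
w = np.ones(n)                               # weights w_k(t_i), constant here

beta_init = np.linalg.lstsq(Ftilde, z, rcond=None)[0]   # initial LS estimates
G = Ftilde * beta_init                       # column j is beta_init_j * f~_kj

def garrote_obj(c, lam):
    r = z - G @ c
    return 0.5 * np.sum(w * r**2) + n * lam * np.sum(c)

lam = 0.02
res = minimize(garrote_obj, x0=np.ones(p), args=(lam,),
               method="L-BFGS-B", bounds=[(0, None)] * p)
beta_hat = res.x * beta_init                 # garrote estimates beta-hat_kj
print(np.round(res.x, 2))
```

Variables whose initial coefficients are near zero contribute almost nothing to the fit, so the linear penalty drives their scaling factors ckj to the boundary at 0, while the strong signals keep ckj close to 1.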

Suppose the NNG selects covariates f̃_kj = (f̃_kj1, …, f̃_kjn)^T, j ∈ A_k = {1 ≤ j ≤ p : ĉ_kj ≠ 0}. We can then update the parameters αkj conditional on the NNG estimates β̂_kj, j ∈ A_k. These procedures iterate until a convergence criterion is met; for instance, until the residual sum of squares of the fitted model changes by no more than a tolerance. The initial estimates need to be carefully chosen, because the solution path consistency of the NNG depends on the consistency of the initial estimators.41,59 Here, we use the NLS estimates of (5) as the initial estimates for both αkj and βkj, which satisfy consistency and asymptotic normality.33,34

We have established the sure independence property for the proposed procedure. The detailed theoretical results are provided in Supporting Information.

3 |. IMPLEMENTATION ALGORITHM

We propose to iteratively use a large-scale screening method, the NLIS procedure, and a moderate-scale variable selection technique, the NNG, to enhance the performance of the proposed variable selection method in terms of false selection errors. For every 1 ≤ k ≤ pn, the algorithm is detailed as follows.

  1. For every j ∈ {1, …, pn}, compute the marginal fit by solving
    \min_{\beta_{kj}, \alpha_{kj}} \sum_{l=1}^{m} w_k(t_l^{*})\,[Z_{kl} - \beta_{kj} f_{kj}(\hat{X}_{jl}, \alpha_{kj})]^2, \quad t_l^{*} \in [t_0, T], \qquad (10)
    where the data augmentation size m ≥ n and the weight function wk(t) = sin{(t − t0)π/(T − t0)}. Rank the covariates according to the marginal pseudo residual sum of squares:
    \mathrm{RSS}_{kj} = \min_{\beta_{kj}, \alpha_{kj}} \sum_{l=1}^{m} w_k(t_l^{*})\,[Z_{kl} - \beta_{kj} f_{kj}(\hat{X}_{jl}, \alpha_{kj})]^2.

    Select the top d covariates with the smallest RSSkj or covariates with RSSkj smaller than a threshold ξk estimated from random permutations. The set of selected covariates is denoted by Sk,1.

  2. Apply the NNG for the nonlinear additive model introduced in section 2.4 on the set Sk,1 to select a subset Mk,1. The BIC score of the model with covariates in Mk,1 is computed and denoted as BICk(1).

  3. For every j′ ∈ Mk,1c = {1, …, pn} \ Mk,1, minimize
    \sum_{l=1}^{m} w_k(t_l^{*})\Big[Z_{kl} - \sum_{j \in M_{k,1}} \beta_{kj} f_{kj}(\hat{X}_{jl}, \alpha_{kj}) - \beta_{kj'} f_{kj'}(\hat{X}_{j'l}, \alpha_{kj'})\Big]^2, \qquad (11)
    with respect to βkj, αkj, j ∈ Mk,1, and βkj′, αkj′. This regression reflects the additional contribution of the j′th covariate conditional on the variable set Mk,1. After marginal screening as in step 1, ranking the pseudo residual sum of squares of model (11), choose a set of covariates Sk,2 ⊂ Mk,1c. The NNG procedure is then applied to the set Mk,1 ∪ Sk,2 to select a subset Mk,2. Notice that we should keep model (11) as small as possible; otherwise, we may be trapped in local solutions or encounter convergence and computational problems.
  4. Repeat step 3 until Mk,ι = Mk,ι+1, or the size of Mk,ι reaches a prespecified threshold d*. The set of selected covariates is Mk,Kk, where Kk = argmin_ι BICk(ι).

    For the numerical examples in the following sections, we recruit a fixed number of covariates at each round of marginal screening; that is, each Sk,l includes d covariates with the smallest marginal residual sum of squares. We experimented with different choices of d and found d =1 provides the smallest false selection rate and therefore is adopted in our numerical examples (including both simulation studies and real data application). We keep d as small as possible in order to avoid local solutions and convergence problems while fitting nonlinear ODEs.
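The control flow above can be sketched in simplified form. In this illustration d = 1, the NNG refinement of step 2 is replaced by a plain joint least-squares refit, and the data are a toy example; it shows the screen-then-select loop with a BIC stopping rule rather than the paper's implementation:

```python
import numpy as np
from scipy.optimize import least_squares

def sig(x, a):
    """Centered sigmoid regulatory function."""
    return 1.0 / (1.0 + np.exp(-a * x)) - 0.5

def joint_rss(z, Xsub, w):
    """Weighted RSS of the joint fit z ~ sum_j beta_j * sig(X_j, alpha_j)."""
    q = Xsub.shape[1]
    def resid(theta):
        beta, alpha = theta[:q], theta[q:]
        fit = sum(beta[j] * sig(Xsub[:, j], alpha[j]) for j in range(q))
        return np.sqrt(w) * (z - fit)
    sol = least_squares(resid, x0=np.r_[np.ones(q), 3 * np.ones(q)])
    return np.sum(sol.fun ** 2)

rng = np.random.default_rng(4)
n, p = 100, 8
X = rng.normal(size=(n, p))
z = 1.5 * sig(X[:, 1], 3) - 1.2 * sig(X[:, 4], 2.5) + rng.normal(0, 0.02, n)
w = np.ones(n)

selected, best_bic = [], np.inf
while len(selected) < p:
    rest = [j for j in range(p) if j not in selected]
    # conditional marginal screen: try adding each remaining covariate (d = 1)
    scores = {j: joint_rss(z, X[:, selected + [j]], w) for j in rest}
    j_star = min(scores, key=scores.get)
    # BIC with 2 parameters (beta, alpha) per selected covariate
    bic = n * np.log(scores[j_star] / n) + np.log(n) * 2 * (len(selected) + 1)
    if bic >= best_bic:
        break
    selected.append(j_star)
    best_bic = bic
print(sorted(selected))
```

With these toy data, the two true regulators (covariates 1 and 4) enter in the first two rounds, after which adding noise covariates no longer improves the BIC.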

4 |. SIMULATION STUDIES

Simulation experiments are designed to study and evaluate numerical performance of the proposed method. We consider a dynamic GRN described by the ODE model (2) with regulatory functions fkj as centered sigmoid functions:

f_{kj}(x, \alpha_{kj}) = f(x, \alpha_{kj}) = \frac{1}{1 + \exp\{-\alpha_{kj} x\}} - 0.5, \quad k, j = 1, \ldots, p. \qquad (12)

The sigmoid function is usually used to model nonlinear systems that exhibit saturation phenomena, and the parameter αkj > 0 indicates the transition rate.15 For a small α, the sigmoid function is close to a linear function, while for a large α, the function is close to a step function. This model is also used for the application example in Section 5.

First, we consider a dynamic gene network with p=10 genes. We set the true nonlinear ODE model as

X1′ = 0.9 f(X1, α1) + 1.5 f(X2, α2),
X2′ = 1.7 f(X1, α1) + 0.95 f(X3, α3),
X3′ = 1.45 f(X2, α2) + 2 f(X4, α4) + 1.55 f(X6, α6),
X4′ = 1.2 f(X3, α3) + 1.35 f(X5, α5),
X5′ = 2 f(X4, α4) − 1.2 f(X5, α5),
X6′ = 1.23 f(X7, α7) + 1.58 f(X10, α10),
X7′ = 1.94 f(X6, α6) − 0.98 f(X8, α8),
X8′ = f(X6, α6) + 1.45 f(X7, α7),
X9′ = 1.24 f(X8, α8) + 1.35 f(X10, α10),
X10′ = 1.21 f(X6, α6) − 1.38 f(X9, α9),

where (α1, …, α10) are equally spaced in [2, 4] and t ∈ [0, 10]. The initial conditions are X(0) = (0.6, −0.3, −0.75, −0.6, 1.2, 1.4, −0.8, −1.2, −0.6, −1). We solve the above ODEs numerically for Xj(t), j = 1, …, 10, and simulate the experimental data Yki at time points t1, …, tn by model (4), where the measurement errors εki are normally distributed random variables with mean 0 and variance σ². We consider 2 variance levels, σ = 0.05 and 0.2, and 2 sample sizes, n = 20 and 50. The observation time points are taken equidistantly in [0, 10]. We use the penalized splines to obtain the estimates of all state variables and their derivatives, where the smoothing parameters λk are chosen by GCV. The data augmentation size is set to be m = 100, and the weight function is chosen to be wk(t) = sin(πt/10).
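The data-generating step can be sketched with scipy's ODE solver. The 3-gene system below is an illustration with made-up coefficients, not the paper's 10-gene network:

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, alpha):
    """Centered sigmoid regulatory function (12)."""
    return 1.0 / (1.0 + np.exp(-alpha * x)) - 0.5

def rhs(t, X):
    # Toy 3-gene sigmoid-regulation network (coefficients are illustrative).
    return [0.9 * f(X[0], 2.0) + 1.5 * f(X[1], 2.5),
            1.7 * f(X[0], 2.0) - 0.95 * f(X[2], 3.0),
            1.45 * f(X[1], 2.5) - 1.2 * f(X[2], 3.0)]

t_obs = np.linspace(0, 10, 20)                 # n = 20 observation times
sol = solve_ivp(rhs, (0, 10), [0.6, -0.3, -0.75], t_eval=t_obs)
rng = np.random.default_rng(3)
Y = sol.y + rng.normal(0, 0.05, sol.y.shape)   # noisy data per model (4)
print(Y.shape)
```

Since the centered sigmoid is bounded in (−0.5, 0.5), the derivatives are bounded and the trajectories cannot blow up over a finite horizon, which keeps the numerical solution stable.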

Next, we consider a sparse GRN consisting of p = 30 genes. The nonlinear ODE system is randomly generated, and we need to make sure that it is identifiable and has stable solutions. There are several identifiability analysis approaches, such as structural, practical, and numerical identifiability analyses, that can be used to evaluate the identifiability of nonlinear ODE models; a good survey of ODE model identifiability can be found in Miao et al.60 Notice that, among these p = 30 coupled nonlinear ODEs, the total number of parameters is 2p² = 1800, which makes variable selection and parameter estimation quite challenging. The total number of nonzero coefficients is 77, and there are 2 to 4 nonzero coefficients in each differential equation. The nonlinear parameters αkj are randomly generated in [2, 4], as before. Examples of generated parameters for the p = 30 nonlinear additive ODEs are provided in Supporting Information. The number of observed time points for each Xj(t), j = 1, …, 30, is set to be n = 50 or 100, and the standard deviation of the measurement error is σ = 0.01 or 0.05.

The accuracy of our variable selection approach is measured by sensitivity and specificity, which are defined as follows:

sensitivity = (# of correctly estimated edges) / (# of all edges in the true network),
specificity = (# of correctly estimated edges) / (# of all estimated edges).
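A small helper matching these definitions, with edges encoded as (k, j) pairs; note that "specificity" here follows the paper's definition (correctly estimated edges over all estimated edges, ie, precision):

```python
def edge_accuracy(true_edges, est_edges):
    """Sensitivity and specificity of an estimated edge set, per the text."""
    true_edges, est_edges = set(true_edges), set(est_edges)
    correct = len(true_edges & est_edges)
    return correct / len(true_edges), correct / len(est_edges)

sens, spec = edge_accuracy({(1, 2), (2, 1), (2, 3)}, {(1, 2), (2, 3), (3, 1)})
print(sens, spec)
```

Here 2 of the 3 true edges are recovered among 3 estimated edges, so both measures equal 2/3.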

The simulations are repeated 100 times, and the results are presented in Table 1. Both the sensitivity and specificity of the fitted models increase as the sample size increases and the measurement error variance decreases. For the small-scale network with 10 genes, our method can correctly identify over 70% of the nonzero coefficients under all simulation scenarios. For the larger scale network with 30 genes, our method can correctly select about 60% of the variables under the worst scenario (the largest measurement error and smallest sample size) and about 66% under the best scenario (the smallest measurement error and largest sample size). As suggested by one referee, we also performed comparison simulations with the existing linear ODE model selection method17; that is, although the data are generated from nonlinear ODE models, we apply the linear ODE model selection method17 to the data and see whether we can still recover the correct variables in the model. As expected, the sensitivity and specificity of the linear ODE model selection method are poor, in particular for the higher dimensional cases with large noise (Table 1). On the other hand, we also tried to perform simulation comparisons with data generated from a linear ODE model, but we failed to obtain simulation results for most cases from the nonlinear ODE model approach in this setting. This is expected, since the nonlinear model estimation algorithm often fails to converge when the model is misspecified. This limitation is true for any nonlinear modeling approach.

TABLE 1.

The average values of the total number of estimated edges, and the sensitivity and specificity of the fitted models^a

Proposed Nonlinear ODE Method Existing Linear ODE Method
p n σ Sensitivity Specificity Sensitivity Specificity
10 20 0.2 0.7143 (0.0864) 0.7187 (0.0841) 0.5971 (0.0704) 0.3006 (0.0312)
0.05 0.8743 (0.0724) 0.8516 (0.0721) 0.5490 (0.0446) 0.3236 (0.0200)
50 0.2 0.8638 (0.0602) 0.8691 (0.0656) 0.5828 (0.0582) 0.3151 (0.0279)
0.05 0.9648 (0.0250) 0.9610 (0.0363) 0.5347 (0.0375) 0.3240 (0.0197)
30 50 0.05 0.6074 (0.0260) 0.7454 (0.0249) 0.0861 (0.0283) 0.0850 (0.0262)
0.01 0.6351 (0.0178) 0.7800 (0.0163) 0.0857 (0.0318) 0.0856 (0.0315)
100 0.05 0.6455 (0.0138) 0.7979 (0.0198) 0.0984 (0.0299) 0.0852 (0.0240)
0.01 0.6610 (0.0198) 0.8055 (0.0188) 0.0949 (0.0275) 0.0854 (0.0247)
^a Standard deviations in parentheses.

In general, the proposed method works well in identifying the structure of nonlinear ODE networks. The estimated parameters from the variable selection step can be used as the initial estimates for more refined ODE estimation procedures, such as the NLS method.30,31 Since the focus of this article is on the variable selection, we did not perform the parameter estimate refinement of the selected network.

5 |. APPLICATION TO DYNAMIC GRN

In this section, we apply the proposed method to infer the GRN of Saccharomyces cerevisiae, based on the time course microarray data collected by Orlando et al.61 The global transcription dynamics were examined in synchronized populations of both wild-type and cyclin-mutant yeast cells. The gene expression was measured every 8 minutes for a total of 30 time points, covering about 2 cell cycles in the wild type and about 1.5 cell cycles in the cyclin mutant. Here, we only use the data for the wild-type cells to illustrate our method. We focus on the cell cycle pathway compiled in the KEGG database (http://www.genome.jp/kegg/), which is centered around CDC28 (YBR160w), the catalytic subunit of the main cell cycle cyclin-dependent kinase (CDK). We select genes contained in this pathway that are also identified as cell cycle-regulated genes in Orlando et al.61 We also include 9 transcription factors that have been identified to have roles in regulating cell cycle-dependent yeast genes62; these are MBP1, SWI4, SWI6, MCM1, FKH1, FKH2, NDD1, SWI5, and ACE2. The total number of genes in this network is 60; that is, p = 60.

Following Chen et al,15 we model the dynamics of gene expressions by the nonlinear ODE model (2) with the regulatory function fkj as the centered sigmoid function (12). The sigmoid function is chosen because of its good properties for modeling “dose-response” relationship. Similar sigmoid functions are also popularly used in modeling nonlinear gene regulatory effects in literature.14,15 We first standardize the gene expression profiles of the 60 selected genes and obtain the estimates of the gene expression curves and their first derivatives by penalized smoothing splines as described in section 2.1. Figure 2 shows the smoothed trajectories (solid line) of 9 randomly selected genes. We can see that these genes show periodic patterns with 2 peaks, corresponding to 2 cell cycles. The estimates of the gene expression curves and their derivatives are then plugged into model (2) to form the pseudo nonlinear additive regressions. The variable selection method proposed in Section 2 is applied to identify significant regulations (connections) among the 60 genes, and the results are shown in Table 2. One important feature of the identified GRN is that each of these 60 genes is regulated by only a few other genes (ranging from 2 to 5 genes), reflecting the fact that the GRN is sparsely connected. The complete network is visualized in Figure 1. The fitted expression curves from the nonlinear model (dashed line) for 9 randomly selected genes are plotted and overlaid with the raw data and the smoothed expression curves (solid line) in Figure 2. We can see that these estimates fit the data quite well.

FIGURE 2

Gene expression data for 9 randomly selected genes overlaid with the smoothed curves (solid line) and fitted curves by the nonlinear ODE model (dashed line)

TABLE 2.

Yeast cell cycle gene regulatory network. For a particular gene listed in Column 1, Column 2 provides the list of genes that have significant regulation effects (inward influence) on this gene, and Column 3 provides the list of genes that this particular gene has a significant regulation effect on (outward influence)

Gene Inward Influence Genes Outward Influence Genes
ACE2 CDC14, CDC20, FKH1, SPO12 CLN3
APC9 CDC28, CDC5
BUB1 NDD1, PCL1
CDC14 DBF2, MCM7 ACE2, CLB1, CLB4, FKH1, YHP1
CDC20 CDC20, MCM5, SPO12 ACE2, CDC20, FKH1
CDC28 CDC7, CLB5, CLN3, FAR1, SWI4 APC9, CDC6, CDC7, CLB5, CLB6, CLN3, FKH2, MBP1, MCM1, MCM2, MCM3, MCM6, MPS1, MRC1, SWI6, WHI5
CDC45 MCM3, MCM7, NDD1, YHP1 CDC6, MCM4, MCM5
CDC5 DBF2, ESP1, WHI5 APC9, DBF2
CDC6 CDC28, CDC45, DBF2, SIC1 CLB2, MCM1, ORC1
CDC7 CDC28, MCM2 CDC28, MAD3, RAD17, SWI4
CLB1 CDC14, MAD3, SPO12 CLB4, ESP1, YCS4
CLB2 CDC6, HSL1, SIC1, SWI6 FKH1
CLB3 DBF2, MCM3, WHI5 MOB1, SWI5
CLB4 CDC14, CLB1 SLK19, WHI5
CLB5 CDC28, MCM5 CDC28, FAR1, MCM2, MCM3, MCM4, MCM7, ORC1
CLB6 CDC28, MCM3 MBP1
CLN1 CLN3, FKH1, SPO12 SWI4
CLN2 CLN3, FKH2, MCM4, SPO12
CLN3 ACE2, CDC28, MBP1, YCS4 CDC28, CLN1, CLN2, MCD1, RAD17, SWI4, SWI5, SWI6, YOX1
DBF2 CDC5, MCM3, SIC1, YCS4 CDC14, CDC5, CDC6, CLB3, FKH2, MPS1
DUN1 MCM3, SWE1, SWI6, WHI5
ESP1 CLB1, SWI6 CDC5
FAR1 CLB5, SLK19 CDC28
FKH1 CDC14, CDC20, CLB2, SWI5 ACE2, CLN1
FKH2 CDC28, DBF2 CLN2, NDD1, YCS4
GIN4 MCM2, SWI6
HSL1 HSL1, MCM2, WHI5 CLB2, HSL1
LTE1 MCM2, MOB1
MAD3 CDC7, SPO12 CLB1, SCC2, SMC1
MBP1 CDC28, CLB6 CLN3
MCD1 CLN3, MCM3
MCM1 CDC28, CDC6, SIC1 MOB1, PCL1
MCM2 CDC28, CLB5 CDC7, GIN4, HSL1, LTE1, RAD53, SMC3
MCM3 CDC28, CLB5 CDC45, CLB3, CLB6, DBF2, DUN1, MCD1, MCM5, YOX1
MCM4 CDC45, CLB5 CLN2, PCL1
MCM5 CDC45, MCM3, SCC2, YCS4 CDC20, CLB5, MRC1, RAD53
MCM6 CDC28, SIC1
MCM7 CLB5, SIC1 CDC14, CDC45
MOB1 CLB3, MCM1, SIC1, SWI6 LTE1, SPO12
MPS1 CDC28, DBF2
MRC1 CDC28, MCM5 NDD1
NDD1 FKH2, MRC1 BUB1, CDC45, RAD53, SMC3, SWE1
ORC1 CDC6, CLB5 SCC2
PCL1 MCM1, MCM4 BUB1
RAD17 CDC7, CLN3
RAD53 MCM2, MCM5, NDD1 WHI5
SCC2 MAD3, ORC1, SWI6 MCM5
SIC1 SIC1, SWI5 CDC6, CLB2, DBF2, MCM1, MCM6, MCM7, MOB1, SIC1, SPO12
SLK19 CLB4, SLK19 FAR1, SLK19, SWI6
SMC1 MAD3, WHI5
SMC3 MCM2, NDD1
SPO12 MOB1, SIC1, SWI6 ACE2, CDC20, CLB1, CLN1, CLN2, MAD3
SWE1 NDD1, SWI4 DUN1
SWI4 CDC7, CLN1, CLN3 CDC28, SWE1, SWI6
SWI5 CLB3, CLN3 FKH1, SIC1, YHP1
SWI6 CDC28, CLN3, SLK19, SWI4 CLB2, DUN1, ESP1, GIN4, MOB1, SCC2, SPO12
WHI5 CDC28, CLB4, RAD53 CDC5, CLB3, DUN1, HSL1, SMC1
YCS4 CLB1, FKH2 CLN3, DBF2, MCM5
YHP1 CDC14, SWI5 CDC45
YOX1 CLN3, MCM3

FIGURE 1.

Graph of yeast cell cycle GRN formed by 60 genes

We find that gene CDC28 regulates the most genes in this network, demonstrating its central role in the yeast cell cycle. This is consistent with the fact that CDC28 is essential for the completion of Start, the controlling event in the cell cycle, and that it associates with different regulators throughout the cell cycle to accomplish the waves of CDK activity that drive cell cycle events through phosphorylation of key substrates.63,64 The identified network also includes several important pathways of the cell cycle process. For example, in our network, CDC28 and the G1 cyclin CLN3 regulate each other, CDC28 regulates the repressor of G1 transcription WHI5, and CLN3 regulates the other two G1 cyclins, CLN1 and CLN2, as well as the SBF complex (SWI4-SWI6). This is in line with the first wave of CDK activity, in which CDC28 associates with CLN3 and inactivates WHI5, which in turn leads to active SBF, a transcription factor complex that promotes transcription of CLN1 and CLN2.65,66 The MCM2–7 proteins are components of a DNA helicase that plays an essential role in DNA replication and cell proliferation. They form a ring-shaped heterohexamer that assembles as part of the prereplicative complex (pre-RC) during the G1 phase of the cell cycle and is activated at the G1/S transition by the DNA replication initiation factor CDC45.67,68 As cells enter the S phase, CLB/CDC28 kinases trigger the initiation of DNA replication, which results in disassembly of the pre-RCs.67 In addition, CLB/CDC28 kinases can prevent the reassembly of pre-RCs until the end of mitosis by promoting the net nuclear export of MCM proteins.69 In our network, genes MCM2–7 are regulated by either CDC28 or CDC45, consistent with these biological findings.

6 |. DISCUSSION

In this article, we considered variable selection for high-dimensional nonlinear additive ODE models. To tackle the challenge of high dimensionality, we proposed a 4-step variable selection procedure, coupling the 2-stage smoothing-based estimation method for ODEs with a nonlinear independence screening (NLIS) method based on the pseudo-residual sum of squares of the marginal models. We extended the independence screening approach from linear and generalized linear models44,45 to the nonlinear additive ODE model setting and showed that our NLIS method possesses the sure screening property and can handle problems with non-polynomial dimensionality. After reducing the dimension of covariates to a moderate size through the NLIS procedure, we used the nonnegative garrote (NNG) for nonlinear additive regressions to perform more refined variable selection on the submodel conditional on the covariates selected by the NLIS. The proposed method was applied to simulated data and a real data example, identifying the dynamic GRN of S. cerevisiae, to illustrate its numerical performance.
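The marginal screening idea can be sketched in a few lines. This sketch assumes a fixed regulatory function f and profiles out only a linear coefficient per candidate regulator, whereas the paper's procedure also optimizes nonlinear parameters inside each fkj; all function and variable names below are illustrative.

```python
import numpy as np

def marginal_prss(xj, dxk, f):
    """Pseudo-residual sum of squares of the marginal model
    dX_k/dt ~ c * f(X_j), with the linear coefficient c profiled out
    by least squares."""
    g = f(xj)
    denom = float(np.dot(g, g))
    c = np.dot(g, dxk) / denom if denom > 0 else 0.0
    resid = dxk - c * g
    return float(np.dot(resid, resid))

def nlis_screen(X, dxk, f, d):
    """Screen candidate regulators for response gene k: rank the columns
    of X by marginal pseudo-RSS and keep the d best (smallest RSS)."""
    rss = np.array([marginal_prss(X[:, j], dxk, f) for j in range(X.shape[1])])
    return np.argsort(rss)[:d]

# Toy usage: gene 3 is the true regulator of the pseudo-derivative dxk
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
dxk = 2.0 * np.tanh(X[:, 3]) + 0.1 * rng.normal(size=100)
selected = nlis_screen(X, dxk, np.tanh, d=5)
```

In practice the response here is not observed data but the smoothed derivative estimate, which is why the residuals are "pseudo" and dependent, as discussed below.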

Combining the 2-stage smoothing-based estimation method for ODE models with high-dimensional variable selection techniques is not trivial. For the pseudo nonlinear additive model, the minimum distinguishable signal of the marginal screening is closely related to the measurement errors and the estimation errors of the state variables and their derivatives, as well as the stochastic and numerical errors in the optimization over the nonlinear parameters. In addition, the objective function derived from the pseudo nonlinear additive model often has multiple local minima, making estimation and marginal screening computationally more challenging. There are also several critical challenges in the theoretical development. First, the decoupled pseudo nonlinear additive models are based on the nonparametric smoothing estimates of the ODE state variables and their derivatives, rather than on the observed data; the errors in these regressions are therefore not i.i.d., but dependent. Moreover, the variable selection is also performed with respect to the smoothing estimates, so the theoretical properties are not trivial to establish. Specifically, our results improve the uniform convergence results for the marginal nonparametric estimation obtained by Fan et al46 from nonparametric rates to parametric rates, which in turn improves the convergence rate of the nonparametric independence screening.

In our nonlinear additive ODE model (2), the nonlinear functions fkj are assumed to be known, which requires some prior knowledge about the relations between the covariates and the response. If no such prior information is available, applying our model may introduce bias; in this case, we recommend replacing model (2) with a nonparametric ODE model, although the corresponding methodology and theory need to be carefully studied. In model (2), we only considered an additive structure for the nonlinear functions. The proposed 4-step variable selection procedure may be extended to more complex structures, such as ODE models with interactions, which we leave for future research. We would like to point out that a critical limitation of any nonlinear modeling approach is its sensitivity to model misspecification: if the nonlinear ODE model is misspecified, the computational algorithm often fails to converge.

We used the 2-stage smoothing-based ODE estimation method for the nonlinear ODE model in order to reduce the computational cost and simplify the implementation of the variable selection methods. Other, more efficient estimation approaches for ODE models, such as the generalized profiling approach,39 may also be considered for high-dimensional ODE variable selection, but they may be computationally challenging. In addition, we adopt a data augmentation strategy for ODE parameter estimation in step II, which may improve the numerical performance of the proposed method. Intuitively, the larger the data augmentation size, the better the estimation; however, a larger augmentation size also increases the computational cost, which may become a serious problem when the ODE system is large. Studying the trade-off between the data augmentation size and the computational cost remains an open problem.
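The data augmentation step can be sketched as follows: the smoothed curve and its derivative are evaluated on a denser grid of m points than the observation grid, so the cost of each pseudo-regression grows with m. The function name and the fixed smoothing parameter are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def augment_pseudo_data(t, y, m, s=1.0):
    """Data augmentation for step II: smooth the observed profile once,
    then evaluate the fitted curve and its first derivative on a grid of
    m points (m >> len(t)) to form the augmented pseudo-data."""
    spl = UnivariateSpline(t, y, k=5, s=s)
    t_aug = np.linspace(t[0], t[-1], m)
    return t_aug, spl(t_aug), spl.derivative(1)(t_aug)

# Toy usage: augment 18 noisy observations to a 200-point pseudo-sample
t = np.linspace(0.0, 1.0, 18)
y = np.sin(2 * np.pi * t) + 0.05 * np.random.default_rng(2).normal(size=t.size)
t_aug, x_aug, dx_aug = augment_pseudo_data(t, y, 200, s=0.1)
```

Each pseudo-regression then has m rather than len(t) observations, which is the source of the cost/accuracy trade-off noted above.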

In this article, we have focused on variable selection, although the proposed approach also produces parameter estimates simultaneously. However, the parameter estimates from a variable selection procedure are usually neither efficient nor accurate, and a parameter estimation refinement step based on the selected model is required to obtain good estimates. For high-dimensional nonlinear ODE models, this refinement is nontrivial because of local solutions and convergence problems in high-dimensional nonlinear optimization. The parameter estimates from the variable selection procedure can be used as initial values for the refinement step, but this is beyond the scope of this paper; we refer the readers to Li et al30 and Xue et al.31

Although we have used time course gene expression data from S. cerevisiae, a popular experimental organism for studying gene regulatory networks and biological mechanisms at molecular levels, the proposed model and methodology can be used to model any high-dimensional nonlinear system, in particular, within-host dynamic responses to infectious diseases at the gene and protein levels. We are currently applying these methods to time course gene expression and proteomics data from HIV and influenza infections; the biological findings will be reported elsewhere in the near future.

Supplementary Material

sup1

ACKNOWLEDGEMENTS

Hongqi Xue and Shuang Wu are joint first authors, and Hulin Wu is the corresponding author of this article. The authors thank Dr Rui Song for helpful discussions and Dr Nan Deng for his help in simulations. This research was partially supported by the NIH grants HHSN272201000055C, P30AI078498, R01 AI087135, HHSN27220201200005C, HHSN266200700008C, and P01CA142538, and NSF grants DMS-1055210 and DMS-1812354.

Funding information

NIH, Grant/Award Numbers: HHSN272201000055C, P30AI078498, R01 AI087135, HHSN27220201200005C, HHSN266200700008C and P01CA142538; NSF, Grant/Award Number: DMS-1055210 and DMS-1812354

Footnotes

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section at the end of the article.

REFERENCES

  • 1.Steuer R, Kurths J, Daub CO, Weise J, Selbig J. The mutual information: detecting and evaluating dependencies between variables. Bioinformatics. 2002;18:S231–S240. [DOI] [PubMed] [Google Scholar]
  • 2.Stuart JM, Segal E, Koller D, Kim SK. A gene-coexpression network for global discovery of conserved genetic modules. Science. 2003;302:249–255. [DOI] [PubMed] [Google Scholar]
  • 3.Thomas R Boolean formalization of genetic control circuits. J Theor Biol. 1973;42:563–585. [DOI] [PubMed] [Google Scholar]
  • 4.Bornholdt S Boolean network models of cellular regulation: prospects and limitations. J R Soc Interface. 2008;5(Suppl. 1):S85–94. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Kim S, Imoto S, Miyano S. Inferring gene networks from time series microarray data using dynamic Bayesian networks. Brief Bioinform. 2003;4:228–235. [DOI] [PubMed] [Google Scholar]
  • 6.Perrin BE, Ralaivola L, Mazurie A, Bottani S, Mallet J, d’Alche Buc F. Gene networks inference using dynamic Bayesian networks. Bioinformatics. 2003;19(Suppl. 2):ii138–148. [DOI] [PubMed] [Google Scholar]
  • 7.Nachman I, Regev A, Friedman N. Inferring quantitative models of regulatory networks from expression data. Bioinformatics. 2004;20(Suppl. 1):i248–256. [DOI] [PubMed] [Google Scholar]
  • 8.Zou M, Conzen SD. A new dynamic Bayesian network (DBN) approach for identifying gene regulatory networks from time course microarray data. Bioinformatics. 2005;21:71–79. [DOI] [PubMed] [Google Scholar]
  • 9.Hirose O, Yoshida R, Imoto S, et al. Statistical inference of transcriptional module-based gene networks from time course gene expression profiles by using state space models. Bioinformatics. 2008;24:932–942. [DOI] [PubMed] [Google Scholar]
  • 10.Kojima K, Yamaguchi R, Imoto S, et al. A state space representation of var models with sparse learning for dynamic gene networks. Genome Inform. 2009;22:56–68. [PubMed] [Google Scholar]
  • 11.Shimamura T, Imoto S, Yamaguchi R, Fujita A, Nagasaki M, Miyano S. Recursive regularization for inferring gene networks from time-course gene expression profiles. BMC Syst Biol. 2009;3:41–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Shojaie A, Michailidis G. Analysis of gene sets based on the underlying regulatory network. J Comput Biol. 2009;16:407–426. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Holter NS, Maritan A, Cieplak M, Fedoroff NV, Banavar JR. Dynamic modeling of gene expression data. PNAS USA. 2001;98:1693–1698. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Yeung MKS, Tegner J, Collins JJ. Reverse engineering gene networks using singular value decomposition and robust regression. PNAS USA. 2002;99:6163–6168. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Chen H-C, Lee H-C, Lin T-Y, Li W-H, Chen B-S. Quantitative characterization of the transcriptional regulatory network in the yeast cell cycle. Bioinformatics. 2004;20:1914–1927. [DOI] [PubMed] [Google Scholar]
  • 16. Bansal M, Gatta GD, di Bernardo D. Inference of gene regulatory networks and compound mode of action from time course gene expression profiles. Bioinformatics. 2006;22:815–822. [DOI] [PubMed] [Google Scholar]
  • 17.Lu T, Liang H, Li H, Wu H. High dimensional ODEs coupled with mixed-effects modeling techniques for dynamic gene regulatory network identification. J Am Stat Assoc. 2011;106:1242–1258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Hanneke S, Xing EP. Discrete temporal models of social networks In Airoldi EM, Blei DM, Fienberg SE, Goldenberg A, Xing EP, Zheng AX, eds. Statistical Network Analysis: Models, Issues and New Directions of Lecture Notes in Computer Science; vol. 4503 Berlin/Heidelberg: Springer; 2007. [Google Scholar]
  • 19. Guo F, Hanneke S, Fu W, Xing EP. Recovering temporally rewiring networks: a model-based approach. In: Proceedings of the 24th International Conference on Machine Learning (ICML) ACM; 2007:321–328. [Google Scholar]
  • 20.Kolar M, Song L, Ahmed A, Xing EP. Estimating time-varying networks. Ann Appl Stat. 2010;4:94–123. [Google Scholar]
  • 21.De Jong H Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol. 2002;9:67–103. [DOI] [PubMed] [Google Scholar]
  • 22.Filkov V. Identifying gene regulatory networks from gene expression data In: Handbook of Computational Molecular Biology. New York: Chapman and Hall; 2005;27:1–30. [Google Scholar]
  • 23.Hecker M, Lambecka S, Toepferb S, Somerenc EV, Guthke R. Gene regulatory network inference: data integration in dynamic models: a review. BioSystems. 2009;96:86–103. [DOI] [PubMed] [Google Scholar]
  • 24.Savageau MA. Biochemical Systems Analysis: A Study Of Function and Design in Molecular Biology. Reading Mass.: Addison-Wesley Publishing Company, Inc.; 1970. [Google Scholar]
  • 25.Kimura S, Ide K, Kashihara A, et al. Inference of s-system models of genetic networks using a cooperative coevolutionary algorithm. Bioinformatics. 2005;21:1154–1163. [DOI] [PubMed] [Google Scholar]
  • 26.Wu H, Lu T, Xue H, Liang H. Sparse additive ODEs for dynamic gene regulatory network modeling. J Am Stat Assoc. 2014;109:700–716. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi A-L. The large-scale organization of metabolic networks. Nature. 2000;407:651–654. [DOI] [PubMed] [Google Scholar]
  • 28.Seber GAF, Wild CJ. Nonlinear Regression, Wiley Series in Probability and Statistics. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2003. [Google Scholar]
  • 29.Wei BC. Exponential Family Nonlinear Models. Singapore: Springer-Verlag; 1998. [Google Scholar]
  • 30.Li Z, Osborne MR, Prvan T. Parameter estimation in ordinary differential equation. IMA J Numer Anal. 2005;25:264–285. [Google Scholar]
  • 31.Xue H, Miao H, Wu H. Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error. Ann Stat. 2010;38(4):2351–2387. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Brunel N Parameter estimation of ODE’s via nonparametric estimators. Electron J Stat. 2008;2:1242–1267. [Google Scholar]
  • 33.Liang H, Wu H. Parameter estimation for differential equation models using a framework of measurement error in regression models. J Am Stat Assoc. 2008;103:1570–1583. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Wu H, Xue H, Kumar A. Numerical discretization-based estimation methods for ODE models with measurement error via penalized spline smoothing. Biometrics. 2012;38:344–352. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Gugushvili S, Klaassen CAJ. √n-consistent parameter estimation for systems of ordinary differential equations: bypassing numerical integration via smoothing. Bernoulli. 2012;18:1061–1098. [Google Scholar]
  • 36.Putter H, Heisterkamp SH, Lange JMA, de Wolf F. A Bayesian approach to parameter estimation in HIV dynamical models. Stat Med. 2002;21:2199–2214. [DOI] [PubMed] [Google Scholar]
  • 37.Huang Y, Liu D, Wu H. Hierarchical Bayesian methods for estimation of parameters in a longitudinal HIV dynamic system. Biometrics. 2006;62:413–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Ramsay JO. Principal differential analysis: data reduction by differential operators. J R Stat Soc Series B Stat Methodol. 1996;58:495–508. [Google Scholar]
  • 39.Ramsay JO, Hooker G, Campbell D, Cao J. Parameter estimation for differential equations: a generalized smoothing approach (with discussion). J R Stat Soc Series B Stat Methodol. 2007;69(5):741–796. [Google Scholar]
  • 40.Qi X, Zhao H. Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of the parameters in ordinary differential equations. Ann Appl Stat. 2010;38:435–481. [Google Scholar]
  • 41.Zou H The adaptive lasso and its oracle properties. J Am Stat Assoc. 2006;101:1418–1429. [Google Scholar]
  • 42.Fan J, Li R. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001;96:1348–1360. [Google Scholar]
  • 43.Breiman L Better subset regression using the nonnegative garrote. Technometrics. 1995;37:373–384. [Google Scholar]
  • 44.Fan J, Lv J. Sure independence screening for ultrahigh dimensional feature space. J R Stat Soc Series B Stat Methodol. 2008;70:849–911. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Fan J, Song R. Sure independence screening in generalized linear models with NP-dimensionality. Ann Stat. 2010;38:3567–3604. [Google Scholar]
  • 46.Fan J, Feng Y, Song R. Nonparametric independence screening in sparse ultra-high-dimensional additive models. J Am Stat Assoc. 2011;106:544–557. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Fan J, Ma Y, Dai W. Nonparametric independence screening in sparse ultra-high dimensional varying coefficient models. Manuscript. 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Gorst-Rasmussen A, Scheike T. Independent screening for single-index hazard rate models with ultrahigh dimensional feature. J R Stat Soc Series B Stat Methodol. 2013;75:217–245. [Google Scholar]
  • 49.Hall P, Titterington DM, Xue J-H. Tilting methods for assessing the influence of components in a classifier. J R Stat Soc Series B Stat Methodol. 2009;71(4):783–803. [Google Scholar]
  • 50.Wang H Factor profiled sure independence screening. Biometrika. 2012;99:15–28. [Google Scholar]
  • 51.Xue L, Zou H. Sure independence screening and compressed random sensing. Biometrika. 2011;98:371–380. [Google Scholar]
  • 52.Li GR, Peng H, Zhang J, Zhu LX. Robust rank correlation based screening. Ann Appl Stat. 2012;40:1846–1877. [Google Scholar]
  • 53.Li R, Zhong W, Zhu LP. Feature screening via distance correlation learning. J Am Stat Assoc. 2012;107:1129–1139. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Chang J, Tang CY, Wu Y. Marginal empirical likelihood and sure independence feature screening. Ann Stat. 2013;41:2123–2148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression, Cambridge Series in Statistical and Probabilistic Mathematics, vol. 12 Cambridge: Cambridge University Press; 2003. [Google Scholar]
  • 56.Craven P, Wahba G. Smoothing noisy data with spline functions. Estimating the correct degree of smoothing by the method of generalized cross-validation. Numer Math. 1979;31(4):377–403. [Google Scholar]
  • 57.Aydin D, Tuzemen MS. Smoothing parameter selection problem in nonparametric regression based on smoothing spline: a simulation study. J Appl Sci. 2012;12:636–644. [Google Scholar]
  • 58.Fan J, Samworth R, Wu Y. Ultra-dimensional variable selection via independence learning: beyond the linear model. J Mach Learn Res. 2009;10:1829–1853. [PMC free article] [PubMed] [Google Scholar]
  • 59.Yuan M, Lin Y. On the nonnegative garrote estimator. J R Stat Soc Series B Stat Methodol. 2007;69:143–161. [Google Scholar]
  • 60.Miao H, Xia X, Perelson AS, Wu H. On identifiability of nonlinear ODE models and applications in viral dynamics. SIAM Rev. 2010;53:3–39. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Orlando DA, Lin CY, Bernard A, et al. Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature. 2008;453:944–947. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Simon I, Barnett J, Hannett N, et al. Serial regulation of transcriptional regulators in the yeast cell cycle. Cell. 2001;106:697–708. [DOI] [PubMed] [Google Scholar]
  • 63.Wittenberg C Cell cycle: cyclin guides the way. Nature. 2005;434:34–35. [DOI] [PubMed] [Google Scholar]
  • 64.Enserink JM, Kolodner RD. An overview of CDK1-controlled targets and processes. Cell Division. 2010;5:11. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.de Bruin R, McDonald W, Kalashnikova T, Yates J, Wittenberg C. Cln3 activates G1-specific transcription via phosphorylation of the SBF bound repressor Whi5. Cell. 2004;117:887–898. [DOI] [PubMed] [Google Scholar]
  • 66.Costanzo M, Nishikawa JL, Tang X, et al. CDK activity antagonizes Whi5, an inhibitor of G1/s transcription in yeast. Cell. 2004;117:899–913. [DOI] [PubMed] [Google Scholar]
  • 67.Aparicio OM, Weinstein DM, Bell SP. Components and dynamics of DNA replication complexes in S. cerevisiae: redistribution of mcm proteins and Cdc45p during S phase. Cell. 1997;91:59–69. [DOI] [PubMed] [Google Scholar]
  • 68.Ilves I, Petojevic T, Pesavento JJ, Botchan MR. Activation of the MCM2–7 helicase by association with Cdc45 and GINS proteins. Mol Cell. 2010;37:247–258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 69.Nguyen VQ, Co C, Irie K, Li JJ. Clb/Cdc28 kinases promote nuclear export of the replication initiator proteins Mcm2–7. Curr Biol. 2000;10:195–205. [DOI] [PubMed] [Google Scholar]
