Author manuscript; available in PMC: 2013 Jul 21.
Published in final edited form as: Struct Equ Modeling. 2009 Oct;16(4):676–701. doi: 10.1080/10705510903206055

Non-linear Growth Models in Mplus and SAS

Kevin J Grimm 1, Nilam Ram 2
PMCID: PMC3717396  NIHMSID: NIHMS482398  PMID: 23882134

Abstract

Non-linear growth curves, or growth curves that follow a specified non-linear function in time, enable researchers to model complex developmental patterns with parameters that are easily interpretable. In this paper we describe how a variety of sigmoid curves can be fit using the Mplus structural modeling program and the non-linear mixed-effects modeling procedure NLMIXED in SAS. Using longitudinal achievement data collected as part of a study examining the effects of preschool instruction on academic gain, we illustrate the procedures for fitting growth models of logistic, Gompertz, and Richards functions. Brief notes regarding the practical benefits, limitations, and choices faced in the fitting and estimation of such models are included.


Often a first task in studies of development is describing how individuals change (e.g., grow and/or decline) over time (Wohlwill, 1973). Growth curve techniques and various extensions provide some of the tools necessary for modeling within-person changes and between-person differences in change (e.g., Bryk & Raudenbush, 1987, 1992; McArdle & Epstein, 1987; Rogosa & Willett, 1985; Singer & Willett, 2003). In recent years, as the breadth of substantive applications has widened, researchers have begun considering and seeking to use growth curve methods to describe complex patterns of non-linear change (see McArdle & Nesselroade, 2003, for a review). In this paper we present some background information on a set of growth curves that may be useful in describing longitudinal trends characterized by an elongated “S” or “sigmoid” shape, specifically, curves that follow the logistic, Gompertz, and Richards (generalized logistic) functions, and illustrate some procedures by which these models may be fit to empirical data.

As a companion to Ram and Grimm’s (2007) introduction to non-linear growth models, we highlight further how the Mplus and SAS frameworks can be used to describe non-linear changes with longitudinal time-structured data. We begin with a brief review of the basic growth curve modeling framework and an overview of how the framework may be used to describe patterns of non-linear change. Subsequently, we introduce an example data set and illustrate how various sigmoid curves can be fit to the empirical data using both Mplus and SAS, two popular statistical programs that complement each other because of their differing frameworks (i.e., structural equation and multilevel), and respective benefits and limitations (see Ghisletta & Lindenberger, 2004). We conclude with some notes regarding the practical benefits, limitations, and choices faced in the fitting and estimation of such models. Although a number of equations are included, the presentation is also meant to highlight the conceptual utility of the models and can be read as such.

Growth Modeling

Growth modeling (Browne, 1993; Cudeck, 1996; McArdle, 1986, 1988; Meredith & Tisak, 1990; Muthén & Curran, 1997; Rogosa & Willett, 1985) is a contemporary analytic technique for modeling systematic within-person change across a series of repeated measurements and between-person differences in those changes. Given repeated measurement of a variable, Y, for n = 1 to N participants on t = 1 to T occasions (or ages), a general form of the growth model can be written as

$$Y[t]_n = g_{0n} + g_{1n} \cdot A_1[t] + g_{2n} \cdot A_2[t] + \cdots + g_{kn} \cdot A_k[t] + e[t]_n, \tag{1}$$

where g0n is the intercept for subject n (or predicted score when the vectors A1 to Ak equal zero), g1n to gkn are the individual slopes or the expected amount of change in Y for a one unit change in A1 to Ak, respectively, A1 to Ak are vectors of basis coefficients indicating the relationship between the slopes and the observed scores, and e[t]n is a time-dependent residual that is uncorrelated with the intercept and slopes. The intercept and slopes are assumed to follow multivariate normal distributions with means, variances, and covariances, e.g., $g_{0n}, g_{1n}, \ldots, g_{kn} \sim \mathrm{MVN}\left( \begin{bmatrix} \mu_0 \\ \mu_1 \\ \vdots \\ \mu_k \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & & & \\ \sigma_{01} & \sigma_1^2 & & \\ \vdots & \vdots & \ddots & \\ \sigma_{0k} & \sigma_{1k} & \cdots & \sigma_k^2 \end{bmatrix} \right)$. The time-dependent residuals are assumed to have a mean of zero and a single variance (homogeneity of residuals assumption), and to be uncorrelated with other variables and with each other.

Conceptually, the basic growth modeling framework is used to capture the average trend or pattern of change over time and between-person differences around the average trend. Practically, the framework fits within and bridges both structural equation modeling (SEM) and multilevel modeling traditions, and can be estimated in many SEM software packages, including Mplus (Muthén & Muthén, 1998–2007), LISREL (Jöreskog & Sörbom, 1996), AMOS (Arbuckle & Wothke, 1999), Mx (Neale, Boker, Xie, & Maes, 1999), and EQS (Bentler, 1995), and in mixed-effects or multilevel programs, including HLM (Raudenbush, Bryk, Cheong, & Congdon, 2004), PROC MIXED in SAS (Littell, Milliken, Stroup, Wolfinger, & Schabenberger, 2006), SPSS MIXED, and LME in Splus.

In brief, within SEM, the basic growth model is fit as a restricted common factor model (Meredith & Tisak, 1990). The intercept, g0n, and slopes, g1n to gkn, are latent factors indicated by the observed repeated measures. The loadings for the intercept are fixed at 1 and the loadings for the slope(s) (e.g., A1 through Ak) define the shape or pattern of change. The intercept and slope factors have estimated means, variances, and covariances that, together with the residual variance, define the model’s structural expectations for the observed covariance matrix and mean vector (see Grimm & McArdle, 2005).
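To make the idea of "structural expectations" concrete, the following minimal Python sketch computes the mean vector and covariance matrix implied by a growth model with one intercept and one slope. It is an illustration only, not the estimation routine of any SEM program; the function name `implied_moments` and every parameter value are hypothetical.

```python
# Model-implied moments for Y[t] = g0 + g1*A1[t] + e[t].
# All parameter values below are hypothetical, chosen only for illustration.

def implied_moments(basis, mu0, mu1, var0, var1, cov01, resid_var):
    """Return (means, covariances) implied by an intercept-plus-slope growth model."""
    # Expected score at each occasion: intercept mean plus slope mean times loading.
    means = [mu0 + mu1 * a for a in basis]
    covs = []
    for i, a_i in enumerate(basis):
        row = []
        for j, a_j in enumerate(basis):
            # Cov(g0 + g1*a_i, g0 + g1*a_j)
            c = var0 + (a_i + a_j) * cov01 + a_i * a_j * var1
            if i == j:
                c += resid_var  # the single residual variance enters on the diagonal only
            row.append(c)
        covs.append(row)
    return means, covs

basis = [0, 1, 2, 3]  # linear basis vector A1 for four occasions
means, covs = implied_moments(basis, mu0=10.0, mu1=2.0,
                              var0=4.0, var1=1.0, cov01=0.5, resid_var=3.0)
print(means)
print(covs[0])
```

Changing only the `basis` list changes the implied shape of change, which is the sense in which the slope loadings define the growth pattern.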

Figure 1 is a path diagram of a growth model with an intercept and a slope. In the diagram, squares indicate manifest variables, circles indicate latent variables, and the triangle represents the unit constant. Directive relationships such as regression paths and factor loadings are represented as one-headed arrows; unanalyzed or symmetric relationships such as variances and covariances are represented as two-headed arrows; unlabeled paths are fixed at 1. In this model there is a latent intercept, g0, with unit factor loadings, and one latent slope, g1, with loadings equal to the basis vector A1. The intercept and slope have means (i.e., one-headed arrows from the constant), variances (i.e., two-headed arrows from and to the same variable), and a covariance (i.e., two-headed arrow connecting g0 and g1). The manifest variables have a single residual variance indicated by the common label (i.e., $\sigma_e^2$).

Figure 1. Path diagram of a latent growth model

Notes: (1) Squares indicate manifest variables, (2) Circles indicate latent variables, (3) Triangle represents the unit constant, (4) ‘→’ indicate directive relationships, (5) ‘↔’ indicate symmetric relationships (variances/covariances), (6) g0 is the latent intercept, (7) g1 is the latent slope, (8) ‘A1’ are the loadings for g1, (9) unlabeled parameters are fixed at 1.

As a multilevel model, the basic growth model is fit as a two-level model (see e.g., Singer & Willett, 2003). At level 1, the observed scores, Y[t]n, are regressed on variables that define the functional form of within-person change, A1 through Ak. At level 2, fixed effects and random effects are captured by the means, variances, and covariances of the resulting person-specific intercepts, g0n, and slope(s), g1n to gkn, obtained at level 1. Readers may find further explication of the SEM-multilevel correspondence in recent literature (e.g., Chou, Bentler, & Pentz, 1998; Curran, 2003; MacCallum, Kim, Malarkey, & Kiecolt-Glaser, 1997; Willett, 2004). Additionally, Ferrer, Hamagami, and McArdle (2004) describe how basic growth models can be fit in a variety of multilevel and SEM programs and Ghisletta and Lindenberger (2004) provide a succinct discussion of the advantages and disadvantages of fitting growth models within each of these frameworks.

Key to the specification of the growth model, whether conceptualized as a structural equation or multilevel model, are the elements of the basis vectors, A1 through Ak. These vectors (input as variables in the multilevel version of the model) are used to define a specific form of change. For example, if a linear pattern of change is desired, the elements of A1 would be fixed to progress in a linear manner (e.g., 1, 2, 3, …; “time” scores) and the elements of A2 to Ak would be fixed to be zero. As will be presented shortly, more complex patterns of change are accommodated by fixing or adjusting the elements of the basis vectors to reflect the desired change pattern (e.g., Gompertz, logistic). Before getting to the specifics, however, we give a brief overview of the non-linear models.

Non-linear Growth Modeling

There are many ways in which the simple linear growth model can be expanded or adapted to describe non-linear patterns of change over time (see also Ram & Grimm, 2007). One of the most common expansions is the addition of higher order polynomial terms to the linear growth model (e.g., Bryk & Raudenbush, 1992). For example, curvature in the change function might be accommodated by adding a quadratic term (e.g., time^2) and/or a cubic term (e.g., time^3) to the linear model. The linear model can be expanded by fixing the elements of A2 to progress in a quadratic manner (e.g., 1, 4, 9, …) and by fixing the elements of A3 to progress in a cubic manner (e.g., 1, 8, 27, …). Interindividual differences in the higher order polynomial components are captured by the variance/covariance parameters associated with those latent variables (e.g., g2n, g3n, …). Other adaptations include the latent basis growth model (Meredith & Tisak, 1990), where the pattern or shape of non-linear change is derived in an “exploratory,” data driven manner with minimal constraints on the elements of the “shape” vector (e.g., A1). Such models are able to capture non-linear forms of change over time. Conceptually, by either specifying or estimating the A1 to Ak vectors, almost any “shape” of change, non-linear or otherwise, can be accommodated in the growth modeling framework.
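The polynomial expansion described above can be sketched in a few lines of Python. The helper `polynomial_basis` and the person-specific coefficients are hypothetical and serve only to show how fixed basis vectors translate into an individual trajectory.

```python
# Build linear, quadratic, and cubic basis vectors from the time scores.
# The function name and the coefficient values are hypothetical illustrations.

def polynomial_basis(times, degree):
    """Return [A1, A2, ..., A_degree], where A_p holds t**p for each time score t."""
    return [[t ** p for t in times] for p in range(1, degree + 1)]

times = [1, 2, 3, 4]
A1, A2, A3 = polynomial_basis(times, 3)
print(A1)  # linear time scores
print(A2)  # quadratic progression
print(A3)  # cubic progression

# One person's expected trajectory under a quadratic growth model
g0, g1, g2 = 5.0, 2.0, -0.25  # hypothetical person-specific intercept and slopes
trajectory = [g0 + g1 * a1 + g2 * a2 for a1, a2 in zip(A1, A2)]
print(trajectory)
```

Because the basis vectors are fixed numbers rather than estimated quantities, the person-specific coefficients enter the trajectory additively.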

Sigmoid Curves

Individual change may be characterized by accelerations and decelerations of a particular form. Learning and population growth, for instance, often consist of multiple “phases”: an initial period of adjustment where little growth occurs, a rapid growth phase, and a slowdown as ability or population approaches task or environmental capacity limits (Thieme, 2003). Such patterns of growth can be described by sigmoid curves that generally look like an elongated “S” (see Figure 2). Key parameters of the mathematical functions used to describe such curves include the lower and upper asymptotes, the rate of acceleration, the location of changes, and the symmetry (or asymmetry) in the pattern of acceleration and deceleration. Sigmoid curves have a long history of use in many areas of study, including biology, physiology, and economics (e.g., Westerfeld, 1956; Winsor, 1932), where they have been used to describe change processes ranging from bacterial growth to product innovation to early life increases in brain size. Within psychology, sigmoid functions have historically been used to model probability of binary outcomes (e.g., logistic regression), item response probabilities (e.g., item response characteristic curves), neuronal function (e.g., Easton, 2005), and learning (e.g., Browne & du Toit, 1991). Applications within the growth curve modeling framework, however, have been few (see, however, Browne, 1993; McArdle, Ferrer-Caja, Hamagami, & Woodcock, 2002). Given the success of such functions for describing growth in many natural systems, we encourage further consideration and use of such functions to describe longitudinal panel data and investigate the intraindividual changes (and interindividual differences) therein. To foster such applications we illustrate how three types of sigmoid curves can be fit to longitudinal panel data using familiar growth curve modeling frameworks.

Figure 2. Example curves of (A) logistic growth model, (B) Gompertz growth model and (C) Richards curve.

Before proceeding we draw attention to an important technical and practical distinction qualifying the “non-linear” nature of non-linear growth curves. Thus far we have used the term non-linear to describe the pattern of observed changes with respect to time, not in reference to the characteristics of the parameters of the mathematical models describing these changes. Independent of the pattern of change over time, it must be noted that the mathematical functions describing intraindividual change come in (at least) two types, those that are linear in their parameters and those that are non-linear in their parameters. In brief, the distinction has to do with the manner in which the interindividual differences or random effects are incorporated in the model. In looking at Eq. 1 (or the path diagram in Figure 1) one can notice that the outcome variable, Y[t]n, is a weighted sum of the interindividual difference variables (e.g., g0n, g1n, g2n, …, gkn; the random effects). The weighting of each variable is given by the vectors A1, A2, …, Ak. When the elements of the Ak[t] vectors are fixed parameters (i.e., invariant across persons) the random effects are additive and the model is linear in its parameters. For example, an exponential change pattern can be defined by setting the basis vector A1[t] = e^(−α·t). If the α parameter is the same for all participants, then the random effects of the model (i.e., g0n and g1n) are additive and the model is considered to be a model of non-linear change that is linear in its parameters. In practice, however, the elements of the Ak vectors can take on almost any values, including those defined by a random, interindividual difference variable. Consider the case when A1[t] = e^(−αn·t), where αn is a variable that is allowed to differ between persons. In such cases, random effects are multiplied together (e.g., g1n·e^(−αn·t)), and are therefore multiplicative, which makes the model non-linear in its parameters.
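One practical consequence of this distinction can be checked numerically. In the hedged Python sketch below (hypothetical function name and values throughout), averaging two persons' exponential trajectories reproduces the trajectory of the "average person" when α is shared, but not when α differs between persons.

```python
import math

# Exponential change: Y[t] = g0 + g1 * exp(-alpha * t).
# All names and numeric values here are hypothetical illustrations.

def trajectory(g0, g1, alpha, times):
    return [g0 + g1 * math.exp(-alpha * t) for t in times]

times = [0, 1, 2, 3]

# Case 1: alpha is the same for everyone, so the model is linear in g0 and g1.
person_a = trajectory(10.0, 5.0, 0.5, times)
person_b = trajectory(20.0, 15.0, 0.5, times)
mean_of_curves = [(a + b) / 2 for a, b in zip(person_a, person_b)]
curve_of_means = trajectory(15.0, 10.0, 0.5, times)
fixed_alpha_matches = all(
    math.isclose(m, c) for m, c in zip(mean_of_curves, curve_of_means)
)

# Case 2: alpha differs between persons (0.2 and 0.8, averaging to 0.5).
person_c = trajectory(10.0, 5.0, 0.2, times)
person_d = trajectory(20.0, 15.0, 0.8, times)
mean_of_curves_2 = [(c + d) / 2 for c, d in zip(person_c, person_d)]
largest_gap = max(abs(m - c) for m, c in zip(mean_of_curves_2, curve_of_means))

print(fixed_alpha_matches)  # the two curves agree when alpha is shared
print(largest_gap > 0.1)    # they diverge when alpha varies across persons
```

This is one reason the mean parameters of a model that is non-linear in its parameters should not be read as describing the average trajectory.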

In this paper we present models of both types, without drawing too much attention to the technical details and distinctions, other than indicating which software can be used for which type of model and some differences in substantive interpretation. Readers are referred to a set of excellent papers that cover the technical details of parameterization, estimation, and fitting of non-linear growth curves that are either linear or non-linear in their parameters (Blozis, 2004, 2007; Blozis & Cudeck, 1999; Browne, 1993; Browne & du Toit, 1991; Cudeck & du Toit, 2002; Pinheiro & Bates, 2000). We limit our focus to a didactic illustration of how various sigmoid growth curve models can be fit to empirical data using Mplus (Muthén & Muthén, 1998–2007) and SAS PROC NLMIXED (Littell, Milliken, Stroup, Wolfinger, & Schabenberger, 2006).

Sigmoid Growth Functions

As noted earlier, non-linear growth models can take many different forms (polynomials, latent basis, exponential, etc.). Here we focus on three “S-shaped” patterns of change: Logistic, Gompertz, and Richards curves.

Logistic

The logistic function, as with all three curves covered here, is characterized by lower and upper asymptotes, and rates of change that are slowest near the asymptotes and fastest at an “inflection point” in the middle. The logistic growth model can be written as

$$\begin{aligned} Y[t]_n &= g_{0n} + g_{1n} \cdot A_1[t] + e[t]_n \\ A_1[t] &= \frac{1}{1 + e^{-(t-\lambda)\cdot\alpha}}, \end{aligned} \tag{2}$$

where α denotes a rate of change and λ denotes the time at which the rate of change reaches its maximum, the inflection point. When α is positive, growth proceeds from g0n, an individual-specific lower asymptote, to g0n + g1n, an individual-specific upper asymptote (and vice versa when α is negative). The defining feature of the logistic curve is that growth is distributed equally before and after the inflection point. That is, there is symmetry to the growth pattern such that exactly half of the total change has occurred before the inflection point. Figure 2A depicts logistic growth for three individuals with differences in their lower and long-term upper asymptotes (not always approached by t = 10). Despite these differences, however, all individuals are characterized by the same rate of change (α = .4) and by the same inflection point (λ = 5); half of their total growth is achieved prior to the inflection point and half after. To highlight the symmetrical feature of the logistic curve, note that the individual depicted in bold, with lower asymptote at 0 and upper asymptote at 100, achieves 50% of her total growth at t = 5.
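The symmetry property can be verified with a short Python sketch of the logistic basis function in Eq. 2. The function name `logistic_basis` is our own; the parameter values are those reported for the bold curve in Figure 2A.

```python
import math

def logistic_basis(t, alpha, lam):
    """Logistic basis coefficient A1[t] from Eq. 2."""
    return 1.0 / (1.0 + math.exp(-(t - lam) * alpha))

alpha, lam = 0.4, 5.0   # rate and inflection point used in Figure 2A
g0, g1 = 0.0, 100.0     # lower asymptote 0, upper asymptote g0 + g1 = 100

curve = [g0 + g1 * logistic_basis(t, alpha, lam) for t in range(1, 11)]
print([round(y, 1) for y in curve])

# Half of the total growth has occurred at the inflection point, t = 5.
print(curve[4])

# Symmetry: growth still to come at lam + d equals growth achieved at lam - d.
d = 3.0
print(math.isclose(logistic_basis(lam + d, alpha, lam),
                   1.0 - logistic_basis(lam - d, alpha, lam)))
```

Note that the curve has not reached its upper asymptote by t = 10, consistent with the remark that the asymptotes are not always approached within the observation window.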

Gompertz

Similar in form to the logistic model, the Gompertz function is also characterized by upper and lower asymptotes, and an inflection point. The Gompertz curve, however, is not symmetric with respect to its inflection point. Rather, growth proceeds in a manner such that roughly 37 percent (i.e., 1/e) of the total growth occurs prior to the inflection point with the remainder occurring after. The model can be written as

$$\begin{aligned} Y[t]_n &= g_{0n} + g_{1n} \cdot A_1[t] + e[t]_n \\ A_1[t] &= e^{-e^{-\alpha (t-\lambda)}}, \end{aligned} \tag{3}$$

where g0n is the lower asymptote, g0n + g1n equals the maximum asymptotic value of the function, λ represents the time at which maximum growth rate occurs, and α is the rate of change. Figure 2B depicts Gompertz growth for three individuals who differ in their lower and upper asymptotes, but who are characterized by the same rate of change (α = .4) and inflection point (λ = 5). The asymmetrical nature of Gompertz growth is highlighted in that the individual depicted in bold, with lower asymptote at 0 and upper asymptote at 100, achieves roughly 37% (i.e., 1/e) of her total growth before the λ = 5 point in time, and the remaining ~63% after. Substantively, it may be noted that the Gompertz growth curve, with its specific asymmetrical structure, is often used to describe the growth of populations in confined spaces with limited resources/nutrients (e.g., tumors, Laird, 1964). To the extent that the process one is interested in may follow the growth patterns found in “confined” biological or social systems (e.g., economic markets) the model may provide some of the sought after links between behavioral and natural systems.
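As a quick numerical check of the 1/e property, the following Python sketch evaluates the Gompertz basis function of Eq. 3 with the Figure 2B values. The function name `gompertz_basis` is hypothetical.

```python
import math

def gompertz_basis(t, alpha, lam):
    """Gompertz basis coefficient A1[t] from Eq. 3."""
    return math.exp(-math.exp(-alpha * (t - lam)))

alpha, lam = 0.4, 5.0   # rate and inflection point used in Figure 2B
g0, g1 = 0.0, 100.0     # lower asymptote 0, upper asymptote 100

curve = [g0 + g1 * gompertz_basis(t, alpha, lam) for t in range(1, 11)]
print([round(y, 1) for y in curve])

# Share of total growth achieved by the inflection point: exp(-1), about 37%.
share_at_inflection = gompertz_basis(lam, alpha, lam)
print(round(share_at_inflection, 4))
```

At t = λ the inner exponent is zero, so the basis coefficient reduces to e^(−1) regardless of the value of α.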

Richards

Both the logistic and the Gompertz curves have a priori defined symmetry or asymmetry around the inflection point. As a generalization of the logistic curve, the Richards Curve (Richards, 1959) allows for flexibility in the asymmetry by including an additional parameter, τ, that controls which asymptote the point of inflection is nearest. This model can be written as

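$$\begin{aligned} Y[t]_n &= g_{0n} + g_{1n} \cdot A_1[t] + e[t]_n \\ A_1[t] &= \frac{1}{\left(1 + \tau \cdot e^{-(t-\lambda)\cdot\alpha}\right)^{1/\tau}}, \end{aligned} \tag{4}$$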

where g0n is the lower asymptote, g1n controls the upper asymptote, α is the rate of change, λ is the time at which the rate of change is greatest, and τ controls whether this point of inflection is closer to the lower or upper asymptote. Together, these parameters allow for substantial flexibility in the shape of sigmoid curves. Figure 2C depicts Richards growth for three individuals who differ in their lower and upper asymptotes, but who are characterized by the same rate of change (α = .7), inflection point (λ = 5), and relative asymmetry (τ = 2). While it is necessary to spend additional time understanding how differences in the parameters relate to different patterns of change, some general observations regarding how τ affects the shape of the curve are useful. As τ increases, the amount of change that occurs before the inflection point, λ, increases. More specifically, when τ < 1, less than half the change occurs before λ (e.g., as in the Gompertz curve); when τ = 1, half the change occurs before λ and half after (i.e., as in the logistic curve); when τ > 1, more than half the change occurs before λ (e.g., as in Figure 2C). The latter asymmetrical possibility is highlighted in that the individual depicted in bold, with lower asymptote at 0 and upper asymptote at 100, achieves roughly 57% of her total growth before t = 5, and the remaining ~43% after.
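The role of τ can be explored with a small Python sketch of the Richards basis function in Eq. 4 (the function name `richards_basis` is our own). At t = λ the basis coefficient equals (1 + τ)^(−1/τ), which gives the share of total growth achieved by λ.

```python
import math

def richards_basis(t, alpha, lam, tau):
    """Richards basis coefficient A1[t] from Eq. 4 (requires tau > 0)."""
    return 1.0 / (1.0 + tau * math.exp(-(t - lam) * alpha)) ** (1.0 / tau)

alpha, lam = 0.7, 5.0  # rate and lambda used in Figure 2C

# Share of total growth achieved by t = lam for several values of tau
share_tau_2 = richards_basis(lam, alpha, lam, tau=2.0)        # Figure 2C case
share_tau_1 = richards_basis(lam, alpha, lam, tau=1.0)        # logistic case
share_tau_small = richards_basis(lam, alpha, lam, tau=0.001)  # near the Gompertz limit

print(round(share_tau_2, 3))      # about 0.577, i.e., roughly 57%
print(round(share_tau_1, 3))      # one half, matching the logistic curve
print(round(share_tau_small, 3))  # close to 1/e, matching the Gompertz curve
```

The last line reflects the fact that (1 + τx)^(1/τ) approaches e^x as τ approaches 0, so the Richards basis approaches the Gompertz basis of Eq. 3 in that limit.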

Example Data

To illustrate the use of these sigmoid growth functions we use data collected as part of a study examining the effects of preschool instruction on academic gains (Conner, Morrison, & Slominski, 2006). The data contain longitudinal test information on 383 children (195 females, 188 males) from an economically and ethnically diverse community, located on the urban fringe of a major Midwestern city in the United States. From the larger data set, we use the repeated assessments (10 occasions) of the Letter-Word Identification (LWID) sub-test from the Woodcock-Johnson-III Test of Achievement (designed to measure letter and word recognition; McGrew, Werder, & Woodcock, 1991) that were collected in the fall and spring of each school year from preschool through second grade. Throughout the remainder of the manuscript, these measures are referred to by the variable names lw3F – lw7S, denoting Letter-Word Identification W-scores starting with the fall score at age 3 (preschool) and ending with the spring score at age 7 (second grade). A longitudinal plot of the individual trajectories for these children is shown in Figure 3. In examining the individual change patterns in Figure 3, the non-linear change pattern is clear and seems to be characterized by lower and upper asymptotes, and rates of change that are slowest near the asymptotes and fastest somewhere in the middle. In the following sections we describe how the logistic, Gompertz, and Richards growth models outlined above can be fit using Mplus and SAS (PROC NLMIXED).

Figure 3. Longitudinal plot of individual trajectories on the Letter-Word Identification test of the Woodcock-Johnson Tests of Achievement

Non-linear Growth Models in Mplus

Mplus (Muthén & Muthén, 1996–2007) is a general latent variable program that can be used to conduct a variety of statistical analyses including structural equation modeling, multilevel modeling, mixture modeling, categorical data analysis, and combinations of such models. A typical Mplus script contains six sections of commands: Title, Data, Variable, Analysis, Model, and Output. In this paper we focus on the Model portion of the script where the model is specified. We refer readers to the user’s manual for information about the general layout and execution of the program (available from www.statmodel.com; Muthén & Muthén, 1996–2007). Appendix A also includes a full Mplus script for a logistic growth model. For ease of reading, scripts are presented in the Courier New font to distinguish the program specific commands from text, CAPITAL letters are used for Mplus commands, and lower case letters for manifest variables and latent variables that are specific to the data set and model.

In Mplus, growth models can be specified in several ways: as a structural equation model with time-structured data, as a multilevel model using the multilevel add-on, or using the TSCORES option when participants differ in the sampling of time. The TSCORES option is useful when participants vary in the sampling of time, as in an accelerated longitudinal study (see Bell, 1953; McArdle & Bell, 2000). With this option, the repeated observations are structured according to measurement occasion, and timing variables that describe when the specific measurements took place (e.g., individuals’ precise age at measurement) are also contained in the data. In this set-up, the timing structure for the growth model is based on the timing scores (i.e., TSCORES) as opposed to measurement occasion, similar to the setup used in multilevel software. Here, we use the structural equation modeling component to specify the growth curves as restricted common factor models (see Figure 1). Note that the data file for such a specification must be in wide format, with one record per person that contains a separate variable for each repeated measurement.

As a starting point for the presentation of non-linear models we specify a linear growth model. The MODEL statement for the example data can be written as:

     MODEL:
           g0 BY       lw3F-lw7S@1;
           g1 BY       lw3F@0      lw3S@1
                       lw4F@2      lw4S@3
                       lw5F@4      lw5S@5
                       lw6F@6      lw6S@7
                       lw7F@8      lw7S@9;

           g0 g1;      g0 WITH g1;       [g0* g1*];

           lw3F-lw7S (Ve);   [lw3F-lw7S@0];

The model is specified in three pieces. In the first piece, the elements of the basis vectors are specified. Specifically, two latent variables are specified, g0 and g1, that are indicated by the ten observed scores (repeated measurement of the LWID) using the BY command. The loadings for the intercept, g0, are all fixed at 1 using the @ symbol to denote a fixed parameter, while the loadings for the slope, g1, are fixed to follow a linear change pattern (i.e., @0, @1,…, @9). In the second piece, the variances, covariances, and means of the latent factors are specified. Variances are denoted by listing the names of the factors (i.e., g0 g1;), a covariance is denoted using the WITH command (i.e., g0 WITH g1;), and means are denoted by listing the names of the factors within square brackets with asterisks to override a default that latent variables have a fixed mean of zero (i.e., [g0* g1*];). Finally, in the third piece, the residual variances and intercepts of the observed variables are specified. Consistent with the homogeneity of residual variances assumption, the residual variance is constrained to be equal at all time points. This constraint is specified by listing the observed variable names with a common label, Ve, (i.e., lw3F-lw7S (Ve);). Additionally, the intercepts of the observed variables are fixed to 0 (i.e., [lw3F-lw7S@0];). This basic layout can then be expanded to accommodate non-linear growth functions.

The extension from the linear model given above to non-linear growth models, which are linear in their parameters, can be accomplished by introducing a series of constraints that “require” the slope loadings to follow a specific non-linear function rather than the linear pattern given above. To accomplish this, two additional portions of script are needed: phantom variables and model constraints.

Phantom variables

A phantom variable is a latent variable that is specified to be unrelated to every other variable in the model. Rather than being a formal part of the model, phantom variables can be used as “place holders” for mathematical necessities (see Rindskopf, 1983). That is, the parameters associated with the phantom variable (i.e., mean, variance) can be specified to create dependencies between parameters in the model (i.e., constraints). In many instances, the parameters of phantom variables in conjunction with the MODEL CONSTRAINT: command are used for boundary statements to create non-equality mathematical constraints between parameters (see Mplus user’s manual, p. 28). In the present context, phantom variables are used to create non-equality mathematical constraints between the slope loadings (elements of the basis coefficients) that “force” them to follow a specified non-linear function.

To illustrate, a phantom variable can be created using the BY command. However, the relationship (e.g., factor loading) between the phantom variable and the observed variable is specified to be fixed at 0. That is, to create the phantom variable, phantom (or any other name of choice), indicated by the first observed Letter-Word Identification variable, lw3F, we write

     phantom BY lw3F@0;

Once created, phantom, like all other variables, has a set of parameters associated with it: a mean, a variance, and covariances with other variables. These parameters can then be used to “house” mathematically necessary model parameters. For example, the phantom variable can be used as a place holder for the rate parameter in the logistic model, α, by attaching a label, ‘alpha’, to its mean.

     [phantom*] (alpha);

Although the mean, variances, and covariances of phantom variables are all available as place holders, we recommend using the phantom variable’s mean, because this parameter can take on positive or negative values (whereas variances must be positive). The variance and covariances of the phantom variable are then “removed” by setting them to 0,

     phantom@0;
     phantom WITH g0@0 g1@0;

These four lines of script create a phantom variable that is unrelated to all other variables in the model, has no variance, but has a mean labeled alpha. Additional phantom variables can be created and labeled as needed. For instance, for the logistic model, a second parameter, λ, can be introduced using the commands

     phantom2 BY lw3F@0;
     [phantom2*1] (lambda);
     phantom2@0; phantom2 WITH g0@0 g1@0 phantom@0;

In these commands, we specify a second phantom variable (phantom2), label its mean ‘lambda’, and fix its variance and covariances to zero.

Model constraints

The alpha and lambda parameters created through the phantom variable procedure given above can then be mathematically manipulated into their proper place in the model using constraints that are introduced with the MODEL CONSTRAINT: command. For the non-linear growth models presented here, this means mathematically constraining the slope loadings (i.e., elements of the basis vectors, e.g., Ak[t]) to follow a pre-specified non-linear function (e.g., logistic, Gompertz, Richards).

To do this, the factor loadings associated with g1 are revised to be estimated and are labeled L1 to L10 (first step of the script from above):

     g1 BY     lw3F* (L1)
               lw3S  (L2)
               lw4F  (L3)
               lw4S  (L4)
               lw5F  (L5)
               lw5S  (L6)
               lw6F  (L7)
               lw6S  (L8)
               lw7F  (L9)
               lw7S  (L10);

Note that an asterisk needs to be placed after the first variable to override the default that latent variables are indicated by the first variable with a fixed weight of 1. The MODEL CONSTRAINT: command is then used to specify the relationship between the slope loadings, now labeled L1 to L10, and the alpha and lambda parameters specified by the mathematical model. For example, to fit the logistic model of Eq. 2 to the example data, where there are ten equally spaced repeated observations, t = 1 to 10, the following constraints are necessary,

     MODEL CONSTRAINT:
     L1    = 1/(1 + EXP (-( 1-lambda)*alpha));
     L2    = 1/(1 + EXP (-( 2-lambda)*alpha));
     L3    = 1/(1 + EXP (-( 3-lambda)*alpha));
     L4    = 1/(1 + EXP (-( 4-lambda)*alpha));
     L5    = 1/(1 + EXP (-( 5-lambda)*alpha));
     L6    = 1/(1 + EXP (-( 6-lambda)*alpha));
     L7    = 1/(1 + EXP (-( 7-lambda)*alpha));
     L8    = 1/(1 + EXP (-( 8-lambda)*alpha));
     L9    = 1/(1 + EXP (-( 9-lambda)*alpha));
     L10   = 1/(1 + EXP (-(10-lambda)*alpha));

Using these constraints, each factor loading (element of the basis vector) is specified to have the value that would be obtained by substituting the appropriate value of t into the logistic equation defining the pattern of change.
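Because the constraint lines differ only in the occasion number, they can be generated rather than typed by hand, which may reduce transcription errors when the number of occasions is large. The Python helper below is a convenience sketch of our own (the function name `logistic_constraints` is hypothetical and is not part of Mplus); it simply writes out text following the pattern of the constraints shown above, using the labels L1 to L10, alpha, and lambda.

```python
def logistic_constraints(n_occasions):
    """Return Mplus MODEL CONSTRAINT lines tying loadings L1..Ln to the logistic function."""
    lines = ["MODEL CONSTRAINT:"]
    for t in range(1, n_occasions + 1):
        # Substitute the occasion number t into the logistic basis of Eq. 2.
        lines.append(f"L{t} = 1/(1 + EXP(-({t}-lambda)*alpha));")
    return lines

constraint_lines = logistic_constraints(10)
print("\n".join(constraint_lines))
```

The printed text can be pasted into the Mplus input file after the MODEL section; the same pattern could be adapted to other functional forms by changing the expression inside the loop.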

In conjunction, the phantom variable and model constraint capabilities can be used to specify many non-linear growth models (that are linear in their parameters). The Mplus script for the logistic growth model is contained in Appendix A. Additionally, Mplus and SAS scripts for all the models covered here are available online at http://psychology.ucdavis.edu/labs/Grimm/personal/downloads.html

Non-linear Growth Models in SAS

Non-linear growth models can be fit in SAS using the NLMIXED procedure. PROC NLMIXED is a very flexible program that can be used to fit a wide variety of statistical models (e.g., item response models, see Sheu, Chen, Su, & Wang, 2005; survival models, see Lambert, Collett, Kimber, & Johnson, 2004; and shared parameter models, see Guo & Carlin, 2004) including many non-linear growth models that are linear or non-linear in their parameters. Here, we present programs that use the hierarchical modeling framework (e.g., level 1 & level 2) and follow directly from the notation used above.

First, we note that the data structure for growth modeling in NLMIXED is different from the structure used for Mplus (Singer & Willett, 2003). Here, the data are in a long (i.e., relational, person-period) format with multiple records per person and variables for person identification, outcome measure, and time of assessment. For the example data, the identification variable is childid, the outcome measure is lw_w, and the time variable is time (ranging from 1 to 10).
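The wide-to-long restructuring can be sketched in a few lines of Python for readers who prepare their data outside SAS. This is an illustrative sketch only: the helper name `wide_to_long` and the scores for the single child are made up, while the variable names childid, lw_w, and time follow the example data.

```python
def wide_to_long(wide_rows, outcome_names, id_name="childid"):
    """Turn one record per person into one record per person per occasion."""
    long_rows = []
    for row in wide_rows:
        for time, name in enumerate(outcome_names, start=1):
            long_rows.append({id_name: row[id_name], "lw_w": row[name], "time": time})
    return long_rows

# Ten repeated measures named lw_w01 ... lw_w10 (naming is an assumption).
outcome_names = [f"lw_w{i:02d}" for i in range(1, 11)]

# Made-up scores for one hypothetical child, for illustration only.
wide_rows = [{"childid": 1, **{name: 300 + 10 * i for i, name in enumerate(outcome_names)}}]

long_rows = wide_to_long(wide_rows, outcome_names)
print(long_rows[0])
print(long_rows[-1])
```

Each person contributes ten records, one per measurement occasion, which is the person-period layout that NLMIXED expects.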

We begin with a script for a linear growth model for illustration. This script will then be expanded to articulate the non-linear growth models. As before, CAPITAL letters are used for SAS commands, and lower case letters for manifest variables and latent variables that are specific to the data set and model. An NLMIXED script for a linear growth model can be written as

PROC NLMIXED DATA = lw_long;
*specifying level-2 equations;
      g_0n = mu_0 + d_0n;
      g_1n = mu_1 + d_1n;
*specifying elements of the basis vector;
      A1_t = time - 1;
*specifying level-1 equation;
      traject = g_0n + g_1n * A1_t;
*specifying model (outcome, its mean trajectory and residual variance);
      MODEL lw_w ~ NORMAL(traject, v_e);
*specifying random effects;
      RANDOM d_0n d_1n ~ NORMAL([0,0], [v_0,
                                       c_01, v_1])
      SUBJECT = childid;
*specifying starting values;
      PARMS
          mu_0 = 300        mu_1 = 20
          v_0 = 600         v_1 = 12      c_01 = 0
          v_e = 175;
RUN;

The script begins by calling the NLMIXED procedure and the lw_long dataset. This is followed by the two level 2 equations of the growth model. The variable g_0n, the individual-level intercept, is set equal to the sample-level mean (mu_0) plus the individual deviation (d_0n) from the sample-level mean. The same type of level 2 equation is also written for the slope, g_1n. Next, the basis vector or slope loadings, A1_t (i.e., A1[t]), are defined. In the linear model they are set equal to time - 1. This specifies that the elements of the basis vector proceed linearly with respect to the measurement occasions (from 1 to 10) and centers the intercept at the first measurement occasion (see Ram & Grimm, 2007). This is followed by the level 1 equation without the residual term. We call the expected true score of the outcome variable traject and, following the form of Eq. 1, set it equal to the random intercept, g_0n, plus the random slope, g_1n, multiplied by the slope loadings, A1_t. The statements beginning with ‘*’ are comments that are helpful when programming complex models.

The next few lines of the script define the outcome variable, its distribution, and how the random effects should be included in the model. In the MODEL statement, the outcome variable (lw_w) is defined in terms of the level 1 equation specified in the prior line and a residual term. Here, lw_w is specified to have a normal distribution with a mean equal to the expected value from the level 1 equation, traject, and a level 1 residual variance equal to v_e. Next, the random effects, or the level 2 variances and covariances, are defined. The individual deviations, d_0n and d_1n, for the intercept and slope (from the level 2 equations) are specified to be multivariate normally distributed (~ NORMAL) with means equal to 0 ([0,0]) and a variance-covariance matrix filled with parameters v_0, c_01, and v_1, for the variance of d_0n, the covariance between d_0n and d_1n, and the variance of d_1n, respectively. Next, the identification variable (childid) is specified in the ‘SUBJECT=’ option to indicate that the random effects vary across persons. Finally, the PARMS statement is used to set starting values for the estimation of all unknown parameters. Starting values can be difficult to generate, but are important for obtaining convergence within a reasonable time. Techniques that are useful for obtaining good starting values include (a) prefitting simpler versions of the model using PROC NLIN, NLMIXED without random effects, or the additive model fit above, (b) prefitting models where the covariances among random effects are constrained to zero (by placing zeros in the covariance matrix on the RANDOM line), (c) reducing the number of quadrature points used in the estimation, and (d) using a first order method of approximating the integral of the likelihood over the random effects (METHOD=FIRO) rather than the default adaptive Gaussian quadrature (METHOD=GAUSS).
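To make the roles of these parameters concrete, the sketch below computes the mean and variance of the outcome implied by the linear model at each occasion. With basis value a = time - 1, the implied mean is mu_0 + mu_1·a and the implied variance is v_0 + 2·a·c_01 + a²·v_1 + v_e. This is an illustrative Python sketch, not part of the original scripts; the function name is hypothetical and the inputs are the rounded linear-model estimates from Table 1.

```python
def linear_implied_moments(mu_0, mu_1, v_0, v_1, c_01, v_e, occasions=range(1, 11)):
    """Model-implied mean and variance of the outcome at each occasion."""
    moments = []
    for time in occasions:
        a = time - 1  # basis vector element, centered at the first occasion
        mean = mu_0 + mu_1 * a
        variance = v_0 + 2 * a * c_01 + a ** 2 * v_1 + v_e
        moments.append((time, mean, variance))
    return moments

# Rounded linear-model (M1) estimates from Table 1, for illustration.
moments = linear_implied_moments(mu_0=308.6, mu_1=19.5, v_0=693.4,
                                 v_1=10.5, c_01=-37.5, v_e=212.2)
for time, mean, variance in moments:
    print(time, round(mean, 1), round(variance, 1))
```

Comparing such implied moments with the observed means and variances is one informal way to judge whether starting values are in a plausible range.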

Adjusting the script for non-linear growth models is straightforward. The equation for the basis vector is adjusted to match the desired model (e.g., logistic) and starting values for any additional parameters are added. For the logistic model, the specification of the basis vector can be programmed as,

     *specifying elements of the basis vector for a logistic model;
     A1_t = 1/(1 + EXP(-(time-lambda)*alpha));

Starting values for the additional, ‘fixed effect’ parameters are added in the PARMS statement,

     alpha = .5 lambda = 5

As noted earlier, this logistic model has only additive random effects, so it is linear in its parameters. One advantage of using NLMIXED is the opportunity to include parameters that enter non-linearly into the model (multiplicative random effects). For instance, in the logistic model, we can specify the lambda parameter as random (varying across individuals) as opposed to fixed (invariant across individuals). The elements of the basis vector become $A_1[t] = \frac{1}{1 + e^{-(t - \lambda_n)\cdot\alpha}}$, where $\lambda_n$ carries the subscript n to denote that its value varies across individuals. In this model, lambda is a random effect and therefore has a mean, a variance, and covariances with the other random-effect parameters. Inclusion of these additional random effects requires further adjustment of the NLMIXED script, specifically to the RANDOM line. Additionally, the script below is written with a single level 1 equation rather than separate level 1 and level 2 equations as in the script above. This change is presented to show a different way to program growth models in NLMIXED; separate level 1 and level 2 equations could be specified and the model would be identical. Also, standard deviations and correlations are estimated as opposed to variances and covariances for ease of interpretation.

PROC NLMIXED DATA = lw_long;
   traject = g_0n + g_1n * 1/(1 + EXP(-(time-lambda)*alpha));

   MODEL lw_w ~ NORMAL(traject, s_e*s_e);
   RANDOM g_0n g_1n lambda ~ NORMAL([mu_0, mu_1, mu_lambda],
    [s_0*s_0,
     s_0*r_01*s_1, s_1*s_1,
     s_0*r_0lambda*s_lambda, s_1*r_1lambda*s_lambda, s_lambda*s_lambda])
   SUBJECT = childid;
*starting values for mu_lambda and alpha follow those given above for the fixed-lambda logistic model;
   PARMS
          mu_0 = 300     mu_1 = 200      mu_lambda = 5   alpha = .5
          s_0 = 15       s_1 = 25        s_lambda = 1
          r_01 = 0       r_0lambda = 0   r_1lambda = 0
          s_e = 13;
RUN;

The major change from the previous script is that the lambda parameter is now included on the RANDOM line with an associated mean (mu_lambda), standard deviation (s_lambda), and correlations with the intercept (r_0lambda) and the slope (r_1lambda). There are currently only a few programs, in addition to NLMIXED, that can be used to fit non-linear growth models with multiplicative random effects (e.g., Mx, see Blozis, 2007; Splus using nlme, see Pinheiro & Bates, 2000; WinBUGS, see Spiegelhalter, Thomas, Best, & Lunn, 2007). The SAS script for the logistic growth models is contained in Appendix B.
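The standard deviation and correlation parameterization maps onto variances and covariances through cov(a, b) = s_a · r_ab · s_b, exactly as written in the matrix on the RANDOM line. The Python sketch below illustrates the conversion in both directions. The helper names are hypothetical; the first call uses the starting values from the script above, and the second uses the rounded linear-model estimates from Table 1.

```python
import math

def covariance_matrix(sds, corrs):
    """Build a covariance matrix from standard deviations and pairwise correlations.

    corrs maps index pairs (i, j) with i < j to the correlation between effects i and j.
    """
    k = len(sds)
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                cov[i][j] = sds[i] ** 2
            else:
                r = corrs[(min(i, j), max(i, j))]
                cov[i][j] = sds[i] * r * sds[j]
    return cov

def correlation(cov_ab, var_a, var_b):
    """Convert a covariance and two variances into a correlation."""
    return cov_ab / math.sqrt(var_a * var_b)

# Starting values from the script: s_0 = 15, s_1 = 25, s_lambda = 1, all correlations 0.
start_cov = covariance_matrix([15, 25, 1], {(0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0})

# Linear model (Table 1): variances 693.4 and 10.5, covariance -37.5.
rho_linear = correlation(-37.5, 693.4, 10.5)
print(start_cov)
print(round(rho_linear, 2))
```

Estimating standard deviations and correlations directly, as in the script, avoids this conversion step when reporting results.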

Results

Non-linear Growth Models

To illustrate how these models may be fit in practice, the series of non-linear, sigmoid growth models (as well as a linear growth model) were fit to the example longitudinal achievement data using Mplus and SAS. The parameter estimates and fit statistics from Mplus and SAS are contained in Tables 1 and 2, respectively. Parameter estimates contained within the text reflect Mplus estimates when possible. Predicted curves for each model are contained in Figures 4A–E. Model fit was evaluated using common global fit indices (e.g., CFI, TLI, & RMSEA) and model comparisons were made using likelihood based indices (e.g., AIC, BIC). RMSEA values less than .05 were considered good, less than .08 were adequate, and less than .10 were marginal. Similarly, CFI and TLI values greater than .90 were considered adequate and values greater than .95 were considered good.

Table 1.

Parameter Estimates and Fit Statistics for the Linear and Non-linear Growth Models of Letter-Word Identification from Mplus

                              M1: Linear         M2: Logistic       M3: Gompertz       M4: Richards

Means (μ)
  1 → g0                      308.6*             315.5*             323.0*             286.5*
  1 → g1                      19.5*              189.7*             224.2*             195.0*

Slope Loadings (A[t])
  g1 → LW3F                   =0                 .055*              .013*              .188*
  g1 → LW3S                   =1                 .091*              .039*              .238*
  g1 → LW4F                   =2                 .148*              .089*              .300*
  g1 → LW4S                   =3                 .231*              .165*              .379*
  g1 → LW5F                   =4                 .341*              .261*              .478*
  g1 → LW5S                   =5                 .472*              .368*              .599*
  g1 → LW6F                   =6                 .607*              .474*              .740*
  g1 → LW6S                   =7                 .727*              .574*              .872*
  g1 → LW7F                   =8                 .822*              .661*              .955*
  g1 → LW7S                   =9                 .888*              .734*              .987*

Additional Parameters
  α                           --                 .55*               .29*               1.38*
  λ                           --                 6.21*              6.00*              6.87*
  τ                           --                 --                 --                 5.92*

Variances & Covariances (σ² & σ01)
  g0 ↔ g0                     693.4*             597.4*             583.5*             680.6*
  g1 ↔ g1                     10.5*              719.9*             986.2*             787.5*
  g0 ↔ g1                     −37.5*             −194.0*            −195.8*            −322.6*
  e[t]n ↔ e[t]n               212.2*             171.5*             178.8*             165.4*

Fit Statistics
  χ²/df                       919/59             495/57             562/57             443/56
  RMSEA (90% C.I.)            .195 (.184–.206)   .142 (.130–.153)   .152 (.141–.164)   .134 (.123–.146)
  CFI                         .621               .807               .777               .829
  TLI                         .711               .848               .824               .863
  −2LL                        19497              19073              19140              19021
  BIC                         19533              19121              19188              19075
  AIC                         19509              19089              19156              19039

Note: (1) ‘--‘ indicates model did not contain parameter, (2) ‘=’ indicates the parameter was fixed at that value, (3) ‘*’ indicates a significant parameter at p< .05, (4) χ2 = maximum likelihood chi-square value, (5) df = degrees of freedom, (6) RMSEA = Root Mean Square Error of Approximation, (7) CFI = Comparative Fit Index, (8) TLI = Tucker-Lewis Index, (9) −2LL = −2 times the log likelihood value, (10) BIC = Bayesian Information Criteria, (11) AIC = Akaike Information Criteria.
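The AIC values in Table 1 can be reproduced from the −2LL values with AIC = −2LL + 2k, where k is the number of estimated parameters. The short Python sketch below is illustrative only; it takes the parameter counts (6, 8, 8, and 9) from Table 2 and the −2LL values from Table 1, and the function name is hypothetical.

```python
def aic(minus2ll, n_params):
    """Akaike Information Criterion from -2 log likelihood and parameter count."""
    return minus2ll + 2 * n_params

# (-2LL from Table 1, number of estimated parameters from Table 2).
models = {
    "M1 linear": (19497, 6),
    "M2 logistic": (19073, 8),
    "M3 Gompertz": (19140, 8),
    "M4 Richards": (19021, 9),
}

aic_values = {name: aic(m2ll, k) for name, (m2ll, k) in models.items()}
best_model = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(name, value)
print("lowest AIC:", best_model)
```

Because the tabled −2LL values are rounded to whole numbers, AIC values recomputed this way may differ from reported values by a unit.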

Table 2.

Parameter Estimates and Fit Statistics for the Linear and Non-linear Growth Models of Letter-Word Identification from PROC NLMIXED in SAS

                              M1: Linear    M2: Logistic  M3: Gompertz  M4: Richards  M5: Richards

Fixed Effects
  g0 (μg0)                    308.6*        315.5*        323.0*        286.5*        292.1*
  g1 (μg1)                    19.5*         189.7*        224.2*        195.0*        192.4*
  α (μα)                      --            .546*         .294*         1.38*         1.46*
  λ (μλ)                      --            6.20*         6.00*         6.87*         6.95*
  τ (μτ)                      --            --            --            5.92*         5.59*

Random Effects
  g0 (σ²g0)                   693.27*       597.31*       583.71*       680.69*       395.21*
  g1 (σ²g1)                   10.50*        719.85*       985.96*       787.36*       506.25*
  λ (σ²λ)                     --            --            --            --            .88*
  g0/g1 correlation (ρg0,g1)  −.44*         −.30*         −.26          −.44*         −.74*
  g0/λ correlation (ρg0,λ)    --            --            --            --            .22
  g1/λ correlation (ρg1,λ)    --            --            --            --            −.44*
  e (σ²e)                     212.28*       171.35*       178.76*       165.38*       143.52*

Fit Statistics
  −2LL                        19497         19073         19140         19021         18820
  Parameters                  6             8             8             9             12
  BIC                         19533         19121         19200         19075         18891
  AIC                         19509         19090         19168         19040         18844

Note: (1) ‘--‘ indicates model did not contain parameter, (2) ‘=’ indicates the parameter was fixed at specified value, (3) ‘*’ indicates a significant parameter at p< .05, (4) −2LL = −2 times the log likelihood value, (5) BIC = Bayesian Information Criteria, (6) AIC = Akaike Information Criteria, (7) Model 5 contained λ as a random parameter, (8) Variances are the squares of the estimated standard deviations to assist in the comparison of parameter estimates.

Figure 4.

Mean and individual predicted growth trajectories based on the (A) linear, (B) logistic, (C) Gompertz, (D) Richards and (E) Richards model with variation in λ.

Linear (M1)

As a baseline, a linear growth model was fit to the data. The parameters of the linear model were $\mu_{g0} = 308.6$ and $\mu_{g1} = 19.5$, indicating that, on average, children have a score of 308.6 in the fall of preschool and grow 19.5 units every half year. Furthermore, the random-effect parameters suggest significant interindividual differences in intercept, $\sigma^2_{g0} = 693.4$, and slope, $\sigma^2_{g1} = 10.5$, and that children who had lower intercepts tended to have greater rates of change from age 3 through age 7, $\sigma_{g0,g1} = -37.5$ ($\rho_{g0,g1} = -.44$). The predicted prototypical trajectory and expected individual trajectories for the linear curve are shown in Figure 4A. Overall, the linear growth model was a relatively inadequate representation of the changes in letter and word recognition for this age range based on the global fit indices ($\chi^2 = 919$, df = 59, RMSEA = .195 (.184–.206), CFI = .621, TLI = .711, −2LL = 19497, BIC = 19533, AIC = 19509). Thus, the parameter estimates, and the description of the data they provide, must be interpreted with caution.

Logistic (M2)

The first non-linear model was the logistic, a curve distinguished by its symmetry. The parameters of the logistic model were $\mu_{g0} = 315.5$, $\mu_{g1} = 189.7$, $\alpha = .55$, and $\lambda = 6.21$. Therefore, on average, the lower asymptote was 315.5; children grew 189.7 units to an upper asymptote of 505.2 (315.5 + 189.7); children reached half of their total change towards the end of kindergarten ($\lambda = 6.2$); and the growth rate was .55 ($\alpha = .55$). Furthermore, children varied in their lower asymptotic level ($\sigma^2_{g0} = 597.4$) and in their predicted amount of change ($\sigma^2_{g1} = 719.9$), and children who had a greater level of early reading achievement (lower asymptote) tended to show less total growth ($\sigma_{g0,g1} = -194.0$; $\rho_{g0,g1} = -.30$). The mean predicted trajectory and individual trajectories for the logistic curve are contained in Figure 4B. The fit of the logistic model ($\chi^2 = 495$, df = 57, RMSEA = .142 (.130–.153), CFI = .807, TLI = .848, −2LL = 19073, BIC = 19121, AIC = 19089) was an improvement over the linear model based on the global fit indices and likelihood statistics; however, the logistic model still showed relatively poor fit based on the global fit indices.
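The interpretation of the logistic parameters can be traced numerically. In the Python sketch below (illustrative only, with hypothetical names and the rounded estimates reported above), the upper asymptote is the sum of the two means, the predicted mean at t = λ lies halfway between the asymptotes, and the steepest mean change per occasion equals μ1·α/4, which follows from differentiating the logistic function at its inflection point.

```python
import math

def logistic_mean(t, mu_0, mu_1, alpha, lam):
    """Mean logistic trajectory: lower asymptote plus total change times the basis."""
    return mu_0 + mu_1 / (1.0 + math.exp(-(t - lam) * alpha))

# Rounded logistic (M2) estimates reported in the text.
mu_0, mu_1, alpha, lam = 315.5, 189.7, 0.55, 6.21

upper_asymptote = mu_0 + mu_1                                 # about 505.2
midpoint_level = logistic_mean(lam, mu_0, mu_1, alpha, lam)   # halfway between asymptotes
max_rate = mu_1 * alpha / 4                                   # steepest change per half year

print(round(upper_asymptote, 1), round(midpoint_level, 2), round(max_rate, 2))
```

Under these estimates the mean curve rises fastest, by roughly 26 units per half year, at the occasion corresponding to λ.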

Gompertz (M3)

The next model fit to the data was the Gompertz, the defining feature of which is its a priori defined asymmetrical growth pattern. The parameters of the Gompertz model were $\mu_{g0} = 323.0$, $\mu_{g1} = 224.2$, $\alpha = .29$, and $\lambda = 6.00$, indicating that, on average, children have a lower asymptotic value of 323.0 and grew about 224.2 units towards an asymptote at 547.2 (323.0 + 224.2). In the spring of kindergarten ($\lambda = 6.0$) students were changing more rapidly than at any other time, and the growth rate was $\alpha = .29$. As with the logistic model, children varied in their lower asymptotic level ($\sigma^2_{g0} = 583.5$) and in their predicted amount of change ($\sigma^2_{g1} = 986.3$), and children who had a greater level of early reading achievement (lower asymptote) tended to show less growth ($\sigma_{g0,g1} = -195.8$; $\rho_{g0,g1} = -.26$). The mean predicted trajectory and individual trajectories for the Gompertz curve are contained in Figure 4C. As with the logistic model, small changes occurred as the children progressed through the first year of preschool, larger changes occurred in the second year of preschool, kindergarten, and into first grade, and smaller changes were shown in second grade. The Gompertz model ($\chi^2 = 562$, df = 57, RMSEA = .152 (.141–.164), CFI = .777, TLI = .824, −2LL = 19140, BIC = 19188, AIC = 19156) fit better than the linear model, but not as well as the logistic model. This could be taken as an indication that the data are not characterized by asymmetrical growth of the Gompertz type.
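The fixed asymmetry of the Gompertz curve can be seen by evaluating its basis function. The sketch below assumes the common form A[t] = exp(−exp(−(t − λ)·α)); this form is an assumption in this section, although with the rounded estimates above it approximately reproduces the Gompertz slope loadings in Table 1. At t = λ the basis equals 1/e (about .368), so the point of fastest change is reached after only about 37% of the total growth, rather than 50% as in the logistic model.

```python
import math

def gompertz_basis(t, alpha, lam):
    """Assumed Gompertz slope loading: exp(-exp(-(t - lambda) * alpha))."""
    return math.exp(-math.exp(-(t - lam) * alpha))

# Rounded Gompertz (M3) estimates reported in the text.
alpha, lam = 0.29, 6.00
gompertz_loadings = [gompertz_basis(t, alpha, lam) for t in range(1, 11)]

# Share of total growth completed at the inflection point (t = lambda).
share_at_inflection = gompertz_basis(lam, alpha, lam)

print([round(a, 3) for a in gompertz_loadings])
print(round(share_at_inflection, 3))
```

Because the parameter values are rounded, agreement with the tabled loadings should be expected only to about two decimal places.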

Richards Curve (M4)

Moving towards a model where asymmetry of the growth pattern is estimated from the data, we next fit the Richards curve. The parameters obtained were $\mu_{g0} = 286.5$, $\mu_{g1} = 194.9$, $\alpha = 1.38$, $\lambda = 6.87$, and $\tau = 5.92$. Therefore, on average, the lower asymptote was 286.5; children grew 194.9 units to the upper asymptote of 481.4 (286.5 + 194.9); children were changing most rapidly towards the beginning of first grade ($\lambda = 6.9$); the growth rate was 1.38 ($\alpha = 1.38$); and the growth was asymmetric, such that the majority of change occurred before the inflection point (i.e., $\tau = 5.92$). Furthermore, children varied in their lower asymptotes ($\sigma^2_{g0} = 680.6$) and in their predicted amount of change ($\sigma^2_{g1} = 787.5$), and children who had a greater level of early reading achievement (lower asymptote) tended to show less growth ($\sigma_{g0,g1} = -322.5$; $\rho_{g0,g1} = -.44$). The mean predicted and individual trajectories for the Richards curve are contained in Figure 4D. The fit of the Richards curve ($\chi^2 = 443$, df = 56, RMSEA = .134 (.123–.146), CFI = .829, TLI = .863, −2LL = 19021, BIC = 19075, AIC = 19039) was an improvement over the linear, logistic, and Gompertz models based on the likelihood statistics and fit indices. Therefore, of the models fit, the Richards curve was seen as the best representation of the changes in letter and word recognition during this age period for these data. As with the previous models, the fit of the Richards curve remained relatively poor based on the global fit indices, which suggests that the changes in letter and word recognition are more complicated than these additive non-linear models were able to capture. Multiplicative non-linear models may therefore be able to account for additional heterogeneity in growth.

Richards Curve with Multiplicative Random-Effects (M5)

A limitation in the above models is that the parameters of the non-linear function (α, λ, and τ) are assumed to be invariant across persons (i.e., fixed). In NLMIXED, it is also possible to add further complexity into the model by allowing for interindividual differences in αn, λn, and/or τn. For example, when allowing for interindividual differences in λn the Richards curve model becomes

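$$Y[t]_n = g_{0n} + g_{1n} \cdot A_1[t] + e[t]_n$$

$$A_1[t] = \frac{1}{\left(1 + \tau \cdot e^{-(t - \lambda_n)\cdot\alpha}\right)^{1/\tau}} \qquad (4a)$$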

The expanded model accommodates another type of interindividual differences, but the added nonlinearity in parameters makes estimation computationally difficult.
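The effect of letting λ vary across persons can be illustrated by evaluating the Richards basis for different values of λ. The Python sketch below is illustrative only: the function name is hypothetical, the first set of values uses the rounded fixed-λ Richards estimates from Table 1, and the second uses the rounded estimates for the random-λ model from Table 2, with hypothetical children one standard deviation below and above the mean of λ.

```python
import math

def richards_basis(t, alpha, lam, tau):
    """Richards slope loading: 1 / (1 + tau * exp(-(t - lam) * alpha)) ** (1 / tau)."""
    return 1.0 / (1.0 + tau * math.exp(-(t - lam) * alpha)) ** (1.0 / tau)

occasions = range(1, 11)

# Fixed-lambda Richards model (M4), rounded estimates from Table 1.
m4_loadings = [richards_basis(t, 1.38, 6.87, 5.92) for t in occasions]

# Random-lambda Richards model (M5), rounded estimates from Table 2.
alpha5, mu_lambda5, tau5 = 1.46, 6.95, 5.59
sd_lambda5 = math.sqrt(0.88)  # standard deviation of lambda across children

curves = {
    "early": [richards_basis(t, alpha5, mu_lambda5 - sd_lambda5, tau5) for t in occasions],
    "average": [richards_basis(t, alpha5, mu_lambda5, tau5) for t in occasions],
    "late": [richards_basis(t, alpha5, mu_lambda5 + sd_lambda5, tau5) for t in occasions],
}

print([round(a, 3) for a in m4_loadings])
for label, values in curves.items():
    print(label, [round(a, 2) for a in values])
```

Setting tau to 1 in this function gives the logistic basis, and shifting λ moves the whole curve earlier or later in time without changing its asymptotes, which is the kind of interindividual difference in shape that the expanded model adds.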

This model was fit in NLMIXED with a reduced number (i.e., 5) of quadrature points. The parameters obtained were $\mu_{g0} = 292.1$, $\mu_{g1} = 192.4$, $\alpha = 1.46$, $\mu_{\lambda} = 6.95$, and $\tau = 5.59$ (see Table 2). Similar to the prior model, the lower asymptote was 292.1; children grew 192.4 units to the upper asymptote of 484.5 (292.1 + 192.4); the growth rate was 1.46 ($\alpha = 1.46$); the growth was asymmetric, such that the majority of change occurred before the inflection point (i.e., $\tau = 5.59$); and children were, on average, changing most rapidly towards the beginning of first grade ($\mu_{\lambda} = 6.95$). Now, though, the model includes interindividual differences in children’s lower asymptotes ($\sigma^2_{g0} = 395.2$), their predicted amount of change ($\sigma^2_{g1} = 506.3$), and their inflection point ($\sigma^2_{\lambda} = .88$). These three random effects covary such that children who had a greater level of early reading achievement (lower asymptote) tended to show less growth ($\rho_{g0,g1} = -.74$). The between-person differences in growth, in turn, were negatively associated with the variation in $\lambda_n$, the inflection point ($\rho_{g1,\lambda} = -.44$), suggesting that children who began to change earlier tended to change more. The mean predicted trajectory and individual trajectories for the expanded Richards curve are contained in Figure 4E. While the mean trajectory did not visually change compared with the previous Richards growth curve, the additional interindividual differences in shape can be seen in the individual curves.

The fit of the expanded Richards curve with variation in λ was obtained from NLMIXED (−2LL = 18820, BIC = 18891, AIC = 18844) and was an improvement over all of the prior models. Using the likelihood of the data obtained from Mplus, we are able to calculate the χ2 statistic and RMSEA (Steiger & Lind, 1980) for this model using FITMOD (see footnote 2). The RMSEA for this multiplicative non-linear Richards curve model was .096 (.082–.111), a marginal fit, pointing to the importance of the variation in λn as a way to model these data. Further improvements in fit might be obtained by allowing for interindividual differences in αn and/or τn. However, given the amount of time spent obtaining model convergence (via good starting values) and the difficulties faced in the interpretation of multiplicative interindividual differences, we suggest careful consideration of practical and substantive issues before incorporating further complexity into these models.

Discussion

Developmentalists are interested in describing how individuals change (e.g., grow and/or decline) over time. As these descriptions increase in complexity, models of non-linear change will be called upon to provide more accurate, complete, and easily interpretable descriptions of how individuals change over time and interindividual differences in such change. In this paper we outlined how a selection of non-linear sigmoid curves may be fit within a growth curve modeling framework to multi-person longitudinal data using Mplus and SAS.

For the models fit in both programs, differences between the fixed-effect parameter estimates obtained from Mplus and SAS were generally small; differences in the random-effect parameter estimates, on the other hand, were noticeable for all non-linear growth models fit in this project. Additional research is therefore necessary to assess when, and for what types of data, the random-effect parameters in non-linear growth models fit using these programs are accurate. It should be noted, however, that the substantive conclusions of the analyses conducted here were identical irrespective of the program used.

At a more general level, we highlight some of the issues to be considered in choosing between Mplus and SAS for the fitting of non-linear growth curves (see also Chou, Bentler, & Pentz, 1998; Curran, 2003; Ferrer, Hamagami, & McArdle, 2004; Ghisletta & Lindenberger, 2004; MacCallum et al., 1997; Willett, 2004). Advantages for using Mplus are the advantages of using SEM. Notably, the non-linear models can be combined with confirmatory factor models to accommodate measurement error (see Blozis, 2004; Hancock, Kuo, & Lawrence, 2001; McArdle, 1988); incomplete data can be handled on the outcome as well as in the predictors of change; change in the measurement instruments across time can be modeled (e.g., McArdle, Grimm, Hamagami, Bowles, & Meredith, 2008; McArdle & Hamagami, 2004); global fit statistics (e.g., CFI, RMSEA) are available to examine model fit and misfit; multiple group, and growth mixture modeling can be combined with non-linear growth models to evaluate group (known or unknown) differences in longitudinal trajectories. Furthermore, estimation was quick and, in our case, was not very dependent on user-provided starting values; however the appropriate sign (positive or negative) of the non-linear parameters was helpful. Additionally, extensions to multivariate growth models for examining correlated change are straightforward.

Major limitations of Mplus are among the advantages of SAS PROC NLMIXED, including the possibility to fit non-linear models with multiplicative random effects, which may substantially improve the fit compared to the conditionally linear counterparts (as was seen with these data). Additionally, NLMIXED allows for flexibility in the timing basis. For example, age at assessment could be used as opposed to measurement occasion, and each individual could have a unique (distinct) age at each assessment (see footnote 3). In turn, the limitations of NLMIXED are the advantages of Mplus (global fit statistics, great flexibility in model specification, etc.).

In conclusion, the non-linear curves discussed here represent only a sample of non-linear models that are appropriate for longitudinal research. Other models of interest include the logarithmic, exponential, dual exponential (bi-exponential), and Michaelis-Menten curves, among many others. Ratkowsky’s (1989) discussion of non-linear regression models contains equations and descriptions for a variety of non-linear regression models that can be adapted for growth curve analysis (see also Pinheiro & Bates, 2000 for applications in S and Splus). The flexibility of the growth modeling framework allows for a wide variety of non-linear patterns of growth for examining within-person change. By outlining and illustrating the ease with which such models can be implemented with currently available software, we hope to have illustrated how useful the framework may be for describing the complexities of within-person change and between-person differences in change.

Acknowledgments

We would like to thank Frederick Morrison for providing the illustrative data analyzed for this paper. Data collection was supported by National Institute of Child Health and Human Development Grant R01 HD27176, National Science Foundation Grant 0111754, and U.S. Department of Education, Institute for Education Sciences Grant R305H04013. We would also like to thank Jack McArdle, John Nesselroade, Fumiaki Hamagami and our colleagues at the Center for the Advanced Study of Teaching and Learning, the Jefferson Psychometric Laboratory and the Center for Development and Health Research Methodology at the University of Virginia for their helpful comments on this work and NIA T32 AG20500 for funding our training and fostering our initial collaborations. Kevin Grimm was also supported, in part, by the Institute of Education Sciences, U.S. Department of Education, through Grant R305B040049 to the University of Virginia when he was at that institution. Nilam Ram thanks the Max Planck Institute for Human Development for its support. The opinions expressed are those of the authors and do not represent views of the U.S. Department of Education.

Appendix A

Mplus script for a logistic growth model

TITLE: Logistic Growth Model;
!Note: Comments begin with ‘!’
DATA: FILE = wj_morrison.dat;
VARIABLE:
  NAMES =   childid lw3F lw3S lw4F lw4S lw5F lw5S lw6F lw6S lw7F lw7S;
  USEVARIABLES =
            lw3F - lw7S;
  MISSING = .;
ANALYSIS:
  TYPE = MEANSTRUCTURE MISSING H1; ITERATIONS = 100000;  COVERAGE = 0;
MODEL:
g0      BY lw3F - lw7S@1;
g1      BY lw3F*1 (L1)
           lw3S   (L2)
           lw4F   (L3)
           lw4S   (L4)
           lw5F   (L5)
           lw5S   (L6)
           lw6F   (L7)
           lw6S   (L8)
           lw7F   (L9)
           lw7S   (L10);
!Variances & Covariance
        lw3F - lw7S (Ve);
        g0*132 g1*20;   g0 WITH g1;
!Means
        [lw3F - lw7S @0];
        [g0*300 g1*153];
!Parameters of Logistic Model as latent variables
!Labels follow the text: lambda = inflection point, alpha = rate
        phantom1 BY lw3F@0;         phantom1@0;     [phantom1*3] (lambda);
        phantom1 WITH g0@0 g1@0;

        phantom2 BY lw3F@0;         phantom2@0;     [phantom2*1] (alpha);
        phantom2 WITH g0@0 g1@0 phantom1@0;

MODEL CONSTRAINT:
      L1    = 1/(1 + EXP (-( 1-lambda)*alpha));
      L2    = 1/(1 + EXP (-( 2-lambda)*alpha));
      L3    = 1/(1 + EXP (-( 3-lambda)*alpha));
      L4    = 1/(1 + EXP (-( 4-lambda)*alpha));
      L5    = 1/(1 + EXP (-( 5-lambda)*alpha));
      L6    = 1/(1 + EXP (-( 6-lambda)*alpha));
      L7    = 1/(1 + EXP (-( 7-lambda)*alpha));
      L8    = 1/(1 + EXP (-( 8-lambda)*alpha));
      L9    = 1/(1 + EXP (-( 9-lambda)*alpha));
      L10   = 1/(1 + EXP (-(10-lambda)*alpha));

OUTPUT:     SAMPSTAT STANDARDIZED;

Appendix B

SAS Script for data restructuring and logistic growth model

*Reading data into SAS;
DATA morrison;
      INFILE 'D:\Nonlinear growth\Morrison\wj_morrison.dat' LINESIZE = 5000;
      INPUT childid age lw_w01 - lw_w10;
RUN;

*Restructuring Data from Wide to Long Format;
DATA morrison_long;
      SET morrison;
      lw_w = lw_w01;    time = 1;   OUTPUT;
      lw_w = lw_w02;    time = 2;   OUTPUT;
      lw_w = lw_w03;    time = 3;   OUTPUT;
      lw_w = lw_w04;    time = 4;   OUTPUT;
      lw_w = lw_w05;    time = 5;   OUTPUT;
      lw_w = lw_w06;    time = 6;   OUTPUT;
      lw_w = lw_w07;    time = 7;   OUTPUT;
      lw_w = lw_w08;    time = 8;   OUTPUT;
      lw_w = lw_w09;    time = 9;   OUTPUT;
      lw_w = lw_w10;    time = 10;  OUTPUT;
      KEEP childid lw_w time;
RUN;

*Logistic Growth Model;
PROC NLMIXED DATA = morrison_long;
      g_0n = m_0 + d_0n;
      g_1n = m_1 + d_1n;

      A_t = 1/(1 + EXP(-(time - lambda)*alpha));

      traject = g_0n + g_1n * A_t;

      MODEL lw_w ~ NORMAL(traject, v_e);
      RANDOM d_0n d_1n ~ NORMAL([0,0], [v_0,
                                        c_01, v_1])
      SUBJECT = childid;
      PARMS
            m_0 = 300 m_1 = 200 v_0 = 620 v_1 = 900 c_01 = 0
            v_e = 175
            lambda = 5        alpha = .5;
RUN;

Footnotes

1
In version 4.0 and more recent versions of Mplus the phantom variable could be specified without a manifest variable, such as
     phantom BY ;
In version 5.0, the parameters of the non-linear equations can simply be added in the MODEL CONSTRAINT command. Therefore creating phantom variables is unnecessary. This addition of parameters can be programmed as
     MODEL CONSTRAINT:
     NEW(alpha*.5 lambda*5);
2

Available on request from Michael Browne, Dept. of Psychology, Ohio State University, Columbus, OH 43210-1222.

3

Growth models with individually-varying measurement occasions can be fit in Mplus using the multilevel and/or TSCORES options. However, these options can only be used to fit polynomial (e.g., linear, quadratic, cubic) growth models.

Contributor Information

Kevin J. Grimm, University of California, Davis Department of Psychology, One Shields Avenue, Davis, CA 95616, 530-752-1880, kjgrimm@ucdavis.edu

Nilam Ram, The Pennsylvania State University, Department of Human Development and Family Studies, College of Health and Human Development, 110 Henderson Building South, University Park, PA 16802-6504. 814-865-7038, nilam.ram@psu.edu

References

  1. Arbuckle JL, Wothke W. AMOS 4.0 user’s guide. Chicago, IL: SPSS; 1999. [Google Scholar]
  2. Bell RQ. Convergence: an accelerated longitudinal approach. Child Development. 1953;24:145–152. [PubMed] [Google Scholar]
  3. Bentler PM. EQS program manual. Multivariate Software, Inc.; 1995. [Google Scholar]
  4. Blozis SA. Structured latent curve models for the study of change in multivariate repeated measures. Psychological Methods. 2004;9:334–353. doi: 10.1037/1082-989X.9.3.334. [DOI] [PubMed] [Google Scholar]
  5. Blozis SA. On fitting non-linear latent curve models to multiple variables measured longitudinally. Structural Equation Modeling. 2007;14:179–201. [Google Scholar]
  6. Browne MW. Structured latent curve models. In: Cuadras CM, Rao CR, editors. Multivariate analysis: Future directions. Vol. 2. Amsterdam: North-Holland; 1993. pp. 171–198. [Google Scholar]
  7. Browne MW, du Toit SHC. Models for learning data. In: Collins L, Horn JL, editors. Best methods for the analysis of change. Washington, DC: APA; 1991. pp. 47–68. [Google Scholar]
  8. Blozis SA, Cudeck R. Conditionally linear mixed-effects models with latent variable covariates. Journal of Educational & Behavioral Statistics. 24:245–270. [Google Scholar]
  9. Bryk AS, Raudenbush SW. Application of hierarchical linear models to assessing change. Psychological Bulletin. 1987;101:1, 147–158. [Google Scholar]
  10. Bryk AS, Raudenbush SW. Hierarchical linear models: Applications and data analysis methods. Newbury Park, CA: Sage Publications; 1992. [Google Scholar]
  11. Chou C-P, Benterl PM, Pentz MA. Comparisons of two statistical approaches to study growth curves: The multilevel model and latent curve analysis. Structural Equation Modeling. 1998;5:247–266. [Google Scholar]
  12. Conner CM, Morrison FJ, Slominski L. Preschool instruction and children’s emergent literacy growth. Journal of Educational Psychology. 2006;98:665–689. [Google Scholar]
  13. Cudeck R. Mixed-effects models in the study of individual differences with repeated measures data. Multivariate Behavioral Research. 1996;31:371–403. doi: 10.1207/s15327906mbr3103_6.
  14. Cudeck R, du Toit SHC. A version of quadratic regression with interpretable parameters. Multivariate Behavioral Research. 2002;37:501–519. doi: 10.1207/S15327906MBR3704_04.
  15. Curran PJ. Have multilevel models been structural equation models all along? Multivariate Behavioral Research. 2003;38:529–569. doi: 10.1207/s15327906mbr3804_5.
  16. Easton DE. Gompertzian growth and decay: A powerful descriptive tool for neuroscience. Physiology & Behavior. 2005;86:407–414. doi: 10.1016/j.physbeh.2005.08.016.
  17. Ferrer E, Hamagami F, McArdle JJ. Modeling latent growth curves with incomplete data using different types of structural equation modeling and multilevel software. Structural Equation Modeling. 2004;11:452–483.
  18. Ghisletta P, Lindenberger U. Static and dynamic longitudinal structural analyses of cognitive changes in old age. Gerontology. 2004;50:12–16. doi: 10.1159/000074383.
  19. Grimm KJ, McArdle JJ. A note on the computer generation of structural expectations. In: Dansereau F, Yammarino F, editors. Multi-level issues in strategy and research methods. Volume 4 of Research in multi-level issues. Amsterdam: JAI Press/Elsevier; 2005. pp. 335–372.
  20. Guo X, Carlin BP. Separate and joint modeling of longitudinal and event time data using standard computer packages. The American Statistician. 2004;58:16–24.
  21. Hancock GR, Kuo W, Lawrence FR. An illustration of second-order latent growth models. Structural Equation Modeling. 2001;8:470–489.
  22. Jöreskog KG, Sörbom D. LISREL 8: User's Reference Guide. Lincolnwood, IL: Scientific Software International; 1996.
  23. Laird AK. Dynamics of tumor growth. British Journal of Cancer. 1964;18:490–502. doi: 10.1038/bjc.1964.55.
  24. Lambert P, Collett D, Kimber A, Johnson R. Parametric accelerated failure time models with random effects and an application to kidney transplant survival. Statistics in Medicine. 2004;23:3177–3192. doi: 10.1002/sim.1876.
  25. Littell RC, Milliken GA, Stroup WW, Wolfinger RD, Schabenberger O. SAS for mixed models. Cary, NC: SAS Institute; 2006.
  26. MacCallum RC, Kim C, Malarkey WB, Kiecolt-Glaser JK. Studying multivariate change using multilevel models and latent curve models. Multivariate Behavioral Research. 1997;32:215–253. doi: 10.1207/s15327906mbr3203_1.
  27. McArdle JJ. Latent variable growth within behavior genetic models. Behavior Genetics. 1986;16:163–200. doi: 10.1007/BF01065485.
  28. McArdle JJ. Dynamic but structural equation modeling of repeated measures data. In: Nesselroade JR, Cattell RB, editors. Handbook of multivariate experimental psychology. Vol. 2. New York: Plenum; 1988. pp. 561–614.
  29. McArdle JJ, Bell RQ. An introduction to latent growth models for developmental data analysis. In: Little TD, Schnabel KU, Baumert J, editors. Modeling longitudinal and multilevel data: Practical issues, applied approaches, and specific examples. Mahwah, NJ: Erlbaum; 2000. pp. 69–107.
  30. McArdle JJ, Epstein D. Latent growth curves within developmental structural equation models. Child Development. 1987;58:110–133.
  31. McArdle JJ, Ferrer-Caja E, Hamagami F, Woodcock RW. Comparative longitudinal structural analyses of the growth and decline of multiple intellectual abilities over the life span. Developmental Psychology. 2002;38:115–142.
  32. McArdle JJ, Grimm KJ, Hamagami F, Bowles RP, Meredith W. Modeling lifespan growth curves of cognition using longitudinal data with changing measures. Manuscript submitted for publication. 2008. doi: 10.1037/a0015857.
  33. McArdle JJ, Hamagami F. Methods for dynamic change hypotheses. In: van Montfort K, Oud J, Satorra A, editors. Recent developments on structural equation models: Theory and applications. Amsterdam: Kluwer Academic Publishers; 2004. pp. 295–336.
  34. McArdle JJ, Nesselroade JR. Growth curve analysis in contemporary psychological research. In: Schinka J, Velicer W, editors. Comprehensive handbook of psychology, Volume two: Research methods in psychology. New York: Wiley; 2003. pp. 447–480.
  35. McGrew KS, Werder JK, Woodcock RW. Woodcock-Johnson technical manual. Allen, TX: DLM; 1991.
  36. Meredith W, Tisak J. Latent curve analysis. Psychometrika. 1990;55:107–122.
  37. Muthén BO, Curran PJ. General longitudinal modeling of individual differences in experimental designs: A latent variable framework for analysis and power estimation. Psychological Methods. 1997;2:371–402.
  38. Muthén LK, Muthén BO. Mplus User’s Guide. Fifth Edition. Los Angeles, CA: Muthén & Muthén; 1998–2007.
  39. Neale MC, Boker SM, Xie G, Maes HH. Mx: Statistical modeling. 5th ed. Unpublished program manual. Richmond, VA: Virginia Institute of Psychiatric and Behavioral Genetics, Medical College of Virginia, Virginia Commonwealth University; 1999.
  40. Pinheiro JC, Bates DM. Mixed Effects Models in S and S-Plus. New York: Springer-Verlag; 2000.
  41. Ram N, Grimm KJ. Using simple and complex growth models to articulate developmental change: Matching method to theory. International Journal of Behavioral Development. 2007;31:303–316.
  42. Ratkowsky DA. Handbook of Non-linear Regression Models. New York: Marcel Dekker Inc.; 1989.
  43. Raudenbush SW, Bryk AS, Cheong YF, Congdon R. HLM 6: Hierarchical linear and non-linear modeling. Lincolnwood, IL: Scientific Software International; 2004.
  44. Richards FJ. A flexible growth function for empirical use. Journal of Experimental Botany. 1959;10:290–301.
  45. Rindskopf D. Parameterizing inequality constraints on unique variances in linear structural models. Psychometrika. 1983;48:73–83.
  46. Rogosa DR, Willett JB. Understanding correlates of change by modeling individual differences in growth. Psychometrika. 1985;50:203–228.
  47. Sheu CF, Chen CT, Su YH, Wang WC. Using SAS PROC NLMIXED to fit item response theory models. Behavior Research Methods. 2005;37:202–218. doi: 10.3758/bf03192688.
  48. Singer JD, Willett JB. Applied longitudinal data analysis: Modeling change and event occurrence. Oxford: Oxford University Press; 2003.
  49. Spiegelhalter DJ, Thomas A, Best NG, Lunn D. WinBUGS version 1.4.1 User manual. Cambridge: Medical Research Council Biostatistics Unit; 2007. (Available from http://www.mrc-bsu.cam.ac.uk/bugs)
  50. Steiger JH, Lind JC. Statistically based tests for the number of common factors. Presented at the meeting of the Psychometric Society; Iowa City, IA; 1980.
  51. Thieme HR. Mathematics in population biology. Princeton, NJ: Princeton University Press; 2003.
  52. Westerfeld WW. Biological response curves. Science. 1956;123:1017–1019. doi: 10.1126/science.123.3206.1017.
  53. Willett JB. Investigating individual change and development: The multilevel model for change and the method of latent growth modeling. Research in Human Development. 2004;1:31–57.
  54. Winsor CP. The Gompertz curve as a growth curve. Proceedings of the National Academy of Sciences. 1932;18:1–8. doi: 10.1073/pnas.18.1.1.
  55. Wohlwill JF. The study of behavioral development. New York: Academic Press; 1973.
