Translational and Clinical Pharmacology. 2022 Jun 15;30(2):75–82. doi: 10.12793/tcp.2022.30.e8

A simple time-to-event model with NONMEM featuring right-censoring

Quyen Thi Tran 1, Jung-woo Chae 1,2, Kyun-Seop Bae 3, Hwi-yeol Yun 1,2
PMCID: PMC9253447  PMID: 35800666

Abstract

In healthcare situations, time-to-event (TTE) data are common outcomes. A parametric approach is often employed to handle TTE data because different scenarios can easily be visualized via simulation. Not all pharmacometricians are familiar with the use of non-linear mixed effects modeling software (NONMEM) to deal with TTE data. Therefore, this tutorial explains in simple terms how to analyze TTE data using NONMEM. We show how to write the code and evaluate the model. We also provide an example of a hands-on model for training.

Keywords: NONMEM, Right-Censoring, Time-to-Event, Tutorial

INTRODUCTION

Time-to-event (TTE) analysis considers whether an event of interest occurs and when it occurs. The TTE can be the time from diagnosis to death, or the time from enrollment in a study to the development of a disease of interest. Non-parametric, semi-parametric, and parametric approaches are used to analyze TTE outcomes. In this tutorial, we focus on a parametric approach implemented in NONMEM, a software package for non-linear mixed effects modeling. We also provide a short description of handling TTE data in R. For simple TTE data, R is a good option because it is easy to implement. However, R is less well suited to more complex situations, such as repeated TTE data or integration with a pharmacokinetic/pharmacodynamic model. In such cases, NONMEM is a common choice owing to the flexibility with which a model can be modified for specific situations.

In a parametric survival model, the survival time (the outcome) is assumed to follow a known distribution. This is a well-recognized approach used to explore the relationship between survival and various explanatory variables. It yields an appropriate distribution of TTE data of interest. In TTE modeling, several mathematical terms must be understood, including the probability density function f(t), the survival function S(t), and the hazard function h(t) [1]. The probability density function describes the likelihood of observing an event at a particular time t:

f(t) = S(t) × h(t)

The hazard function is the instantaneous rate of event occurrence at time t, conditional on survival up to time t; that is, it describes an individual who has survived to time t but will experience the event in the next instant of time.

$$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$$

The survival function reflects the probability that the event of interest has not yet occurred by time t.

S(t) = Pr (T > t)

where T is the time to observation of an event.

The relationship between the survival and hazard functions is:

$$S(t) = e^{-\int_0^t h(u)\,du}$$
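
This relationship follows from the definitions given earlier in a few steps. The derivation below is a standard sketch, assuming that S(t) is differentiable and that S(0) = 1; the cumulative hazard symbol H(t) is introduced here only for convenience and is not part of the original notation.

```latex
\begin{aligned}
f(t) &= -\frac{dS(t)}{dt} \\
h(t) &= \frac{f(t)}{S(t)} = -\frac{1}{S(t)}\,\frac{dS(t)}{dt} = -\frac{d}{dt}\ln S(t) \\
H(t) &= \int_0^t h(u)\,du = -\ln S(t) + \ln S(0) = -\ln S(t) \\
S(t) &= e^{-H(t)} = e^{-\int_0^t h(u)\,du}
\end{aligned}
```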

TTE DISTRIBUTIONS

Several survival models can be applied to describe TTEs, including exponential, Weibull, Gompertz, and log-logistic models. The equations of the baseline hazard and survival functions corresponding to each distributional form are listed in Table 1.

Table 1. The baseline hazard and survival functions of some common TTE distributions.

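| TTE distribution | Baseline hazard function h0(t) | Survival function S(t) |
|---|---|---|
| Exponential distribution | $\lambda$ | $e^{-\lambda t}$ |
| Weibull distribution | $\lambda \gamma t^{\gamma - 1}$ | $e^{-\lambda t^{\gamma}}$ |
| Gompertz distribution | $\lambda e^{\gamma t}$ | $e^{-\frac{\lambda}{\gamma}\left(e^{\gamma t} - 1\right)}$ |
| Log-logistic distribution | $\frac{\lambda \gamma t^{\gamma - 1}}{1 + \lambda t^{\gamma}}$ | $\frac{1}{1 + \lambda t^{\gamma}}$ |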
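
As an illustration, the entries of Table 1 can be written as plain functions and checked against the general relationship between survival and hazard. The Python sketch below is not part of the original tutorial; the function names and the parameter values are assumptions chosen only for demonstration, and the numerical integral uses a simple midpoint rule.

```python
import math

def hazard(dist, t, lam, gam=1.0):
    """Baseline hazard h0(t) for the distributions listed in Table 1."""
    if dist == "exponential":
        return lam
    if dist == "weibull":
        return lam * gam * t ** (gam - 1)
    if dist == "gompertz":
        return lam * math.exp(gam * t)
    if dist == "loglogistic":
        return lam * gam * t ** (gam - 1) / (1 + lam * t ** gam)
    raise ValueError(f"unknown distribution: {dist}")

def survival(dist, t, lam, gam=1.0):
    """Closed-form survival function S(t) for the same distributions."""
    if dist == "exponential":
        return math.exp(-lam * t)
    if dist == "weibull":
        return math.exp(-lam * t ** gam)
    if dist == "gompertz":
        return math.exp(-(lam / gam) * (math.exp(gam * t) - 1))
    if dist == "loglogistic":
        return 1 / (1 + lam * t ** gam)
    raise ValueError(f"unknown distribution: {dist}")

def survival_numeric(dist, t, lam, gam=1.0, steps=20000):
    """S(t) = exp(-integral of h from 0 to t), using a midpoint rule."""
    dt = t / steps
    cum_hazard = sum(hazard(dist, (i + 0.5) * dt, lam, gam) for i in range(steps)) * dt
    return math.exp(-cum_hazard)

# Hypothetical parameter values (lam, gam), for demonstration only.
EXAMPLE_PARAMS = {
    "exponential": (0.01, 1.0),
    "weibull": (0.01, 1.5),
    "gompertz": (0.01, 0.05),
    "loglogistic": (0.01, 1.5),
}

if __name__ == "__main__":
    for name, (lam, gam) in EXAMPLE_PARAMS.items():
        closed = survival(name, 10.0, lam, gam)
        numeric = survival_numeric(name, 10.0, lam, gam)
        print(f"{name:12s} closed={closed:.6f} numeric={numeric:.6f}")
```

For each distribution, the closed-form and numerically integrated survival values should agree closely, which is a quick way to catch a sign or exponent error when coding a hazard model.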

If any covariate (e.g., age, sex, or drug dose) is relevant, a covariate term (COV) enters the hazard equation:

h(t) = h0(t) × COV

The form of COV depends on the type of covariate. For example, if a covariate is dichotomous, COV can take the form $e^{\beta_i x_i}$. If a covariate is continuous, COV can be written as $e^{\beta_i (x_i - x_{i,\mathrm{mean}})}$ [2].
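
The following Python sketch shows how such a covariate term scales a baseline hazard. It is illustrative only: the coefficient values, the mean age, and the choice to multiply the sex and age terms together are assumptions made for this example and are not estimates from the tutorial dataset.

```python
import math

def covariate_term(sex, age, beta_sex, beta_age, mean_age):
    """COV for one dichotomous covariate (sex, coded 0/1) and one
    continuous covariate (age, centered on its mean)."""
    return math.exp(beta_sex * sex) * math.exp(beta_age * (age - mean_age))

def hazard_with_covariates(baseline_hazard, cov):
    """h(t) = h0(t) x COV."""
    return baseline_hazard * cov

# Hypothetical coefficients, for demonstration only.
BETA_SEX = 0.4
BETA_AGE = 0.02
MEAN_AGE = 60.0

# A reference subject (sex = 0, age equal to the mean) has COV = 1,
# so the hazard equals the baseline hazard.
cov_reference = covariate_term(0, 60.0, BETA_SEX, BETA_AGE, MEAN_AGE)

# For sex = 1 at the mean age, COV equals exp(BETA_SEX), the hazard ratio.
cov_sex1 = covariate_term(1, 60.0, BETA_SEX, BETA_AGE, MEAN_AGE)

if __name__ == "__main__":
    print("COV (reference):", cov_reference)
    print("COV (sex = 1):  ", cov_sex1)
    print("hazard (sex = 1, h0 = 0.01):", hazard_with_covariates(0.01, cov_sex1))
```

Centering the continuous covariate on its mean keeps the baseline hazard interpretable as the hazard of a typical subject.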

TRANSLATION OF TRIAL DATA TO THE MODEL

In most survival analyses, censoring is a key analytical problem. Censoring occurs when the exact survival time is unknown but some information about it is available. Types of censoring include right-censoring, left-censoring, and interval censoring [3]. For right-censored data, the true survival time is equal to or greater than the observed survival time. For left-censored data, the true survival time is less than or equal to the observed survival time; however, this is rarely encountered. For interval-censored data, the true survival time lies within a known interval. In this tutorial, we focus on how to handle right-censored data. Right-censoring can occur if the study ends without any event, or if patients are lost to follow-up or withdraw.

As an example, Fig. 1 shows certain events occurring within a study. IDs 1, 3, 4, 5, and 6 are right-censored. Censoring of IDs 1, 3, and 5 reflects the end of the study; censoring of IDs 4 and 8 reflects possible loss to follow-up or withdrawal. For ID 2, the event occurred after 4.5 days; for ID 7, the event occurred at 8 days. To use these data in NONMEM, it is necessary to translate and format the original data as shown in Fig. 2 (the last column [“Note”] simply aids understanding; it is not part of the NONMEM data file). Each subject has two time points: the commencement of observation (time = 0) and the time of an event or censoring. If more subject information (e.g., age or gender) is available, columns can be added to include it.
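
The translation step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' R code: the column names ID, TIME, and DV follow the text, the coding (DV = 1 for an occurred event, 0 otherwise) follows the description of DV later in this tutorial, and the censoring time used for ID 1 is a hypothetical value because Fig. 1 is not reproduced here.

```python
def translate(subjects):
    """Turn (id, time, event) tuples into two records per subject.

    event = 1 means the event occurred; event = 0 means right-censored.
    The first record marks the commencement of observation (time = 0).
    """
    rows = []
    for subject_id, time, event in subjects:
        rows.append({"ID": subject_id, "TIME": 0, "DV": 0})
        rows.append({"ID": subject_id, "TIME": time, "DV": event})
    return rows

# IDs 2 and 7 use the event times given in the text.
# The censoring time of 10 days for ID 1 is hypothetical.
raw = [(1, 10, 0), (2, 4.5, 1), (7, 8, 1)]
rows = translate(raw)

if __name__ == "__main__":
    for row in rows:
        print(row)
```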

Figure 1. Example of raw time-to-event data.


Figure 2. Translation of clinical data to non-linear mixed effects model data.


BASIC STRUCTURE AND EXECUTION OF NONMEM

Creating a data file

The example dataset used in this tutorial features 50 individuals observed over 100 days. Under the TTE assumption, each individual experiences at most one event at a random time during the 100 days; the event either occurs or is censored. The data for simulation should include complete daily records. The full dataset is provided in Supplementary Data 1 (dataTTE_nm.csv file); the format is that of Fig. 3. Because formatting a large dataset in Excel is tedious, we provide R code that creates a NONMEM dataset from translated clinical data. All files are in Supplementary Data 2 and 3 (translated data, dataTTE.csv file; R code, part 1 in R_code.docx file).

Figure 3. An example dataset used for time-to-event analysis by non-linear mixed effects model.


The columns in Fig. 3 are as follows:

  • ID: subject identification number.

  • TIME: the time at which observation commenced (zero) and the time of the event, whether it in fact occurred or was censored (these are required for the estimation step), together with all time points from zero to the end of the study (required for the simulation step). For example, for ID 102 in Fig. 3, the TIME column contains all time points from zero to the end of the study.

  • TYPE: a flag that separates the dataset by purpose. TYPE is set to 1 for records used in estimation (only time zero and the event time) and to 0 for records used in simulation (all time points).

  • DV: the discrete event observation (0 = censored, 1 = occurred).

  • MDV: missing DV.

  • SEX, etc.: individual characteristics (if available).

  • EVID: event ID (optional); EVID = 0 indicates an observation and EVID = 3 indicates the commencement of a new individual.
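
The role of the TYPE column can be illustrated with a short Python sketch that builds both sets of records for each subject. This is an assumption-laden illustration, not a reproduction of dataTTE_nm.csv: only the ID, TIME, DV, and TYPE columns are generated, DV = 0 on the simulation records is a placeholder, and the MDV and EVID columns and the record ordering of the real file are not modeled here.

```python
def build_dataset(subjects, end_time=100):
    """Create estimation (TYPE = 1) and simulation (TYPE = 0) records.

    subjects: list of (id, time, event), event 1 = occurred, 0 = censored.
    """
    rows = []
    for subject_id, time, event in subjects:
        # TYPE = 1: only time zero and the time of the event or censoring.
        rows.append({"ID": subject_id, "TIME": 0, "DV": 0, "TYPE": 1})
        rows.append({"ID": subject_id, "TIME": time, "DV": event, "TYPE": 1})
        # TYPE = 0: one record per day from zero to the end of the study.
        for day in range(end_time + 1):
            rows.append({"ID": subject_id, "TIME": day, "DV": 0, "TYPE": 0})
    return rows

# Hypothetical subjects, for demonstration only.
dataset = build_dataset([(101, 37, 1), (102, 100, 0)])
estimation_rows = [r for r in dataset if r["TYPE"] == 1]
simulation_rows = [r for r in dataset if r["TYPE"] == 0]

if __name__ == "__main__":
    print(len(estimation_rows), "estimation records")
    print(len(simulation_rows), "simulation records")
```

Filtering on TYPE then yields the records for each purpose, which is the separation that the comment-flipping mechanism described in the next section relies on.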

The control file

The control stream features several blocks. Most terms are well explained in the NONMEM Tutorial Part I [4]; therefore, we focus on terms and code not mentioned there. An example control stream file for the TTE Gompertz model is provided in Supplementary Data 4 (run101.mod file).

In the $DATA block, the tags ;Sim_start and ;Sim_end enclose user-defined simulation code that is activated when PsN is run with the -flip_comments option. This option inverts the comments between the tags: lines that are commented out become active, and active lines are commented out. For example, if the tagged comments are present in the model, the estimation model runs as written, and the simulation model created by PsN instead contains the line shown on the right of the code:

[Code listing: tcp-30-75-g007.jpg]

The $PK block defines the parameters of the hazard model. For example, the hazard model in this .mod file follows the Gompertz distribution; thus, two parameters are defined:

[Code listing: tcp-30-75-g008.jpg]

The $DES block describes the hazard equation:

[Code listing: tcp-30-75-g009.jpg]
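
To give a sense of what integrating a hazard equation involves, the Python sketch below integrates the Gompertz hazard over time to obtain the cumulative hazard and compares the result with the closed-form expression. It is an illustration only and does not reproduce the control stream: it assumes that the differential equation has the hazard as the rate of change of the cumulative hazard, uses a fixed-step Runge-Kutta scheme rather than NONMEM's solvers, and takes the parameter values from Table 2.

```python
import math

def gompertz_hazard(t, lam, gam):
    """Gompertz hazard h(t) = lam * exp(gam * t)."""
    return lam * math.exp(gam * t)

def cumulative_hazard_rk4(lam, gam, t_end, steps=1000):
    """Integrate dA/dt = h(t) from 0 to t_end with classical RK4."""
    dt = t_end / steps
    cum_hazard = 0.0
    t = 0.0
    for _ in range(steps):
        k1 = gompertz_hazard(t, lam, gam)
        k2 = gompertz_hazard(t + dt / 2, lam, gam)
        k3 = k2  # the right-hand side does not depend on the state
        k4 = gompertz_hazard(t + dt, lam, gam)
        cum_hazard += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += dt
    return cum_hazard

def cumulative_hazard_exact(lam, gam, t):
    """Closed form: (lam / gam) * (exp(gam * t) - 1)."""
    return (lam / gam) * (math.exp(gam * t) - 1)

if __name__ == "__main__":
    lam, gam = 0.0052, 0.0279  # estimates reported in Table 2
    numeric = cumulative_hazard_rk4(lam, gam, 100.0)
    exact = cumulative_hazard_exact(lam, gam, 100.0)
    print("cumulative hazard (numeric):", numeric)
    print("cumulative hazard (exact):  ", exact)
    print("survival at day 100:        ", math.exp(-numeric))
```

A differential-equation formulation is more general than the closed form because it still works when the hazard depends on time-varying quantities such as drug concentration.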

The $ERROR block calculates the hazard, the survival, and the probability of events, and contains the commands for simulation.

[Code listing: tcp-30-75-g010.jpg]
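
The quantities computed in this step determine the likelihood of each record. As a hedged sketch of the standard likelihood for right-censored data (not a transcription of the control stream), a censored record contributes the survival S(t), whereas an observed event contributes the density f(t) = S(t) × h(t). The Python function below evaluates the corresponding log-likelihood for the Gompertz model; the function name and the example records are assumptions for illustration.

```python
import math

def gompertz_loglik(lam, gam, records):
    """Log-likelihood of right-censored data under a Gompertz model.

    records: list of (time, event), event 1 = occurred, 0 = censored.
    """
    loglik = 0.0
    for time, event in records:
        cum_hazard = (lam / gam) * (math.exp(gam * time) - 1)
        loglik += -cum_hazard  # log S(t), contributed by every record
        if event == 1:
            loglik += math.log(lam) + gam * time  # log h(t), events only
    return loglik

if __name__ == "__main__":
    # Hypothetical records, for demonstration only.
    records = [(37, 1), (62, 1), (100, 0)]
    ll = gompertz_loglik(0.0052, 0.0279, records)
    print("log-likelihood:", ll)
    print("-2 x log-likelihood:", -2 * ll)
```

Minimizing -2 × log-likelihood over the two parameters corresponds to the objective function that is compared between nested models.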

The $ERROR block also contains the code for the simulation step (right side of Fig. 4). The simulation procedure is explained on the left of Fig. 4, where:

Figure 4. Simulation of a time-to-event model in non-linear mixed effects model.


  • DV: the event; takes a value of 0 when no event occurs (at time = 0) or when an event is censored, and a value of 1 when an event occurs.

  • TTE: an event variable required for a visual predictive check (VPC).

  • OTTE: an event counter; this ensures that in a TTE model only one event occurs.

Both TTE and OTTE take a value of 0 when no event occurs and a value of 1 for events (both censored and occurring).

When the simulation is initiated:

At time = 0, DV, TTE, and OTTE are set to zero. Whenever a new ID is called [NEWIND not equal (NE.) to 2], a random value (USUR) is picked from a uniform distribution and compared to the survival at every time point (SUR).

If USUR > SUR and time ≤ 100, an event occurs, and DV, TTE, and OTTE are set to 1. If USUR ≤ SUR and time = 100, the event will be censored, and DV is set to 0 but TTE and OTTE are set to 1.
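
The simulation logic described above can be sketched in Python as follows. This is a simplified illustration, not the NONMEM code in Fig. 4: it assumes a Gompertz survival function with the Table 2 estimates, daily time points, and one uniform draw (USUR) per individual; the function name and the seed are arbitrary.

```python
import math
import random

def simulate_subject(lam, gam, rng, end_time=100):
    """Simulate one individual; returns (time, DV).

    DV = 1 if an event occurs, DV = 0 if censored at the end of the study.
    In both cases the TTE flag described in the text would be set to 1.
    """
    usur = rng.random()  # drawn once per individual
    for day in range(end_time + 1):
        sur = math.exp(-(lam / gam) * (math.exp(gam * day) - 1))
        if usur > sur:
            return day, 1  # survival has fallen below the draw: event
    return end_time, 0  # no event by the last day: right-censored

if __name__ == "__main__":
    rng = random.Random(2022)  # arbitrary seed for reproducibility
    results = [simulate_subject(0.0052, 0.0279, rng) for _ in range(50)]
    n_events = sum(dv for _, dv in results)
    print("events:", n_events, "censored:", len(results) - n_events)
```

Because USUR is uniform on the unit interval, the probability that an event occurs by the end of the study equals 1 − S(100), so the simulated event fraction should approach that value as the number of individuals grows.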

The estimation step features various methods including Laplace, stochastic approximation expectation-maximization, and importance sampling methods; the details are in the work of Karlsson et al. [5].

[Code listing: tcp-30-75-g011.jpg]

In the above code, the tags ;Sim_start and ;Sim_end are again used to switch from estimation to simulation. When the comments are flipped, PsN uses this line in the code:

[Code listing: tcp-30-75-g012.jpg]

The table output file is used for visualization:

[Code listing: tcp-30-75-g013.jpg]

Running the model:

[Code listing: tcp-30-75-g014.jpg]

Selection of the model:

During TTE analysis, attention should be paid to the difference in objective function value between nested models, the precision of the parameter estimates, and scientific plausibility. In addition, model selection is supported by a visual predictive check (VPC). To execute the VPC, the PsN command is:

[Code listing: tcp-30-75-g015.jpg]

Here:

  • -flip_comments inverts the comments between the tags ;Sim_start and ;Sim_end.

  • -tte = TTE is required for a TTE model used for simulation. The TTE variable must take a value of 0 if an observation is not an event and a non-zero value (for example, 1) if an observation is an event (whether occurring or censored). PsN adds this variable as a column to the table file of the simulation and formats the output differently from a regular VPC (specifically to suit the kaplan.plot functionality of Xpose). For more information on the VPC options for TTE modeling, refer to the VPC and NPC user guides [6].

  • -samples = n: The simulation will run with a sample size of n. Generally, to adequately evaluate the appropriateness of a developed model, the VPC should run with a sample size of 1,000.

  • -stratify_on = SEX (optional): This option allows the user to stratify the VPC by groups defined by (for example) gender or drug dose.

  • -rplot = 1 (optional): Basic VPC plots are generated.

Plotting the VPC:

When plotting the VPC using Xpose, the following R code is used. The code is in the Supplementary Data 3 (part 2 in R_code.docx).

[Code listing: tcp-30-75-g016.jpg]

Results:

Table 2 and Fig. 5 present the estimated parameters and the VPC, respectively, after estimation and evaluation using the example dataset in the Supplementary Data 1.

Table 2. The estimated parameters of the example TTE dataset.

| Parameters | Estimated value | Relative standard error |
|---|---|---|
| Rate factor (λ) | 0.0052 | 32% |
| Shape factor (γ) | 0.0279 | 18% |
| Random effect* | 0 FIX | - |

*A random effect describing the log-normal distribution of the hazard.
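
To help interpret these estimates, the Python sketch below computes two derived quantities from the Gompertz survival function in Table 1: the probability of remaining event-free at day 100 and the median time to event, obtained by solving S(t) = 0.5 for t. These derived values are a worked illustration based on the Table 2 point estimates and are not reported in the original tutorial; the function names are assumptions.

```python
import math

LAM = 0.0052  # rate factor from Table 2
GAM = 0.0279  # shape factor from Table 2

def gompertz_survival(t, lam=LAM, gam=GAM):
    """S(t) = exp(-(lam / gam) * (exp(gam * t) - 1))."""
    return math.exp(-(lam / gam) * (math.exp(gam * t) - 1))

def gompertz_median(lam=LAM, gam=GAM):
    """Solve S(t) = 0.5: t = ln(1 + gam * ln(2) / lam) / gam."""
    return math.log(1 + gam * math.log(2) / lam) / gam

if __name__ == "__main__":
    print("S(100):", round(gompertz_survival(100.0), 3))
    print("median time to event (days):", round(gompertz_median(), 1))
```

With these estimates, roughly 6% of individuals are expected to remain event-free at day 100, and the median time to event is about 56 days.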

Figure 5. The visual predictive check of the time-to-event model. The solid line represents the observed survival probability, and the shaded area represents the 90% prediction interval of the simulated data.

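
The two elements of such a plot can be sketched in Python: a Kaplan-Meier estimate of the observed survival probability and a percentile band across simulated replicates. This is a conceptual illustration only; PsN and Xpose perform these steps in the actual workflow and may compute the interval differently. The function names, the nearest-rank percentile rule, and the example records are assumptions.

```python
def kaplan_meier(records):
    """Kaplan-Meier curve as a list of (event_time, survival).

    records: list of (time, event), event 1 = occurred, 0 = censored.
    """
    event_times = sorted({time for time, event in records if event == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for time, _ in records if time >= t)
        events = sum(1 for time, event in records if time == t and event == 1)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

def km_at(curve, t):
    """Evaluate the step function defined by a Kaplan-Meier curve at time t."""
    survival = 1.0
    for time, value in curve:
        if time > t:
            break
        survival = value
    return survival

def prediction_band(sim_curves, grid, lower=0.05, upper=0.95):
    """Percentile band (default 90%) across simulated Kaplan-Meier curves."""
    band = []
    for t in grid:
        values = sorted(km_at(curve, t) for curve in sim_curves)
        low = values[int(lower * (len(values) - 1))]
        high = values[int(upper * (len(values) - 1))]
        band.append((t, low, high))
    return band

if __name__ == "__main__":
    # Hypothetical observed records, for demonstration only.
    observed = [(2, 1), (3, 0), (4, 1), (5, 1), (6, 0)]
    for time, survival in kaplan_meier(observed):
        print(time, round(survival, 4))
```

A model describes the data adequately when the observed Kaplan-Meier curve lies mostly within the band obtained from the simulated datasets.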

ANALYZING TTE DATA IN R

TTE data can also be analyzed in R using the “flexsurv” package [7]. This package allows users to fit various distributions, such as the exponential, Gompertz, and Weibull distributions. Example code is provided in Supplementary Data 3 (part 3 in R_code.docx).

The function “flexsurvreg” gave the same results as the NONMEM model, with shape parameter = 0.0279 and rate parameter = 0.0052. The plots of survival probability are shown in Fig. 6.

Figure 6. Plots of survival probability according to gender. The black line indicates the observed survival probability, the red line indicates the fitted curve, and the green lines indicate the 95% confidence interval of the fitted curve.


Footnotes

Funding: This research was funded by Chungnam National University and an Institute of Information and Communications Technology Planning and Evaluation grant funded by the government of Republic of Korea (MSIT) (No. 2020-0-01441, Artificial Intelligence Convergence Research Center, Chungnam National University; and No.RS-2022-00155857, Artificial Intelligence Convergence Innovation Human Resources Development, Chungnam National University) and supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No.NRF-2022R1A2C1010929).

Conflict of Interest: - Authors: Nothing to declare

- Reviewers: Nothing to declare

- Editors: Nothing to declare

Author Contributions:
  • Conceptualization: Tran QT, Chae JW, Bae KS, Yun HY.
  • Formal analysis: Tran QT.
  • Investigation: Tran QT.
  • Methodology: Tran QT, Chae JW, Bae KS, Yun HY.
  • Software: Tran QT.
  • Supervision: Chae JW, Bae KS, Yun HY.
  • Validation: Chae JW, Yun HY.
  • Writing - original draft: Tran QT.
  • Writing - review & editing: Chae JW, Bae KS, Yun HY.

SUPPLEMENTARY MATERIALS

Supplementary Data 1
tcp-30-75-s001.csv (88.4KB, csv)
Supplementary Data 2
tcp-30-75-s002.csv (1.1KB, csv)
Supplementary Data 3
tcp-30-75-s003.doc (34KB, doc)
Supplementary Data 4
tcp-30-75-s004.mod (1.5KB, mod)

References


Articles from Translational and Clinical Pharmacology are provided here courtesy of Korean Society for Clinical Pharmacology and Therapeutics
