Interoccasion variability in population pharmacokinetic models: identifiability, influence, interdependencies and derived study design recommendations

Emily Behrens; Sebastian G Wicha

doi:10.1007/s10928-025-09966-7

. 2025 Apr 11;52(2):23. doi: 10.1007/s10928-025-09966-7

Interoccasion variability in population pharmacokinetic models: identifiability, influence, interdependencies and derived study design recommendations

Emily Behrens ¹, Sebastian G Wicha ^1,^✉

PMCID: PMC11992005 PMID: 40216605

Abstract

Modeling interoccasion variability (IOV) of pharmacokinetic parameters is challenging in sparse study designs. We conducted a simulation study with stochastic simulation and estimation (SSE) to evaluate the influence of IOV (25, 75%CV) from numerous perspectives (power, type I error, accuracy and precision of parameter estimates, consequences of neglecting an IOV, capability to detect the ‘correct’ IOV). To expand the scope from modeling-related aspects to clinical trial practice, we investigated the minimal sample size for IOV detection and calculated areas under the concentration-time curve (AUC) derived from models containing IOV and mis-specified models. The power to correctly detect an IOV increased from one to three occasions (OCC) and the type I error rate to falsely include an IOV was not elevated. Two sampling schemes were compared (with/without trough sample) and including a trough sample resulted in better performance throughout the different evaluations in this simulation study. Parameters were estimated more precisely when more OCCs were included and IOV was of high effect size. Neglecting an IOV that was truly present had a high impact on bias and imprecision of the parameter estimates, mostly on interindividual variabilities and residual error. To reach a power of ≥ 95% in all scenarios when sampling in three OCCs between 10 and 50 patients were required in the investigated setting. AUC calculations with mis-specified models revealed a distorted AUC distribution as IOV was not considered.

Supplementary Information

The online version contains supplementary material available at 10.1007/s10928-025-09966-7.

Keywords: Interoccasion variability, Study design, Pharmacometrics, Stochastic simulation and estimation, NONMEM^®

Introduction

Nonlinear mixed-effect modeling aims at identifying and explaining different types of random effects within the model. In principle, two levels of random effects can be distinguished: Karlsson and Sheiner referred to the first level of random effects as the “random variation in parameters” and described the second level as “random variation of observations” [1]. Parameter variability is further subdivided into interindividual variability (IIV) and an intraindividual variability between different occasions (OCC), interoccasion variability (IOV) [1]. The importance of the evaluation and inclusion (if applicable) of IOV has been emphasized repeatedly from a pharmacokinetic (PK) but also pharmacodynamic (PD) perspective [1–3].

Sampling schemes to inform PK modeling in e.g. phase II clinical studies are sparse [4, 5]. Therefore, sampling may only be performed in a single dosing interval. Conceptionally, sampling in one OCC is not sufficient for the quantification of IOV as it represents the variability between different OCCs. However, IOV is often intrinsically present and influential at the time of the first observed OCC, even though in model development pharmacometricians often neglect testing for IOV given the sampling scheme constraints mentioned above.

Our simulation study aimed at exploration and evaluation of the influence of IOV in typical sparse sampling settings of phase II clinical studies using stochastic simulation and estimation (SSE). In a first step, we evaluated different sampling scenarios to evaluate power and type I error concerning the capability of finding an IOV in an estimation given that the IOV is truly present on the one hand and the risk of falsely including an IOV in a PK model on the other hand. In a second step, the influence of a truly present IOV on accuracy and precision of parameter estimates was explored. In addition, we investigated if the correct (simulated) IOV would be detectable when models including an IOV on a different parameter were used for estimation during e.g. model building. Moreover, the influence of an intentionally ignored IOV was evaluated. We performed sample size calculations using an SSE workflow with increasing number of patients represented in the datasets to evaluate the relationship between sample size and power outcome. To demonstrate clinical relevance, we calculated areas under the concentration-time curve (AUC) of mis-specified models and compared them to AUCs from correctly informed models.

Methods

A graphical workflow of the simulation study is shown in Fig. 1. To test and evaluate different hypotheses, SSEs were executed and power and type I error calculations were performed while assessing parameter bias and imprecision simultaneously. As the focus was set on the evaluation of the influence of IOV, datasets were generated that feature PK sampling in one to three dosing OCCs. The following section is a detailed description of each component or function of the simulation study.

Model

The simulation study was conducted using and adapting the one compartment model describing the population PK of linezolid in multidrug-resistant tuberculosis patients previously published by Tietjen et al. [6]. The interindividual variabilities on clearance (CL), volume of distribution (V) and absorption constant (k_a) were set to 32%CV (named IIV_CL/V/ka in the following). The model solely containing IIVs is referred to as IIV_only. Implementing an IOV on one parameter of the IIV_only model at a time led to three different models referenced as IOV_CL, IOV_V and IOV_ka in the following. Two magnitudes of IOV were implemented in each case, 25%CV (IOV25) and 75%CV (IOV75). For the sake of traceability, e.g. IOV25 on CL will be described as IOV25_CL. In total, seven various models were created (IOV25_CL, IOV75_CL, IOV25_V, IOV75_V, IOV25_ka, IOV75_ka, IIV_only). The implementation of the IOV was encoded in compliance with the approach described by Karlsson and Sheiner [1]. ETAs were assigned to every occasion in the dataset. The residual variability was described by a combined proportional and additive error model [6].

The parameter values used in the model file are listed in Table 1.

Table 1.

Initial parameter estimates of the population pharmacokinetic model used in the simulation study (CL: clearance, CV: coefficient of variation, IIV: interindividual variability, IOV: interoccasion variability, k_a: absorption rate constant, V: volume of distribution)

Structural parameters	Parameter value
CL [L/h]	6.8
V [L]	38.9
k_a [h^− 1]	0.617
IIV/IOV^a
IIV CL [%CV]	32
IIV V [%CV]	32
IIV k_a [%CV]	32
IOV_CL or IOV_V or IOV_ka [%CV]	25 and 75
Residual variability
Proportional error [%]	19.1
Additive error [mg/L]	0.56

Open in a new tab

^aEquation used for transformation of ω² to %CV: Inline graphic

Sampling schemes

The $DESIGN feature in NONMEM^® was used to evaluate an optimized study design for the administered daily dose of 600 mg based on the D-optimality criterion [7]. The starting point for the optimization was the IOV25_CL model file. The design element to be optimized was time since the first dosing event (TIME) and the dataset that sets the initial timepoints for the optimization process contained five sampling timepoints. The initial sampling times were provided their own stratification variable (TSTRAT) values to assure an independent variation while optimizing TIME. The boundaries for TIME were 72.5 (TMIN) and 96 (TMAX). TMIN was chosen to reflect steady-state. The $DESIGN tool suggested five samples with sampling at 72.5 h twice and three subsequent samples. This led to the design of sampling scheme (a) with a total of four samples. For sampling scheme (b) a sample right before the timepoint of dosing was added to the dataset manually. Hence, sampling scheme (b) includes a trough concentration informed by the previous occasion.

Ultimately, two sampling schemes were evaluated: (a) 0.5, 2, 8, 22 h after dose and (b) 0/pre-dose, 0.5, 2, 8, 22 h after dose.

Dataset generation

Three simulation datasets were generated using R. One dataset contained samples taken at only one OCC. Another dataset contained samples taken in two dosing OCCs and the last dataset contained samples taken in three dosing OCCs. Dosing events with and without observations were considered an OCC; the added trough sample in sampling scheme (b) belongs to the previous OCC. As OCC is not a predefined variable in NONMEN, it hast to be understood and treated like a time-varying covariate. Analogous to the approach taken by Denti, “dummy records” (EVID = 2) were used to ensure the correct assignment of the OCC number within one dosing interval [8]. A dataset example of one patient for each sampling scheme can be found in the supplementary files (Figure S1).

A daily dose of 600 mg was simulated for 150 patients. The number of patients in phase II studies varies depending on the study design. A total of 150 patients was chosen as a conceivable number of patients from real life phase II studies [9, 10]. Once the dataset framework entered the SSE workflow, it was completed by simulated DVs.

Stochastic simulation and estimation

The SSE tool from Perl-speaks-NONMEM (PsN) was used for the calculation of the power or type I error to find an IOV that truly is or is not present, respectively [11]. The probability of correctly rejecting the null hypothesis is defined as the power of a statistical test, while type I error describes the probability of falsely rejecting the null hypothesis. The different combinations of simulation and estimation models that were used for the power or type I error setup are shown in Fig. 1. The number of simulated datasets to generate in the SSE was set to 500 (N in Fig. 1). Figure 2 gives an overview of the number of different SSE runs that were performed in total except for the SSE runs used for the minimal sample size investigation. We used seven different model files IOV25_CL, IOV75_CL, IOV25_V, IOV75_V, IOV25_ka, IOV75_ka and IIV_only. For power calculations 36 SSEs were performed and we calibrated the chi-square critical value to an alpha of 0.05. Table S2 shows the critical values from the alpha-calibration. The SSEs from the power calculations were also used for the assessment of the ability to identify the correct IOV. Another 12 SSEs were performed to calculate type I errors. The total number of SSEs performed was 48 (except for minimal sample size SSEs), while using multiple alternative models in the SSE command. Additionally, 192 SSE runs were executed for the determination of the minimal sample size. The same initial parameter values were used for the simulation and the estimation step.

Fig. 2 — Summary of evaluated SSE runs (a: sampling scheme (a), b: sampling scheme (b), 1: one occasion, 2: two occasions, 3: three occasions, IOV_CL: interoccasion variability on clearance, IOV_V: interoccasion variability on volume of distribution, IOV_ka: interoccasion variability on absorption constant, SSE: stochastic simulation and estimation) with scenario (1) showing the power setup when IIV_only was chosen as the alternative model (alternative models containing IOV for calculation of difference in objective function value, explained in Identification of correct IOV), while scenario (2) illustrates the setup for type I error calculations

Identification of correct IOV

To assess the ability of identifying the correct IOV, we added several alternative models in the estimation step of the SSE. Therefore, we used a model including IOV in every occasion in the simulation step and models including IOV on structural parameters, which were implemented on different parameters as compared to the simulation model, and IIV_only in the estimation step. We calculated the difference between the objective function value (ΔOFV) from the estimated IIV_only model and the IOV models to mimic the decision-making process of whether to integrate an IOV in a model or not based on ΔOFV. Negative values of ΔOFV are a result of higher objective function values (OFV) of the IOV model compared to the IIV_only model. In general, a lower OFV is an indicator of better model fit and the ΔOFV is considered statistically significant when ΔOFV ≥ critical value. The critical values after alpha-calibration (α = 0.05) can be found in the supplementary files (Table S2).

Impact of neglecting IOV

We also investigated the impact of a simulated IOV that is not represented in the estimation model of an SSE on the accuracy and imprecision of model parameters. For this purpose, we set up two different scenarios: The first scenario (IOV_included) consisted of one model used for both the simulation and estimation step. The second scenario (IOV_excluded) used a simulation model including an IOV and an estimation model that did not include an IOV. Therefore, we could simulate and analyze the impact of ignoring an IOV that is truly present in the simulated data. The SSE runs needed for this evaluation are shown under Scenario (1) in Fig. 2 (e.g. IOV_{CL, included} translates to IOV_CL as simulation/estimation model, IOV_{CL, excluded} translates to IOV_CL as simulation model and IIV_only as estimation/alternative model).

Minimal sample size investigation

We investigated the relationship between power and the number of patients in the simulation study. The ‘full model’ was a model with an implemented IOV and as the ‘reduced model’ we used the IIV_only model, similarly as described above (power setup). The same dataset structure as in the SSE workflow was used, starting with 10 patients up to a total 150 in steps of 10. Sample sizes reaching 95% power were considered sufficient.

Application example: AUC distribution

After evaluating the effect of neglecting IOV regarding parameter estimation and variability distribution, we further investigated the consequences of applying mis-specified models to demonstrate the potential clinical relevance. Hence, we calculated the AUC (Eq. 1) after simulating with different IOV_CL models in one to three OCCs.

In total we covered three scenarios:

I.
True model containing IOV_CL: We used the IOV_CL model for simulation.
II.
True model with final estimates from SSE: We used the re-estimated IOV_CL model for simulation.
III.
Mis-specified model neglecting IOV_CL: We used the re-estimated IIV_only model for simulation.

The AUC calculations focused on the influence of IOV, residual variability was not considered but was part of the model files in the same way as shown in Table 1.

Evaluation

For parameter estimation five candidate models were used: the same model as used for simulation, IIV_only and IOV25_CL/V/ka or IOV75_CL/V/ka models with an IOV on another PK parameter than the one used in the simulation model. When a model with IOV was used in the simulation step and for the estimation part of the SSE a model without this particular IOV (IIV_only) was used, the setup facilitates power calculations (scenario (1), Fig. 2). Thereby, an evaluation of the ability to detect an IOV that is truly present is possible. To evaluate the type I error rate, the simulation was performed with the IIV_only model (scenario (2), Fig. 2). In the estimation step models including IOVs were utilized. Moreover, ΔOFVs from estimations of IIV_only compared to estimations of IOV_CL/IOV_V/IOV_ka were analyzed regarding the ability to correctly detect a simulated (‘true’) IOV in a given scenario.

Imprecision and bias, expressed as relative root mean squared error (rRMSE) (Eq. 2) and relative bias (rBIAS) (Eq. 3) of the population parameters, were used as evaluation criteria.

Power, type I error, rBIAS, rRMSE and ΔOFV values were taken from the respective SSE output files. Overall, we focused on a significance level of α = 0.05. The chi-square critical value was calibrated to correspond to an α of 0.05 for all the scenarios that resulted in a power value < 100% (Table S2). Hence, the ΔOFV that was considered a statistically significant change in OFV varied throughout the different scenarios.

Software

All data was simulated and analyzed using NONMEM^® (version 7.5.0), operated using the PsN (version 5.3.1/5.4.0 (power curves)) SSE [11]. Creation of the different datasets, implementation of the tested sampling schemes and graphical analysis were performed using R (version 4.2.1) and RStudio (version 2022.07.0) [12].

Results

Power

The power to correctly detect an IOV increased with a second or third OCC (Table 2). A notable example of this improvement is the power increase from one (18.2%) to two OCCs (100.0%) up to three OCCs (100.0%) for the simulation model including IOV25_ka in the evaluation of sampling scheme (b). A single OCC did not facilitate estimation of IOV25 (power ≤ 65.4%).

Table 2.

Power (%) of detecting an IOV when truly present for the two sampling schemes (a: 0.5, 2, 8, 22 h after dose, b: 0, 0.5, 2, 8, 22 h after dose) and two magnitudes of IOV (25%, 75%) on parameters CL, V and k_a

Sampling	OCC	SIM/EST with IOV_CL		SIM/EST with IOV_V		SIM/EST with IOV_ka
		25%	75%	25%	75%	25%	75%
		EST with IIV_only
		power [%]	power [%]	power [%]	power [%]	power [%]	power [%]
a	1	48.8	100.0	65.4	99.8	19.2	71.4
a	2	100.0	100.0	100.0	100.0	100.0	100.0
a	3	100.0	100.0	100.0	100.0	100.0	100.0
b	1	100.0	100.0	91.6	100.0	18.2	91.8
b	2	100.0	100.0	100.0	100.0	100.0	100.0
b	3	100.0	100.0	100.0	100.0	100.0	100.0

Open in a new tab

Moreover, the power to correctly detect an IOV was higher for the scenarios with IOV75 than for IOV25 analogues. For the IOV_CL scenario the power increased from 48.8% (IOV25) to 100.0% (IOV75) in one OCC in sampling scheme (a).

Power was overall highest to detect IOV_V while an IOV_ka showed the lowest values. IOV_CL, regardless of the extend, was correctly determinable in sampling scheme (b) according to power calculations (100.0%). Sampling scheme (a) resulted in a lower power value in one observed OCC (IOV25, 48.8%).

Overall, sampling scheme (b) which included the trough sample performed better than (a). The advantages of sampling scheme (b) became obvious while inspecting the power values of the first OCC for both sampling schemes. For instance, the power to correctly detect IOV25_CL was 48.8% for sampling scheme (a) and 100.0% for sampling scheme (b).

Type I error

The results of the type I error calculations are shown in Table 3. In contrast to the results from the power calculations, the type I error rates showed no recordable trend. Notably, the type I errors were less than 5% except for IOV75_CL, sampling scheme (b) in one OCC.

Table 3.

Type I error (%) of falsely including a variability parameter that was not simulated for the two sampling schemes (a: 0.5, 2, 8, 22 h after dose, b: 0, 0.5, 2, 8, 22 h after dose) and two magnitudes of IOV (25%, 75%) on parameters CL, V and k_a

Sampling	OCC	SIM/EST with IIV_only
		EST with IOV_CL		EST with IOV_V		EST with IOV_ka
		25%	75%	25%	75%	25%	75%
		type I error [%]	type I error [%]	type I error [%]	type I error [%]	type I error [%]	type I error [%]
a	1	1.4	0.4	1.6	1.8	0.2	0.0
a	2	0.8	0.8	1.8	0.8	1.6	1.0
a	3	0.4	0.4	2.0	1.4	1.0	3.0
b	1	4.6	5.2	0.8	1.8	0.0	0.0
b	2	0.4	1.4	0.6	0.8	2.0	0.8
b	3	1.4	2.0	0.6	1.4	1.0	0.8

Open in a new tab

rBIAS

Generally, the rBIAS values (Table 4) showed a trend to underestimate IOV. The absolute values of the rBIAS were highest in case of the single observed OCC (e.g. 33.7% for IOV75_ka, sampling scheme (a)). For a power outcome of > 80% the highest rBIAS value of the implemented IOV was 6.4% (IOV25_CL, sampling scheme (b), one OCC) and the lowest − 10.5% (IOV25_V, sampling scheme (a), one OCC), respectively.

Table 4.

rBIAS values of IOV when truly present for the two sampling schemes (a: 0.5, 2, 8, 22 h after dose, b: 0, 0.5, 2, 8, 22 h after dose) and two magnitudes of IOV (25%, 75%) on parameters CL, V and k_a

Sampling	OCC	SIM with IOV_CL		SIM with IOV_V		SIM with IOV_ka
		25%	75%	25%	75%	25%	75%
		EST with IOV_CL		EST with IOV_V		EST with IOV_ka
		rBIAS [%]	rBIAS [%]	rBIAS [%]	rBIAS [%]	rBIAS [%]	rBIAS [%]
a	1	−14.2	3.5	−12.5	−9.7	−20.0	−33.7
a	2	−1.1	−0.0	−2.0	−6.7	1.1	−3.0
a	3	−0.6	−8.0	−2.8	−6.9	−0.0	−2.6
b	1	6.4	−7.7	−10.5	−9.9	7.9	−5.3
b	2	−2.7	−8.6	−3.1	−9.3	1.5	0.0
b	3	−2.2	−8.6	−3.8	−9.6	−0.0	0.0

Open in a new tab

rRMSE

The rRMSE values of the estimated IOV decreased with an increasing number of observed OCCs (Table 5, e.g. IOV25_CL, sampling scheme (a): 72.2% (one OCC), 17.9% (two OCCs), 12.7% (three OCCs)). Overall, comparing IOV25 and IOV75, IOV25 reached higher rRMSE values. For instance, IOV25_CL in one OCC led to an rRMSE value of 72.2% in sampling scheme (a) and for the same setup with IOV75_CL to an rRMSE value of 24.2%. IOV_CL and IOV_V resulted in similar rRMSE values, while IOV_ka yielded higher rRMSE values (e.g. sampling scheme (a), one OCC, IOV25_CL = 62.6%, IOV25_V = 58.8%, IOV25_ka = 116.7%).

Table 5.

rRMSE values of IOV when truly present for the two sampling schemes (a: 0.5, 2, 8, 22 h after dose, b: 0, 0.5, 2, 8, 22 h after dose) and two magnitudes of IOV (25%, 75%) on parameters CL, V and k_a

Sampling	OCC	SIM with IOV_CL		SIM with IOV_V		SIM with IOV_ka
		25%	75%	25%	75%	25%	75%
		EST with IOV_CL		EST with IOV_V		EST with IOV_ka
		rRMSE [%]	rRMSE [%]	rRMSE [%]	rRMSE [%]	rRMSE [%]	rRMSE [%]
a	1	72.2	24.2	58.8	22.0	116.7	57.1
a	2	17.9	11.4	16.9	11.8	25.3	13.3
a	3	12.7	10.8	12.1	10.2	17.3	9.3
b	1	27.6	15.4	39.1	18.2	113.6	34.6
b	2	13.4	11.8	15.8	12.8	23.7	13.8
b	3	10.4	10.8	11.4	11.7	17.3	10.4

Open in a new tab

Comparing the two sampling schemes it becomes apparent that the rRMSE value for IOV_CL or IOV_V in one OCC was higher in sampling scheme (a) than in (b) (IOV_CL, sampling scheme (a): 72.2%, sampling scheme (b): 27.6%). For a power outcome of > 80% the highest rRMSE value of the implemented IOV was 39.1% (IOV25_V, sampling scheme (b), one OCC).