CPT: Pharmacometrics & Systems Pharmacology. 2026 Jan 15;15(1):e70186. doi: 10.1002/psp4.70186

Integrating Population Approaches With Physiologically Based Pharmacokinetic Models: A Novel Framework for Parameter Estimation

Donato Teutonico 1, David Marchionni 2, Marc Lavielle 3,4, Laurent Nguyen 1
PMCID: PMC12823310  PMID: 41540733

ABSTRACT

Physiologically Based Pharmacokinetic (PBPK) modeling is a powerful tool in drug development that integrates drug‐specific information with physiological parameters to predict drug concentrations. However, parameter estimation in PBPK models presents significant challenges due to the large number of parameters involved and limited observed data. This tutorial introduces a novel approach coupling whole‐body PBPK (WB‐PBPK) models with population estimation methods (popWB‐PBPK) to leverage individual data and estimate inter‐individual variability on physiologically relevant parameters. The framework employs an optimized Stochastic Approximation Expectation–Maximization (SAEM) algorithm, reducing the estimation runtime through an adaptive parameter grid optimization and linear interpolation techniques. Using theophylline as a case study, we illustrate how this approach can accurately estimate drug‐specific parameters (CYP1A2 clearance and lipophilicity) while incorporating covariate effects (smoking status). The optimized algorithm significantly reduces computational time compared to the standard SAEM algorithm. Our implementation in the saemixPBPK R package provides an accessible framework for parameter estimation in PBPK models, enabling more robust predictions of pharmacokinetic behavior leveraging individual data. This approach represents an important advancement in mechanistic modeling, allowing simultaneous estimation of population parameters, variability, and uncertainty while maintaining the physiological relevance of PBPK models.

Keywords: individual variability, PBPK, physiologically based pharmacokinetics, popPBPK, popWB‐PBPK, SAEM

1. Introduction

PBPK modeling is a powerful tool in drug development because it integrates drug‐specific information with physiologically relevant parameters to predict drug concentrations [1]. PBPK models simulate pharmacokinetic profiles on the basis of compound structure‐related information and a number of relevant physiological input parameters of the individual, such as organ volumes, tissue composition, blood flow rates, and clearance [2, 3, 4]. This technology has already demonstrated its potential in toxicological risk assessment [5] as well as in the drug research and development process [6]. WB‐PBPK models, such as the one implemented in PK‐Sim [7], explicitly contain the physiological properties that influence the pharmacokinetics of a drug. Such models are elaborate compartmental models that tend to use realistic biological descriptions of the determinants that regulate the disposition of drugs in the body [7]. They describe the body as a set of compartments corresponding to specific organs or tissues (e.g., adipose, bone, brain, gut, heart, kidney, liver, lung, muscle, skin, and spleen). Between compartments, the transport of substances is dictated by various physiological flows (blood, bile, pulmonary ventilation, etc.) or by diffusion [4, 7].

2. Parameter Estimation in PBPK Modeling

When building a PBPK model, in the ideal case, data and predictions agree on average and in terms of variability, and no further model refinement is attempted. Yet, quite often, the predictions need to be adjusted (fitted) to the observed data. That may be the occasion to improve the model through some calibration or data integration procedure and learn something about the true determinants of variability in a population [8]. Parameter estimation in PBPK models is challenging because of the large number of parameters involved and the relatively small amount of observed data usually available. Several approaches have been proposed in the literature to fit PBPK models to observed data. One of them is to optimize most of the model parameters together, using complex Bayesian approaches coupled with Monte Carlo optimization or the simplex method [9, 10]. Alternatively, methods such as genetic algorithms, which are based on the concept of natural selection [11], can be applied to optimize many parameters simultaneously in these complex models [12]. Even though these methods were explored and prototypes were developed, none of them is currently used in standard modeling practice, probably because of their complexity and the long runtime of the estimation process. Currently, the more commonly used approach is to fix most of the model parameters (to values known from physiology or previous in vitro and in vivo experiments) and estimate only a few unknown model parameters [13]. This is usually done either manually, by trial‐and‐error visual calibration to the observed concentration profiles, or by more formal statistical fitting algorithms, such as non‐linear least squares and maximum likelihood methods, for instance the ones implemented in the PBPK software PK‐Sim [7].
It should be recognized that with such an approach, the parameter estimates are conditional on the values that have been assumed for the fixed parameters [9]. Moreover, it should be pointed out that, as with any optimized parameter, the fitted estimate itself is always accompanied by a level of uncertainty which derives from inherent experimental errors or any model mis‐specifications [14]. Specifying the parameter uncertainty is crucial to evaluate the reliability of the parameter estimate, and consequently the model predictive value. At this point, it is important to distinguish uncertainty from variability in any attempt at individual parameter estimation [15]. Briefly, variability refers to differences attributable to environmental or intrinsic subject characteristics and is a fundamental property of the studied system that cannot be reduced. On the contrary, uncertainty is variation that derives from errors in the experimental procedure, measurement, modeling, and assumptions of the studied system. It is not itself a system property, and it can be reduced through optimization of the experiment. Although difficult, it is desirable to disentangle and separate uncertainty and variability in a parameter estimation process [16].

When parameters need to be estimated with a PBPK model, the most frequent approach is to fit average data and calibrate the unknown parameters. It should be noted that in this case, not only is inter‐individual variability (IIV) on model parameters unattainable, but also parameter estimates might be biased as averaging of data can produce a distorted picture of the individual PK profile [17]. If individual data are available, all data from different individuals are combined and analyzed as if they came from a single subject, an approach referred to as naïve‐pooled data (Figure 1), and an average PBPK model is used to fit such data. In this case, usually, the individual characteristics used in the PBPK model correspond to the average covariates of the study population. The limitations of such an approach have been repeatedly described, and the use of hierarchical population modeling is strongly recommended [18, 19, 20] to better assess IIV, get non‐biased parameter estimates, and leverage all data available.
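The distortion introduced by averaging can be illustrated with a minimal Python sketch. It does not use a PBPK model: it assumes a hypothetical mono‐exponential profile, and the elimination rates, sampling times, and function names are invented for illustration only. The sketch shows that the apparent elimination rate read from the averaged curve is lower than the mean of the individual rates.

```python
import math

def concentration(k, t, c0=10.0):
    """Hypothetical IV bolus profile: C(t) = c0 * exp(-k * t)."""
    return c0 * math.exp(-k * t)

def apparent_rate(times, conc):
    """Log-linear slope between the first and the last sampling time."""
    return -(math.log(conc[-1]) - math.log(conc[0])) / (times[-1] - times[0])

# Illustrative values only (not taken from the article).
rates = [0.05, 0.10, 0.20]            # individual elimination rates (1/h)
times = [1.0, 4.0, 8.0, 12.0, 24.0]   # sampling times (h)

# Average the individual profiles at each time, as a fit to mean data would.
mean_profile = [
    sum(concentration(k, t) for k in rates) / len(rates) for t in times
]

pooled_rate = apparent_rate(times, mean_profile)  # rate seen in the mean curve
mean_rate = sum(rates) / len(rates)               # mean of the individual rates
```

Under these assumptions the slow eliminators dominate the late part of the averaged curve, so `pooled_rate` falls below `mean_rate`. The averaged profile is also no longer mono‐exponential, even though every individual profile is.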

FIGURE 1.

Comparison of the classical naïve‐pooled approach often used in PBPK modeling with the proposed hybrid popWB‐PBPK approach implemented in this workflow using SAEM.

When such models need to inform on the predicted variability of PK concentrations, the physiological variability can be included in the simulation by employing a range of physiological input parameters (e.g., weight, height, age). A prerequisite is that the variability of relevant properties within a population (e.g., organ weights, blood flows) is known. If this is the case, a PBPK model can be used to make predictions of pharmacokinetic behavior in specific virtual individuals or populations using realistic physiological properties. This predictive modeling based on population variability allows an a priori assessment of the influence of physiological properties on the pharmacokinetic behavior of drugs, prior to clinical studies [1]. Such physiological variability reflects the distribution of parameters, such as organ volumes, tissue composition, and blood flows, available in the PBPK model database, and can be considered a priori variability since it is informed by previously available knowledge. Other sources of variability can also be included, such as those related to formulation or drug properties (e.g., intrinsic clearances), but this variability is very rarely estimated using the PBPK model; in most cases it comes from estimations performed with other models (e.g., population PK models) or from experimental data. If individual data are available, a simple approach could consist of fitting the individual data separately, collecting the resulting individual parameter estimates, and forming their average. This approach, referred to as the standard two‐stage approach, may be biased if very precise parameter estimates cannot be obtained from the data [21, 22].
The most suitable approach in this case is to use non‐linear mixed effect models [20, 23], which enable estimating, from observed individual concentrations data, non‐biased population estimates, variability, and uncertainty simultaneously in a single step.

The proposed approach in this tutorial is to couple a population approach method to a WB‐PBPK model (popWB‐PBPK) to leverage individual data to estimate inter‐individual variability and uncertainty on physiologically relevant parameters (Figure 1). A typical estimation would consist of selecting a limited number of relevant model parameters and estimating them by leveraging individual concentration profiles. The parameter estimation algorithm employed in our study is an extension of the SAEM algorithm, renowned for its high performance in nonlinear mixed‐effects models [20, 24].

3. Mechanistic vs. Non‐Mechanistic Use of Covariates

There is a different use of the subject covariates in PBPK modeling vs. classical population modeling, so when a population approach is combined with a PBPK model, as is the case for the proposed popWB‐PBPK model, it is important to consider how the covariates are integrated in the model. While demographic covariates (age, weight, height, gender, ethnicity) are integrated into population models after the selection of the structural model, in PBPK modeling, these covariates need to be integrated a priori to generate appropriate physiological characteristics matching those of the study population, which we intend to use in the estimation process (Figure 2). If this step were not conducted, and for instance an average subject were used, the variability related to physiological distributions would be lumped into the variability of the estimated parameters.

FIGURE 2.

Diagram illustrating the different model components in population PK (left) and popWB‐PBPK (right). In the popWB‐PBPK approach, the model is composed of a structural model that contains a parameterization dependent on drug properties and subject characteristics. Such characteristics are used to create individuals with physiological properties similar to the real subjects in the dataset, using demographic covariates to create digital twins of these subjects. As a consequence, demographic covariates, which in population modeling are tested on the structural model, are integrated a priori in popWB‐PBPK during the creation of the virtual subjects. Covariates external to the PBPK model, for which there is no mechanistic description of their physiological impact, can be included in the estimation step of the statistical model to empirically explain the observed variability.

As such, gender, age, weight, height, and ethnicity are used to simulate the physiological parameters from the corresponding distribution. When coupling the PBPK model to a population algorithm, such as SAEM, it is important that the physiological variability is integrated in the parameterization of each subject (Figure 2); hence, the vector of covariates of each subject should be used to generate a virtual twin of each subject in the dataset. This approach ensures that the variability in physiological parameters, such as organ volumes and blood flows, is considered for each subject. Since population estimation methods allow the use of covariates to explain the observed variability, covariates external to the PBPK model can also be used in the estimation step when these methods are coupled to PBPK modeling (Figure 2). Covariates used at this stage are integrated in a non‐mechanistic way within the model. If covariates are already mechanistically integrated in the individual or drug parameterization (i.e., are already part of the parameterization of the PBPK model), they should usually not be included again in the estimation step.

With respect to covariate analysis, one of the main advantages of coupling PBPK models to a population estimation method is the possibility to integrate covariates also when they are not mechanistically part of the PBPK model (Figure 3). For instance, the smoking status is known to influence the expression of certain liver enzymes, such as CYP1A2, so the smoking status covariate can be tested and used to explain variability in clearance by estimating the covariate effect on the CYP1A2 abundance or clearance. This example will be illustrated in the next section. Thanks to the mechanistic nature of the PBPK model, the non‐mechanistic covariates (i.e., external to the WB‐PBPK model structure) can still be tested on the mechanistic parameter that they are affecting. For instance, in this case, the smoking status (non‐mechanistic covariate) affects CYP1A2 activity, so it can be tested either on the CYP1A2 clearance or abundance (mechanistic parameter). A similar example would be the use of the albumin covariate on the drug fraction unbound.
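As a minimal sketch of how a non‐mechanistic covariate can act on a mechanistic parameter, the Python function below applies a smoking effect to a CYP1A2 clearance value that would then be passed to the PBPK model. The log‐linear form, the function name, and the argument names are assumptions made for illustration; the text does not state which functional form the framework uses. The numerical values are placeholders.

```python
import math

def individual_clearance(cl_pop, beta_smoke, smoker, eta=0.0):
    """Assumed log-linear covariate model (illustrative form only):

        CL_i = CL_pop * exp(beta_smoke * SMOKE_i + eta_i)

    cl_pop     : typical clearance of a non-smoker
    beta_smoke : covariate coefficient on the log scale
    smoker     : True/False smoking status (non-mechanistic covariate)
    eta        : individual random effect on the log scale
    """
    smoke = 1.0 if smoker else 0.0
    return cl_pop * math.exp(beta_smoke * smoke + eta)

# Placeholder values: a 28% increase corresponds to beta = ln(1.28).
cl_pop = 0.0067
beta_smoke = math.log(1.28)

cl_nonsmoker = individual_clearance(cl_pop, beta_smoke, smoker=False)
cl_smoker = individual_clearance(cl_pop, beta_smoke, smoker=True)
```

With this parameterization the covariate scales only the mechanistic parameter it is assumed to affect, while organ volumes and blood flows remain defined by the demographic covariates of the virtual subject.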

FIGURE 3.

Schematic representation of the different types of covariates that can be included in the SAEM‐PBPK approach proposed. Demographic covariates are included in the organism properties, defining the physiological and anatomical characteristics of the subjects, while non‐mechanistic covariates can be used to explain the observed variability on top of the PBPK model.

It should be noted that disease covariates can also be mechanistically integrated in the model if they are used to define the changes in pathophysiology of the organism parameterization; this is the case of special populations such as liver or renal impairment characteristics, already integrated in PK‐Sim. In this case, the use of covariates such as GFR or albumin, which are already part of the disease population, should be carefully excluded in the population estimation.

4. SAEM Algorithm Optimization

Evaluating a PBPK model for a single individual can be computationally intensive due to the extensive number of differential equations involved. The SAEM algorithm is an iterative method that requires numerous evaluations of the structural model across all individuals in a sample [20, 24], making its application to PBPK models particularly challenging. To address this, we propose reducing the number of model evaluations by limiting the structural model computations to a reduced set of parameter values to be tested. This parameter grid is adaptively optimized for each individual. Subsequently, the structural model's evaluation for any given set of parameters is obtained through linear interpolation from the grid points.

The grid creation algorithm is designed to optimize parameter estimation for complex models, particularly those involving Ordinary Differential Equations (ODEs). It divides the parameter space into intervals, creating a grid that balances accuracy and computational efficiency. For each parameter, the grid is computed by following these steps:

4.1. Step 1

Find a minimum and maximum bound from a normalized mean squared error between predictions and data, then divide the interval into n parameter values (11 by default).

4.2. Step 2

For each parameter value p from the interval, find another parameter value p* such that the deviation from linearity stays within a threshold (5% by default), with a tolerance of 10%.

4.3. Step 3

Interpolate between the parameter values p and p* to obtain the adaptive grid for the parameter with a deviation from linearity lower than the maximum threshold defined.

The algorithm builds the grid around the initial values of each parameter. This approach is particularly valuable because it identifies regions where the model behaves linearly (requiring fewer test points) vs. regions with non‐linear behavior (requiring more detailed sampling). As a consequence, the final number of grid points in step 1 is then modified depending on the linearization of the parameter space and, consequently, the need for larger or narrower grid steps.

The resulting grid significantly reduces calculation time by focusing computational effort where it's most needed, while maintaining accuracy within defined linearity deviation limits. For instance, in a PBPK model, this might mean testing fewer points in regions where clearance behaves linearly but increasing resolution in regions where non‐linear elimination occurs.
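The idea of placing more grid points where the model is non‐linear can be sketched in Python as follows. This is not the authors' p/p* search: it swaps in a simpler bisection refinement that splits any interval whose midpoint deviates from the linear interpolation by more than the threshold. The saturable toy model, the bounds, and all function names are hypothetical, and the midpoint check is only a proxy for the maximum deviation over an interval.

```python
def linearity_deviation(model, lo, hi):
    """Relative gap between the model and the chord at the interval midpoint."""
    mid = 0.5 * (lo + hi)
    exact = model(mid)
    chord = 0.5 * (model(lo) + model(hi))  # linear interpolation at the midpoint
    return abs(chord - exact) / max(abs(exact), 1e-12)

def adaptive_grid(model, lower, upper, n_initial=11, threshold=0.05, max_passes=20):
    """Start from an evenly spaced grid and bisect intervals that fail the check."""
    step = (upper - lower) / (n_initial - 1)
    grid = [lower + i * step for i in range(n_initial)]
    for _ in range(max_passes):
        refined = [grid[0]]
        changed = False
        for lo, hi in zip(grid, grid[1:]):
            if linearity_deviation(model, lo, hi) > threshold:
                refined.append(0.5 * (lo + hi))  # add a point where it is non-linear
                changed = True
            refined.append(hi)
        grid = refined
        if not changed:
            break
    return grid

def toy_model(p):
    """Hypothetical saturable response: strongly curved at low p, flat at high p."""
    return p / (1.0 + p)

grid = adaptive_grid(toy_model, lower=0.05, upper=10.0)
```

For this toy model the refinement adds points only near the lower bound, where the response is curved, and keeps the coarse spacing where the response is nearly linear.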

5. Theophylline Use Case

In this section, we illustrate an example of parameter estimation using the theophylline model. This population approach offers key advantages over naive‐pooled analysis by utilizing individual concentration‐time profiles to estimate both population parameters and their variability. Furthermore, it enables the evaluation of empirical covariates on parameter estimates, helping explain the observed in vivo variability, as demonstrated in this case study. The methodology proposed in this paper has been integrated into an R package, saemixPBPK, which provides all the necessary tools for parameter estimation by interfacing with the ospsuite and saemix R packages [25]. The ospsuite R package, available on GitHub, provides a comprehensive interface to the Open Systems Pharmacology Suite, enabling R users to programmatically create, manipulate, and analyze PBPK models. It bridges the powerful statistical capabilities of R with the OSP modeling framework, allowing for automated workflows, reproducible research, and advanced analysis of drug pharmacokinetics. The saemix R package, available on CRAN, is a powerful tool for fitting nonlinear mixed effects models in pharmacometrics using the SAEM algorithm. It provides an accessible implementation of advanced population modeling techniques, allowing pharmacometricians to efficiently analyze pharmacokinetic and pharmacodynamic data while accounting for both fixed effects and inter‐individual variability.

In the development of the saemixPBPK package, we compared the impact of different coding options on the estimation runtimes for a single parameter estimation (intrinsic CYP1A2 clearance) on a simulated theophylline dataset with 12 subjects. We evaluated four different options of estimation coding: The default coding for the ospsuite and saemix R packages, the batch run option for ospsuite, and the integration of an estimation grid coupled with the SAEM algorithm. The results are reported in Table 1. With a simulation batch option available in the ospsuite package, the estimated parameters are specified explicitly during the creation of the simulation batch. This allows the simulation engine to speed up the system significantly, as some equations can be rewritten and simplified. Moreover, a simulation batch also keeps the simplified simulation in memory, so consequently running a simulation again with a new set of values is much faster, as the initialization phase is only done if required and not for every run. With this approach, the improvement in runtimes depends on the number of parameters dependent on the estimated parameters. For additional information on the batch simulations and coding instructions, please refer to the ospsuite user manual available on the OSP website. As illustrated in Table 1, batch simulation coding can improve runtime by more than 5‐fold, while coupling an estimation grid with the SAEM algorithm can improve runtime by more than 300‐fold. Combining these two approaches can decrease runtime by about 2,000‐fold compared to the default use of the two R packages. This last approach has been implemented in the R package saemixPBPK and is used in the proposed estimation framework by the functions included in the package. 
The comparison of coding reported in Table 1 has the objective of indicatively comparing these different options, and it does not represent an exhaustive runtime comparison, since several aspects can influence the runtime of these different approaches. Moreover, it should be kept in mind that if a larger number of parameters is estimated (e.g., more than 5 fixed effects), the calculation of the grid may become a limiting step, and the benefits of the batch estimation may become marginal. In the proposed approach, it is assumed that in most cases only a limited number of parameters are estimated with a PBPK model; for this reason, an estimation with 2 fixed effects is illustrated in the theophylline example.

TABLE 1.

Comparison of the runtimes, in hours, for the different options of coding with the ospsuite and the implementation of the grid in the SAEM estimation.

Run durations (hr)

                       (3) ospsuite classical functions    (4) ospsuite batch simulation
(1) saemix             12.6                                2.2
(2) saemix+grid        0.04                                0.006

Columns: coding improvement with ospsuite. Rows: coding improvement by integrating the grid.

Note: The cases evaluated are: (1) the standard SAEM algorithm as implemented in saemix; (2) SAEM coupled with an estimation grid, as implemented in the package saemixPBPK; (3) classical coding to launch the PBPK model with the ospsuite package using the function runSimulations(); (4) the batch method to run the simulations (for details, please refer to the ospsuite package documentation).
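The fold changes quoted in the text can be checked directly from the runtimes in Table 1. The short Python snippet below only restates that arithmetic; the dictionary keys are labels chosen here for readability.

```python
# Runtimes in hours, copied from Table 1.
runtimes_hr = {
    ("saemix", "classical"): 12.6,
    ("saemix", "batch"): 2.2,
    ("saemix+grid", "classical"): 0.04,
    ("saemix+grid", "batch"): 0.006,
}

# Reference: default use of the two packages (standard SAEM, classical ospsuite calls).
reference = runtimes_hr[("saemix", "classical")]

# Fold improvement of each option relative to the reference.
fold_improvement = {key: reference / hours for key, hours in runtimes_hr.items()}
```

The ratios are about 5.7 for batch simulation alone, 315 for the grid alone, and 2100 for the combination, in line with the "more than 5‐fold", "more than 300‐fold", and "about 2,000‐fold" figures.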

To demonstrate the complete workflow implemented in the saemixPBPK package, we used a simulated dataset for theophylline, generated from the PK data available in the saemix package and the theophylline PBPK model available on the OSP model library.

Covariate values, including age and height, were added to the dataset, and all subjects were assumed to be Caucasian (using the “European_ICRP_2002” population in PK‐Sim). PK plasma concentrations were simulated using the PBPK model for theophylline available in the OSP model library. Plasma concentrations for 12 subjects were simulated assuming a CYP1A2‐specific clearance of 0.006712 min−1 (estimated from the original dataset) with a CV of 32%. A second dataset was simulated, introducing a covariate effect (smoking status), increasing CYP1A2‐specific clearance by 28%. It is well‐known that cigarette smoke impacts the metabolizing activity of CYP1A2 [26, 27]. A proportional error of 30% was also randomly added to the model predictions. The two datasets were then combined, and a smoking status covariate was included to stratify the two populations, obtaining a final dataset that contains 24 subjects.
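The statistical part of this simulation design can be sketched in Python as below. The sketch covers only the sampling of individual clearances and the residual error, not the PBPK simulation itself. The log‐normal distribution, the conversion of the CV into a log‐scale standard deviation, the random seed, and all function names are assumptions made for illustration; the article does not specify them.

```python
import math
import random

CL_POP = 0.006712       # CYP1A2-specific clearance (1/min), from the text
CV = 0.32               # inter-individual CV, from the text
SMOKING_EFFECT = 0.28   # relative increase in smokers, from the text
PROP_ERROR = 0.30       # proportional residual error, from the text

# Assumption: log-normal variability, with omega derived from the CV.
OMEGA = math.sqrt(math.log(1.0 + CV ** 2))

def typical_clearance(smoker):
    """Typical clearance for a smoker or a non-smoker."""
    return CL_POP * (1.0 + SMOKING_EFFECT) if smoker else CL_POP

def sample_clearances(n, smoker, rng):
    """Draw n individual clearances around the group-specific typical value."""
    return [typical_clearance(smoker) * math.exp(rng.gauss(0.0, OMEGA))
            for _ in range(n)]

def add_proportional_error(predictions, rng):
    """Perturb model predictions with a proportional error (may need truncation)."""
    return [y * (1.0 + PROP_ERROR * rng.gauss(0.0, 1.0)) for y in predictions]

rng = random.Random(2026)  # arbitrary seed, for reproducibility of the sketch
dataset = [{"id": i + 1, "smoker": 0, "cl": cl}
           for i, cl in enumerate(sample_clearances(12, False, rng))]
dataset += [{"id": i + 13, "smoker": 1, "cl": cl}
            for i, cl in enumerate(sample_clearances(12, True, rng))]
```

Each sampled clearance would then be assigned to the corresponding virtual subject before the PBPK model is run and the residual error is applied to the predicted concentrations.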

The proposed popWB‐PBPK framework was applied to estimate the individual parameter values for CYP1A2‐specific clearance and lipophilicity. These two parameters were selected to illustrate an example of two different types of parameters. Specific clearance is a classical individual parameter expected to vary between subjects due to physiological variability in metabolizing rates. This parameter is also expected to be affected by smoking status, so the effect of the smoking covariate introduced in the dataset will also be estimated. On the other hand, lipophilicity is a drug‐specific property and is not expected to vary between subjects. Therefore, the population value of this parameter will be estimated without inter‐individual variability.

The estimation of individual parameters is performed with the following steps, summarized in Figure 4:

FIGURE 4.

Schematic representation of the complete workflow, composed of the different steps required to perform the parameter estimation as well as the main functions available in the package saemixPBPK to perform each task.

5.1. Creation of Subjects

In this step, the covariates of each subject present in the dataset are used to simulate a vector of subjects with the corresponding physiological characteristics of each individual. Physiological parameters include, for instance, organ volumes, blood flow rates, GFR function, intestinal effective surface area, and gastric emptying time. Patient demographic characteristics (age, weight, height, gender, and ethnicity) are used to generate organ volumes, blood flows, and other physiological characteristics of the PBPK model.

This task is performed with the function create_individuals() available in the package saemixPBPK. This function allows generating a vector of subjects using the ospsuite package and the covariate information available in a dataframe object or a dataset. The function can be used to simulate human or preclinical species; in the first case, information on ethnicity, gender, age, height, and weight is required, while for preclinical animals, only the weight information needs to be available in the dataset. If the species is not specified in the dataset, it will be assumed as human.

5.2. Creation of a Vector of Simulations

After creating the vector of individual characteristics, the subjects are combined with the PBPK model, which we want to use in the estimation step to create a vector of simulations. Each simulation contains the subject's physiological parameters, the drug dose administered, the dosing regimen, and the subject‐specific sampling times.

This vector of simulations can be created using the function createSimulations(), which allows coupling a PBPK model, developed using the OSP Suite and exported in .pkml format, with the subjects created in the previous step. If individuals received different doses, this information can be used to individualize the dose for each subject. If the individuals in the estimation dataset have different study designs, a different .pkml file can be used for each subject, allowing the integration of individual study designs. The list of parameters to be estimated with SAEM is also specified in this step.

5.3. Calculation of the Estimation Grid

This is an optional step, and it is required if a parameter grid is to be used to guide the estimation algorithm, SAEM. In individual‐specific PBPK modeling, the definition of the interpolation grid is crucial. A highly refined grid with numerous points necessitates extensive model evaluations, increasing computational load. Conversely, an overly coarse grid reduces computation time but may introduce significant biases due to insufficient interpolation precision. Our approach dynamically adjusts grid density based on a predefined accuracy threshold, as previously explained. For instance, by setting the interpolation error tolerance at 5%, the grid is constructed to meet this criterion, ensuring an optimal balance between computational efficiency and model accuracy. The vector of simulations, created in the previous step, can then be used to calculate the grid of parameter values to be tested by the SAEM algorithm, guiding the estimation and optimizing the number of calls to the PBPK model, significantly reducing estimation times. The function gridfuncd() is used in the package to compute the grid by specifying the model, the data, and the model output to match the data. Other relevant arguments which can be specified are, for instance, the vector of initial values around which the grid will be built (psi0), the minimum and maximum grid value of the parameter (by default, respectively 0.05*psi0 and 10*psi0), and the maximum relative error between interpolations and model predictions (by default 5%). The grid function calculates a grid of parameter values for a specified number of subjects (6 by default) and then computes an average grid for the population for each one of the estimated parameters. In the estimation step, this average grid will be used in the estimation of each subject, and it will be specifically extended if a subject requires testing parameter values outside the grid range. 
Depending on the required parameter values to be tested by SAEM, the grid points are used to generate a model prediction, and a linear interpolation is used for predictions between the grid points.
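The lookup described above, where the expensive model is evaluated only at grid points and intermediate values are obtained by linear interpolation, can be sketched in Python as follows. The class and method names are hypothetical and do not reflect the internals of saemixPBPK. For simplicity the model output is a single number, whereas a PBPK prediction would be a vector of concentrations, and the grid is extended by simply adding the requested value as a new point, which is an assumption.

```python
import bisect

class GridPredictor:
    """Interpolate a costly model from evaluations stored at grid points."""

    def __init__(self, model, grid):
        self.model = model
        self.grid = sorted(grid)
        self.values = [model(p) for p in self.grid]  # one model call per grid point
        self.n_calls = len(self.grid)

    def _extend(self, p):
        """Add p to the grid when it lies outside the current range."""
        i = bisect.bisect_left(self.grid, p)
        self.grid.insert(i, p)
        self.values.insert(i, self.model(p))
        self.n_calls += 1

    def predict(self, p):
        """Return the stored value at a grid point, else a linear interpolation."""
        if p < self.grid[0] or p > self.grid[-1]:
            self._extend(p)
        i = bisect.bisect_left(self.grid, p)
        if self.grid[i] == p:
            return self.values[i]
        lo, hi = self.grid[i - 1], self.grid[i]
        w = (p - lo) / (hi - lo)
        return (1.0 - w) * self.values[i - 1] + w * self.values[i]
```

A predictor built on the grid `[1, 2, 4]` answers a request at 3 without any new model call, while a request at 8 triggers exactly one additional evaluation and enlarges the grid.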

5.4. Estimation Step

This is the actual estimation of the individual and population parameter values for the dataset. The result of this step will be a saemix object which will contain all the estimation results.

This final estimation step is performed with the saemix package, and it requires a data object, created with the function saemixData() from the saemix package; a model, created with the saemixModelPBPK() function from the package saemixPBPK; and a list of options for the SAEM algorithm. For the functions of the saemix package, additional details can be found in the package documentation as well as in Comets et al. [25]. The function saemixModelPBPK() allows the definition of a model by specifying classical estimation arguments, such as initial values for fixed and random effects, parameter distribution, error model, covariates, and covariance model, as well as specific parameters for the proposed popWB‐PBPK approach. In particular, this function specifies whether the estimation is conducted with the standard SAEM algorithm or with the parameter grid (model. = TRUE). In the latter case, a parameter grid needs to be provided. A model function also needs to be provided. This function uses the simulations previously defined in Step 2 to simulate the required model output (typically a plasma PK profile) that needs to be matched with the observations. A template model (modelBatch) is provided in the package saemixPBPK; this model can be used in the estimation process and can be modified if specific estimations require a different use of the PBPK model.

It is important to note that, for each individual, all previously performed grid evaluations can be saved and reused in a future estimation by the saemixModelPBPK() function using the file argument. This is particularly advantageous when reapplying the SAEM algorithm to fit the same structural model—with identical parameters—but employing a different statistical model, such as incorporating new covariate structures, covariance models, or residual error models. Additionally, this approach is beneficial if one wishes to rerun the SAEM algorithm with different settings, such as varying the number of iterations. However, these precomputed results are not reusable if there is a change in the experimental design, such as alterations in dosing regimens or observation times.
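The reuse rule described here, where stored evaluations stay valid when only the statistical model changes but not when the experimental design changes, can be sketched in Python with a small file‐based cache. This is not how saemixPBPK stores its results: the JSON layout, the design fingerprint, and the function names are hypothetical and serve only to illustrate the logic.

```python
import hashlib
import json
import os

def design_fingerprint(design):
    """Hash of the experimental design (e.g., dosing regimen, observation times)."""
    encoded = json.dumps(design, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def save_cache(path, design, evaluations):
    """Store the model evaluations together with the design they belong to."""
    payload = {"design": design_fingerprint(design), "evaluations": evaluations}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)

def load_cache(path, design):
    """Return stored evaluations, or an empty dict if the design has changed."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as handle:
        payload = json.load(handle)
    if payload.get("design") != design_fingerprint(design):
        return {}  # design changed: precomputed results are not reusable
    return payload["evaluations"]
```

In this sketch, changing the covariate, covariance, or residual error model leaves the design fingerprint untouched, so the stored evaluations are returned; changing doses or sampling times yields an empty cache and forces new model evaluations.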

As previously explained, the proposed method requires evaluating the original structural model at specific grid points for each individual, specifically those needed for each linear interpolation. These evaluations are computationally expensive, so it is crucial to minimize their number. To avoid redundant calculations, we recommend storing all completed evaluations and reusing them whenever possible in the SAEM estimation, whether for calculating standard errors, log‐likelihoods, or testing different covariate models. The Supporting Information provides the R code needed to execute the case study presented, as well as the theophylline dataset used in the estimation and the estimation results. In this example, two estimations are illustrated: one with the two parameters, CYP1A2 intrinsic clearance and lipophilicity, and a second one that adds the smoking covariate effect on the intrinsic clearance parameter.

The results of the estimation are reported in Figure 5. Classical modeling diagnostics, such as observed vs. individual predicted concentrations, VPC, or NPDE, can be generated to assess model goodness of fit. Figure 5 shows the individual fits (A) obtained with the individual parameter values, as well as the visual predictive check (VPC) (B). The VPC evaluates the model's ability to reproduce the observed data distribution. Observed concentrations are shown as circles, while the solid dark lines represent the 5th, 50th (median), and 95th percentiles of the observed data. The shaded areas represent the 95% confidence intervals around the corresponding percentiles (5th, 50th, and 95th) of the simulated data, generated from 1000 Monte Carlo simulations using the final estimated model parameters. The model is considered adequate if the observed percentiles fall within the confidence intervals of the simulated percentiles, indicating that the model appropriately captures the central tendency and variability of the observed data over time. Table 2 reports the results of the estimation together with the original parameters used to simulate the dataset (true parameter values). The population estimates of the CYP1A2 intrinsic clearance and of the lipophilicity are consistent with the true parameter values. The slight difference between the true and estimated values for the CYP1A2 variability is very likely related to the limited number of subjects in the dataset (24 individuals). Finally, the estimated covariate effect was 27%, which is consistent with the true covariate effect of 28% included in the dataset simulation. Here too, the slight difference between the estimated and the true covariate effect is related to the limited number of subjects as well as to the random components in the simulation of the theophylline dataset.
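The VPC construction described above can be sketched in a few lines of Python for a single time bin. This is an illustration under stated assumptions, not the code used to produce Figure 5: the helper names `percentile` and `vpc_bin`, the typical concentration of 5 mg/l, and the lognormal scatter are all hypothetical.

```python
import random


def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(s) - 1)
    return s[lower] + (s[upper] - s[lower]) * (pos - lower)


def vpc_bin(observed, simulate, n_rep=1000, levels=(5, 50, 95)):
    """Return {level: (observed percentile, CI low, CI high)} for one time bin.

    The 95% confidence interval of each simulated percentile is taken as the
    2.5th to 97.5th percentile range over the Monte Carlo replicates.
    """
    sim_pcts = {q: [] for q in levels}
    for _ in range(n_rep):
        replicate = simulate(len(observed))
        for q in levels:
            sim_pcts[q].append(percentile(replicate, q))
    return {
        q: (
            percentile(observed, q),
            percentile(sim_pcts[q], 2.5),
            percentile(sim_pcts[q], 97.5),
        )
        for q in levels
    }


rng = random.Random(1)


def simulate(n):
    # Hypothetical model: typical concentration 5 mg/l with lognormal scatter.
    return [5.0 * rng.lognormvariate(0.0, 0.3) for _ in range(n)]


observed = simulate(24)  # stand-in for the 24 observed concentrations in the bin
for q, (obs, low, high) in vpc_bin(observed, simulate).items():
    print(q, round(obs, 2), round(low, 2), round(high, 2), low <= obs <= high)
```

A full VPC repeats this calculation for every time bin and draws the intervals as shaded bands; the last printed column corresponds to the adequacy criterion stated above.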

FIGURE 5.

Results of the estimation performed with the popWB‐PBPK modeling approach. (A) Individual fit with the final individual parameters. (B) Visual Predictive Check of the final model. Observed concentrations are shown as circles. The dark solid lines represent the 5th, 50th (median), and 95th percentiles of the observed data. The shaded areas represent the 95% confidence intervals around the corresponding percentiles (5th, 50th, and 95th) of the simulated data, generated from 1000 Monte Carlo simulations using the final estimated model parameters.

TABLE 2.

Parameters estimated with the proposed popWB‐PBPK approach, compared with the true values used to generate the dataset for the estimation step. To simplify the comparison, parameter variability is reported as the original parameters provided by the SAEM algorithm in the saemix package.

| Parameter | True value | Estimated value | RSE of the estimated value (%) | p‐value (Wald test) |
| CYP1A2 intrinsic clearance (min−1) | 0.00671 | 0.00701 | 10 | |
| Omega2 (CYP1A2 intrinsic clearance) | 0.103 | 0.0941 | 35 | |
| Smoking effect on int. Cl | 1.28 | 1.27 | 57 | 0.0398 |
| Lipophilicity | 0.910 | 0.934 | 4 | |
| Proportional error (%) | 30 | 30 | 5 | |

6. Assess the Accuracy of the Interpolation

Once parameter estimation is complete, it is essential to verify that the interpolation adequately captures the parameter space when a parameter grid has been employed. To facilitate this assessment, the saemixPBPK package provides the interpeval() function. This function compares predictions from the full model with those obtained using grid‐based interpolation. Additionally, it visualizes the grid points evaluated during estimation for each individual subject.

The application of this function is demonstrated in the Supporting Information, with key findings illustrated in Figure 6. Although the plots in Figure 6A–C are generated for each subject, only those of the first subject are shown for clarity.

FIGURE 6.

Validation of parameter grid interpolation accuracy in the saemixPBPK package. (A) Comparison of model predictions (concentrations mg/l vs. time h) from the full model (solid lines) and interpolated model (dashed lines) at extreme parameter values: lowest clearance (red), highest clearance (green), highest lipophilicity (purple), and lowest lipophilicity (blue). (B) Individual grid points evaluated during parameter estimation, with the black cross indicating final parameter estimates. Color coding represents extreme grid values defining the two‐dimensional parameter space boundaries. (C) Model predictions (concentrations mg/l vs. time h) comparison using empirical Bayes estimates (EBE), confirming consistency between full and interpolated models for individual parameter predictions. (Full PBPK model in blue and grid with linearization in red). (D) Direct comparison of approximated vs. full model concentration predictions (mg/l), with the gray shaded area representing a 5% prediction tolerance zone around true model predictions. Most points align with the identity line and fall within the tolerance, confirming adequate grid performance. Data shown in A, B, and C for the first subject are a representative example.

Figure 6A presents model predictions using both the full model (solid lines) and interpolated model (dashed lines) at extreme parameter values tested within the grid: Lowest clearance (red), highest clearance (green), highest lipophilicity (purple), and lowest lipophilicity (blue). This comparison validates the approximation accuracy at grid extremes, confirming consistency between the full and approximated models at these critical boundary conditions.

Figure 6C extends this validation by comparing predictions from both models using empirical Bayes estimates (EBE), verifying consistency in individual parameter predictions estimated via SAEM. The results demonstrate excellent agreement between the two modeling approaches.

Figure 6B displays the individual grid points evaluated during estimation for each subject, with the black cross marking the final parameter estimates. As in Figure 6A, color coding indicates the extreme grid values; since two parameters are estimated in this example, four extreme parameter values define the boundaries of the two‐dimensional parameter space. The grid points effectively cover the region surrounding the final parameter estimates. The diagnostic plot containing the same information for all subjects is included in the Supporting Information. As this plot shows, the grid points generated during the estimation, as well as the grid boundaries, may differ between subjects, since they are adapted during the estimation step according to the parameter values that SAEM needs to evaluate.

Figure 6D provides a comprehensive validation through direct comparison of approximated vs. full model predictions. The gray shaded area represents a 5% prediction tolerance zone around the true model predictions. The results show that the approximated model predictions align well with the identity line and fall within the required 5% tolerance, confirming the grid calculation's adequacy.
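The 5% tolerance check shown in Figure 6D amounts to a simple relative‐deviation test. The Python sketch below illustrates it with hypothetical concentration values; it is not the interpeval() implementation, and `within_tolerance` is an illustrative name.

```python
def within_tolerance(full, approx, tol=0.05):
    """Flag predictions whose relative deviation from the full model is <= tol.

    Returns the fraction of predictions inside the tolerance zone and the
    per-point flags.
    """
    flags = [abs(a - f) <= tol * abs(f) for f, a in zip(full, approx)]
    return sum(flags) / len(flags), flags


full_pred = [8.2, 6.9, 5.1, 3.4, 1.8]    # hypothetical full-model concentrations (mg/l)
approx_pred = [8.3, 6.8, 5.1, 3.5, 2.0]  # hypothetical interpolated concentrations (mg/l)

fraction, flags = within_tolerance(full_pred, approx_pred)
print(fraction, flags)  # 0.8 [True, True, True, True, False]
```

In this made‐up example the last point deviates by about 11% from the full model and would fall outside the gray zone, which is the kind of deviation that calls for a denser grid.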

If deviations occur, reducing the grid error tolerance can increase grid density. In addition, since grid calculations are centered on the initial parameter values, final estimates that differ substantially from these initial values may require recalculating the grid and repeating the estimation to ensure optimal model performance.
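The link between error tolerance and grid density can be illustrated with a one‐dimensional Python sketch. It assumes a simple bisection rule, in which an interval is split whenever linear interpolation at its midpoint misses the true response by more than the tolerance. This rule is an assumption made for illustration and is not the grid algorithm of saemixPBPK; the function names and the exponential toy response are hypothetical.

```python
import math


def refine(f, a, b, tol, max_depth=20):
    """Return sorted grid points on [a, b].

    An interval is kept when linear interpolation of f at its midpoint has a
    relative error <= tol (or when max_depth is exhausted); otherwise it is
    split in two and each half is refined in turn.
    """
    mid = 0.5 * (a + b)
    exact = f(mid)
    interpolated = 0.5 * (f(a) + f(b))
    if max_depth == 0 or abs(interpolated - exact) <= tol * abs(exact):
        return [a, b]
    left = refine(f, a, mid, tol, max_depth - 1)
    right = refine(f, mid, b, tol, max_depth - 1)
    return left + right[1:]  # drop the duplicated midpoint


def response(x):
    """Toy model response standing in for a PBPK output."""
    return math.exp(-x)


coarse = refine(response, 0.0, 4.0, tol=0.05)
fine = refine(response, 0.0, 4.0, tol=0.01)
print(len(coarse), len(fine))  # 9 17
```

For this toy response, tightening the tolerance from 5% to 1% roughly doubles the number of grid points, and therefore the number of full‐model evaluations, which shows the trade‐off between interpolation accuracy and runtime.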

7. Conclusion & Perspectives

In this tutorial, we have illustrated a novel methodology, popWB‐PBPK, which integrates population estimation techniques (specifically SAEM) with WB‐PBPK modeling. This approach uses a fully developed PBPK model to estimate individual parameter values as well as their associated variability and uncertainty, taking into account the physiological characteristics of each subject in the estimation process. The approach is implemented in an R package, saemixPBPK, which is publicly available on GitHub and integrates all the tools required to perform the estimation process. This technical framework can be applied to any PBPK or QSP model developed with the OSP Suite. A future extension of this workflow will integrate multiple outputs in the estimation method and in the grid calculation. This would broaden the application of the approach to the simultaneous estimation of different measurements, such as PBPK and PD or parent and metabolite models. Follow‐up work will assess the performance of the grid under different conditions in order to optimize its use, and will improve the tools needed to assess the quality of the parameter estimates.

Funding

This work was funded by Sanofi.

Conflicts of Interest

D.T., D.M., and L.N. are Sanofi employees and may hold shares and/or stock options in the company. D.T. is a member of the OSP Management Team. M.L. has nothing to disclose.

Supporting information

Data S1: Supporting Information.

PSP4-15-e70186-s001.zip (983.4KB, zip)

Teutonico D., Marchionni D., Lavielle M., and Nguyen L., “Integrating Population Approaches With Physiologically Based Pharmacokinetic Models: A Novel Framework for Parameter Estimation,” CPT: Pharmacometrics & Systems Pharmacology 15, no. 1 (2026): e70186, 10.1002/psp4.70186.

References

  • 1. Willmann S., Höhn K., Edginton A., et al., “Development of a Physiology‐Based Whole‐Body Population Model for Assessing the Influence of Individual Variability on the Pharmacokinetics of Drugs,” Journal of Pharmacokinetics and Pharmacodynamics 34, no. 3 (2007): 401–431.
  • 2. Bjorkman S., “Prediction of Drug Disposition in Infants and Children by Means of Physiologically Based Pharmacokinetic (PBPK) Modelling: Theophylline and Midazolam as Model Drugs,” British Journal of Clinical Pharmacology 59, no. 6 (2005): 691–704.
  • 3. Poulin P. and Theil F. P., “A Priori Prediction of Tissue:Plasma Partition Coefficients of Drugs to Facilitate the Use of Physiologically‐Based Pharmacokinetic Models in Drug Discovery,” Journal of Pharmaceutical Sciences 89, no. 1 (2000): 16–35.
  • 4. Willmann S., Lippert J., Sevestre M., Solodenko J., Fois F., and Schmitt W., “PK‐Sim: A Physiologically Based Pharmacokinetic ‘Whole‐Body’ Model,” Biosilico 1, no. 4 (2003): 121–124.
  • 5. Clewell H., Campbell J., and Linakis M., “Recent Applications of Physiologically Based Pharmacokinetic Modeling to Assess the Toxicity of Mixtures: A Review,” Current Opinion in Toxicology 34 (2023): 100390.
  • 6. El‐Khateeb E., Burkhill S., Murby S., Amirat H., Rostami‐Hodjegan A., and Ahmad A., “Physiological‐Based Pharmacokinetic Modeling Trends in Pharmaceutical Drug Development Over the Last 20‐Years; In‐Depth Analysis of Applications, Organizations, and Platforms,” Biopharmaceutics & Drug Disposition 42, no. 4 (2021): 107–117.
  • 7. Kuepfer L., Niederalt C., Wendl T., et al., “Applied Concepts in PBPK Modeling: How to Build a PBPK/PD Model,” CPT: Pharmacometrics & Systems Pharmacology 5, no. 10 (2016): 516–531.
  • 8. Wakefield J. and Bennett J., “The Bayesian Modeling of Covariates for Population Pharmacokinetic Models,” Journal of the American Statistical Association 91, no. 435 (1996): 917–927.
  • 9. Woodruff T. J. and Bois F. Y., “Optimization Issues in Physiological Toxicokinetic Modeling: A Case Study With Benzene,” Toxicology Letters 69, no. 2 (1993): 181–196.
  • 10. Krauss M., Burghaus R., Lippert J., et al., “Using Bayesian‐PBPK Modeling for Assessment of Inter‐Individual Variability and Subgroup Stratification,” In Silico Pharmacology 1 (2013): 6.
  • 11. Goldberg D. E., Genetic Algorithms (Pearson Education India, 2013).
  • 12. Yang J., Jamei M., Heydari A., et al., “Implications of Mechanism‐Based Inhibition of CYP2D6 for the Pharmacokinetics and Toxicity of MDMA,” Journal of Psychopharmacology 20, no. 6 (2006): 842–849.
  • 13. Peters S. A., “Identification of Intestinal Loss of a Drug Through Physiologically Based Pharmacokinetic Simulation of Plasma Concentration‐Time Profiles,” Clinical Pharmacokinetics 47, no. 4 (2008): 245–259.
  • 14. Smith A. E. and Evans J. S., “Uncertainty in Fitted Estimates of Apparent in Vivo Metabolic Constants for Chloroform,” Fundamental and Applied Toxicology 25, no. 1 (1995): 29–44.
  • 15. Gelman A., Bois F., and Jiang J., “Physiological Pharmacokinetic Analysis Using Population Modeling and Informative Prior Distributions,” Journal of the American Statistical Association 91, no. 436 (1996): 1400–1412.
  • 16. Tsamandouras N., Rostami‐Hodjegan A., and Aarons L., “Combining the 'Bottom Up' and 'Top Down' Approaches in Pharmacokinetic Modelling: Fitting PBPK Models to Observed Clinical Data,” British Journal of Clinical Pharmacology 79, no. 1 (2015): 48–55.
  • 17. Ette E. I. and Williams P. J., “Population Pharmacokinetics II: Estimation Methods,” Annals of Pharmacotherapy 38, no. 11 (2004): 1907–1915.
  • 18. Sheiner L. B. and Beal S. L., “Evaluation of Methods for Estimating Population Pharmacokinetics Parameters. I. Michaelis‐Menten Model: Routine Clinical Pharmacokinetic Data,” Journal of Pharmacokinetics and Biopharmaceutics 8, no. 6 (1980): 553–571.
  • 19. Sheiner L. B. and Beal S. L., “Evaluation of Methods for Estimating Population Pharmacokinetic Parameters. II. Biexponential Model and Experimental Pharmacokinetic Data,” Journal of Pharmacokinetics and Biopharmaceutics 9, no. 5 (1981): 635–651.
  • 20. Lavielle M., Mixed Effects Models for the Population Approach: Models, Tasks, Methods and Tools (CRC Press, 2014).
  • 21. Beal S. L. and Sheiner L. B., “Estimating Population Kinetics,” Critical Reviews in Biomedical Engineering 8, no. 3 (1982): 195–222.
  • 22. Smith A. and Wakefield J., “The Hierarchical Bayesian Approach to Population Pharmacokinetic Modelling,” International Journal of Bio‐Medical Computing 36, no. 1–2 (1994): 35–42.
  • 23. Sheiner L. B., “The Population Approach to Pharmacokinetic Data Analysis: Rationale and Standard Data Analysis Methods,” Drug Metabolism Reviews 15, no. 1–2 (1984): 153–171.
  • 24. Delyon B., Lavielle M., and Moulines E., “Convergence of a Stochastic Approximation Version of the EM Algorithm,” Annals of Statistics 27 (1999): 94–128.
  • 25. Comets E., Lavenu A., and Lavielle M., “Parameter Estimation in Nonlinear Mixed Effect Models Using Saemix, an R Implementation of the SAEM Algorithm,” Journal of Statistical Software 80, no. 3 (2017): 1–41.
  • 26. Dobrinas M., Cornuz J., Oneda B., Kohler Serra M., Puhl M., and Eap C. B., “Impact of Smoking, Smoking Cessation, and Genetic Polymorphisms on CYP1A2 Activity and Inducibility,” Clinical Pharmacology and Therapeutics 90, no. 1 (2011): 117–125.
  • 27. Plowchalk D. R. and Rowland Yeo K., “Prediction of Drug Clearance in a Smoking Population: Modeling the Impact of Variable Cigarette Consumption on the Induction of CYP1A2,” European Journal of Clinical Pharmacology 68, no. 6 (2012): 951–960.
