Author manuscript; available in PMC: 2010 Oct 30.
Published in final edited form as: Pharm Res. 2009 Oct 30;27(1):92–104. doi: 10.1007/s11095-009-9980-5

Bioequivalence tests based on individual estimates using non-compartmental or model-based analyses: evaluation of estimates of sample means and type I error for different designs

Anne Dubois 1,*, Sandro Gsteiger 2, Etienne Pigeolet 2, France Mentré 1
PMCID: PMC2881952  PMID: 19876723

Abstract

The main objective of this work is to compare the standard bioequivalence tests based on individual estimates of the area under the curve and the maximal concentration obtained by non compartmental analysis (NCA) to those based on individual empirical Bayes estimates (EBE) obtained by nonlinear mixed effects models. We evaluate by simulation the precision of sample means estimates and the type I error of bioequivalence tests for both approaches. Crossover trials are simulated under H0 using different numbers of subjects (N) and of samples per subject (n). We simulate concentration-time profiles with different variability settings for the between-subject and within-subject variabilities and for the variance of the residual error. Bioequivalence tests based on NCA show satisfactory properties with low and high variabilities, except when the residual error is high which leads to a very poor type I error or when n is small which leads to biased estimates. Tests based on EBE lead to an increase of the type I error when the shrinkage is above 20% which occurs notably when NCA fails. In those cases, tests based on individual estimates cannot be used.

Keywords: pharmacokinetics, bioequivalence tests, non compartmental analysis, nonlinear mixed effects model, SAEM algorithm

1. Introduction

Pharmacokinetic (PK) bioequivalence studies are performed to compare different drug formulations. The most commonly used design for bioequivalence trials is the two-period two-sequence crossover design. This design is recommended by the Food and Drug Administration (FDA) (1) and the European Medicines Evaluation Agency (EMEA) (2). FDA and EMEA recommend testing bioequivalence from the log ratio of the geometric means of two parameters: the area under the curve (AUC) and the maximal concentration (Cmax). These endpoints are usually estimated by non compartmental analysis (NCA) using the trapezoidal rule to evaluate AUC (3). NCA requires few assumptions but a large number of samples per subject (usually between 10 and 20).

PK data can also be analyzed using nonlinear mixed effects models (NLMEM). This method is more complex than NCA but has several advantages: it takes advantage of the knowledge accumulated on the drug and can characterize the PK with few samples per subject. This makes it possible to perform analyses in patients, the target population, in whom pharmacokinetics can differ from that of healthy subjects. Non compartmental AUC is computed by the trapezoidal rule, which ignores assay error. NCA does not take into account non linear pharmacokinetics, which can bias the bioavailability estimation (4) and may amplify small bioavailability differences between drug products (5). The European guideline on similar biological medicinal products, which frequently exhibit non linear pharmacokinetics, recommends estimating elimination characteristics such as clearance in comparative PK studies (6). It is known that in these conditions, clearance is not accurately estimated by NCA. Models can also lead to a better understanding of the biological system than a fully empirical approach and therefore help interpret ambiguous results.

However, the use of NLMEM is still rare in early phases of drug development or to analyze crossover studies. There are only seven published studies which use NLMEM to analyze bioequivalence trials (7, 8, 9, 10, 11, 12, 13) and, except for Zhou et al (12), all analyze a dataset with many samples per subject. Five papers (7, 8, 9, 10, 13) compare tests based on individual NCA estimates to tests based on NLMEM and all conclude that the results are similar. Yet, they use different statistical approaches to test bioequivalence with NLMEM. Furthermore, none perform bioequivalence tests on individual estimates of AUC and Cmax obtained from NLMEM. Pentikis et al (8) propose the estimation of AUC and Cmax by standard nonlinear regression as an alternative to NCA, and Zhou et al (12) perform bioequivalence tests on the individual empirical Bayes estimates (EBE) of the volume of distribution and the steady-state trough concentration. Otherwise, bioequivalence tests are performed on treatment effect parameters (7, 8, 9, 10, 11, 13). All authors agree that simulation studies are needed to evaluate bioequivalence tests based on NLMEM and to compare them to tests based on individual NCA estimates.

In this work, we compare the standard analysis of bioequivalence crossover trials based on NCA to the same usual analysis based on individual EBE obtained by NLMEM. We study the influence of the design for each approach. There is already one published simulation study of Panhard and Mentré which evaluates bioequivalence tests based on EBE estimated through NLMEM (14). Our present study relies on the work of Panhard and Mentré as starting point and adds several new features.

The main difference concerns the tests applied to the individual estimates (EBE or NCA). Panhard and Mentré perform the Student paired test and the Wilcoxon paired signed rank test whereas we use a linear mixed effects model (LMEM). As specified in the regulatory guidelines (1, 2), the bioequivalence analysis should take into account sources of variation that can be reasonably assumed to have an effect on the endpoints AUC and Cmax. Therefore, LMEM including treatment, period, sequence and subject effects are usually used to analyze the log-transformed data (15).

Panhard and Mentré limit their comparison to bioequivalence tests on AUC and do not evaluate tests based on Cmax. In the present study, both endpoints are analyzed; indeed, we expect some issues for bioequivalence tests performed on Cmax because the estimation of Cmax by NCA is sensitive to the design and the computation of Cmax from EBE is more complex than for AUC. To simulate PK profiles and then to estimate individual parameters by NLMEM, Panhard and Mentré use a pharmacokinetic model parameterized using AUC as one of the PK parameters whereas we choose a more common parameterization, replacing AUC by the clearance of the drug.

For the estimation of NLMEM parameters, Panhard and Mentré use an algorithm based on a first order linearization with respect to the random effects, the first order conditional estimates (FOCE) algorithm (16) implemented in the R function nlme (17). The FOCE algorithm is the most widely used algorithm and corresponds to the industry standard for model-based PK analyses as it is implemented in NONMEM. Yet, this algorithm presents some convergence issues which could be avoided with the use of a stochastic algorithm using the exact maximum likelihood, such as the stochastic approximation expectation maximisation (SAEM) algorithm (18, 19, 20). The SAEM algorithm is implemented in the free software MONOLIX (21) (first version February 2005) and has been applied to several population PK analyses (22, 23, 24).

The main objective of this work is to compare standard bioequivalence tests based on individual estimates of AUC and Cmax obtained by NCA or by NLMEM. The comparison is based on the precision of the sample means of log(AUC) and log(Cmax) and on the type I error of bioequivalence tests for both estimation methods. In section 2 of the article, we describe the model, the simulation study, both estimation methods (NCA and NLMEM), the evaluation of the precision of sample means, how bioequivalence tests are performed and how shrinkage on the tested parameters is estimated. The main results of the simulation are presented in section 3. Finally, the study results and perspectives are discussed.

2. Methods

2.1. Simulation study

2.1.1. Simulation model

We analyze two-period two-sequence crossover PK trials where subjects are randomly allocated to one of two treatment sequences. In the first sequence (Ref – Test), subjects receive the reference treatment (Ref) and the test treatment (Test) in period one and two, respectively. In the second sequence (Test – Ref), subjects receive treatments in the reverse order (Test then Ref). Designs are balanced, i.e. there is the same number of subjects N/2 for each sequence.

In the following, we denote yijk the concentration for individual i (i = 1, ···, N) at sampling time j (j = 1, ···, nik) for period k (k = 1, 2). We also denote f the nonlinear pharmacokinetic function which links concentrations to sampling times. The nonlinear mixed effects model can be written as follows:

$$y_{ijk} = f(t_{ijk}, \theta_{ik}) + \varepsilon_{ijk} \tag{1}$$

where θik = (θikl; l = 1, ···,p)′ is the p-vector of the PK parameters of subject i for period k. εijk is the residual error assumed to be normally distributed with zero mean and variance σ²ijk, with:

$$\sigma_{ijk}^{2} = \left(a + b\, f(t_{ijk}, \theta_{ik})\right)^{2} \tag{2}$$

This is a combined error model with two parameters: a for the additive and b for the proportional part. We assume a multivariate log-normal distribution for the individual parameters θik. In absence of covariates, the lth individual parameter can be decomposed as:

$$\theta_{ikl} = \mu_{l}\, e^{\eta_{il} + \kappa_{ikl}} \tag{3}$$

with μ = (μl; l = 1, ···,p)′ the p-vector of fixed effects, ηi = (ηil; l = 1, ···,p)′ the vector of random effects of subject i and κik = (κikl; l = 1, ···,p)′ the vector of random effects of subject i at period k. ηi represents the variability between individuals and it is named between-subject variability (BSV). κik represents the variability between two periods of treatment for the same individual and it is called within-subject variability (WSV). ηi and κik are assumed to be normally distributed with zero mean and with covariance matrices of size p × p denoted Ω and Γ, respectively. In this study we assume that Ω and Γ are diagonal. ηi, κik and εijk are assumed to be independent.

We introduce three categorical covariates into the statistical model: the treatment Tik, the period Pk and the sequence Si. The reference classes for each covariate are defined as follows: Tik is fixed to zero for the treatment Ref and is equal to 1 for the treatment Test; Pk is fixed to zero for the first period and is equal to 1 for the second one; Si is fixed to zero for the first sequence Ref – Test and is equal to 1 for the second one Test – Ref. βT = (βT,l; l = 1, ···,p)′, βP = (βP,l; l = 1, ···,p)′ and βS = (βS,l; l = 1, ···,p)′ are the vectors of treatment, period and sequence effects. With these three covariates, μl of Eq. (3) is replaced by μikl defined as:

$$\mu_{ikl} = \lambda_{l}\, e^{\beta_{T,l} T_{ik} + \beta_{P,l} P_{k} + \beta_{S,l} S_{i}} \tag{4}$$

with λ = (λl; l = 1, ···,p)′ the p-vector of the fixed effects for the reference classes.
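As a minimal sketch of how Eqs. (3) and (4) combine, the function below builds one individual parameter on the log scale and exponentiates it. The function name, argument names and the numerical values are hypothetical and chosen only for illustration; the authors performed their simulations in R.

```python
import math

def individual_parameter(lam, beta_t, beta_p, beta_s,
                         treatment, period, sequence, eta, kappa):
    """Return theta_ikl = mu_ikl * exp(eta_il + kappa_ikl), with mu_ikl from Eq. (4).

    treatment, period and sequence are the 0/1 indicators T_ik, P_k and S_i.
    """
    log_mu = (math.log(lam) + beta_t * treatment
              + beta_p * period + beta_s * sequence)
    return math.exp(log_mu + eta + kappa)

# Hypothetical values: reference fixed effect 2.0, treatment effect log(0.8),
# no period or sequence effect, random effects set to zero.
theta_ref = individual_parameter(2.0, math.log(0.8), 0.0, 0.0, 0, 0, 0, 0.0, 0.0)
theta_test = individual_parameter(2.0, math.log(0.8), 0.0, 0.0, 1, 1, 0, 0.0, 0.0)
```

With all random effects at zero, the Test value is the Ref value multiplied by 0.8, which is how a treatment effect on the log scale acts as a ratio on the original scale.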

2.1.2. Theophylline pharmacokinetics

We use the concentration data of the anti-asthmatic drug theophylline to define the population PK model for the simulation study. These data are classical ones in population pharmacokinetics (17) and are used in previous simulation studies done by Panhard et al. (14, 25). The theophylline data include twelve subjects receiving a single oral dose of theophylline depending on their body weight (from 3 to 6 mg). For each patient, ten blood samples were taken at 0.25, 0.5, 1, 2, 3.5, 5, 7, 9, 12 and 24 h after administration and serum concentrations were measured. A one compartment model with first order absorption and first order elimination adequately describes the data and can be written as follows:

$$f(t, \theta) = \frac{F D k_a}{CL - V k_a}\left(e^{-k_a t} - e^{-\frac{CL}{V} t}\right) \tag{5}$$

where D is the dose, F the bioavailability, ka the absorption rate constant, CL the clearance of the drug and V the volume of distribution. As only data after oral administration are obtained, the bioavailability cannot be estimated and, consequently, the vector θ of PK parameters is equal to (ka, CL/F, V/F).
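The model of Eq. (5) can be sketched as a short function written directly in terms of the apparent parameters CL/F and V/F, so that F cancels. The function name is hypothetical. The parameter values in the usage lines are the reference population values quoted in section 2.1.3, with CL/F converted from mL/h to L/h so that concentrations come out in mg/L.

```python
import math

def conc_one_compartment(t, ka, cl_f, v_f, dose):
    """Concentration at time t for first order absorption and elimination (Eq. 5).

    cl_f and v_f are the apparent clearance CL/F and volume V/F; the expression
    is Eq. (5) with numerator and denominator divided by F.
    """
    ke = cl_f / v_f  # elimination rate constant CL/V
    scale = dose * ka / (v_f * (ka - ke))
    return scale * (math.exp(-ke * t) - math.exp(-ka * t))

# Reference values from section 2.1.3: ka = 1.48 /h, CL/F = 0.04036 L/h,
# V/F = 0.48 L, dose = 4 mg.
sampling_times = [0.25, 0.5, 1, 2, 3.5, 5, 7, 9, 12, 24]
profile = [conc_one_compartment(t, 1.48, 0.04036, 0.48, 4.0) for t in sampling_times]
```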

2.1.3. Simulation features

In this simulation study, we use settings similar to those of the simulation studies performed by Panhard et al. (14, 25). However, we simulate two-period, two-sequence crossover pharmacokinetic trials whereas they simulate two-period, one-sequence crossover trials. For each trial, N/2 subjects are allocated to the sequence Ref – Test and N/2 subjects are allocated to the sequence Test – Ref. We fix the dose for all subjects to 4 mg which corresponds to the rounded median dose of the theophylline study. The vector of population parameters λ is composed of (λka = 1.48 h−1, λCL/F = 40.36 mL/h, λV/F = 0.48 L) for the reference treatment. In order to mimic a change in bioavailability, we add a treatment effect βT = (0, βT,CL/F, βT,V/F)′ on log(λ), i.e. we multiply λCL/F by exp(βT,CL/F) and λV/F by exp(βT,V/F) for the test treatment. The modification of bioavailability also affects AUC and Cmax. Indeed, AUC = FD/CL and Cmax is defined as:

$$C_{max} = f(t_{max}, \theta) = \frac{F D}{V}\, e^{-\frac{CL}{V} t_{max}} \quad \text{with} \quad t_{max} = \frac{\log(k_a) - \log(CL/V)}{k_a - CL/V} \tag{6}$$
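The closed forms for AUC, tmax and Cmax can be checked numerically with a small sketch. The function name is hypothetical, CL/F is expressed in L/h, and the Test parameters are obtained by multiplying CL/F and V/F by 0.8, as under the hypothesis βT = (0, log(0.8), log(0.8))′ used later in the simulation designs.

```python
import math

def auc_tmax_cmax(ka, cl_f, v_f, dose):
    """Return (AUC, tmax, Cmax) from the apparent PK parameters, following Eq. (6)."""
    ke = cl_f / v_f                                   # CL/V, unaffected by F
    auc = dose / cl_f                                 # AUC = F D / CL
    tmax = (math.log(ka) - math.log(ke)) / (ka - ke)
    cmax = dose / v_f * math.exp(-ke * tmax)
    return auc, tmax, cmax

ref = auc_tmax_cmax(1.48, 0.04036, 0.48, 4.0)
test = auc_tmax_cmax(1.48, 0.04036 * 0.8, 0.48 * 0.8, 4.0)

auc_ratio = test[0] / ref[0]
cmax_ratio = test[2] / ref[2]
```

Because the same factor multiplies CL/F and V/F, the ratio CL/V and therefore tmax are unchanged, while AUC and Cmax are both scaled by the inverse of that factor.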

We do not simulate a period effect or a sequence effect. We simulate with two levels of variability for the between-subject and within-subject variability. In the following, BSV and WSV are given as standard deviations of the log-transformed parameters multiplied by 100 to be expressed in percent. The standard deviation on the log scale corresponds approximately to the coefficient of variation on the ordinary scale. For the low level, we fix BSV to 20% for ka and CL/F and to 10% for V/F; WSV is fixed to half BSV for the three parameters. For the high level, we fix BSV to 50% and WSV to 15% for the three parameters. We also simulate with two levels of variability for the residual error: a = 0.1 mg/L, b = 10% for the low level, and a = 1 mg/L, b = 25% for the high level. The high level of residual error is only used with the high level of BSV and WSV. We call Sl,l the variability setting with low variability for BSV and WSV and for the residual error, Sh,l the variability setting with high variability for BSV and WSV and low for the residual error, and Sh,h the variability setting with high variability for BSV and WSV and for the residual error. The three variability settings are summarized in Table I.

Table I.

Summary of the three variability settings used in the simulation study. The between-subject (BSV) and within-subject (WSV) variability are given as standard deviations of the log-parameters multiplied by 100 and expressed in percent.

| Variability | Sl,l | Sh,l | Sh,h |
| BSV | 20% for ka and CL/F; 10% for V/F | 50% | 50% |
| WSV | 10% for ka and CL/F; 5% for V/F | 15% | 15% |
| Residual error | a = 0.1 mg/L; b = 10% | a = 0.1 mg/L; b = 10% | a = 1 mg/L; b = 25% |

2.1.4. Simulation process

For each subject i = 1, ···, N of each simulated trial m = 1, ···, M, we simulate a vector of random effects ηi from 𝒩(0, Ω) and two vectors of random effects κik from 𝒩(0, Γ), one for each period k = 1,2. To get the logarithm of each individual parameter log(θikl), we add the logarithm of the mean parameter log(λl), the treatment effect βT,l if needed (depending on the treatment group and the PK parameter considered), and both random effects ηil and κikl. The concentrations f(tijk, θik) predicted by the PK model at time tijk (j = 1, ···,nik) are then computed using the individual parameters. In these simulations, the sampling times are identical for all subjects and both periods, so j = 1, ···, n, where n is a fixed number of sampling times for each simulated design. Finally, we add a residual error, generated from a normal distribution 𝒩(0, (a + b f(tijk, θik))²), to each predicted concentration to obtain the simulated concentrations yijk. We do not incorporate a limit of quantification (LOQ) in the simulation because NCA cannot handle such data, contrary to the SAEM algorithm, and we do not want to favour the latter. In the rare cases where the simulated concentration is below zero, we fix it to the value 0.1 mg/L.

We expect more of these fixed concentrations when variability increases but their proportion could also differ from a design to another if the sampling times differ. Consequently, for each simulated design and each variability setting, we compute the proportion of the concentrations fixed to 0.1 mg/L and study the corresponding sampling times.
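The simulation steps described above can be sketched for a single subject as follows. This is an illustrative Python sketch, not the authors' R code: the function and dictionary names are hypothetical, CL/F is expressed in L/h, and diagonal Ω and Γ are represented by per-parameter standard deviations.

```python
import math
import random

PARAMS = ("ka", "cl_f", "v_f")

def predict(theta, dose, t):
    """One compartment prediction of Eq. (5) in terms of CL/F and V/F."""
    ke = theta["cl_f"] / theta["v_f"]
    scale = dose * theta["ka"] / (theta["v_f"] * (theta["ka"] - ke))
    return scale * (math.exp(-ke * t) - math.exp(-theta["ka"] * t))

def simulate_subject(lam, beta_t, omega, gamma, sequence, dose, times, a, b, rng):
    """Simulate both periods of one subject; sequence 0 is Ref then Test."""
    eta = {p: rng.gauss(0.0, omega[p]) for p in PARAMS}        # one draw per subject
    order = ("Ref", "Test") if sequence == 0 else ("Test", "Ref")
    result = {}
    for treatment in order:
        kappa = {p: rng.gauss(0.0, gamma[p]) for p in PARAMS}  # one draw per period
        theta = {}
        for p in PARAMS:
            effect = beta_t[p] if treatment == "Test" else 0.0
            theta[p] = math.exp(math.log(lam[p]) + effect + eta[p] + kappa[p])
        conc = []
        for t in times:
            pred = predict(theta, dose, t)
            y = pred + rng.gauss(0.0, a + b * pred)            # combined error, Eq. (2)
            conc.append(0.1 if y < 0.0 else y)                 # negative values set to 0.1 mg/L
        result[treatment] = {"theta": theta, "conc": conc}
    return result

# Setting S_l,l with a treatment effect of log(0.8) on CL/F and V/F.
lam = {"ka": 1.48, "cl_f": 0.04036, "v_f": 0.48}
beta_t = {"ka": 0.0, "cl_f": math.log(0.8), "v_f": math.log(0.8)}
omega = {"ka": 0.20, "cl_f": 0.20, "v_f": 0.10}
gamma = {"ka": 0.10, "cl_f": 0.10, "v_f": 0.05}
times = [0.25, 0.5, 1, 2, 3.5, 5, 7, 9, 12, 24]
subject = simulate_subject(lam, beta_t, omega, gamma, 0, 4.0, times,
                           0.1, 0.10, random.Random(1))
```

Counting how often the replacement by 0.1 mg/L is triggered, per sampling time, gives the proportions examined for each design and variability setting.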

2.1.5. Simulation designs

We simulate trials with four different designs, which are also used by Panhard et al (14, 25). We simulate with the original design with N = 12 subjects and n = 10 samples per subject and per period, taken at the times of the initial study (0.25, 0.5, 1, 2, 3.5, 5, 7, 9, 12 and 24 h after dosing). We also simulate with an intermediate design with N = 24 subjects and n = 5 samples, taken at 0.25, 1.5, 3.35, 12 and 24 h after dosing, a sparse design with N = 40 subjects and n = 3 samples, taken at 0.25, 3.35 and 24 h after dosing, and a rich design with N = 40 subjects and n = 10 samples, taken at the times of the initial study. For each design, we simulate using the variability settings Sl,l and Sh,l. We simulate using Sh,h only for the intermediate design. For each design and each variability setting, we simulate 1000 trials under two different hypotheses: H0;80% where βT = (0,log(0.8),log(0.8))′ and H0;125% where βT = (0,log(1.25),log(1.25))′. For each simulated trial, each simulated design and each variability setting, the simulated concentrations for the reference treatment are equal in both simulated hypotheses. In the following, we call simulation setting the association of one design with one variability setting and one hypothesis (H0;80% or H0;125%). Considering this, there are 18 different simulation settings (8 each for Sl,l and Sh,l, and 2 for Sh,h). All simulations are performed using the statistical software R 2.7.1. Figure 1 displays the individual data of one trial simulated under H0;80% and H0;125% with the intermediate design and three variability settings (Sl,l, Sh,l and Sh,h).
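The count of 18 simulation settings can be verified by enumerating the combinations described above. The labels below are hypothetical string names for the designs, variability settings and hypotheses.

```python
# (N subjects, n samples per subject and period) for each design
designs = {
    "original": (12, 10),
    "intermediate": (24, 5),
    "sparse": (40, 3),
    "rich": (40, 10),
}

settings = []
for variability in ("S_l,l", "S_h,l", "S_h,h"):
    for design in designs:
        # S_h,h is simulated for the intermediate design only
        if variability == "S_h,h" and design != "intermediate":
            continue
        for hypothesis in ("H0;80%", "H0;125%"):
            settings.append((design, variability, hypothesis))

n_settings = len(settings)
```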

Figure 1.

Concentrations (mg/L) simulated for the intermediate design (N = 24, n = 5) for the reference treatment (left) and for the test treatment under H0;80% (middle) and H0;125% (right) using the variability settings Sl,l (top), Sh,l (middle) and Sh,h (bottom).

2.2. Estimation of individual parameters

2.2.1. Notations

We perform bioequivalence tests on AUC and Cmax. To estimate the individual parameters using NCA or NLMEM, we do not consider periods or sequences. Only the treatment group (Ref or Test) is taken into account. For each simulated trial m = 1, ···, 1000 of one simulation setting, there are 2N individual AUC and 2N individual Cmax, one for each subject i = 1, ···, N and each treatment group.

In the following, for one simulated trial, we call AUCi(Ref) the true value of the individual AUC of subject i for the reference treatment and AUCi(Test) the true value of the individual AUC of subject i for the test treatment; we also define AUC^i(Ref) the estimated value of individual AUC of subject i for the treatment Ref obtained from NCA or NLMEM and AUC^i(Test) the corresponding AUC for the treatment Test.

The same notation applies to Cmax. Cmaxi(Ref) and Cmaxi(Test) are the true values of the individual Cmax of subject i for the treatments Ref and Test, respectively. Cmax^i(Ref) and Cmax^i(Test) are the corresponding estimated values of individual Cmax obtained from NCA or NLMEM.

In some cases, we may refer to these different individual parameters without specifying the treatment group. For each simulated trial, AUCi(Ref),AUCi(Test),Cmaxi(Ref) and Cmaxi(Test) are computed from the corresponding individual parameters ka, CL/F and V/F simulated as described in section 2.1.4.

2.2.2. Estimation based on non compartmental analysis

First, we estimate AUC and Cmax by non compartmental analysis (3) using an R function named mnca, which we developed. For each simulated trial, this function provides the estimation of different NCA parameters for each subject and each treatment group. Different options have to be specified in mnca. In this study, we use the linear trapezoidal rule to compute the AUC0–last between the time of dose (equal to 0) and the last sampling time. To obtain the total AUC (between the time of dose and infinity), we estimate the terminal slope, equal to CL/V, by linear regression of the logarithm of the last concentrations against time. To do so, we use a fixed number of concentrations which depends on the number of samples per subject in the design.

To avoid biased estimation of the terminal slope, the first point used for its computation should be on the descending side of the concentration curve and not too close to Cmax. Using the mean value of PK parameters, tmax, the sampling time corresponding to Cmax, is about 2.06 h for both treatment groups (contrary to Cmax, tmax is not affected by the change of bioavailability). Consequently, for the original and rich designs where n = 10, we use the last four concentrations which correspond to sampling times 7, 9, 12 and 24 h. NCA is normally performed on PK profiles containing ten sampling times per subject or more. For intermediate and sparse designs where n = 5 and n = 3 respectively, the total AUC is estimated by NCA for completeness. For these two designs, we use the last two concentrations which correspond to sampling times 12 and 24 h for the intermediate design, and to 3.35 and 24 h for the sparse design.

Figure 2 displays the individual concentration curves of one simulated trial for the original, intermediate and sparse designs and the two variability settings Sl,l and Sh,l. The bottom left panel of Figure 1 presents a similar graphic for the intermediate design and Sh,h, completing our illustration. For the original and intermediate designs, the number of concentrations used to compute the terminal slope seems reasonable. The same observation holds for the rich design because its sampling times are those of the original design; only the number of subjects differs. For the sparse design, the number of concentrations used to compute the terminal slope is imposed by the design, and the first point is close to Cmax.

Figure 2.

Concentrations (mg/L) simulated for the original (N =12, n = 10, left), intermediate (N =24, n = 5, middle) and sparse (N = 40, n = 3, right) designs for the reference treatment using the variability settings Sl,l (top) and Sh,l (bottom).

Other assumptions are made to compute the terminal slope and to handle particular PK profiles, especially for the intermediate and sparse designs where only two points are used for the estimation. If the last two concentrations increase instead of decreasing, or if they are identical up to the sixth digit, we consider the terminal slope to be missing, i.e. there is no estimation of the total AUC for the subject and treatment concerned. The proportion of missing AUC^i should increase with variability and could differ from one design to another due to different sampling times. Consequently, for each design and each variability setting, we compute the proportion of missing AUC^i.

For all designs, Cmax is estimated as the maximal concentration observed. Contrary to AUC, there is no missing Cmax.
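The NCA computations of this section can be sketched as below. This is not the authors' mnca function: the name `nca` and its arguments are hypothetical, the extrapolated part of the AUC is taken as the last observed concentration divided by the terminal slope (the usual NCA convention, assumed here because the text does not spell it out), and the "missing slope" rule is one possible reading of the criterion stated above.

```python
import math

def nca(times, conc, n_terminal):
    """Return (total AUC or None if the terminal slope is missing, Cmax)."""
    # Linear trapezoidal rule from the time of dose (t = 0, C = 0) to the last sample.
    t = [0.0] + list(times)
    c = [0.0] + list(conc)
    auc_last = sum((t[j + 1] - t[j]) * (c[j] + c[j + 1]) / 2.0
                   for j in range(len(t) - 1))

    # Terminal slope: least-squares line through log-concentration versus time.
    tt = list(times[-n_terminal:])
    lc = [math.log(x) for x in conc[-n_terminal:]]
    t_mean = sum(tt) / len(tt)
    lc_mean = sum(lc) / len(lc)
    sxy = sum((x - t_mean) * (y - lc_mean) for x, y in zip(tt, lc))
    sxx = sum((x - t_mean) ** 2 for x in tt)
    lambda_z = -sxy / sxx

    cmax = max(conc)  # maximal observed concentration
    # Assumed reading of the rule: non-decreasing slope, or last two
    # concentrations equal to six decimals, gives a missing total AUC.
    flat = round(conc[-1], 6) == round(conc[-2], 6)
    if lambda_z <= 0.0 or flat:
        return None, cmax
    return auc_last + conc[-1] / lambda_z, cmax

# Synthetic mono-exponential profile (illustrative values only).
demo_times = [1, 2, 4, 8]
demo_conc = [10.0 * math.exp(-0.1 * t) for t in demo_times]
auc_total, cmax_obs = nca(demo_times, demo_conc, 2)
auc_missing, _ = nca(demo_times, [5.0, 4.0, 2.0, 3.0], 2)
```

In the second call the last two concentrations increase, so the total AUC is returned as missing while Cmax is still available, which mirrors the behaviour described in the text.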

2.2.3. Estimation based on nonlinear mixed effects model

We also estimate AUC and Cmax from the individual empirical Bayes estimates of the PK parameters after population analyses. In this study we use the SAEM algorithm implemented in MONOLIX 2.4 to estimate the NLMEM parameters (population and individual parameters). For each simulated trial, we analyze separately the concentrations of each treatment group using NLMEM without taking into account periods and sequences. As each subject receives both treatments, data of each treatment group contain observations from all subjects. In the following, we describe the statistical model used to fit the data of the reference treatment. We consider yij(Ref) the concentration for individual i (i = 1,···, N) at time tij (j = 1,···,n) for the treatment Ref. Depending on the sequence of subject i, yij(Ref) corresponds to a concentration of the first or second period. The statistical model used has no covariate because no period or sequence effect is incorporated. Furthermore, since periods are not considered, WSV cannot be separated from BSV. Consequently, the lth individual parameter is defined as:

$$\theta_{il}^{(Ref)} = \mu_{l}^{(Ref)}\, e^{\eta_{il}^{(Ref)}} \tag{7}$$

Ω(Ref) is the covariance matrix of the vector of random effects ηi(Ref). A similar statistical model is applied to fit the data of the treatment Test.

Of note, given the BSV and WSV, the overall variability is equal for both treatment groups, i.e. Ω(Ref) = Ω(Test). However, for each simulated trial, their estimates, Ω̂(Ref) and Ω̂(Test), are different. The overall simulated variability is 22.4% for ka and CL/F and 11.2% for V/F under Sl,l, and 52.2% for the three PK parameters under Sh,l and Sh,h.
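The overall variability figures quoted above follow from adding the BSV and WSV variances on the log scale, since ηi and κik are independent. A short check, with a hypothetical helper name:

```python
import math

def overall_sd(bsv_percent, wsv_percent):
    """Standard deviation (in percent) of eta + kappa for independent effects."""
    return math.sqrt(bsv_percent ** 2 + wsv_percent ** 2)

low_ka_cl = round(overall_sd(20, 10), 1)   # 22.4
low_v = round(overall_sd(10, 5), 1)        # 11.2
high_all = round(overall_sd(50, 15), 1)    # 52.2
```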

After having estimated the population parameters for the data of one treatment group of one simulated trial, we estimate the conditional modes of the corresponding individual parameters which are defined as the individual empirical Bayes estimates. These EBE provide the individual estimates of PK parameters (ka, CL/F and V/F). We then derive individual AUC^i(Ref) and Cmax^i(Ref) or AUC^i(Test) and Cmax^i(Test) depending on the treatment group considered. Contrary to NCA, there is no missing AUC^i obtained by NLMEM using the SAEM algorithm.

2.2.4. Evaluation of estimates of sample means

In this study we compute individual AUC^i and Cmax^i for 1000 replicates of different designs, different variabilities and different treatment groups using two types of estimation. To analyze and compare the accuracy and precision of the estimates of the sample means of log(AUC) and log(Cmax) using NCA or EBE, we compute the estimation error for each treatment group (Ref or Test) of each simulated trial. To take into account sampling variability, for each dataset we compute the estimation error as the difference between the sample mean of the estimates (NCA or EBE) and the sample mean of the true simulated values. In the following, definitions are given for AUC^i(Ref). The same definitions apply to AUC^i(Test), Cmax^i(Ref) and Cmax^i(Test). For each simulated trial, the estimation error for the sample mean of log(AUC) for the reference treatment is computed as:

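$$ee_{AUC}^{(Ref)} = \frac{1}{N^{*}} \sum_{i=1}^{N^{*}} \log\left(\widehat{AUC}_{i}^{(Ref)}\right) - \frac{1}{N} \sum_{i=1}^{N} \log\left(AUC_{i}^{(Ref)}\right) \tag{8}$$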

with AUC^i(Ref) the AUC estimated by NCA or derived from EBE for subjects i = 1, ···, N*, and AUCi(Ref) the true simulated parameter for subjects i = 1, ···, N. For the estimation of individual parameters by NCA, there may be missing AUC^i, so that N* ≤ N.

For one simulation setting, we call eeAUC,m(Ref) the estimation error for the sample mean of log(AUC) computed for the reference treatment and the mth simulated trial (m = 1, · · ·, 1000). We then define the bias and root mean square error (RMSE) computed from eeAUC,m(Ref) over the 1000 replicates as:

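$$bias_{AUC}^{(Ref)} = \frac{1}{1000} \sum_{m=1}^{1000} ee_{AUC,m}^{(Ref)} \qquad rmse_{AUC}^{(Ref)} = \sqrt{\frac{1}{1000} \sum_{m=1}^{1000} \left(ee_{AUC,m}^{(Ref)}\right)^{2}} \tag{9}$$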

As well as computing bias and RMSE, we compute the 95% confidence interval of biasAUC(Ref) using the standard error of the mean and the 97.5% quantile of the Gaussian distribution. If zero does not belong to the 95% confidence interval of biasAUC(Ref), we can conclude that bias is significantly different from zero with a type I error of 5%.
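The estimation error of Eq. (8) and the summaries of Eq. (9) can be sketched as below. Function names are hypothetical; missing estimates are represented by `None`, and the standard error of the mean uses the sample standard deviation with M − 1 in the denominator, which is an assumption since the text does not state it.

```python
import math

def estimation_error(estimates, true_values):
    """Eq. (8): mean of log estimates (missing ones dropped) minus mean of log true values."""
    est = [math.log(x) for x in estimates if x is not None]   # N* available estimates
    tru = [math.log(x) for x in true_values]                  # all N true values
    return sum(est) / len(est) - sum(tru) / len(tru)

def bias_rmse_ci(errors):
    """Eq. (9) plus the 95% confidence interval of the bias (Gaussian quantile)."""
    m = len(errors)
    bias = sum(errors) / m
    rmse = math.sqrt(sum(e * e for e in errors) / m)
    sd = math.sqrt(sum((e - bias) ** 2 for e in errors) / (m - 1))
    half_width = 1.959964 * sd / math.sqrt(m)   # 97.5% quantile of the Gaussian
    return bias, rmse, (bias - half_width, bias + half_width)

# Illustrative values only.
ee = estimation_error([math.e, None, math.e], [1.0, 1.0, 1.0])
bias, rmse, ci = bias_rmse_ci([0.1, -0.1, 0.1, -0.1])
bias_is_significant = not (ci[0] <= 0.0 <= ci[1])
```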

2.3. Bioequivalence test

2.3.1. Implementation of the two one-sided tests

We perform the standard bioequivalence analysis recommended by FDA and EMEA (1, 2). The individual parameters are log-transformed and analyzed using a linear mixed effects model written as follows:

$$\log(\theta_{ikl}) = \nu_{l} + \beta_{T,l} T_{ik} + \beta_{P,l} P_{k} + \beta_{S,l} S_{i} + \xi_{il} + \varepsilon_{ikl} \tag{10}$$

where θikl represents the lth individual parameter (AUC if l = 1 or Cmax if l = 2) for subject i (i = 1, · · ·, N) at period k (k = 1, 2). νl is the mean value for the studied log-transformed metric. The three covariates Tik, Pk and Si, for treatment, period and sequence are defined as before. It is assumed that the random subject effect ξil (l = 1,2) and the residual error εikl (l = 1,2) are independently normally distributed with zero mean.

For each simulation setting, the individual estimates AUC^i and Cmax^i obtained from NCA and NLMEM are analyzed by the LMEM described above. To check the properties of the TOST, we also analyze the true simulated values AUCi and Cmaxi. As specified before, for AUC estimated by NCA, there may be missing AUC^i. In that case, the LMEM is fitted to fewer than 2N AUC^i.

After fitting the LMEM to individual metrics, a bioequivalence test is performed on the estimate of treatment effect β̂T,l. The null hypothesis of the bioequivalence test recommended by the guidelines (1, 2) and performed on the lth individual parameter is H0: {βT,l ≤ log(0.8) or βT,l ≥ log(1.25)}. H0 is rejected if the 90% confidence interval (90% CI) of β̂T,l lies within [log(0.8); log(1.25)]. These limits of the bioequivalence test correspond to a ratio of the geometric mean falling within 80%–125%. This approach based on the 90% CI is equivalent to Schuirmann’s two one-sided tests (TOST) procedure (26). H0 is composed of two unilateral hypotheses {βT,l ≤ log(0.8)} and {βT,l ≥ log(1.25)}. Both are tested separately by a one-sided test with a type I error of 5%. The p-value of the TOST is the maximum of both p-values of the one-sided tests and for each test the limit is the 95% quantile of the Student distribution with df degrees of freedom.

For balanced datasets, the N/2 subjects of each sequence are considered as two independent samples from normal populations with equal variances, and df = N − 2 (15, 27). For unbalanced datasets, i.e. when there is one or more missing AUC^i in a dataset for NCA, the determination of the degrees of freedom is more complex. Different approximations are available, for example the containment method (28), the Kenward-Roger adjustment (29) or the Satterthwaite approximation (28, 29). In this study, we use the R function lme from the package nlme to perform the LMEM, in which the degrees of freedom are estimated using the containment method (17). There, the degrees of freedom are calculated as df = nobs − N − 2, where nobs is the total number of individual parameters. When there is no missing value, this approach coincides with the degrees of freedom computed in balanced datasets (because then nobs = 2N).
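The decision rule of the TOST and the degrees of freedom used here can be sketched as follows. This is not a substitute for fitting the LMEM: the treatment effect estimate and its standard error are assumed to come from the fitted model, the function names are hypothetical, and the Student quantile is passed in by the caller (1.812 is the tabulated 95% quantile for 10 degrees of freedom).

```python
import math

LOWER, UPPER = math.log(0.8), math.log(1.25)

def containment_df(n_obs, n_subjects):
    """Degrees of freedom df = n_obs - N - 2; equals N - 2 when n_obs = 2N."""
    return n_obs - n_subjects - 2

def tost_rejects_h0(beta_hat, std_error, t_quantile_95):
    """Bioequivalence is concluded when the 90% CI lies within [log 0.8; log 1.25]."""
    half_width = t_quantile_95 * std_error
    return LOWER <= beta_hat - half_width and beta_hat + half_width <= UPPER

# Original design, no missing value: N = 12 subjects, 24 individual parameters.
df_balanced = containment_df(24, 12)
# One missing AUC estimate removes one observation.
df_one_missing = containment_df(23, 12)

# Hypothetical estimate and standard error, for illustration only.
concluded = tost_rejects_h0(0.0, 0.05, 1.812)
not_concluded = tost_rejects_h0(math.log(0.8), 0.05, 1.812)
```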

2.3.2. Evaluation of the type I error

Bioequivalence tests are evaluated for AUC^i and Cmaxi^ estimated by NCA or NLMEM on trials simulated under the composite null hypothesis H0. Bioequivalence tests are also performed on the true simulated values AUCi and Cmaxi. The type I error of the TOST procedure is defined as the supremum of the type I errors over the null space (30). It corresponds to the supremum of the type I error of the two one-sided tests. As suggested by Liu and Weng (31), the type I error of the bioequivalence test can be evaluated for each boundary of H0 space, i.e. log(0.8) and log(1.25). Consequently, we simulate for each design of each variability setting 1000 trials under each unilateral hypothesis H0;80% and H0;125% as specified before.

For each unilateral hypothesis H0;80% and H0;125%, the type I error is estimated by the proportion of the simulated trials for which the null hypothesis H0 is rejected. If the bioequivalence tests were performed on the true parameters (AUCi and Cmaxi), the results of both type I errors should be identical because H0;80% and H0;125% are symmetric, but we are working with estimates. As proposed by Panhard and Mentré (14), we define the global type I error as the maximum value of both type I errors estimated. Due to the 1000 replicates, the 95% prediction interval (95% PI) for a type I error of 5% is [3.7%; 6.4%].
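The estimated type I error and its prediction interval can be sketched as below. The helper names are hypothetical, and the interval is computed here from the 2.5% and 97.5% quantiles of a binomial distribution with 1000 trials and probability 0.05, which is an assumption about how the quoted bounds were obtained.

```python
import math

def type_one_error(rejections):
    """Proportion of simulated trials in which H0 is rejected (list of booleans)."""
    return sum(1 for r in rejections if r) / len(rejections)

def global_type_one_error(rejections_80, rejections_125):
    """Maximum of the two type I errors estimated under H0;80% and H0;125%."""
    return max(type_one_error(rejections_80), type_one_error(rejections_125))

def binomial_quantile(level, n, prob):
    """Smallest k such that P(X <= k) >= level for X ~ Binomial(n, prob)."""
    cumulative = 0.0
    for k in range(n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                   + k * math.log(prob) + (n - k) * math.log(1.0 - prob))
        cumulative += math.exp(log_pmf)
        if cumulative >= level:
            return k
    return n

pi_lower = binomial_quantile(0.025, 1000, 0.05) / 1000
pi_upper = binomial_quantile(0.975, 1000, 0.05) / 1000
```

An estimated type I error falling outside this interval indicates a departure from the nominal 5% level that is unlikely to be due to the finite number of replicates alone.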

2.3.3. Shrinkage and tests based on empirical Bayes estimates

It is known in NLMEM that, with sparse individual information, the individual estimates of random effects shrink towards their mean value which is zero (32). For the reference treatment group of each simulated trial, the shrinkage on the lth individual EBE (ka, CL/F or V/F) can be defined as:

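$$Sh_l^{(Ref)} = 1 - \frac{\mathrm{var}\left(\hat{\eta}_{il}^{(Ref)}\right)}{\hat{\omega}_l^{(Ref)\,2}} \qquad (11)$$

where $\mathrm{var}(\hat{\eta}_{il}^{(Ref)})$ is the empirical variance of the lth individual estimated random effects and $\hat{\omega}_l^{(Ref)\,2}$ is the estimated variance of the corresponding random effects.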

AUC and Cmax are secondary parameters of the NLMEM because they are defined as functions of the PK parameters ka, CL/F and V/F. As for the individual EBE, the shrinkage on log(AUC) and log(Cmax) can also be computed. Consequently, we can study the link between the type I error of bioequivalence tests based on EBE and the amount of shrinkage.

For log(AUC), Eq.(11) can be expressed as:

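$$Sh_{AUC}^{(Ref)} = 1 - \frac{\mathrm{var}\left(\log(\widehat{AUC}_i^{(Ref)})\right)}{\hat{\omega}_{AUC}^{(Ref)\,2}} \qquad (12)$$

where $\mathrm{var}(\log(\widehat{AUC}_i^{(Ref)}))$ is the empirical variance of the individual estimates $\log(\widehat{AUC}_i^{(Ref)})$ and $\hat{\omega}_{AUC}^{(Ref)\,2}$ is its estimated variance in the model. As log(AUC) = log(D) − log(CL/F), $\omega_{AUC}^{(Ref)\,2} = \omega_{CL/F}^{(Ref)\,2}$ and $\hat{\omega}_{AUC}^{(Ref)\,2}$ is the estimated value $\hat{\omega}_{CL/F}^{(Ref)\,2}$.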

For one simulation setting, we call $Sh_{AUC,m}^{(Ref)}$ the shrinkage on log(AUC) computed for the reference treatment for the mth simulated trial (m = 1, …, 1000). To summarize the 1000 values of $Sh_{AUC,m}^{(Ref)}$ for each simulation setting, we compute their median.
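The shrinkage computation and its summary over trials can be sketched in Python as below. This is a minimal illustration, not the code used in the study: the function names are ours, and we assume that the "empirical variance" is the sample variance with an N − 1 denominator, which the text does not specify.

```python
import math
import statistics


def shrinkage(individual_estimates, omega2_hat: float) -> float:
    """Shrinkage as in Eq. (11) and (12): 1 - empirical variance / model variance.

    Assumption: the empirical variance uses the N - 1 denominator.
    """
    return 1.0 - statistics.variance(individual_estimates) / omega2_hat


def shrinkage_log_auc(cl_f_estimates, dose: float, omega2_cl_f_hat: float) -> float:
    """Shrinkage on log(AUC), using log(AUC) = log(D) - log(CL/F).

    The dose is a constant shift on the log scale, so the model variance of
    log(AUC) is the estimated variance of the random effect on CL/F.
    """
    log_auc = [math.log(dose) - math.log(cl_f) for cl_f in cl_f_estimates]
    return shrinkage(log_auc, omega2_cl_f_hat)


def median_shrinkage(per_trial_shrinkage) -> float:
    """Summary over the simulated trials of one simulation setting."""
    return statistics.median(per_trial_shrinkage)
```

A value near 0 means that the individual estimates retain nearly all of the between-subject variance estimated by the model; a value near 1 means that they have collapsed towards the population mean.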

Eq.(12) can be applied to log(Cmax); $\mathrm{var}(\log(\hat{C}_{max,i}^{(Ref)}))$ is computed from the individual estimates as for AUC. As the definition of Cmax given in Eq.(6) is complex, the variance of log(Cmax) for the reference treatment, $\omega_{C_{max}}^{(Ref)\,2}$, cannot be computed directly from $\omega_{k_a}^{(Ref)\,2}$, $\omega_{CL/F}^{(Ref)\,2}$ and $\omega_{V/F}^{(Ref)\,2}$. It must be approximated, for instance using the delta method (33). The expression and details are given in the Appendix. As for AUC, the median shrinkage over the 1000 values of $Sh_{C_{max},m}^{(Ref)}$ is computed for each simulation setting.

3. Results

3.1. Simulated data and missing values

As explained in section 2.1.4, if the simulated concentration is below zero, it is fixed to 0.1 mg/L. As expected, the proportion of these fixed concentrations differs from one variability setting to another and from one design to another, except for the original and rich designs, where the sampling times are similar. The maximal proportion is rather small: 0.03% for Sl,l, 1.6% for Sh,l and 8.5% for Sh,h. For Sl,l, all fixed concentrations correspond to the last sampling time, which is 24 h for all designs. For Sh,l, fixed concentrations occur at different sampling times, but those at 24 h predominate and account for at least 90% of them. For Sh,h, fixed concentrations correspond mostly to 24 h (54%) and then mainly to 0.25 h (20%) and 12 h (19%).

Over all the simulations, some AUC^i estimated by NCA are missing due to particular individual PK profiles (see section 2.2.2). The proportion of missing AUC^i is similar under both hypotheses and remains low for the four designs of Sl,l and Sh,l. For both variability settings, the maximal proportion corresponds to the intermediate design (N = 24, n = 5) with 0.02% and 3.3% for Sl,l and Sh,l, respectively. This proportion is 25% for Sh,h. Among the missing AUC^i of Sh,h, 12% are due to concentrations fixed to 0.1 mg/L, i.e. due to two similar last concentrations. Other missing AUC^i are due to the two last concentrations increasing instead of decreasing. As expected, there is no simulated trial where all AUC^i for both treatment groups are missing. In other words, the estimation error for the sample mean of log(AUC) or log(Cmax) is computed on the 1000 simulated trials for each simulation setting, and the type I errors of the bioequivalence tests are estimated on 1000 replicates for AUC and Cmax for both hypotheses H0;80% and H0;125%.
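To illustrate why such profiles yield a missing AUC, here is a hedged Python sketch of an NCA computation. It assumes a linear trapezoidal rule up to the last sample and a terminal slope taken from the last two concentrations only; the exact rule used in the study is described in section 2.2.2 and may differ. The function name and the example profiles are hypothetical.

```python
import math


def nca_auc_to_infinity(times, concs):
    """AUC from the first sample to infinity, or None when it cannot be extrapolated.

    Assumption: the terminal slope lambda_z is estimated from the last two
    concentrations on the log scale.
    """
    # Linear trapezoidal rule between consecutive sampling times
    auc_last = sum((t1 - t0) * (c0 + c1) / 2.0
                   for t0, t1, c0, c1 in zip(times, times[1:], concs, concs[1:]))

    c_prev, c_last = concs[-2], concs[-1]
    # Equal or increasing last concentrations give a terminal slope that is
    # zero or has the wrong sign, so the extrapolated part is undefined.
    if c_prev <= 0.0 or c_last <= 0.0 or c_last >= c_prev:
        return None

    lambda_z = (math.log(c_prev) - math.log(c_last)) / (times[-1] - times[-2])
    return auc_last + c_last / lambda_z


# Hypothetical profiles (time in h, concentration in mg/L)
regular = nca_auc_to_infinity([0, 1, 2, 4], [0.0, 4.0, 2.0, 1.0])
two_fixed_values = nca_auc_to_infinity([0, 2, 12, 24], [0.0, 3.0, 0.1, 0.1])  # None
increasing_tail = nca_auc_to_infinity([0, 2, 12, 24], [0.0, 3.0, 0.4, 0.6])   # None
```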

3.2. Evaluation of estimates of sample means

Figure 3 displays the bias (top) and RMSE (bottom) on sample mean estimates for log(AUC) (left) and log(Cmax) (right) estimated for the reference treatment. Results are similar for both treatment groups (Ref and Test) and both unilateral hypotheses (results not shown). The 95% confidence interval of the bias is not shown in Figure 3 because this interval is tighter than the width of the displayed symbol and all biases are significantly different from zero. There is more bias and larger RMSE for NCA than for EBE for all designs and all variability settings. Note that biases and RMSE are computed on the log scale, so that, for instance, a value of 0.038 corresponds approximately to an error of 3.8% on the ordinary scale for the geometric mean. For NCA estimates, the bias and RMSE increase when the number of samples per subject decreases and are lower for Sl,l compared to Sh,l. For the intermediate design (N = 24, n = 5), the bias on the sample mean of log(AUC) is 0.038, 0.094 and 0.15 for Sl,l, Sh,l and Sh,h, respectively; RMSE is 0.044, 0.12 and 0.21, respectively.
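The log-scale reading of these numbers can be made concrete with a short Python sketch. It uses the usual definitions of bias (mean error) and RMSE (root of the mean squared error) across simulated trials, which we assume match those of the study, and converts a log-scale error x into the exact relative error exp(x) − 1 on the geometric mean. Function names are illustrative.

```python
import math


def bias_and_rmse(estimates, true_values):
    """Mean error and root mean square error over simulated trials."""
    errors = [est - true for est, true in zip(estimates, true_values)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse


def log_error_to_percent(log_error: float) -> float:
    """Relative error (%) on the geometric mean for an error on the log scale."""
    return 100.0 * math.expm1(log_error)


small = log_error_to_percent(0.038)  # close to 3.8%, slightly above it
large = log_error_to_percent(0.15)   # the linear approximation is looser here
```

For a bias of 0.038 the exact conversion gives about 3.9%, so reading the log-scale value directly as a percentage is accurate; for 0.15 it gives about 16.2%, so the approximation degrades as the bias grows.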

Figure 3.


Bias (top) and root mean square error (RMSE, bottom) of estimates of the sample mean for log(AUC) (left) and log(Cmax) (right) for the reference treatment from 1000 trials for different designs (N: number of subjects, n: number of samples per subject) and different variability settings Sl,l (○), Sh,l (□) and Sh,h (△). The white symbols represent the individual estimates obtained from NCA and the grey ones the individual estimates obtained from EBE.

For individual estimates based on EBE, the bias is small (less than 0.02 in absolute value) for both parameters (log(AUC) and log(Cmax)), all designs and all variability settings, whereas RMSE increases when the number of samples per subject decreases and is mostly lower for Sl,l compared to Sh,l. For instance, for the intermediate design, the bias on the sample mean of log(AUC) is −0.0096, −0.016 and −0.010 for Sl,l, Sh,l and Sh,h, respectively; RMSE is 0.019, 0.031 and 0.10, respectively.

3.3. Bioequivalence test

Table II and Figure 4 provide the results of the type I error of bioequivalence tests performed on the treatment effect of log(AUC) and log(Cmax). Table II contains the estimated type I error for each unilateral hypothesis, each design of each variability setting, for the true simulated values and both types of estimates (NCA and EBE). Figure 4 represents the global type I error for log(AUC) (top) and log(Cmax) (bottom) versus the design for each variability setting and both types of estimates. The global type I error is defined as the supremum of both estimated type I errors.

Table II.

Type I error of the bioequivalence tests performed on the treatment effect of log(AUC) and log(Cmax) for each unilateral hypothesis, H0;80% and H0;125%. The type I error is estimated from 1000 bioequivalence trials simulated under H0;80% or H0;125% for different designs (N: number of subjects, n: number of samples per subject), different variability settings Sl,l, Sh,l and Sh,h, for the true simulated values (SIM) and both types of estimates (NCA and EBE). Due to the 1000 replicates, the 95% PI for a type I error of 5% is [3.7%; 6.4%].

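| Setting | Parameter | Hypothesis | N = 40, n = 10: SIM | N = 40, n = 10: NCA | N = 40, n = 10: EBE | N = 12, n = 10: SIM | N = 12, n = 10: NCA | N = 12, n = 10: EBE | N = 24, n = 5: SIM | N = 24, n = 5: NCA | N = 24, n = 5: EBE | N = 40, n = 3: SIM | N = 40, n = 3: NCA | N = 40, n = 3: EBE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sl,l | AUC | H0;80% | 3.9 | 4.0 | 5.5 | 5.4 | 5.2 | 7.7 | 4.3 | 4.3 | 8.0 | 3.9 | 5.9 | 14.8 |
| Sl,l | AUC | H0;125% | 4.6 | 5.1 | 5.8 | 5.4 | 5.2 | 7.4 | 4.4 | 3.8 | 7.5 | 4.6 | 5.1 | 16.2 |
| Sl,l | Cmax | H0;80% | 4.5 | 6.6 | 10.0 | 5.7 | 5.1 | 9.0 | 5.8 | 5.3 | 14.6 | 4.5 | 6.8 | 30.6 |
| Sl,l | Cmax | H0;125% | 4.9 | 6.3 | 9.1 | 5.2 | 5.6 | 10.9 | 5.3 | 5.2 | 16.2 | 4.9 | 5.5 | 29.1 |
| Sh,l | AUC | H0;80% | 3.9 | 5.4 | 4.7 | 5.4 | 4.4 | 6.8 | 4.3 | 5.2 | 7.1 | 3.9 | 4.5 | 8.5 |
| Sh,l | AUC | H0;125% | 4.6 | 6.1 | 5.2 | 5.4 | 4.7 | 6.1 | 4.4 | 3.9 | 5.8 | 4.6 | 5.1 | 11.5 |
| Sh,l | Cmax | H0;80% | 4.5 | 5.1 | 4.0 | 5.3 | 5.3 | 5.3 | 5.5 | 6.0 | 6.5 | 4.5 | 7.2 | 9.2 |
| Sh,l | Cmax | H0;125% | 5.0 | 5.4 | 5.0 | 5.2 | 5.1 | 5.8 | 5.7 | 6.1 | 7.1 | 5.0 | 6.2 | 7.8 |
| Sh,h | AUC | H0;80% | | | | | | | 4.3 | 0.8 | 20.6 | | | |
| Sh,h | AUC | H0;125% | | | | | | | 4.4 | 0.4 | 22.2 | | | |
| Sh,h | Cmax | H0;80% | | | | | | | 5.5 | 7.0 | 13.8 | | | |
| Sh,h | Cmax | H0;125% | | | | | | | 5.7 | 9.3 | 17.0 | | | |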

Figure 4.


Global type I error of the bioequivalence tests performed on the treatment effect of log(AUC) (top) and log(Cmax) (bottom). The global type I error is estimated from 1000 bioequivalence trials simulated under H0;80% and H0;125% for different designs (N: number of subjects, n: number of samples per subject) and different variability settings Sl,l (○), Sh,l (□) and Sh,h (△). The white symbols represent the individual estimates obtained from NCA and the grey ones the individual estimates obtained from EBE. The dashed lines represent the nominal level at 5% and its 95% prediction interval ([3.7%; 6.4%]).

For the bioequivalence test performed on the true simulated values, the type I errors for all designs, all variability settings and both null hypotheses lie within the 95% PI of the nominal level, showing the good performance of the TOST. In most cases, for a given type of estimate (NCA or EBE) and a given design and variability setting, the type I errors under both hypotheses are close.

For log(AUC), the global type I error of the test based on NCA estimates lies within the 95% PI of the nominal level for the four designs of Sl,l and Sh,l, and it is much too conservative for Sh,h. For instance, for the intermediate design, the global type I error is 4.3%, 5.2% and 0.8% for Sl,l, Sh,l and Sh,h, respectively. For Cmax, the test based on NCA estimates has a correct global type I error for the original and intermediate designs simulated with Sl,l and Sh,l. The global type I error is above the 95% PI for the sparse design (N = 40, n = 3) simulated with Sl,l and Sh,l and for the intermediate design simulated with Sh,h.

Surprisingly, tests based on EBE often lead to an increased type I error, especially for the sparse design. For AUC, the global type I error remains at the nominal level for the rich design (N = 40, n = 10). For Cmax, the global type I error lies within the 95% PI for the rich and the original designs simulated with Sh,l (see Table II). The global type I error increases when the number of samples per subject decreases and is lower for Sh,l compared to Sl,l and Sh,h. Most of the type I errors are below 10% for Sl,l and Sh,l. For AUC and the intermediate design, the global type I error is 8.0%, 7.1% and 22.2% for Sl,l, Sh,l and Sh,h, respectively.

Figure 5 represents the global type I errors of bioequivalence tests for the treatment effect on log(AUC) (top) and log(Cmax) (bottom) obtained from NLMEM versus the median shrinkage on the corresponding parameter for the reference treatment. The distribution of the shrinkage is similar for both treatments (Ref and Test) and both unilateral hypotheses (results not shown). For both parameters, the median shrinkage is lower for Sh,l than for Sl,l. For log(AUC), the median shrinkage is also higher for Sh,h than for Sh,l. There is a clear relationship between the inflation of the global type I error and the amount of shrinkage, with a type I error greater than 15% for shrinkage greater than 20%.

Figure 5.


Global type I error of the bioequivalence tests performed on the treatment effect of log(AUC) (top) and log(Cmax) (bottom) versus the median shrinkage on the parameter of interest for the reference treatment and different simulation settings Sl,l (○), Sh,l (□) and Sh,h (△). The rich design (N = 40, n = 10) is represented by white symbols, the original design (N = 12, n = 10) by light grey symbols, the intermediate design (N = 24, n = 5) by dark grey symbols and the sparse design (N = 40, n = 3) by black symbols. The dashed lines represent the nominal level at 5% and its 95% prediction interval ([3.7%; 6.4%]).

4. Discussion

In this study, we compare the standard bioequivalence analysis performed on individual estimates of AUC and Cmax obtained by NCA to the same bioequivalence analysis performed on individual EBE obtained by NLMEM. To do so, we perform a simulation study with different designs and different levels of variability. The estimation of parameters and the type I error are evaluated for both types of estimates.

Compared with the simulation study of Panhard and Mentré (14), we use the bioequivalence analysis recommended in the guidelines (1, 2) and we study both parameters (AUC and Cmax). In addition, the simulation study of Panhard and Mentré was performed using the FOCE algorithm implemented in the R function nlme. The FOCE algorithm is widely used to perform population PK analyses but, in simulation studies comparing the available algorithms, stochastic EM algorithms (like the SAEM algorithm) obtained the best results for accuracy and precision of estimates (34, 35).

Like Panhard and Mentré, we simulate under both null hypotheses assuming a modification in the bioavailability F, i.e. assuming the same modification for CL/F and V/F, which also affects both tested parameters, AUC and Cmax, in the same way. Consequently, the number of simulations is reduced because the unilateral hypothesis H0;80% (H0;125% respectively) for AUC corresponds to the unilateral hypothesis H0;80% (H0;125% respectively) for Cmax; the same set of simulations is used for both parameters. However, other choices may be suitable, as any PK parameter is likely to change between two formulations of the same drug. For instance, a change in the elimination rate CL/V due to an interaction with an excipient could be possible (36). Furthermore, we study only a one-compartment model; we do not simulate multi-compartmental models. For both types of estimates (NCA and EBE), we perform bioequivalence tests on AUC and Cmax. Even with a multi-compartmental model, PK parameters would be summarized by these two endpoints, even though the relationship between Cmax and the PK parameters could be more complicated than for a one-compartment model. As shown in Figure 5, the increase of the type I error of bioequivalence tests based on EBE is linked to the shrinkage, which already appears with a one-compartment model. We think this relationship should be similar for multi-compartmental models, where more shrinkage is expected.

In contrast to the bias for estimates based on EBE, the bias for estimates based on NCA depends on the number of samples per subject and is large for the sparse design (N = 40, n = 3) with high variability. Usually, NCA is used with rich designs where there are about ten to twenty samples per subject. This method is not well suited for trials performed in patients, where the number of samples is often limited. In comparison to model-based approaches, the estimation of parameters through NCA has several drawbacks. It gives equal weight to all concentrations without taking into account the measurement error. Furthermore, NCA is sensitive to missing data, especially for the determination of Cmax and the computation of the terminal slope. Even without missing data, the extrapolation of the AUC between the last sampling time and infinity is very sensitive to the number of samples used to compute the terminal slope and could be problematic for atypical concentration profiles. This latter issue is well illustrated by the simulation settings under Sh,h, where 77% of the missing AUC^i are due to the two last concentrations increasing instead of decreasing. Contrary to NCA estimates, there is no missing AUC^i estimated by NLMEM due to this kind of PK profile, because all subjects are analyzed together and the information given by classical PK profiles offsets the information given by atypical ones. NCA does not take into account the knowledge accumulated on the PK of the studied drug, as each new analysis by NCA starts from scratch, contrary to NLMEM. Finally, although we do not simulate such data, NCA applied to nonlinear pharmacokinetics provides meaningless parameters and it cannot handle data below the limit of quantification (LOQ). In this study, we choose not to introduce an LOQ in the simulation because we do not want to favour the SAEM algorithm, which can fit such data. We are aware that fixing some concentrations to 0.1 mg/L could introduce some bias. To avoid such arbitrary fixing, another common procedure is to resample until a valid value is obtained; however, resampling can also introduce a bias. In any case, the proportion of fixed values remains very low for Sl,l and Sh,l. It is higher for Sh,h, but it is responsible for only 12% of the missing AUC^i estimated by NCA.

When the number of samples per subject is large and the variability is not too high, tests based on individual NCA estimates remain a good approach since they are simple and show satisfactory properties for both tested parameters. For Cmax and the sparse design, we expected an increase of the type I error because there is no sampling time corresponding to the maximal concentration, which is close to 2 h. But even with poor sample mean estimates, the type I error remains close to the nominal level of 5% (global type I error of 6.8% for Sl,l and 7.2% for Sh,l). However, for simulations with Sh,h, the type I error for AUC is very conservative (0.8%), which shows the limits of NCA for data with high residual error.

Tests based on individual EBE have a higher type I error than tests based on NCA estimates. Our results on the type I error for Sl,l are consistent with the results obtained by Panhard and Mentré with the same variability setting. For the sparse design, the type I error of tests based on EBE is surprisingly high. In that case, EBE shrink towards their mean value and are more similar in both treatment groups. Therefore, the discrimination of AUC or Cmax between the two treatment groups is more difficult, which leads to an increase of the type I error (bioequivalence is concluded more easily). These results are consistent with the results of the simulation study performed by Bertrand et al. (37). In that work, they evaluate by simulation the analysis of variance (ANOVA) performed on individual EBE to test the influence of a single nucleotide polymorphism on a pharmacokinetic parameter of a drug. They show the impact of the shrinkage on the power of ANOVA: the power is reduced when the shrinkage increases. In other words, it is more difficult to discriminate between the genotypes with high shrinkage, even when data are simulated with a difference.

As discussed by Schuirmann (26), the TOST procedure can be very conservative for highly variable drugs. Consequently, several improvements of this procedure have been proposed, for example by Berger et al. (30), Brown et al. (38) or Cao et al. (39), to mention only a few. We are aware that there is still considerable debate about which bioequivalence test should be performed. However, we study only the classical TOST in this paper because our main objective is to compare the same standard bioequivalence analysis recommended in the guidelines (1, 2) and performed on individual estimates obtained by two estimation methods (NCA and EBE). Nevertheless, in this simulation study, the type I error of the bioequivalence test performed on the true individual simulated values is always at the nominal level of 5%, even for Sh,h where the variability is particularly high. Therefore, we can conclude that, in this study, the TOST procedure itself raises no issue. Consequently, liberal or conservative type I errors of bioequivalence tests performed on estimates cannot be attributed to the TOST but rather to the estimation of the individual parameters.

Tests based on individual estimates, whether NCA estimates or EBE, cannot be used for data with high residual error or when the number of samples per subject is small. In those cases, the type I error for tests based on NCA estimates is very poor or NCA estimates are biased, and the shrinkage of EBE induces an increase of the type I error. In these situations, other tests based on a global analysis of all data should be considered. Panhard et al. already developed a global bioequivalence Wald test based on NLMEM (14, 25). This test is directly performed on the treatment effect parameter after fitting together the data of both treatment groups with the estimation of within-subject variability. In those studies, they also used the FOCE algorithm implemented in nlme. Recently, Panhard and Samson developed an extension of the SAEM algorithm for NLMEM including the estimation of the within-subject variability (40). However, the likelihood ratio test for bioequivalence has not been developed, due to the composite null hypothesis. Additional methodological developments and simulations are needed to study bioequivalence tests after global analysis of all PK data. This will be especially useful for drugs with nonlinear pharmacokinetics and for conditions where rich sampling is difficult to achieve, e.g. in pediatric studies or for drugs which cannot be administered to healthy subjects for safety reasons, such as oncology drugs.

Acknowledgments

We would like to thank the Modeling and Simulations group at Novartis Pharma AG, Basel, which supported Anne Dubois with a grant during this work.

Appendix

Approximation of the variance of log(Cmax) by the delta method

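For a one compartment model with first order absorption and first order elimination, Cmax is defined in Eq.(6) as a function of the three PK parameters, ka, CL/F and V/F. The variance of log(Cmax), $\omega_{C_{max}}^2$, is approximated by the delta method (33) as:

$$\omega_{C_{max}}^2 \approx \left(\frac{\partial \log(C_{max})}{\partial \log(k_a)}\right)_{\log(\mu)}^{2} \omega_{k_a}^2 + \left(\frac{\partial \log(C_{max})}{\partial \log(CL/F)}\right)_{\log(\mu)}^{2} \omega_{CL/F}^2 + \left(\frac{\partial \log(C_{max})}{\partial \log(V/F)}\right)_{\log(\mu)}^{2} \omega_{V/F}^2 \qquad (13)$$

where $\log(\mu) = (\log(\mu_{k_a}), \log(\mu_{CL/F}), \log(\mu_{V/F}))'$. After computing the derivatives, $\omega_{C_{max}}^2$ can be approximated by:

$$\omega_{C_{max}}^2 \approx \Delta^2 \left(\omega_{k_a}^2 + \omega_{CL/F}^2\right) + (\Delta - 1)^2 \omega_{V/F}^2 \quad \text{with} \quad \Delta = \frac{\mu_{CL/F}\left(\mu_{CL/F} - \mu_{k_a}\mu_{V/F}\right) + \mu_{k_a}\mu_{CL/F}\mu_{V/F} \log\left(\frac{\mu_{k_a}\mu_{V/F}}{\mu_{CL/F}}\right)}{\left(\mu_{k_a}\mu_{V/F} - \mu_{CL/F}\right)^2} \qquad (14)$$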

In this simulation study, the general formula above is applied to approximate the variance of log(Cmax) for both treatment groups (Ref and Test). Given the treatment effect we simulate for the treatment Test, both approximations, $\omega_{C_{max}}^{(Ref)\,2}$ and $\omega_{C_{max}}^{(Test)\,2}$, are equal.
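The delta-method approximation can be checked numerically with the Python sketch below. It assumes the standard closed form of Cmax for a one-compartment model with first-order absorption and elimination (Eq. 6 is not reproduced in this section) and independent random effects, consistent with the absence of covariance terms in the approximation. The parameter values are hypothetical and are not those of the study; the function names are ours.

```python
import math


def log_cmax(log_ka: float, log_cl: float, log_v: float, dose: float = 1.0) -> float:
    """log(Cmax) for a one-compartment model, first-order absorption and elimination.

    Assumes ka differs from k = CL/V (the formula is singular when they are equal).
    """
    ka, cl, v = math.exp(log_ka), math.exp(log_cl), math.exp(log_v)
    k = cl / v                                   # elimination rate constant
    tmax = math.log(ka / k) / (ka - k)           # time of maximal concentration
    cmax = dose * ka / (v * (ka - k)) * (math.exp(-k * tmax) - math.exp(-ka * tmax))
    return math.log(cmax)


def delta(mu_ka: float, mu_cl: float, mu_v: float) -> float:
    """Derivative of log(Cmax) with respect to log(ka) at the population means."""
    num = (mu_cl * (mu_cl - mu_ka * mu_v)
           + mu_ka * mu_cl * mu_v * math.log(mu_ka * mu_v / mu_cl))
    return num / (mu_ka * mu_v - mu_cl) ** 2


def var_log_cmax(mu_ka, mu_cl, mu_v, w2_ka, w2_cl, w2_v) -> float:
    """Delta-method variance of log(Cmax), assuming independent random effects."""
    d = delta(mu_ka, mu_cl, mu_v)
    return d ** 2 * (w2_ka + w2_cl) + (d - 1.0) ** 2 * w2_v


# Hypothetical population means (ka in 1/h, CL/F in L/h, V/F in L)
mu_ka, mu_cl, mu_v = 1.5, 5.0, 20.0
h = 1e-5
la, lc, lv = math.log(mu_ka), math.log(mu_cl), math.log(mu_v)

# Central finite differences of log(Cmax) on the log-parameter scale
d_ka = (log_cmax(la + h, lc, lv) - log_cmax(la - h, lc, lv)) / (2 * h)
d_cl = (log_cmax(la, lc + h, lv) - log_cmax(la, lc - h, lv)) / (2 * h)
d_v = (log_cmax(la, lc, lv + h) - log_cmax(la, lc, lv - h)) / (2 * h)
```

Under these assumptions, `d_ka`, `d_cl` and `d_v` should agree with Δ, −Δ and Δ − 1 respectively, which is why the first two variances share the coefficient Δ² in the approximation.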

To approximate the variance of log(Cmax) by the delta method, we use the true simulated values of $\mu^{(Ref)}$ and $\Omega^{(Ref)}$ described in section 2.2.3. To evaluate the delta method, we also estimate the variance of log(Cmax), using the simulated parameter values of the rich design (N = 40, n = 10) for the reference treatment, under Sl,l and Sh,l. For both variability settings, $\omega_{C_{max}}^{(Ref)\,2}$ is estimated as the empirical variance of the 40000 true simulated values of $\log(C_{max,i}^{(Ref)})$. For Sl,l, the standard deviation of log(Cmax) for the reference treatment expressed in percent is 10.5% both by simulation and by the delta method. For Sh,l, it is 46.3% and 46.7% by simulation and the delta method, respectively.

These results on the true simulated values validate the approximation of the variance of log(Cmax) by the delta method. Consequently, we apply it to the data of each treatment group for each simulated trial of the simulation study to approximate $\hat{\omega}_{C_{max}}^{(Ref)\,2}$ ($\hat{\omega}_{C_{max}}^{(Test)\,2}$ respectively) using $\hat{\mu}^{(Ref)}$ ($\hat{\mu}^{(Test)}$ respectively) and $\hat{\Omega}^{(Ref)}$ ($\hat{\Omega}^{(Test)}$ respectively).

References

  • 1. FDA. Technical report. FDA; 2001. Guidance for Industry - Statistical Approaches to establishing bioequivalence.
  • 2. EMEA. Technical report. EMEA; 2001. Note for guidance on the investigation of bioavailability and bioequivalence.
  • 3. Gabrielson J, Weiner D. Pharmacokinetic and pharmacodynamic data analysis: concepts and applications. Apotekarsocieteten; Stockholm: 2006.
  • 4. Jusko WJ, Koup JR, Alván G. Nonlinear assessment of phenytoin bioavailability. Journal of Pharmacokinetics and Biopharmaceutics. 1976;4:327–336. doi: 10.1007/BF01063122.
  • 5. Hayashi N, Aso H, Higashida M, Kinoshita H, Ohdo S, Yukawa E, Hiquchi S. Estimation of rhG-CSF absorption kinetics after subcutaneous administration using a modified Wagner-Nelson method with a nonlinear elimination model. European Journal of Pharmaceutical Sciences. 2001;13:151–158. doi: 10.1016/s0928-0987(00)00219-0.
  • 6. EMEA. Technical report. EMEA; 2006. Guideline on similar biological medicinal products containing biotechnology-derived proteins as active substance: non-clinical and clinical issues.
  • 7. Kaniwa N, Aoyagi N, Ogata H, Ishii M. Application of the NONMEM method to evaluation of the bioavailability of drug products. Journal of Pharmaceutical Sciences. 1990;79:1116–1120. doi: 10.1002/jps.2600791215.
  • 8. Pentikis H, Henderson J, Tran N, Ludden T. Bioequivalence: individual and population compartmental modeling compared to noncompartmental approach. Pharmaceutical Research. 1996;13:1116–1121. doi: 10.1023/a:1016083429903.
  • 9. Combrink M, McFadyen ML, Miller R. A comparison of standard approach and the NONMEM approach in the estimation of bioavailability in man. The Journal of Pharmacy and Pharmacology. 1997;49:731–733. doi: 10.1111/j.2042-7158.1997.tb06101.x.
  • 10. Maier GA, Lockwood GF, Oppermann JA, Wei G, Bauer P, Fedler-Kelly J, Grasela T. Characterization of the highly variable bioavailability of tiludronate in normal volunteers using population pharmacokinetic methodologies. European Journal of Drug Metabolism and Pharmacokinetics. 1999;24:249–254. doi: 10.1007/BF03190028.
  • 11. Hu C, Moore K, Kim Y, Sale M. Statistical issues in a modeling approach to assessing bioequivalence or PK similarity with presence of sparsely sampled subjects. Journal of Pharmacokinetics and Pharmacodynamics. 2003;31:312–339. doi: 10.1023/b:jopa.0000042739.44458.e0.
  • 12. Zhou H, Mayer P, Wajdula J, Fatenejad S. Unaltered etanercept pharmacokinetics with concurrent methotrexate in patients with rheumatoid arthritis. Journal of Clinical Pharmacology. 2004;44:1235–1243. doi: 10.1177/0091270004268049.
  • 13. Fradette C, Lavigne J, Waters D, Ducharme M. The utility of the population approach applied to bioequivalence in patients. Therapeutic Drug Monitoring. 2005;27:592–600. doi: 10.1097/01.ftd.0000174005.51383.2f.
  • 14. Panhard X, Mentré F. Evaluation by simulation of tests based on non-linear mixed-effects models in pharmacokinetic interaction and bioequivalence cross-over trials. Statistics in Medicine. 2005;24:1509–1524. doi: 10.1002/sim.2047.
  • 15. Hauschke D, Steinijans V, Pigeot I. Bioequivalence studies in drug development. John Wiley & sons; Chichester: 2007.
  • 16. Lindstrom M, Bates D. Nonlinear mixed effects models for repeated measures data. Biometrics. 1990;46:673–687.
  • 17. Pinheiro JC, Bates DM. Mixed-effects models in S and Splus. Springer; New-York: 2000.
  • 18. Delyon B, Lavielle M, Moulines E. Convergence of a stochastic approximation version of EM algorithm. The Annals of Statistics. 1999;27:94–128.
  • 19. Kuhn E, Lavielle M. Coupling a stochastic approximation version of EM with a MCMC procedure. ESAIM Probability and Statistics. 2004;8:115–131.
  • 20. Samson A, Lavielle M, Mentré F. The SAEM algorithm for group comparison tests in longitudinal data analysis based on non-linear mixed-effects model. Statistics in Medicine. 2007;26:4860–4875. doi: 10.1002/sim.2950.
  • 21. The MONOLIX software. [accessed 05/07/09]. http://software.monolix.org/
  • 22. Lavielle M, Mentré F. Estimation of population pharmacokinetic of saquinavir in HIV patients and covariate analysis with the SAEM algorithm. Journal of Pharmacokinetics and Pharmacodynamics. 2007;34:229–249. doi: 10.1007/s10928-006-9043-z.
  • 23. Comets E, Verstuyft C, Lavielle M, Jaillon P, Becquemont L, Mentré F. Modelling the influence of MDR1 polymorphism on digoxin pharmacokinetic parameters. European Journal of Clinical Pharmacology. 2007;63:437–449. doi: 10.1007/s00228-007-0269-5.
  • 24. Bertrand J, Treluyer J-M, Panhard X, Tran A, Auleley S, Rey E, Salmon-Céron D, Duval X, Mentré F the COPHAR2-ANRS 111 study group. Influence of pharmacogenetics on indinavir disposition and short-term response in HIV patients initiating HAART. European Journal of Clinical Pharmacology. 2009;65:667–678. doi: 10.1007/s00228-009-0660-5.
  • 25. Panhard X, Taburet AM, Piketti C, Mentré F. Impact of modelling intra-subject variability on tests based on non-linear mixed-effects models in cross-over pharmacokinetic trials with application to the interaction of tenofovir on atazanavir in HIV patients. Statistics in Medicine. 2007;26:1268–1284. doi: 10.1002/sim.2622.
  • 26. Schuirmann DJ. A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. Journal of Pharmacokinetics and Biopharmaceutics. 1987;15:657–680. doi: 10.1007/BF01068419.
  • 27. Chow SC, Liu JP. Design and analysis of bioavailability and bioequivalence studies. Marcel Dekker; 2000.
  • 28. Verbeke G, Molenberghs G. Linear mixed models for longitudinal data. Springer; New-York: 2001.
  • 29. Brown H, Prescott R. Applied mixed models in medicine. 2. John Wiley & sons; Chichester: 2006.
  • 30. Berger R, Hsu J. Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science. 1996;11:283–319.
  • 31. Liu JP, Weng CS. Bias of two one-sided tests procedures in assessment of bioequivalence. Statistics in Medicine. 1995;14:853–861. doi: 10.1002/sim.4780140813.
  • 32. Savić R, Karlsson M. Shrinkage in empirical Bayes estimates for diagnostics and estimation. 2007. [accessed 05/07/09]. p. 16. Abstr 1087 available at http://www.pagemeeting.org/pdf_assets/9436-EBE_PAGE07_1_web.pdf.
  • 33. Oehlert GW. A note on the delta method. The American Statistician. 1992;46:27–29.
  • 34. Girard P, Mentré F. A comparison of estimation methods in nonlinear mixed effects models using a blind analysis. 2005. [accessed 05/07/09]. p. 14. Abstr 834 available at http://www.page-meeting.org/page/page2005/PAGE2005O08.pdf.
  • 35. Bauer R, Guzy S, Ng C. Survey of population analysis methods and software for complex pharmacokinetic and pharmacodynamic models with examples. The AAPS Journal. 2007;9:60–83. doi: 10.1208/aapsj0901007.
  • 36. Rescigno A, Powers J, Herderick EE. Bioequivalent or nonbioequivalent? Pharmacological Research. 2001;43:543–546. doi: 10.1006/phrs.2001.0820.
  • 37. Bertrand J, Comets E, Laffont C, Chenel M, Mentré F. Pharmacogenetics and population pharmacokinetics: impact of the design on three tests using the SAEM algorithm. Journal of Pharmacokinetics and Pharmacodynamics. 2009;36:317–339. doi: 10.1007/s10928-009-9124-x.
  • 38. Brown LD, Hwang JTG, Munk A. An unbiased test for the bioequivalence problem. The Annals of Statistics. 1997;25:2345–2367.
  • 39. Cao L, Mathew T. A simple numerical approach toward improving the two-one sided test for average bioequivalence. Biometrical Journal. 2008;50:205–211. doi: 10.1002/bimj.200710407.
  • 40. Panhard X, Samson A. Extension of the SAEM algorithm for nonlinear mixed models with two levels of random effects. Biostatistics. 2009;10:121–135. doi: 10.1093/biostatistics/kxn020.
