Abstract
Surrogate endpoints have been used to assess the efficacy of a treatment and can potentially reduce the duration and/or number of required patients for clinical trials. Using information theory, Alonso et al. (2007) proposed a unified framework based on Shannon entropy, a new definition of surrogacy that departed from the hypothesis testing framework. In this paper, a new family of surrogacy measures under Havrda and Charvat (H-C) entropy is derived which contains Alonso’s definition as a particular case. Furthermore, we extend our approach to a new model based on the information-theoretic measure of association for a longitudinally collected continuous surrogate endpoint for a binary clinical endpoint of a clinical trial using H-C entropy. The new model is illustrated through the analysis of data from a completed clinical trial. It demonstrates advantages of H-C entropy-based surrogacy measures in the evaluation of scheduling longitudinal biomarker visits for a phase 2 randomized controlled clinical trial for treatment of multiple sclerosis.
Keywords: surrogate endpoint, information theory, Havrda and Charvat entropy, mutual information, clinical trial design
1. Introduction
Surrogate endpoints, which can be observed earlier or more easily, can possibly be measured repeatedly, or cost less, have been used to replace clinical endpoints in clinical trials. For example, total tumor response rate and progression-free survival have been used in phase II and phase III cancer clinical trials as surrogate endpoints for overall survival, which often requires a longer trial duration to achieve adequate statistical power. The United States Food and Drug Administration (USFDA) has accepted the use of surrogate endpoints in regulatory reviews of new drug applications [1]. Most cancer drug approvals by the USFDA between 2009 and 2014 (55 of 83 (66%)) used at least one surrogate endpoint [2].
A motivating example is the use of biomarkers in phase II cancer trials. Non-randomized single-arm or randomized parallel clinical trials are used to evaluate the signal of efficacy for a new drug. A binary response status, such as the total response based on the RECIST criterion [3], or a continuous response, such as the change in tumor size [4], is a common primary endpoint. For molecular-targeted drugs or immuno-oncology therapies, various serum, tissue, or imaging biomarkers are being developed to assess whether the targeted pathways have been activated. These biomarkers are usually continuous, can be measured repeatedly, and their changes should precede a clinical response. However, the activation of a targeted pathway does not necessarily imply a response to the treatment. Various questions have been raised about the utility of such biomarkers in phase II trials [5,6].
Surrogate endpoints can speed up a clinical trial [7], but they may not represent the actual clinical benefit of a therapy well [8]. For instance, bevacizumab was approved for metastatic breast cancer based on a surrogate outcome and was later withdrawn after failing to confirm a survival benefit [5]. How to evaluate the relationship between the treatment benefit on a surrogate, denoted as S, and that on the true clinical endpoint, denoted as T, remains a scientific challenge.
Many statistical methods and measures have been proposed to assess surrogate endpoints. One method to validate surrogate endpoints is to evaluate their correlation with clinically meaningful endpoints through meta-analyses [9]. Only 11 of 89 (12%) studies found a high correlation (r ≥ 0.85), and nine (10%) showed only a moderate correlation (r > 0.7 to r < 0.85) between surrogates and clinical endpoints [7], suggesting that the strength of surrogates in clinical practice is often unknown or weak.
The landmark paper by Prentice [8] proposed operational criteria for the identification of valid surrogate endpoints. A sufficient condition for an endpoint S to be a valid surrogate of a primary clinical endpoint T in the evaluation of a treatment, denoted as Z, is that the random vector (Z, S, T) forms a Markov chain Z → S → T, i.e., conditional on S, Z and T are independent of each other. This condition led to parametric and non-parametric approaches to quantify the proportion of the treatment effect on T that is explained by the treatment effect on S [9–13]. Other proposed quantities to assess the utility of a surrogate endpoint include dissociative effects, associative effects, average causal necessity, average causal sufficiency, the causal effect predictiveness surface, and the principal surrogate [9,14–22].
Buyse and Molenberghs [23] suggested two quantities to validate a surrogate endpoint: the relative effect, which relates the treatment effect on the primary outcome to that on the surrogate at the population level, and the adjusted association, which quantifies the association between the primary outcome and the surrogate marker at the individual level after adjusting for the treatment. These methods assumed that information regarding the surrogate and true endpoints was available from a single trial. They focused only on validity, whereas the latter quantity, the adjusted association, is related to the efficiency of a surrogate marker.
Using an information-theoretic approach, Alonso and Molenberghs [24] and Pryseley et al. [25] redefined surrogacy in terms of the information content that S provides with respect to T. Using the notation of Pryseley et al. [25], let S be a continuous surrogate random variable and T be the continuous clinical endpoint of interest. We use f(·) to denote the density function. The Shannon entropy functions for T and the conditional variable T|S are denoted as h(T) and h(T|S), respectively, where h(T) = E[−log f(T)] and h(T|S) = E[−log f(T|S)]. The corresponding entropy power functions are $EP(T) = e^{2h(T)}/(2\pi e)$ and $EP(T|S) = e^{2h(T|S)}/(2\pi e)$. An information-theoretic measure of association (ITMA) is defined by Alonso and colleagues as the proportion of uncertainty reduction, measured by the entropy power function, for T|S in reference to T:
$$R_h^2 = \frac{EP(T) - EP(T|S)}{EP(T)} = 1 - e^{-2I(T,S)} \tag{1}$$
Here I(T, S) = E[−log f(T)] − E[−log f(T|S)] is the mutual information. If S is a good surrogate for T, uncertainty about the effect on T is reduced by knowledge of the effect on S. Alonso and Molenberghs [24] describe some useful properties:
$R_h^2 = 0$ if and only if (T, S) are independent.
$R_h^2$ is symmetric in (T, S).
$R_h^2$ is invariant under bijective transformations of T and S.
When $R_h^2 \to 1$ for continuous models, there is usually a deterministic relationship in the distribution of (T, S), that is, often T = φ(S).
When T is a discrete random variable, $R_h^2 \le 1 - e^{-2h(T)}$. So they propose to use $R_h^2 / (1 - e^{-2h(T)})$ as a modified ITMA.
A good surrogate should have a high $R_h^2$. The beauty of the information-theoretic framework is that it moves away from hypothesis testing and provides a quantitative measure of surrogacy. Thus, if there are two surrogate endpoints, S1 and S2, we can compare their utility as surrogate endpoints based on their values of $R_h^2$.
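As a minimal illustration of how the ITMA can rank candidate surrogates, the sketch below assumes that (T, S) is bivariate normal. In that case the Shannon mutual information is −½ log(1 − ρ²), so the ITMA of Equation (1), computed here as 1 − exp(−2 I(T, S)), reduces to ρ². The function names and the two correlations are hypothetical and chosen only for illustration.

```r
# Shannon mutual information of a bivariate normal pair with correlation rho
shannon_mi_normal <- function(rho) -0.5 * log(1 - rho^2)

# ITMA: 1 - exp(-2 * I), the proportional reduction in entropy power
itma <- function(mi) 1 - exp(-2 * mi)

# Hypothetical correlations of two candidate surrogates with T
rho_s1 <- 0.9
rho_s2 <- 0.6

itma_s1 <- itma(shannon_mi_normal(rho_s1))
itma_s2 <- itma(shannon_mi_normal(rho_s2))

# In the normal case each ITMA equals rho^2, so S1 ranks above S2
print(c(S1 = itma_s1, S2 = itma_s2))
```

Under these assumptions the surrogate with the larger absolute correlation always has the larger ITMA; the comparison becomes less trivial for the non-normal models discussed later.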
Multiple authors have provided examples of this approach and demonstrated applications for situations when S and T are both binary, continuous, longitudinal, and time-to-event random variables as well as ordinal outcomes [24–28].
In this paper, we present two new results on the topic of surrogate endpoints based on information-theoretic measure of association (ITMA). First, we extend the ITMA construction based on Shannon entropy to a construction based on Havrda and Charvat (H-C) entropy [29]. The extension is motivated by the general existence of the H-C entropy. Explicit expressions as well as the properties of ITMA in different situations based on H-C entropy are presented. Second, we extend the H-C ITMA model to a longitudinally collected continuous surrogate endpoint for a binary clinical endpoint of a clinical trial. Then, the benefit of S is evaluated with the ITMA [24,25]. The current work focuses on a single trial surrogacy and its extension to a meta-analytic framework will need further development.
The paper is organized as follows. In Section 2, we give an example in which the Shannon entropy cannot be defined, so that the surrogacy measure of Alonso and Molenberghs [24] under the information-theoretic framework cannot be applied. We then prove the existence of H-C entropy under general conditions. Therefore, a family of surrogacy measures based on the ITMA of H-C entropy is defined. Explicit formulas are obtained for the following situations: binary-binary, continuous-continuous, and binary-continuous. In Section 3, we extend a longitudinal linear random effects model for the longitudinally collected surrogate marker and a probit regression model for a binary primary endpoint in clinical trials. An application of H-C entropy to selecting the times at which to collect surrogate measures is presented using data from a completed clinical trial. Finally, Section 4 presents a discussion and conclusions. The R programs that generated the results for the tables in Section 3 are presented in Appendix A.
2. Extension of ITMA Surrogacy from Shannon Entropy to Havrda-Charvat Entropy
Why should we consider the extension? While Shannon's entropy is adequate in most applications, there are cases in which the Shannon entropy does not exist, and thus the ITMA cannot be properly calculated. We give an example below.
Example 1.
Let X be the random variable with heavy tails. Its density function is
(2) |
where c1 ≈ 1.44 and c2 ≈ 0.027 are chosen so that it is a continuous function with a heavy tail. The Shannon entropy h(X) for X is infinite.
One way to make ITMA work is to use the Havrda-Charvat entropy [29], a generalization of entropy function that contains the Shannon entropy as its special case. Mathematically, Havrda-Charvat entropy is defined as follows:
(3) |
where
(4) |
It is easy to see that HC1(X) = h(X).
Proposition 1.
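Under a mild regularity condition that the density function of X is bounded, HCα(X) always exists for any α > 1.
Proof.
We only need to prove that, for a density function f(x) bounded by a constant K > 0, $\int f^{\alpha}(x)\,dx < \infty$ for α > 1.
Let W = {x: f(x) > 1}. It then follows from the fact that
$$1 = \int f(x)\,dx = \int_{W} f(x)\,dx + \int_{W^{c}} f(x)\,dx \quad \text{and} \quad f^{\alpha}(x) \le f(x) \ \text{on } W^{c}$$
that m(W) ≤ 1 and $\int_{W^{c}} f^{\alpha}(x)\,dx \le 1$, where $m(W) = \int_{W} f(x)\,dx$ is the probability of W. Therefore,
$$\int f^{\alpha}(x)\,dx = \int_{W} f^{\alpha-1}(x) f(x)\,dx + \int_{W^{c}} f^{\alpha}(x)\,dx \le K^{\alpha-1} m(W) + 1 \le K^{\alpha-1} + 1 < \infty.$$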
Because Shannon’s entropy is a special case of H-C entropy and H-C entropy always exists with proper choice of α, ITMA of H-C entropy should be a more flexible way as a surrogacy measure for more distribution families. Different from Shannon’s entropy, H-C entropy satisfies a non-additive property such that HCα(T, S) = HCα(T) + HCα(S) + (1 − α)HCα(T)HCα(S), when T and S are independent. In general, the non-additive measures of entropy find justifications in many biological and chemical phenomena [30]. While H-C entropy has been used in quantum physics [31] and medical imaging research [32], it has not yet been used to describe the endpoint surrogacy for clinical trials.
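The non-additive property can be checked numerically for discrete distributions. The sketch below assumes the commonly used form of H-C entropy, HCα(X) = (Σ pᵅ − 1)/(1 − α) for α ≠ 1, which is consistent with the non-additive identity stated above and with HC1(X) = h(X) in the limit; the function name `hc_entropy` and the probability vectors are hypothetical.

```r
# H-C entropy of a discrete distribution p (assumed form; Shannon when alpha = 1)
hc_entropy <- function(p, alpha) {
  p <- p[p > 0]
  if (alpha == 1) return(-sum(p * log(p)))
  (sum(p^alpha) - 1) / (1 - alpha)
}

# Hypothetical marginal distributions of two independent variables
p_t <- c(0.3, 0.7)
p_s <- c(0.2, 0.5, 0.3)
joint <- as.vector(outer(p_t, p_s))  # joint distribution under independence

alpha <- 2
lhs <- hc_entropy(joint, alpha)
rhs <- hc_entropy(p_t, alpha) + hc_entropy(p_s, alpha) +
  (1 - alpha) * hc_entropy(p_t, alpha) * hc_entropy(p_s, alpha)

# As alpha approaches 1 the H-C entropy approaches the Shannon entropy
near_shannon <- hc_entropy(p_t, 1 + 1e-6)
shannon <- hc_entropy(p_t, 1)

print(c(joint = lhs, non_additive_sum = rhs))
print(c(near_shannon = near_shannon, shannon = shannon))
```

The two sides of the identity agree to machine precision for any α ≠ 1 under this form of the entropy; only the independence of T and S is required.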
To extend the ITMA to H-C entropy for measuring endpoint surrogacy in trials, we define the ITMA under H-C entropy power as follows:
$$R_{h\alpha}^2 = 1 - e^{-2 I_{\alpha}(T,S)} \tag{5}$$
where Iα(T, S) is the mutual information between T and S under H-C entropy [32]. Specifically, for α ≠ 1,
(6) |
When α = 1, Iα(T, S) = I(T, S) and the ITMA under H-C entropy reduces to the Shannon-based ITMA in Equation (1).
It is important to notice that: , for α ≠ 1 where .
Some basic properties are available here.
When T and S are independent, Iα(T, S) = 0. Thus, the ITMA under H-C entropy is also 0.
When T and S are deterministically related, the value of Iα(T, S) depends on whether α > 1 or α < 1, as seen in the following propositions.
Proposition 2.
Let T and S be two continuously normally distributed random variables such as the joint distribution of , the conditional distribution of and , where “′” means vector transpose. Then, we have the following results:
2.1 The mutual information for H-C entropy depends not only on the correlation between T and S, but also on their standard deviations for α ≠ 1.
(7)
2.2 When α ≥ 1: as ρ → ±1, Iα → ∞; as ρ → 0, Iα → 0; and Iα is an increasing function of |ρ|.
2.3 When α < 1: as ρ → ±1, Iα → 1; as ρ → 0, Iα → 0; and Iα is an increasing function of |ρ|.
Therefore, for α < 1, maximum . We can normalize by dividing its maximum value to make normalized in between 0 and 1
(8) |
Proof.
Similarly,
And
As such
Finally, results in 2.2 and 2.3 can be concluded from the expression of Iα(T, S). □
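The qualitative behavior in 2.1 to 2.3 can be explored with a short sketch. It assumes that the H-C mutual information takes the product form used in the Appendix A code, Iα = [∫∫(f_T f_S)ᵅ − ∫∫ f_{T,S}ᵅ]/(1 − α); for a bivariate normal pair this gives the closed form coded below, which is our derivation under that assumption and may differ in presentation from Equation (7). The function names are hypothetical.

```r
# H-C mutual information of a bivariate normal pair (assumed product form)
hc_mi_normal <- function(rho, sd_t, sd_s, alpha) {
  if (alpha == 1) return(-0.5 * log(1 - rho^2))  # Shannon case
  (2 * pi * sd_t * sd_s)^(1 - alpha) / (alpha * (1 - alpha)) *
    (1 - (1 - rho^2)^((1 - alpha) / 2))
}

# Limiting value as |rho| -> 1, finite only for alpha < 1
hc_mi_normal_max <- function(sd_t, sd_s, alpha) {
  stopifnot(alpha < 1)
  (2 * pi * sd_t * sd_s)^(1 - alpha) / (alpha * (1 - alpha))
}

rho <- c(0, 0.3, 0.6, 0.9, 0.999)

mi_half   <- hc_mi_normal(rho, 1, 1, 0.5)             # bounded for alpha < 1
norm_half <- mi_half / hc_mi_normal_max(1, 1, 0.5)    # normalized to [0, 1)
mi_two    <- hc_mi_normal(rho, 1, 1, 2)               # unbounded for alpha > 1

# The value changes with the standard deviations when alpha != 1
mi_two_sd <- hc_mi_normal(0.6, 2, 2, 2)

print(rbind(rho = rho, alpha_0.5 = mi_half, normalized = norm_half, alpha_2 = mi_two))
```

Under this assumed form, dividing by the α < 1 limit removes the dependence on the standard deviations, which is the motivation for the normalization described in the proposition.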
Proposition 3.
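Let T and S be two binary outcome variables, with 1 for a success and 0 for a failure, such that the joint distribution is (T, S)′ ~ Multinomial(p0,0, p1,0, p0,1, p1,1). Write $p_{t,+} = \sum_{s} p_{t,s}$ and $p_{+,s} = \sum_{t} p_{t,s}$ for the marginal probabilities. We have the following results:
$$I_{\alpha}(T,S) = \frac{1}{1-\alpha} \sum_{t=0}^{1} \sum_{s=0}^{1} \left[ (p_{t,+}\, p_{+,s})^{\alpha} - p_{t,s}^{\alpha} \right], \quad \alpha \ne 1 \tag{9}$$
3.1 When $p_{t,s} = p_{t,+}\, p_{+,s}$, T and S are independent and Iα(T, S) = 0.
3.2 Let $\rho = (p_{1,1} - p_{1,+}p_{+,1}) / \sqrt{p_{0,+}\,p_{1,+}\,p_{+,0}\,p_{+,1}}$ be the correlation between T and S. For α > 1, Iα(T, S) is an increasing function of ρ if $[p_{0,0}]^{\alpha-1} + [p_{1,1}]^{\alpha-1} > [p_{1,0}]^{\alpha-1} + [p_{0,1}]^{\alpha-1}$. For α < 1, Iα(T, S) is an increasing function of ρ if $[p_{0,0}]^{\alpha-1} + [p_{1,1}]^{\alpha-1} < [p_{1,0}]^{\alpha-1} + [p_{0,1}]^{\alpha-1}$.
3.3 For α > 1, Iα(T, S) ≤ min(HCα(T), HCα(S)).
3.4 For α < 1, Iα(T, S) ≥ max(HCα(T), HCα(S)).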
3.5 For a given marginal distribution of T and S, the mutual information attains a maximum value given by
(10)
Thus, we can normalize ITMA as .
Proof.
Because
And
Thus,
Therefore,
which is the expression given in Equation (9).
Result 3.1 is the direct derivation from Equation (9) as pt,s = pt,+p+,s.
Result 3.2 can be derived through the following relationship:
where
(11) |
Then,
Taking the derivative of Iα(T, S) on ρ, we have
Under condition that , for α > 1. Similarly, under condition that , for α < 1.
For 3.3 and 3.4
when α > 1,
However, when α < 1,
Because of the symmetric relationship between T and S, results 3.3 and 3.4 are proved.
For 3.5, for fixed marginal probability, Iα(T, S) depends only on ρ in HCα(T, S). Like 3.2,
When , taking the lower boundary of ρ in inequality (11) will derive the min value of HCα(T, S).
When , taking the upper boundary of ρ in inequality (11) will derive the min value of HCα(T, S). □
Remark 1.
When the concordant pairs are more likely than the discordant pairs for the two binary endpoints, for α>1, [p0,0]α−1 + [p1,1]α−1 is more likely to be greater than [p1,0]α−1 + [p0,1]α−1. However, when α<1, [p0,0]α−1 + [p1,1]α−1 is more likely to be less than [p1,0]α−1 + [p0,1]α−1. Thus, when two binary endpoints have more chance to be concordant, the mutual information will be an increasing function of correlation coefficient of ρ as shown in Proposition 3 Result 3.2.
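The monotonicity described in Remark 1 can be illustrated numerically. The sketch below assumes the binary-binary H-C mutual information has the form implemented in the Appendix A code for Table 3, (Σ(p_{t,+}p_{+,s})ᵅ − Σ p_{t,s}ᵅ)/(1 − α), and builds the 2 × 2 joint distribution from fixed marginals and a correlation ρ. The function names and the marginal probabilities are hypothetical.

```r
# H-C mutual information for a 2 x 2 joint probability matrix p[t, s]
hc_mi_binary <- function(p, alpha) {
  pm <- outer(rowSums(p), colSums(p))  # product of the marginals
  if (alpha == 1) {
    keep <- p > 0
    return(sum(p[keep] * log(p[keep] / pm[keep])))  # Shannon mutual information
  }
  (sum(pm^alpha) - sum(p^alpha)) / (1 - alpha)
}

# Joint distribution with P(T = 1) = p_t, P(S = 1) = p_s and correlation rho
joint_from_rho <- function(p_t, p_s, rho) {
  d <- rho * sqrt(p_t * (1 - p_t) * p_s * (1 - p_s))
  matrix(c((1 - p_t) * (1 - p_s) + d, p_t * (1 - p_s) - d,   # column s = 0
           (1 - p_t) * p_s - d,       p_t * p_s + d),        # column s = 1
         nrow = 2)
}

rhos <- seq(0, 0.8, by = 0.2)  # concordant pairs become more likely as rho grows
mi_two  <- sapply(rhos, function(r) hc_mi_binary(joint_from_rho(0.5, 0.5, r), 2))
mi_half <- sapply(rhos, function(r) hc_mi_binary(joint_from_rho(0.5, 0.5, r), 0.5))

print(rbind(rho = rhos, alpha_2 = mi_two, alpha_0.5 = mi_half))
```

With these balanced marginals the α = 2 value works out to ρ²/4, so the increase with ρ can be verified by hand; other marginals can be substituted as long as all four cell probabilities stay non-negative.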
Now we define a model for a binary T and a continuous normally distributed surrogate variable S.
Proposition 4.
Let T be a binary outcome variable and S continuous normally distributed surrogate variable, where T~B(p0, p1) and . We assume that there is a latent variable U such that T = 1 ⇔ U ≥ 0, i.e, a Probit model with U~N(μT, 1) and μT = Φ−1(p1). Assume a correlation coefficient between U and S is ρ, the conditional binary endpoint T|S follows a Bernoulli distribution with and . We have the following results:
-
4.1The mutual information for H-C entropy is
where . -
4.2
When ρ = 0, Iα(T, S) = 0.
-
4.3
When ρ → ±1,
-
4.4
For α → 1,
Proof.
The joint distribution function for (T, S) = (t, s) is:
Therefore,
Therefore,
Using , we can derive an alternative formulation
Iα(T, S) can be simplified as
Thus, we complete the proof for 4.1.
For 4.2, ρ = 0,
For 4.3, as ρ → 1,
Similarly, ρ → −1, .
So, ρ → ±1, .
For α = 1, H-C entropy reduces to Shannon's entropy. Thus, by taking the limit as α → 1, we can derive Shannon's mutual information for the probit model in 4.4.
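For the Shannon case (α = 1), the mutual information of the probit model can be approximated by one-dimensional numerical integration. The sketch below is a simplified illustration rather than the closed form of Proposition 4: it standardizes S to z = (S − μ_S)/σ_S and uses the standard bivariate normal result that U given z is normal with mean μ_T + ρz and variance 1 − ρ², so that P(T = 1 | z) = Φ((μ_T + ρz)/√(1 − ρ²)). The function name and the input values are hypothetical.

```r
# Shannon mutual information I(T, S) = H(T) - E[H(T | S)] under the latent probit model
probit_shannon_mi <- function(p1, rho) {
  mu_t <- qnorm(p1)  # latent mean, so that P(U >= 0) = p1
  h_bin <- function(q) {
    q <- pmin(pmax(q, 1e-15), 1 - 1e-15)  # guard against log(0)
    -(q * log(q) + (1 - q) * log(1 - q))
  }
  cond_entropy <- integrate(function(z) {
    q <- pnorm((mu_t + rho * z) / sqrt(1 - rho^2))  # P(T = 1 | standardized S = z)
    h_bin(q) * dnorm(z)
  }, lower = -Inf, upper = Inf)$value
  h_bin(p1) - cond_entropy
}

mi_zero <- probit_shannon_mi(0.6, 0)    # no association
mi_low  <- probit_shannon_mi(0.6, 0.3)
mi_high <- probit_shannon_mi(0.6, 0.8)
mi_neg  <- probit_shannon_mi(0.6, -0.8)

print(c(rho_0 = mi_zero, rho_0.3 = mi_low, rho_0.8 = mi_high, rho_minus_0.8 = mi_neg))
```

The result is zero when ρ = 0, matches for ρ and −ρ, and is bounded above by the entropy of T, which for a binary endpoint cannot exceed log 2.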
Remark 2.
Taking into account that , where is the Owen’s function [33] and the property T(h, k) = T(−h, k), we can derive explicit formula for α=2,
Since Φ−1(pt) = Φ−1(1 – p1−t) = −Φ−1(p1−t) and we can simplify the expression as
3. Surrogacy of a Longitudinal Biomarker for a Binary Clinical Endpoint
3.1. Model for Longitudinal Continuous Surrogate Biomarkers in Phase II Trials
In many phase II trials, clinical endpoints of interest (T) are often a proportion of a binary endpoint or mean of a continuous variable. For example, in oncology phase II trials, a common clinical endpoint is total response rate. The surrogate biomarkers, on the other hand, are usually lab tests either from serum or urine or imaging modalities that can be measured repeatedly during the study. In this section, we focus on a binary one-time clinical endpoint T and a continuous repeated surrogate variable S.
In the remainder of the paper, we use tj to denote the time of the jth measurement since baseline in a longitudinal trial. For simplicity, consider the difference model from baseline, t0 = 0.
Let the general model be:
Thus, , where
Using a bivariate probit model [34] for the joint distribution of (Zi, Ti)′ we can derive a probit model for the conditional joint distribution for (Zi, Ti)′|S:
where μ0,k and γk are the intercept and regression coefficient vector for the probit regression for T given longitudinal S (k = 1) and for Z given longitudinal S (k = 2), respectively, and ρ is the correlation coefficient of two underlying latent normal variables for T and Z.
Because a linear combination of multivariate normal variables is still a normal random variable, we can use the Proposition 4 to calculate ITMA under H-C entropy power to evaluate surrogacy of the longitudinal biomarker in each arm conditioning on Z. We can also average over the treatment arms to get the mean trial level ITMA under H-C entropy, denoted as
Furthermore, we can use the mutual information Iα(T, Z|S) to verify Prentice's criterion, as suggested by [24]: conditional on the surrogate S, the clinical endpoint T and the treatment assignment Z are independent; thus, a good surrogate should lead to Iα(T, Z|S) ≈ 0. Since
where
(12) |
So for α ≠ 1, When α = 1, .
For real data, we can use a bivariate probit model to estimate [p(T = t|S)]α, [p(Z = z|S)]α, and [p(T = t, Z = z|S)]α, and then use Equation (9) to perform numerical integration for the derivation of Iα(T, Z|S). One way to perform this analysis is to use the R package mvProbit from CRAN (https://cran.r-project.org/web/packages/mvProbit/mvProbit.pdf, accessed on January 29, 2022).
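The averaging step can be sketched on toy inputs. The code below follows the structure of the Appendix A program for Table 3 (mean over sampled surrogate values of the product-of-marginals term minus the joint term, divided by 1 − α), but it replaces the fitted bivariate probit probabilities with hypothetical values so that it runs on its own. The function name `hc_cond_mi` and all probabilities are assumptions made for illustration.

```r
# Conditional H-C mutual information I_alpha(T, Z | S), alpha != 1
#   p_t, p_z: P(T = 1 | S_i) and P(Z = 1 | S_i) for each sampled surrogate value
#   p_tz:     n x 4 matrix with columns P(1,1 | S_i), P(1,0 | S_i), P(0,1 | S_i), P(0,0 | S_i)
hc_cond_mi <- function(p_t, p_z, p_tz, alpha) {
  stopifnot(alpha != 1, ncol(p_tz) == 4)
  prod_term  <- mean((p_t^alpha + (1 - p_t)^alpha) * (p_z^alpha + (1 - p_z)^alpha))
  joint_term <- mean(rowSums(p_tz^alpha))
  (prod_term - joint_term) / (1 - alpha)
}

# Hypothetical conditional probabilities at three surrogate values
p_t <- c(0.3, 0.5, 0.8)
p_z <- c(0.5, 0.5, 0.5)

# Case 1: T and Z are conditionally independent given S (Prentice's criterion holds)
indep <- cbind(p_t * p_z, p_t * (1 - p_z), (1 - p_t) * p_z, (1 - p_t) * (1 - p_z))

# Case 2: same conditional margins, with extra concordance between T and Z
d <- 0.05
dep <- indep + matrix(c(d, -d, -d, d), nrow = 3, ncol = 4, byrow = TRUE)

mi_indep <- hc_cond_mi(p_t, p_z, indep, alpha = 2)
mi_dep   <- hc_cond_mi(p_t, p_z, dep, alpha = 2)
print(c(independent = mi_indep, dependent = mi_dep))
```

A value near zero is what a good surrogate should produce; the size of the departure from zero has to be judged against the chosen α, since the scale of Iα changes with α.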
3.2. A Data Example
“Safety, Tolerability and Activity Study of Ibudilast in Subjects with Progressive Multiple Sclerosis” (NCT01982942) is a multicenter, randomized, double-blind, placebo-controlled, parallel-group phase II study sponsored by the US National Institutes of Health (NIH) and conducted from November 2013 to December 2017. The main study results have been published by Fox et al. [35]. The trial data are publicly available upon request to the NIH. We use these data for the numerical illustration of the H-C ITMA.
More specifically, patients with primary or secondary progressive multiple sclerosis were enrolled in this phase II trial and randomized to oral ibudilast (≤100 mg daily) or placebo for 96 weeks. The primary efficacy endpoint was the rate of brain atrophy, as measured by the brain parenchymal fraction (brain size relative to the volume of the outer surface contour of the brain). Major secondary endpoints included the change in the pyramidal tracts on diffusion tensor imaging and cortical atrophy, all measures of tissue damage in multiple sclerosis.
We requested and received data from the study team that contained 104 placebo patients and 99 treated patients, with longitudinal observations in brain parenchymal fraction (BPF) and thinning of the cortical gray matter (cortical thickness) measured by magnetic resonance imaging at week 0, 24, 48, 72, and 96. For illustration purposes, we altered the primary and secondary endpoints of the trial and created a binary clinical endpoint as the cortical thickness (CTH) greater than 3 mm as a clinical outcome for less cortical gray matter atrophy and used BPF as the continuous longitudinal marker. Table 1 provides a summary of the data used for this illustration.
Table 1.
Summary Statistics for The Real Data Example.
Variable | Control (N = 104) | Treatment (N = 99) | p-Value * |
---|---|---|---|
CTH > 3 mm: N (%) | 50 (48%) | 70 (71%) | 0.0016 |
BPF: Mean (SD) | |||
Week 0 | 0.8023 (0.0301) | 0.8040 (0.0281) | 0.6823 |
Week 24 | 0.8012 (0.0301) | 0.8039 (0.0277) | 0.5001 |
Week 48 | 0.8009 (0.0311) | 0.8036 (0.0282) | 0.5115 |
Week 72 | 0.8001 (0.0303) | 0.8032 (0.0283) | 0.4433 |
Week 96 | 0.7989 (0.0306) | 0.8026 (0.0293) | 0.3813 |
Change/24 weeks ** | −0.0008 (0.0001) | −0.0004 (0.0001) | 0.0056 |
* The p-value for CTH > 3 mm was calculated using Fisher's exact test; p-values for mean differences at each visit were calculated using a t-test. The p-value for the change per 24 weeks (slope) was calculated from the mixed random effects model.
** Change per 24 weeks was estimated using a mixed random effects linear regression model fitted with the lmer function in R.
From Table 1, we can see that 104 patients were randomized to the control arm and 99 patients to the treatment arm. The treatment significantly reduced cortical atrophy: 71% of patients in the treatment arm maintained a cortical thickness (CTH) of more than 3 mm at 96 weeks post baseline, compared with 48% in the placebo arm. While the differences in BPF between treatment arms had p-values above 0.38 at each MRI visit, the aggregated changes over time, measured by the slopes of a mixed random effects regression model, achieved high statistical significance with a p-value of 0.0056.
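The comparison of the binary endpoint between arms can be recomputed from the counts in Table 1. The sketch below assumes that every randomized patient has an observed CTH outcome, so that the number of patients at or below 3 mm is the arm size minus the reported count; the object names are hypothetical.

```r
# Counts from Table 1: patients with CTH > 3 mm at week 96, by arm
# (patients at or below 3 mm are derived as arm size minus the reported count)
cth_counts <- matrix(
  c(50, 104 - 50,   # control arm
    70, 99 - 70),   # treatment arm
  nrow = 2,
  dimnames = list(CTH = c("> 3 mm", "<= 3 mm"), Arm = c("Control", "Treatment"))
)

# Proportion maintaining CTH > 3 mm in each arm
prop_maintained <- cth_counts[1, ] / colSums(cth_counts)

# Fisher's exact test of the association between arm and outcome
ft <- fisher.test(cth_counts)

print(round(prop_maintained, 2))
print(ft$p.value)
```

The same call on the patient-level data, as in Appendix A, tabulates the outcome by arm first; working from the published counts is only a consistency check.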
The importance of evaluating the surrogacy of the longitudinal BPF measurements for the binary CTH endpoint in MS trials is to understand the strength of surrogacy and whether it can be used to shorten trial duration. More importantly for future trial design, we need to understand how often and when the longitudinal measurements should be performed.
Using formulas derived in Proposition 4, we derived the mean mutual information and ITMA of longitudinal BPF as a surrogate for the clinical endpoint of maintaining more than 3 mm cortical thickness at the end of 96 weeks. We explore three choices of α = 0.5, 1, and 2 to show the difference between H-C and Shannon entropies. The value of α = 1 has been considered because it corresponds to Shannon entropy. The other two alpha values have been considered in other papers such as [32]. The columns of Table 2 are organized according to values of α. Each row in Table 2 represents a design to use BPF in the baseline (week 0) and different follow-up visits to construct a longitudinal surrogate endpoint. For example, the first row used the baseline and week 24 data while the last row used the data from baseline, weeks 48 and 72.
Table 2.
H-C Mutual Information and ITMA by Different Longitudinal Designs.
BPF Data Used (Weeks) | Iα(T, S\|Z), α = 0.5 | ITMA, α = 0.5 | Iα(T, S\|Z), α = 1 | ITMA, α = 1 | Iα(T, S\|Z), α = 2 | ITMA, α = 2 | p-Value * |
---|---|---|---|---|---|---|---|
0, 24 | 4.6042 | 0.9999 | 2.6300 | 0.9948 | 0.6063 | 0.7026 | 0.0797 |
0, 24, 48 | 4.6117 | 0.9999 | 2.6307 | 0.9948 | 0.6066 | 0.7028 | 0.1025 |
0, 24, 48, 72 | 4.6209 | 0.9999 | 2.6352 | 0.9949 | 0.6071 | 0.7031 | 0.0390 |
0, 24, 48, 72, 96 | 4.6103 | 0.9999 | 2.6361 | 0.9949 | 0.6069 | 0.7029 | 0.0056 |
0, 48 | 4.4683 | 0.9999 | 2.6012 | 0.9945 | 0.5980 | 0.6976 | 0.1586 |
0, 72 | 4.4522 | 0.9999 | 2.5912 | 0.9944 | 0.5962 | 0.6965 | 0.0675 |
0, 24, 72 | 4.6223 | 0.9999 | 2.6348 | 0.9949 | 0.6072 | 0.7031 | 0.0485 |
0, 48, 72 | 4.4696 | 0.9999 | 2.6022 | 0.9945 | 0.5980 | 0.6976 | 0.0382 |
* p-value for the treatment-by-visit interaction in a linear mixed random effects model fitted with the lmer function in R.
From Table 2, we can see that the longitudinal BPF measures at baseline with at least one follow-up visit were all reasonable surrogates for the binary endpoint of CTH > 3.0 mm. H-C entropy with α = 0.5 was not sensitive enough to differentiate subtle differences in the surrogacy utility of different designs for collecting surrogate endpoints. When α = 1, H-C entropy is Shannon entropy, and it was able to discriminate among different designs at the third decimal place. H-C entropy with α = 2 was more sensitive and showed larger differences across designs. As demonstrated, using longitudinal BPF data could shorten the trial duration to 72 weeks. For a trial ending at 72 weeks, measuring BPF at both week 24 and week 48 added no more surrogacy utility than a single measure at week 24. Overall, the p-values from the linear mixed random effects model reflected the directions of the ITMAs, but not in complete concordance, perhaps due to random variation in fitting the mixed random effects and probit models.
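The different sensitivity of the three α values can be seen directly from the mapping between mutual information and ITMA. The sketch below recomputes the ITMA columns of Table 2 from the reported mutual information values using ITMA = 1 − exp(−2 Iα), the transformation used in the Appendix A code; the vectors are copied from Table 2 and the object names are hypothetical.

```r
# Mutual information values reported in Table 2 (rows in table order)
mi_half <- c(4.6042, 4.6117, 4.6209, 4.6103, 4.4683, 4.4522, 4.6223, 4.4696)  # alpha = 0.5
mi_two  <- c(0.6063, 0.6066, 0.6071, 0.6069, 0.5980, 0.5962, 0.6072, 0.5980)  # alpha = 2

# ITMA values reported in Table 2 for alpha = 2
itma_two_reported <- c(0.7026, 0.7028, 0.7031, 0.7029, 0.6976, 0.6965, 0.7031, 0.6976)

itma <- function(mi) 1 - exp(-2 * mi)
itma_half <- itma(mi_half)
itma_two  <- itma(mi_two)

# Range of ITMA across the eight designs for each alpha
spread_half <- diff(range(itma_half))
spread_two  <- diff(range(itma_two))

print(round(itma_half, 4))
print(round(itma_two, 4))
print(c(spread_alpha_0.5 = spread_half, spread_alpha_2 = spread_two))
```

Because the mutual information for α = 0.5 is above 4.4 in every design, exp(−2 Iα) is already below 0.0002 and the ITMA saturates near 1, whereas for α = 2 the values sit on a steeper part of the curve and differences between designs remain visible.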
Table 3 examines the longitudinal surrogacy based on Prentice’s criteria. Here we want to determine if Iα(T, Z|S) is close to 0. The results of Table 3 confirm the observations in Table 2 that the longitudinal BPF is a good surrogate variable for binary CTH > 3.0 mm at 96 weeks. Because Table 3 uses the same model as Table 2, the p-values for longitudinal models are omitted. Once again, Iα(T, Z|S) decreases with α.
Table 3.
Prentice Criteria for Surrogate Endpoint.
BPF Data Used (Weeks) | Iα(T, Z\|S), α = 0.5 | ITMA, α = 0.5 | Iα(T, Z\|S), α = 1 | ITMA, α = 1 | Iα(T, Z\|S), α = 2 | ITMA, α = 2 |
---|---|---|---|---|---|---|
0, 24 | 0.0390 | 0.0751 | 0.0271 | 0.0528 | 0.0108 | 0.0213 |
0, 24, 48 | 0.0388 | 0.0747 | 0.0270 | 0.0526 | 0.0108 | 0.0213 |
0, 24, 48, 72 | 0.0407 | 0.0782 | 0.0280 | 0.0545 | 0.0110 | 0.0218 |
0, 24, 48, 72, 96 | 0.0395 | 0.0760 | 0.0274 | 0.0533 | 0.0110 | 0.0218 |
0, 48 | 0.0428 | 0.0820 | 0.0297 | 0.0578 | 0.0117 | 0.0231 |
0, 72 | 0.0416 | 0.0798 | 0.0287 | 0.0558 | 0.0111 | 0.0219 |
0, 24, 72 | 0.0403 | 0.0775 | 0.0278 | 0.0541 | 0.0110 | 0.0217 |
0, 48, 72 | 0.0434 | 0.0832 | 0.0299 | 0.0580 | 0.0116 | 0.0229 |
4. Conclusions
Alonso et al. [24] proposed to assess the validity of a surrogate endpoint in terms of uncertainty reduction. The main proposals for measures of uncertainty are found in information theory, and these authors based their proposal on the well-known Shannon entropy. There has been extensive work on generalized entropies [30–32,36–39]. To extend that surrogacy measure, we focus on the Havrda-Charvat entropy, which reduces to the Shannon entropy when the parameter α is set to one. Based on this generalized entropy, we consider a generalized mutual information, because some members of this family have been shown to perform better in other contexts [30–32]. In this paper, the theoretical development of these measures has been completed. The advantage of our proposal is that it contains a useful existing measure of surrogacy as a particular case and makes it easy to explore other measures that may have performance advantages for specific questions. In our example, we have seen the advantage of using α = 2 instead of α = 1 to evaluate the scheduling of longitudinal visits.
Some additional issues are pending. We are working on a more extensive numerical study to assess the performance of these measures in the endpoint surrogacy context. In this paper, we compared the performance of the ITMA in a real trial with three choices of α (0.5, 1, and 2), which were chosen for illustration purposes. The optimal choice of α remains a research question. Additional research can consider other ITMAs, such as those based on divergence measures [36], taking into account that the mutual information is equal to the Kullback divergence; measures of unilateral dependency, such as that defined by Andonie et al. [37] based on the informational energy [39]; or surrogacy for testing of variances [38].
Acknowledgments:
This work was partially supported by research grant PID2019-104681RB-I00 (Pardo M.C.) of the Spanish Ministry of Science and Innovation and grants from the US National Institutes of Health 1UL1TR003142, 4P30CA124435, and R01HL089778 (Lu Y.). We want to thank the US National Institute of Neurological Disorders and Stroke and Fox, the Principal Investigator of NCT01982942, for sharing the de-identified data from the “Safety, Tolerability and Activity Study of Ibudilast in Subjects with Progressive Multiple Sclerosis.” We want to thank the reviewers for their constructive comments that substantially improved the paper.
Funding:
This research was funded by grant PID2019-104681RB-I00 (Pardo M.C.) of the Spanish Ministry of Science and Innovation and by grants 1UL1TR003142, 4P30CA124435, and R01HL089778 (Lu Y.) from the US National Institutes of Health.
Appendix A. R-Program
R-Program for Table 1
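    ### Row 1 ###
    # Fisher's exact test for CTH > 3 mm by treatment arm
    fisher.test(table(nihexample$CTh96YesNo, nihexample$trt.group))

    ### Row 3–5 ###
    # Two-sample t-tests for BPF by treatment arm (weeks 0 and 96 shown)
    t.test(nihexample$bpf0 ~ nihexample$trt.group)
    t.test(nihexample$bpf96 ~ nihexample$trt.group)

    ### Row 6 ###
    # Random-intercept model; the treatment-by-week interaction is the slope difference
    library(lme4)
    library(lmerTest)  # adds p-values to summary() of lmer fits
    summary(lmer(bpf ~ trt.group + week + trt.group * week + (1 | ID),
                 data = nihexamplelong))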
R-Program for Table 2
    library(lmerTest)  # lmer() with p-values in summary(); needed for coefficients[4, 5]

    # H-C mutual information and ITMA of the probit linear predictor for T,
    # averaged over the treatment arms.
    #   pt:    P(T = 1) in the control arm (pt[1]) and the treatment arm (pt[2])
    #   preds: linear predictors from the probit model
    #   pz:    proportion of patients in the treatment arm
    #   trt:   treatment indicator (0 = control, 1 = treatment)
    hcentr <- function(pt, preds, pz, trt, alpha) {
      s2 <- var(preds)
      arm_mi <- function(p_arm, x) {
        q <- pnorm(x)
        if (alpha != 1) {
          ((2 * pi * s2)^((1 - alpha) / 2) / sqrt(alpha) *
             (p_arm^alpha + (1 - p_arm)^alpha) -
             mean(q^alpha + (1 - q)^alpha)) / (1 - alpha)
        } else {
          # (1 - q) is parenthesized so that the second term is (1 - q) * log(1 - q)
          mean(q * log(q)) + mean((1 - q) * log(1 - q)) -
            p_arm * log(p_arm) - (1 - p_arm) * log(1 - p_arm)
        }
      }
      mtrinf <- pz * arm_mi(pt[2], preds[trt == 1]) +
        (1 - pz) * arm_mi(pt[1], preds[trt == 0])
      itma <- 1 - exp(-2 * mtrinf)
      c(mtrinf, itma)
    }

    ### Row 1 ###
    pt <- table(nihexample$CTh96YesNo, nihexample$trt.group)[2, ] /
      table(nihexample$trt.group)
    myprobit1 <- glm(CTh96YesNo ~ trt.group + bpf0 + bpf24,
                     family = binomial(link = "probit"), data = nihexample)
    lp  <- myprobit1$linear.predictors
    trt <- nihexample$trt.group
    pz  <- sum(trt) / length(trt)
    lmm1 <- lmer(bpf ~ trt.group + week + trt.group * week + (1 | ID),
                 data = nihexamplelong[nihexamplelong$week == 0 |
                                         nihexamplelong$week == 24, ])
    row1 <- round(c(hcentr(pt, lp, pz, trt, 0.5),
                    hcentr(pt, lp, pz, trt, 1),
                    hcentr(pt, lp, pz, trt, 2),
                    summary(lmm1)$coefficients[4, 5]), 4)
    ### similar code for the other rows ###
R-Program for Table 3
library(mvtnorm)    # pmvnorm()
library(mvProbit)   # mvProbit()
library(miscTools)  # symMatrix()

#### (T, Z): clinical endpoint by treatment group ####
table(nihexample$CTh96YesNo, nihexample$trt.group)

#### Choose variables: "bpf0", "bpf24", "bpf48", "bpf72", "bpf96" (1-5) ####
fullmodel1 = mvProbit(cbind(CTh96YesNo, trt.group) ~ bpf0 + bpf24 + bpf48 + bpf72 + bpf96,
                      data = nihexample)
#### Model ####
summary(fullmodel1)
est = fullmodel1$estimate
# The last estimate is the correlation between the two latent probit errors
sigma = symMatrix(c(1, est[length(est)], 1))

################################################
# Linear predictors for T (mu1) and Z (mu2).
# Reduced models: refit with fewer visits and use the matching pair below.
# Visits bpf0, bpf24:
# mu1 = est[1] + est[2]*nihexample$bpf0 + est[3]*nihexample$bpf24
# mu2 = est[4] + est[5]*nihexample$bpf0 + est[6]*nihexample$bpf24
# Visits bpf0, bpf24, bpf48:
# mu1 = est[1] + est[2]*nihexample$bpf0 + est[3]*nihexample$bpf24 + est[4]*nihexample$bpf48
# mu2 = est[5] + est[6]*nihexample$bpf0 + est[7]*nihexample$bpf24 + est[8]*nihexample$bpf48
# Visits bpf0, bpf24, bpf48, bpf72:
# mu1 = est[1] + est[2]*nihexample$bpf0 + est[3]*nihexample$bpf24 + est[4]*nihexample$bpf48 + est[5]*nihexample$bpf72
# mu2 = est[6] + est[7]*nihexample$bpf0 + est[8]*nihexample$bpf24 + est[9]*nihexample$bpf48 + est[10]*nihexample$bpf72
# Full model (all five visits):
mu1 = est[1] + est[2]*nihexample$bpf0 + est[3]*nihexample$bpf24 + est[4]*nihexample$bpf48 +
      est[5]*nihexample$bpf72 + est[6]*nihexample$bpf96
mu2 = est[7] + est[8]*nihexample$bpf0 + est[9]*nihexample$bpf24 + est[10]*nihexample$bpf48 +
      est[11]*nihexample$bpf72 + est[12]*nihexample$bpf96
fullmodel1$estimate
sigma

##################### T|S, Z|S #####################
bs = 10000
set.seed(873465)
# Bootstrap subject IDs
BT = sample(c(1:length(nihexample$CTh96YesNo)), size = bs, replace = TRUE)
BT_U1 = matrix(0, bs, 2)  # univariate normal probabilities
BT_U2 = matrix(0, bs, 4)  # bivariate normal probabilities
for (i in 1:bs) {
  id = BT[i]
  m = c(mu1[id], mu2[id])
  BT_U1[i, 1] = 1 - pnorm(0, mu1[id], 1)  # P(T=1 | S)
  BT_U1[i, 2] = 1 - pnorm(0, mu2[id], 1)  # P(Z=1 | S)
  BT_U2[i, 1] = pmvnorm(lower = c(0, 0), upper = c(Inf, Inf), mean = m, sigma = sigma)    # P(T=1, Z=1 | S)
  BT_U2[i, 2] = pmvnorm(lower = c(0, -Inf), upper = c(Inf, 0), mean = m, sigma = sigma)   # P(T=1, Z=0 | S)
  BT_U2[i, 3] = pmvnorm(lower = c(-Inf, 0), upper = c(0, Inf), mean = m, sigma = sigma)   # P(T=0, Z=1 | S)
  BT_U2[i, 4] = pmvnorm(lower = c(-Inf, -Inf), upper = c(0, 0), mean = m, sigma = sigma)  # P(T=0, Z=0 | S)
}

##################### alfa = 0.5 #####################
alfa = 0.5
p1 = mean((BT_U1[,1]*BT_U1[,2])^alfa + ((1-BT_U1[,1])*BT_U1[,2])^alfa +
          (BT_U1[,1]*(1-BT_U1[,2]))^alfa + ((1-BT_U1[,1])*(1-BT_U1[,2]))^alfa)
p2 = mean(BT_U2[,1]^alfa + BT_U2[,2]^alfa + BT_U2[,3]^alfa + BT_U2[,4]^alfa)
I_alfa = 1/(1-alfa)*(p1-p2)
IM_alfa = 1 - exp(-2*I_alfa)
I_alfa
IM_alfa

##################### alfa = 2 #####################
alfa = 2
p1 = mean((BT_U1[,1]*BT_U1[,2])^alfa + ((1-BT_U1[,1])*BT_U1[,2])^alfa +
          (BT_U1[,1]*(1-BT_U1[,2]))^alfa + ((1-BT_U1[,1])*(1-BT_U1[,2]))^alfa)
p2 = mean(BT_U2[,1]^alfa + BT_U2[,2]^alfa + BT_U2[,3]^alfa + BT_U2[,4]^alfa)
I_alfa = 1/(1-alfa)*(p1-p2)
IM_alfa = 1 - exp(-2*I_alfa)
I_alfa
IM_alfa

##################### alfa = 1 (Shannon entropy) #####################
p1 = -mean(BT_U1[,1]*log(BT_U1[,1]) + (1-BT_U1[,1])*log(1-BT_U1[,1]))  # H(T | S)
p2 = -mean(BT_U1[,2]*log(BT_U1[,2]) + (1-BT_U1[,2])*log(1-BT_U1[,2]))  # H(Z | S)
p3 = mean(BT_U2[,1]*log(BT_U2[,1]) + BT_U2[,2]*log(BT_U2[,2]) +
          BT_U2[,3]*log(BT_U2[,3]) + BT_U2[,4]*log(BT_U2[,4]))         # -H(T, Z | S)
I = p1 + p2 + p3
IM = 1 - exp(-2*I)
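The three blocks above repeat the same calculation for different values of alfa. As a minimal sketch, that calculation could be wrapped in one helper. The function name hc_information and its argument names are hypothetical and do not appear in the original program. The sketch assumes the inputs have the same layout as BT_U1 (columns P(T=1|S), P(Z=1|S)) and BT_U2 (the four joint cell probabilities), and it treats 0*log(0) as 0 in the Shannon case.

```r
# Hypothetical helper (not part of the original program): H-C information
# I_alfa and the derived measure IM_alfa = 1 - exp(-2 * I_alfa).
# marg:  n x 2 matrix with columns P(T=1|S), P(Z=1|S)   (layout of BT_U1)
# joint: n x 4 matrix of joint cell probabilities       (layout of BT_U2)
hc_information <- function(marg, joint, alpha) {
  pT <- marg[, 1]
  pZ <- marg[, 2]
  # Cell probabilities if T and Z were independent given S
  indep <- cbind(pT * pZ, pT * (1 - pZ), (1 - pT) * pZ, (1 - pT) * (1 - pZ))
  if (isTRUE(all.equal(alpha, 1))) {
    # Shannon case: H(T|S) + H(Z|S) - H(T,Z|S), with 0 * log(0) taken as 0
    xlogx <- function(p) ifelse(p > 0, p * log(p), 0)
    info <- mean(rowSums(xlogx(joint)) - rowSums(xlogx(indep)))
  } else {
    # Same expression as I_alfa = 1/(1-alfa) * (p1 - p2) in the program above
    info <- (mean(rowSums(indep^alpha)) - mean(rowSums(joint^alpha))) / (1 - alpha)
  }
  c(I = info, IM = 1 - exp(-2 * info))
}

# Toy check with made-up probabilities (not trial data): T and Z perfectly
# dependent, each with marginal probability 0.5
toy_marg  <- matrix(c(0.5, 0.5), nrow = 1)
toy_joint <- matrix(c(0.5, 0, 0, 0.5), nrow = 1)
res_shannon <- hc_information(toy_marg, toy_joint, alpha = 1)
res_alpha2  <- hc_information(toy_marg, toy_joint, alpha = 2)
print(res_shannon)
print(res_alpha2)
```

With the objects from the program above, the call would be hc_information(BT_U1, BT_U2, alpha = 0.5), and likewise for alpha = 2 or alpha = 1. When the joint probabilities equal the product of the marginals, both I and IM are zero.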
Footnotes
Conflicts of Interest: The authors declare no conflict of interest.
Data Availability Statement:
De-identified data from the “Safety, Tolerability and Activity Study of Ibudilast in Subjects with Progressive Multiple Sclerosis” (NCT01982942) can be requested through the US National Institute of Neurological Disorders and Stroke and from Fox, the Principal Investigator of the trial.
References
- 1. Clinical Trial Endpoints for the Approval of Cancer Drugs and Biologics: Guidance for Industry. U.S. Department of Health and Human Services. Available online: fda.gov (accessed on January 29, 2022).
- 2. Kim C; Prasad V. Cancer drugs approved on the basis of a surrogate end point and subsequent overall survival: An analysis of 5 years of US Food and Drug Administration approvals. JAMA Intern. Med. 2015, 175, 1992–1994. doi:10.1001/jamainternmed.2015.5868.
- 3. Schwartz LH; Litière S; de Vries E; Ford R; Gwyther S; Mandrekar S; Shankar L; Bogaerts J; Chen A; Dancey J; et al. RECIST 1.1-update and clarification: From the RECIST committee. Eur. J. Cancer 2016, 62, 132–137. doi:10.1016/j.ejca.2016.03.081.
- 4. Karrison TG; Maitland ML; Stadler WM; Ratain MJ. Design of phase II cancer trials using a continuous endpoint of change in tumor size: Application to a study of sorafenib and erlotinib in non-small-cell lung cancer. J. Natl. Cancer Inst. 2007, 99, 1455–1461; Erratum in J. Natl. Cancer Inst. 2007, 99, 1819.
- 5. Burzykowski T; Coart E; Saad ED; Shi Q; Sommeijer DW; Bokemeyer C; Díaz-Rubio E; Douillard JY; Falcone A; Fuchs CS; et al. Evaluation of continuous tumor-size–based end points as surrogates for overall survival in randomized clinical trials in metastatic colorectal cancer. JAMA Netw. Open 2019, 2, e1911750. doi:10.1001/jamanetworkopen.2019.11750.
- 6. Lu Y. Statistical considerations for quantitative imaging measures in clinical trials. In Biopharmaceutical Applied Statistics Symposium: Volume 3 Pharmaceutical Applications; Peace KE, Chen D-G, Menon S, Eds.; ICSA Book Series in Statistics; Springer: Singapore, 2018; pp. 219–240.
- 7. Chen EY; Joshi SK; Tran A; Prasad V. Estimation of study time reduction using surrogate end points rather than overall survival in oncology clinical trials. JAMA Intern. Med. 2019, 179, 642–647. doi:10.1001/jamainternmed.2018.8351.
- 8. Kok PS; Yoon WH; Lord S; Marschner I; Friedlander M; Lee CK. Tumor response end points as surrogates for overall survival in immune checkpoint inhibitor trials: A systematic review and meta-analysis. JCO Precis. Oncol. 2021, 5, 1151–1159. doi:10.1200/PO.21.00108.
- 9. Shameer K; Zhang Y; Jackson D; Rhodes K; Neelufer IKA; Nampally S; Prokop A; Hutchison E; Ye J; Malkov VA; et al. Correlation between early endpoints and overall survival in non-small-cell lung cancer: A trial-level meta-analysis. Front. Oncol. 2021, 11, 672916. doi:10.3389/fonc.2021.672916.
- 10. Haslam A; Hey SP; Gill J; Prasad V. A systematic review of trial-level meta-analyses measuring the strength of association between surrogate end-points and overall survival in oncology. Eur. J. Cancer 2019, 106, 196–211. doi:10.1016/j.ejca.2018.11.012.
- 11. Prentice RL. Surrogate endpoints in clinical trials: Definitions and operational criteria. Stat. Med. 1989, 8, 431–440.
- 12. Freedman LS; Graubard BI; Schatzkin A. Statistical validation of intermediate endpoints for chronic diseases. Stat. Med. 1992, 11, 167–178.
- 13. Wang Y; Taylor JM. A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics 2002, 58, 803–812.
- 14. Taylor JM; Wang Y; Thiébaut R. Counterfactual links to the proportion of treatment effect explained by a surrogate marker. Biometrics 2005, 61, 1102–1111.
- 15. Parast L; Tian L; Cai T. Landmark estimation of survival and treatment effect in a randomized clinical trial. J. Am. Stat. Assoc. 2014, 109, 384–394. doi:10.1080/01621459.2013.842488.
- 16. Parast L; McDermott MM; Tian L. Robust estimation of the proportion of treatment effect explained by surrogate marker information. Stat. Med. 2016, 35, 1637–1653. doi:10.1002/sim.6820.
- 17. Frangakis CE; Rubin DB. Principal stratification in causal inference. Biometrics 2002, 58, 21–29.
- 18. Conlon AS; Taylor JM; Elliott MR. Surrogacy assessment using principal stratification when surrogate and outcome measures are multivariate normal. Biostatistics 2014, 15, 266–283.
- 19. Huang Y; Gilbert PB. Comparing biomarkers as principal surrogate endpoints. Biometrics 2011, 67, 1442–1451.
- 20. Gabriel EE; Gilbert PB. Evaluating principal surrogate endpoints with time-to-event data accounting for time-varying treatment efficacy. Biostatistics 2014, 15, 251–265.
- 21. Gabriel EE; Sachs MC; Gilbert PB. Comparing and combining biomarkers as principle surrogates for time-to-event clinical endpoints. Stat. Med. 2015, 34, 381–395.
- 22. Gilbert PB; Hudgens MG. Evaluating candidate principal surrogate endpoints. Biometrics 2008, 64, 1146–1154.
- 23. Buyse M; Molenberghs G. Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics 1998, 54, 1014–1029.
- 24. Alonso A; Molenberghs G. Surrogate marker evaluation from an information theoretic perspective. Biometrics 2007, 63, 180–186.
- 25. Pryseley A; Tilahun A; Alonso A; Molenberghs G. Information-theory based surrogate marker evaluation from several randomized clinical trials with continuous true and binary surrogate endpoints. Clin. Trials 2007, 4, 587–597.
- 26. Alonso A; Molenberghs G. Evaluating time to cancer recurrence as a surrogate marker for survival from an information theory perspective. Stat. Methods Med. Res. 2008, 17, 497–504.
- 27. Alonso A; Bigirumurame T; Burzykowski T; Buyse M; Molenberghs G; Muchene L; Perualila NJ; Shkedy Z; Van der Elst W. Applied Surrogate Endpoint Evaluation Methods with SAS and R. Chapman and Hall, 2017. doi:10.1201/9781315372662.
- 28. Ensor H; Weir CJ. Evaluation of surrogacy in the multi-trial setting based on information theory: An extension to ordinal outcomes. J. Biopharm. Stat. 2020, 30, 364–376. doi:10.1080/10543406.2019.1696357.
- 29. Havrda J; Charvát F. Quantification method of classification processes. Concept of structural α-entropy. Kybernetika 1967, 3, 30–35.
- 30. Tsallis C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487.
- 31. Amigó JM; Balogh SG; Hernández S. A brief review of generalized entropies. Entropy 2018, 20, 813.
- 32. Wachowiak MP; Smolíková R; Tourassi GD; Elmaghraby AS. Similarity metrics based on nonadditive entropies for 2D-3D multimodal biomedical image registration. In Medical Imaging 2003: Image Processing; International Society for Optics and Photonics: Bellingham, WA, USA, 2003; Volume 5032, pp. 1090–1100.
- 33. Owen D. A table of normal integrals. Commun. Stat. Simul. Comput. 1980, 9, 389–419.
- 34. Chib S; Greenberg E. Analysis of multivariate probit models. Biometrika 1998, 85, 347–361.
- 35. Fox RJ; Coffey CS; Conwit R; Cudkowicz ME; Gleason T; Goodman A; Klawiter EC; Matsuda K; McGovern M; Naismith RT; et al. NN102/SPRINT-MS trial investigators. Phase 2 trial of ibudilast in progressive multiple sclerosis. N. Engl. J. Med. 2018, 379, 846–855. doi:10.1056/NEJMoa1803583.
- 36. Biswas A; Pardo MC; Guha A. Auto-association measures for stationary time series of categorical data. TEST 2004, 23, 487–514.
- 37. Andonie R; Petrescu F. Interacting systems and informational energy. Found. Control Eng. 1986, 11, 53–59.
- 38. Pardo JA; Pardo MC; Vicente ML; Esteban MD. A statistical information theory approach to compare the homogeneity of several variances. Comput. Stat. Data Anal. 1997, 24, 411–416.
- 39. Menéndez ML; Pardo JA; Pardo MC. Estimators based on sample quantiles using (h,φ)-entropy measures. Appl. Math. Lett. 1998, 11, 99–104.