Author manuscript; available in PMC: 2020 Nov 20.
Published in final edited form as: Stat Theory Relat Fields. 2019 Mar 19;2019:10.1080/24754269.2019.1587692. doi: 10.1080/24754269.2019.1587692

Interval Estimation for Minimal Clinically Important Difference and its Classification Error via a Bootstrap Scheme

Zehua Zhou 1, Jiwei Zhao 1,*, Melissa Kluczynski 2
PMCID: PMC7678023  NIHMSID: NIHMS1536634  PMID: 33225201

Abstract

With improved knowledge of clinical relevance and more convenient access to patient-reported outcome data, clinical researchers increasingly prefer the minimal clinically important difference (MCID) to statistical significance as a standard for assessing the effectiveness of an intervention or treatment in clinical trials. A practical method for determining the MCID is based on the diagnostic measurement; under this approach, the MCID can be formulated as the solution of a large-margin classification problem. However, this method produces only a point estimate and thus offers no way to evaluate its performance. In this paper we introduce an m-out-of-n bootstrap approach that provides interval estimates for the MCID and for its classification error, an associated accuracy measure for performance assessment. Extensive simulation studies illustrate the advantages of the proposed method. The chondral lesions and meniscus procedures (ChAMP) trial is our motivating example and is used to illustrate the method.

Keywords: Minimal clinically important difference, Classification error, Confidence interval, Non-convex optimization, Bootstrap, m-out-of-n bootstrap

1. Introduction

Statistical significance is widely reported in clinical studies to infer treatment effects. For instance, in a randomized controlled trial comparing debridement with observation of chondral lesions encountered during partial meniscectomy (Bisson et al., 2017), the difference between patient outcomes before surgery and one year after surgery is used to assess whether a statistically significant effect exists.

Although this framework based on a p-value threshold objectifies the research outcome, relying on it alone can have two potentially serious consequences. First, statistical significance only signifies the existence of a treatment effect, regardless of the effect size; significance may simply result from a huge sample size and hence be clinically irrelevant to patients. Second, a clinically important effect could be classified as statistically non-significant for various reasons, such as a small sample size, and hence be unfairly ignored. In brief, statistical significance does not necessarily imply clinical importance, and vice versa.

Over the years, clinical investigators have come to realize that determining a treatment's clinical importance is more valuable and reliable than merely seeking its statistical significance. In addition, the development of various patient-rated instruments has produced large amounts of patient-reported outcome (PRO) data, giving researchers the opportunity to study clinical relevance. To study clinical importance, Jaeschke et al. (1989) proposed the concept of the minimal clinically important difference (MCID). It is defined as the smallest change in an outcome that an individual patient would identify as important, and therefore offers a threshold above which the outcome is experienced as relevant by patients. This avoids the problem of mere statistical significance (Wright et al., 2012). The MCID provides an objective reference for clinicians and health policy makers regarding the effectiveness of a treatment, and hence has quickly gained popularity (McGlothlin and Lewis, 2014; Erdogan et al., 2016).

A variety of methods have been proposed to calculate the MCID. The anchor-based method compares changes in scores against an anchor as the reference; a popular choice is an anchor question in a questionnaire. For instance, the short form 36 (SF-36) health survey (Ware Jr and Sherbourne, 1992) serves this role in the ChAMP trial (Bisson et al., 2017; Kluczynski et al., 2017; Bisson et al., 2018, 2019). Hedayat et al. (2015) adopted this anchor-based method and formulated the MCID as the threshold value of the post-treatment change that minimizes the probability of disagreement between the satisfaction status predicted from the MCID and the PRO.

Although the proposal in Hedayat et al. (2015) is statistically rigorous and paves the way for potential extensions, it has some limitations. First, it relies on a testing data set with a very large sample size for implementation and assessment; the sample size of a typical clinical study is much smaller, so such a testing data set is rarely available in real applications. Second, Hedayat et al. (2015) provides only a point estimate of the MCID, which is not informative enough in most clinical studies (Cook, 2008; Erdogan et al., 2016). Without an interval estimate, it is unknown how accurate the point estimate is. Furthermore, without an interval estimate there is no way to compare MCIDs derived for different population subgroups, so the population heterogeneity cannot be learned.

In this paper, we aim to solve the problems mentioned above and fill this gap in the literature. We first introduce the concept of the classification error to gauge the effectiveness of the MCID. This concept also allows us to compare MCIDs derived for different population subgroups or computed by different methods. Second, using the m-out-of-n bootstrap technique, we obtain interval estimates of the MCID and of its classification error. The interval estimates make it possible to conduct statistical inference on the MCID and to learn the population heterogeneity it reflects.

Our proposal has two distinct features. First, unlike Hedayat et al. (2015), our framework does not rely on a testing data set with a large sample size, and hence can be conveniently used in various clinical studies. Second, although the bootstrap has been a well-established statistical technique since Efron (1979), its conventional version cannot be applied directly in our context because the conditions required for its validity are not met. Instead, we adopt the m-out-of-n bootstrap, whose theoretical properties are justified in Shao (1994), Shao (1996), and Bickel et al. (1997), among others.

In the remainder of this paper, we first introduce our motivating example, the ChAMP trial, in Section 2. In Section 3 we review the concept of the MCID and introduce the classification error. Our methodology, including both the simple linear and the nonparametric kernel MCID and the bootstrap scheme for computing confidence intervals, is presented in Section 4. We show the finite-sample performance of the proposed method through simulation studies in Section 5 and apply the method to the ChAMP trial in Section 6. Mathematical details are collected in the Appendix.

2. Motivating Example: ChAMP Trial

Our motivating example is the chondral lesions and meniscus procedures (ChAMP) trial, which examines whether the presence of chondral lesions surrounding the knee cartilage affects patients' recovery from arthroscopic partial meniscectomy (APM) (Bisson et al., 2017). In orthopaedics, APM is one of the most common treatment options for repairing knee damage, especially in patients with a meniscus tear. During the operation, however, surgeons often find additional knee damage in the form of chondral lesions. The effect of these chondral lesions on patients' post-operative outcomes is unclear, and whether these lesions need to be treated by debridement remains an open question. The ChAMP trial was therefore designed to help clinicians better understand this relation and to provide reasonable guidance on preoperative evaluation and treatment choice.

This study enrolled eligible patients who were ≥ 30 years old, were diagnosed with a symptomatically consistent meniscus tear by magnetic resonance imaging, and underwent APM. Of the enrolled subjects, 190 patients with surgically significant chondral lesions were randomized to receive debridement (CL-Deb group; n = 98) or observation (CL-noDeb group; n = 92). Outcome measures included the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and the SF-36 health survey. Each outcome was evaluated at baseline and 1 year postoperatively. Demographic data such as age and sex at baseline and surgical data, including the location and type of meniscal tears, were also collected.

The major goal of the study is to assess whether and how the debridement group differs from the observation group in terms of the change in the WOMAC pain score from enrollment to one year after surgery, and how this change relates to other covariates and clinical biomarkers. In our investigation, we focus on the interval estimation of the MCID and its classification error. We use an anchor-based method to compute the MCID, with the anchor question taken from the SF-36 health survey.

3. The MCID and the Classification Error

In the ChAMP trial, we denote each patient's reported outcome in the SF-36 health survey as a binary variable Y, where Y = 1 if the patient reports a better health condition after surgery and Y = −1 otherwise. The difference in each patient's WOMAC pain score from baseline to one year after surgery, denoted X, is treated as the patient's diagnostic measurement. Let the p-dimensional covariate $Z \in \mathbb{R}^p$ be the patient's clinical profile.

It is reasonable and of interest to consider the MCID $c^*$ as a function of the patient's clinical profile, $c^*(z)$; the population heterogeneity can then be learned from $c^*(z)$. Following Hedayat et al. (2015), $c^*(z)$ is defined as the minimizer of

$$P\big[Y \neq \mathrm{sign}\{X - c(Z)\}\big] = \frac{1}{2} E\big[1 - Y\,\mathrm{sign}\{X - c(Z)\}\big], \qquad (1)$$

where E is the expectation taken with respect to (X, Y, Z) and sign(∙) is the standard sign function. Given independent and identically distributed observations {(xi, yi, zi), i = 1, ... , n}, the empirical version of the objective function in (1) becomes

$$\frac{1}{2n}\sum_{i=1}^{n}\big[1 - y_i\,\mathrm{sign}\{x_i - c(z_i)\}\big], \qquad (2)$$

which involves the 0-1 loss function $L_{01}(u) = \frac{1}{2}\{1 - \mathrm{sign}(u)\}$. Direct minimization of (2) is infeasible. In this paper, we follow Hedayat et al. (2015) and approximate $L_{01}$ by the non-smooth ramp loss function, defined as

$$L_\delta(u) = \begin{cases} 1, & u \le 0,\\ 1 - u/\delta, & 0 < u \le \delta,\\ 0, & u > \delta, \end{cases}$$

where δ > 0 is a scale factor. As δ → 0, $L_\delta(\cdot) \to L_{01}(\cdot)$. Our objective function then becomes

$$\frac{1}{n}\sum_{i=1}^{n} L_\delta\{y_i(x_i - c(z_i))\}. \qquad (3)$$

Because the non-smooth ramp loss is non-convex, the optimization problem in (3) requires non-convex minimization. Note that we can write $L_\delta(u) = L_1(u) - L_2(u)$, where both $L_1(u) = \frac{1}{\delta}(\delta - u)_+$ and $L_2(u) = \frac{1}{\delta}(-u)_+$ are convex functions. Hence we apply the difference of convex (DC) algorithm (Thi Hoai An and Dinh Tao, 1997) to minimize (3), which takes the form

$$\frac{1}{n}\sum_{i=1}^{n} L_1\{y_i(x_i - c(z_i))\} - \frac{1}{n}\sum_{i=1}^{n} L_2\{y_i(x_i - c(z_i))\}. \qquad (4)$$

Figure 1 below illustrates the relation among the L01, Lδ, L1 and L2 loss functions.

Figure 1: The illustration of the 0-1 loss $L_{01}$, its surrogate non-smooth ramp loss $L_\delta$, and the DC decomposition $L_\delta = L_1 - L_2$.
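For readers who prefer code to formulas, the short sketch below (our own illustration in Python, not taken from the paper) evaluates the non-smooth ramp loss and its DC decomposition $L_\delta = L_1 - L_2$; the printed values show how $L_\delta$ behaves like the 0-1 loss once δ is small.

```python
import numpy as np

def L1(u, delta):
    # convex piece L1(u) = (delta - u)_+ / delta
    return np.maximum(delta - u, 0.0) / delta

def L2(u, delta):
    # convex piece L2(u) = (-u)_+ / delta
    return np.maximum(-u, 0.0) / delta

def ramp_loss(u, delta):
    # L_delta(u) = 1 if u <= 0, 1 - u/delta if 0 < u <= delta, 0 if u > delta
    return L1(u, delta) - L2(u, delta)

u = np.array([-1.0, -0.001, 0.005, 0.02, 1.0])
print(ramp_loss(u, delta=0.01))  # [1.  1.  0.5 0.  0. ]
```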

Once the minimizer $\hat{c}$ is obtained, we can validate whether debridement or observation of the chondral lesions is prescribed correctly for each patient. This is essential because it provides guidance for future surgical practice with new patients. To re-evaluate whether the treatment is offered appropriately, we need a statistical measure that quantifies the discrepancy between a patient's PRO $Y_0$ and the dichotomization $\mathrm{sign}\{X_0 - \hat{c}(Z_0)\}$ implied by the learned MCID. To avoid confusion, we use the generic notation $(Y_0, X_0, Z_0)$ for any patient to be validated. This results in

$$E_0\big[1\{Y_0 \neq \mathrm{sign}(X_0 - \hat{c}(Z_0))\}\big], \qquad (5)$$

where $E_0$ is the expectation taken with respect to $(Y_0, X_0, Z_0)$. While other measures could be studied, the quantity in (5), usually called the classification error or the test error, is widely used in the statistical machine learning literature.

Estimating the classification error is not a trivial task (Laber and Murphy, 2011). In this paper, we concentrate on constructing confidence intervals for both the MCID and the classification error.
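As a concrete illustration of (5), the empirical classification error over a validation set is simply the disagreement rate between the PRO and the MCID-based dichotomization. The sketch below is our own; the function name and the tie-breaking convention at $X_0 = \hat{c}(Z_0)$ are assumptions, since the paper does not specify how ties are handled.

```python
import numpy as np

def classification_error(y, x, c_hat_values):
    """Empirical version of (5): y holds the PROs (+1/-1), x the diagnostic
    measurements, and c_hat_values the fitted MCID evaluated at each z."""
    pred = np.where(x - c_hat_values > 0, 1, -1)  # ties (x == c_hat) sent to -1 here
    return np.mean(y != pred)
```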

4. Methodology

In this section, we first present detailed algorithms for calculating the MCID. We consider both the simple linear MCID and its nonparametric kernel counterpart. We then introduce our bootstrap scheme for constructing confidence intervals for the MCID and the classification error.

4.1. Algorithms for MCID

It is of particular interest to clinicians if the MCID has a comprehensible structure, for instance $c(Z) = \alpha + \beta^T Z$, where Z could include the treatment variable, demographic variables and clinical biomarkers. This is the simple linear MCID we consider below. Although easily interpretable, a linear structure suffers from possible model misspecification and may therefore yield a suboptimal solution. We therefore also consider a nonparametric kernel MCID based on the reproducing kernel Hilbert space framework.

4.1.1. A Simple Linear MCID

We assume $c(Z) = \alpha + \beta^T Z$. We add a penalty term $\frac{\lambda}{2}\beta^T\beta$ to (4) to avoid overfitting. Let $\omega = (\alpha, \beta^T)^T$; then the objective function in (4) is $s(\omega) = s_1(\omega) - s_2(\omega)$, where

$$s_1(\omega) = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{1}{\delta}\{\delta - y_i(x_i - \alpha - \beta^T z_i)\}_+\Big] + \frac{\lambda}{2}\beta^T\beta, \qquad s_2(\omega) = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{1}{\delta}\{-y_i(x_i - \alpha - \beta^T z_i)\}_+\Big].$$

The minimization of (4) proceeds iteratively. Let $\hat\omega^{(k)}$ be the estimator of ω at the kth iteration. We first approximate $s_2(\omega)$ by its affine minorization $s_2(\hat\omega^{(k)}) + \langle \omega - \hat\omega^{(k)}, \partial s_2(\hat\omega^{(k)})\rangle$, where $\partial s_2(\hat\omega^{(k)})$ is the subgradient of $s_2(\omega)$ at $\hat\omega^{(k)}$,

$$\partial s_2(\hat\omega^{(k)}) = \begin{pmatrix} \dfrac{1}{n\delta}\sum_{i=1}^{n} y_i\, 1\{y_i(x_i - \alpha^{(k)} - \beta^{(k)T} z_i) < 0\} \\[2ex] \dfrac{1}{n\delta}\sum_{i=1}^{n} y_i z_i\, 1\{y_i(x_i - \alpha^{(k)} - \beta^{(k)T} z_i) < 0\} \end{pmatrix}.$$

Hence

$$\hat\omega^{(k+1)} = \arg\min_{\omega}\; s_1(\omega) - \omega^T \partial s_2(\hat\omega^{(k)}) = \arg\min_{\omega}\; s_1(\omega) - \frac{1}{n\delta}\sum_{i=1}^{n} y_i(\alpha + \beta^T z_i)\, 1\{y_i(x_i - \alpha^{(k)} - \beta^{(k)T} z_i) < 0\}. \qquad (6)$$

To solve (6), we derive its dual problem using the slack variable technique; the details are given in Appendix A. We arrive at the dual problem

$$\min_{\tau}\; \tau^T Q \tau - \{b + 2 Q\, t_1(\alpha^{(k)}, \beta^{(k)})\}^T \tau, \qquad (7)$$

subject to $0 \le \tau_i \le 1$ and $\sum_{i=1}^{n} \frac{1}{\delta} y_i\{\tau_i - t_{1,i}(\alpha^{(k)}, \beta^{(k)})\} = 0$. This optimization problem has only simple box constraints and one linear equality constraint, and hence can be solved by any standard quadratic programming method.
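As a sketch of how (7) might be solved in practice, the snippet below (our illustration; it assumes Q, b, $t_1(\alpha^{(k)},\beta^{(k)})$, y and δ have already been assembled as defined above and in Appendix A) hands the box and equality constraints to a general-purpose solver. A dedicated QP routine would exploit the quadratic structure more efficiently; SLSQP is used here only because it accepts the constraints without reformulation.

```python
import numpy as np
from scipy.optimize import minimize

def solve_dual_qp(Q, b, t1, y, delta):
    """Sketch: minimize tau'Q tau - (b + 2*Q*t1)'tau subject to
    0 <= tau_i <= 1 and (1/delta) * sum_i y_i * (tau_i - t1_i) = 0."""
    n = len(y)
    lin = b + 2.0 * Q @ t1                         # linear coefficient vector

    def objective(tau):
        return tau @ Q @ tau - lin @ tau

    def gradient(tau):
        return (Q + Q.T) @ tau - lin

    eq_constraint = {"type": "eq",
                     "fun": lambda tau: np.sum(y * (tau - t1)) / delta}
    res = minimize(objective, x0=np.full(n, 0.5), jac=gradient,
                   bounds=[(0.0, 1.0)] * n, constraints=[eq_constraint],
                   method="SLSQP")
    return res.x  # the estimated dual variables tau_hat
```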

We conclude the algorithm by presenting how $\hat\alpha$ and $\hat\beta$ are computed. The Karush-Kuhn-Tucker (KKT) conditions associated with the optimization problem (7) are

$$\frac{1}{n\lambda\delta}\sum_{i=1}^{n} y_i\{t_{1,i}(\alpha^{(k)}, \beta^{(k)}) - \tau_i\} z_i = \beta, \qquad (8)$$
$$1 - \tau_i \ge 0, \qquad (9)$$
$$\tau_i\Big\{\xi_i - 1 + \frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i)\Big\} = 0, \qquad (10)$$
$$\xi_i \ge 0, \qquad (11)$$
$$\xi_i - 1 + \frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) \ge 0, \qquad (12)$$
$$(1 - \tau_i)\xi_i = 0. \qquad (13)$$

Therefore we have three scenarios to discuss depending on the magnitude of $\tau_i$: if $\tau_i = 0$, by (12) and (13) we get $\xi_i = 0$ and $\frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 \ge 0$; if $1 > \tau_i > 0$, by (10) and (13) we have $\xi_i = 0$ and $\frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 = 0$; if $\tau_i = 1$, by (10), (12) and (13) we have $\frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 = -\xi_i \le 0$. Therefore we can summarize the KKT conditions more concisely as

$$\frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 = \xi_i \ge 0 \quad \text{if } \tau_i < 1, \qquad \frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 = -\xi_i \le 0 \quad \text{if } \tau_i > 0,$$

which implies

$$\frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i) - 1 = 0 \quad \text{if } 0 < \tau_i < 1. \qquad (14)$$

Hence through (8) we can estimate β(k+1) as

$$\hat\beta^{(k+1)} = \frac{1}{n\lambda\delta}\sum_{i=1}^{n} y_i\{t_{1,i}(\alpha^{(k)}, \beta^{(k)}) - \hat\tau_i\} z_i,$$

and through (14) we can estimate α(k+1) as

$$\hat\alpha^{(k+1)} = \frac{1}{|\{i: 0 < \tau_i < 1\}|}\sum_{i:\, 0 < \tau_i < 1}\big(x_i - \delta y_i - \hat\beta^{(k+1)T} z_i\big).$$

Finally, we obtain the estimators $\hat\alpha$ and $\hat\beta$ once the iterative process (6) converges. For any new patient with profile $z_{new}$, the predicted linear MCID is

$$\hat{c}(z_{new}) = \hat\alpha + \hat\beta^T z_{new}.$$
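The closed-form updates above translate directly into code. The sketch below (our notation; the tolerance guarding the strict inequalities $0 < \tau_i < 1$ is an implementation choice of ours) recovers $\hat\beta^{(k+1)}$ from (8) and $\hat\alpha^{(k+1)}$ from (14) given the dual solution $\hat\tau$:

```python
import numpy as np

def update_beta_alpha(tau, t1, x, y, Z, lam, delta, tol=1e-8):
    """Primal recovery for the linear MCID; Z is the n x p profile matrix."""
    n = len(y)
    # (8): beta^(k+1) = (1/(n*lam*delta)) * sum_i y_i * (t1_i - tau_i) * z_i
    beta = (y * (t1 - tau)) @ Z / (n * lam * delta)
    # (14): for 0 < tau_i < 1, y_i*(x_i - alpha - beta'z_i) = delta,
    # so alpha = x_i - delta*y_i - beta'z_i; average over those i
    on_margin = (tau > tol) & (tau < 1.0 - tol)
    # (if no tau_i lies strictly between 0 and 1, a fallback would be needed)
    alpha = np.mean(x[on_margin] - delta * y[on_margin] - Z[on_margin] @ beta)
    return alpha, beta
```

The predicted MCID for a new profile is then simply `alpha + beta @ z_new`, matching the display above.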

4.1.2. A Nonparametric Kernel MCID

Define a feature vector $\varphi_i = \varphi(z_i)$ for the profile of the ith patient in the enlarged feature space. We specify a continuous, symmetric and positive semi-definite kernel function K corresponding to the inner product in the mapping φ, that is, $K(z_i, z_j) = \langle \varphi_i, \varphi_j \rangle$. Then we write $c(z) = w + h(z)$ with $w \in \mathbb{R}$ and $h(z) \in \mathcal{H}_K$, where $\mathcal{H}_K$ is the reproducing kernel Hilbert space (RKHS) with kernel function $K(\cdot,\cdot)$. The norm in $\mathcal{H}_K$, denoted by $\|\cdot\|_K$, is induced by the following inner product:

$$\langle f, g \rangle_K = \sum_{i=1}^{n}\sum_{j=1}^{m} v_i u_j K(z_i, z_j),$$

where $f(\cdot) = \sum_{i=1}^{n} v_i K(\cdot, z_i)$ and $g(\cdot) = \sum_{j=1}^{m} u_j K(\cdot, z_j)$.

By the representer theorem (Kimeldorf and Wahba, 1971), the nonparametric kernel MCID can be expressed as $c(z) = w + \sum_{j=1}^{n} v_j K(z, z_j)$. Let $\eta = (w, v^T)^T = (w, v_1, \ldots, v_n)^T$; then the objective function in (4) is $h(\eta) = h_1(\eta) - h_2(\eta)$, where

$$h_1(\eta) = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{1}{\delta}\Big\{\delta - y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\}_+\Big] + \frac{\lambda}{2}\sum_{i,j=1}^{n} v_i v_j K(z_i, z_j), \qquad h_2(\eta) = \frac{1}{n}\sum_{i=1}^{n}\Big[\frac{1}{\delta}\Big\{-y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\}_+\Big].$$

Similar to the linear case, the minimization of (4) proceeds iteratively. Let $\hat\eta^{(k)}$ be the estimator of η at the kth iteration. We first approximate $h_2(\eta)$ by its affine minorization $h_2(\hat\eta^{(k)}) + \langle \eta - \hat\eta^{(k)}, \partial h_2(\hat\eta^{(k)})\rangle$, where $\partial h_2(\hat\eta^{(k)})$ is the subgradient of $h_2(\eta)$ at $\hat\eta^{(k)}$,

$$\partial h_2(\hat\eta^{(k)}) = \begin{pmatrix} \dfrac{1}{n\delta}\sum_{i=1}^{n} y_i\, 1\big\{y_i\big(x_i - w^{(k)} - \sum_{j=1}^{n} v_j^{(k)} K(z_i, z_j)\big) < 0\big\} \\[2ex] \dfrac{1}{n\delta}\sum_{i=1}^{n} y_i K(z_i, z_1)\, 1\big\{y_i\big(x_i - w^{(k)} - \sum_{j=1}^{n} v_j^{(k)} K(z_i, z_j)\big) < 0\big\} \\ \vdots \\ \dfrac{1}{n\delta}\sum_{i=1}^{n} y_i K(z_i, z_n)\, 1\big\{y_i\big(x_i - w^{(k)} - \sum_{j=1}^{n} v_j^{(k)} K(z_i, z_j)\big) < 0\big\} \end{pmatrix}.$$

Consequently

$$\hat\eta^{(k+1)} = \arg\min_{\eta}\; h_1(\eta) - \eta^T \partial h_2(\hat\eta^{(k)}) = \arg\min_{\eta}\; h_1(\eta) - \frac{1}{n\delta}\sum_{i=1}^{n} y_i\Big(w + \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\, 1\Big\{y_i\Big(x_i - w^{(k)} - \sum_{j=1}^{n} v_j^{(k)} K(z_i, z_j)\Big) < 0\Big\}. \qquad (15)$$

Similar to the linear case, we use the slack variable technique and we reach the dual problem

$$\min_{\tau}\; \tau^T Q' \tau - \{d + 2 Q'\, t_2(w^{(k)}, v^{(k)})\}^T \tau, \qquad (16)$$

subject to $0 \le \tau_i \le 1$ and $\sum_{i=1}^{n} \frac{1}{\delta} y_i\{\tau_i - t_{2,i}(w^{(k)}, v^{(k)})\} = 0$. Again, this optimization problem has only simple box constraints and one linear equality constraint, and hence can be solved by any standard quadratic programming method.

We conclude the algorithm by presenting how $\hat{w}$ and $\hat{v}$ are computed. The Karush-Kuhn-Tucker (KKT) conditions associated with the optimization problem (16) are

$$\frac{1}{n\lambda\delta}\, y_i\{t_{2,i}(w^{(k)}, v^{(k)}) - \tau_i\}\, \varphi_i = v_i, \qquad (17)$$
$$1 - \tau_i \ge 0, \qquad (18)$$
$$\tau_i\Big\{\xi_i - 1 + \frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\} = 0, \qquad (19)$$
$$\xi_i \ge 0, \qquad (20)$$
$$\xi_i - 1 + \frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big) \ge 0, \qquad (21)$$
$$(1 - \tau_i)\xi_i = 0. \qquad (22)$$

Three scenarios can be discussed based on the magnitude of $\tau_i$: if $\tau_i = 0$, by (21) and (22) we have $\xi_i = 0$ and $\frac{1}{\delta} y_i(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)) - 1 \ge 0$; if $1 > \tau_i > 0$, by (19) and (22) we obtain $\xi_i = 0$ and $\frac{1}{\delta} y_i(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)) - 1 = 0$; if $\tau_i = 1$, by (19), (21) and (22) we obtain $\frac{1}{\delta} y_i(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)) - 1 = -\xi_i \le 0$. Now we can summarize the KKT conditions more concisely as

$$\frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big) - 1 = \xi_i \ge 0 \quad \text{if } \tau_i < 1, \qquad \frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big) - 1 = -\xi_i \le 0 \quad \text{if } \tau_i > 0,$$

which implies

$$\frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big) - 1 = 0 \quad \text{if } 0 < \tau_i < 1. \qquad (23)$$

Thus, for $i = 1, \ldots, n$, $v_i^{(k+1)}$ can be estimated via (17) as

$$\hat{v}_i^{(k+1)} = \frac{1}{n\lambda\delta}\, y_i\{t_{2,i}(w^{(k)}, v^{(k)}) - \hat\tau_i\}\, \varphi_i,$$

and w(k+1) can be estimated via (23) as

$$\hat{w}^{(k+1)} = \frac{1}{|\{i: 0 < \tau_i < 1\}|}\sum_{i:\, 0 < \tau_i < 1}\Big(x_i - \delta y_i - \sum_{j=1}^{n} \hat{v}_j K(z_i, z_j)\Big).$$

We obtain the estimators $\hat{w}$ and $\hat{v}$ after the iterative process (15) converges. For any new patient with profile $z_{new}$, the predicted nonparametric kernel MCID is $\hat{c}(z_{new}) = \hat{w} + \sum_{j=1}^{n} \hat{v}_j K(z_{new}, z_j)$. The most common choice of kernel function is the Gaussian radial basis function $K(z_1, z_2) = \exp(-\|z_1 - z_2\|^2 / 2\sigma^2)$, where σ is a positive scale parameter. If $K(z_1, z_2) = z_1^T z_2$, the nonparametric kernel case reduces to the simple linear case.
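To make the kernel case concrete, here is a small sketch (our own, with assumed names for the fitted quantities $\hat{w}$ and $\hat{v}$) of the Gaussian kernel, the median-heuristic choice of σ described in Section 5, and the resulting prediction $\hat{c}(z_{new}) = \hat{w} + \sum_j \hat{v}_j K(z_{new}, z_j)$:

```python
import numpy as np
from scipy.spatial.distance import pdist

def gaussian_kernel(z1, z2, sigma):
    # K(z1, z2) = exp(-||z1 - z2||^2 / (2 * sigma^2))
    return np.exp(-np.sum((z1 - z2) ** 2) / (2.0 * sigma ** 2))

def median_heuristic_sigma(Z):
    # scale parameter sigma set to the median pairwise Euclidean distance
    return np.median(pdist(Z))

def predict_kernel_mcid(z_new, w_hat, v_hat, Z_train, sigma):
    # c_hat(z_new) = w_hat + sum_j v_hat_j * K(z_new, z_j)
    k = np.array([gaussian_kernel(z_new, z_j, sigma) for z_j in Z_train])
    return w_hat + v_hat @ k
```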

4.2. Bootstrap Procedure for MCID and the Classification Error

A point estimate alone does not allow one to quantify its uncertainty, which limits the usefulness of the MCID and its classification error in real applications. Hedayat et al. (2015) postulate a testing data set with a very large sample size to quantify the effectiveness of their MCID in numerical studies, but such a testing data set is usually unavailable in clinical studies. Resampling methods, with the bootstrap as a representative, can instead serve as a tool to construct confidence intervals for an estimand in these situations.

To use the bootstrap appropriately, we have to be cautious about the regularity conditions under which its theoretical properties are justified (Bickel et al., 1997). If these conditions are not satisfied, the conventional bootstrap has to be properly modified. The objective functions to be minimized in (4) and in (5) are generally non-smooth. If a non-negligible probability mass is concentrated at the discontinuity points of the objective function, that is, $P(X_0 - \hat{c}(Z_0) = 0) > 0$, we are in the so-called irregular case, in which Shao (1994) showed that the conventional bootstrap is inconsistent. Instead, we adopt the m-out-of-n bootstrap, a general remedy for bootstrap inconsistency due to non-smoothness, theoretically justified in Shao (1994, 1996), Bickel et al. (1997), and references therein.

The m-out-of-n bootstrap is the conventional nonparametric bootstrap except that the resample size, historically denoted m, is of a smaller order than the original sample size n; that is, m = m_n → ∞ and m/n → 0 (or m log log n/n → 0) as n → ∞ (Shao, 1994). The intuition is to let the empirical distribution tend to the true generative distribution at a faster rate, so that the bootstrap samples are drawn as if they were from the true generative distribution. The requirement m/n → 0 (or m log log n/n → 0) is consistent with Hedayat et al. (2015), who required a very large sample size for their testing data; to some extent, in the m-out-of-n bootstrap the n − m subjects left out play the role of the testing sample (Shao, 1996). In practice m is usually chosen as m = n^κ for some κ < 1; in our numerical studies we use κ = 0.9. Our algorithm is detailed below.
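A minimal sketch of the resampling step is given below (our code; `fit_mcid` is a placeholder for either the linear or the kernel estimator of Section 4.1, and rounding $n^\kappa$ to the nearest integer is our convention):

```python
import numpy as np

def m_out_of_n_bootstrap(x, y, Z, fit_mcid, B=1000, kappa=0.9, seed=0):
    """m-out-of-n bootstrap: draw B resamples of size m = n^kappa << n and
    refit the MCID on each; fit_mcid is a user-supplied fitting routine."""
    rng = np.random.default_rng(seed)
    n = len(y)
    m = int(round(n ** kappa))            # e.g. n = 157 gives m of about 95
    fits = []
    for _ in range(B):
        idx = rng.choice(n, size=m, replace=True)   # resample with replacement
        fits.append(fit_mcid(x[idx], y[idx], Z[idx]))
    return fits
```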

For b = 1, ..., B, we generate a bootstrap sample of size m, $\{(x_j^{(b)}, y_j^{(b)}, z_j^{(b)}), j = 1, \ldots, m\}$, and derive the MCID $\hat{c}^{(b)}$ using the methods in Section 4. To be more specific, the simple linear MCID is $\hat{c}_l^{(b)} = \hat\alpha^{(b)} + \hat\beta^{(b)T} z_0$ and the nonparametric kernel MCID is $\hat{c}_n^{(b)} = \hat{w}^{(b)} + \sum_{i=1}^{n} \hat{v}_i^{(b)} K(z_0, z_i^{(b)})$, where $z_0$ denotes the profile of a new patient. Accordingly, the classification error based on the MCID $\hat{c}^{(b)}$ is computed as

$$\widehat{\mathrm{err}}^{(b)} = \frac{1}{m}\sum_{j=1}^{m} 1\big\{y_j^{(b)} \neq \mathrm{sign}\big(x_j^{(b)} - \hat{c}(z_j^{(b)})\big)\big\}.$$

Similarly, $\widehat{\mathrm{err}}^{(b)}$ can be written as $\widehat{\mathrm{err}}_l^{(b)}$ and $\widehat{\mathrm{err}}_n^{(b)}$ for the simple linear and nonparametric kernel cases, respectively.

We repeat the above procedure B times in total. For the simple linear MCID, we obtain $\{(\hat\alpha^{(b)}, \hat\beta^{(b)}, \hat{c}_l^{(b)}, \widehat{\mathrm{err}}_l^{(b)}), b = 1, \ldots, B\}$. Let $\hat{l}_\alpha$ and $\hat{u}_\alpha$ be the α/2-th and (1 − α/2)-th quantiles of $\{\hat\alpha^{(b)}, b = 1, \ldots, B\}$, $\hat{l}_\beta$ and $\hat{u}_\beta$ the corresponding quantiles of $\{\hat\beta^{(b)}\}$, $\hat{l}_{c_l}$ and $\hat{u}_{c_l}$ those of $\{\hat{c}_l^{(b)}\}$, and $\hat{l}_{\mathrm{err}_l}$ and $\hat{u}_{\mathrm{err}_l}$ those of $\{\widehat{\mathrm{err}}_l^{(b)}\}$. The (1 − α) confidence interval of α is then $[\hat{l}_\alpha, \hat{u}_\alpha]$, that of β is $[\hat{l}_\beta, \hat{u}_\beta]$, that of $c_l$ is $[\hat{l}_{c_l}, \hat{u}_{c_l}]$, and that of err is $[\hat{l}_{\mathrm{err}_l}, \hat{u}_{\mathrm{err}_l}]$. For the nonparametric kernel MCID, we have $\{(\hat{c}_n^{(b)}, \widehat{\mathrm{err}}_n^{(b)}), b = 1, \ldots, B\}$; similarly, the (1 − α) confidence interval of $c_n$ is $[\hat{l}_{c_n}, \hat{u}_{c_n}]$ and that of err is $[\hat{l}_{\mathrm{err}_n}, \hat{u}_{\mathrm{err}_n}]$.
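In code, the interval construction reduces to taking empirical quantiles of the B bootstrap replicates. The sketch below is our own; the fitted object with a `predict` method is the same placeholder as in the previous sketch, and any scalar replicate series (intercept, slope, MCID at $z_0$, or classification error) can be passed to `percentile_ci`.

```python
import numpy as np

def bootstrap_err(fit, x_b, y_b, Z_b):
    # classification error of one bootstrap fit, evaluated on its own resample
    pred = np.where(x_b - fit.predict(Z_b) > 0, 1, -1)
    return np.mean(y_b != pred)

def percentile_ci(replicates, alpha=0.05):
    # (1 - alpha) percentile interval from the B bootstrap replicates
    return (np.quantile(replicates, alpha / 2),
            np.quantile(replicates, 1 - alpha / 2))
```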

5. Simulation Studies

In this section, we apply the proposed method to provide confidence intervals for the MCID and the classification error via extensive numerical studies based on simulated data.

We consider two scenarios. In the first scenario, we generate a random sample of independent and identically distributed observations $\{(X_i, Y_i, Z_i), i = 1, \ldots, n\}$: the patient's clinical profile $Z_i$ is drawn from a bivariate normal distribution $N_2(\mu, I_2)$ with $\mu = (0, 0)^T$ and $I_2 = \mathrm{diag}(1, 1)$; then $X_i$ is drawn from $N(\alpha + \beta^T z_i, 1)$ with α = 0 and $\beta = (1, 2)^T$; finally the binary patient-reported outcome $Y_i \in \{-1, 1\}$ is generated from $\mathrm{Bern}(F(x_i))$, where $F(x_i) = P(X \le x_i)$. Under this scenario, the linear MCID is the underlying truth. We also generate a new observation $(Y_{new}, X_{new}, Z_{new})$ with $Y_{new} = 1$, $X_{new} = -0.3376$ and $Z_{new} = (-1.3577, -1.3643)$ from the same distribution as the $(Y_i, X_i, Z_i)$'s. The true value of the MCID for this patient is −4.0862.
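Scenario 1 can be reproduced along the following lines (a sketch under our reading of the setup; in particular we take F to be the marginal CDF of X, which equals $N(0, 1 + \beta^T\beta)$ under the Scenario 1 settings, and that reading is an assumption on our part):

```python
import numpy as np
from scipy.stats import norm

def simulate_scenario1(n, seed=0):
    """Scenario 1 sketch: Z ~ N2(0, I2), X ~ N(alpha + beta'Z, 1),
    Y = 1 with probability F(x) and -1 otherwise."""
    rng = np.random.default_rng(seed)
    alpha, beta = 0.0, np.array([1.0, 2.0])
    Z = rng.multivariate_normal(mean=np.zeros(2), cov=np.eye(2), size=n)
    X = rng.normal(loc=alpha + Z @ beta, scale=1.0)
    # F taken as the marginal CDF of X, i.e. N(0, 1 + beta'beta) here (assumption)
    F = norm.cdf(X, loc=0.0, scale=np.sqrt(1.0 + beta @ beta))
    Y = 2 * rng.binomial(1, F) - 1     # map {0, 1} to {-1, +1}
    return X, Y, Z
```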

The data in the second scenario are generated similarly, except that $X_i$ is drawn from $N(\alpha + \beta^T z_i - \beta^T z_i^2, 1)$, so the linear structure for the MCID is misspecified in this setting. As in the first scenario, we also generate a new observation $(Y_{new}, X_{new}, Z_{new})$ with $Y_{new} = 1$, $X_{new} = -2.1809$ and $Z_{new} = (-1.3577, -1.3643)$. The true value of the MCID for this new patient is −9.6519.

In each of the two scenarios, we apply both the simple linear and the nonparametric kernel MCID methods. The kernel is the Gaussian kernel $K(z_1, z_2) = \exp(-\|z_1 - z_2\|^2 / 2\sigma^2)$, with the scale parameter set to the median of the pairwise Euclidean distances among the observed profiles used to estimate the prediction rule (Hedayat et al., 2015). The m-out-of-n bootstrap samples are generated 1,000 times in each case. For simplicity we set δ = 0.01 in the numerical studies and use multifold cross-validation to determine the tuning parameter λ. We report results for two sample sizes, n = 500 and n = 1,000.

Based on 500 simulation replications, our results are summarized in Tables 1 and 2. In Scenario 1, the confidence interval is much shorter and the coverage much closer to the nominal level when the correct linear structure and a larger sample size are used; the interval is longer and the coverage more conservative when the nonparametric kernel is used. In Scenario 2, the coverage using the nonparametric kernel is again conservative. More importantly, the estimate of the MCID based on the incorrect simple linear structure is biased and gives very poor coverage. This issue could not be uncovered without a confidence interval, as in Hedayat et al. (2015), which reinforces the necessity and importance of developing interval estimation for the MCID.

Table 1:

Confidence interval for MCID in simulation studies. CP=Coverage Probability.

MCID
Sample Size Method Lower Upper Length CP
Scenario 1
n = 500 Linear(Correct) −4.3040 −3.3198 0.9842 0.934
Kernel −4.6326 −2.3341 2.2985 0.928
n = 1000 Linear(Correct) −4.2669 −3.6130 0.6539 0.952
Kernel −4.5890 −1.5640 3.0250 0.982
Scenario 2
n = 500 Linear(Incorrect) −7.5186 −4.1525 3.3661 0.004
Kernel −10.7414 −7.4439 3.2974 0.990
n = 1000 Linear(Incorrect) −7.5371 −4.8906 2.6465 0.000
Kernel −10.5402 −4.7963 5.7439 0.998

Table 2:

Confidence interval for the classification error in simulation studies. CP=Coverage Probability.

Classification Error
Sample Size Method Lower Upper Length CP
Scenario 1
n = 500 Linear(Correct) 0.1951 0.3068 0.1117 1.000
Kernel 0.1985 0.3198 0.1213 0.996
n = 1000 Linear(Correct) 0.2080 0.2871 0.0791 0.992
Kernel 0.2063 0.3428 0.1365 1.000
Scenario 2
n = 500 Linear(Incorrect) 0.3595 0.4861 0.1266 0.000
Kernel 0.2041 0.3335 0.1294 0.996
n = 1000 Linear(Incorrect) 0.3762 0.4687 0.0924 0.000
Kernel 0.2086 0.3883 0.1797 1.000

We notice that the coverage for the classification error is quite conservative; Xu et al. (2015) noted a similar phenomenon. Due to the computational burden, we only explore our method up to a sample size of 1,000; with a much larger sample size, the coverage would become more accurate.

6. ChAMP Trial Analysis

In the ChAMP trial, 190 patients with chondral lesions undergoing APM were randomized to either the treatment (debridement) group (n = 98) or the control (observation, no debridement) group (n = 92). It is of interest to investigate whether debridement of the chondral lesions promotes recovery from the surgery repairing the knee damage.

As mentioned in the previous section, the binary variable Y is derived from the anchor question in the SF-36 health survey. The patient's diagnostic measurement X is the difference in the WOMAC pain score between baseline and one year after surgery; the score ranges from 0 (extreme problem) to 100 (no problem). The patient's clinical profile Z includes age (continuous), treatment assignment (binary), sex (binary) and knee damage (four-level categorical). The knee damage variable is the total number of types of meniscus tears the patient suffers from and reflects the severity of the knee damage.

After excluding missing data, our analysis contains 157 patients, of whom 80 are in the debridement group and 77 in the observation group. In our analysis, we first compute the MCID for the whole population, for the subpopulation within the debridement group (treatment = 1), for the subpopulation within the observation group (treatment = 0), and for their difference, where c(z) includes only an intercept term; we call this “No model” in Table 3. Then, based on the structure

c(z) = α + β1 treatment,

which we call “Model 1”, the structure

c(z) = α + β1 treatment + β2 age + β3 sex + β4 damage,

which we call “Model 2”, and the nonparametric c(z), which we call “Model 3” in Table 3, we implement the proposed estimation procedure separately. The results are summarized in Table 3. Although we concentrate on interval estimation, we also list the “point estimation” column mainly for comparison with the results in Hedayat et al. (2015). We generated the m-out-of-n bootstrap samples B = 1,000 times with m = n^0.9 ≈ 95. A sensitivity analysis on the value of B in Table 4 demonstrates that the results are not sensitive as it varies from roughly 500 to 1,500.

Table 3:

Interval Estimation for the ChAMP Trial Analysis.

MCID Classification Error
Point Interval Estimation Interval Estimation
Estimation Lower Upper Length Lower Upper Length
No model: c(z) = c
MCID_all 1.4199 −0.9868 2.3224 3.3092 0.3050 0.4421 0.1371
MCID_trt=1 2.0216 −0.9944 3.8266 4.8210
MCID_trt=0 −0.0843 −0.9868 1.7207 2.7076
MCID_diff 2.1059 −1.5042 3.9109 5.4151
Model 1: c(z) = α + β1 treatment
α −0.0009 −2.3885 1.6182 4.0067 0.2842 0.5263 0.2421
β1 1.9008 −1.4109 3.4050 4.8159
MCID_all 0.9676 −1.0497 1.8041 2.8538
MCID_trt=1 1.8999 −0.8843 2.5207 3.4050
MCID_trt=0 −0.0009 −2.3885 1.6182 4.0067
MCID_diff 1.9008 −1.4109 3.4050 4.8159
Model 2: c(z) = α + β1 treatment + β2 age + β3 sex + β4 damage
α −6.7261 −21.4810 11.3416 32.8226 0.2737 0.5158 0.2421
β1 2.4068 −1.9171 3.9081 5.8252
β2 0.1680 −0.1918 0.3768 0.5686
β3 −0.9713 −3.4509 2.2291 5.6800
β4 −1.1430 −1.8663 1.3246 3.1909
MCID_all 0.2467 −1.6964 2.0215 3.7178
MCID_trt=1 1.6050 −1.6592 3.3212 4.9804
MCID_trt=0 −1.1645 −3.1225 1.7468 4.8693
MCID_diff 2.7694 −1.6298 4.0097 5.6395
Model 3: c(z) is nonparametric
MCID_all −0.3465 −1.5073 1.5785 3.0858 0.0526 0.4424 0.3897
MCID_trt=1 −0.2987 −1.4469 1.8478 3.2947
MCID_trt=0 −0.3962 −1.7511 1.5378 3.2890
MCID_diff 0.0975 −0.5132 1.2298 1.7430

Table 4:

Sensitivity Analysis for the ChAMP Trial

Bootstrap Resampling Size (B)
600 900 1,200 1,500
No model: c(z) = c
MCID_all Lower −0.9868 −0.9868 −0.9868 −0.9868
Upper 2.3600 2.3224 2.3224 2.3224
MCID_trt=1 Lower −0.9868 −0.9868 −1.2877 −1.2877
Upper 3.8266 3.8266 3.8266 3.8266
MCID_trt=0 Lower −0.9868 −0.9868 −0.9868 −0.9868
Upper 1.7207 1.7207 1.7207 1.7207
MCID_diff Lower −1.5042 −1.5042 −1.5042 −1.5042
Upper 3.9109 3.9109 3.9109 3.9109
Classification Lower 0.3053 0.2997 0.3053 0.2947
Error Upper 0.4526 0.4476 0.4526 0.4526
Model 1: c(z) = α + β1 treatment
α Lower −2.2125 −2.3885 −2.3885 −2.3885
Upper 1.6182 1.6182 1.6182 1.6182
β1 Lower −1.2034 −1.3565 −1.4084 −1.2992
Upper 3.4050 3.4050 3.4050 3.3092
MCID_all Lower −0.8760 −1.0214 −1.0473 −1.0458
Upper 1.7648 1.7624 1.8041 1.9027
MCID_trt=1 Lower −0.7909 −0.8843 −0.8843 −0.8843
Upper 2.5223 2.5207 2.5208 2.5208
MCID_trt=0 Lower −2.2125 −2.3885 −2.3885 −2.3885
Upper 1.6182 1.6182 1.6182 1.6182
MCID_diff Lower −1.2034 −1.3565 −1.4084 −1.2992
Upper 3.4050 3.4050 3.4050 3.3092
Classification Lower 0.2842 0.2842 0.2842 0.2842
Error Upper 0.5161 0.5263 0.5263 0.5263
Model 2: c(z) = α + β1 treatment + β2 age + β3 sex + β4 damage
α Lower −22.5595 −21.4412 −21.3968 −21.1874
Upper 9.7443 10.3888 10.7200 11.2746
β1 Lower −1.9235 −1.8822 −1.9171 −1.9289
Upper 4.0760 3.8862 3.9378 3.9518
β2 Lower −0.1830 −0.1842 −0.1830 −0.2019
Upper 0.4176 0.3760 0.3751 0.3715
β3 Lower −3.5843 −3.4407 −3.5837 −3.4926
Upper 2.3425 2.2252 2.2467 2.3068
β4 Lower −1.7888 −1.8370 −1.8925 −1.8103
Upper 1.3664 1.3070 1.3238 1.3833
MCID_all Lower −1.7405 −1.7091 −1.6945 −1.7689
Upper 2.0215 2.0263 1.9981 2.0001
MCID_trt=1 Lower −1.7345 −1.6240 −1.5910 −1.5944
Upper 3.3077 3.3142 3.3077 3.2833
MCID_trt=0 Lower −3.2409 −3.1269 −3.0683 −3.1346
Upper 1.8563 1.7076 1.6739 1.6121
MCID_diff Lower −1.6303 −1.5644 −1.6227 −1.6262
Upper 4.2119 3.9789 4.0669 4.0933
Classification Lower 0.2734 0.2737 0.2737 0.2737
Error Upper 0.5055 0.5158 0.5053 0.5158
Model 3: c(z) is nonparametric
MCID_all Lower −1.4763 −1.4921 −1.5270 −1.5201
Upper 1.5442 1.5951 1.5785 1.6139
MCID_trt=1 Lower −1.4830 −1.4605 −1.5064 −1.4834
Upper 1.7903 1.8483 1.8195 1.8338
MCID_trt=0 Lower −1.7379 −1.7465 −1.7623 −1.7518
Upper 1.5213 1.5583 1.5378 1.6119
MCID_diff Lower −0.4841 −0.5003 −0.5468 −0.5488
Upper 1.2007 1.2294 1.2000 1.1966
Classification Lower 0.0526 0.0526 0.0526 0.0526
Error Upper 0.4526 0.4476 0.4421 0.4421

From Table 3, the point estimate of the MCID for the treatment = 1 subgroup, 2.0216, looks quite different from that for the treatment = 0 subgroup, −0.0843. Without interval estimation, one would believe they are different but would have no evidence on the degree of difference. With interval estimation, we see that the confidence interval for each of them covers zero, and the confidence interval for the difference also covers zero. A similar phenomenon can be observed under “Model 1”, “Model 2” and “Model 3”.

Across the first three models, the MCID quantities have slightly different estimates, but each confidence interval covers zero. “Model 3” has different MCID estimates from its linear counterparts, indicating that the linear specification of the MCID may be insufficient.

For the treatment effect β1 in “Model 1” and “Model 2”, although the estimates differ, each confidence interval also covers zero. Although “Model 3” is more flexible than its linear counterparts, it lacks a clear interpretation of the treatment effect. The different results from “No model” and “Model 3” also suggest that there may be other unobserved covariates that play a role in quantifying the population heterogeneity of the MCID.

The classification errors for the first three models are roughly similar, with that of “Model 1” slightly greater than that of “Model 2”. This is reasonable since “Model 2” adjusts for more covariates and hence should have greater explanatory ability. “Model 3” has a smaller lower bound and upper bound for its confidence interval than “Model 2”: a nonparametric kernel model entails fewer model assumptions than its linear counterpart and is therefore regarded as more flexible.

In all, besides the point estimate of the MCID in each population of interest, the proposed method also provides an interval estimate, which gives a better understanding of the scale of the MCID and facilitates comparison with historical values or with values from other populations or other diseases, so that better, more convincing health policy decisions can be made. The ChAMP study shows that the MCID differs somewhat between the debridement and observation groups, but the difference is not significant. One of the major findings of the ChAMP trial is that debriding the chondral lesions does not have a statistically significant effect, and it therefore recommends not debriding chondral lesions in future surgical practice (Bisson et al., 2017). Using the proposed MCID method, we reach the same conclusion. This could have a big impact on orthopaedic surgical practice, since the additional debridement of chondral lesions brings a significant medical cost to patients.

Acknowledgment

This work was supported by the National Center for Advancing Translational Sciences of the National Institutes of Health under award number UL1TR001412. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Appendix A.

To solve (6), we need to derive its dual problem by replacing the loss function in s1 with slack variables ξi, i = 1, . . . , n, and adding two sets of constraints. This leads to

$$\min_{\omega, \xi}\; \frac{1}{n}\sum_{i=1}^{n}\Big\{\xi_i - \frac{1}{\delta}\, t_{1,i}(\alpha^{(k)}, \beta^{(k)})\, y_i(\alpha + \beta^T z_i)\Big\} + \frac{\lambda}{2}\beta^T\beta, \qquad (24)$$

subject to $\xi_i \ge 0$ and $\xi_i \ge 1 - \frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i)$, $i = 1, \ldots, n$, where $t_{1,i}(\alpha^{(k)}, \beta^{(k)}) = 1\{y_i(x_i - \alpha^{(k)} - \beta^{(k)T} z_i) < 0\}$. The primal Lagrangian is then

$$L_p = \frac{1}{n}\sum_{i=1}^{n}\Big\{\xi_i - \frac{1}{\delta}\, t_{1,i}(\alpha^{(k)}, \beta^{(k)})\, y_i(\alpha + \beta^T z_i)\Big\} + \frac{\lambda}{2}\beta^T\beta - \frac{1}{n}\sum_{i=1}^{n}\tau_i\Big\{\xi_i - 1 + \frac{1}{\delta} y_i(x_i - \alpha - \beta^T z_i)\Big\} - \frac{1}{n}\sum_{i=1}^{n}\gamma_i \xi_i,$$

where $\tau = (\tau_1, \ldots, \tau_n)^T$ and $\gamma = (\gamma_1, \ldots, \gamma_n)^T$ are vectors of non-negative Lagrange multipliers corresponding to the two sets of constraints in (24). Setting the derivatives of the Lagrangian with respect to the primal variables ω and ξ to 0, we get

$$0 = \frac{1}{\delta}\sum_{i=1}^{n} y_i\{\tau_i - t_{1,i}(\alpha^{(k)}, \beta^{(k)})\}, \qquad \beta = \frac{1}{n\lambda\delta}\sum_{i=1}^{n} y_i\{t_{1,i}(\alpha^{(k)}, \beta^{(k)}) - \tau_i\} z_i, \qquad 1 = \tau_i + \gamma_i,\; i = 1, \ldots, n.$$

Plugging these back into $L_p$, we have

$$\begin{aligned}
L_p &= -\frac{1}{n\delta}\sum_{i=1}^{n}\tau_i y_i x_i + \frac{1}{n}\sum_{i=1}^{n}\tau_i - \frac{\lambda}{2}\beta^T\beta\\
&\propto -2\sum_{i=1}^{n}\delta y_i \tau_i x_i + 2\sum_{i=1}^{n}\delta^2 \tau_i - \frac{1}{n\lambda}\Big\{\sum_{i=1}^{n} y_i \tau_i z_i^T \sum_{i=1}^{n} y_i \tau_i z_i - 2\sum_{i=1}^{n} y_i t_{1,i}(\alpha^{(k)}, \beta^{(k)}) z_i^T \sum_{i=1}^{n} y_i \tau_i z_i\Big\}\\
&= -\frac{1}{n\lambda}\sum_{i=1}^{n} y_i \tau_i z_i^T \sum_{i=1}^{n} y_i \tau_i z_i + 2\sum_{i=1}^{n}(\delta^2 - \delta y_i x_i)\tau_i + \frac{2}{n\lambda}\sum_{i=1}^{n} y_i t_{1,i}(\alpha^{(k)}, \beta^{(k)}) z_i^T \sum_{i=1}^{n} y_i \tau_i z_i\\
&= -\tau^T Q \tau + b^T \tau + 2\, t_1(\alpha^{(k)}, \beta^{(k)})^T Q \tau,
\end{aligned}$$

where Q is a square matrix with [i, j]th element $\langle y_i z_i, y_j z_j\rangle / (n\lambda)$, $t_1(\alpha^{(k)}, \beta^{(k)}) = \{t_{1,i}(\alpha^{(k)}, \beta^{(k)})\}_{i=1}^{n}$, and $b = 2\{\delta^2 - \delta y_i x_i\}_{i=1}^{n}$.

Appendix B.

To solve (15), we need to derive its dual problem by replacing the loss function in h1 with slack variables ξi,i=1,,n, and adding two sets of constraints. This results in

$$\min_{\eta, \xi}\; \frac{1}{n}\sum_{i=1}^{n}\Big\{\xi_i - \frac{1}{\delta}\, t_{2,i}(w^{(k)}, v^{(k)})\, y_i\Big(w + \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\} + \frac{\lambda}{2}\sum_{i,j=1}^{n} v_i v_j K(z_i, z_j), \qquad (25)$$

subject to $\xi_i \ge 0$ and $\xi_i \ge 1 - \frac{1}{\delta} y_i(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j))$, where $t_{2,i}(w^{(k)}, v^{(k)}) = 1\{y_i(x_i - w^{(k)} - \sum_{j=1}^{n} v_j^{(k)} K(z_i, z_j)) < 0\}$. The primal Lagrangian is then

$$L_p = \frac{1}{n}\sum_{i=1}^{n}\Big\{\xi_i - \frac{1}{\delta}\, t_{2,i}(w^{(k)}, v^{(k)})\, y_i\Big(w + \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\} + \frac{\lambda}{2}\sum_{i,j=1}^{n} v_i v_j K(z_i, z_j) - \frac{1}{n}\sum_{i=1}^{n}\tau_i\Big\{\xi_i - 1 + \frac{1}{\delta} y_i\Big(x_i - w - \sum_{j=1}^{n} v_j K(z_i, z_j)\Big)\Big\} - \frac{1}{n}\sum_{i=1}^{n}\gamma_i \xi_i,$$

where $\tau = (\tau_1, \ldots, \tau_n)^T$ and $\gamma = (\gamma_1, \ldots, \gamma_n)^T$ are vectors of non-negative Lagrange multipliers corresponding to the two sets of constraints in (25). Setting the derivatives of the Lagrangian with respect to the primal variables η and ξ to 0, we get

$$0 = \frac{1}{\delta}\sum_{i=1}^{n} y_i\{\tau_i - t_{2,i}(w^{(k)}, v^{(k)})\}, \qquad v = \frac{1}{n\lambda\delta}\sum_{i=1}^{n} y_i\{t_{2,i}(w^{(k)}, v^{(k)}) - \tau_i\}\, \varphi_i, \qquad 1 = \tau_i + \gamma_i,\; i = 1, \ldots, n.$$

Plugging these back into $L_p$, we have

$$\begin{aligned}
L_p &= -\frac{1}{n\delta}\sum_{i=1}^{n}\tau_i y_i x_i + \frac{1}{n}\sum_{i=1}^{n}\tau_i - \frac{\lambda}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} v_i v_j K(z_i, z_j)\\
&\propto -2\sum_{i=1}^{n}\delta y_i \tau_i x_i + 2\sum_{i=1}^{n}\delta^2 \tau_i - \frac{1}{n\lambda}\Big\{\sum_{i=1}^{n} y_i \tau_i \varphi_i^T \sum_{i=1}^{n} y_i \tau_i \varphi_i - 2\sum_{i=1}^{n} y_i t_{2,i}(w^{(k)}, v^{(k)}) \varphi_i^T \sum_{i=1}^{n} y_i \tau_i \varphi_i\Big\}\\
&= -\frac{1}{n\lambda}\sum_{i=1}^{n} y_i \tau_i \varphi_i^T \sum_{i=1}^{n} y_i \tau_i \varphi_i + 2\sum_{i=1}^{n}(\delta^2 - \delta y_i x_i)\tau_i + \frac{2}{n\lambda}\sum_{i=1}^{n} y_i t_{2,i}(w^{(k)}, v^{(k)}) \varphi_i^T \sum_{i=1}^{n} y_i \tau_i \varphi_i\\
&= -\tau^T Q' \tau + d^T \tau + 2\, t_2(w^{(k)}, v^{(k)})^T Q' \tau,
\end{aligned}$$

where Q′ is a square matrix with [i, j]th element $\langle y_i \varphi_i, y_j \varphi_j\rangle / (n\lambda)$, $t_2(w^{(k)}, v^{(k)}) = \{t_{2,i}(w^{(k)}, v^{(k)})\}_{i=1}^{n}$, and $d = 2\{\delta^2 - \delta y_i x_i\}_{i=1}^{n}$.

References

1. Bickel P, Götze F, and van Zwet W (1997), “Resampling fewer than n observations: gains, losses, and remedies for losses,” Statistica Sinica, 7, 1–31.
2. Bisson LJ, Kluczynski MA, Wind WM, Fineberg MS, Bernas GA, Rauh MA, Marzo JM, Zhou Z, and Zhao J (2017), “Patient outcomes after observation versus debridement of unstable chondral lesions during partial meniscectomy: the chondral lesions and meniscus procedures (ChAMP) randomized controlled trial,” The Journal of Bone & Joint Surgery, 99, 1078–1085.
3. Bisson LJ (2018), “How Does the Presence of Unstable Chondral Lesions Affect Patient Outcomes After Partial Meniscectomy? The ChAMP Randomized Controlled Trial,” The American Journal of Sports Medicine, 46, 590–597.
4. Bisson LJ, Phillips P, Matthews J, Zhou Z, Zhao J, Wind WM, Fineberg MS, Bernas GA, Rauh MA, Marzo JM, and Kluczynski MA (2019), “The association between bone marrow lesions and unstable chondral lesions and pain in patients without radiographic evidence of degenerative joint disease after arthroscopic partial meniscectomy,” Orthopaedic Journal of Sports Medicine, in press.
5. Cook CE (2008), “Clinimetrics corner: the minimal clinically important change score (MCID): a necessary pretense,” Journal of Manual & Manipulative Therapy, 16, 82E–83E.
6. Efron B (1979), “Bootstrap Methods: Another Look at the Jackknife,” The Annals of Statistics, 1–26.
7. Erdogan BD, Leung YY, Pohl C, Tennant A, and Conaghan PG (2016), “Minimal clinically important difference as applied in rheumatology: an OMERACT Rasch Working Group systematic review and critique,” The Journal of Rheumatology, 43, 194–202.
8. Hedayat A, Wang J, and Xu T (2015), “Minimum clinically important difference in medical studies,” Biometrics, 71, 33–41.
9. Jaeschke R, Singer J, and Guyatt GH (1989), “Measurement of health status: ascertaining the minimal clinically important difference,” Controlled Clinical Trials, 10, 407–415.
10. Kimeldorf G and Wahba G (1971), “Some results on Tchebycheffian spline functions,” Journal of Mathematical Analysis and Applications, 33, 82–95.
11. Kluczynski MA, Marzo JM, Wind WM, Fineberg MS, Bernas GA, Rauh MA, Zhou Z, Zhao J, and Bisson LJ (2017), “The effect of body mass index on clinical outcomes in patients without radiographic evidence of degenerative joint disease after arthroscopic partial meniscectomy,” Arthroscopy: The Journal of Arthroscopic & Related Surgery, 33, 2054–2063.
12. Laber EB and Murphy SA (2011), “Adaptive confidence intervals for the test error in classification,” Journal of the American Statistical Association, 106, 904–913.
13. McGlothlin AE and Lewis RJ (2014), “Minimal clinically important difference: defining what really matters to patients,” JAMA, 312, 1342–1343.
14. Shao J (1994), “Bootstrap sample size in nonregular cases,” Proceedings of the American Mathematical Society, 122, 1251–1262.
15. Shao J (1996), “Bootstrap model selection,” Journal of the American Statistical Association, 91, 655–665.
16. Thi Hoai An L and Dinh Tao P (1997), “Solving a class of linearly constrained indefinite quadratic problems by DC algorithms,” Journal of Global Optimization, 11, 253–285.
17. Ware JE Jr and Sherbourne CD (1992), “The MOS 36-item short-form health survey (SF-36): I. Conceptual framework and item selection,” Medical Care, 473–483.
18. Wright A, Hannon J, Hegedus EJ, and Kavchak AE (2012), “Clinimetrics corner: a closer look at the minimal clinically important difference (MCID),” Journal of Manual & Manipulative Therapy, 20, 160–166.
19. Xu Y, Yu M, Zhao Y-Q, Li Q, Wang S, and Shao J (2015), “Regularized outcome weighted subgroup identification for differential treatment effects,” Biometrics, 71, 645–653.
