Abstract
We aimed to identify markers in blood (serum) to predict clinically relevant knee osteoarthritis (OA) progression defined as the combination of both joint structure and pain worsening over 48 months. A set of 15 serum proteomic markers corresponding to 13 total proteins reached an area under the receiver operating characteristic curve (AUC) of 73% for distinguishing progressors from nonprogressors in a cohort of 596 individuals with knee OA. Prediction based on these blood markers was far better than traditional prediction based on baseline structural OA and pain severity (59%) or the current “best-in-class” biomarker for predicting OA progression, urinary carboxyl-terminal cross-linked telopeptide of type II collagen (58%). The generalizability of the marker set was confirmed in a second cohort of 86 individuals that yielded an AUC of 70% for distinguishing joint structural progressors. Blood is a readily accessible biospecimen whose analysis for these biomarkers could facilitate identification of individuals for clinical trial enrollment and those most in need of treatment.
A multi-biomarker blood test could be used to identify individuals at risk of worsening knee OA.
INTRODUCTION
Osteoarthritis (OA), the most common joint disease, is a leading cause of disability in the United States and worldwide (1). A cure for OA remains elusive, and its management is largely palliative. This is mainly due to two major obstacles. The first is the inability to detect OA sufficiently early, before the onset of irreversible signs and recalcitrant symptoms. The second is the inability to reliably identify individuals at high risk of OA progression. A trial that enrolls a cohort whose structural OA is relatively stable cannot observe a treatment effect on a structural OA end point, such as a knee x-ray, that can only worsen but not improve; the result is a high type II error rate, i.e., false-negative OA trials. Measures traditionally used to predict OA progression, such as age, sex, body mass index (BMI), and radiographic severity of OA, are minimally prognostic of structural worsening (2–4). The use of biomarkers in drug development increases by threefold the chance of successfully transitioning a drug from phase 1 to U.S. Food and Drug Administration approval (from 8 to 26%) (5). Thus, there is a strong need for OA biomarkers, particularly prognostic biomarkers, that can identify individuals likely to have OA progression during the study period and thereby enhance the success of clinical trials of disease-modifying drugs for OA.
Currently, however, available OA-related biomarkers are for research use only, and there are no blood-based biomarkers that are strongly predictive of OA progression. Moreover, the “chicken-and-egg” dilemma currently challenges the field; biomarkers are needed to enhance the success of trials, but qualification of biomarkers within successful trials is required to identify the best tools for differentiating specific disease phenotypes and potential responders to a treatment. Therefore, it is of great importance to be able to qualify biomarkers in the context of specific and clear phenotypes. In this work, we focus on the important unmet need of objectively identifying individuals at high risk of knee OA progression based on a user-friendly biospecimen, blood (serum), readily obtained in a clinical or trial setting.
Using a systematic, unbiased, and iterative approach based on extensive discovery proteomic studies in synovial fluid, urine, and serum from knee OA radiographic progressors and nonprogressors, we created a targeted multiple reaction monitoring (MRM) proteomic panel to predict radiographic knee OA progression by ultraperformance liquid chromatography–tandem mass spectrometry (UPLC-MS/MS) MRM analysis of serum samples (6). Our primary goal in this study was to evaluate the capability of this serum panel to predict clinically relevant knee OA progression (radiographic and pain worsening). We applied this MRM proteomic biomarker panel to the prediction of knee OA progression in the Foundation for the National Institutes of Health (FNIH) cohort, with further validation in the Biomarker Factory (BMF) cohort. The FNIH cohort was profiled previously for 18 commercially available enzyme-linked immunosorbent assay–based biomarkers (11 serum and 7 urinary) that were originally selected on the basis of existing evidence for their ability to predict OA progression (7, 8). Among these, urinary C-terminal cross-linked telopeptide of type II collagen (uCTXII) was the strongest prognostic biomarker of clinically relevant OA progression (8); uCTXII thereby provided a “best-in-class” reference against which we could evaluate the performance of our final serum proteomic biomarker sets. Through evaluation of their gene expression patterns in human knee OA articular chondrocytes and synoviocytes, we also explored the potential joint tissue origins of the serum proteomic biomarkers that were selected for inclusion in the final predictive biomarker sets.
RESULTS
FNIH cohort
In total, we quantitatively measured 177 peptides (101 proteins); the final analyses consisted of 107 peptides from 64 proteins that passed the quality control (QC) measures (table S1). We excluded a total of four samples: One was exhausted in laboratory preparation, two were outliers, and one had a missing data rate above 15%; this resulted in 596 FNIH study participants in the final analysis dataset with a mean age of 61.6 ± 8.9 years, a mean BMI of 30.7 ± 4.8 kg/m², and 58.7% female (Fig. 1). At baseline, most participants had mild to moderate radiographic knee OA based on Kellgren-Lawrence (K/L) grades, mean medial minimum joint space width (JSW) reflecting degree of cartilage loss, and mean Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain scores (Table 1). Most participants (70.5%) had no previous history of pain medication use. We defined four OA progressor groups based on the changes from baseline of radiographic joint space loss (JSL), which reflects cartilage degeneration, and/or WOMAC pain scores, resulting in 192 JSL and pain progressors, 103 JSL-only progressors, 102 pain-only progressors, and 199 JSL and pain nonprogressors (Table 1).
Fig. 1. Screening, classification, and analysis of the FNIH600 cohort.
*Magnetic resonance imaging (MRI) artifact, knee positioning exclusions; **frequency matching for 15 combinations of K/L grades by BMI strata, with random selection.
Table 1. Baseline characteristics of the FNIH and BMF cohorts [mean (SD) or n (%)].
NA, not applicable.
| | FNIH | | | | | BMF | | |
|---|---|---|---|---|---|---|---|---|
| Characteristic | JSL and pain progressor | JSL progressor | Pain progressor | Nonprogressor | Total | JSL progressor | Nonprogressor | Total |
| | (n = 192) | (n = 103) | (n = 102) | (n = 199) | (n = 596) | (n = 37) | (n = 49) | (n = 86) |
| BMI | 30.71 (4.77) | 30.72 (4.66) | 31.08 (5.02) | 30.47 (4.73) | 30.70 (4.80) | 29.6 (5.4) | 28.7 (5.0) | 29.1 (5.1) |
| Age | 62.08 (8.81) | 63.14 (8.35) | 59.32 (8.64) | 61.45 (9.15) | 61.60 (8.90) | 65.0 (10.7) | 66.4 (9.7) | 65.8 (10.1) |
| Sex | ||||||||
| Female | 108 (56.2%) | 46 (44.7%) | 67 (65.7%) | 129 (64.8%) | 350 (58.7%) | 29 (78.4%) | 42 (85.7%) | 71 (82.6%) |
| Male | 84 (43.8%) | 57 (55.3%) | 35 (34.3%) | 70 (35.2%) | 246 (41.3%) | 8 (21.6%) | 7 (14.3%) | 15 (17.4%) |
| Race | ||||||||
| Asian | 2 (1.0%) | 0 (0.0%) | 0 (0.0%) | 3 (1.5%) | 5 (0.8%) | 0 (0%) | 0 (0%) | 0 (0%) |
| Black or African American | 32 (16.7%) | 9 (8.7%) | 28 (27.5%) | 39 (19.6%) | 108 (18.1%) | 1 (2.7%) | 0 (0%) | 1 (1.2%) |
| Other non-White | 5 (2.6%) | 3 (2.9%) | 1 (1.0%) | 2 (1.0%) | 11 (1.8%) | 0 (0%) | 1 (2.0%) | 1 (1.2%) |
| White or Caucasian | 153 (79.7%) | 91 (88.3%) | 73 (71.6%) | 155 (77.9%) | 472 (79.2%) | 36 (97.3%) | 48 (98.0%) | 84/86 (97.7%) |
| WOMAC pain* | 10 (12.9) | 16.5 (19.9) | 9.7 (13.4) | 13.0 (16.2) | 12.0 (15.5) | 26.5 (17.6) | 26.2 (21.2) | 26.3 (19.7) |
| K/L grade | ||||||||
| 1 | 24 (12.5%) | 14 (13.6%) | 13 (12.7%) | 24 (12.1%) | 75 (12.6%) | 14 (37.8%) | 22 (44.9%) | 36 (41.9%) |
| 2 | 83 (43.2%) | 47 (45.6%) | 60 (58.8%) | 113 (56.8%) | 303 (50.8%) | 7 (18.9%) | 8 (16.3%) | 15 (17.4%) |
| 3 | 85 (44.3%) | 42 (40.8%) | 29 (28.4%) | 62 (31.2%) | 218 (36.6%) | 16 (43.2%) | 19 (38.8%) | 35 (40.7%) |
| JSW (mm) | 3.79 (1.39) | 3.77 (1.19) | 3.92 (1.01) | 3.86 (1.01) | 3.80 (1.20) | NA | NA | NA |
| JSN | ||||||||
| 0 | NA | NA | NA | NA | NA | 8/37 (21.6%) | 15/49 (30.6%) | 23/86 (26.7%) |
| 1 | NA | NA | NA | NA | NA | 15/37 (40.5%) | 14/49 (28.6%) | 29/86 (33.7%) |
| 2 | NA | NA | NA | NA | NA | 13/37 (35.1%) | 18/49 (36.7%) | 31/86 (36.0%) |
| 3 | NA | NA | NA | NA | NA | 1/37 (2.7%) | 1/49 (2.0%) | 2/86 (2.3%) |
| 4 | NA | NA | NA | NA | NA | 0/37 (0%) | 1/49 (2.0%) | 1/86 (1.2%) |
| Pain medication | 62/192 (32.3%) | 22/103 (21.4%) | 37/102 (36.3%) | 55/199 (27.6%) | 176/596 (29.5%) | NA | NA | NA |
*WOMAC pain score normalized on a 0 to 100 scale.
Elastic net feature selection
Elastic net regression along with bootstrapping was applied as the variable selection method to account for correlated biomarkers. With bootstrapped elastic net selection, the top 30 biomarkers with highest selection frequencies were defined as the “stable” set for each model outcome (table S2). Backward elimination was applied to each stable set to reduce prediction bias (Fig. 2), resulting in four “essential” biomarker sets for distinguishing the OA groups (table S2) as follows: 15 biomarkers for JSL and pain progressor versus composite comparator group (defined as the combined JSL and pain nonprogressor, JSL-only progressor, and pain-only progressor groups), 13 biomarkers for JSL and pain progressor versus the JSL and pain nonprogressor group, 11 biomarkers for JSL progressor versus the JSL and pain nonprogressor group, and 10 biomarkers for pain progressor versus the JSL and pain nonprogressor group. The summary statistics of the essential biomarkers and clinical covariates in knee OA progressor groups are provided in table S3. A total of 13 proteins were consistently selected in all models (Fig. 3A). Three proteins, represented by eight peptides (whose amino acid locations are indicated in subscripts) were selected for inclusion in all four of the essential biomarker sets, demonstrating their importance as prognostic indicators of knee OA progression, including cartilage acidic protein 1 [CRAC1(101–108) and CRAC1(170–178) peptides, both predicting increased risk], complement C1r subcomponent [the C1R(683–689) peptide predicting reduced risk], and vitamin D binding protein [VTDB(95–114), VTDB(128–149), and VTDB(371–388) predicting increased risk and VTDB(346–352) and VTDB(364–370) predicting reduced risk] (Fig. 3B). 
Other commonly selected peptides included CD44(30–38), dopamine β-hydroxylase [DOPO(585–602)], kininogen-1 [KNG1(479–496)], phosphatidylinositol glycan–specific phospholipase [PHLD(800–808)], retinol-binding protein 4 [RET4(185–195)], thrombospondin 1 [TSP1(217–228)], and protein Z–dependent protease inhibitor [ZPI(438–444)].
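The bootstrap selection-frequency step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: `toy_selector` is a hypothetical stand-in for an elastic net fit (it is not elastic net), the resampling is stratified by outcome class for simplicity, and all function names are invented for this example.

```python
import random
from collections import Counter

def stratified_bootstrap(labels, rng):
    """Resample row indices with replacement within each outcome class (an assumption)."""
    idx = []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        idx.extend(rng.choice(members) for _ in members)
    return idx

def selection_frequencies(rows, labels, select, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which `select` keeps each feature."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        idx = stratified_bootstrap(labels, rng)
        counts.update(select([rows[i] for i in idx], [labels[i] for i in idx]))
    return {f: counts[f] / n_boot for f in rows[0]}

def stable_set(freqs, k=30):
    """Top-k features by selection frequency (ties broken by name)."""
    return sorted(freqs, key=lambda f: (-freqs[f], f))[:k]

def toy_selector(rows, labels, min_gap=0.5):
    """Hypothetical placeholder for an elastic net fit: keep features whose
    class means differ by more than min_gap."""
    chosen = set()
    for f in rows[0]:
        case = [r[f] for r, y in zip(rows, labels) if y == 1]
        ctrl = [r[f] for r, y in zip(rows, labels) if y == 0]
        if abs(sum(case) / len(case) - sum(ctrl) / len(ctrl)) > min_gap:
            chosen.add(f)
    return chosen

# Toy data: "signal" separates the classes, "flat" carries no information.
rows = [{"signal": 1.0, "flat": 5.0}] * 5 + [{"signal": 0.0, "flat": 5.0}] * 5
labels = [1] * 5 + [0] * 5
freqs = selection_frequencies(rows, labels, toy_selector, n_boot=50)
print(stable_set(freqs, k=1))
```

In practice the `select` argument would wrap a penalized regression fit and return the names of features with nonzero coefficients; the tallying and top-k ranking would stay the same.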
Fig. 2. Backward elimination process to yield essential biomarker sets.
(A) JSL and pain progressor versus composite comparator (JSL or pain-only progression or combined JSL and pain nonprogression), (B) JSL and pain progressor versus nonprogressor, (C) JSL progressor versus nonprogressor, (D) pain progressor versus nonprogressor. The biomarkers in the stable sets were sorted by the area under the receiver operating characteristic curve (AUC) drop associated with a single biomarker removal from the model. Certain rules were applied to determine the stop point (red dot) as described in the methods. The AUC of each model was calculated as the average value of 10 imputed datasets. boot.ave (left y axis), the average AUC when fitting on the bootstrapping samples; orig.ave (left y axis), the average AUC of fitting on the original datasets; bias.ave (right y axis), the average bias between original AUC and bootstrapping AUC. Each peptide is defined (full name and sequence) in table S1.
Fig. 3. Venn diagram of selected protein biomarkers.
(A) Stable biomarker sets of the primary and secondary end points. (B) Essential biomarker sets of the primary and secondary end points. Proteins in overlapping regions are selected in common in corresponding comparison models. prog., progression.
Model evaluation
Clinical covariates meeting P < 0.15 in the univariate logistic analysis were included in the multivariable logistic regression models for different comparisons (table S4): baseline WOMAC pain score and K/L grade for analysis of JSL and pain progressors versus composite comparators; sex, baseline K/L grade, and WOMAC pain score for analysis of JSL and pain progressors versus nonprogressors (neither JSL nor pain progression); sex and baseline K/L grade for analysis of JSL progressors versus nonprogressors; and baseline WOMAC pain score for pain progressors versus nonprogressors. Bootstrapped areas under the receiver operating characteristic curve (AUCs; Table 2) were estimated for various models with different combinations of clinical covariates and biomarkers. Clinical covariates, uCTXII, and α-isomerized version of urinary C-terminal crosslinked telopeptide type I collagen (uCTXIα) each showed limited ability to distinguish progressors from nonprogressors (best average AUCs of 0.601, 0.608, and 0.577, respectively), while models with the essential biomarker sets only yielded much higher AUCs: 0.740 for JSL and pain progressor versus nonprogressor, 0.728 for JSL and pain progressor versus composite comparator, 0.698 for JSL progressor versus nonprogressor, and 0.673 for pain progressor versus nonprogressor. Prediction of clinically relevant progression (JSL and pain progression) was somewhat stronger when the comparator group was nonprogressors (AUC of 0.740) as opposed to the composite comparator (AUC of 0.728) that included 50% progressors (JSL-only progressors and pain-only progressors) and 50% nonprogressors. 
The AUC 95% confidence intervals (CIs) for uCTXII and the essential biomarker sets overlapped at their tails for the comparisons of JSL progressors versus JSL and pain nonprogressors and of pain progressors versus JSL and pain nonprogressors. For the primary comparisons of clinically relevant progressors (JSL and pain progressor versus composite comparator and JSL and pain progressor versus JSL and pain nonprogressor), however, the 95% CIs did not overlap, suggesting that the advantage of the essential biomarkers over uCTXII was somewhat greater for predicting clinically relevant progression (JSL and pain progression) than for predicting JSL-only or pain-only progression.
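As an illustration of how a bootstrapped AUC and a percentile 95% CI can be obtained from model scores, the following sketch uses only the Python standard library. It is an assumption-laden example: the function names, the number of resamples, and the percentile method are choices made here and are not taken from the study's methods.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control
    (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """Mean bootstrap AUC and a percentile 95% interval (illustrative choice)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int(0.025 * n_boot)]
    hi = stats[int(0.975 * n_boot) - 1]
    return sum(stats) / n_boot, (lo, hi)

# Toy example with perfectly separated scores.
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
mean_auc, (ci_lo, ci_hi) = bootstrap_auc(scores, labels, n_boot=200, seed=1)
print(mean_auc, ci_lo, ci_hi)
```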
Table 2. Predictive modeling with AUC and 95% CIs for all models and OA progression outcomes.
| Multivariable model | JSL and pain progressor vs. composite comparator (192 vs. 404) | JSL and pain progressor vs. JSL and pain nonprogressor (192 vs. 199) | JSL progressor vs. JSL and pain nonprogressor (295 vs. 199) | Pain progressor vs. JSL and pain nonprogressor (294 vs. 199) |
|---|---|---|---|---|
| Covariates* | 0.585 (0.532–0.627) | 0.601 (0.534–0.641) | 0.596 (0.544–0.637) | 0.542 (0.492–0.592) |
| uCTXII | 0.575 (0.525–0.623) | 0.608 (0.552–0.663) | 0.594 (0.543–0.645) | 0.590 (0.539–0.641) |
| uCTXIα | 0.564 (0.512–0.613) | 0.577 (0.513–0.632) | 0.559 (0.503–0.611) | 0.559 (0.501–0.611) |
| uCTXII + covariates | 0.602 (0.545–0.642) | 0.641 (0.576–0.684) | 0.629 (0.570–0.671) | 0.600 (0.544–0.645) |
| Essential biomarker set only† | 0.728 (0.672–0.752) | 0.740 (0.669–0.772) | 0.698 (0.631–0.735) | 0.673 (0.608–0.705) |
| Essential biomarker set + covariates | 0.746 (0.693–0.767) | 0.752 (0.696–0.775) | 0.718 (0.654–0.747) | 0.685 (0.622–0.717) |
| Essential biomarker set + uCTXII | 0.733 (0.682–0.757) | 0.753 (0.687–0.784) | 0.712 (0.650–0.743) | 0.692 (0.633–0.721) |
| Essential biomarker set + uCTXIα | 0.729 (0.673–0.753) | 0.738 (0.667–0.768) | 0.698 (0.632–0.730) | 0.676 (0.620–0.704) |
| Essential biomarker set + uCTXII + covariates | 0.743 (0.694–0.766) | 0.760 (0.706–0.785) | 0.728 (0.670–0.755) | 0.702 (0.636–0.731) |
| Stable biomarker set only | 0.746 (0.715–0.753) | 0.763 (0.730–0.765) | 0.726 (0.692–0.731) | 0.713 (0.688–0.714) |
| Stable biomarker set + covariates | 0.762 (0.729–0.767) | 0.779 (0.752–0.777) | 0.745 (0.712–0.749) | 0.718 (0.689–0.720) |
| Stable biomarker set + uCTXII | 0.752 (0.719–0.759) | 0.775 (0.735–0.778) | 0.736 (0.703–0.740) | 0.727 (0.699–0.727) |
| Stable biomarker set + uCTXIα | 0.747 (0.715–0.754) | 0.763 (0.732–0.765) | 0.725 (0.692–0.726) | 0.716 (0.685–0.714) |
| Stable biomarker set + uCTXII + covariates | 0.771 (0.739–0.777) | 0.794 (0.766–0.794) | 0.761 (0.731–0.764) | 0.732 (0.704–0.734) |
*Selected clinical covariates by univariate analyses: (i) JSL and pain progressor versus composite comparator: K/L grade and WOMAC pain score; (ii) JSL and pain progressor versus JSL and pain nonprogressor: sex, K/L grade, and WOMAC pain score; (iii) JSL progressor versus JSL and pain nonprogressor: sex, and K/L grade; (iv) pain progressor versus JSL and pain nonprogressor: WOMAC pain score.
†The essential (final) biomarker sets consisted of selected biomarkers after stability screening and backward elimination. The stable biomarker set refers to those initially selected by elastic net.
Models with the stable biomarker sets exhibited greater AUCs (by 0.018 to 0.040 U) than those with essential biomarkers (Table 2); by log likelihood ratio tests, the differences were statistically significant for only one of the outcomes (JSL and pain progressor versus composite comparator, P = 0.021) (Table 3). This indicated that backward elimination generally reduced the number of peptide biomarkers in the final models without a significant loss of model fit. Adding the covariates to either the essential or stable biomarker sets significantly enhanced discriminatory capacity with AUC increases ranging from 0.005 to 0.020 U (average of 0.015 U) (Tables 2 and 3). Adding uCTXII or uCTXII with covariates to either the essential or stable biomarker sets significantly enhanced the discriminatory capacity with AUC increases ranging from 0.005 to 0.035 U (average of 0.018 U) (Table 2). The addition of uCTXIα did not enhance the discriminatory capacity of any model (Tables 2 and 3). Overall, the highest AUC (0.794; 95% CI, 0.766 to 0.794) was achieved for the discrimination of JSL and pain progressor versus JSL and pain nonprogressor using the stable biomarker set, uCTXII, and covariates in combination. Sensitivity analysis of the essential biomarker set performance, excluding the 218 participants with baseline K/L = 3 OA status, resulted in improved AUCs for all outcomes with increases in AUCs by 0.01 to 0.028 U (table S5); this suggests that the essential biomarker sets may be even better discriminators of knee OA progressors in “earlier” (baseline K/L = 1 to 2) stages of radiographic OA.
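The log likelihood ratio tests reported in Table 3 compare nested models through the difference in −2log L, referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. The sketch below reproduces that arithmetic with the standard library only; the function names are illustrative, and the chi-square tail is computed from a standard recurrence for integer degrees of freedom rather than from any particular statistics package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(neg2ll_reduced, neg2ll_full, df):
    """Return the test statistic and P value for a nested-model comparison."""
    stat = neg2ll_reduced - neg2ll_full
    return stat, chi2_sf(stat, df)

# Model C vs. A for the primary outcome, using the -2log L values from Table 3.
stat, p = likelihood_ratio_test(667.84, 662.74, df=1)
print(round(stat, 2), round(p, 3))
```

Applied to the −2log L values 667.84 (essential biomarker set) and 662.74 (essential biomarker set and uCTXII), this gives a statistic of 5.10 on 1 degree of freedom, in line with the tabulated P value of 0.024.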
Table 3. Log likelihood ratio test to compare essential and stable biomarkers sets.
The essential biomarker sets consisted of selected biomarkers after stability screening and backward elimination. The stable biomarker set refers to those initially selected by elastic net. Covariates used in the analyses were as follows: baseline WOMAC pain score and K/L grade for analysis of JSL and pain progression versus composite comparators; sex, baseline K/L grade and WOMAC pain score for analysis of JSL and pain progressor versus JSL and pain nonprogressor; sex and baseline K/L grade for analysis of JSL progressor versus JSL and pain nonprogressor; and baseline WOMAC pain score for pain progressor versus JSL and pain nonprogressor.
| Model | −2log L | Model comparison | 2∆ln L (df) | P value |
|---|---|---|---|---|
| JSL & Pain progressor vs. composite comparator | ||||
| A: Essential biomarker set | 667.84 | |||
| B: Essential biomarker set and covariates | 652.11 | B vs. A | 15.73 (3) | 0.001 |
| C: Essential biomarker set and uCTXII | 662.74 | C vs. A | 5.10 (1) | 0.024 |
| D: Essential biomarker set and uCTXIα | 666.23 | D vs. A | 1.62 (1) | 0.204 |
| E: Essential biomarker set and uCTXII and covariates | 646.46 | E vs. A | 21.38 (4) | <0.001 |
| F: Stable biomarker set | 639.82 | F vs. A | 28.03 (15) | 0.021 |
| G: Stable biomarker set and covariates | 626.06 | G vs. F | 13.76 (3) | 0.003 |
| H: Stable biomarker set and uCTXII | 634.19 | H vs. F | 5.63 (1) | 0.018 |
| I: Stable biomarker set and uCTXIα | 638.14 | I vs. F | 1.68 (1) | 0.195 |
| J: Stable biomarker set and uCTXII and covariates | 620.03 | J vs. F | 19.79 (4) | 0.001 |
| JSL and pain progressor vs. JSL and pain nonprogressor | ||||
| A: Essential biomarker set | 473.51 | |||
| B: Essential biomarker set and covariates | 457.94 | B vs. A | 15.57 (4) | 0.004 |
| C: Essential biomarker set and uCTXII | 463.18 | C vs. A | 10.33 (1) | 0.001 |
| D: Essential biomarker set and uCTXIα | 472.68 | D vs. A | 0.83 (1) | 0.363 |
| E: Essential biomarker set and uCTXII and covariates | 444.00 | E vs. A | 29.51 (5) | <0.001 |
| F: Stable biomarker set | 450.09 | F vs. A | 23.42 (17) | 0.136 |
| G: Stable biomarker set and covariates | 435.65 | G vs. F | 14.44 (4) | 0.006 |
| H: Stable biomarker set and uCTXII | 439.67 | H vs. F | 10.42 (1) | 0.001 |
| I: Stable biomarker set and uCTXIα | 449.58 | I vs. F | 0.51 (1) | 0.477 |
| J: Stable biomarker set and uCTXII and covariates | 423.07 | J vs. F | 27.02 (5) | <0.001 |
| JSL progressor vs. JSL and pain nonprogressor | ||||
| A: Essential biomarker set | 606.43 | |||
| B: Essential biomarker set and covariates | 592.13 | B vs. A | 14.30 (3) | 0.003 |
| C: Essential biomarker set and uCTXII | 595.44 | C vs. A | 11.00 (1) | 0.001 |
| D: Essential biomarker set and uCTXIα | 605.97 | D vs. A | 0.46 (1) | 0.498 |
| E: Essential biomarker set and uCTXII and covariates | 576.36 | E vs. A | 30.07 (4) | <0.001 |
| F: Stable biomarker set | 579.49 | F vs. A | 26.94 (19) | 0.106 |
| G: Stable biomarker set and covariates | 565.26 | G vs. F | 14.24 (3) | 0.003 |
| H: Stable biomarker set and uCTXII | 571.05 | H vs. F | 8.44 (1) | 0.004 |
| I: Stable biomarker set and uCTXIα | 579.20 | I vs. F | 0.29 (1) | 0.587 |
| J: Stable biomarker set and uCTXII and covariates | 552.24 | J vs. F | 27.25 (4) | <0.001 |
| Pain progressor vs. JSL and pain nonprogressor | ||||
| A: Essential biomarker set | 615.94 | |||
| B: Essential biomarker set and covariates | 611.45 | B vs. A | 4.49 (1) | 0.034 |
| C: Essential biomarker set and uCTXII | 604.35 | C vs. A | 11.59 (1) | 0.001 |
| D: Essential biomarker set and uCTXIα | 613.54 | D vs. A | 2.40 (1) | 0.122 |
| E: Essential biomarker set and uCTXII and covariates | 598.36 | E vs. A | 17.58 (2) | <0.001 |
| F: Stable biomarker set | 588.42 | F vs. A | 27.52 (20) | 0.121 |
| G: Stable biomarker set and covariates | 585.56 | G vs. F | 2.86 (1) | 0.091 |
| H: Stable biomarker set and uCTXII | 577.86 | H vs. F | 10.55 (1) | 0.001 |
| I: Stable biomarker set and uCTXIα | 585.83 | I vs. F | 2.58 (1) | 0.108 |
| J: Stable biomarker set and uCTXII and covariates | 573.88 | J vs. F | 14.54 (2) | 0.001 |
As demonstrated by the 95% CIs of the odds ratios (ORs) in multivariable models that included the full set of essential biomarkers (Fig. 4), most peptide biomarkers in the essential sets were statistically significant predictors of OA progression. For each essential set of biomarkers, about half of all peptide biomarkers predicted increased risk, while the other half predicted decreased risk of progression. The highest AUC, using essential biomarkers only, distinguished the clinically relevant progressors from JSL and pain nonprogressors with an AUC of 0.740 based on 13 peptides from 13 distinct proteins. Youden’s J statistic for this outcome was 0.395, yielding a sensitivity of 80.2% and a specificity of 59.3% with essential biomarkers only (table S6).
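Youden’s J is defined as sensitivity + specificity − 1, so the reported values can be checked directly: 0.802 + 0.593 − 1 = 0.395. The short sketch below shows that calculation and one common way to choose the cutoff that maximizes J; the helper names and the toy scores are illustrative assumptions, not study data.

```python
def youden_j(sensitivity, specificity):
    """Youden's J statistic."""
    return sensitivity + specificity - 1.0

def best_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) for the cutoff maximizing J,
    classifying score >= cutoff as progressor (label 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        cand = (youden_j(sens, spec), c, sens, spec)
        if best is None or cand[0] > best[0]:
            best = cand  # the first cutoff reaching the maximum is kept
    return best

# Check against the reported sensitivity and specificity.
print(round(youden_j(0.802, 0.593), 3))

# Toy scores (hypothetical) to show cutoff selection.
j, cutoff, sens, spec = best_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(j, cutoff, sens, spec)
```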
Fig. 4. Forest plot and OR estimates for the models with the essential biomarkers.
(A) The JSL and pain progressor group compared to the composite comparator group defined as the combined JSL and pain nonprogressor, JSL-only progressor, and pain-only progressor groups; (B) the JSL and pain progressor group compared to the JSL and pain nonprogressor group; (C) the JSL-only progressor group compared to the JSL and pain nonprogressor group; (D) the pain-only progressor group compared to the JSL and pain nonprogressor group. Parameter estimates of each model were obtained by fitting on 10 imputed datasets. Rubin’s rules were then applied to combine the estimates from the repeated complete data analyses. Because the peptide ratios were first standardized by their own SD, the results are presented as the ORs for an SD change of the corresponding peptide ratios. With our peptide selection strategy, most of the biomarkers were significantly associated with either increased or decreased risk of knee OA progression.
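Rubin’s rules, mentioned in the Fig. 4 legend, pool a coefficient across imputed datasets by averaging the estimates and combining within- and between-imputation variance. The following is a minimal sketch under simplifying assumptions: the interval uses a normal quantile (1.96) rather than the t reference distribution with Rubin’s degrees of freedom, and the input numbers are invented for illustration.

```python
import math

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates (e.g., log ORs) and their variances."""
    m = len(estimates)
    q_bar = sum(estimates) / m                       # pooled estimate
    within = sum(variances) / m                      # mean within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between       # total variance
    return q_bar, total

def odds_ratio_ci(log_or, total_var, z=1.96):
    """OR and an approximate 95% CI (normal approximation, an assumption here)."""
    se = math.sqrt(total_var)
    return math.exp(log_or), (math.exp(log_or - z * se), math.exp(log_or + z * se))

# Hypothetical log ORs per SD of a peptide ratio from two imputed datasets.
q_bar, total = pool_rubin([0.2, 0.4], [0.01, 0.03])
print(q_bar, total)
print(odds_ratio_ci(q_bar, total))
```

Because the peptide ratios are standardized by their SD before fitting, the exponentiated pooled coefficient reads as the OR per SD change.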
Prognostic MRM biomarker panel validation in the BMF cohort
The characteristics of the BMF cohort (n = 86, 37 JSL progressors and 49 JSL nonprogressors) were comparable to those of the FNIH cohort (Table 1). Among the 11 essential biomarkers selected in the comparison of JSL progressor versus JSL nonprogressor in the FNIH cohort, the contrast most comparable to the available groups in the BMF cohort, two peptides, VTDB(128–149) and CD14(192–210), were not measured in the BMF cohort. VTDB(128–149) was strongly correlated with four other peptides within the VTDB protein (rs > 0.89); VTDB(371–388), which had the highest correlation (0.94), was used as an alternative to VTDB(128–149). No replacement peptide was found for CD14(192–210) in the BMF cohort; however, in the work by Nockher et al. (9), serum CD14 concentrations were correlated positively with β2-microglobulin (B2MG) (rs = 0.63, P < 0.0001). Thus, we used B2MG, measured in the BMF cohort, as a replacement for CD14(192–210). Without clinical covariates, the 11-peptide biomarker set was capable of distinguishing JSL progressors from JSL nonprogressors in the BMF cohort (AUC = 0.697). The BMF cohort included five families of two individuals each; in a sensitivity analysis, we randomly removed one individual from each of these families, which yielded a similar AUC of 0.688.
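Choosing a surrogate for an unmeasured peptide by rank correlation, as described above for VTDB(128–149), can be sketched as follows. This is an illustrative example only: the helper names and the toy values are hypothetical, and Spearman’s coefficient is computed here as the Pearson correlation of average ranks.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation (undefined if either input is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation."""
    return pearson(average_ranks(x), average_ranks(y))

def best_surrogate(target, candidates):
    """Name of the candidate series most strongly rank-correlated with target."""
    return max(candidates, key=lambda name: spearman(target, candidates[name]))

# Hypothetical abundance values for a target peptide and two candidate peptides.
target = [1.0, 2.0, 3.0, 4.0]
candidates = {"peptide_a": [1.0, 3.0, 2.0, 4.0], "peptide_b": [2.0, 4.0, 6.0, 9.0]}
print(best_surrogate(target, candidates))
```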
Joint tissue expression of essential biomarkers
Single-cell RNA sequencing (scRNA-seq) confirmed the joint tissue gene expression in OA articular cartilage, OA synovium, or both of 19 (70%) of the 27 genes corresponding to proteins composed of the essential OA progression peptides (Fig. 5A), suggesting that this subset of essential biomarkers in the serum could have a potential joint tissue origin. Expression in both OA cartilage and synovium was found for actin cytoplasmic 2 (ACTG; ACTG1), complement C1r subcomponent (C1R; C1R), monocyte differentiation antigen CD14 (CD14; CD14), CD44 antigen (CD44; CD44), complement factor H (CFAH; CFH), tetranectin (TETN; CLEC3B), cartilage acidic protein 1 (CRAC1; CRTAC1), coagulation factor V (FA5; F5), proteoglycan 4 (PRG4; PRG4), retinol-binding protein 4 (RET4; RBP4), plasma protease C1 inhibitor (IC1; SERPING1), and thrombospondin-1 (TSP1; THBS1). Low but detectable expression was found in both tissues for β-Ala-His dipeptidase (CNDP1; CNDP1), glycosylphosphatidylinositol-specific phospholipase D1 (PHLD; GPLD1), hemopexin (HEMO; HPX), heparin cofactor II (HEP2; SERPIND1), and sex hormone binding globulin (SHBG; SHBG); in cartilage predominantly for α1-antichymotrypsin (AACT; SERPINA3); and in synovium predominantly for platelet factor 4 (PLF4; PF4). The essential protein sets of all four OA progression models were described by 12 canonical pathways meeting criteria of P < 0.05 [i.e., −log10 (P value) >1.3] including acute phase response signaling, liver X receptor/retinoid X receptor (LXR/RXR) and farnesoid X receptor (FXR)/RXR activation, complement and coagulation systems including both intrinsic and extrinsic prothrombin activation pathways, catecholamine biosynthesis, the “neuroprotective role of thimet oligopeptidase (THOP1) in Alzheimer’s disease,” actin cytoskeleton signaling, choline biosynthesis III, and macrophage migration inhibitory factor (MIF)–mediated glucocorticoid regulation (Fig. 5B). 
Collectively, these pathways point to key roles of inflammatory, metabolic, and immune responses in OA progression.
Fig. 5. Joint tissue gene expression profile corresponding to essential biomarkers.
(A) Expression level, defined as number of cells expressing the gene (y axis) corresponding to the essential biomarker (gene name listed on the x axis above the graphic) in the medial tibial (MT) lesioned knee OA articular knee cartilage, outer lateral tibial (OLT) nonlesioned knee OA articular cartilage, and matched knee OA synovium (SY); the data are extracted from a previously described database of scRNA-seq data (47). Of the 27 proteins comprising the essential peptides, 19 were found in the transcriptome data and are plotted. (B) Stacked bar plots depict the top ranked canonical pathways from ingenuity pathway analysis associated with all four OA progression phenotypic models at P < 0.05 [i.e., −log10 (P value) > 1.3]. prog., progression.
DISCUSSION
Our results verified that a combination of serum peptide biomarkers greatly enhanced the accuracy of predicting knee OA progression over existing methods, which perform poorly in knee OA prognosis and diagnosis (10, 11), and over the current leading systemic biomarker in the field, uCTXII. Based on a serum sample, this proteomic panel at baseline predicted clinically relevant progression (pain and radiographic knee OA progression) and radiographic knee OA progression (independent of any consideration of pain status) over the time course of a typical OA clinical trial (2 to 4 years). This is the strongest predictive panel to date for OA progression and, most excitingly, it is measured in a patient-friendly biofluid, serum, rather than requiring synovial fluid. On the basis of optimization and evaluation of the smallest essential panel for identifying each progressor phenotype, slightly different but overlapping sets of peptide biomarkers were identified. Differences in these sets are not unexpected given the known heterogeneity of pain phenotypes and lack of congruence of structural and pain features in OA (12). The AUC in the BMF validation cohort demonstrated the robustness of the essential peptide set for radiographic OA progression.
Three proteins, VTDB (five peptides, three indicating increased risk and two indicating reduced risk), CRAC1 (two peptides, both indicating increased risk), and C1R (one peptide indicating reduced risk) were selected as essential biomarkers in all four models, demonstrating their importance as prognostic indicators of knee OA progression. VTDB has a multitude of functions. More than 120 VTDB variants are known, but their health consequences are not understood. We did not identify GC (the vitamin-D binding protein gene) expression in either OA cartilage or synovium; however, VTDB readily penetrates the joint cavity, demonstrated by moderately strong correlation (0.6) of serum and synovial fluid concentrations, and concentrations in synovial fluid ~50% of serum concentrations (13). Therefore, although VTDB originates external to joint tissue, it nevertheless reflects processes relevant to OA pathology. As shown here, both CRAC1 (not only greatest in chondrocytes from lesioned regions of OA cartilage but also expressed by OA synovial cells) and C1R (not only greatest in OA synovial cells but also expressed in chondrocytes from OA cartilage) are unequivocally expressed in joint tissues. Both of these proteins could plausibly play a role in disease pathogenesis in addition to indicating risk of disease progression; therefore, they might be considered as “direct” biomarkers, defined as directly associated with the causal pathway of a disease (14).
In this study, not all VTDB peptides shared the same direction of effect on OA progression in multivariable models that included all essential biomarkers for an outcome; some were associated with increased risk and others with decreased risk, suggesting that some parts of the protein play a compensatory role in disease progression and some parts act as coactivators of the immune system. These results might be explained on the basis of domain-specific functions of VTDB. VTDB has a single binding site for all forms of vitamin D; it thereby determines vitamin D bioavailability, creating a reserve by which to rapidly replenish the free (active) forms of vitamin D to prevent vitamin D deficiency. In mice (15) and in vitro (16), vitamin D is anti-inflammatory; moreover, in the observational Osteoarthritis Initiative (OAI) cohort, vitamin D supplementation over 4 years was associated with significantly less progression of knee joint abnormalities (17). VTDB also scavenges microthrombi-forming actin molecules released from necrotic cells after tissue injury (18, 19). Together, these results suggest a protective role of VTDB in OA. However, VTDB also augments the monocyte and neutrophil chemotactic response to the complement anaphylatoxin C5a (20). This multitude of functions of VTDB likely complicates the ability to identify clear associations of the whole protein with clinical outcomes; in this regard, domain-specific analyses, as provided here by specific peptide quantification by MS proteomics, appear to unmask the functional complexity of this protein in OA progression. From a biomarker standpoint, these results suggest that epitope-specific measures of circulating VTDB are required for optimal OA progression prediction.
In our study, CRAC1 predicted both structural and OA pain progression. CRAC1 is a glycosylated extracellular matrix protein that is enriched in the interterritorial matrix of the deep zone of articular cartilage (21). Our scRNA-seq analysis confirmed the presence of CRAC1 in human articular cartilage, consistent with a prior murine study (22), its enrichment in lesioned regions as opposed to macroscopically normal-appearing regions of OA cartilage, and its expression in OA synovium. The function of CRAC1 in joint tissues is not clear, although CRAC1 promotes cell proliferation, migration, and extracellular matrix production in primary human fibroblasts in vitro (23). We identified two CRAC1 peptides as essential predictors; both were positively associated with OA progression. In a recent large study profiling 4792 proteins in plasma, CRAC1 was associated with OA pain and was the most strongly associated of all the proteins with a diagnosis of OA and with prediction of both knee and hip replacement (24). This study reported that plasma CRAC1 concentrations declined after joint replacement (24), consistent with a joint tissue origin of this plasma analyte. In addition, in the Rotterdam cohort, in models adjusted for age, sex, and BMI, serum CRAC1 has been identified as a biomarker of overall radiographic OA severity (hand, hip, and knee OA combined) and of knee radiographic OA severity and progression (25). Last, in a recent untargeted MS study of serum (26), CRAC1, along with fibrillin-1 (FBN1) and VTDB (all elevated above control), was identified as a knee OA-related protein; CRAC1 was the strongest contributor to a principal component (PC) discriminating OA from control, underscoring its emerging importance as a robust knee OA-related biomarker.
In our study, C1r predicted reduced risk of both OA structural (radiographic) and pain progression. C1r is a subunit of the C1 complex, the first component of the classical pathway of the complement system, which consists of C1r, C1q, and C1s. The C1 complex plays an important role in the innate immune defense system. Our scRNA-seq analysis confirmed the expression of C1r in OA cartilage and synovium. The epitope identified in our study [C1r(683–689)] is situated in the light catalytic (B) chain of C1r (spanning amino acids 464 to 702) (27) that mediates cleavage and autoactivation as well as cleavage and activation of C1s in the C1 complement complex (28). Our study showed that a higher serum concentration of the C1r(683–689) peptide was indicative of a lower risk of OA progression. To interpret this result, it is important to recognize that no fragment is released when C1r is cleaved. C1r subsequently cleaves and activates C1s, which in turn activates C4 and C2, ultimately resulting in activation of the central complement protein C3 (29). The only known protease inhibitor of activated C1r is the serpin C1 inhibitor (IC1; alternative names C1INH, C1 esterase inhibitor; gene SERPING1) (30), another peptide selected as essential for predicting reduced risk of OA radiographic progression in this study. IC1 forms proteolytically inactive covalent complexes with the C1r and C1s proteases (30); this effectively disassembles the C1 complex, releasing inactive C1r:IC1 and C1s:IC1 complexes (30, 31). Trimer and tetramer complexes containing IC1, C1r, and C1s have been identified in serum and synovial fluid of individuals with rheumatoid arthritis (RA) (32). On the basis of this information, it is plausible that our proteomic analyses detect stable IC1:C1r inhibitory complexes, such as those identified in individuals with RA (32), suggesting that C1 activation is triggered but is checked at the C1r stage by IC1, thus inhibiting the further activation of complement components.
It has also recently been determined that C1s cleaves noncomplement components, such as major histocompatibility complex class I molecules, insulin-like growth factor binding protein 5, nuclear autoantigens, and the Wnt co-receptor low-density lipoprotein receptor–related protein 6 (LRP6). LRP6 cleavage by C1s results in activation of Wnt signaling, a pathway involved in OA initiation and progression (33). Moreover, IC1 reduces Wnt signaling activation (34). This suggests that C1r, by activating C1s, and IC1 with its well-defined role in regulating host defense through its interaction with the C1 complex also function broadly in tissue homeostasis and immune tolerance, thereby providing possible explanations for the observed associations of higher serum concentrations of C1r and IC1 with lower risk of OA progression.
The biomarker TETN (CLEC3B) was in the stable sets of all four models and in the essential set for predicting pain progression; genetic variants of CLEC3B are associated with knee OA (35). Some biomarker elevations may represent countermeasures of disease as opposed to pathological mediators, or may be neoepitopes or degradation products that are highly informative but difficult to understand a priori without a thorough knowledge of their biology. For instance, FA5 protein (F5 gene), a blood coagulation factor, was in the stable biomarker sets for all four models of OA progression, and it was an essential biomarker for clinically relevant progression versus JSL (radiographic) nonprogression. Given the emerging recognition that components of blood coagulation can have prothrombotic and proinflammatory functions independent of their hemostatic effects (36), one might expect higher FA5 to be a risk marker for OA progression. On the basis of our scRNA-seq data, F5 gene expression was higher in the more diseased compared to the less diseased region of cartilage, but higher serum FA5(1506–1517) peptide predicted OA nonprogression. FA5 is at the heart of the coagulation cascade; FA5(1506–1517) represents an epitope that is degraded during FA5 activation by thrombin, a protein whose activity is associated with OA (37, 38). Thus, a higher blood concentration of FA5(1506–1517) is consistent with low thrombin activity and a state of less OA activity, as expected for nonprogressors, and consistent with the inverse association of this peptide with OA progression.
There were several limitations of this study. First, unlike the usual feature selection methods used in qualification of biomarkers, elastic net was used to accommodate a large number of predictors (peptides) and the potential correlation among predictors in the model. However, a disadvantage of elastic net in this scenario was that it selected more biomarkers than expected. We found that 10-fold cross-validation underestimated AUCs because of the relatively small sample size of the validation set. We therefore used bootstrap methods to determine the final set of peptides based on those repeatedly selected among bootstrapped samples. Second, our validation cohort (BMF) was smaller than the discovery cohort, and two of the essential biomarkers selected in the FNIH cohort were not profiled. We overcame this limitation by selecting two alternative peptides in BMF known to correlate with the two missing ones. With these substitutions, we validated the ability of the essential biomarkers to discriminate radiographic progressors from nonprogressors.
There were also several strengths of this study. We tested a well-developed proteomic panel suitable for prediction of OA progression in serum, a patient-friendly biospecimen. Our primary cohort (FNIH) was derived from the deeply phenotyped OAI cohort and had preexisting data for the major currently used OA-related biomarkers; these data allowed head-to-head comparison of the performance of our essential peptide sets with the existing best-in-class OA-related biomarker, uCTXII. To our knowledge, on the basis of the original report of these data (39), CNDP1, VTDB, and ACTG were not previously identified as biomarkers of OA. In addition, we focused on study participants with standard as opposed to fast rates of progression, who may be more representative of the general knee OA patient population.
In summary, this study successfully detected a combination of biomarkers that effectively discriminated clinically relevant knee OA progressors from nonprogressors using a patient-friendly biospecimen, serum. Using robust and stringent statistical and proteomic analysis methodology, we identified a set of baseline serum biomarkers of participants in the FNIH cohort that predicted clinically relevant knee OA progression (the combination of structural and pain progression) better than the current best-in-class OA-related biomarker (uCTXII) and that was validated in an independent knee OA cohort. These essential biomarker sets hold promise as tools for overcoming the chicken-and-egg challenge in OA, namely, to facilitate enrichment of clinical trial cohorts with individuals likely to progress over the time course of a typical OA clinical trial (2 to 4 years) and thereby increase the chance of trial success through enhanced statistical power. These results also provide a basis for future development of means of identifying individuals most in need of surveillance and disease-modifying therapies.
MATERIALS AND METHODS
Experimental design
The overall objective of this study was to identify biomarkers related to OA knee structural (radiographic) and pain progression using a “user-friendly” body fluid, serum.
Participants
The FNIH OA biomarkers consortium cohort
The FNIH cohort was selected as a nested case-control cohort within the OAI dataset to evaluate potential biochemical and magnetic resonance imaging (MRI) biomarkers of OA progression in the FNIH biomarker consortium project (8, 40). Participants were eligible for FNIH cohort inclusion on the basis of K/L (41) grades 1 to 3 radiographic knee OA at baseline; radiographic JSW and pain data from baseline to 48 months; and MRI, serum, and urine samples at both baseline and 24 months. The WOMAC-normalized pain score (range, 0 to 100) was used to quantify knee OA pain severity. Participants were excluded if they were unable to progress (baseline minimal JSW of <1.0 mm or WOMAC score of >91 on a 0- to 100-normalized scale), if MRI artifacts were likely to affect image analysis, or if the radiographs were of poor quality or malpositioned. In addition, as previously described (40), this FNIH subsample excluded knees with radiographic and pain progression at 12 months, those with lateral joint space narrowing (JSN) grade 2 or 3 at baseline, as well as individuals who underwent total knee or total hip replacement between baseline and 24 months. The goal of these exclusions was to ensure that selected biomarkers were prognostic as opposed to concurrent predictors of OA progression and that the cohort generally reflected progressors with more standard rates of progression. Thus, what would be considered “fast progressors” were excluded from this FNIH subsample. The sample size of the original case-control FNIH cohort (n = 600) was predetermined by the number of individuals in the OAI fulfilling the primary case status definition with available samples (n = 194) (8). Baseline data included age, gender, BMI, race, medial minimum JSW (minJSW), K/L grade, WOMAC pain score, and history of knee pain medication use.
Sufficient sera for 599 individuals were available to perform proteomic analyses for this study; one sample from the JSL and pain progression group in the original FNIH biomarker study (8) was exhausted (Fig. 1).
BMF cohort
The validation (BMF) cohort included 86 individuals with radiographic knee OA (K/L grade 1 to 3 at baseline), 3- to 4-year follow-up radiographic data, and available baseline serum samples. BMF knee OA progressors (n = 37) could have hip OA progression; the nonprogressors (n = 49) were required to have no knee and no hip OA progression. Baseline data included age, BMI, gender, and K/L grade.
Outcomes
Foundation for the National Institutes of Health
The primary case status classification of participants in the FNIH cohort was based on two criteria: knee radiographic JSL progression, defined as a decrease in minJSW of ≥0.7 mm from 24 to 48 months from baseline; and sustained pain worsening, defined as a WOMAC pain increase from 24 to 48 months of ≥9 U on a normalized 100-U scale, sustained on at least two follow-up visits over 60 months from baseline. Both of these criteria are considered to be above a minimum clinically important difference as previously described (42–44). The primary analysis compared “clinically relevant” knee OA progression (n = 192), defined as JSL and pain progression over 48 months, to a composite comparator (n = 404) consisting of individuals with JSL-only progression, pain-only progression, and neither JSL nor pain progression. The secondary analyses compared JSL and pain progression (n = 192), any JSL progression (n = 295), and any pain progression (n = 294) to the JSL and pain nonprogression group (n = 199).
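The case definitions above can be written as a small classification rule. The Python sketch below is illustrative only: the function names, argument names, and group labels are hypothetical, and the requirement that pain worsening be sustained on at least two follow-up visits is assumed to have been evaluated upstream and passed in as a boolean.

```python
# Illustrative sketch; names and labels are hypothetical, not from the study code.
JSL_THRESHOLD_MM = 0.7   # decrease in minJSW defining JSL progression
PAIN_THRESHOLD_U = 9     # WOMAC pain increase on the normalized 100-U scale


def classify_progression(jsw_decrease_mm, womac_pain_increase, pain_sustained):
    """Assign a participant to one of the four outcome groups."""
    jsl = jsw_decrease_mm >= JSL_THRESHOLD_MM
    pain = womac_pain_increase >= PAIN_THRESHOLD_U and pain_sustained
    if jsl and pain:
        return "JSL and pain progression"
    if jsl:
        return "JSL-only progression"
    if pain:
        return "pain-only progression"
    return "nonprogression"


def is_primary_case(group):
    """Primary analysis: clinically relevant progressors versus everyone else."""
    return group == "JSL and pain progression"


group = classify_progression(0.9, 12, True)
print(group, is_primary_case(group))
```

In the primary analysis, the three remaining groups together form the composite comparator; in the secondary analyses, each progressor group is compared with the nonprogression group alone.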
Biomarker Factory
Using a standardized atlas (45), the medial and lateral compartments of the knee were graded for categorical JSN (score, 0 to 3); knee radiographic OA progression was defined as an increase in categorical JSN of at least one unit (JSN of ≥1) over 3 to 4 years. A one-grade change in categorical JSN used in this cohort is comparable to the radiographic progression measure (JSW of ≥0.7 mm) used in the FNIH cohort (46).
MS analyses
Nondepleted serum sample preparation and stable isotope–labeled peptide spiking
Nondepleted FNIH cohort samples were thawed before sample preparation in 12 partially randomized processing/run blocks. Upon thawing, an aliquot was removed and subjected to a Bradford assay in duplicate after a 50× dilution into 50 mM ammonium bicarbonate. In addition to patient samples, three QC sample types were processed: (i) To assess differences in digestion efficiencies between sets, a digestion QC standard was created by pooling 50 μl from the first block (block 1) of 61 samples, which was then mixed and subaliquoted (at the intact protein level). All digestion QC standards were then frozen. Three random aliquots were thawed and digested with each sample block to assess differences in digestion efficiency between all blocks. (ii) To provide a global reference standard, a nondepleted serum sample from Golden West Biologicals was processed and analyzed in singlicate for each of the unique batches. (iii) To assess intra/interplate variation of the targeted analytes, a sample pool QC (SPQC) sample was created from the initial pooled sample (used for digestion controls), digested using the same protocol as the patient samples, and then subaliquoted and stored at −80°C. An aliquot was removed for each of the plates and run periodically throughout the acquisition window with the same LC-MS method as the rest of the study. For all sample digestions, 5 μl of each sample was removed and diluted 1:40 with a 5.13% deoxycholate (DOC) surfactant such that the starting protein concentration was 1.25 μg/μl. Samples were reduced with 10 mM dithiothreitol (DTT) at 80°C for 20 min and then alkylated with 20 mM iodoacetate at room temperature for 40 min. To each sample, sequencing grade trypsin was added (1:10 enzyme to protein), and samples were digested for 2 hours at 37°C. Samples were then acidified with 1% trifluoroacetic acid and then spun to remove residual DOC before LC-MRM analysis.
Included in the spiking of the DTT reducing agent was a stable isotope–labeled (SIL) peptide mixture of 177 C13/N15 R/K/- or L/-labeled peptides corresponding to 101 endogenous human proteins. This SIL mixture was created using the following protocol: SpikeTide TQL SIL peptides were purchased from JPT (Hamburg, Germany) (5× 1 nmol of 99.9% C13/N15 lyophilized peptide). Following resolubilization of the JPT SpikeTides in 100 mM ammonium bicarbonate/20% acetonitrile, SpikeTides were digested with 1:50 sequencing grade trypsin for 18 hours at 37°C to yield the final SIL peptide (removal of TQL “tag”). This SIL mixture was then added to the DTT reagent used for reduction. Because the maximum number of samples processed in any batch was 61 (based on instrument acquisition time and time of samples sitting in autosampler), a multibatch approach in a 96-well plate format was deployed.
Targeted MRM quantitative analysis of nondepleted serum samples
Quantitative LC-MRM was performed on 1 μg of protein digest spiked with 10 fmol of 177 SIL peptides using a nanoACQUITY UPLC system (Waters Corp.) coupled to a Waters Xevo TQ-XS triple quadrupole mass spectrometer via a nanoelectrospray ionization source. Briefly, the sample was first trapped on a Symmetry C18 300 mm–by–180 mm trapping column [5 μl/min at 99.9/0.1 (v/v) water/acetonitrile], after which the analytical separation was performed using a 1.8-μm Acquity HSS T3 C18 75 μm–by–150 mm column (Waters Corp.) using a 55-min gradient of 5 to 40% acetonitrile with 0.1% formic acid at a flow rate of 400 nl/min with a column temperature of 55°C. Data collection on the Xevo TQ-XS mass spectrometer was performed in a targeted mode following method creation within Skyline (MacCoss Lab, University of Washington) with retention time scheduling set to 4 min around the average peak apex retention time from three SIL peptide alone acquisitions. Average peak widths were set to 20 s with 12 points across the peak, the auto-dwell feature was enabled, and the optimal collision energy (CE) for each precursor was calculated experimentally from an SIL peptide alone analysis. To create the MRM assay within Skyline, both data-dependent acquisition (DDA) discovery data (Q Exactive platform) and SIL alone peptide mixtures (100 fmol on column) were used. The method initially allowed for multiple charge states and up to five transitions per precursor. Optimization of the method, including retention time scheduling, precursor charge state selection, and selection of most robust transitions, was then performed on SIL mixture alone samples. Following optimization, the final method (including alcohol dehydrogenase–spiked peptides and targeted endogenous human proteins) targeted 102 proteins, 183 peptides, 360 precursors, and 1114 individual transitions. Each individual processing batch was saved as a separate Skyline file. 
Each peptide pair was manually verified for correct integration for each of the individual injections. All data were expressed as the area of light endogenous signal divided by the heavy SIL peptide signal for each peptide. All proteomic analyses were conducted by investigators blinded to the clinical data and case status of the participants.
Targeted LC-MRM assay reproducibility metrics and removal of peptide targets
To assess the quality of each MRM assay, including analytical reproducibility, and to identify potential targeted peptides below the limit of quantification (BLOQ) or peptides interfered with, the dot product of the transition ratios in the heavy channel compared to the transition ratios in the light channel was used. This filter is a strong indicator of interference in either the light or heavy channel. In each of those channels, the ratio among the three transitions within a peptide is measured. These ratios were then compared between the light and heavy channels. Of the 177 peptides, 54 had dot products of <90%. These peptides were removed from further analysis. From the remaining peptides, the data were simplified to only the ratio of the light channel to the heavy channel. From these data, the coefficient of variation (CV) for each peptide ratio was calculated across each run block of digestion QC injections. The average %CV of the SPQC samples ranged from 13.2 to 17.5% for all blocks. The average %CV of the digestion QC samples ranged from 10.2 to 17.5%.
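The interference filter and reproducibility metrics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study pipeline: the dot product is taken here to be the normalized dot product (cosine similarity) between light and heavy transition peak areas, the light-to-heavy ratio is assumed to use summed transition areas, and %CV uses the sample SD. All function names are hypothetical.

```python
import math
import statistics

DOT_PRODUCT_CUTOFF = 0.90  # peptides below this were removed in the text above


def transition_dot_product(light_areas, heavy_areas):
    """Cosine similarity between light and heavy transition areas (assumed form)."""
    dot = sum(l * h for l, h in zip(light_areas, heavy_areas))
    norm = math.sqrt(sum(l * l for l in light_areas)) * math.sqrt(
        sum(h * h for h in heavy_areas)
    )
    return dot / norm if norm else 0.0


def light_to_heavy_ratio(light_areas, heavy_areas):
    """Endogenous (light) signal divided by SIL (heavy) signal, summed over transitions."""
    return sum(light_areas) / sum(heavy_areas)


def percent_cv(values):
    """Coefficient of variation in percent, using the sample SD."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Transitions in the same proportions in both channels: no sign of interference.
clean = transition_dot_product([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
# One light transition inflated by a co-eluting species: fails the filter.
interfered = transition_dot_product([10.0, 1.0, 1.0], [1.0, 2.0, 3.0])
print(clean >= DOT_PRODUCT_CUTOFF, interfered >= DOT_PRODUCT_CUTOFF)
```

When the relative transition intensities agree between channels the dot product approaches 1, whereas interference in a single transition distorts the pattern and lowers it, which is why the metric works as a filter.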
Single-cell RNA sequencing
To assess the potential for a joint tissue origin of the selected peptides, we evaluated scRNA-seq gene expression data corresponding to the proteins of the essential peptides selected in the proteomic analysis. scRNA-seq data were generated from cartilage and matched synovium of three individuals with knee OA undergoing joint replacement surgery, as previously described (47) (dataset National Center for Biotechnology Information Gene Expression Omnibus GSE152805). The scRNA-seq data were generated from 11,579 chondrocytes from damaged sites of the medial tibiae (MT), 14,613 chondrocytes from intact sites of the outer lateral tibiae (OLT), and 10,640 synoviocytes. The relative gene expression data corresponding to the essential proteins are visualized in violin (density) plots that reflect the frequency of cells expressing each gene in the medial tibial (lesioned) cartilage (MT), the outer lateral tibial (nonlesioned) cartilage (OLT), and synovium.
Statistical analysis
Data processing and QCs
Before statistical analyses, outlier values (quantity of the endogenous relative to the SIL peptide) were reviewed for technical validity and removed if found to be invalid for technical reasons (for example, full spectrum not within the retention time window). The missing rate of returned data was computed for each peptide and individual; one individual was excluded because of a missing rate above 15% (15.9%). PC analysis using all peptides was performed to identify outlier samples; two samples located outside of the clusters in the pairwise PC plots were considered outliers and were excluded.
A total of 12 peptides were excluded because of SIL BLOQ. Serum samples were run in batches on different dates. To assess intra- and interplate variation of the targeted analytes and thereby identify batch effects, the SPQC sample was digested using the same protocol as the study samples and run multiple times (approximately every six samples). In total, four additional peptides (from serum amyloid P component, VTDB, complement component C9, and HPX) were excluded because of batch effects between plates. Thus, we proceeded with 107 peptides (64 proteins) for final analyses.
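The sample-level missingness filter and the peptide tally can be illustrated as follows. This Python sketch is a simplified assumption about how such a filter might be coded (missing values are represented as `None`, and all names are hypothetical); the final line simply reproduces the peptide arithmetic reported in the text.

```python
MAX_MISSING_RATE = 0.15  # individuals above this missing rate were excluded


def missing_rate(values):
    """Fraction of peptide measurements that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def filter_samples(samples, max_missing=MAX_MISSING_RATE):
    """Keep samples whose missing rate does not exceed the threshold."""
    return {
        sample_id: values
        for sample_id, values in samples.items()
        if missing_rate(values) <= max_missing
    }


# Hypothetical toy data: 20 peptide ratios per sample.
samples = {
    "s1": [1.0] * 18 + [None] * 2,  # 10% missing, kept
    "s2": [1.0] * 16 + [None] * 4,  # 20% missing, excluded
}
kept = filter_samples(samples)

# Peptide tally from the text: 177 SIL peptides, 54 failing the dot-product
# filter, 12 with SIL BLOQ, and 4 with batch effects.
remaining_peptides = 177 - 54 - 12 - 4
print(sorted(kept), remaining_peptides)
```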
Data imputation
Assuming that peptide measures were missing at random, the missing values were imputed 10 times with random forest using the “missRanger” package (48) in R version 3.6.1. Following published guidance (49, 50), we included all variables in the imputation scheme that we desired to study: 107 peptides and demographics of age, BMI, sex, and race. Results from the imputed datasets were combined to obtain 95% CIs and P values using Rubin’s rules (51, 52).
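Rubin's rules combine an estimate across the imputed datasets by averaging the point estimates and adding the within- and between-imputation variances. The Python sketch below illustrates the pooling step only; it is not the authors' R code, the function name is hypothetical, and for simplicity it uses a normal approximation for the 95% CI rather than the t distribution with Rubin's degrees of freedom.

```python
import math
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates (e.g., log ORs) and their squared SEs."""
    m = len(estimates)
    q_bar = mean(estimates)                 # pooled point estimate
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between  # total variance
    se = math.sqrt(total)
    z = NormalDist().inv_cdf(0.975)         # normal approximation (assumption)
    return q_bar, se, (q_bar - z * se, q_bar + z * se)


# Hypothetical log-OR estimates and squared SEs from three imputed datasets.
estimate, se, ci = pool_rubin([0.40, 0.50, 0.45], [0.010, 0.012, 0.011])
print(round(estimate, 3), round(se, 3), tuple(round(x, 3) for x in ci))
```

With 10 imputations, as used here, the factor (1 + 1/m) inflates the between-imputation variance by 10%, reflecting the uncertainty added by a finite number of imputations.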
Bootstrapped elastic net
Elastic net is a regularization method that incorporates the L1 and L2 coefficient norms used in Lasso and ridge regressions (53). In this way, similar to Lasso, elastic net encourages shrunken coefficients and sparsity in the model to reduce overfitting while also encouraging grouped selection of variables (53). For each dataset and corresponding outcome, the optimal elastic net tuning parameter α (α determines the mix of L1 and L2 penalties) was chosen. To assess the stabilities of selected biomarkers, 100 bootstrap samples were generated for each outcome contrast. That is, we used a grid of α from 0.1 to 0.9. For each α value and each bootstrap sample, elastic net selection was performed, and the corresponding misclassification rate was computed. Optimal α was determined on the basis of the smallest average misclassification rate across the penalty strengths of different α values. To calculate the stability of selected biomarkers, we created a “weighted frequency score” for the jth biomarker
where j indexes the jth biomarker; i = 1, …, n indexes the n bootstrap samples; m_i is the misclassification rate of the ith bootstrap sample; σ_m is the SD of the n misclassification rates of the n bootstrap samples; and I_ij is an indicator variable indicating whether the jth biomarker was selected in the ith bootstrap sample. For each outcome, the top 30 peptides with the highest weighted frequency scores were chosen as candidate biomarkers; we refer to them as the stable biomarker set.
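The general shape of a weighted selection-frequency score can be sketched in Python. The weighting formula itself is not reproduced in this text, so the sketch takes the weight as a caller-supplied function of m_i and σ_m; the particular weight used in the usage example, (1 − m_i)/σ_m, is purely an illustrative assumption and should not be read as the authors' formula. All names are hypothetical.

```python
from statistics import stdev


def weighted_frequency_scores(selected, misclass_rates, weight_fn):
    """Sum a per-bootstrap weight over the samples in which each biomarker was selected.

    selected[i] is the set of biomarkers chosen by elastic net in bootstrap
    sample i (the indicator I_ij); misclass_rates[i] is m_i.
    """
    sigma_m = stdev(misclass_rates)
    scores = {}
    for chosen, m_i in zip(selected, misclass_rates):
        weight = weight_fn(m_i, sigma_m)
        for biomarker in chosen:
            scores[biomarker] = scores.get(biomarker, 0.0) + weight
    return scores


def top_k(scores, k=30):
    """Return the k biomarkers with the highest scores (the stable set when k = 30)."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy example with three bootstrap samples and an assumed weight function.
selected = [{"A", "B"}, {"A"}, {"A", "C"}]
misclass_rates = [0.2, 0.3, 0.4]
scores = weighted_frequency_scores(
    selected, misclass_rates, lambda m, sd: (1 - m) / sd  # illustrative assumption
)
print(top_k(scores, k=2))
```

Whatever the exact weight, the design intent described in the text is that biomarkers selected often, and in bootstrap samples with low misclassification, rank highest.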
Backward elimination
Stable biomarker sets were further screened by backward elimination to create the essential sets of predictors. Here, we defined the AUC bias of a model as the difference between the AUC from the original samples and the average AUC of 1000 bootstrapped samples. The AUC bias was computed for the initial model with all stable peptides. A single stable peptide was eliminated consecutively in each backward step with the AUC bias recomputed each time. Given the sample size, the final model was chosen on the basis of the following criteria: (i) AUC bias between bootstrap samples and original samples of <0.02; (ii) number of biomarkers of ≤20; and (iii) a rapid drop in bias after a peptide removal. We defined the peptides remaining in the final selected model as the essential biomarker set.
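The AUC bias used as a stopping criterion can be illustrated with a short Python sketch. It makes a strong simplifying assumption: the predicted scores are held fixed and only resampled, whereas the analysis described above would involve the fitted model for each candidate peptide set. The function names are hypothetical, and the third criterion (a rapid drop in bias) is left to inspection rather than coded.

```python
import random


def auc(scores, labels):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > d) + 0.5 * (c == d) for c in cases for d in controls)
    return wins / (len(cases) * len(controls))


def auc_bias(scores, labels, n_boot=1000, seed=0):
    """AUC on the original sample minus the mean AUC over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(scores)
    boot_aucs = []
    while len(boot_aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both cases and controls
            boot_aucs.append(auc([scores[i] for i in idx], ys))
    return auc(scores, labels) - sum(boot_aucs) / n_boot


def meets_size_and_bias_criteria(bias, n_biomarkers):
    """Criteria (i) and (ii) from the text: bias < 0.02 and at most 20 biomarkers."""
    return abs(bias) < 0.02 and n_biomarkers <= 20


print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```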
Logistic regression models
Logistic regression models for each progression outcome were performed including: (i) baseline characteristics only (covariates); (ii) uCTXII only; (iii) uCTXIα only; (iv) uCTXII and the covariates; (v) essential (or stable) MRM biomarker set only; (vi) essential (or stable) MRM biomarker set and covariates; (vii) essential (or stable) MRM biomarker set and uCTXII; (viii) essential (or stable) MRM biomarker set and uCTXIα; and (ix) essential (or stable) MRM biomarker set, uCTXII, and the covariates. We also performed a sensitivity analysis, (x) with essential MRM biomarkers only excluding individuals with baseline K/L = 3 of the index knee. The baseline characteristics included age, sex, and selected clinical variables. To determine which clinical variables to include, we performed univariate logistic regression on each clinical variable for a given outcome. Clinical variables meeting P < 0.15 from the univariate analyses were selected (details in table S4). ORs and 95% CIs were estimated on the standardized biomarker data. These standardized estimates of ORs are comparable, as they represent the OR for 1 SD increase in a biomarker concentration. AUCs and 95% CIs for each combination of variables for each model outcome were computed; Youden’s J statistic, sensitivity, and specificity were calculated.
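Youden's J statistic, together with the sensitivity and specificity at the cutoff that maximizes it, can be computed directly from predicted probabilities. The following Python sketch is a minimal illustration with hypothetical names; it assumes that a participant is called a progressor when the predicted score is at or above the cutoff and that ties in J are resolved in favor of the lowest cutoff.

```python
def youden_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) at the cutoff maximizing Youden's J."""
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = true_pos / n_cases
        specificity = true_neg / n_controls
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best


# Hypothetical predicted probabilities for two nonprogressors and two progressors.
j, cutoff, sensitivity, specificity = youden_cutoff([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(j, cutoff, sensitivity, specificity)
```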
Ethics approvals
The FNIH cohort is a subsample of the OAI that was conducted with ethics approval of the University of California, San Francisco (the coordinating site) and all sites participating in the OAI study. The BMF cohort data and samples were derived from studies conducted with ethics approval of Duke University. All samples and data were obtained for research purposes, with informed consent of the study participants.
Acknowledgments
We thank G. Waitt and T. Ho for expert technical assistance with the proteomic analyses.
Funding: This work is supported by National Institutes of Health grants R01AR071450, R01AR048769, and P30 AG028716 and by National Institutes of Health contracts funding the osteoarthritis initiative (OAI): N01 AR-2-2258, N01-AR-2-2259, N01-AR-2-2260, N01-AR-2-2261, and N01-AR-2-2262.
Author contributions: Conceptualization: V.B.K., E.J.S., and M.A.M. Methodology: K.Z., Y.-J.L., E.J.S., A.R., V.J., S.S., M.A.M., and V.B.K. Investigation: K.Z., Y.-J.L., E.J.S., A.R., V.J., S.S., M.A.M., and V.B.K. Visualization: K.Z., A.R., and V.J. Supervision: V.B.K., Y.-J.L., and M.A.M. Writing: K.Z. and V.B.K. Writing (review and editing): K.Z., Y.-J.L., E.J.S., A.R., V.J., S.S., M.A.M., and V.B.K.
Competing interests: V.B.K., E.J.S., and M.A.M. are named inventors in a pending patent related to this work. The other authors declare that they have no competing interests.
Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. All proteomic data for this study are available at ftp://massive.ucsd.edu/ or massive.ucsd.edu.
Supplementary Materials
This PDF file includes:
Tables S1 to S6
REFERENCES AND NOTES
- 1.Safiri S., Kolahi A. A., Smith E., Hill C., Bettampadi D., Mansournia M. A., Hoy D., Ashrafi-Asgarabad A., Sepidarkish M., Almasi-Hashiani A., Collins G., Kaufman J., Qorbani M., Moradi-Lakeh M., Woolf A. D., Guillemin F., March L., Cross M., Global, regional and national burden of osteoarthritis 1990-2017: A systematic analysis of the Global Burden of Disease Study 2017. Ann. Rheum. Dis. 79, 819–828 (2020). [DOI] [PubMed] [Google Scholar]
- 2.Bingham C. O. III, Buckland-Wright J. C., Garnero P., Cohen S. B., Dougados M., Adami S., Clauw D. J., Spector T. D., Pelletier J.-P., Raynauld J.-P., Strand V., Simon L. S., Meyer J. M., Cline G. A., Beary J. F., Risedronate decreases biochemical markers of cartilage degradation but does not decrease symptoms or slow radiographic progression in patients with medial compartment osteoarthritis of the knee: Results of the two-year multinational knee osteoarthritis structural arthritis study. Arthritis Rheum. 54, 3494–3507 (2006). [DOI] [PubMed] [Google Scholar]
- 3.Eckstein F., Maschek S., Wirth W., Hudelmaier M., Hitzl W., Wyman B., Nevitt M., Le Graverand M.-P. H.; the OAI Investigator Group , One year change of knee cartilage morphology in the first release of participants from the Osteoarthritis Initiative progression subcohort: Association with sex, body mass index, symptoms and radiographic osteoarthritis status. Ann. Rheum. Dis. 68, 674–679 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Kraus V. B., Feng S., Wang S., White S., Ainslie M., Brett A., Holmes A. C., Charles H. C., Trabecular morphometry by fractal signature analysis is a novel marker of osteoarthritis progression. Arthritis Rheum. 60, 3711–3722 (2009). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.D. Thomas, J. Burns, J. Audette, A. Carroll, C. Dow-Hygelund, M. Hay, “Clinical development success rates 2006–2015” [Biotechnology Innovation Organization (BIO), Biomedtracker and Amplion, 2016; Clinical Development Success Rates 2006–2015 - BIO, Biomedtracker, Amplion 2016.pdf].
- 6. Kraus V., Catterall J., Soderblom E., Moseley M., Suchindran S., Development of a serum biomarker panel highly predictive of knee osteoarthritis progression. Osteoarthr. Cartil. 24 (suppl. 1), 22 (2016).
- 7. Kraus V. B., Burnett B., Coindreau J., Cottrell S., Eyre D., Gendreau M., Gardiner J., Garnero P., Hardin J., Henrotin Y., Heinegård D., Ko A., Lohmander L. S., Matthews G., Menetski J., Moskowitz R., Persiani S., Poole A. R., Rousseau J.-C., Todman M.; OARSI FDA Osteoarthritis Biomarkers Working Group, Application of biomarkers in the development of drugs intended for the treatment of osteoarthritis. Osteoarthr. Cartil. 19, 515–542 (2011).
- 8. Kraus V. B., Collins J. E., Hargrove D., Losina E., Nevitt M., Katz J. N., Wang S. X., Sandell L. J., Hoffmann S. C., Hunter D. J.; OA Biomarkers Consortium, Predictive validity of biochemical biomarkers in knee osteoarthritis: Data from the FNIH OA Biomarkers Consortium. Ann. Rheum. Dis. 76, 186–195 (2017).
- 9. Nockher W. A., Bergmann L., Scherberich J. E., Increased soluble CD14 serum levels and altered CD14 expression of peripheral blood monocytes in HIV-infected patients. Clin. Exp. Immunol. 98, 369–374 (1994).
- 10. Bastick A. N., Belo J. N., Runhaar J., Bierma-Zeinstra S. M. A., What are the prognostic factors for radiographic progression of knee osteoarthritis? A meta-analysis. Clin. Orthop. Relat. Res. 473, 2969–2989 (2015).
- 11. Kraus V. B., Collins J. E., Charles H. C., Pieper C. F., Whitley L., Losina E., Nevitt M., Hoffmann S., Roemer F., Guermazi A., Hunter D. J.; OA Biomarkers Consortium, Predictive validity of radiographic trabecular bone texture in knee osteoarthritis: The Osteoarthritis Research Society International/Foundation for the National Institutes of Health Osteoarthritis Biomarkers Consortium. Arthritis Rheum. 70, 80–87 (2018).
- 12. O’Neill T. W., Felson D. T., Mechanisms of osteoarthritis (OA) pain. Curr. Osteoporos. Rep. 16, 611–616 (2018).
- 13. Fairney A., Straffen A. M., May C., Seifert M. H., Vitamin D metabolites in synovial fluid. Ann. Rheum. Dis. 46, 370–374 (1987).
- 14. Kraus V. B., Biomarkers as drug development tools: Discovery, validation, qualification and use. Nat. Rev. Rheumatol. 14, 354–362 (2018).
- 15. Duan A., Ma Z., Liu W., Shen K., Zhou H., Wang S., Kong R., Shao Y., Chen Y., Guo W., Liu F., 1,25-Dihydroxyvitamin D inhibits osteoarthritis by modulating interaction between vitamin D receptor and NLRP3 in macrophages. J. Inflamm. Res. 14, 6523–6542 (2021). [Retracted]
- 16. Kong C., Wang C., Shi Y., Yan L., Xu J., Qi W., Active vitamin D activates chondrocyte autophagy to reduce osteoarthritis via mediating the AMPK-mTOR signaling pathway. Biochem. Cell Biol. 98, 434–442 (2020).
- 17. Joseph G. B., McCulloch C. E., Nevitt M. C., Neumann J., Lynch J. A., Lane N. E., Link T. M., Associations between vitamins C and D intake and cartilage composition and knee joint morphology over 4 years: Data from the osteoarthritis initiative. Arthritis Care Res. 72, 1239–1247 (2020).
- 18. Vasconcellos C. A., Lind S. E., Coordinated inhibition of actin-induced platelet aggregation by plasma gelsolin and vitamin D-binding protein. Blood 82, 3648–3657 (1993).
- 19. Bouillon R., Schuit F., Antonio L., Rastinejad F., Vitamin D binding protein: A historic overview. Front. Endocrinol. 10, 910 (2020).
- 20. Dimeloe S., Hawrylowicz C., A direct role for vitamin D-binding protein in the pathogenesis of COPD? Thorax 66, 189–190 (2011).
- 21. Steck E., Bräun J., Pelttari K., Kadel S., Kalbacher H., Richter W., Chondrocyte secreted CRTAC1: A glycosylated extracellular matrix molecule of human articular cartilage. Matrix Biol. 26, 30–41 (2007).
- 22. Ge X., Ritter S. Y., Tsang K., Shi R., Takei K., Aliprantis A. O., Sex-specific protection of osteoarthritis by deleting cartilage acid protein 1. PLOS ONE 11, e0159157 (2016).
- 23. Letsiou S., Félix R. C., Cardoso J. C. R., Anjos L., Mestre A. L., Gomes H. L., Power D. M., Cartilage acidic protein 1 promotes increased cell viability, cell proliferation and energy metabolism in primary human dermal fibroblasts. Biochimie 171-172, 72–78 (2020).
- 24. Styrkarsdottir U., Lund S. H., Saevarsdottir S., Magnusson M. I., Gunnarsdottir K., Norddahl G. L., Frigge M. L., Ivarsdottir E. V., Bjornsdottir G., Holm H., Thorgeirsson G., Rafnar T., Jonsdottir I., Ingvarsson T., Jonsson H., Sulem P., Thorsteinsdottir U., Gudbjartsson D., Stefansson K., The CRTAC1 protein in plasma is associated with osteoarthritis and predicts progression to joint replacement: A large-scale proteomics scan in Iceland. Arthritis Rheumatol. 73, 2025–2034 (2021).
- 25. Szilagyi I. A., Vallerga C. L., Boer C. G., Schiphof D., Ikram M. A., Bierma-Zeinstra S. M. A., van Meurs J. B. J., Plasma proteomics identifies CRTAC1 as a biomarker for osteoarthritis severity and progression. Rheumatology (Oxford), keac415 (2022).
- 26. Tardif G., Paré F., Gotti C., Roux-Dalvai F., Droit A., Zhai G., Sun G., Fahmi H., Pelletier J.-P., Martel-Pelletier J., Mass spectrometry-based proteomics identify novel serum osteoarthritis biomarkers. Arthritis Res. Ther. 24, 120 (2022).
- 27. The UniProt Consortium, UniProt: The universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
- 28. Weiss V., Fauser C., Engel J., Functional model of subcomponent C1 of human complement. J. Mol. Biol. 189, 573–581 (1986).
- 29. Gröbner R., Kapferer-Seebacher I., Amberger A., Redolfi R., Dalonneau F., Björck E., Milnes D., Bally I., Rossi V., Thielens N., Stoiber H., Gaboriaud C., Zschocke J., C1R mutations trigger constitutive complement 1 activation in periodontal Ehlers-Danlos syndrome. Front. Immunol. 10, 2537 (2019).
- 30. Sim R. B., Reboul A., Arlaud G. J., Villiers C. L., Colomb M. G., Interaction of 125I-labelled complement subcomponents C-1r and C-1s with protease inhibitors in plasma. FEBS Lett. 97, 111–115 (1979).
- 31. A. Bulla, S. Jupe, C1-Inh binds and inactivates C1r, C1s (R-HSA-9021306) (Reactome, 2017); https://reactome.org/content/detail/R-HSA-9021306 [accessed 15 January 2022].
- 32. Laurell A. B., Mårtensson U., Sjöholm A. G., Trimer and tetramer complexes containing C1 esterase inhibitor, C1r and C1s, in serum and synovial fluid of patients with rheumatic disease. J. Immunol. Methods 129, 55–61 (1990).
- 33. Zhou Y., Wang T., Hamilton J. L., Chen D., Wnt/β-catenin signaling in osteoarthritis and in other forms of arthritis. Curr. Rheumatol. Rep. 19, 53 (2017).
- 34. Lu J., Kishore U., C1 complex: An adaptable proteolytic module for complement and non-complement functions. Front. Immunol. 8, 592 (2017).
- 35. Steinberg J., Ritchie G. R. S., Roumeliotis T. I., Jayasuriya R. L., Clark M. J., Brooks R. A., Binch A. L. A., Shah K. M., Coyle R., Pardo M., Le Maitre C. L., Ramos Y. F. M., Nelissen R., Meulenbelt I., McCaskie A. W., Choudhary J. S., Wilkinson J. M., Zeggini E., Integrative epigenomics, transcriptomics and proteomics of patient chondrocytes reveal genes and pathways involved in osteoarthritis. Sci. Rep. 7, 8935 (2017).
- 36. Jackson S. P., Darbousset R., Schoenwaelder S. M., Thromboinflammation: Challenges of therapeutically targeting coagulation and other host defense mechanisms. Blood 133, 906–918 (2019).
- 37. Furmaniak-Kazmierczak E., Cooke T. D., Manuel R., Scudamore A., Hoogendorn H., Giles A. R., Nesheim M., Studies of thrombin-induced proteoglycan release in the degradation of human and bovine cartilage. J. Clin. Invest. 94, 472–480 (1994).
- 38. Chou P.-Y., Su C.-M., Huang C.-Y., Tang C.-H., The characteristics of thrombin in osteoarthritic pathogenesis and treatment. Biomed. Res. Int. 2014, 407518 (2014).
- 39. Zhou K., Li Y.-J., Soderblom E., Reed A., Sun S., Moseley M., Kraus V., Qualification of proteomic biomarkers for knee osteoarthritis progression. Osteoarthr. Cartil. 29, S7–S8 (2021).
- 40. Collins J. E., Losina E., Nevitt M. C., Roemer F. W., Guermazi A., Lynch J. A., Katz J. N., Kent Kwoh C., Kraus V. B., Hunter D. J., Semiquantitative imaging biomarkers of knee osteoarthritis progression: Data from the Foundation for the National Institutes of Health osteoarthritis biomarkers consortium. Arthritis Rheumatol. 68, 2422–2431 (2016).
- 41. Kellgren J. H., Lawrence J. S., Radiological assessment of osteo-arthrosis. Ann. Rheum. Dis. 16, 494–502 (1957).
- 42. Bruyere O., Richy F., Reginster J.-Y., Three year joint space narrowing predicts long term incidence of knee surgery in patients with osteoarthritis: An eight year prospective follow up study. Ann. Rheum. Dis. 64, 1727–1730 (2005).
- 43. Ornetti P., Brandt K., Hellio-Le Graverand M.-P., Hochberg M., Hunter D. J., Kloppenburg M., Lane N., Maillefert J.-F., Mazzuca S. A., Spector T., Utard-Wlerick G., Vignon E., Dougados M., OARSI-OMERACT definition of relevant radiological progression in hip/knee osteoarthritis. Osteoarthr. Cartil. 17, 856–863 (2009).
- 44. Angst F., Aeschlimann A., Michel B. A., Stucki G., Minimal clinically important rehabilitation effects in patients with osteoarthritis of the lower extremities. J. Rheumatol. 29, 131–138 (2002).
- 45. Altman R. D., Gold G. E., Atlas of individual radiographic features in osteoarthritis, revised. Osteoarthr. Cartil. 15, A1–A56 (2007).
- 46. Ratzlaff C., Ashbeck E. L., Guermazi A., Roemer F. W., Duryea J., Kwoh C. K., A quantitative metric for knee osteoarthritis: Reference values of joint space loss. Osteoarthr. Cartil. 26, 1215–1224 (2018).
- 47. Chou C.-H., Jain V., Gibson J., Attarian D. E., Haraden C. A., Yohn C. B., Laberge R.-M., Gregory S., Kraus V. B., Synovial cell cross-talk with cartilage plays a major role in the pathogenesis of osteoarthritis. Sci. Rep. 10, 10868 (2020).
- 48. Wright M., Ziegler A., ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Soft. 77, 1–17 (2017).
- 49. Rubin D., Multiple imputation after 18+ years. J. Am. Stat. Assoc. 91, 473–489 (1996).
- 50. Moons K. G., Donders R. A., Stijnen T., Harrell F. E. Jr., Using the outcome for imputation of missing predictor values was preferred. J. Clin. Epidemiol. 59, 1092–1101 (2006).
- 51. D. Rubin, Multiple Imputation for Nonresponse in Surveys (John Wiley & Sons, 1987).
- 52. Schomaker M., Heumann C., Bootstrap inference when using multiple imputation. Stat. Med. 37, 2252–2266 (2018).
- 53. Zou H., Hastie T., Regularization and variable selection via the elastic net. J. R. Stat. Soc. Series B Stat. Methodol. 67, 301–320 (2005).
Associated Data
Supplementary Materials
Tables S1 to S6





