TABLE 2.
Data Set | Paired Samples, n | TX: subAR (% subAR prevalence) | Probability Threshold | % Negative (spared biopsy) | NPV | True Negative | False Negative | % Positive (pickup subAR) | PPV | True Positive | False Positive |
---|---|---|---|---|---|---|---|---|---|---|---|
Discovery Set | 530 | 400:130 (24.5) | 0.375 | 74.7 | 88 | 349 | 42 | 25.3 | 61 | 83 | 51 |
Validation set 1 | 138 | 96:42 (30.4) | 0.375 | 71.7 | 78 | 77 | 22 | 28.3 | 51 | 20 | 19 |
Validation set 2 | 129/138 | 93:36 (27.9) | 0.375 | 72.1 | 80 | 74 | 19 | 27.9 | 47 | 17 | 19 |
We tested the locked model classifiers at the defined threshold (0.375) first on 138 subjects from the Northwestern University (NU) biorepository (validation set 1) who had undergone surveillance biopsies (subclinical acute rejection [subAR] 42 [30.4%]: transplant excellent [TX] 96). Performance metrics consisted of a negative predictive value (NPV) of 78% and a positive predictive value (PPV) of 51%. We then tested the same locked model/threshold on a subset of 129/138 participants (subAR 36 [27.9%]: TX 93) who met the strict study CTOT–08 criteria for the clinical phenotype definitions of subAR and TX (validation set 2); performance metrics consisted of an NPV of 80% and a PPV of 47%. The biomarker test results were interpreted dichotomously as “positive” (ie, correlating with a clinical phenotype of subAR) if the probability exceeded the 0.375 threshold and “negative” (ie, correlating with TX) if it was <0.375. To translate the performance of the biomarker into a narrative more relevant to clinical application, we sought to calculate our ability to diagnose the presence or absence of subAR in any given sample, taking into consideration the prevalence of both subAR and TX compared with the frequency of a correct positive versus negative biomarker test result. Accordingly, we made a negative call (no subAR) in 72% to 75% of the patients (NPV 78%–88%) versus a positive call (subAR) in 25% to 28% of the patients (PPV 47%–61%).
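The percentages reported for the two validation sets can be recomputed from the true/false negative and positive counts in the table. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the `Counts` class and `summarize` function are hypothetical names introduced for illustration, and only the validation-set rows are used as inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one data set (hypothetical helper type)."""
    tn: int  # true negatives: test negative, biopsy TX
    fn: int  # false negatives: test negative, biopsy subAR
    tp: int  # true positives: test positive, biopsy subAR
    fp: int  # false positives: test positive, biopsy TX


def summarize(c: Counts) -> dict:
    """Return call rates and predictive values as percentages."""
    total = c.tn + c.fn + c.tp + c.fp
    negative_calls = c.tn + c.fn
    positive_calls = c.tp + c.fp
    return {
        "n": total,
        "pct_negative": 100 * negative_calls / total,  # % spared biopsy
        "npv": 100 * c.tn / negative_calls,
        "pct_positive": 100 * positive_calls / total,  # % flagged as subAR
        "ppv": 100 * c.tp / positive_calls,
    }


# Counts copied from the validation-set rows of the table.
validation_set_1 = Counts(tn=77, fn=22, tp=20, fp=19)
validation_set_2 = Counts(tn=74, fn=19, tp=17, fp=19)

for name, counts in [("Validation set 1", validation_set_1),
                     ("Validation set 2", validation_set_2)]:
    s = summarize(counts)
    print(f"{name}: n={s['n']}, "
          f"negative {s['pct_negative']:.1f}% (NPV {s['npv']:.0f}%), "
          f"positive {s['pct_positive']:.1f}% (PPV {s['ppv']:.0f}%)")
```

Under these inputs, the NPV is the share of negative calls that were TX on biopsy and the PPV is the share of positive calls that were subAR on biopsy, so both depend on the subAR prevalence in each set as well as on the classifier itself.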