eLife. 2021 Oct 26;10:e68758. doi: 10.7554/eLife.68758

Figure 2. Diagnostic performance of lung, prostate, bladder, and breast cancer detection based on infrared molecular fingerprints (IMFs) of blood sera.

Receiver operating characteristic (ROC) curves for the binary classification of the test set with support vector machine (SVM) models trained on water-corrected and vector-normalized IMFs. The different cancer entities were tested against (a) non-symptomatic references, (b) mixed references that also include organ-specific symptomatic references, and (c) organ-specific symptomatic references only. Detailed cohort characteristics can be found in Figure 2—source data 1. (d) Area under the receiver operating characteristic curve (AUC) for the test sets according to different spectral pre-processing of the IMFs. The error bars show the standard deviation of the individual results of the cross-validation (LuCa: lung cancer; PrCa: prostate cancer; BrCa: breast cancer; BlCa: bladder cancer; NSR: non-symptomatic references; MR: mixed references; SR: symptomatic references).
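
The quantities named in this caption (vector normalization, AUC, and the standard deviation across cross-validation results) can be illustrated with a minimal sketch. This is not the authors' pipeline: the SVM itself is omitted, the function names are hypothetical, vector normalization is assumed to mean scaling to unit Euclidean norm, and the sample standard deviation is assumed for the error bars.

```python
import math
import statistics


def vector_normalize(spectrum):
    """Scale a spectrum to unit Euclidean (L2) norm (assumed meaning of 'vector-normalized')."""
    norm = math.sqrt(sum(a * a for a in spectrum))
    if norm == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [a / norm for a in spectrum]


def roc_auc(labels, scores):
    """AUC as the fraction of (case, reference) pairs in which the case scores higher.

    labels: 1 for a cancer case, 0 for a reference individual.
    scores: classifier outputs, higher meaning 'more likely a case'.
    Tied pairs count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    refs = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not refs:
        raise ValueError("need at least one case and one reference")
    wins = sum(
        1.0 if c > r else 0.5 if c == r else 0.0
        for c in cases
        for r in refs
    )
    return wins / (len(cases) * len(refs))


def summarize_folds(fold_aucs):
    """Return (mean, sample standard deviation) of per-fold AUC values."""
    return statistics.mean(fold_aucs), statistics.stdev(fold_aucs)


# Toy values for illustration only; they are not data from the study.
unit_spectrum = vector_normalize([3.0, 4.0])
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
mean_auc, sd_auc = summarize_folds([0.8, 0.9, 1.0])
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and keeps the sketch free of third-party dependencies; a real analysis would normally rely on an established library implementation.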

Figure 2—source data 1. Characteristics of the matched groups of individuals utilized for the analysis as presented in Table 1, Figures 2 and 3a-c.
Figure 2—source data 2. Zipped folder with trained machine learning models and application instructions.
Figure 2—source data 3. Potential impact of clinical site on classification performance.

Figure 2—figure supplement 1. Unsupervised comparison between data from the three clinical sites as well as quality control (QC) analysis of measurements.

(a–a′′′) Principal component analysis (PCA) of samples from non-symptomatic healthy individuals collected at three different clinical sites. Plots depict the first five principal components, which together account for 95% of the explained variance. The three groups are statistically matched in terms of age, gender, and body mass index (BMI). Cohort characteristics are given in Figure 2—figure supplement 1—source data 1. (b) PCA plot of biological samples and QCs. The first two principal components, shown in the plot, together account for 93% of the explained variance. (b′, b′′) Loading vectors for the two principal components shown in (b).
Figure 2—figure supplement 1—source data 1. Characteristics of the matched groups utilized for the analysis presented in Figure 2—figure supplement 1.
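
Statements such as "the first five principal components account for 95% of the explained variance" refer to a cumulative share of variance. The sketch below shows that arithmetic only; it assumes the per-component variances (PCA eigenvalues) are already available, the function names are hypothetical, and the numbers are toy values rather than data from the study.

```python
from itertools import accumulate


def cumulative_explained_variance(component_variances):
    """Cumulative fraction of total variance captured by the leading components.

    component_variances: per-component variances sorted in descending order.
    """
    total = sum(component_variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [running / total for running in accumulate(component_variances)]


def components_needed(component_variances, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    for count, share in enumerate(cumulative_explained_variance(component_variances), start=1):
        if share >= threshold:
            return count
    raise ValueError("threshold is not reached by the given components")


# Toy eigenvalues for illustration only.
toy_variances = [6.0, 3.0, 1.0]
toy_shares = cumulative_explained_variance(toy_variances)
toy_count = components_needed(toy_variances, 0.85)
```

With real spectra, the variances would come from a PCA fit on the pre-processed fingerprints, and the same cumulative sum yields the percentages quoted in the caption.
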
Figure 2—figure supplement 2. Performance comparison of serum- and plasma-based fingerprints for cancer detection.

Receiver operating characteristic (ROC) curves for (a) lung cancer (LuCa) and (b) prostate cancer (PrCa) vs. mixed references (MR). Differential fingerprints (a′, b′) for the same comparisons as above. The characteristics of the cohort used for this analysis are given in Figure 2—figure supplement 2—source data 1.
Figure 2—figure supplement 2—source data 1. Characteristics of the matched groups utilized for the analysis presented in Figure 2—figure supplement 2.