Logistic regression-based plasma lipid discovery for breast cancer detection. (A) Overview of the sample set (n = 256) used in plasma lipid discovery. Plasma samples are subsets of EV cohorts 2 and 3 and represent three morphologically distinct types of breast cancer or healthy controls. (B) Pearson correlation of the concentrations of the 23-lipid panel between matched EV and plasma samples. The sample with the highest correlation coefficient (r) is shown. (C) Pearson correlation coefficients for the 23-lipid panel between EVs and plasma, calculated for each of the 256 samples and plotted in rank order. (D,E) ROC curves for the internal validation of the logistic regression model using the 23-lipid panel to predict the presence of breast cancer from 256 plasma samples. (D) Combined predictions for all samples (n = 256). (E) Predictions for DCIS, IDC, and ILC shown separately. (F) Model prediction outputs from (D,E). (G) Confusion matrix of model predictions from the combined data in (F). The optimized threshold in (F,G) was 0.45.
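
For readers who want to reproduce the kind of summaries described in panels (B,C) and (G), the following is a minimal sketch, not the authors' analysis code. It computes a per-sample Pearson coefficient between two lipid concentration vectors and tallies a confusion matrix at the 0.45 threshold. The function names `pearson_r` and `confusion_counts`, the toy values, and the convention that a probability equal to or above the threshold counts as a positive call are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def confusion_counts(probs, labels, threshold=0.45):
    """Count TP/FP/TN/FN; labels are 1 (cancer) or 0 (control).

    Assumption: a sample is called positive when prob >= threshold.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for prob, label in zip(probs, labels):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            counts["tp"] += 1
        elif predicted == 1 and label == 0:
            counts["fp"] += 1
        elif predicted == 0 and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Toy values (hypothetical, not data from the study).
ev = [1.0, 2.0, 3.0, 4.0]          # lipid concentrations in EVs, one sample
plasma = [2.0, 4.1, 5.9, 8.2]      # matched plasma concentrations
r = pearson_r(ev, plasma)

probs = [0.9, 0.5, 0.44, 0.1, 0.46, 0.3]   # model outputs
labels = [1, 1, 1, 0, 0, 0]                # true classes
counts = confusion_counts(probs, labels)
print(round(r, 3), counts)
```

Applied to real data, `pearson_r` would be called once per sample on the 23 lipid concentrations measured in EVs and in plasma, and the resulting coefficients sorted to obtain a ranked plot like the one in (C).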