2019 Jan 24;2:100012. doi: 10.1016/j.wnsx.2019.100012

Figure 4.

Receiver operating characteristic (ROC) curves for raw (left) and feature-selected (right) data. Lines are color-coded according to the figure legend in the bottom right corner of the graph. The dashed diagonal line is y = x, which has an area under the curve of 0.5 (indicative of random guessing). The performance increase with feature selection is visible as the greater distance between every curve and the line y = x. The column chart at the lower right shows the scaled 95% confidence intervals of the area under the curve calculated for each machine learning method. For the raw data, the artificial neural network and logistic regression methods were not statistically different from 0.5. The support vector machine and decision tree methods were better than random guessing; however, their statistical significance was weak. Performance of the algorithms on the raw dataset can be compared with the chart for the feature-selected dataset in the lower right corner. In that chart, the 95% confidence intervals for all machine learning methods are not only further from 0.5 but also narrower, indicating greater significance. ANN, artificial neural network; AUC, area under the curve; C.I., confidence interval; DT, decision tree; LR, logistic regression; ROC, receiver operating characteristic; SVM, support vector machine.
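
The comparison the caption describes (an AUC, its 95% confidence interval, and whether that interval excludes 0.5) can be illustrated with a minimal Python sketch. The caption does not state how the intervals in the figure were computed, so the percentile bootstrap used here is an assumption, and the function names `auc`, `bootstrap_auc_ci`, and `better_than_chance` are hypothetical helpers introduced only for illustration.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is the Mann-Whitney form of the
    area under the ROC curve; 0.5 corresponds to random guessing.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the AUC.

    Assumption: the source figure may use a different interval method.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample is missing one class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def better_than_chance(ci):
    """True when the whole interval lies above 0.5."""
    return ci[0] > 0.5
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75, because three of the four positive-negative pairs are ranked correctly. An interval that is both narrow and far above 0.5, as the caption reports for the feature-selected data, makes `better_than_chance` return `True`; an interval that straddles 0.5 does not.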