Prediction performance evaluation parameters
|
Root Mean Squared Error (RMSE) |
RMSE is the square root of the mean of the squared differences between predicted and actual values over all samples |
Vihinen, 2012, Parikh et al., 2008a, Parikh et al., 2008b, Goffinet and Wallach, 1989
|
Root Relative Squared Error (RRSE) |
RRSE is a normalized RMSE that enables comparison between datasets or models with different scales. The RMSE is normalized by the standard deviation of the actual values |
Accuracy |
The accuracy of a test is its ability to differentiate the cases and controls correctly |
Precision/Positive Prediction Value |
The precision of a test is the proportion of predicted cases that are true cases |
Sensitivity/Recall/True Positive Rate |
The sensitivity of a test is its ability to determine the cases (positive for disease) correctly |
Specificity/True negative Rate |
The specificity of a test is its ability to identify the healthy controls (negative for disease) correctly |
F1-score |
The F1-score of a test is the harmonic mean of its precision and recall |
Matthews Correlation Coefficient (MCC) |
The MCC of a test is a correlation coefficient between the true and predicted binary classifications |
Chicco and Jurman, 2020, Matthews, 1975
|
ROC curve |
A receiver operating characteristic (ROC) curve is a graph in which each point represents a sensitivity/specificity pair corresponding to a particular decision threshold. The area under the ROC curve is a measure of how well a parameter can distinguish between cases and controls. ROC curves should be used when there are roughly equal numbers of instances for each class |
Fawcett, 2006, Davis and Goadrich, 2006
|
Precision-Recall Curve |
A precision-recall (PR) curve is a graph in which each point represents a precision/sensitivity pair corresponding to a particular decision threshold. PR curves should be used when there is moderate to high class imbalance |
Buckland and Gey, 1994
|
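The prediction metrics defined above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, labels are assumed to be coded as 1 (case) and 0 (control), RRSE is assumed to be normalized by the spread of the actual values around their mean, and zero denominators (for example, no predicted cases) are not handled.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean squared difference between predictions and actual values."""
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rrse(actual, predicted):
    """Squared error relative to always predicting the mean of the actual values.

    Assumption: equivalent to RMSE divided by the population standard
    deviation of the actual values.
    """
    mean_actual = sum(actual) / len(actual)
    error = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    spread = sum((a - mean_actual) ** 2 for a in actual)
    return math.sqrt(error / spread)


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics from the 2x2 confusion matrix (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases called cases
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls called cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls called controls
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases called controls

    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "mcc": (tp * tn - fp * fn)
        / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Illustrative (made-up) data: 4 cases and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Sweeping a decision threshold over continuous scores and recomputing sensitivity with specificity, or precision with recall, at each threshold would yield the points of the ROC and PR curves, respectively.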
Clustering performance evaluation parameters
|
Dunn’s Index |
Dunn’s index is the ratio of the minimum distance between two clusters to the size of the largest cluster. The larger the index, the better the clustering |
Dunn, 1974, Dalton, Ballarin and Brun, 2009
|
Silhouette Index |
The Silhouette Index of a cluster is defined as the average Silhouette width of its points. The Silhouette width of a given point measures its proximity to its own cluster relative to its proximity to other clusters |
Rousseeuw, 1987, Dalton, Ballarin and Brun, 2009
|
Figure of Merit (FOM) Index |
The FOM of a feature gene is computed by clustering the samples after removing that feature and measuring the average distance between all samples and their clusters’ centroids. The FOM of a clustering technique is the sum of the FOMs obtained by removing each feature gene in turn |
Smith and Snyder, 1979, Dalton, Ballarin and Brun, 2009
|
Instability Index |
The instability index is the disagreement between the labels assigned to the data points from different parts of a dataset, averaged over repeated random partitions of the data points. The clustering method is applied to one part of the dataset, and the labels obtained on that part are used to train a classifier that partitions the whole space |
Guruprasad, Reddy and Pandit, 1990, Dalton, Ballarin and Brun, 2009
|
Hubert’s Correlation, Rand Statistic, Jaccard Coefficient, Fowlkes and Mallows Index |
All these measures analyse the relationship between pairs of points using the co-occurrence matrices for the expected partition and the one generated by the clustering algorithm |
Dalton, Ballarin and Brun, 2009, Brun et al., 2007
|
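Three of the clustering measures above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, distances are Euclidean, the distance between two clusters is taken as the closest pair of points across them, cluster size is taken as the diameter (the farthest pair of points within a cluster), and singleton clusters receive a Silhouette width of 0 by convention. Hubert’s correlation, the FOM and the instability index are not covered here.

```python
import math
from itertools import combinations


def dunn_index(points, labels):
    """Minimum between-cluster distance divided by the largest cluster diameter."""
    min_between = math.inf
    max_within = 0.0
    for (p, lp), (q, lq) in combinations(list(zip(points, labels)), 2):
        d = math.dist(p, q)
        if lp == lq:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    return min_between / max_within


def silhouette_widths(points, labels):
    """Silhouette width of every point: (b - a) / max(a, b)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    widths = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            widths.append(0.0)  # convention assumed for singleton clusters
            continue
        # a: mean distance to the other points of the same cluster
        # (the zero distance from p to itself does not change the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for k, other in clusters.items()
            if k != l
        )
        widths.append((b - a) / max(a, b))
    return widths


def pair_counting_indices(expected, found):
    """Rand, Jaccard and Fowlkes-Mallows indices from co-occurrence of point pairs."""
    both = only_expected = only_found = neither = 0
    for i, j in combinations(range(len(expected)), 2):
        same_e = expected[i] == expected[j]
        same_f = found[i] == found[j]
        if same_e and same_f:
            both += 1
        elif same_e:
            only_expected += 1
        elif same_f:
            only_found += 1
        else:
            neither += 1
    total = both + only_expected + only_found + neither
    return {
        "rand": (both + neither) / total,
        "jaccard": both / (both + only_expected + only_found),
        "fowlkes_mallows": both
        / math.sqrt((both + only_expected) * (both + only_found)),
    }


# Illustrative (made-up) data: two tight, well-separated clusters.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
labels = [0, 0, 1, 1]
dunn = dunn_index(points, labels)
widths = silhouette_widths(points, labels)
agreement = pair_counting_indices([0, 0, 1, 1], [0, 0, 1, 2])
```

Averaging the entries of `widths` that belong to one cluster gives the Silhouette Index of that cluster as defined above.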