TABLE 5.
Evaluation Parameters for analysis of microarray gene expression data.
Evaluation metric | Specifics | References |
---|---|---|
Prediction performance evaluation parameters | ||
Root Mean Squared Error (RMSE) | RMSE is the square root of the mean of the squared differences between the predicted and actual values across samples | Vihinen, 2012, Parikh et al., 2008a, Parikh et al., 2008b, Goffinet and Wallach, 1989 |
Root Relative Squared Error (RRSE) | RRSE is a normalized RMSE that enables comparison between datasets or models with different scales. The standard deviation of the actual values is used for normalization | |
Accuracy | The accuracy of a test is the proportion of all samples (cases and controls) that it classifies correctly | |
Precision/Positive Predictive Value | The precision of a test is the proportion of predicted cases that are true cases | |
Sensitivity/Recall/True Positive Rate | The sensitivity of a test is its ability to determine the cases (positive for disease) correctly | |
Specificity/True Negative Rate | The specificity of a test is its ability to determine the controls (negative for disease) correctly | |
F1-score | The F1-score of a test is the harmonic mean of its precision and recall | |
MCC | The Matthews correlation coefficient (MCC) of a test is a correlation coefficient between the true and predicted class labels | Chicco and Jurman, 2020, Matthews, 1975 |
ROC curve | ROC curve is a graph where each point on a curve represents a sensitivity/specificity pair corresponding to a particular decision threshold. Area Under the ROC curve is a measure of how well a parameter can distinguish between cases and controls. ROC curves should be used when there are roughly equal numbers of instances for each class | Fawcett, 2006, Davis and Goadrich, 2006 |
Precision-Recall Curve | A precision-recall (PR) curve is a graph where each point on a curve represents a precision/sensitivity pair corresponding to a particular threshold. PR curves should be used when there is moderate to high class imbalance | Buckland and Gey, 1994 |
Clustering performance evaluation parameters | ||
Dunn’s Index | Dunn’s index is the ratio of the minimum distance between two clusters to the size of the largest cluster. The larger the index, the better the clustering | Dunn, 1974, Dalton, Ballarin and Brun, 2009 |
Silhouette Index | The Silhouette Index of a cluster is defined as the average Silhouette width of its points. The Silhouette width of a given point measures its proximity to its own cluster relative to its proximity to other clusters | Rousseeuw, 1987, Dalton, Ballarin and Brun, 2009 |
Figure of Merit Index | The Figure of Merit (FOM) of a feature gene is computed by clustering the samples after removing that feature and measuring the average distance between all samples and their clusters’ centroids. The FOM for a clustering technique is the sum of the FOMs obtained by removing each feature gene, one at a time | Smith and Snyder, 1979, Dalton, Ballarin and Brun, 2009 |
Instability Index | The instability index is the disagreement between the labels assigned to data points when clustering is based on different parts of a dataset, averaged over repeated random partitions of the data points. The clustering method is applied to one part of the dataset, and the labels obtained on that part are used to train a classifier that partitions the whole space | Guruprasad, Reddy and Pandit, 1990, Dalton, Ballarin and Brun, 2009 |
Hubert’s Correlation, Rand Statistic, Jaccard Coefficient, Fowlkes and Mallows Index | All these measures analyse the relationship between pairs of points using the co-occurrence matrices for the expected partition and the one generated by the clustering algorithm | Dalton, Ballarin and Brun, 2009, Brun et al., 2007 |
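
The prediction performance parameters in Table 5 can be illustrated with a minimal Python sketch. It is not taken from any of the cited works: the function names (`rmse`, `rrse`, `classification_metrics`), the 1 = case / 0 = control label coding, the toy data, and the convention of returning 0.0 when a denominator is zero are all assumptions made for illustration.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)

def rrse(actual, predicted):
    """Root relative squared error.

    Equivalent to RMSE divided by the (population) standard deviation
    of the actual values, so it is comparable across scales.
    """
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean_a) ** 2 for a in actual)
    return math.sqrt(num / den)

def _ratio(num, den):
    # Assumed convention: report 0.0 when the denominator is zero.
    return num / den if den else 0.0

def classification_metrics(y_true, y_pred):
    """Metrics from binary labels, assuming 1 = case and 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # sensitivity / true positive rate
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "f1": _ratio(2 * precision * recall, precision + recall),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }

# Toy example: 4 cases and 6 controls, with one error in each class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

On this toy data the sketch gives an accuracy of 0.8, precision, recall and F1-score of 0.75, a specificity of 5/6, and an MCC of 14/24 (about 0.58). In practice an established library would normally be preferred over hand-written metrics.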