2025 Feb 11;16:1544. doi: 10.1038/s41467-025-56618-y

Table 2.

Evaluation metrics used and their interpretations

| Category | Metric name | Full metric name | Explanation |
|---|---|---|---|
| Gene expression prediction/Model generalisability | PCC | Pearson Correlation Coefficient | Measures the linear relationship between predicted and observed gene expression. Values lie between −1 and 1, where 1 indicates a perfect positive correlation and −1 a perfect negative correlation. |
| Gene expression prediction/Model generalisability | MI | Mutual Information | Measures the amount of information shared between predicted and observed gene expression, capturing their statistical dependence. Higher values indicate stronger dependence and similarity between the variables. |
| Gene expression prediction/Model generalisability | JS-Div | Jensen–Shannon divergence | Quantifies the dissimilarity between the predicted and true gene expression probability distributions. It ranges from 0 to 1 (when computed with base-2 logarithms), where 0 indicates identical distributions and 1 indicates complete dissimilarity. A lower JS-Div indicates better agreement between the distributions. |
| Gene expression prediction/Model generalisability | NRMSE | Normalised Root Mean Squared Error | The RMSE (Root Mean Squared Error) between predicted and observed gene expression values, normalised by the range of the observed values. The normalisation allows comparison across different datasets or scales. A lower NRMSE indicates better prediction performance. |
| Gene expression prediction/Model generalisability | SSIM | Structural Similarity Index | Evaluates the structural similarity of spatial patterns between predicted and observed gene expression by treating each spot as a ‘pixel’ in the spatial grid. It compares intensities in terms of luminance, contrast and structural information. Higher SSIM values indicate better structural similarity. |
| Gene expression prediction/Model generalisability | AUC | Area Under the Curve | Quantifies how well the predicted gene expression discriminates between the two classes obtained by binarising the observed gene expression values (zero vs. non-zero, or small vs. large). It ranges from 0 to 1: an AUC of 0.5 corresponds to random ranking, and an AUC of 1 means the predicted values perfectly separate the two classes of observed values. |
| Clinical translational impact | C-index | Concordance index | Quantifies the discriminatory power of a predictive survival model by assessing its ability to correctly rank pairs of observations, typically in terms of their survival times or outcome probabilities. A C-index of 0.5 indicates random chance, while a value of 1.0 signifies perfect discrimination. |
| Clinical translational impact | Log-rank p value | Log-rank test p value | Used in survival analysis to assess the difference in survival or event occurrence between two or more groups. A small p value (typically below a predefined significance level, such as 0.05) suggests that the observed differences between survival curves are unlikely to have occurred by chance alone. |
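
As an illustration of how several of the metrics in Table 2 can be computed, the following is a minimal sketch in plain Python. It is not the authors' implementation. The function names, the toy data, the rescaling of expression values to sum to one before computing JS-Div, the zero vs. non-zero binarisation for AUC, and the use of Harrell's pairwise form of the C-index are all assumptions made for this example.

```python
import math

def pearson(pred, obs):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mp, mo = sum(pred) / n, sum(obs) / n
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    return cov / (sp * so)

def nrmse(pred, obs):
    """RMSE divided by the range (max - min) of the observed values."""
    rmse = math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))
    return rmse / (max(obs) - min(obs))

def js_divergence(pred, obs):
    """Jensen-Shannon divergence with base-2 logs, so the result lies in [0, 1].

    Assumption: inputs are non-negative and are rescaled to sum to 1.
    """
    p = [v / sum(pred) for v in pred]
    q = [v / sum(obs) for v in obs]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def auc_zero_vs_nonzero(pred, obs):
    """AUC for separating observed non-zero from observed zero values.

    Computed as the fraction of (non-zero, zero) pairs in which the non-zero
    observation has the higher prediction; ties count as one half.
    """
    pos = [p for p, o in zip(pred, obs) if o > 0]
    neg = [p for p, o in zip(pred, obs) if o == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def c_index(times, events, risk):
    """Harrell-style concordance index (assumed form).

    A pair (i, j) is comparable when subject i had an event (events[i] == 1)
    and a shorter time than subject j. It is concordant when subject i also
    has the higher predicted risk; tied risks count as one half.
    """
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy values for one gene across five spots (hypothetical data).
obs = [0.0, 1.0, 2.0, 0.0, 4.0]
pred = [0.1, 0.9, 2.2, 0.3, 3.5]
print("PCC   ", pearson(pred, obs))
print("NRMSE ", nrmse(pred, obs))
print("JS-Div", js_divergence(pred, obs))
print("AUC   ", auc_zero_vs_nonzero(pred, obs))
```

In practice, library routines are the more robust choice for tied values, censoring and large inputs. The explicit versions here are only meant to make the definitions in the table concrete.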