Table 4. Performance measures of the extraction algorithm when applied to categorical characteristics. Precision is the number of correctly classified reports divided by the total number of reports the algorithm assigned to the class. Recall is the number of correctly classified reports divided by the number of true (i.e., human-classified) reports in that class. The f-score is the harmonic mean of precision and recall. For multiclass characteristics, precision, recall, and f-score are averaged over classes (macro average). Overall accuracy is the number of correctly classified reports (n) divided by the total number of reports evaluated (N).
| Group | Descriptor | Macro Precision (%) | Macro Recall (%) | Macro f-score (%) | Overall Accuracy, % (n/N) |
|---|---|---|---|---|---|
| Complementary descriptors | Laterality | 66.2 | 50.0 | 52.9 | 64.3 (27/42) |
| | Behavior | 57.1 | 92.7 | 58.6 | 85.7 (36/42) |
| | Grade | 70.3 | 64.8 | 79.6 | 76.2 (32/42) |
| | Method of Assessment for Solid Tumors | 78.6 | 94.8 | 78.4 | 85.7 (36/42) |
| | Method of Assessment for Hematological Tumors | 100 | 100 | 100 | 100 (42/42) |
| | Diagnostic Procedure | 95.0 | 83.7 | 87.2 | 90.5 (38/42) |
| | Lymphovascular Invasion | 82.5 | 91.2 | 83.9 | 85.7 (36/42) |
| | Surgical Margins | 94.4 | 77.2 | 82.8 | 90.5 (38/42) |
| | Pulmonary Metastasis | 100 | 100 | 100 | 100 (42/42) |
| | Osseous Metastasis | 92.8 | 50.0 | 96.3 | 92.9 (39/42) |
| | Hepatic Metastasis | 75.0 | 66.7 | 83.3 | 97.6 (41/42) |
| | Brain Metastasis | 50.0 | 50.0 | 100 | 97.6 (41/42) |
| | Distant Lymph Nodes Metastasis | 50.0 | 97.6 | 98.8 | 97.6 (41/42) |
| | Other Metastasis | 98.8 | 75.0 | 82.7 | 97.6 (41/42) |
| Special descriptors | Examined Regional Nodes | 92.3 | 100 | 96.0 | 41.7 (5/12) |
| | Positive Regional Nodes | 92.3 | 100 | 96.0 | 58.3 (7/12) |
| | Tumor Size | 85.7 | 75.0 | 80.0 | 50.0 (6/12) |
| | TNM-based Staging | 100 | 75.0 | 85.7 | 100 (3/3) |
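
The measures defined in the caption of Table 4 can be illustrated with a minimal Python sketch. It is not the evaluation code used for the study: the function name `macro_metrics`, the toy labels, and the convention of scoring a class as 0 when its denominator is empty are all assumptions made for illustration.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return (macro precision, macro recall, macro f-score, accuracy) as fractions.

    y_true holds the human-assigned class of each report and y_pred the class
    assigned by the algorithm. Hypothetical helper, for illustration only.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one report is required")

    classes = set(y_true) | set(y_pred)
    # Reports where the algorithm and the human reviewer agree, per class.
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    assigned = Counter(y_pred)  # reports assigned to each class by the algorithm
    actual = Counter(y_true)    # reports truly in each class (human-classified)

    precisions, recalls, fscores = [], [], []
    for c in classes:
        # Assumption: an empty denominator yields 0 rather than an error.
        p = correct[c] / assigned[c] if assigned[c] else 0.0
        r = correct[c] / actual[c] if actual[c] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0  # harmonic mean of p and r
        precisions.append(p)
        recalls.append(r)
        fscores.append(f)

    k = len(classes)
    accuracy = sum(correct.values()) / len(y_true)  # n / N
    return sum(precisions) / k, sum(recalls) / k, sum(fscores) / k, accuracy


# Toy example with made-up labels (not data from the study).
truth = ["left", "left", "left", "right"]
predicted = ["left", "left", "right", "right"]
mp, mr, mf, acc = macro_metrics(truth, predicted)
print(f"precision={mp:.3f} recall={mr:.3f} f-score={mf:.3f} accuracy={acc:.3f}")
```

For this toy input, macro precision is 0.75, macro recall is about 0.833, macro f-score is about 0.733, and accuracy is 0.75 (3/4). Because the f-score here is computed per class and then averaged, it generally differs from the harmonic mean of the macro precision and macro recall.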