Table 7.
Mean testing performance.
| Class | Precision | Recall | F1-score | Accuracy | N (segments) in test set |
|---|---|---|---|---|---|
| Text features only | | | | | |
| Hotspots | 0.443 | 0.675 | 0.530 | | 4 |
| Non-hotspots | 0.652 | 0.435 | 0.469 | | 5 |
| Weighted average/Total (N) | 0.568 | 0.546 | 0.501 | 0.545 | 9 |
| Speech features only | | | | | |
| Hotspots | 0.543 | 0.592 | 0.534 | | 4 |
| Non-hotspots | 0.603 | 0.560 | 0.553 | | 5 |
| Weighted average/Total (N) | 0.586 | 0.565 | 0.543 | 0.566 | 9 |
| Multimodal (text and speech features) | | | | | |
| Hotspots | 0.464 | 0.617 | 0.525 | | 4 |
| Non-hotspots | 0.594 | 0.495 | 0.512 | | 5 |
| Weighted average/Total (N) | 0.543 | 0.556 | 0.522 | 0.555 | 9 |
Note. Per-class and average performance scores for the final models.