Author manuscript; available in PMC: 2024 Jun 13.
Published in final edited form as: Proc SIGCHI Conf Hum Factor Comput Syst. 2024 May 11;2024:448. doi: 10.1145/3613904.3641998

Table 2:

Comparison of accuracy and macro-F1 scores across interpretable models, averaged over providers, for the neutral-high classification. LR, DTC, SVM, RF, and GBDT stand for Logistic Regression, Decision Tree Classifier, Support Vector Machine, Random Forest, and Gradient Boosted Decision Trees, respectively. Despite the small sample size and class imbalance, most models perform better than a macro-F1 threshold of 0.5.

| Social Signal | Metric | LR | DTC | SVM (Linear) | SVM (Radial) | RF | GBDT |
|---|---|---|---|---|---|---|---|
| Provider Dominance | Accuracy | 0.719 | 0.673 | 0.703 | 0.704 | 0.722 | 0.717 |
| | F1 Score | 0.658 | 0.606 | 0.650 | 0.614 | 0.651 | 0.657 |
| Provider Interactiveness | Accuracy | 0.660 | 0.641 | 0.657 | 0.690 | 0.692 | 0.616 |
| | F1 Score | 0.559 | 0.550 | 0.548 | 0.519 | 0.523 | 0.525 |
| Provider Engagement | Accuracy | 0.863 | 0.947 | 0.852 | 0.903 | 0.898 | 0.934 |
| | F1 Score | 0.547 | 0.736 | 0.566 | 0.605 | 0.651 | 0.672 |
| Provider Warmth | Accuracy | 0.556 | 0.543 | 0.581 | 0.581 | 0.571 | 0.565 |
| | F1 Score | 0.515 | 0.501 | 0.544 | 0.460 | 0.521 | 0.515 |
| Patient Dominance | Accuracy | 0.975 | 0.968 | 0.973 | 0.973 | 0.961 | 0.970 |
| | F1 Score | 0.910 | 0.841 | 0.892 | 0.893 | 0.840 | 0.842 |
| Patient Interactiveness | Accuracy | 0.651 | 0.641 | 0.660 | 0.679 | 0.647 | 0.653 |
| | F1 Score | 0.559 | 0.509 | 0.557 | 0.525 | 0.523 | 0.519 |
| Patient Engagement | Accuracy | 0.826 | 0.663 | 0.827 | 0.883 | 0.804 | 0.826 |
| | F1 Score | 0.556 | 0.529 | 0.551 | 0.599 | 0.557 | 0.594 |
| Patient Warmth | Accuracy | 0.517 | 0.544 | 0.551 | 0.634 | 0.610 | 0.605 |
| | F1 Score | 0.537 | 0.562 | 0.568 | 0.604 | 0.614 | 0.603 |
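The caption's choice of a 0.5 macro-F1 threshold reflects that, under class imbalance, accuracy alone can look high even for a degenerate classifier. A minimal sketch, assuming scikit-learn and purely illustrative labels (not the paper's data), of why macro-F1 is the more conservative yardstick:

```python
# Illustrative only: shows how accuracy can flatter a majority-class
# predictor while macro-F1 exposes it. Not the paper's data.
from sklearn.metrics import accuracy_score, f1_score

# Imbalanced binary labels: 9 "neutral" (0) vs 1 "high" (1).
y_true = [0] * 9 + [1]
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

acc = accuracy_score(y_true, y_pred)                  # 0.90: looks strong
macro_f1 = f1_score(y_true, y_pred, average="macro")  # ~0.47: per-class mean
                                                      # penalizes missing "high"
print(f"accuracy={acc:.2f}, macro-F1={macro_f1:.2f}")
```

Because macro averaging weights each class's F1 equally, a model must do better than ignoring the minority class to clear the 0.5 bar that most models in the table exceed.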