Table 1.
Performance of CNN, GAN discriminator, and human raters.
Model or human rater | Subjects | N | ON/OFF accuracy | R² |
---|---|---|---|---|
Best CNN | Study 1 Dev Set | 10 | 100% | 0.56 |
Best CNN | Study 2 Test Set | 9 | 78% | 0.61 |
Best GAN | Study 1 Dev Set | 10 | 100% | 0.61 |
Best GAN | Study 2 Test Set | 9 | 100% | 0.55 |
In-Person Clinician Rater (Ground Truth) | Study 1 Dev Set | 10 | 100% | N/A |
In-Person Clinician Rater (Ground Truth) | Study 2 Test Set | 9 | 78% | N/A |
In-Person Clinician Rater (Ground Truth) | Partial Study 1 | 25 | 64% | N/A |
Video Rater 1 | Partial Study 1 | 25 | 68% | 0.45 |
Video Rater 2 | Partial Study 1 | 25 | 48% | 0.37 |
Video Rater Average | Partial Study 1 | 25 | 58% | 0.41 |