Table 2. Performance of the CNN, Cardiologist Diagnosis, and MUSE Diagnosis<sup>a</sup> on 38 Diagnostic Classes Compared Against the Committee Consensus Diagnosis (N = 328).
| Diagnostic class | Frequency | CNN AUC (95% CI) | CNN F1 score<sup>b</sup> | Cardiologist clinical F1 score | MUSE F1 score | Cardiologist-fixed specificity<sup>c</sup> | CNN sensitivity | Cardiologist clinical sensitivity |
|---|---|---|---|---|---|---|---|---|
| Rhythm | ||||||||
| Sinus | 228 | 0.856 (0.810-0.888) | 0.849 | 0.818 | 0.784 | 0.940 | 0.750 | 0.711 |
| Atrial fibrillation | 30 | 0.987 (0.974-0.996) | 0.847 | 0.881 | 0.833 | 0.990 | 0.800 | 0.867 |
| Atrial flutter | 18 | 0.970 (0.939-0.995) | 0.750 | 0.645 | 0.333 | 0.990 | 0.667 | 0.556 |
| Ectopic atrial rhythm<sup>d</sup> | 10 | 0.988 (0.974-0.999) | 0.750 | 0.560 | 0.444 | 0.975 | 0.900 | 0.700 |
| Atrial tachycardia<sup>d</sup> | 13 | 0.920 (0.853-0.957) | 0.400 | 0.333 | 0.133 | 0.978 | 0.231 | 0.308 |
| Ventricular tachycardia<sup>d</sup> | 9 | 0.997 (0.992-1.000) | 0.842 | 0.615 | 0.000 | 1.000 | 0.556 | 0.444 |
| Junctional rhythm<sup>d</sup> | 9 | 0.967 (0.925-0.991) | 0.526 | 0.727 | 0.300 | 0.984 | 0.556 | 0.889 |
| Supraventricular tachycardia<sup>d</sup> | 8 | 0.987 (0.970-0.998) | 0.696 | 0.632 | 0.714 | 0.984 | 0.500 | 0.750 |
| Bigeminy<sup>d</sup> | 10 | 0.999 (0.997-1.000) | 0.952 | 0.857 | 0.737 | 0.994 | 1.000 | 0.900 |
| Premature ventricular complex | 32 | 0.942 (0.894-0.985) | 0.786 | 0.800 | 0.712 | 0.976 | 0.719 | 0.812 |
| Premature atrial complex | 28 | 0.974 (0.954-0.990) | 0.759 | 0.692 | 0.556 | 0.980 | 0.679 | 0.643 |
| Ventricular paced | 13 | 0.983 (0.955-1.000) | 0.917 | 0.750 | 0.762 | 0.994 | 0.846 | 0.692 |
| Atrial paced<sup>e</sup> | 14 | 0.984 (0.956-1.000) | 0.846 | 0.800 | 0.800 | 0.997 | 0.786 | 0.714 |
| Rhythm diagnosis average<sup>f</sup> | NA | 0.909 | 0.812 | 0.773 | 0.690 | 0.961 | 0.728 | 0.709 |
| Conduction | ||||||||
| AV block | ||||||||
| 1st Degree | 47 | 0.939 (0.906-0.962) | 0.679 | 0.560 | 0.521 | 0.975 | 0.553 | 0.447 |
| 2nd Degree Mobitz 1<sup>d</sup> | 9 | 0.999 (0.996-1.000) | 0.941 | 0.900 | 0.625 | 0.994 | 0.889 | 1.000 |
| Branch block | ||||||||
| Left bundle | 13 | 0.955 (0.871-1.000) | 0.870 | 0.720 | 0.692 | 0.990 | 0.769 | 0.692 |
| Right bundle | 42 | 0.994 (0.986-0.999) | 0.941 | 0.833 | 0.800 | 0.951 | 1.000 | 0.952 |
| Left fascicular block | ||||||||
| Anterior | 35 | 0.973 (0.956-0.989) | 0.756 | 0.621 | 0.491 | 0.983 | 0.629 | 0.514 |
| Posterior<sup>d</sup> | 23 | 0.971 (0.948-0.988) | 0.656 | 0.529 | 0.529 | 0.993 | 0.435 | 0.391 |
| Bifascicular block<sup>e</sup> | 21 | 0.988 (0.977-0.996) | 0.800 | 0.485 | 0.485 | 0.987 | 0.667 | 0.381 |
| Nonspecific intraventricular conduction delay | 21 | 0.866 (0.771-0.939) | 0.514 | 0.308 | 0.293 | 0.961 | 0.476 | 0.286 |
| Axis deviation | ||||||||
| Right | 38 | 0.953 (0.928-0.971) | 0.667 | 0.357 | 0.345 | 0.972 | 0.447 | 0.263 |
| Left | 47 | 0.951 (0.909-0.981) | 0.800 | 0.593 | 0.674 | 0.964 | 0.574 | 0.511 |
| Right superior axis<sup>d</sup> | 5 | 0.951 (0.916-0.977) | 0.303 | 0.125 | 0.125 | 0.969 | 0.200 | 0.200 |
| Prolonged QT | 26 | 0.860 (0.795-0.918) | 0.500 | 0.360 | 0.415 | 0.950 | 0.423 | 0.346 |
| Wolff-Parkinson-White<sup>d</sup> | 8 | 0.992 (0.980-1.000) | 0.800 | 0.842 | 0.706 | 0.991 | 0.750 | 1.000 |
| Conduction diagnosis average<sup>f</sup> | NA | 0.951 | 0.729 | 0.560 | 0.538 | 0.971 | 0.609 | 0.513 |
| Chamber enlargement | ||||||||
| Ventricular hypertrophy | ||||||||
| Left | 24 | 0.975 (0.960-0.988) | 0.700 | 0.644 | 0.645 | 0.947 | 0.875 | 0.792 |
| Right | 11 | 0.982 (0.954-0.998) | 0.706 | 0.348 | 0.364 | 0.975 | 0.818 | 0.364 |
| Atrial enlargement | ||||||||
| Left | 37 | 0.835 (0.760-0.898) | 0.432 | 0.214 | 0.222 | 0.955 | 0.324 | 0.162 |
| Right<sup>e</sup> | 9 | 0.960 (0.878-1.000) | 0.875 | 0.762 | 0.737 | 0.987 | 0.889 | 0.889 |
| Chamber diagnosis average<sup>f</sup> | NA | 0.910 | 0.598 | 0.420 | 0.424 | 0.959 | 0.617 | 0.457 |
| Infarct | ||||||||
| Anterior infarct | 12 | 0.919 (0.861-0.970) | 0.634 | 0.514 | 0.500 | 0.905 | 0.783 | 0.783 |
| Septal infarct | 27 | 0.966 (0.944-0.988) | 0.737 | 0.656 | 0.613 | 0.953 | 0.815 | 0.741 |
| Lateral infarct<sup>e</sup> | 10 | 0.891 (0.766-0.995) | 0.632 | 0.303 | 0.308 | 0.943 | 0.700 | 0.500 |
| Inferior infarct | 27 | 0.962 (0.906-0.987) | 0.746 | 0.708 | 0.622 | 0.950 | 0.889 | 0.852 |
| Posterior infarct<sup>d</sup> | 4 | 0.925 (0.807-1.000) | 0.667 | 0.400 | 0.400 | 0.975 | 0.500 | 0.750 |
| ST elevation<sup>e</sup> | 14 | 0.862 (0.746-0.948) | 0.480 | 0.400 | 0.308 | 0.981 | 0.429 | 0.357 |
| Infarct diagnosis average<sup>f</sup> | NA | 0.934 | 0.674 | 0.566 | 0.514 | 0.950 | 0.749 | 0.696 |
| Other | ||||||||
| Lead misplacement<sup>d</sup> | 12 | 0.999 (0.997-1.000) | 0.917 | 0.833 | 0.250 | 0.994 | 0.917 | 0.833 |
| Low voltage | 4 | 0.985 (0.945-1.000) | 0.750 | 0.222 | 0.222 | 0.938 | 1.000 | 0.750 |
| Other diagnosis average<sup>f</sup> | NA | 0.995 | 0.875 | 0.680 | 0.243 | 0.980 | 0.938 | 0.812 |
Abbreviations: AUC, area under the receiver operating characteristic curve; AV, atrioventricular; CNN, convolutional neural network; MUSE, electrocardiogram interpretation database management system by GE Healthcare; NA, not applicable.
<sup>a</sup> Total diagnosis N = 948.
<sup>b</sup> F1 score is a global metric of algorithm performance complementary to the AUC, which rewards algorithms that maximize positive predictive value and sensitivity simultaneously. It is particularly useful in settings where the frequency of classes is imbalanced. Reported F1 score for the CNN is the maximal F1 score in the consensus committee data set.
<sup>c</sup> Specificity is fixed at the cardiologist clinical diagnosis specificity for each class. Convolutional neural network and cardiologist clinical diagnosis sensitivities are reported at this same fixed specificity for each class. MUSE sensitivity and specificities are fixed and are reported separately in eTable 3 in the Supplement, since MUSE specificity cannot be altered to match the cardiologist clinical diagnosis specificities shown here.
<sup>d</sup> N < 4000 in the sampled training data set for this class.
<sup>e</sup> N < 8000 in the sampled training data set for this class.
<sup>f</sup> Frequency-weighted mean.
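
The three footnoted quantities (maximal F1 score, sensitivity at a fixed specificity, and the frequency-weighted mean) can be illustrated with a minimal Python sketch. The function names and the threshold-sweep approach are assumptions made for illustration; the source does not describe how the authors computed these values, and the score and label arrays used with the threshold functions are hypothetical.

```python
from typing import Sequence


def frequency_weighted_mean(values: Sequence[float], freqs: Sequence[int]) -> float:
    """Mean of per-class values, each weighted by its class frequency."""
    return sum(v * n for v, n in zip(values, freqs)) / sum(freqs)


def f1_at_threshold(scores: Sequence[float], labels: Sequence[int], threshold: float) -> float:
    """F1 score when every score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when there are no true positives.
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def max_f1(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Largest F1 over all candidate thresholds (assumed reading of 'maximal F1')."""
    return max(f1_at_threshold(scores, labels, t) for t in set(scores))


def sensitivity_at_specificity(
    scores: Sequence[float], labels: Sequence[int], target_specificity: float
) -> float:
    """Highest sensitivity among thresholds whose specificity meets the target."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    # An infinite threshold calls nothing positive, so specificity is 1.0 there.
    for t in sorted(set(scores)) + [float("inf")]:
        specificity = sum(1 for s in neg if s < t) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(1 for s in pos if s >= t) / len(pos)
            best = max(best, sensitivity)
    return best


# "Other" category rows from the table: lead misplacement (12) and low voltage (4).
other_cnn_f1 = frequency_weighted_mean([0.917, 0.750], [12, 4])
print(round(other_cnn_f1, 3))
```

Applied to the two "Other" rows, the weighted mean of the CNN F1 scores rounds to 0.875, which agrees with the "Other diagnosis average" row. Averages recomputed from the rounded table entries can differ in the last digit from the published ones, which were presumably computed from unrounded values.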