Fig. 2. Validation and performance of trained neural network for automated curling analysis.
a Confusion matrix plots of CurlNet. Rows represent the predicted class and columns the original class. The percentages of test-set observations that were incorrectly classified are shown in the off-diagonal cells. False discovery rates are shown in the column on the far right, and false negative rates in the row at the bottom of each plot. CurlNet distinguished the Worm class with 98.8% accuracy (top). Objects in the Worm class were classified into the correct categories with 95.1% accuracy (bottom). b Comparison of manual hand-counting (left) versus software analysis (right) for the same set of 30 s videos of vehicle-treated worms showed significant curling detected by both methods. The automated analysis uses the sum of Coiled + Curled + Near-curled values to best approximate the manual scoring criteria. n = 10 videos totaling 99 worms for control, 10 videos totaling 98 worms for bcat-1. c Comparison of automated analysis of 30 s videos (left) versus snapshots (right), captured on the same day from two separate aliquots of worms and analyzed by including all three curling categories, showed that snapshots are sufficient to detect a significant difference between bcat-1(RNAi) and control. n = 5 wells totaling 50 worms for control videos, 7 wells totaling 73 worms for bcat-1 videos, 6 wells totaling 71 worms for control snapshots, and 6 wells totaling 41 worms for bcat-1 snapshots. d Meta-analysis of curling categories across 14 experiments showed that the Coiled posture distinguishes the spasm-like ‘curling’ motor defect of vehicle-treated bcat-1(RNAi) worms with a greater bcat-1 to control mean ratio than the Curled and Near-curled postures. n = 94 wells totaling 780 worms for control, 101 wells totaling 1160 worms for bcat-1. e The difference between control and bcat-1 in the percentage of worms in Coiled postures diminishes with time after transferring animals into liquid buffer. 
The bcat-1 to control mean ratio for each data point is calculated by averaging the curling levels of replicates within one round of snapshots initiated at the reported time. The red dashed line represents a linear fit to the per-round data points across time. n = 89 rounds of snapshots across 17 experiments totaling 3166 worms. f Sunburst plot summarizing program performance in detecting and categorizing 17,000 worms. g Summary of experimental categories for all 220 tested conditions, with the number of conditions in parentheses. Two-tailed t-tests. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. Violin plots represent the probability density of the percentage of worms in the specified posture. Box and violin plots show the minimum, 25th percentile, median, 75th percentile, and maximum. The mean of bcat-1(RNAi) divided by the mean of control RNAi is abbreviated as mean ratio.
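
As an illustration of two quantities defined in this legend, the following is a minimal Python sketch, not the authors' code. It assumes a confusion matrix stored with rows as predicted class and columns as original class (the layout described for panel a) and computes accuracy, per-class false discovery rates, and per-class false negative rates; it also computes the mean ratio as the mean of bcat-1(RNAi) values divided by the mean of control values. The function names and all numbers are hypothetical and chosen only for the example.

```python
from statistics import mean


def classification_rates(matrix):
    """Return (accuracy, fdr, fnr) for a square confusion matrix.

    Assumed layout: matrix[i][j] is the number of objects predicted as
    class i whose original class is j (rows = predicted, columns = original).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    accuracy = correct / total

    fdr = []  # per predicted class: share of that row that is misclassified
    fnr = []  # per original class: share of that column that is missed
    for k in range(n):
        row_sum = sum(matrix[k])
        col_sum = sum(matrix[i][k] for i in range(n))
        fdr.append((row_sum - matrix[k][k]) / row_sum if row_sum else 0.0)
        fnr.append((col_sum - matrix[k][k]) / col_sum if col_sum else 0.0)
    return accuracy, fdr, fnr


def mean_ratio(bcat1_values, control_values):
    """Mean of bcat-1(RNAi) values divided by the mean of control values."""
    return mean(bcat1_values) / mean(control_values)


# Hypothetical two-class counts, for illustration only.
example_matrix = [
    [90, 5],   # predicted class 0: 90 correct, 5 from original class 1
    [10, 95],  # predicted class 1: 10 from original class 0, 95 correct
]
accuracy, fdr, fnr = classification_rates(example_matrix)

# Hypothetical per-well percentages of worms in a given posture.
ratio = mean_ratio([30.0, 50.0], [10.0, 10.0])
```

With this layout, false discovery rates are read along rows and false negative rates along columns, which matches the placement of the far-right column and bottom row described for panel a.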