Fig 6. Comparison of 26 models participating in the 2016/2017 FluSight Challenge under different scoring rules.
Shown are mean scores averaged over 1- through 4-week-ahead forecasts, different geographical levels, weeks, and forecast horizons. Compared scores: negative logarithmic score and multibin logarithmic score, continuous ranked probability score, interval score (α = 0.2), weighted interval score with K = 11. Plots comparing the WIS to CRPS and AE, respectively, also show the diagonal in gray as these 3 scores operate on the same scale. All shown scores are negatively oriented. The models FluOutlook_Mech and FluOutlook_MechAug are highlighted in orange as they rank very differently under different scores.