2019 Feb 8;47(6):e36. doi: 10.1093/nar/gkz061

Table 2.

The ROC AUC and PR AUC performance values for the different experimental set-ups in which the listed dataset is used as the test set

                Gram-negative                                   Gram-positive
Metric   Model  S. typhimurium  E. coli         C. crescentus   M. smegmatis    B. subtilis     S. coelicolor   S. aureus
                MS     SS       MS     SS       MS     SS       MS     SS       MS     SS       MS     SS       MS     SS
ROC AUC  Full   0.983  0.991    0.991  0.995    0.971  0.973    0.930  0.956    0.985  0.993    0.973  0.966    0.983  0.995
ROC AUC  CNN    0.943  0.962    0.969  0.976    0.918  0.946    0.877  0.929    0.956  0.974    0.935  0.949    0.969  0.987
ROC AUC  RNN    0.939  0.980    0.934  0.980    0.923  0.958    0.809  0.854    0.942  0.982    0.907  0.913    0.933  0.965
PR AUC   Full   0.804  0.910    0.860  0.943    0.710  0.842    0.522  0.717    0.796  0.922    0.777  0.863    0.874  0.965
PR AUC   CNN    0.574  0.706    0.640  0.763    0.562  0.730    0.419  0.627    0.639  0.779    0.622  0.760    0.812  0.910
PR AUC   RNN    0.533  0.777    0.531  0.812    0.576  0.781    0.114  0.175    0.508  0.768    0.478  0.637    0.485  0.707
ROC AUC  REP    -      0.916    -      0.916    -      0.838    -      0.821    -      0.933    -      0.838    -      0.944
PR AUC   REP    -      0.735    -      0.799    -      0.344    -      0.285    -      0.889    -      0.272    -      0.910

The performance metrics are given both for the case in which multiple start sites per stop codon are considered possible (MS) and for the case in which each stop codon can have only a single predicted start site (SS). The performance of DeepRibo using only the DNA sequences (CNN) or only the ribo-seq data (RNN) as input highlights the improvement obtained when both features are combined in one model (Full). The performance of REPARATION (REP) is also given. Note that the REPARATION models are both trained and evaluated on the listed dataset using cross-validation.
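
To make the MS and SS settings concrete, the following minimal sketch computes both metrics on a toy set of candidate start sites. Everything in it is an assumption made for illustration: the `Candidate` record, the scores and labels, and the helper names are hypothetical and not taken from the paper. It also assumes that SS is obtained by keeping only the highest-scoring candidate per stop codon before scoring, and it uses average precision as one common estimate of PR AUC; the exact procedures behind the table are not stated here.

```python
from collections import namedtuple

# Hypothetical record: one candidate start site with a model score and a 0/1 label.
Candidate = namedtuple("Candidate", ["stop_codon", "start", "score", "label"])


def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of the precision at the rank of each positive.

    Tied scores are ordered arbitrarily here; the toy data below has no ties.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive example")
    return total / hits


def single_start_site(candidates):
    """Assumed SS filter: keep only the top-scoring candidate for each stop codon."""
    best = {}
    for c in candidates:
        if c.stop_codon not in best or c.score > best[c.stop_codon].score:
            best[c.stop_codon] = c
    return list(best.values())


def evaluate(candidates):
    """Return (ROC AUC, average precision) for a list of candidates."""
    labels = [c.label for c in candidates]
    scores = [c.score for c in candidates]
    return roc_auc(labels, scores), average_precision(labels, scores)


# Invented toy data: three stop codons with two candidate start sites each.
toy = [
    Candidate(100, 10, 0.9, 1),
    Candidate(100, 40, 0.8, 0),
    Candidate(200, 130, 0.4, 0),
    Candidate(200, 160, 0.7, 1),
    Candidate(300, 250, 0.5, 0),
    Candidate(300, 220, 0.3, 0),
]

ms_roc, ms_ap = evaluate(toy)                     # MS: every candidate is scored
ss_roc, ss_ap = evaluate(single_start_site(toy))  # SS: one candidate per stop codon
print(f"MS: ROC AUC={ms_roc:.3f}, AP={ms_ap:.3f}")
print(f"SS: ROC AUC={ss_roc:.3f}, AP={ss_ap:.3f}")
```

On this toy data both metrics rise under the SS filter, because the high-scoring false candidate that shares a stop codon with a true start site is dropped. This only illustrates the mechanism under the stated assumptions and does not reproduce the values in the table.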