Table 2.
Performance metrics of the transformer network trained and tested on the labeled transcription start sites obtained from RegulonDB [29], Etwiller et al. [9] (Cap(pable)-seq), Yan et al. [40] (SMRT-(Cappable-)seq), Ju et al. [18] (SEnd-seq) and the curated set. A model has been trained (rows) and evaluated (columns) on each set of annotations. Both the Area Under the Receiver Operating Characteristic Curve (ROC AUC) and the Area Under the Precision-Recall Curve (PR AUC) are given for each set-up.
Train set (rows) / Test set (columns) | RegulonDB | Cap-seq | SMRT-seq | SEnd-seq | Curated |
---|---|---|---|---|---|
ROC AUC | | | | | |
RegulonDB [29] | **0.882** | 0.815 | 0.923 | 0.882 | 0.885 |
Cap-seq [9] | 0.790 | **0.961** | 0.938 | 0.945 | 0.945 |
SMRT-seq [40] | 0.749 | 0.899 | 0.958 | 0.961 | 0.956 |
SEnd-seq [18] | 0.669 | 0.835 | 0.944 | 0.978 | 0.964 |
Curated | 0.740 | 0.920 | **0.976** | **0.981** | **0.976** |
PR AUC | | | | | |
RegulonDB [29] | 0.030 | 0.026 | 0.053 | 0.057 | 0.064 |
Cap-seq [9] | 0.014 | **0.132** | 0.029 | 0.039 | 0.044 |
SMRT-seq [40] | 0.029 | 0.044 | 0.086 | 0.081 | 0.089 |
SEnd-seq [18] | 0.035 | 0.052 | 0.098 | **0.128** | 0.137 |
Curated | **0.039** | 0.057 | **0.141** | **0.128** | **0.141** |
The best performances on each test set are given in boldface.
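
To make the two reported metrics concrete, the following minimal sketch computes a ROC AUC and a PR AUC estimate for a toy ranking in which positives are rare. It is an illustration only, not the evaluation code behind the table: the function names `roc_auc` and `average_precision` and the toy data are hypothetical, and average precision is assumed here as one common estimator of PR AUC (the estimator used for the table is not stated in this section).

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via the rank-sum (Mann-Whitney U) identity; tied scores get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending by score
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the group of items sharing the same score.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision measured at the rank of each positive.

    Assumption: scores are untied; ties would be broken by input order.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])  # descending by score
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, idx in enumerate(order, start=1):
        if labels[idx] == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


# Toy data (hypothetical): 1000 positions, only 2 positives, ranked 5th and 20th.
n = 1000
scores = [float(n - i) for i in range(n)]  # strictly decreasing scores
labels = [0] * n
labels[4] = 1
labels[19] = 1

print(f"ROC AUC: {roc_auc(labels, scores):.3f}")            # ROC AUC: 0.989
print(f"PR AUC:  {average_precision(labels, scores):.3f}")  # PR AUC:  0.150
```

In this toy case both positives sit in the top 2% of the ranking, so the ROC AUC is close to 1, yet the handful of negatives ranked above them is enough to pull the precision-based score down to 0.15.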