Inter-rater agreement of the RoB-SPEO tool by domain.
Footnotes:
ᵃ Data by systematic review have been anonymised and presented in randomised order. To ensure anonymity, the colour scale used in the rest of the graph was replaced by a scale split into tertiles and colour-coded accordingly (white 0.00-0.33, light blue 0.34-0.66, dark blue 0.67-1.00).
ᵇ Agreement shown for studies where all assessors had carried out a similar number of assessments (≤10 or >10).
ᶜ Domain 1 score missing for one study record, resulting in 105 records from three systematic reviews included in the >10 assessments category; Domain 5 score missing for one study record, resulting in 105 records from three systematic reviews included in the >10 assessments category; Domain 7 score missing for five study records, resulting in 103 records from three systematic reviews included in the >10 assessments category (for the other two records with missing scores, the reviewers did not have concordant experience).
ᵈ Agreement shown separately for studies where all assessors recorded similar assessment times and for studies where discordant times were recorded.
ᵉ Domain 1 score missing for one study record, resulting in 17 records from three systematic reviews included in the ≤25 minutes category; Domain 5 score missing for one study record, resulting in 17 records from three systematic reviews included in the ≤25 minutes category; Domain 7 score missing for five study records, resulting in 17 records from three systematic reviews included in the ≤25 minutes category, 19 records from three systematic reviews included in the 26-66 minutes category, and 59 records from four systematic reviews included in the discordant times category.
ᶠ Domain 1 and Domain 5 scores missing for one study record, resulting in 137 records from four systematic reviews; Domain 7 score missing for five study records, resulting in 133 records from four systematic reviews.