2019 Mar 6;19:48. doi: 10.1186/s12874-019-0688-x

Table 4.

Descriptive characteristics of tools used to assess the quality of a peer review report

Journal or Company Name a First Author, Year Format Quality defined b Overall quality assessment Items (n) Items weights c Scoring range d Scoring system instruction e Scale/Checklist Development f Validity g Reliability h Internal consistency RCTs i
Advances in Nursing Science; Issues in Mental Health Nursing; The Journal of Holistic Nursing Shattell 2010 [33] Scale N Summary Score 6 S 1–10 N NR NR NR NR 0
American Journal of Roentgenology Friedman 1995 [22] Scale N Single Score 1 NA 1–4 N NR NR NR NR 0
American Journal of Roentgenology Kliewer 2005 [49] Scale N Summary Score 4 NA 1–4 N NR NR NR NR 0
American Journal of Roentgenology Rajesh 2013 [32] Scale N Single Score 1 NA 1–4 P NR NR NR NR 0
American Journal of Roentgenology Berquist 2017 [50] Scale N Summary Score 4 NA 0–4 Y NR NR NR NR 0
Annals of Emergency Medicine Callaham 1998 [25] Scale N Single Score 1 NA 1–5 N NR NR Inter-Rater (ICC = 0.44, 0.24, 0.12) l NR 2 m
Annals of Emergency Medicine Callaham 2002 [26, 51] Scale N Summary Score 6 NA 1–5 N NR NR Inter-Rater (ICC = 0.44, 0.24, 0.12) l NR 1
Annals of Emergency Medicine; Annals of Internal Medicine; JAMA; Obstetrics & Gynecology and Ophthalmology Justice 1998 [35] Scale N Summary Score 4 S 1–5 N NR NR NR NR 0
British Journal of General Practice Moore 2014 [29] Scale N Single Score 1 NA A-E Y NR NR NR 0
British Medical Journal Black 1998 (RQI 3.2) [23, 39] Scale N Summary Score; Mean 7 S 1–5 N Y Face (N = 20); Content (N = 20); Construct Test-Retest (Kw = 1.00); Inter-Rater (Kw = 0.83) Internal Consistency (Cronbach’s alpha = 0.84) 5
British Medical Journal Van Rooyen 1999 (RQI 4) [27] Scale N Mean n 8 S 1–5 N NR NR Inter-Rater (Kw = 0.38–0.67) o 2
Chinese Journal of Tuberculosis and Respiratory Diseases Yang 2009 [52] Checklist N NA 5 NA NA N NR NR NR 0
Journal of Clinical Investigation Stossel 1985 [30] Scale N Single Score 1 NA Good-Fair-Poor Y NR NR NR 0
Journal of General Internal Medicine McNutt 1990 [28, 40] Scale N Summary Score 9 S 1–5 N NR Construct NR 1
Journal of Vascular Interventional Radiology Feurer 1994 [41] Scale N Sum 7 D 0–14 N NR Content (N = 2); Preliminary Criterion (N = 2) (Kendall = 0.94) Inter-Rater (ICC = 0.84) 0
NA Review quality collector (RQC) 2012 [53] Scale N Mean 4 User-defined weights 0–100 N NR NR NR 0
Nursing Research Henly 2009 [24] Scale N Mean (CAS, GAS scale); Summary Score (OAS scale); Summary Score (GRQ scale) 15 S 1–5; 1–5; 0–100 P NR NR Inter-Rater (ICC = 0.79) p 0
Nursing Research Henly 2010 [36] Scale N Mean (CAS, GAR, SARNR scale); Summary Score (GRQ scale) 26 S 1–5; 0–100 P NR NR Inter-Rater (ICC = 0.75) p 0
Obstetrics & Gynecology, Dutch Journal of Medicine Landkroon 2006 [42] Scale N Summary Score 5 NA 1–5 Y NR NR Test-Retest (ICC = 0.66–0.88); Inter-Rater (ICC = 0.62) 0
Pakistan Journal of Medical Sciences Jawaid 2006 [34] Scale N NR q 5 S 1–5 N NR NR NR 0
Peerage of Science Peerage Essay Quality (PEQ) 2011 [37] Scale N Mean 3 S 1–5 N NR NR NR 0
Publons Academy Review Rating and Feedback Form 2016 [38] Scale N Sum 4 S 0–3 (Full score: 0–12) N NR NR NR 0
The Journal of Bone and Joint Surgery Thompson 2016 [31] Scale N Single Score 1 NA 80–100 Y NR NR Inter-Rater (ICC = -4.5 to 0.99) r 0
The National Medical Journal of India Das Sinha 1999 [54] Scale N Sum 5 D 0–100 N NR NR NR 0

a) Name of the journal or company/organization where the tool was used to assess the quality of its peer review reports

b) The quality of a peer review report is not clearly defined in any of the reports

c) NA: not applicable; S: same weight for each item; D: different weight for each item

d) NA: not applicable

e) Y: yes, defined; P: partially defined; N: not defined

f, g, h) NR: not reported

i) Number of randomized controlled trials in which the tool was used as an outcome criterion

l) The ICC was 0.44 for reviewers, 0.24 for editors, and 0.12 for manuscripts

m) One article consists of two studies. The first study is not an RCT, while the second is an RCT [55]

n) The overall quality is based on the mean of the first seven items (the item about the tone of the review was not included)

o) Inter-rater reliability was measured with weighted kappa (Kw) for items 1 to 7, based on two editors’ independent assessments

p) The tool includes more than one scale. We reported inter-rater reliability only for the General Review Quality (GRQ) scale

q) Not reported. Although the authors reported that the reviewers were rated as excellent, good, or average based on the quality of their reviews, they did not report how the overall quality of peer review reports was assessed

r) ICC range for 11 manuscripts. Removing one outlier manuscript brought the range to 0.87–0.99
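
To make the "Overall quality assessment" column concrete, the following is a minimal sketch of two of the scoring rules described in the table: the mean of the first seven of eight items scored 1–5 (Van Rooyen 1999, RQI 4, per the footnote on the tone item) and an equally weighted sum of four items scored 0–3 (Publons Academy form, full score 0–12). The function names `rqi4_overall` and `summed_score` and the example ratings are hypothetical and do not come from the cited tools.

```python
from statistics import mean


def rqi4_overall(item_scores):
    """Overall quality as the mean of the first seven of eight items.

    Assumes eight items scored 1-5, with the eighth (tone) item
    excluded from the overall score, as stated in the table footnote.
    """
    if len(item_scores) != 8:
        raise ValueError("expected exactly 8 item scores")
    if any(not 1 <= s <= 5 for s in item_scores):
        raise ValueError("each item must be scored from 1 to 5")
    return mean(item_scores[:7])


def summed_score(item_scores, n_items=4, low=0, high=3):
    """Overall quality as an equally weighted sum of item scores.

    Defaults mirror the table row with 4 items scored 0-3 (full score 0-12).
    """
    if len(item_scores) != n_items:
        raise ValueError(f"expected exactly {n_items} item scores")
    if any(not low <= s <= high for s in item_scores):
        raise ValueError(f"each item must be scored from {low} to {high}")
    return sum(item_scores)


# Hypothetical ratings, for illustration only.
print(rqi4_overall([4, 4, 3, 5, 4, 3, 5, 1]))  # tone item (last) is ignored
print(summed_score([3, 2, 3, 1]))              # 9 out of a possible 12
```

Keeping the range checks explicit makes it harder to mix up tools that share a 1–5 item scale but aggregate differently (mean, sum, or a single global score).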