2015 May 12;47(4):332–340. doi: 10.3109/07853890.2015.1027255

Table II. Methodological characteristics of Benchmarking Controlled Trials (BCTs) in 10 studies published between January 2010 and October 2014 in leading medical journals. Assessment is based solely on each particular paper; if information is not reported, the issue is assessed as unclear. Each characteristic is recorded as yes, partial, unclear, or no; yes indicates that the criterion has been met.

Study characteristics	Coleman et al., Lancet, 8 Jan 2011^a	Pearse et al., Lancet, 22 Sep 2012	Birkmeyer et al., NEJM, 10 Oct 2013	Karthikesalingam et al., Lancet, 15 Mar 2014	Chung et al., Lancet, 12 Apr 2014	Finks et al., NEJM, 2 Jun 2011	Song et al., NEJM, 9 Aug 2011	Wallace et al., NEJM, 31 May 2012^b	Sutton et al., NEJM, 8 Nov 2012^a	Aiken et al., Lancet, 24 May 2014
1. Research question and study design	To produce up-to-date survival estimates for selected cancers, to establish whether international differences (Australia, Canada, Denmark, Norway, Sweden, UK) in survival have changed, and to investigate the causes of survival deficits	Describe mortality rates and patterns of critical care resource use for patients undergoing non-cardiac surgery across several European nations	To assess the effect of surgical skill as a determinant for complication rates after bariatric surgery	To compare the in-hospital mortality of patients with rupture of an abdominal aortic aneurysm in England and USA	To compare crude and casemix-standardized 30-day mortality for acute myocardial infarction between UK and Sweden	To evaluate the extent to which decreases in mortality after esophagectomy, pancreatectomy, lung resection, cystectomy, and abdominal aortic aneurysm repair could be associated with a concentration of surgical care in high-volume hospitals	To assess the effect of the Alternative Quality Contract system on health care spending and on measures of the quality of ambulatory care in 2009	To assess the relationship between night-time intensivist physician staffing and mortality among intensive care patients	To analyze the association of a hospital pay-for-performance program with patient mortality among patients with pneumonia, heart failure, or acute myocardial infarction	To assess whether differences in patient-to-nurse workloads and nurses’ educational qualifications in nine countries with similar patient discharge data are associated with variation in hospital mortality after common surgical procedures
1.1. clinical or system comparison	Clinical	Clinical	Clinical	Clinical	Clinical	System comparison	System comparison	System comparison	System comparison	System comparison
1.2. subcategory of comparison	Whole clinical pathway	Whole clinical pathway	Single intervention	Whole clinical pathway	Whole clinical pathway	Related to how and by whom the services are organized / provided	Related to the reimbursement and incentives	Related to how and by whom the services are organized / provided, and to the resources available for health care	Related to the reimbursement and incentives	Related to the resources available for health care
1.3. conceptually pertinent and clear	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
1.4. natural experiment (allocation to study groups apparently by chance)	No	No	No	No	No	No	No	No	No	No
1.5. operationalized according to the PICO principle (patient; treatment; comparison treatment; outcomes)	No	No	Yes	Yes	Yes	Yes	No	No	No	No
2. Selection of patients/population to the study and measures to increase comparability (all studies have individual patient data)
2.1. population-based cohort, administrative database, or clinical register	Population-based register	Clinical sample	Clinical register	Administrative databases	Clinical register	Administrative databases	Clinical register	Clinical register	Clinical register	Administrative databases
2.2. prospective or retrospective design	Retrospective	Prospective	Prospective	Unclear	Retrospective	Retrospective	Unclear	Retrospective	Unclear	Retrospective
2.3. level of health care provider (e.g. individual, health care center, hospital, district, country)	Country level	Country level	Individual provider	Country level	Country level	Hospital level	Provider organization	Hospital level	Hospital level	Hospital level
2.4. description of patients’ clinical path before eligible for the study	No	No	NA	No	No	No	No	No	No	No
2.5. description of patients’ clinical eligibility criteria	Yes	No	Yes	Yes	Partial	Yes	No	No	No	No
2.6. comprehensive patient population of the catchment area	Yes	Yes	No	Yes	Unclear	Unclear	No	Unclear	Yes	Unclear
2.7. restriction of patients to a particular group in order to increase homogeneity (e.g. first episode ever of ischemic stroke)	No	No	No	No	Partial	No	No	No	No	No
2.8. use of instrumental variables to compensate for lack of randomization	No	No	No	No	No	No	No	No	No	No
3. Validity and completeness of baseline data + Comparability ensured between groups at baseline (e.g. Validity: Yes; Comparability: No → Yes/No)
3.1. diagnostics	Yes/Unclear	No/Unclear	Yes/Unclear	Yes/Yes	Yes/Yes	No/Unclear	No/Unclear	No/Unclear	Yes/Yes	No/Unclear
3.2. other clinically important data relevant to the particular disorder/disease (e.g. severity)	No/Unclear	No/Unclear	Yes/Unclear	Yes/Unclear	Yes/Yes	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear
3.3. general health/risk status	No/Unclear	No/Unclear	No/Unclear	No/Unclear	Yes/Yes	No/Unclear	Yes/Yes	Yes/Yes	No/Unclear	No/Unclear
3.4. co-morbid conditions	No/Unclear	No/Unclear	Yes/Unclear	Yes/Yes	Yes/Yes	Yes/Yes	No/Unclear	Yes/Yes	Yes/Yes	Yes/Yes
3.5. behavioral factors (e.g. health-related lifestyle)	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear
3.6. environmental factors (e.g. work conditions)	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear
3.7. potential inequality (e.g. socio-economic status)	No/Unclear	No/Unclear	No/Unclear	Yes/Yes	No/Unclear	Yes/Yes	No/Unclear	No/Unclear	No/Unclear	No/Unclear
3.8. other potential predictors (e.g. genetic factors), confounders, and effect modifiers	No/Unclear	No/Unclear	No/Unclear	Unclear/Unclear	Yes/Yes	No/Unclear	No/Unclear	No/Unclear	No/Unclear	No/Unclear
4. Validity and completeness of process data (also unrelated to the disorder in question) throughout the clinical pathway
4.1. diagnostics	Yes	No	Yes	Yes	Yes	No	Yes	Yes	No	No
4.2. treatment procedures	No	No	Yes	Yes	Yes	No	Yes	Yes	No	No
4.3. rehabilitation	No	No	NA	NA	Yes	No	NA	NA	No	No
4.4. hospitalizations and health care visits	No	No	Yes	Yes	NA	No	NA	NA	No	No
4.5. individual behavior (e.g. health-related lifestyle)	No	No	No	NA	No	No	NA	NA	No	No
4.6. adherence to treatments	Yes	No	Yes	NA	No	No	NA	NA	No	No
4.7. characteristics of the clinical pathway	No	No	NA	NA	No	No	NA	NA	No	No
5. Validity and completeness of outcome data (related to the disorder in question)
5.1. validity of the outcomes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
5.2. outcomes assessed also among disadvantaged patients	No	No	No	No	No	No	No	No	No	No
5.3. comparability (similarity) of follow-up time points	Yes	Yes	Yes	Yes	Yes	Yes	Yes	No	Yes	Yes
5.4. percentage of dropouts during follow-up documented and acceptable (e.g. <10%)	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
5.5. data at each comparator arm free of suggestion of selective outcome reporting	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
6. Statistical and data issues
6.1. description of power calculations and rationale on how the study size was arrived at or post-analysis power calculation	No	Yes	No	No	No	No	Yes	No	No	No
6.2. documentation of how data were classified and coded (e.g. blinding, multiple raters, inter-rater reliability)	Unclear	Unclear	Yes	Unclear	Yes	Unclear	Unclear	Yes	Unclear	Unclear
6.3. measures to increase reliability of data classification and coding (e.g. blinding, multiple raters, inter-rater reliability)	Yes	Unclear	Yes	Unclear	Partial	Unclear	Unclear	Yes	Unclear	Unclear
6.4. description of all primary statistical methods, including those used to control for confounding	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
6.5. description and use of methods to examine subgroups and interactions	No	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
6.6. description and use of propensity score or other methods to improve comparability at baseline	No	No	No	No	Yes	No	Yes	No	No	No
6.7. adjustment for the characteristic outcomes of each health care provider (e.g. differences in general life expectancy in each country)	Yes	No	NA	No	No	No	No	No	No	No
6.8. incomplete outcome data adequately addressed. If no missing data or appropriate imputation: Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes	Yes
6.9. use of multilevel modeling or survival modeling	Yes	Yes	Yes	Yes	Unclear	Yes	Yes	No	No	Yes
In studies on the health care system as a determinant of outcome:
6.10. diagnostic and treatment processes analyzed as the mediators of the effects	No	No	No	No	No	Yes	No	No	No	No
Studies that also compare cohorts over time, i.e. changes in outcomes between follow-up years (each year's cohort having its own patients). This design raises three additional methodological issues:
7.1. Documentation of changes in patient characteristics over time	No					No
7.2. Documentation of changes in treatment practices over time	No					No
7.3. Documentation of changes in patient outcomes over time	Yes					Yes

NA = not applicable.

^aStudy also comparing cohorts over time: see study characteristics 7.1–7.3.

^bStudy assessing the effect of one factor related to the organization of the system (e.g. presence of night-time intensivist).