| Study type | Methodological appraisal criteria | Yes | No | Comment |
|---|---|---|---|---|
| Screening questions: assessing ‘fatal flaws’ (Dixon-Woods 2005). Aggregative ‘fatal flaws’ based on Stewart et al (2014); configurative ‘fatal flaws’ based on the Pawson (2003) TAPUPAS framework | Aggregative assessment: | | | |
| | Configurative assessment: | | | |

Screening questions are answered on the basis of the abstract and/or a superficial reading of the full text: further appraisal is not feasible or appropriate when the answer is ‘No’ to any of the above screening questions!
Study type 1. Qualitative, e.g. (A) Ethnography (B) Phenomenology (C) Narrative (D) Grounded theory (E) Case study. Each criterion below is answered Yes / No, with a comment / confidence judgment.

| I. RESEARCH IS DEFENSIBLE IN DESIGN (providing a research strategy that addresses the question). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. there is a discussion of the rationale for the study design | | | |
| ii. the research question is clear and suited to qualitative inquiry | | | |
| iii. there are convincing arguments for the different features of the study design | | | |
| iv. limitations of the research design and their implications for the research evidence are discussed | | | |
| Rating: Defensible / Arguable / Critical / Not defensible | | | Worth continuing: |
| II. RESEARCH FEATURES AN APPROPRIATE SAMPLE (following an adequate strategy for the selection of participants). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. there is a description of the study location and how/why it was chosen | | | |
| ii. the researcher has explained how the participants were selected | | | |
| iii. the selected participants were appropriate for collecting rich and relevant data | | | |
| iv. reasons are given why potential participants chose not to take part in the study | | | |
| Rating: Appropriate sample / Functional sample / Critical sample / Flawed sample | | | Worth continuing: |
| III. RESEARCH IS RIGOROUS IN CONDUCT (providing a systematic and transparent account of the research process). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. researchers provide a clear account/description of the process by which data were collected (e.g. for an interview method, is there an indication of how interviews were conducted and of procedures for collecting or recording data?) | | | |
| ii. researchers demonstrate that data collection targeted depth, detail and richness of information (e.g. interview/observation schedule) | | | |
| iii. there is evidence of how descriptive analytical categories, classes, labels, etc. have been generated and used | | | |
| iv. the presentation of data distinguishes clearly between the data, the analytical frame used, and the interpretation | | | |
| v. if methods were modified during the study, the researcher has explained how and why | | | |
| Rating: Rigorous conduct / Considerate conduct / Critical conduct / Flawed conduct | | | Worth continuing: |
| IV. RESEARCH FINDINGS ARE CREDIBLE IN CLAIM / BASED ON DATA (providing well-founded and plausible arguments based on the evidence generated). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. there is a clear description of the form of the original data | | | |
| ii. a sufficient amount of data is presented to support the interpretations and findings/conclusions | | | |
| iii. the researchers explain how the data presented were selected from the original sample to feed into the analysis process (i.e. commentary and cited data relate; there is an analytical context to cited data, not simply repeated description; is there an account of the frequency of presented data?) | | | |
| iv. there is a clear and transparent link between data, interpretation, and findings/conclusions | | | |
| v. there is evidence (of attempts) to give attention to negative cases/outliers, etc. | | | |
| Rating: Credible claims / Arguable claims / Doubtful claims / Not credible | | | If findings are not credible, can the data still be used? |
| V. RESEARCH ATTENDS TO CONTEXTS (describing the contexts and particulars of the study). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. there is an adequate description of the contexts of data sources and how they are retained and portrayed | | | |
| ii. participants' perspectives/observations are placed in personal contexts | | | |
| iii. appropriate consideration is given to how findings relate to the contexts (how findings are influenced by, or influence, the context) | | | |
| iv. the study makes any claims (implicit or explicit) that infer generalisation (if yes, comment on appropriateness) | | | |
| Rating: Context central / Context considered / Context mentioned / No attention to context | | | |
| VI. RESEARCH IS REFLECTIVE (assessing what factors might have shaped the form and output of the research). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| i. appropriate consideration is given to how findings relate to the researchers' influence/own role during analysis and the selection of data for presentation | | | |
| ii. researchers have attempted to validate the credibility of findings (e.g. triangulation, respondent validation, more than one analyst) | | | |
| iii. researchers explain their reaction to critical events that occurred during the study | | | |
| iv. researchers discuss ideological perspectives/values/philosophies and their impact on the methodological or other substantive content of the research (implicit/explicit) | | | |
| Rating: Reflection / Consideration / Acknowledgement / Unreflective research | | | NB: can override a previous exclusion! |
OVERALL DECISION – EXCLUDE / INCLUDE (the study generates new knowledge relevant to the review question and complies with minimum criteria to ensure reliability and empirical grounding of that knowledge)
Sources used in this section (in alphabetical order): Campbell et al (2003); CASP (2006); CRD (2009); Dixon-Woods et al (2004); Dixon-Woods et al (2006), cited in Gough (2012); Greenhalgh & Brown (2014); Harden et al (2004), cited in SCIE & Gough (2012); Harden et al (2009); Harden & Gough (2012); Mays & Pope (1995); Pluye et al (2011); SCIE (2010); Spencer et al (2006); Thomas et al (2003).
Study type 2. Quantitative (non-randomised; randomised controlled). Each criterion below is answered Yes / No, with a comment / risk of bias judgment.

Common non-randomised designs include: (A) non-randomised controlled trials, (B) cohort studies, (C) case-control studies, and (D) cross-sectional analytical studies. Randomised designs: Randomised Controlled Trial (RCT).

The most common ways of controlling for bias due to baseline confounding are matching, which attempts to emulate randomisation; propensity score matching and related methods; stratification, where sub-groups are compared; and regression analysis, where covariates are adjusted for (see the illustrative sketch below).
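These confounding-control strategies all aim to compare like with like at baseline. Purely as an illustration for appraisers, and not as part of the appraisal tool itself, the following minimal Python sketch (simulated data; all variable names are hypothetical) shows why a naive difference in group means overstates an effect when participation depends on a baseline covariate, and how regression adjustment for that covariate recovers an estimate close to the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 'ability' is a baseline confounder that influences both
# programme participation and the test-score outcome.
n = 500
ability = rng.normal(0, 1, n)
treated = (rng.normal(0, 1, n) + ability > 0).astype(float)   # selection on ability
score = 2.0 * treated + 3.0 * ability + rng.normal(0, 1, n)   # true effect = 2.0

# Naive comparison of group means is biased upwards because treated pupils
# already had higher ability at baseline.
naive = score[treated == 1].mean() - score[treated == 0].mean()

# Regression adjustment: include the baseline covariate alongside the
# treatment indicator and read the effect off the treatment coefficient.
X = np.column_stack([np.ones(n), treated, ability])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
adjusted = beta[1]

print(f"naive difference in means: {naive:.2f}")     # inflated by confounding
print(f"covariate-adjusted effect: {adjusted:.2f}")  # close to the true 2.0
```

Matching and propensity score methods pursue the same goal by constructing comparison groups with similar covariate distributions rather than by modelling the outcome directly.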
| I. Selection bias (are participants recruited in a way that minimises selection bias?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. there is a clear description of how and why the sample was chosen | | | |
| ii. the sample size is adequate to allow for representative and/or statistically significant conclusions | | | |
| iii. participants recruited to the control group were sampled from the same population as the treatment group | | | |
| iv. the group allocation process attempted to control for potential risk of bias | | | |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | Worth continuing: |
| II. Bias due to baseline confounding (is confounding potentially controllable in the context of this study?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. the treatment and control groups are comparable at baseline | | | |
| ii. matching was applied and, if so, featured sufficient criteria | | | |
| iii. the authors conducted an appropriate analysis that controlled for all potentially critical confounding domains | | | |
| iv. the authors avoided adjusting for post-intervention variables | | | |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | Worth continuing: |
| IF RANDOMISED CONTROLLED TRIAL, SKIP I + II AND START HERE! Bias due to ineffective randomisation (is allocation of treatment status truly random?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. there is a clear description of the randomisation process | | | |
| ii. the unit of randomisation and the number of participants are clearly stated (pay special attention to the balance of treatment and control locations) | | | |
| iii. eligibility criteria for study entry are specified | | | |
| iv. characteristics of the baseline and endline samples are provided (note 1) | | | Preferable condition; see note 1 |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | If critical risk of bias, treat as a non-randomised study |
| III. Bias due to departures from intended interventions (was the intervention implemented as laid out in the study protocol?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. the critical co-interventions were balanced across intervention groups | | | |
| ii. treatment switches were low enough not to threaten the validity of the estimated effect of the intervention | | | |
| iii. implementation failure was minor and unlikely to threaten the validity of the outcome estimate | | | |
| iv. it is possible that the intervention was taken up by the controls (contamination and possible crossing-over)* | | | *Whilst challenging in terms of estimating impact, spill-overs might be an important finding in themselves (e.g. teachers read to pupils/village/family members) |
| v. it is possible that knowledge of the intervention group affects how the two study groups are treated in the course of follow-up by investigators** | | | **Consider only in extreme cases in which preferential treatment is clearly evident; blinding is in general not expected in social interventions |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | Worth continuing: |
| IV. Bias due to missing data (attrition) (are the intervention groups free of critical differences in participants with missing data?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. outcome data are reasonably complete (80% or above) | | | |
| ii. if ‘no’, are missing data reported? | | | |
| iii. if data are missing: are the proportion of participants with missing data and the reasons for missing data similar across groups? | | | |
| iv. if data are missing: were appropriate statistical methods used to account for missing data (e.g. sensitivity analysis)? (see the sketch below this table) | | | |
| v. if it is not possible to control for missing data, are outcomes with missing data excluded from the analysis? | | | |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | Worth continuing: |
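To illustrate how an appraiser might think about criteria i, iii and iv above, here is a minimal, hypothetical Python sketch (simulated data and illustrative thresholds only; not prescribed by the sources cited for this section): it checks overall completeness against the 80% guide, compares attrition across groups, and bounds the estimated group difference under extreme imputations as a crude sensitivity analysis.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical endline data: NaN marks pupils lost to follow-up, with higher
# attrition in the control group (all names and numbers are illustrative).
treated = np.repeat([1, 0], 200)
score = rng.normal(60 + 5 * treated, 10)
score[rng.random(400) < np.where(treated == 1, 0.10, 0.25)] = np.nan

completeness = np.mean(~np.isnan(score))
print(f"overall completeness: {completeness:.0%} (criterion i flags < 80%)")
for g in (1, 0):
    print(f"missing in group {g}: {np.isnan(score[treated == g]).mean():.0%}")

# Crude sensitivity bounds (criterion iv): replace missing outcomes with the
# most pessimistic / optimistic observed values and see how far the estimated
# difference between groups can move.
def group_diff(x):
    return np.nanmean(x[treated == 1]) - np.nanmean(x[treated == 0])

worst = np.where(np.isnan(score),
                 np.where(treated == 1, np.nanmin(score), np.nanmax(score)), score)
best = np.where(np.isnan(score),
                np.where(treated == 1, np.nanmax(score), np.nanmin(score)), score)
print(f"complete-case difference: {group_diff(score):.1f}")
print(f"bounds under extreme imputations: {group_diff(worst):.1f} to {group_diff(best):.1f}")
```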
| V. Outcome reporting bias (are measurements appropriate, e.g. clear origin, or validity known?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. there was an adequate period of follow-up*** | | | ***In many social science interventions, follow-up is not required to coincide with the start of the treatment; further, longer periods of follow-up are often required to measure changes. In the context of education, the question of retention is of major interest, in particular when dealing with short intervention periods (< 1 month). |
| ii. the outcome measure was clearly defined and objective | | | |
| iii. outcomes were assessed using standardised instruments and indicators | | | |
| iv. outcome measurements reflect what the experiment set out to measure | | | |
| v. the methods of outcome assessment were comparable across experimental groups | | | |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | Worth continuing: |
| VI. Bias in selection of results reported (are the reported outcomes consistent with the outcomes proposed at the protocol stage?). Appraisal indicators: consider whether | Yes | No | Comment / Risk of bias judgment |
|---|---|---|---|
| i. it is unlikely that the reported effect estimate is available primarily because it was a notable finding among numerous exploratory analyses | | | |
| ii. it is unlikely that the reported effect estimate is prone to selective reporting from among multiple outcome measurements within the outcome domain | | | |
| iii. it is unlikely that the reported effect estimate is prone to selective reporting from among multiple analyses of the outcome measurements | | | |
| iv. the analysis includes an intention-to-treat analysis (if so, was this appropriate and were appropriate methods used to account for missing data?)**** (see the sketch below this table) | | | ****Usually in clinical RCTs, rare in social science; only rate if conducted |
| Rating: Low risk of bias / Risk of bias / High risk of bias / Critical risk of bias | | | |
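Criterion iv refers to intention-to-treat analysis. As a purely illustrative sketch (hypothetical data and variable names, not part of the tool), the Python snippet below contrasts an intention-to-treat comparison, which analyses participants as randomised, with an as-treated comparison, which can reintroduce bias when compliance is related to participant characteristics.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical trial: 'assigned' is the randomised allocation; some assigned
# pupils never receive the programme (non-compliance correlated with ability).
n = 1000
assigned = rng.integers(0, 2, n)
ability = rng.normal(0, 1, n)
received = assigned * (ability > -0.5).astype(int)      # weaker pupils drop out
score = 50 + 3 * received + 4 * ability + rng.normal(0, 5, n)

# Intention-to-treat: compare groups as randomised, regardless of receipt.
itt = score[assigned == 1].mean() - score[assigned == 0].mean()

# As-treated comparison: groups defined by actual receipt; no longer protected
# by randomisation, so the estimate absorbs the compliance-related ability gap.
as_treated = score[received == 1].mean() - score[received == 0].mean()

print(f"intention-to-treat estimate: {itt:.1f}")
print(f"as-treated estimate:         {as_treated:.1f}")   # typically inflated here
```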
OVERALL RISK OF BIAS:
Sources used in this section (in weighted order): Cochrane (2014); Stewart et al (2014); Stewart et al (2012); Higgins et al (2011); Greenhalgh & Brown (2014); Pluye et al (2011); Gough et al (2007).
Study type 3. Mixed-methods (note 2). Each criterion below is answered Yes / No, with a comment / confidence judgment. Four mixed-methods designs are distinguished:

- Sequential explanatory design: the quantitative component is followed by the qualitative one. The purpose is to explain quantitative results using qualitative findings, e.g. the quantitative results guide the selection of qualitative data sources and data collection, and the qualitative findings contribute to the interpretation of the quantitative results.
- Sequential exploratory design: the qualitative component is followed by the quantitative one. The purpose is to explore, develop and test an instrument (or taxonomy) or a conceptual framework (or theoretical model), e.g. the qualitative findings inform the quantitative data collection, and the quantitative results allow a generalisation of the qualitative findings.
- Triangulation design: the qualitative and quantitative components are concomitant. The purpose is to examine the same phenomenon by interpreting qualitative and quantitative results (bringing data analysis together at the interpretation stage), by integrating qualitative and quantitative datasets (e.g. data on the same cases), or by transforming data (e.g. the quantization of qualitative data).
- Embedded/convergent design: the qualitative and quantitative components are concomitant. The purpose is to support a qualitative study with a quantitative sub-study (measures), or to better understand a specific issue of a quantitative study using a qualitative sub-study, e.g. the efficacy or the implementation of an intervention based on the views of participants.
| I. RESEARCH INTEGRATION / SYNTHESIS OF METHODS (assessing the value added of the mixed-methods approach). Appraisal indicators: consider whether | Yes | No | Comment / Confidence judgment |
|---|---|---|---|
| Applied mixed-methods design: | | | |
| i. the rationale for integrating qualitative and quantitative methods to answer the research question is explained [DEFENSIBLE] | | | |
| ii. the mixed-methods research design is relevant for addressing the qualitative and quantitative research questions, or the qualitative and quantitative aspects of the mixed-methods research question [DEFENSIBLE] | | | |
| iii. there is evidence that data gathered by both research methods were brought together to inform new findings that answer the mixed-methods research question (e.g. forming a complete picture, synthesising findings, configuration) [CREDIBLE] | | | |
| iv. the approach to data integration is transparent and rigorous in considering all findings from both the qualitative and the quantitative module (danger of cherry-picking) [RIGOROUS] | | | |
| v. appropriate consideration is given to the limitations associated with this integration, e.g. the divergence of qualitative and quantitative data (or results) [REFLEXIVE] | | | |
For mixed-methods research studies, each component first undergoes its own critical appraisal. Since qualitative studies are either included or excluded, no combined risk of bias assessment is made, and the risk of bias assigned to the quantitative component also holds for the mixed-methods research. The above appraisal indicators refer only to the applied mixed-methods design. If this design is not found to comply with each of the four mixed-methods appraisal criteria ([DEFENSIBLE], [CREDIBLE], [RIGOROUS], [REFLEXIVE]), then the quantitative and qualitative components will be included in the review individually:
Mixed-methods critical appraisal:
Qualitative critical appraisal: Include / Exclude
Quantitative critical appraisal:
Combined appraisal: Include / Exclude mixed-methods findings judged with ____________ risk of bias
This section is based on Pluye et al (2011). Further sources consulted (in alphabetical order): Creswell & Clark (2007); Crow (2013); Long (2005); O'Cathain et al (2008); O'Cathain (2010); Pluye & Hong (2014); Sirriyeh et al (2011).
Note 1: Two theoretical exceptions to this rule apply: (i) an RCT with an appropriate randomisation procedure can be included without showing baseline data, as both experimental groups can be assumed to be equal at baseline by design; (ii) a sophisticated quasi-experimental design such as PSM or RDD could, in theory, make the same claim and not require baseline data. In both cases, the advice of an evaluation specialist will be sought, as the researcher does not have the capacity to make an informed judgment in such specialist cases.

Note 2: The mixed-methods critical appraisal is carried out for studies applying an explicit mixed-methods approach. This component is applied in addition to the criteria for the qualitative component (I to VI) and the appropriate criteria for the quantitative component (I to VI).