Bias domain | Signalling questions | Elaboration | Response options |
Bias due to confounding | 1.1 Is there potential for confounding of the effect of intervention in this study? If N or PN to 1.1: the study can be considered to be at low risk of bias due to confounding and no further signalling questions need be considered | In rare situations, such as when studying harms that are very unlikely to be related to factors that influence treatment decisions, no confounding is expected and the study can be considered to be at low risk of bias due to confounding, equivalent to a fully randomised trial. There is no NI (No information) option for this signalling question. | Y / PY / PN / N |
If Y or PY to 1.1: determine whether there is a need to assess time‐varying confounding: | |||
1.2. Was the analysis based on splitting participants’ follow‐up time according to intervention received? If N or PN, answer questions relating to baseline confounding (1.4 to 1.6). If Y or PY, proceed to question 1.3. | If participants could switch between intervention groups then associations between intervention and outcome may be biased by time‐varying confounding. This occurs when prognostic factors influence switches between intended interventions. | NA / Y / PY / PN / N / NI | |
1.3. Were intervention discontinuations or switches likely to be related to factors that are prognostic for the outcome? If N or PN, answer questions relating to baseline confounding (1.4 to 1.6). If Y or PY, answer questions relating to both baseline and time‐varying confounding (1.7 and 1.8). | If intervention switches are unrelated to the outcome, for example when the outcome is an unexpected harm, then time‐varying confounding will not be present and only control for baseline confounding is required. | NA / Y / PY / PN / N / NI | |
Questions relating to baseline confounding only | |||
1.4. Did the authors use an appropriate analysis method that controlled for all the important confounding areas? | Appropriate methods to control for measured confounders include stratification, regression, matching, standardisation, and inverse probability weighting. They may control for individual variables or for the estimated propensity score. Inverse probability weighting is based on a function of the propensity score. Each method depends on the assumption that there is no unmeasured or residual confounding. | NA / Y / PY / PN / N / NI | |
1.5. If Y or PY to 1.4: Were confounding areas that were controlled for measured validly and reliably by the variables available in this study? | Appropriate control of confounding requires that the variables adjusted for are valid and reliable measures of the confounding domains. For some topics, a list of valid and reliable measures of confounding domains will be specified in the review protocol, but for others such a list may not be available. Study authors may cite references to support the use of a particular measure. If authors control for confounding variables with no indication of their validity or reliability, pay attention to the subjectivity of the measure. Subjective measures (e.g. based on self‐report) may have lower validity and reliability than objective measures such as lab findings. | NA / Y / PY / PN / N / NI | |
1.6. Did the authors control for any post‐intervention variables? | Controlling for post‐intervention variables is not appropriate. Controlling for mediating variables estimates the direct effect of intervention and may introduce confounding. Controlling for common effects of intervention and outcome causes bias. | NA / Y / PY / PN / N / NI | |
Questions relating to baseline and time‐varying confounding | |||
1.7. Did the authors use an appropriate analysis method that adjusted for all the important confounding areas and for time‐varying confounding? | Adjustment for time‐varying confounding is necessary to estimate per‐protocol effects in both randomised trials and NRSI. Appropriate methods include those based on inverse‐probability weighting. Standard regression models that include time‐updated confounders may be problematic if time‐varying confounding is present. | NA / Y / PY / PN / N / NI | |
1.8. If Y or PY to 1.7: Were confounding areas that were adjusted for measured validly and reliably by the variables available in this study? | See 1.5 above. | NA / Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ No confounding expected. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ Confounding expected, all known important confounding domains appropriately measured and controlled for; and Reliability and validity of measurement of important domains were sufficient, such that we do not expect serious residual confounding. | |||
Serious ‐ At least one known important domain was not appropriately measured, or not controlled for; or Reliability or validity of measurement of an important domain was low enough that we expect serious residual confounding. | |||
Critical ‐ Confounding inherently not controllable, or the use of negative controls strongly suggests unmeasured confounding. | |||
Optional: What is the predicted direction of bias due to confounding? | Can the true effect estimate be predicted to be greater or less than the estimated effect in the study because one or more of the important confounding domains was not controlled for? Answering this question will be based on expert knowledge and results in other studies and therefore can only be completed after all of the studies in the body of evidence have been reviewed. Consider the potential effect of each of the unmeasured domains and whether all important confounding domains not controlled for in the analysis would be likely to change the estimate in the same direction, or if one important confounding domain that was not controlled for in the analysis is likely to have a dominant impact. | Favours experimental / Favours comparator / Unpredictable | |
Bias in selection of participants into the study | 2.1. Was selection of participants into the study (or into the analysis) based on participant characteristics observed after the start of intervention? | This domain is concerned only with selection into the study based on participant characteristics observed after the start of intervention. Selection based on characteristics observed before the start of intervention can be addressed by controlling for imbalances between intervention and control groups in baseline characteristics that are prognostic for the outcome (baseline confounding). | Y / PY / PN / N / NI |
If N or PN to 2.1: go to 2.4 | |||
2.2. If Y or PY to 2.1: Were the post‐intervention variables that influenced selection likely to be associated with intervention? | Selection bias occurs when selection is related to an effect of either intervention or a cause of intervention and an effect of either the outcome or a cause of the outcome. Therefore, the result is at risk of selection bias if selection into the study is related to both the intervention and the outcome. | NA / Y / PY / PN / N / NI | |
2.3 If Y or PY to 2.2: Were the post‐intervention variables that influenced selection likely to be influenced by the outcome or a cause of the outcome? | NA / Y / PY / PN / N / NI | ||
2.4. Do start of follow‐up and start of intervention coincide for most participants? | If participants are not followed from the start of the intervention then a period of follow‐up has been excluded, and individuals who experienced the outcome soon after intervention will be missing from analyses. This problem may occur when prevalent, rather than new (incident), users of the intervention are included in analyses. | Y / PY / PN / N / NI | |
2.5. If Y or PY to 2.2 and 2.3, or N or PN to 2.4: Were adjustment techniques used that are likely to correct for the presence of selection biases? | It is in principle possible to correct for selection biases, for example by using inverse probability weights to create a pseudo‐population in which the selection bias has been removed, or by modelling the distributions of the missing participants or follow‐up times and outcome events and including them using missing data methodology. However, such methods are rarely used and the answer to this question will usually be “No”. | NA / Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ All participants who would have been eligible for the target trial were included in the study and start of follow‐up and start of intervention coincide for all participants. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ Selection into the study may have been related to intervention and outcome, but the authors used appropriate methods to adjust for the selection bias; or Start of follow‐up and start of intervention do not coincide for all participants, but (a) the proportion of participants for which this was the case was too low to induce important bias; (b) the authors used appropriate methods to adjust for the selection bias; or (c) the review authors are confident that the rate (hazard) ratio for the effect of intervention remains constant over time. | |||
Serious ‐ Selection into the study was related to intervention and outcome; or Start of follow‐up and start of intervention do not coincide, and a potentially important amount of follow‐up time is missing from analyses, and the rate ratio is not constant over time. | |||
Critical ‐ Selection into the study was strongly related to intervention and outcome; or A substantial amount of follow‐up time is likely to be missing from analyses, and the rate ratio is not constant over time. | |||
Optional: What is the predicted direction of bias due to selection of participants into the study? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Bias in classification of interventions | 3.1 Were intervention groups clearly defined? | A pre‐requisite for an appropriate comparison of interventions is that the interventions are well‐defined. Ambiguity in the definition may lead to bias in the classification of participants. For individual‐level interventions, criteria for considering individuals to have received each intervention should be clear and explicit, covering issues such as type, setting, dose, frequency, intensity and/or timing of intervention. For population‐level interventions (e.g. measures to control air pollution), the question relates to whether the population is clearly defined, and the answer is likely to be ‘Yes’. | Y / PY / PN / N / NI |
3.2 Was the information used to define intervention groups recorded at the start of the intervention? | In general, if information about interventions received is available from sources that could not have been affected by subsequent outcomes, then differential misclassification of intervention status is unlikely. Collection of the information at the time of the intervention makes it easier to avoid such misclassification. For population‐level interventions (e.g. measures to control air pollution), the answer to this question is likely to be ‘Yes’. | Y / PY / PN / N / NI | |
3.3 Could classification of intervention status have been affected by knowledge of the outcome or risk of the outcome? | Collection of the information at the time of the intervention may not be sufficient to avoid bias. The way in which the data are collected for the purposes of the NRSI should also avoid misclassification. | Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ Intervention status is well‐defined and based solely on information collected at the time of intervention. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ Intervention status is well‐defined but some aspects of the assignments of intervention status were determined retrospectively. | |||
Serious ‐ Intervention status is not well‐defined, or major aspects of the assignments of intervention status were determined in a way that could have been affected by knowledge of the outcome. | |||
Critical ‐ (Unusual) An extremely high amount of misclassification of intervention status, e.g. because of unusually strong recall biases. | |||
Optional: What is the predicted direction of bias due to measurement of outcomes or interventions? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Bias due to departures from intended interventions | 4.1. Was the intervention implemented successfully for most participants? | Consider the success of implementation of the intervention in the context of its complexity. Was recommended practice followed by those administering the intervention? | Y / PY / PN / N / NI |
If your aim for this study is to assess the effect of initiating and adhering to intervention (as in a per‐protocol analysis), answer questions 4.2 to 4.4 | |||
4.2. Did study participants adhere to the assigned intervention regimen? | Lack of adherence to assigned intervention includes cessation of intervention, cross‐overs to the comparator intervention and switches to another active intervention. We distinguish between analyses where: (1) intervention switches led to follow‐up time being assigned to the new intervention, and (2) intervention switches (including cessation of intervention) where follow‐up time remained allocated to the original intervention. (1) is addressed under time‐varying confounding, and should not be considered further here. Consider available information on the proportion of study participants who continued with their assigned intervention throughout follow‐up. Was lack of adherence sufficient to impact the intervention effect estimate? | NA / Y / PY / PN / N / NI | |
4.3. Were important co‐interventions balanced across intervention groups? | Consider the co‐interventions that are likely to affect the outcome and to have been administered in the context of this study, based on the preliminary consideration of co‐interventions and available literature. Consider whether these co‐interventions are balanced between intervention groups. | NA / Y / PY / PN / N / NI | |
4.4. If N or PN to 4.1, 4.2 or 4.3: Were adjustment techniques used that are likely to correct for these issues? | Such adjustment techniques include inverse‐probability weighting to adjust for censoring at deviation from intended intervention, or inverse probability weighting of marginal structural models to adjust for time‐varying confounding. Specialist advice may be needed to assess studies that used these approaches. | NA / Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ No bias due to deviation from the intended intervention is expected, for example if both the intervention and comparator are implemented over a short time period, and subsequent interventions are part of routine medical care, or if the specified comparison relates to initiation of intervention regardless of whether it is continued. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ Bias due to deviation from the intended intervention is expected, and switches, co‐interventions, and some problems with intervention fidelity are appropriately measured and adjusted for in the analyses. Alternatively, most (but not all) deviations from intended intervention reflect the natural course of events after initiation of intervention. | |||
Serious ‐ Switches in treatment, co‐interventions, or problems with implementation fidelity are apparent and are not adjusted for in the analyses. | |||
Critical ‐ Substantial deviations from the intended intervention are present and are not adjusted for in the analysis. | |||
Optional: What is the predicted direction of bias due to departures from the intended interventions? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Bias due to missing data | 5.1 Were there missing outcome data? | This aims to elicit whether the proportion of missing observations is likely to result in missing information that could substantially impact our ability to answer the question being addressed. Guidance will be needed on what is meant by ‘reasonably complete’. One aspect of this is that review authors would ideally try to locate an analysis plan for the study. | Y / PY / PN / N / NI |
5.2 Were participants excluded due to missing data on intervention status? | Missing intervention status may be a problem. This requires that the intended study sample is clear, which it may not be in practice. | Y / PY / PN / N / NI | |
5.3 Were participants excluded due to missing data on other variables needed for the analysis? | This question relates particularly to participants excluded from the analysis because of missing information on confounders that were controlled for in the analysis. | Y / PY / PN / N / NI | |
5.4 If Y or PY to 5.1, 5.2 or 5.3: Are the proportion of participants and reasons for missing data similar across interventions? | This aims to elicit whether either (i) differential proportion of missing observations or (ii) differences in reasons for missing observations could substantially impact on our ability to answer the question being addressed. | NA / Y / PY / PN / N / NI | |
5.5 If Y or PY to 5.1, 5.2 or 5.3: Were appropriate statistical methods used to account for missing data? | It is important to assess whether assumptions employed in analyses are clear and plausible. Both content knowledge and statistical expertise will often be required for this. For instance, use of a statistical method such as multiple imputation does not guarantee an appropriate answer. Review authors should seek naïve (complete‐case) analyses for comparison, and clear differences between complete‐case and multiple imputation‐based findings should lead to careful assessment of the validity of the methods used. | NA / Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ Data were reasonably complete; or Proportions of and reasons for missing participants were similar across intervention groups; or Analyses that addressed missing data are likely to have removed any risk of bias. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ Proportions of missing participants differ across interventions; or Reasons for missingness differ minimally across interventions; and Missing data were not addressed in the analysis. | |||
Serious ‐ Proportions of missing participants differ substantially across interventions; or Reasons for missingness differ substantially across interventions; and Missing data were addressed inappropriately in the analysis; or The nature of the missing data means that the risk of bias cannot be removed through appropriate analysis. | |||
Critical ‐ (Unusual) There were critical differences between interventions in participants with missing data that were not, or could not, be addressed through appropriate analysis. | |||
Optional: What is the predicted direction of bias due to missing data? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Bias in measurement of outcomes | 6.1 Could the outcome measure have been influenced by knowledge of the intervention received? | Some outcome measures involve negligible assessor judgment, e.g. all‐cause mortality or non‐repeatable automated laboratory assessments. Risk of bias due to measurement of these outcomes would be expected to be low. | Y / PY / PN / N / NI |
6.2 Were outcome assessors aware of the intervention received by study participants? | If outcome assessors were blinded to intervention status, the answer to this question would be ‘No’. In other situations, outcome assessors may be unaware of the interventions being received by participants despite there being no active blinding by the study investigators; the answer to this question would then also be ‘No’. In studies where participants report their outcomes themselves, for example in a questionnaire, the outcome assessor is the study participant. In an observational study, the answer to this question will usually be ‘Yes’ when the participants report their outcomes themselves. | Y / PY / PN / N / NI | |
6.3 Were the methods of outcome assessment comparable across intervention groups? | Comparable assessment methods (i.e. data collection) would involve the same outcome detection methods and thresholds, same time point, same definition, and same measurements. | Y / PY / PN / N / NI | |
6.4 Were any systematic errors in measurement of the outcome related to intervention received? | This question refers to differential misclassification of outcomes. Systematic errors in measuring the outcome, if present, could cause bias if they are related to intervention or to a confounder of the intervention‐outcome relationship. This will usually be due either to outcome assessors being aware of the intervention received or to non‐comparability of outcome assessment methods, but there are examples of differential misclassification arising despite these controls being in place. | Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ The methods of outcome assessment were comparable across intervention groups; and The outcome measure was unlikely to be influenced by knowledge of the intervention received by study participants (i.e. is objective) or the outcome assessors were unaware of the intervention received by study participants; and Any error in measuring the outcome is unrelated to intervention status. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ The methods of outcome assessment were comparable across intervention groups; and The outcome measure is only minimally influenced by knowledge of the intervention received by study participants; and Any error in measuring the outcome is only minimally related to intervention status. | |||
Serious ‐ The methods of outcome assessment were not comparable across intervention groups; or The outcome measure was subjective (i.e. likely to be influenced by knowledge of the intervention received by study participants) and was assessed by outcome assessors aware of the intervention received by study participants; or Error in measuring the outcome was related to intervention status. | |||
Critical ‐ The methods of outcome assessment were so different that they cannot reasonably be compared across intervention groups. | |||
Optional: What is the predicted direction of bias due to measurement of outcomes? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Bias in selection of the reported result | Is the reported effect estimate unlikely to be selected, on the basis of the results, from... | ||
7.1. ... multiple outcome measurements within the outcome domain? | For a specified outcome domain, it is possible to generate multiple effect estimates for different measurements. If multiple measurements were made, but only one or a subset is reported, there is a risk of selective reporting on the basis of results. | Y / PY / PN / N / NI | |
7.2 ... multiple analyses of the intervention‐outcome relationship? | Because of the limitations of using data from non‐randomised studies for analyses of effectiveness (need to control confounding, substantial missing data, etc), analysts may implement different analytic methods to address these limitations. Examples include unadjusted and adjusted models; use of final value vs change from baseline vs analysis of covariance; different transformations of variables; a continuously scaled outcome converted to categorical data with different cut‐points; different sets of co‐variates used for adjustment; and different analytic strategies for dealing with missing data. Application of such methods generates multiple effect estimates for a specific outcome metric. If the analyst does not pre‐specify the methods to be applied, and multiple estimates are generated but only one or a subset is reported, there is a risk of selective reporting on the basis of results. | Y / PY / PN / N / NI | |
7.3 ... different subgroups? | Particularly with large cohorts often available from routine data sources, it is possible to generate multiple effect estimates for different subgroups or simply to omit varying proportions of the original cohort. If multiple estimates are generated but only one or a subset is reported, there is a risk of selective reporting on the basis of results. | Y / PY / PN / N / NI | |
'Risk of bias' judgement | Low ‐ There is clear evidence (usually through examination of a pre‐registered protocol or statistical analysis plan) that all reported results correspond to all intended outcomes, analyses and sub‐cohorts. | Low / Moderate / Serious / Critical / NI | |
Moderate ‐ The outcome measurements and analyses are consistent with an a priori plan; or are clearly defined and both internally and externally consistent; and There is no indication of selection of the reported analysis from among multiple analyses; and There is no indication of selection of the cohort or subgroups for analysis and reporting on the basis of the results. | |||
Serious ‐ Outcome measurements or analyses are internally or externally inconsistent; or There is a high risk of selective reporting from among multiple analyses; or the cohort or subgroup is selected from a larger study for analysis and appears to be reported on the basis of the results. | |||
Critical ‐ There is evidence or strong suspicion of selective reporting of results, and the unreported results are likely to be substantially different from the reported results. | |||
Optional: What is the predicted direction of bias due to selection of the reported result? | If the likely direction of bias can be predicted, it is helpful to state this. The direction might be characterised either as being towards (or away from) the null, or as being in favour of one of the interventions. | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable | |
Overall bias | 'Risk of bias' judgement | Low ‐ The study is judged to be at low risk of bias for all domains. | Low / Moderate / Serious / Critical / NI |
Moderate ‐ The study is judged to be at low or moderate risk of bias for all domains. | |||
Serious ‐ The study is judged to be at serious risk of bias in at least one domain, but not at critical risk of bias in any domain. | |||
Critical ‐ The study is judged to be at critical risk of bias in at least one domain. | |||
No information ‐ There is no clear indication that the study is at serious or critical risk of bias and there is a lack of information in one or more key domains of bias (a judgement is required for this). | |||
Optional: What is the overall predicted direction of bias for this outcome? | Favours experimental / Favours comparator / Towards null / Away from null / Unpredictable |
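
The 'Overall bias' rows above amount to a small decision rule over the seven domain-level judgements. The following is a minimal sketch of that rule in Python, not part of the tool itself. The function name `overall_risk_of_bias`, the domain labels in the example, and the string codes are hypothetical. One assumption goes beyond the table: a known Serious or Critical judgement is taken to outrank NI in another domain. The table states that the 'No information' case requires a judgement, so an NI result here should be read as a flag for reviewer attention rather than a final answer.

```python
from typing import Mapping

# Judgement codes as used in the response-option column of the table.
VALID = {"Low", "Moderate", "Serious", "Critical", "NI"}


def overall_risk_of_bias(domains: Mapping[str, str]) -> str:
    """Combine domain-level judgements into an overall judgement.

    `domains` maps a (hypothetical) domain label to one of the codes in VALID.
    """
    if not domains:
        raise ValueError("at least one domain judgement is required")
    judgements = list(domains.values())
    unknown = set(judgements) - VALID
    if unknown:
        raise ValueError(f"unrecognised judgement(s): {sorted(unknown)}")

    # Critical in at least one domain -> Critical overall.
    if "Critical" in judgements:
        return "Critical"
    # Serious in at least one domain, no Critical -> Serious overall.
    # Assumption: this holds even if another domain is NI.
    if "Serious" in judgements:
        return "Serious"
    # No sign of Serious/Critical but information is lacking somewhere:
    # flag as NI; the table says a reviewer judgement is required here.
    if "NI" in judgements:
        return "NI"
    # Low in every domain -> Low overall.
    if all(j == "Low" for j in judgements):
        return "Low"
    # Otherwise every domain is Low or Moderate -> Moderate overall.
    return "Moderate"


example = {
    "confounding": "Moderate",
    "selection": "Low",
    "classification": "Low",
    "departures": "Low",
    "missing data": "Serious",
    "measurement": "Low",
    "reported result": "Moderate",
}
print(overall_risk_of_bias(example))  # Serious
```

Checking Critical first and Serious second mirrors the "at least one domain" wording of those two rows. The Low and Moderate rows are "all domains" conditions, so they are evaluated last.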