Table 3.
Problem | Explanation and elaboration | Recommendation | Evidence (identified in our scoping review) |
---|---|---|---|
Complex interventions: Studying very complex interventions increases challenges in feasibility, replication, and evaluation of the individual factors that affect the success of de-implementation | To advance the understanding of what works in de-implementation and to make interventions more feasible, simpler interventions should be conducted. A simpler intervention means fewer factors potentially affecting the success of de-implementation, which also makes it easier to separate effective from ineffective factors. When a more complex intervention is conducted, a process evaluation can improve feasibility and help separate the important factors | Prefer simpler intervention designs | 67% of studies had multiple intervention components, which usually leads to higher intervention complexity |
Human intervention deliverer: Generalizability decreases when the “human factor” (personal characteristics of the deliverers) affects the results of de-implementation | A human deliverer of the intervention may introduce confounding characteristics that affect the success of de-implementation. To improve the applicability of the results, studies should aim for a higher number of intervention deliverers. When reporting the results, articles should specify the number and characteristics of the deliverers | Aim for a larger number of intervention deliverers and describe their number and characteristics | 50% of studies tested an intervention with educational sessions using a human intervention deliverer |
Small number of clusters: A small number of clusters decreases the reliability of effect estimates | The intra-cluster correlation coefficient is used to adjust sample sizes for between-cluster heterogeneity in treatment effects. This adjustment is often insufficient in small cluster randomized trials, as they produce imprecise estimates of heterogeneity, which may lead to unreliable effect estimates and false-positive results [21, 22]. The probability of false-positive results increases with higher between-cluster heterogeneity and a smaller number of clusters (especially under 30 clusters) [21, 22]. Analyses may be corrected with small-sample-size correction methods, at the cost of decreased statistical power. If the number of eligible clusters is low, the higher statistical power of an individually randomized trial may outweigh the main benefit of a cluster RCT design, namely avoiding contamination [23] | If the number of eligible clusters is low, consider performing an individually randomized trial. If the number of clusters is small, consider using small-sample-size correction methods to decrease the risk of false-positive results, and take the resulting decrease in statistical power into account when calculating the target sample size | In 145 cluster randomized trials, the median number of clusters was 24 |
Dropouts: Dropouts of participants may lead to unreliable effect estimates | Trials should report dropouts for all intervention participants, including those targeted by the de-implementation intervention and those used as the measurement unit. Trials should distinguish between intervention participants who dropped out completely and those who were replaced by new participants. To minimize dropouts, randomization should occur as close to the intervention as possible | Report dropouts for all intervention participants. Randomize as near to the start of the intervention as possible | Missing data led to a high risk of bias in 60% of studies, of which 76% were due to unreported data |
Heterogeneous study contexts: Diverse contextual factors may affect the outcome | Behavioral processes are usually tied to the “local” context, including the study environment and the characteristics of the participants, and these factors may affect participants’ behavior. Tailoring the intervention facilitates designing it to target the factors potentially important for de-implementation. Examples include assessing barriers to change (and considering them in the intervention design) and involving intervention targets in planning the intervention | Tailor the intervention to the study context | 82% of the studies did not tailor the intervention to the study context |
Heterogeneous mechanisms of action: De-implementation interventions have diverse mechanisms of action | Theoretical knowledge helps to understand how and why de-implementation works. A theoretical background may not only increase the chances of success but also improve the understanding of what works (and what does not) in de-implementation. Examples include describing barriers and enablers for de-implementation, or describing who is involved and how they contribute to the process of behavioral change | Use a theoretical background in planning the intervention | 79% of the studies did not report a theoretical basis for the intervention |
Randomization unit: Randomization at a level different from the one at which the intervention primarily acts may result in loss of the randomization effect | Reducing the use of medical practices happens at the level of the medical provider. Therefore, if randomization happens at the patient level, the trial will not provide randomized data on provider-level outcomes. Even when the intervention target is the patient, the provider is usually involved in decision-making, so the intervention effect occurs at both the provider and patient levels. Randomization at the patient level is justified when patient-level outcomes are measured or when the number of providers is large and represents several types of providers | Randomize at the same level at which the intervention effect is measured | 12% of the studies had provider-level outcome(s) but were randomized at the patient level |
Outcomes: Total-volume-of-care outcomes may not represent changes in low-value care use | Total-volume-of-care outcomes (including diagnosis-based outcomes) are vulnerable to bias, such as seasonal variability and diagnostic shifting [24]. Changes in these outcomes may not represent changes in actual low-value care use, as the total volume of care includes both appropriate and inappropriate care. When measuring low-value care, comparing its use relative to the total volume of care or to appropriate care can help mitigate these biases | Use actual low-value care use outcomes whenever possible | 28% of the studies measured actual low-value care use |
Cluster heterogeneity: Practice-level variability in the use of low-value care may be large | Baseline variability in low-value care use may be large [25]. As such, if the number of clusters is low, baseline variability might lead to biased effect estimates | Compare low-value care use between baseline and after the intervention | 24% of the studies did not report baseline estimates or differences between baseline and after the intervention |