Analysis | Methods |
Measures of treatment effect | Continuous data: For continuous data, we will calculate the mean difference (MD) with 95% confidence intervals (CIs). Because different scales may be used to measure the same outcome in trials on autism spectrum disorder (ASD), we expect to make wide use of the standardized mean difference (SMD) in our review. Final values and change‐from‐baseline data should not be combined as SMDs, so when both are reported in included trials, we will analyze them separately; however, when both types of data are available for the same scale, we will combine them using the MD. We will not include skewed data in the analyses. Dichotomous data: For dichotomous data, we will calculate the risk ratio (RR), odds ratio (OR), and risk difference (RD) with 95% CIs. |
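To make these effect measures concrete, the following is a minimal, illustrative Python sketch (the review itself will use RevMan 5) of the MD, SMD, and RR with 95% CIs computed from summary statistics; the function names and the SMD variance approximation (Cohen's d with a pooled SD) are our own choices, not specifications from the protocol.

```python
import numpy as np
from scipy import stats

def mean_difference(m1, sd1, n1, m2, sd2, n2):
    """MD and 95% CI for two independent groups."""
    md = m1 - m2
    se = np.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = stats.norm.ppf(0.975)
    return md, (md - z * se, md + z * se)

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """SMD (Cohen's d with pooled SD) and an approximate 95% CI."""
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    se = np.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    z = stats.norm.ppf(0.975)
    return d, (d - z * se, d + z * se)

def risk_ratio(e1, n1, e2, n2):
    """RR and 95% CI, computed on the log scale."""
    rr = (e1 / n1) / (e2 / n2)
    se_log = np.sqrt(1 / e1 - 1 / n1 + 1 / e2 - 1 / n2)
    z = stats.norm.ppf(0.975)
    return rr, (np.exp(np.log(rr) - z * se_log), np.exp(np.log(rr) + z * se_log))
```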
Unit of analysis issues | For most outcomes, the unit of analysis will be the individual participant. Cluster‐randomized trials: We will include cluster‐randomized trials alongside individually randomized trials in the analysis. We will analyze them, as detailed in section 16.3 of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2011a), using an estimate of the intracluster correlation coefficient (ICC) derived from the trial (if possible) or from another source. If we use ICCs from other sources, we will report this fact and conduct sensitivity analyses to investigate the effect of variation in the ICC. If we identify both cluster‐randomized and individually randomized trials, we will synthesize the relevant information. We will consider it reasonable to combine the results of both designs if little heterogeneity between them is noted, and if an interaction between the effect of the intervention and the choice of randomization unit is considered unlikely. Otherwise, we will acknowledge the heterogeneity in the randomization unit and perform separate meta‐analyses. Multiple intervention groups: We have not yet encountered studies with multiple intervention groups. Should a future study include them, and if appropriate, we will combine groups to create a single pairwise comparison. The recommended method in most situations is to combine all relevant experimental intervention groups of the study into a single group, and to combine all relevant control intervention groups into a single control group (Higgins 2011a). Indirect comparisons are not randomized comparisons; they are observational findings across trials and may suffer the biases of observational studies (Higgins 2011a). We will therefore exclude indirect comparisons. |
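The standard adjustment for cluster‐randomized trials described in section 16.3 of Higgins 2011a reduces to dividing the observed sample size by the design effect 1 + (m - 1) * ICC, where m is the average cluster size. The sketch below illustrates this; the ICC value is a placeholder, not an estimate from any included trial.

```python
def effective_sample_size(n_participants, avg_cluster_size, icc):
    """Deflate the observed sample size by the design effect
    1 + (m - 1) * ICC so the trial can enter a standard meta-analysis."""
    design_effect = 1 + (avg_cluster_size - 1) * icc
    return n_participants / design_effect

# Example: 200 participants in clusters of 20, with an assumed ICC of 0.05.
print(effective_sample_size(200, 20, 0.05))  # ~102.6 effective participants
```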
Dealing with missing data | When published data are incomplete, we will try to obtain the missing data from the primary investigator. If this approach is unsuccessful, we will restrict the analyses to the available data and will use sensitivity analyses to examine whether the overall findings are robust to the potential influence of missing data, assessing how sensitive the results are to reasonable changes in the assumptions made. We will critically appraise issues related to the intention‐to‐treat (ITT) analysis and will compare them against the specifications of the primary outcome parameters and the power calculations. |
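As one illustration of such a sensitivity analysis for a dichotomous outcome (an approach not specified in the protocol, offered here as a hypothetical sketch), the result can be bounded between a best case in which all dropouts are non‐events and a worst case in which all dropouts are events; the counts below are invented for the example.

```python
def impute_bounds(events, completers, randomized):
    """Return (best-case, worst-case) event counts for one arm:
    dropouts counted as non-events vs. dropouts counted as events."""
    dropouts = randomized - completers
    return events, events + dropouts

best, worst = impute_bounds(events=12, completers=40, randomized=50)
# Re-running the meta-analysis at each bound shows whether conclusions
# are robust to assumptions about the missing participants.
```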
Assessment of heterogeneity | We will consider three types of heterogeneity: clinical, methodological, and statistical. We will assess clinical heterogeneity by comparing the distribution of important participant factors between trials, such as age, gender, specific diagnoses or diagnostic subtypes (or both), duration of the disorder, and associated neuropsychiatric diseases. We will assess methodological heterogeneity by comparing trial characteristics such as allocation concealment, blinding, and losses to follow‐up (see Quality of the evidence). We will assess statistical heterogeneity by examining Chi² and I². We will use the Chi² test (with P ≤ 0.10 indicating statistically significant heterogeneity) to determine whether heterogeneity is present; however, this test has low power when the meta‐analysis includes few studies or studies with small sample sizes, so a nonsignificant result cannot be taken as evidence of no heterogeneity. We will also quantify the degree of statistical heterogeneity by examining I², graded as follows (Deeks 2011): 0% to 40%, might not be important; 30% to 60%, may represent moderate heterogeneity; 50% to 90%, may represent substantial heterogeneity; and 75% to 100%, considerable heterogeneity.
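Both statistics can be computed directly from study‐level estimates; the sketch below (illustrative only, with hypothetical inputs) obtains Cochran's Q, its Chi² P value, and I² = max(0, (Q - df)/Q) x 100%.

```python
import numpy as np
from scipy import stats

def heterogeneity(effects, ses):
    """Cochran's Q, its P value, and the I-squared statistic."""
    effects = np.asarray(effects)
    w = 1 / np.asarray(ses) ** 2          # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - pooled) ** 2)
    df = len(effects) - 1
    p = stats.chi2.sf(q, df)
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, p, i2

q, p, i2 = heterogeneity([0.20, 0.55, 0.80], [0.10, 0.20, 0.15])
```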
We will examine the trials to investigate possible explanations for heterogeneity. If heterogeneity is identified among a group of studies, we will check the data and try to establish the reasons for it. For heterogeneity that cannot be readily explained, we intend to divide the data into subgroups if an appropriate basis can be identified. Studies have shown that different estimation methods may lead to different results and conclusions. For example, the DerSimonian and Laird (DL) estimator, which is currently the widely used default for estimating between‐study variance, has long been challenged (Veroniki 2016): it can lead to erroneous conclusions (Cornell 2014) and can substantially underestimate the true value for dichotomous outcomes (Novianti 2014). For continuous data, the restricted maximum likelihood (REML) estimator is a better alternative for estimating between‐study variance than other estimators (Veroniki 2016). We plan to assess heterogeneity by comparing the estimated magnitude of the between‐study variance with the empirical distributions of Turner 2012 for dichotomous data and Rhodes 2015 for continuous data. |
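For readers unfamiliar with the DL estimator criticized above, the sketch below shows its moment‐based formula; REML, the preferred alternative for continuous data, requires iterative optimization and is available in dedicated software such as R's metafor package, so it is not reproduced here.

```python
import numpy as np

def tau2_dl(effects, ses):
    """DerSimonian-Laird estimate of the between-study variance tau^2."""
    effects = np.asarray(effects)
    w = 1 / np.asarray(ses) ** 2
    pooled = np.sum(w * effects) / np.sum(w)
    q = np.sum(w * (effects - pooled) ** 2)    # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    return max(0.0, (q - (len(effects) - 1)) / c)
```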
Assessment of reporting biases | We will try to obtain the study protocols of all included studies so that we can compare the outcomes reported in the protocol with those reported in the findings. When we suspect reporting bias, we will attempt to contact the study authors and ask them to provide the missing outcome data. When this is not possible, and the missing data are thought to introduce serious bias, we will conduct a sensitivity analysis to evaluate the impact of including such studies in the overall assessment of results. We will assess publication bias by using funnel plots or the Egger test (Egger 1997), depending on the number of clinical trials included in the systematic review. The funnel plot should be seen as a generic means of displaying small‐study effects, that is, a relationship between trial size and effect size. Publication bias is only one possible cause of funnel plot asymmetry; asymmetry may also reflect true heterogeneity in intervention effects (Egger 1997; Sterne 2011). |
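The Egger test itself is a simple regression: the standardized effect (effect divided by its standard error) is regressed on precision (1 divided by the standard error), and an intercept that differs from zero signals small‐study effects. A minimal sketch, assuming scipy is available and using hypothetical inputs, follows.

```python
import numpy as np
from scipy import stats

def egger_test(effects, ses):
    """Egger's regression test: standardized effect vs. precision."""
    y = np.asarray(effects) / np.asarray(ses)  # standardized effects
    x = 1 / np.asarray(ses)                    # precision
    res = stats.linregress(x, y)
    t = res.intercept / res.intercept_stderr
    p = 2 * stats.t.sf(abs(t), len(x) - 2)     # two-sided test of intercept = 0
    return res.intercept, p
```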
Data synthesis | If more than one eligible trial is identified, and sufficient homogeneity is observed among studies with respect to participants and reported outcomes, we will perform meta‐analyses using Review Manager 5 (RevMan 5) (RevMan 5 2014). We will fit both the fixed‐effect model and the random‐effects model; the two will yield similar results when no significant heterogeneity and no publication bias are noted among the trials. If no significant heterogeneity is present, we will report results of the fixed‐effect model only. If significant heterogeneity or severe asymmetry of the funnel plot is observed (asymmetry may itself be due to true heterogeneity), we will report the results of the random‐effects model. For continuous data, we will use the inverse variance method available in RevMan 5 (RevMan 5 2014). For dichotomous data, when data are sparse, in terms of low event rates or small study sizes, the Mantel‐Haenszel methods have better statistical properties than the inverse variance method (Deeks 2011); in such cases, we will use the Mantel‐Haenszel method to calculate the RR and RD. Both the Mantel‐Haenszel and inverse variance methods perform poorly when event rates are very low; in that case, Peto's method works well, although it can be used only to pool odds ratios (Deeks 2011). |
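Although the pooling itself will be done in RevMan 5, the inverse variance method named above reduces to the short calculation sketched here: setting tau2 = 0 gives the fixed‐effect model, and a positive tau2 (for example, from the DL sketch earlier) gives the random‐effects model. The Mantel‐Haenszel and Peto methods use different, stratified weights and are omitted for brevity.

```python
import numpy as np

def pool_inverse_variance(effects, ses, tau2=0.0):
    """Inverse-variance pooled estimate with 95% CI.
    tau2 = 0 -> fixed-effect model; tau2 > 0 -> random-effects model."""
    v = np.asarray(ses) ** 2 + tau2
    w = 1 / v
    theta = np.sum(w * np.asarray(effects)) / np.sum(w)
    se = np.sqrt(1 / np.sum(w))
    return theta, (theta - 1.96 * se, theta + 1.96 * se)
```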
Subgroup analysis and investigation of heterogeneity | We will perform subgroup analysis based on the factors below. |
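Whatever the final set of factors, subgroup differences are conventionally examined with a Chi² test across the subgroup estimates; the sketch below is illustrative only (fixed‐effect pooling within subgroups is assumed for brevity).

```python
import numpy as np
from scipy import stats

def subgroup_difference_test(subgroup_estimates, subgroup_ses):
    """Test for subgroup differences: Q_between on (g - 1) df."""
    est = np.asarray(subgroup_estimates)
    w = 1 / np.asarray(subgroup_ses) ** 2
    overall = np.sum(w * est) / np.sum(w)
    q_between = np.sum(w * (est - overall) ** 2)
    df = len(est) - 1
    return q_between, stats.chi2.sf(q_between, df)
```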
Sensitivity analysis | We will perform sensitivity analyses for missing data and for study risk of bias. For missing data, we will employ sensitivity analyses using different approaches to imputation: we will critically appraise last observation carried forward (LOCF), ITT, and per‐protocol (PP) analyses and compare them against the primary outcome parameters and power calculations. For risk of bias, if appropriate, we will conduct sensitivity analyses based on the presence or absence of a reliable random allocation method, concealment of allocation, and blinding of participants or outcome assessors, and we will test the robustness of the results by including or excluding studies of poor quality. |
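One simple, commonly used robustness check consistent with this plan is a leave‐one‐out analysis, sketched below under the assumption of fixed‐effect pooling; studies whose removal materially shifts the pooled estimate (for example, those at high risk of bias) would then be scrutinized.

```python
import numpy as np

def leave_one_out(effects, ses):
    """Pooled fixed-effect estimate after removing each study in turn."""
    effects, ses = np.asarray(effects), np.asarray(ses)
    pooled = []
    for i in range(len(effects)):
        keep = np.arange(len(effects)) != i
        w = 1 / ses[keep] ** 2
        pooled.append(np.sum(w * effects[keep]) / np.sum(w))
    return pooled
```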