Skip to main content
PLOS One logoLink to PLOS One
. 2024 Mar 19;19(3):e0298526. doi: 10.1371/journal.pone.0298526

Do statistical heterogeneity methods impact the results of meta- analyses? A meta epidemiological study

Samer Mheissen 1,*, Haris Khan 2, David Normando 3, Nikhillesh Vaiid 4, Carlos Flores-Mir 5
Editor: Ashraful Hoque6
PMCID: PMC10950254  PMID: 38502662

Abstract

Background

Orthodontic systematic reviews (SRs) use different methods to pool the individual studies in a meta-analysis when indicated. However, the number of studies included in orthodontic meta-analyses is relatively small. This study aimed to evaluate the direction of estimate changes of orthodontic meta-analyses (MAs) using different between-study variance methods considering the level of heterogeneity when few trials were pooled.

Methods

Search and study selection: Systematic reviews (SRs) published over the last three years, from the 1st of January 2020 to the 31st of December 2022, in six main orthodontic journals with at least one MA pooling five or lesser primary studies were identified. Data collection and analysis: Data were extracted from each eligible MA, which was replicated in a random effect model using DerSimonian and Laird (DL), Paule–Mandel (PM), Restricted maximum-likelihood (REML), Hartung Knapp and Sidik Jonkman (HKSJ) methods. The results were reported using median and interquartile range (IQR) for continuous data and frequencies for categorical data and analyzed using non-parametric tests. The Boruta algorithm was used to assess the significant predictors for the significant change in the confidence interval between the different methods compared to the DL method, which was only feasible using the HKSJ method.

Results

146 MAs were included, most applying the random effect model (n = 111; 76%) and pooling continuous data using mean difference (n = 121; 83%). The median number of studies was three (range 2, 4), and the overall statistical heterogeneity (I2 ranged from 0 to 99% with a median of 68%). Close to 60% of the significant findings became non-significant when HKSJ was applied compared to the DL method and when the heterogeneity was present I2>0%. On the other hand, 30.43% of the non-significant meta-analyses using the DL method became significant when HKSJ was used when the heterogeneity was absent I2 = 0%.

Conclusion

Orthodontic MAs with few studies can produce different results based on the between-study variance method and the statistical heterogeneity level. Compared to DL, HKSJ method is overconservative when I2 is greater than 0% and may result in false positive findings when the heterogeneity is absent.

Introduction

The publication of systematic reviews (SRs) in orthodontics has increased exponentially in recent years [1, 2]. A properly conducted SR requires many rigorous steps, starting with a comprehensive search for the available evidence, selecting the eligible studies, extracting data, and finally synthesizing the data [3, 4]. To guarantee valid results of an SR, authors should follow a proven rigor SR methodology that will maximize the proper consideration of the available evidence. Researchers generally use a meta-analysis (MA) to synthesize data from individual studies and provide the reader with summarized quantitative results. MA can facilitate interpreting the results by providing a cumulative effect measure. Selecting an appropriate statistical model is paramount for the validity of the MA.

Two main statistical models are applied in the MA: the fixed effect or the random effects model [5]. The fixed effect model assumes that all included studies in MA have the same true effect size, while the random effects model considers the effect size variation between the pooled studies in MA. In other words, the selection between the two models depends on the expected or measured effect size variation from study to study, which may result from differences (heterogeneity) in the participants’ characteristics (age, gender, ethnicity, type of malocclusion, crowding level, etc.) and the implementation of the intervention (type, dose, activation of appliance, wearing time of appliance, follow up, etc.). The fixed effect model ignores the between-study heterogeneity or assumes it is nonexistent. In contrast, the random effects model quantifies the degree of statistical heterogeneity in MA and gives relatively larger weight to smaller studies than the fixed mode [6, 7]. Random effects model provides identical results to the fixed effect model if there is no heterogeneity. The random effects model implements different estimators for the between-study variance [3] to calculate confidence intervals in MA.

The simplest and most commonly applied between-study variance estimator is the DerSimonian and Laird (DL) approach [8, 9]. DL method is the default option in many popular statistical software packages such as Comprehensive Meta-analysis (CMA) [10] and Review Manager (RevMan) [11]. However, DL may lead to false positive results, particularly when the number of studies is small [12], the studies are unequal in size [13], and the heterogeneity is increased [14]. Many alternative methods exist to overcome the limitation of the DL method; the Paule–Mandel (PM) method [15], the Restricted maximum-likelihood (REML) method, the Hartung and Knapp [16] and the Sidik and Jonkman [17] (HKSJ) method. Many simulation studies [13, 1820] investigated the effect of different between-study variance estimators on the pooled estimate, but the findings conflict. A previous study recommended the REML and HKSJ over the DL methods [19], while PM was recommended in three simulation studies reported in a systematic review [21].

A previous study [22] revealed that approximately 65% of MAs in orthodontics have less than five trials. The potential impact of selecting a between-study variance estimator increases with smaller samples [23]. To address these shortcomings of pooling few trials in MAs in orthodontics a recent study investigated the use of the heterogeneity estimator corrected by Hartung Knapp on MAs in orthodontics [24]. Still, the authors included a wide range of pooled studies in the MA (from 3 to 45 per MA). This large variation in the primary studies can diffuse the true impact on smaller samples. Keeping this in mind, the present study aimed to investigate the impact of different between-study variance estimators, the degree of heterogeneity, and the sample size equality of pooled studies on the result of MAs in orthodontics with fewer trials (less than 5).

Materials and methods

The reporting of this methodological study followed the reporting guidelines for methodological studies [25] which was adapted from Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).

Eligibility criteria

SRs were included if they met the following criteria:

  • Orthodontic SRs published between the 1st of January 2020 and the 31st of December 2022 in six journals: Progress in Orthodontics (PIO), European Journal of Orthodontics (EJO), Orthodontics & Craniofacial Research (OCR), American Journal of Orthodontic and Dentofacial Orthopedics (AJODO), The Angle Orthodontist (AO), and Korean Journal of orthodontic (KJO).

  • These SRs should have included at least one meta-analysis with less than five primary studies reporting interventional procedures.

SRs of animal or laboratory trials, network meta-analyses, and MAs of incidence or prevalence were excluded. Scoping reviews, overviews, guidelines, and qualitative SRs were also excluded.

Search and study selection

One author (xx) performed an electronic search in Medline via PubMed using MeSH terms and text word and journal websites to collect the relevant records (S1 Table). Two authors (xx, xxx) screened the titles and abstracts of the retrieved studies independently and in duplicate. Systematic reviews with meta-analyses related to orthodontics were included initially without consideration to the number of primary studies. Once the full version was obtained, the number of included primary studies per meta-analysis was evaluated. A discussion was completed until a consensus was reached in case of disagreement.

Data collection process

The authors extracted the following characteristics of meta-analysis: the journal title, the effect measure (mean difference, Risk ratio, odds ratio, standardized mean difference), the MA model implemented (random, fixed), the number of included primary studies, the number of participants, the between-study variance estimator utilized (DL, REML, PM, HK, SJ), and the statistical software used.

For eligible meta-analysis; the number of each study group, mean, and standard deviation for continuous data, and number of events and non-events for dichotomous outcomes were extracted from the forest plots and entered in a Microsoft excel® (Microsoft, Redmond, Washington, USA) file for data presented in arm-level format. The mean, standard error or upper and lower confidence intervals were extracted if data were presented in contrast-level format.

Statistical analysis and data synthesis

The characteristics of the included meta-analyses were summarized using median and interquartile ranges for continuous variables and frequencies for categorical variables. Each meta-analysis was replicated by one author (xx) using four different heterogeneity estimators (DL, REML, PM, HKSJ) available in Stata 15.1 software (Stata Corp, Texas, USA). The results of each replicated MA were compared to the original MA results using the same scenario to avoid the error in data extraction. Then from each MA output we collected; the overall effect size, the corresponding 95% confidence interval (CI), the P-value, and the level of statistical heterogeneity presented by I2, Q, and Tau2. Furthermore, the ratio between the confidence intervals of each estimator and the confidence interval of DL was calculated using the following equation:

CIEstimatorratio=widthof95%CIEstimatorwidthof95%CIDL

For the equality of trials’ size, a percentage between the largest study and the smallest one in MA was calculated and if up to 20% the studies were considered to be equal to some extent, between 21–50% moderately unequal, between 51–100% unequal, and >100% substantially unequal.

The significance of the meta-analyses with the different heterogeneity estimators, P values, and confidence interval ratios for the different estimators were calculated and divided according to the statistical heterogeneity level expressed by I2 statistics. The results were analyzed and visualized using the R statistical package (version 4.3.0) (R Foundation for Statistical Computing, Vienna, Austria). Finally, the feature selection algorithm Boruta was used to identify significant predictors associated with the significant change in the confidence interval from the DL method, and this was applicable to the HKSJ method.

Results

Meta-analyses characteristics

The selection process of the studies in the CONSORT format is illustrated in Fig 1. The excluded studies and the reasons for exclusion are clarified in the S2 Table.

Fig 1. The flowchart of the MAs selection process.

Fig 1

A total of 146 meta-analyses with few studies were analyzed. The vast majority of the included MAs (n = 111; 76.03%) were conducted using the random effect model, and the output was expressed using the mean difference (n = 121; 82.88%). Likewise, most of the included MAs did not report the used heterogeneity estimator in their analysis (n = 99; 83.19%). Nevertheless, the DL method was reported in a few MAs (n = 14; 11.76%), and the other methods, such as HK, SJ, and REML, were barely reported. The median number of studies was 3 (IQR: 2 to 4), with a median number of participants 139 (IQR: 99, 249) per MA. The size of the studies per MA was equal in only 18% (21/146). The most frequently used software was RevMan (n = 112; 76.71%), followed by Stata and CMA software.

The heterogeneity measures varied as for I2 it ranged from 0 to 99% with a median of 68% (interquartile range (IQR): 0 to 78), for Q it ranged from 0 to 224.5 with a median of 2 (IQR: 0, 8), while for t2(tau2) it ranged from 0 to 75000 with a median of 0.0007 (IQR: 0 to 1) (Table 1).

Table 1. Characteristics of included meta-analyses.

Characteristic Overall number of meta-analyses, N = 146
Journal, n (%) AJODO 33 (22.60%)
Angle 9 (6.16%)
EJO 51 (34.93%)
KJO 5 (3.42%)
OCR 36 (24.66%)
PIO 12 (8.22%)
Number of studies per meta, Median (IQR) Median (IQR) 3(2,4)
model, n (%) fixed 27 (18.49%)
random 111 (76.03%)
NI 8 (5.48%)
Q, Median (IQR) 2 (0, 8)
I2, Median (IQR) 0.07 (0.00, 0.78)
t2 DL, Median (IQR) 0 (0, 1)
participants number, Median (IQR) 139 (99, 249)
Equality of trials’ size, n (%) Equal 21 (18%)
Moderately unequal 16 (14%)
Unequal 46 (39%)
Substantially unequal 34 (29%)
software, n (%) Comprehensive Meta-Analysis 12 (8.22%)
R 2 (1.37%)
RevMan 112 (76.71%)
Stata 19 (13.01%)
NI 1 (0.68%)

Replicating meta-analyses using the four heterogeneity estimator methods

The 146 MAs were replicated using the four heterogeneity estimators; DL, REML, PM, and HKSJ. (Table 2). The overall effect estimate did not change using the four mentioned estimators in MA. However, the change was obvious in the confidence interval’s (CI) width and, subsequently, the findings’ P-value and significance. REML and PM showed a slightly different confidence intervals and significance level. In contrast, HKSJ showed wider CI than DL, which was mostly double the width of DL confidence interval up to 6 times when I2 was greater than 0. Also, HKSJ yielded wider confidence intervals when I2 was 0, but this increase was smaller than MAs with higher I2 levels. Interestingly, the confidence Interval using the HKSJ method was wider in all meta-analyses when I2 was greater than 0, but it was narrower in (30/71; 42.25%) meta-analyses with a lack of heterogeneity (I2 = 0) (Table 2 and Fig 2).

Table 2. Crosstabulation of different results of replicated meta-analyses using different methods of between-study variance with the level of statical heterogeneity expressed by I2.

Characteristic Overall number of meta-analyses, N = 146 I2 = 0, N = 71 I2 >0, N = 75 p-value
Effect size measure, n (%)
 MD 121 (82. 88%) 58 (81.69%) 63 (84%)
 RR 5 (3.42%) 3 (4.23%) 2 (2. 67%)
 SMD 20 (13.70%) 10 (14.08%) 10 (13.33%)
Model, n (%) 0.0021
 fixed 27 (18. 49%) 20 (28. 17%) 7 (9.33%)
 random 111 (76.03%) 50 (70.42%) 61 (81.33%)
 NI 8 (5.48%) 1 (1.41%) 7 (9.33%)
tau2 Estimator, n (%) >0.91
 DL 14 (11.76%) 5 (9.80%) 9 (13.24%)
 HK 1 (0.84%) 0 (0%) 1 (1.47%)
 MH 2 (1.68%) 1 (1.96%) 1 (1.47%)
 REML 1 (0.84%) 1 (1.96%) 0 (0%)
 SJ 2 (1.68%) 1 (1.96%) 1 (1.47%)
 NI 99(83.19%) 43(84.31%) 56 (82.35%)
participants number, Median (IQR) 139 (99, 249) 139 (89, 238) 139 (105, 298) 0.32
t2 DL, Median (IQR) 0 (0, 1) 0 (0, 0) 0 (0, 4)
Q, statistics, Median (IQR) 2 (0, 8) 0 (0, 1) 8 (4, 17)
I2 statistics, Median (IQR) 0.77 (0.45, 0.88)
RMEL/DL ratio, Median (IQR) 1.00 (1.00, 1.00) 1.00 (1.00, 1.00) 1.00 (0.99, 1.01) 0.72
PM/DL ratio, Median (IQR) 1.00 (1.00, 1.00) 1.00 (1.00, 1.00) 1.00 (0.95, 1.01) 0.82
HKSJ/DL ratio, Median (IQR) 1.68 (1.22, 2.90) 1.19 (0.64, 2.07) 2.05 (1.65, 6.47) <0.0012
HKSJ and DL confidence interval length comparison Smaller 30(42.25%) 0 <0.0011
 Larger 41(57.75%) 75(100%)
Effect size DL, Median (IQR) 0.14 (-0.14, 1.12) 0.14 (-0.01, 0.69) 0.25 (-0.37, 1.31)
Effect size REML, Median (IQR) 0.14 (-0.14, 1.11) 0.14 (-0.01, 0.69) 0.25 (-0.37, 1.22)
Effect size PM, Median (IQR) 0.14 (-0.14, 1.09) 0.14 (-0.01, 0.69) 0.24 (-0.37, 1.22)
Effect sizeHKSJ, Median (IQR) 0.14 (-0.14, 1.12) 0.14 (-0.01, 0.69) 0.25 (-0.37, 1.31)
P value difference RMEL/DL (mean ± sd) (95%CI) 0.00004 ± 0.0558 (95%CI: -0.11, 0.11) 0.992 0.0013 ± 0.041 (-0.08,0.08) 0.972
P value difference PM/DL (mean ± sd) (95%CI) 0.00004 ± 0.0558 (-0.11, 0.11) 0.992 0.0023 ±0.041(-0.08, 0.08) 0.952
P value difference HKSJ/DL (mean ± sd) (95%CI) -0.001±.052(-0.001,0.2) 0.0512 -0.097± 0.04 (-017, -0.016) 0.012

1 Fisher’s exact test

2 Wilcoxon rank sum test

Fig 2. Boxplot depicts the different ratios of confidence intervals for the three heterogeneity estimators; HKSJ/DL, PM/DL, and REML/DL.

Fig 2

The median of REML/DL and PM/DL is almost equal to one. The median of HKSJ/DL ratio is slightly greater than two when I2 = 0%, and it is slightly greater than one when I2 = 0%. It is worth noting that almost one-third of the boxplot represents less than one.

In regards to P-values and significance of the results, more than half (21/35; 60%) of significant findings of MAs with a statistical heterogeneity (I2>0) using the DL method became non-significant using HKSJ. On the other hand, (7/47;30.43%) of non-significant results became significant, and (8/24;16%) of the significant results became non-significant when HKSJ was used instead of the DL method in meta-analyses with a lack of heterogeneity (I2 = 0). In terms of other methods, there was a change in one significant result using the PM method (Table 3 and Fig 3).

Table 3. Absolute and relative frequencies of the significance of the meta-analyses using the four different heterogeneity methods.

Level of Statistical Heterogeneity
I2 = 0 I2>0
Estimator DL Non-significant N = 47(%) Significant N = 24 (%) Total N = 71 (%) Non-significant N = 40 Significant N = 35(%) Total N = 75 (%)
REML Non-significant 47 0 47 (66.20%) 40 0 40 (53.33%)
significant 0 24 24 (33.80%) 0 35 35 (46.67%)
PM Non-significant 47 0 47 (66.20%) 40 1 (2.44%) 41 (54.67%)
significant 0 24 24(33.80%) 0 34 (97.14%) 34 (45.33%)
HKSJ Non-significant 40(83.33%) 8(16.67%) 48 (67.61%) 40 21(60%) 61 (81.33%)
significant 7(30.43%) 16(69.57%) 23 (32.39%) 0 14(40%) 14 (18.67%)

Fig 3. P-value distribution between the three methods of Heterogeneity compared to DL method.

Fig 3

The Boruta algorithm is a feature selection algorithm which measures the importance of each variable with respect to the outcome. The Boruta algorithm identified the heterogeneity measures (Q, I2, and t2) and the number of studies as significant predictors for the change in HKSJ confidence interval width compared to DL. In other words, the change in CI HKSJ/DL ratio was affected by the number of included studies in MA and the value of the statistical heterogeneity measures (Fig 4).

Fig 4. The Boruta algorithm model shows significant predictors with green boxplots.

Fig 4

Red boxplots represent the insignificant features. The blue boxplots represent the shadow attributes.

Discussion

The DL method in random effect meta-analysis is a commonly used for calculating the confidence interval in orthodontic MAs. Although there was a lack of reporting of the between-study variance method in 83% (99/119) of random meta-analysis, 98% (97/99) of those MAs were conducted using RevMan or CMA, where the default method of between-study variance is DL. Previous simulation studies [2628] reported that the DL method has a high rate of false positive results. Likewise, two methodological studies [13, 29] found a considerable number of false positive results in Cochrane MAs with significant findings using the DL method. The current investigation replicated 146 MAs with few primary studies less than 5 using four between-study variance methods (DL, REML, PM, HKSJ) and found that (30/69; 43.47%) of the significant MAs became non-significant after applying HKSJ. This finding confirms previous empirical assessments [13, 29], which investigated Cochrane MAs, and revealed that greater significant results resulted from methods based on the normal distribution (such as DL) than methods based on t-distribution (such as HKSJ). Likewise, a simulation study found that t-distribution produces a wider confidence interval than the normal distribution, especially when the statistical heterogeneity and the number of studies are small in meta-analysis [30]. This may rias concerns related to the overcoverage and loss of power [31]. The aforementioned information my explain our findings, as the confidence interval of MAs using HKSJ was double the length up to approximately 7 times the length of the confidence interval when DL was used, and the heterogeneity was present (I2>0%). Hence, HKSJ is over-conservative and may result in non-significant results when they are actually significant. A strength of this study is comparing the results of the different methods based on the heterogeneity level.

It is also important to mention that 30.43% (7/40) of non-significant MAs using the DL method became significant with HKSJ when the statistical heterogeneity was absent (I2 = 0%). This can be supported by the finding that the confidence interval was narrower in 42.25% (30/75) of MAs using HKSJ than that using DL when the statistical heterogeneity was absent (I2 = 0%).(Fig 5) These findings were consistent with a previous study [32] which replicated the analysis of 157 MAs with binary data using HK and DL and concluded that HK may yield a narrower confidence interval and smaller P-values than DL in some homogeneous meta-analyses. Consequently, HKSJ is not always conservative [31] and may also be imperfect and may result in false positive results when heterogeneity is absent. A previous study [31] suggested to avoid Hartung-Knapp method When the heterogeneity is absent (t2 = 0) due to producing a modified results with shorter confidence intervals and smaller P values.

Fig 5. Example of replicated meta-analysis for four trials with a lack of statistical heterogeneity (I2 = 0%).

Fig 5

The confidence interval is equal in the three methods (DL, REML, and PM) (95%CI; -0.56 to 1.70), but it is narrower when HKSJ estimator was applied (95%CI; -0.21 to 1.35).

The Boruta algorithm found that the number of studies and the heterogeneity measures are significant predictors for the change in HKSJ confidence interval width. However, the model rejected the effect of equality of trials’ size on the confidence interval change between HKSJ and DL methods. In contrast, a previous study [13] found that equal-sized trials may decrease the change in type I error rate even if there is a moderate heterogeneity. In our study, the high number of unequal-size trials, the large difference in trials’ size in the individual MA, or the very few trials in MA might render the effect of the size equality trivial.

Simulation studies [33] showed that REML and PM methods are more robust than the DL method. They recommended the PM method for both continuous and binary outcomes and the REML method for continuous outcomes [34] even for large heterogeneity (t2). A recent study found similar results for DL, REML, and PM, especially when the heterogeneity was absent. This is consistent with the findings of Chung et al. [35], who found that DL and REML produce similar findings when the number of studies is small.

A previous investigation [24] included a wide range of pooled studies in a meta-analysis and assessed the results using eight different heterogeneity methods with and without HK adjustment. The authors concluded that “meta-analysis with at least three studies is sensitive to HK correction”. However, we found that in meta-analysis with few studies (fewer than 5), HKSJ is overconservative when I2 is greater than 0%, while HKSJ may result in false positive results when the heterogeneity is absent (I2 = 0%). Although making a clear recommendation about the best methods is difficult in case of few studies, we suggest using DL method for pooling a small number of primary studies (fewer than 5) in MAs when I2 = 0% and sensitivity analysis using both HKSJ and DL for pooling MAs when I2>0.

Clinical implications

The clinical implications of this study are important for MAs focused on orthodontic samples. As many MAs related to orthodontics include small samples of primary studies, selecting the between-study variance methods when conducting the MAs can determine whether the identified differences are significant. It is recommended that MAs related to orthodontics that include small samples of primary studies should report using more than one between-study variance method (sensitivity analysis), especially when the confidence interval is close to the significance/non-significance cut-off point.

Limitations

The search period was restricted to three years, but the study aimed to collect MAs rather than only SRs and to map a problem rather than providing a more robust estimate. Although relatively short, this period probably expresses a reality closer to what has been investigated nowadays. MAs with only two to four studies were included in this study to concentrate on MAs with fewer primary studies. Finally, our assessment did not investigate the effect of between-study variance methods on the prediction intervals because of approximately more than one-third of the sample (53/146; 36.3%) had only two studies.

Conclusion

This sample of MAs with less than five studies appears to be sensitive to the selected between-study variance method. HKSJ doubled the confidence interval of the pooled estimate when the heterogeneity was present. However, HKSJ reduced the confidence intervals of 30% of MAs when the heterogeneity was absent, leading to more significant differences when compared to the DL method.

Supporting information

S1 Table. Search strategy in PubMed.

(DOCX)

pone.0298526.s001.docx (17.6KB, docx)
S2 Table. Excluded studies with reason.

(DOCX)

pone.0298526.s002.docx (29.7KB, docx)
S1 Checklist. PRISMA 2020 checklist.

(DOCX)

pone.0298526.s003.docx (21.2KB, docx)

Data Availability

All relevant data are within the paper and its Supporting Information files.

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1.Papageorgiou SN, Eliades T. Evidence-based orthodontics: Too many systematic reviews, too few trials. Journal of Orthodontics. 2019;46(1_suppl):9–12. Epub 2019/05/06. doi: 10.1177/1465312519842322 . [DOI] [PubMed] [Google Scholar]
  • 2.Ioannidis JP. The Mass Production of Redundant, Misleading, and Conflicted Systematic Reviews and Meta-analyses. Milbank Q. 2016;94(3):485–514. doi: 10.1111/1468-0009.12210 ; PubMed Central PMCID: PMC5020151. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Higgins JPT TJ, Chandler J, Cumpston M, Li T, Page MJ, Welch VA. Cochrane Handbook for Systematic Reviews of Interventions version 6.0. Second edition ed2019. [Google Scholar]
  • 4.Vaid N DV, Vandekar M. What’s “Trend”ing in Orthodontic literature?. APOS Trends Orthod 2016;6:1–4. [Google Scholar]
  • 5.Borenstein M, Hedges LV, Higgins JP, Rothstein HR. A basic introduction to fixed-effect and random-effects models for meta-analysis. Res Synth Methods. 2010;1(2):97–111. Epub 20101121. doi: 10.1002/jrsm.12 . [DOI] [PubMed] [Google Scholar]
  • 6.Barili F, Parolari A, Kappetein PA, Freemantle N. Statistical Primer: heterogeneity, random- or fixed-effects model analyses?†. Interactive CardioVascular and Thoracic Surgery. 2018;27(3):317–21. doi: 10.1093/icvts/ivy163 [DOI] [PubMed] [Google Scholar]
  • 7.Dettori JR, Norvell DC, Chapman JR. Fixed-Effect vs Random-Effects Models for Meta-Analysis: 3 Points to Consider. Global Spine J. 2022;12(7):1624–6. Epub 20220620. doi: 10.1177/21925682221110527 ; PubMed Central PMCID: PMC9393987. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.DerSimonian R, Laird N. Meta-analysis in clinical trials. Control Clin Trials. 1986;7(3):177–88. doi: 10.1016/0197-2456(86)90046-2 . [DOI] [PubMed] [Google Scholar]
  • 9.DerSimonian R, Laird N. Meta-analysis in clinical trials revisited. Contemp Clin Trials. 2015;45(Pt A):139–45. Epub 20150904. doi: 10.1016/j.cct.2015.09.002 ; PubMed Central PMCID: PMC4639420. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Borenstein M HL, Higgins J, Rothstein H:. Comprehensive Meta-analysis. Englewood, NJ 07631 USA: Biostat; 2005. [Google Scholar]
  • 11.Review Manager (RevMan) 5.1.4. ed. Copenhagen: The Nordic Cochrane Centre; 2011.: The Cochrane Collaboration.
  • 12.Seide SE, Rover C, Friede T. Likelihood-based random-effects meta-analysis with few studies: empirical and simulation studies. BMC Med Res Methodol. 2019;19(1):16. Epub 20190111. doi: 10.1186/s12874-018-0618-3 ; PubMed Central PMCID: PMC6330405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.IntHout J, Ioannidis JP, Borm GF. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method. BMC Med Res Methodol. 2014;14:25. Epub 20140218. doi: 10.1186/1471-2288-14-25 ; PubMed Central PMCID: PMC4015721. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Makambi KH. The effect of the heterogeneity variance estimator on some tests of treatment efficacy. J Biopharm Stat. 2004;14(2):439–49. doi: 10.1081/BIP-120037191 . [DOI] [PubMed] [Google Scholar]
  • 15.Paule RC, Mandel J. Consensus Values and Weighting Factors. J Res Natl Bur Stand (1977). 1982;87(5):377–85. doi: 10.6028/jres.087.022 ; PubMed Central PMCID: PMC6768160. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.J H. An alternative method for meta-analysis. Biom J [Google Scholar]
  • 17.Sidik K, Jonkman JN. Simple heterogeneity vari- ance estimation for meta-analysis. Journal of the Royal Statistical Society: Series C (Applied Statistics). 2005;54:367–84. [Google Scholar]
  • 18.Hartung J, Knapp G. A refined method for the meta-analysis of controlled clinical trials with binary outcome. Stat Med. 2001;20(24):3875–89. doi: 10.1002/sim.1009 . [DOI] [PubMed] [Google Scholar]
  • 19.Langan D, Higgins JPT, Jackson D, Bowden J, Veroniki AA, Kontopantelis E, et al. A comparison of heterogeneity variance estimators in simulated random-effects meta-analyses. Res Synth Methods. 2019;10(1):83–98. Epub 20180906. doi: 10.1002/jrsm.1316 . [DOI] [PubMed] [Google Scholar]
  • 20.Mathes T, Kuss O. A comparison of methods for meta-analysis of a small number of studies with binary outcomes. Res Synth Methods. 2018;9(3):366–81. Epub 20180426. doi: 10.1002/jrsm.1296 . [DOI] [PubMed] [Google Scholar]
  • 21.Langan D, Higgins JPT, Simmonds M. Comparative performance of heterogeneity variance estimators in meta-analysis: a review of simulation studies. Res Synth Methods. 2017;8(2):181–98. Epub 20160406. doi: 10.1002/jrsm.1198 . [DOI] [PubMed] [Google Scholar]
  • 22.Koletsi D, Fleming PS, Eliades T, Pandis N. The evidence from systematic reviews and meta-analyses published in orthodontic literature. Where do we stand? Eur J Orthod. 2015;37(6):603–9. Epub 2015/02/11. doi: 10.1093/ejo/cju087 . [DOI] [PubMed] [Google Scholar]
  • 23.Viechtbauer W. Bias and Efficiency of Meta-Analytic Variance Estimators in the Random-Effects Model. Journal of Educational and Behavioral Statistics. 2005;30(3):261–93. doi: 10.3102/10769986030003261 [DOI] [Google Scholar]
  • 24.Tatas Z, Koutsiouroumpa O, Seehra J, Mavridis D, Pandis N. Do pooled estimates from orthodontic meta-analyses change depending on the meta-analysis approach? A meta-epidemiological study. Eur J Orthod. 2023. Epub 20230712. doi: 10.1093/ejo/cjad031 . [DOI] [PubMed] [Google Scholar]
  • 25.Murad MH, Wang Z. Guidelines for reporting meta-epidemiological methodology research. Evid Based Med. 2017;22(4):139–42. Epub 20170712. doi: 10.1136/ebmed-2017-110713 ; PubMed Central PMCID: PMC5537553. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Hartung J, Knapp G. On tests of the overall treatment effect in meta-analysis with normally distributed responses. Stat Med. 2001;20(12):1771–82. doi: 10.1002/sim.791 . [DOI] [PubMed] [Google Scholar]
  • 27.Hartung J, Makambi KH. Reducing the Number of Unjustified Significant Results in Meta-analysis. Communications in Statistics ‐ Simulation and Computation. 2003;32(4):1179–90. doi: 10.1081/sac-120023884 [DOI] [Google Scholar]
  • 28.Sidik K, Jonkman JN. Robust variance estimation for random effects meta-analysis. Computational Statistics & Data Analysis. 2006;50(12):3681–701. doi: 10.1016/j.csda.2005.07.019 [DOI] [Google Scholar]
  • 29.Thorlund K, Wetterslev J, Awad T, Thabane L, Gluud C. Comparison of statistical inferences from the DerSimonian-Laird and alternative random-effects model meta-analyses ‐ an empirical assessment of 920 Cochrane primary outcome meta-analyses. Res Synth Methods. 2011;2(4):238–53. Epub 20120214. doi: 10.1002/jrsm.53 . [DOI] [PubMed] [Google Scholar]
  • 30.Veroniki AA, Jackson D, Bender R, Kuss O, Langan D, Higgins JPT, et al. Methods to calculate uncertainty in the estimated overall effect size from a random-effects meta-analysis. Res Synth Methods. 2019;10(1):23–43. Epub 20181009. doi: 10.1002/jrsm.1319 . [DOI] [PubMed] [Google Scholar]
  • 31.Jackson D, Law M, Rucker G, Schwarzer G. The Hartung-Knapp modification for random-effects meta-analysis: A useful refinement but are there any residual concerns? Stat Med. 2017;36(25):3923–34. Epub 20170726. doi: 10.1002/sim.7411 ; PubMed Central PMCID: PMC5628734. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Wiksten A, Rücker G, Schwarzer G. Hartung-Knapp method is not always conservative compared with fixed-effect meta-analysis. Stat Med. 2016;35(15):2503–15. Epub 20160204. doi: 10.1002/sim.6879 . [DOI] [PubMed] [Google Scholar]
  • 33.Veroniki AA, Jackson D, Viechtbauer W, Bender R, Bowden J, Knapp G, et al. Methods to estimate the between-study variance and its uncertainty in meta-analysis. Res Synth Methods. 2016;7(1):55–79. Epub 20150902. doi: 10.1002/jrsm.1164 ; PubMed Central PMCID: PMC4950030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Tsang S, Royse CF, Terkawi AS. Guidelines for developing, translating, and validating a questionnaire in perioperative and pain medicine. Saudi J Anaesth. 2017;11(Suppl 1):S80–S9. doi: 10.4103/sja.SJA_203_17 ; PubMed Central PMCID: PMC5463570. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Chung Y, Rabe-Hesketh S, Choi IH. Avoiding zero between-study variance estimates in random-effects meta-analysis. Stat Med. 2013;32(23):4071–89. Epub 20130513. doi: 10.1002/sim.5821 . [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Ashraful Hoque

15 Jan 2024

PONE-D-23-41795

Do Statistical Heterogeneity methods impact the results of Meta- Analyses? A Meta Epidemiological Study.

PLOS ONE

Dear Dr. Mheissen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Feb 26 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Ashraful Hoque, MD

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at 

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and 

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please include captions for your Supporting Information files at the end of your manuscript, and update any in-text citations to match accordingly. Please see our Supporting Information guidelines for more information: http://journals.plos.org/plosone/s/supporting-information. 

3. We note that this manuscript is a systematic review or meta-analysis; our author guidelines therefore require that you use PRISMA guidance to help improve reporting quality of this type of study. Please upload copies of the completed PRISMA checklist as Supporting Information with a file name “PRISMA checklist”.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Abstract: In the results section of your abstract you wrote (n=111, 76%). This is unclear. Use a semicolon instead of a comma.

I notice here that you give different decimal places. The manuscript should be uniform. The same decimal places should be given throughout the manuscript.

Is the “Clinical relevance” heading in the abstract necessary? If recommend deleting it since you have stated your conclusion already.

Main text:

What I am missing is a meta-analyses quality assessment, using AMSTAR 2 (A MeaSurement Tool to Assess Systematic Reviews 2).

Reviewer #2: The authors aimed to summarise the impact of different estimators on the results of meta-analyses with less than five studies in the field of orthodontics. Below I provide some comments in which I believe can improve the manuscript.

[1] Abstract | Please cross-check the abstract to PRISMA abstract guidelines. EG dates of searches should be mentioned.

[2] Methods | Given this was a systematic review of systematic reviews I would suggest formatting the paper and report in line with PRISMA guidelines. See here: http://www.prisma-statement.org/

[3] Results – Characteristics | Regarding, “Likewise, most of the included MAs did not report the

heterogeneity estimator (n=99, 83%)”, which heterogeneity estimator are you referring to, or are you referring to all estimators? Please clarify.

[4] Results – Characteristics | RE “The most frequently used software was RevMan (n=112; 77%),”. My understanding if that RevMan automatically uses the DL methods of estimation. Above, you note that DL was only report in n=14 reviews. Could it be assumed that most studies used the DL method if this is the case? I think this may be worth clarifying if so here.

[5] Results/Discussion | You mention that the confidence intervals were narrower when heterogeneity was zero for HJSK. Sometimes ad-hoc variance correction is recommended in these cases (e.g., using the wider of the confidence intervals): https://pubmed.ncbi.nlm.nih.gov/28748567/. I think this should be highlighted further in the discussion in the second paragraph.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Mar 19;19(3):e0298526. doi: 10.1371/journal.pone.0298526.r002

Author response to Decision Letter 0


18 Jan 2024

Dear Editor Ashraful Hoque,

We thank you and the reviewers for the comments and suggestions that will improve our manuscript. Please find below the responses and actions taken.

Reviewers' Comments to Author:

Reviewer #1: Abstract: In the results section of your abstract, you wrote (n=111, 76%). This is unclear. Use a semicolon instead of a comma.

Author response: We have amended the abstract accordingly.

I notice here that you give different decimal places. The manuscript should be uniform. The same decimal places should be given throughout the manuscript.

Author response: All decimal places have been amended to two decimal places in the tables and throughout the manuscript except for the p-value which is less than 1 and the other decimals are important.

Is the “Clinical relevance” heading in the abstract necessary? If recommend deleting it since you have stated your conclusion already.

Author response: the clinical relevance heading was removed according to the reviewer’s comment.

Main text:

What I am missing is a meta-analyses quality assessment, using AMSTAR 2 (A MeaSurement Tool to Assess Systematic Reviews 2).

Author response: We appreciate the reviewer's input. However, a recent methodological study has already evaluated the methodological quality of orthodontic systematic reviews(1) in 2021. Consequently, we opted not to include this assessment to avoid duplicating research efforts. Additionally, it is important to note that the primary aim of our study was to examine the impact of different tau2 estimators on the statistical significance of results and confidence intervals rather than focusing on the methodological quality of the included studies.

Reviewer #2: The authors aimed to summarise the impact of different estimators on the results of meta-analyses with less than five studies in the field of orthodontics. Below I provide some comments in which I believe can improve the manuscript.

We want to thank you and the reviewers for the comments and suggestions that we think will improve our manuscript.

[1] Abstract | Please cross-check the abstract to PRISMA abstract guidelines. EG dates of searches should be mentioned.

Author response: the search date was added to the abstract, and the structure of the abstract was changed to follow PRISMA abstract guidelines.

[2] Methods | Given this was a systematic review of systematic reviews I would suggest formatting the paper and report in line with PRISMA guidelines. See here: http://www.prisma-statement.org/

Author response: We attempted to report the paper according to modified PRISMA guidelines for methodological studies.(2)

[3] Results – Characteristics | Regarding, “Likewise, most of the included MAs did not report the

heterogeneity estimator (n=99, 83%)”, which heterogeneity estimator are you referring to, or are you referring to all estimators? Please clarify.

Author response: we are referring to the heterogeneity estimator, whatever it was. The line was amended accordingly.

[4] Results – Characteristics | RE “The most frequently used software was RevMan (n=112; 77%),”. My understanding if that RevMan automatically uses the DL methods of estimation. Above, you note that DL was only report in n=14 reviews. Could it be assumed that most studies used the DL method if this is the case? I think this may be worth clarifying if so here.

Author response: the reporting of the heterogeneity method reflects the authors' knowledge. Hence, we tried to examine the reporting method used. We concur with the reviewer's observation that DL was the default method in most of our samples. We clarified this in the first paragraph of the discussion: Although there was a lack of reporting of the between-study variance method in 83% (99/119) of random meta-analyses, 98% (97/99) of those MAs were conducted using RevMan or CMA, where the default method of between-study variance is DL.

[5] Results/Discussion | You mention that the confidence intervals were narrower when heterogeneity was zero for HJSK. Sometimes ad-hoc variance correction is recommended in these cases (e.g., using the wider of the confidence intervals): https://pubmed.ncbi.nlm.nih.gov/28748567/. I think this should be highlighted further in the discussion in the second paragraph.

Author response: Thank you for bringing this to our attention. We have accordingly revised the discussion. Additionally, the last paragraph in the discussion aligns with the findings of the mentioned study, advocating for the use the modified methods as sensitivity analysis:

Although making a clear recommendation about the best methods is difficult in case of few studies, we suggest using DL method for pooling a small number of primary studies (fewer than 5) in MAs when I2=0% and sensitivity analysis using both HKSJ and DL for pooling MAs when I2>0.

Reference

1. Hooper EJ, Pandis N, Cobourne MT, Seehra J. (2021) Methodological quality and risk of bias in orthodontic systematic reviews using AMSTAR and ROBIS. Eur J Orthod, 43: 544-550

2. Murad MH, Wang Z. (2017) Guidelines for reporting meta-epidemiological methodology research. Evid Based Med, 22: 139-142

Attachment

Submitted filename: Rebuttal letter PLOS CFM Jan 19.docx

pone.0298526.s004.docx (24.4KB, docx)

Decision Letter 1

Ashraful Hoque

26 Jan 2024

Acceptance letter

Ashraful Hoque

3 Mar 2024

PONE-D-23-41795R1

PLOS ONE

Dear Dr. Mheissen,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission,

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Ashraful Hoque

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Search strategy in PubMed.

    (DOCX)

    pone.0298526.s001.docx (17.6KB, docx)
    S2 Table. Excluded studies with reason.

    (DOCX)

    pone.0298526.s002.docx (29.7KB, docx)
    S1 Checklist. PRISMA 2020 checklist.

    (DOCX)

    pone.0298526.s003.docx (21.2KB, docx)
    Attachment

    Submitted filename: Rebuttal letter PLOS CFM Jan 19.docx

    pone.0298526.s004.docx (24.4KB, docx)

    Data Availability Statement

    All relevant data are within the paper and its Supporting Information files.


    Articles from PLOS ONE are provided here courtesy of PLOS

    RESOURCES