Abstract
Regression-based methods for the detection of publication bias in meta-analysis have been extensively evaluated in the literature. When dealing with continuous outcomes, hidden factors such as heteroscedasticity may interfere with the test statistics. In this paper we investigate the influence of residual heteroscedasticity on the performance of four tests for publication bias: the Egger test, the Begg-Mazumdar test and two tests based on weighted regression. In the presence of heteroscedasticity, the Egger test and the weighted regression tests strongly inflate the Type I error rate, while the Begg-Mazumdar test deflates it. Although all four tests already have low statistical power, heteroscedasticity typically reduces it further. Our results, in combination with earlier discussions of publication bias tests, lead us to conclude that the application of these tests to continuous treatment effects is not warranted.
Keywords: Heteroscedastic mixed effects model, Aggregated data meta-analysis, Mean difference treatment effect sizes
1. Introduction
In a meta-analysis, publication bias can lead to an incorrect pooled estimate of a treatment effect. In the presence of publication bias, the treatment effect is associated with factors that affect publication bias, e.g., the size of the standard error of the treatment effect or the size of the study. Thus studies with a lack of statistical significance or smaller studies are less likely to be published.
Several methods have been proposed in the literature to test for this type of publication bias, e.g., the Egger test [1], the rank-correlation test [2], and several others [1,[3], [4], [5], [6], [7], [8], [9]]. These tests allow for heteroscedastic residual variances of the study effect sizes, but when their performance was studied, the underlying data models were typically homoscedastic once sample size differences and the realization of standard errors are ignored. We believe that homoscedasticity is not always a valid assumption, especially when dealing with continuous outcomes on individuals. For many clinical and social outcomes the variance may be (inversely) proportional to the mean (e.g., blood pressure in cardiovascular disease [10], forced expiratory volume in respiratory disease [11], dairy sire evaluation in genetics [12], the smoking-mood relationship in psychology [13], income and consumption in economics [14], grade point average in education [15], stock market volatility in finance [16], radioimmunoassay in biology [17], pharmacokinetics, enzyme kinetics, and chemical kinetics in pharmacology [18], as well as inequalities in sociology [19]).
In the case of heteroscedasticity, studies may or may not follow the premises of the test statistics, but when the variance of the outcome is correlated with its mean, the resulting treatment effect estimates will be correlated with their standard errors as well. This may lead to the detection of artificial “publication bias” without the presence of a real publication bias process. Since treatment-related heteroscedasticity is not testable with aggregated data in a meta-analysis, there is no guarantee that positive test results for publication bias are truly positive at all. Therefore, it is crucial to better understand the performance of publication bias tests for aggregated data under heteroscedasticity.
The objective of this paper is to demonstrate that heteroscedasticity negatively affects the performance of test statistics for publication bias in the case of continuous outcomes. We perform a simulation study, since analytical investigations are complicated. We ignore bias adjustment methods, because we expect that correction of pooled estimates for publication bias is even more difficult when the presence or absence of publication bias is hard to determine. The Egger test, the rank-correlation test and two tests based on a weighted regression model [4] are considered. The choice of the Egger test follows from the objective of this paper, since it tests the dependence of the standardized estimated treatment effect on the precision of the study effect size (i.e., the inverse of its standard error). The two weighted regression methods were included because they were recommended for practice in the literature [4], and they might in theory be robust to heteroscedasticity. We also included the rank-correlation test because it is not a regression-based method and it is based on the correlation between the normalized treatment effect and its standard error. We did not consider publication bias tests for other outcome types (e.g., binary outcomes) or other forms of publication bias (e.g., language bias) [1,3,[5], [6], [7], [8], [9]].
In Section 2 we describe the four test statistics for publication bias in an aggregated-data meta-analysis. In the same section, we introduce a heteroscedastic mixed effects model for individual participants in a study [17,20]. This model is used to formulate the aggregated treatment effect per study and to simulate data. The third part of Section 2 describes a publication bias mechanism, based on concepts for study selection described in the literature [1,8,21,22]. Section 3 then describes the simulation study and the choices of parameter settings. Section 4 presents the simulation results and a discussion is provided in Section 5.
2. Methods
2.1. Tests for publication bias on aggregated data
An aggregated data meta-analysis usually consists of treatment effect estimates $\hat{\beta}_i$ obtained from $k$ different studies ($i = 1, 2, \dots, k$), accompanied by their estimated standard errors $\widehat{SE}_i$ [23]. The four test statistics for testing the hypothesis of no publication bias investigate the association between $\hat{\beta}_i$ and $\widehat{SE}_i$.
Egger's Test: The Egger test uses a regression model with the standardized effect size $\hat{\beta}_i / \widehat{SE}_i$ as response variable and the precision $1/\widehat{SE}_i$ as the independent variable. Using a t-test, the null hypothesis of no publication bias is rejected if the intercept of the regression model significantly deviates from zero [1].
Weighted regression: A weighted regression method uses $\hat{\beta}_i$ as the response variable, $\widehat{SE}_i$ as the independent variable and $1/(\widehat{SE}_i^2 + \hat{\tau}^2)$ as the weight,¹ with the between-study variance $\hat{\tau}^2$ estimated with the DerSimonian and Laird method [24]. Using a t-test, the null hypothesis of no publication bias is rejected if the slope of the independent variable significantly deviates from zero [4]. This method will be referred to as the weighted DL test. However, the DerSimonian-Laird estimate has been found to underestimate the between-study variance, thereby producing confidence intervals for the mean treatment effect that are too narrow [25]. Instead of the DerSimonian-Laird estimator, the Restricted Maximum Likelihood (REML) estimator has been recommended in the literature [26]. We therefore also considered the weighted regression model with $\hat{\tau}^2$ estimated by REML, here referred to as the weighted REML test. We used procedure MIXED in SAS, version 9.4, to calculate this REML estimate.
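The weighted DL test described above can be sketched in a few lines. This is a minimal illustration of the procedure, not the SAS implementation used in this paper; the function names are ours.

```python
import numpy as np
from scipy import stats

def dersimonian_laird_tau2(theta, se):
    """DerSimonian-Laird (method of moments) estimate of the
    between-study variance tau^2, truncated at zero."""
    w = 1.0 / se**2
    theta_fixed = np.sum(w * theta) / np.sum(w)   # fixed-effect pooled estimate
    q = np.sum(w * (theta - theta_fixed) ** 2)    # Cochran's Q statistic
    k = len(theta)
    denom = np.sum(w) - np.sum(w**2) / np.sum(w)
    return max(0.0, (q - (k - 1)) / denom)

def weighted_dl_test(theta, se):
    """Weighted regression test for publication bias: regress theta_i on
    se_i with weights 1/(se_i^2 + tau^2) and t-test the slope."""
    tau2 = dersimonian_laird_tau2(theta, se)
    w = 1.0 / (se**2 + tau2)
    X = np.column_stack([np.ones_like(se), se])
    xtwx = X.T @ (w[:, None] * X)
    beta = np.linalg.solve(xtwx, X.T @ (w * theta))
    resid = theta - X @ beta
    df = len(theta) - 2
    s2 = np.sum(w * resid**2) / df                # weighted residual variance
    cov = s2 * np.linalg.inv(xtwx)
    t_slope = beta[1] / np.sqrt(cov[1, 1])
    return t_slope, 2 * stats.t.sf(abs(t_slope), df)
```

The weighted REML variant only replaces the moment estimator of the between-study variance by a REML estimate; the regression step is identical.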
Rank correlation: The rank-correlation test applies Kendall's tau correlation coefficient to the normalized treatment effect $t_i^* = (\hat{\beta}_i - \bar{\beta})/\sqrt{v_i^*}$ and the variance $\widehat{SE}_i^2$ of the study effect size, with $\bar{\beta}$ the weighted (fixed) effect size and $v_i^* = \widehat{SE}_i^2 - (\sum_{j=1}^{k} \widehat{SE}_j^{-2})^{-1}$ the estimated variance of the normalized effect size (under the assumption of homogeneity of effect sizes). The standardized version of this non-parametric test statistic follows approximately a standard normal distribution when $t_i^*$ and $\widehat{SE}_i^2$ are independent. Thus the associated p-value is used to test the hypothesis of no publication bias [2,8].
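The Egger and rank-correlation tests can likewise be sketched as follows; this is an illustrative reimplementation under the definitions above, not the original code of the cited authors.

```python
import numpy as np
from scipy import stats

def egger_test(theta, se):
    """Egger's test: regress the standardized effect theta/se on the
    precision 1/se; two-sided t-test on the intercept."""
    y = theta / se                         # standardized effect sizes
    x = 1.0 / se                           # precisions
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    df = len(y) - 2
    s2 = resid @ resid / df
    cov = s2 * np.linalg.inv(X.T @ X)
    t0 = beta[0] / np.sqrt(cov[0, 0])      # t-statistic of the intercept
    return t0, 2 * stats.t.sf(abs(t0), df)

def rank_correlation_test(theta, se):
    """Begg-Mazumdar test: Kendall's tau between the normalized effect
    sizes and the variances of the study effect sizes."""
    w = 1.0 / se**2
    theta_bar = np.sum(w * theta) / np.sum(w)       # weighted (fixed) effect size
    v_star = se**2 - 1.0 / np.sum(w)                # variance of theta_i - theta_bar
    t_star = (theta - theta_bar) / np.sqrt(v_star)  # normalized effect sizes
    tau, p = stats.kendalltau(t_star, se**2)
    return tau, p
```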
2.2. Heteroscedastic mixed effects model on individual participants
Let $Y_{hgi}$ denote the continuous response variable of individual $h$, exposed to treatment $g$ ($g = 1$ for treatment, $g = 0$ for control), in study $i$. A heteroscedastic linear mixed effects model per individual [20] can then be described as:

$$Y_{hgi} = \mu_g + a_{gi} + g\,b_i + \sigma_g \exp(u_i/2)\,\varepsilon_{hgi} \tag{1}$$

with $\mu_g$ the mean of treatment or control group $g$, $\beta = \mu_1 - \mu_0$ the mean treatment effect, $a_{gi}$ a study-specific random effect for group $g$, $b_i$ a random treatment effect for study $i$, $\sigma_g^2$ a treatment specific residual variance parameter, $u_i$ normally distributed, and $\varepsilon_{hgi}$ standard normally distributed and independent of the random effects $a_{0i}$, $a_{1i}$, $b_i$, and $u_i$. Residual heteroscedasticity at the individual level is introduced via the parameter $\sigma_g^2$ and the random term $u_i$. The variance $\sigma_g^2$ indicates a fixed heteroscedasticity in variability between individuals for the two treatment groups (i.e., treatment affects both the level and the variability) and $u_i$ indicates a random heteroscedasticity between individuals across studies (i.e., individuals are more or less alike within studies). Thus, the random variable $u_i$ makes it a heteroscedastic mixed effects model and not the variance parameters $\sigma_0^2, \sigma_1^2$, because these parameters only introduce heteroscedasticity within a study. If $u_i$ is degenerate at 0, model (1) becomes a simple mixed effects model.
It is assumed that $(a_{0i}, a_{1i}, b_i, u_i)$ has a multivariate normal distribution with zero means and variance-covariance matrix given by

$$\begin{pmatrix} \sigma_a^2 & \rho\,\sigma_a^2 & 0 & 0 \\ \rho\,\sigma_a^2 & \sigma_a^2 & 0 & 0 \\ 0 & 0 & \sigma_b^2 & \rho_{bu}\,\sigma_b\sigma_u \\ 0 & 0 & \rho_{bu}\,\sigma_b\sigma_u & \sigma_u^2 \end{pmatrix}$$

The value of $\rho$ represents the correlation between the study-specific random effects $a_{1i}$ and $a_{0i}$ for the treatment and the control group, respectively. The random treatment effect $b_i$ represents the study heterogeneity of the study effect size as follows: if $\rho = 1$ and $\sigma_b^2 = 0$, the heterogeneity term $b_i + a_{1i} - a_{0i}$ is degenerate at zero or non-existent, while for all other settings of $\rho$, $\sigma_a^2$, and $\sigma_b^2$ it will lead to study heterogeneity. The value $\rho_{bu}$ represents the correlation between the mean and the logarithm of the random heteroscedastic residual variance.
The treatment effect per study is given by the raw mean difference $\hat{\beta}_i = \bar{Y}_{1i} - \bar{Y}_{0i}$ for study $i$, where $\bar{Y}_{gi}$ is the average value for group $g$ in study $i$. The standard error for the effect size in study $i$ is given by $\widehat{SE}_i = \sqrt{s_{1i}^2/n_{1i} + s_{0i}^2/n_{0i}}$, where $s_{gi}^2$ is the sample variance and $n_{gi}$ the sample size for treatment group $g$ in study $i$. Based on model (1), the treatment effect can be written into the well-known random effects model² for meta-analysis studies [23]

$$\hat{\beta}_i = \beta + \theta_i + \varepsilon_i \tag{2}$$

with $\beta = \mu_1 - \mu_0$, $\theta_i = b_i + a_{1i} - a_{0i}$ the random study effect, and $\varepsilon_i$ the residual with conditional variance $\exp(u_i)(\sigma_1^2/n_{1i} + \sigma_0^2/n_{0i})$ given $u_i$. Without the existence of $u_i$, the residuals in (2) are homoscedastic if sample sizes are consistent across studies. The variance $\widehat{SE}_i^2$ can be rewritten into

$$\widehat{SE}_i^2 = \exp(u_i)\left(\frac{\sigma_1^2}{n_{1i}}\,\frac{V_{1i}}{n_{1i}-1} + \frac{\sigma_0^2}{n_{0i}}\,\frac{V_{0i}}{n_{0i}-1}\right) \tag{3}$$

with $V_{gi}$ chi-square distributed with $n_{gi} - 1$ degrees of freedom.
Clearly, the introduced random heteroscedasticity $u_i$ affects both the treatment effect $\hat{\beta}_i$ and the standard error $\widehat{SE}_i$. As a consequence, $u_i$ affects the responses and the independent variables used in the Egger, the weighted DL, the weighted REML, and the rank correlation test. It also makes an analytical investigation of the test statistics more complex, since the joint distribution of the responses and independent variables is less tractable. We therefore studied the influence of the random residual heteroscedasticity on the four test statistics by simulation.
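A minimal simulation of a single study under this kind of model might look as follows. This is a sketch, not the paper's simulation code: the study-specific group effects are omitted for brevity, all parameter names and default values are illustrative, and the residual standard deviation is taken as $\sigma \exp(u_i/2)$ so that $u_i$ acts on the logarithm of the residual variance.

```python
import numpy as np

rng = np.random.default_rng(2024)

def simulate_study(n1, n0, mu0=0.0, beta=0.3, sigma=1.0,
                   sd_b=0.2, sd_u=0.5, rho_bu=0.0):
    """Simulate one study under a heteroscedastic mixed effects model.
    b_i (random treatment effect) and u_i (log variance factor) are
    drawn jointly normal with correlation rho_bu; the residual standard
    deviation in both groups is sigma * exp(u_i / 2)."""
    cov = [[sd_b**2, rho_bu * sd_b * sd_u],
           [rho_bu * sd_b * sd_u, sd_u**2]]
    b_i, u_i = rng.multivariate_normal([0.0, 0.0], cov)
    y1 = mu0 + beta + b_i + sigma * np.exp(u_i / 2) * rng.standard_normal(n1)
    y0 = mu0 + sigma * np.exp(u_i / 2) * rng.standard_normal(n0)
    theta = y1.mean() - y0.mean()                            # raw mean difference
    se = np.sqrt(y1.var(ddof=1) / n1 + y0.var(ddof=1) / n0)  # its standard error
    return theta, se
```

With `rho_bu != 0`, repeated calls produce study effect sizes that are correlated with their standard errors even without any selection process, which is exactly the mechanism this paper studies.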
2.3. Publication bias mechanism on aggregated data
We briefly describe the selection model which creates publication bias at the study level [5,21]. For each study $i$ in the meta-analysis, the selection model assumes a latent variable $L_i$ that depends on the standardized mean difference $t_i = \hat{\beta}_i / \widehat{SE}_i$. If the latent variable is positive ($L_i > 0$), study $i$ is published and appears in the meta-analysis, but when it is non-positive ($L_i \le 0$) the study is not published, which may create selection bias in the meta-analysis. The latent variable is given by

$$L_i = \alpha + \gamma\,t_i + \delta_i \tag{4}$$

with $\alpha$ and $\gamma$ two constants and $\delta_i$ standard normally distributed and independent of all other random terms. Thus the larger the standardized treatment effect, the larger the probability of being selected (assuming that effect sizes are more frequently positive).
Note that the selection process of studies is affected by the random residual heteroscedasticity through the standardized effect size $t_i$. Using (2) and (3), the standardized effect size can be rewritten as

$$t_i = \frac{\beta + \theta_i + \varepsilon_i}{\widehat{SE}_i} \tag{5}$$

with $\beta$, $\theta_i$, and $\varepsilon_i$ as in (2) and $\widehat{SE}_i$ as in (3). Thus the difference in behavior of $t_i$ with and without heteroscedasticity is determined by the factor $\exp(u_i)$ in (3): without heteroscedasticity the distribution of $t_i$ is symmetric and approximately normal, while under heteroscedasticity it is skewed to the right and non-normal. Combined with the choices for the constants $\alpha$ and $\gamma$, the probability of selecting a study is lower under heteroscedasticity than under homoscedasticity when these studies would have the same standardized effect size $t_i$.
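The latent-variable selection step in (4) translates directly into code. The sketch below assumes, as in the model description, that the publication probability given $t_i$ equals $\Phi(\alpha + \gamma t_i)$; the function name is ours.

```python
import numpy as np

rng = np.random.default_rng(7)

def select_studies(theta, se, alpha, gamma):
    """Selection model (4): study i is published iff the latent variable
    L_i = alpha + gamma * t_i + delta_i is positive, with
    t_i = theta_i / se_i and delta_i standard normal, so the publication
    probability given t_i is Phi(alpha + gamma * t_i)."""
    t = np.asarray(theta) / np.asarray(se)
    L = alpha + gamma * t + rng.standard_normal(t.shape)
    return L > 0   # boolean mask of published studies
```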
3. Simulation study
We simulated a meta-analysis with $k$ studies and varied the sample size $n_i$ for study $i$. This sample size was selected from an overdispersed Poisson distribution, i.e., $n_i \sim \text{Poisson}(\lambda_i)$, with $\lambda_i$ drawn from a gamma distribution. Then within each study the participants were randomly allocated to the treatment and the control group with equal probabilities, resulting in $n_{1i}$ participants in the treatment group and $n_{0i} = n_i - n_{1i}$ participants in the control group. The continuous response was then simulated according to the heteroscedastic linear mixed effects model described in Section 2.2. The data from this model were then used to calculate the study effect size $\hat{\beta}_i$ and its standard error $\widehat{SE}_i$.
To introduce publication bias, a selection process is simulated according to the selection model described in Section 2.3. We used the 5% and 95% quantiles of the set of standardized treatment effects $t_1, t_2, \dots, t_k$, and denote them by $q_{0.05}$ and $q_{0.95}$, respectively. The values $\alpha$ and $\gamma$ are chosen such that $P(L_i > 0 \mid t_i = q_{0.05})$ is small and $P(L_i > 0 \mid t_i = q_{0.95})$ is large. Thus relatively small standardized effect sizes, with respect to the other studies, will be published with low probability and large standardized effect sizes will be published almost always. That the publication of one study depends on other studies may be reasonable if more research is already known on the same topic. Solving the two probability equations $P(L_i > 0 \mid t_i = q) = \Phi(\alpha + \gamma q)$ results in values for $\alpha$ and $\gamma$, using the normality assumption of the random term $\delta_i$ and its independence of the standardized treatment effects. Note that creating $\alpha$ and $\gamma$ in this way results in different values for $\alpha$ and $\gamma$ per simulated meta-analysis.
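Given target publication probabilities at the two quantiles, the two probit equations can be solved for $\alpha$ and $\gamma$ in closed form. The probabilities below (0.1 and 0.9) are illustrative placeholders, not the values used in the paper.

```python
import numpy as np
from scipy.stats import norm

def selection_constants(t, p_lo=0.1, p_hi=0.9):
    """Solve P(L_i > 0 | t_i = q_0.05) = p_lo and
    P(L_i > 0 | t_i = q_0.95) = p_hi for alpha and gamma, using
    P(L_i > 0 | t_i) = Phi(alpha + gamma * t_i).
    The targets p_lo and p_hi are illustrative."""
    q05, q95 = np.quantile(t, [0.05, 0.95])
    z_lo, z_hi = norm.ppf(p_lo), norm.ppf(p_hi)   # probit-transformed targets
    gamma = (z_hi - z_lo) / (q95 - q05)           # slope from the two equations
    alpha = z_lo - gamma * q05                    # intercept
    return alpha, gamma
```

Because the quantiles are recomputed from the realized $t_1, \dots, t_k$, the constants differ per simulated meta-analysis, as noted above.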
Different simulation settings were considered, both with and without publication bias and with and without random heteroscedasticity. The settings of the parameters were chosen such that the simulation corresponds approximately to a meta-analysis of clinical trials on, for instance, hypertension treatment; in particular, the number of studies $k$ was varied over 20, 50, and 100, and the correlation between the random heterogeneity and heteroscedasticity over $-0.7$ to $0.7$ (Table 1, Table 2). We ran all combinations of parameter choices and simulated 1000 meta-analysis studies per setting.
The four publication bias methods in Section 2.1 were then applied to the studies that remained in the meta-analysis after selection. The four test statistics were applied to the same meta-analysis data and were considered significant at the level of 0.1 in order for our results to be comparable to other studies [4,6,[27], [28], [29], [30]]. We study the Type I error rate and the power of these tests. We also report the effective number of studies used in the meta-analyses. The simulation of the meta-analysis data and the analysis of the data were conducted with SAS software, version 9.4.
4. Results
The Type I error rate and the power of Egger's test, the rank correlation test (RC), the weighted DL test (wDL) and the weighted REML test (wREML) are presented in Table 1 and Table 2, respectively. For the power values, the effective number of studies (as a percentage of the number of simulated studies $k$) ranged from 37.18% to 39.55% on average across the settings of $k$.
Table 1.
Type I error rate (%) of the four tests on publication bias, for $k$ studies. The columns $-0.7$ to $0.7$ give the correlation between the random heterogeneity and the random heteroscedasticity; the last column gives the homoscedastic reference ($\sigma_u^2 = 0$).

| $k$ | Test | $-0.7$ | $-0.5$ | $-0.3$ | $0$ | $0.3$ | $0.5$ | $0.7$ | $\sigma_u^2 = 0$ |
|---|---|---|---|---|---|---|---|---|---|
| 20 | Egger | 20.5 | 19.4 | 17.4 | 16.5 | 16.6 | 17.0 | 17.5 | 18.5 |
| | wDL | 14.2 | 12.7 | 10.8 | 10.7 | 10.0 | 10.3 | 10.2 | 10.3 |
| | wREML | 13.9 | 12.4 | 10.7 | 10.2 | 10.0 | 10.2 | 9.9 | 10.0 |
| | RC | 6.4 | 5.3 | 5.1 | 4.5 | 4.2 | 5.3 | 5.1 | 8.6 |
| 50 | Egger | 25.6 | 24.5 | 22.9 | 22.1 | 21.7 | 23.9 | 27.2 | 21.4 |
| | wDL | 16.0 | 12.3 | 11.5 | 11.7 | 13.3 | 13.6 | 16.9 | 11.8 |
| | wREML | 16.2 | 12.5 | 11.3 | 11.5 | 12.9 | 13.6 | 16.7 | 12.0 |
| | RC | 8.3 | 7.2 | 6.1 | 5.4 | 5.7 | 6.8 | 7.7 | 9.3 |
| 100 | Egger | 35.0 | 29.8 | 27.7 | 25.7 | 28.7 | 31.5 | 37.0 | 25.6 |
| | wDL | 21.5 | 15.1 | 12.1 | 10.3 | 13.4 | 17.0 | 21.9 | 10.0 |
| | wREML | 21.2 | 15.4 | 12.1 | 10.6 | 13.3 | 16.9 | 22.2 | 10.4 |
| | RC | 10.4 | 8.5 | 6.4 | 6.4 | 6.7 | 8.5 | 11.2 | 9.3 |
Table 2.
Power (%) of the four tests on publication bias, for $k$ studies. The columns $-0.7$ to $0.7$ give the correlation between the random heterogeneity and the random heteroscedasticity; the last column gives the homoscedastic reference ($\sigma_u^2 = 0$).

| $k$ | Test | $-0.7$ | $-0.5$ | $-0.3$ | $0$ | $0.3$ | $0.5$ | $0.7$ | $\sigma_u^2 = 0$ |
|---|---|---|---|---|---|---|---|---|---|
| 20 | Egger | 19.4 | 20.5 | 22.3 | 24.1 | 25.3 | 27.5 | 30.4 | 25.9 |
| | wDL | 18.9 | 19.1 | 20.4 | 22.3 | 25.1 | 25.4 | 26.7 | 21.8 |
| | wREML | 18.6 | 18.5 | 20.4 | 22.0 | 24.5 | 25.5 | 26.3 | 21.3 |
| | RC | 11.7 | 12.6 | 13.6 | 14.9 | 17.2 | 19.4 | 17.9 | 17.1 |
| 50 | Egger | 30.6 | 32.6 | 35.3 | 38.1 | 42.8 | 48.5 | 52.8 | 50.0 |
| | wDL | 27.8 | 32.9 | 35.3 | 40.0 | 45.7 | 50.2 | 55.4 | 46.2 |
| | wREML | 27.4 | 32.7 | 34.4 | 40.1 | 45.9 | 50.4 | 55.3 | 45.9 |
| | RC | 13.9 | 17.4 | 19.3 | 24.2 | 26.1 | 31.1 | 33.5 | 34.1 |
| 100 | Egger | 40.7 | 46.2 | 51.5 | 58.5 | 64.9 | 68.1 | 71.7 | 70.2 |
| | wDL | 42.7 | 48.9 | 56.2 | 64.1 | 72.5 | 75.9 | 81.0 | 69.8 |
| | wREML | 42.9 | 49.3 | 55.4 | 63.8 | 72.4 | 76.1 | 81.0 | 69.8 |
| | RC | 24.8 | 28.5 | 32.8 | 40.4 | 47.7 | 50.9 | 56.7 | 54.1 |
From Table 1, it can be seen that when heteroscedasticity is absent ($\sigma_u^2 = 0$), the weighted regression approaches and the rank correlation approach show a roughly nominal Type I error rate of approximately 10% (although the rank correlation test was slightly conservative at $k = 20$). When heteroscedasticity is introduced, the Type I error rates of the Egger test, the weighted DL test and the weighted REML test are close to their Type I error rates under homoscedasticity when $k = 20$ and the correlation is close to zero or positive. However, the Type I error rate increases for these three tests when the correlation becomes strongly negative, compared to their Type I error rates under homoscedasticity. When the number of studies increases we see a different pattern: the Type I error rates of these three tests increase as the correlation moves away from zero in either direction.
When publication bias is introduced, heteroscedasticity influences the statistical power of the four test statistics in a different way than it does the Type I error rates. The power increases with the correlation between heterogeneity and heteroscedasticity. This makes sense: if the heteroscedasticity is negatively correlated with the heterogeneity of effect sizes, this correlation reduces the positive correlation between study effect sizes and standard errors that is introduced by the publication bias (antagonism). For positive values of the correlation the publication bias is increased (synergism). However, it is somewhat more complicated than just the presence of synergism and antagonism, because the heteroscedasticity also affects the publication bias mechanism (see Fig. 1). Under heteroscedasticity non-selected studies may have higher standardized effect sizes than under homoscedasticity, while selected studies may have lower standardized effect sizes than under homoscedasticity (see Fig. 1). As a consequence, the power of all four test statistics at zero correlation is lower than the power under homoscedasticity when the number of studies is relatively large, due to this altered publication bias mechanism.
Fig. 1.
Histograms of the standardized treatment effects under homoscedasticity and three settings of heteroscedasticity. The red histograms represent the studies that are eliminated from the simulated meta-analysis and the blue histograms represent the studies that are selected. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)
5. Discussion and conclusion
The performance of the Egger test, the rank correlation test, the weighted DL test and the weighted REML test was investigated for testing publication bias in the presence of residual heteroscedasticity for mean differences. Residual heteroscedasticity for mean differences is plausible and realistic, since it represents heterogeneity in inter-participant variability across studies. Indeed, participants can be more alike in one study than in another study, depending on the implemented selection criteria for the study participants. For instance, pragmatic clinical trials may hardly use any selection criteria, since they focus on effectiveness, while other clinical trials may focus on efficacy in a specific set of participants with a particular symptom or disease in a limited age range. Note that the form of heteroscedasticity that we included has been studied in more sophisticated ways in the area of multilevel models [20].
In the simulation study, we found the Type I error rate of Egger's test inflated. This is a well-known phenomenon for log odds ratios [31] and standardized mean differences [32,33], but in our simulation such inflation is unexpected. The cause is related to the way we simulated the data. We chose to simulate data based on an individual participant data model, which is different from the simulations based on aggregated data models that other researchers have used to study the performance of tests for publication bias [22,24,28]. In our simulation, the distribution of the sample sizes affects Egger's test directly. If we select a smaller overdispersion parameter for the gamma distribution, the distribution of the sample sizes is less extreme or less skewed, and the Type I error rate for Egger's test reduces to 12.1%. One very large study in a meta-analysis becomes an influential point in Egger's regression analysis that strongly affects the test on the intercept, especially when heterogeneity in study effect sizes is not accounted for, as it is in the weighted regression approaches.
Heteroscedasticity renders the four tests unreliable for testing publication bias in meta-analysis. It causes the Type I error rate to deviate substantially from the nominal level, and also from the level obtained under homoscedasticity if the test was not nominal to begin with (Egger's test). For the Egger test, the introduction of heteroscedasticity increases the variability in the precisions $1/\widehat{SE}_i$. Together with the correlation between the heteroscedasticity and heterogeneity, the standard error of the intercept in the regression model reduces, leading to more rejections of the null hypothesis than under homoscedasticity. For the weighted regression approaches something similar is going on: the correlation between the heteroscedasticity and heterogeneity induces a correlation between $\hat{\beta}_i$ and $\widehat{SE}_i$, which causes an increased Type I error rate. For the rank-correlation test, introducing heteroscedasticity causes the Type I error rate to drop below the nominal level, but the larger the number of studies the smaller this effect. The introduction of heteroscedasticity reduces the variance of Kendall's tau statistic, thereby reducing the variance of the test statistic. This results in a non-standard normal distribution with a variance less than one, introducing the conservative Type I error rates. Furthermore, heteroscedasticity strongly decreases the power in most cases. Tests for publication bias have always shown low statistical power, even in the absence of heteroscedasticity [3,28,29,33]. With this known evidence and our newly presented criticism, testing for publication bias in meta-analysis with continuous outcomes should be avoided.
We did not include any of the publication bias methods (e.g., trim and fill, Copas’ selection method) that could correct the pooled estimate from a meta-analysis [9,22]. These estimation approaches typically make use of the correlation between the study effect size and its standard error in one way or another, as do the test statistics we investigated. Since heteroscedasticity distorts this relationship, we expect that these correction methods would not be able to correct the pooled estimate appropriately. As we have demonstrated, diagnosing publication bias using the correlation between effect size and standard error is strongly affected.
Heteroscedasticity also affected the mechanism of publication bias, making it difficult to disentangle these two mechanisms at an aggregated level. Thus when heteroscedasticity is anticipated from the topic of study, it may be recommended to collect and pool the individual participant data. The heteroscedasticity can then potentially be modeled, although it remains unknown how to incorporate the publication bias into this approach. More research is needed to be able to model the correlation between study effect sizes and their standard errors.
Acknowledgments
This research was funded by grant number 023.005.087 from the Netherlands Organization for Scientific Research.
Footnotes
1. In case the weight is changed to $1/\widehat{SE}_i^2$, the weighted regression approach is identical to Egger's test.
2. In the random effects model it is often assumed that the random study effect $\theta_i$ and the residual $\varepsilon_i$ are independent and normally distributed, but due to our random heteroscedastic variable $u_i$ both assumptions will be violated.
References
- 1.Egger M., Davey Smith G., Schneider M., Minder C. Bias in meta- analysis detected by a simple, graphical test. Br. Med. J. 1997;315(7109):629–634. doi: 10.1136/bmj.315.7109.629. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Begg C.B., Mazumdar M. Operating characteristics of a rank correlation test for publication bias. Biometrics. 1994;50(4):1088–1101. [PubMed] [Google Scholar]
- 3.Harbord R.M., Egger M., Sterne J.A. A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints. Stat. Med. 2006;25(20):3443–3457. doi: 10.1002/sim.2380. [DOI] [PubMed] [Google Scholar]
- 4.Jin Z., Wu C., Zhou X., He J. A modified regression method to test publication bias in meta-analyses with binary outcomes. BMC Med. Res. Methodol. 2014;14(132) doi: 10.1186/1471-2288-14-132. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Mavridis D., Welton N.J., Suttond A., Salantia G. A selection model for accounting for publication bias in a full network meta-analysis. Stat. Med. 2014;33(30):5399–5412. doi: 10.1002/sim.6321. [DOI] [PubMed] [Google Scholar]
- 6.Rucker G., Schwarzer G., Carpenter J. Arcsine test for publication bias in meta-analysis with binary outcomes. Stat. Med. 2008;27:746–763. doi: 10.1002/sim.2971. [DOI] [PubMed] [Google Scholar]
- 7.Schwarzer G., Antes G., Schumacher M. A test for publication bias in meta-analysis with sparse binary data. Stat. Med. 2007;26:721–733. doi: 10.1002/sim.2588. [DOI] [PubMed] [Google Scholar]
- 8.Sterne J.A., Egger M. Publication Bias in Meta-Analysis: Prevention, Assessment and Adjustments. 2005. Regression methods to detect publication and other bias in meta-analysis; pp. 99–110. [Google Scholar]
- 9.Zhu Q., Carriere K.C. Detecting and correcting for publication bias in meta-analysis - a truncated normal distribution approach. Stat. Methods Med. Res. 2018;27(9):2722–2741. doi: 10.1177/0962280216684671. [DOI] [PubMed] [Google Scholar]
- 10.Rothwell P.M., Howard S.C., Dolan E., O'Brien E., Dobson J.E., Dahlöf B., Poulter N.R. Prognostic significance of visit-to-visit variability, maximum systolic blood pressure, and episodic hypertension. Lancet. 2010;375(9718):895–905. doi: 10.1016/S0140-6736(10)60308-X. [DOI] [PubMed] [Google Scholar]
- 11.Dockery D.W., Ware J.H., Ferris B.G., Jr., Glicksberg D.S., Fay M.E., Spiro A., III, Speizer F.E. Distribution of forced expiratory volume in one second and forced vital capacity in healthy, white, adult never-smokers in six US cities. Am. Rev. Respir. Dis. 1985;131(4):511–520. doi: 10.1164/arrd.1985.131.4.511. [DOI] [PubMed] [Google Scholar]
- 12.Winkelman A., Schaeffer L.R. Effect of heterogeneity of variance on dairy sire evaluation. J. Dairy Sci. 1988;71(11):3033–3039. [Google Scholar]
- 13.Hedeker D., Mermelstein R.J., Berbaum M.L., Campbell R.T. Modeling mood variation associated with smoking: an application of a heterogeneous mixed-effects model for analysis of ecological momentary assessment (EMA) data. Addiction. 2009;104(2):297–307. doi: 10.1111/j.1360-0443.2008.02435.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Wang N. Generalizing the permanent-income hypothesis: revisiting Friedman's conjecture on consumption. J. Monetary Econ. 2006;53(4):737–752. [Google Scholar]
- 15.Hayes A.F., Cai L. Using heteroskedasticity-consistent standard error estimators in OLS regression: an introduction and software implementation. Behav. Res. Methods. 2007;39(4):709–722. doi: 10.3758/bf03192961. [DOI] [PubMed] [Google Scholar]
- 16.Schwert G.W. Why does stock market volatility change over time? J. Finance. 1989;44(5):1115–1153. [Google Scholar]
- 17.Davidian M., Carroll R.J. Variance function estimation. J. Am. Stat. Assoc. 1987;82(400):1079–1091. [Google Scholar]
- 18.Davidian M., Giltinan D.M. Some simple methods for estimating intraindividual variability in nonlinear mixed effects models. Biometrics. 1993;49:59–73. [Google Scholar]
- 19.Western Bruce, Bloome Deirdre. Harvard University; 2009. Variance Function Regressions for Studying Inequality. Working Paper, Department of Sociology. [Google Scholar]
- 20.Quintero A., Lesaffre E. Multilevel covariance regression with correlated random effects in the mean and variance structure. Biom. J. 2017;59(5):1047–1066. doi: 10.1002/bimj.201600193. [DOI] [PubMed] [Google Scholar]
- 21.Hedges L.V. Estimation of effect size under nonrandom sampling: the effects of censoring studies yielding statistically insignificant mean differences. J. Educ. Behav. Stat. 1984;9:61–85. [Google Scholar]
- 22.Duval S.J., Tweedie R.L. A non-parametric trim and fill method of accounting for publication bias in meta-analysis. J. Am. Stat. Assoc. 2000;95:89–98. [Google Scholar]
- 23.DerSimonian R., Laird N. Meta-analysis in clinical trials. Contr. Clin. Trials. 1986;7(3):177–188. doi: 10.1016/0197-2456(86)90046-2. [DOI] [PubMed] [Google Scholar]
- 24.Brockwell S.E., Gordon I.R. A comparison of statistical methods for meta-analysis. Stat. Med. 2001;20:825–840. doi: 10.1002/sim.650. [DOI] [PubMed] [Google Scholar]
- 25.Cornell J.E., Mulrow C.D., Localio R., Stack C.B., Meibohm A.R., Guallar E., Goodman S.N. Random-effects meta-analysis of inconsistent effects: a time for change. Ann. Intern. Med. 2014;160:267–270. doi: 10.7326/M13-2886. [DOI] [PubMed] [Google Scholar]
- 26.Viechtbauer W. Bias and efficiency of meta-analytic variance estimators in the random-effects Model. Journal of Educational Behavioural Statistics. 2005;30:261–293. [Google Scholar]
- 27.Idris N.R.N. A comparison of methods to detect publication bias for meta-analysis of continuous data. J. Appl. Sci. 2012;12:1413–1417. [Google Scholar]
- 28.Macaskill P., Walter S.D., Irwig L.A. A comparison of methods to detect publication bias in meta-analysis. Stat. Med. 2001;20(4):641–654. doi: 10.1002/sim.698. [DOI] [PubMed] [Google Scholar]
- 29.Peters J.L., Sutton A.J., Jones D.R., Abrams K.R., Rushton L. Comparison of two methods to detect publication bias in meta-analysis. J. Am. Med. Assoc. 2006;295(6):676–680. doi: 10.1001/jama.295.6.676. [DOI] [PubMed] [Google Scholar]
- 30.Schwarzer G., Antes G., Schumacher M. Inflation of type I error rate in two statistical tests for the detection of publication bias in meta-analyses with binary outcomes. Stat. Med. 2002;21(17):2465–2477. doi: 10.1002/sim.1224. [DOI] [PubMed] [Google Scholar]
- 31.Deeks J.J., Macaskill P., Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. J. Clin. Epidemiol. 2005;58(9):882–893. doi: 10.1016/j.jclinepi.2005.01.016. [DOI] [PubMed] [Google Scholar]
- 32.Pustejovsky J.E., Rodgers M.A. Testing for funnel plot asymmetry of standardized mean differences. Res. Synth. Methods. 2019;10(1):57–71. doi: 10.1002/jrsm.1332. [DOI] [PubMed] [Google Scholar]
- 33.Zwetsloot P.P., Van der Naald M., Sena E.S., Howells D.W., In ’t Hout J., De Groot J.A.H., Chamuleau S.A.J., MacLeod M.R., Wever K.E. 2017. Standardized Mean Differences Cause Funnel Plot Distortion in Publication Bias Assessments. [DOI] [PMC free article] [PubMed] [Google Scholar]