Abstract
Quantitative evidence synthesis through meta‐analysis is central to evidence‐based medicine. For well‐documented reasons, the meta‐analysis of individual patient data is held in higher regard than aggregate data. With access to individual patient data, the analysis is not restricted to a “two‐stage” approach (combining estimates and standard errors) but can estimate parameters of interest by fitting a single model to all of the data, a so‐called “one‐stage” analysis. There has been debate about the merits of one‐ and two‐stage analysis. Arguments for one‐stage analysis have typically noted that a wider range of models can be fitted and overall estimates may be more precise. The two‐stage side has emphasised that the models that can be fitted in two stages are sufficient to answer the relevant questions, with less scope for mistakes because there are fewer modelling choices to be made in the two‐stage approach. For Gaussian data, we consider the statistical arguments for flexibility and precision in small‐sample settings. Regarding flexibility, several of the models that can be fitted only in one stage may not be of serious interest to most meta‐analysis practitioners. Regarding precision, we consider fixed‐ and random‐effects meta‐analysis and see that, for a model making certain assumptions, the number of stages used to fit this model is irrelevant; the precision will be approximately equal. Meta‐analysts should choose modelling assumptions carefully. Sometimes relevant models can only be fitted in one stage. Otherwise, meta‐analysts are free to use whichever procedure is most convenient to fit the identified model.
Keywords: individual‐patient data, meta‐analysis, one‐stage, two‐stage
1. INTRODUCTION
Quantitative synthesis of evidence from multiple studies through meta‐analysis is a cornerstone of medical, psychological, and social research. Historically, parameter estimation in meta‐analysis has involved two stages, taking (published) estimates of effect and standard errors from each study (first stage), then combining them (second stage), assigning weights to each study, with weights most often based on the inverse variance.
Over the last 2 decades, the bar for high‐quality systematic reviews has been raised, particularly by reviewers who have collected participant‐level data for included studies and analysed these datasets to obtain their own trial‐level estimates in place of published estimates. The merits of using individual participant data (IPD) are well documented1, 2, 3, 4: Data can be checked by the reviewer; consistent inclusion and exclusion criteria can be applied; participants can be included who were not in the original reports; longer follow‐up may be included than at the time of the primary study report; analyses can be executed identically for each study, meaning that it is more reasonable to combine estimates; and there is greater scope to investigate interactions. Collecting and checking IPD is a time‐consuming, diplomatic, and expensive exercise, so when IPD has been obtained, the advantages should be fully leveraged.
When IPD have been obtained, the analysis is not restricted to combining estimates and standard errors from published data. Instead, the whole analysis can be done in one stage. The merits of “one‐stage” compared to “two‐stage” approaches have been debated.5, 6, 7, 8 Those who argue for one‐stage methods tend to be statistically trained, while those who argue for two‐stage methods are typically meta‐analysts whose primary training is not statistical. (Executing a thorough systematic review and IPD meta‐analysis involves many non‐statistical skills.) There seems to be support for both perspectives.
The views of a one‐stage exponent are exemplified by an anonymous reviewer of one of the authors' (DJF) work on a Stata package to perform two‐stage meta‐analysis of IPD.3 The reviewer was ambivalent. Despite approving the package, he or she was convinced that two‐stage methods have little to offer, writing
The debate is ongoing but the statistical merits of one‐stage IPD far outweigh the benefits of a two‐stage approach …
… the merit of the one‐stage is doing an analysis that is definitely more accurate …
… a simulation study is needed to quantify how much better one‐stage is in certain scenarios …
I do not think it [two‐stage meta‐analysis] should be recommended for use when a one‐stage approach is possible …
These opinions were apparently informed by Mathew and Nordström's paper,6 which shows that a two‐stage procedure can be at best as asymptotically efficient as a one‐stage procedure. This is in contrast to the view of Burke et al who argue that “differences arise because of different modelling assumptions, rather than the choice of one‐stage or two‐stage itself.”9
The two‐stage perspective tends to be that the models and their assumptions are transparent and of scientific value and are sufficient to address relevant substantive questions. There is arguably less transparency in the reporting of models used with one‐stage analysis, as evidenced by two recent reviews.10, 11 Simmonds et al found that “One‐stage methods were in general more poorly described, perhaps because of the greater complexity involved in describing properties of regression models.”10 The main choices for two‐stage analysis are around whether to model treatment effects as fixed or random and (a closely related choice) how much weight studies should contribute to the overall estimate. The apparent simplicity of two‐stage analysis, combined with the longer history of two‐stage estimation in the context of meta‐analysis, may be why descriptions tend to be better.
The aim of this paper is to consider the statistical arguments for one‐ and two‐stage analysis in the context of Gaussian outcome data, and particularly to identify differences in precision for models that can be fitted in either way. This is done in the finite‐sample setting; the reality we all work in. That is, our results allow explicitly for small numbers of patients‐per‐study and the implied uncertainty about study‐specific estimates.
To aid clarity, we focus here on the simple setting where the IPD include a quantitative outcome, an indicator of treatment assignment, a participant identifier, and a study identifier. It is assumed that the focus of the meta‐analysis is on the treatment effect and that no covariates are present. This is an inadequate summary of a meta‐analysis,12 but this is often the primary focus and permits a clear discussion of the arguments for and against one‐ and two‐stage approaches.
As evidenced by the above quotes, two‐stage procedures can receive a bad press from statisticians, and we seek to establish the extent to which such comments are justified, not asymptotically, but in practice.
2. MODELS FOR META‐ANALYSIS
Even for the relatively simple structure of datasets considered here, there are several possible models for performing a meta‐analysis. Borrowing heavily from Senn,13 Table 1 lists seven that might be of interest. We define our notation below before discussing the models in Table 1.
Table 1.
Models for individual patient data meta‐analysis. Model II is the “fixed‐effects meta‐analysis” and model V is the “random‐effects meta‐analysis”
| Model |
|
one‐stage? | Example one‐stage Stata code | ||
|---|---|---|---|---|---|
| two‐stage? | Example two‐stage Stata code | ||||
| I | rly ij=α+β x ij+ε ij | Common | Yes | regress y x | |
| No | … | ||||
| II | rly ij=(α+a i)+β x ij+ε ij | Study‐specific | Yes | mixed y x i.study, residuals(,by(study)) reml | |
| Yes | ipdmetan, study(study): regress y x | ||||
| III | y ij=α i+β x ij+ε ij | Common | Yes | mixed y x || study: , reml | |
| α i=α+a iwherea i∼N(0,λ 2) | No | … | |||
| IV | y ij=(α+a i)+(β+b i)x ij+ε ij | Common | Yes | regress y x#study | |
| No | … | ||||
| V | y ij=(α+a i)+β i x ij+ε ij | Study‐specific | Yes | mixed y x i.study || study: x, nocons residuals(,by(study)) reml df(kr) | |
| β i=β+b iwhereb i∼N(0,τ 2) | Yes | b ipdmetan, study(study) re(kr): regress y x | |||
| VI | y ij=α i+(β+b i)x ij+ε ij | Study‐specific | Yes | mixed y x#study || study: , residuals(,by(study)) reml df(kr) | |
| α i=α+a iwherea i∼N(0,λ 2). | Yesa | b ipdmetan, study(study) re poolvar(_cons): regress y x | |||
| VII | y ij=α i+β i x ij+ε ij | ||||
| α i=α+a iand | Study‐specific | Yes | mixed y x || study: x, residuals(,by(study)) reml | ||
| β i=β+b iwhere | Yes | b mvmeta_make regress y x, by(study) saving(data) replace names(b V) | |||
| (a i,b i)∼BVN | then b mvmeta b V, reml |
aWhile the model can technically be fitted in a two‐stage way, it only obtains an estimate of α, not βso is practically useless for meta‐analysis of treatment effects.
bNote that ipdmetan and mvmeta are user‐written Stata packages, which can be installed in Stata using “. ssc install command_name.”
2.1. Notation
Let y denote outcome and x denote treatment assignment14 (coded −0.5/+0.5). Let i=1,…,I index studies and j=1,…,n i index patients within a study, so that y ij is the outcome for the jth patient in the ith study. Let α be the intercept term and β be the treatment effect, the parameter of central interest. The interpretation of α and β will depend on which model is used; for example, in model I, α represents mean outcome (because treatment is coded −0.5/+0.5 and it is assumed that randomisation is 1:1).
There are 3 possibilities for modelling the main effect (intercept) of study:
- αi=α,∀i
a single constant; studies share an intercept
- αi=α+ai
fixed‐study intercept
- αi=α+ai whereai∼N(0,λ2)
random‐study intercept
Similarly, for the treatment effect,
-
1
β i=β,∀i
-
2
common treatment effect (no interaction)
-
3
β i=β+b i
-
4
fixed treatment effects which are different for each study
-
5
(fixed treatment‐by‐study interaction terms)
-
6
β i=β+b i whereb i∼N(0,τ 2)
-
7
random treatment effect
-
8
(random treatment‐by‐study interaction)
Let . With a one‐stage analysis, can be allowed to vary by study or restricted such that or some combination of these. For example, a fixed‐effects model based on ordinary least squares (OLS) implies α i=α+a i (with a i representing fixed differences in intercepts for i=1,…,), β i=β, and . Two‐stage analysis by default allows to vary by i(we have referred to this as default because, at the time of writing, we believe all software packages which perform stage two do so under this assumption). However, the assumption that can also be invoked with a two‐stage analysis (see Olkin and Sampson5. To do this, relative study weights are proportional to a function of n i rather than of . Weighting in this way exactly recovers the one‐stage OLS estimate.15 We return to this point later.
We distinguish here between two types of variance we will refer to, particularly for understanding the workings provided in the appendices. We denote the true variance of the estimator of β by . In the frequentist sense, this is the long‐run variance of under repeated sampling. Second, the estimated variance is denoted by . This is what is estimated in a specific realisation of a meta‐analysis. A good variance estimator should have expectation .
2.2. Flexibility and estimands
Note the column of Table 1 stating whether the model can be fitted in one stage (top) or two (bottom). Of the models considered, (I), (III), and (IV) cannot be fitted in two stages. Further, although model (VI) can be fitted in two stages, it does not provide an estimate of β. This restriction is what is meant when one‐stage approaches are promoted for their flexibility.
Note that, provided that the available evidence is representative and the model specific assumptions hold, any of the models listed in Table 1 could provide an unbiased estimate of the treatment effect and its variance. The practical settings for which this is true may be rather limited. For example, if each study used simple randomisation and the same allocation ratio (not necessarily 1:1), then would be unbiased for all the models in Table 1. The interesting point is that this depends only on the design of the included studies and not on the true data‐generating model. We refer interested readers to Parzen et al16 or Kahan and Morris.17
3. A COMPARISON OF ONE‐ AND TWO‐STAGE APPROACHES FOR LINEAR MODELS
3.1. Fixed‐effects model (model II)
We wish to compare the variance of one‐ and two‐stage fixed‐effect meta‐analysis. There are 2 approaches to this.
Compare the variance of “default” or most natural one‐ and two‐stage fixed‐effect estimators.
Compare the variance of one‐ and two‐stage estimators when the models and associated assumptions are the same.
The former is an apples‐vs‐oranges comparison. It is interesting because one‐stage fixed‐effect estimators are most naturally estimated by OLS (this is the default in all major software), which assumes a common variance across studies (previously described as “scarcely credible” in the context of meta‐analysis).15 Meanwhile the two‐stage inverse‐variance estimator most naturally assumes heterogeneity of variances across studies (again this is the default in software). Assumptions about within‐study variances impact on the relative weights of studies and thus on the overall estimate of β and its variance. Results may differ between one‐ and two‐stage analysis due to subtle differences in modelling assumptions, such as this one. It is important not to attribute this to the number of stages in computation. Therefore, the comparison of defaults is very much a comparison of the modelling assumptions, rather than of the procedure by which the models are estimated (ie, in one or two stages).
The most natural one‐stage analysis is to fit a model based on OLS, where the theory is well understood. This assumes is common across studies. The theory is less well understood when are allowed to be study specific, and the model is estimated based on variance‐weighted least squares. The fact that the mixed command is needed to fit this model demonstrates that it is not a natural one‐stage model to fit. We derive the one‐stage estimator of β and variance estimator allowing study‐specific in Appendix A.1. These turn out to be the same as the two‐stage inverse‐variance estimator and the usual asymptotic inverse‐variance formula:
| (1) |
For details, see Appendix A.1.
Thus, when we allow study‐specific variances, computing our estimates in one or two stages would lead to identical point estimates and variance estimates. (In fact, the variance estimators are biased downwards in finite samples because they assume that the are known rather than estimated, as is assumed in Mathew and Nordström.6 This result is related to the result of Olkin and Sampson,5 when both models impose a shared , and to Lunn et al,18 who present a two‐stage strategy for fitting a full Bayesian model.
We focus on comparisons of variance here and, to do so, needed to derive some new theoretical results. Appendix A works through the mathematics, while the rationale and results are given here. The overall aim is to calculate the expected value of the variance for the OLS and inverse‐variance estimators. This involved:
Calculation of the expected value of the asymptotic variance formula (1) for the inverse‐variance estimator (noting again that this variance estimator is biased downwards in small samples and so produces misleading conclusions) (Appendix A.2).
- Derivation of a new variance formula which explicitly acknowledges that the are estimates and so is unbiased with practical sample sizes (Appendix A.3). This tends to the standard formula (1) as all n i→∞ but is less biased when any n i is small. This is
where is the standard inverse‐variance formula (1).(2) Calculation of the expectation of the new small‐sample variance formula (2) (Appendix A.4).
The expected value of the asymptotic and small‐sample variance formulas are then compared with the version that assumes common .
Results are shown in Figure 1, showing very little effect of the number of studies I (the 40 translucent curves are almost on top of each other) but some effect of the number of patients per study n i. However, this effect becomes small with more than about 25 patients per study. This is shown for scenarios where all studies contain the same number of participants but could equally be done with specific incidences of unbalanced data.
Figure 1.

Ratio of two‐ to one‐stage expected value of variances, using the asymptotic variance formula (1) (lines below 1) and the small sample formula (2) (lines above 1). The plot shows 39 translucent lines as the number of studies increases from i=2,…,40. The lines for smaller i are higher for both panels, but differences are negligible
This demonstrates that assuming is the same across studies reduces variance of when n i is small. This being the assumption underlying OLS, it is typically associated with one‐stage analysis. Two‐stage fixed‐effects meta‐analysis typically allows to be study specific. These are merely defaults, and it is not necessary for either procedure to use the default assumption.
3.2. Random‐effects model (model V)
Here, we consider model V of Table 1, often designated as “random‐effects” meta‐analysis, because the I treatment effects are assumed to follow a probability distribution. There is a bewildering range of competing two‐stage estimators for this model.3
We will consider the restricted maximum likelihood estimator. It is attractive because it is more efficient than moment‐based estimators such as DerSimonian and Laird's method but can provide consistent estimates of variance parameters. For finite‐sample inference with restricted maximum likelihood (REML) estimation, Kenward and Roger's approximate correction to the covariance matrix and degrees of freedom is the best available option.19
It is possible to fit the random‐effects meta‐analysis REML realisation of the Kenward–Roger correction in one stage via mixed in Stata or proc mixed in sas. The covariance and degrees of freedom for two‐stage meta‐analysis of a single parameter has not previously been derived; we provide the derivation in Appendix B.
The one‐ and two‐stage versions of these models were compared in a simulation study. The simulation design follows. This is structured as aims, data‐generating mechanism, methods, estimand, and performance measures.
-
Aims
The aim of the simulation study is to compare one‐ and two‐stage procedures in terms of (1) precision of meta‐analysis with random treatment effects and fixed‐study intercepts and (2) coverage of 95% confidence intervals. Both are with respect to β, the overall average treatment effect.
-
Data‐generating mechanism
Individual participant data were simulated for independently 2 up to 40 studies from model V of Table 1, that is, studies had fixed intercepts and random treatment effects (drawn from a Gaussian distribution). One thousand simulated datasets were produced using Stata 14's default random‐number generator (64‐bit Mersenne twister). For the data‐generating mechanism we used, parameter values were set to α=a i=0∀i,β=0 with , and with (where denotes a random draw from a χ 2 distribution on 50 df). Study sizes were unbalanced. Following Rücker et al,20 based on Galandi et al,21 study sizes n i were drawn from a log‐normal distribution with and , with n i rounded to the nearest integer or to 20 for any drawn values <20.
-
Methods
One‐stage analysis was done in sas 9.3 using proc mixed, while two‐stage analysis3 used ipdmetan in Stata 14, based on the same simulated datasets. The choice of sas for one‐stage models was for computational speed: Both packages return the same results, but sas currently gets there faster. The only difference in the methods is that the two‐stage variance estimates were based on expected information and the one‐stage on observed.
-
Estimand
The estimand of interest is the meta‐analytic estimate of average overall treatment effect β. As earlier mentioned, we acknowledge that this is not an adequate summary of a meta‐analysis but is usually the parameter of central interest.
-
Performance measures
The key performance measure is precision: the inverse of the empirical variance of .22 This is estimated for two‐stage relative to one‐stage (see White22 and presented as “% gain”. The empirical variance is based only on the REML point estimate and so does not depend on the Kenward–Roger adjustment. To assess the new variance and df adjustments, we also compare the coverage of nominal 95% confidence intervals.
The Stata files required to generate data and run one‐ and two‐stage analyses are included as a supplement to this article. Also included is the simulation analysis of the results (though the code to produce figures is not).22
Figure 2 plots results for relative precision and coverage. All results are accompanied by 95% Monte Carlo confidence intervals to describe simulation uncertainty.22
Figure 2.

Simulation study results from 1000 repetitions plotted for 2 up to 40 studies in a meta‐analysis. Upper panel: relative % increase in precision for two‐ vs one‐stage restricted maximum likelihood estimation. Lower panel: coverage of Kenward–Roger confidence intervals
The upper panel displays relative precision, showing that the two‐stage REML estimator has precision extremely similar to its one‐stage counterpart under this data‐generating mechanism. Although we are not able to obtain an analytic result to support whether this is a general or specific result, it appears to be in line with the equivalence of one‐ and two‐stage for fixed‐effects meta‐analysis when the same models are fitted.
Coverage results (the lower panel of Figure 2) indicate that both methods tend to have good coverage, except when the number of studies is below 5. Inspection of the simulation results shows that variance estimates are accurate, on average very close to the true (empirical) variance. The conservative intervals appear to be due to the df estimated by two‐stage Kenward–Roger approach being conservative (two‐stage) and anti‐conservative (one‐stage). For both procedures, the variance formulas closely matched the empirical variance of the REML estimate of β. The issue with confidence interval coverage is then due to the df used. A comparison of the one‐ and two‐stage df across simulation runs showed that the one‐stage df were slightly higher than the two‐stage counterpart. This is apparently due to the approximations involved in the Taylor expansions, which begin to fail in different directions at very small study sample sizes.
4. ILLUSTRATION OF METHODS FOR INDANA META‐ANALYSIS
The Indana data includes IPD from 10 trials of antihypertensive drugs in patients at high risk of cardiovascular disease.23, 24 Several outcome variables were collected, such as death, stroke, systolic blood pressure, diastolic blood pressure, and cholesterol. We focus here on systolic blood pressure (SBP), which was recorded annually at years 1 to 5 in seven of the 10 studies and never in the remaining three. As previously, we consider a simple analysis involving a linear regression of SBP on randomised group. Analysis was restricted to participants with observed values of SBP at 1 year.
The situation is shown in Figure 3, a forest plot of the within‐study estimates of the mean difference in SBP at 1 year. The trials are highly heterogeneous in many ways, and this is apparent in the estimated treatment effects. A fixed‐effect meta‐analysis will thus produce a confidence interval that is too narrow and relates to what is arguably an ill‐defined target parameter, as the assumption of a constant treatment effect is clearly not supported. Meanwhile, the smaller trials seem to have larger effects, so that the random‐effects point estimate may be too large. Our purpose here is not to produce the most appropriate summary of the overall effect in terms of the point estimate and confidence interval; instead, we aim to demonstrate the impact of different modelling assumptions and of the number of stages used to fit a given model, on these quantities.
Figure 3.

Forest plot of mean difference in systolic blood pressure at 1 year by randomised arm, for 7 trials in the indana data. The size of squares is based on fixed‐effect weights with study‐specific . CI, confidence interval
We consider three different modelling assumptions, matching the models considered in Section 2. First, a fixed‐effect meta‐analysis where is assumed to be equal in all studies; second, a fixed‐effect meta‐analysis under the assumption that are different across studies; and third, a random‐effects meta‐analysis also allowing to vary across studies. Table 2 gives the results of applying one‐ and two‐stage procedures for parameter estimation. Results are intentionally given to 4 decimal places. We see clear differences between different modelling assumptions, the largest difference being between the fixed‐ and random‐effects models. However, the different procedures produce almost identical estimates of β and very similar confidence intervals.
Table 2.
Results from meta‐analysis of INDANA data: overall mean difference in systolic blood pressure under different meta‐analysis models
| M‐A model | Estimation procedure | (95% CI), mm Hg |
|---|---|---|
| Fixed effect, shared | One‐stage | −12.5254 (−12.9128, −12.1382) |
| Two‐stage | −12.5255 (−12.9112, −12.1398) | |
| Fixed effect, study‐specific | One‐stagea | −12.3404 (−12.7236, −11.9571) |
| Two‐stagea | −12.3404 (−12.7236, −11.9571) | |
| Random effects, study‐specific | One‐stage | −15.2069 (−19.3984, −11.0153) |
| (REML with Kenward–Roger adjustment) | Two‐stage | −15.2068 (−18.5534, −11.8602) |
aConfidence intervals (CIs) computed using asymptotic variance formula (1).
5. DISCUSSION
Focusing on practical sample sizes (rather than asymptotic results), this paper attempts to resolve the debate about using one‐ or two‐stage estimation methods in meta‐analysis. We have considered Gaussian data and demonstrated theoretically (for fixed effect) and via simulation (for random‐effects models), that provided the same underlying model is used, inference from one‐ and two‐stage procedures is practically equivalent for 2 models which are practically relevant.
To make progress, we focused on a very simple data structure and considered combining evidence from randomised studies. We considered the main arguments regarding “flexibility” and precision. Table 1 shows that a wider range of statistical models can be fitted in a one‐stage framework; however, for many meta‐analyses, this flexibility is not needed.
To study precision, we derived some new results. Specifically, the one‐stage fixed‐effects estimator of β which allows for study‐specific variances; a new variance estimator which admits that are estimated and not known (again for the fixed‐effect model allowing study‐specific variances); and estimated the expectation of this formula, comparing it to the expectation of the OLS variance, where is assumed to be equal across all studies. For random‐effect meta‐analysis, we also calculated and implemented the Kenward–Roger small sample correction.
In practical sample sizes, our work shows that the arguments regarding precision are largely redundant, as previously shown asymptotically for more general models.7 Mathew and Nördstrom claim to show that a one‐stage analysis will always be at least as precise as two‐stage, but the only example of an inequality is for a meta‐analysis with no study effects at all, which is of no practical interest.6 Appendix A.1 shows that the fixed‐effect model can be fitted using weighted one‐stage estimation, which returns the same estimates as would be obtained through inverse‐variance weighted two‐stage estimation. Meanwhile, Senn15 shows how OLS estimates can be recovered from summary data in two stages. The results displayed in Figure 1 show that the latter is more precise than the former. Thus, analysis in two stages may be less precise, more precise, or identical to two stages, depending on the modelling assumptions made by the computational procedure. In summary, these papers do not contradict our key finding, which is that for equivalent models, fitting by a one‐ or two‐stage procedure gives practically equivalent results.
Improvements in precision must be attributed to different modelling assumptions, that is, the precision argument for one‐stage procedures is actually about flexibility again. A different model may be more or less precise: In general, we buy information with assumptions. We argue that model choice should never be based on precision. Opting to fit a model in one stage or two based on the perceived efficiency is in fact doing exactly this. In practice, it is important to make modelling assumptions that are tenable and fit the model as using whatever procedure allows for these assumptions. For the assumptions which may change between one‐ and two‐stage procedures, see Burke et al.9
Due to the simple data structure considered, we have not touched on covariate adjustment here. Adjusting for the main effects of covariates may be desirable, either to obtain a valid estimate of (for example, where covariates have been balanced by design25 or to increase power.26 Here, a one‐stage model can estimate covariate effects to have the same values across all studies. Two‐stage analysis relaxes this assumption by fitting a model which implies separate covariate effects for each study, which we conjecture is the correct way to ensure a valid estimate of . To fit a one‐stage version of this model, covariate–study interaction terms need to be included and, if the modelling assumptions are identical, our results would hold. If the covariate effects truly differ by study, then the model that incorporates this (whether fitted by a one‐ or two‐stage procedure) will tend to give more precise estimates; the converse is also true.
For meta‐analysis of patient‐level treatment‐covariate interactions, using the two‐stage procedure of fitting a model including interactions and combining these within‐study interactions in a second‐stage meta‐analysis, guards against ecological bias.27, 28, 29 As always, a practically equivalent model can be fitted using a one‐stage procedure; however, in this setting, care needs to be taken over the parameterisation and study‐specific covariate means must be adjusted for.28, 29 This may not be intuitive, and in this context, a one‐stage analysis should aim to estimate the same effect as a two‐stage analysis. Here, a model that can be fitted in one stage but not two may actually be misleading (“deluded” in the parlance of Fisher et al).28
For practical meta‐analysis, a clear description of the model and assumptions is important. See Burke et al9 for a description of practical assumptions which may differ between one‐ and two‐stage meta‐analysis: The relevance of each of these should be considered and described.
When fitting a model in two stages, it is important to describe whether a fixed‐ or random‐effects model is being used. For fixed‐effect models, the assumption about common or study‐specific then needs to be described. For random effects, the random‐effects model should be described. The clearest way of doing this is in terms of the estimator of tau squared (eg, DerSimonian–Laird or REML3 followed by any additional details such as use of the Hartung–Knapp/Sidik–Jonkman or Kenward–Roger variance correction.
When fitting a model in one stage, some modelling aspects to describe may be whether random effects were estimated and the parameters they were attached to; whether was shared or study specific; any correction to the covariance matrix; whether parameters for covariate effects were constrained to be equal or allowed to vary across studies; and how interactions were estimated.
We encourage those involved in meta‐analysis methodology to focus more on consideration of models and assumptions and less on the procedures used for parameter estimation. The procedure used (one or two stages) is a computational tool; the model being fitted is what the tool aims to construct, and a tool that does the required job conveniently should be favoured.
6. Data Accessibility
ACKNOWLEDGEMENTS
We are grateful to the investigators who provided the Indana data and, in turn, to all the trial teams and patients involved in the various studies.
This work was supported by the Medical Research Council (grant numbers MC_UU_12023/21 and MC_UU_12023/29).
CONFLICTS OF INTEREST
The meta‐analysis group of the MRC Clinical Trials Unit at UCL have historically used two‐stage approaches. Defending the approach may buttress our colleagues' work. DJF wrote the ipdmetan Stata package, which performs two‐stage meta‐analysis; our colleague Ian White wrote the mvmeta Stata package; defending two‐stage approaches may increase the use of these package and citations of the articles. TPM came into this research expecting one‐stage analysis to be better than two‐stage and now favours neither approach other than the most convenient. The authors declare that they have no other conflicts of interest.
APPENDIX A. PROPERTIES OF WEIGHTED ESTIMATORS
The below sections derive the variance of the fixed‐effects inverse‐variance estimator. Unless otherwise stated, summations are over all i, the study index.
A1. Variance estimator for a one‐stage analysis assuming varying variances
The fixed effects model is Y ij=α i+β X ij+ε ij, where ε ij∼N(0,σ 2), i indexes studies and j indexes patients. This model is estimated by (X ′ X)−1 X ′ Y and covariance matrix by . We require the expectation of .
Assume that we have I studies, each with n i patients. Define the design matrix
The first I columns indicate (exclusive) membership in study i=1,…,I and the (I+1)th indicates treatment assignments. For a one‐stage analysis, is the (I+1)th diagonal element of . First, (X ′ X) is (I+1)×(I+1):
The inverse is
We are interested in the last element (variance of the treatment effect), which is .
The shared‐σ 2 assumption can be relaxed in the following way: Construct a diagonal matrix of weights W, where the nonzero elements are ; the variance–covariance matrix is then estimated as . The matrix X ′ W X is

The solid lines show how the matrix can be partitioned to invert the (I+1)th diagonal element (variance of the treatment effect), which is
This is identical to the standard inverse‐variance formula used by a two‐stage procedure. The algebra can also be worked through for the estimator of β to include weights W, that is (X ′ W X)−1 X ′ W y. The one‐ and two‐stage procedures then give identical results for . This complements the result described and worked through in Senn,15 taken from Olkin and Sampson,5 which shows that a two‐stage procedure can obtain one‐stage estimates when modified to make the same assumptions of equal .
A2. Expected value of standard two‐stage variance estimator
Let i=1,…,I index studies and j=1,…,n i index individuals. The unadjusted estimator of β for active vs control is
with variance for study i estimated by
This first stage is repeated within each study. Define weights and let
For the second stage, the inverse‐variance estimator of the overall treatment effect is
| (A1) |
The variance of β is almost always estimated by
| (A2) |
We now calculate the expectation of for 2 studies. Note that
| (A3) |
For a meta‐analysis of two studies, our variance estimator is
Substituting , write as
The first and second partial derivatives for x and y are
Denote f x after evaluating at , and similarly for f y,f xx,f yy. By a Taylor expansion,
Note that E[ε x]=E[ε y]=0, and studies are uncorrelated, so E[ε x ε y]=0, then
| (A4) |
Within each study,
Generalising to I studies,
This gives us that
| (A5) |
This result is used to compare the expectation of (A2) to the OLS fixed‐effect estimator (results given in Figure 1, upper panel).
A3. An improved variance estimator to partner the inverse‐variance estimator of
Because the standard variance formula given in (A2) is biased downwards in finite samples, we derive an improved estimator for the variance of (A1). Let
First derivatives wrt and are
By a Taylor expansion of f(), we get
| (A6) |
| (A7) |
The bold 0s in (A6) are always equal to 0 in meta‐analysis provided studies can be assumed independent. The roman 0s are set to 0 here because we are considering linear models, where an estimate and its variance are independent. Note that when β i and Var(β i) cannot be assumed independent, as with binary outcomes, the roman 0s in (A6) need to be replaced with the correct terms.
(A7) provides a variance estimator for Gaussian data which improves on (A2) in finite samples.
A4. Expected value of the variance estimator (A7)
The estimator is given by A1, where
and are assumed to be independent. Note that can be expressed as a function of the random variable X i. Let X represent the set of all X is.
We want the variance of .
The generic problem is that we want the expectation of the ratio
for A and B as defined above with arbitrary positive constants e i and f i.
Use the second‐order expression30
| (A8) |
The calculation of the required moments follows.
Substituting the expectations E(A), E(B), Cov(A,B), and Var(B) into (A8) gives an expression for the actual variance of .
This has been verified in simulation; the Mata code to calculate the formula and verify (A8) and its components is provided as a supplement. Also included is the log file. This is run for 7 studies where, arbitratily, f=[1,2,3,4,5,6,7] and . We simulated the X is from and calculated
We provide 2 simulation results. First, Var(R) is estimated empirically from the simulated values; second, it is calculated using (A8) where empirical values of E(A), E(B), Var(B), and Cov(A,B) are substituted for the theoretical expectations. These results, and those based on the expectations, are given in Table A1 for 1×1010 repetitions. A tiny error in the approximation (A8) is seen, but the empirical and theoretical components of (A8) are near identical. This justifies the use of the theoretical formula to compare expected variances of the inverse‐variance and OLS estimators (as in Figure 1).
Table A1.
Simulation‐based check of variance formula (1×1010 repetitions)
APPENDIX B. KENWARD–ROGER CORRECTION FOR TWO‐STAGE RANDOM‐EFFECTS META‐ANALYSIS ESTIMATED BY REML
This section derives the Kenward–Roger variance estimator and degrees of freedom for two‐stage meta‐analysis using the notation of Kenward and Roger.19, 31) We first recap the general approach and notation before applying it to the meta‐analysis model.
The general linear mixed‐effects model can be written as
where y is a (k×1) vector of random variables, X is a (k×p) matrix of known constants for the (p×1) fixed‐effect parameter vector β, Z is the (k×q) design matrix for the (q×1) random‐effects parameter vector γ, and e is a (k×1) vector of random error terms.
We assume E[γ]=0, E[e]=0, and Cov[γ,e]=0. Define D as the (q×q) covariance matrix of the random‐effects parameters in γ, and R as the (k×k) covariance matrix of e. Then, Σ, the (k×k) covariance matrix of y, is equal to ZDZ ⊤+R. After assuming normality of the random terms in the model, we obtainy∼N(X β,Σ).
Kenward and Roger describe Σ under the general linear mixed‐effects model as being an unknown, block diagonal, (k×k) variance‐covariance matrix whose elements are well‐behaved functions of σ(r×1).19 The standard approach would be to use REML to find an estimate for σ and proceed to find a REML estimator for β, with variance , where Σ and hence Φ are dependent on .
The standard REML estimator of , , is the variance‐covariance matrix of the asymptotic limiting distribution of based on the REML estimation of σ. Kenward and Roger obtain a better approximation than to the finite‐sample variance‐covariance matrix of .
The corrected estimator can be written as
where
and W ij is the (i,j)th element of the variance‐covariance matrix of σ, that is, of W=Var[σ], which may be approximated by the inverse of the observed or expected information matrix. Kenward and Roger give the following expression for the expected information:
The third term of , and therefore V i and S ij, may be ignored if the covariance matrix Σ is linear in σ. Those quantities are defined as follows:
B1. The random‐effects meta‐analysis model
For the random‐effects meta‐analysis model, y consists of the I treatment effect estimates β i, X=1 i, β includes the “pooled” effect size β, and Σ is diagonal with elements , where is the estimated variance of β i and is the estimated between‐study heterogeneity.
We proceed by noting that Kenward and Roger's σ 1=τ 2(note that this is distinct from σ i, the within‐study standard deviation) and also define , so that . Then,
Note that Σ is linear in σ 1, so that . This immediately implies that R 11=0 and also that the entire third term of the above expression for is 0.
Then,
where W 11=Var[σ 1] as explained earlier.
The expected information works out to be
If we wish to use the observed information, this requires the second derivative of the log‐likelihood with respect to σ 1.
Note that is itself a function of σ 1 and must also be differentiated. Concentrating first on this term, let so that the final term becomes . Then,
Now, using the quotient rule and rearranging,
Now, we go back and calculate the second differentials of the first two (non‐constant) terms:
Hence, the expression for observed information is
Finally,
where may be either or .
Kenward and Roger consider inferences of the form L β for L a fixed (l×p) matrix, where l is the number of simultaneous inferences and p is the number of elements of β. Here, l=p=1(since we simply wish to estimate β), hence L=1 also. Furthermore, since l=1, the test distribution is t rather than F, and the scaling factor λ should also be equal to 1.
To calculate approximate degrees of freedom m, working through the formulae of Kenward and Roger19(p987):
Next,
and
If W 11 is based on the observed information matrix, the degrees of freedom are
or if based on the expected information matrix:
Morris TP, Fisher DJ, Kenward MG, Carpenter JR. Meta‐analysis of Gaussian individual patient data: Two‐stage or not two‐stage? Statistics in Medicine. 2018;37:1419–1438. https://doi.org/10.1002/sim.7589
Present Address 90 High Holborn, London WC1V 6LJ, UK
REFERENCES
- 1. Stewart LA, Tierney JF. To IPD or not to IPD? Advantages and disadvantages of systematic reviews using individual patient data. Eval Health Prof. 2002;25(1):76‐97. [DOI] [PubMed] [Google Scholar]
- 2. Riley RD, Lambert PC, Abo‐Zaid G. Meta‐analysis of individual participant data: rationale, conduct, and reporting. BMJ. 2010;340:c221. [DOI] [PubMed] [Google Scholar]
- 3. Fisher DJ. Two‐stage individual participant data meta‐analysis and generalized forest plots. Stata J. 2015;15(2):369‐396. [Google Scholar]
- 4. Tierney JF, Vale C, Riley R, et al. Individual participant data (IPD) meta‐analyses of randomised controlled trials: guidance on their use. PLoS Med. 2015;12(7):e1001855+. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Olkin I, Sampson A. Comparison of meta‐analysis versus analysis of variance of individual patient data. Biometrics. 1998;54(1):317‐322. [PubMed] [Google Scholar]
- 6. Mathew T, Nordstrom K. Comparison of one‐step and two‐step meta‐analysis models using individual patient data. Biom J. 2010;52(2):271‐287. [DOI] [PubMed] [Google Scholar]
- 7. Lin DY, Zeng D. On the relative efficiency of using summary statistics versus individual‐level data in meta‐analysis. Biometrika. 2010;97(2):321‐332. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8. Lin DY, Zeng D. Meta‐analysis of genome‐wide association studies: no efficiency gain in using individual participant data. Genet Epidemiol. 2010;34(1):60‐66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Burke DL, Ensor J, Riley RD. Meta‐analysis using individual participant data: one‐stage and two‐stage approaches, and why they may differ. Statist Med. 2017;36(5):855‐875. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Simmonds M, Stewart G, Stewart L. Decade of individual participant data meta‐analyses: a review of current practice. Contemp Clin Trials. 2015;45(Pt A):76‐83. [DOI] [PubMed] [Google Scholar]
- 11. Debray TPA, Moons KGM, Valkenhoef G, et al. Get real in individual participant data (IPD) meta‐analysis: a review of the methodology. Res Synth Methods. 2015;6(4):293‐309. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12. Higgins JPT, Thompson SG, Spiegelhalter DJ. A re‐evaluation of random‐effects meta‐analysis. J R Stat Soc Series A (Statistics in Society). 2009;172(1):137‐159. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13. Senn S. The many modes of meta. Drug Inf J. 2000;34:535‐54. [Google Scholar]
- 14. Tudur Smith C, Williamson PR, Marson AG. Investigating heterogeneity in an individual patient data meta‐analysis of time to event outcomes. Stat Med. 2005;24(9):1307‐1319. [DOI] [PubMed] [Google Scholar]
- 15. Senn SJ. Hans van Houwelingen and the art of summing up. Biom J. 2010;52(1):85‐94. [DOI] [PubMed] [Google Scholar]
- 16. Parzen M, Lipsitz SR, Dear KBG. Does clustering affect the usual test statistics of no treatment effect in a randomized clinical trial? Biom J. 1998;40(4):385‐402. [Google Scholar]
- 17. Kahan BC, Morris TP. Assessing potential sources of clustering in individually randomised trials. BMC Med Res Methodol. 2013;13(1):58+. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18. Lunn D, Barrett J, Sweeting M, Thompson S. Fully Bayesian hierarchical modelling in two stages, with application to meta‐analysis. J R Stat Soc Ser A Stat Soc Ser C (Appl Stat). 2013;62(4):551‐572. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19. Kenward MG, Roger JH. Small sample inference for fixed effects from restricted maximum likelihood. Biometrics. 1997;53:983‐997. [PubMed] [Google Scholar]
- 20. Rücker G, Schwarzer G, Carpenter JR. Arcsine test for publication bias in meta‐analyses with binary outcomes. Stat Med. 2008;27(5):746‐763. [DOI] [PubMed] [Google Scholar]
- 21. Galandi D, Schwarzer G, Antes G. The demise of the randomised controlled trial: bibliometric study of the German‐language health care literature, 1948 to 2004. BMC Med Res Methodol. 2006;6:30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. White IR. simsum: analyses of simulation studies including Monte Carlo error. Stata J. 2010;10(3):369‐385. [Google Scholar]
- 23. Gueyffier F, Boutitie F, Boissel JP, et al. INDANA: a meta‐analysis on individual patient data in hypertension. Protocol and preliminary results. Thérapie. 1995;50(4):353‐362. [PubMed] [Google Scholar]
- 24. Gueyffier F, Boutitie F, Boissel JP, et al. Effect of antihypertensive drug treatment on cardiovascular outcomes in women and men. A meta‐analysis of individual patient data from randomized, controlled trials. The INDANA Investigators. Ann Intern Med. 1997;126(10):761‐767. [DOI] [PubMed] [Google Scholar]
- 25. Kahan BC, Morris TP. Improper analysis of trials randomised using stratified blocks or minimisation. Stat Med. 2012;31(4):328‐340. [DOI] [PubMed] [Google Scholar]
- 26. Kahan BC, Jairath V, Doré CJ, Morris TP. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials. 2014;15(1):139+. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Fisher DJ, Copas AJ, Tierney JF, Parmar MKB. A critical review of methods for the assessment of patient‐level interactions in individual participant data meta‐analysis of randomized trials, and guidance for practitioners. J Clin Epidemiol. 2011;64(9):949‐967. [DOI] [PubMed] [Google Scholar]
- 28. Fisher DJ, Carpenter JR, Morris TP, Freeman SC, Tierney JF. Meta‐analytical methods to identify who benefits most from treatments: daft, deluded, or deft approach? BMJ. 2017;356:j573+. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Hua H, Burke DL, Crowther MJ, Ensor J, Tudur Smith C, Riley RD. One‐stage individual participant data meta‐analysis models: estimation of treatment‐covariate interactions must avoid ecological bias by separating out within‐trial and across‐trial information. Statist Med. 2017;36(5):772‐789. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Kendall MG, Stuart A. The Advanced Theory of Statistics, Vol. 1: Distribution Theorych. 10, :232+. Bucks, UK: Charles Griffin & Company Limited third ed; 1969. [Google Scholar]
- 31. Kenward MG, Roger JH. An improved approximation to the precision of fixed effects from restricted maximum likelihood. Comput Stat Data Anal. 2009;53(7):2583‐2595. [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
