Author manuscript; available in PMC: 2018 Jul 10.
Published in final edited form as: Biometrics. 2017 Aug 14;74(2):548–556. doi: 10.1111/biom.12762

A matrix-based method of moments for fitting multivariate network meta-analysis models with multiple outcomes and random inconsistency effects

Dan Jackson, Sylwia Bujkiewicz, Martin Law, Richard D Riley, Ian R White
PMCID: PMC6038911  EMSID: EMS78351  PMID: 28806485

Summary

Random-effects meta-analyses are very commonly used in medical statistics. Recent methodological developments include multivariate (multiple outcomes) and network (multiple treatments) meta-analysis. Here we provide a new model and corresponding estimation procedure for multivariate network meta-analysis, so that multiple outcomes and treatments can be included in a single analysis. Our new multivariate model is a direct extension of a univariate model for network meta-analysis that has recently been proposed. We allow two types of unknown variance parameters in our model, which represent between-study heterogeneity and inconsistency. Inconsistency arises when different forms of direct and indirect evidence are not in agreement, even having taken between-study heterogeneity into account. However, consistency is often assumed in practice, and so we also explain how to fit a reduced model that makes this assumption. Our estimation method extends several other commonly used methods for meta-analysis, including the method proposed by DerSimonian and Laird (1986). We investigate the use of our proposed methods in the context of both a simulation study and a real example.

Keywords: Incoherence, Mixed treatment comparisons, Multiple treatments meta-analysis, Random-effects models

1. Introduction

Meta-analysis, the statistical process of pooling the results from separate studies, is commonly used in medical statistics and now requires little introduction. The univariate random-effects model is often used for this purpose. This model has recently been extended to the multivariate (multiple outcomes; Jackson et al., 2011) and network (multiple treatments; Lu and Ades, 2004) meta-analysis settings. In a network meta-analysis, more than two treatments are included in the same analysis. The main advantage of network meta-analysis is that, by using indirect information contained in the network, more precise and coherent inference is possible, especially when direct evidence for particular treatment comparisons is limited. Here we describe a new model that extends the random-effects modelling framework to the multivariate network meta-analysis setting, so that both multiple outcomes and multiple treatments may be included in the same analysis.

Other multivariate extensions of univariate methods for network meta-analysis have previously been proposed. For example, Achana et al. (2014) analyse multiple correlated outcomes in multi-arm studies in public health. Efthimiou et al. (2014) propose a model for the joint modelling of odds ratios on multiple endpoints. Efthimiou et al. (2015) develop another model that is a network extension of an alternative multivariate meta-analytic model that was originally proposed by Riley et al. (2008). A network meta-analysis of multiple outcomes with individual patient data has also been proposed by Hong et al. (2015) under both contrast-based and arm-based parameterizations, and Hong et al. (2016) develop a Bayesian framework for multivariate network meta-analysis. These multivariate network meta-analysis models are based on the assumption of consistency in the network, extending the approach introduced by Lu and Ades (2004). In contrast to these previously developed methods, the method proposed here relaxes the consistency assumption. This assumption is sometimes found to be false across the entire network (Veroniki et al., 2013). We model the inconsistency using a design-by-treatment interaction, so that different forms of direct and indirect evidence may not agree, even after taking between-study heterogeneity into account. However, we assume that the design-by-treatment interaction terms follow normal distributions, and so conceptualise inconsistency as another source of random variation. This allows us to achieve the dual aim of estimating meaningful treatment effects whilst also allowing for inconsistency in the network.

Although we allow inconsistency in the network, we propose a relatively simple model. Our preference for a simple model is because the between-study covariance structure is typically hard to identify accurately in multivariate meta-analyses (Jackson et al., 2011) and also because network meta-analysis datasets are usually small (Nikolakopoulou et al., 2014). The new model that we propose for multivariate network meta-analysis is a direct generalisation of the univariate network meta-analysis model proposed by Jackson et al. (2016), which is a particular form of the design-by-treatment interaction model (Higgins et al., 2012). In addition to proposing a new model for multivariate network meta-analysis, we also develop a corresponding new estimation method. This estimation method is based on the method of moments and extends a wide variety of related methods. In particular, we extend the estimation method described by DerSimonian and Laird (1986) by directly extending the matrix based extension of DerSimonian and Laird’s estimation method for multivariate meta-analysis (Chen et al., 2012; Jackson et al., 2013). We adopt the usual two-stage approach to meta-analysis, where the estimated study-specific treatment effects (including the within-study covariance matrices) are computed in the first stage. We give some information about how this first stage is performed but our focus is the second stage, where the meta-analysis model is fitted.

The paper is set out as follows. In section 2, we briefly describe the univariate model for network meta-analysis to motivate our new multivariate network meta-analysis model in section 3. We present our new estimation method in section 4 and we apply our methods to a real dataset in section 5. We conclude with a short discussion in section 6.

2. A univariate network meta-analysis model

Here we describe our univariate modelling framework for network meta-analysis (Jackson et al., 2016; Law et al., 2016). Without loss of generality, we take treatment A as the reference treatment for the network meta-analysis. The other treatments are B, C, etc. We take the design $d$ as referring only to the set of treatments compared in a study. For example, if the first design compares treatments A and B only, then $d = 1$ refers to two-arm studies that compare these two treatments. We define $t$ to be the total number of treatments included in the network, and $t_d$ to be the number of treatments included in design $d$. We define $D$ to be the number of different designs, $N_d$ to be the number of studies of design $d$, and $N = \sum_{d=1}^{D} N_d$ to be the total number of studies. We will use the word ‘contrast’ to refer to a particular treatment comparison or effect in a particular study, for example the ‘AB contrast’ in the first study.

We model the estimated relative treatment effects, rather than the average outcomes in each arm, and so perform contrast based analyses. We define $Y_{di}$ to be the $c_d \times 1$ column vector of estimated relative treatment effects from the $i$th study of design $d$, where $c_d = t_d - 1$. We define $n_d = N_d c_d$ to be the total number of estimated treatment effects that design $d$ contributes to the analysis, and $n = \sum_{d=1}^{D} n_d$ to be the total number of estimated treatment effects that contribute to the analysis. To specify the outcome data $Y_{di}$, we choose a baseline treatment in each design $d$. The entries of $Y_{di}$ are then the estimated effects of the other $c_d$ treatments included in design $d$ relative to this baseline treatment. For example, if we take $d = 2$ to indicate the ‘CDE design’ then $c_2 = 2$. Taking C as the baseline treatment for this design, the two entries of the $Y_{2i}$ vectors are the estimated relative effects of treatment D compared to C and of treatment E compared to C. For example, the entries of the $Y_{di}$ could be estimated log-odds ratios or mean differences.

We use normal approximations for the within-study distributions. We define $S_{di}$ to be the $c_d \times c_d$ within-study covariance matrix corresponding to $Y_{di}$. We treat all $S_{di}$ as fixed and known in the analysis. Ignoring the uncertainty in the $S_{di}$ is conventional in meta-analysis and is acceptable provided that the studies are reasonably large; this approximation is motivated by pragmatic considerations because it greatly simplifies the modelling. We do not impose any constraints on the form of the $S_{di}$ other than that they must be valid covariance matrices. The leading diagonal entries of the $S_{di}$ are within-study variances that can be calculated using standard methods. Assuming that the studies are composed of independent samples for each treatment, the other entries of the $S_{di}$ are calculated as the variance of the average outcome (for example the log odds or the sample mean) of the baseline treatment.

We define $\delta_1^{AB}, \delta_1^{AC}, \ldots, \delta_1^{AZ}$, where Z is the final treatment in the network, to be treatment effects relative to the reference treatment A, and call them basic parameters (Lu and Ades, 2006). We use the subscript 1 when defining the basic parameters to emphasise that they are treatment effects for the first (and in this section, only) outcome. We define $c = t - 1$ to be the number of basic parameters in the univariate setting. Treatment effects not involving A can be obtained as linear combinations of the basic parameters and are referred to as functional parameters (Lu and Ades, 2006). For example, the average treatment effect of treatment E relative to treatment C, $\delta_1^{CE} = \delta_1^{AE} - \delta_1^{AC}$, is a functional parameter. We define the $c \times 1$ column vector $\delta = (\delta_1^{AB}, \delta_1^{AC}, \ldots, \delta_1^{AZ})^T$ and design specific $c_d \times c$ design matrices $Z_{(d)}$. We use the subscript $(d)$ in these design matrices to emphasise that they apply to each individual study of design $d$; we reserve the subscript $d$ for design matrices that describe regression models for all outcome data from this design. If the $i$th entry of the $Y_{di}$ is an estimated treatment effect of treatment J relative to the reference treatment A then the $i$th row of $Z_{(d)}$ contains a single nonzero entry: 1 in the $(j-1)$th column, where $j$ is the position of J in the alphabet. If instead the $i$th entry of the $Y_{di}$ is an estimated treatment effect of treatment J relative to treatment K, $K \neq A$, then the $i$th row of $Z_{(d)}$ contains two nonzero entries: 1 in the $(j-1)$th column and $-1$ in the $(k-1)$th column.
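The row-filling rule for the design matrices can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `design_matrix_rows`, the representation of a design as a list of (baseline, treatment) pairs, and the five-treatment network are all hypothetical and not part of the paper.

```python
import numpy as np

def design_matrix_rows(contrasts, treatments):
    """Build a design matrix Z_(d) for one design (illustrative helper).

    contrasts  : list of (K, J) pairs, meaning "effect of J relative to K"
    treatments : ordered labels; treatments[0] is the reference treatment A
    """
    reference = treatments[0]
    # Column index of each non-reference treatment (B -> 0, C -> 1, ...)
    col = {t: i for i, t in enumerate(treatments[1:])}
    Z = np.zeros((len(contrasts), len(treatments) - 1))
    for row, (k, j) in enumerate(contrasts):
        Z[row, col[j]] = 1.0          # +1 in the column of treatment J
        if k != reference:
            Z[row, col[k]] = -1.0     # -1 in the column of a non-reference baseline K
    return Z

# Hypothetical network with treatments A to E; the CDE design has baseline C
treatments = ["A", "B", "C", "D", "E"]
Z_cde = design_matrix_rows([("C", "D"), ("C", "E")], treatments)
print(Z_cde)
```

Each row of `Z_cde` multiplied by the vector of basic parameters gives a functional parameter, here the CD and CE effects expressed as differences of effects relative to A.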

Our univariate model for network meta-analysis is

$$Y_{di} = Z_{(d)}\delta + \Theta_{di} + \Omega_d + \epsilon_{di} \tag{1}$$

where $\Theta_{di} \sim N(0, \tau_\beta^2 P_{c_d})$, $\Omega_d \sim N(0, \tau_\omega^2 P_{c_d})$, $\epsilon_{di} \sim N(0, S_{di})$, all $\Theta_{di}$, $\Omega_d$ and $\epsilon_{di}$ are independent, and $P_{c_d}$ is the $c_d \times c_d$ matrix with ones on the leading diagonal and halves elsewhere. We refer to $\tau_\beta^2$ and $\tau_\omega^2$ as the between-study variance, and the inconsistency variance, respectively. The term $\Theta_{di}$ is a study-by-treatment interaction term that models between-study heterogeneity. The model $\Theta_{di} \sim N(0, \tau_\beta^2 P_{c_d})$ implies that the heterogeneity variance is the same for all contrasts for every study regardless of whether or not the comparison is relative to the baseline treatment (Lu and Ades, 2004). Other simple choices of $P_{c_d}$, such as allowing the off-diagonal entries to differ from 0.5, violate this symmetry between treatments. For example, in the case $d = 2$ indicating the CDE design, the between-study variances for the CD and CE effects in this study are given by the two diagonal entries of $\tau_\beta^2 P_{c_d}$, which are both $\tau_\beta^2$. The between-study variance for the effect of E relative to D is $(-1, 1)\,\tau_\beta^2 P_{c_d}\,(-1, 1)^T$, which is also $\tau_\beta^2$. The $\Omega_d$ are design-by-treatment interaction terms that model inconsistency in the network. The model $\Omega_d \sim N(0, \tau_\omega^2 P_{c_d})$ implies that the inconsistency variance is the same for all contrasts for every design; other simple choices of $P_{c_d}$ violate this symmetry.
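The symmetry argument for the CDE design can be checked numerically. The following is a small sketch, assuming an arbitrary illustrative value for the between-study variance; the helper name `p_matrix` is hypothetical.

```python
import numpy as np

def p_matrix(c_d, off_diagonal=0.5):
    """c_d x c_d matrix with ones on the diagonal and a common off-diagonal value."""
    return (1.0 - off_diagonal) * np.eye(c_d) + off_diagonal * np.ones((c_d, c_d))

tau2_beta = 0.04                      # illustrative value, not taken from the paper
a = np.array([-1.0, 1.0])             # E relative to D equals CE minus CD

# With halves off the diagonal, the ED contrast has the same variance as CD and CE
var_ed = a @ (tau2_beta * p_matrix(2)) @ a

# With another off-diagonal value (0.3 here) the symmetry is lost
var_ed_other = a @ (tau2_beta * p_matrix(2, off_diagonal=0.3)) @ a

print(var_ed, var_ed_other)
```

The first variance equals the between-study variance itself, while the second is 1.4 times as large, which illustrates why off-diagonal entries other than 0.5 treat the baseline treatment differently from the others.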

To describe all estimates from all studies, we stack the $Y_{di}$ from the same design to form the $n_d \times 1$ column vector $Y_d = (Y_{d1}^T, \ldots, Y_{dN_d}^T)^T$, and we then stack these $Y_d$ to form the $n \times 1$ column vector $Y = (Y_1^T, \ldots, Y_D^T)^T$. Jackson et al. (2016) then use three further matrices that we also define here because they will be required to describe the estimation procedure that follows. The matrix $M_1$ is defined as an $n \times n$ square matrix where $m_{1ij} = 0$ if the $i$th and $j$th entries of $Y$, $i, j = 1, \ldots, n$, are estimates from different studies; otherwise $m_{1ii} = 1$, and $m_{1ij} = 1/2$ for $i \neq j$. The matrix $M_2$ is defined as an $n \times n$ square matrix where $m_{2ij} = 0$ if the $i$th and $j$th entries of $Y$, $i, j = 1, \ldots, n$, are estimates from different designs; otherwise $m_{2ij} = 1$ if the $i$th and $j$th entries of $Y$ are estimates of the same treatment comparison (for example, treatment A compared to treatment B) and $m_{2ij} = 1/2$ if these entries are estimates of different treatment comparisons. The supplementary materials give a concrete example showing how these two matrices are formed. Jackson et al. (2016) also define an $n \times c$ univariate design matrix $Z$, where if the $i$th entry of $Y$ is an estimated treatment effect of treatment J relative to the reference treatment A then the $i$th row of $Z$ contains a single nonzero entry: 1 in the $(j-1)$th column, where $j$ is the position of J in the alphabet. If instead the $i$th entry of $Y$ is an estimated treatment effect of treatment J relative to treatment K, $K \neq A$, then the $i$th row of $Z$ contains two nonzero entries: 1 in the $(j-1)$th column and $-1$ in the $(k-1)$th column. Defining $S_d = \mathrm{diag}(S_{d1}, \ldots, S_{dN_d})$, and then $S = \mathrm{diag}(S_1, \ldots, S_D)$, model (1) can be presented for the entire dataset as

$$Y \sim N(Z\delta,\; \tau_\beta^2 M_1 + \tau_\omega^2 M_2 + S)$$
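As an illustration of how $M_1$ and $M_2$ follow from their definitions, the sketch below builds both matrices from per-entry labels. The function name, the label vectors and the tiny dataset (two AB studies and one CDE study) are hypothetical, and study identifiers are assumed to be unique across designs.

```python
import numpy as np

def build_m1_m2(study, design, contrast):
    """study[i], design[i] and contrast[i] describe the i-th entry of Y."""
    n = len(study)
    M1 = np.zeros((n, n))
    M2 = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if study[i] == study[j]:
                # Same study: 1 on the diagonal, 1/2 between its different contrasts
                M1[i, j] = 1.0 if i == j else 0.5
            if design[i] == design[j]:
                # Same design: 1 for the same comparison, 1/2 for different comparisons
                M2[i, j] = 1.0 if contrast[i] == contrast[j] else 0.5
    return M1, M2

# Two AB studies (design 1) and one CDE study (design 2) with baseline C
study = [1, 2, 3, 3]
design = [1, 1, 2, 2]
contrast = ["AB", "AB", "CD", "CE"]
M1, M2 = build_m1_m2(study, design, contrast)
print(M1)
print(M2)
```

In this example the two AB studies are uncorrelated through $M_1$ but share an inconsistency effect through $M_2$, which is what distinguishes heterogeneity from inconsistency in the marginal covariance.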

3. A multivariate network meta-analysis model

We now explain how to extend the univariate model in section 2 to the multivariate setting to handle multiple outcomes. We define $p$ to be the number of outcomes, and so the dimension of the network meta-analysis, so that we now consider the case where $p > 1$. The $Y_{di}$ are now $pc_d \times 1$ column vectors, where the $Y_{di}$ contain $c_d$ column vectors of length $p$. For example, in a $p = 5$ dimensional meta-analysis and continuing with the example where $d = 2$ indicates the CDE design, we have $c_2 = 2$. The $Y_{2i}$ are then $10 \times 1$ column vectors where, taking C as the baseline treatment for this design, the first five entries of the $Y_{2i}$ are the estimated relative treatment effects of D compared to C for the five outcomes and the second five entries are the corresponding estimates for E compared to C. We define the $pc \times 1$ column vector $\delta = (\delta_1^{AB}, \delta_1^{AC}, \ldots, \delta_1^{AZ}, \delta_2^{AB}, \delta_2^{AC}, \ldots, \delta_2^{AZ}, \ldots, \delta_p^{AB}, \delta_p^{AC}, \ldots, \delta_p^{AZ})^T$, so that this vector contains the basic parameters for each outcome in turn. When $p = 1$ the vector $\delta$ reduces to its definition in the univariate setting, as given in section 2.

We define $\Sigma_\beta$ and $\Sigma_\omega$ to be $p \times p$ unstructured covariance matrices that are multivariate generalisations of $\tau_\beta^2$ and $\tau_\omega^2$. These two matrices contain the between-study variances and covariances, and the inconsistency variances and covariances, respectively, for all $p$ outcomes. We refer to $\Sigma_\beta$ and $\Sigma_\omega$ as the between-study covariance matrix, and the inconsistency covariance matrix, respectively. We continue to treat the within-study covariance matrices $S_{di}$ as if fixed and known in the analysis but these are now $pc_d \times pc_d$ matrices. The entries of the $S_{di}$ matrices that describe the covariance of estimated treatment effects for the same outcome can be obtained as in the univariate setting. However, the other entries of $S_{di}$, which describe the covariance between treatment effects for different outcomes, are harder to obtain in practice. A variety of strategies for dealing with this difficulty have been proposed (Jackson et al., 2011; Wei and Higgins, 2013).

3.1. The proposed multivariate model for network meta-analysis

In the multivariate setting, to allow correlations between estimated treatment effects for different outcomes, both within studies and designs, we propose that model (1) is generalised to

$$Y_{di} = X_{(d)}\delta + \Theta_{di} + \Omega_d + \epsilon_{di} \tag{2}$$

where $X_{(d)} = ((I_p \otimes Z_{(d)1})^T, \ldots, (I_p \otimes Z_{(d)c_d})^T)^T$, $Z_{(d)i}$ is the $i$th row of $Z_{(d)}$, $\Theta_{di} \sim N(0, P_{c_d} \otimes \Sigma_\beta)$, $\Omega_d \sim N(0, P_{c_d} \otimes \Sigma_\omega)$ and $\epsilon_{di} \sim N(0, S_{di})$, where all $\Theta_{di}$, $\Omega_d$ and $\epsilon_{di}$ are independent, and $\otimes$ is the Kronecker product. The random $\Theta_{di}$ and $\Omega_d$ continue to model between-study heterogeneity, and inconsistency, respectively. Recalling that $\delta$ contains the basic parameters for each outcome in turn, the design matrices $X_{(d)}$ provide the correct linear combinations of basic parameters to describe the mean of all estimated treatment effects in $Y_{di}$. Model (2) reduces to model (1) in one dimension. The definition of $P_{c_d}$ means that $\Sigma_\beta$ and $\Sigma_\omega$ are the between-study covariance matrix, and inconsistency covariance matrix, for all contrasts. We continue to define $Y$ as in the univariate setting, where $Y$ contains $n$ column vectors of estimated treatment effects that are of length $p$, so that $Y$ is an $np \times 1$ column vector in the multivariate setting. We define the multivariate $np \times pc$ design matrix $X = ((I_p \otimes Z_1)^T, \ldots, (I_p \otimes Z_n)^T)^T$, where $Z_i$ is the $i$th row of $Z$. Model (2) can be presented for the entire dataset as

$$Y \sim N(X\delta,\; M_1 \otimes \Sigma_\beta + M_2 \otimes \Sigma_\omega + S) \tag{3}$$

where we continue to define $S$ as in the univariate case. Matrices $M_1$ and $M_2$ are the same as in the univariate setting, and so continue to be $n \times n$ matrices. Model (3) is a linear mixed model for network meta-analysis and is conceptually similar to other models of this type (Piepho et al., 2012). If $\Sigma_\omega = 0$ then all $\Omega_d = 0$ and there is no inconsistency; we refer to this reduced model as the ‘consistent model’. If both $\Sigma_\beta = 0$ and $\Sigma_\omega = 0$ then all studies estimate the same effects to within-study sampling error and we refer to this model as the ‘common-effect and consistent model’.
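The Kronecker structure of model (3) can be made concrete with a short sketch. Everything numerical here is an assumption chosen for illustration: a network with treatments A, B and C, one AB study and one BC study, $p = 2$ outcomes, and arbitrary covariance matrices. The function names are hypothetical.

```python
import numpy as np

def multivariate_design(Z, p):
    """Stack (I_p kron Z_i) over the rows Z_i of the univariate design matrix Z."""
    return np.vstack([np.kron(np.eye(p), Z[i:i + 1, :]) for i in range(Z.shape[0])])

def marginal_covariance(M1, M2, Sigma_beta, Sigma_omega, S):
    """Covariance of Y in model (3): M1 kron Sigma_beta + M2 kron Sigma_omega + S."""
    return np.kron(M1, Sigma_beta) + np.kron(M2, Sigma_omega) + S

p = 2
Z = np.array([[1.0, 0.0],     # AB contrast from the AB study
              [-1.0, 1.0]])   # BC contrast from the BC study
M1 = np.eye(2)                # the two entries come from different studies
M2 = np.eye(2)                # ... and from different designs
Sigma_beta = np.array([[0.04, 0.01], [0.01, 0.09]])   # illustrative values
Sigma_omega = np.array([[0.02, 0.00], [0.00, 0.02]])  # illustrative values
S = 0.05 * np.eye(2 * p)                              # illustrative within-study covariance

X = multivariate_design(Z, p)
V = marginal_covariance(M1, M2, Sigma_beta, Sigma_omega, S)
print(X)
print(V)
```

With the basic parameters ordered outcome by outcome, the first two rows of `X` pick out the AB effect for outcomes 1 and 2, and the last two rows form the BC effect for each outcome as the AC effect minus the AB effect.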

Missing data (unobserved entries of $Y$) are common in applications as not all studies may provide data for all outcomes and contrasts. When there are missing outcome data, the model for the observed data is the marginal model for the observed data implied by (3), where any rows of $Y$ that contain missing values are discarded. We will use a non-likelihood based approach for making inferences and so assume any data are missing completely at random (Seaman et al., 2013). We define the diagonal $np \times np$ missing indicator matrix $R$, where $R_{ii} = 1$ if $Y_i$ is observed, $R_{ii} = 0$ if $Y_i$ is missing, and $R_{ij} = 0$ if $i \neq j$.

4. Multivariate estimation: a new method of moments

Our estimation procedure is motivated by the univariate method proposed by DerSimonian and Laird (1986). This was developed in the much simpler setting where each study provides a single estimate $Y_i$, and where the random-effects model $Y_i \sim N(\delta, \tau^2 + S_i)$ is assumed. This estimation method for $\tau^2$ uses the $Q$ statistic, where $Q = \sum S_i^{-1}(Y_i - \hat{\delta})^2$ and $\hat{\delta} = \sum S_i^{-1} Y_i / \sum S_i^{-1}$ is the pooled estimate under the common-effect model ($\tau^2 = 0$).
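For orientation, the univariate calculation can be sketched as follows. The $Q$ statistic and pooled estimate follow the definitions above; the step from $Q$ to an estimate of $\tau^2$ uses the standard DerSimonian and Laird moment formula, which this section does not write out, so treat that line as background rather than as part of the text. The data are made up for illustration.

```python
import numpy as np

def dersimonian_laird(y, s):
    """Univariate Q statistic, pooled estimate and truncated moment estimate of tau^2.

    y : study estimates Y_i
    s : within-study variances S_i (treated as known)
    """
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(s, dtype=float)           # common-effect weights S_i^{-1}
    delta_hat = np.sum(w * y) / np.sum(w)           # pooled estimate under tau^2 = 0
    q = np.sum(w * (y - delta_hat) ** 2)            # Q statistic
    # Standard DL moment step: E[Q] = (n - 1) + tau^2 * (sum w - sum w^2 / sum w)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2_hat = max(0.0, (q - (len(y) - 1)) / denom)
    return q, delta_hat, tau2_hat

# Made-up data: three studies with equal within-study variances
q, delta_hat, tau2_hat = dersimonian_laird([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(q, delta_hat, tau2_hat)
```

The matrix-based method developed in this section follows the same logic of matching a quadratic form in residuals to its expectation, with covariance matrices in place of the scalar variance.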

Now consider an alternative representation of this $Q$ statistic. Taking $Y = (Y_1, \ldots, Y_n)^T$, $S = \mathrm{diag}(S_1, \ldots, S_n)$ and $W = S^{-1}$ means that $Q = \mathrm{tr}(W(Y - \hat{Y})(Y - \hat{Y})^T)$, where $\hat{Y}$ is obtained under the common-effect model. To obtain a $p \times p$ matrix generalisation of $Q$ for multivariate analyses, we replace the trace operator with the block trace operator in this expression (Jackson et al., 2013). The block trace operator is a generalisation of the trace that sums over all $n$ of the $p \times p$ matrices along the main block diagonal of an $np \times np$ matrix. This produces a $p \times p$ matrix. In the absence of missing data we can write our multivariate generalisation of the $Q$ statistic, $\mathrm{btr}(W(Y - \hat{Y})(Y - \hat{Y})^T)$, as a weighted sum of outer products of $p \times 1$ vectors of residuals under the common-effect and consistent model. Hence the distribution of $\mathrm{btr}(W(Y - \hat{Y})(Y - \hat{Y})^T)$ depends directly on the magnitudes of unknown variance components.
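A minimal sketch of the block trace operator, assuming the matrix dimension is a multiple of $p$; the function name `btr` mirrors the notation of the text but the implementation is only illustrative.

```python
import numpy as np

def btr(M, p):
    """Sum the p x p blocks along the main block diagonal of an (n*p) x (n*p) matrix."""
    n = M.shape[0] // p
    out = np.zeros((p, p))
    for i in range(n):
        out += M[i * p:(i + 1) * p, i * p:(i + 1) * p]
    return out

M = np.arange(16.0).reshape(4, 4)
print(btr(M, 2))   # sum of the two 2 x 2 diagonal blocks
print(btr(M, 1))   # with p = 1 the block trace is the ordinary trace
```

With $p = 1$ the operator returns the usual trace as a $1 \times 1$ matrix, which is why the matrix $Q$ statistic reduces to the scalar one in the univariate case.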

4.1. A Q matrix for multivariate network meta-analysis

We define a within-study precision matrix $W$ corresponding to $S$. If there are no missing outcome data in $Y$ then we define $W = S^{-1}$, where $S$ is taken from model (3). If there are missing data in $Y$ then the entries of $W$ that correspond to observed data are obtained as the inverse of the corresponding entries of the within-study covariance matrix of reduced dimension (equal to that of the observed data) and the other entries of $W$ are set to zero. For example, consider the case where $Y$ is a $6 \times 1$ vector but only the second and fifth entries are observed; this corresponds to much less outcome data than would be used in practice but provides an especially simple example. Then we define $S_r$, where the subscript $r$ indicates a dimension reduction, as a $2 \times 2$ matrix whose entries are the within-study variances and covariances of the two observed entries of $Y$. The $6 \times 6$ precision matrix $W$ then has all zero entries in the first, third, fourth and sixth rows and columns. However, the remaining entries of $W$ are the entries of the $2 \times 2$ matrix $S_r^{-1}$, so that $W_{22} = (S_r^{-1})_{11}$, $W_{25} = (S_r^{-1})_{12}$, $W_{52} = (S_r^{-1})_{21}$, and $W_{55} = (S_r^{-1})_{22}$. We define $\hat{Y}$ to be the fitted value of $Y$ under the common-effect and consistent model ($\Sigma_\beta = \Sigma_\omega = 0$), so that $\hat{Y} = HY$ where $H = X(X^T W X)^{-1} X^T W$. We also define an asymmetric $np \times np$ matrix (Jackson et al., 2013)

$$Q = W\{R(Y - \hat{Y})\}\{R(Y - \hat{Y})\}^T = W(Y - \hat{Y})(Y - \hat{Y})^T R \tag{4}$$

Our definitions of $W$ and $R$ mean that $WR = W$, which results in the simplified version of $Q$ in (4). From the first form given in (4), we have that the residuals $Y - \hat{Y}$ are pre-multiplied by $R$, so that any residuals that correspond to missing outcome data do not contribute to $Q$. Furthermore, missing outcome data do not contribute to $\hat{Y}$ because they have no weight under the common-effect and consistent model. Hence we can impute missing outcome data with any finite value without changing the value of $Q$. This is merely a convenient way to handle missing data numerically and has no implications for the statistical modelling.
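The construction of $W$ with missing data, and the property $WR = W$, can be illustrated with the $6 \times 1$ example from the text. The within-study covariance matrix below is made up, and the helper name `precision_with_missing` is hypothetical.

```python
import numpy as np

def precision_with_missing(S, observed):
    """Invert S on the observed entries only; all other entries of W are zero."""
    idx = np.flatnonzero(np.asarray(observed, dtype=bool))
    W = np.zeros_like(S, dtype=float)
    W[np.ix_(idx, idx)] = np.linalg.inv(S[np.ix_(idx, idx)])
    return W

# Made-up 6 x 6 within-study covariance matrix (positive definite)
S = np.diag([0.1, 0.2, 0.3, 0.4, 0.5, 0.6]) + 0.01

# Only the second and fifth entries of Y are observed
observed = [False, True, False, False, True, False]
R = np.diag(np.array(observed, dtype=float))     # missing indicator matrix
W = precision_with_missing(S, observed)

S_r_inv = np.linalg.inv(S[np.ix_([1, 4], [1, 4])])   # inverse of the reduced matrix S_r
print(np.allclose(W @ R, W))
print(W[1, 1], S_r_inv[0, 0])
```

Because the rows and columns of `W` for missing entries are zero, any finite placeholder value in the corresponding positions of $Y$ is multiplied by zero in the fitted values and in $Q$.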

4.2. Design specific Q matrices for multivariate network meta-analysis

In order to identify the full model, we will require design-specific versions of $Q$ that only use data from a particular design. As in the univariate setting, we stack the outcome data from design $d$ to form the vector $Y_d = (Y_{d1}^T, \ldots, Y_{dN_d}^T)^T$. In the multivariate setting the vector $Y_d$ contains $n_d$ estimated effects each of length $p$, so that $Y_d$ is now a $pn_d \times 1$ column vector. We define the design specific $n_d \times n_d$ matrix $M_1^d$, where $m_{1ij}^d = 0$ if the $i$th and $j$th estimated effect (of length $p$) in $Y_d$, $i, j = 1, \ldots, n_d$, are from separate studies; otherwise $m_{1ii}^d = 1$ and $m_{1ij}^d = 1/2$ for $i \neq j$. We define the $pn_d \times pc_d$ design matrix $X_d$ which is obtained by stacking identity matrices of dimension $pc_d$, where we include one such identity matrix for each study of design $d$. Hence $X_d = 1_{N_d} \otimes I_{pc_d}$, where $1_{N_d}$ is the $N_d \times 1$ column vector where every entry is one. We also define the $pc_d \times 1$ column vector $\beta_d = X_{(d)}\delta + \Omega_d$.

An identifiable design-specific marginal model for outcome data from design d only, that is implied by model (2), is

$$Y_d \sim N(X_d\beta_d,\; M_1^d \otimes \Sigma_\beta + S_d) \tag{5}$$

where $S_d = \mathrm{diag}(S_{d1}, \ldots, S_{dN_d})$. We can also calculate design specific versions of (4) where we calculate all quantities, including the fitted values, using just the data from studies of design $d$. We define these $pn_d \times pn_d$ design specific matrices as

$$Q_d = W_d(Y_d - \hat{Y}_d)(Y_d - \hat{Y}_d)^T R_d \tag{6}$$

where $W_d$, $R_d$ and $\hat{Y}_d$ in (6) are defined in the same way as $W$, $R$ and $\hat{Y}$ in (4) but where only data from design $d$ are used. Hence $R_d$ and $W_d$ are the missing indicator matrix, and the within-study precision matrix, of $Y_d$, respectively. We compute $\hat{Y}_d = H_d Y_d$ where $H_d = X_d(X_d^T W_d X_d)^{-1} X_d^T W_d$. When computing $H_d$ we take the matrix inverse to be the Moore-Penrose pseudoinverse. This is so that any design-specific regression corresponding to this hat matrix that is not fully identifiable (due to missing outcome data) can still contribute to the estimation. We use model (5) to derive the properties of $Q_d$ in equation (6).

4.3. The estimating equations

We base our estimation on the two $p \times p$ matrices $\mathrm{btr}(Q)$ and $\sum_{d=1}^{D} \mathrm{btr}(Q_d)$, where $Q$ and $Q_d$ are given in (4) and (6), respectively. Specifically, we match these quantities to their expectations to estimate the unknown variance parameters using the method of moments.

4.3.1. Evaluating E[btr(Q)] and deriving the first estimating equation

We define $A = (I_{np} - H)^T W$ and $B = (I_{np} - H)^T R$, which are known $np \times np$ matrices. We also divide the matrices $A$ and $B$ into $n^2$ blocks of $p \times p$ matrices, and write $A_{i,j}$ and $B_{i,j}$, $i, j = 1, \ldots, n$, to mean the $i$th by $j$th blocks of $A$ and $B$ respectively. Hence $A_{i,j}$ and $B_{i,j}$ are both $p \times p$ matrices. In the supplementary materials we show that

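$$E[\mathrm{btr}(Q)] = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} m_{1ij} A_{k,i} \Sigma_\beta B_{j,k} + \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} m_{2ij} A_{k,i} \Sigma_\omega B_{j,k} + \mathrm{btr}(B).$$

We apply the $\mathrm{vec}(\cdot)$ operator to both sides of the previous equation and use the identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ (see Henderson and Searle, 1981), to obtain

$$\mathrm{vec}(E[\mathrm{btr}(Q)]) = C\,\mathrm{vec}(\Sigma_\beta) + D\,\mathrm{vec}(\Sigma_\omega) + E \tag{7}$$

where

$$C = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} m_{1ij}\, B_{j,k}^T \otimes A_{k,i},$$

$$D = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} m_{2ij}\, B_{j,k}^T \otimes A_{k,i}$$

and

$$E = \mathrm{vec}(\mathrm{btr}(B)).$$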

Upon substituting $E[\mathrm{btr}(Q)] = \mathrm{btr}(Q)$, $\Sigma_\beta = \hat{\Sigma}_\beta$ and $\Sigma_\omega = \hat{\Sigma}_\omega$ in equation (7), the method of moments gives one estimating equation in the vectorised form of two unknown covariance matrices.
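The vectorisation step can be checked numerically. The sketch below, with hypothetical helper names and randomly generated stand-ins for $A$ and $B$ (not the matrices implied by any real dataset), forms a coefficient matrix from the triple sum and confirms that it reproduces the vectorised block trace of $A(M \otimes \Sigma)B$ for a given $n \times n$ matrix $M$.

```python
import numpy as np

def block(M, i, j, p):
    """The (i, j)th p x p block of M (zero-based indices)."""
    return M[i * p:(i + 1) * p, j * p:(j + 1) * p]

def btr(M, p):
    """Block trace: sum of the p x p blocks on the main block diagonal."""
    return sum(block(M, i, i, p) for i in range(M.shape[0] // p))

def vec(M):
    """Stack the columns of M into a single vector."""
    return M.flatten(order="F")

def coefficient_matrix(M, A, B, p):
    """Triple sum over i, j, k of m_ij * (B_{j,k}^T kron A_{k,i})."""
    n = M.shape[0]
    C = np.zeros((p * p, p * p))
    for i in range(n):
        for j in range(n):
            if M[i, j] == 0.0:
                continue                      # zero entries contribute nothing
            for k in range(n):
                C += M[i, j] * np.kron(block(B, j, k, p).T, block(A, k, i, p))
    return C

rng = np.random.default_rng(1)
n, p = 3, 2
A = rng.normal(size=(n * p, n * p))           # arbitrary stand-in for A
B = rng.normal(size=(n * p, n * p))           # arbitrary stand-in for B
M = np.array([[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]])
Sigma = np.array([[0.3, 0.1], [0.1, 0.2]])

lhs = vec(btr(A @ np.kron(M, Sigma) @ B, p))
rhs = coefficient_matrix(M, A, B, p) @ vec(Sigma)
print(np.allclose(lhs, rhs))
```

Passing $M_1$ gives the matrix $C$ and passing $M_2$ gives $D$; the same function applied to the design-specific blocks would give the design-specific coefficient matrices used in the second estimating equation.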

4.3.2. Evaluating E[btr(Qd)] and deriving the second estimating equation

Model (5) depends upon one unknown covariance matrix, $\Sigma_\beta$. The intuition is that, upon using all $D$ of the $Q_d$ matrices in (6) and the method of moments to estimate $\Sigma_\beta$, we will then be able to estimate the other unknown covariance matrix $\Sigma_\omega$ using the first estimating equation. We define design specific $A_d = (I_{pn_d} - H_d)^T W_d$ and $B_d = (I_{pn_d} - H_d)^T R_d$, where $A_d$ and $B_d$ are known $pn_d \times pn_d$ matrices. We also divide the matrices $A_d$ and $B_d$ into $n_d^2$ blocks of $p \times p$ matrices, and write $A_{d,i,j}$ and $B_{d,i,j}$, $i, j = 1, \ldots, n_d$, to mean the $i$th by $j$th blocks of $A_d$ and $B_d$ respectively. In the supplementary materials we show that

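$$\mathrm{vec}\left(E\left[\sum_{d=1}^{D} \mathrm{btr}(Q_d)\right]\right) = \left(\sum_{d=1}^{D} C_d\right)\mathrm{vec}(\Sigma_\beta) + \sum_{d=1}^{D} E_d \tag{8}$$

where

$$C_d = \sum_{i=1}^{n_d}\sum_{j=1}^{n_d}\sum_{k=1}^{n_d} m_{1ij}^d\, B_{d,j,k}^T \otimes A_{d,k,i}$$

and

$$E_d = \mathrm{vec}(\mathrm{btr}(B_d)).$$

Upon substituting $E\left[\sum_{d=1}^{D} \mathrm{btr}(Q_d)\right] = \sum_{d=1}^{D} \mathrm{btr}(Q_d)$ and $\Sigma_\beta = \hat{\Sigma}_\beta$ in (8), we obtain a second estimating equation from the method of moments.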

4.4. Solving the estimating equations and performing inference

We solve the estimating equation resulting from (8) for $\mathrm{vec}(\hat{\Sigma}_\beta)$, substitute this estimate into the estimating equation resulting from (7), and then solve for $\mathrm{vec}(\hat{\Sigma}_\omega)$.
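The two-step solution amounts to two linear solves. The sketch below assumes that the vectorised observed block traces and the coefficient matrices from (7) and (8) have already been computed; the function and argument names are hypothetical, and the inputs in the example are synthetic rather than derived from data.

```python
import numpy as np

def solve_moment_equations(q, q_d_sum, C, D, E, C_d_sum, E_d_sum, p):
    """Solve (8) for vec(Sigma_beta), then (7) for vec(Sigma_omega).

    q       : vec(btr(Q)) computed from the data
    q_d_sum : vec of the sum over designs of btr(Q_d)
    """
    vec_beta = np.linalg.solve(C_d_sum, q_d_sum - E_d_sum)      # equation (8)
    vec_omega = np.linalg.solve(D, q - C @ vec_beta - E)        # equation (7)
    # Undo the column-stacking vec operator; no symmetry or truncation is imposed here
    return (vec_beta.reshape((p, p), order="F"),
            vec_omega.reshape((p, p), order="F"))

# Synthetic check: build q and q_d_sum from known matrices and recover them
p = 2
C = 2.0 * np.eye(4) + 0.1
D = 3.0 * np.eye(4) + 0.2
C_d_sum = 1.5 * np.eye(4) + 0.1
E = np.array([1.0, 0.2, 0.2, 1.0])
E_d_sum = np.array([0.5, 0.1, 0.1, 0.5])
Sigma_beta = np.array([[0.30, 0.10], [0.10, 0.20]])
Sigma_omega = np.array([[0.05, 0.01], [0.01, 0.08]])
q = C @ Sigma_beta.flatten(order="F") + D @ Sigma_omega.flatten(order="F") + E
q_d_sum = C_d_sum @ Sigma_beta.flatten(order="F") + E_d_sum

est_beta, est_omega = solve_moment_equations(q, q_d_sum, C, D, E, C_d_sum, E_d_sum, p)
print(est_beta)
print(est_omega)
```

If either linear system is singular, `np.linalg.solve` raises an error, which corresponds to the situation discussed under model identification where the variance components cannot be identified by this method.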

4.4.1. Estimating Σβ under the consistent model

Some applied analysts may prefer to assume the consistent model ($\Sigma_\omega = 0$). As in the univariate case (Jackson et al., 2016), we have two possible ways of estimating $\Sigma_\beta$ under the consistent model: we can use the estimating equation resulting from (7) with $\Sigma_\omega = 0$, or the estimating equation resulting from (8) as in the full model. Also as in the univariate case, we suggest the former option because it uses the additional information gained by assuming consistency when estimating $\Sigma_\beta$. However, this first option is valid only under the consistent model.

4.4.2. ‘Truncating’ the estimates of the unknown covariance matrices so that they are symmetric and positive semi-definite

As in the univariate case, there is the problem that the point estimates of the two unknown covariance matrices are not necessarily positive semi-definite. The method of moments does not even initially enforce the constraint that the point estimates of the unknown covariance matrices are symmetrical (Chen et al., 2012; Jackson et al., 2013). We produce a symmetric estimator corresponding to an estimated covariance matrix $\hat{\Sigma}$ as $(\hat{\Sigma}^T + \hat{\Sigma})/2$ (Chen et al., 2012; Jackson et al., 2013). This also corresponds to taking the average of estimates that result from our $Q$ and $Q_d$ matrices and their transposes (Jackson et al., 2013). We then write these symmetric estimators in terms of their spectral decomposition (Chen et al., 2012; Jackson et al., 2013) and truncate any negative eigenvalues to zero to provide the final symmetric positive semi-definite estimated covariance matrices. Specifically, we define the truncated estimate corresponding to the symmetrical $\hat{\Sigma}$ as $\hat{\Sigma}^{+} = \sum_{i=1}^{p} \max(0, \lambda_i)\, e_i e_i^T$, where $\lambda_i$ is the $i$th eigenvalue of the symmetric $\hat{\Sigma}$ and $e_i$ is the corresponding normalised eigenvector.
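The symmetrisation and eigenvalue truncation can be sketched as follows; the function name `truncate_psd` and the input matrix are illustrative only.

```python
import numpy as np

def truncate_psd(sigma_hat):
    """Symmetrise an estimate and set any negative eigenvalues to zero."""
    sym = (sigma_hat + sigma_hat.T) / 2.0
    lam, vecs = np.linalg.eigh(sym)               # eigenvalues and orthonormal eigenvectors
    # Sum of max(0, lambda_i) * e_i e_i^T, written as a matrix product
    return (vecs * np.maximum(lam, 0.0)) @ vecs.T

# Asymmetric input whose symmetric part [[1, 2], [2, 1]] has eigenvalues -1 and 3
sigma_hat = np.array([[1.0, 2.2], [1.8, 1.0]])
sigma_plus = truncate_psd(sigma_hat)
print(sigma_plus)
```

In this example the negative eigenvalue is dropped, leaving the rank-one matrix with every entry equal to 1.5; an input that is already symmetric and positive semi-definite is returned unchanged up to rounding.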

4.4.3. Inference for δ

Inference for $\delta$ then proceeds as a weighted regression where all weights are treated as fixed and known. Writing $\hat{V}$ as the estimated variance of $Y$ in (3), in the absence of missing outcome data we have $\hat{\delta} = (X^T \hat{V}^{-1} X)^{-1} X^T \hat{V}^{-1} Y$ where $\mathrm{Var}(\hat{\delta}) = (X^T \hat{V}^{-1} X)^{-1}$. In the presence of missing data we can, under our missing completely at random assumption, apply these standard formulae for weighted regression to the observed outcomes. Alternatively and equivalently, we can impute the missing outcome data in $Y$ with an arbitrary value and replace $\hat{V}^{-1}$ with the precision matrix corresponding to $\hat{V}$, calculated in the way explained for $S$ in section 4.1 (Jackson et al., 2011). Approximate confidence intervals and hypothesis tests for all basic parameters for all outcomes then immediately follow by taking $\hat{\delta}$ to be approximately normally distributed. Inferences for functional parameters follow by taking appropriate linear combinations of $\hat{\delta}$.
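The weighted regression step, for the case without missing data, can be sketched as below. The function name and the toy inputs are hypothetical, and the 95% interval uses the usual normal quantile.

```python
import numpy as np

def weighted_regression(X, V_hat, Y):
    """Weighted least squares estimate of delta and its covariance, treating V_hat as known."""
    Vinv_X = np.linalg.solve(V_hat, X)            # V_hat^{-1} X without forming the inverse
    cov = np.linalg.inv(X.T @ Vinv_X)             # (X^T V_hat^{-1} X)^{-1}
    delta_hat = cov @ (Vinv_X.T @ Y)              # valid because V_hat is symmetric
    return delta_hat, cov

# Toy example: three estimates of a single basic parameter with equal variances
X = np.ones((3, 1))
V_hat = 0.04 * np.eye(3)
Y = np.array([0.1, 0.5, 0.9])

delta_hat, cov = weighted_regression(X, V_hat, Y)
se = np.sqrt(np.diag(cov))
ci_lower = delta_hat - 1.959964 * se              # approximate 95% confidence limits
ci_upper = delta_hat + 1.959964 * se
print(delta_hat, se, ci_lower, ci_upper)
```

A functional parameter is estimated by a linear combination `a @ delta_hat` with variance `a @ cov @ a`, where `a` holds the appropriate coefficients of the basic parameters.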

4.5. Special cases of the estimation procedure

In the supplementary materials we show that the proposed method reduces to two previous methods in special cases. If all studies are two arm studies and consistency is assumed then the proposed method reduces to the matrix based method for multivariate meta-regression (Jackson et al., 2013). The proposed multivariate method reduces to the univariate DerSimonian and Laird method for network meta-analysis (Jackson et al., 2016) when p = 1.

4.6. Model identification

If the necessary standard matrix inversions resulting from the estimating equations from (7) and (8) cannot be performed then both unknown variance components cannot be identified using the proposed method. A minimum requirement for any multivariate modelling is that the common-effect and consistent model must be identifiable. This means that there must be some information (direct or indirect) about each basic parameter for all outcomes. Two or more studies of the same design must provide data for all possible pairs of outcomes to identify Σβ. Two or more studies of different designs must provide data for all possible pairs of outcomes to identify Σω. If these conditions are satisfied then the model will be identifiable. In situations where our model is not identifiable we suggest that simpler models should be considered instead. Possible strategies for this include considering models of lower dimension or the consistent model. In practice it is highly desirable to have more than the minimum amount of replication required, both within and between designs, so that the model is well identified. We make some pragmatic decisions in the next section for our example to provide sufficient replication within designs, in order to estimate Σβ with reasonable precision.

5. Example

The methodology developed in this paper is now applied to an illustrative example in relapsing remitting multiple sclerosis (RRMS). Multiple sclerosis (MS) is an inflammatory disease of the brain and spinal cord and RRMS is a common type of MS. The effectiveness of a new treatment is typically measured to assess its impact on relapse rate and odds of disease progression. Magnetic Resonance Imaging (MRI) allows measurement of the number of new or enlarging lesions in the brain. Three outcomes are included in our analyses, so that p = 3 in the full three dimensional network meta-analysis. These three outcomes are: (1) the log rate ratio of new or enlarging MRI lesions; (2) the log annualised relapse rate ratio; and (3) log disability progression odds ratio. Relapse is defined as appearance of new, worsening or recurrence of neurological symptoms that can be attributable to MS, accompanied by an increase of a score on the Expanded Disability Status Scale (EDSS) and also functional-systems score(s), lasting at least 24 hours, preceded by neurologic stability for at least 30 days. Disability progression is defined as an increase in EDSS score that was sustained for 12 weeks, with an absence of relapse at the time of assessment. Negative basic parameters indicate that treatments B-F are beneficial compared to treatment A throughout.

Data in this illustrative example were obtained from ten randomised controlled trials of six treatment options (coded in the network data as treatments A to F): placebo (A), interferon beta-1b (B), interferon beta-1a (C), glatiramer (D), and two doses of fingolimod, 0.5 mg (E) and 1.25 mg (F). Three trials of fingolimod were three-arm (two doses and a control) and are included as three-arm studies. Three trials of interferon beta (one 1a and two 1b) were three-arm (also two doses and a control), and these were included as separate two-arm trials (each dose against the control, with the number of participants in each control arm halved). This ignores the differences in doses of interferon beta and was a pragmatic decision to help provide an identifiable network. Briefly, in this example there is very little replication within designs, so that identifying Σβ well is very difficult without making pragmatic decisions such as this. Sormani et al. (2010) also treat these particular studies as two separate studies in this way, which helps them to identify their meta-regression models. Treating these three studies as separate two-arm trials means that the data are analysed as being from thirteen studies; a summary of the resulting data structure is shown in Table 1. There are eight different designs in Table 1 and so there is relatively little replication within designs, even when including three of the three-arm studies as separate two-arm studies. Full details of the dataset that are relevant to this paper are given in the supplementary materials; see also Bujkiewicz et al. (2016). Figure 1 provides network diagrams that show the number of comparisons between each pair of treatments on the edges. In these diagrams the three-arm studies (Table 1) are taken to contribute three comparisons; for example, the CEF study contributes CE, CF and EF comparisons. However, only two estimates of treatment effect from this study contribute to the analyses, because C is taken as the baseline: the study's estimated EF treatment effect contains no additional information once its CE and CF contrasts are included in the analysis.
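The redundancy of the third contrast can be written out explicitly. The notation below is an assumption introduced only for this illustration (ŷ_XY denotes a single study's estimated effect of treatment Y relative to treatment X for one outcome); it presumes the contrasts are formed from the same arm-level summaries on the log scale.

```latex
\hat{y}_{EF} = \hat{y}_{CF} - \hat{y}_{CE},
\qquad
\operatorname{Var}(\hat{y}_{EF})
  = \operatorname{Var}(\hat{y}_{CF}) + \operatorname{Var}(\hat{y}_{CE})
  - 2\,\operatorname{Cov}(\hat{y}_{CF}, \hat{y}_{CE}).
```

Because the EF contrast and its variance are fully determined by the two contrasts against the baseline arm C and their covariance, including it would add no information.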

Table 1. Summary of the relapsing remitting multiple sclerosis dataset.

| Study | Design | Outcomes |
|---|---|---|
| IFNB SG (1) | AB | All three outcomes measured |
| IFNB SG (2) | AB | All three outcomes measured |
| Jacobs/Simon | AC | All three outcomes measured |
| PRISMS (1) | AC | All three outcomes measured |
| PRISMS (2) | AC | All three outcomes measured |
| Johnson | AD | Relapse rate and disability progression only |
| Durelli | BC | Relapse rate and disability progression only |
| O’Connor (1) | BD | Relapse rate and disability progression only |
| O’Connor (2) | BD | Relapse rate and disability progression only |
| Mikol | CD | All three outcomes measured |
| FREEDOMS 1 | AEF | All three outcomes measured |
| FREEDOMS 2 | AEF | All three outcomes measured |
| TRANSFORMS | CEF | All three outcomes measured |

Figure 1.

Network diagrams for the RRMS dataset. A – placebo, B – interferon beta-1b, C – interferon beta-1a, D – glatiramer, E – fingolimod 0.5 mg, F – fingolimod 1.25 mg. The left-hand-side network corresponds to studies reporting the log annualised relapse rate ratio and log disability progression odds ratio (y2 and y3), for which data are complete. The right-hand-side network corresponds to studies reporting the log rate ratio of new or enlarging MRI lesions (y1, which is not reported in four studies). The numbers shown on the network edges are the number of direct comparisons of each pair of treatments; the absence of an edge indicates that there is no direct comparison. Three of the thirteen studies are three-arm trials, which are each taken to provide three direct comparisons (a direct comparison between each treatment pair). Hence there are 19 direct comparisons in the left-hand-side network, where there are no missing data.
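The edge counts described in the caption can be reproduced from the designs listed in Table 1. The sketch below is illustrative only: the list `TABLE1` is a hand transcription of Table 1, and the helper `edge_counts` is a hypothetical name, not part of the authors' software.

```python
from collections import Counter
from itertools import combinations

# (study, design, reports MRI outcome y1), transcribed by hand from Table 1.
TABLE1 = [
    ("IFNB SG (1)", "AB", True),
    ("IFNB SG (2)", "AB", True),
    ("Jacobs/Simon", "AC", True),
    ("PRISMS (1)", "AC", True),
    ("PRISMS (2)", "AC", True),
    ("Johnson", "AD", False),
    ("Durelli", "BC", False),
    ("O'Connor (1)", "BD", False),
    ("O'Connor (2)", "BD", False),
    ("Mikol", "CD", True),
    ("FREEDOMS 1", "AEF", True),
    ("FREEDOMS 2", "AEF", True),
    ("TRANSFORMS", "CEF", True),
]


def edge_counts(rows):
    """Count direct comparisons per treatment pair; a k-arm study adds k(k-1)/2 edges."""
    counts = Counter()
    for _study, design, _has_mri in rows:
        for pair in combinations(design, 2):
            counts["".join(pair)] += 1
    return counts


left = edge_counts(TABLE1)                          # y2 and y3: all thirteen studies
right = edge_counts([r for r in TABLE1 if r[2]])    # y1: studies reporting MRI lesions

print(len(TABLE1), "studies,", len({d for _, d, _ in TABLE1}), "designs")  # 13 studies, 8 designs
print(sum(left.values()), "comparisons for y2/y3")                         # 19
print(sum(right.values()), "comparisons for y1")                           # 15
```

The total of 19 for the complete-data network matches the caption; the total of 15 for the MRI network follows from dropping the four studies that do not report y1.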

Table 2 shows the estimates of the basic parameters (treatment effects relative to the reference treatment, placebo) obtained from univariate network meta-analyses, bivariate analyses for all three combinations of pairs of outcomes, and the trivariate analysis. The results are similar across all analyses, and conclusions from univariate and multivariate analyses are the same. This is disappointing because the multivariate analyses have not resulted in more precise inference. The entries of Σ̂β and Σ̂ω are shown in Table 3. The positive estimates obtained for the unknown variance components suggest that this example exhibits some between-study heterogeneity and inconsistency. In order to assess the impact of the unknown variance components, we also fitted the consistent model and the common-effect and consistent model (results not shown) using all three outcomes (p = 3). On average, the standard errors of the fifteen basic parameters from the full model are 35% greater (range: 13% to 84%) than those from the consistent model, which in turn are 58% (range: 8% to 128%) greater than those from the common-effect and consistent model. Both the between-study heterogeneity and the inconsistency therefore have a notable impact.

Table 2. Treatment effect estimates of each treatment relative to the reference treatment A (placebo).

Entries are estimates (standard errors).

| Outcome | Model | AB | AC | AD | AE | AF |
|---|---|---|---|---|---|---|
| MRI (y1) | univariate (y1) | -0.95 (0.39) | -1.00 (0.21) | -0.68 (0.50) | -1.38 (0.26) | -1.52 (0.26) |
| MRI (y1) | bivariate (y1, y2) | -0.94 (0.39) | -1.00 (0.21) | -0.68 (0.50) | -1.39 (0.26) | -1.53 (0.26) |
| MRI (y1) | bivariate (y1, y3) | -0.96 (0.39) | -0.98 (0.22) | -0.66 (0.50) | -1.38 (0.26) | -1.51 (0.26) |
| MRI (y1) | trivariate (y1, y2, y3) | -0.96 (0.39) | -0.97 (0.22) | -0.67 (0.50) | -1.38 (0.26) | -1.51 (0.26) |
| Relapse rate (y2) | univariate (y2) | -0.35 (0.10) | -0.25 (0.09) | -0.34 (0.11) | -0.81 (0.12) | -0.78 (0.12) |
| Relapse rate (y2) | bivariate (y1, y2) | -0.35 (0.10) | -0.25 (0.09) | -0.34 (0.11) | -0.81 (0.12) | -0.78 (0.12) |
| Relapse rate (y2) | bivariate (y2, y3) | -0.36 (0.11) | -0.23 (0.10) | -0.33 (0.12) | -0.80 (0.13) | -0.77 (0.13) |
| Relapse rate (y2) | trivariate (y1, y2, y3) | -0.36 (0.11) | -0.23 (0.10) | -0.33 (0.12) | -0.80 (0.13) | -0.77 (0.13) |
| Disability progression (y3) | univariate (y3) | -0.46 (0.25) | -0.11 (0.21) | -0.42 (0.25) | -0.33 (0.25) | -0.37 (0.24) |
| Disability progression (y3) | bivariate (y2, y3) | -0.47 (0.25) | -0.10 (0.21) | -0.43 (0.25) | -0.37 (0.25) | -0.37 (0.25) |
| Disability progression (y3) | bivariate (y1, y3) | -0.46 (0.25) | -0.11 (0.21) | -0.42 (0.25) | -0.34 (0.25) | -0.38 (0.25) |
| Disability progression (y3) | trivariate (y1, y2, y3) | -0.47 (0.25) | -0.10 (0.21) | -0.43 (0.25) | -0.37 (0.25) | -0.37 (0.25) |
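Since the entries of Table 2 are on the log scale, it may help to see how one of them translates to a ratio. The sketch below back-transforms the trivariate estimate of the relapse rate effect for fingolimod 0.5 mg versus placebo (AE: -0.80, se 0.13). The normal-approximation 95% interval using 1.96 is an assumption made for this illustration; the paper reports only estimates and standard errors, and the helper name `ratio_with_ci` is hypothetical.

```python
import math


def ratio_with_ci(log_estimate, se, z=1.96):
    """Exponentiate a log ratio and a Wald-type interval (normal approximation assumed)."""
    point = math.exp(log_estimate)
    lower = math.exp(log_estimate - z * se)
    upper = math.exp(log_estimate + z * se)
    return point, lower, upper


# Trivariate relapse rate estimate for AE, taken from Table 2.
rate_ratio, lower, upper = ratio_with_ci(-0.80, 0.13)
print(f"rate ratio {rate_ratio:.2f} (approx. 95% interval {lower:.2f} to {upper:.2f})")
```

A rate ratio below one corresponds to a negative basic parameter, that is, a benefit relative to placebo.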

Table 3. Estimates of the inconsistency (Σω) and heterogeneity (Σβ) covariance matrices.

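| Model | Σω11 | Σω12 | Σω13 | Σω22 | Σω23 | Σω33 |
|---|---|---|---|---|---|---|
| univariate (y1) | 0.0000 | | | | | |
| univariate (y2) | | | | 0.0115 | | |
| univariate (y3) | | | | | | 0.0713 |
| bivariate (y1, y2) | 0.0002 | 0.0017 | | 0.0125 | | |
| bivariate (y1, y3) | 0.0018 | | 0.0116 | | | 0.0741 |
| bivariate (y2, y3) | | | | 0.0161 | 0.0344 | 0.0735 |
| trivariate (y1, y2, y3) | 0.0027 | 0.0066 | 0.0143 | 0.0161 | 0.0349 | 0.0756 |

| Model | Σβ11 | Σβ12 | Σβ13 | Σβ22 | Σβ23 | Σβ33 |
|---|---|---|---|---|---|---|
| univariate (y1) | 0.1508 | | | | | |
| univariate (y2) | | | | 0.0043 | | |
| univariate (y3) | | | | | | 0.0000 |
| bivariate (y1, y2) | 0.1523 | -0.0110 | | 0.0047 | | |
| bivariate (y1, y3) | 0.1526 | | -0.0191 | | | 0.0024 |
| bivariate (y2, y3) | | | | 0.0061 | 0.0015 | 0.0004 |
| trivariate (y1, y2, y3) | 0.1538 | -0.0116 | -0.0195 | 0.0059 | 0.0024 | 0.0027 |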
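To read the trivariate rows of Table 3 as matrices, the six reported entries can be placed into symmetric 3 × 3 arrays. The sketch below does this and prints the eigenvalues and implied correlations; the helper names `sym3` and `implied_correlations` are hypothetical. Because the published entries are rounded to four decimal places, the assembled matrices may appear marginally non-positive-semi-definite, or imply correlations slightly outside [-1, 1]; that would be a rounding artefact of this illustration rather than a statement about the estimates themselves.

```python
import numpy as np


def sym3(v11, v12, v13, v22, v23, v33):
    """Assemble a symmetric 3x3 matrix from its upper-triangular entries."""
    return np.array([
        [v11, v12, v13],
        [v12, v22, v23],
        [v13, v23, v33],
    ])


def implied_correlations(cov):
    """Convert a covariance matrix with positive diagonal to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Trivariate rows of Table 3 (rounded values as published).
sigma_omega = sym3(0.0027, 0.0066, 0.0143, 0.0161, 0.0349, 0.0756)   # inconsistency
sigma_beta = sym3(0.1538, -0.0116, -0.0195, 0.0059, 0.0024, 0.0027)  # heterogeneity

for name, cov in [("Sigma_omega", sigma_omega), ("Sigma_beta", sigma_beta)]:
    print(name)
    print("  eigenvalues:", np.linalg.eigvalsh(cov))
    print("  implied correlations:")
    print(np.round(implied_correlations(cov), 3))
```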

The multivariate analysis adds to the univariate analyses in two main ways. Firstly, the good agreement between the multivariate and univariate analyses is particularly important for the treatment effects on MRI, where a substantial proportion of data were missing. Kirkham et al. (2012) demonstrated that a multivariate approach to meta-analysis can help obtain more accurate estimates in the presence of outcome reporting bias. Hence the multivariate analysis reduces concerns that the univariate analysis of MRI is affected by reporting bias. Secondly, joint inferences for all three outcomes are possible under the multivariate model. For example, and as we might anticipate, the estimated log annualised relapse rate ratios and log disability progression odds ratios are highly positively correlated; from Var(δ̂) in our three-dimensional multivariate meta-analysis, the correlations between the five pairs of estimated basic parameters for these two outcomes are all between 0.63 and 0.75. Medical decision making based jointly on these two outcomes should take this high positive correlation into account, and this is only possible by using a multivariate approach. For example, a formal decision analysis involving these two outcomes should be based on their joint distribution rather than their two marginal distributions. In the supplementary materials we perform a simulation study to further explore how the proposed methodology performs.
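A small numerical illustration shows why the correlation matters for joint inference. The sketch below computes a joint Wald statistic for the relapse rate and disability progression effects of glatiramer versus placebo (AD), using the trivariate estimates from Table 2. The correlation of 0.7 is a hypothetical value chosen from within the reported range of 0.63 to 0.75, since the correlation for each individual comparison is not reported here; the function name `joint_wald` and the use of a Wald statistic are also choices made for this illustration, not part of the paper's analysis.

```python
import numpy as np


def joint_wald(estimates, standard_errors, rho):
    """Wald statistic d' V^{-1} d for two estimates with correlation rho."""
    d = np.asarray(estimates, dtype=float)
    se = np.asarray(standard_errors, dtype=float)
    corr = np.array([[1.0, rho], [rho, 1.0]])
    cov = corr * np.outer(se, se)
    return float(d @ np.linalg.solve(cov, d))


# AD column, trivariate rows of Table 2: relapse rate (y2) and disability progression (y3).
est = [-0.33, -0.43]
se = [0.12, 0.25]

w_indep = joint_wald(est, se, rho=0.0)  # treats the two estimates as independent
w_corr = joint_wald(est, se, rho=0.7)   # hypothetical correlation within the reported range

print(f"ignoring correlation: {w_indep:.2f}")  # 10.52
print(f"with rho = 0.7:       {w_corr:.2f}")   # 7.64
```

Both effects point in the same direction, so with a positive correlation they carry partly overlapping information, and the joint statistic is smaller than the value obtained by treating the two marginal results as independent.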

6. Discussion

We have proposed a new model for dealing with both multiple treatment contrasts and multiple outcomes, to provide a framework for conducting multivariate network meta-analysis. By using a matrix-based method of moments estimator, our methodology naturally builds on previous work (such as the well-known DerSimonian and Laird approach) and is computationally very fast relative to other potential estimation approaches such as REML or MCMC. This is especially the case in very high dimensions, and so our methodology is particularly advantageous for ambitious analyses of this type. The main disadvantage is that, as a necessary consequence of its semi-parametric nature, the method of moments is not based on sufficient statistics and so is not fully efficient. The loss in efficiency relative to maximum likelihood estimation awaits investigation, but we anticipate that this will be less serious for inferences about the average effects than for the unknown variance components. Furthermore, the within-study normal approximations used in our model are not necessarily very accurate even in moderately sized studies.
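For orientation, the univariate DerSimonian and Laird estimator that this line of work builds on can be written in a few lines. The sketch below is the standard univariate, two-treatment special case only; it is not the matrix-based estimator proposed in this paper, and the input data are hypothetical.

```python
import numpy as np


def dersimonian_laird(y, v):
    """Univariate DerSimonian-Laird random-effects meta-analysis.

    y : study effect estimates
    v : within-study variances (treated as known)
    Returns (tau2, pooled estimate, standard error).
    """
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    w = 1.0 / v                                    # common-effect weights
    mu_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - mu_fixed) ** 2)            # Cochran's Q statistic
    k = len(y)
    denom = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (k - 1)) / denom)         # moment estimate, truncated at zero
    w_star = 1.0 / (v + tau2)                      # random-effects weights
    mu = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return tau2, mu, se


# Hypothetical data: three studies with equal within-study variances.
tau2, mu, se = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"tau^2 = {tau2:.3f}, pooled estimate = {mu:.3f}, se = {se:.3f}")
```

The estimator equates Q to its expectation and solves for the between-study variance, which requires no iteration; this is the property that makes moment-based approaches fast.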

Since our analysis uses a general design matrix, the modelling may easily be extended by adding study level covariates to describe and fit multivariate network meta-regressions. In the network meta-analysis setting these regressions have the potential to explain the reasons for inconsistency and model multiple dose level responses. Our method of moments estimation can be combined with approaches that ‘inflate’ confidence intervals from a frequentist random effects meta-analysis (Hartung and Knapp, 2001; Jackson and Riley, 2014).

In conclusion, we have developed a new model and estimation method for multivariate network meta-analysis, which can describe multiple treatments and multiple correlated outcomes. An R function is available in the web supplementary materials that implements the proposed methodology.

Acknowledgements

DJ, IRW and ML are (or were) employed by the UK Medical Research Council [Unit Programme number U105260558]. SB was supported by the Medical Research Council (MRC) Methodology Research Programme [New Investigator Research Grant MR/L009854/1].

Footnotes

This paper has been submitted for consideration for publication in Biometrics

Supplementary Materials

The supplementary materials provide additional information and computing code.

References

  1. Achana FA, Cooper NJ, Bujkiewicz S, Hubbard SJ, Kendrick D, Jones DR, Sutton AJ. Network meta-analysis of multiple outcome measures accounting for borrowing of information across outcomes. BMC Medical Research Methodology. 2014;14:92. doi: 10.1186/1471-2288-14-92.
  2. Bujkiewicz S, Thompson JR, Riley RD, Abrams KR. Bayesian meta-analytical methods to incorporate multiple surrogate endpoints in drug development process. Statistics in Medicine. 2016;35:1063–1089. doi: 10.1002/sim.6776.
  3. Chen H, Manning AK, Dupuis J. A method of moments estimator for random effect multivariate meta-analysis. Biometrics. 2012;68:1278–1284. doi: 10.1111/j.1541-0420.2012.01761.x.
  4. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials. 1986;7:177–188. doi: 10.1016/0197-2456(86)90046-2.
  5. Efthimiou O, Mavridis D, Cipriani A, Leucht S, Bagos P, Salanti G. An approach for modelling multiple correlated outcomes in a network of interventions using odds ratios. Statistics in Medicine. 2014;33:2275–2287. doi: 10.1002/sim.6117.
  6. Efthimiou O, Mavridis D, Riley RD, Cipriani A, Salanti G. Joint synthesis of multiple correlated outcomes in networks of interventions. Biostatistics. 2015;16:84–97. doi: 10.1093/biostatistics/kxu030.
  7. Hartung J, Knapp G. On tests of the overall treatment effect in meta-analysis with normally distributed responses. Statistics in Medicine. 2001;20:1771–1782. doi: 10.1002/sim.791.
  8. Henderson HV, Searle SR. The vec-permutation matrix, the vec operator and Kronecker products: a review. Linear and Multilinear Algebra. 1981;9:271–288.
  9. Higgins JPT, Jackson D, Barrett JK, Lu G, Ades AE, White IR. Consistency and inconsistency in network meta-analysis: concepts and models for multi-arm studies. Research Synthesis Methods. 2012;3:98–110. doi: 10.1002/jrsm.1044.
  10. Hong H, Fu H, Price KL, Carlin BP. Incorporation of individual-patient data in network meta-analysis for multiple continuous endpoints, with application to diabetes treatment. Statistics in Medicine. 2015;34:2794–2819. doi: 10.1002/sim.6519.
  11. Hong H, Chu H, Zhang J, Carlin BP. A Bayesian missing data framework for generalized multiple outcome mixed treatment comparisons. Research Synthesis Methods. 2016;7:6–22. doi: 10.1002/jrsm.1153.
  12. Jackson D, Riley R, White IR. Multivariate meta-analysis: potential and promise (with discussion). Statistics in Medicine. 2011;30:2481–2510. doi: 10.1002/sim.4172.
  13. Jackson D, White IR, Riley RD. A matrix-based method of moments for fitting the multivariate random effects model for meta-analysis and meta-regression. Biometrical Journal. 2013;55:231–245. doi: 10.1002/bimj.201200152.
  14. Jackson D, Riley R. A refined method for multivariate meta-analysis and meta-regression. Statistics in Medicine. 2014;33:541–554. doi: 10.1002/sim.5957.
  15. Jackson D, Law M, Barrett JK, Turner R, Higgins JPT, Salanti G, White IR. Extending DerSimonian and Laird’s methodology to perform network meta-analyses with random inconsistency effects. Statistics in Medicine. 2016;35:819–839. doi: 10.1002/sim.6752.
  16. Kirkham JJ, Riley RD, Williamson PR. A multivariate meta-analysis approach for reducing the impact of outcome reporting bias in systematic reviews. Statistics in Medicine. 2012;31:2179–2195. doi: 10.1002/sim.5356.
  17. Kulinskaya E, Dollinger MB, Bjørkestøl K. Testing for Homogeneity in Meta-Analysis I. The One-Parameter Case: Standardized Mean Difference. Biometrics. 2011;67:203–212. doi: 10.1111/j.1541-0420.2010.01442.x.
  18. Law M, Jackson D, Turner R, Rhodes K, Viechtbauer W. Two new methods to fit models for network meta-analysis with random inconsistency effects. BMC Medical Research Methodology. 2016;16:87. doi: 10.1186/s12874-016-0184-5.
  19. Lu G, Ades A. Combination of direct and indirect evidence in mixed treatment comparisons. Statistics in Medicine. 2004;23:3105–3124. doi: 10.1002/sim.1875.
  20. Lu G, Ades A. Assessing evidence consistency in mixed treatment comparisons. Journal of the American Statistical Association. 2006;101:447–459.
  21. Nikolakopoulou A, Chaimani A, Veroniki A, Vasiliadis HS, Schmid CH, Salanti G. Characteristics of Networks of Interventions: A Description of a Database of 186 Published Networks. PLoS ONE. 2014;9(1):e86754. doi: 10.1371/journal.pone.0086754.
  22. Piepho HP, Williams ER, Madden LV. The Use of Two-Way Linear Mixed Models in Multitreatment Meta-Analysis. Biometrics. 2012;68:1269–1277. doi: 10.1111/j.1541-0420.2012.01786.x.
  23. Riley RD, Thompson JR, Abrams KR. An alternative model for bivariate random-effects meta-analysis when the within-study correlations are unknown. Biostatistics. 2008;9:172–186. doi: 10.1093/biostatistics/kxm023.
  24. Riley RD, Price MJ, Jackson D, Wardle M, Gueyffier F, Wang J, Staessen JA, White IR. Multivariate meta-analysis using individual participant data. Research Synthesis Methods. 2015;6:157–174. doi: 10.1002/jrsm.1129.
  25. Sormani MP, Bonzano L, Roccatagliata L, Mancardi GL, Uccelli A, Bruzzi P. Surrogate endpoints for EDSS worsening in multiple sclerosis. A meta-analytic approach. Neurology. 2010;75:302–309. doi: 10.1212/WNL.0b013e3181ea15aa.
  26. Seaman S, Galati J, Jackson D, Carlin J. What Is Meant by “Missing at Random?”. Statistical Science. 2013;28:257–268.
  27. Searle SR. Linear Models. Wiley; New York: 1971.
  28. Veroniki A, Vasiliadis HS, Higgins JP, Salanti G. Evaluation of inconsistency in networks of interventions. International Journal of Epidemiology. 2013;42:332–345. doi: 10.1093/ije/dys222.
  29. Wei Y, Higgins JPT. Estimating within-study covariances in multivariate meta-analysis with multiple outcomes. Statistics in Medicine. 2013;32:1191–1205. doi: 10.1002/sim.5679.
