Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach

Yong Chen; Chuan Hong; Yang Ning; Xiao Su

doi:10.1002/sim.6620

Author manuscript; available in PMC: 2018 Jan 30.

Published in final edited form as: Stat Med. 2015 Aug 24;35(1):21–40. doi: 10.1002/sim.6620

PMCID: PMC5789784 NIHMSID: NIHMS935891 PMID: 26303591

Abstract

When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise because both the within-study correlation and the between-study heterogeneity must be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with existing models such as the bivariate generalized linear mixed model (Chu and Cole, 2006) and the Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities on the original scale, requiring neither a transformation of the probabilities nor a link function, having a closed-form expression for the likelihood function, and placing no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is based only on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of the bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and that the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecification. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration.

Keywords: Bivariate beta-binomial model, Composite likelihood, Marginal model, Meta-analysis, Sarmanov family

1. Introduction

Meta-analysis is a statistical procedure for synthesizing the available evidence from multiple studies. Recently, the rapid growth of evidence-based medicine has led to increased attention to statistical methods for meta-analysis [1, 2]. In many applications, meta-analysis involves multivariate binary outcomes, such as diagnostic test results in diagnostic accuracy studies, and the exposure status of both cases and controls in case-control studies. When the covariates reduce to a single binary indicator (e.g., true disease status), the data can be summarized by multiple 2 × 2 tables. Inference on comparative measures between two probabilities based on 2 × 2 tables has been investigated by many statisticians. Two important features of such data have to be taken into consideration, namely, the within-study correlation and the between-study heterogeneity [3]. For example, diagnostic accuracy studies are often based on 2 × 2 tables cross-tabulating the binary test results with disease status. Because different thresholds may be used to define positive and negative test results, study-specific sensitivity and specificity are often negatively correlated [4]. In addition, there is often substantial heterogeneity in test accuracy between studies due to differences in study population characteristics, variability of assessment, and other factors [5, 6, 7]. Another example is the meta-analysis of case-control studies, where cases and controls in the same study are likely to share some common, but possibly unmeasured, factors [8]. The probabilities of exposure in cases and controls are therefore likely to be correlated. In our recent investigation of the association between the N-acetyltransferase 2 (NAT2) gene (binary exposure) and colorectal cancer (disease status), the twenty studies in the meta-analysis were conducted at very different locations, including Australia, Japan, Spain, the UK and the USA. Consequently, the environment and genetic background of different studies can be very different, leading to heterogeneity across studies. On the other hand, people in the same study are likely to share similar environmental factors or similar ancestors. A strong correlation between the probabilities of exposure in cases and controls was found; for more details, we refer to Chen et al. (2012) [8].

Due to the within-study correlation and between-study heterogeneity, statistical methods for meta-analyses need to be designed to account for these characteristics. For concreteness of our discussion, we consider meta-analyses of diagnostic accuracy studies, although the discussed method is equally applicable to any meta-analysis of studies with bivariate binary outcomes. Two categories of methods are commonly used in practice. The first includes methods based on summary receiver operating characteristic (SROC) curve generated from the studies. The standard SROC methods [9, 10], although easy to implement, have limitations; for more details, see Rutter and Gatsonis (2011) [6], Walter (2002) [11] and Arends et al. (2008) [12]. Recently, a hierarchical summary receiver operating characteristic (HSROC) model has been proposed by Rutter and Gatsonis (2011) [6] to account for the limitations of SROC method. The second category consists of methods based on bivariate mixed-effects models for modeling sensitivity and specificity simultaneously [12, 13, 14, 15, 16]. Among the bivariate models, the bivariate generalized linear mixed effects model (BGLMM) is commonly used due to its good model interpretation and accurate coverage property [4]. The BGLMM can be specified in two stages. At the first stage, the numbers of true positives and true negatives given the study-specific sensitivity and specificity are assumed to follow independent binomial distributions; at the second stage, the study-specific sensitivity and specificity, after some transformation (e.g., logit), are assumed to follow a bivariate normal distribution. This BGLMM model avoids the arbitrary continuity corrections for “zero cell” which is needed for bivariate general linear mixed effects model [4], and has been shown to have small bias and good coverage probabilities [16, 17]. For a more complete overview of recent methods, please refer to the excellent review paper by Ma et al. [18].

Despite the advantages of BGLMM, several practical issues in the inference of BGLMM have been reported in the literature [17, 19]. The first is a nonconvergence or singular covariance matrix problem [17]. Such problems are caused mainly by the maximum likelihood estimate of the correlation being close to ±1, and are even more severe when the number of studies is small or moderate. Inferences on parameter estimates and their standard errors may be misleading or invalid under nonconvergence or singular covariance matrix. The second practical issue is the choice of transformation on the study-specific sensitivity and specificity. Different choices of transformation may lead to different conclusions, and hence the interpretation of the results from BGLMM is transformation-dependent [20]. The third practical issue is the computational difficulty caused by a double-integral in the likelihood function. Although modern computational methods such as Laplace and adaptive Gaussian quadrature approximation are implemented in software like GLIMMIX and NLMIXED in SAS (SAS Institute Inc., Cary, NC) [21] and ADMB (Automatic Differentiation Model Builder) [22], these approximations may still have non-negligible approximation errors. These computational errors often result in unstable or unreproducible estimates (e.g., results sensitive to initial values) [17].

To address these issues of BGLMM, several authors have recently proposed the use of Sarmanov beta-binomial model which directly models the bivariate probabilities (e.g., study-specific sensitivities and specificities) without any transformation [8, 20]. Such model has a similar two-stage specification except that the bivariate study-specific probabilities are modeled directly using the Sarmanov bivariate beta distribution [23, 24]. There are three advantages of such model due to the use of the Sarmanov bivariate beta distribution. First, as a property of Sarmanov distributions, the bivariate distribution is fully determined by specifying the marginal distributions and the correlation parameter; second, it is pseudo-conjugate for binomial distribution, leading to a closed-form expression of the likelihood function; and third, the interpretation of model parameters does not depend on any transformation.

However, the Sarmanov beta-binomial model still suffers from the non-convergence or singular covariance matrix problem when the estimated correlation parameter reaches the boundary of its range. Standard inference may be invalid in such situations [25, 26]. More importantly, as pointed out by several authors [8, 27, 28, 29], only a restricted range of values is allowed for the correlation parameter. Hence such a model only allows a small level of correlation, and is not suitable for applications such as diagnostic accuracy studies, where a large negative correlation between sensitivity and specificity is possible [29]. Very recently, Kuss et al. (2013) [29] proposed a novel model using bivariate copulas. Their model uses beta-binomial distributions for the marginal numbers of true positives and true negatives, and links these margins by a bivariate copula distribution. This model not only retains all the advantages of the Sarmanov beta-binomial model (it is easy to specify, has a closed-form likelihood function and requires no transformation), but also allows greater flexibility for the correlation parameter. In this paper, we propose a marginal beta-binomial model approach which shares the spirit of Kuss et al. [29] in retaining the advantages of the Sarmanov beta-binomial model. Our proposed marginal beta-binomial model also avoids the limitation of the restricted range for the correlation parameter. The idea is to use a working independence assumption between the study-specific probabilities (e.g., sensitivity and specificity). Under the working independence assumption, the constructed likelihood belongs to the family of composite likelihoods [30, 31] and hence enjoys the established properties of composite likelihood [30, 32, 33]. Composite likelihoods have been widely used in applications such as longitudinal data analysis and multivariate survival data analysis to account for the correlations between observations [33, 34, 35, 36, 37, 38]. However, to the best of our knowledge, this is the first application of composite likelihood to meta-analyses of studies with bivariate binary outcomes. There is also a close relation between the proposed method and the bivariate copula model [29], in that both methods make the same marginal distribution assumption. The key difference between the two methods is that our method does not need to specify any copula distribution, and hence avoids the potential bias due to misspecification of the copula.

This article is organized as follows. In Section 2, we describe the Sarmanov beta-binomial model and the proposed method. In Section 3, we conduct simulation studies to evaluate the finite sample performance of the proposed method in a variety of settings, where the bias, coverage probability and relative efficiency are investigated. We apply the proposed method in Section 4 to a systematic review of diagnostic accuracy studies for early detection of melanoma metastasis, a meta-analysis of the association between N-acetyltransferase 2 acetylation status and colorectal cancer, and a meta-analysis for the diagnostic values of tumor markers in detecting primary bladder cancer. We summarize the result and discuss the future work in Section 5.

2. Statistical Methodology

2.1. Notations and bivariate generalized linear mixed effects models

For concreteness, we consider a meta-analysis of diagnostic accuracy studies, while acknowledging that other types of meta-analyses with binary outcomes, such as meta-analyses of case-control studies, also fall under this framework. We assume that the meta-analysis consists of m independent studies. For the ith study, let n_i11, n_i22, n_i21, n_i12, n_i1 and n_i2 be the numbers of true positives, true negatives, false positives, false negatives, diseased subjects and healthy subjects, respectively, for i = 1, …, m. The sensitivity and specificity in the ith study are denoted by Se_i and Sp_i, respectively.

The bivariate generalized linear mixed effects model (BGLMM) is commonly used in meta-analysis of diagnostic accuracy studies to account for the heterogeneity between studies and the correlation between Se_i and Sp_i [15]. The BGLMM can be specified in two stages. At the first stage, the numbers of true positives and true negatives (n_i11, n_i22) given the study-specific sensitivity and specificity (Se_i, Sp_i) are assumed to be independent and to follow binomial distributions,

$$(n_{i11}, n_{i22}) \mid (n_{i1}, n_{i2}, Se_i, Sp_i) \sim \text{Binomial}(n_{i11} \mid n_{i1}, Se_i) \times \text{Binomial}(n_{i22} \mid n_{i2}, Sp_i),$$

(1)

where n_i1 = n_i11 + n_i12 and n_i2 = n_i21 + n_i22. This conditional independence assumption is reasonable because n_i11 and n_i22 are summarized from subjects of distinct groups (i.e., diseased subjects and healthy subjects). The parameters of interest are usually comparative measures of the pooled probabilities. For example, in meta-analysis of diagnostic accuracy studies, the parameters of interest can be the pooled sensitivity, specificity and diagnostic odds ratio (dOR).

At the second stage, the study-specific sensitivity Se_i and specificity Sp_i, after some transformation (e.g., logit), are assumed to follow a random effects model,

$$g(Se_i) = \beta_1 + \mu_{i1}, \quad g(Sp_i) = \beta_2 + \mu_{i2}$$

and

$$(\mu_{i1}, \mu_{i2})^T \sim N(0, \Sigma),$$

where g(·) is a known link function such as the logit function, (β₁, β₂) are the fixed effects, $\Sigma = \begin{pmatrix} \tau_1^2 & \rho \tau_1 \tau_2 \\ \rho \tau_1 \tau_2 & \tau_2^2 \end{pmatrix}$ is the covariance matrix, $\tau_1^2$ and $\tau_2^2$ capture the between-study heterogeneity in sensitivities and specificities, respectively, and ρ denotes the correlation between the random effects (μ_i1, μ_i2), that is, between Se_i and Sp_i on the transformed scale.

The log likelihood function of the BGLMM is

$$\log L(\beta_1, \tau_1^2, \beta_2, \tau_2^2, \rho) = \sum_{i=1}^{m} \log \Pr(n_{i11}, n_{i22} \mid n_{i1}, n_{i2}) = \sum_{i=1}^{m} \log \int\!\!\int \text{Binomial}(n_{i11} \mid n_{i1}; Se_i) \times \text{Binomial}(n_{i22} \mid n_{i2}; Sp_i)\, \phi(Se_i, Sp_i; \beta_1, \tau_1^2, \beta_2, \tau_2^2, \rho)\, dSe_i\, dSp_i,$$

where $\phi(\cdot, \cdot; \beta_1, \tau_1^2, \beta_2, \tau_2^2, \rho)$ is the density of the bivariate logit-normal distribution indexed by $(\beta_1, \tau_1^2, \beta_2, \tau_2^2, \rho)$ and Binomial(· | ·; ·) is the binomial probability mass function.

2.2. Sarmanov beta-binomial model

Now we introduce the Sarmanov beta-binomial model [8]. To allow for heterogeneity between studies and correlation between study-specific sensitivity and specificity, a joint distribution of the study-specific Se_i and Sp_i can be specified using Sarmanov distribution family [23, 24],

$$(Se_i, Sp_i) \overset{\text{i.i.d.}}{\sim} g(Se_i, Sp_i; a_1, b_1, a_2, b_2, \rho)$$

(2)

where

$$g(Se_i, Sp_i; a_1, b_1, a_2, b_2, \rho) = \text{Beta}(Se_i; a_1, b_1)\, \text{Beta}(Sp_i; a_2, b_2) \left\{1 + \rho\, \frac{(Se_i - \mu_1)(Sp_i - \mu_2)}{\delta_1 \delta_2}\right\},$$

Beta(p; a, b) is the density function of the beta distribution, defined as $\text{Beta}(p; a, b) = \{B(a, b)\}^{-1} p^{a-1} (1 - p)^{b-1}$, B(a, b) is the beta function defined as $\int_0^1 t^{a-1} (1 - t)^{b-1}\, dt$, $\mu_j = a_j/(a_j + b_j)$ and $\delta_j^2 = \mu_j (1 - \mu_j)/(a_j + b_j + 1)$ for j = 1, 2.

The assumptions (1) and (2) lead to the Sarmanov beta-binomial model. Specifically, the marginal distribution of (n_i11, n_i22) is calculated as

$$\Pr(n_{i11}, n_{i22} \mid a_1, b_1, a_2, b_2, \rho) = \int\!\!\int \Pr(n_{i11}, n_{i22} \mid Se_i, Sp_i)\, g(Se_i, Sp_i; a_1, b_1, a_2, b_2, \rho)\, dSe_i\, dSp_i = P_{BB}(n_{i11}; n_{i1}, a_1, b_1)\, P_{BB}(n_{i22}; n_{i2}, a_2, b_2) \left\{1 + \frac{\rho}{\delta_1 \delta_2}\, \frac{(n_{i11} - n_{i1} \mu_1)}{(a_1 + b_1 + n_{i1})}\, \frac{(n_{i22} - n_{i2} \mu_2)}{(a_2 + b_2 + n_{i2})}\right\},$$

where P_BB(n_ijj; n_ij, a_j, b_j) is the probability mass function of a beta-binomial distribution, i.e.,

$$P_{BB}(n_{ijj}; n_{ij}, a_j, b_j) = \binom{n_{ij}}{n_{ijj}} \frac{B(n_{ijj} + a_j,\; n_{ij} - n_{ijj} + b_j)}{B(a_j, b_j)} \quad \text{for } j = 1, 2.$$
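As a minimal sketch of the beta-binomial probability mass function above, the following Python code evaluates it on the log scale through the log-gamma function. The function names are illustrative choices of this sketch and do not come from the paper's software.

```python
from math import exp, lgamma


def log_beta(a, b):
    """log B(a, b), computed through log-gamma for numerical stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_logpmf(k, n, a, b):
    """log P_BB(k; n, a, b) = log C(n, k) + log B(k + a, n - k + b) - log B(a, b)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def beta_binomial_pmf(k, n, a, b):
    """P_BB(k; n, a, b) on the natural scale."""
    return exp(beta_binomial_logpmf(k, n, a, b))
```

Working on the log scale avoids overflow in the beta functions when the study sizes are large; with a = b = 1 the mass function reduces to the discrete uniform distribution on {0, …, n}, which gives a quick sanity check.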

The log marginalized likelihood function for the unknown parameters (a₁, b₁, a₂, b₂, ρ) is

$$\log L(a_1, b_1, a_2, b_2, \rho) = \sum_{i=1}^{m} \log \left[ P_{BB}(n_{i11}; n_{i1}, a_1, b_1)\, P_{BB}(n_{i22}; n_{i2}, a_2, b_2) \left\{1 + \frac{\rho}{\delta_1 \delta_2}\, \frac{(n_{i11} - n_{i1} \mu_1)(n_{i22} - n_{i2} \mu_2)}{(a_1 + b_1 + n_{i1})(a_2 + b_2 + n_{i2})}\right\} \right].$$

(3)

The expression in equation (3) was derived in [27]. There are several attractive features of the Sarmanov beta-binomial model. Firstly, the model specification and interpretation are simple because the bivariate joint distribution g(Se_i, Sp_i; a₁, b₁, a₂, b₂, ρ) is fully specified by the univariate marginal distributions and the correlation parameter. Secondly, the log marginalized likelihood function log L(a₁, b₁, a₂, b₂, ρ) has a closed-form expression, which avoids numerical approximation of integrals. Thirdly, the correlation parameter ρ can account for both positive and negative correlations. However, as acknowledged by several authors [8, 29], an important limitation of the Sarmanov beta-binomial model is that the correlation parameter ρ is restricted to an interval narrower than [−1, 1], which can be quite restrictive in modeling and may lead to a biased estimate of the standard error when the correlation parameter estimate reaches its boundary. For example, with a₁ = 3.11, a₂ = 3.94, b₁ = 2.91, b₂ = 3.36, the point estimates obtained in the example of Section 4.2, the admissible range of ρ is [−0.12, 0.12], which can be too restrictive to model meaningful correlation. More importantly, when the correlation estimate reaches its boundary, the Wald-type confidence interval may have unsatisfactory coverage [25, 26]. Another limitation of the Sarmanov beta-binomial model when applied to meta-analysis of diagnostic accuracy studies, as pointed out by an anonymous reviewer, is that this model implies a linear relationship between sensitivity and specificity; see also page 625 of Chu et al. (2012) [20]. Therefore, the Sarmanov beta-binomial model is not appropriate for meta-analysis of diagnostic accuracy studies.
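The restricted range of ρ can be checked numerically. The sketch below is our own derivation rather than a formula quoted from the cited references: it uses the fact that the Sarmanov density in Section 2.2 is non-negative on the unit square only if the factor 1 + ρ(Se − μ₁)(Sp − μ₂)/(δ₁δ₂) is non-negative at the four corners. The function names are hypothetical.

```python
from math import sqrt


def beta_mean_sd(a, b):
    """Mean mu = a/(a+b) and standard deviation delta of a Beta(a, b) variable."""
    mu = a / (a + b)
    delta = sqrt(mu * (1.0 - mu) / (a + b + 1.0))
    return mu, delta


def sarmanov_rho_range(a1, b1, a2, b2):
    """Interval of rho for which the Sarmanov bivariate beta density is non-negative.

    The factor 1 + rho (x - mu1)(y - mu2) / (delta1 delta2) is bilinear in (x, y),
    so its minimum over [0, 1]^2 is attained at a corner of the unit square.
    """
    mu1, d1 = beta_mean_sd(a1, b1)
    mu2, d2 = beta_mean_sd(a2, b2)
    scale = d1 * d2
    # Negative rho: binding corners are (0, 0) and (1, 1).
    lower = -scale / max(mu1 * mu2, (1.0 - mu1) * (1.0 - mu2))
    # Positive rho: binding corners are (1, 0) and (0, 1).
    upper = scale / max(mu1 * (1.0 - mu2), (1.0 - mu1) * mu2)
    return lower, upper


if __name__ == "__main__":
    print(sarmanov_rho_range(3.11, 2.91, 3.94, 3.36))
```

For the parameter values quoted above, the bounds come out close to the interval reported in the text; small differences in the second decimal place are to be expected because the inputs are rounded.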

2.3. Bivariate Copula model

Recently, Kuss et al. (2013) [29] proposed a novel model using bivariate copulas. Their model uses beta-binomial distributions for the marginal numbers of true positives and true negatives, and links these margins by a bivariate copula distribution. Specifically, Kuss et al. applied the concept of copulas to account for the correlation between the true positives and true negatives (i.e., n_i11 and n_i22). Here we rephrase the model proposed by Kuss et al. Let X₁ and X₂ be two random variables with marginal distribution functions F(x₁) and F(x₂). It has been shown by Sklar (1996) [39] that there exists a function C with

$$H(x_1, x_2) = C(F(x_1), F(x_2)) = C(u_1, u_2),$$

where H(x₁, x₂) is the joint distribution function of the random variables X₁ and X₂, and C(u₁, u₂) is the distribution function of a pair of uniform random variables. The function C is called a ‘copula’ if it satisfies three unrestrictive conditions [40].

We describe two Archimedean copulas (Clayton and Frank) here. The bivariate Clayton copula is given by

$$C_C(u_1, u_2; \theta_C) = \left\{\max\left(u_1^{-\theta_C} + u_2^{-\theta_C} - 1,\; 0\right)\right\}^{-1/\theta_C}, \quad \theta_C \in [-1, +\infty) \setminus \{0\}.$$

The Frank copula is given by

$$C_F(u_1, u_2; \theta_F) = -\theta_F^{-1} \log \left\{1 + \frac{(e^{-\theta_F u_1} - 1)(e^{-\theta_F u_2} - 1)}{e^{-\theta_F} - 1}\right\}.$$
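The two copula distribution functions can be written down directly. The following sketch uses the standard forms of the Clayton and Frank copulas; the function names and the handling of zero arguments are assumptions of the sketch, not part of the model of Kuss et al.

```python
from math import expm1, log1p


def clayton_copula(u1, u2, theta):
    """Clayton copula C(u1, u2) for theta in [-1, inf) excluding 0."""
    if theta == 0 or theta < -1:
        raise ValueError("theta must lie in [-1, inf) and be non-zero")
    if u1 == 0 or u2 == 0:
        return 0.0  # a copula vanishes when either argument is zero
    base = max(u1 ** (-theta) + u2 ** (-theta) - 1.0, 0.0)
    # base can reach 0 only for negative theta, where the exponent is positive.
    return base ** (-1.0 / theta)


def frank_copula(u1, u2, theta):
    """Frank copula C(u1, u2) for non-zero theta."""
    if theta == 0:
        raise ValueError("theta must be non-zero")
    # expm1 and log1p keep precision when theta is close to zero.
    ratio = expm1(-theta * u1) * expm1(-theta * u2) / expm1(-theta)
    return -log1p(ratio) / theta
```

A quick check of any candidate copula is the uniform-margin property C(u, 1) = u and C(1, v) = v, which both functions satisfy; at θ_C = −1 the Clayton copula reduces to max(u₁ + u₂ − 1, 0), the lower Fréchet bound.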

This model not only retains all the advantages of the Sarmanov beta-binomial model (it is easy to specify, has a closed-form likelihood function and requires no transformation), but also allows greater flexibility for the correlation parameter.

2.4. Marginal beta-binomial approach

Despite the attractive features of the Sarmanov beta-binomial model and the model by Kuss et al., the intrinsic constraint on the correlation parameter is a limitation of such models. More importantly, the specification of the joint distribution of study-specific sensitivity and specificity may be difficult to validate. For illustration, Figure 1 presents contour plots of the bivariate distributions induced by the Clayton copula, the Frank copula and the Sarmanov model. Although all three bivariate distributions share the same marginal beta distributions and the same Pearson correlation, the joint distributions still differ. As we will show by simulation studies, such differences lead to biased inference when the Sarmanov beta-binomial model is mistakenly applied.

Figure 1. Contour plots of the bivariate distributions induced by the Clayton copula, the Frank copula and the Sarmanov model. The contour curves represent curves along which the density function has the same value. The number on a contour curve represents the value of the density function on that curve. The parameters a₁, a₂, b₁ and b₂ in the marginal models are set to 0.5. The parameters θ for the Clayton copula and the Frank copula and the parameter ρ for the Sarmanov model are set to 0.55, 2 and 0.3, respectively, so that the Pearson correlations equal 0.3 in all three subplots.

We propose an improved modeling approach to overcome the limitations of the Sarmanov beta-binomial model (i.e., the constraint on the correlation parameter and the vulnerability to a misspecified joint distribution) while retaining all its attractive features. The idea is to use the working independence assumption. Specifically, setting ρ = 0 in log L(a₁, b₁, a₂, b₂, ρ), we obtain the following pseudolikelihood

$$\log L_p(a_1, b_1, a_2, b_2) = \log L_1(a_1, b_1) + \log L_2(a_2, b_2),$$

(4)

where

$$\log L_1(a_1, b_1) = \sum_{i=1}^{m} \log \int P_{Bin}(n_{i11}; n_{i1}, Se_i)\, \text{Beta}(Se_i; a_1, b_1)\, dSe_i = \sum_{i=1}^{m} \log P_{BB}(n_{i11}; n_{i1}, a_1, b_1),$$

$$\log L_2(a_2, b_2) = \sum_{i=1}^{m} \log \int P_{Bin}(n_{i22}; n_{i2}, Sp_i)\, \text{Beta}(Sp_i; a_2, b_2)\, dSp_i = \sum_{i=1}^{m} \log P_{BB}(n_{i22}; n_{i2}, a_2, b_2).$$

It is important to note that the construction of the pseudolikelihood log L_p(a₁, b₁, a₂, b₂) does not require the joint distribution of study-specific sensitivity and specificity. Rather, only their marginal distribution assumption (e.g., beta distribution) is required. Therefore, such model can be regarded as a marginal beta-binomial model.
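To make the structure of the pseudolikelihood concrete, the sketch below evaluates log L_p as the sum of two univariate beta-binomial log likelihoods. The data layout (one tuple (n_i11, n_i1, n_i22, n_i2) per study), the function names and the example counts are hypothetical and chosen only for illustration.

```python
from math import lgamma


def bb_logpmf(k, n, a, b):
    """log of the beta-binomial mass function P_BB(k; n, a, b)."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + lgamma(k + a) + lgamma(n - k + b) - lgamma(n + a + b)
            - lgamma(a) - lgamma(b) + lgamma(a + b))


def log_l1(a1, b1, studies):
    """Marginal log likelihood for sensitivity: true positives out of diseased."""
    return sum(bb_logpmf(n11, n1, a1, b1) for n11, n1, _, _ in studies)


def log_l2(a2, b2, studies):
    """Marginal log likelihood for specificity: true negatives out of healthy."""
    return sum(bb_logpmf(n22, n2, a2, b2) for _, _, n22, n2 in studies)


def log_pseudolikelihood(a1, b1, a2, b2, studies):
    """log L_p = log L_1 + log L_2 under working independence."""
    return log_l1(a1, b1, studies) + log_l2(a2, b2, studies)


# Hypothetical counts (n_i11, n_i1, n_i22, n_i2), not taken from the paper.
example_studies = [(18, 20, 45, 50), (9, 15, 28, 40), (25, 30, 60, 61), (12, 12, 33, 47)]

if __name__ == "__main__":
    print(log_pseudolikelihood(3.0, 1.0, 4.0, 1.5, example_studies))
```

Because the two terms share no parameters, maximizing log L_p over (a₁, b₁, a₂, b₂) amounts to two separate two-parameter maximizations, one per margin.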

Let θ_j = (a_j, b_j), for j = 1, 2. Note that the pseudolikelihood L_p(θ₁, θ₂) is not a true likelihood function unless the correlation parameter ρ is truly zero. Nevertheless, since both L₁(θ₁) and L₂(θ₂) are legitimate likelihood functions based on the marginal beta-binomial distributions, the score equation of L_p(θ₁, θ₂) is an unbiased estimating equation for (θ₁, θ₂). In fact, the proposed pseudolikelihood belongs to the well-studied family of composite likelihoods [41, 30, 42, 31, 43, 44], in which conditional or marginal densities are multiplied whether or not they are independent. Composite likelihoods have been widely used in the context of longitudinal studies [33, 34, 37, 38, 45], analysis of panel data [46, 47, 48], spatial modelling [49, 50, 51, 52], missing data [53, 54, 55, 56], bioinformatics [57], genetic association studies [58], and sparse Ising model learning [59], among many others. In particular, when a working independence assumption is adopted, the pseudolikelihood is sometimes called the independence likelihood [60]. For more discussion of composite likelihood methods, we refer to the excellent review paper [43] and the references therein. The major motivation for using composite likelihood in many applications is to replace the high-dimensional integration involved in full likelihoods with lower-dimensional integration [43]. In our proposed method, however, the major motivation is to overcome the constraint on the correlation parameter and to alleviate the consequences of model misspecification.

Denote by θ̃ = (θ̃₁, θ̃₂) the maximum pseudolikelihood estimator, defined as a solution of the score equations ∂ log L₁(θ₁)/∂θ₁ = 0 and ∂ log L₂(θ₂)/∂θ₂ = 0. By a standard argument using Taylor expansion, we can show that √m(θ̃ − θ), where θ = (θ₁, θ₂) is the true parameter value, is asymptotically normal with mean zero and covariance matrix

$$\Sigma = \begin{pmatrix} I_{11}^{-1} & I_{11}^{-1} I_{12} I_{22}^{-1} \\ (I_{11}^{-1} I_{12} I_{22}^{-1})^T & I_{22}^{-1} \end{pmatrix},$$

where

$$I_{11} = \frac{1}{m} E\left\{-\frac{\partial^2 \log L_1(\theta_1)}{\partial \theta_1\, \partial \theta_1^T}\right\}, \quad I_{12} = \frac{1}{m} E\left[\left\{\frac{\partial \log L_1(\theta_1)}{\partial \theta_1}\right\} \left\{\frac{\partial \log L_2(\theta_2)}{\partial \theta_2}\right\}^T\right] \quad \text{and} \quad I_{22} = \frac{1}{m} E\left\{-\frac{\partial^2 \log L_2(\theta_2)}{\partial \theta_2\, \partial \theta_2^T}\right\}.$$

The general asymptotic results for composite likelihood have been provided by Lindsay (1988) [30], Kent (1982) [32] and Molenberghs and Verbeke (2005) [33]. For interested readers, an outline of the derivation of the asymptotic covariance is provided in the Appendix.

In meta-analysis of diagnostic accuracy studies, the overall sensitivity, specificity and diagnostic odds ratio (dOR) are μ₁ = a₁/(a₁ + b₁), μ₂ = a₂/(a₂ + b₂) and {μ₁/(1 − μ₁)} × {μ₂/(1 − μ₂)} = (a₁a₂)/(b₁b₂), respectively, where the second factor arises because the odds of a positive test among healthy subjects are (1 − μ₂)/μ₂. These comparative measures can be estimated by functions of θ̃₁ and θ̃₂, and their variances can be estimated by the delta method.

The information matrices I₁₁, I₂₂ and I₁₂ can be empirically estimated as

$$\hat{I}_{11} = \frac{1}{m} \sum_{i=1}^{m} U_{i1}(\tilde{\theta}_1)\, U_{i1}(\tilde{\theta}_1)^T, \quad \hat{I}_{12} = \frac{1}{m} \sum_{i=1}^{m} U_{i1}(\tilde{\theta}_1)\, U_{i2}(\tilde{\theta}_2)^T \quad \text{and} \quad \hat{I}_{22} = \frac{1}{m} \sum_{i=1}^{m} U_{i2}(\tilde{\theta}_2)\, U_{i2}(\tilde{\theta}_2)^T,$$

where the study-specific score function U_ij(θ̃_j) is defined as

$$U_{ij}(\tilde{\theta}_j) = \left.\frac{\partial \log P_{BB}(n_{ijj}; n_{ij}, a_j, b_j)}{\partial \theta_j}\right|_{\theta_j = \tilde{\theta}_j}$$

for j = 1, 2 and i = 1, …, m. The proposed method is simple to implement in practice. Specifically, the point estimate θ̃_j is the same as that from a univariate meta-analysis of (n_ijj, n_ij) using a beta-binomial model, which is available in most statistical software. The covariance matrix of (θ̃₁, θ̃₂) can be easily calculated using the above closed-form formulas. We implement our method in R (R Development Core Team, Version 2.14.1) using the betabin function in the aod package [61]. An R program to fit this model (with a working example) is attached in Appendix B. We note that the matrix $\hat{I}_{jj}^{-1}/m$ is the same as the covariance estimated from the univariate meta-analysis of (n_ijj, n_ij), j = 1, 2, whereas the matrix $\hat{I}_{11}^{-1} \hat{I}_{12} \hat{I}_{22}^{-1}/m$ accounts for the covariance between the estimated parameters θ̃₁ and θ̃₂. Such covariance cannot be accounted for if investigators conduct separate univariate meta-analyses on (n_ijj, n_ij).
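The covariance formulas can be assembled in a few lines. The sketch below is not the authors' R program: it is a hypothetical Python illustration in which the study-specific scores are approximated by central finite differences (an assumption made to avoid a digamma routine), the data layout is one tuple (n_i11, n_i1, n_i22, n_i2) per study, and all function names are our own.

```python
from math import lgamma


def bb_logpmf(k, n, a, b):
    """log of the beta-binomial mass function P_BB(k; n, a, b)."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + lgamma(k + a) + lgamma(n - k + b) - lgamma(n + a + b)
            - lgamma(a) - lgamma(b) + lgamma(a + b))


def score(k, n, a, b, h=1e-5):
    """Score U = d log P_BB / d(a, b), approximated by central differences."""
    da = (bb_logpmf(k, n, a + h, b) - bb_logpmf(k, n, a - h, b)) / (2 * h)
    db = (bb_logpmf(k, n, a, b + h) - bb_logpmf(k, n, a, b - h)) / (2 * h)
    return (da, db)


def outer_mean(us, vs):
    """(1/m) sum_i u_i v_i^T for lists of 2-vectors."""
    m = len(us)
    return [[sum(u[r] * v[c] for u, v in zip(us, vs)) / m for c in range(2)]
            for r in range(2)]


def inv2(mat):
    """Inverse of a 2 x 2 matrix."""
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    return [[mat[1][1] / det, -mat[0][1] / det],
            [-mat[1][0] / det, mat[0][0] / det]]


def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def composite_covariance(studies, theta1, theta2):
    """Estimated 4 x 4 covariance of (a1, b1, a2, b2): Sigma-hat divided by m."""
    m = len(studies)
    u1 = [score(n11, n1, *theta1) for n11, n1, _, _ in studies]
    u2 = [score(n22, n2, *theta2) for _, _, n22, n2 in studies]
    i11_inv = inv2(outer_mean(u1, u1))
    i22_inv = inv2(outer_mean(u2, u2))
    cross = matmul2(matmul2(i11_inv, outer_mean(u1, u2)), i22_inv)
    top = [i11_inv[r] + cross[r] for r in range(2)]
    bottom = [[cross[0][r], cross[1][r]] + i22_inv[r] for r in range(2)]
    return [[x / m for x in row] for row in top + bottom]
```

In practice theta1 and theta2 would be the maximum pseudolikelihood estimates from the two univariate fits; the off-diagonal 2 × 2 blocks of the returned matrix are the part that separate univariate analyses do not provide.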

It is easy to see that the proposed method overcomes the limitations of the Sarmanov beta-binomial model while maintaining its attractive features. Specifically, the correlation parameter ρ is not involved in the construction of the pseudolikelihood, so the potential boundary problem is resolved. In addition, the method is not based on the joint distribution, so misspecification of the joint distribution is not a practical concern. We note that log L_j(θ_j) is simply the log likelihood of the beta-binomial model for the data from the jth group. The point and variance estimates of the proposed model are the same as those from two univariate models when only θ₁ (or θ₂) is considered. When functions of both θ₁ and θ₂ (e.g., the dOR) are considered, the covariance between the estimates of θ₁ and θ₂ must be taken into account. The pseudolikelihood method can correctly estimate this covariance, which cannot be estimated by two separate univariate beta-binomial models.
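The role of the cross-covariance is easiest to see in a delta-method calculation. The sketch below makes two assumptions that are not taken from the paper: the diagnostic odds ratio is used in its usual form Se·Sp/{(1 − Se)(1 − Sp)}, which under the beta margins equals (a₁a₂)/(b₁b₂), and the confidence interval is built on the log scale. The function name and the inputs are hypothetical.

```python
from math import exp, log, sqrt


def dor_delta_method(theta, cov, z=1.959964):
    """Point estimate, log-scale standard error and 95% interval for the dOR.

    theta: (a1, b1, a2, b2); cov: 4 x 4 covariance matrix of the estimates.
    Assumes dOR = (a1 * a2) / (b1 * b2), i.e. odds(Se) * odds(Sp).
    """
    a1, b1, a2, b2 = theta
    log_dor = log(a1) + log(a2) - log(b1) - log(b2)
    # Gradient of log dOR with respect to (a1, b1, a2, b2).
    grad = [1.0 / a1, -1.0 / b1, 1.0 / a2, -1.0 / b2]
    var = sum(grad[r] * cov[r][c] * grad[c] for r in range(4) for c in range(4))
    se = sqrt(var)
    return exp(log_dor), se, (exp(log_dor - z * se), exp(log_dor + z * se))
```

Setting the off-diagonal 2 × 2 blocks of cov to zero reproduces what two separate univariate analyses would give; any non-zero cross-covariance changes the standard error, which is the situation described in the paragraph above.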

2.5. Regression extension

The marginal beta-binomial model can be extended to a regression setting. In some applications, study-level covariates are available, such as the study quality score, the race of the study population and the percentage of females. It is important to incorporate these study-level covariates in order to study covariate effects and to reduce the between-study heterogeneity. We assume that the study-specific sensitivity and specificity follow the distributions

$$Se_i \mid (\phi_1, \mu_{i1}) \propto Se_i^{\mu_{i1} (1/\phi_1 - 1) - 1} (1 - Se_i)^{(1 - \mu_{i1}) (1/\phi_1 - 1) - 1},$$

$$Sp_i \mid (\phi_2, \mu_{i2}) \propto Sp_i^{\mu_{i2} (1/\phi_2 - 1) - 1} (1 - Sp_i)^{(1 - \mu_{i2}) (1/\phi_2 - 1) - 1},$$

where μ_ij is the mean parameter and ϕ_j is the dispersion parameter, for j = 1, 2, so that

$$E[Se_i \mid \phi_1, \mu_{i1}] = \mu_{i1}, \quad \mathrm{var}(Se_i \mid \phi_1, \mu_{i1}) = \sigma_{i1}^2 = \phi_1 \mu_{i1} (1 - \mu_{i1});$$

$$E[Sp_i \mid \phi_2, \mu_{i2}] = \mu_{i2}, \quad \mathrm{var}(Sp_i \mid \phi_2, \mu_{i2}) = \sigma_{i2}^2 = \phi_2 \mu_{i2} (1 - \mu_{i2}).$$

The mean of each beta distribution is a function of covariates,

$$\mu_{ij} = h^{-1}(X_{ij} \eta_j), \quad \text{for } j = 1, 2,$$

where h(·) is a known link function, X_ij is the study-level covariate for study i and group j, and (η₁, η₂, ϕ₁, ϕ₂) are unknown parameters. The marginal distributions of the study-specific sensitivity and specificity are, respectively, a beta distribution with parameters (μ_i1(1/ϕ₁ − 1), (1 − μ_i1)(1/ϕ₁ − 1)) and a beta distribution with parameters (μ_i2(1/ϕ₂ − 1), (1 − μ_i2)(1/ϕ₂ − 1)); with these parameters the sum of the two shape parameters is 1/ϕ_j − 1, which yields the variance ϕ_j μ_ij(1 − μ_ij) stated above. The log likelihood for (η₁, η₂, ϕ₁, ϕ₂) can be defined by replacing (a₁, b₁, a₂, b₂) by (μ_i1(1/ϕ₁ − 1), (1 − μ_i1)(1/ϕ₁ − 1), μ_i2(1/ϕ₂ − 1), (1 − μ_i2)(1/ϕ₂ − 1)). An argument similar to that in Section 2.4 can be applied to obtain the point estimate of (η₁, η₂, ϕ₁, ϕ₂) and its covariance estimate.
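The mapping from the mean and dispersion parameters to beta shape parameters can be sketched as follows. The sketch takes the parameterization implied by the stated variance ϕμ(1 − μ), namely a = μ(1/ϕ − 1) and b = (1 − μ)(1/ϕ − 1), and it assumes a logit link for h, which the text leaves unspecified; function names are hypothetical.

```python
from math import exp


def inv_logit(x):
    """Inverse of the logit link (an assumed choice for h)."""
    return 1.0 / (1.0 + exp(-x))


def beta_shapes(mu, phi):
    """Shape parameters (a, b) of a beta law with mean mu and variance phi*mu*(1-mu)."""
    if not (0.0 < mu < 1.0 and 0.0 < phi < 1.0):
        raise ValueError("mu and phi must lie strictly between 0 and 1")
    precision = 1.0 / phi - 1.0  # equals a + b
    return mu * precision, (1.0 - mu) * precision


def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    s = a + b
    return a / s, a * b / (s * s * (s + 1.0))


def study_shapes(x, eta, phi):
    """Shapes for one study with covariate vector x and coefficient vector eta."""
    mu = inv_logit(sum(xi * ei for xi, ei in zip(x, eta)))
    return beta_shapes(mu, phi)
```

Plugging the study-specific shapes into the beta-binomial mass function, study by study, gives the regression version of each marginal log likelihood.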

3. Simulation Study

In this section, we conduct simulation studies to evaluate the finite sample performance of the proposed marginal beta-binomial model, and compare with the BGLMM and the Sarmanov beta-binomial models. We follow a three-step procedure to generate the data. Firstly, the number of studies are sampled from the real data set in section 4.1 [62]; secondly, the study-specific sensitivity and specificity in logarithm are generated from joint distributions; thirdly, given study-specific sensitivity and specificity, the number of true positives and the number of true negatives are generated from independent binomial distributions. Alternatively, continuous measurements can be simulated first, then the sensitivity and specificity can be generated by picking a threshold. As acknowledged by [63], the above two data generation procedures are shown to be equivalent. We consider five different joint distributions of study-specific sensitivity and specificity to investigate the robustness and efficiency of the proposed method. Specifically, in the first setting, we sample study-specific sensitivity and specificity from Sarmanov bivariate beta distribution using reject sampling method. Details of the sampling method can be found in Chen et al.(2011) [8]. The parameters for the beta distributions are set at a₁ = 3.11, b₁ = 2.91, a₂ = 3.94, b₂ = 3.36 to mimic the real meta-analysis data in Section 4.2, and the correlation parameter is set at values between −0.12 and 0. In the second and third settings, we use the same marginal beta distributions as that in the first setting, except that study-specific sensitivity and specificity are sampled from the Clayton Copula model or Frank Copula model using the “rmvdc” function in the Copula package in R [64, 65]. In this case, the correlation parameter is set at values between −1 and 0. 
We note that the Sarmanov beta-binomial model is misspecified under the second and third settings and the BGLMM is misspecified under all three settings, whereas the marginal beta-binomial model is not misspecified. The next two settings are designed to evaluate the robustness of both the Sarmanov beta-binomial and marginal beta-binomial methods. In the fourth setting, we generate study-specific sensitivity and specificity, after a logit transformation, from a bivariate normal distribution (i.e., the BGLMM). Since the marginal distributions are not beta distributions, both the Sarmanov beta-binomial model and the marginal beta-binomial model are misspecified, whereas the BGLMM is not. In the fifth setting, we change the bivariate normal distribution to a bivariate t-distribution with 4 degrees of freedom, which represents heavy tails in logit sensitivity and specificity. In the fifth setting, all three models are misspecified. The number of studies is set at 10, 20 and 40 to represent relatively small, moderate and large meta-analyses. For each simulation setting, we generate 5000 datasets and apply the three methods to obtain inference for the diagnostic odds ratio (dOR). The estimated bias (denoted as “Bias”), empirical standard error (denoted as “SE”), model-based standard error (the average of the standard error estimates of dOR, denoted as “MBSE”), coverage probability of 95% confidence intervals (denoted as “CP”), and the relative efficiency of the dOR estimates (defined as the ratio of the empirical variance of dOR estimates based on the Sarmanov beta-binomial model to that of estimates based on the marginal beta-binomial model, labeled as “RE”) under the five settings of joint distributions for study-specific sensitivity and specificity are summarized in Tables 1 to 5, respectively.
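The performance measures defined above can be computed from the replicate-level output of a simulation along the following lines. This is a sketch under stated assumptions: the function names `summarize_sim` and `relative_efficiency` are hypothetical, and the confidence intervals are assumed to be Wald intervals (estimate ± z × standard error).

```r
# est: vector of point estimates across replicates; se_hat: their estimated
# standard errors; truth: the true value of the estimand. All names are illustrative.
summarize_sim <- function(est, se_hat, truth, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  lower <- est - z * se_hat
  upper <- est + z * se_hat
  c(Bias = mean(est) - truth,                       # estimated bias
    SE   = sd(est),                                 # empirical standard error
    MBSE = mean(se_hat),                            # model-based standard error
    CP   = mean(lower <= truth & truth <= upper))   # coverage probability
}

# RE: empirical variance of the reference estimator (here the Sarmanov model)
# divided by that of the comparison estimator (the marginal model).
relative_efficiency <- function(est_ref, est_new) var(est_ref) / var(est_new)

# Toy check with a well-calibrated estimator (hypothetical inputs).
set.seed(2)
est <- rnorm(5000, mean = 1, sd = 0.2)
out <- summarize_sim(est, se_hat = rep(0.2, 5000), truth = 1)
round(out, 3)
```

An MBSE well below the empirical SE is the pattern that produces under-coverage in the tables.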

Table 1.

Estimates of the bias, true standard error (SE), model based standard error (MBSE), coverage probability (CP), and relative efficiency (RE) in 5000 simulations based on BGLMM, Sarmanov beta-binomial model and marginal beta-binomial model, with different number of studies, for different values of correlation ρ between study-specific sensitivity and specificity. The data are generated from Sarmanov beta-binomial model with parameters (a₁, b₁, a₂, b₂) = (3.11, 2.91, 3.94, 3.36). All entries are multiplied by 100.

		BGLMM				Sarmanov beta-binomial Model				Marginal beta-binomial model

m	rho	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP
10	−0.12	26.0	36.8	35.3	83.6	−0.9	35.4	31.4	90.5	0.3	35.0	32.4	91.3
	−0.09	25.9	37.3	36.0	84.4	0.6	35.1	31.4	91.3	−0.6	35.5	32.0	91.1
	−0.06	26.4	37.7	36.5	83.6	0.7	34.1	31.3	91.9	0.9	34.5	31.5	90.7
	−0.03	25.1	39.1	36.7	84.4	0.4	34.0	31.1	91.7	0.2	33.3	31.0	91.4
	0.00	26.1	38.8	37.4	84.4	−0.4	33.4	30.9	91.9	0.4	33.7	30.8	91.2
20	−0.12	25.1	26.0	25.4	80.9	−0.3	25.3	23.3	92.5	0.2	25.2	24.0	92.8
	−0.09	25.7	26.4	25.7	80.5	−0.4	25.0	23.2	92.4	0.1	24.4	23.7	93.1
	−0.06	25.9	26.6	26.1	80.4	−0.3	24.6	23.1	93.1	0.0	24.1	23.3	93.0
	−0.03	25.8	26.7	26.5	82.1	−0.1	23.6	22.9	93.5	0.5	24.0	23.0	93.0
	0.00	25.7	27.9	26.9	81.5	−0.6	24.0	22.8	93.0	0.5	24.2	22.8	92.3
40	−0.12	26.1	18.5	18.2	68.8	−0.2	17.7	16.9	93.4	0.2	17.6	17.3	94.4
	−0.09	26.0	18.6	18.5	70.3	−0.4	17.4	16.7	93.8	0.1	17.2	17.1	94.1
	−0.06	26.1	19.1	18.7	69.9	−0.2	17.2	16.6	93.9	0.1	17.3	16.8	93.9
	−0.03	26.4	19.0	19.0	70.8	−0.2	17.2	16.5	93.5	0.1	16.9	16.6	93.9
	0.00	26.1	19.2	19.3	71.9	−0.3	16.7	16.4	94.0	0.1	16.7	16.4	93.9


Table 4.

Estimates of the bias, true standard error (SE), model based standard error (MBSE), coverage probability (CP), and relative efficiency (RE) in 5000 simulations based on BGLMM, Sarmanov beta-binomial model and marginal beta-binomial model, with different number of studies, for different values of correlation ρ between study-specific sensitivity and specificity. The data are generated from bivariate logit normal distribution with parameters (a₁, b₁, a₂, b₂) = (0.5, 0.5, 0.5, 0.5). All entries are multiplied by 100.

		BGLMM				Sarmanov beta-binomial Model				Marginal beta-binomial model

m	rho	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP
10	−0.9	0.1	13.7	15.9	92.5	0.7	40.2	28.4	81.6	0.7	40.1	36.7	93.1
	−0.7	0.6	19.7	19.9	90.9	−0.4	37.5	28.4	84.3	−0.4	37.4	34.8	93.1
	−0.5	0.3	24.5	23.3	90.6	0.2	35.9	28.5	86.3	0.2	35.9	32.9	92.5
	−0.3	−0.6	28.0	26.7	90.2	−0.3	33.4	28.1	89.1	−0.3	33.4	30.6	92.4
	−0.1	0.4	31.3	30.1	90.8	−0.1	31.0	27.6	90.5	−0.1	31.1	28.4	91.6
	0.0	−0.3	33.0	31.6	90.2	1.0	29.9	27.4	91.8	1.0	30.1	27.0	91.1
20	−0.9	−0.1	9.4	11.1	93.8	0.4	28.7	21.0	83.3	0.5	28.6	27.1	94.6
	−0.7	0.4	13.8	13.6	93.5	−0.2	26.8	20.9	86.7	−0.2	26.7	25.6	94.5
	−0.5	0.0	17.2	16.8	92.4	0.0	24.9	21.0	89.3	0.0	24.9	24.2	94.9
	−0.3	0.1	19.7	19.3	93.1	0.0	23.5	20.7	91.0	0.0	23.4	22.5	93.9
	−0.1	0.1	22.2	21.6	92.7	−0.1	21.8	20.4	92.9	−0.1	21.9	21.0	94.3
	0.0	−0.6	23.1	22.6	93.2	−0.1	20.5	20.1	94.4	−0.1	20.7	20.0	94.5
40	−0.9	0.0	6.9	6.8	93.5	−0.2	20.4	15.1	84.7	−0.2	20.3	19.5	96.8
	−0.7	0.2	9.9	9.7	93.7	−0.4	18.9	15.1	88.1	−0.4	18.9	18.5	97.2
	−0.5	−0.1	12.2	12.0	94.5	0.1	17.8	15.1	90.3	0.1	17.7	17.4	96.5
	−0.3	0.0	14.2	13.8	93.5	−0.4	16.5	15.0	92.0	−0.4	16.5	16.3	96.0
	−0.1	0.0	15.7	15.5	93.8	−0.7	15.5	14.7	93.4	−0.7	15.5	15.1	96.1
	0.0	−0.3	16.6	16.2	93.7	0.6	14.4	14.5	94.9	0.6	14.5	14.5	96.7


Table 1 summarizes the results when study-specific sensitivity and specificity are generated from the Sarmanov bivariate beta distribution. When the number of studies is relatively small (i.e., m = 10), the BGLMM method produces estimates with large bias and confidence intervals with poor CPs (range of CP: 68.8% to 84.4%). Both the Sarmanov beta-binomial model and the proposed marginal beta-binomial model produce estimates with small bias and confidence intervals with satisfactory CPs. As the number of studies increases, the performance of both the Sarmanov beta-binomial model and the marginal beta-binomial model improves, with CPs closer to the nominal level. We note that the marginal beta-binomial model performs as well as the Sarmanov beta-binomial model when the latter is the true model, and the relative efficiency is close to 100%, suggesting that the efficiency loss is negligible. On the other hand, as the number of studies increases, the bias of the estimates from the BGLMM remains large and the CP deteriorates rapidly. Table 2 summarizes the results when study-specific sensitivity and specificity are generated from the bivariate Clayton copula with beta marginals. As in setting 1, the BGLMM method produces large biases and poor CPs (range of CP: 8.4% to 84.6%). The CP deteriorates quickly as the number of studies increases and the correlation becomes larger. The Sarmanov beta-binomial model produces estimates with slightly larger bias than those from the marginal beta-binomial model. More importantly, the MBSE from the Sarmanov beta-binomial model is substantially smaller than the true SE, except when the correlation ρ is close to 0. This leads to poor coverage of the confidence intervals produced by the Sarmanov beta-binomial model (range of CP: 82.0% to 91.8% for ρ ≤ −0.3). In contrast, the estimates from the marginal beta-binomial model have small bias, well-estimated MBSE and good coverage probabilities (range of CP: 90.3% to 94.3%) under all settings considered.
Table 3 shows results similar to those in Table 2 when the study-specific sensitivity and specificity are generated from the bivariate Frank copula with beta marginals. The results from Table 2 and Table 3 suggest that the BGLMM method performs poorly under distributions with beta marginals and that the Sarmanov beta-binomial model may lead to inappropriate inference when the joint distribution is actually a copula distribution, whereas the marginal beta-binomial model is robust under copula specifications.
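The underestimation of the standard error can be made concrete by taking the ratio MBSE/SE from the rows of Table 2 with ρ = −0.9. The short R sketch below only re-enters those published table entries; the data frame and column names are illustrative.

```r
# SE and MBSE entries copied from Table 2 (Clayton copula), rows with rho = -0.9.
tab2 <- data.frame(
  m             = c(10, 20, 40),
  sarmanov_SE   = c(46.3, 32.4, 23.1),
  sarmanov_MBSE = c(33.1, 24.2, 17.4),
  marginal_SE   = c(46.7, 32.0, 22.3),
  marginal_MBSE = c(42.7, 31.0, 22.3)
)

# A ratio near 1 means the model-based standard error tracks the empirical one.
tab2$sarmanov_ratio <- round(tab2$sarmanov_MBSE / tab2$sarmanov_SE, 2)
tab2$marginal_ratio <- round(tab2$marginal_MBSE / tab2$marginal_SE, 2)
print(tab2[, c("m", "sarmanov_ratio", "marginal_ratio")])
```

For these rows the Sarmanov ratios stay below 0.8 while the marginal ratios are above 0.9, which is consistent with the difference in coverage probabilities.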

Table 2.

Estimates of the bias, true standard error (SE), model based standard error (MBSE), coverage probability (CP), and relative efficiency (RE) in 5000 simulations based on BGLMM, Sarmanov beta-binomial model and marginal beta-binomial model, with different number of studies, for different values of correlation ρ between study-specific sensitivity and specificity. The data are generated from Clayton copula model with parameters (a₁, b₁, a₂, b₂) = (3.11, 2.91, 3.94, 3.36). All entries are multiplied by 100.

		BGLMM				Sarmanov beta-binomial Model				Marginal beta-binomial model

m	rho	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP
10	−0.9	26.5	16.1	18.2	57.8	0.1	46.3	33.1	82.0	−0.7	46.7	42.7	92.7
	−0.7	26.4	23.0	22.4	74.0	1.3	43.8	33.1	84.6	0.0	44.3	40.2	92.0
	−0.5	26.1	28.5	27.3	79.4	0.8	41.0	32.9	87.0	−0.4	41.6	38.9	92.1
	−0.3	25.3	32.8	31.8	83.7	0.3	38.8	32.3	88.7	−0.2	38.9	35.5	91.6
	−0.1	26.3	37.1	35.2	83.1	0.2	35.6	31.8	91.1	0.1	35.6	32.7	91.3
	0.0	26.0	38.5	37.2	84.6	−0.1	34.3	31.4	91.9	0.6	34.5	35.7	90.3
20	−0.9	25.9	10.9	10.6	29.5	0.7	32.4	24.2	83.9	−0.5	32.0	31.0	93.3
	−0.7	25.7	16.1	15.8	60.6	0.3	30.6	24.0	86.9	−0.4	30.5	29.3	93.1
	−0.5	25.9	20.5	19.6	71.6	0.6	29.4	24.1	88.7	−0.2	29.0	27.5	92.3
	−0.3	25.7	23.1	22.6	76.6	0.2	27.7	23.8	90.4	0.3	27.0	25.8	93.0
	−0.1	25.3	26.2	25.4	79.9	0.1	24.6	23.3	93.4	−0.3	25.1	23.9	93.0
	0.0	25.8	27.6	26.8	81.3	0.2	24.0	22.9	92.9	0.3	23.8	22.9	92.9
40	−0.9	25.8	7.8	7.7	8.4	0.7	23.1	17.4	85.7	−0.3	22.3	22.3	94.0
	−0.7	25.8	11.5	11.4	36.9	0.7	21.9	17.3	87.7	0.0	21.6	21.1	94.2
	−0.5	25.6	14.4	14.1	55.6	0.5	20.0	17.3	90.9	0.0	20.4	19.8	93.7
	−0.3	26.2	16.4	16.2	61.9	0.3	19.4	17.3	91.8	0.3	18.8	18.6	94.3
	−0.1	25.8	18.1	18.2	69.5	0.6	17.3	16.8	93.8	0.1	17.9	17.3	93.6
	0.0	25.9	19.3	19.2	71.7	0.1	17.0	16.4	93.8	0.6	16.9	16.5	94.1


Table 3.

Estimates of the bias, true standard error (SE), model based standard error (MBSE), coverage probability (CP), and relative efficiency (RE) in 5000 simulations based on BGLMM, Sarmanov beta-binomial model and marginal beta-binomial model, with different number of studies, for different values of correlation ρ between study-specific sensitivity and specificity. The data are generated from Frank copula model with parameters (a₁, b₁, a₂, b₂) = (3.11, 2.91, 3.94, 3.36). All entries are multiplied by 100.

		BGLMM				Sarmanov beta-binomial Model				Marginal beta-binomial model

m	rho	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP
10	−0.9	25.7	16.7	18.5	60.8	0.1	45.9	33.1	82.0	−0.1	45.7	43.1	92.9
	−0.7	25.7	22.9	22.4	74.0	1.9	42.6	32.9	85.3	1.7	42.5	41.7	92.8
	−0.5	25.7	29.3	28.0	79.1	−0.4	41.1	32.8	87.0	−0.5	41.1	41.5	91.7
	−0.3	26.7	32.2	32.0	81.7	0.9	39.5	32.4	87.5	0.8	39.5	35.5	90.5
	−0.1	25.8	37.7	35.7	83.8	−0.1	35.7	31.7	91.1	−0.1	35.9	33.7	90.4
	0.0	25.8	39.6	37.3	84.4	0.3	34.1	31.3	91.9	0.4	34.4	32.0	90.3
20	−0.9	25.9	11.2	11.2	33.4	0.5	32.1	24.2	85.2	0.0	32.0	30.9	94.2
	−0.7	25.4	16.3	15.9	59.9	−0.2	30.9	24.1	86.6	−0.5	30.8	29.4	94.1
	−0.5	25.4	20.6	20.0	71.7	0.5	28.9	24.1	89.2	0.3	28.9	27.6	92.8
	−0.3	25.8	23.1	22.3	75.9	−0.2	27.5	24.0	90.6	−0.3	27.5	26.2	93.3
	−0.1	25.0	26.6	25.7	81.1	0.0	25.2	23.3	92.9	0.0	25.3	23.8	91.9
	0.0	25.0	27.4	26.8	82.1	0.0	23.6	22.9	93.6	0.1	23.8	22.8	92.4
40	−0.9	25.8	7.9	7.9	10.0	0.1	22.8	17.4	86.1	−0.3	22.8	22.2	95.2
	−0.7	26.0	11.7	11.3	35.5	0.2	21.7	17.5	88.6	−0.1	21.7	21.2	94.9
	−0.5	25.8	14.5	14.3	55.0	0.1	20.5	17.4	90.4	−0.1	20.5	19.8	93.8
	−0.3	26.1	15.7	16.0	61.9	−0.5	19.3	17.4	91.8	−0.6	19.3	18.9	94.0
	−0.1	26.1	18.5	18.3	68.6	0.4	17.4	16.9	93.8	0.4	17.4	17.2	92.0
	0.0	26.2	19.4	19.2	71.2	0.0	16.7	16.5	94.0	0.1	16.8	16.5	92.5


Table 4 summarizes the results when study-specific sensitivity and specificity (on the logit scale) are generated from a bivariate normal distribution. The BGLMM is the true model and performs well, with small bias and CPs close to the nominal level (range of CP: 90.2% to 94.5%). Interestingly, both the Sarmanov and marginal beta-binomial models produce estimates with small bias. However, similar to its performance in setting 2, the Sarmanov beta-binomial model has satisfactory performance (i.e., accurate MBSE and good CP) only when the correlation ρ is close to 0 (range of CP: 89.1% to 94.9% for ρ = −0.3, −0.1 or 0). When the magnitude of the correlation ρ becomes larger, the performance deteriorates quickly, with substantially underestimated standard error and low coverage probability (range of CP: 81.6% to 89.9% for ρ ≤ −0.5). In contrast, the marginal beta-binomial model is robust in that the MBSE is close to the true SE and the coverage probability is close to the nominal level under all levels of correlation and numbers of studies (range of CP: 91.1% to 96.8%). Table 5 summarizes similar results when data are generated from a bivariate t-distribution with 4 degrees of freedom. Again, the BGLMM performs well with small biases and good coverage probabilities, and the marginal beta-binomial model performs as well as the BGLMM and better than the Sarmanov beta-binomial model, especially when the correlation is high. The above simulation results strongly suggest that, by avoiding the specification of a joint distribution for study-specific sensitivity and specificity, the marginal beta-binomial model gains substantial robustness to misspecification of both the joint and the marginal distributions.

Table 5.

Estimates of the bias, true standard error (SE), model based standard error (MBSE), coverage probability (CP), and relative efficiency (RE) in 5000 simulations based on BGLMM, Sarmanov beta-binomial model and marginal beta-binomial model, with different number of studies, for different values of correlation ρ between study-specific sensitivity and specificity. The data are generated from bivariate logit t-distribution with 4 degrees of freedom. All entries are multiplied by 100.

		BGLMM				Sarmanov beta-binomial Model				Marginal beta-binomial model

m	rho	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP	Bias	SE	MBSE	CP
10	−0.9	0.2	16.7	24.3	93.1	−2.2	51.1	35.0	81.0	−2.1	49.9	44.5	92.7
	−0.7	0.4	25.4	24.8	91.5	0.2	48.2	34.9	83.9	0.1	47.2	42.2	92.4
	−0.5	0.6	32.0	31.3	91.3	0.1	44.9	34.7	86.9	0.1	44.1	39.7	92.2
	−0.3	0.0	37.1	35.4	90.7	−0.5	41.8	34.2	88.7	−0.5	41.3	37.1	92.2
	−0.1	0.2	41.7	39.9	91.5	−0.5	37.8	33.3	91.6	−0.5	37.7	34.2	92.3
	0.0	−0.3	43.6	42.1	91.7	−0.1	37.0	32.9	91.5	−0.1	37.0	32.5	90.5
20	−0.9	−0.2	12.1	12.5	93.8	−0.2	35.6	25.7	84.6	−0.2	34.7	33.1	94.0
	−0.7	−0.3	18.1	17.6	92.9	−0.6	34.4	25.8	85.6	−0.7	33.7	31.4	93.3
	−0.5	0.1	22.4	21.9	92.8	0.1	31.4	25.5	88.5	0.1	30.8	29.3	93.9
	−0.3	0.7	26.0	25.5	93.7	−0.2	29.5	25.2	90.6	−0.2	29.1	27.4	93.9
	−0.1	−0.4	28.9	28.5	93.7	−0.5	27.0	24.6	93.3	−0.5	26.8	25.4	94.1
	0.0	0.4	30.6	30.0	93.1	−0.1	25.3	24.1	93.8	−0.1	25.3	24.2	93.7
40	−0.9	0.0	8.1	7.9	93.7	0.0	24.9	18.6	85.3	0.0	24.4	24.0	95.6
	−0.7	0.2	12.5	12.3	94.0	0.4	24.0	18.6	87.3	0.4	23.5	22.8	94.9
	−0.5	−0.2	15.5	15.4	93.3	−0.4	22.2	18.5	89.9	−0.4	21.8	21.4	95.4
	−0.3	−0.4	18.0	18.0	94.6	0.2	20.8	18.3	91.2	0.2	20.5	19.9	94.5
	−0.1	0.0	20.4	20.2	93.8	−0.1	18.9	17.8	94.0	−0.1	18.7	18.4	95.3
	0.0	−0.4	21.4	21.2	93.8	−0.1	18.3	17.4	94.0	−0.1	18.3	17.6	94.9


It is worth mentioning that fitting the Sarmanov beta-binomial model encounters non-convergence problems (i.e., the number of iterations reaches the maximum number of iterations, or the estimated covariance matrix is singular), with a non-convergence rate of up to 11.3%. In contrast, fitting the marginal beta-binomial model suffers substantially less from non-convergence, with a non-convergence rate below 1.5% across all settings. In summary, our simulation studies suggest that the marginal beta-binomial model performs as well as the Sarmanov beta-binomial model when the latter is the true model, performs satisfactorily compared to the BGLMM when the true model is the bivariate logit normal model or the bivariate logit t-model, and substantially outperforms the BGLMM and the Sarmanov beta-binomial model when the true model is a copula model. Hence, the marginal beta-binomial model can be more robust than the BGLMM and the Sarmanov beta-binomial model, and is recommended as a useful alternative for practical investigators.
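The two non-convergence criteria named above can be checked programmatically on the object returned by R's `optim()`. The sketch below is an illustration rather than the authors' code: `fit_status`, the tolerance, and the toy objective are hypothetical, while the convergence codes (0 for success, 1 when `maxit` is reached) follow the `optim()` documentation.

```r
# Flag a fit as non-converged if the iteration limit was hit or the Hessian
# (and hence the estimated covariance matrix) is numerically singular.
fit_status <- function(fit, tol = 1e-8) {
  hit_maxit <- fit$convergence == 1   # optim(): code 1 means maxit was reached
  singular <- is.null(fit$hessian) ||
    any(!is.finite(fit$hessian)) ||
    rcond(fit$hessian) < tol          # reciprocal condition number close to 0
  list(converged = fit$convergence == 0 && !singular,
       hit_maxit = hit_maxit,
       singular  = singular)
}

# Toy objective (hypothetical): a concave quadratic maximised at (1, 2).
toy <- function(p) -sum((p - c(1, 2))^2)
fit <- optim(c(0, 0), toy, method = "L-BFGS-B",
             control = list(fnscale = -1, maxit = 1000), hessian = TRUE)
status <- fit_status(fit)
str(status)
```

Averaging the complement of `converged` over simulation replicates would give a non-convergence rate of the kind reported above.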

4. Applications

We illustrate the use of the marginal beta-binomial model with three meta-analyses. The first is a systematic review of diagnostic accuracy studies for early detection of melanoma metastasis, the second is a meta-analysis of case-control studies of the association between N-acetyltransferase 2 acetylation status and colorectal cancer, and the third is a meta-analysis of the diagnostic value of tumor markers in detecting primary bladder cancer.

4.1. A systematic review of diagnostic accuracy studies for early detection of melanoma metastasis

Melanoma is the most deadly type of skin cancer and occurs in melanocytes (cells that produce the skin pigment melanin). If not found early, melanoma can become very dangerous by growing deeper into the skin and spreading to other parts of the body (metastasis). Although sentinel lymph node biopsy is considered the gold standard for the pathological staging of melanoma, diagnostic imaging is often used for early detection of melanoma metastasis, which provides a cost-effective surveillance approach. The most effective and cost-effective type of imaging and interval of testing have not been defined. The goal of surveillance imaging is to detect melanoma recurrence in regional lymph nodes and/or distant sites at a point when it remains treatable and/or possibly surgically resectable. Currently, the most commonly used diagnostic imaging modalities for the surveillance of melanoma patients include ultrasonography (US), computed tomography (CT), positron emission tomography (PET) and a combination of the latter two (PET-CT). It is critical to assess and compare the accuracy of these contemporary diagnostic imaging modalities in various clinical settings.

Xing et al. [62] conducted a systematic review of 98 published studies from 10,528 patients with melanoma between January 1, 1990 and June 30, 2009. A Bayesian version of the BGLMM was fitted to obtain summaries of the accuracy of four contemporary diagnostic imaging modalities, as measured by sensitivity, specificity and dOR. They found that among the four imaging methods, US had the highest sensitivity (60%, 95% confidence interval (CI) = 33% to 83%) and specificity (97%, 95% CI = 88% to 99%) for the surveillance of regional lymph node metastasis, and PET-CT had the highest sensitivity (80%, 95% CI = 53% to 93%) and specificity (87%, 95% CI = 54% to 97%) for the surveillance of distant lymph node metastasis. The posterior means, along with the credible intervals, for the accuracy measures of each diagnostic imaging modality and cancer stage (i.e., regional or distant) are displayed as the dashed segments in Figure 2. We note that the logit transformation is assumed when fitting the BGLMM. To avoid assuming an arbitrary transformation, we apply the proposed marginal beta-binomial model to the data, and the point estimates together with 95% confidence intervals for the accuracy measures are presented as solid segments in Figure 2. The results from fitting the BGLMM and the marginal beta-binomial model to these data are generally consistent, but there are some sizable differences in both point estimates and credible/confidence intervals. Here we highlight only some of them. Specifically, in contrast to the conclusion from the BGLMM, CT also has the highest sensitivity, in addition to US, for detecting regional lymph node metastasis, although the specificity of CT is lower than that of US. Another visible difference is that the confidence intervals produced by the marginal beta-binomial model are generally narrower than the credible intervals from fitting the BGLMM.
As pointed out by an anonymous reviewer, the Sarmanov beta-binomial model implies a linear relationship between sensitivity and specificity; see also page 625 of Chu et al. (2012) [20]. Therefore, the Sarmanov beta-binomial model is not appropriate for diagnostic tests. On the other hand, the marginal beta-binomial model only partially specifies the distribution and does not necessarily imply a linear SROC curve. Since the correlation structure is unspecified, no correlation parameter is estimated and hence no SROC curve is available. Therefore, if the primary goal is to estimate the SROC curve, fully parametric models such as the BGLMM and the copula models are preferred. If the goal is to estimate overall diagnostic measures (such as sensitivity, specificity and diagnostic odds ratio), the marginal beta-binomial model can make valid inference.

Figure 2.

Estimated overall sensitivities (left panel), specificities (middle panel) and diagnostic odds ratios (right panel) of identifying melanoma and 95% confidence intervals of four diagnostic imaging modalities for patients with different stages of melanoma (Regional and Distant). Solid segments: estimates from the marginal beta-binomial method; dashed segments: estimates from the BGLMM method.

4.2. A meta-analysis of the association between N-acetyltransferase 2 acetylation status and colorectal cancer

N-acetyltransferase 2 (NAT2) is an enzyme encoded by the NAT2 gene in humans. Although rapid NAT2 acetylation has been considered to increase colorectal cancer risk in many studies, the overall results of such studies are inconsistent. To study the overall impact of NAT2 rapid acetylation on colorectal cancer risk, Ye and Parry [66] conducted a meta-analysis of 20 case-control studies published from January 1985 to October 2001. These 20 studies include 4,471 colorectal cancer cases and 4,885 controls, among whom 2,361 and 2,238 subjects, respectively, had rapid NAT2 acetylator status. The data are summarized in Table C.1 in the Appendix. A strong positive correlation between the probabilities of exposure in cases and controls is found, with Pearson’s correlation, Spearman’s rank correlation, and Kendall’s tau equal to 0.87, 0.49, and 0.40, respectively, and all p-values less than 0.03 [8]. Chen et al. [8] analyzed these data using the Sarmanov beta-binomial model, and the overall odds ratio between rapid NAT2 acetylator status and colorectal cancer was estimated as 1.10 (95% CI: 0.70, 1.72) with a p-value of 0.67 using the Wald test. The parameters of the Sarmanov beta-binomial model (i.e., (a₁, b₁, a₂, b₂, ρ)) were estimated as (3.11, 2.91, 3.94, 3.36, 0.12). We note that the estimate of the correlation parameter ρ was in fact on the boundary of its parameter space. Thus, the 95% CI based on Wald inference may be biased or invalid [25, 26]. Alternatively, we fitted the marginal beta-binomial model, and the parameter estimates of the marginal beta-binomial distributions were similar to those from the Sarmanov beta-binomial model. Specifically, the parameters (a₁, b₁, a₂, b₂) were estimated as (3.01, 3.01, 3.99, 3.40), and the overall odds ratio was estimated as 1.13 (95% CI: 0.99, 1.31) with a p-value of 0.09.
The difference between results from these two methods is due to the fact that the point estimate of ρ reaches the boundary of its parameter space and the Wald inference using the Sarmanov beta-binomial model can be misleading. In this case, the results from the marginal beta-binomial model are more reliable.
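The relation between the reported confidence intervals and Wald p-values can be checked with a few lines of R. This is a rough sketch that assumes the intervals were constructed symmetrically on the log odds ratio scale (an assumption, since the construction is not spelled out here); the helper `wald_p_from_ci` is hypothetical.

```r
# Back out the standard error of log(OR) from a reported CI and recompute
# the two-sided Wald p-value for H0: OR = 1.
wald_p_from_ci <- function(or, lower, upper, level = 0.95) {
  z_crit <- qnorm(1 - (1 - level) / 2)
  se_log <- (log(upper) - log(lower)) / (2 * z_crit)  # SE implied by the CI width
  z <- log(or) / se_log
  2 * pnorm(-abs(z))
}

p_sarmanov <- wald_p_from_ci(1.10, 0.70, 1.72)  # Sarmanov beta-binomial model
p_marginal <- wald_p_from_ci(1.13, 0.99, 1.31)  # marginal beta-binomial model
round(c(sarmanov = p_sarmanov, marginal = p_marginal), 2)
```

Under this assumption the recomputed values are close to the reported p-values of 0.67 and 0.09, up to rounding of the published interval limits. The much narrower interval from the marginal model corresponds to an implied standard error of log(OR) roughly one third of that from the Sarmanov fit.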

4.3. A meta-analysis for the diagnostic values of tumor markers in detecting primary bladder cancer

Bladder cancer is the most common urological cancer. Cytology has been used as a classical marker for detecting bladder cancer since 1945, and is inexpensive compared to the cystoscopy procedure. However, it lacks diagnostic sensitivity. Studies have been conducted to develop easy-to-use urinary markers with better sensitivity. Tumor markers such as bladder tumor antigen (BTA), BTA stat, BTA TRAK, NMP22 and telomerase have been evaluated and compared to the classical tumor marker cytology. Glas et al. (2003) [67] conducted a systematic review to compare the diagnostic accuracy of the most commonly used urinary tumor markers. A total of 42 studies published in English and German from 1990 through November 2001 were included in the systematic review. The BGLMM was fitted to estimate the overall sensitivity and specificity for the different tumor markers under comparison. The logit transformation was used for sensitivity and specificity values, and the additional heterogeneity between studies due to differences in study characteristics was accounted for. They found that cytology had the best specificity (94%, 95% CI = 90% to 96%), while telomerase had the best sensitivity (75%, 95% CI = 71% to 79%). We applied the proposed marginal beta-binomial model to the data and compared the results with those from fitting the BGLMM. As shown in Figure 3, the results from fitting the BGLMM and the marginal beta-binomial model to these data are generally consistent for sensitivities and specificities, but there are sizable differences in the estimation of the diagnostic odds ratio. Specifically, the confidence intervals for the diagnostic odds ratio produced by the marginal beta-binomial model are narrower than those from fitting the BGLMM for the telomerase, BTA TRAK and BTA datasets.
It is worth mentioning that for the telomerase dataset, the estimated correlation from the BGLMM is close to its boundary (ρ̂ = −0.95) and the point estimates of overall sensitivity and specificity from the BGLMM method are very sensitive to the choice of initial values. In contrast, the proposed marginal beta-binomial method does not require estimation of a correlation parameter and thus does not suffer from this estimation issue.

Figure 3.

Estimated overall sensitivities (left panel), specificities (middle panel) and diagnostic odds ratios (right panel) of identifying the bladder cancer and their 95% confidence intervals of five diagnostic tests. Solid segments: estimates from the marginal beta-binomial method; dashed segments: estimates from the BGLMM method.

5. Discussion

In this paper, we proposed a simple and robust approach for meta-analysis of studies with binary outcomes. This method is easy to implement with a closed-form likelihood function, does not need a link function to be specified, and does not suffer from the correlation parameter constraint of the Sarmanov beta-binomial model. More importantly, through simulation studies, we found that this model is more robust than the current bivariate generalized linear mixed model and the Sarmanov beta-binomial model under a variety of model misspecifications, and maintains high relative efficiency even when the underlying model is the Sarmanov beta-binomial model. Therefore, the marginal beta-binomial model is recommended as a useful alternative for practical investigators. The proposed method deals with meta-analyses with bivariate binary outcomes, such as meta-analyses of diagnostic accuracy studies and meta-analyses of case-control studies with binary exposure. It is worth mentioning that when dealing with meta-analyses of diagnostic accuracy studies, summary points (such as sensitivity, specificity and diagnostic odds ratio) and summary curves (such as SROC or HSROC) are the two most meaningful summaries of the data. One limitation of the proposed marginal beta-binomial method, when applied to the analysis of diagnostic accuracy studies, is that it does not yield a summary curve. Therefore, if the primary goal is to estimate the summary curve, fully parametric models such as the BGLMM [15] and the HSROC model [6] are preferred. If the goal is to estimate overall diagnostic measures, the proposed marginal beta-binomial model can make valid inference. Another limitation of the proposed method is that it does not estimate conditional means, which may be more useful in some settings. However, if the primary interest is to obtain summaries of diagnostic accuracy, the proposed method can be used for its simplicity and robustness.

The proposed method belongs to the well-studied family of composite likelihoods [30], in which conditional or marginal densities are multiplied. The point and variance estimates of the proposed model are the same as those from two univariate models when only the pooled estimates of the two outcomes are of interest. When functions of the pooled estimates (e.g., dOR) are considered, the proposed method can correctly estimate the covariance between the pooled estimates, which cannot be estimated by two separate univariate beta-binomial models.
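To see why the covariance between the pooled estimates matters for a function such as the dOR, consider a delta-method standard error for log dOR = logit(sensitivity) + logit(specificity). The R sketch below is illustrative only: `log_dor_delta` is a hypothetical helper, and the pooled estimates and covariance matrix are made-up numbers, not results from the paper.

```r
# Delta-method standard error of log(dOR) given pooled sensitivity and
# specificity and their 2 x 2 covariance matrix V.
log_dor_delta <- function(sens, spec, V) {
  est  <- qlogis(sens) + qlogis(spec)             # log diagnostic odds ratio
  grad <- c(1 / (sens * (1 - sens)),              # d logit(sens) / d sens
            1 / (spec * (1 - spec)))              # d logit(spec) / d spec
  var_log <- as.numeric(t(grad) %*% V %*% grad)
  list(log_dor = est, se = sqrt(var_log))
}

# Hypothetical inputs: a joint covariance matrix with negative covariance,
# and the version two separate univariate fits would imply (zero covariance).
V_joint <- matrix(c(0.0020, -0.0008, -0.0008, 0.0015), 2, 2)
V_indep <- diag(diag(V_joint))

joint <- log_dor_delta(0.80, 0.90, V_joint)
indep <- log_dor_delta(0.80, 0.90, V_indep)
c(joint_se = joint$se, indep_se = indep$se)
```

With these inputs the point estimate is identical in both cases, but ignoring the negative covariance overstates the standard error of log dOR. A positive covariance would have the opposite effect.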

Kuss et al. (2013) [29] recently proposed a novel model using bivariate copulas, which links the beta-binomial marginal distributions for the bivariate binary data. This model shares many advantages of the proposed marginal beta-binomial model, and is expected to gain a certain degree of efficiency if the copula is correctly specified. One major advantage of the copula model over the proposed marginal model is that it can provide an estimate of the SROC curve. On the other hand, the copula model may lead to biased results under a misspecified copula, and it is challenging to select among candidate copulas. In this sense, the proposed beta-binomial model provides a robust procedure for the analysis of bivariate binary data. The focus of this paper is to propose a simple and robust method for meta-analysis of bivariate binary outcomes. A comparison of the finite-sample performance of the proposed model and the copula beta-binomial model will be studied and reported in the future.

Recently, a trivariate random effects model for meta-analysis of diagnostic tests has been proposed by Chu et al. [68], which jointly models the study-specific disease prevalence and diagnostic accuracies. This model has the advantages of improved statistical efficiency and the ability to estimate more clinically relevant diagnostic accuracy measures (such as positive/negative predictive values). Extension of the current method to trivariate meta-analysis models is currently under investigation. In some situations, the reference test in diagnostic accuracy studies may be subject to measurement error, leading to an imperfect reference test. Random effects models for meta-analysis of diagnostic accuracy studies without a gold standard have been proposed [69, 70]. Extension of the current method to the situation without a gold standard will be investigated and reported elsewhere.

Acknowledgments

Yong Chen was supported by grant number R03HS022900 from the Agency for Healthcare Research and Quality. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.

Appendix

Section A: Derivation of the asymptotic results

Proof 1

Denote θ = (θ₁, θ₂). The marginal distribution of n_ijj is Beta-binomial (n_ijj; n_ij, a_j, b_j) for j = 1, 2. Therefore, we have,

$$E \left[ \frac{\partial}{\partial a_{1}} \log L_{1} (a_{1}, b_{1}); \theta, \rho \right] = 0 .$$

Equivalently, we have

$$E \left[ \frac{\partial}{\partial a_{1}} \log L_{p} (\theta); \theta, \rho \right] = 0 .$$

By a similar argument, we can show $E \left[ \frac{\partial}{\partial \theta} \log L_{p} (\theta); \theta, \rho \right] = 0$.

Therefore ∂log L_p(θ)/∂θ = 0 is an unbiased estimating equation for θ. Let θ̂ = arg max{log L_p(θ)}. By Taylor expansion around the true value θ, we have

$$0 \approx \frac{\partial \log L_{p} (\hat{\theta})}{\partial \theta} = \frac{\partial \log L_{p} (\theta)}{\partial \theta} + \frac{\partial^{2} \log L_{p} (\theta)}{\partial \theta^{2}} (\hat{\theta} - \theta) .$$

Hence we have

$$\sqrt{n} (\hat{\theta} - \theta) \approx \left\{ - \frac{1}{n} \frac{\partial^{2} \log L_{p} (\theta)}{\partial \theta^{2}} \right\}^{- 1} \left\{ \frac{1}{\sqrt{n}} \frac{\partial \log L_{p} (\theta)}{\partial \theta} \right\} .$$

The proof is completed by noting that $- n^{- 1} \partial^{2} \log L_{p} (\theta) / \partial \theta^{2} \overset{p}{\to} \Sigma_{1}$ and $n^{- 1 / 2} \partial \log L_{p} (\theta) / \partial \theta \overset{d}{\to} N (0, \Sigma_{2})$ as $n \to + \infty$, where

$$\Sigma_{1} = \begin{pmatrix} I_{11} & 0 \\ 0 & I_{22} \end{pmatrix}, \quad \text{and} \quad \Sigma_{2} = \begin{pmatrix} I_{11} & I_{12} \\ (I_{12})^{T} & I_{22} \end{pmatrix} .$$
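For reference, the limiting covariance implied by these two convergence results can be written out explicitly. The display below is a worked consequence via Slutsky's theorem, assuming $I_{11}$ and $I_{22}$ are symmetric and invertible; it uses only the blocks defined above.

```latex
\sqrt{n}\,(\hat{\theta} - \theta) \overset{d}{\to}
  N\!\left(0,\; \Sigma_{1}^{-1} \Sigma_{2} \Sigma_{1}^{-1}\right),
\qquad
\Sigma_{1}^{-1} \Sigma_{2} \Sigma_{1}^{-1}
  = \begin{pmatrix}
      I_{11}^{-1} & I_{11}^{-1} I_{12} I_{22}^{-1} \\
      I_{22}^{-1} (I_{12})^{T} I_{11}^{-1} & I_{22}^{-1}
    \end{pmatrix} .
```

The diagonal blocks coincide with the variances from the two separate univariate fits, while the off-diagonal block carries the covariance between the two sets of estimates that is needed for functions such as the dOR.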

Section B: SPLUS/R program to fit the marginal beta-binomial model and a working example

library(aod) # use this package to fit beta-binomial model
# Function to compute the log-likelihood in equation (4)
myLik.indep.log=function(mypar, mydat) {
  a1.temp <- mypar[1]; b1.temp <- mypar[2]
  a2.temp <- mypar[3]; b2.temp <- mypar[4]
  a1 <- exp(a1.temp); b1 <- exp(b1.temp)
  a2 <- exp(a2.temp); b2 <- exp(b2.temp)
  temp1 <- (lgamma(a1+mydat$y1) + lgamma(b1+mydat$n1-mydat$y1)
            + lgamma(a2+mydat$y2) + lgamma(b2+mydat$n2-mydat$y2)
            + lgamma(a1+b1) + lgamma(a2+b2))
  temp2 <- (lgamma(a1) + lgamma(b1) + lgamma(a2) + lgamma(b2)
            + lgamma(a1+b1+mydat$n1) + lgamma(a2+b2+mydat$n2))
  myLogLik <- sum(temp1 - temp2)
  return(myLogLik)
}
# Function to back-transform the parameters (a1,b1,a2,b2,rho) to original scale
par.cal=function(mypar) {
  a1 <- exp(mypar[1]); b1 <- exp(mypar[2])
  a2 <- exp(mypar[3]); b2 <- exp(mypar[4])
  eta <- mypar[5]
  cc <- sqrt(a1*a2*b1*b2)/sqrt((a1+b1+1)*(a2+b2+1))
  upper.bound <- cc/max(a1*b2, a2*b1)
  lower.bound <- -cc/max(a1*a2, b1*b2)
  expit.eta= exp(eta)/(1+exp(eta))
  rho <- (upper.bound-lower.bound)*expit.eta + lower.bound
  return(c(a1,b1,a2,b2,rho,eta))
}
# Function to calculate the estimates and Wald confidence interval of odds ratio for the marginal beta-binomial model
estimate.CL=function(y1,n1,y2,n2){
  initial.val.gen <- function(y, n) {
    BBfit <- betabin(cbind(y, n-y)~1, ~1, data=data.frame(y=y,n=n))
    expit.BB <- exp(as.numeric(BBfit@param[1]))/(1+exp(as.numeric (BBfit@param[1])))
    a.ini <- expit.BB*(1/as.numeric(BBfit@param[2])-1)
    b.ini <- (1/as.numeric(BBfit@param[2])-1)*(1-expit.BB)
    return(list(a=a.ini, b=b.ini))
  }
  mle.CL <- function(y1=y1,n1=n1,y2=y2,n2=n2) {
    init.val <- rep(0, 5)
    fit1 <- initial.val.gen(y1, n1)
    fit2 <- initial.val.gen(y2, n2)
    init.val[1] <- log(fit1$a);
    init.val[2] <- log(fit1$b)
    init.val[3] <- log(fit2$a);
    init.val[4] <- log(fit2$b)
    MLE.inde.log <- optim(init.val[1:4], myLik.indep.log, method = "L-BFGS-B",
                       lower=rep(-20,4), upper=rep(20,4),
                       control = list(fnscale=-1,maxit=1000),
                       hessian = T, mydat=list(y1=y1,n1=n1,y2=y2,n2=n2))
   mypar<- par.cal(MLE.inde.log$par)
   rho<-0;
   eta<-NA;
   hessian.log<-MLE.inde.log$hessian
   colnames(hessian.log)<-c("loga1","logb1","loga2","logb2")
   rownames(hessian.log)<-c("loga1","logb1","loga2","logb2")
   conv=MLE.inde.log$convergence
   a1 <- mypar[1]; b1 <- mypar[2];
   a2 <- mypar[3]; b2 <- mypar[4];
   prior.MLE<-c(a1, b1, a2, b2, rho ,eta)
  return(list(MLE=prior.MLE,hessian.log=hessian.log,conv=conv))
 }
  myLik.indep.vector=function(mypar, mydat) {
    a1.temp <- mypar[1]; b1.temp <- mypar[2]
    a2.temp <- mypar[3]; b2.temp <- mypar[4]
    a1 <- exp(a1.temp); b1 <- exp(b1.temp)
    a2 <- exp(a2.temp); b2 <- exp(b2.temp)
    temp1 <- (lgamma(a1+mydat$y1) + lgamma(b1+mydat$n1-mydat$y1)
            + lgamma(a2+mydat$y2) + lgamma(b2+mydat$n2-mydat$y2)
            + lgamma(a1+b1) + lgamma(a2+b2))
    temp2 <- (lgamma(a1) + lgamma(b1) + lgamma(a2) + lgamma(b2)
            + lgamma(a1+b1+mydat$n1) + lgamma(a2+b2+mydat$n2))
    myLogLik <- (temp1 - temp2)
    return(myLogLik)
  }
  sandwich.var.cal=function(mypar, myhessian.indep, mydat){
    npar=length(mypar)
    delta=1e-8
    myD = matrix(NA, nrow=npar, ncol=length(mydat$y1)) # one column per study
    for(i in 1:npar){
      par.for=par.back=mypar
      par.for[i]=par.for[i]+delta
      par.back[i]=par.back[i]-delta
      myD[i,] = (myLik.indep.vector(par.for, mydat)- myLik.indep.vector(par.back,mydat))/(2*delta)
    }
    score.squared = myD%*%t(myD)
    inv.hessian = solve(-myhessian.indep)
    results=inv.hessian%*% score.squared %*%inv.hessian
    return(results)
  }
  out=mle.CL(y1=y1,n1=n1,y2=y2,n2=n2)
  mle=out$MLE
  hessian.log=out$hessian.log
  a1=mle[1];b1=mle[2]
  a2=mle[3];b2=mle[4]
  mypar=log(c(a1,b1,a2,b2))
  mydat=list(y1=y1,n1=n1,y2=y2,n2=n2)
  covar.sandwich=sandwich.var.cal(mypar, hessian.log, mydat)
  myD <- matrix(c(-1, 1, 1, -1), nrow=1)
  myVar.log <- as.numeric(myD %*% covar.sandwich %*% t(myD))
  logOR <- log((a2/b2)/(a1/b1))
  OR_CI=c(exp(logOR-1.96*sqrt(myVar.log)), exp(logOR+1.96*sqrt(myVar.log)))
  return(list(logOR=logOR, OR_CI=OR_CI, se=sqrt(myVar.log),hessian.log=hessian.log,conv=out$conv,mle=mle))
}
# Example 2 in Section 4.2: dataset from Ye and Parry (2002) Med Sci Monit
# Dataset
perc1.dat <- c(.244, .422, .317, .417, .464, .449, .917, .458, .446, .354, .408, .425, .338, .433,
               .717, .953, .44, .411, .461, .773)
n1.dat <- c(41, 45, 41, 96, 28, 205, 36, 329, 112, 96, 343, 174, 201, 221, 187, 100, 200,
            1963, 258, 209)
y1.dat <- round(perc1.dat*n1.dat)
perc2.dat <- c(.551, .551, .535, .45, .455, .412, .917, .479, .475, .311, .364, .42, .386, .382,
                 .722, .93, .433, .573, .50, .778)
n2.dat <- c(49, 49, 43, 109, 44, 34, 36, 234, 202, 103, 275, 174, 114, 212, 216, 106, 527, 1624,
              120, 200)
y2.dat <- round(perc2.dat*n2.dat)
nstudy=length(n1.dat)
# Fit the marginal beta-binomial model
out.mar=estimate.CL(y1.dat,n1.dat,y2.dat,n2.dat)
OR_CI.mar=out.mar$OR_CI
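
Because the log-likelihood in `myLik.indep.log` is a sum of two terms with separate parameters, (a1, b1) for the first margin and (a2, b2) for the second, each margin can also be maximized on its own. The following is a minimal base-R sketch of that idea, assuming only `optim` and no `aod` starting values. The helper names `bb.loglik` and `fit.margin` and the starting value a = b = 1 are hypothetical choices made for illustration; this is not the authors' program and it does not compute the sandwich variance.

```r
# Sketch only: bb.loglik and fit.margin are hypothetical helpers,
# not part of the program listed above.

# Beta-binomial log-likelihood of one margin, parameterized by log(a), log(b).
# Binomial coefficients are omitted because they do not involve (a, b).
bb.loglik <- function(logpar, y, n) {
  a <- exp(logpar[1]); b <- exp(logpar[2])
  sum(lgamma(a + y) + lgamma(b + n - y) + lgamma(a + b) -
      lgamma(a) - lgamma(b) - lgamma(a + b + n))
}

# Maximize one margin; the default start c(0, 0) corresponds to a = b = 1
# (an arbitrary starting value assumed for this sketch).
fit.margin <- function(y, n, start = c(0, 0)) {
  fit <- optim(start, bb.loglik, y = y, n = n, method = "BFGS",
               control = list(fnscale = -1, maxit = 1000))
  list(a = exp(fit$par[1]), b = exp(fit$par[2]),
       logLik = fit$value, conv = fit$convergence)
}

# Counts from Table C.1 (group 1 = controls, group 2 = cases)
y1 <- c(10, 19, 13, 40, 13, 92, 33, 151, 50, 34,
        140, 74, 68, 96, 134, 95, 88, 807, 119, 162)
n1 <- c(41, 45, 41, 96, 28, 205, 36, 329, 112, 96,
        343, 174, 201, 221, 187, 100, 200, 1963, 258, 209)
y2 <- c(27, 27, 23, 49, 20, 14, 33, 112, 96, 32,
        100, 73, 44, 81, 156, 99, 228, 931, 60, 156)
n2 <- c(49, 49, 43, 109, 44, 34, 36, 234, 202, 103,
        275, 174, 114, 212, 216, 106, 527, 1624, 120, 200)

fit1 <- fit.margin(y1, n1)
fit2 <- fit.margin(y2, n2)

# Same definition of the log odds ratio as in estimate.CL
logOR <- log((fit2$a / fit2$b) / (fit1$a / fit1$b))
```

Inspecting `fit1$conv` and `fit2$conv` (0 indicates that `optim` reported convergence) before interpreting `logOR` is advisable, since this sketch uses a crude starting value rather than the `betabin` fit used above.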

Section C: Data in Section 4.2

Table C.1.

Data from a meta-analysis of studies on the association between rapid N-acetyltransferase 2 (NAT2) acetylator status (exposed and not exposed) and colorectal cancer risk (cases and controls) [66]. Here “Total” denotes the total number of subjects in the case or control group, and “Exposure” denotes the number of subjects who were exposed.

Author	Cases (Exposure)	Cases (Total)	Controls (Exposure)	Controls (Total)
Ilett	27	49	10	41
Ilett	27	49	19	45
Wohlleb	23	43	13	41
Ladero	49	109	40	96
Rodriguez	20	44	13	28
Lang	14	34	92	205
Oda	33	36	33	36
Shibuta	112	234	151	329
Bell	96	202	50	112
Spurr	32	103	34	96
Hubbard	100	275	140	343
Welfare	73	174	74	174
Gil	44	114	68	201
Chen	81	212	96	221
Lee	156	216	134	187
Yoshika	99	106	95	100
Potter	228	527	88	200
Slattery	931	1624	807	1963
Agundez	60	120	119	258
Butler	156	200	162	209
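
The program in Section B enters the data as exposure percentages and group sizes and recovers the counts with `round(perc * n)`. The sketch below checks that this reconstruction agrees with the counts tabulated above and, as a simple descriptive summary, computes the study-specific sample odds ratios. The variable names other than `perc1.dat` and `perc2.dat` are hypothetical, and the sample odds ratios are not part of the authors' analysis.

```r
# Counts transcribed from Table C.1 (variable names are illustrative)
case.exposed    <- c(27, 27, 23, 49, 20, 14, 33, 112, 96, 32,
                     100, 73, 44, 81, 156, 99, 228, 931, 60, 156)
case.total      <- c(49, 49, 43, 109, 44, 34, 36, 234, 202, 103,
                     275, 174, 114, 212, 216, 106, 527, 1624, 120, 200)
control.exposed <- c(10, 19, 13, 40, 13, 92, 33, 151, 50, 34,
                     140, 74, 68, 96, 134, 95, 88, 807, 119, 162)
control.total   <- c(41, 45, 41, 96, 28, 205, 36, 329, 112, 96,
                     343, 174, 201, 221, 187, 100, 200, 1963, 258, 209)

# Percentages as entered in Section B (group 1 = controls, group 2 = cases)
perc1.dat <- c(.244, .422, .317, .417, .464, .449, .917, .458, .446, .354,
               .408, .425, .338, .433, .717, .953, .44, .411, .461, .773)
perc2.dat <- c(.551, .551, .535, .45, .455, .412, .917, .479, .475, .311,
               .364, .42, .386, .382, .722, .93, .433, .573, .50, .778)

# Do the rounded products reproduce the tabulated counts?
controls.match <- all(round(perc1.dat * control.total) == control.exposed)
cases.match    <- all(round(perc2.dat * case.total) == case.exposed)

# Study-specific sample odds ratios of exposure, cases versus controls
crude.OR <- (case.exposed / (case.total - case.exposed)) /
            (control.exposed / (control.total - control.exposed))
```

Both `controls.match` and `cases.match` should evaluate to `TRUE` if the two sections describe the same data; a `FALSE` would point to a transcription difference between the percentages and the table.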


References

1. Jackson D, Riley R, White I. Multivariate meta-analysis: Potential and promise. Statistics in Medicine. 2011;30(20):2481–2498. doi: 10.1002/sim.4172.
2. Borenstein M, Hedges LV, Higgins JP, Rothstein HR. Introduction to meta-analysis. Wiley; 2011.
3. Borenstein M, Hedges L, Higgins J, Rothstein H. Introduction to Meta-Analysis. Wiley Online Library; 2009.
4. Reitsma J, Glas A, Rutjes A, Scholten R, Bossuyt P, Zwinderman A. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of clinical epidemiology. 2005;58(10):982–990. doi: 10.1016/j.jclinepi.2005.02.022.
5. Irwig L, Macaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. Journal of clinical epidemiology. 1995;48(1):119–130. doi: 10.1016/0895-4356(94)00099-c.
6. Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in medicine. 2001;20(19):2865–2884. doi: 10.1002/sim.942.
7. Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary roc curve: Data-analytic approaches and some additional considerations. Statistics in medicine. 2007;12(14):1293–1316. doi: 10.1002/sim.4780121403.
8. Chen Y, Chu H, Luo S, Nie L, Chen S. Bayesian analysis on meta-analysis of case-control studies accounting for within-study correlation. Statistical methods in medical research. 2014a. doi: 10.1177/0962280211430889. (in press)
9. Littenberg B, Moses L. Estimating diagnostic accuracy from multiple conflicting reports. Medical Decision Making. 1993;13(4):313. doi: 10.1177/0272989X9301300408.
10. Moses L, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary roc curve: Data-analytic approaches and some additional considerations. Statistics in medicine. 1993;12(14):1293–1316. doi: 10.1002/sim.4780121403.
11. Walter S. Properties of the summary receiver operating characteristic (sroc) curve for diagnostic test data. Statistics in medicine. 2002;21(9):1237–1256. doi: 10.1002/sim.1099.
12. Arends L, Hamza T, Van Houwelingen J, Heijenbrok-Kal M, Hunink M, Stijnen T. Bivariate random effects meta-analysis of roc curves. Medical Decision Making. 2008;28(5):621–638. doi: 10.1177/0272989X08319957.
13. Van Houwelingen H, Zwinderman K, Stijnen T. A bivariate approach to meta-analysis. Statistics in Medicine. 1993;12(24):2273–2284. doi: 10.1002/sim.4780122405.
14. Van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in medicine. 2002;21(4):589–624. doi: 10.1002/sim.1040.
15. Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. Journal of clinical epidemiology. 2006;59(12):1331–1332. doi: 10.1016/j.jclinepi.2006.06.011.
16. Chu H, Guo H, Zhou Y. Bivariate random effects meta-analysis of diagnostic studies using generalized linear mixed models. Medical Decision Making. 2010;30(4):499–508. doi: 10.1177/0272989X09353452.
17. Hamza T, Reitsma J, Stijnen T. Meta-analysis of diagnostic studies: A comparison of random intercept, normal-normal, and binomial-normal bivariate summary roc approaches. Medical Decision Making. 2008;28(5):639–649. doi: 10.1177/0272989X08323917.
18. Ma X, Nie L, Cole SR, Chu H. Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial. Statistical methods in medical research. 2013. doi: 10.1177/0962280213492588.
19. Hamza T, van Houwelingen H, Stijnen T. The binomial distribution of meta-analysis was preferred to model within-study variability. Journal of clinical epidemiology. 2008;61(1):41–51. doi: 10.1016/j.jclinepi.2007.03.016.
20. Chu H, Nie L, Chen Y, Huang Y, Sun W. Bivariate random effects models for meta-analysis of comparative studies with binary outcomes: Methods for the absolute risk difference and relative risk. Statistical methods in medical research. 2012;21(6):621–633. doi: 10.1177/0962280210393712.
21. Menke J. Bivariate random-effects meta-analysis of sensitivity and specificity with sas proc glimmix. Methods Inf Med. 2010;49:54–64. doi: 10.3414/ME09-01-0001.
22. Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder MN, Nielsen A, Sibert J. Ad model builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software. 2012;27(2):233–249.
23. Sarmanov O. Generalized normal correlation and two-dimensional Fréchet classes. Soviet Mathematics Doklady. 1966;7:596–599.
24. Lee M. Properties and applications of the Sarmanov family of bivariate distributions. Communications in Statistics-Theory and Methods. 1996;25(6):1207–1222.
25. Self SG, Liang KY. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association. 1987;82(398):605–610.
26. Chen Y, Liang KY. On the asymptotic behaviour of the pseudolikelihood ratio test statistic with boundary problems. Biometrika. 2010;97(3):603–620. doi: 10.1093/biomet/asq031.
27. Danaher PJ, Hardie BGS. Bacon with your eggs? applications of a new bivariate beta-binomial distribution. The American Statistician. 2005;59(4):282–286.
28. Danaher PJ, Smith MS. Modeling multivariate distributions using copulas: Applications in marketing. Marketing Science. 2011;30(1):4–21.
29. Kuss O, Hoyer A, Solms A. Meta-analysis for diagnostic accuracy studies: a new statistical model using beta-binomial distributions and bivariate copulas. Statistics in medicine. 2014;33(1):17–30. doi: 10.1002/sim.5909.
30. Lindsay B. Composite likelihood methods. Contemporary Mathematics. 1988;80(1):221–39.
31. Varin C. On composite marginal likelihoods. AStA Advances in Statistical Analysis. 2008;92(1):1–28.
32. Kent J. Robust properties of likelihood ratio tests. Biometrika. 1982;69(1):19–27.
33. Molenberghs G, Verbeke G. Models for discrete longitudinal data. Springer; 2005.
34. Henderson R, Shimakura S. A serially correlated gamma frailty model for longitudinal count data. Biometrika. 2003;90(2):355–366.
35. Fieuws S, Verbeke G. Pairwise fitting of mixed models for the joint modeling of multivariate longitudinal profiles. Biometrics. 2006;62(2):424–431. doi: 10.1111/j.1541-0420.2006.00507.x.
36. Fieuws S, Verbeke G, Molenberghs G. Random-effects models for multivariate repeated measures. Statistical Methods in Medical Research. 2007;16(5):387–397. doi: 10.1177/0962280206075305.
37. Fieuws S, Verbeke G, Maes B, Vanrenterghem Y. Predicting renal graft failure using multivariate longitudinal profiles. Biostatistics. 2008;9(3):419–431. doi: 10.1093/biostatistics/kxm041.
38. Barry S, Bowman A. Linear mixed models for longitudinal shape data with applications to facial modeling. Biostatistics. 2008;9(3):555–565. doi: 10.1093/biostatistics/kxm056.
39. Sklar A. Random variables, distribution functions, and copulas: a personal look backward and forward. Lecture notes-monograph series. 1996:1–14.
40. Nelsen RB. An introduction to copulas. Vol. 139. Springer Science & Business Media; 1999.
41. Besag J. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B (Methodological) 1974;36(2):192–236.
42. Cox D, Reid N. A note on pseudolikelihood constructed from marginal densities. Biometrika. 2004;91(3):729–737.
43. Varin C, Reid N, Firth D. An overview of composite likelihood methods. Statistica Sinica. 2011;21(1):5–42.
44. Lindsay B, Grace Y, Sun J. Issues and strategies in the selection of composite likelihoods. Statistica Sinica. 2011;21:71–105.
45. Joe H, Lee Y. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis. 2009;100(4):670–685.
46. Wellner J, Zhang Y. Two estimators of the mean of a counting process with panel count data. Annals of Statistics. 2000:779–814.
47. Zhang Y. A semiparametric pseudolikelihood estimation method for panel count data. Biometrika. 2002;89(1):39–48.
48. Wellner J, Zhang Y. Two likelihood-based semiparametric estimation methods for panel count data with covariates. The Annals of Statistics. 2007;35(5):2106–2142.
49. Heagerty P, Lele S. A composite likelihood approach to binary spatial data. Journal of the American Statistical Association. 1998;93(443):1099–1111.
50. Varin C, Høst G, Skare Ø. Pairwise likelihood inference in spatial generalized linear mixed models. Computational statistics & data analysis. 2005;49(4):1173–1191.
51. Varin C, Vidoni P. Pairwise likelihood inference for ordinal categorical time series. Computational statistics & data analysis. 2006;51(4):2365–2373.
52. Apanasovich T, Ruppert D, Lupton J, Popovic N, Turner N, Chapkin R, Carroll R. Aberrant crypt foci and semiparametric modeling of correlated binary data. Biometrics. 2008;64(2):490–500. doi: 10.1111/j.1541-0420.2007.00892.x.
53. Troxel A, Lipsitz S, Harrington D. Marginal models for the analysis of longitudinal measurements with nonignorable non-monotone missing data. Biometrika. 1998;85(3):661–672.
54. Parzen M, Lipsitz S, Fitzmaurice G, Ibrahim J, Troxel A. Pseudo-likelihood methods for longitudinal binary data with non-ignorable missing responses and covariates. Statistics in medicine. 2006;25(16):2784–2796. doi: 10.1002/sim.2435.
55. Parzen M, Lipsitz S, Fitzmaurice G, Ibrahim J, Troxel A, Molenberghs G. Pseudo-likelihood methods for the analysis of longitudinal binary data subject to nonignorable non-monotone missingness. Journal of data science. 2007;5(1):103–129.
56. He W, Yi G. A pairwise likelihood method for correlated binary data with/without missing observations under generalized partially linear single-index models. Statistica Sinica. 2011;21(1):207–229.
57. Mardia K, Kent J, Hughes G, Taylor C. Maximum likelihood estimation using composite likelihoods for closed exponential families. Biometrika. 2009;96(4):975–982.
58. Han F, Pan W. A composite likelihood approach to latent multivariate gaussian modeling of snp data with application to genetic association testing. Biometrics. 2011;68(1):307–315. doi: 10.1111/j.1541-0420.2011.01649.x.
59. Xue L, Zou H, Cai T. Nonconcave penalized composite conditional likelihood estimation of sparse ising models. The Annals of Statistics. 2012;40(3):1403–1429.
60. Chandler R, Bate S. Inference for clustered data using the independence loglikelihood. Biometrika. 2007;94(1):167–183.
61. Lesnoff M, Lancelot R. Aod: analysis of overdispersed data. R package version. 2010;1
62. Xing Y, Bronstein Y, Ross M, Askew R, Lee J, Gershenwald J, Royal R, Cormier J. Contemporary diagnostic imaging modalities for the staging and surveillance of melanoma patients: a meta-analysis. Journal of the National Cancer Institute. 2011;103(2):129–142. doi: 10.1093/jnci/djq455.
63. Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics. 2007;8(2):239–251. doi: 10.1093/biostatistics/kxl004.
64. Jun Yan. Enjoy the joy of copulas: With a package copula. Journal of Statistical Software. 2007;21(4):1–21. URL http://www.jstatsoft.org/v21/i04/
65. Ivan Kojadinovic, Jun Yan. Modeling multivariate distributions with continuous margins using the copula R package. Journal of Statistical Software. 2010;34(9):1–20. URL http://www.jstatsoft.org/v34/i09/
66. Ye Z, Parry JM. Meta-analysis of 20 case-control studies on the n-acetyltransferase 2 acetylation status and colorectal cancer risk. Medical Science Review. 2002;8(8):CR558–CR565.
67. Glas AS, Roos D, Deutekom M, Zwinderman AH, Bossuyt PM, Kurth KH. Tumor markers in the diagnosis of primary bladder cancer. a systematic review. The Journal of urology. 2003;169(6):1975–1982. doi: 10.1097/01.ju.0000067461.30468.6d.
68. Chu H, Nie L, Cole SR, Poole C. Meta-analysis of diagnostic accuracy studies accounting for disease prevalence: Alternative parameterizations and model selection. Statistics in medicine. 2009;28(18):2384–2399. doi: 10.1002/sim.3627.
69. Chu H, Chen S, Louis TA. Random effects models in a meta-analysis of the accuracy of two diagnostic tests without a gold standard. Journal of the American Statistical Association. 2009;104(486):512–523. doi: 10.1198/jasa.2009.0017.
70. Dendukuri N, Schiller I, Joseph L, Pai M. Bayesian meta-analysis of the accuracy of a test for tuberculous pleuritis in the absence of a gold standard reference. Biometrics. 2012;68(4):1285–1293. doi: 10.1111/j.1541-0420.2012.01773.x.

[R1] 1.Jackson D, Riley R, White I. Multivariate meta-analysis: Potential and promise. Statistics in Medicine. 2011;30(20):2481–2498. doi: 10.1002/sim.4172. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R2] 2.Borenstein M, Hedges LV, Higgins JP, Rothstein HR. Introduction to meta-analysis. Wiley; 2011. [Google Scholar]

[R3] 3.Borenstein M, Hedges L, Higgins J, Rothstein H. Introductionto Meta-Analysis. Wiley Online Library; 2009. [Google Scholar]

[R4] 4.Reitsma J, Glas A, Rutjes A, Scholten R, Bossuyt P, Zwinderman A. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of clinical epidemiology. 2005;58(10):982–990. doi: 10.1016/j.jclinepi.2005.02.022. [DOI] [PubMed] [Google Scholar]

[R5] 5.Irwig L, Macaskill P, Glasziou P, Fahey M. Meta-analytic methods for diagnostic test accuracy. Journal of clinical epidemiology. 1995;48(1):119–130. doi: 10.1016/0895-4356(94)00099-c. [DOI] [PubMed] [Google Scholar]

[R6] 6.Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in medicine. 2001;20(19):2865–2884. doi: 10.1002/sim.942. [DOI] [PubMed] [Google Scholar]

[R7] 7.Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary roc curve: Data-analytic approaches and some additional considerations. Statistics in medicine. 2007;12(14):1293–1316. doi: 10.1002/sim.4780121403. [DOI] [PubMed] [Google Scholar]

[R8] 8.Chen Y, Chu H, Luo S, Nie L, Chen S. Bayesian analysis on meta-analysis of case-control studies accounting for within-study correlation. Statistical methods in medical research. 2014a doi: 10.1177/0962280211430889. (in press) [DOI] [PMC free article] [PubMed] [Google Scholar]

[R9] 9.Littenberg B, Moses L. Estimating diagnostic accuracy from multiple conflicting reports. Medical Decision Making. 1993;13(4):313. doi: 10.1177/0272989X9301300408. [DOI] [PubMed] [Google Scholar]

[R10] 10.Moses L, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary roc curve: Data-analytic approaches and some additional considerations. Statistics in medicine. 1993;12(14):1293–1316. doi: 10.1002/sim.4780121403. [DOI] [PubMed] [Google Scholar]

[R11] 11.Walter S. Properties of the summary receiver operating characteristic (sroc) curve for diagnostic test data. Statistics in medicine. 2002;21(9):1237–1256. doi: 10.1002/sim.1099. [DOI] [PubMed] [Google Scholar]

[R12] 12.Arends L, Hamza T, Van Houwelingen J, Heijenbrok-Kal M, Hunink M, Stijnen T. Bivariate random effects meta-analysis of roc curves. Medical Decision Making. 2008;28(5):621–638. doi: 10.1177/0272989X08319957. [DOI] [PubMed] [Google Scholar]

[R13] 13.Van Houwelingen H, Zwinderman K, Stijnen T. A bivariate approach to meta-analysis. Statistics in Medicine. 1993;12(24):2273–2284. doi: 10.1002/sim.4780122405. [DOI] [PubMed] [Google Scholar]

[R14] 14.Van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in medicine. 2002;21(4):589–624. doi: 10.1002/sim.1040. [DOI] [PubMed] [Google Scholar]

[R15] 15.Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. Journal of clinical epidemiology. 2006;59(12):1331–1332. doi: 10.1016/j.jclinepi.2006.06.011. [DOI] [PubMed] [Google Scholar]

[R16] 16.Chu H, Guo H, Zhou Y. Bivariate random effects meta-analysis of diagnostic studies using generalized linear mixed models. Medical Decision Making. 2010;30(4):499–508. doi: 10.1177/0272989X09353452. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R17] 17.Hamza T, Reitsma J, Stijnen T. Meta-analysis of diagnostic studies: A comparison of random intercept, normal-normal, and binomial-normal bivariate summary roc approaches. Medical Decision Making. 2008;28(5):639–649. doi: 10.1177/0272989X08323917. [DOI] [PubMed] [Google Scholar]

[R18] 18.Ma X, Nie L, Cole SR, Chu H. Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial. Statistical methods in medical research. 2013 doi: 10.1177/0962280213492588. 0962280213492 588. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] 19.Hamza T, van Houwelingen H, Stijnen T. The binomial distribution of meta-analysis was preferred to model within-study variability. Journal of clinical epidemiology. 2008;61(1):41–51. doi: 10.1016/j.jclinepi.2007.03.016. [DOI] [PubMed] [Google Scholar]

[R20] 20.Chu H, Nie L, Chen Y, Huang Y, Sun W. Bivariate random effects models for meta-analysis of comparative studies with binary outcomes: Methods for the absolute risk difference and relative risk. Statistical methods in medical research. 2012;21(6):621–633. doi: 10.1177/0962280210393712. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R21] 21.Menke J. Bivariate random-effects meta-analysis of sensitivity and specificity with sas proc glimmix. Methods Inf Med. 2010;49:54–64. doi: 10.3414/ME09-01-0001. [DOI] [PubMed] [Google Scholar]

[R22] 22.Fournier DA, Skaug HJ, Ancheta J, Ianelli J, Magnusson A, Maunder MN, Nielsen A, Sibert J. Ad model builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software. 2012;27(2):233–249. [Google Scholar]

[R23] 23.Sarmanov O. Generalized normal correlation and two-dimensional Fréchet classes. Soviet MathematicsDoklady. 1966;7:596–599. [Google Scholar]

[R24] 24.Lee M. Properties and applications of the Sarmanov family of bivariate distributions. Communications in Statistics-Theory and Methods. 1996;25(6):1207–1222. [Google Scholar]

[R25] 25.Self SG, Liang KY. Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association. 1987;82(398):605–610. [Google Scholar]

[R26] 26.Chen Y, Liang KY. On the asymptotic behaviour of the pseudolikelihood ratio test statistic with boundary problems. Biometrika. 2010;97(3):603–620. doi: 10.1093/biomet/asq031. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R27] 27.Danaher PJ, Hardie BGS. Bacon with your eggs? applications of a new bivariate beta-binomial distribution. The American Statistician. 2005;59(4):282–286. [Google Scholar]

[R28] 28.Danaher PJ, Smith MS. Modeling multivariate distributions using copulas: Applications in marketing. Marketing Science. 2011;30(1):4–21. [Google Scholar]

[R29] 29.Kuss O, Hoyer A, Solms A. Meta-analysis for diagnostic accuracy studies: a new statistical model using beta-binomial distributions and bivariate copulas. Statistics in medicine. 2014;33(1):17–30. doi: 10.1002/sim.5909. [DOI] [PubMed] [Google Scholar]

[R30] 30.Lindsay B. Composite likelihood methods. Contemporary Mathematics. 1988;80(1):221–39. [Google Scholar]

[R31] 31.Varin C. On composite marginal likelihoods. AStA Advances in Statistical Analysis. 2008;92(1):1–28. [Google Scholar]

[R32] 32.Kent J. Robust properties of likelihood ratio tests. Biometrika. 1982;69(1):19–27. [Google Scholar]

[R33] 33.Molenberghs G, Verbeke G. Models for discrete longitudinal data. Springer; 2005. [Google Scholar]

[R34] 34.Henderson R, Shimakura S. A serially correlated gamma frailty model for longitudinal count data. Biometrika. 2003;90(2):355–366. [Google Scholar]

[R35] 35.Fieuws S, Verbeke G. Pairwise fitting of mixed models for the joint modeling of multivariate longitudinal profiles. Biometrics. 2006;62(2):424–431. doi: 10.1111/j.1541-0420.2006.00507.x. [DOI] [PubMed] [Google Scholar]

[R36] 36.Fieuws S, Verbeke G, Molenberghs G. Random-effects models for multivariate repeated measures. Statistical Methods in Medical Research. 2007;16(5):387–397. doi: 10.1177/0962280206075305. [DOI] [PubMed] [Google Scholar]

[R37] 37.Fieuws S, Verbeke G, Maes B, Vanrenterghem Y. Predicting renal graft failure using multivariate longitudinal profiles. Biostatistics. 2008;9(3):419–431. doi: 10.1093/biostatistics/kxm041. [DOI] [PubMed] [Google Scholar]

[R38] 38.Barry S, Bowman A. Linear mixed models for longitudinal shape data with applications to facial modeling. Biostatistics. 2008;9(3):555–565. doi: 10.1093/biostatistics/kxm056. [DOI] [PubMed] [Google Scholar]

[R39] 39.Sklar A. Random variables, distribution functions, and copulas: a personal look backward and forward. Lecture notes-monograph series. 1996:1–14. [Google Scholar]

[R40] 40.Nelsen RB. An introduction to copulas. Vol. 139. Springer Science & Business Media; 1999. [Google Scholar]

[R41] 41.Besag J. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B (Methodological) 1974;36(2):192–236. [Google Scholar]

[R42] 42.Cox D, Reid N. A note on pseudolikelihood constructed from marginal densities. Biometrika. 2004;91(3):729–737. [Google Scholar]

[R43] 43.Varin C, Reid N, Firth D. An overview of composite likelihood methods. Statistica Sinica. 2011;21(1):5–42. [Google Scholar]

[R44] 44.Lindsay B, Grace Y, Sun J. Issues and strategies in the selection of composite likelihoods. Statistica Sinica. 2011;21:71–105. [Google Scholar]

[R45] 45.Joe H, Lee Y. On weighting of bivariate margins in pairwise likelihood. Journal of Multivariate Analysis. 2009;100(4):670–685. [Google Scholar]

[R46] 46.Wellner J, Zhang Y. Two estimators of the mean of a counting process with panel count data. Annals of Statistics. 2000:779–814. [Google Scholar]

[R47] 47.Zhang Y. A semiparametric pseudolikelihood estimation method for panel count data. Biometrika. 2002;89(1):39–48. [Google Scholar]

[R48] 48.Wellner J, Zhang Y. Two likelihood-based semiparametric estimation methods for panel count data with covariates. The Annals of Statistics. 2007;35(5):2106–2142. [Google Scholar]

[R49] 49.Heagerty P, Lele S. A composite likelihood approach to binary spatial data. Journal of the American Statistical Association. 1998;93(443):1099–1111. [Google Scholar]

[R50] 50.Varin C, Høst G, Skare Ø. Pairwise likelihood inference in spatial generalized linear mixed models. Computational statistics & data analysis. 2005;49(4):1173–1191. [Google Scholar]

[R51] 51.Varin C, Vidoni P. Pairwise likelihood inference for ordinal categorical time series. Computational statistics & data analysis. 2006;51(4):2365–2373. [Google Scholar]

[R52] 52.Apanasovich T, Ruppert D, Lupton J, Popovic N, Turner N, Chapkin R, Carroll R. Aberrant crypt foci and semiparametric modeling of correlated binary data. Biometrics. 2008;64(2):490–500. doi: 10.1111/j.1541-0420.2007.00892.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R53] 53.Troxel A, Lipsitz S, Harrington D. Marginal models for the analysis of longitudinal measurements with nonignorable non-monotone missing data. Biometrika. 1998;85(3):661–672. [Google Scholar]

[R54] 54.Parzen M, Lipsitz S, Fitzmaurice G, Ibrahim J, Troxel A. Pseudo-likelihood methods for longitudinal binary data with non-ignorable missing responses and covariates. Statistics in medicine. 2006;25(16):2784–2796. doi: 10.1002/sim.2435. [DOI] [PubMed] [Google Scholar]

[R55] 55.Parzen M, Lipsitz S, Fitzmaurice G, Ibrahim J, Troxel A, Molenberghs G. Pseudo-likelihood methods for the analysis of longitudinal binary data subject to nonignorable non-monotone missingness. Journal of data science. 2007;5(1):103–129. [Google Scholar]

[R56] 56.He W, Yi G. A pairwise likelihood method for correlated binary data with/without missing observations under generalized partially linear single-index models. Statistica Sinica. 2011;21(1):207–229. [Google Scholar]

[R57] 57.Mardia K, Kent J, Hughes G, Taylor C. Maximum likelihood estimation using composite likelihoods for closed exponential families. Biometrika. 2009;96(4):975–982. [Google Scholar]

58. Han F, Pan W. A composite likelihood approach to latent multivariate Gaussian modeling of SNP data with application to genetic association testing. Biometrics. 2011;68(1):307–315. doi: 10.1111/j.1541-0420.2011.01649.x.

59. Xue L, Zou H, Cai T. Nonconcave penalized composite conditional likelihood estimation of sparse Ising models. The Annals of Statistics. 2012;40(3):1403–1429.

60. Chandler R, Bate S. Inference for clustered data using the independence loglikelihood. Biometrika. 2007;94(1):167–183.

61. Lesnoff M, Lancelot R. aod: analysis of overdispersed data. R package version. 2010;1

62. Xing Y, Bronstein Y, Ross M, Askew R, Lee J, Gershenwald J, Royal R, Cormier J. Contemporary diagnostic imaging modalities for the staging and surveillance of melanoma patients: a meta-analysis. Journal of the National Cancer Institute. 2011;103(2):129–142. doi: 10.1093/jnci/djq455.

63. Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics. 2007;8(2):239–251. doi: 10.1093/biostatistics/kxl004.

64. Yan J. Enjoy the joy of copulas: With a package copula. Journal of Statistical Software. 2007;21(4):1–21. URL http://www.jstatsoft.org/v21/i04/

65. Kojadinovic I, Yan J. Modeling multivariate distributions with continuous margins using the copula R package. Journal of Statistical Software. 2010;34(9):1–20. URL http://www.jstatsoft.org/v34/i09/

66. Ye Z, Parry JM. Meta-analysis of 20 case-control studies on the N-acetyltransferase 2 acetylation status and colorectal cancer risk. Medical Science Review. 2002;8(8):CR558–CR565.

67. Glas AS, Roos D, Deutekom M, Zwinderman AH, Bossuyt PM, Kurth KH. Tumor markers in the diagnosis of primary bladder cancer. A systematic review. The Journal of Urology. 2003;169(6):1975–1982. doi: 10.1097/01.ju.0000067461.30468.6d.

68. Chu H, Nie L, Cole SR, Poole C. Meta-analysis of diagnostic accuracy studies accounting for disease prevalence: Alternative parameterizations and model selection. Statistics in Medicine. 2009;28(18):2384–2399. doi: 10.1002/sim.3627.

69. Chu H, Chen S, Louis TA. Random effects models in a meta-analysis of the accuracy of two diagnostic tests without a gold standard. Journal of the American Statistical Association. 2009;104(486):512–523. doi: 10.1198/jasa.2009.0017.

70. Dendukuri N, Schiller I, Joseph L, Pai M. Bayesian meta-analysis of the accuracy of a test for tuberculous pleuritis in the absence of a gold standard reference. Biometrics. 2012;68(4):1285–1293. doi: 10.1111/j.1541-0420.2012.01773.x.
