Abstract
Several statistical methods for meta-analysis of diagnostic accuracy studies have been discussed in the presence of a gold standard. In practice, however, the selected reference test may be imperfect due to measurement error, or due to the non-existence, invasive nature, or high cost of a gold standard. It has been suggested that treating an imperfect reference test as a gold standard can lead to substantial bias in the estimation of diagnostic test accuracy. Recently, two models have been proposed to account for an imperfect reference test, namely a multivariate generalized linear mixed model (MGLMM) and a hierarchical summary receiver operating characteristic (HSROC) model. Both models are very flexible in accounting for heterogeneity in the accuracies of tests across studies as well as for the dependence between tests. In this paper, we show that these two models, although differently formulated, are closely related and are equivalent in the absence of study-level covariates. Furthermore, we provide the exact relations between the parameters of these two models and the assumptions under which the two models can be reduced to equivalent submodels. On the other hand, we show that some submodels of the MGLMM have no corresponding equivalent submodels of the HSROC model, and vice versa. With three real examples, we illustrate cases when fitting the MGLMM and HSROC models leads to equivalent submodels and hence identical inference, and cases when the inferences from the two models differ slightly. Our results generalize the important relations between the bivariate generalized linear mixed model and the HSROC model when the reference test is a gold standard.
Keywords: Diagnostic test, Generalized linear mixed model, Hierarchical model, Imperfect reference test, Meta-analysis
1 Introduction
The rapid growth of evidence-based medicine has led to a dramatic increase in attention to statistical methods for meta-analysis1. One important area is meta-analysis of diagnostic accuracy studies, which combines measures of test performance (e.g., sensitivity and specificity) across multiple studies. In many applications, the test under evaluation (referred to as the index test) is compared to a perfect reference test (i.e., sensitivity = specificity = 1), also known as a gold standard. When such a gold standard is available, two categories of statistical methods are popular. The first category consists of methods based on a summary receiver operating characteristic (SROC) curve generated from the study data2,3. Among them, the hierarchical summary receiver operating characteristic (HSROC) model has been recommended3. The second category consists of methods that use bivariate mixed effects models to model sensitivity and specificity simultaneously4,5. Among mixed effects models, the bivariate generalized linear mixed effects model (BGLMM)5 has been recommended for its better coverage performance and its avoidance of continuity corrections6. Interestingly, Harbord et al.7 found that the BGLMM and HSROC models are closely related and are even equivalent in the absence of covariates; here, equivalence means that both models give the same likelihood up to reparametrization. In addition, Harbord et al. provided the relations between the parameters of both models under the assumption that the reference test is a gold standard.
In this paper, we are dealing with a different setting. In practice, the reference test may be imperfect because of measurement error, non-existence, invasive nature, or expensive cost of a gold standard8. Despite the imperfection of reference tests in many applications, the imperfect reference tests are simply treated as gold standard tests in many analyses. Such a procedure can lead to biased estimates of diagnostic test accuracies9. To account for such bias, a variety of methods, including both frequentist and Bayesian methods, have been proposed. Hui and Walter10 and Walter et al.11 have proposed methods to evaluate the diagnostic accuracy under the assumptions of homogeneous sensitivity and specificity. To fully account for the heterogeneity in test accuracies across studies, Chu et al.12 proposed a flexible random effects model by modeling study-specific disease prevalence, sensitivities and specificities of the index and reference tests in a multivariate generalized linear mixed model (MGLMM) framework. Recently, Dendukuri et al.13 proposed a hierarchical summary receiver operating characteristic (HSROC) model, extending the HSROC model of Rutter and Gatsonis3 with a gold standard reference test to the situation where no gold standard test is available. This model postulates a study-specific continuous latent variable and a study-specific cutoff and accuracy values for each diagnostic test. Both the MGLMM and HSROC frameworks have the advantages that they account for heterogeneity across studies and allow for dependence between the index test and reference test.
The MGLMM framework by Chu et al.12 and the HSROC framework by Dendukuri et al.13 are flexible and statistically rigorous, and are expected to become increasingly popular. In this paper, we show that these two models, although differently formulated, are closely related and that some of their submodels are equivalent. We provide the exact relations between the parameters of these two models and the assumptions under which the two models can be reduced to equivalent submodels. On the other hand, we show that some submodels of one framework have no corresponding equivalent submodels in the other framework. With three examples, we illustrate a case when fitting the MGLMM and HSROC models leads to equivalent submodels and hence identical inference, and two cases when the inferences from the two models are different. Our results generalize the important relations between the BGLMM and HSROC models established by Harbord et al.7 when the reference test is a gold standard.
The contributions of this work are three-fold. First, we extend the HSROC model along the lines of Dendukuri et al.13 by allowing study-specific cutoff and accuracy values for the reference test. Second, we establish the relations between the MGLMM and HSROC frameworks, as well as between their corresponding submodels. Third, with real examples, we illustrate the similarities and differences between the MGLMM and HSROC frameworks, and provide new insights on modeling based on the established relations. Throughout this paper, we consider the case when the same type of reference test is used in all studies; the case when completely different reference tests are used across studies is not considered. In addition, the comparison between the HSROC and MGLMM models is conducted within the classical (frequentist) framework: relations between the model parameters and their maximum likelihood estimators under the two models are derived. Comparison under the Bayesian framework, which would involve investigation of prior specifications and posterior distributions, is not considered in this paper.
This paper is organized as follows. Section 2 presents three real examples. We describe the MGLMM framework by Chu et al.12 in Section 3, and extend the HSROC model along the line of Dendukuri et al.13 in Section 4. We establish the mathematical relationship between these two frameworks in Section 5. We illustrate the similarities and differences between two frameworks by studying three motivating examples in Section 6. A brief discussion is provided in Section 7.
2 Examples
We use three examples of meta-analyses to illustrate the similarities and differences between the MGLMM and HSROC frameworks. In this section, we briefly describe the background of these three examples, which will be revisited in Section 6.
Example 1 (Papanicolaou test for diagnosis of cervical neoplasia): Fahey, Irwig, and Macaskill14 reported data from a meta-analysis of the Papanicolaou (Pap) test for diagnosing cervical neoplasia (defined as the growth of abnormal cells on the surface of the cervix). The data comprise 59 studies published between January 1984 and March 1992. The diagnostic accuracy of the Pap test (i.e., index test) is evaluated by comparison with the histology test (i.e., reference test), which is not a perfect test14.
Example 2 (Rheumatoid factor test for diagnosis of rheumatoid arthritis): Nishimura et al.15 collected data on the rheumatoid factor (RF) test (i.e., index test) for detection of rheumatoid arthritis (RA), reporting 50 studies published before September 2006 with a total of 15,286 patients. In this meta-analysis, the American College of Rheumatology (ACR) 1987 revised criteria were used as the reference standard for RA. Although the ACR 1987 criteria (i.e., reference test) are widely used as an approximate ‘gold standard’ for RA classification, they may be an imperfect reference test.
Example 3 (Computed tomography for diagnosis of coronary artery disease): Schuetz et al.16 reported data from a systematic review of the computed tomography (CT) test (i.e., index test) for diagnosis of coronary artery disease. A total of 89 studies published before September 2006 were identified through MEDLINE. In this meta-analysis, the conventional coronary angiography (CAG) test (i.e., reference test) was treated as the ‘gold standard’ for diagnosing the presence of coronary stenoses. However, the CAG test may not be perfect due to measurement errors in angiography.
3 The MGLMM framework
Chu et al.12 proposed an MGLMM for diagnostic tests without a gold standard. Following their notation, for study i (i = 1, 2,…, I), denote πi as the study-specific disease prevalence, and (Sei1, Spi1) and (Sei2, Spi2) as the respective pairs of study-specific sensitivities and specificities for the index test T1 and the reference test T2. Typically, the data for each study are summarized by a 2 × 2 table cross-tabulating the test results from T1 and T2. To account for the heterogeneity across studies and the correlations among the sensitivities and specificities of T1 and T2 and the disease prevalence, the MGLMM is formulated in two stages. The first stage specifies the cell probabilities in the ith 2 × 2 table as functions of the test sensitivities, the test specificities and the disease prevalence of the ith study. The second stage assumes a random effects model for the test sensitivities, specificities and disease prevalence.
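To make the first stage concrete, the following sketch (Python, illustrative only; the paper's actual fitting is done in SAS PROC NLMIXED) computes the four cell probabilities of the ith 2 × 2 table from (πi, Sei1, Spi1, Sei2, Spi2), assuming, as a simplification, conditional independence of the two tests given the disease status and the study-specific random effects; in the MGLMM the between-test dependence then enters through the correlated random effects of the second stage.

```python
def cell_probs(pi, se1, sp1, se2, sp2):
    """Cell probabilities of the 2x2 table cross-tabulating tests T1 and T2.

    Assumes conditional independence of T1 and T2 given the latent disease
    status (dependence between tests is induced at the second stage through
    correlated random effects).
    Returns probabilities for outcomes (T1, T2) = (+,+), (+,-), (-,+), (-,-).
    """
    p_pp = pi * se1 * se2 + (1 - pi) * (1 - sp1) * (1 - sp2)
    p_pm = pi * se1 * (1 - se2) + (1 - pi) * (1 - sp1) * sp2
    p_mp = pi * (1 - se1) * se2 + (1 - pi) * sp1 * (1 - sp2)
    p_mm = pi * (1 - se1) * (1 - se2) + (1 - pi) * sp1 * sp2
    return p_pp, p_pm, p_mp, p_mm

# Sanity checks: the four probabilities sum to one, and a perfect reference
# test (Se2 = Sp2 = 1) recovers the familiar gold-standard cell probabilities.
probs = cell_probs(0.3, 0.8, 0.9, 0.85, 0.95)
print(abs(sum(probs) - 1.0) < 1e-12)                                 # True
print(abs(cell_probs(0.3, 0.8, 0.9, 1.0, 1.0)[0] - 0.3 * 0.8) < 1e-12)  # True
```

With a gold standard reference, the (+,+) cell reduces to πi·Sei1, which is why treating an imperfect reference as perfect distorts the estimated accuracy of the index test.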
Denote H(·) as the cumulative distribution function (cdf) of a continuous distribution in the location-scale family, denoted as h(·; μ, σ), with location parameter μ = 0 and scale parameter σ = 1 (such as the standard logistic or standard normal distribution), and H−1(·) as the inverse of the cdf (e.g., the logit and probit functions). The study-specific test sensitivities, specificities and disease prevalence, after an H−1 transformation, are assumed to jointly follow a multivariate normal distribution with mean ℳ = (P, S1, C1, S2, C2)T and variance V, where

(H−1(πi), H−1(Sei1), H−1(Spi1), H−1(Sei2), H−1(Spi2))T ∼ N(ℳ, V),

V = [ σ²P    σPS1   σPC1   σPS2   σPC2
      σPS1   σ²S1   σS1C1  σS1S2  σS1C2
      σPC1   σS1C1  σ²C1   σC1S2  σC1C2
      σPS2   σS1S2  σC1S2  σ²S2   σS2C2
      σPC2   σS1C2  σC1C2  σS2C2  σ²C2 ].

Here the parameters P, Sj, and Cj are the overall disease prevalence, and the overall sensitivity and specificity for diagnostic test j in the transformed scale, j = 1, 2. In addition, the variance parameters σ²P, σ²S1, σ²C1, σ²S2 and σ²C2 describe the between-study heterogeneity in the disease prevalence and in the sensitivities and specificities of tests T1 and T2, and the parameters in the off-diagonal of V account for the dependence among the study-specific prevalence, sensitivities and specificities. To avoid confusion in notation, we use English letters to represent the parameters in the MGLMM. We note that there are twenty parameters in this model (five means, five variances, and ten covariances).
4 The HSROC framework
Now we provide an extended HSROC framework which is closely related to the model by Dendukuri et al.13. Extending along the lines of the pioneering work by Rutter and Gatsonis3, we assume that the test Tj, j = 1, 2, is based on a continuous latent variable, Zij, which comes from different distributions given different disease status Di for patients in the ith study. Since the gold standard is absent, the disease status Di is unknown and has to be treated as a dichotomous latent variable. We assume that the result of the diagnostic test Tj on the patients in the ith study is based on a comparison between the latent variable Zij and a study-specific “cutoff” value θij. The diagnostic test Tj is positive if Zij ≥ θij and is negative otherwise. The latent variable Zij follows the location-scale distribution h(z|μ, σ) = σ−1h((z − μ)/σ|0, 1) with location and scale parameters −αij/2 and exp(−βj/2) when Di = 0, and αij/2 and exp(βj/2) when Di = 1. Rutter and Gatsonis3 refer to the study-specific αij as an “accuracy value” and to the parameter βj as a “shape parameter”, because the former quantifies the distance between the two possible distributions of the latent variable Zij, and the latter describes the asymmetry of the ROC curve. To complete the specification of the model, the parameters αij and θij are assumed to be independent, as in Rutter and Gatsonis3, and to follow the normal distributions N(Λj, σ²Λj) and N(Θj, σ²Θj) respectively. Here for the test Tj (j = 1, 2), the parameters (Λj, σ²Λj) denote the respective overall mean and between-study variation of the accuracy values, and (Θj, σ²Θj) denote the mean and variation of the cutoff values. Furthermore, the disease status Di is positive if a variable Zi0 from the location-scale distribution h(z|0, 1) is greater than a cutoff value θi0, and is negative otherwise, where θi0 is distributed as N(Θ0, σ²Θ0).
We assume the study-specific values (θi0, θi1, αi1, θi2, αi2)T jointly follow a multivariate normal distribution with mean ℋ = (Θ0, Θ1, Λ1, Θ2, Λ2)T and variance Ω, where

Ω = [ σ²Θ0   σΘ0Θ1  σΘ0Λ1  σΘ0Θ2  σΘ0Λ2
      σΘ0Θ1  σ²Θ1   0      σΘ1Θ2  σΘ1Λ2
      σΘ0Λ1  0      σ²Λ1   σΛ1Θ2  σΛ1Λ2
      σΘ0Θ2  σΘ1Θ2  σΛ1Θ2  σ²Θ2   0
      σΘ0Λ2  σΘ1Λ2  σΛ1Λ2  0      σ²Λ2 ].

Here the parameters Θ0, Θj and Λj are the overall means of the prevalence cutoff value and of the cutoff and accuracy values for tests T1 and T2. The variance parameters σ²Θ0, σ²Θ1, σ²Λ1, σ²Θ2 and σ²Λ2 describe the between-study variation in the prevalence cutoff value and in the cutoff and accuracy values. In addition, cov(θi1, αi1) and cov(θi2, αi2) are set to zero for model identifiability, as specified in Dendukuri et al.13. Such a specification is necessary, and an empirical justification is provided in Web Appendix A. The covariance parameters σΘ0Θ1, σΘ0Λ1, σΘ0Θ2 and σΘ0Λ2 describe the dependence between the prevalence cutoff value θi0 and the test characteristic parameters (θi1, αi1, θi2, αi2), and the covariance parameters σΘ1Θ2, σΘ1Λ2, σΛ1Θ2 and σΛ1Λ2 describe the dependence between the two tests. Hereafter, the above model is referred to as the HSROC model, and Greek letters are used to represent its model parameters. We note that there are twenty parameters in the HSROC model (five means, five variances, eight covariances, and the two scale parameters β1 and β2). In particular, when Zij follows a normal distribution and the cutoff and accuracy values of the reference test are held common across studies, the HSROC model reduces to the model considered by Dendukuri et al.13, referred to as HSROC-D. For clarity of notation, we list the MGLMM, HSROC and HSROC-D models in Table 1.
Table 1.
Summary of the MGLMM and HSROC models, and their submodels.
| Model | Reference test | Latent variables and their distributions | Random effects and their distributions |
|---|---|---|---|
| HSROC-D (Dendukuri et al., 2012) | common cutoff and accuracy value | Zij normally distributed; β1: scale parameter; i = 1,…, I | (θi0, θi1, αi1)T multivariate normal, with cov(θi1, αi1) = 0 |
| Extended models: HSROC ≡ MGLMM | | | |
| HSROC | study-specific cutoff and accuracy values | Zi0 ∼ h(0, 1), where h(·; ·): location-scale family; β1, β2: scale parameters; i = 1,…, I | (θi0, θi1, αi1, θi2, αi2)T ∼ N(ℋ, Ω), with cov(θi1, αi1) = cov(θi2, αi2) = 0 |
| MGLMM | study-specific sensitivity and specificity | NA | (H−1(πi), H−1(Sei1), H−1(Spi1), H−1(Sei2), H−1(Spi2))T ∼ N(ℳ, V) |
| Reduced models: HSROCR ≡ MGLMMR | | | |
| HSROCR | common cutoff and accuracy values | Zi0 ∼ h(0, 1), where h(·; ·): location-scale family; β1: scale parameter; i = 1,…, I | (θi0, θi1, αi1)T multivariate normal, with σΘ0Θ1 = σΘ0Λ1 = cov(θi1, αi1) = 0 |
| MGLMMR | common sensitivity and specificity | NA | (H−1(πi), H−1(Sei1), H−1(Spi1))T multivariate normal, with σPS1 = σPC1 = 0 |
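The generative mechanism of the HSROC framework in Section 4 can be sketched as follows (Python, illustrative only; the parameter values are arbitrary and fixed at their means rather than drawn from the random effects distributions). Taking H as the standard logistic cdf, the sketch draws disease status from Zi0 > θi0 and test results from Zij ≥ θij, then checks the implied prevalence and sensitivity against their closed forms.

```python
import math
import random

def expit(x):
    # Standard logistic cdf, i.e. H(x) for the logistic choice of h.
    return 1.0 / (1.0 + math.exp(-x))

def rlogistic(loc=0.0, scale=1.0, rng=random):
    # Inverse-cdf sampling from the logistic location-scale family.
    u = rng.random()
    return loc + scale * math.log(u / (1.0 - u))

def simulate_study(n, theta0, theta, alpha, beta, rng):
    """Simulate n (test result, disease status) pairs under the HSROC mechanism.

    Disease: D = 1 if Z0 > theta0, with Z0 ~ logistic(0, 1).
    Test:    T = 1 if Z >= theta, with Z ~ logistic(-alpha/2, exp(-beta/2))
             when D = 0, and Z ~ logistic(alpha/2, exp(beta/2)) when D = 1.
    """
    data = []
    for _ in range(n):
        d = 1 if rlogistic(rng=rng) > theta0 else 0
        loc = alpha / 2 if d == 1 else -alpha / 2
        scale = math.exp(beta / 2) if d == 1 else math.exp(-beta / 2)
        t = 1 if rlogistic(loc, scale, rng) >= theta else 0
        data.append((t, d))
    return data

rng = random.Random(1)
theta0, theta, alpha, beta = -0.5, 0.2, 2.0, 0.1
data = simulate_study(200_000, theta0, theta, alpha, beta, rng)
prev = sum(d for _, d in data) / len(data)
print(abs(prev - expit(-theta0)) < 0.01)  # True: prevalence is H(-theta0)
```

The empirical sensitivity of the simulated test likewise approaches H{−(θ − α/2) exp(−β/2)}, the transformation exploited in Section 5.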
5 Relations between the two frameworks
We establish the exact mathematical relations between parameters in the MGLMM model and those in the HSROC model in subsection 5.1, elucidate the relations under two corresponding submodels in subsection 5.2, and discuss the extension of the relations under meta-regression models in subsection 5.3.
5.1 Reference test with heterogeneous sensitivity and specificity across studies
In this subsection, we consider the situations when the sensitivity and specificity of the reference test are heterogeneous across studies. We note that the formulation of the MGLMM model is based on the study-specific prevalence, sensitivities and specificities, i.e., (πi, Sei1, Spi1, Sei2, Spi2), whereas the HSROC model is based on the study-specific cutoff and accuracy values, i.e., (θi0, θi1, αi1, θi2, αi2). The parameters of both models are the distribution parameters (mean and variance) of these random effects. Therefore, to establish the relationship between the model parameters, it is sufficient to establish the mathematical relations between the study-specific effects (πi, Sei1, Spi1, Sei2, Spi2) and (θi0, θi1, αi1, θi2, αi2).
Recall that the latent variable Zi1 follows the location-scale distribution h(z|μ, σ) = σ−1h((z − μ)/σ|0, 1) with location and scale parameters αi1/2 and exp(β1/2) when Di = 1. By standardizing Zi1, we have

Pr(Zi1 ≥ θi1 | Di = 1) = H{−(θi1 − αi1/2) exp(−β1/2)},

where H(·) is the cdf of h(·|0, 1). Similarly, Pr(Zi1 < θi1 | Di = 0) = H{(θi1 + αi1/2) exp(β1/2)}. Under the HSROC model, the study-specific disease prevalence, sensitivity and specificity of T1 in the ith study, in the transformed scale, can be calculated as

H−1(πi) = −θi0,  H−1(Sei1) = −(θi1 − αi1/2) exp(−β1/2),  H−1(Spi1) = (θi1 + αi1/2) exp(β1/2).
Similarly, the study-specific sensitivity and specificity of T2 can be calculated as H−1(Sei2) = −(θi2 − αi2/2) exp(−β2/2) and H−1(Spi2) = (θi2 + αi2/2) exp(β2/2).
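These transformation formulas have the familiar ROC interpretation, which a short sketch makes explicit (Python, with H taken as the standard logistic cdf; parameter values are arbitrary): raising the cutoff θ trades sensitivity for specificity, while raising the accuracy value α improves both.

```python
import math

def expit(x):
    # Standard logistic cdf, i.e. H(x).
    return 1.0 / (1.0 + math.exp(-x))

def se_sp_from_hsroc(theta, alpha, beta):
    """Sensitivity and specificity of a test under the HSROC parametrization,
    with H the standard logistic cdf (so H^{-1} is the logit)."""
    se = expit(-(theta - alpha / 2) * math.exp(-beta / 2))
    sp = expit((theta + alpha / 2) * math.exp(beta / 2))
    return se, sp

se_lo, sp_lo = se_sp_from_hsroc(theta=0.0, alpha=2.0, beta=0.0)
se_hi, sp_hi = se_sp_from_hsroc(theta=1.0, alpha=2.0, beta=0.0)
print(se_hi < se_lo and sp_hi > sp_lo)  # True: cutoff trade-off
se_acc, sp_acc = se_sp_from_hsroc(theta=0.0, alpha=3.0, beta=0.0)
print(se_acc > se_lo and sp_acc > sp_lo)  # True: larger accuracy value helps both
```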
Denote b1 = exp(β1/2) and b2 = exp(β2/2). The relations between (πi, Sei1, Spi1, Sei2, Spi2) and (θi0, θi1, αi1, θi2, αi2) can be written in matrix form as

(H−1(πi), H−1(Sei1), H−1(Spi1), H−1(Sei2), H−1(Spi2))T = S (θi0, θi1, αi1, θi2, αi2)T, (1)

where

S = [ −1     0       0         0       0
       0    −1/b1    1/(2b1)   0       0
       0     b1      b1/2      0       0
       0     0       0        −1/b2    1/(2b2)
       0     0       0         b2      b2/2 ].
By taking the expectation and variance of both sides of equation (1), we obtain the following relationships between the parameters in the MGLMM framework and the HSROC framework:

ℳ = S ℋ, (2)

V = S Ω ST, (3)

where ℳ = (P, S1, C1, S2, C2)T is the mean vector of the disease prevalence, sensitivities and specificities of the two tests in the transformed scale (i.e., after the H−1 transformation), and ℋ = (Θ0, Θ1, Λ1, Θ2, Λ2)T is the mean vector of the prevalence cutoff value and the cutoff and accuracy values of the two tests. By equations (2) and (3), we can write the MGLMM model parameters as functions of the parameters in the HSROC model; conversely, multiplying both sides of equation (2) by S−1, and the matrices in equation (3) by S−1 on the left and S−T on the right, expresses the HSROC parameters in terms of the MGLMM parameters. The detailed results of these mathematical relations between the two sets of 20 model parameters are provided in Web Appendix B. Here we only highlight some interesting findings.
The first interesting result is that exp(2βj) = σ²Cj/σ²Sj for j = 1, 2. The shape parameter βj, which characterizes the asymmetry of the ROC curve for test Tj, is determined solely by the ratio of the variances of the sensitivity and specificity of the test in the transformed scale. Recall that equation (4.11) of Harbord et al.7, under the assumption of a gold standard reference test, reveals the same finding; our result generalizes it beyond the case where the reference test is a gold standard. Secondly, by equation (3), we have

σ²Sj = σ²Cj = 0  if and only if  σ²Θj = σ²Λj = 0. (4)
This confirms our intuition that homogeneity in the sensitivity and specificity of test Tj under the MGLMM framework is equivalent to homogeneity in the cutoff and accuracy values of that test. Thirdly, equation (3) implies

σPSj = σPCj = 0  if and only if  σΘ0Θj = σΘ0Λj = 0. (5)
This agrees with our intuition: if the disease prevalence is independent of the sensitivity and specificity of test Tj under the MGLMM framework, then the prevalence cutoff value is also independent of the cutoff and accuracy values of that test under the HSROC framework, and vice versa. Lastly, we have

σS1S2 = σC1S2 = σS1C2 = σC1C2 = 0  if and only if  σΘ1Θ2 = σΛ1Λ2 = σΘ1Λ2 = σΛ1Θ2 = 0. (6)
Relation (6) shows that the independence assumption between the two tests can be imposed equivalently through zero constraints on (σS1S2, σC1S2, σS1C2, σC1C2) in the MGLMM framework or on (σΘ1Θ2, σΛ1Λ2, σΘ1Λ2, σΛ1Θ2) in the HSROC framework.
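The linear map in equation (1) can be checked numerically. The sketch below (Python, illustrative; the values of b1 and b2 are arbitrary) builds S from the transformation formulas above, builds its block-wise inverse (each 2 × 2 block [[−1/b, 1/(2b)], [b, b/2]] has determinant −1, so its inverse is [[−b/2, 1/(2b)], [b, 1/b]]), and confirms S S−1 = I.

```python
def build_S(b1, b2):
    """Matrix S mapping (theta_i0, theta_i1, alpha_i1, theta_i2, alpha_i2) to
    (H^-1(pi_i), H^-1(Se_i1), H^-1(Sp_i1), H^-1(Se_i2), H^-1(Sp_i2))."""
    return [
        [-1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, -1.0 / b1, 1.0 / (2 * b1), 0.0, 0.0],
        [0.0, b1, b1 / 2, 0.0, 0.0],
        [0.0, 0.0, 0.0, -1.0 / b2, 1.0 / (2 * b2)],
        [0.0, 0.0, 0.0, b2, b2 / 2],
    ]

def build_S_inv(b1, b2):
    # Block-wise inverse of S; each test block has determinant -1.
    return [
        [-1.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, -b1 / 2, 1.0 / (2 * b1), 0.0, 0.0],
        [0.0, b1, 1.0 / b1, 0.0, 0.0],
        [0.0, 0.0, 0.0, -b2 / 2, 1.0 / (2 * b2)],
        [0.0, 0.0, 0.0, b2, 1.0 / b2],
    ]

def matmul(A, B):
    # Plain nested-list matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

S = build_S(1.2, 0.8)
prod = matmul(S, build_S_inv(1.2, 0.8))
print(all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
          for i in range(5) for j in range(5)))  # True
```

Because S is invertible whenever b1, b2 > 0, equations (2) and (3) can always be solved in either direction, which is the formal content of the equivalence.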
5.2 Reference test with homogeneous sensitivity and specificity across studies
In this subsection, we consider the submodel of the MGLMM in which the sensitivity and specificity of the reference test are homogeneous across studies, i.e., σ²S2 = σ²C2 = 0, and the disease prevalence is independent of the sensitivity and specificity of the test T1, i.e., σPS1 = σPC1 = 0. By equations (4) and (5), the corresponding submodel of the HSROC model can be obtained by letting σ²Θ2 = σ²Λ2 = 0 and σΘ0Θ1 = σΘ0Λ1 = 0. We denote these reduced submodels by MGLMMR and HSROCR. Both submodels have 7 parameters and are summarized in the lower panel of Table 1. With b1 = exp(β1/2), the functional relations between the parameters of these two submodels can be represented as

Θ0 = −P,  σ²Θ0 = σ²P,  exp(2β1) = σ²C1/σ²S1,
Θ1 = (C1/b1 − b1S1)/2,  Λ1 = C1/b1 + b1S1, (7)
σ²Θ1 = (b1²σ²S1 + σ²C1/b1² − 2σS1C1)/4,  σ²Λ1 = b1²σ²S1 + σ²C1/b1² + 2σS1C1.
The functional relations in equation (7) will be empirically validated through a meta-analysis for diagnosis of cervical neoplasia in Section 6.1. We note that the relations in equation (7) are identical to the main results in Harbord et al.7 after a reparametrization, with the reference test there being a gold standard, i.e., Se2 = Sp2 = 1. It is interesting that the exact relations hold even when the reference test is not a gold standard, as long as the sensitivity and specificity of the reference test are homogeneous across studies and the sensitivity and specificity of the index test are independent of the disease prevalence. For details of the connections between the results in equation (7) and the main results in Harbord et al.7, please refer to Web Appendix C.
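The relations in equation (7) are straightforward to implement and to check for internal consistency. The sketch below (Python, illustrative; the parameter values are arbitrary and v_* denote variances) maps MGLMMR parameters to HSROCR parameters and back, verifying that the map is exactly invertible.

```python
import math

def mglmm_to_hsroc(P, S1, C1, v_P, v_S1, v_C1, cov_S1C1):
    """MGLMM_R parameters (transformed scale) -> HSROC_R parameters, eq. (7)."""
    b1 = (v_C1 / v_S1) ** 0.25  # exp(beta1/2), since exp(2*beta1) = v_C1/v_S1
    beta1 = 2.0 * math.log(b1)
    Theta0 = -P
    Theta1 = (C1 / b1 - b1 * S1) / 2.0
    Lambda1 = C1 / b1 + b1 * S1
    v_Theta0 = v_P
    v_Theta1 = (b1**2 * v_S1 + v_C1 / b1**2 - 2.0 * cov_S1C1) / 4.0
    v_Lambda1 = b1**2 * v_S1 + v_C1 / b1**2 + 2.0 * cov_S1C1
    return Theta0, Theta1, Lambda1, beta1, v_Theta0, v_Theta1, v_Lambda1

def hsroc_to_mglmm(Theta0, Theta1, Lambda1, beta1, v_Theta0, v_Theta1, v_Lambda1):
    """Inverse map, from the transformation of the random effects."""
    b1 = math.exp(beta1 / 2.0)
    P = -Theta0
    S1 = (Lambda1 / 2.0 - Theta1) / b1
    C1 = b1 * (Theta1 + Lambda1 / 2.0)
    v_P = v_Theta0
    v_S1 = (v_Theta1 + v_Lambda1 / 4.0) / b1**2
    v_C1 = b1**2 * (v_Theta1 + v_Lambda1 / 4.0)
    cov_S1C1 = v_Lambda1 / 4.0 - v_Theta1
    return P, S1, C1, v_P, v_S1, v_C1, cov_S1C1

# Round trip: MGLMM_R -> HSROC_R -> MGLMM_R recovers the original values.
m = (0.4, 0.8, 1.1, 1.5, 0.9, 1.3, -0.2)
back = hsroc_to_mglmm(*mglmm_to_hsroc(*m))
print(all(abs(a - b) < 1e-10 for a, b in zip(m, back)))  # True
```

Note that the negative covariance σS1C1 in the MGLMM is absorbed by the cutoff variance in the HSROC parametrization (cov_S1C1 = v_Lambda1/4 − v_Theta1), which is how the HSROC constraint cov(θi1, αi1) = 0 remains compatible with correlated sensitivities and specificities.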
5.3 Extensions to models with covariates
In some meta-analyses, study-level covariates such as study quality, race of the study population and type of recruitment (e.g., family-based versus otherwise) are available. Including study-level covariates can reduce unexplained heterogeneity and correlations. In general, different, but possibly overlapping, sets of covariates may be allowed to affect the study-specific disease prevalence and the sensitivities and specificities of the diagnostic tests differently in the MGLMM model, and to affect the study-specific cutoff and accuracy values differently in the HSROC model. By an argument similar to that in Section 5.1 and in Section 4.2 of Harbord et al.7, we can show that the MGLMM model with common covariates affecting prevalence, sensitivities and specificities is equivalent to an HSROC model with the same covariates affecting both accuracy and cutoff values. The relations between the variance parameters in both models are the same as described in equation (3), and the relations between the coefficient parameters in both models are provided in Web Appendix D. However, the MGLMM model with different covariates affecting prevalence, sensitivities and specificities is not equivalent to any HSROC model with covariates.
6 Similarities and differences: examples revisited
In this section, we revisit the three examples described in Section 2. We illustrate a case when fitting the MGLMM and HSROC models leads to equivalent submodels and hence identical inference, and two cases when the inferences from the two models are slightly different. We also use Example 1 to verify the derived relations between the MGLMMR and HSROCR submodels as described in equation (7). For each of the three examples, we fit submodels in both the MGLMM and HSROC frameworks, and conduct model selection based on two commonly used information criteria, Akaike's information criterion (AIC) and the Bayesian information criterion (BIC)17. For all three examples, the transformation H−1(·) is taken as the logit function, acknowledging that other link functions may also be considered. The models are implemented by fitting non-linear mixed effects models using PROC NLMIXED in SAS version 9.3 (SAS Institute Inc., Cary, NC), via adaptive Gaussian quadrature approximation to the likelihood integrated over the random effects. The selection process is summarized in Table 2, including the −2 log-likelihood statistic, AIC and BIC for the MGLMM framework (upper panel) and the HSROC framework (lower panel).
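The information criteria in Table 2 follow the usual definitions. The sketch below (Python) reproduces the first row of the MGLMM panel for Example 1, assuming that the fixed effects submodel I has k = 5 parameters (P, S1, C1, S2, C2) and that the BIC penalty uses the number of studies (n = 59) as the sample size; both are assumptions on our part, though they are consistent with the reported values.

```python
import math

def aic(minus2logl, k):
    """Akaike's information criterion from the -2 log-likelihood and k parameters."""
    return minus2logl + 2 * k

def bic(minus2logl, k, n):
    """Bayesian information criterion; n is the sample size used in the penalty."""
    return minus2logl + k * math.log(n)

# Submodel I for Example 1 (Table 2): -2logL = 45277, with k = 5 fixed effects
# and n = 59 studies (assumed), gives AIC = 45287 and BIC of about 45297.
print(aic(45277, 5))             # 45287
print(round(bic(45277, 5, 59)))  # 45297
```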
Table 2.
Selection of random effects using a forward selection procedure. The upper panel shows the results from fitting submodels in the MGLMM framework; and the lower panel shows the results from fitting submodels in the HSROC framework.
| Models | Random effects | Ex. 1 (Pap test): −2logL | AIC | BIC | Ex. 2 (Nishimura-RF): −2logL | AIC | BIC | Ex. 3 (Schuetz): −2logL | AIC | BIC |
|---|---|---|---|---|---|---|---|---|---|---|
| I | NA | 45277 | 45287 | 45297 | 36062 | 36072 | 36082 | 17359 | 17369 | 17382 |
| IIa | πi | 41882 | 41894 | 41906 | 33623 | 33635 | 33646 | 16509 | 16521 | 16537 |
| IIb | Sei1 | 43329 | 43341 | 43353 | 35480 | 35492 | 35504 | 17235 | 17247 | 17263 |
| IIc | Sei2 | 43398 | 43410 | 43423 | 34700 | 34712 | 34723 | 17064 | 17076 | 17092 |
| IId | Spi1 | 43520 | 43532 | 43544 | 35527 | 35539 | 35551 | 17035 | 17047 | 17063 |
| IIe | Spi2 | 42838 | 42850 | 42863 | 34582 | 34594 | 34605 | 17166 | 17178 | 17194 |
| IIIa | πi, Sei1 | 40510 | 40524 | 40539 | 33133 | 33147 | 33161 | 16391 | 16405 | 16424 |
| IIIb | πi, Sei2 | 40888 | 40902 | 40917 | 33109 | 33123 | 33137 | 16243 | 16257 | 16276 |
| IIIc | πi, Spi1 | 40894 | 40908 | 40922 | 33103 | 33117 | 33130 | 16247 | 16261 | 16280 |
| IIId | πi, Spi2 | 40520 | 40534 | 40548 | 33139 | 33153 | 33167 | 16408 | 16422 | 16441 |
| IVa | πi, Sei1, Sei2 | 39777 | 39793 | 39810 | 32629 | 32645 | 32661 | 16122 | 16138 | 16159 |
| IVb | πi, Sei1, Spi1 | 39762 | 39778 | 39795 | 32624 | 32640 | 32655 | 16124 | 16140 | 16162 |
| IVc | πi, Sei1, Spi2 | 40506 | 40522 | 40539 | 33138 | 33154 | 33169 | 16390 | 16406 | 16428 |
| IVd | πi, Sei1, Sei2, ρSei1Sei2 | 39777 | 39795 | 39814 | 32629 | 32647 | 32664 | 16113 | 16131 | 16155 |
| IVe | πi, Sei1, Spi1, ρSei1Spi1 | 39752 | 39770 | 39789 | 32621 | 32639 | 32656 | 16115 | 16133 | 16157 |
| IVf | πi, Sei1, Spi2, ρSei1Spi2 | 40503 | 40521 | 40540 | 33133 | 33151 | 33169 | 16390 | 16408 | 16432 |
| HSROC framework | | | | | | | | | | |
| 1 (≡ I) | NA | 45277 | 45291 | 45306 | 36062 | 36076 | 36090 | 17359 | 17373 | 17392 |
| 2a (≡ IIa) | θi0 | 42085 | 42101 | 42118 | 33623 | 33639 | 33654 | 16510 | 16526 | 16547 |
| 2b | θi1 | 43106 | 43122 | 43139 | 35355 | 35371 | 35386 | 17082 | 17098 | 17120 |
| 2c | θi2 | 42704 | 42720 | 42736 | 34338 | 34354 | 34370 | 17063 | 17079 | 17100 |
| 2d | αi1 | 43329 | 43345 | 43361 | 35480 | 35496 | 35511 | 17009 | 17025 | 17046 |
| 2e | αi2 | 43398 | 43414 | 43431 | 34700 | 34716 | 34731 | 17063 | 17079 | 17101 |
| 3a | θi0, θi1 | 39921 | 39939 | 39958 | 33008 | 33026 | 33043 | 16243 | 16261 | 16285 |
| 3b | θi0, θi2 | 39934 | 39952 | 39971 | 32876 | 32894 | 32911 | 16259 | 16277 | 16301 |
| 3c | θi0, αi1 | 40667 | 40685 | 40704 | 33455 | 33473 | 33491 | 16207 | 16225 | 16249 |
| 3d | θi0, αi2 | 40880 | 40898 | 40916 | 33113 | 33131 | 33148 | 16246 | 16264 | 16288 |
| 3e | θi1, αi1 | 42904 | 42922 | 42940 | 30940 | 30958 | 30972 | 16905 | 16923 | 16947 |
| 4a | θi0, θi1, θi2 | 39842 | 39862 | 39883 | 32728 | 32748 | 32767 | 16145 | 16165 | 16192 |
| 4b (≡ IVe) | θi0, θi1, αi1 | 39752 | 39772 | 39793 | 32621 | 32641 | 32660 | 16121 | 16141 | 16168 |
| 4c | θi0, θi1, αi2 | 39776 | 39796 | 39816 | 32629 | 32649 | 32668 | 16138 | 16158 | 16185 |
| 4d | θi0, θi1, θi2, ρθi1θi2 | 39773 | 39795 | 39818 | 32681 | 32703 | 32724 | 16145 | 16167 | 16197 |
| 4e | θi0, θi1, αi2, ρθi1αi2 | 39772 | 39794 | 39816 | 32637 | 32659 | 32680 | 16123 | 16145 | 16174 |
| 4f | θi0, αi1, θi2, ραi1θi2 | 39755 | 39777 | 39799 | 32627 | 32649 | 32670 | 16115 | 16137 | 16166 |
6.1 Example 1
To analyze the data on the Pap test and histology test, we begin with a fixed effects model (submodel I), and then add random effects that improve the goodness of fit under all criteria. The results for Example 1 in Table 2 suggest that the largest improvement from a single random effect is achieved by allowing for study-specific prevalence (referred to as submodel IIa). Interestingly, submodel IIa is identical to the model considered by Walter et al.11, which allows random disease prevalence but not random sensitivities and specificities of the two tests; in other words, the model considered by Walter et al.11 is the best one-random-effect model. Finally, both AIC and BIC suggest the use of submodel IVe (i.e., the MGLMMR discussed in Section 5.2), which includes random effects on the prevalence and on the sensitivity and specificity of the Pap test, and fixed effects on the sensitivity and specificity of the histology test (reference test). Following a similar model selection procedure in the HSROC framework, submodel 4b (i.e., the HSROCR discussed in Section 5.2) is selected, allowing for study-specific prevalence cutoff values and study-specific cutoff and accuracy values of the Pap test. Therefore, in this example, the best fitting submodels are in fact equivalent according to our results in Section 5.2. To verify that both submodels provide the same inference and to validate the derived relations in equation (7), we calculate the estimates and standard errors of the parameters in submodel IVe from the results of submodel 4b using the relations in equation (7), and vice versa. Table 3 presents the parameter estimates obtained from both submodels, and the results of applying equation (7) to transform estimates from one submodel to the other. As shown in Table 3, the results of the different parametrizations of the two models are identical. In this case, either of the two submodels can be used without any discrepancy.
Table 3.
Results of fitting the MGLMMR and HSROCR models to the Papanicolaou (Pap) test data.
| Parameter | Estimate (SE) from MGLMMR model | Results of applying equation (7) to HSROCR estimates below |
|---|---|---|
| P | 0.56 (0.21) | 0.56 (0.21) |
| S1 | 0.64 (0.18) | 0.64 (0.18) |
| C1 | 1.62 (0.23) | 1.62 (0.23) |
| σ²P | 2.15 (0.48) | 2.15 (0.48) |
| σ²S1 | 1.67 (0.35) | 1.67 (0.35) |
| σ²C1 | 1.61 (0.42) | 1.61 (0.42) |
| σS1C1 | -0.88 (0.31) | -0.88 (0.31) |

| Parameter | Estimate (SE) from HSROCR model | Results of applying equation (7) to MGLMMR estimates above |
|---|---|---|
| Θ0 | -0.56 (0.21) | -0.56 (0.21) |
| Θ1 | 0.50 (0.18) | 0.50 (0.18) |
| Λ1 | 2.27 (0.25) | 2.27 (0.25) |
| β1 | -0.02 (0.15) | -0.02 (0.15) |
| σ²Θ0 | 2.15 (0.48) | 2.15 (0.48) |
| σ²Θ1 | 1.26 (0.28) | 1.26 (0.28) |
| σ²Λ1 | 1.51 (0.45) | 1.51 (0.45) |
The summary estimates of the overall prevalence, sensitivity and specificity based on the MGLMMR model are computed by taking inverse logit transforms of the P, S1 and C1 estimates. As a result, the estimate of the overall disease prevalence is 0.64 (95% CI: 0.54 to 0.73), and the overall sensitivity and specificity of the Pap test are estimated as 0.65 (95% CI: 0.57 to 0.74) and 0.83 (95% CI: 0.77 to 0.90). The sensitivity and specificity of the histology test are estimated as 0.90 (95% CI: 0.88 to 0.93) and 0.99 (95% CI: 0.96 to 1.00) respectively. Furthermore, the covariance between the sensitivity and specificity of the Pap test, σS1C1, is estimated to be negative, as would be expected from the trade-off between sensitivity and specificity when the cutoff value varies across studies. In contrast, Walter et al.11 fitted a model (i.e., submodel IIa) with fixed sensitivities and specificities for both the Pap and histology tests but with study-specific disease prevalence. The estimated sensitivity and specificity for the Pap test are 0.75 (95% CI: 0.74 to 0.76) and 0.79 (95% CI: 0.78 to 0.81), and those for the histology test are 0.86 (95% CI: 0.84 to 0.88) and 0.90 (95% CI: 0.88 to 0.92), respectively. Both AIC and BIC suggest that the MGLMMR model (or equivalently, the HSROCR model) provides a much better fit to the data than submodel IIa of Walter et al.11 (likelihood ratio statistic between these two nested submodels = 2,130, p < 0.001), indicating non-negligible heterogeneity in the study-specific sensitivity and specificity of the Pap test across studies. Specifically, the AIC and BIC for the MGLMMR model (or equivalently, the HSROCR model) are 39,770 and 39,789 respectively, whereas the AIC and BIC for submodel IIa of Walter et al.11 are 41,894 and 41,906, which are substantially larger.
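The agreement shown in Table 3 can be reproduced directly from equation (7). The sketch below (Python; point estimates only, since the standard errors would additionally require the delta method) transforms the reported HSROCR estimates into the MGLMMR parametrization and recovers the reported MGLMMR values up to rounding.

```python
import math

# HSROC_R point estimates from Table 3: means Theta0, Theta1, Lambda1, the
# shape parameter beta1, and the variances of theta_i0, theta_i1, alpha_i1.
Theta0, Theta1, Lambda1, beta1 = -0.56, 0.50, 2.27, -0.02
v_Theta0, v_Theta1, v_Lambda1 = 2.15, 1.26, 1.51

b1 = math.exp(beta1 / 2.0)
P = -Theta0
S1 = (Lambda1 / 2.0 - Theta1) / b1
C1 = b1 * (Theta1 + Lambda1 / 2.0)
v_P = v_Theta0
v_S1 = (v_Theta1 + v_Lambda1 / 4.0) / b1**2
v_C1 = b1**2 * (v_Theta1 + v_Lambda1 / 4.0)
cov_S1C1 = v_Lambda1 / 4.0 - v_Theta1

# Matches the MGLMM_R column of Table 3 up to rounding:
# P = 0.56, S1 = 0.64, C1 = 1.62, variances 2.15, 1.67, 1.61, covariance -0.88.
for value in (P, S1, C1, v_P, v_S1, v_C1, cov_S1C1):
    print(round(value, 2))
```

That the transformed estimates agree with the directly fitted ones is exactly the empirical validation of equation (7) referred to in Section 5.2.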
6.2 Examples 2 and 3
Table 2 also presents the results of the model selection procedure applied to Examples 2 and 3. For Example 2, both AIC and BIC suggest that submodel 4b provides the best fit in the HSROC framework, suggesting that both the cutoff and accuracy values of the rheumatoid factor (RF) test should be treated as random effects across studies. In contrast, the corresponding submodel IVe, with random effects for the prevalence and for the sensitivity and specificity of the RF test and their correlation in the MGLMM framework, does not provide the best fit; submodel IVb, without the correlation, provides a slightly better fit. In fact, a likelihood ratio test comparing submodels IVb and IVe suggests that incorporating the correlation between the sensitivity and specificity of the RF test does not improve the model fit (p-value = 0.08). In this case, submodel IVb is the best model under the MGLMM framework, and it has no corresponding submodel under the HSROC framework. The parameter estimates and standard errors of the best fitting submodels IVb and 4b are displayed in Table 4. To enable a direct comparison, we calculate the estimates of the disease prevalence and the sensitivities and specificities of the RF test from the results of submodel 4b using equation (1). The results from submodels IVb and 4b happen to be similar despite their not being equivalent; this is because submodel 4b is equivalent to submodel IVe, which differs from IVb only by a correlation parameter. It is worth mentioning that both submodels yield estimates of the sensitivity and specificity of the ACR 1987 criteria (reference test) equal to 1, suggesting that this reference test is in fact a gold standard.
Table 4.
Summary of the parameter estimates (standard errors) of the final fitted models to Examples 2 and 3 in the MGLMM and HSROC frameworks.
| Parameter | Example 2 (Nishimura-RF): MGLMM, IVb | Example 2: HSROC, 4b | Example 3 (Schuetz): MGLMM, IVd | Example 3: HSROC, 4f |
|---|---|---|---|---|
| S1 | 0.68 (0.03) | 0.68 (0.03) | 0.96 (0.01) | 0.96 (0.01) |
| C1 | 0.87 (0.02) | 0.88 (0.02) | 0.97 (0.01) | 0.97 (0.02) |
| S2 | 1.00 (-) | 1.00 (-) | 0.91 (0.01) | 0.92 (0.02) |
| C2 | 1.00 (-) | 1.00 (-) | 1.00 (-) | 1.00 (-) |
| P | 0.44 (0.03) | 0.44 (0.03) | 0.64 (0.02) | 0.64 (0.02) |
| σP | 0.72 (0.08) | NA | 0.86 (0.07) | NA |
| σS1 | 0.83 (0.10) | NA | 1.08 (0.13) | NA |
| σC1 | 1.04 (0.12) | NA | NA | NA |
| σS2 | NA | NA | 1.02 (0.13) | NA |
| ρS1S2 | NA | NA | 0.46 (0.13) | NA |
| σΘ0 | NA | 0.73 (0.08) | NA | 0.86 (0.07) |
| σΘ1 | NA | 0.74 (0.08) | NA | NA |
| σΛ1 | NA | 1.14 (0.13) | NA | 1.52 (0.51) |
| σΘ2 | NA | NA | NA | 1.97 (3.01) |
| ρΛ1Θ2 | NA | NA | NA | -0.31 (0.18) |
In example 3, the best fitted submodels are the submodel IVd with random effects for prevalence, sensitivities for both CT and CAG tests, and their correlation under the MGLMM framework, and the submodel 4f with random effects for prevalence, accuracy of CT test and cutoff of CAG test, and their correlation under the HSROC framework. The parameter estimates and standard errors of submodels IVd and 4f are summarized in Table 4, which show slight differences between these two submodels.
7 Discussion
Multivariate meta-analysis has been gaining popularity recently, especially in meta-analysis of diagnostic accuracy studies1. In diagnostic accuracy studies, the reference test may be imperfect because it is subject to measurement error or because a gold standard is not available in practice. In this paper, we established the equivalence between two recently proposed models that account for an imperfect reference test. Exact relations between the parameters of the two models are established and empirically validated with an example of meta-analysis of the Papanicolaou test for diagnosis of cervical neoplasia. As this example shows, although the two models have seemingly very different parametrizations, both lead to equivalent submodels and hence identical inferences. On the other hand, with two other examples of meta-analysis, we illustrated that there are some differences between the two frameworks. In practice, the complexity of the models that should be considered depends on whether the reference test is a gold standard, the number of studies in the meta-analysis, and the degree of heterogeneity across studies. The choice between the MGLMM and HSROC models can be based on the nature of the available data. As suggested by subsection 10.5.4 of the Cochrane handbook18, when the available studies used a common cut-off value on a continuous or ordinal scale for defining test positivity (e.g., in a commercial test), the MGLMM can provide an appropriate framework for test comparisons; if the included studies used different cut-off values for defining positive results, the HSROC model is the recommended approach. In this paper, we did not consider the situation where the two tests may be conditionally dependent given the latent disease status and study-specific random effects. Further study is needed of the relations between the two models under such conditional dependence.
Both the MGLMM and HSROC models, as implemented in SAS PROC NLMIXED, involve maximizing an approximation to the likelihood integrated over the multi-dimensional random effects. The NLMIXED procedure approximates the integrated likelihood by adaptive Gaussian quadrature and maximizes the approximation with a dual quasi-Newton optimization algorithm. When study-specific random effects are allowed for both the index and reference tests, the likelihood function involves five-dimensional integrals; the maximum likelihood inference may then suffer from non-convergence, and the approximation to the likelihood may have non-negligible errors, which can result in unstable or unreproducible estimates6. Bayesian methods and Markov chain Monte Carlo techniques with proper priors may circumvent these numerical issues. As pointed out by the associate editor and an anonymous referee, it would be of interest in future research to investigate the relations between the two prior specifications, and between the two matching posteriors, under the MGLMM and HSROC frameworks. Finally, we want to emphasize that the model frameworks considered in this paper are two of many possible formulations; studying other model frameworks for diagnostic accuracy studies would be worthwhile.
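To illustrate the quadrature idea behind PROC NLMIXED's likelihood approximation, the sketch below applies non-adaptive Gauss-Hermite quadrature to a single one-dimensional random-intercept integral (NLMIXED uses an adaptive, multi-dimensional version of this; `beta` and `sigma` here are hypothetical parameters, not estimates from the paper):

```python
import math
import numpy as np

def integrated_loglik(y, n, beta, sigma, n_nodes=15):
    """Gauss-Hermite approximation to one study's likelihood contribution
    in a random-intercept logistic model: the binomial likelihood is
    integrated over u ~ N(0, sigma^2).

    After the change of variables u = sqrt(2) * sigma * x, the integral
    against the N(0, sigma^2) density becomes a weighted sum over the
    Gauss-Hermite nodes, divided by sqrt(pi).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    total = 0.0
    for x, w in zip(nodes, weights):
        u = math.sqrt(2.0) * sigma * x            # change of variables
        p = 1.0 / (1.0 + math.exp(-(beta + u)))   # study-specific probability
        lik = math.comb(n, y) * p**y * (1.0 - p)**(n - y)
        total += w * lik
    return math.log(total / math.sqrt(math.pi))
```

With `sigma = 0` the random effect vanishes and the approximation reduces exactly to the ordinary binomial log-likelihood, which provides a quick sanity check of the implementation.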
Supplementary Material
Acknowledgments
We are grateful to the editor Jeanine Houwing-Duistermaat, the Associate Editor and two anonymous referees for their constructive comments which have greatly improved the presentation of this paper. Yong Chen was supported by grant number R03HS022900 from the AHRQ. The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality. Haitao Chu was supported in part by the U.S. AHRQ R03HS020666, the U.S. NIAID AI103012 and the NCI P01CA142538. We want to thank Stacia DeSantis and Jose-Miguel Yamal for their helpful comments.
Footnotes
Supplementary Materials: Web Appendices and the SAS code referenced in Sections 5 and 6 are available with this paper at the Biometrics website on Wiley Online Library.
References
1. Jackson D, Riley R, White IR. Multivariate meta-analysis: potential and promise. Statistics in Medicine. 2011;30(20):2481–2498. doi:10.1002/sim.4172.
2. Moses LE, Shapiro D, Littenberg B. Combining independent studies of a diagnostic test into a summary ROC curve: data-analytic approaches and some additional considerations. Statistics in Medicine. 1993;12(14):1293–1316. doi:10.1002/sim.4780121403.
3. Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Statistics in Medicine. 2001;20(19):2865–2884. doi:10.1002/sim.942.
4. Van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in meta-analysis: multivariate approach and meta-regression. Statistics in Medicine. 2002;21(4):589–624. doi:10.1002/sim.1040.
5. Chu H, Cole SR. Bivariate meta-analysis of sensitivity and specificity with sparse data: a generalized linear mixed model approach. Journal of Clinical Epidemiology. 2006;59(12):1331. doi:10.1016/j.jclinepi.2006.06.011.
6. Hamza TH, Reitsma JB, Stijnen T. Meta-analysis of diagnostic studies: a comparison of random intercept, normal-normal, and binomial-normal bivariate summary ROC approaches. Medical Decision Making. 2008;28(5):639–649. doi:10.1177/0272989X08323917.
7. Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA. A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics. 2007;8(2):239–251. doi:10.1093/biostatistics/kxl004.
8. Joseph L, Gyorkos TW, Coupal L. Bayesian estimation of disease prevalence and the parameters of diagnostic tests in the absence of a gold standard. American Journal of Epidemiology. 1995;141(3):263–272. doi:10.1093/oxfordjournals.aje.a117428.
9. Rutjes AW, Reitsma JB, Di Nisio M, Smidt N, van Rijn JC, Bossuyt PM. Evidence of bias and variation in diagnostic accuracy studies. Canadian Medical Association Journal. 2006;174(4):469–476. doi:10.1503/cmaj.050090.
10. Hui SL, Walter SD. Estimating the error rates of diagnostic tests. Biometrics. 1980;36(1):167–171.
11. Walter S, Irwig L, Glasziou P, et al. Meta-analysis of diagnostic tests with imperfect reference standards. Journal of Clinical Epidemiology. 1999;52(10):943. doi:10.1016/s0895-4356(99)00086-4.
12. Chu H, Chen S, Louis TA. Random effects models in a meta-analysis of the accuracy of two diagnostic tests without a gold standard. Journal of the American Statistical Association. 2009;104(486):512–523. doi:10.1198/jasa.2009.0017.
13. Dendukuri N, Schiller I, Joseph L, Pai M. Bayesian meta-analysis of the accuracy of a test for tuberculous pleuritis in the absence of a gold standard reference. Biometrics. 2012;68(4):1285–1293. doi:10.1111/j.1541-0420.2012.01773.x.
14. Fahey MT, Irwig L, Macaskill P. Meta-analysis of Pap test accuracy. American Journal of Epidemiology. 1995;141(7):680–689. doi:10.1093/oxfordjournals.aje.a117485.
15. Nishimura K, Sugiyama D, Kogata Y, Tsuji G, Nakazawa T, Kawano S, et al. Meta-analysis: diagnostic accuracy of anti-cyclic citrullinated peptide antibody and rheumatoid factor for rheumatoid arthritis. Annals of Internal Medicine. 2007;146(11):797–808. doi:10.7326/0003-4819-146-11-200706050-00008.
16. Schuetz GM, Zacharopoulou NM, Schlattmann P, Dewey M. Meta-analysis: noninvasive coronary angiography using computed tomography versus magnetic resonance imaging. Annals of Internal Medicine. 2010;152(3):167–177. doi:10.7326/0003-4819-152-3-201002020-00008.
17. Burnham KP, Anderson DR. Model Selection and Multi-Model Inference: A Practical Information-Theoretic Approach. Springer; 2002.
18. Macaskill P, Gatsonis C, Deeks J, Harbord R, Takwoingi Y. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy: Chapter 10, Analysing and Presenting Results. Version 1.0. The Cochrane Collaboration; 2010.