Bayesian Network Meta-Regression Hierarchical Models Using Heavy-Tailed Multivariate Random Effects with Covariate-Dependent Variances

Hao Li; Daeyoung Lim; Ming-Hui Chen; Joseph G Ibrahim; Sungduk Kim; Arvind K Shah; Jianxin Lin

doi:10.1002/sim.8983

Author manuscript; available in PMC: 2022 Jul 10.

Published in final edited form as: Stat Med. 2021 Apr 12;40(15):3582–3603. doi: 10.1002/sim.8983


PMCID: PMC8274575 NIHMSID: NIHMS1713492 PMID: 33846992

Abstract

Network meta-analysis (NMA) is gaining popularity in evidence synthesis and network meta-regression (NMR) allows us to incorporate potentially important covariates into network meta-analysis. In this paper, we propose a Bayesian network meta-regression hierarchical model and assume a general multivariate t distribution for the random treatment effects. The multivariate t distribution is desired for heavy-tailed random effects and converges to the multivariate normal distribution when the degrees of freedom go to infinity. Moreover, in NMA, some treatments are compared only in a single study. To overcome such sparsity, we propose a log-linear regression model for the variances of the random effects and incorporate aggregate covariates into modeling the variance components. We develop a Markov chain Monte Carlo sampling algorithm to sample from the posterior distribution via the collapsed Gibbs technique. We further use the Deviance Information Criterion (DIC) and the logarithm of the Pseudo-marginal likelihood (LPML) for model comparison. A simulation study is conducted and a detailed analysis from our motivating case study is carried out to further demonstrate the proposed methodology.

Keywords: Arm-based model, Collapsed Gibbs sampling, Multivariate t distribution, Surface under the cumulative ranking curve (SUCRA), Triglycerides (TG)

1 |. INTRODUCTION

Triglycerides (TG) are a type of fat found in the blood that the body uses for energy. Some TG are needed for good health, but high levels might raise the risk of heart disease. A TG level below 150 mg/dL is considered normal, from 150 mg/dL to 199 mg/dL borderline-high, from 200 mg/dL to 499 mg/dL high, and over 500 mg/dL very high. Diet and lifestyle changes are recommended as first-line therapy for hypertriglyceridemia, but drug therapy is often required. The drugs currently available for the treatment of hypertriglyceridemia include niacin, fibrates, and omega-3 fatty acids. In a recent study, adding ezetimibe to ongoing lipid-lowering therapies reduced TG levels by 33%, while ezetimibe monotherapy reduced them by 8%.¹

According to the National Vital Statistics Reports,² cardiovascular disease (CVD) continues to be the leading cause of death for both men and women. This is the case in the U.S. and worldwide. Although low density lipoprotein cholesterol (LDL-C) is a primary cause of CVD, other risk factors, for example, triglycerides, contribute as well. The 2011 statement cited epidemiological evidence that a moderate elevation in the TG level is often associated with increased atherosclerotic cardiovascular disease (ASCVD) risk.³ Several studies have demonstrated that elevated TG associated with genetic variants may be a causal factor for ASCVD and possibly for premature all-cause mortality.^4,5,6 A long-standing association exists between elevated TG levels and CVD.^7,8 In a meta-analysis of 17 studies, increased TG levels were associated with increased coronary disease risk in both men and women, after adjustment for high density lipoprotein cholesterol (HDL-C) and other risk factors.⁹ The randomized, controlled clinical trial REDUCE-IT¹⁰ has shown that intervention to lower TG levels is associated with reduced CVD events.

In the process of synthesizing evidence from multiple studies, study-level covariates might be taken into account since the settings of studies may differ.^11,12,13,14 Network meta-regression, by incorporating categorical or continuous covariates, can reduce heterogeneity between studies and summarize treatment effects after adjusting for study-level differences. The importance of adjusting for covariate effects in the presence of treatment-covariate interactions and adding individual baseline characteristics as covariates to the random effects logistic model are illustrated in Batson et al.¹⁴ Categorical study-level covariates can also split the data into subgroups and be used to investigate subgroup effects.¹² Mainstream modeling methods for NMR focus on synthesis of relative treatment effects data from studies (contrast-based models). In cases where the effect of each treatment arm is available, an alternative method of modeling the absolute treatment effects (arm-based models) has been investigated.^15,16,17 In the arm-based framework, NMR allows us to include treatment-by-study-level covariates. For both modeling methods, an exchangeability assumption is usually made to allow for variation in the true treatment effects across trials, which induces a generalized linear mixed effects model. A detailed discussion of generalized linear mixed effects models is provided for implementing arm-based network meta-regression within the frequentist framework.¹³ In addition to Markov chain Monte Carlo (MCMC) sampling, which is widely used in NMA or NMR, integrated nested Laplace approximations are proposed to carry out Bayesian inference of the NMR model and compare the results to those obtained by MCMC.¹⁸ In this paper, we consider 10 baseline characteristic covariates in our arm-based NMR model in order to reduce the bias in estimating the true treatment effects. The covariate effects are assumed fixed while the treatment effects are assumed random. 
In all of the aforementioned work, regardless of the types of outcomes and modeling methods, the common distribution for the trial-specific random treatment effects is usually assumed to be a normal distribution. Here, we assume a general multivariate t distribution for the random treatment effects. The random effects converge to a commonly used multivariate normal distribution when the degrees of freedom in the multivariate t distribution go to infinity. The use of the multivariate t distribution creates more challenges in sampling from posterior distributions and carrying out Bayesian inference. We develop a generalized collapsed Gibbs sampling algorithm to implement the posterior computations.

While the methodology for network meta-analysis has become more and more sophisticated, the complexity of the NMA structure keeps increasing. The number of treatments compared in one study is typically smaller than the complete collection of treatments in the NMA data. This complicates the adoption of non-informative prior distributions and inference for the parameters in the covariance matrix of the random effects. In a contrast-based model, the covariance matrix of the random effects is often assumed to have certain structures under the homogeneous variances (the variances of the random effects are equal) and consistency (no apparent discrepancy between direct and indirect comparisons) assumptions.^18,19,20,21 Lu and Ades have proposed spherical parameterizations to generate a positive-definite matrix to facilitate the unstructured covariance matrix.²² Li et al.¹¹ propose an alternative approach based on partial correlations to generate the correlation matrix, which results in the same posterior inference under non-informative priors for the correlations. In addition, it is also common that some treatments are included in only a single trial, so that the variances of the random effects of these treatments cannot be estimated. To resolve this issue, a grouping approach for the variances of the random effects based on the mechanism of the corresponding treatments has been investigated.¹¹ In this paper, we extend the grouping approach to a modeling approach and assume a log-linear model for the variances of the random effects so that the variances can depend on the treatment-by-study-level aggregate covariates. The modeling approach is a natural extension of the grouping approach and also offers more flexibility.

We further use the DIC and LPML to compare the performance of different degrees of freedom of the multivariate t distribution and different sets of covariates for modeling the variances of the random effects. The rest of this article is organized as follows. In Section 2, we describe the motivating network meta-data, which have continuous outcomes in each study arm for 29 studies consisting of 11 different treatments. In Section 3, we develop the network meta-regression hierarchical model. This includes specifying a multivariate t distribution for the random effects and a log-linear model for heterogeneous variances. The complete-data likelihood and the observed-data likelihood are also given in this section. The priors and posteriors, the MCMC sampling algorithm, and the Bayesian model comparison criteria are presented in Sections 4.1, 4.2, and 4.3, respectively. In Section 5, we carry out a simulation study to examine the empirical performance of the proposed methodology. In Section 6, we give detailed analyses and Bayesian inference for our motivating case study. Finally, we conclude the article with some discussion of future research in Section 7.

2 |. MOTIVATING CASE STUDY: THE TNM DATA

A systematic online search yielded 78 double-blind, randomized, active or placebo-controlled clinical trials. From these trials, 37 second-line studies (i.e., studies with patients on statin at study entry) were excluded. From the remaining 41 first-line trials (i.e., studies with patients who were drug-naive or rendered drug-naive by wash-out at study entry), 12 trials with missing information (5 trials having the response variable missing, 2 trials having the study covariates missing and 5 trials having both the response variable and covariates missing) were further excluded. The flow diagram of data collection can be found in Figure 1(a) of Li et al. (2019).¹¹ The remaining 29 studies compose the motivating dataset, the Triglycerides Network Meta (TNM) data. These studies reported the TG lowering effects of placebo (PBO), five statins (simvastatin (S), atorvastatin (A), lovastatin (L), rosuvastatin (R), pravastatin (P)), ezetimibe (E), or statins plus ezetimibe (S and E (SE), A and E (AE), L and E (LE), and P and E (PE)). Figure 1 exhibits the study network among the cholesterol lowering treatments for patients with primary hypercholesterolemia. The outcome variable of our data is the mean percent change from baseline in TG. Our continuous outcome and the sample standard deviation for each trial arm are presented in Table 1. Figure 2 is the boxplot of the outcome variable grouped by treatments. We can see from Table 1 and Figure 2 that some treatments have a wide range of effects in lowering TG. For example, the mean TG lowering effect ranges from −10.20 to −27.00 for treatment A, and from −8.50 to −29.20 for treatment R. In particular, two data points fall outside the boxplot of treatment R and they are labelled with trial IDs. Treatments A and R are compared in the most trials and have the widest ranges in effects. The potential heavy tails in the outcome variable prompt us to assume heavy-tailed random effects for the treatment effects.

FIGURE 1.

TG network meta-data diagram. Each node represents one treatment in the network. Node sizes are proportional to the total number of subjects treated in the arms across the network. Each edge represents at least one direct comparison that exists for two connected treatments. The involved trial IDs are listed on the edges. Thinner lines represent fewer direct comparisons. Red, green or blue colors represent 1, 2–4, or 5+ direct comparisons available for the connected treatments.

TABLE 1.

TG network meta-data: mean (SD) percent change from baseline in TG

Study ID	PBO	S	A	L	R	P	E	SE	AE	LE	PE

1	−2.20 (33.00)	−15.20 (34.10)					−13.20 (27.80)	−28.00 (28.00)
2	2.40 (23.26)	−16.60 (22.70)					−8.30 (23.24)	−24.10 (23.17)
3	−1.90 (31.42)	−20.80 (29.69)					−10.70 (31.63)	−24.30 (27.03)
4		−19.00 (29.92)						−22.82 (25.49)
5		−14.97 (27.36)						−22.38 (27.96)
6		−22.90 (24.18)						−28.60 (25.29)
7			−22.50 (28.23)					−25.45 (28.08)
8			−25.50 (29.51)					−27.40 (29.45)
9			−25.75 (25.17)					−27.72 (26.78)
10			−26.41 (29.25)					−26.38 (30.78)
11			−24.70 (27.80)					−25.15 (22.09)
12	−6.40 (42.21)		−24.50 (24.41)				−5.10 (25.46)		−32.80 (24.43)
13	4.00 (24.00)			−11.00 (29.66)			−3.00 (25.46)			−22.00 (27.71)
14	2.00 (30.64)					−7.60 (30.07)	−2.10 (30.40)				−17.60 (29.99)
15					−24.60 (25.06)			−26.20 (24.09)
16	−2.80 (27.38)		−20.90 (27.39)		−19.10 (28.17)
17	−1.00 (26.42)		−19.00 (27.05)		−18.00 (27.17)
18		−14.00 (23.04)	−18.00 (22.98)		−20.00 (23.15)
19			−18.35 (25.38)		−18.47 (26.10)
20			−19.10 (26.32)		−17.90 (26.64)
21			−27.00 (27.04)		−22.20 (29.55)
22			−23.90 (27.36)		−29.20 (27.30)
23			−17.80 (24.79)		−18.60 (24.82)
24			−16.00 (28.30)		−16.98 (28.90)
25			−22.10 (25.98)		−22.90 (29.20)
26			−26.11 (29.11)		−21.88 (28.37)
27			−11.80 (34.48)		−13.50 (36.35)
28			−10.20 (40.36)		−8.50 (41.70)
29			−14.30 (31.31)		−12.00 (32.17)


FIGURE 2.

Boxplot of the mean percent change in TG for each treatment arm. The width of each box is proportional to the number of trials that compared the corresponding treatment.

In our data, ezetimibe is available at only one dose of 10 mg while the statins are available at multiple doses. We follow the recommendation from the Cochrane Handbook²³ to form the TG lowering effects of each statin with different doses. We consider 10 common covariates from the 29 studies, which are aggregated on the treatment-by-study level. An examination of the covariates shows that there is heterogeneity among the baseline characteristics of the included studies. For example, the trial durations were not uniform and ranged between 6 and 12 months. Further, the proportions of white race and male gender also varied across trials. In particular, the proportions of white race were 0% in trials 13 and 16 and almost 100% in trials 18 and 21. Furthermore, the proportions of high statin potency were 100% in trials 8, 15, and 21. These intuitive inspections encourage us to incorporate covariates into our meta-regression model. A detailed summary of the aggregate covariates and the sample size of each treatment arm in each study can be found in Tables A1a and A1b of Appendix A of the Supplementary Materials.

3 |. NETWORK META-REGRESSION HIERARCHICAL MODEL

3.1 |. Model

Assume that $\mathcal{T} = \{1, 2, \dots, T\}$ is the complete collection of treatments across a total of K trials in the data. Within a multiple treatment comparison framework, although it is rarely the case that the complete collection of treatments is compared in all K trials, we momentarily assume this is true for ease of exposition. Thus, for the tth treatment in the kth trial, let y_kt and $S_{k t}^{2}$ denote the aggregate responses, which generally represent the sample mean and the sample variance, respectively. Following Zhang et al.,¹⁶ Yao et al.,²⁴ and Hong et al.,²⁵ we propose a general hierarchical random effects model for the network meta-regression as

y_{k t} = x_{k t}^{'} β + τ_{k t} γ_{k t} + ϵ_{k t}, ϵ_{k t} \sim N (0, \frac{σ_{k t}^{2}}{n_{k t}}),

(1)

and

\frac{(n_{k t} - 1) S_{k t}^{2}}{σ_{k t}^{2}} \sim χ_{n_{k t} - 1}^{2} .

(2)

The aggregate variables, y_kt and $S_{k t}^{2}$ are independent.²⁶ The aggregate covariate vector, x_kt, is p-dimensional and β is the regression coefficient vector corresponding to x_kt.

The random effect of the tth treatment in the kth trial, γ_kt, is assumed to be independent of ϵ_kt. It captures the dependence of the y_kt’s within the trial as well as the heterogeneity across trials. We let $γ_{k} = {(γ_{k 1}, \dots, γ_{k T})}^{'}$ be the T-dimensional vector of scaled random effects, which would be observed in the kth trial, and let ρ denote the T × T correlation matrix. We include the overall treatment effects in the coefficients β in (1) so that the first p − T components are the regression coefficients corresponding to the p − T aggregate covariates and the remaining T components are the overall treatment effects. We assume a general multivariate t distribution for the scaled random effects γ_k, that is,

γ_{k} \sim t_{T} (0, ρ, v),

(3)

where v is the degrees of freedom. When the degrees of freedom v → ∞, the scaled random effects γ_k converge to a multivariate normal distribution, which is a common assumption for random effects in NMA. The multivariate t distribution is preferable for heavy tailed random effects. As shown in Figure 2, the mean TG lowering effects for both treatments A and R exhibit substantial variation from trial to trial with wide ranges of −10.20 to −27.00 for treatment A and −8.50 to −29.20 for treatment R. Thus, for the TNM data, the heavy tailed random effects in (3) may yield a better fit than light tailed random effects.

The variance of the random effect for the tth treatment in the kth trial is captured by $τ_{k t}^{2}$ . We further assume that

\log (τ_{k t}) = z_{k t}^{'} ϕ,

(4)

where $z_{k t}$ is the q-dimensional vector of aggregate covariates and ϕ is the corresponding coefficient vector in the log-linear regression model. This means that the variances of the random effects in the log-linear regression model are treatment-by-study-specific and are determined by the aggregate covariates from each trial arm. The variance of the random effect, $τ_{k t}^{2}$, is generally more difficult to estimate than the regression coefficients β in (1) for the aggregate response. Also, certain covariates may be highly associated with the outcome variable while not affecting the variance of the random effect, since the random effect quantifies the difference between the trial-specific treatment effect and the corresponding overall treatment effect. For these reasons, we allow $z_{k t}$ to differ from x_kt and assume q ≤ p. If $z_{k t} \equiv 1$ in (4), the random effects for the T treatments across all trials share the same variance. By letting ϕ = (ϕ₁, …, ϕ_G) and $z_{k t}^{'} = (0, \dots, 1, \dots, 0)$ , with the gth element equal to 1 and all other elements equal to 0 if the tth treatment is from the gth group, we are able to divide the T treatments into G groups and assume that the random effects for treatments within the same group share the same variance. This is analogous to the grouping approach¹¹ for resolving the issue that some variances of the random effects cannot be estimated due to insufficient data in a heterogeneous variances model. Our log-linear regression model can easily accommodate the grouping approach, and therefore is an extension of that approach.
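As an illustration of the log-linear model (4) and of how the grouping approach arises as a special case, the following minimal Python sketch builds indicator covariate vectors and evaluates the implied standard deviations. The group labels, the number of treatments, and the coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_effect_sd(z, phi):
    """Return tau = exp(z' phi), i.e. log(tau) = z' phi as in (4)."""
    return np.exp(np.asarray(z, dtype=float) @ np.asarray(phi, dtype=float))

# Hypothetical grouping: 4 treatments assigned to G = 2 groups (not from the paper).
group_of_treatment = [0, 1, 1, 0]
G = 2
Z = np.eye(G)[group_of_treatment]            # row t is the indicator vector z' for treatment t
phi = np.array([np.log(0.5), np.log(2.0)])   # illustrative coefficients

tau = random_effect_sd(Z, phi)               # one standard deviation per treatment
common = random_effect_sd([1.0], [0.3])      # intercept-only case: a single shared value
```

Treatments in the same group receive the same value of τ, and the intercept-only design reproduces the homogeneous-variance case.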

3.2 |. Likelihood Function

In practice, we can only observe T_k treatments in the kth trial. Let $\mathcal{T}_{k} = \{t_{k 1}, \dots, t_{k T_{k}}; t_{k ℓ} \in \{1, \dots, T\}, ℓ = 1, \dots, T_{k}\}$ denote the set of treatments compared in the kth trial. Define a collection of unit vectors $E_{k} = (e_{t_{k 1}}, e_{t_{k 2}}, \dots, e_{t_{k T_{k}}})$ , where $e_{t_{k ℓ}} = {(0, \dots, 1, \dots, 0)}^{'}, ℓ = 1, \dots, T_{k}$ , with the t_kℓth element equal to 1 and all other elements equal to 0. Thus, E_k is a T × T_k matrix. Let $E_{k}^{C}$ be a T × (T − T_k) matrix, which consists of the columns of the T × T identity matrix I_T that are not included in E_k. We let $γ_{k, o} = E_{k}^{'} γ_{k}$ be the T_k-dimensional vector of scaled random effects of the treatments that are actually observed in the kth trial, while $γ_{k, m} = {(E_{k}^{C})}^{'} γ_{k}$ is the (T − T_k)-dimensional vector of scaled random effects of the treatments that are missing in the kth trial. It is easy to show that $γ_{k, o} \sim t_{T_{k}} (0, E_{k}^{'} ρ E_{k}, v)$ , and $γ_{k, m} \sim t_{T - T_{k}} (0, {(E_{k}^{C})}^{'} ρ E_{k}^{C}, v)$ , for k = 1, 2, …, K.

Now, we replace the notation with subscript kt in the last subsection with the new notation with subscript kt_kℓ to indicate the t_kℓth treatment in the kth trial. Let $y_{k} = {(y_{k t_{k 1}}, \dots, y_{k t_{k T_{k}}})}^{'}$ denote the vector of the response variables and also let $ϵ_{k} = {(ϵ_{k t_{k 1}}, \dots, ϵ_{k t_{k T_{k}}})}^{'}$ denote the vector of the random errors for the kth trial. We have $ϵ_{k} \sim N_{T_{k}} (0, Σ_{k})$ , where $Σ_{k} = diag (\frac{σ_{k t_{k 1}}^{2}}{n_{k t_{k 1}}}, \frac{σ_{k t_{k 2}}^{2}}{n_{k t_{k 2}}}, \dots, \frac{σ_{k t_{k T_{k}}}^{2}}{n_{k t_{k T_{k}}}})$ is a T_k × T_k diagonal matrix. Let $X_{k} = {(x_{k t_{k 1}}, x_{k t_{k 2}}, \dots, x_{k t_{k T_{k}}})}^{'}$ denote the T_k × p matrix of covariates for the kth trial and let $Z_{k} (ϕ) = diag (\exp \{z_{k t_{k 1}}^{'} ϕ\}, \dots, \exp \{z_{k t_{k T_{k}}}^{'} ϕ\})$ be a T_k × T_k diagonal matrix. Then, the vector form of (1) for the kth trial is

y_{k} = X_{k} β + Z_{k} (ϕ) γ_{k, o} + ϵ_{k} .

(5)
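To make the selection-matrix notation concrete, here is a small Python sketch that builds E_k for one trial, extracts the correlation matrix of the observed scaled random effects, and forms Z_k(ϕ). The correlation matrix, the set of observed treatments, the covariate rows, and ϕ are all hypothetical values used only for illustration.

```python
import numpy as np

def selection_matrix(T, observed):
    """Return the T x T_k matrix E_k whose columns are the unit vectors
    for the observed treatments (treatment labels are 1-based)."""
    E = np.zeros((T, len(observed)))
    for col, t in enumerate(observed):
        E[t - 1, col] = 1.0
    return E

T = 4
observed = [1, 3]                            # hypothetical trial comparing treatments 1 and 3
rho = np.array([[1.0, 0.3, 0.2, 0.1],
                [0.3, 1.0, 0.4, 0.2],
                [0.2, 0.4, 1.0, 0.5],
                [0.1, 0.2, 0.5, 1.0]])       # illustrative correlation matrix

E_k = selection_matrix(T, observed)
rho_k = E_k.T @ rho @ E_k                    # T_k x T_k correlation matrix of gamma_{k,o}

z_rows = np.array([[1.0, 0.0],
                   [1.0, 1.0]])              # hypothetical z' rows for the two observed arms
phi = np.array([0.1, -0.3])                  # illustrative coefficients
Z_k = np.diag(np.exp(z_rows @ phi))          # diagonal matrix Z_k(phi) used in (5)
```

Multiplying by E_k simply picks out the rows and columns of ρ for the treatments present in the trial, which is why γ_k,o keeps a multivariate t distribution with the corresponding sub-matrix.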

Let $D_{o y} = \{(y_{k t_{k ℓ}}, n_{k t_{k ℓ}}, x_{k t_{k ℓ}}), ℓ = 1, 2, \dots, T_{k}, k = 1, 2, \dots, K\}$ and $D_{o s} = \{(S_{k t_{k ℓ}}^{2}, n_{k t_{k ℓ}}), ℓ = 1, 2, \dots, T_{k}, k = 1, 2, \dots, K\}$ , and let D_o = D_oy ∪ D_os denote the observed data. Let θ = (β, ϕ, ρ, Σ), where $Σ = \{σ_{k t_{k ℓ}}^{2}; ℓ = 1, \dots, T_{k}, k = 1, \dots, K\}$ , and thus θ is the collection of all model parameters. Also, write γ_o = {γ_k,o, k = 1, …, K}, which is the collection of random effects. Using (2) and (5) and the independence between $y_{k t_{k ℓ}}$ and $S_{k t_{k ℓ}}^{2}$ , the likelihood function can be written as

L (θ, γ_{o} ∣ D_{o}) = \prod_{k = 1}^{K} ({(2 π)}^{- \frac{T_{k}}{2}} {| Σ_{k} |}^{- \frac{1}{2}} \exp {- \frac{{(y_{k} - X_{k} β - Z_{k} (ϕ) γ_{k, o})}^{'} Σ_{k}^{- 1} (y_{k} - X_{k} β - Z_{k} (ϕ) γ_{k, o})}{2}} \times \prod_{l = 1}^{T_{k}} [\frac{{((n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2} - 1}}{{(2 σ_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2}} Γ (\frac{n_{k t_{k ℓ}} - 1}{2})} \exp {- \frac{(n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2}}{2 σ_{k t_{k ℓ}}^{2}}}] \times f (γ_{k, o} ∣ ρ, v)),

(6)

where f(γ_k,o|ρ, v) is the probability density function of $t_{T_{k}} (0, E_{k}^{'} ρ E_{k}, v)$ . The scaled random effects γ_k,o cannot be directly integrated out from (6) because of the multivariate t distribution. We use the fact that a multivariate t random vector can be represented in terms of a multivariate normal vector whose covariance matrix is scaled by the reciprocal of a gamma random variable. In (5), the scaled random effects are conditionally multivariate normal, that is, $γ_{k, o} ∣ λ_{k} \sim N_{T_{k}} (0, \frac{1}{λ_{k}} (E_{k}^{'} ρ E_{k}))$ , where $λ_{k} \sim Gamma (\frac{v}{2}, \frac{2}{v})$ for v > 0. Since, given λ_k, γ_k,o and ϵ_k are independent and normally distributed, we have

y_{k} ∣ λ_{k} \sim N (X_{k} β, \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}) .

(7)
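The scale-mixture representation and the conditional covariance in (7) can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the correlation matrix, degrees of freedom, and sample size are hypothetical, and the Gamma distribution is parameterized by shape v/2 and scale 2/v, as in the text.

```python
import numpy as np

def marginal_cov_given_lambda(lam, Z_k, rho_k, Sigma_k):
    """Covariance of y_k given lambda_k in (7): (1 / lambda) Z_k rho_k Z_k + Sigma_k."""
    return (Z_k @ rho_k @ Z_k) / lam + Sigma_k

def draw_scaled_random_effects(rng, rho_k, v, size):
    """Draw gamma_{k,o} through lambda ~ Gamma(shape=v/2, scale=2/v) and
    gamma | lambda ~ N(0, rho_k / lambda)."""
    lam = rng.gamma(shape=v / 2.0, scale=2.0 / v, size=size)
    normal = rng.multivariate_normal(np.zeros(rho_k.shape[0]), rho_k, size=size)
    return normal / np.sqrt(lam)[:, None]

rng = np.random.default_rng(0)
rho_k = np.array([[1.0, 0.2], [0.2, 1.0]])   # illustrative correlation matrix
v = 10.0                                     # illustrative degrees of freedom
gamma = draw_scaled_random_effects(rng, rho_k, v, size=200_000)
sample_cov = np.cov(gamma, rowvar=False)     # close to v / (v - 2) * rho_k when v > 2

V = marginal_cov_given_lambda(2.0, np.diag([1.0, 2.0]), rho_k, np.diag([0.1, 0.2]))
```

For v > 2 the covariance of a multivariate t vector with scale matrix ρ_k is v/(v − 2) ρ_k, which gives a simple check on the simulated draws.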

The following proposition gives the expression of the likelihood function after integrating out the random effects γ_o.

Proposition 1. Let $h (λ_{k}) = \frac{v^{\frac{v}{2}}}{Γ (\frac{v}{2}) 2^{\frac{v}{2}}} λ_{k}^{\frac{v}{2} - 1} \exp {- \frac{v λ_{k}}{2}}$ , which is the density function of a $Gamma (\frac{v}{2}, \frac{2}{v})$ random variable. After integrating out γ_o, the likelihood function in (6) reduces to

L (θ | D_{o}) = L (θ | D_{o y}) L (Σ | D_{o s}),

(8)

where

L (θ ∣ D_{o y}) = \prod_{k = 1}^{K} [\int_{0}^{\infty} {(2 π)}^{- \frac{T_{k}}{2}} h (λ_{k}) {| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)} d λ_{k}],

and

L (Σ ∣ D_{o s}) = \prod_{k = 1}^{K} \prod_{ℓ = 1}^{T_{k}} \frac{{((n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2} - 1}}{{(2 σ_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2}} Γ (\frac{n_{k t_{k ℓ}} - 1}{2})} \exp {- \frac{(n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2}}{2 σ_{k t_{k ℓ}}^{2}}} .
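The one-dimensional integral over λ_k in Proposition 1 has no closed form in general, but it is easy to approximate numerically. The sketch below evaluates one trial's factor of L(θ | D_oy) with a trapezoidal rule on η = log λ. The grid range and size are assumptions chosen for illustration, and the argument `A` stands for the matrix Z_k(ϕ) E_k' ρ E_k Z_k(ϕ); this is a numerical check, not the computation used in the sampler.

```python
import math
import numpy as np

def log_h(lam, v):
    """Log density of Gamma(shape=v/2, scale=2/v), the mixing density h."""
    return ((v / 2) * math.log(v / 2) - math.lgamma(v / 2)
            + (v / 2 - 1) * math.log(lam) - v * lam / 2)

def trial_likelihood(resid, A, Sigma_k, v, n_grid=4001):
    """Approximate the integral over lambda of h(lambda) * N(resid; 0, A / lambda + Sigma_k),
    where resid = y_k - X_k beta."""
    eta = np.linspace(-15.0, 4.0, n_grid)    # eta = log(lambda); the range is an assumption
    vals = np.empty(n_grid)
    T_k = len(resid)
    for i, e in enumerate(eta):
        lam = math.exp(e)
        V = A / lam + Sigma_k
        _, logdet = np.linalg.slogdet(V)
        quad = resid @ np.linalg.solve(V, resid)
        log_norm = -0.5 * (T_k * math.log(2 * math.pi) + logdet + quad)
        # Adding e accounts for the Jacobian of lambda = exp(eta).
        vals[i] = math.exp(log_h(lam, v) + log_norm + e)
    return float(np.sum((vals[1:] + vals[:-1]) * np.diff(eta)) / 2)   # trapezoidal rule

# Check 1: with A = 0 there are no random effects, so the factor is a normal density.
resid = np.array([0.3, -0.2])
Sigma_k = np.diag([0.5, 0.8])
no_re = trial_likelihood(resid, np.zeros((2, 2)), Sigma_k, v=5.0)

# Check 2: one arm, no sampling error, unit scale gives the univariate t density.
t_val = trial_likelihood(np.array([0.7]), np.array([[1.0]]), np.array([[0.0]]), v=5.0)
```

The two limiting cases give closed-form targets (a normal density and a Student t density), which makes them convenient sanity checks for any implementation of the observed-data likelihood.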

4 |. BAYESIAN INFERENCE

4.1 |. Priors and Posteriors

We assume independent non-informative priors for β, ϕ, Σ and ρ. Specifically, we assume that β ~ N_p(0, c₀₁I_p), ϕ ~ N_q(0, c₀₂I_q), and $π (σ_{k t_{k ℓ}}^{2}) \propto \frac{1}{σ_{k t_{k ℓ}}^{2}} I (0 < σ_{k t_{k ℓ}}^{2} < \infty), ℓ = 1, \dots, T_{k}, k = 1, \dots, K$ . We set the hyperparameter c₀₁ = 100,000. For our TG network meta-data, we use c₀₂ = 4. When q = 1, c₀₂ = 4 yields a 95% prior credible interval from 0.00039 to 2540.20 for the expected variance of the random effects, which is fairly non-informative while still facilitating MCMC convergence. We also assume that $π (ρ_{12}, ρ_{23}, ρ_{13}, ρ_{34}, ρ_{24}, \dots, ρ_{1 T}) \propto 1$ subject to the constraint of positive definiteness. The degrees of freedom v can also be treated as random. We consider the following hierarchical prior: $(v ∣ v_{a}, v_{b}) \sim Gamma (v_{a}, v_{b} / v_{a}), v_{a} \sim Gamma (a_{4}, b_{4}^{- 1})$ , and v_b ~ IG(a₅, b₅) with density proportional to $v_{b}^{- (a_{5} + 1)} \exp (- b_{5} / v_{b})$ . This prior specification places the conditional mean of v at v_b, i.e., E(v | v_a, v_b) = v_b. The hyperparameters are set as a₄ = 10, b₄ = 10, a₅ = 11, b₅ = 10.
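The stated 95% prior interval can be reproduced with a short calculation, assuming that the quantity of interest is τ² = exp(2ϕ) under an intercept-only model (q = 1, z ≡ 1) and that the normal quantile is rounded to 1.96. Both are assumptions about how the figures were obtained.

```python
import math

c02 = 4.0                          # prior variance of phi when q = 1
z = 1.96                           # rounded 97.5% standard normal quantile (assumed)
half_width = z * math.sqrt(c02)    # 95% prior interval for phi is (-3.92, 3.92)

# With z_kt = 1, log(tau) = phi, so the variance is tau^2 = exp(2 * phi).
lower = math.exp(-2.0 * half_width)
upper = math.exp(2.0 * half_width)
```

Under these assumptions the endpoints agree with the values 0.00039 and 2540.20 quoted in the text.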

Write λ = (λ₁, …, λ_K)'. From our previous discussion and (6), the augmented posterior distribution is given by

π (β, ϕ, ρ, Σ, γ_{o}, λ, v ∣ D_{o}) \propto \prod_{k = 1}^{K} {(2 π)}^{- \frac{T_{k}}{2}} {| Σ_{k} |}^{- \frac{1}{2}} \exp {- \frac{{(y_{k} - X_{k} β - Z_{k} (ϕ) γ_{k, o})}^{'} Σ_{k}^{- 1} (y_{k} - X_{k} β - Z_{k} (ϕ) γ_{k, o})}{2}} \times \prod_{k = 1}^{K} \prod_{l = 1}^{T_{k}} \frac{{((n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2} - 1}}{{(2 σ_{k t_{k ℓ}}^{2})}^{\frac{n_{k t_{k ℓ}} - 1}{2}} Γ (\frac{n_{k t_{k ℓ}} - 1}{2})} \exp {- \frac{(n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2}}{2 σ_{k t_{k ℓ}}^{2}}} \times \prod_{k = 1}^{K} {(\frac{λ_{k}}{2 π})}^{\frac{T_{k}}{2}} {| E_{k}^{'} ρ E_{k} |}^{- \frac{1}{2}} \exp {- \frac{λ_{k}}{2} γ_{k, o}^{'} {(E_{k}^{'} ρ E_{k})}^{- 1} γ_{k, o}} h (λ_{k}) \times {(2 π c_{01})}^{- \frac{p}{2}} \exp {- \frac{β^{'} β}{2 c_{01}}} \times {(2 π c_{02})}^{- \frac{q}{2}} \exp {- \frac{ϕ^{'} ϕ}{2 c_{02}}} \times \prod_{k = 1}^{K} \prod_{l = 1}^{T_{k}} \frac{1}{σ_{k t_{k ℓ}}^{2}} \times \frac{{(v_{a} / v_{b})}^{v_{a}}}{Γ (v_{a})} v^{v_{a} - 1} \exp (- \frac{v_{a} v}{v_{b}}) \frac{b_{4}^{a_{4}}}{Γ (a_{4})} v_{a}^{a_{4} - 1} \exp (- b_{4} v_{a}) \frac{b_{5}^{a_{5}}}{Γ (a_{5})} v_{b}^{- (a_{5} + 1)} \exp (- \frac{b_{5}}{v_{b}}) .

(9)

4.2 |. Computational Development

The analytical evaluation of the posterior distribution of θ = (β, ϕ, ρ, Σ) given in (9) is not available. However, we can develop a Metropolis-within-Gibbs sampling algorithm to sample from (9). The algorithm requires sampling the following parameters in turn from the following conditional distributions: (i) [Σ|β, λ, ϕ, ρ, γ_o, D_o] and (ii) [β, λ, ϕ, ρ, γ_o |Σ, D_o].

For (i), the full conditional distribution for $σ_{k t_{k ℓ}}^{2}, ℓ = 1, \dots, T_{k}, k = 1, \dots, K$ , is

I G (\frac{n_{k t_{k ℓ}}}{2}, \frac{n_{k t_{k ℓ}} {(y_{k t_{k ℓ}} - x_{k t_{k ℓ}}^{'} β - \exp {z_{k t_{k ℓ}}^{'} ϕ} γ_{k t_{k ℓ}})}^{2}}{2} + \frac{(n_{k t_{k ℓ}} - 1) S_{k t_{k ℓ}}^{2}}{2}) .
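A draw from this inverse gamma full conditional can be obtained by inverting a gamma variate. The sketch below assumes the IG(a, b) parameterization with density proportional to x^{-(a+1)} exp(−b/x), matching the one stated for v_b in Section 4.1; the function name, argument names, and input values are hypothetical.

```python
import numpy as np

def draw_sigma2(rng, n, y, mean_fixed, tau_gamma, s2, size=None):
    """Draw sigma^2 from IG(n / 2, n (y - x'beta - exp(z'phi) gamma)^2 / 2 + (n - 1) S^2 / 2).

    mean_fixed stands for x'beta and tau_gamma for exp(z'phi) * gamma.
    """
    shape = n / 2.0
    rate = n * (y - mean_fixed - tau_gamma) ** 2 / 2.0 + (n - 1) * s2 / 2.0
    # If G ~ Gamma(shape, rate), then 1 / G ~ IG(shape, rate); NumPy takes scale = 1 / rate.
    return 1.0 / rng.gamma(shape, 1.0 / rate, size=size)

rng = np.random.default_rng(0)
draws = draw_sigma2(rng, n=100, y=-20.0, mean_fixed=-19.0, tau_gamma=-0.5,
                    s2=625.0, size=20000)    # illustrative arm-level inputs
```

With these inputs the rate is 30950 and the shape is 50, so the draws should average close to 30950/49.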

For (ii), we use the modified collapsed Gibbs sampling technique in Chen, Shao and Ibrahim²⁷ via the identity

[β, λ, ϕ, ρ, γ_{o} ∣ Σ, D_{o}] = [β, λ, ϕ, ρ ∣ Σ, D_{o}] [γ_{o} ∣ Σ, β, λ, ϕ, ρ, D_{o}],

and further [β, λ, ϕ, ρ |Σ, D_o] is sampled in turn from the conditional distributions: (ii_a) [β | Σ, λ, ϕ, ρ, D_o]; (ii_b) [λ | Σ, β, ϕ, ρ, D_o]; (ii_c) [ϕ | Σ, β, λ, ρ, D_o]; and (ii_d) [ρ | Σ, β, λ, ϕ, D_o]. The full conditional distribution for β is

β ∣ Σ, λ, ϕ, ρ, D_{o} \sim N_{p} (μ_{β}, Σ_{β}),

where $μ_{β} = Σ_{β} \times [\sum_{k = 1}^{K} X_{k}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} y_{k}]$ and $Σ_{β} = {[c_{01}^{- 1} I_{p} + \sum_{k = 1}^{K} X_{k}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} X_{k}]}^{- 1}$ . Given Σ, β, ϕ, ρ, and D_o, the λ_k’s are conditionally independent with density function

π (λ_{k} ∣ Σ, β, ϕ, ρ, D_{o}) \propto π^{*} (λ_{k} ∣ Σ, β, ϕ, ρ, D_{o}) = λ_{k}^{\frac{v}{2} - 1} \exp {- \frac{v λ_{k}}{2}} {| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)},

for k = 1, 2, …, K. The lack of log-concavity of π* (λ_k | Σ, β, ϕ, ρ, D_o) motivates the use of the localized Metropolis algorithm of Chen et al.²⁷ We make a log transformation on λ_k and denote η_k = log λ_k. Thus, the conditional density function of η_k is proportional to

π^{*} (η_{k} ∣ Σ, β, ϕ, ρ, D_{o}) = \exp {\frac{v}{2} η_{k}} \exp {- \frac{v}{2} e^{η_{k}}} {| e^{- η_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[e^{- η_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)} .

Instead of generating λ_k, the following steps give the algorithm for generating η_k.

The Localized Metropolis Algorithm for Generating η_k

Step 1. Let $η_{k}^{(m - 1)}$ be the current value.

Step 2. Compute ${\hat{η}}_{k} = \arg \max_{η_{k}} π^{*} (η_{k} ∣ Σ, β, ϕ, ρ, D_{o})$ .

Step 3. Compute

{\hat{σ}}_{η_{k}}^{- 2} = - {\frac{\partial^{2} \log π^{*} (η_{k} ∣ Σ, β, ϕ, ρ, D_{o})}{\partial η_{k}^{2}} |}_{η_{k} = {\hat{η}}_{k}} .

Step 4. Draw $η_{k}^{*}$ from $N ({\hat{η}}_{k}, {\hat{σ}}_{η_{k}}^{2})$ .

Step 5. Compute

a = \min {\frac{π^{*} (η_{k}^{*} ∣ Σ, β, ϕ, ρ, D_{o}) ϕ ((η_{k}^{(m - 1)} - {\hat{η}}_{k}) / {\hat{σ}}_{η_{k}})}{π^{*} (η_{k}^{(m - 1)} ∣ Σ, β, ϕ, ρ, D_{o}) ϕ ((η_{k}^{*} - {\hat{η}}_{k}) / {\hat{σ}}_{η_{k}})}, 1}

where ϕ is the standard normal density function.

Step 6. Draw u from U(0, 1). Set $η_{k}^{(m)} = η_{k}^{*}$ if u ≤ a and $η_{k}^{(m)} = η_{k}^{(m - 1)}$ if u > a.
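Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the mode is located by a crude grid search, the curvature is obtained by central differences, and the toy target is the log density of η = log λ when λ ~ Gamma(v/2, 2/v) with no data contribution. The grid, step size, value of v, and chain length are all assumptions.

```python
import math
import numpy as np

def localized_metropolis_step(log_target, current, rng, grid, h=1e-3):
    """One localized Metropolis update; log_target must accept scalars and arrays."""
    # Step 2: approximate the mode by a grid search.
    mode = float(grid[int(np.argmax(log_target(grid)))])
    # Step 3: minus the second derivative at the mode, by central differences.
    curvature = -(log_target(mode + h) - 2.0 * log_target(mode) + log_target(mode - h)) / h**2
    sd = 1.0 / math.sqrt(curvature)
    # Step 4: normal proposal centered at the mode.
    proposal = rng.normal(mode, sd)
    # Step 5: acceptance probability for an independence proposal (log scale).
    def log_q(x):
        return -0.5 * ((x - mode) / sd) ** 2
    log_ratio = (log_target(proposal) + log_q(current)) - (log_target(current) + log_q(proposal))
    accept_prob = math.exp(min(0.0, log_ratio))
    # Step 6: accept or keep the current value.
    return proposal if rng.uniform() <= accept_prob else current

def log_target(eta, v=20.0):
    """Toy target: log density (up to a constant) of eta = log(lambda), lambda ~ Gamma(v/2, 2/v)."""
    return (v / 2.0) * eta - (v / 2.0) * np.exp(eta)

rng = np.random.default_rng(1)
grid = np.linspace(-3.0, 3.0, 601)
eta = 0.0
draws = []
for _ in range(5000):
    eta = localized_metropolis_step(log_target, eta, rng, grid)
    draws.append(eta)
lam_mean = float(np.mean(np.exp(draws)))     # the Gamma(v/2, 2/v) mean is 1
```

In the actual sampler the target changes at every sweep because it depends on the current values of Σ, β, ϕ, and ρ, so the mode and curvature would be recomputed each time, as the function above does.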

Remark 4.2. In Step 4, $N ({\hat{η}}_{k}, {\hat{σ}}_{η_{k}}^{2})$ is the normal proposal density, where ${\hat{η}}_{k}$ is the maximizer of log π* (η_k | Σ, β, ϕ, ρ, D_o), and ${\hat{σ}}_{η_{k}}^{2}$ is minus the inverse of the second derivative of log π* (η_k | Σ, β, ϕ, ρ, D_o) evaluated at $η_{k} = {\hat{η}}_{k}$ . To avoid direct computation of the second derivative of log π* (η_k | Σ, β, ϕ, ρ, D_o), we fit a quadratic polynomial regression model to approximate log π* (η_k | Σ, β, ϕ, ρ, D_o) around its mode ${\hat{η}}_{k}$ . The variance ${\hat{σ}}_{η_{k}}^{2}$ can thus be approximated by −1/(2a), where a is the least squares estimate of the quadratic coefficient. Specifically, we select $(η_{k}^{1}, \dots, η_{k}^{B})$ in the neighborhood of ${\hat{η}}_{k}$ , compute $w_{k}^{b} = \log π^{*} (η_{k}^{b} ∣ Σ, β, ϕ, ρ, D_{o})$ for b = 1, …, B, fit a second order polynomial regression using the points $\{(η_{k}^{1}, w_{k}^{1}), \dots, (η_{k}^{B}, w_{k}^{B})\}$ , and then set the inverse of the variance to be −2 times the least squares estimate of the second order coefficient.
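The quadratic-fit approximation in the remark can be illustrated with NumPy's polynomial least squares. The neighborhood half-width and the number of points B are assumptions, and the test function is a Gaussian log density chosen because its curvature is known exactly.

```python
import numpy as np

def variance_from_quadratic_fit(log_target, mode, half_width=0.5, B=21):
    """Approximate the proposal variance by -1 / (2 a), where a is the least squares
    quadratic coefficient of log_target fitted on B points around the mode."""
    eta = np.linspace(mode - half_width, mode + half_width, B)
    w = np.array([log_target(e) for e in eta])
    a, _, _ = np.polyfit(eta, w, deg=2)      # coefficients, highest degree first
    return -1.0 / (2.0 * a)

def gaussian_log_density(x, mean=1.2, sd=0.7):
    """Illustrative target with known variance sd**2 (normalizing constant dropped)."""
    return -0.5 * ((x - mean) / sd) ** 2

approx_var = variance_from_quadratic_fit(gaussian_log_density, mode=1.2)
```

Because the fitted curve is c + bη + aη², its second derivative is 2a, which is why the inverse variance equals −2 times the quadratic coefficient.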

The full conditional distribution for ϕ is

π (ϕ ∣ Σ, β, λ, ρ, D_{o}) \propto π^{*} (ϕ ∣ Σ, β, λ, ρ, D_{o}) = \exp {- \frac{ϕ^{'} ϕ}{2 c_{02}}} \times \prod_{k = 1}^{K} ({| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)}) .

Letting $ϕ^{(m - 1)} = {(ϕ_{1}^{(m - 1)}, \dots, ϕ_{q}^{(m - 1)})}^{'}$ be the current value, we use a modified localized Metropolis algorithm to generate $ϕ^{(m)} = {(ϕ_{1}^{(m)}, \dots, ϕ_{q}^{(m)})}^{'}$ . The steps are similar to the algorithm for generating $η_{k}^{(m)}$ . For (ii_d), let ρ_ij denote the (i, j)th element of ρ and also let $ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}$ denote the partial correlations. The full conditional distribution for ρ_ij, 1 ≤ i < j ≤ T, is

π (ρ_{i j}, 1 \leq i < j \leq T ∣ Σ, β, λ, ϕ, D_{o}) \propto \prod_{k = 1}^{K} ({| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (ρ_{i j}) E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (ρ_{i j}) E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)}) .

We generate the positive definite correlation matrix ρ(ρ_ij) based on the correlations ρ_i,i+1 for i = 1, …, T − 1 and partial correlations $ρ_{i, j ∣ i + 1, \dots, j - 1}$ for j − i ≥ 2.²⁸ These $\binom{T}{2}$ parameters can independently take values on (−1, 1). Following this approach, we first transform $(ρ_{12}, ρ_{23}, ρ_{13}, ρ_{34}, ρ_{24}, \dots, ρ_{1 T})$ to $(ρ_{12}, ρ_{23}, ρ_{13 ∣ 2}, ρ_{34}, ρ_{24 ∣ 3}, \dots, ρ_{1 T ∣ 2, \dots, T - 1})$ , for which the determinant |J_T| of the Jacobian of this transformation is given by

{[\prod_{i = 1}^{T - 1} {(1 - ρ_{i, i + 1}^{2})}^{T - 2} \times \prod_{ℓ = 2}^{T - 2} \prod_{i = 1}^{T - ℓ} {(1 - ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}^{2})}^{T - 1 - ℓ}]}^{1 / 2} .
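To make the reparameterization concrete, the sketch below rebuilds a correlation matrix from lag-one correlations and higher-lag partial correlations, each chosen freely in (−1, 1). It uses the standard identity ρ_ij = r₁′R₂⁻¹r₃ + ρ_ij|S {(1 − r₁′R₂⁻¹r₁)(1 − r₃′R₂⁻¹r₃)}^{1/2}, where S = {i + 1, …, j − 1}, r₁ and r₃ are the correlations of i and j with S, and R₂ is the correlation matrix of S. The function names and the 0-based input layout are assumptions made for this example, not part of the paper.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[r][:] + [b[r]] for r in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def corr_from_partials(partial):
    """Build the correlation matrix from partial[i][j], i < j (0-based).

    partial[i][j] holds rho_{i,j | i+1..j-1}; for j = i + 1 the
    conditioning set is empty, so the entry is a plain correlation.
    """
    t = len(partial)
    rho = [[1.0 if i == j else 0.0 for j in range(t)] for i in range(t)]
    for lag in range(1, t):
        for i in range(t - lag):
            j = i + lag
            if lag == 1:
                value = partial[i][j]
            else:
                s = list(range(i + 1, j))
                r2 = [[rho[a][b] for b in s] for a in s]
                r1 = [rho[i][a] for a in s]
                r3 = [rho[j][a] for a in s]
                w1 = solve(r2, r1)  # R2^{-1} r1
                w3 = solve(r2, r3)  # R2^{-1} r3
                cross = sum(x * y for x, y in zip(r1, w3))
                v1 = 1.0 - sum(x * y for x, y in zip(r1, w1))
                v3 = 1.0 - sum(x * y for x, y in zip(r3, w3))
                value = cross + partial[i][j] * math.sqrt(v1 * v3)
            rho[i][j] = rho[j][i] = value
    return rho
```

With all lag-one correlations equal to 0.7 and all higher-lag partial correlations equal to zero, the result has the first-order autoregressive pattern 0.7, 0.49, 0.343 along successive off-diagonals, which is a convenient sanity check.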

Write $ρ_{i, i + 1} = ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}$ with ℓ = 1 since the indices i+1, …, i+ ℓ−1 form an empty set if ℓ = 1. Then the full conditional joint density of $ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}$ for ℓ = 1, 2, …, T − 1 and i = 1,…, T − ℓ can be written as

π (ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1} ∣ Σ, β, λ, ϕ, D_{o}) \propto \prod_{k = 1}^{K} [{| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}) E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}) E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)}] \times {[\prod_{ℓ = 1}^{T - 2} \prod_{i = 1}^{T - ℓ} {(1 - ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}^{2})}^{T - 1 - ℓ}]}^{1 / 2} .

We further apply Fisher’s z transformation to $ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}$ and denote $z_{i, i + ℓ} = \frac{1}{2} \log (\frac{1 + ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}}{1 - ρ_{i, i + ℓ ∣ i + 1, \dots, i + ℓ - 1}})$ . The Jacobian for this transformation is $\prod_{ℓ = 1}^{T - 1} \prod_{i = 1}^{T - ℓ} \frac{4 e^{2 z_{i, i + ℓ}}}{{(e^{2 z_{i, i + ℓ}} + 1)}^{2}}$ . Thus, the full conditional joint density of z_i,i+ℓ with ℓ = 1, 2, …, T − 1, i = 1, …, T − ℓ, can be written as

π (z_{i, i + ℓ} ∣ Σ, β, λ, ϕ, D_{o}) \propto π^{*} (z_{i, i + ℓ} ∣ Σ, β, λ, ϕ, D_{o}) = \prod_{k = 1}^{K} [{| \frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (z_{i, i + ℓ}) E_{k} Z_{k} (ϕ)) + Σ_{k} |}^{- \frac{1}{2}} \times \exp {- \frac{1}{2} {(y_{k} - X_{k} β)}^{'} {[\frac{1}{λ_{k}} (Z_{k} (ϕ) E_{k}^{'} ρ (z_{i, i + ℓ}) E_{k} Z_{k} (ϕ)) + Σ_{k}]}^{- 1} (y_{k} - X_{k} β)}] \times {[\prod_{ℓ = 1}^{T - 2} \prod_{i = 1}^{T - ℓ} {1 - {(\frac{e^{2 z_{i, i + ℓ}} - 1}{e^{2 z_{i, i + ℓ}} + 1})}^{2}}^{T - 1 - ℓ}]}^{1 / 2} \times \prod_{ℓ = 1}^{T - 1} \prod_{i = 1}^{T - ℓ} \frac{4 e^{2 z_{i, i + ℓ}}}{{(e^{2 z_{i, i + ℓ}} + 1)}^{2}} .
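As a small check of the transformation used above, the following sketch implements Fisher's z map, its inverse, and the per-element Jacobian factor 4e^{2z}/(e^{2z} + 1)², which equals 1 − ρ² at ρ = (e^{2z} − 1)/(e^{2z} + 1). The function names are illustrative only.

```python
import math


def fisher_z(r):
    """Map a (partial) correlation r in (-1, 1) to the real line."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def inv_fisher_z(z):
    """Inverse map: r = (e^{2z} - 1) / (e^{2z} + 1)."""
    e = math.exp(2.0 * z)
    return (e - 1.0) / (e + 1.0)


def jacobian(z):
    """dr/dz = 4 e^{2z} / (e^{2z} + 1)^2 for a single element."""
    e = math.exp(2.0 * z)
    return 4.0 * e / (e + 1.0) ** 2
```

Working on the z scale removes the (−1, 1) constraint, so a normal proposal can be used without rejecting out-of-range values.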

Again, z_i,i+ℓ can be generated by the localized Metropolis algorithm. The full conditional distribution for γ_k,o, k = 1, …, K, is

γ_{k, o} ∣ Σ, β, λ, ϕ, ρ, D_{o} \sim N_{T_{k}} ({[Z_{k} (ϕ) Σ_{k}^{- 1} Z_{k} (ϕ) + λ_{k} {(E_{k}^{'} ρ E_{k})}^{- 1}]}^{- 1} Z_{k} (ϕ) Σ_{k}^{- 1} (y_{k} - X_{k} β), {[Z_{k} (ϕ) Σ_{k}^{- 1} Z_{k} (ϕ) + λ_{k} {(E_{k}^{'} ρ E_{k})}^{- 1}]}^{- 1}),

The full conditional distribution [v_b | v, v_a] is $I G (a_{5} + v_{a}, v_{a} v + b_{5})$ . Finally,

π (v ∣ λ, v_{a}, v_{b}) \propto π^{*} (v ∣ λ, v_{a}, v_{b}) = \frac{{(\frac{v}{2})}^{\frac{K v}{2}}}{{(Γ (\frac{v}{2}))}^{K}} [\prod_{k = 1}^{K} λ_{k}^{\frac{v}{2} - 1} \exp (- \frac{v λ_{k}}{2})] v^{v_{a} - 1} \exp (- \frac{v_{a} v}{v_{b}}),

π (v_{a} ∣ v, v_{b}) \propto π^{*} (v_{a} ∣ v, v_{b}) = \frac{{(v_{a} / v_{b})}^{v_{a}}}{Γ (v_{a})} v^{v_{a} - 1} \exp (- \frac{v_{a} v}{v_{b}}) v_{a}^{a_{4} - 1} \exp (- b_{4} v_{a}) .

We make the transformation ξ = log v and ξ_a = log v_a. The conditional densities of ξ and ξ_a are proportional to

π^{*} (ξ ∣ λ, v_{a}, v_{b}) = \frac{{(\frac{e^{ξ}}{2})}^{\frac{K e^{ξ}}{2}}}{{(Γ (\frac{e^{ξ}}{2}))}^{K}} [\prod_{k = 1}^{K} λ_{k}^{\frac{e^{ξ}}{2} - 1} \exp (- \frac{e^{ξ} λ_{k}}{2})] e^{ξ v_{a}} \exp (- \frac{v_{a} e^{ξ}}{v_{b}}),

π^{*} (ξ_{a} ∣ v, v_{b}) = \frac{{(e^{ξ_{a}} / v_{b})}^{e^{ξ_{a}}}}{Γ (e^{ξ_{a}})} v^{e^{ξ_{a}} - 1} \exp (- \frac{e^{ξ_{a}} v}{v_{b}}) e^{ξ_{a} a_{4}} \exp (- b_{4} e^{ξ_{a}}) .

We use the localized Metropolis algorithm for ξ and ξ_a, where the inverse variances of the proposal distributions are

{\hat{σ}}_{ξ}^{- 2} = - \frac{\partial^{2} \log π^{*} (\hat{ξ} ∣ λ, v_{a}, v_{b})}{\partial ξ^{2}} = - e^{\hat{ξ}} (\frac{K}{2} (\hat{ξ} - \log 2) - \frac{K e^{\hat{ξ}} ψ^{'} (e^{\hat{ξ}} / 2)}{4} - \frac{K ψ (e^{\hat{ξ}} / 2)}{2} + K + \frac{1}{2} \sum_{k = 1}^{K} (\log λ_{k} - λ_{k}) - \frac{v_{a}}{v_{b}}),

{\hat{σ}}_{ξ_{a}}^{- 2} = - \frac{\partial^{2} \log π^{*} ({\hat{ξ}}_{a} ∣ v, v_{b})}{\partial ξ_{a}^{2}} = - e^{{\hat{ξ}}_{a}} (- b_{4} + {\hat{ξ}}_{a} - \frac{v}{v_{b}} - e^{{\hat{ξ}}_{a}} ψ^{'} (e^{{\hat{ξ}}_{a}}) + \log v - \log v_{b} - ψ (e^{{\hat{ξ}}_{a}}) + 2),

where ψ(·) and ψ′(·) indicate the digamma function and the trigamma function, respectively, and $\hat{ξ}$ and ${\hat{ξ}}_{a}$ indicate the values where their full conditionals are maximized.
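The update for ξ = log v can be sketched as below, under several stated assumptions. First, the acceptance rule is written as an independence-type Metropolis-Hastings step with a normal proposal centered at the mode, which is our reading of the localized Metropolis algorithm whose earlier steps fall outside this passage. Second, a golden-section search stands in for whatever mode finder is actually used, and a central finite difference stands in for the closed-form second derivative given above. All function names and the search bracket are hypothetical.

```python
import math
import random


def log_post_xi(xi, lam, v_a, v_b):
    """log pi*(xi | lambda, v_a, v_b) with v = exp(xi), up to a constant."""
    v = math.exp(xi)
    k = len(lam)
    out = 0.5 * k * v * math.log(0.5 * v) - k * math.lgamma(0.5 * v)
    out += sum((0.5 * v - 1.0) * math.log(l) - 0.5 * v * l for l in lam)
    return out + xi * v_a - v_a * v / v_b


def find_mode(f, lo=-5.0, hi=8.0, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)


def second_derivative(f, x, h=1e-3):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def localized_metropolis_step(xi_current, lam, v_a, v_b, rng):
    """One update of xi using a N(mode, sigma^2) proposal."""
    f = lambda x: log_post_xi(x, lam, v_a, v_b)
    mode = find_mode(f)
    sigma = math.sqrt(-1.0 / second_derivative(f, mode))
    xi_star = rng.gauss(mode, sigma)
    log_q = lambda x: -0.5 * ((x - mode) / sigma) ** 2  # proposal log density
    log_a = (f(xi_star) - log_q(xi_star)) - (f(xi_current) - log_q(xi_current))
    if rng.random() <= math.exp(min(0.0, log_a)):
        return xi_star
    return xi_current
```

Because the proposal is matched to the curvature at the mode, acceptance rates tend to be high when the full conditional is close to normal on the log scale; heavier skewness would lower them.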

4.3 |. Model Comparison

It is essential to determine the degrees of freedom when using the multivariate t distribution for the random effects. In this paper, we alternate between several choices for fixed v including ∞ for which the random effects would follow a multivariate normal distribution, and random v with a hierarchical prior. Likewise, the choices of the covariate vector $z_{k t_{k ℓ}}$ need to be compared. These two issues call for carrying out Bayesian model comparisons for which two criterion-based measures are considered. We first define the deviance function only for the observed data likelihood using the response variable y_k, which is given by D(θ) = −2log L(θ | D_oy). Thus, the deviance information criterion (DIC)²⁹ is given by $DIC = D (\bar{θ}) + 2 P_{D}$ , where $P_{D} = \bar{D} - D (\bar{θ}), \bar{D} = E [D (θ) ∣ D_{o}]$ and $\bar{θ} = E [θ ∣ D_{o}]$ . Secondly, for the kth trial, the conditional predictive ordinate is defined as ${CPO}_{k} = \int f (y_{k} ∣ θ) π (θ ∣ D_{o y}^{- k}, D_{o s}) d θ$ , where $D_{o y}^{- k}$ is D_oy with the kth trial deleted, and $π (θ ∣ D_{o y}^{- k}, D_{o s})$ is the posterior distribution based on the data $(D_{o y}^{- k}, D_{o s})$ . Another model comparison criterion is the logarithm of the Pseudo-marginal likelihood (LPML), which summarizes the CPO_k’s as $LPML = \sum_{k = 1}^{K} \log ({CPO}_{k})$ . A smaller DIC or a larger LPML indicates a better fit.
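Both criteria can be estimated from MCMC output, as sketched below. The DIC part follows the definition in the text directly. For the CPO, the sketch uses the common Monte Carlo estimator that averages the reciprocal likelihood over posterior draws, which is an assumption here since the paper's own computation is not shown in this passage; it relies on the trials being conditionally independent given θ. Function names and input layouts are hypothetical.

```python
import math


def dic(deviance_draws, deviance_at_mean):
    """Return (DIC, p_D) from draws of D(theta) and D evaluated at theta-bar."""
    d_bar = sum(deviance_draws) / len(deviance_draws)
    p_d = d_bar - deviance_at_mean
    return deviance_at_mean + 2.0 * p_d, p_d


def log_cpo(loglik_draws_k):
    """log CPO_k ~= -log( mean over draws of exp(-loglik_k) ).

    Uses a log-sum-exp shift so that small likelihoods do not overflow.
    """
    neg = [-l for l in loglik_draws_k]
    mx = max(neg)
    lse = mx + math.log(sum(math.exp(x - mx) for x in neg))
    return -(lse - math.log(len(neg)))


def lpml(loglik_draws):
    """loglik_draws[k][m]: log f(y_k | theta^(m)) for trial k and draw m."""
    return sum(log_cpo(row) for row in loglik_draws)
```

Trials that cannot be predicted from the remaining data can simply be left out of the outer sum in `lpml`, which mirrors restricting the LPML to a subset of trials.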

5 |. A SIMULATION STUDY

To examine the empirical performance of the proposed methods, we conduct two sets of simulation studies, in which the true distributions of the random effects are a multivariate t-distribution with v = 3 and a multivariate normal distribution, respectively. The simulation setting is as follows: K = 30, p = 9, T = 4, β = (−0.06, 1.40, 1.83, 0.11, −0.81, 1.03, −18.88, −24.53, −11.70)′, and ϕ = (1.38, −1.5)′. The K = 30 trials are divided into three blocks of the same size, having two, three, and four arms, respectively. The covariate vectors are generated from a multivariate normal distribution. That is, $x_{k t_{k ℓ}} \overset{i.i.d.}{\sim} N_{5} (0_{5}, Σ_{x})$ with $Σ_{x} = \begin{pmatrix} 4.5 & 0 & 0 & 0 & 0 \\ 0 & 4.5 & 2 & 0 & 0 \\ 0 & 2 & 4.5 & 0 & 0 \\ 0 & 0 & 0 & 4.5 & - 2.5 \\ 0 & 0 & 0 & - 2.5 & 4.5 \end{pmatrix}$ . The correlation matrix ρ is fixed as a 4×4 matrix with off-diagonal elements of 0.7. The vector z_kt is two-dimensional: its first element is 1 if the treatment is 1, its second element is 1 if the treatment is 3, and each element is zero otherwise. The error variances, $σ_{k t}^{2}$’s, range from 366.1473 to 439.7980, with the first, second (median), and third quartiles being 392.0314, 399.9440, and 412.9699, respectively. These specific values for the true parameters were chosen to reflect the real data in our paper. However, to make the simulation model simpler than that of the real data analysis, we reduced the number of covariates from 10 to 5 and the treatments from 11 to 4. The true values for the treatment effects (β₆, … ,β₉) were chosen to include significant effect sizes and a close-to-null effect. The near-zero effect was included to examine whether our models can detect the effect even when it is small. The measures for evaluating the accuracy of the posterior estimates are the bias, the relative bias (Rel. Bias), which is defined as bias/|true value of the parameter|, the simulation error (SE), the average of standard deviations (SD), the root mean squared error (rMSE), and the coverage probabilities (CP) of their 95% HPD intervals. The MCMC chains were run for 12,500 iterations each with the first 2,500 discarded.
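The evaluation measures listed above can be computed from per-replicate posterior summaries as in the following sketch. It assumes that each simulated data set contributes a posterior mean, a posterior standard deviation, and a 95% interval for the parameter of interest; the function name, the input layout, and the use of the n − 1 divisor for the simulation error are assumptions of this example.

```python
import math


def summarize(true_value, post_means, post_sds, intervals):
    """Summarize one parameter across simulation replicates.

    post_means[r], post_sds[r]: posterior mean and SD in replicate r.
    intervals[r]: (lower, upper) bounds of the 95% interval in replicate r.
    """
    n = len(post_means)
    mean_est = sum(post_means) / n
    bias = mean_est - true_value
    # SE: standard deviation of the posterior means across replicates.
    se = math.sqrt(sum((m - mean_est) ** 2 for m in post_means) / (n - 1))
    # SD: average of the within-replicate posterior standard deviations.
    sd = sum(post_sds) / n
    rmse = math.sqrt(sum((m - true_value) ** 2 for m in post_means) / n)
    cp = sum(lo <= true_value <= hi for lo, hi in intervals) / n
    return {
        "bias": bias,
        "rel_bias": bias / abs(true_value),
        "SE": se,
        "SD": sd,
        "rMSE": rmse,
        "CP": cp,
    }
```

With these definitions the squared rMSE equals the squared bias plus (n − 1)/n times the squared SE, so for a large number of replicates the rMSE is approximately the square root of bias² + SE².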

Table 2 contains the summary of 500 simulations for which the true model assumes the random effects follow a multivariate t-distribution with 3 degrees of freedom. The posterior estimates of the parameters under the true model have small biases and the CPs are close to 95% for most of them. The posterior estimates for β are less affected by the distributional assumption of the random effects than by the variance structure. However, the coverage probabilities start to deviate from 95% as the fitted model departs from the true model, with the exception that ϕ₂ has a high CP under all models. When v = ∞, the CPs of β₇ and β₉ are around 85% and the covariate effects captured by (β₁, β₂, … ,β₅) have CPs between 90% and 93%. The departure from the true model takes the greatest toll on the CP for ϕ₁, which falls to 80% when we fit the multivariate t distribution with v = 10 and to as low as 41.4% when normal random effects are assumed. Assuming a single variance is worse than assuming an incorrect distribution for the random effects, as it lowers the CP for β₆ to 79.4%. Furthermore, under the single variance model the CPs for (β₇, β₈, β₉)′ range from 99.4% to 100%, which are far too high, indicating a lack of power.

TABLE 2.

Results for 500 simulations from the multivariate t-distribution with 3 degrees of freedom. v indicates the degrees of freedom of the fitted model with v = ∞ implying a normal distribution. “Single Variance” assumes z_kt = 1 for all k and t.

	Bias	Rel. Bias	SE	SD	rMSE	CP		Bias	Rel. Bias	SE	SD	rMSE	CP

v =3
β₁	0.006	0.100	0.154	0.171	0.154	0.978	β₆	−0.022	−0.021	1.001	1.019	1.000	0.946
β₂	0.004	0.003	0.168	0.187	0.168	0.976	β₇	0.026	0.001	0.351	0.357	0.352	0.960
β₃	−0.001	−0.001	0.173	0.180	0.173	0.952	β₈	0.003	0.000	0.214	0.229	0.214	0.962
β₄	−0.019	−0.173	0.243	0.248	0.244	0.954	β₉	−0.000	0.000	0.364	0.375	0.364	0.954
β₅	−0.015	−0.019	0.231	0.238	0.231	0.952	ϕ₁	−0.027	−0.020	0.176	0.188	0.178	0.966
							ϕ₂	−0.537	−0.358	0.445	1.087	0.697	0.992

v =10
β₁	0.008	0.133	0.158	0.165	0.158	0.972	β₆	−0.022	−0.021	1.075	1.084	1.074	0.946
β₂	0.002	0.001	0.170	0.179	0.170	0.956	β₇	0.026	0.001	0.364	0.329	0.365	0.936
β₃	−0.001	−0.001	0.176	0.175	0.176	0.942	β₈	0.001	0.000	0.214	0.229	0.214	0.962
β₄	−0.020	−0.182	0.245	0.238	0.246	0.944	β₉	0.001	0.000	0.376	0.346	0.375	0.930
β₅	−0.012	−0.015	0.232	0.229	0.232	0.946	ϕ₁	0.150	0.109	0.180	0.163	0.234	0.800
							ϕ₂	−0.366	−0.244	0.462	1.105	0.589	0.990

v =∞
β₁	0.011	0.183	0.178	0.160	0.178	0.928	β₆	0.026	0.025	1.389	1.214	1.388	0.922
β₂	0.002	0.001	0.200	0.172	0.200	0.928	β₇	0.034	0.002	0.421	0.309	0.422	0.846
β₃	−0.005	−0.003	0.202	0.172	0.202	0.900	β₈	0.002	0.000	0.215	0.233	0.215	0.966
β₄	−0.021	−0.191	0.273	0.230	0.274	0.916	β₉	0.020	0.002	0.439	0.322	0.439	0.858
β₅	−0.007	−0.009	0.265	0.220	0.265	0.908	ϕ₁	0.335	0.243	0.235	0.141	0.409	0.414
							ϕ₂	−0.198	−0.132	0.475	1.103	0.514	0.978

Single Variance
β₁	0.014	0.233	0.227	0.237	0.227	0.958	β₆	0.048	0.047	0.814	0.517	0.814	0.794
β₂	−0.002	−0.001	0.228	0.268	0.228	0.978	β₇	0.029	0.002	0.325	0.529	0.326	0.994
β₃	−0.022	−0.012	0.265	0.286	0.266	0.962	β₈	0.003	0.000	0.230	0.529	0.230	1.000
β₄	0.017	0.155	0.357	0.313	0.357	0.916	β₉	−0.008	−0.001	0.338	0.572	0.337	0.998
β₅	−0.004	−0.005	0.370	0.313	0.370	0.886	ϕ₁	N/A	N/A	N/A	N/A	N/A	N/A
							ϕ₂	N/A	N/A	N/A	N/A	N/A	N/A

Random v
β₁	0.007	0.117	0.153	0.171	0.153	0.980	β₆	−0.033	−0.032	0.999	1.012	0.998	0.938
β₂	0.004	0.003	0.168	0.188	0.168	0.974	β₇	0.022	0.001	0.351	0.358	0.351	0.964
β₃	0.000	0.000	0.173	0.181	0.173	0.958	β₈	0.000	0.000	0.215	0.231	0.214	0.962
β₄	−0.030	−0.273	0.243	0.248	0.245	0.950	β₉	0.007	0.001	0.364	0.375	0.363	0.960
β₅	−0.022	−0.027	0.232	0.240	0.233	0.952	ϕ₁	−0.048	−0.035	0.183	0.201	0.189	0.968
v	0.229	0.076	0.988	1.550	1.014	0.966	ϕ₂	−0.556	−0.371	0.440	1.062	0.709	0.994


Table 3 contains the summary of 500 simulations for which the true model assumes normally distributed random effects. The same pattern persists under this setting as well: the bigger the departure of the assumed model from the true normal distribution, the worse the CPs. The effect of the incorrect assumption is most conspicuous in ϕ₁, whose CP drops to 85% for the model assuming v = 3. Equally impacted is the CP of ϕ₂ under v = 3, which attains 100% coverage, implying a lack of power. However, as noted above, the CPs are worst when a single variance is assumed for all random effects: the CP for β₆ falls to 73.6%, while those of β₇, β₈, and β₉ spike to 98.4%, 100%, and 99.2%, respectively.

TABLE 3.

Results for 500 simulations from the multivariate normal distribution. v indicates the degrees of freedom of the random effects for the fitted model with v = ∞ implying a normal distribution. “Single Variance” assumes z_kt = 1 for all k and t.

	Bias	Rel. Bias	SE	SD	rMSE	CP		Bias	Rel. Bias	SE	SD	rMSE	CP

v =3
β₁	0.002	0.037	0.157	0.166	0.157	0.956	β₆	0.040	0.038	0.853	0.830	0.853	0.928
β₂	0.002	0.002	0.164	0.182	0.164	0.978	β₇	0.035	0.002	0.319	0.344	0.321	0.964
β₃	−0.009	−0.005	0.169	0.177	0.169	0.954	β₈	0.003	0.000	0.204	0.228	0.203	0.972
β₄	0.018	0.170	0.215	0.239	0.216	0.982	β₉	−0.010	−0.001	0.333	0.361	0.333	0.970
β₅	0.002	0.002	0.225	0.230	0.225	0.948	ϕ₁	−0.231	−0.167	0.175	0.193	0.290	0.850
							ϕ₂	−0.460	−0.307	0.407	1.117	0.614	1.000

v =10
β₁	0.004	0.060	0.155	0.161	0.155	0.952	β₆	0.037	0.036	0.817	0.826	0.817	0.934
β₂	0.003	0.002	0.161	0.175	0.161	0.976	β₇	0.035	0.002	0.316	0.322	0.317	0.952
β₃	−0.008	−0.005	0.169	0.172	0.169	0.944	β₈	−0.000	0.000	0.203	0.228	0.203	0.972
β₄	0.018	0.167	0.214	0.231	0.214	0.978	β₉	−0.010	−0.001	0.330	0.338	0.330	0.964
β₅	0.001	0.001	0.221	0.222	0.221	0.948	ϕ₁	−0.121	−0.088	0.163	0.169	0.203	0.908
							ϕ₂	−0.332	−0.221	0.376	1.112	0.501	0.998

v=∞
β₁	0.003	0.045	0.155	0.157	0.155	0.946	β₆	0.048	0.047	0.809	0.825	0.810	0.940
β₂	0.004	0.003	0.160	0.171	0.160	0.976	β₇	0.034	0.002	0.316	0.310	0.317	0.948
β₃	−0.008	−0.004	0.169	0.169	0.169	0.940	β₈	0.001	0.000	0.203	0.229	0.203	0.976
β₄	0.016	0.145	0.215	0.226	0.215	0.972	β₉	−0.005	0.000	0.330	0.322	0.330	0.950
β₅	−0.001	−0.001	0.221	0.217	0.220	0.946	ϕ₁	−0.043	−0.031	0.159	0.155	0.164	0.922
							ϕ₂	−0.284	−0.189	0.428	1.137	0.513	0.998

Single Variance
β₁	0.017	0.283	0.214	0.216	0.214	0.940	β₆	0.034	0.033	0.917	0.540	0.917	0.736
β₂	−0.001	−0.001	0.211	0.241	0.211	0.974	β₇	0.030	0.002	0.338	0.443	0.339	0.984
β₃	−0.018	−0.010	0.243	0.261	0.243	0.966	β₈	−0.002	0.000	0.231	0.424	0.230	1.000
β₄	0.013	0.118	0.331	0.293	0.331	0.910	β₉	−0.011	−0.001	0.346	0.471	0.346	0.992
β₅	−0.007	−0.009	0.333	0.289	0.333	0.892	ϕ₁	N/A	N/A	N/A	N/A	N/A	N/A
							ϕ₂	N/A	N/A	N/A	N/A	N/A	N/A

Random v
β₁	0.003	0.045	0.156	0.165	0.156	0.958	β₆	0.044	0.042	0.842	0.824	1.012	0.928
β₂	0.005	0.004	0.164	0.181	0.164	0.978	β₇	0.035	0.002	0.318	0.338	0.358	0.958
β₃	−0.008	−0.004	0.169	0.176	0.169	0.954	β₈	−0.003	0.000	0.203	0.229	0.231	0.974
β₄	0.017	0.150	0.214	0.238	0.215	0.980	β₉	−0.010	−0.001	0.332	0.355	0.375	0.970
β₅	0.001	0.002	0.224	0.230	0.224	0.950	ϕ₁	−0.208	−0.151	0.175	0.195	0.201	0.868
							ϕ₂	−0.437	−0.291	0.389	1.119	1.062	1.000


The simulation error (SE) measures the across-simulation variation of the posterior mean, whereas the average of the standard deviations (SD) measures the within-simulation variation of the posterior sample. Tables 2 and 3 show that SE and SD are close to each other, as desired, for most parameters. For example, the results for v = 3 in Table 2 show that the SE and SD of β₁ are 0.154 and 0.171, respectively. This implies that the posterior means across simulations tend to vary less than the posterior sample within a given replicate, which typically shows up as overcoverage in the CP provided that the estimate is unbiased. The greater the discrepancy, the larger the impact on the CP. For example, this is observed for ϕ₂, whose SE, SD, and CP are 0.445, 1.087, and 0.992, respectively.

Meanwhile, the relative biases of β₁, β₄, and β₆ are notably larger than those of others across the board. It is apparent that the relative bias reflects the strength of the signal in its true value since (β₁, β₄, β₆) are exactly the elements with effect sizes that are hard to detect. Moreover, the rMSEs indicate the average variability of the posterior sample with respect to the true value. The rMSE for β₆ corresponding to v = 3 in Table 2 is 1.000, which is the largest among all the parameters. This indicates that a given posterior sample may signal a fair amount of uncertainty for β₆ due to its small size, which is 1.03, while all other treatment effects have magnitudes over 11.00.

The results when v was assigned a prior distribution and sampled in the MCMC iterations are promising. Although the model does not assume that the degrees of freedom are fixed at the true value, it does not sacrifice the CPs for almost all of the parameters, with β₆ being a minor exception. This implies that treating v as random has similar statistical power to that of the true model specification. Table 2 also contains the summary of the posterior distribution of v, which shows that its CP is 0.966 with a relative bias of 0.076. It is worth noting that treating v as random does not solve the lack of power for ϕ₂ since its CP remains too high.

Figures 3 and 4 show the differences between each pair of DICs and LPMLs, respectively. A summary of the DIC (LPML) differences is provided for each boxplot in Table 4, in which the medians and IQRs (interquartile ranges) are reported. Both DICs and LPMLs clearly select the true model most of the time when the variance structure is correctly specified. However, it is evident that the magnitude of lost information in the DICs and the LPMLs is much greater when the variance structure is misspecified than when a wrong distribution is assigned for the random effects. On the other hand, LPML and DIC both favor treating v as random even when compared to the model assuming the true value for v, provided that the true model assumes t-distributed random effects. When the true random effects come from a multivariate normal distribution, the random degrees of freedom model resulted in worse performance as demonstrated in Figures 3 and 4. Table 4 indicates, however, that the true model and the fitted model with random degrees of freedom have comparable fit since the sizes of the median differences are around 2 and the widths of the IQRs do not exceed 10.

FIGURE 3.

The DIC difference is defined as the DIC under the true model minus the DIC under the fitted model. The true model for the left panel has multivariate t random effects with v = 3 and that of the right panel is multivariate normal random effects. Differences below −100 are not shown in the boxplots.

FIGURE 4.

The LPML difference is defined as the LPML under the true model minus the LPML under the fitted model. The true model for the left panel is the multivariate t random effects with v = 3 and that of the right panel is the multivariate normal random effects.

TABLE 4.

Summary of DIC (LPML) differences. The DIC (LPML) difference is defined as the DIC (LPML) under the true model minus the DIC (LPML) under the fitted model. The row label indicates the fitted model.

	DIC Difference		LPML Difference
Fitted Model	Median	IQR	Median	IQR

(True) v = 3
v = 10	−0.19	(−1.94, 1.80)	0.97	(−0.26, 1.91)
v = ∞	−1.83	(−6.21, 1.48)	2.53	(0.72, 5.63)
Single Variance	−43.40	(−48.97, −38.31)	22.74	(20.91, 27.12)
Random v	1.47	(−2.40, 8.41)	−2.07	(−6.41, 0.19)

(True) v = ∞
v = 3	−5.23	(−6.43, −3.86)	3.03	(2.34, 3.44)
v = 10	−1.19	(−1.68, −0.55)	0.54	(0.34, 0.74)
Single Variance	−47.44	(−53.07, −40.87)	27.44	(21.74, 36.97)
Random v	−4.74	(−5.23, −4.11)	1.68	(1.20, 2.00)


6 |. ANALYSIS OF THE TNM DATA

In this section, we use our model (1) - (4) to analyze the TG network meta-data introduced in Section 2. Specifically, our data include T = 11 treatments {PBO, S, A, L, R, P, E, SE, AE, LE, PE}, each of which is assigned a treatment ID from 1 to 11, sequentially. Among the 11 treatments, {S, A, L, R, P} are different types of statins and {SE, AE, LE, PE} are the combination treatments of E with different types of statins. It is worth noting from Section 2 that the combination treatment {RE} is missing in our TG network meta-data and the five treatments {L, P, AE, LE, PE} are only included in single trials. Let $y_{k t_{k ℓ}}$ in (1) be the mean percent change in TG from the baseline value for the t_kℓ th treatment in the kth trial. In model (1), $x_{k t_{k ℓ}}$ includes the ten baseline covariates, namely, baseLDLC, baseHDLC, baseTG, age, white, male, BMI, potency_med, potency_high, and duration, and the eleven-dimensional treatment indicators consisting of 1{PBO}, 1{S}, 1{A}, 1{L}, 1{R}, 1{P}, 1{E}, 1{SE}, 1{AE}, 1{LE}, and 1{PE}, where the indicator function 1{B} takes a value of “1” if B is true and a value of “0” if B is false.

In model (4), we compare eight sets of group indicator variables, one set of selected baseline variables, and a mixed set of group indicator variables and selected baseline variables. For the eight sets of group indicator variables, we replicate the eight different groupings in Li et al.,¹¹ where they first formulate possible grouping sets based on the clinical mechanism of action of treatments and then compare them using model comparison criteria. Briefly, the eight sets of group indicator variables are: $S_{1} = \{z_{k t_{k ℓ}} = 1, ℓ = 1, \dots, T_{k}, k = 1, \dots, K\}$ , $S_{2} = \{z_{k t_{k ℓ}}^{(1)} = 1; z_{k t_{k ℓ}}^{(2)} = 1 \text{ if } t_{k ℓ} = 2 / 3 / 4 / 5 / 6, \text{ o.w. } 0; z_{k t_{k ℓ}}^{(3)} = 1 \text{ if } t_{k ℓ} = 7, \text{ o.w. } 0; z_{k t_{k ℓ}}^{(4)} = 1 \text{ if } t_{k ℓ} = 8 / 9 / 10 / 11, \text{ o.w. } 0, ℓ = 1, \dots, T_{k}, k = 1, \dots, K\}$ . The set $S_{2}$ divides the treatments into four groups, in which PBO alone is the first group, all statins {S, A, L, R, P} are the second group, E is the third group, and all statins with E {SE, AE, LE, PE} are the fourth group. Similarly, we create sets $S_{3} - S_{5}$ by separating one statin (S, A or R) from the other statins. Sets $S_{6} - S_{8}$ are also similar to set $S_{2}$ but separate two statins (S and R, S and A, or A and R) from the other statins. Set $S_{9} = \{z_{k t_{k ℓ}}^{(1)} = 1; z_{k t_{k ℓ}}^{(2)} = {(\text{bl\_ldlc})}_{k t_{k ℓ}}; z_{k t_{k ℓ}}^{(3)} = {(\text{bl\_hdlc})}_{k t_{k ℓ}}; z_{k t_{k ℓ}}^{(4)} = {(\text{bl\_tg})}_{k t_{k ℓ}}, ℓ = 1, \dots, T_{k}, k = 1, \dots, K\}$ is chosen based on the plausibility that baseline cholesterol measures are related to the variability of treatment effects in lowering TG. In addition, we include a calibrated covariate set with a mixture of treatment indicators and baseline variables. The reasoning for the inclusion of each covariate is as follows. We have a binary covariate for PBO since PBO has the most different mechanism of action compared to other treatments. 
Also, from the discussion in Section 2, we include a binary covariate for R since treatment R has the widest range of variability in effects. Baseline TG is included to adjust for baseline effects on variances, and correspondingly baseline LDL-C is included since LDL-C and TG are highly related. Therefore, $S_{10} = \{z_{k t_{k ℓ}}^{(1)} = 1; z_{k t_{k ℓ}}^{(2)} = 1 \text{ if } t_{k ℓ} = 1, \text{ o.w. } 0; z_{k t_{k ℓ}}^{(3)} = 1 \text{ if } t_{k ℓ} = 5, \text{ o.w. } 0; z_{k t_{k ℓ}}^{(4)} = {(\text{bl\_ldlc})}_{k t_{k ℓ}}; z_{k t_{k ℓ}}^{(5)} = {(\text{bl\_tg})}_{k t_{k ℓ}}, ℓ = 1, \dots, T_{k}, k = 1, \dots, K\}$ .
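The covariate sets for the variance model can be encoded directly from the treatment IDs, as the sketch below illustrates for sets S₂ and S₁₀. The function names are hypothetical, the baseline values are assumed to be already standardized, and `linear_predictor` only forms the inner product z′ϕ that enters the log-linear variance model in (4); the exact link between this predictor and the variance is defined by model (4) and is not restated here.

```python
# Treatment IDs 1..11 follow the ordering given in the text.
TREATMENTS = ["PBO", "S", "A", "L", "R", "P", "E", "SE", "AE", "LE", "PE"]


def z_s2(t):
    """Set S2: intercept, statin alone, E alone, statin combined with E."""
    return [
        1.0,
        1.0 if t in (2, 3, 4, 5, 6) else 0.0,
        1.0 if t == 7 else 0.0,
        1.0 if t in (8, 9, 10, 11) else 0.0,
    ]


def z_s10(t, bl_ldlc, bl_tg):
    """Set S10: intercept, PBO indicator, R indicator, baseline LDL-C and TG."""
    return [
        1.0,
        1.0 if t == 1 else 0.0,
        1.0 if t == 5 else 0.0,
        bl_ldlc,
        bl_tg,
    ]


def linear_predictor(z, phi):
    """Inner product z' phi used by the log-linear variance model."""
    return sum(a * b for a, b in zip(z, phi))
```

For example, `z_s2(9)` describes an AE arm and returns the intercept together with the combination-treatment indicator, while `z_s10(5, x, y)` describes an R arm with standardized baseline LDL-C `x` and baseline TG `y`.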

In all of the computations, we standardized the covariates in order to achieve better MCMC convergence. We took every third sample and generated 20000 MCMC samples after a 2000-iteration burn-in. We checked MCMC convergence using standard Bayesian model assessment, including trace plots and sample autocorrelation plots. We see from Table 1 that some treatments in trials 12, 13, and 14 are not included in other trials. Thus, the calculation of the LPML excluded these trials. Table 5 compares the DIC and LPML values for the 10 sets of covariates in model (4) combined with six fixed values of v as well as a random v. Different covariate sets lead to different dimensions of ϕ, which are given in the first column of Table 5. We see from Table 5 that (i) for each of these 10 sets of covariates, the largest DIC and the smallest LPML are attained at v = ∞ (the normal distribution); (ii) set $S_{10}$ has the smallest DIC and the largest LPML among all 10 covariate sets; (iii) the values of DICs and LPMLs for a random v are close to those with v = 3 or v = 5 for most of these 10 sets of covariates; (iv) the combination of set $S_{10}$ with v = 2 fits the data best under DIC; and (v) the combination of set $S_{10}$ with v = 3 fits the data best under LPML. Without including the covariates in model (1), the DIC under set $S_{10}$ with v = 3 is 423.21. We compare the posterior estimates of the models with and without covariates later in this section. We also fit the data using a diagonal covariance matrix, that is, assuming no correlations among the random treatment effects. With covariate set $S_{10}$ and v = 3, the DIC under the diagonal covariance matrix is 387.86, which is larger than the DIC of 384.03 under our specified covariance matrix.

TABLE 5.

Model comparisons for the TG network meta-data using DIC and LPML.

Criterion	DIC							LPML

Covariate Set (dim(ϕ)^b); v^a =	2	3	5	10	20	∞	Random v	2	3	5	10	20	∞	Random v
$S_{1} (1)$	385.81	387.09	389.52	392.88	395.50	397.70	387.0137	−160.97	−161.90	−163.33	−165.22	−166.88	−171.17	−161.4639
$S_{2} (4)$	387.82	388.77	390.85	393.78	395.97	397.56	389.5591	−162.48	−163.15	−164.27	−166.43	−168.88	−169.12	−163.2149
$S_{3} (5)$	389.10	389.89	391.78	394.07	395.82	397.06	398.2215	−163.47	−164.28	−165.68	−167.78	−168.84	−169.21	−163.8937
$S_{4} (5)$	386.42	386.81	387.95	389.40	390.35	391.19	388.6203	−162.15	−163.16	−163.95	−164.45	−165.65	−167.85	−163.1581
$S_{5} (5)$	388.44	389.09	390.79	393.24	395.20	396.82	389.9858	−162.36	−163.25	−164.42	−166.03	−167.85	−169.45	−162.9580
$S_{6} (6)$	388.26	388.37	389.24	391.00	391.63	392.82	390.0239	−164.43	−164.33	−164.94	−167.09	−166.62	−167.17	−164.0532
$S_{7} (6)$	389.14	389.64	390.88	392.48	393.62	394.33	390.9370	−163.30	−164.25	−164.68	−166.88	−166.51	−167.65	−164.1565
$S_{8} (6)$	388.55	388.62	389.57	390.87	392.23	392.78	390.4394	−163.24	−163.39	−164.70	−166.49	−167.37	−168.03	−164.1556
$S_{9} (4)$	385.75	386.47	387.75	390.16	391.71	393.10	387.1559	−161.36	−162.03	−162.84	−164.09	−166.52	−166.21	−162.0838
$S_{10} (5)$	383.63	384.03	384.88	386.15	387.23	387.88	385.2640	−160.48	−160.17	−160.92	−161.84	−162.13	−164.13	−161.1602


^a Degrees of freedom.

^b Dimension of ϕ.

Table 6 presents the posterior means, posterior standard deviations (SDs), and 95% HPD intervals of the parameters under the best covariate set $S_{10}$ with v = 2 and v = 3. The posterior estimates for v = 2 are similar to those for v = 3. The posterior means (SDs and HPD intervals) for the baseline TG and race proportion of white are 1.86 (0.68 and [0.49, 3.18]) and −2.08 (0.42 and [−2.88, −1.24]), respectively, for v = 2; and 1.83 (0.70 and [0.43, 3.20]) and −2.01 (0.45 and [−2.86, −1.10]), respectively, for v = 3. None of these 95% HPD intervals includes 0, indicating a significant influence on the mean percent change from baseline in TG. As we have mentioned, the covariates were standardized in all computations. It is worth noting that one should not read too much into the individual covariate coefficients from the fitted model given in Table 6. These coefficients are dependent upon the trial populations and the terms included in the model. Instead, the model should be looked at and interpreted in its totality.

TABLE 6.

Posterior estimates of the parameters under set $S_{10}$ with v = 2 and v = 3.

		v = 2			v = 3
		Posterior			Posterior
Variable	Parameter	Mean	SD	95% HPD Interval	Mean	SD	95% HPD Interval

baseLDLC	β₁	0.005	0.48	(−0.92, 0.96)	−0.06	0.49	(−1.01, 0.91)
baseHDLC	β₂	1.58	0.92	(−0.26, 3.34)	1.40	0.94	(−0.48, 3.20)
baseTG	β₃	1.86	0.68	(0.49, 3.18)	1.83	0.70	(0.43, 3.20)
age	β₄	0.14	0.42	(−0.69, 0.96)	0.11	0.43	(−0.77, 0.93)
white	β₅	−2.08	0.42	(−2.88, −1.24)	−2.01	0.45	(−2.86, −1.10)
male	β₆	0.50	0.74	(−0.97, 1.95)	0.34	0.76	(−1.16, 1.82)
BMI	β₇	−0.79	0.51	(−1.78, 0.24)	−0.81	0.53	(−1.81, 0.26)
potency_med	β₈	2.30	3.19	(−3.95, 8.64)	2.52	3.25	(−3.77, 9.07)
potency_high	β₉	−0.79	3.40	(−7.65, 5.75)	−0.61	3.46	(−7.55, 6.14)
duration	β₁₀	0.13	0.49	(−0.81, 1.08)	0.21	0.51	(−0.78, 1.22)
PBO	β₁₁	0.79	5.75	(−10.51, 12.08)	1.03	5.85	(−10.88, 12.26)
S	β₁₂	−18.78	1.20	(−21.13, −16.41)	−18.88	1.23	(−21.28, −16.40)
A	β₁₃	−24.54	1.68	(−27.86, −21.26)	−24.53	1.72	(−27.87, −21.10)
L	β₁₄	−11.70	4.30	(−20.05, −3.53)	−11.70	4.25	(−20.09, −3.49)
R	β₁₅	−18.63	1.77	(−21.96, −15.03)	−18.50	1.83	(−22.06, −14.79)
P	β₁₆	−8.53	4.39	(−17.07, −0.17)	−8.41	4.35	(−16.74, 0.39)
E	β₁₇	−5.66	5.80	(−17.15, 5.70)	−5.47	5.90	(−16.93, 6.37)
SE	β₁₈	−21.75	1.78	(−25.23, −18.26)	−21.82	1.83	(−25.60, −18.38)
AE	β₁₉	−29.75	2.86	(−35.31, −24.22)	−29.94	2.96	(−35.61, −24.00)
LE	β₂₀	−25.95	3.50	(−32.83, −19.28)	−26.24	3.52	(−33.27, −19.52)
PE	β₂₁	−21.77	3.82	(−29.17, −14.62)	−22.13	3.71	(−29.47, −14.86)
Intercept	ϕ₁	−0.26	0.39	(−1.06, 0.46)	−0.38	0.51	(−1.41, 0.51)
PBO	ϕ₂	−0.70	1.07	(−2.83, 1.29)	−1.38	1.19	(−3.76, 0.41)
R	ϕ₃	0.50	0.42	(−0.30, 1.34)	0.22	0.16	(−0.11, 0.54)
baseLDLC	ϕ₄	−0.41	0.47	(−1.38, 0.44)	−0.41	0.46	(−1.32, 0.42)
baseTG	ϕ₅	0.27	0.30	(−0.31, 0.88)	0.23	0.28	(−0.31, 0.79)


We also calculated the posterior mean ranks for all treatments. For v = 3, the mean ranks of {PBO, S, A, L, R, P, E, SE, AE, LE, PE} are {10.97, 6.33, 2.97, 7.96, 6.63, 8.95, 9.58, 4.54, 1.21, 2.38, 4.50}, respectively. This suggests that the descending order of treatments, after adjusting for the 10 aggregate covariates, is AE, LE, A, PE, SE, S, R, L, P, E, PBO. This order reflects the same order of the posterior means of the overall treatment effects. The treatments have the same order for v = 2. We also plot the absolute and cumulative ranking probabilities for all treatments and calculate the surface under the cumulative ranking curve (SUCRA) in Figure 5. The higher the probability for a treatment having top ranks, the larger the area under the cumulative ranking line. Hence, larger SUCRA values represent better treatments on a scale from 0 to 1. It is worth noting that a SUCRA of value p is related to the mean rank r through the transformation r = 1 + (1 − p)(T − 1), where T is the number of treatments in the data³⁰. The mean ranks of the treatments for v = 3, without adjusting for the covariates, are {10.92, 6.29, 3.80, 8.22, 4.90, 9.16, 9.24, 2.22, 1.09, 4.09, 6.06}. The suggested descending order becomes AE, SE, A, LE, R, PE, S, L, P, E, PBO. The best treatment is always E combined with A and the worst two treatments are PBO and E alone. The effects of the combination treatments SE, LE, PE and the statin R differ substantially with and without covariate adjustment.
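The ranking summaries described above can be sketched as follows. The code assumes that posterior draws of the treatment effects are available as a list of per-iteration vectors and that a more negative effect (a larger TG reduction) is better; the function names and input layout are hypothetical. The SUCRA conversion simply inverts the stated relation r = 1 + (1 − p)(T − 1).

```python
def mean_ranks(effect_draws):
    """effect_draws[m][t]: draw m of the effect of treatment t.

    Within each draw the most negative effect receives rank 1.
    """
    n_trt = len(effect_draws[0])
    totals = [0.0] * n_trt
    for draw in effect_draws:
        order = sorted(range(n_trt), key=lambda t: draw[t])
        for rank, t in enumerate(order, start=1):
            totals[t] += rank
    return [x / len(effect_draws) for x in totals]


def sucra_from_mean_rank(r, n_trt):
    """Invert r = 1 + (1 - p)(T - 1) to obtain p = (T - r) / (T - 1)."""
    return (n_trt - r) / (n_trt - 1)


def mean_rank_from_sucra(p, n_trt):
    """Forward relation between SUCRA p and mean rank r."""
    return 1.0 + (1.0 - p) * (n_trt - 1)
```

Applied to the mean ranks reported for v = 3 with T = 11, a mean rank of 1.21 (AE) corresponds to a SUCRA of about 0.979 and a mean rank of 10.97 (PBO) to about 0.003.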

FIGURE 5.

Plots of ranking probabilities for all treatment arms: (a) v = 2, (b) v = 3. The green dashed line represents the absolute probability and the orange solid line represents the cumulative probability. Treatment abbreviations and the SUCRA values are on the top of each sub-plot.

Figure 6 exhibits the posterior means and 95% HPD intervals of pairwise treatment differences under set $S_{10}$ for v = 2 and v = 3 with and without adjusting for covariates. We observe from this figure that (i) without covariate adjustment, treatment P does not provide a substantially higher reduction in TG than PBO, whereas with covariate adjustment, all active treatments provide substantially higher reductions in TG than PBO; (ii) without covariate adjustment, SE and AE have significantly higher TG reductions than their respective statins, while LE and PE do not; (iii) with covariate adjustment, AE, LE and PE provide significantly higher TG reductions than their respective statins, while SE does not; (iv) except for PE, all combination treatments provide significantly higher TG reductions than E alone with or without adjusting for covariates; (v) AE provides a substantially higher TG reduction than PE without adjusting for covariates and a substantially higher TG reduction than SE when adjusting for covariates; (vi) the four combination treatments do not have substantial differences although they do have a treatment order; and (vii) the results for v = 2 are similar to those for v = 3. Finally, we note that covariate adjustment is critical when the studies are not sufficiently similar in clinical characteristics.
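For readers who wish to reproduce summaries of this kind from MCMC output, the sketch below computes a posterior mean and a 95% HPD interval for a pairwise treatment difference using the shortest-interval approach for sorted draws described by Chen, Shao and Ibrahim²⁷. The draws are synthetic and the treatment labels are placeholders; none of the numbers come from the TNM data. Treating an interval that excludes zero as a substantial difference is an assumption made here for illustration.

```python
import random


def hpd_interval(draws, cred_mass=0.95):
    """Shortest interval spanning int(cred_mass * n) consecutive gaps between
    sorted draws (assumes a unimodal posterior)."""
    ordered = sorted(draws)
    n = len(ordered)
    n_in = int(cred_mass * n)
    if n_in < 1 or n_in >= n:
        raise ValueError("need more draws for the requested credible mass")
    # Candidate intervals (ordered[i], ordered[i + n_in]); keep the narrowest.
    width, start = min(
        (ordered[i + n_in] - ordered[i], i) for i in range(n - n_in)
    )
    return ordered[start], ordered[start + n_in]


# Synthetic posterior draws of two overall treatment effects (hypothetical values).
rng = random.Random(0)
effect_combo = [rng.gauss(-20.0, 2.0) for _ in range(4000)]
effect_placebo = [rng.gauss(-5.0, 2.0) for _ in range(4000)]

# Draw-by-draw difference keeps the joint posterior structure of the two effects.
difference = [a - b for a, b in zip(effect_combo, effect_placebo)]
post_mean = sum(difference) / len(difference)
lower, upper = hpd_interval(difference)
excludes_zero = upper < 0.0 or lower > 0.0

print(f"posterior mean {post_mean:.2f}, 95% HPD ({lower:.2f}, {upper:.2f})")
print("interval excludes zero:", excludes_zero)
```

Computing the difference within each MCMC iteration, rather than differencing the two marginal summaries, is what allows the interval to reflect posterior correlation between the treatment effects.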

Plot of posterior means and 95% HPD intervals of treatment differences under the models with v = 2 and v = 3. The reference treatment being compared in each sub-plot is listed at the bottom. The red color represents estimates with covariate adjustment, while the blue color represents estimates without covariate adjustment.

7 |. DISCUSSION

In this paper, we have proposed a general multivariate t distribution for the random treatment effects. The multivariate t distribution is a natural choice because it extends the multivariate normal distribution to accommodate heavy tails. The Cochrane Handbook²³ has an independent section about meta-analysis of skewed data and gives instructions on diagnosing and fixing skewed outcomes in meta-analysis. Other authors³¹,³² have also contributed to this topic. However, none of this work has been applied to the network meta-regression model.

In (1), we assume a normal distribution for the random error ϵ_kt. The rationale for this assumption is that our outcome variable is aggregate and generally represents a sample mean, so the Central Limit Theorem implies that the aggregate response approximately follows a normal distribution. However, for individual patient data, the random error should be assumed to follow a multivariate t distribution. In this paper, we consider both fixed and random v (the degrees of freedom in the multivariate t distribution) and use DIC and LPML to guide the choice of model. Based on our empirical results, the posterior estimates of the regression coefficients, which include the overall treatment effects, are relatively robust with respect to moderate misspecification of the degrees of freedom v. In clinical practice, it may be sufficient to default to the hierarchical prior for v as a way of data-driven model selection. Ideally, however, it is advised to compare a heavy-tailed t distribution (2 or 3 degrees of freedom), a moderate-tailed t distribution (e.g., 8–10 degrees of freedom), which is approximately a logistic distribution, and a light-tailed t distribution (20 or more degrees of freedom).
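To make the distinction among heavy-, moderate- and light-tailed choices of v concrete, the following sketch simulates from a bivariate t distribution through its standard representation as a scale mixture of normals and compares how often one coordinate exceeds three scale units. It is an illustration under stated assumptions, not the sampler used in the paper: the function names, the correlation of 0.5, the threshold and the number of draws are all hypothetical choices.

```python
import math
import random


def draw_bivariate_t(rng, df, rho):
    """One draw from a zero-mean bivariate t with unit scales and correlation rho."""
    # Mixing weight: Gamma with shape df/2 and scale 2/df, so its mean is 1.
    lam = rng.gammavariate(df / 2.0, 2.0 / df)
    # Correlated standard normals via the Cholesky factor of [[1, rho], [rho, 1]].
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Dividing by sqrt(lam) inflates the normal draw when lam is small.
    scale = 1.0 / math.sqrt(lam)
    return z1 * scale, z2 * scale


def tail_frequency(df, n_draws=20000, threshold=3.0, rho=0.5, seed=1):
    """Monte Carlo estimate of P(|first coordinate| > threshold)."""
    rng = random.Random(seed)
    hits = sum(
        abs(draw_bivariate_t(rng, df, rho)[0]) > threshold for _ in range(n_draws)
    )
    return hits / n_draws


tails = {df: tail_frequency(df) for df in (3, 10, 30)}
for df, freq in tails.items():
    print(f"df = {df:>2}: estimated P(|x| > 3) = {freq:.4f}")
```

As the degrees of freedom grow, the mixing weight concentrates at 1 and the draws behave increasingly like multivariate normal draws, which is the limiting behavior noted in the abstract.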

In addition to the network meta-regression model, we have developed an innovative log-linear regression model for the variances of the random effects and used an unstructured correlation matrix. The variances of the random effects can be estimated through the log-linear regression and thus depend on the treatment-by-study-level covariates. The log-linear regression relaxes the strong assumption of homogeneous variances. Our specified correlation matrix also exhibited a better fit to the data than the diagonal covariance matrix. Although our results show that the correlations of treatment effects are necessary, it is difficult to accurately estimate all correlations because the arms observed in each study are incomplete. In this situation, more data need to be collected or stronger assumptions need to be applied. Potential future work would be to explore the weakest assumptions for the correlations so that the correlation parameters can be fully determined by the data and the posterior estimates can be close to the true correlations.
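The idea of covariate-dependent variances combined with a correlation matrix can be sketched in a few lines. In the example below, the linear predictor is assumed to be on the log-standard-deviation scale; if the model is instead specified on the log-variance scale, the exponentiated value should be read as a variance. The covariate vectors, coefficient values and correlation matrix are hypothetical and are not estimates from the paper.

```python
import math


def random_effect_sd(covariates, phi):
    """exp(z' phi): assumed log-linear model for a random-effect standard deviation."""
    return math.exp(sum(z * p for z, p in zip(covariates, phi)))


def covariance_from_sds(sds, corr):
    """Combine standard deviations and a correlation matrix: S[i][j] = sd_i * rho_ij * sd_j."""
    size = len(sds)
    return [[sds[i] * corr[i][j] * sds[j] for j in range(size)] for i in range(size)]


# Hypothetical coefficients: intercept plus two aggregate covariates.
phi = [0.2, -0.4, 0.3]

# Hypothetical covariate vectors (leading 1 for the intercept) for three arms of one study.
arm_covariates = [
    [1.0, 0.5, -0.2],
    [1.0, 0.1, 0.4],
    [1.0, -0.3, 0.0],
]

# Hypothetical unstructured correlation matrix for the three arms.
corr = [
    [1.0, 0.3, 0.1],
    [0.3, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]

sds = [random_effect_sd(z, phi) for z in arm_covariates]
cov = covariance_from_sds(sds, corr)

for arm, (sd, row) in enumerate(zip(sds, cov), start=1):
    print(f"arm {arm}: sd = {sd:.3f}, covariance row = {[round(c, 3) for c in row]}")
```

Because the standard deviation is an exponentiated linear predictor, it stays positive for any coefficient values, and arms with different covariate values receive different variances. Setting all covariate coefficients to zero recovers a homogeneous variance.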

Supplementary Material

NIHMS1713492-supplement-supp.pdf (134.6 KB, PDF)

ACKNOWLEDGEMENTS

We would like to thank the Editor, the Associate Editor, and two reviewers for their very helpful comments and suggestions, which have led to a much improved version of the paper. Dr. Chen and Dr. Ibrahim’s research was partially supported by NIH grants #GM70335 and #P01CA142538, and Merck & Co., Inc., Kenilworth, NJ, USA. Dr. Kim’s research was supported by the Intramural Research Program of National Institutes of Health, National Cancer Institute.

References

1. Vrablik M, Holmes D, Forer B, Juren A, Martinka P, Frohlich J. Use of ezetimibe results in more patients reaching lipid targets without side effects. Cor Et Vasa 2014; 56(2): e128–e132.
2. Heron M. Deaths: Leading Causes for 2017. National Vital Statistics Reports 2019; 68(6): June 24.
3. Miller M, Stone N, Ballantyne C, et al. Triglycerides and cardiovascular disease: a scientific statement from the American Heart Association. Circulation 2011; 123: 2292–2333.
4. Nordestgaard B, Varbo A. Triglycerides and cardiovascular disease. Lancet 2014; 384: 626–635.
5. Dewey F, Gusarova V, Dunbar R, et al. Genetic and pharmacologic inactivation of ANGPTL3 and cardiovascular disease. New England Journal of Medicine 2017; 377: 211–221.
6. Jorgensen A, Frikke-Schmidt R, Nordestgaard B, Tybjarg-Hansen A. Loss-of-function mutations in APOC3 and risk of ischemic vascular disease. New England Journal of Medicine 2014; 371: 32–41.
7. Austin M, Hokanson J, Edwards K. Hypertriglyceridemia as a cardiovascular risk factor. American Journal of Cardiology 1998; 81: 7B–12B.
8. Sarwar N, Danesh J, Eiriksdottir G, et al. Triglycerides and the risk of coronary heart disease: 10,158 incident cases among 262,525 participants in 29 Western prospective studies. Circulation 2007; 115: 450–458.
9. Hokanson J, Austin M. Plasma triglyceride level is a risk factor for cardiovascular disease independent of high-density lipoprotein cholesterol level: a meta-analysis of population-based prospective studies. Journal of Cardiovascular Risk 1996; 3: 213–219.
10. Bhatt D, Steg P, Miller M, et al. Cardiovascular Risk Reduction with Icosapent Ethyl for Hypertriglyceridemia. New England Journal of Medicine 2019; 380: 11–22.
11. Li H, Chen MH, Ibrahim JG, et al. Bayesian inference for network meta-regression using multivariate random effects with applications to cholesterol lowering drugs. Biostatistics 2019; 20(3): 499–516. doi:10.1093/biostatistics/kxy014.
12. Dias S, Sutton AJ, Welton NJ, Ades AE. Evidence synthesis for decision making 3: heterogeneity–subgroups, meta-regression, bias, and bias-adjustment. Medical Decision Making 2013; 33(5): 618–640.
13. Tu YK. Use of generalized linear mixed models for network meta-analysis. Medical Decision Making 2014; 34(7): 911–918.
14. Batson S, Sutton A, Abrams K. Exploratory Network Meta Regression Analysis of Stroke Prevention in Atrial Fibrillation Fails to Identify Any Interactions with Treatment Effect. PLoS ONE 2016; 11(8): e0161864.
15. Zhang J, Carlin BP, Neaton JD, et al. Network Meta-analysis of Randomized Clinical Trials: Reporting the Proper Summaries. Clinical Trials 2014; 11(2): 246–262.
16. Zhang J, Fu H, Carlin BP. Detecting outlying trials in network meta-analysis. Statistics in Medicine 2015; 34(19): 2695–2707.
17. Hong H, Chu H, Zhang J, Carlin BP. A Bayesian missing data framework for generalized multiple outcome mixed treatment comparisons. Research Synthesis Methods 2016; 7(1): 6–22.
18. Gunhan BK, Friede T, Held L. A design-by-treatment interaction model for network meta-analysis and meta-regression with integrated nested Laplace approximations. Research Synthesis Methods 2018; 9(2): 179–194.
19. Donegan S, Dias S, Tudur-Smith C, Marinho V, Welton NJ. Graphs of study contributions and covariate distributions for network meta-regression. Research Synthesis Methods 2018; 9(2): 243–260.
20. Higgins JPT, Jackson D, Barrett JK, Lu G, Ades AE, White IR. Consistency and inconsistency in network meta-analysis: concepts and models for multi-arm studies. Research Synthesis Methods 2012; 3(2): 98–110.
21. Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Statistics in Medicine 2004; 23(20): 3105–3124.
22. Lu G, Ades AE. Modeling between-trial variance structure in mixed treatment comparisons. Biostatistics 2009; 10(4): 792–805.
23. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0. 2011. http://handbook.cochrane.org, The Cochrane Collaboration.
24. Yao H, Chen M, Qiu C. Bayesian Modeling and Inference for Meta Data with Applications in Efficacy Evaluation of an Allergic Rhinitis Drug. Journal of Biopharmaceutical Statistics 2011; 21(5): 992–1005.
25. Hong H, Price KL, Fu H, Carlin BP. Bayesian network meta-analysis for multiple endpoints. In: Morton S, Gatsonis C, eds. Methods in Comparative Effectiveness Research. Boca Raton, FL: CRC Press. 2017.
26. Yao H, Kim S, Chen MH, Ibrahim JG, Shah AK, Lin J. Bayesian Inference for Multivariate Meta-regression with Partially Observed Within-Study Sample Covariance Matrix. Journal of the American Statistical Association 2015; 110(510): 528–544.
27. Chen MH, Shao QM, Ibrahim JG. Monte Carlo methods in Bayesian computation. New York: Springer. 2000. ISBN 978-1-4612-1276-8.
28. Joe H. Generating random correlation matrices based on partial correlations. Journal of Multivariate Analysis 2006; 97(10): 2177–2189.
29. Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2002; 64(4): 583–639.
30. Salanti G, Ades AE, Ioannidis JP. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. Journal of Clinical Epidemiology 2011; 64(2): 163–171.
31. Higgins J, White IR, Anzures-Cabrera J. Meta-analysis of skewed data: Combining results reported on log-transformed or raw scales. Statistics in Medicine 2008; 27(29): 6072–6092.
32. Greco T, Biondi-Zoccai G, Gemma M, Guérin C, Zangrillo A, Landoni G. How to impute study-specific standard deviations in meta-analyses of skewed continuous endpoints? World Journal of Meta-Analysis 2015; 3(5): 215–224.
