Bayesian flexible hierarchical skew heavy-tailed multivariate meta regression models for individual patient data with applications

Sungduk Kim; Ming-Hui Chen; Joseph Ibrahim; Arvind Shah; Jianxin Lin

doi:10.4310/sii.2020.v13.n4.a6

Author manuscript; available in PMC: 2020 Aug 26.

Published in final edited form as: Stat Interface. 2020;13(4):485–500. doi: 10.4310/sii.2020.v13.n4.a6

PMCID: PMC7448754 NIHMSID: NIHMS1619624 PMID: 32855761

Abstract

A flexible class of multivariate meta-regression models is proposed for Individual Patient Data (IPD). The methodology is motivated by 26 pivotal Merck clinical trials that compare statins (cholesterol lowering drugs) in combination with ezetimibe and statins alone on treatment-naïve patients and those continuing on statins at baseline. The research goal is to jointly analyze the multivariate outcomes, Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG). These three continuous outcome measures are correlated and shed much light on a subject’s lipid status. The proposed multivariate meta-regression models allow for different skewness parameters and different degrees of freedom for the multivariate outcomes from different trials under a general class of skew t-distributions. The theoretical properties of the proposed models are examined and an efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for carrying out Bayesian inference under the proposed multivariate meta-regression model. In addition, the Conditional Predictive Ordinates (CPOs) are computed via an efficient Monte Carlo method. Consequently, the logarithm of the pseudo marginal likelihood and Bayesian residuals are obtained for model comparison and assessment, respectively. A detailed analysis of the IPD meta data from the 26 Merck clinical trials is carried out to demonstrate the usefulness of the proposed methodology.

Keywords and phrases: Collapsed Gibbs, CPO Identity II, Heterogeneity, Multi-dimensional random effects, Multiple trials, Outlying trials

1. INTRODUCTION

According to the National Vital Statistics Reports (Heron 2019), cardiovascular disease (CVD) continues to be the leading cause of death for both men and women. This is the case in the U.S. and worldwide. More than half of all people who die due to heart disease are men. It has been confirmed that increased low density lipoprotein cholesterol (LDL-C) is an independent risk factor for CVD. LDL-C lowering has been consistently shown to reduce the risk of CVD. One large meta-analysis (Baigent et al. 2010) of statin clinical trials shows a progressive reduction in risk of major CVD events with lower on-treatment LDL-C levels. Although LDL-C is a primary cause of CVD, other risk factors also contribute, for example high-density lipoprotein cholesterol (HDL-C) and triglycerides (TG). Large cohort studies show a strong and inverse relationship of HDL-C levels with the risk of incident CVD independent of other lipids. Higher HDL-C is associated with a decreased risk of coronary heart disease (CHD). As defined by the US National Cholesterol Education Program Adult Treatment Panel III guidelines (ATP III 2001), an HDL-C level of 60 mg/dL or greater is a negative risk factor. A long-standing association exists between elevated triglyceride levels and CVD (Austin et al. 1998; Sarwar et al. 2007). In a meta-analysis of 17 studies, increased triglyceride levels were associated with increased coronary disease risk in both men and women, after adjustment for HDL-C and other risk factors (Hokanson and Austin 1996). A randomized, controlled clinical trial, REDUCE-IT (Bhatt et al. 2019), has demonstrated that intervention to lower triglyceride levels is associated with reduced CVD events. Among lipid-lowering drugs, statins are the cornerstone of therapy. Ezetimibe is the most commonly used nonstatin agent. It lowers LDL-C levels by 13% to 20% and has a low incidence of side effects (Cannon et al. 2015; Kashani et al. 2008).

Meta-regression (MR) of individual patient data (IPD) is an effective modeling tool for explaining heterogeneity between trials, synthesizing evidence across studies, investigating individual-level interactions, or identifying subgroups (Simmonds and Higgins 2007; Ritz et al. 2008; Kim et al. 2013; Riley et al. 2015; Burke et al. 2017; Belias et al. 2019; Ibrahim et al. 2019). In dealing with IPD multivariate meta-data, it is often the case that the data are highly skewed and/or heavy-tailed, so that non-normal distributions are needed to properly model the response variables that exhibit these features.

The modeling framework proposed here is motivated by multivariate IPD data from 26 Merck clinical trials for cholesterol lowering drugs. In our application, we consider a three-dimensional continuous response, in which some components of the response variable may have heavy-tailed and/or skew distributions, and some components may have symmetric and/or light-tailed distributions. We do not know which are which in advance and our hope is to develop a model that accommodates this flexibility. Thus, in these settings, one needs more complex models than the traditional linear mixed model. There is abundant literature on using skew and/or heavy-tailed distributions for modeling univariate and/or multivariate data (Chen et al. 1999; Branco and Dey 2001; Azzalini and Capitanio 2003; Sahu et al. 2003; Genton 2004; Adcock 2004; Kim et al. 2008; Arellano-Valle and Genton 2010; Chang and Zimmerman 2016), but the literature is sparse in the multivariate MR setting (Kim et al. 2013; Ibrahim et al. 2019). One of the challenges of modeling skew and heavy-tailed distributions in the MR setting is that one needs to develop a flexible class of models that allow certain components of the multivariate response to have skew and/or heavy-tailed distributions while allowing other components to have symmetric and/or light-tailed distributions, and that at the same time capture heterogeneity between trials via appropriate random effects. Since the skewness and heaviness of the tails will not be known in advance, one needs to also model the skew parameters and scale parameters appropriately to correctly capture the data structure in this complex multivariate MR setting.

Instead of using a Box-Cox transformation on the multivariate response variables as in Kim et al. (2013), we extend the multivariate skew MR models of Ibrahim et al. (2019) to develop a flexible class of multivariate MR models that accommodate skewness and heavy-tailed distributions for multivariate meta-data. Under the proposed models, we first assume that the skewness parameters, the covariance matrices of the multivariate responses, and the degrees of freedom in the multivariate t distributions for the error terms are different across trials at the first stage, and then assume hierarchical priors for the multivariate random effects, the skewness parameters, the covariance matrices, and the degrees of freedom at the second stage. The proposed models are very flexible and general. As empirically shown in Section 5, the proposed model leads to a substantial gain in the goodness-of-fit of the multivariate IPD data from the 26 Merck clinical trials. Due to the complexity and computational challenge of the proposed models, an efficient Markov chain Monte Carlo (MCMC) sampling algorithm is developed for sampling from the posterior distribution of the model parameters. Moreover, one needs to develop and use model assessment tools to examine and find the best fitting models. In this paper, we consider the Conditional Predictive Ordinate (CPO) and develop its computational implementation for the proposed model. It is shown that CPO along with our flexible modeling framework identifies a more suitable model than a traditional MR model.

The rest of the paper is organized as follows. In Section 2, we discuss the Merck cholesterol data in detail. Sections 3.1–3.2 lay out the modeling details of our proposed multivariate skew heavy-tailed random effects meta-regression model, the prior distributions, properties of the proposed model as well as a flow diagram explaining all of the components of the model. Section 3.3 develops the likelihood function and joint posterior distribution of all the parameters. Section 4.1 develops new analytic and computational results for CPO, and Section 4.2 gives details of the MCMC computational development. Section 5 presents a detailed analysis of the cholesterol data showing that our proposed model provides a better interpretation and fit over the existing MR model. We close the paper with a discussion in Section 6.

2. THE CHOLESTEROL DATA

We consider the individual patient data (IPD) from 26 Merck-sponsored double-blind, randomized, active or placebo-controlled clinical trials on adult patients with primary hypercholesterolemia, which were analyzed by Kim et al. (2013) and Ibrahim et al. (2019). The IPD data considered in our analyses are a subset of the meta-data published in Leiter et al. (2011). The citations of the primary papers published in clinical journals for these 26 trials can be found in Leiter et al. (2011). A detailed summary of the covariates was provided in Tables 1 and 2 of Kim et al. (2013).

Table 1.

Values of LPML Measure under Various Models for the Cholesterol Data

Flexible skew t model
	τ = 5	τ = 10	τ = 20	τ = 30	random ν_a & τ
ν_a = 20	−261324.67	−261391.54	−261423.48	−261455.18	−261321.15
ν_a = 25	−261322.38	−261346.60	−261413.49	−261448.32
ν_a = 30	−261323.05	−261358.94	−261415.45	−261450.02
skew t model with τ = 30
−262517.61

Table 2.

Posterior Estimates under the flexible skew t model with random ν_a and τ for LDL-C

	Par.	Mean	SD	95% HPD
bl_ldlc	β_1,1	−0.031	0.003	(−0.037, −0.025)
bl_hdlc	β_1,2	0.013	0.009	(−0.005, 0.032)
bl_tg	β_1,3	0.007	0.001	(0.004, 0.010)
BMI	β_1,4	0.069	0.017	(0.035, 0.102)
age	β_1,5	−0.144	0.009	(−0.162, −0.126)
duration	β_1,6	0.338	0.282	(−0.217, 0.895)
Female	β_1,7	−1.239	0.206	(−1.621, −0.812)
DM	β_1,8	−2.315	0.278	(−2.853, −1.769)
CHD	β_1,9	0.258	0.270	(−0.301, 0.763)
potency2	β_1,10	−7.319	0.327	(−7.954, −6.669)
potency3	β_1,11	−15.738	0.375	(−16.438, −14.968)
Black	β_1,12	2.220	0.394	(1.433, 2.975)
Hispanic	β_1,13	0.397	0.456	(−0.519, 1.269)
Other	β_1,14	−2.077	0.490	(−3.046, −1.146)
onstatin × bl_ldlc	β_1,15	−0.058	0.007	(−0.072, −0.045)
onstatin × bl_hdlc	β_1,16	−0.074	0.017	(−0.109, −0.041)
onstatin × bl_tg	β_1,17	−0.005	0.003	(−0.011, 0.001)
onstatin × BMI	β_1,18	−0.068	0.034	(−0.135, −0.004)
onstatin × age	β_1,19	0.108	0.018	(0.074, 0.145)
onstatin × duration	β_1,20	0.155	0.633	(−1.087, 1.398)
onstatin × Female	β_1,21	2.722	0.396	(1.956, 3.497)
onstatin × DM	β_1,22	−0.654	0.478	(−1.598, 0.281)
onstatin × CHD	β_1,23	−0.862	0.504	(−1.855, 0.106)
onstatin × potency2	β_1,24	6.144	0.655	(4.873, 7.431)
onstatin × potency3	β_1,25	13.924	0.835	(12.336, 15.597)
onstatin × Black	β_1,26	−2.630	0.851	(−4.272, −0.953)
onstatin × Hispanic	β_1,27	−2.094	1.020	(−4.151, −0.139)
onstatin × Other	β_1,28	3.108	0.962	(1.213, 4.981)

The primary goal of these clinical trials was to evaluate the LDL-C lowering effects of ezetimibe (which works in the digestive tract) in combination with statin (which works in the liver) in comparison with statin alone on treatment-naïve patients at baseline (on a first-line therapy) or patients who underwent washout of previous lipid-modifying therapy at baseline (on a second-line therapy). In our analyses, different statins and their doses are combined to form the “statin” and “statin + ezetimibe” treatment groups. Ezetimibe (EZE) is available at only one dose of 10 mg, and the statins used in these trials included simvastatin, atorvastatin, lovastatin, rosuvastatin, pravastatin, and fluvastatin. Statin potency describes the chemical/medicinal strength of the statin, and it was categorized into three potency classes (low, medium, and high). The covariates include treatment (trt: 0 = Statin and 1 = Statin + EZE), baseline LDL-C (bl-ldlc), baseline HDL-C (bl-hdlc), baseline TG (bl-tg), age, race (White (reference), Black, Hispanic, and Other), gender (Female: 0 = male (reference), 1 = female), diabetes (DM: 0 = No, 1 = Yes), coronary heart disease (CHD: 0 = No, 1 = Yes), body mass index (BMI), statin potency (low (reference), med (potency2), and high (potency3)), and trial duration (duration) (6–12 weeks). Tables 1 and 2 of Kim et al. (2013) show a considerable amount of heterogeneity in the covariates across the trials. Therefore, to examine the treatment effects, there is a need to adjust for these covariates. We consider three primary outcome variables including percent changes from baseline in LDL-C, HDL-C, and TG. For ease of presentation, we simply denote these three outcome variables by LDL-C, HDL-C, and TG. As empirically shown in Kim et al. (2013) and Ibrahim et al. (2019), skew and heavy-tailed distributions are needed for modeling these three primary outcome variables.

3. MULTIVARIATE META REGRESSION MODELS

Consider K randomized trials, where each trial has two treatment arms (“Statin” or “Statin + EZE”), and patients in each trial were either all on statin or all not on statin prior to the trial. The sample size of the individual patient data for the k^th trial is n_k. Let y_ik = (y_i1k,…, y_iJk)′ denote a J-dimensional vector of the responses for the i^th patient in the k^th trial. Also let trt_ik = 1 if the i^th patient received “Statin + EZE” and 0 if “Statin” alone, and onstatin_k = 1 if patients were on statin and 0 if not on statin prior to the trial. Furthermore, let x_ijk denote a p_j-dimensional vector of covariates for the j^th response corresponding to the i^th patient, and $β_{j} = {(β_{j 1}, \dots, β_{j p_{j}})}^{'}$ is the vector of fixed effects regression coefficients corresponding to the p_j covariates. Let w_ijk denote a q_j-dimensional vector for the random effects.

3.1. Preliminary

The multivariate skew meta-regression model proposed by Ibrahim et al. (2019) is given by

y_{i j k} = [γ_{j k 0} + γ_{j k 1} {trt}_{i k}] (1 - {onstatin}_{k}) + [γ_{j k 2} + γ_{j k 3} {trt}_{i k}] {onstatin}_{k} + δ_{j} (z_{i j k} - E [z_{i j k}]) + x_{i j k}^{'} β_{j} + ϵ_{i j k}, {(ϵ_{i 1 k}, \dots, ϵ_{i J k})}^{'} ∣ λ_{i k} ~ N (0, λ_{i k}^{- 1} Σ), λ_{i k} ~ Gamma (ν / 2, ν / 2),

(3.1)

where z_ijk | ψ_ik follows an exponential distribution with mean $\frac{1}{ψ_{i k}}$, ψ_ik ~ Gamma(τ + 1, τ), Gamma(a, b) denotes the gamma distribution with mean a/b and variance a/b², and γ_jk = (γ_jk0, γ_jk1, γ_jk2, γ_jk3)′ ~ N₄(γ_j, Ω_j) with γ_j = (γ_j0, γ_j1, γ_j2, γ_j3)′. The model in (3.1) assumes that z_i1k, …, z_iJk are dependent and that the covariance matrices, skewness parameters, and degrees of freedom are the same across the K trials. Let $λ_{k} = \sum_{i = 1}^{n_{k}} λ_{i k}$. Following Ibrahim et al. (2019), we define

λ_{k}^{*} = (λ_{k} - n_{k}) / {(n_{k} / (ν / 2))}^{1 / 2} .

(3.2)

Then, we have $E [λ_{k}^{*}] = 0$ and $Var (λ_{k}^{*}) = 1$ a priori, where the expectation and variance are taken with respect to the Gamma(n_kν/2, ν/2) distribution of λ_k, and consequently, $λ_{k}^{*}$ approximately follows the standard N(0, 1) distribution. Figure 4 of Ibrahim et al. (2019) shows the boxplots of the $λ_{k}^{*}$'s from the posterior distribution under the skew t model in (3.1) with τ = 30 and an unstructured Σ for the cholesterol data. From these boxplots, it was found that the posterior distributions of the $λ_{k}^{*}$'s substantially depart from their prior distributions for three trials. The reasons for such a discrepancy between the prior and posterior distributions of $λ_{k}^{*}$ could be two-fold: (i) outlying trials or (ii) lack of fit due to the use of the same degrees of freedom ν across all trials.
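The prior standardization in (3.2) can be checked by simulation. The following is a minimal Python sketch, not the authors' code: the function names, the trial size n_k = 100, and the value ν = 6 are illustrative assumptions. It draws the λ_ik's from Gamma(ν/2, ν/2), forms $λ_{k}^{*}$, and reports the empirical mean and variance, which should be close to 0 and 1.

```python
import math
import random
import statistics


def standardized_precision_sum(lam, nu):
    """lambda_k^* from (3.2) for the mixing variables lam of a single trial."""
    n_k = len(lam)
    return (sum(lam) - n_k) / math.sqrt(n_k / (nu / 2.0))


def simulate_prior_draws(n_k, nu, n_rep, seed=0):
    """Draw lambda_k^* n_rep times under the prior lambda_ik ~ Gamma(nu/2, nu/2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_rep):
        # The paper uses a shape/rate form; gammavariate takes shape and SCALE,
        # so rate nu/2 becomes scale 2/nu.
        lam = [rng.gammavariate(nu / 2.0, 2.0 / nu) for _ in range(n_k)]
        draws.append(standardized_precision_sum(lam, nu))
    return draws


# Illustrative (assumed) settings: one trial of 100 patients, nu = 6.
draws = simulate_prior_draws(n_k=100, nu=6.0, n_rep=2000)
print("prior mean of lambda_k^*:", round(statistics.fmean(draws), 3))
print("prior variance of lambda_k^*:", round(statistics.pvariance(draws), 3))
```

Posterior draws of $λ_{k}^{*}$ that sit far from this N(0, 1) reference are what flag a trial as outlying or poorly fitted.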

Figure 4. — Plots of the posterior estimates (mean and 95% HPD interval) for ν_k under the flexible skew model with random ν_a and τ for the cholesterol data.

3.2. Hierarchical skew heavy-tailed multivariate meta regression models

We propose the following flexible hierarchical skew heavy-tailed multivariate meta-regression models.

Stage 1: Model for Multivariate Responses

y_{i j k} = x_{i j k}^{'} β_{j} + w_{i k}^{'} γ_{j k} + δ_{j k} (z_{i j k} - E [z_{i j k}]) + ϵ_{i j k},

(3.3)

where $γ_{j k} = {(γ_{j 1 k}, \dots, γ_{j q_{j} k})}^{'}$ represents the q_j-dimensional vector of random effects for the j^th response, δ_jk is a skewness parameter for the j^th response in the k^th trial, z_ijk is the skewness latent variable with the expected value E[z_ijk], and ϵ_ik = (ϵ_i1k, …, ϵ_iJk)′ is the vector of error terms. We assume

ϵ_{i k} ∣ λ_{i k} ~ N (0, λ_{i k}^{- 1} Σ_{k}) and λ_{i k} ~ Gamma (ν_{k} / 2, ν_{k} / 2),

(3.4)

where Σ_k is a J × J positive definite covariance matrix and ν_k > 0 is an unknown parameter. In (3.4), ν_k corresponds to the degrees of freedom, and the variance of ϵ_ik is finite for all ν_k > 2. Also, in (3.3), we assume that

z_{i j k} ∣ ψ_{i k} ~ \mathcal{E} (ψ_{i k}) and ψ_{i k} ~ Gamma (τ + 1, τ),

(3.5)

where $\mathcal{E} (ψ_{i k})$ denotes an exponential distribution with mean $\frac{1}{ψ_{i k}}$ and Gamma(τ + 1, τ) denotes a Gamma distribution with mean $\frac{τ + 1}{τ}$ and variance $\frac{τ + 1}{τ^{2}}$. Under this assumption, E[z_ijk] = 1 and $Var (z_{i j k}) = \frac{τ + 1}{τ - 1}$ for τ > 1. Let $z_{i k}^{δ} = {(z_{i 1 k}^{δ}, \dots, z_{i J k}^{δ})}^{'} = {(δ_{1 k} (z_{i 1 k} - E [z_{i 1 k}]), \dots, δ_{J k} (z_{i J k} - E [z_{i J k}]))}^{'}$. The covariance matrix of $z_{i k}^{δ}$ is given by

Var (z_{i k}^{δ}) = \frac{τ}{τ - 1} diag (δ_{1 k}^{2}, \dots, δ_{J k}^{2}) + \frac{1}{τ - 1} δ_{k} δ_{k}^{'},

(3.6)

where δ_k = (δ_1k, …, δ_Jk)′ for k = 1, …, K. From (3.6), we note that the correlations of the responses, y_ijk, depend on the kth skewness parameter δ_jk, and the z_ijk’s are independent when τ → ∞. We further assume that γ_jk, z_ijk, and ϵ_ik are independent.
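The moment results behind (3.5) and (3.6) can be verified numerically. The sketch below is an illustration only, with assumed values δ_k = (2, −1)′ and τ = 10 and hypothetical function names. It evaluates the closed-form covariance in (3.6) and compares it with a Monte Carlo estimate obtained by drawing ψ_ik ~ Gamma(τ + 1, τ) and then the z_ijk's independently given ψ_ik.

```python
import random


def skew_latent_cov(delta, tau):
    """Closed-form Var(z_ik^delta) from (3.6); requires tau > 1."""
    J = len(delta)
    cov = [[delta[r] * delta[c] / (tau - 1.0) for c in range(J)] for r in range(J)]
    for j in range(J):
        cov[j][j] += tau / (tau - 1.0) * delta[j] ** 2
    return cov


def simulate_z_delta(delta, tau, n, seed=1):
    """Draw n copies of z_ik^delta under (3.5)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Shape tau + 1 and rate tau; gammavariate expects a scale, hence 1 / tau.
        psi = rng.gammavariate(tau + 1.0, 1.0 / tau)
        # expovariate takes the rate psi, so each z has conditional mean 1 / psi.
        # Subtracting 1 centers z, because E[z_ijk] = 1 marginally.
        rows.append([d * (rng.expovariate(psi) - 1.0) for d in delta])
    return rows


def sample_cov(rows):
    """Plain sample covariance matrix of a list of equal-length rows."""
    n, J = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(J)]
    return [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
             for b in range(J)] for a in range(J)]


# Assumed illustrative values, not estimates from the cholesterol data.
delta_k, tau = [2.0, -1.0], 10.0
exact = skew_latent_cov(delta_k, tau)
approx = sample_cov(simulate_z_delta(delta_k, tau, n=200000))
print("closed form:", exact)
print("Monte Carlo:", approx)
```

The off-diagonal entries shrink at rate 1/(τ − 1), which is the sense in which the z_ijk's become independent as τ → ∞.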

At Stage 2, models for the random effects, covariance matrices, skewness parameters, and degrees of freedom are specified as follows.

Stage 2a: Model for Random Effects

γ_{j k} ~ N (γ_{j}, Ω_{j}),

(3.7)

where $γ_{j} = {(γ_{j 1}, \dots, γ_{j q_{j}})}^{'}$ is the overall mean vector of γ_jk and Ω_j is a q_j × q_j covariance matrix of the random effects γ_jk.

Stage 2b: Model for Covariance Matrices

Σ_{k}^{- 1} ~ {Wishart}_{J} (v, (v - J - 1) Σ),

(3.8)

where Σ is a J × J overall covariance matrix and υ > J + 1. Also Σ_k has prior expectation E[Σ_k|Σ] = Σ when υ > J + 1. The model (3.8) is attractive as it allows for “borrowing of strength” across trials through the common second-level covariance matrix Σ and it also accounts for the heterogeneity of the within-study covariance matrices among different trials at the same time. The parameter υ in (3.8) controls the amount of borrowing across trials. The larger the value of υ, the more the within-trial covariance matrices borrow strength from different trials. Note that Wishart_J (d₀, S₀) denotes the Wishart distribution with mean d₀S₀. That is, $π (Σ_{k}^{- 1} ∣ d_{0}, S_{0}) \propto {| Σ_{k}^{- 1} |}^{(d_{0} - J - 1) / 2} exp (- \frac{1}{2} tr (S_{0}^{- 1} Σ_{k}^{- 1}))$ .

Stage 2c: Model for Skewness Parameters

δ_{j k} ~ N (δ_{j}, σ_{δ_{j}}^{2}),

(3.9)

where −∞ < δ_j < ∞ is the overall skewness parameter for the j^th response and $σ_{δ_{j}}^{2} > 0$ is the variance parameter, controlling the amount of “borrowing” across trials for within-trial skewness parameters for the j^th response.

Stage 2d: Model for Degrees of Freedom

ν_{k} ~ Gamma (ν_{a}, ν_{a} / ν_{b}),

(3.10)

where ν_a > 0 controls the amount of borrowing across trials and ν_b > 0 is the overall degrees of freedom. Under (3.10), the prior mean of ν_k is E[ν_k] = ν_b.
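To see how ν_a controls borrowing, recall that Gamma(a, b) has mean a/b and variance a/b². Under (3.10) this gives E[ν_k] = ν_b and Var(ν_k) = ν_b²/ν_a, so the prior coefficient of variation is 1/√ν_a. A small sketch follows; the function name and the numerical values are illustrative assumptions.

```python
import math


def df_prior_summary(nu_a, nu_b):
    """Prior mean, variance, and coefficient of variation of nu_k under (3.10)."""
    shape, rate = nu_a, nu_a / nu_b
    mean = shape / rate        # equals nu_b
    var = shape / rate ** 2    # equals nu_b**2 / nu_a
    return mean, var, math.sqrt(var) / mean


# Larger nu_a concentrates the trial-specific nu_k around the overall value nu_b.
for nu_a in (1.0, 4.0, 25.0):
    print(nu_a, df_prior_summary(nu_a, nu_b=10.0))
```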

At Stage 3, the prior distributions of the hyperparameters for the random effects, covariance matrices, skewness parameters, and degrees of freedom, which are proposed at Stage 2, as well as the regression coefficients, are specified as follows. Let δ = (δ₁, …, δ_J)′ and $σ_{δ}^{2} = {(σ_{δ_{1}}^{2}, \dots, σ_{δ_{J}}^{2})}^{'}$. We assume that β, γ, δ*, Σ*, ν*, τ, and Ω are independent a priori.

Stage 3a: Prior distributions of the Hyperparameters for Random Effects

γ ~ N_{q J} (0, c_{1} I_{q J}) and Ω_{j}^{- 1} ~ {Wishart}_{q_{j}} (d_{1}, S_{1}),

(3.11)

where $q = \sum_{j = 1}^{J} q_{j}$ .

Stage 3b: Prior distributions of the Hyperparameters for Covariance Matrices

v ~ Gamma (a_{1}, b_{1}) and Σ ~ {Wishart}_{J} (d_{2}, S_{2}) .

(3.12)

Stage 3c: Prior distributions of the Hyperparameters for Skewness Parameters and Latent Variables

δ ~ N_{J} (0, c_{2} I_{J}), σ_{δ_{j}}^{2} ~ IGamma (a_{2}, b_{2}), and τ ~ Gamma (a_{3}, b_{3}) 1 (τ > 1),

(3.13)

where IGamma(a, b) denotes the inverse gamma distribution with mean b/(a − 1) when a > 1 and variance b²/[(a − 1)²(a − 2)] when a > 2.

Stage 3d: Prior distributions of the Hyperparameters for Degrees of Freedom

ν_{a} ~ Gamma (a_{4}, b_{4}) and ν_{b} ~ IGamma (a_{5}, b_{5}) .

(3.14)

Stage 3e: Prior distribution for Fixed Effects Regression Coefficients

β ~ N_{p} (0, c_{3} I_{p}),

(3.15)

where $p = \sum_{j = 1}^{J} p_{j}$ .

The multivariate meta-regression model defined in (3.3), (3.4), (3.6), (3.7), (3.9), and (3.10) is very general and flexible, and it includes as special cases the multivariate normal meta-regression model, the multivariate t meta-regression model, and the multivariate skew t meta-regression model. Furthermore, this proposed model also incorporates the different covariance matrices, skewness parameters, and degrees of freedom across the K trials. In the analysis, the hyperparameters of the prior distribution at Stage 3 were specified as c₁ = 100, c₂ = 100, c₃ = 100, d₁ = q_j + 0.1, S₁ = 0.1, d₂ = J + 0.1, S₂ = 0.1, a₁ = 1, b₁ = 0.1, a₂ = 0.1, b₂ = 0.1, a₃ = 1, b₃ = 0.1, a₄ = 1, b₄ = 0.1, a₅ = 0.1, and b₅ = 0.1. These choices of the hyperparameters lead to noninformative priors. The flow diagram of the proposed model is given in Figure 1. Simultaneous estimation of all parameters is not easy and requires a sophisticated and computationally intensive MCMC sampling algorithm.

Figure 1. — Flow Diagram of the Proposed Model.

3.3. The likelihood function and posterior distribution

Let $X_{i k} = diag (x_{i 1 k}^{'}, \dots, x_{i J k}^{'})$ , $W_{i k} = diag (w_{i k}^{'}, \dots, w_{i k}^{'})$ , $β = {(β_{1}^{'}, \dots, β_{J}^{'})}^{'}$ , $γ = {(γ_{1}^{'}, \dots, γ_{J}^{'})}^{'}$ , $δ^{*} = (δ_{1}^{'}, \dots, δ_{K}^{'})$ , Σ* = (Σ₁,…, Σ_K), v* = (v₁,…, v_K)′, $γ_{k}^{R} = {(γ_{1 k}^{'}, \dots, γ_{J k}^{'})}^{'}$ , and $ψ = {(ψ_{11}, \dots, ψ_{n_{K} K})}^{'}$ . Also let y_ik = (y_i1k,…, y_iJk)′, $y = {(y_{11}^{'}, \dots, y_{n_{K} K}^{'})}^{'}$ , $W = (W_{11}, \dots, W_{n_{K} K})$ , and $X = (X_{11}, \dots, X_{n_{K} K})$ . Furthermore, we let D_obs = (y, X, W) denote the observed data. Then the complete-data likelihood function is given by

L (β, γ, δ^{*}, Σ^{*}, ν^{*}, τ, Ω, γ^{R}, z, ψ, λ ∣ D_{obs}) = \prod_{k = 1}^{K} \prod_{i = 1}^{n_{k}} f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, z_{i k}, λ_{i k}, X_{i k}, W_{i k}) \times \frac{{(\frac{ν_{k}}{2})}^{\frac{ν_{k}}{2}}}{Γ (\frac{ν_{k}}{2})} λ_{i k}^{\frac{ν_{k}}{2} - 1} exp {- \frac{ν_{k}}{2} λ_{i k}} \times \prod_{k = 1}^{K} \prod_{i = 1}^{n_{k}} [\prod_{j = 1}^{J} ψ_{i k} exp (- ψ_{i k} z_{i j k})] \times \frac{τ^{τ + 1}}{Γ (τ + 1)} ψ_{i k}^{τ} exp {- τ ψ_{i k}} \times \prod_{k = 1}^{K} \prod_{j = 1}^{J} {(2 π)}^{- \frac{4}{2}} {| Ω_{j} |}^{- \frac{1}{2}} \times exp {- \frac{1}{2} {(γ_{j k} - γ_{j})}^{'} Ω_{j}^{- 1} (γ_{j k} - γ_{j})},

(3.16)

where $f (y_{i k} ∣ β, Δ_{k}, Σ_{k}, γ_{k}^{R}, z_{i k}, λ_{i k}, X_{i k}, W_{i k}) = \frac{{| Σ_{k} |}^{- 1 / 2}}{{(2 π)}^{J / 2}} λ_{i k}^{J / 2} exp {- \frac{λ_{i k}}{2} {(y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} z_{i k}^{*})}^{'} Σ_{k}^{- 1} (y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} z_{i k}^{*})}$ with $z_{i k}^{*} = z_{i k} - E [z_{i k}]$, Δ_k = diag(δ_1k,…, δ_Jk), Ω = diag(Ω₁,…,Ω_J), $z = {(z_{11}^{'}, \dots, z_{n_{K} K}^{'})}^{'}$, $γ^{R} = {({(γ_{1}^{R})}^{'}, \dots, {(γ_{K}^{R})}^{'})}^{'}$, and $λ = (λ_{11}, \dots, λ_{n_{K} K})$. Then using the complete-data likelihood function in (3.16) and prior distributions in Section 3.2, the joint posterior distribution of all the parameters is given by

π (β, γ, δ^{*}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{R}, z, ψ, λ ∣ D_{obs}) \propto L (β, γ, δ^{*}, Σ^{*}, ν^{*}, τ, Ω, γ^{R}, z, ψ, λ ∣ D_{obs}) \times π (β) π (γ) π (δ^{*} ∣ δ, σ_{δ}^{2}) π (δ) π (σ_{δ}^{2}) \times π (Σ^{*} ∣ v, Σ) π (v) π (Σ) π (Ω) \times π (ν^{*} ∣ ν_{a}, ν_{b}) π (ν_{a}) π (ν_{b}) π (τ) .

(3.17)

4. BAYESIAN INFERENCE

4.1. Bayesian model comparison via CPO’s

Given the rich specification of our proposed model, it is of interest to compare the performance of various special cases of the general multivariate skew meta-regression model proposed in Section 3.2. That is, we need methods for checking whether a skew and/or heavy-tailed distribution is needed for modeling the y_ik’s. Also, we need to investigate whether the skewness, variance, and degrees of freedom vary across trials. To this end, we carry out the model comparison using the logarithm of pseudo-marginal likelihood (LPML) proposed by Ibrahim et al. (2001). The LPML is a well-established Bayesian model comparison criterion based on the conditional predictive ordinate (CPO) statistic. The CPO statistic for the i^th subject in the k^th trial is the marginal posterior predictive density of y_ik given the observed data with that subject deleted. As suggested in Ibrahim et al. (2001), a natural summary statistic of the CPO_ik’s is the LPML defined as

LPML = \sum_{k = 1}^{K} \sum_{i = 1}^{n_{k}} log ({CPO}_{i k}) .

(4.1)
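As a small illustration of (4.1), the sketch below sums log CPO values over patients and trials and compares two models by their LPML. The function name and the toy log CPO values are assumptions for illustration; the two LPML values in the comparison are taken from Table 1.

```python
import math


def lpml_from_log_cpo(log_cpo_by_trial):
    """LPML in (4.1): sum of log(CPO_ik) over patients i within trials k.

    log_cpo_by_trial is a list with one inner list per trial, so trials
    may have different sample sizes n_k.
    """
    return sum(sum(trial) for trial in log_cpo_by_trial)


# Toy input (assumed values): two trials with 2 and 1 patients.
toy = [[math.log(0.5), math.log(0.25)], [math.log(0.125)]]
print("toy LPML:", lpml_from_log_cpo(toy))

# LPML values reported in Table 1; the larger value indicates the better fit.
lpml_flexible = -261321.15   # flexible skew t model, random nu_a and tau
lpml_skew_t = -262517.61     # skew t model with tau = 30
lpml_gain = lpml_flexible - lpml_skew_t
print("LPML gain of the flexible model:", round(lpml_gain, 2))
```

Working with log CPO values directly avoids numerical underflow when individual CPOs are very small.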

We use LPML as a criterion-based measure for model selection. The larger the LPML, the better the fit of a given model. From (3.16), the marginal likelihood function is given by

f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) = \int_{λ_{i k} \in R_{+}} \int_{ψ_{i k} \in R_{+}} \int_{z_{i k} \in R_{+}^{J}} f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, z_{i k}, λ_{i k}, X_{i k}, W_{i k}) \times [\prod_{j = 1}^{J} ψ_{i k} exp {- ψ_{i k} z_{i j k}}] \times \frac{τ^{τ + 1}}{Γ (τ + 1)} ψ_{i k}^{τ} exp {- τ ψ_{i k}} \times \frac{{(\frac{ν_{k}}{2})}^{\frac{ν_{k}}{2}}}{Γ (\frac{ν_{k}}{2})} λ_{i k}^{\frac{ν_{k}}{2} - 1} exp {- \frac{ν_{k}}{2} λ_{i k}} d z_{i k} d ψ_{i k} d λ_{i k} = \frac{{| Σ_{k} |}^{- \frac{1}{2}}}{π^{\frac{J}{2}}} \frac{Γ (\frac{ν_{k} + J}{2})}{Γ (\frac{ν_{k}}{2})} ν_{k}^{\frac{ν_{k}}{2}} \frac{Γ (τ + J + 1)}{Γ (τ + 1)} τ^{τ + 1} \times \int_{z_{i k} \in R_{+}^{J}} {(A_{i k} + ν_{k})}^{- \frac{ν_{k} + J}{2}} {(\sum_{j = 1}^{J} z_{i j k} + τ)}^{- (J + τ + 1)} d z_{i k},

(4.2)

where $A_{i k} = {(y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} z_{i k}^{*})}^{'} Σ_{k}^{- 1} (y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} z_{i k}^{*})$. Let $θ = (β, γ, δ^{*}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{R})$ denote the collection of parameters and the random effects. Let $D_{obs}^{(- i k)}$ denote the observed data with the i^th patient in the k^th trial deleted. Let Θ and $Z$ be the parameter spaces corresponding to θ and z, respectively. Also let $π (θ ∣ D_{obs}^{(- i k)})$ denote the posterior of θ given $D_{obs}^{(- i k)}$. The following proposition gives a computational form of the CPO statistic.

Proposition 4.1. Let h_ik (z_ik) be a normalized weight function satisfying $\int_{R_{+}^{J}} h_{i k} (z_{i k}) d z_{i k} = 1$ . For the ith patient in the kth trial, CPO_ik can be written as

{CPO}_{i k} = \int_{Θ} f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) \times π (θ ∣ D_{obs}^{(- i k)}) d θ = {[\int_{Θ} \int_{Z} f_{y_{i k}, z_{i k}} \times π (θ, z ∣ D_{obs}) d z d θ]}^{- 1},

(4.3)

where $f_{y_{i k}, z_{i k}} = \frac{h_{i k} (z_{i k})}{f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})}$ and $f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) = {(A_{i k} + ν_{k})}^{- \frac{ν_{k} + J}{2}} {(\sum_{j = 1}^{J} z_{i j k} + τ)}^{- (J + τ + 1)} \frac{{| Σ_{k} |}^{- \frac{1}{2}}}{π^{\frac{J}{2}}} \frac{Γ (\frac{ν_{k} + J}{2})}{Γ (\frac{ν_{k}}{2})} ν_{k}^{\frac{ν_{k}}{2}} \frac{Γ (τ + J + 1)}{Γ (τ + 1)} τ^{τ + 1}$.

The proof of this proposition is given in Appendix A.
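The joint density f(y_ik, z_ik | ·) in Proposition 4.1 is best evaluated on the log scale. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the caller is assumed to supply the quadratic form A_ik (which itself depends on z_ik through $Δ_{k} z_{i k}^{*}$) and log|Σ_k|, so that no matrix library is needed here.

```python
import math


def log_joint_density(A_ik, z_ik, log_det_sigma_k, nu_k, tau):
    """log f(y_ik, z_ik | ...) as written after (4.3).

    A_ik            : quadratic form defined after (4.2), computed by the caller
    z_ik            : list of the J latent skewness variables (nonnegative)
    log_det_sigma_k : log |Sigma_k|
    """
    J = len(z_ik)
    return (-(nu_k + J) / 2.0 * math.log(A_ik + nu_k)
            - (J + tau + 1.0) * math.log(sum(z_ik) + tau)
            - 0.5 * log_det_sigma_k
            - J / 2.0 * math.log(math.pi)
            + math.lgamma((nu_k + J) / 2.0) - math.lgamma(nu_k / 2.0)
            + nu_k / 2.0 * math.log(nu_k)
            + math.lgamma(tau + J + 1.0) - math.lgamma(tau + 1.0)
            + (tau + 1.0) * math.log(tau))


# Sanity check with assumed values: for J = 1, Sigma_k = 1, A_ik = 0 and z = 0,
# the density is the t_nu density at zero times (tau + 1) / tau.
nu_k, tau = 5.0, 10.0
value = math.exp(log_joint_density(0.0, [0.0], 0.0, nu_k, tau))
t_at_zero = math.gamma((nu_k + 1.0) / 2.0) / (
    math.gamma(nu_k / 2.0) * math.sqrt(nu_k * math.pi))
print(value, t_at_zero * (tau + 1.0) / tau)
```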

Remark 4.1. Using the CPO Identity I in Zhang et al. (2017), CPO_ik can be written as

{CPO}_{i k} = \frac{1}{\int_{Θ} \frac{1}{f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})} π (θ ∣ D_{obs}) d θ},

where f(y_ik | β, δ_k, Σ_k, $γ_{k}^{R}$, τ, ν_k, X_ik, W_ik) is given in (4.2). This identity requires a J-dimensional integration over $R_{+}^{J}$ at every evaluation. Compared to the CPO Identity I, (4.3) is therefore more computationally attractive.

Remark 4.2. The CPO_ik given in (4.3) uses the CPO Identity II in Zhang et al. (2017). Now, let {(θ^(b), z^(b)), b = 1, …, B} denote a Gibbs sample of (θ, z) from π(θ, z | D_obs). Using Proposition 4.1, a Monte Carlo estimator of CPO_ik in (4.3) is given by

{\hat{CPO}}_{i k} = B {[\sum_{b = 1}^{B} f_{y_{i k}, z_{i k}}^{(b)}]}^{- 1},

(4.4)

where $f_{y_{i k}, z_{i k}}^{(b)} = \frac{h_{i k} (z_{i k}^{(b)})}{f (y_{i k}, z_{i k}^{(b)} ∣ β^{(b)}, δ_{k}^{(b)}, Σ_{k}^{(b)}, {(γ_{k}^{R})}^{(b)}, τ^{(b)}, ν_{k}^{(b)}, X_{i k}, W_{i k})}$.
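The estimator in (4.4) is a ratio of B to a sum of density ratios, which can overflow or underflow if computed naively. A hedged sketch of a log-scale implementation is given below; the function name and the inputs (per-iteration values of log h_ik and log f evaluated at the Gibbs draws) are assumptions about how the MCMC output might be stored, not a description of the authors' code.

```python
import math


def log_cpo_estimate(log_h, log_f):
    """log of the Monte Carlo estimator (4.4).

    log_h[b] : log h_ik(z_ik^(b)), the log weight function at draw b
    log_f[b] : log f(y_ik, z_ik^(b) | theta^(b)), the log joint density at draw b
    Returns log(B) - log(sum_b exp(log_h[b] - log_f[b])).
    """
    B = len(log_h)
    log_ratios = [lh - lf for lh, lf in zip(log_h, log_f)]
    # Log-sum-exp trick: factor out the largest term before exponentiating.
    m = max(log_ratios)
    log_sum = m + math.log(sum(math.exp(r - m) for r in log_ratios))
    return math.log(B) - log_sum


# Toy check (assumed values): ratios 1 and 3 give CPO_hat = 2 / (1 + 3) = 0.5.
print(math.exp(log_cpo_estimate([0.0, math.log(3.0)], [0.0, 0.0])))
```

The returned log CPO values can be summed directly to obtain the LPML in (4.1).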

The Monte Carlo error of ${\hat{CPO}}_{i k}$ given in (4.4) depends on the choice of the weight function h_ik(z_ik). Following Theorem 1 in Zhang et al. (2017), the optimal weight function is defined as the weight function minimizing the variance of the Monte Carlo estimator. Here, we have

h_{i k, opt} (z_{i k}) = \frac{f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})}{f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})} .

(4.5)

However, the optimal weight function is not analytically available and is difficult to compute since the denominator involves a J-dimensional integration. We can, instead, use a multivariate normal density as a possible choice of h_ik(z_ik), which is constructed via the Laplace approximation to the joint density f(y_ik, z_ik | β, δ_k, Σ_k, $γ_{k}^{R}$, τ, ν_k, X_ik, W_ik). Let u_ik = log(z_ik) for i = 1, …, n_k, k = 1, …, K, so that z_ik = exp(u_ik). Here, the log and exp functions applied to a vector mean that the operations are applied to every element of the vector. Denote $g (u_{i k}) = log {f (y_{i k}, exp (u_{i k}) ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) \prod_{j = 1}^{J} exp (u_{i j k})}$. Let $u_{i k}^{0}$ be the stationary point, where ∇g(u_ik) = 0, and $B = - {\nabla^{2} g (u_{i k}) |}_{u_{i k} = u_{i k}^{0}}$. Then, the optimal weight function h_ik,opt(z_ik) is approximately given by $\frac{f (log (z_{i k}) ∣ u_{i k}^{0}, B^{- 1})}{\prod_{j = 1}^{J} z_{i j k}}$, where $f (\cdot ∣ u_{i k}^{0}, B^{- 1})$ denotes the density of a $N (u_{i k}^{0}, B^{- 1})$ distribution. A detailed derivation of this approximated optimal weight function is given in Appendix B.
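The construction of the approximate weight function can be sketched in one dimension. The code below is an illustration under strong simplifying assumptions, not the authors' implementation: it takes J = 1, locates the stationary point of a user-supplied g by Newton's method with finite-difference derivatives, and returns the log of the resulting log-normal weight. All function names are hypothetical, and the quadratic g used at the end is a toy stand-in for the log joint density.

```python
import math


def laplace_fit_1d(g, u_start, step=1e-3, tol=1e-10, max_iter=100):
    """Find u0 with g'(u0) = 0 and return (u0, 1 / B), where B = -g''(u0)."""

    def second_derivative(u):
        return (g(u + step) - 2.0 * g(u) + g(u - step)) / step ** 2

    u = u_start
    for _ in range(max_iter):
        grad = (g(u + step) - g(u - step)) / (2.0 * step)
        hess = second_derivative(u)
        if hess >= 0.0:
            raise ValueError("g is not locally concave at the current iterate")
        u_new = u - grad / hess          # Newton update
        converged = abs(u_new - u) < tol
        u = u_new
        if converged:
            break
    return u, -1.0 / second_derivative(u)


def log_weight(z, u0, var):
    """log h(z): N(log z | u0, var) density divided by z (change of variables)."""
    u = math.log(z)
    return -0.5 * math.log(2.0 * math.pi * var) - (u - u0) ** 2 / (2.0 * var) - u


# Toy g (assumed): exactly quadratic in u with mode 1 and curvature B = 4.
def toy_g(u):
    return -2.0 * (u - 1.0) ** 2


u0, var = laplace_fit_1d(toy_g, u_start=0.0)
print(u0, var, log_weight(2.0, u0, var))
```

For J > 1 the same steps apply with a gradient vector and Hessian matrix in place of the scalar derivatives.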

4.2. Computational development

We consider the following one-to-one transformations: $γ_{k}^{* R} = {(γ_{1 k}^{*'}, γ_{2 k}^{*'}, \dots, γ_{J k}^{*'})}^{'} = γ_{k}^{R} - γ$ and $δ_{k}^{*} = {(δ_{1 k}^{*'}, δ_{2 k}^{*'}, \dots, δ_{J k}^{*'})}^{'} = δ_{k} - δ$ for k = 1,…, K. Thus, $γ_{j k}^{*} = γ_{j k} - γ_{j}$ and $δ_{j k}^{*} = δ_{j k} - δ_{j}$ for j = 1,…, J and k = 1,…, K. Write $γ^{* R} = {({(γ_{1}^{* R})}^{'}, \dots, {(γ_{K}^{* R})}^{'})}^{'}$ and $δ^{* *} = (δ_{1}^{*'}, \dots, δ_{K}^{*'})$. Let $Δ_{k}^{*} = diag (δ_{1 k}^{*}, \dots, δ_{J k}^{*})$ and Δ = diag(δ₁, …, δ_J). Also, let $X_{i k}^{*} = (X_{i k}, W_{i k})$ for i = 1,…, n_k and k = 1,…, K and θ = (β′, γ′)′. We present a detailed development of the MCMC sampling algorithm. Although the analytic evaluation of the joint posterior distribution of (θ, δ**, δ, $σ_{δ}^{2}$, Σ*, υ, Σ, ν*, ν_a, ν_b, τ, Ω, γ*^R, z, ψ, λ) based on the observed data D_obs given in Equation (3.17) is not possible, the proposed model allows us to develop an efficient MCMC sampling algorithm to sample from (3.17). The MCMC sampling algorithm requires sampling from the following full conditional distributions in turn: (i) [γ*^R, θ, δ**, δ, $σ_{δ}^{2}$ | Σ*, υ, Σ, ν*, ν_a, ν_b, τ, Ω, z, ψ, λ, D_obs]; (ii) [λ, ν*, ν_a, ν_b | θ, δ**, δ, $σ_{δ}^{2}$, Σ*, υ, Σ, τ, Ω, γ*^R, z, ψ, D_obs]; (iii) [Σ*, υ, Σ | θ, δ**, δ, $σ_{δ}^{2}$, ν*, ν_a, ν_b, τ, Ω, γ*^R, z, ψ, λ, D_obs]; (iv) [z | θ, δ**, δ, $σ_{δ}^{2}$, Σ*, υ, Σ, ν*, ν_a, ν_b, τ, Ω, γ*^R, ψ, λ, D_obs]; (v) [ψ, τ | θ, δ**, δ, $σ_{δ}^{2}$, Σ*, υ, Σ, ν*, ν_a, ν_b, Ω, γ*^R, z, λ, D_obs]; and (vi) [Ω | θ, δ**, δ, $σ_{δ}^{2}$, Σ*, υ, Σ, ν*, ν_a, ν_b, τ, γ*^R, z, ψ, λ, D_obs]. For (i), we apply the collapsed Gibbs technique of Liu (1994) and Chen et al. (2000) through the identity

[γ^{* R}, θ, δ^{* *}, δ, σ_{δ}^{2} ∣ Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, z, ψ, λ, D_{obs}] = [γ^{* R} ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, z, ψ, λ, D_{obs}] \times [δ^{* *} ∣ θ, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, z, ψ, λ, D_{obs}] \times [θ, δ, σ_{δ}^{2} ∣ Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, z, ψ, λ, D_{obs}] .

(4.6)

That is, we sample δ** after collapsing out γ*^R, and also sample θ, δ, and $σ_{δ}^{2}$ after collapsing out γ*^R and δ**. For (ii), we apply the collapsed Gibbs technique of Liu (1994) and Chen et al. (2000) through the identity

[λ, ν^{*}, ν_{a}, ν_{b} ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, τ, Ω, γ^{* R}, z, ψ, D_{obs}] = [λ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{* R}, z, ψ, D_{obs}] \times [ν_{b} ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, τ, Ω, γ^{* R}, z, ψ, D_{obs}] \times [ν^{*}, ν_{a} ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, τ, Ω, γ^{* R}, z, ψ, D_{obs}] .

(4.7)

That is, we sample ν_b after collapsing out λ, and also sample ν* and ν_a after collapsing out λ and ν_b. For (iii), we apply the collapsed Gibbs technique of Liu (1994) and Chen et al. (2000) through the identity

[Σ^{*}, v, Σ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{* R}, z, ψ, λ, D_{obs}] = [Σ^{*} ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{* R}, z, ψ, λ, D_{obs}] \times [v, Σ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{* R}, z, ψ, λ, D_{obs}] .

(4.8)

That is, we sample υ and Σ after collapsing out Σ*. For (v), we apply the collapsed Gibbs technique of Liu (1994) and Chen et al. (2000) through the identity

[ψ, τ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, Ω, γ^{* R}, z, λ, D_{obs}] = [ψ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, τ, Ω, γ^{* R}, z, λ, D_{obs}] \times [τ ∣ θ, δ^{* *}, δ, σ_{δ}^{2}, Σ^{*}, v, Σ, ν^{*}, ν_{a}, ν_{b}, Ω, γ^{* R}, ψ, z, λ, D_{obs}] .

(4.9)

All the full conditional distributions discussed above are presented in Section S1 of the Supplementary Materials (http://intlpress.com/site/pub/files/_supp/sii/2020/0013/0004/SII-2020-0013-0004-s004.pdf).
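The factorizations in (4.6) through (4.9) all rest on the same idea: a block is drawn jointly by first sampling one component with the other integrated out, and then sampling the remaining component from its full conditional. The toy sketch below illustrates only this idea, assuming a standard bivariate normal pair (a, b) with correlation ρ in place of the model's parameter blocks; the function names and the seed are hypothetical and nothing here is taken from the authors' sampler.

```python
import math
import random

def draw_joint(rho, rng):
    """One exact draw of (a, b) via the identity [a, b] = [a | b] x [b]."""
    b = rng.gauss(0.0, 1.0)                             # b with a integrated out
    a = rng.gauss(rho * b, math.sqrt(1.0 - rho ** 2))   # a from its conditional
    return a, b

def sample_correlation(pairs):
    """Empirical correlation of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    s_aa = sum((a - mean_a) ** 2 for a, _ in pairs)
    s_bb = sum((b - mean_b) ** 2 for _, b in pairs)
    return s_ab / math.sqrt(s_aa * s_bb)

rng = random.Random(2020)   # fixed seed so the run is reproducible
rho = 0.8
draws = [draw_joint(rho, rng) for _ in range(20000)]
r_hat = sample_correlation(draws)   # should be close to rho
```

Because each pair is an exact joint draw, successive draws of the block are independent given the rest, whereas alternating between [a | b] and [b | a] would produce correlated draws when ρ is large. This is the usual motivation for collapsing.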

5. ANALYSIS OF THE CHOLESTEROL DATA

We re-analyze the cholesterol data discussed in Section 2. In (3.3), x_ijk consists of 14 covariates, including bl_ldlc, bl_hdlc, bl_tg, BMI, age, duration, Female, DM, CHD, potency2, potency3, black, hispanic, and other, as well as 14 interaction terms between the 14 covariates and onstatin as in Ibrahim et al. (2019). We model these three outcome variables LDL-C, HDL-C, and TG jointly via (3.3), (3.4), (3.5), and (3.6) with J = 3 and K = 26 in conjunction with the models specified at Stage 2 and the priors specified at Stage 3. We standardized all the fourteen covariates for numerical stability in the posterior computations.

As shown in Ibrahim et al. (2019), the multivariate skew meta-regression model with an unstructured covariance matrix for the multivariate outcome variables outperformed the symmetric normal and t models as well as the skew models with a diagonal covariance matrix. Thus, for the cholesterol data, we only fit the flexible multivariate meta-regression models defined by (3.3), (3.4), (3.5), and (3.6) with different fixed values of τ in (3.6) and with random τ under the prior specified in (3.13). We also consider different fixed values of ν_a in (3.10) and random ν_a with the prior specified in (3.14). In total, we consider 14 different models, including the best model considered in Ibrahim et al. (2019); the LPML values are reported in Table 1. We see from Table 1 that the LPML values under the proposed flexible skew t models are greater than the value (LPML = −262517.61) under the skew t model with τ = 30 (Ibrahim et al., 2019). The best flexible skew t model is the one with random ν_a and τ, which has LPML = −261321.15, while the second best model is the one with ν_a = 25 and τ = 5, which has LPML = −261322.38. These two values are very close.
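For readers less familiar with the criterion, LPML is the sum of the logarithms of the CPO_ik over all patients and trials, and a larger value indicates a better fit. The short sketch below, with a hypothetical helper name and made-up CPO values, shows the definition and the arithmetic behind the comparison: the best flexible model improves on the τ = 30 skew t model by about 1196.46 LPML units, while the two best flexible models differ by only about 1.23.

```python
import math

def lpml(cpo_values):
    """Logarithm of the pseudo marginal likelihood: sum of log CPO_ik."""
    return sum(math.log(c) for c in cpo_values)

# Toy CPO values (made up for illustration, not from the cholesterol data).
toy_lpml = lpml([0.5, 0.25])

# LPML values quoted in the text.
lpml_skew_t_tau30 = -262517.61     # skew t model with tau = 30
lpml_random = -261321.15           # flexible model, random nu_a and tau
lpml_fixed = -261322.38            # flexible model, nu_a = 25 and tau = 5

gain_over_tau30 = lpml_random - lpml_skew_t_tau30
gap_between_top_two = lpml_random - lpml_fixed
print(round(gain_over_tau30, 2), round(gap_between_top_two, 2))
```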

We extend $λ_{k}^{*}$ in (3.2) to

λ_{k}^{*} = (λ_{k} - n_{k}) / {(n_{k} / (ν_{k} / 2))}^{1 / 2}

(5.1)

to account for different degrees of freedom for the kth trial under the proposed flexible skew meta-regression model. Figure 2 shows the boxplots of the $λ_{k}^{*}$'s defined in (5.1) under the flexible skew t model with random ν_a and τ. The boxplots of the $λ_{k}^{*}$'s under the flexible skew t model with ν_a = 25 and τ = 5 are shown in Figure S.1 of the Supplementary Materials. We see from Figure 2 and Figure S.1 that all of these boxplots have a median close to zero and that no obvious outlying trials are apparent. These two figures differ markedly from Figure 2 of Ibrahim et al. (2019), in which the boxplots corresponding to trials 8 and 25 were quite different from the other 24 boxplots, and the boxplots overall were much more heterogeneous than those in Figures 2 and S.1 under the proposed models. Thus, the outlying trials identified in Ibrahim et al. (2019) could be due to a lack of fit.
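As a small worked illustration of (5.1), the sketch below centers λ_k at n_k and scales it by the square root of n_k / (ν_k / 2). The function name and the numerical inputs are hypothetical and are chosen only to make the arithmetic easy to follow; in practice the transformation would be applied to each MCMC draw of (λ_k, ν_k) before drawing the boxplots.

```python
import math

def standardized_lambda(lam_k, n_k, nu_k):
    """lambda_k^* = (lambda_k - n_k) / sqrt(n_k / (nu_k / 2)), as in (5.1)."""
    return (lam_k - n_k) / math.sqrt(n_k / (nu_k / 2.0))

# Hypothetical inputs: a trial with n_k = 100 patients and nu_k = 8.
# The scale is sqrt(100 / 4) = 5, so lambda_k = 110 maps to (110 - 100) / 5.
example = standardized_lambda(110.0, 100, 8.0)
centered = standardized_lambda(100.0, 100, 8.0)   # lambda_k = n_k maps to zero
```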

The posterior estimates, including posterior means, posterior standard deviations (SDs), and 95% highest posterior density (HPD) intervals of the parameters under the flexible skew t model with random ν_a and τ are reported in Tables 2–6 and Tables S.1–S.3. Those posterior estimates under the flexible skew t model with ν_a = 25 and τ = 5 are reported in Tables S.4–S.11 of the Supplementary Materials. The posterior means and the 95% HPD intervals of the 28 regression coefficients in Tables 2–4 (or Tables S.4–S.6) under the proposed flexible skew models and the skew t model with τ = 30 (Ibrahim et al., 2019) are also plotted in Figure 3 and Figure S.2 of the Supplementary Materials. We call a posterior estimate “statistically significant at a significance level of 0.05” if the corresponding 95% HPD interval does not contain 0. Under this notion, we see from Figures 3 and S.2 that the significant posterior estimates are consistent between the proposed model and the model of Ibrahim et al. (2019) for all but a few of the regression coefficients. For example, for LDL-C, onstatin × DM (β_1,22) was not significant with 95% HPD intervals (−1.598, 0.281) and (−1.553, 0.322) under the proposed models with random ν_a and τ (Table 2) and with ν_a = 25 and τ = 5 (Table S.4), respectively, while it was significant with 95% HPD interval (−2.088, −0.225) under the skew t model with τ = 30 (Table 7 of Ibrahim et al. (2019)). For HDL-C, onstatin × potency2 (β_2,24) was significant or nearly so with 95% HPD intervals (−2.094, −0.029) and (−2.053, 0.000) under the proposed models with random ν_a and τ (Table 3) and with ν_a = 25 and τ = 5 (Table S.5), respectively, while it was not significant with 95% HPD interval (−1.955, 0.068) under the skew t model with τ = 30 (Table 8 of Ibrahim et al. (2019)). For TG, onstatin × duration (β_3,20) was not significant with 95% HPD intervals (−0.242, 1.380) and (−0.215, 1.390) under the proposed models with random ν_a and τ (Table 4) and with ν_a = 25 and τ = 5 (Table S.6), respectively, while it was significant with 95% HPD interval (0.227, 1.539) under the skew t model with τ = 30 (Table 9 of Ibrahim et al. (2019)). Therefore, different models may identify different sets of significant covariates. Since the proposed flexible skew model fits the cholesterol data much better than the skew t model with τ = 30, the latter model may incorrectly identify associations between the outcome variables and the covariates, yielding misleading conclusions about the clinical importance of covariates.

Table 6.

Posterior Estimates under the flexible skew t model with random ν_a and τ

Par.	Mean	SD	95% HPD	Par.	Mean	SD	95% HPD
ν₁	7.789	1.189	(5.593, 10.179)	ν₁₄	7.208	0.961	(5.508, 9.191)
ν₂	8.694	1.554	(5.998, 11.880)	ν₁₅	5.408	0.619	(4.258, 6.645)
ν₃	7.511	1.514	(4.649, 10.531)	ν₁₆	8.761	1.418	(6.285, 11.631)
ν₄	8.520	1.328	(6.243, 11.313)	ν₁₇	8.418	1.772	(5.182, 11.890)
ν₅	9.176	1.536	(6.449, 12.285)	ν₁₈	9.272	1.730	(6.292, 12.851)
ν₆	8.560	1.419	(5.989, 11.384)	ν₁₉	10.479	2.113	(6.847, 14.652)
ν₇	9.728	1.521	(6.958, 12.749)	ν₂₀	8.379	1.506	(5.662, 11.390)
ν₈	7.000	0.581	(5.900, 8.141)	ν₂₁	6.626	1.067	(4.680, 8.731)
ν₉	8.150	0.906	(6.412, 9.894)	ν₂₂	6.671	1.198	(4.527, 9.110)
ν₁₀	10.103	1.113	(8.024, 12.286)	ν₂₃	6.417	1.093	(4.403, 8.575)
ν₁₁	7.910	0.978	(6.087, 9.844)	ν₂₄	7.593	1.362	(5.128, 10.317)
ν₁₂	5.809	1.173	(3.609, 8.091)	ν₂₅	6.825	1.003	(4.937, 8.806)
ν₁₃	6.838	1.114	(4.860, 9.160)	ν₂₆	6.235	0.923	(4.537, 8.080)


Table 4.

Posterior Estimates under the flexible skew t model with random ν_a and τ for TG

	Par.	Mean	SD	95% HPD
bl_ldlc	β_3,1	−0.002	0.006	(−0.013, 0.009)
bl_hdlc	β_3,2	0.030	0.017	(−0.002, 0.063)
bl_tg	β_3,3	−0.093	0.003	(−0.098, −0.088)
BMI	β_3,4	0.237	0.029	(0.181, 0.295)
age	β_3,5	−0.032	0.016	(−0.064, 0.000)
duration	β_3,6	−0.186	0.182	(−0.566, 0.140)
Female	β_3,7	2.570	0.361	(1.879, 3.279)
DM	β_3,8	−0.549	0.480	(−1.465, 0.417)
CHD	β_3,9	0.680	0.471	(−0.243, 1.591)
potency2	β_3,10	−4.303	0.572	(−5.422, −3.179)
potency3	β_3,11	−9.458	0.666	(−10.775, −8.178)
Black	β_3,12	−1.997	0.665	(−3.334, −0.744)
Hispanic	β_3,13	1.045	0.805	(−0.512, 2.615)
Other	β_3,14	−1.012	0.838	(−2.697, 0.573)
onstatin × bl_ldlc	β_3,15	0.015	0.010	(−0.004, 0.036)
onstatin × bl_hdlc	β_3,16	−0.093	0.028	(−0.150, −0.039)
onstatin × bl_tg	β_3,17	−0.026	0.005	(−0.036, −0.017)
onstatin × BMI	β_3,18	0.038	0.054	(−0.064, 0.146)
onstatin × age	β_3,19	0.096	0.029	(0.040, 0.153)
onstatin × duration	β_3,20	0.605	0.413	(−0.242, 1.380)
onstatin × Female	β_3,21	0.262	0.639	(−0.975, 1.513)
onstatin × DM	β_3,22	2.221	0.766	(0.715, 3.684)
onstatin × CHD	β_3,23	0.922	0.804	(−0.690, 2.482)
onstatin × potency2	β_3,24	3.468	1.063	(1.384, 5.549)
onstatin × potency3	β_3,25	6.237	1.302	(3.623, 8.732)
onstatin × Black	β_3,26	−1.101	1.316	(−3.662, 1.482)
onstatin × Hispanic	β_3,27	0.896	1.603	(−2.240, 4.007)
onstatin × Other	β_3,28	3.854	1.458	(1.039, 6.738)


Figure 3. — Plots of the relative posterior estimates (mean/SD and 95% HPD interval/SD) for β under the skew t model with τ = 30 (blue) and flexible skew model with random ν_a and τ (red) for the cholesterol data.

Table 3.

Posterior Estimates under the flexible skew t model with random ν_a and τ for HDL-C

	Par.	Mean	SD	95% HPD
bl_ldlc	β_2,1	0.002	0.003	(−0.004, 0.008)
bl_hdlc	β_2,2	−0.197	0.010	(−0.217, −0.178)
bl_tg	β_2,3	0.024	0.001	(0.021, 0.027)
BMI	β_2,4	−0.145	0.017	(−0.179, −0.112)
age	β_2,5	0.075	0.009	(0.057, 0.094)
duration	β_2,6	0.065	0.063	(−0.057, 0.190)
Female	β_2,7	0.523	0.212	(0.114, 0.941)
DM	β_2,8	−2.270	0.278	(−2.819, −1.728)
CHD	β_2,9	−0.985	0.273	(−1.536, −0.465)
potency2	β_2,10	0.790	0.317	(0.176, 1.424)
potency3	β_2,11	0.256	0.367	(−0.445, 0.989)
Black	β_2,12	−2.448	0.371	(−3.187, −1.730)
Hispanic	β_2,13	−1.097	0.449	(−1.972, −0.211)
Other	β_2,14	−0.050	0.504	(−1.051, 0.920)
onstatin × bl_ldlc	β_2,15	−0.007	0.005	(−0.017, 0.003)
onstatin × bl_hdlc	β_2,16	−0.006	0.014	(−0.035, 0.021)
onstatin × bl_tg	β_2,17	−0.016	0.002	(−0.020, −0.011)
onstatin × BMI	β_2,18	0.027	0.027	(−0.026, 0.081)
onstatin × age	β_2,19	−0.059	0.014	(−0.088, −0.031)
onstatin × duration	β_2,20	−0.245	0.194	(−0.638, 0.128)
onstatin × Female	β_2,21	0.786	0.324	(0.161, 1.427)
onstatin × DM	β_2,22	1.719	0.390	(0.914, 2.462)
onstatin × CHD	β_2,23	0.464	0.395	(−0.299, 1.248)
onstatin × potency2	β_2,24	−1.052	0.526	(−2.094, −0.029)
onstatin × potency3	β_2,25	−0.946	0.643	(−2.176, 0.338)
onstatin × Black	β_2,26	2.571	0.634	(1.328, 3.819)
onstatin × Hispanic	β_2,27	−0.436	0.769	(−1.967, 1.038)
onstatin × Other	β_2,28	−0.895	0.742	(−2.300, 0.579)


The results shown in Table 5 under the flexible skew model with random ν_a and τ, Table S.7 under the flexible skew model with ν_a = 25 and τ = 5, and Table 6 of Ibrahim et al. (2019) indicate that patients on “statin + EZE” had significantly greater percent changes from baseline in all three outcome variables (LDL-C, HDL-C, and TG) than those on statin alone for both the first-line and second-line therapies. For first-line therapy, the 95% HPD intervals were (−15.771, −12.492), (−15.748, −12.455), and (−15.662, −12.454) for the percent change from baseline in LDL-C (γ_1,1); (1.265, 2.956), (1.302, 2.952), and (1.285, 2.870) for the percent change from baseline in HDL-C (γ_2,1); and (−7.304, −4.523), (−7.357, −4.548), and (−7.316, −4.369) for the percent change from baseline in TG (γ_3,1), respectively, under the flexible skew model with random ν_a and τ, the flexible skew model with ν_a = 25 and τ = 5, and the skew model with τ = 30. For second-line therapy, these 95% HPD intervals were (−21.849, −16.293), (−21.840, −16.260), and (−21.586, −15.974) for the percent change from baseline in LDL-C (γ_1,3); (0.763, 1.773), (0.740, 1.751), and (0.726, 1.736) for the percent change from baseline in HDL-C (γ_2,3); and (−9.373, −6.448), (−9.306, −6.452), and (−9.213, −5.868) for the percent change from baseline in TG (γ_3,3), respectively, under the above three models.

Table 5.

Posterior Estimates under the flexible skew t model with random ν_a and τ

Par.	Mean	SD	95% HPD	Par.	Mean	SD	95% HPD
γ_1,0	−22.112	3.034	(−28.094, −16.127)	Ω_1,0,0	20.133	10.345	(6.526, 39.943)
γ_1,1	−14.129	0.827	(−15.748, −12.455)	Ω_1,0,1	−9.158	5.367	(−19.859, −1.405)
γ_1,2	6.229	4.290	(−2.232, 14.651)	Ω_1,1,1	8.197	3.900	(2.791, 15.742)
γ_1,3	−19.107	1.410	(−21.849, −16.293)	Ω_1,2,2	43.791	21.636	(13.970, 84.722)
γ_2,0	9.918	1.354	(7.291, 12.594)	Ω_1,2,3	−26.458	14.770	(−54.154, −5.612)
γ_2,1	2.111	0.430	(1.265, 2.956)	Ω_1,3,3	24.157	12.438	(7.482, 47.977)
γ_2,2	13.850	1.818	(10.345, 17.446)	Ω_2,0,0	1.957	1.085	(0.530, 4.033)
γ_2,3	1.256	0.260	(0.763, 1.773)	Ω_2,0,1	−1.734	0.990	(−3.616, −0.395)
γ_3,0	−1.626	2.679	(−6.795, 3.708)	Ω_2,1,1	1.680	0.992	(0.344, 3.587)
γ_3,1	−5.918	0.704	(−7.304, −4.523)	Ω_2,2,2	1.182	0.817	(0.121, 2.657)
γ_3,2	4.180	3.665	(−2.717, 11.540)	Ω_2,2,3	−0.188	0.355	(−0.930, 0.366)
γ_3,3	−7.923	0.738	(−9.373, −6.448)	Ω_2,3,3	0.137	0.161	(0.007, 0.419)
Σ₁₁	72.921	4.189	(64.870, 81.231)	Ω_3,0,0	9.561	6.304	(1.245, 21.757)
Σ₁₂	13.629	3.067	(7.729, 19.587)	Ω_3,0,1	−6.229	4.040	(−14.125, −0.545)
Σ₁₃	22.812	4.459	(14.018, 31.508)	Ω_3,1,1	4.463	3.107	(0.218, 10.459)
Σ₂₂	96.721	5.193	(86.786, 107.018)	Ω_3,2,2	9.301	5.919	(1.423, 20.324)
Σ₂₃	−37.165	5.046	(−47.266, −27.514)	Ω_3,2,3	−5.427	4.046	(−13.380, −0.181)
Σ₃₃	182.385	10.700	(161.945, 203.932)	Ω_3,3,3	3.574	3.221	(0.017, 9.468)
δ₁	9.651	0.612	(8.438, 10.865)	$σ_{δ_{1}}^{2}$	8.242	3.181	(3.345, 14.545)
δ₂	−1.522	0.189	(−1.907, −1.166)	$σ_{δ_{2}}^{2}$	0.154	0.133	(0.016, 0.403)
δ₃	19.112	0.820	(17.518, 20.752)	$σ_{δ_{3}}^{2}$	13.696	5.151	(5.273, 23.948)
ϕ	19.672	8.486	(6.197, 36.399)	v₀	33.921	4.472	(25.751, 43.207)
υ	7.863	0.523	(6.886, 8.926)	τ	4.350	0.290	(3.787, 4.914)


Tables 6 and S.8 as well as Figures 4 and S.3 show that the posterior estimates of the degrees of freedom, the ν_k’s, vary across trials, with posterior means ranging from 5.408 to 10.479 (Table 6) and from 5.452 to 9.778 (Table S.8). Tables S.1 and S.9 and Figures 5 and S.4 indicate that the skewness parameters, the δ_jk’s, are very heterogeneous for the outcome variables LDL-C and TG and relatively homogeneous for HDL-C among the 26 trials. Finally, we see from Tables S.2, S.3, S.10, and S.11 that the magnitudes of the covariances and the variances differ across these 26 trials; interestingly, however, the signs of the correlations among the three outcome variables (LDL-C, HDL-C, and TG) are consistent across trials. These posterior estimates suggest that the skewness parameters, the covariance matrices of the multivariate responses, and the degrees of freedom in the multivariate t distributions for the error terms should be allowed to differ across trials, which further empirically confirms the finding according to the LPML criterion that the flexible skew model fits the cholesterol data better than the skew t model with τ = 30.

Figure 5. — Plots of the posterior estimates (mean and 95% HPD interval) for δ_k under the flexible skew model with random ν_a and τ for the cholesterol data.

To compute posterior estimates, including posterior means, posterior SDs, 95% HPD intervals, LPMLs, and boxplots of the $λ_{k}^{*}$'s, we used 20,000 MCMC samples, which were taken from every fifth iteration after a burn-in of 20,000 iterations. The convergence of the MCMC sampling algorithm was checked using several diagnostic procedures discussed in Chen et al. (2000). The HPD intervals were computed via the Monte Carlo method developed by Chen and Shao (1999). Computer code was written for the FORTRAN 95 compiler using IMSL subroutines with double precision accuracy. The FORTRAN code is available from the authors upon request.
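The HPD computation can be sketched as follows. This is a minimal illustration in the spirit of the Monte Carlo approach of Chen and Shao (1999), assuming a unimodal marginal posterior: sort the draws, consider every interval whose endpoints are [(1 − α)n] order statistics apart, and keep the shortest. The function names and the toy draws are hypothetical, and the index convention is one reasonable choice rather than a transcription of the authors' FORTRAN code. The helper `excludes_zero` encodes the significance convention used earlier in this section.

```python
import math

def hpd_interval(samples, alpha=0.05):
    """Shortest interval (s[j], s[j + m]) over the sorted draws s,
    where m = floor((1 - alpha) * n)."""
    s = sorted(samples)
    n = len(s)
    m = int(math.floor((1 - alpha) * n + 1e-9))   # small guard against float error
    if m >= n:
        return s[0], s[-1]
    best = None
    for j in range(n - m):
        lo, hi = s[j], s[j + m]
        if best is None or hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best

def excludes_zero(interval):
    """True when 0 lies strictly outside the interval."""
    lo, hi = interval
    return lo > 0 or hi < 0

# Toy draws with one extreme value; an 80% interval should avoid it.
toy = [0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
toy_hpd = hpd_interval(toy, alpha=0.2)

# Two 95% HPD intervals quoted in the text for onstatin x DM on LDL-C.
flexible_significant = excludes_zero((-1.598, 0.281))
tau30_significant = excludes_zero((-2.088, -0.225))
```

With the toy draws, the two candidate 80% intervals run from 0 to 8 and from 1 to 100, so the shorter one, from 0 to 8, is returned.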

6. DISCUSSION

In this paper, we have proposed a general and flexible multivariate skew and heavy-tailed meta-regression model for modeling individual level patient meta-data, which is a novel extension of the multivariate skew meta-regression model of Ibrahim et al. (2019). Due to the complexity of the proposed model, we have also developed an efficient MCMC sampling algorithm using the collapsed Gibbs technique of Liu (1994) and Chen et al. (2000) to carry out the posterior computations, which are challenging because of the large size of the meta-data and the high dimension of the random effects. In addition, we have proposed the logarithm of the pseudo marginal likelihood for model comparison. As was seen from the analysis of the (LDL-C, HDL-C, TG) data, the proposed model has substantially improved goodness-of-fit over the one developed in Ibrahim et al. (2019).

An extension to network meta-regression (NMR) of the proposed model for individual level patient network meta-data can be developed. Under the network regression setting, multiple treatments are compared using both direct comparisons of interventions within randomized controlled trials and indirect comparisons across trials based on a common comparator, accounting for covariates. Such an extension is potentially useful and significant in comparing and assessing the effects of cholesterol lowering drugs. A detailed development of this extension is beyond the scope of the current paper.

Supplementary materials available on the journal website consist of the full conditional distributions, additional tables (Tables S.1–S.3) of the posterior estimates under the flexible skew model with random ν_a and τ for the cholesterol data, and the posterior estimates (Tables S.4–S.11; Figures S.1–S.4) under the flexible skew model with ν_a = 25 and τ = 5.


ACKNOWLEDGEMENTS

Drs. Chen and Ibrahim’s research was partially supported by NIH grants #GM70335 and #P01CA142538. Dr. Kim’s research was supported by the Intramural Research Program of National Institutes of Health, National Cancer Institute.

APPENDIX

Appendix A. Proof of Proposition 4.1

The CPO_ik statistic for the ith patient in the kth trial is defined as

{CPO}_{i k} = \int_{Θ} f (y_{i k} ∣ X_{i k}, W_{i k}, θ) π (θ ∣ D_{obs}^{(- i k)}) d θ = \int_{Θ} f (y_{i k} ∣ X_{i k}, W_{i k}, θ) \times \frac{f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ) π (θ)}{\int_{Θ} f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ) π (θ) d θ} d θ = \int_{Θ} \int_{Z} \frac{[\prod_{k = 1}^{K} \prod_{i = 1}^{n_{k}} f_{y_{i k}, z_{i k}}^{A}] π (θ)}{\int_{Θ} f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ) π (θ) d θ} d z d θ,

(A.1)

where $f_{y_{i k}, z_{i k}}^{A} = f (y_{i k} ∣ X_{i k}, W_{i k}, θ, z_{i k}) π (z_{i k} ∣ τ)$ .

Let $Z_{- i k}$ denote the integration region of $z_{- i k}$ . Note that

\int_{Θ} f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ) π (θ) d θ = \int_{Θ} \int_{Z_{- i k}} f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ, z_{- i k}) \times π (z_{- i k} ∣ τ) π (θ) d z_{- i k} d θ = \int_{Θ} \int_{Z} h_{i k} (z_{i k}) f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ, z_{- i k}) \times π (z_{- i k} ∣ τ) π (θ) d z d θ,

(A.2)

where h_ik (z_ik) is any weight density function such that $\int_{z_{i k}} h_{i k} (z_{i k}) d z_{i k} = 1$ , and

f (y_{- i k} ∣ X_{- i k}, W_{- i k}, θ, z_{- i k}) π (z_{- i k} ∣ τ) = \frac{1}{f (y_{i k} ∣ X_{i k}, W_{i k}, θ, z_{i k}) π (z_{i k} ∣ τ)} \times [\prod_{k = 1}^{K} \prod_{i = 1}^{n_{k}} f (y_{i k} ∣ X_{i k}, W_{i k}, θ, z_{i k}) π (z_{i k} ∣ τ)] .

(A.3)

Thus, combining (A.1), (A.2), and (A.3) yields (4.3).

Appendix B. Derivation of an approximation of the optimal weight function

The optimal weight function is given by

h_{i k, o p t} (z_{i k}) = f (z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, y_{i k}, X_{i k}, W_{i k}) = \frac{f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})}{f (y_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})} = \frac{f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})}{f^{M} (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})},

where $f^{M} (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) = \int_{z_{i k} \in R_{+}^{J}} f (y_{i k}, z_{i k} ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) d z_{i k}$ . Let u_ik = log(z_ik) for i = 1, …, n_k, k = 1, …, K, so that z_ik = exp(u_ik). We note that the log and exp functions applied to a vector mean that the operations are applied to every element of the vector. Then, we have

h_{i k, o p t} (z_{i k}) = \frac{f_{y_{i k}, u_{i k}}^{A}}{\int_{u_{i k} \in R^{J}} f_{y_{i k}, u_{i k}}^{A} \times \prod_{j = 1}^{J} exp (u_{i j k}) d u_{i k}},

where $f_{y_{i k}, u_{i k}}^{A} = f (y_{i k}, exp (u_{i k}) ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k})$ and

h_{i k, o p t} (z_{i k}) \prod_{j = 1}^{J} exp (u_{i j k}) = \frac{f_{y_{i k}, u_{i k}}^{A} \times \prod_{j = 1}^{J} exp (u_{i j k})}{\int_{u_{i k} \in R^{J}} f_{y_{i k}, u_{i k}}^{A} \times \prod_{j = 1}^{J} exp (u_{i j k}) d u_{i k}} .

(A.4)

The right-hand side of (A.4) can be approximated by a multivariate normal density, which is constructed via the Laplace approximation to the joint density $f (y_{i k}, exp (u_{i k}) ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) \prod_{j = 1}^{J} exp (u_{i j k})$ . Denote $g (u_{i k}) = \log \{f (y_{i k}, exp (u_{i k}) ∣ β, δ_{k}, Σ_{k}, γ_{k}^{R}, τ, ν_{k}, X_{i k}, W_{i k}) \prod_{j = 1}^{J} exp (u_{i j k})\}$ . Letting $u_{i k}^{0}$ be the stationary point where ∇g(u_ik) = 0, and $B = - {\nabla^{2} g (u_{i k}) |}_{u = u_{i k}^{0}}$ , the function h_ik,opt (z_ik) $\prod_{j = 1}^{J} exp (u_{i j k})$ can be approximated by $ϕ (log (z_{i k}) ∣ u_{i k}^{0}, B^{- 1})$ , where $ϕ (\cdot ∣ u_{i k}^{0}, B^{- 1})$ denotes the density function of the $N (u_{i k}^{0}, B^{- 1})$ distribution. Thus, h_ik,opt (z_ik) is approximated by the function $\frac{ϕ (log (z_{i k}) ∣ u_{i k}^{0}, B^{- 1})}{\prod_{j = 1}^{J} z_{i j k}}$ . From equation (4.2), we have

g (u_{i k}) = log {f_{y_{i k}, u_{i k}}^{A} \times \prod_{j = 1}^{J} exp (u_{i j k})} \propto - \frac{ν_{k} + J}{2} log (A_{i k} (u_{i k}) + ν_{k}) - (J + τ + 1) log (\sum_{j = 1}^{J} exp (u_{i j k}) + τ) + \sum_{j = 1}^{J} u_{i j k},

(A.5)

where $A_{i k} (u_{i k}) = {(y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} (exp (u_{i k}) - E z_{i k}))}^{'} Σ_{k}^{- 1} (y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} (exp (u_{i k}) - E z_{i k}))$ . Using g(u_ik) in (A.5), we obtain ∇g(u_ik) and ∇²g(u_ik) after some algebra.
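As an indication of what this algebra yields, the gradient can be written out under two assumptions that are not spelled out in this appendix: that Δ_k = diag(δ_1k, …, δ_Jk), and that E z_ik does not depend on u_ik. Writing $r_{i k} = y_{i k} - X_{i k} β - W_{i k} γ_{k}^{R} - Δ_{k} (exp (u_{i k}) - E z_{i k})$ , so that $A_{i k} (u_{i k}) = r_{i k}^{'} Σ_{k}^{- 1} r_{i k}$ and $∂ A_{i k} (u_{i k}) / ∂ u_{i j k} = - 2 δ_{j k} exp (u_{i j k}) {[Σ_{k}^{- 1} r_{i k}]}_{j}$ , differentiating (A.5) term by term gives, for j = 1, …, J,

$\frac{∂ g (u_{i k})}{∂ u_{i j k}} = \frac{(ν_{k} + J) δ_{j k} exp (u_{i j k}) {[Σ_{k}^{- 1} r_{i k}]}_{j}}{A_{i k} (u_{i k}) + ν_{k}} - \frac{(J + τ + 1) exp (u_{i j k})}{\sum_{l = 1}^{J} exp (u_{i l k}) + τ} + 1$ .

The Hessian follows by differentiating this expression once more with respect to each component of u_ik. This sketch should be read as a consistency check on (A.5) rather than as the authors' own derivation.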

Contributor Information

Sungduk Kim, Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, USA.

Ming-Hui Chen, Department of Statistics, University of Connecticut, Storrs, CT, USA.

Joseph Ibrahim, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA.

Arvind Shah, Clinical Biostatistics, Merck & Co., Inc., Rahway, NJ, USA.

Jianxin Lin, Clinical Biostatistics, Merck & Co., Inc., Rahway, NJ, USA.

REFERENCES

Adcock C (2004). Capital Asset Pricing for UK Stocks under the Multivariate Skew-Normal Distribution. Boca Raton, FL: Chapman & Hall/CRC.
Arellano-Valle R and Genton M (2010). Multivariate unified skew-elliptical distributions. Chilean Journal of Statistics, 1:17–33.
ATP III (2001). Executive summary of the third report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III). Journal of the American Medical Association, 285:2486–2497.
Austin M, Hokanson J, and Edwards K (1998). Hypertriglyceridemia as a cardiovascular risk factor. The American Journal of Cardiology, 81:7B–12B.
Azzalini A and Capitanio A (2003). Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution. Journal of the Royal Statistical Society, Series B, 65:367–389.
Baigent C, Blackwell L, et al., for the Cholesterol Treatment Trialists’ (CTT) Collaboration (2010). Efficacy and safety of more intensive lowering of LDL cholesterol: a meta-analysis of data from 170,000 participants in 26 randomised trials. Lancet, 376:1670–1681.
Belias M, Rovers M, Reitsma J, Debray T, and IntHout J (2019). Statistical approaches to identify subgroups in meta-analysis of individual participant data: a simulation study. BMC Medical Research Methodology, 19:183.
Bhatt D, Steg P, Miller M, Brinton E, Jacobson T, Ketchum S, Doyle R Jr, Juliano R, Jiao L, Granowitz C, Tardif J-C, Ballantyne C, for the REDUCE-IT Investigators (2019). Cardiovascular risk reduction with icosapent ethyl for hypertriglyceridemia. The New England Journal of Medicine, 380:11–22.
Branco M and Dey D (2001). A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79:99–113.
Burke D, Ensor J, and Riley R (2017). Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ. Statistics in Medicine, 36:855–875.
Cannon C, Blazing M, Giugliano R, McCagg A, White JA, Theroux P, Darius H, Lewis B, Ophuis TO, Jukema J, Ferrari GD, Ruzyllo W, Lucca PD, Im K, Bohula E, Reist C, Wiviott S, Tershakovec A, Musliner T, Braunwald E, Califf R, for the IMPROVE-IT Investigators (2015). Ezetimibe added to statin therapy after acute coronary syndromes. The New England Journal of Medicine, 372:2387–2397.
Chang SC and Zimmerman DL (2016). Skew-normal antedependence models for skewed longitudinal data. Biometrika, 103:363–376.
Chen M-H, Dey D, and Shao Q (1999). A new skewed link model for dichotomous quantal response data. Journal of the American Statistical Association, 94:1172–1186.
Chen M-H and Shao Q (1999). Monte Carlo estimation of Bayesian credible and HPD intervals. Journal of Computational and Graphical Statistics, 8:69–92.
Chen M-H, Shao Q, and Ibrahim J (2000). Monte Carlo Methods in Bayesian Computation. New York: Springer.
Genton M (2004). Skew-Elliptical Distributions and Their Applications: A Journey Beyond Normality. Boca Raton, FL: Chapman & Hall/CRC.
Heron M (2019). Deaths: Leading causes for 2017. National Vital Statistics Reports, 68:June 24.
Hokanson J and Austin M (1996). Plasma triglyceride level is a risk factor for cardiovascular disease independent of high-density lipoprotein cholesterol level: a meta-analysis of population-based prospective studies. Journal of Cardiovascular Risk, 3:213–219.
Ibrahim J, Chen M-H, and Sinha D (2001). Bayesian Survival Analysis. New York: Springer.
Ibrahim J, Kim S, Chen M-H, Shah A, and Lin J (2019). Bayesian multivariate skew meta-regression models for individual patient data. Statistical Methods in Medical Research, 28:3415–3436.
Kashani A, Sallam T, Bheemreddy S, Mann D, Wang Y, and Foody J (2008). Review of side-effect profile of combination ezetimibe and statin therapy in randomized clinical trials. The American Journal of Cardiology, 101:1606–1613.
Kim S, Chen M-H, and Dey D (2008). Flexible generalized t-link models for binary response data. Biometrika, 95:93–106.
Kim S, Chen M-H, Ibrahim J, Shah A, and Lin J (2013). Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs. Statistics in Medicine, 32:3972–3990.
Leiter L, Betteridge D, Farnier M, Guyton J, Lin J, Shah A, Johnson-Levonas A, and Brudi P (2011). Lipid-altering efficacy and safety profile of combination therapy with ezetimibe/statin vs. statin monotherapy in patients with and without diabetes: an analysis of pooled data from 27 clinical trials. Diabetes, Obesity and Metabolism, 13:615–628.
Liu J (1994). The collapsed Gibbs sampler in Bayesian computations with applications to a gene regulation problem. Journal of the American Statistical Association, 89:958–966.
Riley R, Price M, Jackson D, Wardle M, Gueyffier F, Wang J, Staessen J, and White I (2015). Multivariate meta-analysis using individual participant data. Research Synthesis Methods, 6:157–174.
Ritz J, Demidenko E, and Spiegelman D (2008). Multivariate meta-analysis for data consortia, individual patient meta-analysis, and pooling projects. Journal of Statistical Planning and Inference, 138:1919–1933.
Sahu S, Dey D, and Branco M (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Canadian Journal of Statistics, 31:129–150.
Sarwar N, Danesh J, Eiriksdottir G, Sigurdsson G, Wareham N, Bingham S, Boekholdt S, Khaw K, and Gudnason V (2007). Triglycerides and the risk of coronary heart disease: 10,158 incident cases among 262,525 participants in 29 Western prospective studies. Circulation, 115:450–458.
Simmonds M and Higgins J (2007). Covariate heterogeneity in meta-analysis: Criteria for deciding between meta-regression and individual patient data. Statistics in Medicine, 26:2982–2999.
Zhang D, Chen M-H, Ibrahim J, Boye M, and Shen W (2017). Bayesian model assessment in joint modeling of longitudinal and survival data with applications to cancer clinical trials. Journal of Computational and Graphical Statistics, 26:121–133.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary

NIHMS1619624-supplement-Supplementary.pdf^{(368.9KB, pdf)}
