Author manuscript; available in PMC: 2008 Jun 1.
Published in final edited form as: Biometrics. 2007 Aug 30;64(2):413–423. doi: 10.1111/j.1541-0420.2007.00885.x

Nonparametric Bayes testing of changes in a response distribution with an ordinal predictor

Michael L Pennell 1,*, David B Dunson 2
PMCID: PMC2391093  NIHMSID: NIHMS33453  PMID: 17764484

Summary

In certain biomedical studies, one may anticipate changes in the shape of a response distribution across the levels of an ordinal predictor. For instance, in toxicology studies, skewness and modality might change as dose increases. To address this issue, we propose a Bayesian nonparametric method for testing for distribution changes across an ordinal predictor. Using a dynamic mixture of Dirichlet processes, we allow the response distribution to change flexibly at each level of the predictor. In addition, by assigning mixture priors to the hyperparameters, we can obtain posterior probabilities of no effect of the predictor and identify the lowest dose level for which there is an appreciable change in distribution. The method also provides a natural framework for performing tests across multiple outcomes. We apply our method to data from a genotoxicity experiment.

Keywords: Dirichlet process, Dose-response, Nonparametric Bayes, Toxicology, Trend test

1. Introduction

In many biomedical studies, there is interest in evaluating changes in a response distribution across an ordinal predictor. For example, toxicologists are often interested in assessing whether several biological responses vary in distribution with dose. In such settings, normality assumptions are typically not justified and there are biological reasons to expect changes in not only the location but also the variance and shape with dose. In particular, when there is substantial heterogeneity amongst subjects in their biological response to treatment due to variation in gene expression, timing of the cell cycle, and other factors, changes in variance, skewness, and even modality are natural.

Several frequentist methods have been proposed for testing for trends in non-Gaussian responses. Univariate approaches, such as Jonckheere's test (Terpstra, 1952; Jonckheere, 1954), are used most frequently, though other authors have proposed related methods for multivariate responses (e.g., Dietz and Killeen, 1981; Dietz, 1989; O'Brien, 1984; Huang et al., 2005). Unfortunately, these methods are most sensitive to changes in the location of a distribution and may ignore important trends in shape. In contrast, k-sample tests based on empirical distribution functions (e.g., Kiefer, 1959; Ahmad, 1976) are sensitive to changes in distribution shape in addition to changes in location and scale.

Bayesian nonparametric approaches have several advantages over frequentist alternatives, including the ability to incorporate prior information (e.g., from historical controls) and exact inferences on unknown distributions. Although there is an extensive literature on estimation (for a review see Müller and Quintana, 2004) and a growing literature on goodness of fit testing (e.g., Berger and Guglielmi, 2001; Carota and Parmigiani, 1996; Verdinelli and Wasserman, 1998), there is a paucity of methods for nonparametric testing across multiple groups. Gopalan and Berry (1998) use a nonparametric Dirichlet process prior (DPP; Ferguson, 1973, 1974) to perform multiple comparisons on distribution means. Recently, Basu and Chib (2003) proposed a less restrictive method which compares semiparametric models using Bayes factors. This method may be promising when there are only 2−3 levels of an ordinal predictor. However, in toxicology studies with several dose groups, this approach would be unwieldy when one considers each possible contrast between dose groups.

In analyzing data from multiple groups, one would typically be interested not only in testing but also in estimation of group-specific distributions. Such estimation can potentially be accomplished by fitting a DP mixture (DPM; Antoniak, 1974) model separately to the different dose groups. For some references on applications of DPMs, refer to West et al. (1994), Escobar and West (1995), Bush and MacEachern (1996), Kleinman and Ibrahim (1998), and Mukhopadhyay and Gelfand (1997). The disadvantage of such an approach is its inability to model trends and borrow information across the different dose groups, which is particularly important in applications having a modest number of subjects per group. Note that applications of the Basu and Chib (2003) procedure to test differences between groups would also have this disadvantage.

An alternative to using separate DPMs is to consider the dose group-specific distributions as a collection of dependent unknown distributions, which are assigned a dependent nonparametric prior. Dependent nonparametric priors have been the focus of a growing body of literature. In the early work of Cifarelli and Regazzini (1978), dependence was induced through a regression model within the parametric base measure of the DP (for related work see Mira and Petrone, 1996; Tomlinson and Escobar, 1999; Carota and Parmigiani, 2002; Giudici et al., 2003). Although the method is straightforward, it has limited flexibility. Other authors proposed inducing nonparametric dependence by parameterizing random measures as products of DP distributed factors, though within a different framework than ours (see, e.g., Gelfand and Kottas, 2001; Pennell and Dunson, 2006). MacEachern (1999; 2000) proposed a dependent Dirichlet process which characterizes dependency by defining a stochastic process for the atoms in the Sethuraman (1994) stick-breaking characterization of the DP. This approach has been successfully applied in ANOVA (De Iorio et al., 2004) and spatial (Gelfand et al., 2004) applications. A limitation of these applications is the assumption of fixed weights on the atoms, which does not allow new features to appear as dose increases. To solve this problem, Dunson (2006) proposed a dynamic mixture of DPs (DMDP), which is related to a mixture structure originally proposed by Müller et al. (2004) to combine inferences across related nonparametric models. More flexible dependent nonparametric priors for continuous predictors have been proposed by Griffin and Steel (2006), Dunson et al. (2007), and Duan et al. (2005).

Motivated by tractability in addressing the problem of nonparametric Bayes testing of changes with dose, we focus here on generalizing the Dunson (2006) approach. Our goal is to obtain posterior probabilities for local and global null hypotheses corresponding to equivalence in an unknown distribution between groups. Instead of requiring exact equivalence, we treat the distributions in adjacent groups as effectively equivalent if the total variation norm between their probability measures is less than some small constant ∊. We use a hierarchical modification of the DMDP to borrow information across groups while assigning probabilities to the local null and alternative hypotheses. Using an MCMC implementation, we obtain posterior model probabilities for the local and global hypotheses from a single run, while also producing posterior distributions for thresholds and estimates of group-specific distributions. Multiple response data can be accommodated without complication.

In Section 2, we describe our generalization of the DMDP and hyperprior structure. In Section 3, we outline an MCMC methodology for testing and estimation. In Section 4, we evaluate our approach using simulated data. In Section 5 we apply our method to data from a genotoxicity experiment. In Section 6 we discuss the results and future work.

2. Nonparametric Model and Prior Structure

2.1 General Framework

Let yhi be a vector of q continuous outcomes measured on subject i (i = 1, . . . , nh) in group h (h = 1, . . . , d) and Xh be a score for group h, where Xh−1 < Xh < Xh+1. In a toxicology study, Xh is the dose administered to each animal in group h, with X1 = 0 representing the control group. To relax distributional assumptions, we assume that yhi ∼ Fh, which has the density function

f_h(y_{hi}) = \int N(y_{hi}; \mu_{hi}, \Sigma_{hi}) \, dG_h(\mu_{hi}, \Sigma_{hi}),   (1)

where Gh is unknown. Hence, the density of y in group h is an infinite mixture of multivariate Gaussian densities, with the mixture distribution varying across groups. As the number of components increases without bound, a mixture of Gaussians converges uniformly to any smooth density function (see Sorenson and Alspach, 1971).

It is clear that differences in the mixture distributions, Gh and Gh+1, imply differences in the outcome distributions, Fh and Fh+1. Hence, we focus on testing for changes in the mixture distributions. In particular, our focus is on local null hypotheses characterizing differences in adjacent groups and on global hypotheses representing intersections of these local hypotheses. As in the Kolmogorov-Smirnov test, we could specify a point null hypothesis which requires G1, . . . , Gd to match. However, many have argued that point nulls are artificial and are rarely observed in practice (see, e.g., Berger and Delampady, 1987; Nickerson, 2000). In this paper, we use a more realistic null hypothesis under which the group-specific distributions effectively match. We formalize this hypothesis below.

Let G1, . . . , Gd denote probability measures corresponding to the mixture distributions for the different groups. In addition, let B denote any Borel set such that B ∈ 𝓑, with B ⊂ ℝ^q and 𝓑 the Borel sigma-algebra of subsets of ℝ^q, and let the total variation norm of Gh+1 − Gh be defined as

\| G_{h+1} - G_h \|_{TV} = \max_{B \in \mathcal{B}} | G_{h+1}(B) - G_h(B) |.

When ‖Gh+1 − Gh‖TV = 0, we say that Gh+1 and Gh match. However, we are interested in local nulls which only require Gh+1 and Gh to effectively match, which we specify as follows:

H_{0h}: \| G_{h+1} - G_h \|_{TV} \le \epsilon   (2)

for h = 1, . . . , d − 1, where ∊ is some small constant such that, when H0h holds, there is no appreciable difference in the mixture distributions across groups h and h + 1. The global null of no changes in the response distribution across groups (H0) corresponds to the intersection of these local nulls. Note that in using the total variation norm to define our hypothesis tests, we treat differences across all sets as being potentially important in terms of the application. This approach avoids the need to specify a response range which is of particular interest. Such a range may be difficult to choose objectively, particularly in toxicology studies, as it is typically not known how to exactly equate a response value in an animal to a corresponding value in humans.
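For probability measures with mass functions (or densities) on a common support, the total variation norm above equals half the L1 distance between them. A minimal illustrative sketch of the effective-equivalence criterion (2), using hypothetical discrete distributions of our own choosing rather than anything from the paper:

```python
def tv_norm(p, q):
    """Total variation distance between two discrete distributions on a
    common support: max_B |P(B) - Q(B)| = 0.5 * sum_i |p_i - q_i|.
    (The maximizing Borel set B collects the points where p_i > q_i.)"""
    assert len(p) == len(q)
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Two hypothetical distributions that "effectively match" at tolerance eps:
p = [0.50, 0.30, 0.20]
q = [0.52, 0.29, 0.19]
eps = 0.05
print(tv_norm(p, q) <= eps)  # True: the local null H_0h would hold here
```

The identity with half the L1 distance is what makes the criterion computable in the discrete illustration; the paper itself works with the norm over all Borel sets.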

2.2 DMDP Model

In addition to conducting hypothesis tests, we would like to estimate the density in each dose group. As mentioned in Section 1, a promising approach would be to assign a dependent nonparametric prior to G1, . . . , Gd, thereby allowing us to borrow information across groups. Using the method of Müller et al. (2004), we could let Gh = π G0 + (1 − π) Gh* for h = 1, . . . , d, where G0 is a global probability measure and Gh* is an innovation measure specific to group h, with G0, G1*, . . . , Gd* assigned independent nonparametric priors. However, this approach results in an over-specified model in that d + 1 random measures are incorporated to characterize d unknown distributions. Teh et al. (2006) proposed an alternative model which assumes that the Gh are drawn from DPPs with a common DP-distributed base measure G0. However, these hierarchical priors treat the groups as exchangeable. To incorporate information on the ordering of the groups, Dunson (2006) proposed a dynamic mixture of Dirichlet processes (DMDP), which is conceptually related to the Müller et al. (2004) approach but avoids the over-specification problem.

We first consider the case of two groups (d = 2). To induce dependency, we model the mixture distributions using the following DMDP:

G_1 \sim DP(\alpha_0 G_0), \qquad G_2 = (1 - \pi_1) G_1 + \pi_1 G_1^*, \qquad G_1^* \sim DP(\alpha_1 G_{01}),   (3)

where 0 ≤ π1 ≤ 1, G1* is an innovation distribution that characterizes changes in the mixture distribution of y caused by increasing the group score from X1 (e.g., dose = 0) to X2 (e.g., dose > 0), and G01 is a known distribution. Note that the DMDP implies the following densities for y in groups 1 and 2:

f_1(y) = \int N(y; \mu, \Sigma) \, dG_1(\mu, \Sigma), \qquad f_2(y) = (1 - \pi_1) f_1(y) + \pi_1 \int N(y; \mu, \Sigma) \, dG_1^*(\mu, \Sigma).   (4)

Hence, the distribution of y in group 2 is a mixture of the distribution in group 1 and a different DP mixture of normals characterized by mixture distribution G1*. This is a natural model for toxicology data since we would expect the distribution in the treatment group, F2, to share features with the distribution in the control group, F1, but innovations may have occurred due to stress induced by the test chemical. The amount of borrowing of information from the control group and the level of innovation are represented by the weights (1 − π1) and π1, respectively.

In the above model, it is convenient to choose G0 ≡ G01, with these base measures having a conjugate normal-inverse Wishart form: G0(μ, Σ) = N(μ; μ0, κΣ) W(Σ^{-1}; v0, v0^{-1} V0^{-1}), where W(·; v0, v0^{-1} V0^{-1}) denotes the Wishart density with degrees of freedom v0 and mean V0^{-1}. In this case, our nonparametric prior for F1 is centered on a multivariate t-distribution, while the prior for F2 is centered on a mixture of multivariate t-distributions. As α0 and α1 increase, we place more weight on this parametric base model for the unknown densities. By choosing hyperprior densities for α0 and α1 (as we describe in Section 2.4), the method can adapt flexibly to accommodate lack of fit in the multivariate t components.
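Draws from the DPPs above can be generated approximately via the Sethuraman (1994) stick-breaking construction invoked later in Section 2.3. A truncated univariate sketch, assuming an N(0, 1) base measure in place of the normal-inverse Wishart G0 and a finite truncation level K (both simplifications of ours, not the paper's specification):

```python
import random

def stick_breaking_dp(alpha, base_sampler, K=500, rng=random):
    """Truncated Sethuraman stick-breaking draw from DP(alpha * G0):
    weights w_h = V_h * prod_{l<h} (1 - V_l) with V_h ~ Be(1, alpha),
    and atoms theta_h drawn i.i.d. from the base measure G0.
    Returns (weights, atoms); the first K sticks are kept."""
    weights, atoms, remaining = [], [], 1.0
    for _ in range(K):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)   # mass broken off the remaining stick
        atoms.append(base_sampler())
        remaining *= (1.0 - v)          # stick left for later components
    return weights, atoms

rng = random.Random(1)
w, theta = stick_breaking_dp(alpha=1.0,
                             base_sampler=lambda: rng.gauss(0.0, 1.0),
                             K=500, rng=rng)
```

With a moderate α and K in the hundreds, the truncated weights exhaust essentially all of the unit mass, which is why truncations of this kind are commonly used in practice.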

Extending our model to d groups, for h = 2, . . . , d we let

G_h = (1 - \pi_{h-1}) G_{h-1} + \pi_{h-1} G_{h-1}^*
    = \Big\{ \prod_{l=1}^{h-1} (1 - \pi_l) \Big\} G_1 + \sum_{l=1}^{h-2} \Big\{ \prod_{m=l+1}^{h-1} (1 - \pi_m) \Big\} \pi_l G_l^* + \pi_{h-1} G_{h-1}^*
    = \omega_{h1} G_1 + \sum_{l=1}^{h-1} \omega_{h,l+1} G_l^*,
    \qquad G_l^* \sim DP(\alpha_l G_0), \quad l = 1, \ldots, h-1,   (5)

where ω_{hl} = π_{l−1} ∏_{m=l}^{h−1} (1 − π_m) for l = 1, . . . , h − 1 and ω_{hl} = π_{h−1} for l = h, with π_0 = 1, and ω_h = (ω_{h1}, . . . , ω_{hh}) are group-specific probability weights on the different components in the mixture. Note that the above DMDP is not exchangeable in (G1, . . . , Gd). As we move from predictor level h to h + 1, we decrease the weight assigned to G1, G1*, . . . , Gh−1* and introduce a new unknown distribution to our mixture, Gh*. As mentioned previously, this is a natural model for toxicology data; with every increase in dose level, h to h + 1, individuals are subjected to higher levels of stress, resulting in changes in the response distribution characterized by Gh*. Hence, we would expect the response distributions at adjacent doses to be more similar than those of dose groups several levels apart, as is the case in our model.
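The weights ω_h = (ω_{h1}, . . . , ω_{hh}) implied by (5) can be computed directly from π_1, . . . , π_{h−1}. A small sketch (the function name and example values are ours, not the paper's):

```python
def dmdp_weights(pi, h):
    """Mixture weights (omega_{h1}, ..., omega_{hh}) in (5) for group h,
    given pi = [pi_1, ..., pi_{d-1}] and the convention pi_0 = 1:
      omega_{hl} = pi_{l-1} * prod_{m=l}^{h-1} (1 - pi_m)  for l < h,
      omega_{hh} = pi_{h-1}."""
    pi = [1.0] + list(pi)            # prepend pi_0 = 1 for 1-based indexing
    omega = []
    for l in range(1, h):
        prod = 1.0
        for m in range(l, h):
            prod *= (1.0 - pi[m])
        omega.append(pi[l - 1] * prod)
    omega.append(pi[h - 1])          # weight on the newest innovation G*_{h-1}
    return omega

w = dmdp_weights([0.3, 0.5], h=3)
print([round(x, 4) for x in w])   # [0.35, 0.15, 0.5]: weights on G_1, G_1*, G_2*
```

By construction the weights always sum to one, which is easy to verify for any π vector.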

Another important feature of our model is that it imposes a type of order restriction across groups. For instance, if the mixture distribution changes when we increase the predictor from the group h to the group h + 1 level, all later groups (h + 2, . . . , d) will also differ from group h. This is due to the following theorem:

THEOREM 1

For any Gh, Gh+1, and Gh+j+1 distributed according to (5) (h = 1, . . . , d − 2; j = 1, . . . , d − h − 1),

\mathrm{Pr}\big( \| G_{h+j+1} - G_h \|_{TV} \le \epsilon \,\big|\, \| G_{h+1} - G_h \|_{TV} > \epsilon \big) = 0.

A proof of this theorem is provided in Web Appendix A. While it is possible, conceptually, for the distribution in group h + j + 1 to shift back toward the distribution in group h, it is unlikely in any application for groups h and h + j + 1 to effectively match when groups h and h + 1 differ. For instance, in a toxicology experiment it is possible for a feedback mechanism to activate at a dose greater than h + 1, causing Gh+j+1 to resemble Gh. However, the feedback should not negate the effect of increasing dose from the group h to the group h + 1 level. Biological plausibility motivated our formulation of the model hypotheses in this order-dependent manner.

2.3 Model Space Prior

In the two sample case of expression (3),

H_0: \| G_2 - G_1 \|_{TV} = \max_{B \in \mathcal{B}} | G_2(B) - G_1(B) | = \max_{B \in \mathcal{B}} | (1 - \pi_1) G_1(B) + \pi_1 G_1^*(B) - G_1(B) | = \pi_1 \| G_1^* - G_1 \|_{TV} \le \epsilon.   (6)

Note that H0 holds for any fixed ∊ if (1) π1 ≤ ∊, regardless of G1 and G1*, or (2) ‖G1* − G1‖TV ≤ ∊, regardless of π1. Hence, the value of π1 and the independent DPPs on G1 and G1* control the prior probability allocated to H0. If probability close to zero is allocated to H0, then we may expect the posterior probability of H0 to be small unless the sample size is large, clearly an unappealing property in most applications. Thus, we want to be able to control the prior probability allocated to H0.

We first consider the possibility of controlling Pr(H0) through the priors for G1 and G1*. The DPPs G1 ∼ DP(α0G0) and G1* ∼ DP(α1G0) imply that

\| G_1 - G_1^* \|_{TV} = \max_{B \in \mathcal{B}} \left| \sum_{h=1}^{\infty} V_{1h} \prod_{l < h} (1 - V_{1l}) \, \delta_{\theta_{1h}}(B) - \sum_{h=1}^{\infty} V_{2h} \prod_{l < h} (1 - V_{2l}) \, \delta_{\theta_{2h}}(B) \right|,

where θ1h and θ2h ∼ G0, V1l ∼ Be(1, α0), V2l ∼ Be(1, α1), and δθ denotes a degenerate distribution with mass at θ (Sethuraman, 1994). Suppose that we choose the Borel set B to contain (θ1h, h = 1, . . . , ∞) but not (θ2h, h = 1, . . . , ∞); because G0 is continuous, the two sets of atoms are almost surely disjoint, so such a set exists. Then, for a set B satisfying these conditions, δθ1h(B) = 1 for h = 1, . . . , ∞ and δθ2h(B) = 0 for h = 1, . . . , ∞. Hence, the above absolute value reduces to

\sum_{h=1}^{\infty} V_{1h} \prod_{l < h} (1 - V_{1l}) = 1.

The theorem below follows directly:

THEOREM 2

Let G0 be a continuous base measure. For any G1 ∼ DP(α0G0) independent of G1* ∼ DP(α1G0),

\| G_1 - G_1^* \|_{TV} = 1.

From Theorem 2 we have Pr(‖G1 − G1*‖TV < ∊) = 0. Thus, zero probability is allocated to H0 by the total variation component, which implies that Pr(H0) is entirely controlled by π1. Hence, one can effectively replace H0 in (6) with π1 ≤ ∊*, where ∊* is also close to zero. Extending this rationale to the d-group setting, we let

H_{0h}: \pi_h \le \epsilon^*.   (7)

To control the prior probability allocated to H0h, we propose a prior for πh which is a mixture of its distributions under H0h and H1h: πh > ∊*. Let ζh be a latent variable that equals 1 when the local null H0h is true and 0 otherwise, and let Pr(ζh = 1) = p0h. Given ζh, our prior for πh has the following form:

\pi_h \mid \zeta_h \sim \zeta_h \, \mathrm{Unif}(0, \epsilon^*) + (1 - \zeta_h) \, \mathrm{Unif}(\epsilon^*, 1), \qquad h = 1, \ldots, d - 1.   (8)

In the above mixture, p0 = (p01, . . . , p0,d−1) is chosen to reflect prior knowledge about the chances of H0 being true. For example, to assign equal prior weight to H0 and H1, p0 should be chosen to satisfy Pr(H0) = ∏_{h=1}^{d−1} p0h = 0.5. Under the Bayesian Bonferroni method of Westfall et al. (1997), this may be achieved by setting p01 = · · · = p0,d−1 = 0.5^{1/(d−1)}. However, for large d, the probability of each local null hypothesis is nearly 1, which can make this approach overly conservative. We instead use a model space prior described by Hans and Dunson (2005). This approach induces dependency in the local null hypotheses by assuming p0h = p0 for h = 1, . . . , d − 1, where p0 ∼ Be(ap, bp). As a consequence of this prior, Pr(H0h | {ζj : j ≠ h}) = (ap + Σ_{j≠h} ζj)/(ap + bp + d − 2). Hence, when several H0j (j ≠ h) are true, we shrink toward no change between groups h and h + 1, which is conceptually appealing in dose-response studies.
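Drawing from this model space prior is straightforward: sample the shared p0 from its beta distribution, then each ζh, and, given ζh, each πh from the mixture (8). A sketch using the ∊* = 0.05 and Be(0.725, 0.275) values adopted in the Section 4 simulations (the function name is ours):

```python
import random

def sample_local_priors(d, a_p, b_p, eps_star=0.05, rng=random):
    """Draw (p0, zeta, pi) from the model space prior of Section 2.3:
    p0 ~ Be(a_p, b_p) shared across local hypotheses,
    zeta_h ~ Bernoulli(p0), and pi_h | zeta_h from the mixture (8):
    Unif(0, eps*) under H_0h, Unif(eps*, 1) under H_1h."""
    p0 = rng.betavariate(a_p, b_p)
    zeta, pi = [], []
    for _ in range(d - 1):
        z = 1 if rng.random() < p0 else 0
        if z == 1:
            pi_h = rng.uniform(0.0, eps_star)   # local null: pi_h <= eps*
        else:
            pi_h = rng.uniform(eps_star, 1.0)   # local alternative
        zeta.append(z)
        pi.append(pi_h)
    return p0, zeta, pi

rng = random.Random(0)
p0, zeta, pi = sample_local_priors(d=6, a_p=0.725, b_p=0.275, rng=rng)
```

Under this construction, every draw with ζh = 1 produces a πh at or below ∊*, so the prior mass on each local null is exactly the shared p0.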

As in Hans and Dunson (2005), ap and bp are chosen so that, on average, Pr(H0) = 0.5, i.e., E(p0^{d−1}) = 0.5. This implies that ap and bp must satisfy the following:

0.5 = \int_0^1 p_0^{d-1} \, \frac{\Gamma(a_p + b_p)}{\Gamma(a_p)\Gamma(b_p)} \, p_0^{a_p - 1} (1 - p_0)^{b_p - 1} \, dp_0 = \frac{\Gamma(a_p + b_p) \, \Gamma(a_p + d - 1)}{\Gamma(a_p) \, \Gamma(a_p + b_p + d - 1)}.   (9)

We then impose the constraint ap+bp = 1 to represent unit information in the prior and solve (9) numerically for ap and bp.
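With the constraint ap + bp = 1, equation (9) reduces to Γ(ap + d − 1)/{Γ(ap)Γ(d)} = 0.5, since Γ(ap + bp) = Γ(1) = 1, and the left side is increasing in ap, so a one-dimensional bisection suffices. A sketch of the numerical solve (our implementation; only the equation and targets come from the paper):

```python
from math import lgamma, exp

def solve_prior_weights(d, target=0.5, lo=1e-6, hi=1.0 - 1e-6, tol=1e-12):
    """Solve (9) for a_p under the unit-information constraint a_p + b_p = 1,
    so that E(p0^{d-1}) = target with p0 ~ Be(a_p, b_p). With a_p + b_p = 1,
    (9) becomes Gamma(a_p + d - 1) / (Gamma(a_p) * Gamma(d)) = target,
    which is increasing in a_p; bisection on [lo, hi] finds the root."""
    def f(a):
        # log-gamma keeps the ratio numerically stable
        return exp(lgamma(a + d - 1) - lgamma(a) - lgamma(d)) - target
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    a_p = 0.5 * (lo + hi)
    return a_p, 1.0 - a_p

a_p, b_p = solve_prior_weights(d=6)
print(round(a_p, 3), round(b_p, 3))  # 0.725 0.275
```

For d = 6 groups, as in the simulation studies, this recovers the Be(0.725, 0.275) prior used in Section 4.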

2.4 Hyperpriors for DP Parameters

To decrease the sensitivity of our model to subjectively chosen hyperparameters, we specify the following hyperpriors:

\kappa^{-1} \sim \mathrm{Ga}(a_\kappa, b_\kappa), \qquad \mu_0 \sim N(\beta, \Omega), \qquad \alpha_h \sim \mathrm{Ga}(a_{\gamma h}, b_{\gamma h}) \quad \text{for } h = 0, 1, \ldots, d - 1.

In practice, one may favor priors which assign high prior probability to low values of αh, since this implies that the number of mixture components is small (Antoniak, 1974) and a very flexible density estimator can be produced with few mixture components. Hence, a reasonable choice is aγh = aγ = 1 and bγh = bγ = 1. We also recommend following Escobar and West (1995) by choosing diffuse priors for κ−1 and μ0, with aκ/bκ small and, when possible, β chosen to reflect the mean of y in previous studies. Potentially, hyperpriors could also be assigned to V0 and v0, though we choose to fix these parameters to avoid over-parameterization. We recommend choosing V0 so that a N(β, V0) distribution sufficiently covers the expected range of values of y and choosing v0 to reflect one's confidence in this choice. When there is little prior information on V0, we strongly recommend performing a sensitivity analysis on v0, as small values can often result in convergence or numerical problems, particularly in the case of a multivariate response.

3. Posterior Computation

3.1 Gibbs Sampling Methodology

We sample from the joint posterior of our model parameters using a Gibbs sampling methodology which is related to the Bush and MacEachern (1996) and West et al. (1994) approaches for the Dirichlet process mixture. Details may be found in Web Appendix B.

3.2 Hypothesis Testing

The global null may be formally evaluated using Rao-Blackwellization (Gelfand and Smith, 1990). For T iterations of the MCMC with a burn-in of Tb iterations, we compute

\widehat{\mathrm{Pr}}(H_0 \mid \mathrm{Data}) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} \mathrm{Pr}(\zeta_1 = \cdots = \zeta_{d-1} = 1 \mid p_0^{(t)}, M^{(t)}) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} \prod_{h=1}^{d-1} p_{0h}^{(t)},   (10)

where the superscript (t) denotes a value sampled at iteration t, M^{(t)} = {M_{hi}^{(t)}; h = 1, . . . , d; i = 1, . . . , nh}, M_{hi} ∈ {1, . . . , h} is the mixture indicator for subject i in group h, and p_{0h}^{(t)} = Pr(ζh = 1 | p_0^{(t)}, M^{(t)}). The explicit form for p0h is given in Web Appendix B.

A common approach is to reject H0 if (10) is less than some small pre-determined value (e.g., 0.05). However, it is appealing in a Bayesian analysis to report the posterior null hypothesis probability, or alternatively the Bayes factor, as a weight of evidence, avoiding the choice of an arbitrary cutoff for determining significance. Hence, we also recommend reporting that there is evidence against H0 when the posterior probabilities are only moderately small (i.e., between 0.05 and 0.1).

An attractive feature of our methodology is that we can test local hypotheses within the same Markov chain used to test H0. Since we properly calibrate our prior for p0 to give equal prior probability to H0 and H1, the posterior probabilities of the local nulls do not need to be adjusted (see Westfall et al., 1997). For instance, the posterior probability of a distribution change between groups h and h + 1 would be the average value of p0h over TTb iterations. In toxicology studies, there is also interest in comparing each dose group to control (group 1). The null hypothesis of no difference in the distributions of groups 1 and h is

H_{0,h1}: \pi_j \le \epsilon^*, \quad j = 1, \ldots, h - 1,   (11)

and we estimate its posterior probability as

\widehat{\mathrm{Pr}}(H_{0,h1} \mid \mathrm{Data}) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} \prod_{j=1}^{h-1} p_{0j}^{(t)}.   (12)
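Given the saved post-burn-in draws p_{0h}^{(t)}, the global estimate (10), the local null probabilities, and the comparisons with control in (12) are all simple averages over iterations. A sketch of the bookkeeping (the input layout and function name are ours):

```python
def posterior_null_probs(p0h_draws):
    """Rao-Blackwellized estimates from T - Tb post-burn-in MCMC draws.
    Each element of p0h_draws is [p_{01}^{(t)}, ..., p_{0,d-1}^{(t)}].
    Returns (local, control) where
      local[h-1]   estimates Pr(H_0h | Data), groups h vs. h+1;
      control[k]   estimates (12) for group k+2 vs. control;
      control[-1]  equals the global estimate (10)."""
    n = len(p0h_draws)
    d1 = len(p0h_draws[0])               # d - 1 local hypotheses
    local = [0.0] * d1
    control = [0.0] * d1
    for draw in p0h_draws:
        prod = 1.0
        for h, p in enumerate(draw):
            local[h] += p / n
            prod *= p                    # running product prod_{j<=h+1} p_{0j}
            control[h] += prod / n
    return local, control

# Toy input: two saved iterations with d - 1 = 2 local probabilities each.
local, control = posterior_null_probs([[0.9, 0.8], [0.5, 0.4]])
```

In a real run the input would hold the (T − Tb)/thin saved iterations; the toy two-draw input is only to show the shapes.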

3.3 Density Estimation

The predictive density of a future ϕ_{h,nh+1} = {μ_{h,nh+1}, Σ_{h,nh+1}} can be obtained at each iteration of the MCMC:

\pi(\phi_{h, n_h + 1}) = \sum_{l=1}^{h} \omega_{hl} \Big\{ \frac{\alpha_{l-1}}{\alpha_{l-1} + m_l} \, dG_0(\phi_{h, n_h + 1}) + \sum_{r=1}^{k_l} \frac{m_{lr}}{\alpha_{l-1} + m_l} \, \delta_{\theta_{lr}} \Big\},   (13)

where, as explained in Web Appendix B, ml and kl are the number of subjects and the number of unique means and covariances, respectively, in mixture component l, θ_{lr} = {μ_{lr}, Σ_{lr}}, and m_{lr} is the number of subjects whose mean and covariance equal θ_{lr}. This implies that the predictive density of y_{h,nh+1} is

f_{h, n_h + 1}(y) = \omega_0 \, t\!\left(y;\, v_0 - q + 1,\, \mu_0,\, \frac{v_0 (1 + \kappa)}{v_0 - q + 1} V_0 \right) + \sum_{l=1}^{h} \sum_{r=1}^{k_l} \omega_{hlr} \, N(y; \mu_{lr}, \Sigma_{lr}),   (14)

where the multivariate t-density results from integrating N(·; μ_{h,nh+1}, Σ_{h,nh+1}) over G0, and ω0 and ω_{hlr} are the respective multipliers on dG0(ϕ_{h,nh+1}) and δ_{θlr} in (13). Following T iterations, the Rao-Blackwellized estimate,

\hat{f}_{h, n_h + 1}(y) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} f_{h, n_h + 1}^{(t)}(y),   (15)

may be computed over a grid of values.

In a toxicology study, one would also be interested in the posterior density of the lowest observed adverse effects level (LOAEL). Assuming that any change in the distribution relative to group 1 is an adverse effect, we can estimate the posterior density of the LOAEL as follows:

\widehat{\mathrm{Pr}}(\mathrm{LOAEL} = h \mid \mathrm{Data}) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} \big( 1 - p_{0,h-1}^{(t)} \big) \prod_{j=1}^{h-2} p_{0j}^{(t)}, \quad h = 2, \ldots, d,
\qquad
\widehat{\mathrm{Pr}}(\mathrm{LOAEL} = d + 1 \mid \mathrm{Data}) = \frac{1}{T - T_b} \sum_{t = T_b + 1}^{T} \prod_{j=1}^{d-1} p_{0j}^{(t)},   (16)

where a LOAEL of d+1 implies that the LOAEL is greater than the largest dose considered in the study. Hans and Dunson (2005) use a similar approach to estimate the posterior density of the LOAEL under umbrella orderings in parametric models.
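The LOAEL masses in (16) telescope, so within each draw the probabilities over h = 2, . . . , d + 1 sum to one. A sketch of the computation (the input layout and function name are ours):

```python
def loael_posterior(p0h_draws):
    """Posterior mass function of the LOAEL per (16).
    p0h_draws[t] = [p_{01}^{(t)}, ..., p_{0,d-1}^{(t)}].
    Returns {h: Pr(LOAEL = h | Data)} for h = 2, ..., d+1, where
    h = d+1 means the LOAEL exceeds the largest dose studied."""
    n = len(p0h_draws)
    d = len(p0h_draws[0]) + 1
    post = {h: 0.0 for h in range(2, d + 2)}
    for draw in p0h_draws:
        prod = 1.0                        # running prod_{j=1}^{h-2} p_{0j}
        for h in range(2, d + 1):
            post[h] += (1.0 - draw[h - 2]) * prod / n
            prod *= draw[h - 2]
        post[d + 1] += prod / n           # all local nulls hold in this draw
    return post

post = loael_posterior([[0.9, 0.8], [0.5, 0.4]])  # toy two-iteration input
print(round(sum(post.values()), 6))  # 1.0 -- the masses form a pmf
```

The same telescoping argument is what guarantees (16) defines a proper posterior distribution over {2, . . . , d + 1}.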

4. Simulation Studies

4.1 Description of Data

We performed three simulation studies to evaluate the performance of our methodology. In each simulation case, we generated a vector of three responses for each subject, yhi = (yhi1, yhi2, yhi3), from a mixture of five multivariate normal distributions:

f_h(y_{hi}) = \sum_{j=1}^{5} w_{hj} \, N(y_{hi};\, \mu_j,\, \tau_j I_3),

where μj = (μj1, μj2, μj3) and whj ∝ exp(γj0 + (h − 1) γj1) for h = 1, . . . , 6 and i = 1, . . . , 30. For each mixture component, μjk = xj + k − 1 where xj is a constant.
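The data-generating mechanism above is easy to reproduce. A sketch under stated assumptions (Case 1 null-model parameter values as reported in Table 1; the function names and seed are ours):

```python
import math, random

def simulate_group(h, n, x, tau, g0, g1, rng):
    """Simulate n trivariate responses for group h from the 5-component
    normal mixture: component probabilities w_hj proportional to
    exp(g0_j + (h - 1) * g1_j), means mu_jk = x_j + k - 1 (k = 1, 2, 3),
    and covariance tau_j * I_3."""
    raw = [math.exp(g0[j] + (h - 1) * g1[j]) for j in range(5)]
    total = sum(raw)
    w = [r / total for r in raw]                 # normalized mixture weights
    data = []
    for _ in range(n):
        j = rng.choices(range(5), weights=w)[0]  # pick a mixture component
        y = [rng.gauss(x[j] + k, math.sqrt(tau[j])) for k in range(3)]
        data.append(y)
    return data

# Case 1 (null model): gamma_j1 = 0, so the weights, and hence the
# response distribution, are identical in all d = 6 groups.
x = [-0.6, -0.5, -0.4, 0.5, 1.0]
tau = [0.16, 0.25, 0.36, 0.903, 1.323]
rng = random.Random(42)
groups = [simulate_group(h, 30, x, tau, [1] * 5, [0] * 5, rng)
          for h in range(1, 7)]
```

Cases 2 and 3 follow by swapping in the γ values from Table 1, which make the weights, and thus the tail behavior or modality, vary with h.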

The parameter values used in the simulations are provided in Table 1. Cases 1 and 2 differed only in their values of the mixture probabilities; in Case 1 we let γj0 = 1 and γj1 = 0 for j = 1 . . . , 5 to simulate under a null model, while in Case 2 the mixture probabilities varied with predictor level. As seen in Figure 1, the right tails of the marginal densities of y1 become heavier with predictor level in Case 2 while, in Case 3, the modality of the distribution changes across h. The latter case may arise in toxicology experiments in which gene×dose interactions cause sub-populations to respond differently to treatment. The marginal densities of y2 and y3 follow a similar trend with dose, though their location parameters differ. Five data sets were simulated under each case.

Table 1.

Parameter values for the mixture components in simulation cases 1−3.


Case 1
Parameter   j = 1    j = 2    j = 3    j = 4    j = 5
xj          −0.6     −0.5     −0.4     0.5      1
τj          0.16     0.25     0.36     0.903    1.323
γj0         1        1        1        1        1
γj1         0        0        0        0        0

Case 2
xj          −0.6     −0.5     −0.4     0.5      1
τj          0.16     0.25     0.36     0.903    1.323
γj0         1        1        1        −3       −3
γj1         −1       −0.25    −0.25    0.25     0.75

Case 3
xj          −2       −1       0        1        2
τj          0.16     0.25     0.36     0.903    1.323
γj0         −3       −3       1        −3       −3
γj1         1        0.5      −1       0.5      1

Figure 1.

Marginal density of y1 at each predictor level in simulation cases 1−3.

4.2 Univariate Analyses

We first performed a univariate analysis on y1 in order to compare our methodology to standard frequentist methods. In each case, we assumed μ0 ∼ N(0, 100), κ−1 ∼ Ga(0.5, 50), and αh ∼ Ga(1, 1) for h = 0, . . . , d − 1 a priori and let v0 = 10 and V0 = 1. In our mixture priors for π1, . . . , π5, we let ∊* = 0.05 and p0 ∼ Be(0.725, 0.275) to assign equal prior probability to H0 and H1. We ran our MCMC for a total of 55,000 iterations, discarding the first 5,000 as a burn-in, and saved every 10th iteration to thin the chain. For the purpose of comparing our method to the frequentist approaches, we decided to reject H0 if P̂r(H0 | Data) ≤ 0.05.

Our method correctly assigned high probability to H0 in Case 1 (> 0.914 for each data set) and very low posterior probability to H0 in Cases 2 and 3 (as seen in Table 2). These results were consistent with the p-values from a k-sample Kolmogorov-Smirnov test (Kiefer, 1959), while both Jonckheere's test and the Kruskal-Wallis test failed to reject the null in 5/5 Case 3 simulations. Since stochastic ordering was violated in Case 3, it is not surprising that Jonckheere's test was insignificant. Being sensitive to location but not shape changes, the Kruskal-Wallis test could detect distribution changes in case 2 but not 3, while our approach was sensitive to changes in each case.

Table 2.

Posterior probabilities of global and local null hypotheses for the univariate analysis of y1 in simulation cases 1−3. The letters in superscript indicate that the following frequentist tests were significant at the 0.05 level: J=Jonckheere, KW=Kruskal-Wallis, KS=Kolmogorov-Smirnov, S=Shirley, D=Dunn. Since Shirley's test assumes a monotone trend in location, it was not performed when Jonckheere's test was insignificant.

Case 1
Data set   H0               H01     H02        H03         H04            H05
1          0.914            0.987   0.927      0.925       0.918          0.914
2          0.977            0.998   0.996      0.989       0.986          0.977
3          0.979            0.998   0.995      0.990       0.988          0.979
4          0.971            0.988   0.980      0.978       0.976          0.971
5          0.971            0.987   0.983      0.981       0.978          0.971

Case 2
1          0.066            0.987   0.980      0.952       0.080          0.066
2          <0.001 J,KW,KS   0.990   0.945      0.888       <0.001 S,D     <0.001 S,D,KS
3          0.001 J,KW,KS    0.990   0.944      0.075       0.004 S        0.001 S,D,KS
4          <0.001 J,KW,KS   0.994   0.959      0.253       <0.001 S,D     <0.001 S,D
5          0.002 J,KW,KS    0.963   0.914      0.843       0.202          0.002 S,D,KS

Case 3
1          <0.001 KS        0.260   0.193      <0.001 KS   <0.001 KS      <0.001 KS
2          <0.001 KS        0.799   0.027      <0.001 KS   <0.001 KS      <0.001 KS
3          <0.001 KS        0.838   <0.001 KS  <0.001      <0.001 KS      <0.001 KS
4          <0.001 KS        0.637   0.097      <0.001      <0.001 KS      <0.001
5          <0.001 KS        0.864   0.048      <0.001 KS   <0.001 KS      <0.001 KS
In addition to assessing evidence of any change across the groups, it is of interest to identify dose groups that differ from control. We compared our approach to methods commonly used for non-Gaussian toxicology data, Dunn's method (1964) and Williams' (1986) modification of Shirley's method (1977). We also compared our method to (d − 1) 2-sample Kolmogorov-Smirnov tests performed under a Bonferroni correction. As seen in Table 2, our method found clear evidence that groups 5 and 6 differed from group 1 in 3/5 and 4/5 Case 2 simulations, respectively. These results were consistent with Shirley's method, while Dunn's method found a significant difference between groups 1 and 5 in only 2/5 simulations and the Kolmogorov-Smirnov tests could only identify differences between groups 1 and 6. In Case 3, our method revealed that groups 4−6 were all clearly different from group 1 and found significant differences between groups 1 and 3 in 3/5 simulations. The Kolmogorov-Smirnov tests were more conservative and suggested group-related trends that were inconsistent with the simulated data; in two data sets a lower dose group was significantly different from group 1, while a higher dose group was not found to differ from baseline.

4.3 Multivariate Analyses

We next repeated our analyses using a multivariate approach. In addition to testing for a predictor effect on the distribution of y, in Cases 2 and 3 we also tested for an effect on the joint distribution of (y1, z1, z2) and (y1, y2, z1), where (zhi1, zhi2) ∼ N(0, I2) for h = 1, . . . , 6 and i = 1, . . . , 30, independently of yhi. The purpose of these latter tests was to evaluate the performance of our multivariate method under varying numbers of affected outcomes. The implementation was very similar to that of the univariate analyses, with μ0 ∼ N(0, 100 · I3) a priori and V0 = I3.

Figure 2 summarizes the results. As in the univariate analyses, our method gave high posterior probability to H0 in Case 1; the posterior probability was greater than 0.989 in each of the five data sets. In Case 2, the posterior probabilities of H0, H04, and H05 were small in each analysis. However, in most cases, the posterior probabilities of H01, H02, and H03 increased as we decreased the number of y's (the affected outcomes) in our model. Thus our method tends to be conservative when only one outcome is moderately affected, which is a desirable feature for a multivariate method. A similar trend was seen in the posterior probabilities of H01 and H02 in Case 3.

Figure 2.

Posterior probabilities of global and local null hypotheses for the multivariate analyses in simulation cases 1−3. Labels for y-axes denote the different outcomes used in each analysis. Data points correspond to results from data sets 1 (+), 2 (o), 3 (*), 4 (·), and 5 (×). As a point of reference for evaluating each posterior probability, we have provided a dashed line at 0.1.

4.4 Sensitivity Analyses

As recommended in Section 2.4, we performed sensitivity analyses on the choice of v0. The results are provided in Web Figures 1 and 2. The results of our univariate analyses were relatively unaffected by an increase (v0 = 20) or decrease (v0 = 5) in the degrees of freedom. In our multivariate analyses of y, increasing v0 to 20 substantially changed the posterior probabilities of H01 and H02 in 2/5 Case 3 simulations, while all other results were fairly robust. In one of these data sets, the MCMC mixed poorly, which could reflect a poor choice of V0. When v0 = 5, the prior is quite diffuse and computational instabilities arise due to difficulties inverting ill-conditioned covariance matrices in some of the MCMC samples.

Web Figure 1.

Sensitivity analysis of v0 in our univariate analysis of y1. Values on the vertical axis correspond to posterior probabilities of the different null hypotheses. Data points correspond to results from data sets 1 (+), 2 (o), 3 (*), 4 (·) and 5 (×).

Web Figure 2.

Sensitivity analysis of v0 in our multivariate analysis of y. Values on the vertical axis correspond to posterior probabilities of the different null hypotheses. Data points correspond to results from data sets 1 (+), 2 (o), 3 (*), 4 (·), and 5 (×). Mixing was poor for Case 3, data set 2 when v0 = 20.

5. Genotoxicity Example

5.1 Data and Methods

We considered data from a genotoxicity study analyzed previously by Dunson (2006), Dunson et al. (2003), and Dunson and Taylor (2005). The study assessed the effect of oxidative stress, induced by H2O2, on the frequency of DNA strand breaks using the comet assay (single-cell gel electrophoresis). Human lymphoblast cells were exposed to five different concentrations of hydrogen peroxide (0, 5, 20, 50, and 100 μM H2O2). Several surrogate measures of DNA damage were obtained for 100 cells in each group, though we focused on the two best surrogates: y1 = % tail DNA and y2 = Olive tail moment (OTM). More information on these surrogates, including diagnostics demonstrating non-normality, may be found in Duez et al. (2003) and Dunson et al. (2003).

Our implementation was identical to that described in the simulation studies, with a few exceptions. Based on the range of surrogate values reported in Duez et al. (2003), we chose β = (50, 10), V011 = 181, V022 = 10, and V012 = V021 = 38.2, where V0ij denotes element (i, j) of V0. Note that we chose a large value for the covariance because % tail DNA and OTM are likely to be highly correlated. However, since we are uncertain about how well these means and variances represent the current data, we chose a relatively diffuse prior for μ0, N(β, 6 · V0), and let v0 = 10. Also, following the guidelines in Section 2.3, we assumed p0 ∼ Be(0.7, 0.3) a priori. We ran our MCMC for 105,000 iterations; otherwise, the implementation was as in Section 4.
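The hyperparameter choices above can be assembled and sanity-checked by drawing from the priors. The sketch below (Python rather than the Matlab used for the original analysis; all numeric values are taken from the text, everything else is illustrative) simply confirms that the stated priors are centered where intended:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hyperparameters stated in the text (units: % tail DNA and OTM)
beta = np.array([50.0, 10.0])
V0 = np.array([[181.0, 38.2],
               [38.2, 10.0]])
v0 = 10                      # Wishart degrees of freedom

# Diffuse prior for the base-measure mean: mu0 ~ N(beta, 6 * V0)
mu0_draws = rng.multivariate_normal(beta, 6.0 * V0, size=5000)

# Prior on the probability of "no change" between adjacent groups: p0 ~ Be(0.7, 0.3)
p0_draws = rng.beta(0.7, 0.3, size=5000)
```

The prior draws should scatter around β = (50, 10), and the Be(0.7, 0.3) prior has mean 0.7, reflecting a mild prior preference for no change.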

5.2 Results

Our method provided strong evidence of an effect of H2O2 on the frequency of DNA strand breaks: the estimated posterior probability of H0 was less than 0.001. As demonstrated by the posterior predictive densities in Figure 3, treatment with H2O2 changes the joint distribution of % tail DNA and OTM from a unimodal distribution favoring small values to a multi-modal distribution supporting large levels of DNA damage. While genotoxic effects are evident at the smallest dose of H2O2 (the posterior probability that 5 μM is the LOAEL is > 0.999), they appear to level off at the higher doses; the outcome distribution clearly does not differ between the 50 and 100 μM groups (estimated posterior probability of no difference = 0.990), and there is little evidence that the distribution changes at doses above 5 μM (estimated posterior probability of no difference across 5−100 μM = 0.281). Figure 3 demonstrates that these densities are in close agreement with the observed data, supporting goodness of fit.

Figure 3.

Posterior predictive density of % tail DNA and OTM in each dose group. ‘·’ denotes an observation.

Although previous analyses of these data have also demonstrated a relationship between H2O2 and the frequency of DNA strand breaks, we have provided several new pieces of information. For instance, neither the latent response models fit by Dunson et al. (2003) and Dunson (2006) nor the quantile regression approach of Dunson and Taylor (2005) provided a formal method for testing for changes across dose groups or for estimating threshold doses. In genetic epidemiology studies of DNA repair, it is important to use minimal doses of the test agent to avoid high levels of cell death, which make the comet assay unreliable. Hence, our finding that 5 μM is a potential threshold level for DNA damage has important implications for future experiments. Another advantage of our method is its ability to provide smooth, nonparametric estimates of the joint density of the surrogates.

5.3 Sensitivity Analyses

As in Section 4.4, we performed a sensitivity analysis on our choice of v0. Increasing the degrees of freedom to v0 = 20 did not change the posterior probability of H0 or the LOAEL, but it did increase the probability of no change across doses 5−100 μM to 0.826. When v0 = 5, mixing was poor and the model had not converged after 105,000 iterations.

6. Discussion

We have proposed a nonparametric method which models distribution changes across an ordinal predictor and provides a formal approach for Bayesian testing of local and global changes across groups. In addition, we can perform inferences on multivariate distributions and identify thresholds. Our method provides more informative results than many frequentist k-sample tests and, as demonstrated in simulation studies, may be more sensitive to changes in distributional shape. We acknowledge, however, that these results are based on a small number of data sets, and larger simulation studies are needed to adequately characterize the sensitivity of our method.

Although our method is relatively simple to program, the computation can be intensive for large data sets, especially those with several responses and dose groups. For example, it took approximately 33 hours to complete the MCMC for our analysis of the comet assay data using a Matlab (version 7.1) program compiled in C and run on a Dell Optiplex SX270 (2.8 GHz Pentium 4 processor). Thus, a more efficient computational method would likely be necessary for massive, high-dimensional data sets. One promising approach may be to develop a variational Bayes (VB) implementation (see Blei and Jordan, 2006).

Some extensions of our method may also serve as exciting areas of future research. For example, a formal method of incorporating historical data would be useful in toxicology studies with extensive historical control databases. It may also be of interest to incorporate order restrictions into our method which ensure that the quantiles of the response are non-decreasing with dose. This extension would be related to the nonparametric approaches for stochastic ordering described by Gelfand and Kottas (2001) and Hoff (2003).

Acknowledgements

We would like to thank Jack Taylor, NIEHS, for providing the data used in our example. This research was supported in part by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences.

Web Appendix A: Proof of Theorem 1

Note that

$$\|G_{h+j+1}-G_h\|_{TV}=\max_{B\in\mathcal{B}}\left|\prod_{l=h}^{h+j}(1-\pi_l)\,G_h(B)+\sum_{l=h}^{h+j-1}\pi_l\prod_{m=l+1}^{h+j}(1-\pi_m)\,G_l^*(B)+\pi_{h+j}\,G_{h+j}^*(B)-G_h(B)\right|.$$

Under Sethuraman's (1994) stick-breaking representation of the Dirichlet process,

$$G_1(B)=\sum_{j=1}^{\infty}V_{1j}\prod_{k<j}(1-V_{1k})\,\delta_{\theta_{1k}}(B),\qquad G_l^*(B)=\sum_{j=1}^{\infty}V_{l+1,j}\prod_{k<j}(1-V_{l+1,k})\,\delta_{\theta_{l+1,k}}(B),\quad l=1,\ldots,h+j,$$

$\theta_{lk}\stackrel{iid}{\sim}G_0$, $V_{lj}\stackrel{ind}{\sim}\text{Be}(1,\alpha_{l-1})$, and $\delta_{\theta}$ denotes a point mass at $\theta$. Suppose we choose $B$ to contain $(\theta_{lk},\,k=1,\ldots,\infty)$ for $l=1,\ldots,h$ but not for $l>h$. Then, for a set $B$ satisfying these conditions, $G_h(B)=1$ and $G_l^*(B)=0$ for $l=h,h+1,\ldots,h+j$, which causes the absolute value to reduce to

$$\left|\prod_{l=h}^{h+j}(1-\pi_l)-1\right|=1-\prod_{l=h}^{h+j}(1-\pi_l)=\pi_h+(1-\pi_h)\left\{\sum_{l=h+1}^{h+j-1}\pi_l\prod_{k=l+1}^{h+j}(1-\pi_k)+\pi_{h+j}\right\}\geq\pi_h>\epsilon,$$

since $\pi_h>\epsilon$ when $\|G_{h+1}-G_h\|_{TV}>\epsilon$. Hence

$$\Pr\!\left(\|G_{h+j+1}-G_h\|_{TV}\leq\epsilon\;\middle|\;\|G_{h+1}-G_h\|_{TV}>\epsilon\right)=0.$$
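The algebraic expansion used in the proof is easy to check numerically. The sketch below (an illustrative Python check, with an arbitrary block of probabilities standing in for $\pi_h,\ldots,\pi_{h+j}$) verifies that $1-\prod_l(1-\pi_l)$ equals the expanded form and is bounded below by $\pi_h$:

```python
import numpy as np

def lhs(pi):
    # 1 - prod_{l=h}^{h+j} (1 - pi_l) over the whole block
    return 1.0 - np.prod(1.0 - pi)

def rhs(pi):
    # pi_h + (1 - pi_h) * { sum_{l=h+1}^{h+j-1} pi_l prod_{k=l+1}^{h+j}(1 - pi_k) + pi_{h+j} }
    head, tail = pi[0], pi[1:]
    s = sum(tail[i] * np.prod(1.0 - tail[i + 1:]) for i in range(len(tail) - 1))
    return head + (1.0 - head) * (s + tail[-1])

rng = np.random.default_rng(1)
pi = rng.uniform(size=6)          # stands in for pi_h, ..., pi_{h+j} with j = 5
```

Both sides agree exactly, and the lower bound `lhs(pi) >= pi[0]` is the inequality that drives the conclusion of the proof.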

Web Appendix B: Gibbs Sampling Methodology

To simplify notation, let ϕhi = {μhi, Σhi}. As in Dunson (2006), the derivation of our full conditional posterior distributions involves re-writing the model in (5) as

$$\phi_{hi}\sim\sum_{l=1}^{h}1(M_{hi}=l)\,\mathcal{G}_l,\qquad M_{hi}\sim\text{Multinomial}(\{1,\ldots,h\};\,\omega_{h1},\ldots,\omega_{hh}),\quad i=1,\ldots,n_h,$$
$$\mathcal{G}_l\sim DP(\alpha_{l-1}G_0),\quad l=1,\ldots,h, \tag{B-1}$$

where $\mathcal{G}_1=G_1$ and $\mathcal{G}_l=G_{l-1}^*$ for $l=2,\ldots,h$. As mentioned in Section 2.2, $\omega_{hl}$ is distinct for each $h$. For $l=1,\ldots,d$, let $\phi_l=\{\phi_{hi}: M_{hi}=l;\ h=l,\ldots,d,\ i=1,\ldots,n_h\}$ and $m_l=\sum_{h=l}^{d}\sum_{i=1}^{n_h}1(M_{hi}=l)$. Due to the discrete nature of the DP measure $\mathcal{G}_l$, the elements of $\phi_l$ are allocated to $k_l\leq m_l$ distinct values (or clusters) $\theta_l=(\theta_{l1},\ldots,\theta_{lk_l})$, where $\theta_{lr}=\{\mu_{lr},\Sigma_{lr}\}$, and we let $m_{lr}=\sum_{h,i:\,M_{hi}=l}1(\phi_{hi}=\theta_{lr})$.
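The ties that produce these clusters come from the almost-sure discreteness of a DP draw. A minimal sketch, assuming a standard normal base measure in place of $G_0$ and a generic precision parameter `alpha` in place of $\alpha_{l-1}$, illustrates how sampling from a truncated stick-breaking measure yields repeated values:

```python
import numpy as np

rng = np.random.default_rng(2)

def truncated_dp_draw(alpha, n_atoms=500):
    """Truncated stick-breaking draw from DP(alpha * G0) with G0 = N(0, 1).

    Returns atom locations theta and weights w; w sums to ~1 when the
    truncation level n_atoms is large enough.
    """
    V = rng.beta(1.0, alpha, size=n_atoms)                         # stick ratios
    w = V * np.concatenate(([1.0], np.cumprod(1.0 - V)[:-1]))      # stick weights
    theta = rng.normal(size=n_atoms)                               # atoms iid from G0
    return theta, w

theta, w = truncated_dp_draw(alpha=1.0)
# Sampling from the discrete measure produces ties, i.e. clusters:
draws = rng.choice(theta, size=50, p=w / w.sum())
n_clusters = np.unique(draws).size      # typically far fewer than 50
```

This is only an illustration of the discreteness; in the model the sticks for component $l$ are $V_{lj}\sim\text{Be}(1,\alpha_{l-1})$ and the atoms are drawn from the normal–Wishart base measure $G_0$.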

Let $\phi_l^{(hi)}$ be the subset of $\phi_l$ excluding subject $h,i$ ($l=1,\ldots,d$). The full conditional prior of $\phi_{hi}$ given $\phi^{(hi)}=\{\phi_1^{(hi)},\ldots,\phi_d^{(hi)}\}$ is

$$\sum_{l=1}^{h}\omega_{hl}\left\{\left(\frac{\alpha_{l-1}}{\alpha_{l-1}+m_l^{(hi)}}\right)G_0+\sum_{r=1}^{k_l^{(hi)}}\left(\frac{m_{lr}^{(hi)}}{\alpha_{l-1}+m_l^{(hi)}}\right)\delta_{\theta_{lr}^{(hi)}}\right\}, \tag{B-2}$$

where $m_l^{(hi)}$ is the number of latent $\phi_{h'i'}$ ($h',i'\neq h,i$) in $\phi_l^{(hi)}$, whose unique values are $\theta_l^{(hi)}=(\theta_{lr}^{(hi)},\,r=1,\ldots,k_l^{(hi)})$, and $m_{lr}^{(hi)}$ denotes the number of $\phi_{h'i'}$ having value $\theta_{lr}^{(hi)}$. To simplify notation, we will henceforth let

$$\omega_{hl0}^{(hi)}=\omega_{hl}\,\frac{\alpha_{l-1}}{\alpha_{l-1}+m_l^{(hi)}}$$

and $\omega_{hlr}^{(hi)}$ denote the multiplier on $\delta_{\theta_{lr}^{(hi)}}$ in (B-2). Updating the conditional prior with the data, we have the following full conditional posterior for $\phi_{hi}$:

$$\sum_{l=1}^{h}\left\{\tilde{\omega}_{hl0}^{(hi)}\,G_{0hi}+\sum_{r=1}^{k_l^{(hi)}}\tilde{\omega}_{hlr}^{(hi)}\,\delta_{\theta_{lr}^{(hi)}}\right\}, \tag{B-3}$$

where

$$\begin{aligned}
\tilde{\omega}_{hl0}^{(hi)} &= c\,\omega_{hl0}^{(hi)}\,t\!\left(y_{hi};\,v_0-q+1,\,\mu_0,\,V_0^*\right),\qquad V_0^*=\frac{v_0(1+\kappa)}{v_0-q+1}\,V_0,\\
\tilde{\omega}_{hlr}^{(hi)} &= c\,\omega_{hlr}^{(hi)}\,N\!\left(y_{hi};\,\mu_{lr}^{(hi)},\,\Sigma_{lr}^{(hi)}\right),\\
G_{0hi}(\phi_{hi}) &= W\!\left(\Sigma_{hi}^{-1};\,v_0+1,\,V_{0hi}^{-1}\right)\,N\!\left(\mu_{hi};\,\tfrac{\kappa}{\kappa+1}(y_{hi}+\kappa^{-1}\mu_0),\,\tfrac{\kappa}{\kappa+1}\Sigma_{hi}\right),\\
V_{0hi} &= v_0V_0+\frac{1}{\kappa+1}(y_{hi}-\mu_0)(y_{hi}-\mu_0)',
\end{aligned}$$

and t(·; v, μ, V) denotes the multivariate t density with v degrees of freedom, location parameter μ, and scale V.
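As an illustration of these allocation weights, the sketch below evaluates the two kinds of (unnormalized) weights for a single hypothetical subject: the marginal multivariate t density for opening a new cluster, and the normal density for joining an existing one. All numeric values, including the cluster parameters and the $\omega$ multipliers, are placeholders:

```python
import numpy as np
from scipy.stats import multivariate_normal, multivariate_t

# Illustrative values (q = 2 outcomes); every quantity below is a placeholder.
y_hi = np.array([40.0, 8.0])
mu0 = np.array([50.0, 10.0])
V0 = np.array([[181.0, 38.2], [38.2, 10.0]])
v0, kappa, q = 10, 1.0, 2

# Weight for opening a new cluster in component l (first line of the display
# above): a multivariate t with v0 - q + 1 degrees of freedom.
V0_star = v0 * (1.0 + kappa) / (v0 - q + 1) * V0
w_new = multivariate_t.pdf(y_hi, loc=mu0, shape=V0_star, df=v0 - q + 1)

# Weight for joining an existing cluster (l, r): a normal density evaluated
# at that cluster's current (placeholder) parameters.
mu_lr = np.array([45.0, 9.0])
Sigma_lr = np.array([[100.0, 20.0], [20.0, 8.0]])
w_old = multivariate_normal.pdf(y_hi, mean=mu_lr, cov=Sigma_lr)

# Multiply by the prior allocation weights and normalize (0.3, 0.7: placeholders).
w = np.array([0.3 * w_new, 0.7 * w_old])
w /= w.sum()
```

The normalizing constant $c$ in the display is handled implicitly by the final normalization step.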

Gibbs sampling may proceed by sampling directly from (B-3). However, to improve mixing, we apply an algorithm which parallels those described by Bush and MacEachern (1996) and West et al. (1994). Let $S_{hi}=(0,l)$ if $\phi_{hi}$ is allocated to a new cluster in mixture component $l$ and $S_{hi}=(r,l)$ if $\phi_{hi}=\theta_{lr}^{(hi)}$. At each iteration of our MCMC, we update the parameters in our model using the following steps:

1. Sample each $S_{hi}$ from its full conditional posterior distribution, which is multinomial with $\Pr(S_{hi}=(r,l))=\tilde{\omega}_{hlr}^{(hi)}$ for $l=1,\ldots,h$, $r=0,1,\ldots,k_l^{(hi)}$. Whenever $S_{hi}=(0,l)$, sample a new value for $\phi_{hi}$ from $G_{0hi}$ and assign subject $h,i$ to their own cluster in component $l$.

2. Given the updated values of S = {Shi, h = 1, . . . , d; i = 1, . . . , nh}, for l = 1, . . . , d and r = 1, . . . , kl sample θlr from

$$W\!\left(\Sigma_{lr}^{-1};\,v_0+m_{lr},\,V_{0lr}^{-1}\right)\,N\!\left(\mu_{lr};\,\kappa^*(\kappa^{-1}\mu_0+m_{lr}\bar{y}_{lr}),\,\kappa^*\Sigma_{lr}\right), \tag{B-4}$$

where

$$V_{0lr}=v_0V_0+\sum_{h=l}^{d}\sum_{i:\,S_{hi}=(r,l)}(y_{hi}-\bar{y}_{lr})(y_{hi}-\bar{y}_{lr})'+\frac{m_{lr}}{\kappa\,m_{lr}+1}(\mu_0-\bar{y}_{lr})(\mu_0-\bar{y}_{lr})',$$

$\kappa^*=\kappa/(1+\kappa\,m_{lr})$, and $\bar{y}_{lr}$ is the mean response vector in cluster $(l,r)$.
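Step 2 is a standard conjugate normal–Wishart update. A sketch of one such draw (Python/scipy rather than the paper's Matlab; the cluster's data are passed in as an array, and all inputs in the usage below are illustrative) might look like:

```python
import numpy as np
from scipy.stats import wishart

rng = np.random.default_rng(3)

def sample_cluster_params(y_clust, mu0, V0, v0, kappa):
    """One draw of (mu_lr, Sigma_lr) from the conjugate update in (B-4).

    y_clust: (m_lr, q) array of responses currently allocated to cluster (l, r).
    """
    m_lr, q = y_clust.shape
    ybar = y_clust.mean(axis=0)
    # Scale matrix V0lr: prior term + within-cluster scatter + shrinkage term
    S = (y_clust - ybar).T @ (y_clust - ybar)
    V0_lr = v0 * V0 + S + m_lr / (kappa * m_lr + 1.0) * np.outer(mu0 - ybar, mu0 - ybar)
    # Sigma^{-1} ~ W(v0 + m_lr, V0lr^{-1})
    Sigma_inv = wishart.rvs(df=v0 + m_lr, scale=np.linalg.inv(V0_lr), random_state=rng)
    Sigma = np.linalg.inv(Sigma_inv)
    # mu ~ N(kappa* (kappa^{-1} mu0 + m_lr ybar), kappa* Sigma)
    kappa_star = kappa / (1.0 + kappa * m_lr)
    mu = rng.multivariate_normal(kappa_star * (mu0 / kappa + m_lr * ybar),
                                 kappa_star * Sigma)
    return mu, Sigma
```

The draw returns a positive-definite covariance and a mean shrunk between the cluster average and $\mu_0$.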

The priors for our remaining model parameters are conjugate, and thus may be updated by adding the following Gibbs steps to our MCMC algorithm:

3. For h = 1, . . . , d − 1 sample:

a.) $\zeta_h$ from $\text{Bin}(1,p_{0h})$, where

$$p_{0h}=\frac{p_0\,\Phi_h(\epsilon)}{p_0\,\Phi_h(\epsilon)+(1-p_0)\{1-\Phi_h(\epsilon)\}},$$

with

$$\Phi_h(\epsilon)=\int_0^{\epsilon}\text{Be}(x;a_h,b_h)\,dx,\qquad a_h=1+\sum_{j=h+1}^{d}\sum_{i=1}^{n_j}1(M_{ji}=h+1),\qquad b_h=1+\sum_{j=h+1}^{d}\sum_{i=1}^{n_j}1(M_{ji}<h+1).$$

b.) πh from

$$\pi(\pi_h\mid\cdot)=\zeta_h\,\text{Be}(\pi_h;a_h,b_h)\,\Phi_h(\epsilon)^{-1}\,1(0\leq\pi_h\leq\epsilon)+(1-\zeta_h)\,\text{Be}(\pi_h;a_h,b_h)\,\{1-\Phi_h(\epsilon)\}^{-1}\,1(\epsilon<\pi_h\leq1), \tag{B-5}$$

where the notation a|· denotes a given all other variables.
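Steps 3a and 3b can be implemented with the regularized incomplete beta function and inverse-CDF sampling from the truncated beta in (B-5). A sketch, with $\epsilon$, $a_h$, $b_h$, and $p_0$ supplied as arguments and the usage values purely illustrative:

```python
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(4)

def sample_zeta_pi(a_h, b_h, p0, eps):
    """Gibbs draws of (zeta_h, pi_h) per steps 3a and 3b.

    zeta_h = 1 indicates pi_h <= eps ("no appreciable change" between groups
    h and h+1); pi_h is then drawn from the matching truncated Be(a_h, b_h).
    """
    Phi = beta_dist.cdf(eps, a_h, b_h)                     # Phi_h(eps)
    p0h = p0 * Phi / (p0 * Phi + (1.0 - p0) * (1.0 - Phi))
    zeta = rng.binomial(1, p0h)
    # Inverse-CDF sampling from Be(a_h, b_h) truncated to [0, eps] or (eps, 1]
    u = rng.uniform()
    if zeta == 1:
        pi_h = beta_dist.ppf(u * Phi, a_h, b_h)
    else:
        pi_h = beta_dist.ppf(Phi + u * (1.0 - Phi), a_h, b_h)
    return zeta, pi_h
```

By construction, $\pi_h$ lands in $[0,\epsilon]$ exactly when $\zeta_h=1$.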

4. Sample p0 from

$$\pi(p_0\mid\cdot)=\text{Be}\!\left(p_0;\;a_p+\sum_{h=1}^{d-1}\zeta_h,\;b_p+d-1-\sum_{h=1}^{d-1}\zeta_h\right). \tag{B-6}$$

5. For $h=0,1,\ldots,d-1$, sample a latent variable $u_h$ from $\pi(u_h\mid\alpha_h,m_{h+1})=\text{Be}(u_h;\alpha_h+1,m_{h+1})$ and then sample $\alpha_h$ from its full conditional posterior

$$\pi(\alpha_h\mid u_h,k_{h+1},m_{h+1})=p_{u_h}\,\text{Ga}\!\left(\alpha_h;\,a_{\gamma h}+k_{h+1},\,b_{\gamma h}-\log(u_h)\right)+(1-p_{u_h})\,\text{Ga}\!\left(\alpha_h;\,a_{\gamma h}+k_{h+1}-1,\,b_{\gamma h}-\log(u_h)\right), \tag{B-7}$$

where

$$\frac{p_{u_h}}{1-p_{u_h}}=\frac{a_{\gamma h}+k_{h+1}-1}{m_{h+1}\{b_{\gamma h}-\log(u_h)\}}.$$
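Step 5 is the familiar auxiliary-variable update for a DP precision parameter in the style of Escobar and West. A sketch of one sweep (illustrative values for $k_{h+1}$, $m_{h+1}$, and the gamma hyperparameters; note numpy's gamma uses a scale, so the rate is inverted):

```python
import numpy as np

rng = np.random.default_rng(5)

def sample_alpha(alpha, k, m, a_gam, b_gam):
    """Auxiliary-variable update for a DP precision parameter (step 5):
    draw the latent u, then alpha from a two-component gamma mixture."""
    u = rng.beta(alpha + 1.0, m)
    rate = b_gam - np.log(u)                       # always > b_gam since u < 1
    odds = (a_gam + k - 1.0) / (m * rate)
    p_u = odds / (1.0 + odds)
    if rng.uniform() < p_u:
        return rng.gamma(a_gam + k, 1.0 / rate)    # shape a+k, rate b - log(u)
    return rng.gamma(a_gam + k - 1.0, 1.0 / rate)  # shape a+k-1, same rate
```

Iterating the update keeps $\alpha_h$ strictly positive, as required for a DP precision parameter.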

6. Sample $\mu_0$ from $\pi(\mu_0\mid\cdot)=N(\mu_0;\beta^*,\Omega^*)$, where

$$\beta^*=\Omega^*\left(\Omega^{-1}\beta+\frac{1}{\kappa}\sum_{h=1}^{d}\sum_{r=1}^{k_h}\Sigma_{hr}^{-1}\mu_{hr}\right),\qquad \Omega^*=\left\{\Omega^{-1}+\frac{1}{\kappa}\sum_{h=1}^{d}\sum_{r=1}^{k_h}\Sigma_{hr}^{-1}\right\}^{-1}.$$

7. Sample κ−1 from

$$\pi(\kappa^{-1}\mid\cdot)=\text{Ga}\!\left(\kappa^{-1};\;a_\kappa+\frac{q}{2}\sum_{h=1}^{d}k_h,\;b_\kappa+\frac{1}{2}\sum_{h=1}^{d}\sum_{r=1}^{k_h}(\mu_{hr}-\mu_0)'\Sigma_{hr}^{-1}(\mu_{hr}-\mu_0)\right). \tag{B-8}$$
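Steps 6 and 7 are conjugate normal and gamma updates given the current cluster-specific parameters. A sketch, with all inputs illustrative (the cluster means and covariances are passed in as lists) and the gamma written in shape–rate form as in (B-8):

```python
import numpy as np

rng = np.random.default_rng(6)

def sample_mu0_kappainv(mus, Sigmas, beta, Omega, kappa,
                        a_kappa=1.0, b_kappa=1.0):
    """Gibbs draws for mu0 (step 6) and kappa^{-1} (step 7).

    mus, Sigmas: current cluster-specific means and covariances across all
    components; hyperparameter defaults are illustrative only.
    """
    q = beta.size
    Sig_invs = [np.linalg.inv(S) for S in Sigmas]
    # Step 6: mu0 | . ~ N(beta_star, Omega_star)
    Omega_star = np.linalg.inv(np.linalg.inv(Omega) + sum(Sig_invs) / kappa)
    beta_star = Omega_star @ (np.linalg.solve(Omega, beta)
                              + sum(Si @ m for Si, m in zip(Sig_invs, mus)) / kappa)
    mu0_new = rng.multivariate_normal(beta_star, Omega_star)
    # Step 7: kappa^{-1} | . ~ Ga(a + qK/2, b + quad/2); numpy gamma takes a scale
    quad = sum((m - mu0_new) @ Si @ (m - mu0_new) for m, Si in zip(mus, Sig_invs))
    K = len(mus)
    kappa_inv = rng.gamma(a_kappa + q * K / 2.0, 1.0 / (b_kappa + quad / 2.0))
    return mu0_new, kappa_inv
```

Updating $\kappa^{-1}$ with the freshly drawn $\mu_0$ is valid within a sequential Gibbs sweep.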

References

  1. Ahmad R. Multivariate k-sample problem and generalization of the Kolmogorov-Smirnov test. Annals of Statistical Mathematics. 1976;28:259–265.
  2. Antoniak CE. Mixtures of Dirichlet processes with applications to Bayesian nonparametric problems. Annals of Statistics. 1974;2:1152–1174.
  3. Basu S, Chib S. Marginal likelihood and Bayes factors for Dirichlet process mixture models. Journal of the American Statistical Association. 2003;98:224–235.
  4. Berger JO, Delampady M. Testing precise hypotheses. Statistical Science. 1987;2:317–352.
  5. Berger JO, Guglielmi A. Bayesian and conditional frequentist testing of a parametric model versus nonparametric alternatives. Journal of the American Statistical Association. 2001;96:174–184.
  6. Blei DM, Jordan MI. Variational inference for Dirichlet process mixtures. Bayesian Analysis. 2006;1:1–23.
  7. Bush CA, MacEachern SN. A semiparametric Bayesian model for randomized block designs. Biometrika. 1996;83:175–185.
  8. Carota C, Parmigiani G. On Bayes factors for nonparametric alternatives. In: Bernardo JM, Berger JO, Dawid AP, Smith AFM, editors. Bayesian Statistics 5. Oxford University Press; London: 1996. pp. 507–511.
  9. Carota C, Parmigiani G. Semiparametric regression for count data. Biometrika. 2002;89:265–281.
  10. Cifarelli D, Regazzini E. Non parametrici in condizioni di scambiabilità parziale e impiego di medie associative. Technical report. Quaderni Istituto Matematica Finanziaria; Turin, Italy: 1978.
  11. De Iorio M, Müller P, Rosner GL, MacEachern SN. An ANOVA model for dependent random measures. Journal of the American Statistical Association. 2004;99:205–215.
  12. Dietz EJ. Multivariate generalizations of Jonckheere's test for ordered alternatives. Communications in Statistics, Part A. 1989;18:3763–3783.
  13. Dietz EJ, Killeen TJ. A nonparametric multivariate test for monotone trend with pharmaceutical applications. Journal of the American Statistical Association. 1981;76:169–174.
  14. Duan JA, Guindani M, Gelfand AE. Generalized spatial Dirichlet process models. ISDS Discussion Paper 2005−23. Duke University; 2005.
  15. Duez P, Dehon G, Kumps A, Dubois J. Statistics of the comet assay: a key to discriminate between genotoxic effects. Mutagenesis. 2003;18:159–166. doi: 10.1093/mutage/18.2.159.
  16. Dunn O. Multiple comparisons using rank sums. Technometrics. 1964;6:241–252.
  17. Dunson DB. Bayesian dynamic modeling of latent trait distributions. Biostatistics. 2006;7:551–568. doi: 10.1093/biostatistics/kxj025.
  18. Dunson DB, Pillai N, Park J-H. Bayesian density regression. Journal of the Royal Statistical Society, Series B. 2007;69:163–183.
  19. Dunson DB, Taylor JA. Approximate Bayesian inference for quantiles. Journal of Nonparametric Statistics. 2005;17:385–400.
  20. Dunson DB, Watson M, Taylor JA. Bayesian latent variable models for median regression on multiple outcomes. Biometrics. 2003;59:296–304. doi: 10.1111/1541-0420.00036.
  21. Escobar MD, West M. Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association. 1995;90:578–588.
  22. Ferguson TS. A Bayesian analysis of some non-parametric problems. Annals of Statistics. 1973;1:209–230.
  23. Ferguson TS. Prior distributions on spaces of probability measures. Annals of Statistics. 1974;2:615–629.
  24. Gelfand AE, Kottas A. Nonparametric Bayesian modeling for stochastic order. Annals of the Institute of Statistical Mathematics. 2001;53:865–876.
  25. Gelfand AE, Kottas A, MacEachern SN. Bayesian nonparametric spatial modeling with Dirichlet process mixing. Technical Report AMS 2004−5, Department of Applied Math and Statistics. University of California; Santa Cruz: 2004.
  26. Gelfand AE, Smith AFM. Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association. 1990;85:398–409.
  27. Giudici P, Mezzetti M, Muliere P. Mixtures of Dirichlet process priors for variable selection in survival analysis. Journal of Statistical Planning and Inference. 2003;111:101–115.
  28. Gopalan R, Berry DA. Bayesian multiple comparisons using Dirichlet process priors. Journal of the American Statistical Association. 1998;93:1130–1139.
  29. Griffin JE, Steel MFJ. Order-based dependent Dirichlet processes. Journal of the American Statistical Association. 2006;101:179–194.
  30. Hans C, Dunson DB. Bayesian inferences on umbrella orderings. Biometrics. 2005;61:1018–1026. doi: 10.1111/j.1541-0420.2005.00373.x.
  31. Hoff PD. Bayesian methods for partial stochastic orderings. Biometrika. 2003;90:303–317.
  32. Huang P, Tilley BC, Woolson RF, Lipsitz S. Adjusting O'Brien's test to control type I error for the generalized nonparametric Behrens-Fisher problem. Biometrics. 2005;61:532–539. doi: 10.1111/j.1541-0420.2005.00322.x.
  33. Jonckheere AR. A distribution free k-sample test against ordered alternatives. Biometrika. 1954;41:133–145.
  34. Kiefer J. k-sample analogues of the Kolmogorov-Smirnov and Cramér–von Mises tests. Annals of Mathematical Statistics. 1959;30:420–447.
  35. Kleinman KP, Ibrahim JG. A semiparametric Bayesian approach to the random effects model. Biometrics. 1998;54:921–938.
  36. MacEachern SN. Dependent nonparametric processes. ASA Proceedings of the Section on Bayesian Statistical Science. American Statistical Association; Alexandria, VA: 1999.
  37. MacEachern SN. Dependent Dirichlet processes. Unpublished manuscript, Department of Statistics. The Ohio State University; 2000.
  38. Mira A, Petrone S. Bayesian hierarchical nonparametric inference for change-point problems. In: Berger JO, Bernardo JM, Dawid AP, Smith AFM, editors. Bayesian Statistics 5. Oxford University Press; London: 1996. pp. 609–620.
  39. Mukhopadhyay S, Gelfand AE. Dirichlet process mixed generalized linear models. Journal of the American Statistical Association. 1997;92:633–639.
  40. Müller P, Quintana FA. Nonparametric Bayesian data analysis. Statistical Science. 2004;19:95–110.
  41. Müller P, Quintana FA, Rosner G. A method for combining inference across related nonparametric Bayesian models. Journal of the Royal Statistical Society, Series B. 2004;66:735–749.
  42. Nickerson RS. Null hypothesis significance testing: a review of an old and continuing controversy. Psychological Methods. 2000;5:241–301. doi: 10.1037/1082-989x.5.2.241.
  43. O'Brien PC. Procedures for comparing samples with multiple endpoints. Biometrics. 1984;40:1079–1087.
  44. Pennell ML, Dunson DB. Bayesian semiparametric dynamic frailty models for multiple event time data. Biometrics. 2006;62:1044–1052. doi: 10.1111/j.1541-0420.2006.00571.x.
  45. Sethuraman J. A constructive definition of Dirichlet priors. Statistica Sinica. 1994;4:639–650.
  46. Shirley E. A non-parametric equivalent of Williams' test for contrasting increasing dose levels of a treatment. Biometrics. 1977;33:386–389.
  47. Sorenson HW, Alspach DL. Recursive Bayesian estimation using Gaussian sums. Automatica. 1971;7:465–479.
  48. Teh YW, Jordan MI, Beal MJ, Blei DM. Hierarchical Dirichlet processes. Journal of the American Statistical Association. 2006;101:1566–1581.
  49. Terpstra TJ. The asymptotic normality and consistency of Kendall's test against trend, when ties are present in one ranking. Indagationes Mathematicae. 1952;14:327–333.
  50. Tomlinson G, Escobar M. Analysis of densities. Technical report. University of Toronto; 1999.
  51. Verdinelli I, Wasserman L. Bayesian goodness of fit testing using infinite dimensional exponential families. Annals of Statistics. 1998;26:1215–1241.
  52. West M, Müller P, Escobar MD. Hierarchical priors and mixture models with application in regression and density estimation. In: Smith A, Freeman P, editors. Aspects of Uncertainty: A Tribute to D. V. Lindley. Wiley; New York: 1994. pp. 363–386.
  53. Westfall PH, Johnson WO, Utts JM. A Bayesian perspective on the Bonferroni adjustment. Biometrika. 1997;84:419–427.
  54. Williams D. A note on Shirley's nonparametric test for comparing several dose levels with a zero-dose control. Biometrics. 1986;42:183–186.
