Skip to main content
Entropy logoLink to Entropy
. 2022 Oct 31;24(11):1579. doi: 10.3390/e24111579

Model Checking with Right Censored Data Using Relative Belief Ratio

Luai Al-Labadi 1,*, Ayman Alzaatreh 2, Mark Asuncion 1
Editor: Carlos Alberto De Bragança Pereira
PMCID: PMC9689107  PMID: 36359670

Abstract

Model checking is a topic of special interest in statistics. When data are censored, the problem becomes more difficult. This paper employs the relative belief ratio and the beta-Stacy process to develop a method for model checking in the presence of right-censored data. The proposed method for the given model of interest compares the concentration of the posterior distribution to the concentration of the prior distribution using a relative belief ratio. We propose a computational algorithm for the method and then illustrate the method through several data analysis examples.

Keywords: beta-Stacy process, model checking, nonparametric Bayesian statistics, relative belief inferences, right-censored data

1. Introduction

An important problem in statistics is to check whether a chosen statistical model is in agreement with observed data. This problem is known as model checking. The tests used to describe how well a statistical model fits a set of observations are goodness-of-fit tests. If it is determined that the observed data do not contradict the model, the true values of the parameters can be inferred. If the model fails to pass its checks, there is cause for concern about the accuracy of the inferences used in analysis. Thus, checking a proposed model on the basis of observed data is a critical component of statistical inference.

The time until an event occurs is the variable of interest in survival analysis. It is common to refer to this time as the survival time. In this case, the observed data are typically right censored. That is, at the right side, the exact survival time can become incomplete, for example, when a subject leaves the study before an event occurs, or when the study ends before the event occurs, right censoring occurs. This type of data are common in medical studies where the variable of interest is the time to death or time to relapse of a disease, in reliability studies where the variable of interest is the time until a machine part fails, and in social sciences where the variable of interest is the lifetime of the elderly in certain social programs [1]. While there are several well-known methods for model checking, the approach we want to take is Bayesian in nature. There is interest in developing nonparametric Bayesian procedures for hypothesis testing. The majority of this work focuses on uncensored data. The proposed model was embedded as a null hypothesis into a larger family of distributions, which is the main approach of model testing. Then priors were placed on the null and the alternative, and a Bayes factor was computed (see, for example, [2,3,4,5,6]). A second important approach for model testing is to apply a prior to the true distribution generating the data, and then measure the distance between the posterior distribution and the hypothesized distribution (see, for example, [7,8,9]). A third and more recent approach combines the second approach with relative belief ratio ([10]) (for instance, see the work of [11,12,13]).

When data are censored, goodness-of-fit tests become a challenging problem. This explains why there are few and scattered studies on goodness-of-fit tests for right-censored data. In particular, Bayesian goodness-of-fit tests are rarely covered. Some exceptions include the work of [14,15,16] considered the two sample problem by computing the Bayesian factors and beta process priors ([17]). Al-Labadi and Zarepour [15] used the beta-Stacy process and the Kolmogorov–Smirnov distance to test simple hypotheses. Chen and Hanson [16] used Polya tree priors and computed the Bayesian factor. To compute the posteriors, the MCMC procedure was used. Permutation tests were used to compute p-values. These approaches are computationally intensive.

The main problem addressed in this paper is model checking for lifetime distribution functions. A general description of the approach developed in this paper is provided along with a discussion of the benefits that the approach brings to the problem of model checking. The beta-Stacy process BSP(k(t),F0(t)), t0 is considered as a prior on on the space of cumulative distribution functions ([18]). Here, k(·)>0 is a function defined on R+, and F0 is a cumulative distribution function (cdf) on R+=[0,) and it represents the prior guess for the beta-Stacy process. How closely realizations from the process are clustered around F0 is determined by the value of k(·). The use of the beta-Stacy process is justified because, unlike the Dirichlet process, it is conjugate to both exact and right-censored observations ([19]). The proposed method then compares the posterior and prior distribution concentrations about the model of interest. If the posterior is more concentrated on the model than the prior, this is evidence in favor of the model; if the posterior is less concentrated, this is evidence against the model. This comparison is produced using a relative belief ratio that measures the evidence in the observed data for or against the model and provides a measure of the strength of this evidence; thus, the methodology is based on a direct measure of statistical evidence. The approach is fairly simple to implement and does not necessitate obtaining a closed form of the relative belief ratio. Theoretical results are developed to support appropriate hyperparameter selections k(·) and F0.

The rest of the paper is structured as follows. Section 2 provides the background on the definitions and generic properties of the beta-Stacy process, the relative belief ratio, and the Cramér–von Mises distance between probability measures. Section 3 discusses the proposed methodology and the selection of relevant values for the beta-Stacy process’s hyperparameters. Section 4 develops a computational algorithm for implementing the approach. Section 5 examines the performance of the methodology through a number of examples. This method works well in general and can be used for both censored and uncensored observations. Section 6 wraps up the paper with a summary of the findings. Lastly, datasets and figures are provided in the Appendix A.

2. Notations and Background

Let the random sample X1,,Xn be drawn from an unknown cdf F defined on R+. Let (T,δ)=(T1,δ1),,(Tn, δn) be observed, where Ti=min(Xi,Ci), δi=IXiCi and C1,,Cn are the censoring times. When δi=1, Xi is observed, and when δi=0, Xi is right censored, this type of data can occur, for instance, when Xi is the lifetime of a patient who enrols in a study of a certain disease. If death occurs from the disease before the study ends, Xi is recorded; however, if the patient is still alive, withdraws, or dies for another reason at the end of the study, Ci records the end of the period during which the patient was under observation ([20]).

In survival analysis, cdf F is frequently referred to as the lifetime distribution function. We use the same notation throughout this paper for the probability measure and its corresponding cumulative distribution function, i.e., F(t)=F((,t]).

2.1. Beta-Stacy Process

Walker and Muliere [18] introduced the beta-Stacy process, which is a nonparametric prior that is widely used in survival analysis. In this section, a relevant introduction about the beta-Stacy process is provided.

The beta-Stacy process is defined by another process known as the log-beta process ([18]). Let β(·) be a positive function defined on R+, and α be a measure concentrated on R+ that is absolutely continuous with respect to the Lebesgue measure such that 0β(z)1α(dz)=. On the basis of [18], Z(t)t0 is a log-beta process with parameters α and β, written as Z(t)logBSP(α(t),β(t)), if Z(t)t0 is a Lévy process with Lévy measure defined, for s>0, by

Lt(ds)=11es0tesβ(z)α(dz)ds

and has the following moment generating function:

logEeuZc(t)=0eus1Lt(ds),t0,uR.

In order to define the beta-Stacy process, let positive function k(·) be defined on R+ and the absolutely continuous cdf F0 be defined on R+. We call F(t)t0 a beta-Stacy process with parameters k(·) and F0, written as F(t)BSP(a(t),F0(t)), if F(t)=1eZ(t), and Z(t)t0 is a log-beta process with parameters

α(dz)=k(z)F0(dz)andβ(z)=k(z)F0([z,)).

Walker and Muliere [18] showed that

EF(t)=F0(t)=1exp0tβ(z)1α(dz), (1)

rendering F0 the prior guess. The expression in (1) explains the need for assumption 0β(z)1α(dz)=. The beta-Stacy process includes various neutral-to-right processes proposed in the literature. For instance, the beta-Stacy process reduces to the Dirichlet process when k(t)=k>0 for all t0. The beta-Stacy process also reduces to a simple homogenous process ([21]) when β(·) is constant.

Next, we describe the posterior distributions of Z(t)t0 and F(t)t0. Let random sample X1,,Xn be drawn from F, and (T,δ)=(T1,δ1),,(Tn,δn) be observed, where Ti=min(Xi,Ci), δi=IXiCi and C1,,Cn are censoring times. Define the counting process N by

N(t)=i=1nITitandδi=1

and the (left-continuous) at-risk process Y by

Y(t)=i=1nITit,

where I is the indicator function. In particular, let N{t}=N(t)N(t) (i.e., N{t} is the number of observed Xi’s at the exact position t). Let Z(t)t0logBSP(α(t),β(t)) and F(t)=1expZ(t). Given the data (T,δ), the posterior distribution of Z is a log-beta process ([18]) with parameters

α*(t)=α(t)+N(t) (2)

and

β*(t)=β(t)+Y(t)N{t}. (3)

The posterior Lévy measure for Z(t) is given by

Lt(ds)=11es0tesβ(z)+Y(z)dα(z)ds.

There are fixed points of discontinuity in the posterior process. These extra points appear at the exact (uncensored) observations. If ti is an exact observation with corresponding jump Si, then

1exp(Si)betaN{ti},β(ti)+Y(ti)N{ti}. (4)

If N{ti}=1, then the random jump Si follows an exponential density with mean β(ti)+Y(ti)N{ti}1.

Let F(t)BSPk(t),F0(t), t0. Given the data (T,δ), the posterior distribution of F is a beta-Stacy process with parameters

k*(t)=β*(t)F0*[t,)=β(t)+Y(t)N{t}F0*[t,) (5)

and

F0*(t)=1[0,t]1dα*(z)β*(z)+α*{z}=1[0,t]1k(z)dF0(z)+dN(z)k(z)F0[z,)+Y(z), (6)

with α*(s) and β*(s) are defined in (2) and (3), and ∏ stands for the product integral. Note that, as k(·) tends to zero, the nonparametric Kaplan–Meier estimator of the distribution function is obtained. On the other hand, F* becomes the prior guess F0 as k(·) grows large. Thus, parameter k(t) can be viewed as the concentration parameter. The posterior consistency of the beta-Stacy process was addressed by Kim and Lee [22].

Algorithms A and B in Appendix A can be used to sample from prior and posterior beta-Stacy processes. These algorithms were developed by Al-Labadi and Zarepour [15] (see also Lee and Kim [23]).

2.2. Relative Belief Ratio

Evans’ (2015) relative belief ratio has become a widely used tool in statistical hypothesis testing theory. Assume that we have a statistical model defined by the pdf {fθ:θΘ} with respect to the Lebesgue measure on the parameter space Θ. Let π(θ) to be a prior on Θ. After observing the data (T,δ), the posterior distribution of θ is given by the conditional density function

πθ|(T,δ)=fθ(T,δ)π(θ)Θfθ(T,δ)π(θ)dθ,

Assume that we are interested in inferring regarding the parameter θ. Let π and π(·|(T,δ)) be continuous at θ. Then, the relative belief ratio for a hypothesized value θ0 of θ is given by

RBΘ(θ0|(T,δ))=π(θ0|(T,δ))/π(θ0), (7)

The posterior density to the prior density ratio at θ0, that is, RB(θ0|(T,δ)), measures how beliefs about θ0 changed from a priori to a posteriori. When π and π(·|(T,δ)) are discrete, the relative belief ratio is defined through limits. For more information, see Evans (2015).

Quantity RBΘ(θ0|(T,δ)) is a measure of evidence that θ0 is the true value. If RBΘ(θ0|(T,δ))>1, then the probability of θ0 being the true value increases from a priori to a posteriori; thus, there is evidence based on the data that θ0 is the true value, and there is, hence, evidence in favor of θ0. If RBΘ(θ0|(T,δ))<1; then, the probability of θ0 being the true value decreases from a priori to a posteriori. As a result, the data provide evidence that θ0 is not the true value. Case RBΘ(θ0|(T,δ))=1 implies there is no evidence in either direction.

The ability to calibrate relative belief ratios is an appealing feature that renders it desirable in hypothesis testing problems. After calculating the relative belief ratio, it is critical to determine whether the result represents strong or weak evidence for or against H0:θ=θ0. A typical calibration RB(θ0|(T,δ)) is obtained by computing the tail probability (Evans, 2015)

StrΘ(θ0|(T,δ))=Π(RBΘ(θ|(T,δ))RBΘ(θ0|(T,δ))|(T,δ)), (8)

where Π(·|(T,δ)) is the posterior distribution of the posterior density π(·|(T,δ)). The posterior probability that the true value of θ has a relative belief ratio no greater than that of the hypothesized value θ0 is represented by Equation (8). When RBΘ(θ0|(T,δ))<1, there is evidence against θ0, and a small value for StrΘ(θ0|(T,δ)) represents a large posterior probability that the true value has a relative belief ratio greater than RBΘ(θ0|(T,δ)) and hence there is strong evidence against θ0. Similarly when RBΘ(θ0|(T,δ))>1, indicates a strong evidence in favour of θ0, while a small value of StrΘ(θ0|(T,δ)) indicates weak evidence in favour of θ0. Figure 1 illustrates the strength of the evidence for both cases; RBΘ(θ0|(T,δ))<1 and RBΘ(θ0|(T,δ))>1.

Figure 1.

Figure 1

Strength of evidence. The shaded area represents StrΘ(θ0|(T,δ)). RB represents the relative belief ratio RBΘ(θ0|(T,δ)). The smaller the shaded area, the stronger the evidence for case (a). The larger the shaded area, the stronger the evidence for case (b).

2.3. Cramér–Von Mises Distance

Let F and G be two cdfs; Cramér–von Mises distance between F and G is defined as

d(F,G)=F(x)G(x)2G(dx).

Other distances can be used (see Gibbs and Su (2002)), but d has some computational advantages. The formula for the distance between the beta-Stacy process and a continuous cdf is provided in the following result.

Lemma 1. 

Let F0 be a continuous cdf and Fϵ=1expZϵ(t), where Zϵ(t)=i=1MJiδθi0,t, where t0 with θ1,,θMi.i.d.F0. Let θ(1)θ(M) denote the order statistics of θ1,,θM and J1,,JM be the associated jump sizes such that Ji=Jj when θi=θ(j). Then

d(Fϵ,F0)=13i=0MFϵ(θ(i))(F0(θ(i+1)))2(F0(θ(i)))2+i=0M(Fϵ(θ(i)))2F0(θ(i+1))F0(θ(i)).

where Fϵ(θ(i))=1expk=1iJk.

Proof. 

Note that

Fϵ(x)=0ifx<θ(1)Fϵ(θ(i))ifθ(i)x<θ(i+1)(i=1,,M1).1ifxθ(M)

Le θ(0)=0 and θ(M+1)=+. Then,

d(Fϵ,F0)=θ(0)θ(ϵ+1)Fϵ(x)F0(x)2f0(x)dx=i=0Mθ(i)θ(i+1)Fϵ(θ(i))F0(x)2f0(x)dx.

Substituting y=F0(x) and U(i)=F0(θ(i)) gives

d(Fϵ,F0)=i=0MU(i)U(i+1)Fϵ(θ(i))y2dy=13i=0M[Fϵ(θ(i))U(i)]3[Fϵ(θ(i))U(i+1)]3.=13i=0MU(i+1)3U(i)3i=0MFϵ(θ(i))U(i+1)2U(i)2+i=0MFϵ2(θ(i))U(i+1)U(i)=13i=0MFϵ(θ(i))U(i+1)2U(i)2+i=0M((Fϵ)(θ(i))2U(i+1)U(i)=13i=0MFϵ(θ(i))(F0(θ(i+1)))2(F0(θ(i)))2+i=0M(Fϵθ(i))2F0(θ(i+1))F0(θ(i)).

When considering the prior and posterior distributions of the Cramér–von Mises distance, the following lemma allows for the use of the approximation to BSP(k(·),F0).

Lemma 2. 

If FBSP(k(·),F0) and Fϵ is given by Algorithm A, then dFϵ,F0a.s.dF,F0 as ϵ0.

Proof. 

By Kim and Lee (2001), Fϵ(x)a.s.F(x), where a.s. stands for almost surely convergence. As (Fϵ(x)F0(x))2f0(x), where f0(x) is integrable, the proof follows from the dominated convergence theorem. □

3. Model Checking Using the Relative Belief

Consider the statistical model Fθ:θΘ of continuous cdf on R+. Let X1,, Xn be a random sample drawn from an unknown cdf F defined on R+. Assume that (T,δ)=((T1,δ1),,(Tn,δn)) is observed, where Ti=min(Xi,Ci), δi=IXiCi and C1,,Cn are censoring times. The aim is to test the hypothesis H0:FFθ:θΘ. Let BSP(k(·),F0) be the prior on F for some choice of k(·) and F0. Then F(t)|(T,δ)BSPk*(t),F0*(t), where k(·) and F0 are defined in (5) and (6), respectively. If H0 is true, then the posterior distribution of the distance between F and the proposed model should be more concentrated around 0 than the prior distribution. As a result, this test involves comparing the concentrations of the prior and the posterior distributions of d (see Lemma 1) about 0 using the relative belief ratio with the interpretation as discussed in Section 2.2.

To perform this test, we must measure the distance and then set relevant values for k(·) and F0. To calculate the distance, similar to Al-Labadi and Evans [24], we computed d(F,Fθ^), where θ^ is the relative belief estimate of θ, which is always the same as the maximal likelihood estimate (MLE) for the full model parameter. In terms of hyperparameters, we set F0=Fθ^ and k(t)=k for all t. There are numerous advantages in setting F0=Fθ^. First, it avoids prior-data conflict, a possible contradiction between the data and the prior. This typically happens when the prior places its mass in a region of the parameter space where the data suggest the true value does not lie ([25,26]). In the context of the approach considered in this paper, prior-data conflict arises whenever there is a small overlap between the effective support regions of F and Fθ^. Note that, by Lemma 1, the distance d(F,Fθ^) depends on the prior guess F0 through the jump points θi. If the θi lay in one tail of Fθ^, then we get prior-data conflict between F and Fθ^ because F0 and F had the same effective support. To avoid this, it is required that θi are selected in a region that includes most of the mass of Fθ^. When F0=Fθ^, then Fθ^ is the prior mean of F; thus, both share the same effective support, which renders it a reasonable choice to avoid prior-data conflict. We refer the reader to Example 1 of Al Labadi and Evans (2018) for an interesting discussion about prior-data conflict. Nevertheless, the choice of F0=Fθ^ should also avoid any impacts due to the “double use of the data”. This means that the approach becomes conservative in detecting the model failure when H0 is false. Although setting F0=Fθ^ appears to induce a data-dependent prior distribution for d, the following lemma implies that this is not the case; thus, the approach is prior distribution-free with this choice.

Lemma 3. 

If FBSPk(·),Fθ^, then the distribution of dF,Fθ^ does not depend on Fθ^.

Proof. 

Using Lemma 1, since (θi)1iM is a sequence of i.i.d. random variables with continuous distribution Fθ^, for i1, we have Ui=dFθ^(θi), where Ui1iM is a sequence of i.i.d. random variables follow a uniform distribution on [0,1]. Thus,

d(Fϵ,Fθ^)=d13i=0MFϵ(θ(i))(U(i+1))2(θ(i))2+i=0M(Fϵ(θ(i)))2U(i+1)U(i),

where U(i) is the i-th order statistic for Ui1iM i.i.d. uniform[0,1]. Now, as ϵ0, by Lemma 2, we conclude that the distribution of d(F,Fθ^) does not depend on Fθ^. □

The following results shows that setting F0=Fθ^ prevents any effect due to the double use of the data. Specifically, as the sample size increases, the posterior distribution of dF,Fθ^ becomes concentrated around 0 if and only if H0 is true. For the proof, see Al-Labadi and Evans (2018).

Lemma 4. 

Let (T,δ)=((T1,δ1),,(Tn,δn))F, where FBSPk(·),Fθ^. Suppose that θ^a.s.θ0,supy|Fθ^(y)Fθ0(y)|a.s.0 as n.

  • (i) 

    If H0 is true, then, as n, dF|(T,δ),Fθ^a.s.0.

  • (ii) 

    If H0 is false, then, as n, liminfd(F|(T,δ),Fθ^)>a.s.0.

Now, concerning the choice of k(·), we considered k(t)=k for all t>0. In general, larger values of k must be chosen to identify smaller deviations. Consequently, it is possible to consider multiple values of k. One way to perform that is, for instance, to start with k=1. If a larger (smaller) value of k renders the relative belief ratios to be below (above) 1, H0 is rejected (accepted). As Section 5 shows, when the null hypothesis is correct (not correct), the relative belief ratio always remains above (below) 1 when larger (smaller) values of k are considered. When using the Dirichlet process, Al-Labadi and Zarepour [27] advised using k0.5n for complete data to avoid the prior becoming too influential. Setting k between 1 and 10 is satisfactory for most purposes. As indicated in the introduction, when k(t)=k, the beta-Stacy process turns into the Dirichlet process. However, in the presence of right censored data, the posterior distribution of the Dirichlet process becomes beta-Stacy process ([19]). This justifies the necessity of using the beta-Stacy process in the approach.

4. Computational Algorithm

Closed forms of the prior and posterior densities of D=d(F,Fθ^) are required to compute the relative belief ratio as in (7). This is not usually available. As a result, the relative belief ratio must be approximated through simulation. A particular problem of computing (7) arises here when both πD(0) and πD|(T,δ)(0) are close to 0, where πD(·) and πD|(T,δ)(·) denote the pdf’s of D and D|(T,δ), respectively. In such a case, determining RBD(0|(T,δ)) is difficult. The formal definition of the relative belief ratio, as discussed in Section 2.2, is as a limit that can be approximated at zero by

RB^D(0|T,δ)=ΠD([0,dc))ΠD|(T,δ)([0,dc)), (9)

for a suitably small value dc, where ΠD(·) and ΠD|(T,δ)(·) denote the cdfs of D and D|(T,δ), respectively. From Equation (8), the strength of the evidence based on the relative belief ratio RBD(0|T,δ) can be computed using

StrD(0|(T,δ))=ΠD|(T,δ)(RBD(d|(T,δ))RBD(0|(T,δ))). (10)

Appendix B provides computational Algorithm C for assessing H0 on the basis of estimates of (9) and (10). A similar algorithm for complete data based on the Dirichlet process was developed by Al-Labadi and Evans [24].

5. Examples

The approach is demonstrated by two main examples in this section. Throughout this section, let Exp (λ), Weibull (k,λ), and Lognorma l (μ,σ) denote the exponential distribution with mean 1/λ, the Weibull distribution with shape parameter k and scale parameter λ, and the log-normal distribution with mean μ and standard deviation σ, respectively. In all examples, the sensitivity to choosing k is investigated. We set ϵ=0.01, i0=1, M0=20, and r1=r2=1000 in Algorithms A, B, and C, though other values are also possible. R package parmsurvfit was used to compute the MLE of the distribution parameters.

Example 1. 

A real dataset from Lee and Wang (2003) based on the remission times (in months) of cancer patients. The dataset is given in Appendix C. We tested hypothesis

H0:

F, given in Table 1, is the underlying distribution of the observed data,

where F could be a family of distributions (composite hypothesis) or a specific distribution with known parameters (simple hypothesis). Various values of k were considered to investigate the approach’s sensitivity to concentration parameter selection. The p-value of the (frequentist) log-rank test was also computed for comparison purposes. Table 1 summarizes the findings. When H0 is true, we want RB>1 and a strength close to 1, and when H0 is false, we want RB<1 and a strength close to 0. According to Table 1, the proposed test performed well in this example.

Table 1.

Model checking of Example 1.

F F0 k RB(Str) p-Value
Exp(λ) Exp(λ^=0.0962) 1 20(1) 0.6521
5 19.84(1)
10 15.98(1)
Weibull(k,λ) Weibull(k^=1.0536,λ^=10.1901) 1 20(1) 0.9986
5 19.76(1)
10 17.58(1)
Lognormal(μ,σ) Lognormal(μ^=1.8198,σ^=1.0912) 1 20(1) 0.7550
5 18.48(1)
10 15.52 (1)
Exp(λ=2) Exp(λ=2) 1 0(0) 0
5 0(0)
10 0(0)
Weibull(2,0.5) Weibull(2,0.5) 1 0(0) 0
5 0(0)
10 0(0)
Lognormal(3,1) Lognormal(2,1) 1 0(0) 0
5 0(0)
10 0(0)

Example 2. 

Simulated data. The primary purpose of this dataset is to investigate how the proposed test performs as the sample size increases. We considered data (T1,δ1),,(Tn,δn), where Ti=min(Xi,Ci) where the survival times (Xi)1in were generated from Lognormal(1,4), while the censored time (Ci)1in are generated from Lognormal (4,1).

H0:

F in Table 2 is the underlying distribution of the observed data.

Table 2 summarizes the results that show that the selected models were accepted. Figure A1, Figure A2 and Figure A3 (see Appendix D) give the plots of F0=Fθ^ and 5 sample paths each for the prior beta-Stacy process and the posterior beta-Stacy process for each case in Table 2. These figures clearly show that the plots of the sample paths for the posterior process moved toward the plot of F0. This supports the previous conclusion regarding the null hypothesis. Furthermore, in this case, the p-values of the (frequentist) log-rank test support the conclusion that the null hypothesis should not be rejected.

Table 2.

Model checking of Example 2.

F n F0 c RB(Str) p-Value
Exp(λ) 20 Exp(λ^=0.03) 1 3.4(0.83) 0.9995
5 0.76(0.08)
10 0.36(0)
50 Exp(λ^=0.03) 1 0.24(0.02) 0.9977
5 0.02(0)
10 0(0)
100 Exp(λ^=0.05) 1 0.2(0.02) 0.9992
5 0(0)
10 0(0)
Weibull(k,λ) 20        Weibull(k^=0.38,λ^=20.88)         1 18.12(1) 1
5 9.78 (1)
10 5.22(0.739)
50     Weibull(k^=0.36,λ^=19.77)     1 19.88(1) 0.9992
5 15.98(1)
10 11.12(1)
100 Weibull(k^=0.38,λ^=14.10) 1 20(1) 0.9991
5 19.48(1)
10 16.46(1)
Lognormal(μ,σ) 20 Lognormal(μ^=1.14,σ^=2.40) 1 11.68(0.416) 0.9867
5 3.52(1)
10 1.88(1)
50 Lognormal(μ^=1.63,σ^=3.64) 1 20(1) 0.9454
5 16.4(1)
10 12.4(1)
100 Lognormal(μ^=1.34,σ^=3.44) 1 20(1) 0.9373
5 19.36 (1)
10 16.12(1)

6. Concluding Remarks

The beta-Stacy process and relative belief ratio were used to propose a general approach for model checking. This method could be used for both complete and right-censored data. It could also be used to test composite or simple hypotheses. Several examples demonstrated that the approach works very well.

Though the Cramér–von Mises distance was used here, other distance measures such as the Kolmogorov–Smirnov and Anderson–Darling distances are viable alternatives. Testing for families of multivariate distributions is an important extension of the approach presented in this paper. While conceptually similar, computational and inferential issues must be addressed. This problem can be addressed in future work.

Acknowledgments

The authors are gratefully acknowledge that the work in this paper was partly supported by Faculty Research grant FRG21-S-S03 and the Open Access Program from the American University of Sharjah.

Appendix A. Simulation Algorithms

Algorithm A: Simulation algorithm to approximate the prior of beta-Stacy process.

  1. Generate log-beta process with parameters α(dz)=k(z)F0(dz) and β(z)=k(z)F0[z,). The following steps are required to accomplish this:

    • (a)

      Fix a small positive number ϵ.

    • (b)

      Generate the total number of jumps MPoissonλϵ, where λϵ=α([0,))/ϵ.

    • (c)

      For i=1,M, generate the jump times θi from the probability density function α(dz)/α([0,)).

    • (d)

      Generate the jump sizes J1,,JM such that 1exp(Ji)|θiBeta(ϵ,β(θi)).

    • (e)
      Set
      Zϵ(t)=k=1MJkI(θkt). (A1)
  2. The approximate prior beta-Stacy process is Fϵ(t)=1expZϵ(t).

Algorithm B: Simulation algorithm to approximate the posterior of beta-Stacy process.

  1. Generate the posterior log-beta process. The following steps are required to accomplish this:

    • (a)

      Generate the process Zϵ*(t)t0 on the basis of Algorithm A with β(θi) is replaced by β(θi)+Y(θi).

    • (b)

      Generate the process Zf*(t)=i=1lSi*ITit, where the random jumps (Si*)1il are associated with the fixed points of discontinuity from the distribution given in (4), where ln. These random jumps occur at points where the data are not right-censored.

    • (c)
      The approximated posterior log-beta process is given by:
      Zϵ*(t)=Zϵ(t)|(T,δ)=Zϵ*(t)+Zf*(t)=i=1MJi*I(θit)+i=1lSi*Itit.
  2. The approximate posterior beta-Stacy is Fϵ*(t)=Fϵ(t)|(T,δ)=1expZϵ*(t).

Appendix B. Algorithm for Model Checking

Algorithm C (Relative belief algorithm for model checking):

(i) Generate a sample from Fϵ, where Fϵ is an approximation of FBSP(k(·),Fθ^). See Algorithm A.

(ii) Compute d(pri)=d(Fϵ,Fθ^) as described in Lemma 2.

(iii) Repeat Steps (i) and (ii) to obtain a sample of r1 values from the prior of D.

(iv) Generate a sample from Fϵ|(T,δ), where Fϵ|(T,δ) an approximation of F. See Algorithm B.

(v) Compute d(pos)=dFϵ|(T,δ),Fθ^ as described in Lemma 2.

(vi) Repeat Steps (iv) and (v) to obtain a sample of r2 values from the posterior of D.

(vii) For a fixed positive number M0, let F^D denote the empirical cdf of D based on the prior sample in (iii) and for i=0,,M0, let d^i/M0(pri) be the estimate of di/M0(pri), the (i/M0)-th prior quantile of D. Here d^0(pri)=0, and d^1(pri) is the largest value of d(pri). Let F^D(·|(T,δ)) denote the empirical cdf of D based on the posterior sample of d(pos) in (vi). For d[d^i/M0(pri),d^(i+1)/M0(pri)), estimate RBD(d|(T,δ)) by the ratio of the estimates of the posterior and prior contents of [d^i/M0(pri),d^(i+1)/M0(pri)). Specifically,

RB^D(d|(T,δ))=M0{F^D(d^(i+1)/M0(pri)|(T,δ))F^D(d^i/M0(pri)|(T,δ))}, (A2)

Moreover, estimate (9) by RB^D(0|(T,δ))=M0F^D(d^i0/M0(pri)|(T,δ)) where i0 is chosen so that i0/M0 is not too small (typically i0/M00.05).

(viii) Estimate (10) by the finite sum

A(F^D(d^(i+1)/M0(pri)|(T,δ))F^D(d^i/M(pri)|(T,δ))), (A3)

where A={ii0:RB^D(d^i/M0(pri)|(T,δ))RB^D(0|(T,δ))}.

By Al-Labadi and Evans (2018), for fixed M0, as r1,r2, (A2) converge a.s. to RBD(d|(T,δ)) and (A3) converge a.s. to DPD(RBD(d|(T,δ))RBD(0|(T,δ))|(T,δ)).

Appendix C. Example 1 Dataset

Table A1.

Remission times (months) of 137 cancer patients of the dataset used in Example 1. + denotes a right-censored observation.

 4.5   19.13   14.24  7.87 5.49 2.02 9.22 3.82  26.31   4.65+ 
2.62 0.90 21.73 0.87+ 0.51 3.36 43.01 0.81 3.36 1.46
24.80+ 10.86+ 17.14 15.96 22.69 4.33 7.28 2.46 3.48 4.23
6.54 8.65 5.41 2.23 4.34 32.15 4.87 5.71 7.59 3.02
4.51 1.05 9.47 79.05 2.02 4.26 11.25 10.34 10.66 12.03
2.64 14.74 1.19 8.66 14.83 5.62 18.10 25.74 17.36 1.35
9.02 6.94 7.26 4.70+ 3.70 3.64 3.57 11.64 6.25 25.82
3.88 3.02+ 19.36+ 20.28 46.12 5.17 0.20 36.66 10.06 4.94
5.06 16.62 12.07 6.97 0.08 1.40 2.75 7.32 1.26 6.76
8.60+ 7.62 3.52 9.74 0.40 5.41 2.54 2.69 8.26 0.50
5.32 5.09 2.09 7.93 12.02 13.08 5.85 7.09 5.32 4.33+
2.83 8.37 14.77 8.53 11.98 1.76 4.40 34.26 2.07 17.12
12.63 7.66 4.18 13.29 23.63 3.25 7.63 2.87 3.31 2.26
2.69 11.79 5.34 6.93 10.75 13.11 7.39

Appendix D. Example 2 Figures

Figure A1.

Figure A1

Plots of the sample paths for the model, prior and posterior of BS(a=0.01,F0) with different choices of n and k in Example 2.

Figure A2.

Figure A2

Plots of the sample paths for the model, prior and posterior of BS(a=0.01,F0) with different choices of n and k in Example 2.

Figure A3.

Figure A3

Figure A3

Plots of the sample paths for the model, prior and posterior of BS(a=0.01,F0) with different choices of n and k in Example 2.

Author Contributions

Formal analysis, A.A.; Investigation, L.A.-L. and M.A.; Methodology, L.A.-L. and A.A.; Project administration, L.A.-L.; Software, L.A.-L. and M.A.; Writing – review & editing, A.A. and M.A. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Funding Statement

This research received no external funding.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Lawless J.F. Statistical Models and Methods for Lifetime Data. 2nd ed. Wiley & Sons, Inc.; Hoboken, NJ, USA: 2003. [Google Scholar]
  • 2.Florens J.P., Richard J.F., Rolin J.M. Bayesian Encompassing Specification Tests of a Parametric Model against a Nonparametric Alternative. Universitse Catholique de Louvain, Institut de Statistique; Neuchatel, Switzerland: 1996. Technical Report 9608. [Google Scholar]
  • 3.Carota C., Parmigiani G. In: On Bayes Factors for Nonparametric Alternatives. Bernardo J.M., Berger J., Dawid A.P., Smith A.F.M., editors. Oxford University Press; London, UK: 1996. In Bayesian Statistics 5. [Google Scholar]
  • 4.Verdinelli I., Wasserman L. Bayesian goodness-of-fit testing using finite-dimensional exponential families. Ann. Stat. 1998;26:1215–1241. doi: 10.1214/aos/1024691240. [DOI] [Google Scholar]
  • 5.Berger J.O., Guglielmi A. Bayesian testing of a parametric model versus nonparametric alternatives. J. Am. Stat. Assoc. 2001;96:174–184. doi: 10.1198/016214501750333045. [DOI] [Google Scholar]
  • 6.McVinish R., Rousseau J., Mengersen K. Bayesian goodness of fit testing with mixtures of triangular distributions. Scand. J. Stat. 2009;36:337–354. doi: 10.1111/j.1467-9469.2008.00620.x. [DOI] [Google Scholar]
  • 7.Hsieh P. A nonparametric assessment of model adequacy based on Kullback-Leibler-divergence. Stat. Comput. 2011;23:149–162. doi: 10.1007/s11222-011-9298-0. [DOI] [Google Scholar]
  • 8.Swartz T.B. Nonparametric goodness-of-fit. Commun. Stat. Theory Methods. 1999;28:2821–2841. doi: 10.1080/03610929908832452. [DOI] [Google Scholar]
  • 9.Viele K. Evaluating Fit Using Dirichlet Processes. University of Kentucky, Dept. of Statistics; Lexington, KY, USA: 2000. Technical Report 384. [Google Scholar]
  • 10.Evans M. Measuring Statistical Evidence Using Relative Belief. Monographs on Statistics and Applied Probability 144, CRC Press, Taylor & Francis Group; London, UK: 2015. [Google Scholar]
  • 11.Abdelrazeq I., Al-Labadi L., Alzaatreh A. On one-sample Bayesian tests for the mean. Statistics. 2020;54:424–440. doi: 10.1080/02331888.2020.1726918. [DOI] [Google Scholar]
  • 12.Al-Labadi L., Patel V., Vakiloroayaei K., Wan C. Kullback-Leibler Divergence for Bayesian Nonparametric Model Checking. J. Korean Stat. Soc. 2021;50:272–289. doi: 10.1007/s42952-020-00072-7. [DOI] [Google Scholar]
  • 13.Evans M., Tomal J. Measuring statistical evidence and multiple testing. FACET. 2018;3:563–583. doi: 10.1139/facets-2017-0121. [DOI] [Google Scholar]
  • 14.Damien P., Walker S. A Bayesian non-parametric comparison of two treatments. Scand. J. Stat. 2002;29:51–56. doi: 10.1111/1467-9469.00891. [DOI] [Google Scholar]
  • 15.Al-Labadi L., Zarepour M. A Bayesian nonparametric goodness of fit test for right censored data based on approximate samples from the beta-Stacy process. Can. J. Stat. 2013;41:466–487. doi: 10.1002/cjs.11188. [DOI] [Google Scholar]
  • 16.Chen Y., Hanson T. Bayesian nonparametric k-sample tests for censored and uncensored data. Comput. Stat. Data Anal. 2014;71:335–346. doi: 10.1016/j.csda.2012.11.003. [DOI] [Google Scholar]
  • 17.Hjort N.L. Nonparametric Bayes estimators based on Beta processes in models for life history data. Ann. Stat. 1990;18:1259–1294. doi: 10.1214/aos/1176347749. [DOI] [Google Scholar]
  • 18.Walker S., Muliere P. Beta-stacy processes and a generalisation of the polya-urn scheme. Ann. Stat. 1997;25:1762–1780. doi: 10.1214/aos/1031594741. [DOI] [Google Scholar]
  • 19.Phadia E.G. Prior Processes and Their Applications. Springer; Berlin/Heidelberg, Germany: 2013. [Google Scholar]
  • 20.D’Agostino R.B., Stephens M.A. Goodness-of-Fit Techniques. Marcel Dekker Inc.; New York, NY, USA: 1986. [Google Scholar]
  • 21.Ferguson T.S., Phadia E.G. Bayesian nonparametric estimation based on censored data. Ann. Stat. 1979;7:163–186. doi: 10.1214/aos/1176344562. [DOI] [Google Scholar]
  • 22.Kim Y., Lee J. On posterior consistency of survival models. Ann. Stat. 2001;29:666–686. doi: 10.1214/aos/1009210685. [DOI] [Google Scholar]
  • 23.Lee J., Kim Y. A new algorithm to generate beta processes. Comput. Stat. Data Anal. 2004;25:401–405. doi: 10.1016/j.csda.2003.12.008. [DOI] [Google Scholar]
  • 24.Al-Labadi L., Evans M. Prior based model checking. Can. J. Stat. 2018;46:380–398. doi: 10.1002/cjs.11457. [DOI] [Google Scholar]
  • 25.Evans M., Moshonov H. Checking for prior-data conflict. Bayesian Anal. 2006;1:893–914. doi: 10.1214/06-BA129. [DOI] [Google Scholar]
  • 26.Al-Labadi L., Evans M. Optimal robustness results for relative belief inferences and the relationship to prior-data conflict. Bayesian Anal. 2017;12:705–728. doi: 10.1214/16-BA1024. [DOI] [Google Scholar]
  • 27.Al-Labadi L., Zarepour M. Two-sample Bayesian nonparametric goodness-of-fit test. Math. Methods Stat. 2017;26:212–225. doi: 10.3103/S1066530717030048. [DOI] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

Not applicable.


Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)

RESOURCES