Estimation and model selection of semiparametric multivariate survival functions under general censorship

Xiaohong Chen; Yanqin Fan; Demian Pouzo; Zhiliang Ying

doi:10.1016/j.jeconom.2009.10.021

Author manuscript; available in PMC: 2014 Apr 28.

Published in final edited form as: J Econom. 2010 Jul 1;157(2):129–142. doi: 10.1016/j.jeconom.2009.10.021

PMCID: PMC4002182 NIHMSID: NIHMS491941 PMID: 24790286

Abstract

We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.

Keywords: Multivariate survival models, Misspecified copulas, Penalized pseudo-likelihood ratio, Fixed or random censoring, Kaplan–Meier estimator

1. Introduction

Economic, financial, and medical multivariate survival data are typically non-normally distributed and exhibit nonlinear dependence among their component variables. A class of semiparametric multivariate survival models that has proven to be useful in modeling such data is the class of semiparametric copula-based multivariate survival functions in which the marginal survival functions are nonparametric, but the copula functions characterizing the dependence structure between the component variables are parameterized. More specifically, let X = (X₁, …, X_d)′ be the survival variables of interest with a d-variate joint survival function: F^o(x₁, …, x_d) = P(X₁ > x₁, …, X_d > x_d) and marginal survival functions $F_{j}^{0} (\cdot) (j = 1, \dots, d)$ . Assume that $F_{j}^{0} (j = 1, \dots, d)$ are continuous. A straightforward application of Sklar's (1959) theorem shows that there exists a unique d-variate copula function C^o such that $F^{0} (x_{1}, \dots, x_{d}) \equiv C^{0} (F_{1}^{0} (x_{1}), \dots, F_{d}^{0} (x_{d}))$ , where the copula C^o(·) : [0, 1]^d → [0, 1] is itself a multivariate probability distribution function; it captures the dependence structure among the component variables X₁, …, X_d. This decomposition of the joint survival function leads naturally to the class of semiparametric multivariate survival functions in which the marginal survival functions are unspecified, but the copula function is parameterized: C^o(u₁, …, u_d) = C^o(u₁, …, u_d; α^o) for some parametric copula function C^o(u₁, …, u_d; α) and some value $α^{0} \in A$ . As a multivariate survival function in this class depends on nonparametric functions of only one dimension, it achieves dimension reduction while maintaining a more flexible form than purely parametric survival functions. This class of semiparametric multivariate survival functions has been used widely in survival analysis, where modeling and estimating the dependence structure between survival variables is of importance. 
See Joe (1997), Nelsen (1999), Oakes (1989, 1994), Frees and Valdez (1998) and Li (2000) for examples of such applications.

A semiparametric copula-based multivariate survival model has two sets of unknown parameters: the unknown marginal survival functions $F_{j}^{0}, j = 1, \dots, d$ , and the copula parameter α^o of the parametric copula function C^o(u₁, …, u_d; α^o). For complete data (i.e., data without censoring or truncation), Oakes (1994) and Genest et al. (1995) propose a two-step estimation procedure: in the first step, the marginal distribution functions $1 - F_{j}^{0}, j = 1, \dots, d$ are estimated by the rescaled empirical distribution functions; in the second step, the copula parameter α^o is estimated by maximizing the estimated log-likelihood function. For randomly right censored data, Shih and Louis (1995) independently propose the same two-step procedure, except that the Kaplan–Meier estimators of the marginal survival functions are used in the first step. For a random sample of size n, Genest et al. (1995) establish the root-n consistency and asymptotic normality of their two-step estimator of α^o. For randomly right censored data, Shih and Louis (1995) derive similar large sample properties of their two-step estimator of α^o under the assumption of bounded partial derivatives of score functions. Unfortunately, this assumption is violated by many commonly used copulas, including the Gaussian copula, the Student's t copula, the Clayton copula and the Gumbel copula. In addition, Shih and Louis (1995) assume that the censoring scheme is i.i.d. random and that the parametric copula function is correctly specified.

A closely related important issue in applying this class of semiparametric survival functions to a given data set is how to choose an appropriate parametric copula, as different parametric copulas lead to survival functions that may have very different dependence properties. A number of existing papers have attempted to address this issue. For complete data, we refer to Chen and Fan (2005, 2006a) for a detailed discussion of existing approaches and references. For bivariate censored data, existing work includes Frees and Valdez (1998), Klugman and Parsa (1999), Wang and Wells (2000), Chen and Fan (2007), and Denuit et al. (2006). Frees and Valdez (1998) and Klugman and Parsa (1999) consider fully parametric models of bivariate distribution (or survival) functions, and they address model selection of parametric copulas and parametric marginals for insurance company data on losses and allocated loss adjustment expenses (ALAEs). The particular data set they use was collected by the US Insurance Services Office; in it, loss is censored by a fixed censoring mechanism and ALAE is not censored. Using various model selection techniques including AIC/BIC, Frees and Valdez (1998) select the Pareto marginal distributions and the Gumbel copula, while Klugman and Parsa (1999) select the inverse paralogistic for the loss marginal distribution, the inverse Burr for the ALAE marginal distribution and the Frank copula. Wang and Wells (2000), Denuit et al. (2006) and Chen and Fan (2007) consider model selection of semiparametric bivariate distribution (or survival) functions in which they do not specify marginals, but restrict the parametric copulas to be in the Archimedean family. In particular, Wang and Wells (2000) propose a model selection procedure for comparing copulas in the one-parameter Archimedean family, allowing for various censoring mechanisms, as long as a consistent nonparametric estimator for the bivariate joint distribution (or survival) function is available. 
Their selection procedure is based on comparing point estimates of the integrated squared difference between the true Archimedean copula and a parametric copula; the one with the smallest value of the integrated squared difference is chosen over the rest of the one-parameter Archimedean copulas. Denuit et al. (2006) apply Wang and Wells's (2000) procedure to copula model selection for the same Loss-ALAE data set studied in Frees and Valdez (1998). They use a nonparametric estimator of the bivariate distribution that takes into account the fixed censoring mechanism underlying the Loss-ALAE data. They examine four one-parameter Archimedean copulas (Gumbel, Clayton, Frank and Joe) and select the Gumbel copula since it yields the smallest estimated integrated squared difference. Chen and Fan (2007) propose a model selection test for comparing multiple semiparametric bivariate survival functions by taking into account the randomness in the estimated integrated squared difference. However, their test is still applicable only to model selection of parametric copulas within the Archimedean family. It is known that a one- or two-parameter Archimedean copula family could be too restrictive to capture various dependence structures among multivariate variables. In addition, the semiparametric model selection procedures in Wang and Wells (2000), Denuit et al. (2006) and Chen and Fan (2007) require consistent nonparametric estimation of the joint distribution function, and the limiting distributions are complicated. As a result, even for a parametric Archimedean copula family, these tests are difficult to implement for multivariate (higher than bivariate) data with general censorship.

In this paper we bridge the gap in existing work for estimating and selecting a semiparametric multivariate copula-based survival model by (i) allowing for data to be censored under various censoring mechanisms, (ii) using nonparametric estimation of marginal survival functions only, and (iii) permitting any parametric copula specification, which may be misspecified or non-Archimedean and whose score function may have unbounded partial derivatives. For random samples without censoring, Chen and Fan (2005) already consider the pseudo-likelihood estimation of copula parameters and the pseudo-likelihood ratio (PLR) model selection test for semiparametric multivariate copula-based distribution models, accounting for (ii) and (iii). In this paper, we extend their results to allow for general right censorship. In particular, we first establish the convergence of the two-step estimator of the copula parameter to the pseudo-true value, defined as the value of the parameter that minimizes the Kullback-Leibler Information Criterion (KLIC) between the parametric copula induced multivariate density and the unknown true density. We then derive its root-n asymptotically normal distribution and provide a simple consistent asymptotic variance estimator by accounting for (i), (ii) and (iii). These results are used to establish the asymptotic distribution of the penalized PLR statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. We also propose a standardized version of the test, whose limiting null distribution is easy to simulate. To illustrate the usefulness of our testing procedure, we apply it to copula model selection for the Loss-ALAE data, taking into account the underlying censoring mechanism in the data and allowing parametric copulas to exhibit more flexible dependence structures than those in the Archimedean family. 
We find that the standardized test is generally more powerful than the non-standardized test.

The rest of this paper is organized as follows. Section 2 introduces the model selection criterion function and the two-step estimation of the copula dependence parameter. In Section 3, we study the large sample properties of the pseudo-likelihood estimator of the copula parameter allowing for independent but general right censorship and misspecified parametric copulas. In Section 4, we present the limiting null distributions of the (penalized) PLR test statistics for model selection among multiple semiparametric copula models for multivariate censored data. Section 5 provides an empirical application to the Loss-ALAE data set and Section 6 briefly concludes. All technical proofs are gathered into the Appendix.

2. Model selection criterion and parameter estimation

To simplify notation, we shall present our results for bivariate survival models only. Obviously, all these results have straightforward extensions to multivariate copula models for survival data with any finite dimension.

In the following we shall use (D₁, D₂) to denote the censoring variables. Thus under right censorship, one observes $({\tilde{X}}_{1}, {\tilde{X}}_{2})$ = (X₁ ∧ D₁, X₂ ∧ D₂) and a pair of indicators, (δ₁, δ₂) = (I{X₁ ≤ D₁}, I{X₂ ≤ D₂}), where a ∧ b = min(a, b) for real numbers a and b and I{·} is the indicator function. We assume that the censoring variables (D₁, D₂) are independent of the survival variables (X₁, X₂). Let $F_{j}^{0} (x_{j}) = P (X_{j} > x_{j})$ denote the true but unknown marginal survival function of X_j for j = 1, 2. Suppose n independent (but possibly non-identically distributed) observations $\{({\tilde{X}}_{1 t}, {\tilde{X}}_{2 t}, δ_{1 t}, δ_{2 t})\}_{t = 1}^{n}$ are available, where $({\tilde{X}}_{1 t}, {\tilde{X}}_{2 t})$ = (X_1t ∧ D_1t, X_2t ∧ D_2t) and (δ_1t, δ_2t) = (I{X_1t ≤ D_1t}, I{X_2t ≤ D_2t}). Denote $U_{t} = (U_{1 t}, U_{2 t}) = (F_{1}^{0} ({\tilde{X}}_{1 t}), F_{2}^{0} ({\tilde{X}}_{2 t}))$ .
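As a minimal sketch of the observation scheme just described, the following Python fragment builds the observed pair and the censoring indicators from latent survival and censoring times. The function names `censor` and `observe_pair` are hypothetical and introduced only for illustration; an uncensored coordinate is represented by an infinite censoring time.

```python
def censor(x, d):
    """Return (min(x, d), I{x <= d}) for one coordinate."""
    return (min(x, d), 1 if x <= d else 0)


def observe_pair(x1, x2, d1, d2):
    """Observed data (x1_tilde, x2_tilde, delta1, delta2) under right censorship."""
    xt1, delta1 = censor(x1, d1)
    xt2, delta2 = censor(x2, d2)
    return xt1, xt2, delta1, delta2


# First coordinate censored at a fixed time 2.0, second coordinate uncensored.
print(observe_pair(3.0, 5.0, 2.0, float("inf")))  # (2.0, 5.0, 0, 1)
```

The same helper covers random censoring, fixed censoring and no censoring, since only the realized value of the censoring time enters.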

2.1. Model selection criterion

Let $\{C_{i} (u_{1}, u_{2}; α_{i}) : α_{i} \in A_{i} \subset R^{p_{i}}\}$ be a class of parametric copulas with i = 1, 2, …, M. By Sklar's (1959) theorem, each parametric copula family i corresponds to a parametric log-likelihood $L_{i, n} (α_{i}) \equiv \sum_{t = 1}^{n} ℓ_{i} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{i})$ , where

ℓ_{i} (u_{1 t}, u_{2 t}, δ_{1 t}, δ_{2 t}; α_{i}) = δ_{1 t} δ_{2 t} \log c_{i} (u_{1 t}, u_{2 t}; α_{i}) + δ_{1 t} (1 - δ_{2 t}) \log \frac{\partial C_{i} (u_{1 t}, u_{2 t}; α_{i})}{\partial u_{1}} + δ_{2 t} (1 - δ_{1 t}) \log \frac{\partial C_{i} (u_{1 t}, u_{2 t}; α_{i})}{\partial u_{2}} + (1 - δ_{1 t}) (1 - δ_{2 t}) \log C_{i} (u_{1 t}, u_{2 t}; α_{i}),

where $c_{i} (u_{1}, u_{2}; α_{i}) = \frac{\partial^{2} C_{i} (u_{1}, u_{2}; α_{i})}{\partial u_{1} \partial u_{2}}$ is the density function of copula C_i(u₁, u₂; α_i).
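To make the four cases of the censored log-likelihood contribution concrete, here is a minimal sketch for one particular family, the Clayton copula C(u₁, u₂; θ) = (u₁^{−θ} + u₂^{−θ} − 1)^{−1/θ} with θ > 0. The choice of the Clayton family and the function names `clayton_parts` and `censored_loglik` are assumptions made for illustration only; any parametric copula with closed-form partial derivatives could be substituted.

```python
import math


def clayton_parts(u1, u2, theta):
    """Return (C, dC/du1, dC/du2, c) for the Clayton copula, theta > 0."""
    s = u1 ** (-theta) + u2 ** (-theta) - 1.0
    big_c = s ** (-1.0 / theta)
    dc1 = u1 ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)
    dc2 = u2 ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)
    dens = (1.0 + theta) * (u1 * u2) ** (-theta - 1.0) * s ** (-1.0 / theta - 2.0)
    return big_c, dc1, dc2, dens


def censored_loglik(u1, u2, d1, d2, theta):
    """Log-likelihood contribution of one observation with indicators d1, d2."""
    big_c, dc1, dc2, dens = clayton_parts(u1, u2, theta)
    return (d1 * d2 * math.log(dens)                 # both uncensored
            + d1 * (1 - d2) * math.log(dc1)          # only coordinate 2 censored
            + d2 * (1 - d1) * math.log(dc2)          # only coordinate 1 censored
            + (1 - d1) * (1 - d2) * math.log(big_c)) # both censored
```

Exactly one of the four terms is active for any observation, because the products of indicators are mutually exclusive.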

In this paper, we are interested in testing whether a benchmark model (say copula model 1) performs significantly better than the rest of the copula models according to the KLIC. Let E⁰ denote the expectation with respect to the true probability measure. Define

α_{in}^{*} = \arg\max_{α_{i} \in A_{i}} n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{i})]

as the pseudo-true value that minimizes the KLIC between the i-th parametric copula family induced multivariate density and the unknown true density. To conclude that copula model 1 performs significantly better than the rest of the copula models calls for a formal statistical test, where the null hypothesis is:

H_{0} : \max_{i = 2, \dots, M} n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{1 n}^{*})] \leq 0,

meaning that none of the copula models 2, …, M is closer to the true model (according to KLIC) than model 1, and the alternative hypothesis is:

H_{1} : \max_{i = 2, \dots, M} n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{1 n}^{*})] > 0,

meaning that there exists a copula model from 2, …, M that is closer to the true model (according to KLIC) than model 1.

2.2. Two-step estimation

To construct a test statistic for the null hypothesis H₀ against the alternative H₁, we need estimates of $(U_{1 t}, U_{2 t}) = (F_{1}^{0} ({\tilde{X}}_{1 t}), F_{2}^{0} ({\tilde{X}}_{2 t}))$ and $α_{i n}^{*}$ for i = 1, …, M.

For j = 1, 2, let ${\tilde{F}}_{j} (\cdot)$ be the Kaplan–Meier estimator of $F_{j}^{0} (\cdot) = P (X_{j} > \cdot)$ :

{\tilde{F}}_{1} (x) = \prod_{{\tilde{X}}_{1 (t)} \leq x} \left(1 - \frac{1}{n - t + 1}\right)^{δ_{1 (t)}},

{\tilde{F}}_{2} (x) = \prod_{{\tilde{X}}_{2 (t)} \leq x} \left(1 - \frac{1}{n - t + 1}\right)^{δ_{2 (t)}},

where ${\tilde{X}}_{j (1)} \leq {\tilde{X}}_{j (2)} \leq \dots \leq {\tilde{X}}_{j (n)}$ are the order statistics of $\{{\tilde{X}}_{j t}\}_{t = 1}^{n}$ for j = 1, 2, and $\{δ_{j (t)}\}_{t = 1}^{n} (j = 1, 2)$ are similarly defined. Then under independent censoring, ${\tilde{F}}_{j} (\cdot)$ is consistent for $F_{j}^{0} (\cdot)$ , j = 1, 2; see e.g., Lai and Ying (1991).
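The product formula for the Kaplan–Meier estimator can be sketched in a few lines of plain Python. The function name `kaplan_meier` is hypothetical, and the sketch assumes there are no tied observed times, in line with the continuity of the marginal survival functions.

```python
def kaplan_meier(times, deltas):
    """Return a function x -> Kaplan-Meier estimate of P(X > x).

    times  : observed (possibly censored) times
    deltas : 1 if the time is uncensored, 0 if censored
    Assumes no ties among the observed times.
    """
    n = len(times)
    order = sorted(range(n), key=lambda k: times[k])
    sorted_times = [times[k] for k in order]
    sorted_deltas = [deltas[k] for k in order]

    def survival(x):
        s = 1.0
        for rank, (time, delta) in enumerate(zip(sorted_times, sorted_deltas), start=1):
            if time > x:
                break
            if delta == 1:
                # factor (1 - 1/(n - t + 1)) for the t-th order statistic
                s *= 1.0 - 1.0 / (n - rank + 1)
        return s

    return survival
```

With no censoring, every factor is active and the estimate at the k-th order statistic telescopes to (n − k)/n, the empirical survival function.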

Given the definition of $α_{i n}^{*}$ , a natural estimator for it is the pseudo-likelihood estimator ${\hat{α}}_{i n}$ :

{\hat{α}}_{in} = \arg\max_{α_{i} \in A_{i}} n^{- 1} \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}), δ_{1 t}, δ_{2 t}; α_{i}), i = 1, \dots, M .

Since this estimation procedure involves the first-step nonparametric estimation of the marginal survival functions $F_{j}^{0} (\cdot)$ , j = 1, 2, the estimator ${\hat{α}}_{i n}$ is also called the “two-step” estimator.
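The two steps can be sketched end to end as follows, under several assumptions that are not part of the paper: the copula family is Clayton with a scalar parameter θ, the pseudo-likelihood is unimodal on the search interval so that a golden-section search suffices, and the pseudo-observations are clipped to [eps, 1 − eps] as a crude stand-in for the tail truncation discussed later. All function names are hypothetical.

```python
import math


def km_at_observations(times, deltas):
    """Step 1: Kaplan-Meier survival estimate evaluated at each observed time."""
    n = len(times)
    order = sorted(range(n), key=lambda k: times[k])
    values = [0.0] * n
    s = 1.0
    for rank, k in enumerate(order, start=1):
        if deltas[k] == 1:
            s *= 1.0 - 1.0 / (n - rank + 1)
        values[k] = s
    return values


def clayton_loglik(u1, u2, d1, d2, theta):
    """Censored log-likelihood contribution for the Clayton copula."""
    log_s = math.log(u1 ** (-theta) + u2 ** (-theta) - 1.0)
    log_c_big = -log_s / theta
    log_dc1 = -(theta + 1.0) * math.log(u1) - (1.0 / theta + 1.0) * log_s
    log_dc2 = -(theta + 1.0) * math.log(u2) - (1.0 / theta + 1.0) * log_s
    log_dens = (math.log(1.0 + theta)
                - (theta + 1.0) * (math.log(u1) + math.log(u2))
                - (1.0 / theta + 2.0) * log_s)
    return (d1 * d2 * log_dens + d1 * (1 - d2) * log_dc1
            + d2 * (1 - d1) * log_dc2 + (1 - d1) * (1 - d2) * log_c_big)


def two_step_clayton(x1, x2, d1, d2, lo=0.05, hi=20.0, eps=1e-6, iters=80):
    """Step 2: maximize the average pseudo log-likelihood over theta in [lo, hi]."""
    def clip(u):
        # Assumption: clipping replaces the paper's tail truncation.
        return min(max(u, eps), 1.0 - eps)

    u1 = [clip(u) for u in km_at_observations(x1, d1)]
    u2 = [clip(u) for u in km_at_observations(x2, d2)]

    def objective(theta):
        total = sum(clayton_loglik(a, b, e, f, theta)
                    for a, b, e, f in zip(u1, u2, d1, d2))
        return total / len(u1)

    # Golden-section search for a maximum (assumes a unimodal objective).
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if objective(c) < objective(d):
            a = c
        else:
            b = d
    return 0.5 * (a + b)
```

A derivative-based optimizer would normally be preferable for vector-valued parameters; the one-dimensional search is used here only to keep the sketch dependency-free.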

Note that no assumption is made on the censoring variables (D_1t, D_2t) other than their independence of the survival variables (X_1t, X_2t). As a result, various censoring mechanisms are allowed, including simple random censoring, fixed censoring, and of course no censoring. If the censoring variables are fixed at D_jt = +∞ for j = 1, 2, ${\hat{α}}_{i n}$ becomes the estimator proposed in Genest et al. (1995). If the censoring variables (D_1t, D_2t) are i.i.d. with a continuous joint survival function, ${\hat{α}}_{i n}$ becomes the estimator proposed in Shih and Louis (1995). Assuming that the parametric copula density c_i(u₁, u₂; α_i) is correctly specified and that log c_i(u₁, u₂; α_i) has bounded partial derivatives with respect to u₁, u₂, Shih and Louis (1995) establish the root-n asymptotic normality of ${\hat{α}}_{i n}$ and provide a consistent estimator of its asymptotic variance for i.i.d. randomly censored data.

The censoring mechanism for the Loss-ALAE data is non-random; ALAE is not censored and Loss is censored by a constant which differs from one individual to another. Results in Shih and Louis (1995) may not be directly applicable to this data set even under a correct specification of the copula function. Moreover, for model selection, we need to establish the asymptotic properties of the two-step estimator under copula misspecification. This will be done in the next section for a general censoring mechanism.

2.3. Penalized pseudo-likelihood ratio criteria

To test the null hypothesis H₀ against the alternative H₁, we use the PLR statistic:

{LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) = {\tilde{L}}_{i, n} ({\hat{α}}_{i n}) - {\tilde{L}}_{1, n} ({\hat{α}}_{1 n}), i = 2, \dots, M,

where

{\tilde{L}}_{i, n} ({\hat{α}}_{in}) \equiv \frac{1}{n} \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}), δ_{1 t}, δ_{2 t}; {\hat{α}}_{in}), i = 1, \dots, M .

In most applications, several parametric copula families are compared which may have different numbers of parameters. To take this into account, we follow the approach in Sin and White (1996) by adopting a general penalization of model complexity. Let Pen(p_i, n) denote a penalization term such that Pen(p_i, n) increases with p_i = dim( $A_{i}$ ), decreases with n, and Pen(p_i, n)/n → 0. Then the penalized PLR statistic is

\begin{matrix} {PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) & = {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - \frac{Pen (p_{i}, n) - Pen (p_{1}, n)}{n} \\ & = {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) + o_{p} (1) . \end{matrix}

We note that Pen(p_i, n) = p_i corresponds to the AIC, and Pen(p_i, n) = 0.5p_i log n corresponds to the BIC.
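As a small numerical illustration of the penalized statistic, the sketch below takes the two average pseudo log-likelihoods as already computed and applies either penalty. The function names `penalty` and `penalized_plr`, and the string labels for the two penalties, are hypothetical.

```python
import math


def penalty(p, n, kind):
    """Pen(p, n): p for AIC, 0.5 * p * log(n) for BIC."""
    if kind == "aic":
        return float(p)
    if kind == "bic":
        return 0.5 * p * math.log(n)
    raise ValueError("kind must be 'aic' or 'bic'")


def penalized_plr(avg_loglik_i, avg_loglik_1, p_i, p_1, n, kind="aic"):
    """PLR_n = LR_n - (Pen(p_i, n) - Pen(p_1, n)) / n for candidate i vs. benchmark 1."""
    lr = avg_loglik_i - avg_loglik_1
    return lr - (penalty(p_i, n, kind) - penalty(p_1, n, kind)) / n


# Candidate with 2 parameters against a 1-parameter benchmark, n = 100.
print(penalized_plr(-1.0, -1.1, 2, 1, 100, "aic"))
print(penalized_plr(-1.0, -1.1, 2, 1, 100, "bic"))
```

A positive value favors the candidate over the benchmark in the sample, but on its own it says nothing about statistical significance.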

In many existing applications of copula models, AIC has been used to compare different families of parametric copula models. To be more specific, let

{AIC}_{i} = - \frac{2}{n} \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}), δ_{1 t}, δ_{2 t}; {\hat{α}}_{in}) + \frac{2 p_{i}}{n}, i = 1, \dots, M .

Then the values of AIC_i for i = 1, …, M are compared; copula model 1 will be selected if AIC₁ = min{AIC_i : 1 ≤ i ≤ M} or equivalently if

{LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - \frac{p_{i} - p_{1}}{n} < 0, i = 2, \dots, M .

(2.1)

Note, however, that ${PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{i n}, {\hat{α}}_{1 n})$ (like AIC_i) is a random variable. The fact that ${PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{i n}, {\hat{α}}_{1 n}) < 0$ for i = 2, …, M (or that inequality (2.1) holds) for one sample $\{{\tilde{X}}_{1 t}, {\tilde{X}}_{2 t}, δ_{1 t}, δ_{2 t}\}_{t = 1}^{n}$ may not imply that copula model 1 performs significantly better than the rest of the models; it may occur by chance. As we will show in the next section, ${PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{i n}, {\hat{α}}_{1 n}) = n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{i n}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}, δ_{1 t}, δ_{2 t}; α_{1 n}^{*})] + o_{p} (1)$ for i = 2, …, M. To conclude that copula model 1 performs significantly better than the rest of the models, we need to perform a formal statistical test for H₀ against H₁.

To test H₀, we have to take into account the randomness of the (penalized) PLR statistic. More precisely, we need to derive the asymptotic distributions of ${\hat{α}}_{i n}$ and the test statistics under the null hypothesis. This will be accomplished in Sections 3 and 4 of this paper.

3. Asymptotic properties of the two-step estimator under copula misspecification

As mentioned in the previous section, asymptotic properties of the two-step estimator are established for randomly censored data in Shih and Louis (1995) under the assumptions that the parametric copula density correctly specifies the true copula density and that its score function has bounded partial derivatives. In this section, we will extend their results to a more general censoring mechanism and allow for misspecified parametric copulas whose score functions may have unbounded partial derivatives.

Recall that $A \subset R^{p}$ is the parameter space. For α, $α^{*} \in A$ we use ∥α − α*∥ to denote the usual Euclidean metric. To simplify notation, we now let

ℓ (u_{1}, u_{2}; α) = δ_{1} δ_{2} \log c (u_{1}, u_{2}; α) + δ_{1} (1 - δ_{2}) \log \frac{\partial C (u_{1}, u_{2}; α)}{\partial u_{1}} + δ_{2} (1 - δ_{1}) \log \frac{\partial C (u_{1}, u_{2}; α)}{\partial u_{2}} + (1 - δ_{1}) (1 - δ_{2}) \log C (u_{1}, u_{2}; α),

where c(u₁, u₂; α) is the density of the parametric copula C(u₁, u₂; α). Then the pseudo-true copula parameter value is $α_{n}^{*} = arg \max_{α \in A} n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ (U_{1 t}, U_{2 t}; α)]$ , and its two-step estimator is ${\hat{α}}_{n} = arg \max_{α \in A} n^{- 1} \sum_{t = 1}^{n} ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α)$ .

Finally we denote

ℓ_{α} (u_{1}, u_{2}; α) = \frac{\partial ℓ (u_{1}, u_{2}; α)}{\partial α},

ℓ_{j} (u_{1}, u_{2}; α) = \frac{\partial ℓ (u_{1}, u_{2}; α)}{\partial u_{j}}, (j = 1, 2),

ℓ_{α α} (u_{1}, u_{2}; α) = \frac{\partial^{2} ℓ (u_{1}, u_{2}; α)}{\partial α^{2}} and

ℓ_{α j} (u_{1}, u_{2}; α) = \frac{\partial^{2} ℓ (u_{1}, u_{2}; α)}{\partial u_{j} \partial α} for j = 1, 2 .

3.1. Consistency

The following conditions are sufficient to ensure the convergence of the two-step estimator ${\hat{α}}_{n}$ to the pseudo-true value $α_{n}^{*}$ .

C1
- (i)
  The sequence of survival variables, ${(X_{1 t}, X_{2 t})}_{t = 1}^{n}$ is an i.i.d. sample from an unknown survival function F^o(x₁, x₂) with continuous marginal survival functions $F_{j}^{0} (\cdot), j = 1, 2$ ;
- (ii)
  The sequence of censoring variables $\{(D_{1 t}, D_{2 t})\}_{t = 1}^{n}$ is an independent sample with joint survival functions $\{G_{t} (x_{1}, x_{2})\}_{t = 1}^{n} = \{P (D_{1 t} > x_{1}, D_{2 t} > x_{2})\}_{t = 1}^{n}$ and marginal survival functions $\{G_{j t} (\cdot)\}_{t = 1}^{n}, j = 1, 2$ ;
- (iii)
  The censoring variables (D_1t, D_2t) are independent of the survival variables (X_1t, X_2t) and there is no mass concentration at 0 in the sense that $\limsup_{n \to \infty} n^{- 1} \sum_{t = 1}^{n} (1 - G_{j t} (η)) \to 0$ as η → 0.
C2
Let $A$ be a compact subset of $R^{p}$ . For every ∊ > 0,
$\liminf_{α \in A : ‖ α - α_{n}^{*} ‖ \geq ∊} \frac{1}{n} \sum_{t = 1}^{n} [E^{0} \{ℓ (U_{1 t}, U_{2 t}; α_{n}^{*})\} - E^{0} \{ℓ (U_{1 t}, U_{2 t}; α)\}] > 0 .$
C3
The true (unknown) copula function C^o(u₁, u₂) has continuous partial derivatives.
C4
- (i)
  For any (u₁, u₂) ∈ (0, 1)², ℓ(u₁, u₂; α) is a continuous function of $α \in A$
- (ii)
  Let $L_{t} = \sup_{α \in A} ∣ ℓ (U_{1 t}, U_{2 t}; α) ∣$ and $L_{t α} = \sup_{α \in A} ∣ ℓ_{α} (U_{1 t}, U_{2 t}; α) ∣$ . Then,
  $\lim_{K \to \infty} \limsup_{n \to \infty} n^{- 1} \sum_{t = 1}^{n} E^{0} \{L_{t} I (L_{t} \geq K) + L_{t α} I (L_{t α} \geq K)\} = 0;$
- (iii)
  For any η > 0, ∊ > 0, there is K > 0 such that $∣ ℓ (u_{1}, u_{2}; α) ∣ \leq K ∣ ℓ (u_{1}^{'}, u_{2}^{'}; α) ∣$ for all $α \in A$ and all u_j ∈ [η, 1) such that $1 - u_{j} \geq ∊ (1 - u_{j}^{'}), j = 1, 2$ .
C5
If $\{X_{j t}\}_{t = 1}^{n}$ are subject to non-trivial censoring (i.e., D_jt ≠ ∞), then ${\tilde{F}}_{j}$ is truncated at the tail in the sense that for some $τ_{j}, {\tilde{F}}_{j} (x_{j}) = {\tilde{F}}_{j} (τ_{j})$ for all x_j ≥ τ_j and $\liminf n^{- 1} \sum_{t = 1}^{n} G_{j t} (τ_{j}) F_{j}^{0} (τ_{j}) > 0$ .

Note that, in contrast to the censoring mechanism in Shih and Louis (1995), Condition C1(ii) allows the censoring variables $\{(D_{1 t}, D_{2 t})\}_{t = 1}^{n}$ to be non-identically distributed. In addition, no assumption is made on the joint survival function G_t(x₁, x₂) of the censoring variables (D_1t, D_2t). Hence Condition C1(ii) includes the fixed censoring mechanism in which each survival variable (X_1t, X_2t) is censored at a pre-specified, fixed time (D_1t, D_2t) which may differ from one observation to another, in which case the survival function G_t(x₁, x₂) is degenerate at (D_1t, D_2t). It also allows the variables X_1t and X_2t to have different censoring mechanisms, one random and the other fixed, or one censored and the other uncensored. For example, the censoring mechanism for the Loss-ALAE data is such that Loss is censored by a fixed censoring mechanism and ALAE is uncensored. As a result, the observed variables $\{({\tilde{X}}_{1 t}, {\tilde{X}}_{2 t})\}_{t = 1}^{n}$ may not be identically distributed and the identifiably unique maximizer $α_{n}^{*}$ defined in Condition C2 may depend on n. Condition C5 is imposed to handle the possible tail instability of the Kaplan–Meier estimator, especially for non-identically distributed censoring times. The truncation can be achieved by simply using D_jt ∧ τ_j as the censoring variables. Thus, without loss of generality, we shall assume that D_jt ∧ τ_j are the censoring variables so that ${\tilde{X}}_{j t} \leq τ_{j}$ . The simple truncation at τ_j can be replaced by a more elaborate tail modification. We refer to Lai and Ying (1991) for the issue of tail instability and modification. Finally, because we allow the left tail of the copula to blow up as well, we shall set $ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) = 0$ whenever ${\tilde{F}}_{j} ({\tilde{X}}_{j t}) = 1$ for j = 1 or 2.

Proposition 3.1. Under conditions C1–C5, we have:

(1)
$‖ {\hat{α}}_{n} - α_{n}^{*} ‖ = o_{p} (1);$
(2)
$\frac{1}{n} \sum_{t = 1}^{n} ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{n}) = \frac{1}{n} \sum_{t = 1}^{n} E^{0} \{ℓ (F_{1}^{0} ({\tilde{X}}_{1 t}), F_{2}^{0} ({\tilde{X}}_{2 t}); α_{n}^{*})\} + o_{p} (1) .$

Proposition 3.1(1) states that the two-step estimator ${\hat{α}}_{n}$ is a consistent estimator of the pseudo-true value $α_{n}^{*}$ . If the censoring mechanism is random, then $α_{n}^{*} = α^{*}$ , which does not depend on n. In addition, if the parametric copula correctly specifies the true copula, then $α^{*} = α^{0}$ , where α^o is such that C(u₁, u₂; α^o) = C^o(u₁, u₂) for almost all (u₁, u₂) ∈ (0, 1)².

3.2. Asymptotic normality

Recall that $(U_{1 t}, U_{2 t}) = (F_{1}^{0} ({\tilde{X}}_{1 t}), F_{2}^{0} ({\tilde{X}}_{2 t}))$ . For j = 1, 2, we denote

W_{j} ({\tilde{X}}_{jt}, δ_{jt}; α_{n}^{*}) \equiv E^{0} [ℓ_{α j} (U_{1 s}, U_{2 s}; α_{n}^{*}) I_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}) ∣ {\tilde{X}}_{jt}, δ_{jt}],

I_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}) \equiv - F_{j}^{o} ({\tilde{X}}_{js}) \left[ \int_{- \infty}^{{\tilde{X}}_{js}} \frac{d N_{jt} (u)}{P_{n, j} (u)} - \int_{- \infty}^{{\tilde{X}}_{js}} \frac{I \{{\tilde{X}}_{jt} \geq u\} d Λ_{j} (u)}{P_{n, j} (u)} \right],

with $Λ_{j} (u) \equiv - \log (F_{j}^{0} (u))$ the cumulative hazard function of X_j, $N_{j t} (u) \equiv δ_{j t} I \{{\tilde{X}}_{j t} \leq u\}$ and dN_jt (u) = N_jt (u) − N_jt (u−), and $P_{n, j} (u) \equiv n^{- 1} \sum_{k = 1}^{n} P ({\tilde{X}}_{j k} \geq u) = n^{- 1} \sum_{k = 1}^{n} G_{j k} (u) F_{j}^{0} (u)$ .

Let Var⁰ denote the variance with respect to the true probability measure. The following conditions are sufficient to ensure the asymptotic normality of ${\hat{α}}_{n}$ .

A1
- (i)
  C2 holds with $α_{n}^{*}$ ∈ int(A*) for all n, where A* is a compact subset of $A$ ;
- (ii)
  $B_{n} \equiv - n^{- 1} \sum_{t = 1}^{n} E^{0} {ℓ_{α α} (U_{1 t}, U_{2 t}; α_{n}^{*})}$ has all its eigenvalues bounded below and above by some finite positive constants;
- (iii)
  $Σ_{n} \equiv n^{- 1} \sum_{t = 1}^{n} {Var}^{0} {ℓ_{α} (U_{1 t}, U_{2 t}; α_{n}^{*}) + W_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; α_{n}^{*}) + W_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; α_{n}^{*})}$ has all its eigenvalues bounded below and above by some finite positive constants;
- (iv)
  $\{ℓ_{α} (U_{1 t}, U_{2 t}; α_{n}^{*}) + W_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; α_{n}^{*}) + W_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; α_{n}^{*})\}_{t = 1}^{n}$ satisfies the Lindeberg condition.
A2
Functions ℓ_αα(u₁, u₂; α) and ℓ_αj(u₁, u₂; α), j = 1, 2, are well-defined and continuous in $(u_{1}, u_{2}, α) \in {(0, 1)}^{2} \times A$ .
A3
- (i)
  $∣ ℓ_{α} (u_{1}, u_{2}; α_{n}^{*}) ∣ \leq q \{u_{1} (1 - u_{1})\}^{- a_{1}} \{u_{2} (1 - u_{2})\}^{- a_{2}}$ for some q > 0 and a_j ≥ 0 such that $\limsup n^{- 1} \sum_{t = 1}^{n} E^{0} [\{U_{1 t} (1 - U_{1 t})\}^{- 2 a_{1}} \{U_{2 t} (1 - U_{2 t})\}^{- 2 a_{2}}] < \infty$ ;
- (ii)
  $∣ ℓ_{α j} (u_{1}, u_{2}; α_{n}^{*}) ∣ \leq const. \{u_{j} (1 - u_{j})\}^{- b_{j}} \{u_{k} (1 - u_{k})\}^{- a_{k}}$ for some b_k, a_k and j ≠ k such that $\limsup n^{- 1} \sum_{t = 1}^{n} E^{0} [\{U_{j t} (1 - U_{j t})\}^{ξ_{j} - b_{j}} \{U_{k t} (1 - U_{k t})\}^{- a_{k}}] < \infty$ for some ξ_j ∈ (0, 1/2).
A4
- (i)
  Let $L_{t α j} = \sup_{α \in A} ∣ ℓ_{α j} (U_{1 t}, U_{2 t}; α) ∣$ and $L_{t α α} = \sup_{α \in A} ∣ ℓ_{α α} (U_{1 t}, U_{2 t}; α) ∣$ . Then,
  $\lim_{K \to \infty} \limsup_{n \to \infty} n^{- 1} \sum_{t = 1}^{n} E^{0} \{L_{t α j} I (L_{t α j} \geq K) + L_{t α α} I (L_{t α α} \geq K)\} = 0;$
- (ii)
  For any η > 0 and any ∊ > 0, there is K > 0, such that $∣ ℓ_{α} (u_{1}, u_{2}; α) ∣ + ∣ ℓ_{α α} (u_{1}, u_{2}; α) ∣ \leq K {∣ ℓ_{α} (u_{1}^{'}, u_{2}^{'}; α) ∣ + ∣ ℓ_{α α} (u_{1}^{'}, u_{2}^{'}; α) ∣}$ for all $α \in A$ and all u_j ∈ [η, 1) such that $1 - u_{j} \geq ∊ (1 - u_{j}^{'})$ , j = 1, 2.

Shih and Louis (1995) require bounded $ℓ_{α} (u_{1}, u_{2}; α_{n}^{*})$ and $ℓ_{α j} (u_{1}, u_{2}; α_{n}^{*})$ for j = 1, 2; however, this requirement is not satisfied by many popular copula functions such as the Gaussian copula, the t-copula, the Gumbel copula and the Clayton copula. Conditions A3 and A4 relax the boundedness requirement, and allow the score function and its partial derivatives with respect to the first two arguments to blow up at the boundaries. Similar conditions have been verified for Gaussian, Frank and Clayton copulas in Chen and Fan (2006b).

Proposition 3.2. Under conditions C1–C5 and A1–A4, we have: $B_{n} Σ_{n}^{- 1 / 2} \sqrt{n} ({\hat{α}}_{n} - α_{n}^{*}) \to N (0, I_{p})$ in distribution, where B_n and Σ_n are defined in A1.

Proposition 3.2 extends Theorem 2 in Shih and Louis (1995) in two directions: (i) it allows for more general censoring mechanisms than the simple random censoring in Shih and Louis (1995), and (ii) it allows for the possibility that the parametric copula may not specify the true copula correctly. As a result, there are several differences between Proposition 3.2 and Theorem 2 in Shih and Louis (1995). First, since the censoring variables $\{(D_{1 t}, D_{2 t})\}_{t = 1}^{n}$ may not be identically distributed, B_n and Σ_n may depend on n. Second, since the parametric copula may misspecify the true copula, the information matrix equality may not hold. Consequently, the asymptotic variance of $\sqrt{n} ({\hat{α}}_{n} - α_{n}^{*})$ , $B_{n}^{- 1} Σ_{n} B_{n}^{- 1}$ , cannot be reduced to $[B_{n}^{- 1} + n^{- 1} B_{n}^{- 1} \sum_{t = 1}^{n} {Var}^{0} \{W_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; α_{n}^{*}) + W_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; α_{n}^{*})\} B_{n}^{- 1}]$ as in Shih and Louis (1995). For complete data, Proposition 3.2 reduces to that in Chen and Fan (2005).
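For a scalar copula parameter (p = 1), the sandwich form of the asymptotic variance reduces to Σ_n / B_n², and a standard error follows by dividing by n and taking a square root. The sketch below assumes that the per-observation adjusted scores (the score plus the two first-step correction terms) and the per-observation second derivatives with respect to α have already been evaluated at the estimate; the function name `sandwich_standard_error` and both argument names are hypothetical.

```python
import math


def sandwich_standard_error(adjusted_scores, second_derivatives):
    """Standard error of a scalar two-step estimate from B^{-1} * Sigma * B^{-1}.

    adjusted_scores    : per-observation score plus first-step correction terms
    second_derivatives : per-observation second derivative of the log-likelihood
    """
    n = len(adjusted_scores)
    b_hat = -sum(second_derivatives) / n                    # minus the average Hessian
    sigma_hat = sum(s * s for s in adjusted_scores) / n     # average squared adjusted score
    avar = sigma_hat / (b_hat ** 2)                         # scalar sandwich
    return math.sqrt(avar / n)


print(sandwich_standard_error([1.0, -1.0, 2.0, -2.0], [-2.0, -2.0, -2.0, -2.0]))
```

Only when the information matrix equality holds and the first-step correction terms vanish does this collapse to the familiar 1 / (n · B_n) variance.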

To estimate the asymptotic variance $B_{n}^{- 1} Σ_{n} B_{n}^{- 1}$ of $\sqrt{n} ({\hat{α}}_{n} - α_{n}^{*})$ , we let

{\hat{B}}_{n} = - n^{- 1} \sum_{t = 1}^{n} ℓ_{α α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{n}),

{\hat{Σ}}_{n} = \frac{1}{n} \sum_{t = 1}^{n} \{ℓ_{α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{n}) + {\hat{W}}_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; {\hat{α}}_{n}) + {\hat{W}}_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; {\hat{α}}_{n})\} \{ℓ_{α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{n}) + {\hat{W}}_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; {\hat{α}}_{n}) + {\hat{W}}_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; {\hat{α}}_{n})\}^{'},

with

{\hat{W}}_{1} ({\tilde{X}}_{1 t}, δ_{1 t}; {\hat{α}}_{n}) = \frac{1}{n} \sum_{s \neq t, s = 1}^{n} ℓ_{α 1} ({\tilde{F}}_{1} ({\tilde{X}}_{1 s}), {\tilde{F}}_{2} ({\tilde{X}}_{2 s}); {\hat{α}}_{n}) {\hat{I}}_{1}^{0} ({\tilde{X}}_{1 t}, δ_{1 t}) ({\tilde{X}}_{1 s}),

{\hat{W}}_{2} ({\tilde{X}}_{2 t}, δ_{2 t}; {\hat{α}}_{n}) = \frac{1}{n} \sum_{s \neq t, s = 1}^{n} ℓ_{α 2} ({\tilde{F}}_{1} ({\tilde{X}}_{1 s}), {\tilde{F}}_{2} ({\tilde{X}}_{2 s}); {\hat{α}}_{n}) {\hat{I}}_{2}^{0} ({\tilde{X}}_{2 t}, δ_{2 t}) ({\tilde{X}}_{2 s}),

in which for j = 1, 2,

{\hat{I}}_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}) = - {\tilde{F}}_{j} ({\tilde{X}}_{js}) \left[\frac{I \{{\tilde{X}}_{jt} \leq {\tilde{X}}_{js}\} δ_{jt}}{n^{- 1} \sum_{k = 1}^{n} I \{{\tilde{X}}_{jk} \geq {\tilde{X}}_{jt}\}} - \frac{1}{n} \sum_{l = 1}^{n} \frac{I \{{\tilde{X}}_{js} \geq {\tilde{X}}_{jl}\} I \{{\tilde{X}}_{jt} \geq {\tilde{X}}_{jl}\} δ_{jl}}{{[n^{- 1} \sum_{k = 1}^{n} I \{{\tilde{X}}_{jk} \geq {\tilde{X}}_{jl}\}]}^{2}}\right] .

(3.1)

We note that an alternative expression for ${\hat{I}}_{j}^{0} ({\tilde{X}}_{j t}, δ_{j t}) ({\tilde{X}}_{j s})$ is:

{\hat{I}}_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}) = - {\tilde{F}}_{j} ({\tilde{X}}_{js}) \left[\frac{I \{{\tilde{X}}_{jt} \leq {\tilde{X}}_{js}, δ_{jt} = 1\}}{{\hat{P}}_{n, j} ({\tilde{X}}_{jt})} - \sum_{{\tilde{X}}_{jl} \leq {\tilde{X}}_{js}} \frac{I \{{\tilde{X}}_{jt} \geq {\tilde{X}}_{jl}\} Δ {\hat{Λ}}_{j} ({\tilde{X}}_{jl})}{{\hat{P}}_{n, j} ({\tilde{X}}_{jl})}\right],

where ${\hat{P}}_{n, j} (u) \equiv n^{- 1} \sum_{k = 1}^{n} I \{{\tilde{X}}_{j k} \geq u\}$ ,

Δ {\hat{Λ}}_{j} (u) = \frac{I \{{\bar{Y}}_{j} (u) > 0\}}{{\bar{Y}}_{j} (u)} d {\bar{N}}_{j} (u), \quad {\bar{Y}}_{j} (u) = \sum_{k = 1}^{n} I \{{\tilde{X}}_{jk} \geq u\},

{\bar{N}}_{j} (u) = \sum_{k = 1}^{n} N_{jk} (u),

in which $Δ {\hat{Λ}}_{j} (u)$ is the so-called Nelson estimator. This is because

\begin{aligned} \sum_{{\tilde{X}}_{jl} \leq {\tilde{X}}_{js}} \frac{I \{{\tilde{X}}_{jt} \geq {\tilde{X}}_{jl}\} Δ {\hat{Λ}}_{j} ({\tilde{X}}_{jl})}{{\hat{P}}_{n, j} ({\tilde{X}}_{jl})} & = \sum_{{\tilde{X}}_{jl} \leq {\tilde{X}}_{js}} \frac{I \{{\tilde{X}}_{jt} \geq {\tilde{X}}_{jl}\} δ_{jl}}{{\hat{P}}_{n, j} ({\tilde{X}}_{jl}) \sum_{k = 1}^{n} I \{{\tilde{X}}_{jk} \geq {\tilde{X}}_{jl}\}} \\ & = \frac{1}{n} \sum_{l = 1}^{n} \frac{I \{{\tilde{X}}_{js} \geq {\tilde{X}}_{jl}\} I \{{\tilde{X}}_{jt} \geq {\tilde{X}}_{jl}\} δ_{jl}}{{[n^{- 1} \sum_{k = 1}^{n} I \{{\tilde{X}}_{jk} \geq {\tilde{X}}_{jl}\}]}^{2}} . \end{aligned}
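The agreement between expression (3.1) and the alternative expression written with Nelson-type hazard increments can be checked numerically. The Python sketch below is only an illustration under stated assumptions: there are no ties among the observed times, the marginal estimator is taken to be the usual product-limit form, and all function names and the toy data are hypothetical.

```python
def at_risk(u, times):
    """Number of observations whose observed time is >= u (risk set size)."""
    return sum(1 for x in times if x >= u)


def km_survival(x, times, deltas):
    """Product-limit survival estimate at x, assuming no tied times."""
    surv = 1.0
    for xt, dt in zip(times, deltas):
        if xt <= x:
            surv *= 1.0 - dt / at_risk(xt, times)
    return surv


def influence_direct(t, s, times, deltas):
    """Expression (3.1) for observation t, evaluated at the observed time of s."""
    n = len(times)
    xt, xs = times[t], times[s]
    first = (deltas[t] if xt <= xs else 0) / (at_risk(xt, times) / n)
    second = sum(
        deltas[l] / (at_risk(times[l], times) / n) ** 2
        for l in range(n)
        if xs >= times[l] and xt >= times[l]
    ) / n
    return -km_survival(xs, times, deltas) * (first - second)


def influence_nelson(t, s, times, deltas):
    """Alternative expression using hazard increments at each observed time."""
    n = len(times)
    xt, xs = times[t], times[s]
    first = (1 if (xt <= xs and deltas[t] == 1) else 0) / (at_risk(xt, times) / n)
    second = 0.0
    for l in range(n):
        if times[l] <= xs and xt >= times[l]:
            y_bar = at_risk(times[l], times)
            d_lambda = deltas[l] / y_bar  # hazard increment at X_l (no ties)
            second += d_lambda / (y_bar / n)
    return -km_survival(xs, times, deltas) * (first - second)


# Toy right-censored sample (hypothetical values); delta = 1 marks an event.
times = [2.0, 5.0, 3.0, 8.0, 1.0, 6.0]
deltas = [1, 0, 1, 1, 0, 1]
max_gap = max(
    abs(influence_direct(t, s, times, deltas) - influence_nelson(t, s, times, deltas))
    for t in range(len(times))
    for s in range(len(times))
)
```

Here `max_gap` is the largest absolute difference between the two expressions over all pairs (t, s); it should vanish up to floating-point error.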

By the consistency of the Kaplan–Meier estimators and ${\hat{α}}_{n}$ , and by applying the law of large numbers to independent observations, we can obtain the following result, which provides a consistent variance estimator.

Proposition 3.3. Under conditions C1–C5 and A1–A4, the asymptotic variance of $n^{1 ∕ 2} {\hat{α}}_{n}$ can be consistently estimated by ${\hat{B}}_{n}^{-} {\hat{Σ}}_{n} {\hat{B}}_{n}^{-}$ , where ${\hat{B}}_{n}^{-}$ is the generalized inverse of ${\hat{B}}_{n}$ .
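For a scalar copula parameter (p = 1), the sandwich form in Proposition 3.3 reduces to a ratio. The following Python sketch assumes the per-observation second derivatives ℓ_αα and the adjusted scores (ℓ_α plus the two correction terms Ŵ₁ and Ŵ₂) have already been evaluated; the function name and inputs are hypothetical, and the code is not the authors' implementation.

```python
def sandwich_variance(hessians, adjusted_scores):
    """Scalar version of B_hat^- * Sigma_hat * B_hat^-.

    hessians: values of l_alpha_alpha at each observation.
    adjusted_scores: values of l_alpha + W1_hat + W2_hat at each observation.
    """
    n = len(hessians)
    b_hat = -sum(hessians) / n
    sigma_hat = sum(s * s for s in adjusted_scores) / n
    # Generalized inverse of a scalar: 1/b if b is nonzero, and 0 otherwise.
    b_inv = 1.0 / b_hat if b_hat != 0.0 else 0.0
    return b_inv * sigma_hat * b_inv


# Hypothetical inputs: B_hat = 2 and Sigma_hat = 1.
variance = sandwich_variance([-2.0, -2.0], [1.0, -1.0])
```

A standard error for the two-step estimator would then be obtained as the square root of `variance / n`, with n the sample size.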

4. Pseudo-likelihood ratio test for model comparison

By applying Proposition 3.1(2) we immediately obtain the probability limit of the PLR statistic.

Proposition 4.1. Suppose for i = 1, …, M, the copula model i satisfies the conditions of Proposition 3.1. Then

{LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) = \frac{1}{n} \sum_{t = 1}^{n} E^{0} {ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})} + o_{p} (1),

where $U_{j t} = F_{j}^{0} ({\tilde{X}}_{j t})$ for j = 1, 2.

In the following, we adopt the convention that all the notations involving the copula function C(u₁, u₂; α) introduced in Section 3 are now indexed by a subscript i for i = 1, …, M to make explicit their dependence on the parametric copula model i. In addition, we define $U_{t} = (U_{1 t}, U_{2 t}) = (F_{1}^{0} ({\tilde{X}}_{1 t}), F_{2}^{0} ({\tilde{X}}_{2 t}))$ ,

e_{t} = {(e_{2 t}, \dots, e_{Mt})}^{'},

e_{it} \equiv {ℓ_{i} (U_{t}; α_{in}^{*}) - ℓ_{1} (U_{t}; α_{1 n}^{*})} + \sum_{j = 1}^{2} {Q_{i, j} ({\tilde{X}}_{jt}, δ_{jt}; α_{in}^{*}) - Q_{1, j} ({\tilde{X}}_{jt}, δ_{jt}; α_{1 n}^{*})},

where for i = 1, …, M and for j = 1, 2,

Q_{i, j} ({\tilde{X}}_{jt}, δ_{jt}; α_{in}^{*}) \equiv E^{0} [ℓ_{i, j} (U_{1 s}, U_{2 s}; α_{in}^{*}) I_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}) ∣ {\tilde{X}}_{jt}, δ_{jt}] .

It is easy to see that $\frac{1}{\sqrt{n}} \sum_{t = 1}^{n} {e_{t} - E^{0} (e_{t})}$ has the same asymptotic distribution as a multivariate normal random variable with mean zero and variance Ω_n, where

Ω_{n} = \frac{1}{n} \sum_{t = 1}^{n} E^{0} [(e_{t} - E^{0} {e_{t}}) {(e_{t} - E^{0} {e_{t}})}^{'}] = {(σ_{ik})}_{i, k = 2}^{M},

σ_{ik} = \frac{1}{n} \sum_{t = 1}^{n} E^{0} [(e_{it} - E^{0} {e_{it}}) (e_{kt} - E^{0} {e_{kt}})] .

It is easy to compute a consistent estimator ${\hat{Ω}}_{n}$ for Ω_n:

{\hat{Ω}}_{n} = \frac{1}{n} \sum_{t = 1}^{n} [({\hat{e}}_{t} - \frac{1}{n} \sum_{s = 1}^{n} {\hat{e}}_{s}) {({\hat{e}}_{t} - \frac{1}{n} \sum_{s = 1}^{n} {\hat{e}}_{s})}^{'}] = {({\hat{σ}}_{ik})}_{i, k = 2}^{M},

{\hat{σ}}_{ik} = \frac{1}{n} \sum_{t = 1}^{n} [({\hat{e}}_{it} - \frac{1}{n} \sum_{s = 1}^{n} {\hat{e}}_{is}) ({\hat{e}}_{kt} - \frac{1}{n} \sum_{s = 1}^{n} {\hat{e}}_{ks})],

(4.1)

where ${\hat{e}}_{t} = {({\hat{e}}_{2 t}, \dots, {\hat{e}}_{M t})}^{'}$ and for i = 2, …, M,

{\hat{e}}_{it} \equiv {ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{in}) - ℓ_{1} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{1 n})} + \sum_{j = 1}^{2} {{\hat{Q}}_{i, j} ({\tilde{X}}_{jt}, δ_{jt}; {\hat{α}}_{in}) - {\hat{Q}}_{1, j} ({\tilde{X}}_{jt}, δ_{jt}; {\hat{α}}_{1 n})},

in which

{\hat{Q}}_{i, j} ({\tilde{X}}_{jt}, δ_{jt}; {\hat{α}}_{in}) = \frac{1}{n} \sum_{s \neq t, s = 1}^{n} ℓ_{i, j} ({\tilde{F}}_{1} ({\tilde{X}}_{1 s}), {\tilde{F}}_{2} ({\tilde{X}}_{2 s}); {\hat{α}}_{in}) {\hat{I}}_{j}^{o} ({\tilde{X}}_{jt}, δ_{jt}) ({\tilde{X}}_{js}),

for i = 1, …, M and j = 1, 2 with ${\hat{I}}_{j}^{0} ({\tilde{X}}_{j t}, δ_{j t}) ({\tilde{X}}_{j s})$ given in (3.1).
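Once the vectors ê_t have been computed, the estimator in (4.1) is simply their sample covariance matrix with a 1/n normalization. A minimal Python sketch, assuming the ê_t are supplied as a list of rows (the function name and the toy input are hypothetical):

```python
def omega_hat(e_hat):
    """Sample covariance (1/n normalization) of the rows of e_hat.

    e_hat: list of n rows; row t holds (e_hat_2t, ..., e_hat_Mt).
    """
    n = len(e_hat)
    m = len(e_hat[0])
    means = [sum(row[i] for row in e_hat) / n for i in range(m)]
    return [
        [
            sum((row[i] - means[i]) * (row[k] - means[k]) for row in e_hat) / n
            for k in range(m)
        ]
        for i in range(m)
    ]


# Two observations and two candidate models (hypothetical values).
omega = omega_hat([[1.0, 2.0], [3.0, 6.0]])
```

The diagonal entries of the returned matrix are the σ̂_ii used later to standardize the test statistics.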

Before we present the test statistics, we recall the following definition from Chen and Fan (2005): For model i ∈ {2, …, M},

Models 1 and i are generalized non-nested if the set $\{(v_{1}, v_{2}) : c_{1} (v_{1}, v_{2}; α_{1 n}^{*}) \neq c_{i} (v_{1}, v_{2}; α_{i n}^{*})\}$ has positive Lebesgue measure;

Models 1 and i are generalized nested if $c_{1} (v_{1}, v_{2}; α_{1 n}^{*}) = c_{i} (v_{1}, v_{2}; α_{i n}^{*})$ for almost all (v₁, v₂) ∈ (0, 1)².

Given the definition of the pseudo true value $α_{i n}^{*}$ , the closest $c_{i} (\cdot; α_{i n}^{*})$ to the true copula c⁰ (according to KLIC) in a parametric class of copulas ${c_{i} (\cdot; α_{i}) : α_{i} \in A_{i}}$ depends on the true (but unknown) copula. Hence it is not obvious a priori whether two parametric classes of copulas are generalized non-nested or generalized nested.

Remark 4.1. Define

σ_{ii}^{a} \equiv \frac{1}{n} \sum_{t = 1}^{n} {Var}^{0} [ℓ_{i} (U_{t}; α_{in}^{*}) - ℓ_{1} (U_{t}; α_{1 n}^{*})] .

It is obvious that if models 1 and i are generalized nested, then $ℓ_{i} (U_{1 t}, U_{2 t}; α_{i n}^{*}) = ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})$ almost surely, e_it = 0 almost surely, and $σ_{i i}^{a} = 0$ . Following the proof of Proposition 3 in Chen and Fan (2005), we can show that if $σ_{i i}^{a} = 0$ then models 1 and i are generalized nested, and σ_ii = 0. Therefore it is easy to test whether models 1 and i are generalized nested by testing $σ_{i i}^{a} = 0$ , which may be done by using its consistent estimator:

{\hat{σ}}_{ii}^{a} = \frac{1}{n} \sum_{t = 1}^{n} {[{ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{in}) - ℓ_{1} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{1 n})} - {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})]}^{2} .

See Chen and Fan (2005) for details.

The following proposition provides the basis for our tests. Note that we allow for some but not all of the candidate models i ∈ {2, …, M} to be generalized nested with the benchmark model 1.

Proposition 4.2. For i = 1, 2, …, M, assume that the copula model i satisfies the conditions of Proposition 3.2 and that {e_it : t = 1, …, n} satisfies the Lindeberg condition. If $Ω_{n} = {(σ_{i k})}_{i, k = 2}^{M}$ is finite and its largest eigenvalue is positive uniformly in n, then:

(1)

$\begin{aligned} n^{1 ∕ 2} & \left[{LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]\right]_{i = 2, \dots, M} \\ & = \frac{1}{\sqrt{n}} \sum_{t = 1}^{n} \{e_{t} - E^{0} (e_{t})\} + o_{p} (1) \\ & \to {(Z_{2}, \dots, Z_{M})}^{'} \text{ in distribution, with } {(Z_{2}, \dots, Z_{M})}^{'} \sim N (0, Ω_{n}) . \end{aligned}$

(2)

${\hat{Ω}}_{n} = Ω_{n} + o_{p} (1) .$

Proposition 4.2 and the continuous mapping theorem imply

\max_{i = 2, \dots, M} n^{1 ∕ 2} {{LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]} \to \max_{i = 2, \dots, M} Z_{i} .

Define

T_{n} \equiv \max_{i = 2, \dots, M} [n^{1 ∕ 2} {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})] .

Proposition 4.2 implies that under the Least Favorable Configuration (LFC), i.e.,

n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] = 0 for i = 2, \dots, M,

$T_{n} \to \max_{i = 2, \dots, M} Z_{i}$ in distribution. This allows us to construct a test for H₀. Provided that the largest eigenvalue of Ω_n is positive uniformly in n, we reject H₀ if T_n > Z_α, where Z_α is the upper α-percentile of the distribution of $\max_{i = 2, \dots, M} Z_{i}$ .
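Because the distribution of the maximum of correlated normal variables has no convenient closed form, its upper percentile is typically approximated by simulation. The Python sketch below is one possible reading of such a Monte Carlo approximation, not the authors' code: it assumes that a positive definite estimate of Ω_n is available (the text only requires the largest eigenvalue to be positive), and the function names, seed and number of draws are hypothetical.

```python
import math
import random


def cholesky(omega):
    """Lower-triangular L with L L' = omega (omega assumed positive definite)."""
    m = len(omega)
    low = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for k in range(i + 1):
            acc = sum(low[i][j] * low[k][j] for j in range(k))
            if i == k:
                low[i][k] = math.sqrt(omega[i][i] - acc)
            else:
                low[i][k] = (omega[i][k] - acc) / low[k][k]
    return low


def simulate_max_z(omega, draws, seed=0):
    """Draw Z ~ N(0, omega) repeatedly and record max_i Z_i for each draw."""
    rng = random.Random(seed)
    low = cholesky(omega)
    m = len(omega)
    maxima = []
    for _ in range(draws):
        eps = [rng.gauss(0.0, 1.0) for _ in range(m)]
        z = [sum(low[i][j] * eps[j] for j in range(i + 1)) for i in range(m)]
        maxima.append(max(z))
    return maxima


def mc_p_value(t_n, omega, draws=20000, seed=0):
    """Share of simulated maxima that are at least as large as the statistic."""
    maxima = simulate_max_z(omega, draws, seed)
    return sum(1 for v in maxima if v >= t_n) / draws
```

Rejecting H₀ when the returned p-value falls below α corresponds to the rule T_n > Z_α, up to simulation error.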

The asymptotic power properties of this test against fixed alternatives and Pitman local alternatives follow immediately from Proposition 4.2 and are summarized in the following proposition.

Proposition 4.3. Suppose all conditions of Proposition 4.2 are satisfied. Then the test based on T_n is consistent against fixed alternatives of the form H₁ and has non-trivial power against local alternatives satisfying

\max_{i = 2, \dots, M} \lim_{n \to \infty} {n^{- 1 ∕ 2} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]} > 0 .

Note that if the censoring mechanism is random, then the local alternatives in Proposition 4.3 can be written in the more familiar form:

\max_{i = 2, \dots, M} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{i}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1}^{*})] = K \frac{1}{\sqrt{n}},

for a positive constant K.

In general, the distribution of $\max_{i = 2, \dots, M} Z_{i}$ is unknown, since the asymptotic variance Ω_n of (Z₂, …, Z_M) depends on $α_{1 n}^{*}, \dots, α_{M n}^{*}$ . Following White (2000), one can use either a “Monte-Carlo RC” p-value or a “bootstrap RC” p-value to implement this test. As noted in Chen and Fan (2005), Hansen (2003), and Romano and Wolf (2005), the finite sample power of this test may be improved by standardization. In our empirical application, we have computed both a “Monte-Carlo RC” p-value using

T_{nS} = \max_{i = 2, \dots, M} \left\{\frac{n^{1 ∕ 2} {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})}{\sqrt{{\hat{σ}}_{ii}}} G_{b} ({\hat{σ}}_{ii})\right\},

and a “bootstrap RC” p-value based on

T_{nI} = \max \left[\max_{i = 2, \dots, M} \left\{\frac{n^{1 ∕ 2} {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})}{\sqrt{{\hat{σ}}_{ii}}} G_{b} ({\hat{σ}}_{ii})\right\}, 0\right],

where ${\hat{σ}}_{i i}$ is a consistent estimator of σ_ii such as the one given in (4.1), b = b_n → 0 as n → ∞, and G_b(·) is a smoothed trimming function that trims out small ${\hat{σ}}_{i i}$ . The particular trimming function used in our empirical study is

G_{b} (x) = \int_{- \infty}^{x} g_{b} (z) dz = \begin{cases} 0, & x < b \\ \int_{- \infty}^{x} g_{b} (z) dz, & b \leq x \leq 2 b \\ 1, & x > 2 b \end{cases}

where g_b(x) = b⁻¹g(b⁻¹x − 1) and g(z) = B(a+1)⁻¹z^a(1−z)^a, z ∈ [0, 1], for some positive integer a ≥ 1, where B(a) = Γ(a)²/Γ(2a) is the beta function and Γ(a) is the Euler gamma function.
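Since g is a beta-type density on [0, 1], a change of variables shows that G_b(x) equals the distribution function of g evaluated at x/b − 1. The Python sketch below computes it for integer a by expanding (1 − z)^a with the binomial theorem; the function names are hypothetical and the code is offered only as an illustration of the definition above.

```python
import math


def beta_kernel_cdf(w, a=1):
    """Distribution function on [0, 1] of the density proportional to z^a (1 - z)^a."""
    if w <= 0.0:
        return 0.0
    if w >= 1.0:
        return 1.0

    def partial(v):
        # Integral of z^a (1 - z)^a over [0, v], using the binomial expansion.
        return sum(
            math.comb(a, k) * (-1) ** k * v ** (a + k + 1) / (a + k + 1)
            for k in range(a + 1)
        )

    # partial(1.0) is the normalizing constant B(a + 1).
    return partial(w) / partial(1.0)


def trimming(x, b, a=1):
    """Smoothed trimming G_b(x): 0 below b, 1 above 2b, smooth in between."""
    return beta_kernel_cdf(x / b - 1.0, a)
```

For a = 1 the interior branch reduces to the cubic 3w² − 2w³ with w = x/b − 1, so the trimming rises smoothly from 0 at x = b to 1 at x = 2b.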

We note that the standardized tests T_nS and T_nI proposed here allow that some candidate models are generalized nested with the benchmark model, since the trimming $G_{b} ({\hat{σ}}_{i i})$ in T_nS and T_nI removes the effect of generalized nested models (with the benchmark model) on its limiting distribution. By a minor modification of the proof of Theorem 7 in Chen and Fan (2005), we immediately obtain the following result:

Proposition 4.4. Suppose all conditions of Proposition 4.2 are satisfied. If b → 0 and nb → ∞, then under the null hypothesis H₀, the limiting distribution of T_nI is given by that of $\max_{i \in S_{N B}} (Z_{i} ∕ \sqrt{σ_{i i}}, 0)$ , where

S_{NB} = \left\{i \in \{2, \dots, M\} : σ_{ii} > 0 \text{ and } n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] = 0\right\} .

Proposition 4.4 implies that the asymptotic null distribution of T_nI depends on models that are generalized non-nested with the benchmark and satisfy

n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] = 0,

and hence is unknown. We propose the following bootstrap procedure to approximate the asymptotic null distribution of T_nI:

Step 1
Generate a bootstrap sample by random draws with replacement from a consistent nonparametric estimator of the unknown joint distribution of (X_1t, X_2t) that takes into account the censoring scheme. Denote by ( ${\tilde{F}}_{1}^{*}$ , ${\tilde{F}}_{2}^{*}$ , ${\tilde{α}}_{i n}^{*}$ , ${\tilde{α}}_{1 n}^{*}$ ) the bootstrap analogs of ( ${\tilde{F}}_{1}$ , ${\tilde{F}}_{2}$ , ${\tilde{α}}_{i n}$ , ${\tilde{α}}_{1 n}$ ).
Step 2
Compute the bootstrap value $T_{i n}^{*} \equiv L R_{n} ({\tilde{F}}_{1}^{*}, {\tilde{F}}_{2}^{*}; {\tilde{α}}_{i n}^{*}, {\tilde{α}}_{1 n}^{*})$ of $T_{i n} \equiv L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\tilde{α}}_{i n}, {\tilde{α}}_{1 n})$ , i = 2, …, M, and define its recentered value as $T_{i n C}^{*} = T_{i n}^{*} - T_{i n} I (T_{i n} \geq - a_{n})$ , where a_n → 0 is a small positive (possibly random) number such that $\sqrt{n} a_{n} \to \infty$ .
Step 3
Compute the bootstrap value of T_nI as
$T_{nI}^{*} = \max_{2 \leq i \leq M} \left\{\frac{\sqrt{n} T_{inC}^{*}}{\sqrt{{\hat{σ}}_{ii}}} G_{b} ({\hat{σ}}_{ii}), 0\right\};$
Step 4
Repeat Steps 1–3 a large number of times and use the empirical distribution function of the resulting values $T_{n I}^{*}$ to approximate the null distribution of T_nI.
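The recentering and standardization in Steps 2 to 4 can be sketched in Python as follows. The sketch assumes that the resampling and re-estimation of Steps 1 and 2 have already produced the bootstrap likelihood ratios, uses the trimming function with a = 1, and defines the p-value as the share of bootstrap statistics at least as large as the observed one. These choices, like all function names, are assumptions made for illustration only.

```python
import math


def smooth_trim(x, b):
    """G_b(x) for a = 1: 0 below b, 1 above 2b, cubic in between."""
    w = min(max(x / b - 1.0, 0.0), 1.0)
    return 3.0 * w * w - 2.0 * w ** 3


def recenter(t_star, t_obs, a_n):
    """Step 2: subtract T_in from its bootstrap value only when T_in >= -a_n."""
    return t_star - (t_obs if t_obs >= -a_n else 0.0)


def standardized_max(values, sigma_hat, n, b):
    """Largest standardized, trimmed statistic across models, floored at zero."""
    best = 0.0
    for v, s in zip(values, sigma_hat):
        g = smooth_trim(s, b)
        if g > 0.0:  # models with a trimmed variance estimate contribute nothing
            best = max(best, math.sqrt(n) * v / math.sqrt(s) * g)
    return best


def bootstrap_p_value(t_obs, t_boot, sigma_hat, n, b, a_n):
    """t_obs: LR_n for models 2..M; t_boot: one list of bootstrap LR_n per replication."""
    t_ni = standardized_max(t_obs, sigma_hat, n, b)
    exceed = 0
    for rep in t_boot:
        centered = [recenter(ts, to, a_n) for ts, to in zip(rep, t_obs)]
        if standardized_max(centered, sigma_hat, n, b) >= t_ni:
            exceed += 1
    return exceed / len(t_boot)
```

Note that the variance estimates from the original sample are reused in every replication, as in Step 3.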

We note that the above bootstrap procedure is very similar to that proposed in Chen and Fan (2005), except that in Step 1 we generate bootstrap samples from a consistent nonparametric estimator of the joint distribution that takes into account the censoring. For example, for bivariate random right censoring, we could sample from the bivariate Kaplan–Meier estimator; see Dabrowska (1989). See Davison and Hinkley (1997, page 85) for additional ways to generate bootstrap samples for censored data. The consistency of this standardized bootstrap RC test $T_{n I}^{b}$ could be established by a minor modification of the proof of Theorem 8 in Chen and Fan (2005).

Remark 4.2. Recall that

{PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) = {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - \frac{Pen (p_{i}, n) - Pen (p_{1}, n)}{n} .

If $\max_{i = 2, \dots, M} [Pen (p_{i}, n) - Pen (p_{1}, n)] ∕ \sqrt{n} \to 0$ (which is automatically satisfied with AIC and BIC), then

{PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) = {LR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) + o_{p} (n^{- 1 ∕ 2}) for i = 2, \dots, M .

Therefore, penalization could be incorporated in the tests. Define

T_{n}^{P} = \max_{i = 2, \dots, M} [n^{1 ∕ 2} {PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})],

T_{nS}^{P} = \max_{i = 2, \dots, M} \left\{\frac{n^{1 ∕ 2} {PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})}{\sqrt{{\hat{σ}}_{ii}}} G_{b} ({\hat{σ}}_{ii})\right\},

and

T_{nI}^{P} = \max \left[\max_{i = 2, \dots, M} \left\{\frac{n^{1 ∕ 2} {PLR}_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})}{\sqrt{{\hat{σ}}_{ii}}} G_{b} ({\hat{σ}}_{ii})\right\}, 0\right] .

(4.2)

Then we can conduct the test using $T_{n}^{P}$ (or $T_{n S}^{P}$ or $T_{n I}^{P}$ ) instead of T_n (or T_nS or T_nI).

5. An empirical application

In this section, we illustrate our testing procedure for the selection of multiple copula-based survival functions by using insurance company data on losses and ALAEs. The particular data set we use was collected by the US Insurance Services Office and has been analyzed in some detail in Frees and Valdez (1998), Klugman and Parsa (1999), and Denuit et al. (2006).

Two alternative approaches have been used in the literature to model multivariate survival data: that of the multivariate distribution function and that of the multivariate survival function. It is important to realize that in the context of semiparametric copula-based models, the copula in a semiparametric copula-based distribution function corresponds to its survival copula in the corresponding semiparametric survival function. To be specific, consider the bivariate case. Let (X₁, X₂) be the survival variables of interest with a joint survival function F^o(x₁, x₂) = Pr(X₁ > x₁, X₂ > x₂) and marginal survival functions $F_{j}^{0} (\cdot)$ , j = 1, 2. Let H(x₁, x₂) denote the corresponding joint cumulative distribution function (cdf) with marginal distributions H_j(·), j = 1, 2. Assume that $H_{1} \equiv 1 - F_{1}^{0}$ and $H_{2} \equiv 1 - F_{2}^{0}$ are continuous. By Sklar's (1959) theorem, there exists a unique copula function C_h such that H(x₁, x₂) ≡ C_h(H₁(x₁), H₂(x₂)), which in turn implies that the representation

F^{o} (x_{1}, x_{2}) \equiv C^{o} (F_{1}^{o} (x_{1}), F_{2}^{o} (x_{2})),

holds where

C^{o} (u_{1}, u_{2}) \equiv u_{1} + u_{2} - 1 + C_{h} (1 - u_{1}, 1 - u_{2})

is itself a copula function, known as a survival copula. Hence the bivariate distribution function C_h(H₁(x₁), H₂(x₂)) and the bivariate survival function $C^{0} (F_{1}^{0} (x_{1}), F_{2}^{0} (x_{2}))$ , where $F_{j}^{0} (\cdot)$ is the survival function of H_j(·) and C^o is the survival copula of C_h, represent the same model.
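The relation between a copula and its survival copula can be verified numerically. The Python sketch below uses a Clayton copula with an arbitrary parameter as a stand-in for C_h (an assumption made purely for illustration) and checks that the survival copula evaluated at the marginal survival probabilities reproduces the joint survival probability implied by inclusion-exclusion; the function names are hypothetical.

```python
def clayton(u, v, theta=2.0):
    """Clayton copula for theta > 0, with value 0 on the lower boundary."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def survival_copula(copula, u1, u2):
    """C^o(u1, u2) = u1 + u2 - 1 + C_h(1 - u1, 1 - u2)."""
    return u1 + u2 - 1.0 + copula(1.0 - u1, 1.0 - u2)


def joint_survival_from_cdf(copula, h1, h2):
    """P(X1 > x1, X2 > x2) = 1 - H1 - H2 + H, with H = C_h(H1, H2)."""
    return 1.0 - h1 - h2 + copula(h1, h2)


# Marginal cdf values H1(x1) = 0.3 and H2(x2) = 0.6 (hypothetical).
h1, h2 = 0.3, 0.6
via_survival_copula = survival_copula(clayton, 1.0 - h1, 1.0 - h2)
via_cdf = joint_survival_from_cdf(clayton, h1, h2)
```

Both routes give the same joint survival probability, which is the sense in which the two representations describe the same model.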

In Frees and Valdez (1998) and Klugman and Parsa (1999), fully parametric modeling of the joint distribution of the loss and ALAE has been examined; using various model selection techniques including AIC/BIC, Frees and Valdez (1998) select Pareto marginals and Gumbel copula, while Klugman and Parsa (1999) select inverse paralogistic for the loss, inverse Burr for ALAE and the Frank copula. Denuit et al. (2006) adopt a semiparametric distribution framework in which the marginal distributions of loss and ALAE are left unspecified, but their copula is modeled parametrically via a one-parameter Archimedean copula. Their model selection procedure is the same as that in Wang and Wells (2000) except that the joint distributions of loss and ALAE are estimated differently. They examined four one-parameter Archimedean copulas: Gumbel, Clayton, Frank and Joe, and select the same Gumbel copula as Frees and Valdez (1998). Compared with Denuit et al. (2006), we do not restrict the parametric copulas to be Archimedean. In addition, our test takes into account the randomness of the selection criterion. Chen and Fan (2005) have also studied this data set, but since their model selection test is applicable to uncensored data only, they restrict their analysis to the subset of 1466 complete data. We now apply our proposed test to the original censored data with 1500 data points.

The scatterplots for loss and ALAE presented in Frees and Valdez (1998) and Denuit et al. (2006) reveal positive right tail dependence between loss and ALAE: large losses tend to be associated with large ALAEs. This is because expensive claims generally need some time to be settled and induce considerable costs for the insurance company. Actuaries therefore expect positive dependence between large losses and large ALAEs. On the other hand, these plots do not reveal any visible left tail dependence between the two variables. As a result, it is not surprising that the Gumbel copula is chosen in Frees and Valdez (1998) and Denuit et al. (2006). To shed some light on the robustness of this result to the set of copula families being considered, we add three more copula families to the set considered in Denuit et al. (2006): Gaussian copula, survival Clayton, mixture of Clayton and Gumbel copulas; see Appendix B for expressions of these seven copulas and their partial derivatives. Survival Clayton has right tail dependence and the mixture of Clayton and Gumbel exhibits both left tail and right tail dependence unless the weights are degenerate. The Gaussian copula does not have tail dependence and is thus expected to fit poorly. They are included here in the set of copulas to see if the power of the test is adversely affected by the presence of poor copula candidates in the selection set.¹

To facilitate comparison, we also apply our tests to the subset of 1466 complete data. The results of the “Monte Carlo RC” test $T_{n}^{P}$ (using the AIC penalization factor) for the original censored data are presented in Table 1 and those for the subset of 1466 complete data are presented in Table 2, with 500,000 Monte Carlo repetitions. For each copula, we estimated its parameter(s) by the two-step procedure and computed the value of the AIC. To apply our model selection test we need to choose a benchmark model. In view of the existing results, we first use the Gumbel copula as the benchmark. For the Gumbel benchmark, we found the p-value of the test to be 1 with or without taking into account censoring. This provides strong evidence that none of the other six copulas performs significantly better than the Gumbel copula for the loss-ALAE data. This is consistent with the selection result based on comparing the values of the AIC only; Gumbel followed by mixture of Clayton and Gumbel, then by survival Clayton and then by Joe. The parameter estimates for the mixture of Clayton and Gumbel provide additional evidence in favor of the Gumbel copula; the estimates of the weight on Clayton are only 0.0003 when censoring is taken into account and 0.0002 when censoring is not taken into account. In addition, the estimates of the parameter in the Gumbel copula obtained by fitting the mixture of Clayton and Gumbel are very close to the estimates obtained by fitting the Gumbel copula alone for both the subset of complete data and the original censored data. To see if the test is sensitive to the choice of the benchmark model, we also used each of the remaining six copulas as the benchmark.

Table 1.

Monte Carlo p-values of the test for the original dataset subject to censoring.

Benchmark	p-value of $T_{n}^{P}$	p-value of $T_{n S}^{P}$	AIC	Two-step estimate
Gumbel	1.0000	0.9980	−0.1447	1.4428
Clayton	0.0015	0.0004	−0.0000	0.5152
Frank	0.0688	0.0394	−0.1009	0.0473
Joe	0.3968	0.2533	−0.1263	1.6466
Gaussian	0.1692	0.0724	−0.1125	0.4668
Survival Clayton	0.6295	0.4298	−0.1380	0.7825
Mix Clayton & Gumbel	0.9469	0.9794	−0.1420	(0.1505, 1.4433, 0.0003)

Table 2.

Monte Carlo p-values of the test for the subset without censoring.

Benchmark	p-value of $T_{n}^{P}$	p-value of $T_{n S}^{P}$	AIC	Two-step estimate
Gumbel	1.0000	0.9940	−0.2560	1.4254
Clayton	0.0037	0.0008	−0.1203	0.5098
Frank	0.1197	0.0834	−0.2160	0.0494
Joe	0.3530	0.1643	−0.2384	1.6105
Gaussian	0.2499	0.1442	−0.2286	0.4604
Survival Clayton	0.5570	0.3412	−0.2472	0.7440
Mix Clayton & Gumbel	0.9382	0.9590	−0.2530	(0.1572, 1.4256, 0.0002)

For each of Tables 1 and 2, we present two versions of the Monte Carlo tests based on the non-standardized test, $T_{n}^{P}$ , and the standardized test, $T_{n S}^{P}$ , as described in Remark 4.2.² Comparing the first two columns in Tables 1 and 2, we see that both tests yield similarly high p-values when the benchmark is either Gumbel or the mixture of Clayton and Gumbel; for all the other cases, the standardized test $T_{n S}^{P}$ yields significantly lower p-values than those of $T_{n}^{P}$ . This indicates that the standardized version of the test is generally more powerful than the original non-standardized test.

Additionally, we present a bootstrap version of the test based on $T_{n I}^{P}$ (using the AIC penalization factor). We generate a bootstrap sample by random draws with replacement from a consistent non-parametric estimator of the bivariate joint distribution that takes into account the censoring scheme. For this loss-ALAE data set, we could draw bootstrap samples either from the bivariate Kaplan–Meier estimator of Dabrowska (1989), or from the estimator of Akritas (1994) and Denuit et al. (2006). Let $T_{n I}^{*, P}$ be the counterpart of $T_{n I}^{*}$ for one bootstrap iteration; we write the re-centered bootstrap test statistic as $T_{n I C}^{*, P} = T_{n I}^{*, P} - T_{n I}^{P} \times I \{T_{n I}^{P} \geq - a_{n}\}$ , where for simplicity we use the same parameter values (a, b_n, a_n) = (1, n^−1/2, 0.025n^−1/2 log log n) as those in Chen and Fan (2005). In this empirical application we use 100 bootstrap repetitions. The bootstrap p-values in Tables 3 and 4 overwhelmingly support the conclusion that the Gumbel copula fits the loss-ALAE data the best among the seven copulas we considered. This finding is consistent with existing results in the literature. The fact that the results in Tables 3 and 4 are so close to each other confirms the statement in Denuit et al. (2006) that the limited amount of censored points present in this loss-ALAE data does not seem to affect the copula selection result.
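The tuning parameters quoted above are simple functions of the sample size. The short Python sketch below evaluates them for the full sample of 1500 observations and tabulates the two rates that the theory requires to diverge (n·b_n and √n·a_n); the function name is hypothetical and the snippet is only a numerical illustration of the stated formulas.

```python
import math


def tuning_parameters(n):
    """Return (a, b_n, a_n) = (1, n^(-1/2), 0.025 n^(-1/2) log log n)."""
    a = 1
    b_n = n ** -0.5
    a_n = 0.025 * n ** -0.5 * math.log(math.log(n))
    return a, b_n, a_n


# Values for the original censored sample of 1500 observations.
a, b_n, a_n = tuning_parameters(1500)

# Both b_n and a_n shrink to zero, while n * b_n and sqrt(n) * a_n grow with n.
rates = [
    (n, n * tuning_parameters(n)[1], math.sqrt(n) * tuning_parameters(n)[2])
    for n in (10 ** 3, 10 ** 6, 10 ** 9)
]
```

The second rate grows only at the log log n speed, so a_n is close to the boundary of what the condition √n·a_n → ∞ allows.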

Table 3.

Bootstrap p-values of the test for the original dataset subject to censoring.

Benchmark	p-value of $T_{n I}^{P}$	AIC	Two-step estimate
Gumbel	1.0000	−0.1447	1.4428
Clayton	0.0000	−0.0000	0.5152
Frank	0.0000	−0.1009	0.0473
Joe	0.1010	−0.1263	1.6466
Gaussian	0.0517	−0.1125	0.4668
Survival Clayton	0.1414	−0.1380	0.7825
Mix Clayton & Gumbel	0.9900	−0.1420	(0.1505, 1.4433, 0.0003)

Table 4.

Bootstrap p-values of the test for the subset without censoring.

Benchmark	p-value of $T_{n I}^{P}$	AIC	Two-step estimate
Gumbel	1.0000	−0.2560	1.4254
Clayton	0.0000	−0.1203	0.5098
Frank	0.0000	−0.2160	0.0494
Joe	0.1052	−0.2384	1.6105
Gaussian	0.0202	−0.2286	0.4604
Survival Clayton	0.0909	−0.2472	0.7440
Mix Clayton & Gumbel	0.9963	−0.2530	(0.1572, 1.4256, 0.0002)

Finally, by comparing the bootstrap p-values in Tables 3 and 4 with the Monte Carlo p-values in Tables 1 and 2, we notice that the standardized “bootstrap RC” test is in general more powerful than the standardized “Monte Carlo RC” test, which in turn is more powerful than the non-standardized “Monte Carlo RC” test. Nevertheless, it is noteworthy that the standardized “bootstrap RC” test is computationally much more intensive than the standardized “Monte Carlo RC” test. For an AMD Athlon(tm) 64 Processor, 1.18 GHz and 384 Mb of RAM, for each benchmark case, the standardized “bootstrap RC” test (with 100 bootstrap replications) takes about 10,500 computer seconds, whereas the standardized “Monte Carlo RC” test (with 500,000 Monte Carlo repetitions) only takes about 350 computer seconds. Moreover, we are happy to see that the standardized “Monte Carlo RC” test and the standardized “bootstrap RC” test yield very similar rankings and lead to the same conclusion that the Gumbel copula fits the loss-ALAE data the best.

6. Conclusion

Many models of semiparametric multivariate survival functions are characterized by nonparametric marginal survival functions and parametric copula functions, where different copulas imply different dependence structures. In this paper, we first establish large sample properties of the two-step estimator of copula dependence parameter when the parametric copula function may be misspecified and when data may be subject to an independent but otherwise general right censorship. We then provide a penalized pseudo-likelihood ratio test for selecting among multiple semiparametric copula models for multivariate survival data. An empirical application to the famous Loss-ALAE insurance data set indicates the usefulness of our theoretical results.

Although our theoretical results allow for general right censoring scheme, we still assume that the data is independent and is subject to independent censoring. In some economic and financial applications, data could be serially dependent and may be subject to dependent censorship. The two-step estimator and its large sample properties have been extended to time series settings in Chen and Fan (2006a,b), but their results do not allow for any censoring. We shall extend the results in this paper to allow for time series and/or dependent censoring in another paper.

Acknowledgements

We thank a guest co-editor and the anonymous referees for detailed suggestions which greatly improved the paper. We thank Professors Frees and Valdez for kindly providing the loss-ALAE data, which were collected by the US Insurance Services Office (ISO). Chen and Fan acknowledge financial support from the National Science Foundation. Ying acknowledges financial support from the National Science Foundation and the National Institutes of Health. Part of the work was initiated during Chen and Ying's visit to the Institute for Mathematical Sciences at the National University of Singapore, whose hospitality and support are acknowledged.

Appendix A. Technical proofs

We first introduce additional notation: $N_{jt} (x) = δ_{jt} I ({\tilde{X}}_{jt} \leq x)$ , $J_{jt} (u) = I ({\tilde{X}}_{jt} \geq u)$ , $M_{jt} (x) = N_{jt} (x) - \int_{- \infty}^{x} J_{jt} (u) d Λ_{j} (u)$ and $Λ_{j} (u) = - \log F_{j}^{o} (u)$ the marginal cumulative hazard function of X_j, j = 1, 2.

Lemma A.1. Suppose that Conditions C1 and C5 are satisfied. Then: (i) the marginal Kaplan–Meier estimators are uniformly strongly consistent: $\sup_{x \leq τ_{j}} ∣ {\tilde{F}}_{j} (x) - F_{j}^{o} (x) ∣ \to 0$ a.s. for j = 1, 2; (ii) they can be expressed as martingale integrals:

\begin{aligned} {\tilde{F}}_{j} (x) - F_{j}^{o} (x) & = - F_{j}^{o} (x) \int_{- \infty}^{x} \frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u)} \frac{\sum_{t = 1}^{n} d M_{jt} (u)}{\sum_{t = 1}^{n} I ({\tilde{X}}_{jt} \geq u)} \\ & = - F_{j}^{o} (x) \int_{- \infty}^{x} \frac{\sum_{t = 1}^{n} d M_{jt} (u)}{F_{j}^{o} (u) \sum_{t = 1}^{n} G_{jt} (u)} + o_{p} (n^{- 1 ∕ 2}), \end{aligned}

where the o_p(n^−1/2) term is uniform in x ∈ [0, τ_j], for j = 1, 2.

Proof of Lemma A.1. Because of Condition C5, the risk set size in (−∞, τ_j] is of order n. Consequently, the uniform strong consistency is a special case of Theorem 3 of Lai and Ying (1991). The martingale integral approximation follows from formula (3.2.13) of Gill (1980) and the consistency of the Kaplan–Meier estimator.

Lemma A.2. Let ${\hat{x}}_{j} = \inf \{x : {\tilde{F}}_{j} (x) < 1\}$ , j = 1, 2. There exists τ₀ > 0 such that for every ∊ > 0, there is an η > 0 such that

\liminf_{n \to \infty} P (\inf_{{\hat{x}}_{j} \leq x \leq τ_{0}} \frac{1 - {\tilde{F}}_{j} (x)}{1 - F_{j}^{o} (x)} \geq η) > 1 - ∊, j = 1, 2 .

Proof of Lemma A.2. For notational convenience, the subscript j = 1, 2 will be omitted. By definition,

\tilde{F} (x) = \prod_{t : {\tilde{X}}_{t} \leq x} (1 - \frac{δ_{t}}{\sum_{k = 1}^{n} J_{k} ({\tilde{X}}_{t})}) \leq \exp {- \int_{- \infty}^{x} \frac{\sum_{k = 1}^{n} d N_{k} (u)}{\sum_{k = 1}^{n} J_{k} (u)}} .

The right-hand side is bounded by $1 - \frac{2}{3} \int_{- \infty}^{x} \frac{\sum_{k = 1}^{n} d N_{k} (u)}{\sum_{k = 1}^{n} J_{k} (u)}$ , x ≤ τ₀ for suitably chosen τ₀, provided that $\int_{- \infty}^{τ_{0}} \frac{\sum_{k = 1}^{n} d N_{k} (u)}{\sum_{k = 1}^{n} J_{k} (u)} < - \log (2 ∕ 3)$ , which holds for all large n. Thus,

1 - \tilde{F} (x) + \frac{1}{n} \geq \frac{2}{3} {\sum_{t = 1}^{n} I (C_{t} \geq τ_{0}) I (X_{t} \leq x) + \frac{1}{n}} .

(A.1)

By a theorem of van Zuijlen (1978, Theorem 1.1), for any ∊ > 0, there exists η such that

P {\sum_{t = 1}^{n} I (C_{t} \geq τ_{0}) I (X_{t} \leq x) + \frac{1}{n} \geq η \sum_{t = 1}^{n} I (C_{t} \geq τ_{0}) F^{o} (x)} > 1 - ∊ .

(A.2)

Since $\lim \inf n^{- 1} \sum_{t = 1}^{n} I (C_{t} \geq τ_{0}) > 0$ , it follows from (A.1), (A.2) and the fact that $1 - \tilde{F} (x) \geq n^{- 1}$ for all $x \geq \hat{x}$ that the lemma holds.
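The product-limit form of the Kaplan–Meier estimator and the exponential bound used at the start of the proof of Lemma A.2 can be illustrated with a minimal sketch. The function names and the toy data below are assumptions made for illustration only; the sketch assumes no tied observation times and is not the implementation used in the paper.

```python
import math

def kaplan_meier(times, events, x):
    """Product-limit estimate of the survival function at x.

    times  : observed (possibly censored) times, the X-tilde values
    events : 1 if the observation is uncensored (delta = 1), else 0
    Assumes no tied observation times.
    """
    surv = 1.0
    for t_i, d_i in zip(times, events):
        if t_i <= x:
            at_risk = sum(1 for t_k in times if t_k >= t_i)  # sum_k J_k(t_i)
            surv *= 1.0 - d_i / at_risk
    return surv

def nelson_aalen(times, events, x):
    """Integral of sum_k dN_k(u) / sum_k J_k(u) over u <= x."""
    total = 0.0
    for t_i, d_i in zip(times, events):
        if t_i <= x:
            total += d_i / sum(1 for t_k in times if t_k >= t_i)
    return total

# Hypothetical toy sample: six observations, two of them censored.
times = [1.2, 2.5, 3.1, 4.0, 5.7, 6.3]
events = [1, 0, 1, 1, 0, 1]
for x in (2.0, 4.5, 7.0):
    km = kaplan_meier(times, events, x)
    bound = math.exp(-nelson_aalen(times, events, x))
    print(x, km, bound, km <= bound)
```

The inequality holds factor by factor because 1 − y ≤ exp(−y) for every y in [0, 1].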

Proof of Proposition 3.1. The main ideas here are to use the uniform consistency of the Kaplan–Meier estimator and the identifiability Condition C2. Write

\begin{matrix} n^{- 1} & \sum_{t = 1}^{n} {ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) - E^{0} [ℓ (F_{1}^{o} ({\tilde{X}}_{1 t}), F_{2}^{o} ({\tilde{X}}_{2 t}); α)]} \\ = n^{- 1} \sum_{t = 1}^{n} {ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) - ℓ (U_{1 t}, U_{2 t}; α)} + n^{- 1} \sum_{t = 1}^{n} {ℓ (U_{1 t}, U_{2 t}; α) - E^{0} [ℓ (U_{1 t}, U_{2 t}; α)]} . \end{matrix}

(A.3)

We first show that the first term on the right-hand side of (A.3) is of order o_p(1), uniformly in $α \in A$ . Under Condition C5, ${\tilde{F}}_{j} (x) \geq {\tilde{F}}_{j} (τ_{j})$ , j = 1, 2, are bounded away from 0. By continuity of ℓ() on $(0, 1) \times (0, 1) \times A$ and Lemma A.1, the first term, with summation over t such that both ${\tilde{F}}_{1} ({\tilde{X}}_{1 t})$ and ${\tilde{F}}_{2} ({\tilde{X}}_{2 t})$ are bounded away from 0, is of order o_p(1), uniformly in $α \in A$ ; that is, for every η > 0,

lim_{n \to \infty} sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) - ℓ (U_{1 t}, U_{2 t}; α) ∣ \times I ({\tilde{X}}_{1 t} \land {\tilde{X}}_{2 t} \geq η) = 0 .

(A.4)

It remains to show that for every ∊ > 0, there exists η > 0 such that

P {sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) I ({\tilde{X}}_{jt} \leq η) ∣ \geq ∊} \leq ∊, j = 1, 2,

(A.5)

and

P {\sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ (U_{1 t}, U_{2 t}; α) I ({\tilde{X}}_{jt} \leq η) ∣ \geq ∊} \leq ∊, j = 1, 2 .

(A.6)

By Lemma A.2 and Condition C4(iii), there exists K > 0 such that

P {\sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) I ({\tilde{X}}_{jt} \leq η) ∣ > K \sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ (U_{1 t}, U_{2 t}; α) I ({\tilde{X}}_{jt} \leq η) ∣} < \frac{∊}{3} .

Therefore, to show (A.5) and (A.6), it suffices to show that for any ∊* > 0, there exists η such that

P {∣ n^{- 1} \sum_{t = 1}^{n} L_{t} I ({\tilde{X}}_{jt} \leq η) ∣ \geq ∊^{*}} \leq \frac{2}{3} ∊, j = 1, 2 .

(A.7)

By Condition C4(ii) and the Markov inequality, to show (A.7), we only need to show that for any K* > 0, there exists η such that

P {∣ n^{- 1} \sum_{t = 1}^{n} L_{t} I (L_{t} < K^{*}) I ({\tilde{X}}_{jt} \leq η) ∣ \geq ∊^{*}} \leq \frac{1}{3} ∊, j = 1, 2 .

(A.8)

But, again by the Markov inequality, the left-hand side of (A.8) is bounded by

\frac{K^{*}}{∊^{*}} n^{- 1} \sum_{t = 1}^{n} P {{\tilde{X}}_{jt} \leq η}

which can be made arbitrarily small by Condition C1.

We next show that the second term is also of order o_p(1). By Condition C4(ii), it suffices to show that for every K > 0,

n^{- 1} \sum_{t = 1}^{n} {ℓ (U_{1 t}, U_{2 t}; α) I (\max {L_{t}, L_{α t}} \leq K) - E^{0} [ℓ (U_{1 t}, U_{2 t}; α) I (\max {L_{t}, L_{α t}} \leq K)]}

converges to 0 uniformly in $α \in A$ . But this sequence converges to 0 a.s. for every α and has uniformly bounded derivatives over the compact set $A$ , and, therefore, the convergence must be uniform.

Proof of Proposition 3.2. The proof can be done by essentially combining the techniques of Shih and Louis (1995) and Chen and Fan (2005). A critical part is how to appropriately control the tail behavior.

By the mean-value theorem, we can linearly expand the pseudo-likelihood score function at $α_{n}^{*}$ to get

\hat{α} - α_{n}^{*} = {\tilde{B}}_{n}^{- 1} \frac{1}{n} \sum_{t = 1}^{n} ℓ_{α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α_{n}^{*}),

(A.9)

where ${\tilde{B}}_{n} = \frac{1}{n} \sum_{t = 1}^{n} ℓ_{α α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\tilde{α}}_{n})$ for some ${\tilde{α}}_{n}$ on the line segment between $α_{n}^{*}$ and ${\hat{α}}_{n}$ . Under Condition A4, we can apply the same argument for proving (A.5) to show that $\sup_{α \in A} n^{- 1} \sum_{t = 1}^{n} ∣ ℓ_{α α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α) I ({\tilde{X}}_{jt} \leq η) ∣$ is asymptotically negligible as η → 0. This in conjunction with Condition A2 and the consistency of ${\tilde{F}}_{j}$ and ${\hat{α}}_{n}$ , implies that ${\tilde{B}}_{n} B_{n}^{- 1} \to I_{p}$ in probability as n → ∞.

Again by the mean-value theorem,

\begin{matrix} \sum_{t = 1}^{n} & ℓ_{α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α_{n}^{*}) = \sum_{t = 1}^{n} ℓ_{α} (F_{1}^{o} ({\tilde{X}}_{1 t}), F_{2}^{o} ({\tilde{X}}_{2 t}); α_{n}^{*}) + \sum_{j = 1}^{2} \sum_{t = 1}^{n} ℓ_{α j} ({\tilde{U}}_{1 t}, {\tilde{U}}_{2 t}; α_{n}^{*}) {{\tilde{F}}_{j} ({\tilde{X}}_{jt}) - F_{j}^{o} ({\tilde{X}}_{jt})} \\ = D_{1 n} + D_{2 n}, \end{matrix}

(A.10)

where ( ${\tilde{U}}_{1 t}$ , ${\tilde{U}}_{2 t}$ ) lies on the line segment between ( ${\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t})$ ) and ( $F_{1}^{o} ({\tilde{X}}_{1 t}), F_{2}^{o} ({\tilde{X}}_{2 t})$ ).

By Lemma A.1,

D_{2 n} = - \sum_{j = 1}^{2} \sum_{t = 1}^{n} ℓ_{α j} ({\tilde{U}}_{1 t}, {\tilde{U}}_{2 t}; α_{n}^{*}) F_{j}^{o} ({\tilde{X}}_{jt}) \times \int_{- \infty}^{{\tilde{X}}_{jt}} \frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u -)} \frac{\sum_{s} d M_{js} (u)}{\sum_{s} I ({\tilde{X}}_{js} \geq u)} .

(A.11)

Let $D_{2 n} (η) = \sum_{j = 1}^{2} D_{2 n, j} (η)$ denote the right-hand side of (A.11) with the summation restricted to those terms such that ${\tilde{X}}_{jt} \leq η$ . We next show that for some ∊_j > 0,

∣ n^{- 1 ∕ 2} D_{2 n, j} (η) ∣ = O_{p} (1) {(1 - F_{j}^{o} (η))}^{∊_{j}}, j = 1, 2,

(A.12)

where O_p(1) is uniform over η > 0. For any ξ ∈ (0, 1), since

\begin{matrix} E^{0} & {{(1 - F_{j}^{o} (x))}^{- ξ ∕ 2} \int_{- \infty}^{x} \frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u -)} \frac{\sum_{s} d M_{js} (u)}{\sum_{s} I ({\tilde{X}}_{js} \geq u)}}^{2} \\ \leq E^{0} {\int_{- \infty}^{x} {\frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u -)}}^{2} \frac{{1 - F^{o} (u)}^{- ξ}}{\sum_{s} I ({\tilde{X}}_{js} \geq u)} I (\max_{t} {\tilde{X}}_{jt} \geq u) d Λ_{j}^{o} (u)}, \end{matrix}

it follows from Lenglart's inequality (Gill, 1980, Theorem 2.4.2) that

\begin{matrix} E^{0} & {{(1 - F_{j}^{o} (x))}^{- ξ ∕ 2} \int_{- \infty}^{x} \frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u -)} \frac{\sum_{s} d M_{js} (u)}{\sum_{s} I ({\tilde{X}}_{js} \geq u)}}^{2} \\ = O_{p} (1) \int_{- \infty}^{x} {\frac{{\tilde{F}}_{j} (u -)}{F_{j}^{o} (u -)}}^{2} \frac{{1 - F^{o} (u)}^{- ξ}}{\sum_{s} I ({\tilde{X}}_{js} \geq u)} I (\max_{t} {\tilde{X}}_{jt} \geq u) d Λ_{j}^{o} (u) \\ = O_{p} (1) n^{- 1} \int_{- \infty}^{x} {1 - F^{o} (u)}^{- ξ} d {1 - F^{o} (u)} \\ = O_{p} (1) n^{- 1} {1 - F^{o} (x)}^{1 - ξ}, \end{matrix}

(A.13)

where O_p(1) is uniform in x and the second equality follows from Lemma A.2 and van Zuijlen (1978, Theorem 1.1). From (A.13), Lemma A.2 (with ξ = 2ξ_j) and Condition A3, we have, ignoring the right tail,

\begin{matrix} ∣ n^{- 1 ∕ 2} D_{2 n, j} (η) ∣ & = O_{p} (1) n^{- 1} \sum_{t = 1}^{n} U_{jt}^{ξ_{j} - b_{j}} U_{kt}^{- a_{k}} {(1 - F_{j}^{o} (η))}^{1 - 2 ξ_{j}} \\ = O_{p} (1) {(1 - F_{j}^{o} (η))}^{1 - 2 ξ_{j}} . \end{matrix}

Hence, (A.12) holds with ∊_j = 1 − 2ξ_j, j = 1, 2.

In view of (A.12), we can essentially pretend that ℓ_αj in (A.10) does not blow up at the tail. Therefore, (A.11) implies that for j = 1, 2,

\begin{matrix} D_{2 n, j} & = - \sum_{s = 1}^{n} \int \frac{n^{- 1} \sum_{t = 1}^{n} ℓ_{α j} ({\tilde{U}}_{1 t}, {\tilde{U}}_{2 t}; α_{n}^{*}) U_{jt} I ({\tilde{X}}_{jt} \geq u)}{n^{- 1} \sum_{t = 1}^{n} I ({\tilde{X}}_{jt} \geq u)} d M_{js} (u) \\ = - \sum_{s = 1}^{n} \int \frac{n^{- 1} \sum_{t = 1}^{n} E^{0} {ℓ_{α j} (U_{1 t}, U_{2 t}; α_{n}^{*}) U_{jt} I ({\tilde{X}}_{jt} \geq u)}}{P_{nj} (u)} d M_{js} (u) + o_{p} (n^{1 ∕ 2}) . \end{matrix}

(A.14)

From (A.9)–(A.11) and (A.14), we see that ${\hat{α}}_{n} - α_{n}^{*}$ is asymptotically a sum of independent zero-mean random vectors. Given Condition A1, Proposition 3.2 now follows from the standard multivariate central limit theorem for independent but non-identically distributed random variables.

Proof of Proposition 3.3. The consistency of the variance estimator clearly follows from the laws of large numbers, the consistency of the Kaplan–Meier estimator and of ${\hat{α}}_{n}$ , when the possible “tail instability” is ignored. To control the tail behavior, we can apply the same techniques as in the proofs of Propositions 3.1 and 3.2. The details are omitted.

Proof of Proposition 4.2. For i = 1, …, M, by the definition of ${\hat{α}}_{in}$ , we have

\sum_{t = 1}^{n} ℓ_{i, α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{in}) = 0 .

Hence,

\sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α_{in}^{*}) = \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{in}) + \frac{1}{2} {(α_{in}^{*} - {\hat{α}}_{in})}^{'} \sum_{t = 1}^{n} ℓ_{i, α α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\overset{‒}{α}}_{in}) (α_{in}^{*} - {\hat{α}}_{in}),

where ${\overset{‒}{α}}_{in}$ is between $α_{in}^{*}$ and ${\hat{α}}_{in}$ . By conditions C2–C5, A1–A4 and Proposition 3.2, we have

\frac{1}{2 n} {(α_{in}^{*} - {\hat{α}}_{in})}^{'} \sum_{t = 1}^{n} ℓ_{i, α α} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\overset{‒}{α}}_{in}) (α_{in}^{*} - {\hat{α}}_{in}) = - \frac{1}{2} {(α_{in}^{*} - {\hat{α}}_{in})}^{'} B_{in} (α_{in}^{*} - {\hat{α}}_{in}) + o_{p} (1 ∕ n) .

Hence,

\frac{1}{n} \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); {\hat{α}}_{in}) = \frac{1}{n} \sum_{t = 1}^{n} ℓ_{i} ({\tilde{F}}_{1} ({\tilde{X}}_{1 t}), {\tilde{F}}_{2} ({\tilde{X}}_{2 t}); α_{in}^{*}) + \frac{1}{2} {(α_{in}^{*} - {\hat{α}}_{in})}^{'} B_{in} (α_{in}^{*} - {\hat{α}}_{in}) + o_{p} (1 ∕ n) .

As a result, we get for all i = 2, …, M,

L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - \frac{1}{n} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] = A_{i, n} + D_{i, n} + o_{p} (1 ∕ n),

where

\begin{matrix} A_{i, n} & \equiv L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; α_{in}^{*}, α_{1 n}^{*}) - \frac{1}{n} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] \\ = \frac{1}{n} \sum_{t = 1}^{n} {[ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] - E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]} + \sum_{j = 1}^{2} \frac{1}{n} \sum_{t = 1}^{n} [{ℓ_{i, j} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1, j} (U_{1 t}, U_{2 t}; α_{1 n}^{*})} \times {{\tilde{F}}_{j} ({\tilde{X}}_{jt}) - F_{j}^{o} ({\tilde{X}}_{jt})}] + o_{p} (1 ∕ \sqrt{n}), \end{matrix}

D_{i, n} \equiv \frac{1}{2} {(α_{in}^{*} - {\hat{α}}_{in})}^{'} B_{in} (α_{in}^{*} - {\hat{α}}_{in}) - \frac{1}{2} {(α_{1 n}^{*} - {\hat{α}}_{1 n})}^{'} B_{1 n} (α_{1 n}^{*} - {\hat{α}}_{1 n}) .

By Proposition 3.2, we have $D_{i, n} = O_{p} (n^{- 1})$ .

For generalized non-nested models, using a proof similar to that of Proposition 3.2, we obtain:

\sqrt{n} \times A_{i, n} = \frac{1}{\sqrt{n}} \sum_{t = 1}^{n} [e_{it} - E^{0} (e_{it})] + o_{p} (1) = O_{p} (1),

hence $\sqrt{n} \times A_{i, n}$ converges in distribution to a $N (0, σ_{ii})$ . Therefore,

\sqrt{n} [L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; α_{in}^{*}, α_{1 n}^{*}) - \frac{1}{n} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]]

converges in distribution to a $N (0, σ_{ii})$ .

For generalized nested models, the term $A_{i, n}$ is zero almost surely, and we have

L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - \frac{1}{n} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})] = D_{i, n} + o_{p} (1 ∕ n),

where by Proposition 3.2, 2nD_i,n is distributed as a weighted sum of independent $χ_{[1]}^{2}$ random variables.

Proof of Proposition 4.3. Note that

\begin{matrix} T_{n} & = \max_{i = 2, \dots, M} [n^{1 ∕ 2} L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n})] \\ = \max_{i = 2, \dots, M} [n^{1 ∕ 2} {L R_{n} ({\tilde{F}}_{1}, {\tilde{F}}_{2}; {\hat{α}}_{in}, {\hat{α}}_{1 n}) - n^{- 1} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]} + n^{- 1 ∕ 2} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]] \\ \to \max_{i = 2, \dots, M} [Z_{i} + \lim_{n \to \infty} {n^{- 1 ∕ 2} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{in}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]}] . \end{matrix}

Let

K_{i n} = \lim_{n \to \infty} {n^{- 1 ∕ 2} \sum_{t = 1}^{n} E^{0} [ℓ_{i} (U_{1 t}, U_{2 t}; α_{i n}^{*}) - ℓ_{1} (U_{1 t}, U_{2 t}; α_{1 n}^{*})]} .

Then

\begin{matrix} P (T_{n} > Z_{α}) & \to P (\max_{i = 2, \dots, M} {Z_{i} + K_{i n}} > Z_{α}) \\ \geq P (\max_{i = 2, \dots, M} Z_{i} + \max_{i = 2, \dots, M} K_{i n} > Z_{α}) . \end{matrix}

For fixed alternatives, $\max_{i = 2, \dots, M} K_{i n} = + \infty$ and so P (T_n > Z_α) → 1. For local alternatives such that $\max_{i = 2, \dots, M} K_{i n} > 0$ ,

P (\max_{i = 2, \dots, M} Z_{i} + \max_{i = 2, \dots, M} K_{i n} > Z_{α}) > P (\max_{i = 2, \dots, M} Z_{i} > Z_{α}) = α .

Hence $\lim_{n \to \infty} P (T_{n} > Z_{α}) > α$ .

Appendix B. Expressions of copulas and their derivatives

In this appendix we describe the seven copulas and their derivatives that we have used in the empirical application in Section 5.³ Let (X₁, X₂) be the lifetime variables of interest with joint survival function F^o(x₁, x₂) = Pr(X₁ > x₁, X₂ > x₂) and continuous marginal survival functions $F_{j}^{o} (x_{j}) = \Pr (X_{j} > x_{j})$ , j = 1, 2. Let H (x₁, x₂) denote the corresponding joint cumulative distribution function (cdf) with marginal distributions $H_{j} \equiv 1 - F_{j}^{o}, j = 1, 2$ . By Sklar's (1959) theorem, there exists a unique copula function C_h on [0, 1]² such that

H (x_{1}, x_{2}) \equiv C_{h} (H_{1} (x_{1}), H_{2} (x_{2})),

or equivalently

F^{o} (x_{1}, x_{2}) \equiv C^{o} (F_{1}^{o} (x_{1}), F_{2}^{o} (x_{2})),

holds with

C^{o} (u_{1}, u_{2}) \equiv u_{1} + u_{2} - 1 + C_{h} (1 - u_{1}, 1 - u_{2}),

(B.1)

where the copula function C^o() is sometimes called the survival copula (of C_h).

It is easy to see that, for any j ∈ {1, 2}

\frac{\partial C^{o}}{\partial u_{j}} (u_{1}, u_{2}) = 1 - \frac{\partial C_{h}}{\partial u_{j}} (1 - u_{1}, 1 - u_{2}),

(B.2)

in fact, for any partial derivative of order k ≥ 2 we have that

\frac{\partial^{k} C^{o}}{\partial u_{j_{1}} \dots \partial u_{j_{k}}} (u_{1}, u_{2}) = {(- 1)}^{k} \frac{\partial^{k} C_{h}}{\partial u_{j_{1}} \dots \partial u_{j_{k}}} (1 - u_{1}, 1 - u_{2}),

(B.3)

where j_i ∈ {1, 2}. Note that this last equation implies that

c^{o} (u_{1}, u_{2}) = c_{h} (1 - u_{1}, 1 - u_{2}),

(B.4)

where c^o and c_h are the copula densities associated to C^o and C_h, respectively.

Using relations (B.2)–(B.4), by replacing v_j = 1 − u_j in the expressions of partial derivatives of a copula C_h and its density c_h, we immediately obtain the expressions for the partial derivatives of the survival copula C^o and its density c^o. Therefore, in the following we only provide expressions for the partial derivatives of several copula functions C_h and their densities c_h that we have used in the empirical application.

Gumbel copula

The Gumbel copula and its density are given by

C_{h} (v_{1}, v_{2}) = \exp {- {({(- \log (v_{1}))}^{α} + {(- \log (v_{2}))}^{α})}^{1 ∕ α}},

(B.5)

and

c_{h} (v_{1}, v_{2}) = \frac{1}{C_{h}} \frac{\partial C_{h}}{\partial v_{1}} \frac{\partial C_{h}}{\partial v_{2}} T_{1},

with T₁ = ((α − 1)(− log(C_h))⁻¹ + 1). Following Frees and Valdez (1998), we can express the partial derivative of C_h with respect to v_j, j = 1, 2, as

\frac{\partial C_{h}}{\partial v_{j}} (v_{1}, v_{2}) = \frac{C_{h} (v_{1}, v_{2})}{v_{j}} {(\log (v_{j}) ∕ \log (C_{h} (v_{1}, v_{2})))}^{α - 1} .

(B.6)

Hence, a little algebra implies⁴

\frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} = {(\log (v_{j}) ∕ \log (C_{h}))}^{α - 1} (\frac{\frac{\partial C_{h}}{\partial v_{j}}}{v_{j}} - \frac{C_{h}}{v_{j}^{2}}) + (α - 1) \frac{C_{h}}{\log (C_{h}) v_{j}^{2}} {(\log (v_{j}) ∕ \log (C_{h}))}^{α - 2} - (α - 1) \frac{\log (v_{j}) {(\log (v_{j}) ∕ \log (C_{h}))}^{α - 2}}{v_{j} \log {(C_{h})}^{2}} \frac{\partial C_{h}}{\partial v_{j}} .

The partial derivative of the copula density c_h with respect to v_j, j = 1, 2, is given by

\frac{\partial c_{h}}{\partial v_{j}} = - \frac{1}{C_{h}^{2}} {(\frac{\partial C_{h}}{\partial v_{j}})}^{2} \frac{\partial C_{h}}{\partial v_{i}} T_{1} + \frac{1}{C_{h}} \frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} \frac{\partial C_{h}}{\partial v_{i}} T_{1} + \frac{1}{C_{h}} \frac{\partial C_{h}}{\partial v_{j}} c_{h} T_{1} + \frac{1}{C_{h}} \frac{\partial C_{h}}{\partial v_{j}} \frac{\partial C_{h}}{\partial v_{i}} \frac{\partial T_{1}}{\partial v_{j}},

where

\frac{\partial T_{1}}{\partial v_{j}} = \frac{(α - 1) {(- \log (C_{h}))}^{- 2}}{C_{h}} \frac{\partial C_{h}}{\partial v_{j}} .
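As a hedged sanity check of the Gumbel expressions, the sketch below codes (B.5), (B.6) and the density formula with T₁, and compares them with finite differences of C_h. The function names, α = 1.5 and the evaluation point are illustrative assumptions.

```python
import math

def gumbel_cdf(v1, v2, alpha):
    """Gumbel copula C_h, equation (B.5)."""
    s = (-math.log(v1)) ** alpha + (-math.log(v2)) ** alpha
    return math.exp(-(s ** (1.0 / alpha)))

def gumbel_partial(v1, v2, alpha, j):
    """First partial dC_h/dv_j, equation (B.6)."""
    c = gumbel_cdf(v1, v2, alpha)
    vj = v1 if j == 1 else v2
    return c / vj * (math.log(vj) / math.log(c)) ** (alpha - 1.0)

def gumbel_density(v1, v2, alpha):
    """Density c_h = (1/C_h) (dC_h/dv_1) (dC_h/dv_2) T_1."""
    c = gumbel_cdf(v1, v2, alpha)
    t1 = (alpha - 1.0) / (-math.log(c)) + 1.0
    return (gumbel_partial(v1, v2, alpha, 1)
            * gumbel_partial(v1, v2, alpha, 2) * t1 / c)

alpha, v1, v2 = 1.5, 0.3, 0.7  # illustrative values
h = 1e-6
fd_partial = (gumbel_cdf(v1 + h, v2, alpha) - gumbel_cdf(v1 - h, v2, alpha)) / (2.0 * h)
k = 1e-4
fd_density = (gumbel_cdf(v1 + k, v2 + k, alpha) - gumbel_cdf(v1 + k, v2 - k, alpha)
              - gumbel_cdf(v1 - k, v2 + k, alpha) + gumbel_cdf(v1 - k, v2 - k, alpha)) / (4.0 * k * k)
print(fd_partial, gumbel_partial(v1, v2, alpha, 1))
print(fd_density, gumbel_density(v1, v2, alpha))
```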

Clayton copula

The Clayton copula and its density are given by

C_{h} (v_{1}, v_{2}) = {(v_{1}^{- α} + v_{2}^{- α} - 1)}^{- 1 ∕ α},

and

c_{h} (v_{1}, v_{2}) = (1 + α) v_{1}^{- (1 + α)} v_{2}^{- (1 + α)} {(v_{1}^{- α} + v_{2}^{- α} - 1)}^{- (1 ∕ α + 2)} .

Hence the second order partial derivative of C_h with respect to v_j, j = 1, 2, is given by

\frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} = (1 + α) (\frac{1}{C_{h}} {(\frac{\partial C_{h}}{\partial v_{j}})}^{2} - \frac{1}{v_{j}} \frac{\partial C_{h}}{\partial v_{j}}),

where

\frac{\partial C_{h}}{\partial v_{j}} = v_{j}^{- (1 + α)} {(v_{1}^{- α} + v_{2}^{- α} - 1)}^{- 1 ∕ α - 1} .

The first order partial derivative of the copula density c_h with respect to v_j, j = 1, 2, is given by

\frac{\partial c_{h}}{\partial v_{j}} = c_{h} (\frac{(1 + 2 α)}{C_{h}} \frac{\partial C_{h}}{\partial v_{j}} - (1 + α) ∕ v_{j}) .
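The Clayton derivatives above can be verified in the same way. This is a minimal sketch for j = 1 (the case j = 2 follows by swapping the arguments); the function names, α = 2 and the evaluation point are assumptions made for illustration.

```python
def clayton_cdf(v1, v2, alpha):
    return (v1 ** -alpha + v2 ** -alpha - 1.0) ** (-1.0 / alpha)

def clayton_d1(v1, v2, alpha):
    """First partial dC_h/dv_1."""
    s = v1 ** -alpha + v2 ** -alpha - 1.0
    return v1 ** -(1.0 + alpha) * s ** (-1.0 / alpha - 1.0)

def clayton_d11(v1, v2, alpha):
    """Second partial d^2 C_h / dv_1^2."""
    c, d = clayton_cdf(v1, v2, alpha), clayton_d1(v1, v2, alpha)
    return (1.0 + alpha) * (d * d / c - d / v1)

def clayton_density(v1, v2, alpha):
    s = v1 ** -alpha + v2 ** -alpha - 1.0
    return (1.0 + alpha) * (v1 * v2) ** -(1.0 + alpha) * s ** -(1.0 / alpha + 2.0)

def clayton_density_d1(v1, v2, alpha):
    """First partial of the density, dc_h/dv_1."""
    c, d = clayton_cdf(v1, v2, alpha), clayton_d1(v1, v2, alpha)
    return clayton_density(v1, v2, alpha) * ((1.0 + 2.0 * alpha) / c * d
                                             - (1.0 + alpha) / v1)

def central_diff(f, x, h=1e-6):
    """Central finite difference of a scalar function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

alpha, v1, v2 = 2.0, 0.4, 0.7  # illustrative values
fd_d11 = central_diff(lambda x: clayton_d1(x, v2, alpha), v1)
fd_dens = central_diff(lambda x: clayton_density(x, v2, alpha), v1)
print(fd_d11, clayton_d11(v1, v2, alpha))
print(fd_dens, clayton_density_d1(v1, v2, alpha))
```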

Frank copula

The Frank copula and its density are given by

C_{h} (v_{1}, v_{2}) = \frac{1}{\log (α)} \log (1 - \frac{(1 - α^{v_{1}}) (1 - α^{v_{2}})}{1 - α}),

and

c_{h} (v_{1}, v_{2}) = \log (α^{- 1}) \frac{α^{v_{1}} α^{v_{2}}}{1 - α} {(1 - \frac{(1 - α^{v_{1}}) (1 - α^{v_{2}})}{1 - α})}^{- 2} .

After some algebra, the second order partial derivative of C_h with respect to v_j, j = 1, 2, is given by

\frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} = \log (α) (\frac{\partial C_{h}}{\partial v_{j}} - {(\frac{\partial C_{h}}{\partial v_{j}})}^{2})

where

\frac{\partial C_{h}}{\partial v_{j}} = {(1 - \frac{(1 - α^{v_{1}}) (1 - α^{v_{2}})}{1 - α})}^{- 1} \frac{(1 - α^{v_{i}}) α^{v_{j}}}{1 - α},

and the first order derivative of the copula density c_h with respect to v_j, j = 1, 2, is given by

\frac{\partial c_{h}}{\partial v_{j}} = \log (α) (1 - 2 \frac{\partial C_{h}}{\partial v_{j}}) c_{h} .
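A short numerical sketch of the Frank expressions, again for j = 1 and i = 2. The helper names, the value α = 0.2 and the evaluation point are assumptions for illustration; the parameterization is the one written above, with α > 0 and α ≠ 1.

```python
import math

def frank_base(v1, v2, alpha):
    """The term 1 - (1 - a^v1)(1 - a^v2) / (1 - a) shared by all formulas."""
    return 1.0 - (1.0 - alpha ** v1) * (1.0 - alpha ** v2) / (1.0 - alpha)

def frank_cdf(v1, v2, alpha):
    return math.log(frank_base(v1, v2, alpha)) / math.log(alpha)

def frank_d1(v1, v2, alpha):
    """First partial dC_h/dv_1."""
    return ((1.0 - alpha ** v2) * alpha ** v1
            / ((1.0 - alpha) * frank_base(v1, v2, alpha)))

def frank_d11(v1, v2, alpha):
    """Second partial d^2 C_h / dv_1^2."""
    d = frank_d1(v1, v2, alpha)
    return math.log(alpha) * (d - d * d)

def frank_density(v1, v2, alpha):
    return (math.log(1.0 / alpha) * alpha ** v1 * alpha ** v2
            / ((1.0 - alpha) * frank_base(v1, v2, alpha) ** 2))

def frank_density_d1(v1, v2, alpha):
    """First partial of the density, dc_h/dv_1."""
    return (math.log(alpha) * (1.0 - 2.0 * frank_d1(v1, v2, alpha))
            * frank_density(v1, v2, alpha))

def central_diff(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2.0 * h)

alpha, v1, v2 = 0.2, 0.3, 0.7  # illustrative values
checks = [
    (central_diff(lambda x: frank_cdf(x, v2, alpha), v1), frank_d1(v1, v2, alpha)),
    (central_diff(lambda x: frank_d1(x, v2, alpha), v1), frank_d11(v1, v2, alpha)),
    (central_diff(lambda x: frank_density(x, v2, alpha), v1), frank_density_d1(v1, v2, alpha)),
]
for numeric, analytic in checks:
    print(numeric, analytic)
```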

Joe copula

The Joe copula and its density are given by

C_{h} (v_{1}, v_{2}) = 1 - {({\overset{‒}{v}}_{1}^{α} + {\overset{‒}{v}}_{2}^{α} - {\overset{‒}{v}}_{1}^{α} {\overset{‒}{v}}_{2}^{α})}^{1 ∕ α}

and

c_{h} (v_{1}, v_{2}) = {\overset{‒}{v}}_{1}^{α - 1} {\overset{‒}{v}}_{2}^{α - 1} T_{2}^{1 ∕ α - 1} (α - \frac{(1 - α) (1 - {\overset{‒}{v}}_{1}^{α}) (1 - {\overset{‒}{v}}_{2}^{α})}{T_{2}}),

where ${\overset{‒}{v}}_{j} = 1 - v_{j}$ and $T_{2} = {\overset{‒}{v}}_{1}^{α} + {\overset{‒}{v}}_{2}^{α} - {\overset{‒}{v}}_{1}^{α} {\overset{‒}{v}}_{2}^{α}$ .

The second order partial derivative of the copula C_h with respect to v_j, j = 1, 2, is given by

\frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} = (α - 1) {\overset{‒}{v}}_{j}^{α - 1} (1 - {\overset{‒}{v}}_{i}^{α}) T_{2}^{1 ∕ α - 1} (\frac{{\overset{‒}{v}}_{j}^{α - 1} (1 - {\overset{‒}{v}}_{i}^{α})}{T_{2}} - \frac{1}{{\overset{‒}{v}}_{j}}) .

After some tedious algebra, the first order partial derivative of the copula density c_h with respect to v_j, j = 1, 2, is given by

\frac{\partial c_{h}}{\partial v_{j}} = (α - 1) {\overset{‒}{v}}_{i}^{α - 1} {\overset{‒}{v}}_{j}^{α - 2} T_{2}^{1 ∕ α - 1} (- 1 + {\overset{‒}{v}}_{j}^{α} (1 - {\overset{‒}{v}}_{i}^{α}) T_{2}^{- 1}) \times (α - \frac{(1 - α) (1 - {\overset{‒}{v}}_{1}^{α}) (1 - {\overset{‒}{v}}_{2}^{α})}{T_{2}}) + (1 - α) α {\overset{‒}{v}}_{i}^{α - 1} {\overset{‒}{v}}_{j}^{2 α - 2} (1 - {\overset{‒}{v}}_{i}^{α}) T_{2}^{1 ∕ α - 3} \times (- T_{2} - (1 - {\overset{‒}{v}}_{j}^{α}) (1 - {\overset{‒}{v}}_{i}^{α})) .
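Because the Joe density derivative is long, a numerical check is a useful safeguard when coding it. The sketch below, for j = 1 and i = 2, compares the density with a finite-difference mixed partial of C_h and the density derivative with a finite difference of the density. The function names, α = 2 and the evaluation point are illustrative assumptions.

```python
def joe_t2(v1, v2, alpha):
    """T_2 with the complements w_j = 1 - v_j."""
    w1, w2 = 1.0 - v1, 1.0 - v2
    return w1 ** alpha + w2 ** alpha - (w1 * w2) ** alpha

def joe_cdf(v1, v2, alpha):
    return 1.0 - joe_t2(v1, v2, alpha) ** (1.0 / alpha)

def joe_density(v1, v2, alpha):
    w1, w2 = 1.0 - v1, 1.0 - v2
    t2 = joe_t2(v1, v2, alpha)
    g = alpha - (1.0 - alpha) * (1.0 - w1 ** alpha) * (1.0 - w2 ** alpha) / t2
    return (w1 * w2) ** (alpha - 1.0) * t2 ** (1.0 / alpha - 1.0) * g

def joe_density_d1(v1, v2, alpha):
    """First partial of the density with respect to v_1 (j = 1, i = 2)."""
    w1, w2 = 1.0 - v1, 1.0 - v2
    t2 = joe_t2(v1, v2, alpha)
    p = (1.0 - w1 ** alpha) * (1.0 - w2 ** alpha)
    g = alpha - (1.0 - alpha) * p / t2
    term1 = ((alpha - 1.0) * w2 ** (alpha - 1.0) * w1 ** (alpha - 2.0)
             * t2 ** (1.0 / alpha - 1.0)
             * (-1.0 + w1 ** alpha * (1.0 - w2 ** alpha) / t2) * g)
    term2 = ((1.0 - alpha) * alpha * w2 ** (alpha - 1.0)
             * w1 ** (2.0 * alpha - 2.0) * (1.0 - w2 ** alpha)
             * t2 ** (1.0 / alpha - 3.0) * (-t2 - p))
    return term1 + term2

alpha, v1, v2 = 2.0, 0.3, 0.6  # illustrative values
k = 1e-4
fd_density = (joe_cdf(v1 + k, v2 + k, alpha) - joe_cdf(v1 + k, v2 - k, alpha)
              - joe_cdf(v1 - k, v2 + k, alpha) + joe_cdf(v1 - k, v2 - k, alpha)) / (4.0 * k * k)
h = 1e-6
fd_density_d1 = (joe_density(v1 + h, v2, alpha) - joe_density(v1 - h, v2, alpha)) / (2.0 * h)
print(fd_density, joe_density(v1, v2, alpha))
print(fd_density_d1, joe_density_d1(v1, v2, alpha))
```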

Gaussian copula

The Gaussian copula and its density are given by

C_{h} (v_{1}, v_{2}) = Φ_{α} (Φ^{- 1} (v_{1}), Φ^{- 1} (v_{2})),

where Φ_α is the bivariate standard normal distribution function with correlation α, Φ is the scalar standard normal distribution function, and

c_{h} (v_{1}, v_{2}) = \frac{ϕ_{α} (Φ^{- 1} (v_{1}), Φ^{- 1} (v_{2}))}{ϕ (Φ^{- 1} (v_{1})) ϕ (Φ^{- 1} (v_{2}))},

where ϕ is the density function of Φ, and ϕ_α is the density function of Φ_α.

The second order partial derivative of the copula C_h with respect to v_j, j = 1, 2, is given by

\frac{\partial^{2} C_{h}}{\partial {v_{j}}^{2}} = - [\int_{- \infty}^{Φ^{- 1} (v_{i})} \frac{Φ^{- 1} (v_{j}) - α s}{2 π {(1 - α^{2})}^{3 ∕ 2}} \times \exp (- \frac{1}{2} \frac{{Φ^{- 1} (v_{j})}^{2} - 2 α Φ^{- 1} (v_{j}) s + s^{2}}{1 - α^{2}}) d s] {ϕ (Φ^{- 1} (v_{j}))}^{- 2} + \frac{\partial C_{h}}{\partial v_{j}} Φ^{- 1} (v_{j}) {ϕ (Φ^{- 1} (v_{j}))}^{- 1} .

The first order partial derivative of the copula density c_h with respect to v_j, j = 1, 2, is given by

\frac{\partial c_{h}}{\partial v_{j}} = (Φ^{- 1} (v_{j}) - \frac{Φ^{- 1} (v_{j}) - α Φ^{- 1} (v_{i})}{1 - α^{2}}) {ϕ (Φ^{- 1} (v_{j}))}^{- 1} c_{h} .
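The Gaussian copula density and its first-order derivative can be coded with the Python standard library alone. In this sketch x = Φ⁻¹(v_j) and y = Φ⁻¹(v_i), so the cross term in the derivative is read as α Φ⁻¹(v_i); that reading, the function names, α = 0.5 and the evaluation point are assumptions made for illustration, and the result is checked against a finite difference.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: inv_cdf is Phi^{-1}, pdf is phi

def gaussian_copula_density(v1, v2, alpha):
    x, y = STD_NORMAL.inv_cdf(v1), STD_NORMAL.inv_cdf(v2)
    q = (x * x - 2.0 * alpha * x * y + y * y) / (1.0 - alpha * alpha)
    bivariate = math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - alpha * alpha))
    return bivariate / (STD_NORMAL.pdf(x) * STD_NORMAL.pdf(y))

def gaussian_copula_density_d1(v1, v2, alpha):
    """First partial of the density with respect to v_1 (j = 1, i = 2)."""
    x, y = STD_NORMAL.inv_cdf(v1), STD_NORMAL.inv_cdf(v2)
    slope = x - (x - alpha * y) / (1.0 - alpha * alpha)
    return slope / STD_NORMAL.pdf(x) * gaussian_copula_density(v1, v2, alpha)

alpha, v1, v2 = 0.5, 0.3, 0.7  # illustrative values
h = 1e-6
fd = (gaussian_copula_density(v1 + h, v2, alpha)
      - gaussian_copula_density(v1 - h, v2, alpha)) / (2.0 * h)
print(fd, gaussian_copula_density_d1(v1, v2, alpha))
```

With α = 0 the density reduces to 1 and the derivative to 0, which gives a quick independent check.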

Mixture copula

A mixture copula C_h(v₁, v₂; α), with its parameter α = (α₁, α₂, λ), is simply given by

C_{h} (v_{1}, v_{2}; α) = λ C_{h}^{1} (v_{1}, v_{2}; α_{1}) + (1 - λ) C_{h}^{2} (v_{1}, v_{2}; α_{2}), 0 \leq λ \leq 1,

where $C_{h}^{1} (v_{1}, v_{2}; α_{1})$ is one copula (such as the Clayton copula in our application) with its parameter α₁, and $C_{h}^{2} (v_{1}, v_{2}; α_{2})$ is another copula (such as the Gumbel copula in our application) with its parameter α₂. Then it is clear that the partial derivatives of C_h are simply the linear combination of the partial derivatives of the two copulas:

\frac{\partial^{k} C_{h}}{\partial v_{j}^{k}} = λ \frac{\partial^{k} C_{h}^{1}}{\partial v_{j}^{k}} + (1 - λ) \frac{\partial^{k} C_{h}^{2}}{\partial v_{j}^{k}}, j = 1, 2 .
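The linearity of the mixture derivatives can be sketched for the Clayton and Gumbel pair mentioned above. The function names, the weight λ = 0.4 and the parameter values are assumptions chosen for illustration, not estimates from the empirical application.

```python
import math

def clayton_cdf(v1, v2, a):
    return (v1 ** -a + v2 ** -a - 1.0) ** (-1.0 / a)

def clayton_d1(v1, v2, a):
    return v1 ** -(1.0 + a) * (v1 ** -a + v2 ** -a - 1.0) ** (-1.0 / a - 1.0)

def gumbel_cdf(v1, v2, a):
    s = (-math.log(v1)) ** a + (-math.log(v2)) ** a
    return math.exp(-(s ** (1.0 / a)))

def gumbel_d1(v1, v2, a):
    c = gumbel_cdf(v1, v2, a)
    return c / v1 * (math.log(v1) / math.log(c)) ** (a - 1.0)

def mixture_cdf(v1, v2, a1, a2, lam):
    """lam * Clayton(a1) + (1 - lam) * Gumbel(a2)."""
    return lam * clayton_cdf(v1, v2, a1) + (1.0 - lam) * gumbel_cdf(v1, v2, a2)

def mixture_d1(v1, v2, a1, a2, lam):
    """First partial in v_1: the same linear combination of the components."""
    return lam * clayton_d1(v1, v2, a1) + (1.0 - lam) * gumbel_d1(v1, v2, a2)

a1, a2, lam, v1, v2 = 2.0, 1.5, 0.4, 0.3, 0.7  # illustrative values
h = 1e-6
fd = (mixture_cdf(v1 + h, v2, a1, a2, lam)
      - mixture_cdf(v1 - h, v2, a1, a2, lam)) / (2.0 * h)
print(fd, mixture_d1(v1, v2, a1, a2, lam))
```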

Footnotes

¹ Since our test is developed for semiparametric copula-based survival functions instead of distribution functions, we use the survival copulas of these seven copula functions in implementing our test. However, we present our empirical results in terms of copulas of the corresponding semiparametric distribution functions in order to compare our results with existing results just cited.

² When computing the test statistic $T_{nS}^{P}$ , we have used a = 1 and b_n = 10/n².

³ In the empirical application we have used both analytical derivatives and numerical derivatives; the results based on analytical derivatives perform slightly better. Since these analytical derivatives for copulas are tedious to compute, we include them in this Appendix B so that readers could use them in other applications as well.

⁴ We leave the dependence on (v₁, v₂) implicit, to ease the notational burden.

References

Akritas M. Nearest neighbor estimation of a bivariate distribution under random censoring. Annals of Statistics. 1994;22:1299–1327.
Chen X, Fan Y. Pseudo-likelihood ratio tests for model selection in semiparametric multivariate copula models. The Canadian Journal of Statistics. 2005;33:389–414.
Chen X, Fan Y. Estimation and model selection of semiparametric copula-based multivariate dynamic models under copula misspecification. Journal of Econometrics. 2006a;135:125–154.
Chen X, Fan Y. Estimation of copula-based semiparametric time series models. Journal of Econometrics. 2006b;130:307–335.
Chen X, Fan Y. A model selection test for bivariate failure-time data. Econometric Theory. 2007;23:414–439.
Dabrowska D. Kaplan–Meier estimate on the plane: Weak convergence, LIL, and the bootstrap. Journal of Multivariate Analysis. 1989;29:308–325.
Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge University Press; 1997.
Denuit M, Purcaru O, Van Keilegom I. Bivariate Archimedean copula modelling for censored data in non-life insurance. Journal of Actuarial Practice. 2006;13:5–32.
Frees E, Valdez E. Understanding relationships using copulas. North American Actuarial Journal. 1998;2:1–25.
Genest C, Ghoudi K, Rivest L. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika. 1995;82:543–552.
Gill R. Censoring and Stochastic Integrals. Math. Centre Tracts. Vol. 124. Mathematisch Centrum; Amsterdam: 1980.
Hansen RP. A test for superior predictive ability. Manuscript. Brown University; 2003.
Joe H. Multivariate Models and Dependence Concepts. Chapman & Hall/CRC; London: 1997.
Klugman S, Parsa R. Fitting bivariate loss distributions with copulas. Insurance: Mathematics and Economics. 1999;24:139–148.
Lai T, Ying Z. Estimating a distribution function with truncated and censored data. Annals of Statistics. 1991;19:417–442.
Li D. On default correlation: A copula function approach. Journal of Fixed Income. 2000:43–54.
Nelsen R. An Introduction to Copulas. Springer; New York: 1999.
Oakes D. Bivariate survival models induced by frailties. Journal of the American Statistical Association. 1989;84:487–493.
Oakes D. Multivariate survival distributions. Journal of Nonparametric Statistics. 1994;3:343–354.
Romano JP, Wolf M. Stepwise multiple testing as formalized data snooping. Econometrica. 2005;73:1237–1282.
Shih J, Louis T. Inferences on the association parameter in copula models for bivariate survival data. Biometrics. 1995;51:1384–1399.
Sin C, White H. Information criteria for selecting possibly misspecified parametric models. Journal of Econometrics. 1996;71:207–225.
Sklar A. Fonctions de répartition à n dimensions et leurs marges. Publications de l'Institut de Statistique de l'Université de Paris. 1959;8:229–231.
van Zuijlen MCA. Properties of the empirical distribution function for independent nonidentically distributed random variables. Annals of Probability. 1978;6:250–266.
Wang W, Wells M. Model selection and semiparametric inference for bivariate failure-time data. Journal of the American Statistical Association. 2000;95:62–76.
White H. A reality check for data snooping. Econometrica. 2000;68:1097–1126.

[R1] Akritas M. Nearest neighbor estimation of a bivariate distribution under random censoring. Annals of Statistics. 1994;22:1299–1327. [Google Scholar]

[R2] Chen X, Fan Y. Pseudo-likelihood ratio tests for model selection in semiparametric multivariate copula models. The Canadian Journal of Statistics. 2005;33:389–414. [Google Scholar]

[R3] Chen X, Fan Y. Estimation and model selection of semiparametric copula-based multivariate dynamic models under copula misspecification. Journal of Econometrics. 2006a;135:125–154. [Google Scholar]

[R4] Chen X, Fan Y. Estimation of copula-based semiparametric time series models. Journal of Econometrics. 2006b;130:307–335. [Google Scholar]

[R5] Chen X, Fan Y. A model selection test for bivariate failure-time data. Econometric Theory. 2007;23:414–439. [Google Scholar]

[R6] Dabrowska D. Kaplan–Meier estimate on the plane: Weak convergence, LIL, and the bootstrap. Journal of Multivariate Analysis. 1989;29:308–325. [Google Scholar]

[R7] Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge University Press; 1997. [Google Scholar]

[R8] Denuit M, Purcaru O, Van Keilegom I. Bivariate Archimedean copula modelling for censored data in non-life insurance. Journal of Actuarial Practice. 2006;13:5–32. [Google Scholar]

[R9] Frees E, Valdez E. Understanding relationships using copulas. North American Actuarial Journal. 1998;2:1–25. [Google Scholar]

[R10] Genest C, Ghoudi K, Rivest L. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika. 1995;82:543–552. [Google Scholar]

[R11] Gill R. Math. Centre Tracts. Vol. 124. Mathematisch Centrum; Amsterdam: 1980. Censoring and Stochastic Integrals. [Google Scholar]

[R12] Hansen RP. Manuscript. Brown University; 2003. A test for superior predictive ability. [Google Scholar]

[R13] Joe H. Multivariate Models and Dependence Concepts. Chapman & Hall/CRC; London: 1997. [Google Scholar]

[R14] Klugman S, Parsa R. Fitting bivariate loss distributions with copulas. Insurance: Mathematics and Economics. 1999;24:139–148. [Google Scholar]

[R15] Lai T, Ying Z. Estimating a distribution function with truncated and censored data. Annals of Statistics. 1991;19:417–442. [Google Scholar]

[R16] Li D. On default correlation: A copula function approach. Journal of Fixed Income. 2000:43–54. [Google Scholar]

[R17] Nelsen R. An Introduction to Copulas. Springer; New York: 1999. [Google Scholar]

[R18] Oakes D. Bivariate survival models induced by frailties. Journal of the American Statistical Association. 1989;84:487–493. [Google Scholar]

[R19] Oakes D. Multivariate survival distributions. Journal of Nonparametric Statistics. 1994;3:343–354. [Google Scholar]

[R20] Romano JP, Wolf M. Stepwise multiple testing as formalized data snooping. Econometrica. 2005;73:1237–1282. [Google Scholar]

[R21] Shih J, Louis T. Inferences on the association parameter in copula models for bivariate survival data. Biometrics. 1995;51:1384–1399. [PubMed] [Google Scholar]

[R22] Sin C, White H. Information criteria for selecting possibly misspecified parametric models. Journal of Econometrics. 1996;71:207–225. [Google Scholar]

[R23] Sklar A. Fonctions de répartition à n dimensions et leurs marges. Publications of the Institute of Statistics University Paris. 1959;8:229–231.

[R24] van Zuijlen MCA. Properties of the empirical distribution function for independent nonidentically distributed random variables. Annals of Probability. 1978;6:250–266.

[R25] Wang W, Wells M. Model selection and semiparametric inference for bivariate failure-time data. Journal of the American Statistical Association. 2000;95:62–76.

[R26] White H. A reality check for data snooping. Econometrica. 2000;68:1097–1126.
