CASE-CONTROL SURVIVAL ANALYSIS WITH A GENERAL SEMIPARAMETRIC SHARED FRAILTY MODEL - A PSEUDO FULL LIKELIHOOD APPROACH

Malka Gorfine; David M Zucker; Li Hsu

doi:10.1901/jaba.2009.37-1489

Author manuscript; available in PMC: 2009 Aug 14.

Published in final edited form as: Ann Stat. 2009;37(3):1489–1517. doi: 10.1901/jaba.2009.37-1489

PMCID: PMC2727284 NIHMSID: NIHMS113107 PMID: 19684881

Abstract

In this work we deal with correlated failure time (age at onset) data arising from population-based case-control studies, where case and control probands are selected by population-based sampling and an array of risk factor measures is collected for both cases and controls and their relatives. Parameters of interest are effects of risk factors on the failure time hazard function and within-family dependencies among failure times after adjusting for the risk factors. Due to the retrospective sampling scheme, large sample theory for existing methods has not been established. We develop a novel technique for estimating the parameters of interest under a general semiparametric shared frailty model. We also present a simple, easily computed, and non-iterative nonparametric estimator for the cumulative baseline hazard function. We provide rigorous large sample theory for the proposed method. We also present simulation results and a real data example for illustrating the utility of the proposed method.

Keywords: Case-control study, Correlated failure times, Family study, Frailty model, Multivariate survival model

1. Introduction

Clustered failure times arise often in medical and epidemiologic studies. Examples include disease onset times of twins (with time expressed in terms of age), multiple recurrence times of infections on an individual, or time to blindness for the two eyes within an individual. A typical case-control family study includes a random sample of independent diseased individuals (cases) and non-diseased individuals (controls), along with their family members. An array of genetic and environmental risk factor measures is collected on these individuals. Integration of genetic and environmental data is a central problem of modern observational epidemiology. Case-control family studies are powerful because they provide an efficient way to assess the effect of risk factors on the occurrence of a rare disease, and furthermore allow researchers to dissect genetic and environmental contributions to the disease based on the familial aggregation pattern of disease clusters. Hopper [16] suggested that such study designs may be the future of epidemiology in general, not just genetic epidemiology. Hence the need for statistical methods that can exploit such data is acute.

In this work we focus on population-based matched case-control family studies, where a number of case and control probands are randomly sampled from a well-defined population. The probands are the index subjects because of whom the families are ascertained. Here we use the term proband in a broad sense to refer to both cases and controls, in contrast with the traditional usage in which probands refer only to cases.

Relative to classical case-control methods, analysis of these studies is complicated in two major ways. First, comparisons are no longer solely between subjects with and without the disease under study, but rather between collections of the case probands and their relatives and the control probands and their relatives, with each collection typically including many subjects both with and without the disease under study. Second, data are clustered within families, and hence reflect intra-familial correlation due to unmeasured genetic and environmental factors.

Our work is motivated by a recent breast cancer study [23], [24]. In this study, the cases were incident breast cancer cases ascertained from the population-based Surveillance, Epidemiology, and End Results (SEER) cancer registry. The controls were selected by random digit dialing, and were matched with cases based on age at diagnosis and county of residence. Female relatives of case and control probands were identified, and the risk factor and outcome information was collected on these relatives. The primary goals of the study are (a) to determine the degree of familial dependence with respect to age at breast cancer diagnosis; and (b) to assess the effects of covariates on breast cancer risk.

Two modeling approaches, marginal and conditional, are typically used for accounting for the correlation within a cluster. In the conditional model, the correlation is explicitly induced by a cluster-specific random effect, with the outcomes of the cluster members being conditionally independent given the random effect. In the survival context, this model is known as the frailty model, with the random frailty term typically assumed to act multiplicatively on the hazard. Many frailty models have been considered, including gamma, positive stable, inverse Gaussian, compound Poisson, and lognormal. See [17] for a broad review. Under a frailty model, the regression coefficients are cluster-specific log-hazard ratios. In the marginal model, by contrast, the correlation is modelled through a multivariate distribution, often involving a copula function, with a specified model for the marginal hazard functions. The regression coefficients in the marginal model represent the log-hazard ratios at the population level regardless of which cluster an individual comes from. The effect is thus “population-averaged.” A comparison of the conditional and marginal modelling approaches can be found in [34].

Methods exist for case-control family studies under both modelling approaches. Shih and Chatterjee [31] proposed a semi-parametric quasi-partial-likelihood approach for estimating the regression coefficients in a bivariate copula model. Their cumulative hazard estimator requires an iterative solution, and thus, so far, the properties of their estimators have been investigated only by simulation. Moreover, in the case of multiple relatives for each proband, the relatives were treated in the estimation process as if they were independent, which may lead to loss of efficiency in the baseline hazard function estimator. By contrast, a quasi-EM algorithm method for the popular gamma frailty model is presented in [18], where the baseline hazard function estimator naturally accommodates multiple relatives in a family [19]. However, with this method as well, the properties of the proposed estimators were studied only by simulation. The Shih-Chatterjee method can be adapted to the family-specific frailty setting [27], but with the same limitation as for the marginal model: the lack of large sample theory.

In this work, we develop a new estimation technique for the general semiparametric shared frailty model, where the parameters of interest are the regression coefficients and the frailty parameters. Our method covers any frailty distribution with finite moments. The estimation procedure for the baseline hazard function leads to an estimator whose asymptotic properties can be derived and expressed in a tractable way. Thus, this paper is the first to present an estimation procedure with rigorous asymptotic theory for a frailty survival model in the case-control family study setting.

Section 2 presents our model, and Section 3 describes our estimation procedure. Section 4 gives the consistency and asymptotic normality results. In Section 5, we describe an extension of our method for the case where the proband observation times are subject to a certain restriction that can arise in some studies. Section 6 presents simulation results. In Section 7 we illustrate our method with a case-control family study of early onset breast cancer. Section 8 provides a short discussion. Section 9 provides the details of the asymptotic theory. Throughout the paper, certain details have been omitted for brevity. These details are given in an expanded version of this paper which is available at the Front for the Mathematics ArXiv under Statistics, publication number: math.ST/0703300.

2. Notation and Model Formulation

We consider a matched case-control family study where one case proband is age-matched with one control proband, and an array of risk factors is measured on the case and control probands and their relatives. Each matched set contains one case family and one control family, and there are n i.i.d. matched sets. Let $T_{i j}^{0}$ and C_ij denote the age of disease onset and age at censoring, respectively, for individual j of family i, i = 1, . . . , 2n, j = 0, 1, . . . , m_i, where j = 0 corresponds to the proband. Following [28] (p. 187), we regard m_i as a random variable over {1, . . . , m} for some m, and build up the remainder of the model conditional on m_i. Define $δ_{i j} = I (T_{i j}^{0} \leq C_{i j})$ to be the failure indicator and $T_{i j} = \min (T_{i j}^{0}, C_{i j})$ to be the observed follow-up time for individual ij. We assume that a p-vector of covariates is observed on all subjects, and let Z_ij denote the value of the time-independent covariate vector for individual ij. In addition, we associate with family i an unobservable family-level covariate ω_i, the “frailty”, which induces dependence among family members. The conditional hazard function for proband i, given the family frailty ω_i, is assumed to take the form

λ_{i 0} (t ∣ Z_{i 0}, ω_{i}) = ω_{i} λ_{0} (t) \exp (β^{T} Z_{i 0}) i = 1, . . ., 2 n .

(1)

The conditional hazard function for relative ij, j = 1, . . . , m_i, given the family frailty ω_i, is assumed to take the form

λ_{i j} (t ∣ Z_{i j}, ω_{i}) = ω_{i} λ_{0} (t) \exp (β^{T} Z_{i j}) i = 1, . . ., 2 n; j = 1, . . ., m_{i} .

(2)

Here β is a p-vector of unknown regression coefficients, and λ₀ is a conditional baseline hazard of unspecified form. The above model implies that the proband and the relatives have a common conditional baseline hazard function λ₀, and that all the dependence between the proband and the relatives in a given family is due to the frailty factor ω_i. This is the standard formulation of the frailty survival model; see, for example, [9], [10], [17] and [26]. Hence, the conditional hazard function of a relative, given Z_ij, ω_i and the proband data, is a function of only Z_ij and ω_i. The random variable ω_i is assumed to have a density f(ω) ≡ f(ω; θ), where θ is an unknown parameter. For simplicity, we assume that θ is a scalar, though the vector case could be developed in a similar manner.

We put γ = (β^T , θ)^T, and let γ^o = (β^oT, θ^o)^T denote the true value of γ. The objective is to estimate γ and the cumulative baseline hazard $Λ_{0} (t) = \int_{0}^{t} λ_{0} (u) d u$ . Let $Λ_{0}^{o} (t)$ denote the true value of Λ₀. Further, let δ_iR = (δ_i1, . . . , δ_{im_i}), T_iR = (T_i1, . . . , T_{im_i}), and $Z_{i R} = {(Z_{i 1}^{T}, \dots, Z_{i m_{i}}^{T})}^{T}$ .

We assume that Z_ij is bounded, and that the parameter γ lies in a compact subset $G$ of $\mathbb{R}^{p+1}$ containing an open neighborhood of γ^o. These two assumptions imply that β^TZ_ij is bounded. In addition, we assume the following.

Conditional on $\{Z_{i j}\}_{j = 0}^{m_{i}}$ and ω_i, the censoring times are independent and noninformative for ω_i and (β, Λ₀). In addition, the frailty ω_i is independent of $\{Z_{i j}\}_{j = 0}^{m_{i}}$ .
The effect of the covariates on the observed time is subject-specific, i.e. Pr(T_ij, δ_ij|Z_i0, Z_iR, ω_i) = Pr(T_ij, δ_ij|Z_ij, ω_i). This implies Pr(T_ij, δ_ij|Z_i0, Z_iR) = Pr(T_ij, δ_ij|Z_ij) even without frailty.

A number of additional technical assumptions are listed in Section 9.

Following [31] and [18], the likelihood function for the data can be written as

\begin{matrix} L & = \prod_{i = 1}^{2 n} f (T_{i R}, δ_{i R}, Z_{i R}, Z_{i 0} ∣ T_{i 0}, δ_{i 0}) \\ & = \prod_{i = 1}^{2 n} f (T_{i R}, δ_{i R} ∣ Z_{i R}, Z_{i 0}, T_{i 0}, δ_{i 0}) \times f (Z_{i R} ∣ Z_{i 0}) \times f (Z_{i 0} ∣ T_{i 0}, δ_{i 0}) . \end{matrix}

(3)

Since f(Z_iR|Z_i0) does not depend on the parameters of interest (β, Λ₀, θ), this term will be ignored. In the following subsections we consider the other two terms in (3).

2.1. The likelihood for the proband data

To account for the matching of age of onset, as in [31] and [18], the likelihood function of the proband data, $\prod_{i = 1}^{2 n} f (Z_{i 0} ∣ T_{i 0}, δ_{i 0})$ , is constructed based on the retrospective likelihood for the standard matched case-control study [29]. We express this likelihood in terms of the marginal survival function S_i0(t) = Pr(T_i0 > t|Z_i0) = ∫ Pr(T_i0 > t|Z_i0,ω)f(ω)dω. In our setting we have n one-to-one matched sets. Based on the marginal survivor function, the marginal hazard function can be written as

λ_{i 0} (t ∣ Z_{i 0}) = λ_{0} (t) \exp (β^{T} Z_{i 0}) \frac{μ_{1 i} (t; γ, Λ_{0})}{μ_{0 i} (t; γ, Λ_{0})},

where

μ_{k i} (t; γ, Λ_{0}) = \int ω^{k} \exp \{- ω H_{i 0} (t)\} f (ω) d ω, \quad k = 0, 1, 2,

and H_i0(t) = Λ₀(t)exp(β^TZ_i0). We arrange the notation so that the first n families are the case families and the rth case family, r = 1, . . . , n, is matched with the (n + r)th control family. The likelihood for the proband data is then replaced by the following conditional likelihood:

L^{(1)} = \prod_{r = 1}^{n} \frac{\exp (β^{T} Z_{r 0}) ξ_{10 r} (T_{r 0}; γ, Λ_{0})}{\sum_{j \in \{r, n + r\}} \exp (β^{T} Z_{j 0}) ξ_{10 j} (T_{j 0}; γ, Λ_{0})},

(4)

where ξ_kk’i(t; γ, Λ₀) = μ_ki(t; γ, Λ₀)/μ_k’i(t; γ, Λ₀) for k, k’ = 0, 1, 2. Given (4), the likelihood score functions $U_{l}^{(1)} (γ, Λ_{0}) = \partial \log L^{(1)} ∕ \partial β_{l}$ , l = 1, . . . , p, and $U_{p + 1}^{(1)} (γ, Λ_{0}) = \partial \log L^{(1)} ∕ \partial θ$ can be obtained by straightforward differentiation. The detailed formulas are presented in the expanded paper.

Under the gamma frailty model, we have μ_1i(t; γ, Λ₀)/μ_0i(t; γ, Λ₀) = {θH_i0(t) + 1}^-1, and so the likelihood function (4) corresponds to that presented in [18] in the case of one-to-one matching. Extensions to matching with multiple cases or multiple controls are straightforward; see, e.g., [4].
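As a quick numerical check of the gamma-frailty identity above, the following minimal sketch evaluates μ_k in closed form for a gamma frailty with expectation 1 and variance θ, and compares μ_1/μ_0 with {θH + 1}^-1. The function name `gamma_mu` and the values of θ and H are illustrative assumptions, not part of the original analysis.

```python
import math

def gamma_mu(k, h, theta):
    """mu_k = integral of w^k * exp(-w*h) * f(w) dw, where f is the gamma
    density with mean 1 and variance theta (shape a = 1/theta, rate a)."""
    a = 1.0 / theta
    log_mu = (math.lgamma(a + k) - math.lgamma(a)
              + a * math.log(a) - (a + k) * math.log(a + h))
    return math.exp(log_mu)

theta, h = 2.0, 0.35   # illustrative values only
ratio = gamma_mu(1, h, theta) / gamma_mu(0, h, theta)
print(ratio, 1.0 / (theta * h + 1.0))   # the two numbers should agree
```

Here μ_0 is the marginal survival function evaluated at H, so the same helper also gives S_i0(t) = {1 + θH_i0(t)}^(-1/θ) under this frailty assumption.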

2.2. The likelihood for the data from the relatives

Let N_ij(t) = δ_ijI(T_ij ≤ t), j = 1, . . . , m_i, $N_{i .} (t) = \sum_{j = 1}^{m_{i}} N_{i j} (t)$ , H_ij(t) = Λ₀(T_ij ∧ t) exp(β^TZ_ij), j = 1, . . . , m_i, and $H_{i .} (t) = \sum_{j = 1}^{m_{i}} H_{i j} (t)$ , and let τ be the maximum followup time. The likelihood of the data from the relatives then can be written as

\begin{matrix} L^{(2)} & = \prod_{i = 1}^{2 n} \int \prod_{j = 1}^{m_{i}} \{λ_{i j} (T_{i j} ∣ T_{i 0}, δ_{i 0}, Z_{i 0}, Z_{i j}, ω)\}^{δ_{i j}} S_{i j} (T_{i j} ∣ T_{i 0}, δ_{i 0}, Z_{i 0}, Z_{i j}, ω) \times f (ω ∣ T_{i 0}, δ_{i 0}, Z_{i 0}) d ω \\ & = \prod_{i = 1}^{2 n} \prod_{j = 1}^{m_{i}} \{λ_{0} (T_{i j}) e^{β^{T} Z_{i j}}\}^{δ_{i j}} \prod_{i = 1}^{2 n} \int ω^{N_{i .} (τ)} e^{- ω H_{i .} (τ)} f (ω ∣ T_{i 0}, δ_{i 0}, Z_{i 0}) d ω . \end{matrix}

(5)

Here, by a Bayes theorem argument,

f (ω ∣ T_{i 0}, δ_{i 0}, Z_{i 0}) = \frac{ω^{δ_{i 0}} \exp (- ω Λ_{0} (T_{i 0}) e^{β^{T} Z_{i 0}}) f (ω)}{\int {\tilde{ω}}^{δ_{i 0}} \exp (- \tilde{ω} Λ_{0} (T_{i 0}) e^{β^{T} Z_{i 0}}) f (\tilde{ω}) d \tilde{ω}} .

(6)
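For the gamma frailty model with expectation 1 and variance θ, (6) reduces to a conjugate form. The display below is a worked special case added for illustration, valid only under that gamma assumption:

```latex
f(\omega \mid T_{i0}, \delta_{i0}, Z_{i0})
  \propto \omega^{\theta^{-1} + \delta_{i0} - 1}
  \exp[-\omega \{\theta^{-1} + H_{i0}(T_{i0})\}],
\qquad
E(\omega_i \mid T_{i0}, \delta_{i0}, Z_{i0})
  = \frac{\theta^{-1} + \delta_{i0}}{\theta^{-1} + H_{i0}(T_{i0})} .
```

That is, given the proband data, the frailty is again gamma distributed, with shape θ^-1 + δ_i0 and rate θ^-1 + H_i0(T_i0), where H_i0(t) = Λ₀(t)exp(β^TZ_i0).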

Given (5), the likelihood score functions $U_{l}^{(2)} (γ, Λ_{0}) = \partial \log L^{(2)} ∕ \partial β_{l}$ , l = 1, . . . , p, and $U_{p + 1}^{(2)} (γ, Λ_{0}) = \partial \log L^{(2)} ∕ \partial θ$ can be obtained by straightforward differentiation. The detailed formulas are presented in the expanded paper.

3. The Proposed Approach

We focus first on estimating the baseline cumulative hazard function Λ₀(t). Let Y_ij(t) = I(T_ij ≥ t), and let $F_{t}$ denote the σ-algebra generated by (T_i0, δ_i0, Z_i0) plus the entire observed history of the relatives up to time t:

F_{t} = σ (T_{i 0}, δ_{i 0}, Z_{i 0}, N_{i j} (u), Y_{i j} (u), Z_{i j}; i = 1, . . ., 2 n; j = 1, . . ., m_{i}; 0 \leq u \leq t) .

It is worth noting that the observational times T_i0 for probands can be greater than time t and thus the filtration $F_{t}$ may include probands’ failure times or censoring times that are beyond t, a feature that is unique for case-control family data. In regard to the relatives, however, $F_{t}$ includes only information up to time t.

By the innovation theorem ([3], Thm. 3.4), the stochastic intensity process for N_ij(t), i = 1, . . . , 2n, j = 1, . . . , m_i, with respect to $F_{t}$ is given as follows [11], [28]:

λ_{0} (t) \exp (β^{T} Z_{i j}) Y_{i j} (t) ψ_{i} (t -, γ, Λ_{0}),

(7)

where, using (6),

\begin{matrix} ψ_{i} (t, γ, Λ_{0}) & = E [ω_{i} ∣ F_{t}] = \frac{\int ω^{N_{i .} (t) + 1} \exp (- ω H_{i .} (t)) f (ω ∣ T_{i 0}, δ_{i 0}, Z_{i 0}) d ω}{\int ω^{N_{i .} (t)} \exp (- ω H_{i .} (t)) f (ω ∣ T_{i 0}, δ_{i 0}, Z_{i 0}) d ω} \\ & = \frac{\int ω^{N_{i .} (t) + 1 + δ_{i 0}} \exp (- ω \{H_{i .} (t) + H_{i 0} (T_{i 0})\}) f (ω) d ω}{\int ω^{N_{i .} (t) + δ_{i 0}} \exp (- ω \{H_{i .} (t) + H_{i 0} (T_{i 0})\}) f (ω) d ω} . \end{matrix}

Define (for 0 ≤ r ≤ m and h ≥ 0)

ψ^{*} (r, h) = \frac{\int ω^{r + 1} e^{- h ω} f (ω) d ω}{\int ω^{r} e^{- h ω} f (ω) d ω} .

(8)

Some salient properties of ψ*(r, h) are noted in Sec. 9.1. With this definition, we have ψ_i(t, γ, Λ₀) = ψ*(N_i.(t) + δ_i0, H_i.(t) + H_i0(T_i0)).
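To make the definition (8) concrete, the following sketch approximates ψ*(r, h) by a simple midpoint rule for an arbitrary frailty density and compares it with the gamma closed form (θ^-1 + r)/(θ^-1 + h). The function names, the integration range, and the parameter values are assumptions chosen for illustration; a production implementation would use a more careful quadrature, particularly for densities that are unbounded near zero.

```python
import math

def psi_star(r, h, density, upper=60.0, steps=60000):
    """Midpoint-rule approximation of psi*(r, h) in (8); the frailty
    density is integrated over (0, upper)."""
    width = upper / steps
    num = den = 0.0
    for k in range(steps):
        w = (k + 0.5) * width
        base = math.exp(-h * w) * density(w)
        num += w ** (r + 1) * base
        den += w ** r * base
    return num / den

def gamma_density(theta):
    """Gamma density with mean 1 and variance theta."""
    a = 1.0 / theta
    log_norm = a * math.log(a) - math.lgamma(a)
    return lambda w: math.exp(log_norm + (a - 1.0) * math.log(w) - a * w)

theta, r, h = 0.5, 1, 0.7   # illustrative values only
numeric = psi_star(r, h, gamma_density(theta))
closed = (1.0 / theta + r) / (1.0 / theta + h)
print(numeric, closed)
```

Any other frailty density with finite moments can be passed as `density` in the same way.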

Let τ_g, g = 1, . . . , G, denote the gth ordered failure time of the relatives and assume that d_g failures were observed at time τ_g. In theory, since we are dealing with continuous survival distributions, d_g = 1 for all g, but we express the following estimators in a form that allows for a modest level of ties. A Breslow-type estimator of the cumulative baseline hazard function, with a jump at each observed failure time among the relatives, can be formulated in a natural way similarly to Shih and Chatterjee [31], with the g-th jump given by

\frac{d_{g}}{\sum_{i = 1}^{2 n} ψ_{i} (τ_{g - 1}, γ, Λ_{0}) \sum_{j = 1}^{m_{i}} Y_{i j} (τ_{g}) \exp (β^{T} Z_{i j})} .

(9)

However, ψ_i(t, γ, Λ₀) could be a function of Λ₀(T_i0) and T_i0 could be greater than t. Consequently, the above Breslow formula for the jump in the baseline hazard estimator at time t will often involve values of Λ₀ for times beyond time t. For example, under the gamma frailty model with expectation 1 and variance θ, ψ_i(t, γ, Λ₀) = {θ^-1 + N_i.(t) + δ_i0}{θ^-1 + H_i.(t) + H_i0(T_i0)}^-1. An iterative procedure is thus required to compute the estimator. In addition, because of this estimator’s complicated structure, its asymptotic properties have not been established.

We propose instead to estimate the baseline hazard function using a noniterative two-stage procedure. Let Λ_max be some known (possibly large) upper bound for $Λ_{0}^{o} (t)$ . Define ψ‾(r, h) = ψ*(r, h∧h_max), with h_max = me^νΛ_max, where ν is an upper bound on |β^TZ_ij|. Further, define ψ‾_i(t, γ, Λ) = ψ‾(N_i(t), H_i.(t, γ, Λ)). The first-stage estimator is then defined as a step function whose g-th jump is given by

Δ {\tilde{Λ}}_{0} (τ_{g}) = \frac{d_{g}^{*}}{\sum_{i = 1}^{2 n} I (T_{i 0} < τ_{g}) {\bar{ψ}}_{i} (τ_{g - 1}, γ, {\tilde{Λ}}_{0}) \sum_{j = 1}^{m_{i}} Y_{i j} (τ_{g}) \exp (β^{T} Z_{i j})},

(10)

with

d_{g}^{*} = \sum_{i = 1}^{2 n} I (T_{i 0} < τ_{g}) \sum_{j = 1}^{m_{i}} d N_{i j} (τ_{g}) .

The formula (10) is of the same form as (9), with the following changes: (a) ψ_i is replaced by ψ‾_i for technical reasons, and (b) more substantively, in computing the jump at each failure time τ_g, we include only relatives whose proband’s observation time is less than τ_g. We thereby avoid the problem with (9) that was described above, and hence avoid the need for an iterative optimization process. Since (10) excludes some of the available data, these benefits come at the cost of some loss of efficiency. We therefore follow up with a second stage in order to recoup efficiency.

The second-stage estimator is defined as a step function whose g-th jump is given by

Δ {\hat{Λ}}_{0} (τ_{g}) = \frac{d_{g}}{\sum_{i = 1}^{2 n} {\tilde{ψ}}_{i} (τ_{g - 1}, γ) \sum_{j = 1}^{m_{i}} Y_{i j} (τ_{g}) \exp (β^{T} Z_{i j})},

(11)

where ${\tilde{ψ}}_{i} (t, γ)$ is defined analogously to ψ‾_i(t, γ, Λ₀), with Λ₀(T_i0) replaced by ${\tilde{Λ}}_{0} (T_{i 0})$ if T_i0 ≥ t and by Λ̂₀(T_i0) otherwise. The large-sample properties of Λ̂₀(t) will be determined by those of ${\tilde{Λ}}_{0} (t)$ . The estimator ${\tilde{Λ}}_{0} (t, γ)$ is not necessarily bounded by Λ_max, but if desired, we can replace it by $\min \{{\tilde{Λ}}_{0} (t, γ), Λ_{\max}\}$ without affecting the asymptotics.
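The two-stage construction in (10) and (11) can be sketched as follows for fixed γ = (β, θ). This is a minimal illustration under several assumptions that are not in the text: a gamma frailty with expectation 1 and variance θ, a single scalar covariate, ψ_i taken in the form ψ*(N_i.(t) + δ_i0, H_i.(t) + H_i0(T_i0)) without the truncation at h_max, and a hypothetical data layout in which each family is a dictionary holding a proband triple and a list of relative triples (time, failure indicator, covariate).

```python
import math
from bisect import bisect_right

def step_value(times, cum, x):
    """Right-continuous step function: cum[k] is its value from times[k] on.
    Only the first len(cum) jump times are used, so a partially built
    estimate can be evaluated while it is being constructed."""
    k = bisect_right(times, x, 0, len(cum))
    return cum[k - 1] if k > 0 else 0.0

def psi_gamma(fam, t, beta, theta, lam_rel, lam_pro):
    """Gamma-frailty psi_i(t) = (1/theta + N_i.(t) + delta_i0)
    / (1/theta + H_i.(t) + H_i0(T_i0))."""
    t0, d0, z0 = fam["proband"]
    events = d0 + sum(d for (tt, d, z) in fam["relatives"] if tt <= t)
    h = lam_pro(t0) * math.exp(beta * z0)
    h += sum(lam_rel(min(tt, t)) * math.exp(beta * z)
             for (tt, d, z) in fam["relatives"])
    return (1.0 / theta + events) / (1.0 / theta + h)

def two_stage_baseline(families, beta, theta):
    """Return (failure times, first-stage values, second-stage values)."""
    taus = sorted({tt for f in families for (tt, d, z) in f["relatives"] if d == 1})
    tilde, hat = [], []
    for cum, first_stage in ((tilde, True), (hat, False)):
        total = 0.0
        for g, tau in enumerate(taus):
            t_prev = taus[g - 1] if g > 0 else 0.0
            lam_cur = lambda x, c=cum: step_value(taus, c, x)
            lam_tilde = lambda x: step_value(taus, tilde, x)
            num = den = 0.0
            for f in families:
                t0 = f["proband"][0]
                if first_stage and t0 >= tau:
                    continue  # (10) uses only families with T_i0 < tau_g
                # Second stage: first-stage value at T_i0 when T_i0 >= t_prev.
                use_cur = first_stage or t0 < t_prev
                psi = psi_gamma(f, t_prev, beta, theta, lam_cur,
                                lam_cur if use_cur else lam_tilde)
                risk = sum(math.exp(beta * z)
                           for (tt, d, z) in f["relatives"] if tt >= tau)
                num += sum(d for (tt, d, z) in f["relatives"] if tt == tau)
                den += psi * risk
            total += num / den if den > 0 else 0.0
            cum.append(total)
    return taus, tilde, hat

# Toy data (hypothetical): proband and relatives as (time, delta, covariate).
families = [
    {"proband": (0.3, 1, 0.2), "relatives": [(0.5, 1, 0.1), (0.9, 0, 0.7)]},
    {"proband": (0.6, 0, 0.5), "relatives": [(0.4, 0, 0.3), (0.8, 1, 0.9)]},
    {"proband": (0.2, 1, 0.8), "relatives": [(0.7, 1, 0.4)]},
]
print(two_stage_baseline(families, beta=0.5, theta=2.0))
```

Each stage is a single forward pass over the ordered failure times, which is the non-iterative feature emphasized above. When θ is very small the frailty is nearly degenerate at 1 and, with β = 0, the second stage reduces to the usual Nelson-Aalen increments.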

For estimating (β, θ) we use a pseudo-likelihood approach: in the score functions based on L⁽¹⁾ and L⁽²⁾, we replace the unknown Λ₀ by Λ̂₀. Thus, the score function corresponding to β_l (for l = 1, . . . , p) is given by $U_{l} (γ, {\hat{Λ}}_{0}) = n^{- 1} \{U_{l}^{(1)} (γ, {\hat{Λ}}_{0}) + U_{l}^{(2)} (γ, {\hat{Λ}}_{0})\}$ , and the estimating function for θ is given by $U_{p + 1} (γ, {\hat{Λ}}_{0}) = n^{- 1} \{U_{p + 1}^{(1)} (γ, {\hat{Λ}}_{0}) + U_{p + 1}^{(2)} (γ, {\hat{Λ}}_{0})\}$ . To summarize, our proposed estimation procedure is as follows. (1) Provide an initial value for γ. (2) For the given value of γ, estimate Λ₀ using (10) and (11). (3) For the given value of Λ₀, estimate γ. (4) Repeat Steps 2 and 3 until convergence is reached with respect to Λ̂₀ and γ̂. Hence, instead of two nested iterative processes (an outer iteration between γ̂ and Λ̂₀ and, within each outer step, an inner iteration to compute the estimate of Λ₀ itself, as in [31]), we propose a single iterative process between γ̂ and Λ̂₀. By eliminating the iterative process for estimating the baseline hazard function, we are able to provide the asymptotic theory of our estimators, in contrast to [18] and [31].
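The structure of Steps 1 to 4 can be sketched as a single alternating loop. The function below is a schematic only: `estimate_baseline` stands in for the non-iterative computation of Λ̂₀ from (10) and (11), `update_gamma` stands in for solving the score equations at a fixed Λ̂₀, and γ is treated as a scalar for simplicity. All names, the stopping rule, and the toy maps used in the demonstration are assumptions introduced here.

```python
def alternate(gamma0, estimate_baseline, update_gamma, tol=1e-8, max_iter=200):
    """Alternate between the baseline estimate and gamma until gamma stabilizes.

    Because the baseline estimate is a deterministic function of gamma,
    convergence of gamma implies convergence of the baseline estimate.
    """
    gamma = gamma0                                   # Step 1
    for _ in range(max_iter):
        baseline = estimate_baseline(gamma)          # Step 2
        new_gamma = update_gamma(baseline, gamma)    # Step 3
        if abs(new_gamma - gamma) < tol:             # Step 4
            return new_gamma, estimate_baseline(new_gamma)
        gamma = new_gamma
    raise RuntimeError("no convergence within max_iter iterations")

# Toy stand-ins (not the actual estimating equations): fixed point gamma = 2.
gamma_hat, baseline_hat = alternate(
    0.0,
    estimate_baseline=lambda g: g / 2.0,
    update_gamma=lambda lam, g: lam + 1.0,
)
print(gamma_hat, baseline_hat)
```

The point of the design is that `estimate_baseline` contains no inner loop of its own, so only this one outer iteration is needed.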

4. Asymptotic Properties

We show that γ̂ is a consistent estimator of γ^o and that √n(γ̂ - γ^o) is asymptotically mean-zero multivariate normal. In this section, we present a broad outline of the argument. In Section 9, we provide the details of the proofs, including a detailed list of the technical conditions assumed. The arguments are patterned after those of [14] and [36], but with considerable expansion, as will be elaborated in Section 9.

Consistency is shown through the following steps.

Claim A1. ${\tilde{Λ}}_{0} (t, γ)$ converges in pr. to some function $Λ_{0}^{*} (t, γ)$ uniformly in t and γ. The function $Λ_{0}^{*} (t, γ)$ satisfies $Λ_{0}^{*} (t, γ^{o}) = Λ_{0}^{o} (t)$ .

Claim A2. Λ̂₀(t, γ) converges in pr. to some function Λ₀(t, γ) uniformly in t and γ. The function Λ₀(t, γ) satisfies $Λ_{0} (t, γ^{o}) = Λ_{0}^{o} (t)$ .

Claim B. U(γ, Λ̂₀(·, γ)) converges in pr. uniformly in t and γ to a limit u(γ, Λ₀(·, γ)).

Claim C. There exists a unique consistent (in pr.) root to U(γ̂, Λ̂₀(·, γ̂)) = 0.

It should be emphasized that in Claims A1 and A2 the limits coincide at the true parameter value γ^o. The proofs of Claims A1, A2, and B involve empirical process and function-space compactness arguments, while Claim C is shown using Foutz’s [8] theorem on consistency of maximum likelihood type estimators.

Asymptotic normality is shown by decomposing U(γ̂, Λ̂₀(·, γ̂)) = 0 as

U (γ^{o}, Λ_{0}^{o}) + [U (γ^{o}, {\hat{Λ}}_{0} (\cdot, γ^{o})) - U (γ^{o}, Λ_{0}^{o})] + [U (\hat{γ}, {\hat{Λ}}_{0} (\cdot, \hat{γ})) - U (γ^{o}, {\hat{Λ}}_{0} (\cdot, γ^{o}))] = 0 .

In Section 9 we analyze each of the above three terms and prove that √n(γ̂-γ^o) has an asymptotic mean-zero multivariate normal distribution. Although it is possible to develop a consistent closed-form sandwich estimator for the asymptotic covariance matrix, we do not present this estimator because it is too complicated to be practically useful. Instead, as discussed in Section 6, we recommend bootstrap standard-error estimates.

In Section 9 we also show the uniform consistency and weak convergence of the cumulative baseline hazard function estimator Λ̂₀(t, γ̂). Such results were not presented in [14] or [36].

5. Extension to Restricted Sampling of Probands

A key assumption in our procedure for estimating Λ₀ is that the support of the probands’ observation times and that of relatives’ observation times have the same lower limit, which is designated (without loss of generality) as time zero. In some applications, however, the probands’ observed times are restricted to some range [s₀, s₁] with s₀ > 0. For example, [25] presents a multi-center case-control breast cancer study in which the ages of cases and controls are restricted to 35-64 years. In a design of this form, where the probands’ observed times are left-restricted by s₀ and the relatives’ failure times are unrestricted, Λ₀ will be underestimated by our two-stage procedure. But this bias can be easily corrected by first estimating Λ₀(s₀).

We present here the resulting three-stage estimator for the left-restricted design. Let $Δ {\tilde{Λ}}_{0} \{τ_{g}, Λ_{0} (s_{0})\}$ and ΔΛ̂₀{τ_g, Λ₀(s₀)} be defined analogously to $Δ {\tilde{Λ}}_{0} (τ_{g})$ and ΔΛ̂₀(τ_g), with Λ₀(T_i0) = Λ₀(s₀) + Σ_{τ_g∈[s₀, T_i0]}ΔΛ₀(τ_g). The estimator Λ̂₀(s₀) is defined as the root of

\sum_{τ_{g} \in (0, s_{0})} Δ {\hat{Λ}}_{0} \{τ_{g}, Λ_{0} (s_{0})\} - Λ_{0} (s_{0}) = 0 .

(12)

The root can be found by simple univariate Newton-Raphson iteration. This completes the first stage. The second stage involves calculating $Δ {\tilde{Λ}}_{0} \{τ_{g}, {\hat{Λ}}_{0} (s_{0})\}$ , g = 1, . . . , G, using the formula (10). In the third stage, we use the results of the second stage and the formula (11) to calculate the final estimate ΔΛ̂₀(τ_g), g = 1, . . . , G. In applying (11), we replace Λ₀(T_i0) by ${\tilde{Λ}}_{0} \{T_{i 0}, {\hat{Λ}}_{0} (s_{0})\}$ if T_i0 ≥ τ_g and by Λ̂₀(T_i0) otherwise.

In Section 6 below, we present simulation results for this estimator. In theory, the asymptotic properties of the three-stage procedure could be worked out via an extension of the arguments for the two-stage procedure, but the algebra becomes very complicated. We hope to develop asymptotic theory for the left-restricted design in future work.

6. Simulation Results - Gamma Frailty

We have performed a simulation study to evaluate the finite sample performance of the proposed method and compare it with existing methods. One of the most extensively used frailty distributions is the gamma distribution: customarily, the gamma distribution with expectation 1 and variance θ. Under this model, the variance parameter θ quantifies the heterogeneity of risk among families, with larger values of θ corresponding to stronger within-family dependence. In addition, the gamma frailty model can be re-expressed in terms of the Clayton-Oakes copula-type model [6, 27]. Moreover, the cross-ratio, introduced by Oakes [27] as a local measure of association between survival times, is constant on the support of the failure time distribution and equals 1+θ. Finally, the gamma frailty model is convenient mathematically, because it admits a closed-form representation of the marginal survival distributions. These features make the gamma frailty model very popular. We therefore chose the gamma frailty model as the framework for our simulation study.
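The link between the gamma frailty model and the Clayton-Oakes form can be made explicit for a pair of relatives. The display below is a worked illustration added here, writing H_j = Λ₀(t_j)exp(β^TZ_j) for j = 1, 2 and assuming a gamma frailty with expectation 1 and variance θ:

```latex
S(t_1, t_2) = E[\exp\{-\omega (H_1 + H_2)\}] = \{1 + \theta (H_1 + H_2)\}^{-1/\theta},
\qquad
S_j(t_j) = (1 + \theta H_j)^{-1/\theta},
\\
1 + \theta H_j = S_j(t_j)^{-\theta}
\;\Longrightarrow\;
S(t_1, t_2) = \{S_1(t_1)^{-\theta} + S_2(t_2)^{-\theta} - 1\}^{-1/\theta} .
```

The last expression depends on the conditional hazards only through the marginal survival functions, which is the copula representation referred to above.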

Simulation results are based on 500 control probands matched to 500 case probands, with one relative sampled for each proband. We considered a single U[0, 1] distributed covariate with β = ln(2), Λ₀(t) = t, θ = 2, and a U[0, 1] censoring variable, yielding a censoring rate among the relatives of approximately 60%. In Table 1 we compare the following three estimates: the proposed estimate with the two-stage procedure for Λ₀, the estimate of [18], and a modified version of the estimate of [31], with their method adapted to the gamma frailty model. Results are based on 500 simulated data sets. The efficiency difference between our two-stage estimator and that of Shih and Chatterjee is very small.
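For readers who wish to reproduce the data-generating model, the sketch below draws the members of one family from the population model with Λ₀(t) = t, a U[0, 1] covariate, U[0, 1] censoring, and a gamma frailty with expectation 1 and variance θ. It is only an illustration: the function name and data layout are hypothetical, and the retrospective selection of case and control probands, which the text does not spell out, is not shown.

```python
import math
import random

def simulate_family(beta, theta, n_members, rng):
    """Draw (frailty, members) for one family; each member is a triple
    (observed time, failure indicator, covariate)."""
    # gammavariate(shape, scale): mean = shape*scale = 1, variance = theta.
    omega = rng.gammavariate(1.0 / theta, theta)
    members = []
    for _ in range(n_members):
        z = rng.random()                      # U[0, 1] covariate
        rate = omega * math.exp(beta * z)     # hazard is constant: Lambda_0(t) = t
        e = rng.expovariate(1.0)
        t_fail = e / rate if rate > 0 else math.inf
        c = rng.random()                      # U[0, 1] censoring time
        members.append((min(t_fail, c), int(t_fail <= c), z))
    return omega, members

rng = random.Random(2009)                     # arbitrary seed
omega, members = simulate_family(math.log(2.0), 2.0, 2, rng)
print(omega, members)
```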

Table 1.

Simulation results: 500 control probands matched with 500 case probands; one relative for each proband; β = 0.693, Λ₀(t) = t, θ = 2.0, 500 samples

	Proposed Method		Hsu et al.		Shih and Chatterjee
	Mean	Empirical Standard Error	Mean	Empirical Standard Error	Mean	Empirical Standard Error
β̂	0.706	0.197	0.697	0.201	0.698	0.182
θ̂	2.003	0.312	1.986	0.302	1.992	0.303
Λ̂₀(0.2)	0.201	0.034	0.204	0.030	0.202	0.029
Λ̂₀(0.4)	0.402	0.063	0.407	0.058	0.403	0.054
Λ̂₀(0.6)	0.603	0.095	0.612	0.090	0.605	0.084
Λ̂₀(0.8)	0.809	0.136	0.820	0.131	0.811	0.122


For our method, we also performed simulations for two additional settings. In both settings, we took β = 0 and θ = 3. The first setting involved a censoring distribution of U[0, 4] and a covariate with a U[0, 4] distribution; the second involved a censoring distribution of U[0, 0.1] and a covariate with a U[0, 1] distribution. The respective overall censoring rates in these two settings were approximately 30% and 90%. To construct confidence intervals, we used a bootstrap approach. In the setting of censored survival data, the usual nonparametric bootstrap is problematic because it leads to a substantial proportion of tied survival times. We therefore used the weighted bootstrap approach of [22]. For the weighted bootstrap, a sample of 2n independent and identically distributed weights from the unit exponential distribution was generated for each bootstrap sample. Let ξ₁, . . . , ξ_2n be the standardized weights after dividing each weight by the average weight. Then, in the estimating functions, for any given function h the empirical mean $n^{- 1} \sum_{i = 1}^{2 n} h (T_{i}, δ_{i}, Z_{i})$ is replaced by its corresponding weighted empirical mean $n^{- 1} \sum_{i = 1}^{2 n} ξ_{i} h (T_{i}, δ_{i}, Z_{i})$ . This weighted bootstrap procedure gives valid inference for all parameters under right-censored univariate failure times [22].
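The weighting step described above can be sketched in a few lines. The helper names are hypothetical, and the per-family contributions h(T_i, δ_i, Z_i) are represented by a plain list of numbers; the sketch shows only how the standardized unit-exponential weights are drawn and applied, not the re-estimation that follows for each bootstrap sample.

```python
import random

def standardized_exp_weights(size, rng):
    """Draw `size` unit-exponential weights and divide by their average,
    so that the standardized weights have mean exactly 1."""
    raw = [rng.expovariate(1.0) for _ in range(size)]
    mean = sum(raw) / size
    return [w / mean for w in raw]

def weighted_mean(values, weights, n):
    """n^{-1} * sum over the 2n families of xi_i * h_i."""
    return sum(w * v for w, v in zip(weights, values)) / n

rng = random.Random(22)                  # arbitrary seed
n = 5                                    # number of matched sets (toy value)
xi = standardized_exp_weights(2 * n, rng)
h_values = [0.3, 1.2, 0.7, 0.9, 1.5, 0.4, 0.8, 1.1, 0.6, 1.0]   # toy contributions
print(weighted_mean(h_values, xi, n))
```

One set of weights is drawn per bootstrap sample, and the same weights are used in every estimating function, including those defining the baseline hazard estimator.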

Results for the two-stage procedure for Λ₀ are presented in Table 2 for various levels of censoring. We present the mean, the empirical standard error, and the coverage rate of the 95% weighted bootstrap confidence interval. The results are based on 50 bootstrap samples for each of the 2000 simulated data sets of each configuration. Our estimates perform well in terms of bias and coverage probability.

Table 2.

Simulation results for the proposed estimators: 500 control probands matched with 500 case probands; one relative for each proband; Λ₀(t) = t; 2000 samples

		β = 0.0			β = 0.693
θ	Estimator	Mean	Empirical Standard Error	Coverage Rate	Mean	Empirical Standard Error	Coverage Rate
90% censoring rate
2.0	β̂	-0.013	0.217	93.5	0.694	0.200	96.0
	θ̂	2.127	0.872	96.0	2.082	0.667	94.8
	Λ̂₀(0.02)	0.020	0.006	94.2	0.020	0.005	95.2
	Λ̂₀(0.04)	0.041	0.010	94.7	0.040	0.010	95.2
	Λ̂₀(0.06)	0.061	0.015	94.8	0.060	0.014	96.1
	Λ̂₀(0.08)	0.081	0.020	95.0	0.080	0.019	96.1
3.0	β̂	-0.025	0.226	91.7	0.689	0.206	95.4
	θ̂	3.126	1.142	94.2	3.172	0.964	95.7
	Λ̂₀(0.02)	0.020	0.005	95.7	0.020	0.005	94.8
	Λ̂₀(0.04)	0.041	0.012	95.8	0.040	0.010	95.9
	Λ̂₀(0.06)	0.062	0.016	96.1	0.060	0.014	96.5
	Λ̂₀(0.08)	0.082	0.021	95.9	0.080	0.019	95.7
30% censoring rate
2.0	β̂	0.007	0.047	95.5	0.703	0.063	96.5
	θ̂	2.013	0.247	95.3	1.993	0.196	95.5
	Λ̂₀(0.2)	0.200	0.037	95.5	0.197	0.045	94.5
	Λ̂₀(0.4)	0.397	0.073	95.0	0.394	0.085	94.0
	Λ̂₀(0.6)	0.596	0.110	95.1	0.591	0.125	94.0
	Λ̂₀(0.8)	0.794	0.147	95.5	0.788	0.166	94.1
3.0	β̂	0.006	0.048	97.3	0.703	0.061	97.2
	θ̂	3.009	0.370	95.3	2.999	0.314	96.0
	Λ̂₀(0.2)	0.200	0.040	94.0	0.197	0.047	94.4
	Λ̂₀(0.4)	0.399	0.078	94.1	0.392	0.091	94.0
	Λ̂₀(0.6)	0.597	0.116	95.0	0.586	0.133	94.9
	Λ̂₀(0.8)	0.796	0.155	95.6	0.792	0.176	95.0


To study the case of left-restricted data, we considered a configuration similar to that of Table 1, but with the probands’ observation times restricted to be > 0.1. In Table 3, we present results for our three-stage estimator as well as for the methods of [18] and [31]. We see that estimating Λ₀(s₀) leads to a modest efficiency loss in Λ̂₀ relative to the other two methods.

Table 3.

Simulation results for left-restricted data: 500 control probands matched with 500 case probands; one relative for each proband; s₀ = 0.1, β = 0.693, Λ₀(t) = t, θ = 2.0, 500 samples

	Proposed Method		Hsu et al.		Shih and Chatterjee
	mean	Empirical Standard Error	mean	Empirical Standard Error	mean	Empirical Standard Error
β ^	0.735	0.214	0.698	0.234	0.694	0.170
θ ^	2.040	0.336	2.080	0.338	2.080	0.337
Λ̂₀(0.2)	0.195	0.049	0.198	0.034	0.198	0.031
Λ̂₀(0.4)	0.392	0.090	0.402	0.068	0.401	0.062
Λ̂₀(0.6)	0.589	0.129	0.604	0.102	0.603	0.092
Λ̂₀(0.8)	0.786	0.172	0.813	0.143	0.810	0.128
Λ̂₀(s₀)	0.098	0.025	-	-	-	-


7. Example

We apply our method to the breast cancer study mentioned in the introduction. Various risk factors were measured on probands and their relatives. For illustrative purposes we consider age at first full-term pregnancy with the relatives of the probands being the mothers. The following analysis is based on 437 breast cancer case probands matched with 437 control probands and a total of 874 mothers. The number of mothers who had breast cancer was 70 among the case families and 35 among the control families. The number of women whose first live birth occurred before age 20 was 142 among the probands and 181 among the mothers. We use the gamma frailty model with expectation 1 and variance θ. Three estimation procedures are considered: our proposed method, the method of Hsu et al. [18], and a modified version of the Shih and Chatterjee [31] estimate. For our proposed method, the two-stage procedure for Λ₀ is used since the age range of the mothers with breast cancer was 20-76 and the age range of the probands was 22-44. Table 4 presents the regression coefficient estimate β̂, the dependency parameter estimate θ̂, and Λ̂₀ at ages 40, 50, 60 and 70 years, along with their respective bootstrap standard errors. The proposed approach and that of Shih and Chatterjee yielded similar dependency estimates with the proposed approach being moderately more efficient. Hsu et al.’s approach gave a slightly lower dependence estimate. The regression coefficient estimates of Hsu et al. and of Shih and Chatterjee are similar, with the latter being slightly more efficient. The proposed approach yielded a slightly lower covariate effect. The cumulative baseline hazard estimates are similar under the three estimation techniques.
The results of all the three methods imply that women who had their first full-term pregnancy before age 20 have a reduced risk of developing breast cancer, supporting the observation that breast cancer risk is reduced by early first full-term pregnancy (e.g. [7], among others). The estimates of the dependency parameter imply that after adjusting for the first full-term pregnancy, there remains a significant dependency between the ages of onset for mothers and daughters with cross ratio (1 + θ) close to 2.

Table 4.

Analysis of a case-control family study of breast cancer

	Proposed Method		Hsu et al.		Shih and Chatterjee
	mean	Bootstrap Standard Error	mean	Bootstrap Standard Error	mean	Bootstrap Standard Error
β ^	-0.440	0.158	-0.484	0.216	-0.476	0.168
θ ^	0.952	0.443	0.889	0.443	0.944	0.460
Λ̂₀(40)	0.005	0.002	0.005	0.002	0.005	0.002
Λ̂₀(50)	0.022	0.006	0.023	0.006	0.023	0.006
Λ̂₀(60)	0.048	0.010	0.051	0.010	0.049	0.010
Λ̂₀(70)	0.091	0.016	0.095	0.016	0.092	0.016
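
To make the estimates in Table 4 easier to interpret, the following Python sketch converts the proposed-method values of β̂ and θ̂ into a hazard ratio and a cross ratio. The Wald-type interval on the log scale is an assumption made here for illustration; the paper reports bootstrap standard errors only, and the helper name `wald_interval` is hypothetical.

```python
import math

def wald_interval(estimate, se, z=1.96):
    """Normal-approximation interval (an assumption of this sketch)."""
    return estimate - z * se, estimate + z * se

# Proposed-method estimates taken from Table 4.
beta_hat, beta_se = -0.440, 0.158
theta_hat = 0.952

# Hazard ratio for first full-term pregnancy before age 20.
hazard_ratio = math.exp(beta_hat)
log_lo, log_hi = wald_interval(beta_hat, beta_se)
hr_interval = (math.exp(log_lo), math.exp(log_hi))

# Cross ratio 1 + theta under the gamma frailty model.
cross_ratio = 1.0 + theta_hat
```

Under these assumptions the hazard ratio is roughly 0.64, with an interval of about 0.47 to 0.88, and the cross ratio is 1.952, in line with the statement that it is close to 2.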


8. Discussion

In this work we have presented a new estimator for matched case-control family study survival data under a frailty model, allowing an arbitrary frailty distribution with finite moments. Rigorous large sample theory has been provided. Simulation results under the popular gamma frailty model indicate that the proposed procedure provides estimates with minimal bias and confidence intervals with the appropriate coverage rate. Moreover, our estimators were seen to be essentially identical in efficiency to estimators based on the more complex approach of [31].

Rigorous large sample theory has been provided for age-unrestricted sampling of cases and controls. For age-restricted sampling, the asymptotic theory could be worked out largely following the arguments for the two-stage estimator but the algebra becomes very complicated. This development is a potential topic for future work.

Having suggested a model with an arbitrary frailty distribution with finite moments, we offer some remarks on how to choose the frailty distribution and the effect of this choice on the parameter estimates. Hougaard [17] provides a comprehensive discussion of the theoretical properties and the fit of the following distributions: gamma, positive stable, power variance function (PVF), inverse Gaussian and lognormal. Hsu et al. [20] show by simulation that the biases in the marginal regression estimates and the marginal hazard function are generally 10% or lower when a gamma frailty distribution is assumed but the frailty distribution is misspecified. This suggests that the gamma frailty model can be a practical choice if the marginal parameters are of primary interest. However, when the dependence function is also of interest, a correct specification of the frailty distribution is crucial. A general diagnostic approach to check the bivariate association structure of clustered failure times is given in [13]. Additional tests and graphical procedures for checking the dependence structure of clustered failure-time data can be found in [5, 12, 30, 32, 33]. These procedures, however, are not directly applicable to the retrospective setting we deal with here; extension of the procedures to this setting will be needed.
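
As a concrete illustration of the gamma frailty model with expectation 1 and variance θ, the following Python sketch generates failure times for two-member families with Λ₀(t) = t. Several choices are assumptions made only for this illustration: the binary covariate, the family size of two, the absence of censoring and of case-control sampling, and the function name `simulate_gamma_frailty_pairs`.

```python
import numpy as np

def simulate_gamma_frailty_pairs(n_families, theta, beta, seed=0):
    """Simulate paired failure times under a shared gamma frailty model.

    Conditional on the frailty omega_i, member j has hazard
    omega_i * exp(beta * Z_ij) * lambda_0(t) with lambda_0(t) = 1,
    so T_ij = E_ij / (omega_i * exp(beta * Z_ij)) with E_ij ~ Exp(1).
    """
    rng = np.random.default_rng(seed)
    # shape 1/theta and scale theta give mean 1 and variance theta
    omega = rng.gamma(shape=1.0 / theta, scale=theta, size=n_families)
    z = rng.integers(0, 2, size=(n_families, 2)).astype(float)
    e = rng.exponential(scale=1.0, size=(n_families, 2))
    t = e / (omega[:, None] * np.exp(beta * z))
    return t, z, omega
```

Replacing the call to `rng.gamma` with another positive distribution of mean 1 gives a simple way to explore misspecification of the frailty distribution.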

9. Asymptotic Theory: Conditions and Proofs

This section presents the technical conditions we assume for the asymptotic results and the proofs of these results. The pattern of the argument is as in [14], [35] and [36], but considerable extension is required, mainly because of the two-stage cumulative baseline hazard function estimator for the case-control family data. A pseudo full likelihood estimation procedure for prospective survival data with a general semiparametric shared frailty model is given by [14] and [35]. A pseudo partial likelihood method for a semiparametric survival model with errors in covariates is provided by [36]. We focus here on the added arguments needed for the present setting, and refer back to [35] and [36] for the other derivations. Briefly, the main extensions required in this work are as follows: (i) In showing consistency of ${\tilde{Λ}}_{0} (\cdot, γ)$ , the proof of Claim A in [36] cannot be applied directly since the quantity

\sum_{i = 1}^{2 n} I (T_{i 0} < t) {\overset{‒}{ψ}}_{i} (t -, γ, Λ) \sum_{j = 1}^{m_{i}} Y_{i j} (t) \exp β^{T} Z_{i j},

which is the denominator of the term for the jumps in ${\tilde{Λ}}_{0}$ , tends to 0 as t → 0. (ii) For the asymptotic normality of $\sqrt{n} (\hat{γ} - γ^{o})$ , we need a workable representation of the baseline hazard function estimators (10) and (11). The approach of [36] cannot be applied directly since ${\tilde{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ involves the “vanishing-denominator” problem mentioned above. For ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ we use the representation of ${\tilde{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ along with a recursive solution only for the relatives’ failure times.

9.1. Assumptions, Background, and Preliminaries

For the asymptotic theory, we make a number of assumptions. Several of these assumptions have already been listed in the main text. Below we list the additional assumptions.

1. There is a maximum follow-up time τ ∈ (0, ∞) with $E [\sum_{j = 1}^{m_{i}} Y_{i j} (τ)] = y^{*} > 0$ for all i.
2. The frailty random variable ω_i has finite moments up to order (m + 2).
3. The baseline hazard function $λ_{0}^{o} (t)$ is bounded over [0, τ] by some fixed (but not necessarily known) constant λ_max.
4. The function f′(w; θ) = (d/dθ)f(w; θ) is absolutely integrable.
5. For any given family, there is a positive probability of at least two failures.
6. Defining $π (s) = E [I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} Y_{i j} (s)]$ , we have
$ξ_{r} (u) \equiv \int_{0}^{u} \frac{λ_{0}^{o} (s)}{π {(s)}^{r}} d s < \infty$ for all u ∈ [0, τ] and r = 1, 2, 3. (13)
This assumption is needed in the analysis of the first-stage estimator. For r = 1, it parallels Assumption (5.4) of [21].
7. The matrix $[(\partial ∕ \partial γ) U (γ, {\hat{Λ}}_{0} (\cdot, γ))] ∣_{γ = γ^{o}}$ is invertible with probability going to 1 as n → ∞. It should be noted that a general proof of invertibility is intractable, but given the data, one can easily check that numerically the matrix is invertible.

With ψ*(r, h) as in (8), we define $ψ_{m i n}^{*} (h) = \min_{0 \leq r \leq m} ψ^{*} (r, h)$ and $ψ_{m a x}^{*} (h) = \max_{0 \leq r \leq m} ψ^{*} (r, h)$ . It is easily seen that $ψ_{m a x}^{*} (h)$ is finite and $ψ_{m i n}^{*} (h)$ is strictly positive. The two lemmas below correspond to Lemmas 1 and 3 of [36].

Lemma 1. The function ψ*(r, h) is decreasing in h. Hence for all $γ \in G$ and all t, $ψ_{i} (γ, Λ, t) \leq ψ_{m a x}^{*} (0)$ and $ψ_{i} (γ, Λ, t) \geq ψ_{m i n}^{*} (m e^{ν} Λ (t))$ .

Lemma 2. For any ∊ > 0, we have $\sup_{s \in [∊, τ]} ∣ {\tilde{Λ}}_{0} (s, γ^{o}) - {\tilde{Λ}}_{0} (s -, γ^{o}) ∣ \to 0$ as n → ∞.
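
As an illustration of Lemma 1, consider the gamma frailty model with expectation 1 and variance θ. The display below is a sketch under the assumption that ψ*(r, h) in (8) is the ratio of the integrals of $w^{r+1} e^{-hw} f(w; θ)$ and $w^{r} e^{-hw} f(w; θ)$; that definition lies outside this section and is not restated here. Under that assumption, ψ* has a closed form that is visibly decreasing in h, with a finite maximum at h = 0 and a strictly positive minimum for every finite h.

```latex
ψ^{*}(r, h)
  = \frac{\int_{0}^{\infty} w^{r+1} e^{-h w} f(w; θ)\, dw}
         {\int_{0}^{\infty} w^{r} e^{-h w} f(w; θ)\, dw}
  = \frac{θ^{-1} + r}{θ^{-1} + h},
\qquad
ψ_{max}^{*}(0) = 1 + m θ,
\qquad
ψ_{min}^{*}(h) = \frac{1}{1 + θ h} > 0 .
```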

9.2. Consistency

As indicated in Sec. 4, the consistency proof proceeds in several stages.

Claim A1. ${\tilde{Λ}}_{0} (t, γ)$ converges in probability to some function $Λ_{0}^{*} (t, γ)$ uniformly in t and γ. The function $Λ_{0}^{*} (t, γ)$ satisfies $Λ_{0}^{*} (t, γ^{o}) = Λ_{0}^{o} (t)$ .

Proof. We can write ${\tilde{Λ}}_{0} (t, γ)$ as

{\tilde{Λ}}_{0} (t, γ) = \int_{0}^{t} \frac{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d N_{i j} (s)}{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, \tilde{Λ}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})} .

(14)

The proof here builds on that of the corresponding Claim A in [36]. The main point needing attention here is the fact that, because of the indicators I(T_i0 < s), the denominator of (14) tends to 0 as s → 0.

Define, in parallel with [36],

Ξ_{n} (t, γ, Λ, ∊) = \int_{∊}^{t} \frac{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d N_{i j} (s)}{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, Λ) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})}

and

Ξ (t, γ, Λ, ∊) = \int_{∊}^{t} \frac{E [I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, Λ_{0}^{o}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{o T} Z_{i j})]}{E [I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, Λ) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})]} λ_{0}^{o} (s) d s .

We write Ξ_n(t, γ, Λ) for Ξ_n(t, γ, Λ, 0) and Ξ(t, γ, Λ) for Ξ(t, γ, Λ, 0). By definition, ${\tilde{Λ}}_{0} (t, γ)$ satisfies the equation ${\tilde{Λ}}_{0} (t, γ) = Ξ_{n} (t, γ, {\tilde{Λ}}_{0} (\cdot, γ))$ .

Remark

In [36], we had Ξ_n(t, γ, Λ) → Ξ(t, γ, Λ) a.s. as n → ∞, uniformly over t ∈ [0, τ], $γ \in G$ , and Λ in a certain set. We could not obtain such a result here; the argument of [2] fails in the neighborhood of zero because of the “vanishing denominator” problem. This is why we give only an in pr. consistency result rather than an a.s. result.

Next, define

q_{γ} (s, Λ) = \frac{E [I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, Λ_{0}^{o}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{o T} Z_{i j})]}{E [I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, Λ) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})]} λ_{0}^{o} (s) .

This function q_γ(s, Λ) has the same properties as its counterpart in [36]: these properties are not interfered with by the insertion of the indicator function I(T_i0 < s). In particular, from Lemma 1 we have

[{\overset{‒}{ψ}}_{i} (s -, γ, Λ_{0}^{o}) ∕ {\overset{‒}{ψ}}_{i} (s -, γ, Λ)] \leq [ψ_{m a x}^{*} (0) ∕ ψ_{m i n}^{*} (h_{m a x})] .

This leads to a bound on q_γ(s, Λ). In addition, the function q_γ(s, Λ) has the following Lipschitz-like property:

∣ q_{γ} (s, Λ_{1}) - q_{γ} (s, Λ_{2}) ∣ \leq K \sup_{0 \leq u \leq s} ∣ Λ_{1} (u) - Λ_{2} (u) ∣ .

Hence, by [15] (Thm. 1.1), the equation Λ(t) = Ξ(t, γ, Λ) has a unique solution, which we denote by $Λ_{0}^{*} (t, γ)$ . The claim then is that ${\tilde{Λ}}_{0} (t, γ)$ converges in pr. (uniformly in t and γ) to $Λ_{0}^{*} (t, γ)$ .

We now define ${\tilde{Λ}}_{0} (t, γ, ∊)$ to be the solution of ${\tilde{Λ}}_{0} (t, γ, ∊) = Ξ_{n} (t, γ, {\tilde{Λ}}_{0} (\cdot, γ), ∊)$ , starting from ${\tilde{Λ}}_{0} (∊, γ, ∊) = 0$ . For t < ∊ we set ${\tilde{Λ}}_{0} (t, γ, ∊) = 0$ . Similarly, we define $Λ_{0}^{*} (t, γ, ∊)$ to be the solution of Λ₀(t, γ, ∊) = Ξ(t, γ, Λ₀(·, γ), ∊), starting from Λ₀(∊, γ, ∊) = 0, and set $Λ_{0}^{*} (t, γ, ∊) = 0$ for t < ∊. An induction argument similar to that in the proof of [15] (Thm. 1.1) shows that $∣ Λ_{0}^{*} (t, γ, ∊) - Λ_{0}^{*} (t, γ) ∣ \leq e^{K} Λ_{0} (∊, γ)$ , where K is the Lipschitz constant for q_γ(s, Λ). We thus have

\sup_{γ \in G, t \in [0, τ]} ∣ Λ_{0}^{*} (t, γ, ∊) - Λ_{0}^{*} (t, γ) ∣ \to 0 as ∊ \to 0 .

(15)

Now, for any given ∊ > 0, there is no “vanishing denominator” problem on the interval [∊, τ]. Hence, the argument in [36] goes through as is, and we get the following result: for any ∊ > 0,

\sup_{γ \in G, t \in [∊, τ]} ∣ {\tilde{Λ}}_{0} (t, γ, ∊) - Λ_{0}^{*} (t, γ, ∊) ∣ \to 0 \text{ as } n \to \infty \text{ a.s.}

(16)

(and hence in probability). In fact, in the supremum above, we can replace [∊, τ] by [0, τ], since ${\tilde{Λ}}_{0} (t, γ, ∊) = Λ_{0}^{*} (t, γ, ∊) = 0$ for t < ∊.

Our aim now is to show that $\sup_{γ \in G, t \in [0, τ]} ∣ {\tilde{Λ}}_{0} (t, γ) - Λ_{0}^{*} (t, γ) ∣ \to 0$ in pr. as n → ∞. Now

\begin{matrix} ∣ {\tilde{Λ}}_{0} (t, γ) - Λ_{0}^{*} (t, γ) ∣ \\ \leq ∣ {\tilde{Λ}}_{0} (t, γ) - {\tilde{Λ}}_{0} (t, γ, ∊) ∣ + ∣ {\tilde{Λ}}_{0} (t, γ, ∊) - Λ_{0}^{*} (t, γ, ∊) ∣ + ∣ Λ_{0}^{*} (t, γ, ∊) - Λ_{0}^{*} (t, γ) ∣ \end{matrix}

(17)

The second and third terms are easily dealt with using (15) and (16). It remains to deal with the first term.

Define

\begin{matrix} C (s, γ, ∊) = \frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, \tilde{Λ} (\cdot, γ, ∊)) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j}), \\ C^{*} (s, β) = \frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j}) . \end{matrix}

We can then write

{\tilde{Λ}}_{0} (t, γ) - {\tilde{Λ}}_{0} (t, γ, ∊) = {\tilde{Λ}}_{0} (t \land ∊, γ) + A (t, ∊),

(18)

where

A (t, ∊) = \int_{t \land ∊}^{t} [C {(s, γ, 0)}^{- 1} - C {(s, γ, ∊)}^{- 1}] [\frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d N_{i j} (s)] .

We deal with the two terms on the right side of (18) in turn. In what follows, we let R denote a “generic” constant which may vary from one appearance to another, but does not depend on the unknown parameters or ∊.

Denote Π(s) = C*(s, 0). It is clear that $\tilde{Λ} (t, γ) \leq R Υ (t, γ)$ , where

Υ (t, γ) = \int_{0}^{t} Π {(s)}^{- 1} [\frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d N_{i j} (s)] .

We can write

Υ (t, γ) = \int_{0}^{t} [\frac{C^{*} (s, β^{o})}{Π (s)}] λ_{0}^{o} (s) d s + \int_{0}^{t} Π {(s)}^{- 1} [\frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d M_{i j} (s)],

where M_ij is the martingale process corresponding to N_ij:

M_{i j} (t) = N_{i j} (t) - \int_{0}^{t} λ_{0} (u) \exp (β^{o T} Z_{i j}) Y_{i j} (u) ψ_{i} (γ^{o}, Λ_{0}^{o}, u -) d u .

(19)

The first term in Υ(t, γ) is clearly bounded by $R Λ_{0}^{o} (t)$ . Thus, denoting the second term by M*(t), we have

\tilde{Λ} (t, γ) \leq R [Λ_{0}^{o} (t) + \sup_{u \in (0, τ)} ∣ M^{*} (u) ∣]

(20)

We next examine A(t, ∊). We can restrict to t ≥ ∊, since A(t, ∊) = 0 for t < ∊. Denote $Δ (t) = {\tilde{Λ}}_{0} (t, γ) - {\tilde{Λ}}_{0} (t, γ, ∊)$ . Bearing in mind the Lipschitz property of $\overset{‒}{ψ}$ we find that

∣ A (t, ∊) ∣ \leq R \int_{∊}^{t} ∣ Δ (s -) ∣ d Υ (s) .

Note that, for t ≥ ∊, dA(t, ∊) = dΔ(t). Thus, a simple induction and some additional simple manipulations lead to the following, where we employ the symbol $P$ to denote product integral and use the fact that $Δ (∊) = \tilde{Λ} (∊, γ)$ :

\begin{aligned} ∣ A (t, ∊) ∣ & \leq ∣ Δ (∊) ∣ P_{∊}^{t} (1 + R d Υ (s)) \leq ∣ Δ (∊) ∣ \exp (R [Υ (t) - Υ (∊)]) \\ & \leq ∣ \tilde{Λ} (∊, γ) ∣ \exp (R Υ (τ)) . \end{aligned}
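
The middle inequality rests on the elementary fact that 1 + x ≤ exp(x), so a product integral over nonnegative increments of Υ is dominated by the exponential of their sum. The following Python sketch checks this numerically for a step function; the function names and the example increments are hypothetical.

```python
import numpy as np

def product_integral(increments, rate):
    """Product of (1 + rate * dY) over the jumps dY of a step function."""
    increments = np.asarray(increments, dtype=float)
    return float(np.prod(1.0 + rate * increments))

def exponential_bound(increments, rate):
    """exp(rate * total increase), the bound used for the product integral."""
    return float(np.exp(rate * np.sum(increments)))
```

For example, with jumps 0.1, 0.2 and 0.05 and rate 2, the product is 1.2 × 1.4 × 1.1 = 1.848, which lies below exp(0.7).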

In view of the analysis above of Υ(t), we get

∣ A (t, ∊) ∣ \leq ∣ \tilde{Λ} (∊, γ) ∣ \exp (R [Λ_{0}^{o} (τ) + \sup_{u \in [0, τ]} ∣ M^{*} (u) ∣]) .

(21)

Putting (18), (20), and (21) together, we get

\begin{aligned} & ∣ {\tilde{Λ}}_{0} (t, γ) - {\tilde{Λ}}_{0} (t, γ, ∊) ∣ \\ & \quad \leq R_{1} [Λ_{0}^{o} (∊) + \sup_{u \in (0, τ)} ∣ M^{*} (u) ∣] (1 + \exp (R_{2} [1 + \sup_{u \in (0, τ)} ∣ M^{*} (u) ∣])) \end{aligned}

(22)

for suitable absolute constants R₁ and R₂.

The last main step is to deal with the martingale process

M^{*} (u) = \int_{0}^{u} Π {(s)}^{- 1} [\frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d M_{i j} (s)] .

By an argument using Lenglart’s and Markov’s inequalities, as in [21] (p. 595), we obtain (with ξ₁ as in Assumption 6)

\Pr (\sup_{u \in [0, c]} ∣ M^{*} (u) ∣ > κ n^{- \frac{1}{2}}) \leq η + R κ^{2} η^{- 1} ξ_{1} (c) .

(23)

Given (22) and (23), we have control over the first term of (17), and the proof is thus complete.

Claim A2. ${\hat{Λ}}_{0} (t, γ)$ converges in probability to some function Λ₀(t, γ) uniformly in t and γ. The function Λ₀(t, γ) satisfies $Λ_{0} (t, γ^{o}) = Λ_{0}^{o} (t)$ .

Proof. We can write Λ̂₀(t, γ) as

{\hat{Λ}}_{0} (t, γ) = \int_{0}^{t} \frac{n^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s)}{n^{- 1} \sum_{i = 1}^{2 n} {\tilde{ψ}}_{i} (s -, γ) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})} .

In view of Claim A1 above, up to a uniform error of o_P (1) we can replace all instances of ${\tilde{Λ}}_{0} (u, γ)$ in the definition of ${\tilde{ψ}}_{i} (s -, γ)$ by $Λ_{0}^{*} (u, γ)$ . The desired result then can be obtained using the argument used to prove Claim A of [36].
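
To show how a recursive Breslow-type estimator of this form is evaluated, the following Python sketch treats a simplified prospective version. It is not the two-stage case-control estimator analyzed here: there is no conditioning on proband status and no first-stage estimator for the probands. It further assumes a scalar covariate and a gamma frailty with mean 1 and variance θ, for which the conditional frailty mean given the history before s is (1/θ + N_i(s-)) / (1/θ + H_i(s-)), with H_i the accumulated hazard of family i. The function name `gamma_frailty_breslow` is hypothetical.

```python
import numpy as np

def gamma_frailty_breslow(time, event, family, z, beta, theta):
    """Recursive Breslow-type estimate of the cumulative baseline hazard.

    Simplified prospective sketch: scalar covariate z, gamma frailty with
    mean 1 and variance theta, no conditioning on proband status.
    Returns the distinct event times and the estimate at those times.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    z = np.asarray(z, dtype=float)
    _, fam = np.unique(np.asarray(family), return_inverse=True)
    n_fam = fam.max() + 1
    risk = np.exp(beta * z)          # exp(beta * Z_ij)
    cum = np.zeros_like(time)        # estimate at min(T_ij, s), just before s
    event_times = np.unique(time[event == 1])
    jumps = np.empty(event_times.size)
    for k, s in enumerate(event_times):
        # N_i(s-): failures in family i strictly before s
        failed = ((event == 1) & (time < s)).astype(float)
        n_prior = np.bincount(fam, weights=failed, minlength=n_fam)
        # H_i(s-): accumulated hazard of family i just before s
        h_prior = np.bincount(fam, weights=risk * cum, minlength=n_fam)
        # gamma conditional mean of the frailty given the history before s
        psi = (1.0 / theta + n_prior) / (1.0 / theta + h_prior)
        at_risk = time >= s
        denom = np.sum(psi[fam] * at_risk * risk)
        d_n = np.sum((time == s) & (event == 1))
        jumps[k] = d_n / denom
        cum[at_risk] += jumps[k]     # subjects still at risk accrue the jump
    return event_times, np.cumsum(jumps)
```

Because each jump uses only the estimate at earlier times, the estimator is computed in one pass over the ordered event times without iteration. As θ tends to 0 the weights tend to 1 and the estimator reduces to the usual Breslow (Nelson-Aalen type) estimator.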

Claim A3. We have

\begin{matrix} \sup_{s \in [0, τ], γ \in G} ∣ {\tilde{Λ}}_{0} (s, γ) - {\tilde{Λ}}_{0} (s -, γ) ∣ \overset{P}{\to} 0 \text{ as } n \to \infty, \\ \sup_{s \in [0, τ], γ \in G} ∣ {\hat{Λ}}_{0} (s, γ) - {\hat{Λ}}_{0} (s -, γ) ∣ \overset{P}{\to} 0 \text{ as } n \to \infty . \end{matrix}

Proof. By appeal to Claims A1 and A2, and to continuity of $Λ_{0}^{*} (t, γ)$ and $Λ_{0} (t, γ)$ .

Claim B. U(γ, Λ̂₀(·, γ)) converges in probability uniformly in t and γ to a limit u(γ, Λ₀(·, γ)).

Proof. As in Claim B of [36].

Claim C. There exists a unique consistent (in pr.) root to U(γ̂, Λ̂₀(·, γ̂)) = 0.

Proof. By appeal to Foutz’s theorem [8], as in Claim C of [36].

9.3. A Workable Representation of ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$

To develop our asymptotic normality result, we need a workable representation of ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ . The first step is to develop a suitable representation of ${\tilde{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ . Then, building on this, we develop our representation of ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ .

9.3.1. Representation of ${\tilde{Λ}}_{0} (t) - Λ_{0}^{o} (t)$

Our starting point is the following simple lemma.

Lemma 3. Let $R_{n} (t)$ and $S_{n} (t)$ be stochastic processes, and let A_n(t, ∊) and B_n(t, ∊) be quantities that are bounded in pr. uniformly in t and ∊. Define $R_{n} (t, ∊) = R_{n} (t) - A_{n} (t, ∊) R_{n} (∊)$ and $S_{n} (t, ∊) = B_{n} (t, ∊) [S_{n} (t) - S_{n} (∊)]$ . Suppose that:

1. $\sup_{t \in [∊, τ]} \sqrt{n} ∣ R_{n} (t, ∊) - S_{n} (t, ∊) ∣ \overset{P}{\to} 0$ as n → ∞ for any fixed ∊ > 0.
2. $\lim_{∊ \downarrow 0} \limsup_{n \to \infty} \Pr (\sup_{t \in [0, ∊]} \sqrt{n} ∣ R_{n} (t) ∣ > δ) = 0$ for all δ > 0.
3. $\lim_{∊ \downarrow 0} \limsup_{n \to \infty} \Pr (\sup_{t \in [0, ∊]} \sqrt{n} ∣ S_{n} (t) ∣ > δ) = 0$ for all δ > 0.
4. $\lim_{∊ \downarrow 0} \sup_{t \in [0, τ]} ∣ B_{n} (t, ∊) - B_{n} (t, 0) ∣ = 0$ with probability going to 1 as n → ∞.

Then $\sup_{t \in [0, τ]} \sqrt{n} ∣ R_{n} (t) - B_{n} (t, 0) S_{n} (t, 0) ∣ \overset{P}{\to} 0$ .

We apply this lemma with $R_{n} (t) = \sqrt{n} [{\tilde{Λ}}_{0} (t) - Λ_{0}^{o} (t)]$ . We have to check the four conditions enumerated in the lemma.

Condition 1

Arguments along the lines of [36] yield the result of Condition 1, with

S_{n} (t) = \int_{0}^{t} \frac{\tilde{p} (s -, ∊)}{\tilde{Y} (s, Λ_{0}^{o})} [\frac{1}{n} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} I (T_{i 0} < s) d M_{i j} (s)]

(24)

and A_n(t, ∊) = B_n(t, ∊) = p̃(t, ∊)^-1 where

\tilde{p} (t, ∊) = \prod_{s \in [∊, t]} [1 + n^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} {Ω_{i j} (s, t) d {\tilde{N}}_{i j} + Ω^{*} (s) I (T_{i 0} < s) δ_{i j}}] .

Here

\tilde{Y} (s, Λ) = \frac{1}{n} \sum_{i = 1}^{2 n} I (T_{i 0} < s) ψ_{i} (γ^{o}, Λ, s) R_{i .} (s),

R_{i .} (s) = \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{o T} Z_{i j}),

Ω^{*} (s) = \frac{1}{n} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s) I (T_{k 0} < s)}{{\tilde{Y} (s, Λ_{0}^{o})}^{2}} \sum_{l = 1}^{m_{k}} I (T_{k l} > s) \exp (β^{o T} Z_{k l}),

Ω_{i 0} (s, t) = \frac{1}{n} \int_{s}^{t} \frac{R_{i .} (u) η_{1 i} (u) \exp (β^{o T} Z_{i 0})}{{\tilde{Y} (u, Λ_{0}^{o})}^{2}} \sum_{k = 1}^{2 n} \sum_{l = 1}^{m_{k}} d N_{k l} (u),

Ω_{i j} (s, t) = \frac{1}{n} \int_{s}^{t} \frac{I (T_{i 0} < u) R_{i .} (u) η_{1 i} (u) \exp (β^{o T} Z_{i 0})}{{\tilde{Y} (u, Λ_{0}^{o})}^{2}} \sum_{k = 1}^{2 n} \sum_{l = 1}^{m_{k}} I (T_{k 0} < u) d N_{k l} (u), \quad j \geq 1 .

In the above, η_1i(s) is defined as

η_{1 i} (s) = \frac{ϕ_{3 i} (γ^{o}, Λ_{0}^{o}, s)}{ϕ_{1 i} (γ^{o}, Λ_{0}^{o}, s)} - \left( \frac{ϕ_{2 i} (γ^{o}, Λ_{0}^{o}, s)}{ϕ_{1 i} (γ^{o}, Λ_{0}^{o}, s)} \right)^{2} .

In Sec. 9.3.2 below, we present in detail a similar argument for ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$ .

Appealing to Assumption 6 and using arguments similar to those used in the consistency proof, we find that the Ω quantities converge in probability uniformly in s and t, so that p̃(t, ∊) converges in probability to a deterministic limit uniformly in t and ∊.

Conditions 2, 3, and 4

In regard to Condition 2, we have ${\tilde{Λ}}_{0} (t, γ) - Λ_{0}^{o} (t) = Δ_{1} (t) + Δ_{2} (t)$ , where

Δ_{1} (t) = \int_{0}^{t} [Γ (s, γ) - 1] λ_{0}^{o} (s) d s,

Δ_{2} (t) = \int_{0}^{t} \frac{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) \sum_{j = 1}^{m_{i}} d M_{i j} (s)}{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) {\overset{‒}{ψ}}_{i} (s -, γ, \tilde{Λ}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})},

where

Γ (s, γ) = \frac{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) ψ_{i} (s -, γ^{o}, Λ_{0}^{o}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{o T} Z_{i j})}{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) ψ_{i} (s -, γ^{o}, \tilde{Λ}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})}

and M_ij(t) is defined as in (19). We will deal with Δ₁(t) and Δ₂(t) in turn, starting with Δ₂(t). In the development below, R denotes a “generic” absolute constant.

The quadratic variation process of Δ₂(t) is given by

\langle Δ_{2} \rangle (t) = \int_{0}^{t} [\frac{n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) ψ_{i} (s -, γ^{o}, Λ_{0}^{o}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{o T} Z_{i j})}{{[n^{- 1} \sum_{i = 1}^{2 n} I (T_{i 0} < s) ψ_{i} (s -, γ^{o}, \tilde{Λ}) \sum_{j = 1}^{m_{i}} Y_{i j} (s) \exp (β^{T} Z_{i j})]}^{2}}] λ_{0}^{o} (s) d s .

By arguments similar to those of [21] (p. 595), we find that E[n〈Δ₂〉(t)] ≤ Rξ₁(t). An application of Lenglart’s inequality then gives

\Pr (\sqrt{n} \sup_{t \in [0, ∊]} ∣ Δ_{2} (t) ∣ > κ) \leq η + R κ^{2} η^{- 1} ξ_{1} (∊) \forall η > 0 .

Assumption 6 implies that ξ₁(∊) ↓ 0 as ∊ ↓ 0, and this takes care of Δ₂(t).

We now turn to Δ₁(t). Denote J(s) = I(Π(s) > 0). We can write Δ₁(t) = Δ_1a(t) + Δ_1b(t), with

Δ_{1 a} (t) = \int_{0}^{t} [Γ (s, γ) - 1] J (s) λ_{0}^{o} (s) d s \quad \text{and} \quad Δ_{1 b} (t) = \int_{0}^{t} [J (s) - 1] λ_{0}^{o} (s) d s .

The term Δ_1b(t) can be shown to be uniformly O_p(n^-½) by the argument in the middle of page 595 in [21]. As for Δ_1a(t), we have

\begin{aligned} ∣ Δ_{1 a} (t) ∣ & \leq R t ∣ {\tilde{Λ}}_{0} (t, γ) - Λ_{0}^{o} (t) ∣ \leq R t ∣ Δ_{1} (t) ∣ + R t ∣ Δ_{2} (t) ∣ \\ & \leq R t ∣ Δ_{1 a} (t) ∣ + R t ∣ Δ_{1 b} (t) ∣ + R t ∣ Δ_{2} (t) ∣ . \end{aligned}

Thus, for t small enough that Rt < 1, |Δ_1a(t)| ≤ [Rt/(1-Rt)][|Δ_1b(t)| + |Δ₂(t)|], and the terms on the right hand side have already been taken care of.

The proof of Condition 3 is similar to that given above for Δ₂(t). Condition 4 follows easily from the uniform convergence of the Ω quantities.

9.3.2. Representation of ${\hat{Λ}}_{0} (t) - Λ_{0}^{o} (t)$

Define

Y (s, {{\tilde{Λ}}_{0}, {\hat{Λ}}_{0}}) = \frac{1}{n} \sum_{i = 1}^{2 n} {\tilde{ψ}}_{i} (γ^{o}, {\hat{Λ}}_{0}, s) R_{i .} (s)

and

Y (s, Λ) = \frac{1}{n} \sum_{i = 1}^{2 n} ψ_{i} (γ^{o}, Λ, s) R_{i .} (s),

where in ${\tilde{ψ}}_{i} (γ^{o}, {\hat{Λ}}_{0}, s)$ we take ${\tilde{Λ}}_{0} (T_{i 0})$ if T_i0 ≥ s and Λ̂₀(T_ij) if T_ij < s, j ≥ 0. By Claim A3, sup_s∈[0,τ] |Λ̂₀(s, γ^o) - Λ̂₀(s-, γ^o)| converges to zero in probability. Thus, we obtain the following approximation, uniformly over t ∈ [0, τ]:

\begin{aligned} {\hat{Λ}}_{0} (t, γ^{o}) - Λ_{0}^{o} (t) \approx {} & \frac{1}{n} \int_{0}^{t} \frac{\sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d M_{i j} (s)}{Y (s, Λ_{0}^{o})} \\ & + \frac{1}{n} \int_{0}^{t} [{Y (s, {{\tilde{Λ}}_{0}, {\hat{Λ}}_{0}})}^{- 1} - {Y (s, Λ_{0}^{o})}^{- 1}] \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) . \end{aligned}

Now let $X (s, r) = {Y (s, Λ_{0}^{o} + r Δ^{*})}^{- 1}$ with $Δ^{*} = {\hat{Λ}}_{0} - Λ_{0}^{o}$ or ${\tilde{Λ}}_{0} - Λ_{0}^{o}$ according to the estimator being used. Define $\dot{X}$ and $\ddot{X}$ as the first and second derivatives of $X$ with respect to r, respectively. Then, by a first order Taylor expansion of $X (s, r)$ we get

\begin{aligned} {\hat{Λ}}_{0} (t, γ^{o}) - Λ_{0}^{o} (t) \approx {} & n^{- 1} \int_{0}^{t} {Y (s, Λ_{0}^{o})}^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d M_{i j} (s) \\ & - n^{- 2} \int_{0}^{t} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} \sum_{l = 1}^{m_{k}} I (T_{k l} > s) \exp (β^{o T} Z_{k l}) \{ {\hat{Λ}}_{0} (s) - Λ_{0}^{o} (s) \} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) \\ & - n^{- 2} \int_{0}^{t} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} \sum_{l = 1}^{m_{k}} I (T_{k l} \leq s) \exp (β^{o T} Z_{k l}) \{ {\hat{Λ}}_{0} (T_{k l}) - Λ_{0}^{o} (T_{k l}) \} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) \\ & - n^{- 2} \int_{0}^{t} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} I (T_{k 0} \geq s) \exp (β^{o T} Z_{k 0}) \{ {\tilde{Λ}}_{0} (T_{k 0}) - Λ_{0}^{o} (T_{k 0}) \} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) \\ & - n^{- 2} \int_{0}^{t} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} I (T_{k 0} < s) \exp (β^{o T} Z_{k 0}) \{ {\hat{Λ}}_{0} (T_{k 0}) - Λ_{0}^{o} (T_{k 0}) \} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) . \end{aligned}

The justification for ignoring the remainder term in the Taylor expansion is as in the parallel argument in [36]. Note that in the above approximation probands’ data are involved since the derivative involves an estimator of Λ₀ for the probands (either ${\hat{Λ}}_{0}$ or ${\tilde{Λ}}_{0}$ ) and not only an estimator for the relatives.

The second, third and fifth terms of the above equation can be written, by interchanging the order of integration, as

- n^{- 1} \int_{0}^{t} \{ {\hat{Λ}}_{0} (s) - Λ_{0}^{o} (s) \} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} Υ_{i j} (s, t) d {\tilde{N}}_{i j} (s)

where Ñ_ij(s) = I(T_ij ≤ s),

Υ_{i 0} (s, t) = n^{- 1} \int_{s}^{t} \frac{R_{i .} (u) η_{1 i} (u) \exp (β^{o T} Z_{i 0})}{{Y (u, Λ_{0}^{o})}^{2}} \sum_{k = 1}^{2 n} \sum_{l = 1}^{m_{k}} d N_{k l} (u)

and for j ≥ 1

\begin{aligned} Υ_{i j} (s, t) = {} & n^{- 1} \int_{s}^{t} \frac{R_{i .} (u) η_{1 i} (u) \exp (β^{o T} Z_{i j})}{{Y (u, Λ_{0}^{o})}^{2}} \sum_{k = 1}^{2 n} \sum_{l = 1}^{m_{k}} d N_{k l} (u) \\ & + n^{- 1} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} \sum_{l = 1}^{m_{k}} I (T_{k l} > s) \exp (β^{o T} Z_{k l}) δ_{k l} . \end{aligned}

The fourth term can be written, by plugging in the representation for ${\tilde{Λ}}_{0} - Λ_{0}^{o}$ , as

- n^{- 1} \int_{0}^{τ} \frac{A (s, t) \tilde{p} (s -)}{\tilde{Y} (s, Λ_{0}^{o})} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} I (T_{i 0} < s) d M_{i j} (s),

where

\begin{aligned} A (s, t) = {} & n^{- 2} \int_{0}^{t} \sum_{k = 1}^{2 n} \frac{R_{k .} (s) η_{1 k} (s)}{{Y (s, Λ_{0}^{o})}^{2}} \exp (β^{o T} Z_{k 0}) \\ & \times [\int_{s}^{τ} {\tilde{p} (v)}^{- 1} d N_{k 0}^{*} (v)] \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d N_{i j} (s) \end{aligned}

and $N_{k 0}^{*} (t) = I (T_{k 0} \leq t)$ . Given all the above, we get

\begin{aligned} {\hat{Λ}}_{0} (t, γ^{o}) - Λ_{0}^{o} (t) \approx {} & n^{- 1} \int_{0}^{t} {Y (s, Λ_{0}^{o})}^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d M_{i j} (s) \\ & - n^{- 1} \int_{0}^{τ} \frac{A (s, t) \tilde{p} (s -)}{\tilde{Y} (s, Λ_{0}^{o})} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} I (T_{i 0} < s) d M_{i j} (s) \\ & - n^{- 1} \int_{0}^{t} \{ {\hat{Λ}}_{0} (s) - Λ_{0}^{o} (s) \} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} Υ_{i j} (s, t) d {\tilde{N}}_{i j} (s) . \end{aligned}

By solving the above approximation recursively, for the relatives’ failure times, we get

\begin{aligned} {\hat{Λ}}_{0} (t, γ^{o}) - Λ_{0}^{o} (t) \approx {} & \frac{1}{n \hat{p} (t)} \int_{0}^{t} \frac{\hat{p} (s -)}{Y (s, Λ_{0}^{o})} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} d M_{i j} (s) \\ & + \frac{1}{n \hat{p} (t)} \int_{0}^{τ} B (s, t) \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} \frac{\tilde{p} (s -)}{\tilde{Y} (s, Λ_{0}^{o})} I (T_{i 0} < s) d M_{i j} (s) \\ & - \frac{\hat{p} (t -) d N (t)}{n^{2} \hat{p} (t)} \int_{0}^{τ} A (s, t) \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} \frac{\tilde{p} (s -)}{\tilde{Y} (s, Λ_{0}^{o})} I (T_{i 0} < s) d M_{i j} (s) \end{aligned}

(25)

where $N (t) = \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} N_{i j} (t)$ ,

B (s, t) = n^{- 1} \int_{0}^{t -} A (s, u) \hat{p} (u -) \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} Υ_{i j} (u, t -) d N_{i j} (u)

and

\hat{p} (t) = \prod_{s \leq t} [1 + \frac{1}{n} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} Υ_{i j} (s, t) d {\tilde{N}}_{i j} (s)] .

9.4. Asymptotic normality of n^1/2(γ̂-γ^o)

To show that γ̂ is asymptotically normally distributed, we expand $U (\hat{γ}, {\hat{Λ}}_{0} (\cdot, \hat{γ})) = 0$ as

U (γ^{o}, Λ_{0}^{o}) + [U (γ^{o}, {\hat{Λ}}_{0} (., γ^{o})) - U (γ^{o}, Λ_{0}^{o})] + [U (\hat{γ}, {\hat{Λ}}_{0} (., \hat{γ})) - U (γ^{o}, {\hat{Λ}}_{0} (., γ^{o}))] = 0 .

We examine in turn each of the terms on the left-hand side of the above equation.

Step I

We can write $U (γ^{o}, Λ_{0}^{o}) = n^{- 1} (\sum_{i = 1}^{n} ξ_{i}^{(1)} + \sum_{i = 1}^{2 n} ξ_{i}^{(2)})$ where $ξ_{i}^{(1)}$ , i = 1, . . . , n, are iid mean-zero random (p + 1)-vectors stemming from the likelihood of the proband data, while $ξ_{i}^{(2)}$ , i = 1, . . . , 2n, are iid mean-zero random (p + 1)-vectors stemming from the likelihood of the relatives’ data. It follows immediately from the classical central limit theorem that $n^{1 ∕ 2} U (γ^{o}, Λ_{0}^{o})$ is asymptotically mean-zero multivariate normal.

Step II

Let ${\hat{U}}_{r} = U_{r} (γ^{o}, {\hat{Λ}}_{0})$ , r = 1, . . . , p, and ${\hat{U}}_{p + 1} = U_{p + 1} (γ^{o}, {\hat{Λ}}_{0})$ (in this segment of the proof, when we write $(γ^{o}, {\hat{Λ}}_{0})$ the intent is to signify $(γ^{o}, {\hat{Λ}}_{0} (\cdot, γ^{o}))$ ). Further, denote $Q_{i j r} (γ^{o}, Λ_{0}^{o}, T_{i j}) = [\partial U_{r} (γ^{o}, Λ_{0}^{o}) ∕ \partial Λ_{0}^{o} (T_{i j})]$ , i = 1, . . . , 2n, j = 0, . . . , m_i, r = 1, . . . , p + 1. First order Taylor expansion of ${\hat{U}}_{r}$ about $Λ_{0}^{o}$ , r = 1, . . . , p + 1, then gives

\begin{aligned} & n^{1 ∕ 2} \{ U_{r} (γ^{o}, {\hat{Λ}}_{0}) - U_{r} (γ^{o}, Λ_{0}^{o}) \} \\ & \quad = n^{- 1 ∕ 2} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} Q_{i j r} (γ^{o}, Λ_{0}^{o}, T_{i j}) \{ {\hat{Λ}}_{0} (T_{i j}, γ^{o}) - Λ_{0}^{o} (T_{i j}) \} + o_{p} (1) \\ & \quad = \frac{1}{n} \sum_{i = 1}^{2 n} \sum_{j = 0}^{m_{i}} \int_{0}^{τ} Q_{i j r} (γ^{o}, Λ_{0}^{o}, s) \{ {\hat{Λ}}_{0} (s, γ^{o}) - Λ_{0}^{o} (s) \} d {\tilde{N}}_{i j} (s) + o_{p} (1) . \end{aligned}

Although probands are involved in the above stochastic integral, its integrand is predictable since, by definitions (10)-(11), ${\hat{Λ}}_{0} (t, γ)$ depends only on data up to time t-. Using the representation in Sec. 9.3.2 for $\sqrt{n} [{\hat{Λ}}_{0} (s, γ^{o}) - Λ_{0}^{o} (s)]$ and replacing certain empirical sums by their limiting values (see the expanded paper for details), we obtain a representation of the form

\begin{matrix} U_{r} (γ^{o}, {\hat{Λ}}_{0}) - U_{r} (γ^{o}, Λ_{0}^{o}) \\ \approx \frac{1}{n} \sum_{k = 1}^{2 n} \sum_{l = 1}^{m_{k}} \int_{0}^{τ} [α_{r}^{(1)} (u) + I (T_{k 0} < u) \{α_{r}^{(2)} (u) - α_{r}^{(3)} (u)\}] d M_{k l} (u) . \end{matrix}

Thus, we have represented $U_{r} (γ^{o}, {\hat{Λ}}_{0}) - U_{r} (γ^{o}, Λ_{0}^{o})$, r = 1, . . . , p + 1, as the average of mean-zero iid random variables. Hence, asymptotic normality follows from the classical central limit theorem.

Step III

First order Taylor expansion of $U (\hat{γ}, {\hat{Λ}}_{0} (\cdot, \hat{γ}))$ about $γ^{o} = (β^{o T}, θ^{o})^{T}$ gives

U (\hat{γ}, {\hat{Λ}}_{0} (\cdot, \hat{γ})) = U (γ^{o}, {\hat{Λ}}_{0} (\cdot, γ^{o})) + D (γ^{o}) {(\hat{γ} - γ^{o})}^{T} + o_{p} (1),

where $D_{l s} (γ) = \partial U_{l} (γ, {\hat{Λ}}_{0} (\cdot, γ)) ∕ \partial γ_{s}$ for l, s = 1, . . . , p + 1.

Combining the results of Steps I-III above we get that $n^{1 / 2} (\hat{γ} - γ^{o})$ is asymptotically zero-mean normally distributed with a covariance matrix that can be consistently estimated by a sandwich-type estimator.
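As a rough illustration of the sandwich form mentioned above, the following Python sketch computes $D^{-1} V D^{-T}$ from a derivative matrix and per-family contributions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sandwich_covariance`, the array `psi` (one row per family, holding that family's combined contribution from Steps I and II), and the plain averaging used for `V` are all introduced here for illustration only.

```python
import numpy as np


def sandwich_covariance(D, psi):
    """Sandwich-type covariance D^{-1} V D^{-T}.

    D   : (q, q) array, assumed to hold the derivative matrix of the
          estimating function evaluated at the estimate (hypothetical input).
    psi : (m, q) array, one row per family, assumed to hold the mean-zero
          per-family contributions to the estimating function.
    """
    D = np.asarray(D, dtype=float)
    psi = np.asarray(psi, dtype=float)

    # Empirical covariance of the per-family contributions (the "meat").
    centered = psi - psi.mean(axis=0)
    V = centered.T @ centered / psi.shape[0]

    # D^{-1} V D^{-T}, computed with linear solves instead of an explicit
    # inverse; V is symmetric, so (D^{-1} V)^T equals V D^{-T}.
    half = np.linalg.solve(D, V)
    return np.linalg.solve(D, half.T)


# Toy usage with made-up numbers (not data from the paper).
D_toy = np.array([[1.0, 1.0], [0.0, 1.0]])
psi_toy = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
cov_toy = sandwich_covariance(D_toy, psi_toy)
```

Under these assumptions, dividing the returned matrix by the number of families would give an approximate covariance for $\hat{γ}$ itself; the exact scaling in the present setting depends on the normalization of $U$ and on the 2n families, which the sketch does not attempt to reproduce.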

9.5. Asymptotic properties of ${\hat{Λ}}_{0} (\cdot, \hat{γ})$

We can write

\begin{matrix} \sqrt{n} \{{\hat{Λ}}_{0} (t, \hat{γ}) - Λ_{0}^{o} (t)\} = \\ \sqrt{n} \{{\hat{Λ}}_{0} (t, γ^{o}) - Λ_{0}^{o} (t)\} + \sqrt{n} \{{\hat{Λ}}_{0} (t, \hat{γ}) - {\hat{Λ}}_{0} (t, γ^{o})\} . \end{matrix}

(26)

In (25), we have a representation of the first term above in terms of integrals with respect to the martingale processes M_ij. Weak convergence of the first term can thus be established using the martingale central limit theorem, as in [1]. In particular, the first term is tight. In regard to the second term, Taylor expansion yields

{\hat{Λ}}_{0} (t, \hat{γ}) - {\hat{Λ}}_{0} (t, γ^{o}) = W {(t, γ^{o})}^{T} (\hat{γ} - γ^{o}) + o_{p} (1),

where

\begin{matrix} W (t, γ) = \\ \int_{0}^{t} \frac{n^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} \{\frac{\partial}{\partial γ} {\tilde{ψ}}_{i} (u -, γ) + {\tilde{ψ}}_{i} (u -, γ) Z_{i j}\} Y_{i j} (u) \exp (β^{T} Z_{i j})}{n^{- 1} \sum_{i = 1}^{2 n} {\tilde{ψ}}_{i} (u -, γ) R_{i .} (u)} d {\hat{Λ}}_{0} (u, γ) . \end{matrix}

The limiting value of W(t, γ) is

\begin{matrix} w (t, γ) = \\ \int_{0}^{t} \frac{E [n^{- 1} \sum_{i = 1}^{2 n} \sum_{j = 1}^{m_{i}} \{\frac{\partial}{\partial γ} {\tilde{ψ}}_{i} (u -, γ) + {\tilde{ψ}}_{i} (u -, γ) Z_{i j}\} Y_{i j} (u) \exp (β^{T} Z_{i j})]}{E [n^{- 1} \sum_{i = 1}^{2 n} {\tilde{ψ}}_{i} (u -, γ) R_{i .} (u)]} λ_{0}^{o} (u) d u . \end{matrix}

Since the integrand is bounded, the function w(t, γ) is Lipschitz in t. We just showed that √n(γ̂ - γ^o) converges to a mean-zero normal variate. Hence the second term in (26) is tight. Accordingly, the entire expression (26) is tight.

Now, as seen in the normality proof of Sec. 9.4, both terms in (26) can be represented in terms of i.i.d. sums over i of functions of the data on family i. Hence, asymptotic normality of the finite dimensional distributions of the entire expression (26) follows from the classical central limit theorem. This, together with the tightness just shown, establishes weak convergence of $\sqrt{n} \{{\hat{Λ}}_{0} (t, \hat{γ}) - Λ_{0}^{o} (t)\}$ to a Gaussian process. A fortiori,

\sup_{t} | {\hat{Λ}}_{0} (t, \hat{γ}) - Λ_{0}^{o} (t) | = O_{p} (n^{- 1 / 2}) .
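The uniform $n^{-1/2}$ rate can be checked numerically on a much simpler estimator. The sketch below is an illustration only: it swaps in the ordinary Nelson-Aalen estimator for independent right-censored data, which is not the estimator ${\hat{Λ}}_{0} (\cdot, \hat{γ})$ studied here. The function names, the unit-rate exponential failure times, the uniform censoring scheme and the truncation point `tau` are all assumptions made for the example.

```python
import numpy as np


def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard at the ordered observation times.

    Assumes no tied times (true with probability one for continuous data).
    """
    order = np.argsort(times)
    t = np.asarray(times, dtype=float)[order]
    d = np.asarray(events, dtype=float)[order]
    n = len(t)
    at_risk = n - np.arange(n)           # subjects still at risk at each time
    return t, np.cumsum(d / at_risk)     # jumps of size 1/at_risk at failures


def sup_error(n, rate=1.0, tau=1.0, seed=0):
    """Largest |Nelson-Aalen - true cumulative hazard| over observed times <= tau.

    The true cumulative hazard of an exponential(rate) time is rate * t.
    The maximum is taken over observation times only, so it approximates
    (and cannot exceed) the supremum over all t in [0, tau].
    """
    rng = np.random.default_rng(seed)
    failure = rng.exponential(scale=1.0 / rate, size=n)
    censor = rng.uniform(0.0, 2.0 * tau, size=n)
    times = np.minimum(failure, censor)
    events = (failure <= censor).astype(float)
    t, H = nelson_aalen(times, events)
    keep = t <= tau
    return float(np.max(np.abs(H[keep] - rate * t[keep])))
```

Calling `sup_error(n)` for increasing `n` and multiplying the result by `np.sqrt(n)` should give values of roughly constant size, in line with the $O_{p} (n^{-1/2})$ statement.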

Acknowledgements

The authors would like to thank Dr. Kathleen Malone for sharing the data from the case-control family study of breast cancer, which motivated the development of this work. The research was supported in part by grants from the National Institutes of Health and the United States-Israel Binational Science Foundation (BSF).

Footnotes

AMS 2000 subject classifications: Primary 62N01, 62N02, 62H12

Contributor Information

Malka Gorfine, Faculty of Industrial Engineering and Management, Technion City, Haifa 32000, Israel, E-mail: gorfinm@ie.technion.ac.il.

David M. Zucker, Department of Statistics, Hebrew University, Mt. Scopus, Jerusalem 91905, Israel, E-mail: mszucker@mscc.huji.ac.il

Li Hsu, Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109-1024, USA, E-mail: lih@fhcrc.org.

References

[1] Andersen PK, Gill RD. Cox’s regression model for counting processes: A large sample study. Ann. Statist. 1982;10:1100–1120. MR0673646.
[2] Aalen OO. Nonparametric inference in connection with multiple decrement models. Scand. J. Statist. 1976;3:15–27. MR0400529.
[3] Aalen O. Nonparametric inference for a family of counting processes. Ann. Statist. 1978;6:701–726. MR0491547.
[4] Breslow NE, Day NE. Statistical methods in cancer research: Vol. 1 - The analysis of case-control studies. IARC Scientific Publication; Lyon, France: 1980.
[5] Chen MC, Bandeen-Roche K. A diagnostic for association in bivariate survival models. Lifetime Data Anal. 2005;11:245–264. doi: 10.1007/s10985-004-0386-8. MR2158784.
[6] Clayton DG. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 1978;65:141–151. MR0501698.
[7] Colditz GA, Rosner BA, Speizer FE, for the Nurses’ Health Study Research Group. Risk factors for breast cancer according to family history of breast cancer. Journal of the National Cancer Institute. 1996;88:365–371. doi: 10.1093/jnci/88.6.365.
[8] Foutz RV. On the unique consistent solution to the likelihood equation. J. Amer. Statist. Assoc. 1977;72:147–148. MR0445686.
[9] Gill RD. Discussion of the paper by D. Clayton and J. Cuzick. J. R. Statist. Soc. A. 1985;148:108–109.
[10] Gill RD. Non- and semi-parametric maximum likelihood estimators and the von Mises method (Part 1). Scand. J. Statist. 1989;16:97–128.
[11] Gill RD. Marginal partial likelihood. Scand. J. Statist. 1992;19:133–137. MR1173595.
[12] Glidden DV. Checking the adequacy of the gamma frailty model for multivariate failure times. Biometrika. 1999;86:381–393. MR1705406.
[13] Glidden DV. Pairwise dependence diagnostics for clustered failure time data. Biometrika. 2007;94:371–385.
[14] Gorfine M, Zucker DM, Hsu L. Prospective survival analysis with a general semiparametric shared frailty model - a pseudo full likelihood approach. Biometrika. 2006;93:735–741. doi: 10.1901/jaba.2009.37-1489. MR2261454.
[15] Hartman P. Ordinary Differential Equations. 2nd ed. Birkhäuser; Boston: 1973. Reprinted, 1982.
[16] Hopper JL. Commentary: Case-control-family design: a paradigm for future epidemiology research? International Journal of Epidemiology. 2003;32:48–50. doi: 10.1093/ije/dyg114.
[17] Hougaard P. Analysis of Multivariate Survival Data. Springer; New York: 2000.
[18] Hsu L, Chen L, Gorfine M, Malone K. Semiparametric estimation of marginal hazard function from case-control family studies. Biometrics. 2004;60:936–944. doi: 10.1111/j.0006-341X.2004.00249.x. MR2133546.
[19] Hsu L, Gorfine M. Multivariate survival analysis for case-control family data. Biostatistics. 2006;7:387–398. doi: 10.1093/biostatistics/kxj014.
[20] Hsu L, Gorfine M, Malone K. On robustness of marginal regression coefficient estimates and hazard functions in multivariate survival analysis of family data when the frailty distribution is misspecified. To appear in Stat. Med. 2007. doi: 10.1002/sim.2870.
[21] Keiding N, Gill R. Random truncation models and Markov processes. Ann. Statist. 1990;18:582–602. MR1056328.
[22] Kosorok MR, Lee BL, Fine JP. Robust inference for univariate proportional hazards regression models. Ann. Statist. 2004;32:1448–1491. MR2089130.
[23] Malone KE, Daling JR, Thompson JD, Cecilia AO, Francisco LV, Ostrander EA. BRCA1 mutations and breast cancer in the general population. Journal of the American Medical Association. 1998;279:922–929. doi: 10.1001/jama.279.12.922.
[24] Malone KE, Daling JR, Neal C, Suter NM, O’Brien C, Cushing-Haugen K, Jonasdottir TJ, Thompson JD, Ostrander EA. Frequency of BRCA1/BRCA2 mutations in a population-based sample of young breast carcinoma cases. Cancer. 2000;88:1393–1402. doi: 10.1002/(sici)1097-0142(20000315)88:6<1393::aid-cncr17>3.0.co;2-p.
[25] Malone KE, Daling JR, Doody DR, Hsu L, Bernstein L, Coates RJ, Marchbanks PA, Simon MS, McDonald JA, Norman SA, Strom BL, Burkman RT, Ursin G, Deapen D, Weiss LK, Folger S, Madeoy JJ, Friedrichsen DM, Suter NM, Humphrey MC, Spirtas R, Ostrander EA. Prevalence and predictors of BRCA1 and BRCA2 mutations in a population-based study of breast cancer in white and black American women aged 35-64 years. Cancer Research. 2006;66:8297–8308. doi: 10.1158/0008-5472.CAN-06-0503.
[26] Nielsen GG, Gill RD, Andersen PK, Sorensen TI. A counting process approach to maximum likelihood estimation of frailty models. Scand. J. Statist. 1992;19:25–43.
[27] Oakes D. Bivariate survival models induced by frailties. J. Amer. Statist. Assoc. 1989;84:487–493. MR1010337.
[28] Parner E. Asymptotic theory for the correlated gamma-frailty model. Ann. Statist. 1998;26:183–214. MR1611788.
[29] Prentice RL, Breslow NE. Retrospective studies and failure time models. Biometrika. 1978;65:153–158.
[30] Shih JH. A goodness-of-fit test for association in a bivariate survival model. Biometrika. 1998;85:189–200. MR1627281.
[31] Shih JH, Chatterjee N. Analysis of survival data from case-control family studies. Biometrics. 2002;58:502–509. doi: 10.1111/j.0006-341x.2002.00502.x. MR1925547.
[32] Shih JH, Louis TA. Inference on the association parameter in copula models for bivariate survival data. Biometrics. 1995;51:1384–1399. MR1381050.
[33] Viswanathan B, Manatunga AK. Diagnostic plots for assessing the frailty distribution in multivariate survival data. Lifetime Data Anal. 2001;7:143–155. doi: 10.1023/a:1011348823081. MR1842324.
[34] Zeger S, Liang K-Y, Albert PS. Models for longitudinal data: A generalized estimating equation approach. Biometrics. 1988;44:1049–1060. MR0980999.
[35] Zucker DM. A pseudo partial likelihood method for semi-parametric survival regression with covariate errors. J. Amer. Statist. Assoc. 2005;100:1264–1277. MR2236440.
[36] Zucker DM, Gorfine M, Hsu L. Pseudo full likelihood estimation for prospective survival analysis with a general semiparametric shared frailty model: asymptotic theory. To appear in J. Statist. Plann. Inference. 2007.
