Abstract
In a longitudinal study subjects are followed over time. I focus on a case where the number of replications over time is large relative to the number of subjects in the study. I investigate the use of moving block bootstrap methods for analyzing such data. Asymptotic properties of the bootstrap methods in this setting are derived. The effectiveness of these resampling methods is also demonstrated through a simulation study.
Keywords: Longitudinal study, Moving block bootstrap
1. Introduction
In many longitudinal designs, the number of subjects n is large and the number of replications mi is bounded. Liang and Zeger (1986) proved that the generalized estimating equation (GEE) estimator is consistent and asymptotically normal even when the covariance parameters are misspecified; that asymptotic property is established as the number of subjects n goes to infinity with mi bounded. Xie and Yang (2003) proved the almost sure existence and strong consistency of GEE estimators.
In this article, I instead consider a longitudinal design in which the number of subjects n is bounded and the number of replications m, the same for all subjects, is large. The model on which we focus is given by
yi = xiβ + ei, i = 1, …, n, (1)

where xi is an m × p design matrix, β is a p × 1 vector of unknown parameters, yi = (yi1, …, yim)′, and ei = (ei1, …, eim)′.
I focus on the following estimating equation. The estimator β̂ is called a regression estimator if it solves
∑_{i=1}^{n} ∑_{j=1}^{m} xij (yij − xij′β) = 0. (2)
The repeated observations are correlated within each subject. Since bootstrap methods that resample the small number of subjects, or that resample observations independently, may not work well, we investigate the moving block bootstrap method developed for correlated time series data.
The remainder of this paper is organized as follows. In Sec. 2, we review the related literature. In Sec. 3, we explore the use of the moving block bootstrap for analyzing longitudinal data. In Sec. 4, we justify the moving block bootstrap in longitudinal data theoretically and empirically. In Sec. 5, we conclude the paper with a brief discussion.
2. Literature Review
Efron (1979) introduced the bootstrap procedure for estimating sampling distributions of statistics based on independent and identically distributed (i.i.d.) observations. It is well known, in the i.i.d. setup, that the bootstrap often offers more accurate approximations than classical large sample approximations, e.g., Singh (1981) and Babu (1986). However, when the observations are not necessarily independent, the classical bootstrap no longer succeeds, as shown by Singh (1981).
In the time series case, Lahiri (1996) applied the moving block bootstrap method to multiple linear regression models

yj = xj′β + ∊j, j = 1, …, m, (3)

where the xj's are known p × 1 vectors, β is a p × 1 vector of parameters, and ∊1, ∊2, …, ∊m are stationary, strongly mixing random variables. If β̂m is an M-estimator of β corresponding to some score function ϕ, then under some conditions a two-term Edgeworth expansion for the studentized multivariate M-estimator is available. Lahiri (1996) also showed that the block bootstrap is second-order correct for suitable bootstrap analogs of the studentized β̂m.
Hall et al. (1995) showed that the optimal asymptotic rate of the block size for the moving blocks method depends significantly on context, being equal to m^{1/3}, m^{1/4}, and m^{1/5} in the cases of variance or bias estimation, estimation of a one-sided distribution function, and estimation of a two-sided distribution function, respectively. The latter two quantities are needed for the construction of equal-tailed and symmetric confidence intervals, respectively. Hall et al. (1995) present a practical rule for selecting the block size empirically. It is based on the fact that the asymptotic formula is b ∼ Cm^{1/k}, where k = 3, 4, or 5 is known, and C is a constant that depends on the underlying process. The suggested rule provides a way of estimating the optimal block size for a time series of smaller length than the original, from which the block size for the full series follows.
Paparoditis and Politis (2002) presented a new block bootstrap variation, the tapered block bootstrap, which is applicable in the general time series case of approximately linear statistics. The asymptotic validity and the favorable bias properties of the tapered block bootstrap are shown in two important cases: smooth function of means and M-estimators.
3. Moving Block Bootstrap for Longitudinal Data
The major drawback of model-based resampling is that in practice not only the parameters of a model, but also its structure, must be identified from the data. If the chosen structure is incorrect, the resampled series will be generated from the wrong model and will not share the statistical properties of the original data. The model-based approach is inconsistent if the model used for resampling is misspecified.
The moving block bootstrap (MBB) involves resampling possibly overlapping blocks. The MBB does not force one to select a model, and the only tuning parameter required is the block length. If the blocks are long enough, the original dependence is reasonably preserved in the resampled series; the approximation is better when the dependence is weak. Preserving the dependence faithfully thus points toward blocks that are as long as possible. But the distinct values of the statistic must be as numerous as possible to provide a good estimate of its distribution, and this points toward short blocks.
Unless the series is long enough to accommodate many long blocks, preserving the dependence structure may be difficult, especially for complex, long-range dependent structures. In such cases, the block resampling scheme tends to generate resampled series that are less dependent than the original ones. Furthermore, the resampled series often exhibit artifacts caused by joining randomly selected blocks. The asymptotic variance–covariance matrices of the estimators based on the original series and those based on the bootstrap series then differ, and a modification of the original scheme is needed. This suggests a strategy intermediate between model-based and block resampling. The idea is to pre-whiten the original dependent series by fitting a simple model, which is intended to remove much of the dependence between the original observations. An innovation series is generated by block resampling of the residuals from the simple fitted model, and the resampled series is then post-blackened by applying the estimated model to the resampled innovations. The post-blackened version works more consistently in practice (Davison and Hinkley, 1997).
Bühlmann (1997) suggested the sieve bootstrap, which is model based: an autoregressive AR(p) model is used to filter the series, and if the model is not appropriate, the resulting residuals cannot be treated as i.i.d. A hybrid between the model-based method and the moving block bootstrap, named the post-blackened bootstrap, was suggested by Davison and Hinkley (1997). The procedure is similar to the sieve bootstrap, but the residuals from the AR(p) model are resampled not in an i.i.d. manner but by the MBB, so any dependence remaining in the AR residuals is preserved by the blockwise resampling. The linear model pre-whitens the series, removing much of the dependence present in the observations; an innovation series is then generated by block resampling of the residuals from the fitted model, and the series is post-blackened by applying the estimated model to the resampled innovations.
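To make the post-blackened scheme concrete, the following is a minimal Python sketch, assuming an AR(1) pre-whitening filter fitted by least squares; the function name and the initialization rule are illustrative choices, not part of Davison and Hinkley's (1997) prescription.

```python
import numpy as np

def post_blackened_resample(e, b, rng=None):
    """Post-blackened bootstrap of one residual series e (length m).

    1. Pre-whiten: fit AR(1) by least squares and extract innovations.
    2. Resample the innovations with the moving block bootstrap (block length b).
    3. Post-blacken: run the resampled innovations back through the fitted AR(1).
    """
    if rng is None:
        rng = np.random.default_rng()
    m = len(e)
    # Step 1: AR(1) coefficient by regressing e_t on e_{t-1} (no intercept).
    phi = np.dot(e[1:], e[:-1]) / np.dot(e[:-1], e[:-1])
    u = e[1:] - phi * e[:-1]              # innovation series, length m - 1
    # Step 2: MBB on the (approximately whitened) innovations.
    k = int(np.ceil(m / b))
    starts = rng.integers(0, len(u) - b + 1, size=k)
    u_star = np.concatenate([u[s:s + b] for s in starts])[:m]
    # Step 3: rebuild a dependent series via the estimated AR(1) recursion.
    e_star = np.empty(m)
    e_star[0] = e[rng.integers(m)]        # crude initialization from the data
    for t in range(1, m):
        e_star[t] = phi * e_star[t - 1] + u_star[t]
    return e_star
```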
A block bootstrap algorithm in a longitudinal model is given as follows. We continue to assume (1) as our longitudinal model under consideration.
1. Let êij, i = 1, …, n0, j = 1, …, m, be the residuals from the model fit:

  êij = yij − xij′β̂, (4)

  where β̂ is the ordinary least squares estimate.

2. Now assume that m = bk with b and k integers. Let I1, …, Ik denote k uniform draws with replacement from the integers {0, 1, …, m − b}; these represent the starting point of each block of length b. A block bootstrap resample of residuals, e*ij, is defined by

  e*_{i,(h−1)b+l} = ê_{i,Ih+l}, h = 1, …, k, l = 1, …, b. (5)

3. The bootstrapped responses, y*ij, are then generated from the estimated model with the resampled residuals and the original covariates:

  y*ij = xij′β̂ + e*ij. (6)

4. From the resampled responses y*ij and the original covariates, we fit the model and obtain new parameter estimates β̂*.
Repeating steps 2 through 4 a large number, R, of times, one obtains R bootstrap replicates from which features of the distribution of the parameter estimates can be estimated, as illustrated in the sketch below. In particular, the bootstrap variance estimates are simply the variances of the R computed values for each parameter.
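A minimal Python sketch of steps 1 through 4, assuming the linear model (1) fitted by ordinary least squares; the function name and array layout are illustrative:

```python
import numpy as np

def mbb_longitudinal(y, X, b, R, rng=None):
    """Moving block bootstrap for a longitudinal linear model.

    y : (n0, m) array of responses, one row per subject.
    X : (n0, m, p) array of covariates.
    b : block length; m is assumed to be a multiple of b.
    R : number of bootstrap replicates.
    Returns an (R, p) array of bootstrap coefficient estimates.
    """
    if rng is None:
        rng = np.random.default_rng()
    n0, m, p = X.shape
    k = m // b                      # number of blocks per subject
    Xf = X.reshape(n0 * m, p)       # pooled design matrix

    # Step 1: OLS fit and residuals e_hat_ij = y_ij - x_ij' beta_hat.
    beta_hat, *_ = np.linalg.lstsq(Xf, y.ravel(), rcond=None)
    resid = (y.ravel() - Xf @ beta_hat).reshape(n0, m)

    out = np.empty((R, p))
    for r in range(R):
        e_star = np.empty((n0, m))
        for i in range(n0):
            # Step 2: k uniform draws of block starts from {0, ..., m-b}.
            starts = rng.integers(0, m - b + 1, size=k)
            e_star[i] = np.concatenate([resid[i, s:s + b] for s in starts])
        # Step 3: bootstrapped responses from the fitted model.
        y_star = (Xf @ beta_hat).reshape(n0, m) + e_star
        # Step 4: refit to obtain bootstrap parameter estimates.
        out[r], *_ = np.linalg.lstsq(Xf, y_star.ravel(), rcond=None)
    return out
```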
I consider six different block bootstrap methods in a balanced longitudinal design in which the number of subjects is small and the number of replications is large:
Case 1, MBB1 (Within block bootstrap): For each subject i, we construct the m − b + 1 overlapping blocks of size b, B1, …, Bm−b+1. Define k = m/b, assumed to be an integer for simplicity (in general k = [m/b]). We draw k blocks with replacement from B1, …, Bm−b+1 and concatenate them to obtain resampled residuals e*i1, …, e*im with kb = m, built from êi1, …, êim, where êij = yij − β̂0 − β̂1xij. Doing this for each of the n0 individuals and plugging the result into the model yields the pseudo sample y*ij = β̂0 + β̂1xij + e*ij, from which we fit the model and produce new parameter estimates β̂*0 and β̂*1.
Case 2, MBB2 (Mixed block bootstrap): For each subject we again have m − b + 1 blocks of size b, and pooling over the n0 subjects gives the collection B1, …, Bn0(m−b+1). We sample n0k blocks with replacement from this pooled collection, construct resampled residual series with kb = m per subject, and plug these into the model to obtain a pseudo series y*ij. Similarly, from the model we refit and produce new parameter estimates β̂*0 and β̂*1. A sketch contrasting these two block pools appears after the list of cases below.
Case 3, One-line moving block bootstrap: One can concatenate the n0 individual series into one long series and perform the moving block bootstrap as for a single time series, without splitting the data at subject boundaries.
Case 4, Standard bootstrap: This is a special case of b = 1 in MBB2.
Case 5, Resampling subject bootstrap: This is a special case of b = m in MBB2.
Case 6, Stratified standard bootstrap: This is a special case of b = 1 in MBB1.
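Since MBB1 and MBB2 differ only in the pool from which blocks are drawn, the contrast can be expressed compactly; the following sketch, with a hypothetical helper name resample_blocks, is one way to implement the two pools:

```python
import numpy as np

def resample_blocks(resid, b, mode="within", rng=None):
    """Draw one bootstrap residual series per subject.

    mode="within": MBB1, each subject's blocks come from that subject only.
    mode="mixed":  MBB2, blocks are drawn from all n0(m-b+1) pooled blocks.
    """
    if rng is None:
        rng = np.random.default_rng()
    n0, m = resid.shape
    k, q = m // b, m - b + 1
    # All overlapping blocks: subject i contributes blocks B_1, ..., B_{m-b+1}.
    pool = np.array([resid[i, h:h + b] for i in range(n0)
                     for h in range(q)])            # shape (n0*q, b)
    e_star = np.empty((n0, m))
    for i in range(n0):
        if mode == "within":
            idx = rng.integers(i * q, (i + 1) * q, size=k)   # own blocks only
        else:
            idx = rng.integers(0, n0 * q, size=k)            # pooled blocks
        e_star[i] = pool[idx].ravel()
    return e_star
```

Setting b = 1 in the mixed and within pools recovers Cases 4 and 6, respectively, and b = m in the mixed pool recovers Case 5.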
4. Justification of Moving Block Bootstrap in Longitudinal Data
I consider the justification of the moving block bootstrap in longitudinal data, focusing on the within block bootstrap method (MBB1) among the six scenarios of the previous section. I follow Lahiri's (1996) assumptions. Let us consider the relationship between the GEE and M-estimators. The robust approach can be extended to the regression setup to analyze a predictor–outcome relationship. Suppose we have model (1) with n = n0. The estimator β̂ is called a robust regression estimator or an M-estimator if it solves
∑_{i=1}^{n0} ∑_{j=1}^{m} xij ψ(yij − xij′β̂) = 0 (7)
for some choice of function ψ(·).
4.1. Expansion for M-estimator
It is known that

Σn0m^{−1/2} Dn0m^{1/2} (β̂n0m − β) →d N(0, Ip), (8)

where Ip denotes the identity matrix of order p, Dn0m = ∑_{i=1}^{n0} ∑_{j=1}^{m} xij xij′, and Σn0m is the asymptotic covariance matrix given in (10) below.
Let êij = yij − xij′β̂n0m denote the residuals. Define

σi(k) = Eψ(ei1)ψ(ei(1+k)) and σ̂im(k) = (m − k)^{−1} ∑_{j=1}^{m−k} ψ(êij)ψ(êi(j+k)), k ≥ 0.

Since the error process is assumed to have the same dependence structure for every subject, we write σi(k) = σ(k) and σ̂im(k) = σ̂m(k).
Assumption 1.
- (A.1) (i) ψ is twice differentiable, and ψ″ satisfies a Lipschitz condition of order δ1 > 0; (ii) ψ, ψ′, ψ″ are bounded.
- (A.2) (i) For each i, Eψ(ei1) = 0 and τ ≡ Eψ′(ei1) ≠ 0; (ii) ….
- (A.3) There exist ρ > 0 and a sequence of σ-fields {𝒟j} (as in Lahiri, 1996) such that
  - (i) …, for all k ≥ 1;
  - (ii) for all r ≥ 1 and all k ≥ ρ−1, there exists a σ(𝒟r−k, …, 𝒟r+k)-measurable random variable ẽir,k such that E|eir − ẽir,k| ≤ ρ−1 exp(−ρk);
  - (iii) for all r, k, q ≥ ρ−1, …; and
  - (iv) for all r ≥ ρ−1, k ≤ r, and all tr−k, …, tr+k ∈ ℝ with |tr| > ρ, ….
- (A.4) max{‖xij‖ : 1 ≤ j ≤ m} = O(1) and lim inf_{m→∞} m−1λm ≡ λ > 0, where λm denotes the smallest eigenvalue of ∑_{j=1}^{m} xij xij′.
Let Dim = ∑_{j=1}^{m} xij xij′ and dij = Dim^{−1/2} xij, 1 ≤ j ≤ m. When the eij are weakly dependent for each i, the asymptotic covariance matrix of the normalized score n0^{−1/2} ∑_{i=1}^{n0} ∑_{j=1}^{m} dij ψ(eij) is given by

Γn0m = n0^{−1} ∑_{i=1}^{n0} ∑_{k=0}^{m−1} σ(k) Likm, (9)

where Li0m = Ip and Likm = ∑_{j=1}^{m−k} (dij di(j+k)′ + di(j+k) dij′), 1 ≤ k ≤ m − 1.
To define the studentized version of β̂n0m, note that the asymptotic covariance matrix of Dn0m^{1/2}(β̂n0m − β) is given by

Σn0m = τ^{−2} Γn0m. (10)
Therefore, a natural estimator of Σn0m is

Σ̂n0m = τ̂m^{−2} n0^{−1} ∑_{i=1}^{n0} ∑_{k=0}^{l} σ̂m(k) Likm, (11)

where τ̂m = (n0m)^{−1} ∑_{i=1}^{n0} ∑_{j=1}^{m} ψ′(êij) and 1 ≤ l ≡ lm ≤ m − 1 is an integer. If l → ∞ slowly with m, then ‖Σ̂n0m − Σn0m‖ = op(1). Σ̂n0m is nonsingular with high probability for large m, and can be inverted to define the studentized statistic

Tn0m = Σ̂n0m^{−1/2} Dn0m^{1/2} (β̂n0m − β). (12)
Next, I extend Lahiri's (1996) results to the longitudinal case. Assume that (A.1), (A.2), (A.3)(i),(ii), and (A.4) hold. Then there exists a sequence of statistics {β̂n0m} such that

P(β̂n0m solves (7) for all sufficiently large m) = 1 and β̂n0m → β a.s. as m → ∞. (13)

When (7) has a unique solution β̂n0m, one obtains its strong consistency as in Lahiri (1992). The next result gives a first-order Edgeworth expansion for the studentized M-estimator.
Theorem 1. Assume that Assumptions (A.1)–(A.4) hold and that {β̂n0m} is a sequence of measurable solutions of (7). Then there exists a polynomial pn0m(·) on ℝp such that

sup_{B∈ℬ} |P(Tn0m ∈ B) − ∫_B (1 + pn0m(x)) dΦ(x)| = o(m−1/2) (14)

for every class ℬ of Borel subsets of ℝp satisfying

sup_{B∈ℬ} Φ((∂B)∊) = O(∊) as ∊ ↓ 0. (15)

Here ‖pn0mϕ‖∞ = O(m−1/2) with sup norm ‖·‖∞, ϕ and Φ denote the standard normal density and distribution on ℝp (p ≥ 1), and the coefficients of pn0m(·) are continuous functions of cross-product moments of ψ(eij), ψ′(eij), and ψ″(eij). Here ∂B denotes the boundary of a set B ⊆ ℝp and (∂B)∊ = {x : ‖x − y‖ < ∊ for some y ∈ ∂B}.
4.2. Expansion for Bootstrap M-estimator
Define the bootstrap M-estimator β̂*n0m as a solution of the equation in t ∈ ℝp

∑_{i=1}^{n0} ∑_{j=1}^{m} xij ψ(y*ij − xij′t) = 0, (16)

where y*ij is given in (6).
where and is given in (6). The is the conditional covariance matrix of which is given by
(17) |
The natural estimator of Σ*n0m is

Σ̂*n0m = τ̂*m^{−2} n0^{−1} ∑_{i=1}^{n0} ∑_{k=0}^{l} σ̂*m(k) Likm, (18)

where τ̂*m = (n0m)^{−1} ∑_{i=1}^{n0} ∑_{j=1}^{m} ψ′(e*ij) and σ̂*m(k) is computed as σ̂m(k) with the residuals êij replaced by the bootstrap residuals e*ij. The bootstrap version of Tn0m is given by

T*n0m = (Σ̂*n0m)^{−1/2} Dn0m^{1/2} (β̂*n0m − β̂n0m). (19)
By the assumptions, there exists a sequence of statistics {β̂*n0m} such that

P*(β̂*n0m solves (16) for all sufficiently large m) → 1 and β̂*n0m − β̂n0m → 0 in conditional probability, (20)

where P* denotes probability under the MBB resampling scheme.
Theorem 2. Assume that the conditions of Theorem 1 hold. Suppose that T*n0m is defined for some measurable sequence {β̂*n0m} satisfying (20), and suppose also that mδb−1 = O(1) and b = O(m(1−κ)/4) for some δ > 0 and κ > max{p + 3, 5}δ0. Then

sup_{B∈ℬ} |P*(T*n0m ∈ B) − P(Tn0m ∈ B)| = op(m−1/2) (21)

for any class ℬ of Borel subsets of ℝp satisfying (15).
Theorem 2 shows that the MBB indeed provides a more accurate approximation for the studentized multivariate M-estimator of the regression parameter vector β than the normal approximation. Consequently, Theorem 2 is useful for constructing second-order correct multivariate inference procedures for β under the multiple regression model. In particular, the studentized moving block bootstrap statistic attains second-order accuracy in the bounded n = n0 case.
4.3. A Simulation Study
The block bootstrap captures the dependence in the residual series without requiring knowledge of the correlation structure: it is simple and accounts for the correlation in a regression model with correlated errors. To obtain the bootstrap version of β̂n0m, first form the observed blocks of residuals of length b as ξih = (êih, …, êi(h+b−1)), 1 ≤ h ≤ q, where q = m − b + 1 and êij = yij − xij′β̂n0m, 1 ≤ j ≤ m, 1 ≤ i ≤ n0. Next draw k blocks ξ*i1, …, ξ*ik randomly, with replacement, from ξi1, …, ξiq, where m/b = k is assumed to be an integer for simplicity. Each ξ*ih has b components; denoting the lth component of ξ*ih, 1 ≤ l ≤ b, by e*_{i,(h−1)b+l}, we have the bootstrap pseudo-observations

y*ij = xij′β̂n0m + e*ij, 1 ≤ j ≤ m, 1 ≤ i ≤ n0. (22)
Adapting Shorack's approach, we obtain the bootstrapped estimator β̂*n0m as a solution of the equation in t ∈ ℝp,

∑_{i=1}^{n0} ∑_{j=1}^{m} xij [ψ(y*ij − xij′t) − μ̂n0m] = 0, (23)

where μ̂n0m = En0m ψ(e*ij), and En0m denotes the conditional expectation under the MBB resampling scheme, given ê11, …, ên0m. Centering the equation by μ̂n0m makes the estimating equation conditionally unbiased at β = β̂n0m and ensures the validity of the bootstrap analogy. In the least squares case ψ(u) = u, the bootstrap estimator has the closed form

β̂*n0m = (∑_{i=1}^{n0} ∑_{j=1}^{m} xij xij′)^{−1} ∑_{i=1}^{n0} ∑_{j=1}^{m} xij (y*ij − μ̂n0m). (24)
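For the least squares case, a sketch of one centered bootstrap replicate follows. Here the centering is computed position-wise as the conditional MBB expectation of a resampled residual at each within-block offset, which is one plausible reading of μ̂n0m; the function name is illustrative.

```python
import numpy as np

def centered_ls_bootstrap(resid, X, beta_hat, b, rng=None):
    """One centered MBB replicate for psi(u) = u (least squares).

    resid : (n0, m) residual matrix; X : (n0, m, p); beta_hat : (p,).
    """
    if rng is None:
        rng = np.random.default_rng()
    n0, m = resid.shape
    p = X.shape[2]
    k, q = m // b, m - b + 1
    Xf = X.reshape(n0 * m, p)
    e_star = np.empty((n0, m))
    mu_hat = np.empty((n0, m))
    for i in range(n0):
        starts = rng.integers(0, q, size=k)   # block starts in {0,...,m-b}
        e_star[i] = np.concatenate([resid[i, s:s + b] for s in starts])
        # E_* of a resampled residual at offset l averages e_hat_{i,s+l}
        # over all q possible block starts s.
        offs = np.array([resid[i, l:l + q].mean() for l in range(b)])
        mu_hat[i] = np.tile(offs, k)
    y_star = (Xf @ beta_hat).reshape(n0, m) + e_star
    # Solve sum_ij x_ij [(y*_ij - x_ij' t) - mu_hat_ij] = 0 for t.
    rhs = Xf.T @ (y_star.ravel() - mu_hat.ravel())
    return np.linalg.solve(Xf.T @ Xf, rhs)
```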
Consider the following specific model in the simulation work:

yij = β0 + β1xij + eij, (25)

where eij = ϕei(j−1) + uij and the uij are i.i.d. innovations. In particular, let n0 = 5, m = 20, and xi = (1, …, 20)′ for each subject i.
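A minimal sketch of this simulation design, assuming standard normal innovations uij, a stationary initialization, and illustrative parameter values β0 = β1 = 1 and ϕ = 0.5, none of which are specified above; the output matches the (y, X) layout used in the earlier sketches.

```python
import numpy as np

def simulate_panel(n0=5, m=20, beta0=1.0, beta1=1.0, phi=0.5, rng=None):
    """Generate y_ij = beta0 + beta1 * x_ij + e_ij with AR(1) errors.

    x_ij = j, so x_i = (1, ..., m)'; e_ij = phi * e_{i,j-1} + u_ij with
    u_ij iid N(0, 1).  beta0, beta1, and phi are illustrative values.
    """
    if rng is None:
        rng = np.random.default_rng()
    x = np.arange(1, m + 1, dtype=float)          # common covariate (1,...,20)'
    e = np.empty((n0, m))
    for i in range(n0):
        u = rng.standard_normal(m)
        e[i, 0] = u[0] / np.sqrt(1 - phi ** 2)    # stationary start
        for j in range(1, m):
            e[i, j] = phi * e[i, j - 1] + u[j]
    y = beta0 + beta1 * x + e                     # broadcast over subjects
    X = np.stack([np.ones((n0, m)), np.tile(x, (n0, 1))], axis=2)  # (n0, m, 2)
    return y, X
```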
4.4. Bootstrapping the Distribution of Statistics
Let R be the number of bootstrap simulations (r = 1, …, R), and let β̂*,r be the bootstrap estimate of β for the rth sample. The key result is that the distribution of β̂*, estimated by the empirical distribution function of the β̂*,r (r = 1, …, R), approximates the distribution of β̂. The studentized statistic has the following form:
T̂ = Σ̂n0m^{−1/2} Dn0m^{1/2} (β̂ − β), with bootstrap analog T̂* = (Σ̂*n0m)^{−1/2} Dn0m^{1/2} (β̂* − β̂). (26)
The difference between the distribution functions of T̂ and T̂* tends to 0 as the number of observations grows; thus we can use the quantiles of T̂* in place of those of T̂ to construct intervals or tests. Let T̂*,r, r = 1, …, R, be the bootstrap replicates of T̂*, where T̂* is calculated in the same way as T̂, replacing yij with y*ij. Let q̂α be the α-quantile of the T̂*,r. It can be shown that P(T̂ ≤ q̂α) tends to α as m tends to infinity. This gives a bootstrap confidence interval for β,
ÎR = [β̂ − q̂1−α/2 ŝe(β̂), β̂ − q̂α/2 ŝe(β̂)], (27)

where ŝe(β̂) denotes the estimated standard error of β̂.
For large m and R, the coverage probability of ÎR is close to 1 − α. The bootstrap estimate of the variance is calculated as the empirical variance of the R bootstrap replicates (β̂*,r, r = 1, …, R):

V̂R = (R − 1)^{−1} ∑_{r=1}^{R} (β̂*,r − β̄*)², (28)

where β̄* = R^{−1} ∑_{r=1}^{R} β̂*,r is the sample mean of the bootstrap replicates.
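Given the R replicates, the variance estimate (28) and an interval of the form (27) can be computed as below, for instance applied to the output of the mbb_longitudinal sketch above. This sketch studentizes with a single bootstrap standard error rather than a replicate-specific estimate, a simplification of T̂*.

```python
import numpy as np

def bootstrap_summary(beta_hat, beta_star, alpha=0.05):
    """Bootstrap variance (28) and a percentile-t interval (27) for one
    coefficient.  beta_star is the length-R vector of bootstrap replicates."""
    R = len(beta_star)
    beta_bar = beta_star.mean()                    # sample mean of replicates
    var_boot = ((beta_star - beta_bar) ** 2).sum() / (R - 1)   # equation (28)
    se = np.sqrt(var_boot)
    # Studentized replicates T*_r = (beta*_r - beta_hat) / se.
    t_star = (beta_star - beta_hat) / se
    q_lo, q_hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
    # Interval [beta_hat - q_{1-alpha/2} se, beta_hat - q_{alpha/2} se].
    return var_boot, (beta_hat - q_hi * se, beta_hat - q_lo * se)
```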
Coverage accuracy, where coverage is the probability that a confidence interval includes β, is the key property of a confidence interval procedure. Bootstrap confidence interval methods differ in their asymptotic properties. Our simulation results are given in Table 1. MBB1 and MBB2 behave similarly to each other; both attained approximately correct coverage at the nominal 95% level. The standard bootstrap and the stratified ordinary bootstrap did not perform well in this highly correlated longitudinal setting, yielding low coverage probabilities.
Table 1. Simulated confidence intervals, coverage probabilities at the nominal 95% level, and average interval lengths.

Methods | CI | Probability | Length
---|---|---|---
MBB1 | (0.849, 1.168) | 0.949 | 0.318
MBB2 | (0.844, 1.163) | 0.952 | 0.320
SOB | (0.805, 1.091) | 0.768 | 0.286
SB | (0.804, 1.086) | 0.747 | 0.282
β̂1 | (0.840, 1.169) | 0.950 | 0.329

SOB, stratified ordinary bootstrap; SB, standard bootstrap.
5. Concluding Remarks and Discussion
I have applied moving block bootstrap methods to longitudinal data in which a small number of subjects have a large number of replications over time, investigating the efficacy and utility of the methodology theoretically and empirically through a small simulation study. These methods have second-order optimality for dependent stationary data under regularity conditions.
Quasi-likelihood approaches such as the GEE of Liang and Zeger (1986) and the quadratic inference function (QIF) of Qu et al. (2000) are useful for modeling longitudinal data in the form of a large sample of short time series. However, their advantages of simplicity and robustness against misspecification of the correlation structure are offset by a loss of estimation efficiency and a lack of procedures for model assessment and selection. In other words, these estimators are consistent but not fully efficient. To offset this inefficiency, subjects rather than observations can be resampled to obtain an estimation efficiency that takes the correlation structure into account.
In a longitudinal study with a small sample and a long time series, in which modeling the repeated time pattern is necessary, we assume a stationary process. The stationary process assumption represents a fundamentally different stochastic mechanism from other methods used to govern the structure and behavior of transitions over time. A moving block bootstrap estimation procedure is preferable for highly correlated clustered data with a large, equal number of observations per cluster, or for strongly correlated spatial data collected from a large number of locations.
When both the number of subjects and the number of replications are large, the choice of correlation structure is determined by a trade-off between the number of nuisance parameters and the closeness of the mathematical model to the true underlying structure. The question is which provides better estimation efficiency: a simpler correlation model with few nuisance parameters, or a model closer to the true structure with many nuisance parameters. Further simulations using the bootstrap procedure will be investigated in future work.
Appendix
Proof of Theorem 1
Proof. We follow Lahiri's (1996) notation and definitions. For a smooth function h : ℝp → ℝ, let Djh denote the partial derivative of h(x) with respect to the jth coordinate of x, 1 ≤ j ≤ p. For p × 1 vectors v = (v1, …, vp)′ with nonnegative integer entries and w = (w1, …, wp)′ ∈ ℝp, let |v| = v1 + ⋯ + vp, v! = v1!⋯vp!, and w^v = w1^{v1}⋯wp^{vp}. Let Dv denote the differential operator D1^{v1}⋯Dp^{vp}. For v with 1 ≤ |v| ≤ s, let χv denote the vth cumulant and μv the vth moment of w; these arise as coefficients in the expansions of log E exp(ιt′w) and E exp(ιt′w), t ∈ ℝp. Here w is an ℝp-valued random vector with Ew = 0 and E‖w‖^s < ∞ for some integer s ≥ 3.
Let m3 = [log m log log(3 + m)], v1m = m^{−1/2}(log m)^{1/2}, vm = m^{−1/2}, v2m = vm(log m)^{−1}, and v3m = vm(log m)^{−3/2}. Furthermore, define
A = τIp = Eψ′(eij)Ip, τ = τi = Eψ′(ei1) for each i, Wi1j = ψ(eij), Wi2j = ψ′(eij) − τ, Wi3j = ψ″(eij) − Eψ″(ei1), and Wi4j(k) = ψ(eij)ψ(ei(j+k)) − σi(k). Also, write … for a random vector U in ℝp.
Let …, t ∈ ℝp. Then, by Taylor expansion, one can get

(29)

where …, t ∈ ℝp.
Following Lahiri (1992), we obtain that

(30)

where …,

(31)
If we have Tn0m = T1n0m + Rn0ms, where the remainder term Rn0ms satisfies, under the moment condition E‖y11‖^s < ∞, P(‖Rn0ms‖ > δm,s) ≤ δm,s for some sequence δm,s = o(m^{−(s−2)/2}), then the random variable T1n0m is called an (s − 2)th-order stochastic approximation to Tn0m. Note that the (s − 2)th-order Edgeworth expansions for Tn0m and T1n0m coincide. The point of T1n0m is that its leading term is the same as that of Tn0m, while the remaining terms consist of independent variables, which yields a simpler expansion. The stochastic approximation T1n0m can be expressed in the form
(32)

where q1 = (1, 0, …, 0)′, …, qp = (0, 0, …, 1)′ form the standard basis of ℝp, and Λn0rm, Λ1n0rm, Λvm are nonrandom matrices satisfying max{m^{1/2}‖Λn0rm‖ + ‖Λ1n0rm‖ + ‖Λvm‖ : 1 ≤ r ≤ p, |v| = 1} = o(1). In what follows, C and C(·) denote generic constants that depend only on their arguments, and the dependence of C(·) on p, α, and the finite moments of ψ(eij), ψ′(eij), and ψ″(eij) is suppressed for notational simplicity. Using Lahiri's (1992, 1996) arguments, we can show that
(33)

(34)

(35)

for all 1 ≤ u, l, z ≤ p, where diju denotes the uth component of dij. We then have Tn0m = T1n0m + Rn0m, where P(‖Rn0m‖ > Cv2m) = o(vm), so that the first-order Edgeworth expansions for T1n0m and Tn0m coincide.
Let … with a = [m^{(1−2δ0)/2}], …, and

(36)
The point of T2n0m is that its leading term is the same as that of T1n0m, while the remaining terms consist of truncated independent variables, which gives simplified forms of the expansion. Using an Edgeworth expansion under dependence for T1n0m (Lahiri, 1994, 1996), we have

(37)

We have the same first-order Edgeworth expansions for T2n0m, T1n0m, and Tn0m; namely, the three statistics are close to each other.
We obtain that

(38)

with ‖pn0mϕ‖∞ = O(m^{−1/2}). The proof is then complete.
Proof of Theorem 2
Proof. Let the starred quantities …, for u = 1, …, k, denote the MBB analogs of the corresponding quantities in the proof of Theorem 1.
As in the proof of Theorem 1 and in Lahiri's (1996) result, we have

(39)

where …. We also use T*1n0m and T*2n0m, defined in the same way as T1n0m and T2n0m in the proof of Theorem 1.
Now, using the results of Bhattacharya and Ranga Rao (1986) and Lahiri (1996), we have the same first-order Edgeworth expansion forms for T*2n0m, T*1n0m, and T*n0m, since these are close to each other. We obtain

(40)
References
- Babu GJ. Bootstrapping statistics with linear combinations of chi-squares as weak limit. Sankhya Ser A. 1986;56:85–93.
- Bhattacharya RN, Ranga Rao R. Normal Approximation and Asymptotic Expansions. Malabar, FL: Krieger; 1986.
- Bühlmann P. Sieve bootstrap for time series. Bernoulli. 1997;3:123–148.
- Davison AC, Hinkley DV. Bootstrap Methods and Their Application. Cambridge, UK: Cambridge University Press; 1997.
- Efron B. Bootstrap methods: Another look at the jackknife. Annals Stat. 1979;7:1–26.
- Hall P, Horowitz JL, Jing BY. On blocking rules for the bootstrap with dependent data. Biometrika. 1995;82:561–574.
- Lahiri SN. Bootstrapping M-estimators of a multiple linear regression parameter. Annals Stat. 1992;20:1548–1570.
- Lahiri SN. On two-term Edgeworth expansions and bootstrap approximations for Studentized multivariate M-estimators. Sankhya Ser A. 1994;56:201–226.
- Lahiri SN. On Edgeworth expansion and moving block bootstrap for Studentized M-estimators in multiple linear regression models. J Multivar Anal. 1996;56:42–59.
- Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13–22.
- Paparoditis E, Politis DN. The tapered block bootstrap for general statistics from stationary sequences. Econometrics J. 2002;5:131–148.
- Qu A, Lindsay BG, Li B. Improving generalized estimating equations using quadratic inference functions. Biometrika. 2000;87:823–836.
- Singh K. On the asymptotic accuracy of Efron's bootstrap. Annals Stat. 1981;9:1187–1195.
- Xie M, Yang Y. Asymptotics for generalized estimating equations with large cluster sizes. Annals Stat. 2003;31:310–347.