SUMMARY
In many experiments, time series data can be collected from multiple units and multiple time series segments can be collected from the same unit. This article introduces a mixed effects Cramér spectral representation which can be used to model the effects of design covariates on the second-order power spectrum while accounting for potential correlations among the time series segments collected from the same unit. The transfer function is composed of a deterministic component to account for the population-average effects and a random component to account for the unit-specific deviations. The resulting log-spectrum has a functional mixed effects representation where both the fixed effects and random effects are functions in the frequency domain. It is shown that, when the replicate-specific spectra are smooth, the log-periodograms converge to a functional mixed effects model. A data-driven iterative estimation procedure is offered for the periodic smoothing spline estimation of the fixed effects, penalized estimation of the functional covariance of the random effects, and unit-specific random effects prediction via the best linear unbiased predictor.
Keywords: Cramér representation, Mixed effects model, Smoothing spline, Spectral analysis, Replicated time series
1. INTRODUCTION
In biomedical experiments, it is common to collect time series data from multiple subjects and use the time series as the basic unit in the analysis to study the effects of design covariates. These studies can include multiple time series segments collected from the same unit, which can be potentially correlated. The motivating study considered in this article measures three epochs of heart rate variability from subjects during three stages of sleep where it is believed that the heart rate variability spectra are associated with sleep and the presence of disease (Malik et al., 1996; Hall et al., 2004). When the focus of the analysis is on the effects of the design covariates on the first moment, such data can be modelled by mixed effects models. However, few methods exist when the interest is on the second-order spectra.
A model for replicate time series collected from independent units was introduced by Diggle & Al Wasel (1997) which models unit-specific spectra through a log-linear mixed effects model. Iannaccone & Coles (2001) generalize this model by allowing for the nonparametric spline estimation of the fixed effects. More recently, Freyermuth et al. (2010) introduced a tree-structured wavelet method for the estimation of the spectra of replicated time series. These models are designed exclusively for the analysis of a collection of time series under a very simple nested structure, where the individual time series are mutually independent. A novel contribution of our article is the introduction of the mixed effects Cramér representation for modelling a collection of stationary time series that exploits the flexibility of the linear mixed effects model (Laird & Ware, 1982) where many designs can be handled by correctly specifying the design matrices. This model can take into account potential correlations among the spectra of time series segments from a common unit.
The asymptotic properties of periodograms for deterministic spectra are well known and are the foundation for most nonparametric spectral estimation procedures. We investigate the asymptotic properties of the periodograms of collections of time series that can be modelled through a mixed effects Cramér spectral representation and show that when the replicate-specific spectra almost surely lie in a Sobolev space, the log-periodograms uniformly converge to a functional mixed effects model that is comprised of the log-spectral fixed effects, the log-spectral random effects and additive noise with known variance.
Mixed effects models with functional random and fixed effects have been the focus of considerable research including Rice & Silverman (1991), Brumback & Rice (1998), Staniswalis & Lee (1998), Rice & Wu (2001), Guo (2002), Wu & Zhang (2002), Morris & Carroll (2006), Qin & Guo (2006) and Zhang & Chen (2007). The functional models of Guo (2002), Morris & Carroll (2006) and Qin & Guo (2006) can flexibly encompass a variety of designs through the specification of design matrices for both the random and fixed effects in a manner similar to the traditional linear mixed effects model. Guo (2002) and Qin & Guo (2006) define the covariance kernels of the random effects as the reproducing kernel of a Sobolev space while Morris & Carroll (2006) parametrically define the covariance kernels through a small number of covariance parameters. Our proposed mixed effects model for the log-periodograms is similar to these three models in that design matrices can be specified to flexibly incorporate a variety of designs, but differs in that the only assumptions made of the covariance kernels of the log-spectral random effects are inherited from the assumption that the random effects are independent second-order stochastic processes with trajectories almost surely in a Sobolev space (Lukic & Beder, 2001). Another distinction between our proposed model and the models of Guo (2002), Morris & Carroll (2006) and Qin & Guo (2006) is that they assume that the random effects are Gaussian whereas our model does not assume a distribution for the log-spectral random effects.
To fit the mixed effects model for the log-periodograms, we propose an iterative algorithm that begins with an initial smoothing spline estimator of the log-spectral fixed effects. This initial estimator is obtained by approximating the minimizer of a penalized sum-of-squares which ignores the within-unit log-spectral correlation and can be viewed as an extension of the estimators of Cogburn & Davis (1974) and Wahba (1980) to the regression setting. Despite the empirical findings of Qin & Wang (2008), which show that the negative penalized Whittle-likelihood under the proper selection of smoothing parameters can produce more efficient spectral estimates than the penalized sum-of-squares for a single deterministic spectrum, we base our fixed effects estimator on a penalized sum-of-squares because of its computational feasibility and transparent form as a multivariate low-pass filter applied to the ordinary least squares estimates of the log-periodograms at each frequency.
Smoothing spline estimates of fixed effects from correlated data are consistent if the within-unit correlation is ignored but they are not efficient (Welsh et al., 2002; Lin et al., 2004). Iterative procedures for functional models that nonparametrically account for the within-replicate correlation when unique replicates are independent are offered by Yao & Lee (2006) and by Krafty et al. (2008). We extend these ideas to formulate an iterative procedure for the log-periodogram mixed effects model. First, the log-spectral fixed effects are estimated. Conditional on estimates of the fixed effects, the functional covariances of the random effects are estimated as the outer product of smoothed deviations of unit-specific log-periodograms from their estimated mean with smoothing parameters selected for the optimal estimation of the covariance functions via a Kullback–Leibler criterion (Krafty et al., 2008). The estimates of the first two moments of the log-spectra are used to estimate the unit-specific random effects through plug-in estimates of the best linear unbiased predictors. Fixed effects are then re-estimated by approximating the minimizer of the penalized sum-of-squares after removing the estimated unit-specific random effects from the log-periodograms.
2. MODEL
2.1. Mixed effects Cramér spectral representation
We introduce the mixed effects Cramér spectral representation for modelling a collection of time series from N independent units, where nj time series are observed from the jth unit. This model consists of a stochastic transfer function that is composed of population fixed effects and unit-specific random effects. Let Ujk = (ujk1,..., ujkP)T ∈ 𝒰 and Vjk = (vjk1,...,vjkQ)T ∈ 𝒱 be vectors of covariates for the kth replicate of the jth independent unit which index the fixed effects and the unit-specific random effects, respectively. These covariates can include continuous covariates as well as indicator variables for categorical variables. Our motivating study of heart rate variability, discussed in greater detail in § 5, consists of n = 375 epochs of heart rate variability measured at nj = 3 different sleep stages from N = 125 independent subjects. In addition to its dependence on sleep stage, the expected spectrum of heart rate variability is hypothesized to be associated with the presence of insomnia. The fixed covariates are modelled with P = 6 by indicator variables to indicate the presence of insomnia and stage of sleep. The covariates of the random effects are modelled with Q = 2 to capture the variation in the across-the-night average heart rate variability between different subjects and the correlation in the heart rate variability from the same subject at different stages of sleep.
The transfer function of the kth replicate of the jth independent unit is decomposed into A0(ω; Ujk)Aj (ω; Vjk), where A0 is a fixed effects term and Aj is a random effects term. To formally define our model, let the population fixed effects term A0 be a complex valued function over ℝ × 𝒰 such that for every Ujk ∈ 𝒰, A0(·; Ujk) is Hermitian, square-integrable over [−1/2, 1/2], and has period 1 as a function of frequency. The unit-specific random terms are defined for j = 1,..., N as the complex valued random functions Aj over ℝ × 𝒱 such that for every Vjk ∈ 𝒱, Aj (·; Vjk) are Hermitian, square-integrable over [−1/2, 1/2], and have period 1 as trajectories over frequency. Additionally, Aj and Aj′ for j, j′ ≠ 0 are independent and identically distributed conditional on Vjk when j ≠ j′, and it is assumed that supω∈ℝ, Vjk∈𝒱 E{|Aj (ω; Vjk)|2} < ∞.
The mixed effects Cramér spectral representation defines the kth replicate time series of the jth independent unit {Xjkt} as
\[
X_{jkt} = \int_{-1/2}^{1/2} A_0(\omega; U_{jk})\, A_j(\omega; V_{jk})\, e^{2\pi i \omega t}\, dZ_{jk}(\omega), \tag{1}
\]
where Zjk are mutually independent identically distributed mean-zero orthogonal increment processes over [−1/2, 1/2] that are independent of Aj′ for all j and j′, and E{|dZjk (ω)|²} = dω. The time series {Xjkt} exists with probability one, is mean zero second-order stationary and has spectral density |A0(ω; Ujk)|² E{|Aj (ω; Vjk)|²}. Additionally, any second-order stationary time series with a spectral density possesses a mixed effects Cramér spectral representation. Conditional on Aj, the time series {Xjkt} is also mean zero second-order stationary and we define the replicate-specific spectra and the population average spectrum as
\[
f_{jk}(\omega; U_{jk}, V_{jk}) = |A_0(\omega; U_{jk})|^2\, |A_j(\omega; V_{jk})|^2, \qquad
E\{f_{jk}(\omega; U_{jk}, V_{jk})\} = |A_0(\omega; U_{jk})|^2\, E\{|A_j(\omega; V_{jk})|^2\}.
\]
We focus on inference on the log-spectral scale and without loss of generality assume that the replicate-specific spectra are parameterized such that E{log |Aj (ω; Vjk)|2} = 0 for all ω ∈ ℝ and Vjk ∈ 𝒱.
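The statement in § 2.1 that {Xjkt} has spectral density |A0|² E{|Aj|²} can be traced in two lines. The sketch below assumes that representation (1) has the usual Cramér form with transfer function A0(ω; Ujk)Aj(ω; Vjk), and that the order of expectation and integration can be exchanged, which the bound on E{|Aj|²} and the square-integrability of A0 make plausible; it is an illustration rather than a proof.

```latex
\begin{align*}
\operatorname{cov}(X_{jk,t+h}, X_{jkt} \mid A_j)
  &= \int_{-1/2}^{1/2} |A_0(\omega; U_{jk})|^2\, |A_j(\omega; V_{jk})|^2\,
     e^{2\pi i \omega h}\, d\omega, \\
\operatorname{cov}(X_{jk,t+h}, X_{jkt})
  &= E\{\operatorname{cov}(X_{jk,t+h}, X_{jkt} \mid A_j)\}
   = \int_{-1/2}^{1/2} |A_0(\omega; U_{jk})|^2\, E\{|A_j(\omega; V_{jk})|^2\}\,
     e^{2\pi i \omega h}\, d\omega .
\end{align*}
```

The first line uses E{|dZjk(ω)|²} = dω and the independence of Zjk and Aj; the second uses E(Xjkt | Aj) = 0, so that the marginal covariance is the expectation of the conditional covariance.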
2.2. Semiparametric log-spectral model
We will assume semiparametric models for both the fixed and random components of the transfer functions. The semiparametric model of the fixed effects component of the transfer function is defined for Ujk = (ujk1,..., ujkP)T ∈ 𝒰 as \( A_0(\omega; U_{jk}) = \prod_{p=1}^{P} \{h_{0p}(\omega)\}^{u_{jkp}} \), where the \( h_{0p} \) are deterministic Hermitian functions over ℝ with period 1 that are bounded away from zero. The random components of the unit-specific transfer functions are defined for Vjk = (vjk1,...,vjkQ)T ∈ 𝒱 as \( A_j(\omega; V_{jk}) = \prod_{q=1}^{Q} \{h_{jq}(\omega)\}^{v_{jkq}} \), where hjq are mutually independent Hermitian random functions with period 1 that are bounded away from zero, hjq and hj′q are independent and identically distributed for j ≠ j′, and E{|hjq (ω)|⁴} < ∞. Define the functions \( \beta_p(\omega) = \log |h_{0p}(\omega)|^2 \) and αjq (ω) = log |hjq (ω)|² as well as the P-dimensional vectors β (ω) = {β1(ω), . . . , βP (ω)}T and the Q-dimensional vectors αj (ω) = {αj1(ω), . . . , αjQ (ω)}T. This transfer function model induces the semiparametric mixed effects model on the replicate-specific log-spectra
\[
\log f_{jk}(\omega; U_{jk}, V_{jk}) = U_{jk}^{T} \beta(\omega) + V_{jk}^{T} \alpha_j(\omega). \tag{2}
\]
If we define the covariance function for the qth log-spectral random effect as Γq (ω, ν) = E{αjq (ω)αjq (ν)} and let Γ(ω, ν) = diag{Γ1(ω, ν), . . . , ΓQ (ω, ν)} be the diagonal Q × Q matrix of these covariances, the first two central moments of log-spectra are
\[
E\{\log f_{jk}(\omega; U_{jk}, V_{jk})\} = U_{jk}^{T} \beta(\omega), \qquad
\operatorname{cov}\{\log f_{jk}(\omega; U_{jk}, V_{jk}), \log f_{jk'}(\nu; U_{jk'}, V_{jk'})\} = V_{jk}^{T}\, \Gamma(\omega, \nu)\, V_{jk'} .
\]
2.3. Modelling log-spectral effects in a Sobolev space
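The mixed effects Cramér spectral representation (1) and semiparametric log-spectral model (2) offer a general model for collections of time series data that can be potentially correlated. Many applications, such as the study of heart rate variability explored in § 5, involve unit-specific log-spectra which are smooth. Section 3 develops a data-driven estimation procedure when the unit-specific log-spectra are real-valued periodic functions that are absolutely continuous, have absolutely continuous first derivatives and L2 integrable second derivatives. Although the space of such functions can be expanded in a cosine series, we find it advantageous for later theoretical exposition to represent this space as
\[
\mathcal{W} = \Big\{ f : f(\omega) = f(\omega + 1),\ f \text{ and } f' \text{ absolutely continuous},\ \int_{-1/2}^{1/2} \{f''(\omega)\}^2\, d\omega < \infty \Big\}.
\]
Penalized estimation procedures will be developed by viewing this space as the reproducing kernel Hilbert space under the well-studied periodic Sobolev norm which is examined in Cogburn & Davis (1974) and in § 4.2 of Gu (2002). Consider the decomposition \( \mathcal{W} = W_0 \oplus W_1 \) where \( W_0 = \{ f \in \mathcal{W} : f \text{ is constant} \} \) and \( W_1 = \{ f \in \mathcal{W} : \int_{-1/2}^{1/2} f(\omega)\, d\omega = 0 \} \) are viewed as Hilbert spaces under the respective inner products \( \langle f, g \rangle_0 = \{\int_{-1/2}^{1/2} f(\omega)\, d\omega\} \{\int_{-1/2}^{1/2} g(\omega)\, d\omega\} \) and \( \langle f, g \rangle_1 = \int_{-1/2}^{1/2} f''(\omega)\, g''(\omega)\, d\omega \). The spaces W0 and W1 are reproducing kernel Hilbert spaces with reproducing kernels R0(ω, ν) = 1 and \( R_1(\omega, \nu) = \sum_{k=1}^{\infty} 2 (2\pi k)^{-4} \cos\{2\pi k(\omega - \nu)\} \) so that \( \mathcal{W} \) is a reproducing kernel Hilbert space with reproducing kernel 1 + R1(ω, ν).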
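A closed form for the kernel of W1 is convenient in computation. The sketch below assumes that R1 is the standard second-order periodic spline kernel of Gu (2002, § 4.2), R1(ω, ν) = Σk≥1 2 cos{2πk(ω − ν)}/(2πk)⁴, which equals −B4({ω − ν})/24 where B4 is the fourth Bernoulli polynomial and {·} denotes the fractional part. The function names are illustrative only.

```python
import math


def r1_series(omega, nu, terms=2000):
    """Truncated cosine series for the assumed periodic spline kernel R1."""
    x = omega - nu
    return sum(
        2.0 * math.cos(2.0 * math.pi * k * x) / (2.0 * math.pi * k) ** 4
        for k in range(1, terms + 1)
    )


def r1_closed(omega, nu):
    """Closed form -B4({omega - nu}) / 24, with B4 the fourth Bernoulli polynomial."""
    x = (omega - nu) % 1.0  # fractional part, which enforces period 1
    b4 = x**4 - 2.0 * x**3 + x**2 - 1.0 / 30.0
    return -b4 / 24.0


if __name__ == "__main__":
    # The two evaluations should agree up to the truncation error of the series.
    print(r1_series(0.3, -0.2), r1_closed(0.3, -0.2))
```

Because the kernel depends on ω and ν only through ω − ν and has period 1, its matrix of values at equally spaced Fourier frequencies is circulant, which is why the discrete Fourier transform diagonalizes it.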
3. ESTIMATION
3.1. Log-periodogram mixed effects model
Let T = 2L for a positive integer L and assume that we observe epochs of length T of a collection of time series {Xjk1,..., XjkT} that follow a mixed effects Cramér spectral representation for k = 1,..., nj and j = 1,..., N. Let ωℓ = ℓ/T for ℓ = (1 − L),..., L be the Fourier frequencies and define the finite Fourier transforms as \( d_{jk\ell} = T^{-1/2} \sum_{t=1}^{T} X_{jkt} \exp(-2\pi i \omega_\ell t) \) and subsequent periodograms as Ijkℓ = |djkℓ|². Theorem 1 establishes asymptotic properties of the log-periodograms when the replicate-specific log-spectra are in 𝒲 and allows the log-periodograms to be approximated by a functional mixed effects model.
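The periodogram ordinates can be computed directly from their definition. The sketch below assumes the usual normalization d = T^{-1/2} Σt Xt exp(−2πiωℓt) and returns the ordinates at the Fourier frequencies ωℓ = ℓ/T for ℓ = 1 − L,..., L; the function name `periodogram` is illustrative.

```python
import numpy as np


def periodogram(x):
    """Return the Fourier frequencies l/T, l = 1-L..L, and I_l = |d_l|^2."""
    x = np.asarray(x, dtype=float)
    T = x.size
    if T % 2 != 0:
        raise ValueError("the epoch length T must be even (T = 2L)")
    L = T // 2
    ell = np.arange(1 - L, L + 1)
    t = np.arange(1, T + 1)
    # d_l = T^{-1/2} * sum_t x_t * exp(-2 pi i omega_l t), with omega_l = l / T
    d = np.exp(-2j * np.pi * np.outer(ell, t) / T) @ x / np.sqrt(T)
    return ell / T, np.abs(d) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    freqs, ordinates = periodogram(rng.standard_normal(64))
    log_periodogram = np.log(ordinates)
    print(freqs[:3], log_periodogram[:3])
```

For a real-valued series the ordinates are symmetric in ℓ, so only ℓ = 0,..., L carry distinct information. A fast Fourier transform would give the same values more efficiently; the explicit sum is used here to mirror the definition.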
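Theorem 1. Assume that βp ∈ 𝒲 for p = 1,..., P, αjq ∈ 𝒲 almost surely for q = 1,..., Q, and Assumptions A1–A5 in the Appendix. Let κjkℓ = log Ijkℓ − log fjk (ωℓ; Ujk, Vjk) and note that κjkℓ = κjk−ℓ. As T → ∞, κjkℓ are asymptotically independent for ℓ = 0,..., L, the κjkℓ are asymptotically distributed as log(ψ) for ℓ = 1,...,(L − 1) where \( \psi \sim \chi^2_2 / 2 \), and κjk0 and κjkL are asymptotically distributed as log(ϕ) where \( \phi \sim \chi^2_1 \).
Let γ ≈ 0.577 be the Euler–Mascheroni constant. For ℓ ≠ 0, L, let γℓ = −E{log(ψ)} = γ and \( \sigma_\ell^2 = \operatorname{var}\{\log(\psi)\} = \pi^2/6 \). Define γ0 = γL = −E{log(ϕ)} = log 2 + γ and \( \sigma_0^2 = \sigma_L^2 = \operatorname{var}\{\log(\phi)\} = \pi^2/2 \). Uniformly in j, k and ℓ, as T → ∞,
\[
E(\kappa_{jk\ell}) \to -\gamma_\ell, \qquad \operatorname{var}(\kappa_{jk\ell}) \to \sigma_\ell^2 .
\]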
The first part of Theorem 1 demonstrates that analogous distributional properties to the well-studied properties of log-periodograms for deterministic transfer functions exist for the Cramér spectral representation model. Letting yjkℓ = log Ijkℓ + γℓ, the log-periodograms approximately follow the functional mixed effects model
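\[
y_{jk\ell} = U_{jk}^{T} \beta(\omega_\ell) + V_{jk}^{T} \alpha_j(\omega_\ell) + \epsilon_{jk\ell}, \tag{3}
\]
where ∊jkℓ are zero mean independent random variables for ℓ = 0,..., L with variance σℓ² = π²/6 for ℓ = 1,..., L − 1 and σℓ² = π²/2 for ℓ = 0, L. The second part of Theorem 1 provides the uniform convergence of the first two moments of this smooth signal plus noise model and subsequently allows functional mixed effects modelling techniques to be applied to (3) to obtain consistent estimates of βp, Γq and αjq.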
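The moments of the limiting log-periodogram errors can be checked by simulation. The sketch below assumes the standard limits for periodogram ordinates of a real-valued series, namely a χ²₂/2 variable at interior Fourier frequencies and a χ²₁ variable at frequencies 0 and 1/2, and compares Monte Carlo moments of their logarithms with the digamma and trigamma values −γ, π²/6, −(log 2 + γ) and π²/2. The variable names and the number of draws are illustrative.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329

rng = np.random.default_rng(0)
n_draws = 400_000

# Assumed limiting variables: chi-square(2)/2 at interior frequencies,
# chi-square(1) at the boundary frequencies 0 and 1/2.
log_psi = np.log(rng.chisquare(2, n_draws) / 2.0)
log_phi = np.log(rng.chisquare(1, n_draws))

# Each entry: (simulated mean, simulated variance, exact mean, exact variance).
summary = {
    "interior": (log_psi.mean(), log_psi.var(), -EULER_GAMMA, np.pi**2 / 6),
    "boundary": (log_phi.mean(), log_phi.var(),
                 -(np.log(2.0) + EULER_GAMMA), np.pi**2 / 2),
}

if __name__ == "__main__":
    for name, values in summary.items():
        print(name, ["%.3f" % v for v in values])
```

Adding the negative of the exact mean to each log-periodogram ordinate removes the asymptotic bias, leaving a signal plus noise model whose error variances are known constants.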
3.2. Fixed effects
The proposed estimator of β is based on minimizing the penalized sums-of-squares
(4) |
over 𝒲 given smoothing parameters λp ⩾ 0. By the representer lemma for smoothing splines, if Uj are the nj × P matrices with kpth elements ujkp and is full rank, then a unique solution exists. To find this solution, let , . Further, let R be the T × T matrix of the reproducing kernel R1 of W1 evaluated at the Fourier frequencies. The matrix R has the singular value decomposition FT DF, where F is the matrix of the discrete Fourier transform that has ℓmth element T−1/2 exp{−2πi(m − L) (ℓ − L)/T} (Gu, 2002, § 4.2). The minimizer of (4) over 𝒲 is defined for ω ∈ ℝ as d̂ + [{R1(ω, ω1−L),..., R1(ω, ωL)} ⊗ IP]ĉ for d̂ ∈ ℝP and ĉ ∈ ℝPT that satisfy
where Λ = diag(λ1,...,λP), e0 is the T-vector of zeros except for a one in the (L + 1)st element, and 0p is the P-vector of zeros. Cogburn & Davis (1974) developed and Wahba (1980) popularized an approximation to the minimizer of the penalized sums-of-squares over that results in an estimator that is equivalent to applying the classical Butterworth filter to the log-periodograms. We apply this approximation to the minimizer (4) and estimate β (ω) with
(5) |
When n = 1 and P = 1, the proposed estimator is equivalent to the estimator proposed by Wahba (1980) and is subsequently a generalization of this well-studied estimator to the regression setting with multiple log-spectra. Using simple algebra to express
illuminates that the proposed estimate is a type of multivariate low-pass filter applied to the ordinary least squares estimates at each frequency. It is dependent on the smoothing parameters such that as maxp λp → 0, β̂ approaches a spline interpolation of the ordinary least squares estimates. Theorem 2 establishes the optimal decay of λp if both the number of independent units and the number of time-points are large, as well as the point-wise consistency of β̂.
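The low-pass filter interpretation is easiest to see for a single log-periodogram. The sketch below is not the multivariate estimator (5); it illustrates the Butterworth-type approximation of Cogburn & Davis (1974) and Wahba (1980) for one curve, assuming the criterion is normalized as T^{-1} Σℓ {yℓ − f(ωℓ)}² + λ ∫ {f″(ω)}² dω, so that the Fourier coefficient of the data at integer wavenumber ν is multiplied by 1/{1 + λ(2πν)⁴}. The function name and the value of λ are illustrative.

```python
import numpy as np


def lowpass_smooth(y, lam):
    """Shrink the Fourier coefficients of y by 1 / (1 + lam * (2 pi nu)^4)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    # Integer wavenumbers 0, 1, ..., -1 of a curve sampled at spacing 1/T.
    wavenumber = np.fft.fftfreq(T, d=1.0 / T)
    gain = 1.0 / (1.0 + lam * (2.0 * np.pi * wavenumber) ** 4)
    return np.fft.ifft(gain * np.fft.fft(y)).real


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    T, L = 128, 64
    omega = np.arange(1 - L, L + 1) / T
    truth = 2.0 * np.cos(2.0 * np.pi * omega)
    # Noise with the interior-frequency variance pi^2 / 6.
    y = truth + rng.normal(scale=np.pi / np.sqrt(6.0), size=T)
    fit = lowpass_smooth(y, lam=1e-4)
    print(np.mean((y - truth) ** 2), np.mean((fit - truth) ** 2))
```

With λ = 0 the filter returns the data unchanged, and as λ grows every non-constant component is removed and the fit tends to the average of the data, which matches the two limits of a periodic smoothing spline.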
Theorem 2. Assume the conditions of Theorem 1, regularity assumptions on the covariate design given by Assumptions A6 and A7, and that nj is bounded away from zero and infinity through Assumption A8. Let λ = maxp λp, and assume that there exists a constant C such that minp λp/λ > C. As N, T →∞, the optimal mean-square convergence rate is achieved when λ ∼ (NT)−4/5, under which
Traditionally, the large sample properties of smoothing splines are explored either when the number of independent units is bounded, most famously in the estimation of a single curve when N = 1 and n1 = 1 (Wahba, 1990), or in the longitudinal data setting where N may be large but the number of observations per individual unit is bounded (Lin et al., 2004). Theorem 2 shows that the amount of smoothing for the estimation of fixed effects depends on both N and T and consequently smoothing parameter selection procedures directed towards the prediction of individual log-spectra, i.e., ignoring N, will behave suboptimally. One popular approach to the data-driven selection of smoothing parameters for linear smoothers is generalized crossvalidation (Gu, 2002). The generalized crossvalidation procedure to select the parameters indexing the estimate β̂ as a linear smoother of the ordinary least squares estimates of β at each Fourier frequency minimizes
3.3. Functional covariance
We propose an estimate of the functional covariance of the log-spectral random effects conditional on the estimate β̂ through the outer product of smoothed unit-specific quantities. Define the residuals and . We propose to estimate Γq (ω, ν) for ω, ν ∈ ℝ as
(6) |
where α̃j = (α̃j1,..., α̃jQ)T is based on minimizing
over given the smoothing parameters θq ⩾ 0. Approximating the solution to the penalized sums-of-squares, we estimate Γ̂q (ω, ν) as the qth diagonal element of the Q × Q matrix
where Vj is the nj × Q matrix with kqth element vjkq and Θ = diag(θ1,...,θQ). Theorem 3 finds the optimal decay of the smoothing parameters θq for the estimation of Γq as well as the point-wise mean-squared consistency of the estimate of Γq.
Theorem 3. Assume the conditions of Theorem 2 and the existence of the first eight moments of Xjkt given in Assumption A9. Let θ = maxq θq, and assume that there exists a constant C such that minq θq/θ > C. As N, T → ∞, the optimal mean-square convergence is achieved when θ ∼ (N−2/3 + T−4/3), under which
The amount of smoothing for method-of-moments type estimation of Γq requires a smoothing parameter that decays with respect to the number of time-points at a rate of T−4/3, whereas the amount of smoothing for α̃jq to predict the unit-specific log-spectra αjq would decay at a rate of T−4/5 (Wahba, 1980). Consequently, selecting smoothing parameters to optimally predict unit-specific random effects when forming a method-of-moments estimate for the functional covariance will result in over-smoothing. This result partially motivates the development of an iterative procedure in § 3.5 for the estimation of βp and Γq that allows for different amounts of smoothing to be used for the estimation of the fixed effect coefficients and the covariance of the random effects. The functions α̃j will be thought of as nuisance parameters for the estimation of the functional covariance and not estimates of the unit-specific random effects.
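The outer product construction can be sketched in the simplest design. The code below is not estimator (6); it assumes Q = 1 with vjk1 = 1 for every replicate, so that the smoothed deviation for a unit is a low-pass filtered average of its residual log-periodograms, and it uses a Butterworth-type filter with gain 1/{1 + θ(2πν)⁴} as the smoother. The function names, the filter normalization and the simulated data are assumptions made for illustration.

```python
import numpy as np


def lowpass_smooth(y, lam):
    """Apply the gain 1 / (1 + lam * (2 pi nu)^4) along the last axis of y."""
    y = np.asarray(y, dtype=float)
    T = y.shape[-1]
    wavenumber = np.fft.fftfreq(T, d=1.0 / T)
    gain = 1.0 / (1.0 + lam * (2.0 * np.pi * wavenumber) ** 4)
    return np.fft.ifft(gain * np.fft.fft(y, axis=-1), axis=-1).real


def covariance_estimate(resid, theta):
    """Outer-product estimate of Gamma on the Fourier grid.

    resid has shape (N, n_j, T): log-periodograms minus estimated fixed effects.
    """
    unit_mean = resid.mean(axis=1)                 # (N, T) unit-level deviations
    alpha_tilde = lowpass_smooth(unit_mean, theta)  # nuisance smooths, not BLUPs
    return alpha_tilde.T @ alpha_tilde / alpha_tilde.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    N, n_j, T = 400, 3, 64
    omega = np.arange(1 - T // 2, T // 2 + 1) / T
    alpha = rng.normal(scale=np.sqrt(2.0), size=(N, 1)) * np.cos(2 * np.pi * omega)
    noise = rng.normal(scale=np.pi / np.sqrt(6.0), size=(N, n_j, T))
    gamma_hat = covariance_estimate(alpha[:, None, :] + noise, theta=1e-5)
    print(gamma_hat[T // 2 - 1, T // 2 - 1])  # estimate of Gamma(0, 0), true value 2
```

The estimate is symmetric and nonnegative definite by construction. The smoothing parameter θ plays the role described above: it is tuned for the covariance, and the smoothed deviations are treated as nuisance quantities.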
Krafty et al. (2008) offer a crossvalidation procedure for the data-driven selection of smoothing parameters θq that uses the quasi Kullback–Leibler distance conditional on fixed effect estimates as a measure of lack-of-fit over the functional covariance space. To approximate the first two moments of all data from an independent unit, let W be the QT × QT block matrix with ℓmth Q × Q block Γ(ωℓ−L, ωm−L), and Σ∊ be the T × T matrix with ℓmth element . Theorem 1 implies that and where . We use a leave-one-out procedure to estimate the quasi Kullback–Leibler distance, conditional on β̂, between the within-unit covariance obtained by using smoothing parameters Θ and the true within-unit covariance. Letting Γ̂(−j)(ω, ν : Θ) be the estimate of the matrix Γ(ω, ν) using smoothing parameters Θ and excluding data from the jth independent unit, Ŵ(−j)(Θ) be the corresponding QT × QT block matrix with ℓmth Q × Q block Γ̂(−j)(ωℓ−L, ωm−L : Θ), and , we select Θ to minimize
where |Σ̂j (Θ)|+ is the product of the positive eigenvalues of Σ̂j (Θ) and Σ̂j (Θ)− is its Moore–Penrose inverse.
3.4. Unit-specific random effects
Estimates of the best linear unbiased predictors of unit-specific random effects allow for the pooling of information across units and serve as the cornerstone of unit-specific random effect estimation (Robinson, 1991). The plug-in estimate of the best linear unbiased predictor of αj (ω) for ω ∈ ℝ based on is
(7) |
where Γ̂(ω, ν) and Ŵ are the estimates of Γ(ω, ν) and W with their dependence on smoothing parameters suppressed for notational simplicity.
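The structure of a plug-in best linear unbiased predictor can be illustrated in a reduced setting. The sketch below is not expression (7); it assumes Q = 1 with vjk1 = 1, evaluates the predictor only on the Fourier grid, and takes the noise to be uncorrelated across frequencies with known variances, so that the predictor is cov(αj, rj) var(rj)^{-1} rj for the stacked residuals rj of one unit. The function name and argument layout are illustrative.

```python
import numpy as np


def blup_random_effect(resid, gamma, noise_var):
    """Plug-in BLUP of alpha_j on the Fourier grid for a single unit.

    resid: (n_j, T) residual log-periodograms of the unit.
    gamma: (T, T) estimated covariance of the random effect.
    noise_var: length-T vector of known noise variances.
    """
    n_j, T = resid.shape
    sigma_eps = np.diag(noise_var)
    # Variance of the stacked residuals: every replicate shares alpha_j.
    var_r = np.kron(np.ones((n_j, n_j)), gamma) + np.kron(np.eye(n_j), sigma_eps)
    # Covariance between alpha_j and the stacked residuals.
    cov_ar = np.kron(np.ones((1, n_j)), gamma)
    return cov_ar @ np.linalg.solve(var_r, resid.reshape(-1))


if __name__ == "__main__":
    rng = np.random.default_rng(3)
    T, n_j = 8, 3
    omega = np.arange(1 - T // 2, T // 2 + 1) / T
    c = np.cos(2.0 * np.pi * omega)
    gamma = 2.0 * np.outer(c, c)
    noise_var = np.full(T, np.pi**2 / 6)
    resid = rng.normal(size=(n_j, T))
    print(blup_random_effect(resid, gamma, noise_var))
```

In this balanced special case the predictor reduces to Γ(Γ + Σ∊/nj)^{-1} applied to the average residual of the unit, which makes the shrinkage towards zero explicit: directions in which Γ is small relative to the noise are strongly damped.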
3.5. Iterative estimation of βp and αjq
The estimates obtained by first estimating β through (5) then estimating Γq through (6) are consistent. However, Welsh et al. (2002) and Lin et al. (2004) showed that failing to account for within-unit correlation in the smoothing spline estimation of fixed effects results in suboptimal estimates. Adapting the ideas proposed by Yao & Lee (2006) for principal component-based penalized spline models to our periodic spline model for log-periodograms, we propose to increase the efficiency by iterating between the estimation of the random effects after removing estimates of the fixed effects and the estimation of the fixed effects after removing estimates of the random effects. Formally, if we set , the estimates β(m), and are defined for iteration m = 1, 2,... by
letting and , define β̂(m) through (5) with Yℓ replaced with ;
defining and for j = 1,..., N through (6) and (7) with β̂ replaced with β̂(m) in the definition of .
This algorithm can be repeated until some convergence criteria are met and, if M is the final iteration of the algorithm, set β̂ = β̂(M), and .
3.6. Point-wise confidence intervals for fixed effects
We propose a parametric bootstrap procedure for obtaining point-wise confidence intervals for the log-spectral fixed effects. Although the proposed model in (2) and iterative procedure for obtaining point estimates given in § 3.5 do not assume a distribution for the log-spectral random effects, we propose parametric Gaussian bootstrap sampling of the log-spectral random effects. Morris (2002) discusses the underestimation of variability in nonparametric bootstrap sampling of mixed effects models.
The bth bootstrap sample of the collection of time series for b = 1,..., B is generated by first generating the log-spectral random effects for j = 1,..., N and q = 1,..., Q as independent mean zero Gaussian random processes with covariance kernel Γ̂q. These random effects are used to compute the replicate-specific spectra . Dai & Guo (2004, Theorem 2) allow for the approximate simulation of time series epochs with a given spectrum and we use this to simulate time series epochs for t = 1,..., T with spectrum . The log-periodograms of the time series are computed and used in the data-driven iterative algorithm to obtain the fixed effects coefficient estimates from the bth bootstrap sample, . We estimate the 100(1 − α)% confidence interval for βp (ω) as {ξp (ω; α/2), ξp (ω; 1 − α/2)}, where ξp (ω; α) is the 100α percentile of the set and .
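Two ingredients of the bootstrap can be sketched in isolation: drawing Gaussian random effects with a given covariance on the Fourier grid, and forming point-wise percentile limits from bootstrap curves. The code below assumes a plain percentile interval and an eigendecomposition-based sampler; it does not reproduce the time series simulation of Dai & Guo (2004), and the function names are illustrative.

```python
import numpy as np


def draw_gaussian_effects(gamma, size, rng):
    """Draw mean-zero Gaussian vectors whose covariance matrix is gamma."""
    vals, vecs = np.linalg.eigh((gamma + gamma.T) / 2.0)
    # Clip tiny negative eigenvalues caused by rounding; gamma = root @ root.T.
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))
    return rng.standard_normal((size, gamma.shape[0])) @ root.T


def percentile_interval(boot_estimates, level=0.95):
    """Point-wise percentile limits from a (B, T) array of bootstrap curves."""
    tail = 100.0 * (1.0 - level) / 2.0
    lower, upper = np.percentile(boot_estimates, [tail, 100.0 - tail], axis=0)
    return lower, upper


if __name__ == "__main__":
    rng = np.random.default_rng(5)
    omega = np.arange(-3, 5) / 8
    c = np.cos(2.0 * np.pi * omega)
    gamma = 2.0 * np.outer(c, c) + 0.5
    effects = draw_gaussian_effects(gamma, size=1000, rng=rng)
    lower, upper = percentile_interval(effects)
    print(lower.round(2), upper.round(2))
```

In a full implementation each draw of the random effects would be combined with the estimated fixed effects to form replicate-specific spectra, time series would be simulated from those spectra, and the whole estimation procedure would be rerun on every bootstrap sample before the percentiles are taken.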
4. SIMULATIONS
Simulation studies were conducted to investigate the empirical performance of the proposed iterative estimation procedure and to compare it to the performance of three alternative procedures. The first alternative estimation procedure is the noniterative procedure, or the procedure proposed in § 3.5 with M = 1, which is implemented to allow for the empirical assessment of the benefits in iterating. The proposed iterative estimation procedure can be viewed as a data-driven smoother applied to the collection of log-periodograms which is formulated to exploit the dependence structure across time series. A naive approach would be to obtain replicate-specific log-spectral estimates by applying a smoothing procedure to replicate-specific log-periodograms while ignoring the dependence structure among time series. Fixed effect estimates can then be obtained through ordinary least squares at each frequency. The second alternative estimation procedure considered examines the empirical properties of this approach, which we will refer to as presmoothing, by applying the generalized crossvalidated spline smoother of Wahba (1980) to each replicate-specific log-periodogram. Estimation procedures that smooth across frequency are essential in the consistent estimation of a spectrum from a single time series. However, consistent estimates of log-spectral fixed effects can be obtained from the log-periodograms of a collection of time series without smoothing across frequency when N is large. The third alternative estimation procedure explores the empirical properties of estimating the log-spectral fixed effects without smoothing across frequency by computing the ordinary least squares estimates of the log-spectral fixed effects at each Fourier frequency.
The simulation settings consider nj = 5, P = 2 and Q = 1. Replicate-specific log-spectra are obtained as gjk (ω) = β1(ω) + ujk β2(ω) + αj (ω) for β1(ω) = 2 cos(2πω), β2(ω) = 2 cos(4πω) and αj = ξj1 + ξj2 cos(2πω) + ξj3 cos(4πω), where (ξj1, ξj2, ξj3)T are independent mean zero normal random vectors with variance matrix diag(2.5, 2, 1), and ujk are independent uniform random variables over [0, 1]. After a replicate-specific log-spectrum is simulated, the square-root of the spectrum is calculated and used as the replicate-specific transfer function to simulate the time series Xjkt in accordance with Dai & Guo (2004, Theorem 2). Five hundred random samples of N independent units of time series epochs of length T are drawn for the six possible combinations of N = 50, 100 and T = 50, 100, 200. Ninety-five per cent bootstrap confidence intervals for the log-spectral fixed effects from B = 1000 bootstrap samples are computed for each simulated random sample.
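The replicate-specific log-spectra of this design can be generated in a few lines. The sketch below follows the stated settings (nj = 5, β1(ω) = 2 cos(2πω), β2(ω) = 2 cos(4πω), Gaussian coefficients with variances 2.5, 2 and 1, uniform covariates) and evaluates the curves on the Fourier grid; it stops short of simulating the time series themselves, and the function name and return layout are illustrative.

```python
import numpy as np


def simulate_log_spectra(N, T, n_j=5, rng=None):
    """Return the Fourier grid, covariates u (N, n_j) and log-spectra g (N, n_j, T)."""
    rng = np.random.default_rng() if rng is None else rng
    L = T // 2
    omega = np.arange(1 - L, L + 1) / T
    c1 = np.cos(2.0 * np.pi * omega)
    c2 = np.cos(4.0 * np.pi * omega)
    beta1, beta2 = 2.0 * c1, 2.0 * c2
    u = rng.uniform(0.0, 1.0, size=(N, n_j))
    # Independent mean-zero normal coefficients with variances 2.5, 2 and 1.
    xi = rng.normal(size=(N, 3)) * np.sqrt([2.5, 2.0, 1.0])
    alpha = xi[:, [0]] + xi[:, [1]] * c1 + xi[:, [2]] * c2      # (N, T)
    g = beta1 + u[:, :, None] * beta2 + alpha[:, None, :]       # (N, n_j, T)
    return omega, u, g


if __name__ == "__main__":
    omega, u, g = simulate_log_spectra(N=50, T=100, rng=np.random.default_rng(6))
    spectra = np.exp(g)  # replicate-specific spectra on the Fourier grid
    print(g.shape, spectra.min() > 0.0)
```

Under this design the random effect has covariance kernel Γ(ω, ν) = 2.5 + 2 cos(2πω) cos(2πν) + cos(4πω) cos(4πν), which gives a known target against which the covariance estimates can be compared.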
The performance of the estimation procedures is evaluated through the square error of the log-spectral fixed effects over the Fourier frequencies, of the replicate-specific log-spectral estimates and of the estimates of the covariance kernel as displayed in Table 1. The average across the curve coverage of the 95% point-wise bootstrap confidence intervals for the six settings ranged from 90.4 to 95.5%. In each of the six settings, the proposed iterative procedure has smaller square error than the other methods for the estimation of the log-spectral fixed effects, the prediction of replicate-specific log-spectra, and for the estimation of the covariance kernel. Although the errors of the estimates of the replicate-specific spectra via the proposed iterative estimation procedure decreased as either N or T increased, the error of the estimates of the replicate-specific spectra obtained through presmoothing did not change with N. This finding is a consequence of the fact that the estimated best linear unbiased predictor of a subject-specific log-spectrum requires the estimation of the covariance kernels, which depends on N, whereas the performance of presmoothing the data for the estimation of a replicate-specific log-spectrum only depends on T. It implies that the relative gain in efficiency for the estimation of the replicate-specific spectra between the proposed iterative procedure and presmoothing decreases as the ratio of T to N increases.
Table 1. Square error ×10² of the estimated log-spectral fixed effects (β̂1, β̂2), replicate-specific log-spectra (ĝjk) and covariance kernel (Γ̂)
T | Method | β̂1 (N = 50) | β̂2 (N = 50) | ĝjk (N = 50) | Γ̂ (N = 50) | β̂1 (N = 100) | β̂2 (N = 100) | ĝjk (N = 100) | Γ̂ (N = 100)
---|---|---|---|---|---|---|---|---|---
50 | Iterative | 7.9 | 11.0 | 22.8 | 57.3 | 3.9 | 9.3 | 21.3 | 32.8
50 | Noniterative | 13.9 | 29.1 | 24.5 | 58.9 | 7.9 | 17.8 | 22.5 | 34.0
50 | Presmooth | 15.0 | 34.8 | 275.6 | — | 9.6 | 28.0 | 286.9 | —
50 | OLS | 17.2 | 26.7 | — | — | 11.2 | 17.2 | — | —
100 | Iterative | 7.3 | 2.1 | 7.2 | 51.2 | 3.8 | 1.5 | 5.9 | 24.8
100 | Noniterative | 12.4 | 19.8 | 8.5 | 51.9 | 6.6 | 11.7 | 6.8 | 25.1
100 | Presmooth | 12.2 | 20.0 | 49.5 | — | 7.1 | 14.6 | 48.1 | —
100 | OLS | 15.5 | 25.3 | — | — | 10.1 | 15.1 | — | —
200 | Iterative | 5.3 | 0.7 | 2.8 | 33.4 | 2.8 | 0.4 | 1.5 | 20.5
200 | Noniterative | 9.4 | 13.2 | 3.6 | 36.2 | 4.6 | 8.0 | 2.2 | 26.2
200 | Presmooth | 9.3 | 13.7 | 9.4 | — | 4.6 | 8.7 | 9.4 | —
200 | OLS | 12.8 | 21.3 | — | — | 7.3 | 12.3 | — | —
OLS, ordinary least squares.
5. ANALYSIS OF HEART RATE VARIABILITY
Heart rate variability is the measure of variability in the elapsed time between consecutive heart beats. The spectral analysis of heart rate variability is important in the study of various physiological outcomes and provides indirect measures of autonomic nervous system activity (Malik et al., 1996). Researchers have devised methods to assess heart rate variability continuously and non-invasively throughout sleep (Hall et al., 2004, 2007). In the present study, sleep and heart rate variability were concurrently assessed in a sample of N = 125 men and women between the ages of 60 and 89 years. Data were collected in participants’ homes to enhance the ecological validity of study measures. Of these participants, 76 were poor sleepers due to insomnia while 49 were poor sleepers due to emotional strain of caregiving for a spouse with advanced dementia. The present analysis uses epochs of heart rate variability tachograms, or the series of the elapsed time between consecutive heart beats indexed by beat number, of the first 500 continuous heart beats during each of the first three periods of non-rapid eye movement. The data for each subject are comprised of patient type, either insomnia or caregiver and three time series, heart rate variability for the first three periods. Heart rate variability during the first three periods, their associated log-periodograms and estimated best linear unbiased predictors of the log-spectra for a caregiver participant are displayed in Fig. 1.
The goal of our analysis is to quantify the expected differences in heart rate variability spectra between individuals with insomnia and caregivers during different periods. We model the heart rate variability log-spectrum for the j = 1,..., 125 subjects at the first k = 1, 2, 3 periods as , where
and Sj is the indicator variable for the jth subject being an insomniac. The fixed effects β1, β2, β3 are the mean log-spectrum at periods 1, 2 and 3, respectively, for caregivers, β4, β5, β6 are the differences in the mean log-spectra between individuals with insomnia and caregivers at periods 1, 2 and 3 respectively, Γ1 accounts for the variability in the across-the-night average log-spectra among subjects and Γ2 is the covariance kernel among adjacent periods within a subject.
The estimated fixed effects are displayed in Fig. 2 along with point-wise 95% bootstrap confidence intervals from B = 1000 random bootstrap samples. The mean log-spectra for caregivers over different periods, reflected in β̂1, β̂2 and β̂3, are very similar, with a slight gain in power at frequencies below 0.05 cycles/beat as the night progresses. The expected log-power at all frequencies for individuals with insomnia is greater than that for caregivers at the first and third periods, reflected through β̂4 and β̂6, but less than that for caregivers during the second period, reflected in β̂5. The monotonic decrease in β̂4, β̂5 and β̂6 reveals that the ratio of low-frequency power to high-frequency power is exaggerated for each of the first three periods among individuals with insomnia compared with caregivers. Clinically, this increase in the ratio of power in low to high frequencies is an indirect measure of an increase in the sympathovagal balance and has been shown to be associated with acute stress (Hall et al., 2007). Figure 3 shows that both the estimated variability in the across-the-night average log-spectrum among subjects from Γ̂1 and the estimated within-subject variability between adjacent periods from Γ̂2 are larger for higher frequencies.
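The point-wise bootstrap intervals mentioned above can be illustrated with a generic percentile bootstrap. The sketch below resamples whole subjects with replacement and uses the cross-subject mean curve as a simple stand-in for the smoothing spline estimator; the function name `pointwise_bootstrap_ci` and the synthetic curves are hypothetical, and this is not necessarily the resampling scheme used in the analysis.

```python
import random


def pointwise_bootstrap_ci(curves, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap band for the mean curve, resampling whole subjects."""
    rng = random.Random(seed)
    n_subj, n_freq = len(curves), len(curves[0])
    boot_means = []
    for _ in range(n_boot):
        # Draw subjects with replacement so each curve stays intact.
        sample = [curves[rng.randrange(n_subj)] for _ in range(n_subj)]
        boot_means.append(
            [sum(c[i] for c in sample) / n_subj for i in range(n_freq)]
        )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    lower, upper = [], []
    for i in range(n_freq):
        # Sort the bootstrap replicates separately at each frequency.
        vals = sorted(b[i] for b in boot_means)
        lower.append(vals[lo_idx])
        upper.append(vals[hi_idx])
    return lower, upper


# Synthetic curves: 20 subjects observed at 5 frequencies (assumption).
rng = random.Random(1)
curves = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
lower, upper = pointwise_bootstrap_ci(curves, n_boot=1000)
```

Resampling at the subject level rather than the epoch level preserves the within-subject correlation among periods in each bootstrap sample.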
6. DISCUSSION
The model and estimation procedure introduced in this article offer tools for analysing collections of time series from designed studies and can be extended in several directions to encompass other popular settings. We have focused on estimation based on the first two moments of the log-spectra. It is possible to extend our procedure to Whittle-likelihood-based inference. In addition, many applications involve the analysis of replicated time series that are not stationary. The incorporation of the tensor-product design employed by Guo et al. (2003) into our proposed model to allow for the spectral analysis of replicated nonstationary time series could provide a useful tool for the analysis of replicated locally stationary time series. We hypothesize that the computational burden associated with Whittle-likelihood-based inference and the tensor-product analysis of locally stationary time series for a collection of time series could make these two extensions nontrivial. Although the mixed effects Cramér spectral representation holds when unit-specific spectra are not smooth, our estimation procedure is formulated for applications such as the analysis of heart rate variability where a global smoothness criterion is appropriate. Extending the iterative estimation procedure to utilize tools that can capture local properties, such as free knot splines or wavelet bases, could prove useful in many applications.
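For readers unfamiliar with the Whittle likelihood mentioned above, the following minimal sketch evaluates its negative logarithm for a single stationary series. It assumes the standard approximation in which periodogram ordinates at the Fourier frequencies are treated as independent exponential variables with means equal to the spectrum; the function name `whittle_neg_loglik` and the toy inputs are hypothetical, and this is not an implementation of the proposed extension.

```python
import math


def whittle_neg_loglik(pgram, log_spec):
    """Negative Whittle log-likelihood, up to an additive constant.

    pgram: periodogram ordinates at the Fourier frequencies.
    log_spec: candidate log-spectrum evaluated at the same frequencies.
    """
    # Each term is log f(w) + I(w) / f(w), with f = exp(log_spec).
    return sum(g + p * math.exp(-g) for p, g in zip(pgram, log_spec))


# Toy check: with a flat periodogram equal to 2, a constant log-spectrum
# of log(2) should beat nearby constant candidates.
pgram = [2.0] * 10
candidates = [math.log(1.5), math.log(2.0), math.log(3.0)]
values = [whittle_neg_loglik(pgram, [c] * len(pgram)) for c in candidates]
```

Parameterizing the objective by the log-spectrum keeps the spectrum positive without constraints, which matches the log-spectral scale on which the fixed and random effects are defined.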
Acknowledgments
The authors would like to thank the editor, associate editor and two referees for insightful comments. We thank Daniel Buysse, Anne Germain, Sati Mazumdar, Timothy Monk and Eric Nofzinger for their contributions to the AgeWise study. This work is supported by the National Institute on Aging, the National Cancer Institute and the National Science Foundation, U.S.A.
APPENDIX. Regularity assumptions
The properties established in the theorems are dependent on regularity assumptions about the distributions of Zjk and fjk (ω; Ujk, Vjk) and on the covariate design. For the noise process, we will make Assumption A1 which assures that Xjkt can be written as a linear process, the Cramér type condition in Assumption A2 which guarantees that the distribution of the Fourier transform of zjkt is absolutely continuous, and the existence of the fourth moment in Assumption A3.
Assumption A1. Let $z_{jkt} = \int e^{2\pi i \omega t}\, dZ_{jk}(\omega)$. The random variables $z_{jkt}$ are white noise with mean zero and unit variance.
Assumption A2. There exists an integer $\rho > 0$ such that $\int |E(e^{i s z_{jkt}})|^{\rho}\, ds < \infty$.
Assumption A3. Fourth moments of $z_{jkt}$ exist such that $\sup_{\omega} E\{|Z_{jk}(\omega)|^{4}\} < \infty$.
Assumption A2, which excludes zjkt with discrete distributions, is satisfied when zjkt possesses a differentiable density. Regularity on fjk (ω; Ujk, Vjk) will be induced through regularity assumptions on Aj, 𝒰 and 𝒱. Assumption A4 assures that the unit-specific spectra are bounded away from zero while Assumptions A5–A7 place regularity conditions on the covariate design.
Assumption A4. There exists an $\epsilon > 0$ such that $\sup_{\omega, V_{jk}} \mathrm{pr}\{|A_j(\omega; V_{jk})|^{2} < \epsilon\} = 0$.
Assumption A5. The sets 𝒰 and 𝒱 are compact.
For a square matrix $M$, let $\sigma_{-}(M)$ and $\sigma_{+}(M)$ be its smallest and largest eigenvalues. We will assume the following.
Assumption A6. There exist positive constants $D_1$ and $D_2$ such that $\lim_{N \to \infty} \sigma_{-}(U^{T}U/n) = D_1$ and $\lim_{N \to \infty} \sigma_{+}(U^{T}U/n) = D_2$.
Assumption A7. There exist positive constants D3 and D4 such that and for all j.
The asymptotic properties are explored when the number of replicates per unit is bounded through Assumption A8.
Assumption A8. There exist positive $n_{-}$ and $n_{+}$ such that $n_{-} \leq n_j \leq n_{+}$ for all $j$.
The consistency of the covariance kernel of the log-spectral random effects requires the existence of the first eight moments of Xjkt through Assumption A9.
Assumption A9. The first eight moments of $h_{jq}$ and $Z_{jk}$ are bounded such that $\sup_{\omega, q} E\{|h_{jq}(\omega)|^{8}\} < \infty$ and $\sup_{\omega} E\{|Z_{jk}(\omega)|^{8}\} < \infty$.
References
- Brumback BA, Rice JA. Smoothing spline models for the analysis of nested and crossed samples of curves. J Am Statist Assoc. 1998;93:961–94.
- Cogburn R, Davis HT. Periodic splines and spectral estimation. Ann Statist. 1974;2:1108–26.
- Dai M, Guo W. Multivariate spectral analysis using Cholesky decomposition. Biometrika. 2004;91:629–43.
- Diggle PJ, Al Wasel I. Spectral analysis of replicated biomedical time series. Appl Statist. 1997;46:37–71.
- Freyermuth J-M, Ombao H, von Sachs R. Tree-structured wavelet estimation in a mixed effects model for spectra of replicated time series. J Am Statist Assoc. 2010;105:634–46.
- Gu C. Smoothing Spline ANOVA Models. New York: Springer; 2002.
- Guo W. Functional mixed effects models. Biometrics. 2002;58:121–8. doi: 10.1111/j.0006-341x.2002.00121.x.
- Guo W, Dai M, Ombao H, von Sachs R. Smoothing spline ANOVA for time-dependent spectral analysis. J Am Statist Assoc. 2003;98:643–52.
- Hall M, Thayer J, Germain A, Moul D, Vasko R, Puhl M, Miewald J, Buysse D. Psychological stress is associated with heightened physiological arousal during NREM sleep in primary insomnia. Behav Sleep Med. 2007;5:178–93. doi: 10.1080/15402000701263221.
- Hall M, Vasko R, Buysse D, Ombao H, Chen Q, Cashmere J, Kupfer D, Thayer J. Acute stress affects heart rate variability during sleep. Psychosomatic Med. 2004;66:56–62. doi: 10.1097/01.psy.0000106884.58744.09.
- Iannaccone R, Coles S. Semiparametric models and inference for biomedical time series with extravariation. Biostatistics. 2001;2:261–76. doi: 10.1093/biostatistics/2.3.261.
- Krafty RT, Gimotty PA, Holtz D, Coukos G, Guo W. Varying coefficient model with unknown within-subject covariance for analysis of tumor growth curves. Biometrics. 2008;64:1023–31. doi: 10.1111/j.1541-0420.2007.00980.x.
- Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982;38:963–74.
- Lin X, Wang N, Welsh A, Carroll RJ. Equivalent kernels of smoothing splines in nonparametric regression for clustered data. Biometrika. 2004;91:177–93.
- Lukic MN, Beder JH. Stochastic processes with sample paths in reproducing kernel Hilbert spaces. Trans Am Math Soc. 2001;353:3945–69.
- Malik M, Bigger JT, Camm AJ, Kleiger RE, Malliani A, Moss AJ, Schwartz PJ. Heart rate variability: standards of measurement, physiological interpretation, and clinical use. Circulation. 1996;93:1043–65.
- Morris JS. The BLUPs are not “best” when it comes to bootstrapping. Statist Prob Lett. 2002;56:425–30.
- Morris JS, Carroll RJ. Wavelet-based functional mixed models. J R Statist Soc B. 2006;68:179–99. doi: 10.1111/j.1467-9868.2006.00539.x.
- Qin L, Guo W. Functional mixed-effects model for periodic data. Biostatistics. 2006;7:225–34. doi: 10.1093/biostatistics/kxj003.
- Qin L, Wang Y. Nonparametric spectral analysis with applications to seizure characterization using EEG time series. Ann Appl Statist. 2008;2:1432–51.
- Rice JA, Silverman BW. Estimating the mean and covariance structure nonparametrically when the data are curves. J R Statist Soc B. 1991;53:233–43.
- Rice JA, Wu CO. Nonparametric mixed effects models for unequally sampled noisy curves. Biometrics. 2001;57:253–9. doi: 10.1111/j.0006-341x.2001.00253.x.
- Robinson GK. That BLUP is a good thing: the estimation of random effects. Statist Sci. 1991;6:15–32.
- Staniswalis JG, Lee JJ. Nonparametric regression analysis of longitudinal data. J Am Statist Assoc. 1998;93:1403–18.
- Wahba G. Automatic smoothing of the log-periodogram. J Am Statist Assoc. 1980;75:122–32.
- Wahba G. Spline Models for Observational Data. Philadelphia: SIAM; 1990. CBMS-NSF Regional Conference Series in Applied Mathematics.
- Welsh A, Lin X, Carroll RJ. Marginal longitudinal nonparametric regression: locality and efficiency of spline and kernel methods. J Am Statist Assoc. 2002;97:482–93.
- Wu H, Zhang J-T. Local polynomial mixed-effects models for longitudinal data. J Am Statist Assoc. 2002;97:883–97.
- Yao F, Lee TCM. Penalized spline models for functional principal components analysis. J R Statist Soc B. 2006;68:3–25.
- Zhang J-T, Chen J. Statistical inferences for functional data. Ann Statist. 2007;35:1052–79.