Additive quantile regression for clustered data with an application to children’s physical activity

Marco Geraci

doi:10.1111/rssc.12333

Author manuscript; available in PMC: 2020 Aug 1.

Published in final edited form as: J R Stat Soc Ser C Appl Stat. 2018 Dec 25;68(4):1071–1089. doi: 10.1111/rssc.12333

PMCID: PMC6664292 NIHMSID: NIHMS1042531 PMID: 31363233

Summary.

Additive models are flexible regression tools that handle linear as well as non-linear terms. The latter are typically modelled via smoothing splines. Additive mixed models extend additive models to include random terms when the data are sampled according to cluster designs (e.g. longitudinal). These models find applications in the study of phenomena like growth, certain disease mechanisms and energy expenditure in humans, when repeated measurements are available. We propose a novel additive mixed model for quantile regression. Our methods are motivated by an application to physical activity based on a data set with more than half a million accelerometer measurements in children of the UK Millennium Cohort Study. In a simulation study, we assess the proposed methods against existing alternatives.

Keywords: Bag of little bootstraps, Linear quantile mixed models, Low rank splines, Random effects, Shrinkage, Smoothing

1. Introduction

The goal of regression analysis is to model the distribution of an outcome as a function of one or more covariates. Mean regression is used to assess how the outcome changes on average when the covariates change, and it often implies that the direction and strength of the statistical associations are the same for all individuals in a population. However, conditionally on their observed characteristics, subjects who rank below or above the average of the outcome distribution may respond differently to the same treatment or exposure. Evidence of heterogeneous effects across the outcome distribution has been found in numerous studies, such as the effect of smoking on weight in lighter or heavier infants (Geraci, 2016) and the effect of sedentary behaviour on different centiles of children’s anthropometric variables (España-Romero et al., 2013). Children in the tails of these distributions may be at higher risk of morbidity and mortality than those who are at the centre of the distribution.

By definition, mean effects average out stronger and weaker effects. The averaging may even cancel out symmetric effects of the same magnitudes but opposite signs in the tails of the distribution. Quantile regression (QR) (Koenker and Bassett, 1978) is a flexible statistical tool with a vast number of applications that complements mean regression. QR has become a successful analytic method in many fields of science because of its ability to draw inferences about individuals that rank below or above the population conditional mean. The ranking within the conditional distribution of the outcome can be considered as a natural index of individual latent characteristics which cause heterogeneity at the population level (Koenker and Geling, 2001). There is an increasingly wide acknowledgement of the importance of investigating sources of heterogeneity to quantify more accurately costs, benefits and effectiveness of interventions or medical treatments, whether it be a healthcare reform (Winkelmann, 2006) or a thrombolytic therapy (Austin et al., 2005). QR is particularly suitable for this purpose as it yields inferences that are valid regardless of the true underlying distribution. Also, quantiles enjoy several good properties, including equivariance to monotone transformations and robustness to outliers.

In this paper we are concerned specifically with non-parametric QR functions of continuous response variables when data arise from cluster designs. Our research is motivated by a study on daily and weekly physical activity patterns in school-aged children by using high frequency accelerometer data. This study was conducted as part of the Millennium Cohort Study (MCS), which is a larger, longitudinal, nationally representative study of a cohort of children and their families in the UK. Findings from the MCS showed that only half of UK 7-year-olds achieve recommended levels of physical activity, with girls far less active than boys (Griffiths et al., 2013). The benefits of regular physical activity on wellbeing and life expectancy as well as the detrimental health effects of sedentary behaviour have been amply documented (for example see Ekelund et al. (2016) and Warburton et al. (2006)). It is therefore important to identify the predictors of physical activity, not only at the average intensity of activity but also (and perhaps especially) at lower and upper intensities as they relate to respectively sedentary behaviour and moderate-to-vigorous activity.

Fig. 1 shows accelerometer counts by time of the day during weekdays (or workdays) and weekend days for over 1000 MCS children who provided reliable data for 7 days of the week, totalling more than half a million observations. During weekdays there are periods of higher activity levels that mirror travelling times to and from school, and lunch and break times (Sera et al., 2017). In general, temporal (diurnal) trajectories of physical activity are characterized by strongly non-linear patterns that require some degree of smoothing (Morris et al., 2006; Sera et al., 2017). In contrast, some predictors of interest, such as those reported in Table 1, may simply have linear effects. Since data are collected longitudinally to examine weekly patterns, the correlation at the individual level must be taken into account.

Table 1.

Categorical and continuous variables for English children of the MCS^†

Variable	Level	Children (%)	Measurements (%)
Sex	Male (reference)	614 (53.2)
Sex	Female	540 (46.8)
Ethnicity	White (reference)	962 (83.4)
Ethnicity	Other than white	192 (16.6)
Income quintile	1	117 (10.1)
	2	170 (14.7)
	3	220 (19.1)
	4	318 (27.6)
	5 (reference)	329 (28.5)
Reading for pleasure	Often (reference)	998 (86.5)
Reading for pleasure	Not often	156 (13.5)
Transportation to or from school	Active (reference)	604 (52.3)
Transportation to or from school	Passive	550 (47.7)
Number of cars or vans owned	0	65 (5.6)
	1	412 (35.7)
	2 (reference)	620 (53.7)
	3 or more	57 (5.0)
Day of the week	Monday-Friday (reference)		455830 (71.4)
Day of the week	Saturday or Sunday		182332 (28.6)
Season	Autumn		230285 (36.1)
	Winter		13509 (2.1)
	Spring		82634 (12.9)
	Summer (reference)		311734 (48.9)
Time of the day	min
Body mass index	kg m⁻²	5-number summary: (11.2, 15.1, 16.1, 17.5, 32.6)
Accelerometer counts (× 1000)		5-number summary: (0, 0.9, 3.1, 7.5, 297.1)

^† The data set consists of 638162 accelerometer measurements, aggregated over 10-min intervals, from a total of 1154 children. Note that the reference categories are the modal categories.

Additive mixed models (AMMs) (Wood, 2006) have been developed precisely to incorporate linear and non-linear effects, as well as random effects for clustered data. However, the temporal trajectories at different quantile levels of the conditional distribution in Fig. 1 are not simply vertical shifts of one another, as implied by the AMM. In this situation, QR is a natural approach to consider. Of course, parametric approaches based on flexible distributions such as generalized additive models for location, scale and shape (Rigby and Stasinopoulos, 2005) can be an equally valid alternative. However, we believe that an important characteristic of QR is the ‘quantile treatment effect’ interpretation of the regression coefficients as discussed by Koenker (2005). Such interpretation may be of primary interest in some applications and is not immediately available from generalized additive models for location, scale and shape and approaches alike.

There is a considerable variety of proposals in the literature on non-parametric quantile functions for data with no clustering (see, for example, Koenker et al. (1994), He et al. (1998), Koenker and Mizera (2004), Yu and Jones (1998) and Horowitz and Lee (2005)). A few approaches have been proposed also for the estimation of non-parametric quantile functions with repeated measurements or when the data are subject to other forms of dependence. Wei et al. (2006) discussed an additive QR model that includes a first-order auto-regressive component to model serial correlation. Fenske et al. (2013) proposed an additive QR model for longitudinal data that includes fixed cluster-specific intercepts and slopes (and, thus, no covariance structure), as well as additive non-linear effects modelled via penalized splines. In their model, fitted via boosting, they controlled shrinkage and smoothing with prespecified tuning parameters. An additive model was also considered by Yue and Rue (2011) who proposed normally distributed random intercepts and non-linear terms with Bayesian P-splines and Gaussian Markov random fields as smoothness priors.

The modelling approach that we develop in this paper differs on several accounts from existing proposals. First, we model the intracluster correlation by means of random effects instead of auto-regressive errors (Wei et al., 2006). Second, in contrast with Yue and Rue (2011), we include cluster-specific random slopes in addition to random intercepts, and, in contrast with Fenske et al. (2013), we allow cluster-specific effects to have a general covariance matrix. In addition, our estimation approach radically differs from that of Fenske et al. (2013) since, as described in Section 2.2, the optimal degree of shrinkage of the cluster-specific effects and the optimal level of smoothing for the non-linear terms are automatically estimated from the data.

We propose novel additive quantile models that include linear terms and non-linear terms, as well as random-effects terms which account for the clustering. Further, non-linear terms are modelled non-parametrically by using penalized splines and fitted via automatic scatter plot smoothing within a mixed model framework (Ruppert et al., 2003). Our quantile models can be considered as the equivalent of AMMs for the mean. To our knowledge, our models are the first additive models in a frequentist framework that include multiple random effects with a general variance–covariance structure and allow for automatic smoothing selection. Also, they are shown to have a superior performance compared with the only directly comparable approach which is based on cluster-specific fixed effects with a prespecified degree of smoothing (Fenske et al., 2013). Finally, the software implementation of our methods has been made readily available under a ‘GNU’ general public licence.

In the next section, we describe the methods and, briefly, their implementation in the R language (R Core Team, 2018), with further technical details provided in Web-based supporting materials. In Section 3, we carry out a simulation study to assess the performance of the methods proposed (with details reported in the supporting materials). The real data analysis is presented in Section 4 whereas concluding remarks are given in Section 5.

2. Methods

2.1. Notation

We consider data from two-level nested designs in the form $(x_{i j}^{T}, z_{i j}^{T}, y_{i j})$ , for j = 1, … , n_i and i = 1, … , M, N = Σ_i n_i, where $x_{i j}^{T}$ is the jth row of a known n_i × p matrix X_i, $z_{i j}^{T}$ is the jth row of a known n_i × q matrix Z_i and y_ij is the jth observation of the response vector y_i = (y_i1, … , y_in_i)^T for the ith unit or cluster. This kind of data arises from longitudinal studies and other cluster sampling designs (e.g. spatial cluster designs). Throughout the paper, the covariates x and z are assumed to be given and measured without error. The n × 1 vectors of 0s and 1s will be denoted by 0_n and 1_n respectively, the n × n identity matrix by I_n and the m × n matrix of 0s by O_m×n. The norm of a vector a with respect to a matrix G will be denoted as $‖ a ‖_{G} = \sqrt{a^{T} G a}$ . Finally, the Kronecker product and the direct sum will be denoted by ‘⊗’ and ‘⊕’ respectively.

2.2. The model

We define the following τth additive QR model

Q_{y_{i j} | u_{i}, x_{i j}, z_{i j}} (τ) = β_{τ, 0} + \sum_{k = 1}^{p} g_{τ}^{(k)} (x_{i j k}) + z_{i j}^{T} u_{τ, i}, j = 1, \dots, n_{i}, i = 1, \dots, M,

(1)

for τ ∈ (0, 1), where $g_{τ}^{(k)}$ is a τ-specific, centred, twice-differentiable smooth function of the kth component of x. The q × 1 vector u_τ,i collects cluster-specific random effects associated with z_ij and its distribution is assumed to depend on a τ-specific parameter (further details are provided in the next section).

Without loss of generality, let the components of x = (x₁, … , x_s, x_s+1, … , x_p)^T be ordered in such a way that the first s terms of the summation in model (1) are non-linear functions and the remaining p−s are linear. To model non-linear functions, we consider a spline model of the type

g_{τ} (x) \approx \sum_{h = 1}^{H} v_{τ, h} B_{h} (x),

(e.g. a cubic or B-spline), where the B_hs and v_τ,hs, h = 1, … , H, denote respectively the basis functions and the corresponding coefficients, and H depends on the degrees of freedom or the number of knots. Note that the coefficients are τ specific. The quantile function in model (1) is then approximated by

Q_{y_{i j} | u_{i}, x_{i j}, z_{i j}}^{*} (τ) = β_{τ, 0} + \sum_{k = 1}^{s} \sum_{h = 1}^{H_{k}} v_{τ, h k} B_{h}^{(k)} (x_{i j k}) + \sum_{k = s + 1}^{p} β_{τ, k} x_{i j k} + z_{i j}^{T} u_{τ, i} .

(2)

Let B^(k)(x_ijk) be the H_k × 1 vector of values taken by the kth spline evaluated at x_ijk, v_τ,k = (v_τ,1k, … , v_τ,H_k k)^T be the H_k × 1 vector of spline coefficients for the kth covariate and H = Σ_k H_k. Further, define B_i as the n_i × H matrix with rows (B⁽¹⁾(x_ij1)^T, … , B^(s)(x_ijs)^T)^T, j = 1, … , n_i, and let $v_{τ} = {(v_{τ, 1}^{T}, \dots, v_{τ, s}^{T})}^{T}$ . With a slight abuse of notation, we write equation (2) for the ith cluster in matrix form as

Q_{y_{i} | u_{i}, X_{i}, Z_{i}}^{*} (τ) = F_{i} β_{τ} + Z_{i} u_{τ, i} + B_{i} v_{τ}, i = 1, \dots, M,

(3)

where F_i is the n_i × (p − s + 1) matrix with rows (1, x_ij(s+1), … , x_ijp)^T, j = 1, … , n_i, and β_τ = (β_τ,0, β_τ,s+1, … , β_τ,p)^T. We call model (3) an additive quantile mixed model (AQMM).
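As a minimal illustration of the basis expansion behind models (2) and (3), the following Python sketch evaluates a single spline term of the form Σ_h v_h B_h(x) and assembles the corresponding rows of a design matrix for one covariate. The truncated cubic power basis, the knot locations and the coefficient values are assumptions chosen for illustration only; they are not the spline constructors used in the software described in Section 2.4, and the centring of the smooth term is omitted.

```python
def truncated_cubic_basis(x, knots):
    """Return [B_1(x), ..., B_H(x)] with B_h(x) = max(x - knot_h, 0)^3.

    Hypothetical basis choice for illustration; any spline basis could be used.
    """
    return [max(x - k, 0.0) ** 3 for k in knots]


def spline_value(x, coefs, knots):
    """Evaluate the spline term sum_h v_h * B_h(x) at a single point x."""
    basis = truncated_cubic_basis(x, knots)
    return sum(v * b for v, b in zip(coefs, basis))


# Illustrative (made-up) knots and coefficients for one covariate.
knots = [0.25, 0.5, 0.75]
coefs = [1.0, -2.0, 0.5]

# Each row holds the H = 3 basis values for one observation, i.e. one row
# of the block of B_i that corresponds to this covariate.
design_rows = [truncated_cubic_basis(x, knots) for x in (0.1, 0.4, 0.9)]
fitted = [spline_value(x, coefs, knots) for x in (0.1, 0.4, 0.9)]
```

With several non-linear covariates, the blocks obtained in this way would be placed side by side to form the n_i × H matrix B_i.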

The additive model that was introduced above opens up the question on how to control the trade-off between bias and efficiency, and, thus, the degree of smoothness of the estimate. We tackle this problem by exploiting the well-known link between penalized splines and mixed effect models (see, for example, Ruppert et al. (2003) for an excellent review of this topic). The key idea is that the penalized objective function for a spline model is equivalent to the best linear unbiased prediction criterion of a corresponding linear mixed effects model with random spline coefficients. Since the variance of the latter is (inversely) proportional to the smoothing parameter and is estimated from the data, it follows that the degree of smoothing is automatically chosen by the estimation algorithm.

Automatic smoothing selection does not necessarily lead to optimal smoothing (Ruppert et al., 2003). However, one of the advantages of working with random spline coefficients when modelling cluster data is that they can be subsumed in the random part of the model containing the cluster-specific effects. Choice of the ‘prior’ distribution for these coefficients effectively corresponds to choosing the form of the penalty. One approach is to use the same metric for the penalty term as that for the loss term. The L₁-penalty, which is linked to the double-exponential distribution (Geraci and Bottai, 2007, 2014), is sometimes used in QR models because of its computational convenience. In the approach by Koenker et al. (1994) and Koenker and Mizera (2004), the resulting smoothed curves are piecewise linear and are most useful in the presence of break points, sharp bends and spikes.

In contrast, the L₂-penalty represents a more suitable choice for modelling smooth changes as in, for example, variations of energy expenditure over time. This is, for example, the approach that was considered by Cox and Jones in the discussion of Cole (1988) who suggested the spline smoothing QR model

ρ_{τ} \{y - f (x)\} + λ \int \{f^{''} (x)\}^{2} d x,

where ρ_τ(r) = r{τ−I(r<0)} is the QR check function (Koenker and Bassett, 1978) and I denotes the indicator function. Compared with other roughness functionals, this kind of penalty yields a more visually appealing form of smoothness. Note that using the L₁-penalty does not rule out the possibility of obtaining smooth curves as in the strategy by Bollaerts et al. (2006) based on P-splines. See Mizera (2018) for a recent overview on penalization in QR and He and Ng (1999), Ng and Maechler (2007) and Reiss and Huang (2012) for other related topics.
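To make the check function concrete, the short Python sketch below implements ρ_τ(r) = r{τ − I(r < 0)} and shows, on a toy sample, that minimizing the summed check loss over a constant yields a sample quantile. The data values and the helper names are hypothetical and serve only as an illustration.

```python
def check_loss(r, tau):
    """QR check function rho_tau(r) = r * (tau - I(r < 0))."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def total_loss(q, ys, tau):
    """Summed check loss of the residuals y - q for a constant fit q."""
    return sum(check_loss(y - q, tau) for y in ys)


# Toy sample with one extreme value (made-up numbers).
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

# Searching over the observed values is enough here, because the summed
# check loss is piecewise linear with kinks at the data points.
best = min(ys, key=lambda q: total_loss(q, ys, 0.5))
```

In this example the minimizer for τ = 0.5 is the sample median, which is unaffected by the extreme observation, in line with the robustness property mentioned in Section 1.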

A natural link between L₂-penalized splines and random effects is provided by the normal distribution. Hence, in our random-effects specification of model (3), we assume that the vectors u_τ,i and v_τ follow zero-centred multivariate Gaussian distributions with variance–covariance matrices Σ_τ and $Φ_{τ} = \oplus_{k = 1}^{s} ϕ_{τ, k} I_{H_{k}}$ respectively. Further, we assume that the u_τ,is are independent for different i (but may have a general covariance structure) and are independent from v_τ. Our objective function is then given by

\sum_{i = 1}^{M} ρ_{τ} (y_{i} - F_{i} β_{τ} - Z_{i} u_{τ, i} - B_{i} v_{τ}) + \sum_{i = 1}^{M} ‖ u_{τ, i} ‖_{Σ_{τ}^{- 1}}^{2} + \sum_{k = 1}^{s} ϕ_{τ, k}^{- 1} ‖ v_{τ, k} ‖^{2},

(4)

with the convention that $ρ_{τ} (r) = \sum_{j = 1}^{n} r_{j} \{τ - I (r_{j} < 0)\}$ for a vector r = (r₁, … , r_n)^T. Note that the ϕ_τ,ks determine the amount of smoothing for the non-parametric terms.

2.3. Inference

The minimization of expression (4) is equivalent to fitting a linear quantile mixed model (Geraci and Bottai, 2007, 2014) where the response, conditionally on the random effects, is assumed to follow the asymmetric Laplace distribution. The density of a continuous random variable T with asymmetric Laplace distribution is defined as

p (t) = \frac{τ (1 - τ)}{σ_{τ}} exp {- \frac{1}{σ_{τ}} ρ_{τ} (t - μ_{τ})},

where $μ_{τ} \in ℝ$ and σ_τ>0 are respectively location and scale parameters. By fixing the asymmetry parameter τ ∈ (0, 1), the quantile of interest is given by the location parameter, i.e. Pr(T ≤ μ_τ) = τ.
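The property Pr(T ≤ μ_τ) = τ can be checked numerically. The following Python sketch codes the asymmetric Laplace density given above and integrates it up to μ_τ with a simple midpoint rule. The parameter values, the truncation width and the number of integration steps are arbitrary assumptions made for this illustration.

```python
import math


def check_loss(r, tau):
    """QR check function rho_tau(r) = r * (tau - I(r < 0))."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def ald_pdf(t, mu, sigma, tau):
    """Asymmetric Laplace density with location mu, scale sigma, skewness tau."""
    return tau * (1.0 - tau) / sigma * math.exp(-check_loss(t - mu, tau) / sigma)


def ald_cdf_at_mu(mu, sigma, tau, width=200.0, steps=200_000):
    """Midpoint-rule approximation of Pr(T <= mu).

    The lower limit is truncated at mu - width * sigma (an assumption that
    makes the neglected tail mass negligible for the values of tau used here).
    """
    lower = mu - width * sigma
    h = (mu - lower) / steps
    return h * sum(ald_pdf(lower + (k + 0.5) * h, mu, sigma, tau)
                   for k in range(steps))


probabilities = {tau: ald_cdf_at_mu(mu=2.0, sigma=1.5, tau=tau)
                 for tau in (0.1, 0.5, 0.9)}
```

Each entry of `probabilities` should agree with its key up to numerical integration error, whatever the values chosen for the location and scale.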

Define $y = {(y_{1}^{T}, \dots, y_{M}^{T})}^{T}$ and $u_{τ} = {(u_{τ, 1}^{T}, \dots, u_{τ, M}^{T})}^{T}$ . Let $θ_{τ} \equiv {(β_{τ}^{T}, ξ_{τ}^{T}, log {(ϕ_{τ})}^{T})}^{T} \in ℝ^{p + m + 1}$ denote the parameter of interest, where ξ_τ is an unrestricted m-dimensional vector, 1 ≤ m ≤ q(q + 1)/2, of non-redundant parameters in Σ_τ (for example, see Pinheiro and Bates (1996)) and ϕ_τ = (ϕ_τ,1, ϕ_τ,2, … , ϕ_τ,s)^T. Our goal is to maximize the marginal log-likelihood

l (θ_{τ}; y) = N log {\frac{τ (1 - τ)}{σ_{τ}}} - \frac{M}{2} log | {\tilde{Σ}}_{τ} | - \frac{1}{2} log | {\tilde{Φ}}_{τ} | + log \int_{ℝ^{H}} (\prod_{i = 1}^{M} \int_{ℝ^{q}} \frac{exp [- \{2 ρ_{τ} (y_{i} - μ_{τ, i}) + u_{τ, i}^{T} {\tilde{Σ}}_{τ}^{- 1} u_{τ, i}\} / (2 σ_{τ})]}{{(2 π σ_{τ})}^{q / 2}} d u_{τ, i}) \times \frac{exp [- \{1 / (2 σ_{τ})\} v_{τ}^{T} {\tilde{Φ}}_{τ}^{- 1} v_{τ}]}{{(2 π σ_{τ})}^{H / 2}} d v_{τ},

(5)

where ${\tilde{Σ}}_{τ} = Σ_{τ} / σ_{τ}$ and ${\tilde{Φ}}_{τ} = Φ_{τ} / σ_{τ}$ are the scaled variance–covariance matrices of the random effects, and μ_τ,i = F_iβ_τ + Z_iu_τ,i + B_iv_τ. This is a three-level hierarchical model, with the innermost grouping factor represented by the clusters i and the outermost factor represented by one single-level group (i.e. the entire sample). Despite the three levels, we define ${\hat{Q}}_{y_{i} | u_{i} = 0, X_{i}, Z_{i}}^{(0)} (τ) = F_{i} {\hat{β}}_{τ} + B_{i} {\hat{v}}_{τ}$ as the predictions at level 0 since the smooth terms originally ‘belong’ to the fixed design matrix. Similarly, we define ${\hat{Q}}_{y_{i} | u_{i}, X_{i}, Z_{i}}^{(1)} (τ) = F_{i} {\hat{β}}_{τ} + Z_{i} {\hat{u}}_{τ, i} + B_{i} {\hat{v}}_{τ}$ as the predictions at level 1 (i.e. at the cluster level).

We follow the estimation strategy that was originally proposed by Geraci (2017) for non-linear quantile mixed models and apply a double approximation of log-likelihood (5):

(a) the loss function ρ_τ(r) is first smoothed at the kink r = 0;
(b) the integral is then solved by using a Laplacian approximation for the (smoothed) loss function.

In particular, we use the following smooth approximation (Madsen and Nielsen, 1993; Chen, 2007):

κ_{ω, τ} (r) = \begin{cases} r (τ - 1) - \frac{1}{2} {(τ - 1)}^{2} ω & \text{if } r ⩽ (τ - 1) ω, \\ \{1 / (2 ω)\} r^{2} & \text{if } (τ - 1) ω ⩽ r ⩽ τ ω, \\ r τ - \frac{1}{2} τ^{2} ω & \text{if } r ⩾ τ ω, \end{cases}

(6)

where $r \in ℝ$ and ω > 0 is a scalar ‘tuning’ parameter.
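The behaviour of approximation (6) as ω decreases can be examined with a few lines of Python. The sketch below codes κ_ω,τ and the check function, and records the largest absolute difference between the two on a grid while ω is halved repeatedly. The grid, the quantile level and the sequence of ω values are assumptions made for illustration. A short calculation from expression (6) suggests that the difference ρ_τ(r) − κ_ω,τ(r) lies between 0 and max(τ, 1 − τ)²ω/2, which the sketch is consistent with.

```python
def check_loss(r, tau):
    """QR check function rho_tau(r) = r * (tau - I(r < 0))."""
    return r * (tau - (1.0 if r < 0 else 0.0))


def smooth_check_loss(r, tau, omega):
    """Smooth approximation kappa_{omega,tau}(r) of expression (6)."""
    if r <= (tau - 1.0) * omega:
        return r * (tau - 1.0) - 0.5 * (tau - 1.0) ** 2 * omega
    if r >= tau * omega:
        return r * tau - 0.5 * tau ** 2 * omega
    return r * r / (2.0 * omega)  # quadratic piece around the kink at r = 0


def max_gap(tau, omega, grid):
    """Largest absolute difference between rho and kappa over the grid."""
    return max(abs(check_loss(r, tau) - smooth_check_loss(r, tau, omega))
               for r in grid)


tau = 0.25
grid = [k / 100.0 for k in range(-300, 301)]   # residuals from -3 to 3
omegas = [1.0, 0.5, 0.25, 0.125]               # omega halved at each step
gaps = [max_gap(tau, omega, grid) for omega in omegas]
```

The recorded gaps shrink in proportion to ω, which is the reason for reducing ω across iterations of the estimation algorithm.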

We then replace the function ρ_τ in log-likelihood (5) with κ_ω,τ to obtain a smoothed likelihood and apply a second-order Taylor expansion (Pinheiro and Chao, 2006) to the resulting exponent. After some algebra, we obtain the following Laplacian approximation:

l_{LA} (θ_{τ}; y, {\hat{w}}_{τ}) = N log {\frac{τ (1 - τ)}{σ_{τ}}} - \frac{1}{2} (log | {\tilde{Ψ}}_{τ} \ddot{H} | + σ_{τ}^{- 1} h_{0}),

(7)

where ${\tilde{Ψ}}_{τ}$ is the scaled variance–covariance matrix of $w_{τ} = {(u_{τ}^{T}, v_{τ}^{T})}^{T}$ , and h₀ and Ḧ are the terms of order respectively 0 and 2 of the above-mentioned Taylor series expansion around the mode ŵ_τ.

The derivation of equation (7) and the estimation algorithm are described in detail in appendix A of the on-line supporting materials. In summary, we maximize equation (7) iteratively. The algorithm requires setting the starting values of θ_τ and σ_τ, the tuning parameter ω, the tolerance for the change in the log-likelihood and the maximum number of iterations, as well as obtaining an initial estimate ŵ_τ. At each iteration, the parameter ω is reduced by a given factor (e.g. by half). Ideally, the value of ω at convergence should be small, since the approximation of κ_ω,τ to the loss function ρ_τ improves as ω decreases, i.e. κ_ω,τ(r) → ρ_τ(r) as ω → 0.

Various strategies can be used to determine the starting values. In the simulation study (Section 3), we considered a model-based and a naive approach. In the former case, we used parameter and random-effects estimates from an AMM. In the latter case, we used the least squares estimate for β_τ, the identity matrix for Σ_τ and the mean of the absolute least squares residuals for σ_τ, whereas the random effects were all set equal to 0. In both strategies, the tuning parameter ω was set at half the standard deviation of y (but also see Chen (2007), page 143, for how to choose ω alternatively) and subsequently halved at each iteration. The performance of the AQMM estimation algorithm using either of these two strategies in the simulation study is discussed in detail in appendix B of the on-line supporting materials. In general, model-based starting values led to improved results, although the naive approach was likewise satisfactory.

When using the asymmetric Laplace distribution as pseudolikelihood, inference should be confined to point estimation since the distribution is misspecified. In Bayesian modelling, asymmetric-Laplace-based posterior credible intervals have coverage that is considerably lower than the nominal level (see for example Reich et al. (2010)). Standard errors of non-random parameter estimates can be calculated by using the block bootstrap, although this increases the computational cost. Bootstrap confidence intervals have been shown to have good coverage in linear quantile mixed models (Geraci and Bottai, 2014). Given the relatively large size of the MCS data set, for the analysis in Section 4 we implemented an adaptation of the method by Kleiner et al. (2014). The general idea is to perform a bootstrap on several subsets of the original data and then to summarize measures of uncertainty from all subsets. This strategy, called the ‘bag of little bootstraps’ (BLB), greatly reduces the computing cost when the sample size is large (see Kleiner et al. (2014) for more details). The original method was developed for independent and identically distributed observations. Since we are dealing with clusters, we adapted the BLB approach as follows:

(a) sample without replacement s subsets of size b < M from the pool of M clusters (random partition);
(b) for each of the s subsets, repeatedly (R times) take a bootstrap sample of size M and fit an AQMM for each replicate;
(c) for each of the s subsets, calculate the bootstrap variance;
(d) as the final estimate of the standard error, take the square root of the average of the s variances in step (c).

As explained by Kleiner et al. (2014) the advantage of the BLB approach compared with the traditional bootstrap lies in the smaller size of the subsets. Although the nominal bootstrap sample size is M, there are at most b unique clusters in each subset. To obtain a bootstrap replicate, we need only a sample from a multinomial distribution with M trials and uniform probability over b possible events. Estimation proceeds with a weighted likelihood, where the cluster-specific weights are given by the multinomial counts.
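The cluster-level BLB steps described above can be sketched in Python as follows. To keep the example self-contained, the AQMM fit is replaced by a weighted mean of one summary value per cluster; this stand-in estimator, the function names and the simulated input are all assumptions made for illustration and are not part of the implementation used in this paper.

```python
import random
import statistics


def multinomial_counts(rng, trials, categories):
    """Multinomial counts with `trials` draws and uniform cell probabilities."""
    counts = [0] * categories
    for _ in range(trials):
        counts[rng.randrange(categories)] += 1
    return counts


def weighted_mean(values, weights):
    """Stand-in estimator; in the paper a weighted AQMM fit is used instead."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def blb_standard_error(cluster_values, s, b, R, estimator=weighted_mean, seed=1):
    """BLB standard error with clusters as the resampling units."""
    M = len(cluster_values)
    if s * b > M:
        raise ValueError("s subsets of size b must fit within the M clusters")
    rng = random.Random(seed)
    ids = list(range(M))
    rng.shuffle(ids)                                   # random partition of clusters
    subsets = [ids[k * b:(k + 1) * b] for k in range(s)]
    variances = []
    for subset in subsets:
        values = [cluster_values[i] for i in subset]
        estimates = []
        for _ in range(R):
            # Nominal bootstrap size M, but only b unique clusters: the
            # multinomial counts act as cluster-specific weights.
            weights = multinomial_counts(rng, M, b)
            estimates.append(estimator(values, weights))
        variances.append(statistics.variance(estimates))  # per-subset variance
    return (sum(variances) / s) ** 0.5                 # root of the average variance


# Hypothetical input: one summary value for each of M = 200 clusters.
data_rng = random.Random(7)
cluster_values = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
se = blb_standard_error(cluster_values, s=4, b=50, R=100)
```

For this toy estimator the result can be compared with the usual standard error of a mean, roughly 1/√200 here; each replicate only ever touches b = 50 distinct clusters, which is where the computational saving comes from.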

2.4. Implementation

The methods that are described in this section were implemented as an add-on to the R package lqmm (Geraci, 2014). The add-on is currently available from the author’s Web site (https://marcogeraci.wordpress.com) and will appear in a future release of the main package. The core function made use of routines that are available from the mgcv (Wood, 2006) and nlme (Pinheiro et al., 2017) packages using syntax and options (e.g. selection of spline models) that are familiar to users of these packages.

3. Simulation study

We ran a simulation study to assess the proposed methods. In our analysis, we considered the two most relevant alternatives for additive regression modelling: AMMs (Wood, 2006) and additive fixed effects quantile regression (AFEQR) for longitudinal data (Fenske et al., 2013). Since the former approach aims at modelling the conditional expectation of the outcome under the assumption of normal errors, AQMMs should have an advantage over AMMs when the true errors are non-normal and the location shift hypothesis of the normal model is violated. In contrast, AFEQR is directly comparable with AQMMs since they both aim at the conditional quantiles of the outcome with no assumption about the error distribution. However, as noted in Section 1, there are two basic differences between these two QR approaches since in AQMMs

(a) the cluster-specific effects are assumed to be random as opposed to fixed, and thus a covariance structure between effects can be introduced, and
(b) the level of smoothing of the non-parametric terms is automatically estimated from the data (as reciprocal of the variance components) as opposed to prior specification.

These are not necessarily advantages (or disadvantages) but they do represent aspects to consider when choosing a strategy for modelling and estimation.

The data were generated according to the model

y_{i j} = β_{0} + β_{1} sin (x_{i j, 1}) + \frac{β_{2}}{1 + exp {- (x_{i j, 2} - 0.5) / 0.1}} + β_{3} x_{i j, 3} + β_{4} x_{i j, 4} + z_{i j}^{T} u + (1 + γ x_{i j, 3}) ϵ,

(8)

j = 1, … , n, i = 1, … , M, where β = (1, 4, 15, 4, 3)^T, and $x_{i j, 1} \sim U (0, 4 π)$ , $x_{i j, 2} \sim U (0, 1)$ , $x_{i j, 3} \sim Bin (1, 0.3)$ and $x_{i j, 4} \sim N (0, 1)$ , independently. Moreover, z_ij = (1, x_ij,4)^T, $u \sim N (0, Σ)$ , and

Σ = (\begin{matrix} 2 & 0.8 \\ 0.8 & 1 \end{matrix}) .

In one scenario, we set γ = 0 (homoscedastic), whereas in a separate scenario we set γ = 1 (heteroscedastic). Within these two scenarios, the error was generated according to either a standard normal, a Student’s t-distribution with 3 degrees of freedom or a χ²-distribution with 3 degrees of freedom. Thus, in total there were 2 × 3 = 6 different models. For each model, a balanced data set was generated according to six sample size combinations of n ∈ {5, 10} and M ∈ {50, 100, 500}, yielding 6 × 6 = 36 simulation cases. Each case was replicated R = 500 times.
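A data-generating routine along the lines of model (8) might look like the Python sketch below. It assumes that the random-effects vector u is drawn once per cluster, that the error ϵ is drawn independently for each observation and that the χ² errors are not centred; these readings of model (8), as well as the function names and the seed handling, are assumptions of this sketch rather than details taken from the simulation code used for the study.

```python
import math
import random

BETA = (1.0, 4.0, 15.0, 4.0, 3.0)

# Cholesky factor of Sigma = [[2, 0.8], [0.8, 1]], computed by hand.
L11 = math.sqrt(2.0)
L21 = 0.8 / L11
L22 = math.sqrt(1.0 - L21 ** 2)


def draw_error(rng, dist):
    """Draw one error from N(0, 1), Student t (3 df) or chi-squared (3 df)."""
    if dist == "normal":
        return rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(1.5, 2.0)      # gamma(shape 3/2, scale 2) = chi-squared, 3 df
    if dist == "chisq":
        return chi2
    if dist == "t":
        return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / 3.0)
    raise ValueError("dist must be 'normal', 't' or 'chisq'")


def simulate(M, n, gamma, dist, seed=0):
    """Return rows (i, j, x1, x2, x3, x4, y) for a balanced design."""
    rng = random.Random(seed)
    rows = []
    for i in range(M):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u0 = L11 * e1                       # random intercept
        u1 = L21 * e1 + L22 * e2            # random slope for x4
        for j in range(n):
            x1 = rng.uniform(0.0, 4.0 * math.pi)
            x2 = rng.uniform(0.0, 1.0)
            x3 = 1 if rng.random() < 0.3 else 0
            x4 = rng.gauss(0.0, 1.0)
            eps = draw_error(rng, dist)
            y = (BETA[0] + BETA[1] * math.sin(x1)
                 + BETA[2] / (1.0 + math.exp(-(x2 - 0.5) / 0.1))
                 + BETA[3] * x3 + BETA[4] * x4
                 + u0 + u1 * x4
                 + (1.0 + gamma * x3) * eps)
            rows.append((i, j, x1, x2, x3, x4, y))
    return rows


# One heteroscedastic data set with t errors, M = 50 clusters of size n = 5.
rows = simulate(M=50, n=5, gamma=1.0, dist="t", seed=42)
```

Setting `gamma=0.0` gives the homoscedastic scenario, and looping over the three error distributions, two values of γ and six (n, M) combinations reproduces the 36 simulation cases described above.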

For each replication, we fitted the AQMM defined in expression (2) for τ ∈ {0.1, 0.5, 0.95} using a cubic spline for the non-linear terms that are associated with x_ij,1 and x_ij,2. The model also included a random intercept and a random slope for x_ij,4 with a symmetric positive definite covariance matrix. We followed the estimation algorithm that is described in the on-line supporting materials (appendix A) with model-based starting values (but see also appendix B for the results using naive starting values). Analogous AFEQR models were fitted following the recommendations of Fenske et al. (2013). The details on optimization parameters used for AQMMs and AFEQR in the simulation study are reported in the supporting materials (appendix B). As a measure of performance, we calculated the bias and the root-mean-squared error (RMSE) of the level 1 quantile predictions. Analogously, we calculated the relative bias and RMSE of the coefficients for the linear terms, namely β₃ and β₄. We also determined the proportion of negative level 1 residuals, PNR, which is expected to be approximately equal to τ. All summary measures (see the supplementary materials for their definitions) were averaged over the replications.

In a separate, smaller, simulation study, we evaluated the bootstrap confidence interval coverage at the nominal 95% by using 200 bootstrap samples for 100 replicated data sets with normal errors.

For brevity, here we highlight the main findings, whereas all the results (including the computational performance of the AQMM estimation algorithm) are reported and discussed in more detail in appendix B of the on-line supporting materials. In general, the AQMM showed bias and RMSE lower than those of AFEQR consistently across most quantiles and sample sizes for both level 1 predictions and linear terms. For illustration, selected results are presented in Figs 2 and 3 which show RMSE values for respectively Q(τ) and β₃(τ) (similar patterns were observed for β₄(τ)). As expected, the AQMM had an advantage over the AMM in non-normal and heteroscedastic scenarios. PNR-rates for the AQMM were equal to the expected nominal τ. Finally, the AQMM also yielded optimal levels of smoothing, with individual non-linear terms predicted accurately even at the smallest sample size.

Fig. 2. Average RMSE for Q(τ) from the AQMM and the AFEQR for three quantile levels (0.1, 0.5, 0.95) in the homoscedastic scenario: (a) normal distribution; (b) Student t-distribution; (c) χ²-distribution

Fig. 3. Average RMSE for β₃(τ) from the AQMM and the AFEQR for three quantile levels (0.1, 0.5, 0.95) in the homoscedastic scenario: (a) normal distribution; (b) Student t-distribution; (c) χ²-distribution

The results (which are not shown) from the smaller simulation study showed that bootstrap confidence intervals for β₃ and β₄ were close to the nominal coverage, with proportions ranging from 94% to 97%.

4. Data analysis

Accelerometer data collected in 7-year-old children of the MCS represent a major, large-scale epidemiological resource to study physical activity determinants. Accelerometers are devices that are capable of providing an objective measure of the intensity and duration of movement. They produce an output known as ‘acceleration counts’ which is dimensionless and thus requires calibration to be converted into physiologically more relevant units, usually based on energy expended per unit of time (e.g. the metabolic equivalent of task).

The MCS accelerometer data were collected between May 2008 and August 2009 from participating children of the fourth sweep of the parent longitudinal survey, which provided information on several covariates, including sociodemographic and behavioural variables. Several cleaning and processing procedures were applied to the raw accelerometer data (Geraci et al., 2012; Rich et al., 2014) using the R package pawacc. Out of 12625 children participating in the study, approximately 6500 provided reliable data, defined as data from accelerometers that were deemed to have been worn for at least 2 days, at least 10 h each day (Rich et al., 2013). However, for our analysis, we retained observations for only those children with reliable data between 7.00 a.m. and 8.00 p.m. of each day of the week.
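The wear-time criterion stated above (at least 2 days, at least 10 h each day) can be written as a simple rule. The sketch below is only an illustration of that rule: the function name, the argument names and the toy wear-time records are hypothetical, and it does not reproduce the processing that is carried out by the pawacc package or the further restriction to the 7.00 a.m. to 8.00 p.m. window.

```python
def is_reliable(daily_wear_hours, min_days=2, min_hours=10.0):
    """Return True if enough days reach the minimum daily wear time."""
    valid_days = sum(hours >= min_hours for hours in daily_wear_hours)
    return valid_days >= min_days


# Hypothetical wear-time records (hours per day) for two children.
wear = {
    "child_a": [11.5, 10.0, 3.0],  # two days with at least 10 h
    "child_b": [12.0, 9.5, 8.0],   # only one day with at least 10 h
}
reliable = {child: is_reliable(hours) for child, hours in wear.items()}
```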

We considered several covariates. Linear terms pertaining to the sociodemographic domain were sex (binary; reference, male) and ethnic group (binary; reference, white) of the child, and Organisation for Economic Co-operation and Development equivalized income quintiles (categorical; reference, fifth quintile). Linear terms pertaining to the behavioural domain were time spent reading for enjoyment (binary; reference, often), mode of transport to or from school (binary; reference, active), number of cars or vans owned (categorical; reference, two). Linear terms pertaining to the temporal domain were day of the week (binary; reference, weekday) and calendar season (categorical; reference, summer). Finally, we considered three non-parametric terms: one for time of the day on weekdays, one for time of the day on weekends and one for body mass index (BMI). The outcome variable was accelerometer counts. The analysis was restricted to singletons born in England. This decision was motivated by the ethnic composition of the sample, consisting of almost all white children in Wales, Scotland and Northern Ireland. Since ethnicity is a strong predictor of physical activity (Griffiths et al., 2013) and ethnicity is confounded with country, we removed children from Celtic countries. Further, we excluded 15 children with missing information on ethnicity and BMI. A summary of the data set is given in Table 1. Our sample comprised 1154 children for whom accelerometer counts were aggregated over 10-min intervals between 7.00 a.m. and 8.00 p.m. (thus producing 79 time points), for 7 days of the week. In total, this gave N = 638162 accelerometer measurements (i.e. n_i = 79 × 7, i = 1, … , 1154). Since trajectories of activity were similar between Monday and Friday, and during Saturday and Sunday, temporal effects were modelled according to weekdays and weekends, not to individual days of the week.
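The sample size arithmetic in the preceding paragraph can be checked directly. The sketch assumes that the 79 time points are the 10-min marks from 7.00 a.m. to 8.00 p.m. with both end points included; the variable names are illustrative.

```python
# 10-min grid from 7.00 a.m. (minute 420) to 8.00 p.m. (minute 1200), inclusive.
start_minute = 7 * 60
end_minute = 20 * 60
time_points = list(range(start_minute, end_minute + 1, 10))

n_times = len(time_points)     # time points per day
n_days = 7                     # days of the week
n_children = 1154              # children retained in the analysis

n_i = n_times * n_days         # measurements per child
N = n_children * n_i           # total number of measurements
```

Under this assumption the grid gives 79 points per day, n_i = 553 measurements per child and N = 638162 in total, in agreement with the figures reported above.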

Using a similar notation to that in model (2), the τth additive linear quantile regression model was specified as

Q_{y_{ij} | u_{i}, x_{i}, z_{i}}^{*} (τ) = β_{τ, 0} + \sum_{h = 1}^{H_{1}} v_{τ, 1} B_{h}^{(1)} (t_{j, 0}) + \sum_{h = 1}^{H_{2}} v_{τ, 2} B_{h}^{(2)} (t_{j, 1}) + \sum_{h = 1}^{H_{3}} v_{τ, 3} B_{h}^{(3)} ({BMI}_{i}) + β_{τ, 1} {sex}_{i, 1} + β_{τ, 2} {ethnicity}_{i, 1} + β_{τ, 3} {income}_{i, 1} + β_{τ, 4} {income}_{i, 2} + β_{τ, 5} {income}_{i, 3} + β_{τ, 6} {income}_{i, 4} + β_{τ, 7} {reading}_{i, 1} + β_{τ, 8} {transport}_{i, 1} + β_{τ, 9} {cars}_{i, 0} + β_{τ, 10} {cars}_{i, 1} + β_{τ, 11} {cars}_{i, 3+} + β_{τ, 12} {weekend}_{i, 1} + β_{τ, 13} {autumn}_{i} + β_{τ, 14} {winter}_{i} + β_{τ, 15} {spring}_{i} + Z_{ij}^{T} u_{τ, i},

(9)

for τ ∈ {0.1, 0.5, 0.9, 0.95, 0.99}. For fitting purposes, the outcome was scaled by 10⁴ (however, the results of the analysis are reported on the original scale). The variables t_j,0 and t_j,1, j = 1, … , 79, denote the time of the day for respectively weekdays and weekend days. Time was expressed as minutes divided by 60 × 24 (e.g. with 0.29 corresponding to 7.00 a.m. and 0.83 to 8.00 p.m.) and then centred about its mid-value (0.56 corresponding to 1.30 p.m.). Similarly, the BMI was centred about its mode (15.5 kg m⁻²). Given the large size of the data set, smooth terms were modelled by using low rank thin plate splines (Wood, 2003), which have been shown to have optimal properties both statistically and computationally. Model (9) included the 4 × 1 vector z_ij = (δ_ij, 1 − δ_ij, t_j,0, t_j,1)^T, where δ_ij is an indicator that is equal to 1 if the ijth observation belongs to weekdays or to 0 otherwise. The random effects u_i = (u_i,0, u_i,1, u_i,t0, u_i,t1)^T were assumed to follow a multivariate normal distribution with symmetric positive definite variance–covariance matrix
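The rescaling of time described above can be verified with a short calculation. The function names are illustrative, and the mid-value is taken here as the midpoint of the 7.00 a.m. to 8.00 p.m. window, which is an assumption that is consistent with the quoted value of 0.56 (1.30 p.m.).

```python
def scaled_time(hour, minute=0):
    """Clock time expressed as a fraction of the day: minutes / (60 * 24)."""
    return (hour * 60 + minute) / (60 * 24)


t_start = scaled_time(7)           # 7.00 a.m., about 0.29
t_end = scaled_time(20)            # 8.00 p.m., about 0.83
t_mid = (t_start + t_end) / 2      # midpoint of the observation window


def centred_time(hour, minute=0):
    """Scaled time centred about the midpoint of the observation window."""
    return scaled_time(hour, minute) - t_mid


BMI_MODE = 15.5                    # kg m^-2, as reported in the text


def centred_bmi(bmi):
    """BMI centred about its mode."""
    return bmi - BMI_MODE
```

After centring, 1.30 p.m. maps to 0 and the window runs symmetrically from about −0.27 to about 0.27, so the intercept refers to the middle of the observation window.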

Σ = \begin{pmatrix} σ_{0}^{2} & σ_{0, 1} & σ_{0, t_{0}} & σ_{0, t_{1}} \\ & σ_{1}^{2} & σ_{1, t_{0}} & σ_{1, t_{1}} \\ & & σ_{t_{0}}^{2} & σ_{t_{0}, t_{1}} \\ & & & σ_{t_{1}}^{2} \end{pmatrix} .

The first two terms of model (9) can be interpreted as the τth time-specific quantile function of accelerometer counts on a summer weekday for a boy of white ethnicity with modal BMI living in a household in the highest income quintile and with two cars, who reads often (at least once or twice a week) and walks or bikes from and to school (as opposed to moving by car or bus), and whose temporal (linear) trajectory belongs to the zero (or modal) random-effect cluster.

We made an attempt to fit an analogous AMM to compare results and to obtain starting values for the AQMM. However, the function gamm failed because of insufficient memory. We also tried with a smaller subset of 200 children, but the gamm function failed with a convergence error. Given the satisfactory simulation results, we therefore used the naive approach that was described in Section 2.3 to determine the starting values.

Fig. 4 shows the estimated quantile function at level 0 for a child in the reference group. Diurnal patterns show markedly different shapes during the week. On weekdays, there are multiple peaks of activity in the morning and early afternoon, followed by a plateau of higher activity in the evening. On weekends, the trajectories look flatter and are characterized by two grand peaks around 11.00 a.m. and 5.00 p.m.

Estimates of the fixed effects and standard errors from the AQMM are reported in Table 2. The latter were obtained by using the BLB approach that was described in Section 2.3 with a fivefold partition (b≈230) and R = 50 bootstrap replications. Some of the findings are consistent with those from previous analyses (Griffiths et al., 2013; Sera et al., 2017) that focused on the central part of the distribution, namely girls and children of ethnicity other than white are less active than their peers, reading frequently during the week is negatively associated with activity, and higher activity levels characterize spring and summer, followed by autumn and winter.
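The general idea of the bag of little bootstraps (Kleiner et al., 2014) with clusters as the resampling units can be sketched as follows. This is not the adaptation that is described in Section 2.3: the estimator is a toy stand-in (a weighted sample quantile of one summary value per child, rather than an AQMM fit), and all function names and the simulated data are assumptions made for illustration. Only the partition into five subsets and R = 50 replications mirror the settings quoted above.

```python
import random
import statistics
from collections import Counter


def weighted_quantile(values, weights, tau):
    """Smallest value whose cumulative weight share reaches tau."""
    pairs = sorted(zip(values, weights))
    threshold = tau * sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]


def blb_standard_error(cluster_values, tau, n_subsets=5, n_reps=50, seed=0):
    """BLB standard error of a quantile, resampling whole clusters."""
    rng = random.Random(seed)
    n = len(cluster_values)
    shuffled = list(cluster_values)
    rng.shuffle(shuffled)
    # Disjoint subsets of roughly n / n_subsets clusters each.
    subsets = [shuffled[k::n_subsets] for k in range(n_subsets)]
    subset_ses = []
    for subset in subsets:
        b = len(subset)
        estimates = []
        for _ in range(n_reps):
            # Multinomial resample of full size n over the b clusters.
            counts = Counter(rng.choices(range(b), k=n))
            weights = [counts.get(i, 0) for i in range(b)]
            estimates.append(weighted_quantile(subset, weights, tau))
        subset_ses.append(statistics.stdev(estimates))
    # Average the within-subset standard errors across subsets.
    return statistics.mean(subset_ses), [len(s) for s in subsets]


# Toy data: one simulated summary value for each of 1154 clusters.
data_rng = random.Random(42)
toy = [data_rng.gauss(0.0, 1.0) for _ in range(1154)]
se, sizes = blb_standard_error(toy, tau=0.5)
```

Each replicate handles only about 230 distinct clusters while still mimicking a resample of full size, which is where the computational saving comes from.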

Table 2.

Estimated fixed effects (counts per 10 min) and, in parentheses, their standard errors, followed by the proportion of negative residuals PNR from the AQMM for the MCS physical activity data^†

	Fixed effects for the following values of τ:
	τ = 0.1	τ = 0.5	τ = 0.9	τ = 0.95	τ = 0.99
Intercept	992 (101)	4408 (183)	13704 (305)	18473 (492)	31065 (1136)
Sex (female)	−24 (64)	−180 (95)	−2049 (222)	−2752 (328)	−3113 (843)
Ethnicity (not white)	−101 (104)	−82 (124)	−1126 (285)	−1696 (388)	−3964 (948)
Income quintile (1)	−43 (146)	−39 (216)	−483 (422)	−784 (567)	−2747 (1525)
Income quintile (2)	70 (124)	99 (148)	35 (337)	−2 (519)	−369 (1679)
Income quintile (3)	53 (95)	13 (129)	−237 (303)	−512 (448)	−1196 (1009)
Income quintile (4)	−44 (83)	35 (129)	−135 (273)	−56 (412)	776 (1089)
Reading for pleasure (not often)	92 (114)	122 (141)	8 (367)	69 (502)	−276 (1179)
Transportation (passive)	62 (72)	86 (87)	−274 (226)	−409 (363)	−209 (750)
Number of cars or vans (0)	25 (169)	−77 (224)	1121 (549)	1279 (581)	3315 (739)
Number of cars or vans (1)	53 (83)	75 (100)	564 (231)	682 (368)	2083 (850)
Number of cars or vans (≥ 3)	4 (164)	−53 (224)	518 (496)	655 (809)	2586 (1072)
Day of the week (weekend)	−148 (103)	−131 (106)	−168 (223)	−45 (364)	3023 (1066)
Season (autumn)	12 (76)	−164 (91)	−958 (209)	−1067 (313)	−3155 (774)
Season (winter)	−3 (199)	−204 (326)	−1377 (561)	−1675 (705)	−2999 (1124)
Season (spring)	82 (117)	244 (156)	1242 (333)	2197 (528)	6027 (2224)
Linear basis term for time of the day (weekdays)	410 (50)	782 (33)	2050 (54)	2639 (107)	5374 (448)
Linear basis term for time of the day (weekend)	635 (56)	980 (46)	2620 (83)	3434 (158)	7659 (965)
Linear basis term for BMI	−40 (125)	−11 (51)	−16 (113)	−64 (166)	−95 (387)
PNR	0.11	0.50	0.90	0.95	0.99


† The reference categories are given in Table 1.

However, the narrative emerging from Table 2 is more variegated than this. First, there is a gradient across quantiles of increasingly larger differences in activity levels for girls and children of ethnicity other than white. Secondly, activity is lower in children from less affluent households at the most extreme quantile. In particular, activity is lower in those from economically disadvantaged households (first quintile) across all quantiles. However, the estimates of the coefficients for income have large standard errors, resulting in statistical non-significance at the 95% level. The effects that are associated with reading and mode of transportation do not seem to be important, either practically or statistically. In contrast, there are marked differences between children living in households with two vehicles (reference) and those with none, the latter being substantially more active. It also seems that, at the quantile 0.99, there is a U-shaped relationship between car or van ownership and activity counts.
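As a rough check on the statements about statistical significance, the estimates and standard errors in Table 2 can be turned into ratios and compared with 1.96. The use of a normal approximation with a two-sided 1.96 cut-off is an assumption made here for illustration; the values are copied from the rows for the first income quintile and for households with no car or van.

```python
# (estimate, standard error) by quantile level, copied from Table 2.
income_q1 = {0.1: (-43, 146), 0.5: (-39, 216), 0.9: (-483, 422),
             0.95: (-784, 567), 0.99: (-2747, 1525)}
no_car = {0.1: (25, 169), 0.5: (-77, 224), 0.9: (1121, 549),
          0.95: (1279, 581), 0.99: (3315, 739)}

Z_CRIT = 1.96  # assumed two-sided normal cut-off


def flagged_quantiles(row, z_crit=Z_CRIT):
    """Quantile levels at which |estimate / standard error| exceeds z_crit."""
    return [tau for tau, (est, se) in row.items() if abs(est / se) > z_crit]


income_flags = flagged_quantiles(income_q1)
no_car_flags = flagged_quantiles(no_car)
```

Under this approximation none of the first-quintile income coefficients exceeds the cut-off, whereas the coefficients for households with no car or van do so at τ = 0.9, 0.95 and 0.99.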

Whereas the main effects of weekend on activity levels are approximately the same as those during the rest of the week across several quantiles, there is a rather strong positive weekend effect at the most extreme quantile. The results that were reported by Sera et al. (2017) showed no weekend effect, which is probably the consequence of averaging out stronger and weaker effects. Finally, it is interesting that the magnitude of the seasonal effects also increases with increasing quantiles. This is consistent with another QR analysis of the MCS accelerometer data (Geraci and Farcomeni, 2016).

The estimated effect of BMI on activity counts for a child in the reference group is depicted in Fig. 5. Whereas the relationship is roughly constant up to the quantile 0.95, it is non-linear at τ = 0.99, with an overall negative gradient. The variance of the corresponding smooth term (Table 3) indicates a stronger penalty on the spline coefficients at the most extreme quantile.

Table 3.

Estimated standard deviations and correlations of the random effects, and standard deviations of the random spline coefficients from the AQMM for the MCS physical activity data

	Results for the following values of τ:
	τ = 0.1	τ = 0.5	τ = 0.9	τ = 0.95	τ = 0.99
Standard deviations (random effects)
${\hat{σ}}_{0}$ (intercept weekdays)	2969	3923	2882	2769	4897
${\hat{σ}}_{1}$ (intercept weekend)	3015	3526	2842	2809	5054
${\hat{σ}}_{t_{0}}$ (time of the day weekdays)	2868	3575	2817	2800	5069
${\hat{σ}}_{t_{1}}$ (time of the day weekend)	2867	3376	2940	2858	5017
Correlations (random effects)
${\hat{ρ}}_{0, 1}$	0.93	0.97	0.73	0.36	0.52
${\hat{ρ}}_{0, t_{0}}$	0.99	0.99	0.94	0.89	0.93
${\hat{ρ}}_{1, t_{0}}$	0.93	0.97	0.71	0.35	0.51
${\hat{ρ}}_{0, t_{1}}$	0.93	0.97	0.73	0.37	0.52
${\hat{ρ}}_{1, t_{1}}$	0.99	0.99	0.93	0.88	0.93
${\hat{ρ}}_{t_{0}, t_{1}}$	0.93	0.97	0.72	0.36	0.51
Standard deviations (smooth terms)
${\hat{ϕ}}_{weekdays}$	4136	15215	4114	4343	1722
${\hat{ϕ}}_{weekend}$	8385	14541	6699	2402	515
${\hat{ϕ}}_{BMI}$	2905	4945	7094	2777	181


The estimated standard deviations of the random effects show larger variability of individual linear trends (intercepts and temporal slopes) at the median and at τ = 0.99 (Table 3). The correlation between random effects within weekdays or within weekends is strong, but the cross-correlation between weekdays and weekend terms is substantially weaker in the second half of the conditional distribution. This means that children tend to have trends of higher intensity activity that are less similar between weekdays and weekends.
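The entries of Σ can be recovered from the standard deviations and correlations in Table 3 through σ_{a,b} = ρ_{a,b} σ_a σ_b. The sketch below does this for τ = 0.95; the function and variable names are illustrative, and the inputs are the rounded values printed in the table, so the result is only approximate.

```python
labels = ["0", "1", "t0", "t1"]

# Values copied from Table 3 for tau = 0.95.
sd = {"0": 2769, "1": 2809, "t0": 2800, "t1": 2858}
rho = {("0", "1"): 0.36, ("0", "t0"): 0.89, ("1", "t0"): 0.35,
       ("0", "t1"): 0.37, ("1", "t1"): 0.88, ("t0", "t1"): 0.36}


def covariance_matrix(sd, rho, labels):
    """Build the symmetric matrix with entries rho_ab * sd_a * sd_b."""
    size = len(labels)
    sigma = [[0.0] * size for _ in range(size)]
    for a, la in enumerate(labels):
        for b, lb in enumerate(labels):
            if a == b:
                sigma[a][b] = float(sd[la] ** 2)
            else:
                r = rho[(la, lb)] if (la, lb) in rho else rho[(lb, la)]
                sigma[a][b] = r * (sd[la] * sd[lb])
    return sigma


sigma_095 = covariance_matrix(sd, rho, labels)

# Within-weekday correlation versus weekday-weekend cross-correlation.
within_weekday = rho[("0", "t0")]
across_days = rho[("0", "1")]
```

At this quantile the within-weekday correlation (0.89) is much larger than the correlation between the weekday and weekend intercepts (0.36), which is the pattern described in the preceding paragraph.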

Individual trajectories of accelerometer counts for two children of the MCS are plotted in Fig. 6. Despite both being white females with similar BMI (approximately 15.6), living in a household with income in the lowest quintile and one car, having similar behaviours in terms of reading (often) and transportation (passive), they showed somewhat different daily patterns during summer weekend days. In particular, the conditional distribution was markedly skewed for the girl with identifier M16179P.

5. Conclusion

We have developed a novel additive model for QR when data are clustered. Compared with alternative approaches, ours has unique features, namely the mixed effects representation of smoothing splines, which in turn leads to automatic smoothing selection, and the ability to model the variance–covariance matrix of the random effects.

As shown in a simulation study, the performance of the AQMM was satisfactory despite the minimal tuning of the estimation algorithm. This takes a little burden away from the user who may instead focus their attention on other aspects of the analysis. This can be an asset if the data present complexities like those illustrated in the MCS accelerometer analysis. In particular, the presence of a large number of regression coefficients and multiple smooth terms hinders the application of computationally intensive smoothing selection (e.g. cross-validation) to large data sets.

Standard error calculation in AQMMs is facilitated by the bootstrap. We were able to overcome the relatively large size of the MCS data set by using an adaptation of the BLB approach (Kleiner et al., 2014). However, the versatility of the bootstrap comes at a (computational) price and its application is limited to more central quantiles unless the sample size (i.e. the number of clusters) is adequate. Further research is needed to develop accurate ‘sampling-free’ approximations of standard errors in AQMMs as well as in linear quantile mixed models.

Finally, in contrast with estimation based on numerical quadrature (Geraci and Bottai, 2014), random-effects estimates in AQMMs are a by-product of the optimization algorithm (Geraci, 2017) rather than being calculated post hoc via, for example, best linear prediction (Geraci and Bottai, 2014). As a consequence, the proportion of negative residuals (conditional on the random effects) in AQMMs will be close to the nominal quantile level since the random effects are obtained within the same optimization procedure. However, the algorithm proposed can be more demanding in terms of computing time compared with, say, boosting, with the computational bottleneck indeed represented by the estimation of the random effects. For example, it took about 2 h to fit a single AQMM when using the MCS data set. Whereas, on the one hand, the large size of this data set impaired even one of the most refined software packages for linear mixed effects models, on the other hand a possible improvement in computing speed of the algorithm proposed is conceivable and is part of future research. And so is a thorough comparison of alternative estimation approaches in QR with random effects along the lines of previous studies on mean regression (Pinheiro and Bates, 1995; Pinheiro and Chao, 2006).

Supplementary Material

NIHMS1042531-supplement-Supplemental.pdf (609.7 KB, pdf)

Acknowledgements

This research has been supported by the National Institutes of Health—National Institute of Child Health and Human Development (grant 1R03HD084807–01A1).

Footnotes

Supporting information

Additional ‘supporting information’ may be found in the on-line version of this article:

‘Web-based supporting materials for “Additive quantile regression for clustered data with an application to children’s physical activity”’.

References

Austin PC, Tu JV, Daly PA and Alter DA (2005) The use of quantile regression in health care research: a case study examining gender differences in the timeliness of thrombolytic therapy. Statist. Med, 24, 791–816.
Bollaerts K, Eilers PHC and Aerts M (2006) Quantile regression with monotonicity restrictions using p-splines and the l1-norm. Statist. Modllng, 6, 189–207.
Chen C (2007) A finite smoothing algorithm for quantile regression. J. Computnl Graph. Statist, 16, 136–164.
Cole TJ (1988) Fitting smoothed centile curves to reference data (with discussion). J. R. Statist. Soc. A, 151, 385–418.
Ekelund U, Steene-Johannessen J, Brown WJ, Fagerland MW, Owen N, Powell KE, Bauman A and Lee IM (2016) Does physical activity attenuate, or even eliminate, the detrimental association of sitting time with mortality?: A harmonised meta-analysis of data from more than 1 million men and women. Lancet, 388, 1302–1310.
España-Romero V, Mitchell JA, Dowda M, O’Neill JR and Pate RR (2013) Objectively measured sedentary time, physical activity and markers of body fat in preschool children. Ped. Exrcs. Sci, 25, 154–163.
Fenske N, Fahrmeir L, Hothorn T, Rzehak P and Höhle M (2013) Boosting structured additive quantile regression for longitudinal childhood obesity data. Int. J. Biostatist, 9, 1–18.
Geraci M (2014) Linear quantile mixed models: the lqmm package for Laplace quantile regression. J. Statist. Softwr, 57, 1–29.
Geraci M (2016) Estimation of regression quantiles in complex surveys with data missing at random: an application to birthweight determinants. Statist. Meth. Med. Res, 25, 1393–1421.
Geraci M (2017) Nonlinear quantile mixed models. University of South Carolina, Columbia: Preprint arXiv:1712.09981v1.
Geraci M and Bottai M (2007) Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics, 8, 140–154.
Geraci M and Bottai M (2014) Linear quantile mixed models. Statist. Comput, 24, 461–479.
Geraci M and Farcomeni A (2016) Probabilistic principal component analysis to identify profiles of physical activity behaviours in the presence of non-ignorable missing data. Appl. Statist, 65, 51–75.
Geraci M, Rich C, Sera F, Cortina-Borja M, Griffiths LJ and Dezateux C (2012) Technical report on accelerometry data processing in the Millennium Cohort Study. Technical Report. University College London, London. (Available from http://discovery.ucl.ac.uk/1361699.)
Griffiths LJ, Cortina-Borja M, Sera F, Pouliou T, Geraci M, Rich C, Cole TJ, Law C, Joshi H, Ness AR, Jebb SA and Dezateux C (2013) How active are our children?: Findings from the Millennium Cohort Study. BMJ Open, 3, article 002893.
He X and Ng P (1999) COBS: qualitatively constrained smoothing via linear programming. Computnl Statist, 14, 315–337.
He XM, Ng P and Portnoy S (1998) Bivariate quantile smoothing splines. J. R. Statist. Soc. B, 60, 537–550.
Horowitz J and Lee S (2005) Nonparametric estimation of an additive quantile regression model. J. Am. Statist. Ass, 100, 1238–1249.
Kleiner A, Talwalkar A, Sarkar P and Jordan MI (2014) A scalable bootstrap for massive data. J. R. Statist. Soc. B, 76, 795–816.
Koenker R (2005) Quantile Regression. New York: Cambridge University Press.
Koenker R and Bassett G (1978) Regression quantiles. Econometrica, 46, 33–50.
Koenker R and Geling O (2001) Reappraising medfly longevity. J. Am. Statist. Ass, 96, 458–468.
Koenker R and Mizera I (2004) Penalized triograms: total variation regularization for bivariate smoothing. J. R. Statist. Soc. B, 66, 145–163.
Koenker R, Ng P and Portnoy S (1994) Quantile smoothing splines. Biometrika, 81, 673–680.
Madsen K and Nielsen HB (1993) A finite smoothing algorithm for linear l₁ estimation. SIAM J. Optimizn, 3, 223–235.
Mizera I (2018) Quantile regression: penalized. In Handbook of Quantile Regression (eds Koenker R, Chernozhukov V, He X and Peng L), ch. 3, pp. 21–39. Boca Raton: Chapman and Hall–CRC.
Morris JS, Arroyo C, Coull BA, Ryan LM, Herrick R and Gortmaker SL (2006) Using wavelet-based functional mixed models to characterize population heterogeneity in accelerometer profiles: a case study. J. Am. Statist. Ass, 101, 1352–1364.
Ng P and Maechler M (2007) A fast and efficient implementation of qualitatively constrained quantile smoothing splines. Statist. Modllng, 7, 315–328.
Pinheiro JC and Bates DM (1995) Approximations to the log-likelihood function in the nonlinear mixed-effects model. J. Computnl Graph. Statist, 4, 12–35.
Pinheiro JC and Bates DM (1996) Unconstrained parametrizations for variance-covariance matrices. Statist. Comput, 6, 289–296.
Pinheiro J, Bates D, DebRoy S, Sarkar D and R Core Team (2017) nlme: linear and nonlinear mixed effects models. R Package Version 3.1–131.
Pinheiro JC and Chao EC (2006) Efficient Laplacian and adaptive Gaussian quadrature algorithms for multilevel generalized linear mixed models. J. Computnl Graph. Statist, 15, 58–81.
R Core Team (2018) R: a Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing.
Reich BJ, Bondell HD and Wang HJ (2010) Flexible Bayesian quantile regression for independent and clustered data. Biostatistics, 11, 337–352.
Reiss PT and Huang L (2012) Smoothness selection for penalized quantile regression splines. Int. J. Biostatist, 8.
Rich C, Geraci M, Griffiths LJ, Sera F, Dezateux C and Cortina-Borja M (2013) Quality control methods in accelerometer data processing: defining minimum wear time. PLOS One, 8, article e67206.
Rich C, Geraci M, Griffiths LJ, Sera F, Dezateux C and Cortina-Borja M (2014) Quality control methods in accelerometer data processing: identifying extreme counts. PLOS One, 9, article e85134.
Rigby RA and Stasinopoulos DM (2005) Generalized additive models for location, scale and shape. Appl. Statist, 54, 507–544.
Ruppert D, Wand M and Carroll R (2003) Semiparametric Regression. New York: Cambridge University Press.
Sera F, Griffiths LJ, Dezateux C, Geraci M and Cortina-Borja M (2017) Using functional data analysis to understand daily activity levels and patterns in primary school-aged children: cross-sectional analysis of a UK-wide study. PLOS One, 12, article e0187677.
Warburton DER, Nicol CW and Bredin SSD (2006) Health benefits of physical activity: the evidence. Can. Med. Ass. J, 174, 801–809.
Wei Y, Pere A, Koenker R and He XM (2006) Quantile regression methods for reference growth charts. Statist. Med, 25, 1369–1382.
Winkelmann R (2006) Reforming health care: evidence from quantile regressions for counts. J. Hlth Econ, 25, 131–145.
Wood SN (2003) Thin plate regression splines. J. R. Statist. Soc. B, 65, 95–114.
Wood SN (2006) Generalized Additive Models: an Introduction with R. Boca Raton: Chapman and Hall–CRC.
Yu K and Jones M (1998) Local linear quantile regression. J. Am. Statist. Ass, 93, 228–237.
Yue YR and Rue H (2011) Bayesian inference for additive mixed quantile regression models. Computnl Statist. Data Anal, 55, 84–96.
