Computational and Mathematical Methods in Medicine. 2016 Jan 14;2016:4724395. doi: 10.1155/2016/4724395

Prediction Accuracy in Multivariate Repeated-Measures Bayesian Forecasting Models with Examples Drawn from Research on Sleep and Circadian Rhythms

Clark Kogan 1, Leonid Kalachev 2, Hans P A Van Dongen 1,3,*
PMCID: PMC4808749  PMID: 27110271

Abstract

In study designs with repeated measures for multiple subjects, population models capturing within- and between-subjects variances enable efficient individualized prediction of outcome measures (response variables) by incorporating individuals' response data through Bayesian forecasting. When measurement constraints preclude reasonable levels of prediction accuracy, additional (secondary) response variables measured alongside the primary response may help to increase prediction accuracy. We investigate this for the case of substantial between-subjects correlation between primary and secondary response variables, assuming negligible within-subjects correlation. We show how to determine the accuracy of primary response predictions as a function of secondary response observations. Given measurement costs for primary and secondary variables, we determine the numbers of observations that produce, with minimal cost, a fixed average prediction accuracy for a model of subject means. We illustrate this with estimation of subject-specific sleep parameters using polysomnography and wrist actigraphy. We also consider prediction accuracy in an example time-dependent linear model and derive equations for the optimal timing of measurements to achieve, on average, the best prediction accuracy. Finally, we examine an example involving a circadian rhythm model and show numerically that secondary variables can improve individualized predictions in this time-dependent nonlinear model as well.

1. Introduction

Significant steps forward in the analysis of repeated-measures data were made with the introduction of linear and nonlinear mixed-effects models [1–3], which distinguish within-subjects variance (from multiple measurements in each subject) versus between-subjects variance (from multiple subjects being measured). Distinguishing these types of variance can also be thought of as explicitly modeling random error in the data. This can be useful in understanding how different individuals are from one another as compared to how different multiple measurements are for given individuals. In research on sleep and sleepiness, for example, breakthroughs made possible by mixed-effects models include elucidation of the dose-response effects of sustained sleep restriction on sleep architecture and neurobehavioral impairment [4, 5] and demonstration of the trait characteristics of individual differences in vulnerability to sleep loss [6]. In recent years, mixed-effects model approaches to statistical regression and analysis of variance have become widely available in statistical software packages. They are nowadays the methodology of choice for many repeated-measures investigations in sleep research and other fields of study.

A further advance was the introduction of a model individualization technique called Bayesian posterior distribution estimation or Bayesian forecasting. This technique was first used in sleep research to overcome a shortcoming of biomathematical models of fatigue and performance. Existing models did not account for individual differences in sleep regulation and vulnerability to sleep loss and therefore did not accurately predict performance for given individuals. Bayesian forecasting addressed this shortcoming by utilizing the separation of within- and between-subjects variance in model parameters as enabled by mixed-effects modeling [2]. In Bayesian forecasting, the between-subjects variance of model parameters serves as Bayesian prior information. Measurements from a new individual, not previously studied, are combined with the prior information to efficiently derive model parameters tailored to the new individual, thereby yielding a subject-specific mathematical model [2, 7, 8]. As a bonus, the Bayesian forecasting technique also yields quantitative estimates of the accuracy of individualized predictions made with the subject-specific mathematical model [9].

In a published example, Bayesian forecasting was implemented for the two-process model of sleep regulation [10, 11] to predict performance impairment of selected individuals undergoing a period of total sleep deprivation (see Figure 1). Comparisons with the individuals' actual data revealed that the model parameters converged efficiently to those that best characterized each individual, and the response predictions were significantly more accurate than could have been achieved with the original, nonindividualized two-process model [7].

Figure 1.

Lapses of attention on a psychomotor vigilance test (PVT) for a subject in a study involving 88 h of total sleep deprivation under controlled laboratory conditions. In each of the six plots, different amounts of subject data are assumed known (black dots), and the Bayesian forecasting procedure is applied to the known data to construct predictions of PVT number of lapses for a 24 h interval immediately following the most recent collected data point at time $t$. For the 24 h interval, 95% prediction intervals (vertical bars) are shown, along with the remainder of the data for comparison (gray dots). The beginning of the 88 h sleep deprivation period, $t_0$, was at 07:30. Graphs taken from [7] with permission.

In Bayesian forecasting, individualized parameter estimates are derived from the posterior distributions of the parameters in question after combining prior distributions with the measurement(s) from the individual, and individualized model predictions for future outcomes or responses are obtained following estimation of the posterior distribution of the expected responses at given times. A variety of methods are available for obtaining parameter estimates and response predictions once posterior distributions have been estimated. We employ the Bayesian mean squared error (BMSE) [12] and make use of the Bayesian minimum mean squared error (MMSE) estimator [13, 14], which produces unbiased point estimates that minimize the BMSE [15]. The BMSE of parameter estimates and response predictions is dependent on the amount of data available for the individual at hand and the magnitude of between-subjects variance captured in the Bayesian prior distributions.

In cases where between-subjects variance is relatively large, such as performance responses to sleep loss [6, 16], measurement data for the individual at hand are more critical for making accurate individualized response predictions. Sparseness of such measurement data (e.g., due to practical or cost-based limitations) can result in unacceptably low levels of accuracy. For example, Bayesian forecasting could be used to develop a drowsy driver warning system based on a mathematical model of fatigue and performance [17, 18] calibrated to predict lateral lane deviation, using camera-based measurements of lane position to individualize the model for the driver. However, when lane markers are covered with snow and cameras are unable to determine vehicle position relative to the lane, the individualization effort becomes less effective.

To address this limitation, we consider the use of secondary variables to increase data availability, boost response prediction accuracy, and/or reduce data collection costs for individualized response prediction. For example, in the case of a drowsy driver warning system, camera-based measurements of lateral lane deviation serving as the primary response variable could be augmented with in-car secondary variables such as steering wheel variability or driver eyelid closure assessments. However, individualization of predictions based on two or more measurement variables would only be straightforward if individual differences in responses on these variables are equivalent (cf. [19]). Generally, this is not what the evidence shows. As a case in point, trait individual differences in vulnerability to performance impairment due to sleep loss vary considerably across outcome variables [6, 20], such that the most vulnerable individuals based on one variable are not necessarily also the most vulnerable individuals based on another variable. Therefore, when considering two or more response variables as the basis for individualized prediction, it is essential to account statistically for the level of congruence between the response variables.

Here we develop a multivariate statistical framework for individualized prediction of sleep or performance variables, based on Bayesian forecasting with measurements of a primary response variable, augmented with one or more measurements of secondary response variables. The response variables are assumed to follow equivalent dynamics over time, such that they can be described by the same model framework after appropriate scaling. This is a reasonable assumption in the case of, for example, models describing sleep variables measured repeatedly across multiple nights [21] and models describing performance changes over time in response to sleep loss [22]. We make use of multivariate Bayesian prior distributions of the primary and secondary variables, assumed to have been assessed in advance by means of mixed-effects modeling [2] or other suitable techniques. The between-subjects correlation(s) between the primary and secondary variables, used here to account for the level of congruence between the response variables, is assumed to have been estimated as part of the covariance matrix of the multivariate Bayesian prior distributions. The between-subjects correlation(s) are assumed to be at least moderately strong, lest the secondary variables contain essentially no information about the primary response variable that rises above the level of measurement noise.

Our focus in this paper is on prediction accuracy in the multivariate Bayesian forecasting technique. We develop the technique by first considering the details of making individualized response predictions and estimating their accuracy for a simple univariate intercept model. To demonstrate how secondary, correlated responses can be used to make more accurate individualized predictions, we expand the intercept model to include both primary and secondary responses. Assuming fixed costs of data collection for each of the responses, we show how to optimize data collection given a desired level of accuracy for predictions of the primary response variable.

We then consider a linear approximation of the homeostatic component of the two-process model [11] and derive a closed form equation of the BMSE to quantify prediction accuracy in this time-dependent model. For this example, we study the problem of optimizing the timing of measurements in order to enhance the individualized prediction accuracy most efficiently. Finally, we consider more complicated bivariate models, both linear and nonlinear, for which the BMSE cannot be determined in closed form. For these models, we describe a process of numerically assessing the BMSE for individualized prediction based on a primary response variable in Bayesian forecasting augmented with measurements of a secondary response.

2. Subject-Specific Bayesian Models

First we discuss a modeling framework for Bayesian forecasting. Consider a response variable y, dependent on a subject-specific trait parameter θ. Let i be an index for individual, ranging from 1 to N, and let j be an index for observations ordered by time, nested within individual. Suppose that the response variable can be modeled by

$y_{ij} = f(t_{ij}, \theta_i) + \epsilon_{ij}$, (1)

where $f(t_{ij}, \theta_i)$ represents the model function, $t_{ij}$ represents a fixed measurement time, $\theta_i$ represents a random subject-specific parameter, and $\epsilon_{ij}$ represents additive measurement error. We assume that the distributions for $\theta_i$ and $\epsilon_{ij}$ are known. Equation (1) and the distributions of $\theta_i$ and $\epsilon_{ij}$ constitute a population model.

Limiting our focus to a particular individual, we may remove the subscript i and model the subject's responses as

$y_j = f_j + \epsilon_j$, (2)

where $f_j$ is used to denote $f(t_j, \theta)$.

Suppose that a total of m responses ($y_1,\ldots,y_m$) have been measured for the individual at hand, and let j (with j > m) denote the index of a response $y_j$ at some future time $t_j$. We consider a prediction (estimator) of the expected response $E[y_j \mid \theta] = f_j$, which we denote as $\hat f_j$. Our interest is in constructing $\hat f_j$ such that the expected squared error for an arbitrary, given individual from the population is minimized.

We define the accuracy using the squared error $(f_j - \hat f_j)^2$. The expected squared error, which is referred to as the Bayesian mean squared error (BMSE), is thus given by

$M(\hat f_j) \equiv \mathrm{BMSE}(\hat f_j) = E\left[\left(f_j - \hat f_j\right)^2\right]$, (3)

where the expectation is taken with respect to the marginal probability density function (pdf) of $y_1,\ldots,y_m$ and $\theta$.

We refer to the prediction that minimizes the BMSE as a minimum mean squared error (MMSE) prediction. After observing data from a particular individual, $y_1,\ldots,y_m$, and constructing the MMSE prediction $\hat f_j$, we seek to assess the expected accuracy of this particular prediction. This can be done with confidence intervals on $f_j$, obtained from quantiles of the posterior distribution of $\theta \mid y_1,\ldots,y_m$. In the following sections, we describe specific types of models and investigate the BMSE and the MMSE that minimizes it.
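To make the definition in (3) concrete, the following Python sketch approximates the BMSE by Monte Carlo simulation over the joint distribution of $\theta$ and the data, using the intercept model of Section 3 (whose MMSE predictor is known in closed form). All numerical values are illustrative assumptions, not results from the paper.

```python
# Minimal sketch: Monte Carlo approximation of the BMSE in (3).
# The model and parameter values are illustrative (the intercept model
# of Section 3 is used because its MMSE predictor has a closed form).
import numpy as np

rng = np.random.default_rng(0)
mu, delta, sigma, m = 0.0, 1.0, 1.0, 5   # prior mean/SD, error SD, sample size
n_sim = 200_000

theta = rng.normal(mu, delta, n_sim)                     # subjects drawn from the prior
y = theta[:, None] + rng.normal(0.0, sigma, (n_sim, m))  # m noisy responses per subject

# MMSE prediction of f_j = theta, using (8)-(10) from Section 3
v = delta**2 / (delta**2 + sigma**2 / m)
f_hat = v * y.mean(axis=1) + (1 - v) * mu

bmse_mc = np.mean((theta - f_hat) ** 2)                              # estimate of (3)
bmse_closed = (sigma**2 / m) * delta**2 / (delta**2 + sigma**2 / m)  # (11)-(12)
print(bmse_mc, bmse_closed)   # the two agree up to Monte Carlo error
```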

3. Univariate Random Intercept Model

We consider a random intercept model obtained from (2) by letting $f_j = \theta$:

$y_j = \theta + \epsilon_j$, (4)
$\theta \sim N\left(\mu, \delta^2\right)$, (5)
$\epsilon_j \sim N\left(0, \sigma^2\right)$, (6)

where $j = 1,\ldots,m$. For a particular individual, the expected response at time $t_j$ is

$E[y_j \mid \theta] = f_j = \theta$. (7)

It follows that the MMSE prediction of $E[y_j \mid \theta]$ is

$\hat f_j = \hat\theta$, (8)

where $\hat\theta$ is the MMSE estimate of $\theta$. The estimator $\hat\theta$ (and therefore $\hat f_j$) is given by [23]:

$\hat\theta = \upsilon\,\bar y + (1 - \upsilon)\,\mu$, (9)

where $\bar y$ is the sample mean of the measured responses and

$\upsilon = \dfrac{\delta^2}{\delta^2 + \sigma^2/m}$. (10)

The variance of the posterior distribution of $\theta \mid y_1,\ldots,y_m$ (and therefore of $f_j \mid y_1,\ldots,y_m$) is [23]:

$\mathrm{Var}(\theta \mid y_1,\ldots,y_m) = \dfrac{\sigma^2}{m}\,\dfrac{\delta^2}{\delta^2 + \sigma^2/m}$. (11)

Furthermore, the BMSE of $\hat\theta$ (and therefore $\hat f_j$) is given by [23]:

$M(\hat\theta) = \mathrm{Var}(\theta \mid y_1,\ldots,y_m)$. (12)

The MMSE prediction $\hat f_j$ (given by (8) and (9)) represents a trade-off between knowledge about the population and knowledge about the subject at hand. This trade-off is embodied by the weighting factor $\upsilon$ in (10). When no data are available for the subject at hand, $\upsilon = 0$, resulting in the prediction $\hat f_j = \mu$, where $\mu$ is, in this case, representative of the population mean $E[\theta]$ as well as the population mean response $E[y_j]$:

$E[y_j] = E[\theta] = \mu$. (13)

As we begin to collect subject-specific data, the weighting factor moves towards the value υ = 1, resulting in a prediction that converges to the subject-specific sample mean as data collection continues.

Likewise, when no data are available for the subject at hand, the expected accuracy of the prediction is $M(\hat f_j) = \delta^2$. The term $\delta^2$ is, in this case, representative of the population variance as well as the variance in mean response over the population:

$\mathrm{Var}(E[y_j \mid \theta]) = E\left[\left(E[y_j \mid \theta] - E\big[E[y_j \mid \theta]\big]\right)^2\right] = E\left[\left(E[y_j \mid \theta] - E[y_j]\right)^2\right] = E\left[(\theta - \mu)^2\right] = \delta^2$, (14)

where the expectation $E[y_j \mid \theta]$ is evaluated with respect to the conditional distribution of $y_j \mid \theta$, and all other expectations are evaluated with respect to the marginal distribution of $y_j$. Per (12), as we begin to collect subject-specific data, the expected accuracy of the prediction improves (i.e., the BMSE decreases, where smaller is better).

To illustrate how predictions in this model depend on the amount of subject-specific data collected, we conducted a simulation of model (4) for a particular individual. For this example, the parameter $\theta$ for the individual was chosen far from the population mean (when compared to the magnitude of the population variance), in order to make the transition from the population mean to the true expected response $f_j$ large enough so as not to be obscured by measurement noise in the example. We assumed the population parameters $\mu = 0$ and $\delta = 1$ and chose the subject-specific parameter value $\theta = 1.4$. We simulated $m = 10$ data points for the individual, with a standard deviation of measurement error of $\sigma = 1$. The MMSE prediction $\hat f_j$ was calculated by incorporating observations $y_j$ iteratively. Figure 2 shows $\hat f_j$ plotted against the number of data points used. The variance of the posterior distribution for $f_j$ from (11) was used to construct confidence intervals on $f_j$. As expected, for the individual considered, the prediction $\hat f_j$ began at the population mean and moved to the true expected response with shrinking confidence interval as more simulated data were collected.

Figure 2.

MMSE predictions of the expected response $\hat f_j$ from the univariate random intercept model specified in (4), and corresponding 95% confidence intervals, determined by incorporating simulated observations $y_j$ iteratively. The individualized Bayesian forecasting predictions begin at the population mean expected response $\mu = 0$ when no individual data are used and converge to the subject's expected individualized response $f_j = \theta = 1.4$ as more subject-specific data are collected.
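A minimal Python sketch of this iterative procedure, under the parameter values stated above, is given below; the random draws (and hence the exact trajectory) are illustrative only.

```python
# Sketch of the Figure 2 simulation: iterative MMSE predictions with 95%
# confidence intervals for the univariate random intercept model (4)-(6).
import numpy as np

rng = np.random.default_rng(1)
mu, delta, sigma = 0.0, 1.0, 1.0         # population parameters from the text
theta = 1.4                              # subject-specific parameter from the text
y = theta + rng.normal(0.0, sigma, 10)   # m = 10 simulated responses

for m in range(1, len(y) + 1):
    v = delta**2 / (delta**2 + sigma**2 / m)                          # weighting factor (10)
    f_hat = v * y[:m].mean() + (1 - v) * mu                           # MMSE prediction (9)
    post_var = (sigma**2 / m) * delta**2 / (delta**2 + sigma**2 / m)  # posterior variance (11)
    half = 1.96 * np.sqrt(post_var)                                   # 95% CI half-width
    print(f"m={m:2d}  f_hat={f_hat:+.3f}  95% CI=({f_hat - half:+.3f}, {f_hat + half:+.3f})")
```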

4. Bivariate Subject-Specific Bayesian Models

When subject-specific data are sparse, individualized predictions may not, on average, reach acceptable levels of accuracy. Accuracy may be improved by including data from a secondary subject-specific data source. However, individual differences in one response variable may not be identical to individual differences in another (e.g., [6, 21]). Therefore, data from a secondary response variable may not simply be used as a substitute for the primary response variable. Rather, to improve prediction accuracy on the primary response by incorporating data from the secondary response, the between-subjects correlation between the primary and secondary response variables must be taken into account.

Here we derive the average accuracy of individualized predictions of a primary response variable based on distinct primary and secondary subject-specific response variables, accounting for between-subjects correlation between the two responses. Let i be an index for individual, let r be an index for response type, and let j be an index for observations ordered by time, nested within individual and response type. Suppose that the response $y_{rij}$ can be modeled by

$y_{rij} = f(t_{rij}, \theta_{ri}) + \epsilon_{rij}$, (15)

where $f(t_{rij}, \theta_{ri})$ represents the model function, $t_{rij}$ represents measurement time, $\theta_{ri}$ represents a random subject-specific parameter associated with response type r, and $\epsilon_{rij}$ represents additive measurement error. Limiting our focus to a particular individual, we may remove the subscript i and model the subject's responses as

$y_{rj} = f_{rj} + \epsilon_{rj}$, (16)

where $f_{rj}$ is used to denote $f(t_{rj}, \theta_r)$. Suppose that a total of $m_r$ responses have been observed for each response type r for the individual at hand. In the next sections, we focus on constructing the MMSE prediction for the expected primary response, $\hat f_{1j}$.

5. Bivariate Random Intercept Model

We consider the bivariate random intercept model obtained from (16) by letting $f_{rj} = \theta_r$:

$y_{rj} = \theta_r + \epsilon_{rj}$, (17)

where r = 1,2. The scalar model can be converted to vector form by concatenating the responses for each response type,

$y_1 = \begin{pmatrix} y_{11} \\ \vdots \\ y_{1m_1} \end{pmatrix}, \quad y_2 = \begin{pmatrix} y_{21} \\ \vdots \\ y_{2m_2} \end{pmatrix}$, (18)

and then concatenating the response vectors of different types,

$y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$. (19)

A similar assembly of the measurement errors can be done so that

$\epsilon_1 = \begin{pmatrix} \epsilon_{11} \\ \vdots \\ \epsilon_{1m_1} \end{pmatrix}, \quad \epsilon_2 = \begin{pmatrix} \epsilon_{21} \\ \vdots \\ \epsilon_{2m_2} \end{pmatrix}, \quad \epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \end{pmatrix}$. (20)

An assembly of the parameters can be accomplished by first constructing the parameter vector,

$\theta = \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix}$, (21)

and the design matrix,

$H = \begin{pmatrix} 1 & 0 \\ \vdots & \vdots \\ 1 & 0 \\ 0 & 1 \\ \vdots & \vdots \\ 0 & 1 \end{pmatrix}$, (22)

so that the single individual model of (17) can be vectorized as

$y = H\theta + \epsilon$. (23)

We consider the case where subject-specific traits and measurement errors are both normally distributed,

$\theta \sim N\left(\mu, C_\theta\right)$, (24)
$\epsilon \sim N\left(0, C_\epsilon\right)$, (25)

where $\mu$, $C_\theta$, and $C_\epsilon$ are fixed population characteristics.

Correlations between primary and secondary response variables $y_1$ and $y_2$ can be modeled as arising from a correlation between $\theta_1$ and $\theta_2$ (between-subjects correlation) or correlations between $\epsilon_1$ and $\epsilon_2$ (within-subjects correlation). Here, we assume that response correlations arise from between-subjects correlations:

$C_\theta = \begin{pmatrix} \delta_1^2 & \rho\,\delta_1\delta_2 \\ \rho\,\delta_1\delta_2 & \delta_2^2 \end{pmatrix}$, (26)

where $\rho$ ($-1 < \rho < 1$) represents the between-subjects correlation between the primary and secondary response variable means, and $\delta_r^2$ represents the between-subjects variance for response variable r. Furthermore, we assume that measurement errors are uncorrelated, with response-variable-specific variance $\sigma_r^2$, so that correlations between the response variables arise only from the between-subjects components. For subject-specific repeated-measures data on two response variables with no covariates, it may be fair to treat the within-subjects errors as independent as long as perturbations from the intercepts do not tend to be shared across the two response types.

The error covariance matrix for response type r is a diagonal matrix with dimension $m_r$, where each nonzero element is the type-specific error variance $\sigma_r^2$,

$\Sigma_r = \begin{pmatrix} \sigma_r^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_r^2 \end{pmatrix}$. (27)

The full error covariance matrix can then be built from the type-specific blocks,

$C_\epsilon = \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix}$. (28)

The bivariate Bayesian model we consider here is fully characterized by (23)–(28).

As was the case for model (4), for a particular individual, the expected primary response at time $t_{1j}$ is

$E[y_{1j} \mid \theta] = f_{1j} = \theta_1$. (29)

The MMSE prediction is

$\hat f_{1j} = \hat\theta_1$. (30)

The MMSE estimator for model (23) is [23]:

$\hat\theta = \mu + \left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1} H^\top C_\epsilon^{-1}\left(y - H\mu\right)$. (31)

The MMSE estimator $\hat\theta_1$ can be extracted as the first element of $\hat\theta$.

The variance of the posterior distribution of $\theta \mid y$ is given by

$\mathrm{Var}(\theta \mid y) = \left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1}$. (32)

The variance of the posterior distribution of $\theta_1 \mid y$ (and therefore $f_{1j} \mid y$) can be obtained by extracting the first diagonal element of $\mathrm{Var}(\theta \mid y)$. The BMSE of the MMSE estimator $\hat\theta$ can be obtained from the parameter BMSE matrix $M_{\hat\theta}$, which is given by [23]:

$M_{\hat\theta} = \mathrm{Var}(\theta \mid y)$. (33)

The BMSE of $\hat\theta_1$ (and therefore $\hat f_{1j}$) can be obtained by extracting the first diagonal element of $M_{\hat\theta}$. Substituting (22), (26), and (28) into (33), we find that the BMSE for $\hat\theta_1$ can be simplified as follows:

$M(\hat\theta_1) = \left(\dfrac{m_1}{\sigma_1^2} + \dfrac{1}{\delta_1^2} + \lambda(m_2)\right)^{-1}$, (34)

where

$\lambda(m_2) = \dfrac{\rho^2\,\delta_2^2/\delta_1^2}{\delta_2^2\left(1 - \rho^2\right) + \sigma_2^2/m_2}$. (35)

Figure 3 illustrates the dependence of the BMSE of $\hat f_{1j}$ on the numbers of observations from the primary and secondary responses. For this illustration, the population parameters were fixed at the values $\delta_1 = 1$, $\delta_2 = 1$, $\rho = 0.85$, $\sigma_1 = 1$, and $\sigma_2 = 1$. The figure shows the decrease of the BMSE as a function of the number of secondary response measurements collected, given different numbers of primary response measurements. For a large number of primary response measurements, little change in BMSE is derived from the secondary response. However, for a small number of primary response measurements, the BMSE decreases substantially with just a few measurements of the secondary response.

Figure 3.

BMSE (i.e., accuracy) of MMSE predictions of the expected primary response $\hat f_{1j}$ for the bivariate random intercept model specified in (23), as a function of the amount of data collected on a secondary response variable ($m_2$), shown for different amounts of data collected on the primary response variable ($m_1$). The figure illustrates that the accuracy of individualized Bayesian forecasting predictions improves progressively with just a few measurements of the secondary response variable when measurements of the primary response variable are increasingly sparse.
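The curves in Figure 3 can be regenerated directly from (34) and (35); the sketch below does so for the stated population values (the grid of $m_1$ values is an arbitrary choice for display).

```python
# Sketch reproducing the quantities behind Figure 3: the BMSE of the primary
# prediction, from (34)-(35), as a function of m1 and m2.
import numpy as np

def lam(m2, rho, d1, d2, s2):
    """lambda(m2) from (35); the contribution of secondary data to the precision."""
    if m2 == 0:
        return 0.0
    return (rho**2 * d2**2 / d1**2) / (d2**2 * (1 - rho**2) + s2**2 / m2)

def bmse_theta1(m1, m2, rho, d1, d2, s1, s2):
    """BMSE of the MMSE estimator of theta_1, from (34)."""
    return 1.0 / (m1 / s1**2 + 1.0 / d1**2 + lam(m2, rho, d1, d2, s2))

d1 = d2 = s1 = s2 = 1.0
rho = 0.85                      # population values used for Figure 3
for m1 in (0, 1, 5, 20):
    row = [bmse_theta1(m1, m2, rho, d1, d2, s1, s2) for m2 in range(11)]
    print(f"m1={m1:2d}:", np.round(row, 3))
```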

To further illustrate the bivariate Bayesian forecasting procedure, simulated subject-specific parameter pairs $\theta = (\theta_1, \theta_2)$ from the population distribution given by (24) with $\delta_1 = 1$, $\delta_2 = 1$, and $\rho = 0.9$ were generated for $N = 20$ individuals, as shown in Figure 4. From this simulated set, individual #19 (circled in Figure 4) with subject-specific parameter vector $\theta = (-1.6, -2.0)$ was chosen to illustrate the transition of the primary response prediction from the population mean response to the true expected response $f_{1j}$ through Bayesian forecasting. For this individual, we simulated errors from (25) using $\sigma_1 = 2$, $\sigma_2 = 0.5$, $m_1 = 10$, and $m_2 = 10$. Bivariate responses were then constructed using (23).

Figure 4.

Subject-specific parameter pairs, simulated from a probability distribution specified by (24), from which a simulated individual was selected for a bivariate Bayesian forecasting simulation. Each number represents the parameter pair $\theta = (\theta_1, \theta_2)$ for a different simulated individual. The circled individual (#19) is used for the illustration in Figure 5. The diagonal line shows where the points would fall given a between-subjects correlation of $\rho = 1$.

The MMSE prediction $\hat f_{1j}$ for individual #19 was iteratively determined, first assuming only primary responses were observed and then assuming pairs of primary and secondary responses were observed. The iterative estimates for both cases are shown in Figure 5, along with the simulated data. The variance of the posterior distribution for $f_{1j}$ from (32) was used to construct 95% confidence intervals on $f_{1j}$. For the individual considered, the predictions based on both response variables (purple line) were more accurate than the predictions based on only the primary response (blue line) for the first few iterations of MMSE prediction. Note, however, that while this behavior of the prediction accuracy holds on average, as follows from (34), it is not necessarily true for each and every individual to which the procedure may be applied. Thus, caution is needed in relying on bivariate Bayesian forecasting for improved prediction accuracy for specific individuals; improved prediction accuracy can only be counted on in the average over individuals.

Figure 5.

MMSE predictions of the expected primary response $\hat f_{1j}$ from the bivariate random intercept model specified in (23), and corresponding 95% confidence intervals (shaded areas) for individual #19. The MMSE estimator $\hat f_{1j}$ was iteratively determined assuming only primary responses were observed, as well as assuming both primary and secondary responses were observed. For the former case, the MMSE was iteratively determined by incorporating observations $y_{1j}$; for the latter case, by incorporating pairs of observations $(y_{1j}, y_{2j})$. Confidence intervals were obtained from quantiles of the posterior distribution, which is defined by the posterior mean (the MMSE estimator) (31) and the posterior variance (32).
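The forecasting step itself is a direct application of (31) and (32). The sketch below assembles the vectorized model (23) for the simulation values stated above and computes the MMSE estimate and posterior variance; the population mean $\mu$ is set to zero as an assumption, since the text does not state its value.

```python
# Sketch of one bivariate forecasting step: MMSE estimator (31) and posterior
# variance (32) for the vectorized model (23). mu = 0 is an assumed value.
import numpy as np

rng = np.random.default_rng(19)
d1, d2, rho = 1.0, 1.0, 0.9          # between-subjects SDs and correlation
sig1, sig2 = 2.0, 0.5                # error SDs, as stated in the text
m1 = m2 = 10
mu = np.zeros(2)
theta = np.array([-1.6, -2.0])       # individual #19 in Figure 4

C_theta = np.array([[d1**2, rho * d1 * d2],
                    [rho * d1 * d2, d2**2]])                          # (26)
H = np.vstack([np.column_stack([np.ones(m1), np.zeros(m1)]),
               np.column_stack([np.zeros(m2), np.ones(m2)])])         # (22)
C_eps = np.diag(np.r_[np.full(m1, sig1**2), np.full(m2, sig2**2)])    # (27)-(28)

y = H @ theta + rng.normal(0.0, np.sqrt(np.diag(C_eps)))              # (23)

prec = np.linalg.inv(C_theta) + H.T @ np.linalg.inv(C_eps) @ H
post_var = np.linalg.inv(prec)                                        # (32)
theta_hat = mu + post_var @ H.T @ np.linalg.inv(C_eps) @ (y - H @ mu) # (31)

print("theta_hat =", theta_hat)                  # first element is f_hat_{1j}
print("posterior SD of theta_1 =", np.sqrt(post_var[0, 0]))
```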

6. Data Collection Cost Minimization

When Bayesian forecasting is applied to individualize predictions, data must be collected to tailor the population model to the individual at hand. In certain sleep research applications, such as forecasting of sleep parameters across nights or predicting performance deficits across periods of sleep deprivation, and in a wide range of other biomedical contexts, this requires creating multiple opportunities for taking measurements. This may be an expensive proposition, and reducing the number of measurement bouts needed to obtain the necessary data could entail considerable cost savings. By measuring secondary responses and incorporating these through bivariate Bayesian forecasting, it may be possible to achieve a given level of prediction accuracy at lower overall cost of data acquisition. Here we explore this possibility in the case of the bivariate random intercept model.

We consider a scenario in which the cost of collecting an observation on the primary response is $c_1$, the cost of collecting an observation on the secondary response is $c_2$, and the total cost of data collection is the sum of primary and secondary response costs accrued,

$c = m_1 c_1 + m_2 c_2$, (36)

where $m_1, m_2 \ge 0$. For this scenario, we determine how many observations $m_1$ and $m_2$ we may expect to have to collect from each response type in order to minimize the total cost of achieving, on average, a given prediction accuracy $\eta^2$:

$M(\hat f_{1j}) = \eta^2$. (37)

We can simplify the minimization problem by removing either $m_1$ or $m_2$ from both the total cost equation and the nonnegativity constraints on the number of data points. Using $\hat f_{1j} = \hat\theta_1$ (see (30)), it follows that the BMSE of $\hat f_{1j}$ is equal to the BMSE of $\hat\theta_1$ (i.e., $M(\hat f_{1j}) = M(\hat\theta_1)$). Fixing $M(\hat f_{1j}) = \eta^2$ in (37) therefore implies that $M(\hat\theta_1) = \eta^2$, which can be used with (34) to obtain a relationship between the numbers of primary and secondary observations needed to meet the average accuracy criterion $\eta^2$:

$m_1 = \sigma_1^2\left(\dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2} - \lambda(m_2)\right)$. (38)

We then substitute (38) into (36):

$c = c_1\,\sigma_1^2\left(\dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2} - \lambda(m_2)\right) + c_2\,m_2$. (39)

The constraint $m_1 \ge 0$ can be equivalently formulated as an upper bound on $\lambda(m_2)$:

$\lambda(m_2) \le \dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2}$. (40)

Consideration of this constraint is only necessary if there is some number of measurements on the secondary response for which it is possible to obtain the desired accuracy without measurement of the primary response; that is,

$\lim_{m_2 \to \infty} \lambda(m_2) \ge \dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2}$. (41)

Substituting for $\lambda(m_2)$ from (35), this latter condition can be reformulated as follows:

$\delta_1^2\left(1 - \rho^2\right) \le \eta^2$. (42)

Thus, $m_2$ has an upper bound when $\eta^2$, the desired BMSE of $\hat f_{1j}$, is not smaller than $\delta_1^2(1 - \rho^2)$, the minimum BMSE of $\hat f_{1j}$ that can be obtained using only the secondary response. It follows that the constraint set for the minimization problem is

$0 \le m_2 \le \dfrac{\sigma_2^2\left(1/\eta^2 - 1/\delta_1^2\right)}{\delta_2^2\left(1-\rho^2\right)\left(\dfrac{1}{\delta_1^2\left(1-\rho^2\right)} - \dfrac{1}{\eta^2}\right)}$ if $\delta_1^2\left(1-\rho^2\right) \le \eta^2$; $\quad 0 \le m_2 < \infty$ otherwise. (43)

If $m_2$ exceeds its upper bound, then the number of secondary response measurements is more than what is minimally needed to meet the average accuracy criterion (37).

The minimal cost solution occurs either on the boundary of the region defined by (43) or at a local minimum in the interior of this region. Figure 6 shows how different values of the primary error standard deviation $\sigma_1$ can result in either boundary or interior solution types. For this demonstration, the population parameters are set at $\rho = 0.85$, $\delta_1 = 1.0$, $\delta_2 = 1.0$, and $\sigma_2 = 0.5$, the costs of measurement are assumed to be $c_1$ = \$500 and $c_2$ = \$100, and the BMSE of $\hat f_{1j}$ is fixed at the value $\eta^2 = 0.30$.

Figure 6.

Illustration of three types of absolute minima of the total cost of obtaining a fixed average prediction accuracy inside the region defined by (43). Each plot shows the total cost $c = c_1 m_1 + c_2 m_2$ plotted against secondary response sample size $m_2$. For given $m_2$, the primary response sample size $m_1$ that obtains a fixed BMSE of $\eta^2$ in the expected primary response $\hat f_{1j}$ is computed using (38). The lower and upper boundaries of the region defined by (43) are shown with solid vertical lines. (a) shows an interior point minimum, obtained by letting $\sigma_1 = 1.00$. This type of solution occurs when the minimum of the unconstrained cost function lies within the region defined by (43). (b) shows a lower boundary solution, obtained by letting $\sigma_1 = 0.15$. This type of solution occurs when the minimum of the unconstrained cost function lies below the feasible region defined by (43). (c) shows an upper boundary solution, obtained by letting $\sigma_1 = 3.00$. This type of solution occurs when the minimum of the unconstrained cost function lies above the region defined by (43).

In cases where the solution lies in the interior of (43) at a local minimum, the solution must occur at critical points of (39), which can be found by setting to zero the derivative of the total cost with respect to $m_2$:

$\dfrac{\partial c}{\partial m_2} = -c_1\,\sigma_1^2\,\dfrac{\partial \lambda(m_2)}{\partial m_2} + c_2 = 0$, (44)

where

$\dfrac{\partial \lambda(m_2)}{\partial m_2} = \dfrac{\sigma_2^2\,\rho^2\,\delta_2^2/\delta_1^2}{\left(m_2\,\delta_2^2\left(1-\rho^2\right) + \sigma_2^2\right)^2}$. (45)

Substituting the above expression for $\partial\lambda(m_2)/\partial m_2$ into (44) and solving for $m_2$, we obtain the following two critical points:

$m_2^{\pm} = \dfrac{\sigma_2}{\delta_1\,\delta_2^2\left(1-\rho^2\right)}\left(\pm\sqrt{\dfrac{c_1}{c_2}}\,\delta_2\,\sigma_1\,\rho - \delta_1\,\sigma_2\right)$. (46)

The smaller critical point $m_2^-$ can be disregarded as a possible solution since it is always less than zero. The second derivative at $m_2^+$,

$\dfrac{\partial^2 c}{\partial m_2^2}\left(m_2^+\right) = 2\,c_2\sqrt{\dfrac{c_2}{c_1}}\;\dfrac{\left(1-\rho^2\right)\delta_1\,\delta_2}{\rho\,\sigma_1\,\sigma_2}$, (47)

is positive when $|\rho| < 1$, which implies that the cost function exhibits a local minimum at this point. If the local minimum $m_2^+$ is inside the region defined by (43) (see Figure 6(a)), then the solution to the cost minimization problem is

$\hat m_1 = \sigma_1^2\left[\dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2\left(1-\rho^2\right)}\left(1 - \rho\sqrt{\dfrac{c_2\,\delta_1^2/\sigma_1^2}{c_1\,\delta_2^2/\sigma_2^2}}\right)\right], \qquad \hat m_2 = m_2^+$, (48)

where $\hat m_1$ is determined by substituting $\hat m_2$ into (38), and the total cost is found from (36).

Alternatively, if $m_2^+$ is below the lower boundary of the region defined by (43) (see Figure 6(b)), then the minimal cost solution involves collecting no data from the secondary response. The condition under which the secondary response is not part of the minimal cost solution is as follows:

$\dfrac{c_2}{c_1} > \dfrac{\delta_2^2/\sigma_2^2}{\delta_1^2/\sigma_1^2}\,\rho^2$, (49)

where $\delta_r^2/\sigma_r^2$ reflects the between-to-within variance ratio for the rth response type. For this case, the solution that achieves, on average, the level of accuracy $\eta^2$ is found from (38):

$\hat m_1 = \sigma_1^2\left(\dfrac{1}{\eta^2} - \dfrac{1}{\delta_1^2}\right), \qquad \hat m_2 = 0$. (50)

Finally, if $m_2^+$ is above the upper boundary of the region defined by (43) (see Figure 6(c)), then the solution for $m_2$ occurs at this boundary, where all the data are collected from the secondary response and none from the primary response. For this case, the solution is as follows:

$\hat m_1 = 0, \qquad \hat m_2 = \dfrac{\sigma_2^2\left(1/\eta^2 - 1/\delta_1^2\right)}{\delta_2^2\left(1-\rho^2\right)\left(\dfrac{1}{\delta_1^2\left(1-\rho^2\right)} - \dfrac{1}{\eta^2}\right)}$. (51)

Figure 7 illustrates the numbers of observations required from primary and secondary responses to obtain a fixed level of accuracy $\eta^2$ on average, for different values of the between-subjects correlation $\rho$ between primary and secondary response parameters. For the example shown, the population model parameters and cost parameters were fixed at $\delta_1 = 1$, $\delta_2 = 1$, $\sigma_1 = 1$, $\sigma_2 = 0.5$, $c_1 = 5$, and $c_2 = 1$, and the desired level of accuracy was $\eta^2 = 0.30$. The figure illustrates the three cases described by (48), (50), and (51).

Figure 7.

Example of data collection cost minimization, showing the numbers of observations on primary and secondary responses needed to obtain a fixed BMSE in the MMSE prediction of the expected primary response $\hat f_{1j}$, for different values of the between-subjects correlation between primary and secondary response parameters. The solid curve represents the number of measurements to collect from the primary response, $\hat m_1$, and the dashed curve represents the number of measurements to collect from the secondary response, $\hat m_2$. In this example, for $0.0 \le |\rho| \le 0.18$, no data are to be collected from the secondary response, and the number of data points to collect from the primary response is obtained from (50). For $0.18 < |\rho| \le 0.85$, the number of secondary observations increases and the number of primary observations decreases with increasing correlation, as specified by $\hat m_2$ and $\hat m_1$ in (48). For $0.85 < |\rho| \le 1.0$, observations are to be collected only from the secondary response, the number of which is given by (51). The equation numbers are indicated near the relevant pieces of the curves.
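The three-case solution of this section is straightforward to mechanize. The following sketch implements (38), (43), and (46), with clipping logic that selects among the interior solution (48) and the boundary solutions (50) and (51); it returns continuous (non-integer) sample sizes, as in the text, and uses the Figure 7 parameter values for illustration.

```python
# Sketch of the cost-minimization solution: interior point (48), lower
# boundary (50), or upper boundary (51), selected by clipping m2+ from (46)
# to the feasible interval (43). Returns continuous sample sizes.
import numpy as np

def lam(m2, rho, d1, d2, s2):
    if m2 <= 0:
        return 0.0
    return (rho**2 * d2**2 / d1**2) / (d2**2 * (1 - rho**2) + s2**2 / m2)

def optimal_design(eta2, rho, d1, d2, s1, s2, c1, c2):
    # Upper bound on m2 from (43); finite only if delta_1^2 (1 - rho^2) <= eta^2
    if d1**2 * (1 - rho**2) <= eta2:
        m2_max = (s2**2 * (1 / eta2 - 1 / d1**2)) / (
            d2**2 * (1 - rho**2) * (1 / (d1**2 * (1 - rho**2)) - 1 / eta2))
    else:
        m2_max = np.inf
    # Interior critical point m2+ from (46)
    m2_plus = (s2 * (np.sqrt(c1 / c2) * d2 * s1 * rho - d1 * s2)
               / (d1 * d2**2 * (1 - rho**2)))
    m2 = float(np.clip(m2_plus, 0.0, m2_max))   # boundary cases (50) and (51)
    m1 = max(s1**2 * (1 / eta2 - 1 / d1**2 - lam(m2, rho, d1, d2, s2)), 0.0)  # (38)
    return m1, m2

# Figure 7 example values
m1, m2 = optimal_design(eta2=0.30, rho=0.85, d1=1, d2=1, s1=1, s2=0.5, c1=5, c2=1)
print(m1, m2)   # continuous optimum; round to integers in practice
```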

7. Example: Efficient Assessment of an Individual's Characteristic Wakefulness after Sleep Onset

To illustrate the cost minimization approach outlined in the previous section, we apply it in an example involving the assessment of wakefulness after sleep onset (WASO) in laboratory-based sleep studies. Here we define WASO as the duration of intermittent wakefulness during a sleep period, between the time of sleep onset and the time of final awakening. WASO can be measured by polysomnography (PSG), that is, measuring the sleep electroencephalogram (EEG) and other physiological sleep signals and scoring sleep/wake states, typically in 30 s epochs, based on those signals. PSG is the gold standard procedure for sleep/wake assessment, but it is labor-intensive and expensive to perform. WASO may also be measured in the laboratory using wrist actigraphy (i.e., wrist activity monitoring), which is considerably less expensive. Although actigraphy is not considered a gold standard for measuring WASO, the correspondence with PSG-based WASO is at least moderate in healthy populations [24].

We base our example on data from n = 33 subjects (ages 22–38; 15 females) who spent between 6 and 13 nights and days inside a controlled laboratory environment with 10 h in bed for sleep (22:00–08:00) each day. The Institutional Review Board (IRB) of Washington State University approved the research, and subjects gave written informed consent. WASO was measured using both PSG (WASO-P) and actigraphy (WASO-A). PSG recordings were performed using digital equipment (Nihon Kohden, Foothill Ranch, CA). Sleep stages and periods of wakefulness were scored in 30 s epochs using standard criteria [25], and WASO-P was calculated from the scored records. Actigraphic recordings were made with Motionlogger wrist actigraphs (Ambulatory Monitoring, Inc., Ardsley, NY). Sleep and wakefulness were assessed from the actigraphic records using the automated algorithm of [26], which calculated WASO-A.

Let WASO-P be the primary response (as it is the gold standard measure) and let WASO-A be the secondary response. Assuming that WASO-P and WASO-A are normally distributed around distinct subject-specific means, we apply the model defined by (23)–(28) to our example. We anticipate that the subject-specific means for WASO-P and WASO-A are positively correlated. We aim to determine a cost-effective data collection scheme given a specific level of desired accuracy in estimates of an individual's mean WASO-P. For illustration purposes, we assume a fixed cost of $1,250 per night for WASO-P and $150 per night for WASO-A. We wish to estimate an individual's mean WASO-P to an average accuracy of $\eta = 15$ min (i.e., a BMSE of $\eta^2 = 225$ min²). Using the equations derived in the previous section, we estimate the numbers of nights of WASO-P and WASO-A that most cost-effectively achieve the desired level of accuracy on average.

We have estimated the population model parameters using the nlme package for R [27]. This package fits mixed-effects models using the approach of [28]. The maximum likelihood estimation method tends to underestimate variance parameters [29]; therefore, we estimate these parameters using the restricted maximum likelihood method [6].

In our dataset, the overall means for WASO-P and WASO-A were estimated as follows: μ 1 = 54 min ± 5 min and μ 2 = 32 min ± 4 min (estimate ± standard error), indicating that WASO-A tended to underestimate the total amount of WASO as compared to PSG. The estimated variability between subjects for WASO-P and WASO-A was found to be the same: δ 1 = δ 2 = 21 min. There was a substantial correlation between subject means for WASO-P and WASO-A: ρ = 0.69 (95% confidence interval: [0.30, 0.90]). The within-subject variation around the subject mean was σ 1 = 32 min for WASO-P and σ 2 = 21 min for WASO-A.

We determine from (43) that $m_2$, the number of actigraphy nights, is not bounded above; that is, we cannot achieve our accuracy with actigraphy alone. Further, we find from (46) that $m_2^+$ is within the feasible region defined by (43), and, therefore, the solution to the cost minimization problem is given by (48). Applying these equations, we achieve an average accuracy of $\eta = 15$ min for minimal cost by collecting 3.99 nights of actigraphy and 0.73 nights of PSG (see Figure 8(a)). An approximately optimal solution in the integer domain is found by the common practice of rounding the optimal continuous solution to the nearest integer values [30]. We verify through a grid search that the minimal cost integer solution neighbors the analytic solution and can be obtained by rounding down to three nights of actigraphy and up to one night of PSG. This yields a total cost of $1700. In contrast, achieving the same accuracy with PSG alone would require 2.24 nights, or in practice three nights of measurement, for a total cost of $3750.
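As a numeric cross-check, the sketch below plugs the rounded population estimates quoted above into (46), (35), and (38). Because the paper's 3.99 and 0.73 were presumably computed from unrounded estimates, the values here come out slightly different (about 3.88 and 0.81); the integer design of one PSG night plus three actigraphy nights is then verified against the 15 min target.

```python
# Numeric check of the WASO example with the rounded estimates from the text.
import numpy as np

d1 = d2 = 21.0          # between-subjects SDs (min)
s1, s2 = 32.0, 21.0     # within-subjects SDs for WASO-P and WASO-A (min)
rho = 0.69
c1, c2 = 1250.0, 150.0  # assumed cost per night: PSG, actigraphy ($)
eta2 = 15.0**2          # target BMSE (min^2)

m2 = s2 * (np.sqrt(c1 / c2) * d2 * s1 * rho - d1 * s2) / (d1 * d2**2 * (1 - rho**2))  # (46)
lam = (rho**2 * d2**2 / d1**2) / (d2**2 * (1 - rho**2) + s2**2 / m2)                  # (35)
m1 = s1**2 * (1 / eta2 - 1 / d1**2 - lam)                                             # (38)
print(m1, m2)   # ~0.81 and ~3.88 with rounded inputs (text: 0.73 and 3.99)

# Integer design from the text's grid search: 1 PSG night + 3 actigraphy nights
lam3 = (rho**2 * d2**2 / d1**2) / (d2**2 * (1 - rho**2) + s2**2 / 3)
print(np.sqrt(1.0 / (1 / s1**2 + 1 / d1**2 + lam3)))  # ~14.9 min, meets the 15 min target
```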

Figure 8.

Cost of collection and number of nights of actigraphy and polysomnography (PSG) which combined will yield a certain average accuracy of wakefulness after sleep onset (WASO) parameter estimates. For illustration purposes, PSG and actigraphy are assumed to cost $1250 and $150 per night, respectively. The cost of data collection is fixed on each diagonal dashed line (for illustrative purposes these are only shown at fixed intervals) and increases as we collect more nights of PSG and actigraphy. Subject-specific WASO estimates of an average accuracy of (a) 15 min, (b) 14 min, and (c) 13 min are obtained on the solid curve. The point that minimizes the cost of obtaining the fixed accuracy (open squares) is determined as the point on the fixed accuracy curve where the line tangent to the curve is parallel to the fixed cost lines. For comparison the solution for obtaining the same average accuracy using only polysomnography is also shown (solid squares).

Note that the results are highly dependent on the estimated between-subjects correlation and that the 95% confidence interval for this correlation was large. The minimal cost solution also depends on the level of accuracy that is required; see Figures 8(b) and 8(c) for scenarios with an average accuracy of 14 min and 13 min, respectively.

8. Linear Models with Time Dependency

For both the univariate and bivariate linear models, time dependency can be introduced by adding time as a covariate. This complicates the construction of a design matrix that enables predictions with a given average accuracy. We show that, in models with time dependency, the BMSE of a predicted response depends on the times at which responses are measured, and this dependency can be summarized by the mean and variance of the measurement times.

To illustrate, we consider a linear approximation of a time-dependent model known as the two-process model of sleep regulation [10, 31]. It has been shown that, for a range of sleep/wake scenarios, the two-process model can describe temporal changes in waking cognitive performance as the algebraic difference between two functions describing physiological processes: the homeostatic pressure for sleep and the circadian pressure for wakefulness [11]. Here we focus solely on modeling the homeostatic pressure for sleep, the dynamics of which are specified separately for sleep and for wakefulness. The dynamics can be modeled using the recursive formulation of [10]

$S_t = \begin{cases} e^{-\Delta t/\tau_d}\,S_{t-1} & \text{(sleep)}, \\ 1 - e^{-\Delta t/\tau_r}\left(1 - S_{t-1}\right) & \text{(wake)}, \end{cases}$ (52)

where $S_t$ represents the homeostatic pressure after the tth time step of duration $\Delta t$, $S_{t-1}$ represents the homeostatic pressure one time step before (i.e., at time $(t-1)\cdot\Delta t$), $\Delta t$ is typically fixed at 0.5 h, and $\tau_d$ and $\tau_r$ are time constants for the decay and rise of the homeostatic process during sleep and wakefulness, respectively.

We divide the sleep/wake schedule into periods of sleep and periods of wake, both indexed by k. Let $S_0^{(k)}$ represent the initial homeostatic pressure for the kth period of wakefulness. It has been proposed that change over time in the model is better modeled as linear rather than exponential [32]. For the purpose of the present example, we adopt this idea and approximate the model over a period of continuous wakefulness using a linear interpolation between the start and end points of the wake period (see Figure 9).

Figure 9.

Homeostatic pressure for wakefulness plotted over two complete sleep/wake cycles for a repeating schedule with 16 h of wakefulness and 8 h time in bed for sleep in each cycle. The dashed line represents the homeostatic pressure for sleep as given by (52); the solid line is a linear approximation based on interpolation between the sleep/wake transition points. The parameter values $\tau_d = 4.2$ h and $\tau_r = 18.2$ h are taken from [10]. The initial condition $S_0$ is derived assuming steady state (54). Black bars indicate the 8 h periods in bed for sleep.
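The construction in this example is easy to verify numerically. The sketch below iterates the wake branch of the recursion (52) across one 16 h wake period, starting from the steady-state value (54), and confirms that the linear approximation with the intercept and slope of (56)-(57) (derived below) meets the exponential curve at the sleep/wake transitions; iterating the sleep branch then returns the pressure to $S_0$.

```python
# Sketch verifying the steady-state construction: the recursion (52) with the
# parameter values of Figure 9, the steady-state initial value (54), and the
# linear interpolation across one wake period (the alpha and beta of (56)-(57)).
import numpy as np

tau_d, tau_r, T, dt = 4.2, 18.2, 16.0, 0.5   # time constants from [10]; 16 h wake

# Steady-state homeostatic pressure at wake onset, eq. (54)
S0 = (np.exp(-(24 - T) / tau_d) * (1 - np.exp(-T / tau_r))
      / (1 - np.exp(-((24 - T) / tau_d + T / tau_r))))

alpha = S0                                          # intercept, eq. (56)
beta = (1 - np.exp(-T / tau_r)) * (1 - S0) / T      # slope, eq. (57)

# Iterate the wake branch of the recursion (52) across the wake period
S = S0
for _ in range(int(T / dt)):
    S = 1 - np.exp(-dt / tau_r) * (1 - S)
print(S, alpha + beta * T)   # recursion and linear approximation meet at t = T

# Iterating the sleep branch across the 8 h sleep period returns S to S0
for _ in range(int((24 - T) / dt)):
    S = np.exp(-dt / tau_d) * S
print(S, S0)                 # steady state: pressure at next wake onset equals S0
```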

Let $y_j$ represent (hypothetical) measurements of the build-up of homeostatic pressure for sleep during wakefulness. These data can be modeled using the following approximation:

$y_j = S_0^{(k)} + \dfrac{\left(1 - e^{-T^{(k)}/\tau_r}\right)\left(1 - S_0^{(k)}\right)}{T^{(k)}}\,t_j + \epsilon_j$, (53)

where $t_j$ represents the amount of time elapsed since awakening, $T^{(k)}$ denotes the duration of the kth wake period, and $\epsilon_j$ represents normally distributed measurement error.

We consider the homeostatic process over a repeating schedule consisting of $T^{(k)} = T = 16$ h of wakefulness and 8 h of sleep. On this schedule, individuals maintain a steady state in which the homeostatic pressure $S_0^{(k)}$ at the onset of the wake period is constant across days; that is, $S_0^{(k)} = S_0$. This allows us to derive the homeostatic pressure at the start of wakefulness as a function of $\tau_r$ and $\tau_d$:

$S_0 = \dfrac{e^{-(24 - T)/\tau_d}\left(1 - e^{-T/\tau_r}\right)}{1 - e^{-\left((24 - T)/\tau_d + T/\tau_r\right)}}$. (54)

The equation for the homeostatic pressure during a particular wake period can thus be written as

$y_j = \alpha + \beta\,t_j + \epsilon_j$, (55)

where

$\alpha = S_0$, (56)
$\beta = \dfrac{\left(1 - e^{-T/\tau_r}\right)\left(1 - S_0\right)}{T}$. (57)

In matrix form, the model can be written as

$y = H\theta + \epsilon$, (58)

where the design matrix is given by

$H = \begin{pmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_m \end{pmatrix}$, (59)

and the parameter vector is given by

$\theta = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}$. (60)

Analogous to (5), but for two parameters, we assume the following prior distribution on model parameters:

$\begin{pmatrix} \alpha \\ \beta \end{pmatrix} \sim N\left(\mu,\; \begin{pmatrix} \delta_\alpha^2 & 0 \\ 0 & \delta_\beta^2 \end{pmatrix}\right)$. (61)

As in (6), we assume that the errors are independent realizations from a normal distribution with zero mean and variance $\sigma^2$. Analogous to (11) and (12), the BMSE for the MMSE prediction of the expected (primary) response $\hat f_j$ at some time $t_j$ in the univariate case is as follows (see Appendix A):

$M(\hat f_j) = \dfrac{1/\delta_\beta^2 + m\,s^2/\sigma^2 + m\left(\bar t - t_j\right)^2/\sigma^2 + t_j^2/\delta_\alpha^2}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + m\left(s^2 + \bar t^2\right)/\sigma^2\right) - \left(m\,\bar t/\sigma^2\right)^2}$, (62)

where $\bar t$ denotes the mean of the measurement times,

$\bar t = \dfrac{1}{m}\sum_{j=1}^{m} t_j$, (63)

and $s^2$ denotes the variance of the measurement times,

$s^2 = \dfrac{1}{m}\sum_{j=1}^{m}\left(t_j - \bar t\right)^2$. (64)

Our task is to determine the measurement times $t_1,\ldots,t_m$ for this example that will minimize $M(\hat f_j)$. Equation (62) shows that $M(\hat f_j)$ depends on the measurement times only through their mean $\bar t$ and variance $s^2$. Consequently, instead of conducting an m-dimensional minimization of $M(\hat f_j)$ over all the measurement times $t_1,\ldots,t_m$, we can write $M(\hat f_j)$ as a function of $\bar t$ and $s^2$ and conduct a two-dimensional minimization. In doing this, we find that $M(\hat f_j)$ is minimized both when $s^2 \to \infty$ (see Appendix B) and also, more practically relevant, when

$\bar t = t_j\,\dfrac{\sigma^2/m + \delta_\alpha^2}{\delta_\alpha^2}$. (65)

The optimal mean measurement time for this example, given by (65), lies slightly above the prediction time; in the limit as $\delta_\alpha^2 \to \infty$, $M(\hat f_j)$ is minimized when the data are collected at times such that $\bar t_{\min} = t_j$. If the prior variance on the intercept is equal to the error variance (i.e., $\delta_\alpha^2 = \sigma^2$), the effect of the Bayesian prior is equivalent to increasing the value of $\bar t$ that produces the minimum $M(\hat f_j)$ by $t_j/m$. This adjustment is the same as what would manifest with no prior information on the intercept when adding an additional measurement time, $t_{m+1} = 0$, to the design matrix H.
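A quick numerical check of (62) and (65) is shown below; all parameter values are illustrative assumptions. A grid minimization over the mean measurement time, holding $s^2 = 0$, recovers the optimum given by (65).

```python
# Numerical check of (62) and (65): the BMSE as a function of the mean
# measurement time, holding the variance of the times at zero.
# All parameter values here are illustrative assumptions.
import numpy as np

def bmse_f(tbar, s2_t, m, tj, d_a, d_b, sigma):
    """Prediction BMSE from (62), written in terms of tbar and s^2."""
    num = (1 / d_b**2 + m * s2_t / sigma**2
           + m * (tbar - tj)**2 / sigma**2 + tj**2 / d_a**2)
    den = ((1 / d_a**2 + m / sigma**2)
           * (1 / d_b**2 + m * (s2_t + tbar**2) / sigma**2)
           - (m * tbar / sigma**2)**2)
    return num / den

d_a, d_b, sigma, m, tj = 1.0, 0.5, 1.0, 4, 10.0
tbar_min = tj * (sigma**2 / m + d_a**2) / d_a**2      # optimum from (65)

grid = np.linspace(0.0, 2 * tj, 20001)
vals = [bmse_f(tb, 0.0, m, tj, d_a, d_b, sigma) for tb in grid]
print(tbar_min, grid[int(np.argmin(vals))])           # both ~12.5
```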

The absolute minimum of $M(\hat f_j)$ in (62) is not always located in the feasible region in this example, as defined by $0 \le t_j \le T$ for all $j \in \{1,\ldots,m\}$. More specifically, the absolute minimum of the unconstrained case is not located inside the feasible region if and only if (see Appendix B):

$\bar t_{\min} \ge T$. (66)

Under this condition we hypothesize that, within the feasible region, $M(\hat f_j)$ exhibits an absolute minimum when all the data are collected at time T. This is easy to show for the case of one measurement time (i.e., m = 1), and Appendix C contains a proof for the case of two measurement times (i.e., m = 2).

For m > 2, we conducted a simulation study to search for a counterexample (i.e., a case where the value of $M(\hat f_j)$ when all data are collected at time T is not the smallest value of $M(\hat f_j)$ within the feasible region). For the simulation study, we ran 10,000,000 simulations with the following values:

$\delta_\alpha \sim \mathrm{Uniform}(0.001, 1); \quad \delta_\beta \sim \mathrm{Uniform}(0.001, 1); \quad \sigma \sim \mathrm{Uniform}(0.001, 1); \quad m \sim \mathrm{Discrete\ Uniform}\{1, 2, \ldots, 100\}; \quad t_j \sim \mathrm{Uniform}\left(\dfrac{T\,\delta_\alpha^2}{\sigma^2/m + \delta_\alpha^2},\; T\right)$. (67)

Concerning the ranges of the variance components, note that $M(\hat f_j)$ is invariant to the scale of the response. This can be demonstrated by scaling the variance matrices $C_\theta$ and $C_\epsilon$ by a factor $c_y$ and showing that the resulting $M(\hat f_j)$ is then scaled by the same factor $c_y$. The conclusion is that the shape of the surface of $M(\hat f_j)$ depends on the variance components only through their relative magnitudes. Furthermore, we argue that if any variance component is more than three orders of magnitude greater than any other component, then it would be advantageous to simplify the model by removing the smaller component. As such, all cases for which this model is reasonable can be covered within the range from 0.001 to 1 for each variance component. Further, concerning the number of observations, we expect that if there is a counterexample, it can be found somewhere in the range $1 \le m \le 100$. Finally, the range for $t_j$ is chosen specifically so that the absolute minimum lies outside the feasible region.

For each simulation, we compared $M(\hat f_j)$ at the hypothesized minimum, where $t_j = T$ for all $j \in \{1,\ldots,m\}$, to a randomly chosen set of measurement times, where each $t_j$ is a realization from the following distribution:

$t_j \sim \mathrm{Uniform}(0, T)$. (68)

For each of the 10,000,000 simulations, $M(\hat f_j)$ at the hypothesized minimum was indeed smaller than $M(\hat f_j)$ at the randomly chosen measurement times. Thus, we found no evidence against our original hypothesis that $M(\hat f_j)$ exhibits an absolute minimum when all the data are collected at time T. An analytical proof is beyond the scope of this paper.

We now extend our analysis to consider a time-dependent model with both primary and secondary responses. We formulate the model as follows:

$y_{rj} = s_{rj} + \epsilon_{rj} = \alpha_r + \beta_r\,t_{rj} + \epsilon_{rj}$. (69)

In matrix form, the model is

$y = H\theta + \epsilon$, (70)

where the design matrix is given by

$H = \begin{pmatrix} 1 & 0 & t_{11} & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & t_{1m_1} & 0 \\ 0 & 1 & 0 & t_{21} \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 1 & 0 & t_{2m_2} \end{pmatrix}$, (71)

and the parameter vector is given by

$\theta = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \end{pmatrix}$. (72)

Let us assume the following prior distribution on model parameters:

$\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \beta_1 \\ \beta_2 \end{pmatrix} \sim N\left(\mu,\; \begin{pmatrix} \delta_{\alpha_1}^2 & \rho\,\delta_{\alpha_1}\delta_{\alpha_2} & 0 & 0 \\ \rho\,\delta_{\alpha_1}\delta_{\alpha_2} & \delta_{\alpha_2}^2 & 0 & 0 \\ 0 & 0 & \delta_{\beta_1}^2 & \omega\,\delta_{\beta_1}\delta_{\beta_2} \\ 0 & 0 & \omega\,\delta_{\beta_1}\delta_{\beta_2} & \delta_{\beta_2}^2 \end{pmatrix}\right)$. (73)

As in (28), we assume that the errors are independent realizations from normal distributions with zero mean and variance $\sigma_r^2$ ($r = 1, 2$). The BMSE of estimates of the primary response at time $t_{1j}$ is given by

$M(\hat f_{1j}) = h^\top M_{\hat\theta}\,h$, (74)

where

$h = \begin{pmatrix} 1 & 0 & t_{1j} & 0 \end{pmatrix}^\top$, (75)
$M_{\hat\theta} = \left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1}$, with $H^\top C_\epsilon^{-1} H = \begin{pmatrix} m_1/\sigma_1^2 & 0 & m_1\bar t_1/\sigma_1^2 & 0 \\ 0 & m_2/\sigma_2^2 & 0 & m_2\bar t_2/\sigma_2^2 \\ m_1\bar t_1/\sigma_1^2 & 0 & m_1\left(s_1^2 + \bar t_1^2\right)/\sigma_1^2 & 0 \\ 0 & m_2\bar t_2/\sigma_2^2 & 0 & m_2\left(s_2^2 + \bar t_2^2\right)/\sigma_2^2 \end{pmatrix}$, (76)

where $\bar t_1$ denotes the mean of the primary measurement times,

$\bar t_1 = \dfrac{1}{m_1}\sum_{j=1}^{m_1} t_{1j}$, (77)

$\bar t_2$ denotes the mean of the secondary measurement times,

$\bar t_2 = \dfrac{1}{m_2}\sum_{j=1}^{m_2} t_{2j}$, (78)

$s_1^2$ denotes the variance of the primary measurement times,

$s_1^2 = \dfrac{1}{m_1}\sum_{j=1}^{m_1}\left(t_{1j} - \bar t_1\right)^2$, (79)

and $s_2^2$ denotes the variance of the secondary measurement times,

$s_2^2 = \dfrac{1}{m_2}\sum_{j=1}^{m_2}\left(t_{2j} - \bar t_2\right)^2$. (80)

Our task in the bivariate case of the example is to determine the primary measurement times $t_{11},\ldots,t_{1m_1}$ and the secondary measurement times $t_{21},\ldots,t_{2m_2}$ that minimize $M(\hat f_{1j})$. Equations (74) and (76) show that $M(\hat f_{1j})$ depends on the measurement times only through their response-specific means and variances, $\bar t_1$, $\bar t_2$, $s_1^2$, and $s_2^2$. Consequently, it is sufficient to minimize $M(\hat f_{1j})$ with respect to $\bar t_1$, $\bar t_2$, $s_1^2$, and $s_2^2$ and choose any set of measurement times with these means and variances. We find that $M(\hat f_{1j})$ can be minimized by collecting data at the following times (see Appendix D):

$\bar t_1^{\min} = t_{1j}\,\dfrac{\delta_{\alpha_1}^2\left[\left(1 - \rho^2\right)\delta_{\alpha_2}^2 + \sigma_2^2/m_2\right] + \left(\sigma_1^2/m_1\right)\left(\delta_{\alpha_2}^2 + \sigma_2^2/m_2\right)}{\delta_{\alpha_1}^2\left[\left(1 - \rho^2\right)\delta_{\alpha_2}^2 + \sigma_2^2/m_2\right]}$, (81)
$\bar t_2^{\min} = 0$. (82)

The optimal mean measurement time for the primary response variable, $\bar t_1^{\min}$ in (81), lies slightly above the prediction time. In the limit as $m_2 \downarrow 0$, the solution reduces to the univariate solution:

$\bar t_1^{\min} = t_{1j}\,\dfrac{\delta_{\alpha_1}^2 + \sigma_1^2/m_1}{\delta_{\alpha_1}^2}$. (83)

Again, the absolute minimum of $M(\hat f_{1j})$ is not always located in the feasible region defined by $0 \le t_{rj} \le T$. More specifically, the absolute minimum of the unconstrained function is located outside the feasible region if and only if (see Appendix D):

$\bar t_1^{\min} \ge T$. (84)

Under this condition, it seems logical that the minimum would occur if we were to collect all the primary data at time T and all the secondary data at time zero. However, a simulation study similar to that described above revealed counterexamples, which occurred when $\omega > 0.99$. When T was decreased (or the ranges of the variance components for the slope were increased), counterexamples also occurred at smaller values of $\omega$. Therefore, the optimal measurement scheme for the bivariate case appears to depend on $\omega$ and T. When $\omega$ is small or T is large (in comparison to the magnitudes of $\delta_{\beta_1}$ and $\delta_{\beta_2}$), the optimal design seems to be to collect all the primary data at t = T and all the secondary data at t = 0. When $\omega$ is large or T is small (compared to the magnitudes of $\delta_{\beta_1}$ and $\delta_{\beta_2}$), better designs are likely to be found numerically.
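The bivariate BMSE (74)-(76) and the timing result (81)-(82) can likewise be checked numerically. The sketch below builds the block design matrix (71) explicitly and compares the BMSE at the times prescribed by (81)-(82) against a perturbed design; the parameter values are illustrative, and the feasibility constraint $0 \le t \le T$ is not imposed.

```python
# Numerical check of (74)-(76) and (81)-(82) for the bivariate linear model.
# Parameter values are illustrative; feasibility (0 <= t <= T) is not imposed.
import numpy as np

def bmse_f1(t1, t2, t1j, d_a1, d_a2, rho, d_b1, d_b2, om, s1, s2):
    m1, m2 = len(t1), len(t2)
    C_a = np.array([[d_a1**2, rho * d_a1 * d_a2], [rho * d_a1 * d_a2, d_a2**2]])
    C_b = np.array([[d_b1**2, om * d_b1 * d_b2], [om * d_b1 * d_b2, d_b2**2]])
    C_theta = np.block([[C_a, np.zeros((2, 2))], [np.zeros((2, 2)), C_b]])  # (73)
    H = np.vstack([np.column_stack([np.ones(m1), np.zeros(m1), t1, np.zeros(m1)]),
                   np.column_stack([np.zeros(m2), np.ones(m2), np.zeros(m2), t2])])  # (71)
    Ce_inv = np.diag(np.r_[np.full(m1, 1 / s1**2), np.full(m2, 1 / s2**2)])
    M = np.linalg.inv(np.linalg.inv(C_theta) + H.T @ Ce_inv @ H)             # (76)
    h = np.array([1.0, 0.0, t1j, 0.0])                                       # (75)
    return h @ M @ h                                                         # (74)

d_a1, d_a2, rho = 1.0, 1.0, 0.8
d_b1, d_b2, om = 0.5, 0.5, 0.3
s1, s2, t1j, m1, m2 = 1.0, 1.0, 10.0, 4, 4

# Optimal primary mean time from (81); all secondary times at zero per (82)
core = d_a1**2 * ((1 - rho**2) * d_a2**2 + s2**2 / m2)
t1bar = t1j * (core + (s1**2 / m1) * (d_a2**2 + s2**2 / m2)) / core

args = (t1j, d_a1, d_a2, rho, d_b1, d_b2, om, s1, s2)
best = bmse_f1(np.full(m1, t1bar), np.zeros(m2), *args)
worse = bmse_f1(np.full(m1, 0.9 * t1bar), np.zeros(m2), *args)
print(best, worse)   # the design from (81)-(82) gives the smaller BMSE
```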

In summary, the average prediction accuracy for the simple linear time-dependent model depends not only on the number of measurements collected, but also on the times when these measurements are taken. In the univariate case, the prediction accuracy depends on these times only through their mean and variance. In the example of the two-process model, the optimal mean of the measurement times is slightly after the prediction time, where the delay increases with more prior information and with fewer or less informative data. When little prior information on the intercept is available, it is usually possible to collect data so that the absolute minimum of $M(\hat f_j)$ is achieved. In the case where the theoretical minimum cannot be achieved (i.e., when (66) is satisfied), minimization of $M(\hat f_j)$ within the feasible region is achieved by collecting all the data at time T.

In the bivariate case of our example, the prediction accuracy depends on the measurement times only through the means and variances of the primary and secondary measurement times. Furthermore, $M(\hat f_{1j})$ is minimized by centering the primary measurement times slightly above $t_{1j}$ (see (81)), as in the univariate case, and collecting all secondary measurements at time zero. As in the univariate case, when little prior information on the primary intercept is available, it is usually possible to collect data so that the absolute minimum of $M(\hat f_{1j})$ is achieved. In the case where this minimum cannot be achieved (i.e., when (84) is satisfied), minimization of $M(\hat f_{1j})$ can usually be obtained, for the parameter ranges considered in our simulation, by collecting all primary data at time T and all secondary data at time zero.

9. Nonlinear Models with Time Dependency

Lastly, we focus briefly on the nonlinear case, where the BMSE generally lacks a closed form solution. Obtaining the BMSE of the MMSE estimator for nonlinear models typically requires numerical integration of the joint probability density of y and θ. We illustrate this with an example in which we numerically estimate the prediction BMSE given a nonlinear model and a single primary response measurement and show how it can be improved by a secondary response measurement.

We consider a two-parameter sinusoidal model of circadian (i.e., 24 h) rhythmicity, defined for a given subject and a bivariate response (r = 1,2), as follows:

$y_{rj} = f_{rj} + \epsilon_{rj} = A_r \sin\left(\dfrac{2\pi\left(t_{rj} - \phi\right)}{24}\right) + \epsilon_{rj}$, (85)

where t is in hours; $A_r$ represents a response-specific amplitude; and $\phi$ represents the phase, which is assumed to be common to both response types. For our example, we assume

$\begin{pmatrix} A_1 \\ A_2 \end{pmatrix} \sim N\left(\begin{pmatrix} 5 \\ 5 \end{pmatrix},\; \begin{pmatrix} 1 & 0.95 \\ 0.95 & 1 \end{pmatrix}\right), \qquad \phi \sim N(0, 2)$. (86)

We simulated n = 1000 individuals from this model using normally distributed errors, with primary and secondary response standard deviations of $\sigma_1 = 0.25$ and $\sigma_2 = 0.1$. For each individual, we simulated a single primary response at time $t_{11} = 14$ and a single secondary response at time $t_{21} = 22$. Bayesian forecasting was performed using Markov chain Monte Carlo (MCMC) with a chain length of 100,000 to obtain the MMSE predictions $\hat f_{1j}$ for each individual at time $t_{1j} = 24$. Predictions were constructed using a primary data point alone, and also using both a primary data point and a secondary data point.
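A compact way to reproduce the essence of this experiment without a full MCMC implementation is a grid approximation to the posterior. The sketch below does this for a single simulated individual; it is an illustrative substitute for the paper's procedure (the chain-based sampler and the full N = 1000 replication loop are omitted, and $\phi \sim N(0, 2)$ is read as having variance 2, an assumption).

```python
# Sketch of the Section 9 experiment using a grid approximation to the
# posterior in place of MCMC. phi ~ N(0, 2) is read as variance 2 (assumption).
import numpy as np

rng = np.random.default_rng(2)
mu_A = np.array([5.0, 5.0])
C_A = np.array([[1.0, 0.95], [0.95, 1.0]])
var_phi, s1, s2 = 2.0, 0.25, 0.1
t11, t21, t1j = 14.0, 22.0, 24.0      # measurement and prediction times

def f(A, phi, t):                     # model (85)
    return A * np.sin(2 * np.pi * (t - phi) / 24)

# Grid over (A1, A2, phi) for numerical posterior expectations
A1g, A2g, phig = np.meshgrid(np.linspace(1, 9, 60), np.linspace(1, 9, 60),
                             np.linspace(-6, 6, 121), indexing="ij")
Ci = np.linalg.inv(C_A)
dA1, dA2 = A1g - mu_A[0], A2g - mu_A[1]
log_prior = (-(Ci[0, 0] * dA1**2 + 2 * Ci[0, 1] * dA1 * dA2 + Ci[1, 1] * dA2**2) / 2
             - phig**2 / (2 * var_phi))

def mmse_pred(y1, y2=None):
    """Posterior mean of f_{1j} at time t1j, i.e., the MMSE prediction."""
    ll = -(y1 - f(A1g, phig, t11))**2 / (2 * s1**2)
    if y2 is not None:
        ll = ll - (y2 - f(A2g, phig, t21))**2 / (2 * s2**2)
    lp = log_prior + ll
    w = np.exp(lp - lp.max())
    return np.sum(w * f(A1g, phig, t1j)) / np.sum(w)

# One simulated individual
A = rng.multivariate_normal(mu_A, C_A)
phi = rng.normal(0.0, np.sqrt(var_phi))
y1 = f(A[0], phi, t11) + rng.normal(0.0, s1)
y2 = f(A[1], phi, t21) + rng.normal(0.0, s2)

print(f(A[0], phi, t1j), mmse_pred(y1), mmse_pred(y1, y2))  # truth vs predictions
```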

Figure 10 shows these predictions for a randomly chosen individual. The 95% confidence intervals on $f_{1j}$ were constructed using the quantiles of the posterior distribution for $f_{1j}$. For the individual considered, the predictions based on both response variables (purple line and shading) are at many times substantially more accurate (i.e., they have smaller posterior variance) than the predictions based on only the primary response (blue line and shading).

Figure 10.

MMSE predictions of the expected primary response $\hat f_{1j}$ and corresponding 95% confidence intervals for a randomly simulated individual from the model specified in (85). The MMSE estimator $\hat f_{1j}$ is determined assuming each of the following: no data were observed (black line and gray confidence interval), a single primary response was observed (blue line and confidence interval), and both a primary and a secondary response were observed (purple line and confidence interval). The expected primary and secondary responses using the subject's simulated parameter values are shown with blue and red dashed lines. The black vertical dashed line shows the time at which predictions are made.

The average accuracy over individuals was assessed at time $t_{1j} = 24$ by estimating the BMSE as follows (cf. (3)):

$\hat M(\hat f_{1ij}) = \dfrac{1}{N}\sum_{i=1}^{N}\left(f_{1ij} - \hat f_{1ij}\right)^2$. (87)

The estimated BMSE of $\hat f_{1ij}$ when using only the primary data points was 0.53, and the estimated BMSE of $\hat f_{1ij}$ when using both the primary and secondary data points was 0.20. These results suggest that there are nonlinear modeling scenarios where secondary response variables can considerably improve predictions of primary response variables via between-subjects correlation.

10. Discussion

In this paper, we illustrated in a Bayesian, repeated-measures framework how to improve the prediction accuracy of the expected response, as measured by the BMSE, for a primary response variable in the presence of a secondary response variable that is correlated with the primary response variable between subjects. To set up the general procedure of improving prediction accuracy through Bayesian forecasting, we constructed the BMSE for a simple univariate random intercept model.

We applied the general procedure to a bivariate random intercept model and derived the BMSE of the primary response predictions for this model. We studied how the BMSE depends on the number of observations from the secondary response and found that, for a fixed number of primary response observations, the BMSE is bounded below as specified by (40). The potential value of considering a secondary response may be assessed by considering whether this lower bound represents a meaningful improvement in the expected prediction accuracy.

Assuming the availability of a reasonably highly correlated secondary variable, we also addressed the problem of determining the number of primary and secondary measurements needed to obtain a given average accuracy at minimal cost. We derived equations for the solution to this problem and illustrated their use with an example from sleep research. Given previously observed means, variances, and between-subjects correlation for polysomnographic (primary) and actigraphic (secondary) measurements of wakefulness after sleep onset, and assuming reasonable measurement costs, we found that, to obtain an average prediction accuracy of 15 min, the use of actigraphy in addition to polysomnography resulted in a substantial reduction in estimated data collection costs as compared to polysomnography alone.

We then considered a steady-state, linear approximation of the homeostatic component of the two-process model of sleep regulation [10, 31] in the univariate case. We found that the minimization of the BMSE with respect to the vector of times at which the data are collected can be divided into two subcases. In one subcase, the BMSE is minimized by collecting data at times with a mean value slightly above the time at which predictions are to be made, with the offset being inversely proportional to the variance of the prior on the intercept. In the second subcase, all of the data are to be collected at the maximal time point. We extended the results of this example to the bivariate case, which again can be divided into two subcases. In one subcase, the BMSE is minimized by collecting the data so that the secondary measurement times have a mean of zero while the primary variable is collected at times with a mean value somewhat above the time at which the predictions are to be made, as in the univariate case. In the other subcase, the time points that minimize the BMSE may best be found numerically.

Finally, we considered a nonlinear circadian model and determined the improvement in individualized prediction accuracy from a single primary data point versus both a single primary data point and a single secondary data point. For this particular model (85), we found the improvement to be substantial, suggesting that there are cases for nonlinear models where a secondary variable can substantially improve prediction accuracy for the primary variable, given a reasonably high between-subjects correlation.

In conclusion, depending on the between- and within-subjects variance components and the between-subjects correlation between primary and secondary responses, using secondary response data can be effective in increasing the individualized prediction accuracy on the primary response variable in Bayesian forecasting.

This work represents an improvement over the work of Chandler and colleagues [19], who proposed incorporating secondary variables in individualized performance predictions as covariates in a generalized linear model. An advantage of their approach is that it accounts for perturbations on system dynamics from external factors that are common to the outcome variables considered. A drawback is that the approach does not accumulate information about individual differences over time and therefore does not become increasingly accurate for individualized predictions as more data are collected for the individual at hand. Furthermore, the technique requires secondary data to be measured at the same time as the primary data is to be predicted. This is not a requirement for the presently proposed method, for which individualized predictions can be made for any given time, even if secondary data are unavailable then.

That said, here we considered only Bayesian models with diagonally structured error covariance matrices. Such models do not account for correlation within subjects between response types. The work of Chandler and colleagues [19] does account for such correlation, using a fixed linear relationship between primary and secondary responses. We did not consider this possibility here, using merely a diagonal error covariance matrix with one parameter for each response type. However, the Bayesian modeling framework in this paper can be expanded to account for correlation within subjects between response types by adding additional structure to the error covariance matrix.

Finally, the models described here assume that error variance is constant over repeated measures and across different subjects. This constraint can be relaxed easily by allowing the error covariance matrix to have different elements for different individuals.

The multivariate repeated-measures Bayesian forecasting framework presented here may be useful in a variety of clinical settings. One example is modeling the disabling effects of chronic back pain, where pain-related fear may be a good choice for a secondary variable [33]. For other examples, a rich literature in this area can be found in the domain of anesthesiology [34], where clinical applications of multivariate Bayesian forecasting abound.

Acknowledgments

The authors are grateful to Hongbo Dong for help with some of the mathematical proofs in the paper. This research was supported by ONR Grant N00014-13-1-0302 and in part by FMCSA Award DTMC75-07-D-00006 and FAA Award DTFAAC-11-A-00003.

Appendices

A. Expression for the Univariate Linear Time-Dependent Model

Theorem A.1. —

The BMSE of the MMSE prediction of the expected response in the univariate, linear, time-dependent model given by (58) is given by

$M_{\hat f_j} = h^\top\left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1} h,$  (A.1)

where

$h = \begin{pmatrix} 1 \\ t_j \end{pmatrix},$  (A.2)

where $t_j$ represents the time for which the expected response is predicted, $C_\theta$ is the prior covariance matrix of the parameter vector $\theta$, $H$ is the design matrix, $C_\epsilon$ is the error covariance matrix, and $M_{\hat f_j}$ is defined in (3).

Proof —

Since

$f_j = h^\top\theta$  (A.3)

and since the MMSE estimator commutes over linear transformations [23], the MMSE estimator of the expected response is given by

$\hat f_j = h^\top\hat\theta.$  (A.4)

Therefore,

$M_{\hat f_j} \equiv E\left[\left(f_j - \hat f_j\right)^2\right] = E\left[\left(f_j - \hat f_j\right)\left(f_j - \hat f_j\right)^\top\right] = E\left[h^\top\left(\theta - \hat\theta\right)\left(\theta - \hat\theta\right)^\top h\right] = h^\top E\left[\left(\theta - \hat\theta\right)\left(\theta - \hat\theta\right)^\top\right] h = h^\top M_{\hat\theta}\, h,$  (A.5)

where the parameter BMSE matrix $M_{\hat\theta}$ is given by [23] as follows:

$M_{\hat\theta} \equiv E\left[\left(\theta - \hat\theta\right)\left(\theta - \hat\theta\right)^\top\right] = \left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1}.$  (A.6)

Therefore, the BMSE of the estimated response is given by (A.1).
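The result is easy to verify numerically. The sketch below (our illustration, with arbitrary assumed parameter values) compares the closed form (A.1) against a Monte Carlo estimate of the BMSE of the posterior-mean prediction in the corresponding linear Gaussian model:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.array([1.0, 3.0, 5.0])                  # measurement times
H = np.column_stack([np.ones_like(t), t])      # design matrix, cf. (A.12)
C_theta = np.diag([4.0, 0.5])                  # prior covariance of (alpha, beta)
sigma2 = 1.0
C_eps_inv = np.eye(len(t)) / sigma2            # inverse error covariance
h = np.array([1.0, 6.0])                       # covariate vector at t_j = 6

M_theta = np.linalg.inv(np.linalg.inv(C_theta) + H.T @ C_eps_inv @ H)  # (A.6)
closed_form = h @ M_theta @ h                  # (A.1)

# Monte Carlo: draw theta from a zero-mean prior, generate data, form the
# Gaussian posterior mean theta_hat = M_theta H^T C_eps^{-1} y, and average
# the squared prediction error over replicates.
N = 200_000
G = M_theta @ H.T @ C_eps_inv                  # gain mapping y to theta_hat
theta = rng.multivariate_normal(np.zeros(2), C_theta, size=N)
y = theta @ H.T + rng.normal(scale=np.sqrt(sigma2), size=(N, len(t)))
theta_hat = y @ G.T
monte_carlo = np.mean(((theta - theta_hat) @ h) ** 2)
print(closed_form, monte_carlo)                # should agree closely
```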

Theorem A.2. —

The BMSE of MMSE predictions for the univariate, linear, time-dependent model given by (58) is given by

$M_{\hat f_j} = \dfrac{1/\delta_\beta^2 + ms^2/\sigma^2 + m\left(\bar t - t_j\right)^2/\sigma^2 + t_j^2/\delta_\alpha^2}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + ms^2/\sigma^2\right) + m\bar t^2/\left(\delta_\alpha^2\sigma^2\right)},$  (A.7)

where $M_{\hat f_j}$ is given in (A.1).

Proof —

For this model, the inverse of the between-subjects covariance matrix is given by

$C_\theta^{-1} = \begin{pmatrix} 1/\delta_\alpha^2 & 0 \\ 0 & 1/\delta_\beta^2 \end{pmatrix}.$  (A.8)

Furthermore, the prediction BMSE is given by (see Theorem A.1)

$M_{\hat f_j} = h^\top\left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1} h,$  (A.9)

where $h$ is the covariate vector at the time at which we want to make predictions:

$h = \begin{pmatrix} 1 \\ t_j \end{pmatrix}.$  (A.10)

The error variance matrix is as follows:

$C_\epsilon = \sigma^2 I.$  (A.11)

The design matrix, with rows corresponding to the measurement times $t_k$, $k = 1, \ldots, m$ (written with index $k$ to distinguish them from the prediction time $t_j$), is as follows:

$H = \begin{pmatrix} 1 & t_1 \\ \vdots & \vdots \\ 1 & t_m \end{pmatrix}.$  (A.12)

The matrix product $H^\top C_\epsilon^{-1} H$ can be computed to be

$H^\top C_\epsilon^{-1} H = \dfrac{1}{\sigma^2}\begin{pmatrix} m & \sum_{k=1}^m t_k \\ \sum_{k=1}^m t_k & \sum_{k=1}^m t_k^2 \end{pmatrix}.$  (A.13)

The matrix inverse $\left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1}$ can then be computed as follows:

$\left(C_\theta^{-1} + H^\top C_\epsilon^{-1} H\right)^{-1} = \begin{pmatrix} \dfrac{1}{\delta_\alpha^2} + \dfrac{m}{\sigma^2} & \dfrac{\sum_{k=1}^m t_k}{\sigma^2} \\ \dfrac{\sum_{k=1}^m t_k}{\sigma^2} & \dfrac{1}{\delta_\beta^2} + \dfrac{\sum_{k=1}^m t_k^2}{\sigma^2} \end{pmatrix}^{-1} = \left[\left(\dfrac{1}{\delta_\alpha^2} + \dfrac{m}{\sigma^2}\right)\left(\dfrac{1}{\delta_\beta^2} + \dfrac{\sum_{k=1}^m t_k^2}{\sigma^2}\right) - \left(\dfrac{\sum_{k=1}^m t_k}{\sigma^2}\right)^2\right]^{-1}\begin{pmatrix} \dfrac{1}{\delta_\beta^2} + \dfrac{\sum_{k=1}^m t_k^2}{\sigma^2} & -\dfrac{\sum_{k=1}^m t_k}{\sigma^2} \\ -\dfrac{\sum_{k=1}^m t_k}{\sigma^2} & \dfrac{1}{\delta_\alpha^2} + \dfrac{m}{\sigma^2} \end{pmatrix}.$  (A.14)

Computing the quadratic form, we find that

$M_{\hat f_j} = \dfrac{1/\delta_\beta^2 + \sum_{k=1}^m t_k^2/\sigma^2 - 2t_j\sum_{k=1}^m t_k/\sigma^2 + t_j^2\left(1/\delta_\alpha^2 + m/\sigma^2\right)}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + \sum_{k=1}^m t_k^2/\sigma^2\right) - \left(\sum_{k=1}^m t_k/\sigma^2\right)^2}.$  (A.15)

We note the following decomposition of the sum of squared times:

$\sum_{k=1}^m t_k^2 = \sum_{k=1}^m\left(t_k - \bar t + \bar t\right)^2 = \sum_{k=1}^m\left(t_k - \bar t\right)^2 + \sum_{k=1}^m \bar t^2 = m\left(s^2 + \bar t^2\right).$  (A.16)

Substituting this decomposition, we find that

$M_{\hat f_j} = \dfrac{1/\delta_\beta^2 + ms^2/\sigma^2 + m\bar t^2/\sigma^2 - 2mt_j\bar t/\sigma^2 + t_j^2/\delta_\alpha^2 + mt_j^2/\sigma^2}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + m\left(s^2 + \bar t^2\right)/\sigma^2\right) - \left(m\bar t/\sigma^2\right)^2} = \dfrac{1/\delta_\beta^2 + ms^2/\sigma^2 + \left(m/\sigma^2\right)\left(\bar t^2 - 2t_j\bar t + t_j^2\right) + t_j^2/\delta_\alpha^2}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + m\left(s^2 + \bar t^2\right)/\sigma^2\right) - \left(m\bar t/\sigma^2\right)^2} = \dfrac{1/\delta_\beta^2 + ms^2/\sigma^2 + m\left(\bar t - t_j\right)^2/\sigma^2 + t_j^2/\delta_\alpha^2}{\left(1/\delta_\alpha^2 + m/\sigma^2\right)\left(1/\delta_\beta^2 + ms^2/\sigma^2\right) + m\bar t^2/\left(\delta_\alpha^2\sigma^2\right)}.$  (A.17)
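As a check on the algebra, the scalar form (A.7) can be compared numerically against the matrix form (A.1); the parameter values in the sketch below are arbitrary assumptions for illustration:

```python
import numpy as np

delta_a2, delta_b2, sigma2 = 2.0, 0.1, 1.0     # delta_alpha^2, delta_beta^2, sigma^2
t = np.array([2.0, 4.0, 9.0])                  # measurement times
m, tbar = len(t), t.mean()
s2 = np.mean((t - tbar) ** 2)                  # variance of the measurement times
tj = 10.0                                      # prediction time

# Matrix form (A.1)
H = np.column_stack([np.ones_like(t), t])
M_theta = np.linalg.inv(np.diag([1 / delta_a2, 1 / delta_b2]) + H.T @ H / sigma2)
matrix_form = np.array([1.0, tj]) @ M_theta @ np.array([1.0, tj])

# Scalar form (A.7)
num = 1/delta_b2 + m*s2/sigma2 + m*(tbar - tj)**2/sigma2 + tj**2/delta_a2
den = (1/delta_a2 + m/sigma2) * (1/delta_b2 + m*s2/sigma2) \
      + m*tbar**2 / (delta_a2 * sigma2)
print(matrix_form, num / den)                  # identical up to rounding
```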

B. Unconstrained Minimization of the BMSE for the Univariate, Time-Dependent Linear Model

Theorem B.1. —

For the univariate, linear, time-dependent model given by (58), $M_{\hat f_j}$ as given by (A.17) exhibits an absolute minimum at a point $\left(\bar t_{\min}, s_{\min}^2\right)$ if and only if

$M_{\hat f_j} = \dfrac{1}{1/\delta_\alpha^2 + m/\sigma^2}.$  (B.1)

Proof —

Recall that $\bar t$ and $s^2$ are defined by (63) and (64). To prevent ambiguous notation, let $\gamma = s^2$. Consider the derivative of $M_{\hat f_j}$ with respect to $\gamma$:

$\dfrac{\partial M_{\hat f_j}\left(\bar t, \gamma\right)}{\partial\gamma} = -\dfrac{m\sigma^2\left(t_j\sigma^2 + m\left(t_j - \bar t\right)\delta_\alpha^2\right)^2\delta_\beta^4}{\left(m\delta_\alpha^2\left(\sigma^2 + m\gamma\delta_\beta^2\right) + \sigma^2\left(\sigma^2 + m\left(\gamma + \bar t^2\right)\delta_\beta^2\right)\right)^2}.$  (B.2)

We can represent this derivative more simply as follows:

$\dfrac{\partial M_{\hat f_j}\left(\bar t, \gamma\right)}{\partial\gamma} = -\dfrac{m\sigma^2 c_1^2\delta_\beta^4}{c_2^2},$  (B.3)

where the values of $c_1$ and $c_2$ are found explicitly from (B.2). It is easy to show that $c_1^2 \ge 0$ and $c_2^2 > 0$, and consequently

$\dfrac{\partial M_{\hat f_j}\left(\bar t, \gamma\right)}{\partial\gamma} \le 0.$  (B.4)

Therefore, for a fixed $\bar t$, $M_{\hat f_j}$ either decreases or remains constant as we increase $\gamma$. It follows that, for a fixed value of $\bar t$, $M_{\hat f_j}$ is bounded below by its limit as $\gamma \to \infty$:

$\lim_{\gamma\to\infty} M_{\hat f_j}\left(\bar t, \gamma\right) = \dfrac{1}{1/\delta_\alpha^2 + m/\sigma^2}.$  (B.5)

Since this limit does not depend on the values of $\bar t$ and $\gamma$, any values of $\bar t$ and $\gamma$ for which $M_{\hat f_j}$ takes on the value given by (B.5) constitute an absolute minimum of $M_{\hat f_j}$.
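The monotonicity and the limit are easy to see numerically; the sketch below (assumed parameter values) evaluates (A.7) for increasing $\gamma = s^2$ at a fixed $\bar t$:

```python
delta_a2, delta_b2, sigma2, m, tj, tbar = 2.0, 0.1, 1.0, 4, 10.0, 6.0

def bmse(tbar, gamma):
    """The BMSE (A.7) as a function of the mean and variance of the times."""
    num = 1/delta_b2 + m*gamma/sigma2 + m*(tbar - tj)**2/sigma2 + tj**2/delta_a2
    den = (1/delta_a2 + m/sigma2) * (1/delta_b2 + m*gamma/sigma2) \
          + m*tbar**2 / (delta_a2 * sigma2)
    return num / den

for gamma in [0.0, 1.0, 10.0, 1e3, 1e6]:
    print(gamma, bmse(tbar, gamma))            # non-increasing in gamma
print("limit:", 1 / (1/delta_a2 + m/sigma2))   # the bound (B.5)
```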

Theorem B.2. —

For the univariate, linear, time-dependent model given by (58), an absolute minimum of $M_{\hat f_j}$ as given by (A.17), inside the region defined by $0 \le t_j \le T$, can be obtained by collecting data at times $t_j$ such that

$\bar t = t_j\,\dfrac{\sigma^2/m + \delta_\alpha^2}{\delta_\alpha^2},$  (B.6)

if and only if

$\bar t \le T.$  (B.7)

Proof —

The critical points of the unconstrained function are found by taking the derivatives of $M_{\hat f_j}$ with respect to $\bar t$ and $s^2$ and setting these derivatives to zero. The derivative with respect to $\bar t$ is as follows:

$\dfrac{\partial M_{\hat f_j}\left(\bar t, s^2\right)}{\partial\bar t} = -\dfrac{2m\sigma^2\left(t_j\sigma^2 + m\left(t_j - \bar t\right)\delta_\alpha^2\right)\delta_\beta^2\left(\bar t t_j\sigma^2\delta_\beta^2 + \delta_\alpha^2\left(\sigma^2 + ms^2\delta_\beta^2\right)\right)}{\left(m\delta_\alpha^2\left(\sigma^2 + ms^2\delta_\beta^2\right) + \sigma^2\left(\sigma^2 + m\left(s^2 + \bar t^2\right)\delta_\beta^2\right)\right)^2}.$  (B.8)

The derivative with respect to s 2 is given by (B.2). Setting both derivatives equal to zero, we find the following critical point:

$\bar t_{\min} = t_j\,\dfrac{\sigma^2/m + \delta_\alpha^2}{\delta_\alpha^2}.$  (B.9)

Evaluating the BMSE at this critical point, we find that

$M_{\hat f_j}\left(\bar t_{\min}, s^2\right) = \dfrac{1}{1/\delta_\alpha^2 + m/\sigma^2},$  (B.10)

which is sufficient to show that $M_{\hat f_j}$ exhibits an absolute minimum at this point (see Theorem B.1).

Given the constraint that $0 \le t_j \le T$ for each $j$, $\bar t_{\min}$ is a feasible point only when $0 \le \bar t_{\min} \le T$. It is easy to see that $\bar t_{\min} > 0$ in all cases. Therefore, the absolute minimum can be achieved by collecting all data at $\bar t_{\min}$ if and only if $\bar t_{\min} \le T$.
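Numerically (again with assumed parameter values), the BMSE evaluated at $\bar t_{\min}$ indeed equals the absolute minimum regardless of $s^2$:

```python
delta_a2, delta_b2, sigma2, m, tj = 2.0, 0.1, 1.0, 4, 10.0
tbar_min = tj * (sigma2 / m + delta_a2) / delta_a2      # (B.9)

def bmse(tbar, gamma):
    """The BMSE (A.7) as a function of the mean and variance of the times."""
    num = 1/delta_b2 + m*gamma/sigma2 + m*(tbar - tj)**2/sigma2 + tj**2/delta_a2
    den = (1/delta_a2 + m/sigma2) * (1/delta_b2 + m*gamma/sigma2) \
          + m*tbar**2 / (delta_a2 * sigma2)
    return num / den

for s2 in [0.5, 2.0, 8.0]:
    print(s2, bmse(tbar_min, s2), 1 / (1/delta_a2 + m/sigma2))   # all equal (B.10)
```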

C. Constrained Minimization of the BMSE for the Univariate, Time-Dependent Linear Model with Two Time Points

Theorem C.1. —

For the univariate, linear, time-dependent model given by (58) with $m = 2$ and $0 \le t_j \le T$, if

$t_j\,\dfrac{\sigma^2/2 + \delta_\alpha^2}{\delta_\alpha^2} > T,$  (C.1)

then $M_{\hat f_j}$ as given by (A.17) is minimized by collecting data at times such that $t_1 = t_2 = T$.

Proof —

Substituting $m = 2$ into (A.15), we find that

$M_{\hat f_j} = \dfrac{\left(t_1^2 + t_2^2\right)/\sigma^2 - 2\left(t_1 + t_2\right)t_j/\sigma^2 + t_j^2\left(2/\sigma^2 + 1/\delta_\alpha^2\right) + 1/\delta_\beta^2}{-\left(t_1 + t_2\right)^2/\sigma^4 + \left(2/\sigma^2 + 1/\delta_\alpha^2\right)\left(\left(t_1^2 + t_2^2\right)/\sigma^2 + 1/\delta_\beta^2\right)}.$  (C.2)

From Theorem B.2 we know that there are no critical points inside the feasible region, and the solution must therefore exist on the boundary of the region, which is defined by the following line segments:

A: $t_1 \in [0, T]$, $t_2 = 0$;  B: $t_1 = 0$, $t_2 \in [0, T]$;  C: $t_1 \in [0, T]$, $t_2 = T$;  D: $t_1 = T$, $t_2 \in [0, T]$.  (C.3)

Note that $M_{\hat f_j}$ is symmetric in $t_1$ and $t_2$:

$M_{\hat f_j}\left(t_1 = a, t_2 = b\right) = M_{\hat f_j}\left(t_1 = b, t_2 = a\right).$  (C.4)

Therefore, it is sufficient to consider only line segments A and C. We first find the minimum on line segment A. Setting $t_2 = 0$, taking the derivative of $M_{\hat f_j}$ with respect to $t_1$, and solving for $t_1$, we find the following two critical points:

$t_{1\min}^{A+} = \dfrac{t_j\left(\sigma^2 + 2\delta_\alpha^2\right)}{\delta_\alpha^2},$  (C.5)
$t_{1\min}^{A-} = -\dfrac{\sigma^2\delta_\alpha^2}{t_j\left(\sigma^2 + \delta_\alpha^2\right)\delta_\beta^2}.$  (C.6)

Applying condition (C.1) to $t_{1\min}^{A+}$, we find that $t_{1\min}^{A+} > 2T$. Taking the second derivative of $M_{\hat f_j}$ with respect to $t_1$, with $t_2 = 0$, and evaluating it at $t_{1\min}^{A+}$ yields

$\dfrac{\partial^2 M_{\hat f_j}}{\partial t_1^2} = \dfrac{2\sigma^2\delta_\alpha^8\delta_\beta^2}{\left(\sigma^2 + 2\delta_\alpha^2\right)^2\left(\sigma^4 t_j^2\delta_\beta^2 + 3\sigma^2 t_j^2\delta_\alpha^2\delta_\beta^2 + \delta_\alpha^4\left(\sigma^2 + 2t_j^2\delta_\beta^2\right)\right)},$  (C.7)

which is positive. Therefore (C.5) represents a minimum.

It is easy to see that $t_{1\min}^{A-}$ is always negative. Taking the second derivative of $M_{\hat f_j}$ with respect to $t_1$, with $t_2 = 0$, and evaluating it at $t_{1\min}^{A-}$ (C.6) yields

$\dfrac{\partial^2 M_{\hat f_j}}{\partial t_1^2} = -\dfrac{2t_j^4\left(\sigma^2 + \delta_\alpha^2\right)^2\delta_\beta^6}{\sigma^6 t_j^2\delta_\beta^2 + 3\sigma^4 t_j^2\delta_\alpha^2\delta_\beta^2 + \delta_\alpha^4\left(\sigma^4 + 2\sigma^2 t_j^2\delta_\beta^2\right)},$  (C.8)

which is negative. Therefore $t_{1\min}^{A-}$ represents a maximum.

Since the maximum occurs at $t_{1\min}^{A-} < 0$, the minimum occurs at $t_{1\min}^{A+} > T$, and these are the only two critical points, the BMSE must be decreasing over the line segment; the minimum BMSE on line segment A therefore occurs at $t_1 = T$, $t_2 = 0$.

We next find the minimum on line segment C. Setting $t_2 = T$, taking the derivative of $M_{\hat f_j}$ with respect to $t_1$, and solving for $t_1$, we find the following two critical points:

$t_{1\min}^{C+} = \dfrac{\sigma^2 t_j - T\delta_\alpha^2 + 2t_j\delta_\alpha^2}{\delta_\alpha^2}, \qquad t_{1\min}^{C-} = \dfrac{\delta_\alpha^2\left(\sigma^2 + T^2\delta_\beta^2 - Tt_j\delta_\beta^2\right)}{\left(T\delta_\alpha^2 - t_j\left(\sigma^2 + \delta_\alpha^2\right)\right)\delta_\beta^2}.$  (C.9)

We see that $t_{1\min}^{C+}$ is minimized by applying the lower bound for $t_j$ given by (C.1), and we find that $t_{1\min}^{C+} > T$. Concerning $t_{1\min}^{C-}$, the numerator is minimized by applying the upper bound for $t_j$ of $T$; in this case the numerator is $\delta_\alpha^2\sigma^2$, which is always positive. The maximum value of the denominator is found by substituting the minimum value for $t_j$ given by (C.1); we find that the maximum value is $-T\sigma^2\delta_\alpha^2\delta_\beta^2/\left(\sigma^2 + 2\delta_\alpha^2\right)$, which is negative. Therefore $t_{1\min}^{C-}$ is always negative.

Taking the second derivative of $M_{\hat f_j}$ with respect to $t_1$, with $t_2 = T$, and evaluating it at $t_{1\min}^{C+}$ yields

$\dfrac{\partial^2 M_{\hat f_j}}{\partial t_1^2} = \dfrac{2\sigma^2\delta_\alpha^8\delta_\beta^2}{\left(\sigma^2 + 2\delta_\alpha^2\right)^2\left(\sigma^4 t_j^2\delta_\beta^2 + \sigma^2 t_j\left(3t_j - 2T\right)\delta_\alpha^2\delta_\beta^2 + \delta_\alpha^4\left(\sigma^2 + 2T^2\delta_\beta^2 - 4Tt_j\delta_\beta^2 + 2t_j^2\delta_\beta^2\right)\right)}.$  (C.10)

Evaluating (C.10) at $t_j = T$ results in

$\dfrac{\partial^2 M_{\hat f_j}}{\partial t_1^2} = \dfrac{2\delta_\alpha^8\delta_\beta^2}{\left(\sigma^2 + 2\delta_\alpha^2\right)^2\left(\delta_\alpha^4 + \sigma^2 T^2\delta_\beta^2 + T^2\delta_\alpha^2\delta_\beta^2\right)},$  (C.11)

which is positive. In general, (C.10) is positive when

$\sigma^4 t_j^2\delta_\beta^2 + \sigma^2 t_j\left(3t_j - 2T\right)\delta_\alpha^2\delta_\beta^2 + \delta_\alpha^4\left(\sigma^2 + 2T^2\delta_\beta^2 - 4Tt_j\delta_\beta^2 + 2t_j^2\delta_\beta^2\right) > 0.$  (C.12)

Setting the expression on the left-hand side to zero and solving for $t_j$, we find the following roots:

$t_j = \dfrac{\sigma^2 T\delta_\alpha^2\delta_\beta^2 + 2T\delta_\alpha^4\delta_\beta^2 \pm \sqrt{-\sigma^6\delta_\alpha^4\delta_\beta^2 - 3\sigma^4\delta_\alpha^6\delta_\beta^2 - 2\sigma^2\delta_\alpha^8\delta_\beta^2 - \sigma^4 T^2\delta_\alpha^4\delta_\beta^4 - 2\sigma^2 T^2\delta_\alpha^6\delta_\beta^4}}{\sigma^4\delta_\beta^2 + 3\sigma^2\delta_\alpha^2\delta_\beta^2 + 2\delta_\alpha^4\delta_\beta^2}.$  (C.13)

These roots are imaginary, which implies that (C.10) is always positive; therefore, $t_{1\min}^{C+}$ is a minimum. Given that the BMSE must always be positive, and there are only two critical points, $t_{1\min}^{C-}$ must be a maximum. Since the maximum occurs at $t_{1\min}^{C-} < 0$ and the minimum occurs at $t_{1\min}^{C+} > T$, the minimum on the line segment must occur at $t_1 = T$.

In summary, the minimum value on line segment A occurs at $t_1 = T$, $t_2 = 0$. By symmetry, the BMSE at this point equals the BMSE at $t_1 = 0$, $t_2 = T$, which is the maximum point for line segment C. Therefore, the minimum BMSE does not occur on line segment A. Again by symmetry, the minimum BMSE does not occur on line segment B. For line segment C, the minimum BMSE occurs at $t_1 = T$, $t_2 = T$; applying symmetry, the minimum BMSE on line segment D also occurs at $t_1 = T$, $t_2 = T$. Since these two points coincide, the overall minimum BMSE within the feasible region occurs at $t_1 = t_2 = T$.
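The conclusion of Theorem C.1 can be confirmed by a grid search over the feasible region; the parameter values below are assumptions chosen so that condition (C.1) holds:

```python
import numpy as np

delta_a2, delta_b2, sigma2, T, tj = 2.0, 0.1, 1.0, 10.0, 10.0
assert tj * (sigma2 / 2 + delta_a2) / delta_a2 > T      # condition (C.1)

def bmse(t1, t2):
    """Prediction BMSE (A.1) for m = 2 measurements at times t1 and t2."""
    H = np.array([[1.0, t1], [1.0, t2]])
    M = np.linalg.inv(np.diag([1 / delta_a2, 1 / delta_b2]) + H.T @ H / sigma2)
    h = np.array([1.0, tj])
    return h @ M @ h

grid = np.linspace(0.0, T, 101)
vals = np.array([[bmse(t1, t2) for t2 in grid] for t1 in grid])
i, j = np.unravel_index(vals.argmin(), vals.shape)
print("minimizing times:", grid[i], grid[j])            # expect (T, T)
```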

D. Unconstrained Minimization of the BMSE for the Bivariate, Time-Dependent Linear Model

Theorem D.1. —

For the bivariate, linear, time-dependent model given by (70), $M_{\hat f_{1j}}$ as given by (74) exhibits an absolute minimum at a point $\left(\bar t_{1\min}, \bar t_{2\min}, s_{1\min}^2, s_{2\min}^2\right)$ if and only if

$M_{\hat f_{1j}} = \dfrac{\delta_{\alpha 1}^2\left(\sigma_1^2/m_1\right)\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}{\left(\sigma_1^2/m_1\right)\left(\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right) + \delta_{\alpha 1}^2\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}.$  (D.1)

Proof —

Recall that $\bar t_1$, $\bar t_2$, $s_1^2$, and $s_2^2$ are defined in (77), (78), (79), and (80). To prevent ambiguous notation, let $\gamma_1 = s_1^2$ and $\gamma_2 = s_2^2$. Consider the derivative of $M_{\hat f_{1j}}$ (see (74)) with respect to $\gamma_1$:

$\dfrac{\partial M_{\hat f_{1j}}\left(\bar t_1, \bar t_2, \gamma_1, \gamma_2\right)}{\partial\gamma_1} = -\dfrac{\delta_{\beta 1}^2 m_1\sigma_1^2 c_1^2}{c_2^2},$  (D.2)

where

$c_1 = \bar t_2\rho\omega\delta_{\alpha 1}\delta_{\alpha 2}\delta_{\beta 2}m_2\sigma_1^2\sigma_2^2 + t_j\delta_{\beta 1}\sigma_1^2\left[\delta_{\alpha 2}^2 m_2\left(s_2^2\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 - \sigma_2^2\right) - \sigma_2^2\left(\left(s_2^2 + \bar t_2^2\right)\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right] + \left(\bar t_1 - t_j\right)\delta_{\alpha 1}^2\delta_{\beta 1}m_1\left[\left(-1 + \rho^2\right)\delta_{\alpha 2}^2 m_2\left(s_2^2\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 - \sigma_2^2\right) + \sigma_2^2\left(\left(s_2^2 + \bar t_2^2\right)\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right],$
$c_2 = -2\bar t_1\bar t_2\rho\omega\delta_{\alpha 1}\delta_{\alpha 2}\delta_{\beta 1}\delta_{\beta 2}m_1 m_2\sigma_1^2\sigma_2^2 + \sigma_1^2\delta_{\alpha 2}^2 m_2\left[\left(s_1^2 + \bar t_1^2\right)\delta_{\beta 1}^2 m_1\left(s_2^2\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 - \sigma_2^2\right) + \sigma_1^2\left(s_2^2\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right] + \sigma_2^2\left[\left(s_1^2 + \bar t_1^2\right)\delta_{\beta 1}^2 m_1\left(\left(s_2^2 + \bar t_2^2\right)\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 - \sigma_2^2\right) + \sigma_1^2\left(\left(s_2^2 + \bar t_2^2\right)\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right] + \delta_{\alpha 1}^2 m_1\left[\left(-1 + \rho^2\right)\delta_{\alpha 2}^2 m_2\left(s_1^2\delta_{\beta 1}^2 m_1\left(s_2^2\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 - \sigma_2^2\right) - \sigma_1^2\left(s_2^2\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right) + \sigma_2^2\left(\sigma_1^2\left(\left(s_2^2 + \bar t_2^2\right)\delta_{\beta 2}^2 m_2 + \sigma_2^2\right) + s_1^2\delta_{\beta 1}^2 m_1\left(\left(s_2^2 + \bar t_2^2\right)\left(-1 + \omega^2\right)\delta_{\beta 2}^2 m_2 + \sigma_2^2\right)\right)\right].$  (D.3)

It is easy to show that $c_1^2 \ge 0$ and $c_2^2 \ge 0$, and consequently

$\dfrac{\partial M_{\hat f_{1j}}\left(\bar t_1, \bar t_2, \gamma_1, \gamma_2\right)}{\partial\gamma_1} \le 0.$  (D.4)

Consider also the derivative of $M_{\hat f_{1j}}$ with respect to $\gamma_2$:

$\dfrac{\partial M_{\hat f_{1j}}\left(\bar t_1, \bar t_2, \gamma_1, \gamma_2\right)}{\partial\gamma_2} = -\dfrac{\delta_{\beta 2}^2 m_2\sigma_1^4\sigma_2^2 c_3^2}{c_4^2},$  (D.5)

where

$c_3 = \bar t_2\rho\delta_{\alpha 1}\delta_{\alpha 2}\delta_{\beta 2}m_2\left[\left(s_1^2 + \bar t_1\left(\bar t_1 - t_j\right)\right)\left(-1 + \omega^2\right)\delta_{\beta 1}^2 m_1 - \sigma_1^2\right] + \left(\bar t_1 - t_j\right)\omega\delta_{\alpha 1}^2\delta_{\beta 1}m_1\left[\left(-1 + \rho^2\right)\delta_{\alpha 2}^2 m_2 - \sigma_2^2\right] + t_j\omega\delta_{\beta 1}\sigma_1^2\left(\delta_{\alpha 2}^2 m_2 + \sigma_2^2\right),$ and $c_4 = c_2$, with $c_2$ as given in (D.3).  (D.6)

It can be shown that $c_3^2 \ge 0$ and $c_4^2 > 0$, and consequently

$\dfrac{\partial M_{\hat f_{1j}}\left(\bar t_1, \bar t_2, \gamma_1, \gamma_2\right)}{\partial\gamma_2} \le 0.$  (D.7)

Therefore, for fixed $\bar t_1$ and $\bar t_2$, $M_{\hat f_{1j}}$ either decreases or remains constant as we increase $\gamma_1$ or $\gamma_2$. It follows that, for fixed values of $\bar t_1$ and $\bar t_2$, $M_{\hat f_{1j}}$ is bounded below by its limit as $\gamma_1, \gamma_2 \to \infty$:

$\lim_{\gamma_1,\gamma_2\to\infty} M_{\hat f_{1j}}\left(\bar t_1, \bar t_2, \gamma_1, \gamma_2\right) = \dfrac{\delta_{\alpha 1}^2\left(\sigma_1^2/m_1\right)\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}{\left(\sigma_1^2/m_1\right)\left(\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right) + \delta_{\alpha 1}^2\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}.$  (D.8)

Since this limit does not depend on $\bar t_1$ or $\bar t_2$, any values of $\bar t_1$, $\bar t_2$, $\gamma_1$, and $\gamma_2$ for which $M_{\hat f_{1j}}$ takes on the value given by (D.8) constitute an absolute minimum of $M_{\hat f_{1j}}$.

Theorem D.2. —

For the bivariate, linear, time-dependent model given by (70), an absolute minimum of $M_{\hat f_{1j}}$ as given by (74), inside the region defined by $0 \le t_{rj} \le T$, can be obtained by collecting data at times $t_{rj}$ such that

$\bar t_{1\min} = t_{1j}\,\dfrac{\delta_{\alpha 1}^2\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right) + \left(\sigma_1^2/m_1\right)\left(\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}{\delta_{\alpha 1}^2\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}, \qquad \bar t_{2\min} = 0,$  (D.9)

if and only if

$\bar t_{1\min} \le T.$  (D.10)

Proof —

Substituting (D.9) into $M_{\hat f_{1j}}$, we find that

$M_{\hat f_{1j}}\left(\bar t_{1\min}, \bar t_{2\min}, s_1^2, s_2^2\right) = \dfrac{\delta_{\alpha 1}^2\left(\sigma_1^2/m_1\right)\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)}{\left(\sigma_1^2/m_1\right)\left(\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right) + \delta_{\alpha 1}^2\left(\left(1 - \rho^2\right)\delta_{\alpha 2}^2 + \sigma_2^2/m_2\right)},$  (D.11)

which is sufficient to show that any point $\left(\bar t_{1\min}, \bar t_{2\min}, s_1^2, s_2^2\right)$ is an absolute minimum of the unconstrained function (see Theorem D.1).

Given the constraint that $0 \le t_{1j} \le T$ for each $j$, $\bar t_{1\min}$ is a feasible point only when $0 \le \bar t_{1\min} \le T$. It is easy to see that $\bar t_{1\min} > 0$ in all cases. Therefore, the absolute minimum can be achieved by collecting all data at $\bar t_{1\min}$ if and only if $\bar t_{1\min} \le T$.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

1. Bliese P. D., Ployhart R. E. Growth modeling using random coefficient models: model building, testing, and illustrations. Organizational Research Methods. 2002;5(4):362–387. doi: 10.1177/109442802237116.
2. Olofsen E., Dinges D. F., Van Dongen H. P. A. Nonlinear mixed-effects modeling: individualization and prediction. Aviation, Space, and Environmental Medicine. 2004;75(3):A134–A140.
3. Van Dongen H. P. A., Maislin G., Dinges D. F. Dealing with inter-individual differences in the temporal dynamics of fatigue and performance: importance and techniques. Aviation, Space, and Environmental Medicine. 2004;75(3):A147–A154.
4. Belenky G., Wesensten N. J., Thorne D. R., et al. Patterns of performance degradation and restoration during sleep restriction and subsequent recovery: a sleep dose-response study. Journal of Sleep Research. 2003;12(1):1–12. doi: 10.1046/j.1365-2869.2003.00337.x.
5. Van Dongen H. P. A., Maislin G., Mullington J. M., Dinges D. F. The cumulative cost of additional wakefulness: dose-response effects on neurobehavioral functions and sleep physiology from chronic sleep restriction and total sleep deprivation. Sleep. 2003;26(2):117–126. doi: 10.1093/sleep/26.2.117.
6. Van Dongen H. P. A., Baynard M. D., Maislin G., Dinges D. F. Systematic interindividual differences in neurobehavioral impairment from sleep loss: evidence of trait-like differential vulnerability. Sleep. 2004;27(3):423–433.
7. Van Dongen H. P. A., Mott C. G., Huang J.-K., Mollicone D. J., McKenzie F. D., Dinges D. F. Optimization of biomathematical model predictions for cognitive performance impairment in individuals: accounting for unknown traits and uncertain states in homeostatic and circadian processes. Sleep. 2007;30(9):1129–1143. doi: 10.1093/sleep/30.9.1129.
8. Olofsen E., Van Dongen H. P. A., Mott C. G., Balkin T. J., Terman D. Current approaches and challenges to development of an individualized sleep and performance prediction model. Open Sleep Journal. 2010;3(1):24–43. doi: 10.2174/1874620901003010024.
9. Van Dongen H. P. A., Mott C. G., Huang J.-K., Mollicone D. J., McKenzie F. D., Dinges D. F. Confidence intervals for individualized performance models. Sleep. 2007;30(9):1083.
10. Borbély A. A., Achermann P. Sleep homeostasis and models of sleep regulation. Journal of Biological Rhythms. 1999;14(6):557–568. doi: 10.1177/074873099129000894.
11. Daan S., Beersma D. G., Borbély A. A. Timing of human sleep: recovery process gated by a circadian pacemaker. The American Journal of Physiology. 1984;246(2):R161–R183. doi: 10.1152/ajpregu.1984.246.2.R161.
12. Reinsel G. C. Mean squared error properties of empirical Bayes estimators in a multivariate random effects general linear model. Journal of the American Statistical Association. 1985;80(391):642–650. doi: 10.1080/01621459.1985.10478164.
13. Brochot C., Bessoud B., Balvay D., Cuénod C.-A., Siauve N., Bois F. Y. Evaluation of antiangiogenic treatment effects on tumors' microcirculation by Bayesian physiological pharmacokinetic modeling and magnetic resonance imaging. Magnetic Resonance Imaging. 2006;24(8):1059–1067. doi: 10.1016/j.mri.2006.04.002.
14. Dalton L. A., Dougherty E. R. Application of the Bayesian MMSE estimator for classification error to gene expression microarray data. Bioinformatics. 2011;27(13):1822–1831. doi: 10.1093/bioinformatics/btr272.
15. Kamen E. W., Su J. K. Introduction to Optimal Estimation. London, UK: Springer; 1999.
16. Morgan B. B., Jr., Winne P. S., Dugan J. The range and consistency of individual differences in continuous work. Human Factors. 1980;22(3):331–340.
17. Hursh S. R., Redmond D. P., Johnson M. L., et al. Fatigue models for applied research in warfighting. Aviation, Space, and Environmental Medicine. 2004;75(3):A44–A53.
18. McCauley P., Kalachev L. V., Mollicone D. J., Banks S., Dinges D. F., Van Dongen H. P. A. Dynamic circadian modulation in a biomathematical model for the effects of sleep and sleep loss on waking neurobehavioral performance. Sleep. 2013;36(12):1987–1997. doi: 10.5665/sleep.3246.
19. Chandler J. F., Arnold R. D., Phillips J. B., Turnmire A. E. Predicting individual differences in response to sleep loss: application of current techniques. Aviation, Space, and Environmental Medicine. 2013;84(9):927–937. doi: 10.3357/asem.3581.2013.
20. Frey D. J., Badia P., Wright K. P., Jr. Inter- and intra-individual variability in performance near the circadian nadir during sleep deprivation. Journal of Sleep Research. 2004;13(4):305–315. doi: 10.1111/j.1365-2869.2004.00429.x.
21. Tucker A. M., Dinges D. F., Van Dongen H. P. A. Trait interindividual differences in the sleep physiology of healthy young adults. Journal of Sleep Research. 2007;16(2):170–180. doi: 10.1111/j.1365-2869.2007.00594.x.
22. Hursh S. R., Van Dongen H. P. A. Fatigue and performance modeling. In: Kryger M. H., Roth T., Dement W. C., editors. Principles and Practice of Sleep Medicine. 5th edition. St. Louis, Mo, USA: Elsevier Saunders; 2010. pp. 745–752.
23. Kay S. Fundamentals of Statistical Signal Processing: Estimation Theory: Volume I. Upper Saddle River, NJ, USA: Prentice Hall; 1993.
24. Van de Water A. T. M., Holmes A., Hurley D. A. Objective measurements of sleep for non-laboratory settings as alternatives to polysomnography—a systematic review. Journal of Sleep Research. 2011;20(1):183–200. doi: 10.1111/j.1365-2869.2009.00814.x.
25. Rechtschaffen A., Kales A. A Manual of Standardized Terminology, Techniques and Scoring System for Sleep Stages of Human Participants. Bethesda, Md, USA: Neurological Information Network, National Institutes of Health; 1969. (Publication no. 204).
26. Cole R. J., Kripke D. F., Gruen W., Mullaney D. J., Gillin J. C. Automatic sleep/wake identification from wrist activity. Sleep. 1992;15(5):461–469. doi: 10.1093/sleep/15.5.461.
27. Pinheiro J., Bates D., DebRoy S., Sarkar D., R Development Core Team. nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-111. 2013. https://cran.r-project.org/web/packages/nlme/nlme.pdf.
28. Laird N. M., Ware J. H. Random-effects models for longitudinal data. Biometrics. 1982;38(4):963–974. doi: 10.2307/2529876.
29. Pinheiro J., Bates D. Mixed-Effects Models in S and S-PLUS. New York, NY, USA: Springer; 2009.
30. Chaloner K. Optimal Bayesian experimental design for linear models. The Annals of Statistics. 1984;12(1):283–300. doi: 10.1214/aos/1176346407.
31. Borbély A. A. A two process model of sleep regulation. Human Neurobiology. 1982;1(3):195–204.
32. Campbell I. G., Higgins L. M., Darchia N., Feinberg I. Homeostatic behavior of Fast Fourier Transform power in very low frequency non-rapid eye movement human electroencephalogram. Neuroscience. 2006;140(4):1395–1399. doi: 10.1016/j.neuroscience.2006.03.005.
33. Crombez G., Vlaeyen J. W. S., Heuts P. H. T. G., Lysens R. Pain-related fear is more disabling than pain itself: evidence on the role of pain-related fear in chronic back pain disability. Pain. 1999;80(1-2):329–339. doi: 10.1016/s0304-3959(98)00229-2.
34. Olofsen E., Dahan A. Population pharmacokinetics/pharmacodynamics of anesthetics. AAPS Journal. 2005;7(2):E383–E389. doi: 10.1208/aapsj070239.
