Intake_epis_food(): An R Function for Fitting a Bivariate Nonlinear Measurement Error Model to Estimate Usual and Energy Intake for Episodically Consumed Foods

Adriana Pérez; Saijuan Zhang; Victor Kipnis; Douglas Midthune; Laurence S Freedman; Raymond J Carroll

doi:10.18637/jss.v046.c03

. Author manuscript; available in PMC: 2012 Jul 24.

Published in final edited form as: J Stat Softw. 2012 Mar 5;46(c03):1–17. doi: 10.18637/jss.v046.c03

Intake_epis_food(): An R Function for Fitting a Bivariate Nonlinear Measurement Error Model to Estimate Usual and Energy Intake for Episodically Consumed Foods

Adriana Pérez ¹, Saijuan Zhang ², Victor Kipnis ³, Douglas Midthune ⁴, Laurence S Freedman ⁵, Raymond J Carroll ⁶

PMCID: PMC3403723 NIHMSID: NIHMS355296 PMID: 22837731

Abstract

We consider a Bayesian analysis using WinBUGS to estimate the distribution of usual intake for episodically consumed foods and energy (calories). The model uses measures of nutrition and energy intakes via a food frequency questionnaire (FFQ) along with repeated 24 hour recalls and adjusting covariates. In order to estimate the usual intake of the food, we phrase usual intake in terms of person-specific random effects, along with day-to-day variability in food and energy consumption. Three levels are incorporated in the model. The first level incorporates information about whether an individual in fact reported consumption of a particular food item. The second level incorporates the amount of intake from those individuals who reported consumption of the food, and the third level incorporates the energy intake. Estimates of posterior means of parameters and distributions of usual intakes are obtained by using Markov chain Monte Carlo calculations. This R function reports to users point estimates and credible intervals for parameters in the model, samples from their posterior distribution, samples from the distribution of usual intake and usual energy intake, trace plots of parameters and summary statistics of usual intake, usual energy intake and energy adjusted usual intake.

Keywords: excess zero models, MCMC, nonlinear mixed models, R, R2WinBUGS, zero-inflation

1. Introduction

There are many statistical challenges when modeling food intakes reported on two or more 24 hours recalls. Some of the challenges involve the presence of measurement error because estimating the distribution of usual intake of nutrients and foods in the population involves monitoring and measuring such intakes over time and their associated recall biases. In addition, many consumers report episodically consumed foods, foods that are not typically consumed every day. For example, fish may be reported in a particular day with a positive intake, but on a different day fish intake may equal zero. Consequently, it is difficult to estimate nutrition intake with recall surveys when one of these recalls incorporate excess zero measurements. Data of this nature is often modeled with measurement error models with zero inflated data.

Recently, in nutritional surveillance (Tooze et al. 2006) and nutritional epidemiology (Kipnis et al. 2009) two-part methods had been developed for analyzing episodically consumed foods. In the first part of the model, the probability of an episodically consumed food is estimated using logistic regression with a person-specific random effect. Then, in the second part of the model the amount of that episodically consumed food per day is modeled using linear regression on a transformed scale with also a person-specific effect. These two parts are linked allowing that the two person-specific effects are correlated as well as by allowing common covariates in both parts of the model. This method is known as the “NCI method” (http://riskfactor.cancer.gov/diet/usualintakes/).

An extension into a three-part method that also incorporates the estimation of the amount of energy intake consumed per day is described in detail by Kipnis et al. (2010). These authors estimated this three-part method using nonlinear mixed effects models with likelihoods computed by adaptive Gaussian quadrature in SAS software. However, computationally it was found to have serious convergence issues within the context of nutritional epidemiology and nutritional surveillance as indicated by Kipnis et al. (2010) and Zhang et al. (2010).

The goal of this article is to present a function for the implementation of this nonlinear mixed effects model. The function Intake_epis_food() allows readers to input their data in R (R Development Core Team 2009) to generate and run the script of the three part model in WinBUGS (Spiegelhalter et al. 1999) and to run and obtain output simulations to R using the package R2WinBUGS (Sturtz et al. 2005). While the most important functions of the package R2WinBUGS are illustrated, we do not provide comprehensive documentation here; instead the reader is referred to the manual and online documentation with that package available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=plink. In the next sections, the function and corresponding algorithms are explained and an example is provided.

2. Method and parameter inference

The computational details of the Bayesian approach to fit the three-part model for an episodically consumed food and energy model (Kipnis et al. 2010) through Markov Chain Monte Carlo techniques are provided by Zhang et al. (2010) with a brief summary given here. Our model takes into consideration measures of nutrition and energy intakes via a food frequency questionnaire (FFQ) along with two repeated 24 hour recalls (24hr) and adjustment covariates. These two repeated 24hr provide information on the amounts of food and energy consumed by each individual. Consequently, an indicator variable of whether the food was consumed can be generated from the reported amount of food consumed. In addition, because two or more 24hr require distinguishing between and within person random error (Eckert et al. 1997), a type of classical measurement error is needed. Generally nutritional data are skewed, which may require transformations to reach normality and standardization. For person i = 1, …, n, and for the k = 1, 2 repeats of the 24hr, the data are Ỹ_ik = (Y_i1k, …, Y_i3k)^T, where

Y_i1k = Indicator of whether the food is consumed.
Y_i2k = Amount of the food consumed as reported by the 24hr, which equals zero if the food is not consumed.
Y_i3k = Amount of energy consumed as reported by the 24hr.

We will use a Box-Cox transformation to account for the skewness in the data. The Box-Cox transformation with transformation parameter λ, is h(y, λ) = (y^λ −1)/λ if λ ≠ 0, and h(y, λ) = log(y) if λ = 0. We will allow for the user to specify different transformation parameters λ_F and λ_E for food and energy intake, respectively.

After Box-Cox transformation, we further standardize and center these transformed variables to have a mean of zero and variance of one. This is useful for making the prior distribution specifications given below to be sensible and allows rapid convergence of the posterior samples. Specifically, let μ_{λ_F} and σ_{λ_F} be the mean and standard deviation of the transformed non-zero food data h(Y_i2k, λ_F), and let μ_{λ_E} and σ_{λ_E} be the mean and standard deviation of the transformed energy data h(Y_i3k, λ_E). Then, our analysis is performed using the following transformations:

Q_{1 ik} = Y_{i 1 k};

(1)

Q_{i 2 k} = \sqrt{2} Y_{i 1 k} {h (Y_{i 2 k}, λ_{F}) - μ_{λ_{F}}} / σ_{λ_{F}};

(2)

Q_{i 3 k} = \sqrt{2} {h (Y_{i 3 k}, λ_{E}) - μ_{λ_{E}}} / σ_{λ_{E}} = h_{tr} (Y_{i 3 k}, λ_{E}, μ_{λ_{E}}, σ_{λ_{E}});

(3)

There are also covariates such as age category, ethnic status and, in many cases the results of reported intakes from a food frequency questionnaire. In principle, the covariates can differ based on the three types of data, so we denote them as V_i1, V_i2, V_i3, which are vector-valued. To improve linearity and homocedasticity of the model, we follow the recommendations of (i) implementing Box-Cox transformations on all or some covariates like intakes from a food frequency questionnaire (Kipnis et al. 2009), (ii) centering and (iii) scaling on all covariates. Without loss of generality, let λ_c represent a vector containing the Box-Cox transformation parameter used for covariates. Let μ_{λ_c} and σ_{λ_c} be the vector means and standard deviations of the transformed covariates. Then, these transformed, center and scaled covariates are denoted by:

X_{i 1} = {h (V_{i 1}, λ_{c}) - μ_{λ_{c}}} / σ_{λ_{c}}

(4)

X_{i 2} = {h (V_{i 2}, λ_{c}) - μ_{λ_{c}}} / σ_{λ_{c}}

(5)

X_{i 3} = {h (V_{i 3}, λ_{c}) - μ_{λ_{c}}} / σ_{λ_{c}}

(6)

Let (U_i1, U_i2, U_i3) = Normal(0, Σ_u) be random effects associated with consumption, amount (if the food is consumed), and energy. Similarly, for k = 1, 2, (ε_i1k, ε_i2k, ε_i3k) = Normal(0, Σ_ε) accounts for day-to-day variation. The model for whether there is consumption can be stated as:

P (Q_{i 1 k} = 1 | X_{i 1}, U_{i 1}, U_{i 2}, U_{i 3}) = Φ (α_{1} + X_{i 1}^{T} β_{1} + U_{i 1}),

(7)

where Φ(·) is the standard normal distribution function and (W_i1k, W_i2k, W_i3k) represent their corresponding latent variables as follows. A food being consumed at visit k is equivalent to

Q_{i 1 k} = 1 \Leftrightarrow W_{i 1 k} = α_{1} + X_{i 1}^{T} β_{1} + U_{i 1} + ε_{i 1 k} > 0 .

(8)

The food when consumed and energy intake are modeled as

[Q_{i 2 k} | Q_{i 1 k} > 0] = W_{i 2 k} = α_{2} + X_{i 2}^{T} β_{2} + U_{i 2} + ε_{i 2 k}

(9)

[Q_{i 3 k}] = W_{i 3 k} = α_{3} + X_{i 3}^{T} β_{3} + U_{i 3} + ε_{i 3 k} .

(10)

The prior distribution of parameters β₁, β₂, β₃ is assumed to be multivariate Normal with vector mean 0 and variance covariance Σ_u. An inverse Wishart denoted as IW(Ω_u, m_u) prior was specified for Σ_u. We set Ω_u to have an exchangeable correlation structure with starting values with diagonal entries all equal to 1 and correlations 0.5. We have two necessary restrictions on Σ_ε: (i) ε_i1k and ε_i2k are independent; and (ii) var(ε_i1k) = 1, so that β₁ is identifiable, along with the distribution of U_i1. Furthermore, we constrained this covariance matrix using a polar coordinate representation with γ ∈ (−1, 1) and θ ∈ (−π, π). With these considerations Σ_ε can be written as:

Σ_{ε} = [\begin{matrix} 1 & 0 & p_{1} s_{33}^{1 / 2} \\ 0 & s_{22} & p_{2} {(s_{22} s_{33})}^{1 / 2} \\ p_{1} s_{33}^{1 / 2} & p_{2} {(s_{22} s_{33})}^{1 / 2} & s_{33} \end{matrix}],

(11)

where p₁ = γcos(θ) and p₂ = γsin (θ). The recommended prior distributions for s₂₂ and s₃₃ are Uniform(0,3), for γ is Uniform(−1,1) and for θ is Uniform(−π, π) (Zhang et al. 2010). The corresponding inverse of Σ_u is a Wishart distribution denoted as $Σ_{u}^{- 1} = W (Ω_{u}^{- 1}, m_{u})$ , and the inverse of Σ_ε can be written as:

Σ_{ε}^{- 1} = [\begin{matrix} 1 + {wp}_{1}^{2} s_{33} & {wp}_{1} p_{2} s_{33} s_{22}^{- 1 / 2} & - {wp}_{1} s_{33}^{1 / 2} \\ {wp}_{1} p_{2} s_{33} s_{22}^{- 1 / 2} & s_{22}^{- 1} + {wp}_{2}^{2} s_{33} s_{22}^{- 1} & - {wp}_{2} s_{33}^{1 / 2} s_{22}^{- 1 / 2} \\ - {wp}_{1} s_{33}^{1 / 2} & - {wp}_{2} s_{33}^{1 / 2} s_{22}^{- 1 / 2} & w \end{matrix}],

(12)

where $w = {s_{33} (1 - p_{1}^{2} - p_{2}^{2})}^{- 1}$ .

2.1. Usual intake analysis

Besides the parameters and the random effects (U_i1, U_i2, U_i3) in this context it is vital to also have the posterior samples of usual intake T_Fi, usual energy intake T_Ei, and energy adjusted usual intake 1000T_Fi/T_Ei, all on the original scale. We used the best-power method (Dodd et al. 2006) for estimating both usual energy intake and usual intake for person i. We first consider energy:

T_{Ei} = E {h_{tr}^{- 1} (α_{3} + X_{i 3}^{T} β_{3} + U_{i 3} + ε_{i 3}, λ_{E}, μ_{λ_{E}}, σ_{λ_{E}}) | X_{i 3}, U_{i 3}}

(13)

\approx h_{tr}^{*} {α_{3} + X_{i 3}^{T} β_{3} + U_{i 3}, λ_{E}, μ_{λ_{E}}, σ_{λ_{E}}, Σ_{ε} (3, 3)} .

(14)

where

h_{tr}^{*} {υ, λ, μ, σ, Σ} = h_{tr}^{- 1} (υ, λ, μ, σ) + \frac{Σ}{2} \frac{\partial^{2} h_{tr}^{- 1} (υ, λ, μ, σ)}{\partial υ^{2}} .

Similarly, a person’s usual intake of the dietary component on the original scale is defined as

T_{Fi} = Φ (α_{1} + X_{i 1}^{T} β_{1} + U_{i 1}) h_{tr}^{*} {α_{2} + X_{i 2}^{T} β_{2} + U_{i 2}, λ_{F}, μ_{λ_{F}}, σ_{λ_{F}}, Σ_{ε} (2, 2)} .

(15)

When λ = 0, the back-transformation and second derivative are:

\begin{matrix} h_{tr}^{- 1} (υ, 0, μ, σ) = & exp (μ + σ υ / \sqrt{2}); \\ \partial^{2} h_{tr}^{- 1} (υ, 0, μ, σ) / \partial υ^{2} = & σ^{2} h_{tr}^{- 1} (υ, 0, μ, σ) / 2 . \end{matrix}

Similarly, when λ ≠ 0, the back-transformation and second derivative are:

\begin{matrix} h_{tr}^{- 1} (υ, λ, μ, σ) = & {(1 + λ μ + λ σ υ / \sqrt{2})}^{1 / λ}; \\ \partial^{2} h_{tr}^{- 1} (υ, λ, μ, σ) / \partial υ^{2} = & \frac{σ^{2}}{2} (1 - λ) {(1 + λ μ + λ σ υ / \sqrt{2})}^{- 2 + 1 / λ} . \end{matrix}

3. Using the Intake_epis_food() function in `R`

Users must install the R2WinBUGS (Sturtz et al. 2005) package in R and WinBUGS (Spiegelhalter et al. 1999) before running Intake_epis_food() function. This function assumes that there are no missing observations. Intake_epis_food() loads the libraries R2WinBUGS, stats and fSeries automatically.

The Intake_epis_food() function incorporates this three-part model, with truncated normal random variables for generating the latent W_i1k, and Metropolis-Hastings computations for α₁, α₂, α₃, β₁, β₂, β₃, the elements of Σ_u, and Σ_ε. This function generates and runs the script of this model in WinBUGS and returns Markov Chain Monte Carlo (MCMC) computation. We emphasize that MCMC computation can either be thought of as a strictly Bayesian computation with ordinary Bayesian inference, or as a means of developing frequentist estimators of the crucial parameters. These MCMC produces estimates which are known to be asymptotically equivalent to maximum likelihood estimators, which are difficult to obtain (Zhang et al. 2010). In Bayesian terms we used the posterior means of the model. Users are mainly interested in estimates of equations (14), (15) as well as the energy-adjusted usual intake 1000T_Fi/T_Ei.

The Intake_epis_food() function produces seven output files. The first output file contains parameter estimates (posterior means) of α₁, α₂, α₃, β₁, β₂, β₃ from equations (8), (9) and (10) as well as Σ_μ, $Σ_{μ}^{- 1}$ , Σ_ε, and $Σ_{ε}^{- 1}$ . The second output file contains summary statistics of Bayesian estimates of equations (14), (15) and the energy-adjusted usual intake 1000T_Fi/T_Ei. The third output file contains the iterations from the MCMC computations. The fourth output file contains summary statistics of Bayesian estimates for U_i1, U_i2, U_i3. The fifth output file contains trace plots of the intercept and coefficients in equations (8), (9) and (10). The sixth output file contains trace plots of the variance-covariances Σ_μ, $Σ_{μ}^{- 1}$ , Σ_ε and $Σ_{ε}^{- 1}$ . The seventh output file contains density plots of equations (15) and the energy adjusted usual intake 1000T_Fi/T_Ei.

The default input for the Intake_epis_food() function in R is:

Intake_epis_food(data.file.name="data.file.name",
numcov=4, df=5,lambdaf=0.32,lambdae=0.23,lvar1=999, lvar2=999, lvar3=0.33,
lvar4=0, n.iter=11000,n.burnin=1000, n.thin=10, bugs.seed=123456,
working.directory="C:/", file.estimates="estimates.csv",
file.istats="estimates_intake.csv", file.iterations="iterations.csv",
file.uis="uis.csv", file.tracep="trace_plots.pdf",
file.tpvarcov="trace_plot_var_cov_matrices.pdf",
file.densityp="density_plots.pdf",
bugs.directory="c:/Program Files/WinBUGS14/")

with the following arguments:

data.file.name : name of the file containing the data (default=data.file.name) in the following order:
- the first variable that identifies individuals, e.g. an ID number
- the next two variables are the 24hr variables of food intake, in order
- the next two variables are the 24hr variables of energy intake, in order
- the next variables are the covariates, and cannot have a column of ones. We limited the number of covariates to four, given that each covariate may have a different Box-Cox transformation parameter.
numcov : number of covariates in user’s dataset (default=4)
df : number of the degrees of freedom of m_u (default=5)
lambdaf : value of λ_F to be used for transformation of food intake (default=1 indicating no transformation done)
lambdae : value of λ_E to be used for transformation of energy intake (default=1 indicating no transformation done)
lvar1 : value of λ_c to be used for Box-Cox transformation of first covariable (default=999 indicating no transformation done)
lvar2 : value of λ_c to be used for Box-Cox transformation of second covariable (default=999 indicating no transformation done)
lvar3 : value of λ_c to be used for Box-Cox transformation of third covariable (default=999 indicating no transformation done)
lvar4 : value of λ_c to be used for Box-Cox transformation of fourth covariable (default=999 indicating no transformation done)
n.iter : number of iterations of the MCMC including burning iterations (default=11,000)
n.burnin : number of burn-in iterations of the MCMC (default=1,000)
n.thin : thinning rate of MCMC (default=10)
bugs.seed : random seed to be use for MCMC (default=123456)
working.directory : drive and folder location for output files (default=”C:”)
file.estimates : name of the file to save posterior summary statistics including mean, standard deviation as well as percentiles: 2.5^th, 25^th, 50^th, 75^th and 97.5^th (default = estimates.csv), of α₁, α₂, α₃, β₁, β₂, β₃ from equations (8), (9) and (10) as well as Σ_μ, $Σ_{μ}^{- 1}$ , Σ_ε, and $Σ_{ε}^{- 1}$ .
file.istats : name of the file to save summary statistics of usual food intake, (15), usual energy intake, (14) and usual food intake per 1000 calories (1000T_Fi/T_Ei) (default = estimates intake.csv).
file.iterations : name of the file to save iterations (default=iterations.csv). The number of iterations saved corresponds to (n.iter − n.burnin)/(n.thin)
file.uis : name of the file to save summary statistics of Bayesian estimates for U_i1, U_i2, U_i3 including mean, standard deviation as well as percentiles: 2.5^th, 25^th, 50^th, 75^th and 97.5^th for person i = 1, …, n. The first variable given in the dataset that identifies individuals is included as the first variable in this file for linking purposes to users.
file.tracep : name of the file to save trace plots of parameters (default = trace plots.pdf)
file.tpvarcov : name of the file to save trace plots of variance-covariances Σ_μ, $Σ_{μ}^{- 1}$ , Σ_ε and $Σ_{ε}^{- 1}$ (default = trace plot var cov matrices.pdf)
file.densityp : name of the file to save density plots of usual food intake and usual food intake per 1000 calories. Each one is generated individually and an overlayed plot is generated as well (default = density plots.pdf)
bugs.directory : drive and folder location of WinBUGS (default=”c:/Program Files /Win-BUGS14/”)

Once the data is input, Intake_epis_food() invokes WinBUGS from R. WinBUGS requires a file describing the model in equations (1)–(5) in Section 2. This model is included as file intake_program.txt and is loaded in R using its bugs() argument. Although, the default priors of the constrained variance-covariance matrix of random errors are geared to the pre-standardization of the data as described in Section 2, the user may change those prior entries within the file intake_program.txt, possibly changing:

Constrained variance-covariance matrix of random errors
Matrix notation	Default distribution and range	Code in `intake_program.txt`
s₂₂	Uniform(0,3)	tau.e1 ~ dunif(0,3)
s₃₃	Uniform(0,3)	tau.e2 ~ dunif(0,3)
γ	Uniform (−1,1)	a.e1 ~ dunif(−0.99,0.99)
θ	Uniform (−1,1)	a.e2 ~ dunif(−3.11,3.11)

Open in a new tab

Users may change any of the defaults of Intake_epis_food() function when the function is called to run by replacing those arguments in the last sentence of the file accompanied by this paper. MCMC posterior means of α₁, α₂, α₃, β₁, β₂, β₃, the elements of Σ_u, Σ_ε, $Σ_{u}^{- 1} and Σ_{ε}^{- 1}$ are displayed on the screen of R by Intake_epis_food() and saved automatically in numeral matrices on file names previously listed. See Section 4 for an example.

4. Application of the Intake_epis_food() function

We simulated data using parameter values identified from the calibration sub-study of the National Institutes of Health (NIH) and the American Association of Retired Persons (AARP) Diet and Health Study (Schatzkin et al. 2001). The NIH-AARP study is a cohort composed of people who resided in one of six states: California, Florida, Pennsylvania, New Jersey, North Carolina, and Louisiana or in two metropolitan areas: Atlanta, Georgia and Detroit, Michigan. From 1995 through 1996, 3.5 million questionnaires were mailed to members of the AARP, aged 50–71 years. The questionnaire included a dietary section as well as some lifestyle questions. Over 500,000 people returned the questionnaire after three mailing waves. This, then, is the largest study of diet and health ever conducted in the USA. In 1996–1997, these participants received a Risk-Factor Questionnaire which asked additional questions about lifestyle and behavior, and in 2004–2006 these participants received another follow-up questionnaire (http://dietandhealth.cancer.gov/history.html).

Participants for the calibration study were randomly selected from the 46,970 subjects who had responded to the first wave as of January 1996. Two thousand participants was the targeted sample size to attempt recruitment calls for a baseline and 24 hours follow up. The sample available to us included 920 men. Because the data are not publicly available, we simulated 920 food intakes that are similar in distribution to the men in the NIH-AARP calibration study. We used as $X_{ik}^{T}$ four covariates: age, body mass index (BMI), consumption of servings of whole grains from the FFQ, and energy intake from whole grains from the FFQ. Therefore, the degrees of freedom of the inverse Wishart were setup as m_u = 5. The latter two covariates were transformed by the cube root and the logarithm, respectively.

We present results from Intake_epis_food() for this simulated data example with a sample size of 920 men, modeling whole grains consumption. We used a burn-in of 1, 000 steps followed by 10, 000 MCMC iterations; our Intake_epis_food() function took approximately 6 hours and 16 minutes (Pentium computer (R) D CPU 3.5GHz and 1.99GB of RAM). We used only every 10^th value of the chain. The first output file contains the average over 10,000 MCMC of posterior mean, posterior standard deviation, posteriors: 2.5^th percentile, 25^th percentile, 50^th percentile, 75^th percentile and 97.5^th percentile. The intercepts of the model are α₁, α₂, α₃ corresponding to each level of the three-part model. Also, estimates of regression parameters for each covariate appear for each level of the model. This is the notation used for the output:

alpha_1 : represents the intercept in equation (8).
alpha_2 : represents the intercept in equation (9).
alpha_3 : represents the intercept in equation (10).
beta_1_j : represents the estimated coefficient of the j^th covariate in equation (8), j=1,‥,numcov.
beta_2_j : represents the estimated coefficient of the j^th covariate in equation (9), j=1,‥,numcov.
beta_3_j : represents the estimated coefficient of the j^th covariate in equation (10), j=1,‥,numcov.
tau_j_p : represents the (j,p) entry estimate of $Σ_{u}^{- 1}$ , j=p=1,2,3.
sigma2_j_p : represents the (j,p) entry estimate of Σ_u, j=p=1,2,3.
a.e1 : represents the estimate of γ
a.e2 : represents the estimate of θ
sigmaemat_j_p : represents the (j,p) entry estimate of Σ_ε, j=p=1,2,3, with Σ_ε11=1 and Σ_ε12=Σ_ε21=0.
tauemat_j_p : represents the the (j,p) entry estimate of $Σ_{ε}^{- 1}$ , j=p=1,2,3.

We used λ_F = 0.32 as the Box-Cox transformation parameter of food intake λ_E = 0.23 as the Box-Cox transformation of energy intake. We setup our working directory in R with the command setwd(’C:’) and using Intake_epis_food() function:

Intake_epis_food(data.file.name="Sim_AARP_wg_Men.csv",
numcov=4, df=5,lambdaf=0.32,lambdae=0.23,lvar1=999, lvar2=999, lvar3=0.33,
lvar4=0, n.iter=11000,n.burnin=1000, n.thin=10, bugs.seed=123456,
working.directory="C:/", file.estimates="estimates.csv",
file.istats="estimates_intake.csv", file.iterations="iterations.csv",
file.uis="uis.csv", file.tracep="trace_plots.pdf",
file.tpvarcov="trace_plot_var_cov_matrices.pdf",
file.densityp="density_plots.pdf",
bugs.directory="c:/Program Files/WinBUGS14/")

After calculations, estimates of α₁, α₂, α₃, β₁, β₂, β₃ from equations (8), (9) and (10), Σ_μ, $Σ_{μ}^{- 1}$ , Σ_ε, $Σ_{ε}^{- 1}$ are shown as well as summary statistics using a Bayesian approach for three variables are provided immediately in the console screen (a) food usual intake, (b) usual food intake per 1000 calories, and (c) energy usual intake. The following shows the results for whole grains.

Inference for Bugs model at ' Intake_program.txt ', fit using WinBUGS,
 1 chains, each with 11000  iterations  (first 1000   discarded)
 n.thin= 10  n.sims=  1000  iterations  saved
               mean   sd  2.5%   25%    50%   75% 97.5%
alpha_1        0.89 0.06  0.79  0.85   0.89  0.92  1.00
alpha_2       −0.04 0.05 −0.13 −0.07  −0.04 −0.01  0.05
alpha_3        0.00 0.04 −0.07 −0.03   0.00  0.02  0.07
beta_1_1       0.16 0.05  0.07  0.13   0.16  0.19  0.25
beta_1_2      −0.08 0.05 −0.18 −0.12  −0.08 −0.05  0.00
beta_1_3       0.57 0.05  0.47  0.54   0.57  0.61  0.67
beta_1_4      −0.29 0.05 −0.38 −0.32  −0.29 −0.26 −0.19
beta_2_1      −0.07 0.04 −0.15 −0.09  −0.06 −0.04  0.01
beta_2_2      −0.08 0.04 −0.16 −0.11  −0.08 −0.06 −0.01
beta_2_3       0.42 0.04  0.34  0.40   0.42  0.45  0.50
beta_2_4      −0.11 0.04 −0.18 −0.13  −0.11 −0.08 −0.03
beta_3_1      −0.07 0.04 −0.15 −0.10  −0.07 −0.05  0.00
beta_3_2      −0.06 0.04 −0.14 −0.09  −0.06 −0.04  0.00
beta_3_3       0.00 0.04 −0.07 −0.03   0.00  0.03  0.08
beta_3_4       0.34 0.04  0.27  0.32   0.34  0.37  0.42
tau_1_1        2.05 0.57  1.26  1.66   1.94  2.31  3.50
tau_1_2       −0.33 0.60 −1.64 −0.66  −0.31  0.04  0.82
tau_1_3       −0.11 0.19 −0.47 −0.23  −0.11  0.01  0.29
tau_2_1       −0.33 0.60 −1.64 −0.66  −0.31  0.04  0.82
tau_2_2        3.29 0.79  2.21  2.77   3.13  3.62  5.31
tau_2_3       −0.57 0.28 −1.21 −0.73  −0.54 −0.38 −0.09
tau_3_1       −0.11 0.19 −0.47 −0.23  −0.11  0.01  0.29
tau_3_2       −0.57 0.28 −1.21 −0.73  −0.54 −0.38 −0.09
tau_3_3        1.53 0.16  1.26  1.42   1.51  1.63  1.89
sigma2_1_1     0.56 0.13  0.33  0.47   0.55  0.64  0.83
sigma2_1_2     0.06 0.09 −0.12  0.00   0.06  0.12  0.22
sigma2_1_3     0.06 0.06 −0.06  0.02   0.06  0.10  0.18
sigma2_2_1     0.06 0.09 −0.12  0.00   0.06  0.12  0.22
sigma2_2_2     0.37 0.07  0.23  0.32   0.36  0.41  0.51
sigma2_2_3     0.13 0.05  0.03  0.10   0.13  0.16  0.23
sigma2_3_1     0.06 0.06 −0.06  0.02   0.06  0.10  0.18
sigma2_3_2     0.13 0.05  0.03  0.10   0.13  0.16  0.23
sigma2_3_3     0.72 0.07  0.60  0.68   0.72  0.77  0.86
a.e1           0.17 0.05  0.09  0.14   0.17  0.20  0.26
a.e2           0.81 0.29  0.28  0.62   0.80  0.99  1.37
sigmaemat_1_3  0.12 0.05  0.03  0.09   0.12  0.16  0.23
sigmaemat_2_2  1.26 0.08  1.11  1.21   1.26  1.31  1.42
sigmaemat_2_3  0.14 0.05  0.04  0.11   0.14  0.18  0.24
sigmaemat_3_1  0.12 0.05  0.03  0.09   0.12  0.16  0.23
sigmaemat_3_2  0.14 0.05  0.04  0.11   0.14  0.18  0.24
sigmaemat_3_3  1.16 0.06  1.06  1.12   1.16  1.19  1.27
tauemat_1_1    1.02 0.01  1.00  1.01   1.01  1.02  1.05
tauemat_1_2    0.01 0.01  0.00  0.01   0.01  0.02  0.03
tauemat_1_3   −0.11 0.05 −0.21 −0.14  −0.11 −0.08 −0.03
tauemat_2_1    0.01 0.01  0.00  0.01   0.01  0.02  0.03
tauemat_2_2    0.81 0.05  0.72  0.78   0.81  0.84  0.91
tauemat_2_3   −0.10 0.04 −0.17 −0.13  −0.10 −0.08 −0.03
tauemat_3_1   −0.11 0.05 −0.21 −0.14  −0.11 −0.08 −0.03
tauemat_3_2   −0.10 0.04 −0.17 −0.13  −0.10 −0.08 −0.03
tauemat_3_3    0.89 0.04  0.81  0.87   0.89  0.92  0.98

      Usual food intake Usual food intake per 1000 calories Energy usual intake
Mean          0.9359701                           0.4072328           2298.3852
S.d.          0.6947905                           0.2849965            446.3253
5th           0.1887330                           0.0823138           1636.7102
10th          0.2632069                           0.1156591           1755.9317
25th          0.4359173                           0.1992969           1982.9681
50th          0.7421419                           0.3408097           2267.2364
75th          1.2117088                           0.5350990           2568.3280
90th          1.9042402                           0.7920862           2891.3676
95th          2.3828280                           0.9484864           3088.9744

This is the only output presented in the console screen. These results are saved in file.estimates and file.istats respectively. Five additional output files are stored as mentioned before, but examples of these files are not presented here. Instead, we present the posterior density of the mean of usual whole grains intake plot and the posterior density of the mean of usual whole grains intake per 1000 calories in figure 1 (file.densityp). We present trace plots of the intercept and coefficients in equations (8), (9) and (10) in figure 2 (file.tracep). We present trace plots of entry estimates saved in file.tpvarcov of variance-covariance matrices (i) for Σ_μ in figure 3, (ii) for $Σ_{μ}^{- 1}$ in figure 4, (iii) for Σ_ε in figure 5 and (iv) for $Σ_{ε}^{- 1}$ in figure 6.

Posterior density of the mean for whole grains. The solid line is the density estimate for usual intake from 1000 MCMC. The dashed line is the density estimate for usual intake per 1000 calories.

Trace plot from 1000 MCMC of intercepts and estimated coefficients. First row shows trace plots of parameters for the intercept, age, body mass index, consumption of whole grains and energy intake from whole grains in equation (8). Second row shows trace plots of parameters for the intercept, age, body mass index, consumption of whole grains and energy intake from whole grains in equation (9). Third row shows trace plots of parameters for the intercept, age, body mass index, consumption of whole grains and energy intake from whole grains in equation (10).

Trace plot from 1000 MCMC for each entry estimate of variance-covariance matrix Σ_μ j=p=1,2,3.

Trace plot from 1000 MCMC for each entry estimate of variance-covariance matrix $Σ_{μ}^{- 1}$ j=p=1,2,3.

Trace plot from 1000 MCMC for each entry estimate of variance-covariance matrix Σ_ε j=p=1,2,3. The following parameters are neither estimated nor plotted because they are fixed: Σ_ε11=1 and Σ_ε12=Σ_ε21=0.

Trace plot for each entry estimate of variance-covariance matrix $Σ_{ε}^{- 1}$ j=p=1,2,3.

5. Discussion

This paper was motivated by the AARP calibration sub-study in nutritional epidemiology. Our main aim when implementing this function was to help users to estimate this recent nonlinear mixed three-part model of measurement error for an episodically food consumed in a Bayesian manner.

Acknowledgments

This research was supported by a grant from the National Cancer Institute (R37-CA-057030).

Contributor Information

Adriana Pérez, Division of Biostatistics and Michael & Susan Dell Center for Healthy Living, The University of Texas Health Science Center at Houston, School of Public Health, 1616 Guadalupe Street, Suite 6.340, Austin, TX 78705, Telephone: +1-(512)-391-2524, Fax: +1-(512)-482-6185, adriana.perez@uth.tmc.edu.

Saijuan Zhang, Department of Statistics, Blocker Building, Room, Texas A&M University, 3143 TAMU, College Station TX 77843-3143, Telephone: +1-979-845-3141, Fax: +1-979-845-3144, sjzhang@stat.tamu.edu.

Victor Kipnis, Division of Cancer Prevention, National Cancer Institute, 6130 Executive Boulevard, Bethesda, Maryland 20892-7354, Telephone: +1-301-496-7464, Fax: +1-301-496-7463, vk3b@nih.gov.

Douglas Midthune, Division of Cancer Prevention, National Cancer Institute, 6130 Executive Boulevard, Bethesda, Maryland 20892-7354, Telephone: +1-301-496-7464, Fax: +1-301-496-7463, dm76q@nih.gov.

Laurence S. Freedman, Gertner Institute for Epidemiology and Health Policy Research, Biostatistics Unit, Sheba Medical Center, Tel Hashomer 52161, Israel, Telephone: +972-3-530-5390, Fax: +972-3-534-9607, lsf@actcom.co.il

Raymond J. Carroll, Department of Statistics, Blocker Building, Room, Texas A&M University, 3143 TAMU, College Station TX 77843-3143, Telephone: +1-979-845-3141, Fax: +1-979-845-3144, carroll@stat.tamu.edu

References

Dodd KW, Guenther PM, Freedman LS, Subar AF, Kipnis V, Midthune D, Tooze JA, Krebs-Smith SM. Statistical Methods for Estimating Usual Intake of Nutrients and Foods: a Review of the Theory. Journal of the American Dietetic Association. 2006;106(10):1640–1650. doi: 10.1016/j.jada.2006.07.011. [DOI] [PubMed] [Google Scholar]
Eckert RS, Carroll RJ, Wang N. Transformations to Additivity in Measurement Error Models. Biometrics. 1997;53:262–272. [PubMed] [Google Scholar]
Kipnis V, Freedman LS, Carroll RJ, Midthune D. A measurement error model for episodically consumed foods and energy. 2010 doi: 10.18637/jss.v046.c03. Preprint. [DOI] [PMC free article] [PubMed] [Google Scholar]
Kipnis V, Midthune D, Buckman DW, Dodd KW, Guenther PM, Krebs-Smith SM, Subar AF, Tooze JA, Carroll RJ, Freedman LS. Modeling Data with Excess Zeros and Measurement Error: Application to Evaluating Relationships between Episodically Consumed Foods and Health Outcomes. Biometrics. 2009;65(4):1003–1010. doi: 10.1111/j.1541-0420.2009.01223.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
R Development Core Team. R: A Language and Environment for Statistical Computing. 2009 ISBN 3-900051-07-0, URL http://www.R-project.org.
Schatzkin A, Subar AF, Thompson FE, Harlan LC, Tangrea J, Hollenbeck AR, Hurwitz PE, Coyle L, Schussler N, Michaud DS, Freedman LS, Brown CC, Midthune D, Kipnis V. Design and serendipity in establishing a large cohort with wide dietary intake distributions: the National Institutes of Health-AARP Diet and Health Study. American Journal of Epidemiology. 2001;154:1159–1125. doi: 10.1093/aje/154.12.1119. [DOI] [PubMed] [Google Scholar]
Spiegelhalter DJ, Thomas A, Best NG. WinBUGS Version 1.2 User Manual. Cambridge, UK: MRC Biostatistics Unit; 1999. URL http://www.mrc-bsu.cam.ac.uk/bugs/. [Google Scholar]
Sturtz S, Ligges U, Gelman A. R2WinBUGS: A Package for Running WinBUGS from R. Journal of Statistical Software. 2005;12(3):1–16. URL http://www.jstatsoft.org. [Google Scholar]
Tooze JA, Midthune D, Dodd KW, Freedman LS, Krebs-Smith SM, Subar AF, Guenther PM, Carroll RJ, Kipnis V. A New Statistical Method for Estimating the Usual Intake of Episodically Consumed Foods With Application to Their Distribution. Journal of the American Dietetic Association. 2006;106(10):1575–1587. doi: 10.1016/j.jada.2006.07.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
Zhang S, Midthune D, Pérez A, Buckman DW, Kipnis V, Freedman LS, Dodd KW, Krebs-Smith SM, Carroll RJ. A Bivariate Nonlinear Measurement Error Model for Episodically Consumed Dietary Components. 2010 doi: 10.2202/1557-4679.1267. Preprint. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R1] Dodd KW, Guenther PM, Freedman LS, Subar AF, Kipnis V, Midthune D, Tooze JA, Krebs-Smith SM. Statistical Methods for Estimating Usual Intake of Nutrients and Foods: a Review of the Theory. Journal of the American Dietetic Association. 2006;106(10):1640–1650. doi: 10.1016/j.jada.2006.07.011. [DOI] [PubMed] [Google Scholar]

[R2] Eckert RS, Carroll RJ, Wang N. Transformations to Additivity in Measurement Error Models. Biometrics. 1997;53:262–272. [PubMed] [Google Scholar]

[R3] Kipnis V, Freedman LS, Carroll RJ, Midthune D. A measurement error model for episodically consumed foods and energy. 2010 doi: 10.18637/jss.v046.c03. Preprint. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] Kipnis V, Midthune D, Buckman DW, Dodd KW, Guenther PM, Krebs-Smith SM, Subar AF, Tooze JA, Carroll RJ, Freedman LS. Modeling Data with Excess Zeros and Measurement Error: Application to Evaluating Relationships between Episodically Consumed Foods and Health Outcomes. Biometrics. 2009;65(4):1003–1010. doi: 10.1111/j.1541-0420.2009.01223.x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] R Development Core Team. R: A Language and Environment for Statistical Computing. 2009 ISBN 3-900051-07-0, URL http://www.R-project.org.

[R6] Schatzkin A, Subar AF, Thompson FE, Harlan LC, Tangrea J, Hollenbeck AR, Hurwitz PE, Coyle L, Schussler N, Michaud DS, Freedman LS, Brown CC, Midthune D, Kipnis V. Design and serendipity in establishing a large cohort with wide dietary intake distributions: the National Institutes of Health-AARP Diet and Health Study. American Journal of Epidemiology. 2001;154:1159–1125. doi: 10.1093/aje/154.12.1119. [DOI] [PubMed] [Google Scholar]

[R7] Spiegelhalter DJ, Thomas A, Best NG. WinBUGS Version 1.2 User Manual. Cambridge, UK: MRC Biostatistics Unit; 1999. URL http://www.mrc-bsu.cam.ac.uk/bugs/. [Google Scholar]

[R8] Sturtz S, Ligges U, Gelman A. R2WinBUGS: A Package for Running WinBUGS from R. Journal of Statistical Software. 2005;12(3):1–16. URL http://www.jstatsoft.org. [Google Scholar]

[R9] Tooze JA, Midthune D, Dodd KW, Freedman LS, Krebs-Smith SM, Subar AF, Guenther PM, Carroll RJ, Kipnis V. A New Statistical Method for Estimating the Usual Intake of Episodically Consumed Foods With Application to Their Distribution. Journal of the American Dietetic Association. 2006;106(10):1575–1587. doi: 10.1016/j.jada.2006.07.003. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] Zhang S, Midthune D, Pérez A, Buckman DW, Kipnis V, Freedman LS, Dodd KW, Krebs-Smith SM, Carroll RJ. A Bivariate Nonlinear Measurement Error Model for Episodically Consumed Dietary Components. 2010 doi: 10.2202/1557-4679.1267. Preprint. [DOI] [PMC free article] [PubMed] [Google Scholar]

PERMALINK

Intake_epis_food(): An R Function for Fitting a Bivariate Nonlinear Measurement Error Model to Estimate Usual and Energy Intake for Episodically Consumed Foods

Adriana Pérez

Saijuan Zhang

Victor Kipnis

Douglas Midthune

Laurence S Freedman

Raymond J Carroll

Abstract

1. Introduction

2. Method and parameter inference

2.1. Usual intake analysis

3. Using the Intake_epis_food() function in `R`

4. Application of the Intake_epis_food() function

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Figure 6.

5. Discussion

Acknowledgments

Contributor Information

References

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Intake_epis_food(): An R Function for Fitting a Bivariate Nonlinear Measurement Error Model to Estimate Usual and Energy Intake for Episodically Consumed Foods

Adriana Pérez

Saijuan Zhang

Victor Kipnis

Douglas Midthune

Laurence S Freedman

Raymond J Carroll

Abstract

1. Introduction

2. Method and parameter inference

2.1. Usual intake analysis

3. Using the Intake_epis_food() function in R

4. Application of the Intake_epis_food() function

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Figure 6.

5. Discussion

Acknowledgments

Contributor Information

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases

3. Using the Intake_epis_food() function in `R`