Author manuscript; available in PMC: 2010 Mar 28.
Published in final edited form as: Am Econ J Appl Econ. 2009 Jan 1;1(1):164–182. doi: 10.1257/app.1.1.164

Table 1.

Alcohol Consumption: Participation

(1) (2) (3) (4) (5)
12 or more drinks in lifetime
Over 21 0.0418 0.0316 0.0268 0.0198 0.0199
(0.0242) (0.0301) (0.0292) (0.0423) (0.0179)
Observations 16,107 16,107 16,107 16,107
R2 0.02 0.03 0.10 0.10
Prob > Chi-Squared 0.00 0.61

12 or more drinks in one year
Over 21 0.0796 0.0657 0.0611 0.0603 0.0461
(0.0254) (0.0313) (0.0301) (0.0438) (0.0218)
Observations 16,107 16,107 16,107 16,107
R2 0.02 0.03 0.11 0.11
Prob > Chi-Squared 0.00 0.56

Any heavy drinking in last year
Over 21 0.0761 0.0527 0.0492 0.0262 0.0398
(0.0248) (0.0304) (0.0291) (0.0430) (0.0201)
Observations 16,107 16,107 16,107 16,107
R2 0.01 0.01 0.10 0.10
Prob > Chi-Squared 0.00 0.67
Covariates N N Y Y N
Weights N Y Y Y N
Quadratic terms Y Y Y Y N
Cubic terms N N N Y N
LLR N N N N Y

Notes: The first column of each panel contains the regression from the corresponding figure. Robust standard errors are in parentheses. Covariates include dummies for census region, race, gender, health insurance, employment status, twenty-first birthday, twenty-first birthday + 1 day, and looking for work. Weights are the NHIS adult sample weights; they reduce the precision of the regressions significantly because the weights vary substantially across observations. People reporting five or more drinks on one day (not necessarily in one sitting) are coded as heavy drinkers. The first four columns give the estimates from polynomial regressions on age interacted with a dummy for being over 21. The age variable is centered on 21, so the Over 21 variable gives us an estimate of the discontinuous increase at age 21. In the fifth column, we present the results of a local linear regression procedure with a rule-of-thumb bandwidth. For this procedure, we follow Fan and Gijbels (1996) and fit a fourth-order polynomial separately on each side of the age-21 cutoff. We use the fit of this regression to estimate the average second derivative of the expectation function (D) and the mean squared error of this function (σ²). The rule-of-thumb bandwidth is h = c(σ²R/D)^(1/5), where c is a constant that depends on the kernel (c = 3.44 for a triangular kernel), and R is the range of the running variable (i.e., the range of ages used to estimate the polynomial on each side). We then use this bandwidth, and a triangular kernel, to fit local linear regressions on each side of age 21, and estimate the limit of the expectation function from the left and the right of age 21. The local linear regressions have two fewer observations because the twenty-first birthday and the day after the twenty-first birthday have been dropped.
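
The two estimators described in the notes can be illustrated with a minimal Python sketch on synthetic data. It is not the authors' code. It assumes the rule-of-thumb form h = c(σ²R/D)^(1/5), takes D as the sum of squared second derivatives of the pilot polynomial (the notes describe D only as an average second derivative), and computes a separate bandwidth on each side of the cutoff. The function names, the simulated outcome, and the true jump of 0.08 are all hypothetical, and covariates, survey weights, and robust standard errors are omitted.

```python
import numpy as np

C_TRIANGULAR = 3.44  # kernel constant quoted in the table notes


def poly_jump(age, y, order=2):
    """OLS on a polynomial in age fully interacted with an over-21 dummy.
    Age is centered on 21, so the dummy's coefficient is the jump at 21."""
    over = (age > 0).astype(float)
    powers = [age ** k for k in range(1, order + 1)]
    X = np.column_stack([np.ones_like(age), over]
                        + powers + [over * p for p in powers])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def rot_bandwidth(x, y, order=4, c=C_TRIANGULAR):
    """Rule-of-thumb bandwidth from a pilot polynomial on one side of the cutoff."""
    coefs = np.polyfit(x, y, order)                # pilot fourth-order fit
    sigma2 = np.mean((y - np.polyval(coefs, x)) ** 2)
    second = np.polyval(np.polyder(coefs, 2), x)   # fitted second derivative
    D = np.sum(second ** 2)                        # assumption: sum of squares
    R = x.max() - x.min()                          # range of the running variable
    return c * (sigma2 * R / D) ** 0.2


def local_linear_limit(x, y, h):
    """Triangular-kernel local linear fit; returns the fitted value at age 21."""
    w = np.clip(1.0 - np.abs(x) / h, 0.0, None)
    keep = w > 0
    sw = np.sqrt(w[keep])
    X = np.column_stack([np.ones(keep.sum()), x[keep]])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y[keep] * sw, rcond=None)
    return beta[0]


def llr_jump(age, y):
    """Right limit minus left limit at the cutoff (observations at age == 0 dropped)."""
    left, right = age < 0, age > 0
    h_left = rot_bandwidth(age[left], y[left])
    h_right = rot_bandwidth(age[right], y[right])
    return (local_linear_limit(age[right], y[right], h_right)
            - local_linear_limit(age[left], y[left], h_left))


# Synthetic illustration only: age in years relative to 21, true jump 0.08.
rng = np.random.default_rng(0)
age = rng.uniform(-2.0, 2.0, 20_000)
y = (0.6 + 0.03 * age - 0.02 * age ** 2 + 0.08 * (age > 0)
     + rng.normal(0.0, 0.1, age.size))

quadratic_estimate = poly_jump(age, y, order=2)
llr_estimate = llr_jump(age, y)
print(quadratic_estimate, llr_estimate)
```

Both estimates should land close to the simulated jump. Passing `order=3` to `poly_jump` adds the interacted cubic terms, as in the fourth column.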