Abstract
Local knowledge of when humans are at elevated risk for exposure to tick vectors of human disease agents is required both for the effective use of personal protection measures to avoid tick bites and for implementation of control measures to suppress host-seeking ticks. Here, we used previously published data on the seasonal density of host-seeking Ixodes pacificus Cooley and Kohls nymphs, the primary vectors of Lyme disease spirochetes in the far western USA, collected across a broad habitat and climate gradient in northwestern California to identify predictors of periods of time within the year when questing nymphal density is elevated. Models based on calendar week alone performed similarly to models based on calendar week and woodland type, or meteorological variables. The most suitable model for a given application will depend on user objectives, timescale of interest, and the geographic extent of predictions. Our models sought not only to identify when seasonal host-seeking activity commences, but also when it diminishes to low levels. Overall, we report a roughly 5–7 month period in Mendocino County during which host-seeking nymphal densities exceed a low threshold value.
Keywords: Ixodes pacificus, risk assessment, phenology, Lyme disease
Local knowledge of when humans are at elevated risk for tick exposure is required both for the effective use of personal protection measures to avoid tick bites and for implementation of control measures to suppress host-seeking ticks (Piesman and Eisen 2008). Previous studies in North America have shown that the host-seeking phenologies of the primary vectors of Lyme disease spirochetes—the nymphal stages of Ixodes scapularis Say in the east and Ixodes pacificus Cooley and Kohls in the far west—are strongly associated with environmental variables (Lane et al. 1995; Vail and Smith 1998; Eisen et al. 2002, 2003, 2016; Diuk-Wasser et al. 2006; Gatewood et al. 2009).
Here, we used previously published data on the seasonal density of host-seeking I. pacificus nymphs collected across a broad habitat and climate gradient in northwestern California (Eisen et al. 2002, 2003, 2006) to identify predictors of periods of time within the year when nymphal density is elevated. We included sets of variables with different combinations of perceived predictive capacity and utility for being communicated to and used by the public. First we evaluated whether calendar week alone—a variable that readily can be used by the public—could define when ticks are active above different density thresholds. Next, we sought to determine if calendar week and woodland type—a set of combined variables that are still reasonably easy for the public to use—improved the nymphal density predictions for the fitted model. Finally, we determined if adding meteorological variables and elevation as possible predictors with calendar week and woodland type—combinations of which may be more accurate predictors of year-to-year variability in periods of elevated nymphal density linked to climate fluctuations, but less accessible for use by the public—predicted nymphal density as well as or better than the previous models.
Materials and Methods
Study area
Mendocino County, located in northwestern California (Fig. 1), covers an area of roughly 9,090 km2, yet is ecologically and climatically diverse. The county spans coastal areas along the Pacific Ocean and moving inland transitions from coastal redwood (Sequoia sempervirens (D. Don)) or Douglas fir (Pseudotsuga menziesii (Mirb.)) dominated forests to woodlands dominated by Quercus spp. oaks, Douglas fir and tan oak (Lithocarpus densiflorus (Hooker and Arnott)) in the central portion of the county, Quercus spp. oak woodlands in the comparatively hot and dry southeastern section, and Ponderosa pine (Pinus ponderosa P. and C. Lawson) in the northwest. Annual rainfall ranges from approximately 900 mm in the southeast to more than 2,000 mm in portions of western Mendocino County (Eisen et al. 2003). For all sites, the seasonal pattern of rainfall is unimodal, peaking during the winter months (December–March) and reaching its lowest values during the summer months, with nearly no rain falling in July and August. Across sites, maximum (minimum) daily average temperatures range from 10 to 14 °C (−1 to 4 °C) during December and January, and 28 to 32 °C (8 to 12 °C) during July and August; the coolest temperatures occur at the higher elevation sites in the northwest.
Fig. 1.
Locations of sampling sites throughout Mendocino County, California. Sites included in the buildset are indicated using shaded circles; validation sets 1 and 2 are indicated using white squares and hollow stars, respectively. Inset shows the location of Mendocino County within California.
Nymphal density data
Nymphal density data sets used to fit and validate the models described in the Results were described previously (Eisen et al. 2002, 2003, 2006). Briefly, the build data set consisted of 162 observations on the density of host-seeking nymphs based on nonremoval drag sampling of 12 leaf/fir-needle litter sites throughout Mendocino County that ranged in size from 750 to 1,935 m2 (Eisen et al. 2003). Sampling sites included Quercus spp. oak woodlands with Pacific madrone (Arbutus menziesii Pursh) (n = 3), Quercus spp. oak with Pacific madrone and Douglas fir (n = 6) and redwood–tanoak sites (n = 3). Each site was sampled biweekly from mid-March through early October in 2001 or until no nymphs were collected during two consecutive visits. Nymphs were active in all sites by mid-March and were collected until late June for all sites and into October in a quarter of the sites.
The three models (defined below as “week,” “week-woodland type” or “meteorological”) were evaluated using two independent data sets representing a spatially constrained but temporally expansive sampling scheme (Validation set 1; Eisen et al. 2002), and a spatially expansive but temporally constrained sampling scheme (Validation set 2; Eisen et al. 2006). Validation set 1 included 120 observations of nymphal densities based on nonremoval drag sampling conducted every 2–3 wk from mid- to late-March through August or September 1998–2001 in a 900 m2 leaf litter dominated Black oak (Quercus kelloggii Newb.) and madrone woodland site. Similar sampling was also conducted from mid-March to September 2000 in a Pacific madrone dominated woodland, a tanoak-dominated site and another redwood–tanoak site, each measuring 900 m2 (Eisen et al. 2002). Validation set 2 included 154 observations of nymphal densities based on removal drag sampling conducted on two occasions (28 April to 14 May 2004 and 17 May to 4 June 2004) during the peak of nymphal host-seeking at seventy-eight 750 m2 plots. All sampling was conducted in areas with a leaf or fir-needle substrate in the following vegetation categories: redwood (n = 10), pine (n = 11), hardwood (n = 19), hardwood/conifer (n = 22), mixed tanoak/madrone or Douglas fir with a redwood influence (n = 4), or tanoak (n = 12).
Predictors
Day and week of the calendar year and the year in which nymphal density was assessed were recorded at the time of nymphal sampling. Elevation was derived from a United States Geological Survey 1/3 arc-second (approximately 10 m) resolution digital elevation model (2009) based on each site’s longitude and latitude. Based on predominant tree species noted in the field, each site was assigned by the authors to one of three woodland types: hardwood, hardwood-conifer, or redwood–tanoak–pine. Meteorological variables included weekly averages of saturation vapor pressure deficit (mmHg), maximum and minimum temperature (°C), relative humidity (%), and incident shortwave radiation (W/m2; used as a location-specific proxy for daylength that takes into account the site’s sun-slope geometry). Growing degree days (GDDs), total degree days (TDDs) and rainfall (mm) were calculated as weekly totals and cumulative weekly values. A GDD was defined as the amount by which the average daily temperature exceeded a threshold of 10 °C; on days when the average temperature was below 10 °C, the GDD was set to zero. A TDD was defined similarly, except that values for days below 10 °C were not set to zero and therefore remained negative. Thus, GDDs provide a measure of warmth above a biologically relevant threshold of 10 °C (Eisen et al. 2016), whereas TDDs do the same but also account for the effects of cold temperatures below the threshold. The model was also fit with GDD and TDD calculated using a threshold of 0 °C in order to test the sensitivity of classification to the threshold used to define these variables. Meteorological variables were obtained from version 2.1 of the Daymet dataset of Thornton et al. (1997, 2014). Daymet is a 1 km by 1 km spatial resolution dataset covering the United States, Mexico and southern Canada, and spanning 1980 to present with daily temporal resolution. Data for the 1 km by 1 km pixel corresponding to the coordinates of each site were extracted using version 2.1 of the Daymet Multiple Coordinates Extractor (http://daymet.ornl.gov/tools.html).
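The degree-day definitions above can be illustrated with a short Python sketch. This is an illustration only and not the code used in the study (the analyses were run in R). It assumes that the daily average temperature is the midpoint of the daily maximum and minimum, which the text does not state, and all function names are hypothetical.

```python
from itertools import accumulate


def daily_degree_days(tmax_c, tmin_c, base_c=10.0):
    """Return (gdd, tdd) for a single day.

    Assumption: daily average temperature = (tmax + tmin) / 2.
    """
    excess = (tmax_c + tmin_c) / 2.0 - base_c
    gdd = max(excess, 0.0)  # days colder than the base contribute zero
    tdd = excess            # days colder than the base contribute negative values
    return gdd, tdd


def weekly_totals(daily_values, days_per_week=7):
    """Sum consecutive blocks of daily values into weekly totals."""
    return [sum(daily_values[i:i + days_per_week])
            for i in range(0, len(daily_values), days_per_week)]


def cumulative(weekly_values):
    """Running sum of weekly totals from the start of the series."""
    return list(accumulate(weekly_values))
```

Changing `base_c` to 0.0 gives the alternative 0 °C threshold used in the sensitivity check.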
All meteorological predictors were evaluated at four different time lags leading up to each nymphal tick collection date. Each time lag was separated by a week, starting at 4 wks prior to the collection and continuing to 1 wk prior. Data from the date of collection were not included, as this would preclude use as a predictive model. In total, 47 possible first order fixed predictors were included for model selection.
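As a minimal sketch of the lagging scheme described above, the following Python function looks up the value of one weekly predictor at 1 to 4 wk before a collection week. The dictionary layout and the function name are assumptions made for illustration and do not describe how the study data were stored.

```python
def lagged_predictors(weekly_series, collection_week, lags=(1, 2, 3, 4)):
    """Map each lag (in weeks) to the value observed that many weeks earlier.

    weekly_series: dict of {week_index: value} (hypothetical layout).
    The collection week itself is never used, so the result stays predictive.
    Weeks missing from the series yield None rather than a guessed value.
    """
    return {f"lag{k}": weekly_series.get(collection_week - k) for k in lags}
```

For example, with weekly values keyed by calendar week, `lagged_predictors(series, 14)` returns the values for weeks 13, 12, 11, and 10.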
Statistical analysis
The objective of each candidate model was to predict elevated and lower classifications of nymphal density in Mendocino County. Three generalized linear mixed models assuming binomial distributions were fit for classification using the build set and model performance was evaluated using the two independent validation sets described above. The three models, described in detail below, are referred to as “week,” “week-woodland type” or “meteorological”. Standard diagnostics were performed on all models to ensure model assumptions held. All analyses were run using R 3.1.2 statistical software (R Core Team 2014).
The response for all three candidate models was defined by dichotomizing the nymphal density into elevated (≥0.35 nymphs/100 m2) and low (<0.35 nymphs/100 m2) classifications. The threshold of 0.35 nymphs/100 m2 represents the lowest decile (10%) of the observed tick densities from Validation set 2 (Eisen et al. 2006). Most plots measured 750 m2; therefore, based on this low threshold, sites with more than 2 nymphs collected within a sampling period would be considered elevated (0.35 nymphs/100 m2 corresponds to approximately 2.6 nymphs per 750 m2). We later tested the sensitivity of the models to the selected threshold by fitting all three models using the second-decile threshold of 0.93 nymphs/100 m2 and comparing the percent of observations predicted correctly among the models.
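The dichotomization and the plot-size arithmetic can be checked with a few lines of Python. The threshold value comes from the text; the function names are hypothetical and the sketch is illustrative only.

```python
import math

THRESHOLD_PER_100_M2 = 0.35  # lowest decile of Validation set 2 densities (from the text)


def density_per_100_m2(nymph_count, area_m2):
    """Convert a drag-sample count to nymphs per 100 square meters."""
    return nymph_count / area_m2 * 100.0


def is_elevated(nymph_count, area_m2, threshold=THRESHOLD_PER_100_M2):
    """True when the observed density meets or exceeds the threshold."""
    return density_per_100_m2(nymph_count, area_m2) >= threshold


def min_count_for_elevated(area_m2, threshold=THRESHOLD_PER_100_M2):
    """Smallest whole number of nymphs that reaches the threshold on a plot."""
    return math.ceil(threshold * area_m2 / 100.0)
```

On a 750 m2 plot, `min_count_for_elevated(750)` gives 3, so two nymphs are classified as low and three as elevated.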
All quantitative predictors were standardized to have a mean of zero and variance of one so that they were expressed on a common, unitless scale of comparable magnitude. The candidate models differed in which fixed predictors were available for model selection. All climate and descriptive predictors were available for building the meteorological model; only two predictors, week and woodland type, were allowed for the week-woodland type model; and only week was available for the week model. In addition, all second-order interactions of all numeric variables were added to the list of possible predictors. Therefore, the models had 1,271, 4, and 2 possible coefficients to consider for model selection, respectively. For each model, site was included as a random effect.
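Standardization of a single predictor can be sketched in Python as follows. The use of the sample standard deviation (n − 1 denominator) is an assumption, because the text does not say which form was applied; the function name is hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores, (x - mean) / SD, for one quantitative predictor.

    Assumption: the sample SD (n - 1 denominator) is used.
    """
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant predictor")
    return [(x - mu) / sd for x in values]
```

Coefficients fitted to standardized predictors are expressed per standard deviation, so the original mean and SD are needed to convert a result back to its original units (for example, calendar weeks).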
The number of predictors available in the meteorological model was extremely high considering the building data set had only 162 observations. Therefore, variable selection was performed by the glmmLasso package (R Core Team 2014), which implements Least Absolute Shrinkage and Selection Operator (LASSO) regression (Tibshirani 1996). The estimate for lambda was chosen by performing a grid search on the integers varying from 1000 down to 40 and the lambda which produced the lowest Bayesian Information Criterion (BIC) score was selected. Because LASSO regression is known for arbitrarily selecting only one predictor from a highly correlated group, the glmmLasso function was executed 1,000 times with a randomized order of predictors. All significant coefficients (P-value < 0.1) that were repeated greater than 10% of the time by the glmmLasso function were kept. Following variable reduction for the meteorological model, all three candidate models were fit by using the glmer function in the lme4 package (R Core Team 2014). In addition, likelihood ratio tests were performed to examine if the variance for the random effect, site, should be allowed to vary by woodland type for the meteorological and week-woodland type models.
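The tallying step of the repeated-LASSO procedure can be sketched in Python as shown below. This does not reimplement glmmLasso or the lambda grid search; it only illustrates how predictors retained in more than 10% of the randomized runs might be identified. The input format and function name are assumptions.

```python
from collections import Counter


def stable_predictors(selected_per_run, min_fraction=0.10):
    """Return predictors retained in more than `min_fraction` of the runs.

    selected_per_run: list of sets, one per randomized run, each holding the
    names of predictors that the run kept as significant (hypothetical input).
    """
    n_runs = len(selected_per_run)
    counts = Counter(name for run in selected_per_run for name in set(run))
    return sorted(name for name, c in counts.items() if c / n_runs > min_fraction)
```

With 1,000 runs, a predictor would need to be retained in at least 101 runs to pass this filter.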
We defined elevated risk as when the probability of high nymphal density is at least 0.50. The sim function in the arm library (R Core Team 2014) was used to create 1000 realizations of each estimated model (Gelman and Hill 2007). For each realization, we solved for the start week and end week of elevated risk. This allowed for estimation of averages and prediction intervals (PIs) for the start and end of elevated risk.
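For a model with linear and squared terms of one standardized predictor, the start and end of elevated risk are the two roots of a quadratic on the logit scale. The Python sketch below solves for them using only the fixed effects (site random effect set to zero). It is an illustration under those assumptions, not the simulation-based procedure used in the study, and the function names are hypothetical.

```python
import math


def elevated_window(intercept, linear, quadratic, p_cut=0.50):
    """Solve intercept + linear*z + quadratic*z**2 = logit(p_cut) for z.

    z is the standardized predictor (e.g., standardized calendar week).
    Returns (start_z, peak_z, end_z), or None if the curve never reaches p_cut.
    Assumes quadratic < 0, so the probability rises to one peak and then falls.
    """
    if quadratic >= 0:
        raise ValueError("expected a negative quadratic coefficient")
    target = math.log(p_cut / (1.0 - p_cut))  # logit(0.5) = 0
    c = intercept - target
    disc = linear ** 2 - 4.0 * quadratic * c
    if disc < 0:
        return None
    root = math.sqrt(disc)
    z1 = (-linear + root) / (2.0 * quadratic)
    z2 = (-linear - root) / (2.0 * quadratic)
    peak = -linear / (2.0 * quadratic)
    return min(z1, z2), peak, max(z1, z2)


def to_original_scale(z, mean, sd):
    """Convert a standardized value back to original units (e.g., weeks)."""
    return mean + sd * z
```

Applied to the point estimates for the week model in Table 1 (`elevated_window(4.48, -1.86, -2.37)`), the roots lie near −1.82 and 1.04 standardized units, with the peak near −0.39. Converting these to calendar weeks requires the mean and SD of week in the build set, and the reported means and PIs are based on 1,000 simulated realizations rather than on this single calculation.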
In order to determine which of the three competing models best fit the buildset data, the following measurement tools were calculated: percent predicted correctly, Bayesian Information Criterion (BIC), sensitivity, specificity, positive and negative predictive values, and false positive and negative rates. In order to determine how well the best predicted models performed on independent validation sets, the same measurements except for BIC were calculated. BIC was excluded because we are no longer trying to decide the optimal fitted model; rather we are trying to determine which model predicts elevated and low classifications of nymphal density the best.
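The classification measures listed above (all except BIC) follow directly from the four cells of a confusion matrix. The Python sketch below is illustrative only; the function and key names are hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return performance measures (in percent) from confusion-matrix counts.

    tp / fn: elevated observations predicted elevated / predicted low.
    tn / fp: low observations predicted low / predicted elevated.
    """
    total = tp + fn + tn + fp

    def pct(num, den):
        return 100.0 * num / den

    return {
        "percent_correct": pct(tp + tn, total),
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "false_positive_rate": pct(fp, tn + fp),
        "false_negative_rate": pct(fn, tp + fn),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
    }
```

Using the counts reported for the week model on the build set in Table 2, `classification_metrics(tp=117, fn=5, tn=30, fp=10)` gives 90.7% correct, 95.9% sensitivity, 75.0% specificity, 92.1% PPV, and 85.7% NPV after rounding to one decimal place.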
Results
Week model
The best “week” model is shown in Table 1. Based on the twelve sites used to train the model, the week model predicted that the probability of a site being classified as having elevated nymphal density rose above the “elevated” threshold as early as calendar week 6.9 (mid-February) and as late as calendar week 12.0 (late March), with a mean of calendar week 8.9 (late February; 95% PI [4.6, 13.4]). Thereafter, the probability of a site being classified as having elevated nymphal density increased until calendar week 20.4 (mid- to late May; 95% PI [18.9, 21.4]) and then decreased. The probability of a site being classified as having elevated nymphal density decreased below the “elevated” threshold as early as calendar week 28.7 (mid-July) and as late as calendar week 33.8 (mid-August), with a mean of calendar week 31.9 (early August; 95% PI [27.3, 35.2]). Thus, the period of elevated nymphal density is expected to span a mean of 23 wk.
Table 1.
Fixed effect parameter estimates based on a threshold of 0.35 I. pacificus nymphs/100 m2
| Model | Model covariates | Parameter estimate | Parameter SE | Z value | P-value | PI for parameter estimate |
|---|---|---|---|---|---|---|
| Week | Intercept | 4.48 | 0.94 | 4.75 | <0.001 | (2.95, 6.83) |
| | Week | −1.86 | 0.44 | −4.21 | <0.001 | (−2.90, −1.12) |
| | Week² | −2.37 | 0.48 | −4.98 | <0.001 | (−3.45, −1.55) |
| Week—Woodland type | Intercept | 6.67 | 1.63 | 4.09 | <0.001 | (4.15, 10.9) |
| | Week | −5.58 | 1.41 | −3.98 | <0.001 | (−9.02, −3.30) |
| | Week² | −4.05 | 0.94 | −4.32 | <0.001 | (−6.36, −2.55) |
| | Week: Type of Woodland (redwood–tanoak–pine) | 2.57 | 1.45 | 1.76 | 0.08 | (−0.80, 5.60) |
| | Week: Type of Woodland (hardwood-conifer) | 4.39 | 1.28 | 3.42 | <0.001 | (2.24, 7.45) |
| Meteorological | Intercept | 4.96 | 1.28 | 3.87 | <0.001 | (3.00, 4.92) |
| | 4-week lagged GDD | −1.76 | 0.52 | −3.38 | <0.001 | (−3.02, −0.90) |
| | 4-week lagged GDD² | −2.70 | 0.64 | −4.23 | <0.001 | (−4.23, −1.64) |
All models include site as a random effect. GDD is measured relative to 0 °C.
When the model was tested using the buildset data, 90.7% of observations were correctly classified (Table 2). The model accurately predicted elevated nymphal density observations 95.9% of the time (sensitivity), yielding a false negative rate of 4.1%. However, specificity was lower with 75.0% of low nymphal density observations correctly classified, resulting in a false positive rate of 25%. In addition, when the model predicted an elevated nymphal density observation, it was correct 92.1% of the time (positive predictive value [PPV] = 92.1%). Predictions of sites with low nymphal density were accurate 85.7% of the time (negative predictive value [NPV] = 85.7%). Evaluation of the model using two independent validation sets yielded overall accuracies of 86.7% and 83.8% (Table 2).
Table 2.
Comparison of models based on a threshold of 0.35 I. pacificus nymphs/100 m2
| Model | BIC | % Predicted correctly [a] | Sensitivity [b] | Specificity [c] | False positive rate [d] | False negative rate [e] | Positive predictive value [f] | Negative predictive value [g] |
|---|---|---|---|---|---|---|---|---|
| Best fitted model | | | | | | | | |
| Week | 112.9 | 90.7 (147/162) | 95.9 (117/122) | 75.0 (30/40) | 25.0 (10/40) | 4.1 (5/122) | 92.1 (117/127) | 85.7 (30/35) |
| Week—Woodland type | 100.7 | 93.8 (152/162) | 97.5 (119/122) | 82.5 (33/40) | 17.5 (7/40) | 2.5 (3/122) | 94.4 (119/126) | 91.7 (33/36) |
| Meteorological | 111.5 | 92.0 (149/162) | 97.5 (119/122) | 75.0 (30/40) | 25.0 (10/40) | 2.5 (3/122) | 92.2 (119/129) | 90.9 (30/33) |
| Best predicted model for validation set 1 | | | | | | | | |
| Week | NA | 86.7 (104/120) | 88.6 (78/88) | 81.3 (26/32) | 18.8 (6/32) | 11.4 (10/88) | 92.9 (78/84) | 72.2 (26/36) |
| Week—Woodland type | NA | 82.5 (99/120) | 83.0 (73/88) | 81.3 (26/32) | 18.8 (6/32) | 17.0 (15/88) | 92.4 (73/79) | 63.4 (26/41) |
| Meteorological | NA | 85.0 (102/120) | 89.8 (79/88) | 71.9 (23/32) | 28.1 (9/32) | 10.2 (9/88) | 89.8 (79/88) | 71.9 (23/32) |
| Best predicted model for validation set 2 | | | | | | | | |
| Week | NA | 83.8 (129/154) | 87.1 (121/139) | 53.3 (8/15) | 46.7 (7/15) | 12.9 (18/139) | 94.5 (121/128) | 30.8 (8/26) |
| Week—Woodland type | NA | 83.8 (129/154) | 87.1 (121/139) | 53.3 (8/15) | 46.7 (7/15) | 12.9 (18/139) | 94.5 (121/128) | 30.8 (8/26) |
| Meteorological | NA | 77.9 (120/154) | 81.3 (113/139) | 46.7 (7/15) | 53.3 (8/15) | 18.7 (26/139) | 93.4 (113/121) | 21.2 (7/33) |

[a] Number of observations predicted correctly/total number of observations shown in parentheses.
[b] Percent (number of observations predicted elevated risk correctly/total number of actual elevated risk observations).
[c] Percent (number of observations predicted low risk correctly/total number of actual low risk observations).
[d] Percent (number of observations predicted elevated risk incorrectly/total number of actual low risk observations).
[e] Percent (number of observations predicted low risk incorrectly/total number of actual elevated risk observations).
[f] Percent (number of elevated risk observations/total number of observations predicted to be elevated).
[g] Percent (number of low risk observations/total number of observations predicted to be low).
Week-woodland type model
Table 1 shows the best “week-woodland type” model. Again, we used the observed data from the 12 sites in the model buildset to estimate when nymphal density was expected to increase or decrease for each site. The probability of a site being classified as having elevated nymphal density rose above the “elevated” threshold at a mean of calendar week 6.3 (early- to mid-February; 95% PI [4.0, 11.6]) for hardwood, calendar week 14.5 (early-April; 95% PI [8.4, 16.0]) for hardwood-conifer, and calendar week 9.8 (early-March; 95% PI [3.0, 13.3]) for redwood–tanoak–pine. The probability of a site being classified as having elevated nymphal density increased until calendar weeks 18.0 (early-May), 22.4 (late-May), and 20.4 (mid-May) for hardwood, hardwood-conifer, and redwood–tanoak–pine, respectively, and decreased thereafter. The probability decreased below the “elevated” threshold at a mean of week 29.7 (late July; 95% PI [24.7, 30.8]) for hardwood, week 35.3 (late August; 95% PI [28.9, 35.9]) for hardwood-conifer and week 31.3 (early-August; 95% PI [29.7, 34.3]) for redwood–tanoak–pine. Overall, the mean number of weeks predicted to be classified as having elevated nymphal density was 23.4 wk (range: 16.5–22.6) for hardwood, 20.8 wk (15.6–24.1) for hardwood-conifer, and 21.5 wk (21.5–23.9) for redwood–tanoak–pine.
When the model was tested using buildset data, 94% of observations were correctly categorized. Sensitivity and specificity of the model were 97.5% and 82.5%, respectively, yielding false negative and false positive rates of 2.5% and 17.5%, respectively. PPV and NPV were 94.4% and 91.7%, respectively. When the model was tested using independent validation sets, overall accuracy decreased to roughly 83%. Sensitivity and specificity ranged from 83.0% to 87.1% and 81.3% to 53.3%, respectively (Table 2).
Meteorological model
After variable selection using LASSO, the following predictors remained: the interaction between cumulative GDD at a 4 wk lag and cumulative TDD at a 4 wk lag, and the squared values of total rainfall at a 3 wk lag, cumulative GDD at a 4 wk lag, and cumulative TDD at a 4 wk lag. To guard against built-in restriction and bias, the first order terms of the squared predictors were also included as possible predictors. Among these coefficients, only the linear and squared terms of cumulative GDD were significant when the model was fit using the glmer function in the lme4 package (R Core Team 2014).
Next, the sensitivity to the definition of cumulative GDD was addressed by evaluating two models with 0 and 10 °C thresholds. The model with cumulative GDD at 0 °C threshold was chosen because the model had a better prediction rate and the BIC score was lower by 5.6.
The selected model indicates that the probability of an observation being classified as having elevated nymphal density increases with increasing GDD at a 4 wk lag until reaching a peak and thereafter decreases. Based on the twelve sites included in the model build set, a site has a 95% chance of being first classified as having high nymphal density when GDD is less than 512; the probability then increases until 4 wk after GDD reaches a mean of 1,133.8 and decreases thereafter. The probability of being classified as having elevated nymphal density reaches its minimum 4 wk after GDD reaches a mean of 2,370.1 (95% PI [1732.8, 2807.1]). We used the 1980–2010 weekly average GDDs for the 12 build sites to estimate the average week in which these GDD thresholds were reached. Nymphal densities became elevated at approximately calendar week 4.0 (late January; 95% PI [4.0, 13.8]) and reached their minimum at approximately calendar week 32.9 (mid-August; 95% PI [28.1, 36.1]); elevated nymphal density is therefore expected to span a mean of 28.9 wk.
When tested using the model buildset, 92.0% of observations were correctly classified. Sensitivity and specificity were 97.5% and 75.0%, yielding false negative and false positive rates of 2.5% and 25.0%, respectively. PPV and NPV were 92.2% and 90.9%, respectively. When tested using data from two independent validation sets, 85.0% and 77.9% of observations were correctly classified. Sensitivity and specificity ranged from 89.8% to 81.3% and 71.9% to 46.7%, respectively (Table 2).
Sensitivity of model performance to selected nymphal density threshold
All three models were sensitive to the nymphal density threshold used to dichotomize observations into elevated or low nymphal density for the binomial response. To illustrate this sensitivity, we show the results of each model using a threshold of 0.93 nymphs/100 m2, which represents the 20% quantile of the 2004 nymphal density sampling (Table 3). Models fit with the 0.93 nymphs/100 m2 threshold were weaker than models fit with the 0.35 nymphs/100 m2 threshold, as measured by higher BIC values, and were markedly less accurate for validation set 2 (59.1–63.6% vs. 77.9–83.8% of observations predicted correctly), whereas accuracy for validation set 1 changed little (82.5–85.8% vs. 82.5–86.7%) (Tables 2 and 4).
Table 3.
Fixed effect parameter estimates based on a threshold of 0.93 I. pacificus nymphs/100 m2
| Model | Model covariates | Parameter estimate | Parameter SE | Z value | P-value | PI for parameter estimate |
|---|---|---|---|---|---|---|
| Week | Intercept | 2.56 | 0.77 | 3.32 | <0.001 | (1.07, 4.36) |
| | Week | −1.78 | 0.39 | −4.60 | <0.001 | (−2.64, −1.10) |
| | Week² | −2.54 | 0.47 | −5.37 | <0.001 | (−3.59, −1.72) |
| Week—Woodland type | Intercept | 3.39 | 1.06 | 3.20 | <0.001 | (1.39, 5.92) |
| | Week | −4.95 | 1.08 | −4.57 | <0.001 | (−7.37, −3.07) |
| | Week² | −3.77 | 0.71 | −5.31 | <0.001 | (−5.38, −2.56) |
| | Week: Type of Woodland (redwood–tanoak–pine) | 3.81 | 1.10 | 3.47 | <0.001 | (1.84, 6.20) |
| | Week: Type of Woodland (hardwood-conifer) | 3.17 | 1.00 | 3.19 | <0.001 | (1.37, 5.32) |
| Meteorological | Intercept | 2.42 | 0.75 | 3.23 | 0.001 | (0.96, 4.20) |
| | L4.gddC | −1.48 | 0.37 | −4.00 | <0.001 | (−2.29, −0.81) |
| | L4.gddC² | −2.67 | 0.53 | −5.03 | <0.001 | (−3.84, −1.72) |
All models include site as a random effect.
Table 4.
Comparison of models based on a threshold of 0.93 I. pacificus nymphs/100 m2
| Model | BIC | % predicted correctly [*] | Sensitivity (%) [**] | Specificity (%) [***] | False positive rate (%) [****] | False negative rate (%) [*****] |
|---|---|---|---|---|---|---|
| Best fitted model | | | | | | |
| Week | 146.8 | 85.2 [138/162] | 90.1 [82/91] | 78.9 [56/71] | 21.1 [15/71] | 9.9 [9/91] |
| Week—Woodland type | 138.4 | 92.0 [149/162] | 94.5 [86/91] | 88.7 [63/71] | 11.3 [8/71] | 5.5 [5/91] |
| Meteorological | 150.2 | 86.4 [140/162] | 91.2 [83/91] | 80.3 [57/71] | 19.7 [14/71] | 8.8 [8/91] |
| Best predicted model for validation set 1 | | | | | | |
| Week | NA | 85.0 [102/120] | 83.8 [57/68] | 86.5 [45/52] | 13.5 [7/52] | 16.2 [11/68] |
| Week—Woodland type | NA | 82.5 [99/120] | 82.4 [56/68] | 82.7 [43/52] | 17.3 [9/52] | 17.6 [12/68] |
| Meteorological | NA | 85.8 [103/120] | 89.7 [61/68] | 80.8 [42/52] | 19.2 [10/52] | 10.3 [7/68] |
| Best predicted model for validation set 2 | | | | | | |
| Week | NA | 59.1 [91/154] | 63.6 [75/118] | 44.4 [16/36] | 55.6 [20/36] | 36.4 [43/118] |
| Week—Woodland type | NA | 62.3 [96/154] | 67.8 [80/118] | 44.4 [16/36] | 55.6 [20/36] | 32.2 [38/118] |
| Meteorological | NA | 63.6 [98/154] | 68.6 [81/118] | 47.2 [17/36] | 52.8 [19/36] | 31.4 [37/118] |
Discussion
In Mendocino County, California, the accuracy of classifying density of I. pacificus nymphs as elevated or low was similar among models that were based on either week of the calendar year alone, a combination of calendar week and woodland type, or time-lagged cumulative GDD. This is an intriguing finding because it indicates that predictors that can be easily understood by the public—calendar week or a combination of calendar week and broad forest type—have predictive values similar to more complex and less easily understood models that also incorporate meteorological data. Our analyses thus suggest that messaging to alert members of the public in Mendocino County as to when they should increase their vigilance in taking precautionary measures to prevent nymphal tick bites can be based on calendar week.
Our models sought not only to identify when seasonal host-seeking activity commences, but also when it diminishes to low levels. Overall, we report a roughly 5–7 month period in Mendocino County during which host-seeking nymphal densities exceed a low threshold value. However, accuracy of the models is threshold-dependent and is greater when the threshold is set at a lower nymphal density. This finding indicates that the models perform sufficiently for predicting a minimal level of host-seeking activity, but not for identifying periods of time when host-seeking nymphal densities are peaking or elevated above a very minimal threshold level. Host-seeking activity commences when diapause (behavioral or morphogenetic) is broken, and ceases when it is induced. Both the induction and cessation of diapause are linked with day length and temperature (Padgett and Lane 2001; Belozerov 2009; Eisen et al. 2016; Gray et al. 2016). These factors were captured directly (GDD) or indirectly (week is correlated with day length) in our models. In contrast, host abundance, which was not included in our models, can be an important factor to determine the local density of ticks (LoGiudice et al. 2003; Ostfeld et al. 2006). It therefore is not surprising that our models were less accurate for a higher, as compared with a lower, nymphal density threshold. Notably, in the week-woodland type model elevated nymphal activity was predicted to span approximately 2–3 wk longer in hardwood compared with the other woodland types. This finding may seem counterintuitive compared with an earlier study (Eisen et al. 2003) that reported an abbreviated peak period of nymphal questing activity in hot, dry hardwood-dominated sites compared with cooler and moister hardwood-conifer or redwood sites. 
These apparently contradictory findings may result from using different thresholds for nymphal abundance in the previous study and the present one: because nymphal abundance is typically higher in hardwoods, compared with redwood-dominated sites (Eisen et al. 2002, 2003, 2006), a lower proportion of the ticks present need to be active in the hardwood sites in order to cross the minimal threshold set in our modeling.
Given that week and woodland type were shown here to be good predictors of nymphal host-seeking activity, one may ask why these variables were not included in the meteorological model. Closer examination of the relationships among predictors revealed that GDD at a 4 wk lag was very strongly correlated with calendar week (R = 0.98). Within each of the three woodland type categories, there was considerable variation among sites in time-lagged GDD. Therefore, the site-specific heat indices were more informative than broad woodland categories.
Each of the three models yielded similar overall accuracies in correctly categorizing observations into elevated or low nymphal density and for each, sensitivity was higher than specificity. As a result, the false positive rate was higher than the false negative rate. Thus, misclassification errors would generally favor assuming ticks were present when they were not. Notably, all three models performed better when tested using validation set one compared with validation set two. Specifically, specificity of all models was lower when tested using validation set two compared with the buildset or validation set one. We hypothesize that this arose because validation set one had a similar sampling design to the build set, with biweekly sampling throughout the active season for nymphs and sampling sites located in geographically similar areas to the build set. By contrast, validation set two was collected during the peak of nymphal activity and spanned a much broader geographical region and a greater diversity of environmental conditions than the buildset.
Indeed, closer inspection of sites in validation set two yielding misclassifications revealed that most of the false positive sites were located in the far west or the most northern sites, primarily in redwood–tanoak–pine woodlands, where peak nymphal densities were generally lower than in other parts of the county. However, false positives were observed throughout the study area and may arise either through misclassification by the model, or may be the result of the daily variability in host-seeking activity combined with the fact that a drag sample collects only a small proportion (<10%) of the I. pacificus nymphs present within the sampled substrate (Tälleklint-Eisen and Lane 2000). Said another way, in some instances, nymphs may have been present, but not collected on drags. Notably, because the data used in validation set two were collected during the peak of nymphal host-seeking activity, the number of observations below the set threshold was low (n = 15). Therefore, although only up to 8 sites were misclassified as positive, this yields a high percentage of false positives. Examining false negatives in validation set two, the majority of misclassifications occurred in the later of the two observations per site. Specifically, for the meteorological, week, and week-woodland type models, 77% (20/26), 89% (16/18), and 94% (17/18) of false negative observations were the second of the two observations per site, suggesting that in some instances, the models anticipate tick numbers to wane earlier in the year than they actually do.
We conclude that in Mendocino County, particularly in the southern portion of the county in hardwood and hardwood-conifer woodlands, calendar week adequately described the period of time during which nymphal ticks were actively host-seeking. Although the week-woodland type and the meteorological models performed similarly to the week model, categorizing woodlands is subjective and interpretation of meteorological data is more challenging than using calendar week alone. However, we emphasize that each of the three models may have particular advantages over the others, depending on the application. For example, the meteorological model might be more useful than calendar week when extrapolating to new geographical areas of limited size (e.g., a different county) or to larger spatial scales (states or entire tick species distributional areas) because tick behavior is often strongly influenced by temperature and moisture. The meteorological model is also the only model capable of resolving year-to-year variations in periods of elevated nymphal host-seeking activity that result from inter-annual climatic variability, and also could be employed to examine how projected climatic warming during the 21st century may alter the timing of future I. pacificus nymphal host-seeking behavior.
Acknowledgments
Collection of previously published data upon which these models were based was supported in part by a cooperative agreement (U50/CCU906594) from the Centers for Disease Control and Prevention and by a grant from the National Institutes of Health (AI22501) to Robert S. Lane. We thank Kalanthe Horiuchi for helpful discussions and technical support.
References Cited
- Belozerov V. N. 2009. Diapause and quiescence as two main kinds of dormancy and their significance in life cycles of mites and ticks (Chelicerata: Arachnida: Acari). Acarina. 17: 3–32.
- Diuk-Wasser M. A., Gatewood A. G., Cortinas M. R., Yaremych-Hamer S., Tsao J., Kitron U., Hickling G., Brownstein J. S., Walker E., Piesman J., et al. 2006. Spatiotemporal patterns of host-seeking Ixodes scapularis nymphs (Acari: Ixodidae) in the United States. J. Med. Entomol. 43: 166–176.
- Eisen L., Eisen R. J., Lane R. S. 2002. Seasonal activity patterns of Ixodes pacificus nymphs in relation to climatic conditions. Med. Vet. Entomol. 16: 235–244.
- Eisen R. J., Eisen L., Lane R. S. 2006. Predicting density of Ixodes pacificus nymphs in dense woodlands in Mendocino County, California, based on geographic information systems and remote sensing versus field-derived data. Am. J. Trop. Med. Hyg. 74: 632–640.
- Eisen R. J., Eisen L., Castro M. B., Lane R. S. 2003. Environmentally related variability in risk of exposure to Lyme disease spirochetes in northern California: effect of climatic conditions and habitat type. Environ. Entomol. 32: 1010–1018.
- Eisen R. J., Eisen L., Ogden N. H., Beard C. B. 2016. Linkages of weather and climate with Ixodes scapularis and Ixodes pacificus (Acari: Ixodidae), enzootic transmission of Borrelia burgdorferi, and Lyme Disease in North America. J. Med. Entomol. 53: 250–261.
- Gatewood A. G., Liebman K. A., Vourc'h G., Bunikis J., Hamer S. A., Cortinas R., Melton F., Cislo P., Kitron U., Tsao J., et al. 2009. Climate and tick seasonality are predictors of Borrelia burgdorferi genotype distribution. Appl. Environ. Microb. 75: 2476–2483.
- Gelman A., Hill J. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, New York, NY.
- Gray J. S., Kahl O., Lane R. S., Levin M. L., Tsao J. I. 2016. Diapause in ticks of the medically important Ixodes ricinus species complex. Ticks Tick Borne Dis. in press.
- Lane R. S., Kleinjan J. E., Schoeler G. B. 1995. Diel activity of nymphal Dermacentor occidentalis and Ixodes pacificus (Acari, Ixodidae) in relation to meteorological factors and host activity periods. J. Med. Entomol. 32: 290–299.
- LoGiudice K., Ostfeld R. S., Schmidt K. A., Keesing F. 2003. The ecology of infectious disease: effects of host diversity and community composition on Lyme disease risk. Proc. Natl. Acad. Sci. U. S. A. 100: 567–571.
- Ostfeld R. S., Canham C. D., Oggenfuss K., Winchcombe R. J., Keesing F. 2006. Climate, deer, rodents, and acorns as determinants of variation in Lyme-disease risk. PLoS Biol. 4: e145.
- Padgett K. A., Lane R. S. 2001. Life cycle of Ixodes pacificus (Acari: Ixodidae): timing of developmental processes under field and laboratory conditions. J. Med. Entomol. 38: 684–693.
- Piesman J., Eisen L. 2008. Prevention of tick-borne diseases. Annu. Rev. Entomol. 53: 323–343.
- R Core Team. 2014. A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria: (http://www.R-project.org) (accessed 2 September 2016).
- Tälleklint-Eisen L., Lane R. S. 2000. Efficiency of drag sampling for estimating population sizes of Ixodes pacificus (Acari: Ixodidae) nymphs in leaf litter. J. Med. Entomol. 37: 484–487.
- Thornton P. E., Thornton M. M., Mayer B. W., Wilhelmi N., Wei Y., Devarakonda R., Cook R. B. 2014. Daymet: Daily Surface Weather Data on a 1-km Grid for North America, Version 2. Data set. Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, USA: (http://daac.ornl.gov) (accessed 16 January 2014).
- Thornton P. E., Running S. W., White M. A. 1997. Generating surfaces of daily meteorological variables over large regions of complex terrain. J. Hydrol. 190: 214–251.
- Tibshirani R. 1996. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B. 58: 267–288.
- U.S. Geological Survey (USGS). 2009. National Elevation Data (NED), USGS, Sioux Falls, SD. http://nationalmap.gov.
- Vail S. G., Smith G. 1998. Air temperature and relative humidity effects on behavioral activity of blacklegged tick (Acari: Ixodidae) nymphs in New Jersey. J. Med. Entomol. 35: 1025–1028.

