2009 Dec 28;76(5):1486–1496. doi: 10.1128/AEM.02288-09

TABLE 5.

Seasonal E. coli (target variable) regression tree split criteriaᵃ for terminal nodesᵇ

| Data set (total no. of samples)ᶜ | Root node split criterion | Secondary split criterionᵈ | Median E. coli density (CFU 100 ml⁻¹) ± MAD | No. of samples |
| --- | --- | --- | --- | --- |
| E. coli densities for spring (595) | NUD_CATTLE BARN, >2.29 km | DPAST_2K, >1.26 obs. km⁻² | 58 ± 1,040 | 190 |
| | NUD_CATTLE BARN, >2.29 km | DPAST_2K, ≤1.26 obs. km⁻² | 20 ± 206 | 315 |
| | NUD_CATTLE BARN, ≤2.29 km | SHREVE, >5.0 | 108 ± 258 | 61 |
| | NUD_CATTLE BARN, ≤2.29 km | SHREVE, ≤5.0 | 230 ± 516 | 29 |
| E. coli densities for summer (750) | SHREVE, >15.5 | NUD_CATTLE BARN, ≤3.65 km | 112 ± 114 | 87 |
| | SHREVE, >15.5 | NUD_CATTLE BARN, >3.65 km | 20 ± 88 | 383 |
| | SHREVE, ≤15.5 | NUD_PAST, ≤1.16 km | 468 ± 3,650 | 141 |
| | SHREVE, ≤15.5 | NUD_PAST, >1.16 km | 174 ± 521 | 139 |
| E. coli densities for fall (575) | FORAGEP_2K, >0.39 km² km⁻² | NA | 230 ± 608 | 64 |
| | FORAGEP_2K, ≤0.39 km² km⁻² | NUD_FORAGE, >2.95 km | 24 ± 84 | 73 |
| | FORAGEP_2K, ≤0.39 km² km⁻² | NUD_FORAGE, ≤2.95 km | 78 ± 526 | 438 |
ᵃ Root node split criterion, the variable and condition by which all the data were divided into two nodal groupings (child nodes); secondary split criterion, the variable and condition by which the child nodes derived from the root nodal split were divided (for purposes of brevity, we present only results up to this tree level in this study). The variables that define the E. coli split criteria are described in Table 1.

ᵇ Terminal nodes are data groupings where no further splitting occurs.

ᶜ All data sets are for 2004 to 2007. Winter data are not included because CART could not cross-validate that seasonal data set, owing to a lack of data structure.

ᵈ obs., observations; NA, not applicable, since there was no terminal node of data at that level in the tree model.
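
To illustrate how the root and secondary split criteria route a sample to a terminal node, the spring tree can be read as two nested comparisons. The following is a minimal Python sketch, not the CART software used in the study. The function name, the `TerminalNode` type, and the argument names are hypothetical; the thresholds, medians, MAD values, and sample counts are copied from the spring rows of the table, and the variables themselves are those defined in Table 1.

```python
from typing import NamedTuple


class TerminalNode(NamedTuple):
    """Summary of one terminal node, as reported in the table (hypothetical type)."""
    median_cfu_per_100ml: int  # median E. coli density, CFU per 100 ml
    mad: int                   # MAD value reported alongside the median
    n_samples: int             # number of samples in the node


def spring_terminal_node(nud_cattle_barn_km: float,
                         dpast_2k_obs_per_km2: float,
                         shreve: float) -> TerminalNode:
    """Route a spring sample through the root and secondary splits.

    Argument names are illustrative stand-ins for the table variables
    NUD_CATTLE BARN (km), DPAST_2K (obs. per km^2), and SHREVE.
    """
    # Root node split: NUD_CATTLE BARN at 2.29 km
    if nud_cattle_barn_km > 2.29:
        # Secondary split on this branch: DPAST_2K at 1.26 obs. per km^2
        if dpast_2k_obs_per_km2 > 1.26:
            return TerminalNode(58, 1040, 190)
        return TerminalNode(20, 206, 315)
    # Secondary split on the other branch: SHREVE at 5.0
    if shreve > 5.0:
        return TerminalNode(108, 258, 61)
    return TerminalNode(230, 516, 29)


# Example: NUD_CATTLE BARN of 3.0 km and DPAST_2K of 0.5 obs. per km^2
node = spring_terminal_node(3.0, 0.5, 2.0)
print(node.median_cfu_per_100ml, node.n_samples)
```

With these inputs the sample falls in the node with a median of 20 CFU 100 ml⁻¹ (315 samples). The four spring node counts sum to the 595 samples listed for that data set; the summer and fall trees could be encoded the same way.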