Table 1.
Target-prediction feature group | Total variables available for prediction models<sup>a</sup> | Data source<sup>b</sup> | Data description | Variable examples |
---|---|---|---|---|
Demographics | 149 main effects; 15 second-order terms | United States Census Bureau | Per capita information on race, gender, family size, education, income, etc. | Per capita percent of: (1) adults 25 years+ with college degree; (2) Black household heads |
Adult & Child Health Characteristics | 146 main effects; 15 second-order terms | Centers for Disease Control & Prevention | Per capita prevalence of select diseases, health behaviors, etc. | Per capita percent of: (1) children (less than 18 years) with asthma; (2) adults (18 years+) with overweight BMI |
Community Characteristics | 151 main effects; 15 second-order terms | American Community Survey | Per capita employment information, housing characteristics, etc. | Per capita percent of: (1) travel time between 30 and 59 min; (2) median age and size of home |
Consumer Expenditures | 571 main effects; 57 second-order terms | Bureau of Labor Statistics | Per capita expenditures on food, household goods, and miscellaneous items | Per capita expenditures on: (1) beef or red meat; (2) infant snowsuits or jackets |
<sup>a</sup> A total of 1119 target-prediction features, each belonging to 1 of 4 groups, were made available to the machine learning prediction model pipeline. Both main effects and second-order terms were used in parametric prediction models, whereas second-order terms were excluded from nonparametric prediction models. Variables from each of the 4 groups were available for all eligible U.S. counties.
<sup>b</sup> Data were extracted from DataPlanet©, which aggregates public-domain and licensed data. The original governmental data source is listed in the table.
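
As a quick arithmetic check on the feature totals reported in the table, the following minimal sketch tallies the per-group counts and applies the rule that second-order terms are available only to parametric models. The names `FEATURE_COUNTS` and `n_features` are hypothetical and introduced here purely for illustration; they are not part of the original pipeline.

```python
# Counts copied from Table 1: (main effects, second-order terms) per group.
# The dictionary name and layout are illustrative assumptions.
FEATURE_COUNTS = {
    "Demographics": (149, 15),
    "Adult & Child Health Characteristics": (146, 15),
    "Community Characteristics": (151, 15),
    "Consumer Expenditures": (571, 57),
}


def n_features(model_family: str) -> int:
    """Return the number of candidate features for a model family.

    'parametric' models receive main effects plus second-order terms;
    'nonparametric' models receive main effects only.
    """
    if model_family not in ("parametric", "nonparametric"):
        raise ValueError(f"unknown model family: {model_family!r}")
    include_second_order = model_family == "parametric"
    return sum(
        main + (second if include_second_order else 0)
        for main, second in FEATURE_COUNTS.values()
    )


print(n_features("parametric"))     # 1119
print(n_features("nonparametric"))  # 1017
```

Under this reading, the 1119 total corresponds to 1017 main effects plus 102 second-order terms, so nonparametric models would draw on 1017 candidate features.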