Abstract
Walkability is a popular and ubiquitous term at the intersection of urban planning and public health. As the number of potential walkability measures grows in the literature, there is a need to compare their relative importance for specific research objectives. This study demonstrates a classification and regression tree (CART) model to compare five familiar measures of walkability from the literature for their relative ability to predict whether or not walking occurs in a dataset of objectively measured locations. When analyzed together, the measures had moderate-to-high accuracy (87.8% agreement: 65.6% of true walking GPS-measured points classified as walking and 93.4% of non-walking points as non-walking). On its own, the most well-known composite measure, Walk Score, performed only slightly better than measures of the built environment composed of a single variable (transit ridership, employment density, and residential density), suggesting there may be contexts where transparent and longitudinally available measures of urban form are worth a marginal tradeoff in prediction accuracy. This demonstration of CART for the comparison of walkability measures also highlights the importance for public health and urban design researchers to think carefully about how and why particular walkability measures are used.
Keywords: walkability, physical activity, health, decision tree
1. INTRODUCTION
The relationship between the built (i.e., human-made) environment and health outcomes through physical activity is of tremendous interest in public health and urban design literature (1-10). Many studies identify characteristics of the built environment that are associated with physical activity, particularly emphasizing walking as a widespread population-level means of getting physical activity (4, 9-18). In an effort to succinctly characterize these environments, numerous researchers have combined measures of the built environment into composite indices of walkability (19-25). However, measures of walkability in the public health literature are diverse in their construction (13, 19, 21, 26-28) and conceptualization (29-31). Due to immense heterogeneity, researchers need to be able to compare measures and critically evaluate whether a chosen measure of walkability provides the necessary attributes and interpretation to address their research objective.
Walkability definitions in the literature can be grouped into two key clusters (29): means and outcomes. Most traditional walkability measures are related to characteristics of the urban form that provide the means of a walkable environment (e.g., traversable, compact, safe, and physically enticing) (29). However, another important set of definitions pertains to the perceived outcomes of walking, which considers an environment to be walkable because it leads to certain outcomes (e.g., socialization, sustainable transportation, and physical activity). Defining walkable areas as those where walking is occurring may be of particular interest to researchers and policymakers focused on health outcomes at an ecological level. For example, a walk-inducing definition might be used to explain interaction by walkability in an ecological study of air pollution and hypertension or diabetes (32, 33).
However, even with a more specific activity-focused definition, the choice of walkability measures is not clear. Tremendous research has been done to quantify walkability, including emerging measures that integrate global positioning system (GPS), geographic information system (GIS), and accelerometry technologies (4, 9). Obtaining the data required to implement many of these measures is daunting, if not impossible, in many urban settings. For example, while some countries have systematically collected street-level data at the national level (34), data collection in the United States is largely a patchwork local effort. While the American Community Survey is a high quality national dataset, it is limited to administrative boundaries (35). Given the diversity of settings in which outcome-focused walkability research is of interest, an easy-to-use method of comparing available measures of walkability could allow researchers to sort through potential measures more strategically.
When data elements are available, the attributes of certain variables may make a measure more or less viable for a given research question. Three attributes an investigator might consider include ease of interpretation, availability (e.g., potential for replication in other settings and over time), and resolution of data (e.g., U.S. Census block group vs. latitude/longitude point). Composite indices have been proposed as improvements over single-variable measures (19) and are a useful proxy for services in settings where service data are not widely collected or easily available. However, composite indices have limitations that may be important depending on the specific study design. For example, Walk Score (36) is a proprietary algorithm that is difficult to evaluate in detail, and the National Walkability Index (37) is only available at the Census block group level (Table 1). Composite indices also are not useful for prompting interventions or policy changes (e.g., a 1-unit change in Walk Score cannot be acted upon). In contrast, single-variable measures of population density are commonly available and reproducible in multiple cities and time points, but their relative utility in predicting walking is less clear. Among the many measures available in the literature, employment density and residential density are both direct measurements of built environment characteristics that are widely used as recognizable proxies for walkability or as major components of walkability indices (1, 3, 8, 29, 38, 39). Although they represent different constructs of the built environment, their data are both broadly available and longitudinally collected (16).
Table 1: Attributes of selected walkability measures.
Measure | Ease of interpretation: single-variable or composite | Resolution: source data level | Resolution: analysis data level | Reproducibility: transparent | Reproducibility: consistent | Reproducibility: longitudinal |
---|---|---|---|---|---|---|
Walk Score (36) | Composite | Unknown | Point level | No | No | With subscription |
National Walkability Index (37) | Composite | Census block group | Census block group | Yes | Yes | Potentially¹ |
Transit ridership | Single-variable | Stop-level | Point level | Yes | Yes | Yes |
Employment density | Single-variable | Parcel data | Point level | Yes | Yes | Potentially² |
Residential density | Single-variable | Parcel data | Point level | Yes | Yes | Potentially² |
¹ Not currently available but may be updated in the future using subsequent census information for additional time frames; they would still be limited in frequency of updates.
² Yet because these are important measures of social and economic environment at the local level, it is reasonable to expect that they will continue to be available longitudinally.
Other measures, such as transit ridership, are readily available reflections of built environment attributes that walkability metrics traditionally comprise, such as land use mix and development density (21, 40). While residential unit and employment densities reflect the area composition and thus can act as surrogates for destinations in a given area, transit ridership is a direct count of at least some of the foot traffic occurring at and around transit stops (41). One benefit to ridership data is that they are collected by transit agencies for planning and evaluation purposes, on a routine basis, and thus are available longitudinally. Recorded as a value per transit stop, ridership also captures aspects of place through which people walk that may not score as high by other walkability metrics (e.g., the vicinity of a park-and-ride or along a thoroughfare between high-density neighborhoods with parking constraints).
Given these different considerations and the availability of diverse measures, there is a need for more comparisons of the relative importance of different walkability measures. Comparisons will better inform researchers’ prioritization of measures for indicating characteristics of locations where people walk (e.g., How much more correlated to walking does a measure need to be to be chosen over one that is more interpretable or reproducible?). Classification and regression tree (CART) analysis offers a user-friendly decision tree approach for ranking variable importance in a unified model. CART is a nonparametric statistical procedure that examines all possible independent variables and selects the one that splits data into binary groups that are most different with respect to the outcome. The methodology allows for the comparison of measures that are collinear and accepts the non-random relationship between urban locations through which people travel (e.g., because street layouts are typically spatially restricted in the division of public and private property). CART has been underutilized in public health, possibly because of a general lack of awareness of the utility of CART procedures (42).
This article illustrates the utility of CART for comparing how potential measurements of walkability perform in an example dataset of objectively measured outdoor walking at a large number of GPS-recorded locations in King County, WA.
2. MATERIALS AND METHODS
2.1. Data: Secondary use of Travel Assessment and Community study (TRAC) data
The study takes advantage of data previously collected and analyzed as part of a well-studied cohort (16, 43-46) of personal monitoring data in the Travel Assessment and Community (TRAC) study, which measured physical activity and travel patterns in King County, Washington before and after the installation of a light rail transit system. TRAC was a three-wave cohort study that recruited adults living in King County, stratified such that a roughly equal number of participants resided less than one mile and more than one mile from a planned light rail stop, in otherwise similar built environments with similar demographic characteristics (47). Enrollment and data collection methodology have been detailed previously (43). Briefly, participants were mailed the study instruments and instructed to wear an accelerometer (ActiGraph GT3X), carry a GPS unit, and record their travel in a paper diary for 7 consecutive days. Accelerometers were set to aggregate counts in 30-second epochs and GPS devices were set to collect data at 30-second intervals. Data from the three instruments were matched by time of recording (48). Participants also answered survey questions to provide self-reported sociodemographic variables. The analysis presented here used accelerometer and GPS data collected from participants between May 1, 2012 and August 30, 2013 (TRAC study, wave 3) because the GPS technology from this measurement wave (QStarz BT-1000XT) contained satellite information that could be used to determine whether subjects were indoors or outdoors (49). The Institutional Review Board at the Seattle Children’s Research Institute approved the project.
2.1.1. Case points: Identifying locations in TRAC dataset through which people walked
The full methodology used for pre-processing TRAC data has been previously described (48, 50). Briefly, data were restricted to those recorded on valid days, defined as ≥1 location recorded in the travel diary, ≥1 GPS record, and an accelerometer wear time of ≥8 hours following protocols from Troiano et al. (51). After linking data from GPS, accelerometer, and travel-diary and excluding data collected on non-valid days during the cohort period, 8,815,812 GPS points were recorded by TRAC participants. GPS-recorded latitude/longitude locations (points) recorded on valid days were first partitioned into those that were or were not part of light-to-vigorous physical activity bouts (Figure 1). Bouts of physical activity were defined according to accelerometer count thresholds (50). A physical activity bout was defined as an interval with at least seven minutes of >500 counts per 30-second epoch (cpe), allowing for up to two minutes below that threshold. For example, an 8-minute physical activity bout might begin with 3 minutes >500 cpe, have 2 minutes of <500 cpe, and conclude with another 3 minutes >500 cpe. This definition allowed for brief breaks common in urban walking (e.g., stopping at intersections).
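The bout rule described above can be illustrated with a minimal sketch. It is not the TRAC pre-processing code: it assumes one reading of the definition, in which a bout starts and ends on an epoch above 500 cpe, lasts at least 7 minutes (14 epochs), and tolerates runs of at most 2 consecutive minutes (4 epochs) below the threshold. The function name `find_bouts` and the toy count series are hypothetical.

```python
THRESHOLD_CPE = 500   # counts per 30-second epoch (from the text)
MIN_BOUT_EPOCHS = 14  # 7 minutes of 30-second epochs
MAX_GAP_EPOCHS = 4    # 2 minutes; ASSUMED to mean consecutive sub-threshold epochs


def find_bouts(counts):
    """Return inclusive (start, end) epoch indices of candidate activity bouts."""
    bouts = []
    start = None        # first active epoch of the current run
    last_active = None  # most recent active epoch of the current run
    for i, c in enumerate(counts):
        if c <= THRESHOLD_CPE:
            continue
        if start is None:
            start = i
        elif i - last_active - 1 > MAX_GAP_EPOCHS:
            # The pause was too long: close the previous run and start a new one.
            if last_active - start + 1 >= MIN_BOUT_EPOCHS:
                bouts.append((start, last_active))
            start = i
        last_active = i
    if start is not None and last_active - start + 1 >= MIN_BOUT_EPOCHS:
        bouts.append((start, last_active))
    return bouts


# The 8-minute example from the text: 3 min active, 2 min below, 3 min active.
example = [600] * 6 + [100] * 4 + [600] * 6
example_bouts = find_bouts(example)

# A 2.5-minute pause splits the series into two runs that are each too short.
too_long_pause = [600] * 6 + [100] * 5 + [600] * 6
no_bouts = find_bouts(too_long_pause)
```

Under this reading the 8-minute example is detected as a single bout spanning all 16 epochs, whereas a longer pause yields no bout. Other readings of the tolerance (for instance, a cap on the total sub-threshold time per bout) would need a different loop.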
Figure 1: Selection of GPS-recorded latitude/longitude location points for CART demonstration dataset.
Description of data points selected from GPS-recorded latitude/longitude (X/Y) coordinates recorded by participants in the TRAC study during wave 3 (2012-2013). After exclusions, 22,900 locations recorded by people walking and 91,600 locations recorded by people who were non-active and moving faster than walking speed remained, for a total of 114,500 locations for analyses. (1) Valid days were defined as having ≥1 location recorded in the travel diary, ≥1 GPS record, and an accelerometer wear time of ≥8 hours. (2) Physical activity bouts were identified from GPS and accelerometer data and classified as walking and non-walking physical activity according to methodology detailed in Kang et al., 2013 (50).
Of the more than 8.8 million recorded points, 68,810 were determined to be part of physical activity bouts according to accelerometer counts. Any points recorded indoors [33.4%; indicated by a satellite signal-to-noise ratio ≤250 (49)] or lacking sufficient satellite information by which to make the determination (0.4%) were excluded because walking in an indoor location is assumed not to be influenced by the built environment exposures of interest. Due to built environment data limitations, 2,584 (3.4%) points recorded outside the geographic boundaries of King County were also excluded. Of the remaining 42,951 valid outdoor points, 20,051 (46.7%) were excluded because they had an attributed speed greater than 6 km/h or accelerometer counts greater than 2,863 cpe, suggesting that higher intensity physical activity other than walking was taking place (50). The remaining GPS-recorded locations, recorded during walk bouts (i.e., along walking routes), were thus considered to be locations where walking took place and are subsequently referred to as “case” points (n=22,900).
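As an illustration only, the exclusion cascade for case points can be written as a single predicate. The thresholds come from the text; the record layout (`BoutPoint` and its field names) is hypothetical and does not reflect the TRAC data schema.

```python
from dataclasses import dataclass
from typing import Optional

INDOOR_SNR_MAX = 250        # signal-to-noise ratio at or below this is treated as indoors
WALK_SPEED_MAX_KMH = 6.0    # faster points are treated as non-walking activity
WALK_COUNTS_MAX_CPE = 2863  # higher counts suggest more intense activity


@dataclass
class BoutPoint:
    """One GPS point inside a physical activity bout (hypothetical layout)."""
    snr: Optional[float]  # None when satellite information is insufficient
    in_king_county: bool
    speed_kmh: float
    counts_cpe: int


def is_case_point(p):
    """Apply the exclusions in the order given in the text."""
    if p.snr is None or p.snr <= INDOOR_SNR_MAX:
        return False  # indoors, or indoor/outdoor status unknown
    if not p.in_king_county:
        return False  # no built environment data outside the county
    return p.speed_kmh <= WALK_SPEED_MAX_KMH and p.counts_cpe <= WALK_COUNTS_MAX_CPE


points = [
    BoutPoint(snr=400, in_king_county=True, speed_kmh=4.5, counts_cpe=1500),   # kept
    BoutPoint(snr=200, in_king_county=True, speed_kmh=4.5, counts_cpe=1500),   # indoors
    BoutPoint(snr=None, in_king_county=True, speed_kmh=4.5, counts_cpe=1500),  # unknown
    BoutPoint(snr=400, in_king_county=False, speed_kmh=4.5, counts_cpe=1500),  # outside county
    BoutPoint(snr=400, in_king_county=True, speed_kmh=9.0, counts_cpe=1500),   # too fast
    BoutPoint(snr=400, in_king_county=True, speed_kmh=4.5, counts_cpe=3200),   # too intense
]
cases = [p for p in points if is_case_point(p)]

# Arithmetic check of the final step reported in the text.
remaining_case_points = 42_951 - 20_051  # 22,900
```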
2.1.2. Control points: Identifying locations in TRAC dataset through which people are not walking
Controls were prepared in procedures parallel to those used for cases. After excluding all points comprising physical activity bouts (Figure 1), the remaining data points that occurred on valid days (approximately 8.7 million) were further censored using the following exclusions: missing satellite information (0.2%), recorded at speeds >160 km/h (suggesting an error in GPS recording system) (0.3%), and recorded indoors (5.4%). Locations with a speed greater than 6 km/h or adjacent in time (i.e., within 60 seconds prior or subsequent) to a recorded point with a speed greater than 6 km/h (n=679,147) were selected for the analysis. This definition captures locations where a person is traveling at a ‘faster-than-walking’ speed (i.e., movement by a mode other than walking) but may be stopping occasionally (e.g., at bus stops or traffic lights). To improve data processing efficiency, 91,600 points were randomly selected from the full sample to use in the analyses (for a ratio of four control locations to one case location). The final set of points used in analyses is thus comprised of 22,900 case locations and 91,600 control locations for a total of 114,500 locations.
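The control definition can be sketched as follows. This is a sketch under stated assumptions, not the study code: points are represented as `(timestamp_s, speed_kmh)` tuples sorted by time for a single participant, the 60-second window is treated as inclusive, and the function names and random seed are hypothetical.

```python
import random
from bisect import bisect_left


def select_controls(points, speed_kmh=6.0, window_s=60):
    """Keep points that are fast, or within window_s of a fast point.

    points: list of (timestamp_s, speed_kmh), sorted by timestamp.
    """
    fast_times = [t for t, s in points if s > speed_kmh]
    selected = []
    for t, s in points:
        # First fast timestamp that is not earlier than t - window_s.
        i = bisect_left(fast_times, t - window_s)
        if i < len(fast_times) and fast_times[i] <= t + window_s:
            selected.append((t, s))
    return selected


def sample_controls(candidates, n_cases, ratio=4, seed=0):
    """Randomly draw ratio * n_cases control points without replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    return rng.sample(candidates, ratio * n_cases)


# Toy trace at 30-second intervals: stationary, a short vehicle trip, stationary.
trace = [(0, 0), (30, 0), (60, 0), (90, 0), (120, 40), (150, 45),
         (180, 0), (210, 0), (240, 0), (270, 0)]
candidates = select_controls(trace)
controls = sample_controls(candidates, n_cases=1)
```

In the toy trace the two fast points and the stationary points within one minute of them are retained, which mirrors the stated intent of capturing brief stops during non-walking travel. With 22,900 case points, a 4:1 ratio gives the 91,600 controls reported above.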
2.2. Walkability measures for comparison: Calculating five pre-selected measures for all case and control points
Five walkability measures with an array of attributes were selected from the literature: (i) Walk Score, (ii) National Walkability Index (NWI), (iii) transit ridership, (iv) employment density, and (v) residential density. These five were chosen not to maximize explanatory power of a model to predict walking behavior, but rather as examples of different constructs and combinations of critical attributes for a useful walkability measure. These attributes are: 1) straightforward to generate and interpret (i.e., single variable and/or high-resolution source data); 2) replicable across geographic spaces and over time in the U.S. (i.e., a transparent and consistent algorithm with available longitudinal data); and 3) available at a fine resolution (i.e., latitude/longitude coordinate resolution) (Table 1).
Walk Score and NWI were selected for analysis because they are two well-known, nationally available indices of walkability. Each combines multiple aspects of the built environment to provide a scaled composite measure. Commercially available, easy to use with its 100-point scale, and nearly ubiquitous in real estate listings, Walk Score has often been used as a comparator for other research-based walkability indices (25, 36, 52, 53). However, its primary limitations include its lack of transparency and, relatedly, difficulty in assessing consistency. At least one major change in algorithm was disclosed in 2014; thus, a longitudinal study using Walk Score as a measure of the built environment would be inconsistent pre- and post-2014 (54). Walk Score values were obtained for each GPS point using the Walk Score application programming interface via the walkscoreAPI R package (55).
NWI is a product of the Environmental Protection Agency’s Smart Location Database and was recently reported for its association with self-reported walking in urban areas (24). The index quantifies each 2010 U.S. Census block group’s walkability relative to others in the United States, on a scale from 1 to 20 (37). In this study, each point was assigned the NWI value of the Census block group in which the point was measured. The NWI measure at a point is therefore equal to the spatially averaged built environment values across an entire Census block group, regardless of where in that block group the point occurs. This exemplifies a limitation of walkability measures that are only available at an aggregated or administrative boundary level, rather than at the specific location or local neighborhood experienced by someone moving through space. The NWI is currently only available using 2010 Census data.
Residential unit data were extracted from tax parcel polygon data available through King County GIS and the King County assessor (56). Residential density was quantified as the number of residential units per hectare within a quarter-mile radius of each GPS point. Quarter-mile buffers were used to represent the area accessible within an approximately 5-minute walk, thus emphasizing the immediate environment around each measured location. Employment data were developed from employment sector data from the Washington Employment Security Department. Commercial tax parcel data (56) were assigned to employment sectors and the count of employees per parcel was estimated by the ratio of individual parcel areas to the total area of parcels in the sector. Density was quantified as the number of employees per hectare within a quarter-mile radius of each GPS point.
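For scale, a quarter-mile radius encloses roughly 50.9 hectares, so a density of 10 units per hectare corresponds to about 509 units within the buffer. The sketch below shows the density calculation in its simplest point-centric form. It assumes projected coordinates in meters and represents each parcel by a single centroid, which is a simplification of the parcel polygons described in the text; all names and the toy parcels are hypothetical.

```python
import math

QUARTER_MILE_M = 0.25 * 1609.344                        # 402.336 m
BUFFER_AREA_HA = math.pi * QUARTER_MILE_M ** 2 / 10_000  # about 50.85 ha


def density_per_hectare(point_xy, parcels, radius_m=QUARTER_MILE_M):
    """Units (or employees) per hectare within radius_m of a point.

    parcels: iterable of (x_m, y_m, count); centroids are an assumption.
    """
    px, py = point_xy
    total = sum(n for x, y, n in parcels
                if math.hypot(x - px, y - py) <= radius_m)
    return total / (math.pi * radius_m ** 2 / 10_000)


# Toy parcels: two inside the buffer, one 500 m away (outside).
parcels = [(0.0, 0.0, 100), (300.0, 0.0, 50), (500.0, 0.0, 1000)]
residential_density = density_per_hectare((0.0, 0.0), parcels)
```

Here 150 units fall inside the buffer, giving just under 3 units per hectare.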
Transit ridership was quantified as the average number of weekday boardings and alightings at all stops within a quarter-mile radius of each GPS point, as recorded during annual observations and provided by King County Metro, the County's transportation agency. Residential unit, employment, and transit ridership data were assessed for 2012-2013, the years in which travel data were collected, and were operationalized using a series of raster layers (spatially continuous surfaces of grid cells) called SmartMaps (48). SmartMaps are raster data sets generated by using focal processes in geographic information system (GIS) software to produce neighborhood-level summaries of built environment characteristics for all locations in a study area using a predefined focal radius. SmartMaps enable bulk measurement of environmental variables for large GPS-point data sets, for which point-centric buffer analyses would be computationally infeasible.
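The idea behind a focal raster can be conveyed with a small sketch: the neighborhood summary is computed once per grid cell, after which each GPS point needs only a constant-time lookup. This is a generic focal-sum illustration, not the SmartMaps implementation; the function names, the grid origin at (0, 0), and the cell size are assumptions.

```python
def focal_sum(grid, radius_cells):
    """Sum cell values within a circular window around every cell."""
    rows, cols = len(grid), len(grid[0])
    # Offsets of all cells whose centers lie within the focal radius.
    offsets = [(dr, dc)
               for dr in range(-radius_cells, radius_cells + 1)
               for dc in range(-radius_cells, radius_cells + 1)
               if dr * dr + dc * dc <= radius_cells * radius_cells]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = sum(grid[r + dr][c + dc]
                            for dr, dc in offsets
                            if 0 <= r + dr < rows and 0 <= c + dc < cols)
    return out


def value_at(surface, x, y, cell_size):
    """Look up the precomputed value for a point (grid origin assumed at 0, 0)."""
    return surface[int(y // cell_size)][int(x // cell_size)]


# Toy raster: one transit stop with a single boarding in the center cell.
stops = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
surface = focal_sum(stops, radius_cells=1)
ridership_here = value_at(surface, x=15.0, y=5.0, cell_size=10.0)
```

With a one-cell radius the single stop is counted in its own cell and its four edge neighbors but not in the diagonal cells, whose centers lie beyond the radius.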
2.3. Statistical analyses: Comparison of variable importance by CART analysis
Measures of participant sociodemographic characteristics were obtained from survey responses collected immediately prior to wearing of devices in 2012-2013 and included in analyses as follows: age (years), gender (categorical: male, female), race (categorical: White, African American, Asian, Native American or Alaska Native, Pacific Islander, Other, or Multiple), number of household vehicles, and annual household income (categorical, in US dollars: <$10,000, $10,000-19,000, $20,000-29,000, $30,000-39,000, $40,000-49,000, $50,000-59,000, $60,000-69,000, $70,000-79,000, $80,000-89,000, $90,000-99,000, >$100,000).
Data were partitioned by random sampling into a training data set (70%) and a test data set (30%). The training set was used to generate the decision tree by CART and to generate variable importance estimates. The test set was used for testing each variable’s predictive ability. CART analysis is a forward-selection model-building approach that selects, at each step, the variable and cutoff threshold with the strongest association with the outcome (walking) and performs a binary split of the data according to that variable. Further partitioning occurs within prior partitions, thus forming “branches” that represent strata of points that are maximally segregated into groupings of case and control points. Unlike a linear regression model, CART analysis keeps track of splits during the process such that a variable may not serve as the “primary splitter” (i.e., the determining factor that is visible in the resulting decision tree diagram) but can accumulate importance over a series of splits and thus be accounted for in the outcome of the final tree. This variable importance metric was used to explore which of the walkability measures was best for classifying GPS-measured locations as walking or non-walking, while accounting for the full set of participant sociodemographic characteristics (age, gender, household vehicle count, race, and income). The CART method is especially useful for this type of prediction because it can perform well even when partitioning on small data subsets (e.g., minority racial groups instead of White vs. non-White) and when measures exhibit high collinearity (57).
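A single step of the split search described above can be sketched in a few lines. The sketch assumes Gini impurity as the splitting criterion, which the text does not specify, and uses a hypothetical four-point data set; it illustrates the general CART step and is not the fitted model.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = walking)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (feature, threshold, impurity decrease) of the best binary split.

    rows: list of dicts mapping feature name to a numeric value.
    """
    n = len(labels)
    parent = gini(labels)
    best = (None, None, 0.0)
    for feat in rows[0]:
        values = sorted({r[feat] for r in rows})
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feat] < thr]
            right = [y for r, y in zip(rows, labels) if r[feat] >= thr]
            gain = (parent
                    - len(left) / n * gini(left)
                    - len(right) / n * gini(right))
            if gain > best[2]:
                best = (feat, thr, gain)
    return best


# Hypothetical points: walk_score separates the classes, res_density does not.
rows = [{"walk_score": 30, "res_density": 5},
        {"walk_score": 45, "res_density": 12},
        {"walk_score": 80, "res_density": 6},
        {"walk_score": 90, "res_density": 11}]
labels = [0, 0, 1, 1]
feature, threshold, gain = best_split(rows, labels)
```

In this toy case the search selects `walk_score` at a threshold of 62.5, which produces two pure groups. A full tree repeats the same search within each resulting partition.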
The goal of this analysis was to assess the relative importance of each walkability measure in classifying whether a GPS-recorded point was a case (walk) or a control (non-walk, non-physical activity) point. Importance was determined using the rpart (58) package in R (version 3.6.0), which defines importance for each variable as the sum of the decreases in impurity (of the resulting classification groups) attributable to that variable across nodes. A variable may appear in a tree many times, as either a primary or surrogate variable. The overall measure of variable importance, therefore, is the sum of the goodness of split measures for each split in which the variable serves as a primary variable, plus each split in which it is a surrogate. Thus, variable importance is a cumulative measure of how well a variable partitions data points into each outcome group (i.e., the degree to which the groups generated at each split in a decision tree contain only one type of outcome). This sum is then scaled to 100% across all variables in the model such that the value is interpretable as a relative percent importance for classifying points across the overall tree. In our analysis, the tree was set to continue ‘growing’ until subsequent splits did not improve the purity of the classes by more than 0.0001 (the complexity parameter). By setting this value near zero, our analysis seeks to minimize classification errors in the analyzed data set.
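The aggregation and scaling step can be sketched as follows. The split records are invented for illustration and are not values from the fitted tree. The `weight` field is an assumption: a weight of 1.0 reproduces the plain sum described above, while a smaller weight can represent a surrogate split that only partly agrees with the primary split.

```python
from collections import defaultdict


def relative_importance(splits):
    """Sum goodness of split per variable and scale the totals to 100%.

    splits: iterable of (variable, goodness_of_split, weight).
    """
    totals = defaultdict(float)
    for variable, goodness, weight in splits:
        totals[variable] += goodness * weight
    grand_total = sum(totals.values())
    return {v: 100 * t / grand_total for v, t in totals.items()}


# Hypothetical splits: walk_score appears once as primary, once as surrogate.
splits = [("walk_score", 30.0, 1.0),
          ("age", 50.0, 1.0),
          ("walk_score", 10.0, 0.5),          # surrogate, partial credit (assumed)
          ("employment_density", 15.0, 1.0)]
importance = relative_importance(splits)
```

In this example `walk_score` accumulates 35 of 100 units even though it is the primary splitter only once, which is how a measure can rank highly without dominating the tree diagram.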
After building and analyzing the CART model with the training data set, the performance of each walkability measure was assessed independently using the remaining test data set for sensitivity (i.e., the probability that the model correctly classified a case location as walking) and specificity (i.e., the probability that the model correctly classified a control location as non-walking).
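These performance measures follow directly from the confusion-matrix counts. The short sketch below uses the counts reported for the all-measures model in Table 3; the function name is hypothetical.

```python
def classification_metrics(tn, tp, fp, fn):
    """Sensitivity, specificity, and percent agreement from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true walking points classified as walking
    specificity = tn / (tn + fp)  # share of control points classified as control
    agreement = 100 * (tp + tn) / (tn + tp + fp + fn)
    return sensitivity, specificity, agreement


# Counts for the "All measures" model in Table 3.
sens, spec, agree = classification_metrics(tn=25642, tp=4524, fp=1823, fn=2368)
summary = (round(sens, 3), round(spec, 3), round(agree, 1))
```

The rounded values match the sensitivity, specificity, and percent agreement reported for that model (0.656, 0.934, and 87.8%).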
2.3.1. Sensitivity analysis
Although the inclusion criteria for participants included a required baseline physical activity level (“able to walk unassisted for at least 10 minutes”) (43), the possibility that some participants might have personal characteristics that make them less susceptible to influences of the built environment on whether or not they walked (i.e., individual traits that disinclined them towards walking) was also considered. Everyone in the study population contributed control points (i.e., traveled by means other than walking), but not everyone contributed case points (i.e., walked). Therefore, we conducted a sensitivity analysis using the subset of locations recorded by participants (n=277; 53.1%) who contributed both case and control points: 55,715 control and 21,047 case points.
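The restriction used in the sensitivity analysis amounts to keeping points from participants who appear in both groups. A minimal sketch, with a hypothetical `(participant_id, is_case)` record layout:

```python
def restrict_to_both(points):
    """Keep points from participants who contributed case AND control points.

    points: list of (participant_id, is_case) tuples.
    """
    case_ids = {pid for pid, is_case in points if is_case}
    control_ids = {pid for pid, is_case in points if not is_case}
    keep = case_ids & control_ids
    return [(pid, is_case) for pid, is_case in points if pid in keep]


# Toy data: participant "a" has both kinds of points, "b" only controls.
points = [("a", True), ("a", False), ("b", False), ("b", False)]
subset = restrict_to_both(points)

# Share of the 522 unique participants retained, as reported in the text.
share_retained = round(100 * 277 / 522, 1)
```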
3. RESULTS
3.1. Descriptive statistics of points in analysis data set
Summary statistics describing attributes of location points are provided in Table 2. Each participant contributed a median of 201.5 points to the analysis data set. The median number of control points and case points per participant was 167 (IQR: 111-226) and 8.5 (IQR: 0-57.5), respectively. Walking points were predominantly recorded in the course of utilitarian trips (79.0%) vs. recreational trips (21.0%).
Table 2: Point-level characteristics.
Characteristic | Control points (N=91600) | Case (walk) points (N=22900) | Total (N=114500) |
---|---|---|---|
Walk Score (scale 0-100) | |||
Mean (SD) | 74.2 (25.2) | 77.5 (24.6) | 74.9 (25.1) |
Median [Min, Max] | 83.0 [0, 100] | 87.0 [0, 100] | 84.0 [0, 100] |
National Walkability Index (scale 1-20) | |||
Mean (SD) | 16.0 (2.69) | 16.5 (2.56) | 16.1 (2.67) |
Median [Min, Max] | 16.3 [1.83, 20.0] | 17.0 [4.83, 20.0] | 16.5 [1.83, 20.0] |
Transit Ridership (average weekday boardings and alightings within radius) | |||
Mean (SD) | 54.1 (141) | 67.6 (156) | 56.8 (144) |
Median [Min, Max] | 9.80 [0, 987] | 12.7 [0, 982] | 10.2 [0, 987] |
Employment Density (jobs per hectare) | |||
Mean (SD) | 38.4 (84.7) | 49.2 (95.2) | 40.6 (87.0) |
Median [Min, Max] | 7.43 [0, 489] | 7.95 [0, 490] | 7.53 [0, 490] |
Residential Density (residential units per hectare) | |||
Mean (SD) | 10.2 (11.1) | 13.0 (12.0) | 10.7 (11.3) |
Median [Min, Max] | 6.18 [0, 55.7] | 7.95 [0, 55.7] | 6.57 [0, 55.7] |
Age (years)i | |||
Mean (SD) | 55.1 (12.3) | 53.8 (12.1) | 54.8 (12.3) |
Median [Min, Max] | 56.0 [26.0, 88.0] | 51.0 [26.0, 84.0] | 55.0 [26.0, 88.0] |
Missing | 2539 (2.8%) | 921 (4.0%) | 3460 (3.0%) |
Vehicles (per household)i | |||
Mean (SD) | 1.59 (0.944) | 1.53 (0.990) | 1.58 (0.953) |
Median [Min, Max] | 1.00 [0, 7.00] | 1.00 [0, 7.00] | 1.00 [0, 7.00] |
Missing | 6605 (7.2%) | 3320 (14.5%) | 9925 (8.7%) |
Gender i | |||
Female | 55535 (60.6%) | 12660 (55.3%) | 68195 (59.6%) |
Male | 35663 (38.9%) | 10054 (43.9%) | 45717 (39.9%) |
Missing | 402 (0.4%) | 186 (0.8%) | 588 (0.5%) |
Race i | |||
White | 74572 (81.4%) | 19626 (85.7%) | 94198 (82.3%) |
African American | 6946 (7.6%) | 922 (4.0%) | 7868 (6.9%) |
Asian | 4942 (5.4%) | 1075 (4.7%) | 6017 (5.3%) |
Native American/Alaskan Native | 36 (0.0%) | 0 (0%) | 36 (0.0%) |
Pacific Islander | 318 (0.3%) | 10 (0.0%) | 328 (0.3%) |
Other | 223 (0.2%) | 18 (0.1%) | 241 (0.2%) |
More than One | 3067 (3.3%) | 651 (2.8%) | 3718 (3.2%) |
Not Reported | 1094 (1.2%) | 412 (1.8%) | 1506 (1.3%) |
Missing | 402 (0.4%) | 186 (0.8%) | 588 (0.5%) |
Income i | |||
<$10,000 | 4587 (5.0%) | 814 (3.6%) | 5401 (4.7%) |
$10,000 - $19,000 | 4638 (5.1%) | 1288 (5.6%) | 5926 (5.2%) |
$20,000 - $29,000 | 5404 (5.9%) | 1638 (7.2%) | 7042 (6.2%) |
$30,000 - $39,000 | 5221 (5.7%) | 1459 (6.4%) | 6680 (5.8%) |
$40,000 - $49,000 | 8165 (8.9%) | 1890 (8.3%) | 10055 (8.8%) |
$50,000 - $59,000 | 8965 (9.8%) | 2105 (9.2%) | 11070 (9.7%) |
$60,000 - $69,000 | 6028 (6.6%) | 1206 (5.3%) | 7234 (6.3%) |
$70,000 - $79,000 | 6717 (7.3%) | 937 (4.1%) | 7654 (6.7%) |
$80,000 - $89,000 | 4930 (5.4%) | 2596 (11.3%) | 7526 (6.6%) |
$90,000 - $99,000 | 7286 (8.0%) | 1142 (5.0%) | 8428 (7.4%) |
>$100,000 | 26065 (28.5%) | 6795 (29.7%) | 32860 (28.7%) |
Missing | 3594 (3.9%) | 1030 (4.5%) | 4624 (4.0%) |
Walking Purpose | |||
Utilitarian | Not recorded | 18097 (79.0%) | -- |
Recreational | Not recorded | 4803 (21.0%) | -- |
Unique participants | 513 | 286 | 522 |
Points per participant (median; IQR) | 167.0 (111.0-226.0) | 8.5 (0.0-57.5) | 201.5 (128.5-279.0) |
i Participant-level characteristic that was assigned to every GPS-recorded location point contributed by the participant who recorded it.
3.2. CART analysis
The relative importance of the five walkability measures for the classification of a location point as a case (walking) or control (non-active, not walking) point is shown in Figure 2, along with values for the sociodemographic variables. The relative importance value is a cumulative measure of how well a variable classified points across all splits in the decision tree, scaled to 100%; it is used here to identify which measure best captures the way people are using a specific location in space (i.e., walking versus traveling by other means). Setting aside the values of splits made on sociodemographic factors (since they are intrinsic to the person and not reflective of external built environment influences), Walk Score was the top predictor of walking, followed in order by employment density, transit ridership, residential density, and NWI. This rank order of variable importance was also found for the classification of walk points as being utilitarian versus recreational (data not shown).
Figure 2: Relative importance of walkability measures.
Graphed bars display relative importance of walkability measures for classification of points as walking (scaled to 100%). The tree shown was used to determine the importance measure for each variable by calculating the goodness of split value at each node of the decision tree diagram (complexity parameter=0.0001), summed across the whole tree per variable. Colored nodes of the tree indicate the walkability measure that had the highest goodness of split value at that split: employment density (light blue), residential density (yellow), transit ridership (dark blue), NWI (red), and Walk Score (green). White boxes correspond to sociodemographic characteristics. Each terminal node is designated with W (case walk point) or O (control non-walk point).
Among the sociodemographic variables, age was the top predictor of whether or not someone would be walking at a given location, while the other three factors were less predictive than walkability measures. The full model (including all five measures and four sociodemographic factors) was relatively good at classifying walking points (87.8% agreement): accurately classifying 65.6% of true walking points as walking and 93.4% of control points as controls. Models of each walkability measure individually generated similar percent agreements, ranging from 84.6% for residential density to 86.8% for Walk Score.
The sensitivity of each of the individual-measure models was approximately 50% (range: 49.3% for residential density to 55.2% for Walk Score), indicating that, of the locations where walking occurred, approximately half were classified correctly. Specificity was higher: approximately 94% of control locations were classified correctly (range: 93.5% for residential density to 94.8% for Walk Score). Full results are shown in Table 3.
Table 3: Accuracy of CART prediction models with each measure individually for the prediction of walking at a given location. Arranged in order of sensitivity.
Walkability measure* | TN | TP | FP | FN | Sensitivity | Specificity | Percent Agreement |
---|---|---|---|---|---|---|---|
All measures | 25642 | 4524 | 1823 | 2368 | 0.656 | 0.934 | 87.8% |
Walk Score | 26033 | 3802 | 1432 | 3090 | 0.552 | 0.948 | 86.8% |
National Walkability Index | 25733 | 3573 | 1732 | 3319 | 0.518 | 0.937 | 85.3% |
Transit ridership | 25714 | 3548 | 1751 | 3344 | 0.515 | 0.936 | 85.2% |
Employment density | 25697 | 3462 | 1768 | 3430 | 0.502 | 0.936 | 84.9% |
Residential density | 25673 | 3398 | 1792 | 3494 | 0.493 | 0.935 | 84.6% |
TP: True positive (true walk point (case), correctly predicted as case); TN: True negative (true control point, correctly predicted as control); FP: False positive (misclassification of true control point as case point); FN: False negative (misclassification of true case point as control point)
*All models also included the following predictors: age, gender, household vehicle count, race, and income.
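The accuracy columns in Table 3 follow directly from the four confusion-matrix counts. As a minimal sketch (the class and variable names are illustrative and not part of the original analysis), the reported values can be recomputed from the TN, TP, FP, and FN columns:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts in the column order used by Table 3."""
    tn: int
    tp: int
    fp: int
    fn: int

    @property
    def total(self) -> int:
        return self.tn + self.tp + self.fp + self.fn

    @property
    def sensitivity(self) -> float:
        # Share of true walking points classified as walking.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of true control points classified as controls.
        return self.tn / (self.tn + self.fp)

    @property
    def agreement(self) -> float:
        # Share of all points classified correctly.
        return (self.tn + self.tp) / self.total


# Counts copied from Table 3.
TABLE_3 = {
    "All measures": Confusion(25642, 4524, 1823, 2368),
    "Walk Score": Confusion(26033, 3802, 1432, 3090),
    "National Walkability Index": Confusion(25733, 3573, 1732, 3319),
    "Transit ridership": Confusion(25714, 3548, 1751, 3344),
    "Employment density": Confusion(25697, 3462, 1768, 3430),
    "Residential density": Confusion(25673, 3398, 1792, 3494),
}

for name, c in TABLE_3.items():
    print(f"{name:28s} sensitivity={c.sensitivity:.3f} "
          f"specificity={c.specificity:.3f} agreement={c.agreement:.1%}")
```

Every row sums to the same number of validation points, so the percent agreement values are directly comparable across models.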
In a sensitivity analysis, we limited the dataset to only those locations for which the recording participants contributed both case and control points. This selection resulted in a sample with a larger ratio of walking points to non-walking points (~1:3). The ranking of relative measures of importance was the same as the full analysis (Figure S-1). The sensitivity analysis also showed a similar pattern of agreement for prediction validation but slightly lower values: 83.0% for the model with all measures and 77.3% to 80.9% for individual measure models (Table S-1).
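The participant-level restriction used in this sensitivity analysis can be sketched as a simple filter. This is an illustration only: the record layout and the field names `participant_id` and `is_walk` are hypothetical, and the toy records are not study data.

```python
from collections import defaultdict


def restrict_to_mixed_participants(points):
    """Keep points recorded by participants who contributed both
    walking (case) and non-walking (control) points."""
    labels_seen = defaultdict(set)
    for point in points:
        labels_seen[point["participant_id"]].add(point["is_walk"])
    mixed = {pid for pid, labels in labels_seen.items()
             if labels == {True, False}}
    return [point for point in points if point["participant_id"] in mixed]


# Toy records (hypothetical): A has both kinds of points,
# B has only controls, C has only walking points.
points = [
    {"participant_id": "A", "is_walk": True},
    {"participant_id": "A", "is_walk": False},
    {"participant_id": "A", "is_walk": False},
    {"participant_id": "B", "is_walk": False},
    {"participant_id": "B", "is_walk": False},
    {"participant_id": "C", "is_walk": True},
]

subset = restrict_to_mixed_participants(points)
n_walk = sum(p["is_walk"] for p in subset)
n_other = len(subset) - n_walk
print(f"kept {len(subset)} of {len(points)} points; "
      f"walk:non-walk = {n_walk}:{n_other}")
```

Dropping participants who contributed only control points is what shifts the ratio of walking to non-walking points upward in the restricted sample.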
4. DISCUSSION
This study introduces classification and regression tree models as an innovative approach to compare the relative value of five walkability measures (Walk Score, NWI, transit ridership, employment density, and residential density) for predicting the likelihood of walking at a given built environment location. We find Walk Score to be the leading predictor of walking, followed by employment density, transit ridership, residential density, and NWI. However, similarities in variable importance, sensitivity, and specificity values across the five walkability measures suggest that single-variable measures of the built environment (transit ridership, employment density, and residential density) may be useful alternatives for characterizing walkability in contexts where their other attributes (e.g., longitudinal availability, ease of interpretation) are important to the research objective. Importantly, the similar variable importance was observed in a dataset of objectively measured GPS-based locations and walking activity; such datasets are limited but growing in the active mobility literature (59-66).
Further exploration of walkability measures in other urban settings where walking data are available will be useful. Regardless of which walkability measures are compared or how many elements of the built environment are included in a walkability index, their use inherently either highlights or obscures important aspects of the heterogeneity in urban form, as illustrated by the nine examples and corresponding walkability measures in Figure 3. It follows that, however measured, these obscured influences affect individuals’ exposures to the built environment and their resultant likelihood of walking. This aligns with recent findings of Liao and colleagues (38), who used a different methodology (multiple regression models) and outcome (self-reported travel frequency) for comparing walkability measures but generally found poor predictive validity of their new data-driven walkability measure and two of the early composite measures (3, 19). Thus we echo other researchers’ calls for careful consideration in the choice of a walkability measure during study design (29, 31, 67, 68).
Figure 3: Examples of heterogeneity of urban form with similar walkability measures.
The table provides walkability measures corresponding to the images above (credits: Google Maps, by Google). Walkability measures with similar numerical values are highlighted with similar colors (blue = residential density ~5.9; green = “Very Walkable” Walk Score; orange = “Car dependent” Walk Score).
1) 0-100%; 2) employees per ha within ¼ mile; 3) transit boardings and alightings at all stops within ¼ mile; 4) residential units per ha within ¼ mile; 5) Range of values: 1-20.
Consideration of different measures is especially important because evidence from several studies suggests that walkability measures focused on destinations may mistakenly weight destinations that are not actually important to different subgroups in the same population (69) (e.g., single older adults vs. families with young children) and may vary in effect by underlying sociodemographic differences between neighborhoods (70). For example, a review of research using Walk Score found that the measure’s correlation with physical activity outcomes was sensitive to the gender of the walker (71). It also found that most papers using Walk Score included supplemental measures of the built environment in order to better describe the multiple dimensions of walkability in their research (71).
Results from the present CART analysis show that single-variable measures with practical attributes may correlate with objectively measured walking about as well as Walk Score does. When there are logistical and conceptual tradeoffs to different walkability measures, as there frequently are in designing a research question, comparing measures is important for validating whether the benefit is sufficient to outweigh other data interpretation, availability, and resolution limitations. Here we offer CART as a method for this comparison because it is easy to implement and interpret in statistical software. It provides an estimate of variable importance within a single model, including adjustment for relevant sociodemographic confounders. Decision trees have only recently emerged in the built environment literature (72, 73) but show great promise for expanding the field’s ability to better calibrate, validate, and compare walkability measures.
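To make the notion of variable importance within a single tree concrete, the following is a simplified sketch rather than the implementation used in this study. It assumes Gini impurity as the splitting criterion, credits each split only to the variable actually used at that node (surrogate splits and any package-specific weighting are ignored), and assumes that "scaled to 100%" means the importances sum to 100. The split counts and variable names are hypothetical.

```python
from collections import defaultdict


def gini(n_walk, n_other):
    """Gini impurity of a two-class node."""
    n = n_walk + n_other
    if n == 0:
        return 0.0
    p = n_walk / n
    return 2.0 * p * (1.0 - p)


def goodness_of_split(parent, left, right):
    """Size-weighted impurity decrease for one binary split.

    Each argument is a (n_walk, n_other) pair of class counts.
    """
    if (left[0] + right[0], left[1] + right[1]) != tuple(parent):
        raise ValueError("child counts must add up to the parent counts")
    return (sum(parent) * gini(*parent)
            - sum(left) * gini(*left)
            - sum(right) * gini(*right))


def scaled_importance(splits):
    """Sum goodness of split per variable and rescale to a total of 100."""
    raw = defaultdict(float)
    for variable, parent, left, right in splits:
        raw[variable] += goodness_of_split(parent, left, right)
    total = sum(raw.values())
    return {variable: 100.0 * value / total for variable, value in raw.items()}


# Hypothetical tree with three splits: (variable, parent, left, right).
splits = [
    ("walk_score", (40, 60), (30, 10), (10, 50)),
    ("employment_density", (30, 10), (25, 3), (5, 7)),
    ("walk_score", (10, 50), (8, 10), (2, 40)),
]

importance = scaled_importance(splits)
for variable, value in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{variable:20s} {value:5.1f}")
```

Because a variable can be used at several nodes, its importance accumulates across the whole tree, which is why a measure that appears in many modest splits can rank alongside one that appears in a single strong split.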
While our study is based on available data in a single metropolitan area and uses five selected walkability measures for the purposes of the methods demonstration, our CART approach, along with efforts from Hankey et al. (73) and Mooney et al. (16), shows the value of novel machine learning techniques for exploring walkability measures in large datasets. Although some standard regression approaches could be used for similar comparisons, each measure would need to be accounted for in separate models with model selection through AIC, BIC, or other indicators of model fit (38). Thus we present CART as a reasonable alternative that also better accommodates both correlated observations in objectively measured walking data and collinearity of walkability variables in a single model.
The generalizability of relative variable importance rankings from our study is limited by the walking locations used for our demonstration, since the selected locations are not a random, representative sample of the total geography but rather reflect self-selection of locations where people in the primary cohort traveled. Additionally, some walking points are likely to have contiguous or even identical locations, given the confluence of inaccessible private property and buildings that corral individuals into overlapping travel paths (i.e., sidewalks and roads). However, less than 7.2% of case points and 0.2% of control points were identical to another point in the analytic dataset and, more importantly, the CART approach does not require an assumption of independent data points, so the similarities would only have reduced the diversity of location environments evaluated. Furthermore, locations in a geospatial dataset have inherent dependencies between points because the mode used to travel through one point is likely a strong predictor of whether that mode is being used at a subsequent location in a trip.
Further research should involve the comparison of more walkability measures and include objectively measured walking datasets in other settings where street-level data are available. Component measures of walkability indices such as density of services and pedestrian infrastructure would be of particular interest, since they could indicate the relative importance of index algorithms, as well as aid in the selection of data elements that are most valuable for critical research and policy objectives. This study lays the groundwork for more comparisons of walkability measures to inform more precise study designs for specific questions of interest.
5. CONCLUSION
We demonstrate a classification and regression tree approach for the comparison of five prominent walkability measures from the literature for their ability to estimate, for a person in a given place, the likelihood they are walking versus not. We find that all five measures have relatively low predictive power, and the similarity of prediction accuracies reinforces our hypothesis that walkability measures with practical advantages may be more suitable given other data constraints. The interpretation of relative importance lends itself to easy replication in datasets from other geographies and cohorts. CART can also be easily used on a larger set of measures, if the data are available for a dataset of locations, so this could be operationalized to compare density of services, intersection density, or any number of the universe of potential walkability measures. Alternative measures, especially simpler and more available measures of urban form, may be useful for studies of the built environment and its effect on walking. Selection of an appropriate walkability measure, by policy makers or researchers seeking to identify where people walk and the exposures associated with those locations, requires thoughtful consideration of its practicality and purpose that may be aided by the use of CART.
Supplementary Material
HIGHLIGHTS
- A walkability measure should be simple and able to identify places where people walk.
- Our classification and regression tree modeling approach compares walkability measures for their ultimate health-related definition: whether or not someone is walking at that location.
- Walk Score was a leading predictor of walking activity, closely followed by National Walkability Index and transit ridership.
- Walkability measures with utilitarian attributes (e.g., transparent, reproducible, longitudinally available) are reasonable alternative measures for quantifying the urban form.
Funding:
This work was supported by the National Institutes of Health [R01HL091881, R01CA178343, K99LM012868].
Footnotes
Declarations of interest: none.
7. REFERENCES
- 1. Frank LD, Engelke P. How Land Use and Transportation Systems Impact Public Health: A Literature Review of the Relationship Between Physical Activity and Built Form. ACES: Active Community Environments Initiative Working Paper #1. www.cdc.gov/nccdphp/dnpa/pdf/aces-workingpaper1.pdf.
- 2. Durand CP, Andalib M, Dunton GF, Wolch J, Pentz MA. A Systematic Review of Built Environment Factors Related to Physical Activity and Obesity Risk: Implications for Smart Growth Urban Planning. doi: 10.1111/j.1467-789X.2010.00826.x.
- 3. Grasser G, Van Dyck D, Titze S, Stronegger W. Objectively measured walkability and active transport and weight-related outcomes in adults: A systematic review. 2013. p. 615–25.
- 4. Yi L, Wilson JP, Mason TB, Habre R, Wang S, Dunton GF. Methodologies for assessing contextual exposure to the built environment in physical activity studies: A systematic review. Health & Place. 2019;60:102226. doi: 10.1016/j.healthplace.2019.102226.
- 5. McCormack GR, Shiell A. In search of causality: A systematic review of the relationship between the built environment and physical activity among adults. International Journal of Behavioral Nutrition and Physical Activity. 2011;8(1):125. doi: 10.1186/1479-5868-8-125.
- 6. Sugiyama T, Neuhaus M, Cole R, Giles-Corti B, Owen N. Destination and route attributes associated with adults' walking: A review. Medicine and Science in Sports and Exercise. 2012;44(7):1275–86. doi: 10.1249/MSS.0b013e318247d286.
- 7. Kärmeniemi M, Lankila T, Ikäheimo T, Koivumaa-Honkanen H, Korpelainen R. The Built Environment as a Determinant of Physical Activity: A Systematic Review of Longitudinal Studies and Natural Experiments. Annals of Behavioral Medicine. 2018;52(3):239–51. doi: 10.1093/abm/kax043.
- 8. Hajna S, Ross NA, Brazeau AS, Bélisle P, Joseph L, Dasgupta K. Associations between neighbourhood walkability and daily steps in adults: a systematic review and meta-analysis. BioMed Central; 2015. p. 768.
- 9. Wali B, Frank LD, Chapman JE, Fox EH. Developing policy thresholds for objectively measured environmental features to support active travel. Transportation Research Part D: Transport and Environment. 2021;90:102678. doi: 10.1016/j.trd.2020.102678.
- 10. Ramakreshnan L, Aghamohammadi N, Fong CS, Sulaiman NM. A comprehensive bibliometrics of 'walkability' research landscape: visualization of the scientific progress and future prospects. Environmental science and pollution research international. 2021;28(2):1357–69. Epub 2020/10/24. doi: 10.1007/s11356-020-11305-x. PubMed PMID: 33094458.
- 11. Cerin E, Cain KL, Conway TL, Van Dyck D, Hinckson E, Schipperijn J, De Bourdeaudhuij I, Owen N, Davey RC, Hino AAF, Mitáw J, Orzanco-Garralda R, Salvo D, Sarmiento OL, Christiansen LB, Macfarlane DJ, Schofield G, Sallis JF. Neighborhood environments and objectively measured physical activity in 11 countries. Medicine and Science in Sports and Exercise. 2014;46(12):2253–64. doi: 10.1249/MSS.0000000000000367.
- 12. Ewing R, Tian G, Goates JP, Zhang M, Greenwald MJ, Joyce A, Kircher J, Greene W. Varying influences of the built environment on household travel in 15 diverse regions of the United States. Urban Studies. 2014;52(13):2330–48. doi: 10.1177/0042098014560991.
- 13. Frank LD, Schmid TL, Sallis JF, Chapman J, Saelens BE. Linking objectively measured physical activity with objectively measured urban form: Findings from SMARTRAQ. American Journal of Preventive Medicine. 2005;28(2 SUPPL. 2):117–25. doi: 10.1016/j.amepre.2004.11.001.
- 14. Huang R, Moudon AV, Zhou C, Saelens BE. Higher residential and employment densities are associated with more objectively measured walking in the home neighborhood. Journal of Transport and Health. 2019;12:142–51. doi: 10.1016/j.jth.2018.12.002.
- 15. Isiagi M, Okop KJ, Lambert EV. The Relationship between Physical Activity and the Objectively-Measured Built Environment in Low- and High-Income South African Communities. Int J Environ Res Public Health. 2021;18(8). Epub 2021/05/01. doi: 10.3390/ijerph18083853. PubMed PMID: 33916926; PMCID: PMC8067549.
- 16. Mooney SJ, Hurvitz PM, Moudon AV, Zhou C, Dalmat R, Saelens BE. Residential neighborhood features associated with objectively measured walking near home: Revisiting walkability using the Automatic Context Measurement Tool (ACMT). Health & Place. 2020;63:102332. doi: 10.1016/j.healthplace.2020.102332.
- 17. Moudon AV, Huang R, Stewart OT, Cohen-Cline H, Noonan C, Hurvitz PM, Duncan GE. Probabilistic walking models using built environment and sociodemographic predictors. Population Health Metrics. 2019;17(1):7. doi: 10.1186/s12963-019-0186-8.
- 18. Vale DS, Saraiva M, Pereira M. Active accessibility: A review of operational measures of walking and cycling accessibility. Journal of Transport and Land Use. 2015;9(1). doi: 10.5198/jtlu.2015.593.
- 19. Frank LD, Sallis JF, Saelens BE, Leary L, Cain L, Conway TL, Hess PM. The development of a walkability index: Application to the neighborhood quality of life study. 2010. p. 924–33.
- 20. Stockton JC, Duke-Williams O, Stamatakis E, Mindell JS, Brunner EJ, Shelton NJ. Development of a novel walkability index for London, United Kingdom: Cross-sectional application to the Whitehall II Study. BMC Public Health. 2016;16(1):416. doi: 10.1186/s12889-016-3012-2.
- 21. Rundle AG, Chen Y, Quinn JW, Rahai N, Bartley K, Mooney SJ, Bader MD, Zeleniuch-Jacquotte A, Lovasi GS, Neckerman KM. Development of a Neighborhood Walkability Index for Studying Neighborhood Physical Activity Contexts in Communities across the U.S. over the Past Three Decades. Journal of Urban Health. 2019;96(4):583–90. doi: 10.1007/s11524-019-00370-4.
- 22. Ribeiro AI, Hoffimann E. Development of a Neighbourhood Walkability Index for Porto Metropolitan Area. How Strongly Is Walkability Associated with Walking for Transport? International journal of environmental research and public health. 2018;15(12):2767. doi: 10.3390/ijerph15122767. PubMed PMID: 30563290.
- 23. Glazier RH, Weyman JT, Creatore MI, Gozdyra P, Moineddin R, Matheson FI, Booth GL. Development and validation of an urban walkability index for Toronto, Canada. Canadian Journal of Diabetes. 2008;32(4).
- 24. Watson KB, Whitfield GP, Thomas JV, Berrigan D, Fulton JE, Carlson SA. Associations between the National Walkability Index and walking among US Adults - National Health Interview Survey, 2015. Prev Med. 2020;137:106122. Epub 2020/05/12. doi: 10.1016/j.ypmed.2020.106122. PubMed PMID: 32389677.
- 25. Tuckel P, Milczarski W. Walk ScoreTM, Perceived Neighborhood Walkability, and Walking in the US. American Journal of Health Behavior. 2015;39(2):241–55. doi: 10.5993/AJHB.39.2.11. PubMed PMID: 103927202.
- 26. Dannenberg AL, Cramer TW, Gibson CJ. Assessing the Walkability of the Workplace: A New Audit Tool. American Journal of Health Promotion. 2005;20(1):39–44. doi: 10.4278/0890-1171-20.1.39.
- 27. Hajna S, Dasgupta K, Halparin M, Ross NA. Neighborhood Walkability: Field Validation of Geographic Information System Measures. American Journal of Preventive Medicine. 2013;44(6):e55–e9. doi: 10.1016/j.amepre.2013.01.033.
- 28. Lefebvre-Ropars G, Morency C. Walkability: Which Measure to Choose, Where to Measure It, and How? Transportation Research Record. 2018;2672(35):139–50. doi: 10.1177/0361198118787095.
- 29. Forsyth A. What is a walkable place? The walkability debate in urban design. Urban Design International. 2015;20(4):274–92. doi: 10.1057/udi.2015.22.
- 30. Moudon AV, Lee C, Cheadle AD, Garvin C, Johnson D, Schmid TL, Weathers RD, Lin L. Operational Definitions of Walkable Neighborhood: Theoretical and Empirical Insights. Journal of physical activity & health. 2006;3(s1):S99–S117. doi: 10.1123/jpah.3.s1.s99.
- 31. Shashank A, Schuurman N. Unpacking walkability indices and their inherent assumptions. Health and Place. 2019;55:145–54. doi: 10.1016/j.healthplace.2018.12.005.
- 32. Howell NA, Tu JV, Moineddin R, Chen H, Chu A, Hystad P, Booth GL. Interaction between neighborhood walkability and traffic-related air pollution on hypertension and diabetes: The CANHEART cohort. Environment International. 2019;132:104799. doi: 10.1016/j.envint.2019.04.070.
- 33. Marshall Julian D, Brauer M, Frank Lawrence D. Healthy Neighborhoods: Walkability and Air Pollution. Environ Health Perspect. 2009;117(11):1752–9. doi: 10.1289/ehp.0900595.
- 34. Bartzokas-Tsiompras A, Photis YN, Tsagkis P, Panagiotopoulos G. Microscale walkability indicators for fifty-nine European central urban areas: An open-access tabular dataset and a geospatial web-based platform. Data in Brief. 2021;36:107048. doi: 10.1016/j.dib.2021.107048.
- 35. United States Census Bureau. American Community Survey Summary File. 2009-2013. Accessed September 17, 2021 <https://www.census.gov/programs-surveys/acs/data/summary-file.html>.
- 36. Carr LJ, Dunsiger SI, Marcus BH. Validation of Walk Score for estimating access to walkable amenities. British Journal of Sports Medicine. 2011;45(14):1144. doi: 10.1136/bjsm.2009.069609.
- 37. Environmental Protection Agency. Smart Location Mapping | Smart Growth. Available from: https://www.epa.gov/smartgrowth/smart-location-mapping#walkability.
- 38. Liao B, van den Berg PEW, van Wesemael PJV, Arentze TA. Empirical analysis of walkability using data from the Netherlands. Transportation Research Part D: Transport and Environment. 2020;85:102390. doi: 10.1016/j.trd.2020.102390.
- 39. Forsyth A, Oakes JM, Schmitz KH, Hearst M. Does Residential Density Increase Walking and Other Physical Activity? Urban Studies. 2007;44(4):679–97. doi: 10.1080/00420980601184729.
- 40. Puget Sound Regional C. Transit-Supportive Densities and Land Uses. 2015.
- 41. TransitCenter. Who's On Board 2016. https://transitcenter.org/publication/whos-on-board-2016/.
- 42. Lemon SC, Roy J, Clark MA, Friedmann PD, Rakowski W. Classification and regression tree analysis in public health: Methodological review and comparison with logistic regression. Annals of Behavioral Medicine. 2003;26(3):172–81. doi: 10.1207/S15324796ABM2603_02.
- 43. Saelens BE, Vernez Moudon A, Kang B, Hurvitz PM, Zhou C. Relation between higher physical activity and public transit use. American journal of public health. 2014;104(5):854–9. Epub 2014/03/13. doi: 10.2105/AJPH.2013.301696. PubMed PMID: 24625142.
- 44. Hurvitz PM, Moudon AV, Kang B, Fesinmeyer MD, Saelens BE. How far from home? The locations of physical activity in an urban U.S. setting. Preventive medicine. 2014;69:181–6. doi: 10.1016/j.ypmed.2014.08.034. PubMed PMID: 25285750.
- 45. Kang B, Moudon AV, Hurvitz PM, Saelens BE. Differences in behavior, time, location, and built environment between objectively measured utilitarian and recreational walking. Transportation Research Part D: Transport and Environment. 2017;57:185–94. doi: 10.1016/j.trd.2017.09.026.
- 46. Eisenberg-Guyot J, Moudon AV, Hurvitz PM, Mooney SJ, Whitlock KB, Saelens BE. Beyond the bus stop: Where transit users walk. Journal of Transport & Health. 2019;14:100604. doi: 10.1016/j.jth.2019.100604.
- 47. Moudon AV, Saelens B, Rutherford S, Hallenbeck M. A report on participant sampling and recruitment for travel and physical activity data collection : final technical report, July 2009. 2009. PubMed PMID: dot:5697.
- 48. Hurvitz PM, Moudon AV, Kang B, Saelens BE, Duncan GE. Emerging technologies for assessing physical activity behaviors in space and time. Frontiers in Public Health. 2014;2(January):1–15. doi: 10.3389/fpubh.2014.00002.
- 49. Tandon PS, Saelens BE, Zhou C, Kerr J, Christakis DA. Indoor versus outdoor time in preschoolers at child care. American Journal of Preventive Medicine. 2013;44(1):85–8. doi: 10.1016/j.amepre.2012.09.052.
- 50. Kang B, Moudon AV, Hurvitz PM, Reichley L, Saelens BE. Walking objectively measured: classifying accelerometer data with GPS and travel diaries. Medicine and science in sports and exercise. 2013;45(7):1419–28. doi: 10.1249/MSS.0b013e318285f202. PubMed PMID: 23439414.
- 51. Troiano RP, Berrigan D, Dodd KW, Mâsse LC, Tilert T, McDowell M. Physical activity in the United States measured by accelerometer. Med Sci Sports Exerc. 2008;40(1):181–8. Epub 2007/12/20. doi: 10.1249/mss.0b013e31815a51b3. PubMed PMID: 18091006.
- 52. Twardzik E, Judd S, Bennett A, Hooker S, Howard V, Hutto B, Clarke P, Colabianchi N. Walk Score and objectively measured physical activity within a national cohort. Journal of Epidemiology and Community Health. 2019;73(6):549–56. doi: 10.1136/jech-2017-210245.
- 53. Duncan D, Méline J, Kestens Y, Day K, Elbel B, Trasande L, Chaix B. Walk Score, Transportation Mode Choice, and Walking Among French Adults: A GPS, Accelerometer, and Mobility Survey Study. International Journal of Environmental Research and Public Health. 2016;13(6):611. doi: 10.3390/ijerph13060611.
- 54. Lerner M. Walk Score Blog [Internet]. http://blog.walkscore.com/2013/11/2014-rankings-methodology/2013. [February 24, 2020]. Available from: http://blog.walkscore.com/2013/11/2014-rankings-methodology/.
- 55. Whalen J. walkscoreAPI: Walk Score and Transit Score API. 2012.
- 56. Parcels for King County with Address with Property Information / parcel address area KCGIS Open Data 2012 [December 12, 2016]. Available from: https://gis-kingcounty.opendata.arcgis.com/datasets/kingcounty::parcels-for-king-county-with-address-with-property-information-parcel-address-area/.
- 57. Strobl C, Malley J, Tutz G. An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychol Methods. 2009;14(4):323–48. doi: 10.1037/a0016973. PubMed PMID: 19968396.
- 58. Therneau T, Atkinson B, Ripley B. rpart: Recursive Partitioning and Regression Trees. 2019.
- 59. Hirsch JA, Winters M, Ashe MC, Clarke PJ, McKay HA. Destinations That Older Adults Experience Within Their GPS Activity Spaces: Relation to Objectively Measured Physical Activity. Environment and Behavior. 2015;48(1):55–77. doi: 10.1177/0013916515607312.
- 60. Holliday KM, Howard AG, Emch M, Rodríguez DA, Evenson KR. Are buffers around home representative of physical activity spaces among adults? Health & Place. 2017;45:181–8. doi: 10.1016/j.healthplace.2017.03.013.
- 61. Giles LV, Koehle MS, Saelens BE, Sbihi H, Carlsten C. When physical activity meets the physical environment: precision health insights from the intersection. Environmental health and preventive medicine. 2021;26(1):68. Epub 2021/07/02. doi: 10.1186/s12199-021-00990-w. PubMed PMID: 34193051; PMCID: PMC8247190.
- 62. Koohsari MJ, Oka K, Owen N, Sugiyama T. Natural movement: A space syntax theory linking urban form and function with walking for transport. Health & Place. 2019;58:102072. doi: 10.1016/j.healthplace.2019.01.002.
- 63. Rundle AG, Sheehan DM, Quinn JW, Bartley K, Eisenhower D, Bader MMD, Lovasi GS, Neckerman KM. Using GPS Data to Study Neighborhood Walkability and Physical Activity. Am J Prev Med. 2016;50(3):e65–e72. Epub 2015/11/13. doi: 10.1016/j.amepre.2015.07.033. PubMed PMID: 26558700.
- 64. Smith L, Foley L, Panter J. Activity spaces in studies of the environment and physical activity: A review and synthesis of implications for causality. Health & Place. 2019;58:102113. doi: 10.1016/j.healthplace.2019.04.003.
- 65. Vich G, Marquet O, Miralles-Guasch C. Green streetscape and walking: Exploring active mobility patterns in dense and compact cities. Journal of Transport & Health. 2019;12:50–9. doi: 10.1016/j.jth.2018.11.003.
- 66. Carlson JA, Saelens BE, Kerr J, Schipperijn J, Conway TL, Frank LD, Chapman JE, Glanz K, Cain KL, Sallis JF. Association between neighborhood walkability and GPS-measured walking, bicycling and vehicle time in adolescents. Health & place. 2015;32:1–7. Epub 2015/01/09. doi: 10.1016/j.healthplace.2014.12.008. PubMed PMID: 25588788.
- 67. Hajna S, Ross NA, Griffin SJ, Dasgupta K. Lexical neutrality in environmental health research: Reflections on the term walkability. BMC Public Health. 2017;17(1):940. doi: 10.1186/s12889-017-4943-y.
- 68. Lo RH. Walkability: What is it? Journal of Urbanism. 2009;2(2):145–66. doi: 10.1080/17549170903092867.
- 69. Rivera-Navarro J, Bonilla L, Gullón P, González-Salgado I, Franco M. Can we improve our neighbourhoods to be more physically active? Residents' perceptions from a qualitative urban health inequalities study. Health Place. 2021:102658. Epub 2021/09/01. doi: 10.1016/j.healthplace.2021.102658.
- 70. Koschinsky J, Talen E, Alfonzo M, Lee S. How walkable is Walker’s paradise? Environment and Planning B: Urban Analytics and City Science. 2016;44(2):343–63. doi: 10.1177/0265813515625641.
- 71. Hall CM, Ram Y. Walk score® and its potential contribution to the study of active transport and walkability: A critical and systematic review. Transportation Research Part D: Transport and Environment. 2018;61:310–24. doi: 10.1016/j.trd.2017.12.018.
- 72. Ding C, Cao X, Liu C. How does the station-area built environment influence Metrorail ridership? Using gradient boosting decision trees to identify non-linear thresholds. Journal of Transport Geography. 2019;77:70–8. doi: 10.1016/j.jtrangeo.2019.04.011.
- 73. Hankey S, Zhang W, Le HTK, Hystad P, James P. Predicting bicycling and walking traffic using street view imagery and destination data. Transportation Research Part D: Transport and Environment. 2021;90:102651. doi: 10.1016/j.trd.2020.102651.