Author manuscript; available in PMC: 2014 May 1.
Published in final edited form as: Environ Behav. 2012 Apr 10;45(4):526–547. doi: 10.1177/0013916512436421

Measurement Properties of a Park Use Questionnaire

Kelly R Evenson 1, Fang Wen 2, Daniela Golinelli 3, Daniel A Rodríguez 4, Deborah A Cohen 5
PMCID: PMC3708671  NIHMSID: NIHMS367075  PMID: 23853386

Abstract

We determined the criterion validity and test-retest reliability of a brief park use questionnaire. From five US locations, 232 adults completed a brief survey four times and wore a global positioning system (GPS) monitor for three weeks. We assessed validity for park visits during the past week and during a usual week by examining agreement between the frequency and duration of park visits reported in the questionnaire and the GPS monitor results. Spearman correlation coefficients (SCC) were used to measure agreement. For past week park visit frequency and duration, the SCC were 0.62–0.65 and 0.62–0.67, respectively. For usual week park visit frequency and duration, the SCC were 0.40–0.50 and 0.50–0.53, respectively. Usual park visit frequency reliability was 0.78–0.88 (percent agreement 69%–82%) and usual park visit duration reliability was 0.75–0.84 (percent agreement 64%–73%). These results suggest that the questionnaire to assess usual and past week park use had acceptable validity and reliability.

Keywords: environment, geographic information system (GIS), global positioning system (GPS), physical activity, reliability, validity

Introduction

Parks are integral to a favorable built environment and provide places for physical activity (Kaczynski & Henderson, 2008), a health behavior that protects against cardiovascular disease, type 2 diabetes, and certain cancers (U.S. Department of Health and Human Services, 2008). Located in most communities across the United States (US), parks provide an extensive network of free or low-cost options for physical activity. Yet unlike other government services and facilities, parks may be considered optional. Some researchers note that parks not only support physical activity but also contribute to the social well-being of their communities, enhance property values, and contribute to health (Kaczynski & Henderson, 2008). However, in spite of these potential benefits, studies indicate that some parks lack visitors while others are used quite extensively (D. A. Cohen et al., 2007). To study the contribution of parks and understand why some parks are not used, a science of measurement around public parks and their use is under development.

Accurate assessment of park use is also important for tracking park use, in planning studies that assess utilization and demand for park services, and in research that relies on individuals’ park use patterns. Tracking the use of parks could inform, for example, whether park visits, types of activities, and characteristics of park users change over time. Park and recreation organizations have collected these types of data at local, state, and national levels since as early as 1960, but often omit items that focus on health-enhancing physical activity (Kruger, Mowen, & Librett, 2007). Experts have recommended incorporating more detailed measures of physical activity into park-related tracking systems, in particular, tracking the core dimensions of physical activity which include frequency, duration, intensity, and mode (Kruger et al., 2007). This would allow park and health researchers and planners to monitor and identify changes in physical activity in parks.

Studies focusing on behaviors involving parks also need accurate assessment of park use. For example, studies have measured exposure to parks to understand their impacts on mental health (Watts, Pheasant, & Horoshenkov Kirill, 2011), activity engagement (Ioja, Rozylowicz, Patroescu, Nita, & Vanau, 2011; Wolch et al., 2011), and environmental pollution (Su, Jerrett, de Nazelle, & Wolch, 2011). From the health side, intervention studies also need accurate assessment of park use. For example, a recent study assessed whether park improvements changed participant park use and physical activity within the park (D. A. Cohen et al., 2009). These studies frequently rely on self-reported park use to determine exposures or changes over time. Thus, as more research and intervention studies are developed with a focus on parks, suitable assessment instruments will be needed.

To study the use of parks, including physical activity occurring in park settings, information can be collected broadly through either observational methods or through self-reported methods. Observational data can be collected using instruments such as the System for Observing Play and Recreation in Communities (SOPARC) (McKenzie, 2006). However, this method requires multiple observations over different days and seasons of the year to be reliable (D. Cohen et al., 2011), adding to the time and cost to collect this type of data. More recently, researchers are using global positioning systems (GPS) to assess the proportion of time in a day spent at parks and the proportion of moderate to vigorous physical activity done at parks (Jones, Coombes, Griffin, & van Sluijs, 2009; Quigg, Gray, Reeder, Holt, & Waters, 2010; Wheeler, Cooper, Page, & Jago, 2010). This instrumentation requires that participants wear a GPS, which may not be feasible in some studies, and requires electronic maps of parks (called shape files) to match with the GPS data.

An alternative way to study the use of parks is through self-reported measures, such as interviews or surveys. Questions or surveys of park use have been developed (for example, Payne, Orsega-Smith, Roy, & Godbey, 2005; Raymore & Scott, 1998; Tinsley, Tinsley, & Croskeys, 2002; Walker et al., 2009), but few studies have explored the measurement properties of these questionnaires, including validity and reliability, to determine their usefulness. Validity determines whether the questionnaire is assessing what is intended, and criterion-related validity demonstrates associations between similar measures of interest. Reliability is the ability of the questionnaire to assess what it is measuring in a consistent, reproducible way. Test-retest reliability is one type of reliability that examines whether measures applied on different occasions agree with one another. Desirable self-reported measures, such as those from questionnaires, will have evidence for both validity and reliability.

We developed a brief questionnaire to assess past week and usual park use at a particular park, which also addressed the frequency, duration, and type of physical activity within the park as recommended elsewhere (Kruger et al., 2007). This paper describes the measurement properties of the park use questionnaire, including criterion-related validity and test-retest reliability, ultimately to determine the suitability of the measures for use in surveillance, research, and evaluation. We explored variation in the measurement properties of the questionnaire by gender, age, race/ethnicity, education, body mass index (BMI), participant recruitment method, and site, to identify characteristics of individuals for whom the self-report differed from the corresponding objective measures or were less reliable.

Methods

Study Sample and Recruitment

Participants were recruited from field centers in five states: Los Angeles, California (CA); Albuquerque, New Mexico (NM); Chapel Hill and Durham, North Carolina (NC); Columbus, Ohio (OH); and Philadelphia, Pennsylvania (PA). At each of the five locations, participants were recruited at or near 6 (for NM, NC, OH, and PA) or 7 (for CA) predefined parks. These geographic areas included a wide range of people with respect to race/ethnicity and household income, as reported in the US census. Characteristics of park size and staffing and of the population surrounding the parks, by state, can be found in Table 1.

Table 1.

Characteristics of parks by state*

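| Characteristics (by state location of parks) | CA (n=7) | NC (n=6) | NM (n=6) | OH (n=6) | PA (n=6) |
|---|---|---|---|---|---|
| Park size in acres, mean | 9.4 | 13.5 | 15.6 | 7.4 | 5.3 |
| Park size in acres, median | 6.9 | 11 | 8.1 | 7.3 | 5 |
| Park full-time program staff, percentage | 100 | 67 | 17 | 50 | 100 |
| Population density (0.5 mile radius around each park), mean | 13,005 | 5,944 | 4,473 | 7,532 | 18,328 |
| Population density (0.5 mile radius around each park), median | 13,491 | 5,233 | 4,364 | 6,832 | 18,711 |
| Households in poverty (0.5 mile radius around each park), percent mean | 17 | 10 | 16 | 21 | 29 |
| Households in poverty (0.5 mile radius around each park), percent median | 18 | 7 | 10 | 15 | 30 |

* Only CA included 7 parks; all other sites studied 6 parks.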

The NC parks included target areas only.

Enrollment occurred during the spring, summer, and fall between May 2009 and April 2011. Inclusion criteria for enrollment were age >=18 years, English speaking, ambulatory, and either living within 1 mile of the study park or recruited during a visit to the study park. The volunteer participants were purposively sampled to include both male and female respondents across age groups (18–35, 36–59, >=60). In the parks, participants were recruited in person, following completion of a brief park survey (similar to what we were testing), and through posted flyers. In neighborhoods surrounding the park, household interviews were obtained in two ways. First, adults were recruited by visiting homes door-to-door; if they completed a brief park survey, then a flyer was left with a number to call if they were interested in participating in the study. Second, in the more dense areas, adults were enrolled through queries made outside of local shops located close to the park. Again, if they completed a brief park survey, then a flyer was left with a number to call if they were interested in participating in the study.

Participants provided informed consent and visits were typically conducted in a research office. Participants were asked to answer an interviewer-administered questionnaire at enrollment and during each of the next 3 weeks, and to wear an accelerometer and GPS unit for 3 weeks. At enrollment (baseline), participants were weighed with a Tanita Bc551 scale and measured for height using a Seca Portable Stadiometer. BMI was calculated as weight in kilograms divided by height in meters squared and participants were grouped into 4 categories: underweight (<18.5), normal weight (18.5–<25.0), overweight (25.0–<30.0), and obese (>=30.0). Participants were paid a monetary incentive at the conclusion of the data collection period. This study was approved by the Institutional Review Board at each university or organization affiliated with the 5 field centers.
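The BMI calculation and grouping described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the study's own code; the function names `bmi` and `bmi_category` are hypothetical, and only the formula and the four cut points come from the text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the four categories listed in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal weight"
    if value < 30.0:
        return "overweight"
    return "obese"
```

For example, a participant weighing 70 kg with a height of 1.75 m has a BMI of about 22.9 and falls in the normal weight group.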

Park Use Questionnaire

A questionnaire was developed from earlier studies of park use (D. A. Cohen et al., 2009). Usual and past week park use questions that were assessed for evidence of reliability and validity are provided in Appendix 1. Usual park use frequency and duration were calculated by averaging the self-reported items from baseline and weeks 1, 2, and 3. The activities the participant reported while in the park were categorized and checked by a second coder as either “not active” (examples: yoga, fly kite, play with baby, watch children, watch sports, picnic, watch dog) or “active” (examples: walking, tennis, basketball, volleyball). Active modalities were defined as requiring a metabolic equivalent (MET) value of at least 3.0, using the compendium of activities (Ainsworth et al., 2000). Participants were also asked to report their age, race/ethnicity, education, mode of travel for the most recent park visit, and past week exercise. Past week exercise was ascertained by asking, “In the past 7 days, on how many days did you engage in exercise?”

GIS Data

All built environment measures were derived using ArcGIS 10 (Environmental Systems Research Institute (ESRI) Inc., Redlands CA, 2010). The study park shape files were obtained from each locality and checked using Google, Bing, and MapQuest electronic maps. Each participant’s home address was geocoded using 2010 TIGER/Line shapefiles in ArcGIS, and supplemented with the electronic maps, as needed. The Euclidean and road network distance from home to the nearest edge of the study park was calculated using the ArcGIS Analysis tool and Network Analyst, respectively.

GPS Measures

To assess evidence for validity, we developed a comparison measure using GPS and geographic information systems (GIS). Participants were asked to wear the Qstarz BT-Q1000X portable GPS units (weight 65 grams, dimensions 72×46×20 mm) on their waist during all waking hours for three one-week periods. They were asked to keep the unit dry and to charge it overnight, every night. Each participant received written instructions and a telephone number to call with questions, and met with a staff person weekly to exchange units.

The units were set to record latitude, longitude, and speed every minute, with the Wide Area Augmentation System (WAAS) enabled (a system of satellites and ground stations that provides correction data to increase the accuracy of GPS readings). The map datum used was the World Geodetic System 1984 (WGS 84) and the position format was latitude and longitude in degrees and minutes (HD° MM′). The GPS data files were downloaded and cleaned by removing data headers, converting coordinate information into decimal degrees, and transforming the data into wide-character ASCII format to enable further processing with SAS version 9.2 and ArcGIS.
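The conversion from degrees and minutes to decimal degrees mentioned above is simple arithmetic and can be sketched as follows. The layout of the raw GPS export is not described in the text, so the argument form used here (whole degrees, decimal minutes, and a hemisphere letter) and the function name `dm_to_decimal` are assumptions made for illustration.

```python
def dm_to_decimal(degrees: int, minutes: float, hemisphere: str) -> float:
    """Convert degrees and decimal minutes to signed decimal degrees.

    The hemisphere letter (N, S, E, W) is an assumed input format;
    south and west are returned as negative values.
    """
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range [0, 60)")
    letter = hemisphere.upper()
    if letter not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60.0
    return -value if letter in ("S", "W") else value
```

For instance, 35° 54′ N becomes 35.9 and 79° 03′ W becomes -79.05.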

Because participants exchanged their GPS unit between weeks 1 and 2 and between weeks 2 and 3, the GPS data from the unit being returned overlapped for some time with the GPS data from the unit being picked up. These overlapping points were removed and the three weeks of data were merged into one file. Each point was mapped using ArcGIS. Geoprocessing procedures were used to extract points that fell within the study park and to remove points within 50 meters of the participant’s residence, accounting for any inaccuracies in point locations. The removal of points near a participant’s residence affected only 5 people. Points that corresponded to a speed of >=30 kilometers/hour were also removed to exclude driving within parks.

These cleaned data were then processed using SAS. To be defined as a park visit, consecutive points within the park boundaries were required to span >=3 minutes. A time gap of at least 45 minutes between consecutive park points was deemed to separate two park visits; otherwise, the points were considered part of the same park visit. All data on visits were further screened and evaluated to identify any possible overlaps from the equipment exchange, which were then combined as needed. For each park visit the following variables were derived: start time, end time, duration, and average speed of the visit (sum of all speeds divided by the total duration).
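The visit rules above (drop points at >=30 kilometers/hour, split visits at gaps of at least 45 minutes, keep visits spanning >=3 minutes) can be sketched in Python. This is a hedged illustration, not the SAS program used in the study. It assumes the input is a list of (timestamp, speed in km/h) pairs that have already been clipped to the park boundary and screened for proximity to the home; the names `Visit` and `derive_visits` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

MAX_SPEED_KMH = 30.0                 # points at or above this speed are dropped
VISIT_GAP = timedelta(minutes=45)    # a gap this long starts a new visit
MIN_SPAN = timedelta(minutes=3)      # a visit must span at least this long


@dataclass
class Visit:
    start: datetime
    end: datetime
    duration_min: float
    avg_speed_kmh: float


def derive_visits(points: List[Tuple[datetime, float]]) -> List[Visit]:
    """Group in-park GPS points into visits using the rules in the text."""
    kept = sorted(p for p in points if p[1] < MAX_SPEED_KMH)

    # Split the time-ordered points wherever the gap is at least 45 minutes.
    groups: List[List[Tuple[datetime, float]]] = []
    for point in kept:
        if groups and point[0] - groups[-1][-1][0] < VISIT_GAP:
            groups[-1].append(point)
        else:
            groups.append([point])

    visits = []
    for group in groups:
        start, end = group[0][0], group[-1][0]
        if end - start < MIN_SPAN:
            continue  # too short to count as a park visit
        minutes = (end - start).total_seconds() / 60.0
        # As described in the text: sum of speeds divided by total duration.
        total_speed = sum(speed for _, speed in group)
        visits.append(Visit(start, end, minutes, total_speed / minutes))
    return visits
```

With one point logged per minute, the sum of speeds divided by the duration in minutes is close to the mean of the recorded speeds.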

Statistical Analysis

Validity was assessed using Spearman correlation coefficients (SCC) by comparing responses on the questionnaire to the GPS data for the frequency and duration of usual (averaged over the 4 questionnaires) and past week park use. In order to match the question about frequency of usual park visits (question 1 in Appendix 1), response options were translated into the following categories for the GPS data matching: “daily” to >=7 times/week; “a few times a week” to >=2 and <7; “once per week” to >=1 and <2; “a couple times per month” to >=0.235 and <1; “monthly” to >=0.225 and <0.235; “rarely”, “this is the first time”, or “never” to >=0 and <0.225. The matching of the time categories for usual and past week duration of park visits followed the minutes and hour categories listed in the response options and were not further reduced (questions 2 and 5 in Appendix 1). The corresponding GPS data were categorized using these same response options. Similarly, the number of times visited the park in the past week was matched without further categorization to the GPS data (question 4 in Appendix 1).
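The category matching and rank correlation described above can be sketched as follows. The cut points are those listed in the text; the function names, the ordinal coding from 0 to 5, and the pure-Python Spearman calculation (Pearson correlation of average ranks) are assumptions made for illustration and are not the SAS procedures used in the study.

```python
from typing import List, Sequence

# Ordinal codes 0-5, from least to most frequent (coding is an assumption).
FREQUENCY_LABELS = [
    "rarely, first time, or never",
    "monthly",
    "a couple times per month",
    "once per week",
    "a few times a week",
    "daily",
]
LOWER_BOUNDS = [0.0, 0.225, 0.235, 1.0, 2.0, 7.0]  # visits per week


def frequency_code(visits_per_week: float) -> int:
    """Return the ordinal category for a GPS-derived weekly visit rate."""
    if visits_per_week < 0:
        raise ValueError("visits per week cannot be negative")
    code = 0
    for index, bound in enumerate(LOWER_BOUNDS):
        if visits_per_week >= bound:
            code = index
    return code


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

Under this mapping, a GPS-derived rate of 1.8 visits/week falls in the "once per week" category, and the coded self-reports and coded GPS values could then be passed to `spearman`.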

Frequencies comparing categories and Bland-Altman plots (Bland & Altman, 1986) provided graphic assessment of over- and under-reporting comparing the questionnaire to the corresponding GPS measures. Linear mixed effect models for repeated measures (Laird & Ware, 1982) were used to explore associations for (1) past week park use frequency and (2) past week park use duration, between the self-reported measure from the questionnaire and the corresponding GPS outcome. With these analyses we were able to jointly model the three weeks of measures, accounting for the correlation among measures from the same participant. To assess whether the association between the self-reported measure and corresponding GPS measure varied by gender, age, race/ethnicity, education, BMI, participant recruitment method, and site, we re-ran the repeated measures models, interacting the self-reported measure with each of these characteristics one at a time. We report in the text only those interactions with a significance level of p<0.10.
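The quantities behind a Bland-Altman plot can be sketched as follows: for each participant, the difference between the self-reported and GPS values is set against their mean, and the mean difference (bias) with its 95% limits of agreement summarizes over- or under-reporting. This is a generic sketch of the Bland and Altman (1986) approach, not the study's code; the function name `bland_altman` and the returned dictionary keys are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def bland_altman(self_report: Sequence[float], gps: Sequence[float]) -> Dict:
    """Return the points and summary lines for a Bland-Altman plot."""
    if len(self_report) != len(gps) or len(gps) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    diffs = [s - g for s, g in zip(self_report, gps)]        # y-axis values
    means = [(s + g) / 2.0 for s, g in zip(self_report, gps)]  # x-axis values
    bias = mean(diffs)               # positive bias suggests over-reporting
    spread = 1.96 * stdev(diffs)     # half-width of the 95% limits of agreement
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - spread,
        "upper": bias + spread,
    }
```

A bias close to zero, with points scattered evenly around it, would correspond to the absence of systematic over- or under-reporting.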

Test-retest reliability was assessed using percent agreement, simple kappa coefficients (for 2-level variables), and SCC by comparing usual park use and duration of visits reported on the questionnaire for each paired week: baseline/1, baseline/2, baseline/3, 1/2, 1/3, and 2/3. Cronbach’s coefficient alpha and intraclass correlation coefficients (ICC) with 95% confidence intervals (CI) were then computed using the four surveys, to account for the correlation among the four repeated measures. To explore agreement with the type of park activity recalled (active or not active; 2 levels), overall (simple) kappa coefficients were used to account for the four observations within the same person using the SAS macro MAGREE (Chen, Zaebst, & Seel, 2005). As a guide, we followed the ratings suggested by Landis and Koch (1977) for agreement level: <0 poor, 0–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect.
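For a single pair of survey administrations, percent agreement and the simple kappa can be sketched as below, together with the Landis and Koch (1977) ratings quoted above. This sketch covers only the pairwise case; it does not reproduce the MAGREE macro used for the overall kappa across four administrations. All function names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Percentage of participants giving the same response both times."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def simple_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement beyond that expected by chance."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


def landis_koch(value: float) -> str:
    """Agreement rating following Landis and Koch (1977)."""
    if value < 0:
        return "poor"
    if value <= 0.20:
        return "slight"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "substantial"
    return "almost perfect"
```

As a made-up illustration, if 10 participants report "active" 8 times at one administration and 7 times at the next, with 9 matching responses, percent agreement is 90% and kappa is about 0.74, rated substantial.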

Results

Sample Characteristics

In total, 232 adults enrolled in the study, wore the GPS monitor, and completed four repeat surveys. Of the 232 participants, 80% were recruited from the park and 20% from households within 1 mile of the park. The distribution of the participants from each state and each park is provided in Table 2. The median Euclidean distance from the participants’ homes to the study park was 0.8 miles and the median network distance from home to the study park was 0.9 miles (Table 3). Park users lived farther from the park than those who were interviewed at their home, which was expected by study design since participants recruited from households had to live within a mile of the park. Overall, participants averaged 1.8 park visits/week with an average duration/day of 41.7 minutes according to the GPS monitor.

Table 2.

Distribution of study sample by state, park, and whether they were recruited from homes or within the park

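| State | Analysis sample | N | Park 1 (n) | Park 2 (n) | Park 3 (n) | Park 4 (n) | Park 5 (n) | Park 6 (n) | Park 7* (n) |
|---|---|---|---|---|---|---|---|---|---|
| CA | Overall | 49 | 8 | 5 | 8 | 8 | 4 | 8 | 8 |
| CA | Household interview | 12 | 2 | 2 | 2 | 2 | 0 | 2 | 2 |
| CA | Park user | 37 | 6 | 3 | 6 | 6 | 4 | 6 | 6 |
| NC | Overall | 49 | 8 | 8 | 9 | 8 | 8 | 8 | |
| NC | Household interview | 9 | 1 | 1 | 3 | 1 | 2 | 1 | |
| NC | Park user | 40 | 7 | 7 | 6 | 7 | 6 | 7 | |
| NM | Overall | 47 | 8 | 8 | 8 | 8 | 8 | 7 | |
| NM | Household interview | 11 | 2 | 2 | 2 | 2 | 2 | 1 | |
| NM | Park user | 34 | 5 | 6 | 6 | 6 | 6 | 5 | |
| NM | Missing | 2 | 1 | 0 | 0 | 0 | 0 | 1 | |
| OH | Overall | 48 | 9 | 8 | 9 | 9 | 7 | 6 | |
| OH | Household interview | 0 | 0 | 0 | 0 | 0 | 0 | 0 | |
| OH | Park user | 48 | 9 | 8 | 9 | 9 | 7 | 6 | |
| PA | Overall | 39 | 5 | 8 | 5 | 8 | 9 | 4 | |
| PA | Household interview | 14 | 2 | 2 | 4 | 2 | 2 | 2 | |
| PA | Park user | 25 | 3 | 6 | 1 | 6 | 7 | 2 | |

* Only CA included 7 parks; all other sites studied 6 parks.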

Table 3.

Descriptive characteristics of participants, overall (n=232) and by method of recruitment*

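| Characteristic | Mean | Median | Interquartile range |
|---|---|---|---|
| Euclidean distance from the home to the study park (miles): overall | 2.3 | 0.8 | 0.2, 3.0 |
| Euclidean distance from the home to the study park (miles): household interview | 0.6 | 0.3 | 0.1, 0.5 |
| Euclidean distance from the home to the study park (miles): park user | 2.8 | 1.2 | 0.3, 3.8 |
| Network distance from the home to the study park (miles): overall | 2.8 | 0.9 | 0.3, 3.7 |
| Network distance from the home to the study park (miles): household interview | 0.7 | 0.4 | 0.2, 0.8 |
| Network distance from the home to the study park (miles): park user | 3.3 | 1.6 | 0.4, 4.9 |
| Average # park visits/week to the study park by GPS: overall | 1.8 | 1.0 | 0, 2.7 |
| Average # park visits/week to the study park by GPS: household interview | 1.0 | 0.3 | 0, 1.3 |
| Average # park visits/week to the study park by GPS: park user | 2.0 | 1.2 | 0.3, 3.0 |
| Average duration (# minutes/day) in the study park by GPS: overall | 41.7 | 30.8 | 0, 61.9 |
| Average duration (# minutes/day) in the study park by GPS: household interview | 20.9 | 7.0 | 0, 30.8 |
| Average duration (# minutes/day) in the study park by GPS: park user | 46.5 | 34.7 | 12.0, 66.0 |

* Participants were recruited in the parks (park user, n=184) or from a nearby household (n=46). Two participants were missing participant type of recruitment.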

The mean participant age was 40.5 years (median 37.0), with 17.2% at least 60 years of age. Most participants drove (55.9%) or walked (38.3%) to the park on their last visit. Other descriptive characteristics of the sample are in Table 4.

Table 4.

Descriptive characteristics of participants from the baseline survey

Columns: N and % of the overall sample. The number after each item heading is the count of missing responses.
Gender 0
 Male 104 44.8
 Female 128 55.2
Age 0
 18–35 110 47.4
 36–59 82 35.3
 >=60 40 17.2
Race/ethnicity 1
 non-Hispanic White 117 50.7
 non-Hispanic Black 58 25.1
 Hispanic 35 15.2
 Other 21 9.1
Education 0
 Less than high school, high school, or GED 51 22.0
 Some college or vocational 51 22.0
 College 82 35.3
 Post college 48 20.7
Body mass index 0
 Under weight 1 0.4
 Normal weight 84 36.2
 Overweight 74 31.9
 Obese 73 31.5
How often usually visit <park name>? 0
 Never 8 3.5
 This is the first time 4 1.7
 Rarely 24 10.3
 Monthly 11 4.7
 A couple times per month 17 7.3
 Once per week 43 18.5
 A few times a week 85 36.6
 Daily 40 17.2
How long do you usually stay at <park name>? 0
 0 minute 8 3.5
 Less than 15 minutes 12 5.2
 15–30 minutes 33 14.2
 31–60 minutes 50 21.6
 >1 hour, but <2 hours 81 34.9
 2 – 3 hours 29 12.5
 >3 hours, but <5 hours 15 6.5
 5 or more hours 4 1.7
What do you usually do at <park name>? 2
 Not active 26 11.3
 Active 204 88.7
Over the past 7 days, how many times did you visit <park name>? 9
 0 48 21.5
 1 55 24.7
 2 36 16.1
 3 25 11.2
 4 16 7.2
 5 or more 43 19.3
In the past 7 days, how long did you stay at the park on your most recent visit? 10
 Did not go to the park in the last 7 days 49 22.1
 Less than 15 minutes 8 3.6
 15–30 minutes 23 10.4
 31–60 minutes 41 18.5
 >1 hour, but <2 hours 50 22.5
 2 – 3 hours 32 14.4
 >3 hours, but <5 hours 13 5.9
 5 or more hours 6 2.7
In the past 7 days, on the last day you came to the park, what did you do while there? 10
 Not active 80 36.0
 Active 142 64.0
On the last visit to the park, how did you get there? 10
 Walked 85 38.3
 Biked 6 2.7
 Drove 124 55.9
 Bus or other public transportation 5 2.3
 Other 2 0.9
In the past 7 days, how many days did you engage in exercise? 5
 0 22 9.7
 1 12 5.3
 2 28 12.3
 3 44 19.4
 4 39 17.2
 5 27 11.9
 6 11 4.9
 7 44 19.4

Criterion-related Validity Assessment

For usual week validity, the questions on park visit frequency and duration for each separate week (1, 2, 3) were compared to GPS findings averaged over the 3-week period. The SCC agreement ranged from 0.40 to 0.50 on the question of usual park visit frequency and from 0.50 to 0.53 for usual park visit duration (Table 5). Bland-Altman plots did not reveal systematic over- or under-reporting among the overall sample (figures not shown). For past week validity, the items on park visit frequency and duration were compared to GPS findings during the same week. As expected, the SCC agreement was slightly higher than usual week agreement, ranging from 0.62 to 0.65 on the question of past week frequency of park visits and 0.62 to 0.67 for past week park duration (Table 5). Bland-Altman plots did not reveal systematic over- or under-reporting among the overall sample (figures not shown).

Table 5.

Evidence for validity comparing the survey to the GPS

Q1 = usual frequency of park visits*; Q2 = usual duration of park visits**; Q4 = past week number of visits to the park; Q5 = past week length of park visit**

| | Q1 n | Q1 SCC (95% CI) | Q2 n | Q2 SCC (95% CI) | Q4 n | Q4 SCC (95% CI) | Q5 n | Q5 SCC (95% CI) |
|---|---|---|---|---|---|---|---|---|
| Week 1 | 214 | 0.50 (0.39, 0.61) | 214 | 0.52 (0.42, 0.63) | 205 | 0.62 (0.51, 0.73) | 209 | 0.64 (0.55, 0.74) |
| Week 2 | 220 | 0.45 (0.33, 0.56) | 220 | 0.53 (0.42, 0.63) | 213 | 0.63 (0.53, 0.74) | 216 | 0.67 (0.58, 0.76) |
| Week 3 | 225 | 0.40 (0.28, 0.52) | 225 | 0.50 (0.39, 0.61) | 213 | 0.65 (0.54, 0.75) | 221 | 0.62 (0.53, 0.72) |

CI=confidence interval; SCC=Spearman correlation coefficients

* The categories used to assess Q1 were: daily; a few times/week; once/week; a couple times/month; monthly; rarely/first time/never

** The categories used to assess Q2 and Q5 were: 0; <15 min; 15–30 min; 31–60 min; >1 but <2 hours; 2–3 hours; >3 but <5 hours; >=5 hours

Question items correspond to the question numbers in Appendix 1.

Q1 and Q2 were compared to GPS findings averaged over the 3-week period.

Q4 and Q5 were compared to the GPS from the specific week under study.

The estimates of the association between the self-reported measures of (1) past week park visit frequency and (2) past week park visit duration and the corresponding GPS outcome were significantly different from zero (p<0.0001 for both linear mixed effect models). We used these models to then test interactions between the self-reported measures and each covariate one at a time. For past week park use frequency, interactions with gender, age, race/ethnicity, education, and BMI were not significant. However, for participants recruited from the parks the association between the self-reported and GPS measures was weaker (p=0.05) than the association for participants recruited from nearby residences. Similarly, the association between the self-reported and GPS measures for the OH participants was weaker (p=0.002) than the association for the CA participants.

For past week park use duration, interactions with gender, age, BMI, and participant recruitment method were not significant. However, the association between the two measures for non-Hispanic black participants was weaker (p=0.03) than for Hispanics/Other race/ethnicity, and for participants with college or higher education the association was stronger (p=0.009) than for those with high school education or less. Also, for the OH participants the association was weaker (p=0.007), while for the NC participants the association was stronger (p=0.0006), when compared to the association estimated for the CA participants.

Test-retest Reliability Assessment

Considering the 6 possible comparisons, the assessment of test-retest reliability using SCC ranged from 0.78 to 0.88 (percent agreement 68.5%–82.3%) for usual frequency of park visits (Table 6). For the overall sample, both Cronbach’s alpha and the ICC were almost perfect for usual frequency of park visits (α=0.85, ICC 0.88 (95% CI 0.86, 0.91)) and the ICCs were consistent across the covariates (i.e., all values remained in the same Landis and Koch (1977) category) (Table 7).

Table 6.

Evidence for test-retest reliability of the park use questionnaire (n=232)

Q1 = usual frequency of park visits; Q2 = usual duration of park visits; Q3 = park activities

| Comparisons | Q1 PO | Q1 SCC (95% CI) | Q2 PO | Q2 SCC (95% CI) | Q3 PO | Q3 Simple Kappa (95% CI) |
|---|---|---|---|---|---|---|
| Baseline vs. Week 1 | 71.6 | 0.84 (0.78, 0.89) | 69.0 | 0.82 (0.75, 0.88) | 91.2 | 0.61 (0.45, 0.76) |
| Baseline vs. Week 2 | 71.6 | 0.81 (0.74, 0.88) | 66.8 | 0.81 (0.74, 0.87) | 89.9 | 0.55 (0.39, 0.71) |
| Baseline vs. Week 3 | 68.5 | 0.78 (0.70, 0.86) | 64.2 | 0.75 (0.67, 0.83) | 37.5 | 0.46 (0.30, 0.63) |
| Week 1 vs. Week 2 | 78.8 | 0.88 (0.84, 0.93) | 71.6 | 0.84 (0.77, 0.91) | 91.6 | 0.67 (0.53, 0.81) |
| Week 1 vs. Week 3 | 78.8 | 0.86 (0.80, 0.92) | 67.2 | 0.79 (0.71, 0.86) | 89.8 | 0.61 (0.46, 0.75) |
| Week 2 vs. Week 3 | 82.3 | 0.88 (0.83, 0.93) | 73.3 | 0.84 (0.78, 0.90) | 94.2 | 0.78 (0.66, 0.89) |

Question items correspond to the numbers in Appendix 1.

CI=confidence interval; SCC=Spearman correlation coefficients; PO=proportion of raw agreement

Table 7.

Evidence for test-retest reliability using 4 weeks of surveys (n=232)

| | Q1. Usual frequency of park visits, ICC (95% CI) | Q2. Usual duration of park visits, ICC (95% CI) | Q3. Park activities, Overall Kappa* (95% CI) |
|---|---|---|---|
| Overall | 0.88 (0.86, 0.91) | 0.82 (0.78, 0.85) | 0.59 (0.54, 0.64) |
| Stratified analysis: | | | |
| Gender: Male | 0.86 (0.82, 0.89) | 0.81 (0.76, 0.85) | 0.63 (0.57, 0.70) |
| Gender: Female | 0.86 (0.82, 0.89) | 0.81 (0.76, 0.85) | 0.47 (0.40, 0.54) |
| Age: 18–35 | 0.89 (0.86, 0.92) | 0.82 (0.77, 0.86) | 0.56 (0.49, 0.64) |
| Age: 36–59 | 0.89 (0.85, 0.92) | 0.87 (0.82, 0.91) | 0.57 (0.49, 0.66) |
| Age: >=60 | 0.86 (0.79, 0.92) | 0.73 (0.62, 0.83) | 0.64 (0.53, 0.74) |
| Race/Ethnicity: non-Hispanic White | 0.88 (0.85, 0.91) | 0.89 (0.86, 0.92) | 0.77 (0.71, 0.84) |
| Race/Ethnicity: non-Hispanic Black | 0.84 (0.78, 0.90) | 0.59 (0.47, 0.71) | 0.34 (0.25, 0.43) |
| Race/Ethnicity: Hispanic or Other | 0.95 (0.93, 0.97) | 0.87 (0.82, 0.92) | 0.53 (0.43, 0.63) |
| Education: College or more | 0.90 (0.87, 0.92) | 0.86 (0.82, 0.89) | 0.71 (0.65, 0.78) |
| Education: Some college or vocational | 0.86 (0.79, 0.91) | 0.83 (0.76, 0.89) | 0.62 (0.52, 0.72) |
| Education: Less than high school, high school, or GED | 0.86 (0.80, 0.91) | 0.72 (0.61, 0.81) | 0.28 (0.18, 0.38) |
| Body mass index: Under or normal weight | 0.86 (0.82, 0.90) | 0.86 (0.82, 0.90) | 0.69 (0.61, 0.78) |
| Body mass index: Overweight | 0.92 (0.88, 0.94) | 0.88 (0.84, 0.92) | 0.65 (0.57, 0.73) |
| Body mass index: Obese | 0.87 (0.82, 0.91) | 0.70 (0.60, 0.78) | 0.45 (0.36, 0.54) |
| Participant Recruitment: Household interview | 0.89 (0.84, 0.93) | 0.89 (0.84, 0.93) | 0.54 (0.49, 0.59) |
| Participant Recruitment: Park user | 0.87 (0.84, 0.89) | 0.77 (0.73, 0.82) | 0.70 (0.59, 0.81) |
| State: CA | 0.95 (0.92, 0.97) | 0.86 (0.80, 0.91) | 0.64 (0.53, 0.76) |
| State: NC | 0.94 (0.91, 0.96) | 0.88 (0.82, 0.92) | 0.78 (0.66, 0.89) |
| State: NM | 0.92 (0.88, 0.95) | 0.91 (0.86, 0.94) | 0.76 (0.66, 0.85) |
| State: OH | 0.80 (0.71, 0.87) | 0.75 (0.64, 0.83) | 0.59 (0.48, 0.70) |
| State: PA | 0.80 (0.70, 0.87) | 0.71 (0.59, 0.82) | 0.26 (0.15, 0.38) |

ICC= intraclass correlation coefficient; CI=confidence interval

* Overall kappa is a simple kappa that accounts for the four repeat survey administrations within person.

Considering the 6 possible comparisons, the assessment of test-retest reliability using SCC ranged from 0.75 to 0.84 (percent agreement 64.2%–73.3%) for usual park visit duration (Table 6). For the overall sample, Cronbach’s alpha and the ICC were almost perfect for usual park visit duration (α=0.84, ICC 0.82 (95% CI 0.78, 0.85)) and the ICCs were consistent across gender. However, the ICCs were lower for those aged 60 years and older, among non-Hispanic blacks, and among those with a high school education or less (Table 7). They were also lower among obese participants, those recruited in the park, and the OH and PA participants.

Usual activities were classified as active or not active, with the simple kappa coefficient ranging from 0.46 to 0.78 (percent agreement 37.5%–94.2%) considering the 6 possible comparisons (Table 6). For the overall sample, the overall kappa was 0.59, with lower agreement among women compared to men, among non-Hispanic Blacks compared to other race/ethnic groups, and among those with a high school education or less (Table 7). Agreement was also lower among obese participants, those recruited from households, and the PA participants.

Discussion

The brief questionnaire on usual frequency and duration of park use demonstrated moderate criterion validity (SCC 0.40–0.53) and substantial to almost perfect test-retest reliability. The item on usual week activities engaged in at the park, categorized as active or not active, varied in agreement from moderate to substantial. The items on past week park use frequency and duration demonstrated substantial criterion validity. Reliability for these items was not assessed, since the recall periods varied week-to-week.

Validity

Considering evidence for criterion validity, we found lower agreement for duration and frequency of usual park use compared to past week park use. This was not surprising, given that our comparison measure for a usual week was derived by averaging the first, second, and third week values from the GPS. Temporally, the items did not correspond, which likely introduced a source of error into the validity estimates. It is unclear what time period participants refer to when asked to recall a usual week. The questionnaire may be improved by specifying a usual week in the past month. After exploring the mismatches in the data, it may be that, if both sets of questions are used, park activities in the past week should be asked before park activities in a usual week, and that the words “past week” and “usual week” should be emphasized to highlight the difference between the items.

The questions on past week park use frequency and duration showed similar evidence for validity by gender, age, and BMI. However, for past week park use frequency, participants recruited from the parks had lower agreement than participants recruited from nearby residences. For past week park visit duration, non-Hispanic blacks showed lower agreement than Other/Hispanic race/ethnic categories, and those with at least a college education showed significantly higher agreement than those with a high school education or less. There were also some site differences for both questions. The reasons for these differences are not understood, but in some cases they may be due to park usage. Those who use the park more often may have lower agreement, since they have more episodes to recall than participants who use the park less often. This is consistent with our finding that participants recruited from within the park, who had higher study park usage than those recruited from households, also had lower agreement on the question about past week park use frequency.

Reliability

We did not find that reporting improved over the course of retaking the questionnaire week-to-week, labeled a learning effect, as happened in another study (Craig et al., 2003). If a learning effect had occurred, we would expect higher reliability at later time points. Instead, we found that the SCC for both usual and past week park use was similar week-to-week and that the closer together the weeks were, the higher the SCC for usual park use. Test-retest reliability from baseline to week 3 was lower than for comparisons only one week apart for usual frequency, duration, and types of activities performed in the park.

In terms of individual characteristics, we found that those 60 years and older reported usual duration less consistently than the younger age groups. Other physical activity assessment studies have found older age groups to have lower test-retest reliability (Meyer, Evenson, Morimoto, Siscovick, & White, 2009), which may be related to declines in cognitive abilities with age. The question on usual park duration was also answered less consistently among non-Hispanic blacks, those with a high school education or less, obese participants, those recruited in the park, and for some sites. Despite this lower consistency, reliability and validity were still adequate and suggest that the questionnaire would be appropriate for use with these subgroups. Cognitive interviewing may lead to further insight into these discrepancies and suggestions for questionnaire improvement (Altschuler et al., 2009).

Strengths and Limitations

The strengths of this study include the diversity of geographic locations and participants enrolled, along with the novel use of GPS and GIS data to create an objective assessment of park use. Although the use of GPS is a strength of the study, it has several limitations worth noting. The GPS battery could not last for an entire week, and therefore participants were asked to charge the unit each night. Park visits may have been missed while the battery was charging. GPS units have difficulty recording locations in dense urban environments, especially with large closely connected buildings, or indoors. This could affect the classification of park visits that occurred inside recreation centers or other indoor park buildings, or in study parks in urban areas surrounded by buildings.

A limitation of this study is that it did not assess test-retest reliability of frequency and duration of park use in the past week. Also, the questionnaire did not directly assess relative or absolute intensity of activities in the park; instead, we classified the reported activities using METs (Ainsworth et al., 2000) to assess absolute intensity. Future iterations of the questionnaire could consider adding direct intensity questions. It is also important to note that the questionnaire focused only on one park near where the participant lived. Other iterations may want to consider asking about any park use.

Conclusion

In this paper, we examined the criterion validity and test-retest reliability of a questionnaire to assess usual and past week park use. We compared past-week and usual-week self-reported frequency and duration of park visits with GPS data. Our results showed evidence for acceptable criterion validity and test-retest reliability across a diverse group of participants. Results for questions related to the past week showed stronger evidence of validity than results for a usual week, due in part to how we assessed usual park use (e.g., by examining the past three weeks of data). We also found differences in the evidence for validity and reliability by certain sociodemographic characteristics and obesity. Although frequency of park use may partly explain these differences, other explanations could be explored to improve the questionnaire further. More generally, the study illustrates the usefulness of evaluating the psychometric properties of data collection tools for the study and evaluation of individuals’ behavior.

Acknowledgments

This study was supported by the National Institutes of Health (NIH), National Heart, Lung, and Blood Institute grant #R01HL092569. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. The authors thank the co-investigators, study coordinators, and staff for their help with this study. We also acknowledge the anonymous reviewers and comments by Sara Satinsky on an earlier draft of the paper.

Appendix 1: Brief questionnaire on park use

  1. How often do you usually visit (park name)? (this variable was reverse coded for the analysis)

    • daily

    • a few times a week

    • once per week

    • a couple times per month

    • monthly

    • rarely

    • this is the first time

    • never (If this option was chosen, then #2 was coded as zero and #3 was set to missing; they were skipped to #4.)

  2. On a typical day when you come to this park, how long do you usually stay?

    • <15 minutes

    • 15–30 minutes

    • 31–60 minutes

    • >1 hour, but <2 hours

    • 2 to 3 hours

    • >3 hours, but <5 hours

    • 5 or more hours

  3. What do you usually do at the park? (More than one answer can be provided.)

  4. Over the past 7 days, how many times did you visit this park?

    • (Insert number. If 0 was chosen then #5 was coded as “I did not go to the park in the last 7 days” and they were finished with this set of questions.)

  5. In the past 7 days, how long did you stay at the park on your most recent visit?

    • I did not go to the park in the last 7 days

    • <15 minutes

    • 15–30 minutes

    • 31–60 minutes

    • >1 hour, but <2 hours

    • 2 to 3 hours

    • >3 hours, but <5 hours

    • 5 or more hours

  6. In the past 7 days, on the last day you came to the park, what did you do while there? (More than one answer can be provided.)
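The coding rules noted in parentheses in the appendix (reverse coding of item 1 and the two skip patterns) could be applied along the following lines. This is a sketch under stated assumptions: the dictionary keys, the function name, and the 1 to 8 numeric scale are hypothetical, since the appendix does not give the numeric codes actually used.

```python
FREQUENCY_OPTIONS = [
    "daily", "a few times a week", "once per week", "a couple times per month",
    "monthly", "rarely", "this is the first time", "never",
]
NO_VISIT = "I did not go to the park in the last 7 days"


def code_responses(raw):
    """Return a coded copy of one participant's raw answers (a dict)."""
    coded = dict(raw)
    position = FREQUENCY_OPTIONS.index(raw["q1"]) + 1  # 1 = daily ... 8 = never
    # Reverse code so larger numbers mean more frequent visits (assumed scale).
    coded["q1_code"] = len(FREQUENCY_OPTIONS) + 1 - position
    if raw["q1"] == "never":
        coded["q2"] = 0     # usual duration coded as zero
        coded["q3"] = None  # usual activities set to missing
    if raw["q4"] == 0:
        coded["q5"] = NO_VISIT  # no visit in the past 7 days
    return coded


# Hypothetical respondent who never visits the park
example = {"q1": "never", "q2": None, "q3": None, "q4": 0, "q5": None}
print(code_responses(example))
```

Keeping the skip rules in one function means the same coding is applied at every administration of the questionnaire, which matters when responses are compared week-to-week.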

Contributor Information

Kelly R. Evenson, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina – Chapel Hill, Chapel Hill, NC, USA.

Fang Wen, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina – Chapel Hill, Chapel Hill, NC, USA.

Daniela Golinelli, RAND Corporation, Santa Monica, CA, USA.

Daniel A. Rodríguez, Department of City and Regional Planning, University of North Carolina – Chapel Hill, Chapel Hill, NC, USA.

Deborah A. Cohen, RAND Corporation, Santa Monica, CA, USA.

References

  1. Ainsworth B, Haskell W, Whitt M, Irwin M, Swartz A, Strath S, et al. Compendium of physical activities: an update of activity codes and MET intensities. Med Sci Sports Exerc. 2000;32(9 supplement):S498–S516. doi: 10.1097/00005768-200009001-00009.
  2. Altschuler A, Picchi T, Nelson M, Rogers JD, Hart J, Sternfeld B. Physical activity questionnaire comprehension: lessons from cognitive interviews. Med Sci Sports Exerc. 2009;41(2):336–343. doi: 10.1249/MSS.0b013e318186b1b1.
  3. Bland J, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307–310.
  4. Chen B, Zaebst D, Seel L. A macro to calculate kappa statistics for categorizations by multiple raters: Paper 155–30. SUGI 30 Proceedings; 2005. [Accessed November 28, 2011]. pp. 1–6. at http://www2012.sas.com/proceedings/sugi2030/2155-2030.pdf.
  5. Cohen D, Setodji C, Evenson K, Ward P, Hillier A, Lapham S, et al. How much observation is enough? Refining the administration of SOPARC. J Phys Act Health. 2011;8(6):1117–1123. doi: 10.1123/jpah.8.8.1117.
  6. Cohen DA, Golinelli D, Williamson S, Sehgal A, Marsh T, McKenzie TL. Effects of park improvements on park use and physical activity: policy and programming implications. Am J Prev Med. 2009;37(6):475–480. doi: 10.1016/j.amepre.2009.07.017.
  7. Cohen DA, McKenzie TL, Sehgal A, Williamson S, Golinelli D, Lurie N. Contribution of public parks to physical activity. Am J Public Health. 2007;97(3):509–514. doi: 10.2105/AJPH.2005.072447.
  8. Craig C, Marshall A, Sjostrom M, Bauman A, Booth M, Ainsworth B, et al. International physical activity questionnaire: 12-country reliability and validity. Med Sci Sports Exerc. 2003;35(8):1381–1395. doi: 10.1249/01.MSS.0000078924.61453.FB.
  9. Ioja C, Rozylowicz L, Patroescu M, Nita M, Vanau G. Dog walkers’ vs. other park visitors’ perceptions: The importance of planning sustainable urban parks in Bucharest, Romania. Landscape and Urban Planning. 2011;103(1):74–82.
  10. Jones AP, Coombes EG, Griffin SJ, van Sluijs EM. Environmental supportiveness for physical activity in English schoolchildren: a study using Global Positioning Systems. Int J Behav Nutr Phys Act. 2009;6:42. doi: 10.1186/1479-5868-6-42.
  11. Kaczynski AT, Henderson KA. Parks and recreation settings and active living: a review of associations with physical activity function and intensity. J Phys Act Health. 2008;5(4):619–632. doi: 10.1123/jpah.5.4.619.
  12. Kruger J, Mowen AJ, Librett J. Recreation, parks, and the public health agenda: developing collaborative surveillance frameworks to measure leisure time activity and active park use. J Phys Act Health. 2007;4(Suppl 1):S14–23. doi: 10.1123/jpah.4.s1.s14.
  13. Laird N, Ware J. Random-effects models for longitudinal data. Biometrics. 1982;38:963–974.
  14. Landis J, Koch G. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
  15. McKenzie T, Cohen D, Sehgal A, Williamson S, Golinelli D. System for Observing Play and Recreation in Communities (SOPARC): Reliability and Feasibility Measures. Journal of Physical Activity and Health. 2006;3(suppl 1):S208–S222.
  16. Meyer AM, Evenson KR, Morimoto L, Siscovick D, White E. Test-retest reliability of the Women’s Health Initiative physical activity questionnaire. Med Sci Sports Exerc. 2009;41(3):530–538. doi: 10.1249/MSS.0b013e31818ace55.
  17. Payne L, Orsega-Smith B, Roy M, Godbey G. Local park use and personal health among older adults: An exploratory study. J Park Recreation Administration. 2005;23(2):1–20.
  18. Quigg R, Gray A, Reeder AI, Holt A, Waters DL. Using accelerometers and GPS units to identify the proportion of daily physical activity located in parks with playgrounds in New Zealand children. Prev Med. 2010;50(5–6):235–240. doi: 10.1016/j.ypmed.2010.02.002.
  19. Raymore L, Scott D. The characteristics and activities of older adult visitors to metropolitan park districts. J Park Recreation Administration. 1998;16(4):1–21.
  20. Su JG, Jerrett M, de Nazelle A, Wolch J. Does exposure to air pollution in urban parks have socioeconomic, racial or ethnic gradients? Environ Res. 2011;111(3):319–328. doi: 10.1016/j.envres.2011.01.002.
  21. Tinsley H, Tinsley C, Croskeys C. Park usage, social milieu, and psychosocial benefits of park use reported by older urban park users from four ethnic groups. Leisure Sci. 2002;24(2):199–218.
  22. U.S. Department of Health and Human Services. ODPHP Publication No U0036. Washington, D.C: 2008. [Accessed November 1, 2008]. 2008 Physical Activity Guidelines for Americans. at http://www.health.gov/paguidelines/
  23. Walker JT, Mowen AJ, Hendricks WW, Kruger J, Morrow JR, Jr, Bricker K. Physical activity in the park setting (PA-PS) questionnaire: reliability in a California statewide sample. J Phys Act Health. 2009;6(Suppl 1):S97–S104. doi: 10.1123/jpah.6.s1.s97.
  24. Watts G, Pheasant R, Horoshenkov Kirill V. Predicting perceived tranquillity in urban parks and open spaces. Environment and Planning B: Planning and Design. 2011;38(4):585–594.
  25. Wheeler BW, Cooper AR, Page AS, Jago R. Greenspace and children’s physical activity: a GPS/GIS analysis of the PEACH project. Prev Med. 2010;51(2):148–152. doi: 10.1016/j.ypmed.2010.06.001.
  26. Wolch J, Jerrett M, Reynolds K, McConnell R, Chang R, Dahmann N, et al. Childhood obesity and proximity to urban parks and recreational resources: A longitudinal cohort study. Health Place. 2011 doi: 10.1016/j.healthplace.2010.10.001. In press – Epub ahead of print 10/15/2010.
