Author manuscript; available in PMC: 2022 Sep 16.
Published in final edited form as: Health Place. 2020 Sep 6;66:102388. doi: 10.1016/j.healthplace.2020.102388

Table 4.

Quality reporting of observational school audit tool studies.

| ID | Tool / study | Was formative or pilot testing done, or are adaptations described? | What type (if any) of reliability testing was done? What were the results? | What type (if any) of validity testing was done? What were the results? | Is the scoring protocol described? | Total Items Reported |
|---|---|---|---|---|---|---|
| 1 | ACTION! Staff Audit | New; pilot study | – | – | – | 1 |
| 2 | Adachi et al., 2013 | – | – | – | Locations described, then other items summed to reflect total number of machines, filled slots, and machine-front advertising per school | 1 |
| 3 | Belansky et al., 2013 | – | – | – | Combined with other data to describe implementation changes, then qualitatively classified changes as effective, promising, or emerging | 1 |
| 4 | Branding Checklist | New; pilot-tested at first school | – | – | Combined with other data sources, qualitatively analyzed using constant comparative method to find patterns | 2 |
| 5 | Co-SEA | Adapted from ENDORSE and SPEEDY; pilot-tested to work through technical difficulties | – | – | Scored similarly to ENDORSE and SPEEDY, with change scores calculated from Year 1 to Year 2 | 2 |
| 5.1 | Co-SEA (Unadapted) | Used unadapted from original COMPASS study | – | – | – | 1 |
| 6 | EAPRS | New; input from parks officials/users, with revisions made over several iterations | Inter-rater (% agreement, ICC, Kappas: 66% of items had good-excellent reliability) | Face (several rounds of input from parks and recreation staff and park users) | Variable created for each exposure category (summed binary or frequency items, averaged categorical items) | 4 |
| 7 | ENDORSE | New; reviewed by experts and pilot-tested | – | – | Summed or counted, re-coded into 8 "availability" variables, which were dichotomized or categorized into tertiles | 2 |
| 8 | Food Decision Environment Tool | New; developed using behavioral economics theories, modified based on feedback from school/study stakeholders throughout | Inter-rater (system to resolve discrepancies during analysis; peer debriefing meetings conducted to clarify) | Trustworthiness of methods (e.g., credibility, transferability, dependability, confirmability) was established and described | Data from observational form were summarized and triangulated using field notes, and analyzed qualitatively for emerging themes | 4 |
| 9 | GRF-OT | New; developed over several iterations with input from experts and field testing through PlayWorks | Inter-rater (weighted Kappa: 0.54–1.00; scale ICC: 0.84); test-retest (ICC: 0.95) | Convergent (associated with activity levels); content (fit assessed using exploratory structural equation modeling) | 4-category items summed within subdomains | 2 |
| 10 | Hecht et al., 2017 | – | Inter-rater (Kappa: 0.88–1.00) | – | – | 1 |
| 11 | ISAT | Adapted from SPEEDY and IDEA, then customized by country | Inter-rater (% agreement: 83.9–100%; Kappa: 0.61–0.96) | Construct (could discriminate child PA between highest and lowest quintile schools) | Binary items were reported, and items were also summed within each category | 4 |
| 12 | Laurie et al., 2017 | New; developed using pre-existing policies and guidelines and piloted in 9 schools | – | – | Categorical data were expressed as frequencies and percentages | 2 |
| 13 | LCFO | Adapted from SNDA and unpublished tools, based on input from nutrition professionals | Inter-rater (% agreement: >80% with gold-standard researcher; monthly quality control review) | – | – | 2 |
| 14 | PARA | Used unadapted from original PARA study of community physical activity resources | Type not specified (r_s > 0.77) | – | Scored according to original PARA protocol (frequencies of features, amenities, and incivilities are summed; quality presented as a 3- or 4-item scale) | 3 |
| 14.1 | PARA (Adapted) | Adapted from original PARA | Inter-rater (% agreement: >80% with research lead; monthly quality control review) | – | – | 2 |
| 15 | Patel et al., 2009 | New; developed by members of a community advisory board in several iterations, including a mock site visit | Inter-rater (Kappa = 0.65–1.0; ICC = 1.0) | – | Observers compare records to the foods/beverages that align with policy; other information was qualitatively coded with other data sources to identify themes | 3 |
| 16 | School Food Environment Scan | – | Inter-rater not conducted because observers completed it together | – | Binary items summed into scale and dichotomized; frequency items summed and dichotomized (some vs. none) | 2 |
| 17 | School Lunchroom Audits | – | – | – | Items are combined with field notes and photos to generate a scale score for each service line; summed across all service lines in each school | 1 |
| 18 | SF-EAT | New; developed based on literature review and existing policy documents, then tested for feasibility in 7 schools | – | Face (circulated to Co-Is and project partners) | Items combined with other data sources into 6 pre-determined domains, each scored 1–5 based on extent to which initiatives are happening | 3 |
| 19 | SNDA-III | Adapted from previous iterations of SNDA study | – | – | Binary variables on audit were combined with other data sources and summed in 3 different categories | 2 |
| 20 | SNEO | New; conducted Q-sort with 8 research staff to select items | Inter-rater (Gwet's AC1 = 0.73); internal consistency (Cronbach's α = 0.77–0.85) | – | Two subscales were created: recommended and non-recommended items | 3 |
| 21 | SPACE Checklist | Used unadapted from SPACE (Spatial Planning and Children's Exercise) study, but applied to schools | Inter-rater (system to resolve discrepancies on-site) | – | – | 2 |
| 22 | SPAN-ET | Adapted from several existing instruments | Inter-rater (% agreement: 80.8–96.8%; Kappa: 0.61–0.94) | Face and content (field-tested, with school personnel providing subject-matter expertise) | Binary items are summed within each category, then categories are explained by a 4-item scale | 4 |
| 23 | SPEEDY | New, but based on existing green space instrument | Inter-rater (% agreement: 76–90%; Kappa: 0.67–1) | Face (draft sent to 3 experts); construct (could discriminate child PA between highest and lowest quintile schools) | Binary items were summed, frequencies were weighted by response mean, scales were weighted, then all were summed within each category | 2 |
| 23.1 | SPEEDY (Adapted; Dias et al., 2017) | Adapted from SPEEDY; used only sports and play facility category | – | – | Scored according to original SPEEDY protocol | 2 |
| 23.2 | SPEEDY (Adapted; Harrison et al., 2016) | Slightly adapted from SPEEDY; added 3 facilities commonly recorded as 'other' in original audit | – | – | Scored according to original SPEEDY protocol | 3 |
| 23.3 | SPEEDY (Adapted; Tarun et al., 2017) | Adapted from SPEEDY; removed a few items and added "comments" (advised by local experts) | Inter-rater (Kappa: 0.4–1.0; % agreement: 61.9–100.0%) | – | Scored according to original SPEEDY protocol | 3 |
| 23.4 | SPEEDY (Unadapted; Chalkley et al., 2018) | – | – | – | – | 0 |
| 23.5 | SPEEDY (Unadapted; Hyndman and Chancellor, 2017) | Used unadapted from original SPEEDY study | – | – | Scored according to original SPEEDY protocol | 2 |
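For readers unfamiliar with the inter-rater statistics reported in the table (% agreement, Cohen's Kappa), the sketch below shows how they are typically computed for a single binary audit item coded by two observers. This is an illustrative example with hypothetical ratings, not code or data from any of the reviewed studies.

```python
# Minimal sketch: percent agreement and Cohen's kappa for two raters
# coding one binary audit item. All ratings below are hypothetical.
from collections import Counter

def percent_agreement(r1, r2):
    """Share of items on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(r1)
    p_obs = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement: probability both raters pick the same category
    # if each coded independently at their own marginal rates.
    p_chance = sum((c1[k] / n) * (c2[k] / n) for k in set(r1) | set(r2))
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical codes (1 = feature present, 0 = absent) from two observers
rater1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
rater2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
print(f"% agreement: {percent_agreement(rater1, rater2):.0%}")   # 80%
print(f"Cohen's kappa: {cohens_kappa(rater1, rater2):.2f}")      # 0.58
```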
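Several rows in the table also describe a recurring scoring pattern: binary presence/absence items are summed within a category to give a school-level score, which is then dichotomized or split into tertiles (e.g., ENDORSE, SPEEDY, School Food Environment Scan). The following is a minimal sketch of that pattern only; the item names, cut points, and audit data are hypothetical and not taken from any reviewed tool.

```python
# Minimal sketch (hypothetical items and data): summing binary audit items
# into a category score, then classifying schools into tertiles.
import statistics

ITEMS = ["vending_machine", "snack_shop", "water_fountain"]  # hypothetical

def availability_score(audit):
    """Sum binary presence/absence items into one availability count."""
    return sum(audit.get(item, 0) for item in ITEMS)

def tertile(score, all_scores):
    """Label a score low/mid/high relative to the sample's tertile cuts."""
    cuts = statistics.quantiles(all_scores, n=3)  # two cut points
    if score <= cuts[0]:
        return "low"
    return "mid" if score <= cuts[1] else "high"

audits = [
    {"vending_machine": 1, "snack_shop": 1, "water_fountain": 1},
    {"vending_machine": 0, "snack_shop": 1, "water_fountain": 1},
    {"vending_machine": 0, "snack_shop": 0, "water_fountain": 1},
    {"vending_machine": 0, "snack_shop": 0, "water_fountain": 0},
]
scores = [availability_score(a) for a in audits]
print(scores)                                # [3, 2, 1, 0]
print([tertile(s, scores) for s in scores])  # ['high', 'mid', 'mid', 'low']
```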