Author manuscript; available in PMC: 2017 Nov 1.
Published in final edited form as: Am J Prev Med. 2016 Aug 12;51(5):792–800. doi: 10.1016/j.amepre.2016.06.006

“Spatial Energetics”: Integrating Data From GPS, Accelerometry, and GIS to Address Obesity and Inactivity

Peter James 1, Marta Jankowska 2, Christine Marx 3, Jaime E Hart 4, David Berrigan 5, Jacqueline Kerr 6,7, Philip M Hurvitz 8, J Aaron Hipp 9, Francine Laden 10
PMCID: PMC5067207  NIHMSID: NIHMS810549  PMID: 27528538

Abstract

To address the current obesity and inactivity epidemics, public health researchers have attempted to identify spatial factors that influence physical inactivity and obesity. Technologic and methodologic developments have led to a revolutionary ability to examine dynamic, high-resolution measures of temporally matched location and behavior data through GPS, accelerometry, and GIS. These advances allow the investigation of spatial energetics, high–spatiotemporal resolution data on location and time-matched energetics, to examine how environmental characteristics, space, and time are linked to activity-related health behaviors with far more robust and detailed data than in previous work. Although the transdisciplinary field of spatial energetics demonstrates promise to provide novel insights on how individuals and populations interact with their environment, there remain significant conceptual, technical, analytical, and ethical challenges stemming from the complex data streams that spatial energetics research generates. First, it is essential to better understand what spatial energetics data represent, the relevant spatial context of analysis for these data, and if spatial energetics can establish causality for development of spatially relevant interventions. Second, there are significant technical problems for analysis of voluminous and complex data that may require development of spatially aware scalable computational infrastructures. Third, the field must come to agreement on appropriate statistical methodologies to account for multiple observations per person. Finally, these challenges must be considered within the context of maintaining participant privacy and security. This article describes gaps in current practice and understanding, and suggests solutions to move this promising area of research forward.

Introduction

Physical inactivity, a contributor to obesity and an independent risk factor for poor health, is linked to a substantial proportion of the global burden of diseases including coronary heart disease, diabetes, and premature mortality.1 The social-ecologic model explains that physical activity is a consequence of the dynamic social, physical, and ecologic context with which individuals interact.2 Improving population health through modifications to the environment to increase physical activity is a promising public health approach because environmental features are often ubiquitous exposures that can be modified broadly through economic, development, and transportation policy changes.2–8

Advances in technology have enabled novel research to understand how environmental factors influence physical activity. GPS data, which provide time-indexed geographic coordinates, can be used to link locations with environment; to calculate speed of movement; to represent the various locations an individual visits throughout the day (activity space); or to explain behavior, such as trips between places.9 Technologies to assess physical activity, such as accelerometry to objectively measure movement, have evolved and are available at low cost.10 GIS provides a computational framework in which spatially referenced data can be linked to environmental attributes (Figure 1). Advances in accelerometry, GPS, and GIS provide data to examine the relationship between location and physical activity objectively at precise spatiotemporal scales,11 and have resulted in a rapid increase in evidence supporting the important role of contextual factors.12–16 These technologies allow researchers to ask what types of activities at certain times, in specific locations, for various types of people matter for health outcomes. These data can be used, for instance, to test the influence of new infrastructure to affect behavior change17,18; to evaluate characteristics of locations that provide opportunities for physical activity19,20; or to create personalized metrics of air pollution exposure based on location and respiratory rate.21

Figure 1.


Hypothetical map of spatial energetics data: GPS data color coded by accelerometry cutpoints (illustration of basic spatial energetics data) overlaid on greenness spatial data.

Spatial energetics is defined here as the incorporation of high–spatiotemporal resolution data on location (e.g., GPS combined with GIS) and time-matched energetics (e.g., accelerometry to measure physical activity and sedentary behavior) to examine how environmental characteristics, space, and time are linked to activity-related health behaviors. One of the first studies to capitalize on spatial energetics data provided 35 adults in North Carolina with GPS units and accelerometers for a 3-day monitoring period.22 The authors found that only 46% of bouts of moderate to vigorous physical activity occurred within a 1-mile radial buffer of each participant’s home, a buffer where previous research had focused. A number of groups have expanded on this analysis with larger sample sizes in diverse populations, using different buffer sizes and shapes, and examining different environmental metrics.11,16,23–30
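The home-buffer comparison described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cited study's method: the function names, the coordinates, and the choice of one representative point per bout are hypothetical, and distance is approximated with the haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation
MILE_M = 1609.344           # one statute mile in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def share_of_bouts_near_home(home, bouts, radius_m=MILE_M):
    """Fraction of activity bouts located within radius_m of the home point.

    home  -- (lat, lon) of the home address
    bouts -- list of (lat, lon), one representative point per bout (assumed)
    """
    if not bouts:
        raise ValueError("no bouts supplied")
    inside = sum(
        1 for lat, lon in bouts
        if haversine_m(home[0], home[1], lat, lon) <= radius_m
    )
    return inside / len(bouts)


# Hypothetical example: one home and four bout locations.
home = (35.9940, -78.8986)
bouts = [
    (35.9940, -78.8986),  # at home
    (35.9990, -78.8986),  # roughly 0.56 km north of home
    (36.0400, -78.8986),  # roughly 5.1 km north of home
    (35.9940, -78.8000),  # roughly 8.9 km east of home
]
share = share_of_bouts_near_home(home, bouts)
```

A radial buffer is only one option; the network buffers and data-derived locations discussed later in the article would need a GIS library rather than a plain distance test.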

With technologic and methodologic advances represented by these analyses, there also come new assumptions, limitations, and questions of generalizability. Some have addressed emerging issues when combining GPS and physical activity data sets9,11; however, there remain unanswered conceptual, technical, analytical, and ethical questions that continue to challenge spatial energetics research. This article discusses dominant unanswered questions for the field and offers recommendations on how to confront these problems.

Conceptual Questions

As a transdisciplinary field, spatial energetics demands an integrated conceptual framework that incorporates concepts and theories from a number of specialties, including proximity (geography), time allocation (economics), travel behavior and values (transportation research), space–time studies of human behavior (planning), behavior (psychology, behavioral medicine), and health outcomes (epidemiology). Pioneering work from the 1970s on space–time budgeting31 and time–geography (e.g., daily time–space prisms32) developed the foundations of a time–space framework for analysis of “constraints” that determine behavior as an individual moves through space. Despite this, to date, the majority of analyses on environment and behavior have focused on static and distinct categorizations of places to define exposure (e.g., the home, neighborhood, school), as well as questionnaire-based health behaviors. Spatial energetics research generates detailed dynamic data that cannot be understood with this static theoretic approach. Below, major conceptual questions for the field are examined.

What Do Spatial Energetics Data Represent?

Linking behavioral and spatial data provides more specific and objective information about activities within environmental contexts than self-reported data alone.9 The data resolution allows for precise representation of behaviors across temporal and spatial units, rather than as single aggregate measures. These data allow for the exploration of detailed patterns, thresholds, relationships, sequences, and distributions of behavior in environmental context that may be masked by averaged or perception-based reporting. The granularity of these measured behaviors also allows for interventions that are specifically targeted at important time points, events, or locations along an individual’s daily trajectory. Disaggregation of spatiotemporal energetics data may improve causal inference and hence result in a refinement of location- and time-specific physical activity recommendations.

How Are Relevant Contexts That Influence Energetics Defined?

Central to the use of spatial data in studies of behavior is the precise meaning of “place,” as well as the relevance of particular places to energetics. Kwan33 identifies this as the uncertain geographic context problem defined as: “spatial uncertainty in the actual areas that exert the contextual influences under study.” If environmental exposures are measured in areas that have no bearing on behaviors, this exposure misclassification can bias effect estimates and may lead to inconsistent observed relationships between context and health behaviors.34–40 Spatial energetics addresses the uncertain geographic context problem by providing location data to test theories concerning relevant spatial contexts that are related to energetics, as well as to generate hypotheses through data-driven analytical approaches.41,42 Researchers can now tease out whether participants obtain the majority of their physical activity close to their home or workplace (Figure 2A). Algorithms can also identify all places individuals spend time, which researchers can use to generate hypotheses about their relative importance for energetics (Figure 2B). These data can isolate specific geographic contexts driving behaviors in certain individuals, how relevant contexts may differ across individuals, and what interpersonal characteristics may drive different relevant geographic contexts. Alternatively, the environment (along with other factors) may shape the area in which people move and incidentally determine exposure.

Figure 2.


Maps of spatial energetics data detailing conceptual issues: (a) GPS points and radial/line based network buffers around a home address; (b) Data derived locations.

How Can Researchers Build Evidence for Causal Relationships Between Context and Energetics?

Building evidence for causality requires a lack of confounding (no prior common cause that explains the relationship between exposure and outcome).43 In the context of spatial energetics research for observational studies, confounding by intrapersonal characteristics can occur through what Chaix et al.44,45 have termed “selective daily mobility bias,” which arises when unmeasured factors (e.g., intrapersonal characteristics) lead individuals to visit certain locations and drive the behaviors conducted in those locations. For instance, physical activity preferences may motivate individuals to visit gyms for exercise; however, daily proximity to a gym may have little bearing on the likelihood of engaging in this behavior. Because of this bias, individual-level conclusions on how contextual measures “cause” physical activity behaviors are limited. To address this bias, some have suggested combining spatial energetics data with mobility surveys to understand motivations for visiting certain locations, as well as disaggregating GPS data into multiple anchor points and only examining behavior around these anchor points.45 Another potential avenue of research is to utilize study designs such as longitudinal, natural experiment, or interventions to help establish causal relationships between environment and energetics.

Technical Questions

Planning for data collection, implementation, and compliance with study protocols, as well as cleaning and storing immense and complex data sets collected from GPS, accelerometer, and GIS technologies can pose considerable technologic and computational challenges to researchers. Furthermore, spatial data can be difficult to obtain and standardize across multiple study sites,46 and often include poorly documented levels of error and uncertainty.

How Accurate Are GIS Data?

As the spatial scope of studies increases, so does reliance on databases typically collected for other purposes, often with unknown errors and infrequent updates. Nationwide commercial databases, such as InfoUSA or Dun & Bradstreet, are often used to provide information on locations that might serve as walking destinations. As shown by a number of studies comparing these national data sets to field-validated data, results vary widely from 30% to 90% of locations matching at the street level.47–49 Such errors may be non-stationary, and differ, for instance, by the SES of a community, which may bias findings. Furthermore, error may be present in spatial accuracy and quality of the attribute data (e.g., park amenities). Quality may be validated using ground-level audit data, and more research is necessary comparing the utility of audit versus GIS data.50 One approach to estimate spatial bias is comparing a small-scale validation data set to a larger error-prone spatial data set, and measuring effect size differences between data sets. For example, comparing observer-verified park data in St. Louis, MO with national park data sets (Figure 3A) demonstrates that the two data sets overlapped for only 28.7% of data, with 308,268,222 m² of parks missed by national data sets. More information on the existence and effect of this bias is necessary to better understand how GIS error might affect inferences about energetics, and researchers should understand the potential error in spatial data sets when conducting spatial energetics research.

Figure 3.


Examples of technical issues with spatial energetics data: (a) Example of GIS layer error in St. Louis, Missouri park layers versus nationwide park layers; (b) Example of GPS scatter due to physical structures, bodies of water, and urban canyons that block or reflect GPS signals.

How Accurate Are GPS Data?

Inherent error in GPS data can displace geographic coordinates by 3–10 m, which may be negligible or may be a major concern, depending on research aims.51 Physical structures, bodies of water, and urban canyons can also block or reflect GPS signals, leading to missing data and GPS scatter (Figure 3B). More research is required to identify factors related to compliance (e.g., age, gender) either to increase compliance in these populations or to statistically adjust for lack of compliance. Missing GPS data can be a substantial problem; in an analysis of 782 individuals in San Diego County who wore GPS and accelerometers for an average of 5.6 days, 17.4% of data were missing because of GPS signal lapse, resulting in an average of 41 minutes of daily light and moderate to vigorous physical activity without contextual data.52 Methods to impute missing data using a nearest neighbor approach from the previous and next valid GPS point are promising to address missing GPS data.53,54 Software that automatically cleans large GPS data sets by imputing and reducing scatter, such as University of California, San Diego’s Personal Activity and Location Measurement System, may be an important first step in GPS data processing, as is conducting GPS validations to understand GPS scatter rates in specific environments.55
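One simple reading of nearest neighbor imputation from the previous and next valid GPS point is sketched below. It is an illustration only and does not reproduce the cited methods or the Personal Activity and Location Measurement System: the function name, the tuple layout, and the `max_gap_s` limit are assumptions made for the example.

```python
def impute_nearest_fix(track, max_gap_s=600):
    """Fill missing GPS epochs with the valid fix that is closest in time.

    track     -- list of (epoch_seconds, lat, lon) sorted by time;
                 lat and lon are None where the signal was lost
    max_gap_s -- hypothetical limit: gaps longer than this stay missing
    Returns a list of (epoch_seconds, lat, lon, imputed_flag).
    """
    n = len(track)

    def valid(i):
        return track[i][1] is not None and track[i][2] is not None

    # Index of the most recent valid fix at or before each epoch.
    prev_idx, last = [None] * n, None
    for i in range(n):
        if valid(i):
            last = i
        prev_idx[i] = last

    # Index of the next valid fix at or after each epoch.
    next_idx, nxt = [None] * n, None
    for i in range(n - 1, -1, -1):
        if valid(i):
            nxt = i
        next_idx[i] = nxt

    filled = []
    for i, (t, lat, lon) in enumerate(track):
        if valid(i):
            filled.append((t, lat, lon, False))
            continue
        candidates = []
        for j in (prev_idx[i], next_idx[i]):
            if j is not None and abs(track[j][0] - t) <= max_gap_s:
                candidates.append((abs(track[j][0] - t), j))
        if candidates:
            _, j = min(candidates)  # ties go to the earlier fix
            filled.append((t, track[j][1], track[j][2], True))
        else:
            filled.append((t, None, None, False))
    return filled


# Hypothetical one-minute epochs with a short signal lapse and a long one.
track = [
    (0, 32.70, -117.10),
    (60, None, None),
    (120, None, None),
    (180, 32.71, -117.11),
    (1000, None, None),
]
filled = impute_nearest_fix(track)
```

Keeping an explicit imputed flag lets later analyses check whether results are sensitive to the filled epochs.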

How Do Researchers Understand Energetics and Behavior From Accelerometry?

In its early days, accelerometer data processing focused on analysis of count frequencies, with high frequencies representing physical activity and low frequencies representing sitting. These techniques, however, misclassify many behaviors that occur in contexts that are important to spatial energetics like driving, walking, and cycling. A shift is occurring from count-based approaches and regression calibrations for physical activity estimation to activity characterization based on features extracted from raw acceleration signals. New machine-learned techniques derived from training data collected during free living can support better classification of key behaviors,56 and when combined with GPS will allow measurement of cycling along designated routes, walking indoors and outdoors, and driving time associated with pollution levels. More research is needed to understand scenarios where machine-learned approaches work best, as well as how they compare with more traditional cut points.
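The count-based approach described above amounts to a threshold lookup, sketched here. The threshold values and labels are illustrative assumptions, not taken from this article, and would need to be replaced with cut points validated for the device, wear location, and population under study.

```python
import bisect
from collections import Counter

# Illustrative counts-per-minute thresholds (assumed for this example only).
CUTPOINTS = [100, 1952, 5725]
LABELS = ["sedentary", "light", "moderate", "vigorous"]


def classify_epoch(counts_per_minute, cutpoints=CUTPOINTS, labels=LABELS):
    """Map one minute of accelerometer counts to an intensity label."""
    if counts_per_minute < 0:
        raise ValueError("counts cannot be negative")
    # bisect_right returns how many thresholds are <= the observed count.
    return labels[bisect.bisect_right(cutpoints, counts_per_minute)]


def minutes_by_intensity(counts_series):
    """Tally minutes per intensity label for a sequence of 1-minute epochs."""
    return Counter(classify_epoch(c) for c in counts_series)


# Hypothetical five minutes of data.
summary = minutes_by_intensity([0, 99, 100, 2500, 6000])
```

Because the rule sees only count magnitude, a minute of cycling or driving that produces few counts at the hip would be labeled sedentary or light, which is the misclassification the paragraph describes.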

What Sampling Period Is Necessary to Capture Routine Spatial Energetics Patterns?

Overwhelmingly, spatial energetics studies utilize a time sampling frame of 1–2 weeks, capturing both the weekend and weekdays for a representative sample of regular weekly activity. Shorter measurement intervals may not be enough to capture variability in daily travel behaviors,57,58 but no research has been done in the spatial energetics realm to confirm or refute such findings. Closely related are the need for refined measurement intervals throughout seasons,59,60 across the life course, and surrounding major life events such as a move or retirement. One potential solution is the increasing uptake of life-logging technologies such as smartphones and consumer wearable devices. Data derived from these devices may provide a cornucopia of data for understanding spatial energetics throughout the life course; however, there will be significant conceptual and technical work needed to utilize such devices in research in meaningful ways.

Analytical Questions

What Are the Computational Challenges Resulting From GPS, GIS, and Accelerometry Data?

The measurement framework of combining GPS, GIS, and accelerometry generates very large data sets; data sampled at the sub-minute level quickly grow to terabytes of data. After adding participant information (e.g., health outcomes, questionnaire data, -omics, social networks),61 the large numbers of observations quickly become multidimensional, posing unique data processing challenges that require complex database structures coupled with rapid computing solutions. Hurvitz and colleagues11 describe a methodology to efficiently process and link accelerometry, GPS, and smartphone-based logs of digital travel diaries into a combined “LifeLog” table. The workflow enables data streams to be merged by common timestamps and addresses numerous logistic problems, including different time zones across participants and timestamp rounding to ensure that data sets gathered at different time intervals can be merged. Beyond merely combining the data sets, the investigators developed an innovative tool to visualize GPS points, processed accelerometry data, and travel diaries. These types of tools will prove extremely valuable to progress the field of spatial energetics and to make sense of diverse data types, and it will be important for researchers to share computational approaches and workflows.
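The timestamp handling described above (normalizing time zones and rounding timestamps so that streams recorded at different intervals can be joined) can be sketched as follows. This is not the LifeLog workflow itself: the function names, the row layouts, the 60-second bin, and the rule of keeping the last GPS fix per bin are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def to_epoch_bin(iso_timestamp, bin_s=60):
    """Parse an ISO 8601 timestamp with a UTC offset and floor it to a bin.

    Returns the bin start as integer seconds since the Unix epoch (UTC).
    """
    dt = datetime.fromisoformat(iso_timestamp)
    if dt.tzinfo is None:
        raise ValueError("timestamp needs an explicit UTC offset")
    seconds = int(dt.timestamp())
    return seconds - seconds % bin_s


def merge_streams(gps_rows, accel_rows, bin_s=60):
    """Join GPS fixes to accelerometer epochs on a shared UTC time bin.

    gps_rows   -- iterable of (iso_timestamp, lat, lon), any time zone
    accel_rows -- iterable of (iso_timestamp, counts), one row per bin
    Returns a time-sorted list of dicts; lat/lon are None without a fix.
    """
    gps_by_bin = {}
    for ts, lat, lon in gps_rows:
        # Assumed rule: the last fix seen in a bin represents that bin.
        gps_by_bin[to_epoch_bin(ts, bin_s)] = (lat, lon)

    merged = []
    for ts, counts in accel_rows:
        b = to_epoch_bin(ts, bin_s)
        lat, lon = gps_by_bin.get(b, (None, None))
        merged.append({"bin": b, "counts": counts, "lat": lat, "lon": lon})
    merged.sort(key=lambda row: row["bin"])
    for row in merged:
        row["utc"] = datetime.fromtimestamp(row.pop("bin"), tz=timezone.utc).isoformat()
    return merged


# Hypothetical input: GPS logged in local time (UTC-8) every 30 s,
# accelerometer logged in UTC every 60 s.
gps = [
    ("2015-03-01T08:00:07-08:00", 32.700, -117.100),
    ("2015-03-01T08:00:37-08:00", 32.701, -117.101),
]
accel = [
    ("2015-03-01T16:01:00+00:00", 40),
    ("2015-03-01T16:00:00+00:00", 850),
]
lifelog = merge_streams(gps, accel)
```

Converting everything to UTC before binning avoids mismatches when participants or devices record in different time zones; at the data volumes the section describes, the same join would more likely be done in a database than in memory.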

Do Statistical Methodologies and Computational Abilities Exist to Appropriately Analyze Spatial Energetics Data?

Statistical methodologies to analyze the complex structure of spatial energetics data have not been fully developed, and methods that do exist are challenging without robust computational infrastructures that have the processing capacity to model terabyte-sized data sets. Cross-classified multilevel modeling approaches must be applied that account for unbalanced data (e.g., some participants contribute more data than others),62 data aggregation (e.g., minute-level data aggregated to the daily level), and correlated data within individuals and locations (e.g., multiple observations within an individual within multiple, often proximal locations),27 while addressing dominant conceptual causal questions (e.g., “selective daily mobility bias”44,45). Spatial energetics may be able to borrow methods from other fields dealing with big data to address the computational burdens of spatial energetics data. For example, genome-wide association study methods have been applied in other domains such as nutrition63 and environmental health,64 and could similarly be applied to the high-dimensional nature of spatial energetics data.
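One common way to write a cross-classified multilevel model of the kind mentioned above is given below as a sketch. The notation is introduced here for illustration and is not the authors' specification: $y_{t(ij)}$ is the activity outcome at observation $t$ for person $i$ in location $j$, $x_{t(ij)}$ is an environmental exposure, and $u_i$ and $v_j$ are crossed random intercepts for persons and locations.

```latex
\begin{aligned}
y_{t(ij)} &= \beta_0 + \beta_1 x_{t(ij)} + u_i + v_j + \varepsilon_{t(ij)}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \quad
v_j \sim \mathcal{N}(0, \sigma_v^2), \quad
\varepsilon_{t(ij)} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
```

The two random intercepts capture correlation among observations from the same person and among observations at the same location. Unbalanced data are accommodated because persons and locations may contribute different numbers of observations.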

Ethical Questions

How Do Researchers Protect the Privacy of Participants While Measuring Their Context and Behavior in Such Detail?

Ethical concerns when using spatial energetics data are paramount. Location and behavior data are protected health information, and protocols for data sharing and data security should ensure that the privacy of study participants is protected.65 Information contained in spatial energetics analyses may reveal activity patterns that participants may want to remain confidential.66,67 Researchers may want to consider Certificates of Confidentiality to ensure participant data privacy in spatial energetics studies. However, there is also a growing recognition that open and available data for data mining, testing of new algorithms, and big data analytics are necessary to translate findings from a cohort to population level. Another option is the removal or masking of spatial data in sensitive locations (such as the home68), and allowing participants to consent to sharing data with researchers or public planners. Although these approaches may address some ethical concerns, more work is needed, especially if researchers begin to rely on location-enabled devices, such as smartphones and wearable devices. As guidelines and resources are developed for ethical research in spatial energetics, they should be shared with the broader research community.
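Removing spatial data around sensitive locations can be sketched as a distance filter. This is a minimal illustration rather than a recommended standard: the function names, the 200 m radius, and the coordinates are hypothetical, and distance uses the haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def mask_sensitive_points(points, sensitive_sites, radius_m=200.0):
    """Drop every GPS point within radius_m of any sensitive site.

    points          -- list of (epoch_seconds, lat, lon)
    sensitive_sites -- list of (lat, lon), e.g., a participant's home
    radius_m        -- hypothetical masking radius in meters
    """
    kept = []
    for t, lat, lon in points:
        near = any(
            haversine_m(lat, lon, s_lat, s_lon) <= radius_m
            for s_lat, s_lon in sensitive_sites
        )
        if not near:
            kept.append((t, lat, lon))
    return kept


# Hypothetical track: two points near home and one about 1.1 km away.
home = (42.3360, -71.1040)
points = [
    (0, 42.3360, -71.1040),
    (60, 42.3365, -71.1040),
    (120, 42.3460, -71.1040),
]
shared = mask_sensitive_points(points, [home])
```

Only the filtered track would be shared; the unmasked data would stay under the study's data security protocol.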

Conclusions

Research in spatial energetics is nascent, yet evolving rapidly. High–spatial resolution, temporally linked objective measures of environment and behavior will provide unprecedented perspectives on how individuals interact with their environment. In turn, these data can be used to inform decision making on how to create places that provide optimal environments for healthy, physically active lifestyles and alter the makeup of obesogenic locales. Before researchers forge ahead, they must endeavor to answer conceptual, technical, and analytical questions that persist. This document has attempted to provide some solutions to these questions (Table 1), but more thought is required as the field develops, especially in light of groundbreaking technologic advances.

Table 1.

Potential Solutions to Unanswered Questions in Spatial Energetics

Question Potential solutions
Conceptual
  • Use spatial energetics data to isolate specific geographic contexts driving behaviors in certain individuals; algorithms can identify places individuals spend time and trips taken, use to generate hypotheses about their relative importance for energetics

  • Develop theory and experimental designs that utilize bouts of activity and sedentary behavior rather than day or week aggregations

  • Combine spatial energetics data with electronic mobility surveys to understand motivations for visiting certain locations, or use of specific routes/transit modes to account for selective daily mobility bias

  • Aggregate GPS data into known participant anchor points and examine behavior around these anchor points to account for selective daily mobility bias

Technical
  • Validate and update GIS layers to maximize accuracy

  • Identify factors related to compliance with GPS procedures to increase compliance in these populations or to statistically adjust for lack of compliance

  • Use software that automatically cleans large GPS datasets by imputing and reducing scatter, such as UCSD’s Personal Activity and Location Measurement System (PALMS)

  • Apply machine learning techniques derived from training data collected during free living to support classification of key behaviors e.g., driving, walking, and cycling

Analytical
  • Create tools for efficiently linking, processing, and analyzing diverse and complex data streams

  • Apply approaches to rapidly visualize GPS points, processed accelerometry data, and travel diaries

  • Develop cross-classified multilevel modeling approaches when analyzing data to account for correlations of measures within individuals

  • Borrow statistical and computing methods from other fields dealing with “big data,” such as genetics, economics, and machine learning

Ethical
  • Develop standardized protocols for data sharing and data security that ensure that the privacy of study participants is protected

  • Consider Certificates of Confidentiality to ensure participant data privacy

  • Remove or mask spatial data in sensitive locations and develop common standards for data masking to begin working towards open data

  • Communicate clearly with participants so that they have a full understanding of the data they are contributing and how those data will be used, as well as the ability to opt out of studies at any time

Widespread use of GPS- and accelerometer-enabled smartphones, as well as consumer wearable devices with high-quality accelerometry (such as the Fitbit, Jawbone UP, or MisFit Shine),69–72 could provide researchers with a wealth of information to explore how individuals interact with their environments, as well as the effectiveness of interventions in time and place.29,73 Capitalizing on the potential of innovative technologies will require pioneering methodologic approaches and collaboration between academic researchers and private industry.74,75 The emerging field of exposomics, which endeavors to encompass the totality of human environmental (i.e., nongenetic) exposures from conception onward, is increasingly embracing smartphones and wearables to develop exposure metrics.61 Advances in machine learning and signal processing have enabled processing of large data sets to decipher and analyze complex spatial and behavioral information. These methodologies have made spatial energetics research more feasible than ever. With this wave of streaming data, it is imperative that the field addresses the unanswered questions in spatial energetics research.

In this overview of unanswered questions in spatial energetics, there is one concluding question: Does the broad adoption of sensor technologies and the corresponding data they generate for understanding human activity in its environmental context require a major shift in the driving theoretic models, data, and methods of public health and epidemiology? It is unclear if the same underlying principles and methods should guide spatial health energetics research, which tends to collect significantly detailed data, both spatiotemporally as well as behaviorally. After decades of development of GPS, GIS, and accelerometry, and the promise they offer for modeling exposure to hazards and resources affecting health over time and space, these technologies and spatiotemporal methods of analysis are still not widely taught in public health. Perhaps the adoption of smartphones and wearable devices into epidemiologic cohorts could shift the analytical basis of epidemiology away from the traditional statistical approaches, which largely ignore time and space, toward the development of approaches designed specifically for spatial energetics data.

Spatial energetics research lies at the intersection between public health and big data analytics, leading to exciting potential to answer questions about how environment influences behavior. As the field moves forward, researchers must confront and solve these problems through transdisciplinary collaborations that capitalize on the expertise of geographers, physical activity researchers, computer and data scientists, statisticians, and epidemiologists. In turn, these collaborations should result in translational research that is communicated to the public, as well as policymakers, planners, and developers who create environments that may provide opportunities or barriers for physical activity.

Acknowledgments

This work was supported by the National Cancer Institute (NCI) Centers for Transdisciplinary Research on Energetics and Cancer (TREC) (U54 CA155626, U54 CA155435, U54 CA155850, U54 CA155796, U01 CA116850). All authors were funded by NCI as part of the TREC initiative (except Berrigan and Hurvitz). The opinions or assertions contained herein are the private ones of the authors and are not considered as official or reflecting the views of NIH. Work was also supported by the Harvard National Heart, Lung, and Blood Institute Cardiovascular Epidemiology Training Grant T32 HL 098048 and NIH Grants UM1 CA176726 and R01 ES017017. We thank the TREC Spatial and Contextual Measures and Modeling Working Group, who contributed greatly to this analysis.

Footnotes

No financial disclosures were reported by the authors of this paper.

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References
