Author manuscript; available in PMC: 2018 Jan 19.
Published in final edited form as: Nature. 2017 Jul 10;547(7663):336–339. doi: 10.1038/nature23018

Large-scale physical activity data reveal worldwide activity inequality

Tim Althoff 1, Rok Sosič 1, Jennifer L Hicks 2, Abby C King 3,4, Scott L Delp 2,5, Jure Leskovec 1,6,*
PMCID: PMC5774986  NIHMSID: NIHMS882836  PMID: 28693034

Abstract

Understanding the basic principles that govern physical activity is needed to curb the global pandemic of physical inactivity17 and the 5.3 million deaths per year associated with inactivity2. Our knowledge, however, remains limited owing to the lack of large-scale measurements of physical activity patterns across free-living populations worldwide1, 6. Here, we leverage the wide usage of smartphones with built-in accelerometry to measure physical activity at planetary scale. We study a dataset consisting of 68 million days of physical activity for 717,527 people, giving us a window into activity in 111 countries across the globe. We find inequality in how activity is distributed within countries and that this inequality is a better predictor of obesity prevalence in the population than average activity volume. Reduced activity in females contributes to a large portion of the observed activity inequality. Aspects of the built environment, such as the walkability of a city, were associated with a smaller gender gap in activity and lower activity inequality. In more walkable cities, activity is greater throughout the day and throughout the week, across age, gender, and body mass index (BMI) groups, with the greatest increases in activity for females. Our findings have implications for global public health policy and urban planning and highlight the role of activity inequality and the built environment for improving physical activity and health.


Physical activity improves musculoskeletal health and function, prevents cognitive decline, reduces symptoms of depression and anxiety, and helps maintain a healthy weight4, 7. While prior surveillance and population studies have revealed that physical activity levels vary widely between countries1, more information is needed about how activity levels vary within countries and the relationships between physical activity disparities, health outcomes (e.g., obesity levels), and modifiable factors such as the built environment. For example, while much is known about how both intrinsic (e.g., gender, age, and weight) and extrinsic (e.g., public transportation density) factors are related to activity levels, evidence about how these factors interact (e.g., the influence of environmental factors on older adults or obese individuals) is more limited8. Understanding these interactions is important for developing public policy9, 10, planning cities11, and designing behavior change interventions12, 13.

The majority of physical activity studies are based on information that is either self-reported, with attendant biases14, or measured via wearable sensors, but limited in the number of subjects, observation period, and geographic range15. Mobile phones are a powerful tool for studying large-scale population dynamics and health on a global scale12, 16, revealing the basic patterns of human movement17, mood rhythms18, the dynamics of the spread of diseases such as malaria19, and socioeconomic status in developing countries20. Smartphones are now being used globally, with the adoption rate among adults at 69% in developed countries and 46% in developing economies and growing rapidly21. With onboard accelerometers for automatic recording of activity throughout the day, smartphones provide a scalable tool to measure physical activity worldwide. Here, we use a large-scale physical activity dataset to quantify disparities in the distribution of physical activity in countries around the world, identify the relationship between activity disparities and obesity, and explore the role of the built environment, in particular walkability, in creating a more equal distribution of activity across populations.

We study 68 million days of minute-by-minute step recordings from 717,527 anonymized users of the Argus smartphone application developed by Azumio. The dataset includes recordings of physical activity for free-living individuals from 111 countries (Fig. 1a). We focus on the 46 countries with at least 1000 users (Supplementary Table 1); 90% of these users were from 32 high income countries and 10% were from 14 middle income countries (including 5 lower-middle income countries; Methods). The average user recorded 4961 steps per day (standard deviation σ = 2684) over an average span of 14 hours. We verified that the smartphone application data reproduce established relationships between age, gender, weight status, and activity (Extended Data Fig. 1), as well as country-level variations in activity and obesity levels determined from prior surveillance data and population studies (Extended Data Fig. 2). Recent research has further demonstrated that smartphones provide accurate step counts22 and reliable activity estimates in both laboratory and free-living settings23. We perform complete-case analyses, which we accompany with sample-correction, stratification, outlier, and balance testing to verify that our conclusions are robust to missing data (see Supplementary Table 1) and biases in age and gender, and hold for both high and middle income countries (see Methods and Extended Data Figs. 3, 4).

Figure 1. Smartphone data from over 68 million days of activity by 717,527 individuals reveal variability in physical activity across the world.

(a) World map showing variation in activity (mean daily steps) measured through smartphone data from 111 countries with at least 100 users. Cool colors correspond to high activity (e.g., Japan in blue) and warm colors indicate low levels of activity (e.g., Saudi Arabia in orange). (b) Typical activity levels differ between countries. Curves show distribution of steps across the population in four representative countries as a normalized probability density (high to low activity: Japan, United Kingdom, United States, Saudi Arabia). Vertical dashed lines indicate the mode of activity for Japan (blue) and Saudi Arabia (orange). (c) The variance of activity around the population mode differs between countries. Curves show distribution of steps across the population relative to the population mode. In Japan, the activity of 76% of the population falls within 50% of the mode (i.e., between light gray dashed lines), whereas in Saudi Arabia this fraction is only 62%. The United Kingdom and United States lie between these two extremes for average activity level and variance. This map is based on CIA World Data Bank II data publicly available through the R package “mapdata”.

Our large-scale activity measurements enable the characterization of the full distribution of activity within a population, beyond activity level averages and including the tails of the distribution (Fig. 1b). Consider two countries with divergent activity distributions, Japan and Saudi Arabia. In Japan, the mode of recorded steps is high (Fig. 1b, dashed blue line; 5846 steps), while in Saudi Arabia it is low (Fig. 1b, dashed orange line; 3103 steps). Not only is the mode low in Saudi Arabia, but the variance of recorded steps across the population is also larger (Fig. 1c). This larger variance means that while some individuals are highly active, others record very little activity even relative to the low country baseline.

We formally characterize these systematic differences in country-level activity distributions by measuring activity inequality, which we define as the Gini coefficient of the population activity distribution24, 25 (Extended Data Fig. 5). We find that not only is there inequality in how steps are distributed within countries, but that activity inequality is associated with higher obesity levels (Fig. 2a). For example, Saudi Arabia has a high obesity rate in comparison to Japan. At the same time Saudi Arabia has lower average activity (Fig. 1b) and a wider activity distribution (Fig. 1c), that is, a higher activity inequality. This finding is independent of gender and age biases (Extended Data Fig. 3) and independent of a country’s income level (high vs. middle; no lower income countries were included in our dataset; Extended Data Fig. 4). In fact, a country’s activity inequality is a better predictor of obesity prevalence than the average volume of steps recorded (R2 = 0.64 vs. 0.47; p < 0.01; Extended Data Fig. 6). For example, the United States and Mexico have similar average daily steps (4774 vs. 4692), but the United States exhibits larger activity inequality (0.303 vs. 0.279; 10th vs 7th deciles of country activity inequality distribution) and higher obesity prevalence (27.7% vs. 18.1%; 10th vs 8th deciles of country obesity prevalence distribution) compared to Mexico (Supplementary Table 1).

Figure 2. Activity inequality is associated with obesity and increasing gender gaps in activity.

(a) Activity inequality predicts obesity (LOESS fit; R2 = 0.64). Individuals in the five countries with highest activity inequality are 196% more likely to be obese than individuals from the 5 countries with lowest activity inequality. (b) Activity inequality is associated with reduced activity, particularly in females. The figure shows the 25th, 50th, and 75th percentiles of daily steps within each country along with 95% confidence intervals (shaded) as a linear function of activity inequality. As activity inequality increases, median activity (50th percentile) decreases by 39% for males (blue) and by 58% for females (red). (c) Obesity-activity relationship differs between males and females and between high and low activity individuals. The plot shows the prevalence of obesity as a function of daily number of steps across all subjects in all countries (with 95% confidence intervals). For both males (blue) and females (red), a larger number of steps recorded is associated with lower obesity, but for females, the prevalence of obesity increases more rapidly as step volume decreases (232% obesity increase for females vs. 67% increase for males; comparing lowest vs. highest activity).

We find that in countries with high activity inequality, activity in females is reduced disproportionately compared to males, across all quartiles of activity (Fig. 2b). In particular, 43% of activity inequality is explained by the gender gap in activity (Extended Data Fig. 7). Thus the larger variances we observe (Fig. 1c) are due to reduced activity for females in comparison to males and not just an increase in variance overall (Extended Data Fig. 7a). While lower physical activity in females has been reported in several countries1, 26, we discover that in countries with low activity and high activity inequality, the gender gap in activity is amplified (Extended Data Fig. 7b).

By quantifying the relationship between activity and obesity at the individual level (Fig. 2c), we were able to determine why a country’s activity inequality is a better predictor of obesity than average activity level. We find that the prevalence of obesity increases more rapidly for females than males as activity decreases. And while lower activity is associated with a significant increase in obesity prevalence for low activity individuals, there is little change in obesity prevalence among high activity individuals. So given two countries with identical average activity levels, the country with higher activity inequality will have a greater fraction of low activity individuals (Fig. 1c), many of them female (Fig. 2b), leading to higher obesity than predicted from average activity levels alone. These findings echo the phenomenon revealed in past studies of the effects of income inequality on health27, 28, whereby a relatively small change in wealth (in our case activity) for an individual at the bottom of the distribution can lead to significant improvements in health. Based on our model relating activity inequality to obesity prevalence (Fig. 2a), we also performed a simulation experiment which, assuming perfect information (Methods), suggests that interventions focused on reducing activity inequality could result in up to a 4 times greater reduction in obesity prevalence compared to population-wide approaches (Extended Data Fig. 8).

We investigated the walkability of a city as a modifiable extrinsic factor that could increase activity levels8 and reduce activity inequality and the gender activity gap. Based on data from 69 United States cities (Supplementary Table 2), we find that higher walkability scores are associated with lower activity inequality (Fig. 3a) across all quartiles of median income (Extended Data Fig. 9). Examining San Francisco, San Jose, and Fremont (California cities in close geographic proximity) reveals that activity inequality is lowest in San Francisco, the city with the highest walkability (Supplementary Table 3), suggesting that the relationship between walkability and activity inequality holds even for geographically and socioeconomically similar cities. Furthermore, in more walkable cities, activity is higher on weekdays during morning and evening commute times and at lunch time, and on weekends during the afternoon (Fig. 3b, c). This indicates that walkable environments increase physical activity during both work and leisure time.

Figure 3. Aspects of the built environment, such as walkability, may mitigate gender differences in activity and overall activity inequality.

(a) Higher walkability scores are associated with lower activity inequality, based on data from 69 United States cities (LOESS fit; R2 = 0.61). (b,c) Walkability is linked to increased activity levels. Curves show average steps recorded throughout the day in United States cities with the top 10 walkability scores (green) and bottom 10 walkability scores (blue). (b) On weekdays, walkable cities exhibit a spike in activity during morning commute (9:00), evening commute (18:00) and lunch times (12:00), while activity is relatively constant and lower overall in less walkable cities. (c) On weekend days, people in more walkable cities take more steps throughout the middle of the day, thus walkability is associated with higher activity levels even when most people do not work or commute. (d) Higher walkability is associated with more daily steps across age, gender, and BMI groups. Bars show the steps gained per day for each point increase in walkability score for 24 United States cities, including 95% confidence intervals (assuming linear model; Methods). Positive values across all bars reveal that, with increasing walkability, more steps are taken by every subgroup. The effect is significantly larger for females overall (left), with the greatest increases for women under 50 years (middle) and individuals with a BMI less than 30 (right).

We find that higher walkability is associated with significantly more daily steps across all age, gender, and BMI groups (Fig. 3d). The relationship between walkability and activity is significantly stronger for females, whose activity was also disproportionately reduced with higher activity inequality, with the greatest increases for women under 50 years. For example, our linear model shows that for 40-year-old women, a 25 point increase in walkability (e.g., from Sacramento, CA to Oakland, CA) is associated with 868 more steps per day, while for men, this 25 point increase is associated with only 622 additional daily steps. While walkability was associated with the greatest increases in recorded steps among normal weight individuals, even overweight and obese individuals in more walkable cities record more steps.

There are limitations in the instrument we used to collect daily physical activity. For example, our sample is cross-sectional and potentially biased towards individuals of higher socioeconomic status, particularly in lower income countries, and people interested in their activity and health. However, we find that activity inequality predicts obesity in both middle and high income countries (Extended Data Fig. 4) and that walkability predicts activity inequality across four quartiles of median income in U.S. cities (Extended Data Fig. 9), suggesting that our findings are robust to variation in socioeconomic status. The majority of adults in developed countries already own a smartphone and the number of smartphone connections worldwide is expected to increase 50% by 2020 (ref. 21), so we expect any biases to diminish in the future. While walking is the most popular aerobic physical activity29, our dataset may fail to capture time spent in activities where it is impractical to carry a phone (e.g., playing soccer) or steps are not a major component of the activity (e.g., bicycling), and there may exist systematic differences in wear time based on gender and age because users must carry their phone for steps to be recorded. However, analysis of our dataset reproduces previously established relationships between activity across geographic locations, gender and age (Extended Data Figs. 1, 2). We also find that between countries, the span of time over which steps were recorded is uncorrelated with the number of steps (Extended Data Fig. 10), and thus systematic wear time differences are unlikely to affect our country-level comparisons. Together, these results provide confidence that our dataset is able to identify activity differences between countries, genders, and age groups.

This study presents a new paradigm for population activity studies by demonstrating that smartphones can deliver new insights about key health behaviors. We examine the distribution of activity in 46 countries around the world, including rarely studied countries such as Saudi Arabia and Mexico. Our findings highlight activity inequality as an important indicator of activity disparities in the population and identify “activity poor” subpopulations, such as women, who could most benefit from interventions to promote physical activity. We further find that walkability is associated with reduced activity inequality and greater activity across age, gender, and BMI groups, which indicates the importance of the built environment to global activity levels and health. Our findings can help us to understand the prevalence, spread, and effects of inactivity and obesity within and across countries and subpopulations and to design communities, policies, and interventions that promote greater physical activity.

Methods

Dataset Description

We analyzed anonymized, retrospective data collected between July 2013 and December 2014 from Apple iPhone smartphone users of the Azumio Argus app, a free application for tracking physical activity and other health behaviors. Data is available at http://activityinequality.stanford.edu. We define a step as a unit of activity as determined through iPhone accelerometers and Apple’s proprietary algorithms for step-counting. The app records step measurements on a minute-by-minute basis. We considered only users with at least 10 days of steps data. The dataset contains 111 countries with 100 users or more (717,527 users; 68 million days of data; Fig. 1a). We restricted further analyses to the 46 countries with at least 1000 users (693,806 users; 66 million days of data). We aggregated data from all of these users to the country level. A user’s country was assigned based on the most common country identified through the user’s IP addresses. In the United States, users were assigned to a city based on the most commonly occurring location of weather updates in the user’s activity feed. Weather updates are automatically added to the feed of each user according to the nearest cell phone tower. The user enters gender, age, height, and weight in the app settings, and can change these values at any time; we used the most recent recorded values. 28.9% of users report multiple values for their weight; among these users, weight changed by 0.24 kg on average between the first and last recorded weight. Users had on average 95 days with recorded steps, although variation was large (standard deviation σ = 313 days). Subjects were excluded from a particular analysis if information was unreported (e.g., subjects with no reported height or weight were excluded from the analysis of Fig. 2a). The amount of data for each country can be found in Supplementary Table 1.

To verify that subjects with missing data on gender, age, or BMI are not different from those who report data, we computed the standardized mean difference in age, gender, BMI, and average steps per day between groups with and without missing data. Across all combinations of missing variables (age, gender, BMI) and outcomes (age, gender, BMI, daily steps), the groups were balanced30, with all standardized mean differences lower than 0.25. Data handling and analysis were conducted in accordance with the guidelines of the Stanford University Institutional Review Board.
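
The balance check described above can be sketched as follows. This is only an illustration: it assumes the common pooled-standard-deviation form of the standardized mean difference (the text does not state which variant was used), and the function name and sample values are hypothetical.

```python
import math
import statistics


def standardized_mean_difference(group_a, group_b):
    """Absolute difference in means divided by the pooled standard deviation.

    Assumption: pooled SD = sqrt((var_a + var_b) / 2), using sample variances.
    """
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return abs(mean_a - mean_b) / pooled_sd


# Hypothetical average daily steps for users who did and did not report age.
steps_reported = [4800, 5200, 5100, 4500, 5600, 4900]
steps_missing = [4700, 5300, 5000, 4600, 5500, 5100]

smd = standardized_mean_difference(steps_reported, steps_missing)
balanced = smd < 0.25  # threshold quoted in the text
```

The same function would be applied to each combination of missing variable and outcome; a group pair counts as balanced when the value stays below 0.25.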

Verifying Established Physical Activity Trends

To determine the ability of our dataset to identify relationships between physical activity and gender, age, BMI, and geographic location, we confirmed that the activity measure (daily steps) in our dataset reproduces trends established in prior work. We find that activity decreased with increasing age1, 8, 31, 32 and BMI8, 15, 32, and is lower in females than in males1, 8, 31–33, which is consistent with previous reports (Extended Data Fig. 1). We compared our physical activity estimates to physical activity data aggregated by the World Health Organization (WHO)34. The comparison between recorded steps in our dataset and the WHO data is limited for the following reasons. The WHO’s dataset is based on self-reports instead of accelerometer-defined measures as in our dataset. It contains the percentage of the population meeting the WHO guidelines for moderate to vigorous physical activity rather than recorded steps, and there is no published direct correspondence between the WHO data and daily steps. Furthermore, the confidence intervals in the WHO dataset are often very large and make a comparison complicated (e.g., Japan: 28-89% meeting guidelines). Yet, we do observe moderate correlation between the two measures (r=0.3194; p=0.0393; Extended Data Fig. 2a). Similarly, we determined the correlation between obesity prevalence in a country in our dataset and comparable WHO estimates from 201435 (r=0.691; p < 10⁻⁶; Extended Data Fig. 2b). In addition, we find a significant correlation between the gender gap in activity in our dataset and that reported by the WHO (Pearson r=0.52, p < 10⁻³; Extended Data Fig. 2c). For these analyses we used the 46 countries with at least 1000 users in our dataset that also had WHO data34, 35 (that excludes Hong Kong and Taiwan).

Daily Recorded Steps and Wear Time

We define a proxy for wear time of the activity-tracking smartphone as daily span of recorded activity; that is, the time between the first and the last recorded step each day. We find that users have an average wear time of 14.0 hours per day. To verify that differences in recorded steps between countries are not confounded by differences in wear time from country to country, we compared the average wear time in each country versus the average number of daily steps (Extended Data Fig. 10). We find no significant correlation (r=−0.086, p=0.57). Across the 46 countries, males have a 30 minute longer average wear time than women (14.2 vs. 13.7 hours), which is consistent with longer average sleep duration of females16, 36.
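
The wear time proxy defined above can be illustrated with a short sketch. The input layout (a list of timestamp and step-count pairs for one day) and the function name are assumptions made for this example, not the format of the original dataset.

```python
from datetime import datetime


def wear_time_hours(step_minutes):
    """Hours between the first and last minute of a day with recorded steps.

    step_minutes: iterable of (timestamp, steps) pairs for a single day.
    Returns 0.0 when no steps were recorded.
    """
    active = [ts for ts, steps in step_minutes if steps > 0]
    if not active:
        return 0.0
    return (max(active) - min(active)).total_seconds() / 3600.0


# Hypothetical minute-level records for one user and one day.
day = [
    (datetime(2014, 3, 1, 7, 30), 42),
    (datetime(2014, 3, 1, 12, 5), 110),
    (datetime(2014, 3, 1, 15, 0), 0),  # a minute without steps is ignored
    (datetime(2014, 3, 1, 21, 30), 15),
]
span = wear_time_hours(day)  # first step at 07:30, last at 21:30
```

Averaging this span over each user's days, and then over users in a country, gives the country-level wear time that is compared against average daily steps.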

Defining Activity Inequality

We used the Gini coefficient24, 25 to compute activity inequality, as it is the most commonly used measure to quantify inequality and statistical dispersion37. The Gini coefficient is based on the Lorenz curve, which plots the share of the population’s total average daily steps that is cumulatively recorded by the bottom x% of the population (Extended Data Fig. 5). The Gini coefficient is the ratio of the area that lies between the line of equality and the Lorenz curve (marked A in the diagram) to the total area under the line of equality (marked A and B in the diagram): Gini Coefficient = A / (A + B). The Gini coefficient ranges from 0 (complete equality) to 1 (complete inequality), since physical activity is non-negative. Several other measures have been used to quantify inequality and statistical dispersion including the coefficient of variation24, 25, decile ratio38, and others37, 38; we find that these measures are all highly correlated with the Gini coefficient (r=0.96 or higher) when applied to step counts within countries.
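
As an illustration of the definition above, the following sketch computes the Gini coefficient from a list of average daily step counts by building the Lorenz curve and taking A / (A + B). It assumes a piecewise-linear Lorenz curve over the sampled users; the function name and example values are hypothetical.

```python
def gini_coefficient(daily_steps):
    """Gini coefficient of non-negative values via the Lorenz curve.

    B is the area under the Lorenz curve (trapezoid rule) and A + B = 0.5 is
    the area under the line of equality, so Gini = A / (A + B).
    """
    values = sorted(daily_steps)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0 or values[0] < 0:
        raise ValueError("need non-negative values with a positive sum")

    area_under_lorenz = 0.0  # B
    cumulative = 0.0
    previous_share = 0.0
    for value in values:
        cumulative += value
        share = cumulative / total  # Lorenz curve height after this user
        area_under_lorenz += (previous_share + share) / (2 * n)
        previous_share = share

    area_under_equality = 0.5  # A + B
    return (area_under_equality - area_under_lorenz) / area_under_equality


equal_population = [5000, 5000, 5000, 5000]    # everyone equally active
unequal_population = [0, 0, 0, 10000]          # one person records all steps

g_equal = gini_coefficient(equal_population)      # 0.0
g_unequal = gini_coefficient(unequal_population)  # 0.75 for four people
```

With a finite sample of n users the maximum attainable value is (n − 1) / n, which approaches 1 as the population grows.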

Correlation between Activity Inequality and Obesity

We computed the Pearson correlation coefficient of activity inequality and the prevalence of obesity in a country (Fig. 2a; r=0.79; p < 10⁻¹⁰; R2 = 0.64) using local polynomial regression fitting (LOESS; R statistical software package with a re-descending M estimator and Tukey’s biweight function). We included all subjects with reported height and weight. We additionally correlated obesity with average daily steps for users in a country and compared the Pearson correlation coefficient for average daily steps with that for activity inequality (r=−0.62; p < 10⁻⁵; R2 = 0.47; Extended Data Fig. 6). Steiger’s Z-Test39 shows that activity inequality is more strongly correlated with obesity than the average volume of steps recorded in a country (r = 0.79 vs. −0.62; N = 46; t = 2.86; p < 0.01). For example, even though the United Kingdom has higher average daily steps than Germany and France (5444 vs. 5205 and 5141), it exhibits higher obesity prevalence (19.5% vs. 14.3% and 8.9%). However, the high obesity levels in the United Kingdom are matched by its high activity inequality (0.288 vs. 0.266 and 0.268).

Robustness of Correlation between Activity Inequality and Obesity

While for some countries the gender ratio in our sample closely matched official estimates (e.g., the United States, Canada, and Australia), in other countries our sample is more biased (e.g., Japan, Germany, and India; Supplementary Table 1). There is also a bias towards younger subjects in many countries (e.g., median age for U.S. is 34 years vs. 37 years; United Kingdom is 33 vs. 40; Japan is 38 years vs. 46 years; Brazil is 33 years vs. 31 years). Our sample further includes both middle and high income countries, as classified by the World Bank40. To verify the robustness of our results, we calculated gender-unbiased estimates for activity inequality and obesity prevalence for each country by reweighting males and females in our sample to exactly match World Bank estimates41 using a bootstrap42 with 500 replications. In addition, we computed activity inequality separately for males and females in each country and then correlated the activity inequality for each gender with obesity prevalence for that gender. We also computed the correlation between obesity and activity inequality for specific age groups in our dataset: [10,20), [20,30), [30,40), [40,50) and [50,100), again using only subjects with a reported age. In addition, we stratified countries by middle vs. high income status. In all cases, activity inequality remains a strong predictor of obesity (Extended Data Figs. 3, 4), which makes our findings independent of the exact age and gender distributions in our sample and suggests our results are not confounded by middle vs. high income status of countries or isolated to high income countries. Note that the results of these robustness analyses also show that our findings are not explained by patterns of missing data in our sample. We find similar results in analyses that include all subjects (Fig. 2a) or only those that report gender (Extended Data Fig. 3a) or age (Extended Data Fig. 3b).

We further verified that the relationship between activity inequality and obesity is not unduly driven by outliers. We removed the potential outliers of Indonesia, Malaysia, and the Philippines from our dataset and found that activity inequality was still a better predictor of obesity than average volume of steps recorded (R2 was 0.69 for activity inequality vs. 0.56 for average steps).
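
One way to picture the gender reweighting step is a stratified bootstrap that draws females and males separately so that each resample matches a target gender ratio. The sketch below is an assumption about how such a procedure could look, not the authors' implementation; the record layout, function names, and the 50% target share are all hypothetical.

```python
import random
import statistics


def reweighted_bootstrap(users, target_female_share, statistic,
                         replications=500, seed=0):
    """Average a statistic over bootstrap resamples with a fixed gender ratio.

    users: list of dicts with a "gender" key ("F" or "M"); both genders must
    be present. Each resample has the same size as the original sample.
    """
    rng = random.Random(seed)
    females = [u for u in users if u["gender"] == "F"]
    males = [u for u in users if u["gender"] == "M"]
    n = len(users)
    n_female = round(n * target_female_share)
    n_male = n - n_female

    estimates = []
    for _ in range(replications):
        # Sample with replacement within each gender stratum.
        resample = rng.choices(females, k=n_female) + rng.choices(males, k=n_male)
        estimates.append(statistic(resample))
    return statistics.fmean(estimates)


def obesity_prevalence(sample):
    """Fraction of users flagged as obese."""
    return sum(u["obese"] for u in sample) / len(sample)


# Hypothetical, deliberately extreme sample: 2 females (obese), 8 males (not).
sample = ([{"gender": "F", "obese": True}] * 2
          + [{"gender": "M", "obese": False}] * 8)

raw = obesity_prevalence(sample)                                  # 0.2
adjusted = reweighted_bootstrap(sample, 0.5, obesity_prevalence)  # 0.5
```

The same wrapper could take a Gini coefficient function as `statistic` to obtain a gender-adjusted activity inequality estimate.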

Gender Gaps in Activity and Obesity

To determine how activity varies with increasing activity inequality across countries, we calculated the 25th, 50th, and 75th percentile of daily steps in each country, with separate calculations for males and females. We then fitted a linear model based on each country’s activity inequality to each percentile/gender group, along with 95% confidence intervals (Fig. 2b). We determined the relationship between obesity prevalence and average daily steps for males and females in our sample by measuring the fraction of obese subjects who recorded a certain amount of activity (1-2k daily steps, 2-3k, …, 10-11k) and then computing bootstrapped 95% confidence intervals (Fig. 2c). This analysis included all subjects in the dataset who reported height and weight (N=297,268). We computed the proportion of variability explained by the gender gap in activity using the R2 measure (Extended Data Fig. 7b).
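
The binning used for the obesity-activity relationship can be sketched as below. The sketch assumes the standard definition of obesity as BMI of at least 30 kg/m², a simple percentile bootstrap for the intervals, and a hypothetical tuple layout for users; none of these details is spelled out in the text.

```python
import random


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def obesity_flags_by_step_bin(users, low=1000, high=11000, width=1000):
    """Group obesity flags into step bins 1-2k, 2-3k, ..., 10-11k.

    users: iterable of (average_daily_steps, weight_kg, height_m) tuples.
    Returns a dict mapping the lower edge of each bin to a list of booleans.
    """
    bins = {start: [] for start in range(low, high, width)}
    for steps, weight_kg, height_m in users:
        if low <= steps < high:
            start = low + int((steps - low) // width) * width
            bins[start].append(bmi(weight_kg, height_m) >= 30)
    return bins


def bootstrap_ci(flags, replications=1000, seed=0):
    """Percentile bootstrap 95% interval for the fraction of True flags."""
    rng = random.Random(seed)
    n = len(flags)
    means = sorted(sum(rng.choices(flags, k=n)) / n for _ in range(replications))
    return means[int(0.025 * replications)], means[int(0.975 * replications) - 1]


# Hypothetical users: (average daily steps, weight in kg, height in m).
users = [
    (1500, 95, 1.70),   # BMI about 32.9, obese
    (1800, 70, 1.75),   # BMI about 22.9
    (9200, 100, 1.80),  # BMI about 30.9, obese
    (9500, 68, 1.72),   # BMI about 23.0
    (9800, 60, 1.65),   # BMI about 22.0
]
bins = obesity_flags_by_step_bin(users)
prevalence_low = sum(bins[1000]) / len(bins[1000])
lower, upper = bootstrap_ci(bins[9000])
```

In the analysis described above, this would be run separately for males and females to produce the two curves of Fig. 2c.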

City Walkability Analysis

Walkability scores were obtained from Walk Score43. Scores are on a scale of 1 to 100 (100 = most walkable) and are based on amenities (e.g., shops and parks) within a 0.25 to 1.5 mile radius (a decay function penalizes more distant amenities) and measures of friendliness to pedestrians, such as city block length and intersection density. At a city level, the score shows good correlation with gold standard, GIS-determined measures of walkability44. For the 69 United States cities with at least 200 Azumio users (Supplementary Table 2), we correlated walkability scores with the activity inequality at the city level (i.e., using the within-city distribution of average daily step counts). We verified that correlations between walkability and activity inequality are similar when controlling for the median income level of the city by grouping the 69 cities used in Fig. 3a into quartiles based on median household income data from the 2015 American Community Survey45. We find that walkable environments are associated with lower levels of activity inequality for all four median income groups (Extended Data Fig. 9). We next analyzed activity in our dataset throughout the day on weekdays and weekend days in the 10 cities with the highest walkability scores and the 10 cities with the lowest walkability scores. We only considered cities with at least 20,000 weekdays of tracked steps across all users for this analysis. We aggregated steps taken over time within each city to the average number of steps per 30 minute interval. We only considered days with (1) at least 60 minutes with nonzero steps, (2) first and last recorded step at least 8 hours apart, and (3) recorded total steps between 500 and 100,000. We examined a subset of similar cities in close geographic proximity to show that our results cannot be explained by simple differences in geographic variation or city populations (Supplementary Table 3).
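
The three day-level inclusion criteria and the 30 minute aggregation can be sketched as follows. The sketch assumes each day is stored as a list of 1440 per-minute step counts and treats the 500 and 100,000 step bounds as inclusive; both choices, and the function names, are assumptions for illustration.

```python
def is_valid_day(minute_steps):
    """Apply the three inclusion criteria to one day of per-minute steps."""
    active = [i for i, steps in enumerate(minute_steps) if steps > 0]
    if len(active) < 60:                    # (1) at least 60 active minutes
        return False
    if active[-1] - active[0] < 8 * 60:     # (2) first/last step 8 hours apart
        return False
    return 500 <= sum(minute_steps) <= 100_000  # (3) plausible daily total


def half_hour_profile(minute_steps):
    """Total steps in each of the 48 half-hour intervals of a day."""
    return [sum(minute_steps[i:i + 30]) for i in range(0, 1440, 30)]


def average_half_hour_profile(days):
    """Average half-hour profile over all valid days (empty list if none)."""
    profiles = [half_hour_profile(d) for d in days if is_valid_day(d)]
    if not profiles:
        return []
    return [sum(column) / len(profiles) for column in zip(*profiles)]


# Hypothetical days: a morning hour and an evening half hour of walking,
# and a day with only the morning hour (fails the 8 hour span criterion).
commuter_day = [0] * 1440
for minute in list(range(480, 540)) + list(range(1080, 1110)):
    commuter_day[minute] = 20
short_day = [0] * 1440
for minute in range(480, 540):
    short_day[minute] = 20

profile = average_half_hour_profile([commuter_day, short_day])
```

Only `commuter_day` passes the filter here, so the averaged profile equals its own half-hour totals.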

Impact of Walkability on Daily Steps

We computed the relationship between walkability and average daily steps for several subgroups of our sample. We used data from United States cities that had at least 25 Azumio users in each subgroup (Age 0-29, Age 30-49, Age 50+, normal BMI, overweight, obese, all; for both males and females). There are 24 such cities in the dataset (Supplementary Table 4). The number of subjects for each group and city is shown in Supplementary Table 4. For each group, we ran independent linear regressions of steps on walkability on a per-subject level. The models include an intercept coefficient. We determined the estimated coefficient of walkability (i.e., the increase in daily steps for each one point increase in walkability of a city) along with 95% confidence intervals (based on Student’s t-distribution) for each subgroup (Fig. 3d). We refer to the set of these coefficients as our linear model in the main text.
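
A per-subgroup regression of this kind can be sketched with ordinary least squares. The example below is a minimal illustration using made-up data; the critical value defaults to 1.96, a large-sample stand-in for the Student's t quantile that the text refers to, and the function name is hypothetical.

```python
import math


def slope_with_ci(walkability, steps, t_crit=1.96):
    """OLS slope of steps on walkability (with intercept) and its 95% CI.

    t_crit should be the Student's t quantile for n - 2 degrees of freedom;
    1.96 is an assumed large-sample approximation.
    """
    n = len(walkability)
    mean_x = sum(walkability) / n
    mean_y = sum(steps) / n
    sxx = sum((x - mean_x) ** 2 for x in walkability)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(walkability, steps))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(walkability, steps))
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, (slope - t_crit * standard_error, slope + t_crit * standard_error)


# Hypothetical subjects: walkability score of their city and average daily steps.
walk_scores = [40, 40, 65, 65, 90, 90]
daily_steps = [4500, 4700, 5300, 5500, 6200, 6400]

slope, (ci_low, ci_high) = slope_with_ci(walk_scores, daily_steps)
gain_for_25_points = 25 * slope
```

For comparison, the 868 additional steps reported in the main text for a 25 point walkability increase correspond to a coefficient of about 34.7 steps per point (868 / 25).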

Simulating population-level changes in activity

We used our model relating activity inequality to obesity prevalence to simulate how changes in activity might affect a country’s obesity prevalence. We consider an activity budget of 100 additional daily steps per person in a country to distribute across the population (we found similar results for different activity budgets). We compared two strategies for distributing the steps: a population-wide distribution and an inequality-centric distribution. Both strategies result in the same shift in the average activity level of a country. For the population-wide distribution strategy, we increased each individual’s daily activity by 100 steps. We then recomputed the country’s activity inequality after the redistribution and estimated the country’s new obesity prevalence based on our inequality-obesity model (line fit in Fig. 2a). We next tested an activity inequality-centric strategy, where we distributed the activity budget equally among the activity-poorest X% of the population (e.g., the bottom 20% of the population would increase their daily steps by 500). For the inequality-centric strategy, we computed the optimal fraction X for each country that results in the greatest reduction in the country’s activity inequality. Optimal values for X across all countries ranged from 5% to 9%. Further, assuming a fixed X (e.g., X=10%) yielded similar results. Our simulation assumes perfect knowledge of population activity levels and perfect compliance; that is, any user targeted in this simulation would increase their activity levels according to the available budget. We also assume that other factors affecting weight would be held constant when activity levels change. In our simulations, the inequality-centric intervention resulted in reductions in obesity prevalence of up to 8.3% (median 4.0%; Extended Data Fig. 8), whereas the population-wide approach led to reductions of up to 2.3% (median 1.0%). Thus, activity inequality-centric interventions could result in up to a 4 times greater median reduction in obesity prevalence compared to the population-wide approach.
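
The two redistribution strategies can be sketched in Python as follows. This is an illustrative sketch rather than the authors' simulation code: all function names are hypothetical, the candidate grid for X (1% to 100% in 1% increments) is an assumption, and the final step of mapping the new Gini coefficient to obesity prevalence through the fitted inequality-obesity model is omitted because the fitted parameters are not given in this section.

```python
from typing import List


def gini(values: List[float]) -> float:
    """Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def population_wide(steps: List[float], budget: float = 100.0) -> List[float]:
    """Every individual receives the same per-person budget."""
    return [s + budget for s in steps]


def inequality_centric(steps: List[float], fraction: float,
                       budget: float = 100.0) -> List[float]:
    """Split the total budget equally among the activity-poorest fraction."""
    n = len(steps)
    k = max(1, round(fraction * n))   # number of targeted individuals
    boost = budget * n / k            # e.g., 100 / 0.20 = 500 steps each
    order = sorted(range(n), key=lambda i: steps[i])
    out = list(steps)
    for i in order[:k]:
        out[i] += boost
    return out


def best_fraction(steps: List[float], budget: float = 100.0) -> float:
    """Fraction X (on an assumed 1% grid) that minimizes the resulting Gini."""
    candidates = [p / 100 for p in range(1, 101)]
    return min(candidates,
               key=lambda f: gini(inequality_centric(steps, f, budget)))
```

Both strategies add the same total number of steps, so the mean activity level shifts identically; only the resulting inequality differs.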

Data availability

Data is available at http://activityinequality.stanford.edu/.

Extended Data

Extended Data Figure 1. Activity and obesity data gathered with smartphones exhibit well established trends.

(a) Daily step counts across age and (b) BMI groups for all users. Error bars correspond to bootstrapped 95% confidence intervals. Observed trends in the dataset are consistent with previous findings; that is, activity decreases with increasing age1, 8, 31, 32 and BMI8, 15, 31, and is lower in females than in males1, 8, 31–33.

Extended Data Figure 2. Activity and obesity data gathered with smartphones are significantly correlated with previously reported estimates based on self-report.

(a) WHO physical activity measure34 versus smartphone activity measure. The WHO measure corresponds to the percentage of the population meeting the WHO guidelines for moderate to vigorous physical activity based on self-report. The smartphone activity measure is based on accelerometer-defined average daily steps. We find a correlation of r=0.3194 between the two measures (p < 0.05). Note that this comparison is limited because there is no direct correspondence between the two measures: values of self-report and accelerometer-defined activity can differ14, and the WHO confidence intervals are very large for many countries (Methods). (b) WHO obesity estimates35, based on self-reports to survey conductors, versus obesity estimates in our dataset, based on height and weight reported to the activity-tracking app. We find a significant correlation of r=0.691 between the two estimates (p < 10^-6). (c) Gender gap in activity estimated from smartphones is strongly correlated with previously reported estimates based on self-report. We find that the difference in average steps per day between females and males is strongly correlated with the difference in the fraction of each gender who report being sufficiently active according to the WHO (Pearson r=0.52, p < 10^-3).

Extended Data Figure 3. Activity inequality remains a strong predictor of obesity levels across countries when reweighting the sample based on officially reported gender distributions and when stratifying by gender or age.

(a) Obesity versus activity inequality on a country level where subjects are reweighted to accurately reflect the official gender distribution in each country (Methods). The gender-unbiased estimates are very similar to estimates using all data (r=0.953 for activity inequality and r=0.986 for obesity). (b) Obesity versus activity inequality on a country level for males and females. Activity inequality predicts obesity for both genders. (c) Obesity versus activity inequality on a country level across different age groups. We find that the association between activity inequality and obesity persists within every age group. Older people are more likely to be obese (see y-axis ranging from 5% to 45% obesity for subjects older than 50 years) and more likely to get little activity (i.e., higher activity inequality on x-axis). These results indicate that our main result, that activity inequality predicts obesity, is independent of any potential gender and age bias in our sample.

Extended Data Figure 4. Relationship between activity inequality and obesity holds within countries of similar income.

Of the 46 countries included in our main result, 32 are high-income (green) and 14 are middle-income (orange) according to the current World Bank classification40. We find that activity inequality is a strong predictor of obesity levels in both high-income and middle-income countries. While in middle-income countries iPhone users might be among the wealthiest in the population, in high-income countries iPhones are used by larger parts of the population. The fact that we find a strong relationship between activity inequality and obesity in both groups of countries suggests that our findings are robust to differences in wealth in our sample.

Extended Data Figure 5. Graphical definition of activity inequality measure using the Gini coefficient.

The Lorenz curve plots the share of total physical activity of the population on the y-axis that is cumulatively performed by the bottom x% of the population, ordered by physical activity level. The diagonal line at 45 degrees represents perfect equality of physical activity (i.e., everyone in the population is equally active). The Gini coefficient is defined as the ratio of the area that lies between the line of equality and the Lorenz curve (marked A in the diagram) over the total area under the line of equality (marked A and B in the diagram). The Gini coefficient for physical activity can range from 0 (complete equality) to 1 (complete inequality).
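
The area-based definition above can be made concrete with a short Python sketch. It is a minimal illustration, not the authors' code: the function names are hypothetical, the Lorenz curve is built from individual average daily step counts, and the area under it is approximated with the trapezoidal rule.

```python
from typing import List, Tuple


def lorenz_curve(steps: List[float]) -> List[Tuple[float, float]]:
    """Points (population share, cumulative activity share), poorest first."""
    xs = sorted(steps)
    n, total = len(xs), sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive step count")
    points = [(0.0, 0.0)]
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        points.append((i / n, running / total))
    return points


def gini_from_lorenz(points: List[Tuple[float, float]]) -> float:
    """Gini = A / (A + B), where B is the area under the Lorenz curve."""
    area_b = sum((x1 - x0) * (y0 + y1) / 2
                 for (x0, y0), (x1, y1) in zip(points, points[1:]))
    area_a = 0.5 - area_b   # area between the line of equality and the curve
    return area_a / 0.5     # A + B is the triangle under the diagonal
```

For example, a population of four in which one person performs all of the activity gives a coefficient of 0.75; with a finite population of n people the maximum is (n − 1)/n, which approaches 1 as n grows.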

Extended Data Figure 6. Activity inequality is a better predictor of obesity than the average activity level.

(a) Obesity is significantly correlated with the average number of daily steps in each country (LOESS fit; R^2 = 0.47). (b) However, activity inequality is the better predictor of obesity (LOESS fit; R^2 = 0.64). The difference is significant according to Steiger’s Z-Test (p < 0.01; Methods). This shows that there is value to measuring and modeling physical activity across countries beyond average activity levels. Activity inequality captures the variance of the distribution; that is, how many activity-rich and activity-poor people there are, allowing for better prediction of obesity levels. Panel b is repeated from Fig. 2a for comparison.

Extended Data Figure 7. Female activity is reduced disproportionately in countries with high activity inequality.

(a) Distribution of daily steps for females, males, and all users in representative countries of increasing activity inequality (Japan, United Kingdom, United States, and Saudi Arabia). While in countries with low activity inequality females and males get very similar amounts of activity (e.g., Japan), the distributions of female and male activity differ greatly for countries with high activity inequality (e.g., Saudi Arabia and United States). Activity distributions in these countries demonstrate that larger variances in activity (Fig. 1c) are due to a disproportionate reduction in the activity of females and not just an increase in variance overall. (b) Activity inequality increases with the relative activity gender gap on a country level (Methods). We find that the relative gender gap ranges between 0.041 (Sweden) and 0.380 (Qatar). Average daily steps are lower for females than for males in all 46 countries. The gender gap explains 43% of the observed variance in activity inequality (linear fit: R^2 = 0.43). This suggests that activity inequality could be reduced significantly through increases in female activity alone.

Extended Data Figure 8. Activity inequality-centric interventions could result in up to 4 times greater reductions in obesity prevalence than population-wide approaches.

Given a fixed activity budget (100 daily steps per individual) to distribute across the population, we compare an inequality-centric strategy which equally distributes this budget to minimize activity inequality (100/X% daily steps increase for the activity-poorest X% where X minimizes the country’s resulting activity inequality; Methods) and a population-wide strategy which equally distributes the budget across the entire population (100 daily steps per individual; Methods). Based on our simulations, we find that the inequality-centric strategy would lead to predicted reductions in obesity prevalence of up to 8.3% (median 4.0%), whereas the population-wide approach would lead to predicted reductions of up to 2.3% (median 1.0%).

Extended Data Figure 9. Relationship between walkability and activity inequality holds within US cities of similar income.

Walkable environments are associated with lower levels of activity inequality within socioeconomically similar groups of cities. We group the 69 cities into quartiles based on median household income (data from the 2015 American Community Survey45). We find that walkable environments are associated with lower levels of activity inequality for all four groups. The effect appears attenuated for cities in the lowest median household income quartile. These results suggest that our main result—activity inequality predicts obesity and is mediated by factors of the physical environment—is independent of any potential socioeconomic bias in our sample.

Extended Data Figure 10. Differences in country level daily steps are not explained by differences in estimated wear time.

Users have an average span of 14.0 hours between the first and last recorded step, our proxy for daily wear time (Methods). While on an individual level, longer estimated wear time is associated with more daily steps (r=0.427, p < 10^-10), on a country level, there is no significant association between wear time and daily steps (r=−0.086, p = 0.57). Line shows linear fit using the 46 countries with at least 1000 users. This suggests that differences in recorded steps between countries are due to actual differences in physical activity behavior and are not explained by differences in wear time.

Acknowledgments

Further information and data are available at http://activityinequality.stanford.edu. The authors thank Azumio for donating the data for independent research, and Thomas Uchida and Will Hamilton for comments and discussions. T.A., R.S., J.H., A.K., S.D., and J.L. were supported by National Institutes of Health (NIH) grant U54 EB020405 (Mobilize Center; NIH Big Data to Knowledge Center of Excellence). T.A. was supported by the SAP Stanford Graduate Fellowship. J.H. and S.D. were supported by grants R24 HD065690 and P2C HD065690 (NIH National Center for Simulation in Rehabilitation Research). J.L. and R.S. were supported by grants NSF IIS-1149837 and Stanford Data Science Initiative. J.L. is a Chan Zuckerberg Biohub investigator.

Footnotes

Author Contributions: T.A. performed the statistical analysis. T.A., R.S., J.H., A.K., S.D., and J.L. jointly analyzed the results and wrote the paper.

The authors declare no competing financial interests.

References

  • 1. Hallal PC, et al. Global physical activity levels: surveillance progress, pitfalls, and prospects. Lancet. 2012;380:247–257. doi: 10.1016/S0140-6736(12)60646-1.
  • 2. Lee IM, et al. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet. 2012;380:219–229. doi: 10.1016/S0140-6736(12)61031-9.
  • 3. UN Secretary General. Prevention and control of non-communicable diseases. 2011. http://www.who.int/nmh/publications/2011-report-of-SG-to-UNGA.pdf. Accessed April 21, 2016.
  • 4. WHO. Global recommendations on physical activity for health. World Health Organization; Geneva: 2010.
  • 5. Kohl HW, et al. The pandemic of physical inactivity: global action for public health. Lancet. 2012;380:294–305. doi: 10.1016/S0140-6736(12)60898-8.
  • 6. Tudor-Locke C, Hatano Y, Pangrazi RP, Kang M. Revisiting “how many steps are enough?”. Med Sci Sport Exer. 2008;40:S537–S543. doi: 10.1249/MSS.0b013e31817c7133.
  • 7. Sallis JF, et al. Progress in physical activity over the Olympic quadrennium. Lancet. 2016;388:1325–1336. doi: 10.1016/S0140-6736(16)30581-5.
  • 8. Bauman AE, et al. Correlates of physical activity: why are some people physically active and others not? Lancet. 2012;380:258–271. doi: 10.1016/S0140-6736(12)60735-1.
  • 9. Physical Activity Guidelines Advisory Committee. Physical Activity Guidelines Advisory Committee Report. Department of Health and Human Services; Washington, DC: 2008.
  • 10. Chokshi DA, Farley TA. Changing behaviors to prevent noncommunicable diseases. Science. 2014;345:1243–1244. doi: 10.1126/science.1259809.
  • 11. Sallis JF, et al. Physical activity in relation to urban environments in 14 cities worldwide: a cross-sectional study. Lancet. 2016;387:2207–2217. doi: 10.1016/S0140-6736(15)01284-2.
  • 12. Servick K. Mind the phone. Science. 2015;350:1306–1309. doi: 10.1126/science.350.6266.1306.
  • 13. Reis RS, et al. Scaling up physical activity interventions worldwide: stepping up to larger and smarter approaches to get people moving. Lancet. 2016;388:1337–1348. doi: 10.1016/S0140-6736(16)30728-0.
  • 14. Prince SA, et al. A comparison of direct versus self-report measures for assessing physical activity in adults: a systematic review. Int J Behav Nutr Phys Act. 2008;5:56. doi: 10.1186/1479-5868-5-56.
  • 15. Van Dyck D, et al. International study of objectively measured physical activity and sedentary time with body mass index and obesity: IPEN adult study. Int J Obes. 2015;39:199–207. doi: 10.1038/ijo.2014.115.
  • 16. Walch OJ, Cochran A, Forger DB. A global quantification of “normal” sleep schedules using smartphone data. Sci Adv. 2016;2:e1501705. doi: 10.1126/sciadv.1501705.
  • 17. González MC, Hidalgo CA, Barabási AL. Understanding individual human mobility patterns. Nature. 2008;453:779–782. doi: 10.1038/nature06958.
  • 18. Golder SA, Macy MW. Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science. 2011;333:1878–1881. doi: 10.1126/science.1202775.
  • 19. Wesolowski A, et al. Quantifying the impact of human mobility on malaria. Science. 2012;338:267–270. doi: 10.1126/science.1223467.
  • 20. Blumenstock J, Cadamuro G, On R. Predicting poverty and wealth from mobile phone metadata. Science. 2015;350:1073–1076. doi: 10.1126/science.aac4420.
  • 21. Anthes E. Mental health: there’s an app for that. Nature. 2016;532:20–23. doi: 10.1038/532020a.
  • 22. Case MA, Burwick HA, Volpp KG, Patel MS. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA. 2015;313:625–626. doi: 10.1001/jama.2014.17841.
  • 23. Hekler EB, et al. Validation of physical activity tracking via Android smartphones compared to ActiGraph accelerometer: laboratory-based and free-living validation studies. JMIR mHealth uHealth. 2015;3:e36. doi: 10.2196/mhealth.3505.
  • 24. Atkinson AB. On the measurement of inequality. J Econ Theory. 1970;2:244–263.
  • 25. Allison PD. Measures of inequality. Am Sociol Rev. 1978;43:865–880.
  • 26. Brown WJ, Mielke GI, Kolbe-Alexander TL. Gender equality in sport for improved public health. Lancet. 2016;388:1257–1258. doi: 10.1016/S0140-6736(16)30881-9.
  • 27. Wagstaff A, Van Doorslaer E. Income inequality and health: what does the literature tell us? Annu Rev Publ Health. 2000;21:543–567. doi: 10.1146/annurev.publhealth.21.1.543.
  • 28. Lynch JW, Smith GD, Kaplan GA, House JS. Income inequality and mortality: importance to health of individual income, psychosocial environment, or material conditions. Brit Med J. 2000;320:1200. doi: 10.1136/bmj.320.7243.1200.
  • 29. Centers for Disease Control and Prevention. CDC vital signs: more people walk to better health. 2012. http://www.cdc.gov/vitalsigns/walking/. Accessed November 3, 2016.
  • 30. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci. 2010;25:1–21. doi: 10.1214/09-STS313.
  • 31. Bassett DR, Wyatt HR, Thompson H, Peters JC, Hill JO. Pedometer-measured physical activity and health behaviors in U.S. adults. Med Sci Sport Exer. 2010;42:1819–1825. doi: 10.1249/MSS.0b013e3181dc2e54.
  • 32. Troiano RP, et al. Physical activity in the United States measured by accelerometer. Med Sci Sport Exer. 2008;40:181–188. doi: 10.1249/mss.0b013e31815a51b3.
  • 33. Tudor-Locke C, Johnson WD, Katzmarzyk PT. Accelerometer-determined steps per day in US adults. Med Sci Sport Exer. 2009;41:1384–1391. doi: 10.1249/MSS.0b013e318199885c.
  • 34. World Health Organization. Prevalence of insufficient physical activity among adults: data by country. http://apps.who.int/gho/data/node.mainA893?lang=en. Accessed May 19, 2016.
  • 35. World Health Organization. Obesity (body mass index ≥ 30) (age-standardized estimate): data by country. http://apps.who.int/gho/data/node.mainA900A?lang=en. Accessed May 19, 2016.
  • 36. Basner M, et al. American time use survey: sleep time and its relationship to waking activities. Sleep. 2007;30:1085–1095. doi: 10.1093/sleep/30.9.1085.
  • 37. De Maio FG. Income inequality measures. J Epidemiol Community Health. 2007;61:849–852. doi: 10.1136/jech.2006.052969.
  • 38. Kawachi I, Kennedy BP. The relationship of income inequality to mortality: does the choice of indicator matter? Soc Sci Med. 1997;45:1121–1127. doi: 10.1016/s0277-9536(97)00044-0.
  • 39. Steiger JH. Tests for comparing elements of a correlation matrix. Psychol Bull. 1980;87:245–251.
  • 40. World Bank. World Bank Country and Lending Groups. https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups. Accessed October 5, 2016.
  • 41. World Bank. Population, female (% of total). http://data.worldbank.org/indicator/SP.POP.TOTL.FE.ZS. Accessed May 10, 2016.
  • 42. Efron B, Tibshirani RJ. An introduction to the bootstrap. CRC Press; 1994.
  • 43. Walk Score. https://www.walkscore.com/cities-and-neighborhoods/. Accessed May 17, 2016.
  • 44. Duncan DT, Aldstadt J, Whalen J, Melly SJ, Gortmaker SL. Validation of Walk Score® for estimating neighborhood walkability: an analysis of four US metropolitan areas. Int J Environ Res Public Health. 2011;8:4160–4179. doi: 10.3390/ijerph8114160.
  • 45. United States Census Bureau. American Community Survey. http://www.census.gov/programs-surveys/acs/. Accessed October 5, 2016.
  • 46. U.S. Census Bureau. Census and American Community Survey 2006–2010. 2010. Accessed through Bay Area Census. http://www.bayareacensus.ca.gov/cities/cities. Accessed July 5, 2016.
