Journal of the American Medical Informatics Association (JAMIA). 2019 Mar 12;26(5):412–419. doi: 10.1093/jamia/ocz003

Investigating data accessibility of personal health apps

Yoojung Kim, Bongshin Lee, Eun Kyoung Choe
PMCID: PMC6433179  PMID: 30861531

Abstract

Objective

Despite the potential value that self-tracking data could offer, we have little understanding of how much access people have to “their” data. Our goal in this article is to unveil the current state of the data accessibility (the degree to which people can access their data) of personal health apps on the market.

Materials and Methods

We reviewed 240 personal health apps from the App Store and selected 45 apps that support semi-automated tracking. We characterized the data accessibility of these apps using two dimensions—data access methods and data types.

Results

More than 90% of our sample apps (n = 41) provide some type of data access support, including synchronizing data with a health platform (ie, Apple Health), file download, and application program interfaces. However, the two data access methods that are approachable for laypeople (health platform and file download) typically put a significant limit on data format, granularity, and amount, which constrains people from easily repurposing the data.

Discussion

Personal data should be accessible to the people who collect them, but existing methods lack sufficient support for people in accessing the fine-grained data. Lack of standards in personal health data schema as well as frequent changes in market conditions are additional hurdles to data accessibility.

Conclusions

Many stakeholders including patients, healthcare providers, researchers, third-party developers, and the general public rely on data accessibility to utilize personal data for various goals. As such, improving data accessibility should be considered as an important factor in designing personal health apps and health platforms.

Keywords: consumer health informatics, personal health data, self-tracking, mobile apps, data accessibility

INTRODUCTION

As technologies that capture, store, and analyze personal health data have proliferated in our everyday life,1,2 we see a growing interest from various stakeholders in accessing and leveraging the data. Personal health data could be valuable for many stakeholders, including self-trackers who want to learn insights about themselves,3 researchers who want to incorporate personal health data in their research,4–6 and app developers who want to integrate multiple data sources in a new service, for example, Exist.io,7 Gyroscope,8 Instant,9 and Sherbit.10 Accessing personal health data could also open up new opportunities in clinical contexts where doctors can use the patients’ data collected outside the hospital to diagnose patients accurately and monitor them closely.11–14

Researchers have argued that people should have access to their personal health data, especially the data collected by themselves.15–20 In particular, Bietz et al.15 reported that 75% of self-trackers they surveyed expressed a strong desire to access self-tracking data, with 54% of them believing that they should own the personal health data they collected. Eighty-nine percent of researchers also mentioned that accessing personal health data such as vital signs, stress levels, and mood could help their research.15

Although it might seem obvious that people should have access to “their” data because they contribute to collecting the data,15,21 it is often not the case.20 For example, some companies do not provide the data at all,20,22 and in other cases people have to pay an extra charge to access their data.23,24 Even worse, patients struggle to access personal health data crucial to their care (eg, continuous glucose monitoring data for diabetes care).20 In this light, some researchers have argued that personal digital traces (both collected and derived) should be given back to the person.17,18

Currently, we have little understanding of how people can access the personal health data they collect with mobile apps. In this work, we bring the notion of data accessibility (a commonly used term to denote the degree to which data can be accessed by an agent) into the personal health data context. We aim to unveil the data accessibility of popular personal health apps, which is necessary to identify challenges and opportunities to improve data accessibility.

Background on data accessibility

In distributed computing research, data accessibility is used as an efficiency metric,25,26 denoting how quickly systems can access a given data object. In the management information systems field, data accessibility is discussed as an important aspect of data quality.27,28 Pipino et al.27 defined (data) accessibility as “the extent to which data is available, or easily and quickly available” and suggested several criteria to assess data quality, such as data amount and data completeness.

In the healthcare field, patients’ data are being increasingly stored in electronic health records. In response to patients’ request to access their data, large hospitals have begun to provide patients with direct access to their data through personal health records (PHRs).29 One of the essential issues around PHRs is to define what and how much data should be shared with patients. Some PHRs allow patients to access—either fully or partially—health data such as problem list, medication list, lab results, and clinical notes.30,31

Data stored in PHRs are largely limited to patients’ data generated inside the clinic. Outside of the clinic, people are also increasingly collecting self-tracking data. Many companies including major information technology companies (eg, Apple, Google, Microsoft) have invested in products to help people collect, manage, and share their health and fitness data. At the same time, people’s data are scattered across many siloed databases. Noting the difficulties of retrieving people’s own data, Estrin17 argued that companies should give data back to individuals (or even to third-party companies, with proper anonymization) so that individuals can get personalized insights into their health. A similar issue is discussed in biomedical research: in what is called a “one-way transaction,” once study participants submit their sample (eg, hair, urine) to researchers, they are not granted access to the data (eg, genomic data) directly drawn from it.19 Lunshof et al.19 argued that study participants should have an option to access their data. Various research projects, such as All of Us (by the National Institutes of Health), 100,000 Genomes Project, and the Harvard Personal Genome Project, grant varying degrees of data access to individuals.32

In a similar vein, Baker33 used the concept of user sovereignty to emphasize that people should have the right to decide how to use and share their data and whether service providers can store their data. Furthermore, the General Data Protection Regulation (GDPR), which came into effect in May 2018, includes the access right: people should be able to access their personal data if the data are identifiable.21 Although these arguments highlight the importance of data access, little is known about how people can access their data in the current personal health app ecosystem. Thus, examining data accessibility in the personal health context is a critical and timely topic.

MATERIALS AND METHODS

Two dimensions of data accessibility: Data access methods and data types

We decomposed data accessibility into two dimensions: data access methods and data types. Data access method refers to how individuals with varying levels of technical skill and diverse purposes can access their self-tracking data. Data type refers to the target of the measure, such as step count, heart rate, weight, and sleep. The characteristics of each data type determine a meaningful frequency of the measure. For example, it is unlikely to be meaningful to measure weight frequently due to the daily fluctuation.34 As capturing technology advances over time, data capture feasibility (in both frequency and level of detail) changes. For example, a few research prototypes have recently demonstrated the feasibility of continuously capturing blood pressure,35,36 which might become available in everyday life.

A combination of data access methods and data types determines the granularity, which we define as the minimum time interval of the accessible data. The self-tracking data are inherently time based. The granularity of self-tracking data can be fixed, for example, each day or every second, or it can be event based, for example, when caffeine intake occurs.
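The definition of granularity above can be made concrete with a short sketch. It assumes that granularity is read off as the smallest gap between consecutive timestamps in the accessible data; the function name and the sample timestamps are hypothetical and are not part of the study's analysis.

```python
from datetime import datetime, timedelta

def granularity(timestamps):
    """Return the smallest gap between consecutive samples, or None if fewer than two."""
    ordered = sorted(set(timestamps))
    if len(ordered) < 2:
        return None
    return min(later - earlier for earlier, later in zip(ordered, ordered[1:]))

# Hypothetical samples: fixed 1-minute step counts vs event-based caffeine logs
start = datetime(2018, 3, 1, 8, 0)
step_times = [start + timedelta(minutes=i) for i in range(5)]
caffeine_times = [datetime(2018, 3, 1, 8, 15), datetime(2018, 3, 1, 14, 40)]

print(granularity(step_times))      # 0:01:00
print(granularity(caffeine_times))  # 6:25:00
```

For fixed-interval data the result is the sampling interval itself; for event-based data it only reports the closest pair of events, so it is best read as a lower bound rather than a regular interval.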

To develop a coding scheme for these two dimensions, we reviewed 70 popular self-tracking apps including Fitbit, Misfit, and Apple Watch, and identified three types of data access method: (1) application program interfaces (APIs),[1] (2) health platform connection, and (3) file download. We considered APIs as a data access method because our stakeholders include researchers and developers, who can use APIs to develop tools for laypeople. Among a few existing health platforms (Apple Health, Google Fit, and Samsung Health), we decided to focus on one platform, Apple Health, for two reasons: (1) our goal was not to compare across health platforms and (2) Apple Health and its frameworks (HealthKit and ResearchKit) are widely adopted,39 becoming a standard interface to fitness and medical devices.40 The unit of analysis for the granularity analyses was each tracking item (eg, step count, weight, heart rate) because most apps track multiple tracking items and the amount of accessible data varies depending on tracking items.

App selection

In March 2018, we reviewed all the apps listed on the App Store’s health and fitness popular app list (240 total)41 and selected 45 apps through the inclusion process shown in Figure 1. We first excluded apps that are unrelated to personal health, and then filtered out apps that do not record any data. We further limited our sample to semi-automated tracking apps.42 Semi-automated tracking relies on various sensors (eg, GPS, accelerometer) to collect data, although it requires some user interaction (eg, wearing a wristband, starting or ending a route capture). We decided to include only semi-automated tracking apps because they pose interesting data accessibility challenges, such as a lack of transparency about what data are being stored and can be retrieved. We provide details of the selected apps in the Supplementary Appendix.

Figure 1. The app selection process.

Data analysis

We examined these 45 apps according to the coding scheme we developed. For the apps that provide an API, we reviewed the API documentation to examine the data types and data amount that the API supports, as well as the pricing policy. We made each API call to check whether it works, what the rate limit on API calls is, and what information must be submitted to obtain permission to use the API. For the apps that support transferring data to a health platform (Apple Health), we reviewed the data types people can retrieve from the platform and examined the granularity of the accessible data. Last, for the apps that support file download, we examined the file to identify the data type, data granularity, and file format.

To examine how data accessibility differs depending on tracking items, we first cataloged tracking items for each app and listed a total of 157 tracking items from 42 apps; we excluded three apps (Equinox, SleepIQ, and Virgin Pulse) that we could not access due to sign-in restrictions (eg, requires a gym membership). We also excluded tracking items that require manual logging (eg, food intake, height). We then categorized the tracking items through an affinity analysis.

RESULTS

Data access methods

Among the 45 apps, 41 apps provide a total of 73 data access methods whereas four apps do not provide any data access method (M = 1.62 ± 0.89 per app). Data can be accessed through three types of data access methods: health platform (n = 36), file download (n = 22), and API (n = 15) (Figure 2). Eighteen percent of apps (n = 8) provide all three methods, 36% of apps (n = 16) provide two access methods, and 38% of apps (n = 17) provide only one access method.
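The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below rebuilds the per-app distribution from the reported group sizes; it assumes that the reported ± value is the sample standard deviation, which is not stated explicitly in the text.

```python
from statistics import mean, stdev

# Number of apps offering 3, 2, 1, or 0 access methods, as reported in the text
apps_by_method_count = {3: 8, 2: 16, 1: 17, 0: 4}

# Expand to one entry per app: [3, 3, ..., 2, 2, ..., 1, ..., 0, ...]
per_app = [k for k, n_apps in apps_by_method_count.items() for _ in range(n_apps)]

total_apps = len(per_app)
total_methods = sum(per_app)
print(total_apps, total_methods)                          # 45 73
print(round(mean(per_app), 2), round(stdev(per_app), 2))  # 1.62 0.89

# Share of apps in each group, rounded to whole percentages
for k, n_apps in apps_by_method_count.items():
    print(k, f"{n_apps / total_apps:.0%}")
```

Under that assumption the reconstruction reproduces the reported mean, spread, and percentages (18%, 36%, and 38%, with the remaining 9% offering no access method).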

Figure 2. The number of apps that provide the three data access methods (application program interface [API], file download, and health platform): 41 apps provide a total of 73 data access methods whereas four provide none.

Health platform (Apple Health)

Synchronizing data with the health platform was the most common data access method: 80% of apps (n = 36) synchronize data with Apple Health. Apple Health allows people to sync their data collected from other apps, review the data, and obtain the data in a file format (eg, XML[2]). People can easily set up a data connection to Apple Health from their mobile app and configure access (eg, read, write) permissions for each tracking item (eg, sync step count data only but not others).
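To illustrate what working with such an XML file can look like, the following sketch sums step counts per day from a small inline sample. The element and attribute names (Record, type, startDate, value) and the type identifier follow the layout commonly seen in Apple Health exports, but they should be treated as assumptions here; the sample records are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample in an assumed export layout (not real user data)
SAMPLE_EXPORT = """<HealthData>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceName="AppA"
          startDate="2018-03-01 08:00:00 -0700" endDate="2018-03-01 08:15:00 -0700" value="420"/>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceName="AppA"
          startDate="2018-03-01 08:15:00 -0700" endDate="2018-03-01 08:30:00 -0700" value="380"/>
  <Record type="HKQuantityTypeIdentifierHeartRate" sourceName="AppB"
          startDate="2018-03-01 08:20:00 -0700" endDate="2018-03-01 08:20:00 -0700" value="62"/>
  <Record type="HKQuantityTypeIdentifierStepCount" sourceName="AppA"
          startDate="2018-03-02 09:00:00 -0700" endDate="2018-03-02 09:15:00 -0700" value="150"/>
</HealthData>"""

STEP_TYPE = "HKQuantityTypeIdentifierStepCount"

def daily_steps(xml_text):
    """Sum step-count records per calendar day, keyed by the date part of startDate."""
    totals = defaultdict(int)
    for record in ET.fromstring(xml_text).iter("Record"):
        if record.get("type") != STEP_TYPE:
            continue  # skip heart rate and any other data types
        day = record.get("startDate")[:10]  # "YYYY-MM-DD"
        totals[day] += int(record.get("value"))
    return dict(totals)

print(daily_steps(SAMPLE_EXPORT))  # {'2018-03-01': 800, '2018-03-02': 150}
```

Even this small task requires parsing XML and knowing the type identifiers, which suggests why a file export alone may not make the data easy for laypeople to repurpose.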

File download

Forty-nine percent of apps (n = 22) allow people to download data files from websites or mobile apps. The types of file format include CSV,[3] FIT,[4] GPX,[5] KML,[6] and TCX.[7] Ten apps support two or more file formats. For example, the Runtastic Running & Fitness app allows people to download data in three file formats: GPX, KML, and TCX.
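As an example of reading one of these formats, the sketch below extracts timestamped track points from a minimal GPX 1.1 document using only the Python standard library. The sample route is invented, and real files exported by the apps in our sample may contain additional elements or extensions that this sketch ignores.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Invented three-point route (not taken from any app in the sample)
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="47.6062" lon="-122.3321"><ele>56.0</ele><time>2018-03-01T08:00:00Z</time></trkpt>
    <trkpt lat="47.6064" lon="-122.3319"><ele>56.4</ele><time>2018-03-01T08:00:05Z</time></trkpt>
    <trkpt lat="47.6067" lon="-122.3316"><ele>57.1</ele><time>2018-03-01T08:00:10Z</time></trkpt>
  </trkseg></trk>
</gpx>"""

def route_points(gpx_text):
    """Return (time, latitude, longitude) tuples for every track point."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.findall(".//gpx:trkpt", GPX_NS):
        time_text = pt.findtext("gpx:time", namespaces=GPX_NS)
        when = datetime.strptime(time_text, "%Y-%m-%dT%H:%M:%SZ")
        points.append((when, float(pt.get("lat")), float(pt.get("lon"))))
    return points

points = route_points(SAMPLE_GPX)
gaps = [later[0] - earlier[0] for earlier, later in zip(points, points[1:])]
print(len(points), min(gaps))  # 3 0:00:05
```

The gap between consecutive points gives the granularity of the route data in the file, here 5 seconds for the invented sample.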

API

Thirty-three percent of apps (n = 15) have public websites that describe the API definitions and usage protocols. Among these 15 apps, five permit only limited usage of their APIs, requiring either payment or a licensing request. We note that the APIs of another five apps were not working or had been announced as discontinued as of September 2018. We provide details of apps’ API status in the Supplementary Appendix.

Data types

We categorized tracking items into 23 data types, which we grouped into five categories: daily activity, workout, sleep, body measurement, and cardiac measurement (Table 1).

Table 1. Twenty-three unique data types grouped into 5 categories

| Data Type Category | Data Type (Number of Unique Apps) |
| --- | --- |
| Daily Activity | Step Count (n = 17), Calories Burned[a] (n = 12), Distance (n = 12), Flights Climbed (n = 5), Activity Level[a] (n = 3), Elevation (n = 2), Resting Energy (n = 2), Walking Time (n = 2), Brushing (n = 1), Stress (n = 1) |
| Workout | Workout Session[a] (n = 25), Route (n = 12) |
| Sleep | Sleep Analysis[a] (n = 14) |
| Body Measurement | Weight (n = 5), Body Fat (n = 5), Body Mass Index (n = 4), Body Mass[a] (n = 3), Oxygen Saturation (n = 1), Respiratory Rate (n = 1), Body Temperature (n = 1) |
| Cardiac Measurement | Heart Rate[a] (n = 13), Blood Pressure[a] (n = 2), Pulse Wave Velocity (n = 1) |

[a] Data type includes subitems (eg, the activity level includes four subitems: sedentary minutes, lightly active minutes, fairly active minutes, and very active minutes).

Daily activity category

The daily activity category consists of 10 data types. Frequently observed data types are step count (17 apps), calories burned (12), and distance (12). Such data types are easily captured or derived leveraging sensors (eg, accelerometer, gyroscope) and detection algorithms.

Workout category

We separated the workout category from the daily activity category because many apps focus exclusively on tracking workouts. The workout category is composed of two data types: workout session (25 apps) and route (12 apps). The workout session is a collection of tracking items related to physiological data (eg, heart rate), behavioral data (eg, calories, distance, step count, speed), and environmental data (eg, elevation, altitude). The route data type usually refers to GPS coordinates. These two types of data are captured during a specific time frame. A person may explicitly record her workouts (ie, start or end of a workout session) or a device may automatically infer a workout session (eg, based on velocity and location, a device might wrap a timeframe as a “running” session).

Sleep category

The sleep category has only one data type: sleep analysis (14 apps), which includes a person’s sleep cycle based on the sleep stages (Figure 3G). Depending on the device functionalities and service providers, sleep analysis is expressed in different sleep stage schemas. Thus, the structure of sleep analysis does not conform to a single standard (eg, four-stage model, awake/sleep model, sleep duration only).

Figure 3. The granularity of the representative data types (rows [step count, workout session, sleep analysis, weight, heart rate]) by each access method (column [API, Health Platform, File Download]). Across all 5 data types, application program interfaces (APIs) tend to provide finer-grained data access than the other two methods do. Health platform defines the schema of a specific data type, and data granularity of individual app varies only within the range of data granularity defined by the health platform. File downloads provide more coarse-grained data across all five data types.

Body measurement category

The body measurement category encompasses a variety of data types, some of which may not be meaningful to track continuously (eg, weight, body fat).34 In contrast, continuously tracking data types similar to vital signs (eg, oxygen saturation, respiratory rate, body temperature) can be critical for specific user groups (eg, asthma patients, pediatric patients).43 New technologies that frequently track such data types are emerging,43 but they are currently too expensive or too immature to be adopted broadly.

Cardiac measurement category

We found three data types related to cardiac measurements: heart rate (13 apps), blood pressure (2), and pulse wave velocity (1). Heart rate can be captured continuously with fine granularity due to the advance of sensing technologies (eg, optical sensors). However, only a few research prototypes have recently demonstrated the feasibility to continuously capture blood pressure35,36 and pulse wave velocity.44 Commonly available blood pressure monitors do not yet support continuous data capture.

Data granularity

The granularity in accessing data depends on data capture feasibility and access methods: when the capture feasibility is low, the granularity is necessarily coarse, while when the feasibility is high, the granularity varies depending on the access method. Figure 3 illustrates the typical granularity of the five most frequently observed data types from each data type category for the three access methods. Overall, APIs provide finer-grained data access than health platforms and file downloads.

API

Through APIs, people can generally access the most fine-grained data: heart rate data with a 1-second interval (Figure 3M, top), workout session data with a few-seconds interval (Figure 3D), and step count data with a 1-minute interval (Figure 3A). In rare cases, an API (eg, Misfit) provides step counts in a 1-day interval.

Although the granularity of sleep analysis data depends on how each app defines sleep stages (see the Supplementary Appendix), APIs provide the most detailed time series data (Figure 3G). Many APIs give people a set of time series data categorized in several sleep stages (eg, light sleep, deep sleep, REM sleep, awake).

For certain data types with limited capture feasibility, people can obtain sporadically captured data only, even with APIs. Some heart rate tracking apps, for example, provide discrete data points with an API (Figure 3M, bottom) because they cannot continuously capture heart rate data. Similarly, weight data are also provided as discrete points (Figure 3J).

Health platform

Health platform defines the schema of a specific data type, and data granularity of individual app varies only within the range of data granularity defined by the health platform. In the case of step count, for example, when synchronizing the data with Apple Health, some apps aggregate step counts into chunks, with varying minute-level intervals (Figure 3B, top), while others provide step counts with a fixed minute-level (1 or 15) interval (Figure 3B, bottom).

Workout session and sleep analysis data can lose granularity when conforming to the schema that the health platform defines. People can access only a subset of tracking items predefined by the health platform, with varying granularity. Apple Health defines the workout session as having two tracking items (calories burned and distance) and provides only the total amount of these tracking items for the session (Figure 3E). For sleep analysis data, because many apps define sleep stages differently from Apple Health, their sleep analysis data are sometimes simplified during synchronization with the platform (Figure 3H).
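The kind of simplification described for sleep analysis data can be sketched as a schema mapping. The example below collapses a four-stage sleep record into a two-state record and merges adjacent segments; the two-state target schema, the stage names, and the sample night are hypothetical stand-ins and do not reproduce Apple Health's actual schema.

```python
# Hypothetical app-side record: (start_minute, end_minute, stage)
detailed = [
    (0, 20, "awake"),
    (20, 80, "light"),
    (80, 140, "deep"),
    (140, 170, "rem"),
    (170, 175, "awake"),
    (175, 240, "light"),
]

# Hypothetical coarser platform schema with only two states
TO_COARSE = {"awake": "awake", "light": "asleep", "deep": "asleep", "rem": "asleep"}

def simplify(segments, mapping):
    """Map each stage to the coarser schema and merge contiguous segments that now match."""
    merged = []
    for start, end, stage in segments:
        coarse = mapping[stage]
        if merged and merged[-1][2] == coarse and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end, coarse)  # extend the previous segment
        else:
            merged.append((start, end, coarse))
    return merged

for segment in simplify(detailed, TO_COARSE):
    print(segment)
# (0, 20, 'awake')
# (20, 170, 'asleep')
# (170, 175, 'awake')
# (175, 240, 'asleep')
```

Six segments become four, and the distinction among light, deep, and REM sleep cannot be recovered from the simplified record, which is the loss of granularity discussed above.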

Similar to other methods, people can typically access discretely captured weight (Figure 3K) and heart rate data (Figure 3N). Only a few apps provide heart rate data with the minute-level interval if there is an accompanying wearable device that can continuously capture heart rate (eg, Garmin Connect).

File download

With the file download method, the granularity is coarse in most data types. Most apps provide step counts in a 1-day interval (Figure 3C), with a few exceptions (eg, Nokia Health Mate [1 minute], Pacer Pedometer & Step Tracker [15 minutes]).
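The effect of these different intervals can be shown by re-binning the same data. The sketch below sums hypothetical 1-minute step counts into 15-minute and 1-day bins; the function and sample values are illustrative assumptions, not a description of how any of the apps aggregate their data.

```python
from collections import Counter
from datetime import datetime, timedelta

def rebin(samples, interval):
    """Sum (timestamp, count) samples into fixed bins aligned to midnight of each sample's day."""
    totals = Counter()
    for ts, count in samples:
        midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
        offset = (ts - midnight) // interval  # whole bins since midnight
        totals[midnight + offset * interval] += count
    return dict(sorted(totals.items()))

# Hypothetical 1-minute data: 30 minutes of walking at 10 steps per minute
start = datetime(2018, 3, 1, 8, 0)
minute_level = [(start + timedelta(minutes=i), 10) for i in range(30)]

quarter_hourly = rebin(minute_level, timedelta(minutes=15))
daily = rebin(minute_level, timedelta(days=1))

print(len(minute_level), len(quarter_hourly), len(daily))  # 30 2 1
print(list(daily.values()))                                # [300]
```

The daily total is preserved, but the timing of the walk within the day is lost; the aggregation cannot be reversed from a 1-day file.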

For the workout session data type, people can typically access only the aggregated data of each tracking item (Figure 3F), although a few apps provide data for each tracking item at the level of seconds (eg, Polar Beat [1 second], Walkmeter Walking & Hiking GPS [a few seconds]). Similarly, most sleep tracking apps provide only the total duration of each sleep session (Figure 3I), with the exception of Nokia Health Mate, which provides a set of time series data based on sleep stages.

Heart rate and weight data are generally provided as discrete data (Figure 3L, 3O), with the exception of Nokia Health Mate, which provides heart rate with a 1-minute interval.

DISCUSSION

This work was motivated by the premise that people should have access to their data and also by our recognition of the worsening challenges in personal health data access. We believe that personal data should be accessible to the people who collected them, and that data access methods should be transparent and easy to understand. If so, data accessibility can be one of the considerations in deciding which personal health apps to adopt. However, our results show that personal data access is limited and the data access process is obscure.

Towards better data accessibility

Our findings revealed that trade-offs exist across the different access methods, regarding the required skills and granularity of the accessible data for each method: the more fine-grained data people want to access, the more advanced skills they need to have. APIs provide finer-grained data but are difficult to use; health platform and file downloads are easier to use for laypeople, but they typically put a significant limit on data format, granularity, and amount, which constrains people from easily repurposing the data. One way to enhance data accessibility is to provide people with control over these data access attributes so that they can determine what to access depending on their goals.

We interpret data accessibility in a broad sense in that people can not only access their raw data in the most granular sense if they want, but also “utilize” their data better to meet their goals. We thus envision an approach that enables flexible access to support people with varying data management and analytics skills.

Some people yearn to access raw data to explore and analyze the data further because of the limitations of existing tools in providing insights.3 Although Apple Health supports data integration, it does not provide rich insights from multiple data sources. Such limitations can be addressed through a system capable of integrating data from multiple sources and providing data exploration features (eg, Visualized Self45). For those who do not have the data analysis and visualization skills, providing rich insights generated by the system can reduce the need to access the raw data in the first place. Furthermore, we envision a system that enables people to authorize experts (eg, clinicians, researchers) to access and utilize data for them, making the data sharing easy and efficient.16,45

Data schema standards

Data standards, defined as the “documented agreements on representations, formats, and definitions of common data,” support the interoperability among heterogeneous systems.46,47 However, in our analyses, we learned that no clear standards exist in personal health data schema (eg, sleep analysis, workout session), resulting in data loss when transferring data between services.

As in the case of the Open mHealth initiative,48 some efforts have been made to create standards for personal health data schemas. Human API promotes the adoption of an open architecture and schema for people to centralize personal data and medical records and share the data with healthcare providers.49 However, the uptake has been slow and there is not yet any incentive in place for companies to follow the standardization. We call for health platform companies and health app developers to make a concerted effort to establish standards for personal data schemas to support interoperability and prevent data loss.

Data schemas for personal health data can change over time due to continuous advancement of technologies. For example, if a new polysomnography technique is developed, the sleep data schema can be changed. It would be important to ensure that data schema standards adapt to these changes.

Threats of unstable market conditions

Frequent changes in market conditions threaten data accessibility. During our analysis period (ie, between March and September 2018), the services of four apps[8] were discontinued. Service providers usually inform their users of the closure and provide options to back up their data for a certain period, after which people cannot access their data. However, on a few occasions, companies (eg, Jawbone50 and ZEO51) did not inform users of their service closure, resulting in data loss. Furthermore, service providers occasionally change their internal policies about access methods, which is another risk to data accessibility. For example, the Oral-B app decided to discontinue its APIs during our analysis period.

Due to the rise and fall of information technology companies and the cost of maintenance, these threats are inevitable. To prevent undesirable data loss, service providers need to offer people a means to properly back up and obtain the data. The GDPR, which came into effect in May 2018, includes the right to data portability, which allows people to obtain their data from service providers and to reuse them for individual purposes.21 According to the Article 29 Working Party guidelines on the right to data portability, the personal health data that we surveyed are subject to the GDPR.52 The GDPR also encourages service providers to develop interoperable formats53 for people to either store the data for personal use or transmit it to another service.52 The GDPR applies globally to app providers that are likely to collect or monitor EU citizens’ personal data.54 Although the regulations have the potential to enhance personal data accessibility, our data showed that the industry lags behind the reforms.

Limitations and future work

As a first step toward understanding the current landscape of data accessibility, we examined semi-automated personal health apps on iOS only. A further study with apps from other platforms is warranted; examining other health platforms such as Google Fit and Samsung Health can enrich the understanding of data standards, cross-platform issues, and data portability. Investigating a broader range of personal data tracking apps, including manual tracking apps (eg, food journaling), can bring additional insights about data accessibility. While our analyses provide a snapshot of data accessibility in 2018, using the same analysis method in the future will allow us to monitor the progression of data accessibility.

CONCLUSION

We presented the current state of data accessibility in the personal health data context. Analyzing 45 personal data tracking apps, we identified that two dimensions (data access method and data type) determine the granularity and amount of the accessible data. More than 90% of these apps provided some type of data access support, but the granularity and amount of the accessible data were highly limited and obscure, especially for laypeople who do not have programming skills. Lack of standards in personal health data schema as well as frequent changes in market conditions are additional hurdles to data accessibility. We discussed ways to overcome the barriers to improve data accessibility. As data accessibility influences many stakeholders, including patients, health providers, researchers, third-party developers, and the general public, data accessibility should be a crucial design consideration in personal health app and health platform design.

AUTHOR CONTRIBUTORS

CHOE EK conceived of the main conceptual idea. CHOE EK and LEE B designed the analysis and supervised KIM Y, who collected the data and performed the analysis. All authors discussed the results and contributed to the final manuscript.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

Conflict of interest statement. None declared.

Footnotes

[1] APIs are sets of standardized requests that allow different computer programs to communicate with each other.37 With APIs, developers can access the data or services without having to implement the underlying objects and procedures.38 For example, developers can access Fitbit user’s data using Fitbit Web API as long as the user granted access to them.

[2] Extensible Markup Language.

[3] Comma-Separated Values.

[4] Flexible and Interoperable Data Transfer.

[5] GPS Exchange Format.

[6] Keyhole Markup Language.

[7] Training Center XML.

[8] “Steps—Personalized Pedometer” Nike+ Fuel app (Nike), UP for UP Move, and wired UP bands by Jawbone app (Jawbone), UP – Smart Coach for Health by Jawbone app (Jawbone), Moves app (Facebook).

REFERENCES

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Data

Data Availability Statement

In distributed computing research, data accessibility is used as an efficiency metric,25,26 denoting how quickly systems can access a given data object. In the management information systems field, data accessibility is discussed as an important aspect of data quality.27,28 Pipino et al.27 defined (data) accessibility as “the extent to which data is available, or easily and quickly available” and suggested several criteria to assess data quality, such as data amount and data completeness.

In the healthcare field, patients’ data are being increasingly stored in electronic health records. In response to patients’ request to access their data, large hospitals have begun to provide patients with direct access to their data through personal health records (PHRs).29 One of the essential issues around PHRs is to define what and how much data should be shared with patients. Some PHRs allow patients to access—either fully or partially—health data such as problem list, medication list, lab results, and clinical notes.30,31

Data stored in PHRs are largely limited to patients’ data generated inside the clinic. Outside of the clinic, people are also increasingly collecting self-tracking data. Many companies including major information technology companies (eg, Apple, Google, Microsoft) have invested in products to help people collect, manage, and share their health and fitness data. At the same time, people’s data are scattered across many databases in a silo. Noting the difficulties of retrieving people’s own data, Estrin17 argued that companies should give data back to individuals (or even to third-party companies, with proper anonymization) so that individuals can get personalized insights into their health. A similar issue is discussed in biomedical research: called a “one-way transaction,” once study participants submit their sample (eg, hair, urine) to researchers, they are not granted access to the data (eg, genomic data) directly drawn from it.19 Lunshof et al.19 argued that study participants should have an option to access their data. Various research projects, such as All of Us (by the National Institutes of Health), 100,000 Genomes Project, and the Harvard Personal Genome Project, grant varying degrees of data access to individuals.32

In a similar vein, Baker33 used the concept of user sovereignty to emphasize that people should have the right to decide how to use and share their data and whether service providers can store their data. Furthermore, the General Data Protection Regulation (GDPR), which came into effect in May 2018, includes a right of access: people should be able to access their personal data if the data are identifiable.21 Although these arguments highlight the importance of data access, little is known about how people can access their data in the current personal health app ecosystem. Thus, examining data accessibility in the personal health context is a critical and timely topic.

We decomposed data accessibility into two dimensions: data access methods and data types. Data access method refers to how individuals with varying levels of technical skill and diverse purposes can access their self-tracking data. Data type refers to the target of the measure, such as step count, heart rate, weight, and sleep. The characteristics of each data type determine a meaningful frequency of the measure. For example, it is unlikely to be meaningful to measure weight frequently because of its daily fluctuation.34 As capturing technology advances over time, data capture feasibility (both in frequency and in level of detail) changes. For example, a few research prototypes have recently demonstrated the feasibility of continuously capturing blood pressure,35,36 which might become available in everyday life.

The combination of data access method and data type determines the granularity, which we define as the minimum time interval of the accessible data. Self-tracking data are inherently time based. The granularity of self-tracking data can be fixed (eg, one value per day or per second) or event based (eg, one record each time caffeine intake occurs).
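To make the granularity definition concrete, the following is a minimal Python sketch, not part of the study's method. It assumes that accessible data arrive as a list of timestamps. The function names `granularity` and `is_fixed_interval` and the sample timestamps are hypothetical, introduced only for illustration.

```python
from datetime import datetime, timedelta


def granularity(timestamps):
    """Smallest gap between consecutive distinct timestamps (None if fewer than two)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:]) if later > earlier]
    return min(gaps) if gaps else None


def is_fixed_interval(timestamps):
    """True when every consecutive gap is identical, ie, the data are sampled on a fixed grid."""
    ordered = sorted(timestamps)
    gaps = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    return len(gaps) == 1


# Hypothetical sample data: one step-count total per day for a week (fixed granularity).
start = datetime(2019, 3, 12)
daily_steps = [start + timedelta(days=i) for i in range(7)]

# Hypothetical sample data: caffeine intake logged whenever it occurs (event based).
caffeine_events = [
    datetime(2019, 3, 12, 8, 5),
    datetime(2019, 3, 12, 14, 40),
    datetime(2019, 3, 13, 9, 15),
]

print(granularity(daily_steps), is_fixed_interval(daily_steps))
print(granularity(caffeine_events), is_fixed_interval(caffeine_events))
```

For event-based items, the smallest observed gap describes only the records at hand rather than a property of the access method, which is one reason to treat fixed and event-based granularity as separate cases.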

To develop a coding scheme for these two dimensions, we reviewed 70 popular self-tracking apps including Fitbit, Misfit, and Apple Watch, and identified three types of data access method: (1) application program interfaces (APIs),[1] (2) health platform connection, and (3) file download. We considered APIs as a data access method because our stakeholders include researchers and developers, who can use APIs to develop tools for laypeople. Among a few existing health platforms (Apple Health, Google Fit, and Samsung Health), we decided to focus on one platform, Apple Health, for two reasons: (1) our goal was not to compare across health platforms and (2) Apple Health and its frameworks (HealthKit and ResearchKit) are widely adopted,39 becoming a standard interface to fitness and medical devices.40 The unit of analysis for the granularity analyses was each tracking item (eg, step count, weight, heart rate) because most apps track multiple tracking items and the amount of accessible data varies depending on tracking items.
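One way to picture this coding scheme is as a record per tracking item that notes the granularity available through each access method. The Python sketch below is an illustration only and is not the authors' instrument: the class and field names are hypothetical, and the app name and granularity values are invented placeholders, not findings from the study.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from enum import Enum
from typing import Dict, Optional, Set


class AccessMethod(Enum):
    """The three data access methods identified in the coding scheme."""
    API = "api"
    HEALTH_PLATFORM = "health_platform"
    FILE_DOWNLOAD = "file_download"


@dataclass
class TrackingItem:
    """Unit of analysis: one tracking item (eg, step count) within one app."""
    app: str
    data_type: str
    # Granularity (minimum time interval) per supported access method.
    granularity: Dict[AccessMethod, timedelta] = field(default_factory=dict)

    def supported_methods(self) -> Set[AccessMethod]:
        return set(self.granularity)

    def finest_method(self) -> Optional[AccessMethod]:
        """Access method that exposes the smallest time interval, if any is supported."""
        if not self.granularity:
            return None
        return min(self.granularity, key=self.granularity.get)


# Placeholder values for a made-up app, chosen only to exercise the structure.
item = TrackingItem(
    app="ExampleApp",
    data_type="step count",
    granularity={
        AccessMethod.API: timedelta(minutes=1),
        AccessMethod.HEALTH_PLATFORM: timedelta(hours=1),
        AccessMethod.FILE_DOWNLOAD: timedelta(days=1),
    },
)
print(item.finest_method())
```

Keying the record on the tracking item rather than on the app reflects the point that a single app can expose different amounts of data for different items.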

Our findings revealed trade-offs across the access methods between the skills each method requires and the granularity of the data it exposes: the finer-grained the data people want to access, the more advanced the skills they need. APIs provide finer-grained data but are difficult to use; health platform connections and file downloads are easier for laypeople to use, but they typically put a significant limit on data format, granularity, and amount, which constrains people from easily repurposing the data. One way to enhance data accessibility is to give people control over these data access attributes so that they can determine what to access depending on their goals.

We interpret data accessibility in a broad sense: people should be able not only to access their raw data at the finest granularity if they want, but also to “utilize” their data better to meet their goals. We thus envision an approach that enables flexible access to support people with varying data management and analytics skills.

Some people yearn to access raw data so that they can explore and analyze the data further, because existing tools are limited in the insights they provide.3 Although Apple Health supports data integration, it does not provide rich insights from multiple data sources. Such limitations can be addressed through a system capable of integrating data from multiple sources and providing data exploration features (eg, Visualized Self45). For those who do not have data analysis and visualization skills, rich insights generated by the system can reduce the need to access the raw data in the first place. Furthermore, we envision a system that enables people to authorize experts (eg, clinicians, researchers) to access and utilize data for them, making data sharing easy and efficient.16,45


Articles from Journal of the American Medical Informatics Association : JAMIA are provided here courtesy of Oxford University Press
