Abstract
Privacy is a growing concern in mobile health research, particularly regarding passively collected data. Apple SensorKit provides a novel platform for collecting phone and wearable usage and sensor data; however, the acceptability and feasibility of collecting these sensitive data from research participants remain unknown. To address this gap, we piloted the SensorKit platform as part of the longitudinal Intern Health Study. Unlike prior research on digital privacy, which has often relied on small samples, this study leverages a large and demographically diverse cohort of US medical residents to explore racial and ethnic differences in the acceptability of passive sensor data collection. Findings demonstrate that successful enrollment and retention rates can be achieved in a longitudinal e-Cohort study that collects SensorKit data; however, lower opt-in rates among racial minorities suggest the need for further evaluation of the equity implications of specific data types in mobile health research.
Introduction
Digital phenotyping, the organization of passively collected mobile technology data from participants in real time to better understand behavior, has the potential to harness new technology and sensors to capture moment-to-moment behavior and better understand participant experiences [1,2]. Various studies have found that phone sensor data (e.g., GPS, accelerometer, light, or microphone) can capture circumscribed constructs of sleep, social context (e.g., relationships, social support), mood, and stress [3]. The availability of these data has enabled just-in-time adaptive interventions, which use conditions (e.g., location, physical activity, phone use) detected by passive sensors to deliver an intervention when it is expected to have the greatest impact [4,5].
In contrast to ecological momentary assessments, collection of these passive phone and sensor data places minimal burden on participants. However, studies have noted participant concerns related to privacy that may negatively impact participation [6,7]. Specifically, mHealth studies have found that a significant number of participants do not want to be monitored, tracked, or asked to provide private sensor-based data [8]. With the rapid expansion of mobile health technologies in recent years, digital health equity is also a growing concern [9]. However, most studies assessing acceptability and privacy concerns have been small in scale, limiting generalizability and the assessment of specific subgroups, including racial/ethnic minority groups [3,10,11].
Apple SensorKit provides a novel technological framework for health researchers to capture passive participant data in domains relevant to well-being. However, the acceptability of collecting these data among research participants is not well understood. To address this question, the Apple SensorKit platform was assessed with first-year medical residents in the United States via the MyDataHelps mobile application as part of the longitudinal Intern Health Study [12]. Unlike prior research on digital privacy concerns, this study leverages a large and demographically diverse cohort of US medical residents to explore racial and ethnic differences in the acceptability of passive sensor data collection.
Incoming intern physicians for the 2023–2024 academic year were invited to enable the collection of Apple SensorKit data as an optional component during initial enrollment and onboarding in the Intern Health Study, which took place on a rolling basis over a three-month period in spring 2023. Five sensors were selected for inclusion based on their potential utility for analyses of the mental health and adaptive functioning of training physicians (Ambient Light, Keyboard Metrics, Messages Usage, Phone Usage, and Frequently Visited Locations). SensorKit data collection was tracked as part of the study for the first two months of the intern year.
Methods
Ethics statement
The study design was approved by the Institutional Review Board at the University of Michigan and the participating hospitals in the Intern Health Study (HUM00033029). All participants provided written informed consent via a secure online survey as approved by the University of Michigan IRB.
Study design
Between April 11, 2023 and June 30, 2023, incoming intern physicians for the 2023–2024 academic year were invited to opt into providing Apple SensorKit data. Five sensors were selected for inclusion based on their potential utility for analyses of the mental health and adaptive functioning of training physicians (Ambient Light, Keyboard Metrics, Messages Usage, Phone Usage, and Frequently Visited Locations) (Fig 1).
Fig 1. Apple SensorKit types.
Apple SensorKit data types selected for inclusion in the study.

Participants completed consent and the initial survey of the parent study (S2), and then logged into the MyDataHelps mobile app. The detailed Apple SensorKit permissions screens appeared in the participant task list upon initial app login. Opening the permissions survey prompted participants with the standard Apple SensorKit introduction screen. Participants were then required to proceed through screens with detailed descriptions and examples for all sensors included in the present study (Fig 1) [13]. Following the information for each data type, participants were explicitly asked for permission to access that specific sensor: “Allow Collection & Sharing” or “Don’t Allow Collection & Sharing.” Participants could opt into data collection for all, some, or none of the sensor types presented. The data types were presented in alphabetical order as dictated by the Apple SensorKit platform. Participants were informed that enabling SensorKit was optional and could be disabled at any time. No additional incentives were provided for opting into the collection of SensorKit data. Device data were tracked as part of the study for the first two months of the intern year.
Data analysis
We descriptively assessed overall SensorKit opt-in and retention rates, and used chi-square analyses to assess differences in the initial opt-in rate and in retention after 2 months between each of the 5 sensor options (Ambient Light, Keyboard Metrics, Messages Usage, Phone Usage, and Frequently Visited Locations). We also used chi-square analyses to evaluate differences in SensorKit enrollment by the following self-reported demographic characteristics: age, gender, and race. For analysis purposes, racial groups were combined into the following categories: White, Asian, and underrepresented in medicine (URiM) (S1 Table). Residents were coded as URiM according to the American Association of Medical Colleges (AAMC) definition: “racial and ethnic populations that are underrepresented in the medical profession relative to their numbers in the general population.” In this study, this group included interns self-identifying as African American, Arab or Middle Eastern, Latino, Native American, Pacific Islander, other, or multi-racial. We then conducted chi-square analyses comparing national intern race data provided by the AAMC [14] with enrollment in the parent study to assess the overall racial representativeness of the sample. Analyses were conducted using SAS version 9.4 (SAS Institute). Statistical tests were 2-sided and used a significance threshold of P < .05.
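As an illustrative sketch only (the study itself used SAS 9.4), the chi-square comparison and an accompanying effect size can be computed from an opt-in contingency table in a few lines of Python. The counts below are the race-group figures reported in Results; interpreting the reported "ES" as Cramér's V is our assumption, and exact values may differ slightly from the published figures depending on coding and missing-data decisions:

```python
# Hedged reconstruction of the race-group opt-in comparison (not the
# authors' SAS code). Rows are race groups; columns are opted in vs. not.
import math

table = [
    [399, 596 - 399],  # White: 399 of 596 opted in
    [136, 253 - 136],  # Asian: 136 of 253
    [159, 308 - 159],  # URiM:  159 of 308
]

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]

# Pearson chi-square statistic: sum over cells of (observed - expected)^2 / expected,
# where expected = row_total * col_total / n
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(table))
    for j in range(len(table[0]))
)

dof = (len(table) - 1) * (len(table[0]) - 1)  # (3 - 1) * (2 - 1) = 2
# For exactly 2 degrees of freedom, the chi-square survival function has
# the closed form P(X > x) = exp(-x / 2); a general-dof analysis would
# use scipy.stats.chi2_contingency instead.
p_value = math.exp(-chi2 / 2)

# Cramér's V effect size: sqrt(chi2 / (n * min(rows - 1, cols - 1)))
cramers_v = math.sqrt(chi2 / (n * min(len(table) - 1, len(table[0]) - 1)))

print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.2g}, V = {cramers_v:.3f}")
```

The same pattern applies to the sensor-by-sensor opt-in comparisons; only the table dimensions change (and, for general degrees of freedom, the p-value would come from a chi-square distribution rather than the df = 2 closed form used here).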
Results
Among the 1437 incoming intern physicians who enrolled in the parent study, 1164 were iPhone users and were invited to participate in the SensorKit arm of the study. Of those, 695 (59.7%) enabled at least one SensorKit data type during initial study onboarding. The mean participant age was 27.7 years (IQR 26–29), and 396 participants (57.0%) were women. In all, 399 participants (57.4%) self-identified as White, 136 (19.6%) as Asian, and 159 (22.9%) as a race or ethnicity underrepresented in medicine (URiM). A significant difference in opt-in rates was observed between data types at enrollment (P < .001, ES = 0.08) and at 2 months (P < .001, ES = 0.08), with Ambient Light the highest enrolled and Frequently Visited Locations the lowest enrolled at both time points (Fig 2). At 2 months, 94.0% (653/695) of participants were still providing SensorKit data. The sensors most likely to be disabled after initial opt-in were Keyboard Metrics (8.70%) and Ambient Light (6.26%), followed by Messages Usage (5.31%), Phone Usage (4.90%), and Frequently Visited Locations (4.59%).
Fig 2. Apple SensorKit opt-in rate at enrollment and retention rates at month 2 by data type.
There were no significant differences in opt-in rates based on age or gender. However, interns who self-identified as White (67.0% [399/596]) were significantly more likely to opt into providing SensorKit data than those who identified as Asian (53.8% [136/253]) or as races underrepresented in medicine (51.6% [159/308]) (P < .001, ES = 0.11). We used national intern data provided by the American Association of Medical Colleges (AAMC) [14] to determine whether the parent study enrollment rate also differed by self-identified race and found no significant difference; the racial difference emerged only at the SensorKit opt-in level.
Discussion
Data privacy and security are growing concerns in mobile health research [15,16]. Yet, little is known about the feasibility of collecting these data on a large scale, or their acceptability to research participants [17]. In this study, we establish that successful participation rates can be achieved in a carefully designed longitudinal e-Cohort study that includes the collection of Apple SensorKit data. However, this may be contingent on SensorKit opt-in remaining optional. Future research should investigate how requiring participants to enable particular SensorKit data types affects enrollment rates.
Further, we found that the opt-in rate for Ambient Light was significantly greater than that for the other data types, with the lowest opt-in rate for Frequently Visited Locations (geographical location). Even though the SensorKit permissions screens clearly state that GPS data and specific addresses are not collected, it is possible that participants were unsure how this differed from other applications that do collect and exploit this kind of personal information. It is also unclear whether this difference was due to variation in privacy concerns associated with each data type, or to the fact that the data types were presented in alphabetical order as required by Apple, with the highest opt-in rate for the sensor presented first (“Ambient Light”) and the lowest rate for the sensor presented last (“Visits”). Further investigation of the role of sensor type order on opt-in rates is needed.
Additionally, once SensorKit was enabled, we found that the overwhelming majority of participants (94.0%) continued to provide SensorKit data after 2 months. This stands in contrast to adherence to daily mood reports in the Intern Health Study, which often declines over the course of the year, and highlights the advantage of passive sensing for longer-term monitoring and detection [18]. Furthermore, while most recent studies on sensor technologies do not focus on user engagement and adherence, a few mHealth studies have found poor study adherence, reporting that close to fifty percent of participants uninstalled their app or stopped engaging with it within weeks of enrollment [8,19]. In the future, SensorKit data could be analyzed with corresponding daily-level mood data to identify predictive features, allowing mood inferences to be made from ongoing SensorKit data alone, long after participants stop completing daily self-report mood ratings.
While the proportion of each racial group enrolled in the overall Intern Health Study closely reflected national US intern data, we found that interns self-identifying as Asian or as races/ethnicities underrepresented in medicine (e.g., African American, Latino) were significantly less likely to opt into providing SensorKit data than their peers self-identifying as White. Most research on the “digital divide” has focused primarily on age and socioeconomic disparities [20,21]. However, this work indicates that there may be an important racial difference in acceptability of digital phenotyping, consistent with racial differences in other classes of medical research due to historical discrimination [21–23]. This aligns with qualitative work that suggests that experiences of discrimination and surveillance from employers and government institutions may lead to increased awareness of privacy risks among marginalized groups, and greater likelihood of taking steps to protect online privacy [24]. Similarly, a recent study of social media users found higher reports of privacy concerns among Asian and Latinx users compared to White users [25].
Much of the research in this area suggests that the primary driver of a potential “online privacy divide” is lower socioeconomic status and fewer resources (e.g., access to technology skills training) [26]. Yet, this study shows that even among a population of young people with iPhones and the same educational and occupational status, racial identity still affects willingness to opt into potentially more sensitive sensor and phone usage data. These findings suggest that motivation for privacy management in this population may be based more on identity and associated experiences (e.g., discrimination), which can erode trust, than on inequalities in resources.
Beyond privacy concerns, other recent work suggests the potential for racial bias in mobile health research, based on potential inaccuracies in wearable device data for people with darker skin tones [27]. Prior concerns about representation in research including smartphone sensor data have also been raised due to variability in smartphone penetration across different global regions, low participation rates in mHealth studies, and possible differences between smartphone owners and non-owners [28]. Future studies should more closely investigate drivers of these disparities and consider implications for mobile health equity more broadly.
Limitations
To our knowledge, this is the first national e-Cohort study to evaluate SensorKit enrollment and retention rates. While most prior work on digital privacy concerns has been limited by small sample sizes, this study leverages a large, demographically diverse sample of US medical interns to assess racial and ethnic differences in passive sensor acceptability, extending the literature on digital health equity.
One limitation of the current study is the sample homogeneity in terms of age and education level, limiting the generalizability of the results. Participants in their late twenties with a medical degree may be more comfortable with technology or have fewer privacy concerns compared with older populations or individuals with other educational backgrounds. Another limitation was our inability to customize, and ideally randomize, the sensor type order during the opt-in process due to the technical constraints of the Apple SensorKit platform. This may have affected overall willingness to participate (for example, if a more “sensitive” sensor type was presented early on), or resulted in fewer participants enabling data types presented later. As the SensorKit data collection platform is only available for Apple devices, we were unable to include participants without iPhones in this analysis. Future research should assess enrollment and retention rates for the collection of comparable sensor data on other smartphone platforms, as well as assess the accuracy and reliability of sensor data collected using various platforms. While our results indicate racial disparities in SensorKit opt-in rates, we did not directly assess reasons for opting out and how these may vary based on racial identity. Using qualitative methods to further investigate potential barriers to participation, particularly regarding issues such as privacy and security, would be useful.
Conclusion
Taken together, passive sensing methods such as those in SensorKit provide an opportunity to detect and intervene with individuals struggling with mental or physical health problems, particularly those that are chronic and have extended periods of potential risk. While there is significant potential for these methods, combined with computational advancements, to improve health outcomes, these opportunities must be balanced against privacy concerns and individuals’ willingness to participate in these types of programs. Significant health disparities already exist for minoritized populations, and mHealth approaches must be careful not to widen this gap.
Supporting information
aResidents were coded as underrepresented in medicine according to the American Association of Medical Colleges definition as “racial and ethnic populations that are underrepresented in the medical profession relative to their numbers in the general population.” In this study, this group included interns self-identifying as African American, Arab or Middle Eastern, Latino, Native American, Pacific Islander, other, or multi-racial. bSurgical specialties were assigned based on the American College of Surgeons classification. Specifically, for this study, physicians in the following specialties were classified as “surgical”: General Surgery, Gynecology and Obstetrics, Neurological Surgery, Orthopaedic Surgery, Otolaryngology, Plastic Surgery, Urology, and Other surgical. Physicians from the following specialties were classified as “non-surgical”: Internal Medicine, Pediatrics, Psychiatry, Neurology, Emergency Medicine, Internal Medicine-Pediatrics, Family Medicine, Family Practice, Anesthesiology, Dermatology, Medical Genetics, Nuclear Medicine, Pathology, Physical Medicine & Rehabilitation, Preventative Medicine, Radiation Oncology, Radiology-Diagnostic, Sleep Medicine, and Other Non-surgical.
(DOCX)
(DOCX)
(DOCX)
Acknowledgments
We thank John Brussolo and Siqing Hu from the University of Michigan HITS database team for their support of this project.
Data Availability
SensorKit data cannot be shared publicly because of Apple's data sharing restrictions. De-identified general study enrollment and survey data are available via the ICPSR repository (https://www.openicpsr.org/openicpsr/project/129225/version/V1/view) or from the corresponding author upon reasonable request and completion of a data use agreement with the University of Michigan.
Funding Statement
National Institutes of Health (R01MH101459). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Henson P, Pearson JF, Keshavan M, Torous J. Impact of dynamic greenspace exposure on symptomatology in individuals with schizophrenia. PLoS One. 2020;15(9):e0238498. doi: 10.1371/journal.pone.0238498
- 2. Hsu M, Ahern DK, Suzuki J. Digital phenotyping to enhance substance use treatment during the COVID-19 pandemic. JMIR Ment Health. 2020;7(10):e21814. doi: 10.2196/21814
- 3. Mohr DC, Zhang M, Schueller SM. Personal sensing: understanding mental health using ubiquitous sensors and machine learning. Annu Rev Clin Psychol. 2017;13:23–47. doi: 10.1146/annurev-clinpsy-032816-044949
- 4. Nahum-Shani I, Smith SN, Spring BJ, Collins LM, Witkiewitz K, Tewari A, et al. Just-in-time adaptive interventions (JITAIs) in mobile health: key components and design principles for ongoing health behavior support. Ann Behav Med. 2018;52(6):446–62. doi: 10.1007/s12160-016-9830-8
- 5. Hardeman W, Houghton J, Lane K, Jones A, Naughton F. A systematic review of just-in-time adaptive interventions (JITAIs) to promote physical activity. Int J Behav Nutr Phys Act. 2019;16(1):31.
- 6. Keusch F, Struminskaya B, Antoun C, Couper MP, Kreuter F. Willingness to participate in passive mobile data collection. Public Opin Q. 2019;83(Suppl 1):210–35. doi: 10.1093/poq/nfz007
- 7. Ságvári B, Gulyás A, Koltai J. Attitudes towards participation in a passive data collection experiment. Sensors (Basel). 2021;21(18):6085. doi: 10.3390/s21186085
- 8. Bauer M, Glenn T, Geddes J, Gitlin M, Grof P, Kessing LV, et al. Smartphones in mental health: a critical review of background issues, current status and future concerns. Int J Bipolar Disord. 2020;8(1):2. doi: 10.1186/s40345-019-0164-x
- 9. Jaworski B, Hooper M, Aklin W, Jean-Francois B, Elwood W, Belis D, et al. Advancing digital health equity: directions for behavioral and social science research. Transl Behav Med. 2023;13(3):132–9.
- 10. Langholm C, Kowatsch T, Bucci S, Cipriani A, Torous J. Exploring the potential of Apple SensorKit and digital phenotyping data as new digital biomarkers for mental health research. Digit Biomark. 2023;7(1):104–14. doi: 10.1159/000530698
- 11. Mahalingaiah S, Fruh V, Rodriguez E, Konanki SC, Onnela J-P, de Figueiredo Veiga A, et al. Design and methods of the Apple Women’s Health Study: a digital longitudinal cohort study. Am J Obstet Gynecol. 2022;226(4):545.e1–545.e29. doi: 10.1016/j.ajog.2021.09.041
- 12. Fang Y, Bohnert ASB, Pereira-Lima K, Cleary J, Frank E, Zhao Z, et al. Trends in depressive symptoms and associated factors during residency, 2007 to 2019: a repeated annual cohort study. Ann Intern Med. 2022;175(1):56–64. doi: 10.7326/M21-1594
- 13. SensorKit documentation. Apple, Inc; 2025. Available from: https://developer.apple.com/documentation/sensorkit
- 14. Request AAMC data. Association of American Medical Colleges; 2023 [cited 2024 Jan 9]. Available from: https://www.aamc.org/request-aamc-data
- 15. Onnela J-P. Opportunities and challenges in the collection and analysis of digital phenotyping data. Neuropsychopharmacology. 2021;46(1):45–54. doi: 10.1038/s41386-020-0771-3
- 16. Lui GY, Loughnane D, Polley C, Jayarathna T, Breen PP. The Apple Watch for monitoring mental health-related physiological symptoms: literature review. JMIR Ment Health. 2022;9(9):e37354. doi: 10.2196/37354
- 17. Huckvale K, Venkatesh S, Christensen H. Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety. NPJ Digit Med. 2019;2:88. doi: 10.1038/s41746-019-0166-1
- 18. NeCamp T, Sen S, Frank E, Walton MA, Ionides EL, Fang Y, et al. Assessing real-time moderation for developing adaptive mobile health interventions for medical interns: micro-randomized trial. J Med Internet Res. 2020;22(3):e15033. doi: 10.2196/15033
- 19. Abdullah S, Choudhury T. Sensing technologies for monitoring serious mental illnesses. IEEE MultiMedia. 2018;25(1):61–75. doi: 10.1109/mmul.2018.011921236
- 20. Rodriguez JA, Clark CR, Bates DW. Digital health equity as a necessity in the 21st Century Cures Act era. JAMA. 2020;323(23):2381–2. doi: 10.1001/jama.2020.7858
- 21. Mitchell UA, Chebli PG, Ruggiero L, Muramatsu N. The digital divide in health-related technology use: the significance of race/ethnicity. Gerontologist. 2019;59(1):6–14. doi: 10.1093/geront/gny138
- 22. Shavers VL, Lynch CF, Burmeister LF. Knowledge of the Tuskegee study and its impact on the willingness to participate in medical research studies. J Natl Med Assoc. 2000;92(12):563–72.
- 23. Boulware LE, Cooper LA, Ratner LE, LaVeist TA, Powe NR. Race and trust in the health care system. Public Health Rep. 2003;118(4):358–65. doi: 10.1093/phr/118.4.358
- 24. Marwick A, Fontaine C, Boyd D. “Nobody sees it, nobody gets mad”: social media, privacy, and personal responsibility among low-SES youth. Soc Media Soc. 2017;3(2):2056305117710455. doi: 10.1177/2056305117710455
- 25. Wang LH, Metzger MJ. The online privacy divide: testing resource and identity explanations for racial/ethnic differences in privacy concerns and privacy management behaviors on social media. Commun Res. 2024. doi: 10.1177/00936502241273157
- 26. Dodel M. Inequalities and privacy in the context of social media. In: Trepte S, Masur P, editors. The Routledge handbook of privacy and social media. Routledge; 2023. p. 204–14.
- 27. Colvonen PJ, DeYoung PN, Bosompra N-OA, Owens RL. Limiting racial disparities and bias for wearable devices in health science research. Sleep. 2020;43(10):zsaa159. doi: 10.1093/sleep/zsaa159
- 28. Keusch F, Conrad FG. Using smartphones to capture and combine self-reports and passively measured behavior in social research. J Surv Stat Methodol. 2021;10(4):863–85.