Author manuscript; available in PMC: 2023 Sep 25.
Published in final edited form as: Am J Obstet Gynecol. 2021 Oct 2;226(4):545.e1–545.e29. doi: 10.1016/j.ajog.2021.09.041

Design and methods of the Apple Women’s Health Study: a digital longitudinal cohort study

Shruthi Mahalingaiah 1, Victoria Fruh 2, Erika Rodriguez 3, Sai Charan Konanki 4, Jukka-Pekka Onnela 5, Alexis de Figueiredo Veiga 6, Genevieve Lyons 7, Rowana Ahmed 8, Huichu Li 9, Nicola Gallagher 10, Anne Marie Z Jukic 11, Kelly K Ferguson 12, Donna D Baird 13, Allen J Wilcox 14, Christine L Curry 15, Sanaa Suharwardy 16, Tyler Fischer-Colbrie 17, Gracee Agrawal 18, Brent A Coull 19, Russ Hauser 20, Michelle A Williams 21
PMCID: PMC10518829  NIHMSID: NIHMS1924931  PMID: 34610322

Abstract

BACKGROUND:

Prospective longitudinal cohorts assessing women’s health and gynecologic conditions have historically been limited.

OBJECTIVE:

The Apple Women’s Health Study was designed to gain a deeper understanding of the relationship among menstrual cycles, health, and behavior. This paper describes the design and methods of the ongoing Apple Women’s Health Study and provides the demographic characteristics of the first 10,000 participants.

STUDY DESIGN:

This was a mobile-application-based longitudinal cohort study involving survey and sensor-based data. We collected the data from 10,000 participants who responded to the demographics survey on enrollment between November 14, 2019 and May 20, 2020. The participants were asked to complete a monthly follow-up through November 2020. Participants were eligible if they had installed the Apple Research app on their iPhone with iOS version 13.2 or later, were living in the United States, were at least 18 years old (at least 19 years old in Alabama and Nebraska, at least 21 years old in Puerto Rico), were comfortable communicating in written and spoken English, were the sole user of an iCloud account or iPhone, and were willing to provide consent to participate in the study.

RESULTS:

The mean age at enrollment was 33.6 years (standard deviation, 10.3). The race and ethnicity distribution was representative of the US population (69% White and non-Hispanic [6910/10,000]), and 51% (5089/10,000) had a college education or above. The participant geographic distribution included all the US states and Puerto Rico. Seventy-two percent (7223/10,000) reported the use of an Apple Watch, and 24.4% (2438/10,000) consented to sensor-based data collection. For this cohort, 38% (3490/9238) did not respond to the Monthly Survey: Menstrual Update after enrollment. At the 6-month follow-up, there was a 35% (3099/8972) response rate to the Monthly Survey: Menstrual Update. Overall, 82.7% (8266/10,000) of the initial cohort and 95.1% (2948/3099) of the participants who responded to month 6 of the Monthly Survey: Menstrual Update tracked at least 1 menstrual cycle via HealthKit. The participants tracked their menstrual bleeding days for an average of 4.44 (25%–75% range, 3–6) calendar months during the study period. Non-White participants were slightly more likely to drop out than White participants; those remaining at 6 months were otherwise similar in demographic characteristics to the original enrollment group.

CONCLUSION:

The first 10,000 participants of the Apple Women’s Health Study were recruited via the Research app and were diverse in race and ethnicity, educational attainment, and economic status, despite all using an Apple iPhone. Future studies within this cohort incorporating these high-dimensional data may facilitate discovery in women’s health regarding exposure-outcome relationships and population-level trends among iPhone users. Retention efforts centered on education, communication, and engagement, such as the study update feature, will be used to improve the survey response rates.

Keywords: digital health, longitudinal cohort, menstrual cycles, women’s health

Introduction

Multiple factors influence the menstrual cycle length, duration of flow, and cyclicity. Common factors include stress, nutrition and body weight, physical activity, and combination lifestyle factors. Furthermore, menstrual cycle function is an indicator of health and longevity.1 For example, bone density is built and maintained in the presence of cyclic ovarian function.2 Fertility, terminal breast differentiation, and aspects of cognition and memory are also influenced by menstrual status and function.3–5 The endocrine pathways involved in menstrual cycle physiology include the hypothalamic-pituitary-ovarian axis and a healthy functioning of the thyroid and adrenal glands; the anatomic structures necessary include the uterus, endometrium, and ovaries.

Furthermore, irregular patterns or the absence of menstrual cycles can serve as indicators of underlying health problems. The presence of irregular menstrual cycles may indicate hormonal disturbances, including thyroid disorders and prolactinomas; systemic illness (including cancers); or anatomic pathology such as uterine fibroids, polyps, or adenomyosis. It has also been linked to an increase in all-cause mortality.1 Abnormalities in the bleeding length and amount affect up to 30% of those who menstruate.6 The existing data on the menstrual cycle characteristics, including the normal distributions of the cycle length and bleeding days, are based on prior and well-designed but demographically limited cohorts.7,8 Little is known about how these conditions vary across subpopulations or how they relate to future health conditions. An understanding of population-based variation in menstrual cycle characteristics is lacking.

The objectives of this study are to (1) advance the understanding of the menstrual cycle, including how it relates to exercise, sleep, environmental, behavioral, and other physiological processes, and (2) inform screening and risk assessment for gynecologic health conditions using menstrual cycle, reproductive, health, and sensor data. We present here the design of the Apple Women’s Health Study (AWHS) and the characteristics of the first 10,000 participants enrolled in the study, including their follow-up after 6 months of participation.

Materials and Methods

Survey design

The survey questions for the AWHS were derived from previous, similarly large longitudinal studies on women’s health. These surveys included the National Health and Nutrition Examination Survey developed by the Centers for Disease Control and Prevention, the Perceived Stress Scale, the All of Us study (a large research program sponsored by the National Institutes of Health), Nurses’ Health Study 2, and the Ovulation and Menstruation Health Study.9–12 The questions were modified for a mobile user experience and were standardized across 2 other research studies also housed in the Apple Research app. For menstrual-cycle-specific questions, the exposures hypothesized to affect the hypothalamic-pituitary-ovarian axis in a cycle-specific way were included. Once the research teams had completed the initial drafting of the baseline survey, it was reviewed by Apple, Inc. to conform to their design standards across the 3 studies.

The interface for the Research app was designed to be intuitive and simple. The research profile completed at enrollment is common to all 3 studies and includes the initial demographic survey. To implement the intuitive and simple survey concept, the baseline women’s health study survey was split across the initial year of the study to distribute the participant burden over time as noted in Supplemental Table 1. The health history survey was given at month 4, and the reproductive history was given at month 10. The Monthly Survey: Menstrual Update (MSMU) was initiated after the first completed month of enrollment and then monthly thereafter. The time to survey completion was estimated in 5-minute increments on the basis of usability testing by Apple, Inc. The response time for the enrollment processes (screening, eligibility, demographics) was estimated to be 25 minutes. The expected response times as listed in the Research app for monthly surveys such as the MSMU, Pregnancy Update, and Lactation Update were 5 minutes each.

Apple research app platform

The study is hosted within the Apple Research app (available on the Apple App Store), which allows a participant to find, enroll, and participate in Apple-supported, health-related research studies.

The Research app was designed with 3 studies for a simultaneous launch, where the demographics and certain questions were standardized across the studies. The other 2 studies were the Apple Heart and Movement Study and the Apple Hearing Study.13,14

All the shared data are stored securely in a system within Apple that is designed to meet the technical safeguard requirements of the Health Insurance Portability and Accountability Act. Access to any contact information or other identifying data that the participants provide through the Research app is restricted to the Principal Investigator staff.

Study population description

The AWHS is a longitudinal cohort study of persons who have menstruated at least once in their life. The study began enrolling on November 14, 2019. The planned duration of this study is 10 years, that is, until November 2029, with a potential for extension or additional long-term follow-up. The goal is to recruit 500,000 participants over 10 years. The study was approved by the Advarra Central Institutional Review Board (number PRO00037562) and registered to ClinicalTrials.gov (ClinicalTrials.gov Identifier: NCT04196595).

Eligibility, screening, and consent

Individuals were eligible for enrollment if they had installed the Apple Research app on their iPhone 6s or later with iOS version 13.2 or later, were living in the United States, had menstruated at least once, were at least 18 years old (at least 19 years old in Alabama and Nebraska, at least 21 years old in Puerto Rico), were comfortable communicating in written and spoken English, were the sole users of their iCloud account or iPhone, and were willing to provide informed consent to participate in the study (Supplemental Table 2).
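For illustration only, the eligibility criteria listed above could be encoded as a simple screening check. This is a minimal sketch and not the Research app's implementation; the `Applicant` record, its field names, and the two-letter state codes are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Minimum enrollment age by state or territory, as stated in the criteria above.
MIN_AGE_BY_STATE = {"AL": 19, "NE": 19, "PR": 21}
DEFAULT_MIN_AGE = 18
MIN_IOS_VERSION = (13, 2)


@dataclass
class Applicant:
    """Hypothetical self-reported screening record (field names are illustrative)."""
    age: int
    state: str                      # two-letter code, e.g. "MA"
    lives_in_us: bool
    ios_version: Tuple[int, int]    # (major, minor), e.g. (13, 2)
    iphone_6s_or_later: bool
    has_menstruated: bool
    comfortable_in_english: bool
    sole_user_of_account_or_phone: bool
    consents: bool


def is_eligible(a: Applicant) -> bool:
    """Return True only if every stated criterion is met."""
    min_age = MIN_AGE_BY_STATE.get(a.state, DEFAULT_MIN_AGE)
    return all([
        a.iphone_6s_or_later,
        a.ios_version >= MIN_IOS_VERSION,   # tuple comparison: (14, 0) >= (13, 2)
        a.lives_in_us,
        a.has_menstruated,
        a.age >= min_age,
        a.comfortable_in_english,
        a.sole_user_of_account_or_phone,
        a.consents,
    ])
```

Under this sketch, an 18-year-old applicant in Massachusetts who meets all other criteria is eligible, whereas an otherwise identical applicant in Alabama is not.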

After self-report and self-verification of eligibility, the participants followed the steps for enrollment as delivered by the Research app. They were instructed to read the informed consent form (ICF), which is included in Appendix A, and provide an electronic signature if they consented to participate; a part of this process is shown in Supplemental Figure 1. The ICF covers key aspects, including the study objectives, study procedures (surveys and demographic data), and the types of study data that require additional consent (Health app Data, Research Sensor and Usage Data, Medical Conditions and History, and other related information such as clinically sourced data from the Health app). The ICF also describes the participation risks (theoretical potential for breach of confidentiality) and benefits (no incentives or compensation) and the option for study withdrawal at any time for any reason. The current informed consent document is available in the supplemental documents. The participants who were enrolling were informed that they would be reconsented every 2 years.

The individuals who downloaded the Research app, enrolled in AWHS, and met the inclusion criteria proceeded to study participation. They were then provided the survey tasks for completion, starting with the demographics section.

Incentive: There was no compensation for study participation.

Recruitment and marketing

On September 10, 2019, the Apple Research app was announced publicly. The Research app houses the following 3 research studies: the AWHS, the Apple Heart and Movement Study, and the Apple Hearing Study. The AWHS was launched on November 14, 2019. Institutional review board-approved recruitment efforts included a Harvard T.H. Chan School of Public Health study website,15 which was made public and includes frequently asked questions and background information regarding the study. Social media accounts were created on Twitter and Instagram (March 10, 2020), YouTube (March 12, 2020), Facebook (April 13, 2020) and LinkedIn (August 7, 2020) to post recruitment materials inviting potential participants to join the study, with new posts added approximately 1 to 3 times per week. In addition, media events included a podcast,16 media articles, and press interviews after study launch.

Data Collection and Methods

Survey data

On enrollment, the participants who onboarded with the AWHS responded to the Research Profile Survey within the Research app and to the Demographic Survey; the data included the year of birth, state of residence, race and ethnicity, marital status, employment status, gender identity, and sex assigned at birth (Supplemental Figure 1 and Supplemental Table 3). The participants were asked to respond to a baseline Menstrual Status Survey at enrollment (Supplemental Table 3). The response categories for the baseline Menstrual Status Survey included currently menstruating, lactating, menopausal, and pregnant. For the remaining months within the first year of follow-up, the participants were given monthly surveys to provide a menstrual status update, pregnancy update, and/or a lactation update. Health surveys were given quarterly to assess the changes in sleep, physical activity, nutrition, stress, alcohol use, tobacco use, electronic nicotine use, and marijuana use within the previous calendar month. Reproductive history was assessed at the tenth month (Supplemental Table 1). As detailed in Supplemental Table 3, a more extensive health overview (including questions on nutrition, sleep, stress, alcohol use, tobacco use, electronic nicotine use, marijuana use, second-hand smoke, and overall self-rated health) and medical history (gynecologic, endocrine, heart, blood, digestive, kidney, lung, musculoskeletal, cancer, brain and nervous system, mental health and infectious disease conditions, surgeries, and family medical history) are given annually, initially at month 4 and month 7, respectively (Figure 1). A summary of the surveys, topics, number of questions, and sources of questions is shown in Table 4.

FIGURE 1.

Timeline of data collection

TABLE 4.

Summary of surveys, topics, number and source of questions

| Surveys | Topics | Number of questions | Source of questions |
|---|---|---|---|
| Profile | Name, date of birth, email, phone number, country, and state | 7 | Study-specific |
| Demographics | Race and ethnicity, socioeconomic status, zip code, body measurements, sex, gender identity | 11 | Study-specific, All of Us study (NIH), Behavioral Risk Factor Surveillance System, Centers for Disease Control and Prevention, US Census Current Population Survey, MacArthur Scale of Subjective Social Status |
| Menstrual status survey | Menstrual status, tracking cycles, tracking pregnancy, tracking health | 4 | Study-specific |
| Monthly survey: Menstrual update | Menstrual status, monthly tracking check-in, hormone use, health factors, pregnancy planning | 6 | Study-specific |
| Monthly survey: Pregnancy update | Pregnancy update, due date, pregnancy details, lactation | 10 | Study-specific |
| Monthly survey: Lactation update | Lactation update | 1 | Study-specific |
| Annual health survey | Physical activity, nutrition, sleep habits, stress level, alcohol use, tobacco use, electronic nicotine use, marijuana use, second-hand smoke, overall health | 30 | Study-specific, National Health and Nutrition Examination Survey, Centers for Disease Control and Prevention, Perceived Stress Scale, All of Us study (NIH) |
| Quarterly health survey | Health update (sleep, nutrition, physical activity, stress, alcohol use, nicotine use) | 6 | Study-specific |
| Annual medical history | Menstrual status; gynecologic, endocrine, heart, blood, digestive, kidney, lung, musculoskeletal, mental health and brain and nervous system conditions; cancer, infectious diseases, surgeries, gynecologic surgeries, gender and sexual orientation, family medical history, medical forms | 24 | Study-specific, derived from OM Study, and Brief Health Literacy Screening Tool |
| Reproductive history | Participant’s birth details, early menstrual cycles, hormone use, hormone use details, pregnancy history | 18 | Study-specific and derived from OM Study |

Adapted from Harvard University,10 Denny et al,11 Cohen et al,12 Adler et al,19 Centers for Disease Control and Prevention,30 Chew et al,31 Centers for Disease Control and Prevention.32

NIH, National Institutes of Health.

Monthly surveys and timing

The participants receive monthly surveys on the basis of the status reported in the baseline Menstrual Status Survey. The monthly surveys include the MSMU, the Lactation Update, and the Pregnancy Update.

The MSMU included a total of 6 questions to ascertain the status in the last calendar month regarding the accuracy of tracked period days, status update for pregnancy, lactation, and menopause, current hormone use, reason for hormone use, and monthly update of health factors (change in exercise or weight, significant sleep disruption, stress, illness, and hospitalization or surgery). The purpose of the MSMU is to give additional context for the logged menstrual bleeding days and associated symptom logging in HealthKit. The MSMU is given to those who have previously indicated the capacity to currently menstruate on the baseline survey; it is not given to those who have indicated current pregnancy, menopause, or to those who are no longer menstruating. The MSMU is delivered within the Research app on the first Sunday of every month, starting in month 2 of participation, at least 4 weeks after enrollment. The survey is available for 1 week after delivery and then expires and is no longer available to the participant. Programming logic was applied in the analysis phase as follows: the monthly surveys were counted when the participants responded at least 4 weeks after enrollment. Any monthly survey with participant responses before 4 weeks of enrollment was excluded from this count. For this analysis, the respondents were censored at pregnancy and menopause.
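The delivery and counting rules described above can be sketched in code. The following is an illustrative reading of those rules, not the Research app's scheduling logic: it assumes that the first MSMU is the first "first Sunday of a month" falling at least 4 weeks after enrollment, and all function names are hypothetical.

```python
from datetime import date, timedelta


def first_sunday(year: int, month: int) -> date:
    """First Sunday of the given month (date.weekday(): Monday=0 ... Sunday=6)."""
    first = date(year, month, 1)
    return first + timedelta(days=(6 - first.weekday()) % 7)


def first_msmu_delivery(enrolled: date) -> date:
    """Assumed rule: earliest first-Sunday delivery at least 4 weeks after enrollment."""
    earliest = enrolled + timedelta(weeks=4)
    year, month = earliest.year, earliest.month
    while True:
        candidate = first_sunday(year, month)
        if candidate >= earliest:
            return candidate
        # Otherwise move on to the next calendar month.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def survey_expiry(delivered: date) -> date:
    """The survey is available for 1 week after delivery."""
    return delivered + timedelta(weeks=1)


def counts_toward_analysis(enrolled: date, responded: date) -> bool:
    """Analysis rule: count only responses given at least 4 weeks after enrollment."""
    return responded >= enrolled + timedelta(weeks=4)
```

For a participant who enrolled on the launch date (November 14, 2019), this sketch skips Sunday, December 1, 2019, because it falls fewer than 4 weeks after enrollment, and returns the first Sunday of January 2020.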

Research sensor and usage data

The participants consented to the collection of data and derived metrics that were retrieved from certain iPhone sensors and from a paired optional Apple Watch. These include frequently visited locations with an anonymized identifier, watch-on-wrist state, and Optical Sensor (including accelerometer and heart rate) as summarized in Supplemental Table 4.

HealthKit data

HealthKit provides a central repository for health and fitness data on iPhones and Apple Watches. With the user’s permission, specific apps communicate with HealthKit to access and share these data while maintaining the user’s privacy and control. HealthKit stores the data merged from multiple sources and contains data types such as menstrual bleeding days, symptoms and associated user-recorded features (cervical mucus changes, temperature changes, sexual activity), heart rate, sleep analysis, and clinical records (lab tests, diagnoses) from clinical interfaces. A bleeding event can be tracked either via Cycle Tracking or by any third-party menstrual tracking application that has been permitted to write to HealthKit by the participant. This allows for ongoing collection of menstrual characteristics and symptoms and reduces missing data owing to lack of direct participant input. For this descriptive analysis, we evaluate the months of contributed cycle tracking data, where each logged bleeding event in HealthKit was attributed to the calendar month of that bleeding event.
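As a hedged illustration of how logged bleeding days might be attributed to calendar months, the sketch below counts the distinct calendar months that contain at least 1 logged bleeding day within a study window. The input format (a list of dates) and the function name are assumptions made for this example; they are not the HealthKit export format or the study's analysis code.

```python
from datetime import date
from typing import Iterable, Set, Tuple


def tracked_calendar_months(
    bleeding_days: Iterable[date], start: date, end: date
) -> Set[Tuple[int, int]]:
    """Return the (year, month) pairs with at least one logged bleeding day
    between start and end, inclusive. Each day is attributed to its own month."""
    return {(d.year, d.month) for d in bleeding_days if start <= d <= end}


# Hypothetical participant: one day logged before the window, three inside it.
logged = [date(2019, 10, 30), date(2020, 1, 3), date(2020, 1, 4), date(2020, 2, 1)]
months = tracked_calendar_months(logged, date(2019, 11, 14), date(2020, 5, 20))
n_months = len(months)  # January and February 2020 -> 2 tracked months
```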

Participation

This study was designed to allow participation through the following ways: (1) response to survey questions; (2) contribution of HealthKit data, which include logged menstrual cycles; and (3) contribution of SensorKit data, which include sensor-based data streams.

In general, a participant is considered to be actively participating by contributing data from any of these sources.

Given that one of the main goals of this study is to understand menstrual health and its risk factors, we report on (1) the response rates to the MSMU and (2) tracked bleeding events in HealthKit as measures of ongoing participation for this descriptive analysis.

Nonparticipation in the MSMU is defined as no response to the monthly survey. Invalid responses are defined as those responses that seem biologically unlikely, such as a concurrent state of menopause and pregnancy. We defined completion rate (% completed) as the proportion of individuals who have completed the survey among those eligible for the baseline Menstrual Status Survey or MSMU.
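The definitions above translate directly into simple calculations. The sketch below is illustrative only; the function names and the dictionary keys used for a survey response are hypothetical.

```python
def completion_rate(completed: int, eligible: int) -> float:
    """Percentage of eligible individuals who completed the survey."""
    if eligible <= 0:
        raise ValueError("eligible must be a positive count")
    return 100.0 * completed / eligible


def is_invalid_response(response: dict) -> bool:
    """Flag a biologically unlikely response, using the example given in the
    text: a concurrent state of menopause and pregnancy."""
    return bool(response.get("menopausal")) and bool(response.get("pregnant"))


# Counts reported in the abstract for month 6 of the MSMU.
month6_rate = completion_rate(3099, 8972)
```

Applied to the counts reported in the abstract, 3099 respondents among 8972 eligible participants rounds to the stated 35%.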

Defining demographic variables

Race and ethnicity

We used the race and ethnicity question from the NIH-sponsored All of Us study, which asks, “Which categories describe you? Select all that apply: American Indian or Alaska Native [subchoices: American Indian, Alaska Native, Central or South American Indian, none of these fully describe me, I prefer not to answer]; Asian [subchoices: Asian Indian, Cambodian, Chinese, Filipino, Hmong, Japanese, Korean, Pakistani, Vietnamese, none of these fully describe me, I prefer not to answer]; Black, African American, or African [subchoices: African American, Barbadian, Caribbean, Ethiopian, Ghanaian, Haitian, Jamaican, Liberian, Nigerian, Somali, South African, none of these fully describe me, I prefer not to answer]; Hispanic, Latino, or Spanish [subchoices: Colombian, Cuban, Dominican, Ecuadorian, Honduran, Mexican or Mexican American, Puerto Rican, Salvadoran, Spanish, none of these fully describe me, I prefer not to answer]; Middle Eastern or North African [subchoices: Afghan, Algerian, Egyptian, Iranian, Iraqi, Israeli, Lebanese, Moroccan, Syrian, Tunisian, none of these fully describe me, I prefer not to answer]; Native Hawaiian or other Pacific Islander [subchoices: Chamorro, Chuukese, Fijian, Marshallese, Native Hawaiian, Palauan, Samoan, Tahitian, Tongan, none of these fully describe me, I prefer not to answer]; White [subchoices: Dutch, English, European (not listed), French, German, Irish, Italian, Norwegian, Polish, Scottish, Spanish, none of these fully describe me, I prefer not to answer]; none of these fully describe me; I prefer not to answer.” In field testing, the All of Us focus group participants found it burdensome to answer the 2 separate questions typically used to determine race and ethnicity. The All of Us study, from which our question is sourced, therefore uses a single “select all that apply” question to determine race and ethnicity. We classified the responses into the following traditional reporting categories of race and ethnicity: White, non-Hispanic; Hispanic, Latina, Spanish and/or other Hispanic; Black or African American or African; Asian; Other; >1 race or ethnicity, in nonrestrictive categories.17

MacArthur social status scale and socioeconomic status

We utilized the MacArthur Scale of Subjective Social Status, which asks, “Think of this ladder as representing where people stand in the country you live in. At the top of the ladder are the people who are the best off – those who have the most money, the most education, and the most respected jobs. At the bottom are the people who are the worst off – those who have the least money, least education, the least respected jobs, or no job. The higher up you are on this ladder, the closer you are to the people at the very top; the lower you are, the closer you are to the people at the very bottom. Where would you place yourself on this ladder? Please select where you think you stand at this time in your life relative to other people around you. Ladder 0 (worst off) to 10 (best off).” The scale has been noted to correlate with health status across the lifespan.18,19 Notably, the MacArthur Scale is correlated with but is not the same as objective socioeconomic status (SES).20 One main benefit of the MacArthur Scale is that it may be more applicable as a marker of subjective social status than using measures of objective SES in non-White populations.18 We categorized the responses into: 0 to 3 corresponding to low, 4 to 5 corresponding to middle, and 6 to 9 corresponding to high. There are no established categorizations for this variable. We refer to this metric as the SES.
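The categorization described above can be written as a small mapping function. This is a minimal sketch with a hypothetical function name. Note that the ladder is described as running from 0 to 10, whereas the stated categories cover 0 to 9; because the text does not say how a response of 10 is handled, this sketch leaves it uncategorized rather than guessing.

```python
from typing import Optional


def ses_category(ladder: Optional[int]) -> Optional[str]:
    """Map a MacArthur ladder response to the categories used in this paper."""
    if ladder is None:
        return None          # skipped or missing response
    if 0 <= ladder <= 3:
        return "low"
    if 4 <= ladder <= 5:
        return "middle"
    if 6 <= ladder <= 9:
        return "high"
    return None              # e.g. 10: handling is not specified in the text
```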

The study population described in this paper was enrolled from November 14, 2019 until May 20, 2020, and we allowed at least 6 months of follow-up for all the women enrolled during this time.

Planned statistical analyses

Statistical analyses are planned for the following 3 categories: (1) longitudinal analysis of survey data, (2) longitudinal analysis of passively collected smartphone and watch data, and (3) longitudinal analysis of associations between the survey data and passively collected data. In each category, we will perform exploratory descriptive analyses and formulate more specific hypothesis-driven models.

For all 3 types of longitudinal analyses, we will use longitudinal extensions of regression methods, such as linear and generalized linear mixed models,21 statistical learning techniques for high-dimensional data,22 and functional data analysis methods.23 For the longitudinal analysis of the survey data, we will quantify the associations of participant characteristics and risk factors with the menstrual cycle length, and how these associations vary across the age range of the study population. For the longitudinal analysis of passive data, we will perform individual-level analyses to identify the possible change points in behaviors over time and how they relate to subsequent health outcomes. For the longitudinal analysis of passive and survey data, the predictors will initially be the daily summary statistics derived from passively collected data and the outcomes of interest will consist of all the items on which survey data are available.

Study statistical power

We estimated the power to detect meaningful differences in (1) the mean menstrual cycle length and (2) the prevalence of health conditions of interest. We estimated loss to follow-up (20%) and incomplete survey response (90%–95%) over the course of 12 menstrual cycles. To calculate a conservative power estimate, we estimated a high incomplete survey response rate. This rate assumes an incomplete response if there is any skip of any question across the year. The survey is designed to not require a response to any question; a participant may choose not to answer by selecting a prefer not to answer response option or advancing through the survey questions. We present the conservative power calculations on the basis of a 2-sided alpha=0.01 test rather than the more common alpha=0.05 level (Supplemental Table 5).

Menstrual cycle

We estimated our power to detect the differences in the mean menstrual cycle length among a general dichotomy of the study sample. Such a comparison will be of interest when comparing the outcome across subpopulations defined by demographics (eg, race and ethnicity), lifestyle factors (eg, low or high physical activity), or potential exposures (eg, marijuana products or alcohol consumption) (Supplemental Table 6). Although the longitudinal methods will use all cycles from all participants, we evaluated scenarios where 5% to 10% of 400,000 participants provide menstrual cycle data on 12 cycles and conservatively calculated minimally detectable differences in cycle length using data from those 5% to 10%. We assumed a two-sided alpha=0.01 test, and a within-participant longitudinal correlation in the cycle length of 0.6.24 We estimate that we will have the power to detect the menstrual cycle length differences of <1 day for all the scenarios considered, as shown in Supplemental Table 6.
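To make the reasoning concrete, the sketch below computes a minimally detectable difference in mean cycle length for a two-group comparison using a normal approximation. It is not the calculation behind Supplemental Table 6. It assumes an exchangeable within-participant correlation of 0.6 across 12 cycles and a 2-sided alpha of 0.01 (both from the text), together with values that are purely hypothetical here: a between-cycle standard deviation of 7 days, 80% power, and an even split between the 2 groups.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_difference(
    n_total: int,
    sd_cycle: float,            # hypothetical SD of a single cycle length (days)
    rho: float = 0.6,           # within-participant correlation (from the text)
    cycles: int = 12,           # cycles contributed per participant (from the text)
    alpha: float = 0.01,        # 2-sided significance level (from the text)
    power: float = 0.80,        # hypothetical target power
    frac_group1: float = 0.5,   # hypothetical share of participants in group 1
) -> float:
    """Smallest difference in mean cycle length detectable between 2 groups."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    # Variance of a participant's mean over correlated cycles (exchangeable).
    var_mean = sd_cycle ** 2 * (1 + (cycles - 1) * rho) / cycles
    n1 = n_total * frac_group1
    n2 = n_total - n1
    return (z_alpha + z_beta) * sqrt(var_mean * (1 / n1 + 1 / n2))


# 5% and 10% of 400,000 participants, as in the scenarios described above.
mdd_5pct = min_detectable_difference(n_total=20_000, sd_cycle=7.0)
mdd_10pct = min_detectable_difference(n_total=40_000, sd_cycle=7.0)
```

Under these assumed inputs, both scenarios yield a detectable difference below 1 day, which is in line with the statement above; different assumptions about the standard deviation or group sizes would change the value.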

Gynecologic health conditions

We estimated the power of the study to detect differences in the prevalence of health conditions for high-risk vs low-risk groups for various initial sample sizes and health condition prevalence estimates in the baseline group (Supplemental Table 6). We estimate that we will have the power to detect relatively small differences in prevalence for all scenarios, even when the health condition is rare (0.5% prevalence), and the high-risk group is small (10% of the sample).
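A comparable sketch for prevalence uses the standard normal approximation for comparing 2 proportions. It is an illustration under stated assumptions rather than the authors' calculation: the baseline prevalence of 0.5% and the 10% high-risk share come from the text, whereas the total sample size and the high-risk prevalence used in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(
    n_low: int, n_high: int, p_low: float, p_high: float, alpha: float = 0.01
) -> float:
    """Approximate power of a 2-sided test comparing 2 independent proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    # Pooled prevalence under the null hypothesis of no difference.
    p_bar = (n_low * p_low + n_high * p_high) / (n_low + n_high)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n_low + 1 / n_high))
    se_alt = sqrt(p_low * (1 - p_low) / n_low + p_high * (1 - p_high) / n_high)
    return z.cdf((abs(p_high - p_low) - z_alpha * se_null) / se_alt)


# Hypothetical scenario: 20,000 participants, 10% in the high-risk group,
# 0.5% prevalence in the low-risk group, 1.2% in the high-risk group.
example_power = power_two_proportions(
    n_low=18_000, n_high=2_000, p_low=0.005, p_high=0.012
)
```

Doubling both group sizes in this sketch increases the estimated power, which reflects the general point that larger samples allow smaller differences in prevalence to be detected.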

Results

Baseline characteristics of the first 10,000 participants

We present the demographic characteristics of the first 10,000 enrolled and eligible participants who provided demographic information and show their retention across the first 6 months of participation. From the launch of the study to May 20, 2020, there were 11,113 people who downloaded the Research app and clicked through to the AWHS, 10,459 who consented, and 10,030 who responded to the demographics survey. Eleven participants formally withdrew from the study within their first month of participation. The participant flow for the first 10,000 participants is shown in Figure 2. On average, 370 participants enrolled per week.

FIGURE 2.

Participant flow in the Apple Women’s Health Study, first 10,000 participants. Study onboarding as of May 20, 2020; categories of ineligibility are not mutually exclusive

The mean (standard deviation) age at enrollment was 33.6 years (10.3). Although most of the participants were White and non-Hispanic (69%), 12% were Latina, 6% were Black, 4% were Asian, and 5% identified as >1 race and ethnicity. Other races and ethnicities represented include American Indian or Alaskan Native (3%). Most participants had graduated college or beyond (51%), were employed for pay (either part-time, full-time, or self-employed) (70%), and reported the use of an Apple Watch (72%). Most participants also gender-identified as a ‘woman’ (96%) and were assigned ‘female’ at birth (99%). Forty percent of the participants were married and 33% were never married. The distribution of SES for the 10,000 participants on the MacArthur Scale was as follows: low (0–3), 31%; middle (4–5), 42%; and high (6–9), 25%. At the time of enrollment, more participants were living in the Southern region of the United States (35%) than in any other region, although many women were also living in the Northeast (15%), Midwest (20%), and West (25%). A small proportion of women were living in the US territories (0.3%) or had no data available on their location (4%) (Table 1). The states with the most participants enrolled, adjusted for the total population by state from the 2010 census, were Massachusetts, Alabama, and Oregon (Supplemental Figure 2). Menstrual tracking data in HealthKit (either retroactive data or data collected during the study period) were available for 82.7% of the women, and SensorKit data on heart rate were available for 24% of the women (Table 1).

TABLE 1.

Demographics of Apple Women’s Health Study participants

| Demographics | Participants at baseline, first 10,000 participants^a: mean±SD, mean (25%–75% range), or % (n) | Participants responding to month 6 of the “Monthly Survey: Menstrual Update” (n=3099)^a: mean±SD, mean (25%–75% range), or % (n) |
|---|---|---|
| Age (y),^b mean±SD | 33.6±10.3 | 33.0±8.5 |
| Race, % (n) | | |
| White, non-Hispanic | 69.1 (6910) | 73.7 (2284) |
| Hispanic, Latina, Spanish and/or other Hispanic^c | 12.0 (1202) | 10.2 (317) |
| Black or African American or African | 6.1 (609) | 4.7 (147) |
| Asian | 4.3 (428) | 4.4 (135) |
| Other | 2.7 (274) | 2.2 (67) |
| >1 race | 5.1 (513) | 4.7 (145) |
| Prefer not to answer, or missing | 0.6 (64) | 0.1 (4) |
| Gender identity, % (n) | | |
| Woman | 96.1 (9612) | 97.2 (3013) |
| Man | 0.3 (31) | <0.1 (3) |
| Transwoman | <0.1 (1) | 0 (0) |
| Transman | 0.17 (17) | <0.1 (3) |
| Genderqueer or nonbinary | 1.04 (104) | 1.3 (39) |
| Another gender identity or multiple selected | 0.9 (92) | 0.8 (25) |
| I prefer not to answer | 0 (0) | 0 (0) |
| Skip or missing | 1.4 (143) | 0.5 (16) |
| Sex assigned at birth, % (n) | | |
| Female | 98.5 (9845) | 99.5 (3085) |
| Intersex | 0.2 (19) | 0 (0) |
| Skip or missing | 1.3 (136) | 0.5 (14) |
| Education, % (n) | | |
| Never attended school or only attended kindergarten | 0.2 (18) | <0.1 (1) |
| Grades 1 through 11 (primary, middle, or some high school) | 2.5 (250) | 1.3 (39) |
| Grade 12 or GED (high school graduate) | 12.4 (1237) | 9.1 (280) |
| 1 to 3 y after high school (technical school or some college or associate’s degree) | 32.6 (3261) | 29.1 (904) |
| College 4 y or more (college graduate) | 30.5 (3053) | 35.5 (1099) |
| Master’s degree | 15.6 (1563) | 18.9 (586) |
| Doctorate degree | 4.7 (473) | 5.5 (171) |
| I prefer not to answer | 0 (0) | 0 (0) |
| Skip or missing | 1.5 (145) | 0.6 (19) |
| Employment status, % (n) | | |
| Employed for pay (part-time, full-time, self-employed) | 70.3 (7031) | 77.1 (2388) |
| Unemployed | 5.6 (559) | 3.5 (109) |
| Unable to work (ie, disability, illness, other circumstances) | 4.2 (417) | 2.8 (86) |
| In school | 11 (1102) | 10.7 (331) |
| Taking care of house or family | 5.9 (584) | 5.0 (156) |
| In retirement | 1.5 (152) | 0.2 (7) |
| I prefer not to answer | 0.9 (97) | 0.5 (14) |
| Missing | 0.6 (58) | 0.3 (8) |
| Marital status, % (n) | | |
| Married | 39.6 (3951) | 41.5 (1285) |
| Divorced | 8.3 (831) | 6.8 (210) |
| Widowed | 0.8 (82) | 0.4 (12) |
| Separated | 1.9 (194) | 1.4 (43) |
| Never married | 33.5 (3347) | 34.6 (1071) |
| A member of an unmarried couple | 14.2 (1416) | 14.5 (450) |
| I prefer not to answer | 0 (0) | 0 (0) |
| Skip or missing | 1.8 (179) | 0.9 (28) |
| Socioeconomic status, % (n) | | |
| 0–3 | 31.5 (3146) | 33.3 (1032) |
| 4–5 | 42.2 (4223) | 44.4 (1376) |
| 6–9 | 25.4 (2535) | 21.9 (678) |
| Missing | 0.96 (96) | 0.4 (13) |
| Region of the United States, % (n) | | |
| Northeast | 15.3 (1528) | 15.6 (484) |
| Midwest | 20.3 (2033) | 20.9 (649) |
| South | 34.8 (3477) | 32.7 (1013) |
| West | 24.9 (2492) | 25.8 (801) |
| Territory | 0.3 (25) | 0.3 (8) |
| Data not available | 4.5 (445) | 4.6 (144) |
| Apple Watch users, % (n) | 72.2 (7223) | 82.5 (2557) |
| Heart Rate SensorKit Data Opt-In Authorization, % (n) | 24.4 (2438) | 31.2 (968) |
| Menstrual flow HealthKit tracking | | |
| Tracked at least 1 menstrual cycle ever, % (n) | 82.7 (8266) | 95.1 (2948) |
| Tracked at least 1 menstrual cycle within 3 mo of enrollment, % (n) | 70.6 (7064) | 89.5 (2772) |
 Tracked at least 1 menstrual cycle within 6 mo of enrollment, % (n)   72.4 (7236)   91.4 (2833)
 Average number of calendar months tracked before baseline   8.08 (25%–75%; range, 2–13)   9.12 (25%–75%; range, 3–16)
 Average number of calendar months tracked within 3 mo of enrollment (among those tracking)^d   2.67 (25%–75%; range, 2–3)   2.94 (25%–75%; range, 3–4)
 Average number of calendar months tracked within 6 mo of enrollment (among those tracking)^d   4.44 (25%–75%; range, 3–6)   5.29 (25%–75%; range, 5–6)

GED, general equivalency diploma; SD, standard deviation.

^a Subset for the first 10,000 participants enrolled and providing demographic data that met inclusion criteria (through May 20, 2020). First demographics entry captured for each participant.

^b Age based on birth year.

^c Includes all individuals identifying as Hispanic (exclusively or in addition to another race group). Detailed breakdown available in Supplemental Table 8.

^d Number of calendar months where any bleeding events were tracked during the 3-month follow-up period can range in value from 0 to 4 because a single menstrual cycle can span 2 calendar months. Number of calendar months tracked during the 6-month follow-up period can range from 0 to 7 months.
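
The calendar-month count described in footnote d can be illustrated with a minimal Python sketch. It counts the distinct calendar months that contain at least 1 tracked bleeding day within a follow-up window. The function name `calendar_months_tracked`, the 90-day window length, and the example dates are assumptions made for illustration; they are not taken from the study's code.

```python
from datetime import date, timedelta

def calendar_months_tracked(bleeding_days, start, window_days=90):
    """Count distinct calendar months with at least 1 tracked bleeding day.

    Only days in the half-open window [start, start + window_days) are counted.
    The 90-day default is an assumed stand-in for a 3-month follow-up period.
    """
    end = start + timedelta(days=window_days)
    months = {(d.year, d.month) for d in bleeding_days if start <= d < end}
    return len(months)

# Hypothetical participant enrolled mid-month: the window touches Jan, Feb, Mar, and Apr.
enrolled = date(2020, 1, 15)
days = [
    date(2020, 1, 30), date(2020, 2, 1),  # one bleed spanning 2 calendar months
    date(2020, 3, 2),
    date(2020, 4, 10),
]
print(calendar_months_tracked(days, enrolled))  # 4 calendar months
```

Because a 3-month window that starts mid-month overlaps 4 calendar months, the count in this sketch can reach 4 even though the follow-up period is 3 months long.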

Supplemental Table 7 describes the selected characteristics obtained from ResearchKit of the 429 participants who enrolled in the study during the same time but did not respond to the demographic questionnaire. Among these participants, the mean age was 34 years. When compared with enrollees who responded to the demographics questionnaire, a greater proportion of those who did not respond were from the South (40% vs 35%), fewer were Apple Watch users (45% vs 72%), and fewer opted in to the authorization of SensorKit heart rate access (5% vs 24%).

On the basis of the self-reported menstrual status at the time of enrollment, 88% reported actively menstruating, 2% were lactating, 6% were menopausal, and 1% were pregnant. The baseline status and the monthly menstrual statuses across 6 months of follow-up are outlined in Table 2.

TABLE 2.

Monthly menstrual survey status count

Menstrual status Baseline % (n) Survey #1 % (n) Survey #2 % (n) Survey #3 % (n) Survey #4 % (n) Survey #5 % (n) Survey #6 % (n)
Menstruating 87.6 (8763) 55.4 (5540) 42.7 (4274) 37.9 (3791) 36.6 (3659) 33.6 (3355) 30.2 (3021)
Lactation   1.8 (183)   1.0 (102)   0.8 (83)   0.6 (61)   0.5 (53)   0.5 (45)   0.4 (38)
Menopause   6.3 (634)   0.3 (26)   0.2 (22)   0.1 (11) <0.1 (6)   0.1 (10) <0.1 (5)
Pregnant   1.1 (110)   0.7 (67)   0.3 (32)   0.4 (35)   0.3 (31)   0.3 (26)   0.3 (32)
Invalid response^a <0.1 (7)   0 (0)   0 (0)   0 (0)   0 (0)   0 (0)   0 (0)
Prefer not to answer   0.5 (54)   0.1 (13) <0.1 (2) <0.1 (4) <0.1 (2) <0.1 (1) <0.1 (2)
No response^b   2.5 (249) 34.9 (3490) 47.3 (4732) 51.9 (5189) 52.9 (5293) 55.7 (5571) 58.7 (5873)
Ineligible^c -   7.6 (762)   8.6 (855)   9.1 (909)   9.6 (955)   9.9 (992) 10.3 (1028)

Subset for the first 10,000 participants enrolled and providing demographic data that met inclusion criteria (through May 20, 2020).

^a Invalid responses include those that are not biologically feasible (eg, “Pregnant” and “Menopause”) and are most likely because of user error.

^b No metadata are captured for individuals who do not respond to the baseline or monthly surveys. This value is derived by subtracting the “Count Completed” column from the “Count Eligible” column in Table 3.

^c Individuals who reported a menstrual status of “Menopause,” “Pregnant,” or with an invalid response at baseline or in the monthly survey during a prior month are not eligible to receive the survey in subsequent months. In addition, participants who withdrew (N=11) from the study are not eligible to receive the survey. The ‘Ineligible’ group is a cumulative sum across the 6-month follow-up period.
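
The derivation of the “No response” and “Ineligible” rows can be checked with a short Python sketch that uses the counts reported in Table 3. The names `TABLE_3`, `COHORT_SIZE`, and `derive_rows` are hypothetical, and treating “Ineligible” as the cohort size minus the eligible count is an assumption that happens to reproduce the published values.

```python
# Counts copied from Table 3: (survey label, count completed, count eligible).
TABLE_3 = [
    ("Baseline", 9751, 10_000),
    ("1", 5748, 9238),
    ("2", 4413, 9145),
    ("3", 3902, 9091),
    ("4", 3752, 9045),
    ("5", 3437, 9008),
    ("6", 3099, 8972),
]
COHORT_SIZE = 10_000  # first 10,000 participants

def derive_rows(table, cohort_size=COHORT_SIZE):
    """Return the no-response and cumulative ineligible counts per survey."""
    rows = {}
    for label, completed, eligible in table:
        rows[label] = {
            # Eligible participants with no recorded survey response.
            "no_response": eligible - completed,
            # Assumed: everyone in the cohort who is no longer eligible.
            # This is 0 at baseline, which Table 2 shows as "-".
            "ineligible": cohort_size - eligible,
        }
    return rows

derived = derive_rows(TABLE_3)
for label, row in derived.items():
    print(label, row["no_response"], row["ineligible"])
```

For survey 1, for example, this gives 3490 participants with no response and 762 ineligible participants, matching the corresponding cells in Table 2.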

Retention

Among the first 10,000 enrollees, 6519 (65.2%) responded to at least 1 MSMU. Of those eligible, 3099 participants responded to the 6-month check-in (Table 1). A total of 5748 participants responded to the first MSMU, and 4413 and 3902 responded to the second and third, respectively, as summarized in Table 3. The demographics of those at baseline (N=10,000) and those who responded to the month 6 questionnaire (N=3099) are shown in Table 1. The participants who contributed to the sixth MSMU were broadly similar to the enrolled cohort, differing mostly in being more likely to be educated, White, and unmarried. Among the consistent responders, more were Apple Watch users (83% vs 72%), and they showed higher levels of tracking through HealthKit than the enrolled cohort.

TABLE 3.

Monthly menstrual survey response totals

Menstrual survey number  Count completed  Count eligible^a  Percentage completed^b
Baseline 9751 10,000 97.5
1 5748 9238 62.2
2 4413 9145 48.3
3 3902 9091 42.9
4 3752 9045 41.5
5 3437 9008 38.2
6 3099 8972 34.5

^a Subset for the first 10,000 participants enrolled and providing demographic data that met inclusion criteria (through May 20, 2020). Monthly menstrual surveys started in January 2020. Consecutive responses to surveys not required. Individuals who reported a menstrual status of “Menopause” or “Pregnant” at baseline or in the monthly survey were not eligible to receive the survey in subsequent months. In addition, participants who withdrew from the study (N=11) are not eligible to receive the survey.

^b Percentage completed refers to the proportion of individuals who were eligible to receive the baseline or monthly menstrual survey and completed the survey. Calculated using the ‘Count Completed’ and ‘Count Eligible’ columns.
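
The percentage completed is a simple ratio of the two count columns. The following Python sketch reproduces it; the function name `percentage_completed` and the rounding to 1 decimal place are assumptions chosen to match how the values are printed in Table 3.

```python
def percentage_completed(completed: int, eligible: int) -> float:
    """Return completed / eligible as a percentage rounded to 1 decimal place."""
    if eligible <= 0:
        raise ValueError("eligible count must be positive")
    return round(100 * completed / eligible, 1)

# Counts taken from the first and sixth monthly survey rows of Table 3.
first_survey = percentage_completed(5748, 9238)
sixth_survey = percentage_completed(3099, 8972)
print(first_survey, sixth_survey)
```

The denominator is the number of participants still eligible for each survey, not the full cohort of 10,000, so the percentages are not directly comparable with the cohort-wide percentages in Table 2.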

Of the 10,000 participants enrolled, 82.7% tracked menstrual bleeding days via HealthKit at some point in the 2 years before study entry or in the 6 months of follow-up. In total, 72.4% of the participants tracked at least 1 menstrual bleeding event during the 6-month follow-up period, and the participants tracked an average of 4.44 (25%–75%; range, 3–6) calendar months during the study period. Among those who completed the sixth MSMU, 95.1% contributed a tracked bleeding event, and 91.4% tracked at least 1 menstrual bleeding event during the 6 months of follow-up. Among this subset, the participants tracked an average of 5.29 (25%–75%; range, 5–6) months of cycle data (Table 1).

Survey response time

The participants spent a median of 1.83 minutes completing the demographic survey (25th–75th percentile range, 1.43–2.45 minutes) and a median of 0.77 minutes completing the MSMU (25th–75th percentile range, 0.55–1.10 minutes) (Supplemental Table 10).

Discussion

Principal findings

This is the first app-based study of this scope with the goal of collecting longitudinal data for at least 10 years. Among the first 10,000 participants enrolled, the principal findings of this study are that (1) the racial and ethnic distribution of the enrolled cohort was similar to that of the US population, (2) non-White participants were slightly more likely to drop out of the study than White participants over 6 months of follow-up, (3) the geographic distribution of the AWHS participants included all the US states and Puerto Rico, and (4) most of the participants had graduated college or beyond, were employed for pay, and reported the use of an Apple Watch.

Clinical implications

The study aims to advance the understanding of menstrual cycles, the factors affecting both within-woman and between-woman variability, and the associations with various health conditions. The study will have novel opportunities to examine variations in reproductive physiology using passively collected data such as physical activity and heart rate. The possible clinical implications include understanding the associations between population-based exposures among iPhone users and the reproductive outcomes including the menstrual cycle length and irregularity, infertility, menopause, and chronic diseases such as cancer and heart disease measured via Health Records in HealthKit.

Research implications

Among the 10,502 participants who downloaded the app and were eligible, 96% enrolled and responded to the demographics section, which compares favorably with other longitudinal studies.25–27 The recruitment efforts were generally not targeted to specific populations but mainly comprised general media coverage and social media posts. Thirty-eight percent of the participants did not respond to the MSMU after enrollment, and 35% of the participants responded to the MSMU at the 6-month follow-up. At least 1 menstrual cycle was tracked via HealthKit by 82.7% (8266/10,000) of the initial cohort and 95.1% (2948/3099) of the participants who responded to month 6 of the MSMU. The participants contributed an average of 4.44 (25%–75%; range, 3–6) tracked calendar months during the 6-month follow-up period. Although the race and ethnicity distribution at baseline was more closely representative of the US population, we noted a slight increase in the proportion of White (non-Hispanic) race and ethnicity and a higher education status among the participants responding to month 6 of the MSMU than among the participants at enrollment. The survey completion times were also considerably shorter than expected.

Strengths and limitations

The AWHS overcomes several limitations of the existing studies of menstrual cycle characteristics, including (1) restriction to women attempting conception,25,28 (2) low response to reproductive questions surrounding menstrual cycle characteristics,26 and (3) underascertainment of the menstrual cycle characteristics for cycle-specific analysis.9 Furthermore, many existing cohorts lack ascertainment of basic reproductive history and menstrual cycle characteristics, despite having well-characterized outcome data for the leading causes of mortality and morbidity.29 This study builds on the Tremin study7 by extending racial and ethnic diversity and obtaining expanded demographic, anthropometric, lifestyle, and behavioral data elements from HealthKit and Research Sensor and Usage Data, which allow for unique exposure and outcomes assessment. Furthermore, though the findings may have limited generalizability outside of iPhone users, the digital platform may increase access, allowing participants to engage with this digital study.

Although the study has many strengths, several limitations must be considered. First, the study is limited to Apple iPhone users, and there is a concern about generalizability to other populations using other digital platforms. Second, the loss to follow-up measured by the response rates for the Monthly Survey: Menstrual Updates (MSMU) is higher than that indicated by other retention metrics such as tracked menstrual bleeding events in HealthKit. We have initiated retention efforts utilizing education, communication, and engagement through a study update feature within the app and on the study website to promote monthly survey completion. To ensure continued diversity in this cohort, care must be taken to both recruit and retain the population through effective engagement strategies. Furthermore, engagement strategies may be constructed to limit the loss to follow-up of certain subpopulations. Further evaluation of the data must be conducted to understand whether nonrandom dropout may bias the longitudinal effect estimates that are based on the monthly surveys, although tracked menstrual data are written to HealthKit, which will limit missingness and avoid reliance on survey responses.

Conclusion

The AWHS captures time-varying demographic, lifestyle, and behavioral data that are built into the Research app, an app that interfaces with multiple data sources. These data streams create an opportunity for the AWHS to contribute new knowledge about long-overlooked and understudied women’s health issues.

Supplementary Material

Supplementary Materials

AJOG at a Glance.

Why was this study conducted?

The Apple Women’s Health Study aims to advance the understanding of menstrual cycles and their relationship to health conditions such as infertility, menopause, and health across the lifespan. This paper describes the study design and methods, the demographics of the first 10,000 enrollees, and the study retention rates measured by the response to the Monthly Survey: Menstrual Update and menstrual tracking in HealthKit.

Key findings

The racial and ethnic distribution of the enrolled cohort was similar to that of the US population. Non-White participants were slightly more likely to drop out of the study than White participants over 6 months of follow-up. Notably, 38% (3490/9238 eligible participants) of the cohort did not respond to the Monthly Survey: Menstrual Update after enrollment. Thirty-five percent (3099/8972) of women who were eligible responded to the 6-month Monthly Survey: Menstrual Update. The participants tracked their menstrual bleeding days for an average of 4.44 (25%–75%; range, 3–6) calendar months during the 6-month follow-up period. In addition, 82.7% (8266/10,000) of the cohort tracked at least 1 menstrual cycle via HealthKit.

What does this add to what is known?

This is the first longitudinal research study of this scale and scope to use a mobile application to collect survey and sensor-based data on menstrual cycles and women’s health.

Acknowledgments

The AWHS team would like to thank all the participants for signing up for the study and contributing to the advancement of women’s health research. We would like to acknowledge Kaitlyn Haughey, MS, Manasvi Marathe, MPH, and Jill MacRae, MS, for their work in supporting the study as a part of the Harvard Study Staff at study initiation; Michael Grusby, PhD, who was involved in the initial phase of this project; the Harvard Information Technology Team, including Andy Ross, BCS, Noah Hulbert, MBA; and David Waxman, MBA, from the Office of Financial Services. We would like to acknowledge Richa Gujarati, MBA, former marketing team lead at Apple, Inc, for supporting the recruitment efforts for this project.

S.M. receives research funding from the National Institutes of Health (NIH), National Science Foundation, and March of Dimes. R.H. receives research funding from the NIH. B.A.C. receives research funding from the NIH and the United States Environmental Protection Agency. J.P.O. receives research funding from the NIH and Boehringer Ingelheim. J.P.O. received an unrestricted gift from Mindstrong Health in 2018. J.P.O. is a cofounder of a recently established commercial entity that operates in digital phenotyping outside of women’s health. C.L.C., T.F.C., and G.A. own Apple, Inc stock. The other authors report no conflict of interest.

This study received funding from Apple, Inc. The funder had no role in the analysis and interpretation of data. Support for A.M.Z.J., K.K.F., D.D.B., and A.J.W. was provided by the Intramural Research Program of the National Institute of Environmental Health Sciences, National Institutes of Health.

Contributor Information

Shruthi Mahalingaiah, Harvard T.H. Chan School of Public Health, Boston, MA.

Victoria Fruh, Harvard T.H. Chan School of Public Health, Boston, MA.

Erika Rodriguez, Harvard T.H. Chan School of Public Health, Boston, MA.

Sai Charan Konanki, Harvard T.H. Chan School of Public Health, Boston, MA.

Jukka-Pekka Onnela, Harvard T.H. Chan School of Public Health, Boston, MA.

Alexis de Figueiredo Veiga, Harvard T.H. Chan School of Public Health, Boston, MA.

Genevieve Lyons, Harvard T.H. Chan School of Public Health, Boston, MA.

Rowana Ahmed, Harvard T.H. Chan School of Public Health, Boston, MA.

Huichu Li, Harvard T.H. Chan School of Public Health, Boston, MA.

Nicola Gallagher, Harvard T.H. Chan School of Public Health, Boston, MA.

Anne Marie Z. Jukic, Epidemiology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC.

Kelly K. Ferguson, Epidemiology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC.

Donna D. Baird, Epidemiology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC.

Allen J. Wilcox, Epidemiology Branch, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC.

Christine L. Curry, Health, Apple Inc, Cupertino, CA.

Sanaa Suharwardy, Division of Maternal Fetal Medicine, Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, CA.

Tyler Fischer-Colbrie, Health, Apple Inc, Cupertino, CA.

Gracee Agrawal, Health, Apple Inc, Cupertino, CA.

Brent A. Coull, Harvard T.H. Chan School of Public Health, Boston, MA.

Russ Hauser, Harvard T.H. Chan School of Public Health, Boston, MA.

Michelle A. Williams, Harvard T.H. Chan School of Public Health, Boston, MA.

References
