Abstract
Purpose
Although improvements in breast cancer detection and treatment have significantly increased survival, important questions related to breast cancer risk, prognosis, and survivorship remain. This brief report describes the Health of Women (HOW) Study® methodology and characterizes the participants who completed the My Health Overview and My Breast Cancer modules.
Methods
The HOW Study® was a collection of cross-sectional, web-based modules designed to survey a large number of participants with and without breast cancer.
Results
A total of 42,540 participants completed the My Health Overview module, of whom 13,285 (31.2%) reported a history of breast cancer. The majority of participants were non-Hispanic white (94.3%), female (99.5%), married (74.1%), college educated (73.2%), post-menopausal (91.1%), and parous (68.8%), and reported breastfeeding their children (56.0%). A total of 11,670 participants reported a history of breast cancer in the My Breast Cancer module. The majority of survivors reported on their primary breast cancer, were diagnosed at age 40 years or older (83.5%), had either Stage I or Stage II breast cancer (63.1%), and were treated with surgery (98.8%), radiation (64.8%), and/or chemotherapy (62.3%).
Conclusions
The HOW Study® provides an innovative framework for collecting large amounts of epidemiological data in an efficient and minimally invasive way. Data are publicly available to researchers upon request.
Implications for Cancer Survivors
The HOW Study® can be leveraged to answer important questions about survivorship, and researchers are encouraged to utilize this new data source.
Keywords: Breast cancer, Survivorship, Quality of life, Risk factors
Introduction
Breast cancer is the most common cancer among women in the USA [1]. While improvements in breast cancer detection and treatment have increased the average 5-year survival rate to 91% for all subtypes [2], important questions related to breast cancer risk, prognosis, and survivorship remain. The Health of Women (HOW) Study® (NCT02334085) was initiated as a collaboration between the Dr. Susan Love Research Foundation (DSLRF) and the City of Hope to assess potential risk factors for breast cancer and to gather information related to breast cancer diagnosis, treatment, and the development of adverse sequelae. The HOW Study® investigators sought to design and field an online survey to enroll a large number of participants. This brief report describes the HOW Study® methodology and characterizes the participants who completed the My Health Overview and My Breast Cancer modules. These were the first two fielded surveys and underscore the success of the HOW Study® as one of the first research initiatives to use a large-scale, web-based approach.
Methods
Study design and setting
Six independent, web-based modules were developed to assess a breadth of topics, including general and reproductive health history, family health, physical activity, body composition, environmental exposures, quality of life, and breast cancer diagnosis and treatment (Table 1). Several features were built into each module to improve usability and data completeness. First, skip patterns were used to shorten the surveys and reduce irrelevant content. Second, survey items had a built-in “learn more” feature with a pop-up window that explained content in lay terms. Third, mandatory questions were flagged within the software to promote data completeness. Finally, participants could save, return to, and submit modules at any time.
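To make the skip-pattern and mandatory-question features concrete, the following is a minimal sketch in Python. It is not the HOW Study® software: the `Question` class, the `show_if` predicate, the question keys, and the question wording are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Answers = Dict[str, str]


@dataclass
class Question:
    key: str                # hypothetical item identifier
    text: str               # hypothetical item wording
    mandatory: bool = False
    # Skip pattern: the item is shown only if this predicate on earlier answers holds.
    show_if: Optional[Callable[[Answers], bool]] = None


def visible_questions(questions: List[Question], answers: Answers) -> List[Question]:
    """Return the items a participant should see given the answers so far."""
    return [q for q in questions if q.show_if is None or q.show_if(answers)]


def missing_mandatory(questions: List[Question], answers: Answers) -> List[str]:
    """Return keys of visible mandatory items that are still unanswered."""
    return [
        q.key
        for q in visible_questions(questions, answers)
        if q.mandatory and not answers.get(q.key)
    ]


# Illustrative items only; these are not taken from the actual modules.
QUESTIONS = [
    Question("ever_pregnant", "Have you ever been pregnant?", mandatory=True),
    Question(
        "live_births",
        "How many live births have you had?",
        mandatory=True,
        show_if=lambda a: a.get("ever_pregnant") == "yes",
    ),
    Question(
        "breastfed",
        "Did you breastfeed any of your children?",
        show_if=lambda a: a.get("live_births") not in (None, "0"),
    ),
]

answers = {"ever_pregnant": "yes"}
print([q.key for q in visible_questions(QUESTIONS, answers)])
print(missing_mandatory(QUESTIONS, answers))
```

In this sketch, a participant who has never been pregnant is never shown the follow-up items, and a submission is flagged only for mandatory items that were actually displayed.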
Table 1.
Module and survey domains in The Health of Women Study®
Module 1. My health overview | Module 2. My personal and family health history | Module 3. Health, weight, and exercise | |
---|---|---|---|
N = 42,540 | N = 22,540 | N = 17,026 | |
Sociodemographic information | Personal diagnoses and treatment | General Health | |
Reproductive health history | Screening practices | Sleep | |
Current health | Family history | Weight and weight fluctuations | |
Health limiting activities | | Physical activity | |
Tobacco and alcohol use | | Medication use | |
Module 4. Environmental exposures | Module 5. Quality of life | Module 6. My breast cancer | |
N = 13,979 | N = 11,570 | N = | |
Personal care | Breast cancer diagnosis and treatment | A. Primary Breast Cancer Diagnosis | 12,482 |
Product use | Chronic conditions | B. Primary Breast Cancer Treatment | 7462 |
Hobbies | Treatment-related symptoms | C. Local Recurrence | 3477 |
Passive smoking | Mental health | D. Metastatic Treatment | 3525 |
 | Physical functioning | | |
 | Patient provider communication | | |
 | Social network | | |
 | Finances and occupation | | |
Prior to releasing the HOW Study® modules, a beta launch was conducted in December 2009 to test the feasibility, usability, and delivery of content. Individuals from the Love Research Army (LoveRA, formerly the Army of Women), another DSLRF initiative with over 350,000 members at the time of the launch, were sent invitations via email to participate over the course of three weeks. The HOW Study® enrolled 25,423 individuals to beta test the first module, of whom 18% (n = 4611) identified as having a primary breast cancer diagnosis. Of the 249 women who completed the additional feedback survey on their user experience, 80% of respondents indicated no problems registering online; 97% had a clear understanding of the consent form; 87% were either comfortable or very comfortable providing their health information online; and 90% completed the survey in 25 min or less. The beta launch provided confidence that a large number of participants, both with and without a history of breast cancer, could be successfully recruited. Feedback from the beta launch was incorporated into the final version of each HOW Study® module.
The HOW Study® is best characterized as an adaptive design, in which modifications were integrated into each cross-sectional survey module throughout the field period to maximize reach, accessibility, and participation, regardless of geographic location. The study was open to anyone aged 18 and older with access to the internet. Members of the DSLRF LoveRA and the general public were invited to participate through e-blasts, speaking events, traditional media, and social media outlets. Modules were released sequentially from 2012 to 2014 and took, on average, 30–60 min to complete. The HOW Study® closed enrollment in late 2019, but here we present data from the most recent data freeze in 2016. The first module, My Health Overview, collected information on socio-demographics, reproductive health history, current health status, health behaviors, and health-limiting activities. A total of 44,819 individuals consented to the study and started the first module. Among the 42,540 participants who completed the first module (94.9% completion rate), 13,285 (31.2%) reported a history of breast cancer. Participants who reported a history of breast cancer were asked to complete a separate module (My Breast Cancer) to ascertain information related to diagnosis and treatment of primary, recurrent, and metastatic cancer. In the My Breast Cancer module, 11,670 participants indicated a history of breast cancer.
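The completion rate and the proportion reporting breast cancer quoted above follow directly from the enrollment counts. A brief Python check, in which the helper name `percent` is an illustrative choice and not part of the study's analysis code:

```python
def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to the given number of decimals."""
    return round(100 * part / whole, digits)


# Counts as reported in the text.
consented = 44_819               # consented and started My Health Overview
completed = 42_540               # completed My Health Overview
reported_breast_cancer = 13_285  # completers reporting a history of breast cancer

print(percent(completed, consented))               # completion rate
print(percent(reported_breast_cancer, completed))  # share with breast cancer history
```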
All individuals received a written explanation of the study purpose, protocol, confidentiality practices, and risks of participation before consent was obtained. No incentives were offered for study enrollment or survey completion. The study protocol and procedures were approved by the Western Institutional Review Board.
Statistical analysis
Results are presented as frequency distributions for ordinal and nominal variables. Differences in the distributions of baseline characteristics by breast cancer status were assessed using chi-squared tests. All p-values are two-sided, and analyses were performed using SAS version 9.4.
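As an illustration of the chi-squared comparison described above, the sketch below computes a Pearson chi-squared test for a 2 × 2 table in plain Python, using the menopausal status counts reported in Table 2. The study's analyses were run in SAS; this re-implementation is only an assumption-laden stand-in (no continuity correction, and the closed-form p-value shown is valid for one degree of freedom only). The function name `chi_squared_2x2` is hypothetical.

```python
import math
from typing import List, Tuple


def chi_squared_2x2(table: List[List[int]]) -> Tuple[float, float]:
    """Pearson chi-squared test (no continuity correction) for a 2 x 2 table.

    Returns the test statistic and the two-sided p-value for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-squared upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Rows: pre-menopausal, post-menopausal.
# Columns: breast cancer survivors, cancer-free adults (counts from Table 2).
menopausal_status = [[969, 1485], [10326, 14695]]

statistic, p_value = chi_squared_2x2(menopausal_status)
print(f"chi2 = {statistic:.2f}, p = {p_value:.2f}")
```

Run on these counts, the sketch gives a p-value of roughly .09, in line with the value reported for menopausal status in Table 2. Variables with more than two categories need the general chi-squared distribution, which this shortcut does not cover.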
Results
A total of 42,540 individuals, primarily located in the USA (<1% international representation), participated in the My Health Overview module, of whom 13,285 (31.2%) reported a history of breast cancer. Table 2 describes the characteristics of the My Health Overview sample. Overall, the majority of participants were non-Hispanic white (94.3%), female (99.5%), married (74.1%), college educated (73.2%), post-menopausal (91.1%), and parous (68.8%), and reported breastfeeding their children (56.0%). Compared to adults without a history of breast cancer, individuals with a breast cancer diagnosis were significantly older and more likely to have had some college or less, to be parous, to report never breastfeeding, and to report never using hormone replacement therapy. Breast cancer survivors were also more likely to report that their health was worse than a year ago and to report difficulty climbing a flight of stairs, walking one block, bathing, and dressing.
Table 2.
Characteristics of the HOW Study (n = 42,540)
Total sample | Breast cancer survivors | Cancer-free adults | ||
---|---|---|---|---|
n = 42,540 | n = 13,285 | n = 29,179 | ||
n (%) | n (%) | n (%) | p-value | |
Age | <.001 | |||
17–30 | 2846 (6.69) | 116 (0.87) | 2724 (9.34) | |
31–40 | 5604 (13.17) | 865 (6.50) | 4728 (16.20) | |
41–50 | 8117 (19.08) | 2499 (18.81) | 5608 (19.22) | |
51–60 | 12,656 (29.75) | 4613 (34.72) | 8018 (27.48) | |
61–70 | 10,914 (25.66) | 4126 (31.06) | 6768 (23.19) | |
70+ | 2403 (5.65) | 1066 (8.02) | 1333 (4.57) | |
Gender | .52 | |||
Male | 157 (0.37) | 52 (0.39) | 105 (0.36) | |
Female | 42,324 (99.51) | 13,221 (99.52) | 29,037 (99.51) | |
Other | 50 (0.12) | 12 (0.09) | 37 (0.13) | |
Ethnicity | .002 | |||
Non-Hispanic White | 39,941 (94.28) | 12,564 (94.82) | 27,356 (94.03) | |
Hispanic White | 980 (2.31) | 250 (1.89) | 729 (2.51) | |
Black | 677 (1.60) | 204 (1.54) | 472 (1.62) | |
American Indian/Pacific Islander | 87 (0.21) | 23 (0.17) | 64 (0.22) | |
Asian | 401 (0.95) | 130 (0.98) | 270 (0.93) | |
Other | 280 (0.66) | 79 (0.60) | 201 (0.69) | |
Education | <.001 | |||
High school/vocational school or less | 2412 (5.69) | 876 (6.61) | 1535 (5.27) | |
Associates degree or some college | 8965 (21.13) | 3117 (23.51) | 5845 (20.06) | |
Bachelor’s degree | 13,662 (32.21) | 3981 (30.03) | 9671 (33.19) | |
Graduate/professional school | 17,380 (40.97) | 5283 (39.85) | 12,085 (41.48) | |
Marital status | <.001 | |||
Married | 31,397 (74.11) | 9961 (75.26) | 21,427 (73.58) | |
Divorced | 4817 (11.37) | 1647 (12.44) | 3167 (10.88) | |
Widowed | 1573 (3.71) | 590 (4.46) | 983 (3.38) | |
Never married | 4581 (10.81) | 1038 (7.84) | 3542 (12.16) | |
BMI | <.001 | |||
Underweight | 739 (1.75) | 208 (1.57) | 528 (1.82) | |
Normal | 19,328 (45.66) | 5943 (45.00) | 13,354 (45.96) | |
Overweight | 12,227 (28.89) | 3997 (30.26) | 8212 (28.26) | |
Obese | 10,034 (23.71) | 3060 (23.17) | 6961 (23.96) | |
Reproductive risk factors | ||||
Age at menarche | .10 | |||
Never had a period | 11 (0.03) | 3 (0.02) | 8 (0.03) | |
<10 years | 719 (1.76) | 227 (1.78) | 492 (1.76) | |
10–11 years | 8430 (20.67) | 2583 (20.25) | 5841 (20.85) | |
12–13 years | 23,508 (57.63) | 7487 (58.69) | 16,016 (57.16) | |
14–15 years | 6722 (16.48) | 2030 (15.91) | 4691 (16.74) | |
16–17 years | 1399 (3.43) | 426 (3.34) | 972 (3.47) | |
Menopausal status | .09 | |||
Pre-menopausal | 2455 (8.9) | 969 (8.58) | 1485 (9.18) | |
Post-menopausal | 25,030 (91.1) | 10,326 (91.42) | 14,695 (90.82) | |
Ever used hormone replacement | <.001 | |||
No | 15,736 (56.71) | 7501 (65.51) | 8231 (50.53) | |
Yes | 12,014 (43.29) | 3949 (34.49) | 8058 (49.47) | |
Number of pregnancies | ||||
Never pregnant | 9890 (23.51) | 2545 (19.36) | 7343 (25.40) | <.001 |
1 | 5513 (13.10) | 1672 (12.72) | 3839 (13.28) | |
2 | 10,838 (25.76) | 3636 (27.65) | 7198 (24.90) | |
3 | 7994 (19.00) | 2664 (20.26) | 5327 (18.43) | |
4 | 4259 (10.12) | 1444 (10.98) | 2814 (9.73) | |
5+ | 3578 (8.50) | 1188 (9.03) | 2388 (8.26) | |
Number of live births | ||||
Nulliparous | 13,137 (31.18) | 3569 (27.09) | 9561 (33.03) | <.001 |
1 | 6977 (16.56) | 2249 (17.07) | 4727 (16.33) | |
2 | 14,610 (34.68) | 4899 (37.19) | 9708 (33.54) | |
3 | 5579 (13.24) | 1899 (14.42) | 3677 (12.70) | |
4 | 1440 (3.42) | 449 (3.41) | 991 (3.42) | |
5+ | 390 (0.93) | 108 (0.82) | 282 (0.97) | |
Breastfeeding | <.001 | |||
Nulliparous | 13,137 (31.18) | 3569 (27.09) | 9561 (33.03) | |
Yes | 23,610 (56.04) | 7533 (57.19) | 16,071 (55.52) | |
No | 5386 (12.78) | 2071 (15.72) | 3314 (11.45) | |
General health and health behaviors | ||||
General health compared to one year ago | ||||
Much better than a year ago | 3969 (9.36) | 1795 (13.54) | 2173 (7.46) | <.001 |
Somewhat better than a year ago | 6855 (16.16) | 2255 (17.01) | 4596 (15.77) | |
About the same | 25,814 (60.86) | 6613 (49.89) | 19,193 (65.85) | |
Somewhat worse than a year ago | 4835 (11.40) | 1872 (14.12) | 2960 (10.16) | |
Much worse than a year ago | 944 (2.23) | 720 (5.43) | 224 (0.77) | |
Health limits ability to climb a flight of stairs | <.001 | |||
No, not limited | 37,865 (89.30) | 11,249 (84.85) | 26,603 (91.32) | |
Yes, a little | 3810 (8.99) | 1659 (12.51) | 2148 (7.37) | |
Yes, a lot | 729 (1.72) | 349 (2.63) | 380 (1.30) | |
Health limits ability to walk one block | <.001 | |||
No, not limited | 39,491 (93.12) | 11,860 (89.50) | 27,615 (94.76) | |
Yes, a little | 2350 (5.54) | 1122 (8.47) | 1228 (4.21) | |
Yes, a lot | 568 (1.34) | 270 (2.04) | 298 (1.02) | |
Health limits ability to dress/bathe | <.001 | |||
No, not limited | 41,494 (97.81) | 12,770 (96.30) | 28,709 (98.49) | |
Yes, a little | 872 (2.06) | 455 (3.43) | 416 (1.43) | |
Yes, a lot | 59 (0.14) | 36 (0.27) | 23 (0.08) | |
Smoked 100 cigarettes in lifetime | <.001 | |||
No | 2604 (14.61) | 532 (9.02) | 2071 (17.37) | |
Yes | 15,222 (85.39) | 5365 (90.98) | 9849 (82.63) | |
Current smoker | <.001 | |||
No | 13,569 (89.27) | 4903 (91.49) | 8658 (88.05) | |
Yes | 1631 (10.73) | 456 (8.51) | 1175 (11.95) | |
Drank alcohol at least once a month for 6 months or more | .002 | |||
No | 6745 (15.97) | 2220 (16.81) | 4522 (15.59) | |
Yes | 35,479 (84.03) | 10,984 (83.19) | 24,482 (84.41) |
The HOW Study® recruited 11,670 breast cancer survivors into the My Breast Cancer module. The majority of survivors were diagnosed at age 40 years or older (83.5%), had either Stage I (31.6%) or Stage II (31.5%) breast cancer, and were treated with surgery (98.8%), radiation (64.8%), and/or chemotherapy (62.3%) (Table 3).
Table 3.
Primary breast cancer characteristics from the My Breast Cancer module (n = 11,670)
n (%) | |
---|---|
Age at diagnosis | |
20–29 | 13 (1.86) |
30–39 | 103 (14.74) |
40–49 | 269 (38.48) |
50–59 | 136 (19.46) |
60–69 | 151 (21.60) |
70+ | 27 (3.86) |
Stage | |
In situ | 2585 (22.87) |
I | 3571 (31.60) |
II | 3561 (31.51) |
III | 1365 (12.08) |
IV | 219 (1.94) |
Primary breast cancer detection method | |
Mammography | 3443 (53.45) |
Lump | 3175 (49.29) |
MRI | 538 (8.35) |
Ultrasound | 1105 (17.16) |
Skin redness | 127 (1.97) |
Breast pain | 397 (6.16) |
Surgery | |
No | 76 (1.18) |
Yes | 6362 (98.82) |
Radiation | |
No | 2265 (35.18) |
Yes | 4173 (64.82) |
Chemotherapy | |
No | 2428 (37.73) |
Yes | 4007 (62.27) |
Multiple primaries | |
No | 9724 (84.69) |
Yes | 1758 (15.31) |
Discussion
My Health Overview and My Breast Cancer, the first of the HOW Study® suite of modules, were fielded in 2012 with an intensive effort to enroll a large number of participants with and without breast cancer. The HOW Study® enrolled over 42,000 participants in the My Health Overview module and nearly 12,000 breast cancer survivors in the My Breast Cancer module. While other web-based breast cancer surveys have been conducted, the scope of these studies is typically intervention-specific (e.g., physical activity, social support) or limited to certain aspects of the cancer continuum. For example, the Metastatic Breast Cancer Project is a recent effort that accrued 4000 participants to discover new treatment modalities for metastatic breast cancer patients [https://www.mbcproject.org]. HOW data complement and enhance these existing sources because eligibility was open to individuals with and without breast disease across all facets of the cancer continuum.
The HOW Study® ascertained detailed information on breast cancer risk factors to test associations between demographic, reproductive, environmental, and lifestyle characteristics among individuals with and without a history of breast cancer. The My Breast Cancer module collected comprehensive clinical characteristics, including tumor staging, cancer treatments, recurrence, and metastasis, providing an in-depth study of breast cancer survivors. The Quality of Life module could be used to interrogate hypotheses of treatment-related side effects and chronic conditions that may impact activities of daily living, social participation, and other health-related quality of life outcomes.
The HOW Study® provides an innovative framework for collecting large amounts of epidemiological data in an efficient and minimally invasive way. Web-based surveys may address some of the limitations of conventional paper-and-pencil approaches to data collection because (1) large amounts of data can be collected quickly without the use of research staff; (2) survey length can be reduced with skip patterns; (3) mandatory responses can be programmed to improve data completeness; and (4) the cost of data collection is minimal after initial programming of the survey. Despite these advantages, online surveys have been criticized for lower or equivalent response rates compared to mailed surveys [3]. Furthermore, selection bias may be an issue with online surveys, as participation requires access to the internet, which may reduce enrollment of some important subgroups [4], although the proliferation of smartphone use and mobile infrastructure partially addresses this limitation. In the HOW Study®, over 30% of the sample were aged 60 and older, suggesting adequate representation of older adults, which is important given the growing number of older women diagnosed with breast cancer. However, engaging historically underrepresented and underserved populations proved more difficult, as the majority of the sample were educated, non-Hispanic white women. This is consistent with findings that web-based surveys tend to draw samples of higher socioeconomic status and that disparities by race, region, and income level remain [5]. Steps are needed to better engage underrepresented subgroups in online research.
There are some methodological limitations to using HOW Study® data. First, the source population from which study subjects were drawn is unknown, so response rates could not be calculated. Second, findings from the HOW Study® sample should not be generalized to the general population because participants were recruited by convenience sampling. However, the objective of the study was to elucidate factors that influence breast cancer risk and prognosis, rather than to conduct breast cancer surveillance; therefore, a nationally representative probability sample was not required. Third, all data were self-reported, although validation studies suggest high agreement between self-reported breast cancer characteristics and medical records [6–8]. Finally, the current breast cancer survivor sample reflects women diagnosed at earlier stages, who may have a more favorable prognosis. As a result, breast cancer survivors may appear more similar to their cancer-free counterparts, and comparisons between the two groups may underestimate true associations due to survivor bias. Additional steps to recruit survivors with more advanced stages of breast cancer should be undertaken.
Availability of data to researchers
To honor the participants for their outstanding contribution and obtain the greatest value from these data, de-identified HOW Study® data are available to investigators. Steps to obtain data are provided on the HOW Study® website [https://drsusanloveresearch.org/research/the-health-of-women-study/].
Conclusions and future directions
The HOW Study® is the first online breast cancer study to recruit a large sample of individuals with and without breast disease using a series of online cross-sectional study modules. In this report, we described the development and data collection processes of the HOW Study® and characterized the study populations from the first released modules, My Health Overview and My Breast Cancer. Data are available to researchers upon request and can be used to explore breast cancer risk factors and outcomes, including primary diagnosis, recurrence, metastasis, and quality of life. We believe these questions are of great interest to public health researchers, clinicians, and, most importantly, the growing number of breast cancer survivors and their loved ones.
Acknowledgements
A special thanks to Beth Slotman of Westat for her assistance. We thank Christine Taylor, Jane Sullivan-Halley, and Leslie Bernstein for their contribution to the development of the HOW Study® conceptualization.
Funding
The Health of Women Study® was supported by Dr. Susan Love Research Foundation, City of Hope Beckman Research Institute, Avon Foundation, and Keep A Breast Foundation™. The data cleaning and descriptive analysis were supported by the Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Rockville, MD.
Footnotes
Code availability
Coding is available upon request.
Ethics approval and consent to participate
The questionnaire and methodology for this study were approved by the Western Institutional Review Board. Informed consent was obtained from all individual participants included in the study.
Consent for publication
There are no individual data or images to obtain consent for.
Conflict of interest
The authors have no relevant financial or non-financial conflicts of interest to disclose. Data cleaning and editing was conducted by Westat under contract number HHSN261201800002B issued by the National Cancer Institute.
Availability of data and material
All data are available to researchers upon request.
References
- 1. Miller KD, Nogueira L, Mariotto AB, et al. Cancer treatment and survivorship statistics, 2019. CA Cancer J Clin 2019;69(5):363–385.
- 2. American Cancer Society. Breast Cancer Facts & Figures 2019–2020. 2019.
- 3. Weigold A, Weigold IK, Russell EJ. Examination of the equivalence of self-report survey-based paper-and-pencil and internet data collection methods. Psychol Methods 2013;18(1):53–70.
- 4. Tsuboi S, Yoshida H, Ae R, et al. Selection bias of Internet panel surveys: a comparison with a paper-based survey and national governmental statistics in Japan. Asia Pac J Public Health 2015;27(2):NP2390–9.
- 5. Brodie M, Flournoy RE, Altman DE, et al. Health information, the Internet, and the digital divide. Health Aff (Millwood) 2000;19(6):255–65.
- 6. D’Aloisio AA, Nichols HB, Hodgson ME, et al. Validity of self-reported breast cancer characteristics in a nationwide cohort of women with a family history of breast cancer. BMC Cancer 2017;17(1):692.
- 7. Kemp A, Preen DB, Saunders C, et al. Ascertaining invasive breast cancer cases; the validity of administrative and self-reported data sources in Australia. BMC Med Res Methodol 2013;13:17.
- 8. Liu Y, Diamant AL, Thind A, et al. Validity of self-reports of breast cancer treatment in low-income, medically underserved women with breast cancer. Breast Cancer Res Treat 2010;119(3):745–51.