Author manuscript; available in PMC: 2024 Jun 3.
Published in final edited form as: Health Aff (Millwood). 2023 Apr;42(4):508–515. doi: 10.1377/hlthaff.2022.01205

Widespread Third-Party Tracking On Hospital Websites Poses Privacy Risks For Patients And Legal Liability For Hospitals

Ari B Friedman a,b, Raina M Merchant a,b, Amey Maley b, Karim Farhad b, Kristin Smith b, Rachel Gonzales a, Lujo Bauer c, Matthew S McCoy a
PMCID: PMC11145977  NIHMSID: NIHMS1992671  PMID: 37011312

Abstract

Computer code that transfers data to third parties (third-party tracking) is common across the web and is subject to few federal privacy regulations. We determined the presence of potentially privacy-compromising data transfers to third parties on a census of U.S. non-federal acute care hospital websites, and used descriptive statistics and regression analyses to determine the hospital characteristics associated with a greater number of third-party data transfers. We found that third-party tracking is present on 98.6% of hospital websites, including transfers to large technology companies, social media companies, advertising firms, and data brokers. Hospitals in health systems, hospitals with a medical school affiliation, and hospitals serving more urban patient populations all exposed visitors to higher levels of tracking in adjusted analyses. By including third-party tracking code on their websites, hospitals are facilitating the profiling of their patients by third parties. In addition to dignitary harms from privacy loss, these practices may lead to health-related advertising targeting patients, as well as legal liability.

Introduction

In 2021, Mass General Brigham and the Dana-Farber Cancer Institute agreed to an $18 million settlement with a group of plaintiffs who claimed that the hospital networks had violated their privacy.(1) Notably, the case did not involve medical records, personal health information, security breaches, or unauthorized use of patients’ financial information. The plaintiffs alleged that the hospital networks had not obtained sufficient consent when using third-party tracking tools—including cookies and tracking pixels—on their publicly accessible websites.

The plaintiffs’ charges reflect growing concern about the privacy risks raised by third-party tracking, particularly on websites where visitors’ browsing behavior may reveal sensitive information about their or their family members’ health conditions to advertisers, data brokers, and other companies that seek to monetize it.(2–4) Third-party tracking code is typically installed by website maintainers to add functionality such as advertisement campaign monitoring or social media linkage.(5) However, health systems may not fully appreciate the privacy implications of the code,(6) which allows third parties not subject to the Health Insurance Portability and Accountability Act (HIPAA) to observe individuals’ browsing behavior across hospital websites.(7–9)

While prior research has shown that third-party tracking is prevalent across a range of health-related websites,(10–12) little is known about the prevalence, quantity, and characteristics of third-party tracking on hospital websites, despite the fact that for many patients, these websites are an essential point of contact with the health system. Joshua Niforatos and colleagues recently assessed third-party tracking on the websites of 61 hospitals and found that 90% included at least one third-party cookie.(13) However, their study was limited to the largest and highest-ranked hospitals and did not assess differences across hospital characteristics or the types of third parties to which data were transferred. A recent investigation conducted by STAT and The Markup found that websites of 33 of Newsweek’s top 100 hospitals transferred data to Facebook, but the investigation did not include hospitals outside this group, nor did it detail other third-party data recipients.(14)

In this analysis, we aimed to assess the prevalence and quantity of third-party tracking across the website homepages of all US acute care hospitals. Our secondary aims were to identify hospital characteristics associated with higher levels of tracking, and to assess whether third-party tracking varied between hospital website homepages and patient-facing webpages about potentially sensitive health conditions.

Methods

Design

We conducted a cross-sectional, prospective, observational study evaluating the third-party tracking on U.S. hospital websites. Third-party tracking was assessed on a rolling basis over a 3-day period (August 5–8, 2021).

Study Population

We studied all U.S. hospitals (n=6,162) included in the 2018 American Hospital Association (AHA) Annual Survey. The AHA Annual Survey is the canonical source for information on U.S. hospitals and has a response rate above 90%. Our primary analysis consisted of non-federal acute care hospitals in the U.S. and U.S. territories, stratified according to the population they serve. Consistent with prior studies, we defined non-federal acute care hospitals as hospitals that had an emergency department; that were not a free-standing long-term care facility or an ambulatory surgical center; and that were not under military, Indian Health Service, or other federal control.(15,16)

Hospital URLs

To obtain hospital website homepage URLs, we employed a distributed search strategy using Amazon Mechanical Turk (AMT) with manual verification by two study authors (KF and AM). For each hospital, 3 AMT workers were provided the name of the hospital and its physical address, as listed in the AHA database, and asked to perform a Google search for the URL of the hospital’s homepage. If all 3 workers provided the same URL or agreed that the hospital had no website, the result was immediately accepted (N=2,534). For the remaining cases (N=3,628), a study author (AM or KF) manually reviewed and selected the correct URL or confirmed that the hospital had no website.
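The acceptance rule described above can be sketched in a few lines. This is a minimal illustration only: the `normalize` step, the `NO_WEBSITE` sentinel, and the function names are assumptions introduced here, and the text does not say whether any URL normalization was applied before answers were compared.

```python
from typing import Optional

# Hypothetical sentinel a worker submits when the hospital has no website.
NO_WEBSITE = "no_website"

def normalize(url: str) -> str:
    """Light normalization so trivially different answers compare equal (an assumption)."""
    u = url.strip().lower()
    for prefix in ("https://", "http://"):
        if u.startswith(prefix):
            u = u[len(prefix):]
    if u.startswith("www."):
        u = u[4:]
    return u.rstrip("/")

def resolve(answers: list[str]) -> Optional[str]:
    """Return the agreed answer if every worker agrees; None means manual review."""
    distinct = {normalize(a) for a in answers}
    if len(distinct) == 1:
        return distinct.pop()
    return None

# Unanimous answers are accepted; any disagreement is routed to an author.
print(resolve(["https://www.example-hospital.org/", "example-hospital.org", "http://example-hospital.org"]))
print(resolve(["a.org", "b.org", "a.org"]))
```

Requiring unanimity rather than a 2-of-3 majority is the conservative choice reported in the text; it sends more cases to manual review but avoids accepting a URL that one worker disputed.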

Some hospitals shared a website since they are a part of a larger health system. In these cases, the health system homepages were accepted as valid hospital URLs.

Hospital Characteristics

We obtained hospital characteristics from the AHA database and the American Community Survey (ACS). The AHA database provided information on hospital name and address, health system membership, ownership type (non-profit vs for-profit), number of beds, presence of an emergency department, and medical school affiliation reported to the AMA.

We used the 2019 five-year ACS to compile data on race/ethnicity and population size for ZIP Code Tabulation Areas. Rural-Urban Commuting Area Codes (RUCA) from the 2010 Census were used to assign an urbanicity score to each of a hospital service area’s (HSA) constituent ZIP codes, as defined in the Dartmouth Atlas of Healthcare. HSAs define a geography where residents receive most of their hospitalizations from the hospitals in that area and therefore proxy for where a resident in a particular ZIP code would most likely seek treatment. Metropolitan and micropolitan areas were categorized as urban with all other areas being considered rural. If an HSA consisted of both rural and urban ZIP codes, it was classified as urban. ACS ZIP data were then aggregated into HSA by taking the average data of all ZIP codes in a given HSA.
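As a rough sketch of the ZIP-to-HSA aggregation described above, the following assumes an unweighted mean across ZIP codes and uses made-up records; the field layout, the HSA identifiers, and the choice of poverty percentage as the example variable are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ZIP-level records: (zip_code, hsa_id, is_urban, pct_poverty)
zip_rows = [
    ("00001", "HSA-A", True, 10.0),
    ("00002", "HSA-A", False, 20.0),
    ("00003", "HSA-B", False, 30.0),
]

def aggregate_by_hsa(rows):
    """Collapse ZIP-level rows to one record per HSA."""
    grouped = defaultdict(list)
    for _zip, hsa, is_urban, pct_poverty in rows:
        grouped[hsa].append((is_urban, pct_poverty))
    result = {}
    for hsa, members in grouped.items():
        result[hsa] = {
            # An HSA with any urban ZIP code is classified as urban.
            "urban": any(u for u, _ in members),
            # Unweighted average across the HSA's ZIP codes (an assumption).
            "pct_poverty": mean(p for _, p in members),
        }
    return result

print(aggregate_by_hsa(zip_rows))
```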

We defined a rural hospital as having a rural population percentage value in the top decile of all hospitals. We defined poverty-serving hospitals as having a percent of patient population living in poverty in the top decile. We defined historically disadvantaged minority-serving hospitals as hospitals with a Black or Hispanic patient population percentage in the top decile, excluding Native American and other populations from this calculation due to a lack of available data.
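The top-decile definitions above amount to flagging hospitals whose value exceeds the 90th percentile of the distribution. The sketch below shows one way to do that; the percentile interpolation method and the strict inequality at the cut point are assumptions, since the text does not specify how ties or boundary values were handled.

```python
from statistics import quantiles

def top_decile_flags(values):
    """Flag values strictly above the 90th-percentile cut point."""
    # quantiles(..., n=10) returns nine cut points; the last is the 90th percentile.
    cut = quantiles(values, n=10, method="inclusive")[-1]
    return [v > cut for v in values]

# Hypothetical percentages for 20 hospitals.
pct_rural = list(range(1, 21))
flags = top_decile_flags(pct_rural)
print(sum(flags), "of", len(flags), "hospitals flagged")
```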

Third-party Tracking

To assess the amount and type of third-party tracking on each hospital’s homepage, we visited each webpage using webXray, an open-source, automated tool designed to record third-party tracking, which has previously been used in academic studies.(10,11,17) For each webpage, we recorded data requests that initiate data transfers to third party domains. Transfers typically occur when the webpage loads and include a user’s IP address and the URL of the webpage being visited. We also recorded the presence of cookies, small pieces of data stored on a user’s browser that serve as persistent identifiers, allowing users to be tracked across multiple websites. We used the webXray database to link individual tracking domains to their parent companies (e.g. doubleclick.net was determined to be owned by Google which is owned by Alphabet).
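To make the notion of a third-party transfer concrete, the sketch below classifies a page's outgoing requests by comparing each request's domain with the page's own domain and then looks up the owner. It is not webXray's implementation: the `OWNERS` table is a hypothetical stand-in for the webXray ownership database, the URLs are invented, and taking the last two hostname labels is a simplification of proper registrable-domain matching.

```python
from urllib.parse import urlsplit

# Hypothetical domain-to-owner lookup; the doubleclick.net entry mirrors the example in the text.
OWNERS = {"doubleclick.net": "Alphabet"}

def site_of(url: str) -> str:
    """Naive registrable domain: the last two labels of the hostname (a simplification)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])

def third_party_requests(page_url, request_urls):
    """Return (domain, owner) for every request that leaves the page's own site."""
    page_site = site_of(page_url)
    found = []
    for request_url in request_urls:
        site = site_of(request_url)
        if site and site != page_site:
            found.append((site, OWNERS.get(site, "unknown")))
    return found

requests_seen = [
    "https://cdn.example-hospital.org/app.js",   # same site: first party
    "https://ad.doubleclick.net/pixel",          # third party with a known owner
    "https://tracker.example.net/t.gif",         # third party, owner not in the lookup
]
print(third_party_requests("https://www.example-hospital.org/", requests_seen))
```

Domains missing from the lookup fall into an "unknown" bucket, which corresponds to the transfers whose parent company could not be identified.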

To assess whether tracking differed between hospital homepages and condition-specific webpages within a hospital website, we selected 100 hospitals via simple random sampling and conducted a structured search of their websites. An author (JF) used each hospital website’s own search engine to locate webpages covering 6 conditions that may reveal sensitive information about users by searching for the following terms: “Alzheimer’s,” “breast cancer,” “congestive heart failure,” “Crohn’s disease,” “depression,” and “HIV.” We recorded the URL for the first patient-facing webpage returned in the search results. Using webXray, we visited the condition-specific URLs and the same hospitals’ homepages and recorded all third-party data requests.

Statistical Analysis

We calculated the percentage of hospital homepages with any third-party transfer and any third-party cookie, overall and by hospital type. Our primary outcome measure was the number of third-party transfers on hospital homepages. The number of third-party transfers has important implications for user privacy because it directly captures the scale of intrinsic privacy harm and correlates with the probability of data resale or targeted advertisement. We calculated the median number and interquartile range (IQR) of third-party transfers per hospital homepage and used the non-parametric equality-of-medians test to examine whether the number of third-party transfers differed by hospital characteristics. We used medians and correlation coefficients to compare tracking between condition-specific pages and hospital homepages. In adjusted analyses, we utilized linear regression with clustering by health system, with the number of third-party transfers as the dependent variable and the following independent variables: hospital size, region, ownership type, location (rural vs urban), minority-serving, poverty-serving, system membership and medical school affiliation. Sensitivity analyses explored additional definitions of medical school affiliation. The Variance Inflation Factor identified no covariates with multicollinearity.
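The descriptive summary used for the primary outcome (median and IQR of transfers per homepage) can be sketched as follows. The counts are invented, and the quartile interpolation method is an assumption; statistical packages such as Stata may use a different percentile convention and return slightly different quartiles on real data.

```python
from statistics import median, quantiles

def median_iqr(counts):
    """Return (median, (Q1, Q3)) for a list of per-homepage transfer counts."""
    q1, _q2, q3 = quantiles(counts, n=4, method="inclusive")
    return median(counts), (q1, q3)

# Hypothetical transfer counts for five homepages.
print(median_iqr([2, 4, 6, 8, 10]))
```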

Statistical analysis was conducted using Stata IC version 16.1 (StataCorp LP, College Station, TX). All hypothesis tests were two-tailed using an alpha level of 0.05. As this study utilized publicly available data, it was considered exempt from IRB review.

Limitations

This study had limitations. First, we investigated only two modes of tracking: data transfers to third-party domains and third-party cookies. Because other modes of tracking exist, such as browser fingerprinting, we likely underestimated the extent of third-party tracking on hospital homepages. Second, we were unable to assess tracking on password-protected sections of hospital websites, including patient portals. Third, we cannot differentiate between uses of the data once transferred. However, while some third parties use transferred data only to provide a service, and not for other purposes such as targeted advertising or resale, the majority are known to use it for such purposes, including on hospital pages.(14) Fourth, to assess whether tracking differed between hospital homepages and condition-specific pages, we analyzed a subset of hospital websites with patient-facing webpages for 6 specific conditions. Hospitals with such webpages may differ from hospitals without these condition-specific webpages. Finally, we did not assess longitudinal trends in tracking due to data limitations.

Results

We identified 3,747 non-federal acute care hospitals with accessible websites, as shown in Appendix Exhibit S1.(18) Overall, 98.6% of hospital website homepages had at least one third-party transfer, while 94.3% had at least one third-party cookie (Exhibit 1).

Exhibit 1:

Descriptive characteristics of non-federal acute care hospitals

| Characteristic | Hospitals (N = 3,747), N | Hospitals, % | Hospital websites with third-party request, % | Hospital websites with third-party cookie, % |
|---|---|---|---|---|
| Overall | | | 98.6 | 94.3 |
| Size a | | | | |
| Small (< 100 beds) | 1,814 | 48.4 | 98.7 | 94.2 |
| Medium (100–499 beds) | 694 | 18.5 | 99.3 | 98.9 |
| Large (> 500 beds) | 1,239 | 33.1 | 98.1 | 91.9 |
| Region | | | | |
| Northeast | 452 | 12.1 | 99.6 | 95.8 |
| Midwest | 816 | 21.8 | 98.7 | 93.8 |
| South | 1,657 | 44.2 | 98.4 | 94.2 |
| West | 774 | 20.7 | 98.6 | 95.1 |
| Puerto Rico | 48 | 1.3 | 95.8 | 81.3 |
| Ownership | | | | |
| For profit | 754 | 20.1 | 98.5 | 93.0 |
| Not for profit | 2,275 | 60.7 | 99.0 | 96.7 |
| Public | 714 | 19.1 | 97.5 | 88.2 |
| Unknown | 4 | 0.1 | 100.0 | 50.0 |
| System Membership b | | | | |
| Part of a system | 2,434 | 65.0 | 99.5 | 97.4 |
| Not part of a system | 1,313 | 35.0 | 97.0 | 88.6 |
| Medical School Affiliation | | | | |
| Yes | 1,199 | 32.0 | 99.4 | 97.5 |
| No | 2,548 | 68.0 | 98.2 | 92.8 |
| Location c | | | | |
| Rural | 646 | 17.2 | 97.8 | 90.1 |
| Urban | 3,101 | 82.8 | 98.8 | 95.2 |
| Poverty-Serving d | | | | |
| Yes | 398 | 10.6 | 97.0 | 91.7 |
| No | 3,349 | 89.4 | 98.8 | 94.6 |
| Minority-Serving e | | | | |
| Yes | 695 | 18.6 | 97.7 | 92.1 |
| No | 3,052 | 81.5 | 98.8 | 94.8 |

Source: Authors’ analysis of hospital website homepages, with tracking assessed via the webXray tool, August 2021, and hospital characteristics from the American Hospital Association Annual Survey, 2019

Notes

a Total number of general medical and surgical beds

b Defined as hospitals with a listed system name in the AHA database

c Rural is defined as having a rural population percentage value in the top decile of all hospitals

d Defined as hospitals with a percent of patient population living in poverty in the top decile

e Defined as hospitals with a Black or Hispanic patient population percentage in the top decile

Alphabet was the most common tracking entity among all hospitals in the sample, with 98.5% of all homepages reporting third party transfers to this entity. Other common third-party entities included Meta (55.6%), Adobe Systems (31.4%), and AT&T (24.6%). The 25 most prevalent third-party entities are reported in Exhibit 2. Data transfers to third-party domains whose parent company could not be identified were present on 69.0% of homepages.

Exhibit 2:

Number of websites transferring data to domains associated with a given tracking entity owned by each parent company

| Parent Company | N | % |
|---|---|---|
| Alphabet a | 3,691 | 98.5 |
| Meta b | 2,083 | 55.6 |
| Adobe Systems | 1,177 | 31.4 |
| AT&T | 922 | 24.6 |
| The Trade Desk | 813 | 21.7 |
| Oracle | 802 | 21.4 |
| Verizon | 791 | 21.1 |
| Rubicon Project | 712 | 19.0 |
| Amazon | 689 | 18.4 |
| Microsoft | 671 | 17.9 |
| Hotjar | 629 | 16.8 |
| StackPath | 596 | 15.9 |
| Siteimprove | 592 | 15.8 |
| Cloudflare | 592 | 15.8 |
| Acxiom | 551 | 14.7 |
| Salesforce | 543 | 14.5 |
| Telenor | 532 | 14.2 |
| Nielsen Online | 476 | 12.7 |
| Lotame | 446 | 11.9 |
| Fonticons | 446 | 11.9 |
| JS Foundation | 420 | 11.2 |
| Crazy Egg | 408 | 10.9 |
| Golden Gate Capital | 408 | 10.9 |
| Drawbridge | 386 | 10.3 |

Source: Authors’ analysis of hospital website homepages, with tracking assessed via the webXray tool, August 2021

Notes There were 2,585 pages (69.0%) that transferred third-party data to at least one domain whose parent entity could not be identified in the webXray database.

a Alphabet is the parent company of Google.

b Meta is the parent company of Facebook.

Overall, hospital website homepages had a median of 16 third-party transfers. The median number of third-party transfers per homepage differed across hospital characteristics in unadjusted analyses (Exhibit 3 and Appendix Exhibit S2).(18) Medium-sized hospitals had a significantly higher median number of third-party transfers (24) compared to both small (17) and large (13) hospitals (p < 0.01). Non-profit hospitals had a greater number of third-party transfers than both public and for-profit hospitals (median 22 vs 11 vs 13, p < 0.01). Urban hospitals had a greater number of third-party transfers than rural hospitals (median 17 vs 11, p < 0.01). Non-poverty-serving hospitals had a greater number of third-party transfers than poverty-serving hospitals (median 17 vs 13, p < 0.01). Finally, hospitals in a health system had a greater number of third-party transfers than independent hospitals (median 21 vs 10, p < 0.01), while hospitals with a medical school affiliation had a greater number of third-party transfers than those without an affiliation (median 20 vs 15, p < 0.01). Compared to hospitals with any third-party data transfers, the small number (52, 1.4%) of hospitals on whose websites we did not observe third-party transfers were substantially (at least 10 percentage points) less likely to be part of a system, to have an academic affiliation, and to be non-profit, and more likely to be poverty-serving, minority-serving, and public (see Appendix Exhibit S3).(18)

Exhibit 3:

Number of third-party data transfers per website (2021), by 2019 hospital characteristics

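| Characteristic | Hospitals, N | Hospitals, % | Third-party requests, median | Third-party requests, IQR | P-value |
|---|---|---|---|---|---|
| Overall | 3,747 | 100.0 | 16 | 10, 29 | |
| Size | | | | | < 0.01 |
| Small (< 100 beds) | 1,814 | 48.4 | 17 | 10, 30 | |
| Medium (100–499 beds) | 694 | 18.5 | 24 | 15, 36 | |
| Large (> 500 beds) | 1,239 | 33.1 | 13 | 7, 22 | |
| Region | | | | | < 0.01 |
| Northeast | 452 | 12.1 | 19 | 12, 32 | |
| Midwest | 816 | 21.8 | 15 | 8, 28 | |
| South | 1,657 | 44.2 | 16 | 10, 30 | |
| West | 774 | 20.7 | 16 | 9, 31 | |
| Puerto Rico | 48 | 1.3 | 5 | 4, 10 | |
| Ownership | | | | | < 0.01 |
| For profit | 754 | 20.1 | 13 | 10, 17 | |
| Not for profit | 2,275 | 60.7 | 22 | 12, 36 | |
| Public | 714 | 19.1 | 11 | 6, 19 | |
| Unknown | 4 | 0.1 | 3.5 | 2, 6.5 | |
| System Membership | | | | | < 0.01 |
| Yes | 2,434 | 65.0 | 21 | 13, 35 | |
| No | 1,313 | 35.0 | 10 | 5, 17 | |
| Medical School Affiliation | | | | | < 0.01 |
| Yes | 1,199 | 32.0 | 20 | 12, 34 | |
| No | 2,548 | 68.0 | 15 | 8, 27 | |
| Location | | | | | < 0.01 |
| Rural | 646 | 17.2 | 11 | 6, 21 | |
| Urban | 3,101 | 82.8 | 17 | 11, 31 | |
| Poverty-Serving | | | | | < 0.01 |
| Yes | 398 | 10.6 | 13 | 8, 25 | |
| No | 3,349 | 89.4 | 17 | 10, 30 | |
| Minority-Serving | | | | | 0.069 |
| Yes | 695 | 18.6 | 16 | 9, 28 | |
| No | 3,052 | 81.5 | 16 | 10, 30 | |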

Source: Authors’ analysis of hospital website homepages, with tracking assessed via the webXray tool, August 2021, and hospital characteristics from the American Hospital Association Annual Survey, 2019

In multivariate regression analysis, several factors were associated with a significantly greater number of third-party transfers on hospital website homepages (Exhibit 4). Results from sensitivity analyses are detailed in Appendix Exhibit S4.(18) Membership in a health system was associated with an increase of 10.0 third-party transfers compared to non-system membership (p < 0.001). Having a primarily urban patient population was associated with an average of 3.6 more third-party transfers (p < 0.001). Finally, having a medical school affiliation was associated with 1.8 more third-party transfers after adjustment (p < 0.05).

Exhibit 4:

Adjusted association of hospital characteristics with the number of third party transfers on hospital websites

| Variables | Difference | Standard Error |
|---|---|---|
| Size (ref = Small) | | |
| Medium (100–499 beds) | 1.1 | 1.1 |
| Large (> 500 beds) | −1.0 | 1.2 |
| Ownership (ref = For profit) | | |
| Non-profit | 11.2*** | 2.0 |
| Public | 4.8** | 1.5 |
| Unknown | −7.6* | 2.9 |
| Member of Health System | 10.0*** | 1.5 |
| Medical School Affiliation | 1.8* | 0.8 |
| Rural Location | −3.6*** | 1.0 |
| Poverty serving | −2.5* | 1.1 |
| Minority serving | 0.5 | 1.2 |
| Constant | 8.5*** | 1.6 |

Source: Authors’ analysis of hospital website homepages, with tracking assessed via the webXray tool, August 2021, and hospital characteristics from the American Hospital Association Annual Survey, 2019

Notes Linear regression model with clustering by health system, with the number of third-party transfers as the dependent variable.

* p<0.05

** p<0.01

*** p<0.001
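As a worked reading of Exhibit 4, the coefficients of a linear model can be summed to give the expected number of third-party transfers for a hospital with a given profile. The sketch below does only that arithmetic, under stated assumptions: the coefficients are transcribed from the exhibit, region terms are not shown there and are therefore left out, and the dictionary keys and function name are hypothetical.

```python
# Coefficients transcribed from Exhibit 4; region terms are not shown there and are omitted.
COEF = {
    "constant": 8.5,
    "medium": 1.1,
    "large": -1.0,
    "non_profit": 11.2,
    "public": 4.8,
    "system_member": 10.0,
    "medical_school": 1.8,
    "rural": -3.6,
    "poverty_serving": -2.5,
    "minority_serving": 0.5,
}

def predicted_transfers(**traits):
    """Sum the constant and the coefficient of every trait set to True."""
    total = COEF["constant"]
    for name, present in traits.items():
        if present:
            total += COEF[name]
    return round(total, 1)

# A small, urban, non-profit system hospital with a medical school affiliation:
# 8.5 + 11.2 + 10.0 + 1.8
print(predicted_transfers(non_profit=True, system_member=True, medical_school=True))
```

The reference profile (all traits False) is a small, for-profit, independent, urban hospital without a medical school affiliation, for which the expected count is the constant alone.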

Our manual search of 100 randomly sampled hospital websites for patient-facing pages related to 6 potentially sensitive conditions yielded 30 websites that had pages for all 6 conditions. Across these 30 websites, 100% of condition-specific pages had at least one third-party data transfer. The number of third-party transfers was similar between condition-specific pages and the hospitals’ homepages, with a median of 18–22 third-party transfers per condition-specific page, compared to a median of 22 per homepage. The amount of tracking on condition-specific pages was highly correlated with tracking on the homepage of the same hospital, with condition-specific correlation coefficients ranging from 0.87 to 0.95 (see Appendix Exhibit S5).(18)
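The homepage-versus-condition-page comparison rests on a correlation between paired transfer counts. The sketch below computes a Pearson coefficient from scratch; the text does not state which correlation coefficient was used, so Pearson is an assumption, and the paired counts are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of counts."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical transfer counts: homepage vs. one condition-specific page per hospital.
homepage = [10, 22, 35, 18, 27]
condition_page = [9, 20, 33, 19, 25]
print(round(pearson(homepage, condition_page), 2))
```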

Discussion

Our results demonstrate that across the websites of 3,747 non-federal acute care hospitals, third-party tracking is ubiquitous and extensive, with hospital website homepages initiating a median of 16 third-party data transfers. Hospital websites transfer data to numerous third parties, including some of the largest technology and social media companies, advertising firms, and data brokers. Additionally, our analysis of a random sample of hospital websites revealed no substantial difference between the amount of third-party tracking on hospital homepages and condition-specific webpages.

Thus, despite being subject to HIPAA’s stringent privacy measures for protected health information (PHI), nearly all hospitals allow third parties to capture data about how patients and other users navigate their websites. A recent investigative report revealed that in some instances, data transfers from hospital websites to third parties may include PHI regarding patients’ prescriptions and doctor’s appointments and, hence, constitute HIPAA violations.(14) Our analysis suggests that, if this phenomenon occurs across even a small proportion of third-party data transfers on hospital websites, many patients may be exposed to such violations.

Additionally, a December 2022 bulletin issued by the Office for Civil Rights at the U.S. Department of Health and Human Services (HHS) clarified that HIPAA rules apply even to regulated entities’ unauthenticated webpages, including webpages “with general information about the regulated entity like their location [or] services they provide.”(19) The bulletin notes, for example, that including tracking code that collects an individual’s IP address on an “unauthenticated webpage that addresses specific symptoms or health conditions” would constitute the disclosure of PHI to the tracking technology vendor. This guidance implies that HIPAA rules would apply to a potentially vast number of third-party data transfers on hospital websites.

We found that hospitals in health systems, hospitals with a medical school affiliation, and hospitals serving more urban patient populations all exposed visitors to more third-party data transfers. While further research is needed to examine the causes of this discrepancy, it may be influenced by multiple factors. These hospitals may strive to include more features on their websites, and the additional tracking may be a product of including third-party functionality, such as embedding a Google Maps product onto a site. Alternatively, these hospitals may engage in higher levels of online advertising to drive revenues, and the third-party tracking may be a consequence of the perceived need to monitor these advertising campaigns by installing tracking tools.

The high number of entities engaged in tracking on hospital websites heightens potential privacy risks to patients. Many of the third parties to whom data is transferred have business models built on identifying and tracking individuals for the purposes of targeting online advertisements. Alphabet does not sell data to third parties, but rather allows targeted advertising through profiles, including the targeted promotion of prescription drugs. Less prevalent tracking entities are more varied in their policies and purposes, including tracking companies which sell their data on to third parties (e.g. Acxiom)(20) or allow health-related profiling (e.g. Adobe and Oracle)(21,22). These practices have led to lists of patients, including their phone number and home address, with particular disease types being available for purchase.(23) Third-party tracking code on hospital webpages may facilitate these types of health-related tracking.

Because little is known about the precise ways in which third parties use tracking data, the implications of extensive third-party tracking on hospital websites remain unknown but are potentially far-reaching. Patients who visit hospital websites may see greater levels of online targeted advertisement for pharmaceuticals, medical supplements, and insurance products that potentially conflict with best practices or the advice of their physician, drive low-value healthcare spending, or substitute for more effective cures. Health-related information inferred from browsing behavior may also be incorporated into risk scores, which can be used in decisions about eligibility for credit and insurance products.(24) While public health campaigns may also use targeted advertising to reach specific populations, public advertising budgets are smaller than private spending, limiting their relative impact.

Recommendations

Hospitals have a responsibility to protect patients from unnecessary risks, including risks to their privacy. Furthermore, as suggested by the recent settlement with Mass General Brigham and the Dana-Farber Cancer Institute, similar ongoing lawsuits against other hospital systems,(4) and HHS’s clarification that HIPAA rules apply to some data transfers from regulated entities’ unauthenticated webpages, hospitals may also face financial risks for exposing website visitors to unwanted tracking.

Policymakers should address tracking on health-related pages specifically in proposed privacy legislation such as the American Data Privacy and Protection Act, ideally by prohibiting the practice. Hospitals should audit their websites to limit or eliminate third-party tracking. Hospitals that choose to allow third-party tracking should disclose this to website visitors and give patients simple methods for opting out of tracking completely. Any third-party tools installed should also have their privacy policies reviewed by the hospital’s legal department in conjunction with a patient representative to ensure that the policies meet the hospital’s legal and ethical obligations to protect patient privacy.

Conclusion

This study documents that nearly every US acute care hospital transfers data to third parties when patients or other members of the public visit their website. This practice poses privacy risks for patients and may result in legal liability for hospitals. Hospitals should regularly audit their own websites, limit the amount of third-party tracking, disclose such tracking in a transparent format, and allow patients to easily and permanently opt out of such tracking.

Supplementary Material

Appendix

Endnotes
