Annals of Family Medicine. 2021 Sep-Oct;19(5):447–449. doi: 10.1370/afm.2713

Voice Assistants and Cancer Screening: A Comparison of Alexa, Siri, Google Assistant, and Cortana

Grace Hong 1, Albino Folcarelli 2, Jacob Less 2, Claire Wang 2, Neslihan Erbasi 2, Steven Lin 1
PMCID: PMC8437561  PMID: 34546951

Abstract

Despite increasing interest in how voice assistants like Siri or Alexa might improve health care delivery and information dissemination, there is limited research assessing the quality of health information provided by these technologies. Voice assistants present both opportunities and risks when facilitating searches for or answering health-related questions, especially now that fewer patients are seeing their physicians for preventive care due to the ongoing pandemic. In our study, we compared the 4 most widely used voice assistants (Amazon Alexa, Apple Siri, Google Assistant, and Microsoft Cortana) and their ability to understand and respond accurately to questions about cancer screening. We show that there are clear differences among the 4 voice assistants and that there is room for improvement across all assistants, particularly in their ability to provide accurate information verbally. To ensure that voice assistants provide accurate information about cancer screening and support, rather than undermine, efforts to improve preventive care delivery and population health, we suggest that technology providers prioritize partnership with health professionals and organizations.

Key words: preventive medicine, early detection of cancer, artificial intelligence

INTRODUCTION

Voice assistants, powered by artificial intelligence, interact with users in natural language and can answer questions, facilitate web searches, and respond to basic commands. The use of this technology has been growing; in 2017, nearly one-half of US adults reported using an assistant, most commonly through their smartphones.1 Many individuals search for health information online; when assistants facilitate searches for and answer health-related questions, they present both opportunities and risks.

Because fewer patients are seeing their physicians for preventive care due to the SARS-CoV-2 pandemic,2 it is important to better understand the health information patients access digitally. This study aims to compare how 4 widely used voice assistants (Amazon Alexa, Apple Siri, Google Assistant, and Microsoft Cortana) respond to questions about cancer screening.

METHODS

The study was conducted in the San Francisco Bay Area in May 2020 using the personal smartphones of 5 investigators. Of the 5 investigators (2 men, 3 women), 4 were native English speakers. Each voice assistant received 2 independent reviews; the primary outcome was the assistant’s response to the query “Should I get screened for [type of] cancer?” for 11 cancer types. From these responses, we assessed the assistants’ ability to (1) understand queries, (2) provide accurate information through web searches, and (3) provide accurate information verbally.

When evaluating accuracy, we compared responses to the US Preventive Services Task Force’s (USPSTF) cancer screening guidelines (Table 1). A response was deemed accurate if it did not directly contradict this information and if it provided a starting age for screening consistent with these guidelines (Supplemental Appendix 1, available at https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2713/-/DC1).

Table 1. Current USPSTF Screening Guidelines for the 11 Cancer Types Queried

| Cancer Type | Screening Guideline |
| --- | --- |
| Bladder | The USPSTF concludes the current evidence is insufficient to assess the balance of benefits and harms of screening for bladder cancer in asymptomatic adults. |
| Breast | The USPSTF recommends biennial screening mammography for women aged 50 to 74 years. |
| Cervical | The USPSTF recommends screening for cervical cancer every 3 years with cervical cytology alone in women aged 21 to 29 years. For women aged 30 to 65 years, the USPSTF recommends screening every 3 years with cervical cytology alone, every 5 years with high-risk human papillomavirus (hrHPV) testing alone, or every 5 years with hrHPV testing in combination with cytology (cotesting). |
| Colorectal | The USPSTF recommends screening for colorectal cancer starting at age 50 years and continuing until age 75 years. |
| Lung | The USPSTF recommends annual screening for lung cancer with low-dose computed tomography (LDCT) in adults aged 55 to 80 years who have a 30 pack-year smoking history and currently smoke or have quit within the past 15 years. Screening should be discontinued once a person has not smoked for 15 years or develops a health problem that substantially limits life expectancy or the ability or willingness to have curative lung surgery. |
| Ovarian | The USPSTF recommends against screening for ovarian cancer in asymptomatic women. |
| Pancreatic | The USPSTF recommends against screening for pancreatic cancer in asymptomatic adults. |
| Prostate | For men aged 55 to 69 years, the decision to undergo periodic prostate-specific antigen (PSA)-based screening for prostate cancer should be an individual one. Before deciding whether to be screened, men should have an opportunity to discuss the potential benefits and harms of screening with their clinician and to incorporate their values and preferences in the decision. |
| Skin | The USPSTF concludes that the current evidence is insufficient to assess the balance of benefits and harms of visual skin examination by a clinician to screen for skin cancer in adults. |
| Testicular | The USPSTF recommends against screening for testicular cancer in adolescent or adult men. |
| Thyroid | The USPSTF recommends against screening for thyroid cancer in asymptomatic adults. |

USPSTF = US Preventive Services Task Force.
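
The two-part accuracy rule stated before Table 1 (no direct contradiction, plus a consistent starting age) might be encoded roughly as follows. This is only a sketch: the starting ages are transcribed from Table 1, but the function name `is_accurate`, the reviewer-supplied flag `contradicts_guideline`, the exact-match reading of "consistent", and the handling of cancer types for which the table gives no starting age are all assumptions, not the investigators' documented procedure.

```python
from typing import Optional

# Starting ages transcribed from Table 1. None marks cancer types for which
# the table gives no starting age (screening not recommended, or evidence
# insufficient).
STARTING_AGE = {
    "bladder": None,
    "breast": 50,
    "cervical": 21,
    "colorectal": 50,
    "lung": 55,
    "ovarian": None,
    "pancreatic": None,
    "prostate": 55,
    "skin": None,
    "testicular": None,
    "thyroid": None,
}


def is_accurate(cancer_type: str,
                contradicts_guideline: bool,
                stated_start_age: Optional[int]) -> bool:
    """Apply the two-part rule to one response.

    `contradicts_guideline` is a human reviewer's judgment (hypothetical
    field); `stated_start_age` is the age the response names, if any.
    """
    if contradicts_guideline:
        return False
    guideline_age = STARTING_AGE[cancer_type]
    if guideline_age is None:
        # Assumption: with no guideline age, a response passes only if it
        # does not name a starting age either.
        return stated_start_age is None
    # Assumption: "consistent" is read as an exact match on the starting age.
    return stated_start_age == guideline_age
```

For example, under these assumptions `is_accurate("breast", False, 50)` passes, while the same response naming age 40 would not.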

If the assistant responded with a web search, verbally, or both, we noted that it was able to understand the query. To evaluate web searches, we visited the top 3 web pages displayed, because research shows these results get 75% of all clicks.3 We then read through each web page and noted whether the information was consistent with USPSTF guidelines. Similarly, for verbal responses, we transcribed each response and noted whether it provided accurate information.
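
To make the three outcomes concrete, the scoring described above could be tallied along the following lines. This is a minimal sketch, not the investigators' actual workflow: the `Review` record, its field names, and the toy entries are hypothetical and are not study data. The choice of denominators (all queries for understanding and verbal accuracy, all assessed links for web accuracy) is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Review:
    """One reviewer's assessment of one assistant's reply to one query."""
    assistant: str
    cancer_type: str
    # Accuracy of each of the top 3 links; None if no web search was returned.
    web_links_accurate: Optional[List[bool]]
    # Accuracy of the spoken reply; None if there was no verbal response.
    verbal_accurate: Optional[bool]

    @property
    def understood(self) -> bool:
        # Per the Methods: a web search, a verbal reply, or both counts as
        # having understood the query.
        return (self.web_links_accurate is not None
                or self.verbal_accurate is not None)


def summarize(reviews: List[Review]) -> Dict[str, float]:
    """Return the three outcome proportions for a list of reviews."""
    if not reviews:
        raise ValueError("no reviews to summarize")
    n = len(reviews)
    links = [ok for r in reviews if r.web_links_accurate
             for ok in r.web_links_accurate]
    return {
        "understood": sum(r.understood for r in reviews) / n,
        # Assumption: web accuracy is taken over all links assessed.
        "web_accuracy": sum(links) / len(links) if links else 0.0,
        # Assumption: a missing verbal reply counts as not accurate.
        "verbal_accuracy": sum(r.verbal_accurate is True for r in reviews) / n,
    }


# Hypothetical toy records for illustration only (not study data).
toy = [
    Review("Assistant A", "breast", [True, True, False], True),
    Review("Assistant A", "colorectal", [True, False, False], None),
    Review("Assistant A", "lung", None, None),
]
summary = summarize(toy)
```

With these toy records, 2 of 3 queries count as understood, 3 of the 6 assessed links count as accurate, and 1 of 3 verbal replies counts as accurate.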

RESULTS

Figure 1 compares the voice assistants’ ability to understand and respond accurately to questions about cancer screening. Siri, Google Assistant, and Cortana understood 100% of the queries, consistently generating a web search and/or a verbal response. On the other hand, Alexa consistently responded, “Hm, I don’t know that” and was unable to understand or respond to any of the queries. Regarding the accuracy of web searches, we found that Siri, Google Assistant, and Cortana performed similarly, and the top 3 links they displayed provided information consistent with USPSTF guidelines roughly 7 in 10 times. The web searches we assessed came from a total of 34 different sources, with 47% of responses referencing the American Cancer Society or the Centers for Disease Control and Prevention. For-profit websites, including WebMD and Healthline, were referenced 14% of the time (Supplemental Appendix 2, available at https://www.AnnFamMed.org/lookup/suppl/doi:10.1370/afm.2713/-/DC1).

Figure 1. Comparison of voice assistants’ ability to understand and respond accurately to questions about cancer screening.

Verbal response accuracy varied more among the assistants. Google Assistant matched USPSTF guidelines 64% of the time, maintaining an accuracy rate similar to that of its web searches. Cortana’s verbal accuracy of 45% was lower than that of its web searches, and Siri was not able to provide a verbal response to any of the queries.

Cohen’s κ was used to measure the level of agreement between the 2 investigators who assessed each assistant’s responses. For Siri, Google Assistant, and Cortana, respectively, the κ values were 0.956 (95% CI, 0.872-1.000), 0.785 (95% CI, 0.558-1.000), and 0.893 (95% CI, 0.749-1.000).
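
For readers unfamiliar with the statistic, Cohen’s κ compares the observed agreement between 2 raters with the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below computes the unweighted point estimate from two lists of labels. It assumes each rating is a categorical label such as accurate (1) or inaccurate (0); the report does not describe the rating units or how the confidence intervals were obtained, so the toy ratings are hypothetical and no interval is computed.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable],
                 rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters labeling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: for each label, the product of the two raters'
    # marginal frequencies, summed over labels.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, and this sketch reports it as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings for 10 items (1 = accurate, 0 = inaccurate).
reviewer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
reviewer_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy case the raters agree on 9 of 10 items (p_o = 0.9) and chance agreement is p_e = 0.5, giving κ = 0.8.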

DISCUSSION

In terms of responding to questions about cancer screening, there are clear differences among the 4 most popular voice assistants, and there is room for improvement across all assistants. Almost uniformly, their verbal responses to queries were either unavailable or less accurate than their web searches. This could have implications for users who are sight-impaired, less tech-savvy, or have low health literacy, because it requires them to navigate various web pages and parse potentially conflicting health information.

Our study has several limitations. We used standardized questions, whereas patients using their personal smartphones may word their questions differently, influencing the responses they receive. Furthermore, because the investigators work in the medical field and have likely used their devices to search for medical evidence before this study, they may have received higher quality search results for health-related questions than the average user.

Our findings are consistent with existing literature assessing the quality of assistants’ answers to health-related questions. Miner et al found that assistants responded inconsistently and incompletely to questions about mental health and interpersonal violence.4 Alagha and Helbing found that Google Assistant and Siri understood queries about vaccine safety more accurately and drew information from expert sources more often than Alexa.5

Sezgin et al acknowledge that assistants have the potential to support health care delivery and information dissemination, both during and after COVID-19, but state that this vision requires partnership between technology providers and public health authorities.6 Our findings support this assessment and suggest that software developers might consider partnering with health professionals—in particular guideline developers and evidence-based medicine practitioners—to ensure that assistants provide accurate information about cancer screening given the potential impact on individuals and population health.

Footnotes

Conflicts of interest: authors report none.

To read or post commentaries in response to this article, go to https://www.AnnFamMed.org/content/19/5/447/tab-e-letters.

Previous presentations: Society of Teachers of Family Medicine’s 53rd Annual Conference; August 2020; Salt Lake City, Utah
