AMIA Annual Symposium Proceedings. 2022 Feb 21;2021:697–706.

Incidence and Impact of Missing Functional Elements on Information Comprehension using Audio and Text

Gondy Leroy 1, David Kauchak 2, Nicholas Kloehn 1
PMCID: PMC8861712  PMID: 35309000

Abstract

Audio is increasingly used to communicate health information. Initial evaluations have shown it to be an effective medium with many features that can be optimized. This study focuses on missing functional elements: words that relate concepts in a sentence but are often excluded for brevity. They are not easily recognizable without linguistics expertise but can be detected algorithmically. Two studies showed that they are common and affect comprehension. A corpus statistics study with medical text (Cochrane sentences, N=44,488) and general text (English and Simple English Wikipedia sentences, N=318,056 each) showed that functional elements were missing in 20-30% of sentences. A user study with Cochrane (N=50) and Wikipedia (N=50) paragraphs in text and audio format showed that more missing functional elements increased the perceived difficulty of written text, an effect less pronounced for audio. Missing elements also increased the actual difficulty of both written and audio information: less information was recalled when more elements were missing.

Introduction

Health literacy is vital for achieving and maintaining good health. In the previous decade in the US, several national programs have emphasized this goal and its importance. For example, the Affordable Care Act(1) emphasized patient-centeredness, the National Action Plan to Improve Health Literacy(2) specified national goals, and the Plain Writing Act(3) demanded clarity in government communications(4). However, these existing guidelines focus on text and visual presentation and have not been validated for new information distribution methods; to our knowledge, they have not been updated for audio.

As technology evolves, new modes of communication are being created to provide access to health information. Audio is becoming increasingly popular with mobile devices using virtual assistants and smart speakers. Smart speakers are now ubiquitous in US households and are increasingly used for health-related applications. By 2020 there were about 87.7M smart speakers in use in the US(5, 6), with annual growth of 30-40%. Hospitals are testing the use of smart speakers(7, 8), and in 2019 about 16% of all questions asked of a smart speaker were health and wellness questions(6). There are few studies and little existing expertise on how to create audio or combine audio with text. For example, the Plain Language Summit in 2019(9) described plain language as “language that people can understand the first time they read or hear it,” but audio was not covered during the summit.

With the increasing popularity of audio as an information source, research is needed to design the best strategies for using this resource as a medium for disseminating health information. In our early work, we focused on discovering lexical, semantic, syntactic, and composition features in text with demonstrated impact on comprehension. We are expanding this work to audio to discover how best to create content for audio consumption. We have already established that health information consumers can learn from audio information as well as from text information(10). Critically, unlike previous work that only focused on text, we now examine both text and audio to see how text features affect comprehension in the different presentation formats.

Sentences in English rely upon syntactic structure to convey meaning. That structure dictates the order of words, but it also relies on function words that help join the different phrases and clauses in a sentence. These function words signal the relationships between content words but often carry little content themselves. For example, the word ‘that’ in a phrase such as ‘parasites that impact’ relates the two content words. Function words are often left out of spoken language or complex written language for various reasons. When these words are left out, i.e., missing functional elements, the person reading the sentence has less information available to decode its meaning than if all of these words were included.

We hypothesize that when functional elements are missing from a sentence, the sentence will be more difficult for readers than if these elements were included, because explicit connections between content-bearing terms are omitted. We developed an algorithm to detect missing functional elements in text. Using this algorithm, we examined the frequency of missing functional elements in two corpora (medical and general) and evaluated the effect of missing elements on comprehension with both text and audio. Missing elements are a very frequent phenomenon, occurring in almost a third of the sentences in our corpora. We found a subtle effect of missing functional elements on comprehension, with differences between the text and audio presentations. Even though the effects are subtle, because missing elements occur so frequently, the effect is important.

The current work makes two contributions. The first is a demonstration that some text features, not easily recognized by non-experts, are frequently found in text, hence the usefulness of algorithmically identifying them. The second contribution is in showing how such features affect comprehension and how this differs depending on the mode of interaction with the information (text vs audio). For brevity, we will use the phrase ‘missing elements’.

Background

Content Simplification. Most literacy research has focused on text as the mode of distribution and examined how to make that text as accessible as possible. Given some existing text, the goal of text simplification is to transform that text into a variant that is more understandable. In the healthcare community, the majority of work has used readability formulas to aid content creators (e.g., medical writers) in simplifying text. A variety of different metrics have been proposed(11-14), e.g., Flesch-Kincaid and Reading Ease(15) and the Simple Measure of Gobbledygook (SMOG)(16), and some more comprehensive tools, e.g., Coh-Metrix(17), have been created. The scores assigned by these metrics and tools indicate the difficulty level of a text, and there are agreed-upon limits for text intended for use by non-experts. For example, the advice is to write at a 6th to 8th grade level(18). This approach suffers from three main problems. First, a single score is assigned to an entire text; unless a writer is very familiar with the inner workings of the formula, adjusting the text to improve the score is difficult. Second, these formulas rely on simple text statistics that tend to correlate with text readability, e.g., word length, but often do not accurately measure, on a per-instance basis, the actual text difficulty. Well-written, understandable text tends to have better readability scores; however, simply manipulating the components of the readability metrics does not create text that is more readable or easier to understand. The following examples were evaluated using the built-in Flesch-Kincaid formulas in Microsoft Word (a worked check of the formula follows the examples):

  • Eating fruit is a healthy habit and enjoyable too. (7.5 Grade Level)

  • Eating fruit is. A healthy habit and enjoyable too. (5.8 Grade Level)

  • Eating fruit is. A healthy habit. And enjoyable too. (5.2 Grade Level)
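The falling scores follow mechanically from the Flesch-Kincaid Grade Level formula, 0.39 x (words/sentences) + 11.8 x (syllables/words) - 15.59: splitting a sentence lowers the words-per-sentence term no matter how incoherent the result. A minimal worked check in Java; the counts (9 words, 15 syllables in every variant) are entered by hand and assume four syllables for ‘enjoyable’, so Word’s internal syllable counter may differ slightly:

    public class FleschKincaid {
        // Flesch-Kincaid Grade Level formula.
        static double grade(int words, int sentences, int syllables) {
            return 0.39 * ((double) words / sentences)
                 + 11.8 * ((double) syllables / words)
                 - 15.59;
        }

        public static void main(String[] args) {
            // Same 9 words and 15 syllables in all three variants; only
            // the sentence count changes, so the grade level drops.
            System.out.printf("%.1f%n", grade(9, 1, 15)); // ~7.6 (Word reports 7.5)
            System.out.printf("%.1f%n", grade(9, 2, 15)); // 5.8
            System.out.printf("%.1f%n", grade(9, 3, 15)); // 5.2
        }
    }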

Our own work and that of others have shown that there is no relationship between these scores and reader comprehension(19). Third, as we consider new modes of communication, i.e., audio, these formulas have not been validated and it is not clear how the formulas will translate to this new medium.

On the other end of the spectrum, much recent work has focused on fully automated approaches to text simplification that do not require any human editing(20). These approaches usually rely on a training corpus of difficult sentences aligned to corresponding simple sentences(21, 22). Most of these approaches have leveraged models that have been successful for machine translation, including phrasal models(23, 24), syntactic models(25) and, most recently, sequence-to-sequence neural networks(26, 27). There are two key challenges with applying these models in the health and medical domain. First, the performance of these models, including our own(28), is still not nearly good enough for practical application, particularly in a domain like health where correctness is critical. Second, there are currently no good large sentence-aligned corpora of medical text available. While general domain text provides some guidance, domain text is important to capture both the lexical and syntactic nuances.

In this work, we take a compromise between these two extremes and view the text simplification process as a semi-automated approach where a human in the loop utilizes a tool that helps guide the writing process. Unlike readability formulas that only provide high-level information, the tool provides concrete guidance on which portions of the text are problematic and, critically, also provides concrete suggestions for how these sections can be improved. We have successfully used a range of data sources, corpus statistics, and machine learning to discover many features and develop individual algorithms evaluated through user studies. The algorithms that are shown to impact readability and understandability are integrated in an online text editor1. The current version of the tool contains components at the word, sentence, and corpus levels. In this work, we are expanding to focus on features that are useful for audio content.

Importance of Audio Information. While text is still a common mode for providing information in medicine, a new approach is emerging fast: audio information presentation. An important contributing factor is the rapidly rising use of virtual assistants and smart speakers. In 2018, 30% of Americans used voice to find and purchase products. Smart speakers have become an increasingly common household item and offer a range of activities, with an increasing number of interactions that focus on healthcare. For example, in 2021 there were more than 2,000 skills in Alexa’s Health and Fitness section; several focus on tracking (medication, menstruation, fertility, calories) or on finding providers and scheduling, but a large portion is devoted to providing information (e.g., drug information, general health advice, and information on specific conditions and treatments). A 2019 survey(29) found that 52% of those surveyed would welcome using a voice assistant for healthcare, and many are already asking for information about symptoms (73%), medication (46%), hospitals (38%), treatment research (38%), nutrition (39%) and more. The COVID pandemic has sped up several initiatives using text or audio. For example, Microsoft and the Centers for Disease Control and Prevention (CDC) collaborated on a COVID-19 chatbot, and the World Health Organization (WHO) released a chatbot using WhatsApp(30) to provide relevant information.

Creating audio information from existing text is straightforward through the use of existing APIs, e.g., the Bing Speech Text-to-Speech API2 or Google’s Cloud Text-to-Speech API3. Using audio itself is not new; it has long been used with visually impaired audiences. Example research has focused on, e.g., the effectiveness of audio instructions or dental programs for visually impaired children(31). Newer studies focusing on virtual assistants are often limited to evaluating the usefulness and correctness of answers in response to queries, e.g., a comparison of virtual assistants and their acceptance at home(32) and their usefulness in answering queries, such as those about gynecologic oncology(33). However, there are increasingly more options for more sophisticated information exchanges, for example, Amazon Alexa competitions focusing on Type 2 diabetes support(34), hospitals providing Alexa skills or adding smart speakers in patient rooms (e.g., the Cedars-Sinai pilot study(7)), and the increasing use of the Internet of Medical Things with voice-activated devices(35).
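To illustrate how little code such generation requires, below is a sketch using Google’s Cloud Text-to-Speech Java client, following the pattern of its published quickstart. This is not the pipeline used in our studies; the voice, encoding, and output file choices are arbitrary, and credential setup is omitted.

    import com.google.cloud.texttospeech.v1.AudioConfig;
    import com.google.cloud.texttospeech.v1.AudioEncoding;
    import com.google.cloud.texttospeech.v1.SynthesisInput;
    import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
    import com.google.cloud.texttospeech.v1.TextToSpeechClient;
    import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
    import java.io.FileOutputStream;
    import java.io.OutputStream;

    public class TextToAudio {
        public static void main(String[] args) throws Exception {
            try (TextToSpeechClient client = TextToSpeechClient.create()) {
                SynthesisInput input = SynthesisInput.newBuilder()
                    .setText("The patients who were included are tested for the disease.")
                    .build();
                VoiceSelectionParams voice = VoiceSelectionParams.newBuilder()
                    .setLanguageCode("en-US")   // arbitrary choice for this sketch
                    .build();
                AudioConfig audioConfig = AudioConfig.newBuilder()
                    .setAudioEncoding(AudioEncoding.MP3)
                    .build();
                SynthesizeSpeechResponse response =
                    client.synthesizeSpeech(input, voice, audioConfig);
                try (OutputStream out = new FileOutputStream("snippet.mp3")) {
                    out.write(response.getAudioContent().toByteArray());
                }
            }
        }
    }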

Our work differs from others in that we evaluate how to improve content for general, automated audio generation, a capability virtual assistants already provide, though currently with existing content that has not been optimized for audio. Since we focus on medical and healthcare content, the standards for accuracy are high and little information (preferably none) can be omitted.

Methods

Identifying Missing Elements. We algorithmically identify missing elements in text based on the syntactic structure of the sentences. The algorithm does not require any human intervention and can be applied efficiently to text in a wide range of domains. Using this algorithm, we analyzed the frequency of missing elements in two different corpora. Then, we performed a user study to see the impact of missing elements on comprehension in both text and audio presentation.

Missing elements can be introduced into sentences in English in a variety of ways. In this work, we focus on two ways this can occur in noun phrases. The first is missing elements in complex noun phrases. More specifically, we focus on noun phrases that contain a relative clause or verbal element in a nominal without an overt subordinate clause. For example, in the phrase

“The patients included are tested for ….”

there is syntactic information missing. This is often done in an attempt to be succinct, though it does leave some information implied rather than explicitly stated. A version of the sentence that includes this information could be written as:

“The patients who were included are tested for …”

These missing elements can be detected using the grammatical structure of the sentence. A missing element occurs if there exists a noun phrase that contains a verb but does not contain a preposition or subordinating conjunction (i.e., a word with part-of-speech ‘IN’). We denote this Rule 1.
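A minimal sketch of the Rule 1 check as just defined, assuming a Penn Treebank-style constituency parse tree (produced later in this section with the Stanford CoreNLP toolkit). The class and method names are illustrative, not the authors’ actual code; the full six-tag treebank verb set is used here, while the implementation described below uses five (which five is not specified).

    import edu.stanford.nlp.trees.Tree;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public final class MissingElementRules {

        // Penn Treebank verb tags (full set of six; the paper's
        // implementation uses five, so this exact set is an assumption).
        static final Set<String> VERB_TAGS = new HashSet<>(
            Arrays.asList("VB", "VBD", "VBG", "VBN", "VBP", "VBZ"));

        /** Rule 1: true if this noun phrase contains a verb but no
         *  preposition or subordinating conjunction (tag 'IN'). */
        static boolean rule1(Tree nounPhrase) {
            boolean hasVerb = false, hasIn = false;
            for (Tree node : nounPhrase) {      // iterates over all subtrees
                if (node.isPreTerminal()) {     // POS-tag node above a word
                    String tag = node.label().value();
                    if (VERB_TAGS.contains(tag)) hasVerb = true;
                    if (tag.equals("IN")) hasIn = true;
                }
            }
            return hasVerb && !hasIn;
        }
    }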

Another type of missing element is found in noun phrases that do not contain nouns, more specifically, a determiner that lacks an accompanying nominal. In English, these include deictic determiners: function words that point or refer back to items that are not explicitly stated in the sentence. For example, in the phrase

“The patients included are tested for the disease, this indicates that...”

the term ‘this’ refers to a noun that is implied by the content but not stated. It can be made explicit, for example:

“The patients included are tested for the disease, this fact indicates that...”

A missing element of this type occurs if there exists a noun phrase that does not contain a noun or other words that act like nouns, i.e., a preposition, an existential ‘there’, a number, a gerund verb, or a comparative or superlative adjective. We denote this Rule 2.
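Rule 2 can be sketched the same way, continuing the hypothetical MissingElementRules class above, using the ten noun-like part-of-speech tags enumerated in the implementation description that follows:

        // The ten Penn Treebank tags treated as a noun or noun-like word:
        // four noun tags, comparative/superlative adjectives, a
        // preposition, a number, existential 'there', and a gerund verb.
        static final Set<String> NOUN_LIKE_TAGS = new HashSet<>(
            Arrays.asList("NN", "NNS", "NNP", "NNPS",   // nouns
                          "JJR", "JJS",                 // comparative/superlative
                          "IN", "CD", "EX", "VBG"));    // prep., number, there, gerund

        /** Rule 2: true if this noun phrase contains no noun or
         *  noun-like word. */
        static boolean rule2(Tree nounPhrase) {
            for (Tree node : nounPhrase) {
                if (node.isPreTerminal()
                        && NOUN_LIKE_TAGS.contains(node.label().value())) {
                    return false;
                }
            }
            return true;
        }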

For both rules, we can determine whether a missing element occurs in a sentence given its syntactic parse. To apply the rules, we first preprocess the text with the Stanford CoreNLP toolkit(36) to split the text into sentences and automatically parse them. The rule conditions above are checked against these automatically generated parse trees. In our case, we implemented the rules in Java. Rule 1 identifies as containing a missing element all noun phrases that contain a verb (identified by one of the five verb part-of-speech tags) but do not contain a preposition or subordinating conjunction (i.e., do not contain a word with the ‘IN’ part-of-speech tag). Rule 2 identifies as containing a missing element all noun phrases that do not contain a noun or a noun-like word, specifically, one of ten parts of speech (four noun tags, two adjective tags, a preposition, a number, existential ‘there’, or a gerund verb).
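Putting the pieces together, a hedged sketch of the preprocessing and counting loop. The CoreDocument API shown is one of several CoreNLP entry points, and the annotator configuration beyond sentence splitting, tagging, and parsing is an assumption.

    import edu.stanford.nlp.pipeline.CoreDocument;
    import edu.stanford.nlp.pipeline.CoreSentence;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.trees.Tree;
    import java.util.Properties;

    public class MissingElementCounter {
        public static void main(String[] args) {
            // Sentence splitting, POS tagging, and constituency parsing.
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize,ssplit,pos,parse");
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

            CoreDocument doc = new CoreDocument(
                "The patients included are tested for the disease, this indicates improvement.");
            pipeline.annotate(doc);

            for (CoreSentence sentence : doc.sentences()) {
                int rule1 = 0, rule2 = 0;
                Tree parse = sentence.constituencyParse();
                for (Tree node : parse) {
                    if (node.label().value().equals("NP")) {  // check every noun phrase
                        if (MissingElementRules.rule1(node)) rule1++;
                        if (MissingElementRules.rule2(node)) rule2++;
                    }
                }
                System.out.printf("Rule 1 hits: %d, Rule 2 hits: %d%n", rule1, rule2);
            }
        }
    }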

Evaluating Occurrence: Corpus Statistics Study. To get an understanding of the impact of these two types of missing elements, we completed corpus analyses to evaluate whether these missing elements appear regularly in text and whether they are especially indicative of difficult text. We used two different corpora to measure the impact of missing elements. The first is a medical corpus with abstracts collected from Cochrane4 articles. Cochrane provides a collection of a wide range of medical articles that are meant to “inform healthcare decision-making”. They are digests of current research and a source where patients can obtain current information on diseases, treatments, and other medically-related topics. We queried the database for 15 different illnesses that were leading causes of death according to the CDC (e.g., heart disease) along with 4 common conditions (e.g., obesity) and collected the abstracts of all the articles returned. The total dataset contains 44,488 sentences.

For the second corpus, we used a corpus that contains 318,056 aligned sentence pairs from English Wikipedia and Simple English Wikipedia(21). Wikipedia articles are a common source for people searching a wide range of topics and are written to be generally accessible. Simple English Wikipedia articles cover similar topics to the normal English Wikipedia articles but are written to be more accessible and broadly digestible. For each English Wikipedia sentence, the corpus contains a corresponding Simple English Wikipedia sentence with roughly equivalent information, though expressed more simply. This parallel corpus allows for a direct comparison of text with varying difficulty levels that is agnostic of topic. The Wikipedia dataset covers a broad range of topics and is an extremely commonly used resource. As noted above, no such corpus currently exists in the medical domain.

Evaluating Effects on Listeners: User Study. To evaluate the impact on reading and listening, we combined the data from two user studies where comprehension was tested with online consumers using a corpus of text snippets. The difference between the two studies was in how comprehension of the content was measured: with either multiple-choice questions or free recall of information. These initial studies showed that consumers can learn new information from audio as well as from text(10), a requirement before commencing research on optimizing content. The study was reviewed and approved by the Institutional Review Board of the University of Arizona.

The results for the current study are based on an analysis of our consumer comprehension data using the new missing element algorithms. The algorithms to detect missing elements (Rules 1 and 2) as well as the analysis of the relationship between missing elements and comprehension on the text snippets are new for this study and not reported elsewhere.

The goal was to compare user comprehension after being shown medical information two times. This simulates the situation where a patient is presented information after a medical consultation (e.g., orally in the doctor’s office and then again at home with written information). The studies were done in two phases designed to isolate the effect on comprehension of listening to information versus reading the same information. In the text-text variant, the information was presented twice to the participants as text. In the audio-text variant, the information was presented first as audio and then again as text. We used a total of 100 text snippets randomly selected from two medical sources and chosen to be of approximately equal length: Cochrane (N=50) and English Wikipedia snippets for health-related pages (N=50). For this paper, we calculated the number of missing elements based on Rules 1 and 2 in the text snippets to evaluate how the number of missing elements relates to comprehension.

In the first study, we used multiple-choice questions to measure comprehension. The participants were presented with the snippet (in either text form or audio form, depending on the condition) and then answered a multiple-choice question about the content (Multiple-Choice 1). They also scored the perceived difficulty of the snippet using a 5-point Likert scale with a score of 1 indicating ‘Very Easy’ text and 5 indicating ‘Very Difficult’ text. The participants were then shown the information again as text and they were given an opportunity to correct their answer to the same multiple-choice question (Multiple-Choice 2). We included the same question to see how much an individual improved with the second presentation of the information. For each multiple-choice question, there were three answer choices, one of which was correct. We measured comprehension with the percentage correct.

In the second study, we used free recall to measure comprehension; the study was identical to the first except that the first multiple-choice question was replaced with a free recall question where participants were asked to write, in their own words, as many pieces of information as they could remember about the information presented. To evaluate the quality of the free recall answers, we quantified how much of the recalled information was in the original text that participants were presented. To calculate this, we first applied an automated spelling checker (to avoid differences between participants who used a spelling checker and those who did not). Then, we measured the number of words that overlapped between the original text and the information recalled by the study participants.

To better understand the relationship between the information recalled and the original text, we used two measures of recall: exact and semantic. Exact recall was calculated by counting the number of content-bearing words (i.e., nouns, verbs, adjectives, and adverbs) in the information recalled that occurred in the original text. This type of exact match can be too strict and does not allow for synonyms or other paraphrasing that participants might use. To capture a broader sense of overlap, we also used semantic recall: the number of content-bearing words that were either an exact match of, or had a similar meaning to, a word in the original text. To determine whether two words were similar, each word was first represented by its word embedding. We then used the cosine similarity between word embeddings to identify words with similar meanings. We used Google’s pre-trained 300-dimension word embeddings(37) and a cosine similarity threshold of 0.45 to tag words as semantically similar (this threshold was empirically determined).
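A minimal sketch of the matching test behind both measures, assuming the pretrained vectors have already been loaded into a Map<String, float[]> (loading code omitted; the class and method names are ours, not the study’s actual code):

    import java.util.Map;

    public class RecallOverlap {
        // Cosine similarity threshold for a semantic match
        // (empirically determined, as described above).
        static final double THRESHOLD = 0.45;

        static double cosine(float[] a, float[] b) {
            double dot = 0, normA = 0, normB = 0;
            for (int i = 0; i < a.length; i++) {
                dot   += a[i] * b[i];
                normA += a[i] * a[i];
                normB += b[i] * b[i];
            }
            return dot / (Math.sqrt(normA) * Math.sqrt(normB));
        }

        /** True if a recalled content word counts toward semantic recall:
         *  an exact match, or embedding similarity above the threshold. */
        static boolean semanticMatch(String recalled, String original,
                                     Map<String, float[]> embeddings) {
            if (recalled.equals(original)) return true;   // exact recall
            float[] a = embeddings.get(recalled);
            float[] b = embeddings.get(original);
            return a != null && b != null && cosine(a, b) >= THRESHOLD;
        }
    }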

Both studies were conducted using Amazon Mechanical Turk (AMT), with three workers participating for each text snippet in each condition. Comprehension scores were averaged across the three workers and across the text snippets. We also report here on the time spent on the task, which is provided by AMT. Additional details of the AMT study have been reported in the study that focused on the impact of text-text and audio-text ordering on comprehension(38).

Results

Corpus Statistics Study Results. Table 1 shows the results of the corpus analysis for the two corpora. For each resource, we counted each occurrence of a missing functional element as well as the number of sentences containing one or more missing elements. Since there can be multiple occurrences of missing elements in a single sentence (e.g., in different noun phrases), the number of occurrences can be higher than the number of sentences.

Table 1.

The number and proportion of missing elements in the medical corpus and the English-Simple English sentence-aligned corpus.

                  Corpus Source
                  Cochrane          English Wikipedia   Simple English Wikipedia
                  (N = 44,488)      (N = 318,056)       (N = 318,056)
Rule 1
  occurrences     15,734            52,486              45,379
  sentences       10,449 (23.5%)    41,545 (13.1%)      37,427 (11.7%)
Rule 2
  occurrences     4,784             31,258              30,806
  sentences       4,333 (9.7%)      27,479 (8.6%)       27,446 (8.6%)
Rule 1 or 2
  sentences       13,130 (29.5%)    63,737 (20.0%)      60,067 (18.9%)

Missing elements occur frequently in medical text. The Cochrane dataset contained 15,734 instances of Rule 1 elements and almost a quarter of all sentences (23.5%) had at least one occurrence. Rule 2 was less frequent with 4,784 instances overall and almost a tenth of the sentences (9.7%) had at least one occurrence. Combining the rules, there was at least one missing element in 29.5% of the sentences.

Missing elements also tended to occur more frequently in difficult text, particularly Rule 1 elements. In the Wikipedia corpus 20.0% (63,737 sentences) of the English Wikipedia sentences contained a missing element versus 18.9% (60,067 sentences) for Simple English Wikipedia. This difference was almost entirely due to Rule 1 occurrences (41,545 sentences vs. 37,427 sentences, English vs. Simple English).

User Study Results. Table 2 provides an overview of the study corpus used to measure user comprehension. Overall, missing elements are common in both types of text. The snippets contained on average 4.5 sentences, and most contained one or more missing elements. There were more missing elements identified by Rule 1 (0.82 and 1.16 per snippet for Wikipedia and Cochrane snippets, respectively) than by Rule 2 (0.18 and 0.20, respectively). The differences were not statistically significant.

Table 2.

Corpus statistics for the text snippets used in the user comprehension study.

                              Text Origin
                              Wikipedia   Cochrane
Snippets (N)                  50          50
Average Sentences / Snippet   4.50        4.52
Average Words / Snippet       94.36       96.12
Rule 1
  Minimum                     0           0
  Maximum                     4           4
  Average                     0.82        1.16
Rule 2
  Minimum                     0           0
  Maximum                     1           2
  Average                     0.18        0.20

In Table 3, we show the relationship between reading the text and the number of missing elements (text-text condition). We present reading first as it serves as a baseline: reading information is a familiar activity. We conducted correlation analyses to estimate the impact of missing elements on the difficulty (actual or perceived) of text. We calculated one-tailed Pearson correlation coefficients using the average scores of the three AMT participants for each of the snippets (N=100). We chose one-tailed tests because we hypothesize that an increasing number of missing elements makes the text more difficult. We found that more time is spent on the tasks when there are more missing elements. The text is also perceived as more difficult when there are more missing elements. The effects are stronger for Rule 1 (which occurs more frequently in the text). There was no significant correlation with actual difficulty as measured here with multiple-choice questions.
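For reference, the statistic reported throughout is the standard Pearson product-moment coefficient over the N=100 per-snippet averages. A minimal sketch (the one-tailed significance test against a t distribution with N-2 degrees of freedom is omitted, as is any library dependency):

    public class SnippetCorrelation {
        /** Pearson correlation between, e.g., per-snippet missing-element
         *  counts and per-snippet average difficulty scores. */
        static double pearson(double[] x, double[] y) {
            int n = x.length;
            double meanX = 0, meanY = 0;
            for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
            meanX /= n;
            meanY /= n;
            double sxy = 0, sxx = 0, syy = 0;
            for (int i = 0; i < n; i++) {
                sxy += (x[i] - meanX) * (y[i] - meanY);
                sxx += (x[i] - meanX) * (x[i] - meanX);
                syy += (y[i] - meanY) * (y[i] - meanY);
            }
            return sxy / Math.sqrt(sxx * syy);
        }
    }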

Table 3.

Results for the multiple-choice questions for the reading (text-text) condition. One-tailed Pearson correlation significant at 0.05 level* or at 0.01 level**.

                             Text Difficulty Measure (N=100)
Missing Elements (by Rule)   Time Spent   Multiple-Choice 1   Multiple-Choice 2   Perceived Difficulty
Rule 1                       .491**       .062                .068                .187*
Rule 2                       .190*        .030                .346                .112

Table 4 shows the results for listening to the content first and reading afterwards (audio-text condition). When participants have little control over the time spent (they could not pause the recording), there is no relationship with time spent. Interestingly, we found a negative correlation between the perceived difficulty of the audio and the number of missing elements: when listening to the information, more missing elements make the information sound easier. While including functional elements may help make the connections in the sentences more explicit, leaving them out can sometimes help with the fluency of the sentences. As with reading text (above), there is no correlation between the actual difficulty, measured with the multiple-choice questions, and the number of missing elements.

Table 4.

Results for the multiple-choice for the listening (audio-text) condition. One-tailed Pearson correlation significant at 0.05 level* or at 0.01 level**.

                             Text Difficulty Measure (N=100)
Missing Elements (by Rule)   Time Spent   Multiple-Choice 1   Multiple-Choice 2   Perceived Difficulty
Rule 1                       .335         .065                .044                -.209*
Rule 2                       .209         -.033               -.142               -.113

Our second study repeated this approach but replaced the multiple-choice questions with a free recall measure. This was done to provide a more sensitive measure of information comprehension and retention. Free recall was calculated as the overlap of words between the given content and the participants’ recall. Similar to the analysis above, we calculated the correlations between missing elements and the retention of information.

Table 5 shows the results for the free recall of information when reading text (text-text condition). There is no correlation between time spent and missing elements. The number of exact or semantically matching words is not correlated with missing elements (not surprising, since many AMT workers prefer to work fast and spend a limited amount of time on a task). However, the overall similarity of the answer to the original information is strongly correlated with missing elements: more missing elements result in lower overall answer similarity, i.e., a worse answer by the participants. The information is also perceived as more difficult when there are more missing elements.

Table 5.

Results for the free recall measure for the reading (text-text) condition. One-tailed Pearson correlation significant at 0.05 level* or at 0.01 level**.

                             Text Difficulty Measure (N=100)
Missing Elements (by Rule)   Time Spent   Exact Free Recall Count   Semantic Free Recall Count   Overall Recall Similarity   Perceived Difficulty
Rule 1                       .044         -.077                     -.060                        -.268**                     .180*
Rule 2                       .127         .083                      .086                         -.123                       .125

Table 6 shows the results for free recall of information after listening to the information first (audio-text condition). As expected, since participants have no control over audio play, there was no correlation between time spent and missing elements. The results are similar to those for reading text. The number of exact or similar words is not correlated with missing elements. However, the overall answer similarity correlates with the number of missing elements, with similarity being lower (i.e., a worse answer) when there are more missing elements. As with text, the audio is perceived as more difficult when there are more missing elements.

Table 6.

Results for the free recall measure for the listening (audio-text) condition. One-tailed Pearson correlation significant at 0.05 level* or at 0.01 level**.

                             Text Difficulty Measure (N=100)
Missing Elements (by Rule)   Time Spent   Exact Free Recall Count   Semantic Free Recall Count   Overall Recall Similarity   Perceived Difficulty
Rule 1                       .069         -.070                     -.022                        -.179*                      .174*
Rule 2                       .031         .018                      .074                         -.038                       .140

Discussion

We presented the discovery of and results for missing elements and their relationship to the perceived and actual difficulty of content presented as text or audio. The effect of missing elements on reader comprehension is subtle. We believe that our broad multiple-choice questions were not sufficiently sensitive to capture the impact of missing elements. However, with a more sensitive measure such as free recall, there is a relationship between the number of missing elements in the content and how much study participants recalled. The results are similar for audio and text presentation: when the content has more missing elements, less is recalled. We believe these results might underestimate the impact because AMT workers tend to work fast, with quick and short answers.

Furthermore, there was an interesting but complex relationship between missing elements and perceived difficulty that needs further follow-up. In the audio-text condition, content with more missing elements was perceived as easier: this was when the participants were listening and only had to answer a multiple-choice question. In the text-text condition, participants perceived content as more difficult when there were more missing elements. Our corpus statistics study showed that missing elements are a very frequent occurrence. It is reasonable to assume that people remove functional elements for a purpose, e.g., to make the text flow better. This discrepancy between how text and audio are perceived gives some insight into this, but further exploration is required. Additionally, perceived difficulty cannot be ignored since it may impact how consumers process and understand health information. The Health Belief Model (HBM)(39) and the Theory of Planned Behavior (TPB)(40) have shown this impact. In a review of 24 studies, the 4th dimension in the HBM, perceived barriers, was found to be the most significant in explaining health behavior(39). Similarly, in TPB, perceived difficulty, a sub-factor of perceived behavioral control, has been found to be the stronger predictor of intentions and behavior(41). Other work has shown how perceived difficulty correlates with a decrease in the recall of information(42).

Practical Implications

Our work has three main practical implications. First, audio is an effective delivery medium for information. This paper represents the second study to support this. Audio as a medium for information dissemination is becoming increasingly important given the growing popularity of audio for information access. Second, our combined studies show the need for evidence-based guidance on how to optimize content for text and audio. Ideally, such guidelines will be incorporated into automated or semi-automated tools to allow a broad variety of content providers to optimize their content without requiring a background in linguistics or communications. Finally, in addition to features discovered in previous work, we demonstrated here a new feature and its impact on the actual and perceived difficulty of content in text and audio presentation. This study shows that it is important to specify nouns and verbs when they refer to previously presented information. This makes the content seem easier and also contributes to better recall of the information.

Future work on health literacy for patient or consumer education where there is an information exchange with laypersons utilizing audio for information delivery would benefit from evaluating content, style, and presentation for their impact on perceived and actual difficulty. Beyond these, there are also potential opportunities for examining other dimensions that are more relevant for audio delivery, e.g., emphasis, persuasion, and bias. With audio information delivery, few evidence-based guidelines are currently available.

Conclusion

Our overall goal is to optimize content for delivery both via text and audio. We aim to do this by creating tools that provide evidence-based guidelines on how to change text. In a first study, we found that audio is an effective medium to deliver content, an exciting and promising result, particularly given the increasing prevalence of audio devices in our lives(10). In the study presented here, we focused on one text feature, missing elements, and found that it affects both the perceived and actual difficulty of both text and audio. Again, this is a promising result since it means that optimizing content will be equally effective for text and audio presentation. While the impact of missing elements may seem small, the feature appears in nearly 30% of sentences used to provide health information; even a small improvement could have a large impact.

Acknowledgments

Research reported in this paper was supported by the U.S. National Library of Medicine of the National Institutes of Health under Award Numbers 1R01LM011975 and 2R01LM011975. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

1. http://simple.cs.pomona.edu:3000/
2. https://azure.microsoft.com/en-us/services/cognitive-services/speech-services/
3. https://cloud.google.com/text-to-speech/
4. https://www.cochranelibrary.com/

References

1. United States. Affordable Care Act. 2010. https://www.healthcare.gov/where-can-i-read-the-affordable-care-act/
2. Centers for Disease Control and Prevention. National Action Plan to Improve Health Literacy. 2010. https://www.cdc.gov/healthliteracy/planact/national.html
3. United States. Plain Writing Act of 2010. https://www.plainlanguage.gov/law/
4. Koh HK, Berwick DM, Clancy CM, Baur C, Brach C, Harris LM, et al. New Federal Policy Initiatives To Boost Health Literacy Can Help The Nation Move Beyond The Cycle Of Costly ‘Crisis Care’. Health Affairs. 2012;31(3):434–43. doi: 10.1377/hlthaff.2011.1169. PMID: 22262723.
5. Kinsella B. Global Smart Speaker Growth Cools in Q1 as Pandemic Leads to Declining China Sales, Amazon Retains Top Spot Says Strategy Analytics. 2020. https://voicebot.ai/2020/05/25/global-smart-speaker-growth-cools-in-q1-as-pandemic-leads-to-declining-china-sales-amazon-retains-top-spot-says-strategy-analytics/
6. Kinsella B. U.S. Smart Speaker Ownership Rises 40% in 2018 to 66.4 Million and Amazon Echo Maintains Market Share Lead Says New Report from Voicebot. 2019. https://voicebot.ai/2019/03/07/u-s-smart-speaker-ownership-rises-40-in-2018-to-66-4-million-and-amazon-echo-maintains-market-share-lead-says-new-report-from-voicebot/
7. Cedars-Sinai Newsroom. Cedars-Sinai Taps Alexa for Smart Hospital Room Pilot. 2019. https://www.cedars-sinai.org/newsroom/cedars-sinai-taps-alexa-for-smart-hospital-room-pilot/
8. Yoo TK, Oh E, Ryu IH, Kim JS, Kim JK. Deep learning-based smart speaker to confirm surgical sites for cataract surgeries: A pilot study. PLoS One. 2020;15(4):e0231322. doi: 10.1371/journal.pone.0231322.
9. digital.gov, plainlanguage.gov. Plain Language Summit 2019: A day of talks and conversations on how to advance plain language in government communications. 2019. https://digital.gov/event/2019/09/05/plain-language-summit-2019/
10. Leroy G, Kauchak D. A comparison of text versus audio for information comprehension with future uses for smart speakers. JAMIA Open. 2019;2(3):254–60. doi: 10.1093/jamiaopen/ooz011.
11. Ley P, Florio T. The Use of Readability Formulas in Health Care. Psychology, Health & Medicine. 1996;1(1):7–28.
12. Wang L-W, Miller MJ, Schmitt MR, Wen FK. Assessing Readability Formula Differences with Written Health Information Materials: Application, Results, and Recommendations. Research in Social & Administrative Pharmacy. 2012 (In Press). doi: 10.1016/j.sapharm.2012.05.009.
13. Piñero-López M, Figueiredo-Escribá C, Modamio P, Lastra C, Mariño E. Readability assessment of package leaflets of biosimilars. BMJ Open. 2019;9(1):e024837. doi: 10.1136/bmjopen-2018-024837.
14. Cisu T, Mingin G, Baskin L. An evaluation of the readability, quality, and accuracy of online health information regarding the treatment of hypospadias. J Pediatr Urol. 2018;18:30498–4. doi: 10.1016/j.jpurol.2018.08.020.
15. Flesch R. A New Readability Yardstick. Journal of Applied Psychology. 1948;32(3):221–33. doi: 10.1037/h0057532.
16. McLaughlin GH. SMOG Grading: a New Readability Formula. Journal of Reading. 1969;12:636–46.
17. Graesser AC, McNamara DS, Kulikowich JM. Coh-Metrix: Providing Multilevel Analyses of Text Characteristics. Educational Researcher. 2011;40:223–4.
18. Joint Commission’s Public Policy Initiative. “What Did the Doctor Say?:” Improving Health Literacy to Protect Patient Safety. 2007.
19. Leroy G, Kauchak D. The Effect of Word Familiarity on Actual and Perceived Text Difficulty. Journal of the American Medical Informatics Association (JAMIA). 2014;e1. doi: 10.1136/amiajnl-2013-002172.
20. Shardlow M. A survey of automated text simplification. International Journal of Advanced Computer Science and Applications. 2014;4(1):58–70.
21. Coster W, Kauchak D. Simple English Wikipedia: A New Text Simplification Task. 49th Annual Meeting of the Association for Computational Linguistics. 2011.
22. Xu W, Callison-Burch C. Problems in current text simplification research: New data can help. Transactions of the Association of Computational Linguistics. 2015;3(1):283–97.
23. Coster W, Kauchak D. Learning to simplify sentences using Wikipedia. Workshop on Monolingual Text-to-text Generation. 2011.
24. Wubben S, Bosch AVD, Krahmer E. Sentence simplification by monolingual machine translation. 50th Annual Meeting of the Association for Computational Linguistics. 2012.
25. Woodsend K, Lapata M. Learning to simplify sentences with quasi-synchronous grammar and integer programming. Conference on Empirical Methods in Natural Language Processing. 2011.
26. Nisioi S, Štajner S, Ponzetto SP, Dinu LP. Exploring neural text simplification models. 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Vancouver, Canada. 2017:85–91.
27. Zhang X, Lapata M. Sentence Simplification with Deep Reinforcement Learning. 2017 Conference on Empirical Methods in Natural Language Processing. September 7–11; Copenhagen, Denmark. 2017.
28. Hung HN, Kauchak D, Leroy G. AutoMeTS: The Autocomplete for Medical Text Simplification. 28th International Conference on Computational Linguistics (COLING). December 3–8; Online. 2020.
29. Kinsella B, Mutchler A. Voice Assistant Consumer Adoption in Healthcare. 2019.
30. Sezgin E, Huang Y, Ramtekkar U, Lin S. Readiness for voice assistants to support healthcare delivery during a health crisis and pandemic. npj Digital Medicine. 2020;3. doi: 10.1038/s41746-020-00332-0.
31. Sardana D, Goyal A, Gauba K, Kapur A, Manchanda S. Effect of specially designed oral health preventive programme on oral health of visually impaired children: use of audio and tactile aids. International Dental Journal. 2019;69:98–106. doi: 10.1111/idj.12436.
32. McLean G, Osei-Frimpong K. Hey Alexa … examine the variables influencing the use of artificial intelligent in-home voice assistants. Computers in Human Behavior. 2019;99:28–37.
33. Pavlika EJ, Burgess BT, Quick K, McDowell Jr AB, Gorski JW, Riggs MB, et al. Conversational replies to oral queries in gynecologic oncology by Google, Alexa and Siri. Gynecologic Oncology. 2020;159:308–9.
34. Alexa Diabetes Challenge. 2018. www.alexadiabeteschallenge.com/
35. Basatneh R, Najafi B, Armstrong D. Health Sensors, Smart Home Devices, and the Internet of Medical Things: An Opportunity for Dramatic Improvement in Care for the Lower Extremity Complications of Diabetes. Journal of Diabetes Science and Technology. 2018;12(3):577–86. doi: 10.1177/1932296818768618.
36. Manning CD, Surdeanu M, Bauer J, Finkel J, Bethard SJ, McClosky D. The Stanford CoreNLP Natural Language Processing Toolkit. 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2014:55–60. https://nlp.stanford.edu/pubs/StanfordCoreNlp2014.pdf
37. Mikolov T, Chen K, Corrado G, Dean J. Efficient Estimation of Word Representations in Vector Space. CoRR. 2013;abs/1301.3781.
38. Leroy G, Kauchak D. A comparison of text versus audio for information comprehension with future uses for smart speakers. JAMIA Open. doi: 10.1093/jamiaopen/ooz011.
39. Janz NK, Becker MH. The Health Belief Model: A Decade Later. Health Education Quarterly. 1984;11:1–47. doi: 10.1177/109019818401100101.
40. Ajzen I. The Theory of Planned Behavior. Organizational Behavior and Human Decision Processes. 1991;50:179–211.
41. Potelle H, Rouet J-F. Effects of Content Representation and Readers’ Prior Knowledge on the Comprehension of Hypertext. International Journal of Human-Computer Studies. 2003;58:327–45.
42. Velayo RS. Retention of Content as a Function of Presentation Mode and Perceived Difficulty. Reading Improvement. 1993;30(4):216–27.
