Abstract
Analyzing the reading grade level of online mental health information is an important first step in ensuring that such information is accessible to the general public and does not perpetuate existing health disparities across socioeconomic groups. The present study systematically examined the grade-level readability of mental health information related to various psychiatric diagnoses, obtained from six highly utilized mental health websites, using a generalized estimating equations approach. Results suggest that the readability of mental health information is generally well above the 6th to 8th grade level recommended by several national health organizations (Kutner, Greenberg, Jin, & Paulsen, 2006; National Institutes of Health, 2001; National Institutes of Health, 2017), with reading grade level estimates from the model ranging from 5.62 to 17.9. Further efforts are required to ensure that writers of online health information do not exacerbate existing health disparities by ignoring these guidelines.
Keywords: online mental health information, readability, health literacy, GEE, multi-level modeling
Introduction
In recent years, the Internet has become a popular source of health-related information for a multitude of physical and mental health disorders and symptoms. Of the 84% of Americans who report using the Internet regularly, 72% have reported using the Internet as a resource for health information, with 77% beginning their search at a general search engine site such as Google, Bing, or Yahoo.com (Fox & Duggan, 2013; Perrin & Duggan, 2015). In addition, 35% of U.S. adults indicated that they have specifically gone online to find out what condition they or someone else might have, with 46% of these ‘online diagnosers’ indicating that the information obtained led them to think they needed medical intervention, and 18% indicating that a medical professional either disagreed with the initial diagnosis or offered an alternate medical opinion (Fox & Duggan, 2013).
Information derived from online sources can be useful in helping individuals make important healthcare decisions for themselves and their loved ones, provided that the information is readable, comprehensible, and accurate. In light of evidence that the average American reads at or below the 6th to 8th grade level (Kutner et al., 2006; Paasche-Orlow, Parker, Gazmararian, Nielsen-Bohlman, & Rudd, 2005), several organizations, including the Centers for Disease Control and Prevention (CDC), the American Medical Association (AMA), the Joint Commission (2010), and the National Institutes of Health, recommend that health information be written at or below this level (National Institutes of Health, 2017; Neuhauser & Paul, 2011; Weiss, 2003).
Although the constructs of readability, comprehension, and accuracy of information are closely related, and have recently been investigated for a myriad of physical health concerns (Beaunoyer, Arsenault, Lomanowska, & Guitton, 2017; Kher, Johnson, & Griffith, 2017; Koo, Shee, & Yap, 2017; Wong & Levi, 2017), little attention has been devoted to understanding the basic grade-level readability of online mental health information. In this context, readability is a systematic measure of the ease with which a passage of text can be read (Albright et al., 1996; McInnes & Haglund, 2011), and grade-level readability is the grade level a person needs to have completed in order to read a selection of text.
The lack of attention to the readability of online public health information is particularly problematic considering that approximately 35% of U.S. citizens have basic or below basic health literacy, 53% have intermediate health literacy, and only 12% have proficient health literacy (Kutner et al., 2006). In this context, health literacy is defined as the ability to search for, comprehend, and utilize written health education materials to make educated healthcare decisions (Berkman, Sheridan, Donahue, Halpern, & Crotty, 2011; Berkman et al., 2011; Kutner et al., 2006; Committee on Health Literacy, 2004). A more concrete operationalization defines below basic health literacy as the ability to successfully read a set of short instructions and identify what is permissible to drink before a medical test; basic health literacy as the ability to successfully read a pamphlet and give two reasons why a person with no symptoms should be tested for a disease; intermediate health literacy as the ability to read instructions on a prescription label and determine at what time a person can take the medication; and proficient health literacy as the ability to use a table to calculate an employee’s share of health insurance costs for one year (America’s Health Literacy, 2008).
With regard to Internet utilization, only 15% of adults with below basic health literacy indicate using the Internet “some” or “a lot” of the time for obtaining health information, as compared with 31% of those with basic health literacy, 49% of those with intermediate health literacy, and 62% of those with proficient health literacy (America’s Health Literacy, 2008). Although one study suggests that level of health literacy may be associated with preferences for receiving health information (Manganello et al., 2017), it is uncertain at this time whether this low prevalence among those with basic or below basic health literacy is due to poor comprehension of written information, limited access to resources, or some combination of both factors. Exploration of socioeconomic factors reveals an association between Internet use and reported level of education, with 95% of college-educated Americans using the Internet, as compared with 90% of those with some college education, 76% of those with a high school degree, and 66% of those with less than a high school diploma (Perrin & Duggan, 2015).
It is important to consider that even individuals with strong literacy skills and high educational attainment can face health literacy challenges, particularly after being diagnosed with a serious medical or mental illness that requires complicated self-care, unfamiliarity with opaque medical terminology and processes, and/or having to interpret numbers or risks in order to make challenging healthcare decisions (America’s Health Literacy, 2008). This suggests that readability does not equate to comprehension, which can oftentimes be two or more grade levels below reading or education level, and drops when a person is under stress (McInnes & Haglund, 2011).
The importance of access to comprehensible text becomes even more apparent considering that individuals with low health literacy are at higher risk for poorer access to care, experience poorer health outcomes (Berkman et al., 2011), and have higher hospitalization rates than individuals with high health literacy (McInnes & Haglund, 2011). According to a number of reports (Baker, Parker, Williams, & Clark, 1998; Baker et al., 2002; Gordon, Hampson, Capell, & Madhok, 2002; Scott, Gazmararian, Williams, & Baker, 2002), individuals with low health literacy make greater use of treatment services, as compared with services designed to prevent the onset of disease or lessen serious complications, resulting in an estimated $50 to $73 billion in additional health care costs annually in the United States.
It is possible that one way to attenuate these costs might be to match the readability of written healthcare information to national reading grade level averages, or below. In light of this consideration, the purpose of this paper was to systematically examine reading grade level estimates of text extracted from common sources of online mental health information. Sixteen mental health disorders were selected for analysis based on prevalence rates in the population, and a selection of text was extracted from six popular sources of online mental health information common to all disorders under investigation. Reading grade level estimates were calculated using five different readability formulas, where each formula was conceptualized as an individual and representative ‘rater’ of text, selected from a hypothetical population of all available readability indices (i.e., raters).
Results from this study can be used to create a general set of readability guidelines from which to modify existing online mental health-related materials and/or compile new information in a manner that is consistent with the reading level and education of the general population. It is possible that by illustrating the educational bias inherent in much of the written mental health information that is available online, we can begin to address ways in which to reduce this gap and serve those who are most in need (World Health Organization, 2010).
Methods
Procedures
According to the website ebizMBA.com, the top five search engines of 2015 were, in order, Google, Bing, Yahoo, Ask, and AOL.com. Because different Internet search engines may produce unique results for the same query based on numerous factors (including an individual’s location and browsing history), top website hits for the sixteen disorders selected for analysis were explored using all five search engines. That is, each disorder was entered as a separate search on Google, Bing, Yahoo, Ask, and AOL.com. The authors then examined the top ten website results from each search. Only those sites that contained information about every disorder under investigation were selected. Disorders were selected based on their 12-month or lifetime prevalence in the U.S. population, and included: specific phobia (8.7%); substance abuse/addiction (8.2%); alcohol abuse/alcoholism (6.8%); social phobia (6.8%); major depressive disorder (MDD) (6.7%); attention deficit/hyperactivity disorder (ADHD) (4.1%); post-traumatic stress disorder (PTSD) (3.5%); generalized anxiety disorder (GAD) (3.1%); panic disorder (2.7%); bipolar disorder (2.6%); borderline personality disorder (1.6%); schizophrenia (1.1%); obsessive compulsive disorder (OCD) (1.0%); agoraphobia (0.8%); bulimia nervosa (0.3%); and anorexia nervosa (lifetime prevalence 0.6%) (Center for Behavioral Health Statistics and Quality, 2015; National Institute on Alcohol Abuse and Alcoholism, 2016; National Institute of Mental Health, 2016; Substance Abuse and Mental Health Services Administration, 2014).
This process of exploration and elimination resulted in the following list of six common websites that contain information for all disorders under investigation: Wikipedia.org (see: https://www.wikipedia.org), MayoClinic.org (see: https://www.mayoclinic.org), PsychCentral.com (see: https://psychcentral.com), MedicineNet.com (see: https://www.medicinenet.com/script/main/hp.asp), HealthLine.com (see: https://www.healthline.com), and WebMd.com (see: https://www.webmd.com).
Examination of the ‘About Us’ or ‘About’ tab on MayoClinic.org, PsychCentral.com, MedicineNet.com, HealthLine.com, and WebMd.com suggests that information on these sites is monitored and maintained by a team of editors, physicians, and other healthcare professionals. Unlike the other sites examined, Wikipedia.org is owned by the non-profit organization Wikimedia Foundation, and is not managed by a board of mental health professionals. Instead, users generate and edit most of the mental health content posted on the site. However, given its popularity, Internet users searching for medical and mental health conditions are often directed to Wikipedia for key information.
A selection of text from each website, for each disorder, was extracted and saved as unformatted text in a Word document by the first author during the last two weeks of October 2015. All commas, quotation marks, apostrophes, hyperlinks, references, and headings were removed from the text, as specified by common guidelines for readability analysis (DuBay, 2004). All bulleted lists and sentence fragments followed by a colon or semicolon were also removed. The final word count for selected texts ranged from approximately 150 to 600 words; the average number of words per sentence ranged from approximately 10 to 30 words; and the average number of syllables per word ranged from 1.5 to 2.5.
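To make the cleaning rules concrete, the following Python sketch applies the same kinds of removals to a raw extract; the function and its exact rules are illustrative assumptions, not a reproduction of the procedure actually used in this study.

```python
import re

def clean_extract(raw_text: str) -> str:
    """Illustrative cleanup mirroring the preprocessing described above:
    strip hyperlinks, commas, quotation marks, and apostrophes, and drop
    bulleted lines and fragments ending in a colon or semicolon.
    A sketch only, not the authors' actual script."""
    text = re.sub(r"https?://\S+", "", raw_text)  # remove hyperlinks
    for ch in (",", '"', "'"):
        text = text.replace(ch, "")
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(("-", "*", "•")):   # bulleted list item
            continue
        if stripped.endswith((":", ";")):          # sentence fragment
            continue
        kept.append(stripped)
    return " ".join(kept)
```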
Ideally, each selection of text would have been randomly selected from each site, with an identical word count. However, for some websites, information was presented on multiple pages, with only a small amount of text available per page. On these sites, website users were instructed to click through several pages in order to obtain more information about the disorder. On other sites, the information was formatted on a single page that the reader could scroll through to read more. For sites instructing users to click through multiple pages, text was extracted only from the first few pages describing the disorder. For sites presenting information on a single page, text was extracted from the top of the document. All text extracted was from the description of the disorder in question. This method was selected because the authors were interested in evaluating the information that is most readily available to readers on each website, and it was surmised that most Internet users will read the information that is presented first (or at the top of a document) before scrolling to other sections, or clicking through multiple pages of text.
Text was processed by pasting extractions into the appropriate field on the website read-able.com (see: https://www.webpagefx.com/tools/read-able/). This website was selected because it is one of many free, online readability calculators, and generates five different grade-level readability estimates. Typical readability formulas include the Automated Readability Index, Coleman-Liau Index, Simple Measure of Gobbledygook (SMOG), Gunning Frequency of Gobbledygook (FOG) score, and Flesch-Kincaid Grade Level (Friedman & Hoffman-Goetz, 2006). Readability estimates derived from this site were cross-referenced by the first author with estimates from indices available on readability-score.com (see: https://readable.io/) and Readability-formulas.com (see: http://www.readabilityformulas.com/). The specific indices examined were the Automated Readability Index, Coleman-Liau Readability Index, SMOG, Gunning Fog Grade Level, and Flesch-Kincaid Grade Level Index.
The Flesch-Kincaid Grade Level and SMOG indices generate an approximate grade level score indicating the U.S. school grade at which an average student can read the text. For example, a Flesch-Kincaid or SMOG score of 8.3 indicates that an average student in the eighth grade can read the text in question. Although the specific formulas for each index vary slightly, both scores depend on the number of syllables per word and the number of words per sentence in the text passage under investigation. Similarly, the Gunning Fog Grade Level utilizes average sentence length and the percentage of hard words (words of three or more syllables) to generate a grade level for written text; the ideal Fog score is 7 or 8, while materials that receive a score of 10 are considered hard, 15 difficult, and 20 very difficult (DuBay, 2004).
Unlike the aforementioned indices, the Coleman-Liau and Automated Readability Index generate a readability estimate that takes into consideration the number of characters per word, as well as the number of words per sentence. Hence, although each index employs a different mathematical formula to arrive at a grade level score, scores should largely be consistent across indices.
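For reference, all five indices can be computed from simple counts of sentences, words, characters, and syllables using their standard published formulas, as in the following Python sketch; the syllable counter here is a rough heuristic, so outputs will differ somewhat from those of the online calculators used in this study.

```python
import math
import re

def count_syllables(word: str) -> int:
    # Crude vowel-group heuristic; dedicated calculators use better rules.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def grade_levels(text: str) -> dict:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    n_words, n_chars = len(words), sum(len(w) for w in words)
    n_syllables = sum(count_syllables(w) for w in words)
    n_hard = sum(1 for w in words if count_syllables(w) >= 3)
    wps = n_words / sentences  # average words per sentence
    return {
        # Flesch-Kincaid: words per sentence and syllables per word
        "flesch_kincaid": 0.39 * wps + 11.8 * n_syllables / n_words - 15.59,
        # SMOG: polysyllabic word count, normalized to 30 sentences
        "smog": 1.0430 * math.sqrt(n_hard * 30 / sentences) + 3.1291,
        # Gunning Fog: sentence length plus percentage of hard words
        "gunning_fog": 0.4 * (wps + 100 * n_hard / n_words),
        # Coleman-Liau: letters and sentences per 100 words
        "coleman_liau": (0.0588 * (100 * n_chars / n_words)
                         - 0.296 * (100 * sentences / n_words) - 15.8),
        # Automated Readability Index: characters per word, words per sentence
        "ari": 4.71 * n_chars / n_words + 0.5 * wps - 21.43,
    }
```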
Statistical analyses
For the purposes of this analysis, each of the selected reading level indices served as a separate rater of the same excerpts of text. Hence, reading level scores were clustered by rater (index), with each rater examining a total of 96 excerpts of text, covering sixteen disorders from six different websites. Although the authors considered using the mean score across raters to explore differences in reading level estimates, this approach would have reduced the number of measurements from 480 (16 disorders × 6 websites × 5 raters) to 96 (16 disorders × 6 websites) observations. A population-averaged, or generalized estimating equations (GEE), approach was therefore utilized to explore systematic differences between websites and disorders in population-averaged reading grade level scores, while retaining distinct information from all raters in the final analysis.
GEEs are typically used to estimate population-averaged or marginal models that describe changes in the population mean of a given variable in relation to other important covariates, while also taking into account subject-specific non-independence among observations (Hubbard et al., 2010). This modeling strategy is flexible and can be applied when outcome data are continuous or discrete. Because reading grade level scores were continuous and normally distributed, an identity link function was specified, with an exchangeable working correlation matrix. Rater (readability index) was specified as the ‘subject’ variable, with the website and disorder variables predicting reading grade level scores. An interaction between website and disorder was also included in the model.
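Although the analyses reported below were conducted in SAS and SPSS, the model specification can be illustrated with an approximately equivalent sketch in Python's statsmodels; the file name and column names here are assumptions for illustration.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical long-format data: 480 rows, one reading grade level score
# per rater (index) x website x disorder combination.
# Assumed columns: grade, website, disorder, rater.
df = pd.read_csv("readability_long.csv")

# Gaussian family (identity link by default), exchangeable working
# correlation, clustering on rater, website x disorder interaction,
# mirroring the specification described above.
model = smf.gee(
    "grade ~ C(website) * C(disorder)",
    groups="rater",
    data=df,
    family=sm.families.Gaussian(),
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())
```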
Given immense variability in elements such as overall website quality, online author qualifications, and the generally complicated nature of various psychiatric conditions, the authors hypothesized that: 1) grade level estimates for different disorders would vary based on the website from which the text was derived; and 2) text from all websites would exceed the recommended 6th to 8th grade guidelines suggested by the CDC and the AMA.
Results
Preliminary Analyses
Data were analyzed using SAS Version 9.3 (SAS Institute, Cary, NC) and SPSS Version 21 (IBM, 2012). In order to determine the need for methods that account for within-subject clustering, the intraclass correlation coefficient (ICC) was first calculated for rater (index). The ICC can be conceptualized as a general measurement of agreement or consistency between two or more raters or measuring methods, where a value of 1 represents perfect agreement and a value of 0 represents no agreement at all (Shrout & Fleiss, 1979). Evidence for variability by index would suggest a clustering effect in the data that would need to be accounted for in all subsequent analyses.
A two-way random effects model was specified for rater in order to assess variability in reading level scores between raters. This model was selected because the same indices were used to assess all selections of text, and the indices were chosen from a population of available indices used to calculate reading grade level scores. The ICC(2) assumes that the variance of the raters serves to add noise to any ratings obtained, and that the mean of rater error is zero. Results indicated that the estimated reliability between indices was 82.1%, 95% CI [76.9, 86.6], using a consistency definition. Table 1 provides mean reading level scores for each index, averaged over websites and psychological conditions, with associated standard deviations from this preliminary analysis. As indicated in the table, reading level scores generated by the Gunning Fog index had the highest mean and the largest variability (averaged across websites and disorders), whereas scores generated by the SMOG index had the lowest mean and the smallest variability of the indices selected.
Table 1.
Mean Readability Grade-Level Rating, Averaged Over Website and Psychological Disorder, By Index
| Index | Mean | Std. Deviation |
|---|---|---|
| Auto Readability Index | 11.766 | 2.5710 |
| Coleman-Liau | 14.523 | 1.6711 |
| FK Grade Level | 12.263 | 2.2784 |
| Gunning Fog | 15.625 | 2.5924 |
| SMOG | 11.377 | 1.8263 |
Note: The five readability indices abbreviated in the first column are the Automated Readability Index, Coleman-Liau Index, Flesch-Kincaid Grade Level Index, Gunning Fog Grade Level, and the Simple Measure of Gobbledygook (SMOG).
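For readers who wish to reproduce this type of estimate, a two-way consistency ICC can be computed directly from the mean squares of a texts-by-raters score matrix, as in the following sketch; the data here are simulated, not the study data.

```python
import numpy as np

def icc_consistency(scores: np.ndarray):
    """Two-way consistency ICCs for an n_targets x k_raters matrix
    (Shrout & Fleiss, 1979); returns single-rater and average-of-k
    estimates, analogous to SPSS's 'consistency' definition."""
    n, k = scores.shape
    grand = scores.mean()
    # Between-targets mean square and two-way residual mean square
    ms_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum() / (n - 1)
    resid = (scores - scores.mean(axis=1, keepdims=True)
             - scores.mean(axis=0, keepdims=True) + grand)
    ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))
    single = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    average = (ms_rows - ms_err) / ms_rows
    return single, average

# Simulated example: 96 text excerpts scored by 5 readability indices.
rng = np.random.default_rng(0)
truth = rng.normal(13, 2.5, size=(96, 1))
ratings = truth + rng.normal(0, 1.0, size=(96, 5))
print(icc_consistency(ratings))
```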
Overall, the indices selected were largely consistent in their ratings of readability across disorders and websites, and thus demonstrated sufficient consistency for further analysis. Given that the researchers 1) were not interested in examining specific differences between raters (indices) across websites and disorders, and 2) wanted to increase power by retaining as much information as possible from the original dataset (collapsing the data by calculating a mean score for each disorder from each website would reduce the number of available data points from 480, with all raters considered separately, to 96 when scores are averaged), a GEE approach was utilized to account for any natural variation in outcomes attributable to rater effects.
Main Analyses
Reading level values extracted from the websites sampled ranged from 1.3 to 21.5, with a mean of 13.07 and standard deviation of 2.85 (N = 480). A one-sample t-test comparing the mean readability estimate obtained across all websites and disorders against the upper limit of the national 6th to 8th grade average indicated that the sample mean was significantly higher, t(479) = 38.92, p < .001, mean difference = 5.07, 95% CI of the difference [4.82, 5.31].
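A comparison of this form can be carried out with a standard one-sample t-test, as in the following sketch, which substitutes simulated scores for the study data.

```python
import numpy as np
from scipy import stats

# Simulated stand-in for the 480 observed scores; 8.0 is the upper
# bound of the recommended 6th to 8th grade range.
scores = np.random.default_rng(1).normal(13.07, 2.85, size=480)
result = stats.ttest_1samp(scores, popmean=8.0)
print(f"t({len(scores) - 1}) = {result.statistic:.2f}, "
      f"p = {result.pvalue:.3g}, "
      f"mean difference = {scores.mean() - 8.0:.2f}")
```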
Results from the GEE suggest a significant website by disorder interaction, χ2(4, 480) = 192.57, p < .001, when controlling for the presumed interdependencies between scores across indices. The main effects of disorder, χ2(4, 480) = 436.92, p < .001, and website, χ2(4, 480) = 1446.20, p < .001, were also significant at the .05 level. Significance tests for all reported pairwise comparisons were adjusted using the Holm-Bonferroni method (Holm, 1979). Figure 1 (available as a supplement online) presents the results of the interaction graphically, paneled by website. Tables 2 and 3 provide population-averaged mean readability estimates for disorders and websites (i.e., main effects), respectively, from the GEE model.
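The Holm-Bonferroni adjustment is available in standard statistical software; the following brief sketch applies it to a set of hypothetical p-values.

```python
from statsmodels.stats.multitest import multipletests

# Hypothetical unadjusted p-values from a set of pairwise comparisons.
raw_p = [0.001, 0.004, 0.019, 0.030, 0.047]
reject, p_holm, _, _ = multipletests(raw_p, alpha=0.05, method="holm")
print(list(zip(p_holm.round(3), reject)))
```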
Table 2.
Population-Averaged Readability Grade-Level Estimates, By Disorder, From GEE Model, With 95% Confidence Limits
| Disorder | Mean | 95% Wald CI (Lower) | 95% Wald CI (Upper) |
|---|---|---|---|
| ADHD | 12.970 | 11.512 | 14.428 |
| Agoraphobia | 11.563 | 9.788 | 13.339 |
| Alcoholism | 12.450 | 11.131 | 13.769 |
| Anorexia Nervosa | 12.793 | 11.311 | 14.276 |
| Bipolar | 14.083 | 12.319 | 15.848 |
| Borderline Personality Disorder | 14.157 | 12.747 | 15.567 |
| Bulimia Nervosa | 13.037 | 11.621 | 14.452 |
| GAD | 13.497 | 11.929 | 15.065 |
| MDD | 13.683 | 12.217 | 15.150 |
| OCD | 12.417 | 10.563 | 14.270 |
| Panic Disorder | 12.070 | 10.778 | 13.362 |
| PTSD | 13.280 | 11.924 | 14.636 |
| Schizophrenia | 14.013 | 12.362 | 15.665 |
| Social Phobia | 14.060 | 12.499 | 15.621 |
| Specific Phobia | 12.927 | 11.311 | 14.542 |
| Substance Abuse | 12.103 | 10.531 | 13.675 |
Table 3.
Population-Averaged Readability Grade-Level Estimates, By Website, From GEE Model, With 95% Confidence Limits
| Website | Mean | 95% Wald CI (Lower) | 95% Wald CI (Upper) |
|---|---|---|---|
| HealthLine | 11.985 | 10.342 | 13.628 |
| MayoClinic | 12.281 | 10.834 | 13.728 |
| MedicineNet | 15.364 | 13.846 | 16.882 |
| PsychCentral | 13.043 | 11.492 | 14.593 |
| WebMd | 12.578 | 11.046 | 14.109 |
| Wikipedia | 13.164 | 11.713 | 14.615 |
Table 4 presents the range of population-averaged reading grade level estimates for each disorder by website combination (interaction effects) from the GEE model. The full table, with 95% confidence intervals, is available in an online supplement. Exploration of specific interaction effects suggests that, in many instances, text obtained from MedicineNet.com and Wikipedia.org had the highest reading grade level estimates for the disorders under investigation, apart from text related to PTSD, for which WebMd.com had the highest estimate. Text describing borderline personality disorder (grade level = 17.9, 95% CI [16.09, 19.71]), bipolar disorder (grade level = 16.56, 95% CI [14.77, 18.35]), agoraphobia (grade level = 16.54, 95% CI [14.93, 18.15]), and alcoholism (grade level = 16.46, 95% CI [14.86, 18.15]) from MedicineNet.com demonstrated the highest estimated readability scores overall, whereas text describing agoraphobia (grade level = 5.62, 95% CI [4.17, 7.07]), ADHD (grade level = 8.9, 95% CI [7.21, 10.6]), and borderline personality disorder (grade level = 9.32, 95% CI [7.58, 11.06]) from WebMd.com demonstrated the lowest estimated readability scores. Aside from text related to agoraphobia from WebMd.com, reading grade level estimates for all disorders across websites were consistent with a high school to college reading level, and well exceeded the recommended 6th to 8th grade guidelines. Likewise, in only three instances did the lower bound of the 95% confidence interval for the readability estimate obtained from the GEE model fall below the 8th grade level (text related to ADHD and borderline personality disorder from WebMd.com, and agoraphobia from PsychCentral.com). Lastly, further examination of Figure 1 and Table 4 reveals that readability estimates across disorders were more variable for some websites than for others; for example, estimates were less variable for HealthLine and MedicineNet than for Wikipedia and WebMd.com.
Table 4.
Range of Population-Averaged Readability Grade-Level Estimates for Each Disorder, By Website, From GEE Model (95% CI are reported in supplemental table online)
| Disorder | MedicineNet | Wikipedia | PsychCentral | WebMd | MayoClinic | HealthLine |
|---|---|---|---|---|---|---|
| ADHD | 15.56 | 13.08 | 13.56 | 8.9 | 12.36 | 14.36 |
| Agoraphobia | 16.54 | 13.5 | 10.42 | 5.62 | 12.14 | 11.16 |
| Alcoholism | 16.46 | 11.2 | 12.46 | 13.42 | 10.18 | 10.98 |
| Anorexia Nervosa | 15.32 | 10.12 | 13 | 14.1 | 12.64 | 11.58 |
| Bipolar | 16.56 | 11.66 | 13.9 | 15.68 | 15.24 | 11.46 |
| Borderline Personality Disorder | 17.9 | 16.4 | 17.36 | 9.32 | 11.58 | 12.38 |
| Bulimia Nervosa | 15.02 | 10.4 | 13.96 | 14.98 | 12.04 | 11.82 |
| GAD | 16.04 | 15.3 | 11.78 | 12.4 | 12.86 | 12.6 |
| MDD | 14.18 | 15.04 | 11.72 | 14.6 | 11.76 | 14.8 |
| OCD | 14.9 | 11.72 | 13.44 | 11.78 | 11.66 | 11 |
| Panic Disorder | 12.52 | 14.74 | 10.62 | 12.3 | 10.7 | 11.54 |
| PTSD | 14.64 | 13.82 | 12.76 | 15.44 | 11.54 | 11.48 |
| Schizophrenia | 16.36 | 13.48 | 14.66 | 13.54 | 13.02 | 13.02 |
| Social Phobia | 15.1 | 15.64 | 12.6 | 14.44 | 14.78 | 11.8 |
| Specific Phobia | 16.1 | 10.32 | 13.48 | 13.62 | 12.88 | 11.16 |
| Substance Abuse | 12.62 | 14.2 | 12.96 | 11.1 | 11.12 | 10.62 |
Discussion
Overall, aside from a few key instances, the reading grade level for all disorders across the websites examined far exceeded the 6th to 8th grade reading level guidelines established by the CDC and other similar organizations. In some cases (e.g., text related to borderline personality disorder from MedicineNet.com), the estimated reading grade level reached as high as 17.9, suggesting that, on average, only individuals with graduate-level education (approximately grade 18) would be able to read the selected text effectively. In other instances, reading grade level estimates were much lower: text related to agoraphobia from WebMd.com was consistent with approximately a 6th grade reading level, and text related to ADHD from WebMd.com with approximately a 9th grade level, suggesting that an individual who had completed middle school could read the selected text. However, all other estimates obtained were markedly higher, requiring at minimum a high school reading level.
Interestingly, text related to borderline personality disorder demonstrated the highest reading grade level estimate, followed by text related to bipolar disorder, social phobia, schizophrenia, MDD, and GAD, in descending order of grade level. Examination of estimates for these disorders generally suggests that a post-high school reading level would be needed to effectively read the segments of text selected for analysis. Given the severity of impairment often associated with these disorders (particularly borderline personality disorder, bipolar disorder, and schizophrenia), it could be surmised that the information available online from the websites surveyed is relatively inaccessible not only to most healthy consumers, but especially to those struggling with serious mental illness. Indeed, as noted by Revheim et al. (2014), individuals with schizophrenia commonly display severe deficits in reading ability. Moreover, given such impairments among individuals with serious mental illness, Rotondi et al. (2007) suggest that most online sources of mental health information are not well-suited to the needs of this population.
Not surprisingly, little difference was noted in reading grade level estimates between MDD and bipolar disorder, as these disorders may share a common language regarding general symptoms of depression. Given similarities in language, symptom presentation, and etiology, there was likewise no notable difference in reading level scores between alcoholism and substance abuse, or between social phobia and specific phobia. However, this rationale could not be extended to text describing the two predominant eating disorders examined in this study: reading level estimates for bulimia nervosa were significantly higher than those for anorexia nervosa. Further exploration of text content may reveal an emphasis on different features, symptoms, or etiology for each disorder, which could contribute to these differences in reading level estimates.
Indeed, it is important to remember that this study only examined the readability of online public mental health materials, and did not explore the content (or meaning) of text extracted from the sites selected. Readability is an important first component in understanding whether the structure and form of written material is largely digestible by the average reader. Based on national statistics that suggest the reading grade level of the average American citizen is between the 6th to 8th grade (Kutner et al., 2006; Paasche-Orlow et al., 2005), materials describing mental health conditions, symptoms, and disorders that exceed this threshold may not be useful in helping the general population make important decisions about their own, or loved ones’ healthcare needs.
It is also vital to remember that although readability is an important first element in broadly distinguishing the level of education required to read a passage of text, reading comprehension is oftentimes two to three grade levels below an individual’s overall level of education or established reading grade level. This effect may be further exacerbated when an individual is under duress or struggling with a serious mental illness (McInnes & Haglund, 2011). As such, for the 77 million Americans with limited health literacy (America’s Health Literacy, 2008; Kutner et al., 2006; Paasche-Orlow et al., 2005), much of the mental health material currently available online may be both unreadable and incomprehensible. This has broad implications for perpetuating health disparities: access to publicly available mental health information is effectively limited to a small segment of the population who already possess above average health literacy, have better access to resources, and consequently, may have better health outcomes than those with low health literacy.
Examination of reading level estimates by website suggests that, on average, MedicineNet.com has the highest reading grade level, followed by Wikipedia.org, PsychCentral.com, WebMd.com, MayoClinic.org, and HealthLine.com, in descending order. There was no difference in reading level scores between PsychCentral.com and Wikipedia.org, or between Wikipedia.org and WebMd.com. However, PsychCentral.com had higher overall reading grade level estimates than WebMd.com, whereas estimates from WebMd.com were higher than those obtained from MayoClinic.org and HealthLine.com. No difference was noted between MayoClinic.org and HealthLine.com. Reading grade level estimates for all websites were consistent with a high school senior reading level or above.
As noted in the introduction, some of the variability in reading scores across websites may be due to website quality, as well as occupational differences among writers for these sites. That is, whereas sites such as WebMd.com, MayoClinic.org, and HealthLine.com are managed by teams of healthcare professionals who generate content, pages such as Wikipedia.org are user-generated. This suggests that writers of online content may differ substantially in training, education, experience, knowledge, and clinical expertise in working with psychiatric populations. Further investigation is necessary in order to determine the accuracy of information presented across user-generated and professionally managed healthcare sites. Likewise, clinicians who treat individuals with psychiatric disorders should be careful when prescribing online psychoeducational materials, as much of this information may not only be unreadable by the average American, but also potentially inaccurate.
In general, the results from this study provide valuable evidence that online information, procured from the most popular health-related websites, for 16 of the most prevalent mental health disorders and/or conditions, is written at a level that far exceeds the national reading grade level average. Writers of public mental health materials are well-advised to take great care in ensuring that the information provided to consumers is not only accurate, but also written in a manner that does not exacerbate existing health disparities by limiting access to knowledge to an already educated minority. Although this oversight is most likely unintentional (and can perhaps be tentatively attributed to a combination of factors, including the overall level of education of those writing public health materials and/or a general lack of awareness of statistics related to health literacy levels in the United States), failure to adhere to readability guidelines can have broad public health implications (America’s Health Literacy, 2008).
Lastly, it is important to consider the practical and methodological limitations of this study before making sweeping conclusions about the content of online public mental health materials. Clearly, individuals have a multitude of ways of arriving at the websites and disorders examined within the scope of this study. In many cases, searching for mental health information may begin by entering key words related to symptoms, rather than names of formal diagnoses. This study did not assess the mechanism by which people arrive at the websites selected, with the implied understanding that based on common search terms, people will eventually be funneled to a web page describing a disorder whose symptoms are consistent with their initial search terms.
Furthermore, this study is in no way a comprehensive review of all mental health diagnoses, nor does it sample all websites with available online mental health materials. The websites selected for analysis were chosen, in part, because they contain information specific to each disorder under investigation. Some prominent mental health websites, such as the National Institute of Mental Health (NIMH.NIH.gov), were not selected because they did not provide information specific to substance abuse disorders or alcoholism. Likewise, given the speed at which technology changes, it is possible that the search engines selected in October 2015 to conduct the initial investigation are no longer the most popular engines available.
From a methodological perspective, it may have been more robust to assess each block of text using additional readability indices, as well as to have multiple researchers select, clean, and process each block of text for enhanced inter-rater reliability. Although the primary researcher attempted to employ rigorous standards in selecting text for each disorder, it is possible that the selections may exhibit some bias. In addition, some variability in reading grade level scores could result from differences in the complexity of disorder names, which are inevitably included in the text selected from each site. That is, terms such as ‘schizophrenia’ or ‘borderline personality disorder’ are more complex, because they contain more characters and syllables, than terms such as ‘substance abuse’ or ‘panic disorder.’ Because readability formulas are based on characteristics of words rather than content or meaning, including these terms in the selected text may have biased readability ratings upward. Further studies might consider replacing the diagnosis term with a neutral or generic word in the selected text prior to analysis, in order to account for this natural variation.
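Such a substitution is straightforward to implement; the following brief sketch (with an illustrative function name and placeholder token) shows one way to do it.

```python
import re

def neutralize_diagnosis(text: str, diagnosis: str,
                         placeholder: str = "illness") -> str:
    """Swap a (possibly multi-word) diagnosis term for a short neutral
    token before scoring, so the term's length and syllable count do
    not inflate the readability estimate. Illustrative sketch only."""
    return re.sub(re.escape(diagnosis), placeholder, text,
                  flags=re.IGNORECASE)

# e.g., neutralize_diagnosis(excerpt, "borderline personality disorder")
```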
Another possible future direction for researchers interested in exploring the readability of online mental health information may be to compare how well readability estimates hold up when information is presented in other languages, particularly Spanish. Providing readable and comprehensible translations of online health information may help to prevent the exacerbation of existing health disparities by making important information available to a wider, and more diverse audience of consumers. Development of interactive websites tailored to meet diverse information needs may aid in this objective. Unfortunately, the scope of this paper was restricted to an examination of online information written in English, with the hope that a preliminary understanding of the basic construct of readability would inspire researchers, clinicians, policy makers, and/or stakeholders to launch further investigations into the uncharted domains of comprehension and accuracy, across languages.
In summary, despite these limitations, this study provides some initial evidence that current readability estimates for 16 of the most prevalent mental health disorders common to all sites surveyed are well above the 6th to 8th grade reading level guidelines suggested by the CDC and AMA. This information is important for researchers interested in conducting more rigorous explorations of online mental health materials, policy makers interested in decreasing health disparities amongst various socio-demographic groups, and editors of mental health websites dedicated to providing consumers with quality written health materials. Future directions for this work may include examination of online information for all existing mental health diagnoses, exploration of quality of content of written text, experimental manipulations of text with consumers in the laboratory, and/or evaluation of differences in comprehension for online information presented in written, versus auditory or interactive formats, in various language formats.
Supplementary Material
Acknowledgments
This research was supported in part by NIH grant G20RR030883 for Lisa L. Harlow.
References
- Albright J, de Guzman C, Acebo P, Paiva D, Faulkner M, & Swanson J (1996). Readability of patient education materials: Implications for clinical practice. Applied Nursing Research, 9(3), 139–143.
- America's Health Literacy: Why We Need Accessible Health Information. An Issue Brief from the U.S. Department of Health and Human Services (2008). Retrieved 7/15/2016 from: http://health.gov/communication/literacy/issuebrief/
- Baker DW, Parker RM, Williams MV, & Clark WS (1998). Health literacy and the risk of hospital admission. Journal of General Internal Medicine, 13(12), 791–798.
- Baker DW, Gazmararian JA, Williams MV, Scott T, Parker RM, Green D, … & Peel J (2002). Functional health literacy and the risk of hospital admission among Medicare managed care enrollees. American Journal of Public Health, 92(8), 1278–1283.
- Beaunoyer E, Arsenault M, Lomanowska AM, & Guitton MJ (2017). Understanding online health information: Evaluation, tools, and strategies. Patient Education and Counseling, 100(2), 183–189.
- Berkman ND, Sheridan SL, Donahue KE, Halpern DJ, & Crotty K (2011). Low health literacy and health outcomes: An updated systematic review. Annals of Internal Medicine, 155(2), 97–107.
- Berkman ND, Sheridan SL, Donahue KE, Halpern DJ, Viera A, Crotty K, … & Tant E (2011). Health literacy interventions and outcomes: An updated systematic review. Evidence Report/Technology Assessment, 199, 1–941.
- Brigo F, Otte WM, Igwe SC, Tezzon F, & Nardone R (2015). Clearly written, easily comprehended? The readability of websites providing information on epilepsy. Epilepsy & Behavior, 44, 35–39.
- Center for Behavioral Health Statistics and Quality (2015). Behavioral health trends in the United States: Results from the 2014 National Survey on Drug Use and Health (HHS Publication No. SMA 15-4927, NSDUH Series H-50). Retrieved from http://www.samhsa.gov/data/
- Colaco M, Svider PF, Agarwal N, Eloy JA, & Jackson IM (2013). Readability assessment of online urology patient education materials. The Journal of Urology, 189(3), 1048–1052.
- DuBay WH (2004). The Principles of Readability. Online Submission.
- Fox S, & Duggan M (2013). Health Online 2013. Pew Research Center. Retrieved August, 2015 from http://www.pewinternet.org/2013/01/15/health-online-2013/
- Friedman DB, & Hoffman-Goetz L (2006). A systematic review of readability and comprehension instruments used for print and web-based cancer information. Health Education & Behavior, 33(3), 352–373.
- Galbraith S, Daniel JA, & Vissel B (2010). A study of clustered data and approaches to its analysis. The Journal of Neuroscience, 30(32), 10601–10608.
- Gardiner JC, Luo Z, & Roman LA (2009). Fixed effects, random effects and GEE: What are the differences? Statistics in Medicine, 28(2), 221–239.
- Gordon MM, Hampson R, Capell HA, & Madhok R (2002). Illiteracy in rheumatoid arthritis patients as determined by the Rapid Estimate of Adult Literacy (REALM) score. Rheumatology, 41(7), 750–754.
- Holm S (1979). A simple sequential rejective multiple test procedure. Scandinavian Journal of Statistics, 6(2), 65–70.
- Hubbard AE, Ahern J, Fleischer NL, Van der Laan M, Lippman SA, Jewell N, … & Satariano WA (2010). To GEE or not to GEE: Comparing population average and mixed models for estimating the associations between neighborhood risk factors and health. Epidemiology, 21(4), 467–474.
- IBM Corp. (2012). IBM SPSS Statistics for Windows, Version 21.0. Armonk, NY: IBM Corp.
- Kindig DA, Panzer AM, & Nielsen-Bohlman L (Eds.). (2004). Health Literacy: A Prescription to End Confusion. National Academies Press.
- Kher A, Johnson S, & Griffith R (2017). Readability assessment of online patient education material on congestive heart failure. Advances in Preventive Medicine, 2017.
- Koo K, Shee K, & Yap RL (2017). Readability analysis of online health information about overactive bladder. Neurourology and Urodynamics, 36(7), 1782–1787.
- Kutner M, Greenberg E, Jin Y, & Paulsen C (2006). The Health Literacy of America's Adults: Results From the 2003 National Assessment of Adult Literacy (NCES 2006-483). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from: https://nces.ed.gov/pubs2006/2006483.pdf
- Liang KY, & Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73(1), 13–22.
- Manganello J, Gerstner G, Pergolino K, Graham Y, Falisi A, & Strogatz D (2017). The relationship of health literacy with use of digital technology for health information: Implications for public health practice. Journal of Public Health Management and Practice, 23(4), 380–387.
- McEnteggart GE, Naeem M, Skierkowski D, Baird GL, Ahn SH, & Soares G (2015). Readability of online patient education materials related to IR. Journal of Vascular and Interventional Radiology, 26(8), 1164–1168.
- McInnes N, & Haglund BJ (2011). Readability of online health information: Implications for health literacy. Informatics for Health and Social Care, 36(4), 173–189.
- Misra P, Agarwal N, Kasabwala K, Hansberry DR, Setzen M, & Eloy JA (2013). Readability analysis of healthcare-oriented education resources from the American Academy of Facial Plastic and Reconstructive Surgery. The Laryngoscope, 123(1), 90–96.
- National Institutes of Health (2001). Making health communication programs work. Retrieved 10/1/2017 from: https://www.cancer.gov/publications/health-communication/pink-book.pdf
- National Institutes of Health (2017). How to write easy-to-read health materials. Retrieved 1/18/2017 from: https://www.nlm.nih.gov/medlineplus/etr.html
- National Institute of Mental Health (2016). Statistics. Retrieved 7/15/2016 from: http://www.nimh.nih.gov/health/statistics/index.shtml
- National Institute on Alcohol Abuse and Alcoholism (2016). Alcohol facts and statistics. Retrieved 7/15/2016 from: http://www.niaaa.nih.gov/alcohol-health/overview-alcohol-consumption/alcohol-facts-and-statistics
- Neuhauser L, & Paul K (2011). Readability, comprehension, and usability. Communicating Risks and Benefits: An Evidence-Based User's Guide, 129–148.
- Paasche-Orlow MK, Parker RM, Gazmararian JA, Nielsen-Bohlman LT, & Rudd RR (2005). The prevalence of limited health literacy. Journal of General Internal Medicine, 20(2), 175–184.
- Perrin A, & Duggan M (2015). Americans' Internet Access: 2000–2015. Pew Research Center. Retrieved August, 2015 from http://www.pewinternet.org/files/2015/06/2015-06-26_internet-usage-across-demographics-discover_FINAL.pdf
- Revheim N, Corcoran CM, Dias E, Hellmann E, Martinez A, Butler PD, … & Javitt DC (2014). Reading deficits in schizophrenia and individuals at high clinical risk: Relationship to sensory function, course of illness, and psychosocial outcome. American Journal of Psychiatry, 171, 949–959.
- Rotondi AJ, Sinkule J, Haas GL, Spring MB, Litschge CM, Newhill CE, … & Anderson CM (2007). Designing websites for persons with cognitive deficits: Design and usability of a psychoeducational intervention for persons with severe mental illness. Psychological Services, 4(3), 202–224.
- Shrout PE, & Fleiss JL (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420–428.
- Substance Abuse and Mental Health Services Administration (2014). Results from the 2013 National Survey on Drug Use and Health: Summary of National Findings (NSDUH Series H-48, HHS Publication No. SMA 14-4863). Rockville, MD: Substance Abuse and Mental Health Services Administration.
- Svider PF, Agarwal N, Choudry OJ, Hajart AF, Baredes S, Liu JK, & Eloy JA (2013). Readability assessment of online patient education materials for academic otolaryngology-head and neck surgery departments. American Journal of Otolaryngology, 34(1), 31–35.
- The Joint Commission (2010). Advancing Effective Communication, Cultural Competence, and Patient- and Family-Centered Care: A Roadmap for Hospitals. Oakbrook Terrace, IL: The Joint Commission. Retrieved January, 2018 from: https://www.jointcommission.org/assets/1/6/ARoadmapforHospitalsfinalversion727.pdf
- Weiss BD (2003). Health literacy: A manual for clinicians. Chicago: American Medical Association Foundation and American Medical Association.
- Wong K, & Levi JR (2017). Partial Tonsillectomy: Content and Readability of Online Health Information. Annals of Otology, Rhinology & Laryngology, 126(3), 192–198.
- World Health Organization (WHO) (2009). 7th Global Conference on Health Promotion: Health literacy and health behavior. Retrieved August, 2015 from: http://www.who.int/healthpromotion/conferneces/7gchp/track2/en/