Author manuscript; available in PMC: 2011 Aug 27.
Published in final edited form as: Health Serv Res. 2010 May 24;45(4):1105–1120. doi: 10.1111/j.1475-6773.2010.01119.x

Short Assessment of Health Literacy-Spanish and English: A Comparable Test of Health Literacy for Spanish and English-Speakers

S Saroja
PMCID: PMC2910571  NIHMSID: NIHMS308785  PMID: 20500222

Abstract

Objective

The intent of the study was to develop and validate a comparable health literacy test for Spanish-speaking and English-speaking populations.

Study Design

The design of the instrument, named the Short Assessment of Health Literacy-Spanish and English (SAHL-S&E), combined a word recognition test, as appearing in the Rapid Estimate of Adult Literacy in Medicine (REALM), and a comprehension test using multiple-choice questions designed by an expert panel. We employed the item response theory in developing and validating the instrument.

Data Collection

Validation of SAHL-S&E involved testing and comparing the instrument with other health literacy instruments in a sample of 201 Spanish-speaking and 202 English-speaking subjects recruited from the Ambulatory Care Center at the University of North Carolina Healthcare System.

Principal Findings

Based on item response theory analysis, 18 items were retained in the comparable test. The Spanish version of the test, SAHL-S, was highly correlated with another Spanish health literacy instrument, SAHLSA (r = 0.88, p < 0.05). The English version, SAHL-E, had high correlations with REALM (r = 0.94, p < 0.05) and the English Test of Functional Health Literacy in Adults (r = 0.68, p < 0.05). Significant correlations were found between SAHL-S&E and years of schooling in both Spanish and English-speaking samples (r = 0.15 and r = 0.39, respectively). SAHL-S&E displayed satisfactory reliability of 0.80 and 0.89 in the Spanish and English-speaking samples, respectively. IRT analysis indicated that the SAHL-S&E score was highly reliable for individuals with a low level of health literacy.

Conclusions

The new instrument, SAHL-S&E, has good reliability and validity. It is particularly useful for identifying individuals with low health literacy and could be used in clinical or community settings to screen for low health literacy among Spanish and English speakers.

Keywords: Health literacy, test instrument, Spanish speakers, English speakers, SAHL-S&E


It is hardly news anymore that a significant proportion of adults in the United States have difficulty navigating the health care system and managing personal health issues because of inadequate health literacy or limited “capacity to obtain, process, and understand health information and services needed to make appropriate health decisions” (Seldon, Zorn, Ratzan, & Parker, 2000). Inadequate health literacy, as a growing body of research has shown, is a risk factor for patients’ difficulties in understanding health information and following medical instructions (Cho, Lee, Arozullah, & Crittenden, 2008; Davis et al., 2006; Gazmararian, Williams, Peel, & Baker, 2003; Parker, Ratzan, & Lurie, 2003), poor disease/self-management knowledge (Gazmararian et al., 2003), underuse of preventive services and routine physician and dental visits (Baker et al., 2004; Jones, Lee, & Rozier, 2007; Lindau, Basu, & Leitsch, 2006; Lindau et al., 2002; Rogers, Wallace, & Weiss, 2006; Scott, Gazmararian, Williams, & Baker, 2002), increased hospitalizations and medical costs (Baker et al., 2002; Howard, Gazmararian, & Parker, 2005), and high mortality rates (Sudore et al., 2006).

Identifying individuals with inadequate health literacy is difficult because characteristics such as age, educational attainment (i.e., years of schooling), and self-reported literacy skills do not reliably reflect an individual’s health literacy level (Bass, Wilson, Griffith, & Barnett, 2002; Davis, Jackson, George, et al., 1993; Davis, Arnold, Berkel, et al., 1996; Nurss, el-Kebbi, Gallina, et al., 1997). Over the years, several instruments, including the Test of Functional Health Literacy in Adults (TOFHLA), the Rapid Estimate of Adult Literacy in Medicine (REALM), and the Newest Vital Sign (NVS), have been developed to assess health literacy in the U.S. (Davis et al., 1993; Murphy, Davis, Long, Jackson, & Decker, 1993; Parker, Baker, Williams, & Nurss, 1995; Weiss et al., 2005). Most of the instruments, however, focus strongly on English-speaking populations and are inappropriate for assessing the health literacy level of Spanish-speakers. In the case of REALM, an attempt to develop a Spanish version failed because of the phonetic structure of the Spanish language (Nurss, Baker, David, Parker, & Williams, 1995; see footnote 1). Where a Spanish version is available, e.g., TOFHLA-Spanish, the Spanish instrument is usually developed using a rudimentary translation-and-back-translation technique and is not validated psychometrically. A recent study comparing the psychometric properties of the English and Spanish versions of shortened TOFHLA raised a significant concern about their comparability (Aguirre, Ebrahim, & Shea, 2005).

Our research team developed an easy-to-use health literacy test, the Short Assessment of Health Literacy for Spanish-speaking Adults (SAHLSA), for Spanish-speakers (Lee, Bender, Ruiz, & Cho, 2006). The SAHLSA contains 50 test items and has good psychometric qualities. It has been adopted in research and clinical practice in the U.S. (Keselman et al., 2007; Rosembla & Tse, 2006) and is being validated for use in Latin American countries (Huamán-Calderón, Quiliano-Terreros, & Vílchez-Román, 2009). Since the publication of SAHLSA, many users have expressed the need for an English version to allow comparisons of health literacy level between Spanish and English speakers for research and clinical purposes. In this paper, we report our subsequent effort to develop a comparable test for Spanish and English-speakers, named Short Assessment of Health Literacy-Spanish & English or SAHL-S&E, based on the same methods used in developing SAHLSA. The test contains 18 items and is easy to administer. In taking the test, examinees are asked to read aloud each of the 18 medical terms and then associate each term with another word similar in meaning to demonstrate comprehension. The following sections describe the development of the SAHL-S&E, the methods employed to validate the instrument, results of the validation, and recommendations for use of the instrument.

METHODS

Instrument Development

The test items in SAHL-S&E were selected from the Spanish and English versions of an instrument that contained the 66 medical terms in the Rapid Estimate of Adult Literacy in Medicine or REALM (Davis et al., 1993). As a departure from REALM, we incorporated in the instrument simple multiple-choice questions to assess the examinee’s comprehension. Specifically, two common, simple words were chosen to match each of the REALM medical terms (“don’t know” was also included as an option). One of the words was meaningfully associated with the REALM medical term and the other was not. The test is akin to one form of educational achievement testing: “defining,” which measures understanding or comprehension based on correct identification of a paraphrased version of an original concept, fact, principle, or procedure as presented during instruction (Haladyna, 1999). Because the purpose of the multiple-choice questions was to verify the comprehension of the given medical terms, examinees were instructed not to guess. The difficulty of the two added words was kept minimal so that any examinee with a low level of education could understand them.

As reported in Lee et al. (2006), the instrument was developed by an expert panel through a Delphi process. The panel consisted of five experts who were fluent in both English and Spanish and had extensive experience working with Spanish speakers in educational, medical, and public health settings. The panel first translated the 66 REALM medical terms into Spanish. The translation took into account both the dictionary definition and the commonality of usage in daily conversations. The panel then selected the key and distractor for each REALM medical term. The process produced both the English and Spanish drafts of the instrument. A pre-test with 10 English-speaking and 10 Spanish-speaking subjects found the drafts were appropriate, requiring no further change.

Field Test and Verification of the Association Questions

The field test was conducted with 202 English-speaking and 201 Spanish-speaking respondents, recruited at the Ambulatory Care Center of the University of North Carolina Healthcare System. To be eligible for participation in the study, the subjects had to meet the following criteria: (1) be fluent in either English or Spanish; (2) be aged 18 or over but less than 80 years old; (3) have no obvious signs of cognitive impairment; (4) have no vision or hearing problems; and (5) show no sign of drug or alcohol intoxication. The research protocol was approved by the Institutional Review Board at the School of Public Health, the University of North Carolina at Chapel Hill.

The two groups of respondents had similar gender composition, with female respondents representing approximately 56% of the total sample. On average, Spanish-speaking respondents tended to be younger (34.2 versus 43.7 years) and have fewer years of schooling (10.1 versus 13.0 years) than English-speaking respondents. The interview was conducted by six trained bilingual interviewers using a questionnaire that included the 66 test items and questions regarding the respondents’ demographic attributes (i.e., years of schooling, gender, age, and marital status). Also included in the questionnaire was the Test of Functional Health Literacy in Adults or TOFHLA, used as a comparison in instrument validation.

Using data collected from English-speaking respondents, we were able to verify the design and selection of words for the association (comprehension) test in the instrument. The verification was based on the correlation between the REALM score and the association test score. A high correlation (r = 0.76) was found, suggesting the design of the association test was adequate.

Psychometric Assessment and Selection of Comparable Items for Spanish and English-Speakers

For the purpose of developing a comparable test for Spanish and English-speakers, we employed item response theory (IRT). IRT is a modern, model-based, and item-oriented psychometric approach to scale development. In addition to testing the psychometric qualities of test items, it has the capability of examining the equivalence of test items between groups, thereby allowing the development of comparable tests (Ellis & Mead, 2002; Embretson & Reise, 2000).

IRT assumes that responses to items are related to a single underlying latent variable. We examined this assumption using both exploratory and confirmatory factor analyses of the inter-item tetrachoric correlation matrix via the WLSMV estimator in the software Mplus (Muthén & Muthén, 2008). Initially, exploratory factor analysis, including the scree plot, was conducted to determine the necessary number of factors to achieve adequate model fit (using evaluation of common fit indices and comparisons of eigenvalues) (Hambleton & Rovinelli, 1986). Confirmatory factor analysis was then performed to confirm unidimensionality.

We then performed IRT to calibrate the test items in the Spanish and English versions of the original 66-item instrument. IRT assumes that an examinee’s response to an item on a test is related to a latent trait (θ), which the test is presumed to measure. It also assumes that the relationship can be represented by a mathematical function (usually an s-shaped, logistic function) known as an item characteristic curve (ICC). The ICCs of dichotomously scored items are commonly evaluated using the three-, two-, and one-parameter logistic models (3PLM, 2PLM, and 1PLM). The 3PLM is written as:

$$P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\{-D a_i(\theta - b_i)\}},$$

where $P_i(\theta)$ is the probability that an examinee with ability $\theta$ (in this case, health literacy) answers item $i$ correctly; $a_i$ is the discrimination parameter indicating the degree to which small differences in ability are associated with different probabilities of correctly answering item $i$; $b_i$ is the difficulty parameter corresponding to the ability level associated with a .50 probability of answering item $i$ correctly in the absence of guessing; $c_i$ is the guessing parameter or the probability that an examinee who is infinitely low on the ability answers item $i$ correctly; and $D$ is a scaling constant of 1.7 used to transform the metric from logistic to normal. The 2PLM assumes no guessing and estimates item difficulty and discrimination. The 1PLM estimates item difficulty only and assumes that the discrimination parameter is equal across items. The 2PLM and 3PLM usually provide a better fit for dichotomous items (Embretson & Reise, 2000). We examined the relative fit of the two models and estimated the parameters using the MULTILOG program (Thissen, 1991).
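As an illustration only, the item characteristic curve defined above can be evaluated in a few lines of Python. The function name is ours, and the example reuses the English "Dose" estimates reported in Table 1 (a = 1.08, b = −1.67); whether those estimates were reported on the D = 1.7 metric is an assumption of this sketch, not something the text states.

```python
import math

D = 1.7  # scaling constant given in the text


def icc_3pl(theta, a, b, c=0.0, d=D):
    """Probability of a correct response under the 3PLM.

    Setting c = 0 gives the 2PLM. The function name and signature are
    illustrative and do not come from MULTILOG.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))


# English "Dose" item from Table 1, treated as a 2PLM item (c = 0).
for theta in (-3.0, -1.67, 0.0):
    print(theta, round(icc_3pl(theta, a=1.08, b=-1.67), 3))
```

With c = 0, the curve passes through .50 at θ = b; with a nonzero guessing parameter, the probability at θ = b rises to (1 + c)/2.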

In order to create a comparable health literacy test, the psychometric properties of the items must be shown to be equal in both the Spanish and English-speaking samples. In IRT, the test of differential item functioning (DIF) is used to assess whether item discrepancy exists between separate groups (Embretson & Reise, 2000). In the case of the 2PL model, for example, DIF may occur for either the discrimination or difficulty parameter. DIF in a discrimination parameter indicates that an item is more representative of the underlying construct in one group than the other. DIF in a difficulty parameter suggests that an item is more or less difficult in one group than the other, after accounting for overall group differences. In the context of this study, DIF could be interpreted as a Spanish-to-English, or vice versa, translation effect or a potential cultural difference (Orlando & Marshall, 2002). Ignoring DIF, therefore, could lead to incorrect conclusions about group differences or similarities.

DIF could also be viewed as an approach to ensuring “construct consistency” between samples. DIF on an item necessarily indicates that the construct the item is intended to measure is different between groups. When items with DIF are eliminated, we are left with a set of items that are measuring the same construct in practice. Thus, our goal was to identify items that were DIF-limited so that they could be administered to Spanish and English-speakers. DIF analysis was performed using the IRT-LR DIF procedure in the software IRTLRDIF (Thissen, 2001).
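The likelihood-ratio logic behind an IRT-LR DIF test can be sketched as follows. This is a generic illustration, not the IRTLRDIF implementation: it assumes the log-likelihoods of a constrained model (item parameters equal across groups) and a free model (item parameters allowed to differ) are already available, and the function name and numbers are hypothetical.

```python
import math


def lr_dif_test(loglik_constrained, loglik_free, df):
    """Likelihood-ratio statistic G^2 and its chi-square p-value.

    df is the number of item parameters freed across groups, e.g. 2 when
    both a and b of a 2PLM item are freed. Closed-form p-values are coded
    only for df = 1 and df = 2.
    """
    g2 = max(-2.0 * (loglik_constrained - loglik_free), 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(g2 / 2.0))
    elif df == 2:
        p_value = math.exp(-g2 / 2.0)
    else:
        raise ValueError("only df = 1 or df = 2 is supported in this sketch")
    return g2, p_value


# Hypothetical log-likelihoods, for illustration only (not study results).
g2, p = lr_dif_test(loglik_constrained=-1000.0, loglik_free=-997.0, df=2)
print(g2, p)
```

A small p-value indicates that freeing the item's parameters across language groups improves fit, which is the evidence of DIF.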

Validity and Reliability Tests

Construct validity and reliability of the comparable test were also examined. In testing construct validity, we performed the following analyses: (1) correlating the Spanish version of the test to SAHLSA and Spanish TOFHLA (see footnote 2), (2) correlating the English version of the test to REALM and English TOFHLA (see footnote 3), and (3) correlating the examinee’s test score to his/her educational attainment (i.e., years of schooling).

Reliability was examined using two approaches. First, we calculated Cronbach’s alpha for each version of the test. Cronbach’s alpha, a measure of internal reliability, indicates the extent to which the reliability of the test scores was similar across samples. Second, using an IRT-based approach, test information was computed. Differing from the traditional reliability coefficients (e.g., Cronbach’s alpha), test information reflects how reliably (or precisely) the SAHL-S&E items measure health literacy across the range of literacy (Ellis & Mead, 2002; Embretson & Reise, 2000).
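For readers unfamiliar with the first approach, Cronbach's alpha for dichotomously scored items can be computed as in the sketch below. The response matrix is invented for illustration, and the function is a textbook formula rather than the software used in the study.

```python
def cronbach_alpha(responses):
    """Cronbach's alpha for a list of examinee rows of item scores (0/1)."""
    n_items = len(responses[0])

    def sample_variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    item_variances = [
        sample_variance([row[j] for row in responses]) for j in range(n_items)
    ]
    total_variance = sample_variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical responses from five examinees to four items (not study data).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(cronbach_alpha(toy))
```

Alpha summarizes reliability in a single number for the whole sample, which is why the IRT-based test information is used alongside it to show precision at different literacy levels.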

RESULTS

Examination of Unidimensionality

Prior to conducting factor analysis, three of the 66 items—“flu,” “cancer,” and “eye”—were removed in both the Spanish and English-speaking samples, because more than 98% of the respondents provided correct responses, indicating that those items provided little useful information. For the remaining 63 items in each sample, comparisons of fit indices and interpretability of communalities indicated that a one factor model fit better than did models with more or fewer factors. Additionally, scree plots show a clear dominance of the first factor. In the Spanish-speaking sample, the eigenvalue for the first factor of the 63 items was over four times larger than that of the second largest, and the second largest eigenvalue was similar to the smaller ones, suggesting the items were indicators of a common, latent factor. Similarly, the eigenvalue of the first factor in the English-speaking sample was over eight times larger than that of the second largest factor (Appendix s1).

Results of confirmatory factor analysis also indicated generally good fit of the single-factor model (i.e., unidimensionality) in both the Spanish and English-speaking samples. For the Spanish-speaking sample, the single factor model had a χ2 value of 76 (df = 55, p = 0.030), TLI = 0.935, and RMSEA = 0.044. The corresponding fit indices for the English-speaking samples were: χ2 = 61 (df = 45, p = 0.058), TLI = 0.989, RMSEA = 0.042.

Item Calibration

IRT was conducted separately for the remaining 63 items in each sample. Results from likelihood ratio tests indicated that the 2PLM provided the best fit, suggesting that the effect of guessing was minimal.

Following Lee et al. (2006), we considered items with a discrimination parameter greater than 1.0 but less than 3.0 (to ensure all items reasonably discriminated between individuals) and a difficulty parameter between −3.0 and +3.0 to be satisfactory. Using these criteria, 17 additional items were removed from the English version of the instrument. Notably, most of the removed items had discrimination parameters greater than 3.0. Sixteen items (not necessarily the same) were also removed from the Spanish version. The majority of these items had discrimination parameters less than 1.0 or threshold parameters less than −3.0. Of the remaining items, 32 appeared in both versions of the instrument.
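The screening rule just described amounts to a simple filter applied to each language version, followed by an intersection of the surviving items. The sketch below uses hypothetical item keys and parameter values, and it assumes the difficulty bounds are inclusive, which the text does not specify.

```python
def is_satisfactory(a, b, a_min=1.0, a_max=3.0, b_min=-3.0, b_max=3.0):
    """True when an item meets the discrimination and difficulty criteria."""
    return a_min < a < a_max and b_min <= b <= b_max


def common_retained(english, spanish):
    """Return the item keys that pass the screen in both language versions.

    Each argument maps an item key to its (a, b) estimates.
    """
    kept_en = {k for k, (a, b) in english.items() if is_satisfactory(a, b)}
    kept_es = {k for k, (a, b) in spanish.items() if is_satisfactory(a, b)}
    return sorted(kept_en & kept_es)


# Hypothetical estimates, for illustration only (not study data).
english = {"item_1": (1.6, -1.2), "item_2": (3.4, -0.5), "item_3": (1.2, -2.0)}
spanish = {"item_1": (1.4, -1.5), "item_2": (2.1, -0.9), "item_3": (0.7, -3.4)}
print(common_retained(english, spanish))  # ['item_1']
```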

Differential Item Functioning (DIF) Test

To determine the final set of items for inclusion in the comparable health literacy test, DIF analysis was conducted on the 32 common items. Because of the number of statistical tests involved in determining DIF (in this case 32), the Benjamini-Hochberg (B-H) correction was used to control for multiple comparisons (Benjamini & Hochberg, 1995). Results indicated that 14 of the 32 items had significant DIF (Table 1). The remaining 18 items comprised the comparable health literacy test, which we named the Short Assessment of Health Literacy-Spanish and English or SAHL-S&E.
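The Benjamini-Hochberg step-up procedure can be sketched as below. The p-values are made up for illustration, and the false discovery rate of 0.05 is our assumption, chosen to match the significance level reported with Table 1.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True if rejected at false discovery rate q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at or below k * q / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_rank = rank

    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four DIF tests (not study results).
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))  # [True, False, False, False]
```

In this toy example only the smallest p-value falls below its rank-specific threshold; a plain 0.05 cutoff would have flagged three of the four tests.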

Table 1.

Results of the DIF Analysis

English item    a      b       Spanish item        a      b       DIF
Dose            1.08   −1.67   Dosis               1.37   −1.95
Nerves          2.17   −1.35   Nervios             1.75   −1.69
Kidney          1.72   −3.10   Riñón               2.01   −2.05
Hormones        1.48   −1.65   Hormonas            1.41   −1.45
Herpes          1.38   −2.53   Herpes              1.28   −0.90   *
Caffeine        1.03   −2.49   Cafeína             0.92   −1.17   *
Incest          1.90   −1.27   Incesto             1.46    0.15   *
Asthma          2.10   −1.57   Asma                3.21   −2.26   *
Seizure         1.49   −1.76   Convulsiones        1.25   −2.39
Depression      1.63   −2.39   Depresión           2.13   −1.49   *
Infection       2.18   −2.27   Infección           1.57   −2.36
Pregnancy       1.88   −1.92   Embarazo            2.00   −1.80
Syphilis        0.93   −0.14   Sífilis             1.50    0.13
Abnormal        1.70   −1.54   Abnormal            1.36   −1.36
Nutrition       1.21   −2.28   Nutrición           2.38   −1.59
Miscarriage     1.56   −2.28   Aborto espontáneo   1.68   −1.93
Hemorrhoids     1.17   −0.84   Hemorroides         1.47   −1.13
Directed        1.91   −1.43   Indicado            1.09   −1.47
Irritation      1.41   −1.46   Irritación          1.03   −1.04   *
Alcoholism      1.59   −2.00   Alcoholismo         1.94   −2.32
Sexually        0.63    0.19   Sexualmente         1.06   −1.87   *
Colitis         1.55    0.80   Colitis             1.20   −0.56   *
Testicle        1.51   −1.31   Testículo           1.27   −0.72   *
Occupation      1.63   −2.37   Empleo              2.30   −2.42
Constipation    1.51   −1.25   Estreñimiento       1.25   −1.90
Medication      1.51   −2.31   Medicamento         1.88   −2.36
Diagnosis       1.36   −1.23   Diagnóstico         1.85   −1.28
Osteoporosis    1.77   −0.16   Osteoporosis        0.95   −1.48   *
Prostate        1.08   −1.53   Próstata            1.23   −0.58   *
Hepatitis       0.81   −1.55   Hepatitis           1.60   −0.67   *
Anemia          1.93   −0.66   Anemia              1.46   −2.18   *
Obesity         1.59   −1.28   Obesidad            2.29   −1.53   *

a = discrimination parameter; b = difficulty parameter.

* Indicates a significant difference (p < 0.05) in item parameters between the Spanish and English-speaking samples using the B-H correction.

Validity and Reliability Tests

SAHL-S was highly correlated with SAHLSA (r = 0.88, p < 0.05) and Spanish TOFHLA (r = 0.62, p < 0.05) in the Spanish-speaking sample. SAHL-E also had high correlations with REALM (r = 0.94, p < 0.05) and English TOFHLA (r = 0.68, p < 0.05) in the English-speaking sample. Significant correlations were also found between SAHL-S&E and years of schooling in both the Spanish and English-speaking samples (r = 0.15, p < 0.05 and r = 0.39, p < 0.05, respectively).

SAHL-S&E displayed satisfactory reliability of 0.80 and 0.89 in the Spanish and English-speaking samples, respectively. The test information function indicates that scores on the SAHL-S&E are highly reliable (i.e., greater than α = 0.90) for individuals with a low level of health literacy (i.e., between approximately 3 and 1 standard deviations below the mean) (Appendix 2).
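To show how a test information function yields level-specific reliability, the sketch below sums 2PLM item information over the English parameters of the 18 items in Table 1 without significant DIF. Several choices here are our assumptions rather than the study's procedure: the function names, treating the Table 1 estimates as being on the D = 1.7 metric, and converting information to reliability as 1 − 1/I(θ), which presumes a latent trait with unit variance. The values it produces are therefore illustrative and are not meant to reproduce Appendix 2.

```python
import math

# English (a, b) estimates for the 18 retained items, copied from Table 1.
ITEMS_EN = [
    (1.08, -1.67), (2.17, -1.35), (1.72, -3.10), (1.48, -1.65), (1.49, -1.76),
    (2.18, -2.27), (1.88, -1.92), (0.93, -0.14), (1.70, -1.54), (1.21, -2.28),
    (1.56, -2.28), (1.17, -0.84), (1.91, -1.43), (1.59, -2.00), (1.63, -2.37),
    (1.51, -1.25), (1.51, -2.31), (1.36, -1.23),
]


def scale_information(theta, items, d=1.7):
    """Sum of 2PLM item information, (d*a)^2 * P * (1 - P), at ability theta."""
    total = 0.0
    for a, b in items:
        p = 1.0 / (1.0 + math.exp(-d * a * (theta - b)))
        total += (d * a) ** 2 * p * (1.0 - p)
    return total


def conditional_reliability(theta, items, d=1.7):
    """Approximate reliability at theta, assuming unit trait variance."""
    return 1.0 - 1.0 / scale_information(theta, items, d)


for theta in (-3.0, -2.0, -1.0, 0.0, 1.0):
    print(theta, round(scale_information(theta, ITEMS_EN), 2))
```

Because most item difficulties lie between −2.4 and −1.2, information concentrates below the mean, which is consistent with the statement that the test is most precise for examinees with low health literacy.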

Finally, we examined the plot of SAHL-S&E scores vis-à-vis SAHLSA-50, English TOFHLA, and REALM scores and determined that subjects with a SAHL-S&E score between 0 and 14 had a significant chance (76% to 85%) of being classified as having low health literacy based on these other instruments. Additional analyses of association confirmed that SAHL-S&E ≤ 14 represented a proper cutoff point for low health literacy. Based on this criterion, 54 (27.0%) of the Spanish speakers and 48 (23.8%) of the English speakers in our sample had a low level of health literacy.
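Applying the cutoff in practice is straightforward, as the following sketch shows. It assumes each of the 18 items has already been scored 0 or 1 according to the user guide; the function names are ours.

```python
N_ITEMS = 18      # number of items in SAHL-S&E
LOW_CUTOFF = 14   # scores at or below this value indicate low health literacy


def total_score(item_scores):
    """Sum the 0/1 item scores after checking that all 18 items are present."""
    if len(item_scores) != N_ITEMS:
        raise ValueError("expected exactly 18 item scores")
    if any(score not in (0, 1) for score in item_scores):
        raise ValueError("item scores must be 0 or 1")
    return sum(item_scores)


def has_low_health_literacy(score, cutoff=LOW_CUTOFF):
    """True when the total score falls at or below the cutoff."""
    return score <= cutoff


# Hypothetical examinee with 14 items correct.
example = [1] * 14 + [0] * 4
print(total_score(example), has_low_health_literacy(total_score(example)))  # 14 True
```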

DISCUSSION

This paper reports the development of SAHL-S&E, designed to provide a comparable test of health literacy for Spanish-speaking and English-speaking populations. Results show that the instrument has good validity and reliability. Guessing does not appear to be a concern if clear instruction is given before the test. The instrument contains only 18 items and is easy to administer. We estimate that the administration would take only 1 to 2 minutes and require minimal training. (The Spanish and English versions of SAHL-S&E and the user guides are included in the Appendix.) A rather high cutoff point is found for low health literacy (≤14), suggesting that the SAHL-S&E is particularly useful for identifying individuals with low health literacy. The test information function confirms that the instrument is highly reliable at the lower range of scores.

In validating the instrument, we found that SAHL-S had a higher correlation with SAHLSA than with Spanish TOFHLA. Similarly, the correlation between SAHL-E and REALM was higher than that between SAHL-E and English TOFHLA. The findings may reflect the fact that the design of SAHL-S&E, essentially a word recognition test, is the same as SAHLSA and similar to REALM. We also found that the resulting instrument had a higher correlation with years of schooling in the English-speaking sample. There are two plausible explanations. First, in comparison to Spanish-speakers whose education was obtained in varying countries and systems, the education experience of English-speakers may be more homogeneous. Second, the format of the instrument (a pronunciation test and a multiple choice test for comprehension) may be more consistent with the standard testing in the U.S. education system.

Several limitations are worth noting. The instrument was developed based on standard, “dictionary” Spanish and English. Further testing of the instrument may be needed in different Latino and English-speaking subpopulations who are accustomed to using different idiomatic expressions. As with other health literacy instruments such as TOFHLA and REALM, SAHL-S&E is a reading test. It assesses specifically an individual’s reading skill in the health care context. The design is based on the assumption that reading ability is a basic literacy skill, without which patients would have difficulty functioning in and negotiating the health care system. However reasonable the assumption is, it should be noted that the instrument does not capture other skills, such as numeracy and communication, that may also be important in health care. Furthermore, similar to prior instrument development studies, our study did not include a random, representative sample of Spanish-speakers and English-speakers in the community. The hospital-based participants recruited for the study may be more receptive to a health literacy test. What kind of difficulties may arise in applying the SAHL-S&E to a community-based sample remains to be evaluated. Finally, as we have noted, the instrument is particularly suitable for identifying individuals with low health literacy. For individuals with a score above 14, the instrument may not be sensitive enough to distinguish different health literacy levels.

Despite these limitations, the instrument has several practical applications. First, unlike other instruments, the comparability between the Spanish and English versions of the instrument is established through rigorous psychometric evaluation. It offers a reliable way to assess and compare the level of low health literacy between Spanish and English speakers.

Second, the instrument can be used to screen for individual health literacy level in public health and clinical settings that serve a high concentration of English-speaking or Spanish-speaking patients or a mixed patient population. Although the value of health literacy screening is debatable, two recent studies suggest that patients are not averse to health literacy screening if protection of personal information is exercised (Ryan et al., 2008; VanGeest, Welch, & Weinber, 2010). Being able to identify patients with low health literacy can alert health care providers to the possibility that these patients may have difficulty with printed educational materials or highly scientific explanations of complex medical conditions (Bass, Wilson, Griffith, & Barnett, 2002; Chew, Bradley, & Boyko, 2004). Increased awareness among health care practitioners of the special health and personal needs of low health literacy patients may help reduce the level of linguistic complexity used in provider-patient communications, thus preventing serious medical errors due to misunderstanding. This, in turn, has the potential to improve quality of care and reduce health care costs.

Third, the instrument could be used to assess the level of health literacy in the community. The information could be used to guide the design of appropriate health educational materials (written and/or multimedia) or to devise community intervention programs that are compatible with the health literacy level of the local population (Brandes, 1996; Davis, Michielutte, Askov, Williams, & Weiss, 1998).

Finally, a comparable health literacy instrument for Spanish and English speakers would facilitate comparisons in research. Instead of stratifying subjects on language in health literacy research, researchers could combine samples and use SAHL-S&E to identify those with low health literacy in their analysis.

Supplementary Material

Supplement 1 and 2

Footnotes

1

In comparison to English, Spanish has regular phoneme-grapheme correspondence, meaning that one sound is usually represented by one letter and vice versa. Therefore, it is relatively easy to pronounce words in Spanish so long as one can recognize letters, and a low-level reader can usually score high on a word recognition test. This feature of the Spanish language violates the design basis of the REALM that there exists a high correspondence between reading ability and comprehension.

2

In a previous study, the SAHLSA score was found to be significantly and positively associated with the physical health status of Spanish-speaking subjects (p<0.05), holding constant age and years of education (Lee et al., 2006). The instrument also displayed high internal reliability (Cronbach alpha=0.92) and test-retest reliability (Pearson r=0.86).

3

REALM has good correlation scores, ranging from 0.88 to 0.97, with 3 other general reading tests. Its test-retest reliability is 0.99 (Davis et al., 1993). English TOFHLA has a high correlation with REALM (r = 0.84). Its test-retest reliability is 0.98 (Parker et al., 1995).

REFERENCES

  1. Aguirre AC, Ebrahim N, Shea JA. Performance of the English and Spanish S-TOFHLA among publicly insured Medicaid and Medicare patients. Patient Education and Counseling. 2005;56(3):332–339. doi: 10.1016/j.pec.2004.03.007. [DOI] [PubMed] [Google Scholar]
  2. Baker DW, Gazmararian JA, Williams MV, Scott T, Parker RM, Green D, et al. Functional health literacy and the risk of hospital admission among Medicare managed care enrollees. Am J Public Health. 2002;92(8):1278–1283. doi: 10.2105/ajph.92.8.1278. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Baker DW, Gazmararian JA, Williams MV, Scott T, Parker RM, Green D, et al. Health literacy and use of outpatient physician services by Medicare managed care enrollees. J Gen Intern Med. 2004;19(3):215–220. doi: 10.1111/j.1525-1497.2004.21130.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Bass PFI, Wilson JF, Griffith CH, Barnett DR. Residents’ ability to identify patients with poor literacy skills. Academic Medicine. 2002;77(10):1039–1041. doi: 10.1097/00001888-200210000-00021. [DOI] [PubMed] [Google Scholar]
  5. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of Royal Statistical Society: Series B. 1995;57:289–300. [Google Scholar]
  6. Brandes WL. Literacy, Health and the Law: An Exploration of the Law and the Plight of Marginal Readers within the Health Care System: Advocating for Patients and Providers. Health Promotion Council of Southeastern Pennsylvania, Inc.; Philadelphia, PA: 1996. [Google Scholar]
  7. Chew LD, Bradley KA, Boyko EJ. Brief questions to identify patients with inadequate health literacy. Family Medicine. 2004;36:588–594. [PubMed] [Google Scholar]
  8. Cho YI, Lee SY, Arozullah AM, Crittenden KS. Effects of health literacy on health status and health service utilization amongst the elderly. Soc Sci Med. 2008;66(8):1809–1816. doi: 10.1016/j.socscimed.2008.01.003. [DOI] [PubMed] [Google Scholar]
  9. Davis TC, Long SW, Jackson RH, Mayeaux EJ, George RB, Murphy PW, et al. Rapid estimate of adult literacy in medicine: A shortened screening instrument. Family Medicine. 1993;25(6):391–395. [PubMed] [Google Scholar]
  10. Davis TC, Michielutte R, Askov EN, Williams MV, Weiss BD. Practical assessment of adult literacy in health care. Health Education & Behavior. 1998;25(5):613–624. doi: 10.1177/109019819802500508. [DOI] [PubMed] [Google Scholar]
  11. Davis TC, Wolf MS, Bass PF, Middlebrooks M, Kennen E, Baker DW, et al. Low Literacy Impairs Comprehension of Prescription Drug Warning Labels. Journal of General Internal Medicine. 2006;21(8):847–851. doi: 10.1111/j.1525-1497.2006.00529.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Doak CC, Doak LG, Root JH. Teaching Patients with Low-Literacy Skills. 2nd ed JB Lipincott; Philadelphia, PA: 1996. [Google Scholar]
  13. Ellis BB, Mead AD. Item analysis: Theory and practice using classical and modern test theory. In: Rogelberg SG, editor. Handbook of Research Methods in Industrial and Organizational Psychology. Blackwell; Malden, MA: 2002. pp. 324–343. [Google Scholar]
  14. Embretson SE, Reise SP. Item Response Theory for Psychologists. Erlbaum; Hillsdale, NJ: 2000. [Google Scholar]
  15. Gazmararian JA, Williams MV, Peel J, Baker DW. Health literacy and knowledge of chronic disease. Patient Educ Couns. 2003;51(3):267–275. doi: 10.1016/s0738-3991(02)00239-2. [DOI] [PubMed] [Google Scholar]
  16. Haladyna TM. Developing and Validating Multiple-Choice Test Items. 2nd ed. Lawrence Erlbaum Associates; Mahwah, NJ: 1999. [Google Scholar]
  17. Hambleton RK, Rovinelli RJ. Assessing the dimensionality of a set of test items. Applied Psychological Measurement. 1986;10:287–302. [Google Scholar]
  18. Howard DH, Gazmararian J, Parker RM. The impact of low health literacy on the medical costs of Medicare managed care enrollees. Am J Med. 2005;118(4):371–377. doi: 10.1016/j.amjmed.2005.01.010. [DOI] [PubMed] [Google Scholar]
  19. Huamán-Calderón D, Quiliano-Terreros R, Vílchez-Román C. Embarazo no deseado y fuentes de información impresas y audiovisuales, en mujeres peruanas (2004-2005) [Unwanted pregnancy and access to printed media in Peruvian women] Rev Méd Chile. 2009;137:46–52. [PubMed] [Google Scholar]
  20. Institute of Medicine. Health Literacy: A Prescription to End Confusion. The National Academy of Sciences; Washington, DC: 2004. [Google Scholar]
  21. Jones M, Lee JY, Rozier RG. Oral health literacy among adult patients seeking dental care. Journal of the American Dental Association. 2007;138:1199–1208. doi: 10.14219/jada.archive.2007.0344. [DOI] [PubMed] [Google Scholar]
  22. Keselman A, Tse T, Crowell J, Browne A, Ngo L, Zeng Q. Assessing consumer health vocabulary familiarity: An exploratory study. Journal of Medical Internet Research. 2007;9(1):e5. doi: 10.2196/jmir.9.1.e5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Lee SY, Bender DE, Ruiz RE, Cho YI. Development of an easy-to-use Spanish Health Literacy test. Health Serv Res. 2006;41(4 Pt 1):1392–1412. doi: 10.1111/j.1475-6773.2006.00532.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Lindau ST, Basu A, Leitsch SA. Health literacy as a predictor of follow-up after an abnormal Pap smear: A prospective study. Journal of General Internal Medicine. 2006;21:829–834. doi: 10.1111/j.1525-1497.2006.00534.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Lindau ST, Tomori C, Lyons T, Langseth L, Bennett CL, Garcia P. The association of health literacy with cervical cancer prevention knowledge and health behavior in a multiethnic cohort of women. American Journal of Obstetrics and Gynecology. 2002;186:938–943. doi: 10.1067/mob.2002.122091. [DOI] [PubMed] [Google Scholar]
  26. Murphy PW, Davis TC, Long SW, Jackson RH, Decker BC. Rapid estimate of adult literacy in medicine (REALM): A quick reading test for patients. Journal of Reading. 1993;37:121–130. [Google Scholar]
  27. Muthén LK, Muthén BO. Mplus User’s Guide. Muthén & Muthén; Los Angeles, CA: 2008. [Google Scholar]
  28. Nurss JR, Baker DW, Davis TC, Parker RM, Williams MV. Difficulties in functional health literacy screening in Spanish-speaking adults. Journal of Reading. 1995;38:632–637. [Google Scholar]
  29. Orlando M, Marshall GN. Differential item functioning in a Spanish translation of the PTSD Checklist: Detection and evaluation of impact. Psychological Assessment. 2002;14:50–59. doi: 10.1037//1040-3590.14.1.50. [DOI] [PubMed] [Google Scholar]
  30. Parker RM, Baker DW, Williams MV, Nurss JR. The test of functional health literacy in adults: A new instrument for measuring patients’ literacy skills. Journal of General Internal Medicine. 1995;10(10):537–541. doi: 10.1007/BF02640361. [DOI] [PubMed] [Google Scholar]
  31. Parker RM, Ratzan SC, Lurie N. Health Literacy: A Policy Challenge For Advancing High-Quality Health Care. Health Affairs. 2003;22(4):147. doi: 10.1377/hlthaff.22.4.147. [DOI] [PubMed] [Google Scholar]
  32. Rogers ES, Wallace LS, Weiss BD. Misperceptions of medical understanding in low-literacy patients: Implications for cancer prevention. Cancer Control. 2006;13(3):225–229. doi: 10.1177/107327480601300311. [DOI] [PubMed] [Google Scholar]
  33. Rosemblat G, Tse T. Paper presented at the AMIA Annual Symposium. 2006. [Google Scholar]
  34. Ryan JG, Leguen F, Weiss BD, Albury S, Jennings T, Velez F, et al. Will patients agree to have their literacy skills assessed in clinical practice? Health Education Research. 2008;23:603–611. doi: 10.1093/her/cym051. [DOI] [PubMed] [Google Scholar]
  35. Scott TL, Gazmararian JA, Williams MV, Baker DW. Health literacy and preventive health care use among Medicare enrollees in a managed care organization. Med Care. 2002;40(5):395–404. doi: 10.1097/00005650-200205000-00005. [DOI] [PubMed] [Google Scholar]
  36. Selden CR, Zorn M, Ratzan SC, Parker RM. National Library of Medicine Current Bibliographies in Medicine: Health Literacy. 2000.
  37. Sudore RL, Yaffe K, Satterfield S, Harris TB, Mehta KM, Simonsick EM, et al. Limited literacy and mortality in the elderly: the health, aging, and body composition study. J Gen Intern Med. 2006;21(8):806–812. doi: 10.1111/j.1525-1497.2006.00539.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Thissen D. MULTILOG User’s Guide: Multiple Categorical Item Analysis and Test Scoring Using Item Response Theory. Scientific Software International, Inc.; Chicago: 1991. [Google Scholar]
  39. Thissen D. IRTLRDIF v.2.0b: Software for the computation of the statistics involved in item response theory likelihood-ratio tests for differential item functioning. University of North Carolina at Chapel Hill; 2001. [Google Scholar]
  40. VanGeest JB, Welch VL, Weiner SJ. Patients’ perceptions of screening for health literacy: Reactions to the Newest Vital Sign. Journal of Health Communication. 2010. doi: 10.1080/10810731003753117. Forthcoming. [DOI] [PubMed] [Google Scholar]
  41. Weiss BD, Mays MZ, Martz W, Castro KM, DeWalt DA, Pignone MP, et al. Quick assessment of literacy in primary care: The newest vital sign. Annals of Family Medicine. 2005;3(6):514–522. doi: 10.1370/afm.405. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

Supplement 1 and 2
