Author manuscript; available in PMC: 2021 Nov 19.
Published in final edited form as: Med Care Res Rev. 2020 Jul 7;78(5):616–626. doi: 10.1177/1077558720935733

Beyond Black and White: Mapping Misclassification of Medicare Beneficiaries Race and Ethnicity

Irina B Grafova 1, Olga F Jarrín 2,3
PMCID: PMC8602956  NIHMSID: NIHMS1755273  PMID: 32633665

Abstract

The Centers for Medicare and Medicaid Services administrative data contains two variables that are used for research and evaluation of health disparities: the enrollment database “EDB” beneficiary race code and the Research Triangle Institute “RTI” race code. The objective of this paper is to examine state-level variation in racial/ethnic misclassification of EDB and RTI race codes compared to self-reported data collected during home health care. The study population included 4,231,370 Medicare beneficiaries who utilized home health care services in 2015. We found substantial variation among states in misclassification of self-identified Hispanics, Asian Americans/Pacific Islanders, and American Indians/Alaskan Natives in administrative race data. Caution should be used when interpreting state-level healthcare disparities and minority health outcomes based on existing race variables contained in Medicare datasets. Self-reported race/ethnicity data collected during routine care of Medicare beneficiaries may be used to improve the accuracy of minority health and health disparities reporting and research.

Keywords: race, ethnicity, Medicare, state-level, health disparities, minority health


Racial and ethnic disparities in health care quality and access in the United States have been well-documented across the lifespan (Duran & Pérez-Stable, 2019; Jeffries, et al., 2019; Wasserman, et al., 2019). However, data on race/ethnicity are incomplete in most clinical and administrative data sets. Collecting accurate data on race and ethnicity provides critically needed information that helps to identify racial and ethnic differences and disparities in healthcare access and care quality which are vital for improving minority health outcomes (Bierman, Lurie, Collins, & Eisenberg, 2002). In 2003, the Institute of Medicine reported that standardized data on racial and ethnic differences in care are generally unavailable and recommended that all health plans collect and report data on their members’ race and ethnicity (IOM, 2003). Later, the Affordable Care Act (ACA) required population surveys and federally funded health and health care programs to enhance their collection and reporting of data on race and ethnicity (2010). This was followed by the Department of Health and Human Services (DHHS) 2011 Action Plan to Reduce Racial and Ethnic Health Disparities that aimed to increase availability and quality of health disparities data collection (DHHS, 2014).

Despite these efforts, there are still substantial gaps in accuracy of race and ethnicity data contained in administrative data and electronic health records (Magana Lopez, Bevans, Wehrlen, Yang, & Wallen, 2016; Polubriaginof, et al., 2019; Sholle, et al., 2019). Missed policy opportunities to improve collection of data on race, ethnicity, gender, sexual orientation, and disability status in electronic health records remain (Douglas, Dawes, Holden, & Mack, 2015). This is particularly evident at the state level. Many traditional surveys fail to provide state-representative data despite being nationally representative. Even surveys that are representative at both national and state levels, such as the Behavioral Risk Factor Surveillance System, often have insufficient sample sizes to describe racial disparities with precision for each state (Brown, et al., 2013; Grzywacz, Hussain, & Ragina, 2018). Other data sources, such as Healthcare Cost and Utilization Project hospital discharge data, tend to have a substantial proportion of missing race and ethnicity data (Ma, Zhang, Lyman, & Huang, 2018).

The completeness and accuracy of race/ethnicity data is especially problematic for non-Black minority groups (American Indians/Alaska Natives, Asian Americans/Pacific Islanders, and Hispanics/Latinos) that were classified as “other” by the Social Security Administration prior to the 1980s. Race/ethnicity data quality issues have been independently identified and reported across cancer registry data (Layne et al., 2019), Ohio Medicaid data (Hartzler & Snyder, 2017), and Medicare administrative data for Veterans Health Administration patients (Hernandez et al., 2019). The limitations of Medicare administrative data on beneficiaries’ race and ethnicity are critically important because this is the primary source of information used to examine disparities in access to and in utilization of physicians’ services and inpatient hospital care at both national and state levels (CMS, 2020). Unfortunately, little evidence exists to support the accuracy of Medicare’s data on race and ethnicity at the national level (Bilheimer & Sisk, 2008; Escarce, Carreón, Veselovskiy, & Lawson, 2011; Liebler, 2018; Nerenz, 2005; Ng, Ye, Ward, Haffer, & Scholle, 2017). Moreover, there is evidence suggesting that there is regional, and potentially state-level, variation in the accuracy of Medicare’s race/ethnicity data (Zaslavsky, Ayanian, & Zaborski, 2012). Finally, there have been no attempts to identify patterns of race and ethnicity misclassification errors at the state level. This study aims to fill in this gap.

The Medicare Beneficiary Summary File (MBSF) contains two race and ethnicity variables: the enrollment database race code (EDB race), originating from Social Security Administration (SSA) records, and an imputed race and ethnicity variable created by the Research Triangle Institute (RTI race). Prior to 1980, the SSA race/ethnicity question contained only four categories (White, Black, Other, and Unknown), which were used to populate the EDB race variable. By 1982 the SSA had expanded the race/ethnicity categories to include American Indian/Alaskan Native, Asian American or Pacific Islander, and Hispanic; however, for the purpose of the EDB race variable, the categories were collapsed to White, Black, and Other. Corrections and updates to the EDB race data were made in 1994 and in 1997 using the expanded SSA race/ethnicity categories. Beginning in 1999, the Indian Health Service (IHS) provided information on the population it serves to the Centers for Medicare and Medicaid Services (CMS), and since 2000 the EDB race code has been updated annually from SSA and IHS records. Despite these efforts, the completeness of race/ethnicity data for non-Black minorities remains problematic.

In response to these challenges, the Research Triangle Institute (RTI) race variable was developed to improve classification of Hispanics and Asian Americans/Pacific Islanders (AAPIs) through imputation based on U.S. Census surname lists from 1990 and 2000 combined with residence in Puerto Rico or Hawaii (Bonito, Eicheldinger, & Evensen, 2005; Eicheldinger & Bonito, 2008). The RTI race variable contains 6 categories: non-Hispanic White, non-Hispanic Black, Hispanic/Latino, Asian American/Pacific Islander, American Indian/Alaskan Native, and Other/Unknown. While leaving much room for improvement, the RTI race variable represents a significant advance over the EDB race variable for identification of Hispanics and Asian Americans/Pacific Islanders (AAPIs). Despite acknowledged limitations in accuracy for AAPIs and American Indians/Alaskan Natives, the RTI race variable is frequently used in research and public reporting of minority health disparities (Gandhi, Lim, Davis, & Chen, 2018; Glantz, et al., 2019; Hanchate, et al., 2019; CMS, 2020). The EDB and RTI race variables neither include subcategories for Black, Hispanic, or Asian American/Pacific Islander groups, nor allow for multiple racial/ethnic identities beyond the category “Other.”

New Contribution

This study provides new insights on the quality of the race and ethnicity information in Medicare administrative data at the state level. We critically explored racial/ethnic misclassification of the race variables contained in the Medicare Beneficiary Summary File (MBSF) by comparing EDB race and RTI race to gold-standard self-reported race and ethnicity data across all states and Puerto Rico. This is the first study to examine the difference in misclassification of Medicare beneficiaries’ race/ethnicity by the EDB and RTI race variables, and to explore how the accuracy varies between states. By documenting state-level misclassification error, this study can inform researchers and policymakers seeking to evaluate and address state-level disparities within the Medicare population. It identifies the geographic areas and minority groups for which researchers should exercise particular caution when using the EDB or RTI race variables. It also offers concrete examples of how additional data sources, such as OASIS data, could be used to improve race and ethnicity categorization in the Medicare data. The findings may be valuable for those who are interested in health disparities among larger minority groups, such as African Americans and Hispanics, as well as for those interested in smaller minority groups, such as Asian/Pacific Islanders, Native Americans, and other racial and ethnic subgroups.

Methods

Data Source

The EDB and RTI race variables were extracted from the Medicare Beneficiary Summary File (MBSF) demographic and enrollment data contained in the annual base file. Using the unique beneficiary identification number provided for this purpose, we linked the EDB and RTI race codes to the self-reported race code from the home health care Outcome and Assessment Information Set (OASIS) (Jarrín, Nyandege, Grafova, Dong, & Lin, 2020). Home health care patients’ self-reported race/ethnicity is recorded as part of the standardized OASIS admission assessment by a registered nurse or physical therapist. The race/ethnicity question includes the direction to mark all the response choices that apply: (1) American Indian or Alaska Native, (2) Asian, (3) Black or African-American, (4) Hispanic or Latino, (5) Native Hawaiian or Pacific Islander, and (6) White. A review of prior research on the reliability of the race and ethnicity data recorded in OASIS assessments suggests near-perfect agreement between clinicians and over time (O’Connor & Davitt, 2012), with the studies reviewed finding 98–99% agreement (Kinatukara, Rosati, & Huang, 2005) and 100% agreement (Hittle, Shaughnessy, Crisler, Powell, et al., 2003).
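The linkage step described above can be pictured as an inner join on the beneficiary identifier. The following is a minimal sketch under stated assumptions, not the authors' actual procedure: the field names (`bene_id`, `edb_race`, `rti_race`, `oasis_race`) and the three toy records are hypothetical and chosen only for illustration.

```python
# Hypothetical MBSF rows carrying the two administrative race codes.
mbsf = [
    {"bene_id": "A1", "edb_race": "White", "rti_race": "Hispanic"},
    {"bene_id": "A2", "edb_race": "Other", "rti_race": "Asian"},
    {"bene_id": "A3", "edb_race": "Black", "rti_race": "Black"},
]

# Hypothetical OASIS rows; the race item allows several marked responses.
oasis = [
    {"bene_id": "A1", "oasis_race": ["Hispanic"]},
    {"bene_id": "A2", "oasis_race": ["Asian"]},
    {"bene_id": "A4", "oasis_race": ["White"]},
]


def link_records(mbsf_rows, oasis_rows):
    """Inner-join the two sources on the beneficiary identifier."""
    oasis_by_id = {row["bene_id"]: row for row in oasis_rows}
    linked = []
    for row in mbsf_rows:
        match = oasis_by_id.get(row["bene_id"])
        if match is not None:
            # Keep only beneficiaries present in both sources.
            linked.append({**row, "oasis_race": match["oasis_race"]})
    return linked


linked = link_records(mbsf, oasis)
print([row["bene_id"] for row in linked])
```

Only beneficiaries found in both sources survive the join, which mirrors the requirement that each sample member has all three race variables.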

Study Population

We included all adult Medicare beneficiaries living in the United States, Puerto Rico and the U.S. Virgin Islands who received a new episode of home health care services in 2015 and self-reported belonging to only one racial/ethnic group (n=4,231,370). We excluded 11,270 Medicare beneficiaries (0.27%) who self-reported identification with multiple races/ethnicities.
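As a rough illustration of the exclusion rule, the sketch below separates single-group respondents from those who marked several groups and then reproduces the reported exclusion share from the counts given in the text. The record layout and function name are hypothetical, and the denominator (retained plus excluded beneficiaries) is an assumption about how the 0.27% was computed.

```python
def split_by_response_count(records):
    """Separate single-group respondents from those who marked several groups."""
    single, multiple = [], []
    for rec in records:
        if len(rec["oasis_race"]) == 1:
            single.append(rec)
        else:
            multiple.append(rec)
    return single, multiple


# Hypothetical toy records; field names are illustrative only.
records = [
    {"bene_id": "A1", "oasis_race": ["Hispanic"]},
    {"bene_id": "A2", "oasis_race": ["Asian", "White"]},
    {"bene_id": "A3", "oasis_race": ["Black"]},
]
single, multiple = split_by_response_count(records)

# Counts reported in the text: 4,231,370 retained and 11,270 excluded.
retained, excluded = 4_231_370, 11_270
excluded_share = excluded / (retained + excluded)
print(f"{excluded_share:.2%}")  # prints 0.27%
```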

Data Analysis

Using the self-reported race/ethnicity data collected during home health care as the gold standard (Hittle, et al., 2003; Kinatukara, et al., 2005), we analyzed the pattern of misclassification errors in the EDB and RTI race variables. For each racial/ethnic group, misclassification errors by the EDB and RTI race variables were calculated with the self-reported race from the home health care (OASIS) assessment as the reference. State-level misclassification rates were then calculated for the EDB and RTI race variables. Full details of the methods, along with accuracy and agreement statistics (percent agreement, sensitivity, specificity, positive predictive value, and Cohen’s kappa coefficient), are available in a separate methods paper (Jarrín, Nyandege, Grafova, Dong, & Lin, 2020). All maps were created with mapchart.net (MapChart, 2020).
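The calculations named in this section can be sketched as follows. This is an illustrative sketch only, assuming one record per beneficiary with a single self-reported category. The records, state codes, field names, and function names are hypothetical, and the statistics use their standard textbook definitions rather than code from the methods paper.

```python
from collections import Counter, defaultdict

# Hypothetical linked records; values are illustrative, not study data.
records = [
    {"state": "HI", "oasis_race": "Asian", "edb_race": "Other", "rti_race": "Asian"},
    {"state": "HI", "oasis_race": "Asian", "edb_race": "White", "rti_race": "White"},
    {"state": "HI", "oasis_race": "White", "edb_race": "White", "rti_race": "White"},
    {"state": "WV", "oasis_race": "White", "edb_race": "White", "rti_race": "White"},
    {"state": "WV", "oasis_race": "Black", "edb_race": "Black", "rti_race": "Black"},
    {"state": "WV", "oasis_race": "Hispanic", "edb_race": "White", "rti_race": "Hispanic"},
]


def misclassification_rate(pairs):
    """Share of (self_report, admin_code) pairs in which the two codes disagree."""
    return sum(truth != code for truth, code in pairs) / len(pairs)


def rate_by_state(rows, admin_field):
    """Misclassification rate of one administrative variable within each state."""
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append((row["oasis_race"], row[admin_field]))
    return {state: misclassification_rate(pairs) for state, pairs in by_state.items()}


def sensitivity_and_ppv(pairs, group):
    """Sensitivity and positive predictive value for one self-reported group."""
    tp = sum(truth == group and code == group for truth, code in pairs)
    fn = sum(truth == group and code != group for truth, code in pairs)
    fp = sum(truth != group and code == group for truth, code in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else None
    ppv = tp / (tp + fp) if tp + fp else None
    return sensitivity, ppv


def cohens_kappa(pairs):
    """Agreement between the two codings, corrected for chance agreement."""
    n = len(pairs)
    observed = sum(truth == code for truth, code in pairs) / n
    truth_counts = Counter(truth for truth, _ in pairs)
    code_counts = Counter(code for _, code in pairs)
    expected = sum(truth_counts[k] * code_counts[k] for k in truth_counts) / n ** 2
    return (observed - expected) / (1 - expected)


edb_rates = rate_by_state(records, "edb_race")
rti_rates = rate_by_state(records, "rti_race")
rti_pairs = [(r["oasis_race"], r["rti_race"]) for r in records]
print(edb_rates, rti_rates)
print(sensitivity_and_ppv(rti_pairs, "Asian"), cohens_kappa(rti_pairs))
```

In this toy data the self-reported Asian beneficiary coded as White counts against sensitivity for the Asian group but does not affect its positive predictive value, which is why the two statistics are reported separately.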

Results

Demographics

Sample demographic characteristics are presented in Table 1, stratified by the four U.S. Census regions (Northeast, Midwest, South, and West). Additionally, we include a fifth group (territories), consisting of the sample population residing in Puerto Rico and the U.S. Virgin Islands. Each beneficiary in the sample has complete data for all three race variables: 1) EDB race, originating from the Social Security Administration data; 2) RTI race, which improves on EDB race by imputing Asian American/Pacific Islander and Hispanic/Latino race/ethnicity from surname and geography; and 3) OASIS race, which is self-reported during home health care admission assessments. Table 1 highlights the national and regional variation in the proportion of the sample assigned to each race/ethnicity by the three different race variables.

Table 1.

Demographics of 2015 Medicare Home Health Beneficiary Sample

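| Characteristic | National | West | Midwest | Northeast | South | Territories |
|---|---|---|---|---|---|---|
| N | 4,231,370 | 691,590 | 900,705 | 876,376 | 1,739,328 | 23,371 |
| Age, mean; SD | 76.8; 11.8 | 77.8; 11.5 | 76.7; 12.0 | 77.8; 11.7 | 76.0; 11.5 | 76.5; 11.7 |
| Male, % | 39.0 | 40.3 | 38.1 | 39.1 | 38.8 | 42.8 |
| FFS, % | 71.0 | 68.0 | 69.2 | 68.2 | 75.1 | 21.1 |
| MA, % | 29.0 | 32.0 | 30.8 | 31.8 | 24.9 | 78.9 |
| Dual, % | 28.8 | 31.3 | 26.1 | 29.3 | 29.2 | 1.6 |
| EDB race | | | | | | |
| White, % | 80.8 | 80.5 | 83.5 | 84.4 | 77.9 | 68.8 |
| Black, % | 12.8 | 5.4 | 13.5 | 9.9 | 16.9 | 6.7 |
| Hispanic, % | 2.8 | 4.9 | 0.7 | 2.3 | 3.0 | 20.9 |
| Asian, % | 1.6 | 5.2 | 0.8 | 1.3 | 0.7 | 0.1 |
| AIAN, % | 0.4 | 0.7 | 0.2 | 0.1 | 0.4 | 0.0 |
| Other, % | 1.7 | 3.4 | 1.2 | 2.0 | 1.1 | 3.7 |
| RTI race | | | | | | |
| White, % | 76.5 | 73.5 | 82.0 | 81.5 | 73.5 | 0.6 |
| Black, % | 12.6 | 5.3 | 13.5 | 9.5 | 16.8 | 0.8 |
| Hispanic, % | 7.5 | 12.8 | 2.4 | 5.9 | 7.8 | 98.6 |
| Asian, % | 1.9 | 6.2 | 1.0 | 1.8 | 0.9 | 0.0 |
| AIAN, % | 0.4 | 0.7 | 0.2 | 0.1 | 0.4 | 0.0 |
| Other, % | 1.0 | 1.7 | 0.9 | 1.4 | 0.7 | 0.0 |
| OASIS race | | | | | | |
| White, % | 78.0 | 75.8 | 83.4 | 83.2 | 74.4 | 0.3 |
| Black, % | 12.5 | 5.1 | 13.4 | 9.6 | 16.6 | 1.1 |
| Hispanic, % | 7.0 | 11.4 | 1.9 | 5.3 | 7.6 | 98.4 |
| Asian, % | 2.1 | 7.0 | 1.1 | 1.7 | 0.9 | 0.1 |
| AIAN, % | 0.4 | 0.7 | 0.3 | 0.2 | 0.4 | 0.0 |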

National: All 50 states, Washington D.C., Puerto Rico, and U.S. Virgin Islands

West: AK, WA, OR, CA, HI, MT, ID, WY, NV, UT, CO, AZ, NM

Midwest: ND, SD, NE, KS, MN, IA, MO, WI, IL, IN, MI, OH

Northeast: ME, VT, NH, MA, CT, RI, NY, PA, NJ

South: TX, OK, AR, LA, MS, AL, TN, KY, WV, MD, DE, DC, VA, NC, SC, GA, FL

Territories: Puerto Rico and U.S. Virgin Islands

FFS: Medicare Fee-for-Service

MA: Medicare Advantage

Dual: Medicare-Medicaid dual eligible

Misclassification

Overall, the EDB race variable misclassified 7.5% of the sample beneficiaries, whereas the RTI race variable misclassified 4.2%. However, Figure 1 shows that these averages mask substantial variation among states in the rate of misclassification. For example, the EDB race variable misclassification rate varies from a low of 1.1% in West Virginia to a high of 78.9% in Puerto Rico. Overall, the RTI race variable improves significantly on EDB race, especially for classification of Hispanics and Asian Americans/Pacific Islanders (AAPIs). At the state level, however, the RTI race variable misclassification rate varies from 1.2% in Puerto Rico to 22.5% in Hawaii. Figure 1 shows higher rates of misclassification by EDB race (top map) compared to RTI race (bottom map) in states where Hispanics and/or AAPIs represent a larger fraction of the population (e.g., California, New Mexico, Hawaii, Texas, Nevada, Arizona, Illinois, Florida, New York). The greater accuracy of the RTI race variable in classification of Hispanics and AAPIs comes at the expense of slightly more non-Hispanic Whites and African Americans being erroneously classified compared to the original EDB race variable. In 30 states the RTI race misclassification rate was lower than the EDB race misclassification rate; in the remaining 20 states the rates were the same or slightly worse.

Figure 1.


Percentage of Medicare home health recipients (2015) whose race/ethnicity was misclassified by the EDB race (upper map) and RTI race (lower map).

Next, we examine whether misclassification substantially distorts population demographics at the state level. Comparing overall state-level misclassification (Figure 1) to race-specific state-level misclassification (Figures 2, 3 and 4) highlights two important aspects of misclassification error. States with a similar overall misclassification rate may differ substantially in race-specific misclassification, and vice versa. For example, the RTI race variable misclassifies 4% of the overall sample in both Michigan and Illinois. However, Asian Americans/Pacific Islanders are twice as likely to be misclassified in Michigan (50.8%) compared to Illinois (25.2%). Conversely, the low overall rate of misclassification in the East South-Central region (Mississippi, Alabama, Tennessee, Kentucky) is primarily due to the low percentage of the total population in these states who self-identify as Hispanic, Asian American/Pacific Islander (AAPI), and American Indian/Alaskan Native (AIAN). In these states, misclassification of non-Black minorities remains high (Figures 2, 3, and 4).
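The reason two states can share an overall rate while differing sharply in group-specific rates is that the overall rate is a population-weighted average of the group rates. The sketch below illustrates this with two invented states. The shares and rates are hypothetical values chosen for the arithmetic and are not the Michigan or Illinois figures.

```python
def overall_rate(groups):
    """Overall misclassification = sum of (sample share x group-specific rate)."""
    return sum(share * rate for share, rate in groups.values())


# Hypothetical states: group -> (share of sample, group misclassification rate).
state_a = {"White": (0.90, 0.01), "Black": (0.08, 0.02), "AAPI": (0.02, 0.50)}
state_b = {"White": (0.88, 0.01), "Black": (0.08, 0.02), "AAPI": (0.04, 0.25)}

for name, groups in (("A", state_a), ("B", state_b)):
    aapi_rate = groups["AAPI"][1]
    print(f"State {name}: overall {overall_rate(groups):.2%}, AAPI {aapi_rate:.0%}")
```

Both invented states land near 2% overall, yet the AAPI rate in the first is double that in the second, because a small group contributes little to the weighted average however often it is misclassified.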

Figure 2.


In every state Hispanics are misclassified by the EDB race variable (upper map) more often than by the RTI race (lower map).

Figure 3.


In every state except Montana (no change), Asians/Pacific Islanders are misclassified by the EDB race variable (upper map) more often than by the RTI race variable (lower map).

Figure 4.


Medicare beneficiaries who self-identify as American Indian/Alaskan Native (AIAN) are frequently misclassified in Medicare administrative data. The EDB and RTI race variables are nearly identical with regard to classification of AIANs, and for this reason we present results using only the RTI race variable. There is striking variation between states in the misclassification of Medicare beneficiaries in our sample who self-identify as AIAN, with nearly half of all states misclassifying 80% or more of people who self-report as AIAN, and 7 states misclassifying less than 20% (Alaska, Arizona, New Mexico, Oklahoma, Wyoming, North Dakota, and South Dakota).

Misclassification of Hispanic beneficiaries

Puerto Rico, which was home to nearly 800,000 Medicare beneficiaries in 2015, was a main target of the RTI race variable; there, misclassification of Hispanics was reduced from 79% (EDB race) to less than 1% (Figure 2). In other states, the RTI algorithm improved classification of Hispanics and Latinos based on matches to Spanish surname lists from the U.S. Census. Compared with the EDB race variable, the RTI race variable significantly improved classification of Hispanics in the large states of California (9.2% vs. 58.5% misclassified), Texas (7.6% vs. 69.4%), and Florida (9.0% vs. 54.4%). However, despite the improvements to identification of Hispanics and Latinos accomplished by the RTI algorithm, in 19 states more than one-fifth of Medicare beneficiaries in our sample who self-reported as Hispanic were still misclassified by the RTI race variable (Figure 2).

Misclassification of Asian and Pacific Islander beneficiaries

Hawaii, which was home to 241,000 Medicare beneficiaries in 2015, was also a main target of the RTI algorithm; there, misclassification of Asian Americans/Pacific Islanders (AAPIs) was halved, from 50% (EDB race) to 25% (Figure 3). In other states, the RTI algorithm improved classification based on matches to Asian/Pacific Islander surname lists from the U.S. Census. Although overall misclassification of AAPIs dropped from 37.5% using EDB race to 25.3% with RTI race, at least one-fourth of AAPIs are misclassified by the RTI race variable in the majority of states (Figure 3).

Misclassification of American Indian and Alaskan Native beneficiaries

The EDB and RTI race variables are nearly identical in their ability to correctly classify Medicare beneficiaries who self-report as American Indian/Alaskan Native (AIAN). Both EDB and RTI race misclassify over 80% of AIANs in 24 states and Puerto Rico (Figure 4). In contrast, AIANs were misclassified less than 20% of the time in seven states (Alaska, Arizona, New Mexico, Oklahoma, Wyoming, North Dakota, and South Dakota). This high degree of misclassification raises the concern that true health disparities for this population may be substantially greater than estimates calculated from the RTI race variable (Arday, Arday, Monroe, & Zhang, 2000).

Discussion

The use of accurate race/ethnicity data is critically important for health services research. State-level estimates of health outcomes and health service utilization for racial minority populations are urgently needed to monitor and address health disparities, to assess the impact of health care reform, and to adequately reflect the increasing diversity of many states. Systematic misclassification of minorities may distort the picture of minority health and health disparities. For example, implementation of standardized data collection practices in the State of New Jersey significantly changed the demographic profile of hospital patients, including a 27% increase in the number of Asian/Pacific Islander patients (Chakkalakal, Green, Krumholz, & Nallamothu, 2015).

Our findings are consistent with prior research finding Hispanics, Asians, Pacific Islanders, and American Indians are under-identified in Medicare administrative data (Arday, et al., 2000; Eicheldinger & Bonito, 2008; Filice & Joynt, 2017; Waldo, 2004; Zaslavsky, et al., 2012). This study is the first to explore the implications of race/ethnicity variable decisions for state-level research on disparities within the Medicare population. Researchers and policymakers currently do not have accurate race/ethnicity data needed to fully evaluate and address disparities within the Medicare population. One simple solution is to enrich the existing administrative race/ethnicity data with self-reported race/ethnicity collected during post-acute care, including home health care and brief stays in skilled nursing or rehabilitation facilities (CMS, 2017). For example, Hernandez and colleagues (2019) developed an algorithm for combining race and ethnicity data from several sources in the Veterans Health Administration. This methodology could be applied retroactively to re-establish better baseline data and benchmarks for improving minority health and reducing disparities in health care access and quality.

Underestimating the true size of disparities may result in health policy inaction or poorly calibrated allocation of resources. It may put certain populations at risk for worse health outcomes despite advances in health care. This is particularly relevant for Native Americans/Alaska Natives, and Asian Americans/Pacific Islanders who have been largely “invisible” in public health due to their relatively small population size (Dong, Gu, El-Serag, & Thrift, 2019; Freemantle, et al., 2015; Ro & Yee, 2010; Stafford, 2010). Recent studies suggest that misclassification of Native Americans/Alaska Natives tends to lead to underestimation of health disparities among various outcomes, including cancer incidence rates (Sarfati, et al., 2018) and mortality (Joshi & Warren-Mears, 2019). Conversely, overestimation of disparities is also possible, most notably when the EDB race variable is used to estimate disease prevalence for Hispanics (Jarrín, et al., 2020).

Prior studies found that the accuracy of the EDB race variable is correlated with the geographic concentration of a given racial/ethnic group (Zaslavsky, Ayanian, & Zaborski, 2012). This also appears to be true for the RTI race variable, which is imputed using lists of Hispanic and Asian/Pacific Islander surnames and first names based on the 1990 and 2000 U.S. Census, residence in Puerto Rico or Hawaii, and (Spanish) language preference for receipt of the Medicare Handbook or Social Security Administration notices (Bonito, Eicheldinger, & Evensen, 2005). Variation in accuracy of the RTI race variable between states may stem in part from historical immigration associated with the railroad, mining, and agriculture industries and the development of ethnic enclaves. For example, prior to the Chinese Exclusion Act of 1882, hundreds of thousands of workers from China arrived in the U.S. to build the transcontinental railroad (Zhu, 2013), to work the mines in Montana and South Dakota (Zhu, 1999, 2003), and to work the sugar cane plantations in Hawaii (Beechert, 1985). Similarly, Spanish laborers arrived in the late 1800s and early 1900s with other European immigrants to work the mines of West Virginia (Gonzalez, 1999; Hidalgo, 2001; Argeo, 2009), Maine and Vermont (Fernández & Argeo, 2014), and the sugar cane plantations in Hawaii (Beechert, 1985). The sugar cane plantations in Hawaii were later staffed by workers from Japan, Portugal, the Philippines, and Puerto Rico (Beechert, 1985). The resulting ethnic enclaves and diaspora may contribute to state-level differences in racial/ethnic misclassification of Hispanic and Asian Medicare beneficiaries (Figures 2 and 3).

Decisions about where to live (selective migration within the U.S.) and whether to use an Anglicized or White-sounding name are shaped by historical patterns of racism, xenophobia, and societal pressure to assimilate. For instance, prior research documents that a higher concentration of immigrants of a given nationality leads to a decline in the share of immigrants of that nationality adopting “American” first names (Carneiro, Lee, & Reis, 2020). Also, recent immigrants are less likely than immigrants with longer tenure in the U.S. to change their first names from ethnic-sounding to more “American”-sounding names (Carneiro, Lee, & Reis, 2020). This may create a pattern in which immigrants and their descendants living outside of gateway areas and large ethnic enclaves are more likely to have an Anglicized or White-sounding name than their counterparts living in gateway areas or ethnic enclaves. These geographic differences in names used by minority beneficiaries may contribute to the geographic variation in the ability of the RTI algorithm to correctly classify these beneficiaries. There may be other explanations for the observed state-level differences in the accuracy of the RTI race algorithm for Hispanics and Asians. Future research is needed to understand the source of this variation and to improve the accuracy of race and ethnicity data for Medicare beneficiaries who identify as Hispanic, Asian/Pacific Islander, and American Indian.

Limitations

Caution should be taken when generalizing from these results to the entire U.S. Medicare population, as the sample had a median age of 78 (IQR 69–86) and consisted entirely of beneficiaries who received home health care in 2015. The fraction of the total Medicare population that receives home health care services and was included in our sample varies by state (see Appendix table for details). Compared to the total Medicare population, beneficiaries who receive home health care are on average older, have more chronic conditions and more daily activity limitations, are more likely to live alone, and have lower income (Avalere, 2015). The data and results discussed are cross-sectional and do not evaluate change over time in the race variables. Additionally, this paper does not address the issue of ethnic admixture, or subcategories of the five main racial/ethnic groups that are the focus: White (non-Hispanic); Black (non-Hispanic); Asian-American/Pacific Islander/Native Hawaiian (non-Hispanic); American Indian/Alaskan Native (non-Hispanic); and Hispanic/Latino. This study does not establish that the pattern of misclassification is problematic for any individual study of racial/ethnic disparities in health care utilization and outcomes.

Conclusions

This study reveals that Medicare administrative data on beneficiary race and ethnicity contain substantial misclassification errors, particularly for Hispanic, Asian American/Pacific Islander, and American Indian/Alaskan Native populations. These results imply that Medicare race data should be used with caution when assessing health disparities at the state level. The key question for policy is how much error we are willing to tolerate. The answer is not yet known, but the error in administrative data clearly cannot be ignored, and it is not distributed evenly across states, races, or ethnicities. Self-reported race/ethnicity data collected during routine care of Medicare beneficiaries should be included whenever possible to improve the accuracy of minority health and health disparities reporting and research. Although self-reported race/ethnicity is not uniformly available to researchers, when it is available it should be used to determine the extent and significance of misclassification bias for a particular study or outcome.

Supplementary Material

Supplemental digital content

Acknowledgments:

The authors thank Abner N. Nyandege, Michael K. Gusmano, Paul R. Duberstein, Tina Dharamdasani, and Jacquelyn Y. Taylor for contributions to the development and refinement of this article.

Funding:

This study was supported by funding from the Agency for Healthcare Research and Quality (R00-HS022406), the NIH National Center for Advancing Translational Sciences (UL1-TR003017), and the NIH National Institute on Aging (P30-AG0059304).

Footnotes

Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

  1. Argeo L (2009). Asturian West Virginia. Goldenseal, 14–18. Reprint available at https://tracesofspainintheus.org/west-va/spelter/
  2. Avalere. (2015). Home Health Chartbook 2015: Prepared for the Alliance for Home Health Quality and Innovation. https://ahhqi.org/images/uploads/AHHQI_2015_Chartbook_FINAL_October.pdf
  3. Arday SL, Arday DR, Monroe S, & Zhang J (2000). HCFA’s racial and ethnic data: current accuracy and recent improvements. Health Care Financ Rev, 21(4), 107–116. [PMC free article] [PubMed] [Google Scholar]
  4. Beechert ED (1985). Working in Hawaii: A Labor History. University of Hawaii Press. [Google Scholar]
  5. Bierman AS, Lurie N, Collins KS, & Eisenberg JM (2002). Addressing racial and ethnic barriers to effective health care: The need for better data. Health Affairs, 21(3), 91–102. 10.1377/hlthaff.21.3.91 [DOI] [PubMed] [Google Scholar]
  6. Bilheimer LT, & Sisk JE (2008). Collecting adequate data on racial and ethnic disparities in health: The challenges continue. Health Affairs, 27(2), 383–391. 10.1377/hlthaff.27.2.383 [DOI] [PubMed] [Google Scholar]
  7. Bonito AJ, Eicheldinger CR, & Evensen C (2005). Health Disparities: Measuring Health Care Use and Access for Racial/Ethnic Populations. Final Report. Rockville, MD. https://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Reports/downloads/Bonito_Final.pdf [Google Scholar]
  8. Brown ER, Kincheloe J, Breen N, Olson JL, Portnoy B, & Lee SJC (2013). States’ use of local population health data: comparing the Behavioral Risk Factor Surveillance System and independent state health surveys. Journal of Public Health Management and Practice: JPHMP, 19(5), 444–450. 10.1097/PHH.0b013e3182751cfb [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Carneiro P, Lee S, & Reis H (2020). Please call me John: Name choice and the assimilation of immigrants in the United States, 1900–1930, Labour Economics, 62, 101778, 10.1016/j.labeco.2019.101778 [DOI] [Google Scholar]
  10. Chakkalakal RJ, Green JC, Krumholz HM, & Nallamothu BK (2015). Standardized data collection practices and the racial/ethnic distribution of hospitalized patients. Medical Care, 53(8), 666–672. 10.1097/MLR.0000000000000392 [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Centers for Medicare and Medicaid Services (CMS). (2017). Report to Congress: Improving Medicare Post-Acute Care Transformation (IMPACT) Act of 2014 Strategic Plan for Accessing Race and Ethnicity Data. https://www.cms.gov/About-CMS/Agency-Information/OMH/Downloads/Research-Reports-2017-Report-to-Congress-IMPACT-ACT-of-2014.pdf
  12. Centers for Medicare and Medicaid Services (CMS). (2020). The Mapping Medicare Disparities Tool. Version 8.0. March 3, 2020. https://www.cms.gov/About-CMS/Agency-Information/OMH/OMH-Mapping-Medicare-Disparities
  13. Department of Health and Human Services (DHHS). (2014). HHS action plan to reduce racial and ethnic health disparities: A nation free of disparities in health and health care. https://www.minorityhealth.hhs.gov/npa/files/Plans/HHS/HHS_Plan_complete.pdf
  14. Dong J, Gu X, El-Serag HB, & Thrift AP (2019). Underuse of surgery accounts for racial disparities in esophageal cancer survival times: A matched cohort study. Clin Gastroenterol Hepatol, 17(4), 657–665 e613. 10.1016/j.cgh.2018.07.018
  15. Douglas MD, Dawes DE, Holden KB, & Mack D (2015). Missed policy opportunities to advance health equity by recording demographic data in electronic health records. American Journal of Public Health, 105 Suppl 3, S380–388. 10.2105/AJPH.2014.302384
  16. Duran DG, & Pérez-Stable EJ (2019). Science visioning to advance the next generation of health disparities research. American Journal of Public Health, 109(S1), S11–S13. 10.2105/ajph.2018.304944
  17. Eicheldinger C, & Bonito A (2008). More accurate racial and ethnic codes for Medicare administrative data. Health Care Financing Review, 29(3), 27–42. https://www.ncbi.nlm.nih.gov/pubmed/18567241
  18. Escarce JJ, Carreón R, Veselovskiy G, & Lawson EH (2011). Collection of race and ethnicity data by health plans has grown substantially, but opportunities remain to expand efforts. Health Affairs, 30(10), 1984–1991. 10.1377/hlthaff.2010.1117
  19. Fernández JD, & Argeo L (2014). Invisible Emigrants: Spaniards in the US (1868–1945). White Stone Ridge. Digital project available at https://tracesofspainintheus.org/ with detail on Vermont at https://spanishimmigrantsintheus7.wordpress.com/tag/vermont/
  20. Filice CE, & Joynt KE (2017). Examining race and ethnicity information in Medicare administrative data. Medical Care, 55(12), e170–e176. 10.1097/MLR.0000000000000608
  21. Freemantle J, Ring I, Arambula Solomon TG, Gachupin FC, Smylie J, Cutler TL, & Waldon JA (2015). Indigenous mortality (revealed): The invisible illuminated. American Journal of Public Health, 105(4), 644–652. 10.2105/ajph.2014.301994
  22. Gandhi K, Lim E, Davis J, & Chen JJ (2018). Racial disparities in health service utilization among Medicare fee-for-service beneficiaries adjusting for multiple chronic conditions. J Aging Health, 30(8), 1224–1243. 10.1177/0898264317714143
  23. Glantz NM, Duncan I, Ahmed T, Fan L, Reed BL, Kalirai S, & Kerr D (2019). Racial and ethnic disparities in the burden and cost of diabetes for US Medicare beneficiaries. Health Equity, 3(1), 211–218. 10.1089/heq.2019.0004
  24. Gonzalez S (1999). Forging their place in Appalachia: Spanish immigrants in Spelter, West Virginia. Journal of Appalachian Studies, 5(2), 197–206. www.jstor.org/stable/41446913
  25. Grzywacz V 2nd, Hussain N, & Ragina N (2018). Racial disparities and factors affecting Michigan colorectal cancer screening. J Racial Ethn Health Disparities, 5(4), 901–906. 10.1007/s40615-017-0438-x
  26. Hanchate AD, Dyer KS, Paasche-Orlow MK, Banerjee S, Baker WE, Lin M, … Feldman J (2019). Disparities in emergency department visits among collocated racial/ethnic Medicare enrollees. Ann Emerg Med, 73(3), 225–235. 10.1016/j.annemergmed.2018.09.007
  27. Hartzler BM, & Snyder A (2017). Caring by numbers: Evaluation of inconsistencies and incompleteness in the reporting of racial and ethnic data. J Racial Ethn Health Disparities, 4(6), 1092–1099. 10.1007/s40615-016-0314-0
  28. Hernandez SE, Sylling PW, Mor MK, Fine MJ, Nelson KM, Wong ES, … Hebert PL (2019). Developing an algorithm for combining race and ethnicity data sources in the Veterans Health Administration. Military Medicine, usz322. 10.1093/milmed/usz322
  29. Hidalgo T (2001). En las montañas: Spaniards in Southern West Virginia. Goldenseal, 52–59. https://tracesofspainintheus.files.wordpress.com/2012/12/hidalgowestvirginia.pdf
  30. Hittle DF, Shaughnessy PW, Crisler KS, Powell MC, Richard AA, Conway KS, … Engle K (2003). A study of reliability and burden of home health assessment using OASIS. Home Health Care Services Quarterly, 22(4), 43–63. 10.1300/J027v22n04_03
  31. Institute of Medicine (IOM). (2003). Smedley BD, Stith AY, & Nelson AR (Eds.), Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington (DC): National Academies Press. https://pubmed.ncbi.nlm.nih.gov/25032386/
  32. Jarrin OF, Nyandege AN, Grafova IB, Dong X, & Lin H (2020). Validity of race and ethnicity codes in Medicare administrative data compared with gold-standard self-reported race collected during routine home health care visits. Med Care, 58(1), e1–e8. 10.1097/MLR.0000000000001216
  33. Jeffries N, Zaslavsky AM, Diez Roux A, Creswell JW, Palmer RC, Gregorich JD, … Breen N (2019). Methodological approaches to understanding causes of health disparities. American Journal of Public Health, 109(S1), S28–S33. 10.2105/ajph.2018.304843
  34. Joshi S, & Warren-Mears V (2019). Identification of American Indians and Alaska Natives in public health data sets: A comparison using linkage-corrected Washington state death certificates. Journal of Public Health Management and Practice: JPHMP, 25 Suppl 5, Tribal Epidemiology Centers: Advancing Public Health in Indian Country for Over 20 Years, S48–S53. 10.1097/PHH.0000000000000998
  35. Kinatukara S, Rosati RJ, & Huang L (2005). Assessment of OASIS reliability and validity using several methodological approaches. Home Health Care Services Quarterly, 24(3), 23–38. 10.1300/J027v24n03_02
  36. Layne TM, Ferrucci LM, Jones BA, Smith T, Gonsalves L, & Cartmel B (2019). Concordance of cancer registry and self-reported race, ethnicity, and cancer type: A report from the American Cancer Society’s studies of cancer survivors. Cancer Causes Control, 30(1), 21–29. 10.1007/s10552-018-1091-3
  37. Liebler CA (2018). Counting America’s First Peoples. Annals of the American Academy of Political and Social Science, 677(1), 180–190. 10.1177/0002716218766276
  38. Ma Y, Zhang W, Lyman S, & Huang Y (2018). The HCUP SID imputation project: Improving statistical inferences for health disparities research by imputing missing race data. Health Services Research, 53(3), 1870–1889. 10.1111/1475-6773.12704
  39. Magana Lopez M, Bevans M, Wehrlen L, Yang L, & Wallen GR (2016). Discrepancies in race and ethnicity documentation: A potential barrier in identifying racial and ethnic disparities. J Racial Ethn Health Disparities. 10.1007/s40615-016-0283-3
  40. MapChart. (2020). MapChart.net Make your own custom Map of the World, Europe, the Americas, United States, UK and more with colors and descriptions of your choice. http://www.mapchart.net
  41. Nerenz DR (2005). Health care organizations’ use of race/ethnicity data to address quality disparities. Health Affairs, 24(2), 409–416. 10.1377/hlthaff.24.2.409
  42. Ng JH, Ye F, Ward LM, Haffer SC, & Scholle SH (2017). Data on race, ethnicity, and language largely incomplete for managed care plan members. Health Affairs, 36(3), 548–552. 10.1377/hlthaff.2016.1044
  43. O’Connor M, & Davitt JK (2012). The Outcome and Assessment Information Set (OASIS): a review of validity and reliability. Home Health Care Services Quarterly, 31(4), 267–301. 10.1080/01621424.2012.703908
  44. Polubriaginof FCG, Ryan P, Salmasian H, Shapiro AW, Perotte A, Safford MM, … Vawdrey DK (2019). Challenges with quality of race and ethnicity data in observational databases. J Am Med Inform Assoc, 26(8–9), 730–736. 10.1093/jamia/ocz113
  45. Ro MJ, & Yee AK (2010). Out of the shadows: Asian Americans, Native Hawaiians, and Pacific Islanders. American Journal of Public Health, 100(5), 776–778. 10.2105/ajph.2010.192229
  46. Sarfati D, Garvey G, Robson B, Moore S, Cunningham R, Withrow D, … Bray F (2018). Measuring cancer in indigenous populations. Ann Epidemiol, 28(5), 335–342. 10.1016/j.annepidem.2018.02.005
  47. Sholle ET, Pinheiro LC, Adekkanattu P, Davila MA, Johnson SB, Pathak J, … Campion TR (2019). Underserved populations with missing race ethnicity data differ significantly from those with structured race/ethnicity documentation. J Am Med Inform Assoc, 26(8–9), 722–729. 10.1093/jamia/ocz040
  48. Stafford S (2010). Caught between “The Rock” and a hard place: The Native Hawaiian and Pacific Islander struggle for identity in public health. American Journal of Public Health, 100(5), 784–789. 10.2105/ajph.2009.191064
  49. Waldo DR (2004). Accuracy and bias of race/ethnicity codes in the Medicare enrollment database. Health Care Financing Review, 26(2), 61–72. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4194866/
  50. Wasserman J, Palmer RC, Gomez MM, Berzon R, Ibrahim SA, & Ayanian JZ (2019). Advancing health services research to eliminate health care disparities. American Journal of Public Health, 109(S1), S64–S69. 10.2105/AJPH.2018.304922
  51. Zaslavsky AM, Ayanian JZ, & Zaborski LB (2012). The validity of race and ethnicity in enrollment data for Medicare beneficiaries. Health Services Research, 47(3 Pt 2), 1300–1321. 10.1111/j.1475-6773.2012.01411.x
  52. Zhu L (1999). No need to rush: The Chinese, placer mining, and the Western environment. Montana: The Magazine of Western History, 49(3), 42–57.
  53. Zhu L (2003). Ethnic Oasis: The Chinese in the Black Hills. South Dakota History, 33(4), 289–329. https://www.sdhspress.com/journal/south-dakota-history-33-4/ethnic-oasis-chinese-immigrants-in-the-frontier-black-hills/vol-33-no-4-ethnic-oasis.pdf
  54. Zhu L (2013). The Road to Chinese Exclusion: The Denver Riot, 1880 Election, and Rise of the West. Lawrence, Kansas: University Press of Kansas. www.jstor.org/stable/j.ctt1ch795q.
