Abstract
Objectives. To collect and standardize COVID-19 demographic data published by local public-facing Web sites and analyze how this information differs from Centers for Disease Control and Prevention (CDC) public surveillance data.
Methods. We aggregated and standardized COVID-19 data on cases and deaths by age, gender, race, and ethnicity from US state and territorial governmental sources between May 24 and June 4, 2021. We describe the standardization process and compare it with the CDC’s process for public surveillance data.
Results. As of June 2021, the CDC’s public demographic data set included 80.9% of total cases and 46.7% of total deaths reported by states, with significant variation across jurisdictions. Relative to state and territorial data sources, the CDC consistently underreports cases and deaths among African American and Hispanic or Latino individuals and overreports deaths among people older than 65 years and White individuals.
Conclusions. Differences exist in amounts of data included and demographic composition between the CDC’s public surveillance data and state and territory reporting, with large heterogeneity across jurisdictions. A lack of standardization and reporting mechanisms limits the production of complete real-time demographic data.
The impact of the COVID-19 pandemic in the United States has not been equal across different demographic groups. Multiple studies have shown that US racial and ethnic minority populations have a proportionally higher number of COVID-19 cases,1,2 higher mortality rates,3–6 and lower access to testing.7,8 Also, studies from other countries have shown that although the prevalence of COVID-19 is similar between males and females, males have higher mortality rates.9–11 Advanced age is a significant risk factor for severe illness and death, with adults older than 65 years accounting for 75% of all COVID-19 deaths in the United States.12
Most epidemiological studies of demographic characteristics of cases, hospitalizations, and deaths rely on data from death certificates3 or specific populations from metropolitan areas,1,5,7 hospitals and health systems with high-quality data,4,9 or data from foreign countries.10,11,13 These data sources are informative but incomplete. They may be limited to specific populations, may lack subnational representativeness, and may not be updated rapidly enough to adopt mitigation measures in specific populations.14
At the state and local levels, hospitals, health care providers, and laboratories report individualized data to health departments through a mandatory process known as “case reporting.”15 Using case reports, local health departments have created public-facing dashboards, data repositories, or Web sites with COVID-19 aggregated counts and demographic data. However, all public-facing dashboards are different, varying considerably in the availability and presentation of data. Therefore, comparing and tracking these data require that they be collected from different sites, organized, standardized, and concentrated in a single data repository.
By contrast, the US Centers for Disease Control and Prevention (CDC) collects deidentified patient-level data, including demographic characteristics, through a reporting mechanism called “case notification.”15 Using these patient-level data, the CDC produces the COVID-19 Case Surveillance Public Use Data with Geography data set. This data set contains 19 different characteristics for each COVID-19 case shared with the CDC, including demographics and geography (state and county), exposure history, and disease severity indicators. However, case notification is slower, voluntary, and less complete, as it depends on each jurisdiction’s reportable conditions. Moreover, the CDC follows a privacy protection review protocol that redacts specific information—including demographic characteristics—to reduce the risk of reidentification.16
Several independent efforts to gather and publish comprehensive race and ethnicity data from each jurisdiction’s health department in a single publicly available aggregator have also emerged outside CDC sources. Examples of these efforts include the COVID Racial Data Tracker from the COVID Tracking Project and the Boston University Center for Antiracist Research,17 The Color of Coronavirus project from the APM Research Lab,18 and the COVID-19 Vaccine Monitor Dashboard from the Kaiser Family Foundation.19 However, those efforts concluded in March 2021.
Despite the many advantages of having COVID-19 demographic data to mitigate disparities,20 it is not well understood how various public sites reporting demographic data compare. Moreover, it is unclear whether demographic data are complete and timely and whether they show consistent trends. To understand the impact of COVID-19 across different demographic groups at the state and territorial levels, the Johns Hopkins Coronavirus Resource Center (CRC) started collecting, processing, and publishing demographic data related to COVID-19 outcomes from state and territorial sources in April 2021.21 The CRC has been working since to routinely gather and standardize data that allow compilation—in a comprehensive, accurate, and uniform manner—of the diverse, publicly available data from all US states and territories.
As part of this effort, we sought to understand how these data from public-facing state and territorial Web sites compare with the national aggregation published regularly by the CDC. Here we describe the methods used to collect and standardize COVID-19 demographic data from various local sources and compare the standardized data set with a similar publicly available data set from the CDC, focusing on the demographic composition of cases and deaths and the proportion of missing data.
METHODS
The CRC collects, standardizes, and publishes demographic data from official and publicly available state and territorial data sources related to the ongoing COVID-19 pandemic. The CRC demographic data set includes information on cases, deaths, testing, and vaccination reported by health departments from the 50 states, the District of Columbia, Puerto Rico, Guam, and the US Virgin Islands. The demographic categories included in the data set are age group, gender or sex, race, ethnicity, and race and ethnicity combined where available. Furthermore, each category contains different demographic groups (e.g., 20–29 years old, female, Hispanic or Latino, Asian, non-Hispanic White). Finally, some mixed demographic categories, such as ethnicity by age group and age group by gender, are not included in the CRC’s data collection effort because they are not widely available.
Standardization Process for CRC Demographic Data
The CRC follows a standardization process to produce common demographic groups across all state and territorial sources for the aforementioned demographic categories. The resulting groups are comparable across jurisdictions and with the demographic groups used in external data sources from the CDC and the US Census Bureau. In addition, the process incorporates methods previously used in different race and ethnicity data analyses, including analyses of COVID-19 outcome data.22,23
First, we aligned the race and ethnicity groups reported by local health authorities with the groups established in the Office of Management and Budget’s Standards for the Classification of Federal Data on Race and Ethnicity.24,25 Second, given that the Hispanic or Latino group comprises different races and some local health authorities do not disaggregate according to race and ethnicity, we kept race and ethnicity combined when the jurisdiction in question did so.
In addition, we calculated the ethnicity category in a manner similar to the California Department of Finance’s “Hispanic hierarchical” approach.26 We classified all self-reported Hispanic or Latino individuals as such, regardless of reported race, and categorized all self-reported non-Hispanic respondents claiming more than 1 race as “Two or More Races.” Also, we aggregated all gender and sex categories into 3 large groups: female, male, and other.
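The classification rules in this paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not the CRC's actual code: the function names, the per-record input shape (an ethnicity string plus a list of race labels), and the exact label spellings are hypothetical and chosen only for the example.

```python
def classify_race_ethnicity(ethnicity, races):
    """Apply a Hispanic-first hierarchy to one self-reported record.

    `ethnicity` is a string such as "Hispanic or Latino"; `races` is a list
    of race labels. Both shapes are assumptions made for this illustration.
    """
    # Rule 1: Hispanic or Latino takes precedence over any reported race.
    if ethnicity == "Hispanic or Latino":
        return "Hispanic or Latino"
    # Anything without a usable ethnicity or race goes to "Unknown".
    if ethnicity != "Not Hispanic or Latino" or not races:
        return "Unknown"
    # Rule 2: non-Hispanic respondents reporting more than one race.
    unique_races = set(races)
    if len(unique_races) > 1:
        return "Two or More Races"
    return "Non-Hispanic " + next(iter(unique_races))


def classify_gender(label):
    """Collapse source labels into Female, Male, Other, or Unknown."""
    if label is None or label.strip() == "":
        return "Unknown"
    key = label.strip().lower()
    if key in {"female", "f", "woman"}:
        return "Female"
    if key in {"male", "m", "man"}:
        return "Male"
    if key in {"unknown", "missing", "not reported", "under investigation"}:
        return "Unknown"
    # Every remaining reported value falls into the "Other" group.
    return "Other"
```

In this sketch the Hispanic or Latino check runs first, so a record reporting Hispanic ethnicity together with several races is still counted once, as Hispanic or Latino.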
To standardize all age groups available across sources, we re-binned the age distributions to match the American Community Survey’s specific age categorization. We initially assumed uniform distributions for each original bin, then transformed the state-specific age ranges to 5-year age bins, and finally aggregated them, assuming that the upper limit of the 85 years old and older group was 100 years.27 In addition, we combined any missing or unreported data, data under investigation, or unavailable data into the “unknown” group for all demographic categories.
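The re-binning logic can be illustrated with a short Python sketch. It is a simplified version under stated assumptions, not the CRC's implementation: the function name and input shapes are hypothetical, single years of age are used as the intermediate step for simplicity, and the open-ended oldest group is closed at 100 years (inclusive) as described above.

```python
def rebin_ages(source_counts, target_bins, open_ended_upper=100):
    """Redistribute counts from source age bins into target age bins.

    source_counts: dict mapping (low, high) inclusive age ranges to counts;
                   high=None marks an open-ended bin such as "85 and older".
    target_bins:   list of (low, high) inclusive ranges; high=None allowed.
    Counts are spread evenly over single years of age in each source bin.
    """
    # Step 1: expand each source bin into single-year counts (uniform assumption).
    per_year = {}
    for (low, high), count in source_counts.items():
        top = open_ended_upper if high is None else high
        years = range(low, top + 1)
        share = count / len(years)
        for age in years:
            per_year[age] = per_year.get(age, 0.0) + share

    # Step 2: sum the single-year counts that fall inside each target bin.
    result = {}
    for low, high in target_bins:
        top = open_ended_upper if high is None else high
        result[(low, high)] = sum(
            value for age, value in per_year.items() if low <= age <= top
        )
    return result
```

For example, a source reporting 180 cases for ages 0 to 17 and 120 cases for ages 18 to 29 would be allocated as 100 cases each to the 0 to 9, 10 to 19, and 20 to 29 target bins. Target bins that do not cover every source age would drop counts, so a real pipeline would need to check that totals are preserved.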
After this process, we generated a data set with all of the data collected from the source and a standardized data set. The original and the standardized data sets, the documentation, and the most up-to-date data sources can be accessed in a public repository (https://github.com/govex/COVID-19/tree/master/data_tables/demographic_data).
CDC Public Demographic Data and COVID-19
To compare how demographic data for cases and deaths from states and territorial sources differ from CDC public data, we used CRC demographic data, after standardization, collected from May 24 through June 4, 2021. With respect to CDC data, we used the June 23, 2021, update of the CDC’s COVID-19 Case Surveillance Public Use Data with Geography. We aggregated the CDC Case Surveillance Public Use Data with Geography data set for age, sex, race, and ethnicity from the patient to the state level. In addition, we created a new combined race and ethnicity category.
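Aggregating patient-level rows to state-level counts can be sketched as follows. This is an illustrative example only: the function name and the field names ("state", "sex", and so on) are hypothetical and are not taken from the CDC file's actual column names.

```python
from collections import Counter


def aggregate_to_state(records, category):
    """Count patient-level rows by (state, group) for one demographic category.

    records:  iterable of dicts, one per case (hypothetical field names).
    category: key to tabulate, e.g. "age_group", "sex", "race", "ethnicity".
    """
    counts = Counter()
    for row in records:
        # Absent or empty values are kept as "Unknown" so they stay in the totals.
        group = row.get(category) or "Unknown"
        counts[(row["state"], group)] += 1
    return counts
```

Keeping missing values as an explicit "Unknown" group, rather than dropping them, preserves each state's total and allows the share of unknowns to be compared across sources.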
After standardizing and processing CRC demographic data and CDC Case Surveillance Public Use Data with Geography, we calculated each demographic group’s share of the total cases or deaths in both sources, including unknowns. Next, we compared these shares to evaluate the demographic composition of cases and deaths in the CDC’s publicly available data against that in state and territorial sources. Finally, we performed a 2-proportion Z test to evaluate the statistical significance of the difference in proportions between the 2 data sets by demographic group, category, and state or territory. R version 4.0.3 (R Foundation, Vienna, Austria) was used in conducting all of our analyses.
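For readers unfamiliar with the test, a 2-proportion Z test can be sketched in a few lines of Python. The analyses in this article were run in R; the version below is only an illustration, and it assumes the common pooled-variance form without a continuity correction, which may differ from the exact options used in the original analysis. The function name is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two_sided_p) for H0: p1 == p2 using the pooled standard error.

    x1, x2: counts in the group of interest; n1, n2: totals in each data set.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; proportions are degenerate")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A call such as `two_proportion_z_test(163, 1000, 453, 1000)` would compare a 16.3% share against a 45.3% share; the totals of 1000 are made up for the example.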
RESULTS
There was significant variation in demographic categories across different sources and considerable heterogeneity in how similar demographic groups were named among local sources. Data by age were available from the majority of state and territorial sources. Table 1 shows that age data were available in 92.6% of the states and territories for cases, 88.9% for deaths, and 85.2% for vaccines. Ethnicity data were less frequently available across states (in only 44.4% of states and territories for cases, 40.7% for deaths, and 48.1% for vaccines). Testing data were the least available of all COVID-19 outcomes; 9 states or territories (16.7%) had testing data by age, and only 3 (5.6%) had such data by ethnicity or by race. We could not find a single demographic group that was present in all 54 states and territories included in the data collection process for any COVID-19 outcome.
TABLE 1—
US Jurisdictions With Available Demographic Categories, by COVID-19 Outcome, Through June 2021
Demographic Category | Jurisdictions Where the Demographic Category Is Available (n = 54), No. (%) |
Cases | |
Age | 50 (92.6) |
Ethnicity | 24 (44.4) |
Gender or sex | 51 (94.4) |
Race | 30 (55.6) |
Race and ethnicity | 24 (44.4) |
Deaths | |
Age | 48 (88.9) |
Ethnicity | 22 (40.7) |
Gender or sex | 48 (88.9) |
Race | 26 (48.1) |
Race and ethnicity | 25 (46.3) |
Tests | |
Age | 9 (16.7) |
Ethnicity | 3 (5.6) |
Gender or sex | 8 (14.8) |
Race | 3 (5.6) |
Race and ethnicity | 6 (11.1) |
Vaccines | |
Age | 46 (85.2) |
Ethnicity | 26 (48.1) |
Gender or sex | 41 (75.9) |
Race | 29 (53.7) |
Race and ethnicity | 20 (37.0) |
In addition to data availability, standardization of demographic groups across data sources was a challenge. We identified 402 different demographic groups across all demographic categories through the data collection process. Some of these differences were purely semantic, reflecting how each source names similar groups. For example, Black or African American is referred to as Black in some jurisdictions such as Mississippi, whereas other jurisdictions such as the District of Columbia match the CDC’s categorization scheme. However, other differences were more profound and implied demographic groups that cannot easily be compared with others. For example, Asian is aggregated with Native Hawaiian or Pacific Islander in Michigan but only with Pacific Islander in North Carolina. Asian is also included in the “other” category in certain jurisdictions such as Indiana.
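The semantic part of this problem lends itself to a simple lookup table, sketched below in Python. The alias table and function name are hypothetical and cover only a few labels for illustration. Labels that combine groups in non-comparable ways (such as Asian together with Pacific Islander) are passed through unchanged here so they can be reviewed manually, which is one possible handling and not necessarily the one the CRC used.

```python
# Hypothetical alias table: lowercase source label -> standardized group.
SEMANTIC_ALIASES = {
    "black": "Black or African American",
    "african american": "Black or African American",
    "black or african american": "Black or African American",
    "white": "White",
    "asian": "Asian",
    "unknown": "Unknown",
    "missing": "Unknown",
    "not reported": "Unknown",
    "under investigation": "Unknown",
}


def standardize_race_label(raw_label):
    """Map a source-specific race label to a standardized group name."""
    if raw_label is None:
        return "Unknown"
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(raw_label.strip().lower().split())
    if key == "":
        return "Unknown"
    if key in SEMANTIC_ALIASES:
        return SEMANTIC_ALIASES[key]
    # No known alias: return the trimmed label unchanged for manual review.
    return raw_label.strip()
```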
As shown in Table A of the appendix (available as a supplement to the online version of this article at http://www.ajph.org), we recoded and aggregated 230 source-specific groups for gender or sex, race, ethnicity, and race and ethnicity. For age groups, we re-binned 134 source-specific groups into bins ranging from 0 to 9 years old to 85 years old and older. After the standardization process, we produced a data set that contained 8745 data points across 10 age groups, 3 ethnicity groups, 4 gender or sex groups, 8 race groups, and 23 race and ethnicity groups, including an “unknown” group for every category.
Moreover, we compared the demographic composition of the CDC Case Surveillance Public Use Data with Geography data set across specific demographic groups using the CRC standardized demographic data set as a benchmark separately for cases and deaths. In addition, we performed a 2-proportion Z test to evaluate whether differences were statistically significant at a 95% confidence level. This CDC data set incorporates patient-level data for more than 27.1 million COVID-19 cases. It includes demographics such as age, sex, ethnicity, race, state and county of residence, and underlying medical conditions (e.g., diabetes, cardiovascular disease).
We were able to join and compare 745 unique state and demographic group pairs for cases and 595 for deaths. We joined only state and demographic groups present in both data sets and excluded pairs in which there were no cases or deaths in either data set. Also, we manually excluded deaths by race for Montana and deaths by gender and race for Pennsylvania as a result of incomplete data for most groups.
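The join described here can be sketched in Python as an intersection of (state, group) keys followed by a difference in shares. The function names and input shapes are hypothetical. The exclusion rule is read here as dropping a pair when either source has a zero count, which is an interpretation of the text rather than a confirmed detail.

```python
def shares_by_state(counts):
    """Convert {(state, group): count} into {(state, group): share of state total}."""
    totals = {}
    for (state, _group), count in counts.items():
        totals[state] = totals.get(state, 0) + count
    return {
        key: count / totals[key[0]]
        for key, count in counts.items()
        if totals[key[0]] > 0
    }


def compare_sources(cdc_counts, crc_counts):
    """Percentage-point difference (CDC minus CRC) for pairs present in both sources."""
    cdc_shares = shares_by_state(cdc_counts)
    crc_shares = shares_by_state(crc_counts)
    differences = {}
    # Only (state, group) pairs that appear in both data sets are compared.
    for key in cdc_shares.keys() & crc_shares.keys():
        # Assumed reading: skip pairs with a zero count in either source.
        if cdc_counts[key] == 0 or crc_counts[key] == 0:
            continue
        differences[key] = 100 * (cdc_shares[key] - crc_shares[key])
    return differences
```

A negative value means the group's share is lower in the CDC data set than in the CRC data set, and a positive value means it is higher.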
When comparing specific demographic groups, we found that some groups are consistently overreported and others are consistently underreported in the CDC data set. For example, Figure 1 shows that deaths among people older than 65 years are overreported across states. The proportion of deaths among people older than 65 years in Louisiana is 23.1 percentage points (P < .001) higher in the CDC data set than in the CRC demographic data. The difference is similar in Utah, where the proportion of deaths among people older than 65 years is 21.5 percentage points (P < .001) higher in the CDC data set.
FIGURE 1—
Differences in the Proportions of (a) Cases and (b) Deaths Among People Aged 65 Years and Older in the CDC and CRC Data Sets, by US State or Territory, Through June 2021
Note. CDC = Centers for Disease Control and Prevention; CRC = Coronavirus Resource Center. Whiskers indicate 95% confidence intervals.
By contrast, we found that the proportion of cases and deaths in the Hispanic or Latino population is consistently underreported in the CDC data set. For example, Figure 2 shows that cases and deaths among Hispanic or Latino individuals are underreported compared with CRC demographic data across all states and territories with the exception of cases in Missouri. In addition, in an extreme example, 45.3% of California’s total cases are among people who identified as Hispanic or Latino in the CRC demographic data set. The CDC data set reports only 16.3%, a difference of 29 percentage points. Moreover, Oregon reported that 23.5% of total cases correspond to Hispanic or Latino individuals, whereas the CDC reported only 2.4%.
FIGURE 2—
Differences in the Proportions of (a) Cases and (b) Deaths Among Hispanic or Latino Individuals in the CDC and CRC Data Sets, by US State or Territory, Through June 2021
Note. CDC = Centers for Disease Control and Prevention; CRC = Coronavirus Resource Center. Whiskers indicate 95% confidence intervals.
Also, some demographic groups (e.g., females) are underreported in cases but overreported in deaths in the CDC data set. Figure A (available as a supplement to the online version of this article at http://www.ajph.org) shows that of the 44 states where we could compare both data sets, the CDC underreports cases among females in 40 states or territories with a statistically significant difference. However, in 18 of the 32 states or territories with available data, the CDC overreports the proportion of deaths in this demographic group.
Furthermore, we found that the CDC data set underreports the proportion of cases and deaths among Black or African American individuals for most jurisdictions. For instance, the share of cases among Black or African American individuals is 13 percentage points (P < .001) lower in Georgia and 6.8 percentage points (P < .001) lower in North Carolina in the CDC data set than in the CRC’s demographic data (Figure B, available as a supplement to the online version of this article at http://www.ajph.org). For deaths, the differences increase to 30.7 percentage points (P < .001) in Georgia and 15.1 percentage points (P < .001) in North Carolina. However, not all race groups are consistently underreported in the CDC data set. For example, White individuals are overreported in deaths in 16 of 20 states and territories with available data (Figure C, available as a supplement to the online version of this article at http://www.ajph.org).
Finally, we examined 2 ways in which missing data in the CDC data set could be driving the differences we previously identified. First, we compared the total number of COVID-19 cases and deaths in both data sets to check for the overall completeness of the CDC data set by state and territory. We used the total number of COVID-19 cases and deaths from the CRC data set as the benchmark; this data set contains cumulative totals collected from official public sources and represents one of the most up-to-date sources for COVID-19 cases and deaths.28
The CDC Case Surveillance Public Use data set includes 80.9% of all COVID-19 cases registered in the United States. However, it contains only 46.7% of all reported COVID-19 deaths, with considerable heterogeneity across states (Figure 3). For cases, most states and territories have reported the majority of their patient-level data to the CDC, or they even update the CDC more frequently than their local public reporting. For example, the CDC data set includes more than 100% of all cases registered in the CRC data set for New York (103.33%) and New Jersey (103.9%). However, 5 states have reported less than 10% of their total case data to the CDC: Wyoming (2.1%), Texas (2.73%), Louisiana (4.14%), West Virginia (5.44%), and Missouri (9.73%).
FIGURE 3—
Proportions of Total COVID-19 (a) Cases and (b) Deaths Included in the CDC Case Surveillance Public Use Data With Geography Data Set, by US State or Territory, Through June 2021
Note. CDC = Centers for Disease Control and Prevention.
In addition, death data are incomplete overall. Eleven states or territories do not report patient-level data on COVID-19 deaths to the CDC: Alaska, Delaware, Guam, Hawaii, Missouri, Nebraska, South Dakota, Texas, the US Virgin Islands, West Virginia, and Wyoming. Massachusetts (95.96%) and Illinois (88.23%) report the largest shares of their patient-level death data.
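The completeness figures for cases and deaths are simple ratios of CDC totals to CRC benchmark totals. A minimal Python sketch follows; the function name and input shapes are hypothetical, and the numbers in the usage note are made up to mirror the scale of the percentages reported here.

```python
def completeness(cdc_totals, crc_totals):
    """Percent of each state's CRC benchmark total that appears in the CDC data set.

    Both arguments map a state or territory to a cumulative count.
    States missing from the CDC data are treated as reporting zero.
    """
    result = {}
    for state, crc_total in crc_totals.items():
        if crc_total == 0:
            # No benchmark to compare against; skip rather than divide by zero.
            continue
        result[state] = 100 * cdc_totals.get(state, 0) / crc_total
    return result
```

With illustrative inputs of 10,333 CDC cases against 10,000 CRC cases, the function returns roughly 103.33%, which is how a value above 100% can arise when the CDC data are updated more often than a state's public reporting.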
Second, we compared the overall proportion of cases and deaths with unknown demographics between the 2 data sets for all demographic categories and performed a 2-proportion Z test. We found that the overall proportion of cases and deaths with unknown demographics is larger in the CDC data set than in the CRC data set for all demographic categories (Table B, available as a supplement to the online version of this article at http://www.ajph.org). For example, the proportion of cases with unknown race and ethnicity is 24.2 percentage points higher in the CDC data set than in the CRC data set. Similar results were observed in most states and territories with available data.
DISCUSSION
The results of our analysis reveal considerable heterogeneity in how jurisdictions present demographic data on COVID-19 outcomes. Differences were found both across and within jurisdictions for different COVID-19 outcomes, which complicates comparisons across states. To overcome this challenge, we developed a standardization process that recategorized these groups, reducing the number of groups to a smaller number of common categories across states and territories. After a semantic alignment process and a re-binning process for age groups, we reduced the number of unique categories from 402 to 48. This standardization process produced a data set with demographic data by state or territory for common demographic groups in all states where sufficient data were available.
There are several limitations to our approach stemming from the lack of patient-level data from states. First, there were not enough data available to produce all demographic groups for all jurisdictions. Second, some states or territories do not report certain demographic categories or present data from different demographic groups aggregated into a single large group. Finally, the age re-binning process assumes a uniform distribution within a bin, which is problematic when a state reports a wide age bin. Enhanced data reporting standards across all jurisdictions could help ensure that states employ consistent categories, which might allow for more comparable and complete demographic data.
Our analysis revealed differences in size and demographic composition between patient-level demographic data from the CDC’s public COVID-19 data set and state and territory reporting. We found that the CDC data contain fewer COVID-19 cases and deaths and a larger proportion of unknown or missing demographics for cases and deaths, with significant variation across states and territories. Overall, the CDC data set includes 80.9% of all COVID-19 cases publicly reported by state and territorial health departments. For deaths, the CDC data set includes less than half of all deaths reported in the United States by state and territorial authorities. In addition, information for COVID-19-related deaths from 11 jurisdictions is completely missing from CDC reporting.
Moreover, the 2 data sets seem to present different demographic compositions of COVID-19 patients, especially with respect to deaths. For example, cases and deaths among Hispanic or Latino and Black or African American populations, and cases among females, are consistently underreported across the vast majority of states and territories in the CDC reporting. By contrast, deaths among those aged 65 years and older and White individuals are consistently overreported in CDC data relative to state and territorial sources.
It is unclear how data supply and processing might be affecting the demographic composition of these different sources. Further research is needed to understand how data reporting mechanisms might contribute to these differences. Additional research is also needed to assess the extent to which privacy protection rules limit the completeness of the CDC demographic data and whether such limitations could partially explain the differences in state and territory reporting.
Throughout this analysis, we have indicated that lack of standardization and ineffective data reporting mechanisms limit the ability to aggregate complete real-time demographic data at the national level. At best, discrepancies between data reported by states and those reported by the CDC may create confusion. These discrepancies may also deepen distrust, leading to hesitancy and noncompliance with mitigation strategies such as social distancing, masking, and vaccination. At worst, there may be systematic bias in reporting of demographic data on the part of either states or the CDC. To ensure transparency and data quality, it is important that the causes of these discrepancies be investigated, understood, and, if necessary, adjudicated.
Having accurate and timely demographic data may also help government agencies make data-informed decisions on how to target resources and allow the scientific community to understand the spread and impact of the virus in the most at-risk populations. For example, in the short run, health authorities could use COVID-19 demographic data to identify where to increase targeted vaccination efforts, how current and future variants will affect different communities and age groups, and which communities are being excluded from access to testing. In the long run, COVID-19 demographic data should influence decisions on where to increase clinical support to treat patients with post-COVID-19 conditions, which communities lack sufficient access to health care resources, and how COVID-19 affects educational outcomes and school dropout rates.
In conclusion, there is a need to improve the current US public health data landscape by modernizing data reporting mechanisms between health care providers, local health authorities, and the CDC. This will require significant policy and funding changes that should also affect the current standards for reporting demographic data. Nevertheless, having complete, accurate, and timely demographic data will help in efforts to deploy proper mitigation strategies in the short term and improve the ability of academics and governments to study how social determinants affect the population’s health.
ACKNOWLEDGMENTS
Work at the Johns Hopkins Coronavirus Resource Center is supported by Bloomberg Philanthropies and the Stavros Niarchos Foundation.
We acknowledge the states and jurisdictions that devoted time to developing COVID-19 dashboard infrastructure, as well as the team that collected the data presented in this study: Margaret Burke, Marlene Caceres, Dane Galloway, Molly Mantus, Taylor Martin, Promise Maswanganye, and Miriam McKinney Gray.
CONFLICTS OF INTEREST
The authors declare no conflicts of interest.
HUMAN PARTICIPANT PROTECTION
No protocol approval was needed for this study because no human participants were involved.
REFERENCES
- 1. Webb Hooper M, Napoles AM, Perez-Stable EJ. COVID-19 and racial/ethnic disparities. JAMA. 2020;323(24):2466–2467. doi: 10.1001/jama.2020.8598.
- 2. Xian Z, Saxena A, Javed Z, et al. COVID-19-related state-wise racial and ethnic disparities across the USA: an observational study based on publicly available data from the COVID Tracking Project. BMJ Open. 2021;11(6):e048006. doi: 10.1136/bmjopen-2020-048006.
- 3. Bassett MT, Chen JT, Krieger N. Variation in racial/ethnic disparities in COVID-19 mortality by age in the United States: a cross-sectional study. PLoS Med. 2020;17(10):e1003402. doi: 10.1371/journal.pmed.1003402.
- 4. Rentsch CT, Kidwai-Khan F, Tate JP, et al. Patterns of COVID-19 testing and mortality by race and ethnicity among United States veterans: a nationwide cohort study. PLoS Med. 2020;17(9):e1003379. doi: 10.1371/journal.pmed.1003379.
- 5. Vahidy FS, Nicolas JC, Meeks JR, et al. Racial and ethnic disparities in SARS-CoV-2 pandemic: analysis of a COVID-19 observational registry for a diverse US metropolitan population. BMJ Open. 2020;10(8):e039849. doi: 10.1136/bmjopen-2020-039849.
- 6. Zalla LC, Mulholland GE, Filiatreau LM, Edwards JK. Racial/ethnic and age differences in the direct and indirect effects of the COVID-19 pandemic on US mortality. Am J Public Health. 2022;112(1):154–164. doi: 10.2105/AJPH.2021.306541.
- 7. Asabor EN, Warren JL, Cohen T. Racial/ethnic segregation and access to COVID-19 testing: spatial distribution of COVID-19 testing sites in the four largest highly segregated cities in the United States. Am J Public Health. 2022;112(3):518–526. doi: 10.2105/AJPH.2021.306558.
- 8. Pond EN, Rutkow L, Blauer B, Aliseda Alonso A, Bertran de Lis S, Nuzzo JB. Disparities in SARS-CoV-2 testing for Hispanic/Latino populations: an analysis of state-published demographic data. J Public Health Manag Pract. 2022;28(4):330–333. doi: 10.1097/PHH.0000000000001510.
- 9. Goodman KE, Magder LS, Baghdadi JD, et al. Impact of sex and metabolic comorbidities on COVID-19 mortality risk across age groups: 66,646 inpatients across 613 U.S. hospitals. Clin Infect Dis. 2021;73(11):e4113–e4123. doi: 10.1093/cid/ciaa178.
- 10. Jin JM, Bai P, He W, et al. Gender differences in patients with COVID-19: focus on severity and mortality. Front Public Health. 2020;8:152. doi: 10.3389/fpubh.2020.00152.
- 11. Aslaner H, Aslaner HA, Gökçek MBG, Benli AR, Yıldız O. The effect of chronic diseases, age and gender on morbidity and mortality of COVID-19 infection. Iran J Public Health. 2021;50(4):721–727. doi: 10.18502/ijph.v50i4.5996.
- 12. National Center for Health Statistics. 2021. https://www.cdc.gov/nchs/covid19/mortality-overview.htm
- 13. Bonanad C, García-Blas S, Tarazona-Santabalbina F, et al. The effect of age on mortality in patients with COVID-19: a meta-analysis with 611,583 subjects. J Am Med Dir Assoc. 2020;21(7):915–918. doi: 10.1016/j.jamda.2020.05.045.
- 14. National Center for Health Statistics. 2021. https://data.cdc.gov/NCHS/Provisional-COVID-19-Deaths-by-Race-and-Hispanic-O/ks3g-spdg
- 15. Centers for Disease Control and Prevention. 2021. https://www.cdc.gov/nndss/about/conduct.html
- 16. Centers for Disease Control and Prevention. 2021. https://github.com/CDCgov/covid_case_privacy_review
- 17. Racial Data Dashboard. The COVID Tracking Project. 2021. https://covidtracking.com/race/dashboard
- 18. APM Research Lab. 2021. https://www.apmresearchlab.org/covid/deaths-by-race
- 19. Kaiser Family Foundation. COVID-19 Vaccine Monitor Dashboard. 2021. https://www.kff.org/interactive/covid-19-coronavirus-tracker
- 20. Peek ME, Simons RA, Parker WF, Ansell DA, Rogers SO, Edmonds BT. COVID-19 among African Americans: an action plan for mitigating disparities. Am J Public Health. 2021;111(2):286–292. doi: 10.2105/AJPH.2020.305990.
- 21. Johns Hopkins University Coronavirus Resource Center. 2021. https://github.com/govex/COVID-19/tree/master/data_tables/demographic_data
- 22. Mays VM, Ponce NA, Washington DL, Cochran SD. Classification of race and ethnicity: implications for public health. Annu Rev Public Health. 2003;24:83–110. doi: 10.1146/annurev.publhealth.24.100901.140927.
- 23. Yoon P, Hall J, Fuld J, et al. Alternative methods for grouping race and ethnicity to monitor COVID-19 outcomes and vaccination coverage. MMWR Morb Mortal Wkly Rep. 2021;70(32):1075–1080. doi: 10.15585/mmwr.mm7032a2.
- 24. Office of Management and Budget. 2021. https://www.federalregister.gov/documents/2016/09/30/2016-23672/standards-for-maintaining-collecting-and-presenting-federal-data-on-race-and-ethnicity
- 25. US Department of Health and Human Services. 2022. https://minorityhealth.hhs.gov/omh/browse.aspx?lvl=3&lvlid=54
- 26. Klein DJ, Elliott M, Haviland AM, et al. A comparison of methods for classifying and modeling respondents who endorse multiple racial/ethnic categories: a health care experience application. Med Care. 2019;57(6):e34–e41. doi: 10.1097/MLR.0000000000001012.
- 27. United States Census Bureau. 2021. https://data.census.gov/cedsci/table?q=age&tid=ACSST5Y2019.S0101
- 28. Dong E, Du H, Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect Dis. 2020;20(5):533–534. doi: 10.1016/S1473-3099(20)30120-1.