Abstract
The COVID-19 pandemic has differentially impacted people according to their race/ethnicity, socioeconomic status, and preexisting conditions. Public health surveillance efforts, especially those occurring early in the pandemic, did not gather or report adequate individual-level demographic information to identify these differences, and thus, neighborhood-level characteristics were used to note striking disparities in the US. We sought to determine whether risk factors associated with COVID-19 incidence and mortality in five Southeastern Pennsylvania counties could be better understood by using neighborhood-level demographic data augmented with health, socioeconomic, and environmental characteristics derived from publicly available sources. Although we found that education level and age of residents were the most salient predictors of COVID-19 incidence and mortality, respectively, neighborhoods exhibited a high degree of segregation with multiple correlated factors, which limits the ability of neighborhood-level analysis to identify actionable factors underlying COVID-19 disparities.
Introduction
The Coronavirus Disease 2019 (COVID-19) pandemic is a public health emergency that continues to devastate the global population, having resulted in over 29.1 million infections and 534,733 deaths in the US alone as of March 12, 2021.1 Striking disparities by race/ethnicity in severe COVID-19 outcomes and death in the US have been noted since the onset of the pandemic.2 Severe morbidity and death also occur more frequently among COVID-19 patients of older age3 and those with preexisting conditions, including chronic respiratory disease, cardiovascular disease, diabetes, obesity, and hypertension.3 Exposure to ambient air pollution, including fine particulate matter (PM2.5), increases risk for many of these conditions4,5 and also for COVID-19 infection.6,7 Structural racism and income inequality have contributed to stark and persistent social disparities: racial/ethnic minorities and the poor are more likely to live and work in crowded environments with increased pollution and have more chronic diseases, tenuous employment, sustained toxic stress, and inequitable access to health care systems.8 These factors contribute to racial/ethnic minorities and the poor being at peak risk for COVID-19 and its dire outcomes.
The Southeast Pennsylvania (SEPA) region consists of urban Philadelphia County and the surrounding suburban counties of Bucks, Chester, Delaware, and Montgomery.9 SEPA has been heavily affected by COVID-19, with cumulative, county-level incidence as of March 12, 2021 ranging from 5.25 (Bucks) to 7.59 (Philadelphia) per 100 residents, and mortality ranging from 0.14 (Chester) to 0.23 (Delaware) per 100 residents. Individual-level data reporting was limited at the start of the pandemic, and public health efforts began by focusing on neighborhoods reporting the highest rates of COVID-19 infection and death. Most notably in Philadelphia, which has a high proportion of Black residents, overcoming testing barriers in neighborhoods with majority Black residents was a priority because such neighborhoods had the highest rates of COVID-19.10 More broadly, over half of Delaware and Philadelphia counties consist of environmental justice areas—census tracts where 20 percent or more of individuals live in poverty and/or 30 percent or more of the population identifies as non-white minority.11 Residents of these areas have historically suffered a disproportionate burden of pollutant exposures from sources that include oil refineries, trash incinerator plants, and vehicular exhaust from major roadways, putting them at increased risk for various diseases, including COVID-19. Between March 2020 and March 12, 2021, SEPA experienced two distinct pandemic surges: an initial peak in April 2020 and a resurgence in early winter of that year.12 Characteristics of COVID-19 cases and deaths during these two waves have been found to vary around the world in terms of age distribution,13 race/ethnicity,14 and the socioeconomic status and occupations15 of those affected.
Although it was known that the COVID-19 pandemic disproportionately impacted racial/ethnic minority and socioeconomically disadvantaged groups in the US16 including at-risk populations within Southeastern Pennsylvania, we sought to determine whether in the absence of individual-level data, a richer compilation of neighborhood-level health, demographic, environmental, and socioeconomic characteristics would facilitate the identification of COVID-19 risk factors.
Methods
COVID-19 Data. Data on COVID-19 cases and deaths was obtained from the public health departments of five counties in Southeast Pennsylvania, and that of individual long-term care facilities (LTCFs) was obtained from the Pennsylvania Department of Health website.17 Data was aggregated by county subdivision geographical units according to the smallest one provided by local public health departments: zip codes in Philadelphia County and municipalities in Bucks, Chester, Delaware, and Montgomery Counties. With the exception of Bucks County data, which was provided directly by its health department, data was downloaded from county health department websites18-21 at two time points that distinguished SEPA COVID-19 peaks: August 18, 2020 and March 12, 2021. Based on these dates, we defined three study periods: Wave 1 (March 1 to August 18, 2020), Wave 2 (August 19, 2020 to March 12, 2021), and Total (March 1, 2020 to March 12, 2021). Data for Wave 2 was determined by finding the difference between Total and Wave 1 counts. LTCFs were geocoded using the R package ggmap22 and assigned to a zip code or municipality via QGIS; COVID-19 outcome counts attributed to LTCFs were subsequently aggregated by zip code or municipality and excluded from the public health department total outcome counts.
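The differencing and exclusion steps described above can be sketched as follows. This is a minimal illustration rather than the authors' R code: the `AreaCounts` structure, its field names, and the availability of LTCF-attributed counts at both download dates are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class AreaCounts:
    """Cumulative counts for one zip code or municipality (hypothetical structure)."""
    cumulative_aug18: int  # health department count through August 18, 2020
    cumulative_mar12: int  # health department count through March 12, 2021
    ltcf_aug18: int        # LTCF-attributed count through August 18, 2020 (assumed available)
    ltcf_mar12: int        # LTCF-attributed count through March 12, 2021 (assumed available)


def period_counts(area: AreaCounts) -> dict:
    """Return LTCF-excluded counts for Wave 1, Wave 2, and Total."""
    wave1 = area.cumulative_aug18 - area.ltcf_aug18
    total = area.cumulative_mar12 - area.ltcf_mar12
    # Wave 2 is the difference between the Total and Wave 1 counts.
    wave2 = total - wave1
    return {"wave1": wave1, "wave2": wave2, "total": total}


def rate_per_100(count: int, population: int) -> float:
    """Express a count as cases or deaths per 100 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * count / population
```

The same functions apply to deaths by passing death counts instead of case counts.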
American Community Survey Variables. Demographic and socioeconomic status variables were obtained from the 2015-2019 American Community Survey (ACS) 5-year estimates23 using the R package tidycensus24 for geographic areas corresponding to COVID-19 data. Population density was calculated by dividing population by land area in square kilometers, which was derived from the Census Bureau’s cartographic boundary files from the MAF/TIGER geographic database.25 Sex was expressed as the proportion of males in the population. Race was re-leveled into four groups: proportion of non-Hispanic/Latino White, non-Hispanic/Latino Black, Asian, and Hispanic/Latino. Age was re-leveled into four groups: proportion of residents 18 to 34, 35 to 49, 50 to 64, and greater than 65 years old. Household size was re-leveled into three groups: proportion of 1-person, 2-person, and 3 or more-person households. Education level was re-leveled into four groups based on the proportion of people aged 25 years or older with: less than 9th grade education, at least a high school diploma, at least a bachelor’s degree, and at least a graduate degree (master’s, professional school, or doctorate). Median household income in the past 12 months was expressed in 2019 inflation-adjusted US dollars (USD). Three housing characteristics were expressed as a proportion of all households: owner-occupied households, single-parent households (male or female householder with no spouse present and children under 18 years), and households with no vehicle available.
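Because the education groups are cumulative ("at least" a given level), they overlap rather than sum to 100. A minimal sketch of this re-leveling and of the density calculation is given below; the category keys are hypothetical labels for mutually exclusive highest-attainment counts and are not ACS variable names.

```python
# Hypothetical, mutually exclusive highest-attainment categories for
# residents aged 25 years or older, ordered from lowest to highest.
ORDERED_LEVELS = [
    "less_than_9th",
    "some_high_school",
    "high_school",
    "some_college_or_associate",
    "bachelors",
    "graduate",
]


def education_shares(counts: dict) -> dict:
    """Convert exclusive attainment counts into the four cumulative shares."""
    total = sum(counts[level] for level in ORDERED_LEVELS)
    if total <= 0:
        raise ValueError("no residents aged 25 or older")

    def at_least(level: str) -> float:
        # Sum the named level and every level above it.
        start = ORDERED_LEVELS.index(level)
        return sum(counts[name] for name in ORDERED_LEVELS[start:]) / total

    return {
        "less_than_9th": counts["less_than_9th"] / total,
        "at_least_high_school": at_least("high_school"),
        "at_least_bachelors": at_least("bachelors"),
        "at_least_graduate": at_least("graduate"),
    }


def population_density(population: float, land_area_km2: float) -> float:
    """People per square kilometer of land area."""
    if land_area_km2 <= 0:
        raise ValueError("land area must be positive")
    return population / land_area_km2
```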
Household Health Survey Variables. Data on the health status and behaviors of local residents was obtained from the Southeast Pennsylvania Household Health Survey (SEPA-HHS), conducted in 2012, 2015, and 2018 by the Public Health Management Corporation (PHMC). This community survey of Philadelphia and four surrounding counties interviewed residents aged 18 years or older by telephone to measure their health status and healthcare experiences.26–28 Responses from the three survey years were aggregated by geographical subdivision corresponding to COVID-19 data. Sampling bias was addressed by stratifying the sample by geographic subareas and by including population weights based on Census estimates of race, age, sex, ethnicity, household size, and income derived using Claritas, Inc.26–28 to give more weight to underrepresented and less weight to overrepresented segments of the sample. Empirical Bayes estimation of small area prevalence29 was used to smooth prevalence estimates of zip codes or municipalities with small sample sizes towards the county-level value.
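The idea of smoothing small-sample estimates toward the county-level value can be illustrated with a simple shrinkage estimator. This sketch is not the estimator of reference 29, whose details are not given here: it assumes a pseudo-count form in which a hypothetical `prior_strength` parameter controls how strongly an area is pulled toward its county.

```python
def eb_smooth(area_yes: float, area_n: float, county_prev: float,
              prior_strength: float = 50.0) -> float:
    """Shrink an area's raw prevalence toward the county prevalence.

    area_yes       -- (weighted) number of affirmative responses in the area
    area_n         -- (weighted) number of respondents in the area
    county_prev    -- county-level prevalence, between 0 and 1
    prior_strength -- hypothetical pseudo-sample size given to the county value
    """
    if area_n < 0 or area_yes < 0 or area_yes > area_n:
        raise ValueError("need 0 <= area_yes <= area_n")
    if not 0.0 <= county_prev <= 1.0:
        raise ValueError("county_prev must be between 0 and 1")
    if prior_strength <= 0:
        raise ValueError("prior_strength must be positive")
    # Areas with few respondents stay close to the county value;
    # areas with many respondents stay close to their own raw prevalence.
    return (area_yes + prior_strength * county_prev) / (area_n + prior_strength)
```

Under this form, an area with no respondents receives exactly the county prevalence, and the influence of the county value fades as the area's sample size grows.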
Health outcomes considered were the proportion of people with: 1) Diabetes, based on an affirmative response to “Have you EVER been told by a doctor or other health professional that you have or had diabetes?” and excluding diabetes occurring only during pregnancy; 2) Asthma, based on an affirmative response to “Have you EVER been told by a doctor or other health professional that you have or had asthma?”; 3) Hypertension, based on an affirmative response to “Have you EVER been told by a doctor or other health professional that you have high blood pressure or hypertension?” and excluding hypertension only during pregnancy; 4) Obesity, defined as BMI ≥ 30, calculated from each respondent’s height in meters and weight in kilograms based on the equation weight/squared-height; and 5) Mental health condition, based on an affirmative response to “Have you ever been diagnosed with any mental health condition, including clinical depression, anxiety disorder or bipolar disorder?”.
Lifestyle factors were the proportion of people: 1) Now smoking, based on a dichotomous re-leveled question in which respondents answered either “Everyday” or “Some days” to “Do you NOW smoke cigarettes every day, some days, or not at all?”; 2) Exposed to smoke at home, based on an affirmative response to “Does anyone living in your household smoke cigarettes, cigars or pipes INSIDE your home?”; 3) Eating 3+ fruits/vegetables per day, based on a dichotomous re-leveled self-reported question in which respondents answered “How many servings of fruits and vegetables do you eat on a typical day?” with a number three or greater; and 4) Exercising, based on a dichotomous re-leveled self-reported question in which respondents answered “Thinking about the past month, how many times per week did you participate in any physical activities for exercise that lasted for at least one half-hour, such as walking, basketball, dance, rollerblading or gardening?” with a number three or greater.
Socioeconomic status factors related to health were the proportion of people who: 1) had forgone food due to cost, based on an affirmative response to “In the last 12 months, did you ever cut the size of meals or skip meals because there was not enough money in the budget for food?”; and 2) Have health insurance, based on an affirmative response to “Do you have any kind of health care coverage, including health insurance, prepaid plans such as HMOs, or government plans such as Medicare?”. Insurance status was further classified according to variables representing the proportion of people with a specific insurance type: 1) Employer-sponsored, defined as health insurance through work, school, or union; 2) Private/Personal, defined as health insurance that the respondent or a family member buys on their own; 3) Medicare, defined as health insurance through “Medicare A” for 2012 and 2015 surveys and “Medicare” for the 2018 survey; 4) Medicaid, defined as health insurance through Medicaid or another state program such as Medical Assistance (M.A.) or HealthChoices; and 5) Military, defined as health insurance through TRICARE, CHAMPUS, VA, or Military.
Social capital was captured as a 3-level variable based on an additive point score calculated from three survey variables: 1) Improve, based on an affirmative response to “Have people in your neighborhood ever worked together to improve the neighborhood?” which represented 1 point; 2) Belong, based on the response to “Please tell me if you strongly agree, agree, disagree, or strongly disagree with the following statement: I feel that I belong and am a part of my neighborhood.”, with “Strongly agree” representing 2 points, “Agree” representing 1 point, and “Disagree” and “Strongly disagree” representing 0 points; and 3) Participate, based on the question “How many local groups or organizations in your neighborhood do you currently participate in such as social, political, religious, school-related, or athletic organizations?”, with the number of organizations corresponding to the number of points, up to 12 points maximum. Based on the distribution of respondents’ point scores, low social capital was defined as the proportion of people with 0 to 1 point, representing the 0 to 25th percentile; medium social capital was defined as the proportion of people with 2 to 3 points, representing the 25th to 72nd percentile, and high social capital was defined as the proportion of people with 4 to 14 points, representing the 73rd to 100th percentile.
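The additive scoring described above can be sketched as follows. Function and argument names are illustrative assumptions. Because the text gives both a 12-point cap on Participate and a 14-point upper bound for the high category, the sketch simply treats any score of 4 or more as high.

```python
BELONG_POINTS = {
    "Strongly agree": 2,
    "Agree": 1,
    "Disagree": 0,
    "Strongly disagree": 0,
}


def social_capital_score(improve: bool, belong: str, n_groups: int) -> int:
    """Additive point score from the Improve, Belong, and Participate items."""
    if belong not in BELONG_POINTS:
        raise ValueError(f"unexpected Belong response: {belong!r}")
    if n_groups < 0:
        raise ValueError("n_groups must be non-negative")
    improve_points = 1 if improve else 0
    participate_points = min(n_groups, 12)  # capped at 12 points
    return improve_points + BELONG_POINTS[belong] + participate_points


def social_capital_level(score: int) -> str:
    """Map a point score to the 3-level social capital variable."""
    if score <= 1:
        return "low"
    if score <= 3:
        return "medium"
    return "high"


def level_shares(scores: list) -> dict:
    """Unweighted proportion of respondents in each level for one area."""
    if not scores:
        raise ValueError("no respondents")
    levels = [social_capital_level(s) for s in scores]
    return {name: levels.count(name) / len(levels)
            for name in ("low", "medium", "high")}
```

The survey weights described earlier are omitted here for brevity, so `level_shares` returns unweighted proportions.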
Environmental Protection Agency (EPA) Variables. Air pollution was based on mean estimates of PM2.5 by geographical subdivision corresponding to COVID-19 data, derived from 2010 to 2019 rasterized yearly estimates computed from data sourced from EPA Air Data.30
Statistical Analysis. Statistical analyses were conducted in R.31 Bivariate analyses were conducted using quasi-Poisson generalized linear models with COVID-19 incidence or mortality during each of the three time periods as the outcome and each of 23 variables representing neighborhood race, age, household size, income, education level, insurance billing class, social capital, and health characteristics of its residents as a predictor, using log(population) as an offset term. Incidence rate ratios (IRRs) were obtained by exponentiating model coefficients of each variable. Multivariable models with COVID-19 incidence or mortality during each of the three time periods as the outcome were created via a two-stage process: 1) feature selection was conducted using LASSO regression analysis with the quasi-Poisson family and a population offset term. Variables were selected if they had IRRs > 1.1 or < 0.9 in the final models; 2) multivariable quasi-Poisson regression models were created using selected variables for each outcome (i.e., COVID-19 incidence and mortality for each of the three time periods), including offsets for population.
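For reference, the bivariate model described above can be written in the standard quasi-Poisson form. The notation is introduced here for illustration and is not taken from the original text: $Y_i$ is the count of cases or deaths in area $i$, $N_i$ its population, $x_i$ the neighborhood-level predictor, and $\phi$ the dispersion parameter.

```latex
\begin{align*}
\log \mathbb{E}[Y_i] &= \beta_0 + \beta_1 x_i + \log(N_i), \\
\operatorname{Var}(Y_i) &= \phi \, \mathbb{E}[Y_i], \\
\mathrm{IRR} &= \exp(\beta_1).
\end{align*}
```

Because $\log(N_i)$ enters with a fixed coefficient of 1, the model describes the rate $\mathbb{E}[Y_i]/N_i$, and $\exp(\beta_1)$ is the multiplicative change in that rate per one-unit increase in $x_i$. The multivariable models extend the linear predictor to the selected variables.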
Results
SEPA COVID-19 incidence and mortality rates. COVID-19 outcomes data was available for 284 zip codes and municipalities. The total (i.e., through March 12, 2021) LTCF-excluded COVID-19 incidence and mortality per 100 residents in these geographic areas ranged from 1.13 to 12.06 and 0 to 0.87, respectively (Figure 1). In Wave 1, incidence and mortality per 100 residents ranged from 0 to 4.35 and 0 to 0.58, respectively, while in Wave 2, they ranged from 0.25 to 9.01 and 0 to 0.87, respectively. The area with the greatest total incidence was the 19136 zip code in Philadelphia, where four correctional facilities are located. The area with the greatest total mortality was Richlandtown Borough in Bucks County.
Figure 1.
COVID-19 A) incidence and B) mortality through March 12, 2021 by zip code (Philadelphia) and by municipality (Bucks, Chester, Delaware, Montgomery), excluding long-term care facility counts. Prevalence is reported as cases (incidence) or deaths (mortality) per 100 residents. County boundaries are outlined in black.
Bivariate Associations Between Neighborhood Characteristics and COVID-19 Outcomes. Characteristics of the neighborhood-level variables for the 265 geographical subdivisions are shown in Table 1. Of 23 variables considered, COVID-19 incidence was associated with all but sex and household size across the total period, with all variables during Wave 1, and with all but sex and asthma prevalence in Wave 2. COVID-19 mortality was associated with all variables but education and exercise level across the total period, with sex, race, age, household size, median household income, owner-occupied and without vehicle household prevalence, diabetes, asthma, hypertension, mental health condition, home smoke exposure, and Medicaid insurance prevalence during Wave 1, and with all but population density and sex prevalence in Wave 2. The directionality of significant relationships remained consistent across the three time periods studied (Figure 2). Several neighborhood-level variables were highly correlated with one another (Figure 3). For example, having a greater percentage of Black residents was positively correlated with population density, single-parent and without vehicle household prevalence, chronic health condition prevalence (diabetes, asthma, hypertension, and obesity), the percent of residents who now smoke or are exposed to smoke at home, the percent of residents who have forgone food due to cost, and the percent of residents who have Medicaid insurance. Meanwhile, a greater percentage of Black residents was negatively correlated with median household income, the percent of residents with a high school diploma, bachelor’s degree, or graduate degree, the percent of households that are owner-occupied, the percent of residents eating three or more fruits/vegetables per day, and the percent of residents with employer-sponsored insurance.
Additionally, a greater percentage of residents aged 65 or older was positively correlated with the percent of residents who are White and negatively correlated with the percent of households with three or more residents, the percent of residents with less than 9th grade education, and the percent of households that are single-parent. For education level, a greater percentage of residents with less than 9th grade education was positively correlated with the percent of residents who are Latino and the percent of households without a vehicle, and negatively correlated with the percent of residents with health insurance.
Table 1.
Average prevalence estimates for zip codes (Philadelphia) and municipalities (Bucks, Chester, Delaware, Montgomery), expressed per 100 residents unless otherwise stated.
| Variable | Mean (SD) or Median (IQR)* |
|---|---|
| Population density (people/km²) | 930.3 (2075.3)* |
| Sex | |
| Male | 48.7 (2.7) |
| Race | |
| White | 86.3 (20.0)* |
| Black | 4.8 (11.7)* |
| Asian | 3.0 (4.9)* |
| Latino | 3.9 (4.4)* |
| Age | |
| 18-34 | 22.2 (8.0) |
| 35-49 | 18.6 (2.9) |
| 50-64 | 21.3 (4.0) |
| 65+ | 16.1 (5.5) |
| Household Size | |
| 1 | 26.8 (9.2) |
| 2 | 33.8 (6.0) |
| 3+ | 40.6 (9.3) |
| Education Level | |
| Less than 9th grade | 1.8 (2.5)* |
| High school diploma | 93.8 (6.6)* |
| Bachelor’s degree | 39.0 (30.4)* |
| Graduate degree | 14.1 (14.7)* |
| Socioeconomic Status | |
| Median household income (USD) | 85055.5 (31332.4) |
| Housing (% of all households) | |
| Owner-occupied | 72.2 (26.7)* |
| Single-parent | 9.3 (11.4)* |
| Without vehicle | 5.4 (8.9)* |
| Health | |
| Diabetes | 11.9 (4.0) |
| Asthma | 15.7 (3.7) |
| Hypertension | 30.7 (6.3) |
| Obesity | 28.3 (7.2) |
| Mental health condition | 17.5 (4.6) |
| Lifestyle | |
| Now smoking | 16.0 (6.3) |
| Exposed to smoke at home | 10.8 (5.6) |
| 3+ fruits/vegetables per day | 48.6 (8.4) |
| Exercise | 54.6 (5.3) |
| Socioeconomic Status | |
| Forgone food due to cost | 9.7 (5.5) |
| Have health insurance | 93.0 (4.8) |
| Insurance Billing Class | |
| Employer-sponsored | 59.6 (8.4) |
| Private/Personal | 36.2 (4.4) |
| Medicare | 27.8 (5.0) |
| Medicaid | 9.2 (7.1) |
| Military | 1.9 (1.0) |
| Social Capital | |
| Low | 29.5 (7.7) |
| Medium | 47.9 (6.2) |
| High | 25.1 (6.8) |
| Pollution | |
| PM2.5 (μg/m³) | 9.4 (0.3) |
*Variables with approximately normal distributions were described with mean and standard deviation (SD), while the rest were described with median and interquartile range (IQR).
Figure 2.
Neighborhood-level factors associated with COVID-19 A) incidence and B) mortality for three time periods: Total (March 2020 – March 12, 2021), Wave 1 (March 2020 – August 18, 2020), and Wave 2 (August 19, 2020 – March 12, 2021). Shown are incidence rate ratios (IRRs) with 95% confidence intervals corresponding to bivariate quasi-Poisson generalized linear models created with COVID-19 incidence or mortality as the outcome and individual variables listed as predictors.
Figure 3.
Correlation plot of COVID-19 outcomes and neighborhood-level characteristics considered. Color scale denotes Pearson’s correlation measure.
Neighborhood characteristics associated with COVID-19 incidence and mortality according to multivariable analysis. The lasso feature selection step found that education level, age, and sex were associated with COVID-19 incidence while age, exercise level, and prevalence of asthma, diabetes, and mental health conditions were associated with COVID-19 mortality in at least one time period considered (Table 2). However, only a subset of these variables remained significant in the final quasi-Poisson generalized linear models, as indicated by the significance thresholds shown in Table 2. For the three time periods considered, the only features consistently selected were education level for COVID-19 incidence and age 65+ years for COVID-19 mortality. In the final quasi-Poisson generalized linear models, the proportion of residents aged 25 years and older with less than 9th grade education was found to be the only significant (p<0.05) predictor of COVID-19 incidence across all three time periods. When considering COVID-19 mortality, the proportion of residents aged 65 years and older was found to be the only significant predictor of COVID-19 mortality across all three time periods. In both COVID-19 incidence and mortality, the strength of the relationship as measured by IRR magnitude with education level and age, respectively, was greater in Wave 1 compared to Wave 2. IRRs for COVID-19 incidence and education level were 1.88 for Wave 1 and 1.34 for Wave 2, and IRRs for COVID-19 mortality and age were 2.56 for Wave 1 and 1.65 for Wave 2.
Table 2.
Factors associated with COVID-19 incidence and mortality in multivariable analyses. Shown are incidence rate ratios (IRRs) and 95% confidence intervals reflecting the risk associated with a 10% increase in neighborhood prevalence of a variable for all features selected by lasso. *p<0.05; **p<0.001
| Variable | Incidence: Total | Incidence: Wave 1 | Incidence: Wave 2 | Mortality: Total | Mortality: Wave 1 | Mortality: Wave 2 |
|---|---|---|---|---|---|---|
| Sex: Male | –– | –– | 1.08 (0.96, 1.21) | –– | –– | –– |
| Age: 65+ | –– | 0.96 (0.85, 1.08) | –– | 1.68 (1.45, 1.94)** | 2.56 (1.99, 3.27)** | 1.65 (1.39, 1.94)** |
| < 9th grade education | 1.47 (1.38, 1.56)** | 1.88 (1.63, 2.16)** | 1.34 (1.26, 1.43)** | –– | –– | –– |
| Diabetes | –– | –– | –– | 2.02 (1.68, 2.42)** | –– | 1.76 (1.42, 2.18)** |
| Asthma | –– | –– | –– | –– | –– | 1.45 (1.17, 1.80)** |
| Mental health condition | –– | –– | –– | –– | –– | 0.82 (0.68, 0.97)* |
| Exercise | –– | –– | –– | –– | –– | 0.85 (0.72, 1.00)* |
| Medicaid | –– | –– | –– | –– | 1.56 (1.37, 1.77)** | –– |
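To make the scaling of the reported IRRs concrete, the sketch below converts a model coefficient into an IRR for a 10-unit increase and applies the selection thresholds from the Methods. It rests on assumptions not confirmed by the text: that predictors are expressed in percentage points, that "a 10% increase" means 10 percentage points, and that a Wald-type interval approximates the reported confidence intervals. The function names are illustrative.

```python
import math


def irr_with_ci(beta: float, se: float, step: float = 10.0, z: float = 1.96):
    """IRR and approximate Wald 95% CI for a `step`-unit predictor increase.

    beta -- coefficient on the log scale per one-unit increase
    se   -- standard error of beta (already inflated for overdispersion)
    """
    point = math.exp(step * beta)
    lower = math.exp(step * (beta - z * se))
    upper = math.exp(step * (beta + z * se))
    return point, lower, upper


def passes_selection(irr: float, low: float = 0.9, high: float = 1.1) -> bool:
    """Selection rule from the Methods: keep IRRs above 1.1 or below 0.9."""
    return irr > high or irr < low
```

Under these assumptions, a coefficient of `math.log(1.47) / 10` per percentage point corresponds to the IRR of 1.47 shown for less than 9th grade education over the total period.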
Discussion
Our study found that in Southeastern Pennsylvania, education level was the most salient predictor of COVID-19 incidence at the zip code and municipality level. For all three time periods studied, a higher proportion of residents aged 25 or older with less than 9th grade education was associated with an increased risk of COVID-19 incidence. Previous studies have found a relationship between education level and COVID-19 incidence. In national, county-level risk-adjusted analysis, the percentage of adults without a high school degree had the strongest association with an increased risk of COVID-19 incidence,32 which was consistent with our results. Individuals with lower levels of education are less likely to work remotely,32 which suggests that they are less able to physically distance from other people and have to choose between staying at home and foregoing wages, or going to work and increasing their risk of COVID-19 infection.33 Higher education levels have also been associated with higher personal knowledge of COVID-19, which may offer greater understanding of health information and prevention measures to reduce the risk of COVID-19 infection.34 The decreased strength in the relationship between education level and COVID-19 incidence between the first and second waves is consistent with a national, county-level analysis that found that the percentage of adults with less than high school education who had COVID-19 decreased as the pandemic progressed.33 Hypotheses for this shift include that mitigation efforts affected people with lower levels of education more strongly and that access to testing increased by a greater amount among the highly educated.
In terms of COVID-19 mortality, the proportion of residents aged 65 or older was the most salient risk factor. This relationship held after the exclusion of state-reported cases and deaths attributed to LTCFs, which serve older adults and suffered a disproportionate burden of COVID-19 infection, especially in the early stages of the pandemic.35 The relationship between age and COVID-19 mortality has also been well reported. In a UK-based study, current age was exponentially associated with COVID-19 mortality, and older adults with additional risk factors (e.g., comorbid conditions) were at an even greater risk of death.36 Another study analyzing age-specific patterns of COVID-19 in 45 countries estimated a log-linear increase of a population’s infection fatality ratio by age in adults older than 30 years.37 In older individuals, severe COVID-19 outcomes have been attributed to aging innate and adaptive immune systems, increased inflammation and cytokine profiles, and chronic disease comorbidities, all of which can reduce the body’s ability to prevent the development of cytokine storm, an inflammatory response that can lead to tissue damage and multi-system organ failure.38 Our finding that the relationship between older age and COVID-19 mortality decreased in strength from Wave 1 to Wave 2 may be due to the availability of vaccinations and increased testing, as well as a decreased burden of COVID-19 in congregate living facilities (e.g., LTCFs and prisons) following the implementation of more effective mitigation strategies.
Our multivariable models identified two salient risk factors, but other significant relationships identified in at least one time period included education level, age, and sex with COVID-19 incidence, and age, exercise level, and prevalence of asthma, diabetes, and mental health with COVID-19 mortality. Our neighborhood-level analyses were limited due to the high correlation among many neighborhood variables, which hampered our ability to understand which variables may have driven the relationships with COVID-19 outcomes. Although lasso is a technique that can efficiently select predictors while minimizing overfitting of data,39 it is unable to differentiate between factors that have a similar statistical relationship with an outcome. For example, education level was correlated with other measures of socioeconomic status, race, insurance status, and health outcomes, making it difficult to discern which of these predictors truly underlies the observed relationship between education level and COVID-19 incidence at the neighborhood level. Despite this limitation, lasso has been used in COVID-19 studies40,41 to select features for final models of COVID-19 prevalence, and in other studies investigating chronic disease prevalence in small geographic areas42 to identify variables associated with disease outcomes in a similar manner to this study.
Our study is subject to additional limitations. The PHMC Household Health Survey, one of the three sources used to represent neighborhood characteristics of the Southeastern Pennsylvania region, consisted of adult respondents whose mean age is in the mid-50s for the years considered. This limitation does not substantially affect our results given that we used weighted survey results and that the COVID-19 disease burden largely affected adults.43 Additionally, our COVID-19 outcome data was limited in that although we excluded LTCF case and death counts from our outcome variables, not all LTCFs reported data to the Pennsylvania Department of Health, and data from other congregate living facilities such as prisons and personal care homes was not available to us. Because all of our variable estimates were at the municipality or zip code level, data granularity also limits the specificity of results; data available at a more granular geographical subdivision such as census tract would have allowed for more precise conclusions about the region.
In summary, we found that education level and the age of residents were the most salient neighborhood-level predictors of COVID-19 incidence and mortality, respectively, with the strength of the relationship decreasing as the pandemic progressed. Because of the high level of collinearity among geographic area variables, which reflects the cumulative burden of lifetime adversities such as systemic racism experienced by Philadelphia area residents, it was difficult to clarify the role of individual variables in COVID-19 outcomes. While neighborhood-level analyses can be useful in determining the specific needs of vulnerable populations and in informing policies to address health disparities related to COVID-19, our results underscore the importance of gathering individual-level data as a pandemic emerges and progresses.
References
- 1.Centers for Disease Control and Prevention COVID Data Tracker [Internet]. Centers for Disease Control and Prevention; 2021 [cited 2022 Jan 6]. Available from: https://covid.cdc.gov/covid-data-tracker.
- 2.Owen WF, Carmona R, Pomeroy C. Failing another national stress test on health disparities. JAMA. 2020 May 19;323(19):1905–6. doi: 10.1001/jama.2020.6547. [DOI] [PubMed] [Google Scholar]
- 3.Garg S, Kim L, Whitaker M, O’Halloran A, Cummings C, Holstein R, et al. Hospitalization rates and characteristics of patients hospitalized with laboratory-confirmed Coronavirus Disease 2019 — COVID-NET, 14 States, March 1-30, 2020 [Internet] MMWR Morb Mortal Wkly Rep. 2020;69(15):458. doi: 10.15585/mmwr.mm6915e3. 64. Available from: https://www.cdc.gov/mmwr/volumes/69/wr/mm6915e3.htm Apr 17 [cited 2021 Aug 13] [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4.Song Q, Christiani DC, Wang Xe, Ren J. The global contribution of outdoor air pollution to the incidence, prevalence, mortality and hospital admission for chronic obstructive pulmonary disease: a systematic review and meta-analysis. Int J Environ Res Public Health. 2014 Nov 14;11(11):11822–32. doi: 10.3390/ijerph111111822. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Schraufnagel DE, Balmes JR, Cowl CT, De Matteis S, Jung S-H, Mortimer K, et al. Air pollution and noncommunicable Diseases: a review by the forum of international respiratory societies’ environmental committee, part 2: air pollution and organ systems. Chest. 2019 Feb 1;155(2):417–26. doi: 10.1016/j.chest.2018.10.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Li H, Xu X-L, Dai D-W, Huang Z-Y, Ma Z, Guan Y-J. Air pollution and temperature are associated with increased COVID-19 incidence: a time series study. Int J Infect Dis. 2020 Jun 2;97:278–82. doi: 10.1016/j.ijid.2020.05.076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Comunian S, Dongo D, Milani C, Palestini P. Air pollution and COVID-19: the role of particulate matter in the spread and increase of COVID-19’s morbidity and mortality. Int J Environ Res Public Health. 2020 Jun 22;17(12):4487. doi: 10.3390/ijerph17124487. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Gee GC, Payne-Sturges DC. Environmental health disparities: a framework integrating psychosocial and environmental concepts. Environ Health Perspect. 2004 Dec;112(17):1645–53. doi: 10.1289/ehp.7074. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Xie S, Hubbard RA, Himes BE. Neighborhood-level measures of socioeconomic status are more correlated with individual-level measures in urban areas compared with less urban areas. Ann Epidemiol. 2020 Mar 1;43:37–43.e4.
- 10.Jaklevic MC. Surgeon fills COVID-19 testing gap in Philadelphia’s black neighborhoods. JAMA. 2021 Jan 5;325(1):14–6. doi: 10.1001/jama.2020.22796.
- 11.Department of Environmental Protection. PA environmental justice areas [Internet]. Commonwealth of Pennsylvania; 2021 [cited 2021 Aug 26]. Available from: https://www.dep.pa.gov/PublicParticipation/OfficeofEnvironmentalJustice/Pages/PA-Environmental-Justice-Areas.aspx.
- 12.Commonwealth of Pennsylvania. COVID-19 aggregate cases current daily county health | PA Open Data Portal [Internet]. Commonwealth of Pennsylvania; 2021 [cited 2021 Aug 13]. Available from: https://data.pa.gov/Covid-19/COVID-19-Aggregate-Cases-Current-Daily-County-Heal/j72v-r42c/data.
- 13.Ioannidis JPA, Axfors C, Contopoulos-Ioannidis DG. Second versus first wave of COVID-19 deaths: shifts in age distribution and in nursing home fatalities. Environ Res. 2021 Apr 1;195:110856. doi: 10.1016/j.envres.2021.110856.
- 14.Nafilyan V, Islam N, Mathur R, Ayoubkhani D, Banerjee A, Glickman M, et al. Ethnic differences in COVID-19 mortality during the first two waves of the Coronavirus pandemic: a nationwide cohort study of 29 million adults in England. Eur J Epidemiol. 2021 Jun 1;36(6):605–17. doi: 10.1007/s10654-021-00765-1.
- 15.Yang B, Wu P, Lau EHY, Wong JY, Ho F, Gao H, et al. Changing disparities in COVID-19 burden in the ethnically homogeneous population of Hong Kong through pandemic waves: an observational study. Clin Infect Dis. 2021 Jan 6;73(12):2298–305. doi: 10.1093/cid/ciab002.
- 16.Liao TF, De Maio F. Association of social and economic inequality with Coronavirus Disease 2019 incidence and mortality across US counties. JAMA Netw Open. 2021 Jan 20;4(1):e2034578. doi: 10.1001/jamanetworkopen.2020.34578.
- 17.Pennsylvania Department of Health. COVID-19 long-term care facilities data for Pennsylvania [Internet]. Commonwealth of Pennsylvania; 2021 [cited 2021 Aug 13]. Available from: https://www.health.pa.gov:443/topics/disease/coronavirus/Pages/LTCF-Data.aspx.
- 18.Montgomery County. PositiveDemoTablePublic source [Internet]. Montgomery County Pennsylvania; 2021 [cited 2021 Aug 13]. Available from: https://data-montcopa.opendata.arcgis.com/datasets/positivedemotablepublic-source/data.
- 19.City of Philadelphia. Datasets [Internet]. OpenDataPhilly; 2021 [cited 2021 Aug 13]. Available from: https://www.opendataphilly.org/dataset?q=covid&sort=score+desc%2C+metadata_modified+desc.
- 20.Chester County Department of Emergency Services. COVID-19 statistics - Chester County, PA [Internet]. Chester County Health Department; 2021 [cited 2021 Aug 13]. Available from: https://chesco.maps.arcgis.com/home/item.html?id=5d50a533de8047eeb28afa7c5e997cab.
- 21.Chester County Department of Emergency Services. COVID-19 statistics - Delaware County, PA [Internet]. Chester County Health Department; 2021 [cited 2021 Aug 13]. Available from: https://chesco.maps.arcgis.com/home/item.html?id=544f1bb88b1b4f8f86e38930e5a226b7.
- 22.Kahle D, Wickham H. ggmap: spatial visualization with ggplot2. The R Journal. 2013 Jun;5(1):144–61.
- 23.U.S. Census Bureau. 2015-2019 American Community Survey (ACS) 5-year estimates [Internet]. U.S. Census Bureau; 2020 [cited 2021 Aug 13]. Available from: https://www.census.gov/programs-surveys/acs.
- 24.Walker K, Herman M. Spatial data in tidycensus [Internet]. 2021 [cited 2021 Aug 13]. Available from: https://walker-data.com/tidycensus/
- 25.U.S. Census Bureau. Cartographic Boundary Files [Internet]. U.S. Census Bureau; 2021 [cited 2021 Aug 13]. Available from: https://www.census.gov/geographies/mapping-files/time-series/geo/cartographic-boundary.html.
- 26.Public Health Management Corporation. 2012 SEPA Household Health Survey documentation. Public Health Management Corporation; 2012.
- 27.Public Health Management Corporation. 2014-2015 Household Health Survey documentation. Public Health Management Corporation; 2015.
- 28.Public Health Management Corporation. 2018 SEPA Household Health Survey documentation. Public Health Management Corporation; 2018.
- 29.Martuzzi M, Elliott P. Empirical Bayes estimation of small area prevalence of non-rare conditions. Stat Med. 1996 Sep 15-30;15(17-18):1867–73. doi: 10.1002/(SICI)1097-0258(19960915)15:17<1867::AID-SIM398>3.0.CO;2-2.
- 30.Air Data Pre-Generated Data Files [Internet]. Environmental Protection Agency; 2021 [cited 2021 Aug 13]. Available from: https://aqs.epa.gov/aqsweb/airdata/download_files.html#Annual.
- 31.R: the R project for statistical computing [Internet]. The R Foundation; 2021 [cited 2021 Aug 13]. Available from: https://www.r-project.org/
- 32.Percent of employed people who worked at home and at their workplace on days worked [Internet]. U.S. Bureau of Labor Statistics; 2019 [cited 2021 Aug 13]. Available from: https://www.bls.gov/charts/american-time-use/work-by-ftpt-job-edu-p.htm.
- 33.Finch WH, Hernández Finch ME. Poverty and Covid-19: rates of incidence and deaths in the United States during the first 10 weeks of the pandemic. Front Sociol. 2020 Jun 15;5:47. doi: 10.3389/fsoc.2020.00047.
- 34.Rattay P, Michalski N, Domanska OM, Kaltwasser A, Bock FD, Wieler LH, et al. Differences in risk perception, knowledge and protective behaviour regarding COVID-19 by education level among women and men in Germany. Results from the COVID-19 Snapshot Monitoring (COSMO) study. PLoS One. 2021 May 12;16(5):e0251694. doi: 10.1371/journal.pone.0251694.
- 35.Ochieng N, Chidambaram P, Garfield R, Neuman T. Factors associated with COVID-19 cases and deaths in long-term care facilities: findings from a literature review [Internet]. Kaiser Family Foundation; 2021 Jan 14 [cited 2021 Aug 25]. Available from: https://www.kff.org/coronavirus-covid-19/issue-brief/factors-associated-with-covid-19-cases-and-deaths-in-long-term-care-facilities-findings-from-a-literature-review/
- 36.Ho FK, Petermann-Rocha F, Gray SR, Jani BD, Katikireddi SV, Niedzwiedz CL, et al. Is older age associated with COVID-19 mortality in the absence of other risk factors? General population cohort study of 470,034 participants. PLoS One. 2020 Nov 5;15(11):e0241824. doi: 10.1371/journal.pone.0241824.
- 37.O’Driscoll M, Ribeiro Dos Santos G, Wang L, Cummings DAT, Azman AS, Paireau J, et al. Age-specific mortality and immunity patterns of SARS-CoV-2. Nature. 2020 Nov 2;590(7844):140–5. doi: 10.1038/s41586-020-2918-0.
- 38.Mueller AL, McNamara MS, Sinclair DA. Why does COVID-19 disproportionately affect older people? Aging. 2020 May 29;12(10):9959–81. doi: 10.18632/aging.103344.
- 39.Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B Methodol. 1996;58(1):267–88.
- 40.Qin L, Sun Q, Wang Y, Wu K-F, Chen M, Shia B-C, et al. Prediction of number of cases of 2019 novel Coronavirus (COVID-19) using social media search index. Int J Environ Res Public Health. 2020 Mar 21;17(7):2365. doi: 10.3390/ijerph17072365.
- 41.Sudre CH, Murray B, Varsavsky T, Graham MS, Penfold RS, Bowyer RC, et al. Attributes and predictors of long COVID. Nat Med. 2021 Mar 10;27:626–31. doi: 10.1038/s41591-021-01292-y.
- 42.Füssenich K, Boshuizen HC, Nielen MMJ, Buskens E, Feenstra TL. Mapping chronic disease prevalence based on medication use and socio-demographic variables: an application of LASSO on administrative data sources in healthcare in the Netherlands. BMC Public Health. 2021 Jun 2;21:1039. doi: 10.1186/s12889-021-10754-4.
- 43.Dhochak N, Singhal T, Kabra SK, Lodha R. Pathophysiology of COVID-19: why children fare better than adults? Indian J Pediatr. 2020 May 14;87(7):537–546. doi: 10.1007/s12098-020-03322-y.



