Ann Epidemiol. 2023 Feb 21;80:62–68.e3. doi: 10.1016/j.annepidem.2023.02.007

Influence of geospatial resolution on sociodemographic predictors of COVID-19 in Massachusetts

Prasad Patil a, Xiaojing Peng a, Beth M Haley b, Keith R Spangler b, Koen F Tieskens b, Kevin J Lane b, Fei Carnes b, M Patricia Fabian b, R Monina Klevens c, T Scott Troppy c, Jessica H Leibler b, Jonathan I Levy b
PMCID: PMC9942453  PMID: 36822278

Abstract

Purpose

When studying health risks across a large geographic region such as a state or province, researchers often assume that finer-resolution data on health outcomes and risk factors will improve inferences by avoiding ecological bias and other issues associated with geographic aggregation. However, coarser-resolution data (e.g., at the town or county-level) are more commonly publicly available and packaged for easier access, allowing for rapid analyses. The advantages and limitations of using finer-resolution data, which may improve precision at the cost of time spent gaining access and processing data, have not been considered in detail to date.

Methods

We systematically examine the implications of conducting town-level mixed-effect regression analyses versus census-tract-level analyses to study sociodemographic predictors of COVID-19 in Massachusetts. In a series of negative binomial regressions, we vary the spatial resolution of the outcome, the resolution of variable selection, and the resolution of the random effect to allow for more direct comparison across models.

Results

We find stability in some estimates across scenarios, changes in magnitude, direction, and significance in others, and tighter confidence intervals on the census-tract level. Conclusions regarding sociodemographic predictors are robust when regions of high concentration remain consistent across town and census-tract resolutions.

Conclusions

Inferences about high-risk populations may be misleading if derived from town- or county-resolution data, especially for covariates that capture small subgroups (e.g., small racial minority populations) or are geographically concentrated or skewed (e.g., % college students). Our analysis can help inform more rapid and efficient use of public health data by identifying when finer-resolution data are truly most informative, or when coarser-resolution data may be misleading.

Keywords: COVID-19, Regression analysis, Spatial resolution, Mixed-effect modeling

Introduction

The COVID-19 pandemic has emphasized the importance of rapid access to high-quality health data. Throughout the pandemic, timely access to accurately measured COVID-19 data has allowed for the identification of risk factors and development of targeted, evidence-based interventions to protect populations at elevated risk [1], [2]. In part because of its importance for individual and institutional decision-making and in part to supplement analyses of case surveillance data at state public health departments [3], aggregated COVID-19 case data have been routinely made available for researchers and the general public [4], [5]. In many states, data on cases, hospitalizations, and deaths associated with COVID-19 have been released regularly at varying levels of spatial aggregation. For example, in Massachusetts, the Massachusetts Department of Public Health (MDPH) has released data weekly at the town level since April 2020. These data are packaged in spreadsheets in ready-to-use formats, allowing for rapid analyses by researchers within and outside of MDPH, and such usability has improved pandemic response.

Access to finer-resolution COVID-19 outcome data, such as data at the census-tract or individual level, is more limited. Availability of individual-level data is limited by patient privacy protections under the Health Insurance Portability and Accountability Act (HIPAA), and use of such data is tightly restricted. When analyses are conducted rapidly for imminent policy guidance, there is increased potential for misuse of or adversarial access to personally identifiable information. In addition, each health department (state and local) may have its own privacy and suppression rules that need to be applied to the data. However, individual data can be aggregated into sub-town-level geographic units, such as census tracts, while maintaining individual confidentiality. Data at the census-tract level have particular utility because many publicly available data sources on sociodemographics and environmental factors, such as the American Community Survey (ACS) [6], also present data at the tract level. Health outcome data at the tract level can be analyzed alongside tract-level predictors to inform a nuanced understanding of risk factors, allowing for a higher-resolution analysis and potentially more valid and actionable inferences.

The value of finely resolved geospatial data in understanding specific exposures and health outcomes experienced by small populations is well understood, including avoiding challenges related to aggregation such as the Modifiable Areal Unit Problem [7]. Such analyses, using residential geocoding of cases and community-level data attached to small geographic units, can identify important predictors that are relevant at the neighborhood level, and have particular utility in exploring disproportionate environmental exposures and health disparities. Using a combination of individual and local data can highlight locations in which racism and other forms of structural bias may have contributed to elevated risks in identified communities over time. Place-based assessments of exposure-outcome relationships have specific value during periods of pandemic, but are also employed in nonpandemic disease surveillance and environmental health and justice research to better understand health disparities [8], [9], [10].

In the spring of 2020, researchers at the Boston University School of Public Health and MDPH initiated a collaboration to evaluate sociodemographic and environmental predictors of COVID-19 at the census-tract level. The goal of this partnership was to leverage existing data resources available at the census-tract level to inform a more comprehensive and potentially actionable understanding of COVID-19 patterns in the state. MDPH provided data on the date of diagnosis and home addresses of individuals with Polymerase Chain Reaction (PCR)-confirmed cases of COVID-19 from March 2020 to February 2021. These data were subsequently geocoded by the Boston University School of Public Health team and used in random-effect regression model analyses alongside ACS and other geospatial predictors to identify patterns in risk factors over time [5]. We also evaluated and published a parallel analysis on the town level, using publicly available MDPH COVID-19 outcome data and town-level sociodemographic predictors from ACS [4]. Between these two analyses, we observed general consistency in key conclusions but some important differences in patterns of association for the same sets of predictors at different spatial resolutions. However, the models were not constructed identically or varied systematically to evaluate the specific elements that led to similar versus different conclusions.

In our current study, we vary the spatial scale of three components of the random-effect regression modeling process within a structured experimental design: (1) the outcome (COVID-19 cases), (2) sociodemographic and environmental variable selection, and (3) the random-effect term. The random-effect term captures group variation, such as correlation at the town- or county-level, without explicitly estimating a coefficient for the term. Although this is not a comprehensive list of all statistical modeling decisions, it is representative of common steps often taken in this approach that are likely to affect inference. We discuss which estimates changed by varying these inputs and the importance of these changes for both epidemiologic assessments and policy-making.
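As a rough guide to the model family described here, a random-intercept negative binomial regression can be written as below. The notation is an assumption introduced for illustration and is not taken from the paper: Y_i is the case count in unit i (a town or a census tract), x_{ik} are the selected covariates, g(i) is the group containing unit i (a county or a town), and the variance line shows one common quadratic parameterization. Any offset term (for example, log population) is omitted because the text does not specify one.

```latex
\begin{aligned}
Y_i \mid u_{g(i)} &\sim \operatorname{NegBin}(\mu_i, \theta),
  \qquad \operatorname{Var}(Y_i \mid u_{g(i)}) = \mu_i + \mu_i^2 / \theta, \\
\log \mu_i &= \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + u_{g(i)}, \\
u_g &\sim \mathcal{N}(0, \sigma_u^2).
\end{aligned}
```

In this form only the variance \sigma_u^2 of the group intercepts is estimated, which is the sense in which group-level correlation is captured without a separate coefficient for each group.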

Previous work has reported on variation caused by, and pitfalls due to, spatial resolution [11], [12], [13], [14], but to our knowledge, there has not been a systematic exploration of how choices in a statistical analysis contribute to these issues. The goal of this project is to examine the influence of analytic choices on inferences and conclusions drawn from models trained at different geospatial resolutions. While our approach is not a prescription for how to conduct such analyses in all circumstances, our intention is to inform and support decision-making about the trade-offs and limitations associated with using data aggregated across different geographic resolutions.

Methods

Data sources

In this analysis, we compare the associations between a series of sociodemographic variables with hypothesized and demonstrated relevance for COVID-19 case data collected at different spatial levels. In our first analysis [4], we evaluated town-level sociodemographic data alongside town-level COVID-19 case data. All data were publicly available. COVID-19 case data were extracted from the MDPH website from April to October 2020 [15]. These data reflected all laboratory-confirmed cases reported to the state and were publicly presented by town on the MDPH website. Town-level sociodemographic data were derived from the 2014 to 2018 ACS 5-year estimates for tracts [16], and then scaled to towns by aggregating tract counts to the appropriate town and calculating proportions using ACS town population estimates from the same survey. The ACS sociodemographic variables included in the model (Table S1) include percentage of persons identifying as African-American/Black and Latino/Hispanic by town population, as well as the percentage of persons by town living below the poverty line, older than 80 years, living in housing units with occupancy of greater than 2 persons per room, and without health insurance.
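The tract-to-town scaling described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the town labels, and all counts are hypothetical, and it assumes each tract maps to exactly one town.

```python
from collections import defaultdict


def town_percentages(tract_rows, town_population):
    """Sum tract-level counts within each town, then divide by the
    town population estimate to obtain a town-level percentage."""
    totals = defaultdict(float)
    for town, count in tract_rows:
        totals[town] += count
    return {
        town: 100.0 * totals[town] / town_population[town]
        for town in totals
    }


# Hypothetical tract-level counts of residents in some category,
# given as (town the tract belongs to, count in that tract).
tract_rows = [
    ("Town A", 120),
    ("Town A", 30),
    ("Town B", 50),
]
# Hypothetical town population estimates.
town_population = {"Town A": 1000, "Town B": 500}

town_pct = town_percentages(tract_rows, town_population)
print(town_pct)
```

The town-level value is a population-weighted blend of its tracts, so a town containing one high-percentage tract and one low-percentage tract can look unremarkable after aggregation.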

In our second analysis [5], we used individual-level COVID-19 case data provided by MDPH under strict protective conditions. These cases were geocoded based on residential home address and aggregated to the appropriate census tract for analysis. Census-tract-level data on the same series of sociodemographic variables were extracted from the ACS and included in analyses directly without scaling to town. See descriptions in Tieskens et al. [4] and Spangler et al. [5] for further details on how each variable is constructed.

We used the following covariates in our comparative analyses. All refer to percentages within town/census tract unless otherwise noted (see Table S1 in the supplement for complete descriptions): ACS_20_min, ACS_80_plus, ACS_AIAN, ACS_black, ACS_disabilities, ACS_latino, ACS_more_1andhalf_per_room, ACS_more_2_per_room, ACS_not_insured, ACS_poverty, ACS_public_trans, ACS_services (essential services), Beds_LTCF (number of beds in long-term care facilities), Grad, HUD (housing unit density), Num_LTCF (number of long-term care facilities), Students, Undergrad, and Urbanicity.

After excluding five census tracts with zero population, our statistical modeling was conducted on N town = 351 towns and N CT = 1462 census tracts.

Statistical modeling

We fit a series of eight random-effect models (outlined in Table 1) to the entire April–October 2020 time period using the glmmTMB package in R [17]. We restricted to this subset of dates for comparability with our earlier town-level analyses. The outcome of interest was COVID-19 cases. Across these models, we vary (1) the scale of the outcome (town-level or tract-level); (2) the scale at which variables are selected (town-level or tract-level); and (3) the scale of the random-effect term (town-level or county-level). To conduct variable selection, we first discard variables that exhibit high correlation (correlation coefficient > 0.7), then run a stepwise Poisson regression and retain only the variables that remain once the stepwise procedure is complete. We produce effect estimates and 95% confidence intervals for each covariate included in each model and report Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and log-likelihood estimates for each model fit.
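The correlation screen and the information criteria mentioned above can be illustrated with a short standard-library sketch. Several details are assumptions, since the text does not give them: the screen here is greedy in column order (the first variable of a correlated pair is kept), it uses the absolute Pearson correlation, and all data and function names are hypothetical. The stepwise regression step is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Assumes neither sequence is constant (nonzero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_screen(columns, threshold=0.7):
    """Keep a variable only if its absolute correlation with every
    variable already kept is at or below the threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters, n observations."""
    return k * math.log(n) - 2 * loglik


# Hypothetical covariates: x2 is an exact multiple of x1, x3 is unrelated.
columns = {
    "x1": [1, 2, 3, 4, 5],
    "x2": [2, 4, 6, 8, 10],
    "x3": [2, 1, 4, 1, 2],
}
kept = correlation_screen(columns)
print(kept)
```

Lower AIC and BIC indicate a better trade-off between fit and complexity, and both are only comparable between models fit to the same outcome data.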

Table 1.

Models M1–M8 represent variation in the scale of the outcome, variable selection process, and random-effect term

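Label | Outcome | Variable selection | Random effect
M1 | Town-level total number of cases | Town | County
M2 | Town-level total number of cases | Town | Town
M3 | Town-level total number of cases | Tract | County
M4 | Town-level total number of cases | Tract | Town
M5 | Tract-level total number of cases | Tract | Town
M6 | Tract-level total number of cases | Town | Town
M7 | Tract-level total number of cases | Tract | County
M8 | Tract-level total number of cases | Town | County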
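The design in Table 1 can be encoded as a small lookup to check which modeling choices differ between any two models. This is an illustrative sketch: the dictionary and function names are hypothetical, and the entries are transcribed from Table 1.

```python
FACTORS = ("outcome", "variable_selection", "random_effect")

# Transcribed from Table 1: (outcome, variable selection, random effect).
DESIGN = {
    "M1": ("town", "town", "county"),
    "M2": ("town", "town", "town"),
    "M3": ("town", "tract", "county"),
    "M4": ("town", "tract", "town"),
    "M5": ("tract", "tract", "town"),
    "M6": ("tract", "town", "town"),
    "M7": ("tract", "tract", "county"),
    "M8": ("tract", "town", "county"),
}


def differing_factors(a, b):
    """Return the names of the design factors on which models a and b differ."""
    return [
        factor
        for factor, left, right in zip(FACTORS, DESIGN[a], DESIGN[b])
        if left != right
    ]


print(differing_factors("M1", "M8"))  # outcome resolution only
print(differing_factors("M1", "M5"))  # all three choices differ
```

Pairs that differ in a single factor isolate the effect of that one choice, whereas M1 and M5 differ in all three at once.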

Results

Fig. 1 displays separate panels for the 19 candidate ACS variables. Within each panel appear the coefficient estimates and 95% confidence intervals from each model (M1–M8). Each panel is separated into town-level outcome (left) and tract-level outcome (right); estimate shapes indicate the resolution of variable selection and fill indicates random-effect resolution. Table S2 in the supplement displays the plotted values.

Fig. 1.

Coefficient estimates and 95% confidence intervals for each covariate across models M1–M8. Each panel is separated into town-level outcome (left) and tract-level outcome (right); estimate point shapes indicate the resolution of variable selection and fill indicates random-effect resolution. Subtable displays characteristics and model fit for each model variation.

One can compare, for example, how an estimate differs by choice of random effect by comparing open and filled points of the same shape on the same side of the panel. The town-level and tract-level stepwise variable selection procedures select different subsets of the 19 candidate variables, resulting in some panels having only four coefficient estimates. Generally, only a few variables (% Black residents, % Latino residents, % of residents with an essential service occupation [ACS_services], and # beds in long-term care facilities per capita [Beds_LTCF]) maintain the same inference in terms of direction of effect and statistical significance across all model variations. One consistent trend observed from these estimates is that the 95% confidence intervals are much tighter for the models with census-tract-level outcomes (M5–M8) than they are for the corresponding town-level models (M1–M4). Model fit metrics are fairly close to each other, but comparisons within groups of similar models suggest that tract-level variable selection is preferable to town-level and that a county-level random effect is preferred for a town-level regression, while a town-level random effect is preferred for a tract-level regression.

As noted, # LTCF Beds, % Essential services, % Latino residents, and % Black residents maintain the same direction and significance of association across all eight model variations, and all have a significantly positive association with the outcome. % with disabilities has a consistent positive association, but is only significant in M1; % more than two per room has a consistent negative association, but is only significant in town-level outcome models (M1–M4). All other variables demonstrate variation by model in whether or not they are selected in variable selection or vary in both direction and significance of association. While these overall trends help pinpoint constructs that retain a consistent association across modeling choices, the reasons for inconsistent or conflicting associations are numerous and potentially complex.

Fig. 2 displays a subset of comparisons between models to better illustrate the specific effects of single changes to the modeling procedure. Each cell represents the direction (blue = positive, tan = negative) and the statistical significance (“*” indicates P < .05) of each association in each model variation. White spaces indicate that the corresponding variable was not selected into the corresponding model (e.g., Urbanicity does not appear in model M1). The first comparison (M1 vs. M5) is between an “independent” analysis at the town level (which uses a town-level outcome, town-level variable selection, and “county” as the random effect) and an “independent” analysis at the census-tract level (which uses a census-tract-level outcome, census-tract-level variable selection, and “town” as the random effect). M1 and M5 most closely resemble the analyses we would conduct if we only had access to town-level or census-tract-level data, respectively. There are some immediate differences between the two models: (1) % Students, # Long Term Care Facilities, and Housing Unit Density are not selected via variable selection at the town level, but are significantly associated at the tract level; (2) % Below poverty line, % Age <20, and % American Indian or Alaskan Native residents switch signs and significance. We note, however, that the conditional interpretation of these variables is not identical across models because none of the outcome, variable selection resolution, or random effect are the same across the two models.

Fig. 2.

Four specific comparisons: (1) independent town-level analysis (M1) versus independent tract-level analysis (M5); (2) impact of changing outcome resolution; (3) impact of changing variable selection resolution; (4) impact of changing random-effect resolution. “*” indicates statistical significance, blue indicates positive association, tan indicates negative association. Empty spaces indicate that the given variable was not chosen into the given model.

The second comparison (M1 vs. M8) is more direct, where the only difference between the two models is the resolution of the outcome. This ensures that the conditional interpretation of the coefficient estimates in each model is identical. In this comparison, we observe that % Below poverty line and % American Indian or Alaskan Native residents still switch signs, and that more variables are significant in the census-tract-level model.

The third and fourth comparisons illustrate the effect of changing variable selection resolution and random-effect resolution, respectively. We find that the overarching effects on inference are generally not as pronounced. We do observe that some variables not selected at the town level are significant at the tract level, and that spatially associated variables such as % Public Transportation change sign and significance when the random effect is changed. Looking across the four scenarios in Fig. 2, it appears that the types of changes observed across comparisons two through four compound to produce the substantial variation we observe in comparison one.
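The direction-and-significance encoding used in Fig. 2 can be reproduced from Table S2 with a few lines of code. The sketch below uses rows copied from Table S2 for M1 and M2, which per Table 1 differ only in the random effect; whether Fig. 2 uses this exact pair for its fourth comparison is not stated in the text, and the function and variable names are hypothetical.

```python
# (model, variable) -> (estimate, p-value, 95% CI lower, 95% CI upper),
# copied from Table S2.
ROWS = {
    ("M1", "ACS_public_trans"): (-0.0019, 0.9406, -0.0521, 0.0483),
    ("M2", "ACS_public_trans"): (0.0774, 0.0043, 0.0242, 0.1306),
    ("M1", "ACS_poverty"): (-0.0818, 0.0029, -0.1356, -0.0280),
    ("M2", "ACS_poverty"): (-0.1507, 0.0000, -0.2208, -0.0805),
    ("M1", "ACS_20_min"): (0.0383, 0.1136, -0.0091, 0.0857),
    ("M2", "ACS_20_min"): (0.1418, 0.0000, 0.0809, 0.2027),
}


def cell(model, variable, alpha=0.05):
    """Return (direction, significant) for one cell of the comparison
    grid, or None when the variable is absent from that model."""
    row = ROWS.get((model, variable))
    if row is None:
        return None
    estimate, p_value, _lower, _upper = row
    direction = "positive" if estimate > 0 else "negative"
    return direction, p_value < alpha


for variable in ("ACS_public_trans", "ACS_poverty", "ACS_20_min"):
    print(variable, cell("M1", variable), cell("M2", variable))
```

For these three variables, % Public transportation changes both sign and significance between M1 and M2, % Below poverty stays negative and significant, and % Age <20 stays positive but becomes significant.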

Discussion

Our model comparison varied the resolution of the outcome (town vs. tract), variable selection (town vs. tract), and random effect (town vs. county) in the modeling of COVID cases in Massachusetts. Overall, we find that the geospatial resolution of the COVID outcome has by far the greatest influence on how much model results vary, with the resolution of variable selection and random effect causing modest additional variation. Awareness of this result is essential for interpreting research using low-resolution outcome data available publicly. By taking this systematic approach, we observed a few notable patterns regarding resolution of predictor variables and interpretations.

For a subset of covariates, patterns were identical across modeling approaches, while for others, there were notable differences not just in statistical significance but also magnitude and direction of association. While the precise reasons vary across covariates, the underlying spatial structure and the extent of heterogeneity within towns is an important explanatory factor. For example, Fig. 3 displays heatmaps of % Black residents (which had a consistent positive association across all tested models) and % below poverty line (which switched signs from negative association at the town level to positive association at the census-tract level) at town and census-tract resolutions. We observe that for % Black residents, the geographic regions where these percentages are high remain roughly consistent between the town and tract resolution (due to a majority white population in the state of Massachusetts, this association is driven by cases in urban areas). In contrast, for % residents below poverty line, there are notable spatial differences between town and tract resolution, with many towns in Western Massachusetts having more pronounced areas with high percentages at the town level, and many towns in Eastern Massachusetts only displaying elevated % below the poverty line in a subset of tracts. While numerous factors influence the differential patterns across covariates, it is clear that formal assessments of heterogeneity within and between towns can provide valuable insight about the likelihood of model stability across geographic resolutions. For more in-depth interpretation of the associations found across these models, please refer to our previous work [4], [5].
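One simple way to quantify heterogeneity within and between towns is to split the tract-level variation of a covariate into a within-town and a between-town part. The sketch below is only one possible assessment and is not the method used in this paper: it is unweighted by population, and the function name and all values are hypothetical.

```python
def within_town_share(values_by_town):
    """Fraction of the total tract-level sum of squares that lies within
    towns (0 = towns internally uniform, 1 = all variation is within towns).
    Assumes the tract values are not all identical."""
    all_values = [v for vals in values_by_town.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = 0.0
    for vals in values_by_town.values():
        town_mean = sum(vals) / len(vals)
        within_ss += sum((v - town_mean) ** 2 for v in vals)
    return within_ss / total_ss


# Hypothetical tract-level percentages grouped by town.
uniform_within_towns = {
    "Town A": [10.0, 10.0, 10.0],
    "Town B": [30.0, 30.0, 30.0],
}
mixed_within_towns = {
    "Town A": [5.0, 35.0],
    "Town B": [5.0, 35.0],
}

print(within_town_share(uniform_within_towns))
print(within_town_share(mixed_within_towns))
```

A covariate with a share near zero looks the same at either resolution, while a share near one means town-level averages hide most of the variation that a tract-level model would see.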

Fig. 3.

Panels display heatmaps with color spectrum representing regional % Black population and % population below poverty line at town and census-tract resolutions. Dark boundaries in all panels mark Massachusetts towns, while lighter boundaries in the right-hand-side panels mark Massachusetts census tracts. Breaks are manually defined to follow quantiles with additional custom cutoffs to emphasize lower-end values that are difficult to visualize due to the right-skewed distributions of each variable.

From a statistical standpoint, more variables tend to be selected into the model and more variables appear as statistically significant at the census-tract level. This is likely influenced by the increased sample size at the finer resolution (N town = 351 towns and N CT = 1462 census tracts). Statistical significance is not the only determinant of model interpretation and performance, but more precise covariate estimates can give policy makers increased confidence in deriving geographically targeted policies from these types of models. We also note that the four variables that maintain significant positive associations across all eight model configurations (% Black, % Latino, % Essential Services, and # Beds LTCF) are likely a result of homogeneity across census tracts as described above. We do not observe any consistent negative associations in the same manner, although this is likely due to the types of covariates preselected for this analysis—most were expected to be associated with increased rates of COVID-19 cases.

Broadly, while it is clear that a higher-resolution model would be preferred when available, it does come with a cost related to the increased effort to work with individual-level data and geocode addresses and spatial data. For public health agencies who have the internal capacity to do this on a regular basis, given that the analytical effort for regression modeling is essentially identical at higher resolution, we would recommend constructing higher-resolution regression models to avoid incorrect inferences about important covariates. This would also be the case for analyses that are not time-sensitive and are meant to derive broader insight for future planning. In settings where it is not realistic to geocode at sufficient speed to conduct time-sensitive analyses, our findings reinforce that many important insights (e.g., related to patterns by race/ethnicity) can be captured with town-level regressions. In any case, a careful assessment of whether risk factors are expected to be homogeneously or heterogeneously distributed across finer-resolution geographic units is critical for identifying which inferences may be influenced by geospatial resolution.

There are multiple limitations of our analyses that should be acknowledged. First, we modeled cases only from April to October 2020. The sociodemographic patterns of cases may have differed in other time periods, with corresponding differences in the insights available from census tract versus town resolution models. In addition, the patterns may have differed in other states and our findings may not be generalizable. That said, the general insight that patterns will differ across covariates as a function of their underlying spatial structure is likely robust, with the specific manifestation differing over time and space. In addition, our models are ecological in nature and cannot provide individual-level inferences (i.e., we found elevated case rates in towns or census tracts with higher % Black residents, but that does not necessarily imply that individuals who are Black have higher risks). However, such models are valuable because SES data are often not available at the individual level and supplementing with geography-based data from ACS is necessary for adequate model adjustment. While census tracts were a logical geographic resolution given available census data and the fact that tracts fall neatly into a standard geographic hierarchy, policy makers may be interested in insights for more politically interpretable aggregations such as ZIP codes. We did not analyze ZIP codes in part because they do not reflect underlying population patterns to the same extent as census tracts, but future studies should assess the implications of modeling at ZIP code resolution relative to town and/or census tract.

From a data modeling perspective, this study examines a restricted choice set for a single model specification in random-effect models. There are a number of valid approaches we do not explore, including nested models, generalized estimating equations, and Bayesian multilevel modeling. These approaches will have their own choice set variations, and regardless we did not study important topics such as interaction effects, random slope specification, or estimating/interpreting the intraclass correlation coefficient. As such, although we can provide some broad conclusions on the influences of data resolution, we are unable to generalize across all modeling frameworks and choices available to practitioners.

In spite of these limitations, our models offer important insight. Overall, our modeling in Massachusetts indicates that town-level modeling adequately reflects racial and ethnic disparities observed in COVID-19 case incidence in the state. This observation likely reflects residential segregation, which results in strong overlap between tract- and town-level race and ethnicity characteristics. As such, these town-level approaches are valuable in quickly and efficiently identifying disparities by race and ethnicity, controlling for other factors. However, poverty information is needed because of its effect as a risk factor independent of race/ethnicity [18]. The number of long-term care facility beds per capita and percent essential workers were also stable across the town and census-tract levels, consistently identifying these groups as vulnerable subpopulations within the COVID-19 pandemic. We recommend that future analyses of COVID-19 case data, and of other outcomes for which time-sensitive and geographically targeted interventions are of interest, explicitly consider the implications of spatial resolution of exposures and outcomes.

Reproducibility

The R code used to generate figures and results for this paper is available as a supplemental file. Per the data use agreement with MDPH, we are not permitted to publicly share the town- and census-tract-level data used for these analyses. These datasets may be obtained by request to MDPH and can be used in conjunction with the shared code to reproduce the analyses in this article.

Funding

This project is supported by National Institutes of Health (NIH)/National Institute on Minority Health and Health Disparities (NIMHD) Grant P50MD010428 (PIs: J. Levy and F. Laden) and by an unrestricted gift from Google LLC to support research on COVID-19 and health equity (PI: J. Levy). B. Haley was supported by National Institute of Environmental Health Sciences (NIEHS) Grant T32ES014562.

Footnotes

Conflict of Interests: The authors declare the following financial interests/personal relationships that may be considered as potential competing interests: Jonathan I. Levy reports financial support was provided by Google Inc.

Appendix

See Table S1 and Table S2.

Table S1.

Variable names, shorthand descriptions, and full descriptions of all candidate variables for modeling

Variable name | Shorthand description | Full description
ACS_20_min | % Age <20 | % of residents younger than 20 y
ACS_80_plus | % Age >80 | % of residents older than 80 y
ACS_AIAN | % American Indian or Alaskan Native | % of residents identifying as American Indian or Alaskan Native
ACS_black | % Black | % of residents identifying as African-American/Black
ACS_disabilities | % Disabilities | % of residents with reported disabilities
ACS_latino | % Latino | % of residents identifying as Latino/Hispanic
ACS_more_1andhalf_per_room | % More than 1.5 per room | % housing units with occupancy >1.5 persons/room
ACS_more_2_per_room | % More than 2 per room | % housing units with occupancy >2 persons/room
ACS_not_insured | % Not insured | % of residents without health insurance coverage
ACS_poverty | % Below poverty | % of residents living below poverty line
ACS_public_trans | % Public transportation | % of residents traveling to work by public transportation
ACS_services | % Essential services | % of residents with an essential service occupation
Beds_LTCF | # Beds LTCF | Number of long-term care facility beds per capita
Grad | % Graduate students | % of residents enrolled in graduate higher education
HUD | Housing unit density | Housing unit density (units per square mile)
Num_LTCF | # LTCF | Number of long-term care facilities per capita
Students | % Students | % of residents enrolled in undergraduate or graduate higher education (Undergrad + Graduate)
Undergrad | % Undergraduates | % of residents enrolled in undergraduate higher education
Urbanicity | % Urbanicity | % of town/census tract classified as urban

Table S2.

Estimates, 95% CI bounds for each variable in each model configuration

| Model | Variable name | Estimate | P-value | 95% CI LB | 95% CI UB | Outcome | Variable selection | Random effect |
|---|---|---|---|---|---|---|---|---|
| M1 | ACS_not_insured | -0.0235 | 0.2772 | -0.0659 | 0.0189 | Town | Town | County |
| M1 | ACS_poverty | -0.0818 | 0.0029 | -0.1356 | -0.0280 | Town | Town | County |
| M1 | ACS_latino | 0.1698 | 0.0000 | 0.1206 | 0.2189 | Town | Town | County |
| M1 | ACS_AIAN | -0.0419 | 0.0184 | -0.0768 | -0.0071 | Town | Town | County |
| M1 | ACS_black | 0.0937 | 0.0000 | 0.0545 | 0.1328 | Town | Town | County |
| M1 | ACS_disabilities | 0.0536 | 0.0463 | 0.0009 | 0.1062 | Town | Town | County |
| M1 | ACS_more_1andhalf_per_room | 0.0493 | 0.0271 | 0.0056 | 0.0931 | Town | Town | County |
| M1 | ACS_more_2_per_room | -0.0582 | 0.0027 | -0.0962 | -0.0202 | Town | Town | County |
| M1 | ACS_public_trans | -0.0019 | 0.9406 | -0.0521 | 0.0483 | Town | Town | County |
| M1 | ACS_services | 0.1834 | 0.0000 | 0.1363 | 0.2306 | Town | Town | County |
| M1 | ACS_80_plus | -0.0016 | 0.9463 | -0.0480 | 0.0448 | Town | Town | County |
| M1 | ACS_20_min | 0.0383 | 0.1136 | -0.0091 | 0.0857 | Town | Town | County |
| M1 | Beds_LTCF | 0.1314 | 0.0000 | 0.0949 | 0.1679 | Town | Town | County |
| M1 | Grad | 0.0330 | 0.1300 | -0.0097 | 0.0756 | Town | Town | County |
| M1 | Undergrad | 0.0226 | 0.1855 | -0.0109 | 0.0561 | Town | Town | County |
| M2 | ACS_not_insured | -0.0519 | 0.0639 | -0.1067 | 0.0030 | Town | Town | Town |
| M2 | ACS_poverty | -0.1507 | 0.0000 | -0.2208 | -0.0805 | Town | Town | Town |
| M2 | ACS_latino | 0.2276 | 0.0000 | 0.1647 | 0.2906 | Town | Town | Town |
| M2 | ACS_AIAN | -0.0816 | 0.0002 | -0.1241 | -0.0391 | Town | Town | Town |
| M2 | ACS_black | 0.1246 | 0.0000 | 0.0746 | 0.1747 | Town | Town | Town |
| M2 | ACS_disabilities | 0.0258 | 0.4673 | -0.0438 | 0.0954 | Town | Town | Town |
| M2 | ACS_more_1andhalf_per_room | 0.0443 | 0.1321 | -0.0133 | 0.1019 | Town | Town | Town |
| M2 | ACS_more_2_per_room | -0.0767 | 0.0028 | -0.1271 | -0.0264 | Town | Town | Town |
| M2 | ACS_public_trans | 0.0774 | 0.0043 | 0.0242 | 0.1306 | Town | Town | Town |
| M2 | ACS_services | 0.2400 | 0.0000 | 0.1783 | 0.3017 | Town | Town | Town |
| M2 | ACS_80_plus | -0.0186 | 0.5415 | -0.0784 | 0.0411 | Town | Town | Town |
| M2 | ACS_20_min | 0.1418 | 0.0000 | 0.0809 | 0.2027 | Town | Town | Town |
| M2 | Beds_LTCF | 0.1750 | 0.0000 | 0.1264 | 0.2236 | Town | Town | Town |
| M2 | Grad | -0.0001 | 0.9959 | -0.0557 | 0.0554 | Town | Town | Town |
| M2 | Undergrad | 0.0084 | 0.7142 | -0.0366 | 0.0534 | Town | Town | Town |
| M3 | ACS_poverty | -0.1042 | 0.0004 | -0.1617 | -0.0467 | Town | Tract | County |
| M3 | ACS_latino | 0.1765 | 0.0000 | 0.1262 | 0.2267 | Town | Tract | County |
| M3 | ACS_AIAN | -0.0416 | 0.0180 | -0.0761 | -0.0071 | Town | Tract | County |
| M3 | ACS_black | 0.0916 | 0.0000 | 0.0503 | 0.1329 | Town | Tract | County |
| M3 | ACS_disabilities | 0.0490 | 0.0712 | -0.0042 | 0.1023 | Town | Tract | County |
| M3 | ACS_more_2_per_room | -0.0358 | 0.0322 | -0.0686 | -0.0030 | Town | Tract | County |
| M3 | ACS_public_trans | 0.0008 | 0.9822 | -0.0701 | 0.0717 | Town | Tract | County |
| M3 | ACS_services | 0.1797 | 0.0000 | 0.1336 | 0.2257 | Town | Tract | County |
| M3 | ACS_80_plus | 0.0041 | 0.8628 | -0.0428 | 0.0511 | Town | Tract | County |
| M3 | ACS_20_min | 0.0310 | 0.1985 | -0.0163 | 0.0784 | Town | Tract | County |
| M3 | Num_LTCF | -0.0144 | 0.6815 | -0.0834 | 0.0545 | Town | Tract | County |
| M3 | HUD | -0.0387 | 0.3590 | -0.1214 | 0.0440 | Town | Tract | County |
| M3 | Urbanicity | 0.0511 | 0.1973 | -0.0266 | 0.1287 | Town | Tract | County |
| M3 | Beds_LTCF | 0.1388 | 0.0001 | 0.0701 | 0.2076 | Town | Tract | County |
| M3 | Grad | 0.0376 | 0.0984 | -0.0070 | 0.0822 | Town | Tract | County |
| M3 | Students | 0.0254 | 0.1751 | -0.0113 | 0.0620 | Town | Tract | County |
| M4 | ACS_poverty | -0.2096 | 0.0000 | -0.2829 | -0.1363 | Town | Tract | Town |
| M4 | ACS_latino | 0.1915 | 0.0000 | 0.1256 | 0.2575 | Town | Tract | Town |
| M4 | ACS_AIAN | -0.0817 | 0.0001 | -0.1222 | -0.0413 | Town | Tract | Town |
| M4 | ACS_black | 0.1039 | 0.0001 | 0.0517 | 0.1561 | Town | Tract | Town |
| M4 | ACS_disabilities | 0.0252 | 0.4706 | -0.0433 | 0.0938 | Town | Tract | Town |
| M4 | ACS_more_2_per_room | -0.0546 | 0.0100 | -0.0962 | -0.0131 | Town | Tract | Town |
| M4 | ACS_public_trans | -0.0023 | 0.9544 | -0.0799 | 0.0754 | Town | Tract | Town |
| M4 | ACS_services | 0.2151 | 0.0000 | 0.1553 | 0.2750 | Town | Tract | Town |
| M4 | ACS_80_plus | -0.0123 | 0.6879 | -0.0721 | 0.0476 | Town | Tract | Town |
| M4 | ACS_20_min | 0.1408 | 0.0000 | 0.0781 | 0.2036 | Town | Tract | Town |
| M4 | Num_LTCF | -0.0548 | 0.2503 | -0.1483 | 0.0386 | Town | Tract | Town |
| M4 | HUD | 0.0415 | 0.4297 | -0.0614 | 0.1444 | Town | Tract | Town |
| M4 | Urbanicity | 0.1140 | 0.0256 | 0.0139 | 0.2142 | Town | Tract | Town |
| M4 | Beds_LTCF | 0.2057 | 0.0000 | 0.1131 | 0.2984 | Town | Tract | Town |
| M4 | Grad | 0.0047 | 0.8699 | -0.0519 | 0.0614 | Town | Tract | Town |
| M4 | Students | 0.0068 | 0.7820 | -0.0413 | 0.0549 | Town | Tract | Town |
| M5 | ACS_poverty | 0.0225 | 0.0781 | -0.0025 | 0.0476 | Tract | Tract | Town |
| M5 | ACS_latino | 0.1619 | 0.0000 | 0.1309 | 0.1929 | Tract | Tract | Town |
| M5 | ACS_AIAN | 0.0092 | 0.1968 | -0.0048 | 0.0232 | Tract | Tract | Town |
| M5 | ACS_black | 0.0583 | 0.0000 | 0.0378 | 0.0789 | Tract | Tract | Town |
| M5 | ACS_disabilities | 0.0077 | 0.4950 | -0.0144 | 0.0298 | Tract | Tract | Town |
| M5 | ACS_more_2_per_room | -0.0049 | 0.4720 | -0.0183 | 0.0085 | Tract | Tract | Town |
| M5 | ACS_public_trans | 0.0006 | 0.9657 | -0.0279 | 0.0291 | Tract | Tract | Town |
| M5 | ACS_services | 0.0972 | 0.0000 | 0.0776 | 0.1168 | Tract | Tract | Town |
| M5 | ACS_80_plus | -0.0145 | 0.1670 | -0.0351 | 0.0061 | Tract | Tract | Town |
| M5 | ACS_20_min | -0.0543 | 0.0000 | -0.0776 | -0.0309 | Tract | Tract | Town |
| M5 | Num_LTCF | 0.0819 | 0.0000 | 0.0635 | 0.1004 | Tract | Tract | Town |
| M5 | HUD | -0.0466 | 0.0000 | -0.0663 | -0.0269 | Tract | Tract | Town |
| M5 | Urbanicity | 0.0406 | 0.0039 | 0.0130 | 0.0682 | Tract | Tract | Town |
| M5 | Beds_LTCF | 0.0387 | 0.0006 | 0.0167 | 0.0607 | Tract | Tract | Town |
| M5 | Grad | -0.0025 | 0.8295 | -0.0249 | 0.0200 | Tract | Tract | Town |
| M5 | Students | -0.0913 | 0.0000 | -0.1121 | -0.0705 | Tract | Tract | Town |
| M6 | ACS_not_insured | 0.0314 | 0.0006 | 0.0135 | 0.0492 | Tract | Town | Town |
| M6 | ACS_poverty | 0.0112 | 0.3874 | -0.0142 | 0.0365 | Tract | Town | Town |
| M6 | ACS_latino | 0.1550 | 0.0000 | 0.1223 | 0.1877 | Tract | Town | Town |
| M6 | ACS_AIAN | 0.0078 | 0.2864 | -0.0065 | 0.0221 | Tract | Town | Town |
| M6 | ACS_black | 0.0650 | 0.0000 | 0.0439 | 0.0861 | Tract | Town | Town |
| M6 | ACS_disabilities | 0.0171 | 0.1360 | -0.0054 | 0.0396 | Tract | Town | Town |
| M6 | ACS_more_1andhalf_per_room | -0.0008 | 0.9258 | -0.0181 | 0.0164 | Tract | Town | Town |
| M6 | ACS_more_2_per_room | -0.0072 | 0.3769 | -0.0233 | 0.0088 | Tract | Town | Town |
| M6 | ACS_public_trans | -0.0054 | 0.7148 | -0.0345 | 0.0236 | Tract | Town | Town |
| M6 | ACS_services | 0.0870 | 0.0000 | 0.0658 | 0.1081 | Tract | Town | Town |
| M6 | ACS_80_plus | 0.0161 | 0.1097 | -0.0036 | 0.0357 | Tract | Town | Town |
| M6 | ACS_20_min | -0.0494 | 0.0000 | -0.0732 | -0.0257 | Tract | Town | Town |
| M6 | Beds_LTCF | 0.0928 | 0.0000 | 0.0731 | 0.1125 | Tract | Town | Town |
| M6 | Grad | -0.0243 | 0.0217 | -0.0451 | -0.0036 | Tract | Town | Town |
| M6 | Undergrad | -0.0835 | 0.0000 | -0.1032 | -0.0639 | Tract | Town | Town |
| M7 | ACS_poverty | 0.0441 | 0.0029 | 0.0151 | 0.0731 | Tract | Tract | County |
| M7 | ACS_latino | 0.2001 | 0.0000 | 0.1735 | 0.2266 | Tract | Tract | County |
| M7 | ACS_AIAN | 0.0175 | 0.0411 | 0.0007 | 0.0343 | Tract | Tract | County |
| M7 | ACS_black | 0.0671 | 0.0000 | 0.0460 | 0.0882 | Tract | Tract | County |
| M7 | ACS_disabilities | 0.0008 | 0.9483 | -0.0241 | 0.0257 | Tract | Tract | County |
| M7 | ACS_more_2_per_room | -0.0042 | 0.6199 | -0.0206 | 0.0123 | Tract | Tract | County |
| M7 | ACS_public_trans | -0.0857 | 0.0000 | -0.1142 | -0.0572 | Tract | Tract | County |
| M7 | ACS_services | 0.1743 | 0.0000 | 0.1558 | 0.1927 | Tract | Tract | County |
| M7 | ACS_80_plus | 0.0066 | 0.5762 | -0.0165 | 0.0297 | Tract | Tract | County |
| M7 | ACS_20_min | -0.0917 | 0.0000 | -0.1159 | -0.0675 | Tract | Tract | County |
| M7 | Num_LTCF | 0.0873 | 0.0000 | 0.0660 | 0.1085 | Tract | Tract | County |
| M7 | HUD | -0.0549 | 0.0000 | -0.0793 | -0.0306 | Tract | Tract | County |
| M7 | Urbanicity | 0.0785 | 0.0000 | 0.0496 | 0.1074 | Tract | Tract | County |
| M7 | Beds_LTCF | 0.0282 | 0.0080 | 0.0074 | 0.0490 | Tract | Tract | County |
| M7 | Grad | -0.0352 | 0.0069 | -0.0608 | -0.0097 | Tract | Tract | County |
| M7 | Students | -0.0524 | 0.0000 | -0.0756 | -0.0293 | Tract | Tract | County |
| M8 | ACS_not_insured | 0.0355 | 0.0011 | 0.0143 | 0.0568 | Tract | Town | County |
| M8 | ACS_poverty | 0.0443 | 0.0030 | 0.0151 | 0.0735 | Tract | Town | County |
| M8 | ACS_latino | 0.2105 | 0.0000 | 0.1827 | 0.2383 | Tract | Town | County |
| M8 | ACS_AIAN | 0.0176 | 0.0453 | 0.0004 | 0.0349 | Tract | Town | County |
| M8 | ACS_black | 0.0833 | 0.0000 | 0.0619 | 0.1048 | Tract | Town | County |
| M8 | ACS_disabilities | 0.0122 | 0.3481 | -0.0133 | 0.0377 | Tract | Town | County |
| M8 | ACS_more_1andhalf_per_room | 0.0010 | 0.9249 | -0.0206 | 0.0226 | Tract | Town | County |
| M8 | ACS_more_2_per_room | -0.0079 | 0.4337 | -0.0278 | 0.0119 | Tract | Town | County |
| M8 | ACS_public_trans | -0.0817 | 0.0000 | -0.1096 | -0.0538 | Tract | Town | County |
| M8 | ACS_services | 0.1637 | 0.0000 | 0.1429 | 0.1845 | Tract | Town | County |
| M8 | ACS_80_plus | 0.0381 | 0.0008 | 0.0159 | 0.0603 | Tract | Town | County |
| M8 | ACS_20_min | -0.0961 | 0.0000 | -0.1206 | -0.0715 | Tract | Town | County |
| M8 | Beds_LTCF | 0.0685 | 0.0000 | 0.0490 | 0.0881 | Tract | Town | County |
| M8 | Grad | -0.0460 | 0.0002 | -0.0697 | -0.0222 | Tract | Town | County |
| M8 | Undergrad | -0.0394 | 0.0004 | -0.0611 | -0.0177 | Tract | Town | County |

These data are plotted in main text Fig. 1.
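As a reading aid, the sketch below converts a few Table S2 rows into rate ratios and compares confidence-interval widths between a town-outcome model (M1) and a tract-outcome model (M5). It assumes the negative binomial models use a log link, so that exponentiating a coefficient gives a multiplicative rate ratio per unit of the covariate; the link and the covariate scaling are not stated in this section, and the helper names `rate_ratio` and `ci_width` are hypothetical.

```python
import math

# Rows copied from Table S2: (model, variable, estimate, 95% CI LB, 95% CI UB).
ROWS = [
    ("M1", "ACS_latino", 0.1698, 0.1206, 0.2189),
    ("M5", "ACS_latino", 0.1619, 0.1309, 0.1929),
    ("M1", "ACS_poverty", -0.0818, -0.1356, -0.0280),
    ("M5", "ACS_poverty", 0.0225, -0.0025, 0.0476),
]


def rate_ratio(estimate, lb, ub):
    """Exponentiate a coefficient and its CI bounds.

    ASSUMPTION: coefficients are on the log scale (log link).
    """
    return tuple(math.exp(v) for v in (estimate, lb, ub))


def ci_width(lb, ub):
    """Width of the confidence interval on the coefficient scale."""
    return ub - lb


for model, variable, est, lb, ub in ROWS:
    rr, rr_lb, rr_ub = rate_ratio(est, lb, ub)
    print(
        f"{model} {variable}: rate ratio {rr:.3f} "
        f"({rr_lb:.3f}, {rr_ub:.3f}); CI width {ci_width(lb, ub):.4f}"
    )
```

For `ACS_latino`, the tract-outcome interval (M5) is narrower than the town-outcome interval (M1), which is the pattern of tighter census-tract intervals described in the abstract.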

See Fig. S1.

Fig. S1.

Significance and direction of each variable's association, by model. “*” indicates statistical significance, blue indicates a positive association, and tan indicates a negative association. Empty cells indicate that the variable was not selected for that model.
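The encoding described in this caption can be reproduced from Table S2. The sketch below is illustrative only: it assumes significance means P < 0.05 (the threshold is not stated in this section), and `cell` and `ALPHA` are hypothetical names, not the authors' plotting code.

```python
ALPHA = 0.05  # ASSUMPTION: significance threshold, not stated in this section

# Rows copied from Table S2: (model, variable, estimate, P-value).
ROWS = [
    ("M1", "ACS_poverty", -0.0818, 0.0029),
    ("M5", "ACS_poverty", 0.0225, 0.0781),
    ("M7", "ACS_poverty", 0.0441, 0.0029),
    ("M1", "ACS_not_insured", -0.0235, 0.2772),
]
MODELS = ["M1", "M5", "M7"]


def cell(estimate, p_value, alpha=ALPHA):
    """Return the colour for the sign of the estimate, plus '*' if significant."""
    colour = "blue" if estimate > 0 else "tan"
    star = "*" if p_value < alpha else ""
    return colour + star


# Build a variable-by-model grid; a missing entry means the variable
# was not selected for that model (an empty cell in the figure).
grid = {}
for model, variable, estimate, p_value in ROWS:
    grid.setdefault(variable, {})[model] = cell(estimate, p_value)

for variable, by_model in grid.items():
    print(variable, [by_model.get(m, "") for m in MODELS])
```

In this excerpt, `ACS_poverty` changes from a significant negative association in M1 to a significant positive one in M7, while `ACS_not_insured` has empty cells for M5 and M7 because it was not selected in the tract-level variable selection.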


Articles from Annals of Epidemiology are provided here courtesy of Elsevier
