Influence of geospatial resolution on sociodemographic predictors of COVID-19 in Massachusetts

Prasad Patil; Xiaojing Peng; Beth M Haley; Keith R Spangler; Koen F Tieskens; Kevin J Lane; Fei Carnes; M Patricia Fabian; R Monina Klevens; T Scott Troppy; Jessica H Leibler; Jonathan I Levy

doi:10.1016/j.annepidem.2023.02.007

2023 Feb 21;80:62–68.e3.

PMCID: PMC9942453 PMID: 36822278

Abstract

Purpose

When studying health risks across a large geographic region such as a state or province, researchers often assume that finer-resolution data on health outcomes and risk factors will improve inferences by avoiding ecological bias and other issues associated with geographic aggregation. However, coarser-resolution data (e.g., at the town or county-level) are more commonly publicly available and packaged for easier access, allowing for rapid analyses. The advantages and limitations of using finer-resolution data, which may improve precision at the cost of time spent gaining access and processing data, have not been considered in detail to date.

Methods

We systematically examine the implications of conducting town-level mixed-effect regression analyses versus census-tract-level analyses to study sociodemographic predictors of COVID-19 in Massachusetts. In a series of negative binomial regressions, we vary the spatial resolution of the outcome, the resolution of variable selection, and the resolution of the random effect to allow for more direct comparison across models.

Results

We find stability in some estimates across scenarios, changes in magnitude, direction, and significance in others, and tighter confidence intervals on the census-tract level. Conclusions regarding sociodemographic predictors are robust when regions of high concentration remain consistent across town and census-tract resolutions.

Conclusions

Inferences about high-risk populations may be misleading if derived from town- or county-resolution data, especially for covariates that capture small subgroups (e.g., small racial minority populations) or are geographically concentrated or skewed (e.g., % college students). Our analysis can help inform more rapid and efficient use of public health data by identifying when finer-resolution data are truly most informative, or when coarser-resolution data may be misleading.

Keywords: COVID-19, Regression analysis, Spatial resolution, Mixed-effect modeling

Introduction

The COVID-19 pandemic has emphasized the importance of rapid access to high-quality health data. Throughout the pandemic, timely access to accurately measured COVID-19 data has allowed for the identification of risk factors and development of targeted, evidence-based interventions to protect populations at elevated risk [1], [2]. In part because of its importance for individual and institutional decision-making and in part to supplement analyses of case surveillance data at state public health departments [3], aggregated COVID-19 case data have been routinely made available for researchers and the general public [4], [5]. In many states, data on cases, hospitalizations, and deaths associated with COVID-19 have been released regularly at varying levels of spatial aggregation. For example, in Massachusetts, the Massachusetts Department of Public Health (MDPH) has released data weekly at the town level since April 2020. These data are packaged in spreadsheets in ready-to-use formats, allowing for rapid analyses by researchers within and outside of MDPH, and such usability has improved pandemic response.

Access to finer-resolution COVID-19 outcomes, such as those at the census-tract or individual level, is more limited. Availability of individual-level data is limited by patient privacy requirements under the Health Insurance Portability and Accountability Act (HIPAA), and use of such data is tightly restricted. When analyses are conducted rapidly for imminent policy guidance, there is increased potential for misuse or adversarial access to personally identifiable information. In addition, each health department (state and local) may have individual privacy and suppression rules that need to be applied to the data. However, individual data can be aggregated into sub-town-level geographic units, such as census tracts, while maintaining individual confidentiality. Data at the census-tract level have particular utility because many publicly available data sources on sociodemographics and environmental factors, such as the American Community Survey (ACS) [6], also present data on the tract level. Health outcome data on the tract level can be analyzed alongside tract-level predictors to inform a nuanced understanding of risk factors, allowing for a higher-resolution analysis and potentially more valid and actionable inferences.

The value of finely resolved geospatial data in understanding specific exposures and health outcomes experienced by small populations is well understood, including avoiding challenges related to aggregation such as the Modifiable Areal Unit Problem [7]. Such analyses, using residential geocoding of cases and community-level data attached to small geographic units, can identify important predictors that are relevant at the neighborhood level, and have particular utility in exploring disproportionate environmental exposures and health disparities. Using a combination of individual and local data can highlight locations in which racism and other forms of structural bias may have contributed to elevated risks in identified communities over time. Place-based assessments of exposure-outcome relationships have specific value during periods of pandemic, but are also employed in nonpandemic disease surveillance and environmental health and justice research to better understand health disparities [8], [9], [10].

In the spring of 2020, researchers at the Boston University School of Public Health and MDPH initiated a collaboration to evaluate sociodemographic and environmental predictors of COVID-19 at the census-tract level. The goal of this partnership was to leverage existing data resources available at the census-tract level to inform a more comprehensive and potentially actionable understanding of COVID-19 patterns in the state. MDPH provided data on the date of diagnosis and home addresses of individuals with Polymerase Chain Reaction (PCR)-confirmed cases of COVID-19 from March 2020 to February 2021. These data were subsequently geocoded by the Boston University School of Public Health team and used in random-effect regression model analyses alongside ACS and other geospatial predictors to identify patterns in risk factors over time [5]. We also evaluated and published a parallel analysis on the town level, using publicly available MDPH COVID-19 outcome data and town-level sociodemographic predictors from ACS [4]. Between these two analyses, we observed general consistency in key conclusions but some important differences in patterns of association for the same sets of predictors at different spatial resolutions. However, the models were not constructed identically or varied systematically to evaluate the specific elements that led to similar versus different conclusions.

In our current study, we vary the spatial scale of three components of the random-effect regression modeling process within a structured experimental design: (1) the outcome (COVID-19 cases), (2) sociodemographic and environmental variable selection, and (3) the random-effect term. The random-effect term captures group variation, such as correlation at the town- or county-level, without explicitly estimating a coefficient for the term. Although this is not a comprehensive list of all statistical modeling decisions, it is representative of common steps often taken in this approach that are likely to affect inference. We discuss which estimates changed by varying these inputs and the importance of these changes for both epidemiologic assessments and policy-making.
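As a reading aid, the kind of model described above can be written out as a random-intercept count regression. This is only a sketch: the negative binomial family comes from the abstract, while the log link, the symbols (Y_i, x_ik, u_g, theta, sigma_u), and the absence of an offset term are assumptions made here for illustration rather than details stated in the text.

```latex
\begin{aligned}
Y_i \mid u_{g(i)} &\sim \mathrm{NegBin}(\mu_i, \theta), \\
\log \mu_i &= \beta_0 + \sum_{k} \beta_k x_{ik} + u_{g(i)}, \\
u_g &\sim \mathcal{N}(0, \sigma_u^2).
\end{aligned}
```

Here i indexes the spatial unit of the outcome (town or tract), g(i) is the group that unit i belongs to (county or town), and only the variance of u_g is estimated rather than a separate coefficient for each group.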

Previous work has reported on variation caused by and pitfalls due to spatial resolution [11], [12], [13], [14], but to our knowledge, there has not been a systematic exploration of how choices in a statistical analysis contribute to these issues. The goal of this project is to examine the influence of analytic choices on inferences and conclusions drawn from models trained at different geospatial resolutions. While our approach is not a prescription for how to conduct such analyses in all circumstances, our intention is to inform and support decision-making about the trade-offs and limitations associated with using data aggregated across different geographic resolutions.

Methods

Data sources

In this analysis, we compare the associations between a series of sociodemographic variables with hypothesized and demonstrated relevance for COVID-19 case data collected at different spatial levels. In our first analysis [4], we evaluated town-level sociodemographic data alongside town-level COVID-19 case data. All data were publicly available. COVID-19 case data were extracted from the MDPH website from April to October 2020 [15]. These data reflected all laboratory-confirmed cases reported to the state and were publicly presented by town on the MDPH website. Town-level sociodemographic data were derived from the 2014 to 2018 ACS 5-year estimates for tracts [16], and then scaled to towns by aggregating tract counts to the appropriate town and calculating proportions using ACS town population estimates from the same survey. The ACS sociodemographic variables included in the model (Table S1) include percentage of persons identifying as African-American/Black and Latino/Hispanic by town population, as well as the percentage of persons by town living below the poverty line, older than 80 years, living in housing units with occupancy of greater than 2 persons per room, and without health insurance.
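The tract-to-town scaling step described above can be sketched as follows. The tract identifiers, town names, counts, and populations are made up for illustration, and the function name `town_percentages` is hypothetical; this is not the code used in the analysis.

```python
from collections import defaultdict

# Hypothetical tract-level records: (tract_id, town, count_with_trait, tract_population).
# All values are invented for illustration only.
tracts = [
    ("T1", "Town A", 120, 4000),
    ("T2", "Town A", 30, 2000),
    ("T3", "Town B", 50, 5000),
]

def town_percentages(tract_rows, town_population):
    """Sum tract counts within each town, then divide by the town population."""
    counts = defaultdict(int)
    for _tract_id, town, count, _tract_pop in tract_rows:
        counts[town] += count
    return {town: 100.0 * counts[town] / town_population[town] for town in counts}

# Hypothetical town population estimates (here equal to the sum of tract populations).
town_pop = {"Town A": 6000, "Town B": 5000}

town_pct = town_percentages(tracts, town_pop)
tract_pct = {tid: 100.0 * count / pop for tid, _town, count, pop in tracts}

print(town_pct)
print(tract_pct)
```

In this toy example the two tracts in Town A have different percentages that are averaged into a single town value, which is the kind of within-town variation that town-level covariates cannot represent.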

In our second analysis [5], we used individual-level COVID-19 case data provided by MDPH under strict protective conditions. These cases were geocoded based on residential home address and aggregated to the appropriate census tract for analysis. Census-tract-level data on the same series of sociodemographic variables were extracted from the ACS and included in analyses directly without scaling to town. See descriptions in Tieskens et al. [4] and Spangler et al. [5] for further details on how each variable is constructed.

We used the following covariates in our comparative analyses. All refer to percentages within town/census tract unless otherwise noted (see Table S1 in the supplement for complete descriptions): ACS_20_min, ACS_80_plus, ACS_AIAN, ACS_black, ACS_disabilities, ACS_latino, ACS_more_1andhalf_per_room, ACS_more_2_per_room, ACS_not_insured, ACS_poverty, ACS_public_trans, ACS_services (essential services), Beds_LTCF (number of beds in long-term care facilities), Grad, HUD (housing unit density), Num_LTCF (number of long-term care facilities), Students, Undergrad, and Urbanicity.

After excluding five census tracts with zero population, our statistical modeling was conducted on N_town = 351 towns and N_CT = 1462 census tracts.

Statistical modeling

We fit a series of eight random-effect models to the entire April–October 2020 time period using the glmmTMB package in R [17], outlined in Table 1. We restricted to this subset of dates for comparability with our earlier town-level analyses. The outcome of interest was COVID-19 cases. Across these models, we vary (1) the scale of the outcome (town-level or tract-level); (2) the scale at which variables are selected (town-level or tract-level); and (3) the scale of the random-effect term (town-level or county-level). To conduct variable selection, we first discard variables that exhibit high correlation (correlation coefficient > 0.7), then use a stepwise Poisson regression, and use only the subset of variables retained after the stepwise selection procedure is complete. We produce effect estimates and 95% confidence intervals for each covariate included in each model and report Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and log-likelihood estimates for each model fit.
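The first variable-selection step, discarding highly correlated variables, might look like the sketch below. The text does not say which member of a correlated pair is dropped or whether the threshold applies to the absolute correlation, so the greedy keep-the-first rule and the use of |r| are assumptions made here. The covariate names and values are invented, and the subsequent stepwise regression is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(columns, threshold=0.7):
    """Greedy filter (an assumption): keep a variable only if its absolute
    correlation with every previously kept variable is at most `threshold`."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Invented values for three hypothetical covariates across five areas.
toy = {
    "var_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "var_b": [2.1, 3.9, 6.2, 8.1, 9.8],  # nearly proportional to var_a
    "var_c": [3.0, 1.0, 4.0, 1.0, 5.0],
}

selected = drop_correlated(toy)
print(selected)
```

Because the filter is order-dependent, running it on town-level and tract-level covariates can yield different candidate sets even before any stepwise selection is applied.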

Table 1.

Models M1–M8 represent variation in the scale of the outcome, variable selection process, and random-effect term

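Label	Outcome	Variable selection	Random effect
M1	Town-level total number of cases	Town	County
M2	Town-level total number of cases	Town	Town
M3	Town-level total number of cases	Tract	County
M4	Town-level total number of cases	Tract	Town
M5	Tract-level total number of cases	Tract	Town
M6	Tract-level total number of cases	Town	Town
M7	Tract-level total number of cases	Tract	County
M8	Tract-level total number of cases	Town	County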
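The design in Table 1 can be encoded and checked programmatically. The sketch below copies the table into a dictionary, confirms that the eight models cover every combination of the three two-level factors, and lists which factors differ between any two models. The helper name `differing_factors` is hypothetical.

```python
from itertools import product

# Table 1 as (outcome, variable selection, random effect) per model label.
design = {
    "M1": ("Town", "Town", "County"),
    "M2": ("Town", "Town", "Town"),
    "M3": ("Town", "Tract", "County"),
    "M4": ("Town", "Tract", "Town"),
    "M5": ("Tract", "Tract", "Town"),
    "M6": ("Tract", "Town", "Town"),
    "M7": ("Tract", "Tract", "County"),
    "M8": ("Tract", "Town", "County"),
}

# Every combination of the three two-level factors (2 x 2 x 2 = 8).
full_factorial = set(product(("Town", "Tract"), ("Town", "Tract"), ("County", "Town")))
covers_all = set(design.values()) == full_factorial

def differing_factors(model_a, model_b):
    """Names of the design factors on which two models differ."""
    names = ("outcome", "variable selection", "random effect")
    return [n for n, a, b in zip(names, design[model_a], design[model_b]) if a != b]

print(covers_all)
print(differing_factors("M1", "M8"))
print(differing_factors("M1", "M5"))
```

A check like this makes it easy to pick model pairs that differ in exactly one factor, which is what allows a difference in estimates to be attributed to a single modeling choice.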


Results

Fig. 1 displays separate panels for the 19 candidate variables. Within each panel appear the coefficient estimates and 95% confidence intervals from each model (M1–M8). Each panel is separated into town-level outcome (left) and tract-level outcome (right); estimate shapes indicate the resolution of variable selection and fill indicates random-effect resolution. Table S2 in the supplement displays the plotted values.

One can compare, for example, how an estimate differs by choice of random effect by comparing open and filled points of the same shape on the same side of the panel. The town-level and tract-level stepwise variable selection procedures select different subsets of the 19 candidate variables, resulting in some panels having only 4 coefficient estimates. Generally, there are only a few variables (% Black residents, % Latino residents, % of residents with an essential service occupation [ACS_services], and # beds in long-term care facilities per capita [Beds_LTCF]) that maintain the same inference in terms of direction of effect and statistical significance across all model variations. One consistent trend observed from these estimates is that the 95% confidence intervals are much tighter for the models with census-tract-level outcomes (M5–M8) than they are for the corresponding town-level models (M1–M4). Model fit metrics are fairly close to each other, but within buckets of similar models suggest that tract-level variable selection is preferable to town-level and that a county-level random effect is preferred for a town-level regression, while a town-level random effect is preferred for a tract-level regression.
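The narrowing of confidence intervals can be checked directly from Table S2. The sketch below copies the 95% CI bounds for four variables that appear in both M1 and M5 and computes the ratio of interval widths. M1 and M5 differ in outcome, variable selection, and random effect, so this illustrates the overall pattern rather than isolating one factor.

```python
# (lower, upper) 95% CI bounds copied from Table S2 for variables in both M1 and M5.
ci_bounds = {
    "ACS_latino": {"M1": (0.1206, 0.2189), "M5": (0.1309, 0.1929)},
    "ACS_black": {"M1": (0.0545, 0.1328), "M5": (0.0378, 0.0789)},
    "ACS_services": {"M1": (0.1363, 0.2306), "M5": (0.0776, 0.1168)},
    "ACS_poverty": {"M1": (-0.1356, -0.0280), "M5": (-0.0025, 0.0476)},
}

def width(bounds):
    """Width of a confidence interval given as (lower, upper)."""
    lower, upper = bounds
    return upper - lower

# Ratio below 1 means the tract-level (M5) interval is narrower than the town-level (M1) one.
width_ratio = {
    name: width(models["M5"]) / width(models["M1"])
    for name, models in ci_bounds.items()
}

for name, ratio in width_ratio.items():
    print(f"{name}: M5 interval is {ratio:.2f} times as wide as the M1 interval")
```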

As noted, # LTCF Beds, % Essential services, % Latino residents, and % Black residents maintain the same direction and significance of association across all eight model variations, and all have a significantly positive association with the outcome. % with disabilities has a consistent positive association, but is only significant in M1; % more than two per room has a consistent negative association, but is only significant in town-level outcome models (M1–M4). All other variables demonstrate variation by model in whether or not they are selected in variable selection or vary in both direction and significance of association. While these overall trends help pinpoint constructs that retain a consistent association across modeling choices, the reasons for inconsistent or conflicting associations are numerous and potentially complex.

Fig. 2 displays a subset of comparisons between models to better illustrate the specific effects of singular changes to the modeling procedure. Each cell represents the direction (blue = positive, tan = negative) and the statistical significance (“*” indicates P < .05) of each association in each model variation. White spaces indicate that the corresponding variable was not selected into the corresponding model (e.g., Urbanicity does not appear in model M1). The first comparison (M1 vs. M5) is between an “independent” analysis at the town level (which uses a town-level outcome, town-level variable selection, and “county” as the random effect) and an “independent” analysis at the census-tract level (which uses a census-tract-level outcome, census-tract-level variable selection, and “town” as the random effect). M1 and M5 most closely resemble the analyses we would conduct if we only had access to town-level or census-tract-level data, respectively. There are some immediate differences in the two models: (1) % Students, # Long Term Care Facilities, and Housing Unit Density are not selected via variable selection at the town level, but are significantly associated at the tract level; (2) % Below poverty line, % Age <20, and % American Indian or Alaskan Native residents switch signs and significance. We note, however, that the conditional interpretation of these variables is not identical across models because none of the outcome, variable selection resolution, or random effect are the same across the two models.
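The cell encoding used in Fig. 2 (direction plus a significance marker) can be reproduced from Table S2. The sketch below copies the M1 and M5 estimates and P-values for five variables and flags those whose direction or significance differs between the two models. The `cell` helper and its string encoding are illustrative choices, not the plotting code behind the figure.

```python
# (estimate, p_value) pairs copied from Table S2 for models M1 and M5.
table_s2 = {
    "ACS_poverty": {"M1": (-0.0818, 0.0029), "M5": (0.0225, 0.0781)},
    "ACS_AIAN": {"M1": (-0.0419, 0.0184), "M5": (0.0092, 0.1968)},
    "ACS_20_min": {"M1": (0.0383, 0.1136), "M5": (-0.0543, 0.0000)},
    "ACS_black": {"M1": (0.0937, 0.0000), "M5": (0.0583, 0.0000)},
    "ACS_latino": {"M1": (0.1698, 0.0000), "M5": (0.1619, 0.0000)},
}

def cell(estimate, p_value, alpha=0.05):
    """Encode direction as '+' or '-' and append '*' when p_value < alpha."""
    sign = "+" if estimate > 0 else "-"
    return sign + ("*" if p_value < alpha else "")

# Variables whose encoded cell differs between the town-level and tract-level analyses.
changed = [
    name for name, models in table_s2.items()
    if cell(*models["M1"]) != cell(*models["M5"])
]

for name, models in table_s2.items():
    print(name, cell(*models["M1"]), "->", cell(*models["M5"]))
```

With these rows, the variables flagged as changed are ACS_poverty, ACS_AIAN, and ACS_20_min, while ACS_black and ACS_latino keep a significant positive association in both models.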

The second comparison (M1 vs. M8) is more direct, where the only difference between the two models is the resolution of the outcome. This ensures that the conditional interpretation of the coefficient estimates in each model is identical. In this comparison, we observe that % Below poverty line and % American Indian or Alaskan Native residents still switch signs, and that more variables are significant in the census-tract-level model.

The third and fourth comparisons illustrate the effect of changing variable selection resolution and random-effect resolution, respectively. We find that the overarching effects on inference are generally not as pronounced. We do observe that some variables not selected at the town level are significant at the tract level, and that spatially associated variables such as % Public Transportation change sign and significance when the random effect is changed. Looking across the four scenarios in Fig. 2, it appears that the types of changes observed across comparisons two through four compound to produce the substantial variation we observe in comparison one.

Discussion

Our model comparison varied the resolution of the outcome (town vs. tract), variable selection (town vs. tract), and random effect (town vs. county) in the modeling of COVID cases in Massachusetts. Overall, we find that the geospatial resolution of the COVID outcome has by far the greatest influence on how much model results vary, with the resolution of variable selection and random effect causing modest additional variation. Awareness of this result is essential for interpreting research using low-resolution outcome data available publicly. By taking this systematic approach, we observed a few notable patterns regarding resolution of predictor variables and interpretations.

For a subset of covariates, patterns were consistent across modeling approaches, while for others, there were notable differences not just in statistical significance but also magnitude and direction of association. While the precise reasons vary across covariates, the underlying spatial structure and the extent of heterogeneity within towns is an important explanatory factor. For example, Fig. 3 displays heatmaps of % Black residents (which had a consistent positive association across all tested models) and % below poverty line (which switched signs from negative association at the town level to positive association at the census-tract level) at town and census-tract resolutions. We observe that for % Black residents, the geographic regions where these percentages are high remain roughly consistent between the town and tract resolution (due to a majority white population in the state of Massachusetts, this association is driven by cases in urban areas). In contrast, for % residents below poverty line, there are notable spatial differences between town and tract resolution, with many towns in Western Massachusetts having more pronounced areas with high percentages at the town level, and many towns in Eastern Massachusetts only displaying elevated % below the poverty line in a subset of tracts. While numerous factors influence the differential patterns across covariates, it is clear that formal assessments of heterogeneity within and between towns can provide valuable insight about the likelihood of model stability across geographic resolutions. For more in-depth interpretation of the associations found across these models, please refer to our previous work [4], [5].
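One simple way to assess heterogeneity within and between towns is to split the tract-level variation of a covariate into a within-town part and a between-town part. The sketch below is an unweighted sum-of-squares decomposition on invented data. It is an illustrative heuristic chosen here, not a method used in this analysis, and the function name `within_town_share` is hypothetical.

```python
def within_town_share(tract_values_by_town):
    """Fraction of the total tract-level sum of squares that lies within towns."""
    all_values = [v for values in tract_values_by_town.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    total_ss = sum((v - grand_mean) ** 2 for v in all_values)
    within_ss = 0.0
    for values in tract_values_by_town.values():
        town_mean = sum(values) / len(values)
        within_ss += sum((v - town_mean) ** 2 for v in values)
    return within_ss / total_ss

# Invented tract percentages for two hypothetical covariates in three hypothetical towns.
# In the first, towns differ from each other but tracts within a town are alike.
varies_between_towns = {"A": [30, 32, 31], "B": [2, 3, 1], "C": [5, 4, 6]}
# In the second, town averages are similar but tracts within a town differ widely.
varies_within_towns = {"A": [2, 30, 10], "B": [25, 3, 12], "C": [8, 28, 4]}

share_between_case = within_town_share(varies_between_towns)
share_within_case = within_town_share(varies_within_towns)

print(f"within-town share, first covariate: {share_between_case:.3f}")
print(f"within-town share, second covariate: {share_within_case:.3f}")
```

A covariate with a small within-town share behaves similarly at both resolutions, whereas a large within-town share signals that town-level averages hide most of the variation and that estimates may shift when moving to tracts.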

Fig. 3 — Panels display heatmaps with color spectrum representing regional % Black population and % population below poverty line at town and census-tract resolutions. Dark boundaries in all panels mark Massachusetts towns, while lighter boundaries in the right-hand-side panels mark Massachusetts census tracts. Breaks are manually defined to follow quantiles with additional custom cutoffs to emphasize lower-end values that are difficult to visualize due to the right-skewed distributions of each variable.

From a statistical standpoint, more variables tend to be selected into the model and more variables appear as statistically significant at the census-tract level. This is likely influenced by the increased sample size at the finer resolution (N_town = 351 towns and N_CT = 1462 census tracts). Statistical significance is not the only determinant of model interpretation and performance, but more precise covariate estimates can give policy makers increased confidence in deriving geographically targeted policies from these types of models. We also note that the four variables that maintain significant positive associations across all eight model configurations (% Black, % Latino, % Essential Services, and # Beds LTCF) are likely a result of homogeneity across census tracts as described above. We do not observe any consistent negative associations in the same manner, although this is likely due to the types of covariates preselected for this analysis—most were expected to be associated with increased rates of COVID-19 cases.

Broadly, while it is clear that a higher-resolution model would be preferred when available, it does come with a cost related to the increased effort to work with individual-level data and geocode addresses and spatial data. For public health agencies who have the internal capacity to do this on a regular basis, given that the analytical effort for regression modeling is essentially identical at higher resolution, we would recommend constructing higher-resolution regression models to avoid incorrect inferences about important covariates. This would also be the case for analyses that are not time-sensitive and are meant to derive broader insight for future planning. In settings where it is not realistic to geocode at sufficient speed to conduct time-sensitive analyses, our findings reinforce that many important insights (e.g., related to patterns by race/ethnicity) can be captured with town-level regressions. In any case, a careful assessment of whether risk factors are expected to be homogeneously or heterogeneously distributed across finer-resolution geographic units is critical for identifying which inferences may be influenced by geospatial resolution.

There are multiple limitations of our analyses that should be acknowledged. First, we modeled cases only from April to October 2020. The sociodemographic patterns of cases may have differed in other time periods, with corresponding differences in the insights available from census tract versus town resolution models. In addition, the patterns may have differed in other states and our findings may not be generalizable. That said, the general insight that patterns will differ across covariates as a function of their underlying spatial structure is likely robust, with the specific manifestation differing over time and space. In addition, our models are ecological in nature and cannot provide individual-level inferences (i.e., we found elevated case rates in towns or census tracts with higher % Black residents, but that does not necessarily imply that individuals who are Black have higher risks). However, such models are valuable because SES data are often not available at the individual level and supplementing with geography-based data from ACS is necessary for adequate model adjustment. While census tracts were a logical geographic resolution given available census data and the fact that tracts fall neatly into a standard geographic hierarchy, policy makers may be interested in insights for more politically interpretable aggregations such as ZIP codes. We did not analyze ZIP codes in part because they do not reflect underlying population patterns to the same extent as census tracts, but future studies should assess the implications of modeling at ZIP code resolution relative to town and/or census tract.

From a data modeling perspective, this study examines a restricted choice set for a single model specification in random-effect models. There are a number of valid approaches we do not explore, including nested models, generalized estimating equations, and Bayesian multilevel modeling. These approaches will have their own choice set variations, and regardless we did not study important topics such as interaction effects, random slope specification, or estimating/interpreting the intraclass correlation coefficient. As such, although we can provide some broad conclusions on the influences of data resolution, we are unable to generalize across all modeling frameworks and choices available to practitioners.

In spite of these limitations, our models offer important insight. Overall, our modeling in Massachusetts indicates that town-level modeling adequately reflects racial and ethnic disparities observed in COVID-19 case incidence in the state. This observation likely reflects residential segregation, which results in strong overlap between tract and town-level race and ethnicity characteristics. As such, these town-level approaches are valuable in quickly and efficiently identifying disparities by race and ethnicity, controlling for other factors. However, poverty information is needed because of its effect as an independent risk factor from race/ethnicity [18]. The number of long-term care facility beds per capita and percent essential workers were also stable across town and census-tract level, consistently identifying these groups as vulnerable subpopulations within the COVID-19 pandemic. We recommend that future analyses of COVID-19 case data and other outcomes for which time-sensitive and geographically targeted interventions are of interest explicitly consider the implications of spatial resolution of exposures and outcomes.

Reproducibility

The R code used to generate figures and results for this paper is available as a supplemental file. Per the data use agreement with MDPH, we are not permitted to publicly share the town- and census-tract-level data used for these analyses. These datasets may be obtained by request to MDPH and can be used in conjunction with the shared code to reproduce the analyses in this article.

Funding

This project is supported by National Institutes of Health (NIH)/National Institute on Minority Health and Health Disparities (NIMHD) Grant P50MD010428 (PIs: J. Levy and F. Laden) and by an unrestricted gift from Google LLC to support research on COVID-19 and health equity (PI: J. Levy). B. Haley was supported by National Institute of Environmental Health Sciences (NIEHS) Grant T32ES014562.

Footnotes

Conflict of Interests: The authors declare the following financial interests/personal relationships that may be considered as potential competing interests: Jonathan I. Levy reports financial support was provided by Google Inc.

Appendix

See Table S1 and Table S2.

Table S1.

Variable names, shorthand descriptions, and full descriptions of all candidate variables for modeling

Variable name	Shorthand description	Full description
ACS_20_min	% Age <20	% of residents younger than 20 y
ACS_80_plus	% Age >80	% of residents older than 80 y
ACS_AIAN	% American Indian or Alaskan Native	% of residents identifying as American Indian or Alaskan Native
ACS_black	% Black	% of residents identifying as African-American/Black
ACS_disabilities	% Disabilities	% of residents with reported disabilities
ACS_latino	% Latino	% of residents identifying as Latino/Hispanic
ACS_more_1andhalf_per_room	% More than 1.5 per room	% housing units with occupancy >1.5 persons/room
ACS_more_2_per_room	% More than 2 per room	% housing units with occupancy >2 persons/room
ACS_not_insured	% Not insured	% of residents without health insurance coverage
ACS_poverty	% Below poverty	% of residents living below poverty line
ACS_public_trans	% Public transportation	% of residents traveling to work by public transportation
ACS_services	% Essential services	% of residents with an essential service occupation
Beds_LTCF	# Beds LTCF	Number of long-term care facility beds per capita
Grad	% Graduate students	% of residents enrolled in graduate higher education
HUD	Housing unit density	Housing unit density (units per square mile)
Num_LTCF	# LTCF	Number of long-term care facilities per capita
Students	% Students	% of residents enrolled in undergraduate or graduate higher education (Undergrad + Graduate)
Undergrad	% Undergraduates	% of residents enrolled in undergraduate higher education
Urbanicity	% Urbanicity	% of town/census tract classified as urban


Table S2.

Estimates, P-values, and 95% CI bounds for each variable in each model configuration

Model	Variable name	Estimate	P-value	95% CI LB	95% CI UB	Outcome	Variable selection	Random effect
M1	ACS_not_insured	-0.0235	0.2772	-0.0659	0.0189	Town	Town	County
M1	ACS_poverty	-0.0818	0.0029	-0.1356	-0.0280	Town	Town	County
M1	ACS_latino	0.1698	0.0000	0.1206	0.2189	Town	Town	County
M1	ACS_AIAN	-0.0419	0.0184	-0.0768	-0.0071	Town	Town	County
M1	ACS_black	0.0937	0.0000	0.0545	0.1328	Town	Town	County
M1	ACS_disabilities	0.0536	0.0463	0.0009	0.1062	Town	Town	County
M1	ACS_more_1andhalf_per_room	0.0493	0.0271	0.0056	0.0931	Town	Town	County
M1	ACS_more_2_per_room	-0.0582	0.0027	-0.0962	-0.0202	Town	Town	County
M1	ACS_public_trans	-0.0019	0.9406	-0.0521	0.0483	Town	Town	County
M1	ACS_services	0.1834	0.0000	0.1363	0.2306	Town	Town	County
M1	ACS_80_plus	-0.0016	0.9463	-0.0480	0.0448	Town	Town	County
M1	ACS_20_min	0.0383	0.1136	-0.0091	0.0857	Town	Town	County
M1	Beds_LTCF	0.1314	0.0000	0.0949	0.1679	Town	Town	County
M1	Grad	0.0330	0.1300	-0.0097	0.0756	Town	Town	County
M1	Undergrad	0.0226	0.1855	-0.0109	0.0561	Town	Town	County
M2	ACS_not_insured	-0.0519	0.0639	-0.1067	0.0030	Town	Town	Town
M2	ACS_poverty	-0.1507	0.0000	-0.2208	-0.0805	Town	Town	Town
M2	ACS_latino	0.2276	0.0000	0.1647	0.2906	Town	Town	Town
M2	ACS_AIAN	-0.0816	0.0002	-0.1241	-0.0391	Town	Town	Town
M2	ACS_black	0.1246	0.0000	0.0746	0.1747	Town	Town	Town
M2	ACS_disabilities	0.0258	0.4673	-0.0438	0.0954	Town	Town	Town
M2	ACS_more_1andhalf_per_room	0.0443	0.1321	-0.0133	0.1019	Town	Town	Town
M2	ACS_more_2_per_room	-0.0767	0.0028	-0.1271	-0.0264	Town	Town	Town
M2	ACS_public_trans	0.0774	0.0043	0.0242	0.1306	Town	Town	Town
M2	ACS_services	0.2400	0.0000	0.1783	0.3017	Town	Town	Town
M2	ACS_80_plus	-0.0186	0.5415	-0.0784	0.0411	Town	Town	Town
M2	ACS_20_min	0.1418	0.0000	0.0809	0.2027	Town	Town	Town
M2	Beds_LTCF	0.1750	0.0000	0.1264	0.2236	Town	Town	Town
M2	Grad	-0.0001	0.9959	-0.0557	0.0554	Town	Town	Town
M2	Undergrad	0.0084	0.7142	-0.0366	0.0534	Town	Town	Town
M3	ACS_poverty	-0.1042	0.0004	-0.1617	-0.0467	Town	Tract	County
M3	ACS_latino	0.1765	0.0000	0.1262	0.2267	Town	Tract	County
M3	ACS_AIAN	-0.0416	0.0180	-0.0761	-0.0071	Town	Tract	County
M3	ACS_black	0.0916	0.0000	0.0503	0.1329	Town	Tract	County
M3	ACS_disabilities	0.0490	0.0712	-0.0042	0.1023	Town	Tract	County
M3	ACS_more_2_per_room	-0.0358	0.0322	-0.0686	-0.0030	Town	Tract	County
M3	ACS_public_trans	0.0008	0.9822	-0.0701	0.0717	Town	Tract	County
M3	ACS_services	0.1797	0.0000	0.1336	0.2257	Town	Tract	County
M3	ACS_80_plus	0.0041	0.8628	-0.0428	0.0511	Town	Tract	County
M3	ACS_20_min	0.0310	0.1985	-0.0163	0.0784	Town	Tract	County
M3	Num_LTCF	-0.0144	0.6815	-0.0834	0.0545	Town	Tract	County
M3	HUD	-0.0387	0.3590	-0.1214	0.0440	Town	Tract	County
M3	Urbanicity	0.0511	0.1973	-0.0266	0.1287	Town	Tract	County
M3	Beds_LTCF	0.1388	0.0001	0.0701	0.2076	Town	Tract	County
M3	Grad	0.0376	0.0984	-0.0070	0.0822	Town	Tract	County
M3	Students	0.0254	0.1751	-0.0113	0.0620	Town	Tract	County
M4	ACS_poverty	-0.2096	0.0000	-0.2829	-0.1363	Town	Tract	Town
M4	ACS_latino	0.1915	0.0000	0.1256	0.2575	Town	Tract	Town
M4	ACS_AIAN	-0.0817	0.0001	-0.1222	-0.0413	Town	Tract	Town
M4	ACS_black	0.1039	0.0001	0.0517	0.1561	Town	Tract	Town
M4	ACS_disabilities	0.0252	0.4706	-0.0433	0.0938	Town	Tract	Town
M4	ACS_more_2_per_room	-0.0546	0.0100	-0.0962	-0.0131	Town	Tract	Town
M4	ACS_public_trans	-0.0023	0.9544	-0.0799	0.0754	Town	Tract	Town
M4	ACS_services	0.2151	0.0000	0.1553	0.2750	Town	Tract	Town
M4	ACS_80_plus	-0.0123	0.6879	-0.0721	0.0476	Town	Tract	Town
M4	ACS_20_min	0.1408	0.0000	0.0781	0.2036	Town	Tract	Town
M4	Num_LTCF	-0.0548	0.2503	-0.1483	0.0386	Town	Tract	Town
M4	HUD	0.0415	0.4297	-0.0614	0.1444	Town	Tract	Town
M4	Urbanicity	0.1140	0.0256	0.0139	0.2142	Town	Tract	Town
M4	Beds_LTCF	0.2057	0.0000	0.1131	0.2984	Town	Tract	Town
M4	Grad	0.0047	0.8699	-0.0519	0.0614	Town	Tract	Town
M4	Students	0.0068	0.7820	-0.0413	0.0549	Town	Tract	Town
M5	ACS_poverty	0.0225	0.0781	-0.0025	0.0476	Tract	Tract	Town
M5	ACS_latino	0.1619	0.0000	0.1309	0.1929	Tract	Tract	Town
M5	ACS_AIAN	0.0092	0.1968	-0.0048	0.0232	Tract	Tract	Town
M5	ACS_black	0.0583	0.0000	0.0378	0.0789	Tract	Tract	Town
M5	ACS_disabilities	0.0077	0.4950	-0.0144	0.0298	Tract	Tract	Town
M5	ACS_more_2_per_room	-0.0049	0.4720	-0.0183	0.0085	Tract	Tract	Town
M5	ACS_public_trans	0.0006	0.9657	-0.0279	0.0291	Tract	Tract	Town
M5	ACS_services	0.0972	0.0000	0.0776	0.1168	Tract	Tract	Town
M5	ACS_80_plus	-0.0145	0.1670	-0.0351	0.0061	Tract	Tract	Town
M5	ACS_20_min	-0.0543	0.0000	-0.0776	-0.0309	Tract	Tract	Town
M5	Num_LTCF	0.0819	0.0000	0.0635	0.1004	Tract	Tract	Town
M5	HUD	-0.0466	0.0000	-0.0663	-0.0269	Tract	Tract	Town
M5	Urbanicity	0.0406	0.0039	0.0130	0.0682	Tract	Tract	Town
M5	Beds_LTCF	0.0387	0.0006	0.0167	0.0607	Tract	Tract	Town
M5	Grad	-0.0025	0.8295	-0.0249	0.0200	Tract	Tract	Town
M5	Students	-0.0913	0.0000	-0.1121	-0.0705	Tract	Tract	Town
M6	ACS_not_insured	0.0314	0.0006	0.0135	0.0492	Tract	Town	Town
M6	ACS_poverty	0.0112	0.3874	-0.0142	0.0365	Tract	Town	Town
M6	ACS_latino	0.1550	0.0000	0.1223	0.1877	Tract	Town	Town
M6	ACS_AIAN	0.0078	0.2864	-0.0065	0.0221	Tract	Town	Town
M6	ACS_black	0.0650	0.0000	0.0439	0.0861	Tract	Town	Town
M6	ACS_disabilities	0.0171	0.1360	-0.0054	0.0396	Tract	Town	Town
M6	ACS_more_1andhalf_per_room	-0.0008	0.9258	-0.0181	0.0164	Tract	Town	Town
M6	ACS_more_2_per_room	-0.0072	0.3769	-0.0233	0.0088	Tract	Town	Town
M6	ACS_public_trans	-0.0054	0.7148	-0.0345	0.0236	Tract	Town	Town
M6	ACS_services	0.0870	0.0000	0.0658	0.1081	Tract	Town	Town
M6	ACS_80_plus	0.0161	0.1097	-0.0036	0.0357	Tract	Town	Town
M6	ACS_20_min	-0.0494	0.0000	-0.0732	-0.0257	Tract	Town	Town
M6	Beds_LTCF	0.0928	0.0000	0.0731	0.1125	Tract	Town	Town
M6	Grad	-0.0243	0.0217	-0.0451	-0.0036	Tract	Town	Town
M6	Undergrad	-0.0835	0.0000	-0.1032	-0.0639	Tract	Town	Town
M7	ACS_poverty	0.0441	0.0029	0.0151	0.0731	Tract	Tract	County
M7	ACS_latino	0.2001	0.0000	0.1735	0.2266	Tract	Tract	County
M7	ACS_AIAN	0.0175	0.0411	0.0007	0.0343	Tract	Tract	County
M7	ACS_black	0.0671	0.0000	0.0460	0.0882	Tract	Tract	County
M7	ACS_disabilities	0.0008	0.9483	-0.0241	0.0257	Tract	Tract	County
M7	ACS_more_2_per_room	-0.0042	0.6199	-0.0206	0.0123	Tract	Tract	County
M7	ACS_public_trans	-0.0857	0.0000	-0.1142	-0.0572	Tract	Tract	County
M7	ACS_services	0.1743	0.0000	0.1558	0.1927	Tract	Tract	County
M7	ACS_80_plus	0.0066	0.5762	-0.0165	0.0297	Tract	Tract	County
M7	ACS_20_min	-0.0917	0.0000	-0.1159	-0.0675	Tract	Tract	County
M7	Num_LTCF	0.0873	0.0000	0.0660	0.1085	Tract	Tract	County
M7	HUD	-0.0549	0.0000	-0.0793	-0.0306	Tract	Tract	County
M7	Urbanicity	0.0785	0.0000	0.0496	0.1074	Tract	Tract	County
M7	Beds_LTCF	0.0282	0.0080	0.0074	0.0490	Tract	Tract	County
M7	Grad	-0.0352	0.0069	-0.0608	-0.0097	Tract	Tract	County
M7	Students	-0.0524	0.0000	-0.0756	-0.0293	Tract	Tract	County
M8	ACS_not_insured	0.0355	0.0011	0.0143	0.0568	Tract	Town	County
M8	ACS_poverty	0.0443	0.0030	0.0151	0.0735	Tract	Town	County
M8	ACS_latino	0.2105	0.0000	0.1827	0.2383	Tract	Town	County
M8	ACS_AIAN	0.0176	0.0453	0.0004	0.0349	Tract	Town	County
M8	ACS_black	0.0833	0.0000	0.0619	0.1048	Tract	Town	County
M8	ACS_disabilities	0.0122	0.3481	-0.0133	0.0377	Tract	Town	County
M8	ACS_more_1andhalf_per_room	0.0010	0.9249	-0.0206	0.0226	Tract	Town	County
M8	ACS_more_2_per_room	-0.0079	0.4337	-0.0278	0.0119	Tract	Town	County
M8	ACS_public_trans	-0.0817	0.0000	-0.1096	-0.0538	Tract	Town	County
M8	ACS_services	0.1637	0.0000	0.1429	0.1845	Tract	Town	County
M8	ACS_80_plus	0.0381	0.0008	0.0159	0.0603	Tract	Town	County
M8	ACS_20_min	-0.0961	0.0000	-0.1206	-0.0715	Tract	Town	County
M8	Beds_LTCF	0.0685	0.0000	0.0490	0.0881	Tract	Town	County
M8	Grad	-0.0460	0.0002	-0.0697	-0.0222	Tract	Town	County
M8	Undergrad	-0.0394	0.0004	-0.0611	-0.0177	Tract	Town	County


These data are plotted in main text Fig. 1.
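
As a rough illustration of how the rows above can be compared, the following Python sketch takes three predictors from M2 and M6, which differ only in the resolution of the outcome (town versus census tract), and checks whether the interval narrows and whether the estimate changes direction. The column names are assumptions inferred from the abstract, since the table header is not shown here: estimate, p-value, lower and upper confidence bounds, then the resolution of the outcome, of variable selection, and of the random effect. The helper names (`Row`, `ci_width`, `excludes_zero`, `compare`) are hypothetical and are not part of the authors' code.

```python
from typing import NamedTuple


class Row(NamedTuple):
    # Column names are assumed; the table header is not shown in this section.
    model: str
    variable: str
    estimate: float
    p_value: float
    ci_low: float
    ci_high: float
    outcome_res: str
    selection_res: str
    random_effect_res: str


# Values copied verbatim from the M2 and M6 rows of the table above.
ROWS = [
    Row("M2", "ACS_latino", 0.2276, 0.0000, 0.1647, 0.2906, "Town", "Town", "Town"),
    Row("M6", "ACS_latino", 0.1550, 0.0000, 0.1223, 0.1877, "Tract", "Town", "Town"),
    Row("M2", "ACS_services", 0.2400, 0.0000, 0.1783, 0.3017, "Town", "Town", "Town"),
    Row("M6", "ACS_services", 0.0870, 0.0000, 0.0658, 0.1081, "Tract", "Town", "Town"),
    Row("M2", "ACS_20_min", 0.1418, 0.0000, 0.0809, 0.2027, "Town", "Town", "Town"),
    Row("M6", "ACS_20_min", -0.0494, 0.0000, -0.0732, -0.0257, "Tract", "Town", "Town"),
]


def ci_width(row: Row) -> float:
    """Width of the reported interval (upper bound minus lower bound)."""
    return row.ci_high - row.ci_low


def excludes_zero(row: Row) -> bool:
    """True when the reported interval lies entirely on one side of zero."""
    return row.ci_low > 0 or row.ci_high < 0


def compare(variable: str, model_a: str, model_b: str, rows=ROWS) -> dict:
    """Compare one predictor across two models from the table."""
    by_model = {r.model: r for r in rows if r.variable == variable}
    a, b = by_model[model_a], by_model[model_b]
    return {
        "width_a": round(ci_width(a), 4),
        "width_b": round(ci_width(b), 4),
        "narrower_in_b": ci_width(b) < ci_width(a),
        # A direction change counts only if both intervals exclude zero.
        "sign_flip": a.estimate * b.estimate < 0
        and excludes_zero(a)
        and excludes_zero(b),
    }


if __name__ == "__main__":
    for var in ("ACS_latino", "ACS_services", "ACS_20_min"):
        print(var, compare(var, "M2", "M6"))
```

For these three predictors the tract-outcome model (M6) has the narrower interval in each case, and ACS_20_min changes direction between the two models, in line with the pattern described in the Results.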

See Fig. S1.

References

1. Antonelli M., Penfold R.S., Merino J., Sudre C.H., Molteni E., Berry S., et al. Risk factors and disease profile of post-vaccination SARS-CoV-2 infection in UK users of the COVID symptom study app: a prospective, community-based, nested, case-control study. Lancet Infect Dis. 2022;22(1):43–55. doi: 10.1016/S1473-3099(21)00460-6.
2. Baker J.M., Nelson K.N., Overton E., Lopman B.A., Lash T.L., Photakis M., et al. Quantification of occupational and community risk factors for SARS-CoV-2 seropositivity among health care workers in a large US health care system. Ann Intern Med. 2021;174(5):649–654. doi: 10.7326/M20-7145.
3. DeSalvo K., Hughes B., Bassett M., Benjamin G., Fraser M., Galea S., et al. Public health COVID-19 impact assessment: lessons learned and compelling needs. NAM Perspect. 2021;2021:1–29. doi: 10.31478/202104c.
4. Tieskens K.F., Patil P., Levy J.I., Brochu P., Lane K.J., Fabian M.P., et al. Time-varying associations between COVID-19 case incidence and community-level sociodemographic, occupational, environmental, and mobility risk factors in Massachusetts. BMC Infect Dis. 2021;21(1):1–9. doi: 10.1186/s12879-021-06389-w.
5. Spangler K.R., Patil P., Peng X., Levy J.I., Lane K.J., Tieskens K.F., et al. Community predictors of COVID-19 cases and deaths in Massachusetts: evaluating changes over time using geospatially refined data. Influenza Other Respir Viruses. 2022;16(2):213–221. doi: 10.1111/irv.12926.
6. National Research Council. Using the American Community Survey: benefits and challenges; 2007 Mar 26.
7. Fotheringham A.S., Wong D.W. The modifiable areal unit problem in multivariate statistical analysis. Environ Plan A. 1991;23(7):1025–1044.
8. Rosofsky A., Levy J.I., Zanobetti A., Janulewicz P., Fabian M.P. Temporal trends in air pollution exposure inequality in Massachusetts. Environ Res. 2018;161:76–86. doi: 10.1016/j.envres.2017.10.028.
9. Di Q., Wang Y., Zanobetti A., Wang Y., Koutrakis P., Choirat C., et al. Air pollution and mortality in the medicare population. N Engl J Med. 2017;376(26):2513–2522. doi: 10.1056/NEJMoa1702747.
10. Evans L., Charns M.P., Cabral H.J., Fabian M.P. Change in geographic access to community health centers after Health Center Program expansion. Health Serv Res. 2019;54(4):860–869. doi: 10.1111/1475-6773.13149.
11. Thomas A.J., Eberly L.E., Davey Smith G., Neaton J.D. ZIP-code-based versus tract-based income measures as long-term risk-adjusted mortality predictors. Am J Epidemiol. 2006;164(6):586–590. doi: 10.1093/aje/kwj234.
12. Sperling J. The tyranny of census geography: small-area data and neighborhood statistics. Cityscape. 2012;1:219–223.
13. Chen J.T., Krieger N. Revealing the unequal burden of COVID-19 by income, race/ethnicity, and household crowding: US county versus zip code analyses. J Public Health Manag Pract. 2021;27(1):S43–S56. doi: 10.1097/PHH.0000000000001263.
14. Nethery R.C., Chen J.T., Krieger N., Waterman P.D., Peterson E., Waller L.A., et al. Statistical implications of endogeneity induced by residential segregation in small-area modeling of health inequities. Am Stat. 2022;76(2):142–155. doi: 10.1080/00031305.2021.2003245.
15. MAPDH. COVID-19 case data; 2020. Retrieved from 〈https://www.mass.gov/info-details/covid-19-response-reporting#covid-19-county-level-data-reporting-〉.
16. U.S. Census Bureau. Selected social/housing/economic/demographic characteristics, 2014–2018 American Community Survey 5-year estimates; 2019. 〈https://data.census.gov/cedsci/〉.
17. Brooks M.E., Kristensen K., Van Benthem K.J., Magnusson A., Berg C.W., Nielsen A., et al. glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. R J. 2017;9(2):378–400.
18. Williamson E.J., Walker A.J., Bhaskaran K., Bacon S., Bates C., Morton C.E., et al. OpenSAFELY: factors associated with COVID-19 death in 17 million patients. Nature. 2020;584(7821) doi: 10.1038/s41586-020-2521-4.

