American Journal of Public Health. 2020 Dec;110(12):1817–1824. doi: 10.2105/AJPH.2020.305911

Real-Time Spatiotemporal Analysis of Microepidemics of Influenza and COVID-19 Based on Hospital Network Data: Colocalization of Neighborhood-Level Hotspots

Evangelia K Mylona 1, Fadi Shehadeh 1, Markos Kalligeros 1, Gregorio Benitez 1, Philip A Chan 1, Eleftherios Mylonakis 1
PMCID: PMC7661994  PMID: 33058702

Abstract

Objectives. To identify spatiotemporal patterns of epidemic spread at the community level.

Methods. We extracted influenza cases reported between 2016 and 2019 and COVID-19 cases reported in March and April 2020 from a hospital network in Rhode Island. We performed a spatiotemporal hotspot analysis to simulate a real-time surveillance scenario.

Results. We analyzed 6527 laboratory-confirmed influenza cases and identified microepidemics in more than 1100 neighborhoods, and more than half of the neighborhoods that had hotspots in a season became hotspots in the next season. We used data from 731 COVID-19 cases, and we found that a neighborhood was 1.90 times more likely to become a COVID-19 hotspot if it had been an influenza hotspot in 2018 to 2019.

Conclusions. The use of readily available hospital data allows the real-time identification of spatiotemporal trends and hotspots of microepidemics.

Public Health Implications. As local governments move to reopen the economy and ease physical distancing, the use of historic influenza hotspots could guide early prevention interventions, while the real-time identification of hotspots would enable the implementation of interventions that focus on small-area containment and mitigation.


Infectious disease outbreaks and epidemics are facilitated by social networks and microcommunities.1 Real-time interventions can break the transmission cycle1 and prevent disease spreading through the social contact network.2 Although public health surveillance is essential for monitoring and evaluating disease spread, especially when new disease agents appear,3 disease transmission is often studied by using theoretical and mathematical models.4 However, such models are often unable to represent the dynamic effects of disease progression through different communities and identify patterns and factors that influence the development of the epidemic.5

Seasonal influenza is a significant infection and represents a typical example of a seasonal epidemic, which occurs on a yearly basis and poses a major health burden. According to the World Health Organization, there are an estimated 1 billion cases worldwide each year, of which 3 to 5 million are severe.6 Although an influenza vaccine is available and offers significant protection,7,8 its effectiveness varies by year. As a result, additional public health measures are needed, and efforts are being made to forecast community outbreaks.9 The importance of such efforts becomes even more evident amid novel emerging epidemics.10 Similar to influenza, COVID-19 is an acute respiratory syndrome.10 Although efforts have been made to monitor the pandemic,11 spatiotemporal analysis has not been reported yet. A better understanding of how and where the disease spreads is needed for the implementation of targeted interventions for containment and mitigation.12

The consistent failure to convert research findings into health care policies13 leads to a delayed response from surveillance networks, both statewide and at the national level. Moreover, even though the spread of contagious diseases at the county level has been reported,14 spatial analysis at the neighborhood level that would allow for more effective interventions has not been reported. Furthermore, during a rapidly moving epidemic, the role of hospitals and health care networks in monitoring microepidemics, superspread events, and hotspots has become paramount.

In this study, we utilized laboratory-confirmed influenza cases from a large health care network in Rhode Island in an effort to identify spatial and temporal patterns of epidemic spread at the community level. We used data from influenza epidemics to identify spread patterns of influenza that could help focus resources and enact targeted microinterventions in the places most needed. Then we implemented the same methodology to identify spatiotemporal patterns associated with COVID-19 and study potential colocalization of hotspots. Finally, we used a pseudo-prospective spatiotemporal analysis to demonstrate how this approach could be implemented for the real-time surveillance and identification of microepidemics.

METHODS

We analyzed data from the state of Rhode Island in the United States. We used 2 sets of data: 1 of seasonal influenza A cases, because of the large available volume of data, and 1 of COVID-19 cases.

Data Extraction and Preprocessing

First, we extracted laboratory-confirmed influenza A cases reported between September 2016 and August 2019. The laboratories that participated in our study tested for influenza with either the polymerase chain reaction test or the rapid antigen test. Date of diagnosis, home address, and date of birth were included in the extracted data. Second, we extracted COVID-19 cases reported between March 12 and April 19, 2020, along with the date of diagnosis and home addresses of the patients. Both data sets were extracted from the largest statewide hospital network in Rhode Island. Addresses were geocoded, and data were de-identified, while individuals who could not be georeferenced or who had an address outside of the study area were excluded. We also used a land use–land cover shapefile that was provided by Rhode Island Geographic Information System15 to exclude nonhabitable areas, such as water areas and forests, and calculated the population density using 2016 census block group data.

Study Analysis

We used a geographic information system for visualizing and analyzing the spatial distribution of influenza and COVID-19 cases through time. We performed the analysis in ArcGIS Pro 2.4.0 (ESRI, Redlands, CA) using the Space Time Pattern Mining Tools, in the Python programming language (Python Software Foundation, https://www.python.org), and in RStudio (PBC, Boston, MA). First, we divided the study area into 1-kilometer-height hexagons and aggregated the cases within the hexagons.16 We used a 1-kilometer radius as it is widely used in the literature and enables the identification of patterns at a very detailed small-area level,16,17 and we chose hexagons for our analysis because hexagons are symmetric and the distance between centroids is the same for all neighbors.18
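The hexagonal binning step can be illustrated with a minimal sketch. This is not the ArcGIS Pro routine used in the study: it assumes that case coordinates have already been projected to a metric coordinate system, and it interprets the 1-kilometer hexagon height as the flat-to-flat distance of a flat-topped hexagon, so that the centroids of neighboring hexagons are 1 kilometer apart. The function names (`hex_center`, `point_to_hex`, `aggregate_cases`) are hypothetical.

```python
import math
from collections import Counter

HEX_HEIGHT_M = 1000.0  # assumed flat-to-flat hexagon height, in meters


def _size(height):
    # Circumradius of a flat-topped hexagon with the given flat-to-flat height.
    return height / math.sqrt(3)


def hex_center(q, r, height=HEX_HEIGHT_M):
    """Return the (x, y) centroid of the hexagon with axial index (q, r)."""
    s = _size(height)
    return (1.5 * s * q, s * math.sqrt(3) * (r + q / 2))


def point_to_hex(x, y, height=HEX_HEIGHT_M):
    """Return the axial (q, r) index of the hexagon containing point (x, y)."""
    s = _size(height)
    cx = (2.0 / 3.0) * x / s
    cz = (-x / 3.0 + math.sqrt(3) / 3.0 * y) / s
    cy = -cx - cz
    # Round the fractional cube coordinates to the nearest hexagon,
    # then repair the component with the largest rounding error.
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (rx, rz)


def aggregate_cases(points, height=HEX_HEIGHT_M):
    """Count geocoded cases per hexagon; points is an iterable of (x, y)."""
    return Counter(point_to_hex(x, y, height) for x, y in points)
```

With this layout, every hexagon has 6 neighbors whose centroids lie at the same distance, which is the property that motivated the choice of hexagons over square grid cells.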

We performed spatiotemporal hotspot analysis to identify neighborhood-level hotspots (microepidemics) and time trends.19 For this analysis, we identified spatial hotspots for each time period by using the Getis-Ord Gi∗ algorithm and applied a Mann–Kendall trend test to the hotspot time series (details about these tests and definitions are given in Supplementary Methods in the Appendix, available as a supplement to the online version of this article at http://www.ajph.org). A hotspot was defined as a Getis-Ord Gi∗ z score greater than 1.65, and a microepidemic was defined as a hotspot at the neighborhood level (see also Supplementary Methods).
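As an illustration of the hotspot definition, the following sketch computes the Getis-Ord Gi∗ z score in its standard form with binary weights, in which each cell's neighborhood includes the cell itself. It is a purely spatial simplification: the study used the space-time neighborhoods of the ArcGIS Pro tools, which are not reproduced here. The function name, the toy counts, and the chain-shaped neighbor structure are hypothetical; only the 1.65 threshold comes from the text.

```python
import math

HOTSPOT_Z = 1.65  # threshold stated in the Methods


def getis_ord_gi_star(values, neighbors):
    """Return the Gi* z score of each cell.

    values: case count per cell.
    neighbors: for each cell, the indices of its neighbors (self excluded).
    Binary weights are assumed: 1 for the cell and its neighbors, else 0.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    scores = []
    for i in range(n):
        idx = set(neighbors[i]) | {i}
        w_sum = len(idx)  # with binary weights, sum(w) == sum(w^2)
        local_sum = sum(values[j] for j in idx)
        denom = s * math.sqrt((n * w_sum - w_sum ** 2) / (n - 1))
        scores.append((local_sum - mean * w_sum) / denom if denom else 0.0)
    return scores


# Toy example: 10 cells in a row, each adjacent to the cells beside it.
counts = [0, 0, 0, 4, 6, 4, 0, 0, 0, 0]
chain = [[j for j in (i - 1, i + 1) if 0 <= j < len(counts)]
         for i in range(len(counts))]
z_scores = getis_ord_gi_star(counts, chain)
hotspots = [i for i, z in enumerate(z_scores) if z > HOTSPOT_Z]
```

In this toy example, only the three cells around the cluster of cases exceed the threshold, whereas isolated empty cells receive negative scores.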

We used 2 different approaches to identify the hotspots of influenza and COVID-19. In the first analysis, we calculated hotspots by using incident counts, and in the second analysis we calculated hotspots accounting for population density. We used the Jaccard similarity coefficient20 and a logistic regression analysis to measure the similarity and association of hotspots between the 3 influenza seasons as well as between the 2018–2019 influenza season and COVID-19 (see Appendix Supplementary Methods).
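The Jaccard similarity coefficient of two sets is the size of their intersection divided by the size of their union. A minimal sketch follows; the neighborhood identifiers are hypothetical, and the exact way in which the coefficient was applied to the hotspot sets is described in the Supplementary Methods rather than here.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections of hotspot neighborhood IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


# Hypothetical hexagon IDs flagged as hotspots in 2 consecutive seasons.
season_1 = {"hex_001", "hex_002", "hex_003"}
season_2 = {"hex_002", "hex_003", "hex_004"}
similarity = jaccard(season_1, season_2)  # 2 shared / 4 distinct = 0.5
```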

Seasonal influenza A.

The influenza season was defined from September 1 to August 31 for each year. We performed a retrospective spatiotemporal hotspot analysis to identify hotspots and time trends. We defined a time step interval of 5 days from the diagnosis date, in accordance with influenza shedding duration and taking into consideration that most patients do not seek health care at the onset of their symptoms. According to the Centers for Disease Control and Prevention, in most cases, individuals may be able to infect others for up to 5 to 7 days after becoming sick.21 The neighborhood distance and time step define which cases are considered neighboring in space and time22; thus, the spatiotemporal neighborhood was defined as a radius of 1 kilometer for the previous 5 days. We also stratified the sample into 3 age groups and performed an additional analysis. Individuals aged younger than 18 years were defined as children, those aged 18 to 64 years as adults, and those aged 65 years or older as seniors.
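The definitions in this paragraph translate directly into code. The sketch below is illustrative only: the function names are hypothetical, ages are assumed to be given in completed years at diagnosis, coordinates are assumed to be in meters, and "the previous 5 days" is interpreted as a lag of 0 to 5 days inclusive, which is an assumption about boundary handling.

```python
import math
from datetime import date


def age_group(age_years):
    """Children: younger than 18; adults: 18 to 64; seniors: 65 or older."""
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "adult"
    return "senior"


def season_start(d):
    """September 1 that opens the influenza season containing date d."""
    return date(d.year if d.month >= 9 else d.year - 1, 9, 1)


def influenza_season(d):
    """Label such as '2018-2019' for the September-to-August season."""
    start = season_start(d)
    return f"{start.year}-{start.year + 1}"


def time_step(d, step_days=5):
    """Index of the 5-day time step, counted from the season start."""
    return (d - season_start(d)).days // step_days


def is_space_time_neighbor(xy_a, date_a, xy_b, date_b,
                           radius_m=1000.0, window_days=5):
    """True if case b lies within the radius and the previous days of case a."""
    lag = (date_a - date_b).days
    return 0 <= lag <= window_days and math.dist(xy_a, xy_b) <= radius_m
```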

We performed a pseudo-prospective spatiotemporal hotspot analysis starting with influenza data available on January 15, 2019, and repeated the analysis every 5 days until February 9, 2019. We calculated the Getis-Ord Gi∗ statistic and the Mann–Kendall trend test z score taking into account only the cases reported in the 2 months leading to the pseudo-prospective analysis date. Finally, we subgrouped the reported cases by age group and repeated the pseudo-prospective analysis to compare hotspot patterns among children, adults, and seniors for 2 time points (February 15 and 20, 2019).

COVID-19.

We applied the previously described methodology and performed a pseudo-prospective spatiotemporal hotspot analysis of COVID-19 cases. We aggregated cases by 2 days, and we defined the spatiotemporal neighborhood as cases reported in the last 10 days in a radius of 1 kilometer. In most cases, individuals develop symptoms 2 to 14 days after their exposure to COVID-19.23 We used a 10-day time step from the diagnosis date, taking into consideration that most patients do not undergo testing on the day of exposure. The first analysis was done with available data on March 31, 2020, and we repeated the analysis until April 20, 2020. We subgrouped the reported cases by age group (adults, seniors) and we repeated the pseudo-prospective analysis (April 14 and 20, 2020). We did not analyze the cases among children because of the small number of cases.

RESULTS

We extracted data on 6878 cases with influenza A. Among them, 6527 cases of influenza A were included in this analysis (351 individuals could not be georeferenced and were excluded): 1709 during the 2016–2017 influenza season, 2535 during the 2017–2018 influenza season, and 2283 during the 2018–2019 influenza season (Appendix Table A).

We also extracted laboratory-confirmed COVID-19 cases. In total, 731 cases were included in this analysis, up to April 19, 2020 (67 individuals with COVID-19 could not be georeferenced and were excluded).

Seasonal Influenza A

As detailed in Methods, we divided the study area into 1-kilometer hexagons resulting in 3545 neighborhoods (nonhabitable regions were excluded), and we detected hotspots of microepidemics in 1142 of them, which corresponded to 32% of the total area. We identified 677 hotspot neighborhoods with 1251 reported cases (323 hotspots had at least 1 case) during the 2016–2017 influenza season, 891 hotspot neighborhoods with 2009 reported cases (449 hotspots had at least 1 case) during the 2017–2018 influenza season, and 786 hotspot neighborhoods with 1660 reported cases (392 hotspots had at least 1 case) during the 2018–2019 influenza season. On average, we identified 3.93 (range = 1–16) hotspots of microepidemics in each neighborhood in the 3 years analyzed, with a mean duration of 17.2 days (range = 5–155 days). In our analysis, we defined duration as the time period from the first day that a hotspot was detected in a neighborhood until the last day that it was statistically significant. Also, 74.22% of the hotspots had a duration of at least 10 days, while 33.32% had a duration of 15 days or more (Figure 1). When we compared patterns between consecutive influenza seasons, we found a 53.27% similarity between the hotspots of the 2016–2017 and the 2017–2018 influenza seasons, and a 60.17% similarity between the hotspots of the 2017–2018 and 2018–2019 influenza seasons.
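The duration definition can be made concrete with a short sketch. It assumes that each neighborhood is represented by a time-ordered list of flags, 1 per 5-day time step, indicating whether the neighborhood was a statistically significant hotspot in that step, and that an uninterrupted run of k flagged steps counts as k × 5 days. Both the representation and the function name are assumptions made for illustration.

```python
def hotspot_durations(flags, step_days=5):
    """Return the duration in days of each uninterrupted hotspot run."""
    durations = []
    run = 0
    for is_hot in flags:
        if is_hot:
            run += 1
        elif run:
            durations.append(run * step_days)
            run = 0
    if run:  # a run that is still ongoing at the last time step
        durations.append(run * step_days)
    return durations


# Hypothetical neighborhood: one 10-day hotspot, a gap, then a 5-day hotspot.
example = hotspot_durations([False, True, True, False, True])  # [10, 5]
```

Under this reading, the shortest possible duration is one time step (5 days), which is consistent with the lower bound of the reported ranges.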

FIGURE 1—

Mean Duration of Hotspots of Influenza Microepidemics per Neighborhood, in Rhode Island, During the 2016–2019 Influenza Seasons

Moreover, when we accounted for population density, 1630 neighborhoods were a hotspot at least once during the 3 influenza seasons. Specifically, 606, 1049, and 850 neighborhoods were hotspots at least once during the 2016–2017, 2017–2018, and 2018–2019 influenza seasons, respectively. On average, we identified 2.15 (range = 1–10) hotspots of microepidemics in each neighborhood over the 3-year period, with a mean duration of 10.8 days (range = 5–50 days) (Figure 2). The similarity of hotspot neighborhoods was 26.63% between the 2016–2017 and 2017–2018 influenza seasons, and 27.71% between the 2017–2018 and 2018–2019 influenza seasons. Furthermore, a neighborhood was 4.31 times more likely to be a hotspot during the 2017–2018 influenza season if it had been a hotspot during the 2016–2017 influenza season (odds ratio [OR] = 4.31; 95% confidence interval [CI] = 3.59, 5.17). Similarly, a neighborhood was 3.04 times more likely to be a hotspot during the 2018–2019 influenza season if it had been a hotspot during the 2017–2018 influenza season (OR = 3.04; 95% CI = 2.59, 3.57).
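For a logistic regression with a single binary predictor, the estimated odds ratio equals the cross-product ratio of the corresponding 2 × 2 table, and the Wald 95% confidence interval is obtained on the log scale. The sketch below illustrates this calculation; the counts are hypothetical and are not the study data, and the study's model specification (see the Supplementary Methods) may have differed.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2 x 2 table.

    a: hotspot in previous season, hotspot in next season
    b: hotspot in previous season, not a hotspot in next season
    c: not a hotspot in previous season, hotspot in next season
    d: not a hotspot in either season
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, ci_low, ci_high = odds_ratio_wald_ci(40, 60, 20, 80)
```

An interval that excludes 1, as in the results reported above, indicates that the association between hotspot status in consecutive seasons is statistically significant at the 5% level.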

FIGURE 2—

Mean Duration of Hotspots of Influenza Microepidemics, Accounting for Population Density, per Neighborhood, in Rhode Island, During the 2016–2019 Influenza Seasons

To simulate a real-time surveillance scenario, we performed a pseudo-prospective spatiotemporal hotspot and trend analysis of the reported influenza A cases, using only data reported before the analysis date and ignoring future cases. We performed the analysis with data available on January 15, 2019, and repeated the analysis in 5-day intervals, taking into account only the cases reported in the 2 months leading to the pseudo-prospective analysis date.
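The pseudo-prospective schedule can be sketched as a loop over analysis dates, each of which sees only the cases in its own look-back window. The sketch assumes that "2 months" corresponds to 60 days and that both window boundaries are inclusive; neither detail is stated in the text, and the function names are hypothetical.

```python
from datetime import date, timedelta


def analysis_dates(first, last, step_days=5):
    """Analysis dates from first to last, spaced step_days apart."""
    dates = []
    current = first
    while current <= last:
        dates.append(current)
        current += timedelta(days=step_days)
    return dates


def cases_in_window(case_dates, analysis_date, lookback_days=60):
    """Keep cases diagnosed in the look-back window; future cases are ignored."""
    start = analysis_date - timedelta(days=lookback_days)
    return [d for d in case_dates if start <= d <= analysis_date]


# Influenza schedule described in the text: January 15 to February 9, 2019.
schedule = analysis_dates(date(2019, 1, 15), date(2019, 2, 9))
```

Each date in `schedule` would then be passed, together with its filtered cases, to the hotspot and trend calculations, so that no analysis uses information that would have been unavailable on that day.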

Figure 3 depicts the hotspots of influenza microepidemics at the neighborhood level in Providence, Rhode Island, and surrounding cities, as well as the temporal trends and evolution of the microepidemics in space and time. The figure shows the seasonal variation of the microepidemics and their spatial and temporal spread, with newly detected hotspots at the beginning of the analysis progressing to consecutive hotspots and then diminishing. For example, as shown in the marked areas of Figure 3a, new hotspots (duration of 5 days or less) were detected in the city of Cranston, Rhode Island, on January 15. These hotspots became consecutive (duration of more than 5 days), and new hotspots were detected in the neighborhoods surrounding them (Figure 3b). The outward spread of the microepidemics to neighboring cities is clear, as is their waning by the end of the analysis.

FIGURE 3—

Pseudo-prospective Hotspot and Trend Analysis of Influenza Cases, in Providence, Rhode Island, and Surrounding Cities, for (a) January 15, 2019, and (b) January 20, 2019

Note. For analysis until February 9, 2019, see Appendix Figure A (available as a supplement to the online version of this article at http://www.ajph.org). New hotspot = a possible hotspot has just been detected in these neighborhoods. The duration of the hotspots is 5 days or fewer. Consecutive hotspot = a hotspot has been detected in these neighborhoods, with a duration of more than 5 days but fewer than 54 days. Sporadic hotspot = nonconsecutive hotspots have been detected in these neighborhoods, with a duration of fewer than 54 days.
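The hotspot categories in the note above can be read as rules on the sequence of time steps in which a neighborhood was a hotspot. The following sketch is a simplified reading of those rules and not the classification logic of ArcGIS Pro: it ignores cold spots and trend significance, it treats any hotspot history other than a single uninterrupted run ending at the latest time step as sporadic, and its function name and flag representation are hypothetical.

```python
def classify_hotspot(flags, step_days=5, max_days=54):
    """Classify a neighborhood from time-ordered hotspot flags (oldest first)."""
    hot_steps = sum(1 for f in flags if f)
    if hot_steps == 0:
        return "none"
    # Length of the uninterrupted run of hotspot steps ending at the latest step.
    run = 0
    for is_hot in reversed(flags):
        if not is_hot:
            break
        run += 1
    hot_days = hot_steps * step_days
    if run == 1 and hot_steps == 1:
        return "new"          # detected only in the latest time step
    if run >= 2 and run == hot_steps and hot_days < max_days:
        return "consecutive"  # one uninterrupted run, longer than one step
    if run != hot_steps and hot_days < max_days:
        return "sporadic"     # nonconsecutive hotspot steps
    return "other"            # e.g., a hotspot for 54 days or more
```

For the COVID-19 analysis, the same rules would apply with the thresholds given in the note to Figure 4 (10 and 16 days).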

COVID-19

To apply the same methodology to COVID-19, we extracted laboratory-confirmed COVID-19 cases. When analyzing for hotspots of COVID-19 microepidemics, we detected hotspots in 150 of the 3545 neighborhoods. Interestingly, 98.67% of these neighborhoods were similar to the influenza hotspots of the 2018–2019 influenza season. When accounting for population density, hotspots of COVID-19 microepidemics were detected in 155 neighborhoods with a 36.77% similarity with influenza hotspots. A neighborhood was 1.90 times more likely to be a hotspot for COVID-19 if it had been a hotspot in the 2018–2019 influenza season (OR = 1.90; 95% CI = 1.36, 2.67).

We performed a pseudo-prospective spatiotemporal hotspot and trend analysis of the COVID-19 data available on March 31, 2020, and repeated the analysis until April 14, 2020. As shown in Figure 4a, at the start of our analysis, there were 114 cases, and 79 of the 114 (69.3%) were in hotspots, including 3 main outbreaks (southwest Providence: 41 cases; city of Central Falls, Rhode Island: 19 cases; city of East Providence, Rhode Island: 8 cases) and a secondary outbreak (city of Barrington, Rhode Island: 4 cases). Furthermore, neighborhoods with new hotspots at the beginning of the analysis remained hotspots after 2 days (April 2), while new hotspots emerged in the surrounding neighborhoods, indicating the rapid spatiotemporal spread of COVID-19 (Figure 4b). We observed the same trend on April 4, when even more hotspots appeared around the cities of Providence and Cranston and from Central Falls to Pawtucket, Rhode Island (Appendix Figure B). Finally, by the end of the analysis, the 3 main outbreaks were consolidated, indicating that, without neighborhood-level containment measures, hotspots expand and consolidate in wider areas, especially in larger population centers.

FIGURE 4—

Pseudo-prospective Hotspot and Trend Analysis of COVID-19 Cases, in Providence, Rhode Island, and Surrounding Cities, for (a) March 31, 2020, and (b) April 2, 2020

Note. For analysis until April 14, 2020, see Appendix Figure B (available as a supplement to the online version of this article at http://www.ajph.org). New hotspot = a possible hotspot has just been detected in these neighborhoods. The duration of the hotspots is 10 days or fewer. Consecutive hotspot = a hotspot has been detected in these neighborhoods, with a duration of more than 10 days but fewer than 16 days. Sporadic hotspot = nonconsecutive hotspots have been detected in these neighborhoods, with a duration of fewer than 16 days.

Age-Specific Microepidemics

We subgrouped the reported cases by age group and performed a retrospective and a pseudo-prospective spatiotemporal hotspot analysis for each age group for influenza and COVID-19 (see Appendix Supplementary Results). The hotspot patterns varied among age groups. In both infections, many neighborhoods that had new or consecutive hotspots for one of the age groups did not have any hotspots for the other age groups.

DISCUSSION

We extracted, geocoded, and analyzed statewide laboratory-confirmed influenza cases for the previous 3 influenza seasons and laboratory-confirmed COVID-19 cases at a small-area scale. The analysis provided evidence of the spatial heterogeneity of epidemics and identified hotspots of microepidemics and spatiotemporal trends and patterns. We used data that are readily available at the hospital level, performed a spatiotemporal analysis using influenza cases because of the large available volume of data, and applied this approach to the COVID-19 pandemic. Interestingly, both influenza and COVID-19 hotspots expanded and consolidated over time, especially in larger population centers. This approach also could be used for the real-time identification of spatiotemporal trends and hotspots of microepidemics with the use of georeferenced cases at a very detailed neighborhood scale.

Understanding the clustering of cases at a detailed level is crucial for epidemic surveillance, and georeferenced case data are needed for the implementation of early warning systems.24 The population of a small area tends to be more homogeneous in its characteristics than that of a larger area.25 Small-area studies can be used to investigate disease clusters and identify health risks that may be exacerbated by socioeconomic or environmental factors.26 Minority communities are disproportionately affected by epidemics, and neighborhood-level analysis can be helpful in identifying different populations so that interventions are more pronounced in these communities.27 Understanding the characteristics of neighborhoods that are disproportionately affected by an epidemic or pandemic often requires analyzing data on a wide analysis scale. Performing the analysis at such a detailed scale enables the identification of patterns that could not be extracted from a state-, county-, or zip code–level analysis.24

The main components of the methodology (laboratory-confirmed cases and patient addresses) are readily available in the electronic medical records of hospitals, outpatient clinics, and laboratories. A major challenge during an epidemic is the collection of high-quality data,4 and the implementation of statewide and nationwide data repositories with the cooperation of hospital networks is critical. The use of readily available hospital data could allow for the rapid implementation of early warning systems. A decentralized network of local disease surveillance initiatives could help accelerate mitigation efforts nationwide and reduce decision-making and execution time, leading to increased public health benefits and reduced cost.

Analyzing the patterns of influenza microepidemics by age group provided additional understanding and actionable information on the different dynamics that can shape neighborhood-level hotspots and how this approach can be used to study different categories of individuals at risk. More specifically, we found that children had more hotspot neighborhoods of influenza than adults, and previous studies have indicated that school-age children play a critical role in the geographic spread of an epidemic,28 while seniors had more hotspot neighborhoods of COVID-19, in agreement with studies that have shown that this age group is more vulnerable to the novel virus.29 Moreover, infection prevention resources during an emerging epidemic are limited and subject to logistical constraints, and the risk of hospitalization and mortality is different for each age group.30 As a result, health agencies have to prioritize interventions to certain groups of the population,31 and the identification of hotspot patterns by age group could be helpful for more specific, targeted, and effective interventions.

Furthermore, more than half of the neighborhoods (53%–60%) that were identified as hotspots of microepidemics in an influenza season became hotspots during the next influenza season. Even after we accounted for population density, more than a quarter (27%–28%) of the neighborhoods were hotspots for consecutive seasons. Overall, a neighborhood was more than 3 times more likely to be a hotspot if it had been a hotspot in the previous season, underlining the need for targeted prevention and mitigation strategies in these neighborhoods. The identification of neighborhoods that have been previous hotspots of microepidemics could enable the implementation of focused preventive measures in these neighborhoods, such as awareness campaigns and text-messaging interventions.32

As data on the patterns of COVID-19 spatial spread are limited, the use of existing knowledge on the spatial distribution of influenza hotspots could be used to guide targeted interventions at the local small-area level. Our analysis indicates that the vast majority of COVID-19 hotspots were among influenza hotspot neighborhoods, and this finding could be used for early interventions for COVID-19. After we accounted for population density, more than a third of COVID-19 hotspots were among neighborhoods with an influenza hotspot, and a neighborhood was almost 2 times more likely to be a COVID-19 hotspot if it had been an influenza hotspot.

We performed a pseudo-prospective spatiotemporal hotspot and trend analysis to simulate a real-time surveillance scenario of microepidemics in space and time. Importantly, this approach allowed us to observe the creation of new hotspots, their temporal evolution, and their spread to surrounding areas, and it can be applied not only to influenza epidemics but also to other respiratory pathogens, including COVID-19. The identification of microepidemics in real time could enable health agencies to plan and implement community interventions for epidemic containment and mitigation at the neighborhood level. Such interventions include vaccination if available, social distancing, voluntary self-isolation, increased screening, closure of schools and workplaces, cleaning and disinfection of public spaces, and promotion of hand hygiene and respiratory etiquette.12,33,34

An individual-based simulation study from the Imperial College COVID-19 Response team estimated that mitigation strategies, such as home isolation, quarantine, and social distancing of people at high risk, could result in a reduction of peak health care demand by two thirds and a decrease in deaths by half.35 We found that a neighborhood was almost 2 times more likely to become a COVID-19 hotspot if it had been an influenza hotspot in the 2018–2019 influenza season. The use of historic neighborhood-level influenza hotspots could prove useful in identifying communities with higher risk of contracting COVID-19 and guide early prevention interventions, while the ongoing identification of hotspots could provide real-time actionable information for further intervention.

Limitations

Limitations of this study should be considered. Even though the laboratory testing included outpatient clinics, infected individuals often do not get tested and thus were probably not included in the extracted laboratory-confirmed influenza cases. The use of diagnostic methods with different specificity (polymerase chain reaction and rapid antigen test) by participating laboratories also influenced the observed differences between some neighborhoods. Finally, the distance and accessibility of each neighborhood to health care facilities could create heterogeneities in reporting, and the model needs to be adapted.

Public Health Implications

We used data on laboratory-confirmed influenza and COVID-19 cases, information that is readily available at the hospital level, to identify hotspots of microepidemics, patterns, and trends at the neighborhood level. The use of historic neighborhood-level influenza hotspots could prove useful in identifying communities with higher risk of contracting COVID-19 and guide early prevention interventions. The real-time identification of spatiotemporal trends and hotspots of microepidemics at a very detailed neighborhood scale would enable local communities and public health agencies to implement interventions that focus on small-area containment and mitigation. Such interventions could be even more important during temporal overlap of COVID-19 and influenza. Finally, as states try to reopen the economy and physical distancing is gradually eased, our approach will be essential for quickly identifying an increase in cases at the neighborhood level, enabling rapid small-area containment measures.

ACKNOWLEDGMENTS

The study was supported by the Brown University COVID-19 Research Seed Fund to P. A. Chan and E. Mylonakis.

CONFLICTS OF INTEREST

Eleftherios Mylonakis has received grant support from T2 Biosystems, Sanofi-Aventis, Kaleido Biosciences, and Cidara Therapeutics.

HUMAN PARTICIPANT PROTECTION

The study was approved by the Rhode Island Hospital institutional review board.
