PLOS One. 2020 Feb 10;15(2):e0228804. doi: 10.1371/journal.pone.0228804

Can syndromic surveillance help forecast winter hospital bed pressures in England?

Roger A Morbey 1,*, Andre Charlett 2, Iain Lake 3, James Mapstone 4, Richard Pebody 5, James Sedgwick 6, Gillian E Smith 7, Alex J Elliot 7
Editor: Oliver Gruebner
PMCID: PMC7010388  PMID: 32040541

Abstract

Background

Health care planners need to predict demand for hospital beds to avoid deterioration in health care. Seasonal demand can be affected by respiratory illnesses which in England are monitored using syndromic surveillance systems. Therefore, we investigated the relationship between syndromic data and daily emergency hospital admissions.

Methods

We compared the timing of peaks in syndromic respiratory indicators and emergency hospital admissions, between 2013 and 2018. Furthermore, we created forecasts for daily admissions and investigated their accuracy when real-time syndromic data were included.

Results

We found that syndromic indicators were sensitive to changes in the timing of peaks in seasonal disease, especially influenza. However, each year, peak demand for hospital beds occurred on either 29th or 30th December, irrespective of the timing of syndromic peaks. Most forecast models using syndromic indicators explained over 70% of the seasonal variation in admissions (adjusted R square value). Forecast errors were reduced when syndromic data were included. For example, peak admissions for December 2014 and 2017 were underestimated when syndromic data were not used in models.

Conclusion

Due to the lack of variability in the timing of the highest seasonal peak in hospital admissions, syndromic surveillance data do not provide additional early warning of timing. However, during atypical seasons syndromic data did improve the accuracy of forecast intensity.

Introduction

Seasonal variation in emergency hospital admissions causes considerable pressure on health systems. In northern temperate countries, there is a seasonal winter peak in respiratory disease which can contribute to the increasing demand for hospital beds [1]. Importantly, health care planners need to predict the timing and intensity of seasonal demand for hospital beds to make efficient use of resources and avoid any deterioration in health care due to insufficient capacity [2, 3]. Also, increased pressure on services can lead to intensive care unit bed pressures, cancellation of elective surgery and facility closures. Furthermore, past studies have identified atypical seasons when hospital pressures due to influenza are significantly increased [4]. Hospital admissions data on its own may not provide adequate early warning of pressures due to reporting delays. For example, admission data with diagnostic details are not available until after discharge.

Syndromic surveillance involves the real-time monitoring of patients to provide early warning of, for example, seasonal influenza or respiratory syncytial virus (RSV) epidemics. Public Health England (PHE) routinely analyses real-time syndromic surveillance data for potential threats to public health. PHE has four syndromic surveillance systems, designed to monitor patients with morbidity presenting to different parts of the National Health Service (NHS) and provide a comprehensive coverage of patients with varying severity of symptoms. Since its establishment in 2013, PHE has monitored data from NHS 111, a free telephone advice line [5]. Primary care visits to family doctors (called General Practitioners or GPs in the UK) are analysed via two syndromic surveillance systems, one for in-hours (GPIHSS) and one for out-of-hours and unscheduled care (GPOOHSS) [6, 7]. Finally, patients with more severe illness presenting to emergency departments (EDs) are captured via the Emergency Department Syndromic Surveillance System (EDSSS) [8]. These systems are monitored daily, with data from the previous day’s consultations available the next day.

Previous studies have shown that general syndromic respiratory indicators (e.g. respiratory tract infections) are associated with the overall burden of seasonal respiratory disease, whilst specific indicators (e.g. influenza-like illness; ILI) can help distinguish between different trends in individual pathogens, e.g. between influenza and RSV [9–11]. Therefore, syndromic surveillance may be able to provide early warning of hospital bed pressures caused by seasonal respiratory disease.

Here, we investigate the relationship between syndromic data and daily emergency hospital admissions. Specifically, we have explored whether daily syndromic data routinely collected for surveillance in England can help predict the timing and intensity of winter peaks in emergency hospital admissions.

Methods

Data collection

Respiratory indicators were selected from each of PHE’s syndromic systems, including general indicators that capture a wide range of acute respiratory illness as well as more specific indicators. The indicators used included: acute respiratory infections, bronchitis, pneumonia and ILI from the ED and GP systems; and cold/flu calls from NHS 111. Each indicator was chosen based upon previously gathered evidence of sensitivity to circulating winter pathogens [9–11]. Syndromic indicators are monitored as daily rates rather than counts because routine surveillance can be affected by daily fluctuations in coverage. For GPIHSS, rates are calculated per 100,000 registered patients. In other syndromic systems, registered patient coverage data are not available and rates are calculated using total activity as a denominator (Table 1).

Table 1. Methods for calculating syndromic indicator rates for syndromic systems.

Syndromic Surveillance System Numerator Denominator
GPIHSS GP in-hours consultations for specific indicator Registered population of participating GP practices
GPOOHSS GP out-of-hours consultations for specific indicator Total Read-coded consultations
EDSSS ED attendances for specific indicator Total ED attendances
NHS 111 Calls for specific indicator Total calls to NHS 111

EDSSS: emergency department syndromic surveillance system; GPIHSS: GP in-hours syndromic surveillance system; GPOOHSS: GP out of hours and unscheduled care syndromic surveillance system; NHS 111: telehealth syndromic surveillance system.
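The rate calculations summarised in Table 1 can be sketched as follows; the function name and the figures in the usage example are illustrative, not taken from the surveillance systems themselves.

```python
def syndromic_rate(indicator_count, denominator, per=100_000):
    """Daily rate for a syndromic indicator.

    For GPIHSS the denominator is the registered population of
    participating practices; for the other systems it is that day's
    total activity (consultations, attendances or calls).
    """
    return per * indicator_count / denominator

# Hypothetical figures: 450 ILI consultations among 30.3 million
# registered patients gives a rate of roughly 1.5 per 100,000.
rate = syndromic_rate(450, 30_300_000)
```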

Emergency hospital admission data were extracted from hospital episode statistics (HES), which include all admissions for England [12]. The data were aggregated into daily counts of admissions, using the date of the first consultant episode in each hospital stay. Previous studies in the UK have found that there is a limit to the number of excess admissions, with an artificial ceiling caused by the maximum capacity of hospitals due to a finite number of beds [13]. During extremely busy periods, hospitals are forced to cancel elective admissions in favour of emergency admissions. Therefore, we studied emergency rather than both elective and emergency admissions to give a less biased estimate for surges in demand. We used ‘all’ emergency admissions in our models rather than ‘respiratory’ admissions. Although respiratory admissions were likely to have a stronger association with the syndromic indicators, forecasts for all emergency admissions are more likely to be useful to healthcare service planners managing hospital demand.

Data were analysed for winter influenza seasons, between weeks 40 to 20 [14], from October 2013 until February 2018. This period covered the start of historical records for current surveillance systems until the most recent HES data available at the time of analysis. All data were stratified into the 7 age groups used for routine syndromic surveillance: under 1 year, 1 to 4, 5 to 14, 15 to 44, 45 to 64, 65 to 74 and over 75 years. We focussed on daily data routinely used for syndromic surveillance so that any conclusions about associations with hospital admissions would be relevant for using syndromic systems to predict admissions.

There is considerable variation in syndromic data and hospital admissions by day of the week, due to availability of services and patient presenting behaviour. For instance, GP in-hours services are not available at weekends or during public holidays, and other primary care services are correspondingly busier at these times. However, we did not want to lose resolution by aggregating the data into weekly rates, as we wanted any potential future forecasts developed from syndromic data to be updatable daily. Therefore, we used seven-day moving averages to smooth the data and remove day-of-the-week effects before examining trends or peak timing, and included variables for weekends and public holidays in our regression models.
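A centred seven-day moving average of the kind described above can be sketched as follows (illustrative code, not the authors' implementation):

```python
def seven_day_average(counts):
    """Centred seven-day moving average of a daily series.

    Averaging over a full week removes day-of-the-week effects; the
    first and last three days have no complete window and are
    returned as None.
    """
    smoothed = [None] * len(counts)
    for i in range(3, len(counts) - 3):
        smoothed[i] = sum(counts[i - 3:i + 4]) / 7
    return smoothed
```

Applied to daily admission counts, weekday troughs and weekend spikes cancel within each seven-day window, leaving the seasonal trend.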

Model fitting

The day of peak activity for emergency hospital admissions and syndromic indicators was calculated for each season. In addition, time series were visually inspected to identify seasons with secondary peaks in admissions or syndromic activity.
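Finding the day of peak activity from a smoothed series is straightforward; a minimal sketch (the function name is ours, not the authors'):

```python
def peak_day(dates, smoothed):
    """Date on which a smoothed daily series reaches its maximum.

    Peak timing is taken from the seven-day smoothed series rather
    than raw counts, so single-day reporting spikes do not shift it.
    """
    return max(zip(dates, smoothed), key=lambda pair: pair[1])[0]
```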

We used linear regression to test for associations between hospital admissions and syndromic indicators. Models included a variable for linear trend, and a pair of Fourier terms to model seasonal variation [15]. In addition to variables for weekends and public holidays, 25 December was also included as an additional binary variable to account for differing presenting behaviour at Christmas compared to other public holidays [16]. Each syndromic indicator was modelled separately, stratified by age group. The adjusted R square of the regression models was used to estimate how much variation in hospital admissions could be attributed to the variables included in each model.
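The covariates described above can be sketched as one design-matrix row per day; this is an illustrative construction only, and for brevity the general public-holiday flag is omitted, keeping just the weekend and 25 December indicators named in the text.

```python
import math
from datetime import date

def model_row(day: date, series_start: date):
    """One design-matrix row: intercept, linear trend, a single pair
    of annual Fourier terms, and binary flags for weekends and for
    25 December (general public-holiday flag omitted for brevity)."""
    t = (day - series_start).days
    yday = day.timetuple().tm_yday
    return [
        1.0,                                       # intercept
        float(t),                                  # linear secular trend
        math.sin(2 * math.pi * yday / 365.25),     # seasonal Fourier pair
        math.cos(2 * math.pi * yday / 365.25),
        1.0 if day.weekday() >= 5 else 0.0,        # weekend flag
        1.0 if (day.month, day.day) == (12, 25) else 0.0,  # Christmas flag
    ]
```

A syndromic indicator rate for the same day would be appended as a further column when fitting each indicator's model.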

Model forecasts

We used the same association regression models to assess the utility of syndromic data for forecasting, but with different training data. For these forecasting models, we used k-fold cross validation to simulate the real-world scenario where information on the current season’s hospital admissions is not available in real-time [17]. For example, the forecast models for season 2017/2018 were constructed using training data that did not include 2017/2018.
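The cross-validation design, in which each season is forecast from a model trained only on the other seasons, can be sketched as (illustrative code):

```python
def leave_one_season_out(seasons):
    """Yield (training_seasons, held_out_season) folds so that the
    held-out season's admissions are never seen in training."""
    for held_out in seasons:
        train = [s for s in seasons if s != held_out]
        yield train, held_out

folds = list(leave_one_season_out(
    ["2013/14", "2014/15", "2015/16", "2016/17", "2017/18"]))
```

For example, the final fold trains on 2013/14 through 2016/17 and forecasts 2017/18, mirroring the scenario described in the text.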

A ‘null model’ was additionally created to include the same independent variables as the other models but without any syndromic data. Comparison with the null model enabled us to assess the added value of using syndromic data, where yesterday’s data is routinely available, to forecast today’s admissions, when recent admission data is not assumed to be available. These models were assessed by comparing the average absolute daily forecast errors.
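The comparison metric, the average absolute daily forecast error, is simply:

```python
def mean_absolute_error(observed, forecast):
    """Average absolute daily forecast error, used to compare each
    syndromic model against the null model."""
    return sum(abs(o - f) for o, f in zip(observed, forecast)) / len(observed)
```

A syndromic model adds value where its error is materially below the null model's for the same age band.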

Results

Data volume

During the study period (Oct 2013–Feb 2018), there was a mean of 15,916 hospital admissions each day, with 20,002 on the busiest day (29th Dec 2017). The largest syndromic system is GPIHSS, where mean coverage was 30.3 million registered patients and there was a mean count of 22,281 respiratory tract infections each day, with a peak of 62,454. For GPOOHSS the mean of daily consultations with a Read code recorded was 13,586. For NHS 111 the mean of daily total calls was 30,581. Finally, for EDSSS the mean of total daily attendances from all participating EDs was 5,173.

Model fitting

In the five seasons studied, peak activity in emergency hospital admissions was remarkably consistent, with the all-ages peak occurring on either 29th or 30th December every year. There was much more variation in the timing of peaks in syndromic indicators, particularly those most closely associated with influenza activity, e.g. ILI. For example, during 2016/17 the peak in ILI consultations occurred during late December/early January; however, in 2015/16 the peak was very late in the season, not occurring until late March (Fig 1).

Fig 1. Comparison of GP influenza-like illness consultations and emergency hospital admissions across two seasons.

Fig 1

All the syndromic indicators studied had significant associations with emergency hospital admissions, and most association models explained over 70% of the variation in admissions (Table 2). In general, associations were weaker in those aged <15 years, especially for 1–4 year olds. The syndromic indicators with the strongest associations were GPIHSS upper and lower respiratory tract infections, for all ages and all age bands except 15 to 44 years.

Table 2. Adjusted R square values (%) for regression models measuring association between syndromic indicators and emergency hospital admissions (higher values depicted with darker shades; highest value in each column shown in bold).

Adjusted R square value (%)
Age band (years)
Syndromic system and indicator under 1 1 to 4 5 to 14 15 to 44 45 to 64 65 to 74 over 75 All ages
emergency department syndromic surveillance system
acute respiratory infection 63.7 42.03 48.46 77.88 86.14 81.25 73.38 67.55
admitted respiratory 57.01 32.50 48.89 77.19 86.17 81.03 71.03 64.83
bronchitis 66.03 28.82 48.37 77.22 86.16 80.93 69.64 65.31
pneumonia 47.11 26.30 48.38 77.18 86.13 81.00 70.93 62.43
GP in-hours syndromic surveillance system
influenza-like illness 47.83 30.31 51.10 77.52 86.22 81.26 74.66 64.13
lower respiratory tract infection 78.21 51.32 49.71 77.55 86.23 82.44 77.85 71.90
pneumonia 48.20 30.98 48.35 77.18 86.12 80.99 72.28 63.44
upper respiratory tract infection 71.59 59.78 55.74 77.51 86.19 82.40 77.73 72.99
GP out of hours and unscheduled care syndromic surveillance system
acute respiratory tract infection 67.80 46.83 48.33 78.05 86.18 82.13 76.79 69.44
bronchitis 69.61 31.43 48.55 77.28 86.13 81.10 70.54 66.38
NHS 111– telehealth syndromic surveillance system
cold/flu 44.51 29.66 49.45 78.29 86.21 81.26 74.98 63.48
NULL model 45.61 23.89 48.31 77.15 86.12 80.86 69.13 61.58

Model forecasts

All forecast models used to predict daily admissions showed similar trends, and the resulting average absolute errors were low: daily admissions averaged around 16,000, but absolute errors were between 14 and 60 (Table 3). Histograms of absolute errors (not shown) were examined; there was no evidence of outliers having undue influence. For each age band, including a syndromic indicator reduced the forecast errors relative to the null model. However, for some models the reduction in forecast errors was marginal (less than two patients a day). The biggest improvements in forecast accuracy were in the youngest and eldest age bands. For example, the average forecast error for under 1s in the null model was 24.12, falling to 14.20 when GP lower respiratory tract infections were included in the model. The models with GP lower respiratory tract infections were better than the null model for every age band, whilst in four of the seven age bands the smallest errors were for the GP upper respiratory tract infection models. Estimation of errors by season confirmed that 2017/18 had the least accurate forecasts for most models (Supplementary S1 Table).

Table 3. Absolute mean forecast errors (daily admissions) in England by age band (lowest errors in each age band highlighted in bold).

Absolute mean forecast errors
Syndromic indicator in model Age band (years)
under 1 1 to 4 5 to 14 15 to 44 45 to 64 65 to 74 over 75
Null model 24.12 27.67 19.88 45.07 35.80 26.75 54.98
EDSSS: acute respiratory infection 19.54 25.94 19.66 44.26 35.82 26.34 52.80
EDSSS: admitted respiratory 21.73 26.84 19.71 45.19 36.17 27.14 59.25
EDSSS: bronchitis 18.85 27.61 19.68 44.78 35.35 26.49 54.23
EDSSS: pneumonia 23.28 27.04 19.70 44.72 35.50 26.54 54.04
GPIHSS: lower respiratory tract infection 14.20 21.81 19.83 45.04 35.94 25.82 46.86
GPIHSS: influenza-like illness 23.88 28.01 19.56 45.67 36.23 26.86 55.32
GPIHSS: pneumonia 23.72 26.38 19.85 45.39 35.93 26.91 54.10
GPIHSS: upper respiratory tract infection 16.68 19.63 18.75 45.13 35.97 25.82 46.68
GPOOHSS: bronchitis 17.08 26.26 19.85 45.17 35.80 26.59 54.36
GPOOHSS: acute respiratory tract infection 17.92 23.69 19.87 45.23 35.91 25.96 48.87
NHS 111: cold/flu 23.51 26.88 19.31 45.56 35.39 26.09 51.67

EDSSS: emergency department syndromic surveillance system

GPIHSS: GP in-hours syndromic surveillance system

GPOOHSS: GP out of hours and unscheduled care syndromic surveillance system

NHS 111: telehealth syndromic surveillance system

The null models accurately predicted the long-term increasing trend in hospital admissions and day of the week and holiday effects. However, during seasons with higher than average syndromic activity the null model underestimated the peak hospital admissions. For example, towards the end of 2014 and 2017, syndromic models using GP consultations for upper respiratory tract infection provided higher forecasts than the null models, which underestimated peak admissions (Fig 2).

Fig 2. Example forecast using GP consultations for upper respiratory tract infection compared to null forecast model with no syndromic data.

Fig 2

Discussion

Main finding of this study

We found that seasonal peaks in syndromic respiratory indicators responded to changes in the timing of seasonal respiratory pathogen activity, whereas hospital emergency admissions consistently peaked on 29th or 30th December. Therefore, these findings suggest that syndromic data do not provide any additional early warning of the timing of the highest seasonal peak in hospital admissions each year. Furthermore, we found that predictable seasonal, secular and day of the week effects are evident in emergency hospital admissions. However, during atypical influenza seasons, such as 2014/15 and 2017/18, there were excess hospital admissions that were associated with increased activity seen in syndromic respiratory indicators. Therefore, syndromic data can improve real-time forecasts of the intensity of peaks in emergency hospital admissions and excess activity outside of the peak period associated with seasonal respiratory disease.

What is already known on this topic

In some countries, hospital admission data are available in real-time and are part of syndromic surveillance [18–20]. However, in countries like France and the United Kingdom (UK) full validated admission episodes including diagnostic data are not available until after discharge, and often not for several months post-discharge [21].

Many studies into the burden of seasonal respiratory disease on hospitals have focussed on admissions with a diagnosis of influenza. However, Fleming et al. studied all respiratory admissions because they found that only a small proportion of respiratory admissions received an influenza diagnosis, and that complications of influenza could result in later admissions for pneumonia etc. [22]. In our study, we aimed to show the utility of syndromic surveillance for assisting hospital planners manage total demand, and therefore we considered all emergency admissions, not just those with a respiratory diagnosis.

Previous studies into associations between syndromic systems and hospital admissions have used GP in-hours syndromic data. Specifically, associations have been found between GP ILI consultations and hospital admissions for influenza and pneumonia in the elderly [13, 22, 23]. In addition to ILI, GP consultations for acute bronchitis in the elderly have been found to be associated with hospital respiratory admissions, linked to RSV activity [1]. Studies exploring syndromic associations with admissions have produced estimates for the percentage of attributable variation in hospital admissions that vary from 2.7% [22] to 15% [13].

Comparisons between time series and forecasting studies naturally consider lags in terms of temporal association and whether trends in one series consistently anticipate trends in another. Chan et al. found that the greatest lag and therefore potential for early warning of changes in hospital admissions was from children aged 5–17 years with ILI [24], whilst, Schanzer et al. found that the relationship between ED ILI attendances and hospital admissions was not consistent enough for forecasting [25]. However, it is often more useful to planners to predict peaks when demand is at its greatest, rather than these temporal associations [26]. Therefore, our study focussed on the timing of peak demand for hospital admissions and the intensity during these peaks. Future studies may also consider whether syndromic data can anticipate the start of surges in demand that lead to these peaks.

Previous studies in Canada and the UK have identified a consistent peak in hospital admissions between Christmas and New Year that leads to bed pressures and is independent of the timing of the peak in influenza incidence [1, 13, 23]. For example, Elliot et al. found that although the timing of peaks in GP ILI consultations varied widely by season, between 1990 and 2005 all but one of the 15 seasons had a peak in elderly respiratory hospital admissions occurring between weeks 52 and 02. Therefore, studies into associations between syndromic data and hospital admissions usually incorporate seasonality into their models [3], and seasonality is often found to have a stronger association than syndromic data [2, 13, 23, 26]. By contrast, Fuhrman et al. found that the variation in timing of peaks in hospital admissions in France was strongly associated with peak timing for GP ILI consultations [21]. However, this French study considered just hospital admissions for chronic obstructive pulmonary disease and not the totality of burden.

The timing of the peak in hospital admissions between Christmas and New Year complicates daily forecasting using syndromic data because public holidays affect both the availability of services and patient behaviours [27]. Elliot et al. imputed weekly rates during the holiday period from adjacent weeks [1]. By contrast, we smoothed the data to remove holiday effects, using methods for routine surveillance described elsewhere [16]. PHE uses a range of syndromic systems, including both GP in-hours and out-of-hours to provide surveillance coverage all year round, including holiday periods.

We found that hospital admission peak intensity was significantly associated with syndromic indicators, seasonality, and an increasing secular trend. Other studies also found a significant increasing secular trend in hospital admissions for elderly patients [1, 2, 13, 22]. Patients over 75 years have been found to account for most hospital influenza admissions and the greatest pressure on bed days [22].

What this study adds

The findings from this study support earlier studies that showed an association between syndromic data and hospital admissions, and that the seasonality of hospital admissions is less sensitive to changing trends in seasonal respiratory disease than syndromic indicators are. Furthermore, our results confirm that syndromic data can assist in predicting excess admissions associated with syndromic indicators for seasonal respiratory disease.

Many previous studies on the association between syndromic data and hospital admissions focus just on specific age groups or admissions with specific respiratory diagnoses. One of the strengths of our study is the consideration of all emergency admissions, which represent the excess demand that confronts those managing health care pressures each winter. Furthermore, whilst previous studies have considered just one source of syndromic data, usually GP consultations, we have been able to use a range of syndromic data available from a working system, including telehealth calls, GP consultations and ED attendance data. Finally, unlike previous studies, we have studied daily not weekly data, as any future forecasts developed from syndromic data will be most applicable if used in real-time.

Limitations of this study

The data used in this study were not patient identifiable linked data and therefore a causal link between increases in respiratory illness, syndromic indicators and hospital admissions cannot be demonstrated. Also, we have not studied other potential causal factors for hospital admissions, such as temperature or the timing of school holidays. Future work may be able to improve forecasts by including environmental data where it is available in real-time.

Our primary concern was to provide accurate forecasts; therefore, we have not presented an analysis of how the strength of associations between syndromic indicators and hospital admissions vary when a time lag is included. Future work could explore lags between time series to provide a deeper understanding of which syndromic indicators are most likely to provide early warning of hospital admissions.

For simplicity, we assumed only a linear secular trend in our regression models. All models underestimated the number of admissions in 2017/18 and it may be that there is a long-term increasing non-linear trend.

This study has only considered individual syndromic indicators monitored through real time systems. There may be further potential in exploring the combination of individual syndromic indicators, which may improve the accuracy of the forecast models.

Finally, whilst the timing of peaks in admissions are important, it would also be useful to give early warning of the surges in demand that lead to peaks. Future work could ascertain whether syndromic surveillance is able to predict any initial surges in hospital admissions.

Implications for public health practice

We have shown that syndromic surveillance data can improve real-time forecasts of peak intensity for emergency hospital admissions. Furthermore, we have used syndromic data available daily as part of working surveillance systems. Therefore, our results demonstrate that real-time forecast models can be constructed which use syndromic data to improve accuracy during seasons when there are additional pressures due to above average seasonal respiratory illness.

Supporting information

S1 Table. Absolute mean forecast errors (daily admissions) in England by age band, stratified by season.

(DOCX)

Acknowledgments

We acknowledge support from NHS Digital and NHS 111; Royal College of Emergency Medicine emergency departments participating in the Emergency Department Syndromic Surveillance System (EDSSS); EMIS Health and L2S2 Ltd; out-of-hours providers submitting data to General Practitioners Out-of-Hours and Advanced; The Phoenix Partnership (TPP), and participating SystemOne practices and University of Oxford, ClinRisk, EMIS Health, and EMIS practices submitting data to the QSurveillance database.

Data Availability

The data underlying the results presented in the study are available from Public Health England, https://www.gov.uk/government/organisations/public-health-england.

Funding Statement

The author(s) received no specific funding for this work.

References

  • 1.Elliot AJ, Cross KW, Fleming DM. Acute respiratory infections and winter pressures on hospital admissions in England and Wales 1990–2005. J Public Health (Oxf). 2008;30(1):91–8. [DOI] [PubMed] [Google Scholar]
  • 2.Wargon M, Brun-Ney D, Beaujouan L, Casalino E. No more winter crisis? Forecasting daily bed requirements for emergency department admissions to hospital. Eur J Emerg Med. 2018;25(4):250–6. 10.1097/MEJ.0000000000000451 [DOI] [PubMed] [Google Scholar]
  • 3.Boyle J, Jessup M, Crilly J, Green D, Lind J, Wallis M, et al. Predicting emergency department admissions. Emerg Med J. 2012;29(5):358–65. 10.1136/emj.2010.103531 [DOI] [PubMed] [Google Scholar]
  • 4.Smith S, Morbey R, Pebody RG, Hughes TC, de Lusignan S, Yeates FA, et al. Retrospective Observational Study of Atypical Winter Respiratory Illness Season Using Real-Time Syndromic Surveillance, England, 2014–15. Emerg Infect Dis. 2017;23(11):1834–42. 10.3201/eid2311.161632 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Harcourt SE, Morbey RA, Loveridge P, Carrilho L, Baynham D, Povey E, et al. Developing and validating a new national remote health advice syndromic surveillance system in England. J Public Health. 2016;39(1):184–92. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Harcourt SE, Smith GE, Elliot AJ, Pebody R, Charlett A, Ibbotson S, et al. Use of a large general practice syndromic surveillance system to monitor the progress of the influenza A(H1N1) pandemic 2009 in the UK. Epidemiol Infect. 2011;140(1):100–5. 10.1017/S095026881100046X [DOI] [PubMed] [Google Scholar]
  • 7.Harcourt SE, Fletcher J, Loveridge P, Bains A, Morbey R, Yeates A, et al. Developing a new syndromic surveillance system for the London 2012 Olympic and Paralympic Games. Epidemiol Infect. 2012;140(12):2152–6. 10.1017/S0950268812001781 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Elliot AJ, Hughes HE, Hughes TC, Locker TE, Shannon T, Heyworth J, et al. Establishing an emergency department syndromic surveillance system to support the London 2012 Olympic and Paralympic Games. Emerg Med J. 2012;29(12):954–60. 10.1136/emermed-2011-200684 [DOI] [PubMed] [Google Scholar]
  • 9.Morbey RA, Elliot AJ, Harcourt S, Smith S, de Lusignan S, Pebody R, et al. Estimating the burden on general practitioner services in England from increases in respiratory disease associated with seasonal respiratory pathogen activity. Epidemiol Infect. 2018;146:1389–96. 10.1017/S0950268818000262 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Morbey RA, Harcourt S, Pebody R, Zambon M, Hutchison J, Rutter J, et al. The burden of seasonal respiratory infections on a national telehealth service in England. Epidemiol Infect. 2017;145:1922–32. 10.1017/S095026881700070X [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Hughes HE, Morbey R, Hughes TC, Locker TE, Pebody R, Green HK, et al. Emergency department syndromic surveillance providing early warning of seasonal respiratory activity in England. Epidemiol Infect. 2016;144(5):1052–64. 10.1017/S0950268815002125 [DOI] [PubMed] [Google Scholar]
  • 12.NHS Digital. Hospital Episode Statistics (HES) 2018 [updated 20/09/2018; cited 15/2/19]. https://digital.nhs.uk/data-and-information/data-tools-and-services/data-services/hospital-episode-statistics.
  • 13.Nguyen-Van-Tam JS, Brockway CR, Pearson JC, Hayward AC, Fleming DM. Excess hospital admissions for pneumonia and influenza in persons > or = 65 years associated with influenza epidemics in three English health districts: 1987–95. Epidemiol Infect. 2001;126(1):71–9. 10.1017/s0950268801005076 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Public Health England. Annual flu reports London 2019 [updated 24/5/18; cited 15/2/19]. https://www.gov.uk/government/statistics/annual-flu-reports.
  • 15.Stolwijk AM, Straatman H, Zielhuis GA. Studying seasonality by using sine and cosine functions in regression analysis. J Epidemiol Community Health. 1999;53(4):235–8. 10.1136/jech.53.4.235 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Buckingham-Jeffery E, Morbey R, House T, Elliot AJ, Harcourt S, Smith GE. Correcting for day of the week and public holiday effects: improving a national daily syndromic surveillance service for detecting public health threats. BMC Public Health. 2017;17(1):477. 10.1186/s12889-017-4372-y [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Burman P. A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika. 1989;76(3):503–14. [Google Scholar]
  • 18.Dembek ZF, Carley K, Siniscalchi A, Hadler J. Hospital admissions syndromic surveillance—Connecticut, September 2001–November 2003. MMWR supplements. 2004;53:50–2. [PubMed] [Google Scholar]
  • 19.Hadler JL, Siniscalchi A, Dembek Z. Hospital admissions syndromic surveillance—Connecticut, October 2001-June 2004. MMWR supplements. 2005;54:169–73. [PubMed] [Google Scholar]
  • 20.Lesesne SB, Rojas Smith L, Rein DB. Characteristics and Geographical Dispersion of Syndromic Surveillance Systems in the United States in 2008. Research Triangle Park, NC: RTI Press; 2010. www.rti.org/rtipress. [Google Scholar]
  • 21.Fuhrman C, Roche N, Vergnenegre A, Zureik M, Chouaid C, Delmas MC. Hospital admissions related to acute exacerbations of chronic obstructive pulmonary disease in France, 1998–2007. Respir Med. 2011;105(4):595–601. 10.1016/j.rmed.2010.11.014 [DOI] [PubMed] [Google Scholar]
  • 22.Fleming D, Harcourt S, Smith G. Influenza and adult hospital admissions for respiratory conditions in England 1989–2001. Commun Dis Public Health. 2003;6(3):231–7. [PubMed] [Google Scholar]
  • 23.Upshur RE, Moineddin R, Crighton E, Kiefer L, Mamdani M. Simplicity within complexity: seasonality and predictability of hospital admissions in the province of Ontario 1988–2001, a population-based analysis. BMC Health Serv Res. 2005;5(1):13 10.1186/1472-6963-5-13 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Chan EH, Tamblyn R, Charland KM, Buckeridge DL. Outpatient physician billing data for age and setting specific syndromic surveillance of influenza-like illnesses. Journal of biomedical informatics. 2011;44(2):221–8. 10.1016/j.jbi.2010.10.001 [DOI] [PubMed] [Google Scholar]
  • 25.Schanzer DL, Schwartz B. Impact of seasonal and pandemic influenza on emergency department visits, 2003–2010, Ontario, Canada. Acad Emerg Med. 2013;20(4):388–97. 10.1111/acem.12111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Soyiri IN, Reidpath DD, Sarran C. Forecasting asthma-related hospital admissions in London using negative binomial models. Chron Respir Dis. 2013;10(2):85–94. 10.1177/1479972313482847 [DOI] [PubMed] [Google Scholar]
  • 27.Chakraborty P, Lewis B, Eubank S, Brownstein JS, Marathe M, Ramakrishnan N. What to know before forecasting the flu. PLoS Comput Biol. 2018;14(10):e1005964 10.1371/journal.pcbi.1005964 [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Oliver Gruebner

29 Oct 2019

PONE-D-19-24776

Can syndromic surveillance help forecast winter hospital bed pressures in England? - Using routine daily syndromic surveillance data to forecast the winter peak in demand for hospital beds.

PLOS ONE

Dear Dr Morbey,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I would be happy if you can clarify on the points both reviewers have raised, with particular focus on the methodological section. Many thanks!

We would appreciate receiving your revised manuscript by Dec 13 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Oliver Gruebner

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on software sharing (http://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-software) for manuscripts whose main purpose is the description of a new software or software package. In this case, new software must conform to the Open Source Definition (https://opensource.org/docs/osd) and be deposited in an open software archive. Please see http://journals.plos.org/plosone/s/materials-and-software-sharing#loc-depositing-software for more information on depositing your software. Please also amend your Data availability statement to outline how other researchers may access the data used in this study, for instance by providing a direct link/URL or contact details, including an email address or phone number, to the relevant authority where the data is kept. Please also ensure that the specific dataset is identified, or that the Methods section contains enough detail for another researcher to reproduce the dataset.

3. Please remove the 'Draft not for onward circulation' watermark from the background of the manuscript pages (page 1 onwards).

4. Please amend either the title on the online submission form (via Edit Submission) or the title in the manuscript so that they are identical.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this study, the authors have investigated whether the range of syndromic surveillance data available in England may help to predict the winter demand for hospital beds. This bed demand is clearly influenced by the amount of respiratory disease in the community, so these syndromic data may indeed be particularly valuable for healthcare planning, and this is potentially a very valuable study. However, I'm afraid that the methods section doesn't provide enough detail about the model design and evaluation for me to thoroughly assess the results and the authors' conclusions.

Major comments

==============

1. It's unclear exactly what each model comprised. Regarding the peak timing models, the authors state:

"Models included a variable for linear trend, and Fourier variables to model seasonal variation (15). In addition to variables for weekends and public holidays, 25 December was also included as an additional binomial variable to account for differing presenting behaviour at Christmas compared to other public holidays (16). Each syndromic indicator was modelled separately, stratified by age group."

If I have interpreted this correctly, for each of the syndromic indicators there were 7 regression models, one for each of the following age groups: under 1 year, 1 to 4, 5 to 14, 15 to 44, 45 to 64, 65 to 74 and 75+ years. Is this correct? If so, presumably these 7 models (for a single indicator) did not necessarily share the same coefficients for the linear term, the seasonal terms, and the weekend, public holiday, and Christmas Day binomial variables? What were the coefficients for each model, and which seasonal terms (sines and cosines) were included?

2. What criteria were used to fit each model? Am I correct in thinking that the authors minimised the mean absolute error, as they used in a later section to assess model forecasts?

3. Regarding the peak timing models, why wasn't a null model (i.e., without a syndromic indicator) included for comparison? This would be very helpful for interpreting the adjusted R-squared values listed in Table 2.

4. The same models were then used to "assess the utility of syndromic data for forecasting" (and this time a null model was included). In this case, for each year Y in turn, the models were fit to data from all years except Y. From Figure 2 I gather that the daily bed demand on day D was predicted by each model using the syndromic indicator data up to, and including, day D. Is this correct? If so, these results should be called "nowcasts" instead of "forecasts". If not, and the models were used to predict daily bed demand for future days D+1 onward, Figure 2 should identify the forecasting date(s) and include confidence intervals.

5. In evaluating the improvements in forecast performance, the authors report the mean absolute error for each model and each age group in Table 3, averaged over the five winters. Relative to the null model, some of the syndromic models showed substantial improvements for children aged under 5 and adults aged 75+. But this was not evident for those aged 5-74. Did the authors also look at, e.g., histograms of the absolute errors for each model? It would be very interesting to know whether some of the models did yield better predictions than the null model for those aged 5-74 for the most part, but also yielded a handful of predictions with large errors.

6. It would also be very useful to see the mean absolute errors presented separately for each year, since the study period (2013-2018) included two "atypical influenza seasons" (2014/15 and 2017/18). This should hopefully provide more detailed evidence to support the authors' claim that "syndromic data can improve real-time forecasts of the intensity of peaks in emergency hospital admissions and excess activity outside of the peak period associated with seasonal respiratory disease".

In a similar vein, how did the adjusted R-squared for each model (as listed in Table 2) vary from one year to the next? This could be nicely presented in a reasonably simple plot, and would complement the results in Table 2.

Minor comments

==============

1. In the methods section, I find the sub-section titles "Peak timing" and "Peak intensity" somewhat confusing, when they could seemingly be titled "Model fitting" and "Model nowcasts/forecasts". This is especially true since the peak demand for hospital beds was always the 29th or 30th of December.

Reviewer #2: Thank you for an excellent manuscript. I have read it with great interest. The aims and methods of the study are clearly stated, results are presented in a clear way, figures illustrate the most important points, discussion offers relevant insights into the applicability of the results and future directions for research, and conclusion corresponds to the stated aims. I would only have very minor suggestions and questions for the authors to consider:

1. The conclusion, as formulated in the first paragraph of the Discussion, outlines very well the biggest advantage of adding syndromic indicators (forecasting the intensity of peak activity in atypical seasons), and where it adds less value (predicting the timing of the highest seasonal peak). Perhaps the conclusion in the Abstract could be specified similarly.

2. The authors discuss why they used all emergency admissions rather than respiratory admissions as the outcome (264-266). I wonder if the reasoning for this choice could be mentioned already in the Introduction. Could using all admissions contribute to the regular peak on 29-30 December? Potentially, the peak in admissions related to respiratory diseases might be more variable.

3. Moving averages are described in Methods (line 110) and Figure 1 suggests that they are averaged over 7 days. If that is correct, this could be mentioned in the Methods.

4. The authors discuss briefly the lag of temporal association (from line 224). Have you also modelled the emergency admissions considering some time lag (the indicator value from x days ago)? As a reader, I would be interested to see whether you tested such models as well, and if not, why.

5. Lines 377 and 379 could be removed.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Feb 10;15(2):e0228804. doi: 10.1371/journal.pone.0228804.r002

Author response to Decision Letter 0


6 Dec 2019

Response to reviewers

Thank you for the helpful comments from the editor and reviewers. We have addressed each point raised below; our responses are shown in red.

Editor comments

Please remove the 'Draft not for onward circulation' watermark from the background of the manuscript pages (page 1 onwards).

We have removed the watermark from the manuscript.

Please amend either the title on the online submission form (via Edit Submission) or the title in the manuscript so that they are identical.

We have amended the title in the online submission form so that it is the same as in the manuscript.

Reviewers' comments:

Reviewer #1:

In this study, the authors have investigated whether the range of syndromic surveillance data available in England may help to predict the winter demand for hospital beds. This bed demand is clearly influenced by the amount of respiratory disease in the community, so these syndromic data may indeed be particularly valuable for healthcare planning, and this is potentially a very valuable study. However, I'm afraid that the methods section doesn't provide enough detail about the model design and evaluation for me to thoroughly assess the results and the authors' conclusions.

We thank the reviewer for their constructive and comprehensive review of our paper. The major comments seem to ask more for clarification for the reviewer than for amendments to the manuscript; however, where appropriate we have added material to provide further clarification for the reader.

Major comments

==============

1. It's unclear exactly what each model comprised. Regarding the peak timing models, the authors state:

"Models included a variable for linear trend, and Fourier variables to model seasonal variation (15). In addition to variables for weekends and public holidays, 25 December was also included as an additional binomial variable to account for differing presenting behaviour at Christmas compared to other public holidays (16). Each syndromic indicator was modelled separately, stratified by age group."

If I have interpreted this correctly, for each of the syndromic indicators there were 7 regression models, one for each of the following age groups: under 1 year, 1 to 4, 5 to 14, 15 to 44, 45 to 64, 65 to 74 and 75+ years. Is this correct? If so, presumably these 7 models (for a single indicator) did not necessarily share the same coefficients for the linear term, the seasonal terms, and the weekend, public holiday, and Christmas Day binomial variables? What were the coefficients for each model, and which seasonal terms (sines and cosines) were included?

The reviewer is correct in their interpretation: there were separate regression models for each syndromic indicator and age group, each with its own coefficients for the seasonal, weekend and other terms. We do not feel it is practicable to present the coefficients for each model separately because there are too many models. In addition to stratifying by indicator and age group, we also stratified by four English regions and five seasons (we have not presented separate results by region because we felt this was an unnecessary complication that did not add to the paper), and the stratification by season was necessary for the k-fold validation of forecasts. Consequently, publishing model coefficients would require an additional 1,540 (= 11 × 7 × 4 × 5) tables!

Two seasonal terms were used in our models; the following Stata code shows how they were created using the day of the year (doy) function: cos(2*_pi*doy(date)/365) and sin(2*_pi*doy(date)/365). We have amended the text to clarify that a single pair of Fourier terms was used.
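For readers unfamiliar with Stata, the same pair of annual Fourier terms can be sketched in Python. This is a hedged translation, not the authors' code; the function name `fourier_terms` is illustrative:

```python
import math
from datetime import date

def fourier_terms(d):
    """Single annual Fourier pair for day-of-year seasonality, mirroring
    the Stata expressions cos(2*_pi*doy(date)/365) and
    sin(2*_pi*doy(date)/365)."""
    doy = d.timetuple().tm_yday  # day of the year, 1..366
    angle = 2 * math.pi * doy / 365
    return math.cos(angle), math.sin(angle)

# 29 December (near the observed admissions peak) sits close to the end
# of the annual cycle, so the cosine term is close to 1.
c, s = fourier_terms(date(2017, 12, 29))
```

A single sine/cosine pair captures one smooth annual cycle; adding further harmonics (pairs at 2x, 3x the base frequency) would allow sharper seasonal shapes, at the cost of more coefficients per model.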

2. What criteria was used to fit each model? Am I correct in thinking that the authors minimised the mean absolute error, as they used in a later section to assess model forecasts?

Yes, this is correct in that we sought to minimise mean absolute error in our forecast model fitting.

3. Regarding the peak timing models, why wasn't a null model (i.e., without a syndromic indicator) included for comparison? This would be very helpful for interpreting the adjusted R-squared values listed in Table 2.

Thank you for this suggestion, we have added a row for a null model to table 2.

4. The same models were then used to "assess the utility of syndromic data for forecasting" (and this time a null model was included). In this case, for each year Y in turn, the models were fit to data from all years except Y. From Figure 2 I gather that the daily bed demand on day D was predicted by each model using the syndromic indicator data up to, and including, day D. Is this correct? If so, these results should be called "nowcasts" instead of "forecasts". If not, and the models were used to predict daily bed demand for future days D+1 onward, Figure 2 should identify the forecasting date(s) and include confidence intervals.

The reviewer is correct that daily bed demand on day D was predicted using syndromic data available up to day D; however, this does not include syndromic data for day D itself, only up to the day before. Our aim was to replicate the conditions in which forecasts could be applied within our surveillance service, i.e. a daily service where each day we look at the syndromic data collected during the previous day. We did not include confidence intervals in figure 2 because it is not in the format where the leftmost point represents the date of the forecast with a single fan-shaped confidence interval showing increasing uncertainty over time. Instead, figure 2 shows a series of daily forecasts, each predicting just one day ahead.

5. In evaluating the improvements in forecast performance, the authors report the mean absolute error for each model and each age group in Table 3, averaged over the five winters. Relative to the null model, some of the syndromic models showed substantial improvements for children aged under 5 and adults aged 75+. But this was not evident for those aged 5-74. Did the authors also look at, e.g., histograms of the absolute errors for each model? It would be very interesting to know whether some of the models did yield better predictions than the null model for those aged 5-74 for the most part, but also yielded a handful of predictions with large errors.

We have added a line to the text to reflect that we did examine the histograms for the different indicators across the age bands and did not find any evidence of outliers having undue influence. Although there was wider variance and longer tails for the 5-64 years age bands, this was also the case for the null model.

6. It would also be very useful to see the mean absolute errors presented separately for each year, since the study period (2013-2018) included two "atypical influenza seasons" (2014/15 and 2017/18). This should hopefully provide more detailed evidence to support the authors claim that "syndromic data can improve real-time forecasts of the intensity of peaks in emergency hospital admissions and excess activity outside of the peak period associated with seasonal respiratory disease".

We agree with the reviewer that showing mean absolute errors for each season could be useful, so we have added this as a supplementary table (S1) and provided some additional text in the results. In most cases the absolute errors are highest for 2017/18, which agrees with our commentary that the models under-estimated this season.

In a similar vein, how did the adjusted R-squared for each model (as listed in Table) vary from one year to the next? This could be nicely presented in a reasonably simple plot, and would complement the results in Table 2.

We decided not to include a table or plot of model fit by season because we felt this would require considerable extra explanation to ensure the results were not misinterpreted. The risk of misinterpretation arises from the k-fold validation: e.g. the models we created for forecasting the 2017/2018 season included data from all five seasons except 2017/18; we did not create forecast models using just the data for 2017/2018. Therefore, if we were to present a version of table 2 stratified by season, we would have to explain that each model is based on the exclusion of one year rather than the inclusion of one year. Consequently, if season 2017/2018's data were causing a poor fit, this would result in all of the models having a poorer fit except for the one labelled "2017/2018 excluded."
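The leave-one-season-out k-fold scheme just described, where each forecast model is fit on every season except the one being predicted, can be sketched as:

```python
def leave_one_season_out(data_by_season):
    """Yield (held_out_season, training_data) pairs: each forecast model
    is fit on all seasons except the one being predicted, matching the
    k-fold validation described above."""
    for season in data_by_season:
        train = {s: d for s, d in data_by_season.items() if s != season}
        yield season, train

# Toy data: the season labels and values are illustrative only.
seasons = {"2015/16": [1, 2], "2016/17": [3, 4], "2017/18": [5, 6]}
for held_out, train in leave_one_season_out(seasons):
    assert held_out not in train  # e.g. the 2017/18 model never sees 2017/18 data
```

This makes the authors' point concrete: an unusual season such as 2017/18 degrades the fit of every fold except the one from which it is excluded, so per-season fit statistics would be easy to misread.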

Minor comments

==============

1. In the methods section, I find the sub-section titles "Peak timing" and "Peak intensity" somewhat confusing, when they could seemingly be titled "Model fitting" and "Model nowcasts/forecasts". This is especially true since the peak demand for hospital beds was always the 29th or 30th of December.

We agree with the reviewer that these sub-headings are unhelpful, particularly for the methods section, and we welcome the suggestion to change them to Model fitting and Model forecasts. For consistency, we have also changed the sub-heading in the results.

Reviewer #2:

Thank you for an excellent manuscript. I have read it with great interest. The aims and methods of the study are clearly stated, results are presented in a clear way, figures illustrate the most important points, discussion offers relevant insights into the applicability of the results and future directions for research, and conclusion corresponds to the stated aims. I would only have very minor suggestions and questions for the authors to consider:

We thank the reviewer for their very positive and supportive review.

1. The conclusion, as formulated in the first paragraph of the Discussion, outlines very well the biggest advantage of adding syndromic indicators (forecasting the intensity of peak activity in atypical seasons), and where it adds less value (predicting the timing of the highest seasonal peak). Perhaps the conclusion in the Abstract could be specified similarly.

Thank you for this suggestion, we have changed the abstract conclusion accordingly.

2. The authors discuss why they used all emergency admissions rather than respiratory admissions as the outcome (264-266). I wonder if the reasoning for this choice could be mentioned already in the Introduction. Could using all admissions contribute to the regular peak on 29-30 December? Potentially, the peak in admissions related to respiratory diseases might be more variable.

Prior to our study, we spent considerable time discussing with relevant experts the advantages and disadvantages of modelling 'all emergency admissions' versus 'respiratory admissions'. Although we believed that using just respiratory emergency admissions would give us stronger associations and better model fit, our stakeholders informed us that a forecast for total emergency admissions would be more valuable for healthcare service planners dealing with and managing pressures. Also, the greatest pressures felt by hospitals occur when total emergency admissions peak, whatever the case mix. Therefore, we decided that the most useful research question was whether syndromic data were useful for forecasting all emergency admissions. We agree with the reviewer that this should be mentioned earlier in the paper and have added a sentence to the methods section under data collection.

3. Moving averages are described in Methods (line 110) and Figure 1 suggests that they are averaged over 7 days. If that is correct, this could be mentioned in the Methods.

We have added clarification to the methods that it is a 7-day moving average that is used.
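A 7-day moving average of this kind can be sketched as follows. Whether the published figures use a trailing or a centred window is not stated here, so the trailing form below is one plausible assumption:

```python
def moving_average_7(series):
    """Trailing 7-day moving average; the first few days use the shorter
    window available. A centred window is an equally plausible choice;
    this trailing form is an assumption for illustration."""
    out = []
    for i in range(len(series)):
        window = series[max(0, i - 6): i + 1]  # up to 7 most recent values
        out.append(sum(window) / len(window))
    return out

# A constant series is unchanged by the smoothing.
print(moving_average_7([7] * 10))
```

A 7-day window is the natural choice for syndromic data because it removes the strong day-of-week pattern in healthcare-seeking behaviour while preserving the slower seasonal signal.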

4. The authors discuss briefly the lag of temporal association (from line 224). Have you also modelled the emergency admissions considering some time lag (the indicator value from x days ago)? As a reader, I would be interested to see whether you tested such models as well, and if not, why.

We did explore lags in association in the preparation work for this study. However, our primary concern was to create as accurate a forecast as possible rather than model the lagged associations between the time series and we found the most accurate forecasts included the most recent data available. We have added a comment to the limitations section to note the importance of considering time lags in any future studies.

5. Lines 377 and 379 could be removed.

We have removed the unnecessary heading for Figure Legends.


Attachment

Submitted filename: Response to reviewersV03.docx

Decision Letter 1

Oliver Gruebner

24 Jan 2020

Can syndromic surveillance help forecast winter hospital bed pressures in England?

PONE-D-19-24776R1

Dear Dr. Morbey,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Please see the comments of reviewer #1 under point 6 below. Thank you!

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Oliver Gruebner

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: In this study, the authors have investigated whether the range of syndromic surveillance data available in England may help to predict the winter demand for hospital beds. This bed demand is clearly influenced by the amount of respiratory disease in the community, so these syndromic data may indeed be particularly valuable for healthcare planning and this is potentially a very valuable study.

I thank the authors for their responses to my original comments, which have thoroughly addressed my concerns about the level of detail in the methods section. I have only one minor suggestion (see note 3, below) about this revised version.

1. Given the multiple levels of model stratification, I agree with the authors that the sheer number of model parameters is simply too large to include as additional tables. Thank you for clarifying that each model included two seasonal terms, rather than an arbitrary number of seasonal terms.

2. Thank you for including null model results in Table 2, which I find helpful for putting the results obtained from the other models into context.

3. My confusion about forecasting versus nowcasting stems from the use of the phrase "[using data] up to day D", because I find this wording ambiguous about whether day D itself is included. I think it would be helpful to include a remark to this effect in the "Model forecasts" part of the Methods section.

4. I thank the authors for noting in the text that there was no evidence of outliers affecting the mean absolute errors.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Oliver Gruebner

28 Jan 2020

PONE-D-19-24776R1

Can syndromic surveillance help forecast winter hospital bed pressures in England?

Dear Dr. Morbey:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Oliver Gruebner

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Absolute mean forecast errors (daily admissions) in England by age band, stratified by season.

    (DOCX)

    Attachment

    Submitted filename: Response to reviewersV03.docx

    Data Availability Statement

    The data underlying the results presented in the study are available from Public Health England, https://www.gov.uk/government/organisations/public-health-england.

