PLOS Computational Biology. 2020 Oct 22;16(10):e1008301. doi: 10.1371/journal.pcbi.1008301

Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like illness

Sen Pei 1,*, Jeffrey Shaman 1,*
Editor: Sara Y Del Valle
PMCID: PMC7608986  PMID: 33090997

Abstract

Influenza-like illness (ILI) is a commonly measured syndromic signal representative of a range of acute respiratory infections. Reliable forecasts of ILI can support better preparation for patient surges in healthcare systems. Although ILI is an amalgamation of multiple pathogens with variable seasonal phasing and attack rates, most existing process-based forecasting systems treat ILI as a single infectious agent. Here, using ILI records and virologic surveillance data, we show that the ILI signal can be disaggregated into distinct viral components. We generate separate predictions for six contributing pathogens (influenza A/H1, A/H3, B, respiratory syncytial virus, and human parainfluenza virus types 1–2 and 3), and develop a method to forecast ILI by aggregating these predictions. The relative contribution of each pathogen to the total ILI signal is estimated using a Markov chain Monte Carlo (MCMC) method during forecast aggregation. We find highly variable overall contributions from influenza type A viruses across seasons, but relatively stable contributions from the other pathogens. Using historical data from 1997 to 2014 at US national and regional levels, the proposed forecasting system generates improved predictions of both seasonal and near-term targets relative to a baseline method that simulates ILI as a single pathogen. The hierarchical forecasting system can generate predictions for each viral component, as well as infer and predict their contributions to ILI, which may additionally help physicians determine the etiological causes of ILI in clinical settings.

Author summary

Influenza-like illness (ILI) is a widely used medical diagnosis of possible infection with influenza or another acute respiratory illness. Accurate forecasting of ILI can support better planning of interventions against respiratory infections, as well as early preparation for patient surges in healthcare facilities during periods of peak incidence. Although ILI is an amalgamation of multiple pathogens with variable seasonal phasing and contributions to incidence, to our knowledge, all existing process-based forecasting systems treat ILI as a single infectious agent. This leads to model misspecification that compromises forecast precision. In this study, we address this issue by forecasting ILI as the aggregation of predictions for individual contributing respiratory viruses. Using ILI records and virologic surveillance data, we show that the ILI signal can be disaggregated into distinct viral components and develop a method to forecast ILI by aggregating predictions for six pathogens. We find highly variable overall contributions from influenza type A viruses across seasons, but relatively stable contributions from other pathogens. In retrospective forecasts, the proposed multi-pathogen forecasting system generates substantially more accurate predictions of both seasonal and near-term targets relative to a baseline method that simulates ILI as a single pathogen.

Introduction

Acute respiratory infections impose heavy morbidity and mortality burdens on global populations, especially children and the elderly [1]. Influenza-like illness (ILI), defined as fever (temperature of 100°F [37.8°C] or greater) with a cough and/or sore throat, is a widely used medical diagnosis of possible infection with influenza or another acute respiratory illness. The US Centers for Disease Control and Prevention (CDC) collects surveillance information on the percentage of patient visits to healthcare providers for ILI through the US Outpatient Influenza-like Illness Surveillance Network (ILINet) and has adopted ILI as the primary signal to track influenza activity in the United States [2]. Accurate forecasting of ILI can support better planning of both pharmaceutical (e.g., vaccines, antivirals and prophylaxis) and non-pharmaceutical (e.g., school closure, social distancing and travel restrictions) interventions against respiratory infections, as well as early preparation for patient surges in healthcare facilities during peak periods of incidence. In recent years, a number of forecasting systems for ILI have been developed [3–14], many of which have been applied operationally to forecast ILI in the United States [15–18].

Despite its name, ILI is not exclusively caused by influenza. Common viruses contributing to ILI include influenza, respiratory syncytial virus, human parainfluenza virus, coronavirus, human metapneumovirus, respiratory adenovirus and rhinovirus [19–22]. Although typically indistinguishable in clinical settings, these infections have disparate seasonal characteristics and vary in their contributions to ILI during different times of the year. Although ILI is a syndromic record that represents a range of illnesses, to our knowledge, all current process-based forecasting systems treat ILI as a single pathogen [16–18]. Such oversimplification may lead to model misspecification that compromises forecast precision. In addition, single-pathogen forecasting systems are unable to estimate and predict the relative contribution of each component pathogen.

To generate more accurate and precise ILI forecasts, the inclusion of more complex processes is needed to reduce model misspecification. Currently, laboratory-confirmed positivity rate data (defined as the fraction of samples testing positive for a specific respiratory pathogen) for several respiratory pathogens circulating in the US are available in near-real time from the National Respiratory and Enteric Virus Surveillance System (NREVSS) [23]. These pathogen-specific observational data might support prediction of each virus singly, and further improve the precision of ILI forecasting. Indeed, previous research has found that forecasting individual pathogens produces more precise predictions of those specific agents due to better-specified models [24–26].

Here, using ILI records from ILINet and virologic surveillance data from the NREVSS, we show that the ILI signal can be recovered by aggregating multiple viral components, specifically influenza A/H1, A/H3, B, respiratory syncytial virus (RSV), and human parainfluenza virus types 1–2 (PIV12) and 3 (PIV3). Further, we demonstrate that more accurate and precise forecasts of ILI can be obtained by aggregating predictions of these pathogens. Retrospective forecasts for the 1997–1998 to 2013–2014 seasons at national and regional levels demonstrate that this multi-pathogen forecasting system generates substantially improved predictions of both seasonal (onset week, peak week and peak intensity) and near-term (one- to four-week ahead ILI) targets relative to a competing method modeling ILI as a single agent. The multi-pathogen forecasting system, capable of generating predictions for each virus, can infer and predict their relative contributions to ILI as well.

Materials and methods

Data

Weekly ILI records for US national and 10 HHS (US Department of Health and Human Services) regions were obtained from the CDC FluView website [2]. Positivity rate data for influenza types and subtypes are also available from FluView. In this study, we focus on three influenza types/subtypes: A/H1, A/H3 and B. Weekly laboratory test results for RSV, PIV12 and PIV3 at national and regional levels were obtained from NREVSS. Positivity rate data for several other respiratory pathogens (e.g., coronavirus, human metapneumovirus and respiratory adenovirus) are also reported but with limited historical records. We therefore constrained the study to the six viruses with data available since 1997: influenza A/H1, A/H3, B, RSV, PIV12 and PIV3. Participating laboratories voluntarily report to the NREVSS the total number of weekly tests performed to detect each virus and the total number of positive tests (including virus antigen detections, isolations by culture, and polymerase chain reaction (PCR)). The participating laboratories are spread throughout the US [23] and are thus fairly geographically representative of the broader population. However, as testing was ordered independently by healthcare providers, it is unknown whether samples were biased toward certain groups of patients.

In total, we used ILI and laboratory test data from the 1997–1998 to 2013–2014 seasons, excluding 2008–2009 and 2009–2010, which were impacted by the 2009 H1N1 pandemic (Fig 1, S1 and S2 Figs). Retrospective forecasts were generated at the US national level and for HHS regions 1 to 9. Region 10 was excluded as influenza positivity data were only available after 2010.

Fig 1. National ILI rates and six contributing viral signals in different seasons.


The viral signal shown here is the positivity rate of each pathogen. ILI is attributed to a range of viruses with variable seasonal phasing. ILI peaks are largely driven by the circulating influenza strains.

Forecasting single respiratory pathogens

The trajectories of RSV, PIV12 and PIV3 are similar across seasons, whereas the activity of each influenza strain is highly variable (S1 Fig). As a result, we adopted two distinct approaches to forecast influenza and non-influenza pathogens. For influenza, we used process-based models that could flexibly simulate a variety of outbreak trajectories. A humidity-forced SIRS (susceptible-infected-recovered-susceptible) model was employed to depict the influenza transmission process and iteratively optimized using available observations. In particular, we fitted an ensemble of model simulations to positivity rate data from the beginning of the outbreak season up to the week of forecast [27–29] and then evolved the simulations into the future to generate forecasts. We also utilized a method to optimally perturb the fitted model to obtain a better calibrated ensemble forecast spread [30]. The spread of forecasts was calibrated to capture the uncertainty of near-term observations. Specifically, the initial conditions (S and I) of the fitted model were perturbed so that the difference between the spread of near-term model trajectories and observations is minimized (see details in S1 Text). In implementation, a 1,000-member ensemble forecast was generated each week.
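As an illustrative sketch only (not the operational implementation; the fitted parameter values and data assimilation scheme are detailed in S1 Text), the following Python code integrates a discrete-time humidity-forced SIRS model of the type used here. All parameter values shown are assumed placeholders.

```python
import numpy as np

def humidity_forced_sirs(S0, I0, N, q, R0max=3.0, R0min=1.2, D=2.5, L=3650.0):
    """Integrate a humidity-forced SIRS model with daily time steps.

    q: array of daily specific humidity (kg/kg). All parameter values
    (R0max, R0min, infectious period D, immunity duration L, forcing
    coefficient a) are illustrative placeholders, not fitted values.
    """
    a = -180.0                              # humidity forcing coefficient
    b = np.log(R0max - R0min)
    S, I = float(S0), float(I0)
    infected = []
    for qt in q:
        R0_t = np.exp(a * qt + b) + R0min   # daily reproduction number
        beta = R0_t / D                     # transmission rate
        new_inf = beta * I * S / N          # S -> I
        new_rec = I / D                     # I -> R
        new_sus = (N - S - I) / L           # R -> S (loss of immunity)
        S += new_sus - new_inf
        I += new_inf - new_rec
        infected.append(I)
    return np.array(infected)
```

In the actual system, an ensemble of such simulations is repeatedly adjusted toward the weekly positivity rate observations before being integrated forward to produce the forecast.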

For the non-influenza pathogens with consistent seasonality, we generated forecasts using a statistical approach that leverages the similarity among different seasons: the method of analogues [14]. Several observational studies indicate that both RSV and PIV have relatively regular seasonality in temperate regions [31,32]. In the US, RSV typically peaks in early February, and PIV3 and PIV12 usually peak in April–June and October–November, respectively [33,34]. The seasonality of RSV has been found to correlate with climate factors such as temperature and humidity [35,36]. However, the exact drivers of the seasonality of RSV and other respiratory viruses, and their comparative significance, are still under debate. For the method of analogues, we measured the distance between the time series of observations in the current season and those in other seasons in the same location, and produced forecasts as a weighted average of historical records based on those distances (see the sketch below). To generate probabilistic predictions, 1,000 time series were sampled from historical records with replacement according to their weights, and then calibrated to optimize the forecast spread. Details about the forecasting of influenza and non-influenza pathogens are presented in S1 Text.
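A minimal sketch of the method of analogues, assuming historical seasons are stored as equal-length numpy arrays; the exact distance measure and weighting used in this study are specified in S1 Text.

```python
import numpy as np

def analogue_forecast(current, history, n_samples=1000, bandwidth=1.0):
    """Sample full-season trajectories weighted by similarity to the
    current season (illustrative sketch of the method of analogues).

    current: positivity rates observed so far this season (length T).
    history: list of full historical-season arrays (e.g., length 52).
    """
    T = len(current)
    dists = np.array([np.linalg.norm(h[:T] - current) for h in history])
    weights = np.exp(-dists / bandwidth)     # closer seasons weigh more
    weights /= weights.sum()
    idx = np.random.choice(len(history), size=n_samples, p=weights)
    return np.array([history[i] for i in idx])
```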

Aggregating predictions of multiple pathogens

To generate ILI forecasts by aggregating predictions of the contributing viral signals, we estimated a multiplicative factor for each pathogen. Specifically, using the ILI data and positivity rates for the 6 viruses, we employed a Markov Chain Monte Carlo (MCMC) method to infer the multiplicative factors wi [37]. To ensure a broad probing of wi, we used a uniform prior distribution U(0,1) in the MCMC. Distributions of six multiplicative factors were obtained to maximize the log likelihood of observing the reported ILI data (see S1 Text for details). In implementation, the distributions of multiplicative factors were re-estimated each forecast week. In order to generate an ensemble of ILI forecast trajectories, 1,000 samples of multiplicative factors were randomly drawn from the posterior distributions, and then used to aggregate the previously obtained 1,000-member ensemble forecast for each pathogen. In the aggregation, multiplicative factors and forecast trajectories were randomly matched.
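For concreteness, a bare-bones random-walk Metropolis sampler for the multiplicative factors is sketched below; the Gaussian observation model, error scale and step size are simplifying assumptions made for illustration (the likelihood actually used is specified in S1 Text).

```python
import numpy as np

def mcmc_weights(ili, V, n_iter=50000, sigma=0.5, step=0.02):
    """Sample multiplicative factors w given ILI data (sketch).

    ili: observed weekly ILI rates (length T).
    V: T x 6 array of pathogen positivity rates.
    Uniform(0,1) priors on each w_i; Gaussian likelihood with an
    assumed error scale sigma.
    """
    def log_post(w):
        if np.any(w < 0) or np.any(w > 1):
            return -np.inf                  # enforce the U(0,1) prior
        resid = ili - V @ w
        return -0.5 * np.sum(resid**2) / sigma**2

    w = np.full(V.shape[1], 0.5)
    lp = log_post(w)
    samples = []
    for _ in range(n_iter):
        w_new = w + step * np.random.randn(len(w))  # random-walk proposal
        lp_new = log_post(w_new)
        if np.log(np.random.rand()) < lp_new - lp:  # Metropolis acceptance
            w, lp = w_new, lp_new
        samples.append(w.copy())
    return np.array(samples[n_iter // 2:])          # discard burn-in
```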

A post-processing procedure was applied to further improve the accuracy and precision of the ILI forecasts [6,38]. Systematic bias across seasons may exist in ILINet reported surveillance data, such as the drops in ILI often observed during the Christmas-New Year week. We accounted for this systematic bias by adding a component estimated from the residuals of ILI fits from other seasons (S3 Fig). After adjusting for the bias common to all seasons, real-time forecasts may still under- or overestimate the ILI trajectory for a particular prediction. We therefore included another adjustment to counteract the bias specific to each forecast [6,38], which was approximated as the discrepancy between the aggregated and observed ILI data in the latest week. Finally, the forecast ensemble was redistributed around the mean prediction to calibrate its spread. Implementation details about the MCMC and post-processing are provided in S1 Text.
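The three post-processing steps can be summarized in a short sketch; the spread-calibration factor alpha is an assumed free parameter here, whereas the study optimizes the calibration as described in S1 Text.

```python
import numpy as np

def post_process(ensemble, sys_bias, last_obs, week, alpha=1.0):
    """Apply the three post-processing adjustments (illustrative sketch).

    ensemble: n x T array of aggregated ILI forecast trajectories.
    sys_bias: length-T systematic residual estimated from other seasons.
    last_obs: latest observed ILI rate; week: its index in the season.
    """
    adj = ensemble + sys_bias                  # 1) systematic bias component
    fc_bias = last_obs - adj[:, week].mean()   # 2) forecast-specific bias
    adj = adj + fc_bias
    mean = adj.mean(axis=0)
    return mean + alpha * (adj - mean)         # 3) recalibrate the spread
```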

We acknowledge that ILI is attributable to more viruses than just the 6 pathogens included here; for instance, coronavirus, human metapneumovirus, respiratory adenovirus and rhinovirus may all produce symptoms consistent with an ILI diagnosis. However, without pathogen-specific data for those viruses, it is difficult to estimate the relative contribution of such neglected pathogens. For the 2010–2011 through 2013–2014 seasons, positivity rate data for three additional pathogens were available: human metapneumovirus (HMPV), respiratory adenovirus (rAD) and rhinovirus (Rhino). We therefore performed an additional analysis estimating the relative contributions of nine pathogens to ILI, and found that, at the national level, the three additional pathogens together accounted for 21%, 37%, 19% and 31% of the ILI signal in those four seasons, respectively (S4 Fig). This finding indicates that the principal six pathogens considered here comprise the majority of ILI. In aggregation, the contributions of neglected pathogens are partially represented as forecast discrepancy or absorbed by viruses with similar seasonality. In future studies, with the availability of more abundant pathogen-specific data, we will include more viruses in the multi-pathogen aggregation.

Evaluation of forecasts

The accuracy of probabilistic forecasts was measured using the “log score”, which is calculated as the logarithm (base e) of the fraction of forecasts assigned to an interval around the observed target (henceforth, score interval) [16–18]. We focus on four near-term targets (1- to 4-week ahead ILI) and three seasonal targets (peak week, peak intensity and onset week) [16–18]. In particular, peak week was defined as the week with the highest ILI rate in each season, and onset was defined as the first of three consecutive weeks with ILI at or above a predefined threshold. These thresholds for the US and all HHS regions were released by CDC prior to each season. For 1- to 4-week ahead ILI and peak intensity, the score interval was set as ±0.5% around the observed ILI rate; for peak week and onset week, it was set as ±1 week around the observation. A floor value of -10 was set as a lower bound for scores. Translating log score to forecast skill, a -0.5 (-1.0) log score implies a correct forecast approximately 61% (37%) of the time. Another measure, forecast error, calculated as the mean absolute error of the point (ensemble mean) predictions relative to observed targets, was used to quantify the accuracy of point predictions. These forecast metrics are consistent with CDC FluSight guidelines [16–18]. Note that, starting from the 2019–2020 season, an updated scoring rule using a single data bin was employed [39]. In this study, we used the previous definition of log score.
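For reference, the log score of a single target can be computed from an ensemble as in the sketch below (the function name and interface are ours, not CDC code).

```python
import numpy as np

def log_score(samples, observed, half_width=0.5, floor=-10.0):
    """Log score of an ensemble forecast for one target (sketch).

    samples: ensemble predictions of the target (e.g., peak intensity).
    half_width: score interval, e.g., 0.5 (% ILI) or 1.0 (weeks).
    """
    frac = np.mean(np.abs(np.asarray(samples) - observed) <= half_width)
    return floor if frac == 0 else max(np.log(frac), floor)
```

For example, if 607 of 1,000 ensemble members fall within ±0.5% of the observed peak intensity, the log score is log(0.607) ≈ -0.5.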

Results

ILI disaggregation

The six examined pathogens have distinct outbreak characteristics (Fig 1, S1 Fig). Influenza viruses typically peak during winter but exhibit large variations in peak week and peak intensity across seasons. This irregular behavior renders influenza forecasting a challenging task. In contrast, the activity of RSV, PIV12 and PIV3 is more regular: these viruses generally peak in January, November and May, respectively, with similar peak intensity across seasons. ILI epidemic trajectories are highly variable, with peak timing largely driven by the circulating influenza strain in each season. Scatter plots for each possible pair of pathogens (S2 Fig) show that, in general, influenza and RSV are positively correlated with each other, while PIV12 and PIV3 are negatively correlated with both influenza and RSV. In addition, PIV12 is negatively correlated with PIV3.

We represent the ILI signal in a given season s as a linear combination of the positivity rates of its contributing pathogens in the same season (S1 Text): ILI(t) = ∑i wi(t)vi(t), where vi(t) is the positivity rate of pathogen i at week t, and wi(t) is its corresponding multiplicative factor. Here we omit the season index s for notational convenience. We derived that the multiplicative factor wi(t) quantifies the ratio of the probability of undergoing a laboratory test among patients seeking medical attention for any reason to the probability of undergoing a laboratory test among patients seeking medical attention who are infected with pathogen i (S1 Text). As a result, the multiplicative factors for the six pathogens do not necessarily sum to 1. If a multiplicative factor wi(t) is stable over time, i.e., wi(t) = wi, we can predict ILI by aggregating forecasts for all contributing pathogens. To validate this assumption, we performed a linear regression of ILI with a constant multiplicative factor, wi, for each pathogen.

We fitted the positivity rate data of each virus to a mechanistic model (humidity-forced SIRS models for influenza A/H1, A/H3 and B; SIRS models for RSV, PIV12 and PIV3) and used the fitted curves to represent the actual unobserved positivity rates (Fig 2A). We then performed a linear regression of ILI on the fitted curves via MCMC in order to derive estimates for wi (Fig 2B). For comparison, we also fitted the ILI data to a humidity-forced SIRS model with ILI as a single pathogen.

Fig 2. Disaggregation of national ILI during the 2010–2011 season to multiple viral components.


Model fitting to the positivity rates of six respiratory pathogens (A). Time starts from the 40th week in 2010. The distribution of the multiplicative factor, wi, for each pathogen obtained from MCMC (B). The boxes and whiskers show the interquartile and 95% credible intervals. Fitting ILI as an amalgamation of six pathogens versus as a single agent (C).

The mechanistic models simulated the positivity rates of each pathogen well (Fig 2A), indicating that the process-based model is well specified for the transmission process of a single virus. This finding is also evident from the improved forecast accuracy for specific influenza types and subtypes reported in prior work [25]. The epidemic curve aggregated from the six pathogens closely matches the observed ILI data (Fig 2C). (Note that each component in the multi-pathogen aggregation must also fit the observed positivity rate of its pathogen.) In contrast, the curve fitted to ILI as a single pathogen deviates from observations during early and late weeks. These findings are corroborated across all seasons and regions (S1 Table, S5 Fig) and indicate that the multi-pathogen model with constant multiplicative factors is better specified for simulation of ILI. The distributions of multiplicative factors inferred for each season differ substantially, indicating that the contributions of the individual pathogens to overall ILI vary (S6 Fig). Within a season, the distributions of multiplicative factors also vary across regions; however, the relative magnitudes for influenza types/subtypes remain similar across regions (S7 and S8 Figs). Note that the fitting did not reproduce the decrease of ILI at week 14 during the Christmas-New Year holiday. This discrepancy is modeled during forecast post-processing as an independent component representing systematic bias.

The overall contributions of the six pathogens to ILI in 15 seasons at the national level are reported in Table 1. The estimated contributions are relatively stable for the non-influenza pathogens and influenza type B, but exhibit large year-to-year variability for influenza viruses A/H1 and A/H3: the standard deviations for A/H1, A/H3, B, RSV, PIV12 and PIV3 are 13.4%, 20.2%, 6.8%, 5.7%, 8.1% and 2.8%, respectively. The variance of the contribution from influenza B is similar to that of the non-influenza pathogens, although the seasonality of influenza B is more irregular. This highlights the variable prevalence of influenza type A across seasons owing to antigenic drift and immune escape. Each season also tends to have only one dominant influenza type A virus (A/H1 or A/H3), leaving the other subtype circulating with much lower activity.

Table 1. Overall contributions of the six pathogens to national ILI in 15 seasons.

The contribution of pathogen i is calculated as ∑_t wivi(t) / ∑_i ∑_t wivi(t), where vi(t) is the observed positivity rate of pathogen i at week t, wi is the inferred mean multiplicative factor for pathogen i, the sum over t runs across the 52 weeks of the season, and the sum over i runs across the six pathogens. The inferred dominant pathogen in each season is marked with an asterisk.

Season  A/H1     A/H3     B        RSV      PIV12    PIV3
97–98   0.00%    43.10%*  0.50%    31.20%   17.70%   7.50%
98–99   0.30%    35.40%*  12.80%   28.80%   15.70%   7.00%
99–00   1.90%    60.40%*  0.50%    18.60%   8.80%    9.80%
00–01   30.00%*  0.20%    15.10%   21.40%   24.40%   8.90%
01–02   0.50%    34.40%*  8.50%    26.40%   24.40%   5.80%
02–03   8.50%    2.60%    14.90%   31.20%*  29.60%   13.20%
03–04   0.00%    65.80%*  1.60%    19.10%   2.20%    11.30%
04–05   0.10%    35.80%*  14.50%   15.80%   21.00%   12.80%
05–06   3.70%    31.30%*  12.70%   22.00%   18.60%   11.70%
06–07   27.60%*  5.40%    9.20%    25.60%   15.70%   16.50%
07–08   12.20%   34.10%*  12.80%   19.70%   12.20%   9.00%
10–11   18.10%   15.60%   20.20%*  15.00%   18.40%   12.70%
11–12   5.50%    18.10%   8.30%    28.90%*  28.40%   10.80%
12–13   3.60%    39.10%*  25.30%   15.50%   6.50%    10.00%
13–14   42.80%*  5.10%    15.80%   17.20%   10.50%   8.60%
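Following the formula in the Table 1 caption, the seasonal contributions can be computed directly from the posterior-mean factors and the observed positivity rates; a minimal sketch:

```python
import numpy as np

def season_contributions(w_mean, V):
    """Overall contribution of each pathogen to seasonal ILI (sketch).

    w_mean: length-6 array of posterior-mean multiplicative factors.
    V: 52 x 6 array of observed weekly positivity rates for one season.
    """
    weighted = V * w_mean             # wi * vi(t) for each week and pathogen
    totals = weighted.sum(axis=0)     # sum over the 52 weeks
    return totals / totals.sum()      # normalize across the six pathogens
```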

The different seasonality of influenza relative to the other examined respiratory pathogens can be attributed to a number of factors. For instance, influenza transmission dynamics depend on pre-exposure to influenza viruses and antigenic drift, which modulate rates of susceptibility [40–42]. RSV mostly infects infants or young children whose maternal immunity has waned [43]. The complex interactions between different influenza strains and other respiratory viruses may also contribute to the variations across seasons [44–46]. In addition, a recent study reported that the virology and host response to RSV and influenza are distinct: while influenza induces robust strain-specific immunity after infection, strain-specific protection for RSV is incomplete [47].

Retrospective forecasting of ILI

Predictions for influenza A/H1, A/H3 and B generated by process-based models capture observed trajectories well (see examples in S9 Fig). For the non-influenza pathogens, RSV, PIV12 and PIV3, forecasts generated using the method of analogues reliably predict the positivity rates as well (S9 Fig). We also find that the prediction intervals generated for near-term targets are well-calibrated with the spread of observations (S10 Fig), i.e., 50% of observed outcomes fall within the interquartile prediction interval; 95% of observed outcomes fall within the 95% prediction interval; etc.

Fig 3 shows an example multi-pathogen forecast generated at week 6 of the 2010–2011 season at the national level. The aggregated forecast for ILI, improved by post-processing that adjusts systematic bias, forecast-specific bias and forecast spread (Fig 3A), agrees well with the observed data. The adjustment for systematic bias reproduces the decrease of ILI in week 14; correcting forecast-specific bias reduces the discrepancy between predicted and observed ILI; and the calibration constrains the prediction intervals more tightly around the observed ILI curve. Near-term targets in the next one to four weeks fall within the 95% prediction intervals. The distributions of multiplicative factors estimated at week 6 (Fig 3B) are similar to those obtained from linear regression using data for the entire season (Fig 2B), but with a broader spread indicating a higher level of uncertainty. In comparison, the forecast obtained using a method that simulates ILI as a single pathogen can generally predict the trend of ILI but is less precise (Fig 3C). More forecast examples at different outbreak phases indicate that the aggregation of forecasts can predict future ILI rates well throughout an outbreak season (S11, S12, S13 and S14 Figs).

Fig 3. Forecasts generated for national ILI during the 2010–2011 season by aggregating multiple viral components.


Forecast of ILI generated by aggregating the predictions for the six viruses using MCMC-derived multiplicative factors and the effect of post-processing (A). Forecasts were generated at week 6 in the 2010–2011 season. The grey line shows the mean trajectory and the shaded area indicates the 95% prediction interval. The distributions of the inferred multiplicative factors for the six pathogens at week 6 are reported in (B). The boxes and whiskers show the interquartile and 95% credible intervals. The baseline forecast treating ILI as a single pathogen is shown in (C).

To evaluate the proposed ILI forecasting method, we performed retrospective forecasts for ILI in the US and HHS regions 1 to 9 during 15 seasons, using both the multi-pathogen approach and a baseline method that models ILI as a single agent [3]. To avoid over-fitting and to make full use of surveillance data, we used “leave-one-out” cross-validation. That is, we trained the forecast algorithm using data from the periods outside the forecast season, so that the prediction is not contaminated by observations from the same outbreak. Within each season, weekly forecasts from week 4 to week 35 (i.e., late October to late May) were generated. Comparisons of log scores for the seven targets, averaged over forecasts in different seasons and locations, indicate that the aggregated forecasts have significantly better log scores than the baseline forecasts (Wilcoxon signed-rank test, p < 10⁻⁵; see details in S1 Text, Fig 4A, S15 Fig; a sketch of the test appears below). Without post-processing, the aggregated forecasts already outperform the baseline; each of the post-processing procedures leads to additional improvement of the aggregated forecasts (S16 Fig). Specifically, improvement of the seasonal targets is primarily attributable to the forecast aggregation, whereas the adjustment for forecast-specific bias substantially improves near-term targets. The benefit of bias-correction post-processing was recently demonstrated for ILI forecasting [6]. Combining both forecast aggregation and post-processing, the log scores for the seven targets (1-, 2-, 3- and 4-week ahead ILI, peak week, peak intensity, and onset week) are improved by 0.38, 0.36, 0.33, 0.32, 0.18, 0.27 and 0.15, respectively. Comparisons based on forecast error for point predictions (Fig 4B and 4C) also demonstrate superior performance of the multi-pathogen forecast for most targets (except 4-week ahead prediction); the forecast errors for 1-, 2- and 3-week ahead ILI, peak week, peak intensity, and onset week are reduced by 31.9%, 16.8%, 3.3%, 22.9%, 16.8% and 56.4%, respectively. By both evaluation metrics, the improvement for near-term targets gradually decreases at longer forecast horizons.
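The significance test pairs the two systems' log scores on identical forecast weeks and locations; a minimal sketch with hypothetical values:

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical paired log scores (same target, week and location) for the
# multi-pathogen system and the baseline; illustration only.
multi = np.array([-0.8, -0.5, -1.2, -0.3, -0.9, -0.6])
baseline = np.array([-1.1, -0.9, -1.5, -0.4, -1.3, -0.8])
stat, p = wilcoxon(multi, baseline, alternative="greater")
print(f"Wilcoxon statistic = {stat}, p = {p:.3f}")
```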

Fig 4. Comparison of forecast performance with the baseline method.


Comparisons of log scores for forecasts generated by aggregating multiple pathogens and the baseline method that simulates ILI as a single agent (A). Log scores are obtained from weekly retrospective forecasts for ILI at the national level and in HHS regions 1 to 9, from the 1997–1998 to 2013–2014 seasons, excluding the pandemic seasons 2008–2009 and 2009–2010. Four near-term targets (1- to 4-week ahead ILI) and three seasonal targets (peak week, peak intensity and onset week, denoted by Pw, Pi and Onset, respectively) are shown. Comparison of forecast error for predictions of 1- to 4-week ahead ILI and peak intensity (B). Comparison of forecast error for predictions of peak week and onset week (C). Asterisks indicate statistical significance for multi-pathogen forecasts outperforming the baseline (p < 10⁻⁵, Wilcoxon signed-rank test).

For probabilistic forecasts, the prediction intervals generated by the forecasting system should be well calibrated with the spread of observations (i.e., 50% of observed outcomes fall within the interquartile prediction interval, 95% within the 95% prediction interval, etc.). We evaluated this property using reliability plots: we calculated the fraction of observed targets falling within the 25%, 50%, 75% and 100% prediction intervals, and display the relationship between the observed fractions and prediction intervals in Fig 5. A well-calibrated forecast yields data points that lie on the diagonal y = x line. The results in Fig 5 indicate that calibration for near-term targets and peak intensity is substantially improved by the aggregated forecasting approach. In contrast, the baseline method tends to generate overly narrow prediction intervals (data points lie below the diagonal line). Indeed, the average width of the 95% prediction intervals generated by the baseline is narrower than that generated by the aggregation method (S17 Fig).
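The reliability points can be computed as follows (a sketch, under the assumption that forecasts are stored as ensembles of target samples):

```python
import numpy as np

def reliability_points(ensembles, observations, levels=(0.25, 0.5, 0.75, 1.0)):
    """Fraction of observations inside central prediction intervals (sketch).

    ensembles: list of 1-D sample arrays, one per forecast.
    observations: matching list of observed target values.
    """
    points = []
    for level in levels:
        lo_q, hi_q = 50 * (1 - level), 50 * (1 + level)  # central interval
        inside = [
            np.percentile(e, lo_q) <= obs <= np.percentile(e, hi_q)
            for e, obs in zip(ensembles, observations)
        ]
        points.append((level, np.mean(inside)))          # plot against y = x
    return points
```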

Fig 5. Reliability plots for seven ILI forecast targets.


Reliability plots for 1- to 4-week ahead ILI, peak week, peak intensity, and onset week are compared for multi-pathogen and single-pathogen forecasts. Data points show the fraction of observed targets falling within the 25%, 50%, 75% and 100% prediction intervals. Results are obtained from retrospective forecasts during 15 seasons at the US national level and for HHS regions 1 to 9.

We also found that using the systematic bias correction procedure alone with the baseline method does not provide clear improvement of its performance (S18 Fig). We additionally implemented a version of the multi-pathogen forecasting in which the component forecasts for influenza were generated using the method of analogues. We found that using the method of analogues for influenza could also support improved forecasting of ILI over the baseline single-pathogen method (S19 Fig). However, due to the large variations of influenza activity across different seasons, the improvement using the method of analogues is less than that obtained using process-based models.

To examine the marginal benefit of including each pathogen, we performed an analysis in which each pathogen in turn was omitted from the multi-pathogen forecasting, and quantified the resulting degradation of forecast accuracy (S20 Fig). We found that removing any one of the six pathogens degrades the log score, and that omitting A/H3 leads to the largest degradation. This finding agrees with the estimated relative contributions to ILI reported in Table 1: A/H3 is the dominant signal in 9 of the 15 examined seasons.

Relative contribution of each pathogen

An additional benefit of the aggregation forecasting method is its inference and forecast of the relative contribution of each pathogen to ILI. Specifically, we computed the relative contribution of virus i at week t as wivi(t) / ∑_i wivi(t), where wi is the multiplicative factor for virus i estimated from available data and the sum runs over the six examined viruses. Here we only consider the relative activity among these six viruses. Should information about other ILI-related pathogens become available, it could be readily incorporated into the framework. Fig 6 shows an example forecast of the relative contributions of the six viruses. Based on the seasonality of each pathogen, it is straightforward to reason that PIV12 and PIV3 would play dominant roles at the beginning and end of the respiratory virus season. However, the exact relative contribution depends on the evolving activities of all six pathogens. Reported proportions of ILI attributable to different respiratory pathogens vary widely depending on cohort and study location [48–52]; for reference, see Table 1 in Reis & Shaman [21]. The multi-pathogen forecast system, capable of generating accurate predictions for each virus, predicts these temporally varying relative contributions. In this example, during the period of peak ILI, influenza strains account for the majority of ILI attributed to the six examined viruses. This highlights the significant role influenza plays in shaping ILI seasonality and explains the large variation of ILI curves among different seasons.

Fig 6. Forecasting the relative contribution of each pathogen to ILI.


Forecasts were generated for national data at week 15 during the 2010–2011 season. The relative contribution before week 15 was inferred using available data. After week 15, it is obtained by normalizing the weighted predicted signal of each pathogen. Note that we only considered the relative contribution among the six examined viruses. Contributions from other pathogens were not included.

Discussion

In this study, we demonstrate that a process-based model aggregating multiple ILI-causing pathogens can better describe overall ILI activity and support more accurate and precise forecasting. Our approach is built on the premise that ILI is an amalgamation of many infectious agents. During the 2019–2020 flu season, we submitted the multi-pathogen ILI forecasts to the FluSight initiative; performance was competitive, with potential for further improvement. In addition to improved forecast accuracy for ILI, the system enables the estimation and forecast of the relative contribution of each virus to ILI. In clinical settings, this information may be helpful for determining the possible cause of ILI among presenting patients.

Infectious disease forecasting is increasingly used to support public health decision-making [53,54] and has been particularly important during the ongoing COVID-19 pandemic. For intervention planning, forecasting the ILI signal alone may not generate sufficiently specific information. Even for typical wintertime cold and flu seasons, pathogen-specific forecasts could provide guidance on more targeted intervention measures. For instance, in the 2019–2020 flu season, type B influenza appeared earlier and was abnormally high in several southern states (e.g., Louisiana) [2]. Anticipating an early surge of influenza B patients can support better preparedness for this specific pathogen in healthcare systems.

Our findings indicate that influenza type A viruses are the predominant determinant of ILI seasonality. As a result, monitoring the activity of influenza type A viruses should be prioritized for accurate surveillance and forecasting of ILI. We further demonstrate that, through the inclusion of more realistic processes and virologic surveillance data, improved ILI forecasts can be generated to better support decision-making and intervention. Further improvement of process-based ILI forecasting will require additional representation of other information and processes (e.g., age groups, contact mixing and antigenic drift) and assimilation of more comprehensive datasets.

In the multi-pathogen model, only six pathogens were considered. This neglect of other pathogens may cause overestimation of the relative contributions of these viruses. In particular, the contributions of neglected pathogens are either included in the forecast discrepancy, or absorbed by viruses with similar seasonality. Considering this caveat, the contributions of PIV12 and PIV3 at the beginning and end of seasons are likely to be overestimated, possibly due to pathogens such as rhinovirus that circulate at the same time of year. Pathogens that peak in winter, e.g., coronavirus, may disturb the estimation during peak weeks. In addition, co-infection was not considered in the framework, which may also lead to model misspecification. However, co-infection of respiratory viruses is rare based on a recent study that actively tested 2,685 participants recruited at a New York City tourist attraction [55]. As data for more pathogens are made available, these issues may be corrected through incorporation into this forecasting framework.

For this study, information on test sensitivity/specificity was not available. Such details could be used to inform constraints on the observation error variance (OEV) of the positivity rate data. For instance, test results with high sensitivity and specificity would be assigned low OEV, which would give more credibility to these observations and impact the final estimated multiplicative factors. In operational forecasting, ILI data are subject to revisions for several weeks after the initial release. In this study, we did not consider this backfill; however, the impact of backfill should be limited, as the aggregation procedure uses all available data points, while backfill mostly affects only the most recent 2 or 3 weekly data points. In practice, this backfill issue could be partly alleviated using nowcast techniques [56]. In future studies, the proposed method could also be used to disaggregate ILI or ILI-related hospitalizations by age groups, or to aggregate forecasts across different geographical scales (e.g., from state level to regional level).

Supporting information

S1 Text. Supplementary materials.

(DOCX)

S1 Table. Residuals of multi-pathogen and single-pathogen model fittings to ILI.

We measure the residual of the model fitting to ILI as the sum of absolute discrepancies over all weeks in each season and location. In each cell, the first number is the residual of the multi-pathogen model fitting and the second is from the single-pathogen model fitting. The smaller fitting residuals are in bold. In general, the multi-pathogen model fittings have smaller residuals in most seasons and locations.

(DOCX)

S1 Fig. Positivity rates of six pathogens in 15 seasons at national level.

We plot the positivity rates from laboratory tests at the national level for influenza A/H1, A/H3, B, RSV, PIV12 and PIV3 for the 1997–1998 to 2013–2014 seasons, excluding the 2008–2009 and 2009–2010 seasons. While influenza activities have large variations across seasons, non-influenza pathogens are more regular.

(EPS)

S2 Fig. Scatter plot of each possible pair of observations.

We report the correlation coefficient of positivity rates between each pair of pathogens.

(EPS)

S3 Fig. Residuals of multi-pathogen fit to ILI in the US and HHS regions 1 to 9.

Dashed lines are residuals obtained in 15 seasons, and solid lines are the averaged curves. The mean residuals were used as the systematic bias, assumed the same across seasons, in post-processing. Note that in most regions the decrease of ILI during the Christmas-New Year holiday is captured by the average residuals.

(EPS)

S4 Fig. Estimated relative contributions of nine respiratory pathogens at the national level.

We estimated the relative contribution of each pathogen for the 2010–2011 through 2013–2014 seasons, when positivity rate data for rAD, HMPV and Rhino were also available.

(EPS)

S5 Fig. Comparison of multi-pathogen and single-pathogen model fittings to ILI.

Fitting was performed for national data for the 1997–1998 to 2013–2014 seasons, excluding the 2008–2009 and 2009–2010 seasons. The regression curves of multiple viral signals show less discrepancy from observations than the single-pathogen fitting.

(EPS)

S6 Fig. Estimated distributions of multiplicative factors in different seasons at the national level.

Multiplicative factors vary considerably over time, indicating that contributions from the examined six viruses to ILI are variable in different seasons. The prior range for wi is set as [0, 1].

(EPS)

S7 Fig. Estimated multiplicative factors of six respiratory pathogens in the 2011–2012 season.

The prior range for wi is set as [0, 1].

(EPS)

S8 Fig. Estimated multiplicative factors of six respiratory pathogens in the 2013–2014 season.

The prior range for wi is set as [0, 1].

(EPS)

S9 Fig. Examples of forecasts for the six viruses.

Forecasts are shown for national positivity rates during the 2006–2007 season at week 10. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

(EPS)

S10 Fig. Reliability plots for the forecasts of individual pathogens.

We display the reliability plots for 4 near-term targets: 1- to 4-week ahead ILI, denoted by X1 to X4, respectively. Data points show the fraction of observed targets falling within the 25%, 50%, 75% and 100% prediction intervals. For a well-calibrated forecast, data points will lie on the diagonal line y = x. Results are shown for the forecasts generated at the national level for all 15 seasons.

(EPS)

S11 Fig. Examples of ILI forecasts at different phases of the 2010–2011 outbreak.

Forecasts are shown for national ILI during the 2010–2011 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

(EPS)

S12 Fig. Examples of ILI forecasts at different phases of the 2011–2012 outbreak.

Forecasts are shown for national ILI during the 2011–2012 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

(EPS)

S13 Fig. Examples of ILI forecasts at different phases of the 2012–2013 outbreak.

Forecasts are shown for national ILI during the 2012–2013 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

(EPS)

S14 Fig. Examples of ILI forecasts at different phases of the 2013–2014 outbreak.

Forecasts are shown for national ILI during the 2013–2014 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

(EPS)

S15 Fig. Comparison of log score and forecast error at different forecast weeks.

We compare the log score (left) and forecast error (right) for retrospective forecasts generated by aggregating predictions of multiple pathogens and a baseline method that simulates ILI as a single pathogen. Results are obtained by averaging the score or error for forecasts generated from week 4 to week 35 during 15 seasons at the US national level and for HHS regions 1 to 9.

(EPS)

S16 Fig. Effect of post-processing on forecasts.

We compare the log scores (A) and forecast errors (B-C) for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). Forecasts obtained by direct aggregation and additional post-processing procedures (adjusting systematic bias, adjusting current bias and calibration) are compared with the baseline.

(EPS)

S17 Fig. Comparison of the average width of the 95% prediction interval.

Results are obtained from retrospective forecasts during 15 seasons at the US national level and for HHS regions 1 to 9.

(EPS)

S18 Fig. Effect of systematic bias correction on baseline forecast log score.

We compare the log scores for seven targets averaged over weekly predictions during 15 seasons and 10 locations (national and 9 HHS regions). Forecasts obtained using the baseline with systematic bias correction (Baseline+systematic bias correction) are compared with the baseline forecast method (Baseline).

(EPS)

S19 Fig. Forecasting influenza using the method of analogues in multi-pathogen forecasting.

We compare the log scores for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). Forecasts were obtained using the multi-pathogen method in which influenza types/subtypes were predicted using dynamical models (Dynamic), the multi-pathogen method in which influenza types/subtypes were predicted using the method of analogues (Analogues) and the baseline single-pathogen method (Baseline).

(EPS)

S20 Fig. Degradations of forecasts if each pathogen is removed from the multi-pathogen forecasting system.

We compare the log scores for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). We removed each pathogen in turn in the multi-pathogen forecasting system and compared the degradation of log score.

(EPS)

Data Availability

Data and code are available at GitHub https://github.com/SenPei-CU/Multi-Pathogen_ILI_Forecast.

Funding Statement

This work was supported by US National Institutes of Health grant GM110748 and Defense Advanced Research Projects Agency contract W911NF-16-2-0035. The funders had no role in the study design, analysis, decision to publish, or preparation of the manuscript.

References

1. Lozano R, Naghavi M, Foreman K, Lim S, Shibuya K, Aboyans V, et al. Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010. Lancet. 2012; 380: 2095–2128. doi: 10.1016/S0140-6736(12)61728-0
2. U.S. Centers for Disease Control and Prevention. https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html (accessed Jun 15, 2019).
3. Shaman J, Karspeck A. Forecasting seasonal outbreaks of influenza. Proc Natl Acad Sci U S A. 2012; 109: 20425–20430. doi: 10.1073/pnas.1208772109
4. Ben-Nun M, Riley P, Turtle J, Bacon DP, Riley S. Forecasting national and regional influenza-like illness for the USA. PLoS Comput Biol. 2019; 15: e1007013. doi: 10.1371/journal.pcbi.1007013
5. Osthus D, Hickmann KS, Caragea PC, Higdon D, Del Valle SY. Forecasting seasonal influenza with a state-space SIR model. Ann Appl Stat. 2017; 11: 202–224. doi: 10.1214/16-AOAS1000
6. Osthus D, Gattiker J, Priedhorsky R, Del Valle SY. Dynamic Bayesian influenza forecasting in the United States with hierarchical discrepancy. Bayesian Anal. 2019; 14: 261–312.
7. Kandula S, Yamana T, Pei S, Yang W, Morita H, Shaman J. Evaluation of mechanistic and statistical methods in forecasting influenza-like illness. J Royal Soc Interface. 2018; 15: 20180174. doi: 10.1098/rsif.2018.0174
8. Pei S, Kandula S, Yang W, Shaman J. Forecasting the spatial transmission of influenza in the United States. Proc Natl Acad Sci U S A. 2018; 115: 2752–2757. doi: 10.1073/pnas.1708856115
9. Brooks LC, Farrow DC, Hyun S, Tibshirani RJ, Rosenfeld R. Flexible modeling of epidemics with an empirical Bayes framework. PLoS Comput Biol. 2015; 11: e1004382. doi: 10.1371/journal.pcbi.1004382
10. Reich NG, Brooks LC, Fox SJ, et al. A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States. Proc Natl Acad Sci U S A. 2019; 116: 3146–3154. doi: 10.1073/pnas.1812594116
11. Ray EL, Reich NG. Prediction of infectious disease epidemics via weighted density ensembles. PLoS Comput Biol. 2018; 14: e1005910. doi: 10.1371/journal.pcbi.1005910
12. Farrow DC, Brooks LC, Hyun S, Tibshirani RJ, Burke DS, Rosenfeld R. A human judgment approach to epidemiological forecasting. PLoS Comput Biol. 2017; 13: e1005248. doi: 10.1371/journal.pcbi.1005248
13. Pei S, Shaman J. Counteracting structural errors in ensemble forecast of influenza outbreaks. Nat Commun. 2017; 8: 925. doi: 10.1038/s41467-017-01033-1
14. Viboud C, Boëlle PY, Carrat F, Valleron AJ, Flahault A. Prediction of the spread of influenza epidemics by the method of analogues. Am J Epidemiol. 2003; 158: 996–1006. doi: 10.1093/aje/kwg239
15. U.S. Department of Health and Human Services. FluSight: Seasonal Influenza Forecasting. Epidemic Prediction Initiative, www.predict.phiresearchlab.org/ (accessed Jun 15, 2019).
16. Biggerstaff M, Alper D, Dredze M, Fox S, Fung IC, Hickmann KS, et al. Results from the Centers for Disease Control and Prevention's Predict the 2013–2014 Influenza Season Challenge. BMC Infect Dis. 2016; 16: 357. doi: 10.1186/s12879-016-1669-x
17. Biggerstaff M, Johansson M, Alper D, Brooks LC, Chakraborty P, Farrow DC, et al. Results from the second year of a collaborative effort to forecast influenza seasons in the United States. Epidemics. 2018; 24: 26–33. doi: 10.1016/j.epidem.2018.02.003
18. McGowan CJ, Biggerstaff M, Johansson M, Apfeldorf KM, Ben-Nun M, Brooks L, et al. Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016. Sci Rep. 2019; 9: 683. doi: 10.1038/s41598-018-36361-9
19. Pavia AT. Viral infections of the lower respiratory tract: old viruses, new viruses, and the role of diagnosis. Clin Infect Dis. 2011; 52(suppl_4): S284–S289. doi: 10.1093/cid/cir043
20. Jansen RR, Wieringa J, Koekkoek SM, Visser CE, Pajkrt D, Molenkamp R, et al. Frequent detection of respiratory viruses without symptoms: toward defining clinically relevant cutoff values. J Clin Microbiol. 2011; 49: 2631–2636. doi: 10.1128/JCM.02094-10
21. Reis J, Shaman J. Simulation of four respiratory viruses and inference of epidemiological parameters. Infect Dis Model. 2018; 3: 23–34. doi: 10.1016/j.idm.2018.03.006
22. Levy N, Iv M, Yom-Tov E. Modeling influenza-like illnesses through composite compartmental models. Physica A. 2018; 494: 288–293.
23. The National Respiratory and Enteric Virus Surveillance System (NREVSS). https://www.cdc.gov/surveillance/nrevss/index.html (accessed Jun 15, 2019).
24. Goldstein E, Viboud C, Charu V, Lipsitch M. Improving the estimation of influenza-related mortality over a seasonal baseline. Epidemiology. 2012; 23: 829–838.
25. Kandula S, Yang W, Shaman J. Type- and subtype-specific influenza forecast. Am J Epidemiol. 2017; 185: 395–402. doi: 10.1093/aje/kww211
26. Turtle J, Riley P, Ben-Nun M, Riley S. Accurate influenza forecasts using type-specific incidence data for small geographical units. medRxiv. 2019; 19012807. doi: 10.1101/19012807
27. Li R, Pei S, Chen B, Song Y, Zhang T, Yang W, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2). Science. 2020; 368: 489–493. doi: 10.1126/science.abb3221
28. Pei S, Morone F, Liljeros F, Makse H, Shaman J. Inference and control of the nosocomial transmission of methicillin-resistant Staphylococcus aureus. eLife. 2018; 7: e40977. doi: 10.7554/eLife.40977
29. Ionides EL, Bretó C, King AA. Inference for nonlinear dynamical systems. Proc Natl Acad Sci U S A. 2006; 103: 18438–18443. doi: 10.1073/pnas.0603181103
30. Pei S, Cane MA, Shaman J. Predictability in process-based ensemble forecast of influenza. PLoS Comput Biol. 2019; 15: e1006783. doi: 10.1371/journal.pcbi.1006783
31. Bloom-Feshbach K, Alonso WJ, Charu V, Tamerius J, Simonsen L, Miller MA, et al. Latitudinal variations in seasonal activity of influenza and respiratory syncytial virus (RSV): a global comparative review. PLoS ONE. 2013; 8: e54445. doi: 10.1371/journal.pone.0054445
32. Obando-Pacheco P, Justicia-Grande AJ, Rivero-Calle I, Rodríguez-Tenreiro C, Sly P, Ramilo O, et al. Respiratory syncytial virus seasonality: a global overview. J Infect Dis. 2018; 217: 1356–1364. doi: 10.1093/infdis/jiy056
33. Rose EB, Wheatley A, Langley G, Gerber S, Haynes A. Respiratory syncytial virus seasonality—United States, 2014–2017. Morb Mortal Wkly Rep. 2018; 67: 71–76. doi: 10.15585/mmwr.mm6702a4
34. Fry AM, Curns AT, Harbour K, Hutwagner L, Holman RC, Anderson LJ. Seasonal trends of human parainfluenza viral infections: United States, 1990–2004. Clin Infect Dis. 2006; 43: 1016–1022.
35. Tang JW, Loh TP. Correlations between climate factors and incidence—a contributor to RSV seasonality. Rev Med Virol. 2014; 24: 15–34. doi: 10.1002/rmv.1771
36. Pitzer VE, Viboud C, Alonso WJ, Wilcox T, Metcalf CJ, Steiner CA, et al. Environmental drivers of the spatiotemporal dynamics of respiratory syncytial virus in the United States. PLoS Pathog. 2015; 11: e1004591. doi: 10.1371/journal.ppat.1004591
37. Gelman A, Stern HS, Carlin JB, Dunson DB, Vehtari A, Rubin DB. Bayesian data analysis. Boca Raton, FL: Chapman and Hall/CRC; 2013.
38. Linzer DA. Dynamic Bayesian forecasting of presidential elections in the states. J Am Stat Assoc. 2013; 108: 124–134.
39. Bracher J. On the multibin logarithmic score used in the FluSight competitions. Proc Natl Acad Sci U S A. 2019; 116: 20809–20810. doi: 10.1073/pnas.1912147116
40. Gostic KM, Ambrose M, Worobey M, Lloyd-Smith JO. Potent protection against H5N1 and H7N9 influenza via childhood hemagglutinin imprinting. Science. 2016; 354: 722–726. doi: 10.1126/science.aag1322
41. Carrat F, Flahault A. Influenza vaccine: the challenge of antigenic drift. Vaccine. 2007; 25: 6852–6862. doi: 10.1016/j.vaccine.2007.07.027
42. Bedford T, Riley S, Barr IG, Broor S, Chadha M, Cox NJ, et al. Global circulation patterns of seasonal influenza viruses vary with antigenic drift. Nature. 2015; 523: 217–220. doi: 10.1038/nature14460
43. Fuentes S, Coyle EM, Beeler J, Golding H, Khurana S. Antigenic fingerprinting following primary RSV infection in young children identifies novel antigenic sites and reveals unlinked evolution of human antibody repertoires to fusion and attachment glycoproteins. PLoS Pathog. 2016; 12: e1005554. doi: 10.1371/journal.ppat.1005554
44. Suzuki A, Mizumoto K, Akhmetzhanov AR, Nishiura H. Interaction among influenza viruses A/H1N1, A/H3N2, and B in Japan. Int J Environ Res Public Health. 2019; 16: 4179.
45. Nickbakhsh S, Mair C, Matthews L, Reeve R, Johnson PC, Thorburn F, et al. Virus–virus interactions impact the population dynamics of influenza and the common cold. Proc Natl Acad Sci U S A. 2019; 116: 27142–27150. doi: 10.1073/pnas.1911083116
46. Opatowski L, Baguelin M, Eggo RM. Influenza interaction with cocirculating pathogens and its impact on surveillance, pathogenesis, and epidemic profile: a key role for mathematical modelling. PLoS Pathog. 2018; 14: e1006770. doi: 10.1371/journal.ppat.1006770
47. Ascough S, Paterson S, Chiu C. Induction and subversion of human protective immunity: contrasting influenza and respiratory syncytial virus. Front Immunol. 2018; 9: 323. doi: 10.3389/fimmu.2018.00323
48. Juvén TA, Mertsola J, Waris M, Leinonen M, Meurman O, Roivainen M, et al. Etiology of community-acquired pneumonia in 254 hospitalized children. Pediatr Infect Dis J. 2000; 19: 293–298. doi: 10.1097/00006454-200004000-00006
49. Jansen RR, Wieringa J, Koekkoek SM, Visser CE, Pajkrt D, Molenkamp R, et al. Frequent detection of respiratory viruses without symptoms: toward defining clinically relevant cutoff values. J Clin Microbiol. 2011; 49: 2631–2636. doi: 10.1128/JCM.02094-10
50. Munywoki PK, Koech DC, Agoti CN, Kibirige N, Kipkoech J, Cane PA, et al. Influence of age, severity of infection, and co-infection on the duration of respiratory syncytial virus (RSV) shedding. Epidemiol Infect. 2015; 143: 804–812. doi: 10.1017/S0950268814001393
51. Li H, Wei Q, Tan A, Wang L. Epidemiological analysis of respiratory viral etiology for influenza-like illness during 2010 in Zhuhai, China. Virol J. 2013; 10: 143. doi: 10.1186/1743-422X-10-143
52. Kusel MM, de Klerk NH, Kebadze T, Vohma V, Holt PG, Johnston SL, et al. Early-life respiratory viral infections, atopic sensitization, and risk of subsequent development of persistent asthma. J Allergy Clin Immunol. 2007; 119: 1105–1110. doi: 10.1016/j.jaci.2006.12.669
53. Lutz CS, Huynh MP, Schroeder M, Anyatonwu S, Dahlgren FS, Danyluk G, et al. Applying infectious disease forecasting to public health: a path forward using influenza forecasting examples. BMC Public Health. 2019; 19: 1659. doi: 10.1186/s12889-019-7966-8
54. George DB, Taylor W, Shaman J, Rivers C, Paul B, O'Toole T, et al. Technology to advance infectious disease forecasting for outbreak management. Nat Commun. 2019; 10: 1–4. doi: 10.1038/s41467-018-07882-8
55. Birger R, Morita H, Comito D, Filip I, Galanti M, Lane B, et al. Asymptomatic shedding of respiratory virus among an ambulatory population across seasons. mSphere. 2018; 3: e00249-18. doi: 10.1128/mSphere.00249-18
56. Kandula S, Hsu D, Shaman J. Subregional nowcasts of seasonal influenza using search trends. J Med Internet Res. 2017; 19: e370. doi: 10.2196/jmir.7486
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008301.r001

Decision Letter 0

Rob J De Boer, Sara Y Del Valle

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

4 Mar 2020

Dear Dr. Pei,

Thank you very much for submitting your manuscript "Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like-illness" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. Note that the reviewers appreciated the attention to an important problem, but raised some substantial concerns about the manuscript as it currently stands. While your manuscript cannot be accepted in its present form, we are willing to consider a revised version in which issues raised by the reviewers have been adequately addressed. We cannot, of course, promise publication at that time.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Sara Y Del Valle

Guest Editor

PLOS Computational Biology

Rob De Boer

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: Review of “Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like-illness”

The authors present a well-motivated paper. ILI forecasting has garnered significant public health attention in recent years. ILI, though often treated like a singular entity, is really a collection of numerous respiratory illnesses, including the flu, as the authors state. Moving ILI forecasting models in directions that better align with the ILI data-generating process represents an important step for ILI forecasting. Furthermore, the authors present evidence that forecasting can be improved by such a move.

In principle, I’m bullish on this work. That said, the execution left much to be desired. Numerous issues exist and clarifying questions need to be addressed. By far the most significant issue surrounds the prior specification and sampling of the weights, w_i.

All issues and clarification questions are enumerated below.

Major issues:

1. Do the weights w_i described in lines 166-171, as well as in the “Aggregation via MCMC” section of the Supplementary Materials (SM), sum to 1? The authors never say that they do, but they do describe ILI as being the aggregation of 6 pathogens, suggesting that the weights should sum to 1. It seems like they should, but it is not clear whether they do. Clarification is needed.

2. How are the weights actually estimated? On lines 175-176, the authors say, “We then performed a linear regression to ILI using the fitted curves via MCMC in order to derive estimates for w_i (Fig 1B).” The authors, in the second paragraph of section “Aggregation via MCMC” of the SM, go on to say, “The prior distribution for each weighting factor was set as a uniform distribution between 0 and 0.3. Starting from a set of weighting factors randomly drawn from the prior distributions, a Metropolis algorithm was applied. At each update, a new set of weighting factors was obtained by sequentially perturbing w_i.” Because a Uniform [0, 0.3] prior was selected for each weight, the posterior must necessarily be between 0 and 0.3, inclusive. That check seems to hold for Figure S5, as the posterior distribution for each weight and each season appears to be between 0 and 0.3. The same check, however, does not hold for the weights in Fig 1B. Specifically, some of the posterior draws for the PIV12 weight exceed 0.3, which is not possible if they were given a prior between 0 and 0.3. Explanation is needed. Shouldn’t Fig 1B be identical to the 2010 panel of Fig S5? Relatedly, what proposal density is used for the Metropolis algorithm?

3. It’s hard to understand what value the posteriors of the weights have, given that they were assigned a seemingly arbitrary prior hard-capped above at 0.3, and many of the posterior distributions are bumping up against this arbitrary upper bound (e.g., Fig S5 - PIV12 in 2001, 2002, 2010, and 2011). This suggests that the data actually want the weight for PIV12 to be larger than 0.3, but it can’t be because of the selected prior. In general, there is nothing wrong when a posterior bumps up against a prior upper or lower bound if that bound represents a physical constraint. When a posterior bumps up against an arbitrary bound, however, that is a clear sign of prior misspecification. In such a case, a new prior should be selected that pushes those bounds out, as the fits, and likely the forecasts, are suffering from this arbitrary 0.3 upper-bound choice. It seems a more physical and elegant solution would be to assign a Dirichlet prior to the weight vector (all 6 weights jointly). A Dirichlet prior (and Metropolis-Hastings proposal) will guarantee that 1) all weights are positive and 2) all weights sum to 1. Thus, if a weight bumps up against its lower (0) or upper (1) bound, that will be alright because those bounds represent physical lower and upper weight limits. If the authors choose not to change their priors and keep the analysis as is, they will need to provide extremely strong arguments as to why the weights cannot exceed 0.3.
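
To make the suggested fix concrete, here is a minimal sketch of a Metropolis-Hastings sampler with a flat Dirichlet prior on the weight vector, so that all six weights stay positive and sum to 1. The Gaussian likelihood form and all names here are illustrative assumptions, not the authors' implementation:

    import numpy as np
    from scipy.stats import dirichlet

    rng = np.random.default_rng(1)

    def log_likelihood(w, signals, ili, sigma=0.5):
        # Illustrative Gaussian regression likelihood of observed ILI
        # given the weighted sum of the six fitted viral signals.
        resid = ili - signals @ w          # signals: (T, 6), w: (6,)
        return -0.5 * np.sum((resid / sigma) ** 2)

    def sample_weights(signals, ili, n_iter=20000, conc=200.0):
        """Metropolis-Hastings under a flat Dirichlet(1,...,1) prior,
        which constrains the weights to the probability simplex."""
        k = signals.shape[1]
        w = rng.dirichlet(np.ones(k))      # initial state drawn from the prior
        chain = []
        for _ in range(n_iter):
            # Propose from a Dirichlet centred on the current state; conc
            # sets the step size. Clip to guard against numerical underflow
            # (a negligible perturbation of the proposal for a sketch).
            w_new = np.clip(rng.dirichlet(conc * w + 1e-2), 1e-12, None)
            w_new /= w_new.sum()
            # The proposal is asymmetric, so a Hastings correction is
            # required; the flat prior cancels in the ratio.
            log_alpha = (log_likelihood(w_new, signals, ili)
                         - log_likelihood(w, signals, ili)
                         + dirichlet.logpdf(w, conc * w_new + 1e-2)
                         - dirichlet.logpdf(w_new, conc * w + 1e-2))
            if np.log(rng.uniform()) < log_alpha:
                w = w_new
            chain.append(w.copy())
        return np.asarray(chain)

Because draws live on the simplex by construction, a weight piling up near 0 or 1 would then reflect a physical limit rather than an arbitrary prior truncation.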

Issues:

1. Throughout the manuscript, the authors say their models are “accurate” in an absolute sense (e.g., line 178; lines 218-220). However, the authors never define what “accurate” means in an absolute sense, leaving the reader to define this for themselves. This is challenging to do, as some of the language the authors use deviates from my interpretation. To me, for instance, the red line in the Strain B panel of Fig 1A deviates from the observations roughly as significantly as the yellow line (single agent) does in Fig 1C. However, the authors say, “The mechanistic models accurately reproduced the positivity rates of each pathogen (Fig 1A)” (line 178) to describe the Strain B fit while simultaneously describing the single-pathogen fit in Fig 1C as having “large discrepancies from observations” (line 183). If the authors choose to describe their fits as “accurate” in an absolute sense, they need to provide an unambiguous definition of what “accurate” means and then demonstrate that their fits either do or do not meet that unambiguous definition. Otherwise, the authors should refrain from such absolute claims.

2. Table 1: The percentages within a row in Table 1 do not sum to 100%. Are they supposed to? It seems like they should. I don’t know how to interpret them if they don’t.

3. Lines 230 and 240: Please provide more detail regarding how the Wilcoxon signed-rank test was performed. Its appropriateness is unclear based on the information provided.

4. Line 257: A conclusion of this work is that for the 2010-2011 season, flu strains account for up to 80% of ILI of the six examined viruses. How does this 80% estimate compare to the authors’ previous work with ILI+, which is a scaling of ILI by the proportion of ILI patients testing positive for influenza? At first blush, 80% seems high.

5. SM second equation: The fraction should be flipped, with P(T|M) in the denominator. This means the fraction in the third equation also needs to be flipped and the interpretation of the weight is backwards. In general, the “Disaggregation of ILI” section should be closely examined and updated in light of the error in the second equation.

6. Table S1: It’s not intuitive to me why the multi-pathogen fit should ever be worse than the single-pathogen fit. Isn’t the model space of the single-pathogen model contained within the more flexible multi-pathogen model space? Said another way, the multi-pathogen fit can take any form the single-pathogen fit can take, but the single-pathogen fit cannot take any form the multi-pathogen fit can take. If this is true, it’s unclear why, when fitting (not forecasting), the multi-pathogen fit should ever be worse than the single-pathogen fit.

7. Figure S10: The central thesis of the paper is that forecasting ILI as a weighted sum of its constituent pathogens yields better forecasts than forecasting ILI directly as a single pathogen. This conclusion is supported by Fig S10. The authors also demonstrate that forecasting can be further improved by accounting for systematic bias. The glaring hole in the comparison in Fig S10 is a forecast from the single-pathogen fit (baseline) that accounts for systematic bias. Without this comparison, it’s hard to know which modeling technique provides the larger improvement: the multi-pathogen modeling (as is suggested by the thesis of the paper) or the systematic bias modeling. The single-pathogen model with systematic bias correction would be a very welcome contribution to the paper.

Minor issues:

1. ILI is “influenza-like illness” not “influenza-like-illness” (e.g., the title and the abstract)

2. Line 109: In what sense is the perturbing “optimal”? Please provide a bit more context.

3. Line 110: please use consistent formatting for references (e.g., [28] instead of superscripting).

4. Lines 162-163: Why are RSV, PIV12, and PIV3 more regular than flu strains? Why do they peak when they do? Is this known? In general, a bit more context should be added to RSV, PIV12, and PIV3 – both what we know and don’t know about their seasonality.

5. Lines 166-168: Is there a season index being suppressed in the ILI(t) equation? For instance, is this ILI(t) for season s?

6. Line 172: It’s not clear how the dependent clause, “To eliminate the impact of observational error,” is related to the rest of this sentence. How is the fitting of each respiratory pathogen related to the observation error?

7. ILI data is subject to revisions for many weeks after the initial release of the ILI data. Is backfill accounted for in this work? If not, why not and how do you imagine backfill would impact this work?

8. SM last equation in section “Forecasting Influenza”: If you sample a negative positivity rate, do you set it equal to zero (as you know a positivity rate can’t be negative) or do you leave it negative?
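
One common remedy, for illustration (the numbers below are made up; a truncated-normal sampler would be the principled alternative):

    import numpy as np

    rng = np.random.default_rng(0)

    # Draws of a forecast positivity rate from a normal distribution can
    # be negative; clipping at zero keeps the samples physically valid.
    draws = rng.normal(loc=0.02, scale=0.03, size=1000)
    draws = np.clip(draws, 0.0, None)   # negative rates set to zero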

9. Fig S7 and S11: The 95% prediction intervals are plotted at the x=1 point, not the x=.95 point. Is what’s plotted a 95% prediction interval at the wrong point, or a 100% prediction interval at the right point?

10. Table S1: Should “sum of discrepancies” in the Table caption be “sum of squared discrepancies” or “sum of absolute discrepancies”? All reported figures are non-negative, suggesting the sums are not the raw discrepancies.

Reviewer #2: This manuscript builds on a series of prior work from these authors to improve the accuracy of influenza-like illness modelling in the US. It is motivated by the flusight forecasting challenge. In addition to improved forecasting, these studies are fundamentally moving the needle in our epidemiological understanding of influenza and (in this case) of respiratory pathogens in general.

In this manuscript, the authors draw on type-specific regional data to improve their forecasts of national and regional ILI. They use their existing mechanistic models for influenza subtypes and then the method of analogues for the non-flu pathogens. They find consistently increased accuracy of retrospective forecasts with their multi-pathogen approach.

I enjoyed reading the paper and I think it’s a substantial contribution. Forecasting aside, we have very few disease-dynamic studies of multiple pathogens over multiple years. The lower variance of incidence for the other types versus influenza is in itself an interesting result. The authors may wish to comment on the type of immunity and transmissibility of influenza versus the other pathogens. I wonder if influenza has a lowish R0 and long leaky immunity that leads to the variation versus the others?

I have a number of comments around identifying the marginal contribution of different data and approaches to the increased accuracy of the forecasts. However, although I might have made a few different design choices, the merit of the retrospective forecast design (against an established baseline) is that mis-specification and over-fitting are far less of a worry than in a typical likelihood-driven study.

Did the authors explicitly try the method of analogues for the influenza dynamics? I appreciate that they have a lot of experience with the process models, but given that the toolkit must be to hand in this project, it would be good to know that the process models do do better than the method of analogues, and by how much on the metrics used here. Constant scepticism about the merit of process models will result in really good process models at some point!

How much did each extra pathogen improve the accuracy? The authors state that more pathogens might be even better, but I wonder if the improvement in accuracy is coming from just a few of the additional non-flus. It would be interesting (and hopefully only resource intensive in computer time rather than human time) to hold each pathogen out in turn and assess the degradation of the forecast.
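
The hold-one-out experiment could be organised along the following lines; forecast_ili and mean_log_score are hypothetical stand-ins for the authors' pipeline, so this sketches only the bookkeeping:

    from typing import Dict, List, Sequence

    PATHOGENS = ["A/H1", "A/H3", "B", "RSV", "PIV12", "PIV3"]

    def forecast_ili(pathogens: Sequence[str]):
        # Stand-in: build the aggregated ILI forecast using only the
        # listed pathogens (replace with the actual forecasting system).
        raise NotImplementedError

    def mean_log_score(forecast) -> float:
        # Stand-in: average log score of a forecast over all targets,
        # seasons and locations.
        raise NotImplementedError

    def holdout_degradation(pathogens: List[str]) -> Dict[str, float]:
        """Remove each pathogen in turn and record the drop in mean log
        score relative to the full multi-pathogen system; larger values
        mean the pathogen contributes more to forecast accuracy."""
        full = mean_log_score(forecast_ili(pathogens))
        return {
            p: full - mean_log_score(
                forecast_ili([q for q in pathogens if q != p]))
            for p in pathogens
        }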

Many of these data will come from multiplex PCR, but they are being treated as independent observations that contribute to ILI. Are there additional published data from multiplex studies that directly measure the proportion of ILI attributable to each type? Would sample-linked data be able to confirm any of the underlying model parameters? Are there estimates out there for the proportion of ILI in the US due to these 6?

The charts, both main text and SI are very clear and allow the reader to understand the work quickly.

These authors are part of the ongoing effort with the flusight initiative. Have they used these methods in the competition yet? Might be worth a comment in the discussion without going into too much detail.

Some specific comments

Line 110; there has been a recent revision in the flusight scoring algorithm, discussed in an exchange of letters in PNAS (https://doi.org/10.1073/pnas.1912147116). Does this paper use the old or the new algorithm? Either is fine, but please state and cite the exchange.
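
For readers unfamiliar with the scoring change under discussion: under the older FluSight convention, probability mass in bins near the true value also received credit, whereas the revised rules score only the correct bin. A sketch follows (the bin layout and the -10 floor follow the published competition conventions; the code itself is illustrative, not from the paper):

    import numpy as np

    def multibin_log_score(bin_probs, bin_centers, observed, tol=0.5):
        # Credit all probability mass within +/- tol percentage points of
        # the observed value, then take the log; scores are floored at -10.
        # tol = 0.5 mimics the old multibin rule for ILI targets; scoring
        # only the single correct bin corresponds to a tolerance of about
        # half a bin width.
        mass = bin_probs[np.abs(bin_centers - observed) <= tol].sum()
        return max(np.log(mass) if mass > 0 else -np.inf, -10.0)

    # Example with the 0.1-percentage-point ILI bins used by FluSight:
    centers = np.arange(0.05, 13.0, 0.1)               # bin midpoints
    probs = np.full(centers.size, 1.0 / centers.size)  # flat forecast
    old = multibin_log_score(probs, centers, observed=2.31, tol=0.5)
    new = multibin_log_score(probs, centers, observed=2.31, tol=0.05)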

S1 Figure. It’s a little difficult to spot patterns here, and these are very innovative data. I wonder if scatter plots of each possible pair of observations (so time removed) might give a sense for the amplitude of different pathogens, their variance and their correlation with other pathogens.

S2 Figure. I’d put this in the main text. These are the outcome and “exposures” for this study and the patterns here are pretty clear.

144; I am more than happy to work in the log score, but maybe include a bit of narrative for the reader about going from score to skill … a log score of -0.5 is correct approx. 61% of the time!
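
The arithmetic behind that figure is a generic property of the log score rather than a result from the paper: exponentiating an average log score recovers the (geometric-mean) probability assigned to the correct outcome,

    \[
    \text{skill} = e^{\overline{\log \text{score}}}, \qquad e^{-0.5} \approx 0.607,
    \]

i.e., roughly 61%.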

172; to eliminate the impact of observational error - I couldn’t quite understand this.

178; see comments above, would it be worth testing analogues for influenza as well as the non-flu viruses

194; B looks more like the non-flus in this regard?

237; % improvement in log scores is a bit difficult to interpret. Maybe state the change in log scores or give % changes in skill rather than score?

253; Are there other sources of evidence about proportions of ILI that arise from different pathogens? See comments above.

276; If arguing for more pathogens being better, would be good to check the incremental impact of each of the pathogens here. See other comments above.

The SI was very clear. I didn’t see the SI data (and I might have had a play with these!) but assume that it will be uploaded on publication.

Reviewer #3: The main claims of the paper are that modeling ILI as a weighted average of its contributing viral signals leads to more accurate and precise ILI forecasts (relative to a single-pathogen model for ILI), and that such disaggregation also has value from a public health perspective. The originality and innovative nature of this paper rely largely on the decomposition of ILI into component pathogenic contributors. However, it is not entirely clear based on the current intro whether/how such decomposition has been done before. Lines 31-32: "most existing process-based forecasting systems treat ILI as a single infectious agent." Most, but not all? Can you cite some forecasting systems that do not treat ILI as a single infectious agent? How do these differ from your method? (Same in line 62, with the language "almost all".) If there are competitors that also decompose ILI into component pathogens, the specific contributions of this paper relative to those competitors needs to be made explicit.

As far as the claim of improved accuracy and precision, the former is thoroughly demonstrated relative to the single-pathogen SIR model and the latter is not rigorously shown. We can see visually in Fig 2 that the 95% CI of the baseline model seems wider for that particular forecast example, but a table or figure giving a comparison of the area of the models' 95% predictive CIs is needed to make this a general claim.

Performance is compared to what seems to me to be a 'bad' baseline that simulates ILI as a single pathogen. How does performance at predicting overall ILI compare to other competitors that don't consider separate pathogens (e.g., the purely statistical top-performing FluSight model Dante)? Obviously the authors' method has advantages from an interpretability standpoint, but if it performs more poorly than a purely statistical model this would imply that the process-based model is still not quite capturing reality very well. Such a result would not invalidate the claims of the paper, or render its contribution irrelevant, but would indicate that further understanding of the process driving the ILI curve is needed.

The authors argue that learning about the trajectories of the individual infectious agents is potentially useful in clinical settings for determining the possible cause of ILI among multiple infectious agents. This claim of pathogen-specific forecasts improving on public health response relative to ILI forecasts alone could be strengthened by a citation. Additionally, the relationship between the sampling protocol for ILI and that of virologic surveillance isn't made entirely clear in the paper. How are virologic surveillance data sampled? Does this sampling protocol vary by year, and does it impact predictions? I'd like to know more about potential biases in the virologic data, and what the underlying population is of the positivity rate (Is it ILI patients? Is it people at the hospital? Are sicker patients more likely to get a virologic test? Etc.).

I am concerned about the assumption that ILI is a weighted sum of the 6 included pathogens. I think that slightly more care needs to be paid to this issue in the main text outside of the discussion section. I'd like to know whether it would be possible in this framework to include a catch-all "other" category. Given that the authors make the argument that accounting for the various pathogen sources of ILI activity has advantages from both a predictability and an interpretability perspective, it seems misguided for the model itself to implicitly not acknowledge that ILI has other contributors than the 6 pathogens having historic data.

In addition to discussing how (or why not) to include an 'other' category, I think more justification beyond data availability could ameliorate some of the concerns of including only those 6 pathogens. Specifically, I want to know what proportion of ILI seems to not be covered by those pathogens having data since 1997. That is, for the years that data are available, what are the positivity rates for the example other-pathogens cited on lines 93-94 (coronavirus, human metapneumovirus and respiratory adenovirus)? If the positivity rates are high, I would worry more that the 6 included pathogens aren't actually what comprises much of ILI. If low, I'm less concerned.

In line 167-168, the authors should be clear about whether or not \sum w_i = 1. Does the weighting factor imply anything about whether some of ILI can be driven by an individual having multiple pathogens? Is this assumption valid? You mention this briefly in the discussion, but a quick note here would also be helpful. I'm also interested in the relationship between the weighting factor and the sensitivity/specificity of the test. If you knew about the TPR/FPR/FNR of the test could this model incorporate that information?
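
For reference, the weighted-sum model at issue, reconstructed here from the notation used in the reviews (the symbols are assumptions rather than a quotation of the manuscript), is

    \[
    \mathrm{ILI}(t) = \sum_{i=1}^{6} w_i \, v_i(t),
    \]

where $v_i(t)$ is the fitted signal for pathogen $i$ and $w_i$ its weighting factor; the question raised is whether the $w_i$ are constrained to satisfy $\sum_i w_i = 1$.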

Figure S5 shows the estimated distributions of weighting factors in different seasons at the national level. It would be useful to also include a similar figure (or a few figures) with season fixed and weighting by region. I'm curious if there is significant regional variability.

Also in looking at Figure S5 I had the question as to why weighting factors should vary so considerably by season. The authors mention that this is due to contributions themselves being variable. However, it would make more sense to me for the weights to be consistent across seasons, but the positivity rate itself to drive differing pathogenic contributions. The variability of the weighting factors seems to me to perhaps be an artifact of sampling differences across seasons in the virologic data. I think this needs further discussion/consideration.

I would also like some comment/discussion from the authors on why the weights for some pathogens in Figure S5 have such small spread (e.g., RSV) while others are hugely variable (e.g., many have box-whisker plots spanning from near-0 to above 0.2). Is there a set of good solutions to the data that can come from essentially 0-weighting a pathogen, while putting all the weight on a different one? Is the set of good solutions highly modal, or is there a smooth trajectory of weights that yield a good fit?

A handful of smaller comments:

- Figure 4 is beautiful!!

- Is Figure S6 a remarkable example (particularly good or bad)? Having just one such example makes me wonder why it was chosen, so perhaps include a few more and note the performance relative to overall average performance.

- In line 109-110 what is the superscript 28 referring to?

- I appreciate the authors noting that the particular structure of the model (modeling ILI as an aggregation of component parts) isn't unique to pathogens but could be used to aggregate forecasts by age group or geography.

- Is the code available? For reproducibility it would be helpful to make this public.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: No: The data are stated as being available but were not present.

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions, please see http://journals.plos.org/compbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008301.r003

Decision Letter 1

Rob J De Boer, Sara Y Del Valle

30 Jul 2020

Dear Dr. Pei,

Thank you very much for submitting your manuscript "Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like illness" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Sara Y Del Valle

Guest Editor

PLOS Computational Biology

Rob De Boer

Deputy Editor

PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately:

[LINK]

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #3: I thank the authors for their thorough and thoughtful responses to the reviewers! In light of the clarifications/additions, I feel that this manuscript is ready for publication.

I had one small note. In one of the responses the authors state: “Dante is not a purely statistical model. It couples a process-based SIR model with a statistical procedure to correct forecast discrepancies.” I believe that they are referring to DBM (https://arxiv.org/abs/1708.09481, also by Osthus). Unlike DBM, Dante (https://arxiv.org/abs/1909.13766) does not rely on a process-based SIR model and is in fact a purely statistical model. In spite of this issue, the authors' response still addressed the underlying question, so thank you for noting the relative performance of your model to Dante.

Reviewer #4: The authors have submitted a revised manuscript describing their work on aggregating multiple pathogens to improve ILI forecasting. The initial reviews were generally positive, mostly focusing on clarification and contextualization of the authors' work. The authors have addressed the first round of reviews, and the clarity of the paper is noticeably improved. I do not have any major comments or criticism of the work as is. I think it should be published because it demonstrates a novel ILI forecasting strategy that has the potential for future improvements. I think that the paper's results may be challenging to reproduce verbatim from the supplementary material. Still, because the authors have said they will post their code to GitHub, I am not concerned. I offer a handful of minor comments.

Comments:

line 68-69: Could you insert a quick definition of "positivity rate" here? Just for completeness.

line 122: Could "relatively regular outbreaks" be changed to something like "consistent seasonality"? I don't think the term "regular outbreaks" describes the temporal pattern you are focusing on: regular outbreaks could mean that they happen annually, whereas you are specifically keying on the regularity of the outbreak timing within any given year.

line 133: Sampled with replacement?

line 140-141: Explicitly state that the prior is uniform(0,1) instead of just describing a range

line 145: The word "inferred" is redundant because you infer the true posterior by performing the MCMC sampling

line 355: In this sentence, you say that "forecasting an aggregated ILI signal is insufficient," but forecasting ILI using an aggregation model is what this paper is about. I think you can drop the "aggregated" to emphasize the point that you are trying to make in this paragraph, that is, modeling separate pathogens provides more information and solutions to decision-makers.

line 636: The units of the "viral signals" in Figure 1 are not explicit. I am assuming it is the observed positivity rate.

SI Eq. 5: The sign of the I/D term is wrong. It should be negative.
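
For context, a standard SIRS formulation of the kind used in this line of influenza forecasting (a generic reference form; SI Eq. 5 itself is not reproduced here) carries the recovery term with a negative sign:

    \[
    \frac{dS}{dt} = \frac{N - S - I}{L} - \frac{\beta(t)\, I S}{N}, \qquad
    \frac{dI}{dt} = \frac{\beta(t)\, I S}{N} - \frac{I}{D},
    \]

where $L$ is the mean duration of immunity and $D$ the mean infectious period; the $-I/D$ term removes recovering individuals from the infected class.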

SI pg. 2: The second paragraph is unclear. What does "free simulation" mean in this context?

SI pg. 4: The assumption of the distribution for $\bar{v}_i(t)$ seems to imply a Bayesian posterior using an improper prior. Is there a reason for not being explicit about this choice of inference?

SI pg. 4: In paragraph 3, why was the ensemble of forecast trajectories for each pathogen not incorporated into the likelihood when performing the MCMC to sample the multiplicative factors? It seems like you accept/reject the multiplicative factors with one set of $v_i(t)$ and then switch to another set of $v_i(t)$ when performing the forecasting.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #3: Yes

Reviewer #4: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #3: No

Reviewer #4: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008301.r005

Decision Letter 2

Rob J De Boer, Sara Y Del Valle

2 Sep 2020

Dear Dr. Pei,

We are pleased to inform you that your manuscript 'Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like illness' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Sara Y Del Valle

Guest Editor

PLOS Computational Biology

Rob De Boer

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008301.r006

Acceptance letter

Rob J De Boer, Sara Y Del Valle

14 Oct 2020

PCOMPBIOL-D-19-01318R2

Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like illness

Dear Dr Pei,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Matt Lyles

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Text. Supplementary materials.

    (DOCX)

    S1 Table. Residuals of multi-pathogen and single-pathogen model fittings to ILI.

    We measure the residual of the model fit to ILI using the sum of absolute discrepancies over all weeks in each season and location. In each cell, the first number is the residual of the multi-pathogen model fit and the second is from the single-pathogen fit; the smaller residual is in bold. The multi-pathogen model yields smaller fitting residuals in most seasons and locations.

    (DOCX)

    S1 Fig. Positivity rates of six pathogens in 15 seasons at national level.

    We plot the positivity rates from laboratory tests at the national level for influenza A/H1, A/H3, B, RSV, PIV12 and PIV3 for the 1997–1998 to 2013–2014 seasons, excluding the 2008–2009 and 2009–2010 seasons. While influenza activity varies substantially across seasons, the non-influenza pathogens show more regular seasonality.

    (EPS)

    S2 Fig. Scatter plot of each possible pair of observations.

    We report the correlation coefficient of positivity rates between each pair of pathogens.

    (EPS)

    S3 Fig. Residuals of multi-pathogen fit to ILI in the US and HHS regions 1 to 9.

    Dashed lines are residuals obtained in the 15 individual seasons, and solid lines are the season-averaged curves. The mean residuals were used in post-processing as the systematic bias, which is assumed to be the same across seasons. Note that in most regions the decrease of ILI during the Christmas-New Year holiday is captured by the average residuals.

    (EPS)

    S4 Fig. Estimated relative contributions of nine respiratory pathogens at the national level.

    We estimated the relative contribution of each pathogen from 2010 to 2013, when positivity rate data from rAD, HMPV and Rhino were also available.

    (EPS)

    S5 Fig. Comparison of multi-pathogen and single-pathogen model fittings to ILI.

    Fitting was performed for national data for the 1997–1998 to 2013–2014 seasons, excluding the 2008–2009 and 2009–2010 seasons. The regression curves of multiple viral signals show less discrepancy from observations than the single-pathogen fitting.

    (EPS)

    S6 Fig. Estimated distributions of multiplicative factors in different seasons at the national level.

    Multiplicative factors vary considerably over time, indicating that the contributions of the six examined viruses to ILI differ across seasons. The prior range for w_i is set as [0, 1].

    (EPS)

    S7 Fig. Estimated multiplicative factors of six respiratory pathogens in the 2011–2012 season.

    The prior range for w_i is set as [0, 1].

    (EPS)

    S8 Fig. Estimated multiplicative factors of six respiratory pathogens in the 2013–2014 season.

    The prior range for w_i is set as [0, 1].

    (EPS)

    S9 Fig. Examples of forecasts for the six viruses.

    Forecasts are shown for national positivity rates during the 2006–2007 season at week 10. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

    (EPS)

    S10 Fig. Reliability plots for the forecasts of individual pathogens.

    We display the reliability plots for 4 near-term targets: 1- to 4-week ahead ILI, denoted by X1 to X4, respectively. Data points show the fraction of observed targets falling within the 25%, 50%, 75% and 100% prediction intervals. For a well-calibrated forecast, data points will lie on the diagonal line y = x. Results are shown for the forecasts generated at the national level for all 15 seasons.

    (EPS)
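
    For reference, a generic sketch of the coverage calculation behind such a reliability plot (illustrative code under assumed array shapes, not the authors' implementation):

        import numpy as np

        def empirical_coverage(samples, observed, levels=(0.25, 0.50, 0.75, 1.00)):
            """For each nominal level, compute the fraction of observed
            targets falling inside the central prediction interval of the
            forecast ensemble; points on y = x indicate good calibration."""
            samples = np.asarray(samples)    # shape (n_forecasts, n_ensemble)
            observed = np.asarray(observed)  # shape (n_forecasts,)
            coverage = {}
            for level in levels:
                lo = np.percentile(samples, 50.0 * (1.0 - level), axis=1)
                hi = np.percentile(samples, 50.0 * (1.0 + level), axis=1)
                coverage[level] = float(np.mean((observed >= lo) & (observed <= hi)))
            return coverage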

    S11 Fig. Examples of ILI forecasts at different phases of the 2010–2011 outbreak.

    Forecasts are shown for national ILI during the 2010–2011 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

    (EPS)

    S12 Fig. Examples of ILI forecasts at different phases of the 2011–2012 outbreak.

    Forecasts are shown for national ILI during the 2011–2012 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

    (EPS)

    S13 Fig. Examples of ILI forecasts at different phases of the 2012–2013 outbreak.

    Forecasts are shown for national ILI during the 2012–2013 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

    (EPS)

    S14 Fig. Examples of ILI forecasts at different phases of the 2013–2014 outbreak.

    Forecasts are shown for national ILI during the 2013–2014 season at weeks 8, 13, 18 and 26. The grey lines are mean forecast trajectories and grey areas represent the 95% predictive CIs.

    (EPS)

    S15 Fig. Comparison of log score and forecast error at different forecast weeks.

    We compare the log score (left) and forecast error (right) for retrospective forecasts generated by aggregating predictions of multiple pathogens and a baseline method that simulates ILI as a single pathogen. Results are obtained by averaging the score or error for forecasts generated from week 4 to week 35 during 15 seasons at the US national level and for HHS regions 1 to 9.

    (EPS)

    S16 Fig. Effect of post-processing on forecasts.

    We compare the log scores (A) and forecast errors (B-C) for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). Forecasts obtained by direct aggregation and additional post-processing procedures (adjusting systematic bias, adjusting current bias and calibration) are compared with the baseline.

    (EPS)

    S17 Fig. Comparison of the average width of the 95% prediction interval.

    Results are obtained from retrospective forecasts during 15 seasons at the US national level and for HHS regions 1 to 9.

    (EPS)

    S18 Fig. Effect of systematic bias correction on baseline forecast log score.

    We compare the log scores for seven targets averaged over weekly predictions during 15 seasons and 10 locations (national and 9 HHS regions). Forecasts obtained using the baseline with systematic bias correction (Baseline+systematic bias correction) are compared with the baseline forecast method (Baseline).

    (EPS)

    S19 Fig. Forecasting influenza using the method of analogues in multi-pathogen forecasting.

    We compare the log scores for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). Forecasts were obtained using the multi-pathogen method in which influenza types/subtypes were predicted using dynamical models (Dynamic), the multi-pathogen method in which influenza types/subtypes were predicted using the method of analogues (Analogues) and the baseline single-pathogen method (Baseline).

    (EPS)

    S20 Fig. Degradations of forecasts if each pathogen is removed from the multi-pathogen forecasting system.

    We compare the log scores for seven targets averaged over weekly predictions in 15 seasons and 10 locations (national and 9 HHS regions). We removed each pathogen in turn from the multi-pathogen forecasting system and compared the resulting degradation in log score.

    (EPS)

    Attachment

    Submitted filename: Response.pdf

    Attachment

    Submitted filename: Response.docx

    Data Availability Statement

    Data and code are available at GitHub https://github.com/SenPei-CU/Multi-Pathogen_ILI_Forecast.

