Skip to main content
Cambridge Open Access logoLink to Cambridge Open Access
. 2012 Feb 16;140(12):2131–2141. doi: 10.1017/S0950268812000179

Modelling interrupted time series to evaluate prevention and control of infection in healthcare

V GEBSKI 1,2,*, K ELLINGSON 1, J EDWARDS 1, J JERNIGAN 1, D KLEINBAUM 1,3
PMCID: PMC9152341  PMID: 22335933

SUMMARY

The most common methods for evaluating interventions to reduce the rate of new Staphylococcus aureus (MRSA) infections in hospitals use segmented regression or interrupted time-series analysis. We describe approaches to evaluating interventions introduced in different healthcare units at different times. We compare fitting a segmented Poisson regression in each hospital unit with pooling the individual estimates by inverse variance. An extension of this approach to accommodate potential heterogeneity allows estimates to be calculated from a single statistical model: a ‘stacked’ model. It can be used to ascertain whether transmission rates before the intervention have the same slope in all units, whether the immediate impact of the intervention is the same in all units, and whether transmission rates have the same slope after the intervention. The methods are illustrated by analyses of data from a study at a Veterans Affairs hospital. Both approaches yielded consistent results. Where feasible, a model adjusting for the unit effect should be fitted, or if there is heterogeneity, an analysis incorporating a random effect for units may be appropriate.

Key words: Methicillin-resistant S. aureus (MRSA), modelling, public health

INTRODUCTION

In the USA, methicillin-resistant Staphylococcus aureus (MRSA) is the most common cause of skin and soft tissue infections in patients presenting to emergency departments and is endemic in many hospitals [1]. Interventions to reduce transmission include emphasizing hand hygiene, active surveillance culturing, and educating healthcare workers in the ‘culture’ of infection control. We explored options for evaluating chronologically overlapping interventions in an existing dataset [2]. The outcome of interest was the impact of specific interventions on the incidence of MRSA in each unit, where interventions might take weeks or months to become effective and might be implemented in different units at different times. This approach is also known as a step wedge design [3].

METHODOLOGICAL FRAMEWORK

Segmented models may explain sudden and gradual shifts in data due to external mechanisms not accounted for by simple multiple regression models [410]. Segmented modelling, a reasonable approach for evaluating interventions when data are collected over time [11], has been used extensively for public health interventions [1215] and for effects on MRSA rates in particular [16, 17]. Biglan et al. [18] overviewed models of the effect of community interventions using a conventional time series (autoregressive integrated moving average), and this was applied to nosocomial infections by Fernández-Pérez et al. [19]. Clinical incidence of MRSA is a proxy for MRSA transmission [20]. We extended this idea to a model of infection rates of MRSA transmission incorporating multiple hospital units.

Approaches to answering these questions can be complicated by methodological concerns including: (1) the same intervention may be implemented at different times in units within the same facility (i.e. hospitals), (2) there may be a correlation of incidence rates across time periods; and (3) the intervention may take weeks or months to produce a measurable effect.

RESULTS

Data sources

Data were collected over 109 months at a Veterans Affairs (VA) hospital in Pittsburgh. The intervention included culturing for MRSA colonization at admission before initiation of contact precautions and hand-hygiene awareness, which were incrementally phased in. The intervention began in October 2001 (month 25) in unit A; in October 2003 (month 49) in unit B; then in July 2005 (month 70) in the remainder of the acute care units (area C). The first date for which MRSA incidence data were available for all three areas was 1 October 1999. One study objective was to quantify estimates of the changes in rates over time to allow researchers and administrators to gauge the effect of introduction of interventions introduced to individual units or hospitals.

The outcome of interest was the monthly clinical incidence of MRSA cases in each area of the hospital, a proxy for MRSA transmission or new acquisition of infection or colonization (at a clinical site). An incident case was defined as: a positive, clinical (non-surveillance) MRSA culture obtained at least 48 h after admission to an acute-care unit and, if the patient was transferred, within 48 h of transfer to another unit. Cases were excluded as ‘non-incident’ if a positive clinical culture in the previous year (including long-term care and outpatient setting) could be identified anywhere in the laboratory information system. Nasal and rectal swabs were considered surveillance cultures and thus ineligible. Clinical incidence was expressed as the number of incident cases per 1000 patient-days. The corresponding incidence rate for month i (i = 1, 2, …, 109), denoted as λi, was estimated by dividing the number of patients with incident MRSA by the patient-days for each month (Fig. 1).

Fig. 1.

Fig. 1.

Observed incident MRSA cases per 1000 patient-days. (a) Unit A, (b) unit B, and (c) area C.

Pooled analysis: fixed effects

Within each unit, the number of MRSA events in any one month can be thought of as a Poisson count, with the number of patient-days being the exposure time. An interrupted time series (or segmented regression) using a generalized linear model for a Poisson distribution, by using a segmented approach [8] can be fitted.

The form for a Poisson distribution is, for the ith unit (i = 1, 2, 3)

graphic file with name S0950268812000179_eqnU1.jpg

within each unit where a case equates to a positive microbiology culture. If the intervention occurs at time t0, we can link the number of infections (y) to time using the following model

Model 1: ln(λ) = β0 + β1T + β2I + β3T*,

where λ = monthly incidence rate,

graphic file with name S0950268812000179_eqnU2.jpg
graphic file with name S0950268812000179_eqnU3.jpg

In this model, β0 represents the baseline MRSA rate; β1, the slope before the intervention at t0; β2, the change in the rate just after t0; β1 + β3, the slope after t0, and β3, the change in slope after t0. The model can be fitted using standard software, the standard error of the post-intervention slope, β1,j + β3,j for the jth hospital unit is

graphic file with name S0950268812000179_eqnU4.jpg

The quantity Inline graphic is can be obtained from the output of statistical packages (Fig. 2). Fitting segmented Poisson regression separately for each unit in the hospital, gives, for the jth unit, parameter estimates of the coefficients: Inline graphic, Inline graphic, Inline graphic and Inline graphic and their associated standard errors: s.e.(Inline graphic), s.e.(Inline graphic), s.e.(Inline graphic) and s.e.(Inline graphic). Assuming the rate in each unit is independent of the rate in the other units, an overall estimate of β0, β1, β2 and β3 can be obtained by pooling the individual estimates [21]. This is achieved by a weighted average of the individual unit estimates, where the weights are the inverse of the variances of the estimates from the fitted model. The variances are obtained by squaring the respective standard errors. If there are k units over which we need to pool, the estimates are obtained as:

graphic file with name S0950268812000179_eqnU14.jpg

where Inline graphic and Inline graphic (j = 1, 2, …, k). Similarly,

graphic file with name S0950268812000179_eqnU17.jpg

Confidence intervals for the pooled estimates, Inline graphic (i = 0, 1, 2, 3) are constructed by noting that the variance of Inline graphic is

graphic file with name S0950268812000179_eqnU20.jpg

The 95% confidence interval for Inline graphic is Inline graphic These estimates and their standard errors are based on the logarithm of the rate, and need to be exponentiated to reflect the actual rates. These rates are referred to as incidence density ratios. The estimate of the variances of these coefficients can be adjusted if there is evidence of substantial heterogeneity between the units. This adjustment for the extra variation due to heterogeneity is also referred to as a random-effects adjustment and statistical tests are available to determine whether significant heterogeneity is present. The random-effects estimate simply adds a component to each the weights wi,j reflecting this variability [21, 22].

Fig. 2.

Fig. 2.

Representation of model 1: within an individual unit. β0 = Starting baseline MRSA rate; β1 = slope of line prior to t0; β2 = drop at t0; β1 + β3 = slope after t0.

Autocorrelation

As the data is a time series, any significant serial correlation (correlation between successive observations) present after the regression models have been fitted needs to be examined to determine the extent of any (first-order) serial autocorrelation in the residuals. The Durbin–Watson statistic [23], measures such correlation, and is calculated as

graphic file with name S0950268812000179_eqnU23.jpg

where the observations range from 1, …, T and ei is the ith residual from the Poisson regression model. The range of the statistic is 0–4, with values substantially less than 2 indicating (first-order) serial autocorrelation.

The data layout for the three hospital units is shown in Table 1. Table 2 shows the coefficients for each unit, the fixed- and random-effect [21, 22] pooled estimates, and the Durbin–Watson statistic. Based on 109 observations, values of this statistic >1·58 indicate no autocorrelation at the 1% and >1·71 at the 5% levels of significance. Values <1·54, and <1·67 suggest the presence of autocorrelation at the 1% and 5% levels of significance, respectively. Critical values at the 5%, 2·5% and 1% significance levels are available online (http://www.stanford.edu/∼clint/bench/dwcrit.htm).

Table 1.

Data layout for MRSA data from hospital units A and B and area C

Unit A: t0 = 25 Patient days MRSA counts Unit B: t0 = 48 Patient days MRSA counts Area C: t0 = 70 Patient days MRSA counts
T I T* T I T* T I T*
A 1 0 0 615 3 B 1 0 0 244 0 C 1 0 0 2289 4
A 2 0 0 460 0 B 2 0 0 245 0 C 2 0 0 2111 6
: : : : : : : : : : : : : : : : : :
A 24 0 0 B 47 0 0 249 1 C 69 0 0 2122 8
A 25 1 1 B 48 1 1 230 0 C 70 1 1 2035 6
A 26 1 2 : : B 49 1 2 259 3 C 71 1 2 2213 5
: : : : : : : : : : : : : : : : : :
A 109 1 85 645 1 B 109 1 62 252 0 C 109 1 39 2383 3

t0, Time of the intervention; T, time period before the intervention (months); I, intervention time period (months); T*, time period after the intervention (months).

Clinical incidence density rates (MRSA counts/patient days) over a period of 109 months were calculated for all units.

Table 2.

Individual and pooled incidence density rates, 95% CI and P values

Unit* Coefficient Rate 95% CI P
Inline graphic (constant) Unit A −6·529 0·0014 0·001–0·002 <0·001
Unit B −5·812 0·0030 0·002–0·004 <0·001
Area C −5·882 0·0028 0·002–0·003 <0·001
Fixed −5·907 0·0027 0·002–0·003 <0·001
Random −6·006 0·002 0·001–0·004 <0·001
Inline graphic (time) Unit A 0·0349 1·0355 0·98–1·04 0·19
Unit B 0·0054 1·0054 0·98–1·01 0·64
Area C −0·0008 0·9992 0·995–0·999 0·74
Fixed −0·0003 1·000 0·995–1·004 0·91
Random 0·011 1·011 0·9961–1·064 0·34
Inline graphic (intervention) Unit A −0·3051 0·7370 0·340–0·999 0·44
Unit B −0·4368 0·6461 0·258–0·992 0·35
Area C −0·5372 0·5844 0·395–0·632 0·01
Fixed −0·4828 0·617 0·445–0·856 <0·01
Random −0·483 0·617 0·445–0·855 <0·01
Inline graphic (change in slope after the intervention) Unit A −0·0478 0·9533 0·904–0·955 0·08
Unit B −0·0331 0·9674 0·935–0·968 0·06
Area C −0·0028 0·9972 0·982–0·997 0·72
Fixed −0·0106 0·989 0·976–1·003 0·13
Random −0·025 0·976 0·934–1·019 0·13
Inline graphic+Inline graphic (post-intervention slope) Unit A −0·0129 0·9872 0·978–0·987 <0·01
Unit B −0·0277 0·9727 0·948–0·973 0·04
Area C −0·0036 0·9964 0·982–0·997 0·63
Fixed −0·0118 0·9883 0·981–0·996 <0·01
Random −0·012 0·988 0·977–0·999 <0·01

CI, Confidence interval.

*

Durbin–Watson A: 1·90; B: 2·87; C: 1·87. Values >1·58 and 1·71 indicate no evidence of autocorrelation at the 1% and 5% levels of significance, respectively.

Rate is the exponential of the coefficient.

Adjusted for number of patient days at risk/month in each unit.

The interpretation of the pooled coefficients is as follows: At the start of data collection, compared with a rate of 0, there was a significant rate of MRSA of approximately 3/1000 in the hospital (estimated by β0). Before the intervention, the estimate of the rate (β1) was no change from the start (baseline) of data collection. The plausible rate was within the range of a 0·5% decrease and a 0·4% increase. The immediate effect of the intervention (β2) was a significant 39% decrease in the rate with a plausible effect ranging from a decrease of 55% to 14%. There was no significant reduction in clinical MRSA incidence of 1% per month of the intervention after implementation (β3). The plausible reduction ranged from 2% per month reduction to 0·3% increase. Adjustment for random effects does not substantially change the results.

Stacked analysis

Rather than fitting a separate model and pooling the estimates, we can fit a single model for all the units simultaneously to obtain estimates for pre-intervention, intervention, and post-intervention. A single model has the advantage of using the data across the units as part of the estimation procedure for each of the parameters resulting in an improvement in the efficiency of some of the estimates. In this model we concatenated the data row-wise to obtain one dataset of 327 rows.

Two extra columns appear in the array of Table 3, indicator variables representing unit B (U2) and area C (U3), leaving unit A as the reference category. We address the columns labelled T1*, T2* and T3* later. The model we fit may be represented as:

Table 3.

Stacked data layout combining units

Unit T I T* U2 U3 T1* T2* T3* Patient days MRSA counts
A 1 0 0 0 0 0 0 0 615 3
A 2 0 0 0 0 0 0 0 460 0
: : : : : : : : :
A 24 0 0 0 0 0 0 0
A 25 1 1 0 0 1 0 0
A 26 1 2 0 0 2 0 0
: : : : : : : : :
A 109 1 84 0 0 84 0 0 645 1
B 1 0 0 1 0 0 0 0 244 0
B 2 0 0 1 0 0 0 0 245 0
: : : : : : : : : :
B 48 0 0 1 0 0 0 0 244 0
B 49 1 1 1 0 0 1 0 245 0
B 50 1 2 1 0 0 2 0
: : : : : : : : :
B 109 1 58 1 0 0 58 0 252 0
C 1 0 0 0 1 0 0 0 2289 4
C 2 0 0 0 1 0 0 0 2111 6
: : : : : : : : : :
C 69 0 0 0 1 0 0 0 2125 0
C 70 1 1 0 1 0 0 1 1933 0
C 71 1 2 0 1 0 0 2
: : : : : : : : : : :
C 109 1 40 0 1 0 0 40 2383 3

U2 and U3 are indicators representing unit B and area C.

Model 2: ln(λ) = β0 + β1T + β2I + β3T* + γ1U2 + γ2U3,

where λ is the monthly incidence rate, T and I are as previously defined, and T* = 0 if Tt0 and T*=Tt0 for T>t0.

For our study, t0 = 24 if U2 = 0 and U3 = 0; t0 = 48 if U2 = 1 and U3 = 0, and t0 = 69 if U2 = 0 and U3 = 1. γ denotes the parameters relating to the effects of the individual units. The results of fitting this Poisson model (Table 4) are consistent with the pooled analysis (Table 2).

Table 4.

Stacked model: incidence density rates

Coefficient Rate 95% CI P
Adjusted model
Inline graphic (constant) −5·7920 0·0031 0·0024–0·0038 <0·001
Inline graphic (time) −0·0012 0·9988 0·9943–1·0034 0·62
Inline graphic(intervention) −0·3544 0·7016 0·5334–0·9228 0·01
Inline graphic (post-intervention) −0·0088 0·9912 0·9831–0·9994 0·03
Inline graphic+ Inline graphic −0·0100 0·9900 0·9830–0·9971 0·06
Inline graphic (unit B) 0·0288 1·0292 0·7412–1·4291 0·86
Inline graphic (area C) −0·0884 0·9154 0·6976–1·2013 0·52
Unadjusted model
Inline graphic (constant) −5·833 0·003 0·0025–0·0034 <0·001
Inline graphic (time) −0·002 0·997 0·994–1·002 0·26
Inline graphic(intervention) −0·320 0·726 0·558–0·943 0·02
Inline graphic (post-intervention) −0·007 0·993 0·986–0·999 0·04
Inline graphic+Inline graphic −0·009 0·991 0·984–0·998 <0·01

CI, Confidence interval.

ln(λ) = β0 + β1T + β2I + β3T* + γ1U2 + γ2U3.

ln(λ) = β + β1T + β2I + β3T*.

Using model 2, the contribution of the individual components to the overall estimates can be summarized as:

  • Unit A: U2=U3 = 0, T < 25: ln(λ) = β0 + β1T,

    T = 25: ln(λ) = (β0 + β2 + (25β1 + β3)

    T > 25: ln(λ) = (β0 + β2 − 24β3) + (β1 + β3)T.

  • Unit B: U2 = 1, U3 = 0, T < 49: ln(λ) = (β0 + γ1) + β1T,

    T = 49: ln(λ) = (β0 + γ1 + β2) + (49β1 + β3)

    T > 49: ln(λ) = (β0 + γ1 + β2 − 48β3) + (β1 + β3)T.

  • Area C: U2 = 0, U3 = 1, T < 70: ln(λ) = (β0 + γ2) + β1T,

    T = 70: ln(λ) = (β0 + γ2 + β2) + (70β1 + β3)

    T > 70: ln(λ) = (β0 + γ2 + β2 − 69β3) + (β1 + β3)T

(see Fig. 3).

Fig. 3.

Fig. 3.

Representation of stacked data in model 2. ln(λ) = β0 + β1T + β2I + β3T* + γ1U2 + γ2U3; β1 = slope of line prior to intervention for units A, B, and area C; β2 = average drop at the intervention (drop occurs at T = 25 for unit A, T = 49 for unit B, T = 70 for area C; β1 + β3 = slope after intervention for units A, B, and area C. If β3 = 0, then the slopes are the same (β1) both before and after the intervention for all three units.

While models 1 and 2 estimate the same quantities, the estimation approaches are different. In model 2, the pre-intervention effect (β1) has simultaneous contributions from all three units until the first intervention (24 months) and from two units (unit B and area C) until the second intervention (48 months). In model 1 they are estimated separately. Similarly, for the estimation of the post-intervention effect, the contribution to β1 + β3 comes from the three units.

In a stacked analysis, the effects are adjusted for differences between units; however, this may not be feasible if (a) the number of units is large, (b) there are different durations of time series in different units, or (c) numerical instability in the model fitting or parameter estimates is observed. In these situations, a pooled analysis may be a useful option.

If the unit effects are not significant, model 2 can be simplified by omitting the variables U2 and U3. In this case, we revert to model 1, but with all the units pooled in one (stacked) dataset and assume no unit effect (Table 4). Differences between the unadjusted estimates in Table 4 and pooled analyses (Table 2) may arise because estimates in Table 4 use information from all units simultaneously. For the pooled analysis, the rate is a weighted sum of the intervention rates calculated separately for each unit.

These results assume no pair-wise correlation among the MRSA rates within each unit (rates from one month to the next are independent) (Tables 2 and 4). As rates are fitted over time, any correlations will be accounted for by the terms in the model(s) that involve time, T and T*. Monthly overlap of patients or seasonal effects would be unusual, but, if found, the model should be modified accordingly. Any correlation would manifest itself after the residuals have been fitted and, at most, the first-order autocorrelation would be dominant. When the size of this correlation is examined via the Durbin–Watson statistic, non-significance should be sufficient to exclude any correlation, indicating that any correlation between successive observations is negligible and accounted for by the Poisson model.

Full interaction model parameterization

The pooled analysis may be thought of as equivalent to a single model including interactions of unit with the variables of interest [intercept, time (T), intervention (I) and post-intervention time (T*)] over the units. If the number of units are few, we can incorporate the models for each unit by fitting a single model that adds to all the main effects (T, I, Ui, T*) interaction terms with unit. For the stacked data, this ‘full’ model can be written as:

Model 3: ln(λ) = β0 + β1T + β2I + β3T* + γ1U2 + γ2U3 + δ1U2T + δ2U3T + δ3U2I + δ4U3I + δ5U2T* + δ6U3T*.

This model has 12 parameters and, via the interactions, the effects in each unit are modelled separately. The interaction effects δ1, …, δ6 are obtained by multiplying, in the stacked dataset, the U2 and U3 columns by T, I, and T*, respectively (Table 5).

Table 5.

Full parameterized interaction model: Poisson and logistic regression

Variable Poisson regression Logistic regression
RR 95% CI P OR 95% CI P
T 1·034 0·983–1·091 0·19 1·036 0·983–1·091 0·19
I 0·737 0·341–1·591 0·44 0·736 0·340–1·595 0·44
T* 0·953 0·904–1·05 0·08 0·953 0·904–1·005 0·08
U2 2·048 0·730–5·751 0·17 2·052 0·727–5·789 0·77
U3 1·910 0·826–4·419 0·13 1·913 0·823–4·449 0·13
U2 × T 0·971 0·918–1·027 0·31 0·971 0·917–1·028 0·31
U3 × T 0·965 0·916–1·017 0·18 0·965 0·915–1·017 0·18
U2 × I 0·877 0·265–2·904 0·83 0·876 0·264–2·912 0·83
U3 × I 0·793 0·334–1·883 0·79 0·978 0·333–1·889 0·60
U2 × T* 1·015 0·953–1·081 0·65 1·015 0·953–1·081 0·65
U3 × T* 1·046 0·990–1·105 0·11 1·046 0·990–1·106 0·11

RR, Rate ratio; CI, confidence interval; OR, odds ratio.

Wald test for testing H0: δ5 = δ6 = 0 (second order) interaction term P = 0·10.

Rate = exp(Inline graphic) where Inline graphic is the estimated regression coefficient of the variable listed in column 1, for example, for the variable I, the rate = 0·7367 = exp(Inline graphic) where Inline graphic = ln (0·7367) = −0·3056.

As with the previous models, we can determine the effects for the individual units:

  • Unit A: U2=U3 = 0,

    T < 25: ln(λ) = β0 + β1T

    T = 25: ln(λ) = (β0 + β2) + (25β1 + β3)

    T > 25: ln(λ) = (β0 + β2 − 24β3) + (β1 + β3)T.

  • Unit B: U2 = 1, U3 = 0,

    T < 49: ln(λ) = (β0 + γ1) + (β1 + δ1)T

    T = 49: ln(λ) = (β0 + β2 + γ1 + δ3) + 49(β1 + δ1) + (β3 + δ5)

    T > 49: ln(λ) = (β0 + β2 + γ1 + δ3) − 48(β3 + δ5) + (β1 + β3 + δ1 + δ5)T.

  • Area C: U2 = 0, U3 = 1,

    T < 70: ln(λ) = ( β0 + γ2)+ (β1 + δ2)T

    T = 70: ln(λ) = (β0 + β2+ γ2 + δ4) + 70(β1 + δ2) + (β3 + δ6)

    T > 70: ln(λ) = (β0 + β2 + γ2 + δ4) − 69(β3 + δ6) + (β1 + β3 + δ2 + δ6)T.

The effects (in terms of ‘rates’) for the individual units are obtained by exponentiating the sum of the appropriate coefficients in the model. The effects for the reference unit (unit A) are determined from the coefficients β0, β1, β2, and β3. The effects for the various components for the other units are obtained by exponentiating the sum of coefficients involving appropriate γ's and δ's in addition to β's in the model. For example, the (immediate) intervention effect for unit B is obtained by adding the estimates for β2 and δ3 and then exponentiating the result. For this dataset, we have: Inline graphic (=ln 0·7367 from Table 5) and Inline graphic (=ln 0·7931 from Table 5) yielding an estimate of the intervention effect in unit B as exp(−0·5371) = 0·5844. This result can also be obtained as the product of the rates: 0·7367∗0·7931.

The advantage of model 3 is that it allows testing for equality of effects for the different components (T, I, T*) by testing δ1 = δ2 = 0, δ3 = δ4 = 0, and δ5 = δ6 = 0, respectively. We first test δ5 = δ6 = 0. The term T* is the interaction of T and I (with an offset of 24, 48, or 69 months, depending on the unit). Therefore, U2T* and U3T* are second (or higher)-order interaction terms, and the convention is to test whether these are significant before performing tests for the first (lower)-order interaction. If we accept H0: δ5 = δ6 = 0, then the tests H0: δ1 = δ2 = 0 and H0: δ3 = δ4 = 0 can be performed. For this test, P = 0·30, and we can proceed to test the interaction effects for T and I, i.e. U2T, U3T, U2I, and U3I. The tests of H0: δ3 = δ4 = 0 and H0: δ1 = δ2 = 0 yield P = 0·23 and P = 0·70, respectively, and so model 2 is preferred to model 3.

Apart from numerical fitting problems, such comprehensive (stacked) models have the drawback of using the statistical analyses to guide the questions and the interpretation of the clinical mechanisms. A more prudent approach is to use the data or model to address prespecified hypotheses. As the number of parameters increases, so does the potential for type I errors. Nevertheless, in this instance, the fully parameterized model suggests model 2 is a satisfactory representation of the data.

Assessing model accuracy and potential correlations

An important consideration in fitting any if the proposed models is the assurance that the data are adequately represented by a Poisson model. While numerous approaches exist to assess the adequacy of the Poisson fit, two methods can readily provide a guide as to whether the fitted models are appropriate. The deviance (D) from the model fit is a quantity derived from the model likelihood and has an (asymptotic) χ2 distribution with (np) degrees of freedom (n being the number of observations in the model and p the number of parameters fitted, including the constant). If Pr(D > χnp2 > 0·05 we conclude that there is insufficient evidence to reject the hypothesis that the Poisson model is adequate. A second quantity which can be calculated is the dispersion which, in the case of a Poisson model measures the relationship between the mean and the variance of the model, i.e. var(Yi) = σ2E(Yi). As the mean and the variance for the Poisson model are equal, a dispersion value greatly different from unity will indicate inadequacy of the Poisson model assumption. Of the different methods used for estimating the dispersion parameter, that based on Pearson residuals is the more common. Most computer packages provide estimates of the deviance and dispersion in their output. In our dataset, there was no evidence, based on the deviance that the Poisson model was inadequate for all models fitted. The maximum value of the dispersion parameter was <1·15 for all models suggesting that, for this dataset, the Poisson assumption was appropriate.

We can also obtain an estimate of the correlation between the different units to examine the impact of the intervention in the different units. This can be achieved by fitting the model ln(λ) = β0 + β1T + β2I + β3T* and treating the unit variables U1, U2, U3 as repeated measures in a generalised estimating equations (GEE) model with a Poisson link and compound-symmetric correlation structure [24]. For our data, the magnitude of this correlation was 0·009 indicating independence within and across units. This analysis is not appropriate in models 2 and 3 as unit effects are being formally estimated.

Poisson or logistic regression?

If the rates are low, the odds ratio will estimate the relative risk, and fitting a logistic regression may be just as useful and less complicated. For our data, a ‘success’ is a MRSA case, the number of successes is the number of cases in a month, and the exposure time is the total number of patients at risk in each month. The data structure is identical to that of the stacked data in Table 3. The model is fitted to the number of cases each month out of the total exposure time (patient-days) in each month (Table 5). There is no appreciable difference between the coefficients and 95% confidence intervals in the Poisson and logistic models. While logistic regression provides estimates of the odds ratio, a binomial model with a log-link (rather than logistic-link) will yield relative risks as opposed to odds ratios. However, these models are still the subject of research and can suffer from numerical instability when being fitted and yield expected probabilities greater than unity [25].

Long-term follow-up

If the post-intervention series is measured over a long period (e.g. ⩾60 months), the model for a single reduction in the slope after the intervention may not be the most appropriate choice over the whole series (Fig. 4).

Fig. 4.

Fig. 4.

Representation of a model with a post-intervention threshold. ln(λ) = β0 + β1T + β2I3T* + β4T†.

If, after MRSA incidence rates decrease to some threshold then cease to decrease, a linear representation of the total post-intervention experience would underestimate the effect of the intervention. In this instance, the model would be of the form:

Model 4: ln(λ) = β0 + β1T + β2I + β3T* + β4T†,

graphic file with name S0950268812000179_eqnU50.jpg

T = time in months, I = intervention status,

graphic file with name S0950268812000179_eqnU51.jpg
graphic file with name S0950268812000179_eqnU52.jpg

In this model, β0 represents the baseline MRSA rate; β1 is the slope of line before the intervention (t0); β2 is the change in the MRSA rate just after t0; β1 + β3 is the slope after t0, and β3 is the change in slope of line after t0 but before t1, a predetermined time after which it is assumed that the rate has attained a threshold and is stable. Similarly, β4 is the change in slope of the line after t1 and β1 + β4 is the slope after t1.

DISCUSSION

In setting up an interrupted time-series model, we considered individual units in separate models, and used a single model allowing for statistical adjustment for each unit. Modelling individual units provides individual estimates, and so allows for evaluation of local practices and health policies. To estimate overall effects, individual estimates can be pooled, with modification to account for heterogeneity among the units. The key assumption in obtaining overall effects is the independence of the effects in individual units. Using a random-effects adjustment when pooling estimates is analogous to fitting a multilevel model, since both approaches assume the estimates are themselves drawn from a normal distribution with some mean and variance. A drawback of the pooled approach is that as the number of individual units increases, so does the number of parameters requiring estimation and then combination; when this happens, overdispersion may arise. If there are many hospitals, the random-effects component can be modified to adjust for possible heterogeneity across the hospitals by pooling across the units within each hospital using the fixed-effects approach and then using the random-effects adjustment to pool the effects over all the hospitals.

The second approach is to consider all the data simultaneously in a stacked model. This has the advantage of explaining all intervention effects in a single model. If population-level information (such as state or regional health policy information) is available, a multilevel approach may allow inclusion of information available at different levels of the hierarchy. Such models can be complex, requiring care in interpreting the coefficients. The stacked approach can also allow for random effects across different units or different hospitals (or both) in a mixed model. Such models allow extra variability due to a unit's effect to be included in the analysis. This may also be achieved in the pooled approach.

If there is good evidence of overflow into other months (i.e. patients who stay in wards for a long time and thus are at risk over a number of months), a different modelling strategy will be required, for example taking a longer period (say, 3 months) to minimize the effect of such overflow, or assuming a particular correlation structure as part of the model.

Any residual serial correlation from the fitted model can be examined via the Durbin–Watson statistic. If this statistic suggests significant autocorrelation, either extra terms related to time (some form of autoregressive model) may need to be investigated, or some form of differencing to reduce the serial correlation should be considered. The question that remains is whether these should be incorporated in the analysis of all the units or just the units exhibiting the serial correlation. If there is evidence of strong seasonal effect(s) and/or high autocorrelation present, then the modelling approach along the lines described by Fernández-Pérez et al. [19] may be more appropriate.

CONCLUSION

We developed a model to examine the immediate and longer-term effects of a MRSA intervention programme in different units of the same hospital. Where feasible, a model adjusting for the unit effect should be fitted, or if there is strong evidence of heterogeneity between the units, an analysis incorporating a random effect for units may be appropriate.

DECLARATION OF INTEREST

None.

REFERENCES

  • 1.Klevens R, et al. Invasive methicillin-resistant Staphylococcus aureus infections in the United States. Journal of the American Medical Association 2007. 298: 1763–1771. [DOI] [PubMed] [Google Scholar]
  • 2.Ellingson K, et al. Sustained reduction in the clinical incidence of methicillin-resistant Staphylococcus aureus colonization or infection associated with a multifaceted infection control intervention. Infection Control and Hospital Epidemiology 2011. 32: 1–8. [DOI] [PubMed] [Google Scholar]
  • 3.Brown C, Lilford R. The stepped wedge trial design: a systematic review. BMC Medical Research Methodology 2006; 6: 54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Smith P. Splines as a useful and convenient statistical tool. American Statistician 1979; 33: 57–62. [Google Scholar]
  • 5.Kim H, et al. Comparability of segmented line regression models. Biometrics 2004; 60: 1005–1014. [DOI] [PubMed] [Google Scholar]
  • 6.Kim H, et al. Permutation tests for joinpoint regression with application to cancer rates. Statistics in Medicine 2000; 19: 335–351. [DOI] [PubMed] [Google Scholar]
  • 7.Cleveland W, Devlin S. Locally weighted regression: an approach to regression analysis by local fitting. American Statistician 1988; 83: 596–610. [Google Scholar]
  • 8.Wagner AK, et al. Segmented regression analysis of interrupted time series studies in medication use research. Journal of Clinical Pharmacy and Therapeutics 2002; 7: 299–309. [DOI] [PubMed] [Google Scholar]
  • 9.Matowe L, et al. Interrupted time series analysis in clinical research. Annals of Pharmacotherapy 2003; 37: 1110–1116. [DOI] [PubMed] [Google Scholar]
  • 10.Shardell M, et al. Statistical analysis and application of quasi experiments to antimicrobial resistance intervention studies. Clinical Infectious Diseases 2007; 45: 901–907. [DOI] [PubMed] [Google Scholar]
  • 11.Gillings D, Makuc D, Sigel E. Analysis of interrupted time series mortality trends: an example to evaluate regionalized perinatal care. American Journal of Public Health 1981; 71: 38–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Madden J, et al. Effects of a law against early postpartum discharge on newborn follow-up, adverse events and HMO expenditures. New England Journal of Medicine 2002; 347: 2031–2038. [DOI] [PubMed] [Google Scholar]
  • 13.Ross-Degnan D, et al. Examining product risk in context. Market withdrawal of zomepirac as a case study. Journal of the American Medical Association 1993; 270: 1937–1942. [PubMed] [Google Scholar]
  • 14.Soumerai SB, et al. Payment restrictions for prescription drugs under Medicaid: effects on therapy, cost, and equity. New England Journal of Medicine 1987; 317: 550–556. [DOI] [PubMed] [Google Scholar]
  • 15.Mol P, et al. Improving compliance with hospital antibiotic guidelines: a time-series intervention analysis. Journal of Antimicrobial Chemotherapy 2005; 55: 550–557. [DOI] [PubMed] [Google Scholar]
  • 16.Haung S, et al. Impact of routine intensive care unit surveillance cultures and resultant barrier precautions on hospital-wide methicillin-resistant Staphylococcus aureus bacteremia. Clinical Infectious Diseases 2006; 43: 971–978. [DOI] [PubMed] [Google Scholar]
  • 17.Bosso J, Mauldin P. Using interrupted time series to assess associations of fluoroquinolone formulary changes with susceptibility of gram-negative pathogens and isolation rates of methicillin-resistant Staphylococcus aureus. Antimicrobial Agents and Chemotherapy 2006; 50: 2106–2112. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Biglan A, Ary D, Wagenaar A. The value of interrupted time-series experiments for community intervention research. Prevention Science 2000; 1: 31–49. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Fernández-Pérez C, Tejada J, Carrasco M. Multivariate time series analysis in nosocomial infection surveillance: a case study. International Journal of Epidemiology 1998; 27: 282–288. [DOI] [PubMed] [Google Scholar]
  • 20.Feng PJ, et al. Clinical incidence of methicillin-resistant Staphylococcus aureus (MRSA) colonization or infection as a proxy measure for MRSA transmission in acute care hospitals. Infection Control and Hospital Epidemiology 2011; 32: 20–25. [DOI] [PubMed] [Google Scholar]
  • 21.DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials 1986; 7: 177–188. [DOI] [PubMed] [Google Scholar]
  • 22.Egger M, Davey Smith G, Altman D. Systematic Reviews in Health Care. Meta-analysis in Context. London: BMJ Books, 2001. [Google Scholar]
  • 23.Bhargava A, Franzini L, Narendranathan W. Serial correlation and the fixed effects models. Review of Economic Studies 1982; 49: 533–549. [Google Scholar]
  • 24.Hardin J, Hilbe J. Generalized Estimating Equations. London: Chapman and Hall/CRC London, 2003. [Google Scholar]
  • 25.Marschner I, Gillett A. Relative risk regression: reliable and flexible methods for log-binomial models. Biostatistics 2012; 13; 179–192. [DOI] [PubMed] [Google Scholar]

Articles from Epidemiology and Infection are provided here courtesy of Cambridge University Press

RESOURCES