. 2022 Mar 30;10(2):1069–1077. doi: 10.1016/j.cstp.2022.03.023

Modeling the effects of the governmental responses to COVID-19 on transit demand: The case of Athens, Greece

Marios Giouroukelis a,, Stella Papagianni b, Nellie Tzivellou b, Eleni I Vlahogianni a, John C Golias a
PMCID: PMC8964442  PMID: 35371920

Abstract

Short-term demand forecasting is essential for the public transit system, allowing for effective operations planning. This is especially relevant in the highly uncertain environment created by the SARS‑CoV‑2 pandemic. In this paper, we attempt to develop accurate prediction models of transit ridership in Athens, Greece, using Autoregressive Fractionally Integrated time series models enhanced with SARS‑CoV‑2-related exogenous variables. The selected exogenous variables are, on the one hand, the ratio of weekly SARS‑CoV‑2 infections over the infections 3 weeks before (capturing the dynamics of the pandemic, as a proxy for fear of transmitting the disease while commuting), and on the other hand, an index of the stringency of the government’s SARS‑CoV‑2-related measures and regulations. The developed ARFIMAX models have been fitted separately on bus and metro ridership data and yield comparable and statistically significant results. In both models, the exogenous variables prove to be statistically significant and their values are intuitive, suggesting a linear interrelation between them and transit ridership.

Keywords: Transit demand, COVID-19, Forecasting, ARFIMAX, Fractional integration

1. Introduction

The outbreak of SARS‑CoV‑2 in 2020 has had severe impacts on urban economies and activities, with transit systems around the world witnessing a significant decline in ridership. As a result, recent literature has extensively studied the effect of SARS‑CoV‑2 on urban public transit. While the analysis and quantification of the danger of transmitting the coronavirus during commuting is probably the biggest issue on the topic, many studies have focused on how the disease affects the supply and demand levels for public transit. On the demand side, analysis of ridership and congestion data has helped quantify the impacts of the SARS‑CoV‑2 situation in various cities such as A Coruña (Orro et al., 2020) and Santander (Aloi et al., 2020) (Spain), Baltimore (USA) (Ahangari et al., 2020) and Sapporo (Japan) (Arimura et al., 2020), and in various systems, such as public transit networks and Mobility-as-a-Service (MaaS) platforms (Lei et al., 2020). On the supply side, this novel situation elicited a wide range of responses from operators, which have been extensively reviewed by (among others) Gkiotsalitis and Cats (2021). International organizations such as the International Transport Forum (ITF) (ITF, 2021) and the International Association of Public Transport (UITP) (UITP, 2020) have also published reports proposing conventional and innovative measures that can minimize the impacts of the pandemic on public transit (Fig. 1).

Fig. 1. Part of the Athens Transit network system.

Balancing demand and supply is a core issue of public transit operations, especially in the highly uncertain context of the SARS‑CoV‑2 pandemic, which calls for efficient planning from operators and transit authorities. This starts by understanding travel demand patterns and their predictability, which, in turn, enables informed decision-making regarding planning and day-to-day management of the network operations (e.g. scheduling) and infrastructure (e.g. dedicated bus lanes). These characteristics are crucial since they further determine the efficiency, reliability, and attractiveness of public transit services (Li et al., 2020). In this context, forecasting models can provide planners with reliable forecasts of future ridership. For example, reliable demand forecasts can lead to informed decisions regarding bus scheduling and fleet & vehicle size, which can in turn minimize operation costs and improve bus service quality by allowing efficient allocation of limited operator resources (Tirachini, 2013). Demand forecasts are also a key input to a wide range of secondary analyses that typically interest operators, such as environmental impact assessments and cost-benefit analyses (Næss et al., 2014).

Responding to this need for accurate and reliable transit demand forecasts, we develop a ridership forecasting model for the public bus network of Athens during the SARS‑CoV‑2 pandemic by leveraging the principles of time series analysis under the generalized framework of Autoregressive Fractionally Integrated Moving Average coupled with exogenous variables (ARFIMAX). The framework has already proven its pertinence regarding the analysis of transportation-related timeseries and it is used in many recent applications in the domain. The exogenous variables reflect the dynamics of the pandemic and the stringency of the government’s SARS‑CoV‑2-related measures and regulations. Thus, on the one hand we present and analyze ridership patterns in the bus and metro networks of Athens during the pandemic, and, on the other hand, we determine and quantify ridership fluctuations which are directly attributed to pandemic indicators. We also aim to study both bus and metro demand and their predictability and address some very important methodological issues, such as how the selection of the temporal resolution of demand and the long-term dependence may affect the predictability.

The remainder of this paper is organized as follows. Section 2 provides information on the state of the literature regarding demand models, focusing on short- and medium-term models. Section 3 presents the methodological approach of the paper. Section 4 describes the available data. Section 5 summarizes the results of the paper. Section 6 discusses the results, provides directions for future research, notes the possible limitations of the model, and concludes this study.

2. Literature review

From the onset of the pandemic, people's mobility and public transport in particular have attracted the attention of decision makers, because of the possible associated increased risk of SARS‑CoV‑2 transmission (Tirachini and Cats, 2020). Several scientific articles proceed to an analytical evaluation of the effects of the varied governmental responses, by correlating observed changes in passenger mobility behavior with the dynamics of pandemic-related indicators (cases, mortality rate) and/or the relevant mitigation measures. Such research can have measurable impacts in designing novel policies. Objective quantification of the policy implications can help steer future policies to much more sustainable paths and provide feedback and increased understanding of the situation for decision makers. Kartal et al. (2021) examined through statistical tests the relationship between mobility (based on Apple and Google data) and the SARS‑CoV‑2 pandemic in Turkey and showed that there is an econometric causality between mobility indicators (mobility of grocery, park, residential, retail, and workplace) and pandemic indicators (cases, deceased). Tokey (2021) studied the spatial associations of mobility and SARS‑CoV‑2 infection rate in the USA using mobile phone location data and concluded that the number of cases was negatively correlated to human mobility (using a miles-traveled-per-person indicator) and the number of out of country trips. Cazelles et al. (2021) related trends in the transmission of SARS‑CoV‑2 to retail/recreation and public transport mobility during non-lockdown periods by examining the statistical correlations between time series of SARS‑CoV‑2-related cases and Google mobility data on selected locations. Finally, Wielechowski et al. (2020) conducted a one-factor analysis of variance (ANOVA) to correlate mobility in public transport with anti-SARS‑CoV‑2 regulation policy in Poland. They concluded that the stringency of the governmental response (proxied by the Oxford Covid-19 Government Response Tracker) has contributed to social distancing in public transport in Poland and that restrictions induced a decrease in mobility.

Regarding public transit forecasting, numerous models have been proposed in the literature that, depending on their scope and methods, can be divided into long-term, and short- to medium-term models. Long-term models usually include in their modelling various socio-economic factors (Brinchi et al., 2020), recreating analytically the multidimensional relationships between them and transport ridership. A recent example is Brinchi et al. (2020), who monitored and described the impact of SARS‑CoV‑2 on the mobility trends in Rome, developed future socioeconomic scenarios and ran simulations using a variant of the four-step travel model.

Short- and medium-term demand models on the other hand mainly involve discretizing time into fixed intervals and predicting demand using observations from past intervals. In transportation modeling, they are frequently applied in road traffic applications (mainly in order to assess flows and travel times), and in similar public transit applications (Chandakas, 2020). They are especially relevant for bus lines and networks, which typically exhibit more degrees of freedom than train networks, and are necessary for effective bus operations management (Ceder et al., 2013). Depending on the scope of the analysis, the format of the input and the desired output, the size of the time window can vary from hours to a year.

Short- and medium-term forecasting models can be further divided into parametric and non-parametric models. The application of non-parametric models to public transport forecasting has been a recent development in the field, fueled by advances in Artificial Intelligence and Machine Learning. Most applications use some variation of an artificial neural network, while other techniques include support vector machines, multiscale radial basis functions, k-nearest neighbor pattern matching and others (Chen et al., 2020). Recent applications of Machine Learning in travel demand forecasting include Lin et al. (2018), who combined Graph Convolutional Neural Networks with Data-driven Graph Filter (GCNN-DDGF) models to predict station-level hourly demand in a large-scale bike-sharing network. Fathi-Kazerooni et al. (2020) correlated turnstile usage data of the New York City subway system to SARS‑CoV‑2 deaths and cases using a long short-term memory neural network. More recently, Rodríguez González et al. (2021) developed forecasting models of future public transport demand of different networks based on regression techniques such as Gaussian Process Regressors and Regression trees, using smart card data and a Bluetooth traffic monitoring network. The advantage of non-parametric methods is their prediction accuracy and their flexibility in applications which include complex datasets where non-linearities or missing data are present. However, they lack the explanatory power of statistical (parametric) models, being often treated as “black-boxes” by researchers (Karlaftis and Vlahogianni, 2011).

Parametric models on the other hand have been the dominant approach in the field for quite some time, because of their intuitiveness, simplicity and well-developed mathematical background. Their fitness has been proven in modeling applications where there is already underlying knowledge of the functional relationships between the variables of the problem, and in problems where the causalities need to be explored through an interpretable model (Karlaftis and Vlahogianni, 2011). Frequently used parametric models include historical average and smoothing techniques (Rodríguez González et al., 2021), as well as various time series forecasting techniques for analyzing patterns of events spanning a sequence of time. Time series models are particularly suited to public transit ridership forecasting: being free from problems like omitted variables, cause and effect biases, misspecifications and other problems (Keck et al., 2010), they offer a comprehensive econometric method for predicting ridership while bypassing the complicated interplay of the socioeconomic factors that affect it. Future events can be predicted by analyzing the trends of the past, provided that future trends remain similar to historical trends (Box et al., 2015). Thereby, time series models are based on the concept of stochasticity (the nature of the time series may be due to a random external cause operating in the system, the dynamics of which are described by either linear or non-linear rules). In addition, linear analysis is based on the probabilistic theory of normal processes, which is easily understandable, and the statistical inference for linear models is fully developed (e.g., parametric predictive confidence intervals). This makes time series models easy to understand and to use.

Among all approaches, the Autoregressive Integrated Moving Average (ARIMA) family of models is one of the most common ones in time series forecasting. ARIMA forecasting has proven to be satisfactory for many kinds of transport data, in cases where time series show strong fluctuations with the effect of external randomness (Karlaftis and Vlahogianni, 2009) and has been overwhelmingly used in transportation-related applications. These include recently published works on passenger demand forecasting with seasonal ARIMAX models (Gong et al., 2014, Tao et al., 2018, Virati et al., 2020), metro ridership with dual ARIMA-GARCH models (Ding et al., 2017), online car-hailing services demand forecasting with ARIMAX models (Zhou et al., 2021), and hybrid ARIMA-neural network models for bus demand prediction (Chiang et al., 2011). A closer look at these applications shows that, with the exception of the last paper (which is somewhat less recent), demand is proxied by validation data from smart cards. Recently, Hu et al. (2021) reviewed open data sources and applications of human mobility data during the pandemic. They have shown that a large proportion of the various types of demand models rely on automatically collected validation data from smart cards, turnstiles and the like.

3. Autoregressive fractionally integrated moving average models in transportation science

The Autoregressive Integrated Moving Average models with exogenous variables, ARIMAX (p, d, q, r), are the generalized form of the stationary autoregressive moving average models ARMAX (p, q, r), which are the most typical form of stochastic linear time series models, expressed by the following formula:

$$X_t = \theta_0 + \varepsilon_t + \sum_{i=1}^{p} k_i X_{t-i} + \sum_{i=1}^{q} \lambda_i \varepsilon_{t-i} + \sum_{m=1}^{r} \beta_m \chi_t^{m} \qquad (1)$$

where $X_t, X_{t-1}, \ldots, X_{t-p}$ refer to the observations in the measurement periods t, (t-1), …, (t-p), p is the number of lags, $k_i$ are the autoregressive coefficients of the model, $\theta_0$ is a fixed term, $\varepsilon_t$ is the regression error at period t, q is the number of regression error lags, $\lambda_i$ are the moving average coefficients, and $\chi_t^{m}$ holds the values of the r exogenous, time-varying predictors at time t, with coefficients denoted $\beta_m$. A common way to determine the parameters p and q is based on the partial autocorrelation function (PACF) and the autocorrelation function (ACF) respectively (Shumway & Stoffer, 2017).
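As an illustration of how the terms of Eq. (1) combine, the following minimal Python sketch evaluates the deterministic part of the equation (everything except the current shock εt). The function name `armax_one_step` and the toy numbers are hypothetical and chosen only for illustration; they are not estimates from this study.

```python
def armax_one_step(theta0, k, lam, beta, x_lags, eps_lags, exog_t):
    """Deterministic part of Eq. (1): the value of X_t before eps_t is added.

    k[i]    multiplies x_lags[i]   = X_{t-1-i}    (AR part, p = len(k))
    lam[i]  multiplies eps_lags[i] = eps_{t-1-i}  (MA part, q = len(lam))
    beta[m] multiplies exog_t[m]                  (exogenous part, r = len(beta))
    """
    if len(k) != len(x_lags) or len(lam) != len(eps_lags) or len(beta) != len(exog_t):
        raise ValueError("coefficient and input lengths must match")
    ar_part = sum(ki * xi for ki, xi in zip(k, x_lags))
    ma_part = sum(li * ei for li, ei in zip(lam, eps_lags))
    exog_part = sum(bm * cm for bm, cm in zip(beta, exog_t))
    return theta0 + ar_part + ma_part + exog_part


# Toy values (hypothetical): p = 1, q = 1, r = 2.
pred = armax_one_step(theta0=10.0, k=[0.5], lam=[0.2], beta=[-1.0, 2.0],
                      x_lags=[4.0], eps_lags=[1.0], exog_t=[3.0, 0.5])
```

Setting `lam=[]` and `eps_lags=[]` removes the moving average part, which gives a purely autoregressive model with exogenous inputs.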

The incorporation of exogenous variables can extend the modeling capabilities of the traditional ARIMA model and lead to more satisfactory predictions (Wang et al., 2021). The exogenous-explanatory variables allow for increased accuracy by including non-temporal variables whose cause is external to the model and explain changes in the timeseries which cannot be attributed to the normal evolution of the studied phenomena. Exogenous variables can also increase the interpretability of the forecasts, attributing changes in the output to the corresponding changes in the input. Common exogenous variables in forecasting ridership in public transit means include weather events and socioeconomic indicators (Ngo, 2019).

ARIMA models differ from Multiple Linear Regression models and it has been suggested that classical regression models are often insufficient for explaining all the dynamics of a time series (Nelson, 1998). Indeed, while MLR models work exclusively with observed variables, the ARIMA framework also incorporates unobserved variables (namely the lagged error terms). Thus, while the AR terms can be estimated using OLS, the same is not true for the MA terms.

In most cases, applications of the ARIMA models treat time series either as covariance stationary, suggesting time-independence of 1st and 2nd moments (short-term memory), or as possessing a stochastic trend. This trend, indicative of non-stationarity, is usually eliminated by assuming an integer number of unit roots, with I(1) models being the most common for transport applications (Karlaftis and Vlahogianni, 2009). However, I(0) and I(1) model structures do not consider the possibility of long-term memory in the time series, which shows up as slowly decaying autocorrelation between the dependent variable and observations lagging far behind. Requiring an integer degree of differencing can lead to over-differencing of the series and to erroneous results through an inflation of the moving-average component of the model structure (Granger and Joyeux, 1980). Instead, a fractional degree of differencing, which can take any value in the (-1, 1) region, can account for the correct degree of persistence in the time series, treating both long- and short-term memory phenomena (Karlaftis and Vlahogianni, 2009). The extended ARFIMA framework responds to the need to handle long-memory processes in ARIMA models through fractional integration (FI) and accounts for variability in both the mean and the variance (Granger and Joyeux, 1980, Hosking, 1981). The degree of differencing, represented by the value of parameter d, is associated with specific properties of the time series, depending on its value (Odaki, 1993).
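To make the idea of a fractional degree of differencing concrete, the sketch below computes the weights of the standard binomial expansion of the operator (1 − B)^d, where B is the backshift operator, and applies them to a series. This is an illustrative sketch only; the helper names are hypothetical and this is not the estimation procedure used in this study.

```python
def frac_diff_weights(d, n_weights):
    """Weights w_k of (1 - B)^d = sum_k w_k B^k, via w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d):
    """Apply (1 - B)^d to a series, truncating the expansion at the start of the sample."""
    w = frac_diff_weights(d, len(series))
    return [sum(w[j] * series[t - j] for j in range(t + 1))
            for t in range(len(series))]


# d = 1 gives ordinary first differences; d = 0 leaves the series unchanged.
first_diff = frac_diff([1.0, 3.0, 6.0, 10.0], 1)
# For 0 < d < 1 the weights decay slowly instead of cutting off after one lag.
half_weights = frac_diff_weights(0.5, 3)
```

With d = 1 the weights reduce to 1, −1, 0, 0, …, which is ordinary first differencing, while for a fractional d the weights die out slowly and thereby keep information from observations far in the past.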

Choosing the correct degree of differencing by identifying long memory patterns in the time series can also be coupled with formal unit root tests. The most common ones include the Augmented Dickey-Fuller (ADF) test, the Phillips–Perron (PP) test and the KPSS (Kwiatkowski, Phillips, Schmidt, and Shin) test. In the first two, the null hypothesis is that the time series contains a unit root and should be differenced once, i.e. treated as I(1), to reach stationarity. In contrast, the KPSS test evaluates the opposite null hypothesis, namely that the process is I(0) stationary and there is no need for differencing. A joint consideration of the results of the PP and the KPSS tests can provide useful information regarding the degree of integration (Baillie, 1996).

  • Rejection of both statistics indicates a fractionally integrated process.

  • Failure to reject both statistics indicates insufficient information regarding the degree of differencing.

  • Rejection of the PP statistic and failure to reject the KPSS statistic indicates a covariance stationary process.

  • Failure to reject the PP statistic and rejection of the KPSS statistic indicates a unit root I(1).
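The four-way decision rule above can be written as a small lookup. The sketch below is a minimal illustration: the function name and the 5% significance level are assumptions, and the example call uses the bus network p-values reported in Table 2.

```python
def classify_integration(pp_p_value, kpss_p_value, alpha=0.05):
    """Joint reading of the PP and KPSS tests (alpha is an assumed significance level)."""
    pp_rejects = pp_p_value < alpha      # PP null hypothesis: unit root, I(1)
    kpss_rejects = kpss_p_value < alpha  # KPSS null hypothesis: stationarity, I(0)
    if pp_rejects and kpss_rejects:
        return "fractionally integrated"
    if not pp_rejects and not kpss_rejects:
        return "insufficient information"
    if pp_rejects:
        return "covariance stationary, I(0)"
    return "unit root, I(1)"


# Bus network p-values from Table 2: neither null hypothesis is rejected.
bus_outcome = classify_integration(0.104, 0.554)
```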

4. Implementation and findings

4.1. Available data

The available data consists of time series of smart ticket validations data on the public transit network in Athens, Greece during the pandemic-affected period of January 2020 – July 2021. Public transit demand displays a relatively consistent behavior during the year, characterized by cyclic behavior at various levels (yearly, weekly, daily etc.). The SARS‑CoV‑2 situation introduces a significant rupture in this stability: phenomena external to the normal fluctuations of public transport demand and exclusively related to the pandemic have resulted in a significant decrease in ridership, though without completely altering its form. Properly analyzing this phenomenon calls for a model which can handle both time series data and the externalities introduced by the pandemic.

Ticket validations have been chosen as a proxy for the passenger demand of the bus network. Since 2017, smart cards have completely replaced older paper tickets in Athens, greatly facilitating the automatic collection of network-wide validation data. For this work, smart card validation data covering bus & metro operators have been provided by the Athens Urban Transport Organization (OASA), the Public Transport Authority of Athens, which is responsible for the strategic and operational planning, co-ordination and control of urban transit carried out by public transport means in the Attica Region.

The dynamics of the pandemic are taken into account in the model by calculating the ratio of SARS‑CoV‑2 infections of week t over the infections of week t-3 (hereinafter referred to as “Cases Ratio”). The use of a ratio instead of an absolute value reflects the hypothesis that ridership is affected more by the evolution of the pandemic than by its absolute level. SARS‑CoV‑2 case data for Greece are reported daily by the National Public Health Organization (NPHO). The data are easily accessible online through NPHO’s site or the respective API.
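The Cases Ratio definition can be sketched in a few lines of Python. The function name is hypothetical, and the handling of weeks where the ratio is undefined (the first three weeks, or a zero denominator) is an assumption made for this illustration, since the text does not specify it.

```python
def cases_ratio(weekly_cases, lag=3):
    """Ratio of infections in week t over infections in week t - lag."""
    ratios = []
    for t, current in enumerate(weekly_cases):
        if t < lag or weekly_cases[t - lag] == 0:
            # Assumption: mark the ratio as undefined instead of guessing a value.
            ratios.append(None)
        else:
            ratios.append(current / weekly_cases[t - lag])
    return ratios


# Hypothetical weekly infection counts, for illustration only.
example = cases_ratio([100, 120, 150, 200, 60, 0])
```

A ratio above 1 means that infections are higher than three weeks earlier (the pandemic is accelerating), while a ratio below 1 indicates a receding wave.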

Data on the stringency level of Greece’s SARS‑CoV‑2-related measures and restrictions are part of the data collected by the University of Oxford's Government Response Tracker (OxCGRT), which tracks on a daily basis the dynamics of the pandemic and the respective response measures in 180 countries (Hale et al., 2020). The Stringency Index takes on values from 0 to 100 and is computed as the average of the individual component indicators that relate to the following containment and closure policies and health system policies undertaken by governments during SARS‑CoV‑2:

  • Closings of schools and universities

  • Closings of workplaces

  • Cancelling public events

  • Limits on gatherings

  • Closing of public transport

  • Orders to “shelter-in-place” and otherwise confine to the home

  • Restrictions on internal movement between cities/regions

  • Restrictions on international travel

The response of the Greek government across the above-mentioned criteria is represented schematically in Fig. 2, while the index value for each criterion is interpreted using the codification of the OxCGRT (Hale et al., 2020) (Annex – Table 5).

Fig. 2. Greek Government Closure and Containment response to SARS‑CoV‑2.

For all sources, the data cover a span of 19 months, from January 2020 to July 2021. All data have been aggregated at the weekly level and, for every variable, we consider the Average Weekly Daily (AWD) value, i.e. the mean of the daily values within a week, for each week in the timespan of our analysis. Doing so, we eliminate the within-week seasonality that public transit validation time series typically exhibit.
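The weekly aggregation can be sketched as follows. Grouping days by ISO calendar week is an assumption (the text does not state how weeks are delimited), and the function name and daily values are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def average_weekly_daily(daily):
    """Map (ISO year, ISO week) to the mean of the daily values in that week.

    `daily` is an iterable of (date, value) pairs.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(value)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


# Two hypothetical weeks of daily validations, starting on Monday 2020-01-06.
start = date(2020, 1, 6)
values = [100] * 7 + [50] * 5 + [20] * 2  # second week: lower weekend demand
awd = average_weekly_daily((start + timedelta(days=i), v)
                           for i, v in enumerate(values))
```

Because every week contributes a single averaged value, the weekday/weekend pattern visible in the daily figures no longer appears in the aggregated series.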

The final dataset contains 83 observations, each with 3 attributes. Although these data seem limited, we believe that the data processing we follow results in a simplified representation of the temporal evolution of validations, which is not only easily captured by time series modeling, but also has far-reaching managerial implications for Transport Authorities. Table 1 illustrates the statistical characteristics of the dataset. Fig. 3 depicts graphically the parallel evolution of Average Weekly Daily Validations and the Average Weekly Daily values of the Stringency Index (left side) and the Cases Ratio (right side). A Pearson's product-moment correlation was also run to assess the relationship between the two exogenous variables. The results have shown that the correlation between the variables is not statistically significant at the 5% level, with r(81) = 0.091, p = .414.
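For reference, the correlation coefficient and its test statistic can be computed as in the sketch below, which uses the standard formulas (the function names are hypothetical). With the reported r = 0.091 and n = 83, the t statistic on n − 2 = 81 degrees of freedom is about 0.82, well below the two-sided 5% critical value of roughly 1.99, in line with the reported lack of significance.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic of a correlation coefficient, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Values reported in the text: r = 0.091 with n = 83 observations.
t_reported = t_statistic(0.091, 83)
```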

Table 1.

Statistical characteristics of the dataset.

| Variable | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|
| AWD Validations | 166,152.50 | 73,215.70 | 35,597.86 | 357,177.00 |
| AWD Cases Ratio | 1.95 | 4.73 | 0.00 | 41.86 |
| AWD Stringency Index | 60.89 | 25.43 | 0.00 | 88.89 |

Finally, we also provide some context on the public transit offer in Athens during the January 2020 – July 2021 period. During the first wave of the pandemic in Athens, and given the extremely reduced passenger traffic due to traffic restrictions (up to −90% according to the Operators’ data), the public transit offer was adjusted accordingly by implementing summer timetables for the entire network. Crucial bus lines serving hospitals were strengthened, while other services (such as the midnight metro services, airport and university service lines) were severely limited or canceled altogether. Beginning in June 2020, when all governmental movement restriction measures were lifted, and despite the fact that demand had not recovered to pre-SARS‑CoV‑2 levels, the public transport offer returned to normal and has remained so ever since, with the implementation of winter timetables on the bus, trolleybus, subway and tram networks. The bus transit offer has in fact increased in relation to its pre-pandemic levels since, during the second wave of the pandemic in Greece (starting November 2020), the public operator partnered with a private consortium of bus owners (KTEL), leading to an “injection” of 200 additional buses operating across the network. The latter has allowed for reduced headways on core bus lines and acts as one of the few examples of SARS‑CoV‑2-related policies that create long-term value for the Athenian network.

5. Results

An ARIMAX model is calibrated using time series of smart ticket validation data on the public bus network in Athens. The dependent variable of the developed ARIMAX model is a time series of average weekly validations. The independent variables were selected so as to reflect, on the one hand, the dynamics of the pandemic and, on the other hand, the stringency level of the government’s policies regarding SARS‑CoV‑2. As a robustness check, the model was also fitted on smart ticket validation data from the Athens metro system using the same independent variables. The results suggest that the model fits both bus and metro validation data well.

The parameters of the model have been determined by evaluating the form of the ACF and PACF graphs (Fig. 4). The ACF graph indicates that observation Xt is correlated with observations Xt-1, Xt-2 and Xt-3 (the autocorrelation is statistically significant up to the 3rd lag). This serial correlation between observations potentially allows model errors to affect up to the next 3 observations or, conversely, the prediction error of the model is potentially affected by the previous 3 observations. The moving average MA(q) terms correct for this type of error and, therefore, the maximum order of the MA terms of the models is set to 3 (q ≤ 3).

Fig. 3. Time evolution of smart card validations data in relation to the Stringency Index and the Cases Ratio.

Fig. 4. Autocorrelation and partial autocorrelation graph of the average weekly validations in the Athens bus network.

The PACF graph indicates that observation Xt is directly affected by observation Xt-1, with the remaining lags lying within or very close to the limits of statistical insignificance. This direct influence is captured by the autoregressive AR(p) term, and therefore the maximum number of AR terms is equal to 1 (p ≤ 1). The ACF and PACF graphs exhibit similar behavior for both the bus (Fig. 4) and metro (Fig. 5) networks.

Fig. 5. Autocorrelation and partial autocorrelation graph of the average weekly validations in the Athens metro network.

To determine the degree of integration, the time series is tested using the Phillips-Perron and Kwiatkowski–Phillips–Schmidt–Shin (KPSS) tests, following the methodology described in Section 3. Based on the result of the Phillips-Perron test, the hypothesis that the time series has a unit root cannot be rejected. The result of the KPSS test indicates that the hypothesis that the time series is stationary also cannot be rejected at the 95% confidence level. The inability to reject both hypotheses indicates a lack of information regarding the required degree of integration. In this case, where the degree of integration of the model is not easy to determine, we proceed to apply the ARFIMA model. The results indicate that parameter d is not statistically different from d = 0, and this value is adopted for the rest of the analysis. This observation holds for smart card validation data for both the bus and metro networks (Table 2).

Table 2.

Degree of integration tests results.

| Test | Metro Network | Bus Network |
|---|---|---|
| Phillips-Perron | p-value for Z(t) = 0.155 | p-value for Z(t) = 0.104 |
| KPSS | p-value for Z(t) = 0.655 | p-value for Z(t) = 0.554 |
| Portmanteau test | Prob > χ² = 0.894 | Prob > χ² = 0.978 |

All relevant parameter combinations for the ARIMAX model (p, d, q) were tested. The optimal model is the one where the parameters are statistically significant and the values of the Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC) are minimized. Based on the above, the fitted model is ARIMAX (1,0,0), which, in addition to the exogenous variables, contains one autoregressive (AR) term. The MA(q) terms did not appear to be statistically significant for any combination of parameters. The analytical expression of the optimal model is given by Eq. (3):

$$X_t = \theta_0 + \varepsilon_t + k_1 X_{t-1} + \beta_1 \chi_1 + \beta_2 \chi_2 \qquad (3)$$

where Xt-1 refers to the lagged observation (validations) in week t-1, θ0 is a fixed term, εt is the regression error at period t, and χ1, χ2 hold the values of the two exogenous, time-varying predictors (the Stringency Index and the Cases Ratio) at time t, with corresponding coefficients β1, β2. In the absence of MA terms, the final model proves to be similar to a Linear Regression Model.
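The selection rule based on information criteria can be illustrated with the textbook definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the log likelihood. In the sketch below the candidate log likelihoods and parameter counts are hypothetical; only n = 83 is taken from the dataset description.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def best_by(scores):
    """Name of the candidate with the smallest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical candidates: name -> (log likelihood, number of estimated parameters).
candidates = {"(1,0,0)": (-100.0, 5), "(1,0,1)": (-99.8, 6), "(0,0,1)": (-104.0, 5)}
n_obs = 83  # number of weekly observations in the dataset

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
```

In this toy example the extra MA term of the (1,0,1) candidate improves the log likelihood only marginally, so both criteria penalize it and favor the more parsimonious (1,0,0) specification.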

The results of the optimal model are listed in Table 3. All coefficients are statistically significant and their signs are intuitive. For the bus network, an increase of one (1) unit in the Average Weekly Daily Stringency Index leads to a reduction of the Average Weekly Daily Validations by 2,120 validations. Similarly, a one (1) unit increase in the Average Weekly Daily Cases Ratio leads to a decrease of 1,505 validations. The number of AWD validations in a given week is positively correlated with the number of AWD validations in the previous week, with an L1 (first-lag autoregressive) coefficient of 0.651. The results are similar for the metro model: tightening government restrictions by one (1) unit leads to a decrease of the Average Weekly Daily Validations by 8,554, while a one (1) unit increase in the AWD Cases Ratio causes a decrease of 5,425. The L1 coefficient stands at 0.702.

Table 3.

Model coefficients and test results for the bus.

| Model Coefficients | Bus Network | Metro Network |
|---|---|---|
| AWD Cases Ratio | −1,505.1 | −5,424.9 |
| AWD Stringency Index | −2,120.0 | −8,553.8 |
| L1 | 0.651 | 0.702 |
| Constant | 295,566 | 1,120,578 |
| Dickey-Fuller test on residuals | p-value for Z(t) = 0.00 | p-value for Z(t) = 0.00 |
| AIC | 1,928.8 | 2,152.6 |
| BIC | 1,933.7 | 2,152.6 |
| Log likelihood | −970.0 | −1,071.3 |
| MAPE | 18.97% | 18.7% |

Model errors were tested using the Portmanteau test. The hypothesis that the residuals of the model are “white noise”, i.e. that they do not contain information, is not rejected. In addition, the hypothesis that there is a unit root in the residuals is rejected by the Dickey-Fuller test. Finally, the ACF graph of the residuals (Fig. 6) indicates that the autocorrelations between residuals are not statistically significant. These results suggest that the model fits the data well.

The model was also tested for the quality of its forecasts. One of the most common measures of the predictive accuracy of a model is the mean absolute percentage error (MAPE), which represents the mean absolute difference between the predicted value and the actual value, expressed as a percentage of the actual value. The index was calculated at 18.97% and 18.7% for bus and metro validation data respectively. The one-step-ahead forecasts of the model for the timespan of the study are presented in Fig. 7, while a plot of the observed vs. the predicted validations for the bus and metro networks is depicted in Fig. 8.
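The MAPE computation can be sketched as follows. The function name and the sample values are hypothetical, and skipping observations equal to zero (for which the percentage error is undefined) is an assumption of this sketch.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    # Assumption: observations equal to zero are skipped to avoid division by zero.
    pairs = [(o, p) for o, p in zip(observed, predicted) if o != 0]
    return 100.0 * sum(abs((o - p) / o) for o, p in pairs) / len(pairs)


# Hypothetical observed and predicted AWD validations, for illustration only.
error = mape([100, 200, 400], [110, 180, 400])
```

Here the individual percentage errors are 10%, 10% and 0%, so the MAPE is their mean, about 6.67%.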

Fig. 8.

Predicted vs Observed bus (left) and metro (right) network average weekly validations.

Fig. 6.

Autocorrelation of the residuals for the bus (left) and metro (right) average weekly validations models.

Fig. 7.

Predicted and observed bus (left) and metro (right) network average weekly validations time series.

In order to measure the responsiveness of public transit ridership with respect to changes in the Cases Ratio and the Stringency Index, we proceed with the calculation of the related elasticities (Table 4). We note that demand (proxied by smart card validations) is largely inelastic to changes in the AWD Cases Ratio for both the bus (εBus Validations | CR = -0.02) and metro (εMetro Validations | CR = -0.02) networks. In contrast, public transit demand is much more affected by changes in the AWD Stringency Index (εBus Validations | SI = -0.78 and εMetro Validations | SI = -0.87). The result is highly intuitive since, by definition, the stringency index includes restrictions on mobility and on trip-generating activities. We also note that the elasticity of smart card validations with respect to the stringency index is 11.5% lower (i.e. larger in magnitude) for the metro network than for the bus network, suggesting that the former is more affected by changes in the government’s SARS‑CoV‑2-related rules. This can be attributed to the characteristics of the metro network, which is frequently seen as a crowded, largely confined space where natural ventilation is absent. From the onset of the SARS‑CoV‑2 pandemic, this specific combination of environmental factors has been identified as risk-prone, as it facilitates aerosol transmission and the spread of SARS‑CoV‑2.

Table 4.

Bus & Metro smart card validations elasticities w.r.t. the AWD Cases Ratio and the AWD Stringency Index.

Elasticity Value
εBus Validations | CR −0.02
εBus Validations | SI −0.78
εMetro Validations | CR −0.02
εMetro Validations | SI −0.87
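
This section does not spell out how the elasticities in Table 4 were computed. A common convention for a linear model is the point elasticity at the sample means, ε = β · (x̄ / ȳ), and the sketch below assumes that convention. The sample means used in the example are hypothetical placeholders, not values from the paper; only the coefficient (Table 3) and the reported elasticities (Table 4) come from the source.

```python
def elasticity_at_means(coefficient, mean_x, mean_y):
    """Point elasticity of y w.r.t. x for a linear model, evaluated at the means."""
    return coefficient * mean_x / mean_y


def relative_gap(a, b):
    """Relative difference in magnitude of a with respect to b."""
    return (abs(a) - abs(b)) / abs(b)


# HYPOTHETICAL sample means, chosen only to demonstrate the formula.
hypothetical_mean_stringency = 60.0
hypothetical_mean_bus_validations = 160_000.0
bus_si_coefficient = -2120.0  # from Table 3

example = elasticity_at_means(
    bus_si_coefficient, hypothetical_mean_stringency, hypothetical_mean_bus_validations
)
print("illustrative elasticity:", round(example, 3))

# Gap between the reported metro and bus stringency elasticities (Table 4).
print("metro vs bus gap (%):", round(100 * relative_gap(-0.87, -0.78), 1))
```

The second calculation reproduces the 11.5% gap between the metro and bus stringency elasticities directly from the values in Table 4.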

6. Discussion & conclusions

In this paper we have developed an ARIMAX model which succeeded in correlating bus and metro ridership with the characteristics of the SARS‑CoV‑2 pandemic. We have shown that the stringency level of the SARS‑CoV‑2 related measures, as well as the dynamics of the disease (proxied by the change in the reported number of cases), are significant predictors of ridership levels on the public bus and metro networks of Athens. Increases in both the change of the number of cases and the stringency index are associated with decreases in ridership, which corresponds well with our intuition and with evidence of bus and metro ridership in Athens during SARS‑CoV‑2. Quantitatively, a 10% increase in the Average Weekly Daily Stringency Index corresponds to a 7.8% and 8.7% drop in the Average Weekly Daily Validations for the bus and metro networks respectively. The effect of the Cases Ratio is more limited: a 10% increase in the Average Weekly Daily Cases Ratio leads to 0.2% fewer Average Weekly Daily Validations for both networks.
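
The percentage effects quoted above follow from the elasticities in Table 4 under the usual first-order approximation, in which the percentage change in demand equals the elasticity times the percentage change in the input. The notation below is an illustrative assumption rather than the paper's own.

```latex
\[
\%\Delta y \;\approx\; \varepsilon_{y \mid x} \times \%\Delta x
\]
\[
\begin{aligned}
\text{Bus, Stringency Index:}   &\quad -0.78 \times 10\% = -7.8\% \\
\text{Metro, Stringency Index:} &\quad -0.87 \times 10\% = -8.7\% \\
\text{Bus and Metro, Cases Ratio:} &\quad -0.02 \times 10\% = -0.2\%
\end{aligned}
\]
```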

Regarding the quality of the model, analysis of its results shows that it fits the data well and that the accuracy of its predictions is good. As such, from the perspective of the Transport Authority or the Operator, the model can be a valuable tool for adapting the bus network of Athens to pandemic conditions, allowing for calculated decisions on adjusting supply to the variability of demand. From the Central Government perspective, the model can provide valuable, “in vitro” insights regarding the impacts of SARS‑CoV‑2-related measures on human mobility and help steer future policy. We also note that there are few data-driven studies that develop forecasting models correlating variability in public transit ridership with specific indicators of the pandemic and, to our knowledge, this is the first paper doing so by employing an ARIMAX model. The ability of such models to accurately predict ridership even in the highly uncertain conditions created by SARS‑CoV‑2, as well as their employment in many comparable settings (see Section 2), demonstrates their relevance for similar applications.

Moreover, the methodological simplicity of the model and its lack of complex mathematics make it intuitive to end users, which in turn improves trust in the model’s outcomes and facilitates its application (Petropoulos et al., 2020). Given that our model has been developed for use by the Public Transit Authority of Athens, the intuitiveness and visibility of its results and of the relations between variables have been a key issue. The employment of a more complex method such as an artificial neural network can lead to a less interpretable result and, thus, hinder its usage (Vlahogianni et al., 2014). Most notably, the use of a non-parametric model would not allow the quantification of the effect of pandemic-related exogenous variables on ridership. At the same time, the accuracy of the model can be expected to be on par with non-parametric models. Chandakas (2020) suggests that, in modeling linear relationships between variables, econometric models and deep learning models do not differ significantly in their performance. This is especially pertinent for our case, since increases in the number of SARS‑CoV‑2 cases and in the stringency of measures appear to lead linearly to decreases in ridership, as suggested by the statistical significance and the intuitive values of the exogenous variables in the ARIMAX model.

Further research can consider adding explanatory variables to the model, based on the available data. For example, weather (Singhal et al., 2014) and public holidays (Viallard et al., 2019) have already been correlated to fluctuations of public transit demand. Be that as it may, it seems that SARS‑CoV‑2 effects override traditional explanatory variables, since neither of these variables was statistically significant in our analysis. This hypothesis is intuitive: weather and public holidays are irrelevant if one does not plan to commute in the first place. The quality of the model might be improved, though, by adding pandemic-related explanatory variables, such as the number/percentage of employees working from home and indicators on the state of the economy. Unfortunately, the relevant data have not been available to the authors.

Given that smart card validation data, as well as data on the evolution of the pandemic, are widely available, it would be interesting to evaluate whether our results can be reproduced using ARIMAX or other models on different datasets from different cities and types of public transit networks. Our model fitted well on smart card data from the bus and metro systems of Athens, so we are confident that it will be consistent with the changes in mobility patterns observed in other parts of the world.

One common limitation associated with models calibrated on historical data is that they implicitly assume that behavior patterns that have been observed in the past will be repeated in the future for the same state of model inputs. This might not be the case in the ever-changing environment of SARS‑CoV‑2. While the stringency of SARS‑CoV‑2 measures influences mobility directly, the effect of the evolution of cases on public transit ridership is indirect and can be attributed to fear of contracting the disease. Recent literature confirms that this kind of fear is directly associated with the disease’s transmission rate and medium, as well as its morbidity and mortality (Ahorsu et al., 2020). At the same time, this proposition suggests that fear perception might not be stable over the course of the pandemic (e.g., due to a change in morbidity attributed to vaccinations). This can lead to different mobility patterns and, thus, inconsistency in the model’s forecasts in the long run, so it would be of interest to test whether the model’s forecasts are reproducible across all phases of the pandemic.

We have also considered the possibility of large errors in the reported number of SARS‑CoV‑2 cases, as discussed in Hong & Li (2020). We have concluded that this is not a factor of concern for our model, since the reported number of cases affects mobility only because it is reported. In other words, ridership is affected only indirectly, making the real number of SARS‑CoV‑2 cases more or less irrelevant for our reasoning.

Overall, though, we believe that these limitations do not hinder the quality of the model, which can be a useful tool in planning bus operations in Athens during the SARS‑CoV‑2 pandemic. Its accuracy, simplicity and interpretability make it relevant in the case of public transit, where decision making usually involves many stakeholders.

CRediT authorship contribution statement

Marios Giouroukelis: Conceptualization, Formal analysis, Software, Investigation, Writing – review & editing. Stella Papagianni: Data curation, Resources, Project administration. Nellie Tzivellou: Data curation, Project administration. Eleni I. Vlahogianni: Conceptualization, Methodology, Validation, Supervision, Project administration. John C. Golias: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.cstp.2022.03.023.

Appendix A. Supplementary data

The following are the Supplementary data to this article:

Supplementary Annex Table 5
mmc1.docx (13.9KB, docx)

References

  1. Ahangari, S., Chavis, C. and Jeihani, M., 2020. Public Transit Ridership Analysis during the SARS‑CoV‑2 Pandemic. medRxiv.
  2. Ahorsu D.K., Lin C.Y., Imani V., Saffari M., Griffiths M.D., Pakpour A.H. The fear of SARS-CoV-2 scale: development and initial validation. Int. J. Mental Health Addict. 2020:1–9. doi: 10.1007/s11469-020-00270-8.
  3. Aloi A., Alonso B., Benavente J., Cordera R., Echániz E., González F., Ladisa C., Lezama-Romanelli R., López-Parra Á., Mazzei V., Perrucci L. Effects of the SARS-CoV-2 lockdown on urban mobility: Empirical evidence from the city of Santander (Spain) Sustainability. 2020;12(9):3870.
  4. Arimura M., Ha T.V., Okumura K., Asada T. Changes in urban mobility in Sapporo city, Japan due to the SARS-CoV-2 emergency declarations. Transport. Res. Interdiscipl. Perspect. 2020;7 doi: 10.1016/j.trip.2020.100212.
  5. Baillie R.T. Long memory processes and fractional integration in econometrics. J. Economet. 1996;73(1):5–59.
  6. Box G.E., Jenkins G.M., Reinsel G.C., Ljung G.M. John Wiley & Sons; 2015. Time series analysis: forecasting and control.
  7. Brinchi S., Carrese S., Cipriani E., Colombaroni C., Crisalli U., Fusco G., Gemma A., Isaenko N., Mannini L., Patella S.M., Petrelli M. On Transport Monitoring and Forecasting During SARS-CoV-2 Pandemic in Rome. Transport Telecommun. 2020;21(4):275–284.
  8. Ceder A.A., Hassold S., Dano B. Approaching even-load and even-headway transit timetables using different bus sizes. Public Transport. 2013;5(3):193–217.
  9. Cazelles B., Comiskey C., Nguyen-Van-Yen B., Champagne C., Roche B. Parallel trends in the transmission of SARS-CoV-2 and retail/recreation and public transport mobility during non-lockdown periods. Int. J. Infect. Diseas. 2021;104:693–695. doi: 10.1016/j.ijid.2021.01.067.
  10. Chandakas E. On demand forecasting of demand-responsive paratransit services with prior reservations. Transport. Res. Part C: Emerg. Technol. 2020;120
  11. Chen D., Zhang J., Jiang S. Forecasting the short-term metro ridership with seasonal and trend decomposition using loess and LSTM neural networks. IEEE Access. 2020;8:91181–91187.
  12. Chiang W.C., Russell R.A., Urban T.L. Forecasting ridership for a metropolitan transit authority. Transport. Res. part A: Policy Pract. 2011;45(7):696–705.
  13. Ding C., Duan J., Zhang Y., Wu X., Yu G. Using an ARIMA-GARCH modeling approach to improve subway short-term ridership forecasting accounting for dynamic volatility. IEEE Trans. Intell. Transp. Syst. 2017;19(4):1054–1064.
  14. Fathi-Kazerooni, S., Rojas-Cessa, R., Dong, Z. and Umpaichitra, V., 2020. Time series analysis and correlation of subway turnstile usage and SARS‑CoV‑2 prevalence in New York City. arXiv preprint arXiv:2008.08156.
  15. Gkiotsalitis K., Cats O. Public transport planning adaption under the SARS-CoV-2 pandemic crisis: literature review of research needs and directions. Transp. Rev. 2021;41(3):374–392.
  16. Gong M., Fei X., Wang Z.H., Qiu Y.J. Sequential framework for short-term passenger flow prediction at bus stop. Transp. Res. Rec. 2014;2417(1):58–66.
  17. Granger C.W., Joyeux R. An introduction to long-memory time series models and fractional differencing. J. Time Ser. Anal. 1980;1(1):15–29.
  18. Hale T., Petherick A., Phillips T., Webster S. Variation in government responses to SARS-CoV-2. Blavatnik school of government working paper. 2020;31:2020–2111.
  19. Hong H.G., Li Y. Estimation of time-varying reproduction numbers underlying epidemiological processes: A new statistical tool for the SARS-CoV-2 pandemic. PLoS ONE. 2020;15(7) doi: 10.1371/journal.pone.0236464.
  20. Hosking J.R.M. Lagrange-multiplier tests of multivariate time-series models. J. Roy. Stat. Soc.: Ser. B (Methodol.) 1981;43(2):219–230.
  21. Hu, T., Wang, S., She, B., Zhang, M., Huang, X., Cui, Y., Khuri, J., Hu, Y., Fu, X., Wang, X. and Wang, P., 2021. Human Mobility Data in the SARS‑CoV‑2 Pandemic: Characteristics, Applications, and Challenges (May 24, 2021).
  22. ITF . OECD Publishing; Paris: 2021. SARS-CoV-2 and Transport: A Compendium.
  23. Karlaftis M.G., Vlahogianni E.I. Memory properties and fractional integration in transportation time-series. Transport. Res. Part C: Emerg. Technol. 2009;17(4):444–453.
  24. Karlaftis M.G., Vlahogianni E.I. Statistical methods versus neural networks in transportation research: Differences, similarities and some insights. Transport. Res. Part C: Emerg. Technol. 2011;19(3):387–399.
  25. Kartal M.T., Depren Ö., Depren S.K. The relationship between mobility and SARS-CoV-2 pandemic: Daily evidence from an emerging country by causality analysis. Transport. Res. Interdiscipl. Perspect. 2021;10 doi: 10.1016/j.trip.2021.100366.
  26. Keck A., Raubold A., Truppia A. Forecasting international trade: A time series approach. OECD J. J. Business Cycle Measure. Anal. 2010;2009(2):157–176.
  27. Lei B., Liu X.L., Cao Z., Hao Y.R., Zhang Y., Chen X.M. Modeling and forecasting of SARS-CoV-2 spread in urban rail transit system. J. Traffic Transp. Eng. 2020;3:139–149.
  28. Li C., Bai L., Liu W., Yao L., Waller S.T. Graph neural network for robust public transit demand prediction. IEEE Trans. Intell. Transp. Syst. 2020
  29. Lin L., He Z., Peeta S. Predicting station-level hourly demand in a large-scale bike-sharing network: A graph convolutional neural network approach. Transport. Res. Part C: Emerg. Technol. 2018;97:258–276.
  30. Næss P., Andersen J., Nicolaisen M.S., Strand A. Transport modelling in the context of the ‘predict and provide’paradigm. Europ. J. Transp. Infrastruct. Res. 2014;14(2)
  31. Nelson B.K. Time series analysis using autoregressive integrated moving average (ARIMA) models. Acad. Emerg. Med. 1998;5(7):739–744. doi: 10.1111/j.1553-2712.1998.tb02493.x.
  32. Ngo N.S. Urban bus ridership, income, and extreme weather events. Transport. Res. Part D: Transp. Environ. 2019;77:464–475.
  33. Odaki M. On the invertibility of fractionally differenced ARIMA processes. Biometrika. 1993;80:703–709.
  34. Orro A., Novales M., Monteagudo Á., Pérez-López J., Bugarín M. Impact on City Bus Transit Services of the COVID–19 Lockdown and Return to the New Normal: The Case of A Coruña (Spain) Sustainability. 2020;12(17):7206.
  35. Petropoulos F., Makridakis S., Stylianou N. COVID-19: Forecasting confirmed cases and deaths with a simple time series model. Int. J. Forecast. 2020 doi: 10.1016/j.ijforecast.2020.11.010.
  36. Rodríguez González A.B., Wilby M.R., Vinagre Díaz J.J., Fernández Pozo R. Characterization of SARS-CoV-2’s Impact on Mobility and Short-Term Prediction of Public Transport Demand in a Mid-Size City in Spain. Sensors. 2021;21(19):6574. doi: 10.3390/s21196574.
  37. Shumway R.H., Stoffer D.S. Time series analysis and its applications. Springer; Cham: 2017. ARIMA models; pp. 75–163.
  38. Singhal A., Kamga C., Yazici A. Impact of weather on urban transit ridership. Transportat. Res. Part A: Pol. Pract. 2014;69:379–391.
  39. Tao S., Corcoran J., Rowe F., Hickman M. To travel or not to travel:‘Weather’is the question. Modelling the effect of local weather conditions on bus ridership. Transportat. Res. Part C: Emerg. Technol. 2018;86:147–167.
  40. Tirachini A., Cats O. SARS-CoV-2 and public transportation: Current assessment, prospects, and research needs. J. Public Transport. 2020;22(1):1. doi: 10.5038/2375-0901.22.1.1.
  41. Tirachini A. Estimation of travel time and the benefits of upgrading the fare payment technology in urban bus services. Transport. Res. Part C: Emerg. Technol. 2013;30:239–256.
  42. Tokey A.I. Spatial association of mobility and SARS-CoV-2 infection rate in the USA: A county-level study using mobile phone location data. J. Transport Health. 2021;22 doi: 10.1016/j.jth.2021.101135.
  43. UITP . Mohamed Mezghani, Rue Sainte Marie 6, B-1080 Brussels, Belgium; 2020. Public transport authorities and COVID-19: impact and response to a pandemic, Victoria Australia.
  44. Viallard A., Trépanier M., Morency C. Assessing the evolution of transit user behavior from smart card data. Transp. Res. Rec. 2019;2673(4):184–194.
  45. Virati M.Q., Pamanik D.P., Pramana S. Forecasting Number of Passengers of TransJakarta using Seasonal ARIMAX Method. J. Data Sci. Appl. 2020;3(1):31–37.
  46. Vlahogianni E.I., Karlaftis M.G., Golias J.C. Short-term traffic forecasting: Where we are and where we’re going. Transportat. Res. Part C: Emerg. Technol. 2014;43:3–19.
  47. Wang H., Yao R., Hou L., Zhao J., Zhao X. A Methodology for Calculating the Contribution of Exogenous Variables to ARIMAX Predictions. Intelligence (Canadian AI; 2021.
  48. Wielechowski M., Czech K., Grzęda Ł. Decline in Mobility: Public Transport in Poland in the time of the SARS-CoV-2 Pandemic. Economies. 2020;8(4):78.
  49. Zhou S., Chen B., Liu H., Ji X., Wei C., Chang W., Xiao Y. Travel characteristics analysis and traffic prediction modeling based on online car-hailing operational data sets. Entropy. 2021;23(10):1305. doi: 10.3390/e23101305.

Articles from Case Studies on Transport Policy are provided here courtesy of Elsevier
