2023 Jan 11;50(4):983–999. doi: 10.1177/23998083221150646

A network-based analysis to assess COVID-19 disruptions in the Bogotá BRT system

Juan D Garcia-Arteaga 1, Laura Lotero 2
Editors: Neave O’Clery, Juan Carlos Duque, Seraphim Alvanides, Tim Schwanen
PMCID: PMC9841208  PMID: 38603410

Abstract

The global COVID-19 crisis has severely affected mass transit in the cities of the global south. Fear of widespread propagation in public spaces and the dramatic decrease in human mobility due to lockdowns have resulted in a significant reduction of public transport options. We analyze the case of TransMilenio in Bogotá, a massive Bus Rapid Transit system that is the main mode of transport for an urban area of roughly 10 million inhabitants. Concerns over social distancing and new health regulations reduced the number of trips to under 20% of its historical value for extended periods during the lockdowns. This has sparked a renewed interest in developing innovative data-driven responses to COVID-19, resulting in large corpora of TransMilenio data being made available to the public. In this paper we use a database updated daily with individual passenger card swipe validation microdata including entry time, entry station, and a hash of the card’s ID. The opportunity of having detailed minute-to-minute daily ridership information and the challenge of extracting useful insights from the massive amount of raw data (∼1,000,000 daily records) require the development of tailored data analysis approaches. Our objective is to use the natural representation of urban mobility offered by networks to make pairwise quantitative similarity measurements between daily commuting patterns and then use clustering techniques to reveal behavioral disruptions, as well as the geographical areas most affected by the different pandemic stages. This method proved to be efficient for the analysis of large amounts of data and may be used in the future for temporal analyses of similarly large datasets in urban contexts.

Keywords: COVID-19, network graphs, smart cards, urban data, commuting

Introduction

The global COVID-19 crisis severely affected mass transit worldwide; this resulted in transport modal shifts to avoid public transport which had a major effect on the amount of travel undertaken (Batty, 2020; Abdullah et al., 2020; Das et al., 2021; Thombre and Agarwal, 2021). Fear of widespread propagation in public spaces, as well as the dramatic decrease in human mobility due to lockdown and other strict measures, resulted in an important reduction of public transportation options. The consequences are multiple, overlapping, and possibly permanent as it is not clear how cities in developing countries will guarantee basic services to their inhabitants without massive public transportation. This is of particular concern for the extended cities of the global south which are highly dependent on public transport (Labonté-LeMoyne et al., 2020; Anwari et al., 2021; Arellana et al., 2020).

In Colombia, concerns over social distancing and new health regulations in response to the global emergency have driven the number of daily trips in TransMilenio (TM), the Bus Rapid Transit (BRT) system of the metropolitan area of Bogotá, to historic lows. A renewed interest in developing innovative solutions and appropriate responses to COVID-19 allows the extent of these changes to be measured thanks to a large corpus of data containing daily card swipe validation records, made available by TM to both the general public and interested researchers upon request.

The raw data is periodically updated and made available as individual daily files starting mid-February 2020. Each file contains individual entry time, entry station, and a hash of the card’s ID. The sheer size of the raw data makes it impossible to understand travel behaviors without adequate preprocessing and a careful selection of the proper metrics and analytics. How to reduce this type of dataset from its initial state to a volume that is “accessible, informative and actionable” 1 from which meaningful insights can be extracted, that is, transforming “Big Data” into “Small Data” (Chen et al., 2016), is a growing concern in the urban and transportation planning community, and a new challenge to local analysts for whom new massive datasets are becoming available.

This new data allows us to pose questions about the urban context and how it has been affected by the pandemic. In this paper, we aim to answer three questions:

  • How can we synthesize the massive data to an actionable size?

  • How can we group this data in order to extract spatio-temporal patterns?

  • What insights into the disruptions of the pandemic in the urban context do these patterns deliver?

TransMilenio and the Bogotá Urban Context

Bogotá, located in the center of Colombia, is the capital and largest city of the country. With more than 7 million inhabitants the city is subdivided into 20 localities in a total area of about 1587 square kilometers of which 400 are urban. The metropolitan population of Bogotá and its conurbated neighboring municipalities was 10.7 million inhabitants in 2020.

In Colombia, laws 142 and 143 of 1994 defined a system that classifies households according to their physical and environmental characteristics. As a result, households are classified into one of six socioeconomic strata which are loosely related to their income. People living in stratum 1 are the lowest-income householders and those in stratum 6 are the wealthiest. The city is strongly segregated along socioeconomic lines, with most low-income households located in the periphery due to the limited availability of affordable housing close to major employment centers, and therefore spending more time commuting (Bocarejo and Urrego, 2020; Guzman et al., 2018; Lotero et al., 2016). The most affluent neighborhoods tend to be located in the north and northeastern border of the city and the poorest along the west and south.

The modal split in Bogotá, according to the Origin-Destination Survey made in 2019 (Secretaría Distrital De Movilidad, 2019), shows a high participation of public transportation (about 40%) followed by pedestrian trips (32%) (Bocarejo and Urrego, 2020). There are two bus systems in the city: a conventional integrated public bus system called SITP due to its Spanish acronym (Sistema Integrado de Transporte Público) and TM, a well-known BRT (Vecchio, 2017; Hidalgo et al., 2013; Gilbert, 2008).

TM circulates along 114.4 km and eight lines (troncales) of exclusive lanes serving Bogotá and its neighboring city of Soacha. Additionally, passengers from neighboring municipalities use the system within multi-modal trips.

The system operates using a public–private partnership mechanism (Hidalgo et al., 2013). As of 2019, around 1,766 buses on average circulate on the trunk line system (dedicated lanes). Feeder buses transport users between the stations and different locations that the main TM lines do not reach.

Passengers must pay to acquire a TM smart card which they can recharge and use as many times as they wish. Due to this cost, cards will typically only be replaced due to damage or loss. Some cards are personalized and associated to specific riders, for example, senior citizens and students pay a discounted fare when they use their card. The fare is independent of the length of the ride and is validated upon entry to the system but not on exit.

COVID-19 disruptions in public transportation: An overview

There has been a plethora of studies analyzing the effects of COVID-19 in different activities and contexts. The disruptions in urban mobility and public transportation due to the pandemic are not an exception, with a wide variety of articles being published using diverse data sources and methodologies (Kim, 2021; Benita, 2021; Zhang et al., 2021; Rothengatter et al., 2021; Goulias, 2021; Musselwhite et al., 2021).

The literature on COVID-19 and transportation falls into two groups: studies that use traffic volume and transportation restrictions to model the spread of the disease, for example, Kraemer et al. (2020) and Arenas et al. (2020), and those regarding the effects of COVID-19 on transportation, focusing mainly on quantifying changes in travel volumes, modal shifts, behavior and their relationship with diverse socioeconomic factors. In this work, we will follow the latter approach.

Many papers use survey-like data in both developed and developing countries to address the effects of COVID-19 on urban mobility and transportation despite their inherent risk of biased samples (Goulias, 2021). Here, we are interested in those reported in the literature that use “smart data,” for example, transportation smart cards, GPS records, call detail records or feeds from digital social media, which may not have a sample-size problem but need to be tackled with the appropriate data analysis tools.

For instance, in A Coruña, Spain, data from smart card use, bus stop boarding and automatic vehicle location was used to analyze the impact on transit ridership during lockdown (Orro et al., 2020). Similarly, location-based social network data from Facebook was used in Italy to study the changes in mobility revealing a dramatic drop during the first days of lockdown and a limitation of trips to those fulfilling the basic needs such as supplies delivery and health personnel transportation (Beria and Lunkar, 2021). In Pepe et al. (2020), a data set consisting of daily time-series of mobility metrics such as origin-destination movements between Italian provinces based on a large-scale data set of anonymously shared positions of about 170,000 smartphone users was collected before and during the outbreak. A similar data descriptor for human mobility during COVID-19 in the United States is presented in Kang et al. (2020).

In Sweden, ticket validations were used to highlight the significant decrease of traffic volume in the three most populated regions of the country (Jenelius and Cebecauer, 2020). Using the same data set, the relationship between socioeconomic factors and the individual use of public transportation was evaluated during COVID-19 in Stockholm by means of logit models (Erik et al., 2021).

In Latin America, mobile phone data was used to analyze the share of residents allowed to leave their residences daily according to policy advice and mobility reductions and correlate it with socioeconomic conditions (Heroy et al., 2021). The authors found that employment characteristics and work-from-home capabilities are the primary determinants of mobility reduction in these cities. In Gramsch et al. (2020), the authors use bus and metro smart card data of Santiago, Chile, to identify trips at bus stop level and account for the variation of municipalities that were under lockdown in a given day. Regarding the relationship between mobility and socio-demographics, the authors found that larger reductions in mobility occurred in municipalities with a larger proportion of elderly population and high-income households. In Colombia, Dueñas et al. (2021) used TM data to study mobility reductions during lockdown in Bogotá and linked it to socioeconomic conditions finding that higher socioeconomic strata are consistently associated with higher reductions in mobility and that, in general, high poverty levels in middle-income countries drive non-compliance with lockdown policies and social distancing.

Most of the literature focuses on estimating the mobility reduction due to the pandemic. However, to the best of our knowledge, there is no approach based on networks and clustering to manage the large amount of available mobility data and find informative patterns in the operation of a transportation system. Our main contribution consists of using inferred transportation networks as the basic data structure, thus minimizing the complexity, processing time and storage size while being able to extract insights which cannot be derived directly from the massive raw data.

Proposed approach

As the global emergency extended in time, it became increasingly clear that the disease and the cities’ responses were entangled and evolving. This makes it problematic to understand the urban reality as a simple binary “test vs. control” problem, that is, characterizing lockdown vs. baseline. Instead, we are witnessing evolving systems which must be understood dynamically.

We have opted to follow three guidelines in order to obtain significant insights, namely:

  1. Use days as the minimal temporal analysis unit.

  2. Cluster days by mobility pattern similarities.

  3. Analyze the spatial components of each of the temporal clusters.

The first decision, using days as the smallest temporal unit, is straightforward since the TM system operates roughly from 4 AM to 11 PM. This provides a natural and manageable division of the data.

The second step, clustering, seeks to group similar instances by assigning the same labels to days with similar mobility patterns. Clustering is an example of unsupervised learning, a well-studied Machine Learning application in which an algorithm tries to separate the data into “natural” clusters related to the underlying generative mechanism. Many clustering algorithms are based on a distance metric between instances and will, in general, attempt to minimize the sum of intra-class distances while maximizing the sum of inter-class distances. We have chosen this type of method to avoid setting arbitrary thresholds to the different pandemic stages.

Since the number of clusters is significantly smaller than the number of instances, clustering simplifies data analysis by reducing the number of objects that must be considered separately. This allows the analyst to gain insights into the data, our third objective, by making comparisons between classes and not between data points. The alternative to clustering, trying to draw significant insights from the raw data, requires the simultaneous analysis of the daily patterns of more than 150 heterogeneous stations during long periods of time.

The data has, however, some limitations:

  • The number of daily observations varies widely, ranging from 2.5 million validations during the pre-pandemic baseline to close to 100,000 validations during strict lockdown, holidays and some weekends. This gap makes it problematic to directly compare raw daily validations.

  • The trip information is incomplete, as cards must be validated when entering the system but not when exiting it.

Network representation

Graphs have been traditionally used to analyze and visualize transportation networks such as airline routes or rail networks (Newman, 2018). Many of these networks use nodes to represent physical locations or urban spaces (Jiang et al., 2000; Jiang, 2009) and edges to represent a connection between nodes, that is, airports connected by a flight route or train stations connected by a railway track.

This type of graph is a powerful tool that accurately captures and visualizes the stationary physical structure of the network but not its dynamic functional characteristics. These are ideally studied by tracing the movements of passengers and buses either on a street-network or on a route basis. This allows the flow and interactions of the passengers within the system to be analyzed in more detail.

Our objective is to use the natural representation of urban mobility offered by networks to understand the system’s social and functional changes. To do this, we use nodes to represent TM stations and we capture the trips between different stations. However, due to the high number of passengers at peak hours, TM does not require an exit validation in order to improve the passenger flow and avoid bottlenecks. Furthermore, once passengers enter the system they are allowed an unlimited and unrecorded number of route transfers until they exit.

As a workaround to the lack of destination data we use the trip chain method originally proposed in Barry et al. (2002) to estimate the station-to-station origin-destination matrix from card information. Trip chain makes two basic assumptions:

  • Most passengers follow a highly cyclical “pendulum movement” pattern in which the first validation of the day corresponds to a commuting trip, starting near the passenger’s residence and ending at their workplace, and the last validation corresponds to a return trip from the workplace to home.

  • Two consecutive validations of the same card in different stations indicate a trip from the first station to the vicinity of the second one.

Although both assumptions are debatable and clearly do not cover the great diversity of trip purposes, they do agree with the general tendencies measured in most mobility studies (Prieto Curiel et al., 2021; Lotero et al., 2016; Liu et al., 2009) and give a unique and compact network representation of each day’s movements. Figure 1 shows evidence of this pendulum movement from the periphery to the center of the city of Bogotá.

Figure 1.

A snapshot of one day of Bogotá’s baseline morning (left) and afternoon (right) rush hours in TM. The area of each circle is proportional to the number of card swipes made in that station in a 15 min window. Red corresponds to stations near their maximum capacity and green corresponds to sparsely occupied stations. Note the “pendulum movement” with trips originating in the morning in the periphery and returning in the afternoon from the central zone.

Data

TM has released data sets containing information about daily validations, that is, swipes of the card when entering the BRT system. Each record contains a timestamp, a location identifying which station and entrance was used, and a hash of the card’s ID that makes the card identifiable while preserving user anonymity. Data is stored as daily files, each containing between roughly 100,000 and 2,500,000 validations.

In this paper, we have used the available data ranging from February 13 to December 31st, 2020. The date range allows us to compare pre-pandemic stages (the first confirmed case of COVID-19 in Colombia was reported in Bogotá on 6 March 2020) against different periods of public policy measures and their impact on the largest public transportation system in the country.

It should be noted that this data does not necessarily reflect the beginning and end of each passenger’s daily commute but only the portion of it done in TM: The system also serves passengers arriving from neighboring municipalities using other systems which are currently untraceable such as privately owned inter-city buses, bicycles, and taxis.

We have not considered feeder lines in this analysis as during the period under study this service could be accessed without restrictions with cards validated upon arrival at the stations. This changed in 2021 (outside of the period being analyzed) with passengers being required to validate their smart card in the feeder bus in order to avoid short-distance free riders.

Methods

Network construction

We use a directed graph $G_d$ to store the trip information of each day $d$. The nodes of $G_d$ correspond to the stations and the directed weight of each edge corresponds to the number of trips we assume were made between the stations. We generalize the commuting assumptions to extract information in the following rules:

  1. If a card makes two consecutive validations in stations $S_1$ and $S_2$ with $S_1 \neq S_2$, we assume that a trip was made from $S_1$ to $S_2$.

  2. If a card makes multiple validations in different stations in a single day ($S_1, S_2, \ldots, S_n$) and the first and last stations are different ($S_n \neq S_1$), we assume that a trip was made from $S_n$ to $S_1$, which would correspond to a returning home trip.

Note that the simple home-work-home commute is assumed when n = 2. Also note that many trips about which no reasonable destination assumption may be made, for example, cards with a single validation or multiple consecutive validations at the same station, were not taken into account.
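The two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(card_hash, entry_time, station)` tuple layout, the function name `build_trip_graph`, and the toy records are hypothetical and do not reproduce the TM file format or the authors' implementation.

```python
from collections import defaultdict


def build_trip_graph(validations):
    """Return {(origin, destination): trips} for one day of validations.

    `validations` holds (card_hash, entry_time, station) tuples; this
    tuple layout is an assumption of the sketch, not the TM file format.
    """
    # Collect every card's validations so they can be ordered in time.
    swipes_by_card = defaultdict(list)
    for card_hash, entry_time, station in validations:
        swipes_by_card[card_hash].append((entry_time, station))

    edges = defaultdict(int)
    for swipes in swipes_by_card.values():
        stations = [station for _, station in sorted(swipes)]
        # Rule 1: two consecutive validations at different stations.
        for origin, destination in zip(stations, stations[1:]):
            if origin != destination:
                edges[(origin, destination)] += 1
        # Rule 2: return trip from the last station to the first one.
        if len(stations) > 1 and stations[-1] != stations[0]:
            edges[(stations[-1], stations[0])] += 1
    return dict(edges)


# Toy records with made-up card hashes and times (not TM data).
day = [
    ("card_a", "05:40", "San Bernardo"),
    ("card_a", "17:10", "Calle 63"),
    ("card_b", "06:05", "San Bernardo"),  # single validation: no trip inferred
    ("card_c", "06:00", "Terminal"),
    ("card_c", "12:30", "Calle 63"),
    ("card_c", "18:00", "Calle 63"),      # same station twice: no extra trip
]
trip_graph = build_trip_graph(day)
```

In this toy day, `card_a` contributes the home-work-home pair of edges, `card_b` is discarded, and the repeated validation of `card_c` at the same station adds nothing beyond its outbound and return trips.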

Centrality measures

A widely accepted definition of a node’s centrality coming from the study of social networks and organizations was given in Freeman (1978): a node’s centrality is related to its importance within the graph. Freeman defines three measures of centrality: degree (related to the number of neighbors), closeness (related to the average distance to all other nodes in the graph) and betweenness (related to the number of times a node appears in the shortest path between two nodes in the graph). Node centrality measures are useful to gain insights into the inner mechanics of the system and the node hierarchies generated may be used to compare networks.

Both betweenness and closeness centralities have been used for transportation networks with edges that model a distance component, for example airline networks (Wang et al., 2011; Guimera et al., 2005). Betweenness, in particular, is an intuitively natural choice when modeling the structures underlying physical world travel behavior, as mentioned in the Network representation section. Unfortunately, in our case edges contain only the initial and final stations but not the intermediate ones, if any. Furthermore, if it were to be used it would fail to capture changes in the nodes’ social and functional importance and would consistently mark nodes at the crossroads of the system lines as the most central.

Degree centrality is one of the simplest ways to measure the importance of a node. In undirected graphs, the degree corresponds to the number of edges connected to the node (Newman, 2018; Wasserman and Faust, 1994). In directed graphs, such as commuting networks, one may use either the in-degree or the out-degree by measuring, respectively, the number of inbound or outbound edges. In the latter case, this would correspond to the use of the number of validations per station as the measure of centrality.

Although degree centrality may be informative in itself, information on the general topology of the network is lost, as seen in Figure 2. Degree centrality may also present problems when comparing networks in which the absolute value of the edges’ weight is significantly different, for example, the number of daily trips on the network during the lockdown or on a holiday is smaller than the number of trips on a baseline weekday.
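The scale problem can be seen in a small sketch. The function name `weighted_degrees` and the two toy networks are assumptions made for illustration; they are not TM data.

```python
from collections import defaultdict


def weighted_degrees(edges):
    """Return (in_degree, out_degree) for {(origin, destination): trips}."""
    in_degree, out_degree = defaultdict(float), defaultdict(float)
    for (origin, destination), trips in edges.items():
        out_degree[origin] += trips       # trips starting at the station
        in_degree[destination] += trips   # trips ending at the station
    return dict(in_degree), dict(out_degree)


# Toy networks: the same commuting pattern at two very different volumes.
weekday = {("A", "B"): 200, ("B", "A"): 200, ("C", "A"): 100}
holiday = {edge: trips / 10 for edge, trips in weekday.items()}

in_weekday, out_weekday = weighted_degrees(weekday)
in_holiday, out_holiday = weighted_degrees(holiday)
```

Both networks describe the same pattern, yet every weighted degree of the toy holiday is ten times smaller than on the toy weekday, so raw degree values from the two days are not directly comparable.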

Figure 2.

Whereas the degree centrality calculated with the weight of the edges is the same for nodes A, B and C in both of the top networks, Eigencentrality (EC) gives a more nuanced picture of the network topology.

An alternative method to calculate centrality is by using the assumption that being connected to high-scoring nodes will contribute more to the centrality than being connected to low-scoring nodes. The Eigencentrality (EC) measure is based on this concept. The EC of a vertex $v_i$ in a graph $G$ is defined as the sum of the distributed centrality of its neighbors

$$EC(v_i) = \frac{1}{\lambda} \sum_{j \in G} a_{i,j}\, EC(v_j) \tag{1}$$

where $a_{i,j}$ is the value in row $i$ and column $j$ of $A$, the adjacency matrix of $G$. One may rewrite equation (1) as the eigenvector definition

$$A x = \lambda x \tag{2}$$
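Equation (2) can be solved numerically by power iteration, a standard technique for finding the dominant eigenvector. The sketch below is an illustration only: the function name `eigencentrality`, the tolerance, and the four-node toy graph are assumptions, and the result is scaled so that the centralities sum to 1.

```python
def eigencentrality(adjacency, iterations=500, tol=1e-12):
    """Approximate the dominant eigenpair of a non-negative matrix.

    Returns (centralities, eigenvalue) with the centralities summing to 1.
    """
    n = len(adjacency)
    x = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # One application of equation (1): each node sums its neighbors' scores.
        y = [sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(y)  # because sum(x) == 1, sum(Ax) estimates lambda
        if eigenvalue == 0:
            raise ValueError("graph has no edges")
        new_x = [value / eigenvalue for value in y]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x = new_x
        if change < tol:
            break
    return x, eigenvalue


# Toy undirected graph: a triangle (0, 1, 2) with node 3 attached to node 2.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
centrality, lam = eigencentrality(A)
```

In this toy graph node 2 obtains the highest score because it is linked to every other node, while node 3 scores lowest although its single neighbor is the most central one.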

Measures derived from EC have been used to determine the ranking of directed networks such as the citation impact of academic articles (Yan and Ding, 2009; Bollen et al., 2006). One of these derived measures, PageRank (PR), was developed to measure the importance of web pages (Page et al., 1999) and has been successfully used on a wide array of complex system problems. In the following section, we will explain how to measure the TM network centrality using PR.

PageRank and mobility networks as Markov chains

PR was originally used by Google founders Larry Page and Sergey Brin in Page et al. (1999) to calculate the importance of a web page in the world wide web (WWW). The WWW is modeled as a network in which a hyperlink in web page A pointing to web page B is interpreted as a directed edge e(A, B). Using an iterative algorithm, the centrality of each node at iteration n is equally distributed among all the web pages to which it points, while receiving the distributed PR of the pages pointing to it. This process is repeated until the centrality values of all nodes change below a given threshold, that is, it converges.

The general formula to calculate the PR of a node is

$$PR_{n+1}(A) = \frac{1-d}{N} + d \sum_{v \in B_A} \frac{PR_n(v)}{L(v)} \tag{3}$$

where $N$ is the total number of nodes in the network, $L(v)$ is the out-degree of node $v$, $B_A$ is the set of all nodes pointing to $A$ and $d$ is a damping factor equal to the probability of following a link in the web page and not jumping to any other randomly chosen page.
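The iteration in equation (3) can be sketched as follows. Several choices here are assumptions made for illustration: the function name `pagerank`, the damping value `d=0.85` (a commonly used default; the text above does not state the value used), the convergence tolerance, and the requirement that every node has at least one outgoing link, since equation (3) as written does not cover nodes with $L(v) = 0$.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Iterate equation (3) until the largest change falls below `tol`.

    `links` maps every node to the non-empty list of nodes it points to.
    """
    nodes = sorted(links)
    n = len(nodes)
    pr = {v: 1.0 / n for v in nodes}  # start from a uniform distribution
    for _ in range(max_iter):
        new_pr = {}
        for a in nodes:
            # Each node v in B_A passes on an equal share PR(v) / L(v).
            received = sum(pr[v] / len(links[v]) for v in nodes if a in links[v])
            new_pr[a] = (1 - d) / n + d * received
        change = max(abs(new_pr[v] - pr[v]) for v in nodes)
        pr = new_pr
        if change < tol:
            break
    return pr


# Toy three-page web (hypothetical).
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
```

With these toy links the values sum to 1 and page C ranks first, since it receives the full score of B plus half of A's.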

The resulting PR values are related to the EC values of a Markov chain. A Markov chain is a stochastic model without memory representing a system with different possible states in which the probability of the following state is based only on the current state. Markov chains may be visualized as directed weighted networks in which each node corresponds to a state and the weight of the edge between two nodes, $e_w(A, B)$, corresponds to the probability of transitioning from state A to state B.

Considering the WWW as a Markov chain leads to a very intuitive explanation of the meaning of the PR centrality values: They are proportional to the fraction of times a “random web-surfer” clicking randomly on any hyperlink will pass through a given page after an infinite amount of time. PR and the Markov chain interpretation have been used in transport problems such as traffic prediction and modeling (Pop and Dobre, 2012; Zhang et al., 2016).

We have opted to model the TM daily trip network as a Markov chain in which it is possible to calculate the importance or centrality of a station as the proportion of times a “random commuter” would pass through a station after an infinite number of trips, assuming he or she chooses the next station according to the probability of each edge. That is, the commuter will choose the next station with a probability proportional to the number of passengers in that station following that direction. To do this, the edge weights of the trip network $G_d$, defined in the Network construction section, are normalized so that the sum of the weights of the outgoing edges is equal to 1 for any node

$$\sum_{j=1}^{N} w(v_i, v_j) = 1, \quad \forall v_i \in V \tag{4}$$
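The normalization in equation (4) amounts to dividing each edge weight by the total weight leaving its origin station. A minimal sketch, assuming a hypothetical `{(origin, destination): trips}` dictionary and a hypothetical function name `normalize_outgoing`:

```python
def normalize_outgoing(edges):
    """Turn trip counts into transition probabilities per origin station."""
    totals = {}
    for (origin, _destination), trips in edges.items():
        totals[origin] = totals.get(origin, 0) + trips
    # Each outgoing weight becomes its share of the origin's total trips.
    return {
        (origin, destination): trips / totals[origin]
        for (origin, destination), trips in edges.items()
    }


# Toy trip counts (not TM data).
trips = {
    ("A", "B"): 30, ("A", "C"): 10,
    ("B", "A"): 5,
    ("C", "A"): 20, ("C", "B"): 20,
}
transition = normalize_outgoing(trips)
```

Here a random commuter at station A moves to B with probability 0.75 and to C with probability 0.25. Stations with no outgoing trips simply have no row in this sketch; how such stations should be handled is left open.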

Considering the daily trips as a Markov chain allows us to create a list of centrality rankings containing a snapshot related to the daily commuting patterns. These snapshots may be used to efficiently compare pairs of days in order to quantify how similar or dissimilar the networks are.

As seen in Figure 3, although the sum of outgoing edges is one for all nodes after normalization, PR will still capture the importance of nodes with a large number of trips and connections to other important nodes.

Figure 3.

A simple example showing how the values for both EC and PR remain unchanged in both the original and the normalized network (left and right, respectively).

It should be noted that although both EC and PR are dimensionless and cannot be directly compared with each other, PR has the advantage of being normalized and outputting a probability distribution which may intuitively be interpreted as the likelihood of passing through a given node in a random walk traversing process.

Graph temporal clustering

Network comparison is a difficult task with a complexity that commonly grows exponentially with the number of nodes being considered. The task is significantly simplified when there is a known node to node correspondence between the networks. In our case, given that all nodes correspond to TM stations, we use Vertex Ranking (VR), a technique introduced by Papadimitriou et al. (2010), to efficiently quantify and detect changes in the topology of a computer network through time.

VR uses a modified version of Spearman’s rank correlation coefficient and assumes that two networks are similar if the rankings of their nodes are similar. There are many possible measures of node centrality which may be used as a ranking. We have chosen PR due to its advantages over other centrality measures, mentioned in the PageRank and mobility networks as Markov chains section, as well as the highly optimized method used to calculate it and its rapid convergence rate.

Given two networks $G_m = (V_m, E_m)$ and $G_n = (V_n, E_n)$ corresponding to days $m$ and $n$, their similarity may be measured as

$$VR(G_m, G_n) = 1 - \frac{2 \sum_{v \in V_m \cup V_n} w_v \times \left(\pi(v,m) - \pi(v,n)\right)^2}{D} \tag{5}$$

where $\pi(v,m)$ and $\pi(v,n)$ are the positions of node $v$ in the sorted rank lists for days $m$ and $n$, $D$ is a normalization factor, and the quality factor $w_v$ is calculated from the node’s centrality measures.
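A minimal sketch of equation (5) follows. The text leaves $w_v$ and $D$ unspecified, so both are assumptions here: $w_v$ is taken as the average of the node's centrality on the two days, and $D$ is chosen as a simple upper bound on the numerator (no node can move more than $|V| - 1$ positions), which keeps the result in [0, 1]. The function names, the tie-breaking by node name, and the toy scores are also hypothetical.

```python
def rank_positions(scores):
    """Map each node to its 1-based position, highest score first."""
    ordered = sorted(scores, key=lambda v: (-scores[v], v))  # ties by name
    return {v: position for position, v in enumerate(ordered, start=1)}


def vertex_ranking_similarity(scores_m, scores_n):
    """Similarity of two days given {station: centrality} for each day."""
    nodes = set(scores_m) | set(scores_n)
    # Assumption: a station absent on one day gets centrality 0 that day.
    full_m = {v: scores_m.get(v, 0.0) for v in nodes}
    full_n = {v: scores_n.get(v, 0.0) for v in nodes}
    pos_m, pos_n = rank_positions(full_m), rank_positions(full_n)
    # Assumption: w_v is the mean centrality of v over both days.
    w = {v: (full_m[v] + full_n[v]) / 2 for v in nodes}
    numerator = 2 * sum(w[v] * (pos_m[v] - pos_n[v]) ** 2 for v in nodes)
    # Assumption: D bounds the numerator because |rank change| <= len(nodes) - 1.
    D = 2 * sum(w.values()) * (len(nodes) - 1) ** 2
    return 1.0 if D == 0 else 1 - numerator / D


# Toy centralities for three stations on two days (not TM data).
day_a = {"S1": 0.5, "S2": 0.3, "S3": 0.2}
day_b = {"S1": 0.2, "S2": 0.3, "S3": 0.5}
same = vertex_ranking_similarity(day_a, day_a)
swapped = vertex_ranking_similarity(day_a, day_b)
```

Identical rankings give a similarity of 1, while swapping the first and last stations lowers it. Because the squared rank change is multiplied by $w_v$, a change in the position of a central station weighs more than the same change for a marginal one.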

Using the VR measure makes it possible to efficiently build a similarity matrix with rows and columns corresponding to different days, and cells corresponding to their similarity $sim(G_m, G_n) \in [0, 1]$.

Once the similarities between the graphs are calculated it is possible to find clusters. Spectral clustering, also known as spectral graph partitioning (Von Luxburg, 2007), is an algorithm based on clustering the projections of the data samples onto the eigenvectors of the normalized Laplacian. Spectral clustering algorithms tend to outperform other traditional algorithms such as k-means when the structure of the individual clusters is highly non-convex or highly spread. We used spectral clustering to group days by the similarity of their mobility networks.
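The core of the procedure can be illustrated with NumPy for the simplest case of two clusters. This is a sketch under stated assumptions, not the authors' pipeline: it uses the symmetric normalized Laplacian $L = I - D^{-1/2} S D^{-1/2}$ and splits the days by the sign of the eigenvector with the second-smallest eigenvalue, whereas finding more clusters requires clustering several eigenvectors at once. The function name and the toy similarity matrix are hypothetical.

```python
import numpy as np


def spectral_bipartition(similarity):
    """Split items into two groups from a symmetric similarity matrix."""
    S = np.asarray(similarity, dtype=float)
    degree = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} S D^{-1/2}.
    laplacian = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    second = eigenvectors[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (second > 0).astype(int).tolist()


# Toy similarity between six days: days 0-2 resemble each other, as do 3-5.
within, across = 0.9, 0.1
S = np.full((6, 6), across)
S[:3, :3] = within
S[3:, 3:] = within
np.fill_diagonal(S, 1.0)
labels = spectral_bipartition(S)
```

The two label values are arbitrary (the sign of an eigenvector is not determined), so only the grouping is meaningful. For more than two clusters, a library implementation such as scikit-learn's `SpectralClustering` with `affinity="precomputed"` accepts a similarity matrix of this kind.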

Results

In this section we present the graphical and numerical results from analyzing the TM swipe card validation data.

Applying spectral clustering to the daily mobility pattern similarity graph, as explained in the previous section, results in six distinct groups. A graphical representation of the network-based clustering may be seen in Figure 4.

Figure 4.

Total daily validations in TM stations from February 13 to December 31st, 2020. Markers correspond to the clusters generated by spectral clustering. We have named the clusters according to the dates encompassing each one: “Pre-pandemic” (orange circle), “Lockdown” (red star), first relaxation of lockdown or “Early recovery” (purple square), “Late recovery” (blue diamond), Saturdays (black cross) and holidays and Sundays (green circle).

The clusters show clear temporal patterns highly correlated to specific moments of the pandemic, namely, Baseline (pre-pandemic), Lockdown, Early (first lockdown relaxation), and Late (economic recovery measures in place). Additionally, the clustering algorithm assigns specific classes for Saturdays and for Sundays and local public holidays such as May the first or July the 20th, a Colombian national holiday.

In order to understand the city-wide mobility changes between the time periods defined by the identified clusters, we compare the centrality changes in the TM stations. We are particularly interested in those with the largest changes in relative centrality (independently of the number of trips) as they highlight the city areas with the sharpest increase or decrease of mobility and, therefore, activity.

To have a meaningful comparison of the changes independently of the size of a station, all station measurements are normalized by subtracting the corresponding Baseline cluster’s PR mean and dividing by its standard deviation.
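For a single station, this standardization can be sketched as follows. The function name, the use of the sample standard deviation (the text does not say whether the sample or population form was used), and the toy PR values are assumptions.

```python
from statistics import mean, stdev


def standardized_mean_diff(baseline_pr, period_pr):
    """Mean PR of a period expressed in Baseline standard deviations."""
    baseline_mean = mean(baseline_pr)
    baseline_sd = stdev(baseline_pr)  # assumption: sample standard deviation
    return (mean(period_pr) - baseline_mean) / baseline_sd


# Toy daily PR values of one station (not TM data).
baseline_days = [0.010, 0.012, 0.011, 0.009, 0.008]
lockdown_days = [0.015, 0.016, 0.014]
z_lockdown = standardized_mean_diff(baseline_days, lockdown_days)
```

A positive value means the station gained relative centrality with respect to its own Baseline variability; in this toy case the Lockdown mean lies a little over three Baseline standard deviations above the Baseline mean.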

Although the clustering method used identifies six different stages of COVID-19–related mobility restrictions, we are mainly interested in the differences between pre-pandemic stages (baseline), strict lockdown, and late recovery. These clusters reveal the major changes in the use of TM and give insights on how, despite the partial economic reopening of the city, the TM system has not been able to fully recover its baseline demand and how this recovery has not been spatially homogeneous.

Figure 5 compares the mean of each station’s centrality during the pre-pandemic Baseline vs Lockdown and Late. A geographical pattern related to the socioeconomic structure of the city is easily observable: stations in the south of the city (where most low-income households are located) show an increase in PR during Lockdown and, to a lesser degree, during Late recovery. TM stations in this area of the city were more active during the strict lockdown, which might be related to workers in essential activities such as health, care, or logistics relying on TM to commute.

Figure 5.

Comparison of the station centrality mean (PR) in baseline vs Lockdown (left) and baseline vs Late (right). Colors represent latitude of the station with red representing the south and blue the north.

Baseline vs. lockdown

The strong geographical component of the change in centrality may be seen in Table 1, which lists the five stations with the largest increase and the five with the largest decrease in standardized mean PR from Baseline to Lockdown. Standardized PR is calculated by subtracting the Baseline’s mean PR and dividing by its standard deviation in order to account for the differences in magnitude between stations. All five stations with the highest increase in mean standardized centrality from Baseline to Lockdown are located in the south end of the city (see Figure 6). Additionally, the top four are located near major public hospitals, and the fifth, “Juan Pablo II,” is a gondola lift station connecting one of Bogotá’s poorest and most marginalized neighborhoods with the TM system.

Table 1.

The top five and bottom five stations with the strongest Lockdown-Baseline changes as measured by the differences in the standardized mean PR centrality.

Baseline-lockdown

| Top 5: station name | Top 5: standardized mean diff. | Bottom 5: station name | Bottom 5: standardized mean diff. |
| --- | --- | --- | --- |
| San Bernardo | 13.897801 | Terminal | −7.426562 |
| Hortúa | 12.610034 | Calle 146 | −4.714783 |
| Hospital | 9.998976 | Toberín | −3.515932 |
| Ferias | 8.236083 | Calle 187 | −3.458311 |
| Juan Pablo II | 7.936324 | Calle 63 | −3.396144 |
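The ranking behind a table of this kind can be sketched in a few lines. The snippet below is only an illustration: it takes the ten standardized mean differences reported in Table 1 as input (in arbitrary order) and recovers the top five and bottom five; the dictionary layout and the function name `top_and_bottom` are assumptions, and a real analysis would rank all TM stations rather than these ten.

```python
# Standardized mean PR differences (Lockdown minus Baseline) as reported in Table 1,
# deliberately listed out of order.
table1_diffs = {
    "Calle 63": -3.396144,
    "Hospital": 9.998976,
    "Terminal": -7.426562,
    "Juan Pablo II": 7.936324,
    "Toberín": -3.515932,
    "San Bernardo": 13.897801,
    "Calle 187": -3.458311,
    "Ferias": 8.236083,
    "Calle 146": -4.714783,
    "Hortúa": 12.610034,
}

def top_and_bottom(diffs, k=5):
    """Return the k largest increases and the k largest decreases."""
    ranked = sorted(diffs.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:k]             # largest increase first
    bottom = ranked[-k:][::-1]   # largest decrease first
    return top, bottom

top, bottom = top_and_bottom(table1_diffs)
for name, value in top:
    print(f"increase  {name:15s} {value:+.6f}")
for name, value in bottom:
    print(f"decrease  {name:15s} {value:+.6f}")
```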

Figure 6.

Map of socioeconomic strata, averaged by blocks (stratum 1 for the lowest-income households and 6 for the highest-income), with the PR centrality for each TM station. The size of the dots is related to the Baseline centrality, while color is related to the increase or decrease of normalized PR between Baseline and Lockdown.

The station with the largest standardized mean centrality decrease from Baseline to Lockdown is “Terminal,” Bogotá’s main bus terminal for inter-city bus trips, which remained highly restricted during Lockdown. The next three stations with the strongest decrease (“Calle 146,” “Toberín,” and “Calle 187”) are all located along the northern corridor of TM and correspond to upper-middle-class, family-oriented residential neighborhoods. The last station, “Calle 63,” is still part of the northern zone of the city but serves a younger, professional, upper-middle-class population.

The relationship between socioeconomic level and the change in centrality during the pandemic is also seen in Figure 6, which maps changes in standardized centrality against socioeconomic strata. In this figure, the size of each station is proportional to the mean centrality (PR) of that node in the Baseline (pre-pandemic) scenario. After Lockdown, nodes in the south and southwest of the city show an increase in their centrality (blue and green colors), while nodes in the northeast show a decrease in their importance (red color). This is correlated with the predominant socioeconomic strata of the stations’ surroundings.

Baseline vs late

The analysis of the previous section may also be applied to the Baseline to Late transition. Table 2 shows that the stations with the strongest increase remain in the south-east end of the city, although they are no longer centered around medical services. Two high-ranking stations (“Juan Pablo II” and “Manitas”) are part of the gondola lift system, highlighting how the underprivileged sector it serves can afford to stop neither its economic activity nor its trips.

Table 2.

The top five and bottom five stations with the strongest Late-Baseline changes as measured by the differences in the standardized mean PR centrality.

Baseline-late

| Top 5: station name | Top 5: standardized mean diff. | Bottom 5: station name | Bottom 5: standardized mean diff. |
| --- | --- | --- | --- |
| Tygua - San José | 11.218690 | Salitre - El Greco | −5.678791 |
| Manitas | 5.640178 | Terminal | −3.588162 |
| Juan Pablo II | 5.533348 | Calle 146 | −3.281878 |
| Guatoque - Veraguas | 4.850199 | Las Aguas | −2.649208 |
| Restrepo | 4.351470 | Universidades | −2.542651 |

Two of the stations with the strongest decrease from Baseline to Lockdown (“Terminal” and “Calle 146”) are also among those with the strongest Baseline to Late decrease. Additionally, two stations near the downtown historic district and close to a large number of universities (“Universidades” and “Las Aguas”) appear in the list.

Discussion and conclusions

The analyzed data, ranging from mid-February to December 2020, shows an alarming reduction in TM use with a gradual recovery that does not return to baseline values. This threatens the financial feasibility of the system and may result in various city externalities related to the alternate transportation choices of lost passengers.

The Markovian network transport model is an appropriate model to compare days or stations with a significant difference in size as measured by the number of validations. The network also provides a compact way of representing the daily mobility data.
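The size-independence claimed above can be illustrated with a small sketch. It assumes, as a simplification not confirmed in this section, that a day's network is a matrix of trip counts between stations and that PR is computed by power iteration with a damping factor of 0.85; the function name, the damping value, and the toy matrices are all illustrative.

```python
import numpy as np

def pagerank(trips, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank of a weighted directed network by power iteration.

    trips[i, j] is a hypothetical count of trips from station i to station j.
    """
    trips = np.asarray(trips, dtype=float)
    n = trips.shape[0]
    out = trips.sum(axis=1, keepdims=True)
    # Row-normalize counts into transition probabilities (a Markov chain);
    # stations with no outgoing trips jump uniformly to any station.
    P = np.where(out > 0, trips / np.where(out > 0, out, 1.0), 1.0 / n)
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (rank @ P) + (1.0 - damping) / n
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank
        rank = new_rank
    return rank

# Toy example: the same travel pattern on a quiet day and on a day with 10x the trips.
quiet_day = np.array([
    [0, 8, 2],
    [5, 0, 5],
    [1, 9, 0],
])
busy_day = 10 * quiet_day

print(pagerank(quiet_day))
print(pagerank(busy_day))  # same ranking vector: only the pattern matters, not the volume
```

Since the counts are row-normalized before iterating, scaling the whole matrix leaves the transition probabilities, and hence the PR vector, unchanged. This is what allows a lockdown day with a fraction of the usual validations to be compared directly with a Baseline day.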

The cluster analysis was able to clearly separate movement-pattern groups based on network comparison data. These groups coincided with a baseline situation (before the declaration of the COVID-19 emergency), lockdown measures, early relaxation of mobility restrictions, later economic recovery policies, Saturdays, and holidays (including Sundays).

Results show that the most relevant nodes are related to the commuting of workers in essential activities and correlate with the demographic composition of the areas surrounding TM stations.

The commuting assumptions made, given the data available, disregard gendered differences and might better reflect traditional male mobility patterns (home-work-home) in a cultural context with relatively traditional gender divisions of household labor. Female commuting patterns, such as trip chaining (due to multi-purpose and/or multi-modal journeys), might be underestimated by the proposed method of constructing networks. Data with gendered attributes would help to overcome this simplification. However, the difficulty of obtaining complete data, especially in developing countries, highlights the need for innovative methods and approaches to analyze data in support of decision-making.

The possibilities offered by data and network analysis merit further research including, but not limited to, predictive tasks of the behavior of the TM system as well as the occupancy of TM stations. This would offer stakeholders insights regarding decision-making in the resource allocation for stations. Also, related to other attributes of available data, it might be possible to identify the movement patterns of a random person or to use a finer granularity for temporal analysis of networks (e.g., hourly).

Finally, it should be noted that this work has limited itself to analyzing the disruptions in 2020, a year with a pre-pandemic period, severe lockdown disruptions, and the reopening of many activities in pursuit of economic and social recovery from the pandemic. The following year, 2021, was atypical not only because of the pandemic but also because of the general strike and social unrest, partially related to the social effects of the lockdown itself. The strike severely affected not only the demand for TM but also its service. For example, “Portal Américas,” one of the main stations of the system, was taken over by protesters, renamed “Portal Resistencia,” and used as the headquarters and epicenter of the protests for extended periods of time (Daviaud and Carvajal, 2021). An extension of our analysis to 2021 could therefore not reliably distinguish changes in TM due to the pandemic from those derived from social unrest.

Biography

Juan D. Garcia-Arteaga is an assistant professor at the School of Medicine of the Universidad Nacional de Colombia (Bogotá, Colombia). He graduated as an Electrical Engineer from the Universidad de Los Andes (Bogotá, Colombia) and received a D.E.A. from the Universitat Politècnica de Catalunya (Barcelona, Spain) and a PhD from the Czech Technical University (Prague, Czech Republic). His current research interests are the use of networks and machine learning techniques in the modelling and behaviour prediction of urban and biological systems.

Laura Lotero is an associate professor in Industrial Engineering at the Universidad Pontificia Bolivariana in Colombia. She received her Ph.D. in Engineering from the Universidad Nacional de Colombia, at the Department of Decision and Computer Science. Her current research interests are modeling and simulating the complexity of urban systems, urban mobility networks, and investigating inequalities in urban and digital contexts.

Footnotes

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Juan D Garcia-Arteaga https://orcid.org/0000-0002-4534-6393

Laura Lotero https://orcid.org/0000-0002-6537-3276

References

  1. Abdullah M, Dias C, Muley D, et al. (2020) Exploring the impacts of COVID-19 on travel behavior and mode preferences. Transportation Research Interdisciplinary Perspectives 8: 100255.
  2. Anwari N, Ahmed MT, Islam MR, et al. (2021) Exploring the travel behavior changes caused by the COVID-19 crisis: a case study for a developing country. Transportation Research Interdisciplinary Perspectives 9: 100334.
  3. Arellana J, Márquez L, Cantillo V. (2020) COVID-19 outbreak in Colombia: an analysis of its impacts on transport systems. Journal of Advanced Transportation 2020: 1–16.
  4. Arenas A, Cota W, Gómez-Gardeñes J, et al. (2020) Modeling the spatiotemporal epidemic spreading of COVID-19 and the impact of mobility and social distancing interventions. Physical Review X 10(4): 041055.
  5. Barry JJ, Newhouser R, Rahbee A, et al. (2002) Origin and destination estimation in New York City with automated fare system data. Transportation Research Record 1817(1): 183–187.
  6. Batty M. (2020) The coronavirus crisis: What will the post-pandemic city look like? Environment and Planning B: Urban Analytics and City Science 47(4): 547–552. DOI: 10.1177/2399808320926912.
  7. Benita F. (2021) Human mobility behavior in COVID-19: a systematic literature review and bibliometric analysis. Sustainable Cities and Society 70: 102916.
  8. Beria P, Lunkar V. (2021) Presence and mobility of the population during the first wave of covid-19 outbreak and lockdown in Italy. Sustainable Cities and Society 65: 102616.
  9. Bocarejo JP, Urrego LF. (2020) The Impacts of Formalization and Integration of Public Transport in Social Equity: The Case of Bogota. Research in Transportation Business & Management: 100560.
  10. Bollen J, Rodriquez MA, Van de Sompel H. (2006) Journal status. Scientometrics 69(3): 669–687.
  11. Chen C, Ma J, Susilo Y, et al. (2016) The promises of big data and small data for travel behavior (aka human mobility) analysis. Transportation Research Part C: Emerging Technologies 68: 285–299.
  12. Das S, Boruah A, Banerjee A, et al. (2021) Impact of COVID-19: a radical modal shift from public to private transport mode. Transport Policy 109: 1–11.
  13. Daviaud S, Carvajal J. (2021) Colombia: el papel del arte y de la justicia transicional en la salida de la violencia. 03358252.
  14. Dueñas M, Campi M, Olmos LE. (2021) Changes in mobility and socioeconomic conditions during the covid-19 outbreak. Humanities and Social Sciences Communications 8(1): 1–10.
  15. Erik A, Isak R, Matej C, et al. (2021) Who continued travelling by public transport during COVID-19? Socioeconomic factors explaining travel behaviour in Stockholm 2020 based on smart card data. European Transport Research Review 13(1): 31.
  16. Freeman LC. (1978) Centrality in social networks conceptual clarification. Social Networks 1(3): 215–239.
  17. Gilbert A. (2008) Bus rapid transit: is Transmilenio a miracle cure? Transport Reviews 28(4): 439–467.
  18. Goulias KG. (2021) Special Issue on Understanding the Relationships between COVID-19 and Transportation. Transportation Letters: 1–4.
  19. Gramsch B, Guevara A, Munizaga M, et al. (2020) The Effect of Dynamic Lockdowns on Public Transport Demand in Times of COVID-19: Evidence from Smartcard Data.
  20. Guimera R, Mossa S, Turtschi A, et al. (2005) The worldwide air transportation network: Anomalous centrality, community structure, and cities’ global roles. Proceedings of the National Academy of Sciences 102(22): 7794–7799.
  21. Guzman LA, Oviedo D, Cardona R. (2018) Accessibility changes: analysis of the integrated public transport system of Bogotá. Sustainability 10(11): 3958.
  22. Heroy S, Loaiza I, Pentland A, et al. (2021) Covid-19 policy analysis: labour structure dictates lockdown mobility behaviour. Journal of the Royal Society Interface 18(176): 20201035.
  23. Hidalgo D, Pereira L, Estupiñán N, et al. (2013) Transmilenio BRT system in Bogota, high performance and positive impact–main results of an ex-post evaluation. Research in Transportation Economics 39(1): 133–138.
  24. Jenelius E, Cebecauer M. (2020) Impacts of COVID-19 on public transport ridership in Sweden: Analysis of ticket validations, sales and passenger counts. Transportation Research Interdisciplinary Perspectives 8: 100242.
  25. Jiang B. (2009) Ranking spaces for predicting human movement in an urban environment. International Journal of Geographical Information Science 23(7): 823–837.
  26. Jiang B, Claramunt C, Klarqvist B. (2000) Integration of space syntax into GIS for modelling urban spaces. International Journal of Applied Earth Observation and Geoinformation 2(3–4): 161–171.
  27. Kang Y, Gao S, Liang Y, et al. (2020) Multiscale dynamic human mobility flow dataset in the US during the COVID-19 epidemic. Scientific Data 7(1): 1–13.
  28. Kim K. (2021) Impacts of COVID-19 on transportation: Summary and synthesis of interdisciplinary research. Transportation Research Interdisciplinary Perspectives 9: 100305.
  29. Kraemer MU, Yang C-H, Gutierrez B, et al. (2020) The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368(6490): 493–497.
  30. Labonté-LeMoyne É, Chen S-L, Coursaris CK, et al. (2020) The unintended consequences of COVID-19 mitigation measures on mass transit and car use. Sustainability 12(23): 9892.
  31. Liu L, Hou A, Biderman A, et al. (2009) Understanding individual and collective mobility patterns from smart card records: a case study in Shenzhen. In: 12th International IEEE Conference on Intelligent Transportation Systems. IEEE, pp. 1–6.
  32. Lotero L, Cardillo A, Hurtado R, et al. (2016) Several multiplexes in the same city: the role of socioeconomic differences in urban mobility. In: Garas A. (ed), Interconnected Networks. Understanding Complex Systems. Cham: Springer International Publishing, pp. 149–164.
  33. Lotero L, Hurtado RG, Floría LM, et al. (2016) Rich do not rise early: spatio-temporal patterns in the mobility networks of different socio-economic classes. Royal Society Open Science 3(10): 150654.
  34. Musselwhite C, Avineri E, Susilo Y. (2021) Restrictions on mobility due to the coronavirus Covid19: threats and opportunities for transport and health. Journal of Transport & Health 20: 101042.
  35. Newman M. (2018) Networks. Oxford University Press.
  36. Orro A, Novales M, Monteagudo Á, et al. (2020) Impact on city bus transit services of the covid–19 lockdown and return to the new normal: The case of A Coruña (Spain). Sustainability 12(17): 7206.
  37. Page L, Brin S, Motwani R, et al. (1999) The Pagerank Citation Ranking: Bringing Order to the Web. Technical report, Stanford InfoLab.
  38. Papadimitriou P, Dasdan A, Garcia-Molina H. (2010) Web graph similarity for anomaly detection. Journal of Internet Services and Applications 1(1): 19–30.
  39. Pepe E, Bajardi P, Gauvin L, et al. (2020) COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Scientific Data 7(1): 1–7.
  40. Pop F, Dobre C. (2012) An Efficient Pagerank Approach for Urban Traffic Optimization. Mathematical Problems in Engineering.
  41. Prieto Curiel R, Patino JE, Duque JC, et al. (2021) The heartbeat of the city. Plos One 16(2): 1–30. DOI: 10.1371/journal.pone.0246714.
  42. Rothengatter W, Zhang J, Hayashi Y, et al. (2021) Pandemic Waves and the Time after Covid-19–Consequences for the Transport Sector. Transport Policy.
  43. Secretaría Distrital de Movilidad (2019) Encuestas de movilidad. Available at: https://www.simur.gov.co/encuestas-de-movilidad
  44. Thombre A, Agarwal A. (2021) A Paradigm Shift in Urban Mobility: Policy Insights from Travel before and after COVID-19 to Seize the Opportunity. Transport Policy.
  45. Vecchio G. (2017) Democracy on the move? Bogotá’s urban transport strategies and the access to the city. City, Territory and Architecture 4(1): 1–15.
  46. Von Luxburg U. (2007) A tutorial on spectral clustering. Statistics and Computing 17(4): 395–416.
  47. Wang J, Mo H, Wang F, et al. (2011) Exploring the network structure and nodal centrality of China’s air transport network: A complex network approach. Journal of Transport Geography 19(4): 712–721.
  48. Wasserman S, Faust K, et al. (1994) Social Network Analysis: Methods and Applications.
  49. Yan E, Ding Y. (2009) Applying centrality measures to impact analysis: A coauthorship network analysis. Journal of the American Society for Information Science and Technology 60(10): 2107–2118.
  50. Zhang J, Hayashi Y, Frank LD. (2021) COVID-19 and transport: Findings from a world-wide expert survey. Transport Policy 103: 68–85.
  51. Zhang T, Li G, Xu Y, et al. (2016) Prediction of transportation network based on PageRank algorithm. In: 5th International Conference on Advanced Materials and Computer Science, Qingdao.

Articles from Environment and Planning. B, Urban Analytics and City Science are provided here courtesy of SAGE Publications
