Abstract
We explore the connections among global stock markets during the financial crises and risks since 1995, with emphasis on the situation under COVID-19. We choose 40 countries/regions and take one index from each of them, and then compute the correlation coefficients and distances between each pair of indices with a sliding window. We construct the complexes and carry out topological data analysis, mainly through persistence landscapes and their $L^p$-norms, which exhibit the daily changes of the complexes. We establish a detection system for critical dates based on the persistence landscapes. Topological features of the complex networks are shown on the critical dates and on dates before them. All the results show clearly that the markets became more closely connected when COVID-19 spread worldwide than during any other risk. The robustness and effectiveness of these methods provide guidance for the analysis of financial crises in the future.
Keywords: Stock index, Topological data analysis, Persistence landscape, Complex network, COVID-19
1. Introduction
Since COVID-19 broke out in China in late December of 2019, it has spread all over the world. During this period, the talks between Saudi Arabia and Russia on oil production reduction broke down, and Saudi Arabia started full-scale production, leading to a sharp drop in global oil prices. Together with the decline of the global economy, the negative effects on regional economies accumulated. As a result, the financial markets reacted quickly and global stock indices fell sharply. European stocks generally fell more than 7% on March 9, 2020. Among the three major European stock indices, the FTSE100 index in the UK fell 7.69%, the DAX30 index in Germany fell 7.89%, and the CAC40 index in France fell 8.39%. The biggest losers in Europe were Greece and Italy, falling 13% and 11% respectively. In the US, the condition was even worse. The markets experienced a once-in-a-century liquidity crisis. A market-wide circuit breaker was triggered four times, on March 9, March 12, March 16 and March 18 of 2020. Markets in Australia, China, Canada, Brazil and some other countries also collapsed in the period. The global stock markets were undergoing a terrible time. It is urgent and necessary to study the situation, so as to better understand it and to support effective prevention and control in the near future. Many research works have been done on the stock or bond markets under COVID-19, e.g., Ouzan [1], Sharif et al. [2], Jia et al. [3], Ashraf [4], Umar et al. [5], Akhtaruzzaman et al. [6], Just and Echaust [7], Papadamou et al. [8] and Mazur et al. [9].
To study the situation more deeply, we intend to explore the connections among the global markets from a topological perspective. Furthermore, we will also analyze the subprime crisis in the United States from 2007 to 2008 and the European debt crisis from 2010 to 2011 chronologically, so as to compare the situations among these three crises. COVID-19 is characterized by long duration, rapid growth and a wide range of influence, and has a significant impact on international finance. Different from the previous two crises, the financial turmoil under COVID-19 is believed to originate from outside the financial system. However, the loss of people's trust and confidence is one of the common causes of crises. Once it spreads widely and rapidly around the world, turbulence in the global stock markets inevitably follows.
Sheffer et al. [10] showed that increased spatial coherence and changes in networks may provide useful information on the transitions to financial risks. It is therefore valuable to study financial time series based on their topological structures. A lot of work in this direction has been done with the help of complex networks. Ng et al. [11] combined a fuzzy system with a neural network to study the characteristics of financial distress. Haldane and May [12] discussed the interplay between complexity and stability in deliberately simplified models of financial networks, and gave some suggestions to minimize systemic risks. Billio et al. [13] proposed correlation network models that combined the financial and insurance networks. Diebold and Yilmaz [14] proposed several connectedness measures built from pieces of variance decomposition, and analyzed the volatility of US financial institutions during the 2008 global financial crisis. Yan et al. [15] constructed the MST and PMFG correlation networks based on US stock indices and showed that the non-fractal nature of the MST network and the systemically important companies had significant influence on the market stability. Ahelegbey et al. [16] extended the approach by introducing stochastic graphical models. Mezei and Sarlin [17] proposed RiskRank as a general-purpose aggregation operator of risk in nodes and links of a hierarchical network, and showed its effect by the performance in an out-of-sample analysis of systemic risks in Europe. Biplob et al. [18] analyzed the changing integration of Asian financial markets within the global financial network from 1995 to 2016, and found that the connections between the Asian markets and the rest of the world are deepening. Gai and Kapadia [19] analyzed how network models offer a compelling description of the structure of real-world financial systems and shed light on different contagion mechanisms during the global financial crises. Zhu et al. [20] proposed a network quantile autoregressive model, and tested its effect on analyzing the contagion of systemic financial risks among China's stock markets. Xu et al. [21] studied internet financial risks with complex networks and showed their effect on discovering potential hidden dangers of systemic risks. Giudici et al. [22] addressed the multivariate nature of systemic risks by correlation network models and used the multivariate network structure as a viable means to assess common exposures.
Complex networks help us understand the nature of, and the relations among, the indicators. However, they are static in the sense that the structure extracted from point cloud data usually relies on a fixed threshold value. Furthermore, it is difficult to capture the daily change of the networks. We can hardly detect the critical dates in the transition of crises or risks from the complex networks alone. We want to obtain the daily dynamic change of the topological structures from point cloud data. Topological data analysis (abbreviated TDA in the sequel) based on persistent homology enables us to do this work more easily and deeply. Collins et al. [23] proposed a shape descriptor, called a barcode, and gave a metric on the space consisting of barcodes, which helped compare point cloud data for shape recognition and clustering. Carlsson [24] presented the rigorous theoretical basis and the power of TDA in data analysis. Bubenik introduced in [25] a new summary of the barcode, called a persistence landscape, to help quantify persistent homology. Later, Bubenik and Dlotko provided a toolbox for implementing persistence landscapes in [26]. The theory is becoming a useful method in dealing with big data.
Gidea [27] studied the 2007–2008 financial crisis in the US based on TDA. He mainly used persistence barcodes and diagrams to capture the critical information in the financial time series. Then Gidea and Katz [28] studied the 2000 and 2008 financial crises mainly based on persistence landscapes. Guo et al. [29] analyzed the US, European and Chinese stock markets separately during the 2007–2008 global financial crisis and the 2010–2011 European debt crisis based on TDA. These results show that TDA offers a special perspective and works well for the analysis of financial crises. Note that the complexes in these works were built from only a few stocks or indices. We will use TDA to do the analysis in a completely different way.
We want to combine TDA with complex networks to analyze the connections among global financial markets in the transition of financial crises since 1995, with emphasis on the situation under COVID-19. We pick 40 countries/regions and take one typical stock index from each of them. We build a system based on TDA to help detect the critical dates from the point cloud data. We then analyze the complex networks on the detected critical dates by several topological measures. Our work is organized as follows.
In Section 2, we first introduce the correlation coefficient and distance between two series. Then we recall the basic terminology of TDA, and introduce the method for constructing the complexes. The persistence landscapes and their $L^p$-norms are explained in detail. We then introduce the basic theory and topological measures of two kinds of complex networks, i.e., threshold networks and minimum spanning trees. Finally, we introduce the data and the processing method used in the empirical studies.
In Section 3, we construct the complexes based on the stock indices. We compute the $L^1$-, $L^2$- and $L^\infty$-norms of the persistence landscapes through the persistence barcodes with sliding windows of different sizes $w$. Then we build a detection system to help find the critical dates. On these dates, the norms change sharply, which reflects great changes of the complexes. We also divide the indices into subgroups to make comparisons and check the robustness of the system.
In Section 4, we further focus the analysis on the critical dates by complex networks. Using several topological measures on the networks, we compare the present situation with several previous crises and find clear differences in the networks. This also confirms the effectiveness of the TDA mechanism constructed in Section 3.
In Section 5, we draw a conclusion and take a look into the future.
2. Methodology and data
In this section, we introduce the basic methodology of TDA and complex networks that we will use to analyze the financial time series in the following sections. Before that, we first show how to construct distances from the time series, which are then used to construct simplicial complexes and complex networks. We also introduce the data and the processing method.
2.1. Correlation coefficient and distance
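For two series $x_i=(x_i(1),\dots,x_i(n))$ and $x_j=(x_j(1),\dots,x_j(n))$ with length $n$, their correlation coefficient is obtained in the following way (Feller [30]),
$$\rho_{ij}=\frac{\sum_{k=1}^{n}\bigl(x_i(k)-\bar{x}_i\bigr)\bigl(x_j(k)-\bar{x}_j\bigr)}{\sqrt{\sum_{k=1}^{n}\bigl(x_i(k)-\bar{x}_i\bigr)^2}\,\sqrt{\sum_{k=1}^{n}\bigl(x_j(k)-\bar{x}_j\bigr)^2}}, \tag{1}$$
where $\bar{x}_i$ and $\bar{x}_j$ are the mean values of the series $x_i$ and $x_j$ respectively.
Then we use the typical method introduced in Mantegna [31] to convert the correlation coefficient to the distance between $x_i$ and $x_j$,
$$d_{ij}=\sqrt{2(1-\rho_{ij})}. \tag{2}$$
It was also shown (Mantegna [31]) to satisfy the following three metric axioms,
(a) $d_{ij}=0$ if and only if $i=j$;
(b) $d_{ij}=d_{ji}$;
(c) $d_{ij}\le d_{ik}+d_{kj}$.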
The distance will be used to do TDA and construct complex networks.
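As a small illustration of this subsection, the following Python sketch computes the Pearson correlation coefficient of two equal-length series and converts it to a distance, assuming Mantegna's formula $d=\sqrt{2(1-\rho)}$. The function names `correlation` and `mantegna_distance` and the toy numbers are hypothetical and serve only as an example.

```python
import numpy as np


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float(np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2)))


def mantegna_distance(rho):
    """Distance sqrt(2 * (1 - rho)); the max() guards against round-off."""
    return float(np.sqrt(max(0.0, 2.0 * (1.0 - rho))))


# Toy series (made-up numbers, not data from the paper).
x = [0.01, -0.02, 0.03, 0.00, -0.01]
y = [0.02, -0.01, 0.02, 0.01, -0.02]
rho = correlation(x, y)
d = mantegna_distance(rho)
print(rho, d)
```

Under this formula, perfectly correlated series are at distance 0, uncorrelated series at distance $\sqrt{2}$, and perfectly anti-correlated series at distance 2.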
2.2. Background of TDA
We give a brief introduction of some terminologies regarding TDA.
Suppose we have a data cloud $X=\{x_1,\dots,x_n\}$ in a topological space endowed with a metric $d$. Denote the $\varepsilon$-ball of each point by $B_\varepsilon(x_i)$ for any $\varepsilon>0$. A Vietoris–Rips complex $R(X,\varepsilon)$ w.r.t. the positive value $\varepsilon$ is the simplicial complex whose vertex set is $X$ and where $\{x_{i_0},\dots,x_{i_k}\}$ spans a $k$-simplex if and only if $d(x_{i_p},x_{i_q})\le\varepsilon$ for any $0\le p,q\le k$. Note that in $R(X,\varepsilon)$, two points $x_i$ and $x_j$ are connected when $d(x_i,x_j)\le\varepsilon$. It is not difficult to see that a $k$-simplex may merge into a new higher dimensional simplex as the value $\varepsilon$ increases. For the concepts of $k$-simplex and simplicial complex, please refer to Munkres [32]. Furthermore, although there are other types of complexes, we only use Vietoris–Rips complexes for the analysis and abbreviate them as complexes in the sequel.
To characterize a complex and quantify its change as the value $\varepsilon$ increases, the persistence barcode is the most direct tool. A barcode (Collins et al. [23]) is a graphical representation of a collection of horizontal line segments (intervals) in a plane, with each interval representing the life of a topological hole, i.e., a homology class. For example, an interval $[a,b)$ in a 0-dim barcode corresponds to a connected component in the corresponding complex, meaning that the component emerges when $\varepsilon=a$ and vanishes when $\varepsilon=b$, and an interval in a 1-dim barcode corresponds to an independent loop in the complex, with the end points being the values of $\varepsilon$ at which the loop emerges and vanishes.
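To make the 0-dim barcode concrete, here is a minimal Python sketch that computes it directly from a distance matrix with a union-find structure. It assumes the convention that an edge enters the complex as soon as the distance between its two end points is at most the filtration value; the function name `h0_barcode` and the three-point example are hypothetical. It is not the toolbox used in the paper, and it covers only connected components, not loops.

```python
import numpy as np


def h0_barcode(dist):
    """0-dim barcode of the Vietoris-Rips filtration of a distance matrix.

    Every point is born at 0. A connected component dies at the length of
    the edge that merges it into another component. One component survives
    forever and gets death value infinity.
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    parent = list(range(n))

    def find(a):
        # Root of the component containing a, with path compression.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Process all edges in order of increasing length.
    edges = sorted((dist[i, j], i, j) for i in range(n) for j in range(i + 1, n))
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components
            parent[ri] = rj
            bars.append((0.0, float(length)))
    bars.append((0.0, float("inf")))
    return bars


# Three points on a line at positions 0, 1 and 3.
example = [[0.0, 1.0, 3.0],
           [1.0, 0.0, 2.0],
           [3.0, 2.0, 0.0]]
print(h0_barcode(example))
```

The finite death values are exactly the edge lengths of a minimum spanning tree of the point cloud, which links this barcode to the MST used later for network analysis.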
However, it is not enough to analyze the shape of the data only by the barcodes. We need other ways to convert the barcodes to computable properties. A natural representation of a barcode, called a persistence diagram (see Cohen-Steiner et al. [33]), is the set $\{(a_i,b_i)\}_{i\in I}$. Note that each $(a_i,b_i)$ here is a 2-dim point, where $a_i$ and $b_i$ are the end points of the $i$th interval in the barcode. We can compare and calculate the difference between the diagrams by the bottleneck distance and the degree $p$ Wasserstein distance (see Cohen-Steiner et al. [33] and Gidea and Katz [28] respectively). They induce a metric on the space of persistence diagrams. Even so, it is still hard to grasp the continual change of the data shapes. The persistence landscapes and their $L^p$-norms introduced in Bubenik [25] improve the method effectively. Applying them to the complexes, we will see the continual change in the shapes of the point cloud data. Furthermore, persistence landscapes are as robust as diagrams under perturbations of the data. Their stability as a summary statistic was also shown in detail in Bubenik [25].
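Now we introduce the persistence landscapes briefly. Each point $(a_i,b_i)$ in a persistence diagram corresponds to a piecewise linear function shown below,
$$f_{(a_i,b_i)}(x)=\begin{cases}x-a_i, & x\in\left(a_i,\frac{a_i+b_i}{2}\right],\\ b_i-x, & x\in\left(\frac{a_i+b_i}{2},b_i\right),\\ 0, & x\notin(a_i,b_i).\end{cases} \tag{3}$$
Then we define
$$\lambda_k(x)=k\text{-}\max\bigl\{f_{(a_i,b_i)}(x)\bigr\}_{i\in I}, \tag{4}$$
where $k\text{-}\max$ denotes the $k$th largest element for $k\in\mathbb{N}$.
According to Bubenik [25], a persistence landscape is a sequence of functions $\lambda=(\lambda_k)_{k\in\mathbb{N}}$, and $\lambda_k$ is called the $k$th persistence landscape function. The critical points of $\lambda_k$ are those values of $x$ at which the slope changes. The set of critical points of the persistence landscape $\lambda$ is the union of the sets of critical points of the functions $\lambda_k$.
The $L^p$-norms of a persistence landscape $\lambda$ are defined in the following way for $1\le p<\infty$,
$$\|\lambda\|_p=\left(\sum_{k=1}^{\infty}\|\lambda_k\|_p^p\right)^{1/p}, \tag{5}$$
where $\|\lambda_k\|_p$ denotes the $L^p$-norm of $\lambda_k$, i.e., $\|\lambda_k\|_p=\left(\int_{\mathbb{R}}|\lambda_k(x)|^p\,dx\right)^{1/p}$ with respect to the Lebesgue measure.
The $L^\infty$-norm of $\lambda$ is defined in the following way,
$$\|\lambda\|_\infty=\sup_{k\in\mathbb{N},\,x\in\mathbb{R}}\lambda_k(x). \tag{6}$$
By the definition of $\lambda_k$, it is clear that $\|\lambda\|_\infty=\|\lambda_1\|_\infty$.
For two persistence landscapes $\lambda$ and $\eta$, define $(\lambda+\eta)_k=\lambda_k+\eta_k$ and $(c\lambda)_k=c\lambda_k$ for all $c\in\mathbb{R}$ and $k\in\mathbb{N}$. Then the space consisting of persistence landscapes endowed with the $L^p$-norm forms a subspace of the Banach space $L^p(\mathbb{N}\times\mathbb{R})$. It hence becomes a Banach space. This makes persistence landscapes suitable for statistical analysis.
We will calculate the $L^1$-, $L^2$- and $L^\infty$-norms in the sequel to do TDA so as to help detect and confirm the critical dates for the financial crises.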
For more details of TDA and persistence homology, please refer to Collins et al. [23], Carlsson [24], and Cohen-Steiner et al. [33].
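The construction of a persistence landscape and its norms can be sketched numerically as follows. This is only an illustrative Python approximation on a sampling grid, not the toolbox of [26]: the function names `landscape` and `landscape_norm`, the grid resolution, and the use of the trapezoid rule for the integrals are all our assumptions.

```python
import numpy as np


def landscape(diagram, grid, k_max):
    """Sample the first k_max landscape functions on a grid.

    diagram: list of finite (a, b) pairs with a < b.
    Returns an array of shape (k_max, len(grid)); row k-1 holds lambda_k.
    """
    diagram = np.asarray(diagram, dtype=float)
    grid = np.asarray(grid, dtype=float)
    a = diagram[:, 0][:, None]
    b = diagram[:, 1][:, None]
    # Tent function of each pair: min(x - a, b - x), cut off at zero.
    tents = np.maximum(0.0, np.minimum(grid[None, :] - a, b - grid[None, :]))
    # Sort the tent values at each grid point in decreasing order.
    tents = -np.sort(-tents, axis=0)
    lam = np.zeros((k_max, grid.size))
    rows = min(k_max, tents.shape[0])
    lam[:rows] = tents[:rows]
    return lam


def landscape_norm(lam, grid, p):
    """L^p-norm of a sampled landscape; p may be float('inf')."""
    grid = np.asarray(grid, dtype=float)
    if np.isinf(p):
        return float(lam.max())
    f = lam**p
    # Trapezoid rule for the integral of each lambda_k^p over the grid.
    integrals = np.sum((f[:, 1:] + f[:, :-1]) * np.diff(grid) / 2.0, axis=1)
    return float(integrals.sum() ** (1.0 / p))


# A single interval (0, 2) gives one tent of height 1 and area 1.
grid = np.linspace(0.0, 2.0, 2001)
lam = landscape([(0.0, 2.0)], grid, k_max=2)
print(landscape_norm(lam, grid, 1),
      landscape_norm(lam, grid, 2),
      landscape_norm(lam, grid, float("inf")))
```

For the single interval in the example, the exact values are $1$ for the $L^1$-norm, $\sqrt{2/3}$ for the $L^2$-norm and $1$ for the $L^\infty$-norm, which the sampled version approximates.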
2.3. Background and topological measures of complex network
We will construct complex networks by distance matrices to study the connections of series. The initial networks are fully connected. Since only a few edges contain useful information, we will use threshold networks and minimum spanning trees to filter out the redundant information from the fully connected networks.
To establish a threshold network, we first construct an adjacency matrix $A=(A_{ij})_{N\times N}$ based on the distance matrix $D=(d_{ij})_{N\times N}$. It is clear that the larger the correlation coefficient is between two nodes $i$ and $j$ in the network, the smaller the distance $d_{ij}$ is between them. When $d_{ij}$ is less than or equal to a given threshold $\theta$, the corresponding element $A_{ij}=1$ in the adjacency matrix, and then the nodes $i$ and $j$ are connected by an edge (also called correlated); otherwise, $A_{ij}=0$, and thus there is no edge connecting $i$ and $j$ directly. Considering the importance of the adjacency matrix, we show its definition below,
$$A_{ij}=\begin{cases}1, & d_{ij}\le\theta \text{ and } i\ne j,\\ 0, & \text{otherwise}.\end{cases} \tag{7}$$
Note that the edges in the network have neither direction nor weight, so we call it an undirected and unweighted network. A complex network constructed in the above way is called a threshold network. In order to observe the features of threshold networks, we introduce four topological properties on them, namely degree, density, clustering coefficient, and average path length.
Suppose the network contains $N$ nodes. The number of edges connecting the node $i$ to other nodes is called the degree of the node $i$, denoted by $k_i$. The average degree of all nodes is called the degree of the network.
The maximum possible number of edges in the network is obviously $N(N-1)/2$. Assuming the actual number of edges is $M$, the network density is defined and denoted by
$$\delta=\frac{2M}{N(N-1)}. \tag{8}$$
It is clear that the network density ranges from 0 to 1.
Assume that the degree of the node $i$ is $k_i$, i.e., there are $k_i$ nodes connected to the node $i$. We denote by $E_i$ the number of edges among these $k_i$ nodes. Then the clustering coefficient of node $i$, denoted by $C_i$, is defined in the following way when $k_i\ge 2$,
$$C_i=\frac{2E_i}{k_i(k_i-1)}. \tag{9}$$
When $k_i$ is equal to 1 or 0, the clustering coefficient of node $i$ is specified as 0. The clustering coefficient of the network is the average clustering coefficient of all nodes in the network, that is, $C=\frac{1}{N}\sum_{i=1}^{N}C_i$. The clustering coefficient measures the integrity of the network. Its value ranges from 0 to 1, and larger values indicate a more closely correlated network.
In the network, the number of edges contained in the shortest path from the node $i$ to the node $j$ is called the shortest path length between $i$ and $j$, denoted by $l_{ij}$. The average path length of the network, denoted by $L$, is defined as the average of the shortest path lengths between all pairs of nodes in the network, i.e.,
$$L=\frac{2}{N(N-1)}\sum_{1\le i<j\le N}l_{ij}. \tag{10}$$
By the definitions of the above four topological properties, the larger the degree, density and clustering coefficient are, the more closely the nodes are connected in the network. The average path length behaves in the opposite way: the smaller it is, the more closely the nodes are connected. This is why these quantities can measure the correlation of a network, and they are usually regarded as topological measures on complex networks.
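The four measures above can be computed with a few lines of Python. The sketch below is only an illustration under stated assumptions: the names `threshold_adjacency` and `network_measures` are hypothetical, self-loops are excluded, and the average path length is taken over connected pairs only, since the text does not say how unreachable pairs are treated.

```python
from collections import deque

import numpy as np


def threshold_adjacency(dist, theta):
    """Unweighted, undirected adjacency matrix: edge iff distance <= theta."""
    adj = (np.asarray(dist, dtype=float) <= theta).astype(int)
    np.fill_diagonal(adj, 0)  # no self-loops
    return adj


def network_measures(adj):
    """Degree, density, clustering coefficient and average path length."""
    n = adj.shape[0]
    m = int(adj.sum()) // 2  # each edge is counted twice in adj
    clustering = []
    for i in range(n):
        nbrs = np.flatnonzero(adj[i])
        k = len(nbrs)
        if k < 2:
            clustering.append(0.0)
        else:
            e = int(adj[np.ix_(nbrs, nbrs)].sum()) // 2  # edges among neighbours
            clustering.append(2.0 * e / (k * (k - 1)))
    # Breadth-first search from every node gives all shortest path lengths.
    total, pairs = 0, 0
    for s in range(n):
        seen = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in np.flatnonzero(adj[u]):
                v = int(v)
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return {
        "degree": float(adj.sum(axis=1).mean()),
        "density": 2.0 * m / (n * (n - 1)),
        "clustering": float(np.mean(clustering)),
        # Assumption: unreachable pairs are ignored in the average.
        "average_path_length": total / pairs if pairs else float("inf"),
    }


# Toy network: a triangle 0-1-2 with a pendant node 3 attached to node 2.
dist = np.full((4, 4), 1.5)
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    dist[i, j] = dist[j, i] = 0.5
np.fill_diagonal(dist, 0.0)
print(network_measures(threshold_adjacency(dist, theta=1.0)))
```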
Now we introduce the minimum spanning tree and topological measures on it.
A minimum spanning tree is a basic concept in graph theory. In a connected graph $G$ with multiple vertices, a connected subgraph is called a spanning tree of $G$ provided that it contains all vertices of $G$ and no loop. A spanning tree with the minimum sum of the weights on its edges is called a minimum spanning tree, which is abbreviated as MST in the sequel.
We also establish a distance network graph based on the distance matrix. The number of nodes is denoted by $N$; each pair of nodes $i$ and $j$ is connected, and the weight of the edge is their distance $d_{ij}$.
The Kruskal algorithm (introduced in Kruskal [34]) is one of the most common algorithms to establish an MST. It is based on a greedy method, that is, the nodes with the shortest distance are connected first in the establishment of the MST. Now we show the specific steps below.
Step 1: Find the minimum value $d_{ij}$ among all the distances between nodes in the network, and mark the corresponding edge and the nodes $i$ and $j$ in the original graph. Then remove this minimum value from the list of distances. Note that, if there is more than one minimum value, we only use one of them.
Step 2: Continue to find the minimum value among the remaining distances, and mark the corresponding edge and nodes, unless the edge would form a loop with the edges already marked, in which case it is discarded.
Step 3: Repeat Step 2 until the number of marked edges reaches $N-1$.
In this way, we acquire a connected graph formed by the marked edges and nodes. It contains all the $N$ nodes of the original graph and $N-1$ edges with no loops. This is a minimum spanning tree of the graph. If all edges have different weights in a distance network graph, the MST is unique. The MST not only shows correlations among different nodes, but also gives the backbone of the relationships among nodes, while other relationships among nodes are filtered out of the network.
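The three steps above can be sketched in Python as follows. This is a minimal illustration of Kruskal's algorithm with a union-find structure for the loop check; the function name `kruskal_mst` and the four-node distance matrix are hypothetical.

```python
def kruskal_mst(dist):
    """Kruskal's algorithm on a full symmetric distance matrix.

    Returns the MST as a list of (i, j, weight) edges.
    """
    n = len(dist)
    parent = list(range(n))

    def find(a):
        # Root of the component containing a, with path compression.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Steps 1 and 2: visit the distances in increasing order.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for weight, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # marking this edge would create a loop, so skip it
        parent[ri] = rj
        tree.append((i, j, weight))
        if len(tree) == n - 1:  # Step 3: stop once N - 1 edges are marked
            break
    return tree


# Toy distance matrix on four nodes (made-up numbers).
dist = [[0, 1, 4, 3],
        [1, 0, 2, 5],
        [4, 2, 0, 6],
        [3, 5, 6, 0]]
print(kruskal_mst(dist))
```

The union-find structure replaces an explicit loop search: two nodes already in the same component would close a loop if joined again.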
In order to observe MSTs, we introduce three properties on them, including normalized length, longest path and degree centrality.
The normalized length of an MST is defined as the average weight of the edges in the MST, i.e.,
$$L_{\mathrm{MST}}=\frac{1}{N-1}\sum_{(i,j)\in\mathrm{MST}}d_{ij}. \tag{11}$$
It can be used to measure and compare the lengths of networks on different dates, so as to understand the dynamic changes and correlations of the nodes in the networks. The smaller the normalized length of the MST is, the more strongly the nodes correlate.
We define the longest path in an MST to be the path with the most nodes (ignoring the weights of edges), so the length of the longest path is the number of edges in it.
The degree centrality of node $i$ in an MST is the number of edges directly connected to node $i$. The larger the degree centrality of node $i$ is, the more important this node is in the network. The node with the maximum degree centrality is the systemically most important node.
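For illustration, the three MST measures can be computed from an edge list as in the Python sketch below. The function name `mst_measures` is hypothetical, the tree is assumed to be given as `(i, j, weight)` triples on nodes `0, ..., n-1`, and the longest path is found with the standard double breadth-first search for the diameter of a tree.

```python
from collections import defaultdict, deque


def mst_measures(n, tree_edges):
    """Normalized length, longest path (in edges) and degree centralities."""
    normalized_length = sum(w for _, _, w in tree_edges) / (n - 1)

    nbrs = defaultdict(list)
    for i, j, _ in tree_edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    degree = {v: len(nbrs[v]) for v in range(n)}

    def farthest(src):
        # Breadth-first search; the last node dequeued is farthest from src.
        hops = {src: 0}
        queue = deque([src])
        last = src
        while queue:
            u = queue.popleft()
            last = u
            for v in nbrs[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        return last, hops[last]

    # In a tree, the node farthest from any start is an end of a longest path.
    end, _ = farthest(0)
    _, longest_path = farthest(end)
    hub = max(degree, key=degree.get)  # node with maximum degree centrality
    return normalized_length, longest_path, degree, hub


# Toy tree: the path 2 - 1 - 0 - 3 with made-up weights.
print(mst_measures(4, [(0, 1, 1.0), (1, 2, 2.0), (0, 3, 3.0)]))
```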
2.4. Data and processing
We choose 40 major countries/regions in the world according to their geographical locations. Among them, there are 7 American countries, 19 European countries, 1 African country, 2 Oceanian countries, and 11 Asian countries/regions. We take one typical index from each country/region and download its closing prices from 12/29/1994 to 06/30/2020 from Yahoo Finance. Due to different holidays in different countries/regions, the series clearly differ in length. We apply linear interpolation to the series, and finally get the closing prices of 6,961 trading days. Note that we denote each index by index $i$ for $i=1,\dots,40$ and write dates in the form 'mm/dd/yyyy' in the sequel. We show the countries/regions and indices in Table 1.
Table 1.
40 global stock indices.
| Country/Region | Stock index | Country/Region | Stock index |
|---|---|---|---|
| Canada | TSX300 | Portugal | PSI |
| America | NASDAQ | Norway | MSCI NORW |
| Mexico | MSCIEWW | Finland | OMX |
| Argentina | MERV | Denmark | MIDK00000PDK |
| Brazil | IBOVESPA | Poland | WIG |
| Chile | IPSA40 | Greece | ASE |
| Peru | MIPE00000PUS | South Africa | JTOPI40 |
| France | CAC40 | Australia | ASX200 |
| Spain | IBEX35 | New Zealand | NZX |
| Switzerland | SWI20 | Turkey | XU100 |
| Sweden | OMX30 | Korea | KOSPI |
| Netherlands | AEX | Thailand | SETI |
| Belgium | BEL20 | China | SSEC |
| England | FTSE100 | Hong Kong | HSI |
| Germany | DAX | Japan | N225 |
| Austria | ATX | Singapore | STI |
| Ireland | ISEQ | Indonesia | JKSE |
| Russia | MSCI ERUS | Taiwan | MSCITW |
| Hungary | BUMIX | India | SENSEX30 |
| Italy | MIIT00000PEU | Malaysia | KLSE |
To find important information among the stock indices, we need their daily log-returns, i.e., $r_i(t)=\ln P_i(t)-\ln P_i(t-1)$, where $P_i(t)$ is the closing price of the index $i$ on the trading day $t$. Then each time series includes 6,960 daily log-returns. Note that by trading day $t$ we mean the $t$th trading day among the 6,960 days from 01/01/1995 to 06/30/2020.
For each trading day $t$, we compute the correlation coefficient and distance between each pair of the indices according to a sliding window of size $w$. Thus, we obtain the correlation coefficient $\rho_{ij}(t)$ and distance $d_{ij}(t)$ of the series $r_i$ and $r_j$.
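The processing described above can be sketched in Python as follows. The sketch assumes that the sliding window ends on the trading day under consideration, which the text does not state explicitly, and it uses the distance $\sqrt{2(1-\rho)}$ of Mantegna [31]; the function names and the synthetic prices are hypothetical, not the data of the paper.

```python
import numpy as np


def log_returns(prices):
    """Daily log-returns; prices has one row per day and one column per index."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)


def window_distance_matrix(returns, t, w):
    """Distance matrix of the indices from the w returns ending at row t.

    Assumption: the window covers rows t - w + 1, ..., t (inclusive).
    """
    if t - w + 1 < 0:
        raise ValueError("not enough history for this window")
    window = returns[t - w + 1 : t + 1]
    rho = np.corrcoef(window, rowvar=False)  # columns are the variables
    return np.sqrt(np.clip(2.0 * (1.0 - rho), 0.0, None))


# Synthetic prices: one random walk, its square, and its reciprocal.
rng = np.random.default_rng(0)
base = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=61)))
prices = np.column_stack([base, base**2, 1.0 / base])
returns = log_returns(prices)
print(window_distance_matrix(returns, t=59, w=50).round(6))
```

In the synthetic example, the second series has log-returns proportional to those of the first (distance close to 0) and the third has the opposite sign (distance close to 2).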
3. Systemic financial risk analysis based on TDA
In March of 2020, the US stock markets reacted to the unpredictability with large drops. The VIX also recorded its highest close in history. We show its daily closing price from January 3 of 1995 to June 30 of 2020 in Fig. 1.
Fig. 1.
The closing prices of VIX from January 3 of 1995 to June 30 of 2020.
When we study the situation of the global markets, however, there is no single index that lets us observe the volatility of all the markets. TDA can help us combine many indices in one analysis. In this section, we focus the analysis on three crises, namely the 2007–2008 global financial crisis, the 2010–2011 European debt crisis and the crisis under COVID-19 in 2020, by exploring the continual change of the topological structures based on financial time series. We try to find the critical points with sharp changes. We first build a detection system from the $L^p$-norms of the persistence landscapes.
3.1. TDA on the 40 indices
For each trading day $t$, we take all indices as points to construct the point cloud data. Then we carry out TDA on the complexes mainly based on the persistence landscapes. To do this, we take the 0-dim and 1-dim barcodes into consideration and calculate the $L^p$-norms of the persistence landscapes. Now we show the images of the $L^1$-norms with sliding windows of three different sizes in Fig. 2.
Fig. 2.
$L^1$-norms of persistence landscapes of the point cloud data from the 40 indices for three different window sizes, shown in panels (a), (b) and (c).
We see from Fig. 2 that the norms are clearly at a low level during well-known crises. The longest period at a low level is from late 2007 to 2011. This indicates that the world experienced long-term turbulence during that time, mainly caused by the 2007–2008 global financial crisis and the 2010–2011 European debt crisis. However, the lowest level emerges in 2020, which reflects the bad situation in the global stock markets since late February of 2020.
To do further analysis and confirm the severity of the situation under COVID-19, we build a system to detect the dates with sharp changes of the point clouds data.
For every trading day $t$, we calculate and denote by $N_t$ the $L^p$-norm of the persistence landscape of the complex, by $\overline{N}^{+}_{t}$ the mean value of the $L^p$-norms of $m$ consecutive trading days starting from day $t$, and by $\overline{N}^{-}_{t}$ the mean value of the $L^p$-norms of $n$ consecutive trading days before day $t$. If the following three conditions are satisfied, we take the trading day $t$ as a critical date for a financial crisis or risk.
(i) The norm of the trading day $t$ is significantly smaller than that of the previous trading day, i.e., the ratio $N_t/N_{t-1}<\alpha$ for a fixed threshold $\alpha$;
(ii) The norm $N_t$ is smaller than the mean value of the $n$ consecutive trading days before day $t$, i.e., the ratio $N_t/\overline{N}^{-}_{t}<\beta$ for a fixed threshold $\beta$;
(iii) The norms' mean value of the $m$ consecutive days starting from day $t$ is obviously smaller than that of the $n$ consecutive trading days before day $t$, i.e., the ratio $\overline{N}^{+}_{t}/\overline{N}^{-}_{t}<\gamma$ for a fixed threshold $\gamma$.
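The three conditions can be checked mechanically, as in the following Python sketch. It is only one possible reading of the detection system: the function name `critical_days`, the parameter names `m`, `n`, `alpha`, `beta`, `gamma` and the toy series are hypothetical, and the norms are assumed to be strictly positive so that the ratios are defined.

```python
import numpy as np


def critical_days(norms, m, n, alpha, beta, gamma):
    """Indices t at which all three conditions (i)-(iii) hold.

    norms: daily landscape norms, assumed strictly positive.
    m: number of days, starting from day t, averaged in condition (iii).
    n: number of days before day t averaged in conditions (ii) and (iii).
    """
    norms = np.asarray(norms, dtype=float)
    hits = []
    for t in range(n, len(norms) - m + 1):
        before = norms[t - n : t].mean()  # mean of the n days before day t
        after = norms[t : t + m].mean()   # mean of the m days from day t on
        if (norms[t] / norms[t - 1] < alpha      # (i) sharp drop from day t - 1
                and norms[t] / before < beta     # (ii) below the recent mean
                and after / before < gamma):     # (iii) the drop persists
            hits.append(t)
    return hits


# Toy series: the norm halves on day 20 and stays low afterwards.
toy = [1.0] * 20 + [0.5] * 10
print(critical_days(toy, m=5, n=10, alpha=0.8, beta=0.8, gamma=0.8))
```

Condition (i) makes the detector fire only on the first day of a drop, while condition (iii) filters out one-day dips that recover immediately.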
To obtain reasonable parameters to do the analysis in the sequel, we fix and (it is about the number of one year’s trading days). We let change from to and obtain the critical dates for and . And we take the 1996–1997 Asia crisis, 2007–2008 global financial crisis, 2010–2011 European debt crisis, 2015 stock disaster and 2020 crisis under COVID-19 as the main crises. The detected dates related to them are considered as True-positive dates and other detected dates as False-positives dates. When we completely miss the dates of any crises shown above, we add one to the False-negatives. Thus we obtain the numbers of True-positives, False-positives and False-negatives. Furthermore, 2000–2001 American financial crisis was not so seriously worldwide.
Now we show the results on the detected dates in Fig. 3. To make the numbers easier to compare, the heatmap shows the False-negative counts with a negative sign.
Fig. 3.
The heatmap of the detected dates with , , while and . Every three numbers horizontally respond to True-positives, False-positives and False-negatives in order.
We see from Fig. 3 that when changes from to and the other parameters are fixed, the detected dates have no change or change only a bit. By Fig. 3, we decide to mainly use to do the work in the sequel.
As we have stated, the $L^p$-norms can well characterize and quantify the difference between persistence landscapes. We calculate the $L^1$-, $L^2$- and $L^\infty$-norms of the persistence landscapes of the complexes from the 40 indices with the window size of Fig. 2(b). The image of the $L^1$-norms has been shown in Fig. 2(b). Now we show the $L^2$- and $L^\infty$-norms of the persistence landscapes in Fig. 4.
Fig. 4.
$L^2$- and $L^\infty$-norms of persistence landscapes of the point cloud data from the 40 indices, with the same window size as in Fig. 2(b). (a) $L^2$-norms; (b) $L^\infty$-norms.
We see from Fig. 4 that the trend of the $L^2$-norms is similar to that of the $L^1$-norms, while the $L^\infty$-norms differ in the overall trend. Moreover, the $L^\infty$-norms change more frequently than the $L^1$-norms. Under the $L^1$-, $L^2$- and $L^\infty$-norms, we select different values of the parameters and show the detected dates in Table 2.
Table 2.
Critical dates detected on different values of parameters based on and .
| Different values of parameters | Critical dates |
|---|---|
| under -norms | 10/27/1997, 08/01/2007, 10/06/2008, 08/08/2011, 09/15/2011, 08/24/2015, 11/15/2015, 03/24/2016, 02/24/2020, 03/09/2020, 03/12/2020 |
| under -norms | 10/27/1997, 09/15/2011, 03/09/2020, 03/12/2020 |
| , under -norms | 10/27/1997, 07/25/2006, 10/01/2007, 05/10/2010, 08/08/2011, 09/15/2011, 09/27/2011, 10/30/2015, 11/03/2015, 11/15/2015, 12/01/2015, 03/24/2016, 03/09/2020, 03/12/2020 |
| under -norms | 07/07/2010, 09/20/2010, 09/22/2011, 12/27/2011, 01/06/2015, 03/09/2020, 03/12/2020, 03/16/2020, 03/24/2020 |
The dates 03/09/2020 and 03/12/2020 are detected under all the conditions shown in Table 2. We also see from Table 2 that the $L^1$-norms are more applicable to the detection of critical dates than the $L^2$- and $L^\infty$-norms, so we will mainly use them for further analysis.
Now we show the persistence landscapes on 12/30/2019, 01/02/2020, 03/09/2020 and 03/12/2020 in Fig. 5, to help observe the differences among the related complexes. Note that the former two dates are 50 trading days before the latter two respectively.
Fig. 5.
Persistence landscapes of the point cloud data from the 40 indices of window sizes on the following dates: (a). 12/30/2019; (b). 01/02/2020; (c). 03/09/2020; (d). 03/12/2020.
We see from Fig. 5 that the landscapes on 12/30/2019 and 01/02/2020 are very different from those on 03/09/2020 and 03/12/2020. The top layers in Fig. 5(c) and (d) are much smaller than those in Fig. 5(a) and (b). This indicates that the stock markets were more closely connected on the latter dates. We will compare the topological features of the point cloud data in detail in Section 4.
3.2. Robustness check and TDA on subgroups
We first take three indices out to do TDA on the remaining 37 indices to check the robustness of TDA and the detection system. We fix and let change from 0.80 to 0.90.
Applying the same TDA procedure, we obtain the critical dates for 5 groups of 37 indices and for the group of all 40 indices. We also divide the dates into True-positives, False-positives and False-negatives and show the numbers in Fig. 6.
Fig. 6.
The heatmap of the detected dates from -norms with and in different groups. Every three numbers horizontally respond to True-positives, False-positives and False-negatives in order.
In Fig. 6, Group 1 contains the 37 indices that remain after removing those of the US, China and Greece; Group 2 removes the indices of Mexico, Russia and Turkey; Group 3 those of England, Spain and Finland; Group 4 those of Canada, Chile and Argentina; Group 5 those of Brazil, Malaysia and Singapore. 'Complete' on the horizontal axis means the group of all 40 indices. We see that in each group, the numbers grow steadily as the two varied parameters increase. When both are close to 0.9, the numbers are more alike among all the groups, and there are no False-negative dates when both are bigger than 0.88. Note that the most dates are detected in Group 5. This means that its remaining indices are a bit more correlated than those of the other groups we have tested. To see the geographic effect, we now divide the indices into two subgroups. We put the indices from the European countries and Turkey into one subgroup (denoted by European and Middle East countries) and the rest into another subgroup. We again detect the critical dates from the $L^1$-norms and show the dates, together with the parameters used, in Table 3.
Table 3.
Critical dates detected in different subgroups from -norms.
| Subgroups | True-positive dates | False-positive dates |
|---|---|---|
| European and Middle East countries under | 10/27/1997, 09/25/1998, 11/25/1998, 08/08/2007, 12/12/2007, 03/18/2008, 10/06/2008, 11/04/2008, 05/10/2010, 06/22/2010, 09/15/2011, 09/23/2011, 10/06/2011, 08/13/2015, 08/24/2015, 12/11/2015, 12/14/2015, 02/27/2020, 03/09/2020, 03/12/2020 | 07/24/2002, 12/23/2005, 05/22/2006, 06/14/2006, 08/10/2006, 09/20/2006, 04/20/2016 |
| American, Asian, Oceania and African countries/regions under | 09/11/1997, 08/20/2007, 09/29/2008, 10/06/2008, 10/13/2008, 11/19/2008, 07/14/2010, 08/08/2011, 08/27/2015, 09/14/2015, 10/13/2015, 10/29/2015, 11/11/2015, 12/18/2015, 01/22/2016, 03/09/2020, 03/12/2020 | 06/18/2004, 06/21/2006, 03/03/2016, 03/23/2018 |
| European and Middle East countries under | 10/27/1997, 09/25/1998, 05/10/2010, 06/22/2010, 09/15/2011, 09/23/2011, 10/06/2011, 02/27/2020, 03/09/2020, 03/12/2020 | 05/22/2006, 06/14/2006, 08/10/2006 |
| American, Asian, Oceania and African countries/regions under | 11/19/2008, 08/27/2015, 10/13/2015, 03/09/2020, 03/12/2020 | 06/21/2006 |
Note that the complexes with 20 points are much simpler in the topological structure. The changes of the persistence landscapes are more frequent, and hence we see more detected dates in Table 3. However, the distribution of dates is quite concentrated, mainly in the period of the crises or risks, similar to the results with 40 points. This also reflects the robustness of the system to the samples to some extent.
Table 3 clearly shows more detected dates in the subgroup of European and Middle East countries. This indicates that the indices in this subgroup have stable relations, so systemic risks emerge more easily within it.
We know from Table 3 that the stock markets were closely connected among different combinations of countries/regions from late February to March of 2020. Fortunately, the situation did not continue for a longer time. However, we should not let our guard down. The epidemic is still spreading fast, especially in the United States, India, Brazil and some other countries. It has not been effectively controlled in the world. Furthermore, the economic setback has been a fact and the global economic recovery will take quite a long time. Under this situation, the fragility of financial markets will easily lead to more serious financial risk.
To explore the detailed topological features, especially of the point cloud data related to the critical dates detected above, we use complex networks for further analysis in the following section.
4. Network analysis of interconnection of global stock markets
Since we want to analyze financial crises with emphasis on the situation under COVID-19, we take four dates, each corresponding to a crisis or disaster from 1995 to 2019 mentioned in Section 3.1, and two dates in 2020 from the critical dates detected by TDA in Section 3.1, for further research based on complex networks. They are 10/27/1997, 10/06/2008, 08/08/2011, 08/24/2015, 03/09/2020, and 03/12/2020. Based on the six critical dates, we construct six corresponding networks of 40 stock indices and explore their topological properties. Moreover, for comparative analysis, we take the six dates that are 50 trading days before each of the six critical dates, and also construct the networks on them.
4.1. Threshold network of 40 indices
The networks we construct are based on distances, not correlation coefficients. Some tests are needed to obtain a reasonable threshold value . To do this, we compute and compare the average degrees of the networks based on the twelve dates. We show the results in Fig. 7.
Fig. 7.
Average degrees of threshold networks at various threshold values. (a). Average degrees of the networks on the six dates 50 trading days before the critical dates; (b). Average degrees of the networks on the six critical dates. (c). The difference of the average degrees between the networks on the critical dates and dates before them.
Note that in Fig. 7, the horizontal axis shows the threshold values and the vertical axis shows the average degrees of the networks. The average degrees of the networks on the critical dates are clearly larger than those on the dates before them at various threshold values, especially for thresholds from about 0.8 to 1.3. By computing the average differences between the networks on critical dates and dates before them at various threshold values, we find that the average difference reaches its maximum at . So we take to construct the threshold networks. Moreover, we see from Fig. 7(b) that the average degrees on the two critical dates of 2020 are obviously larger than those on the other four critical dates. Fig. 7(c) shows that the largest difference reaches almost 30 for the networks on 03/12/2020, which also reflects the close connection of the markets at the time.
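The threshold selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an edge is drawn whenever the distance between two indices is at most the threshold, and the function names, toy matrices and threshold grid are all hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def average_degree(dist: Matrix, threshold: float) -> float:
    """Average degree of the network that links i and j when dist[i][j] <= threshold."""
    n = len(dist)
    edges = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if dist[i][j] <= threshold  # assumed edge rule
    )
    return 2.0 * edges / n


def best_threshold(
    before: Sequence[Matrix],
    critical: Sequence[Matrix],
    thresholds: Sequence[float],
) -> float:
    """Threshold maximising the mean gap in average degree (critical minus before)."""

    def mean_gap(t: float) -> float:
        gaps = [
            average_degree(c, t) - average_degree(b, t)
            for b, c in zip(before, critical)
        ]
        return sum(gaps) / len(gaps)

    return max(thresholds, key=mean_gap)


# Toy 3-index example (made-up distances): a calm date and a crisis date.
calm = [[0.0, 1.2, 1.3], [1.2, 0.0, 1.1], [1.3, 1.1, 0.0]]
crisis = [[0.0, 0.6, 0.9], [0.6, 0.0, 0.7], [0.9, 0.7, 0.0]]
grid = [0.5, 0.8, 1.0, 1.25, 1.5]

chosen = best_threshold([calm], [crisis], grid)
print(chosen)
```

On the toy data the gap is largest at the middle grid value, where the crisis network is already complete while the calm network is still empty. With real data, `before` and `critical` would each hold six 40 × 40 distance matrices.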
With threshold value , we establish the networks on the twelve dates and show some of them in Fig. 8.
Fig. 8.
Networks on the dates. (a). 07/28/2008; (b). 10/06/2008; (c). 05/30/2011; (d). 09/15/2011; (e). 12/30/2019; (f). 03/09/2020; (g). 01/02/2020; (h). 03/12/2020.
In Fig. 8, the four networks on the right are based on the critical dates, and those on the left are based on the dates 50 trading days before the critical dates. It is clear that the networks on the right are much more closely connected than those on the left. The last two pairs show this most obviously. We see from the networks that, before the critical dates, Asian and some emerging countries such as Argentina or Turkey are mostly weakly connected, but during the crisis they become more closely connected to the rest of the markets.
In order to explain the significant difference of the networks among the dates, we use the four topological measures shown in Section 2.3 to do further analysis of the networks on the twelve dates. The results are shown in Table 4.
Table 4.
Topological properties of threshold networks.
| Date | Degree | Density | Clustering coefficient | Average path length |
|---|---|---|---|---|
| 08/29/1997 | 6.30 | 0.1615 | 0.660 | 2.128 |
| 10/27/1997 | 12.80 | 0.3282 | 0.751 | 1.759 |
| 07/28/2008 | 15.90 | 0.4077 | 0.753 | 1.684 |
| 10/06/2008 | 27.70 | 0.7103 | 0.881 | 1.314 |
| 05/30/2011 | 16.15 | 0.4141 | 0.832 | 1.664 |
| 08/08/2011 | 26.85 | 0.6885 | 0.867 | 1.283 |
| 06/15/2015 | 8.60 | 0.2205 | 0.781 | 1.826 |
| 08/24/2015 | 21.60 | 0.5538 | 0.805 | 1.476 |
| 12/30/2019 | 13.20 | 0.3385 | 0.807 | 2.003 |
| 03/09/2020 | 36.70 | 0.9410 | 0.955 | 1.059 |
| 01/02/2020 | 10.65 | 0.2731 | 0.726 | 2.051 |
| 03/12/2020 | 38.00 | 0.9744 | 0.980 | 1.026 |
We see from Table 4 horizontally that the degree, density and clustering coefficient of the networks on critical dates are all much bigger than those on the dates before. And average path lengths of the networks are clearly smaller on the critical dates. These measures quantitatively illustrate that the networks during crises are denser than those before the crises.
Read vertically, Table 4 shows that the density, clustering coefficient and average path length of the networks on 03/09/2020 and 03/12/2020 are very close to 1, and the network degrees are very close to . Moreover, for each of the four measures, the differences between 03/09/2020, 03/12/2020 and the dates 50 trading days before them are the largest among all the pairs of dates. This indicates that the impact of COVID-19 on global stock markets in 2020 is even greater than that of any earlier financial crisis.
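As an illustration of how the four measures in Table 4 can be computed, the sketch below uses plain Python on a toy distance matrix. It assumes the standard definitions (average degree, density as average degree divided by n − 1, mean local clustering coefficient, and mean shortest-path length in edges over connected pairs); the definitions of Section 2.3 are not restated here, so these may differ in detail. All function names and the toy data are hypothetical. The density definition is at least consistent with Table 4, where each Density entry equals Degree divided by 39 (for example, 36.70 / 39 ≈ 0.9410).

```python
from collections import deque
from itertools import combinations


def threshold_graph(dist, threshold):
    """Adjacency sets of the network linking i and j when dist[i][j] <= threshold."""
    n = len(dist)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def average_degree(adj):
    return sum(len(nb) for nb in adj.values()) / len(adj)


def density(adj):
    """Fraction of possible edges present; equals average degree / (n - 1)."""
    return average_degree(adj) / (len(adj) - 1)


def clustering_coefficient(adj):
    """Mean local clustering; nodes with fewer than two neighbours count as 0."""
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length (in edges) over all connected pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen[w] = seen[u] + 1
                    queue.append(w)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs if pairs else float("inf")


# Toy 4-index example: a triangle (0, 1, 2) with index 3 attached to index 2.
toy = [
    [0.0, 0.5, 0.6, 1.4],
    [0.5, 0.0, 0.7, 1.5],
    [0.6, 0.7, 0.0, 0.9],
    [1.4, 1.5, 0.9, 0.0],
]
g = threshold_graph(toy, 1.0)
print(average_degree(g), density(g), clustering_coefficient(g), average_path_length(g))
```

In a complete network of 40 indices, the degree is 39 and the other three measures all equal 1, which is the limit the networks on 03/09/2020 and 03/12/2020 approach.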
4.2. Minimum spanning tree of global main stock indices
Since MST represents the strongest connection in the financial network, when it is unique, it means that MST is the most likely transmission path of systemic financial risk. That is, systemic financial risk may spread in the whole financial network along the MST at the fastest speed. Considering this point, we establish MSTs on the twelve dates and show some of them in Fig. 9.
Fig. 9.
MSTs on the dates. (a). 07/28/2008; (b). 10/06/2008; (c). 07/07/2011; (d). 08/08/2011; (e). 12/30/2019; (f). 03/09/2020; (g). 01/02/2020; (h). 03/12/2020.
In Fig. 9, we use the countries/regions to present the indices and each of them is denoted by a small box. If there was an impact from outside the network, such as a serious infectious disease like COVID-19, a big war, or a natural disaster, the risk would quickly spread most likely along the path in the MST. A systemic financial risk would happen soon. Moreover, if a country/region within the network, especially the country/region at the center, had a financial risk, the risk would also quickly spread and cause a global systemic risk.
Take the MST of 03/09/2020 (see Fig. 9(f)) as an example. In Fig. 9(f), the path from China to Chile is one of the longest paths between two countries/regions and consists of nine edges. In Fig. 9(e), the MST of 12/30/2019, the path between the same two countries consists of fourteen edges, and the longest paths consist of 18 edges, much longer than those in Fig. 9(f). Risk therefore arises more easily and spreads faster within the network of 03/09/2020.
The indices at the centers of the MSTs are obviously closely connected with more indices, and hence at greater risk than those on the margins of the networks. Such indices are systemically important and worth monitoring carefully. In the MST of 03/09/2020, France is the most important node in the system, with degree centrality as high as 11. So the CAC40 index is closely connected with other indices. It is more vulnerable to shocks, and more disruptive to the overall stock markets than other indices. It is natural to pay more attention to it.
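The MST quantities discussed above (longest path in edges, maximum degree, and the normalized length reported later in Table 5) can be sketched in plain Python. The sketch builds the tree with Kruskal's algorithm [34] and assumes that the normalized length is the mean length of the MST edges, which is a common definition but is not restated in this section. Function names and the toy distances are hypothetical.

```python
from collections import deque


def kruskal_mst(dist):
    """MST edges (i, j, d) of the complete graph given by a symmetric distance matrix."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    tree = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j, d))
            if len(tree) == n - 1:
                break
    return tree


def tree_adjacency(n, tree):
    adj = {i: [] for i in range(n)}
    for i, j, _ in tree:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def farthest(adj, start):
    """BFS from start; returns (a farthest node, its distance in edges)."""
    seen = {start: 0}
    queue = deque([start])
    last = start
    while queue:
        last = queue.popleft()
        for w in adj[last]:
            if w not in seen:
                seen[w] = seen[last] + 1
                queue.append(w)
    return last, seen[last]


def longest_path_edges(adj):
    """Tree diameter in edges, using two BFS passes."""
    end, _ = farthest(adj, next(iter(adj)))
    return farthest(adj, end)[1]


# Toy 5-index example (made-up distances).
toy = [
    [0.0, 0.3, 0.4, 0.5, 0.9],
    [0.3, 0.0, 1.0, 1.1, 1.3],
    [0.4, 1.0, 0.0, 1.2, 1.4],
    [0.5, 1.1, 1.2, 0.0, 0.6],
    [0.9, 1.3, 1.4, 0.6, 0.0],
]
mst = kruskal_mst(toy)
adj = tree_adjacency(len(toy), mst)
max_degree = max(len(nb) for nb in adj.values())
normalized_length = sum(d for _, _, d in mst) / len(mst)  # assumed definition
print(sorted((i, j) for i, j, _ in mst), longest_path_edges(adj), max_degree)
```

In the toy tree, index 0 plays the role that France plays in Fig. 9(f): it has the highest degree, and every longest path passes through it.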
From Fig. 9, there is an obvious geographical aggregation effect among the major stock indices. Countries/regions belonging to the same region have a stable relationship with great mutual influence. European countries have closer and more stable interconnections than countries in other regions. They serve as a bridge for all countries in the world. Although COVID-19 was first reported in China, it spread fast and seriously in European countries from late February of 2020. The global stock markets soon reacted with turbulence and the risk emerged all over the world.
Now we use properties shown in Section 2.3 to explore more features from the MSTs on the twelve dates. The results are shown in Table 5.
Table 5.
Properties comparison of MSTs.
| Date | Normalized length | Longest path | Maximum degree centrality | Systemically important nodes (degree centrality ≥ 5) |
|---|---|---|---|---|
| 08/29/1997 | 0.9794 | 10 | 8 | Germany |
| 10/27/1997 | 0.8039 | 13 | 7 | South Africa |
| 07/28/2008 | 0.7677 | 15 | 5 | Belgium |
| 10/06/2008 | 0.6159 | 16 | 8 | France, England and Singapore |
| 05/30/2011 | 0.7863 | 13 | 8 | Hong Kong and England |
| 08/08/2011 | 0.6364 | 16 | 6 | France, Singapore and Netherlands |
| 06/15/2015 | 0.8929 | 14 | 6 | Netherlands |
| 08/24/2015 | 0.7578 | 11 | 6 | France, Japan and Spain |
| 12/30/2019 | 0.8360 | 18 | 6 | France, Germany and Hong Kong |
| 03/09/2020 | 0.5855 | 9 | 11 | France, Netherlands and Singapore |
| 01/02/2020 | 0.8558 | 16 | 6 | France, Germany and Hong Kong |
| 03/12/2020 | 0.5002 | 13 | 7 | France and Canada |
Table 5 shows the normalized lengths of the twelve MSTs. Read horizontally, the normalized lengths on critical dates are clearly much shorter than those on the dates before them. Read vertically, the normalized lengths of MSTs on 03/09/2020 and 03/12/2020 are obviously shorter than those on other dates. In particular, the normalized length of the MST on 03/12/2020 is very close to 0.5. This implies that the average correlation coefficient between every two stock indices was about 0.875, which indicates that the indices were highly correlated at the time.
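The value 0.875 can be checked with a short calculation. It is consistent with the usual correlation-to-distance conversion of Mantegna [31], which is assumed here because the formula is not restated in this section; the symbols $d_{ij}$ and $\rho_{ij}$ denote the distance and the correlation coefficient between indices $i$ and $j$.

```latex
d_{ij} = \sqrt{2\,(1-\rho_{ij})}
\quad\Longrightarrow\quad
\rho_{ij} = 1 - \frac{d_{ij}^{2}}{2},
\qquad
d_{ij} \approx 0.5 \;\Rightarrow\; \rho_{ij} \approx 1 - \frac{0.25}{2} = 0.875 .
```

If the normalized length is the mean length of the MST edges, as is common, then 0.875 is strictly the correlation implied by an MST edge of average length.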
We also find from Table 5 that the longest paths on 03/09/2020 and 03/12/2020 are shorter than those on the dates 50 trading days before them. Moreover, the maximum value of degree centrality is as high as 11 on 03/09/2020 and the corresponding index is the CAC40 of France. Table 5 also shows the systemically important nodes, namely those with degree centrality not smaller than 5. From Fig. 9 and Table 5, we also find that, during the recent financial crises, France was always at the center, while other systemically important indices changed quite frequently.
Clearly, the networks under COVID-19 are highly vulnerable as a system. Systemic risk easily emerges and spreads in such networks. It is necessary to keep monitoring the markets and to prevent the situation from getting even worse.
Based on the results obtained by TDA, we use MSTs to study the financial system and the risk transmission mechanism, which helps us to find the critical indices and to analyze the indices’ connections within the complex networks. This also helps verify the effectiveness of the TDA mechanism we construct in Section 3.1.
5. Conclusion
We have focused our work on the detection and analysis of systemic financial crises since 1995 using 40 stock indices chosen from 40 countries/regions all over the world. In particular, we have emphasized the impact of COVID-19 on the stock markets. Our study has been mainly from a topological perspective, and we have combined TDA with complex networks to do the research.
We have fixed a sliding window of size to compute the correlation coefficient between each pair of the indices. Then we have converted the coefficients to distances and built the complexes on each trading day. The persistence landscapes and their -norms have been obtained which help us to establish a system to detect the critical dates for a financial crisis or disaster. It depends on the detection of the sharp changes on the complexes. By several tests, we have fixed suitable parameters of the system and detected some critical dates. We have also divided the indices into subgroups to explore their situation under COVID-19 and check the robustness of the mechanism. We have found that the global stock markets were clearly closely connected when a financial crisis happened. The situation from late February to March in 2020 was even more serious than any other systemic financial crisis or risk.
With the help of TDA, we have obtained critical dates to support the research with complex networks. We have established threshold networks and MSTs on the critical dates. Then we have analyzed and compared the topological properties of the networks on the critical dates and on dates some days before them. All the topological measures have shown that correlations among the indices became closer around financial crises. The connection from late February to March in 2020 was also obviously closer than ever before. We also see that the patterns typically differ across stock markets. In the transition to a crisis, the indices of the developed countries went from “normally connected” to “closely connected”. The emerging countries are more affected by crises. Their indices have fewer connections with other markets in normal periods, but they came to follow the indices of the developed countries closely when a crisis came, and thus they also became closely connected to other indices. Furthermore, the results have indicated that there are clear geographical aggregation effects among the major stock indices in the world. The countries in the same geographical region usually have stable and close connections. Europe is the center of the global markets within the networks. This is also consistent with the results from TDA. It may be the reason why the risk emerged fully in early March of 2020, just when COVID-19 began to spread fast in Europe.
So far, COVID-19 is still spreading and the global economy is undergoing a difficult period, so it is important to keep monitoring the financial markets. We want to study the financial markets more deeply with more indicators and more statistical summaries based on TDA. We also want to study the topological features of the complexes from banks’ time series with emphasis on the transitions of crises or risks. We are confident that the mechanism based on TDA and complex networks can also provide useful information there. The geographic aggregation of financial crises or risks will also be worth researching. Various systemic crises or risks had different causes, birthplaces and spreading characteristics. Since TDA is also a good classifier, we hope to classify the crises and risks and study them further based on TDA and machine learning methods. Furthermore, we have used interpolation in the data processing. In future work, we will try to reconcile the differing trading days among the markets. We believe that this can bring more realistic results.
CRediT authorship contribution statement
Hongfeng Guo: Conceptualization, Methodology, Software, Writing - original draft. Xinyao Zhao: Methodology, Data curation, Writing - original draft. Hang Yu: Data curation, Software. Xin Zhang: Software, Writing - original draft.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
This work was supported by the National Natural Science Foundation of China-Shandong Joint Fund (grant number U1806203) and Key R & D major projects (Soft Science) of Shandong Province, PR China (grant numbers 2019RZB01091, 2019RZB01151).
References
- 1. Ouzan S. Loss aversion and market crashes. Econ. Model. 2020;92:70–86. doi: 10.1016/j.econmod.2020.06.015.
- 2. Sharif A., Aloui C., Yarovaya L. COVID-19 pandemic, oil prices, stock market, geopolitical risk and policy uncertainty nexus in the US economy: Fresh evidence from the wavelet-based approach. Int. Rev. Financ. Anal. 2020;70. doi: 10.1016/j.irfa.2020.101496.
- 3. Ji Q., Zhang D., Zhao Y. Searching for safe-haven assets during the COVID-19 pandemic. Int. Rev. Financ. Anal. 2020;71. doi: 10.1016/j.irfa.2020.101526.
- 4. Ashraf B.D. Stock markets’ reaction to COVID-19: Cases or fatalities? Res. Int. Bus. Finance. 2020;54. doi: 10.1016/j.ribaf.2020.101249.
- 5. Umar Z., Kenourgios D., Papathanasiou S. The static and dynamic connectedness of environmental, social, and governance investments: International evidence. Econ. Model. 2020;93:112–124. doi: 10.1016/j.econmod.2020.08.007.
- 6. Akhtaruzzaman M., Boubaker S., Sensoy A. Financial contagion during COVID-19 crisis. Finance Res. Lett. 2021;38. doi: 10.1016/j.frl.2020.101604.
- 7. Just M., Echaust K. Stock market returns, volatility, correlation and liquidity during the COVID-19 crisis: Evidence from the Markov switching approach. Finance Res. Lett. 2020;37. doi: 10.1016/j.frl.2020.101775.
- 8. Papadamou S., Fassas A.P., Kenourgios D., et al. Flight-to-quality between global stock and bond markets in the COVID era. Finance Res. Lett. 2021;38. doi: 10.1016/j.frl.2020.101852.
- 9. Mazur M., Dang M., Vega M. COVID-19 and the March 2020 stock market crash. Evidence from S&P1500. Finance Res. Lett. 2021;38. doi: 10.1016/j.frl.2020.101690.
- 10. Scheffer M., Bascompte J., Brock W.A., et al. Early-warning signals for critical transitions. Nature. 2009;461(3):53–59. doi: 10.1038/nature08227.
- 11. Ng G.S., Quek C., Jiang H. FCMAC-EWS: A bank failure early warning system based on a novel localized pattern learning and semantically associative fuzzy neural network. Expert Syst. Appl. 2008;34(2):989–1003.
- 12. Haldane A.G., May R.M. Systemic risk in banking ecosystems. Nature. 2011;469:351–355. doi: 10.1038/nature09659.
- 13. Billio M., Getmansky M., Lo A.W., et al. Econometric measures of connectedness and systemic risk in the finance and insurance sectors. J. Financ. Econ. 2012;104(3):535–559.
- 14. Diebold F.X., Yilmaz K. On the network topology of variance decompositions: measuring the connectedness of financial firms. J. Econometrics. 2014;182(1):119–134.
- 15. Yan X.G., Xie C., Wang G.J. Stock market network’s topological stability: Evidence from planar maximally filtered graph and minimal spanning tree. Internat. J. Modern Phys. B. 2015;29(22):1–19.
- 16. Ahelegbey D.F., Billio M., Casarin R. Bayesian graphical models for structural vector autoregressive processes. J. Appl. Econometrics. 2016;31(2):357–386.
- 17. Mezei J., Sarlin P. RiskRank: Measuring interconnected risk. Econ. Model. 2018;68:41–50.
- 18. Biplob C., Mardi D., Moses K., Mohammad A., Vladimir V. The changing network of financial market linkages: The Asian experience. Int. Rev. Financ. Anal. 2019;64:71–92.
- 19. Gai P., Kapadia S. Networks and systemic risk in the financial system. Oxford Rev. Econ. Policy. 2019;35(4):586–613.
- 20. Zhu X., Wang W., Wang H., et al. Network quantile autoregression. J. Econometrics. 2019;212(1):345–358.
- 21. Xu R., Mi C., Delcea C. Complex network construction of internet financial risk. Physica A. 2020;540.
- 22. Giudici P., Sarlin P., Spelta A. The interconnected nature of financial systems: Direct and common exposures. J. Bank. Financ. 2020;112.
- 23. Collins A., Zomorodian A., Carlsson G., et al. A barcode shape descriptor for curve point cloud data. Comput. Graph. 2004;28(6):881–894.
- 24. Carlsson G. Topology and data. Bull. Amer. Math. Soc. 2009;46(2):255–308.
- 25. Bubenik P. Statistical topological data analysis using persistence landscapes. J. Mach. Learn. Res. 2015;16(1):77–102.
- 26. Bubenik P., Dlotko P. A persistence landscapes toolbox for topological statistics. J. Symbolic Comput. 2017;78:91–114.
- 27. Gidea M. Third International Winter School and Conference on Network Science. 2017. Topological data analysis of critical transitions in financial networks. (Springer Proceedings in Complexity).
- 28. Gidea M., Katz Y. Topological data analysis of financial time series: Landscapes of crashes. Physica A. 2018;491:820–834.
- 29. Guo H., Xia S., An Q., et al. Empirical study of financial crises based on topological data analysis. Physica A. 2020;558.
- 30. Feller W., editor. An Introduction to Probability Theory and its Applications. Wiley; New York: 1970.
- 31. Mantegna R.N. Hierarchical structure in financial markets. Eur. Phys. J. B. 1999;11(1):193–197.
- 32. Munkres J.R., editor. Elements of Algebraic Topology. Addison Wesley; California: 1984.
- 33. Cohen-Steiner D., Edelsbrunner H., Harer J. Stability of persistence diagrams. Discrete Comput. Geom. 2007;37(1):103–120.
- 34. Kruskal J.B. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Amer. Math. Soc. 1956;7(1):48–50.









