Abstract
Financial networks are typically estimated by applying standard time series analyses to price-based economic variables collected at low-frequency (e.g., daily or monthly stock returns or realized volatility). These networks are used for risk monitoring and for studying information flows in financial markets. High-frequency intraday trade data sets may provide additional insights into network linkages by leveraging high-resolution information. However, such data sets pose significant modeling challenges due to their asynchronous nature, complex dynamics, and nonstationarity. To tackle these challenges, we estimate financial networks using random forests, a state-of-the-art machine learning algorithm which offers excellent prediction accuracy without expensive hyperparameter optimization. The edges in our network are determined by using microstructure measures of one firm to forecast the sign of the change in a market measure such as the realized volatility of another firm. We first investigate the evolution of network connectivity in the period leading up to the U.S. financial crisis of 2007-09. We find that the networks have the highest density in 2007, with high degree connectivity associated with Lehman Brothers in 2006. A second analysis into the nature of linkages among firms suggests that larger firms tend to offer better predictive power than smaller firms, a finding qualitatively consistent with prior works in the market microstructure literature.
Keywords: market microstructure, high-frequency trading, random forests
1. Introduction
From both theoretical and practical perspectives, there is interest in estimating linkages among financial institutions using data. Academicians seek to understand how information flows between firms, regulators aim to identify when and how risk spreads through the financial system, and financiers would like to know whether incorporating other firms’ characteristics can improve their own trading algorithms. The matter of how firms interact with one another can be represented mathematically as a network, with nodes corresponding to financial institutions and an edge between two nodes indicating that those firms are connected in some sense.
There are two key questions as to how best to measure these connections. First, it is important to understand which statistical methods are well-suited for the task of estimating linkages. Second, one needs to decide what type of data to apply these methods on. Financial institutions generate a variety of data through their activities (e.g. trading volumes and stock prices) and it is not immediately apparent what kind of data are most informative for measuring firms’ connectivity.
Predictability of one firm’s future performance using another firm’s past is often used to define edges between two firms in financial networks. There is a rich literature applying linear and vector autoregression (VAR) methods to the stochastic process of a firm’s stock price, e.g., its stock returns or return volatilities. For instance, Billio et al. (2012) constructs financial networks by assessing bivariate Granger causal relationships between firms’ monthly stock returns. Basu et al. (2019) refines these methods by employing multivariate Granger causality, which has the effect of removing indirect edges from the network. Similarly, Karpman et al. (2022) uses multivariate quantile Granger causality to estimate inter-firm connections that occur specifically during market downturns. Diebold and Yılmaz (2014), on the other hand, models daily return volatilities using VAR models, with the corresponding forecast error variance decompositions defining edges in the network. Several other network modeling approaches based on tail-risk and extreme risk spillover have been proposed in the literature of financial economics (Hautsch et al. 2015; Härdle et al. 2016; Wang et al. 2017, 2021).
While some recent works have constructed high-frequency financial networks based on contemporaneous associations (Brownlees et al. 2018), the literature on financial network analysis based on predictability has primarily used low-frequency data such as monthly, weekly, or daily returns and volatilities. However, firm linkages that are estimated from – for example – monthly financial data are challenging to interpret since it is difficult to establish what mechanisms, over the course of the month, produce those linkages (investment decisions are usually made within a much shorter time frame). Intraday financial data have the potential to yield insights that we cannot obtain through a low-frequency lens. With the rise of high-frequency computerized trading, financial data are now being recorded at the level of nanoseconds, yielding massive intraday data sets. For example, the New York Stock Exchange (NYSE) maintains the Trade and Quote (TAQ) database, which provides detailed information (e.g., timestamp, price, size, etc.) on all trades and quotes for stocks that are active on U.S.-based exchanges. By considering high-frequency financial data, we can better discern which aspects of a firm’s trading give rise to its associations with other firms. One notable exception to the preponderance of low-frequency analyses is the work of Härdle et al. (2018), which uses intraday limit order book data (bid and ask prices and volumes) to measure stock connectedness.
While using intraday data provides interpretability benefits, high-frequency financial time series also pose significant modeling challenges that are not present at lower frequencies [Dutta et al. (2022)]. These time series are often complex and nonstationary, exhibiting strong persistence, seasonal and intraday patterns, and volatility bursts. As a result, simple models cannot capture key features of the data. Moreover, increasingly sophisticated financial products and trading algorithms have rendered the markets so complex that specifying a functional form to relate firms’ variables is difficult and likely to be overly simplistic.
In this work, we adopt a nonparametric approach to estimate predictability across firms’ time series using high-frequency data. Our methodology does not impose a functional form on the dynamic relationships between firms, and offers greater modeling flexibility. In addition, we move away from price-based measures (e.g. stock returns and volatility) and use trade-based measures, which are expected to contain more fine-grained information.
In particular, we expand on the methodology proposed in Easley et al. (2021), which uses a random forest to predict a set of market measures, variables that traders use as inputs to their execution algorithms, including measures of liquidity, volatility, and the shape of the returns distributions. The features of their (and our) random forest are microstructure variables, quantities that are computed from the price and volume of trades and that reflect underlying market frictions. We note that most analyses consider only price data; for instance, the stock returns used in Billio et al. (2012) and realized volatility used in Diebold and Yılmaz (2014) can be calculated from stock prices. However, trades are the truly fundamental object since they are what give rise to prices.
According to standard asset pricing theory, the price of an asset at time $t$ is the conditional expected value of its time $t+1$ price, where the expectation is conditioned on all publicly available information. Prices are thus a lower dimensional summary of this information and they evolve as the information evolves. The public information underlying prices is primarily past prices and current trade flows. Thus, by using microstructure measures, which are formed from trade flows, we address whether there is information contained in trades – beyond what is summarized in prices – that is useful in understanding inter-firm connections.
Easley et al. (2019) provides empirical evidence that random forests can predict market measures of futures contracts using those contracts’ microstructure variables. Their analysis focuses mainly on intra-firm prediction; that is, they consider whether contract A’s microstructure variables can predict contract A’s market measures, and similarly for contracts B, C, etc. In our work, on the other hand, we ask whether features of firm A can help predict the market measures of firm B, for all pairs, A and B, in the system.
Put differently, we measure whether, and to what extent, firm A’s predictability increases when we include firm B’s features in the random forest, as compared to when only firm A’s features are used. Various metrics exist to quantify predictability, including accuracy, precision, recall, and the F1 score. We use the area under the ROC curve (AUC), which reflects the true and false positive rates as the decision threshold is varied. We apply a bootstrap procedure to test by how much (if at all) the AUC increases when we add firm B’s features. The increase is then used as a weight for the edge running from firm B to firm A. In this manner, we construct a network whose edges indicate cross-predictability between firms. This technique can be viewed as a high-frequency analogue to the Granger causality methods applied to monthly stock returns in Billio et al. (2012). Under that framework, an edge from firm B to firm A means that firm B’s lagged returns help predict firm A’s returns, over and above firm A’s own lagged returns. Edges are defined similarly here, except that instead of using linear models on monthly stock returns, we apply random forest methods to intraday data. In both cases, we assess whether another firm’s information boosts predictive power. We note that random forests may be a particularly effective method for capturing cross-effects between firms since they allow for higher-order interactions between many features, which is especially important given the complexities of modern-day financial markets.
While some recent works have shown the benefit of deep neural networks as a predictive learner in asset pricing (Gu et al. 2020), we chose to work with random forests because their default tuning parameters tend to provide near-optimal performance in low-dimensional prediction problems such as ours (thousands of observations, tens of parameters); see Huang and Boutros (2016) and references therein. Deep neural networks and other nonlinear predictive learners, such as gradient boosting, are often sensitive to the choice of tuning parameters and require a computationally intensive cross-validation procedure. We provide a more detailed discussion of this issue in Appendix B.
We apply our methodology to high-frequency trade data of U.S. banks, broker-dealers, and insurance companies, with the goal of better understanding cross-effects between these institutions. Our methods can be used to address the same questions that researchers ask in the low-frequency context, including how network connectivity changes over time and what information channels exist between firms. On the first count, we apply our methods to intraday data spanning 1998 to 2010, thereby visualizing the historical evolution of network connectivity over both economically stable and crisis periods. We find that the networks reach maximum density in late 2007, following the collapse of two subprime mortgage funds associated with the investment bank Bear Stearns. Several of the most highly connected nodes in the network, including Lehman Brothers and AIG, have been recognized as key contributors to the U.S. financial crisis. Second, we demonstrate how our methods can be used to detect possible information spillovers between small and large financial firms. This line of analysis is motivated by Chordia et al. (2011), which provides empirical evidence that the returns of large stocks lead (in the Granger causal sense) the returns of small stocks, and that this lead-lag relationship is especially strong when the large stocks have low liquidity. Our results are consistent with this earlier analysis: we find that the microstructure variables of large firms tend to be more important (compared to the microstructure variables of small firms) in predicting market measures for both small and large firms.
In summary, we contribute to the literature on financial connectivity estimation on two fronts. First, we provide a systematic nonparametric procedure, equipped with bootstrap-based uncertainty quantification, to detect cross-asset predictability from high-frequency trade data sets. Our empirical findings suggest that the method can be used to construct financial networks and complement the existing literature, which primarily uses low-frequency, price-based information to estimate linkages among financial firms. Second, finance theory provides little guidance about which variables describing the trading environment of firm A should be predictive of market measures for firm B, and it provides no guidance on the structure of any such relationship. Market microstructure theory, including recent analyses of high-frequency trading, typically focuses on one firm at a time.1 Our results suggest that this focus obscures important cross-firm effects. We provide some insight into which variables matter for cross-firm predictions, and our hope is that these results spur the development of more complete multi-firm analyses.
The manuscript is organized as follows. In Section 2, we describe our methodology, including an overview of how the data are structured, an explanation of how random forests work, and details of our bootstrap AUC procedure, which is used to assess whether cross-features provide predictive improvement. In Section 3, we define five microstructure variables that are used as features in the random forest, while Section 4 presents two market measures that serve as labels (variables that we predict). In Section 5, we describe the high-frequency data used in our empirical analysis, including which firms we choose to focus on. Section 6 presents the results of two empirical analyses, namely the evolution of network connectivity over time and the presence of information flows between small and large firms. Section 7 concludes.
2. Methods
In this section, we first describe how our data are structured, namely how we sample from high-frequency trade information to construct microstructure variables and market measures. We then outline our random forest methodology, including an overview of how random forests work and details of our training and testing procedure. Lastly, we introduce two metrics used to interpret the random forest results; the first measures the relative importance of features used in the model, while the second quantifies the random forest’s predictive accuracy. By comparing the accuracy with and without cross-features, we create financial networks whose edges indicate that one firm’s variables significantly improve our ability to predict the other firm’s market measure. Details are provided below.
2.1. Data Structure
Before describing our statistical methods, we briefly explain how our dataset is structured. We begin by obtaining high-frequency trade data from the NYSE Trade and Quote (TAQ) database [NYSE Trade and Quote Database]. TAQ provides information on every trade that occurs on a U.S.-based exchange, including the NYSE, Nasdaq, National Stock Exchange, and others. Among the many variables returned by TAQ are the timestamp, price, and volume of each trade. These three variables are integral to creating our final dataset: timestamps are used to aggregate trades (thereby reducing the total number of observations in our dataset), while price and volume are used to create the microstructure variables and market measures that serve as features and labels in our random forest.
2.1.1. Trade Aggregation
Aggregating trades is common in high-frequency financial data analysis, for several reasons: aggregation limits the effect of noise, reduces the amount of data that we need to process, and allows for the creation of economically meaningful variables [Hautsch (2012)]. Trade aggregation can be based on time (e.g., collecting all trades whose timestamps fall in a 30-minute interval) or on events (e.g., aggregating trades until the price change exceeds a given threshold). In our analysis, we use time-based aggregation, grouping each firm’s trades into 30-minute time bars.2 Aggregating our data into time bars is also important for synchronizing our data across firms. Trades occur at random times and are not naturally synchronized across firms. Because we use trade data from firm A to make predictions about firm B and trade data from firm B to make predictions about firm A, our dataset must be synchronized across firms to ensure that we never use future data to predict the past. Our time bar aggregation accomplishes this. The choice of 30-minute bars is somewhat arbitrary, but it is guided by two observations. First, [Easley et al. (2021)] investigates the effect of using 30-minute versus 60-minute bars and finds little difference in predictive ability. Second, very short time bars are problematic because many bars would contain no trades, raising the issue of missing data.
Since we consider only trades that occur during regular market hours (9:30 AM EST to 4:00 PM EST), our time bars correspond to the intervals 9:30 AM to 10:00 AM, 10:00 AM to 10:30 AM, and so on, up to 3:30 PM to 4:00 PM, with these bars repeated for each day of the sample period. We emphasize that time bars are formed on a firm-by-firm basis; that is, we do not combine trades of stocks A and B into a single bar.
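To make the aggregation concrete, the following R sketch groups a table of trades into 30-minute time bars and records each bar’s closing price, share volume, and dollar volume. The data frame `trades` and its column names (`timestamp`, `price`, `size`) are illustrative assumptions, not the raw TAQ field names, and bars are built separately for each firm.

```r
# Minimal sketch of 30-minute time-bar aggregation (assumed input format,
# not the raw TAQ layout): `trades` has columns timestamp (POSIXct), price, size.
make_time_bars <- function(trades, width = "30 mins") {
  trades <- trades[order(trades$timestamp), ]
  bar_id <- cut(trades$timestamp, breaks = width)   # label each trade with its bar
  groups <- split(trades, bar_id, drop = TRUE)      # one group per non-empty bar
  bars <- data.frame(
    bar        = as.POSIXct(names(groups)),         # left endpoint of the bar
    close      = sapply(groups, function(tr) tr$price[nrow(tr)]),  # last trade price
    volume     = sapply(groups, function(tr) sum(tr$size)),        # total shares traded
    dollar_vol = sapply(groups, function(tr) sum(tr$price * tr$size))
  )
  rownames(bars) <- NULL
  bars
}
```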
2.1.2. Microstructure and Market Variables, Lookback Windows, and Forecast Horizons
Once a firm’s trades have been gathered into time bars, we construct a set of microstructure variables and market measures that capture key properties of the firm’s trading. In Sections 3 and 4, we provide definitions of these variables and measures. For now, we note that microstructure variables are used as features (predictors) in our random forest, while market measures are used to calculate labels (quantities we predict). All are based on sequences of trade prices and volumes, and all are computed – at each time bar – using a lookback window of size $W$. For instance, the value of Kyle’s lambda (one of the microstructure variables) at time bar $t$ is based on the trade prices and volumes at time bars $t-W+1$ through $t$.
The microstructure variables at time bar $t$ are then used to predict the sign of the change in a market measure at time bar $t+h$, where $h$ is a fixed forecast horizon. For example, one of the market measures we consider is realized volatility. We do not predict the value of realized volatility at bar $t+h$, nor do we predict the magnitude of the change in realized volatility between bars $t$ and $t+h$. Instead we predict whether this change is positive (realized volatility increases) or negative (realized volatility decreases). The sign of the change in realized volatility becomes the label for our random forest; thus we are predicting a binary variable that takes the value 1 if the market measure increases and −1 if it decreases.
In our analysis, we set W = 50 and h = 50. Since each time bar represents a 30-minute interval and there are 12 such intervals during regular market hours3, our lookback window size and forecast horizon both correspond to slightly more than four trading days.
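As a sketch of how features and labels line up in time, suppose `m` is the per-bar series of a market measure and `X` is the matrix of microstructure features, with row $t$ computed from the lookback window ending at bar $t$. The helper below pairs the features at bar $t$ with the sign of the change in `m` from bar $t$ to bar $t+h$; dropping the rare bars with no change is our own simplifying assumption, not a rule stated in the paper.

```r
# Sketch of label construction: y_t = sign(m[t + h] - m[t]), paired with X[t, ].
make_labels <- function(m, X, h = 50) {
  t_max <- length(m) - h
  y     <- sign(m[(1 + h):(t_max + h)] - m[1:t_max])  # +1 if the measure rises, -1 if it falls
  keep  <- y != 0                                     # simplifying assumption: drop "no change" bars
  list(X = X[1:t_max, , drop = FALSE][keep, , drop = FALSE],
       y = factor(y[keep], levels = c(-1, 1)))
}
```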
2.2. Random Forest
Random forests are a popular machine learning tool for predicting the values of a binary variable [Breiman (2001), Friedman et al. (2001)]. In our work, this binary variable represents whether a market measure — such as realized volatility — decreases (−1) or increases (1) over some fixed forecast horizon. Random forests work by aggregating the predictions of many decision trees, so we begin by describing how each tree makes its prediction.
A decision tree takes as input training data of the form $\{(x_i, y_i)\}_{i=1}^{n}$, where $y_i \in \{-1, 1\}$ is the label for observation $i$ and $x_i$ is its vector of features. The tree repeatedly splits the training observations into two subsets on the basis of one of the features. For example, the first split might separate the training set based on whether the second feature is greater than 5, yielding the two subsets $\{i : x_{i2} > 5\}$ and $\{i : x_{i2} \le 5\}$. The next split could be based on whether the third feature is greater than 10, yielding four subsets, and so on. In this example, features 2 and 3 are referred to as split features, while 5 and 10 are split points. Decision trees choose split features and split points by maximizing information gain, which measures how pure the labels are in the subsets that result from the split. Maximum purity (information gain) is achieved when one subset contains only observations with label 1 and the other contains only observations with label −1. As we move further down the tree, generating more and more splits, the feature space becomes increasingly partitioned and fewer observations remain in each node of the tree. Eventually the tree stops growing (according to a particular stopping criterion) and we classify each observation by considering the terminal node (aka leaf) to which that observation belongs. Specifically, each observation is predicted to have the most commonly occurring label (−1 or 1) in its leaf.
Decision trees are known to have low bias and high variance [Friedman et al. (2001)]. They are accurate, on average, but individual decision trees are prone to overfitting the training data and sometimes do not perform well when generalized to a test set. Random forests counteract overfitting by aggregating the predictions of many decision trees, thereby stabilizing the overall prediction [Breiman (2001)]. In particular, for each observation $i$, the random forest computes the fraction of trees that predict −1 vs. 1. We can then make a prediction for observation $i$ based on which class has the higher probability. For example, suppose a random forest consists of 100 trees, 55 of which predict −1 and 45 of which predict 1, yielding class probabilities of 0.55 and 0.45. Then we set our final prediction to −1 since it is the majority vote over all trees. Each decision tree in the forest is trained on a bootstrapped sample; that is, we draw samples with replacement from our training set and fit one decision tree on each bootstrap sample.
Lastly, an important aspect of random forests is that not all features are taken to be candidates for every split. Instead – at each split – we choose a random subset of features, compute the largest information gain we can achieve with each of these features (over all split points), and select as our final split feature and split point the ones that offer maximal information gain. This procedure is particularly helpful when there are correlated features, in which case decision trees may select the feature that offers marginally higher information gain, while ignoring its highly correlated but slightly less predictive counterpart. By randomizing the split candidates, we ensure that each of these features has an equal opportunity of being selected.
2.2.1. Random Forest Parameters
We implement our random forest using the randomForest package in R [Liaw and Wiener (2002)]. In particular, each tree in the forest is allowed to grow without limit (i.e., the minimum leaf size is 1), and at each split we randomly select a subset of the $p$ features as candidates. Recall that $p$ denotes the number of features, which varies based on whether we include cross-effects. We use five microstructure variables (see Section 3), so $p$ is either 5, if we use only the firm’s own variables for prediction, or 10, if we also consider the features of one other firm. Finally, we assign a weight to each training set observation on the basis of its class; observations with label 1 (resp. −1) receive weight proportional to $1/n_{+}$ (resp. $1/n_{-}$), where $n_{+}$ and $n_{-}$ denote the number of training observations with label 1 and −1, respectively. When drawing the bootstrap samples, training observations are randomly sampled according to these weights so that the effect of class imbalance is minimized.
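A minimal fit along these lines is sketched below using the randomForest package. The tree count of 500 and the use of stratified, balanced bootstrap samples (`strata`/`sampsize`) are illustrative stand-ins for the paper’s class-weighted sampling, and `mtry = floor(sqrt(p))` is simply the package default rather than a choice stated here.

```r
library(randomForest)

# Sketch of the random forest fit; X_train is a data frame of microstructure
# features and y_train a factor with levels -1 and 1.
fit_rf <- function(X_train, y_train, n_tree = 500) {
  n_min <- min(table(y_train))                # size of the rarer class
  randomForest(
    x = X_train, y = y_train,
    ntree    = n_tree,
    mtry     = floor(sqrt(ncol(X_train))),    # package default for classification
    nodesize = 1,                             # grow each tree without limit
    strata   = y_train,
    sampsize = c(n_min, n_min)                # balanced bootstrap draw per tree
  )
}

# Predicted probability of an increase (label "1") on a test set:
# prob_up <- predict(fit_rf(X_train, y_train), X_test, type = "prob")[, "1"]
```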
2.2.2. Purged Cross-Validation
Once the random forest is fit on the training set, we evaluate its performance on test data. Depending on the exact analysis we perform (see Section 6 for details), we use one of two approaches. The first procedure is purged cross-validation, as proposed in Easley et al. (2021). This involves splitting the sample period into $K$ intervals of equal length. We then iterate over the intervals, taking each interval $k$ in turn to be the test set and using all other intervals as training data, with one caveat. Since our microstructure variables (features) and market measures (labels) are formed using a lookback window, the train and test sets under this approach are not independent of each other, introducing bias into our results. To correct for this, we purge five days’ worth of data from around each test set (see Figure 1). This procedure yields $K$ sets of results, one for each interval. In Section 2.3, we discuss how to aggregate these results across test sets.
Figure 1.
Schematic of the purged cross-validation procedure. The sample period is divided into 6 intervals of equal length, each interval serving as a test set as we iterate over the sample period. Suppose interval 4 is the current test set. Then five days’ worth of data are purged from before and after interval 4, and the remaining data are used as the training set.
Purged cross-validation allows us to test on the entire dataset as we iterate over intervals; however, it has the disadvantage that for some choice of test sets (see, e.g. Figure 1) we test our model on data that occur prior to our training data. (This would not be the approach of, say, a practitioner applying a random forest to recent financial data in order to forecast changes in market measures.) To ensure that the chronology of the train and test sets does not impact our final results, we use an alternative approach for some of our analyses. This consists of splitting the sample period into two intervals, training on the earlier interval (which has some data purged from it) and testing on the later interval.
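The index bookkeeping for purged cross-validation can be sketched as follows; the buffer of 55 bars (roughly five trading days of 30-minute bars) is our own back-of-the-envelope translation of the five-day purge and should be adjusted to the actual number of bars per day.

```r
# Sketch of purged cross-validation folds: n bars, K folds, and a purge buffer
# (in bars) removed on each side of the test fold before training.
purged_folds <- function(n, K = 6, purge = 55) {
  fold_id <- cut(seq_len(n), breaks = K, labels = FALSE)
  lapply(seq_len(K), function(k) {
    test  <- which(fold_id == k)
    guard <- max(1, min(test) - purge):min(n, max(test) + purge)  # test fold plus buffer
    list(train = setdiff(seq_len(n), guard), test = test)
  })
}
```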
2.3. Evaluating the Random Forest
After applying our random forest to the test sets, we consider two aspects of our model’s performance: (i) its predictive ability, i.e., how well the random forest classifies observations in the test set, and (ii) which features are most important in making those predictions. We address each of these points in turn.
2.3.1. AUC for Assessing Prediction Accuracy and Forming Networks
The receiver operating characteristics (ROC) curve offers a visual medium by which we can assess the predictive performance of a binary classifier such as a random forest [Fawcett (2006)]. For each observation in the test set, the random forest provides the probability that the observation’s label is −1, from which we can readily compute the probability that the observation’s label is 1. We convert these probabilities to actual predictions of −1 or 1 by setting a decision threshold and evaluating whether the observation’s probability (e.g., of being −1) meets this threshold. For instance, if we set the decision threshold to 0.5, then observations are classified according to whether the majority of trees in the random forest predict −1 or 1 for the observation in question.
The ROC curve displays the tradeoff between the true positive rate and the false positive rate as we vary the decision threshold between 0 and 1. The true positive rate (also referred to as recall) is defined as
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad (1)$$
where $TP$ (resp., $FN$) is the number of true positives (resp., false negatives) produced by the classifier at a set threshold. In our analysis, we take labels of 1 to be positives and −1, negatives. From equation (1), we can see that the true positive rate is simply the proportion of positives in our system that are correctly classified as such. Similarly, the false positive rate is given by
$$\mathrm{FPR} = \frac{FP}{FP + TN} \qquad (2)$$
where $FP$ (resp., $TN$) is the number of false positives (resp., true negatives) produced by the classifier at a set threshold. The false positive rate is then the proportion of negatives in our system that are incorrectly classified as positives. Both the TPR and FPR can be computed given the predicted class probabilities and true labels for the test set. Recall that, for purged cross-validation, we use multiple test sets; however, we simply aggregate the predicted and true values across all intervals, yielding – in effect – a single set of test results.
As we vary the decision threshold from 0 (all observations classified as positive) to 1 (all observations classified as negative), both the TPR and the FPR decrease from 1 to 0. The ROC curve plots the TPR and FPR at each of these intervening thresholds (see Figure 2). A random classifier (i.e., one which – for each observation – predicts −1 or 1 with equal probability) yields a diagonal ROC curve running from (0, 0) to (1, 1), while a classifier that perfectly separates negatives from positives has an ROC curve running from (0, 0) up to (0, 1) and across to (1, 1). Thus, we can quantify a classifier’s performance by computing the area under the ROC curve, referred to as the AUC. In the case of a random classifier, the AUC is 0.5, while a perfect classifier has an AUC of 1. AUC has the advantage that it assesses a classification model’s performance over all possible decision thresholds, without requiring us to set a single threshold.
Figure 2.
Receiver operating characteristics (ROC) curves plot the true and false positive rates of a binary classifier as the decision threshold is varied. Random classifiers have a diagonal ROC curve, with a corresponding area under the curve (AUC) of 0.5. Higher values of AUC, as illustrated here, indicate better classifier performance.
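Because the AUC equals the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, it can be computed directly from ranks without tracing the full curve. The helper below does this via the rank-sum (Mann-Whitney) identity; it is a generic sketch, not code from the paper.

```r
# AUC via the rank-sum identity: `score` is the predicted probability of label 1
# and `y` holds the true labels in {-1, 1}.
auc <- function(score, y) {
  pos <- score[y == 1]
  neg <- score[y == -1]
  r   <- rank(c(pos, neg))                    # mid-ranks handle tied scores
  (sum(r[seq_along(pos)]) - length(pos) * (length(pos) + 1) / 2) /
    (length(pos) * length(neg))
}
```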
While an AUC of 0.6 (Figure 2) is generally considered only a small improvement over a random classifier, Easley et al. (2021) shows that even an AUC in the 0.54–0.61 range is high by financial machine learning standards and can be taken to capture a potential inefficiency in the market. In Section 6 of their paper, the authors show that, in a forecasting framework similar to ours, bets made using a predictive model with an AUC of 0.52 can lead to an annualized Sharpe ratio of 2.04, which practitioners consider sizeable.
In our quest to detect predictability across assets, however, we care more about the statistical significance of the AUC improvement than about its effect size. This is in line with how Granger causality networks are constructed in low-frequency analyses (Billio et al. 2012), where significance of the Granger causal effect at a pre-specified level $\alpha$ is used to define a network edge. We use AUC to detect the presence of cross-effects between firms; that is, to assess whether microstructure variables of firm B are useful in predicting market measures of firm A. We take the view that there are two competing models: Model 1 is a random forest not containing any cross-features (firm A’s variables only), while Model 2 is a random forest that does contain cross-features (both firm A’s and firm B’s variables). If features from firm B have predictive power, then Model 2 should have a higher AUC than Model 1. Thus, to determine whether cross-effects exist, we test the following hypotheses:
$$H_0: \mathrm{AUC}_2 \le \mathrm{AUC}_1 \quad \text{versus} \quad H_1: \mathrm{AUC}_2 > \mathrm{AUC}_1 \qquad (3)$$
where $\mathrm{AUC}_m$ denotes the AUC of Model $m$, with $m \in \{1, 2\}$.4 The test in (3) is executed according to the following steps:
1. We fit Models 1 and 2 on the training data, apply the fitted models to the test data, and store the predicted class probabilities $\hat{p}^{(1)}_i$ from Model 1; the predicted class probabilities $\hat{p}^{(2)}_i$ from Model 2; and the true test set labels, $y_i$. Here $i$ indexes observations in the test set.
2. Using the predictions and true values, we compute $\mathrm{AUC}_1$ and $\mathrm{AUC}_2$, the areas under the curve for Models 1 and 2, respectively.
3. We then draw $B$ bootstrap samples from $\{(\hat{p}^{(1)}_i, \hat{p}^{(2)}_i, y_i)\}$. For each bootstrap sample $b$, with $b = 1, \ldots, B$, we calculate new areas under the curve, $\mathrm{AUC}^{(b)}_1$ and $\mathrm{AUC}^{(b)}_2$, storing the difference $d^{(b)} = \mathrm{AUC}^{(b)}_2 - \mathrm{AUC}^{(b)}_1$.
4. The standard deviation, $\hat{\sigma}_d$, of the bootstrap differences is computed and a test statistic, $z$, is calculated as
$$z = \frac{\mathrm{AUC}_2 - \mathrm{AUC}_1}{\hat{\sigma}_d}.$$
5. Finally, a one-sided p-value is computed under the assumption that $z$ follows a normal distribution.5
Steps 1-5 are repeated twice for each pair of firms, (A, B), in the system, once to make predictions for firm A and again to make predictions for firm B. This yields a set of $N(N-1)$ p-values, where $N$ is the number of firms under consideration. We apply a multiple testing correction to control the false discovery rate [Benjamini and Hochberg (1995)] and form directed networks with edges between pairs of firms whose adjusted p-value falls below some threshold.
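Steps 1-5 can be sketched in a few lines of R, reusing the `auc()` helper above; the choice of B = 1000 bootstrap draws is illustrative, and `p1`, `p2` denote the test-set probabilities of an increase from Models 1 and 2.

```r
# Sketch of the bootstrap test for an AUC improvement (steps 1-5 above).
auc_improvement_pvalue <- function(p1, p2, y, B = 1000) {
  auc1 <- auc(p1, y)
  auc2 <- auc(p2, y)
  d_boot <- replicate(B, {
    idx <- sample(length(y), replace = TRUE)  # resample test-set observations
    auc(p2[idx], y[idx]) - auc(p1[idx], y[idx])
  })
  z <- (auc2 - auc1) / sd(d_boot)             # test statistic from step 4
  pnorm(z, lower.tail = FALSE)                # one-sided p-value from step 5
}

# The resulting p-values across all candidate edges can then be adjusted with
# p.adjust(pvals, method = "BH") before thresholding.
```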
2.3.2. MDA for Feature Importances
Area under the curve measures the random forest’s predictive performance; however, we are also interested in knowing to what extent the various features contribute to these predictions. We quantify feature importances using the mean decrease in accuracy (MDA), which compares the random forest’s accuracy on the original data to its accuracy on a dataset for which the values of a feature have been randomly permuted [Biau and Scornet (2016)]. Accuracy is defined as the fraction of all test set observations that are classified correctly6:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}.$$
For each feature $j$, we compute its MDA as follows:
1. We begin by fitting a model to the training set and computing its accuracy, $A$, on the test set.
2. Next, we randomly permute the values of feature $j$ in the test set. We make predictions on this shuffled test set and compute the new accuracy, $A_j$.
3. The MDA for feature $j$ is the fraction by which the model’s test set accuracy decreases after shuffling feature $j$:
$$\mathrm{MDA}_j = \frac{A - A_j}{A} \qquad (4)$$
Features having a high MDA are considered to be more important since they have a large effect on the model’s accuracy. In our analysis, we compute the MDAs separately for each test set in the sample period. The MDA values are then averaged over test sets to yield a mean importance for each feature $j$.
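A bare-bones version of the permutation step is sketched below for a fitted randomForest object `rf`; it computes the fractional accuracy drop for every column of the test feature matrix and is a generic sketch rather than the paper’s implementation.

```r
# Sketch of mean decrease in accuracy (MDA) on a single test set.
mda <- function(rf, X_test, y_test) {
  acc <- function(X) mean(predict(rf, X) == y_test)   # fraction classified correctly
  a0  <- acc(X_test)                                  # baseline test-set accuracy
  sapply(colnames(X_test), function(j) {
    X_perm      <- X_test
    X_perm[, j] <- sample(X_perm[, j])                # shuffle feature j
    (a0 - acc(X_perm)) / a0                           # fractional drop in accuracy
  })
}
```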
3. Market Microstructure Variables
Our random forest model uses a variety of market microstructure variables as features. Microstructure variables are designed to measure illiquidity, volatility, order imbalance, and other consequences of market frictions. As in Easley et al. (2021), we focus on five such measures that represent the evolution of microstructure models from those that use price data alone (first generation) to those that use both price and volume data (second generation) to those that use more extensive trade information (third generation). Most of these measures were designed before the advent of high-frequency trading, raising the question of how well they capture market frictions in our current, more complex financial era. Thus our model helps to assess the ongoing utility of these traditional market microstructure variables. In what follows, we describe each of the five measures, including their importance and how they are computed.
3.1. Roll Measure
The Roll measure – a first generation microstructure variable – uses sequences of price changes to estimate the effective bid-ask spread, which in turn is a proxy for the transaction cost [Hautsch (2012)]. The Roll measure at bar $t$, written $\mathrm{Roll}_t$, is a function of the first-order serial covariance of price changes:
$$\mathrm{Roll}_t = 2\sqrt{\left|\operatorname{Cov}(\Delta p_t, \Delta p_{t-1})\right|} \qquad (5)$$
Here $\Delta p_t = p_t - p_{t-1}$, with $p_t$ denoting the closing price at bar $t$, so that $\Delta p_t$ is the difference between the closing prices at bars $t$ and $t-1$; the covariance is computed over the lookback window.
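Assuming the covariance form in equation (5), the Roll measure for a window of bar closing prices can be computed as follows; this is a sketch rather than the paper’s implementation.

```r
# Roll measure over a lookback window of closing prices (equation (5)).
roll_measure <- function(close) {
  dp <- diff(close)                                 # bar-to-bar price changes
  2 * sqrt(abs(cov(dp[-1], dp[-length(dp)])))       # first-order serial covariance
}
```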
3.2. Roll Impact
Roll impact, a second generation variable, is closely related to the Roll measure. Specifically, Roll impact is defined as the Roll measure scaled by the amount of dollar volume traded over the bar:
$$\mathrm{RollImpact}_t = \frac{\mathrm{Roll}_t}{\sum_{j \in J_t} p_j v_j} \qquad (6)$$
where $J_t$ is the set of trades belonging to bar $t$, and $p_j$ and $v_j$ are the price and volume, respectively, of trade $j$. Since the numerator, $\mathrm{Roll}_t$, represents transaction cost, Roll impact can be interpreted as the transaction cost per unit of trade.
3.3. Kyle’s Lambda
Kyle’s lambda at bar $t$, denoted $\lambda_t$, is estimated from the regression of price changes on signed order flow over the lookback window,
$$\Delta p_\tau = \lambda_t \left(b_\tau V_\tau\right) + \varepsilon_\tau, \qquad \tau = t-W+1, \ldots, t, \qquad (7)$$
where $p_\tau$ is the closing price of bar $\tau$, $V_\tau$ is the total volume traded over bar $\tau$, and $b_\tau = \operatorname{sign}(\Delta p_\tau)$ is the aggregate trade direction of bar $\tau$. Kyle’s lambda is the coefficient obtained by regressing price change on order flow, and thus measures the price impact of trading.
3.4. Amihud’s Lambda
Amihud’s lambda, another second generation variable, measures illiquidity by computing the ratio of the price change to the amount traded. Thus Amihud’s lambda can be viewed as the “price change per trade size,” with less liquid assets having a larger per-unit price impact than their more liquid counterparts [Hautsch (2012)]. In particular, Amihud’s lambda at bar $t$ is defined as
$$\mathrm{Amihud}_t = \frac{|r_t|}{\sum_{j \in J_t} p_j v_j} \qquad (8)$$
where $r_t$ is the return over bar $t$.
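Under the reconstruction in equation (8), which scales the absolute bar return by the dollar volume traded in the bar, a per-bar implementation is a one-liner; the dollar-volume denominator is an assumption consistent with the standard Amihud measure.

```r
# Amihud's lambda for a single bar (equation (8)): absolute return per dollar traded.
# `ret` is the bar return; `price` and `size` are the trade-level prices and volumes.
amihud_lambda <- function(ret, price, size) {
  abs(ret) / sum(price * size)
}
```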
3.5. VPIN
The volume-synchronized probability of informed trading (VPIN) arises from third generation market microstructure models. By comparing the amount of buyer- and seller-initiated trades, VPIN quantifies the extent to which there is information asymmetry in the market. For example, if a group of traders knows that an asset’s price is about to rise, we may observe a preponderance of buyer-initiated trades as informed traders rush to secure the asset before its price increases. The VPIN at bar $t$ is given by
$$\mathrm{VPIN}_t = \frac{\left|\hat{V}^{B}_t - \hat{V}^{S}_t\right|}{V_t} \qquad (9)$$
where $V_t$ is the total volume traded over bar $t$, $\hat{V}^{B}_t$ is the estimated total buy volume over bar $t$, and $\hat{V}^{S}_t$ is the estimated total sell volume over bar $t$. Importantly, the information provided in the Trade and Quote (TAQ) database does not include whether trades were buyer- or seller-initiated (we call such trades “unsigned”). Thus, before computing the VPIN, we must first classify trades as buys or sells. A number of methods exist for this purpose (e.g., the Lee-Ready algorithm and the tick rule); here we use bulk volume classification (BVC), which has been demonstrated to outperform other techniques when the trade data are noisy [Easley et al. (2016)].
3.5.1. Bulk Volume Classification
Bulk volume classification is based on the heuristic that, if a trade is buyer-initiated, it will take place at the ask (the lowest price offered by sellers) and therefore will generate an uptick in the price of the asset. Similarly, if the trade is seller-initiated, it will take place at the bid (the highest price offered by buyers) and therefore will produce a downtick in price. This idea suggests that we can determine the amount of buyer- (resp., seller-) initiated trades by considering whether the price of the asset goes up or down. More specifically, let $V_t$ be the total volume traded over bar $t$, with $\Delta p_t$ denoting the change in the closing price between bars $t-1$ and $t$. Then BVC estimates the volume of buyer-initiated trades over bar $t$ to be
$$\hat{V}^{B}_t = V_t \cdot \Phi\!\left(\frac{\Delta p_t}{\sigma_{\Delta p}}\right), \qquad \hat{V}^{S}_t = V_t - \hat{V}^{B}_t, \qquad (10)$$
where $\sigma_{\Delta p}$ is the empirical standard deviation of the price changes (over all bars) and $\Phi$ is the cumulative distribution function of a standard normal random variable. Notice that the more positive the scaled price change $\Delta p_t / \sigma_{\Delta p}$ is, the closer $\Phi(\Delta p_t / \sigma_{\Delta p})$ is to 1, so that most of the volume traded over bar $t$ is classified as buyer-initiated. Similarly, the more negative the scaled price change, the more volume is classified as seller-initiated. This result comports with the heuristic we described above: buyer-initiated trades are more likely to produce positive price changes, while seller-initiated trades are more likely to generate negative price changes.
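Combining equations (9) and (10), the sketch below classifies each bar’s volume with BVC and then forms the bar-level VPIN; `close` and `volume` are per-bar closing prices and total volumes, and the first bar is returned as NA because it has no preceding price.

```r
# Sketch of bulk volume classification (equation (10)) and bar-level VPIN (equation (9)).
bvc_vpin <- function(close, volume) {
  dp       <- c(NA, diff(close))                          # price change between bars
  buy_vol  <- volume * pnorm(dp / sd(dp, na.rm = TRUE))   # estimated buyer-initiated volume
  sell_vol <- volume - buy_vol                            # remainder treated as seller-initiated
  abs(buy_vol - sell_vol) / volume                        # VPIN for each bar
}
```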
4. Market Measures
We use the above-described microstructure variables as features in our random forest, with the aim of predicting several important market measures. Although there are a number of market measures that interest traders, regulators, and researchers, for financial network estimation we focus on a single market measure: the sign of the change in realized volatility. In a secondary analysis on information flow across large and small firms, we use an additional market measure: the sign of the change in the kurtosis of returns. We describe each in turn, explaining why they are of interest and how we compute them.
4.1. Sign of the Change in Realized Volatility
Realized volatility is a nonparametric ex-post estimate of the return variation, often measured by the sum of finely-sampled squared return realizations over a fixed time interval (Andersen and Teräsvirta 2009). If $r_\tau$ denotes the return over bar $\tau$, then the realized volatility at bar $t$ can be obtained from the squared returns over the lookback window as $\mathrm{RV}_t = \sqrt{\sum_{\tau = t-W+1}^{t} r_\tau^2}$. The sign of the change in realized volatility can be calculated as
$$\operatorname{sign}\!\left(\mathrm{RV}_{t+h} - \mathrm{RV}_t\right) \qquad (11)$$
which is 1 when the realized volatility increases (over a forecast horizon of $h$ bars) and −1 when the realized volatility decreases. A trader who predicts that volatility will rise may want to adjust their execution algorithm, increasing their trading activity so that orders are completed before prices begin to fluctuate [Easley et al. (2021)].
4.2. Sign of the Change in the Kurtosis of Returns
Many standard risk models assume normally distributed returns; thus, traders are interested in forecasting any deviations from normality so that they can adapt their risk management practices accordingly. One such deviation could be an increase or decrease in the kurtosis (“tailedness”) of the returns. For example, high forecasted kurtosis could be caused by a drop in liquidity: with fewer orders on the book, trades are executed at more extreme prices, thereby generating more extreme returns [Easley et al. (2021)]. An increase in kurtosis indicates a reduction in liquidity and thus an increased slippage for traders who are attempting to execute an order; similarly, a reduction in kurtosis indicates less slippage in execution. Thus forecasting kurtosis is important for portfolio management.
The (excess) kurtosis7 at bar $t$ is given by
$$\kappa_t = \frac{\hat{\mu}_{4,t}}{\hat{\sigma}_t^{4}} - 3 \qquad (12)$$
where $\hat{\mu}_{4,t}$ and $\hat{\sigma}_t$ are, respectively, the empirical fourth moment and standard deviation of the returns over the lookback window. The sign of the change in kurtosis is then
$$\operatorname{sign}\!\left(\kappa_{t+h} - \kappa_t\right) \qquad (13)$$
5. Data Description
We obtain intraday trade data from the NYSE Daily Trade and Quote (TAQ) database, via Wharton Research Data Services (WRDS) [NYSE Trade and Quote Database]. TAQ includes trade and quote information for all stocks that are actively traded on a U.S.-based exchange; however, we focus our attention on firms from the financial sector, specifically banks, primary broker-dealers, and insurance companies. In so doing, we are able to compare our results to the analyses in Karpman et al. (2022), where lower-frequency data (monthly returns) are used to construct financial networks on the same set of firms. As in Karpman et al. (2022), sectoral membership of firms is identified using the Standard Industrial Classification (SIC) code. We analyze data for this set of firms over two time periods: 1998-2010, and 2018 (see Sections 6.1 and 6.2, respectively).
Starting with the full set of trades for these firms, we apply the following filters to compile our final dataset: (i) remove any trades whose price or volume is negative since these records are clearly erroneous, (ii) exclude trades occurring outside of regular market hours (9:30 AM to 4:00 PM EST), (iii) only retain trades of common shares8, and (iv) remove trades that are corrected, changed, or marked as erroneous9. For each stock, we form time series of each of the microstructure variables and market measures by aggregating trades into 30-minute time bars (see Section 2.1.1). Lastly, since the market opening is run according to a different process, namely, an auction, we remove the first bar of each day from our final dataset.
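A schematic version of filters (i), (ii), and (iv) is shown below; the column names (`price`, `size`, `timestamp`, `tr_corr`) and the correction code "00" are assumptions about the TAQ extract, and filter (iii), the common-share screen, requires security master information that is omitted here.

```r
# Sketch of the trade filters in Section 5 (column names are assumed, not official TAQ fields).
clean_trades <- function(taq) {
  hhmm <- as.integer(format(taq$timestamp, "%H%M"))
  keep <- taq$price > 0 & taq$size > 0 &      # (i) drop negative (or zero) prices and volumes
          hhmm >= 930 & hhmm < 1600 &         # (ii) keep regular market hours only
          taq$tr_corr == "00"                 # (iv) drop corrected/changed/erroneous records
  taq[keep, ]
}
```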
6. Results
Having discussed the methods by which we construct high-frequency financial networks, we now demonstrate how such networks can be used to gain insight into the structure of the financial system using historical trade data. We consider two examples. The first examines how inter-firm connections vary over the course of 1998 to 2010, with a special focus on whether connectivity changes in and around financial crises (see Section 6.1). The second example explores why edges appear between certain pairs of firms and – in particular – whether the sizes of the firms (measured via market capitalization) play a role (see Section 6.2).
6.1. Historical Evolution of High-Frequency Financial Network Connectivity
Connections between financial institutions are important for monitoring risk in the financial system. For instance, Billio et al. (2012) shows that the number of Granger causal connections increases during the economically unstable periods of 1998-1999 and 2007-2008. Likewise, Basu et al. (2019) (which refines the methods in Billio et al. (2012)) demonstrates that network connectivity spikes around several recent systemic events, including the 1998 Russian financial crisis and the 2008 collapse of the investment bank Lehman Brothers. Karpman et al. (2022) expands on these methods further by constructing networks via quantile Granger causality, which focuses on firm connections that exist specifically during market downturns. Each of the aforementioned studies uses monthly stock returns for network building.
The method proposed in this paper is a natural vehicle for building similar networks using high-frequency data. We have described how microstructure variables, computed from intraday trade data, can be used to predict future changes in market measures such as realized volatility. Since these microstructure measures reflect information-based trading, a firm, B, whose microstructure measures help predict the realized volatility of another firm, A, represents a possible source of risk to firm A. Thus, by assessing whether features from one firm are useful in forecasting changes for another firm, we can construct a network whose edges represent a high-frequency analogue of returns-based Granger causality. In this section, we create such networks for a set of firms and a time period comparable to those considered in Billio et al. (2012), Basu et al. (2019), and Karpman et al. (2022). We begin with the details of our network construction process, and then compare our results to those obtained using bivariate Granger causality applied to monthly stock returns.
6.1.1. Methodology for Constructing 1998-2010 Financial Networks
For each year between 1998 and 2010, we rank all actively-traded firms according to their average monthly market capitalization, which is computed using data from the Center for Research in Security Prices (CRSP) database, accessed via WRDS [CRSP Stocks]. Using this ranking, for each year, we identify the top 25 banks, primary broker-dealers, and insurance companies, yielding a total of 75 firms.10 Any firms having insufficient data are excluded from our analysis, resulting in some variation in the number of firms considered per year (ranging from 59 firms in 1998 to 75 firms in the later years of the sample period).11
Next we divide each year into three overlapping 6-month periods: January 1 through June 30, April 1 through September 30, and July 1 through December 31. Thus our analysis involves 39 time windows (13 years, with 3 windows per year). For each six-month window, we split the interval into two sets, training on the first three months and testing on the last three months. For example, we train on data from approximately12 January 1, 1998 through March 31, 1998 and test on data from approximately April 1, 1998 through June 30, 1998. We implement this testing procedure, rather than purged cross-validation, so as not to introduce any bias that may result from training on data that occur after the test data.
In each window, we iterate over each pair of firms, (A, B), twice, once to predict the sign of the change in realized volatility for firm A, and a second time for firm B. We fit two random forest models, one that includes only features of the firm for which we are forecasting and the other that includes cross-features (i.e., features of both A and B). Then, as described in Section 2.3, we use the bootstrap to assess whether the area under the curve (AUC) increases significantly under the inclusion of cross-features. Our bootstrap procedure yields a set of p-values, one for each possible directed edge in the network. We apply a false discovery rate correction and retain the set of edges whose adjusted p-value is less than or equal to 0.05.
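The final thresholding step can be sketched as follows: given an N × N matrix of one-sided p-values, with entry (B, A) corresponding to the candidate edge from firm B to firm A, we adjust for the false discovery rate and keep the surviving edges. The adjacency-matrix representation is an illustrative choice, not the paper’s code.

```r
# Sketch of edge selection: pmat[b, a] is the p-value for the edge b -> a (diagonal NA).
build_network <- function(pmat, level = 0.05) {
  padj <- matrix(p.adjust(pmat, method = "BH"),       # Benjamini-Hochberg correction
                 nrow = nrow(pmat), dimnames = dimnames(pmat))
  !is.na(padj) & padj <= level                        # TRUE marks a retained directed edge
}

# Out- and in-degrees of the resulting adjacency matrix follow from rowSums() and colSums().
```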
6.1.2. Estimated 1998-2010 Financial Networks
Figure 3 displays the proportion of realized edges, hereafter referred to as density,13 in each of our estimated networks from 1998 to 2010. For comparison purposes, we also show the density of networks estimated using bivariate Granger causality on monthly stock returns; however, we caution the reader that the high- and low-frequency networks are computed over different time windows, hence the two time series of network density are of different lengths.
Figure 3.
Density of financial networks over the 1998 to 2010 period, where network density refers to the proportion of realized edges. Top (high-frequency networks): each year is divided into three overlapping windows of six months each, a network is estimated for each window by applying the methodology described in Section 2 to intraday trade data, and the network density is plotted. Bottom (low-frequency networks): the sample period is divided into 36-month rolling windows, a network is estimated for each window by applying bivariate Granger causality to monthly stock returns, and the network density is plotted. Note that there are 39 high-frequency networks and 156 low-frequency networks.
Our first observation is that the high-frequency network density increases steadily during 1998, reaching a peak in the last quarter of that year (i.e., when our model is applied to test data from October-December 1998). This increase in connectivity coincides with a period of mounting economic turmoil in Russia, culminating with the Russian government devaluing the ruble, defaulting on domestic debt, and declaring a moratorium on repayment of foreign debt (August 17, 1998) [Chiodo and Owyang (2002)]. As the future of the Russian economy remained unclear, U.S. stocks plunged and the Federal Reserve Bank of New York was forced to organize a bailout of the U.S.-based hedge fund Long Term Capital Management [Rubin et al. (1999)]. Notice that low-frequency (monthly returns) networks also display a connectivity increase during the fall of 1998.
Our high-frequency networks then become less dense through the end of 2000, at which point connectivity repeatedly increases and decreases (albeit with an overall upward trend) through late 2003. These results are less interpretable than those in the low-frequency setting, where the density consistently decreases from 1999 through late 2002. Both the low- and high-frequency networks have elevated density in 2003. The intraday networks become less dense in 2004, before increasing in density in 2005. On the other hand, the monthly-scale networks remain dense throughout 2003 and 2004 and are not particularly dense in 2005.
Intraday networks exhibit a fairly persistent increase in density through 2006 and 2007. In fact, a global maximum density of 36.2% is reached at the end of 2007, subsequent to the summer 2007 failure of two subprime mortgage funds associated with the investment bank Bear Stearns. Connectivity then drops sharply in the first half of 2008 before increasing again. These results are somewhat consistent with what is observed in the monthly-scale networks, where density steadily increases until the beginning of 2008, then decreases, and finally spikes following the September 2008 collapse of the investment bank Lehman Brothers. High-frequency networks, like their low-frequency counterparts, display an overall decline in density in the late 2000s.
We now turn our attention to which firms are central in and around the 2007-2009 U.S. Financial Crisis. Node centrality can be measured using a variety of metrics (e.g., degree, closeness, betweenness). We focus on degree; that is, on how many edges are incident to the node. Firms can be characterized by both their in-degree (number of incoming edges) and out-degree (number of outgoing edges). A firm having a large in-degree is one for which many other firms’ microstructure measures are useful in forecasting its realized volatility. On the other hand, a firm with a large out-degree has microstructure measures that are useful for predicting the realized volatility of many other firms. Firms with large out-degree have the potential to spread risk through the financial system since aspects of their trading (captured via microstructure measures) propagate to other firms. Likewise, firms with large in-degree have the potential to absorb this risk.
In Figure 4, we display the 10 most highly connected firms according to their in-degree and (separately) their out-degree, before, during, and after the U.S. Financial Crisis. Several observations are in order. First, Lehman Brothers (LEH) has a large out-degree in the January-June 2006 and July-December 2006 networks. During the intervening time period (April-September 2006), it has a large in-degree. That our methodology should identify Lehman Brothers as a highly connected firm in the lead-up to the crisis is interesting given that the broker-dealer’s involvement in subprime mortgage lending has been recognized as a key contributor to the crisis [Friedman and Posner (2011)]. American International Group (AIG) is also highly connected before the crisis; in fact, it is one of the top firms according to out-degree in six of the nine networks that span 2006-2008. It is a top firm by in-degree during the January-June 2007 period. Like Lehman Brothers, AIG played a major role in the crisis through its use of collateralized debt obligations (CDOs) and credit default swaps (CDSs), and was bailed out by the federal government shortly after Lehman Brothers’ collapse [Friedman and Posner (2011)].
Figure 4.
Highly connected firms in high-frequency realized volatility networks before, during, and after the U.S. Financial Crisis. Firms are ranked according to their in-degree (top) and out-degree (bottom). Standardized degrees (i.e., (firm degree - mean network degree)/standard deviation of network degree) are plotted so as to make results comparable across different networks. Banks (resp., broker-dealers, insurance companies) are displayed in red (resp., green, blue). Note that we add a small amount of random noise to the (x, y) coordinates of each firm so that firm labels do not overlap with one another. Full company names for all ticker symbols are provided in Table A2.
More generally, we note that the top firms are not always consistent across neighboring time periods. For example, a firm might be highly connected during one time window, but not during the windows immediately preceding or following it. (This is the case with T. Rowe Price (TROW), which has a large in-degree during April-September 2007, but neither a large in-degree nor a large out-degree during either of the other 2007 windows.) A major exception is AIG, as noted above. Our methodology highlights several additional firms that are known to have contributed to the crisis: Bear Stearns (BSC) is a top “in-firm” during April-September 2006 and a top “out-firm” in July-December 2007, The Federal National Mortgage Association (aka Fannie Mae; FNM) is a top out-firm during January-June 2007 and January-June 2008, and The Federal Home Loan Mortgage Corporation (aka Freddie Mac; FRE) is a top out-firm during April-September 2006.
Some of these results are consistent with those observed in monthly-scale bivariate Granger causality networks (see Figure 5). In the monthly-scale networks, as in their high-frequency counterparts, AIG, Fannie Mae, and Freddie Mac all have large out-degree before and/or during the crisis. Interestingly, AIG remains a large source of risk propagation (i.e., has high out-degree) through 2010, whereas it does not have a large out-degree in any of the high-frequency networks beyond April-September 2008. Another key difference between the low- and high-frequency settings is the role played by Lehman Brothers and Bear Stearns. In the high-frequency networks, Lehman Brothers, and – to a lesser extent – Bear Stearns, emerge as top out-firms in the lead-up to the financial crisis. In the monthly-scale networks, on the other hand, neither of these two firms have a large out-degree, although Lehman Brothers is consistently a top in-firm (absorber of risk) in 2006 and 2007. This difference raises the possibility that high-frequency networks may be able to identify risk propagating firms that are not highlighted in low-frequency networks.
Figure 5.
Highly connected firms in low-frequency bivariate Granger causality networks before, during, and after the U.S. Financial Crisis. Firms are ranked according to their in-degree (top) and out-degree (bottom). Standardized degrees (i.e., (firm degree - mean network degree)/standard deviation of network degree) are plotted so as to make results comparable across different networks. Banks (resp., broker-dealers, insurance companies) are displayed in red (resp., green, blue). Note that we add a small amount of random noise to the (x, y) coordinates of each firm so that firm labels do not overlap with one another. Full company names for all ticker symbols are provided in Table A2.
To further explore the behavior of systemically important financial institutions, we consider subnetworks of firms that received considerable government assistance during or after the crisis (see Figure 6). The size of node i is proportional to the market capitalization of firm i, while the thickness of edge (j, i) is proportional to the increase in AUC obtained by using the features of firm j to predict the change in realized volatility of firm i. We select – for each year between 2006 and 2008 – the 10 firms within our sample that received the largest amount of Troubled Asset Relief Program (TARP) funding, which was provided to companies that were deemed “too big to fail”14 [Kiel and Nguyen (2013)]. (For 2006 and 2007, we also include Lehman Brothers and Bear Stearns, which did not receive TARP funding but which were crucial firms during this period.)
Figure 6.
High-frequency realized volatility networks of financial institutions that received significant federal bailout packages (TARP funding). Each year, the 10 institutions within our sample that received the most TARP funding are selected, in addition to Lehman Brothers and Bear Stearns. Nodes are sized according to their market capitalization, with red (resp., green, blue) nodes indicating banks (resp., broker-dealers, insurance companies). Edge thickness is proportional to the increase in AUC obtained by including cross-features. Full company names for all ticker symbols are provided in Table A2.
Figure 6 highlights the role played by Lehman Brothers in the lead-up to the crisis. For example, in the April-September 2006 network, Lehman Brothers has incoming edges from all but one of the other firms and is particularly influenced by Bank of America (AUC increase = 0.305), Wells Fargo (AUC increase = 0.264), and JPMorgan Chase (AUC increase = 0.223). In early 2007, AIG emerges as a firm with a large in-degree, receiving strong incoming edges from Bank of America (AUC increase = 0.297), Bear Stearns (AUC increase = 0.262), and Goldman Sachs (AUC increase = 0.259). Bank of America also has a large in-degree. By July-December 2007 (months before its collapse in March 2008), Bear Stearns has many incoming edges, the strongest of which is from the company that would come to purchase it, JPMorgan Chase (AUC increase = 0.144). In 2008, several of the firms previously considered are no longer present in our sample, whether because of their collapse (e.g., Lehman Brothers, Bear Stearns15) or because they are no longer among the top 75 financial institutions by market capitalization (e.g., Freddie Mac). However, firms like Citigroup, Wells Fargo, AIG, JPMorgan Chase, and Fannie Mae continue to have large in- and/or out-degree.
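A subnetwork of the kind shown in Figure 6 can be drawn from a weighted edge list with standard graph tooling. The following is a rough sketch using the igraph package in R; the column names, market capitalization values, and scaling constants are illustrative assumptions, while the three AUC increases are the April-September 2006 values quoted above.

```r
library(igraph)

# Illustrative edge list: AUC increase from using firm `from` to predict firm `to`
edges <- data.frame(
  from     = c("BAC", "WFC", "JPM"),
  to       = c("LEH", "LEH", "LEH"),
  auc_gain = c(0.305, 0.264, 0.223)
)
# Illustrative node attributes: market capitalization (arbitrary units)
nodes <- data.frame(
  name = c("BAC", "WFC", "JPM", "LEH"),
  mcap = c(240, 120, 160, 40)
)

g <- graph_from_data_frame(edges, directed = TRUE, vertices = nodes)
plot(g,
     vertex.size      = 2 * sqrt(V(g)$mcap),  # node size grows with market cap
     edge.width       = 20 * E(g)$auc_gain,   # edge thickness grows with AUC increase
     edge.arrow.size  = 0.4,
     vertex.label.cex = 0.8)
```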
6.2. Cross-Asset Information Flow Between Small and Large Firms
Having constructed financial networks in Section 6.1, we now seek to address why there are edges between some firms and not others. One natural hypothesis, supported by the literature, is that edges may run from large firms to small firms. Indeed, Lo and MacKinlay (1990) provides empirical evidence of a lead-lag relationship between the weekly returns of large and small market capitalization stocks. In particular, the authors divide a sample of 551 stocks into size-based quintiles and form equal-weighted portfolios for each quintile. The correlation between the returns of the first quintile’s portfolio at week t+1 and the returns of the fifth quintile’s portfolio at week t is found to be 27.6 percent. (No evidence is discovered of the reverse – that is, of small stock returns leading large stock returns.)
Various explanations for this pattern exist. Among them is the theory that market-wide information is first absorbed into the prices of large stocks (which tend to be actively traded) and subsequently into the prices of small stocks (which tend to be less frequently traded) [Brennan et al. (1993)]. More recently, Chordia et al. (2011) performs a Granger causal analysis on value-weighted portfolios of large and small market capitalization stocks. The authors regress daily returns of these portfolios on lagged values of returns, volatilities, and quoted spreads, and find that the returns of the large stock portfolio lead the returns of the small stock portfolio, especially when the large stocks experience low liquidity.
While Chordia et al. (2011) includes a variety of financial variables (returns, volatilities, and quoted spreads) in their analysis, we are thus far unaware of any studies that examine whether market microstructure variables yield lead-lag relationships between small and large firms. Our random forest methodology, however, lends itself well to this question. Indeed, we can assess whether the microstructure features of small (resp., large) firms are useful in predicting future increases and decreases in market measures of large (resp., small) firms. Notice that our method – as previously described – is implemented on a firm-by-firm basis; that is, we use microstructure variables of firm i (and possibly of a second firm, j) to forecast market measures of firm i. This procedure is fundamentally different from the analysis in Chordia et al. (2011), which considers financial variables that have been aggregated over firms of similar size. So that we may compare our results to those in Chordia et al. (2011), we perform a similar aggregation. Since we are aggregating firm-level information into two groups (large and small firms), instead of networks of firms we display our findings with boxplots summarizing feature importances of the two groups.
To begin, we consider all banks, primary broker-dealers, and insurance companies that were active on each trading day of 2018. We rank these firms according to their average monthly market capitalization and take our final set of firms to be those in the first and seventh capitalization deciles. The first decile (54 firms) represents large stocks and the seventh decile (55 firms) represents small stocks. We use the seventh decile because stocks in lower tiers are likely to trade so infrequently as to make missing values a problem in our downstream analysis. Next, for each stock, we form time series of its microstructure variables and market measures, and then compute value-weighted averages for both small and large firms (separately). For example, the aggregate time series of the Roll measure for small and large firms is given by
$$\mathrm{Roll}^{S}_{t} \;=\; \frac{\sum_{i \in S} w_i \, \mathrm{Roll}_{i,t}}{\sum_{i \in S} w_i}, \qquad (14)$$
$$\mathrm{Roll}^{L}_{t} \;=\; \frac{\sum_{i \in L} w_i \, \mathrm{Roll}_{i,t}}{\sum_{i \in L} w_i}, \qquad (15)$$
where S and L denote the sets of small and large firms, respectively, w_i is the average monthly MCAP of firm i, and Roll_{i,t} is the value of the Roll measure of firm i at time t. We form aggregate time series for Amihud’s lambda, VPIN, kurtosis, and realized volatility in an analogous manner to (14) and (15).16 Finally, we calculate the sign of the change in average kurtosis and realized volatility, e.g.,
$$\mathrm{sign}\!\left(\mathrm{RV}^{S}_{t+h} - \mathrm{RV}^{S}_{t}\right),$$
where, as before, h is a forecast horizon of 50 time bars. There are 2,881 observations in our final dataset and we perform 10-fold cross-validation (over the entirety of 2018) to evaluate the random forest classifier’s performance.
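A minimal sketch of the aggregation in (14)-(15) and of the sign-of-change target, written in R with dplyr, is given below. The panel layout, column names (time, firm, mcap, roll, rv), and toy values are our own assumptions; only the value-weighting scheme and the 50-bar horizon follow the text.

```r
library(dplyr)
set.seed(1)

# Toy panel: four firms (two large, two small) observed over 200 time bars
firms <- data.frame(firm = c("A1", "A2", "B1", "B2"),
                    mcap = c(100, 80, 5, 4),
                    size = c("large", "large", "small", "small"))
df <- merge(expand.grid(firm = firms$firm, time = 1:200), firms, by = "firm")
df$roll <- abs(rnorm(nrow(df), 0.02, 0.005))   # Roll measure (toy values)
df$rv   <- abs(rnorm(nrow(df), 0.001, 3e-4))   # realized volatility (toy values)

# Value-weighted aggregation, as in (14) and (15), for one group of firms
aggregate_series <- function(df, group_firms) {
  df %>%
    filter(firm %in% group_firms) %>%
    group_by(time) %>%
    summarise(roll = weighted.mean(roll, w = mcap),
              rv   = weighted.mean(rv,   w = mcap),
              .groups = "drop")
}
small_series <- aggregate_series(df, firms$firm[firms$size == "small"])
large_series <- aggregate_series(df, firms$firm[firms$size == "large"])

# Sign of the change in the aggregate realized volatility over a 50-bar horizon
h <- 50
small_series <- small_series %>%
  mutate(rv_change_sign = sign(lead(rv, h) - rv))
```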
The first question we address is which features are important for predicting market measure changes in firms of different sizes. We consider four prediction scenarios: (i) kurtosis for large firms, (ii) realized volatility for large firms, (iii) kurtosis for small firms, and (iv) realized volatility for small firms. In each case, we include cross-features in our random forest model; that is, we use microstructure variables for both small and large firms. Furthermore, for each test set, we compute the mean decrease in accuracy (MDA) for each feature and average the results by firm size. Let MDA_{f,k} denote the MDA of feature f on test set k, where k = 1, …, 10. Then, for each k, we calculate
$$M^{L}_{k} \;=\; \frac{1}{|F_L|} \sum_{f \in F_L} \mathrm{MDA}_{f,k}, \qquad M^{S}_{k} \;=\; \frac{1}{|F_S|} \sum_{f \in F_S} \mathrm{MDA}_{f,k},$$
where F_L and F_S denote the sets of large-firm and small-firm features, respectively.
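As an illustration of how the per-test-set averages M^L_k and M^S_k might be obtained, the sketch below uses the permutation-based MDA reported by the randomForest package (computed on out-of-bag samples, used here as a stand-in for the test-set calculation) together with a large_/small_ feature-naming convention of our own.

```r
library(randomForest)
set.seed(1)

# Toy design matrix with a "large_"/"small_" feature-name convention (our
# assumption); the response is a binary up/down label.
n <- 500
x <- data.frame(large_roll = rnorm(n), large_vpin = rnorm(n), large_amihud = rnorm(n),
                small_roll = rnorm(n), small_vpin = rnorm(n), small_amihud = rnorm(n))
y <- factor(ifelse(x$large_roll + 0.5 * x$small_vpin + rnorm(n) > 0, "up", "down"))

rf  <- randomForest(x, y, importance = TRUE)  # default 500 trees
mda <- importance(rf, type = 1)[, 1]          # permutation-based mean decrease in accuracy

# Average MDA within each firm-size group, mirroring the boxplots in Figure 7
tapply(mda, ifelse(grepl("^large_", names(mda)), "large", "small"), mean)
```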
Figure 7 displays the distributions of M^L_k and of M^S_k for each of the four prediction scenarios.
Figure 7.
Distribution of the average mean decrease in accuracy (MDA), grouped by firm size. For example, the leftmost boxplot illustrates the distribution of M^L_k, where M^L_k denotes the average of the MDA values for the large firms’ Roll measure, Amihud’s lambda, and VPIN, evaluated over test set k. The left (resp., right) panel displays feature importance results when forecasting the sign of the change in kurtosis and realized volatility for large (resp., small) financial firms.
We observe that, under all scenarios but one, the microstructure variables of large firms are more important than those of small firms. For example, when forecasting kurtosis for large firms, the median large firm MDA is approximately 0.12 while the median small firm MDA is closer to 0.1. Qualitatively similar results hold when predicting realized volatility for large firms and kurtosis for small firms. On the other hand, this pattern is reversed when we forecast realized volatility for small firms, in which case the small firms’ features have higher MDA. Interestingly, though, the distributions of M^L_k and M^S_k have some overlap in this case, whereas in all other scenarios, the distribution of large firm MDA values lies entirely above the distribution of small firm MDA values.
To an extent, these results are consistent with those reported in Chordia et al. (2011) and Lo and MacKinlay (1990), where the returns of large stocks were found to lead those of small stocks (but not the reverse) at daily and weekly horizons, respectively. Our analysis reveals a similar lead-lag pattern in a high-frequency setting: microstructure variables of large firms have predictive power when forecasting the sign of the change in kurtosis of the small firms’ returns distribution. Moreover, when forecasting for large firms, the microstructure features of small firms are found to be less important than those of large firms. This conforms with earlier findings that small firm returns do not lead large firm returns.
We now turn our attention to the question of whether adding cross-features improves the random forest’s predictive ability. We find that the results here are mixed (see Figure 8). Including cross-features yields a significant increase in AUC when forecasting realized volatility, and – to a lesser extent – kurtosis, for large firms. There is only minor predictive improvement, however, when using cross-features to predict realized volatility for small firms, and virtually no change when predicting kurtosis for small firms.17 As a robustness check, we repeat the analyses presented in this section on a set of information and communications technology (ICT) firms (rather than financial firms, as discussed here). We find that the ICT feature importance results are qualitatively similar to those we present here, though the ROC results are not. Our complete findings are shown in Appendix A.
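The comparison underlying Figure 8 can be sketched with the pROC package (the same package used for the bootstrap AUC test mentioned in the footnotes). The data below are synthetic and the variable names are our own; the sketch simply contrasts a random forest fit without cross-features against one fit with them.

```r
library(randomForest)
library(pROC)
set.seed(1)

# Synthetic example: own features vs. own features plus cross-features
n <- 600
own   <- data.frame(roll = rnorm(n), vpin = rnorm(n), amihud = rnorm(n))
cross <- data.frame(cross_roll = rnorm(n), cross_vpin = rnorm(n))
y     <- factor(ifelse(own$roll + 0.8 * cross$cross_roll + rnorm(n) > 0, "up", "down"))

train <- 1:400; test <- 401:600
rf_base  <- randomForest(own[train, ], y[train])
rf_cross <- randomForest(cbind(own, cross)[train, ], y[train])

p_base  <- predict(rf_base,  own[test, ],               type = "prob")[, "up"]
p_cross <- predict(rf_cross, cbind(own, cross)[test, ], type = "prob")[, "up"]

roc_base  <- roc(y[test], p_base)
roc_cross <- roc(y[test], p_cross)

# One-sided bootstrap test of whether cross-features increase the AUC
roc.test(roc_cross, roc_base, method = "bootstrap", alternative = "greater")
```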
Figure 8.
ROC curves for predicting the sign of the change in realized volatility (top row) and in kurtosis (bottom row) for large firms (left column) and small firms (right column). In each case, the random forest was fit twice, once without cross-features (e.g., using only large firms’ features to predict large firms’ measures) and again with cross-features. Thus, for each prediction scenario, two ROC curves are displayed: red (resp., black) curves indicate that cross-features were (resp., were not) used. The area under the curve (AUC) is reported in the lower right corner of each plot.
7. Conclusion
We estimate financial networks by determining whether cross-effects in intraday trade data exist between each pair of firms in our sample. We detect these cross-effects by assessing whether microstructure measures of one firm improve our ability to forecast the sign of the change in a market measure, such as the realized volatility, of another firm, where predictability is captured by a statistically significant increase in the area under the ROC curve (AUC). Because we learn our networks from high-frequency trade data, where economic theory does not offer clear guidance on the nature of these predictive relationships, we use a nonparametric learner, the random forest, to forecast market measure changes. Random forests, a popular machine learning tool, provide a great deal of modeling flexibility as they do not impose a particular functional form on the data. In addition, random forests are very robust to the choice of tuning parameters and do not require expensive hyperparameter selection techniques such as cross-validation. We apply these methods to the trade data of large U.S. financial institutions, demonstrating how our networks can be used to answer the same questions posed by researchers in the low-frequency setting (e.g., how network connectivity evolves over time and which types of firms interact with one another).
High-frequency financial networks have the potential to yield novel insights into the workings of the financial system. Future work in this direction includes refining our network estimation procedures (e.g., by changing the microstructure variables used as features, or by considering different market measures for prediction). In particular, it would be interesting to decompose the realized volatility into continuous and jump components along the lines of Pelger (2020), and use these two measures as separate prediction targets. Forecasting the magnitude of the change instead of the sign of the change is also an interesting direction for future research.
Our choice of random forests in this work is largely motivated by their essentially tuning-free nature, which allows us to avoid expensive cross-validation strategies, and by their ability to produce a measure of variable importance. It would be of interest to experiment with alternative machine learning methods, such as deep neural networks, and compare their predictive performance to random forests. There are a number of parameters in our forecasting framework (the length of the lookback window, the forecast horizon, the length of the time bar, etc.), and we have yet to perform an exhaustive study of how these parameters impact our final results.
Moreover, the networks we construct are based on bivariate analyses; that is, we test for predictive improvements for firm i when we include the features of one additional firm, j. We could instead undertake a multivariate analysis wherein we include features of all firms in order to predict the change in the market measure of firm i. Such an analysis would give assurance that any cross-predictability detected between firms i and j is indeed due to the measures of firm j and not to those of a third firm that is correlated with j (i.e., indirect associations). Preliminary work in this direction has yielded mixed results; however, it is possible that by adjusting our model beyond the standard random forest, we may be able to make further progress. Finally, we demonstrated the utility of our method in the context of historical events such as the LTCM bailout and the 2007-09 financial crisis. While these events are well-suited to validating a new methodology, it would be interesting to see whether the methods can discover new patterns in more recent events, such as the COVID-19 shock to financial markets. An exploration of high-frequency trade data sets over the 2018-2022 sample period using machine learning methods may provide new insight into network dynamics when shocks are exogenous (the COVID-19 pandemic) as opposed to endogenous (as in the 2007-09 financial crisis).
Funding
KK and SB were supported in part by NIH award R01GM135926. SB also acknowledges partial support from NSF awards DMS-1812128, DMS-2210675 and NIH award R21NS120227.
Appendix A. Additional Details
In Section 6.2, we apply a random forest model to the aggregated market measures of different-sized financial firms to assess whether firm size impacts cross-predictability (i.e., whether trade information from large (resp., small) firms improves the predictability of small (resp., large) firms’ market measures). Here we repeat our analysis on a set of information and communications technology (ICT) firms, with the goal of determining whether our results vary by industry.
We determine ICT firms on the basis of their North American Industry Classification System (NAICS) code, which was obtained through the Center for Research in Security Prices (CRSP) database [CRSP Stocks, NAICS (2017)]. To begin, we select all firms having any of the 10 NAICS industry codes listed in Table A1. We sort these firms according to their average market capitalization over 2018 and retain 47 firms from the first decile (representing large technology firms) and 47 from the seventh decile (representing small technology firms).
Figure A1 displays MDA feature importance results. On average, large firms’ features are more important than small firms’ features, regardless of whether we are forecasting realized volatility or kurtosis, for large firms or for small firms. However, the difference in feature importances is larger when predicting for large firms (left panel) than for small firms (right panel), which suggests (a) that small firms carry little information about large firms, and (b) that, although large firms do contain some information about small firms, the small firms’ own features remain informative. These results are qualitatively similar to what we obtain for financial firms, except that, for the latter, small firms’ features are more important than large firms’ when predicting small firm realized volatility. In Figure A2, we show that including cross-features in the random forest model yields very little change in the AUC. An exception is when we predict kurtosis for small firms (bottom right plot), in which case we see an appreciable increase in AUC when we add large firms’ features.
Figure A1.
Distribution of the average MDA, grouped by firm size. The left (resp., right) panel displays feature importance results when forecasting the sign of the change in kurtosis and realized volatility for large (resp., small) ICT firms.
Table A1.
North American Industry Classification System (NAICS) industry codes for information and communications technology firms.
| NAICS Code | Description |
|---|---|
| 3341 | Computer and peripheral equipment manufacturing |
| 3342 | Communications equipment manufacturing |
| 3344 | Semiconductor and other electronic component manufacturing |
| 3345 | Navigational, measuring, electromedical, and control instruments manufacturing |
| 5112 | Software publishers |
| 5161 | Internet publishing and broadcasting |
| 5179 | Other telecommunications |
| 5181 | Internet service providers and Web search portals |
| 5182 | Data processing, hosting, and related services |
| 5415 | Computer systems design and related services |
Figure A2.
ROC curves for predicting the sign of the change in realized volatility (top row) and in kurtosis (bottom row) for large (left column) and small (right column) ICT firms. For each prediction scenario, two ROC curves are displayed: red (resp., black) curves indicate that cross-features were (resp., were not) used. The area under the curve (AUC) is reported in the lower right corner of each plot.
Table A2.
Firm names, sectors, and ticker symbols. BA: bank, PB: broker/dealer, INS: insurance.
| Firm Name | Sector | Ticker Symbol |
|---|---|---|
| ALLIANCEBERNSTEIN HLDG | PB | AB, AC |
| ACE LTD | INS | ACE |
| AETNA INC NEW | INS | AET |
| AFLAC INC | INS | AFL |
| AMERICAN INTERNATIONAL GROUP | INS | AIG |
| APOLLO INVESTMENT CORP | PB | AINV |
| ASSURANT INC | INS | AIZ |
| ALLSTATE CORP | INS | ALL |
| AFFILIATED MANAGERS GROUP INC | PB | AMG |
| AMERIPRISE FINANCIAL INC | PB | AMP |
| AMERITRADE HOLDING CORP NEW | PB | AMTD |
| AON CORP | INS | AOC, AON |
| AMERICAN EXPRESS CO | BA | AXP |
| BANK OF AMERICA CORP | BA | BAC |
| B B & T CORP | BA | BBT |
| FRANKLIN RESOURCES INC | PB | BEN |
| BANK NEW YORK INC | BA | BK |
| BLACKROCK INC | PB | BLK |
| BANK MONTREAL QUE | BA | BMO |
| BANK OF NOVA SCOTIA | BA | BNS |
| C B O T HOLDINGS INC | PB | BOT |
| BEAR STEARNS COMPANIES INC | PB | BSC |
| BLACKSTONE GROUP L P | PB | BX |
| CITIGROUP | BA | C |
| CHUBB CORP | INS | CB |
| COUNTRYWIDE FINANCIAL CORP | BA | CFC |
| CIGNA CORP | INS | CI |
| CIT GROUP | PB | CIT |
| CANADIAN IMPERIAL BANK COMMERCE | BA | CM |
| CHICAGO MERCANTILE EXCH HLDG INC | PB | CME |
| C N A FINANCIAL CORP | INS | CNA |
| CAPITAL ONE FINANCIAL CORP | BA | COF |
| COVENTRY HEALTH CARE INC | INS | CVH |
| DEUTSCHE BANK A G | BA | DB |
| DISCOVER FINANCIAL SERVICES | BA | DFS |
| E TRADE FINANCIAL CORP | PB | ET, ETFC |
| EATON VANCE CORP | PB | EV |
| FEDERATED INVESTORS INC PA | PB | FII |
| FEDERAL NATIONAL MORTGAGE ASSN | BA | FNM |
| FEDERAL HOME LOAN MORTGAGE CORP | BA | FRE |
| GREENHILL & CO INC | PB | GHL |
| GENWORTH FINANCIAL INC | INS | GNW |
| HARTFORD FINANCIAL SVCS GRP INC | PB | HIG |
| BLOCK H & R INC | BA | HRB |
| HUMANA INC | INS | HUM |
| INTERACTIVE DATA CORP | PB | IDC |
| INVESCO LTD | PB | IVZ |
| JEFFERIES GROUP INC NEW | PB | JEF |
| NUVEEN INVESTMENTS INC | PB | JNC |
| JANUS CAP GROUP INC | PB | JNS |
| JPMORGAN CHASE & CO | BA | JPM |
| LAZARD LTD | PB | LAZ |
| LEHMAN BROTHERS HOLDINGS INC | PB | LEH |
| LEGG MASON INC | PB | LM |
| LINCOLN NATIONAL CORP IN | INS | LNC |
| MERRILL LYNCH & CO INC | PB | MER |
| METLIFE INC | INS | MET |
| MANULIFE FINANCIAL CORP | INS | MFC |
| MARSH & MCLENNAN COS INC | INS | MMC |
| MORNINGSTAR INC | PB | MORN |
| MORGAN STANLEY DEAN WITTER & CO | PB | MS |
| M & T BANK CORP | BA | MTB |
| NATIONAL CITY CORP | BA | NCC |
| NASDAQ STOCK MARKET INC | PB | NDAQ |
| NYMEX HOLDINGS INC | PB | NMX |
| NORTHERN TRUST CORP | BA | NTRS |
| NYSE GROUP INC | PB | NYX |
| PEOPLES UNITED FINANCIAL INC | BA | PBCT |
| PRINCIPAL FINANCIAL GROUP INC | INS | PFG |
| PROGRESSIVE CORP OH | INS | PGR |
| PNC FINANCIAL SERVICES GRP INC | BA | PNC |
| PARTNERRE LTD | INS | PRE |
| PRUDENTIAL FINANCIAL INC | INS | PRU |
| EVEREST RE GROUP LTD | INS | RE |
| REGIONS FINANCIAL CORP | BA | RF |
| RAYMOND JAMES FINANCIAL INC | PB | RJF |
| ROYAL BANK CANADA MONTREAL QUE | BA | RY |
| SCHWAB CHARLES CORP NEW | PB | SCHW |
| S E I INVESTMENTS COMPANY | PB | SEIC |
| SUN LIFE FINANCIAL INC | INS | SLF |
| SLM CORP | BA | SLM |
| ST PAUL TRAVELERS COS INC | INS | STA |
| SUNTRUST BANKS INC | BA | STI |
| STATE STREET CORP | BA | STT |
| TORONTO DOMINION BANK ONT | BA | TD |
| T ROWE PRICE GROUP INC | PB | TROW |
| TRAVELERS GROUP INC | INS | TRV |
| UBS AG | BA | UBS |
| UNITEDHEALTH GROUP INC | INS | UNH |
| UNUMPROVIDENT CORP | INS | UNM |
| U S BANCORP DEL | BA | USB |
| VISA INC | BA | V |
| WACHOVIA CORP 2ND NEW | BA | WB |
| WELLS FARGO & CO NEW | BA | WFC |
| WASHINGTON MUTUAL INC | BA | WM |
| WILLIS GROUP HOLDINGS PUB LTD CO | INS | WSH |
| X L CAPITAL LTD | INS | XL |
Appendix B. Alternatives to random forests for cross-asset learning
While random forests (RF) provide one nonparametric approach to measuring predictability across assets, there are a number of alternative machine learning (ML) algorithms that could potentially be used for this task. Our decision to use RF is motivated by two desirable features of this algorithm.
First, RF is known to be very robust to the choice of its tuning parameters, especially in low-dimensional forecasting problems such as ours (n > 2000, p < 10 in the analyses of Section 6.2). The default choice of tuning parameters (500 trees and m equal to the square root of the number of features, rounded down) provides near-optimal prediction in many empirical applications (see, e.g., Huang and Boutros (2016) and references therein). As a result, we do not need to perform expensive hyperparameter optimization procedures such as cross-validation. Since we are more interested in detecting statistically significant predictability across assets than in achieving a very large predictive gain, we chose RF for reasons of robustness and computational speed. Most alternative ML algorithms exhibit high sensitivity to tuning parameters and require careful, computationally expensive tuning strategies.
Second, RF also offers a natural measure of feature importance, which has been shown to capture nonlinear feature interactions in many empirical studies. For complex ML algorithms such as deep neural networks, measuring such feature importance is still an area of active research.
In order to compare RF against alternative ML algorithms, we first re-constructed the ROC curves of Figure 8 using RF for different values of the tuning parameter m (the number of candidate features considered at each split). Then we repeated this exercise with a linear ML method (elastic net penalized logistic regression) and a nonlinear ML method (gradient boosting). For both ML methods, we used different penalty or shrinkage parameter choices to assess the robustness of our results.
The results for RF are summarized in Table B1. We used a large number of trees (1,000) to mitigate the effect of this tuning parameter. Table B1 shows that, for each of the eight prediction exercises, the AUC changes very little when we vary m over the range {1, 2, 3, 4}. Moreover, the qualitative finding of improvement (or no improvement) from adding cross-effects remains the same across all choices of m.
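In outline, the sensitivity check reported in Table B1 can be carried out as below: fix 1,000 trees, vary m from 1 to 4, and record the test-set AUC. The data here are synthetic placeholders; only the grid of m values and the tree count follow the text.

```r
library(randomForest)
library(pROC)
set.seed(1)

# Synthetic stand-in for one of the eight prediction exercises
n <- 600
x <- data.frame(matrix(rnorm(n * 6), n, 6))
y <- factor(ifelse(x[[1]] + 0.5 * x[[2]] + rnorm(n) > 0, "up", "down"))
train <- 1:400; test <- 401:600

auc_by_m <- sapply(1:4, function(m) {
  rf <- randomForest(x[train, ], y[train], ntree = 1000, mtry = m)
  p  <- predict(rf, x[test, ], type = "prob")[, "up"]
  as.numeric(auc(y[test], p))
})
names(auc_by_m) <- paste0("m = ", 1:4)
auc_by_m
```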
In contrast, we found that the ROC shapes and AUC values can sometimes be sensitive to tuning parameter choices. For elastic net penalized logistic regression, implemented using the R package glmnet, we report the results of two tuning parameter selection strategies adopted in the literature: the one-standard-error rule (Figure B1) and cross-validation (Figure B2). While the results are robust for the other three prediction tasks, the AUC for predicting kurtosis of large firms without cross-effects (bottom left ROC curve) changed from 0.52 to 0.56 across the two selection strategies.
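The two selection rules for the elastic net penalty can be compared along the following lines with cv.glmnet; the data are again synthetic and the alpha value of 0.5 is our own illustrative choice.

```r
library(glmnet)
library(pROC)
set.seed(1)

n <- 600; p <- 6
x <- matrix(rnorm(n * p), n, p)
y <- factor(ifelse(x[, 1] + 0.5 * x[, 2] + rnorm(n) > 0, "up", "down"))
train <- 1:400; test <- 401:600

cvfit <- cv.glmnet(x[train, ], y[train], family = "binomial", alpha = 0.5)

# Predicted probabilities under the two lambda selection rules
p_min <- predict(cvfit, x[test, ], s = "lambda.min", type = "response")[, 1]
p_1se <- predict(cvfit, x[test, ], s = "lambda.1se", type = "response")[, 1]

c(auc_cv  = as.numeric(auc(y[test], p_min)),
  auc_1se = as.numeric(auc(y[test], p_1se)))
```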
In a similar analysis of gradient boosting, implemented using the R package gbm, we noticed similar sensitivity to the tuning parameter choice. Here we illustrate the ROC curves for two choices of the key shrinkage parameter: the default of 0.1 (Figure B3) and a smaller shrinkage of 0.001 (Figure B4). Note that the ROC shape and the AUC change substantially across the two scenarios. We have conducted extensive experiments across other choices of this tuning parameter and obtained qualitatively similar results.
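The shrinkage sensitivity of gradient boosting can be probed in the same spirit with the gbm package; the 0/1 outcome, Bernoulli loss, and 1,000 trees below are illustrative choices, while the two shrinkage values match those used for Figures B3 and B4.

```r
library(gbm)
library(pROC)
set.seed(1)

# Synthetic data with a 0/1 outcome, as required by the Bernoulli loss
n <- 600
dat <- data.frame(matrix(rnorm(n * 6), n, 6))
dat$y <- as.integer(dat[[1]] + 0.5 * dat[[2]] + rnorm(n) > 0)
train <- dat[1:400, ]; test <- dat[401:600, ]

auc_for_shrinkage <- function(shrink) {
  fit <- gbm(y ~ ., data = train, distribution = "bernoulli",
             n.trees = 1000, shrinkage = shrink)
  p <- predict(fit, test, n.trees = 1000, type = "response")
  as.numeric(auc(test$y, p))
}

c(shrinkage_0.1 = auc_for_shrinkage(0.1), shrinkage_0.001 = auc_for_shrinkage(0.001))
```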
Based on these sensitivity analyses of the ROC curves, we conclude that both penalized logistic regression and gradient boosting are reasonable alternatives to RF, but that a thorough and computationally expensive hyperparameter selection strategy is needed to obtain robust results on cross-asset predictability. We leave these investigations for future research.
Table B1.
AUC of random forests for different choices of the tuning parameter m across the four classification tasks. No CE: without cross-effect features; CE: with cross-effect features.
| Target | Group | Model | m = 1 | m = 2 | m = 3 | m = 4 |
|---|---|---|---|---|---|---|
| Realized volatility | Large firms | No CE | 0.55 | 0.55 | 0.55 | 0.55 |
| Realized volatility | Large firms | CE | 0.66 | 0.66 | 0.66 | 0.65 |
| Realized volatility | Small firms | No CE | 0.58 | 0.57 | 0.55 | 0.55 |
| Realized volatility | Small firms | CE | 0.60 | 0.59 | 0.58 | 0.58 |
| Kurtosis | Large firms | No CE | 0.48 | 0.47 | 0.47 | 0.47 |
| Kurtosis | Large firms | CE | 0.52 | 0.52 | 0.53 | 0.53 |
| Kurtosis | Small firms | No CE | 0.46 | 0.47 | 0.47 | 0.47 |
| Kurtosis | Small firms | CE | 0.45 | 0.46 | 0.47 | 0.47 |
Figure B1.
ROC curves for predicting market measures of large and small firms using elastic net penalized logistic regression. The penalty parameter is chosen by the one-standard-error rule in the R package glmnet.
Figure B2.
ROC curves for predicting market measures of large and small firms using elastic net penalized logistic regression. The penalty parameter is chosen by cross-validation implemented in the R package glmnet.
Figure B3.
ROC curves for predicting market measures of large and small firms using gradient boosting. The shrinkage parameter is set to 0.1 (the default in the R package gbm).
Figure B4.
ROC curves for predicting market measures of large and small firms using gradient boosting. The shrinkage parameter is set to 0.001.
Footnotes
See, for example, O’Hara’s (2015) survey of high-frequency market microstructure.
One alternate sampling method is to collect trades until their cumulative dollar-volume reaches a certain level [Easley et al. (2021)]. So-called dollar-volume bars have appealing theoretical and practical properties; however, they are not synchronized across stocks and thus present challenges for modeling cross-effects. For example, an actively traded stock, i, fills its dollar-volume bars faster than a less actively traded stock, j. Thus i’s first dollar-volume bar may run from 9:30 AM to 9:35 AM, while j does not fill its bar until 10:00 AM. Therefore, we cannot use dollar-volume bars if we hope to use j’s features to make predictions about i: in effect, we would be using future information about j to predict current properties of i.
The market is open from 9:30 AM EST to 4:00 PM EST, which corresponds to 13 30-minute intervals. However, we remove the first time bar of the day (see Section 5 for details), resulting in only 12 time bars per trading day.
Note that we test only whether the AUC obtained with cross-features is greater than the AUC obtained without them, since the reverse does not indicate the presence of cross-effects. In theory, the AUC should either (a) increase if we include microstructure measures from a firm having predictive power, or (b) stay the same if we include microstructure measures from a firm that does not have predictive power (i.e., the random forest should be able to select, as split features, the microstructure measures that improve predictive performance, so that the AUC remains the same with the addition of an “unhelpful” firm, but does not decrease).
This bootstrap procedure is implemented using the roc.test() function in the pROC package within R [Robin et al. (2011)]. We set .
Here we use a decision threshold of 0.5 to convert probabilities to predicted values.
The kurtosis of the normal distribution is 3, so excess kurtosis (kurtosis minus 3) measures the “tailedness” of a given distribution relative to the normal distribution. The terms “kurtosis” and “excess kurtosis” are often used interchangeably; thus, we simply refer to “kurtosis.”
This corresponds to selecting records for which the TAQ symbol suffix is blank.
This corresponds to selecting records for which the TAQ trade correction indicator is “00.”
Our choice of firms is similar to that made in Karpman et al. (2022), which considers companies in the same three financial sectors, but selects sets of firms over 36-month rolling windows, rather than on an annual basis.
As discussed in Section 2.1, we aggregate trades into 30-minute time bars in order to synchronize trade data across firms. For the earlier years in the sample period, some firms have 30-minute windows in which few trades occurred. Out of an abundance of caution, we choose to exclude these firms from our analysis. Specifically we discard any firm for which 25% or more of its time bars contain fewer than 5 trades.
These dates are only approximate since we purge data from around the test set (see Section 2.3).
For example, if half of all possible edges occur, then the density is 0.5.
A firm is considered “too big to fail” if its collapse would result in significant damage to the economy.
Recall that firms are only included in our sample if they are actively traded during the entire year. Bear Stearns was acquired by JPMorgan Chase in March 2008, while Lehman Brothers filed for bankruptcy in September 2008.
Note that we do not include two of the microstructure variables described in Section 3, namely Roll impact and Kyle’s lambda. We exclude these because they were found to have relatively low predictive ability for 2018 financial firms.
We note that, regardless of firm size, our model has more success in forecasting realized volatility than it does in forecasting kurtosis.
References
- Andersen TG, Teräsvirta T. 2009. Realized volatility. In: Handbook of financial time series. Springer; p. 555–575.
- Basu S, Das S, Michailidis G, Purnanandam A. 2019. A system-wide approach to measure connectivity in the financial sector. Available at SSRN 2816137.
- Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological). 57(1):289–300.
- Biau G, Scornet E. 2016. A random forest guided tour. Test. 25(2):197–227.
- Billio M, Getmansky M, Lo AW, Pelizzon L. 2012. Econometric measures of connectedness and systemic risk in the finance and insurance sectors. Journal of Financial Economics. 104(3):535–559.
- Breiman L. 2001. Random forests. Machine Learning. 45(1):5–32.
- Brennan MJ, Jegadeesh N, Swaminathan B. 1993. Investment analysis and the adjustment of stock prices to common information. The Review of Financial Studies. 6(4):799–824.
- Brownlees C, Nualart E, Sun Y. 2018. Realized networks. Journal of Applied Econometrics. 33(7):986–1006.
- Chiodo AJ, Owyang MT. 2002. A case study of a currency crisis: the Russian default of 1998. Federal Reserve Bank of St. Louis Review. 84(6):7.
- Chordia T, Sarkar A, Subrahmanyam A. 2011. Liquidity dynamics and cross-autocorrelations. Journal of Financial and Quantitative Analysis. 46(3):709–736.
- CRSP Stocks. Center for Research in Security Prices, Graduate School of Business, University of Chicago. Retrieved from Wharton Research Data Services. Accessed 2019.
- Diebold FX, Yılmaz K. 2014. On the network topology of variance decompositions: measuring the connectedness of financial firms. Journal of Econometrics. 182(1):119–134.
- Dutta C, Karpman K, Basu S, Ravishanker N. 2022. Review of statistical approaches for modeling high-frequency trading data. Sankhya B:1–48.
- Easley D, de Prado ML, O’Hara M. 2016. Discerning information from trade data. Journal of Financial Economics. 120:269–285.
- Easley D, de Prado ML, O’Hara M, Zhang Z. 2021. Microstructure in the machine age. Review of Financial Studies. 34(7):3316–3363.
- Fawcett T. 2006. An introduction to ROC analysis. Pattern Recognition Letters. 27(8):861–874.
- Friedman J, Hastie T, Tibshirani R. 2001. The elements of statistical learning. Vol. 1. New York: Springer Series in Statistics.
- Friedman J, Posner R. 2011. What caused the financial crisis. University of Pennsylvania Press.
- Gu S, Kelly B, Xiu D. 2020. Empirical asset pricing via machine learning. The Review of Financial Studies. 33(5):2223–2273.
- Härdle WK, Chen S, Liang C, Schienle M. 2018. Time-varying limit order book networks. IRTG 1792 Discussion Paper.
- Härdle WK, Wang W, Yu L. 2016. TENET: tail-event driven network risk. Journal of Econometrics. 192(2):499–513.
- Hautsch N. 2012. Econometrics of financial high-frequency data. Springer.
- Hautsch N, Schaumburg J, Schienle M. 2015. Financial network systemic risk contributions. Review of Finance. 19(2):685–738.
- Huang BF, Boutros PC. 2016. The parameter sensitivity of random forests. BMC Bioinformatics. 17(1):1–13.
- Karpman K, Lahiry S, Mukherjee D, Basu S. 2022. Exploring financial networks using quantile regression and Granger causality. arXiv. Available from: https://arxiv.org/abs/2207.10705.
- Kiel P, Nguyen D. 2013. Bailout tracker: tracking every dollar and every recipient. ProPublica: Journalism in the Public Interest.
- Liaw A, Wiener M. 2002. Classification and regression by randomForest. R News. 2(3):18–22. Available from: https://CRAN.R-project.org/doc/Rnews/.
- Lo AW, MacKinlay AC. 1990. When are contrarian profits due to stock market overreaction? The Review of Financial Studies. 3(2):175–205.
- NAICS. 2017. North American Industry Classification System Manual. Executive Office of the President, Office of Management and Budget.
- NYSE Trade and Quote Database. Retrieved from Wharton Research Data Services. Accessed 2019.
- Pelger M. 2020. Understanding systematic risk: a high-frequency approach. The Journal of Finance. 75(4):2179–2220.
- Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Müller M. 2011. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 12:77.
- Rubin RE, Greenspan A, Levitt A, Born B. 1999. Hedge funds, leverage, and the lessons of Long-Term Capital Management. Report of the President’s Working Group on Financial Markets.
- Wang GJ, Xie C, He K, Stanley HE. 2017. Extreme risk spillover network: application to financial institutions. Quantitative Finance. 17(9):1417–1433.
- Wang GJ, Yi S, Xie C, Stanley HE. 2021. Multilayer information spillover networks: measuring interconnectedness of financial institutions. Quantitative Finance. 21(7):1163–1185.