The Econometrics Journal. 2020 Sep 29; 24(1): 177–197. doi: 10.1093/ectj/utaa030

Model averaging estimation for high-dimensional covariance matrices with a network structure

Rong Zhu, Xinyu Zhang, Yanyuan Ma, Guohua Zou

Summary

In this paper, we develop a model averaging method to estimate a high-dimensional covariance matrix, where the candidate models are constructed by different orders of polynomial functions. We propose a Mallows-type model averaging criterion and select the weights by minimizing this criterion, which is an unbiased estimator of the expected in-sample squared error plus a constant. Then, we prove the asymptotic optimality of the resulting model average covariance estimators. Finally, we conduct numerical simulations and a case study on Chinese airport network structure data to demonstrate the usefulness of the proposed approaches.

Keywords: asymptotic optimality, consistency, covariance regression network model, Mallows criterion, model averaging

1. INTRODUCTION

The covariance matrix is a familiar concept and has been widely used in many fields. For example, Markowitz (1952) illustrated geometrically the relationship between beliefs and the choice of portfolio according to their covariance matrix. Campbell et al. (1998) and Jagannathan and Ma (2003) considered finance and risk management based on covariance matrices. Bilmes (2000) improved the parsimony of speech recognition systems by adjusting the type of covariance matrices. Chen and Conley (2001) presented a semiparametric spatial model for panel time series that exploits the positive-definiteness of a covariance function. Friedman et al. (2008) considered the problem of estimating sparse graphs by applying a lasso penalty to the inverse covariance matrix. Motivated by the arbitrage pricing theory in finance, Fan et al. (2008) used a multi-factor model to reduce dimensionality and estimate the covariance matrix.

Let Σ be the p × p covariance matrix of a p-dimensional random vector Y. The most commonly used estimator of Σ is the classic sample covariance matrix. However, the sample covariance matrix does not perform well when the dimension p is large and the sample size n is fixed or grows at a slower rate than p, because the number of unknown parameters is then much larger than the sample size. Inevitably, additional assumptions need to be made to ensure that the estimation of Σ is feasible. Here, we consider the novel strategy proposed by Lan et al. (2018), which uses a covariance regression network model (CRNM) to estimate the high-dimensional covariance matrix. In this framework, the covariance matrix is regarded as a polynomial function of the symmetric adjacency matrix representing a given network structure. In this way, the estimation of a high-dimensional covariance matrix is converted into the estimation of the low-dimensional coefficients of the CRNM. These authors further developed a Bayesian information criterion (BIC) to select the order of the polynomial function and proved the consistency of the BIC. Their treatment can thus be classed as a model selection approach.

Model averaging can be viewed as a smooth extension of model selection, and it can substantially reduce the risk relative to model selection (Hansen, 2014). Furthermore, model averaging is often more stable than model selection, where a small change in the data may lead to a significant change in the selected model; see Yuan and Yang (2005) and Leung and Barron (2006) for further discussion. There has been much research into model averaging for regression models. For example, Buckland et al. (1997) proposed the smoothed AIC and smoothed BIC methods, in which weights are assigned based on the information criterion scores obtained from different models. Hansen (2007) and Wan et al. (2010) developed Mallows model averaging for linear regression models. Other methods include jackknife model averaging (Hansen and Racine, 2012; Zhang et al., 2013), heteroscedasticity-robust Cp model averaging (Liu and Okui, 2013), leave-subject-out cross-validation (Gao et al., 2016), and Mahalanobis Mallows model averaging (Zhu et al., 2018).

In this paper, in order to improve the estimation of a high-dimensional covariance matrix with a network structure, we propose a model averaging method based on a normalized version of the CRNM (Lan et al., 2018). We select the averaging weights by minimizing a Mallows-type criterion, which is an unbiased estimator of the expected in-sample squared error plus a constant. We then establish the asymptotic optimality of the resulting model average covariance (MAC) estimator when none of the candidate models is correct. We also show that the MAC estimator is consistent for the covariance matrix Σ if at least one of the candidate models is in fact correct. Following Lan et al. (2018), for the sake of convenience, we first consider the case where the sample size n equals one and the dimension p diverges, and then discuss the extension when n also increases.

The remainder of this paper is organized as follows. In Section 2, we describe the estimation method for the normalized CRNM (nCRNM), introduce the Mallows-type weight choice criterion, and propose the MAC estimator based on this criterion. The asymptotic optimality of the model averaging estimator is established in Section 3, while the consistency of the method is shown in Section 4. We extend the theorems on asymptotic optimality and consistency to the case where the sample size n is larger than one in Section 5. We compare the finite-sample properties of the MAC estimator with several information criterion-based model selection and averaging estimators in Section 6. A real data example is considered in Section 7. Section 8 contains some concluding remarks. Technical proofs are given in the Online Appendix.

2. MODEL SET-UP AND ESTIMATION

2.1. Covariance regression network model

Consider a graph with p nodes, where the ith node is used to represent the ith observation, for example the ith individual. Let A = (a_ij) denote the p × p symmetric adjacency matrix of the graph, where a_ij = 1 if the two nodes i and j are directly related, for example if the ith and jth individuals know each other, and a_ij = 0 otherwise. We set the diagonal elements a_ii = 0. Let A^k be the kth power of A, with the zeroth power defined as the identity, i.e., A^0 = I_p. The (i, j)th component of A^k then counts the number of ways to connect the ith node and the jth node through a path with exactly k steps. In other words, A^k measures the k-path relationships of the p nodes.
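To make the path-counting interpretation concrete, the short base-R snippet below builds a small line graph and inspects powers of its adjacency matrix; the four-node graph is an illustrative example, not data from the paper.

```r
# Powers of a symmetric adjacency matrix count k-step paths (toy 4-node line graph).
A <- matrix(0, 4, 4)
A[1, 2] <- A[2, 1] <- 1
A[2, 3] <- A[3, 2] <- 1
A[3, 4] <- A[4, 3] <- 1
A2 <- A %*% A    # (i, j) entry counts 2-step paths between nodes i and j
A3 <- A2 %*% A   # (i, j) entry counts 3-step paths
A2[1, 3]         # 1: the single 2-step path 1 -> 2 -> 3
A3[1, 4]         # 1: the single 3-step path 1 -> 2 -> 3 -> 4
```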

Let Y_i be a random variable that describes a certain property of the ith node, such as the ith person's social activities in a period of time. Let Y = (Y_1, …, Y_p)', and let the covariance matrix of Y be Σ = cov(Y). Naturally, Σ can be linked to the adjacency matrix A. Lan et al. (2018) introduced the CRNM Σ = θ_0 I_p + θ_1 A + ⋯ + θ_K A^K, where K is a positive integer and θ = (θ_0, θ_1, …, θ_K)' is the vector of regression coefficients. To ensure that the parameters in the CRNM are comparable between models, we normalize the elements of A^k by dividing them by a normalizing constant, which yields the normalized matrix Ã_k (with Ã_0 = I_p). The (i, j)th component of Ã_k is then the 'normalized' number of ways to connect the ith node and the jth node through a path with k steps. Rather than using the CRNM of Lan et al. (2018), we assume the following normalized CRNM (nCRNM):

Σ = θ_0 Ã_0 + θ_1 Ã_1 + ⋯ + θ_K Ã_K,  (2.1)

where K is a positive integer that can increase to infinity with p, and θ = (θ_0, θ_1, …, θ_K)' is the vector of regression coefficients.

To ensure that Σ in (2.1) is positive-definite, constraints are imposed on θ. Because the adjacency matrix A is assumed known, an obvious benefit of the nCRNM (2.1) is that it reduces the number of unknown parameters in Σ from p(p + 1)/2 to K + 1. Note that (2.1) is not necessarily the smallest model, since it is possible that θ_k = 0 for some k. In particular, we allow the 'wasteful' case where θ_k = 0 for all k exceeding a certain value smaller than K.

In estimating the covariance matrix Σ, it is common to assume that the components Y_1, …, Y_p have the same mean, which can be consistently estimated by the sample average p^{-1}(Y_1 + ⋯ + Y_p). One can further centre the data by subtracting the sample mean and work directly with the centred observations. Hence, without loss of generality, we assume that E(Y) = 0 in the rest of the paper. For convenience, we also perform a standard eigenvalue decomposition of the normalized adjacency matrix Ã_1 to obtain

Ã_1 = U Λ U',  (2.2)

where U is an orthogonal matrix, Λ = diag(λ_1, …, λ_p) is a diagonal matrix, and λ_i is the ith-largest eigenvalue of Ã_1. These preparations allow us to convert the model in (2.1) to a simple linear regression model, which facilitates straightforward estimation. Specifically, let Z = U'Y, where U is defined in (2.2). Then Z has mean 0 and covariance matrix U'ΣU. Extracting the ith diagonal element of these relations, we obtain E(Z_i^2) = θ_0 x_{i0} + θ_1 x_{i1} + ⋯ + θ_K x_{iK}, where x_{ik} denotes the ith diagonal element of U'Ã_kU; this is a multiple linear regression model if we view Z_i^2 as the response variable, x_{ik} as the kth regressor, and θ_k as the regression coefficients. To fix the notation, we define Z^2 = (Z_1^2, …, Z_p^2)' and the p × (K + 1) design matrix X = (x_{ik}). Then, the nCRNM (2.1) is equivalently written as

Z^2 = Xθ + e,  (2.3)

where e = Z^2 − E(Z^2) is the error term. Write μ = Xθ = E(Z^2), and for any p-vector v let D(v) denote the diagonal matrix whose ith diagonal entry is the ith element of v. With this notation,

Σ = U D(Xθ) U'.  (2.4)

Thus, for the nCRNM (2.1) with a pre-determined K, one can use the ordinary least squares method, when X is of full column rank, to obtain the estimator θ̂ = (X'X)^{-1}X'Z^2 of θ in (2.3), and then use (2.4) to obtain the estimator Σ̂ = U D(Xθ̂) U'.

2.2. Candidate models and estimation

Each fixed polynomial order K represents a different nCRNM (2.1). In practice, which K value is suitable is often unclear. We thus consider many possible values of K, which corresponds to a series of possible models. We index these models by m = 1, …, M, where the total number of models M can be related to the total number of nodes p. To further increase flexibility, we allow the mth candidate nCRNM to contain k_m arbitrary monomials of Ã_1, which do not have to be the first k_m terms Ã_0, …, Ã_{k_m − 1} but must include the intercept term Ã_0 = I_p. Similar to (2.3), the resulting mth nCRNM is equivalently written as

Z^2 = X_m θ_m + e_m,  (2.5)

where X_m is a p × k_m submatrix of X, θ_m is the corresponding coefficient vector, and e_m is the error term. Here we can assume that X has sufficiently many columns so that every candidate design matrix X_m is a submatrix of X, and that X is of full column rank. Although we consider M models, we do not assume that any of these models is correct. Thus, we allow some or all of the M models to be misspecified.

It is easy to see that the ordinary least squares estimator of θ_m is θ̂_m = (X_m'X_m)^{-1}X_m'Z^2, and the resulting estimator of E(Z^2) is X_mθ̂_m = P_mZ^2, where P_m = X_m(X_m'X_m)^{-1}X_m' is the projection matrix. By (2.4), the estimator of Σ based on the mth candidate model is

Σ̂_m = U D(P_mZ^2) U'.

We thus have M estimators of Σ based on the M candidate models.
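A minimal sketch of how the candidate fitted vectors can be collected is given below; each candidate model is described by the columns of the full design matrix it uses, with the intercept column always included. The function name and inputs are illustrative.

```r
# Sketch: fitted values P_m %*% z2 for each candidate model, where 'models' is a
# list of column-index vectors into the full design matrix X (hypothetical input).
candidate_fits <- function(z2, X, models) {
  sapply(models, function(cols) {
    Xm <- X[, cols, drop = FALSE]
    drop(Xm %*% solve(crossprod(Xm), crossprod(Xm, z2)))
  })
}
# Example: with three adjacency-power regressors plus the intercept, the eight
# candidates of the empirical study in Section 7 would correspond to
# models <- list(1, c(1, 2), c(1, 3), c(1, 4), c(1, 2, 3), c(1, 2, 4), c(1, 3, 4), 1:4)
```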

2.3. Model averaging and weight choice criterion

Our purpose now is to combine the M estimators of Σ obtained from the M candidate models to achieve an optimal estimator of Σ through a weighted average of the Σ̂_m's. The optimality is reflected in the fact that the resulting estimator minimizes the distance to the true Σ, while it simultaneously ensures that the estimated covariance matrix is positive-definite. Note that the positive-definiteness property is not guaranteed by the estimators described in Subsections 2.1 and 2.2. Of course, to obtain a positive-definite matrix by forming a weighted average of several candidate matrices, we have to assume that at least one of these candidate matrices is positive-definite. This is reasonable and is certainly true when the set of candidates contains the trivial model, which contains only the intercept term Ã_0 = I_p.

Let w = (w_1, …, w_M)' be a weight vector. A model average estimator of Σ is of the form

Σ̂(w) = w_1 Σ̂_1 + ⋯ + w_M Σ̂_M,

where Σ̂_m is given above. Similarly, define μ̂(w) = P(w)Z^2 with P(w) = w_1 P_1 + ⋯ + w_M P_M. We then have

Σ̂(w) = U D(μ̂(w)) U'.

We restrict the weight vector to the set W, where

W = {w ∈ [0, 1]^M : w_1 + ⋯ + w_M = 1 and Σ̂(w) is positive-definite}.  (2.6)

Obviously, W is not empty under our assumption that at least one candidate estimator is positive-definite. We measure the distance between two matrices by the Frobenius norm of their difference, and hence the Frobenius norm loss of Σ̂(w) is naturally defined as

L(w) = ||Σ̂(w) − Σ||_F^2,

where ||·||_F denotes the Frobenius norm. Our purpose is to devise a weight choice criterion to minimize the expected loss R(w) = E{L(w)}.

We first note that

||Σ̂(w) − Σ||_F^2 = ||μ̂(w) − μ||^2,  (2.7)

where ||·|| denotes the L_2 norm. This means that the Frobenius norm loss of Σ̂(w) is the same as the squared error loss of μ̂(w). Then the corresponding risk function, defined as the expected loss function, can be calculated as

R(w) = E||μ̂(w) − μ||^2 = ||{P(w) − I_p}μ||^2 + tr{P(w)ΩP(w)'},  (2.8)

where Ω denotes the covariance matrix of the error e = Z^2 − μ.

Of course, neither L(w) nor R(w) can be directly minimized because they depend on μ, which is unknown. We thus work around the difficulty by first replacing μ with Z^2, and then adjusting for the offset. This leads us to construct an estimator of the risk function,

C(w) = ||Z^2 − P(w)Z^2||^2 + 2 tr{P(w)Ω}.  (2.9)

It can be readily shown that

E{C(w)} = R(w) + tr(Ω).  (2.10)

This implies that C(w) is an unbiased estimator of the risk function R(w) plus a constant that does not depend on the weight w. Therefore, by minimizing C(w), we expect that L(w) and R(w) are also minimized. The property in (2.10) is similar to that of the Mallows criterion proposed by Hansen (2007). We thus proceed to use this Mallows-type model averaging criterion. Alternatively, the jackknife model averaging criterion (Hansen and Racine, 2012; Zhang et al., 2013) can also be used for weight choice; Liu and Okui (2013) have shown that the Mallows-type and jackknife methods have a similar performance.

In practice, the covariance matrix Ω in C(w) is unknown and needs to be estimated. Following Hansen (2007), Lan et al. (2018) and Zhang and Wang (2019), we estimate Ω based on a candidate model containing the largest number of covariates, indexed by

M* = argmax_{1 ≤ m ≤ M} k_m.  (2.11)

When several candidates have the same maximum number of covariates, we simply pick any one of them as the M*th model. This leads to the estimator

Ω̂ = diag(ê_1^2, …, ê_p^2),  (2.12)

where ê = (ê_1, …, ê_p)' = Z^2 − P_{M*}Z^2, and ê_i is the ith component of ê. Here, we have restricted Ω̂ to be diagonal, while Ω itself may not be diagonal. We find that despite this seemingly crude practice, the resulting model averaging estimator is always optimal in that it minimizes the expected Frobenius loss, regardless of whether Ω is diagonal or not.

When Ω is replaced by Ω̂, C(w) changes to

Ĉ(w) = ||Z^2 − P(w)Z^2||^2 + 2 tr{P(w)Ω̂},  (2.13)

in which all quantities are known except w. Minimizing Ĉ(w) with respect to w leads to

ŵ = argmin_{w ∈ W} Ĉ(w).  (2.14)

Substituting ŵ into Σ̂(w) yields the model average estimator Σ̂(ŵ), which we name the model average covariance (MAC) estimator. We point out that the minimization of (2.13) with respect to w is a constrained quadratic programming problem, and hence the computation of the optimal weight vector is straightforward. For example, quadratic programming can be performed using the quadprog package in R, the quadprog command in MATLAB, or the qp command in SAS. Next, we present the asymptotic optimality of the MAC estimator Σ̂(ŵ).
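As an illustration, the sketch below writes the weight selection step as a quadratic program solved with the quadprog package mentioned above. Here 'fits' collects the candidate fitted vectors (for example from the candidate_fits sketch above) and 'penalty' stands in for the Mallows-type penalty vector built from Ω̂; the penalty form in the trailing comment and the omission of the positive-definiteness restriction in (2.6) are simplifying assumptions, so this is a sketch rather than the authors' code.

```r
# Hedged sketch: choose model averaging weights over the simplex by quadratic
# programming, minimizing ||z2 - fits %*% w||^2 + penalty' w (Mallows-type criterion).
library(quadprog)

mac_weights <- function(z2, fits, penalty) {
  M <- ncol(fits)
  Dmat <- 2 * crossprod(fits) + diag(1e-10, M)   # tiny ridge in case crossprod(fits) is singular
  dvec <- 2 * drop(crossprod(fits, z2)) - penalty
  Amat <- cbind(rep(1, M), diag(M))              # constraints: sum(w) = 1 (equality), w >= 0
  bvec <- c(1, rep(0, M))
  solve.QP(Dmat, dvec, Amat, bvec, meq = 1)$solution
}

# A Mallows-type penalty in the spirit of (2.13) (an illustrative choice):
# penalty[m] = 2 * sum(omega_hat * h_m), where omega_hat holds the squared residuals
# from the largest candidate model and h_m is the diagonal of P_m (the leverages).
```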

3. ASYMPTOTIC OPTIMALITY

As we have pointed out, in order to obtain a final covariance matrix estimator based on several candidate estimators, at least one of these candidate estimators needs to be valid, i.e., positive-definite. We state this formally as Condition (C.1). We then proceed to establish the optimality property of our procedure under scenarios of both independent and dependent error components.

Condition (C.1)

There exists anInline graphicsuch thatInline graphicis positive-definite.

We first define some notation. Let Inline graphic, which is a larger space than Inline graphic. Write Inline graphic. Let Inline graphic and Inline graphic denote respectively the maximum and minimum eigenvalues of a matrix. Let Inline graphic be an Inline graphic vector where the Inline graphicth element is one and all the other elements are zero. We use Inline graphic to denote the Inline graphicth diagonal element of Inline graphic and let Inline graphic. All limiting processes correspond to Inline graphic unless stated otherwise.

3.1. Asymptotic optimality with independence

We first consider the situation where the elements of Inline graphic are independent of each other. This arises when, for example, Inline graphic is normally distributed. In this case, Inline graphic, Inline graphic. This implies that the elements of Inline graphic are independent of each other; as a consequence, the elements of Inline graphic are also independent of each other. We assume the following regularity conditions.

Condition (C.2)

There exist a fixed integer G and a constantInline graphicso thatInline graphicforInline graphic.

Condition (C.3)

Inline graphicfor the same constant G as given in Condition(C.2).

Condition (C.4)

There exists a constant c such thatInline graphic, where, recall, Inline graphicis the number of columns ofInline graphicin the largest model.

Condition (C.5)

Inline graphic, a.s.

Condition (C.6)

Inline graphic, whereInline graphicis a constant.

Remark 3.1

Condition (C.2) places a moment restriction on the error term. Condition (C.3) imposes some relation between the best model risk and the total risks from all the models, in that the best model should not be too much better than all other models. Specifically, letInline graphicandInline graphic, then, givenInline graphic, a sufficient condition of Condition (C.3) isInline graphicwithInline graphic, which means that the risk of the best model can increase more slowly than that of the worst model, but not too much more slowly. In addition, a consequence of Condition (C.3) is thatInline graphic, which implies that there is no correctly specified candidate model with a finite dimension. This can be seen from an argument of contradiction. If theInline graphicth candidate model is correctly specified with a finiteInline graphic, then

Remark 3.1

which is finite ifInline graphicis bounded and hence is a contradiction. Similar conditions are used in Wan et al. (2010), Liu and Okui (2013) and Ando and Li (2014). Condition (C.4) is commonly used to ensure the asymptotic optimality of cross-validation; see, for example, Andrews (1991) and Hansen and Racine (2012). Condition (C.5) is about the sum of the squares of the elements ofInline graphicand is commonly used in the context of linear regression; see, for example, Wan et al. (2010) and Liang et al. (2011). Condition (C.6) means thatInline graphichas order the same as or smaller thanInline graphic. A similar requirementInline graphic, whereInline graphicis the sample size, has been used in Wan et al. (2010) and Liu et al. (2016).

Theorem 3.1

Assume that Conditions(C.1)–(C.6) hold. Then asInline graphic,

Theorem 3.1 (3.1)

Theorem 3.1 shows that the MAC estimator is asymptotically optimal in that it leads to a squared error loss that is asymptotically identical to that of the infeasible best possible model average estimator. The proof of Theorem 3.1 is given in Online Appendix A.1.

3.2. Asymptotic optimality without independence

We now consider the situation where the elements of Inline graphic are dependent. Recall that Inline graphic, and thus we allow Inline graphic to be nondiagonal. Let Inline graphic, and Inline graphic, where Inline graphic is the Moore–Penrose generalized inverse matrix of Inline graphic. We can still establish the asymptotic optimality described in Theorem 3.1 with the following additional conditions.

Condition (C.7)

Inline graphic

Condition (C.8)

Inline graphic.

Condition (C.9)

Inline graphic, whereInline graphicand c are positive constants.

Remark 3.2

Condition (C.7) is similar to Condition (C.3) but is weaker. It is implied by Condition (C.3) withInline graphic. Condition (C.8) is similar to condition (22) in Zhang et al. (2013). When all candidate models are nested within the largest candidate modelInline graphicdefined in (2.11), Inline graphic, so Condition (C.8) is implied byInline graphic, which is again a restriction onInline graphicand is similar to Condition (C.6). Condition (C.9) is a commonly used boundedness requirement on the minimum and maximum eigenvalues ofInline graphic.

Theorem 3.2

Assume that Conditions(C.1), (C.4), (C.5), (C.7)–(C.9) hold. Then asInline graphic, the asymptotic optimality in (3.1) still holds.

Theorem 3.2 shows that Theorem 3.1 remains valid when Ω is a general matrix instead of a diagonal matrix. The proof of Theorem 3.2 is given in Online Appendix A.2 in the online Supporting Information. The reason why the optimality property is retained in Theorem 3.2 lies in Condition (C.8): this condition ensures that, compared with the first term of the criterion Ĉ(w), the second term has a negligible effect on the final weight choice, so that the estimation of Ω also has a negligible effect.

4. CONSISTENCY WITH CORRECTLY SPECIFIED CANDIDATE MODELS

As discussed in Remark 3.1, the optimality shown above essentially excludes the situation where at least one of the candidate models is correctly specified. Here, we show that when there is at least one correctly specified candidate model, the model averaging method described above can achieve the same convergence rate as these correctly specified candidate models. Let Inline graphic be a Inline graphic selection matrix so that Inline graphic. Assume that the Inline graphicth model is a correctly specified model, i.e.,

4. (4.1)

Thus,

4. (4.2)

Because the Inline graphicth model has the largest number of covariates, for convenience we assume that the Inline graphicth model is nested inside the Inline graphicth model.

Under the Inline graphicth candidate model, Inline graphic, and thus the estimator of Inline graphic is Inline graphic. Then, the model averaging estimator of Inline graphic is

4. (4.3)

To establish the consistency of Inline graphic, we assume the following regularity condition. Let Inline graphic.

Condition (C.10)

Inline graphic, whereInline graphicandInline graphicare constants.

Remark 4.1

Condition (C.10) requires the additional components in bigger models to contribute sufficiently different structures. Condition (C.10) is the same as the condition (A1) of Zou and Zhang (2009). In their paper, Inline graphicandInline graphicare constants, and we sometimes use the sameInline graphicto denote different constants.

Lemma 4.1

If Conditions(C.9) and(C.10) are satisfied, then

Lemma 4.1 (4.4)

Remark 4.2

Lemma 4.1 shows theInline graphic-consistency of the correctly specified model coefficientInline graphic, which is a very common convergence result when the dimension of coefficients diverges; see, for example, He and Shao (2000) and Fan and Peng (2004). The proof of Lemma 4.1 is given in Online Appendix A.3.

Theorem 4.1

Assume that Conditions(C.4), (C.9) and(C.10) hold. ThenInline graphic

Theorem 4.1 shows the Inline graphic-consistency of the optimal model averaging coefficient Inline graphic. The results in Theorem 4.1 may appear counter-intuitive at first glance, since they indicate that the large size of the unknown matrix Inline graphic is beneficial to us. This is a direct consequence of the assumption that the adjacency matrix Inline graphic is known, and hence a larger Inline graphic under such a setting does indeed represent more information. On the other hand, Inline graphic is the number of parameters, and hence Inline graphic indeed resembles the usual diverging parametric convergence rate. The proof of Theorem 4.1 is given in Online Appendix A.4 .

5. EXTENSIONS WITH Inline graphic

In this section, we extend the asymptotic optimality and consistency properties in Sections 3 and 4 to the case of Inline graphic. Suppose we have a sample Inline graphic, which consists of independent and identically distributed copies of Inline graphic. Denote Inline graphic, Inline graphic, Inline graphic and Inline graphic for Inline graphic. Let Inline graphic, Inline graphic be the Inline graphicth component of Inline graphic, Inline graphic and Inline graphic. Then, we know Inline graphic and Inline graphic. By model (2.3) and the Inline graphicth candidate model (2.5), we have that

5. (5.1)

Then, the ordinary least squares estimator of Inline graphic is Inline graphic, and the estimator of Inline graphic is Inline graphic. For simplicity, we do not explicitly redefine Inline graphic, Inline graphic, Inline graphic, Inline graphic and Inline graphic, other than pointing out that they are the same as defined in Section 2 except with the Inline graphic in their corresponding expressions replaced by Inline graphic.

The loss function is Inline graphic, and the corresponding risk function is

5. (5.2)

The estimator of the risk function is

5.

which is an unbiased estimator of the risk Inline graphic up to a constant, i.e.,

5.

In practice, we estimate Inline graphic by

5. (5.3)

where Inline graphic. Then, Inline graphic is changed to

5. (5.4)

and the optimal weight vector is

5. (5.5)

Next, we consider the asymptotic optimality of the model averaging estimator Inline graphic as Inline graphic and Inline graphic when no correct model is contained in the candidate model set.

Corollary 5.1

WhenInline graphicis diagonal, if Conditions(C.1)–(C.6) hold, then asInline graphicandInline graphic, the asymptotic optimality in (3.1) still holds.

Corollary 5.2

WhenInline graphicis nondiagonal, if Conditions(C.1), (C.4), (C.5), (C.7)–(C.9) hold, then asInline graphicandInline graphic, the asymptotic optimality in (3.1) still holds.

Corollaries 5.1 and 5.2 are the generalized versions of Theorems 3.1 and 3.2, where the sample size Inline graphic is larger than 1. The proofs of Corollaries 5.1 and 5.2 are given in Online Appendices A.5 and A.6 . From these proofs, it is straightforward to see that when Inline graphic, these corollaries still hold.

Similarly, we also establish the consistency of the method when at least one candidate model is correct. Since Inline graphic can go to infinity, the convergence order in Lemma 4.1 will depend on Inline graphic. Specifically, we have the following result.

Lemma 5.1

If Conditions(C.9) and (C.10) are satisfied, then

Lemma 5.1 (5.6)

Remark 5.1

Lemma 5.1 is a generalization of Lemma 4.1 when the sample sizeInline graphicis allowed to diverge. The proof of Lemma 5.1 is given in Online Appendix A.7.

Corollary 5.3

Assume that Conditions(C.4), (C.9) and(C.10) hold. ThenInline graphic.

From Corollary 5.3, if Inline graphic, then Inline graphic under those conditions. Otherwise, Inline graphic. Thus, as far as the convergence rate is concerned, there is no need to increase Inline graphic to be much larger than Inline graphic because no more gain can be obtained. The proof of Corollary 5.3 is given in Online Appendix A.8 .

6. SIMULATION STUDY

This section is devoted to a comparison of the finite-sample performance of the MAC estimator with the AIC- and BIC-based model selection and averaging estimators. The AIC and BIC scores for the Inline graphicth candidate model are Inline graphic and Inline graphic, respectively, where Inline graphic. (When Inline graphic, we use Inline graphic instead of Inline graphic and set Inline graphic.) Buckland et al. (1997) suggested smoothed AIC (SAIC) and smoothed BIC (SBIC) model averaging methods, where the weight for the Inline graphicth model is simply set as Inline graphic, and similarly for Inline graphic. Owing to their ease of use, the SAIC and SBIC weight choice methods have been used extensively in the literature (Wan and Zhang, 2009; Millar et al., 2014).
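As a concrete illustration, the snippet below converts a vector of AIC (or BIC) scores into SAIC (or SBIC) weights; subtracting the minimum score only improves numerical stability and leaves the weights unchanged.

```r
# Smoothed information-criterion weights of Buckland et al. (1997):
# w_m is proportional to exp(-IC_m / 2) for the m-th candidate model's score IC_m.
smooth_ic_weights <- function(ic) {
  w <- exp(-(ic - min(ic)) / 2)   # shift by min(ic) for numerical stability
  w / sum(w)
}
# Applied to the AIC column of Table 8 (Section 7), almost all weight falls on the
# model with the smallest AIC, matching the reported SAIC weights.
```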

In the following, we consider two kinds of experimental design. In the first design, all candidate models are misspecified, while in the second one, some candidate models are correctly specified.

6.1. Experimental designs

Design 1 (All candidate models are misspecified). We set the true covariance matrix Inline graphic as Inline graphic, where Inline graphic and the individual elements of Inline graphic, Inline graphic, are calculated by Inline graphic, which are independently generated from a binary distribution with probability Inline graphic with Inline graphic or 10 for any Inline graphic, while Inline graphic. We set Inline graphic and Inline graphic, where the function Inline graphic returns the largest integer not greater than Inline graphic.

The Inline graphic and Inline graphic values together control the dimension and the sparsity of the network structure. We let Inline graphic be the maximum order for Inline graphic. Here we have chosen very small coefficients to ensure the positive-definiteness of Inline graphic. In addition, the response vector is Inline graphic, where each component of Inline graphic is independently and identically simulated from either a standard normal distribution (norm), a standardized exponential distribution (exp), or a mixture of two normal distributions (mix). For the 'mix' case, the first distribution has mean zero and variance 5/9, and the second one has mean zero and variance 5, with the mixture coefficient itself generated from a binomial distribution with probability 0.9, i.e., Inline graphic, where Inline graphicBinomial(0.9). We repeat the simulation 500 times under each simulation setting.
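A generator consistent with the description of the 'mix' errors is sketched below; the two components have variances 5/9 and 5 and are mixed with probabilities 0.9 and 0.1, so the overall variance is 0.9(5/9) + 0.1(5) = 1.

```r
# Sketch of the standardized normal-mixture error distribution described above.
rmix <- function(n) {
  pick <- rbinom(n, size = 1, prob = 0.9)                     # mixture indicator
  pick * rnorm(n, sd = sqrt(5 / 9)) + (1 - pick) * rnorm(n, sd = sqrt(5))
}
```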

In all candidate models, Inline graphic is included and Inline graphic may or may not be included. Thus, we consider a total of Inline graphic candidate models. Note that Inline graphic when Inline graphic, and Inline graphic when Inline graphic, and hence all candidate models are misspecified.

Design 2 (Some candidate models are correctly specified). We set the covariance matrix Inline graphic as Inline graphic. We consider three settings of Inline graphic, namely Inline graphic, Inline graphic, and Inline graphic. As in Design 1, we always include Inline graphic, while allowing all other components Inline graphic to be zero. Thus, we consider a total of Inline graphic candidate models. All other aspects of Design 2 are the same as in Design 1.

We evaluate the performance of various estimators based on the following mean Frobenius loss (MFL):

6.1. (6.1)

where Inline graphic is the estimator of Inline graphic obtained by a given method in the Inline graphicth trial, and Inline graphic is the number of replications. In Design 2, where some candidate models are correctly specified, we also calculate the following mean squared error (MSE) in order to illustrate the consistency of the MAC method shown in Corollary 5.3:

6.1. (6.2)

where Inline graphic and Inline graphic are the chosen weight vector and the estimator of Inline graphic obtained in the Inline graphicth trial.
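A sketch of the loss computation is given below. Whether (6.1) averages the Frobenius norm itself or its square is not recoverable from the extracted text, so the unsquared version used here is an assumption.

```r
# Frobenius distance between an estimate and the true covariance matrix, and its
# average over a list of replicated estimates (illustrative implementation of (6.1)).
frob_loss <- function(Sigma_hat, Sigma) sqrt(sum((Sigma_hat - Sigma)^2))

mean_frob_loss <- function(Sigma_hat_list, Sigma) {
  mean(vapply(Sigma_hat_list, frob_loss, numeric(1), Sigma = Sigma))
}
```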

6.2. Results

The MFLs for Design 1 are presented in Tables 1–3 under the normal, exponential and mixture distributions, respectively. The MFLs for Design 2 are presented in Tables 4–6, again under the normal, exponential and mixture distributions, respectively. To facilitate comparisons, we flag the best, second best and worst estimators in each case in bold, bold italic and italic, respectively. The MSEs for Design 2 are shown in Table 7.

Table 1.

The MFL under a normal distribution for Design 1 (×100).

Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
5 200 1 5.780 6.467 6.992 7.383 8.923
Inline graphic 3.571 3.991 4.072 4.639 4.736
Inline graphic 1.463 1.638 1.775 1.908 2.144
400 1 3.086 3.456 3.637 3.966 4.271
Inline graphic 1.243 1.518 1.745 1.723 2.081
Inline graphic 0.551 0.631 0.798 0.640 0.916
10 200 1 7.720 8.305 8.638 9.181 9.959
Inline graphic 4.819 5.143 5.344 5.460 5.913
Inline graphic 1.733 1.790 1.805 1.962 1.953
400 1 3.878 4.372 4.608 4.660 5.265
Inline graphic 1.331 1.429 1.417 1.572 1.566
Inline graphic 0.637 0.726 0.821 0.787 0.946

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

Table 3.

The MFL under a mixture distribution for Design 1 (×100).

Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
5 200 1 9.801 10.476 10.954 11.563 12.515
Inline graphic 5.952 7.443 7.460 8.038 8.036
Inline graphic 2.230 2.481 2.601 2.792 3.040
400 1 5.233 5.671 5.803 6.255 6.430
Inline graphic 1.861 2.328 2.548 2.500 2.891
Inline graphic 0.769 0.864 1.023 0.885 1.143
10 200 1 10.421 10.994 11.285 11.908 12.716
Inline graphic 5.737 5.915 6.124 6.135 6.754
Inline graphic 2.556 2.698 2.741 2.916 2.922
400 1 5.543 5.921 6.274 6.148 7.025
Inline graphic 1.869 1.977 1.946 2.150 2.085
Inline graphic 0.825 0.919 1.010 0.988 1.143

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

Table 4.

The MFL under a normal distribution for Design 2 (×100).

Inline graphic Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
Inline graphic 5 200 1 21.577 24.045 24.885 26.990 29.942
Inline graphic 12.983 14.752 14.289 16.624 16.140
Inline graphic 5.287 5.470 5.536 6.071 6.313
400 1 11.103 12.604 12.678 14.173 14.393
Inline graphic 4.422 5.161 5.212 5.670 5.857
Inline graphic 1.719 1.824 1.796 1.983 1.958
10 200 1 24.202 25.178 25.367 28.237 28.729
Inline graphic 15.434 16.109 16.480 17.396 18.228
Inline graphic 5.180 5.289 5.417 5.617 6.059
400 1 13.505 13.953 14.699 15.180 16.458
Inline graphic 4.382 4.447 4.481 4.739 4.905
Inline graphic 1.878 1.903 1.905 1.969 2.037
Inline graphic 5 200 1 17.819 19.836 20.807 22.336 24.616
Inline graphic 10.138 11.543 12.006 12.506 13.587
Inline graphic 3.987 4.110 4.131 4.321 4.381
400 1 8.332 9.742 10.687 10.623 12.495
Inline graphic 3.521 4.295 4.157 4.545 4.407
Inline graphic 1.238 1.359 1.287 1.477 1.392
10 200 1 21.224 21.724 21.570 23.124 23.196
Inline graphic 13.374 13.885 14.068 14.667 15.207
Inline graphic 4.486 4.647 4.868 4.814 5.248
400 1 11.434 11.830 12.125 12.550 13.279
Inline graphic 3.746 3.838 4.057 4.009 4.407
Inline graphic 1.626 1.671 1.710 1.717 1.793
Inline graphic 5 200 1 15.920 18.239 18.245 20.264 20.769
Inline graphic 9.256 10.970 10.732 11.693 11.854
Inline graphic 3.820 4.003 3.922 4.358 4.287
400 1 7.563 9.210 9.155 10.037 10.318
Inline graphic 3.295 4.121 3.923 4.458 4.335
Inline graphic 1.262 1.363 1.398 1.490 1.586
10 200 1 20.512 21.196 20.628 22.084 21.468
Inline graphic 12.942 13.595 13.606 14.150 14.329
Inline graphic 4.314 4.560 4.682 4.775 4.983
400 1 10.950 11.458 11.306 12.039 11.805
Inline graphic 3.454 3.652 3.748 3.810 3.982
Inline graphic 1.535 1.630 1.675 1.687 1.777

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

Table 6.

The MFL under a mixture distribution for Design 2 (×100).

Inline graphic Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
Inline graphic 5 200 1 31.494 34.076 34.723 37.717 39.548
Inline graphic 26.376 33.483 32.943 35.667 34.934
Inline graphic 7.194 7.459 7.611 8.072 8.545
400 1 15.549 17.122 17.399 18.847 19.249
Inline graphic 7.677 9.692 9.613 10.261 10.414
Inline graphic 2.260 2.391 2.382 2.538 2.538
10 200 1 28.938 29.761 29.665 32.988 32.564
Inline graphic 15.583 16.243 16.769 17.558 18.745
Inline graphic 6.903 7.142 7.372 7.525 7.956
400 1 15.148 15.589 16.181 16.905 17.920
Inline graphic 5.323 5.427 5.542 5.735 6.076
Inline graphic 2.210 2.265 2.260 2.378 2.406
Inline graphic 5 200 1 22.950 25.278 26.137 27.904 30.076
Inline graphic 21.712 29.250 29.787 30.200 31.647
Inline graphic 5.266 5.571 5.599 5.758 5.916
400 1 10.800 12.232 13.236 13.167 15.212
Inline graphic 6.479 8.413 8.338 8.642 8.639
Inline graphic 1.557 1.681 1.607 1.803 1.713
10 200 1 24.239 24.780 24.694 26.198 26.343
Inline graphic 12.803 13.285 13.529 14.025 14.709
Inline graphic 5.839 6.101 6.318 6.307 6.737
400 1 12.337 12.662 12.935 13.370 14.056
Inline graphic 4.367 4.474 4.641 4.616 4.944
Inline graphic 1.864 1.914 1.952 1.970 2.045
Inline graphic 5 200 1 21.398 24.199 24.175 26.096 27.105
Inline graphic 20.849 28.529 28.268 29.559 29.689
Inline graphic 4.998 5.417 5.327 5.758 5.717
400 1 9.911 11.531 11.626 12.332 12.932
Inline graphic 6.340 8.301 8.151 8.602 8.592
Inline graphic 1.578 1.672 1.702 1.805 1.879
10 200 1 23.860 24.410 23.873 25.182 24.778
Inline graphic 12.298 12.975 12.941 13.641 13.765
Inline graphic 5.570 5.935 6.026 6.129 6.233
400 1 11.712 12.230 11.979 12.738 12.441
Inline graphic 4.071 4.337 4.412 4.542 4.673
Inline graphic 1.771 1.877 1.912 1.955 2.007

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

Table 7.

The MSE of the MAC estimator Inline graphic under Design 2.

Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic Norm Exp Mix Norm Exp Mix
Inline graphic 200 1 38.156 42.237 42.153 72.417 71.881 79.302
Inline graphic 19.555 23.064 30.482 51.202 38.125 41.443
Inline graphic 11.337 9.225 13.505 16.759 18.577 20.464
400 1 27.714 26.913 25.331 73.968 69.991 61.756
Inline graphic 7.020 11.045 14.087 23.588 25.602 24.684
Inline graphic 4.684 5.221 5.467 10.584 11.597 11.795
Inline graphic 200 1 40.017 38.634 40.506 60.281 59.040 64.643
Inline graphic 19.370 22.159 28.908 37.361 28.449 32.889
Inline graphic 10.276 8.056 12.130 12.684 13.218 15.312
400 1 24.519 24.133 24.501 58.768 59.701 54.037
Inline graphic 6.793 10.555 15.713 20.278 21.214 20.150
Inline graphic 4.348 4.918 4.737 8.412 9.281 9.228
Inline graphic 200 1 41.005 38.222 41.347 55.315 58.739 64.556
Inline graphic 20.543 22.352 28.944 37.371 26.602 31.185
Inline graphic 10.724 8.065 12.109 11.796 12.293 14.133
400 1 26.025 23.938 24.792 58.897 59.446 52.941
Inline graphic 6.714 10.575 15.647 19.404 20.189 18.872
Inline graphic 4.396 4.902 4.765 8.020 8.883 8.598

Table 2.

The MFL under an exponential distribution for Design 1 (×100).

Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
5 200 1 9.828 10.631 11.169 11.581 13.106
Inline graphic 5.036 5.245 5.316 5.736 5.817
Inline graphic 2.512 2.784 2.937 3.068 3.374
400 1 5.120 5.542 5.726 5.997 6.258
Inline graphic 1.997 2.212 2.381 2.437 2.752
Inline graphic 0.833 0.919 1.068 0.941 1.189
10 200 1 11.093 11.650 11.997 12.494 13.450
Inline graphic 5.464 5.576 5.813 5.789 6.366
Inline graphic 2.507 2.593 2.619 2.808 2.814
400 1 5.321 5.866 6.205 6.142 6.909
Inline graphic 2.028 2.148 2.135 2.338 2.295
Inline graphic 0.856 0.948 1.043 1.015 1.150

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

Table 5.

The MFL under an exponential distribution for Design 2 (×100).

Inline graphic Inline graphic Inline graphic Inline graphic MAC SAIC SBIC AIC BIC
Inline graphic 5 200 1 31.665 34.181 35.037 37.668 39.586
Inline graphic 15.732 16.454 16.369 18.079 18.267
Inline graphic 7.938 8.351 8.498 8.989 9.468
400 1 15.544 17.232 17.525 18.749 19.361
Inline graphic 6.021 6.392 6.580 6.872 7.198
Inline graphic 2.397 2.506 2.453 2.653 2.609
10 200 1 29.631 30.776 30.796 33.824 33.871
Inline graphic 14.371 14.837 15.341 16.025 17.127
Inline graphic 6.630 6.789 6.967 7.218 7.547
400 1 15.451 15.858 16.652 17.158 18.424
Inline graphic 5.627 5.718 5.850 6.017 6.307
Inline graphic 2.242 2.266 2.290 2.327 2.427
Inline graphic 5 200 1 22.577 24.649 25.606 27.176 29.480
Inline graphic 11.560 12.278 13.001 13.189 14.650
Inline graphic 5.390 5.704 5.729 5.994 6.129
400 1 10.597 12.065 13.314 12.916 15.328
Inline graphic 4.035 4.387 4.381 4.600 4.570
Inline graphic 1.631 1.751 1.698 1.866 1.804
10 200 1 24.457 25.007 24.816 26.269 26.431
Inline graphic 11.745 12.177 12.482 12.967 13.621
Inline graphic 5.404 5.608 5.821 5.838 6.230
400 1 12.708 13.020 13.339 13.780 14.430
Inline graphic 4.613 4.734 4.922 4.882 5.271
Inline graphic 1.873 1.920 1.967 1.975 2.038
Inline graphic 5 200 1 20.949 23.550 23.422 25.406 26.054
Inline graphic 10.631 11.572 11.508 12.419 12.514
Inline graphic 5.136 5.518 5.399 5.958 5.812
400 1 9.642 11.283 11.661 12.048 13.004
Inline graphic 3.916 4.299 4.249 4.602 4.578
Inline graphic 1.661 1.754 1.811 1.880 1.997
10 200 1 23.945 24.523 23.774 25.440 24.640
Inline graphic 11.259 11.799 11.802 12.341 12.515
Inline graphic 5.168 5.475 5.571 5.738 5.920
400 1 12.174 12.623 12.469 13.153 12.994
Inline graphic 4.365 4.602 4.697 4.777 4.898
Inline graphic 1.797 1.882 1.912 1.968 2.006

Notes: The smallest, second smallest and largest MFLs in each row are highlighted in bold, bold italic and italic, respectively.

First, the different distributions of Inline graphic have little quantitative effect on the performance of each method. Second, in most cases, as evaluated by the MFL, MAC is the best, and SAIC/SBIC outperforms AIC/BIC, which is as expected. Tables A.1–A.6 in the online Supporting Information report the standard errors of the MFLs, from which we can see that our method is also the most stable in all cases. Finally, we can see from Table 7 that as Inline graphic or Inline graphic increases, the MSE of Inline graphic decreases, which reflects the consistency of Inline graphic.

7. EMPIRICAL APPLICATION

To illustrate the usefulness of the proposed method, we now apply the MAC method to analyse data on passenger traffic volume at airports. The dataset consists of the yearly passenger traffic volumes at 227 airports in Mainland China in 2017,1 obtained from the Civil Aviation Administration of China. The response variable Y contains the centred values of the logarithm of the passenger traffic volumes at these airports. It is intuitive that the number of nonstop flights at an airport affects its passenger traffic volume, so we use the matrix of nonstop flights as the adjacency matrix A. Specifically, the (i, j) entry of A is nonzero if there is at least one nonstop flight between airports i and j, and 0 otherwise, with the nonzero entries scaled using n_i, the number of airports that have a nonstop flight with airport i. We set the largest order of the adjacency powers to be 3, and then the number of candidate models to be considered is 2^3 = 8.
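A schematic construction of such an adjacency matrix from a table of airport pairs with nonstop flights is sketched below; the input data frame 'routes' and the symmetric degree-based scaling are hypothetical stand-ins for the actual data processing.

```r
# Sketch: symmetric flight-network adjacency matrix from a two-column data frame
# 'routes' of airport codes (hypothetical input), with an illustrative degree scaling.
build_adjacency <- function(routes, airports) {
  p <- length(airports)
  A <- matrix(0, p, p, dimnames = list(airports, airports))
  for (r in seq_len(nrow(routes))) {
    i <- as.character(routes[r, 1])
    j <- as.character(routes[r, 2])
    A[i, j] <- A[j, i] <- 1
  }
  diag(A) <- 0
  deg <- pmax(rowSums(A), 1)
  A / sqrt(outer(deg, deg))   # symmetric scaling by node degrees (illustrative choice)
}
```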

We consider the five methods studied in Section 6. The values of AIC and BIC and the weights of SAIC, SBIC and MAC are reported in Table 8. We can see that the AIC and BIC methods select the same model, (0, 1, 2, 3), which means that Ã_0, Ã_1, Ã_2 and Ã_3 are all included, and the SAIC and SBIC methods also put almost all of their weight on this model. However, the MAC method puts its weight on the models (0, 2), (0, 3) and (0, 1, 3).

Table 8.

Model selection criterion values and model averaging weights in the analysis of Chinese airports in 2017.

Model selection criterion values Weights
Models AIC BIC SAIC SBIC MAC
(0) 1361.723 1365.148 0.000 0.000 0.000
(0, 1) 1357.781 1364.630 0.000 0.000 0.000
(0,2) 1195.696 1202.546 0.002 0.043 0.593
(0,3) 1266.882 1273.731 0.000 0.000 0.132
(0, 1, 2) 1198.360 1208.635 0.000 0.002 0.000
(0, 1, 3) 1242.146 1252.421 0.000 0.000 0.275
(0,2,3) 1190.737 1201.012 0.019 0.093 0.000
(0, 1, 2, 3) 1182.865 1196.565 0.979 0.861 0.000

Notes: Model (0, 2, 3) means the model including Ã_0, Ã_2 and Ã_3. The minimum AIC and BIC values are highlighted in bold.

To compare the different methods, we use the estimated covariance matrix from each method as the true covariance matrix, i.e., we let Σ_0 equal the corresponding estimate. Then we create 500 replications of Y randomly generated with covariance matrix Σ_0. Subsequently, we estimate Σ_0 using the five methods above and calculate the mean and median of the MFLs across the 500 replications. Specifically,
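The evaluation exercise can be mimicked as follows; the Gaussian draws are an assumption made for illustration, since the extracted text does not state the distribution used to generate the replications.

```r
# Sketch: treat a fitted covariance matrix Sigma0 as the truth and simulate R new
# response vectors with that covariance (Gaussian draws assumed for illustration).
simulate_responses <- function(Sigma0, R = 500) {
  p <- nrow(Sigma0)
  L <- chol(Sigma0)                       # upper-triangular factor, t(L) %*% L = Sigma0
  replicate(R, drop(t(L) %*% rnorm(p)), simplify = FALSE)
}
```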

7. (7.1)

and

7. (7.2)

where Inline graphic is the estimator of Inline graphic obtained by a given method in the Inline graphicth trial. The results are shown in Table 9, which also reports the standard deviation for each method and the optimal rate of each method, defined as the proportion of times that the method yields the smallest MFL across the 500 replication trials.

Table 9.

MFL in the analysis of Chinese airports in 2017.

Method MAC SAIC SBIC AIC BIC
MAC Mean 84.669 149.813 148.348 150.454 149.039
Median 63.676 75.379 70.190 77.443 73.255
Standard deviation 94.413 304.467 304.698 304.360 304.712
Optimal rate 0.526 0.064 0.140 0.172 0.098
SAIC Mean 137.931 354.814 353.701 355.525 354.004
Median 94.863 102.324 98.094 105.817 97.684
Standard deviation 235.541 1403.513 1403.687 1403.396 1403.677
Optimal rate 0.528 0.070 0.118 0.202 0.082
SBIC Mean 129.912 305.138 303.586 305.603 304.121
Median 94.681 103.598 99.498 103.875 101.281
Standard deviation 271.782 1512.634 1512.853 1512.588 1512.798
Optimal rate 0.538 0.064 0.120 0.208 0.070
AIC/BIC Mean 115.859 226.284 224.767 226.895 225.073
Median 89.564 98.179 95.068 100.434 96.556
Standard deviation 150.260 650.444 650.764 650.334 650.732
Optimal rate 0.524 0.074 0.106 0.212 0.084

Notes: The methods listed in the first column are used to estimate the covariance matrix, and the estimated covariance matrix is then used as the truth to generate the corresponding data. The minimum value in each row is highlighted in bold. Since AIC and BIC select the same model in this real data analysis, the last method is labelled AIC/BIC.

The results show that the MAC method consistently outperforms all other methods, regardless of which method's estimate is used as the true covariance matrix and which performance evaluation criterion is used. We find this quite remarkable. In terms of the mean and median of the MFLs, the MAC method performs the best among all five estimators; in particular, the mean MFL of the MAC method is about half of that of any other method in all cases. The standard deviation of the MFL of MAC is also much smaller than those of the others in all cases, which means that the MAC performance is the most stable. In terms of the percentage of times that a method shows the best performance (the optimal rate), the MAC estimator always attains the highest score among the five methods, often with a value of more than 50%, indicating that MAC wins in over half of the 500 trials. Judging by the means and medians, SAIC/SBIC performs slightly better than AIC/BIC.

8. CONCLUDING REMARKS

In this article, we used the covariance regression network model, which treats the covariance as a polynomial function of the symmetric adjacency matrix. The model averaging method was used to estimate the high-dimensional covariance matrix, and the candidate models were constructed through different orders of a polynomial function. The optimal weights were obtained by minimizing the newly proposed Mallows-type model averaging criterion. We proved the asymptotic optimality and consistency of the resulting MAC estimators for different situations. Both numerical simulations and a case study on Chinese airport network structure data were conducted to demonstrate the validity of the proposed approach.

It is worth noting that our method combines polynomials of the adjacency matrix. If the true covariance matrix is far from the polynomial form, the MAC method may not yield an accurate estimate. In addition, if one conjectures that Σ is a combination of banded and Toeplitz-type matrices, these matrices should be included in the candidate models; however, there is then no single orthogonal matrix that diagonalizes them simultaneously, and in such a case MAC will not be applicable. How to develop a model averaging method for estimating this kind of covariance matrix warrants further study.

Supplementary Material

utaa030_Online_Appendix
utaa030_Replication_Files

ACKNOWLEDGEMENTS

We thank the referee, the associate editor, the co-editor Victor Chernozhukov and Prof. Hansheng Wang for many constructive comments and suggestions. We thank Prof. Wei Lan for providing his code. Zhang, the corresponding author, was supported by the National Key R&D Program of China (2020AAA0105200), the National Natural Science Foundation of China (grant nos. 71925007, 11688101 and 71631008), the Beijing Academy of Artificial Intelligence, and the Youth Innovation Promotion Association of the Chinese Academy of Sciences. Ma was supported by grants from the National Science Foundation and the National Institutes of Health. Zou was supported by the National Natural Science Foundation of China (grant nos. 11971323 and 12031016). All errors remain the authors'.

Notes

Co-editor Victor Chernozhukov handled this manuscript.

Footnotes

1

In 2017, there were 229 civil airports in Mainland China, 227 of which had scheduled flights.

Contributor Information

Rong Zhu, School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne NE1 7RU, UK.

Xinyu Zhang, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.

Yanyuan Ma, Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA.

Guohua Zou, School of Mathematical Sciences, Capital Normal University, Beijing 100048, China.

Supporting Information

Additional Supporting Information may be found in the online version of this article at the publisher’s website:

Online Appendix

Replication Package

REFERENCES

1. Ando T., Li K.-C. (2014). A model-averaging approach for high-dimensional regression. Journal of the American Statistical Association 109, 254–65.
2. Andrews D. W. K. (1991). Asymptotic optimality of generalized C_p, cross-validation, and generalized cross-validation in regression with heteroskedastic errors. Journal of Econometrics 47, 359–77.
3. Bilmes J. A. (2000). Factored sparse inverse covariance matrices. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing 2, 1009–12.
4. Buckland S. T., Burnham K. P., Augustin N. H. (1997). Model selection: An integral part of inference. Biometrics 53, 603–18.
5. Campbell J. Y., Lo A. W., MacKinlay A. C., Whitelaw R. F. (1998). The econometrics of financial markets. Macroeconomic Dynamics 2, 559–62.
6. Chen X. H., Conley T. G. (2001). A new semi-parametric spatial model for panel time series. Journal of Econometrics 105, 59–83.
7. Fan J., Fan Y., Lv J. (2008). High dimensional covariance matrix estimation using a factor model. Journal of Econometrics 147, 186–97.
8. Fan J., Peng H. (2004). Nonconcave penalized likelihood with a diverging number of parameters. Annals of Statistics 32, 928–61.
9. Friedman J. H., Hastie T., Tibshirani R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9, 432–41.
10. Gao Y., Zhang X., Wang S., Zou G. (2016). Model averaging based on leave-subject-out cross-validation. Journal of Econometrics 192, 139–51.
11. Hansen B. E. (2007). Least squares model averaging. Econometrica 75, 1175–89.
12. Hansen B. E. (2014). Model averaging, asymptotic risk, and regressor groups. Quantitative Economics 5, 495–530.
13. Hansen B. E., Racine J. S. (2012). Jackknife model averaging. Journal of Econometrics 167, 38–46.
14. He X., Shao Q. M. (2000). On parameters of increasing dimensions. Journal of Multivariate Analysis 73, 120–35.
15. Jagannathan R., Ma T. (2003). Risk reduction in large portfolios: Why imposing the wrong constraints helps. Journal of Finance 58, 1651–84.
16. Lan W., Fang Z., Wang H., Tsai C.-L. (2018). Covariance matrix estimation via network structure. Journal of Business and Economic Statistics 36, 359–69.
17. Leung G., Barron A. (2006). Information theory and mixing least-squares regressions. IEEE Transactions on Information Theory 52, 3396–410.
18. Liang H., Zou G., Wan A. T. K., Zhang X. (2011). Optimal weight choice for frequentist model average estimators. Journal of the American Statistical Association 106, 1053–66.
19. Liu Q., Okui R. (2013). Heteroskedasticity-robust C_p model averaging. Econometrics Journal 16, 463–72.
20. Liu Q., Okui R., Yoshimura A. (2016). Generalized least squares model averaging. Econometric Reviews 35, 1692–752.
21. Markowitz H. (1952). Portfolio selection. Journal of Finance 7, 77–91.
22. Millar C. P., Jardim E., Scott F., Osio G. C., Mosqueira I., Alzorriz N. (2014). Model averaging to streamline the stock assessment process. ICES Journal of Marine Science 72, 93–98.
23. Wan A. T. K., Zhang X. (2009). On the use of model averaging in tourism research. Annals of Tourism Research 36, 525–32.
24. Wan A. T. K., Zhang X., Zou G. (2010). Least squares model averaging by Mallows criterion. Journal of Econometrics 156, 277–83.
25. Whittle P. (1960). Bounds for the moments of linear and quadratic forms in independent variables. Theory of Probability and its Applications 5, 331–35.
26. Yuan Z., Yang Y. (2005). Combining linear regression models: When and how? Journal of the American Statistical Association 100, 1202–14.
27. Zhang X. (2010). Model averaging and its applications. Ph.D. thesis, Academy of Mathematics and Systems Science, Chinese Academy of Sciences.
28. Zhang X., Wan A. T. K., Zou G. (2013). Model averaging by jackknife criterion in models with dependent data. Journal of Econometrics 174, 82–94.
29. Zhang X., Wang W. (2019). Optimal model averaging estimation for partially linear models. Statistica Sinica 29, 693–718.
30. Zhu R., Zou G., Zhang X. (2018). Model averaging for multivariate multiple regression models. Statistics 52, 205–27.
31. Zou H., Zhang H. H. (2009). On the adaptive elastic-net with a diverging number of parameters. Annals of Statistics 37, 1733–51.
