Abstract
Sample entropy based tests, methods of sieves and Grenander estimation type procedures are known to be very efficient tools for assessing normality of underlying data distributions in one-dimensional nonparametric settings. Recently, it has been shown that the density based empirical likelihood (EL) concept extends and standardizes these methods, presenting a powerful approach for approximating optimal parametric likelihood ratio test statistics in a distribution-free manner. In this paper, we discuss the difficulties involved in constructing density based EL ratio techniques for testing bivariate normality and propose a solution to this problem. Toward this end, a novel bivariate sample entropy expression is derived and shown to be consistent with the established principles of bivariate histogram density estimation. Monte Carlo results show that the new density based EL ratio tests for bivariate normality perform very well for finite sample sizes. To illustrate the applicability of the proposed approach, we present a real data example related to a study of biomarkers associated with myocardial infarction.
Keywords: Bivariate normality, Density estimation, Empirical likelihood, Entropy, Goodness-of-fit, Histogram density estimation
1. Introduction and the statement of problem
Various statistical topics dealing with bivariate normally distributed data have broadened their appeal in recent theoretical and applied publications, which provide a wide range of new methods for multivariate statistical analysis (e.g., Balakrishnan and Lai [2]). This motivates the growing need for developing and evaluating powerful tests for bivariate normality (e.g., Balakrishnan and Lai [2]; Hawkins [7]; Kowalski [13]; Mecklin and Mundfrom [19]). Testing bivariate data for normality is much more difficult in practice than testing univariate data. Commonly, techniques for detecting departures from the bivariate normal distribution are developed as modifications of conventional test procedures known in the context of assessing univariate normality. In many cases, in order to test goodness-of-fit of two-dimensional normal distribution functions, the literature proposes one-dimensional test statistics (e.g., Balakrishnan and Lai [2]). In this framework, we note that it is not sufficient to test the corresponding univariate marginal distributions for normality, since scenarios in which the marginal distributions are normal but the joint distribution fails to be bivariate normal can occur.
In this paper we propose and examine a bivariate extension of the one-dimensional sample entropy based concept (e.g., Vasicek [30]) using the density based empirical likelihood (EL) methodology (e.g., Vexler et al. [33]). We then propose density based EL ratio tests for bivariate normality. To this end, we first outline the following material regarding the basic sources used in the new development.
1.1. Empirical likelihood and sample entropy
When functional forms of underlying data distributions are completely specified, the parametric likelihood approach is unarguably a powerful tool that provides optimal statistical inference. In such cases, by virtue of the Neyman-Pearson lemma, the likelihood ratio tests yield the most powerful decision making rules (e.g., Lehmann and Romano [16]; Vexler et al. [33]). However, the parametric likelihood methods cannot be applied properly if assumptions on the forms of data distributions do not hold. The distribution function based EL methods were introduced as nonparametric alternatives to parametric likelihood techniques (e.g., Lazar and Mykland [14]; Owen [22]). Commonly, the conventional EL function has the form $\prod_{i=1}^{n} p_i$, where the probability weights, pi, i = 1, …, n, satisfy the assumptions 0 < pi < 1, i = 1, …, n, and the values of pi, i = 1, …, n, are derived by maximizing the EL function under empirical constraints. For example, when we observe independent and identically distributed (i.i.d.) data points X1, …, Xn, under the null hypothesis that E(X1) = 0, the corresponding empirical constraint is $\sum_{i=1}^{n} p_i X_i = 0$, together with $\sum_{i=1}^{n} p_i = 1$.
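For illustration, the maximization above can be carried out explicitly in the mean-zero example: Lagrange multipliers give pi = 1/{n(1 + λXi)}, with λ solving Σ Xi/(1 + λXi) = 0 (e.g., Owen [22]). The following minimal R sketch implements this computation; the function name el_weights is ours.

```r
# Minimal sketch: EL weights p_i maximizing prod(p_i) subject to
# sum(p_i) = 1 and sum(p_i * X_i) = 0 (the mean-zero constraint).
# Lagrange multipliers yield p_i = 1 / (n * (1 + lambda * X_i)),
# where lambda solves sum(X_i / (1 + lambda * X_i)) = 0.
el_weights <- function(x) {
  n <- length(x)
  g <- function(lambda) sum(x / (1 + lambda * x))
  # Positivity of the weights requires 1 + lambda * x_i > 0 for all i,
  # so lambda lies in (-1/max(x), -1/min(x)) when 0 is inside the data range.
  lambda <- uniroot(g, c(-1 / max(x) + 1e-8, -1 / min(x) - 1e-8))$root
  1 / (n * (1 + lambda * x))
}

set.seed(1)
x <- rnorm(20)
p <- el_weights(x)
c(sum(p), sum(p * x))  # (1, ~0)
```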
The recent statistical literature has introduced the density-based EL (dbEL) approach for creating nonparametric test statistics that approximate parametric Neyman-Pearson statistics (e.g., Vexler et al. [33]). The dbEL method proposes to consider the likelihood function in the form
$$L_f = \prod_{i=1}^{n} f(X_i) = \prod_{j=1}^{n} f_j,$$
where f(⋅) is a density function of the observations X1, …, Xn, X(1) ≤ … ≤ X(n) are the order statistics based on X1, …, Xn, and fj = f(X(j)), j = 1, …, n. Then, one can estimate the values of fj, j = 1, …, n, by maximizing Lf subject to a constraint related to the empirical version of the density property ∫ f(x)dx = 1. In this case, the following lemma (Vexler and Gurevich [31]) plays a key role.
Lemma 1. Let f(x) be a density function. Then
$$\frac{1}{2m}\sum_{j=1}^{n}\int_{X_{(j-m)}}^{X_{(j+m)}} f(u)\,du \le 1,$$
where m < n/2, X(j) = X(1) if j ≤ 1, and X(j) = X(n) if j ≥ n.

In terms of constructing the dbEL ratio test for univariate normality, Lemma 1 leads to the following inference. By virtue of the inequality above,
we obtain the empirical constraint
$$\frac{1}{2m}\sum_{j=1}^{n}\left(X_{(j+m)} - X_{(j-m)}\right) f_j = 1,$$
where the expressions (X(j+m) − X(j−m))fj, j = 1, …, n, are Mean-Value Theorem type approximations to the integrals $\int_{X_{(j-m)}}^{X_{(j+m)}} f(u)\,du$, which appear in Lemma 1. Thus, the method of Lagrange multipliers provides the values of f1, …, fn, which maximize log(Lf) and satisfy the constraint above, resulting in
$$f_j = \frac{2m}{n\left(X_{(j+m)} - X_{(j-m)}\right)}, \quad j = 1, \ldots, n.$$
Therefore, taking into account the maximum likelihood function, say LN, under the null hypothesis H0: X1, …, Xn are normally distributed, where LN ∝ (2πes²)^(−n/2) with $s^2 = n^{-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ and $\bar{X} = n^{-1}\sum_{i=1}^{n} X_i$, we obtain the dbEL ratio
$$T_{mn} = \frac{\prod_{j=1}^{n} 2m\left\{n\left(X_{(j+m)} - X_{(j-m)}\right)\right\}^{-1}}{\left(2\pi e s^2\right)^{-n/2}}$$
that is known to be an efficient test statistic based on sample entropy (e.g., Vasicek [30]; Arizono and Ohta [1]). In order to develop the sample entropy based test for normality, Vasicek [30] applied the property of the normal distribution that its entropy exceeds that of any other distribution whose density has the same variance. The dbEL approach extends this sample entropy based mechanism to general methods for univariate goodness-of-fit testing. The test for normality based on sample entropy is an exponential rate optimal procedure (see Tusnady [28] for details). This accords with the fact that likelihood ratio type tests oftentimes have optimal statistical properties.
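To make the construction concrete, the following R sketch computes log(Tmn) from the formulas above, using the boundary conventions of Lemma 1 and the maximum likelihood variance estimator s²; the function name log_Tmn is ours.

```r
# Sketch of the univariate dbEL ratio statistic on the log scale:
# log(T_mn) = sum_j log[2m / {n (X_(j+m) - X_(j-m))}] + (n/2) log(2*pi*e*s^2).
log_Tmn <- function(x, m) {
  n <- length(x)
  xs <- sort(x)
  upper <- xs[pmin(1:n + m, n)]  # convention: X_(j) = X_(n) if j >= n
  lower <- xs[pmax(1:n - m, 1)]  # convention: X_(j) = X_(1) if j <= 1
  s2 <- mean((x - mean(x))^2)    # ML variance estimator
  sum(log(2 * m / (n * (upper - lower)))) +
    (n / 2) * log(2 * pi * exp(1) * s2)
}

set.seed(1)
log_Tmn(rnorm(50), m = 5)  # large values indicate departures from normality
```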
In the construction process of the test statistic Tmn shown above, we used the Mean-Value Theorem type approximation to the constraint
$$\frac{1}{2m}\sum_{j=1}^{n}\int_{X_{(j-m)}}^{X_{(j+m)}} f(u)\,du = 1.$$
By virtue of Lemma 1, one can expect that
$$\frac{1}{2m}\sum_{j=1}^{n}\int_{X_{(j-m)}}^{X_{(j+m)}} f(u)\,du \to 1 \quad \text{as } n \to \infty.$$
In this case, the integer m should increase when n → ∞, provided that m/n → 0, since, in light of Lemma 1, the corresponding remainder terms related to the constraint above need to vanish asymptotically (e.g., Vexler et al. [32]). In general, if m has a fixed value, the approximation to the parametric likelihood Lf is not consistent (see the Supplement, Appendix A, for technical details). This is an interesting point, since one might naively anticipate that fixed values of m can provide 'good' approximations to the integrals $\int_{X_{(j-m)}}^{X_{(j+m)}} f(u)\,du$ as n → ∞, shortening the distances between these integrals and their estimators (X(j+m) − X(j−m))fj, j = 1, …, n. However, when m is fixed and n → ∞, the number of approximated integrals relative to the normalizing factor 2m is larger than in the case with m → ∞. This enlarges the total error of the applied Mean-Value Theorem type approximations in the constraint, since the intervals [X(i−m), X(i+m)], i = 1, …, n, overlap.
Note that the dbEL technique mentioned above can be employed to estimate density functions in the maximum likelihood manner, yielding a class of histogram density estimators (e.g., Izenman [9]; Prakasa Rao [23]). In this framework, we may use fixed values of m, assuming that f is a monotonically decreasing density function. This can yield procedures related to Grenander's estimation and the method of sieves in the context of nonparametric density function evaluation (Izenman [9]; Carolan and Dykstra [4]; Efromovich [5: pp. 341–343]).
A further advance of the dbEL approach based on Lemma 1's consequence addresses the fact that, in practice, we must specify the value of m in order to apply the test statistic Tmn. It turns out that, since we employ the likelihood concept to derive Tmn, the maximum likelihood principle implies the test statistic
$$T_n = \min_{m} T_{mn},$$
where the operator 'min' automatically sets up nearly optimal values of m in the test statistic Tmn (Vexler and Gurevich [31]). In general, the optimal values of m, which maximize the power of the test based on Tmn, can be calculated using information regarding the alternative distribution of the observations.
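A minimal sketch of this 'min' operation, reusing log_Tmn from the sketch above; the grid of m values below is an illustrative choice, not the exact range used in Vexler and Gurevich [31].

```r
# Sketch: the maximum likelihood principle suggests minimizing the dbEL
# ratio over m; the m-grid here is illustrative (m < n/2).
log_Tn <- function(x, m_grid = 1:max(1, floor(length(x) / 2) - 1)) {
  min(sapply(m_grid, function(m) log_Tmn(x, m)))
}
```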
The dbEL method has been successfully applied to construct various nonparametric decision-making schemes, significantly improving power as compared to the corresponding classical procedures (e.g., Vexler et al. [33]).
1.2. Bivariate extensions to the univariate density based empirical likelihood and sample entropy expressions
Let (X, Y)T denote a bivariate random vector with joint density function f(x, y). Consider a sample from f(x, y) in the form {(Xi, Yi), i = 1, …, n}. In this case, the likelihood function is
$$L_f = \prod_{i=1}^{n} f(X_i, Y_i) = \prod_{i=1}^{n} f\left(X_{(i)}, Y_{[i]}\right) = \prod_{i=1}^{n} f_i, \quad (1)$$
where the sample is ordered by the Xi's, the notation Y[i] denotes the concomitant of the i-th order statistic X(i), i.e., the Y-variate associated with X(i), and fi = f(X(i), Y[i]), i = 1, …, n.
In this case, according to the dbEL technique, we aim to maximize Lf subject to an empirical constraint corresponding to the requirement ∬f(x, y)dxdy = 1. The problem is to approximate the double integral ∬f(x, y)dxdy using only n data points. That is to say, although it would be desirable to apply n × n points in a Riemann-type manner to approximate the double integral, we cannot employ the couples (Xi, Yj), i ≠ j, which are not associated with fi, i = 1, …, n, and do not appear in Lf defined in (1). In this context, one can attempt to adapt multivariate entropy estimation algorithms (Berrett et al. [3]; Kozachenko and Leonenko [12]) using open circles around the points (X(i), Y[i]), i = 1, …, n, with radii defined via nearest neighbour distances.
The methods for estimating the entropy of a random vector can provide consistent evaluations of ∬f(x, y)log(f(x, y))dxdy. However, in terms of the target construction of approximations to ∬f(x, y)dxdy based on fi, i = 1, …, n, these approaches cannot be directly employed, due to the very complicated overlaps between the open circles around the points (X(i), Y[i]), i = 1, …, n.
In order to avoid the issue above, one can reduce the dimension of the testing problem via an application of projection pursuit techniques (Zhu et al. [38]). In this framework, it can be proposed to use the fact that (X, Y)T is bivariate normally distributed, say (X, Y)T ~ N2(μ, V) with μ = E{(X, Y)T} and variance-covariance matrix V, if and only if for every vector a ∈ R2 such that aTa = 1 and aTVa ≠ 0, we have
$$a^{T}(X, Y)^{T} \sim N_1\left(a^{T}\mu,\; a^{T}Va\right).$$
Now, we can compute estimators $\hat{\mu}$, $\hat{V}$ of the parameters μ, V and then consider the one-dimensional observations
$$Z_{a,i} = \frac{a^{T}\left\{(X_i, Y_i)^{T} - \hat{\mu}\right\}}{\left(a^{T}\hat{V}a\right)^{1/2}}, \quad i = 1, \ldots, n,$$
in order to derive the dbEL ratio depending on a via the method shown in Section 1.1. The obtained dbEL ratio, as a function of a, can then be, for example, integrated over different values of a, yielding a final test statistic. It is clear that this approach might suffer from an efficiency loss when the likelihood function Lf under the alternative hypothesis is replaced by that based on Za,i, i = 1, …, n. Properties of the decision making procedure based upon projection pursuit depend significantly on the way the final test statistic is summarized with respect to different values of a.
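For illustration, the R sketch below computes the projected observations Za,i for a unit vector a parameterized by an angle; note that the standardized form of Za,i displayed above is our reconstruction of the missing expression, so the sketch should be read under that assumption.

```r
# Sketch: one-dimensional projections Z_{a,i} of the bivariate data onto
# a unit vector a = (cos(theta), sin(theta)), standardized by the
# estimated mean vector and covariance matrix.
project_data <- function(xy, theta) {
  a <- c(cos(theta), sin(theta))
  mu_hat <- colMeans(xy)
  V_hat <- cov(xy)
  drop(sweep(xy, 2, mu_hat) %*% a) / sqrt(drop(t(a) %*% V_hat %*% a))
}

set.seed(2)
xy <- cbind(rnorm(50), rnorm(50))
z <- project_data(xy, theta = pi / 4)  # Z_{a,i}'s, to be tested for N(0,1)
```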
In a similar manner to that shown above, one can evaluate the option of transforming the original variates X and Y to independent normal variates (Balakrishnan and Lai [2: p. 509]), an operation that is, in general, correct only under the null hypothesis H0: (X, Y)T ~ N2(μ, V). Note also that algorithms to transform (X, Y)T ~ N2(μ, V) into two independent normal variates, say ZX and ZY, depend on the parameters μ, V, which are unknown. If μ and V are estimated, the target distributional properties of (ZX, ZY) hold only approximately.
In this paper, we carry out an accurate scheme to apply Lemma 1's result to approximate ∬f(x, y)dxdy based on fi, i = 1, …, n. Then, in Section 2, maximizing Lf defined in (1), we obtain estimators of fi, i = 1, …, n, in forms that can be directly associated with bivariate histogram density estimation (Kim and van Ryzin [11]; van Ryzin [29]; Prakasa Rao [23: pp. 234–235]; Izenman [9: pp. 209–210, 212–213]). This reveals a natural link between the univariate sample entropy based approach shown in Section 1.1 and the proposed methodology (see Section 3 for details). We study the asymptotic consistency of the bivariate dbEL technique in Section 4. In this context, due to the complexity of the dbEL structure, we assume a condition on f(x, y) in order to show rigorously that the bivariate dbEL is a consistent approximation of the parametric likelihood Lf. Various Monte Carlo experiments based on more than one hundred different scenarios of (X, Y)-distributions and a variety of sample sizes n showed that the theoretical condition applied in Section 4 is not critical. We therefore believe the bivariate dbEL approach is consistent in more general cases.
In Section 5, we develop the dbEL ratio tests for bivariate normality. Consider the simple example assuming that X1, …, X50 are i.i.d. observations from a standard normal distribution and Y1, …, Y50 are defined as Yi = τiXi, i = 1, …, 50, where the random variables τi = −1 or 1, i = 1, …, 50, are i.i.d. and independent of X1, …, X50 with Pr(τ1 = −1) = 0.5. This is a conventional scenario in which X1, …, X50 ~ N1(0,1) and Y1, …, Y50 ~ N1(0,1), but (X1, Y1), …, (X50, Y50) are not bivariate normal. In this case, the Shapiro-Wilk test (R procedure “mvShapiro.Test”, R Development Core Team [24]) and classical Mardia’s test for bivariate normality show powers of 0.06 and 0.38 at the significance level of 5%, respectively, whereas the new tests we propose provide powers of 0.83 and 0.92 (see Section 6 for details). One advantage of the proposed technique is that, by applying the dbEL approach, we can powerfully detect failures of underlying data to be bivariate normal. In Section 6, an extensive Monte Carlo study is employed to support this conclusion.
In Section 7 the proposed tests are applied to a biomarker study associated with myocardial infarction (MI). The epidemiological literature indicates significant associations between the biomarkers vitamin E and cholesterol and MI. We demonstrate that the new tests based on measurements of the vitamin E and cholesterol biomarkers exhibit high and stable power characteristics in comparison to well-known decision making procedures. We conclude with remarks in Section 8. Finally, the technical proofs of the theoretical results shown in this paper are given in the Supplement. The online supplementary material of this paper also presents R code to implement the proposed method.
2. The bivariate density based empirical likelihood
In this section, we introduce the algorithm for developing the bivariate dbEL approximation to the likelihood function Lf defined in (1). Toward this end, we begin by outlining the following scheme for constructing an empirical version of the constraint ∬f(x, y)dxdy = 1 based on the observations (X(i), Y[i]), i = 1, …, n. The proposed algorithm consists of two stages: (A) we use Lemma 1 with respect to the density function f(x) of X, employing X(1), …, X(n); and then (B) we use Lemma 1 with respect to the conditional density function f(y|x), employing the Y’s linked to the X’s involved in the corresponding procedures of Stage (A).
Lemma 1 applied to the marginal density function
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$$
provides the inequality
$$\frac{1}{2m}\sum_{i=1}^{n}\int_{X_{(i-m)}}^{X_{(i+m)}}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx \le 1, \quad (2)$$
where m < n/2.
Next, we use Lemma 1 to approximate ∫f(y|x)dy. To this end, we rewrite the left term of (2) as
$$\frac{1}{2m}\left(\sum_{i=1}^{m} + \sum_{i=m+1}^{n-m} + \sum_{i=n-m+1}^{n}\right)\int_{X_{(i-m)}}^{X_{(i+m)}} f_X(x)\int_{-\infty}^{\infty} f(y\,|\,x)\,dy\,dx. \quad (3)$$
In order to consider the summands in (3) with 1 ≤ i ≤ m, m + 1 ≤ i ≤ n − m and n − m + 1 ≤ i ≤ n, corresponding to the sums on the right side of (3), we define the order statistics
$$Y_{(1:k,j)} \le Y_{(2:k,j)} \le \ldots \le Y_{(j-k+1:k,j)}$$
based on Y[k], …, Y[j], the concomitants of X(k), …, X(j), j ≥ k, respectively. We specify that Y(r:k,j) = Y(1:k,j), if r ≤ 1, and Y(r:k,j) = Y(j−k+1:k,j), if r ≥ j − k + 1.
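The bookkeeping behind Y(r:k,j) can be expressed compactly in R; the helper below (names ours) takes the concomitant vector, i.e., the Y-values sorted by their X-partners, and applies the clamping conventions above.

```r
# Helper sketch for Y_(r:k,j): the r-th order statistic of the
# concomitants Y_[k], ..., Y_[j], with Y_(r:k,j) = Y_(1:k,j) for r <= 1
# and Y_(r:k,j) = Y_(j-k+1:k,j) for r >= j - k + 1.
y_window_os <- function(y_conc, k, j, r) {
  w <- sort(y_conc[k:j])          # order statistics of Y_[k], ..., Y_[j]
  w[min(max(r, 1), j - k + 1)]    # clamp the index r
}

set.seed(3)
x <- rnorm(10); y <- rnorm(10)
y_conc <- y[order(x)]             # concomitants: Y-values sorted by X
y_window_os(y_conc, k = 2, j = 7, r = 3)  # Y_(3:2,7)
```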
In the case of 1 ≤ i ≤ m, we have i + m observations Y(1:1,i+m) ≤ … ≤ Y(i+m:1,i+m), since it is defined that X(j) = X(1), if j ≤ 1. Then, by virtue of Lemma 1, for k < m / 2 and a fixed x, we obtain
| (4) |
In the case of m + 1 ≤ i ≤ n − m, we observe 2m + 1 data points Y(1:i−m,i+m) ≤ … ≤ Y(2m+1:i−m,i+m). Then, we have
| (5) |
In the case of n − m + 1 ≤ i ≤ n, we observe n − i + m + 1 data points Y(1:i−m,n) ≤ … ≤ Y(n−i+m+1:i−m,n), since it is defined that X(j) = X(n), if j ≥ n. Then, we have
| (6) |
Note that, for any fixed x, hr,i,m,k(x) → 1, r = 1, 2, 3, when k / m → 0 as k, m → ∞. It is clear that Equations (2)–(6) provide the constraint
| (7) |
where Hm,k is defined as
Simple algebra shows that the statistic Hm,k can be presented in the form
Now, in accordance with the dbEL-developing procedure introduced in Section 1.1, we will use the mean-value approximations to the integrals in Hm,k. Let X[r:k,j] denote the concomitant of Y(r:k,j). The couple (X[r:k,j], Y(r:k,j)) belongs to the data points {(Xi, Yi), i = 1, …, n}, and f(X[r:k,j], Y(r:k,j)) appears in Lf defined in (1). Consider the following situations with respect to the summands that appear in the definition of Hm,k. In the cases with 1 ≤ i ≤ m, 1 ≤ j ≤ i + m, we obtain
| (8) |
Note that, in this framework, it holds that Y(j−k:1,i+m) ≤ Y(j:1,i+m) ≤ Y(j+k:1,i+m) and X(i−m) ≤ X[j:1,i+m] ≤ X(i+m). In the cases with m + 1 ≤ i ≤ n − m, 1 ≤ j ≤ 2m + 1, we have
| (9) |
In the cases with n – m + 1 ≤ i ≤ n, 1 ≤ j ≤ n – i + m + 1, we have
| (10) |
Thus, using (8)–(10), we represent constraint (7) in its empirical form
| (11) |
defining
Note that the left-hand side of (11) is a sum of, say, w different summands, and each summand involves one multiplier f(Xl, Yl) for some l ∈ [1, n]. Therefore, there are several summands in (11) with equivalent multipliers f(X(l), Y[l]) = fl, l = 1, 2, …, n. This can complicate the use of the Lagrange method for deriving the values of fi, i = 1, …, n, that maximize Lf defined in (1) while satisfying (11). Taking this issue into account, we rewrite the left-hand side of (11) as a sum of n terms with coefficients fi, i = 1, …, n. To this end, we define the rank of the observation Y[r] with respect to Y[c], Y[c+1], …, Y[d] as
$$\rho\left(Y_{[r]}, c, d\right) = \sum_{i=c}^{d} I\left(Y_{[i]} \le Y_{[r]}\right),$$
where I(⋅) is the indicator function and 1 ≤ c ≤ d ≤ n. Then, using some reorganization (see the Supplement, Appendix B, for details), we have
| (12) |
where Gi,m,k are defined in accordance with the following scheme:
- (a) for 1 ≤ i ≤ m,
- (b) for m + 1 ≤ i ≤ 2m,
- (c) for 2m + 1 ≤ i ≤ n − 2m,
- (d) for n − 2m + 1 ≤ i ≤ n − m,
- (e) for n − m + 1 ≤ i ≤ n,
Note that, corresponding to Scenarios (a)–(e) above, the statistic Gi,m,k consists of m + i, 2m + 1, 2m + 1, 2m + 1 and m + n − i + 1 different summands, respectively. Then the sum at (12) includes w different summands, which is consistent with definition (11).
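In R, the rank ρ(Y[r], c, d) used in the scheme above reduces to a one-line computation on the concomitant vector (y_conc, as in the earlier sketch):

```r
# Sketch of rho(Y_[r], c, d) = sum_{i=c}^{d} I(Y_[i] <= Y_[r]).
rho_rank <- function(y_conc, r, c, d) sum(y_conc[c:d] <= y_conc[r])
```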
According to the dbEL concept, we derive values of fi, i = 1, …, n, that maximize the logarithm of the likelihood defined in (1), subject to the constraint
obtained with respect to (11), where the left-hand side has the form (12). This procedure provides the values
| (13) |
yielding the approximation
to the likelihood function
where n^δ ≤ m ≤ n^(1−δ) and m^δ ≤ k ≤ m^(1−δ) with 0 < δ < 0.5. Employing the maximum likelihood technique described in Section 1.1, we conclude that the dbEL approximation to Lf is
| (14) |
where 0 < δ < 0.5. Note that, in contrast to the univariate dbEL approach shown in Section 1.1, it is required that m ≥ n^δ and k ≥ m^δ. Explanations regarding these restrictions are provided in the sections below.
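To fix ideas, the admissible pairs (m, k) with n^δ ≤ m ≤ n^(1−δ) and m^δ ≤ k ≤ m^(1−δ) can be enumerated as follows; rounding the bounds to integers is our assumption.

```r
# Illustrative enumeration of the (m, k) pairs over which the minimum in
# (14) is taken, for delta = 0.4 and n = 50.
delta <- 0.4
n <- 50
m_grid <- ceiling(n^delta):floor(n^(1 - delta))  # 5, 6, ..., 10
mk_pairs <- do.call(rbind, lapply(m_grid, function(m) {
  cbind(m = m, k = ceiling(m^delta):floor(m^(1 - delta)))
}))
mk_pairs
```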
3. A bivariate version of the variable partition histogram
In this section, we demonstrate that the proposed method satisfies the principle of bivariate histogram construction in the context of maximum likelihood estimation of the density function f(x, y) (Kim and van Ryzin [11]; van Ryzin [29]; Prakasa Rao [23: pp. 234–235]; Izenman [9: pp. 209–210, 212–213]).
Consider, for example, Scenario (c) in the definition (12), where, for i ∈ [2m + 1, n − 2m],
consists of 2m + 1 summands. Taking into account the formal notations used in Kim and van Ryzin [11], we denote the statistics
where An,j = i − j, Cn,j = i − j + 2m, Bn,j = ρ(Y[i], i − j, i − j + 2m) − k, Dn,j = ρ(Y[i], i − j, i − j + 2m) + k. These statistics, j = 0, …, 2m, are consistent approximations to f(X(i), Y[i]), 2m + 1 ≤ i ≤ n − 2m, uniformly in j = 0, …, 2m, if An,j, Bn,j, Cn,j and Dn,j satisfy the conditions presented in Kim and van Ryzin [11]. In this context, we note that, for n^δ ≤ m ≤ n^(1−δ) and n → ∞,
(Cn,j − An,j)/n → 0; An,j and Cn,j are invariant under permutations of (Xr, Yr), r = 1, …, n, for given X(i). Regarding the positive integer-valued indexing random variables Bn,j and Dn,j, we have
where D′n,j = Dn,j I(Dn,j ≤ 2m + 1) + (2m + 1)I(Dn,j > 2m + 1) and B′n,j = Bn,j I(Bn,j ≥ 1) + I(Bn,j < 1), corresponding to the definitions of the subscripts (Dn,j: i − j, i − j + 2m) and (Bn,j: i − j, i − j + 2m) of the Y’s, since the order statistics Y(1:i−j,i−j+2m) ≤ Y(2:i−j,i−j+2m) ≤ … ≤ Y(2m+1:i−j,i−j+2m) are based on Y[i−j], Y[i−j+1], …, Y[i−j+2m]. It is clear that, for n^δ ≤ m ≤ n^(1−δ) and m^δ ≤ k ≤ m^(1−δ), we have Dn,j − Bn,j = 2k → ∞ and (Dn,j − Bn,j)/(Cn,j − An,j) → 0, and Dn,j and Bn,j are invariant under permutations of (Xr, Yr), r = 1, …, n, for given (X(i), Y[i]).
Now, requiring m ≥ n^δ and k ≥ m^δ, we obtain that
Thus, the theoretical arguments shown in Kim and van Ryzin [11] provide that, for all j ∈ [0, 2m], . This implies
concluding that
4. An asymptotic consistency of the bivariate density based empirical likelihood
Here we restrict the density function f(x, y) to be continuous and bounded on its support, a1 < f(x, y) < a2, where 0 < a1 < a2 < ∞ are fixed constants. Then,
where the dbEL, Δn, is defined by (14).
The proof of the result above is based on a formal scheme that can be associated with the explanations given in Section 3. We use the theoretical arguments shown in [11], [29] and [37] to present the rigorous proof of the dbEL consistency in the Supplement (Appendix C).
We employed various Monte Carlo evaluations based on more than one hundred different scenarios of (X, Y)-distributions and a variety of sample sizes n in order to examine how critical the condition a1 < f(x, y) < a2 is for the asymptotic result shown in this section. These studies demonstrated that the bivariate dbEL approach remains consistent for more general forms of f(x, y).
5. The dbEL ratio tests for bivariate normality
Assume the underlying data follow the bivariate normal density function
$$f_{N_2}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}}\exp\left[-\frac{1}{2(1-\rho^2)}\left\{\frac{(x-\mu_x)^2}{\sigma_x^2} - \frac{2\rho(x-\mu_x)(y-\mu_y)}{\sigma_x\sigma_y} + \frac{(y-\mu_y)^2}{\sigma_y^2}\right\}\right],$$
where the parameters μx = E(X), μy = E(Y), σx² = var(X), σy² = var(Y) and ρ = E((X − μx)(Y − μy))/(σxσy) ∈ (−1, 1). In this case the maximum log likelihood function is
$$\log L_{N_2} = -n\log(2\pi) - \frac{n}{2}\log\left\{\hat{\sigma}_x^2\hat{\sigma}_y^2\left(1 - \hat{\rho}^2\right)\right\} - n,$$
where $\hat{\sigma}_x^2$, $\hat{\sigma}_y^2$ and $\hat{\rho}$ denote the maximum likelihood estimators of σx², σy² and ρ (Balakrishnan and Lai [2: p. 490]). Then we state the log dbEL ratio test that rejects the null hypothesis iff
$$\log(\Delta_n) - \log L_{N_2} > C, \quad (15)$$
where the statistic Δn is defined in (14) and C is a test threshold.
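The null part of (15) is a routine computation. A minimal R sketch, assuming the standard bivariate normal maximum likelihood plug-in form shown above:

```r
# Sketch of the maximum log likelihood under H0 in (15):
# log L_N2 = -n*log(2*pi) - (n/2)*log(det(V_hat)) - n,
# where det(V_hat) = sigma_x^2 * sigma_y^2 * (1 - rho^2) for the ML
# (denominator-n) covariance estimate V_hat.
max_loglik_binorm <- function(xy) {
  n <- nrow(xy)
  V_hat <- cov(xy) * (n - 1) / n
  -n * log(2 * pi) - (n / 2) * log(det(V_hat)) - n
}
```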
It is clear that the transformation ((Xi − μx)/σx, (Yi − μy)/σy)T, i = 1, …, n, of the data does not change the value of the test statistic in (15). That is to say, the null distribution of the statistic in (15) is unaltered with respect to the parameters μx, μy, σx, σy. However, the H0-distribution of the statistic depends on ρ, and one can easily show that the probability that the test (15) rejects H0 decreases when |ρ| ∈ (0,1) increases. We evaluate this fact in the next section.
Various efficient statistics applied for testing univariate normality have forms that are distributed independently of the mean and variance of the observations under the null hypothesis. For example, the distribution of the statistic Tn defined in Section 1.1 does not depend on μx and σx when X1 ~ N1(μx, σx). Then, in order to calculate the critical values of the Tn-based test for Xi ~ N1(μx, σx), i = 1, …, n, one can use pre-tabulated critical points and/or Monte Carlo simulations without restricting the sample size n to be relatively large.
Unfortunately, in the bivariate case, this property cannot be maintained in many scenarios when constructing appropriate test statistics. In this framework, the conventional approach is to standardize the bivariate data (Xi, Yi)T, i = 1, …, n, obtaining the transformed data
$$(X'_i, Y'_i)^{T} = \hat{V}^{-1/2}\left\{(X_i, Y_i)^{T} - \hat{\mu}\right\}, \quad i = 1, \ldots, n,$$
where $\hat{V}^{-1/2}$ is the square root of the inverse of the estimated covariance matrix and $\hat{\mu}$ is the vector of sample means. Then, under the null hypothesis, (X′1, Y′1)T has approximately the bivariate standard normal distribution. In this context, in addition to the relevant literature mentioned in Section 1, we refer the reader to Looney [18], Henze and Zirkler [8], Lee et al. [15] and Villaseñor-Alva and González-Estrada [36]. For example, the widely applied R procedure (R Development Core Team [24]) “mvShapiro.Test” for Shapiro-Wilk type testing of bivariate normality is based on the principle shown above. Note that applications of the code “mvShapiro.Test” are restricted by the requirement n ∈ [12, 5000], related to the use of tabulated values of the theoretical expectations of order statistics in Shapiro-Wilk’s manner of testing normality.
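A minimal R sketch of this standardization, computing the inverse square root of the estimated covariance matrix via an eigendecomposition (whether the covariance is estimated with denominator n or n − 1 is immaterial for the sketch):

```r
# Sketch: standardize bivariate data so that, under H0, the transformed
# observations are approximately bivariate standard normal.
standardize_xy <- function(xy) {
  centered <- sweep(xy, 2, colMeans(xy))    # subtract sample means
  e <- eigen(cov(xy), symmetric = TRUE)
  V_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)
  centered %*% V_inv_sqrt                   # (X'_i, Y'_i), i = 1, ..., n
}
```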
We may then propose the dbEL ratio test that rejects the null hypothesis iff
$$\log(\Delta_n) - \left\{n\log\left(\frac{1}{2\pi}\right) - n\right\} > C, \quad (16)$$
where the statistic Δn given by (14) is calculated employing the data (X′i, Y′i)T, i = 1, …, n, instead of (Xi, Yi)T, i = 1, …, n, the quantity n log(1/(2π)) − n corresponds to the approximate maximum log likelihood of (X′i, Y′i)T, i = 1, …, n, under the null hypothesis, and C is a test threshold.
Remark. The dbEL literature shows that the power of dbEL tests does not depend significantly on the values of parameters that play roles similar to that of δ in definition (14) (e.g., Tsai et al. [27]; Vexler et al. [33, 35]). In this context, we note that extensive Monte Carlo simulations confirmed the robustness of the proposed tests with respect to the value of δ in (14); i.e., one can show that the power of the new tests does not depend significantly on the values of δ ∈ (0, 0.5) under various scenarios of alternative distributions. Thus, without loss of generality, we set δ = 0.4 in the proposed test statistics.
5.1. Null distributions of the dbEL ratio tests
In the one-dimensional setting, a very substantial body of literature has grown around the asymptotic distribution problems involving Vasicek entropy type statistics. It is generally recognized that proofs regarding the asymptotic distribution of the statistic Tmn defined in Section 1.1 are analytically very complicated. Note also that, when the sample size is relatively large, various tests can be anticipated to provide very powerful inference. Thus, following the recent literature related to goodness-of-fit tests (e.g., Hall and Welsh [6]; Mudholkar and Tian [20, 21]), we focus on finite sample sizes without attempting to provide an asymptotic solution for the critical values of the proposed tests in the two-dimensional setting.
The critical values for the dbEL ratio tests can be accurately approximated using Monte Carlo techniques. In order to tabulate the percentiles of the null distributions of the test statistics in (15) and (16) with δ = 0.4 in definition (14), we drew 50,000 samples under the null hypothesis, calculating the values of the test statistics at each sample size n. The generated values of the test statistics were used to determine the critical values Cα of the corresponding null distributions at the significance level α. The results of this Monte Carlo study are presented in Tables 1 and 2 (a sketch of this tabulation scheme is given after Table 2).
Table 1.
Critical Values, Cα, of the Test Statistic in (15)
| Sample size | α | ||||
|---|---|---|---|---|---|
| n | 0.1 | 0.05 | 0.04 | 0.025 | 0.01 |
| 15 | 13.78831 | 14.45392 | 14.67231 | 15.10617 | 15.89147 |
| 20 | 16.01282 | 16.82075 | 17.06102 | 17.56599 | 18.44998 |
| 25 | 17.79228 | 18.67010 | 18.93496 | 19.48494 | 20.54437 |
| 30 | 19.34573 | 20.38065 | 20.67776 | 21.28747 | 22.37941 |
| 35 | 20.88404 | 21.99563 | 22.33991 | 23.01375 | 24.19156 |
| 40 | 22.07930 | 23.26584 | 23.64351 | 24.36966 | 25.67781 |
| 45 | 23.30695 | 24.54427 | 24.90769 | 25.70937 | 27.06523 |
| 50 | 24.67166 | 26.00287 | 26.34771 | 27.08118 | 28.43204 |
| 55 | 25.53667 | 26.91676 | 27.33220 | 28.19032 | 29.63046 |
| 60 | 26.44042 | 27.77653 | 28.22558 | 29.12152 | 30.66377 |
| 70 | 28.04247 | 29.53837 | 29.96516 | 30.84174 | 32.44158 |
| 80 | 29.36747 | 31.02797 | 31.51445 | 32.44036 | 34.09259 |
| 90 | 30.75603 | 32.51081 | 33.05028 | 34.05678 | 35.57193 |
| 100 | 32.07309 | 33.76989 | 34.37907 | 35.40567 | 37.31336 |
| 120 | 34.02814 | 36.02057 | 36.52973 | 37.70851 | 39.86061 |
Table 2.
Critical Values, Cα, of the Test Statistic in (16)
| Sample size | α | ||||
|---|---|---|---|---|---|
| n | 0.1 | 0.05 | 0.04 | 0.025 | 0.01 |
| 15 | 13.96789 | 14.68150 | 14.85996 | 15.30876 | 16.03354 |
| 20 | 16.22554 | 17.06107 | 17.31668 | 17.77547 | 18.72344 |
| 25 | 17.94104 | 18.90434 | 19.19077 | 19.72435 | 20.72166 |
| 30 | 19.47291 | 20.53942 | 20.78474 | 21.38950 | 22.44172 |
| 35 | 21.31563 | 22.44544 | 22.77901 | 23.46965 | 24.70225 |
| 40 | 22.29129 | 23.48504 | 23.82955 | 24.50040 | 25.67661 |
| 45 | 23.52354 | 24.73849 | 25.09407 | 25.84452 | 27.13543 |
| 50 | 24.77544 | 26.05601 | 26.43748 | 27.21586 | 28.60631 |
| 55 | 25.65505 | 26.96317 | 27.35342 | 28.14286 | 29.63217 |
| 60 | 26.52463 | 27.98900 | 28.42992 | 29.29108 | 30.66241 |
| 70 | 28.15409 | 29.68741 | 30.12672 | 31.04799 | 32.67359 |
| 80 | 29.46820 | 31.10201 | 31.56947 | 32.42746 | 34.07674 |
| 90 | 30.92357 | 32.62724 | 33.13130 | 34.23576 | 36.06089 |
| 100 | 32.00653 | 33.72502 | 34.26569 | 35.32767 | 37.10607 |
| 120 | 34.01007 | 35.96420 | 36.54560 | 37.82717 | 39.62449 |
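A minimal sketch of the Monte Carlo tabulation scheme behind Tables 1 and 2 is as follows. The function test_stat stands in for the statistics in (15) or (16) (implemented in the supplementary R code), and simulating under independent standard normal components is our illustrative choice of null configuration.

```r
# Sketch: approximate the critical values C_alpha by simulating the test
# statistic under the null hypothesis and taking upper quantiles.
mc_critical_values <- function(n, test_stat, B = 50000,
                               alpha = c(0.1, 0.05, 0.04, 0.025, 0.01)) {
  stats <- replicate(B, test_stat(cbind(rnorm(n), rnorm(n))))
  quantile(stats, probs = 1 - alpha)
}
```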
In order to verify the results shown in Tables 1 and 2, for different values of ρ ∈ (−1, 1) and n, we calculated the Monte Carlo approximations to the Type I error rates of the tests (15) and (16),
where the Cα=0.05’s are shown in Tables 1 and 2. In this study, we also analyzed the Shapiro-Wilk test (SW), using the R procedure “mvShapiro.Test”. For each value of ρ and n, the Type I error rates were derived using 25,000 bivariate normal samples with correlation coefficient ρ. Table 3 presents the results of this Monte Carlo evaluation. According to Table 3, the validity of the critical values related to the test statistic in (16) is experimentally confirmed. Although the test (15) is very conservative when |ρ| > 0.7, we can recommend the test (15) to be applied in practice, owing to the high power of this test against the alternatives considered in Section 6.
Table 3.
The Monte Carlo Type I error probabilities of the proposed tests (15), (16) and the Shapiro-Wilk test (SW), when (Xi, Yi)T, i = 1, …, n, are bivariate normal with correlation coefficient ρ and the anticipated significance level is α = 0.05.
| n = 35 | n = 50 | n = 70 | |||||||
|---|---|---|---|---|---|---|---|---|---|
| ρ | Test (15) | Test (16) | SW | Test (15) | Test (16) | SW | Test (15) | Test (16) | SW |
| −0.9 | <0.0001 | 0.0475 | 0.0521 | <0.0001 | 0.0496 | 0.0519 | <0.0001 | 0.0513 | 0.0506 |
| −0.8 | 0.0015 | 0.0504 | 0.0501 | 0.0014 | 0.0495 | 0.0492 | 0.0014 | 0.0477 | 0.0482 |
| −0.7 | 0.0073 | 0.0501 | 0.0501 | 0.0069 | 0.0498 | 0.0480 | 0.0067 | 0.0499 | 0.0466 |
| −0.6 | 0.0143 | 0.0505 | 0.0489 | 0.0139 | 0.0499 | 0.0521 | 0.0143 | 0.0479 | 0.0495 |
| −0.5 | 0.0235 | 0.0511 | 0.0520 | 0.0233 | 0.0495 | 0.0521 | 0.0228 | 0.0498 | 0.0481 |
| −0.4 | 0.0325 | 0.0513 | 0.0507 | 0.0319 | 0.0500 | 0.0519 | 0.0322 | 0.0495 | 0.0498 |
| −0.3 | 0.0388 | 0.0509 | 0.0521 | 0.0407 | 0.0500 | 0.0493 | 0.0394 | 0.0500 | 0.0506 |
| −0.2 | 0.0454 | 0.0509 | 0.0517 | 0.0465 | 0.0501 | 0.0516 | 0.0475 | 0.0495 | 0.0497 |
| −0.1 | 0.0506 | 0.0461 | 0.0526 | 0.0499 | 0.0499 | 0.0489 | 0.0507 | 0.0503 | 0.0496 |
| 0 | 0.0488 | 0.0501 | 0.0500 | 0.0508 | 0.0499 | 0.0493 | 0.0504 | 0.0495 | 0.0501 |
| 0.1 | 0.0501 | 0.0473 | 0.0505 | 0.0500 | 0.0454 | 0.0510 | 0.0503 | 0.0475 | 0.0466 |
| 0.2 | 0.0466 | 0.0508 | 0.0521 | 0.0470 | 0.0499 | 0.0493 | 0.0469 | 0.0490 | 0.0508 |
| 0.3 | 0.0418 | 0.0507 | 0.0519 | 0.0400 | 0.0499 | 0.0519 | 0.0380 | 0.0493 | 0.0470 |
| 0.4 | 0.0335 | 0.0512 | 0.0509 | 0.0317 | 0.0501 | 0.0497 | 0.0323 | 0.0492 | 0.0506 |
| 0.5 | 0.0237 | 0.0510 | 0.0515 | 0.0228 | 0.0496 | 0.0481 | 0.0231 | 0.0499 | 0.0514 |
| 0.6 | 0.0143 | 0.0503 | 0.0488 | 0.0138 | 0.0498 | 0.0521 | 0.0142 | 0.0498 | 0.0497 |
| 0.7 | 0.0072 | 0.0498 | 0.0495 | 0.0071 | 0.0498 | 0.0492 | 0.0068 | 0.0489 | 0.0492 |
| 0.8 | 0.0016 | 0.0502 | 0.0498 | 0.0015 | 0.0496 | 0.0519 | 0.0012 | 0.0483 | 0.0489 |
| 0.9 | <0.0001 | 0.0500 | 0.0518 | <0.0001 | 0.0497 | 0.0519 | <0.0001 | 0.0508 | 0.0488 |
Remark. In practice, in order to implement the proposed approach in a simple and rapid manner, one can apply a hybrid method for computing the p-values of the tests (15) and (16), combining Monte Carlo simulations with the critical values displayed in Tables 1 and 2. In this framework, employing Bayesian type procedures, we can derive relevant information from the Monte Carlo experiments via likelihood type functions, whereas the tabulated critical values can be used to reflect prior distributions (Vexler et al. [34]). The hybrid technique for computing p-values has been employed in STATA and R statistical packages (Vexler et al. [33]).
6. Power of the tests
It is clear that, in the nonparametric setting of testing bivariate normality, there are no uniformly most powerful decision making procedures. In this section, we exemplify several scenarios in which the power of the proposed tests is compared with that of the Shapiro-Wilk (SW) test and classical Mardia’s test (MT) for bivariate normality at the significance level of 5%. The following scenarios of source distributions were treated:
(A) X1, …, Xn ~ N1(0,1) and Yi = τiXi, i = 1, …, n, where the random variables τi = −1 or 1, i = 1, …, n, are i.i.d. and independent of X1, …, Xn with Pr(τ1 = −1) = 0.5. In this case, X1, …, Xn ~ N1(0,1) and Y1, …, Yn ~ N1(0,1), but (X1, Y1), …, (Xn, Yn) are not bivariate normal (a sketch generating this design is given after the list).
(B) X1, …, Xn are uniformly distributed over (−5,5) and independent of Y1, …, Yn that are uniformly distributed over (−5,5). This case presents a light tailed alternative distribution.
(C) Let ξ1, …, ξn and η1, …, ηn be i.i.d. random variables from N1(0,1). Define , i = 1, …, n, to examine a case of heavy tailed alternative distributions.
(D) Define , i = 1, …, n, with independent and identically U(0,1)-distributed random variables ξ’s and η’s to evaluate a case in which the central limit theorem can be applied to approximate the data distribution.
(E) (X,Y)’s follow Morgenstern’s distribution with parameter α = 0.5 (Johnson [10: p. 185]).
(F) (X,Y)’s follow Plackett’s distribution with parameter Ψ = 2 (Johnson [10: p. 193]).
(G) (X1,Y1), …, (Xn,Yn) follow Gumbel’s Type I logistic distribution (Johnson [10: p. 199]).
(H) Assume random vectors (Z1, W1)T, …, (Zn, Wn)T are from Gumbel’s bivariate exponential distribution with parameter θ = 0.9 (Johnson [10: p. 197]). In order to reduce the power of the considered tests, we define (Xi = Zi + ξi, Yi = Wi + ηi), where ξ’s and η’s are independent and identically U(−6,3)-distributed random variables.
(I) Define (Xi = ξi, Yi = ηi), i = 1, …, n, where ξ’s and η’s are independent and identically Gamma(2,1)-distributed random variables. Note that, commonly, sample entropy based tests for univariate normality do not outperform the corresponding Shapiro-Wilk test when underlying data are from a gamma distribution (e.g., Table 2 in Vasicek [30], where the case with X ~ Gamma(2,1) is evaluated).
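As an example of the data generation, design (A) can be produced in R as follows (the remaining designs are analogous):

```r
# Sketch of design (A): each margin is N(0,1), but the joint distribution
# is concentrated on the diagonals y = x and y = -x, hence is not
# bivariate normal.
gen_design_A <- function(n) {
  x <- rnorm(n)
  tau <- sample(c(-1, 1), n, replace = TRUE)  # Pr(tau_i = -1) = 0.5
  cbind(X = x, Y = tau * x)
}

set.seed(4)
xy <- gen_design_A(50)
```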
Table 4 shows the results of the power evaluations of the proposed tests ((15) and (16) with δ = 0.4 in definition (14)), the Shapiro-Wilk (SW) test and classical Mardia’s test (MT) for bivariate normality, via a Monte Carlo study based on 15,000 replications of (X1, Y1), …, (Xn, Yn) for the designs (A)–(I) given above at each sample size n = 35, 50, 70.
Table 4.
The Monte Carlo power of the tests at the significance level of 5%.
| Tests | Design (A) | Design (B) | Design (C) | ||||||
| Sample size (n) | Sample size (n) | Sample size (n) | |||||||
| | 35 | 50 | 70 | 35 | 50 | 70 | 35 | 50 | 70 |
| Test (15) | 0.5645 | 0.8327 | 0.9427 | 0.9295 | 0.9923 | 1 | 0.8228 | 0.9635 | 0.9945 |
| Test (16) | 0.6666 | 0.9249 | 0.9796 | 0.8907 | 0.9888 | 1 | 0.7771 | 0.9458 | 0.9914 |
| SW | 0.0570 | 0.0604 | 0.0613 | 0.6975 | 0.9316 | 0.9952 | 0.7744 | 0.9282 | 0.9878 |
| MT | 0.3711 | 0.3881 | 0.4108 | 0.0004 | 0.0002 | 0.0002 | 0.7392 | 0.8350 | 0.9156 |
| Design (D) | Design (E) | Design (F) | |||||||
| Test (15) | 0.0639 | 0.0643 | 0.0753 | 0.9226 | 0.9928 | 0.9998 | 0.9908 | 0.9998 | 1 |
| Test (16) | 0.0533 | 0.0639 | 0.0693 | 0.8748 | 0.9876 | 0.9994 | 0.9739 | 0.9990 | 1 |
| SW | 0.0375 | 0.0378 | 0.0381 | 0.6544 | 0.9098 | 0.9906 | 0.8987 | 0.9910 | 0.9999 |
| MT | 0.0230 | 0.0251 | 0.0257 | 0.0000 | 0.0001 | 0.0001 | 0.0033 | 0.0045 | 0.0049 |
| Design (G) | Design (H) | Design (I) | |||||||
| Test (15) | 1 | 1 | 1 | 0.4581 | 0.6234 | 0.7752 | 0.8959 | 0.9766 | 0.9969 |
| Test (16) | 0.9994 | 1 | 1 | 0.4160 | 0.6184 | 0.7796 | 0.8521 | 0.9745 | 0.9964 |
| SW | 0.8812 | 0.9801 | 0.9980 | 0.3783 | 0.5582 | 0.7535 | 0.9615 | 0.9972 | 0.9999 |
| MT | 0.9950 | 0.9997 | 0.9998 | 0.2043 | 0.3004 | 0.4178 | 0.8062 | 0.9527 | 0.9959 |
This study demonstrates that the dbEL ratio tests are superior to the considered classical tests under designs (A)–(H). The new tests have significantly improved powers as compared to the corresponding classical procedures. For example, the power of the proposed tests is roughly two times larger than that of the classical tests under scenarios (A) and (D). The proposed tests have approximately 10%–30% power gains as compared to the classical procedures when n = 35 in scenarios (E) and (H). It appears that the Shapiro-Wilk test is not efficient under designs (A) and (D). In scenario (D), the Shapiro-Wilk test is biased. Mardia’s test is biased under designs (B), (D), (E) and (F). In these scenarios, the new tests exhibit high and stable power characteristics. The proposed tests perform reasonably well, and are generally competitive with the classical tests, in case (I). In this scenario, it is anticipated that the Shapiro-Wilk test has higher power than the other considered tests. In parallel with studies regarding properties of tests for univariate normality, the shown Monte Carlo results are consistent with those related to one-dimensional sample entropy based tests (e.g., Vasicek [30]).
7. Data analysis
Myocardial infarction (MI) is commonly caused by blood clots blocking blood flow to the heart, leading to heart muscle injury. Heart disease is a leading cause of death, affecting roughly 20% or more of the population across ethnicities, according to the Centers for Disease Control and Prevention (e.g., Schisterman et al. [25, 26]).
We illustrate the application of the proposed approach based on a sample from a study evaluating biomarkers associated with myocardial infarction (MI). The study focused on residents of Erie and Niagara counties, 35–79 years of age. The New York State Department of Motor Vehicles drivers’ license rolls were used as the sampling frame for adults between 35 and 65 years of age, while the elderly sample (ages 65–79) was randomly chosen from the Health Care Financing Administration database. The biomarkers “high density lipoprotein (HDL)-cholesterol” and “vitamin E” are often used as discriminant factors between individuals with (MI=1) and without (MI=0) myocardial infarction (e.g., Schisterman et al. [25, 26]). The HDL-cholesterol levels were examined from a 12-hour fasting blood specimen for biochemical analysis at baseline. A total of 240 measurements of the biomarkers was evaluated in the study: a sample of 120 biomarker values was collected from cases who had survived an MI, and a sample of 120 measurements from controls with no previous MI.
Oftentimes, measurements related to biological processes follow a log-normal distribution (e.g., Limpert et al. [17]). The aim of this analysis is to investigate the joint distribution of the log-transformed vitamin E measurements, say X, and the log-transformed HDL-cholesterol measurements, say Y, with regard to MI disease. Toward this end, we implemented the new tests, the Shapiro-Wilk test (SW) and Mardia’s test (MT) for bivariate normality using the data described above. Figure S1 in the Supplement depicts the histograms based on the values of X and Y and the scatter plots based on (X, Y) for the case (MI=1) and control (MI=0) groups, respectively.
In this study, all four considered tests provided p-values < 0.045, rejecting the hypothesis that the observations (X1, Y1), …, (X120, Y120) are bivariate normally distributed (H0) for the case (MI=1) and control (MI=0) groups, respectively. Then, we organized a bootstrap/jackknife type study to examine the power performance of the test statistics. The strategy was that samples of sizes 35, 50 and 70 were randomly selected from the vitamin E/HDL-cholesterol data and tested for bivariate normality at the 5% level of significance. We repeated this strategy 5000 times, calculating the frequencies of the events {test (15) rejects H0}, {test (16) rejects H0}, {SW rejects H0} and {MT rejects H0}. The test statistics in (15) and (16) were computed with δ = 0.4 in definition (14). The obtained experimental powers of the four tests are shown in Table 5 (a sketch of the resampling strategy is given after the table).
Table 5.
The experimental powers of the tests at the significance level of 5%.
| Tests | MI=1 | MI=0 | ||||
|---|---|---|---|---|---|---|
| Sample size (n) | Sample size (n) | |||||
| | 35 | 50 | 70 | 35 | 50 | 70 |
| Test (15) | 0.2780 | 0.3838 | 0.4716 | 0.3904 | 0.4504 | 0.6442 |
| Test (16) | 0.1990 | 0.3108 | 0.4277 | 0.2799 | 0.4178 | 0.6189 |
| SW | 0.1152 | 0.1896 | 0.3119 | 0.2602 | 0.3812 | 0.5910 |
| MT | 0.0940 | 0.1580 | 0.2369 | 0.0510 | 0.0786 | 0.1544 |
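A minimal R sketch of the resampling strategy referenced above is given below; pval_fun is a placeholder for a function returning the p-value of the test (15) or (16) (see the supplementary R code), and mvShapiro.Test is from the CRAN package mvShapiroTest.

```r
# Sketch: experimental power via repeated subsampling of the biomarker
# data; 'pval_fun' maps an n x 2 data matrix to a p-value.
resample_power <- function(xy, n, pval_fun, B = 5000, alpha = 0.05) {
  mean(replicate(B, {
    idx <- sample(nrow(xy), n)
    pval_fun(xy[idx, , drop = FALSE]) < alpha
  }))
}

# Example with the Shapiro-Wilk type test (xy_log = log-transformed data):
# library(mvShapiroTest)
# resample_power(xy_log, n = 35,
#                pval_fun = function(d) mvShapiro.Test(as.matrix(d))$p.value)
```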
In this study, the proposed tests significantly outperform the SW and MT tests in terms of power when detecting that the log-transformed biomarker values are not jointly distributed as bivariate normal random variables. For example, when n = 35 and MI=1, the dbEL ratio tests yield experimental powers that are approximately two times larger than those of the SW and MT tests. That is, the dbEL ratio tests are more sensitive than the known methods in rejecting the null hypothesis of bivariate normality regarding the joint distribution of the log-transformed values of the vitamin E and HDL-cholesterol biomarkers.
8. Concluding remarks
In this paper, we extended the density based empirical likelihood approach to construct new goodness-of-fit tests for bivariate normality. The main idea of our method was to propose a consistent technique that employs histogram- and sample entropy-based density estimation in the bivariate framework. We compared the performance of the dbEL ratio tests to known decision making procedures, the Shapiro-Wilk test and Mardia’s test. The conducted simulation study showed that the proposed tests outperform the known tests in many important scenarios of alternative distributions, and that the new tests provide power levels similar to those of their univariate sample entropy based analogs. Finally, we applied our tests to a real data set, where the proposed technique exhibited high and stable power characteristics.
Certainly, the proposed testing strategy is computationally intensive. In this context, we note that the known principles of bivariate histogram construction involve strong computational requirements in general. In the modern age, we are generally no longer constrained by computational issues and have greater flexibility in terms of the statistical approaches that we may employ for data analysis problems, as fast and cheap computational facilities are now available to statisticians. This supports the suggestion that the dbEL methodology can be modified and extended in order to be applied to various multivariate problems encountered in statistical studies.
The main objective of this paper is twofold: (1) to show that the density based empirical likelihood technique can be a valuable tool in multivariate statistical analysis; and (2) to convince readers of the usefulness of the sample entropy based approach, which should be more widely investigated in multivariate frameworks.
Acknowledgments
Dr. Vexler’s effort was supported by the National Institutes of Health (NIH) grant 1G13LM012241–01.
Footnotes
Appendix. Supplementary data
Supplementary material related to this article can be found online.
References
- [1]. Arizono I and Ohta H. A test for normality based on Kullback-Leibler information. The American Statistician 43 (1989) 20–22.
- [2]. Balakrishnan N and Lai C-D. Continuous Bivariate Distributions. Springer, New York, 2009.
- [3]. Berrett TB, Samworth RJ and Yuan M. Efficient multivariate entropy estimation via k-nearest neighbour distances. arXiv:1606.00304 (https://arxiv.org/abs/1606.00304), 2017.
- [4]. Carolan C and Dykstra R. Asymptotic behavior of the Grenander estimator at density flat regions. The Canadian Journal of Statistics 27 (1999) 557–566.
- [5]. Efromovich S. Nonparametric Curve Estimation: Methods, Theory, and Applications. Springer-Verlag, New York, 1999.
- [6]. Hall P and Welsh AH. A test for normality based on the empirical characteristic function. Biometrika 70 (1983) 485–489.
- [7]. Hawkins DM. A new test for multivariate normality and homoscedasticity. Technometrics 23 (1981) 105–110.
- [8]. Henze N and Zirkler B. A class of invariant consistent tests for multivariate normality. Communications in Statistics - Theory and Methods 19 (1990) 3595–3618.
- [9]. Izenman AJ. Recent developments in nonparametric density estimation. Journal of the American Statistical Association 86 (1991) 205–224.
- [10]. Johnson ME. Multivariate Statistical Simulation. John Wiley & Sons, New York, 1987.
- [11]. Kim BK and Van Ryzin J. A bivariate histogram density estimator: consistency and asymptotic normality. Statistics & Probability Letters 3 (1985) 167–173.
- [12]. Kozachenko LF and Leonenko NN. Sample estimate of the entropy of a random vector. Problems of Information Transmission 23 (1987) 95–101.
- [13]. Kowalski CJ. The performance of some rough tests for bivariate normality before and after coordinate transformations to normality. Technometrics 12 (1970) 517–544.
- [14]. Lazar N and Mykland PA. An evaluation of the power and conditionality properties of empirical likelihood. Biometrika 85 (1998) 523–534.
- [15]. Lee R, Qian M and Shao Y. On rotational robustness of Shapiro-Wilk type tests for multivariate normality. Open Journal of Statistics 4 (2014) 964–969.
- [16]. Lehmann EL and Romano JP. Testing Statistical Hypotheses. Springer-Verlag, New York, 2005.
- [17]. Limpert E, Stahel WA and Abbt M. Log-normal distributions across the sciences: keys and clues. BioScience 51 (2001) 341–352.
- [18]. Looney SW. How to use tests for univariate normality to assess multivariate normality. The American Statistician 49 (1995) 64–70.
- [19]. Mecklin CJ and Mundfrom DJ. An appraisal and bibliography of tests for multivariate normality. International Statistical Review 72 (2004) 123–138.
- [20]. Mudholkar GS and Tian L. An entropy characterization of the inverse Gaussian distribution and related goodness-of-fit test. Journal of Statistical Planning and Inference 102 (2002) 211–221.
- [21]. Mudholkar GS and Tian L. A test for homogeneity of ordered means of inverse Gaussian populations. Journal of Statistical Planning and Inference 118 (2004) 37–49.
- [22]. Owen AB. Empirical Likelihood. CRC Press, Boca Raton, Florida, 2001.
- [23]. Prakasa Rao BLS. Nonparametric Functional Estimation. Academic Press, New York, 1983.
- [24]. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2012. http://www.R-project.org
- [25]. Schisterman EF, Faraggi D, Browne R, Freudenheim J, Dorn J, Muti P, Armstrong D, Reiser B and Trevisan M. TBARS and cardiovascular disease in a population-based sample. Journal of Cardiovascular Risk 8 (2001) 219–225.
- [26]. Schisterman EF, Faraggi D, Browne R, Freudenheim J, Dorn J, Muti P, Armstrong D, Reiser B and Trevisan M. Minimal and best linear combination of oxidative stress and antioxidant biomarkers to discriminate cardiovascular disease. Nutrition, Metabolism, and Cardiovascular Diseases 12 (2002) 259–266.
- [27]. Tsai W-M, Vexler A and Gurevich G. An extensive power evaluation of a novel two-sample density-based empirical likelihood ratio test for paired data with an application to a treatment study of attention-deficit/hyperactivity disorder and severe mood dysregulation. Journal of Applied Statistics 40 (2013) 1189–1208.
- [28]. Tusnady G. On asymptotically optimal tests. The Annals of Statistics 5 (1977) 385–393.
- [29]. Van Ryzin J. A histogram method of density estimation. Communications in Statistics 2 (1973) 493–506.
- [30]. Vasicek O. A test for normality based on sample entropy. Journal of the Royal Statistical Society, Series B 38 (1976) 54–59.
- [31]. Vexler A and Gurevich G. Empirical likelihood ratios applied to goodness-of-fit tests based on sample entropy. Computational Statistics & Data Analysis 54 (2010) 531–545.
- [32]. Vexler A, Gurevich G and Hutson AD. An exact density-based empirical likelihood ratio test for paired data. Journal of Statistical Planning and Inference 143 (2013) 334–345.
- [33]. Vexler A, Hutson AD and Chen X. Statistical Testing Strategies in the Health Sciences. Chapman & Hall/CRC, New York, 2016.
- [34]. Vexler A, Kim YM, Yu J, Lazar NA and Hutson AD. Computing critical values of exact tests by incorporating Monte Carlo simulations combined with statistical tables. Scandinavian Journal of Statistics 41 (2014) 1013–1030.
- [35]. Vexler A, Tsai W-M and Hutson AD. A simple density-based empirical likelihood ratio test for independence. The American Statistician 68 (2014) 158–169.
- [36]. Villaseñor-Alva JA and González-Estrada E. A generalization of Shapiro-Wilk's test for multivariate normality. Communications in Statistics - Theory and Methods 38 (2009) 1870–1883.
- [37]. Wilks SS. Mathematical Statistics. John Wiley and Sons, New York, 1962.
- [38]. Zhu L-X, Wong HL and Fang K-T. A test for multivariate normality based on sample entropy and projection pursuit. Journal of Statistical Planning and Inference 45 (1995) 373–385.