Journal of Applied Statistics. 2023 Aug 27;51(10):1961–1975. doi: 10.1080/02664763.2023.2251098

Goodness-of-fit test for the one-sided Lévy distribution

Aditi Kumari 1, Deepesh Bhati 1
PMCID: PMC11271099  PMID: 39071255

ABSTRACT

The main aim of this work is to develop a new goodness-of-fit test for the one-sided Lévy distribution. The proposed test is based on the scale-ratio approach, in which two estimators of the scale parameter of the one-sided Lévy distribution are compared. The asymptotic distribution of the test statistic is obtained under the null hypothesis. The performance of the test is demonstrated using simulated observations from various known distributions. Finally, two real-world datasets are analyzed.

KEYWORDS: Asymptotic normality, one-sided Lévy distribution, gamma distribution, Monte Carlo simulation

2010 MATHEMATICS SUBJECT CLASSIFICATIONS: 62E20, 62F03

1. Introduction

In 1920, the French mathematician Paul Lévy, together with Aleksander Khinchine, developed the theory of stable distributions [10]. Subsequently, it has been applied in different disciplines, such as economics, physics, hydrology, biology, and signal processing, to capture asymmetry, tail behavior, and high kurtosis in datasets [1]. The one-sided Lévy distribution is a special case of the stable distribution with positive support, and it has a heavier tail than any distribution with an exponential tail. Dumé [5] showed that the Lévy distribution could well describe the sequence of geomagnetic polarity reversals. Rogers [9] used the one-sided Lévy distribution to model the length of the paths followed by photons after reflection from a turbid medium. Despite its use in the statistical modeling of physical phenomena, the construction of goodness-of-fit tests for the one-sided Lévy distribution has not attracted much attention from researchers. Recently, Bhati and Kattumannil [3] proposed jackknife empirical likelihood (JEL) ratio and adjusted jackknife empirical likelihood (AJEL) ratio tests for the one-sided Lévy distribution. Motivated by the fact that very few goodness-of-fit tests for the one-sided Lévy distribution are available in the statistical literature, we propose a new test in this article.

This manuscript is structured as follows. In Section 2, we propose a new estimator of the scale parameter of the one-sided Lévy distribution. The proposed test and its asymptotic distribution under the null and alternative hypotheses are introduced in Section 3. Simulated critical values at different significance levels, the empirical power of the proposed test, and its comparison with existing tests are discussed using Monte Carlo simulation in Section 4. Finally, we apply our test to two real-world datasets in Section 5.

2. A new estimator of σ

A positive random variable (rv) $X$ is said to follow a one-sided Lévy distribution with scale parameter $\sigma>0$, denoted as $Lv(\sigma)$, if its probability density function (pdf) is of the form

$$f(x;\sigma)=\sqrt{\frac{\sigma}{2\pi}}\,x^{-3/2}\,e^{-\frac{\sigma}{2x}},\quad x>0, \tag{1}$$

and the cumulative distribution function (cdf) is given by

$$F(x)=2\left(1-\Phi\left(\left(\frac{\sigma}{x}\right)^{1/2}\right)\right),\quad x>0, \tag{2}$$

where $\Phi(\cdot)$ is the cdf of a standard normal random variable. The one-sided Lévy distribution can also be obtained as a particular case of the inverse gamma distribution with shape parameter $1/2$ and scale parameter $\sigma/2$, which implies that the one-sided Lévy distribution possesses finite inverse moments. Lemma 2.1 will be used later to construct our proposed test statistic.
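As a quick numerical illustration of the density (1) and the cdf (2), the following minimal sketch evaluates both with the Python standard library and checks that integrating the density reproduces the cdf. The function names and the midpoint-rule check are illustrative assumptions, not part of the paper.

```python
import math


def levy_pdf(x, sigma):
    """Density (1): sqrt(sigma/(2*pi)) * x**(-3/2) * exp(-sigma/(2*x)) for x > 0."""
    if x <= 0:
        return 0.0
    return math.sqrt(sigma / (2.0 * math.pi)) * x ** -1.5 * math.exp(-sigma / (2.0 * x))


def levy_cdf(x, sigma):
    """cdf (2): 2 * (1 - Phi(sqrt(sigma / x))), with Phi written via math.erf."""
    if x <= 0:
        return 0.0
    phi = 0.5 * (1.0 + math.erf(math.sqrt(sigma / x) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)


def levy_cdf_by_quadrature(x, sigma, steps=20000):
    """Midpoint-rule integral of the pdf over (0, x]; an independent check of (2)."""
    h = x / steps
    return h * sum(levy_pdf((i + 0.5) * h, sigma) for i in range(steps))
```

For example, `levy_cdf(2.0, 1.5)` and `levy_cdf_by_quadrature(2.0, 1.5)` should agree to several decimal places.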

Lemma 2.1

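Let $X\sim Lv(\sigma)$. Then, for $l=0,1,2,\ldots$ and $k=1,2,\ldots$, the rv $Y=1/X$ satisfies the relation

$$\sigma E\left(Y^{l+1}(\log Y)^{k}\right)=(2l+1)\,E\left(Y^{l}(\log Y)^{k}\right)+2k\,E\left(Y^{l}(\log Y)^{k-1}\right). \tag{3}$$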

Proof.

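Let $X\sim Lv(\sigma)$. Then the transformed rv $Y=1/X\sim \mathrm{Gamma}\left(\tfrac12,\tfrac{\sigma}{2}\right)$ with density $f_Y(y)=\sqrt{\frac{\sigma}{2\pi}}\,y^{-1/2}e^{-\sigma y/2}$, $y>0$, and hence

$$E\left(Y^{l}(\log Y)^{k}\right)=\sqrt{\frac{\sigma}{2\pi}}\int_0^\infty y^{l-1/2}(\log y)^{k}e^{-\sigma y/2}\,dy.$$

Noting that $\frac{d}{dy}\left(y^{l+1/2}(\log y)^{k}\right)=\left(l+\tfrac12\right)y^{l-1/2}(\log y)^{k}+k\,y^{l-1/2}(\log y)^{k-1}$, the above integral can be rewritten as

$$E\left(Y^{l}(\log Y)^{k}\right)=\sqrt{\frac{\sigma}{2\pi}}\,\frac{2}{2l+1}\int_0^\infty \frac{d}{dy}\left(y^{l+1/2}(\log y)^{k}\right)e^{-\sigma y/2}\,dy-\sqrt{\frac{\sigma}{2\pi}}\,\frac{2k}{2l+1}\int_0^\infty y^{l-1/2}(\log y)^{k-1}e^{-\sigma y/2}\,dy.$$

Solving the first integral by parts, we obtain

$$E\left(Y^{l}(\log Y)^{k}\right)=\frac{\sigma}{2l+1}E\left(Y^{l+1}(\log Y)^{k}\right)-\frac{2k}{2l+1}E\left(Y^{l}(\log Y)^{k-1}\right).$$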

Hence the lemma follows.
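The recurrence of Lemma 2.1 can be checked numerically. The sketch below is a Monte Carlo check under stated assumptions: it draws $Y=Z^2/\sigma$ with $Z$ standard normal (so that $Y\sim\mathrm{Gamma}(1/2,\sigma/2)$ and $1/Y\sim Lv(\sigma)$) and compares the sample averages of the two sides of (3). The function name, the number of draws and the seed are hypothetical choices.

```python
import math
import random


def lemma_sides(sigma, l, k, n_draws=200_000, seed=1):
    """Monte Carlo estimates of the left and right sides of relation (3)."""
    rng = random.Random(seed)
    lhs = 0.0
    rhs = 0.0
    for _ in range(n_draws):
        # Y = Z^2 / sigma follows Gamma(shape 1/2, rate sigma/2)
        y = rng.gauss(0.0, 1.0) ** 2 / sigma
        log_y = math.log(y)
        lhs += sigma * y ** (l + 1) * log_y ** k
        rhs += (2 * l + 1) * y ** l * log_y ** k + 2 * k * y ** l * log_y ** (k - 1)
    return lhs / n_draws, rhs / n_draws
```

With `l=0` and `k=1` both averages estimate $E(\log Y)+2$, so they should agree up to Monte Carlo error; higher values of `l` and `k` involve heavier-tailed summands and need more draws.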

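Let $\mathcal{X}:=\{X: X=\sigma\tilde X,\ \sigma>0\}$ be a generic collection of non-negative rvs with one (scale) parameter $\sigma$, where $\tilde X$ is a unit-scale rv. Then the covariance between $X$ and $\log X$ allows easy extraction of the scale parameter $\sigma$; that is,

$$\mathrm{cov}(X,\log X)=\mathrm{cov}(\sigma\tilde X,\log(\sigma\tilde X))=\sigma\,\mathrm{cov}(\tilde X,\log\tilde X),$$

which gives

$$\sigma=\frac{\mathrm{cov}(X,\log X)}{\mathrm{cov}(\tilde X,\log\tilde X)},$$

where $\mathrm{cov}(\tilde X,\log\tilde X)$ is a known constant depending on the distribution of $\tilde X$. We use this idea to estimate the scale parameter of the one-sided Lévy distribution. Let $X\sim Lv(\sigma)$; then $\tilde X$ follows $Lv(1)$. Define another collection of rvs using the inverse transform of $X$, say $\mathcal{Y}:=\{Y: Y=1/X,\ X\in\mathcal{X}\}$, and $\tilde Y=1/\tilde X$. Hence $Y$ and $\tilde Y$ follow $\mathrm{Gamma}\left(\tfrac12,\tfrac{\sigma}{2}\right)$ and $\mathrm{Gamma}\left(\tfrac12,\tfrac12\right)$, respectively. Hereinafter, we consider $Y=1/X$ and $Z=\log Y$. The rv $Y$ has mean $\mu_Y=\frac{1}{\sigma}$ and variance $\sigma_Y^2=\frac{2}{\sigma^2}$, and a simple computation gives the mean and variance of $Z$ as $\mu_Z=-\gamma-\log(2\sigma)$ and $\sigma_Z^2=\frac{\pi^2}{2}$, respectively, where $\gamma$ is Euler's constant. Then $\mathrm{cov}(Y,\log Y)=\mathrm{cov}\left(\frac{\tilde Y}{\sigma},\log\frac{\tilde Y}{\sigma}\right)=\frac{1}{\sigma}\mathrm{Cov}(\tilde Y,\log\tilde Y)$ gives $\sigma=\frac{\mathrm{Cov}(\tilde Y,\log\tilde Y)}{\mathrm{Cov}(Y,\log Y)}$. Substituting $E(\tilde Y\log\tilde Y)$, $E(\tilde Y)$ and $E(\log\tilde Y)$ into $\mathrm{Cov}(\tilde Y,\log\tilde Y)$, we get $\mathrm{Cov}(\tilde Y,\log\tilde Y)=2-\gamma-\log 2+\gamma+\log 2=2$, and therefore

$$\sigma=\frac{2}{\mathrm{Cov}(Y,\log Y)}. \tag{4}$$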
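The value of the constant can be verified directly from Lemma 2.1. The short derivation below is a sketch that takes $\sigma=1$, $l=0$ and $k=1$ in relation (3) and uses $E(\tilde Y)=1$ for $\tilde Y\sim\mathrm{Gamma}(1/2,1/2)$; no numerical value of $E(\log\tilde Y)$ is needed because it cancels.

```latex
\begin{align*}
E(\tilde Y \log \tilde Y) &= E(\log \tilde Y) + 2
  && \text{(relation (3) with } \sigma = 1,\ l = 0,\ k = 1\text{)} \\
\mathrm{Cov}(\tilde Y, \log \tilde Y)
  &= E(\tilde Y \log \tilde Y) - E(\tilde Y)\,E(\log \tilde Y) \\
  &= E(\log \tilde Y) + 2 - 1 \cdot E(\log \tilde Y) = 2 .
\end{align*}
```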

Remark 2.1

The above approach can be applied to any generic family of distributions that have finite moments/log-moments for estimating the scale parameter.

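For a random sample $X_1,X_2,\ldots,X_n$ of size $n$ from $Lv(\sigma)$, consider the transformed observations $Y_i=1/X_i$ and $Z_i=\log Y_i$, $i=1,2,\ldots,n$, with $\bar Y_n=\sum_{i=1}^n Y_i/n$ and $\bar Z_n=\sum_{i=1}^n Z_i/n$. The proposed new estimator $\tilde\sigma_n$ of $\sigma$, based on the sample covariance $\widehat{\mathrm{Cov}}_n(Y,Z)$, is given as

$$\tilde\sigma_n=\frac{2}{\widehat{\mathrm{Cov}}_n(Y,Z)}=\frac{2(n-1)}{\sum_{i=1}^n (Y_i-\bar Y_n)(Z_i-\bar Z_n)}. \tag{5}$$

The sample covariance can also be written as $\widehat{\mathrm{Cov}}_n(Y,Z)=\frac{1}{n-1}\sum_{i=1}^n (Y_i-\mu_Y)(Z_i-\mu_Z)-\frac{n}{n-1}(\bar Y_n-\mu_Y)(\bar Z_n-\mu_Z)=(n-1)^{-1}\sum_{i=1}^n (Y_i-\mu_Y)(Z_i-\mu_Z)+O_p(n^{-1})$, with $E(\widehat{\mathrm{Cov}}_n(Y,Z))=\frac{2}{\sigma}+O(n^{-1})$ and $\mathrm{Var}(\widehat{\mathrm{Cov}}_n(Y,Z))=\frac{1}{n}\mathrm{Var}((Y_i-\mu_Y)(Z_i-\mu_Z))$ to leading order. This shows that $\frac{1}{\tilde\sigma_n}$ is a consistent estimator of $\frac{1}{\sigma}$. By the invariance property of consistency, if $T_n$ is a consistent estimator of a parameter $\theta$ and $\xi(\cdot)$ is a continuous function, then $\xi(T_n)$ is a consistent estimator of $\xi(\theta)$. Applying this result with $\xi(t)=1/t$, we conclude that $\tilde\sigma_n$ is a consistent estimator of $\sigma$.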

For benchmark purposes, we use the maximum likelihood estimator of the scale parameter $\sigma$, which is the solution to the likelihood equation $\frac{n}{2\sigma}-\sum_{i=1}^n\frac{Y_i}{2}=0$ and is given by

$$\hat\sigma_n=(\bar Y_n)^{-1}, \tag{6}$$

and this estimator will be used later for the construction of our ratio test statistic.
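The two estimators can be computed in a few lines. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and the sampler relies on the fact that $\sigma/Z^2$ follows $Lv(\sigma)$ when $Z$ is standard normal (equivalently, $1/X=Z^2/\sigma\sim\mathrm{Gamma}(1/2,\sigma/2)$).

```python
import math
import random


def sample_levy(n, sigma, rng):
    """Draw n observations from Lv(sigma) as sigma / Z^2 with Z standard normal."""
    return [sigma / rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]


def mle_sigma(x):
    """Maximum likelihood estimator: the reciprocal of the mean of Y_i = 1/X_i."""
    y = [1.0 / xi for xi in x]
    return len(y) / sum(y)


def mcov_sigma(x):
    """Method-of-covariance estimator: 2 / sample covariance of (Y, log Y)."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cross = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
    return 2.0 * (n - 1) / cross
```

For a large sample, for instance `sample_levy(20000, 5.0, random.Random(0))`, both estimates should be close to the true scale 5.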

In Figure 1, we give the results of a comparative study of the estimates of σ obtained by the method of maximum likelihood (ML) and the method of covariance (MCoV) for different sample sizes. We consider samples of size n=20,50,100,200,400,600,800 and 1000. For each n, samples from Lv(σ), with σ varying from 0 to 10, are generated and the estimates of σ from the ML and MCoV methods are plotted against each other. In the figure, we observe that the estimates from both methods fall closer to the diagonal line as the sample size increases, which is important because the test introduced later relies on the two estimates being similar under the null hypothesis. To complement this, we compare the performance of the new estimator σ~n with σ^n by generating 100,000 samples each of size 10, 20, 50, 100 and 200 from Lv(5). For each sample of size n, we compute σ^n and σ~n and draw their box plots. Figure 2 gives the box plots of both estimators for the different sample sizes. From this figure, we observe that the inter-quartile range of the box plots decreases as the sample size increases, and that the median lines are very close to the true value σ=5. In addition, the MCoV estimate shows more dispersion than the ML estimate for all sample sizes; nevertheless, this spread decreases as the sample size increases. The MCoV estimate is also more right-skewed than the ML estimate for small samples. As exact expressions for the bias and variance of the MCoV estimator are not available in closed form, we use simulation to visualize their behavior as a function of n. Bias( σ^n), Bias( σ~n), MSE( σ^n) and MSE( σ~n) for sample sizes n=10,50,100,150,200,…,1500 are obtained from 100,000 samples of each size n and are drawn in Figure 3. Here, the bias and MSE decrease as the sample size increases, which suggests that, for both the ML and MCoV estimators, the bias and MSE approach 0. It can be further observed that, irrespective of sample size, the ML estimator has smaller bias and MSE than the MCoV estimator. Thus, our main purpose is not parameter fitting but the introduction of a new goodness-of-fit test based on this new estimator.
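A comparison of this kind can be reproduced on a small scale. The sketch below is illustrative only: it uses far fewer replications than the 100,000 reported above, draws Lévy variates as $\sigma/Z^2$ with $Z$ standard normal, and the function names, default replication count and seed are assumptions.

```python
import math
import random


def estimates(x):
    """Return (ML estimate, MCoV estimate) of sigma from a positive sample x."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cov_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z)) / (n - 1)
    return 1.0 / y_bar, 2.0 / cov_yz


def bias_and_mse(n, sigma, reps=2000, seed=0):
    """Monte Carlo bias and MSE of both estimators for samples of size n."""
    rng = random.Random(seed)
    err_sum = [0.0, 0.0]   # running sums of (estimate - sigma) for ML, MCoV
    sq_sum = [0.0, 0.0]    # running sums of squared errors for ML, MCoV
    for _ in range(reps):
        x = [sigma / rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
        for j, est in enumerate(estimates(x)):
            err = est - sigma
            err_sum[j] += err
            sq_sum[j] += err * err
    return {
        "bias_ml": err_sum[0] / reps,
        "bias_mcov": err_sum[1] / reps,
        "mse_ml": sq_sum[0] / reps,
        "mse_mcov": sq_sum[1] / reps,
    }
```

Calling `bias_and_mse(50, 5.0)` returns the four summaries for $n=50$ and $\sigma=5$; repeating the call over a grid of sample sizes gives curves analogous to those described for Figure 3.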

Figure 1.

Scatter plot of the estimates of σ obtained by maximum likelihood (MLE) and by the method of covariance (MCoV) for different sample sizes.

Figure 2.

Box plots of the estimates of σ obtained by maximum likelihood (MLE) and by the method of covariance (MCoV), from 10,000 samples of size n = 10, 20, 50, 100, and 200 from Lv(σ) with σ=5.

Figure 3.

Bias and mean squared error (MSE) of the maximum likelihood estimator (MLE, in blue) and the method of covariance (MCoV) estimator (in red) for σ=0.5 and σ=2.

3. Scale-ratio test

Let $X$ be a rv with continuous cdf $F$ with support $\mathbb{R}^{+}$ and let $\mathcal{L}$ denote the one-sided Lévy family of distributions having density (1). We propose a goodness-of-fit test for the composite null hypothesis $H_0: F\in\mathcal{L}$ versus the alternative hypothesis $H_1: F\notin\mathcal{L}$, based on a random sample $X_1,\ldots,X_n$ of size $n$ from $F$. Similar to the well-known Shapiro–Wilk test (see [11]) for testing normality, we construct our test statistic as the ratio of the estimators $\tilde\sigma_n$ and $\hat\sigma_n$, that is, $V_n=\tilde\sigma_n/\hat\sigma_n=\frac{2\bar Y_n}{\widehat{\mathrm{Cov}}_n(Y,Z)}$. Note that $V_n$ is a pivotal quantity with respect to $\sigma$. Under $H_0$ both estimators are consistent for $\sigma$, so the ratio $V_n$ is expected to be close to one. In the following section, we discuss the asymptotic distribution of $V_n$ and propose the test statistic.

3.1. Asymptotic distribution of Vn

Theorem 3.1

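Let $X_i$, $i=1,2,\ldots,n$, be iid positive rvs having finite second inverse moment and finite second log moment, and set $Y_i=\frac{1}{X_i}$ and $Z_i=\log Y_i$. Then, for $T_n=(\bar Y_n,\widehat{\mathrm{Cov}}_n(Y,Z))^{\top}$, we have

$$\sqrt{n}\,(T_n-\Psi)\xrightarrow{d}N(\mathbf{0},\Sigma),$$

where $\xrightarrow{d}$ denotes convergence in distribution, $\mathbf{0}=(0,0)^{\top}$, $\Psi=(\mu_Y,\sigma_{YZ})^{\top}$ and $\Sigma=\begin{pmatrix}\mathrm{Var}(U_1)&\mathrm{Cov}(U_1,U_2)\\ \mathrm{Cov}(U_2,U_1)&\mathrm{Var}(U_2)\end{pmatrix}$, with $U_1=(Y_1-\mu_Y)$ and $U_2=(Y_1-\mu_Y)(Z_1-\mu_Z)-\sigma_{YZ}$. Moreover, for $V_n=\frac{2\bar Y_n}{\widehat{\mathrm{Cov}}_n(Y,Z)}$, we have

$$\sqrt{n}\left(V_n-\frac{2\mu_Y}{\sigma_{YZ}}\right)\xrightarrow{d}N(0,\eta^2),$$

where $\eta^2=[\nabla h(\Psi)]^{\top}\Sigma\,\nabla h(\Psi)$ with $h(x_1,x_2)=\frac{2x_1}{x_2}$ and hence $\nabla h(\Psi)=\left(\frac{2}{\sigma_{YZ}},-\frac{2\mu_Y}{\sigma_{YZ}^2}\right)^{\top}$.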

Proof.

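Notice that $\widehat{\mathrm{Cov}}_n(Y,Z)=\frac{1}{n-1}\sum_{i=1}^{n}(Y_i-\mu_Y)(Z_i-\mu_Z)-\frac{n}{n-1}(\bar Y_n-\mu_Y)(\bar Z_n-\mu_Z)=(n-1)^{-1}\sum_{i=1}^n(Y_i-\mu_Y)(Z_i-\mu_Z)+O_p(n^{-1})$. Also, by the Central Limit Theorem, we have $\bar Y_n-\mu_Y=O_p(n^{-1/2})$. Therefore, for any values $t_1$ and $t_2$, we consider

$$I_n=\sqrt{n}\left\{t_1(\bar Y_n-\mu_Y)+t_2\left(\widehat{\mathrm{Cov}}_n(Y,Z)-\sigma_{YZ}\right)\right\}=\frac{1}{\sqrt n}\sum_{i=1}^n\left[t_1(Y_i-\mu_Y)+t_2\left((Y_i-\mu_Y)(Z_i-\mu_Z)-\sigma_{YZ}\right)\right]+O_p(n^{-1/2})=\frac{1}{\sqrt n}\sum_{i=1}^n R_i+O_p(n^{-1/2}),$$

where $R_i=t_1(Y_i-\mu_Y)+t_2\left((Y_i-\mu_Y)(Z_i-\mu_Z)-\sigma_{YZ}\right)$, $i=1,2,\ldots,n$, are iid rvs with $E(R_i)=0$ and $\mathrm{Var}(R_i)=t^{\top}\Sigma t$, with $t=(t_1,t_2)^{\top}$. By the Central Limit Theorem along with Slutsky's theorem, we conclude that $I_n$ is asymptotically normally distributed. Since $t$ is arbitrary, by the Cramér–Wold theorem [4, p. 9], we get

$$\sqrt n\left\{(\bar Y_n,\widehat{\mathrm{Cov}}_n(Y,Z))^{\top}-(\mu_Y,\sigma_{YZ})^{\top}\right\}\xrightarrow{d}N(\mathbf{0},\Sigma).$$

Define $h:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ such that $h(x_1,x_2)=2x_1/x_2$. Since $V_n=h(\bar Y_n,\widehat{\mathrm{Cov}}_n(Y,Z))$, we apply the multivariate delta theorem to obtain the desired result. That is, let $\Psi=(\mu_Y,\sigma_{YZ})^{\top}$ and notice that $h_1(x_1,x_2)=2/x_2$ and $h_2(x_1,x_2)=-2x_1/x_2^2$, where $h_i(x_1,x_2)=\frac{\partial}{\partial x_i}h(x_1,x_2)$. Hence, $\nabla h(\Psi)=\left(\frac{2}{\sigma_{YZ}},-\frac{2\mu_Y}{\sigma_{YZ}^2}\right)^{\top}$.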
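For reference, the quadratic form that defines the asymptotic variance in Theorem 3.1 can be written out term by term. The expansion below is a sketch that only multiplies out $[\nabla h(\Psi)]^{\top}\Sigma\,\nabla h(\Psi)$ for $h(x_1,x_2)=2x_1/x_2$, using the partial derivatives $2/x_2$ and $-2x_1/x_2^2$ evaluated at $\Psi=(\mu_Y,\sigma_{YZ})$.

```latex
\begin{align*}
\eta^2
 &= h_1(\Psi)^2\,\mathrm{Var}(U_1)
    + 2\,h_1(\Psi)\,h_2(\Psi)\,\mathrm{Cov}(U_1,U_2)
    + h_2(\Psi)^2\,\mathrm{Var}(U_2) \\
 &= \frac{4}{\sigma_{YZ}^{2}}\,\mathrm{Var}(U_1)
    - \frac{8\,\mu_Y}{\sigma_{YZ}^{3}}\,\mathrm{Cov}(U_1,U_2)
    + \frac{4\,\mu_Y^{2}}{\sigma_{YZ}^{4}}\,\mathrm{Var}(U_2).
\end{align*}
```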

Theorem 3.1 gives rise to the following corollary, which tells us what happens under H0.

Corollary 3.1

Under $H_0$, if $X\sim Lv(\sigma)$, then $\sqrt n\,(V_n-1)\xrightarrow{d}N(0,\eta^2)$, where $\eta^2=\frac12(\sigma_Z^2-2)=\frac14(\pi^2-4)$.

Proof.

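When $Y\sim\mathrm{Gamma}\left(\tfrac12,\tfrac{\sigma}{2}\right)$, by using Lemma 2.1 and proceeding iteratively, we have the following equations:

$$E\left(\left(Y-\tfrac{1}{\sigma}\right)\log Y\right)=\frac{2}{\sigma}, \tag{7}$$

$$E\left(\left(Y-\tfrac{1}{\sigma}\right)^2\log Y\right)=\frac{2}{\sigma}E(Y\log Y). \tag{8}$$

By using (3), (7) and (8), it is possible to obtain the covariance expression as

$$\begin{aligned}\mathrm{cov}\left(Y-\mu_Y,(Y-\mu_Y)(Z-\mu_Z)\right)&=E\left((Y-\mu_Y)\left((Y-\mu_Y)(Z-\mu_Z)-\tfrac{2}{\sigma}\right)\right)\\&=E\left(\left(Y-\tfrac1\sigma\right)\left(\left(Y-\tfrac1\sigma\right)(\log Y-\mu_Z)-\tfrac{2}{\sigma}\right)\right)\\&=\frac{2}{\sigma}E(Y\log Y)-\frac{2}{\sigma^2}\mu_Z\\&=\frac{2}{\sigma^2}E(\log Y)+\frac{4}{\sigma^2}-\frac{2}{\sigma^2}\mu_Z=\frac{4}{\sigma^2}.\end{aligned}$$

Now, from Lemma 2.1, we obtain the following relation

$$E\left(\left(Y-\tfrac1\sigma\right)^2(\log Y)^2\right)=\frac{2}{\sigma^2}\left(E((\log Y)^2)+4E(\log Y)+4\right),$$

and

$$\begin{aligned}\mathrm{Var}\left((Y-\mu_Y)(Z-\mu_Z)\right)&=E\left((Y-\mu_Y)^2(Z-\mu_Z)^2\right)-\left(E\left((Y-\mu_Y)(Z-\mu_Z)\right)\right)^2\\&=E\left(\left(Y-\tfrac1\sigma\right)^2(\log Y-\mu_Z)^2\right)-\frac{4}{\sigma^2}\\&=\frac{2}{\sigma^2}(\sigma_Z^2+2).\end{aligned}$$

Therefore, if we denote by $\Sigma_{i,j}$, $i,j=1,2$, the elements of the covariance matrix $\Sigma$, then $\Sigma_{1,1}=\frac{2}{\sigma^2}$, $\Sigma_{2,2}=\frac{2}{\sigma^2}(\sigma_Z^2+2)$ and $\Sigma_{1,2}=\Sigma_{2,1}=\frac{4}{\sigma^2}$. Hence, by Theorem 3.1, for $\nabla h(\Psi)=\left(\sigma,-\frac{\sigma}{2}\right)^{\top}$, $\eta^2=[\nabla h(\Psi)]^{\top}\Sigma\,\nabla h(\Psi)=\frac12(\sigma_Z^2-2)$.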
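The final step of the proof is a small quadratic form, which can be evaluated numerically to confirm that $\sigma$ cancels. The sketch below plugs in the entries $\Sigma_{1,1}=2/\sigma^2$, $\Sigma_{1,2}=4/\sigma^2$, $\Sigma_{2,2}=(2/\sigma^2)(\sigma_Z^2+2)$ with $\sigma_Z^2=\pi^2/2$ and the gradient $(\sigma,-\sigma/2)$; the function name is illustrative.

```python
import math


def eta_squared(sigma):
    """Asymptotic variance of sqrt(n) * (V_n - 1) under the Levy null."""
    var_z = math.pi ** 2 / 2.0                  # variance of Z = log Y
    s11 = 2.0 / sigma ** 2                      # Var(U1)
    s12 = 4.0 / sigma ** 2                      # Cov(U1, U2)
    s22 = 2.0 / sigma ** 2 * (var_z + 2.0)      # Var(U2)
    g1, g2 = sigma, -sigma / 2.0                # gradient of h at Psi
    return g1 * g1 * s11 + 2.0 * g1 * g2 * s12 + g2 * g2 * s22
```

For any positive `sigma` the result equals $(\pi^2-4)/4$, which is why the limiting distribution in Corollary 3.1 is free of the scale parameter.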

In view of Corollary 3.1, for testing the one-sided Lévy distribution hypothesis we propose the test statistic

$$V_n^{*}=\sqrt{\frac{4n}{\pi^2-4}}\,(V_n-1), \tag{9}$$

which is asymptotically distributed as standard normal $N(0,1)$ under $H_0$. Figure 4 shows the histogram of the test statistic $V_n^{*}$ obtained from 100,000 samples drawn from $Lv(1)$ for different sample sizes (the test statistic does not depend on $\sigma$; hence, for brevity, we draw samples from the Lévy distribution with $\sigma=1$). We further superimpose the kernel density plot (in red) and the standard normal density (in blue) on the histogram to visualize the behavior of the test statistic. For small sample sizes, the distribution of $V_n^{*}$ is right-skewed; this occurs because the estimator $\tilde\sigma_n$ has relatively more dispersion than the estimator $\hat\sigma_n$, see Figure 2. However, the kernel density plot becomes nearly symmetric and overlaps with the standard normal density once the sample size exceeds 250. There are criteria in the literature which can be used to assess the closeness of the simulated type I error rates to the nominal size, see for example [2]. Therefore, for testing $H_0$ based on the sample $X_1,X_2,\ldots,X_n$ of size $n$ from a continuous cdf $F$ with support on the positive real numbers, the null hypothesis $H_0$ is rejected at the significance level (SL) $\alpha$ if $V_n^{*}$ deviates away from 0. That is, for a relatively small sample size, reject $H_0$ at level $\alpha$ if $V_n^{*}<c^{v}_{n,\alpha/2}$ or $V_n^{*}>c^{v}_{n,1-\alpha/2}$, where $c^{v}_{n,\alpha/2}$ and $c^{v}_{n,1-\alpha/2}$ are such that

$$1-\alpha=P\left(c^{v}_{n,\alpha/2}<V_n^{*}<c^{v}_{n,1-\alpha/2}\,\middle|\,H_0\right).$$

Note that the statistic $V_n^{*}$ does not depend on the unknown parameter $\sigma$ and hence is scale invariant. Thus, we use $\sigma=1$ for the computation of empirical size and empirical power. On the other hand, for large samples, we reject the null hypothesis at SL $\alpha$ if $|V_n^{*}|>z_{1-\alpha/2}$, where $z_{1-\alpha/2}$ is the $100(1-\alpha/2)\%$ quantile of the standard normal distribution. This relies on the fact that, for large sample sizes, the upper quantiles of the null distribution of $|V_n^{*}|$ obtained by MC simulation are close to the upper quantiles of the standard half-normal distribution, which is the asymptotic null distribution of $|V_n^{*}|$ (see [12]).
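The standardized statistic in (9) and the large-sample decision rule can be sketched as follows. This is a minimal illustration with the Python standard library, not the authors' code; the function names are assumptions, and the large-sample rule is only appropriate for sample sizes where the normal approximation is adequate (the text above suggests roughly 250 or more).

```python
import math
from statistics import NormalDist


def scale_ratio_statistic(x):
    """Standardized scale-ratio statistic (9) for a sample x of positive values."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cov_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z)) / (n - 1)
    v_n = 2.0 * y_bar / cov_yz                       # ratio of MCoV to ML estimate
    return math.sqrt(4.0 * n / (math.pi ** 2 - 4.0)) * (v_n - 1.0)


def reject_large_sample(x, alpha=0.05):
    """Two-sided asymptotic test: reject when |statistic| exceeds z_{1 - alpha/2}."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(scale_ratio_statistic(x)) > z_crit
```

Multiplying every observation by a positive constant leaves `scale_ratio_statistic` unchanged, which reflects the scale invariance noted above.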

Figure 4.

Histogram and kernel density plot (in red) of the test statistic $V_n^{*}$, together with the density of the standard normal distribution (in blue), obtained from 100,000 realizations from Lv(1) for different sample sizes n.

4. Simulation study

To investigate the performance of the $V_n^{*}$ test given in (9), we first evaluate the critical values at SL α by Monte Carlo simulation for different sample sizes. As the proposed test statistic is asymptotically normally distributed, we also examine whether the critical values from the asymptotic distribution can be used in place of the simulated critical values. These critical values are then used to obtain the empirical size and power of the test. Finally, we compare the empirical power of the $V_n^{*}$ test with the Jackknife Empirical Likelihood (JEL) and the Adjusted Jackknife Empirical Likelihood (AJEL) tests introduced in Ref. [3]. To begin the investigation, we use the following stepwise procedure to obtain the simulated critical values of $V_n^{*}$.

  • Step 1

    Fix the sample size $n$ and $\sigma=1$.

  • Step 2

    Generate a sample $x_1,\ldots,x_n$ of size $n$ from $Lv(\sigma)$.

  • Step 3

    Compute $y_i=1/x_i$ and $z_i=\log y_i$, $i=1,\ldots,n$.

  • Step 4

    Calculate $V_n=\dfrac{2(n-1)\sum_{i=1}^n y_i}{n\sum_{i=1}^n (y_i-\bar y)(z_i-\bar z)}$ and then obtain $V_n^{*}=\sqrt{\dfrac{4n}{\pi^2-4}}\,(V_n-1)$.

  • Step 5

    Repeat Steps 2–4 $B$ times to get the realizations of $V_n^{*}$, say $V^{*}_{n,b}$, $b=1,\ldots,B$.

  • Step 6

    Then, for sample size $n<250$, the critical value at SL $\alpha$, $c^{v}_{n,\alpha}$, is the $\alpha$-th quantile of $V^{*}_{n,1},V^{*}_{n,2},\ldots,V^{*}_{n,B}$, and for sample size $n\ge 250$, the critical value at SL $\alpha$ is the $2\alpha$-th upper quantile of $|V^{*}_{n,1}|,|V^{*}_{n,2}|,\ldots,|V^{*}_{n,B}|$.
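The steps above can be sketched in code for the small-sample case. The block below is illustrative rather than a reproduction of the authors' simulation: it uses a much smaller B than the 100,000 replications reported in the text, draws Lévy variates as $1/Z^2$ with $Z$ standard normal (so $\sigma=1$), and the function names and the simple order-statistic quantile rule are assumptions.

```python
import math
import random


def scale_ratio_statistic(x):
    """Steps 3 and 4: transform the sample and compute the standardized statistic."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cross = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z))
    v_n = 2.0 * (n - 1) * sum(y) / (n * cross)
    return math.sqrt(4.0 * n / (math.pi ** 2 - 4.0)) * (v_n - 1.0)


def empirical_quantile(sorted_values, p):
    """Order-statistic quantile: the ceil(p * B)-th smallest value."""
    b = len(sorted_values)
    idx = min(b - 1, max(0, math.ceil(p * b) - 1))
    return sorted_values[idx]


def simulated_critical_values(n, levels, B=2000, seed=0):
    """Steps 1, 2, 5 and 6: simulate the null distribution and read off quantiles."""
    rng = random.Random(seed)
    stats = sorted(
        scale_ratio_statistic([1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(n)])
        for _ in range(B)
    )
    return {p: empirical_quantile(stats, p) for p in levels}
```

For example, `simulated_critical_values(20, (0.05, 0.95))` returns a lower and an upper critical value for $n=20$; increasing `B` reduces the Monte Carlo noise in these quantiles.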

The above procedure is used to obtain the simulated critical values of the $V_n^{*}$ test at different SL α, both for small and moderate sample sizes and for large sample sizes. For sample sizes n ≥ 250, the simulated critical values are close to the corresponding quantiles of the standard half-normal distribution. Hence, as a rule of thumb, we suggest using the simulated critical values for n<250, while for n ≥ 250 the quantiles of the half-normal distribution can be used as critical values. These results are shown in Tables 1 and 2, respectively. Table 1 provides the critical constants of the test corresponding to SL α=0.025, 0.05, 0.90, and 0.95 for samples of size n=10,15,20,25,30,…,250. These values were obtained as the average of three runs of 100,000 MC samples each from the one-sided Lévy distribution with σ=1. Table 2 contains the 90% and 95% quantiles of $|V_n^{*}|$ under H0 for n=250,500,1000,1500,…,5000, which were obtained from 100,000 MC samples each. These values are close to the 95% and 97.5% quantiles of the asymptotic standard normal distribution (1.645 and 1.960). The empirical size of the test is then obtained by generating 100,000 samples of each size n from the null hypothesis, i.e. Lv(1), and computing the proportion of samples for which the test statistic falls in the critical region. In Figure 5, we show the empirical size at the 5% SL using the simulated critical values (red) and the critical values from the asymptotic normal distribution (blue). We observe that the empirical size obtained from the simulated critical values falls close to the nominal level of 0.05 for small and moderate sample sizes, whereas it falls away from the nominal level when the critical values from the asymptotic normal distribution are used. However, for large sample sizes (n ≥ 250), the empirical size obtained with either the simulated critical values or the asymptotic critical values falls close to the nominal level. Therefore, we use the simulated critical values for the computation of empirical power.

Table 2.

Upper quantiles of $|V_n^{*}|$ under $H_0$ for different sample sizes, $n\ge 250$.

n  $c^{v}_{n,0.90}$  $c^{v}_{n,0.95}$
250 1.653 1.994
500 1.655 1.985
1000 1.653 1.981
1500 1.645 1.960
2000 1.648 1.964
2500 1.644 1.964
3000 1.650 1.964
3500 1.647 1.961
4000 1.647 1.959
4500 1.641 1.963
5000 1.646 1.959

Table 1.

Simulated critical values of the $V_n^{*}$ test for different SL α.

n  $c^{v}_{n,0.025}$  $c^{v}_{n,0.05}$  $c^{v}_{n,0.90}$  $c^{v}_{n,0.95}$  n  $c^{v}_{n,0.025}$  $c^{v}_{n,0.05}$  $c^{v}_{n,0.90}$  $c^{v}_{n,0.95}$
10 −1.225 −1.085 3.3590 4.584 130 −1.671 −1.437 1.904 2.313
15 −1.315 −1.154 2.8075 3.664 135 −1.685 −1.441 1.908 2.321
20 −1.378 −1.204 2.5551 3.291 140 −1.684 −1.439 1.906 2.334
25 −1.432 −1.243 2.4329 3.072 145 −1.685 −1.439 1.893 2.313
30 −1.459 −1.270 2.3180 2.916 150 −1.695 −1.445 1.894 2.318
35 −1.494 −1.292 2.2659 2.849 155 −1.702 −1.444 1.891 2.303
40 −1.513 −1.306 2.2034 2.758 160 −1.704 −1.453 1.894 2.305
45 −1.526 −1.320 2.1599 2.673 165 −1.708 −1.449 1.883 2.287
50 −1.539 −1.328 2.1346 2.652 170 −1.714 −1.457 1.887 2.298
55 −1.557 −1.336 2.0933 2.601 175 −1.710 −1.457 1.879 2.285
60 −1.572 −1.350 2.0757 2.566 180 −1.710 −1.458 1.876 2.274
65 −1.578 −1.364 2.0541 2.514 185 −1.720 −1.469 1.857 2.258
70 −1.596 −1.371 2.0333 2.516 190 −1.716 −1.458 1.880 2.275
75 −1.600 −1.376 2.0280 2.488 195 −1.722 −1.467 1.859 2.260
80 −1.623 −1.389 2.0158 2.462 200 −1.733 −1.476 1.857 2.250
85 −1.624 −1.395 1.9824 2.441 205 −1.732 −1.473 1.850 2.247
90 −1.633 −1.396 1.9768 2.399 210 −1.728 −1.471 1.860 2.258
95 −1.632 −1.402 1.9668 2.414 215 −1.742 −1.478 1.855 2.238
100 −1.645 −1.410 1.9437 2.400 220 −1.737 −1.476 1.845 2.235
105 −1.654 −1.412 1.9513 2.377 225 −1.747 −1.485 1.859 2.236
110 −1.651 −1.417 1.9357 2.375 230 −1.744 −1.474 1.846 2.243
115 −1.665 −1.420 1.9407 2.365 235 −1.742 −1.484 1.851 2.243
120 −1.672 −1.424 1.9254 2.343 240 −1.743 −1.487 1.839 2.232
125 −1.665 −1.426 1.9284 2.346 245 −1.749 −1.488 1.850 2.230
          250 −1.753 −1.491 1.836 2.227

Figure 5.

Empirical size obtained using the simulated critical values (in red) and the critical values of the asymptotic normal distribution (in blue) for different sample sizes n at SL α=0.05.

In order to compute the empirical power of the $V_n^{*}$, JEL and AJEL tests, we considered several families of distributions from the alternative hypothesis with support [0,∞), namely Lognormal (0, 1), Chi-square (4), Gamma (2, 3), Fréchet (0.5, 0, 1), Pareto (1.5, 1), Log-Gamma (3, 2), Pareto (0.75, 1), Inverse Gaussian (1, 1.5), Burr (1.5, 0.5, 0.5, 1), Rayleigh (1), Weibull (1.75, 1) and Half-Normal (1). These results are shown in Table 3. For each alternative family and each sample size n, we generate 100,000 independent samples of size n and obtain the value of each test statistic. The proportion of samples for which the test statistic falls in the critical region is then recorded as the empirical power. The critical regions for JEL and AJEL are obtained as suggested in Ref. [3]. For alternatives whose right tail is heavier than that of the normal distribution, namely Lognormal, Chi-square, Gamma, Fréchet, Pareto, Log-Gamma, Burr and Inverse Gaussian, the $V_n^{*}$ test shows better empirical power than JEL and AJEL, whereas for distributions with a tail equivalent to that of the normal distribution, such as Rayleigh, Weibull and Half-Normal, the $V_n^{*}$ test has relatively less power than JEL and AJEL for large sample sizes. In addition, we generate random variables from a stable distribution with skewness parameter β=1 and with the stability parameter varied from 0.05 to 0.99, using the R package ‘stabledist’. In Figure 6, the empirical power curve for the $V_n^{*}$ test is presented. The empirical power was computed using samples of size n = 200 and 10,000 repetitions. As observed from Figure 6, the empirical power decreases as the stability parameter approaches 0.5 and increases as its value deviates from 0.5. It is noteworthy that when the stability parameter is 0.5, the empirical power is close to 0.05, consistent with the case of the one-sided Lévy distribution, as anticipated.
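An empirical power computation of this kind can be sketched as follows. The block is a small-scale illustration under stated assumptions: it simulates its own two-sided critical values at levels α/2 and 1−α/2 from Lv(1) instead of using the tabulated ones, uses far fewer replications than the paper, and covers only the proposed test (the JEL and AJEL tests of Ref. [3] are not implemented). Function names, defaults and the quantile rule are hypothetical.

```python
import math
import random


def scale_ratio_statistic(x):
    """Standardized scale-ratio statistic for a sample x of positive values."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cov_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z)) / (n - 1)
    return math.sqrt(4.0 * n / (math.pi ** 2 - 4.0)) * (2.0 * y_bar / cov_yz - 1.0)


def empirical_power(sampler, n, alpha=0.05, reps=1000, B=2000, seed=0):
    """Proportion of samples from `sampler` rejected by the two-sided test."""
    rng = random.Random(seed)
    # Null distribution of the statistic, simulated from Lv(1).
    null_stats = sorted(
        scale_ratio_statistic([1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(n)])
        for _ in range(B)
    )
    lo = null_stats[max(0, math.ceil(alpha / 2.0 * B) - 1)]
    hi = null_stats[min(B - 1, math.ceil((1.0 - alpha / 2.0) * B) - 1)]
    rejections = 0
    for _ in range(reps):
        t = scale_ratio_statistic([sampler(rng) for _ in range(n)])
        if t < lo or t > hi:
            rejections += 1
    return rejections / reps
```

As a usage example, `empirical_power(lambda r: r.lognormvariate(0.0, 1.0), n=100)` estimates the power against the Lognormal (0, 1) alternative, while passing a Lévy sampler such as `lambda r: 1.0 / r.gauss(0.0, 1.0) ** 2` gives an estimate of the empirical size.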

Table 3.

Proportion of samples falling in the critical region for different families of distributions from the alternative class.

  Lognormal (0, 1) Chi-square (4) Gamma (2, 3)
n Vn JEL AJEL Vn JEL AJEL Vn JEL AJEL
10 0.401 0.116 0.119 0.649 0.248 0.255 0.652 0.250 0.257
20 0.704 0.212 0.217 0.818 0.437 0.445 0.814 0.423 0.432
30 0.831 0.297 0.304 0.869 0.638 0.643 0.869 0.636 0.644
50 0.938 0.524 0.482 0.917 0.903 0.905 0.913 0.895 0.899
100 0.988 0.868 0.856 0.949 0.934 0.942 0.959 0.948 0.948
  Fréchet Pareto (0.75, 1) Log-Gamma (3, 2)
n Vn JEL AJEL Vn JEL AJEL Vn JEL AJEL
10 0.165 0.056 0.054 0.483 0.087 0.092 0.827 0.163 0.171
20 0.313 0.066 0.066 0.845 0.109 0.112 0.998 0.270 0.279
30 0.429 0.072 0.072 0.964 0.158 0.161 1.000 0.433 0.439
50 0.636 0.076 0.077 0.999 0.302 0.308 1.000 0.725 0.731
100 0.895 0.094 0.091 1.000 0.631 0.612 1.000 0.973 0.938
  Pareto (1.5, 1) Burr (1.5, 0.5, 0.5) Inverse-Gaussian (1, 1.5)
n Vn JEL AJEL Vn JEL AJEL Vn JEL AJEL
10 0.981 0.347 0.357 0.921 0.086 0.082 0.481 0.112 0.116
20 1.000 0.526 0.539 0.997 0.138 0.136 0.872 0.173 0.179
30 1.000 0.777 0.785 1.000 0.184 0.184 0.977 0.285 0.289
50 1.000 0.977 0.977 1.000 0.292 0.294 0.999 0.514 0.520
100 1.000 1.000 1.000 1.000 0.527 0.506 1.000 0.866 0.853
  Rayleigh (1) Weibull (1.75, 1) Half-Normal (1)
n Vn JEL AJEL Vn JEL AJEL Vn JEL AJEL
10 0.832 0.413 0.423 0.723 0.448 0.371 0.722 0.326 0.332
20 0.913 0.669 0.678 0.854 0.583 0.532 0.844 0.543 0.553
30 0.952 0.853 0.863 0.886 0.761 0.713 0.886 0.772 0.779
50 0.969 0.993 0.991 0.912 0.968 0.958 0.919 0.965 0.967
100 0.973 1.000 1.000 0.976 1.000 1.000 0.978 0.992 1.000

Figure 6.

Empirical power curve of the Vn test for sample size 200.

5. Application

In this section, we illustrate the use of the test proposed in Section 3 with the help of two real datasets discussed in Ref. [3]. The first dataset is from Ref. [8] and represents the lifetimes of n = 20 pressure vessels subjected to a certain constant pressure. When the gamma distribution is fitted to these data, the ML estimate of the shape parameter is close to 0.5, and since the one-sided Lévy distribution is a special case of the inverse gamma distribution, we consider the inverses of these observations for our purpose. The second dataset comprises n = 31 weighted averages of rainfall (in mm) in January for the whole country, from 1981 to 2011, released by the Meteorological Department, Ministry of Earth Sciences, Government of India. These data are based on more than 2000 rain gauge readings spread over the entire country, and they are available at www.data.gov.in. To get a preliminary idea of the model fit to these data, the parameter is estimated by the maximum likelihood method, and the qq-plots of the sample quantiles against the fitted one-sided Lévy quantiles are presented in Figure 7 for both datasets. The Kolmogorov–Smirnov test values $K_n=\max\{D^{+},D^{-}\}$, where $D^{+}=\max_{j=1,\ldots,n}\left(j/n-\hat F(X_{(j)})\right)$, $D^{-}=\max_{j=1,\ldots,n}\left(\hat F(X_{(j)})-(j-1)/n\right)$ and $\hat F(X_{(j)})$ is the estimated cdf of the one-sided Lévy rv, with bootstrap p-value ($p_{\text{Bootstrap}}$, see Note 1) for datasets 1 and 2 are 0.154 (0.498) and 0.461 (< 0.0001), respectively.
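The Kolmogorov–Smirnov statistic described above can be computed directly from its definition. The following sketch assumes the fitted cdf is the Lévy cdf $2(1-\Phi(\sqrt{\hat\sigma/x}))$, written here with the complementary error function, with $\hat\sigma$ the maximum likelihood estimate; the function names are illustrative and the bootstrap p-value is not computed in this block.

```python
import math


def levy_cdf(x, sigma):
    """Levy cdf 2 * (1 - Phi(sqrt(sigma / x))) = erfc(sqrt(sigma / (2 * x)))."""
    return math.erfc(math.sqrt(sigma / (2.0 * x)))


def ks_statistic_levy(x):
    """K_n = max(D+, D-) against the Levy cdf fitted by maximum likelihood."""
    n = len(x)
    sigma_hat = n / sum(1.0 / xi for xi in x)    # ML estimate of the scale
    xs = sorted(x)
    d_plus = max(j / n - levy_cdf(xs[j - 1], sigma_hat) for j in range(1, n + 1))
    d_minus = max(levy_cdf(xs[j - 1], sigma_hat) - (j - 1) / n for j in range(1, n + 1))
    return max(d_plus, d_minus)
```

Because the scale is estimated from the same data, the usual Kolmogorov–Smirnov tables do not apply, which is why a bootstrap p-value is reported in the text.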

Figure 7.

qq-plot between the empirical cdf and fitted one-sided Lévy CDF.

For the first dataset, the KS bootstrap p-value is larger than the 5% significance level, which indicates that the one-sided Lévy distribution is a plausible choice for modeling. For dataset 2, in contrast, the KS bootstrap p-value is far below the 5% significance level, so the null hypothesis is rejected. The same conclusion can be drawn from the qq-plots: for dataset 1 the empirical and theoretical values fall near the diagonal line, whereas for dataset 2 they depart from it. We now apply the proposed $V_n^{*}$ test to these two datasets and, for comparison purposes, we use the JEL and AJEL tests given in Ref. [3]. These tests will be compared with reference to the size of the tests. However, the size of a test based on asymptotic critical values may not be close to the nominal significance level, and this behavior depends on the parent family of distributions. In such cases the bootstrap method, introduced in Ref. [6], can be used effectively. By construction, the bootstrap method is efficient in approximating the cut-off points of the test's critical region. Although the bootstrap method is computationally time consuming, it gives more accurate critical values than the asymptotic approximation. In our framework, to compute the bootstrap p-value, we present the following brief outline of the bootstrap procedure. Denote the data by $x_1,\ldots,x_n$. We fit the model to the data. Then,

  • Step 1

    Compute the test statistic values, say $t_{V_n}$, $t_{JEL}$, $t_{AJEL}$.

  • Step 2
    Use the fitted model to perform parametric bootstrapping.
    1. Generate $B$ sets of resampled data, denoted $\hat x_1^{(i)},\ldots,\hat x_n^{(i)}$, $i=1,2,\ldots,B$.
    2. For each set of resampled data, compute the test statistic values $t^{i}_{V_n}$, $t^{i}_{JEL}$, $t^{i}_{AJEL}$, for $i=1,2,\ldots,B$.
  • Step 3
    For the two-tailed hypothesis, the equal-tail $p_{\text{Bootstrap}}$-value of the $V_n^{*}$ test statistic is computed by the following relation (see Ref. [7])
    $$2\min\left(\frac{\sum_{i=1}^{B}I\{t^{i}_{V_n}<t_{V_n}\}}{B},\ \frac{\sum_{i=1}^{B}I\{t^{i}_{V_n}>t_{V_n}\}}{B}\right),$$
    where $I(\cdot)$ is an indicator function, and for the JEL and AJEL tests, the $p_{\text{Bootstrap}}$-values are, respectively, given as
    $$\frac{\sum_{i=1}^{B}I\{t^{i}_{JEL}\ge t_{JEL}\}}{B}\quad\text{and}\quad\frac{\sum_{i=1}^{B}I\{t^{i}_{AJEL}\ge t_{AJEL}\}}{B}.$$
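The bootstrap steps above can be sketched for the scale-ratio statistic as follows. This is an illustrative outline, not the authors' implementation: the fitted model is taken to be the Lévy distribution with the maximum likelihood estimate of the scale, resamples are drawn as $\hat\sigma/Z^2$ with $Z$ standard normal, and the function names, the default B and the seed are assumptions. The JEL and AJEL statistics are not implemented here.

```python
import math
import random


def scale_ratio_statistic(x):
    """Standardized scale-ratio statistic for a sample x of positive values."""
    n = len(x)
    y = [1.0 / xi for xi in x]
    z = [math.log(yi) for yi in y]
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    cov_yz = sum((yi - y_bar) * (zi - z_bar) for yi, zi in zip(y, z)) / (n - 1)
    return math.sqrt(4.0 * n / (math.pi ** 2 - 4.0)) * (2.0 * y_bar / cov_yz - 1.0)


def bootstrap_p_value(x, B=500, seed=0):
    """Equal-tail parametric bootstrap p-value for the scale-ratio statistic."""
    rng = random.Random(seed)
    n = len(x)
    sigma_hat = n / sum(1.0 / xi for xi in x)        # fit the Levy model by ML
    t_obs = scale_ratio_statistic(x)                  # Step 1
    t_boot = [                                        # Step 2
        scale_ratio_statistic([sigma_hat / rng.gauss(0.0, 1.0) ** 2 for _ in range(n)])
        for _ in range(B)
    ]
    lower = sum(t < t_obs for t in t_boot) / B        # Step 3
    upper = sum(t > t_obs for t in t_boot) / B
    return 2.0 * min(lower, upper)
```

Since the statistic is scale invariant, the value of `sigma_hat` affects the resampled data but not the bootstrap distribution of the statistic; it is kept to mirror the procedure as described.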

The test statistic values and their $p_{\text{Bootstrap}}$-values are presented in Table 4. We observe from Table 4 that the $p_{\text{Bootstrap}}$-value for Dataset 1 is higher than the significance level 0.05 for all three tests, hence the one-sided Lévy distribution cannot be rejected for modeling these data. However, for Dataset 2, the $p_{\text{Bootstrap}}$-value of the $V_n^{*}$ test is less than the significance level 0.05 and hence the null hypothesis is rejected, which means that the one-sided Lévy distribution is not a good choice to model this dataset. In addition, the simulated critical points of the $V_n^{*}$ test at the 5% SL, discussed in Section 4, are (−1.204, 2.555) and (−1.273, 2.309) for Datasets 1 and 2, respectively. The value of the test statistic lies within these critical points for Dataset 1 and outside them for Dataset 2, which supports retaining the null hypothesis for Dataset 1 and rejecting it for Dataset 2. In contrast, for Dataset 2 the JEL and AJEL tests retain the null hypothesis at the 5% SL and reject it only at the 10% SL.

Table 4.

Observed values of the test statistics along with their p-values.

    Vn-test JEL AJEL
Dataset 1 TS-value 0.782 0.316 0.269
  pBootstrap-value 0.547 0.554 0.554
Dataset 2 TS-value 10.952 3.083 2.770
  pBootstrap-value <0.001 0.081 0.079

6. Conclusion

Given that few tests are available for the one-sided Lévy distribution, the proposed test is useful to practitioners. Our Monte Carlo simulation study also supports the claim that the proposed test shows higher power than the JEL and AJEL based tests for various alternatives. Finally, the applicability of the proposed test has been shown by considering two real-world datasets. The proposed test is designed to test whether a sample comes from the one-sided Lévy distribution with zero location parameter; however, it can be extended to the inverse gamma distribution, which nests the one-sided Lévy distribution.

Acknowledgments

The authors would like to thank the Associate Editor and the anonymous referees for suggesting modifications which helped us to improve the presentation of this article.

Notes

1

The bootstrap p-value is obtained in a manner similar to the procedure described later in this section.

References

  • 1. Adler R.J., Feldman R.E., and Taqqu M.S., A Practical Guide to Heavy Tails: Statistical Techniques and Applications, Springer Science & Business Media, 1998.
  • 2. Batsidis A., Martin N., Pardo L., and Zografos K., A necessary power divergence-type family of tests for testing elliptical symmetry, J. Stat. Comput. Simul. 84 (2014), pp. 57–83.
  • 3. Bhati D. and Kattumannil S.K., Jackknife empirical likelihood test for testing one-sided Lévy distribution, J. Appl. Stat. 47 (2020), pp. 1208–1219.
  • 4. DasGupta A., Asymptotic Theory of Statistics and Probability, Springer, 2008.
  • 5. Dumé I., Geomagnetic flip may not be random after all, Physicsworld.com, March edition, 2006.
  • 6. Efron B., Bootstrap methods: another look at the jackknife, Ann. Stat. 7 (1979), pp. 1–26.
  • 7. Efron B. and Tibshirani R., Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy, Stat. Sci. (1986), pp. 54–75.
  • 8. Keating J.P., Glaser R.E., and Ketchum N.S., Testing hypotheses about the shape parameter of a gamma distribution, Technometrics 32 (1990), pp. 67–82.
  • 9. Rogers G.L., Multiple path analysis of reflectance from turbid media, J. Opt. Soc. Am. A 25 (2008), pp. 2879–2883.
  • 10. Samorodnitsky G. and Taqqu M.S., Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman and Hall/CRC, London, 1994.
  • 11. Shapiro S.S. and Wilk M.B., An analysis of variance test for normality (complete samples), Biometrika 52 (1965), pp. 591–611.
  • 12. Villaseñor J.A. and González-Estrada E., On testing exponentiality based on a new estimator for the scale parameter, Braz. J. Probab. Stat. 34 (2020), pp. 809–820.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
