Journal of Applied Statistics
. 2020 Mar 10;48(4):646–668. doi: 10.1080/02664763.2020.1738360

Heavy or semi-heavy tail, that is the question

Jamil Ownuk, Hossein Baghishani (corresponding author), Ahmad Nezakati
PMCID: PMC9041666  PMID: 35706985

Abstract

While there has been considerable research on the analysis of extreme values and outliers using heavy-tailed distributions, little is known about the semi-heavy-tailed behavior of data containing a few suspicious outliers. To address the situation where data are skewed and possess semi-heavy tails, we introduce two new skewed families of hyperbolic secant distributions with attractive properties. We extend the semi-heavy-tailedness property of data to a linear regression model. In particular, we investigate the asymptotic properties of the ML estimators of the regression parameters when the error term has a semi-heavy-tailed distribution. We conduct simulation studies comparing the ML estimators of the regression parameters under various assumptions for the distribution of the error term. We also provide three real examples to show the advantage of semi-heavy-tailed over heavy-tailed error terms. Online supplementary materials for this article are available. All the new models proposed in this work are implemented in the shs R package, which can be found on the GitHub webpage.

Keywords: Asymptotic properties, heavy-tailed distribution, ML estimators, skew hyperbolic secant distributions, semi-heavy-tailed distribution

1. Introduction

Practitioners usually use heavy-tailed distributions when robustness to potential outliers is a concern. For symmetric distributions, robust estimation techniques are a standard approach for implementing statistical inference. However, robust methods do not work well for asymmetric distributions, leading to strongly biased parameter estimates [32]. Hence, accounting for both skewness and thick tails of real data is an attractive benefit for statistical modeling and inference, especially in the context of a regression model. Several methods for obtaining skewed heavy-tailed distributions have been presented in the literature; e.g. Fernandez and Steel [12], Ma and Genton [24], Azzalini and Capitanio [4], Jones and Faddy [22], Ferreira and Steel [14], and many others.

In a broad variety of applied contexts, such as economics, finance, and hydrology, the tail of the distribution is not too heavy, but semi-heavy. In our experience, semi-heavy-tailed distributions often fit observed data quite well, and better than heavy-tailed alternatives, when there are a few suspicious outliers. Therefore, we consider a situation where data are skewed and possess semi-heavy tails. Indeed, we address the case where a skewed semi-heavy-tailed distribution for the error term in a linear regression is a better option than asymmetric heavy-tailed alternatives. To this end, we propose two new families of skewed distributions that have a semi-heavy tail, in some sense. Furthermore, we carry out a simulation study to validate that the proposed distributions outperform the alternative skewed heavy-tailed distributions with respect to model fitting and information criteria, including the Akaike information criterion (AIC; [1]) and the Bayesian information criterion (BIC; [29]).

There are a few works in the literature on symmetric semi-heavy-tailed distributions; e.g. the hyperbolic secant (HS) family of distributions, which originates from Fisher [18], according to Fischer [15]. Vaughan [34] introduced a skewed version of the HS distribution, named the generalized secant hyperbolic (GSH) distribution, and Fischer and Vaughan [16] added an alternative, called the skewed generalized secant hyperbolic (SGHS) distribution. Another skewed version of a semi-heavy-tailed distribution, the beta hyperbolic secant (BHS), was proposed by Fischer and Vaughan [17].

These skewed versions have complex probability density and cumulative distribution functions. Also, their moments are usually complicated to calculate due to the presence of infinite series. Furthermore, the asymptotic behavior of the maximum likelihood (ML) estimators of the relevant parameters in the more applicable linear regression setting is not usually discussed for these distributions. In contrast, our proposed distributions (Section 2) have several advantages, including (1) simple formulas for the density functions, (2) easy-to-calculate moments, (3) better fit to real data, and (4) simple stochastic representations that lead to efficient algorithms for generating random numbers.

In Section 2, we present two new skewed semi-heavy-tailed distributions and their basic properties. We incorporate them into a linear regression model in Section 3 as the distribution of the errors, and then verify the consistency and asymptotic normality of the corresponding ML estimators. In Section 4, we provide the results of a simulation study that compares the performance of the ML estimators in a linear regression model when the errors are generated from our skewed semi-heavy-tailed distributions and also from a heavy-tailed skew Student-t distribution [2]. Section 5 presents three real data analyses with (i) acoustic comfort evaluation data, (ii) pricing diamond stones data, and (iii) the excess rate of return for a company, named Martin Marietta, and an index of the excess rate of return for the New York stock exchange. The final section contains concluding remarks. All technical details are available online as supplementary materials. All methods presented in this work are implemented in the shs R package [27], which can be found on the GitHub webpage.

2. Proposed sampling models

A hyperbolic secant distribution is symmetric, and its density is bell-shaped similar to the Gaussian distribution but has slightly heavier tails. The probability density function of an HS distribution is given by

\[
f(x)=\frac{2}{\pi\sigma}\,\frac{\exp\left(\frac{x-\mu}{\sigma}\right)}{1+\exp\left(\frac{2(x-\mu)}{\sigma}\right)},\qquad \mu\in\mathbb{R},\ \sigma>0,\ x\in\mathbb{R},
\]

where $\mu$ is the mean and $\sigma$ is the scale parameter. The variance of the distribution is $\pi^2\sigma^2/4$.

Fischer [15] showed that if V has a half-Cauchy density function, then U = log(V) has an HS distribution. Using the fact that the Cauchy distribution is a special case of the Student-t distribution with 1 degree of freedom, we can obtain a generalized HS distribution. Let Y have a half Student-t distribution with $\nu$ degrees of freedom; then Z = log(Y) has a standard new asymmetric HS distribution. We refer to this distribution as the skew HS type 1 (SHS1) distribution. The probability density function of Z is given by

\[
f(z)=c_{\nu}\,\frac{\exp(z)}{\left(1+\frac{1}{\nu}\exp(2z)\right)^{\frac{\nu+1}{2}}},\qquad c_{\nu}=\frac{2\,\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}. \quad (1)
\]

Figure 1 shows a variety of SHS1 densities for several values of $\nu$. Given the density (1) for Z, it is easy to show that $X=\sigma Z+\mu$ follows $\mathrm{SHS1}(\mu,\sigma,\nu)$. When $\nu=1$, f(x) reduces to the HS distribution with location parameter $\mu$ and scale parameter $\sigma$. When $\nu<1$ or $\nu>1$, the SHS1 distribution is negatively or positively skewed, respectively. The cumulative distribution function corresponding to (1) is given by

\[
F(z)=\int_{-\infty}^{z} c_{\nu}\,\frac{\exp(w)}{\left(1+\frac{1}{\nu}\exp(2w)\right)^{\frac{\nu+1}{2}}}\,dw = I_{\frac{\exp(2z)}{\nu+\exp(2z)}}\left(\tfrac{1}{2},\tfrac{\nu}{2}\right),
\]

where $I_y(a,b)$ is the regularized incomplete beta function.
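As a quick numerical sanity check (a stdlib-only sketch with our own function names, not the shs package), the density (1) can be coded on the log scale and integrated to confirm it is a proper density:

```python
import math

def shs1_logpdf(z, nu):
    """Log of density (1): log c_nu + z - ((nu+1)/2) * log(1 + exp(2z)/nu)."""
    log_c = (math.log(2.0) + math.lgamma((nu + 1.0) / 2.0)
             - 0.5 * math.log(nu * math.pi) - math.lgamma(nu / 2.0))
    # log1p keeps the computation stable when exp(2z) is large
    return log_c + z - 0.5 * (nu + 1.0) * math.log1p(math.exp(2.0 * z) / nu)

def shs1_pdf(z, nu):
    return math.exp(shs1_logpdf(z, nu))

def integrate(f, a, b, n=20000):
    """Trapezoidal rule over [a, b], wide enough here for both tails."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

total = integrate(lambda z: shs1_pdf(z, 4.0), -40.0, 40.0)  # should be ~1
```

Because both tails of (1) decay exponentially, the truncated integration range loses a negligible amount of mass for any moderate $\nu$.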

Figure 1. SHS1 densities for different values of $\nu$.

The following expression for the moment generating function of Z can be obtained (see the supplementary materials for the proof).

Proposition 2.1

For an SHS1 distribution with density (1), the moment generating function is given by

\[
M(t)=\nu^{\frac{t}{2}}\,\frac{\Gamma\left(\frac{1+t}{2}\right)\Gamma\left(\frac{\nu-t}{2}\right)}{\sqrt{\pi}\,\Gamma\left(\frac{\nu}{2}\right)}.
\]

Therefore, we can easily calculate that

\[
\begin{aligned}
E(Z)&=\tfrac{1}{2}\left(\log(\nu)+\psi_{\nu}\right), & E(Z^2)&=(E(Z))^2+\tfrac{1}{4}\psi'_{\nu},\\
E(Z^3)&=(E(Z))^3+\tfrac{3}{4}E(Z)\,\psi'_{\nu}+\tfrac{1}{8}\psi''_{\nu},\\
E(Z^4)&=(E(Z))^4+\tfrac{24}{16}(E(Z))^2\psi'_{\nu}+\tfrac{8}{16}E(Z)\,\psi''_{\nu}+\tfrac{3}{16}(\psi'_{\nu})^2+\tfrac{1}{16}\psi'''_{\nu},
\end{aligned}
\]

where $\psi_{\nu}=\psi(\tfrac{1}{2})-\psi(\tfrac{\nu}{2})$, $\psi'_{\nu}=\psi'(\tfrac{1}{2})+\psi'(\tfrac{\nu}{2})$, $\psi''_{\nu}=\psi''(\tfrac{1}{2})-\psi''(\tfrac{\nu}{2})$, and $\psi'''_{\nu}=\psi'''(\tfrac{1}{2})+\psi'''(\tfrac{\nu}{2})$, in which $\psi(\cdot)$ is the digamma function and $\psi'(\cdot)$, $\psi''(\cdot)$, and $\psi'''(\cdot)$ are its first, second, and third derivatives, respectively. We can check that for $\nu=1$, $E(Z)=0$ and $\mathrm{Var}(Z)=\pi^2/4$, which are the mean and variance of the standard HS distribution, respectively.
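The $\nu=1$ special case above is easy to confirm numerically: integrating $z^k f(z)$ against the density (1) should reproduce $E(Z)=0$ and $\mathrm{Var}(Z)=\pi^2/4$ (a sketch with our own helper names):

```python
import math

def shs1_pdf(z, nu):
    # Density (1), evaluated via the log scale for numerical stability
    log_c = (math.log(2.0) + math.lgamma((nu + 1.0) / 2.0)
             - 0.5 * math.log(nu * math.pi) - math.lgamma(nu / 2.0))
    return math.exp(log_c + z - 0.5 * (nu + 1.0) * math.log1p(math.exp(2.0 * z) / nu))

def moment(k, nu, a=-40.0, b=40.0, n=40000):
    """E(Z^k) by trapezoidal integration of z^k f(z)."""
    h = (b - a) / n
    f = lambda z: (z ** k) * shs1_pdf(z, nu)
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

m1 = moment(1, 1.0)            # mean, should be ~0 for nu = 1
var = moment(2, 1.0) - m1**2   # variance, should be ~pi^2/4
```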

2.1. Second proposal

It would be attractive to have a skewness parameter whose negative or positive values indicate negative or positive skewness of the data, respectively. The SHS1 distribution does not have such a feature. To achieve this, we introduce another generalization of the HS distribution and refer to it as SHS type 2 (SHS2). For this distribution, the probability density function is given by

\[
f(x)=\frac{\cos\left(\frac{\pi}{2}\alpha\right)}{\pi\sigma}\,\frac{e^{\alpha\left(\frac{x-\mu}{\sigma}\right)}}{\cosh\left(\frac{x-\mu}{\sigma}\right)},\qquad -1<\alpha<1,\ \sigma>0,\ \mu\in\mathbb{R},\ x\in\mathbb{R}. \quad (2)
\]

If the random variable X has the density (2), we denote it by $X\sim \mathrm{SHS2}(\mu,\sigma,\alpha)$. The following proposition confirms the symmetry property of the skewness parameter of SHS2.

Proposition 2.2

If $X\sim \mathrm{SHS2}(\mu,\sigma,\alpha)$, then $-X\sim \mathrm{SHS2}(-\mu,\sigma,-\alpha)$.

When $\alpha=0$, the distribution reduces to the HS distribution. When $\alpha<0$ or $\alpha>0$, the SHS2 distribution is negatively or positively skewed, respectively. Figure 2 shows a variety of SHS2 densities for several values of $\alpha$. It is also easy to show that the cumulative distribution function of SHS2 is as follows:

\[
F(x)=I_{\frac{e^{2x}}{1+e^{2x}}}\left(\frac{1+\alpha}{2},\frac{1-\alpha}{2}\right).
\]

Furthermore, there is a simple stochastic representation for the SHS2 distribution as well: if $Y\sim \mathrm{Beta}\left(\frac{1+\alpha}{2},\frac{1-\alpha}{2}\right)$, then $X=\frac{1}{2}\log\left(\frac{Y}{1-Y}\right)$ has density (2).
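This beta representation yields a simple random number generator. A minimal sketch (using Python's standard library rather than the shs package; the function name is ours), together with a Monte Carlo check of the mean formula $E(Z)=\frac{\pi}{2}\tan(\frac{\pi}{2}\alpha)$ derived later from Proposition 2.3:

```python
import math
import random

def rshs2(n, alpha, mu=0.0, sigma=1.0, rng=random):
    """Sample SHS2(mu, sigma, alpha) via Y ~ Beta((1+alpha)/2, (1-alpha)/2),
    X = mu + sigma * 0.5 * log(Y / (1 - Y))."""
    out = []
    for _ in range(n):
        y = rng.betavariate((1.0 + alpha) / 2.0, (1.0 - alpha) / 2.0)
        y = min(max(y, 1e-12), 1.0 - 1e-12)  # guard against log(0) at the endpoints
        out.append(mu + sigma * 0.5 * math.log(y / (1.0 - y)))
    return out

random.seed(42)
alpha = 0.5
xs = rshs2(200_000, alpha)
sample_mean = sum(xs) / len(xs)
theoretical_mean = (math.pi / 2.0) * math.tan(math.pi * alpha / 2.0)
```

The clipping of Y only affects a vanishing fraction of draws and leaves the Monte Carlo mean essentially unbiased.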

Figure 2. SHS2 densities for different values of $\alpha$.

When $Z\sim \mathrm{SHS2}(0,1,\alpha)$, the following expressions for the characteristic and moment generating functions can be obtained.

Proposition 2.3

The characteristic and moment generating functions of Z are given by

\[
C(t)=\frac{\cos\left(\frac{\pi}{2}\alpha\right)}{\cos\left(\frac{\pi}{2}(\alpha+it)\right)},\qquad M(t)=\frac{\cos\left(\frac{\pi}{2}\alpha\right)}{\cos\left(\frac{\pi}{2}(\alpha+t)\right)},
\]

respectively.

By applying Proposition 2.3, we can show, for example, that

\[
\begin{aligned}
E(Z)&=\frac{\pi}{2}\tan\left(\frac{\pi}{2}\alpha\right), & E(Z^2)&=\frac{\pi^2}{2^2}\left(1+2\tan^2\left(\frac{\pi}{2}\alpha\right)\right),\\
E(Z^3)&=\frac{\pi^3}{2^3}\left(5\tan\left(\frac{\pi}{2}\alpha\right)+6\tan^3\left(\frac{\pi}{2}\alpha\right)\right),\\
E(Z^4)&=\frac{\pi^4}{2^4}\left(5+28\tan^2\left(\frac{\pi}{2}\alpha\right)+24\tan^4\left(\frac{\pi}{2}\alpha\right)\right).
\end{aligned}
\]

2.2. Tail behavior

Figure 3 plots the tails of the densities in (1) and (2) and compares them with the normal and Cauchy densities. The Cauchy distribution has a thick tail, while the normal distribution has a thin tail. Intuitively, our proposed distributions have tails lighter than the Cauchy (a well-known heavy-tailed distribution) but heavier than the normal, and hence possess semi-heavy tails. This intuition can be formalized by studying the behavior of the tails.

Figure 3. Comparison of tails for Cauchy (solid), normal (dotted), SHS1 (dotted-dashed), and SHS2 (dashed) densities.

The following theorem demonstrates that the SHS distributions have semi-heavy tails. The theorem is proved for the standard case. First, formal definitions of heavy and semi-heavy tails are required. We consider the following definitions provided by Omey et al. [26].

Definition 2.4

The random variable X has a fat-tailed density if its density function $f(\cdot)$ satisfies

\[
\lim_{x\to\infty}\frac{f(xy)}{f(x)}=y^{-\gamma}\quad\text{for some }\gamma\in\mathbb{R}.
\]

Definition 2.5

A density function f(x) is called a semi-heavy-tailed function if it is of the form $f(x)=e^{-\eta x}h(x)$, $\eta>0$, where $h(x)$ is a fat-tailed density function.
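These definitions are easy to probe numerically. Under Definition 2.4, the Cauchy density gives a limiting ratio $y^{-2}$, while the right-tail log-density of SHS1 becomes linear in z with slope $-\nu$, exhibiting the exponential factor of Definition 2.5 with $\eta=\nu$ (a sketch; the function names are ours):

```python
import math

def cauchy_pdf(x):
    return 1.0 / (math.pi * (1.0 + x * x))

def shs1_logpdf(z, nu):
    # Log of density (1), stable for large z via log1p
    log_c = (math.log(2.0) + math.lgamma((nu + 1.0) / 2.0)
             - 0.5 * math.log(nu * math.pi) - math.lgamma(nu / 2.0))
    return log_c + z - 0.5 * (nu + 1.0) * math.log1p(math.exp(2.0 * z) / nu)

# Fat tail (Definition 2.4): f(xy)/f(x) -> y^(-gamma), with gamma = 2 for Cauchy
x, y = 1e8, 2.0
fat_ratio = cauchy_pdf(x * y) / cauchy_pdf(x)  # -> 2^(-2) = 0.25

# Semi-heavy tail (Definition 2.5): right-tail slope of log f -> -nu for SHS1
nu = 4.0
slope = (shs1_logpdf(30.0, nu) - shs1_logpdf(20.0, nu)) / 10.0  # -> -4
```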

Theorem 2.6

Both SHS1 and SHS2 distributions have semi-heavy tails.

3. Regression modeling

In linear regression modeling, the usual normality assumption for the error terms is not appropriate when observations may be skewed and may have heavy or semi-heavy tails. Departures from normality of the errors may have adverse effects on inferential results [5]. Many remedies appear throughout the statistics literature on extending the classical normal linear regression model; e.g. Student-t regression [23], scale mixtures of normals [13], skew-normal regression [36], finite mixtures of normals [31], and skew-t components regression [9]. However, none of these models explicitly examines semi-heavy-tailed distributions for the errors. Furthermore, the identifiability and computational burden of the mixture-based approaches can be a challenge. The methods based on Student-t regression also have a limitation regarding the estimation of the degrees-of-freedom parameter, since its estimation is known to be complicated and computationally expensive.

Here, we address the situation where the error terms are skewed and have a semi-heavy tail. We develop a likelihood-based approach for inference when the errors follow SHS distributions. We assume that the observations $y_i$, $i=1,\dots,n$, are generated from

\[
y_i=\beta' x_i+\sigma\epsilon_i, \qquad (3)
\]

where $x_i\in\mathbb{R}^k$ is a vector of covariates, $\beta=(\beta_1,\dots,\beta_k)'$ is the vector of unknown regression parameters, and $\sigma$ is a scale parameter. We assume that the error terms $\epsilon_1,\dots,\epsilon_n$ are i.i.d. $\mathrm{SHS1}(0,1,\nu)$ or $\mathrm{SHS2}(0,1,\alpha)$. In the following, we study each model separately.

3.1. SHS1 error term

When the error terms follow the SHS1(0,1,ν) distribution, the log-likelihood of the model (3) is given by

\[
\ell(\beta,\sigma,\nu)\propto n\log\Gamma\left(\tfrac{\nu+1}{2}\right)-\tfrac{n}{2}\log(\nu)-n\log\Gamma\left(\tfrac{\nu}{2}\right)-n\log(\sigma)+\sum_{i=1}^{n}\left(\frac{y_i-\beta' x_i}{\sigma}\right)-\frac{\nu+1}{2}\sum_{i=1}^{n}\log\left(1+\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)\right).
\]

To maximize $\ell$ with respect to the parameters of the model, we use a quasi-Newton optimization method, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm, which was proposed independently by Broyden [6], Fletcher [19], Goldfarb [20], and Shanno [30]. Let $\theta=(\beta',\sigma,\nu)'$ denote the vector of the parameters. The components of the score vector

\[
S_n=\left(\left\{\frac{\partial \ell(\theta)}{\partial \beta_j}\right\}_{j=1}^{k},\ \frac{\partial \ell(\theta)}{\partial \sigma},\ \frac{\partial \ell(\theta)}{\partial \nu}\right)'
\]

are

\[
\begin{aligned}
\frac{\partial \ell(\theta)}{\partial \beta_j}&=\frac{\nu+1}{\sigma}\sum_{i=1}^{n} x_{ij}\,\frac{\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}{1+\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}-\frac{1}{\sigma}\sum_{i=1}^{n} x_{ij},\qquad j=1,\dots,k,\\
\frac{\partial \ell(\theta)}{\partial \sigma}&=\frac{\nu+1}{\sigma}\sum_{i=1}^{n}\left(\frac{y_i-\beta' x_i}{\sigma}\right)\frac{\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}{1+\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}-\frac{1}{\sigma}\sum_{i=1}^{n}\left(\frac{y_i-\beta' x_i}{\sigma}\right)-\frac{n}{\sigma},\\
\frac{\partial \ell(\theta)}{\partial \nu}&=\frac{\nu+1}{2\nu}\sum_{i=1}^{n}\frac{\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}{1+\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)}-\frac{1}{2}\sum_{i=1}^{n}\log\left(1+\frac{1}{\nu}\exp\left(\frac{2(y_i-\beta' x_i)}{\sigma}\right)\right)-\frac{n}{2}\psi\left(\frac{\nu}{2}\right)+\frac{n}{2}\psi\left(\frac{\nu+1}{2}\right)-\frac{n}{2\nu}.
\end{aligned}
\]
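The score expressions can be verified by comparing them with finite differences of the log-likelihood. The sketch below is pure Python with our own helper names (not the shs package); the digamma function is approximated by differencing lgamma, which is accurate enough for the check:

```python
import math

def shs1_loglik(beta, sigma, nu, y, X):
    """Exact log-likelihood of model (3) with SHS1(0,1,nu) errors."""
    n = len(y)
    ll = n * (math.log(2.0) + math.lgamma((nu + 1.0) / 2.0)
              - 0.5 * math.log(nu * math.pi) - math.lgamma(nu / 2.0)
              - math.log(sigma))
    for yi, xi in zip(y, X):
        z = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        ll += z - 0.5 * (nu + 1.0) * math.log1p(math.exp(2.0 * z) / nu)
    return ll

def shs1_score(beta, sigma, nu, y, X):
    """Analytic score following the component formulas above."""
    digamma = lambda a: (math.lgamma(a + 1e-6) - math.lgamma(a - 1e-6)) / 2e-6
    n, k = len(y), len(beta)
    g_beta, g_sigma = [0.0] * k, -n / sigma
    sum_w, sum_log = 0.0, 0.0
    for yi, xi in zip(y, X):
        z = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        e = math.exp(2.0 * z) / nu
        w = e / (1.0 + e)
        for j in range(k):
            g_beta[j] += (nu + 1.0) / sigma * xi[j] * w - xi[j] / sigma
        g_sigma += (nu + 1.0) / sigma * z * w - z / sigma
        sum_w += w
        sum_log += math.log1p(e)
    g_nu = ((nu + 1.0) / (2.0 * nu) * sum_w - 0.5 * sum_log
            - 0.5 * n * digamma(nu / 2.0) + 0.5 * n * digamma((nu + 1.0) / 2.0)
            - n / (2.0 * nu))
    return g_beta + [g_sigma, g_nu]

# Central finite-difference check on a tiny fixed data set
y = [0.3, -1.2, 2.1, 0.5, -0.4]
X = [[1.0, 0.2], [1.0, -1.1], [1.0, 0.9], [1.0, 0.4], [1.0, -0.3]]
theta = [1.0, -0.5, 1.2, 3.0]  # (beta1, beta2, sigma, nu)

ll_at = lambda t: shs1_loglik(t[:2], t[2], t[3], y, X)
analytic = shs1_score(theta[:2], theta[2], theta[3], y, X)
h = 1e-6
numeric = []
for j in range(4):
    tp, tm = theta[:], theta[:]
    tp[j] += h
    tm[j] -= h
    numeric.append((ll_at(tp) - ll_at(tm)) / (2.0 * h))
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
```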

3.2. SHS2 error term

For errors with SHS2(0,1,α) distribution, the log-likelihood of the model (3) is given by

\[
\ell(\zeta)=\ell(\beta,\sigma,\alpha)=n\log\cos\left(\frac{\pi}{2}\alpha\right)-n\log(\sigma)-n\log(\pi)+\alpha\sum_{i=1}^{n}\frac{y_i-\beta' x_i}{\sigma}-\sum_{i=1}^{n}\log\cosh\left(\frac{y_i-\beta' x_i}{\sigma}\right).
\]

For this log-likelihood function, the components of the score vector

\[
\tilde{S}_n=\left(\left\{\frac{\partial \ell(\zeta)}{\partial \beta_j}\right\}_{j=1}^{k},\ \frac{\partial \ell(\zeta)}{\partial \sigma},\ \frac{\partial \ell(\zeta)}{\partial \alpha}\right)'
\]

are given by

\[
\begin{aligned}
\frac{\partial \ell(\zeta)}{\partial \beta_j}&=-\alpha\sum_{i=1}^{n}\frac{x_{ij}}{\sigma}+\sum_{i=1}^{n}\frac{x_{ij}}{\sigma}\tanh\left(\frac{y_i-\beta' x_i}{\sigma}\right),\qquad j=1,\dots,k,\\
\frac{\partial \ell(\zeta)}{\partial \sigma}&=-\frac{n}{\sigma}-\alpha\sum_{i=1}^{n}\frac{y_i-\beta' x_i}{\sigma^2}+\sum_{i=1}^{n}\frac{y_i-\beta' x_i}{\sigma^2}\tanh\left(\frac{y_i-\beta' x_i}{\sigma}\right),\\
\frac{\partial \ell(\zeta)}{\partial \alpha}&=-\frac{n\pi}{2}\tan\left(\frac{\pi}{2}\alpha\right)+\sum_{i=1}^{n}\frac{y_i-\beta' x_i}{\sigma}.
\end{aligned}
\]
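An analogous finite-difference check applies to the SHS2 score (again a pure-Python sketch with our own names, not the shs package):

```python
import math

def shs2_loglik(beta, sigma, alpha, y, X):
    """Log-likelihood of model (3) with SHS2(0,1,alpha) errors."""
    n = len(y)
    ll = n * (math.log(math.cos(math.pi * alpha / 2.0))
              - math.log(sigma) - math.log(math.pi))
    for yi, xi in zip(y, X):
        z = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        ll += alpha * z - math.log(math.cosh(z))
    return ll

def shs2_score(beta, sigma, alpha, y, X):
    """Analytic score following the component formulas above."""
    n, k = len(y), len(beta)
    g_beta, g_sigma, sum_z = [0.0] * k, -n / sigma, 0.0
    for yi, xi in zip(y, X):
        z = (yi - sum(b * x for b, x in zip(beta, xi))) / sigma
        t = math.tanh(z)
        for j in range(k):
            g_beta[j] += xi[j] / sigma * (t - alpha)
        g_sigma += z / sigma * (t - alpha)
        sum_z += z
    g_alpha = -n * math.pi / 2.0 * math.tan(math.pi * alpha / 2.0) + sum_z
    return g_beta + [g_sigma, g_alpha]

y = [0.3, -1.2, 2.1, 0.5, -0.4]
X = [[1.0, 0.2], [1.0, -1.1], [1.0, 0.9], [1.0, 0.4], [1.0, -0.3]]
zeta = [1.0, -0.5, 1.2, 0.4]  # (beta1, beta2, sigma, alpha)

ll_at = lambda t: shs2_loglik(t[:2], t[2], t[3], y, X)
analytic = shs2_score(zeta[:2], zeta[2], zeta[3], y, X)
h = 1e-6
numeric = [(ll_at([v + (h if i == j else 0.0) for i, v in enumerate(zeta)])
            - ll_at([v - (h if i == j else 0.0) for i, v in enumerate(zeta)])) / (2.0 * h)
           for j in range(4)]
max_err = max(abs(a - b) for a, b in zip(analytic, numeric))
```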

To compute the ML estimates of the parameters in this model, we again use the BFGS algorithm.

3.3. Asymptotic properties

Let $X_n=(x_1,\dots,x_n)'$ denote the $n\times k$ design matrix. We assume $X_n$ has rank k. Let $\Theta$, an open subset of $\mathbb{R}^{k+2}$, be the parameter space for the SHS1 model, and let $\Psi$, also an open subset of $\mathbb{R}^{k+2}$, be the parameter space for the SHS2 model. The second derivative matrix of $\ell(\theta)$ can be written as

\[
\ell_n^{(2)}=\begin{bmatrix}
\frac{\partial^2 \ell(\theta)}{\partial\beta\,\partial\beta'} & \frac{\partial^2 \ell(\theta)}{\partial\beta\,\partial\sigma} & \frac{\partial^2 \ell(\theta)}{\partial\beta\,\partial\nu}\\
\frac{\partial^2 \ell(\theta)}{\partial\beta'\,\partial\sigma} & \frac{\partial^2 \ell(\theta)}{\partial\sigma^2} & \frac{\partial^2 \ell(\theta)}{\partial\sigma\,\partial\nu}\\
\frac{\partial^2 \ell(\theta)}{\partial\beta'\,\partial\nu} & \frac{\partial^2 \ell(\theta)}{\partial\sigma\,\partial\nu} & \frac{\partial^2 \ell(\theta)}{\partial\nu^2}
\end{bmatrix},
\]

where its elements are given in the supplementary materials. Similarly, the second derivative matrix of (ζ) (see the supplementary materials) can be written as

\[
\tilde{\ell}_n^{(2)}=\begin{bmatrix}
\frac{\partial^2 \ell(\zeta)}{\partial\beta\,\partial\beta'} & \frac{\partial^2 \ell(\zeta)}{\partial\beta\,\partial\sigma} & \frac{\partial^2 \ell(\zeta)}{\partial\beta\,\partial\alpha}\\
\frac{\partial^2 \ell(\zeta)}{\partial\beta'\,\partial\sigma} & \frac{\partial^2 \ell(\zeta)}{\partial\sigma^2} & \frac{\partial^2 \ell(\zeta)}{\partial\sigma\,\partial\alpha}\\
\frac{\partial^2 \ell(\zeta)}{\partial\beta'\,\partial\alpha} & \frac{\partial^2 \ell(\zeta)}{\partial\sigma\,\partial\alpha} & \frac{\partial^2 \ell(\zeta)}{\partial\alpha^2}
\end{bmatrix}.
\]

Using the SHS1 model for the error term, we obtain the Fisher information matrix as

\[
B_n=-E\left(\ell_n^{(2)}\right)=\begin{bmatrix}
B_n(\beta) & B_n(\beta,\sigma) & B_n(\beta,\nu)\\
B_n(\beta,\sigma) & B_n(\sigma) & B_n(\sigma,\nu)\\
B_n(\beta,\nu) & B_n(\sigma,\nu) & B_n(\nu)
\end{bmatrix}
\]

and for the SHS2 model, we get

\[
\tilde{B}_n=-E\left(\tilde{\ell}_n^{(2)}\right)=\begin{bmatrix}
\tilde{B}_n(\beta) & \tilde{B}_n(\beta,\sigma) & \tilde{B}_n(\beta,\alpha)\\
\tilde{B}_n(\beta,\sigma) & \tilde{B}_n(\sigma) & \tilde{B}_n(\sigma,\alpha)\\
\tilde{B}_n(\beta,\alpha) & \tilde{B}_n(\sigma,\alpha) & \tilde{B}_n(\alpha)
\end{bmatrix}.
\]

See the supplementary materials for the details.

To prove the asymptotic results, we need the following conditions:

  C1. The true parameter vector $\theta_0$, in the SHS1 model, is an interior point of $\Theta$.

  C2. The true parameter vector $\zeta_0$, in the SHS2 model, is an interior point of $\Psi$.

  C3. $n^{-1}\sum_{i=1}^{n} x_{ij}\to c_j$ as $n\to\infty$, for $j=1,\dots,k$, where $c_j$ is a finite real number.

  C4. $n^{-1}X_n' X_n$ converges to a finite and positive definite matrix $C$ as $n\to\infty$.

  C5. $n^{-2}\sum_{i=1}^{n} x_{ij}^2 x_{il}^2\to 0$ as $n\to\infty$, for $j,l=1,\dots,k$.

The following two theorems reveal the consistency and asymptotic normality of the maximum likelihood estimators in our two proposed models.

Theorem 3.1

Given conditions C1, C3, C4, and C5, the maximum likelihood estimator $\hat{\theta}$ of the linear regression model (3) with the SHS1 model for the error term is weakly consistent and

\[
\sqrt{n}\,(\hat{\theta}-\theta_0)\xrightarrow{d} N\left(0,\ B_n^{-1}\right).
\]

Theorem 3.2

Given conditions C2, C3, C4, and C5, the maximum likelihood estimator $\hat{\zeta}$ of the linear regression model (3) with the SHS2 model for the error term is weakly consistent and

\[
\sqrt{n}\,(\hat{\zeta}-\zeta_0)\xrightarrow{d} N\left(0,\ \tilde{B}_n^{-1}\right).
\]

4. Simulation study

We conducted a simulation study to assess and compare the performance of our two proposed regression models with one another and with the skew Student-t (SSt) regression, as a heavy-tailed model, and the skew-normal (SN; [3]) regression model. We simulated the responses from the linear model:

\[
y_i=\beta_1 x_{1i}+\beta_2 x_{2i}+\sigma\epsilon_i,\qquad i=1,\dots,n.
\]

In this model, for errors with a skew-normal distribution, the density of $\epsilon$ is given by

\[
f(\epsilon)=2\phi(\epsilon)\Phi(\alpha\epsilon),
\]

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the density and cumulative distribution functions of the standard normal distribution, respectively. Here, $\alpha$ is the skewness parameter of the density. For errors with a skew Student-t distribution, the density of the error term is given by

\[
f(\epsilon)=\frac{2}{\sigma}\,t\left(\frac{\epsilon}{\sigma};\nu\right)T\left(\alpha\frac{\epsilon}{\sigma}\sqrt{\frac{\nu+1}{\nu+\epsilon^2/\sigma^2}};\ \nu+1\right),
\]

in which $t(\cdot;\nu)$ and $T(\cdot;\nu)$ are the density and cumulative distribution functions of the standard Student-t distribution with $\nu$ degrees of freedom, respectively. As in the skew-normal model, $\alpha$ controls the skewness of the errors.
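Both competitor densities integrate to one for any $\alpha$; a quick numerical check of the skew-normal case (a stdlib-only sketch, with $\Phi$ built from math.erf):

```python
import math

def skew_normal_pdf(e, alpha):
    phi = math.exp(-0.5 * e * e) / math.sqrt(2.0 * math.pi)     # standard normal density
    Phi = 0.5 * (1.0 + math.erf(alpha * e / math.sqrt(2.0)))    # standard normal CDF
    return 2.0 * phi * Phi

def integrate(f, a, b, n=20000):
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

total = integrate(lambda e: skew_normal_pdf(e, 2.0), -10.0, 10.0)  # should be ~1
```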

In our study, we examined the following configuration options:

  • the sample size is n = 100, n = 200, or n = 500.

  • the regression covariates x1 and x2 are generated from the standard normal distribution.

the true values of the regression coefficients are set to $(\beta_1,\beta_2)=(1,2)$. The true value of the scale parameter is $\sigma=1$.

both covariates and the response variable are centered on their means.

Toward developing an understanding of the performance of these models, we considered three simulation scenarios that vary in terms of the true underlying error distribution:

  1. $\epsilon_i\sim \mathrm{SHS1}(0,1,\nu=4)$

  2. $\epsilon_i\sim \mathrm{SHS2}(0,1,\alpha=0.5)$

  3. $\epsilon_i\sim \mathrm{SSt}(0,1,\alpha=2,\nu=4)$.

In the first and second scenarios, the error distribution is set to the SHS distributions so that the observations in the simulated data possess a semi-heavy tail. We chose $\nu=4$ and $\alpha=0.5$ for SHS1 and SHS2, respectively, to have both positive and negative skewness. To evaluate the performance of the models when the errors do follow a heavy-tailed distribution, the third scenario takes the error distribution to be skew Student-t. Specifically, we are interested in assessing the sensitivity of the proposed semi-heavy-tailed distributions to misspecification of the tail properties of the errors.

The above configurations imply a total of nine different model fits, and we simulated and estimated R = 1000 replications of each. We computed the bias and mean squared error (MSE) of the parameter estimates for each model. To ease comparison, we calculated relative versions of these measures, namely the ratio of the measure obtained under a misspecified error model to that obtained under the true model. Results for scenario (1) are summarized in Table 1, those for scenario (2) in Table 2, and those for scenario (3) in Table 3. In each scenario, the pure bias and MSE of the estimates under the true model are reported as well.
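The relative measures are simple ratios; a toy sketch of the computation, with fabricated replicate estimates purely for illustration:

```python
def bias_and_mse(estimates, truth):
    """Monte Carlo bias and MSE of replicate estimates of a scalar parameter."""
    n = len(estimates)
    bias = sum(e - truth for e in estimates) / n
    mse = sum((e - truth) ** 2 for e in estimates) / n
    return bias, mse

# Fabricated replicate estimates of a coefficient whose true value is 1
est_true_model = [1.10, 0.95]  # estimates under the correctly specified model
est_mis_model = [1.30, 0.90]   # estimates under a misspecified model

b0, m0 = bias_and_mse(est_true_model, 1.0)
b1, m1 = bias_and_mse(est_mis_model, 1.0)
rel_bias = b1 / b0  # relative bias of the misspecified model
rel_mse = m1 / m0   # relative MSE of the misspecified model
```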

Table 1. Relative bias and MSE of the parameter estimates under SHS1 distributed errors.

  n 100 200 500
Model   β^1 β^2 β^1 β^2 β^1 β^2
SHS1 Bias 0.0002 −0.0012 −0.0076 −0.0038 −0.0039 −0.0000
  MSE 0.0117 0.0117 0.0052 0.0058 0.0025 0.0023
SHS2 Bias 0.2031 1.1926 1.1041 1.0298 1.0769 4.8573
  MSE 1.0660 1.0659 1.0798 1.0843 1.0803 1.0914
SSt Bias 1.8930 1.2208 0.9862 1.1240 0.9527 8.9018
  MSE 1.0334 1.0236 1.0202 1.0348 1.0035 1.0250
SN Bias −0.3185 0.6290 1.2384 1.9219 0.8533 44.9954
  MSE 1.2076 1.2073 1.2146 1.2215 1.1709 1.2559

Notes: For the SHS1 model, the pure bias and MSE of the estimates are reported.

Table 2. Relative bias and MSE of the parameter estimates under SHS2 distributed errors.

  n 100 200 500
Model   β^1 β^2 β^1 β^2 β^1 β^2
SHS1 Bias 0.9419 0.9442 0.9925 1.0840 0.8363 1.0166
  MSE 0.9596 0.9655 0.9738 0.9664 0.9617 0.9612
SHS2 Bias 0.0038 0.0044 0.0058 0.0023 −0.0010 −0.0022
  MSE 0.0427 0.0410 0.0224 0.0209 0.0079 0.0083
SSt Bias 1.6202 0.8584 1.0212 1.3580 0.2751 1.0326
  MSE 0.9510 0.9504 0.9379 0.9430 0.9335 0.9282
SN Bias 1.4444 0.9297 0.5085 0.9448 0.8340 2.0620
  MSE 1.2020 1.1738 1.1808 1.2000 1.2733 1.1664

Note: For the SHS2 model, the pure bias and MSE of the estimates are reported.

Table 3. Relative bias and MSE of the parameters estimates under skew Student-t distributed errors.

  n 100 200 500
Model   β^1 β^2 β^1 β^2 β^1 β^2
SHS1 Bias 1.1092 1.1489 1.0177 1.0559 −2.4400 1.2461
  MSE 1.0263 1.0112 1.0160 1.0172 1.0454 1.0119
SHS2 Bias 1.1678 1.1670 1.0147 0.9766 −1.7733 1.4539
  MSE 1.0680 1.0496 1.0515 1.0524 1.0722 1.0502
SSt Bias 0.0027 −0.0048 −0.0033 −0.0014 0.0001 −0.0007
  MSE 0.0074 0.0084 0.0041 0.0039 0.0016 0.0016
SN Bias 0.2147 0.9159 1.4184 1.4412 −24.2274 1.9104
  MSE 1.5658 1.5123 1.4132 1.4089 1.5629 1.4771

Note: For the SSt model, the pure bias and MSE of the estimates are reported.

The results show that for all true error distributions, both the bias and the MSE tend to zero as the sample size increases; hence we can conclude the consistency and asymptotic unbiasedness of the estimators of the regression coefficients. Further, for all sample sizes and simulation scenarios, the estimators obtained by assuming the SN model for the errors are inefficient; the loss of efficiency compared to the SSt model is substantial. For scenario (3), when the error model is SSt, the estimators obtained by the true model are uniformly more efficient; however, the differences from the SHS1 and SHS2 models are minor. The same conclusions can be drawn for scenario (1). Therefore, we can conclude that our proposed SHS models are robust to misspecification of a heavy-tailed distribution such as the SSt model when estimating regression parameters.

Figure 4 displays the average bias plus and minus the average standard error (SE) for the regression parameters under each scenario. The results in Figure 4 confirm those obtained in Tables 1–3; i.e. both the average bias and the standard errors tend to zero as the sample size increases. The lengths of the intervals for all scenarios and sample sizes, under all models but the SN model, are almost the same. For the SN model, the interval is the longest.

Figure 4. Average bias and confidence intervals on a standardized scale for different error distributions by sample size and true error distributions: SHS1 (left), SHS2 (middle), and SSt (right). SHS1 (solid line), SHS2 (dashed line), SSt (dotted line), and SN (dotdash line).

For the regression parameters, we also computed empirical coverage rates under different models for the errors. Figure 5 displays the results for all scenarios and sample sizes. The estimated coverage rates are close to the nominal level of 95% for all sample sizes and simulation scenarios, which supports the asymptotic normality of the ML estimators of the regression coefficients.

Figure 5. Coverage rate for each regression parameter for different error distributions by sample size and true error distributions: SHS1 (left), SHS2 (middle), and SSt (right). SHS1 (solid line), SHS2 (dashed line), SSt (dotted line), and SN (dotdash line).

It is also of interest to directly compare the goodness of fit of the models to the observed data. To this end, we estimated two model assessment measures: the AIC and the BIC. We do not compare the absolute values of the AICs or BICs directly, but consider their differences, defined (e.g. for AIC) as

\[
D_{\mathrm{AIC}}(M)=\mathrm{AIC}(M_0)-\mathrm{AIC}(M),
\]

in which $M_0$ is the true model for the error term and M is the misspecified one. Negative values of DAIC/DBIC are in favor of the correct model. Box plots of DAIC values for all simulation scenarios and sample sizes are shown in Figure 6, and box plots of DBIC values are displayed in Figure 7.

Figure 6. Box plots of DAIC values for misspecified error models by sample size and true error distributions: SHS1 (left), SHS2 (middle), and SSt (right).

Figure 7. Box plots of DBIC values for misspecified error models by sample size and true error distributions: SHS1 (left), SHS2 (middle), and SSt (right).

For scenarios (1) and (3), the correct model shows superior performance, especially as the sample size increases. However, for scenario (2), using an incorrectly specified SSt or SHS1 model leads to better performance in model selection based on both AIC and BIC. Also, for all sample sizes and simulation scenarios, the SN model performs poorly.

To some extent, it can be said that when the SSt model is correct, the performance of the incorrectly specified SHS models is approximately equivalent to that of the correct one based on AIC/BIC. Again, this shows the robustness of the proposed SHS models in model selection for heavy-tailed data.

5. Real examples

In this section, we present three real data analyses with (i) acoustic comfort evaluation data, (ii) pricing diamond stones data, and (iii) Martin Marietta Company data. We compared eight different distributions for the error terms, namely the SHS1, SHS2, SN, SSt, GSH, SGHS, BHS, and normal distributions, each with $\mu=0$.

In this paper, to test the goodness of the regression fit under different models for the error terms, we consider a Kolmogorov–Smirnov test based on the empirical distribution of the residuals. Specifically, we assume that the regression model can be written in location-scale form as

\[
Y=\beta' x+\sigma\epsilon
\]

with error distribution $F_{\epsilon}(y)=P\left(\frac{Y-\beta' x}{\sigma}\le y\right)$. If $\hat{\theta}$ denotes the ML estimates of all parameters, including both the regression coefficients and the parameters of the error distribution, then the estimated error distribution under this model is

\[
F_{\epsilon_0}(y)=P\left(\frac{Y-\hat{\beta}' x}{\hat{\sigma}}\le y\right).
\]

Therefore, the linear regression model with a specific distribution for the error terms is correct if and only if the distributions $F_{\epsilon}$ and $F_{\epsilon_0}$ are the same. By this result, a Kolmogorov–Smirnov type test for the error distribution can be constructed by comparing the empirical distribution of the residuals, $\hat{F}_{\epsilon}$, with the one estimated under the assumed distribution, $\hat{F}_{\epsilon_0}$, as follows:

\[
T=\sqrt{n}\,\sup_{y\in\mathbb{R}}\left|\hat{F}_{\epsilon}(y)-\hat{F}_{\epsilon_0}(y)\right|.
\]
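Given residuals and a fitted error CDF, T is straightforward to compute by evaluating the empirical CDF at the order statistics. A sketch (our own function names), exercised here with standard normal errors and $\Phi$ built from math.erf; a badly misspecified scale should inflate T relative to the correct model:

```python
import math
import random

def ks_statistic(residuals, cdf):
    """T = sqrt(n) * sup_y |F_hat(y) - F0_hat(y)|, evaluated at the order statistics."""
    r = sorted(residuals)
    n = len(r)
    d = 0.0
    for i, ri in enumerate(r, start=1):
        f0 = cdf(ri)
        # The sup is attained just before or at each order statistic
        d = max(d, abs(i / n - f0), abs((i - 1) / n - f0))
    return math.sqrt(n) * d

std_normal_cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

random.seed(1)
res = [random.gauss(0.0, 1.0) for _ in range(500)]
t_correct = ks_statistic(res, std_normal_cdf)                   # correct error model
t_wrong = ks_statistic(res, lambda x: std_normal_cdf(x / 3.0))  # misspecified scale
```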

5.1. Acoustic comfort evaluation data

Zhang et al. [37] used a linear regression framework with Gaussian error terms to model acoustic comfort evaluation for engineering vehicles with high environmental noise. This high-level noise pollutes the surrounding environment and endangers drivers' physical and mental health. As Zhang et al. [37] noted, acoustic vehicle comfort will be an essential research topic in the future due to its close connection to daily life. Therefore, it is vital to address how to evaluate the acoustic comfort of engineering vehicles.

Most studies of acoustic comfort focus on acoustic subjective and objective evaluations and their mathematical mapping [10,25,35]. The objective assessment of acoustic comfort is based on psychoacoustic parameters, including loudness, sharpness, and so on. For subjective evaluation, annoyance can be a proper index. Linear regression is a standard model for establishing the mapping between psychoacoustic objective parameters and subjective assessment (Zhang et al. [37]).

The data in this example include an actual application case of 50 noise samples from forklifts. The subjective evaluation index (annoyance) was divided into ten grades, in which noise annoyance was further subdivided into five parts from low to high, each part having two grades, as shown in Table 1 of Zhang et al. [37]. Similar to Zhang et al. [37], we selected nine objective parameters as the objective evaluation indexes: linear sound pressure level (LSPL), A-weighted sound pressure level (ASPL), loudness, sharpness, roughness, fluctuation, tonality, articulation index (AI), and impulsiveness. The details of the data are given in Table 2 of Zhang et al. [37]. These data include one (or even two) extreme observations (Figure 2 in Zhang et al. [37], not reported here). Hence, it may not be appropriate to assume Gaussian errors for the regression model. Accordingly, we considered the different proposed families for the error distribution.

We first performed multiple covariate analyses using the ML approach to select the best subset of objective parameters. Based on these results under all considered error distributions (not shown here), our analysis is centered on the model with only loudness, sharpness, and impulsiveness as covariates. We obtained the ML estimates and their standard errors for the selected covariates and computed the AICs and BICs under all models for the errors. The results are reported in Table 4. They show that the normal model returns substantially different regression coefficient estimates in comparison with the other models. Generally, there is no consistent agreement in the regression coefficient estimates between the given error models, except for our proposed SHS models. All three covariates have a significant positive effect on the response; i.e. higher annoyance is associated with higher values of these covariates simultaneously.

Table 4. ML inferences under different models for errors for forklift data.

  SHS1 SHS2 BHS GSH SGHS SSt SN Normal
Regression parameters:
constant 1.3863 1.4280 1.0222 0.7977 0.1594 1.1855 1.3091 0.0069
se 0.4926 0.5229 0.9308 0.4627 1.5459 0.4416 0.6174 0.6901
loudness 0.0473 0.0479 0.0534 0.0534 0.0477 0.0415 0.0472 0.0474
se 0.0065 0.0062 0.0097 0.0054 0.0126 0.0040 0.0065 0.0084
sharpness 0.7300 0.7046 0.4902 0.5062 0.5282 0.8471 0.7070 0.5585
se 0.2279 0.2225 0.3734 0.1799 0.5466 0.1262 0.2441 0.3026
impulsiveness 2.0577 2.0420 2.0669 1.9013 1.3515 1.5852 1.7040 1.2190
se 0.2916 0.2826 0.2868 0.2854 0.6233 0.3781 0.4005 0.4682
Distributional parameters:
Scale 0.1354 0.1906 0.1301 0.4901 0.3880 0.2985 0.6869 0.5082
se 0.0468 0.0510 0.1176 0.1131 0.0820 0.0914 0.0831 0.0508
Shape1   0.5131 0.6751 2.7291 0.8058 2.0511 2.6029  
se   0.1903 0.8729 0.2777 0.3675 1.1787 0.8832  
Shape2     0.3376   1.0972      
se     0.3327   1.3161      
d.f. 0.3325         0.9563    
se 0.1552         0.2189    
AIC 57.9143 58.2685 60.6665 59.0161 68.5898 71.7423 74.6952 84.1990
BIC 69.3865 69.7406 74.0507 70.4882 81.9739 85.1265 84.1673 93.7591

Note: Bold values indicate the best selected one.

Smaller values of AIC and BIC indicate a more appropriate model. Hence, among the models we considered here, the SHS1 model fits the data best according to both AIC and BIC, with the same complexity as the others. It also fits the data much better than the normal model for the errors. The performance of the SHS2 model is almost equivalent to that of the SHS1 model. Table 5 also shows the results of the Kolmogorov–Smirnov test corresponding to all error models. According to the results, we can verify the goodness of fit of all models, excluding the normal model, when the size of the test is 0.1.

Table 5. The Kolmogorov–Smirnov test for forklift data.

  SHS1 SHS2 BHS GSH SGHS SSt SN Normal
T 0.0577 0.0547 0.0628 0.0584 0.0978 0.1594 0.1427 0.1813
P-value 0.9928 0.9964 0.9822 0.9917 0.6881 0.1411 0.2365 0.0656

Overall, the results of the forklift data analysis provide strong arguments in favor of modeling the tail of the errors as semi-heavy rather than heavy.

5.2. Diamond data

Chu [8] describes the development of a pricing model for diamond stones using data that appeared in an advertisement in Singapore's Business Times edition of 18 February 2000. A total of 308 diamond stones were included in the analysis. This dataset is available in the Ecdat R package [21]. We assumed a linear model with the logarithm of the price of diamond stones, as the response variable, and the covariates, weight (carat), clarity, color, and certification body. The weight of a diamond stone is given in terms of carat units. One carat is equal to 0.2 g.

The last three covariates are categorical. The clarity of a diamond stone is classified in descending order as internally flawless (IF), very very slightly imperfect (VVS1 or VVS2), and very slightly imperfect (VS1 or VS2). The most prized diamonds display color purity; top color purity invites a grade of D, and subsequent degrees of color purity are graded E, F, G, and so on. For the present data, we face six different degrees of color purity. Also, different certification bodies assay diamond stones and provide each with a certificate listing its caratage and its grades of clarity and color. The three certification bodies are GIA, IGI, and HRD [8].

We included the categorical covariates in the regression model as a set of dummy variables. We selected color I as the baseline category for the color purity and compared it to the other five colors. For the clarity, we selected VVS2 as the baseline category. Likewise, for the certification body, the IGI is chosen as the baseline category. According to the results of Chu [8], we also considered an interaction term between carat and certification bodies.

Table 6 shows the ML estimates of the regression coefficients and the parameters of the error distribution, together with their standard errors, under different models for the error terms. Notably, the result from the normal model differs from the rest, exhibiting more inflated standard errors for the estimated coefficients. This inflation leads to different conclusions compared to the other models; for example, under normal errors, the difference between VS1 and VVS2 is not significant, in disagreement with the rest. Under all other error models, every effect is significant, and the signs of the estimates are consistent. Table 6 also reports the corresponding values of AIC and BIC. Both criteria provide strong support for the SHS2 model and indicate that our proposed model is the most appropriate distribution for the error terms. For this example, the performance of the SHS2 model is also almost equivalent to that of the SHS1 model.

Table 6. ML inferences under different models for errors for diamond data.

  SHS1 SHS2 BHS GSH SGHS SSt SN Normal
Regression parameters:
constant 5.9500 5.7900 5.9039 5.8063 5.7959 5.9224 5.9725 5.5477
se 0.0437 0.0320 0.1153 0.0404 0.0453 0.0409 0.0415 0.0934
carat 3.4638 3.8540 3.4344 3.5462 3.6555 3.4782 3.4462 3.7301
se 0.0920 0.0736 0.1680 0.1043 0.0945 0.0942 0.0798 0.1082
D 0.4567 0.4609 0.4284 0.4422 0.4123 0.4588 0.4368 0.5852
se 0.0276 0.0194 0.0391 0.0324 0.0327 0.0272 0.0323 0.0603
E 0.3676 0.3887 0.3513 0.3982 0.4327 0.4353 0.4006 0.4790
se 0.0194 0.0145 0.0430 0.0242 0.0269 0.0234 0.0246 0.0420
F 0.2868 0.3436 0.2624 0.3277 0.3319 0.3490 0.3223 0.4061
se 0.0174 0.0154 0.0355 0.0218 0.0225 0.0199 0.0220 0.0401
G 0.2014 0.2399 0.1776 0.2317 0.1969 0.2500 0.2169 0.3115
se 0.0182 0.0146 0.0361 0.0229 0.0232 0.0211 0.0228 0.0438
H 0.1277 0.1662 0.0722 0.1329 0.1074 0.1521 0.1221 0.2296
se 0.0183 0.0147 0.0311 0.0227 0.0228 0.0197 0.0226 0.0450
IF 0.1707 0.2418 0.1893 0.1680 0.1888 0.2122 0.1857 0.3120
se 0.0194 0.0174 0.0278 0.0230 0.0254 0.0234 0.0243 0.0499
VS1 0.0879 0.0593 0.0886 0.0892 0.1050 0.0825 0.1191 0.0086
se 0.0142 0.0111 0.0237 0.0178 0.0201 0.0160 0.0191 0.0379
VS2 0.1549 0.1455 0.1820 0.1845 0.2437 0.1800 0.1968 0.1102
se 0.0168 0.0137 0.0234 0.0198 0.0256 0.0192 0.0210 0.0375
VVS1 0.0697 0.0824 0.0975 0.0884 0.0506 0.1001 0.0976 0.1715
se 0.0147 0.0116 0.0279 0.0201 0.0195 0.0191 0.0208 0.0357
GIA 0.6251 0.7381 0.4466 0.4919 0.5708 0.5229 0.4971 0.6242
se 0.0562 0.0433 0.1026 0.0454 0.0572 0.0644 0.0528 0.0669
HRD 0.7555 0.8444 0.8357 0.8828 0.8750 0.8587 0.7990 1.2225
se 0.0620 0.0552 0.0753 0.0690 0.0729 0.0751 0.0720 0.1298
carat × GIA 0.9328 1.2494 0.6261 0.7382 0.8297 0.7152 0.6610 0.9127
se 0.1081 0.0896 0.2274 0.1083 0.1209 0.1265 0.0993 0.1160
carat × HRD 1.1175 1.3926 1.1072 1.2052 1.2271 1.1528 1.0701 1.6104
se 0.1080 0.0958 0.1595 0.1218 0.1145 0.1192 0.1079 0.1752
Distributional parameters:
Scale 0.1052 0.0338 0.1301 0.1116 0.1212 0.1631 0.1722 0.1564
se 0.0060 0.0073 0.0734 0.0059 0.0117 0.0123 0.0112 0.0198
Shape1   0.7439 1.0816 3.0670 1.2336 3.0234 2.2712  
se 0.0582 2.2730 1.2692 0.2120 0.5657 0.5393  
Shape2     1.8586   0.6116      
se 0.3920   0.5829      
d.f. 9.3565         5.8957    
se (3.4098)     1.2959    
AIC −471.6554 −493.3932 −421.9113 −462.4386 −430.2067 −448.6855 −459.4364 −322.4285
BIC −408.2437 −429.9815 −354.7695 −399.0269 −363.0649 −381.5437 −398.0247 −262.7469
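The AIC [1] and BIC [29] values in the tables are computed from each model's maximized log-likelihood in the usual way. A minimal sketch, where the log-likelihood values and parameter counts are hypothetical:

```python
import numpy as np

def aic_bic(loglik, k, n):
    """Akaike and Bayesian information criteria from a maximized
    log-likelihood, with k free parameters and n observations."""
    aic = 2 * k - 2 * loglik
    bic = k * np.log(n) - 2 * loglik
    return aic, bic

# Hypothetical comparison of two candidate error models for n = 308 diamonds:
# a 17-parameter model vs. a richer 18-parameter model.
aic1, bic1 = aic_bic(loglik=252.8, k=17, n=308)
aic2, bic2 = aic_bic(loglik=263.7, k=18, n=308)
print(aic1, bic1, aic2, bic2)  # lower values indicate the preferred model
```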

Table 7 shows the results of the Kolmogorov–Smirnov test for the different error models; the goodness of fit is not rejected for any model at the 0.05 level. However, the substantially better fit of the SHS2 model can be seen clearly in Figure 8.

Table 7. The Kolmogorov–Smirnov test for diamond data.

  SHS1 SHS2 BHS GSH SGHS SSt SN Normal
T 0.0725 0.0656 0.0692 0.0744 0.0669 0.0672 0.0726 0.0688
P-value 0.0788 0.1411 0.1044 0.0663 0.1264 0.1239 0.0779 0.1082

Figure 8. Histograms of the residuals and the corresponding estimated densities for different error models for diamond data.

5.3. Martin Marietta company data

Butler et al. [7] analyzed a set of data on the excess rate of return for the Martin Marietta Company. They considered a linear regression of Y, the excess rate of return for the Martin Marietta Company, on CRSP, an index of the excess rate of return for the New York stock exchange, as follows:

Y=β0+β1CRSP+ϵ. (4)

The data consist of 60 monthly observations collected over five years, from January 1982 to December 1986. This dataset has also been analyzed by Azzalini and Capitanio [4], Taylor and Verbyla [33], DiCiccio and Monti [11], and Salazar et al. [28]. The data include one very extreme observation (Figure 9); moreover, according to DiCiccio and Monti [11], some diagnostic measures show that two additional points are possible outliers. This indicates that Gaussian errors may not be appropriate for the regression model.
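The effect of such outliers on ML estimation can be illustrated with simulated data. The sketch below uses Student-t errors as a generic stand-in for a non-Gaussian error model; all numbers are hypothetical, not the Martin Marietta data:

```python
import numpy as np
from scipy import optimize, stats

rng = np.random.default_rng(0)

# Hypothetical data mimicking model (4): a linear trend plus one extreme point.
x = rng.uniform(-0.1, 0.1, size=60)
y = 0.0 + 1.25 * x + rng.normal(scale=0.05, size=60)
y[0] += 0.6  # a single outlier, in the spirit of the Martin Marietta data

# OLS (equivalently, ML under normal errors) is sensitive to the outlier.
b_ols = np.polyfit(x, y, 1)[0]

# ML under Student-t errors (df = 4) limits the outlier's influence.
def negloglik(theta):
    b0, b1, log_s = theta
    return -stats.t.logpdf(y - b0 - b1 * x, df=4, scale=np.exp(log_s)).sum()

res = optimize.minimize(negloglik, x0=[0.0, 1.0, np.log(0.1)],
                        method="Nelder-Mead")
b_t = res.x[1]
print(b_ols, b_t)  # compare the two slope estimates
```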

Figure 9. Scatterplot of the Martin Marietta data and fitted regression lines with different distributions for the error term.

We now proceed to fit the proposed families of error distributions. The ML inferences are reported in Table 8, which shows that the normal model returns substantially different regression coefficient estimates from the other models. As Butler et al. [7] also mention, the estimated regression coefficients produced by the normal method place the Martin Marietta Company in a different risk classification: the normal estimate β^1=1.8024 rates the company as a very aggressive investment relative to the market portfolio, whereas the corresponding estimates for the other models rate it as moderately aggressive. Further, the estimates under the SSt and SN models differ somewhat from those of the semi-heavy-tailed families; the smaller values of β^1 in our proposed semi-heavy-tailed models reinforce classifying the company as moderately aggressive.

Table 8. ML inferences under different models for errors for the Martin Marietta Company data.

  SHS1 SHS2 BHS GSH SGHS SN SSt Normal
Regression parameters:
Intercept −0.0275 −0.0460 −0.0492 −0.0073 −0.0075 −0.0933 −0.0755 0.0011
  (0.0125) (0.0167) (0.0332) (0.0087) (0.0089) (0.0125) (0.0207) (0.0087)
Slope 1.2482 1.2643 1.2515 1.2456 1.2459 1.3790 1.3391 1.8024
  (0.1937) (0.1979) (0.2035) (0.2029) (0.1942) (0.2415) (0.2087) (0.2015)
Distributional parameters:
Scale 0.0313 0.0400 0.0368 0.0911 0.0849 0.1362 0.0937 0.0928
  (0.0076) (0.0072) (0.0232) (0.0180) (0.0155) (0.0293) (0.0042) (0.0085)
Shape1   0.4386 1.2657 −2.5607 0.9019 3.9154 2.4587  
    (0.1555) (1.3370) (0.3520) (0.0752) (1.4049) (1.4575)  
Shape2     0.5108   −2.4090      
      (0.3731)   (0.4474)      
d.f. 0.4320           6.0660  
  (0.1488)           (5.4508)  
AIC −135.1765 −135.3089 −133.176 −132.1232 −124.0341 −134.0729 −124.0341 −108.9898
BIC −126.7991 −126.9316 −122.7043 −123.7458 −115.6567 −123.6012 −115.6567 −102.7068

For the models considered here, the SHS2 model fits the data best according to both AIC and BIC, and much better than the normal model for the errors. As in the two previous examples, the performance of the SHS1 model is almost equivalent to that of the SHS2 model.

Figure 9 shows the fitted regression lines under the assumed distributions for the error term in model (4). For the fitted normal model, the regression line is pulled upward by the extreme observations. The semi-heavy-tailed models, in contrast, limit the influence of the extreme observations and provide a robust alternative when a few outliers are present. On the other hand, because of their heavy-tail properties, both the SN and SSt models give the high-leverage observations even less influence than the semi-heavy-tailed models do, which pulls their regression lines further downward. Therefore, a heavy-tail assumption for the distribution of the errors results in a poor fit for these data.

For these data, the Kolmogorov–Smirnov test (Table 9) likewise verifies the goodness of the regression fit under all error models.

Table 9. The Kolmogorov–Smirnov test for Martin Marietta data.

  SHS1 SHS2 BHS GSH SGHS SSt SN Normal
T 0.0864 0.0852 0.0897 0.0761 0.0759 0.0768 0.1386 0.1474
P-value 0.7289 0.7436 0.6861 0.8513 0.8534 0.8445 0.1816 0.1334

6. Conclusions

Accounting for the semi-heavy-tailedness of data in the presence of a few outliers can provide more efficient inferences in real applications where the data do not possess a heavy tail. We developed a regression methodology based on semi-heavy-tailed errors in a linear model under two new skewed versions of the hyperbolic secant distribution. We fitted our models by ML, which is nowadays the standard approach. We also examined the asymptotic properties of the ML estimators of the regression parameters in our proposed methodology, including consistency and asymptotic normality.

This article demonstrates the utility of the proposed methodology in applied situations where the data tails are not too heavy, using simulated data as well as data from real applications. Our suggested models are also robust against actual heavy-tailedness of the data.

Future work will include extensions of the models to multivariate and spatial data. An extension to multivariate spatial data would involve not only the spatial dependence of each variable but also the spatial dependence between each pair of variables.

Supplementary Material

Supplemental Material

Acknowledgments

The authors wish to thank two referees and the associate editor for their valuable comments and suggestions that improved the previous version of this paper.

Note

1

Install the package from GitHub with library(devtools); install_git("git://github.com/jamilownuk/shs.git").

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1.Akaike H., A new look at the statistical model identification, in Selected Papers of Hirotugu Akaike, Parzen, Emanuel, Tanabe, Kunio, and Kitagawa, Genshiro, eds., Springer, New York, 1974, pp. 215–222
  • 2.Arellano-Valle R.B. and Azzalini A., The centred parameterization and related quantities of the skew-t distribution, J. Multivar. Anal. 113 (2013), pp. 73–90. doi: 10.1016/j.jmva.2011.05.016 [DOI] [Google Scholar]
  • 3.Azzalini A., A class of distributions which includes the normal ones, Scand. J. Stat. 12 (1985), pp. 171–178. [Google Scholar]
  • 4.Azzalini A. and Capitanio A., Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution, J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 65 (2003), pp. 367–389. doi: 10.1111/1467-9868.00391 [DOI] [Google Scholar]
  • 5.Bartolucci F. and Scaccia L., The use of mixtures for dealing with non-normal regression errors, Comput. Stat. Data. Anal. 48 (2005), pp. 821–834. doi: 10.1016/j.csda.2004.04.005 [DOI] [Google Scholar]
  • 6.Broyden C.G., The convergence of a class of double-rank minimization algorithms 1. General considerations, IMA J. Appl. Math. 6 (1970), pp. 76–90. doi: 10.1093/imamat/6.1.76 [DOI] [Google Scholar]
  • 7.Butler R.J., McDonald J.B., Nelson R.D., and White S.B., Robust and partially adaptive estimation of regression models, Rev. Econ. Stat. 72 (1990), pp. 321–327. doi: 10.2307/2109722 [DOI] [Google Scholar]
  • 8.Chu S., Pricing the C's of diamond stones, J. Stat. Educ. 9 (2001), pp. 1–12. 10.1080/10691898.2001.11910659. [DOI] [Google Scholar]
  • 9.da Silva N.B., Prates M.O., and Gonçalves F.B., Bayesian linear regression models with flexible error distributions, arXiv preprint arXiv:1711.04376.
  • 10.Di G.Q., Chen X.W., Song K., Zhou B., and Pei C.M., Improvement of Zwicker's psychoacoustic annoyance model aiming at tonal noises, Appl. Acoust. 105 (2016), pp. 164–170. doi: 10.1016/j.apacoust.2015.12.006 [DOI] [Google Scholar]
  • 11.DiCiccio T.J. and Monti A.C., Inferential aspects of the skew exponential power distribution, J. Am. Stat. Assoc. 99 (2004), pp. 439–450. doi: 10.1198/016214504000000359 [DOI] [Google Scholar]
  • 12.Fernandez C. and Steel M.F., On Bayesian modeling of fat tails and skewness, J. Am. Stat. Assoc. 93 (1998), pp. 359–371. [Google Scholar]
  • 13.Ferreira C.S. and Lachos V.H., Nonlinear regression models under skew scale mixtures of normal distributions, Stat. Methodol. 33 (2016), pp. 131–146. doi: 10.1016/j.stamet.2016.08.004 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Ferreira J.T.S. and Steel M.F.J., A constructive representation of univariate skewed distributions, J. Am. Stat. Assoc. 101 (2006), pp. 823–829. doi: 10.1198/016214505000001212 [DOI] [Google Scholar]
  • 15.Fischer M.J., Generalized Hyperbolic Secant Distributions: with Applications to Finance, Springer Science & Business Media, Heidelberg, 2013. [Google Scholar]
  • 16.Fischer M.J. and Vaughan D., Classes of skew generalized hyperbolic secant distributions, Diskussionspapiere, Friedrich-Alexander-Universität Erlangen-Nürnberg, Lehrstuhl für Statistik und Ökonometrie, 2002
  • 17.Fischer M.J. and Vaughan D., The beta-hyperbolic secant distribution, Austrian J. Stat. 39 (2010), pp. 245–258. doi: 10.17713/ajs.v39i3.247 [DOI] [Google Scholar]
  • 18.Fisher R.A., On the'probable error'of a coefficient of correlation deduced from a small sample, Metron 1 (1921), pp. 1–32. [Google Scholar]
  • 19.Fletcher R., A new approach to variable metric algorithms, Comput. J. 13 (1970), pp. 317–322. doi: 10.1093/comjnl/13.3.317 [DOI] [Google Scholar]
  • 20.Goldfarb D., A family of variable-metric methods derived by variational means, Math. Comput. 24 (1970), pp. 23–26. doi: 10.1090/S0025-5718-1970-0258249-6 [DOI] [Google Scholar]
  • 21.Graves S., Ecdat: Data Sets for Econometrics, R package (2019) version 0.3-3.
  • 22.Jones M. and Faddy M., A skew extension of the t distribution, with applications, J. R. Stat. Soc.: Ser. B (Stat. Methodol.) 65 (2003), pp. 159–174. doi: 10.1111/1467-9868.00378 [DOI] [Google Scholar]
  • 23.Lange K.L., Little R.J., and Taylor J.M., Robust statistical modeling using the t-distribution, J. Am. Stat. Assoc. 84 (1989), pp. 881–896. [Google Scholar]
  • 24.Ma Y. and Genton M.G., Flexible class of skew-symmetric distributions, Scand. J. Stat. 31 (2004), pp. 459–468. doi: 10.1111/j.1467-9469.2004.03_007.x [DOI] [Google Scholar]
  • 25.Ma C., Ma C., Li Q., Liu Q., Wang D., Gau J., Tang H., and Sun Y., Sound quality evaluation of noise of hub permanent-magnet synchronous motors for electric vehicles, IEEE Trans. Ind. Electron. 63 (2016), pp. 5663–5673. doi: 10.1109/TIE.2016.2569067 [DOI] [Google Scholar]
  • 26.Omey E., Van Gulck S., and Vesilo R., Semi-heavy tails, Lith. Math. J. 58 (2018), pp. 480–499. doi: 10.1007/s10986-018-9417-0 [DOI] [Google Scholar]
  • 27.R Core Team , R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2019. Available at http://www.R-project.org/. [Google Scholar]
  • 28.Salazar E., Ferreira M.A., and Migon H.S., Objective Bayesian analysis for exponential power regression models, Sankhya B 74 (2012), pp. 107–125. doi: 10.1007/s13571-012-0045-0 [DOI] [Google Scholar]
  • 29.Schwarz G., Estimating the dimension of a model, Ann. Stat. 6 (1978), pp. 461–464. doi: 10.1214/aos/1176344136 [DOI] [Google Scholar]
  • 30.Shanno D.F., Conditioning of quasi-Newton methods for function minimization, Math. Comput. 24 (1970), pp. 647–656. doi: 10.1090/S0025-5718-1970-0274029-X [DOI] [Google Scholar]
  • 31.Soffritti G. and Galimberti G., Multivariate linear regression with non-normal errors: a solution based on mixture models, Stat. Comput. 21 (2011), pp. 523–536. doi: 10.1007/s11222-010-9190-3 [DOI] [Google Scholar]
  • 32.Takeuchi I., Bengio Y., and Kanamori T., Robust regression with asymmetric heavy-tail noise distributions, Neural Comput. 14 (2002), pp. 2469–2496. doi: 10.1162/08997660260293300 [DOI] [PubMed] [Google Scholar]
  • 33.Taylor J. and Verbyla A., Joint modelling of location and scale parameters of the t distribution, Stat. Modell., 4 (2004), pp. 91–112. doi: 10.1191/1471082X04st068oa [DOI] [Google Scholar]
  • 34.Vaughan D.C., The generalized secant hyperbolic distribution and its properties, Commun. Stat. Theory Methods 31 (2002), pp. 219–238. doi: 10.1081/STA-120002647 [DOI] [Google Scholar]
  • 35.Xu Z.M., Xia X.J., He Y.S., and Zhang F.Z., Analysis and evaluation of car engine starting sound quality, J. Vib. Shock 11 (2014), pp. 142–147. [Google Scholar]
  • 36.Zeller C.B., Cabral C.R., and Lachos V.H., Robust mixture regression modeling based on scale mixtures of skew-normal distributions, Test 25 (2016), pp. 375–396. doi: 10.1007/s11749-015-0460-4 [DOI] [Google Scholar]
  • 37.Zhang E., Zhang Q., Xiao J., Hou L., and Guo T., Acoustic comfort evaluation modeling and improvement test of a forklift based on rank score comparison and multiple linear regression, Appl. Acoust. 135 (2018), pp. 29–36. doi: 10.1016/j.apacoust.2018.01.026 [DOI] [Google Scholar]
