Journal of Applied Statistics. 2020 Dec 3;49(5):1305–1322. doi: 10.1080/02664763.2020.1854203

A Bayesian approach on the two-piece scale mixtures of normal homoscedastic nonlinear regression models

Zahra Barkhordar a, Mohsen Maleki b, Zahra Khodadadi a, Darren Wraith c, Farajollah Negahdari d
PMCID: PMC9041943  PMID: 35707508

Abstract

In this application note, we propose and examine the performance of a Bayesian approach for a homoscedastic nonlinear regression (NLR) model assuming errors with two-piece scale mixtures of normal (TP-SMN) distributions. The TP-SMN is a large family of distributions, covering both symmetrical/asymmetrical distributions as well as light/heavy tailed distributions, and provides an alternative to another well-known family of distributions, called scale mixtures of skew-normal distributions. The proposed family and Bayesian approach provide considerable flexibility and advantages for NLR modelling in different practical settings. We examine the performance of the approach using simulated and real data.

KEYWORDS: Gibbs sampling, MCMC method, nonlinear regression model, scale mixtures of normal family, two-piece distributions

1. Introduction

Non-linear regression (NLR) models are commonly used to model data in a range of applications including engineering, biology, geology and finance, to name just a few. However, a common assumption is that the errors are normally distributed, which limits the modelling of the asymmetrical and/or outlying data that are relatively common in practice.

Various extensions of non-linear regression models that allow for asymmetrical error distributions have recently been proposed for different fields and phenomena (see, e.g. [9,33,8,4,34,35,24,29]). These asymmetric NLR models involve various distributional classes, some symmetric/asymmetric and others light/heavy tailed. Montenegro et al. [24], Cancho et al. [5] and Labra et al. [14] are notable and highly cited works which introduced, and showed the advantages of, NLR models based on the flexible scale mixtures of skew-normal (SMSN) family, called SMSN-NLR models. The SMSN family [3] is a flexible class of distributions which is a skewed extension of the well-known symmetric scale mixtures of normal (SMN) family [2], and covers light/heavy-tailed and symmetric/asymmetric distributions such as the skew-normal (SN), skew-t (ST), skew-slash (SSL) and skew contaminated-normal (SCN) distributions. Additionally, this family has been widely considered in many statistical models (see [17,19,23,36,11] and references therein).

Another skewed extension of the SMN family, based on two-piece distributions (constructed from symmetrical distributions; [1]) with different scales, was introduced by Maleki and Mahmoudi [20] and called the two-piece scale mixtures of normal (TP-SMN) family. The TP-SMN family is an analogue of, and competitor to, the well-known SMSN family, and contains light/heavy-tailed and symmetric/asymmetric members including the two-piece normal (TP-N), two-piece t (TP-T), two-piece slash (TP-SL) and two-piece contaminated-normal (TP-CN) distributions. Maleki et al. [15] and Hoseinzadeh et al. [12] proposed and examined the performance of the TP-SMN family in the context of NLR models (TP-SMN-NLR) using an EM-type algorithm to obtain maximum-likelihood estimates of the parameters. However, in the context of non-linear regression models, these types of approaches often have computational difficulties in high dimensions and are often not suitable for noisy data unless they can include some form of regularization on the parameter space.

In this paper, we extend the previous work to consider a Bayesian approach for the TP-SMN-NLR models which aims to provide more flexibility and advantages in different settings compared to previous approaches. In the extension, we are able to derive a suitable hierarchical representation for this particular approach (different from the hierarchical representation in [15]) which is computationally attractive and facilitates the use of Markov Chain Monte Carlo (MCMC) methods for finding Bayes estimates using the Gibbs sampling scheme.

The rest of this paper is organized as follows. In Section 2, some important properties of the TP-SMN family are reviewed and a suitable stochastic representation is proposed for use in the Bayesian approach. In Section 3, the TP-SMN-NLR models are introduced and a hierarchical representation of the model is outlined. In Section 4, details of the MCMC approach using Gibbs sampling are outlined, including the appropriate choice of prior information. The performance of the approach is examined in Section 5 using simulation studies and several challenging real examples used in the NLR literature. Some final conclusions are provided in Section 6.

2. Some properties of the TP-SMN family

In this section, we present some properties of the TP-SMN distributions which have been used to implement the Bayesian analysis of the non-linear regression model (some of these properties have also been outlined in [15]).

The well-known SMN family introduced by Andrews and Mallows [2], denoted by $X\sim\mathrm{SMN}(\mu,\sigma,\boldsymbol{\nu})$, is the basis of the asymmetric TP-SMN family [20] and has the following probability density function (pdf) and stochastic representation, respectively given by

$$f_{\mathrm{SMN}}(x\mid\mu,\sigma,\boldsymbol{\nu})=\int_0^{\infty}\phi(x\mid\mu,u^{-1}\sigma^2)\,dH(u\mid\boldsymbol{\nu}),\quad x\in\mathbb{R}, \tag{1}$$

$$X=\mu+\sigma U^{-1/2}W, \tag{2}$$

where $\phi(\cdot\mid\mu,\sigma^2)$ represents the density of the $N(\mu,\sigma^2)$ distribution, $H(\cdot\mid\boldsymbol{\nu})$ is the cumulative distribution function (cdf) of the scale mixing random variable $U$, which can be indexed by a scalar or vector of parameters $\boldsymbol{\nu}$, and $W\sim N(0,1)$ is independent of $U$.

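The TP-SMN is a rich family of distributions that covers the asymmetric light-tailed TP-N (which is also called the Epsilon-Skew-Normal; [26,18]), the asymmetric heavy-tailed TP-T, TP-SL and TP-CN distributions, and their corresponding symmetric members. In terms of the density, for $y\in\mathbb{R}$, this family can be represented as

$$g(y\mid\mu,\sigma,\gamma,\boldsymbol{\nu})=2(1-\gamma)f_{\mathrm{SMN}}(y\mid\mu,\sigma(1-\gamma),\boldsymbol{\nu})\,I(y\le\mu)+2\gamma f_{\mathrm{SMN}}(y\mid\mu,\sigma\gamma,\boldsymbol{\nu})\,I(y>\mu), \tag{3}$$

where $0<\gamma<1$ is the slant parameter, $f_{\mathrm{SMN}}(\cdot\mid\mu,\sigma,\boldsymbol{\nu})$ is given by (1), and it is denoted by $Y\sim\text{TP-SMN}(\mu,\sigma,\gamma,\boldsymbol{\nu})$ with

$$E(Y)=\mu-b\sigma(1-2\gamma)\quad\text{and}\quad \mathrm{Var}(Y)=\sigma^2\left[c_2k_2(\boldsymbol{\nu})-b^2c_1^2\right],$$

for which $b=\sqrt{2/\pi}\,k_1(\boldsymbol{\nu})$, $c_r=\gamma^{r+1}+(-1)^r(1-\gamma)^{r+1}$ and $k_r(\boldsymbol{\nu})=E(U^{-r/2})$, and $U$ is the scale mixing random variable in (2).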
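As a quick numerical companion to the moment formulas above, the following minimal sketch evaluates $E(Y)$ and $\mathrm{Var}(Y)$ from $\gamma$, $k_1$ and $k_2$. The function names are illustrative and not taken from the paper. The closed form used for $k_r$ in the TP-T case, $k_r=(\nu/2)^{r/2}\Gamma((\nu-r)/2)/\Gamma(\nu/2)$ for $\nu>r$, is the standard negative moment of a Gamma($\nu/2$, rate $\nu/2$) variable and is stated here as an assumption of the sketch.

```python
import math


def c_r(gamma, r):
    """c_r = gamma^(r+1) + (-1)^r * (1 - gamma)^(r+1)."""
    return gamma ** (r + 1) + (-1) ** r * (1.0 - gamma) ** (r + 1)


def k_r_student(r, nu):
    """k_r = E(U^(-r/2)) for U ~ Gamma(nu/2, rate nu/2); finite only if nu > r."""
    if nu <= r:
        raise ValueError("k_r is finite only for nu > r")
    log_ratio = math.lgamma((nu - r) / 2.0) - math.lgamma(nu / 2.0)
    return (nu / 2.0) ** (r / 2.0) * math.exp(log_ratio)


def tp_smn_moments(mu, sigma, gamma, k1, k2):
    """Mean and variance of a TP-SMN variable given k_1 and k_2."""
    b = math.sqrt(2.0 / math.pi) * k1
    mean = mu - b * sigma * (1.0 - 2.0 * gamma)
    var = sigma ** 2 * (c_r(gamma, 2) * k2 - b ** 2 * c_r(gamma, 1) ** 2)
    return mean, var


# TP-N uses k1 = k2 = 1 because U = 1 with probability one.
print(tp_smn_moments(0.0, 1.0, 0.25, 1.0, 1.0))
# TP-T with nu = 6.
print(tp_smn_moments(0.0, 1.0, 0.25, k_r_student(1, 6.0), k_r_student(2, 6.0)))
```

With $\gamma=0.5$ the two pieces share the scale $\sigma/2$, so the mean reduces to $\mu$ and the TP-N variance to $\sigma^2/4$, which is a convenient sanity check.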

There are a range of different TP-SMN member distributions which are possible from (3) using different distributions for the scale mixing random variable U in (2), as follows:

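  • Two-Piece Normal (TP-N): $U=1$ with probability one,

  • Two-Piece t (TP-T): $U\sim\mathrm{Gamma}(\nu/2,\nu/2)$, i.e. $\boldsymbol{\nu}=\nu$,

  • Two-Piece Slash (TP-SL): $U\sim\mathrm{Beta}(\nu,1)$, i.e. $\boldsymbol{\nu}=\nu$,

  • Two-Piece Contaminated Normal (TP-CN): $h(u\mid\boldsymbol{\nu})=\nu I(u=\tau)+(1-\nu)I(u=1)$, i.e. $\boldsymbol{\nu}=(\nu,\tau)$.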
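The four mixing choices above can be turned into a simple sampler by combining (2) with the two-piece density (3): with probability $\gamma$ draw from the right half with scale $\sigma\gamma$, otherwise from the left half with scale $\sigma(1-\gamma)$. The sketch below is an illustration under that reading and is not code from the paper; the function names and the string labels for the family are hypothetical, and Gamma($\nu/2,\nu/2$) is taken in the shape/rate parameterization.

```python
import math
import random


def draw_u(family, rng, nu=None, tau=None):
    """Draw the scale mixing variable U for one TP-SMN member."""
    if family == "TP-N":
        return 1.0
    if family == "TP-T":
        # gammavariate takes (shape, scale); rate nu/2 means scale 2/nu.
        return rng.gammavariate(nu / 2.0, 2.0 / nu)
    if family == "TP-SL":
        return rng.betavariate(nu, 1.0)
    if family == "TP-CN":
        return tau if rng.random() < nu else 1.0
    raise ValueError("unknown family: " + family)


def draw_tp_smn(mu, sigma, gamma, family, rng, nu=None, tau=None):
    """Draw one value from TP-SMN(mu, sigma, gamma, nu) via the two-piece mixture."""
    u = draw_u(family, rng, nu=nu, tau=tau)
    half = abs(rng.gauss(0.0, 1.0)) / math.sqrt(u)  # half-SMN magnitude
    if rng.random() < gamma:
        return mu + sigma * gamma * half           # right piece, mass gamma
    return mu - sigma * (1.0 - gamma) * half       # left piece, mass 1 - gamma


rng = random.Random(2020)
sample = [draw_tp_smn(0.0, 1.0, 0.25, "TP-T", rng, nu=6.0) for _ in range(5)]
print(sample)
```

A small $\gamma$ puts most of the mass, and the larger scale, on the left piece, which is why $\gamma=0.05$ is described later as strong skewness.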

For more details of stochastic representations, statistical inferences and applications of the two-pieces distributions, especially TP-SMN family, see Arellano-Valle et al. [1] and Maleki and Mahmoudi [20], Hoseinzadeh et al. [13,7], Moravveji et al. [25], Maleki et al. [15,16], Maleki et al. [21,22] and Ghasami et al. [10].

By using an auxiliary (latent) variable Z in terms of the components of the mixture (3), the TP-SMN random variable can have the following stochastic representation

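$$Y\mid Z=z\sim \mathrm{SMN}(\mu,\sigma|z|,\boldsymbol{\nu})\,I_A(y)^{\frac{-z+|z|}{2|z|}}\,I_{A^c}(y)^{\frac{z+|z|}{2|z|}}, \tag{4}$$

where $A=(-\infty,\mu]$ and $\mathrm{SMN}(\cdot)I_A(\cdot)$ denotes the truncated SMN distribution on the interval $A$, and $Z$ has a probability mass function (pmf),

$$P(Z=z)=\gamma^{\frac{z+|z|}{2z}}(1-\gamma)^{\frac{z-|z|}{2z}};\quad z=-(1-\gamma),\ \gamma. \tag{5}$$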

Note that this distribution for Z is different to the distribution used in Maleki et al. [15] to obtain ML estimates.

3. TP-SMN nonlinear regression

In this section, we introduce the TP-SMN nonlinear regression (TP-SMN-NLR) model and outline the hierarchical representation which will be needed for the Bayesian approach in Section 4.

3.1. TP-SMN-NLR models

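The TP-SMN non-linear regression model is defined by

$$Y_i=\eta(\boldsymbol{\beta},\mathbf{x}_i)+\varepsilon_i,\quad i=1,\ldots,n, \tag{6}$$

where $\boldsymbol{\beta}=(\beta_1,\ldots,\beta_p)^{\top}$ is a vector of nonlinear regression parameters, $Y_i$ is a response variable, $\eta(\cdot)$ is an injective and twice continuously differentiable function with respect to the parameter vector $\boldsymbol{\beta}$, $\mathbf{x}_i=(x_{i1},\ldots,x_{ip})^{\top}$ is a vector of fixed explanatory variables for subject $i$, and the random errors are $\varepsilon_i\sim\text{TP-SMN}(\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu})$, where $\mu_\varepsilon=b\sigma(1-2\gamma)$ and $b=\sqrt{2/\pi}\,k_1(\boldsymbol{\nu})$. Following properties of the TP-SMN family, we have $E(\varepsilon_i)=0$ and

$$Y_i\sim\text{TP-SMN}(\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu}),\quad i=1,\ldots,n, \tag{7}$$

with vector of parameters $\boldsymbol{\theta}=(\boldsymbol{\beta},\sigma,\gamma,\boldsymbol{\nu})$. Note that $E(Y_i)=\eta(\boldsymbol{\beta},\mathbf{x}_i)$ and $Y_i$ has the pdf given by

$$f_{Y_i}(y\mid\boldsymbol{\theta})=g(y\mid\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu}), \tag{8}$$

where $g(\cdot\mid\cdot)$ is the TP-SMN density given in (3).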
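To make the observed-data density in (8) concrete, the sketch below evaluates the two-piece density (3) and sums its logarithm over the data, with the location shifted by $\mu_\varepsilon=b\sigma(1-2\gamma)$ so that the errors have zero mean. It is a minimal illustration and not the authors' implementation: the helper names are hypothetical, the SMN kernel is passed in as a function of `(x, loc, scale)`, and the Student-t form of the TP-T kernel relies on the standard fact that a Gamma($\nu/2,\nu/2$) scale mixture of normals is a Student-t distribution.

```python
import math


def normal_pdf(x, loc, scale):
    """SMN kernel for TP-N (U = 1)."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def student_pdf(x, loc, scale, nu):
    """SMN kernel for TP-T: Student-t with nu degrees of freedom."""
    z = (x - loc) / scale
    log_c = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(nu * math.pi) - math.log(scale))
    return math.exp(log_c - 0.5 * (nu + 1.0) * math.log1p(z * z / nu))


def tp_smn_pdf(y, mu, sigma, gamma, smn_pdf):
    """Two-piece density (3): scale sigma*(1-gamma) left of mu, sigma*gamma right."""
    if y <= mu:
        return 2.0 * (1.0 - gamma) * smn_pdf(y, mu, sigma * (1.0 - gamma))
    return 2.0 * gamma * smn_pdf(y, mu, sigma * gamma)


def nlr_loglik(ys, xs, beta, sigma, gamma, eta, smn_pdf, k1):
    """Observed-data log-likelihood built from (8); k1 = E(U^(-1/2))."""
    mu_eps = math.sqrt(2.0 / math.pi) * k1 * sigma * (1.0 - 2.0 * gamma)
    return sum(
        math.log(tp_smn_pdf(y, eta(beta, x) + mu_eps, sigma, gamma, smn_pdf))
        for y, x in zip(ys, xs)
    )


# Example with a hypothetical linear-in-x mean function and TP-N errors (k1 = 1).
eta = lambda beta, x: beta[0] * x
print(nlr_loglik([0.1, -0.2], [1.0, 2.0], [0.05], 1.0, 0.4, eta, normal_pdf, 1.0))
```

For a TP-T fit the kernel can be wrapped as `lambda x, loc, scale: student_pdf(x, loc, scale, nu)`, with `k1` computed from $\nu$.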

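From Equations (6)–(7) and the stochastic representation in (4), the proposed TP-SMN-NLR model can be written in a convenient hierarchical form as

$$\begin{aligned}
Y_i\mid U_i=u_i,Z_i=z_i &\overset{\text{ind.}}{\sim} N\!\left(\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\ z_i^2u_i^{-1}\sigma^2\right) I_{A_i}(y_i)^{\frac{-z_i+|z_i|}{2|z_i|}}\,I_{A_i^c}(y_i)^{\frac{z_i+|z_i|}{2|z_i|}},\\
U_i\mid Z_i=z_i &\overset{\text{ind.}}{\sim} H(u_i\mid\boldsymbol{\nu}),\\
P(Z_i=z_i\mid\gamma) &= \gamma^{\frac{z_i+|z_i|}{2|z_i|}}(1-\gamma)^{\frac{-z_i+|z_i|}{2|z_i|}};\quad z_i=-(1-\gamma),\ \gamma,
\end{aligned} \tag{9}$$

for $i=1,\ldots,n$, where $A_i=(-\infty,\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon]$ and $N(\cdot)I_A(\cdot)$ denotes the truncated normal distribution on the interval $A$. Note that the hierarchical representation (9) is different from the hierarchical representation in Maleki et al. [15] and facilitates a Bayesian approach to obtain posteriors using Gibbs sampling.

To define the complete log-likelihood for use in Gibbs sampling, let $D=(W,U,Z)$ denote the complete data, where $U=(U_1,\ldots,U_n)$ and $Z=(Z_1,\ldots,Z_n)$ are the missing parts of the data and $W=(W_1,\ldots,W_n)$, with $W_i=(\mathbf{x}_i,Y_i)$, $i=1,\ldots,n$, is the observed part of the data from the TP-SMN-NLR model with vector of parameters $\boldsymbol{\theta}=(\boldsymbol{\beta},\sigma,\gamma,\boldsymbol{\nu})$. Considering the hierarchical representation (9), the completed (augmented) likelihood function is given by

$$L_c(\boldsymbol{\theta}\mid D)=\prod_{i=1}^{n}\phi\!\left(y_i\mid\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\ z_i^2u_i^{-1}\sigma^2\right) I_{A_i}(y_i)^{\frac{-z_i+|z_i|}{2|z_i|}}\,I_{A_i^c}(y_i)^{\frac{z_i+|z_i|}{2|z_i|}}\,h(u_i\mid\boldsymbol{\nu})\,P(Z_i=z_i\mid\gamma), \tag{10}$$

where $A_i=(-\infty,\mu_\varepsilon+\eta(\boldsymbol{\beta},\mathbf{x}_i)]$, $i=1,\ldots,n$.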

4. Bayesian approach of TP-SMN-NLR models

In this section, we develop a Bayesian approach for the TP-SMN-NLR models defined by (6)–(7). Some standard model selection criteria are also presented.

4.1. Prior distributions

To complete the Bayesian specification of the TP-SMN-NLR models we need to consider prior distributions for all the unknown parameters $\boldsymbol{\beta}$, $\sigma$, $\gamma$ and $\boldsymbol{\nu}$. We assign conjugate but weakly informative prior distributions to the parameters, because it is assumed that no prior information from historical data or from previous experiments exists. Also, to guarantee proper posteriors, proper priors with known hyper-parameters are adopted. The following prior distributions are specified:

$$\pi(\boldsymbol{\beta})=\phi_p(\boldsymbol{\beta}\mid\boldsymbol{\mu}_\beta,\Sigma_\beta),\quad\text{where } \Sigma_\beta=\mathrm{diag}(\sigma_{\beta_1}^2,\ldots,\sigma_{\beta_p}^2);$$

i.e. normal prior distributions for the elements of $\boldsymbol{\beta}$. Also because of conjugacy, we consider the following priors for the scale and shape parameters

$$\sigma^{-2}\sim\mathrm{Gamma}\!\left(\frac{\rho}{2},\frac{\varrho}{2}\right),\qquad \gamma\sim\mathrm{Beta}(\alpha,\beta).$$

The prior distribution assigned to $\boldsymbol{\nu}$ varies with the particular TP-SMN distribution as follows:

  1. TP-T distribution: $\boldsymbol{\nu}=\nu\sim\mathrm{Exp}(\varsigma/2)\,I_{(2,\infty)}$ with mean $2/\varsigma$ before truncation. The truncation to the interval $(2,\infty)$ was chosen to ensure a finite variance for the errors.

  2. TP-SL distribution: $\boldsymbol{\nu}=\nu\sim\mathrm{Gamma}(a,b)$ with small positive values $a$ and $b$ ($b\ll a$).

  3. TP-CN distribution: non-informative and independent $U(0,1)$ prior distributions are adopted for each component of $\boldsymbol{\nu}=(\nu,\tau)$.

The values assigned in our methodology to the hyper-parameters guarantee the propriety of the posterior distributions, and they are chosen in order to have weakly informative prior distributions. Note that more informative prior distributions could be used in the case of noisy data, particularly on the parameters in the error term (σ,γ,ν). In high dimensional applications there is also interest in regularization on the regression coefficients (β) and so more informative priors could be used for this purpose. However, different types of prior distributions and an examination of their performance are outside the scope of this paper.

4.2. Posterior distributions

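Assuming a joint prior distribution for $\boldsymbol{\theta}$ given by $\pi(\boldsymbol{\theta})=\pi(\boldsymbol{\beta})\pi(\sigma)\pi(\gamma)\pi(\boldsymbol{\nu})$, the joint posterior distribution of $\boldsymbol{\theta}$ with the unobserved variables is $\pi(\boldsymbol{\theta},U,Z\mid W)\propto L(\boldsymbol{\theta}\mid W,U,Z)\,\pi(\boldsymbol{\theta})$, where $L(\boldsymbol{\theta}\mid W,U,Z)$ is given in (10). Since this posterior distribution is not analytically tractable, MCMC methods such as the Gibbs sampler are used to draw samples from the full conditional distributions given by

$$\sigma^{-2}\mid\boldsymbol{\beta},\gamma,\boldsymbol{\nu},W,U,Z\sim\mathrm{Gamma}\!\left(\frac{n+\rho}{2},\ \frac{\varrho+\sum_{i=1}^{n}u_i\left(y_i-\eta(\boldsymbol{\beta},\mathbf{x}_i)-\mu_\varepsilon\right)^2}{2}\right),$$

$$\gamma\mid\boldsymbol{\beta},\sigma,\boldsymbol{\nu},W,U,Z\sim\mathrm{Beta}(\alpha+n^{+},\ \beta+n^{-}),$$

where $n^{+}$ is the number of positive latent allocation variables $z_i$, $i=1,\ldots,n$, and $n^{-}=n-n^{+}$.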
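The Beta update for $\gamma$ is a plain conjugate step: count how many latent allocations are positive and add the counts to the prior parameters. The sketch below shows only that step, with hypothetical names, and assumes the allocations $z_i$ are available as a list of signed numbers from the current Gibbs iteration.

```python
import random


def update_gamma(z, alpha, beta, rng):
    """Draw gamma from Beta(alpha + n_plus, beta + n_minus) given allocations z."""
    n_plus = sum(1 for zi in z if zi > 0)   # allocations on the right piece
    n_minus = len(z) - n_plus               # allocations on the left piece
    return rng.betavariate(alpha + n_plus, beta + n_minus)


rng = random.Random(7)
z_current = [0.3] * 30 + [-0.7] * 70        # illustrative allocations only
print(update_gamma(z_current, 1.0, 1.0, rng))
```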

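$$\pi(\boldsymbol{\beta}\mid\sigma,\gamma,\boldsymbol{\nu},W,U,Z)\propto\phi_p(\boldsymbol{\beta}\mid\boldsymbol{\mu}_\beta,\Sigma_\beta)\prod_{i=1}^{n}\phi\!\left(y_i\mid\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\ z_i^2u_i^{-1}\sigma^2\right).$$

Note that the full conditional posterior for the non-linear regression coefficient $\boldsymbol{\beta}$ cannot be sampled from directly, so a Metropolis-Hastings algorithm within the Gibbs iterations is used [6]. To generate a sample of $\boldsymbol{\beta}$ at the $k$th iteration of the chain, we generate a candidate $\boldsymbol{\beta}_{MH}$ from $N_p(\boldsymbol{\beta}^{(k)},V)$ and a random number $U\sim\mathrm{Uniform}(0,1)$, then set the new value $\boldsymbol{\beta}^{(k+1)}$ to $\boldsymbol{\beta}_{MH}$ if

$$U<\frac{\phi_p(\boldsymbol{\beta}_{MH}\mid\boldsymbol{\mu}_\beta,\Sigma_\beta)\prod_{i=1}^{n}\phi\!\left(y_i\mid\eta(\boldsymbol{\beta}_{MH},\mathbf{x}_i)+\mu_\varepsilon^{(k)},\ z_i^{2(k)}u_i^{-1(k)}\sigma^{2(k)}\right)}{\phi_p(\boldsymbol{\beta}^{(k)}\mid\boldsymbol{\mu}_\beta,\Sigma_\beta)\prod_{i=1}^{n}\phi\!\left(y_i\mid\eta(\boldsymbol{\beta}^{(k)},\mathbf{x}_i)+\mu_\varepsilon^{(k)},\ z_i^{2(k)}u_i^{-1(k)}\sigma^{2(k)}\right)}$$

and to $\boldsymbol{\beta}^{(k)}$ otherwise.

The matrix $V$ is a proposal covariance matrix that can be adapted as the chain progresses, or after generating an initial MCMC chain, to achieve efficient convergence.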
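The accept/reject rule for $\boldsymbol{\beta}$ can be sketched as a generic random-walk Metropolis step. The version below is a simplified illustration, not the authors' code: it assumes a diagonal proposal covariance $V=\mathrm{diag}(s_1^2,\ldots,s_p^2)$ passed as `step_sd`, works on the log scale for numerical stability, and takes the log of the unnormalised full conditional as a user-supplied function `log_target` (a hypothetical name).

```python
import math
import random


def rw_metropolis_step(beta, log_target, step_sd, rng):
    """One random-walk Metropolis update; returns (new_beta, accepted)."""
    proposal = [b + rng.gauss(0.0, s) for b, s in zip(beta, step_sd)]
    log_ratio = log_target(proposal) - log_target(beta)
    # Accept with probability min(1, exp(log_ratio)).
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return beta, False


# Toy check on a standard normal target, standing in for the beta conditional.
rng = random.Random(11)
log_target = lambda b: -0.5 * b[0] ** 2
beta, chain = [3.0], []
for _ in range(2000):
    beta, _ = rw_metropolis_step(beta, log_target, [1.0], rng)
    chain.append(beta[0])
print(sum(chain) / len(chain))
```

In the model above, `log_target` would return the log prior $\log\phi_p(\boldsymbol{\beta}\mid\boldsymbol{\mu}_\beta,\Sigma_\beta)$ plus the sum of the log normal terms, evaluated at the current values of $\sigma$, $\gamma$, $u_i$ and $z_i$.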

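To complete the sampling scheme via MCMC methods, we need to determine the posterior distributions for the latent allocation variables $Z_i$ and the latent scale variables $U_i$ for $i=1,\ldots,n$, and for the additional parameter $\boldsymbol{\nu}$, which depends on the specific member of the TP-SMN family. To do this, we have that

$$\pi(Z_i=z_i\mid\boldsymbol{\theta},U;W)=\frac{A_i}{A_i+B_i}I(z_i=-(1-\gamma))+\frac{B_i}{A_i+B_i}I(z_i=\gamma),$$

where $A_i=\gamma\,\phi\!\left(y_i\mid\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\ u_i^{-1}\gamma^2\sigma^2\right)$ and $B_i=(1-\gamma)\,\phi\!\left(y_i\mid\eta(\boldsymbol{\beta},\mathbf{x}_i)+\mu_\varepsilon,\ u_i^{-1}(1-\gamma)^2\sigma^2\right)$.

Then, by defining $c_i=\left(y_i-\eta(\boldsymbol{\beta},\mathbf{x}_i)-\mu_\varepsilon\right)^2/(z_i^2\sigma^2)$, we obtain for $i=1,\ldots,n$ that:

  • TP-T-NLR:

$$U_i\mid\boldsymbol{\theta},W,Z\sim\mathrm{Gamma}\!\left(\frac{\nu+2}{2},\frac{\nu+c_i}{2}\right),$$

$$\pi(\nu\mid\boldsymbol{\beta},\sigma,\gamma,Z,U;W)\propto\left(2^{\nu/2}\,\Gamma\!\left(\frac{\nu}{2}\right)\right)^{-n}\times\mathrm{Gamma}\!\left(\frac{n\nu}{2}-1,\ \frac{1}{2}\left[\sum_{i=1}^{n}(U_i-\log U_i)+\varsigma\right]\right)I_{(2,\infty)}(\nu). \tag{11}$$

Note that (11) does not have a closed form, but a Metropolis-Hastings algorithm or rejection sampling steps can be embedded in the MCMC scheme to obtain draws for $\nu$.

  • TP-SL-NLR:

$$U_i\mid\boldsymbol{\theta},W,Z\sim\mathrm{Gamma}\!\left(\nu+1,\frac{c_i}{2}\right)I_{(0,1)}(U_i),$$

$$\nu\mid\boldsymbol{\beta},\sigma,\gamma,Z,U,W\sim\mathrm{Gamma}\!\left(n+a,\ b-\sum_{i=1}^{n}\log U_i\right).$$

  • TP-CN-NLR:

$$\pi(u_i\mid\boldsymbol{\theta},Z;W)=\frac{A_i}{A_i+B_i}I(u_i=\tau)+\frac{B_i}{A_i+B_i}I(u_i=1),$$

where $A_i=\nu\gamma\exp(-\tau c_i/2)$, $B_i=(1-\nu)\exp(-c_i/2)$, and

$$\nu\mid\boldsymbol{\beta},\sigma,\gamma,\tau,Z,U,W\sim\mathrm{Beta}\!\left(\frac{n-\sum_{i=1}^{n}U_i}{1-\tau}+1,\ \frac{\sum_{i=1}^{n}U_i-n\tau}{1-\tau}+1\right),$$

$$\pi(\tau\mid\boldsymbol{\beta},\sigma,\gamma,\nu,Z,U;W)\propto\nu^{\frac{n-\sum_{i=1}^{n}U_i}{1-\tau}}(1-\nu)^{\frac{\sum_{i=1}^{n}U_i-n\tau}{1-\tau}}. \tag{12}$$

Note that a Metropolis-Hastings algorithm can be embedded in the MCMC scheme to obtain draws for $\tau$ in (12), which is described in Rosa et al. [30].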

4.3. Model selection criteria

Let $\boldsymbol{\theta}_1,\ldots,\boldsymbol{\theta}_M$ be a sample of size $M$ from the posterior $\pi(\boldsymbol{\theta}\mid D)$ after a suitable burn-in period for the chain. A commonly used model selection criterion is the deviance information criterion (DIC). The posterior mean of the deviance can be estimated by $\bar{D}=\sum_{m=1}^{M}D(\boldsymbol{\theta}_m)/M$, where $D(\boldsymbol{\theta})=-2\sum_{i=1}^{n}\log f_{Y_i}(y_i\mid\boldsymbol{\theta})$ and $f_{Y_i}(y_i\mid\boldsymbol{\theta})$ is defined in (8). By utilizing the estimates from the MCMC chain, the DIC can be estimated by $\widehat{DIC}=2\bar{D}-\hat{D}$, where $\hat{D}=D(\bar{\boldsymbol{\theta}})$ and $\bar{\boldsymbol{\theta}}=\sum_{m=1}^{M}\boldsymbol{\theta}_m/M$. Other model selection criteria based on the posterior mean of the deviance $\bar{D}$, which penalize for the number of parameters in the model, can also be used, such as the expected Akaike information criterion (EAIC) and the expected Bayesian information criterion (EBIC). These criteria can be estimated by $\widehat{EAIC}=\bar{D}+2k$ and $\widehat{EBIC}=\bar{D}+k\log n$, respectively, where $k$ is the number of unknown parameters of the TP-SMN-NLR model and $n$ is the sample size.
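The three criteria are simple functions of the deviance values along the chain. The following sketch, with hypothetical names, assumes the deviance $D(\boldsymbol{\theta}_m)$ has already been evaluated at each retained draw and that $D(\bar{\boldsymbol{\theta}})$ has been evaluated at the posterior mean.

```python
import math


def selection_criteria(deviance_draws, deviance_at_mean, k, n):
    """Estimate DIC, EAIC and EBIC from post burn-in deviance values."""
    d_bar = sum(deviance_draws) / len(deviance_draws)  # posterior mean deviance
    return {
        "DIC": 2.0 * d_bar - deviance_at_mean,
        "EAIC": d_bar + 2.0 * k,
        "EBIC": d_bar + k * math.log(n),
    }


# Illustrative numbers only, not results from the paper.
print(selection_criteria([10.0, 12.0, 14.0], 11.0, k=3, n=100))
```

Smaller values indicate a preferred model for all three criteria.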

5. Numerical studies

The general performance of the TP-SMN-NLR models has been shown in Maleki et al. [15] with simulations and some well-known real datasets, so in this section simulation studies and real examples are presented to evaluate the performance of the Bayesian estimates of the proposed TP-SMN-NLR models. Results are compared with the well-known scale mixtures of skew-normal non-linear regression (SMSN-NLR) models.

For the simulation study, we evaluate the performance of the Bayesian estimates based on the posterior mean for various TP-SMN-NLR models in three situations: weakly, moderately and strongly skewed errors. All simulations use the following weakly informative prior distributions: $\boldsymbol{\beta}\sim N_p(\mathbf{0},10^3I_p)$, $\gamma\sim\mathrm{Beta}(0,1)$ and $\sigma^{-2}\sim\mathrm{Gamma}(0.01,0.01)$, with $\nu\sim\mathrm{Exp}(0.1)\,I_{(2,\infty)}$ for the TP-T-NLR model, $\nu\sim\mathrm{Gamma}(0.01,0.01)$ for the TP-SL-NLR model, and $\nu\sim U(0,1)$ independent of $\tau\sim U(0,1)$ for the TP-CN-NLR model. For each generated data set, a Gibbs sampling run of 45,000 iterations with a burn-in of 15,000 cycles is used (to eliminate the effect of the initial values and to avoid correlation problems), and the chains are monitored for convergence. Initial values for the NLR model parameters were obtained via the least squares (LS) method, and those for the TP-SMN distribution parameters via the method of moments (MM) in Maleki and Mahmoudi [20] applied to the LS residuals. The algorithms were implemented in the R software [31], version 3.6.1, on a Core i7 760 2.8 GHz processor.

5.1. Simulations

In this part, we simulate from a nonlinear regression based on the logistic model given as

$$Y_i=\frac{\beta_1}{1+\beta_2\exp(-\beta_3x_i)}+\varepsilon_i,\quad \varepsilon_i\overset{\text{iid}}{\sim}\text{TP-SMN}(\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu});\quad i=1,\ldots,n, \tag{13}$$

with $\beta_1=10$, $\beta_2=1$, $\beta_3=0.2$, $\sigma=1$, where $\nu=6$ for the TP-T and TP-SL cases and $\nu=0.5$, $\tau=0.5$ for the TP-CN case, each with 400 Monte Carlo generated data sets. The variables $x_i$, $i=1,\ldots,n$, are generated from a standard normal distribution and these values are held fixed throughout the simulations.
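One data set of the kind described above could be generated as in the sketch below. It is only an illustration for the TP-N case: it assumes the logistic mean has a negative sign inside the exponential, $\beta_1/(1+\beta_2\exp(-\beta_3x))$, and it centres the errors by $\mu_\varepsilon=\sqrt{2/\pi}\,\sigma(1-2\gamma)$ so that $E(\varepsilon_i)=0$. All function names are hypothetical.

```python
import math
import random


def logistic_mean(beta, x):
    """Assumed logistic growth curve beta1 / (1 + beta2 * exp(-beta3 * x))."""
    b1, b2, b3 = beta
    return b1 / (1.0 + b2 * math.exp(-b3 * x))


def draw_tp_n_error(sigma, gamma, rng):
    """Zero-mean TP-N error: two half-normals joined at mu_eps."""
    mu_eps = math.sqrt(2.0 / math.pi) * sigma * (1.0 - 2.0 * gamma)
    half = abs(rng.gauss(0.0, 1.0))
    if rng.random() < gamma:
        return mu_eps + sigma * gamma * half
    return mu_eps - sigma * (1.0 - gamma) * half


def simulate_dataset(n, beta, sigma, gamma, rng):
    """Draw fixed-design x values and responses from the logistic TP-N-NLR model."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [logistic_mean(beta, x) + draw_tp_n_error(sigma, gamma, rng) for x in xs]
    return xs, ys


rng = random.Random(13)
xs, ys = simulate_dataset(25, (10.0, 1.0, 0.2), 1.0, 0.25, rng)
print(xs[:3], ys[:3])
```

In a full replication the `xs` would be generated once and reused across Monte Carlo data sets, as the text specifies.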

We conduct two experiments: Experiment 1, on parameter recovery, and Experiment 2, on the performance of model selection criteria. In Experiment 1, we consider different sample sizes ($n=25$, 50, 100 and 300) to verify that the proposed estimation method recovers the true parameter values accurately. In Experiment 2, we compare the ability of some classic model selection criteria to select the appropriate model among different models, including the SMSN family.

5.1.1. Experiment 1

For the first simulation study, we consider $\gamma=0.05$, 0.25, 0.45 (respectively, strong, moderate and weak skewness) and $n=25$, 50, 100 and 300. In each of these settings, the arithmetic average of the 100 Bayesian estimates (replications), given by

$$\text{MC-Mean}(\theta)=\sum_{j=1}^{100}\hat{\theta}_j/100,$$

and the empirical mean squared error, given by

$$\text{MC-MSE}(\theta)=\sum_{j=1}^{100}(\theta_j-\hat{\theta}_j)^2/100,$$

(MC indicates the arithmetic average of the respective criterion) are obtained for the model parameters, where $\hat{\theta}_j$ is the estimated value of $\theta_j$ for $j=1,\ldots,100$, obtained in the $j$th sample from the posterior mean. The results from the different fitted TP-SMN-NLR models are shown in Tables 1–3 and show the good performance of the proposed Bayesian estimates of the parameters for various sample sizes. In particular, the parameter estimates are all relatively close to the true parameter values, with reasonable accuracy across the Monte Carlo datasets.
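The two summaries reported in Tables 1–3 can be computed from the replicated posterior means as follows. This is a minimal sketch with hypothetical names; it assumes a single true value shared by all replications, which is the situation in this simulation design.

```python
def mc_mean(estimates):
    """Arithmetic average of the Bayesian estimates across replications."""
    return sum(estimates) / len(estimates)


def mc_mse(estimates, true_value):
    """Empirical mean squared error around the true parameter value."""
    return sum((true_value - e) ** 2 for e in estimates) / len(estimates)


# Illustrative estimates only, not values from the paper.
gamma_hats = [0.04, 0.06, 0.05, 0.07]
print(mc_mean(gamma_hats), mc_mse(gamma_hats, 0.05))
```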

Table 1.

The arithmetic average of Bayesian estimates and empirical mean squared error of logistic TP-SMN-NLR models with strong skewness $\gamma=0.05$.

| n | Par. (true) | TP-N-NLR MC-Mean | TP-N-NLR MC-MSE | TP-T-NLR MC-Mean | TP-T-NLR MC-MSE | TP-SL-NLR MC-Mean | TP-SL-NLR MC-MSE | TP-CN-NLR MC-Mean | TP-CN-NLR MC-MSE |
|---|---|---|---|---|---|---|---|---|---|
| 25 | β1 (10) | 10.311 | 0.769 | 9.746 | 0.874 | 9.642 | 0.911 | 9.597 | 0.964 |
| | β2 (1) | 1.262 | 0.665 | 1.271 | 0.623 | 1.232 | 0.644 | 1.228 | 0.677 |
| | β3 (0.2) | 0.227 | 0.110 | 0.181 | 0.103 | 0.222 | 0.125 | 0.213 | 0.123 |
| | σ (1) | 1.298 | 0.311 | 1.302 | 0.312 | 1.278 | 0.315 | 1.307 | 0.345 |
| | γ (0.05) | 0.039 | 0.028 | 0.032 | 0.075 | 0.035 | 0.033 | 0.034 | 0.027 |
| | ν | | | 6.643 | 0.594 | 5.203 | 0.583 | 0.410 | 0.232 |
| | τ | | | | | | | 0.602 | 0.311 |
| 50 | β1 (10) | 10.265 | 0.713 | 10.338 | 0.862 | 9.569 | 0.869 | 10.358 | 0.883 |
| | β2 (1) | 1.238 | 0.627 | 1.235 | 0.618 | 1.223 | 0.621 | 1.218 | 0.637 |
| | β3 (0.2) | 0.178 | 0.102 | 0.215 | 0.098 | 0.184 | 0.119 | 0.211 | 0.127 |
| | σ (1) | 1.223 | 0.270 | 1.247 | 0.291 | 1.254 | 0.288 | 1.269 | 0.258 |
| | γ (0.05) | 0.056 | 0.016 | 0.040 | 0.023 | 0.061 | 0.020 | 0.062 | 0.029 |
| | ν | | | 5.711 | 0.507 | 6.305 | 0.497 | 0.550 | 0.221 |
| | τ | | | | | | | 0.567 | 0.247 |
| 100 | β1 (10) | 9.781 | 0.492 | 10.265 | 0.487 | 10.198 | 0.413 | 10.222 | 0.413 |
| | β2 (1) | 1.118 | 0.517 | 1.133 | 0.504 | 1.124 | 0.534 | 1.129 | 0.540 |
| | β3 (0.2) | 0.208 | 0.088 | 0.193 | 0.079 | 0.210 | 0.092 | 0.207 | 0.090 |
| | σ (1) | 1.127 | 0.242 | 1.120 | 0.246 | 1.131 | 0.251 | 1.121 | 0.228 |
| | γ (0.05) | 0.052 | 0.009 | 0.053 | 0.010 | 0.054 | 0.011 | 0.055 | 0.013 |
| | ν | | | 6.051 | 0.445 | 6.080 | 0.431 | 0.529 | 0.170 |
| | τ | | | | | | | 0.520 | 0.146 |
| 300 | β1 (10) | 10.118 | 0.401 | 10.132 | 0.424 | 10.157 | 0.401 | 9.879 | 0.397 |
| | β2 (1) | 1.087 | 0.496 | 1.092 | 0.500 | 1.102 | 0.497 | 1.100 | 0.489 |
| | β3 (0.2) | 0.194 | 0.069 | 0.206 | 0.077 | 0.195 | 0.085 | 0.203 | 0.081 |
| | σ (1) | 1.118 | 0.229 | 1.101 | 0.211 | 1.110 | 0.207 | 1.104 | 0.201 |
| | γ (0.05) | 0.049 | 0.010 | 0.052 | 0.008 | 0.053 | 0.009 | 0.051 | 0.011 |
| | ν | | | 5.881 | 0.421 | 6.072 | 0.449 | 0.487 | 0.156 |
| | τ | | | | | | | 0.513 | 0.142 |

Table 2.

The arithmetic average of Bayesian estimates and empirical mean squared error of logistic TP-SMN-NLR models with moderate skewness $\gamma=0.25$.

| n | Par. (true) | TP-N-NLR MC-Mean | TP-N-NLR MC-MSE | TP-T-NLR MC-Mean | TP-T-NLR MC-MSE | TP-SL-NLR MC-Mean | TP-SL-NLR MC-MSE | TP-CN-NLR MC-Mean | TP-CN-NLR MC-MSE |
|---|---|---|---|---|---|---|---|---|---|
| 25 | β1 (10) | 10.298 | 0.752 | 10.312 | 0.913 | 9.512 | 0.904 | 10.300 | 0.872 |
| | β2 (1) | 1.274 | 0.692 | 1.269 | 0.652 | 1.301 | 0.684 | 1.251 | 0.690 |
| | β3 (0.2) | 0.171 | 0.118 | 0.219 | 0.112 | 0.220 | 0.125 | 0.218 | 0.131 |
| | σ (1) | 1.312 | 0.341 | 1.287 | 0.322 | 0.731 | 0.307 | 1.296 | 0.312 |
| | γ (0.25) | 0.234 | 0.072 | 0.263 | 0.068 | 0.261 | 0.061 | 0.235 | 0.061 |
| | ν | | | 5.842 | 0.499 | 6.181 | 0.511 | 0.527 | 0.312 |
| | τ | | | | | | | 0.532 | 0.299 |
| 50 | β1 (10) | 10.253 | 0.720 | 9.732 | 0.807 | 10.398 | 0.766 | 10.247 | 0.754 |
| | β2 (1) | 1.248 | 0.646 | 1.243 | 0.642 | 1.250 | 0.651 | 1.244 | 0.660 |
| | β3 (0.2) | 0.217 | 0.108 | 0.182 | 0.106 | 0.216 | 0.111 | 0.218 | 0.120 |
| | σ (1) | 0.793 | 0.293 | 1.230 | 0.291 | 1.262 | 0.275 | 1.239 | 0.281 |
| | γ (0.25) | 0.260 | 0.062 | 0.264 | 0.058 | 0.262 | 0.059 | 0.260 | 0.051 |
| | ν | | | 6.147 | 0.461 | 5.888 | 0.498 | 0.471 | 0.300 |
| | τ | | | | | | | 0.525 | 0.321 |
| 100 | β1 (10) | 9.781 | 0.501 | 10.212 | 0.489 | 10.263 | 0.465 | 10.220 | 0.492 |
| | β2 (1) | 1.156 | 0.476 | 1.174 | 0.511 | 1.143 | 0.532 | 1.169 | 0.498 |
| | β3 (0.2) | 0.208 | 0.091 | 0.209 | 0.086 | 0.191 | 0.090 | 0.189 | 0.095 |
| | σ (1) | 1.118 | 0.234 | 1.125 | 0.217 | 1.108 | 0.205 | 0.897 | 0.199 |
| | γ (0.25) | 0.256 | 0.049 | 0.255 | 0.051 | 0.234 | 0.060 | 0.232 | 0.055 |
| | ν | | | 5.893 | 0.406 | 6.077 | 0.422 | 0.483 | 0.247 |
| | τ | | | | | | | 0.513 | 0.253 |
| 300 | β1 (10) | 10.142 | 0.417 | 10.122 | 0.421 | 10.178 | 0.431 | 9.854 | 0.418 |
| | β2 (1) | 1.137 | 0.440 | 1.129 | 0.452 | 1.130 | 0.456 | 1.119 | 0.437 |
| | β3 (0.2) | 0.193 | 0.087 | 0.192 | 0.084 | 0.206 | 0.088 | 0.201 | 0.091 |
| | σ (1) | 1.107 | 0.226 | 0.928 | 0.213 | 1.089 | 0.198 | 1.084 | 0.190 |
| | γ (0.25) | 0.248 | 0.047 | 0.253 | 0.046 | 0.256 | 0.050 | 0.253 | 0.055 |
| | ν | | | 6.043 | 0.387 | 6.059 | 0.391 | 0.491 | 0.239 |
| | τ | | | | | | | 0.506 | 0.248 |

Table 3.

The arithmetic average of Bayesian estimates and empirical mean squared error of logistic TP-SMN-NLR models with weak skewness $\gamma=0.45$.

| n | Par. (true) | TP-N-NLR MC-Mean | TP-N-NLR MC-MSE | TP-T-NLR MC-Mean | TP-T-NLR MC-MSE | TP-SL-NLR MC-Mean | TP-SL-NLR MC-MSE | TP-CN-NLR MC-Mean | TP-CN-NLR MC-MSE |
|---|---|---|---|---|---|---|---|---|---|
| 25 | β1 (10) | 9.497 | 0.797 | 10.334 | 0.927 | 9.496 | 0.909 | 10.421 | 0.915 |
| | β2 (1) | 1.286 | 0.712 | 1.274 | 0.735 | 1.292 | 0.737 | 1.282 | 0.747 |
| | β3 (0.2) | 0.224 | 0.123 | 0.223 | 0.119 | 0.179 | 0.120 | 0.181 | 0.126 |
| | σ (1) | 0.674 | 0.350 | 1.311 | 0.347 | 1.303 | 0.356 | 0.694 | 0.338 |
| | γ (0.45) | 0.433 | 0.071 | 0.468 | 0.069 | 0.475 | 0.071 | 0.474 | 0.079 |
| | ν | | | 6.223 | 0.521 | 6.308 | 0.530 | 0.537 | 0.351 |
| | τ | | | | | | | 0.466 | 0.347 |
| 50 | β1 (10) | 10.404 | 0.701 | 10.323 | 0.727 | 9.611 | 0.720 | 10.412 | 0.711 |
| | β2 (1) | 1.251 | 0.676 | 1.255 | 0.685 | 1.241 | 0.655 | 1.227 | 0.690 |
| | β3 (0.2) | 0.178 | 0.119 | 0.217 | 0.120 | 0.218 | 0.117 | 0.219 | 0.122 |
| | σ (1) | 1.239 | 0.287 | 0.680 | 0.292 | 1.265 | 0.301 | 1.264 | 0.287 |
| | γ (0.45) | 0.438 | 0.069 | 0.467 | 0.066 | 0.465 | 0.071 | 0.435 | 0.073 |
| | ν | | | 5.798 | 0.518 | 6.288 | 0.512 | 0.531 | 0.342 |
| | τ | | | | | | | 0.540 | 0.356 |
| 100 | β1 (10) | 10.358 | 0.677 | 9.680 | 0.700 | 10.338 | 0.692 | 10.361 | 0.685 |
| | β2 (1) | 1.174 | 0.501 | 1.191 | 0.524 | 1.165 | 0.513 | 1.160 | 0.560 |
| | β3 (0.2) | 0.211 | 0.100 | 0.189 | 0.096 | 0.209 | 0.099 | 0.210 | 0.097 |
| | σ (1) | 1.131 | 0.227 | 0.880 | 0.225 | 1.134 | 0.219 | 1.116 | 0.217 |
| | γ (0.45) | 0.461 | 0.057 | 0.442 | 0.049 | 0.458 | 0.042 | 0.456 | 0.041 |
| | ν | | | 6.137 | 0.474 | 6.103 | 0.460 | 0.482 | 0.280 |
| | τ | | | | | | | 0.523 | 0.293 |
| 300 | β1 (10) | 10.201 | 0.523 | 10.280 | 0.603 | 10.256 | 0.598 | 10.240 | 0.549 |
| | β2 (1) | 1.156 | 0.478 | 1.161 | 0.463 | 1.155 | 0.440 | 1.129 | 0.479 |
| | β3 (0.2) | 0.208 | 0.081 | 0.209 | 0.086 | 0.195 | 0.080 | 0.207 | 0.099 |
| | σ (1) | 0.934 | 0.213 | 0.926 | 0.217 | 1.109 | 0.211 | 1.100 | 0.211 |
| | γ (0.45) | 0.445 | 0.054 | 0.453 | 0.046 | 0.458 | 0.042 | 0.453 | 0.045 |
| | ν | | | 5.870 | 0.449 | 6.089 | 0.437 | 0.518 | 0.265 |
| | τ | | | | | | | 0.488 | 0.259 |

5.1.2. Experiment 2

The second simulation study is devoted to the NLR model (13) with two samples of size 200, in each of which the errors $\varepsilon_i$, $i=1,\ldots,n$, are an i.i.d. sequence of skew-t variables with zero mean, $\sigma=1$ and $\nu=6$, with weak skewness $\lambda=0.01$ (heavy tailed, weakly skewed) in one sample and strong skewness $\lambda=4$ (heavy tailed, strongly skewed) in the other. The Bayesian model comparison criterion EAIC is presented for the proposed TP-SMN-NLR models in Figure 1. It confirms the expected result that the asymmetric heavy-tailed TP-SMN-NLR members (i.e. the TP-T-NLR, TP-SL-NLR and TP-CN-NLR models), especially TP-T-NLR, fit better than the light-tailed TP-N-NLR member.

Figure 1.

EAIC criterion of fitted TP-SMN-NLR models for weakly (Left) and strongly (Right) skewed ST-NLR samples.

5.2. Real datasets

In this section, we examine the performance of the Bayesian fitted TP-SMN-NLR models compared to the Bayesian fitted well-known class of SMSN-NLR models examined in Cancho et al. [5] for two real datasets. The proposed Bayesian model selection criteria show that the TP-T-NLR model is more suitable for both datasets than the other TP-SMN-NLR models, and also performs better than the well-known SMSN-NLR counterparts.

5.2.1. One dimensional predictor data

The first dataset is the result of a NIST study involving circular interference transmittance. This dataset, called ‘Eckerle4’, has $n=35$ observations with a response variable ($y$: transmittance) and a predictor variable ($x$: wavelength). It is available at https://www.itl.nist.gov/div898/strd/nls/data/LINKS/DATA/Eckerle4.dat and in the ‘NISTnls’ R package, and is considered in [32] (R., NIST (197?)). We assume the following NLR model:

$$Y_i=\frac{\beta_1}{\beta_2}\exp\!\left(-\frac{(x_i-\beta_3)^2}{2\beta_2^2}\right)+\varepsilon_i,\quad \varepsilon_i\overset{\text{iid}}{\sim}\text{TP-SMN}(\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu});\quad i=1,\ldots,35.$$

The results of the various Bayesian fitted TP-SMN-NLR models and corresponding counterparts in the SMSN-NLR models are provided in Tables 4 and 5.

Table 4.

Bayesian estimates of the TP-SMN-NLR and SMSN-NLR models parameters for the ‘Eckerle4' dataset.

| Par. | TP-N | SN | TP-T | ST | TP-SL | SSL | TP-CN | SCN |
|---|---|---|---|---|---|---|---|---|
| β1 | 1.5535 | 1.5673 | 1.5412 | 1.5419 | 1.5452 | 1.5361 | 1.5548 | 1.5523 |
| β2 | 4.0761 | 4.0947 | 4.1661 | 4.1076 | 4.1245 | 4.1481 | 4.0729 | 4.0884 |
| β3 | 451.5638 | 451.5763 | 451.5040 | 451.5020 | 451.5254 | 451.5032 | 451.5090 | 451.5031 |
| σ | 1.2e-02 | 1.0e-04 | 54e-04 | 4.2e-06 | 7.6e-03 | 8.6e-06 | 8.2e-04 | 5.7e-07 |
| (γ,λ) | 0.3994 | -2.6411 | 0.4726 | 0.0492 | 0.5302 | -0.6222 | 0.5211 | -0.1486 |
| ν | | | 1.2354 | 0.9371 | 1.2077 | 0.9032 | (0.6,0.02) | (0.6,0.01) |

Table 5.

Bayesian model selection criteria for the TP-SMN-NLR and SMSN-NLR models parameters of the ‘Eckerle4’ dataset.

| Criteria | TP-N | SN | TP-T | ST | TP-SL | SSL | TP-CN | SCN |
|---|---|---|---|---|---|---|---|---|
| EAIC | -247.21 | -246.07 | **-252.15** | -251.10 | -250.72 | -249.51 | -251.14 | -250.23 |
| EBIC | -239.34 | -238.25 | **-242.82** | -241.12 | -241.41 | -240.19 | -240.19 | -239.33 |
| DIC | -251.42 | -250.11 | **-256.11** | -255.21 | -254.50 | -254.32 | -255.31 | -254.26 |

Note: The best values are indicated in bold.

In this example, the TP-T-NLR model appears to be the best fitting, with most of the two-piece distributions outperforming their SMSN counterparts. For model checking, we consider Figures 2–4. The histograms of the residuals based on the TP-T-NLR and skew-t NLR (ST-NLR) models are provided in Figure 2. Following the methodology of O’Hagan et al. [28], the ECDF of the original ‘Eckerle4’ response is superimposed on the ECDFs of 500 data sets simulated from the optimal skew-t and TP-T models in Figure 3. Figures 2 and 3 indicate a marginally better fit of the TP-T-NLR model than of the ST-NLR model. The fitted TP-T-NLR model superimposed on the ‘Eckerle4’ data is provided in Figure 4.

Figure 2.

Bayesian estimated densities of the ST (left) and TP-T (right) models fitted on their corresponding residuals of each NLM using the ‘Eckerle4' dataset.

Figure 3.

The black line represents the ECDF of the original response ‘Eckerle4' dataset; the gray lines represent the ECDFs of each of the 500 data sets simulated from the optimal model under the ST (left) and the TP-T (right) models.

Figure 4.

Bayesian fitted TP-T-NLR model and observed values versus predictor variable from the ‘Eckerle4' dataset.

5.2.2. Two dimensional predictor data

In this section, we examine the performance of the Bayesian fitted TP-SMN-NLR models compared to the Bayesian fitted well-known class of SMSN-NLR models examined in Cancho et al. [5] for a second real dataset. The proposed Bayesian model selection criteria show that the TP-T-NLR model is more suitable for these data than the other TP-SMN-NLR models, and also performs better than the well-known SMSN-NLR counterparts.

This dataset, called ‘Nelson’, has $n=128$ observations with one response variable ($y$: dielectric breakdown strength in kilo-volts) and two predictor variables ($x_1$: time in weeks, and $x_2$: temperature in degrees Celsius). These data are the result of a study involving the analysis of performance degradation data from accelerated tests, published in Nelson [27] and also used in Maleki et al. [15]. We assume the same NLR model as in Nelson [27],

$$\log(Y_i)=\beta_1-\beta_2x_{i1}\exp(-\beta_3x_{i2})+\varepsilon_i,\quad \varepsilon_i\overset{\text{iid}}{\sim}\text{TP-SMN}(\mu_\varepsilon,\sigma,\gamma,\boldsymbol{\nu});\quad i=1,\ldots,128.$$

The results of the various Bayesian fitted TP-SMN-NLR models and corresponding counterparts in the SMSN-NLR models are provided in Tables 6 and 7.

Table 6.

Bayesian estimates of the TP-SMN-NLR and SMSN-NLR models parameters for the ‘Nelson' dataset.

| Par. | TP-N | SN | TP-T | ST | TP-SL | SSL | TP-CN | SCN |
|---|---|---|---|---|---|---|---|---|
| β1 | 2.53e+00 | 2.61e+00 | 2.63e+00 | 2.60e+00 | 2.59e+00 | 2.62e+00 | 2.61e+00 | 2.62e+00 |
| β2 | 9.72e-09 | 9.06e-09 | 2.30e-09 | 2.29e-09 | 4.52e-09 | 4.51e-09 | 4.35e-09 | 2.55e-08 |
| β3 | -5.53e-02 | -5.61e-02 | -6.22e-02 | -6.23e-02 | -5.91e-02 | -5.87e-02 | -5.93e-02 | -5.21e-02 |
| σ | 0.3511 | 0.2401 | 0.2103 | 0.1204 | 0.2402 | 0.2110 | 0.1702 | 0.1320 |
| (γ,λ) | 0.4201 | -1.3101 | 0.5870 | 0.2394 | 0.5210 | 0.2532 | 0.6571 | -0.3521 |
| ν | | | 2.8504 | 2.8222 | 2.2331 | 2.1109 | (0.3,0.1) | (0.3,0.1) |

Table 7.

Bayesian model selection criteria for the TP-SMN-NLR and SMSN-NLR models parameters of the ‘Nelson' dataset.

| Criteria | TP-N | SN | TP-T | ST | TP-SL | SSL | TP-CN | SCN |
|---|---|---|---|---|---|---|---|---|
| EAIC | -78.143 | -78.432 | **-95.211** | -93.209 | -86.132 | -82.567 | -78.980 | -85.590 |
| EBIC | -63.814 | -64.345 | **-78.147** | -76.432 | -69.244 | -65.413 | -59.734 | -65.576 |
| DIC | -80.621 | -80.879 | **-97.720** | -95.689 | -88.593 | -85.010 | -81.402 | -88.023 |

Note: The best values are indicated in bold.

In this example, the TP-T-NLR model appears to be the best fitting, with most of the two-piece distributions outperforming their SMSN counterparts. As in the previous real example, for checking the best fitted models, the histograms of the residuals based on the TP-T-NLR and ST-NLR models are provided in Figure 5. The ECDF of the original ‘Nelson’ response is superimposed on the ECDFs of 500 data sets simulated from the optimal skew-t and TP-T models in Figure 6. Figures 5 and 6 indicate a marginally better fit of the TP-T-NLR model than of the ST-NLR model. The fitted surface of the TP-T-NLR model superimposed on the ‘Nelson’ data is provided in Figure 7.

Figure 5.

Bayesian estimated densities of the ST (left) and TP-T (right) models fitted on their corresponding residuals of each NLM using the ‘Nelson' dataset.

Figure 6.

The black line represents the ECDF of the original response ‘Nelson' dataset; the gray lines represent the ECDFs of each of the 500 data sets simulated from the optimal model under the ST (left) and the TP-T (right) models.

Figure 7.

Bayesian fitted surface of the fitted TP-T-NLR model and observed values versus two predictor variables from the ‘Nelson' dataset.

6. Conclusion

In this paper, we have extended previous work and the literature on NLR models to propose a Bayesian approach to fitting these models using TP-SMN distributions. This family of distributions provides considerable flexibility in modelling asymmetric and/or outlying data in the context of NLR models and encompasses a number of well-known distributions. We proposed a special stochastic and hierarchical representation of the TP-SMN random variables to allow for ease of computation using a Bayesian approach to estimate non-linear regression parameters. On simulated and real datasets, the Bayesian estimates of the TP-SMN-NLR models perform well compared to the well-known SMSN-NLR models. Future research could take a number of directions, but we expect some of it to build on the advantages of the Bayesian approach in this context, including the handling of high dimensional and/or noisy data, which can be particular problems in regression settings in practice.

Acknowledgements

We would like to express our great appreciation to the associate editor and reviewer(s) for their valuable and constructive suggestions during the planning and development of this research work.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Arellano-Valle R.B., Gómez H., and Quintana F.A., Statistical inference for a general class of asymmetric distributions. J. Stat. Plan. Inference 128 (2005), pp. 427–443.
  • 2. Andrews D.R. and Mallows C.L., Scale mixture of normal distribution. J. R. Stat. Soc., Ser. B 36 (1974), pp. 99–102.
  • 3. Branco M.D. and Dey D.K., A general class of multivariate skew-elliptical distributions. J. Multivar. Anal. 79 (2001), pp. 99–113.
  • 4. Cancho V.C., Lachos V.H., and Ortega E.M.M., A nonlinear regression model with skew-normal errors. Stat. Pap. 51 (2009), pp. 547–558.
  • 5. Cancho V.G., Dey D.K., Lachos V.H., and Andrade M.G., Bayesian nonlinear regression models with scale mixtures of skew-normal distributions: estimation and case influence diagnostics. Comput. Stat. Data Anal. 55 (2011), pp. 588–602.
  • 6. Chib S. and Greenberg E., Understanding the Metropolis-Hastings algorithm. Am. Stat. 49 (1995), pp. 327–335.
  • 7. Contreras-Reyes J.E., Maleki M., and Cortés D.D., Skew-Reflected-Gompertz information quantifiers with application to sea surface temperature records. Mathematics 7 (2019), 403. doi: 10.3390/math7050403.
  • 8. Cordeiro G.M., Cysneiros A.H.M.A., and Cysneiros F.J.A., Corrected maximum likelihood estimators in heteroscedastic symmetric nonlinear models. J. Stat. Comput. Simul. (2009), doi: 10.1080/00949650802706420.
  • 9. Cysneiros F.J.A. and Vanegas L.H., Residuals and their statistical properties in symmetrical nonlinear models. Stat. Probab. Lett. 78 (2008), pp. 3269–3273.
  • 10. Ghasami S., Maleki M., and Khodadadi Z., Leptokurtic and platykurtic class of robust symmetrical and asymmetrical time series models. J. Comput. Appl. Math. 112806 (2020), doi: 10.1016/j.cam.2020.112806.
  • 11. Hajrajabi A. and Maleki M., Nonlinear semiparametric autoregressive model with finite mixtures of scale mixtures of skew normal innovations. J. Appl. Stat. 46 (2019), pp. 2010–2029.
  • 12. Hoseinzadeh A., Maleki M., and Khodadadi Z., Heteroscedastic nonlinear regression models using asymmetric and heavy tailed two-piece distributions. AStA Adv. Stat. Anal. (2020), doi: 10.1007/s10182-020-00384-3.
  • 13. Hoseinzadeh A., Maleki M., Khodadadi Z., and Contreras-Reyes J.E., The skew-Reflected-Gompertz distribution for analyzing symmetric and asymmetric data. J. Comput. Appl. Math. 349 (2019), pp. 132–141.
  • 14. Labra F.V., Garay A.M., Lachos V.H., and Ortega E.M.M., Estimation and diagnostics for heteroscedastic nonlinear regression models based on scale mixtures of skew-normal distributions. J. Stat. Plan. Inference 142 (2012), pp. 2149–2165.
  • 15. Maleki M., Barkhordar Z., Khodadadi Z., and Wraith D., A robust class of homoscedastic nonlinear regression models. J. Stat. Comput. Simul. 89 (2019c), pp. 2765–2781.
  • 16. Maleki M., Contreras-Reyes J.E., and Mahmoudi M.R., Robust mixture modeling based on two-piece scale mixtures of normal family. Axioms 8 (2019d), 38. doi: 10.3390/axioms8020038.
  • 17. Maleki M. and Nematollahi A.R., Autoregressive models with mixture of scale mixtures of Gaussian innovations. Iran. J. Sci. Technol., Trans. A: Sci. 41 (2017a), pp. 1099–1107.
  • 18. Maleki M. and Nematollahi A.R., Bayesian approach to epsilon-skew-normal family. Commun. Stat. Theory Methods 46 (2017b), pp. 7546–7561.
  • 19. Maleki M., Wraith D., Mahmoudi M.R., and Contreras-Reyes J.E., Asymmetric heavy-tailed vector auto-regressive processes with application to financial data. J. Stat. Comput. Simul. 90 (2019a), pp. 324–340.
  • 20. Maleki M. and Mahmoudi M.R., Two-Piece Location-scale distributions based on scale mixtures of normal family. Commun. Stat. Theory Methods 46 (2017), pp. 12356–12369.
  • 21. Maleki M., Mahmoudi M.R., Heydari M.H., and Pho K.H., Modeling and forecasting the spread and death rate of coronavirus (COVID-19) in the world using time series models. Chaos, Solitons Fractals 110151 (2020a), doi: 10.1016/j.chaos.2020.110151.
  • 22. Maleki M., Mahmoudi M.R., Wraith D., and Pho K.H., Time series modelling to forecast the confirmed and recovered cases of COVID-19. Travel Med. Infect. Dis. 101742 (2020b), doi: 10.1016/j.tmaid.2020.101742.
  • 23. Maleki M., Wraith D., and Arellano-Valle R.B., A flexible class of parametric distributions for Bayesian linear mixed models. TEST 28 (2019b), pp. 543–564.
  • 24. Montenegro L.C., Lachos V., and Bolfarine H., Local influence analysis of skew-normal linear mixed models. Commun. Stat. Theory Methods 38 (2009), pp. 484–496.
  • 25. Moravveji B., Khodadadi Z., and Maleki M., A Bayesian analysis of Two-piece distributions based on the scale mixtures of normal family. Iran. J. Sci. Technol., Trans. A: Sci. 43 (2019), pp. 991–1001.
  • 26. Mudholkar G.S. and Hutson A.D., The epsilon-skew-normal distribution for analyzing near-normal data. J. Stat. Plan. Inference 83 (2000), pp. 291–309.
  • 27. Nelson W., Analysis of performance-degradation data. IEEE Trans. Reliab. 2 (1981), pp. 149–155.
  • 28. O’Hagan A., Murphy T.B., Gormley I.C., McNicholas P.D., and Karlis D., Clustering with the multivariate normal inverse Gaussian distribution. Comput. Stat. Data Anal. 93 (2016), pp. 18–30.
  • 29. Pan J.J., Mahmoudi M.R., Baleanu D., and Maleki M., On Comparing and Classifying several independent linear and Non-linear regression models with symmetric errors. Symmetry (Basel) 11 (2019), 820. doi: 10.3390/sym11060820.
  • 30. Rosa G.J.M., Padovani C.R., and Gianola D., Robust linear mixed models with normal/independent distributions and Bayesian MCMC implementation. Biom. J. 45 (2003), pp. 573–590.
  • 31. R Core Team, R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at https://www.R-project.org/.
  • 32. Thurber R., Semiconductor Electron Mobility Modeling, NIST, unpublished, 1979.
  • 33. Vanegas L.H. and Cysneiros F.J.A., Assessment of diagnostic procedures in symmetrical nonlinear regression models. Comput. Stat. Data Anal. 54 (2010), pp. 1002–1016.
  • 34. Xie F.C., Lin J.G., and Wei B.C., Diagnostics for skew-normal nonlinear regression models with AR(1) errors. Comput. Stat. Data Anal. 53 (2009a), pp. 4403–4416.
  • 35. Xie F.C., Wei B.C., and Lin J.G., Homogeneity diagnostics for skew-normal nonlinear regression models. Stat. Probab. Lett. 79 (2009b), pp. 821–827.
  • 36. Zarrin P., Maleki M., Khodadadi Z., and Arellano-Valle R.B., Time series process based on the unrestricted skew normal process. J. Stat. Comput. Simul. 89 (2018), pp. 38–51.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
