ABSTRACT
The heteroscedastic nonlinear regression model (HNLM) is an important tool in data modeling. In this paper we propose a HNLM based on skew scale mixtures of normal (SSMN) distributions, which allows fitting asymmetric and heavy-tailed data simultaneously. Maximum likelihood (ML) estimation is performed via the expectation-maximization (EM) algorithm. The observed information matrix is derived analytically in order to obtain the standard errors. In addition, diagnostic analysis is developed using case-deletion measures and the local influence approach. Simulation studies are carried out to examine the empirical distribution of the likelihood ratio statistic, the power of the test for homogeneity of variances, and the effect of misspecification of the variance structure function. The proposed method is also illustrated by analyzing a real dataset.
KEYWORDS: EM algorithm, heteroscedastic nonlinear regression models, influence diagnostics, likelihood ratio test, skew scale mixtures of normal distributions
1. Introduction
Nonlinear regression models (NLM) are applied in many areas of science to model data for which nonlinear functions of unknown parameters are used to explain or describe the phenomena under study. For a broad discussion of nonlinear models, see, for instance, [4,30]. According to [5], a large degree of heteroscedasticity is more commonly seen with data that are best fit by a nonlinear regression model than with data that can be adequately fit by a linear model. In addition, it is well known that several phenomena exhibit asymmetry and/or heavy- or light-tailed behavior, so it is necessary to work with more flexible classes of distributions.
Some authors have proposed homoscedastic nonlinear regression models with an asymmetric structure in the error term. For instance, Cancho et al. [7] introduced the skew-normal nonlinear regression model (SN-NLM) and presented a complete likelihood-based analysis, including an efficient EM-type algorithm for ML estimation. Garay et al. [18] extended the SN-NLM by using the scale mixtures of skew-normal (SMSN) distributions, proposed by Branco and Dey [6], in the error structure, allowing the modeling of data with skewness and heavy tails simultaneously. More recently, Ferreira and Lachos [16] extended the SN-NLM using the skew scale mixtures of normal (SSMN) distributions [15]. This class of distributions provides a useful generalization of the symmetric and asymmetric NLM, since the error term distributions cover both asymmetric and heavy-tailed distributions, such as the skew-t-normal, skew-slash and skew-contaminated normal, among others. There are some important differences between the SSMN and SMSN classes of distributions (see, for example, [16]).
On the other hand, heteroscedastic nonlinear regression models (HNLM) have recently been studied by several authors. For example, Xie et al. [32,33] developed score statistics for testing homogeneity of variances in the SN-NLM. Lin et al. [24] developed diagnostic tools for skew-t-normal nonlinear models and investigated the properties of a score test statistic for homogeneity of the variance through Monte Carlo simulations. Louzada et al. [26] proposed a HNLM with a skew-normal structure in the presence of heteroscedasticity applied to growth curve modeling. Garay et al. [19] developed diagnostic analysis for HNLM under SMSN distributions (SMSN-HNLM) and presented a score test for checking the homogeneity of the scale parameter. More recently, Araújo et al. [2] addressed the issue of hypothesis testing for the dispersion parameter in symmetric HNLM using the likelihood ratio test.
The assessment of robustness aspects of the parameter estimates in statistical models has received special attention in recent decades. The identification of influential observations and other model departures may suggest ways to improve the model assumptions. The case-deletion measures [10] consist of studying the impact on the parameter estimates of dropping individual observations. The influence of small perturbations in the model/data on the parameter estimates can be assessed by performing local influence analysis [11]. Zhu et al. [34] proposed the selection of an appropriate perturbation scheme and the development of influence measures for objective functions at a point with a nonzero first derivative, based on the observed log-likelihood function. Chen et al. [9] extended Zhu et al. [34]'s approach to the analysis of complex latent variable models using the complete-data log-likelihood. Another approach to case-deletion measures and local influence analysis, based on the conditional expectation of the complete-data log-likelihood at the E-step of the EM algorithm, was developed by Zhu et al. [36] and Zhu and Lee [35], respectively. For some applications of Zhu and Lee [35]'s approach in the context of asymmetric models, we refer to [14,17,23,28], among others.
In this work, we propose the heteroscedastic nonlinear regression model with errors following a SSMN distribution (SSMN-HNLM), generalizing the SSMN-NLM proposed by Ferreira and Lachos [16]. The ML estimates of the model parameters are obtained via the EM algorithm and the observed information matrix is derived analytically. In addition, we develop influence diagnostic tools (case-deletion measures and local influence) for our model based on Zhu and Lee [35]'s well-known approach, which relies on the conditional expectation of the complete-data log-likelihood (the Q-function). We perform a simulation study to verify the asymptotic distribution of the likelihood ratio test statistic and its empirical power for testing homogeneity of variances. Further, a study of misspecification of the structure function is performed.
The rest of the paper is organized as follows. In Section 2, we present some properties of the univariate SSMN family. Section 3 outlines the SSMN-HNLM and the EM algorithm for maximum likelihood estimation. In Section 4, we discuss the log-likelihood ratio test for checking homogeneity of a scale parameter and investigate its properties through Monte Carlo simulations. The case deletion measures and local influence of three perturbation schemes are derived in Section 5. The proposed method is illustrated in Section 6 by analyzing a real dataset, and some concluding remarks are presented in Section 7.
2. Skew scale mixtures of normal distributions
In order to motivate our proposed methods, we present a brief introduction to the skew-normal (SN), the scale mixture of normal (SMN) and the SSMN class of distributions. For further details we refer to [15,17].
Let $\phi(x)$ and $\Phi(x)$ be the probability density function (pdf) and the cumulative distribution function (cdf), respectively, of the standard normal $\mathrm{N}(0,1)$ distribution evaluated at $x$. A random variable $Y$ follows a univariate skew-normal distribution [3] with location parameter $\mu \in \mathbb{R}$, scale parameter $\sigma^2 > 0$ and skewness parameter $\lambda \in \mathbb{R}$ if its pdf is given by:

$$f(y)=\frac{2}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right),\quad y\in\mathbb{R}. \qquad (1)$$

For a random variable with pdf as in (1), we use the notation $Y\sim \mathrm{SN}(\mu,\sigma^{2},\lambda)$. When $\lambda=0$, the skew-normal distribution reduces to the normal distribution ($\mathrm{N}(\mu,\sigma^{2})$). Its marginal stochastic representation [21], which can be used to derive several of its properties, is given by:

$$Y \stackrel{d}{=} \mu+\sigma\left(\delta\,|T_{0}|+\sqrt{1-\delta^{2}}\;T_{1}\right), \qquad (2)$$

where $T_{0}\sim \mathrm{N}(0,1)$ and $T_{1}\sim \mathrm{N}(0,1)$ are independent, $|T_{0}|$ denotes the absolute value of $T_{0}$ and '$\sim$' means 'distributed as'. The expectation and variance of $Y$ are given, respectively, by:

$$\mathrm{E}[Y]=\mu+\sqrt{\frac{2}{\pi}}\,\sigma\delta \quad\text{and}\quad \mathrm{Var}[Y]=\sigma^{2}\left(1-\frac{2}{\pi}\delta^{2}\right), \qquad (3)$$

where $\delta=\lambda/\sqrt{1+\lambda^{2}}$.
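A minimal R sketch of representation (2) and the moments in (3) is given below; the function name rsn_rep is ours, not from the paper.

```r
# Simulate SN(mu, sigma2, lambda) via the stochastic representation (2) (sketch).
rsn_rep <- function(n, mu = 0, sigma2 = 1, lambda = 0) {
  delta <- lambda / sqrt(1 + lambda^2)
  t0 <- rnorm(n)                                   # T0 ~ N(0,1)
  t1 <- rnorm(n)                                   # T1 ~ N(0,1), independent of T0
  mu + sqrt(sigma2) * (delta * abs(t0) + sqrt(1 - delta^2) * t1)
}

set.seed(1)
y <- rsn_rep(1e5, mu = 2, sigma2 = 4, lambda = 3)
delta <- 3 / sqrt(1 + 3^2)
c(mean(y), 2 + sqrt(2 / pi) * 2 * delta)           # empirical vs. theoretical mean, Eq. (3)
c(var(y), 4 * (1 - (2 / pi) * delta^2))            # empirical vs. theoretical variance, Eq. (3)
```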
A random variable $Y$ follows a SMN distribution [1] with location parameter $\mu\in\mathbb{R}$ and scale parameter $\sigma^{2}>0$ if its pdf assumes the form:

$$f_{0}(y)=\int_{0}^{\infty}\phi\big(y;\mu,\kappa(u)\sigma^{2}\big)\,\mathrm{d}H(u;\nu), \qquad (4)$$

where $H(\cdot;\nu)$ is the cdf of a positive random variable $U$ indexed by the parameter vector $\nu$, $\kappa(u)$ is a strictly positive function and $\phi(\cdot;\mu,\sigma^{2})$ denotes the $\mathrm{N}(\mu,\sigma^{2})$ pdf. For a random variable with a pdf as in (4), we use the notation $Y\sim \mathrm{SMN}(\mu,\sigma^{2};H)$.
A random variable $Y$ follows a SSMN distribution [15] with location parameter $\mu\in\mathbb{R}$, scale factor $\sigma^{2}>0$ and skewness parameter $\lambda\in\mathbb{R}$, if its pdf is given by:

$$f(y)=2\,f_{0}(y)\,\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right),\quad y\in\mathbb{R}, \qquad (5)$$

where $f_{0}(\cdot)$ is a SMN density as defined in (4). For a random variable with pdf as in (5), we use the notation $Y\sim \mathrm{SSMN}(\mu,\sigma^{2},\lambda;H)$. If $\mu=0$ and $\sigma^{2}=1$, we refer to it as the standard SSMN distribution and we denote it by $Y\sim \mathrm{SSMN}(\lambda;H)$. Clearly, when $\lambda=0$, we get the corresponding SMN distributions proposed by Andrews and Mallows [1].
For a SSMN random variable, a convenient hierarchical representation is given next, which can be used to quickly simulate realizations of Y and to implement the EM algorithm.
Let $Y\sim \mathrm{SSMN}(\mu,\sigma^{2},\lambda;H)$. Then its hierarchical representation is given by:

$$Y\mid U=u \;\sim\; \mathrm{SN}\big(\mu,\kappa(u)\sigma^{2},\kappa(u)^{1/2}\lambda\big), \qquad U\sim H(\cdot;\nu). \qquad (6)$$
Thus, the distributions in the SSMN class that will be considered in this work are:
- The skew Student-t-normal distribution (StN) [20], with $\nu>0$ degrees of freedom, denoted by $Y\sim \mathrm{StN}(\mu,\sigma^{2},\lambda;\nu)$, arises when $U\sim \mathrm{Gamma}(\nu/2,\nu/2)$ and $\kappa(u)=1/u$, and has pdf
$$f(y)=\frac{2\,\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\pi\nu}\,\sigma}\left(1+\frac{d}{\nu}\right)^{-\frac{\nu+1}{2}}\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right), \qquad (7)$$
where $d=(y-\mu)^{2}/\sigma^{2}$ and $\Gamma(\cdot)$ is the gamma function. When $\nu\uparrow\infty$, we obtain the SN distribution as the limiting case. Lastly, $U\mid Y=y \sim \mathrm{Gamma}\big(\tfrac{\nu+1}{2},\tfrac{\nu+d}{2}\big)$.
- The skew slash distribution (SSL), denoted by $Y\sim \mathrm{SSL}(\mu,\sigma^{2},\lambda;\nu)$, arises when $\kappa(u)=1/u$ and $U\sim \mathrm{Beta}(\nu,1)$, with $\nu>0$. Its pdf is given by:
$$f(y)=2\nu\int_{0}^{1}u^{\nu-1}\,\phi\big(y;\mu,\sigma^{2}/u\big)\,\mathrm{d}u\;\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right). \qquad (8)$$
The skew slash distribution reduces to the SN distribution when $\nu\uparrow\infty$. It is easy to see that $U\mid Y=y \sim \mathrm{TG}\big(\nu+\tfrac{1}{2},\tfrac{d}{2};(0,1)\big)$, where $\mathrm{TG}(a,b;(0,1))$ is the $\mathrm{Gamma}(a,b)$ distribution truncated to the interval $(0,1)$.
- The skew contaminated normal distribution (SCN), denoted by $Y\sim \mathrm{SCN}(\mu,\sigma^{2},\lambda;\nu,\gamma)$, with $0<\nu<1$ and $0<\gamma<1$. Here, $\kappa(u)=1/u$ and $U$ is a discrete random variable taking one of two states. The probability density function of $U$ is given by:
$$h(u;\nu,\gamma)=\nu\,\mathbb{I}_{(u=\gamma)}+(1-\nu)\,\mathbb{I}_{(u=1)}. \qquad (9)$$
The skew contaminated normal distribution reduces to the SN distribution when $\gamma=1$. The conditional distribution of $U$ given $Y=y$ is given by:
$$f(u\mid y)=\frac{1}{S}\left\{\nu\,\gamma^{1/2}\,\mathrm{e}^{-\gamma d/2}\,\mathbb{I}_{(u=\gamma)}+(1-\nu)\,\mathrm{e}^{-d/2}\,\mathbb{I}_{(u=1)}\right\},$$
where $S=\nu\,\gamma^{1/2}\,\mathrm{e}^{-\gamma d/2}+(1-\nu)\,\mathrm{e}^{-d/2}$.
- The skew power-exponential distribution (SPE), denoted by $Y\sim \mathrm{SPE}(\mu,\sigma^{2},\lambda,\nu)$, with $1/2<\nu\leq 1$ and $\kappa(u)=1/u$, has pdf given by:
$$f(y)=2\,f_{\mathrm{PE}}(y;\mu,\sigma^{2},\nu)\,\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right), \qquad (10)$$
which reduces to the SN distribution when $\nu=1$, where $f_{\mathrm{PE}}(\cdot;\mu,\sigma^{2},\nu)$ denotes the symmetric power-exponential density with shape parameter $\nu$. Although the conditional distribution of $U\mid Y=y$ is not known in closed form, Ferreira et al. [15] showed that:
$$\mathrm{E}[U\mid Y=y]=\nu\,d^{\,\nu-1}. \qquad (11)$$
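A minimal R sketch of how a SSMN variate can be simulated is given below for the StN case. It does not follow the hierarchical route in (6) literally; instead it uses the selection form implied by density (5): draw a value from the symmetric SMN part, accept it with probability $\Phi(\lambda(x-\mu)/\sigma)$, and otherwise reflect it about $\mu$. The function name rstn is ours.

```r
# Simulate StN(mu, sigma2, lambda; nu), a member of the SSMN class (sketch, selection form).
rstn <- function(n, mu = 0, sigma2 = 1, lambda = 0, nu = 4) {
  u <- rgamma(n, shape = nu / 2, rate = nu / 2)    # StN mixing: U ~ Gamma(nu/2, nu/2)
  x <- rnorm(n, mean = mu, sd = sqrt(sigma2 / u))  # X | U = u ~ N(mu, sigma2/u), kappa(u) = 1/u
  w <- rnorm(n)                                    # latent selection variable
  # accept x when w <= lambda * (x - mu)/sigma, otherwise reflect about mu
  ifelse(w <= lambda * (x - mu) / sqrt(sigma2), x, 2 * mu - x)
}

set.seed(2)
y <- rstn(5e4, mu = 1, sigma2 = 2, lambda = 2, nu = 5)
hist(y, breaks = 80, freq = FALSE, main = "Simulated StN sample")
```

Replacing the Gamma draw for U by the Beta, two-point or power-exponential mixing distributions listed above gives the SSL, SCN and SPE cases under the same selection scheme.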
3. The model and the EM algorithm for ML estimation
3.1. The model
The SSMN-HNLM is defined by:
$$Y_{i}=\eta(\beta,x_{i})+\varepsilon_{i},\qquad \varepsilon_{i}\stackrel{\mathrm{ind}}{\sim}\mathrm{SSMN}\big(0,\sigma_{i}^{2},\lambda;H\big),\qquad \sigma_{i}^{2}=\sigma^{2}m(\rho;z_{i}),\qquad i=1,\dots,n, \qquad (12)$$

where $Y_{i}$ is the response variable, $x_{i}$ is a known covariate vector, $\eta(\beta,x_{i})$ is the nonlinear predictor, where $\eta(\cdot)$ is an injective and twice continuously differentiable function with respect to the vector of unknown regression coefficients $\beta$, $m(\rho;z_{i})$ is a known positive continuously differentiable function, $z_{i}$ contains values of the explanatory variables, which constitute in general, although not necessarily, a subset of $x_{i}$, and $\rho$ is a $q\times 1$ vector of unknown parameters (see [19,32] for more details). We assume that there is a unique value $\rho_{0}$ such that $m(\rho_{0};z_{i})=1$ for all $z_{i}$.
Using Equations (4) and (5), it follows that the observed-data log-likelihood function of the parameter vector $\theta=(\beta^{\top},\sigma^{2},\lambda,\rho^{\top},\nu)^{\top}$ can be expressed as:

$$\ell(\theta\mid \mathbf{y})=\sum_{i=1}^{n}\log\left[2\int_{0}^{\infty}\phi\big(y_{i};\mu_{i},\kappa(u)\sigma_{i}^{2}\big)\,h(u;\nu)\,\mathrm{d}u\;\Phi\!\left(\lambda\,\frac{y_{i}-\mu_{i}}{\sigma_{i}}\right)\right], \qquad (13)$$

where $\mu_{i}=\eta(\beta,x_{i})$, $\sigma_{i}^{2}=\sigma^{2}m(\rho;z_{i})$, $h(\cdot;\nu)$ is the pdf of $U$ and $\mathbf{y}=(y_{1},\dots,y_{n})^{\top}$ is the vector of observed values of the response variable $Y$.
3.2. The ECME algorithm for the SSMN-HNLM model
Note that it is not possible to obtain an analytical solution for the ML estimates of $\theta$ by maximizing (13) directly. The hierarchical representation of the SSMN distributions, see Equations (2) and (6), enables the construction of an EM-type algorithm [13] for ML estimation of the SSMN-HNLM. When the M-step of the EM algorithm turns out to be analytically intractable, it can be replaced with a sequence of conditional maximization (CM) steps, yielding the ECM algorithm [27]. Another option is the ECME algorithm [25], a faster extension of the EM and ECM algorithms, obtained by maximizing the constrained Q-function (the expected complete-data log-likelihood) in some CM steps and the corresponding constrained actual marginal likelihood function in the remaining steps, called CML steps.
In this section, we demonstrate how to employ the ECME algorithm for ML estimation of the SSMN-HNLM model. From Equations (2) and (6), the following hierarchical representation for can be obtained:
(14) |
where $\mathrm{TN}(\mu,\sigma^{2};(a,b))$ denotes the univariate normal distribution $\mathrm{N}(\mu,\sigma^{2})$ truncated to the interval $(a,b)$.
Using Lemma 1 presented by Ferreira and Lachos [16], and after some algebraic manipulations, the joint distribution of $(Y_{i},T_{i},U_{i})$ can be written as:
Let $\mathbf{t}=(t_{1},\dots,t_{n})^{\top}$ and $\mathbf{u}=(u_{1},\dots,u_{n})^{\top}$. Considering $\mathbf{t}$ and $\mathbf{u}$ as missing data, it follows that the complete-data log-likelihood function associated with $\mathbf{y}_{c}=(\mathbf{y}^{\top},\mathbf{t}^{\top},\mathbf{u}^{\top})^{\top}$ is given by:
where $C$ is a constant not depending on the unknown parameters in $\theta$.
Given the current estimate $\widehat{\theta}^{(k)}$, the E-step calculates the function
(15) |
with and
It is important to note that computing these quantities requires expressions for $\widehat{u}_{i}=\mathrm{E}[U_{i}\mid y_{i},\widehat{\theta}^{(k)}]$, $\widehat{t}_{i}=\mathrm{E}[T_{i}\mid y_{i},\widehat{\theta}^{(k)}]$ and $\widehat{t_{i}^{2}}=\mathrm{E}[T_{i}^{2}\mid y_{i},\widehat{\theta}^{(k)}]$.
As presented by Ferreira and Lachos [16], $T_{i}\mid y_{i},\theta \sim \mathrm{TN}(\eta_{i},1;(0,\infty))$, so the expectations $\widehat{t}_{i}$ and $\widehat{t_{i}^{2}}$ can be readily evaluated by:

$$\widehat{t}_{i}=\widehat{\eta}_{i}+\frac{\phi(\widehat{\eta}_{i})}{\Phi(\widehat{\eta}_{i})}, \qquad (16)$$

$$\widehat{t_{i}^{2}}=\widehat{\eta}_{i}^{2}+1+\widehat{\eta}_{i}\,\frac{\phi(\widehat{\eta}_{i})}{\Phi(\widehat{\eta}_{i})}, \qquad (17)$$

where $\eta_{i}=\lambda(y_{i}-\mu_{i})/\sigma_{i}$ and $\widehat{\eta}_{i}$ denotes $\eta_{i}$ evaluated at $\widehat{\theta}^{(k)}$, for $i=1,\dots,n$.
Updating $\widehat{u}_{i}$, as in [15], we have computationally attractive expressions for $\widehat{u}_{i}$ for different SSMN distributions, as presented in Table 1.
Table 1. $\widehat{u}_{i}=\mathrm{E}[U_{i}\mid y_{i},\widehat{\theta}^{(k)}]$ for different SSMN distributions.

Distribution | $\widehat{u}_{i}$
---|---
SN | 1
StN | $\dfrac{\nu+1}{\nu+d_{i}}$
SSL | $\dfrac{2\nu+1}{d_{i}}\,\dfrac{P_{1}\!\big(\nu+\frac{3}{2},\,\frac{d_{i}}{2}\big)}{P_{1}\!\big(\nu+\frac{1}{2},\,\frac{d_{i}}{2}\big)}$
SCN | $\dfrac{1-\nu+\nu\gamma^{3/2}\mathrm{e}^{(1-\gamma)d_{i}/2}}{1-\nu+\nu\gamma^{1/2}\mathrm{e}^{(1-\gamma)d_{i}/2}}$
SPE | $\nu\,d_{i}^{\,\nu-1}$

$P_{x}(a,b)$ denotes the cdf of the $\mathrm{Gamma}(a,b)$ distribution evaluated at $x$, and $d_{i}=(y_{i}-\widehat{\mu}_{i})^{2}/\widehat{\sigma}_{i}^{2}$.
Thus, the CM-step conditionally maximizes $Q(\theta\mid\widehat{\theta}^{(k)})$ with respect to $\theta$, obtaining a new estimate $\widehat{\theta}^{(k+1)}$, as described below:
E-step: For $i=1,\dots,n$, compute $\widehat{t}_{i}$ and $\widehat{t_{i}^{2}}$ using Equations (16)–(17), and $\widehat{u}_{i}$ from Table 1.
- CM-step: Update $\widehat{\beta}^{(k+1)}$, $\widehat{\sigma}^{2(k+1)}$, $\widehat{\lambda}^{(k+1)}$ and $\widehat{\rho}^{(k+1)}$ by maximizing $Q(\theta\mid\widehat{\theta}^{(k)})$ over $\beta$, $\sigma^{2}$, $\lambda$ and $\rho$; the resulting expressions are written in terms of a diagonal weight matrix and a corrected observed response constructed from $\widehat{u}_{i}$, $\widehat{t}_{i}$ and $\widehat{t_{i}^{2}}$.
- CML-step: Given the updated values $\widehat{\beta}^{(k+1)}$, $\widehat{\sigma}^{2(k+1)}$, $\widehat{\lambda}^{(k+1)}$ and $\widehat{\rho}^{(k+1)}$, obtain $\widehat{\nu}^{(k+1)}$ by maximizing the constrained actual marginal log-likelihood function (18), where $f_{0}(\cdot)$ is the respective symmetric pdf as defined in (4).
The more efficient CML-step follows Liu and Rubin [25] (ECME) and is referred to as the conditional marginal likelihood (CML) step: the usual M-step is replaced by a step that maximizes the restricted actual log-likelihood function. Furthermore, this step requires a one-dimensional search for the StN, SSL and SPE models and a bidimensional search for the SCN model, which can be easily accomplished by using, for example, the 'optimize'/'optim' routines in R [29].
The iterations of the above algorithm are repeated until a suitable convergence rule is satisfied, e.g. until the change in the parameter estimates, $\lVert\widehat{\theta}^{(k+1)}-\widehat{\theta}^{(k)}\rVert$, or in the observed-data log-likelihood, $|\ell(\widehat{\theta}^{(k+1)}\mid\mathbf{y})-\ell(\widehat{\theta}^{(k)}\mid\mathbf{y})|$, is sufficiently small.
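A schematic sketch of this ECME iteration is given below. The helper functions e_step(), cm_update() and marginal_loglik() are hypothetical placeholders for the expressions in (15)–(18) and Table 1; only the control flow (E-step, CM-step, CML-step and the stopping rule) is illustrated, and the search interval for ν is an arbitrary choice.

```r
# Schematic ECME loop for the SSMN-HNLM (illustrative only; not the authors' code).
ecme_ssmn <- function(theta, y, x, z, tol = 1e-6, max_iter = 5000) {
  ll_old <- -Inf
  for (k in seq_len(max_iter)) {
    w     <- e_step(theta, y, x, z)          # E-step: u_i, t_i, t_i^2 (Eqs. (16)-(17), Table 1) [hypothetical helper]
    theta <- cm_update(theta, w, y, x, z)    # CM-step: update beta, sigma2, lambda, rho        [hypothetical helper]
    # CML-step: update nu by maximizing the actual marginal log-likelihood (Eq. (18));
    # a two-dimensional search via optim() would be used for the SCN model instead.
    obj <- function(nu) { th <- theta; th$nu <- nu; marginal_loglik(th, y, x, z) }  # [hypothetical helper]
    theta$nu <- optimize(obj, interval = c(1.01, 50), maximum = TRUE)$maximum
    ll_new <- marginal_loglik(theta, y, x, z)
    if (abs(ll_new - ll_old) < tol) break    # stopping rule on the log-likelihood change
    ll_old <- ll_new
  }
  theta
}
```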
3.3. Notes on implementation
Although the EM-type algorithm tends to be robust with respect to the choice of the starting values, it may not converge when the initial values are far from good ones. Thus, the choice of adequate starting values plays an important role in parameter estimation. A set of reasonable initial values can be obtained by computing $\widehat{\beta}^{(0)}$ and $\widehat{\sigma}^{2(0)}$ from a standard nonlinear least squares (NLS) fit, using the 'nls' routine in R [29], and then taking $\widehat{\lambda}^{(0)}$ as the sample skewness coefficient of the NLS residuals. The starting value of $\rho$ can be $\rho_{0}$, the value such that $m(\rho_{0};z_{i})=1$ for all $z_{i}$.
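A minimal sketch of this strategy is shown below; the function name get_starts and the assumption $\rho_{0}=0$ are ours, and form, data and start0 stand for the user's nonlinear formula, data frame and NLS starting list.

```r
# Starting values for the EM-type algorithm (sketch; names and rho0 = 0 are assumptions).
get_starts <- function(form, data, start0) {
  fit   <- nls(form, data = data, start = start0)   # standard nonlinear least squares
  res   <- residuals(fit)
  beta0 <- coef(fit)                                 # initial regression coefficients
  sig20 <- sum(res^2) / length(res)                  # initial scale estimate
  lam0  <- mean((res - mean(res))^3) / sd(res)^3     # sample skewness of the NLS residuals
  list(beta = beta0, sigma2 = sig20, lambda = lam0,
       rho = 0)                                      # assumes m(0; z_i) = 1 for all z_i
}
```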
4. Likelihood ratio test for homogeneity of variance
The SSMN-HNLM defined in Equation (12) supposes that the variance of the model is not constant, with scale parameter given by $\sigma_{i}^{2}=\sigma^{2}m(\rho;z_{i})$, $i=1,\dots,n$. Besides this, it is assumed that a unique value $\rho_{0}$ exists such that $m(\rho_{0};z_{i})=1$ for all $z_{i}$. Thus, the test for homogeneity of the scale parameter in model (12) can be expressed by the hypotheses $H_{0}:\rho=\rho_{0}$ versus $H_{1}:\rho\neq\rho_{0}$.
In this work, we use a likelihood ratio (LR) test statistic to test $H_{0}$, given by $\mathrm{LR}=2\{\ell(\widehat{\theta}\mid\mathbf{y})-\ell(\widetilde{\theta}\mid\mathbf{y})\}$, where $\ell(\cdot\mid\mathbf{y})$ denotes the observed-data log-likelihood function, given by Equation (13), and $\widehat{\theta}$ and $\widetilde{\theta}$ represent the ML estimates obtained using the ECME algorithm under $H_{1}$ and $H_{0}$, respectively. Under $H_{0}$, the LR test statistic has an asymptotic $\chi^{2}_{q}$ distribution, where $q$ is the length of $\rho$. Thus, in order to analyze the empirical distribution and power of the LR test, we develop two simulation studies.
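A minimal sketch of this test is shown below, assuming the maximized log-likelihood values under $H_{0}$ and $H_{1}$ have already been obtained from two ECME fits; the numbers in the usage line are illustrative only.

```r
# Likelihood ratio test for H0: rho = rho0 (sketch; log-likelihoods assumed available).
lr_test <- function(loglik_H1, loglik_H0, q) {
  LR <- 2 * (loglik_H1 - loglik_H0)                             # LR statistic
  c(LR = LR, p.value = pchisq(LR, df = q, lower.tail = FALSE))  # asymptotic chi^2_q p-value
}

lr_test(loglik_H1 = -100.2, loglik_H0 = -109.8, q = 1)          # illustrative values only
```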
4.1. Simulation studies
In this section, the performance of the asymptotic distribution and of the power of the likelihood ratio (LR) test statistic is examined. First, we compare the empirical distribution of the LR statistic with the theoretical $\chi^{2}_{q}$ distribution via Monte Carlo simulations. Second, we investigate the power of the LR test for a grid of values of ρ.
4.1.1. The empirical distributions of LR test statistics
The performance of the asymptotic distribution of the LR test statistic is examined following the procedure described in [19,33]. The model used in this simulation study is
(19) |
where the errors follow the SSMN distributions under study, with $\sigma_{i}^{2}=\sigma^{2}m(\rho;z_{i})$. The covariate is generated from a uniform distribution on the interval (0.2, 2). The true values of the model parameters are fixed in advance, and the values of ν are chosen to achieve heavy tails for the StN and SSL models.
We generate values of the response by model (19) with the true values of the parameters and $\rho=0$ (under $H_{0}$), repeating this procedure 2000 times (the covariate values are fixed across replications). Then, using the 2000 values of the LR statistic, we obtain the empirical distribution functions (edf). Figure 1 shows comparisons between the edf and the theoretical $\chi^{2}_{q}$ distribution for n = 30, 70 and 120 in the SN, StN and SSL models. It can be seen that as n increases, the edfs become very close to the theoretical distribution for all the distributions considered in our study.
Figure 1.
Simulated comparisons between the empirical distribution of the LR statistic and the $\chi^{2}_{q}$ distribution, using the SN (first row), StN (second row) and SSL (last row) models.
4.1.2. The power of the LR test
In order to study the power of the test, we use different values of n and ρ to obtain the simulated sizes and powers of the test statistic. We consider the values ρ = 0, 0.2, 0.4, 0.6, 0.8, 1 and n = 10, 20, 30, 50, 70, 90, 120 and 150. Each simulation is repeated 2000 times, so the proportion of times the null hypothesis is rejected is the simulated power value. All the statistics are compared with the $\chi^{2}_{q}$ critical value at the 5% level. Table 2 presents the rejection rate of the hypothesis $H_{0}:\rho=0$ based on the LR statistic for the SN, StN and SSL distributions. It can be seen that for ρ = 0 the rejection rate of the test approximates the true nominal level as n increases. When n and ρ increase, the power of the test approaches 1 for all models. Figure 2 presents the rejection rate when varying the parameter ρ in the interval [0, 1] and varying the sample size between 30 and 150 for each distribution.
Table 2. Rejection rate for $H_{0}:\rho=0$ at the nominal level of 5% from the LR statistic for the SN, StN and SSL distributions.
n | ρ = 0 | ρ = 0.2 | ρ = 0.4 | ρ = 0.6 | ρ = 0.8 | ρ = 1
---|---|---|---|---|---|---
SN-NLM | ||||||
10 | 0.0940 | 0.1025 | 0.0895 | 0.0895 | 0.0870 | 0.1220 |
20 | 0.0715 | 0.0680 | 0.0810 | 0.1050 | 0.1730 | 0.2340 |
30 | 0.0570 | 0.0610 | 0.0895 | 0.1590 | 0.2510 | 0.3920 |
50 | 0.0630 | 0.0790 | 0.1400 | 0.2675 | 0.4650 | 0.6475 |
70 | 0.0535 | 0.0810 | 0.1865 | 0.3655 | 0.5955 | 0.8140 |
90 | 0.0535 | 0.1075 | 0.2370 | 0.5085 | 0.7550 | 0.9105 |
120 | 0.0500 | 0.1175 | 0.3425 | 0.6440 | 0.8505 | 0.9585 |
150 | 0.0490 | 0.1230 | 0.4120 | 0.7305 | 0.9290 | 0.9925 |
StN-NLM | ||||||
10 | 0.1335 | 0.1340 | 0.1375 | 0.1175 | 0.1285 | 0.1560 |
20 | 0.0760 | 0.0925 | 0.1075 | 0.1250 | 0.1615 | 0.2255 |
30 | 0.0735 | 0.1070 | 0.1095 | 0.1500 | 0.2215 | 0.3170 |
50 | 0.0715 | 0.1105 | 0.1375 | 0.2245 | 0.3285 | 0.4695 |
70 | 0.0600 | 0.0985 | 0.1580 | 0.2710 | 0.4455 | 0.6185 |
90 | 0.0520 | 0.1065 | 0.2005 | 0.3660 | 0.5190 | 0.7100 |
120 | 0.0540 | 0.1190 | 0.2455 | 0.4385 | 0.6665 | 0.8455 |
150 | 0.0465 | 0.1420 | 0.3090 | 0.5275 | 0.7640 | 0.9050 |
SSL-NLM | ||||||
10 | 0.0805 | 0.0665 | 0.0630 | 0.0650 | 0.0615 | 0.0960 |
20 | 0.0545 | 0.0560 | 0.0425 | 0.0520 | 0.0760 | 0.0940 |
30 | 0.0540 | 0.0460 | 0.0380 | 0.1085 | 0.0890 | 0.2775 |
50 | 0.0415 | 0.0390 | 0.0795 | 0.1310 | 0.2410 | 0.4670 |
70 | 0.0445 | 0.0460 | 0.1160 | 0.2765 | 0.3060 | 0.6430 |
90 | 0.0450 | 0.0495 | 0.1180 | 0.2890 | 0.5245 | 0.7625 |
120 | 0.0430 | 0.0600 | 0.1490 | 0.4020 | 0.6310 | 0.9205 |
150 | 0.0375 | 0.0715 | 0.2190 | 0.5860 | 0.8070 | 0.9615 |
Figure 2.
Power of the LR test to detect heteroscedasticity over a range of possible ρ values and different sample sizes (n), considering the SN, StN and SSL error distributions.
4.1.3. Study of misspecification of the structure function
As suggested by a referee, we report here a simulation study to analyze the influence of misspecification of the structure function. We simulate from model (12); the true values of the parameters, including ν for the StN and SSL models, are fixed in advance. We use the following structure functions for $m(\rho;z_{i})$: (1) homoscedasticity, (2) a linear relation, (3) the true relation and (4) an alternative misspecified relation. We generate 2000 Monte Carlo samples of size n = 300 and compute the coverage rates (CR), given by the proportion of replications in which the confidence interval contains the true parameter value, and the bias, given by the difference between the mean of the estimates and the true value of the parameters. For CR, we expect a value close to the nominal level, and for the bias a value close to 0. According to Table 3, the true structure function meets our expectations in terms of CR and bias for all parameters and all the distributions taken into consideration. On the other hand, we note that the other specifications of $m(\rho;z_{i})$ present relatively large bias and distorted CR for at least one parameter.
Table 3. Coverage rates (CR) at the nominal level of 95% and bias for different structure functions; columns (1)–(4) correspond to the structure functions listed in the text (true values of the parameters are in parentheses).
Parameter | CR (1) | bias (1) | CR (2) | bias (2) | CR (3) | bias (3) | CR (4) | bias (4)
---|---|---|---|---|---|---|---|---
SN-HNLM | ||||||||
86.40 | −0.12 | 87.94 | 0.11 | 94.64 | −0.02 | 92.20 | −0.06 | |
90.08 | −0.86 | 87.57 | 0.83 | 94.11 | 0.15 | 94.78 | −0.06 | |
94.04 | −1.6 | 82.16 | 4.7 | 94.26 | 5.2 | 94.82 | 9.6 | |
5.02 | 0.61 | 81.97 | −0.12 | 92.76 | 1.4 | 90.50 | 0.10 | |
92.04 | −0.01 | 74.32 | −1.16 | 95.85 | −0.25 | 96.22 | −0.30 | |
– | – | 13.63 | 0.72 | 94.24 | −4.9 | 18.47 | 0.26 | |
StN-HNLM | ||||||||
93.04 | −0.09 | 98.38 | −0.04 | 92.13 | −0.01 | 90.51 | −0.04 | |
95.11 | −0.08 | 55.46 | 0.21 | 92.90 | 0.16 | 93.54 | 0.15 | |
94.75 | 1.4 | 0.00 | 1.2 | 93.26 | 5.7 | 93.24 | 1.1 | |
77.59 | 0.50 | 62.94 | −0.01 | 92.58 | 0.04 | 93.06 | 0.06 | |
93.56 | 0.04 | 27.73 | −0.42 | 93.46 | −0.49 | 93.71 | −0.46 | |
– | – | 85.65 | 0.35 | 94.38 | −2.0 | 45.51 | 0.31 | |
96.85 | −4.2 | 31.06 | 0.43 | 96.56 | 0.65 | 96.06 | 0.57 | |
SSL-HNLM | ||||||||
79.16 | −0.30 | 82.36 | −0.10 | 96.14 | −0.08 | 93.41 | −0.08 | |
82.23 | 1.07 | 71.59 | 1.91 | 95.34 | 0.19 | 98.12 | 0.29 | |
56.83 | 3.9 | 75.67 | 6.8 | 95.05 | 1.7 | 96.89 | 1.6 | |
4.67 | −0.05 | 0.00 | 41.68 | 95.54 | 0.04 | 94.91 | 0.04 | |
3.42 | 1.47 | 57.73 | 0.55 | 97.09 | −0.25 | 96.89 | −0.29 | |
– | – | 0.00 | −0.10 | 95.15 | 0.31 | 24.28 | 0.31 | |
0.00 | −1.90 | 3.38 | −0.96 | 92.62 | 3.89 | 90.21 | 4.63 |
4.1.4. Computational aspects
The simulation studies were run on a Linux server with two 2.4 GHz processors (12 cores, 24 threads) and 32 GB of RAM. All the computational procedures were coded and implemented in the statistical software R (R Core Team, 2018). For each procedure, based on 2000 replicates, we used the 'parallel' facilities of R. The run time of each simulation procedure varied between 7 and 40 minutes, depending on the sample size and the value of ρ. We did not observe convergence problems or out-of-boundary estimates in our simulation study. The computer programs are available from the first author upon request.
5. Diagnostic analysis
Diagnostic techniques are used to detect observations that seriously influence the results of a statistical analysis. In the literature, there are basically two approaches to detect influential observations. One approach is the case-deletion method [10], in which the impact of deleting an observation on the estimates is directly assessed by measures such as the likelihood distance and Cook distance. The second approach is a general statistical technique used to assess the stability of the estimation outputs with respect to the model inputs [11]. Inspired by the results of Zhu et al. [36], Zhu and Lee [35] and Lee and Xu [23], we study the case-deletion measures and the local influence diagnostics for nonlinear regression models on the basis of the Q-function. In the following subsections we describe the background and details of the classic diagnostic methods to detect influential observations.
5.1. The local influence approach
Let $\ell_{c}(\theta\mid\mathbf{y}_{c})$ and $\ell_{c}(\theta,\omega\mid\mathbf{y}_{c})$, with $\omega$ a perturbation vector varying in an open region $\Omega\subset\mathbb{R}^{g}$, be the complete-data log-likelihood functions of the postulated (unperturbed) and perturbed models, respectively. We assume that a vector $\omega_{0}\in\Omega$ exists such that $\ell_{c}(\theta,\omega_{0}\mid\mathbf{y}_{c})=\ell_{c}(\theta\mid\mathbf{y}_{c})$ for all $\theta$. To assess the influence of the perturbations on the ML estimate of $\theta$, one may consider the Q-displacement function, defined as:
$$f_{Q}(\omega)=2\left\{Q(\widehat{\theta}\mid\widehat{\theta})-Q(\widehat{\theta}(\omega)\mid\widehat{\theta})\right\},$$
where $\widehat{\theta}(\omega)$ denotes the ML estimate in the perturbed model, i.e. the maximizer of $Q(\theta,\omega\mid\widehat{\theta})=\mathrm{E}\{\ell_{c}(\theta,\omega\mid\mathbf{y}_{c})\mid\mathbf{y},\widehat{\theta}\}$.
Following the approach developed in [11,35], the normal curvature of $f_{Q}(\omega)$ at $\omega_{0}$ in the direction of some unit vector $\mathbf{d}$ is given by:
$$C_{f_{Q},\mathbf{d}}=-2\,\mathbf{d}^{\top}\ddot{Q}_{\omega_{0}}\mathbf{d},\qquad \ddot{Q}_{\omega_{0}}=\Delta_{\omega_{0}}^{\top}\big\{\ddot{Q}(\widehat{\theta})\big\}^{-1}\Delta_{\omega_{0}},$$
where $\ddot{Q}(\widehat{\theta})=\partial^{2}Q(\theta\mid\widehat{\theta})/\partial\theta\,\partial\theta^{\top}\big|_{\theta=\widehat{\theta}}$ and $\Delta_{\omega_{0}}=\partial^{2}Q(\theta,\omega\mid\widehat{\theta})/\partial\theta\,\partial\omega^{\top}\big|_{\theta=\widehat{\theta},\,\omega=\omega_{0}}$.
Let $\{(\zeta_{k},\mathbf{e}_{k}):k=1,\dots,g\}$ be the eigenvalue–eigenvector pairs of the matrix $-2\ddot{Q}_{\omega_{0}}$, with $\zeta_{1}\geq\dots\geq\zeta_{r}>0=\zeta_{r+1}=\dots=\zeta_{g}$ and orthonormal eigenvectors $\mathbf{e}_{k}=(e_{k1},\dots,e_{kg})^{\top}$. The aggregated contribution vector of all eigenvectors corresponding to nonzero eigenvalues has elements given by:
$$M(0)_{l}=\sum_{k=1}^{r}\tilde{\zeta}_{k}\,e_{kl}^{2},\qquad \tilde{\zeta}_{k}=\zeta_{k}\Big/\sum_{j=1}^{r}\zeta_{j}.$$
Following Lee and Xu [23], we use $M(0)_{l}>\overline{M(0)}+c\,\mathrm{SM}(0)$ as a benchmark to regard the lth case as influential, where $c$ is an arbitrary constant (depending on the real application), $\overline{M(0)}$ is the mean and $\mathrm{SM}(0)$ is the standard deviation of $\{M(0)_{l}:l=1,\dots,n\}$. Appendix 3 presents the Hessian matrix and some perturbation schemes used in this work.
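A minimal sketch of this computation is given below (our notation): the positive semidefinite curvature matrix $B=-2\ddot{Q}_{\omega_{0}}$ is assumed to have been assembled from the expressions in Appendix 3, and the constant c is chosen by the analyst.

```r
# Aggregated contribution M(0)_l and benchmark mean(M(0)) + c * SD(M(0)) (sketch).
m0_measure <- function(B, c) {
  eig  <- eigen((B + t(B)) / 2, symmetric = TRUE)   # symmetrize for numerical stability
  keep <- eig$values > 1e-10                        # keep nonzero eigenvalues only
  zeta <- eig$values[keep] / sum(eig$values[keep])  # normalized eigenvalues zeta_k
  E2   <- eig$vectors[, keep, drop = FALSE]^2       # squared eigenvector components e_{kl}^2
  M0   <- as.vector(E2 %*% zeta)                    # M(0)_l = sum_k zeta_k * e_{kl}^2
  list(M0 = M0, benchmark = mean(M0) + c * sd(M0))
}
```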
5.2. Case deletion measures
In the process of model validation, it is fundamental to verify whether there are observations with a disproportionate influence on the estimates of the model's parameters. Case deletion is a classic approach to study the effect of dropping the ith case from the dataset. Thus, considering the model in (12), we compare the ML estimate $\widehat{\theta}$ obtained with all observations with the ML estimate $\widehat{\theta}_{[i]}$ obtained when the ith observation has been deleted from the dataset. The SSMN-HNLM in (12) is rewritten as:
Let $\mathbf{y}_{c[i]}$ be the augmented dataset with the ith case deleted, where the subscript '$[i]$' denotes the original vector with the ith observation deleted. The complete-data log-likelihood function based on the data with the ith case deleted is denoted by $\ell_{c}(\theta\mid\mathbf{y}_{c[i]})$. Let $\widehat{\theta}_{[i]}$ be the maximizer of the function $Q_{[i]}(\theta\mid\widehat{\theta})=\mathrm{E}\{\ell_{c}(\theta\mid\mathbf{y}_{c[i]})\mid\mathbf{y},\widehat{\theta}\}$ of the proposed regression model, where the estimates are obtained by using the EM algorithm based on the remaining n−1 observations. If $\widehat{\theta}_{[i]}$ is far from $\widehat{\theta}$ in some sense, then the ith case is regarded as influential.
Similar to the classic case-deletion measures, Cook distance and the likelihood displacement, Zhu et al. [36] presented analogous measures based on the Q-function.
- Generalized Cook distance GD: This measure, similar to the usual Cook distance [10], determines the degree of influence of the ith observation on the estimate of $\theta$ and is defined by:
$$\mathrm{GD}_{i}=\big(\widehat{\theta}_{[i]}-\widehat{\theta}\big)^{\top}\big\{-\ddot{Q}(\widehat{\theta}\mid\widehat{\theta})\big\}\big(\widehat{\theta}_{[i]}-\widehat{\theta}\big).$$
- Q-distance QD: This measure of the influence of the ith case is similar to the likelihood distance discussed by Cook and Weisberg [12], and is defined by:
$$\mathrm{QD}_{i}=2\big\{Q(\widehat{\theta}\mid\widehat{\theta})-Q(\widehat{\theta}_{[i]}\mid\widehat{\theta})\big\}.$$
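A minimal sketch of these two measures is given below (our notation); the deleted-case estimate, the Q-function $Q(\cdot\mid\widehat{\theta})$ and its Hessian at $\widehat{\theta}$ are assumed to be available from the EM fit.

```r
# Case-deletion measures based on the Q-function (sketch; inputs assumed available).
gd_i <- function(theta_del, theta_hat, Qhess) {   # Qhess = Hessian of Q at theta_hat
  d <- theta_del - theta_hat
  as.numeric(t(d) %*% (-Qhess) %*% d)             # generalized Cook distance GD_i
}
qd_i <- function(theta_del, theta_hat, Qfun) {    # Qfun(theta) = Q(theta | theta_hat)
  2 * (Qfun(theta_hat) - Qfun(theta_del))         # Q-distance QD_i
}
```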
6. Application
In this section we consider the likelihood analysis of the dataset presented in [24], which comes from an ultrasonic calibration study. Labra et al. [22] analyzed this dataset with a heteroscedastic nonlinear model with SMSN errors and verified the presence of outliers. Here we reanalyze the dataset with the aim of showing the capacity of the SSMN distributions to fit real data presenting asymmetry and heavy tails in heteroscedastic nonlinear models. The data consist of 214 observations, where the response variable is the ultrasonic response Y and the predictor variable is the metal distance x. From the descriptive statistics presented in Table 4, we observe a large and positive sample skewness. The distance between the mean and the median suggests using an asymmetric distribution as an alternative to model the data. On the other hand, Figure 3 shows a nonlinear relationship between the metal distance and the ultrasonic response.
Table 4. Summary statistics for ultrasonic calibration data (SD is sample standard deviation).
Min | Max | Mean | Median | SD | Skewness | Kurtosis |
---|---|---|---|---|---|---|
3.75 | 92.90 | 30.26 | 21.11 | 23.68 | 0.91 | 2.56 |
Figure 3.
Scatter-plot of ultrasonic calibration data.
Following Lin et al. [24], we consider a SSMN-HNLM of the form:
(20) |
where for .
Table 5 contains the ML estimates of the parameters of the SN, StN, SSL, SPE and SCN models, together with their corresponding standard errors (SE) calculated via the observed information matrix (Appendix 1). Moreover, both the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) indicate that the SSMN models with heavy tails (StN-NLM, SSL-NLM and SPE-NLM) present a better fit than the SN model, with the StN-NLM presenting the best fit. In addition, by comparing our results with those obtained by Labra et al. [22, Table 2, p. 2159], we can see that the StN and SSL fits have higher log-likelihood values and consequently lower AIC values, indicating better performance of the SSMN models for the ultrasonic calibration dataset.
Table 5. ML estimation results of fitting various mixture models to the ultrasonic calibration data. The SE values are the asymptotic standard errors based on the observed information matrix.
SN-NLM | StN-NLM | SSL-NLM | SPE-NLM | |||||
---|---|---|---|---|---|---|---|---|
Parameter | Estimate | SE | Estimate | SE | Estimate | SE | Estimate | SE |
0.188 | 0.019 | 0.190 | 0.014 | 0.196 | 0.016 | 0.190 | 0.027 | |
0.006 | 4.6 | 0.006 | 3.8 | 0.006 | 3.8 | 0.006 | 4.7 | |
0.013 | 9.1 | 0.012 | 8.6 | 0.012 | 6.9 | 0.013 | 8.7 | |
33.981 | 5.997 | 11.244 | 2.050 | 9.977 | 0.101 | 18.514 | 12.545 | |
λ | 2.088 | 0.428 | 0.651 | 0.125 | 0.824 | 0.158 | 1.448 | 0.784 |
ρ | −1.082 | 0.128 | −1.028 | 0.119 | −1.091 | 0.065 | −1.091 | 0.139 |
ν | – | – | 3.846 | 1.184 | 1.454 | 0.058 | 0.773 | 0.150 |
Log-likelihood | −520.305 | −514.764 | −515.108 | −518.008 |
AIC | 1052.609 | 1043.529 | 1044.215 | 1050.017 | ||||
BIC | 1072.806 | 1067.090 | 1067.778 | 1073.578 |
From (20), the model is homoscedastic when $\rho=0$. So, the test for heteroscedasticity is based on the hypotheses $H_{0}:\rho=0$ versus $H_{1}:\rho\neq 0$, and the likelihood ratio test statistic has an approximate $\chi^{2}_{1}$ distribution under $H_{0}$. The LR statistic for the StN model led to rejection of $H_{0}$ at the usual significance levels. This result is in accordance with that obtained by Lin et al. [24] using the score statistic and by Labra et al. [22] using the likelihood ratio test. Therefore the assumption of homogeneity of variance is not suitable for the ultrasonic calibration data.
In order to detect incorrect specification of the error distribution, we use the Mahalanobis distance $d_{i}=(y_{i}-\widehat{\mu}_{i})^{2}/\widehat{\sigma}_{i}^{2}$, for $i=1,\dots,n$, to construct simulated envelopes. In the skew-normal case, we have $d_{i}\sim\chi^{2}_{1}$, so we can use as cutoff points the quantiles $\xi_{\gamma}$ of the $\chi^{2}_{1}$ distribution, where $\gamma$ is the chosen envelope level. From Ferreira et al. [15], analogous distributional properties of the Mahalanobis distance are available for the StN, SSL and SPE distributions.
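A minimal sketch for the SN case is given below (our notation); the fitted values $\widehat{\mu}_{i}$ and $\widehat{\sigma}_{i}^{2}$ are assumed to be available from the fitted model.

```r
# Mahalanobis distances and the chi-square reference used for the SN envelope (sketch).
mahalanobis_d <- function(y, mu_hat, sig2_hat) (y - mu_hat)^2 / sig2_hat

qq_ref <- function(d, gamma = 0.975) {
  n <- length(d)
  list(theoretical = qchisq(ppoints(n), df = 1),   # chi^2_1 reference quantiles for a QQ-plot
       cutoff      = qchisq(gamma, df = 1))        # upper cutoff xi_gamma
}
```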
The QQ-plots and simulated envelopes for the Mahalanobis distance of the fitted SN-NLM, StN-NLM, SSL-NLM and SPE-NLM models are shown in Figure 4. The lines in these figures represent the 2.5th percentile, the mean, and the 97.5th percentile of 100 simulated points for each observation. It can be seen that the SN and SPE models contain some observations outside the confidence band, but the StN and SSL models present good fit to the dataset.
Figure 4.
Ultrasonic calibration data. QQ-plots and simulated envelopes for the Mahalanobis distance. (a) SN-NLM, (b) StN-NLM, (c) SSL-NLM and (d) SPE-NLM.
First, we identify influential observations in the fitted models based on the case-deletion measures, the generalized Cook distance (Figure 5) and the Q-distance (Figure 6), which show similar patterns for each model but on a smaller scale for the StN model. We note from these figures that, for both measures, a few observations are potentially influential on the parameter estimates in the SN model, while different subsets of cases are flagged as influential in the StN and SSL models.
Figure 5.
Ultrasonic calibration data. Index plot of the generalized Cook distance. (a) SN-HNLM, (b) StN-HNLM, (c) SSL-HNLM.
Figure 6.
Ultrasonic calibration data. Index plot of the Q-distance. (a) SN-HNLM, (b) StN-HNLM, (c) SSL-HNLM.
We used the same strategy presented by Cao et al. [8] to compare the robustness of the models. Thus, in order to reveal the impact on the parameter estimates of the four observations considered as potential outliers according to the Mahalanobis distance (see Figure 7), we refitted the models after eliminating these four cases.
Figure 7.
Ultrasonic calibration data. Mahalanobis distance for SN-HNLM.
In Table 6, we show the estimated values and their relative changes (RC), defined for the kth parameter as the percentage change of the full-data estimate $\widehat{\theta}_{k}$ when the set of potential outliers is removed. We observe that the RCs for the StN-NLM are smaller than those for the SN-NLM, which implies that the StN model is less sensitive than the SN model to the presence of potential outliers.
Table 6. Ultrasonic calibration data. Relative changes (RC) of , after deleting outliers.
SN-NLM | StN-NLM | SSL-NLM | SPE-NLM | |||||
---|---|---|---|---|---|---|---|---|
Parameter | Estimate | RC | Estimate | RC | Estimate | RC | Estimate | RC |
0.1790 | 13.7299 | 0.1751 | 4.6419 | 0.1870 | 5.5330 | 0.2040 | −7.2792 | |
0.0055 | 9.7201 | 0.0056 | 2.6319 | 0.0057 | 4.4269 | 0.0061 | −4.1235 | |
0.0131 | −5.0432 | 0.0123 | −3.0677 | 0.0125 | −6.0685 | 0.0118 | 5.4580 | |
27.2558 | 31.9260 | 13.4812 | −0.5992 | 20.7536 | −108.0115 | 26.3738 | −42.4551 | |
λ | 1.8818 | 46.7073 | 0.6055 | 0.5030 | 1.2628 | −53.2015 | 1.6482 | −13.8311 |
ρ | −1.1117 | −19.3049 | −1.2033 | −5.6935 | −1.2359 | −13.2407 | −1.1854 | −8.6415 |
ν | – | – | 7.8713 | 2.2535 | 6.4618 | −344.3733 | 0.9999 | −29.2878 |
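A minimal sketch of the relative changes reported in Table 6 (our notation; the sign convention is an assumption):

```r
# Relative change (in %) of each parameter estimate after deleting the potential outliers.
rc <- function(theta_full, theta_del) 100 * (theta_full - theta_del) / theta_full
```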
Figures 8 and 9 present the local influence diagnostic analysis using the case-weight and response perturbation schemes, respectively. For each perturbation scheme we obtained the values of $M(0)_{l}$, and the figures present the corresponding index plots. The horizontal lines delimit the benchmark for $M(0)_{l}$, following Lee and Xu [23]. A few observations stand out as influential under the case-weight perturbation in all models, but on a smaller scale in the StN and SSL models. The same three observations are influential under the response perturbation in the StN and SSL models, while the SN model flags other observations as influential.
Figure 8.
Ultrasonic calibration data. Index plot of in the case weight perturbation. (a) SN-HNLM, (b) StN-HNLM, (c) SSL-HNLM.
Figure 9.
Ultrasonic calibration data. Index plot of in the response perturbation. (a) SN-HNLM, (b) StN-HNLM, (c) SSL-HNLM.
As suggested by a referee, in order to evaluate whether the likelihood ratio (LR) statistic for testing $H_{0}:\rho=0$ is sensitive to the presence of the influential observations #82, #130, #145, #162, #163, #175 and one further case flagged in the diagnostic analysis, we removed each observation individually and all of them together from the full data, and obtained the LR statistics and their corresponding p-values. Table 7 shows that $H_{0}$ was rejected in all cases, so the conclusion of heteroscedasticity under the SSMN-HNLM remains unchanged when influential observations are removed individually or jointly.
Table 7. Ultrasonic calibration data. Likelihood ratio (LR) statistics when removing each observation individually and all simultaneously.
Removed Observations | ||||||||
---|---|---|---|---|---|---|---|---|
Model | All influential | |||||||
SN | 60.596 | 82.487 | 66.013 | 69.210 | 80.171 | 79.849 | 64.741 | 86.439 |
StN | 31.361 | 34.673 | 32.034 | 34.305 | 32.913 | 33.547 | 30.954 | 45.529 |
SSL | 32.788 | 42.236 | 39.588 | 41.848 | 41.588 | 41.425 | 38.382 | 52.032 |
SPE | 32.897 | 36.674 | 33.389 | 35.998 | 34.794 | 34.521 | 32.070 | 56.935 |
7. Conclusions
In this paper we developed an EM-type (ECME) algorithm for maximum likelihood estimation in the SSMN-HNLM, where closed-form expressions are obtained for the E and M steps, with the standard errors as byproducts of the observed information matrix. Furthermore, we applied Zhu and Lee [35]'s approach to obtain case-deletion measures and local influence diagnostics. A simulation study was developed to verify the asymptotic distribution of the likelihood ratio test statistic and the empirical power of the test. For ρ = 0, the rejection rate of the test approached the true nominal level as n increased, and when n and ρ increased, the power of the test approached 1. The diagnostic analysis showed that the influence of the observations declined when we considered distributions with heavier tails than the SN one. The models can be fitted using standard software routines available in R, and the program codes are available from us on request.
Finally, the proposed method can be extended to a more general framework, such as censored regression models [31], measurement error models and multivariate regression models, providing satisfactory results at the expense of additional complexity of implementation. An in-depth investigation of such extension is beyond the scope of the present paper, but is certainly an interesting topic for future research.
Appendices.
Appendix 1. The observed information matrix for SSMN heteroscedastic nonlinear regression models
Consider the SSMN-HNLM model given in (12), where the corresponding observed-data log-likelihood function of is of the form . In this section we write for simplification. We have that where is the log-likelihood function of the corresponding symmetric SMN distribution and . To simplify the text, we write and . Thus, the observed information matrix for can be written as:
where:
with , with .
The first-order derivatives of in relation to are given by:
The second-order derivatives of in relation to are given by:
The first and second-order derivatives of in relation to can be calculated for each considered SMN distribution as follows:
A.1. The normal distribution
with .
A.2. The Student-t distribution
where , is the digamma function and is the trigamma function.
A.3. The slash distribution
where
A.4. The contaminated normal distribution
with .
The first partial derivatives are given by
The second partial derivatives are given by
A.5. The power-exponential distribution
Appendix 2. Computation of the function and its derivatives
-
Skew-normal distribution:
.
-
Skew power-exponential distribution:
In this case, there is no explicit form for function and consequently cannot be calculated explicitly.
-
Skew Student-t normal distribution:
Since and , we have that: -
Skew slash distribution:
In this case, and . Thus: -
Skew contaminated-normal distribution:
Here . So,
with , , , and .
Appendix 3. Hessian matrix and perturbation schemes
To obtain the diagnostic measures of the SSMN-HNLM based on the approach proposed by Zhu and Lee [35], it is necessary to compute the Hessian matrix, which is defined by , where . It follows from (15) that the derivatives have elements given by:
where
A.6. Perturbation schemes
In this section we consider three different perturbation schemes for the SSMN-HNLM. For each case, we need to calculate the matrix $\Delta_{\omega_{0}}$.
A.6.1. Case weight perturbation
Let $\omega=(\omega_{1},\dots,\omega_{n})^{\top}$ be an $n\times 1$ vector with $\omega_{0}=(1,\dots,1)^{\top}$. Then the expected value of the perturbed complete-data log-likelihood function (perturbed Q-function) can be written as:
In this case the elements of , , are given by
A.6.2. Response variable perturbation
A perturbation of the response variables is introduced by $y_{i}(\omega)=y_{i}+\omega_{i}s_{y}$, where $s_{y}$ is the standard deviation of $\mathbf{y}$. In this case, $\omega_{0}=(0,\dots,0)^{\top}$ and
In this case the elements of , , are given by
A.6.3. Explanatory variable perturbation
A perturbation in a specific explanatory variable can be obtained as , for , where is a scale factor that can be the standard deviation of . To simplify the notation, we write and . In this case, and
In this case the elements of , , are given by:
Funding Statement
The first author thanks FAPEMIG (Minas Gerais State Research Support Foundation) [grant number CEX APQ 01944/17] for financial support. The research of Aldo M. Garay was supported by Grant 420082/2016-6 from CNPq-Brazil.
Disclosure statement
No potential conflict of interest was reported by the authors.
References
- 1. Andrews D.F. and Mallows C.L., Scale mixtures of normal distributions, J. R. Stat. Soc. Ser. B 36 (1974), pp. 99–102.
- 2. Araújo M.C., Cysneiros A.H.M.A. and Montenegro L.C., Improved heteroskedasticity likelihood ratio tests in symmetric nonlinear regression models, Stat. Papers (2017). doi: 10.1007/s00362-017-0933-5.
- 3. Azzalini A., A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178.
- 4. Bates D.M. and Watts D.G., Nonlinear Regression Analysis and Its Applications, John Wiley & Sons, New York, 1988.
- 5. Beale E.M.L. and Little R.J.A., Missing values in multivariate analysis, J. R. Stat. Soc. Ser. B 37 (1975), pp. 129–146.
- 6. Branco M.D. and Dey D.K., A general class of multivariate skew-elliptical distributions, J. Multivar. Anal. 79 (2001), pp. 99–113. doi: 10.1006/jmva.2000.1960.
- 7. Cancho V.C., Lachos V.H. and Ortega E.M.M., A nonlinear regression model with skew-normal errors, Stat. Papers 51 (2010), pp. 547–558. doi: 10.1007/s00362-008-0139-y.
- 8. Cao C., Wang Y., Shi J.Q. and Lin J., Measurement error models for replicated data under asymmetric heavy-tailed distributions, Comput. Econ. 52 (2018), pp. 531–553. doi: 10.1007/s10614-017-9702-8.
- 9. Chen F., Zhu H.T. and Lee S.Y., Perturbation selection and local influence analysis for nonlinear structural equation model, Psychometrika 74 (2009), pp. 493–516. doi: 10.1007/s11336-009-9114-3.
- 10. Cook R.D., Detection of influential observation in linear regression, Technometrics 19 (1977), pp. 15–18.
- 11. Cook R.D., Assessment of local influence, J. R. Stat. Soc. Ser. B 48 (1986), pp. 133–169.
- 12. Cook R.D. and Weisberg S., Residuals and Influence in Regression, Chapman & Hall/CRC, Boca Raton, FL, 1982.
- 13. Dempster A., Laird N. and Rubin D., Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B 39 (1977), pp. 1–38.
- 14. Ferreira C.S. and Arellano-Valle R., Estimation and diagnostic analysis in skew-generalized-normal regression models, J. Stat. Comput. Simul. 88 (2018), pp. 1039–1059. doi: 10.1080/00949655.2017.1419351.
- 15. Ferreira C.S., Bolfarine H. and Lachos V.H., Skew scale mixtures of normal distributions: Properties and estimation, Stat. Methodol. 8 (2011), pp. 154–171. doi: 10.1016/j.stamet.2010.09.001.
- 16. Ferreira C.S. and Lachos V.H., Nonlinear regression models under skew scale mixtures of normal distributions, Stat. Methodol. 33 (2016), pp. 131–146. doi: 10.1016/j.stamet.2016.08.004.
- 17. Ferreira C.S., Lachos V.H. and Bolfarine H., Inference and diagnostics in skew scale mixtures of normal regression models, J. Stat. Comput. Simul. 85 (2015), pp. 517–537. doi: 10.1080/00949655.2013.828057.
- 18. Garay A.M., Lachos V.H. and Abanto-Valle C.A., Nonlinear regression models based on scale mixtures of skew-normal distributions, J. Korean Stat. Soc. 40 (2011), pp. 115–124. doi: 10.1016/j.jkss.2010.08.003.
- 19. Garay A.M., Lachos V.H., Labra F.V. and Ortega E.M.M., Statistical diagnostics for nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Comput. Simul. 84 (2014), pp. 1761–1778. doi: 10.1080/00949655.2013.766188.
- 20. Gómez H.W., Venegas O. and Bolfarine H., Skew-symmetric distributions generated by the distribution function of the normal distribution, Environmetrics 18 (2007), pp. 395–407. doi: 10.1002/env.817.
- 21. Henze N., A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271–275.
- 22. Labra F.V., Garay A.M., Lachos V.H. and Ortega E.M.M., Estimation and diagnostics for heteroscedastic nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Plan. Inference 142 (2012), pp. 2149–2165. doi: 10.1016/j.jspi.2012.02.018.
- 23. Lee S.Y. and Xu L., Influence analyses of nonlinear mixed-effects models, Comput. Stat. Data Anal. 45 (2004), pp. 321–341. doi: 10.1016/S0167-9473(02)00303-1.
- 24. Lin J.G., Xie F.C. and Wei B., Statistical diagnostics for skew-t-normal nonlinear models, Comm. Stat. Simul. Comput. 38 (2009), pp. 2096–2110. doi: 10.1080/03610910903249502.
- 25. Liu C. and Rubin D.B., The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence, Biometrika 81 (1994), pp. 633–648.
- 26. Louzada F., Ferreira P.H. and Diniz C.A., Skew-normal distribution for growth curve models in presence of a heteroscedasticity structure, J. Appl. Stat. 41 (2014), pp. 1785–1798. doi: 10.1080/02664763.2014.891005.
- 27. Meng X.L. and Rubin D.B., Maximum likelihood estimation via the ECM algorithm: A general framework, Biometrika 80 (1993), pp. 267–278.
- 28. Montenegro L.C., Bolfarine H. and Lachos V.H., Influence diagnostics for a skew extension of the Grubbs measurement error model, Comm. Stat. Simul. Comput. 38 (2009), pp. 667–681. doi: 10.1080/03610910802618385.
- 29. R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2018. Available at http://www.R-project.org/.
- 30. Ratkowsky D.A., Handbook of Nonlinear Regression Models, Marcel Dekker, New York, 1990.
- 31. Vaida F. and Liu L., Fast implementation for normal mixed effects models with censored response, J. Comput. Graph. Stat. 18 (2009), pp. 797–817. doi: 10.1198/jcgs.2009.07130.
- 32. Xie F.C., Lin J.G. and Wei B.C., Diagnostics for skew-normal nonlinear regression models with AR(1) errors, Comput. Stat. Data Anal. 53 (2009a), pp. 4403–4416. doi: 10.1016/j.csda.2009.06.010.
- 33. Xie F.C., Wei B.C. and Lin J.G., Homogeneity diagnostics for skew-normal nonlinear regression models, Stat. Probab. Lett. 79 (2009b), pp. 821–827. doi: 10.1016/j.spl.2008.11.001.
- 34. Zhu H.T., Ibrahim J.G., Lee S.Y. and Zhang H.P., Perturbation selection and influence measures in local influence analysis, Ann. Statist. 35 (2007), pp. 2565–2588. doi: 10.1214/009053607000000343.
- 35. Zhu H. and Lee S., Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126. doi: 10.1111/1467-9868.00279.
- 36. Zhu H., Lee S., Wei B. and Zhou J., Case-deletion measures for models with incomplete data, Biometrika 88 (2001), pp. 727–737. doi: 10.1093/biomet/88.3.727.