Heteroscedastic partially linear model under skew-normal distribution with application in ragweed pollen concentration

Clécio S Ferreira; Camila Borelli Zeller; Rafael R de Oliveira Garcia

doi:10.1080/02664763.2021.2024798

2022 Jan 13;50(6):1255–1282. doi: 10.1080/02664763.2021.2024798

PMCID: PMC10071991 PMID: 37025282

Abstract

We introduce a new class of heteroscedastic partially linear model (PLM) with skew-normal distribution. Maximum likelihood estimation of the model parameters by the ECM algorithm (Expectation/Conditional Maximization) as well as influence diagnostics for the new model are investigated. In addition, a Likelihood Ratio test for assessing the homogeneity of the scale parameter is presented. Simulation studies for assessing the performance of the ECM algorithm and the Likelihood Ratio test statistics for homogeneity of variance are developed. Also, a study for misspecification of the structure function is considered. Finally, an application of the new heteroscedastic PLM to a real data set on ragweed pollen concentration is presented to show that it provides a better fit than the classic homoscedastic PLM. We hope that the proposed model may attract applications in different areas of knowledge.

Keywords: Partially linear models, skew-normal distribution, heteroscedasticity, ECM algorithm, local influence

1. Introduction

Partially linear models (PLMs) or semiparametric models have been studied by various authors (see, for instance, [16,30], and references therein). These models add a nonparametric component to the usual linear relation between the response and explanatory variables. When the data set presents heterogeneity of variance, the PLMs can be extended by incorporating, for example, a positive continuously differentiable function involving a subset of explanatory variables, named here as Heteroscedastic Partially Linear Models (HPLM). In this context, Chen and You [5] proposed to estimate the nonparametric component through kernel smoothing and constructed a semiparametric generalized least-squares estimator for the parametric component. Ma et al. [26] studied the HPLMs with an unspecified partial baseline component and a nonparametric variance function, proposing a family of consistent estimators and investigating their asymptotic properties. They showed that the optimal semiparametric efficiency bound can be reached by a semiparametric kernel estimator in this family. You et al. [35] proposed a test of heteroscedasticity, a two-step estimator of the heteroscedastic variance function, semiparametric generalized least-squares estimators of the parametric and nonparametric components of the model, and a bootstrap goodness-of-fit test to see whether the nonparametric component can be parametrized. More recently, Keilegom and Wang [21] studied a general class of location-dispersion regression models, in which both the location function and the dispersion function are semiparametrically modeled. Note that these works use the normal (Gaussian) distribution to model data, including the use of the traditional method of least-squares in the models.

However, in some situations, where the data are asymmetric, the above methods may not be appropriate, particularly when the response assumes real values. In this sense, a distribution that accommodates skewness, and includes the normal distribution as a special case, was introduced by Azzalini [2], named skew-normal. This class has been studied by various authors in different contexts (see, for example, [4,23,27], among others). An extensive, but not exhaustive, list of the publications involving skew-normal distribution can be accessed on Azzalini's homepage¹. Ferreira and Paula [12] proposed estimation and diagnostic for homoscedastic PLMs in an asymmetric context using the skew-normal distribution (PLM-SN), by developing the expectation-maximization (EM) algorithm for linear regression models and diagnostic analysis through local influence as well as generalized leverage, following the approach of Zhu and Lee [37]. To our knowledge, there is no research in the scientific literature involving HPLMs with an asymmetric error.

So, a natural extension is to propose a PLM using the SN distribution in the presence of heteroscedasticity and to develop influence diagnostics for detecting influential observations, which is the objective of this work. It is important to note that we implemented the ECM algorithm (Expectation/Conditional Maximization) and obtained closed-form expressions for all the estimators of the parameters of the proposed model, except for the parameter of heterogeneity. Influence analysis is an important and key step in data analysis after parameter estimation. There are two approaches for detecting influential observations: the case-deletion approach [7,38] and the local influence approach [6,37]. Since the parameters of the HPLM under the skew-normal distribution are estimated via the ECM algorithm, the local influence approach in this work is based on the complete-data technique proposed by Zhu and Lee [37], which uses the conditional expectation of the complete-data log-likelihood function (Q-function) from the ECM algorithm.

Another interesting aspect is that in the PLM, the standard assumption is that all observations have equal variances, but in some cases these models do not comply with this assumption, affecting the efficiency of the estimators. Therefore, it is important to develop tests that allow us to determine the presence or absence of such homogeneity. In this paper, we propose a Likelihood Ratio test to check the homogeneity of the scale parameter in the HPLM-SN model.

The rest of the work is organized as follows. In Section 3, the heteroscedastic partially linear model, under the assumption that the errors follow the skew-normal distribution, is presented and a penalized log-likelihood function is considered for the parameter estimation. The ECM algorithm to obtain the maximum likelihood estimate of the parameters of the HPLM-SN model is given in Section 4 and the discussion of degrees of freedom estimation and goodness-of-fit is also presented. Residual analysis is given in Section 5 to identify atypical observations and/or model misspecification, since residuals are measures of agreement between the data and the fitted model. Section 6 presents local influence measures using the methodology proposed by Zhu and Lee [37]. The Hessian and the corresponding matrices of four perturbation schemes are also derived. In Section 7, we discuss the Likelihood Ratio test to test the homogeneity of a scale parameter. The properties of the Likelihood Ratio test statistics are investigated through Monte Carlo simulations. Section 8 deals with simulation studies to evaluate the efficiency of the ECM algorithm and the Likelihood Ratio test for homogeneity of variance. In addition, a study for misspecification of the structure function is considered. Finally, in Section 9, we illustrate the methodology by considering an application with a real data set, where the main interest is to explain the daily pollen concentration. Section 10 summarizes the contributions of the paper.

2. Motivating example

We illustrate our proposed methods with a data set obtained from [32], which is available, e.g. in the R package SemiPar. This data set contains data on ragweed levels and meteorological variables for 335 days in Kalamazoo, Michigan, USA, from 1991 to 1994. Ferreira and Paula [12] have analyzed this data set using a homoscedastic partially linear model under asymmetric distributions. Now we revisit this data set with the aim of expanding the inferential results to the heteroscedastic partially linear skew-normal model. Following [12], the explanatory variables included in our proposed model are the indicator of significant rain (1: at least 3 hours of steady or brief but intense rain, 0: other), temperature (degrees Fahrenheit), wind speed forecast for the following day (mph) and time (t-days in season). The response variable $(\sqrt{y})$ is the square root of the pollen concentration of ragweed ( $\text{grains} / m^{3}$ ). First, we fit a PLM-SN model to the data as specified by [12]:

\sqrt{y_{i}} = β_{1} \text{rain}_{i} + β_{2} \text{temperature}_{i} + β_{3} \text{wind}_{i} + f (t_{i}) + ϵ_{i}, i = 1, \dots, 335,

(1)

where $ϵ_{i}$ are iid errors following a skew-normal distribution, i.e. $SN (0, σ^{2}, λ)$ . To detect deviations from the error model, we use the standard residuals $e_{i}^{0}, i = 1, \dots, 335$ , defined in Section 5. Figure 1(a) presents the plot of the standard residuals $e_{i}^{0}$ , where we may see a heteroscedastic behaviour of the residuals. Furthermore, Figures 1(b) and 1(c) show the scatter plots between these residuals and the explanatory variables temperature and wind. We can note that the heteroscedasticity of the residuals is due to the temperature.

Figure 1. — Standard residuals of the Homoscedastic PLM-SN model fitted ( $e_{i}^{0}$ 's) to the ragweed levels data: (a) plot of the residuals, (b) scatter plot between residuals and *temperature*, and (c) scatter plot between residuals and *wind*.

Thus, this paper aims to introduce a new class of heteroscedastic PLM under the skew-normal distribution. In other words, this work addresses research topics that are important for proper data analysis. When skewness and heteroscedasticity are both a concern in a partially linear regression model, this manuscript may be a useful reference for coping with those problems, and hence a contribution to the statistics literature.

3. The proposed model

In this section, we propose the heteroscedastic partially linear model under the assumption that the errors follow the skew-normal distribution and discuss the penalized function method, which is often required to maximize the penalized likelihood function.

3.1. Skew-normal distribution

We start with the definition of the skew-normal (SN) distribution that will be used in this article; see [2] for more details. A random variable $Y \sim SN (μ, σ^{2}, λ)$ if its probability density function (pdf) is given by

f (y | μ, σ^{2}, λ) = 2 ϕ (y | μ, σ^{2}) Φ (\frac{λ (y - μ)}{σ}), y \in R,

(2)

where $ϕ (\cdot | μ, σ^{2})$ stands for the pdf of the normal distribution with mean μ and variance $σ^{2}$ , and $Φ (\cdot)$ represents the cumulative distribution function (cdf) of the standard normal distribution. Its stochastic representation is given by

Y \overset{d}{=} μ + σ (δ | T_{0} | + (1 - δ^{2})^{1 / 2} T_{1}), with δ = \frac{λ}{\sqrt{1 + λ^{2}}},

(3)

where $| T_{0} |$ denotes the absolute value of $T_{0}$ , $T_{0} \sim N (0, 1)$ and $T_{1} \sim N (0, 1)$ are independent, and ‘ $\overset{d}{=}$ ’ means ‘distributed as’. This convenient hierarchical representation facilitates EM-type implementation for the maximum-likelihood estimation and can be used to simulate data. Note that if $Y \sim SN (μ, σ^{2}, λ),$ then $Z = (Y - μ) / σ \sim SN (0, 1, λ) .$ A particular case of this distribution is the normal distribution ( $Y \sim N (μ, σ^{2})$ ) when $λ = 0$ . From (3) it follows that the expectation and variance of Y are given, respectively, by

E [Y] = μ + c σ δ and Var [Y] = σ^{2} (1 - c^{2} δ^{2}), with c = \sqrt{\frac{2}{π}} .

(4)
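As a quick numerical illustration of the stochastic representation (3) and the moments in (4), the following sketch simulates skew-normal draws and compares the sample mean and variance with the theoretical values. The function names `rskewnormal` and `sn_moments`, the seed and the parameter values are illustrative choices, not part of the original work.

```python
import numpy as np

def rskewnormal(n, mu, sigma2, lam, rng):
    """Draw n values from SN(mu, sigma2, lam) via the stochastic representation (3)."""
    delta = lam / np.sqrt(1.0 + lam**2)
    t0 = np.abs(rng.standard_normal(n))   # |T0|, half-normal
    t1 = rng.standard_normal(n)           # T1, independent of T0
    return mu + np.sqrt(sigma2) * (delta * t0 + np.sqrt(1.0 - delta**2) * t1)

def sn_moments(mu, sigma2, lam):
    """Mean and variance of SN(mu, sigma2, lam) from equation (4)."""
    delta = lam / np.sqrt(1.0 + lam**2)
    c = np.sqrt(2.0 / np.pi)
    return mu + c * np.sqrt(sigma2) * delta, sigma2 * (1.0 - c**2 * delta**2)

rng = np.random.default_rng(0)            # arbitrary seed
y = rskewnormal(200_000, mu=1.0, sigma2=4.0, lam=3.0, rng=rng)
mean_th, var_th = sn_moments(1.0, 4.0, 3.0)
print(y.mean(), mean_th)                  # sample vs. theoretical mean
print(y.var(), var_th)                    # sample vs. theoretical variance
```

Setting `lam=0.0` gives `delta = 0`, so the draws reduce to ordinary normal variates, in line with the special case noted above.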

3.2. Model specification

In this section, we define the heteroscedastic partially linear model under skew-normal distribution. First, consider the homoscedastic partially linear model under skew-normal distribution (PLM-SN model), as defined by Ferreira and Paula [12], given by

Y_{i} = x_{i}^{⊤} β + f (t_{i}) + ϵ_{i}, i = 1, \dots, n,

(5)

where $Y_{i}$ denotes the response of the ith experimental unit, $x_{i}$ is a known $p \times 1$ covariate vector, $β$ is a p-dimensional vector of unknown regression coefficients, $t_{i}$ is a scalar that may represent a value of a continuous covariate, for example, time, $f (\cdot)$ is a smooth function, and $ϵ_{i}$ are independent random errors such that $ϵ_{i} \sim SN (0, σ^{2}, λ)$ . However, the actual scale parameter may vary with the ith observation $Y_{i}$ , so that its variance is nonconstant.

We extend the PLM-SN model defined in (5) by allowing for heteroscedasticity within the skew-normal structure. Thus, for the ith experimental unit,

ϵ_{i} \sim SN (0, σ_{i}^{2}, λ), σ_{i}^{2} = σ^{2} m_{i} (ρ, z_{i}), i = 1, \dots, n,

(6)

where $m_{i} = m (ρ, z_{i})$ is a known continuously differentiable positive function, $z_{i}$ contains the values of the explanatory variables, which generally constitute, although not necessarily, a subset of $x_{i}$ , and $ρ : p^{*} \times 1$ is a vector of unknown parameters. If the variances depend on the values of some explanatory variables $z_{i}$ , for example, a specific form of $m_{i}$ is the log-linear model given by $m_{i} (ρ, z_{i}) = \exp (\sum_{j = 1}^{p^{*}} ρ_{j} z_{i j})$ ; see [11,22,34] and references therein for more details. It is assumed that there is a $ρ_{0}$ such that $m (ρ_{0}, z_{i}) = 1$ , for all $i = 1, \dots, n$ . We call the structure defined by (5) and (6) the HPLM-SN model (heteroscedastic partially linear skew-normal model).
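The log-linear form of $m_{i}$ mentioned above can be sketched in a few lines. The helper name `m_loglinear` and the covariate values are hypothetical and serve only to show that $ρ_{0} = 0$ recovers the homoscedastic case $m_{i} = 1$.

```python
import numpy as np

def m_loglinear(rho, Z):
    """m_i(rho, z_i) = exp(sum_j rho_j * z_ij), one value per row z_i of Z."""
    return np.exp(np.asarray(Z, dtype=float) @ np.asarray(rho, dtype=float))

Z = np.array([[60.0], [70.0], [80.0]])   # hypothetical values of a single covariate
m_hetero = m_loglinear([0.02], Z)        # scale grows with the covariate
m_homo = m_loglinear([0.0], Z)           # rho_0 = 0 gives m_i = 1 for all i
```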

Alternatively, the model (5)–(6) can be written in matrix form as follows

Y = X β + N f + ϵ,

(7)

where $Y = (Y_{1}, \dots, Y_{n})^{⊤}$ , $X$ is an $n \times p$ design matrix with rows $x_{i}^{⊤}$ , $N$ is an $n \times q$ incidence matrix, $f$ is a $q \times 1$ vector (unknown smooth curve) and $ϵ = (ϵ_{1}, \dots, ϵ_{n})^{⊤}$ is an $n \times 1$ vector of random errors.

3.3. Penalized log-likelihood function

From expressions (6)–(7), we have that $Y_{i} \sim SN (μ_{i}, σ_{i}^{2}, λ)$ , where $μ_{i} = x_{i}^{⊤} β + n_{i}^{⊤} f$ and $n_{i}^{⊤}$ is the ith row of $N$ . Hence, the observed-data log-likelihood function of $θ = (β^{⊤}, σ^{2}, λ, ρ^{⊤}, f^{⊤})^{⊤}$ is given by

\begin{aligned} ℓ (θ) \propto - \frac{n}{2} \log σ^{2} - \frac{1}{2} \sum_{i = 1}^{n} \log m_{i} - \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} \frac{(y_{i} - μ_{i})^{2}}{m_{i}} + \sum_{i = 1}^{n} \log Φ (\frac{λ (y_{i} - μ_{i})}{σ m_{i}^{1 / 2}}) . \end{aligned}

(8)
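For readers who want to evaluate (8) numerically, a minimal sketch is given below. The function names `log_Phi` and `loglik_hplm_sn` are illustrative, and the additive constant dropped in (8) is dropped here as well.

```python
import math
import numpy as np

def log_Phi(u):
    """Elementwise log of the standard normal cdf, computed with math.erfc."""
    return np.array([math.log(0.5 * math.erfc(-x / math.sqrt(2.0)))
                     for x in np.atleast_1d(u)])

def loglik_hplm_sn(y, mu, sigma2, lam, m):
    """Observed-data log-likelihood (8), up to an additive constant.

    y, mu, m are arrays of length n; sigma2 and lam are scalars.
    """
    e = y - mu
    n = y.size
    return (-0.5 * n * np.log(sigma2)
            - 0.5 * np.sum(np.log(m))
            - 0.5 * np.sum(e**2 / m) / sigma2
            + np.sum(log_Phi(lam * e / np.sqrt(sigma2 * m))))
```

With `lam = 0.0` the last term is the constant $n \log (1 / 2)$ , so the function reduces to a heteroscedastic Gaussian log-likelihood up to constants.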

The direct maximization of (8) is difficult due to the term $Φ (\cdot) .$ In addition, maximization of (8) without imposing restrictions on the function $f (\cdot)$ may cause overfitting and nonidentification of $β$ (see, for instance, [14]). A well-known procedure that is based on the idea of log-likelihood penalization consists in incorporating a penalty function in the log-likelihood, such that

\begin{aligned} ℓ_{p} (θ, α) = ℓ (θ) - \frac{α}{2} J (f), \end{aligned}

(9)

where $J (f)$ denotes the penalty function over $f (\cdot)$ and α is a smoothing parameter that controls the tradeoff between goodness-of-fit, measured by large values of $ℓ (θ)$ , and the smoothness of the estimated function, measured by small values of $J (f)$ . Therefore, the determination of the parameter α is a crucial part of the estimation process, for which different selection methods are available in the literature, such as the Akaike information criterion or the Bayesian information criterion. Following [15,18], we use a natural cubic spline to estimate f and the penalty $J (f) = \int_{a}^{b} [f^{(2)} (t)]^{2} d t,$ where $f^{(2)} (t)$ denotes the second derivative of $f (t)$ and $[a, b]$ contains the values $t_{1}^{0}, \dots, t_{q}^{0}$ , the distinct and ordered values of $t_{i}$ . Moreover, from Theorem 2.1 in [15], page 13, the penalty function satisfies $\int_{a}^{b} [f^{(2)} (t)]^{2} d t = f^{⊤} K f,$ where $K \in R^{q \times q}$ is a nonnegative definite matrix that depends only on the knots. In this article, we use the methods of [15] (first method) and [10] (second method) to construct the matrices $N$ and $K$ . They differ in the choice of knots used to estimate the curve f. In the first method, $f = (f (t_{1}^{0}), \dots, f (t_{q}^{0}))^{⊤}$ , with $t_{1}^{0}, \dots, t_{q}^{0}$ being the distinct and ordered values of $t_{i}$ , and $N$ is an $n \times q$ incidence matrix whose $(i, j)$ th element equals the indicator function $I (t_{i} = t_{j}^{0})$ for $j = 1, \dots, q$ . In the second method, the knots are chosen according to a pre-established interval using B-splines and we estimate $f (\cdot)$ as a B-spline of order 3 [3], i.e. $f (x) = \sum_{j = 1}^{q} f_{j} B_{j} (x)$ . In this case, the elements of $N$ are given by $n_{i j} = B_{j} (t_{i})$ , $i = 1, \dots, n$ and $j = 1, \dots, q$ . The function ‘bspline(t, q, df)’ in R [29] is used, where q is the number of equidistant knots desired by the user and df is the order of the polynomial, which equals 3 in this paper (a cubic B-spline); see Appendix 2 for more details. The expression of the matrix $K$ is described in Appendix 2.
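For the first method, the incidence matrix $N$ and a penalty matrix $K$ can be sketched as below. The construction $K = Q R^{- 1} Q^{⊤}$ is the standard natural cubic spline form found in the smoothing-spline literature; we assume, without having the appendix at hand, that it matches the matrix used here. The function name `incidence_and_penalty` is illustrative, and at least three distinct values of $t_{i}$ are required.

```python
import numpy as np

def incidence_and_penalty(t):
    """Incidence matrix N and penalty K = Q R^{-1} Q^T for a natural cubic
    spline with knots at the distinct ordered values of t (needs q >= 3)."""
    t = np.asarray(t, dtype=float)
    knots, idx = np.unique(t, return_inverse=True)   # t_1^0 < ... < t_q^0
    n, q = t.size, knots.size
    N = np.zeros((n, q))
    N[np.arange(n), idx] = 1.0                        # I(t_i = t_j^0)
    h = np.diff(knots)                                # spacings between knots
    Q = np.zeros((q, q - 2))
    R = np.zeros((q - 2, q - 2))
    for j in range(q - 2):                            # one column per interior knot
        Q[j, j] = 1.0 / h[j]
        Q[j + 1, j] = -1.0 / h[j] - 1.0 / h[j + 1]
        Q[j + 2, j] = 1.0 / h[j + 1]
        R[j, j] = (h[j] + h[j + 1]) / 3.0
        if j + 1 < q - 2:
            R[j, j + 1] = R[j + 1, j] = h[j + 1] / 6.0
    K = Q @ np.linalg.solve(R, Q.T)
    return N, K, knots
```

A useful sanity check is that constant and linear curves carry no penalty, i.e. `K @ np.ones(q)` and `K @ knots` are numerically zero, since their second derivative vanishes.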

4. Statistical inference

In this section, we discuss some inferential aspects in the HPLM-SN model as well as the penalized maximum likelihood estimation using the ECM algorithm. The ECM algorithm [28] to obtain the maximum likelihood estimate of $θ$ and a discussion of degrees of freedom estimation are given in Section 4.1. The standard error estimation of $\hat{θ}$ is presented in Appendix 1.

4.1. Parameter estimation using the ECM algorithm

In this section, we present an ECM algorithm for the ML estimation of the HPLM-SN model. To explore the ECM algorithm, we present the HPLM-SN model in an incomplete data framework, using the results presented in Section 3. Thus, from Equation (3), the set-up defined above can be written hierarchically as

\begin{aligned} Y_{i} | W_{i} = w_{i} & \sim N (μ_{i} + σ_{i} δ w_{i}, σ_{i}^{2} (1 - δ^{2})), \end{aligned}

(10)

\begin{aligned} W_{i} & \sim T N (0, 1; (0, + \infty)), \end{aligned}

(11)

for $i = 1, \dots, n$ all independent, where $T N (r, s; (a, b))$ denotes the univariate normal distribution $(N (r, s))$ , truncated on the interval $(a, b)$ [20]. Let $y = (y_{1}, \dots, y_{n})^{⊤}$ and $w = (w_{1}, \dots, w_{n})^{⊤} .$ Then, under the hierarchical representation (10)–(11), it follows that the complete log-likelihood function associated with $y_{c} = (y^{⊤}, w^{⊤})^{⊤}$ is

\begin{aligned} ℓ_{c} (θ | y_{c}) & \propto - n \log σ^{2} - \sum_{i = 1}^{n} \log m_{i} - \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} \frac{1}{m_{i}} \\ \times [(1 + λ^{2}) (y_{i} - μ_{i})^{2} - 2 λ w_{i} (y_{i} - μ_{i}) + w_{i}^{2}] . \end{aligned}

As in the original proposal of [9], the E-step of our algorithm consists of taking the conditional expectation $Q (θ | {\hat{θ}}^{(k)}) = E [ℓ_{c} (θ | y_{c}) | y, {\hat{θ}}^{(k)}]$ , where ${\hat{θ}}^{(k)} = ({\hat{β}}^{(k) ⊤}, {\hat{f}}^{(k) ⊤}, {\hat{σ^{2}}}^{(k)}, {\hat{λ}}^{(k)}, {\hat{ρ}}^{(k) ⊤})^{⊤},$ is the current estimate of $θ$ at the kth iteration. The maximum penalized likelihood estimate (MPLE) of $θ$ is the value that maximizes the function

\begin{aligned} Q_{p} (θ | {\hat{θ}}^{(k)}) = Q (θ | {\hat{θ}}^{(k)}) - \frac{α}{2} J (f) . \end{aligned}

(12)

Given ${\hat{α}}^{(k)}$ , the M-step consists of the maximization of $Q_{p} (θ | {\hat{θ}}^{(k)})$ with respect to $θ$ . It follows, after some simple algebra, that the conditional expectation of the complete log-likelihood function has the form

\begin{aligned} Q (θ | {\hat{θ}}^{(k)}) & \propto - n \log σ^{2} - \sum_{i = 1}^{n} \log m_{i} - \frac{(1 + λ^{2})}{2 σ^{2}} (y - X β - N f)^{⊤} \\ \times H (y - X β - N f) - \frac{1}{2 σ^{2}} {\hat{w^{2}}}^{(k) ⊤} H 1_{n} \\ + \frac{λ}{σ^{2}} (y - X β - N f)^{⊤} H {\hat{w}}^{(k)}, \end{aligned}

(13)

where $1_{n}$ is an $n \times 1$ vector of 1's, $H$ is the diagonal matrix with diagonal elements $(m_{1}^{- 1} (ρ, z_{1}), \dots, m_{n}^{- 1} (ρ, z_{n}))$ , ${\hat{w}}^{(k)} = ({\hat{w_{1}}}^{(k)}, \dots, {\hat{w_{n}}}^{(k)})^{⊤}$ and ${\hat{w^{2}}}^{(k)} = ({\hat{w_{1}^{2}}}^{(k)}, \dots, {\hat{w_{n}^{2}}}^{(k)})^{⊤}$ are $n \times 1$ vectors, with ${\hat{w}}_{i}^{(k)} = E [W_{i} | y_{i}, {\hat{θ}}^{(k)}]$ and ${\hat{w^{2}}}_{i}^{(k)} = E [W_{i}^{2} | y_{i}, {\hat{θ}}^{(k)}]$ .

4.1.1. Step-by-step instructions for the ECM algorithm

The ECM algorithm for the HPLM-SN model can be summarized in the following steps:

E-step: Given the current estimates ${\hat{θ}}^{(k)}$ and ${\hat{α}}^{(k)}$ at the kth iteration, we obtain the conditional expectation of the complete data log-likelihood function given the observed $y$ , named the Q-function, which is given by (13), such that
$\begin{aligned} {\hat{w}}_{i}^{(k)} & = {\hat{λ}}^{(k)} e_{i}^{(k)} + {\hat{σ}}_{i}^{(k)} W_{Φ} (\frac{{\hat{λ}}^{(k)} e_{i}^{(k)}}{{\hat{σ}}_{i}^{(k)}}), \end{aligned}$ (14)

$\begin{aligned} {\hat{w^{2}}}_{i}^{(k)} & = {[{\hat{λ}}^{(k)} e_{i}^{(k)}]}^{2} + {\hat{σ^{2}}}_{i}^{(k)} + {\hat{λ}}^{(k)} {\hat{σ}}_{i}^{(k)} e_{i}^{(k)} W_{Φ} (\frac{{\hat{λ}}^{(k)} e_{i}^{(k)}}{{\hat{σ}}_{i}^{(k)}}), \end{aligned}$ (15)
where $e_{i}^{(k)} = y_{i} - x_{i}^{⊤} {\hat{β}}^{(k)} - n_{i}^{⊤} {\hat{f}}^{(k)}$ , ${\hat{σ^{2}}}_{i}^{(k)} = {\hat{σ^{2}}}^{(k)} m_{i} ({\hat{ρ}}^{(k)}, z_{i})$ , $i = 1, \dots, n,$ and $W_{Φ} (u) = ϕ (u) / Φ (u)$ .
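The E-step quantities (14) and (15) translate directly into code. The sketch below takes the current residuals $e_{i}^{(k)}$ , the scale ${\hat{σ^{2}}}^{(k)}$ , the skewness ${\hat{λ}}^{(k)}$ and the vector of $m_{i}$ values as inputs; the function names `W_Phi` and `e_step` are illustrative.

```python
import math
import numpy as np

def W_Phi(u):
    """W_Phi(u) = phi(u) / Phi(u), elementwise."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    pdf = np.exp(-0.5 * u**2) / math.sqrt(2.0 * math.pi)
    cdf = np.array([0.5 * math.erfc(-x / math.sqrt(2.0)) for x in u])
    return pdf / cdf

def e_step(e, sigma2, lam, m):
    """Conditional moments (14) and (15) for residuals e_i = y_i - mu_i."""
    s = np.sqrt(sigma2 * m)                     # sigma_i = sqrt(sigma2 * m_i)
    r = W_Phi(lam * e / s)
    w = lam * e + s * r                         # equation (14)
    w2 = (lam * e)**2 + s**2 + lam * s * e * r  # equation (15)
    return w, w2
```

For very negative arguments the ratio $ϕ (u) / Φ (u)$ can lose precision in floating point, so a production implementation might prefer a log-scale evaluation; this sketch does not address that.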

Conditional maximization steps (CM-steps) are given as follows.

CM-step 1: Fix ${\hat{α}}^{(k)}$ and update ${\hat{β}}^{(k)}$ , ${\hat{f}}^{(k)}$ , ${\hat{σ^{2}}}^{(k)}$ and ${\hat{λ}}^{(k)}$ as

\begin{aligned} {\hat{β}}^{(k + 1)} & = {(X^{⊤} {\hat{H}}^{(k)} X)}^{- 1} X^{⊤} {\hat{H}}^{(k)} [y - N {\hat{f}}^{(k)} - \frac{{\hat{λ}}^{(k)}}{1 + ({\hat{λ}}^{(k)})^{2}} {\hat{w}}^{(k)}], \\ {\hat{f}}^{(k + 1)} & = {(N^{⊤} {\hat{H}}^{(k)} N + \frac{α {\hat{σ^{2}}}^{(k)}}{1 + ({\hat{λ}}^{(k)})^{2}} K)}^{- 1} N^{⊤} {\hat{H}}^{(k)} (y - X {\hat{β}}^{(k)} - \frac{{\hat{λ}}^{(k)}}{1 + ({\hat{λ}}^{(k)})^{2}} {\hat{w}}^{(k)}), \\ {\hat{σ^{2}}}^{(k + 1)} & = \{ 1_{n}^{⊤} {\hat{H}}^{(k)} {\hat{w^{2}}}^{(k)} - 2 {\hat{λ}}^{(k)} {\hat{w}}^{(k) ⊤} {\hat{H}}^{(k)} (y - {\hat{μ}}^{(k)}) + [1 + ({\hat{λ}}^{(k)})^{2}] S ({\hat{β}}^{(k)}, {\hat{f}}^{(k)}, {\hat{ρ}}^{(k)}) \} / (2 n) \end{aligned}

and

\begin{aligned} {\hat{λ}}^{(k + 1)} = {\hat{w}}^{(k) ⊤} {\hat{H}}^{(k)} (y - {\hat{μ}}^{(k)}) / S ({\hat{β}}^{(k)}, {\hat{f}}^{(k)}, {\hat{ρ}}^{(k)}), \end{aligned}

(16)

where $S (β, f, ρ) = (y - μ)^{⊤} H (y - μ)$ and $μ = X β + N f$ .
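The closed-form updates in (16) can be sketched as a single function. This is a minimal illustration under the reading that every right-hand side uses the iteration-k values; the function name `cm_step1` and its argument layout are assumptions, with `m` holding the values $m_{i} ({\hat{ρ}}^{(k)}, z_{i})$ and `w`, `w2` the E-step outputs.

```python
import numpy as np

def cm_step1(y, X, N, K, m, w, w2, beta, f, sigma2, lam, alpha):
    """One pass of the closed-form updates (16), using iteration-k values
    (beta, f, sigma2, lam) on every right-hand side."""
    h = 1.0 / m                                   # diagonal of H
    c = lam / (1.0 + lam**2)
    XtH = X.T * h                                 # X^T H
    NtH = N.T * h                                 # N^T H
    beta_new = np.linalg.solve(XtH @ X, XtH @ (y - N @ f - c * w))
    A = NtH @ N + alpha * sigma2 / (1.0 + lam**2) * K
    f_new = np.linalg.solve(A, NtH @ (y - X @ beta - c * w))
    e = y - X @ beta - N @ f                      # y - mu^(k)
    S = np.sum(h * e**2)                          # S(beta, f, rho)
    whe = np.sum(h * w * e)                       # w^T H (y - mu)
    sigma2_new = (np.sum(h * w2) - 2.0 * lam * whe
                  + (1.0 + lam**2) * S) / (2.0 * y.size)
    lam_new = whe / S
    return beta_new, f_new, sigma2_new, lam_new
```

Solving the linear systems with `np.linalg.solve` avoids forming explicit inverses; CM-step 2 for ρ still requires a numerical optimizer and is not covered by this sketch.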

CM-step 2: Fix ${\hat{α}}^{(k)}$ and given ${\hat{β}}^{(k + 1)}, {\hat{f}}^{(k + 1)}, {\hat{σ^{2}}}^{(k + 1)}$ and ${\hat{λ}}^{(k + 1)},$ update ${\hat{ρ}}^{(k)}$ as
$\begin{aligned} {\hat{ρ}}^{(k + 1)} = {argmax}_{ρ} Q (ρ | {\hat{β}}^{(k + 1)}, {\hat{f}}^{(k + 1)}, {\hat{σ^{2}}}^{(k + 1)}, {\hat{λ}}^{(k + 1)}) . \end{aligned}$
Given ${\hat{θ}}^{(k + 1)},$ update ${\hat{α}}^{(k)}$ as
$\begin{aligned} {\hat{α}}^{(k + 1)} = {argmin}_{α} BIC (α | {\hat{θ}}^{(k + 1)}) . \end{aligned}$
We use the Bayesian information criterion (BIC) to select the value of α, given by
$\begin{aligned} BIC (α) = - 2 ℓ_{p} (\hat{θ}, α) + p (α) \log n, \end{aligned}$
where $ℓ_{p} (\hat{θ}, α)$ denotes the penalized log-likelihood function defined in (9), evaluated at $\hat{θ}$ for a fixed α, and n is the sample size. Note that maximizing the penalized log-likelihood function is equivalent to minimizing the BIC. This procedure requires a one-dimensional search, which can be easily accomplished by using, for example, the ‘optim’ routine in R [29] to estimate α, with α between 0.001 and $10^{3}$ . In additive linear models, the degrees of freedom are defined as approximately the number of effective parameters involved in modeling the nonparametric effects [17,19]. In our case, using the expression of $\hat{f}$ in (16), we derive the effective degrees of freedom as
$\begin{aligned} d f (α) = tr \{ N {(N^{⊤} \hat{H} N + \frac{α {\hat{σ}}^{2}}{1 + {\hat{λ}}^{2}} K)}^{- 1} N^{⊤} \hat{H} \}, \end{aligned}$
where $H$ is defined in (13). Therefore, one has a total of $p (α) = p + p^{*} + 2 + d f (α)$ parameters to be estimated.
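The effective degrees of freedom and the BIC can be computed as in the sketch below. The function names `effective_df` and `bic` are illustrative, and the penalized log-likelihood value is assumed to be supplied by the caller.

```python
import numpy as np

def effective_df(N, K, m, sigma2, lam, alpha):
    """df(alpha) = tr{ N (N^T H N + alpha*sigma2/(1+lam^2) K)^{-1} N^T H }."""
    h = 1.0 / m                                   # diagonal of H
    NtH = N.T * h
    A = NtH @ N + alpha * sigma2 / (1.0 + lam**2) * K
    return np.trace(N @ np.linalg.solve(A, NtH))

def bic(penalized_loglik, p, p_star, df_alpha, n):
    """BIC(alpha) = -2 l_p + p(alpha) log n, with p(alpha) = p + p* + 2 + df(alpha)."""
    return -2.0 * penalized_loglik + (p + p_star + 2 + df_alpha) * np.log(n)
```

When $N$ has full column rank, `alpha = 0` gives `df` equal to q (no smoothing), and `df` decreases as α grows, which is the behaviour one would expect from a smoothing parameter.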

Notes on implementation

The iterations of the above algorithm are repeated until a suitable convergence rule is satisfied, e.g. $‖ θ^{(k + 1)} - θ^{(k)} ‖$ is sufficiently small, say $10^{- 6}$ . A set of reasonable starting values may be obtained by computing ${\hat{β}}^{(0)}$ and ${\hat{σ^{2}}}^{(0)}$ as the solution of the least-squares regression of $y$ on $X$ . Then ${\hat{f}}^{(0)} = (N^{⊤} N + α {\hat{σ^{2}}}^{(0)} K)^{- 1} N^{⊤} (y - X {\hat{β}}^{(0)})$ , and ${\hat{λ}}^{(0)}$ can be taken as the sample skewness coefficient of $y - X {\hat{β}}^{(0)} - N {\hat{f}}^{(0)}$ . The value of ${\hat{ρ}}^{(0)}$ can be the value $ρ_{0}$ such that $m_{i} (ρ_{0}, z_{i}) = 1$ for all $i = 1, \dots, n$ (homoscedasticity of variance).

4.2. Goodness-of-fit

The Mahalanobis distance $d_{i}^{2} = (Y_{i} - x_{i}^{⊤} β - n_{i}^{⊤} f)^{2} / σ_{i}^{2}, i = 1, \dots, n,$ is extremely useful in testing the goodness-of-fit and in detecting outliers. According to [33], it can be shown that the distribution of $d_{i}^{2}$ is the same as under the normal distribution. So, under the SN distribution, $d_{i}^{2} = (Y_{i} - x_{i}^{⊤} β - n_{i}^{⊤} f)^{2} / σ_{i}^{2} \sim χ_{1}^{2}$ . This result is useful because it allows the fitted models to be evaluated in practice. By substituting the maximum likelihood estimates of $β, f$ and $σ^{2}$ into the Mahalanobis distance $d_{i}^{2}$ , we can evaluate the fit of the models by constructing quantile-quantile plots with simulated confidence bands of $100 γ \%$ , $0 < γ < 1$ [1]. In addition, by plotting the Mahalanobis distance and considering as a benchmark the quantile ν of the distribution of the quadratic form $d_{i}^{2}$ , we can identify outliers. For instance, for the skew-normal case, we have that $ν = χ_{1}^{2} (ε),$ the quantile of order ε of the $χ_{1}^{2}$ distribution, where $0 < ε < 1$ .
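The claim that $d_{i}^{2}$ follows a $χ_{1}^{2}$ distribution regardless of λ can be checked by simulation. The sketch below draws standardized skew-normal errors through representation (3) and compares the squared values with two $χ_{1}^{2}$ reference quantities (mean 1 and the 0.95 quantile); the seed, sample size and value of λ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)              # arbitrary seed
lam, n = 4.0, 200_000
delta = lam / np.sqrt(1.0 + lam**2)

# Z ~ SN(0, 1, lam) via the stochastic representation (3)
z = (delta * np.abs(rng.standard_normal(n))
     + np.sqrt(1.0 - delta**2) * rng.standard_normal(n))
d2 = z**2                                    # Mahalanobis distance at the true parameters

nu = 1.959964**2                             # 0.95 quantile of chi^2_1 (squared normal quantile)
frac_flagged = np.mean(d2 > nu)              # share of points beyond the benchmark
print(d2.mean(), frac_flagged)
```

At the true parameters about 5% of the observations are expected to exceed the benchmark; with estimated parameters the agreement is only approximate, which is one reason simulated confidence bands are used in the quantile-quantile plots.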

5. Residuals

The residual analysis aims at identifying atypical observations and/or model misspecification, since residuals are measures of agreement between the data and the fitted model. Under the heteroscedastic PLM-SN model, we define the following standardized residual

\begin{aligned} e_{i}^{1} = \frac{y_{i} - x_{i}^{⊤} \hat{β} - n_{i}^{⊤} \hat{f}}{\sqrt{\hat{σ^{2}} m_{i} (\hat{ρ}, z_{i})}}, i = 1, \dots, n, \end{aligned}

(17)

where $\hat{β}, \hat{f}, \hat{σ^{2}}$ and $\hat{ρ}$ denote the MPLE of $β, f$ , $σ^{2}$ and ρ, respectively, from the ECM-algorithm described in Section 4.1. Note that when $ρ = 0,$ under the homoscedastic PLM-SN model, we get the following standardized residual

\begin{aligned} e_{i}^{0} = \frac{y_{i} - x_{i}^{⊤} \hat{β} - n_{i}^{⊤} \hat{f}}{\sqrt{\hat{σ^{2}}}}, \end{aligned}

(18)

where $\hat{β}, \hat{f}$ and $\hat{σ^{2}}$ denote the MPLE of $β, f$ and $σ^{2}$ , respectively, from the EM-algorithm described in Section 4 of Ferreira and Paula [12].

Based on the residuals $e_{i}^{1}$ and $e_{i}^{0}$ , we can detect incorrect specification of the error distribution as well as the presence of outlying observations.
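Both residuals share the same numerator and differ only in the scale, as the following sketch shows. The function name `standardized_residuals` is illustrative; passing `m=None` corresponds to the homoscedastic residual in (18).

```python
import numpy as np

def standardized_residuals(y, X, N, beta, f, sigma2, m=None):
    """e_i^1 from (17); with m=None (all m_i = 1) this is e_i^0 from (18)."""
    raw = y - X @ beta - N @ f
    scale = sigma2 if m is None else sigma2 * np.asarray(m, dtype=float)
    return raw / np.sqrt(scale)
```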

6. Influence diagnostics

Cook [6] proposed a unified approach for the assessment of local influence in minor perturbations of a statistical model, which can be viewed as a generalization of the robustness concept for studying and detecting influential subsets of data. Following [12], a direct application of this approach involves extensive algebraic manipulation for the HPLM-SN model. In this article, we will apply the general approach of Zhu and Lee [37] to achieve diagnostic measures for local influence analysis.

6.1. Description of the local influence approach

The general approach developed by Zhu and Lee [37] for local influence analysis of general statistical models with missing data will be utilized to obtain diagnostic measures for the HPLM-SN model. For completeness and to introduce notation, this approach is briefly outlined here. Consider a perturbation vector $ω = (ω_{1}, \dots, ω_{g})^{⊤}$ varying in an open region $Ω \subseteq R^{g}$ , and the following perturbed statistical model $M = \{ f (y_{c}, θ, ω) : ω \in Ω \}$ , where $f (y_{c}, θ, ω)$ is the probability density function for the complete-data, $y_{c}$ , perturbed by ω and $ℓ_{c_{p}} (θ, ω | y_{c}) = \log f (y_{c}, θ, ω),$ its corresponding complete penalized log-likelihood function. We assume there is a $ω_{0}$ such that $ℓ_{c_{p}} (θ, ω_{0} | y_{c}) = ℓ_{c_{p}} (θ | y_{c})$ for all $θ$ . Let $\hat{θ} (ω)$ be the maximum of the function $Q_{p} (θ, ω | \hat{θ}) = E [ℓ_{c_{p}} (θ, ω | y_{c}) | y, \hat{θ}]$ . Then, the influence graph is defined as $α (ω) = (ω^{⊤}, f_{Q} (ω))^{⊤}$ , where $f_{Q} (ω)$ is the Q-displacement function defined as follows: $f_{Q} (ω) = 2 [Q_{p} (\hat{θ} | \hat{θ}) - Q_{p} (\hat{θ} (ω) | \hat{θ})] .$ Following the approach developed by [6,37], the normal curvature $C_{f_{Q}, v}$ of $α (ω)$ at $ω_{0}$ in the direction of some unit vector $v$ is used to summarize the local behavior of the Q-displacement function. It can be shown that $C_{f_{Q}, v} = - 2 v^{⊤} {\ddot{Q}}_{ω_{0}} v$ and $- {\ddot{Q}}_{ω_{0}} = Δ_{ω_{0}}^{⊤} \{ - \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial θ \partial θ^{⊤}} |_{θ = \hat{θ}} \}^{- 1} Δ_{ω_{0}},$ where $Δ_{ω_{0}} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial θ \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}} .$ As in [6], the symmetric matrix $- 2 {\ddot{Q}}_{ω_{0}}$ is fundamental for detecting influential observations, and its spectral decomposition is given by $- 2 {\ddot{Q}}_{ω_{0}} = \sum_{k = 1}^{g} ζ_{k} ε_{k} ε_{k}^{⊤},$ where $\{ (ζ_{k}, ε_{k}), k = 1, \dots, g \}$ are eigenvalue–eigenvector pairs of $- 2 {\ddot{Q}}_{ω_{0}}$ with $ζ_{1} \geq \dots \geq ζ_{r} > ζ_{r + 1} = \dots = 0$ and orthonormal eigenvectors $\{ ε_{k}, k = 1, \dots, g \}$ . Lesaffre and Verbeke [25] and Zhu and Lee [37] proposed inspecting all eigenvectors corresponding to non-zero eigenvalues for more revealing information. Based on Zhu and Lee [37], we consider the following aggregated contribution vector of all eigenvectors corresponding to non-zero eigenvalues. Let ${\tilde{ζ}}_{k} = ζ_{k} / (ζ_{1} + \dots + ζ_{r}),$ $ε_{k}^{2} = (ε_{k 1}^{2}, \dots, ε_{k g}^{2})^{⊤}$ and $M (0) = \sum_{k = 1}^{r} {\tilde{ζ}}_{k} ε_{k}^{2} .$ The jth component of $M (0)$ , $M (0)_{j}$ , is equal to $\sum_{k = 1}^{r} {\tilde{ζ}}_{k} ε_{k j}^{2}$ . The evaluation of influential cases is based on the visual inspection of the $\{ M (0)_{j}, j = 1, \dots, g \}$ plotted against the index j. The jth case may be regarded as influential if $M (0)_{j}$ is larger than a reference benchmark.

The inconvenience involved in the use of the normal curvature lies in deciding the influence of the observations, since $C_{f_{Q}, v} (θ)$ may assume any value and is not invariant under a uniform change of scale. Zhu and Lee [37] considered the following conformal normal curvature $B_{f_{Q}, v} (θ) = C_{f_{Q}, v} (θ) / tr [- 2 {\ddot{Q}}_{ω_{0}}]$ , which has the interesting property $0 \leq B_{f_{Q}, v} (θ) \leq 1$ , for any unit direction $v$ . Now, let $v_{j}$ be a basic perturbation vector with the jth entry 1 and zero elsewhere. Zhu and Lee [37] showed that for all j, $M (0)_{j} = B_{f_{Q}, v_{j}}$ . Hence, $M (0)_{j}$ can be obtained by $B_{f_{Q}, v_{j}} .$ The computation of $B_{f_{Q}, v_{j}}$ is very simple. We refer the reader to Zhu and Lee [37] for other theoretical properties of $B_{f_{Q}, v_{j}}$ , such as invariance under reparameterizations of $θ$ . Lee and Xu [24] propose to use $1 / g + c^{*} SM (0)$ as a benchmark for establishing the jth case as influential, where $1 / g$ is the mean of the $M (0)_{j}$ , $c^{*}$ is a selected constant that may be chosen suitably and $SM (0)$ is the standard deviation of $\{ M (0)_{j}, j = 1, \dots, g \}$ . In this paper, we consider $c^{*} = 3$ unless otherwise indicated.
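Given the symmetric matrix $- 2 {\ddot{Q}}_{ω_{0}}$ , the aggregated contributions $M (0)_{j}$ and a benchmark can be computed as sketched below. The function names, the tolerance used to decide which eigenvalues count as non-zero, and the use of the population standard deviation are assumptions made for illustration.

```python
import numpy as np

def aggregated_contributions(A, tol=1e-10):
    """M(0) from the symmetric nonnegative definite matrix A = -2 * Qddot_{omega_0}."""
    zeta, eps = np.linalg.eigh(A)            # columns of eps are orthonormal eigenvectors
    keep = zeta > tol * zeta.max()           # retain numerically non-zero eigenvalues
    zeta_tilde = zeta[keep] / zeta[keep].sum()
    return (eps[:, keep] ** 2) @ zeta_tilde  # M(0)_j = sum_k zeta_tilde_k * eps_kj^2

def benchmark(M0, c_star=3.0):
    """Mean of M(0)_j plus c* times its standard deviation (ddof=0 assumed)."""
    return M0.mean() + c_star * M0.std()
```

Because the components of $M (0)$ sum to one, their mean equals one over the number of components. The identity $M (0)_{j} = B_{f_{Q}, v_{j}}$ also gives a shortcut that avoids the eigendecomposition: `np.diag(A) / np.trace(A)` yields the same vector.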

In the following sections, we derive the Hessian matrix for the proposed HPLM-SN model, including a brief discussion on the perturbation schemes employed for our development.

6.2. The Hessian matrix ${\ddot{Q}}_{θ} (\hat{θ})$

To obtain the diagnostic measures for the local influence of a particular perturbation scheme, it is necessary to compute ${\ddot{Q}}_{θ} (\hat{θ}) = \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial θ \partial θ^{⊤}} |_{θ = \hat{θ}},$ where $θ = (β^{⊤}, σ^{2}, λ, ρ^{⊤}, f^{⊤})^{⊤}$ . Hence, the Hessian matrix has elements given by

\begin{aligned} \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial β \partial β^{⊤}} & = - \frac{(1 + λ^{2})}{σ^{2}} X^{⊤} H X, \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial β \partial σ^{2}} & = - \frac{1}{σ^{4}} X^{⊤} H [(1 + λ^{2}) (y - μ) - λ \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial β \partial λ} & = \frac{1}{σ^{2}} X^{⊤} H [2 λ (y - μ) - \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial ρ \partial β^{⊤}} & = {\dot{H}}^{⊤} [\frac{(1 + λ^{2})}{σ^{2}} D (y - μ) - \frac{λ}{σ^{2}} D (\hat{w})] X, \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial f \partial β^{⊤}} & = - \frac{1 + λ^{2}}{σ^{2}} N^{⊤} H X, \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial ρ \partial σ^{2}} & = {\dot{H}}^{⊤} [\frac{1}{2 σ^{4}} D (\hat{w^{2}}) 1_{n} - \frac{λ}{σ^{4}} D (\hat{w}) (y - μ)], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial σ^{2} \partial σ^{2}} & = \frac{n}{σ^{4}} - \frac{1}{σ^{6}} [{\hat{w^{2}}}^{⊤} H 1_{n} - 2 λ (y - μ)^{⊤} H \hat{w} + (1 + λ^{2}) S (β, f)], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial σ^{2} \partial λ} & = \frac{1}{σ^{4}} [λ S (β, f) - (y - μ)^{⊤} H \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial f \partial σ^{2}} & = - \frac{1}{σ^{4}} N^{⊤} H [(1 + λ^{2}) (y - μ) - λ \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial λ^{2}} & = - \frac{1}{σ^{2}} S (β, f), \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial f \partial λ} & = \frac{1}{σ^{2}} N^{⊤} H [2 λ (y - μ) - \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial ρ \partial λ} & = - \frac{1}{σ^{2}} {\dot{H}}^{⊤} D (y - μ) [λ (y - μ) - \hat{w}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial ρ \partial ρ^{⊤}} & = \sum_{i = 1}^{n} {\dot{M}}_{i} - \frac{1}{2 σ^{2}} \sum_{i = 1}^{n} {\ddot{H}}_{i} [(1 + λ^{2}) (y_{i} - μ_{i})^{2} - 2 λ {\hat{w}}_{i} (y_{i} - μ_{i}) + {\hat{w^{2}}}_{i}], \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial ρ \partial f^{⊤}} & = \frac{1}{σ^{2}} {\dot{H}}^{⊤} D ((1 + λ^{2}) (y - μ) - λ \hat{w}) N, \\ \frac{\partial^{2} Q_{p} (θ | \hat{θ})}{\partial f \partial f^{⊤}} & = - \frac{(1 + λ^{2})}{σ^{2}} N^{⊤} H N - α K, \end{aligned}

where $D (x)$ is the diagonal matrix of the vector $x,$ ${\dot{H}}^{⊤} = ({\dot{H}}_{1}^{⊤}, \dots, {\dot{H}}_{n}^{⊤}),$ with ${\dot{H}}_{i} = - \frac{1}{m_{i}^{2}} \frac{\partial m_{i}}{\partial ρ^{⊤}},$ ${\dot{M}}_{i} = [- \frac{1}{m_{i}} \frac{\partial^{2} m_{i}}{\partial ρ \partial ρ^{⊤}} + \frac{1}{m_{i}^{2}} \frac{\partial m_{i}}{\partial ρ} \frac{\partial m_{i}}{\partial ρ^{⊤}}]$ and ${\ddot{H}}_{i} = \frac{- 1}{m_{i}^{2}} \frac{\partial^{2} m_{i}}{\partial ρ \partial ρ^{⊤}} + \frac{2}{m_{i}^{3}} \frac{\partial m_{i}}{\partial ρ} \frac{\partial m_{i}}{\partial ρ^{⊤}}$ .
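
As an illustrative specialization (a worked example under an assumed structure, not part of the general derivation), take the log-linear form $m_{i} = \exp (ρ^{⊤} z_{i})$ discussed in Section 7. Then $\frac{\partial m_{i}}{\partial ρ} = m_{i} z_{i}$ and $\frac{\partial^{2} m_{i}}{\partial ρ \partial ρ^{⊤}} = m_{i} z_{i} z_{i}^{⊤}$ , so the three quantities above reduce to

\begin{aligned} {\dot{H}}_{i} & = - \frac{1}{m_{i}^{2}} m_{i} z_{i}^{⊤} = - \frac{1}{m_{i}} z_{i}^{⊤}, \\ {\dot{M}}_{i} & = - \frac{1}{m_{i}} m_{i} z_{i} z_{i}^{⊤} + \frac{1}{m_{i}^{2}} m_{i}^{2} z_{i} z_{i}^{⊤} = 0, \\ {\ddot{H}}_{i} & = - \frac{1}{m_{i}^{2}} m_{i} z_{i} z_{i}^{⊤} + \frac{2}{m_{i}^{3}} m_{i}^{2} z_{i} z_{i}^{⊤} = \frac{1}{m_{i}} z_{i} z_{i}^{⊤} . \end{aligned}

The vanishing of ${\dot{M}}_{i}$ reflects the fact that $\log m_{i}$ is linear in $ρ$ under this structure. The power product form gives the same expressions with each $z_{i j}$ replaced by $\log (z_{i j})$ .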

6.3. Perturbation schemes

In this section, we will consider four different perturbation schemes for the HPLM-SN model. For each perturbation scheme, one has the partitioned form $Δ_{ω_{0}} = (Δ_{β}^{⊤}, Δ_{σ^{2}}^{⊤}, Δ_{λ}^{⊤}, Δ_{ρ}^{⊤}, Δ_{f}^{⊤})^{⊤},$ where $Δ_{β} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial β \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}},$ $Δ_{σ^{2}} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial σ^{2} \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}},$ $Δ_{λ} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial λ \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}},$ $Δ_{ρ} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial ρ \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}}$ and $Δ_{f} = \frac{\partial^{2} Q_{p} (θ, ω | \hat{θ})}{\partial f \partial ω^{⊤}} |_{θ = \hat{θ}, ω = ω_{0}} .$

6.3.1. Case-weight perturbation

First, consider the following arbitrary allocation of weights for the expected value of the complete-data penalized log-likelihood function (perturbed Q-function), which may capture departures in general directions, given by

\begin{aligned} Q_{p} (θ, ω | \hat{θ}) = \sum_{i = 1}^{n} ω_{i} Q_{i} (θ | \hat{θ}) - \frac{α}{2} f^{⊤} K f, \end{aligned}

where the contribution of the ith experimental unit to the Q-function is $Q_{i} (θ | \hat{θ}) \propto - \log (σ^{2}) - \log (m_{i}) - \frac{1}{2 m_{i} σ^{2}} [(1 + λ^{2}) (y_{i} - μ_{i})^{2} - 2 λ {\hat{w}}_{i} (y_{i} - μ_{i}) + {\hat{w^{2}}}_{i}],$ with $ω = (ω_{1}, \dots, ω_{n})^{⊤}$ an $n \times 1$ vector, $0 \leq ω_{i} \leq 1,$ for $i = 1, \dots, n$ and $ω_{0} = (1, \dots, 1)^{⊤}$ . For this perturbation scheme, we find $Δ_{ω_{0}}$ with the following elements:

\begin{aligned} Δ_{β} & = \frac{1}{σ^{2}} X^{⊤} H D (- λ \hat{w} + (1 + λ^{2}) (y - μ)), \\ Δ_{σ^{2}} & = - \frac{1}{σ^{2}} 1_{n}^{⊤} + \frac{1}{2 σ^{4}} \{(y - μ)^{⊤} H [(1 + λ^{2}) D (y - μ) - 2 λ D (\hat{w})] + {\hat{w^{2}}}^{⊤} H\}, \\ Δ_{λ} & = \frac{1}{σ^{2}} (y - μ)^{⊤} H D (\hat{w} - λ (y - μ)), \\ Δ_{ρ} & = M^{⊤} - \frac{1}{2 σ^{2}} {\dot{H}}^{⊤} \{D (y - μ) [(1 + λ^{2}) D (y - μ) - 2 λ D (\hat{w})] + D (\hat{w^{2}})\}, \\ Δ_{f} & = \frac{1}{σ^{2}} N^{⊤} H D (- λ \hat{w} + (1 + λ^{2}) (y - μ)), \end{aligned}

where $M^{⊤} = (M_{1}^{⊤}, \dots, M_{n}^{⊤}),$ with $M_{i} = - \frac{1}{m_{i}} \frac{\partial m_{i}}{\partial ρ^{⊤}}$ .

6.3.2. Response variable perturbation

A perturbation of the response variables $y = (y_{1}, \dots, y_{n})^{⊤}$ is defined as $y_{ω} = y + S_{y} ω,$ where $S_{y}$ is the standard deviation of $y$ . In this case, $ω_{0} = 0 \in R^{n}$ and the perturbed Q-function is as in Equations (12)–(13), with $y$ replaced by $y_{ω}$ . It follows that the matrix $Δ_{ω_{0}}$ has the following elements:

\begin{aligned} Δ_{β} & = \frac{S_{y}}{σ^{2}} [(1 + λ^{2}) X^{⊤} H], \\ Δ_{σ^{2}} & = \frac{S_{y}}{σ^{4}} [(1 + λ^{2}) (y - μ)^{⊤} - λ {\hat{w}}^{⊤}] H, \\ Δ_{λ} & = \frac{S_{y}}{σ^{2}} [{\hat{w}}^{⊤} - 2 λ (y - μ)^{⊤}] H, \\ Δ_{ρ} & = \frac{S_{y}}{σ^{2}} {\dot{H}}^{⊤} [- (1 + λ^{2}) D (y - μ) + λ D (\hat{w})], \\ Δ_{f} & = \frac{S_{y}}{σ^{2}} [(1 + λ^{2}) N^{⊤} H] . \end{aligned}

6.3.3. Explanatory variable perturbation

In this section, we consider the influence that a perturbation of a specific continuous explanatory variable may have on the parameter estimates. A perturbation of the explanatory variable $x_{r}$ is defined as $x_{r ω} = x_{r} + S_{r} ω, r \in \{1, \dots, p\},$ where $S_{r}$ is the standard deviation of the explanatory variable $x_{r}$ . In this case, $ω_{0} = 0 \in R^{n}$ and the perturbed Q-function is as in Equations (12)–(13), with $X$ replaced by $X_{ω}$ . Consequently, the matrix $Δ_{ω_{0}}$ has the following elements:

\begin{aligned} Δ_{β} & = \frac{S_{r}}{σ^{2}} \{(1 + λ^{2}) [I_{r_{0}} (y - μ)^{⊤} - β_{r} X^{⊤}] - λ I_{r_{0}} {\hat{w}}^{⊤}\} H, \\ Δ_{σ^{2}} & = \frac{β_{r} S_{r}}{σ^{4}} [λ {\hat{w}}^{⊤} - (1 + λ^{2}) (y - μ)^{⊤}] H, \\ Δ_{λ} & = \frac{β_{r} S_{r}}{σ^{2}} [- {\hat{w}}^{⊤} + 2 λ (y - μ)^{⊤}] H, \\ Δ_{ρ} & = \frac{β_{r} S_{r}}{σ^{2}} {\dot{H}}^{⊤} [(1 + λ^{2}) D (y - μ) - λ D (\hat{w})], \\ Δ_{f} & = \frac{- β_{r} S_{r}}{σ^{2}} (1 + λ^{2}) N^{⊤} H, \end{aligned}

where $I_{r_{0}}$ denotes a $p \times 1$ vector of zeros with one in the rth position.

6.3.4. Perturbation of the skewness parameter

Consider the perturbed model $Y_{i} = x_{i}^{⊤} β + f (t_{i}) + ϵ_{i}, i = 1, \dots, n,$ with $ϵ_{i} \sim SN (0, σ_{i}^{2}, λ_{i}), σ_{i}^{2} = σ^{2} m_{i} (ρ, z_{i})$ and $λ_{i} = λ s_{i} (ω, z_{i}),$ where $s_{i} = s (ω, z_{i})$ is a known positive continuously differentiable function, $z_{i}$ contains values of explanatory variables, which in general, although not necessarily, constitute a subset of $x_{i}$ , and $ω : l \times 1$ is a perturbation vector. It is assumed that there is a $ω_{0}$ such that $s (ω_{0}, z_{i}) = 1$ , for all $i = 1, \dots, n$ . The perturbed Q-function is similar to Equations (12)–(13), with λ replaced by $λ_{i}$ . It follows that the matrix $Δ_{ω_{0}}$ has the following elements:

\begin{aligned} Δ_{β} & = \frac{λ}{σ^{2}} X^{⊤} H [2 λ D (y - μ) - D (\hat{w})], \\ Δ_{σ^{2}} & = \frac{λ}{σ^{4}} (y - μ)^{⊤} H [λ D (y - μ) - D (\hat{w})], \\ Δ_{λ} & = \frac{1}{σ^{2}} (y - μ)^{⊤} H [D (\hat{w}) - 2 λ D (y - μ)], \\ Δ_{ρ} & = - \frac{λ}{σ^{2}} {\dot{H}}^{⊤} D (y - μ) [D (y - μ) - D (\hat{w})], \\ Δ_{f} & = \frac{λ}{σ^{2}} N^{⊤} H [2 λ D (y - μ) - D (\hat{w})] . \end{aligned}

In the next section, for simplicity, we consider the diagnostics for the scale parameter in the HPLM-SN model. However, the method proposed here can be used to test for homogeneity of any parameter involved in the variance, as discussed by Xie et al. [34] and Zeller et al. [36].

7. Likelihood ratio test for homogeneity in the HPLM-SN model

The HPLM-SN model defined in Equations (5)–(6) assumes that the variance is not constant, with the scale parameter given by $σ_{i}^{2} = σ^{2} m_{i},$ where $m_{i} = m (ρ, z_{i})$ . If the variance depends on some explanatory variables $z_{i}$ , specific forms of $m_{i}$ are usually taken to model the varying dispersion: (i) the log-linear model $m_{i} (ρ, z_{i}) = \exp (\sum_{j = 1}^{p^{*}} ρ_{j} z_{i j})$ and (ii) the power product model $m_{i} (ρ, z_{i}) = \prod_{j = 1}^{p^{*}} z_{i j}^{ρ_{j}} = \exp (\sum_{j = 1}^{p^{*}} ρ_{j} \log (z_{i j}))$ ; see [11,22,34] and references therein for more details. Of course, (ii) requires that the $z_{i j}$ be strictly positive, while no such restriction is needed for (i). Furthermore, it is assumed that there is a unique value $ρ = ρ_{0}$ such that $m_{i} (ρ_{0}, z_{i}) = 1$ for all $i = 1, \dots, n$ ; then $σ_{i}^{2} = σ^{2}$ and the $Y_{i}$ 's have constant variance. Hence the test for homogeneity of the scale parameter is equivalent to testing the following hypotheses

\begin{aligned} H_{0} : ρ = ρ_{0} \quad \text{vs} \quad H_{1} : ρ \neq ρ_{0} . \end{aligned}

In this article, we use a likelihood ratio (LR) test statistic to check $H_{0},$ where $L R = 2 (ℓ_{p} (\hat{θ}, \hat{α}) - ℓ_{p} (\tilde{θ}, \tilde{α})),$ and $(\hat{θ}, \hat{α})$ and $(\tilde{θ}, \tilde{α})$ are the unrestricted and restricted (under $H_{0}$ ) ML estimators of $(θ, α)$ , respectively, so that the statistic is non-negative. When $H_{0}$ is true, the statistic LR is asymptotically distributed as $χ_{p^{*}}^{2} ,$ where $p^{*}$ is the dimension of $ρ$ .

8. Simulation studies

In this section, we present four simulation studies. The first study evaluates the performance of the ML estimates of the HPLM-SN model parameters obtained from the ECM algorithm. In the second and third studies, the performance of the asymptotic distribution and the power of the LR test statistic are examined. The fourth simulation study also evaluates the performance of the proposed test by providing evidence regarding its behavior when the underlying structure function is misspecified. In all simulation studies, we used the method of Eilers and Marx [10] to construct the matrices N and K, with the number of knots being 14 and 12 for scenarios 1 and 2, described in Section 8.1, respectively.

8.1. Study I: parameter recovery

In this subsection, we consider two simulation scenarios in order to verify whether the proposed estimation method recovers the true parameter values accurately. This is the first step to ensure that the estimation procedure works satisfactorily. We fit the HPLM-SN model defined in Section 3.2 to data that were artificially generated from model (5)–(6), where $f (t_{i}) = \cos (t_{i})$ , $t_{i} \in (- 3 π, 3 π)$ (scenario 1) or $f (t_{i}) = \cos (4 π t_{i}) \exp (- t_{i}^{2} / 2)$ , $t_{i} \in (0.6, 1.6)$ (Doppler effect, scenario 2), with equidistant values of $t_{i}$ in each interval, $x_{i} \sim U (0.2, 2),$ $β = (β_{0}, β_{1})^{⊤} = (0, 2)^{⊤}$ , $σ^{2} = 0.1$ , $λ = 3$ and $ρ = 0.1,$ with $m_{i} = e^{ρ x_{i}}$ or $m_{i} = x_{i}^{ρ}$ , $i = 1, \dots, n$ .
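
The data-generating step for scenario 1 can be sketched as follows. This is only an illustration under stated assumptions: the skew-normal errors are drawn through the well-known stochastic representation $ϵ_{i} = σ_{i} (δ |Z_{0}| + \sqrt{1 - δ^{2}} Z_{1})$ with $δ = λ / \sqrt{1 + λ^{2}}$ and independent standard normal $Z_{0}, Z_{1}$ , the equidistant grid uses interval midpoints, and the function name `simulate_hplm_sn` and its arguments are hypothetical.

```python
import math
import random

def simulate_hplm_sn(n, beta=(0.0, 2.0), sigma2=0.1, lam=3.0, rho=0.1,
                     structure="log-linear", seed=0):
    """Draw (t, x, mu, y) tuples from scenario 1 with heteroscedastic SN errors."""
    rng = random.Random(seed)
    delta = lam / math.sqrt(1.0 + lam ** 2)
    sample = []
    for i in range(n):
        # Equidistant points inside (-3*pi, 3*pi); the midpoint grid is an assumption.
        t = -3.0 * math.pi + 6.0 * math.pi * (i + 0.5) / n
        x = rng.uniform(0.2, 2.0)
        # Dispersion structure: exp(rho * x) (log-linear) or x ** rho (power product).
        m = math.exp(rho * x) if structure == "log-linear" else x ** rho
        sigma_i = math.sqrt(sigma2 * m)
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        eps = sigma_i * (delta * abs(z0) + math.sqrt(1.0 - delta ** 2) * z1)
        mu = beta[0] + beta[1] * x + math.cos(t)
        sample.append((t, x, mu, mu + eps))
    return sample

sample = simulate_hplm_sn(2000)
mean_error = sum(y - mu for _, _, mu, y in sample) / len(sample)
```

With a positive skewness parameter the errors have a positive mean, so `mean_error` is expected to be clearly above zero even though the location parameter of the error distribution is zero.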

We generated 2000 samples from each scenario, for n = 200, 500 and 1000. The average values (mean) and the corresponding standard deviations (SD) of the estimates made by the ECM algorithm in all samples are presented in Tables 1–4. Moreover, these tables contain approximate standard errors (SE) calculated via the observed information matrix for $(β_{1}, σ^{2}, λ, ρ)$ ; see Appendix 1 for more details.

Table 2.

Mean and standard deviation (SD) estimates by the ECM algorithm based on 2000 samples from the HPLM-SN in scenario 2 and $m_{i} = e^{ρ x_{i}}$ . SE is the average of estimated standard errors.

		n = 200			n = 500			n = 1000
Parameter	True Value	Mean	SD	SE	Mean	SD	SE	Mean	SD	SE
$β_{1}$	2.0	1.942	0.948	0.559	1.928	0.554	0.415	1.994	0.382	0.309
$σ^{2}$	0.6	0.551	0.185	0.096	0.566	0.098	0.069	0.577	0.069	0.054
λ	3.0	3.314	0.953	0.471	3.107	0.470	0.335	3.058	0.307	0.244
ρ	4.6	4.693	0.246	0.126	4.651	0.128	0.091	4.631	0.086	0.069


Table 3.

Mean and standard deviation (SD) estimates by the ECM algorithm based on 2000 samples from the HPLM-SN in scenario 1 and $m_{i} = x_{i}^{ρ}$ . SE is the average of estimated standard errors.

		n = 200			n = 500			n = 1000
Parameter	True Value	Mean	SD	SE	Mean	SD	SE	Mean	SD	SE
$β_{1}$	2.0	1.951	0.183	0.285	1.982	0.062	0.059	1.993	0.043	0.042
$σ^{2}$	1.0	0.962	0.214	0.157	0.984	0.097	0.090	0.991	0.069	0.064
λ	3.0	4.445	2.883	2.558	3.223	0.627	0.558	3.082	0.375	0.358
ρ	1.6	1.822	0.234	0.211	1.672	0.120	0.115	1.623	0.079	0.081


Table 1.

Mean and standard deviation (SD) estimates by the ECM algorithm based on 2000 samples from the HPLM-SN in scenario 1 and $m_{i} = e^{ρ x_{i}}$ . SE is the average of estimated standard errors.

		n = 200			n = 500			n = 1000
Parameter	True Value	Mean	SD	SE	Mean	SD	SE	Mean	SD	SE
$β_{1}$	2.0	1.816	1.071	0.777	1.882	0.544	0.478	1.963	0.390	0.348
$σ^{2}$	0.6	0.483	0.180	0.109	0.545	0.093	0.077	0.569	0.067	0.061
λ	3.0	3.572	1.443	0.782	3.156	0.471	0.471	3.075	0.311	0.276
ρ	4.6	4.792	0.260	0.176	4.674	0.125	0.125	4.640	0.086	0.079


Table 4.

Mean and standard deviation (SD) estimates by the ECM algorithm based on 2000 samples from the HPLM-SN in scenario 2 and $m_{i} = x_{i}^{ρ}$ . SE is the average of estimated standard errors.

		n = 200			n = 500			n = 1000
Parameter	True Value	Mean	SD	SE	Mean	SD	SE	Mean	SD	SE
$β_{1}$	2.0	1.956	0.190	0.257	1.980	0.065	0.066	1.989	0.042	0.044
$σ^{2}$	1.0	0.962	0.209	0.166	0.979	0.098	0.096	0.991	0.068	0.068
λ	3.0	4.393	2.871	2.594	3.212	0.626	0.601	3.085	0.376	0.378
ρ	1.6	1.811	0.248	0.231	1.675	0.129	0.131	1.639	0.085	0.086


Note that the point estimates approach the true values as the sample size increases in all the scenarios considered, although some bias remains for n = 200 (most visibly for λ under the power product structure). Thus, the results suggest that the proposed algorithm produces satisfactory estimates for moderate and large samples. In addition, the SD of the estimates and the SE become closer to each other and decrease as the sample size grows, which suggests that the standard errors computed from the observed information matrix are reliable.

Finally, in Figure 2, we plot the 2000 estimated functions $f (\cdot)$ of the nonparametric components for the two scenarios under the log-linear dispersion structure. Note that in both scenarios the proposed model performs very well. For the power product dispersion structure the results are similar, so the corresponding figure is omitted to save space. The variability among the estimates of the nonparametric function decreases as the sample size increases, and the mean estimates become closer to the true values, for both scenarios and both dispersion structures. This is an indication of the consistency of the nonparametric estimator.

8.2. Study II: the empirical distribution of the LR test statistic

In this subsection, the performance of the asymptotic distribution of the LR test statistic is examined following the procedure described in [13,34]. To this end, we compare the empirical distribution of the statistic with its theoretical asymptotic distribution via Monte Carlo simulation.

The design considered in this simulation study is the same as scenario 1 of Study I, but with $ρ = 0$ (under $H_{0}$ ), $β = (β_{0}, β_{1})^{⊤} = (0, 2)^{⊤}$ , $σ^{2} = 0.1$ , $λ = 3$ and n = 50, 100, 200, 300, 400, 500, 600 and 700. As suggested by Cook and Weisberg [8], the power function and the exponential function are usually employed in practice. Thus, we assume that $m_{i} = e^{ρ x_{i}}$ and $m_{i} = x_{i}^{ρ}$ . Each simulated case was replicated 2000 times, with the values of the explanatory variable x kept fixed throughout the simulations. Under $H_{0}$ , it is expected that the LR test statistic follows a $χ_{1}^{2}$ distribution. Then, using the 2000 values of the LR statistic, we obtained the empirical distribution function (edf). Figure 3 shows comparisons between the edf of the LR statistic and the theoretical distribution of $χ_{1}^{2}$ for n = 100, 300 and 700. We can see that as n increases, the edfs become very close to the theoretical distribution for the model considered in our study.
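
The comparison in this study is graphical. As a complementary numerical summary (a swapped-in Kolmogorov-type distance, not a measure used in the study), one could compute the largest gap between the edf and the $χ_{1}^{2}$ distribution function, which equals $\mathrm{erf} (\sqrt{x / 2})$ . In the sketch below the "LR values" are stand-in draws (squares of standard normal variates), not statistics from fitted HPLM-SN models, and the function names are hypothetical.

```python
import math
import random

def chi2_1_cdf(x):
    """Distribution function of chi-square with 1 degree of freedom."""
    return math.erf(math.sqrt(x / 2.0)) if x > 0 else 0.0

def sup_distance(lr_values):
    """Largest absolute gap between the edf of lr_values and the chi-square(1) cdf."""
    xs = sorted(lr_values)
    n = len(xs)
    gap = 0.0
    for i, x in enumerate(xs):
        F = chi2_1_cdf(x)
        # The edf jumps from i/n to (i+1)/n at x, so both sides are checked.
        gap = max(gap, abs((i + 1) / n - F), abs(i / n - F))
    return gap

rng = random.Random(1)
lr_null = [rng.gauss(0.0, 1.0) ** 2 for _ in range(2000)]  # stand-in null draws
distance = sup_distance(lr_null)
```

A small distance indicates close agreement with the reference distribution; replacing `lr_null` by the simulated LR statistics would give one number per sample size.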

Figure 3. — Simulated comparisons between the empirical distributions of the LR statistic and the $χ_{1}^{2}$ distribution for n = 100 (a) and (d), n = 300 (b) and (e), n = 700 (c) and (f). First row: exponential function; second row: power function.

8.3. Study III: the empirical power of the LR test

To gain insight into the performance of the homogeneity LR test in the HPLM-SN model, we performed a simulation study and examined the power function for various values of ρ. For demonstration purposes, we use the same parameter settings as in the previous experiment, with $ρ = 0, 0.1, 0.2, 0.3, 0.4$ and 0.5. The sample sizes $n = 50, 100, 200, 300, 400, 500, 600$ and 700 were chosen to evaluate the behavior of the test for small and midsize samples. Each simulated case was replicated 2000 times, with the values of the explanatory variable x kept fixed throughout the simulations.

Tables 5 and 6 provide the empirical type I error probability (under the null hypothesis) and the empirical power of the LR test under alternative hypotheses with $α = 0.05$ (i.e. the proportion of times that the statistic exceeds the upper $5\%$ point of the reference $χ^{2}$ distribution). The choice of significance level is somewhat arbitrary; the standard level $α = 0.05$ was chosen to reflect the usual practice in statistical studies. From Tables 5 and 6, we find that the empirical type I error is close to the nominal level for $n \geq 300$ , whereas the test is clearly liberal for small samples (n = 50 and 100), and that the test is successful in detecting heteroscedasticity of the scale parameter for the model considered in our study. As expected, the performance of the test statistic improves with increasing n: as the sample size and ρ increase, the empirical power increases, approaching 1. As pointed out by Xie et al. [34], the score test statistic is not very sensitive to the functional form in the test for homogeneity of the variance parameter. This might also be true in our study, in the context of the LR test statistic, since the results in Tables 5 and 6 are quite similar.
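
For reference, the way such empirical rates are formed, and the Monte Carlo uncertainty attached to them, can be sketched as follows. The helper names are hypothetical, and the binomial standard error assumes independent replications.

```python
import math

CHI2_1_CRIT_05 = 3.841458820694124  # upper 5% point of the chi-square(1) distribution

def rejection_rate(lr_values, crit=CHI2_1_CRIT_05):
    """Proportion of LR statistics exceeding the critical value."""
    return sum(lr > crit for lr in lr_values) / len(lr_values)

def mc_standard_error(p, replications):
    """Binomial standard error of an empirical rate based on independent replications."""
    return math.sqrt(p * (1.0 - p) / replications)

se = mc_standard_error(0.05, 2000)           # about 0.0049
band = (0.05 - 2.0 * se, 0.05 + 2.0 * se)    # about (0.040, 0.060)
toy_rate = rejection_rate([0.1, 4.0, 5.0, 1.0])
```

Under these assumptions, an empirical type I error between roughly 0.040 and 0.060 lies within two Monte Carlo standard errors of the nominal 0.05 when 2000 replications are used.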

Table 5.

Empirical type I error probability (under $H_{0}$ ) and empirical power of the LR test under $H_{1},$ assuming $m_{i} = e^{ρ x_{i}}$ .

n	$ρ = 0.0$	$ρ = 0.2$	$ρ = 0.4$	$ρ = 0.6$	$ρ = 0.8$	$ρ = 1.0$
50	0.342	0.357	0.387	0.547	0.522	0.716
100	0.141	0.180	0.312	0.570	0.718	0.861
200	0.084	0.194	0.569	0.859	0.971	0.997
300	0.062	0.232	0.637	0.928	0.995	1.000
400	0.069	0.295	0.809	0.983	1.000	1.000
500	0.050	0.357	0.887	0.992	1.000	1.000
600	0.051	0.379	0.929	1.000	1.000	1.000
700	0.057	0.454	0.961	1.000	1.000	1.000


Table 6.

Empirical type I error probability (under $H_{0}$ ) and empirical power of the LR test under $H_{1},$ assuming $m_{i} = x_{i}^{ρ}$ .

n	$ρ = 0.0$	$ρ = 0.2$	$ρ = 0.4$	$ρ = 0.6$	$ρ = 0.8$	$ρ = 1.0$
50	0.378	0.377	0.413	0.561	0.528	0.685
100	0.136	0.184	0.304	0.556	0.684	0.814
200	0.083	0.199	0.544	0.832	0.965	0.992
300	0.058	0.221	0.620	0.921	0.994	1.000
400	0.056	0.336	0.862	0.988	1.000	1.000
500	0.055	0.367	0.909	1.000	1.000	1.000
600	0.055	0.377	0.909	1.000	1.000	1.000
700	0.060	0.427	0.949	1.000	1.000	1.000


8.4. Study IV: misspecification of the structure function

We report a simulation study to analyze the influence of misspecification of the structure function. The design considered in this simulation study is the same as scenario 1 of Study I, with $m_{i} = e^{ρ x_{i}}$ (case 1) and $m_{i} = x_{i}^{ρ}$ (case 2), but varying $σ^{2} \in \{0.1, 1\}$ and $ρ \in \{0.1, 1, 5\}$ . We generate 2000 Monte Carlo samples of size n = 1000 and compute the coverage rate (CR), given by the proportion of $95\%$ confidence intervals that contain the true parameter value, and the bias, given by the difference between the mean of the estimates and the true value of the parameter. In this context, we fit the following structure functions for $m_{i}$ : $m_{i} = 1$ (homoscedasticity), $m_{i} = e^{ρ x_{i}}$ and $m_{i} = x_{i}^{ρ}$ .

For the CR, we expect a value close to $95\%$ and for the bias a value close to 0. According to Tables 7 and 8, the true structure function generally meets these expectations in terms of CR and bias for the parameters considered. On the other hand, the other specifications of $m_{i}$ , in general, present a relatively large bias and a low CR in at least one parameter.
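
The two summaries can be computed as in the following sketch. It assumes Wald-type intervals of the form estimate ± 1.96 SE, which is one common choice and not necessarily the interval construction used for Tables 7 and 8; the function name and the toy numbers are illustrative only.

```python
def coverage_and_bias(estimates, std_errors, true_value, z=1.959963984540054):
    """Coverage rate (in %) of Wald intervals and bias of the point estimates."""
    hits = sum(est - z * se <= true_value <= est + z * se
               for est, se in zip(estimates, std_errors))
    cr = 100.0 * hits / len(estimates)
    bias = sum(estimates) / len(estimates) - true_value
    return cr, bias

# Four made-up replications for a parameter whose true value is 2.
cr, bias = coverage_and_bias([1.9, 2.1, 2.6, 2.0], [0.1, 0.1, 0.1, 0.1], true_value=2.0)
```

In this toy input three of the four intervals contain the true value, so the coverage rate is 75% and the bias is 0.15.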

Table 7.

Coverage rates (CR) at the nominal level of $5\%$ and bias for different fitted structure functions when the true structure is $m_{i} = e^{ρ x_{i}}$ (true values of the parameters are in parentheses).

	$m_{i} = 1$		$m_{i} = e^{ρ x_{i}}$		$m_{i} = x_{i}^{ρ}$
Parameter	CR	bias	CR	bias	CR	bias
$β (2)$	93.85	0.005	96.25	−0.061	96.49	−0.127
$σ^{2} (0.1)$	69.50	0.011	94.05	0.001	73.05	0.008
$λ (3)$	94.80	0.086	96.00	−0.035	96.09	−0.141
$ρ (0.1)$	–	–	92.25	0.016	86.27	0.009
$β (2)$	2.00	0.096	95.40	−0.055	93.93	−0.125
$σ^{2} (0.1)$	0.00	0.235	92.45	0.001	8.72	0.204
$λ (3)$	93.20	0.079	96.05	−0.017	96.09	−0.086
$ρ (1)$	–	–	95.10	0.006	44.86	−0.165
$β (2)$	0.00	5.965	87.00	−0.279	18.16	−1.253
$σ^{2} (0.1)$	0.00	263.417	91.85	0.010	18.41	36.800
$λ (3)$	27.35	0.940	96.15	−0.167	27.80	1.343
$ρ (5)$	–	–	94.30	−0.562	2.01	−1.350
$β (2)$	93.00	0.016	95.65	−0.075	95.64	−0.135
$σ^{2} (1)$	69.35	0.111	93.30	0.039	73.63	0.044
$λ (3)$	93.95	0.122	96.34	−0.014	95.99	−0.107
$ρ (0.1)$	–	–	91.75	0.021	87.02	0.016
$β (2)$	1.80	0.302	94.80	−0.091	94.49	−0.097
$σ^{2} (1)$	0.00	2.354	93.00	−0.056	20.46	1.761
$λ (3)$	93.30	0.112	95.25	−0.021	87.21	−0.374
$ρ (1)$	–	–	95.70	−0.004	44.43	−0.172
$β (2)$	0.00	18.438	73.20	−0.377	27.22	−3.129
$σ^{2} (1)$	0.00	2620.733	91.35	−0.144	27.07	337.696
$λ (3)$	27.15	0.963	95.65	6.413	14.17	38.826
$ρ (5)$	–	–	94.05	−0.695	1.32	−1.570


Table 8.

Coverage rates (CR) at the nominal level of $5\%$ and bias for different fitted structure functions when the true structure is $m_{i} = x_{i}^{ρ}$ (true values of the parameters are in parentheses).

	$m_{i} = 1$		$m_{i} = e^{ρ x_{i}}$		$m_{i} = x_{i}^{ρ}$
Parameter	CR	bias	CR	bias	CR	bias
$β (2)$	89.10	0.008	92.65	0.001	92.24	0.001
$σ^{2} (0.1)$	94.40	−0.001	72.45	−0.012	91.70	−0.001
$λ (3)$	95.75	0.080	94.10	0.052	93.90	0.050
$ρ (0.1)$	–	–	93.25	0.011	92.40	0.003
$β (2)$	1.20	0.058	92.50	0.005	92.30	0.001
$σ^{2} (0.1)$	83.35	0.007	0.00	−0.070	91.70	−0.001
$λ (3)$	92.70	0.085	92.75	0.089	92.90	0.043
$ρ (1)$	–	–	89.85	0.070	92.40	0.015
$β (2)$	0.00	0.094	89.60	0.011	93.10	0.001
$σ^{2} (0.1)$	15.25	0.026	0.00	−0.085	91.40	−0.001
$λ (3)$	89.75	0.086	93.20	0.096	92.55	0.037
$ρ (5)$	–	–	87.45	0.092	92.25	0.018
$β (2)$	91.20	0.019	92.75	−0.001	92.35	−0.001
$σ^{2} (1)$	93.85	−0.013	70.10	−0.130	91.10	−0.014
$λ (3)$	94.80	0.099	92.90	0.071	92.80	0.069
$ρ (0.1)$	–	–	91.60	0.013	91.95	0.005
$β (2)$	1.00	0.182	93.25	0.012	93.20	−0.003
$σ^{2} (1)$	79.45	0.078	0.00	−0.708	91.20	−0.016
$λ (3)$	92.65	0.106	92.20	0.125	93.10	0.081
$ρ (1)$	–	–	87.70	0.081	92.45	0.020
$β (2)$	0.00	0.293	90.30	0.028	91.20	−0.007
$σ^{2} (1)$	15.95	0.256	0.00	−0.854	90.70	−0.009
$λ (3)$	89.95	0.098	92.45	0.140	92.90	0.083
$ρ (5)$	–	–	84.45	0.102	90.55	0.032


9. Application

As suggested by the analysis presented in Section 2, we generalize the model (1) considering heteroscedastic errors, i.e. $ϵ_{i} \sim SN (0, σ_{i}^{2}, λ)$ , where $σ_{i}^{2} = σ^{2} m_{i} (ρ, z_{i})$ , with $z_{i} = \frac{\text{temperature}_{i} - a}{b - a}$ , $a = \min \{\text{temperature}\}$ and $b = \max \{\text{temperature}\}$ .

Following [34], to test the homogeneity of the scale parameter using the LR statistic given in Section 7, we first assume $m_{i} (ρ, z_{i}) = e^{ρ z_{i}}$ for simplicity. It is easily seen that when $ρ = 0,$ then $m_{i} = 1, \forall i$ . We obtain LR = 118.687 (p-value $\approx 0$ ), which indicates significant evidence of a varying scale parameter and consequently of heterogeneity in the ragweed data set. Alternatively, if we assume $m_{i} (ρ, z_{i}) = z_{i}^{ρ}$ , the null hypothesis is still $H_{0} : ρ = 0$ . By a similar computation, we get LR = 91.278 (p-value $\approx 0$ ). These results therefore lead to the same conclusion as under the exponential function above.
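
As a quick arithmetic check, the two LR values can be recovered from the maximized penalized log-likelihoods reported in Table 9, and the p-value follows from the $χ_{1}^{2}$ tail probability $P (χ_{1}^{2} > x) = \mathrm{erfc} (\sqrt{x / 2})$ . The sketch assumes that the Table 9 values are the ones entering the statistic; the function names are hypothetical.

```python
import math

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (unrestricted log-likelihood - restricted log-likelihood)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

def chi2_1_pvalue(lr):
    """Upper tail probability of the chi-square(1) distribution."""
    return math.erfc(math.sqrt(lr / 2.0))

loglik_homo = -730.822                          # homoscedastic fit (Table 9)
lr_exp = lr_statistic(-671.479, loglik_homo)    # log-linear dispersion
lr_pow = lr_statistic(-685.183, loglik_homo)    # power product dispersion
p_exp, p_pow = chi2_1_pvalue(lr_exp), chi2_1_pvalue(lr_pow)
```

Both statistics agree with the reported values up to rounding of the log-likelihoods, and both p-values are numerically indistinguishable from zero.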

The results of the fit in terms of log-likelihood and BIC are provided in Table 9. Looking at the BIC values, we see that the heteroscedastic PLM-SN models fit the data better than the homoscedastic PLM-SN model. In particular, the best fit was the heteroscedastic PLM-SN model assuming $m_{i} (ρ, z_{i}) = e^{ρ z_{i}}$ . Table 10 summarizes the MPLE results, including the effective degrees of freedom, for the fitted skew-normal partially linear models. Furthermore, we construct graphs of the standard residuals $e_{i}^{1}, i = 1, \dots, 335$ , defined in Section 5, under the HPLM-SN model (log-linear dispersion). Figures 4(a–c) show that the residuals $e_{i}^{1}$ of the HPLM-SN model (log-linear dispersion) no longer present any trend, suggesting that this model is more appropriate for the data set.
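
The BIC values in Table 9 can be approximately reproduced from the log-likelihoods and the effective degrees of freedom in Table 10 under an assumed definition, namely $BIC = - 2 ℓ_{p} + (q + d f (α)) \log n$ , where $q$ counts the parametric components (three regression coefficients, $σ^{2}$ and λ, plus ρ in the heteroscedastic case) and n = 335. This formula is an assumption made for illustration, since the BIC definition is not restated in this section; the function name is hypothetical.

```python
import math

def bic(loglik, n, n_parametric, df_smooth):
    """Assumed BIC: -2 * loglik + (parametric count + effective df) * log(n)."""
    return -2.0 * loglik + (n_parametric + df_smooth) * math.log(n)

n_obs = 335
bic_homo = bic(-730.822, n_obs, n_parametric=5, df_smooth=11.234)
bic_hetero = bic(-671.479, n_obs, n_parametric=6, df_smooth=16.500)
```

Under this assumed definition both values agree with the homoscedastic and log-linear entries of Table 9 to within rounding, which suggests that the effective degrees of freedom of the smooth term enter the penalty.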

Table 9.

Comparison of the maximized penalized log-likelihood and BIC for the various models fitted to the ragweed levels data. Best fit indicated by (*1).

		Heteroscedastic
	Homoscedastic	(log-linear dispersion)	(power product dispersion)
$ℓ_{p} (\hat{θ}, α)$	−730.822	−671.479	−685.183
BIC	1556.033	1473.77 (*1)	1490.496


Table 10.

MPLE results and approximate standard errors (SE) under skew-normal partially linear models fitted to the ragweed levels data.

	Homoscedastic		Heteroscedastic
			(log-linear dispersion)
Effect	Estimate	SE	Estimate	SE
Rain	1.456	0.382	0.627	0.234
Temperature	0.088	0.0183	−0.006	0.013
Wind	0.228	0.037	0.130	0.025
$σ^{2}$	9.760	1.362	0.601	0.093
λ	2.189	0.482	2.705	0.615
ρ	–	–	4.620	0.1450
α	211.486	–	96.057	–
$d f (α)$	11.234		16.500


Figure 4. — Residuals ( $e_{i}^{1}$ 's) of the heteroscedastic PLM-SN model (exponential function) fitted to the ragweed levels data: (a) plot of residuals, (b) scatter plot of residuals against *temperature* and (c) scatter plot of residuals against *wind speed*.

In order to detect possible outlying observations and to assess the goodness-of-fit of the models, we constructed quantile-quantile plots with simulated $95\%$ confidence bands [1] based on the Mahalanobis distance $d_{i}^{2}, i = 1, \dots, 335$ . In Figure 5(a), the HPLM-SN model (exponential function) does not present observations outside the confidence bands. We may notice from Figure 5(b) that the fit under heteroscedastic skew-normal errors (exponential function) seems to capture accurately the trend at the end of the season.

Figure 5. — Quantile-quantile plots for the Mahalanobis distance and the $95\%$ pointwise confidence bands for f (Days in season) from the HPLM-SN model (exponential function) fitted to the ragweed levels data.

Next, we identify influential cases for the data set using $M (0)$ . Figure 6 presents the index plots of M(0) for the proposed perturbation schemes. From Figure 6(a), it can be seen that cases 87, 94, 239 and 270 make an outstanding contribution to the log-likelihood function and may exert a high influence on the maximum-likelihood estimates. Case 87 seems to be the most influential on the ML estimators in the HPLM-SN model (exponential function) under the case-weight scheme. From Figure 6(b), observations 234, 276, 289, 293, 297, 323, 325 and 330 appear as possibly influential under response perturbation, which may indicate observations with a large influence on their own predicted values. Case 297 seems to be the most influential on the MPLE results in the HPLM-SN model (exponential function) under response variable perturbation. We now examine the effects of perturbing specific explanatory variables, namely temperature and wind speed. Figures 6(c–d) show the index plots for perturbation of temperature and wind speed, respectively. Under this perturbation scheme, the observations that stand out as influential for the response variable are also influential for the explanatory variable wind speed. In addition, cases 87, 239, 265, 270, 276, 299 and 315 are identified as influential under perturbation of temperature; among them, observations 87, 239 and 270 were also identified under the case-weight scheme. Finally, from Figure 6(e), observations 88, 96, 115, 276 and 315 appear as possibly influential under skewness perturbation, which may reveal the cases that are most influential, in the sense of the likelihood displacement, on the skewness structure and consequently on the λ estimate. Cases 276 and 315 seem to be the most influential on the MPLE results in the HPLM-SN model (exponential function) under skewness perturbation, and these observations were also identified as influential under perturbation of the explanatory variable temperature.

Observations 87 and 239 present low levels of ragweed (equal to the first quartile of the response variable values) and low temperature (below the median temperature). However, observation 270 presents a high level of ragweed and temperature (above the median of the values of the respective variables). Observation 297 presents a low level of ragweed and wind speed (below the median of the values of the respective variables). Furthermore, observation 276 presents a high level of ragweed and temperature (above the third quartile of the values of the respective variables), while observation 315 has the lowest level of ragweed and high temperature (above the median temperature). Case 276 is in the right tail of the response variable distribution.

10. Conclusions

In this paper, an extension of the skew-normal partially linear model is developed by considering the case where the error terms are independent and follow a skew-normal distribution in the presence of heteroscedasticity. Our proposed model generalizes the recent work of Ferreira and Paula [12]. We developed the maximum likelihood estimator of the parameters based on the ECM algorithm and obtained analytic expressions for the E- and M-steps, except for the heterogeneity parameter, which requires a CM-step in the algorithm. Local influence methods were implemented for the HPLM-SN model to evaluate the consequences of model perturbations under different perturbation schemes. In addition, we discussed the likelihood ratio test for homogeneity in the HPLM-SN model. To examine the performance and properties of the LR test, simulation studies under several situations were performed, including misspecification of the structure function. The HPLM-SN model was then applied to a real data set and compared with the homoscedastic version, showing the usefulness of the HPLM-SN model for fitting data sets with nonparametric components in which the responses are asymmetric and heteroscedastic. Finally, the model proposed in this paper can be extended to the context of generalized additive models (GAM) as an alternative that relaxes the assumption that the heteroscedasticity function is known. We thank the anonymous referee for valuable suggestions for future research. The authors hope to report these findings in a future paper.

Acknowledgments

We thank the Associate Editor and referees for their helpful comments and suggestions, leading to the improvement of the paper.

Appendices.

Appendix 1. Approximate standard errors.

In this appendix, we derive the observed information matrix associated with the parameter vector $θ$ . The observed information matrix will be used to calculate the standard errors of the estimate $\hat{θ}$ . Following the same procedure as Segal et al. [31], we derive the variance-covariance matrix of the $\hat{θ}$ from the inverse of the observed information matrix, which is obtained by treating the penalized likelihood function (8) as a usual likelihood. Given the HPLM-SN model in Equations (5)–(6), the corresponding penalized log-likelihood function of $θ = (β^{⊤}, f^{⊤}, σ^{2}, λ)^{⊤}$ is of the form $ℓ_{p} (θ) = \sum_{i = 1}^{n} ℓ_{p_{i}} (θ)$ , with $ℓ_{p_{i}} (θ) = \log 2 + ℓ_{1_{p_{i}}} (θ) + \log Φ (ℓ_{2_{i}} (θ))$ , where $ℓ_{1_{p_{i}}} (θ) = \log ϕ (y_{i}; μ_{i}, σ_{i}^{2}) - α / (2 n) f^{⊤} K f$ and $ℓ_{2_{i}} (θ) = λ (y_{i} - μ_{i}) / σ_{i}$ , with $μ_{i} = x_{i}^{⊤} β + n_{i}^{⊤} f$ and $σ_{i}^{2} = σ^{2} m_{i} (ρ, z_{i})$ . Thus, the observed information matrix for $θ$ can be written as $I_{θ θ} = - \sum_{i = 1}^{n} \frac{\partial^{2} ℓ_{p_{i}} (θ)}{\partial θ \partial θ^{⊤}} = I^{1} (θ) + I^{2} (θ),$ where $I^{1} (θ) = - \sum_{i = 1}^{n} \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial θ \partial θ^{⊤}}$ and $I^{2} (θ) = - \sum_{i = 1}^{n} [W_{Φ} (ℓ_{2_{i}} (θ)) \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial θ \partial θ^{⊤}} + W_{Φ}^{'} (ℓ_{2_{i}} (θ)) \frac{\partial ℓ_{2_{i}} (θ)}{\partial θ} \frac{\partial ℓ_{2_{i}} (θ)}{\partial θ^{⊤}}],$ with $W_{Φ}^{'} (x) = - W_{Φ} (x) (x + W_{Φ} (x))$ and $W_{Φ} (x) = ϕ (x) / Φ (x)$ . The first-order and second-order derivatives of $ℓ_{1_{p_{i}}} (θ)$ in relation to $θ$ can be calculated as follows:

\begin{aligned} \frac{\partial ℓ_{1_{p_{i}}} (θ)}{\partial β} = \frac{(y_{i} - μ_{i})}{σ^{2} m_{i}} x_{i}; \frac{\partial ℓ_{1_{p_{i}}} (θ)}{\partial f} = \frac{(y_{i} - μ_{i})}{σ^{2} m_{i}} n_{i} - \frac{α}{n} K f; \\ \frac{\partial ℓ_{1_{p_{i}}} (θ)}{\partial σ^{2}} = \frac{(d_{i} - 1)}{2 σ^{2}}; \frac{\partial ℓ_{1_{p_{i}}} (θ)}{\partial λ} = 0; \\ \frac{\partial ℓ_{1_{p_{i}}} (θ)}{\partial ρ} = \frac{(d_{i} - 1)}{2 m_{i}} \frac{\partial m_{i}}{\partial ρ}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial β \partial β^{⊤}} = - \frac{1}{σ^{2} m_{i}} x_{i} x_{i}^{⊤}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial f \partial β^{⊤}} = - \frac{1}{σ^{2} m_{i}} n_{i} x_{i}^{⊤}; \\ \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial σ^{2} \partial β} = - \frac{(y_{i} - μ_{i})}{σ^{4} m_{i}} x_{i}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial ρ \partial β^{⊤}} = - \frac{(y_{i} - μ_{i})}{σ^{2} m_{i}^{2}} \frac{\partial m_{i}}{\partial ρ} x_{i}^{⊤}; \\ \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial f \partial f^{⊤}} = - \frac{1}{σ^{2} m_{i}} n_{i} n_{i}^{⊤} - \frac{α}{n} K; \\ \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial σ^{2} \partial f} = - \frac{(y_{i} - μ_{i})}{σ^{4} m_{i}} n_{i}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial ρ \partial f^{⊤}} = - \frac{(y_{i} - μ_{i})}{σ^{2} m_{i}^{2}} \frac{\partial m_{i}}{\partial ρ} n_{i}^{⊤}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial σ^{4}} = \frac{1}{2 σ^{4}} - \frac{d_{i}}{σ^{4}}; \\ \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial σ^{2} \partial ρ} = - \frac{d_{i}}{2 σ^{2} m_{i}} \frac{\partial m_{i}}{\partial ρ}; \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial ρ \partial ρ^{⊤}} = \frac{1 - 2 d_{i}}{2 m_{i}^{2}} \frac{\partial m_{i}}{\partial ρ} \frac{\partial m_{i}}{\partial ρ^{⊤}} + \frac{d_{i} - 1}{2 m_{i}} \frac{\partial^{2} m_{i}}{\partial ρ \partial ρ^{⊤}}; \\ \frac{\partial^{2} ℓ_{1_{p_{i}}} (θ)}{\partial θ \partial λ} = 0, \end{aligned}

with $d_{i} = \frac{(y_{i} - μ_{i})^{2}}{σ^{2} m_{i}}$ . The first-order and second-order derivatives of $ℓ_{2_{i}} (θ)$ in relation to $θ$ are given by

\begin{aligned} \frac{\partial ℓ_{2_{i}} (θ)}{\partial β} = - \frac{λ}{σ m_{i}^{1 / 2}} x_{i}; \frac{\partial ℓ_{2_{i}} (θ)}{\partial f} = - \frac{λ}{σ m_{i}^{1 / 2}} n_{i}; \frac{\partial ℓ_{2_{i}} (θ)}{\partial σ^{2}} = - \frac{λ (y_{i} - μ_{i})}{2 σ^{3} m_{i}^{1 / 2}}; \\ \frac{\partial ℓ_{2_{i}} (θ)}{\partial λ} = \frac{(y_{i} - μ_{i})}{σ m_{i}^{1 / 2}}; \frac{\partial ℓ_{2_{i}} (θ)}{\partial ρ} = - \frac{λ}{2 σ} \frac{(y_{i} - μ_{i})}{m_{i}^{3 / 2}} \frac{\partial m_{i}}{\partial ρ}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial β \partial β^{⊤}} = \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial f \partial f^{⊤}} = \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial β \partial f^{⊤}} = 0; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial λ^{2}} = 0; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial σ^{2} \partial β} = \frac{λ}{2 σ^{3} m_{i}^{1 / 2}} x_{i}; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial σ^{2} \partial f} = \frac{λ}{2 σ^{3} m_{i}^{1 / 2}} n_{i}; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial λ \partial β} = - \frac{1}{σ m_{i}^{1 / 2}} x_{i}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial λ \partial f} = - \frac{1}{σ m_{i}^{1 / 2}} n_{i}; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial ρ \partial β^{⊤}} = \frac{λ}{2 σ m_{i}^{3 / 2}} \frac{\partial m_{i}}{\partial ρ} x_{i}^{⊤}; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial ρ \partial f^{⊤}} = \frac{λ}{2 σ m_{i}^{3 / 2}} \frac{\partial m_{i}}{\partial ρ} n_{i}^{⊤}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial σ^{4}} = \frac{3 λ}{4 σ^{5}} \frac{(y_{i} - μ_{i})}{m_{i}^{1 / 2}}; \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial σ^{2} \partial λ} = - \frac{1}{2 σ^{3}} \frac{(y_{i} - μ_{i})}{m_{i}^{1 / 2}}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial σ^{2} \partial ρ} = \frac{λ}{4 σ^{3}} \frac{(y_{i} - μ_{i})}{m_{i}^{3 / 2}} \frac{\partial m_{i}}{\partial ρ}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial λ \partial ρ} = - \frac{1}{2 σ} \frac{(y_{i} - μ_{i})}{m_{i}^{3 / 2}} \frac{\partial m_{i}}{\partial ρ}; \\ \frac{\partial^{2} ℓ_{2_{i}} (θ)}{\partial ρ \partial ρ^{⊤}} = - \frac{λ}{2 σ} \frac{(y_{i} - μ_{i})}{m_{i}^{3 / 2}} [- \frac{3}{2 m_{i}} \frac{\partial m_{i}}{\partial ρ} \frac{\partial m_{i}}{\partial ρ^{⊤}} + \frac{\partial^{2} m_{i}}{\partial ρ \partial ρ^{⊤}}] . \end{aligned}
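The first-order expressions in this appendix can be checked numerically. The R sketch below is an illustration only and is not taken from the original derivation. It assumes a single observation with a scalar mean $μ_{i}$ (standing in for $x_{i}^{⊤} β + n_{i}^{⊤} f$ with $x_{i} = 1$), a scalar $ρ$, and the illustrative variance function $m_{i} = \exp (ρ z_{i})$. The penalty term is ignored, and the names loglik_i, score_i and num_grad are hypothetical. It assembles the score from the formulas for $ℓ_{1_{p_{i}}}$ and $ℓ_{2_{i}}$ through the chain rule $\partial \log Φ (ℓ_{2_{i}}) / \partial θ = W_{Φ} (ℓ_{2_{i}}) \, \partial ℓ_{2_{i}} / \partial θ$ and compares it with central finite differences.

```r
# Per-observation log-likelihood without the penalty term.
# par = (mu, sigma2, rho, lambda); m = exp(rho * z) is an ASSUMED variance function.
loglik_i <- function(par, y, z) {
  mu <- par[1]; s2 <- par[2]; rho <- par[3]; lam <- par[4]
  sd_i <- sqrt(s2 * exp(rho * z))
  log(2) + dnorm(y, mu, sd_i, log = TRUE) +
    pnorm(lam * (y - mu) / sd_i, log.p = TRUE)
}

# Analytic score built from the Appendix 1 first-order expressions.
score_i <- function(par, y, z) {
  mu <- par[1]; s2 <- par[2]; rho <- par[3]; lam <- par[4]
  m  <- exp(rho * z)
  dm <- z * m                          # d m / d rho under the assumed form
  r  <- y - mu
  s  <- sqrt(s2)
  d  <- r^2 / (s2 * m)                 # d_i in the text
  l2 <- lam * r / (s * sqrt(m))        # l_{2i}
  W  <- exp(dnorm(l2, log = TRUE) - pnorm(l2, log.p = TRUE))  # W_Phi(l_{2i})
  c(mu     = r / (s2 * m)           + W * (-lam / (s * sqrt(m))),
    sigma2 = (d - 1) / (2 * s2)     + W * (-lam * r / (2 * s^3 * sqrt(m))),
    rho    = (d - 1) / (2 * m) * dm + W * (-lam * r / (2 * s * m^1.5) * dm),
    lambda =                          W * (r / (s * sqrt(m))))
}

# Central finite-difference gradient, used only as a reference.
num_grad <- function(f, par, ..., eps = 1e-6) {
  sapply(seq_along(par), function(k) {
    e <- replace(numeric(length(par)), k, eps)
    (f(par + e, ...) - f(par - e, ...)) / (2 * eps)
  })
}

par0     <- c(0.5, 2, 0.3, 1.5)        # hypothetical (mu, sigma2, rho, lambda)
analytic <- score_i(par0, y = 1.2, z = 0.8)
numeric_ <- num_grad(loglik_i, par0, y = 1.2, z = 0.8)
print(max(abs(analytic - numeric_)))   # should be close to zero

# Check of W'_Phi(x) = -W_Phi(x) * (x + W_Phi(x)) at an arbitrary point.
W_phi      <- function(x) dnorm(x) / pnorm(x)
x0         <- 0.7
dW_num     <- (W_phi(x0 + 1e-6) - W_phi(x0 - 1e-6)) / 2e-6
dW_formula <- -W_phi(x0) * (x0 + W_phi(x0))
print(abs(dW_num - dW_formula))        # should be close to zero
```

The same finite-difference comparison could be applied to the second-order expressions before inverting the observed information matrix to obtain standard errors.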

Appendix 2. Calculus of the matrices $N$ and $K$ .

Eilers and Marx's method

Let $t = (t_{1}, \dots, t_{n})$ and let ndx be the number of equal-width intervals into which the range of $t$ is divided, which determines the equidistant knots desired by the user.

Commands in R [29]:

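    require(splines)

    # B-spline basis of degree bdeg on ndx equal-width intervals over range(x)
    bspline = function(x, ndx, bdeg) {
      xl = min(x)
      xr = max(x)
      dx = (xr - xl) / ndx
      knots = seq(xl - bdeg * dx, xr + bdeg * dx, by = dx)
      B = splineDesign(knots, x, bdeg + 1, 0 * x, outer.ok = TRUE)
      B  # return the basis matrix explicitly
    }

    B = bspline(t, ndx, 3)  # cubic B-spline basis evaluated at t

    # second-order difference matrix
    D = diag(ncol(B))
    for (k in 1:2) D = diff(D)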

Thus, $N = B = \mathrm{bspline} (t, \mathrm{ndx}, 3)$ and $K = D^{⊤} D$ .
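As a minimal sketch of how these two matrices behave, the R code below builds $N$ and $K$ for simulated values of $t$ and verifies that the quadratic form $f^{⊤} K f$ equals the sum of squared second differences of the coefficients. The simulated covariate, the seed and the choice ndx = 10 are assumptions made purely for illustration. `diff(..., differences = 2)` is used as an equivalent of applying `diff` twice.

```r
library(splines)  # ships with base R

# Same construction as in the commands above.
bspline <- function(x, ndx, bdeg) {
  xl <- min(x)
  xr <- max(x)
  dx <- (xr - xl) / ndx
  knots <- seq(xl - bdeg * dx, xr + bdeg * dx, by = dx)
  splineDesign(knots, x, ord = bdeg + 1, derivs = 0 * x, outer.ok = TRUE)
}

set.seed(1)
t   <- sort(runif(50))   # hypothetical covariate values
ndx <- 10                # hypothetical number of intervals

N <- bspline(t, ndx, 3)                     # n x (ndx + 3) basis matrix
D <- diff(diag(ncol(N)), differences = 2)   # second-order difference matrix
K <- crossprod(D)                           # K = t(D) %*% D

# The penalty f' K f is the sum of squared second differences of f.
f        <- rnorm(ncol(N))
pen_quad <- drop(t(f) %*% K %*% f)
pen_diff <- sum(diff(f, differences = 2)^2)
print(c(pen_quad, pen_diff))

# Coefficients lying on a straight line are not penalized.
f_lin <- seq_len(ncol(N))
print(max(abs(K %*% f_lin)))
```

With a cubic basis on ndx intervals, $N$ has ndx + 3 columns, so $K$ is an (ndx + 3) × (ndx + 3) matrix.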

Green & Silverman's method

Let $t_{1}^{0}, \dots, t_{q}^{0}$ be the $q$ distinct and ordered values of $t_{i}$ , $i = 1, \dots, n$ , and let $N$ be a $(n - k) \times q$ incidence matrix whose $(i, j)$th element equals the indicator function $I (t_{i} = t_{j}^{0})$ , $j = 1, \dots, q$ . In addition, define $h_{i} = t_{i + 1}^{0} - t_{i}^{0}$ , for $i = 1, \dots, q - 1$ , and let $S$ be the $q \times (q - 2)$ matrix with entries $s_{i j}$ , for $i = 1, \dots, q$ and $j = 2, \dots, q - 1$ , given by

\begin{aligned} s_{j - 1, j} = h_{j - 1}^{- 1}, \quad s_{j j} = - h_{j - 1}^{- 1} - h_{j}^{- 1} \quad \text{and} \quad s_{j + 1, j} = h_{j}^{- 1}, \quad \text{for } j = 2, \dots, q - 1, \end{aligned}

and $s_{i, j} = 0$ , for $| i - j | \geq 2$ . Now, consider $R$ as being a $(q - 2) \times (q - 2)$ matrix with elements $r_{i j}$ given by

\begin{aligned} \begin{matrix} r_{i i} = (h_{i - 1} + h_{i}) / 3, \quad \text{for } i = 2, \dots, q - 1, \\ r_{i, i + 1} = r_{i + 1, i} = h_{i} / 6, \quad \text{for } i = 2, \dots, q - 2, \end{matrix} \end{aligned}

and $r_{i j} = 0$ , for $| i - j | \geq 2$ . Then, $K = S R^{- 1} S^{⊤}$ .
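The construction can be sketched in R as follows. This is an illustration under stated assumptions rather than the authors' implementation: the toy vector of $t$ values is hypothetical, $N$ is built with one row per observation, and the loops require $q \geq 4$. Rows and columns of $S$ and $R$ indexed by $2, \dots, q - 1$ in the text are stored at positions $1, \dots, q - 2$.

```r
# Hypothetical toy covariate with ties (illustrative values only).
t  <- c(0.1, 0.4, 0.4, 0.7, 1.0, 1.0, 1.6)
t0 <- sort(unique(t))   # distinct ordered values t_1^0 < ... < t_q^0
q  <- length(t0)
n  <- length(t)

# Incidence matrix: N[i, j] = I(t_i == t_j^0), one row per observation (assumed).
N <- outer(t, t0, "==") * 1

h <- diff(t0)           # h_i = t_{i+1}^0 - t_i^0, i = 1, ..., q - 1

# S is q x (q - 2); text column j = 2, ..., q - 1 is stored in column j - 1.
S <- matrix(0, q, q - 2)
for (j in 2:(q - 1)) {
  S[j - 1, j - 1] <- 1 / h[j - 1]
  S[j,     j - 1] <- -1 / h[j - 1] - 1 / h[j]
  S[j + 1, j - 1] <- 1 / h[j]
}

# R is (q - 2) x (q - 2), symmetric tridiagonal; text index i is stored at i - 1.
R <- matrix(0, q - 2, q - 2)
for (i in 2:(q - 1)) R[i - 1, i - 1] <- (h[i - 1] + h[i]) / 3
for (i in 2:(q - 2)) {  # requires q >= 4
  R[i - 1, i] <- h[i] / 6
  R[i, i - 1] <- h[i] / 6
}

K <- S %*% solve(R, t(S))   # K = S R^{-1} S'

# Constant and linear functions of t0 carry no roughness penalty.
print(max(abs(K %*% rep(1, q))))
print(max(abs(K %*% t0)))
```

Each column of $S$ sums to zero and is orthogonal to the vector of distinct values, which is why both printed quantities should vanish up to rounding error.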

Funding Statement

This work was supported by CNPq and Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG).

Note

http://azzalini.stat.unipd.it/SN/index.html

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Atkinson A., Two graphical displays for outlying and influential observations in regression, Biometrika 68 (1981), pp. 13–20.
2. Azzalini A., A class of distributions which includes the normal ones, Scand. J. Statist. 12 (1985), pp. 171–178.
3. Boor C.D., A Practical Guide to Spline, Springer, Berlin, 1978.
4. Cancho V.C., Lachos V.H., and Ortega E.M.M., A nonlinear regression model with skew-normal errors, Statist. Papers 51 (2010), pp. 547–558.
5. Chen G. and You J., An asymptotic theory for semiparametric generalized least squares estimation in partially linear regression models, Statist. Papers 46 (2005), pp. 173–193.
6. Cook R.D., Assessment of local influence, J. R. Stat. Soc. Ser. B 48 (1986), pp. 133–169.
7. Cook R.D. and Weisberg S., Residuals and Influence in Regression, Chapman & Hall/CRC, Boca Raton, FL, 1982.
8. Cook R.D. and Weisberg S., Diagnostics for heteroscedasticity in regression, Biometrika 70 (1983), pp. 1–10.
9. Dempster A., Laird N., and Rubin D., Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. Ser. B 39 (1977), pp. 1–38.
10. Eilers P.H.C. and Marx B.D., Flexible smoothing with B-splines and penalties, Stat. Sci. 11 (1996), pp. 89–121.
11. Ferreira C.S., Lachos V.H., and Garay A.M., Inference and diagnostics for heteroscedastic nonlinear regression models under skew scale mixtures of normal distributions, J. Appl. Stat. 47 (2020), pp. 1690–1719.
12. Ferreira C.S. and Paula G.A., Estimation and diagnostic for skew-normal partially linear models, J. Appl. Stat. 44 (2017), pp. 3033–3053.
13. Garay A.M., Lachos V.H., Labra F.V., and Ortega E.M.M., Statistical diagnostics for nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Comput. Simul. 84 (2014), pp. 1761–1778.
14. Green P.J., Penalized likelihood for general semi-parametric regression models, Int. Stat. Rev. 55 (1987), pp. 245–259.
15. Green P.J. and Silverman B.W., Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, Chapman and Hall, Boca Raton, 1994.
16. Härdle W., Müller M., Sperlich S., and Werwatz A., Nonparametric and Semiparametric Models, Springer, Berlin, 2004.
17. Hastie T. and Tibshirani R., Generalized Additive Models, Chapman and Hall, London, 1990.
18. Ibacache-Pulgar G. and Paula G.A., Local influence for student-t partially linear models, Comput. Stat. Data Anal. 55 (2011), pp. 1462–1478.
19. Ibacache-Pulgar G., Paula G.A., and Cysneiros F.J.A., Semiparametric additive models under symmetric distributions, Test 22 (2013), pp. 103–121.
20. Johnson N.L., Kotz S., and Balakrishnan N., Continuous Univariate Distributions, Vol. 1, John Wiley, New York, 1994.
21. Keilegom I.V. and Wang L., Semiparametric modeling and estimation of heteroscedasticity in regression analysis of cross-sectional data, Electron. J. Stat. 4 (2010), pp. 133–160.
22. Labra F.V., Garay A.M., Lachos V.H., and Ortega E.M.M., Estimation and diagnostics for heteroscedastic nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Plan. Inference 142 (2012), pp. 2149–2165.
23. Lachos V.H., Montenegro L.C., and Bolfarine H., Inference and influence diagnostics for skew-normal null intercept measurement errors models, J. Stat. Comput. Simul. 78 (2008), pp. 395–419.
24. Lee S.Y. and Xu L., Influence analysis of nonlinear mixed-effects models, Comput. Stat. Data Anal. 45 (2004), pp. 321–341.
25. Lesaffre E. and Verbeke G., Local influence in linear mixed models, Biometrics 54 (1998), pp. 570–582.
26. Ma B., Chiou J., and Wang A., Efficient semiparametric estimator for heteroscedastic partially linear models, Biometrika 93 (2006), pp. 75–84.
27. Mattos T.B. and Ferreira C.S., The mean-shift outlier model under skew normal distribution, Commun. Stat. Simul. Comput. 45 (2016), pp. 1905–1917.
28. Meng X.L. and Rubin D.B., Maximum likelihood estimation via the ECM algorithm: A general framework, Biometrika 81 (1993), pp. 633–648.
29. R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2021. http://www.R-project.org/.
30. Ruppert D., Wand M.P., and Carrol R., Semiparametric Regression, Cambridge University Press, New York, 2003.
31. Segal M.R., Bacchetti P., and Jewell N.P., Variances for maximum penalized likelihood estimates obtained via the EM algorithm, J. R. Stat. Soc. Ser. B 56 (1994), pp. 345–352.
32. Stark P.C., Ryan L.M., McDonald J.L., and Burge H.A., Using meteorologic data to predict daily ragweed pollen levels, Aerobiologia 13 (1997), pp. 177–184.
33. Wang J. and Genton M.G., The multivariate skew-slash distribution, J. Stat. Plan. Inference 136 (2006), pp. 209–220.
34. Xie F.C., Wei B.C., and Lin J.G., Homogeneity diagnostics for skew-normal nonlinear regression models, Stat. Probab. Lett. 79 (2009), pp. 821–827.
35. You J., Chen G., and Zhou Y., Statistical inference of partially linear regression models with heteroscedastic errors, J. Multivar. Anal. 98 (2007), pp. 1539–1557.
36. Zeller C.B., Carvalho R.R., and Lachos V.H., On diagnostics in multivariate measurement error models under asymmetric heavy-tailed distributions, Statist. Papers 53 (2012), pp. 665–683.
37. Zhu H. and Lee S., Local influence for incomplete-data models, J. R. Stat. Soc. Ser. B 63 (2001), pp. 111–126.
38. Zhu H., Lee S., Wei B., and Zhou J., Case-deletion measures for models with incomplete data, Biometrika 88 (2001), pp. 727–737.
