Entropy. 2022 Jan 13;24(1):123. doi: 10.3390/e24010123

Robust Statistical Inference in Generalized Linear Models Based on Minimum Renyi’s Pseudodistance Estimators

María Jaenada 1, Leandro Pardo 1,*
Editor: Philip Broadbridge
PMCID: PMC8774563  PMID: 35052149

Abstract

Minimum Rényi's pseudodistance estimators (MRPEs) enjoy good robustness properties without a significant loss of efficiency in general statistical models and, in particular, for linear regression models (LRMs). Along these lines, Castilla et al. considered robust Wald-type test statistics in LRMs based on these MRPEs. In this paper, we extend the theory of MRPEs to Generalized Linear Models (GLMs) using independent and nonidentically distributed observations (INIDO). We derive asymptotic properties of the proposed estimators and analyze their influence function to assess their robustness properties. Additionally, we define robust Wald-type test statistics for testing linear hypotheses and theoretically study their asymptotic distribution, as well as their influence function. The performance of the proposed MRPEs and Wald-type test statistics is empirically examined for Poisson regression models through a simulation study, focusing on their robustness properties. We finally test the proposed methods on a real dataset related to the treatment of epilepsy, illustrating the superior performance of the robust MRPEs as well as the Wald-type tests.

Keywords: generalized linear model, independent and nonidentically distributed observations, minimum Rényi’s pseudodistance estimators, robust Wald-type test statistics for GLMs, influence function for GLMs, Poisson regression model

MSC: 62F35, 62J12

1. Introduction

Generalized linear models (GLMs) were first introduced by Nelder and Wedderburn [1] and later expanded upon by McCullagh and Nelder [2]. GLMs represent a natural extension of the standard linear regression model that encompasses a large variety of response variable distributions, including count, binary, and positive-valued distributions. Let $Y_1,\dots,Y_n$ be independent response variables. The classical GLM assumes that the density function of each random variable $Y_i$ belongs to the exponential family, having the form

$$f(y,\theta_i,\phi)=\exp\left\{\frac{y\theta_i-b(\theta_i)}{a(\phi)}+c(y,\phi)\right\},\qquad (1)$$

for $i=1,\dots,n$, where the functions $a(\phi)$, $b(\theta_i)$ and $c(y,\phi)$ are known. Therefore, the observations are independent but not identically distributed, depending on a location parameter $\theta_i$, $i=1,\dots,n$, and a nuisance parameter $\phi$. Further, we denote by $\mu_i$ the expectation of the random variable $Y_i$ and assume that there exists a monotone differentiable function, the so-called link function $g$, verifying

$$g(\mu_i)=x_i^{T}\beta,$$

with $\beta=(\beta_1,\dots,\beta_k)\in\mathbb{R}^k$ $(k<n)$ the regression parameter vector. The $k\times 1$ vector of explanatory variables, $x_i$, is assumed to be nonrandom, i.e., the design matrix is fixed. Correspondingly, the location parameter depends on the explanatory variables, $\theta_i=\theta(x_i^{T}\beta)$, and the density function given in (1) can be written as $f_i(y,\beta,\phi)$, emphasizing its dependence on $\beta$ and $x_i$.

The maximum likelihood estimator (MLE) and quasilikelihood estimators have been well studied for GLMs, and it is well known that they are asymptotically efficient but lack robustness in the presence of outliers, which can result in a significant estimation bias. Jaenada and Pardo [3] reviewed the different robust estimators in the statistical literature and studied the lack of robustness of the MLE as well. Among others, Stefanski et al. [4] studied optimally bounded score functions for the GLM and generalized the results obtained by Krasker and Welsch [5] for classical LRMs. Künsch et al. [6] introduced the so-called conditionally unbiased bounded-influence estimate, and Morgenthaler [7], Cantoni and Ronchetti [8], Bianco and Yohai [9], Croux and Hesbroeck [10], Bianco et al. [11], and Valdora and Yohai [12] continued the development of robust estimators for GLMs based on general M-estimators. Later, Ghosh and Basu [13] proposed robust estimators for the GLM based on the density power divergence (DPD) introduced in Basu et al. [14].

There are not many papers considering robust tests for GLMs. In this sense, Basu et al. [15] considered robust Wald-type tests based on the minimum DPD estimator, but assuming random explanatory variables for the GLM. The main purpose of this paper is to introduce new robust Wald-type tests based on the MRPE under fixed (not random) explanatory variables.

Broniatowski et al. [16] presented robust estimators for the parameters of the linear regression model (LRM) with random explanatory variables, and Castilla et al. [17] considered Wald-type test statistics, based on MRPEs, for the LRM. Toma and Leoni–Aubin [18] defined new robustness and efficiency measures based on the RP, and Toma et al. [19] considered the MRPE for general parametric models and constructed a model selection criterion for regression models. The term “Rényi pseudodistance” (RP) was adopted in Broniatowski et al. [16] because of its similarity to the Rényi divergence (Rényi [20]), although this family of divergences was considered previously in Jones et al. [21]. Fujisawa and Eguchi [22] used the RP under the name of γ-cross entropy, introduced robust estimators obtained by minimizing the empirical estimate of the γ-cross entropy (or the γ-divergence associated with the γ-cross entropy), and studied their properties. Further, Hirose and Masuda [23] considered the γ-likelihood function to obtain robust estimators. Using the γ-divergence, Kawashima and Fujisawa [24,25] presented robust estimators for sparse regression and sparse GLMs with random covariates. The robustness of all the previous estimators is based on a density power weight, which gives a small weight to outlying observations. This idea was also developed by Basu et al. [15] for the minimum DPD estimator and was considered some years earlier by Windham [26]. More concretely, Basu et al. [14] considered the density power function multiplied by the score function.

The outline of the paper is as follows: in Section 2, some results in relation to the MRPEs for GLMs, previously obtained in Jaenada and Pardo [3], are presented. Section 3 introduces and studies Wald-type tests based on the MRPE for testing linear null hypotheses for GLMs. In Section 4, the influence function of the MRPE as well as the influence functions of the Wald-type tests are derived. Finally, we empirically examine the performance of the proposed robust estimators and Wald-type test statistics for the Poisson regression model through a simulation study in Section 5, and we illustrate their applicability with real datasets for binomial and Poisson regression.

2. Asymptotic Distribution of the MRPEs for the GLMs

In this section, we revise some of the results presented in Jaenada and Pardo [3] in relation to the MRPE. Let $Y_1,\dots,Y_n$ be INIDO random variables with density functions $g_1,\dots,g_n$, respectively, with respect to some common dominating measure. The true densities $g_i$ are modeled by the density functions given in (1), belonging to the exponential family. Such densities are denoted by $f_i(y,\beta,\phi)$, highlighting their dependence on the regression vector $\beta$, the nuisance parameter $\phi$ and the observation index $i$, $i=1,\dots,n$. In the following, we assume that the explanatory variables $x_i$ are fixed, and therefore the response variables verify the INIDO setup studied in Castilla et al. [27].

For each response variable $Y_i$, the RP between the theoretical density function belonging to the exponential family, $f_i(y,\gamma)$, and the true density underlying the data, $g_i$, can be defined for $\alpha>0$ as

$$R_\alpha\left(f_i(\cdot,\gamma),g_i\right)=\frac{1}{\alpha+1}\log\int f_i(y,\gamma)^{\alpha+1}dy-\frac{1}{\alpha}\log\int f_i(y,\gamma)^{\alpha}g_i(y)\,dy+k,\qquad (2)$$

where

$$k=\frac{1}{\alpha(\alpha+1)}\log\int g_i(y)^{\alpha+1}dy$$

does not depend on $\gamma=(\beta^{T},\phi)^{T}$.

We consider $(y_1,\dots,y_n)$ a random sample of independent but nonhomogeneous observations of the response variables with fixed predictors $(x_1,\dots,x_n)$. Since only one observation of each variable $Y_i$ is available, a natural estimate of its true density $g_i$ is the degenerate distribution at the observation $y_i$. Consequently, in the following we denote by $\hat g_i$ the density function of the variable degenerate at the point $y_i$. Then, substituting the theoretical and empirical densities in (2) yields the loss

$$R_\alpha\left(f_i(\cdot,\gamma),\hat g_i\right)=\frac{1}{\alpha+1}\log\int f_i(y,\gamma)^{\alpha+1}dy-\frac{1}{\alpha}\log f_i(Y_i,\gamma)^{\alpha}+k.\qquad (3)$$

Taking the limit as $\alpha$ tends to zero, we get

$$R_0\left(f_i(\cdot,\gamma),\hat g_i\right)=\lim_{\alpha\to 0}R_\alpha\left(f_i(\cdot,\gamma),\hat g_i\right)=-\log f_i(Y_i,\gamma)+k.\qquad (4)$$

This last expression coincides with the Kullback–Leibler divergence, except for the constant $k$. More details about the Kullback–Leibler divergence can be found in Pardo [28].
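As a quick numerical illustration of the limit in (4), the following Python sketch (an illustration we add here, not part of the original analysis) evaluates the RP loss of (3), without the constant $k$, for a Poisson density and a small $\alpha$, and checks that it approaches the negative log-likelihood term; the truncation point `y_max = 200` of the Poisson support is an arbitrary choice.

```python
import numpy as np
from scipy.stats import poisson

def rp_loss(alpha, mu, y, y_max=200):
    """R_alpha(f, g_hat) without the constant k, for a Poisson density f
    and the degenerate distribution at an observed count y (Eq. (3))."""
    support = np.arange(y_max + 1)          # truncated Poisson support
    pmf = poisson.pmf(support, mu)
    term1 = np.log((pmf ** (alpha + 1)).sum()) / (alpha + 1)
    term2 = np.log(poisson.pmf(y, mu) ** alpha) / alpha
    return term1 - term2

# As alpha -> 0 the loss tends to -log f(y, mu), the MLE loss (Eq. (4))
mu, y = 3.0, 5
print(rp_loss(1e-4, mu, y), -poisson.logpmf(y, mu))
```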

For the sake of simplicity, let us denote

$$L_{\alpha i}(\gamma)=\left(\int f_i(y,\gamma)^{\alpha+1}dy\right)^{\frac{\alpha}{\alpha+1}},$$

and

$$V_i(Y_i,\gamma)=\frac{f_i(Y_i,\gamma)^{\alpha}}{L_{\alpha i}(\gamma)}.$$

The expression (3) can be rewritten as

$$R_\alpha\left(f_i(\cdot,\gamma),\hat g_i\right)=-\frac{1}{\alpha}\log\frac{f_i(Y_i,\gamma)^{\alpha}}{\left(\int f_i(y,\gamma)^{\alpha+1}dy\right)^{\frac{\alpha}{\alpha+1}}}+k=-\frac{1}{\alpha}\log V_i(Y_i,\gamma)+k.$$

Based on the previous idea, we define an objective function by averaging the RPs over all observations. Since minimizing $R_\alpha(f_i(\cdot,\gamma),\hat g_i)$ in $\gamma$ is equivalent to maximizing $\log V_i(Y_i,\gamma)$, we define the objective function

$$T_n^{\alpha}(\gamma)=\frac{1}{n}\sum_{i=1}^{n}\frac{f_i(Y_i,\gamma)^{\alpha}}{\left(\int f_i(y,\gamma)^{\alpha+1}dy\right)^{\frac{\alpha}{\alpha+1}}}=\frac{1}{n}\sum_{i=1}^{n}\frac{f_i(Y_i,\gamma)^{\alpha}}{L_{\alpha i}(\gamma)}=\frac{1}{n}\sum_{i=1}^{n}V_i(Y_i,\gamma).\qquad (5)$$

Based on (5), we can define the MRPE of the unknown parameter $\gamma$, $\hat\gamma_\alpha$, by

$$\hat\gamma_\alpha=\arg\max_{\gamma\in\Gamma}T_n^{\alpha}(\gamma),\qquad (6)$$

with $T_n^{\alpha}(\gamma)$ defined in (5), and

$$T_n^{0}(\gamma)=\frac{1}{n}\sum_{i=1}^{n}\log f_i(y_i,\gamma)$$

at $\alpha=0$. The MRPE coincides with the MLE at $\alpha=0$, and therefore the proposed family can be considered a natural extension of the classical MLE.
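To make the estimator in (6) concrete, here is a minimal Python sketch (not the authors' R implementation) that maximizes $T_n^{\alpha}$ for a Poisson regression model via Nelder–Mead, replacing the integral $\int f_i(y,\gamma)^{\alpha+1}dy$ by a sum over a truncated Poisson support; the truncation `y_max` and the zero starting value are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

def neg_rp_objective(beta, X, y, alpha, y_max=60):
    """Negative of T_n^alpha in Eq. (5) for Poisson regression with log link."""
    mu = np.exp(X @ beta)
    support = np.arange(y_max + 1)
    pmf = poisson.pmf(support[None, :], mu[:, None])            # n x (y_max+1)
    L = (pmf ** (alpha + 1)).sum(axis=1) ** (alpha / (alpha + 1))
    V = poisson.pmf(y, mu) ** alpha / L                         # V_i(Y_i, gamma)
    return -V.mean()

def mrpe_poisson(X, y, alpha, beta0):
    """MRPE of Eq. (6), computed with Nelder-Mead as in the paper."""
    res = minimize(neg_rp_objective, beta0, args=(X, y, alpha),
                   method="Nelder-Mead")
    return res.x
```

Note that for $\alpha=0$ the objective above is constant ($V_i\equiv 1$), so the sketch is only meaningful for $\alpha>0$; in the limit one maximizes the log-likelihood instead.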

Now, since the MRPE is defined as a maximum, it must annul the first derivatives of the objective function given in (5). The estimating equations for the parameters $\beta$ and $\phi$ are given by

$$\frac{1}{n}\sum_{i=1}^{n}\frac{\partial V_i(Y_i,\gamma)}{\partial\beta}=0_k,\qquad\frac{1}{n}\sum_{i=1}^{n}\frac{\partial V_i(Y_i,\gamma)}{\partial\phi}=0.\qquad (7)$$

For the first equation, we have

$$\frac{\partial V_i(Y_i,\gamma)}{\partial\beta}=\frac{1}{L_{\alpha i}(\gamma)^{2}}\left[\alpha f_i(Y_i,\gamma)^{\alpha}\frac{\partial\log f_i(Y_i,\gamma)}{\partial\beta}L_{\alpha i}(\gamma)-\alpha\left(\int f_i(y,\gamma)^{\alpha+1}dy\right)^{\frac{\alpha}{\alpha+1}-1}\int f_i(y,\gamma)^{\alpha+1}\frac{\partial\log f_i(y,\gamma)}{\partial\beta}dy\,f_i(Y_i,\gamma)^{\alpha}\right].$$

The previous partial derivatives can be simplified as

$$\frac{\partial\log f_i(Y_i,\gamma)}{\partial\beta}=\frac{Y_i-\mu_i}{\operatorname{Var}(Y_i)\,g'(\mu_i)}\,x_i=K_{1i}(Y_i,\gamma)\,x_i$$

and

$$\frac{\partial\log f_i(Y_i,\gamma)}{\partial\phi}=-\frac{\left(Y_i\theta_i-b(\theta_i)\right)a'(\phi)}{a(\phi)^{2}}+\frac{\partial c(Y_i,\phi)}{\partial\phi}=K_{2i}(Y_i,\gamma).$$

See Ghosh and Basu [13] for more details. Now using the simplified expressions, we can write the estimating equation for β as

$$\sum_{i=1}^{n}\frac{x_i}{L_{\alpha i}(\gamma)}\left[M_i(Y_i,\gamma)-N_i(Y_i,\gamma)\right]=0_k\qquad (8)$$

being

$$M_i(Y_i,\gamma)=f_i(Y_i,\gamma)^{\alpha}K_{1i}(Y_i,\gamma)$$

and

$$N_i(Y_i,\gamma)=\frac{f_i(Y_i,\gamma)^{\alpha}}{\int f_i(y,\gamma)^{\alpha+1}dy}\int f_i(y,\gamma)^{\alpha+1}K_{1i}(y,\gamma)\,dy.$$
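For the Poisson model with log link, $\operatorname{Var}(Y_i)=\mu_i$ and $g'(\mu_i)=1/\mu_i$, so $K_{1i}(y,\gamma)$ reduces to $y-\mu_i$. The following Python check (our illustration, with arbitrary example values) verifies this simplification of $\partial\log f_i/\partial\beta$ against a numerical derivative.

```python
import numpy as np
from scipy.stats import poisson

def loglik_i(beta, x, y):
    """Log-density of one Poisson observation with log link."""
    return poisson.logpmf(y, np.exp(x @ beta))

# arbitrary illustrative values
beta = np.array([0.4, -0.7])
x = np.array([1.0, 0.3])
y_obs = 2

eps = 1e-6
num_grad = np.array([
    (loglik_i(beta + eps * e, x, y_obs) - loglik_i(beta - eps * e, x, y_obs)) / (2 * eps)
    for e in np.eye(2)])

mu = np.exp(x @ beta)
analytic = (y_obs - mu) * x        # K_{1i}(y, gamma) * x_i with K_{1i} = y - mu
print(num_grad, analytic)
```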

Subsequently, for the nuisance parameter $\phi$, the corresponding derivative is

$$\frac{\partial V_i(Y_i,\gamma)}{\partial\phi}=\frac{1}{L_{\alpha i}(\gamma)^{2}}\left[\alpha f_i(Y_i,\gamma)^{\alpha}\frac{\partial\log f_i(Y_i,\gamma)}{\partial\phi}L_{\alpha i}(\gamma)-\alpha\frac{L_{\alpha i}(\gamma)}{\int f_i(y,\gamma)^{\alpha+1}dy}\int f_i(y,\gamma)^{\alpha+1}\frac{\partial\log f_i(y,\gamma)}{\partial\phi}dy\,f_i(Y_i,\gamma)^{\alpha}\right],$$

and thus, the estimating equation for ϕ is given by

$$\sum_{i=1}^{n}\frac{1}{L_{\alpha i}(\gamma)}\left[M_i^{*}(Y_i,\gamma)-N_i^{*}(Y_i,\gamma)\right]=0\qquad (9)$$

being

$$M_i^{*}(Y_i,\gamma)=f_i(Y_i,\gamma)^{\alpha}K_{2i}(Y_i,\gamma),$$

and

$$N_i^{*}(Y_i,\gamma)=\frac{f_i(Y_i,\gamma)^{\alpha}}{\int f_i(y,\gamma)^{\alpha+1}dy}\int f_i(y,\gamma)^{\alpha+1}K_{2i}(y,\gamma)\,dy.$$

Under some regularity conditions, Castilla et al. [27] established the consistency and asymptotic normality of the MRPEs under the INIDO setup. Before stating the consistency and asymptotic distribution of the MRPEs for the GLM, let us introduce some useful notation. We define

$$S_{\alpha i}=\int f_i(y,\beta,\phi)^{\alpha+1}dy,$$
$$m_{jli}(\gamma)=\frac{1}{\int f_i(y,\gamma)^{\alpha+1}dy}\int f_i(y,\gamma)^{\alpha+1}K_{ji}(y,\gamma)K_{li}(y,\gamma)\,dy,$$
$$m_{ji}(\gamma)=\frac{1}{\int f_i(y,\gamma)^{\alpha+1}dy}\int f_i(y,\beta,\phi)^{\alpha+1}K_{ji}(y,\gamma)\,dy,$$
$$l_{jli}(\gamma)=\int\frac{f_i(y,\gamma)^{2\alpha+1}}{L_{\alpha i}(\gamma)^{2}}\left(K_{ji}(y,\gamma)-m_{ji}(\gamma)\right)\left(K_{li}(y,\gamma)-m_{li}(\gamma)\right)dy,\qquad (10)$$

for all j,l=1,2 and i=1,,n.

Theorem 1.

Let $Y_1,\dots,Y_n$ be a random sample from the GLM defined in (1). The MRPE $\hat\gamma_\alpha=(\hat\beta_\alpha^{T},\hat\phi_\alpha)^{T}$ is consistent, and its asymptotic distribution is given by

$$\sqrt{n}\,\Omega_n(\gamma)^{-1/2}\Psi_n(\gamma)\left((\hat\beta_\alpha,\hat\phi_\alpha)-(\beta,\phi)\right)\xrightarrow[n\to\infty]{\mathcal{L}}N\left(0_{k+1},I_{k+1}\right),$$

where $X$ denotes the design matrix, $I_{k+1}$ is the $(k+1)$-dimensional identity matrix, and the matrices $\Psi_n$ and $\Omega_n$ are defined by

$$\Omega_n(\gamma)=\frac{1}{n}\begin{pmatrix} X^{T}D_{11}X & X^{T}D_{12}\mathbf{1}\\ \mathbf{1}^{T}D_{12}X & \mathbf{1}^{T}D_{22}\mathbf{1}\end{pmatrix},$$
$$\Psi_n(\gamma)=\frac{1}{n}\begin{pmatrix} X^{T}\left(D_{11}^{*}-D_{1}^{*T}D_{1}^{*}\right)X & X^{T}\left(D_{12}^{*}-D_{1}^{*T}D_{2}^{*}\right)\mathbf{1}\\ \mathbf{1}^{T}\left(D_{12}^{*}-D_{1}^{*T}D_{2}^{*}\right)X & \mathbf{1}^{T}\left(D_{22}^{*}-D_{2}^{*T}D_{2}^{*}\right)\mathbf{1}\end{pmatrix},$$

with

$$D_{jk}=\operatorname{diag}\left(l_{jki}(\gamma)\right)_{i=1,\dots,n},\qquad D_{jk}^{*}=\operatorname{diag}\left(m_{jki}(\gamma)\right)_{i=1,\dots,n},\qquad j,k=1,2,$$

and

$$D_{j}^{*}=\operatorname{diag}\left(m_{ji}(\gamma)\right)_{i=1,\dots,n},\qquad j=1,2.$$

Proof. 

The consistency is proved for general statistical models in Castilla et al. [27] and the asymptotic distribution of the MRPEs for GLM is derived in Jaenada and Pardo [3]. □

3. Wald Type Tests for the GLMs

In this section, we define Wald-type tests for linear null hypotheses of the form

$$H_0: M^{T}\gamma=m\quad\text{vs.}\quad H_1: M^{T}\gamma\neq m,\qquad (11)$$

where $\gamma=(\beta^{T},\phi)^{T}$, $M$ is a $(k+1)\times r$ full rank matrix, and

$$m=\left(m_1,\dots,m_r\right)^{T}\qquad (12)$$

is an $r$-dimensional vector $(r\leq k+1)$. If the nuisance parameter $\phi$ is known, as in logistic and Poisson regression, the matrix $M=L_{k\times r}$. Additionally, choosing

$$M=\begin{pmatrix} L_{k\times r}\\ O_{1\times r}\end{pmatrix}$$

gives rise to a null hypothesis defined by a linear combination of the regression coefficients, $\beta$, with $\phi$ known or unknown. Further, the simple null hypothesis is a particular case obtained by choosing $M$ as the identity matrix of rank $k$,

$$H_0:\beta=\beta_0\quad\text{vs.}\quad H_1:\beta\neq\beta_0,$$

with $m=\beta_0=\left(\beta_{10},\dots,\beta_{k0}\right)^{T}$.

In the following, we assume that there exists a matrix $A_\alpha(\gamma)$ verifying

$$\lim_{n\to\infty}\Psi_n(\gamma)\Omega_n(\gamma)^{-1}\Psi_n(\gamma)=A_\alpha(\gamma).$$

Definition 1.

Let $\hat\gamma_\alpha=(\hat\beta_\alpha^{T},\hat\phi_\alpha)^{T}$ be the MRPE of $\gamma=(\beta^{T},\phi)^{T}$ for the GLM. The Wald-type test statistics, based on the MRPE, for testing (11) are defined by

$$W_n(\hat\gamma_\alpha)=n\left(M^{T}\hat\gamma_\alpha-m\right)^{T}\left[M^{T}\Psi_n(\hat\gamma_\alpha)^{-1}\Omega_n(\hat\gamma_\alpha)\Psi_n(\hat\gamma_\alpha)^{-1}M\right]^{-1}\left(M^{T}\hat\gamma_\alpha-m\right).\qquad (13)$$
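Given the estimate, the matrices of Theorem 1, and the hypothesis pair $(M,m)$, the quadratic form (13) is a few lines of linear algebra. A hedged Python sketch (an illustrative helper, not from the paper):

```python
import numpy as np

def wald_stat(gamma_hat, M, m, Psi, Omega, n):
    """Wald-type statistic of Eq. (13)."""
    d = M.T @ gamma_hat - m
    Psi_inv = np.linalg.inv(Psi)
    S = M.T @ Psi_inv @ Omega @ Psi_inv @ M   # sandwich covariance of M^T gamma_hat
    return n * d @ np.linalg.solve(S, d)

# toy sanity check: identity matrices reduce W_n to n * ||gamma_hat - m||^2
W = wald_stat(np.array([1.0, 0.0]), np.eye(2), np.zeros(2),
              np.eye(2), np.eye(2), n=4)
print(W)  # 4.0
```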

The following theorem presents the asymptotic distribution of the Wald-type test statistic $W_n(\hat\gamma_\alpha)$.

Theorem 2.

Under the null hypothesis given in (11), the Wald-type test statistic $W_n(\hat\gamma_\alpha)$ asymptotically follows a chi-square distribution with $r$ degrees of freedom, where $r$ is the dimension of the vector $m$ in (12).

Proof. 

We know that

$$\sqrt{n}\left((\hat\beta_\alpha^{T},\hat\phi_\alpha)^{T}-(\beta^{T},\phi)^{T}\right)\xrightarrow[n\to\infty]{\mathcal{L}}N\left(0_{k+1},A_\alpha(\gamma)^{-1}\right).$$

Therefore,

$$\sqrt{n}\left(M^{T}\hat\gamma_\alpha-m\right)=\sqrt{n}\,M^{T}\left(\hat\gamma_\alpha-\gamma\right)\xrightarrow[n\to\infty]{\mathcal{L}}N\left(0_{r},M^{T}A_\alpha(\gamma)^{-1}M\right).$$

Now, the result follows by taking into account that $\hat\gamma_\alpha$ is a consistent estimator of $\gamma$. □

Based on the previous convergence, the null hypothesis in (11) is rejected if

$$W_n(\hat\gamma_\alpha)>\chi^{2}_{r,\alpha},\qquad (14)$$

where $\chi^{2}_{r,\alpha}$ is the $100(1-\alpha)$ percentile of a chi-square distribution with $r$ degrees of freedom.
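The decision rule (14) can be sketched as follows (an illustrative helper; the 0.05 significance level is our default choice, not the paper's):

```python
from scipy.stats import chi2

def wald_decision(W, r, level=0.05):
    """Reject H0 in (11) iff W exceeds the upper chi-square quantile (Eq. (14))."""
    critical = chi2.ppf(1 - level, df=r)
    return W > critical, chi2.sf(W, df=r)   # decision and p-value

reject, pval = wald_decision(10.0, r=1)
print(reject, pval)
```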

Finally, let $\gamma_1$ be a parameter point verifying $M^{T}\gamma_1\neq m$, i.e., $\gamma_1$ does not satisfy the null hypothesis. The next result establishes that the Wald-type tests given in (14) are consistent (see Fraser [29]).

Theorem 3.

Let $\gamma_1$ be a parameter point verifying $M^{T}\gamma_1\neq m$. Then the Wald-type tests given in (14) are consistent, i.e.,

$$\lim_{n\to\infty}P_{\gamma_1}\left(W_n(\hat\gamma_\alpha)>\chi^{2}_{r,\alpha}\right)=1.$$

Proof. 

See Appendix A. □

Remark 1.

The proof of the previous theorem establishes the approximate power function of the Wald-type tests defined in (13),

$$\pi_{W_n(\hat\gamma_\alpha)}(\gamma_1)\approx 1-\Phi\left(\frac{1}{\sigma(\gamma_1)}\left(\frac{\chi^{2}_{r,\alpha}}{\sqrt{n}}-\sqrt{n}\,l_{\gamma_1}(\gamma_1)\right)\right),$$

with $\Phi$ the standard normal distribution function,

where

$$\sigma^{2}(\gamma_1)=\left.\frac{\partial l_{\hat\gamma_\alpha}(\zeta)}{\partial\zeta^{T}}\right|_{\gamma=\gamma_1}A_\alpha(\gamma_1)^{-1}\left.\frac{\partial l_{\hat\gamma_\alpha}(\zeta)}{\partial\zeta}\right|_{\gamma=\gamma_1}$$

and

$$l_{\hat\gamma_\alpha}(\zeta)=\left(M^{T}\hat\gamma_\alpha-m\right)^{T}\left[M^{T}A_\alpha(\zeta)^{-1}M\right]^{-1}\left(M^{T}\hat\gamma_\alpha-m\right).$$

From the above expression, the necessary sample size $n$ for the Wald-type tests to attain a predetermined power, $\pi_0$, is given by $n=[n^{*}]+1$, with

$$n^{*}=\frac{A+B+\sqrt{A(A+2B)}}{2\,l_{\gamma_1}^{2}(\gamma_1)},$$

being

$$A=\sigma^{2}(\gamma_1)\left(\Phi^{-1}(1-\pi_0)\right)^{2},\qquad B=2\chi^{2}_{r,\alpha}\,l_{\gamma_1}(\gamma_1),$$

and $[\cdot]$ denotes the integer part.

In accordance with Maronna et al. [30], the breakdown point of an estimator $\hat\gamma_\alpha$ of a parameter $\gamma$ is the largest amount of contamination that the data may contain such that $\hat\gamma_\alpha$ still gives meaningful information about $\gamma$. The derivation of a general breakdown point is not easy, so it may deserve a separate paper, where the replacement finite-sample breakdown point introduced by Donoho and Huber [31] could be jointly considered. Although the breakdown point is an important theoretical concept in robust statistics, perhaps more useful is its finite-sample counterpart, the replacement finite-sample breakdown point. More details can be found in Section 3.2.5 of Maronna et al. [30].

4. Influence Function

In this section, we derive the IF of the MRPEs of the parameters $\gamma=(\beta^{T},\phi)^{T}$ and of the Wald-type statistics based on these MRPEs, $W_n(\hat\gamma_\alpha)$. The influence function (IF) of an estimator quantifies the impact of an infinitesimal perturbation of the true distribution of the data on the asymptotic value of the resulting parameter estimate (in terms of the corresponding statistical functional). An estimator is said to be robust if its IF is bounded. If we denote by $G=(G_1,\dots,G_n)$ the true distributions underlying the data, the functional $T_\alpha(G)$ associated with the MRPE of the parameter $\gamma$ is such that

$$\frac{1}{n}\sum_{i=1}^{n}R_\alpha\left(f_i(\cdot,T_\alpha(G)),g_i\right)=\min_{\gamma}\frac{1}{n}\sum_{i=1}^{n}R_\alpha\left(f_i(\cdot,\gamma),g_i\right).$$

The IF of an estimator is defined as the limiting standardized bias due to infinitesimal contamination. That is, given a contaminated distribution at the point $(y_t,x_t)$, $G_\varepsilon=(1-\varepsilon)G+\varepsilon\Delta_{(y_t,x_t)}$, with $\Delta_{(y_t,x_t)}$ the degenerate distribution at $(y_t,x_t)$, the IF of the estimator $\hat\gamma_\alpha$ in terms of its associated functional $T_\alpha(G)$ is computed as

$$\mathrm{IF}\left((y_t,x_t),T_\alpha,G\right)=\lim_{\varepsilon\to 0}\frac{T_\alpha(G_\varepsilon)-T_\alpha(G)}{\varepsilon}.$$

In the following, let us denote $T_\alpha(G)=\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)$, where $T_\alpha^{\beta}(G)$ and $T_\alpha^{\phi}(G)$ are the functionals associated with the parameters $\beta$ and $\phi$, respectively. They must then satisfy the estimating equations of the MRPE given by

$$\sum_{i=1}^{n}\frac{x_i}{L_{\alpha i}\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)}\left[M_i\left(y_i,\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)\right)-N_i\left(y_i,\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)\right)\right]=0_k,$$
$$\sum_{i=1}^{n}\frac{1}{L_{\alpha i}\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)}\left[M_i^{*}\left(y_i,\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)\right)-N_i^{*}\left(y_i,\left(T_\alpha^{\beta}(G),T_\alpha^{\phi}(G)\right)\right)\right]=0,\qquad (15)$$

where the quantities $L_{\alpha i}(\gamma)$, $M_i(y_i,\gamma)$, $N_i(y_i,\gamma)$, $M_i^{*}(y_i,\gamma)$ and $N_i^{*}(y_i,\gamma)$ are defined in Section 2. Now, evaluating the previous equations at the contaminated distribution $G_\varepsilon$, implicitly differentiating the estimating equations with respect to $\varepsilon$, and evaluating at $\varepsilon=0$, we obtain the expression of the IF for the GLM.

We first derive the expression of the IF of the MRPEs in the $i_0$-th direction. For this purpose, we consider the contaminated distributions

$$G_{i_0,\varepsilon}=\left(G_1,\dots,G_{i_0-1},G_{i_0,\varepsilon},G_{i_0+1},\dots,G_n\right),$$

with $G_{i_0,\varepsilon}=(1-\varepsilon)G_{i_0}+\varepsilon\Delta_{(y_{i_0},x_{i_0})}$. Here, only the $i_0$-th component of the vector of distributions is contaminated. If the true density function $g_i$ of each variable belongs to the exponential model, we have that

$$g_i(y)=\begin{cases} f_i(y,\gamma), & i\neq i_0,\\ (1-\varepsilon)f_i(y,\gamma)+\varepsilon\Delta_{(y_{i_0},x_{i_0})}(y), & i=i_0.\end{cases}$$

Accordingly, we define

$$\gamma_\varepsilon^{i_0}=T_\alpha\left(G_1,\dots,G_{i_0-1},G_{i_0,\varepsilon},G_{i_0+1},\dots,G_n\right)$$

as the MRPE when the true distribution underlying the data is $G_{i_0,\varepsilon}$. Based on Remark 5.2 in Castilla et al. [27], the IF of the MRPE in the $i_0$-th direction, with $(y_{i_0},x_{i_0})$ the contamination point, is given by

$$\mathrm{IF}\left((y_{i_0},x_{i_0}),T_\alpha,G\right)=\left.\frac{\partial\gamma_\varepsilon^{i_0}}{\partial\varepsilon}\right|_{\varepsilon=0}=\Psi_n(\gamma)^{-1}\begin{pmatrix}\left[K_{1i_0}(y_{i_0},\gamma)f_{i_0}(y_{i_0},\gamma)^{\alpha}-N_{i_0}(y_{i_0},\gamma)\right]x_{i_0}\\ K_{2i_0}(y_{i_0},\gamma)f_{i_0}(y_{i_0},\gamma)^{\alpha}-N_{i_0}^{*}(y_{i_0},\gamma)\end{pmatrix}.$$

In a similar manner, the IF in all directions (i.e., when all components of the vector of distributions are contaminated) has the expression

$$\mathrm{IF}\left((y_1,x_1),\dots,(y_n,x_n),T_\alpha,G\right)=\left.\frac{\partial\gamma_\varepsilon}{\partial\varepsilon}\right|_{\varepsilon=0}=\Psi_n(\gamma)^{-1}\sum_{i=1}^{n}\begin{pmatrix}\left[K_{1i}(y_i,\gamma)f_i(y_i,\gamma)^{\alpha}-N_i(y_i,\gamma)\right]x_i\\ K_{2i}(y_i,\gamma)f_i(y_i,\gamma)^{\alpha}-N_i^{*}(y_i,\gamma)\end{pmatrix},$$

with $(y_1,x_1),\dots,(y_n,x_n)$ the contamination points. We next derive the expression of the IF of the Wald-type tests presented in Section 3. The statistical functional associated with the Wald-type tests for the linear null hypothesis (11) at the distributions $G=(G_1,\dots,G_n)$, ignoring the constant $n$, is given by

$$W_\alpha(G)=\left(M^{T}T_\alpha(G)-m\right)^{T}\left[M^{T}A_\alpha\left(T_\alpha(G)\right)^{-1}M\right]^{-1}\left(M^{T}T_\alpha(G)-m\right).\qquad (16)$$

Again, evaluating the Wald-type test functional at the contaminated distribution $G_\varepsilon$ and implicitly differentiating the expression, we can obtain its IF. In particular, the IF of the Wald-type test statistics in the $i_0$-th direction at the contamination point $(y_{i_0},x_{i_0})$ is given by

$$\mathrm{IF}_1\left((y_{i_0},x_{i_0}),W_\alpha,G\right)=\left.\frac{\partial W_\alpha(G_{i_0,\varepsilon})}{\partial\varepsilon}\right|_{\varepsilon=0}=2\left(M^{T}T_\alpha(G)-m\right)^{T}\left[M^{T}A_\alpha\left(T_\alpha(G)\right)^{-1}M\right]^{-1}M^{T}\,\mathrm{IF}\left((y_{i_0},x_{i_0}),T_\alpha,G\right).$$

Evaluating the previous expression at the null hypothesis, $M^{T}T_\alpha(G)=m$, the IF becomes identically zero,

$$\mathrm{IF}_1\left((y_{i_0},x_{i_0}),W_\alpha,G\right)=0.$$

Therefore, it is necessary to consider the second-order IF of the proposed Wald-type tests. Differentiating $W_\alpha(G_\varepsilon)$ twice, we get

$$\mathrm{IF}_2\left((y_{i_0},x_{i_0}),W_\alpha,G\right)=\left.\frac{\partial^{2}W_\alpha(G_{i_0,\varepsilon})}{\partial\varepsilon^{2}}\right|_{\varepsilon=0}=2\,\mathrm{IF}\left((y_{i_0},x_{i_0}),T_\alpha,G\right)^{T}M\left[M^{T}A_\alpha\left(T_\alpha(G)\right)^{-1}M\right]^{-1}M^{T}\,\mathrm{IF}\left((y_{i_0},x_{i_0}),T_\alpha,G\right).$$

Finally, the second-order IF of the Wald-type tests in all directions is given by

$$\mathrm{IF}_2\left((y_1,x_1),\dots,(y_n,x_n),W_\alpha,G\right)=\left.\frac{\partial^{2}W_\alpha(G_\varepsilon)}{\partial\varepsilon^{2}}\right|_{\varepsilon=0}=2\,\mathrm{IF}\left((y_1,x_1),\dots,(y_n,x_n),T_\alpha,G\right)^{T}M\left[M^{T}A_\alpha\left(T_\alpha(G)\right)^{-1}M\right]^{-1}M^{T}\,\mathrm{IF}\left((y_1,x_1),\dots,(y_n,x_n),T_\alpha,G\right).$$

To assess the robustness of the MRPEs and Wald-type test statistics, we must discuss the boundedness of the corresponding IFs. The boundedness of the second-order IF of the Wald-type test statistics is determined by the boundedness of the IF of the MRPEs. Further, the matrix $\Psi_n(\gamma)$ is assumed to be bounded, so the robustness of the estimators depends only on the second factor of the IF. Most standard GLMs enjoy such boundedness for positive values of $\alpha$, but the influence function is unbounded at $\alpha=0$, corresponding to the MLE. As an illustrative example, Figure 1 plots the IF of the MRPEs for the Poisson regression model in one direction for $\alpha=0$ and $\alpha=0.5$. The model is fitted with only one covariate, the parameter $\phi$ is known for Poisson regression ($\phi=1$), and the true regression parameter is fixed at $\beta=1$. As shown, the IFs of the MRPEs with positive values of $\alpha$ are bounded, whereas the IF of the MLE is not, indicating its lack of robustness.

Figure 1. IF of MRPEs with $\alpha=0$ (left) and $\alpha=0.5$ (right) for the Poisson regression model.
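The boundedness shown in Figure 1 can be checked numerically: up to the constant matrix $\Psi_n(\gamma)^{-1}$ and the centering term $N_{i_0}$, the IF in one direction is driven by the factor $K_{1}(y,\gamma)f(y,\gamma)^{\alpha}=(y-\mu)f(y,\gamma)^{\alpha}$ for the Poisson model. A small Python check of this factor (our illustration; $\beta=1$ and a single covariate $x=1$ as in the figure):

```python
import numpy as np
from scipy.stats import poisson

mu = np.exp(1.0)                 # beta = 1, x = 1, log link
y = np.arange(0, 60)

def if_factor(alpha):
    """(y - mu) * f(y)^alpha: the y-dependent factor of the IF."""
    return (y - mu) * poisson.pmf(y, mu) ** alpha

print(np.abs(if_factor(0.0)).max())   # grows with the contamination point (MLE)
print(np.abs(if_factor(0.5)).max())   # bounded: the density power weight kills the tail
```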

5. Numerical Analysis: Poisson Regression Model

We illustrate the proposed robust methods for the Poisson regression model. As pointed out in Section 1, the Poisson regression model belongs to the GLM family with known nuisance parameter $\phi=1$, location parameter $\theta_i=x_i^{T}\beta$ and known functions $b(\theta_i)=\exp(x_i^{T}\beta)$ and $c(y_i)=-\log(y_i!)$. Since the nuisance parameter is known, for the sake of simplicity we write $\gamma=\beta$ in the following. In Poisson regression, the mean of the response variable is linked to the linear predictor through the natural logarithm, i.e., $\mu_i=\exp(x_i^{T}\beta)$. Thus, we can apply the proposed method to estimate the vector of regression parameters $\beta$ with the objective function given in Equation (5).

The results provided are computed in the software R. The optimization of the objective function is performed using the built-in optim() function, which applies the Nelder–Mead iterative algorithm (Nelder and Mead [32]). The Nelder–Mead optimization algorithm is robust, although relatively slow. The objective function $T_n^{\alpha}(\gamma)$ given in (5) is highly nonlinear and requires the evaluation of nontrivial quantities. Further, the computation of the Wald-type test statistics defined in (13) requires evaluating the covariance matrix of the MRPEs, which involves nontrivial integrals. Simplified expressions of the main quantities defined throughout the paper for the Poisson regression model, such as $L_{\alpha i}(\beta)$, $K_{1i}(y,\beta)$, $N_i(y,\beta)$, $m_{1i}(\beta)$, $m_{11i}(\beta)$ or $l_{11i}(\beta)$, are given in Appendix B. There is no closed-form expression for these quantities, and they need to be approximated numerically. Since the optimization is performed iteratively, computing such expressions at each step of the algorithm and for each observation may entail an increased computational burden. Nonetheless, the complexity is not significant for low-dimensional data. On the other hand, the optimum in (5) need not be uniquely defined, since the objective function may have several local optima. The choice of the initial value of the iterative algorithm is therefore crucial. Ideally, a good initial point should be consistent and robust. In our results, the MLE is used as the initial estimate for the algorithm.

We analyze the performance of the proposed methods in Poisson regression through a simulation study. We assess the behavior of the MRPE under a sparse Poisson regression model with $k=12$ covariates but only 3 significant variables. We set the 12-dimensional regression parameter to $\beta=(1.8,1,0,0,1.5,0,\dots,0)$ and generate the explanatory variables, $x_i$, from the standard uniform distribution with variance-covariance matrix having a Toeplitz structure, with $(j,l)$-th element $0.5^{|j-l|}$, $j,l=1,\dots,p$. The response variables are generated from the Poisson regression model, $Y_i\sim\mathcal{P}(\mu_i)$ with mean $\mu_i=\exp(x_i^{T}\beta)$. To evaluate the robustness of the proposed estimators, we contaminate the responses using a perturbed distribution of the form $(1-b)\mathcal{P}(\mu_i)+b\,\mathcal{P}(2\mu_i)$, where $b$ is a realization of a Bernoulli variable with parameter $\varepsilon$, the so-called contamination level. That is, the distribution of the contaminated responses lies in a small neighbourhood of the assumed model. We repeat the process $R=1000$ times for each value of $\alpha$.
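The contamination scheme above can be sketched as follows (an illustrative reimplementation with an arbitrary seed, not the authors' simulation code):

```python
import numpy as np

def contaminated_poisson(mu, eps, rng):
    """Draw from the mixture (1-b) P(mu) + b P(2*mu), b ~ Bernoulli(eps)."""
    b = rng.random(mu.shape) < eps           # contamination indicators
    return rng.poisson(np.where(b, 2.0 * mu, mu))

rng = np.random.default_rng(0)
mu = np.full(1000, 3.0)
y = contaminated_poisson(mu, eps=0.2, rng=rng)
print(y.mean())   # close to (1 - 0.2)*3 + 0.2*6 = 3.6
```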

Figure 2 presents the mean squared error (MSE) of the estimate, $\mathrm{MSE}=\|\hat\beta_\alpha-\beta\|^{2}$, (left) and the MSE of the prediction (right) against the contamination level in the data for different values of $\alpha=0,0.1,0.3,0.5$ and $0.7$. The sample size is fixed at $n=200$ and the MSE of the prediction is calculated using $n=200$ new observations following the true model. As shown, greater values of $\alpha$ correspond to more robust estimators, revealing the role of the tuning parameter in the robustness gain. Most strikingly, the MSE grows linearly for the MLE, while the proposed estimators manage to maintain a low error in all contaminated scenarios.

Figure 2. Mean squared error (MSE) of estimation (left) and prediction (right) against contamination level in the data.

Furthermore, it is to be expected that the error of the estimate decreases with larger sample sizes. In this regard, Figure 3 shows the MSE for different values of $\alpha=0,0.1,0.3,0.5$ and $0.7$ against the sample size, in the absence of contamination (left) and under 5% contamination (right). Our proposed estimators are more robust than the classical MLE in almost all contaminated scenarios, since the MSE is lower for all positive values of $\alpha$ than for $\alpha=0$ (corresponding to the MLE), except for very small sample sizes. Conversely, the MLE is, as expected, the most efficient estimator in the absence of contamination, closely followed by our proposed estimators with $\alpha=0.1,0.3$, highlighting the importance of $\alpha$ in controlling the trade-off between efficiency and robustness. In this regard, values of $\alpha$ around 0.3 perform best, taking into account the low loss of efficiency and the gain in robustness. Finally, note that small sample sizes adversely affect larger values of $\alpha$.

Figure 3. MSE of the estimate of $\beta$ in the absence of contamination (left) and under 5% contamination in the data (right), for different values of $\alpha$, against sample size for the Poisson regression model.

On the other hand, one could be interested in testing the significance of the selected variables. For this purpose, we simplify the true model and examine the performance of the proposed Wald-type test statistics under different true coefficient values. In particular, let us consider a Poisson regression model with only two covariates, generated from the uniform distribution as before, and the linear null hypothesis

$$H_0:\beta_2=0.\qquad (17)$$

That is, we are interested in assessing the significance of the second variable. The sample size is fixed at $n=200$ and the true value of the first component of the regression vector is set to $\beta_1=1$. We study the power of the tests under increasing signal of the second parameter $\beta_2$ and increasing contamination level. Here, the model is contaminated by perturbing the true distribution with $(1-b)\mathcal{P}(\mu_i)+b\,\mathcal{P}(\tilde\mu_i)$, where $\mu_i=\exp(x_i^{T}\beta)$ is the mean of the Poisson variable in the absence of contamination, $\tilde\mu_i=\exp(x_i^{T}\tilde\beta)$ is the contaminated mean, with $\tilde\beta=(1,0)$, and $b$ is a realization of a Bernoulli variable with probability of success $\varepsilon$. Table 1 presents the rejection rate of the Wald-type test statistics for different true values of $\beta_2$ under different contaminated scenarios. As expected, stronger signals produce higher power for all Wald-type tests. Moreover, the power of the Wald-type test statistics based on the MLE decreases when increasing the contamination, whereas the power of the statistics based on the MRPEs with positive values of $\alpha$ remains sufficiently high. Hence, our proposed robust estimators are able to detect the significance of the variable even in heavily contaminated scenarios.

Table 1.

Rejection rate of Wald-type test statistics based on MRPEs with different true values of β2 and contamination levels.

β2 α Contamination Level
0 5% 10% 15% 20% 25%
0.3 0 0.332 0.264 0.227 0.187 0.157 0.141
0.1 0.435 0.376 0.328 0.285 0.251 0.223
0.3 0.557 0.511 0.483 0.416 0.390 0.360
0.5 0.617 0.563 0.533 0.493 0.467 0.427
0.7 0.638 0.590 0.568 0.536 0.513 0.476
0.5 0 0.756 0.730 0.683 0.621 0.551 0.493
0.1 0.833 0.798 0.775 0.736 0.681 0.622
0.3 0.885 0.870 0.864 0.829 0.792 0.752
0.5 0.895 0.891 0.886 0.867 0.842 0.814
0.7 0.901 0.897 0.893 0.879 0.854 0.832
0.7 0 0.971 0.979 0.968 0.948 0.915 0.862
0.1 0.980 0.988 0.983 0.973 0.962 0.932
0.3 0.988 0.995 0.992 0.987 0.985 0.969
0.5 0.989 0.995 0.995 0.992 0.992 0.977
0.7 0.989 0.995 0.993 0.995 0.990 0.983

6. Real Data Applications

6.1. Example I: Poisson Regression

We now apply our proposed estimators to a real dataset concerning Crohn's disease. The data were first studied in Lô and Ronchetti [33] to assess the adverse events of a drug. The clinical study included 117 patients affected by the disease, for whom information was recorded on 7 explanatory variables: BMI (body mass index), HEIGHT, COUNTRY (one of the two countries where the patient lives), SEX, AGE, WEIGHT, and TREAT (the drug taken by the patient in factor form: placebo, Dose 1, Dose 2), in addition to the response variable AE (number of adverse events). Lô and Ronchetti [33] considered a Poisson regression model for the Crohn data and determined that only the variables Dose 1, BMI, HEIGHT, SEX, AGE, and COUNTRY may be essentially significant. Further, they flagged observations 23, 49, and 51 as highly influential on the classical analysis. Table 2 presents the estimated coefficients of the explanatory variables when fitting the Poisson regression model. The robust methods suggest higher coefficients for the variables BMI and AGE, and lower values for the coefficients of the categorical variables COUNTRY, SEX, and Dose 1.

Table 2.

Estimated coefficients for Crohn’s disease data for different values of α with original data and clean data (after removing influential observations).

Intercept BMI Height Age Country Sex Dose 1
Original Data
MLE (α= 0) 6.261 0.026 −0.037 0.012 −0.394 −0.646 −0.533
α= 0.1 5.197 0.037 −0.033 0.014 −0.489 −0.800 −0.469
α= 0.3 4.798 0.058 −0.036 0.021 −0.545 −1.284 −0.832
α= 0.5 4.391 0.067 −0.037 0.028 −0.557 −1.535 −1.036
α= 0.7 5.699 0.067 −0.047 0.036 −0.737 −1.759 −1.157

Following the discussion in Lô and Ronchetti [33], classical tests may not select the variable AGE as significant. We therefore propose testing the significance of that variable using Wald-type test statistics based on different values of $\alpha$. Table 3 shows the p-values of the corresponding tests of the null hypothesis $H_0$: AGE = 0, with the original data and after removing the outlying observations.

Table 3.

p-values of test with null hypothesis H0: AGE = 0 with original and clean data (after removing influential observations).

Original Data Clean Data
MLE (α= 0) 0.059 0.011
α= 0.1 0.018 0.004
α= 0.3 0.001 0.000
α= 0.5 0.000 0.000
α= 0.7 0.000 0.000

With the original data, the Wald-type test based on the MLE fails to detect the significance of the variable AGE at the 5% level, whereas the Wald-type test statistics with positive values of $\alpha$ indicate strong evidence against the null hypothesis. In contrast, if the influential observations are removed, all Wald-type test statistics agree on the significance of the variable. This example illustrates the robustness of the proposed statistics.

6.2. Example II: Binomial Regression

We finally illustrate the applicability of the MRPE for robust inference in the binomial regression model. We examine the damaged carrots dataset, first studied in Phelps [34] and later discussed by Cantoni and Ronchetti [8] and Ghosh and Basu [13] to illustrate robust procedures for binomial regression. The data come from a soil experiment and give the proportion of carrots showing insect damage in a trial with three blocks and eight dose levels of insecticide. They contain 24 samples, among which the 14th observation was flagged as an outlier in the y-space, although not a leverage point. The explanatory variables are the logarithmic transform of the dose (Logdose) and two dummy variables for Blocks 1 and 2.

Binomial regression is a natural extension of logistic regression to the case where the response variable $Y$ does not follow a Bernoulli distribution but a binomial distribution counting the number of successes in a series of $m$ independent Bernoulli trials. The binomial regression model belongs to the GLM family with known nuisance parameter $\phi=1$, location parameter $\theta_i=x_i^{T}\beta$ and functions $b(\theta_i)=m\log\left(1+\exp(x_i^{T}\beta)\right)$ and $c(y_i)=\log\binom{m}{y_i}$. The mean of the response variable is then linked to the linear predictor through the logit function, i.e.,

$$\log\frac{\mu_i}{m-\mu_i}=x_i^{T}\beta.$$
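The logit link above can be inverted as $\mu_i=m\exp(x_i^{T}\beta)/(1+\exp(x_i^{T}\beta))$. A small Python round-trip check of this inversion (illustrative values only):

```python
import numpy as np

def binom_mean(x, beta, m):
    """Invert logit(mu/m) = x^T beta, i.e. mu = m * expit(x^T beta)."""
    eta = x @ beta
    return m / (1.0 + np.exp(-eta))

# round trip: applying the logit link to mu recovers the linear predictor
x = np.array([1.0, 0.5])
beta = np.array([0.2, -1.0])
mu = binom_mean(x, beta, m=8)
eta_back = np.log(mu / (8 - mu))
print(eta_back, x @ beta)
```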

Table 4 presents the estimated coefficients of the regression vector for the carrots data using the MLE and the robust MRPEs, when the model is fitted with the original data and without the outlying observation. The results are computed in the same manner as in Section 5, adapting the corresponding quantities in Equation (5) to the binomial model. All integrals involved are numerically approximated, and the MLE is used as the initial estimate for the optimization algorithm. The influence of observation 14 stands out when using the MLE: the estimated coefficients are remarkably different when fitting the model with and without it. In contrast, all methods estimate similar coefficients after removing the outlying observation, coinciding with the robust estimates for moderately high values of the tuning parameter $\alpha$.

Table 4.

Estimated coefficients for the damaged carrots data for different values of α, with the original data and the clean data (after outlier removal).

Method Intercept Logdose B1 B2
Original Data
MLE (α=0) 1.480 −1.817 0.542 0.843
α= 0.1 1.729 −1.949 0.527 0.755
α= 0.3 2.017 −2.100 0.479 0.652
α= 0.5 2.090 −2.134 0.386 0.625
α= 0.7 2.150 −2.161 0.258 0.615
Clean Data
MLE (α=0) 2.141 −2.179 0.546 0.636
α= 0.1 2.126 −2.167 0.529 0.633
α= 0.3 2.105 −2.149 0.479 0.627
α= 0.5 2.108 −2.144 0.385 0.621
α= 0.7 2.154 −2.163 0.257 0.614

7. Conclusions

In this paper, we presented the MRPEs and Wald-type test statistics for GLMs. The proposed MRPEs and test statistics have appealing robustness properties when the data are contaminated by outliers or leverage points. MRPEs are consistent and asymptotically normal, and they represent an attractive alternative to the classical nonrobust methods. Additionally, robust Wald-type test statistics based on the MRPEs were developed. Through the study of the IFs and an extensive simulation study, we established their robustness from a theoretical and a practical point of view, respectively. In particular, we illustrated the superior performance of the MRPEs and the corresponding Wald-type tests for the Poisson regression model.

Acknowledgments

We are very grateful to the referees and the associate editor for their helpful comments and suggestions. This research was supported by the Spanish Grants PGC2018-095194-B-100 (L. Pardo and M. Jaenada) and FPU/018240 (M. Jaenada). M. Jaenada and L. Pardo are members of the Instituto de Matemática Interdisciplinar, Complutense University of Madrid.

Abbreviations

The following abbreviations are used in this manuscript:

DPD Density Power Divergence
IF Influence Function
GLM Generalized Linear Model
LRM Linear Regression Model
MLE Maximum Likelihood Estimator
MRPE Minimum Rényi Pseudodistance Estimator
RP Rényi Pseudodistance

Appendix A. Proof of Theorem 3

Let us define

l_\eta(\zeta) = \left(M^T\eta - m\right)^T \left(M^T A_\alpha(\zeta)^{-1} M\right)^{-1} \left(M^T\eta - m\right),

so the Wald-type test statistic is such that

n\, l_{\hat{\gamma}_\alpha}(\hat{\gamma}_\alpha) = W_n(\hat{\gamma}_\alpha).

We know that \hat{\gamma}_\alpha \xrightarrow[n\to\infty]{P} \gamma_1, and therefore l_{\hat{\gamma}_\alpha}(\gamma_1) and l_{\gamma_1}(\gamma_1) have the same asymptotic distribution. A first-order Taylor expansion of g(\zeta) = l_{\hat{\gamma}_\alpha}(\zeta) at \hat{\gamma}_\alpha around \gamma_1 gives

l_{\hat{\gamma}_\alpha}(\hat{\gamma}_\alpha) = l_{\hat{\gamma}_\alpha}(\gamma_1) + \left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta^T}\right|_{\zeta=\gamma_1}\left(\hat{\gamma}_\alpha - \gamma_1\right) + o_p\left(\left\|\hat{\gamma}_\alpha - \gamma_1\right\|\right).

Based on the asymptotic distribution of γ^α we have

\sqrt{n}\, o_p\left(\left\|\hat{\gamma}_\alpha - \gamma_1\right\|\right) = o_p(1),

therefore

\sqrt{n}\left(l_{\hat{\gamma}_\alpha}(\hat{\gamma}_\alpha) - l_{\gamma_1}(\gamma_1)\right) \quad\text{and}\quad \sqrt{n}\left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta^T}\right|_{\zeta=\gamma_1}\left(\hat{\gamma}_\alpha - \gamma_1\right)

have asymptotically the same distribution, i.e.,

\sqrt{n}\left(l_{\hat{\gamma}_\alpha}(\hat{\gamma}_\alpha) - l_{\gamma_1}(\gamma_1)\right) \xrightarrow[n\to\infty]{L} N\left(0,\; \left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta^T}\right|_{\zeta=\gamma_1} A_\alpha(\gamma_1)^{-1} \left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta}\right|_{\zeta=\gamma_1}\right).

Now, we shall denote,

\sigma^2(\gamma_1) = \left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta^T}\right|_{\zeta=\gamma_1} A_\alpha(\gamma_1)^{-1} \left.\frac{\partial l_{\hat{\gamma}_\alpha}(\zeta)}{\partial \zeta}\right|_{\zeta=\gamma_1}.

Then, we have,

P_{\gamma_1}\left(W_n(\hat{\gamma}_\alpha) > \chi^2_{r,\alpha}\right) = P_{\gamma_1}\left(W_n(\hat{\gamma}_\alpha) - n\,l_{\gamma_1}(\gamma_1) > \chi^2_{r,\alpha} - n\,l_{\gamma_1}(\gamma_1)\right) = P_{\gamma_1}\left(\frac{\sqrt{n}}{\sigma(\gamma_1)}\left(l_{\hat{\gamma}_\alpha}(\hat{\gamma}_\alpha) - l_{\gamma_1}(\gamma_1)\right) > \frac{1}{\sigma(\gamma_1)}\left(\frac{\chi^2_{r,\alpha}}{\sqrt{n}} - \sqrt{n}\,l_{\gamma_1}(\gamma_1)\right)\right) \cong 1 - \Phi_{N(0,1)}\left(\frac{1}{\sigma(\gamma_1)}\left(\frac{\chi^2_{r,\alpha}}{\sqrt{n}} - \sqrt{n}\,l_{\gamma_1}(\gamma_1)\right)\right),

where \Phi_{N(0,1)}(t) represents the distribution function of a standard normal distribution evaluated at t. Finally,

\lim_{n\to\infty} P_{\gamma_1}\left(W_n(\hat{\gamma}_\alpha) > \chi^2_{r,\alpha}\right) = 1.

Appendix B. Poisson Regression Model

We derive here some explicit expressions for the particular case of Poisson regression. Following the discussion in Section 5, we write here γ = β, since the nuisance parameter is known, ϕ = 1. The Poisson distribution with parameter e^{x_i^T β} is given by

f_i(y,\beta) = \frac{1}{y!}\, e^{-e^{x_i^T\beta}}\, e^{y\, x_i^T\beta}, \quad y = 0, 1, \ldots

Differentiating its logarithm with respect to the regression vector, we get

\frac{\partial \log f_i(y,\beta)}{\partial \beta} = \left(y - e^{x_i^T\beta}\right) x_i^T,

so we can write

K_{1i}(y,\beta) = y - e^{x_i^T\beta}.

Further, we have that

N_i(y,\beta) = f_i(y,\beta)^{\alpha}\, \frac{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}\left(y - e^{x_i^T\beta}\right)}{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}},

so the estimating equations of the Poisson regression model are given by

\sum_{i=1}^{n} \frac{1}{L_{\alpha i}(\beta)} \left[ f_i(y_i,\beta)^{\alpha}\left(y_i - e^{x_i^T\beta}\right) - N_i(y_i,\beta) \right] x_i = 0_k. \quad (A1)

For α=0, we have

N_i(y_i,\beta) = 0 \quad\text{and}\quad L_{\alpha i}(\beta) = 1,

so the estimating equations are given by

\sum_{i=1}^{n}\left(y_i - e^{x_i^T\beta}\right) x_i = 0_k,

yielding the maximum likelihood estimating equations.
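The α = 0 limit can be checked numerically. The sketch below (truncating the infinite series at a large count) evaluates a per-observation estimating term of the form f_i(y_i,β)^α[(y_i − e^{x_i^Tβ}) − m1], where m1 = Σ_y f^{α+1}(y − e^{x_i^Tβ}) / Σ_y f^{α+1} is the mean-correction term; the factor 1/L_{αi}(β), which equals 1 at α = 0, is omitted. The helper names, the truncation point, and this reading of the correction term are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from math import lgamma

def poisson_pmf(ts, mu):
    """Poisson pmf evaluated in log form to avoid overflow for large counts."""
    logs = ts * np.log(mu) - mu - np.array([lgamma(t + 1.0) for t in ts])
    return np.exp(logs)

def psi_term(y, mu, alpha, y_max=200):
    """Per-observation estimating term f(y)^alpha * ((y - mu) - m1), with
    m1 = sum_t f(t)^(alpha+1) (t - mu) / sum_t f(t)^(alpha+1); the infinite
    series is truncated at y_max (an assumption for illustration)."""
    ts = np.arange(y_max + 1)
    w = poisson_pmf(ts, mu) ** (alpha + 1.0)
    m1 = np.sum(w * (ts - mu)) / np.sum(w)
    fy = poisson_pmf(np.array([y]), mu)[0]
    return fy ** alpha * ((y - mu) - m1)

# At alpha = 0 the term is exactly the ML score component y - mu:
assert np.isclose(psi_term(3, 2.5, alpha=0.0), 3 - 2.5)
# For alpha > 0, unlikely large counts are downweighted by f(y)^alpha:
assert abs(psi_term(40, 2.5, alpha=0.5)) < abs(40 - 2.5)
```

The downweighting factor f_i(y_i,β)^α is what bounds the influence of outlying counts, which is the source of the robustness discussed in the paper.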

On the other hand, the asymptotic distribution of β^α is given by

\sqrt{n}\left[\frac{1}{n} X^T D_{11} X\right]^{-1/2} \frac{1}{n}\, X^T\left(D_{11}^{*} - D_1^{*} D_1^{*T}\right) X \left(\hat{\beta}_\alpha - \beta\right) \xrightarrow[n\to\infty]{L} N\left(0_k, I_k\right),

where

D_{11} = \mathrm{diag}\left(l_{11}^{i}(\beta)\right)_{i=1,\ldots,n},

with

l_{11}^{i}(\beta) = \frac{1}{L_{\alpha i}(\beta)^{2}} \sum_{y=0}^{\infty} f_i(y,\beta)^{2\alpha+1}\left(K_{1i}(y,\beta) - m_{1i}(\beta)\right)^{2}

and

m_{1i}(\beta) = \frac{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}\left(y - e^{x_i^T\beta}\right)}{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}}.

Finally, D_{11}^{*} = \mathrm{diag}\left(m_{11i}(\beta)\right) and D_1^{*} = \mathrm{diag}\left(m_{1i}(\beta)\right), with

m_{11i}(\beta) = \frac{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}\left(y - e^{x_i^T\beta}\right)^{2}}{\displaystyle\sum_{y=0}^{\infty} f_i(y,\beta)^{\alpha+1}}.
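The series quantities m_{1i} and m_{11i} admit a simple sanity check: at α = 0 the weights reduce to the plain Poisson pmf, so m_{1i} = E[Y] − e^{x_i^Tβ} = 0 and m_{11i} = Var(Y) = e^{x_i^Tβ}. The sketch below verifies this with truncated series; the function names and truncation point are assumptions for illustration.

```python
import numpy as np
from math import lgamma

def poisson_pmf(ts, mu):
    """Poisson pmf in log form for numerical stability."""
    logs = ts * np.log(mu) - mu - np.array([lgamma(t + 1.0) for t in ts])
    return np.exp(logs)

def m1_m11(mu, alpha, y_max=300):
    """Truncated-series m_{1i} and m_{11i} for a Poisson mean mu,
    weighting (y - mu) and (y - mu)^2 by f(y)^(alpha+1)."""
    ts = np.arange(y_max + 1)
    w = poisson_pmf(ts, mu) ** (alpha + 1.0)
    m1 = np.sum(w * (ts - mu)) / np.sum(w)
    m11 = np.sum(w * (ts - mu) ** 2) / np.sum(w)
    return m1, m11

# At alpha = 0: m1 = E[Y] - mu = 0 and m11 = Var(Y) = mu.
m1, m11 = m1_m11(mu=4.0, alpha=0.0)
assert abs(m1) < 1e-8
assert abs(m11 - 4.0) < 1e-6
# For alpha > 0 the tilted weights concentrate near the center,
# shrinking m11 below the Poisson variance mu.
_, m11_robust = m1_m11(mu=4.0, alpha=0.5)
assert m11_robust < 4.0
```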

Author Contributions

Conceptualization, M.J. and L.P.; methodology, M.J. and L.P.; software, M.J. and L.P.; validation, M.J. and L.P.; formal analysis, M.J. and L.P.; investigation, M.J. and L.P.; resources, M.J. and L.P.; data curation, M.J. and L.P.; writing—original draft preparation, M.J. and L.P.; writing—review and editing, M.J. and L.P.; visualization, M.J. and L.P.; supervision, M.J. and L.P.; project administration, M.J. and L.P.; funding acquisition, M.J. and L.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish Grants PGC2018-095194-B-100 (L. Pardo and M. Jaenada) and FPU/018240 (M. Jaenada).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The real datasets are publicly available in the R package robustbase on CRAN, under the names CrohnD (Poisson regression example) and carrots (binomial regression example).

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Nelder J.A., Wedderburn R.W.M. Generalized linear models. J. R. Stat. Soc. Ser. A. 1972;135:370–384. doi: 10.2307/2344614.
2. McCullagh P., Nelder J.A. Generalized Linear Models. Monographs on Statistics and Applied Probability. Chapman and Hall; London, UK: 1983.
3. Jaenada M., Pardo L. The minimum Renyi's Pseudodistances estimators for Generalized Linear Models. In: Data Analysis and Related Applications: Theory and Practice. Proceedings of the ASMDA. Wiley; Athens, Greece: 2021.
4. Stefanski L.A., Carroll R.J., Ruppert D. Optimally bounded score functions for generalized linear models with applications to logistic regression. Biometrika. 1986;73:413–424. doi: 10.2307/2336218.
5. Krasker W.S., Welsch R.E. Efficient bounded-influence regression estimation. J. Am. Stat. Assoc. 1982;77:595–604. doi: 10.1080/01621459.1982.10477855.
6. Künsch H.R., Stefanski L.A., Carroll R.J. Conditionally unbiased bounded-influence estimation in general regression models, with applications to generalized linear models. J. Am. Stat. Assoc. 1989;84:460–466.
7. Morgenthaler S. Least-absolute-deviations fits for generalized linear models. Biometrika. 1992;79:747–754. doi: 10.1093/biomet/79.4.747.
8. Cantoni E., Ronchetti E. Robust inference for generalized linear models. J. Am. Stat. Assoc. 2001;96:1022–1030. doi: 10.1198/016214501753209004.
9. Bianco A.M., Yohai V.J. Robust estimation in the logistic regression model. In: Robust Statistics, Data Analysis, and Computer Intensive Methods. Springer; New York, NY, USA: 1996; pp. 17–34.
10. Croux C., Haesbroeck G. Implementing the Bianco and Yohai estimator for logistic regression. Comput. Stat. Data Anal. 2003;44:273–295. doi: 10.1016/S0167-9473(03)00042-2.
11. Bianco A.M., Boente G., Rodrigues I.M. Robust tests in generalized linear models with missing responses. Comput. Stat. Data Anal. 2013;65:80–97. doi: 10.1016/j.csda.2012.05.008.
12. Valdora M., Yohai V.J. Robust estimators for generalized linear models. J. Stat. Plan. Inference. 2014;146:31–48. doi: 10.1016/j.jspi.2013.09.016.
13. Ghosh A., Basu A. Robust estimation in generalized linear models: The density power divergence approach. Test. 2016;25:269–290. doi: 10.1007/s11749-015-0445-3.
14. Basu A., Harris I.R., Hjort N.L., Jones M.C. Robust and efficient estimation by minimising a density power divergence. Biometrika. 1998;85:549–559. doi: 10.1093/biomet/85.3.549.
15. Basu A., Ghosh A., Mandal A., Martin N., Pardo L. Robust Wald-type tests in GLM with random design based on minimum density power divergence estimators. Stat. Methods Appl. 2021;3:933–1005. doi: 10.1007/s10260-020-00544-4.
16. Broniatowski M., Toma A., Vajda I. Decomposable pseudodistances and applications in statistical estimation. J. Stat. Plan. Inference. 2012;142:2574–2585. doi: 10.1016/j.jspi.2012.03.019.
17. Castilla E., Martín N., Muñoz S., Pardo L. Robust Wald-type tests based on Minimum Rényi Pseudodistance Estimators for the Multiple Regression Model. J. Stat. Comput. Simul. 2020;14:2592–2613. doi: 10.1080/00949655.2020.1787410.
18. Toma A., Leoni-Aubin S. Optimal robust M-estimators using Rényi pseudodistances. J. Multivar. Anal. 2013;115:259–273. doi: 10.1016/j.jmva.2012.10.003.
19. Toma A., Karagrigoriou A., Trentou P. Robust model selection criteria based on pseudodistances. Entropy. 2020;22:304. doi: 10.3390/e22030304.
20. Rényi A. On measures of entropy and information. In: Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability. University of California Press; Berkeley, CA, USA: 1961; pp. 547–561.
21. Jones M.C., Hjort N.L., Harris I.R., Basu A. A comparison of related density-based minimum divergence estimators. Biometrika. 2001;88:865–873. doi: 10.1093/biomet/88.3.865.
22. Fujisawa H., Eguchi S. Robust parameter estimation with a small bias against heavy contamination. J. Multivar. Anal. 2008;99:2053–2081. doi: 10.1016/j.jmva.2008.02.004.
23. Hirose K., Masuda H. Robust relative error estimation. Entropy. 2018;20:632. doi: 10.3390/e20090632.
24. Kawashima T., Fujisawa H. Robust and sparse regression via γ-divergence. Entropy. 2017;19:608. doi: 10.3390/e19110608.
25. Kawashima T., Fujisawa H. Robust and sparse regression in generalized linear model by stochastic optimization. Jpn. J. Stat. Data Sci. 2019;2:465–489. doi: 10.1007/s42081-019-00049-9.
26. Windham M.P. Robustifying model fitting. J. R. Stat. Soc. Ser. B. 1995;57:599–609. doi: 10.1111/j.2517-6161.1995.tb02050.x.
27. Castilla E., Jaenada M., Pardo L. Estimation and testing on independent not identically distributed observations based on Rényi's pseudodistances. arXiv. 2021. arXiv:2102.12282.
28. Pardo L. Statistical Inference Based on Divergence Measures. Chapman and Hall/CRC; Boca Raton, FL, USA: 2018.
29. Fraser D.A.S. Nonparametric Methods in Statistics. John Wiley & Sons; New York, NY, USA: 1957.
30. Maronna R.A., Martin R.D., Yohai V.J. Robust Statistics: Theory and Methods. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2006.
31. Donoho D.L., Huber P.J. The notion of breakdown point. In: A Festschrift for Erich L. Lehmann. CRC Press; Boca Raton, FL, USA: 1983.
32. Nelder J.A., Mead R. A simplex method for function minimization. Comput. J. 1965;7:308–313. doi: 10.1093/comjnl/7.4.308.
33. Lô S.N., Ronchetti E. Robust and accurate inference for generalized linear models. J. Multivar. Anal. 2009;100:2126–2136. doi: 10.1016/j.jmva.2009.06.012.
34. Phelps K. Use of the Complementary Log-Log Function to Describe Dose Response Relationships in Insecticide Evaluation Field Trials. In: Gilchrist R., editor. Proceedings of the International Conference on Generalized Linear Models. Lecture Notes in Statistics, No. 14. Springer; Berlin, Germany: 1982.



Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
