The unit extended Weibull families of distributions and its applications

Renata Rojas Guerra; Fernando A Peña-Ramírez; Marcelo Bourguignon

doi:10.1080/02664763.2020.1796936

2020 Aug 3;48(16):3174–3192. doi: 10.1080/02664763.2020.1796936

PMCID: PMC9041710 PMID: 35707261

Abstract

In this paper, two new general families of distributions supported on the unit interval are introduced. The proposed families include several known models as special cases, and each family defines at least twenty new special models. Since the list of well-being indicators may include several double-bounded random variables, the ability to model such variables is the major practical motivation for introducing the distributions in those families. We propose a parametrization of the new families in terms of the median and develop a Shiny application to provide interactive density shape illustrations for some special cases. Various properties of the introduced families are studied. Some special models in the new families are discussed. In particular, the complementary unit Weibull distribution is studied in some detail. The method of maximum likelihood for estimating the model parameters is discussed. An extensive Monte Carlo experiment is conducted to evaluate the performance of these estimators in finite samples. Applications to the literacy rate in Brazilian and Colombian municipalities illustrate the usefulness of the two new families for modeling well-being indicators.

Keywords: Extended Weibull distribution, maximum likelihood estimation, moments, unit distributions, well-being indicators

AMS CLASSIFICATION: 60E05

1. Introduction

When analyzing the elements of the standard of living, Sen [25] claimed that a ‘good life’ could be measured through indicators that consider the actual outcome of people's decisions and also their capabilities (the opportunities they have). The gross domestic product is usually used as a proxy for capabilities, and social indicators are used to measure actual outcomes (see Royuela and García [23]). The list of well-being indicators may include several double-bounded random variables, such as infant mortality, literacy, and murder rates; telephone, television, and internet availability; and the human development index. Those indicators also represent essential aspects of the international development agenda [28]. In this context, it is necessary to consider probability distributions that take those characteristics into account.

Some double bounded distributions that have been widely studied in the literature are the classical Beta and Kumaraswamy (Kw) distributions. Much theoretical work has been concentrated on the use of those models. However, the beta and Kw distributions are not always suitable for modeling well-being indicators (see Section 7). On the other hand, only a few papers have dealt with distributions supported on the unit interval. Gómez-Déniz et al. [6] studied the log-Lindley distribution. Mazucheli et al. [14] discussed the unit Birnbaum-Saunders distribution. Mazucheli et al. [12] defined the unit-Lindley distribution, and Ghitany et al. [5] the unit inverse-Gaussian distribution. Recently, Altun and Cordeiro [2] introduced the unit-improved second-degree Lindley distribution.

In this paper, we introduce a new family of distributions for modeling random variables with support on the unit interval. The so-called unit extended Weibull ( $U E W$ ) family may also be used to model double-bounded variates. We provide a comprehensive account of the mathematical properties of the proposed family of distributions, which offers a rich source of alternative models for analyzing bounded data. Additionally, the complementary unit extended Weibull ( $C U E W$ ) family is derived. We note five motivations for the proposed families of distributions:

The proposed families define at least forty new special models;
Some distributions commonly used for parametric models on the unit interval are special cases of the proposed families, such as the unit Weibull [15,16] distribution;
The expected value of the proposed models can be obtained in closed form;
The proposed family of distributions is median-parametrized, facilitating the interpretation of its location parameter;
Simulations and real data sets show the good performance of these new models (see Section 7).

The proposed families are obtained from a variable transformation in the extended Weibull ( $E W$ ) class of distributions, pioneered by [8]. Its cumulative distribution function (cdf) is given by

G (x) = 1 - \exp {- α H (x; ξ)},

(1)

where $x > 0, α > 0$ and $H (x; ξ)$ is a non-negative monotonically increasing function which depends on the parameter vector $ξ$ . The corresponding probability density function (pdf) is given by

g (x) = α h (x; ξ) \exp {- α H (x; ξ)},

(2)

where $h (x; ξ)$ is the derivative of $H (x; ξ)$ with respect to x. We emphasize that several well-known distributions can be obtained from different expressions of $H (x; ξ)$ and refer the reader to [24] for a detailed survey on the $E W$ special models, with corresponding $H (x; ξ)$ and $h (x; ξ)$ functions. Nadarajah and Kotz [21] and Pham and Lai [22] also give more details on this family.

The applicability to modeling well-being indicators is the major practical motivation for introducing the distributions in those families. Section 7 illustrates their relevance by means of applications to the literacy rates of Brazilian and Colombian citizens. Literacy levels are related to quality education and have been an international concern. At the World Education Forum in Dakar, Senegal, in 2000, 164 governments signed a global commitment to provide quality basic education for all children, youth, and adults. One of the six Dakar goals was to halve the illiteracy rate by 2015. By adopting these goals, Brazil and Colombia joined the group of countries committed to this achievement [30]. The literacy rate is also a useful indicator of poverty, often considered when evaluating the overall standard of living in a country. Messias [17] verified that this variable is strongly associated with life expectancy in Brazil. Ahnen [1] emphasizes that the literacy rate can be a control variable for Brazilian police violence. Massa [11] found statistically significant associations between self-rated health and area-level literacy rates in adults from the 27 Brazilian capitals. Finally, Royuela and García [23] use it to examine quality-of-life convergence in Colombia.

The rest of the paper is organized as follows. In Section 2, we define the new families. Section 3 provides some general properties, and Section 4 the parameter estimation by maximum likelihood. Section 5 presents some special distributions obtained from the $U E W$ and $C U E W$ classes. Some general properties and a simulation study of a $C U E W$ special model are presented in Section 6. Empirical applications and concluding remarks are presented in Sections 7 and 8, respectively.

2. The unit extended Weibull family

Let X be a non-negative random variable that follows an $E W$ distribution with cdf and pdf given in (1) and (2), respectively, denoted by $X \sim E W (α, ξ)$ . By considering the transformation $Y = e^{- X},$ we derive the $U E W$ class of distributions. Thus, the cdf of the new family is

F_{Y} (y) = \exp {- α H (- \log (y); ξ)},

(3)

where $y \in (0, 1)$ , $α > 0$ is a shape parameter, and $H (- \log (y); ξ)$ is obtained by applying this transformation to the random variable X. In the supplementary material, we prove that α is a location and scale-invariant measure. The vector $ξ$ contains shape parameters that depend on the chosen $H (- \log (y); ξ)$ function. We developed a Shiny application that provides interactive plots illustrating the density shapes of some special cases under variations in $ξ$ .¹

The pdf corresponding to (3) is

f_{Y} (y) = \frac{α}{y} h (- \log (y); ξ) \exp {- α H (- \log (y); ξ)},

(4)

where $h (- \log (y); ξ)$ is the derivative of $H (x; ξ)$ with respect to x evaluated in $- \log (y)$ . Hereafter, let Y be a random variable having the $U E W$ pdf (4) with parameters α and $ξ$ , say $Y \sim U E W (α, ξ) .$ The quantile function (qf) of the $U E W$ family can be expressed in terms of $H^{- 1} (\cdot)$ , which is the inverse function of $H (\cdot)$ . Therefore, the qf of Y has the form

Q_{Y} (u; α, ξ) = \exp {- H^{- 1} (- \frac{\log (u)}{α}; ξ)} .

(5)

The quantities $H (- \log (y); ξ)$ , $h (- \log y; ξ)$ , and the corresponding parameter vectors for some $E W$ special models are presented in the paper supplementary material. By replacing these quantities in (4), we obtain a new unit distribution on the $U E W$ family.
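As an illustration of Equations (3)-(5), the following minimal Python sketch builds the cdf, pdf and qf of a $U E W$ member from a generator $H$, its derivative $h$ and its inverse $H^{- 1}$. The function name `make_uew` and the parameter values are assumptions made for this example only; the Weibull generator $H (x; γ) = x^{γ}$ is used because it leads to the unit Weibull model mentioned in Section 1.

```python
import math

def make_uew(alpha, H, h, H_inv):
    """Return (cdf, pdf, qf) of a UEW member for a given EW generator.

    H, h and H_inv are the generator, its derivative and its inverse.
    They are illustrative callables, not names used in the paper.
    """
    def cdf(y):
        return math.exp(-alpha * H(-math.log(y)))            # Equation (3)

    def pdf(y):
        x = -math.log(y)
        return alpha / y * h(x) * math.exp(-alpha * H(x))    # Equation (4)

    def qf(u):
        return math.exp(-H_inv(-math.log(u) / alpha))        # Equation (5)

    return cdf, pdf, qf

# Weibull parent: H(x) = x**gamma (illustrative parameter values).
gamma = 1.5
cdf, pdf, qf = make_uew(
    alpha=2.0,
    H=lambda x: x ** gamma,
    h=lambda x: gamma * x ** (gamma - 1.0),
    H_inv=lambda v: v ** (1.0 / gamma),
)

y = 0.4
roundtrip = qf(cdf(y))  # should recover y up to rounding error

# Midpoint-rule check that the density integrates to about one on (0, 1).
m = 20000
area = sum(pdf((i + 0.5) / m) for i in range(m)) / m
```

Swapping in another generator (for instance the Gompertz or Lomax functions of Section 5) only requires changing the three callables.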

Next, we provide a different parameterization of the $U E W$ class distributions, in which one of its parameters corresponds to the median. The proposition below refers to the $U E W$ median-based parameterization.

Proposition 2.1

Let Y be a random variable with cdf given by

$F_{Y} (y) = 2^{- H (- \log (y); ξ) / H (- \log (μ); ξ)},$ (6)

where $y \in (0, 1)$ . Then Y belongs to the $U E W$ class of distributions, and $μ \in (0, 1)$ is a location parameter which corresponds to the median of Y.

Proof.

The results hold by setting $α = \log (2) / H (- \log (μ); ξ)$ in Equation (3) to get (6). Therefore, the qf of Y is given by

$Q_{Y} (u; μ, ξ) = \exp {- H^{- 1} (- \frac{\log (u)}{\log (2)} H (- \log (μ); ξ))},$ (7)

and it follows that, by taking u = 0.5 in Equation (7), μ is the median of Y. This completes the proof.

The pdf of Y can be written in the median-based parameterization as

f_{Y} (y) = \frac{\log (2) h (- \log (y); ξ)}{y H (- \log (μ); ξ)} 2^{- H (- \log (y); ξ) / H (- \log (μ); ξ)},

where $y \in (0, 1)$ .

Analogously, let $X \sim E W (α, ξ)$ and now consider the transformation $Z = 1 - e^{- X} .$

Under the above transformation, we may derive the $C U E W$ class of distributions. A similar approach was considered by [7] for obtaining a second kind of unit-Gamma distribution. Note that the $C U E W$ family can also be derived by taking the transformation Z = 1−Y. Thus, the cdf of this alternative unit family is

F_{Z} (z) = 1 - F_{Y} (1 - z) = 1 - \exp {- α H [- \log (1 - z); ξ]},

(8)

and the corresponding pdf reduces to

f_{Z} (z) = f_{Y} (1 - z) = \frac{α}{1 - z} h (- \log (1 - z); ξ) \exp \{- α H (- \log (1 - z); ξ)\},

(9)

where $h (- \log (1 - z); ξ)$ is the derivative of $H (x; ξ)$ with respect to x evaluated in $- \log (1 - z) .$ Hereafter, let Z be a random variable having the $C U E W$ pdf (9) with parameters α and $ξ$ , say $Z \sim C U E W (α, ξ) .$ The qf of Z has the form

Q_{Z} (u; α, ξ) = 1 - \exp {- H^{- 1} (- \frac{\log (1 - u)}{α}; ξ)} .

(10)

The quantities $H (- \log (1 - z); ξ)$ , $h (- \log (1 - z); ξ)$ , and the corresponding parameter vectors for some $E W$ special models are presented in the paper supplementary material. By replacing these quantities in (9), we obtain a new unit distribution on the $C U E W$ family.

The following result shows an alternative parametrization of the $C U E W$ class of distributions, in which one of its parameters corresponds to the median of the random variable Z.

Proposition 2.2

Let Z be a random variable with cdf given by

$F_{Z} (z) = 1 - 2^{- H [- \log (1 - z); ξ] / H [- \log (1 - μ); ξ]},$ (11)

where $z \in (0, 1)$ . Then Z belongs to the $C U E W$ class of distributions, and $μ \in (0, 1)$ is a location parameter which corresponds to the median of Z.

Proof.

The results hold by setting $α = \log (2) / H (- \log (1 - μ); ξ)$ in Equation (8) to get (11). Therefore, the qf of Z is given by

$Q_{Z} (u; μ, ξ) = 1 - \exp {- H^{- 1} [- \frac{\log (1 - u)}{\log (2)} H (- \log (1 - μ); ξ)]},$ (12)

and it follows that, by taking u = 0.5 in Equation (12), μ is the median of Z. This completes the proof.

The pdf of Z can be written in the median-based parameterization as

f_{Z} (z) = \frac{\log (2) h (- \log (1 - z); ξ)}{(1 - z) H (- \log (1 - μ); ξ)} 2^{- H (- \log (1 - z); ξ) / H (- \log (1 - μ); ξ)},

where $z \in (0, 1)$ .

3. General mathematical properties

In this section, we derive some useful statistical quantities for both introduced families, including the raw and incomplete moments. All the results of this section can be easily extended to the $U E W$ and $C U E W$ median-based parametrization by substituting α for $\log (2) / H (- \log (μ); ξ)$ and $\log (2) / H (- \log (1 - μ); ξ)$ , respectively.

3.1. Moments

Many of the important characteristics and features of a distribution are obtained through ordinary moments. The sth moment of Y, with $f_{Y} (y)$ from (4), is given by

E (Y^{s}) = E (e^{- s X}) = M_{X} (- s),

(13)

where $M_{X} (t) = \int_{0}^{\infty} e^{t x} g (x) d x$ is the moment generating function of $X \sim E W (α, ξ)$ , with $g (x)$ given in (2). Similar computations can be done for the $C U E W$ family. Note that the sth moment of Z can be written as $E (Z^{s}) = E [(1 - Y)^{s}] .$ Using the binomial theorem and the result in (13), it reduces to

E (Z^{s}) = \sum_{j = 0}^{s} \binom{s}{j} (- 1)^{j} M_{X} (- j) .

(14)
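As a worked illustration of (13) and (14), consider the simplest generator $H (x) = x$ with no additional parameters, so that X is exponential with rate α. This choice is an assumption made only for the example; it is not one of the special models studied in this paper.

```latex
f_{Y}(y) = α y^{α - 1}, \quad 0 < y < 1, \qquad
M_{X}(t) = \frac{α}{α - t}, \quad t < α,

E(Y^{s}) = M_{X}(-s) = \frac{α}{α + s},

E(Z) = 1 - M_{X}(-1) = \frac{1}{α + 1}, \qquad
E(Z^{2}) = 1 - \frac{2α}{α + 1} + \frac{α}{α + 2} = \frac{2}{(α + 1)(α + 2)} .
```

These agree with the moments of the Beta(α, 1) and Beta(1, α) distributions, which are the laws of Y and Z in this particular case.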

3.2. Incomplete moments

The sth incomplete moment of Y is defined as $T_{Y} (r, s) = \int_{0}^{r} y^{s} f_{Y} (y) d y$ . Taking $f_{Y} (y)$ from (4), we have that

T_{Y} (r, s) = α \int_{0}^{r} y^{s - 1} h (- \log (y); ξ) \exp {- α H (- \log (y); ξ)} d y .

Setting $u = α H (- \log (y); ξ)$ , we have $d u / d y = - α y^{- 1} h (- \log (y); ξ)$ and $y^{s} = \exp \{- s H^{- 1} (u / α; ξ)\}$ , and then

T_{Y} (r, s) = \int_{α H (- \log (r); ξ)}^{\infty} \exp {- [u + s H^{- 1} (u / α; ξ)]} d u .

(15)

Using the relationship between the introduced families, the sth incomplete moment of Z is given by $T_{Z} (r, s) = \int_{1 - r}^{1} (1 - y)^{s} f_{Y} (y) d y .$ Using the binomial theorem, and after some algebra, we can write

T_{Z} (r, s) = E (Z^{s}) - \sum_{j = 0}^{s} \binom{s}{j} (- 1)^{j} T_{Y} (1 - r, j) .

(16)
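For the same illustrative generator $H (x) = x$ (an assumption made only for this example, with $H^{- 1} (v) = v$), the integral in (15) can be evaluated in closed form, and the complementary result follows with $E (Z^{s})$ as the leading term:

```latex
T_{Y}(r, s) = \int_{- α \log (r)}^{\infty} \exp \{- u (1 + s / α)\} d u
            = \frac{α}{α + s} r^{α + s},

T_{Z}(r, 1) = E(Z) - [T_{Y}(1 - r, 0) - T_{Y}(1 - r, 1)]
            = \frac{1}{α + 1} - (1 - r)^{α} + \frac{α}{α + 1} (1 - r)^{α + 1} .
```

Both expressions match direct integration of $y^{s} α y^{α - 1}$ over $(0, r)$ and of $z α (1 - z)^{α - 1}$ over $(0, r)$, respectively, and $T_{Z} (1, 1) = E (Z)$ as expected.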

4. Maximum likelihood estimation

The conventional likelihood estimation techniques can be applied to estimate the parameters of the $U E W$ and $C U E W$ families. Let $Y_{1}, \dots, Y_{n}$ be a random sample of size n from the $U E W$ family distributions. Thus, the log-likelihood function for the parameter vector $Θ = (α, ξ^{⊤})^{⊤}$ can be written as

ℓ (Θ) = n \log α - \sum_{i = 1}^{n} \log (y_{i}) + \sum_{i = 1}^{n} \log [h (- \log (y_{i}); ξ)] - α \sum_{i = 1}^{n} H (- \log (y_{i}); ξ) .

(17)

The components of the score function $U (Θ) = (U_{α}, U_{ξ_{k}})^{⊤}$ are given by

U_{α} = \frac{n}{α} - \sum_{i = 1}^{n} H (- \log (y_{i}); ξ)

and

U_{ξ_{k}} = \sum_{i = 1}^{n} \frac{1}{h (- \log (y_{i}); ξ)} \frac{\partial h (- \log (y_{i}); ξ)}{\partial ξ_{k}} - α \sum_{i = 1}^{n} \frac{\partial H (- \log (y_{i}); ξ)}{\partial ξ_{k}} .

Setting $U_{α}$ and $U_{ξ_{k}}$ equal to zero and solving these equations simultaneously yields the maximum likelihood estimator (MLE), $\hat{Θ} = (\hat{α}, \hat{ξ}^{⊤})^{⊤}$ , of $Θ$ . In general, these equations cannot be solved analytically. Statistical software can be used to solve them numerically with iterative methods such as Newton-Raphson type algorithms.

However, for fixed $ξ$ , it is possible to obtain a semi-closed MLE of α. From $U_{α} = 0$ , the estimator of α is given by

\hat{α} (ξ) = \frac{n}{\sum_{i = 1}^{n} H (- \log (y_{i}); ξ)} .

Replacing α by $\hat{α} (ξ)$ in Equation (17) yields the profile log-likelihood for the parameter vector $ξ$ . Maximizing the profile log-likelihood may be simpler since it involves one less parameter. The log-likelihood function for the parameter vector $Θ = (μ, ξ^{⊤})^{⊤}$ , of the $U E W$ median-based parameterization, is obtained by setting $α = \log (2) / H (- \log (μ); ξ)$ in Equation (17). The component $U_{ξ_{k}}$ of the score function remains unchanged, and $U_{μ}$ is given by

U_{μ} = \frac{h (- \log (μ); ξ)}{μ H (- \log (μ); ξ)} [n - \frac{\log (2)}{H (- \log (μ); ξ)} \sum_{i = 1}^{n} H (- \log (y_{i}); ξ)] .

For special models with no additional parameters ( $ξ = \emptyset$ ), it is possible to obtain a closed-form MLE for μ. The results for the $C U E W$ family can be derived through similar computations: the log-likelihood of Z is obtained by replacing $y_{i}$ with $1 - z_{i}$ and μ with $1 - μ$ in the median-based version of (17).
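To make the closed-form case concrete, setting $U_{μ} = 0$ gives an equation for $H (- \log (\hat{μ}))$ that can be inverted directly when $H$ has no extra parameters. The generator $H (x) = x$ below is an illustrative assumption, not a model fitted in this paper.

```latex
U_{μ} = 0 \;\Longrightarrow\;
H(- \log (\hat{μ})) = \frac{\log (2)}{n} \sum_{i = 1}^{n} H(- \log (y_{i})),

H(x) = x: \qquad
\hat{μ} = \exp \Big\{ \frac{\log (2)}{n} \sum_{i = 1}^{n} \log (y_{i}) \Big\}
        = \Big( \prod_{i = 1}^{n} y_{i} \Big)^{\log (2) / n} .
```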

5. Some special models

In this section, we provide a few examples of unit distributions that arise as special models of the proposed families and are not yet defined in the literature. Specifically, the Gompertz and Lomax models are considered as parent distributions in both introduced families. Those models are introduced under the median-parametrization given in Section 2. The unit Weibull (UW) distribution was pioneered by [15] using a transformation of a Weibull random variable. We also note that it arises by considering the Weibull distribution as a parent model in the $E W$ family.

5.1. Unit Gompertz distribution

By considering the Gompertz distribution as the parent model in the $U E W$ family, we obtain the unit Gompertz (UGo) distribution, whose cdf and pdf take the form

F_{UGo} (y) = 2^{(y^{- β} - 1) / (1 - μ^{- β})}

and

f_{UGo} (y) = \frac{β \log (2)}{μ^{- β} - 1} y^{- (β + 1)} 2^{(y^{- β} - 1) / (1 - μ^{- β})},

(18)

respectively, where $y \in (0, 1),$ $β > 0$ is a shape parameter and $μ \in (0, 1)$ is the median parameter. For the pdf (18), it is easy to verify that

lim_{y \to 0} f_{UGo} (y) = 0 and lim_{y \to 1} f_{UGo} (y) = \frac{β \log (2)}{μ^{- β} - 1} .

The shape of the UGo pdf is described by the following proposition.

Proposition 5.1

Let Y be a random variable following the UGo distribution. Then, its density is unimodal with mode at

$y^{*} = {[\frac{β \log (2)}{(β + 1) (μ^{- β} - 1)}]}^{1 / β} .$

The proof of Proposition 5.1 can be found in the paper supplementary material. From Equation (5), the UGo qf is obtained as

Q_{UGo} (u) = {[1 + \frac{(1 - μ^{- β}) \log (u)}{\log (2)}]}^{- 1 / β} .

The UGo first raw moment reduces to

E_{UGo} (Y) = 2^{1 / (μ^{- β} - 1)} {(\frac{\log (2)}{μ^{- β} - 1})}^{1 / β} Γ (1 - \frac{1}{β}, \frac{\log (2)}{μ^{- β} - 1}),

where $Γ (a, b) = \int_{b}^{\infty} x^{a - 1} e^{- x} d x$ is the upper incomplete gamma function.
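The UGo expressions above can be checked numerically. The sketch below codes the cdf, pdf (18), qf and the mode of Proposition 5.1, then verifies that μ is the median and that the stated mode is a local maximum. The function names and the values μ = 0.3 and β = 2 are assumptions chosen for illustration; for these values the mode lies inside (0, 1).

```python
import math

def ugo_cdf(y, mu, beta):
    """UGo cdf in the median-based parameterization."""
    return 2.0 ** ((y ** -beta - 1.0) / (1.0 - mu ** -beta))

def ugo_pdf(y, mu, beta):
    """UGo pdf, Equation (18)."""
    scale = beta * math.log(2.0) / (mu ** -beta - 1.0)
    return scale * y ** -(beta + 1.0) * ugo_cdf(y, mu, beta)

def ugo_qf(u, mu, beta):
    """UGo quantile function."""
    base = 1.0 + (1.0 - mu ** -beta) * math.log(u) / math.log(2.0)
    return base ** (-1.0 / beta)

def ugo_mode(mu, beta):
    """Mode given in Proposition 5.1."""
    ratio = beta * math.log(2.0) / ((beta + 1.0) * (mu ** -beta - 1.0))
    return ratio ** (1.0 / beta)

mu, beta = 0.3, 2.0                 # illustrative values
median_check = ugo_cdf(mu, mu, beta)  # cdf evaluated at the median
mode = ugo_mode(mu, beta)
```

Evaluating `ugo_pdf` slightly to the left and right of `mode` gives smaller values, which is consistent with the unimodality stated in Proposition 5.1.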

5.2. Unit Lomax distribution

By considering the Lomax distribution as the parent model in the $U E W$ family, we derive the unit Lomax (UL) distribution, whose cdf and pdf take the form

F_{UL} (y) = {(1 - β^{- 1} \log (y))}^{- \log (2) / \log (1 - β^{- 1} \log (μ))}

and

\begin{aligned} f_{UL} (y) & = \frac{\log (2)}{β y} {[\log (1 - β^{- 1} \log (μ))]}^{- 1} \\ & \quad \times {(1 - β^{- 1} \log (y))}^{- \log (2) / \log (1 - β^{- 1} \log (μ)) - 1}, \end{aligned}

(19)

respectively, where $y \in (0, 1),$ $β > 0$ is a shape parameter and $μ \in (0, 1)$ is the median parameter. For the pdf (19), it can be verified that

lim_{y \to 0} f_{UL} (y) = \infty and lim_{y \to 1} f_{UL} (y) = \frac{\log (2)}{β} {[\log (1 - β^{- 1} \log (μ))]}^{- 1} .

Proposition 5.2

Let Y be a random variable following the UL distribution. Then, for $\log (μ) > β [1 - 2^{1 / (β - 1)}],$ its density is bathtub shaped with minimum at

$y^{*} = \exp {β - 1 - \frac{\log (2)}{\log (1 - β^{- 1} \log (μ))}} .$

The proof of Proposition 5.2 can be found in the paper supplementary material. From Equation (5), the UL qf is obtained as

Q_{UL} (u) = \exp \{β (1 - u^{- \log (1 - β^{- 1} \log (μ)) / \log (2)})\}

and its first raw moment as

E_{UL} (Y) = \frac{\log (2) e^{β} β^{\log (2) / \log (1 - β^{- 1} \log (μ))}}{\log (1 - β^{- 1} \log (μ))} Γ (- \frac{\log (2)}{\log (1 - β^{- 1} \log (μ))}, β) .
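A short numerical check of the UL model is sketched below. The quantile function is obtained here by solving $F_{UL} (y) = u$ for y directly, so that the round trip $F_{UL} (Q_{UL} (u)) = u$ holds by construction; the minimum of Proposition 5.2 is then compared with nearby density values. Function names and the values μ = 0.5 and β = 1.5 are illustrative assumptions (they satisfy the condition of Proposition 5.2).

```python
import math

def _c(mu, beta):
    """Exponent log(2) / log(1 - log(mu) / beta) shared by all UL formulas."""
    return math.log(2.0) / math.log(1.0 - math.log(mu) / beta)

def ul_cdf(y, mu, beta):
    return (1.0 - math.log(y) / beta) ** (-_c(mu, beta))

def ul_pdf(y, mu, beta):
    c = _c(mu, beta)
    return c / (beta * y) * (1.0 - math.log(y) / beta) ** (-c - 1.0)

def ul_qf(u, mu, beta):
    # Solve ul_cdf(y) = u for y: 1 - log(y)/beta = u ** (-1/c).
    return math.exp(beta * (1.0 - u ** (-1.0 / _c(mu, beta))))

def ul_argmin(mu, beta):
    """Location of the density minimum given in Proposition 5.2."""
    return math.exp(beta - 1.0 - _c(mu, beta))

mu, beta = 0.5, 1.5                  # illustrative values
y_star = ul_argmin(mu, beta)
roundtrip = ul_cdf(ul_qf(0.2, mu, beta), mu, beta)  # should be 0.2
```

For these values the density is larger on both sides of `y_star`, in line with the bathtub shape described in Proposition 5.2.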

5.3. Complementary unit Gompertz distribution

By considering the Gompertz distribution as the parent model in the $C U E W$ family, we obtain the complementary unit Gompertz (CUGo) distribution, whose cdf and pdf take the form

F_{CUGo} (z) = 1 - 2^{[(1 - z)^{- β} - 1] / [1 - (1 - μ)^{- β}]}

and

f_{CUGo} (z) = \frac{β \log (2)}{(1 - μ)^{- β} - 1} (1 - z)^{- (β + 1)} 2^{[(1 - z)^{- β} - 1] / [1 - (1 - μ)^{- β}]},

(20)

respectively, where $z \in (0, 1),$ $β > 0$ is a shape parameter and $μ \in (0, 1)$ is the median parameter.

For the pdf (20), it is easy to verify that

lim_{z \to 0} f_{CUGo} (z) = \frac{β \log (2)}{(1 - μ)^{- β} - 1} and lim_{z \to 1} f_{CUGo} (z) = 0.

The shape of the CUGo pdf is described by the following proposition.

Proposition 5.3

Let Z be a random variable following the CUGo distribution. Then its density is unimodal with mode at

$z^{*} = 1 - {[\frac{β \log (2)}{(β + 1) [(1 - μ)^{- β} - 1]}]}^{1 / β} .$

The proof of Proposition 5.3 can be found in the paper supplementary material. From Equation (10), the CUGo qf is obtained as

Q_{CUGo} (u) = 1 - {[1 + \frac{\log (1 - u)}{\log (2)} [1 - (1 - μ)^{- β}]]}^{- 1 / β},

and its first raw moment as

E_{CUGo} (Z) = 1 - 2^{1 / [(1 - μ)^{- β} - 1]} {(\frac{\log (2)}{(1 - μ)^{- β} - 1})}^{1 / β} Γ (1 - \frac{1}{β}, \frac{\log (2)}{(1 - μ)^{- β} - 1}) .

5.4. Complementary unit Lomax distribution

By considering the Lomax distribution as the parent model in the $C U E W$ family, we derive the complementary unit Lomax (CUL) distribution, whose cdf and pdf take the form

F_{CUL} (z) = 1 - {[1 - β^{- 1} \log (1 - z)]}^{- \log (2) / \log [1 - β^{- 1} \log (1 - μ)]}

and

\begin{aligned} f_{CUL} (z) & = \frac{\log (2)}{β (1 - z)} \{\log [1 - β^{- 1} \log (1 - μ)]\}^{- 1} \\ & \quad \times {[1 - β^{- 1} \log (1 - z)]}^{- \log (2) / \log [1 - β^{- 1} \log (1 - μ)] - 1}, \end{aligned}

(21)

respectively, where $z \in (0, 1),$ $β > 0$ is a shape parameter and $μ \in (0, 1)$ is the median parameter. For the pdf (21), it is easy to verify that

lim_{z \to 0} f_{CUL} (z) = \frac{\log (2)}{β} \{\log [1 - β^{- 1} \log (1 - μ)]\}^{- 1} and lim_{z \to 1} f_{CUL} (z) = \infty .

Proposition 5.4

Let Z be a random variable following the CUL distribution. Then, for $\log (1 - μ) > β [1 - 2^{1 / (β - 1)}],$ its density is bathtub shaped with minimum at

$z^{*} = 1 - \exp {β - 1 - \frac{\log (2)}{\log [1 - β^{- 1} \log (1 - μ)]}} .$

The proof of Proposition 5.4 can be found in the paper supplementary material. From Equation (10), the CUL qf is obtained as

Q_{CUL} (u) = 1 - \exp \{β [1 - (1 - u)^{- \log [1 - β^{- 1} \log (1 - μ)] / \log (2)}]\}

and its first raw moment as

E_{CUL} (Z) = 1 - \frac{\log (2) e^{β} β^{\log (2) / \log [1 - β^{- 1} \log (1 - μ)]}}{\log [1 - β^{- 1} \log (1 - μ)]} Γ (- \frac{\log (2)}{\log [1 - β^{- 1} \log (1 - μ)]}, β) .

6. The complementary unit Weibull distribution and its properties

The two-parameter Weibull distribution [31] is one of the most popular models for non-negative random processes. Its applications include reliability engineering, survival analysis in biomedical sciences, mortality studies, insurance, and social sciences, among others. In this section, we describe the mathematical properties of the complementary unit Weibull (CUW) distribution under the median-parametrization given in Section 2. Parameter estimation by the maximum likelihood method is presented, and a simulation study is carried out.

Thus, the median-based CUW cdf and pdf are

F_{CUW} (z) = 1 - 2^{- [\log (1 - z) / \log (1 - μ)]^{γ}}

and

f_{CUW} (z) = \frac{γ \log (2)}{1 - z} [- \log (1 - μ)]^{- γ} {[- \log (1 - z)]}^{γ - 1} 2^{- [\log (1 - z) / \log (1 - μ)]^{γ}},

(22)

respectively, where $γ > 0$ . Figure 1 illustrates its pdf shapes for some parameter values. From Equation (10), the qf of the CUW distribution is obtained as

Q_{CUW} (u) = 1 - (1 - μ)^{{[- \log (1 - u) / \log (2)]}^{1 / γ}} .

(23)

Based on previous results from Equation (14), the sth moment of the CUW model is given by

E_{CUW} (Z^{s}) = \sum_{j = 0}^{s} \binom{s}{j} (- 1)^{j} \sum_{k = 0}^{\infty} \frac{j^{k} \log^{k} (1 - μ)}{(\log 2)^{k / γ} k!} Γ (\frac{k}{γ} + 1) .

(24)
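The series in (24) can be evaluated numerically and compared with Table 1. The sketch below reads the inner summand as $j^{k} \log^{k} (1 - μ) Γ (k / γ + 1) / [(\log 2)^{k / γ} k!]$, truncates the sum at a fixed number of terms, and cross-checks the result against a midpoint-rule integral of $Q_{CUW} (u)^{s}$ over (0, 1). Function names and the truncation length are assumptions for illustration. The terms decay quickly for γ > 1; for γ < 1 the series does not appear to converge, and the numerical integral is the safer route there.

```python
import math

def cuw_qf(u, mu, gamma):
    """Quantile function of the CUW distribution, Equation (23)."""
    power = (-math.log(1.0 - u) / math.log(2.0)) ** (1.0 / gamma)
    return 1.0 - (1.0 - mu) ** power

def cuw_moment_series(s, mu, gamma, terms=60):
    """s-th raw moment from the truncated double sum in Equation (24)."""
    x = math.log(1.0 - mu) / math.log(2.0) ** (1.0 / gamma)
    total = 0.0
    for j in range(s + 1):
        inner = sum(
            (j * x) ** k * math.gamma(k / gamma + 1.0) / math.factorial(k)
            for k in range(terms)
        )
        total += math.comb(s, j) * (-1) ** j * inner
    return total

def cuw_moment_numeric(s, mu, gamma, m=20000):
    """E(Z^s) as the midpoint-rule integral of Q(u)^s over (0, 1)."""
    return sum(cuw_qf((i + 0.5) / m, mu, gamma) ** s for i in range(m)) / m

# Scenario with mu = 0.30 and gamma = 2.0 from Table 1.
m1 = cuw_moment_series(1, 0.30, 2.0)
m2 = cuw_moment_series(2, 0.30, 2.0)
```

For this scenario Table 1 reports a first moment of 0.3029 and a second moment of 0.1089, which the series reproduces to the printed precision.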

The hth cumulant ( $κ_{h}$ ) of the CUW model can be obtained from (24) using well-known relationships. We have that

κ_{h} = E_{CUW} (Z^{h}) - \sum_{k = 1}^{h - 1} \binom{h - 1}{k - 1} κ_{k} E_{CUW} (Z^{h - k}),

where $κ_{1} = E_{CUW} (Z)$ . Note that $κ_{2}$ is the variance of the CUW model. The skewness $γ_{1} = κ_{3} / κ_{2}^{3 / 2}$ and kurtosis $γ_{2} = κ_{4} / κ_{2}^{2}$ follow from the third and fourth standardized cumulants, respectively.

Table 1 provides a numerical study by computing the first four moments, variance, $γ_{1}$ , and $γ_{2}$ for ten different scenarios. Note that the parameterizations chosen are the same presented in Figure 1. All the quantities computed are in agreement with the behavior in those plots. Also, these illustrations indicate that the CUW distribution is quite flexible not only for density shapes but also regarding the moments, skewness, and kurtosis. It can accommodate positive and negative values for both skewness and kurtosis coefficients. Combining (15) and (16), and using the proposed median-parametrization, the sth incomplete moment of the CUW distribution is

\begin{aligned} T_{CUW} (r, s) & = E_{CUW} (Z^{s}) - \sum_{j = 0}^{s} \binom{s}{j} (- 1)^{j} \\ & \quad \times \int_{\log 2 {[\log (1 - r) / \log (1 - μ)]}^{γ}}^{\infty} \exp \{- [u + j (u / α)^{1 / γ}]\} d u, \end{aligned}

where $α = \log (2) / [- \log (1 - μ)]^{γ}$ . Using the exponential expansion in $\exp \{- j (u / α)^{1 / γ}\}$ , and after some algebra, it can be determined as

\begin{aligned} T_{CUW} (r, s) & = E_{CUW} (Z^{s}) - \sum_{j = 0}^{s} \binom{s}{j} (- 1)^{j} \sum_{k = 0}^{\infty} \frac{j^{k} \log^{k} (1 - μ)}{(\log 2)^{k / γ} k!} \\ & \quad \times Γ (\frac{k}{γ} + 1, \log 2 {[\frac{\log (1 - r)}{\log (1 - μ)}]}^{γ}) . \end{aligned}

Table 1. First four moments, variance, skewness and kurtosis coefficients for some scenarios of the CUW distribution.

μ	γ	$E_{CUW} (Z)$	$E_{CUW} (Z^{2})$	$E_{CUW} (Z^{3})$	$E_{CUW} (Z^{4})$	$κ_{2}$	$γ_{1}$	$γ_{2}$
0.10	0.5	0.2245	0.1250	0.0889	0.0704	0.0746	1.3435	0.7287
0.10	0.9	0.1406	0.0374	0.0136	0.0060	0.0177	1.4362	2.1116
0.15	1.5	0.1644	0.0370	0.0101	0.0031	0.0100	0.6986	0.1574
0.30	2.0	0.3029	0.1089	0.0437	0.0190	0.0172	0.1441	−0.5192
0.50	3.0	0.4871	0.2538	0.1389	0.0788	0.0166	−0.4517	−0.1373
0.70	0.7	0.6150	0.4992	0.4387	0.4000	0.1211	−0.4071	−1.3292
0.80	0.9	0.6897	0.5713	0.5040	0.4589	0.0956	−0.7420	−0.7841
0.80	2.5	0.7596	0.6026	0.4919	0.4098	0.0257	−1.1871	1.2709
0.85	2.7	0.8095	0.6757	0.5757	0.4980	0.0204	−1.4738	2.3962
0.90	1.2	0.7897	0.6859	0.6197	0.5722	0.0622	−1.3059	0.7269

6.1. Maximum likelihood estimation

Let $z_{1}, \dots, z_{n}$ be a random sample of size n from the CUW $(μ, γ)$ distribution. Let $Θ = (μ, γ)^{⊤}$ be the parameter vector. The log-likelihood function for $Θ$ can be expressed as

\begin{aligned} ℓ (Θ) & = n \log \{\frac{γ \log (2)}{{[- \log (1 - μ)]}^{γ}}\} - \sum_{i = 1}^{n} \log (1 - z_{i}) + (γ - 1) \sum_{i = 1}^{n} \log [- \log (1 - z_{i})] \\ & \quad - \frac{\log (2)}{{[- \log (1 - μ)]}^{γ}} \sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ} . \end{aligned}

(25)

The maximum likelihood estimates can be obtained by maximizing directly the Equation (25). Alternatively, we can obtain the score vector $U (Θ) = (U_{μ}, U_{γ})^{⊤}$ , set their components to zero and solve these equations simultaneously. For the CUW distribution, the $U (Θ)$ components are given by

U_{μ} = \frac{γ}{(1 - μ) \log (1 - μ)} [n - \frac{\log (2)}{[- \log (1 - μ)]^{γ}} \sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ}]

and

\begin{aligned} U_{γ} & = \frac{n}{γ} + \sum_{i = 1}^{n} \log [- \log (1 - z_{i})] - n \log [- \log (1 - μ)] \\ & \quad - \frac{\log (2)}{{[- \log (1 - μ)]}^{γ}} \sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ} \log [\frac{\log (1 - z_{i})}{\log (1 - μ)}] . \end{aligned}

As reported in Section 4, we note that for fixed γ, the semi-closed MLE of μ is given by

\hat{μ} (γ) = 1 - \exp {- {(\frac{\log (2)}{n} \sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ})}^{1 / γ}} .

(26)

By replacing μ by $\hat{μ} (γ)$ in Equation (25), we obtain the profile log-likelihood function for γ, expressed as

\begin{aligned} ℓ (γ) & = n \log (γ n) - n \log \{\sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ}\} - \sum_{i = 1}^{n} \log (1 - z_{i}) \\ & \quad + (γ - 1) \sum_{i = 1}^{n} \log [- \log (1 - z_{i})] - n . \end{aligned}

(27)

The score function for (27), $U_{γ}$ , is given by

U_{γ} = \frac{n}{γ} + \sum_{i = 1}^{n} \log [- \log (1 - z_{i})] - \frac{n \sum_{i = 1}^{n} \log [- \log (1 - z_{i})] [- \log (1 - z_{i})]^{γ}}{\sum_{i = 1}^{n} [- \log (1 - z_{i})]^{γ}} .
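The profile-likelihood procedure can be sketched in a few lines of Python. The code below draws a CUW sample by inverting (23), finds the root of the profile score by bisection, and then plugs the result into (26). Bisection is a simplification chosen for this sketch and is not the optimizer used in the paper; it relies on the profile score being decreasing in γ, as in the ordinary Weibull shape problem. The function names, the seed, the bracketing interval and the sample size are all illustrative assumptions.

```python
import math
import random

def cuw_sample(n, mu, gamma, rng):
    """Inverse-transform sampling from CUW(mu, gamma) via Equation (23)."""
    out = []
    for _ in range(n):
        # Keep u strictly inside (0, 1) so that every logarithm is finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        power = (-math.log(1.0 - u) / math.log(2.0)) ** (1.0 / gamma)
        out.append(1.0 - (1.0 - mu) ** power)
    return out

def profile_score(gamma, t):
    """Derivative of the profile log-likelihood (27); t_i = -log(1 - z_i)."""
    n = len(t)
    tg = [ti ** gamma for ti in t]
    weighted = sum(math.log(ti) * w for ti, w in zip(t, tg))
    return n / gamma + sum(math.log(ti) for ti in t) - n * weighted / sum(tg)

def cuw_fit(z, lo=0.05, hi=50.0, iters=100):
    """Profile-likelihood fit: bisection on the score, then Equation (26)."""
    t = [-math.log(1.0 - zi) for zi in z]
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if profile_score(mid, t) > 0.0:
            lo = mid   # score still positive: root lies to the right
        else:
            hi = mid
    gamma_hat = 0.5 * (lo + hi)
    mean_tg = sum(ti ** gamma_hat for ti in t) / len(t)
    mu_hat = 1.0 - math.exp(-(math.log(2.0) * mean_tg) ** (1.0 / gamma_hat))
    return mu_hat, gamma_hat

rng = random.Random(2020)
z = cuw_sample(2000, mu=0.3, gamma=2.0, rng=rng)
mu_hat, gamma_hat = cuw_fit(z)
```

With a sample this large, both estimates should land close to the generating values μ = 0.3 and γ = 2.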

6.2. Simulation study

We now present the results of Monte Carlo simulation studies conducted to evaluate the performance of the MLEs of the parameters that index the CUW distribution. The simulations are carried out in the R programming language, using the optim routine with the BFGS quasi-Newton nonlinear optimization algorithm. The inverse transform method is employed to generate a sample of size n from a CUW distribution using (23). We run 10,000 Monte Carlo replications for the sample sizes $n \in \{20, 50, 100, 300\}$ . Ten different combinations of the parameter vector $Θ$ are considered. The scenarios are those illustrated in Figure 1 and Table 1. Thus, the chosen parametrizations cover different density shapes and also various combinations of skewness and kurtosis coefficients.

The mean estimates, percentage relative bias (RB%), and root mean squared errors (RMSE) are computed by maximizing (27) and taking the MLE of μ from (26). One advantage of using the profile log-likelihood is that maximizing (27) is simpler than maximizing (25), since it involves only one parameter.

The results for each scenario are reported in Table 2. As expected, the RMSEs tend to decrease as the sample size increases. We also observe that the overall performance of the MLEs is appropriate. Note that when n = 300, the RB% is less than 1% for both parameter estimates and all the scenarios. In general, the $\hat{μ}$ estimates are more accurate than the $\hat{γ}$ estimates. They also present smaller RMSEs, mostly when n = 20.

Table 2. Monte Carlo results for the mean estimates, RB%, and RMSEs of the CUW distribution with 10,000 replications.

				Mean		RB%		RMSE
Scenario	μ	γ	n	$\hat{μ}$	$\hat{γ}$	$\hat{μ}$	$\hat{γ}$	$\hat{μ}$	$\hat{γ}$
1	0.1	0.5	20	0.1094	0.5381	9.3663	7.6129	0.0568	0.1094
			50	0.1032	0.5145	3.1661	2.8967	0.0353	0.0606
			100	0.1011	0.5067	1.0804	1.3404	0.0247	0.0408
			300	0.1000	0.5023	0.0246	0.4585	0.0150	0.0229
2	0.1	0.9	20	0.1014	0.9694	1.3847	7.7074	0.0303	0.1980
			50	0.0999	0.9246	−0.0722	2.7344	0.0199	0.1096
			100	0.1000	0.9140	−0.0176	1.5596	0.0143	0.0738
			300	0.0993	0.9037	−0.6941	0.4075	0.0105	0.0418
3	0.15	1.5	20	0.1500	1.6142	0.0273	7.6133	0.0271	0.3288
			50	0.1494	1.5479	−0.3772	3.1947	0.0190	0.1848
			100	0.1489	1.5195	−0.7447	1.3028	0.0157	0.1221
			300	0.1495	1.5075	−0.3650	0.4983	0.0100	0.0691
4	0.3	2.0	20	0.2982	2.1500	−0.6097	7.5018	0.0390	0.4364
			50	0.2983	2.0574	−0.5622	2.8685	0.0264	0.2427
			100	0.2988	2.0295	−0.3891	1.4747	0.0201	0.1632
			300	0.2983	2.0087	−0.5669	0.4353	0.0188	0.0921
5	0.5	3.0	20	0.4981	3.2285	−0.3823	7.6171	0.0349	0.6670
			50	0.4979	3.0818	−0.4291	2.7282	0.0281	0.3580
			100	0.4975	3.0376	−0.5095	1.2544	0.0242	0.2443
			300	0.4989	3.0145	−0.2177	0.4846	0.0151	0.1379
6	0.7	0.7	20	0.6847	0.7588	−2.1872	8.4058	0.1455	0.1571
			50	0.6884	0.7214	−1.6610	3.0541	0.1104	0.0847
			100	0.6889	0.7114	−1.5920	1.6234	0.0917	0.0579
			300	0.6938	0.7056	−0.8787	0.7933	0.0641	0.0321
7	0.8	0.9	20	0.7808	0.9725	−2.4046	8.0553	0.1215	0.2028
			50	0.7887	0.9273	−1.4069	3.0367	0.0852	0.1085
			100	0.7922	0.9119	−0.9774	1.3259	0.0686	0.0738
			300	0.7946	0.9043	−0.6741	0.4759	0.0566	0.0410
8	0.8	2.5	20	0.7950	2.6931	−0.6225	7.7256	0.0475	0.5538
			50	0.7953	2.5720	−0.5871	2.8793	0.0407	0.3062
			100	0.7976	2.5369	−0.3048	1.4757	0.0291	0.2035
			300	0.7987	2.5123	−0.1610	0.4934	0.0181	0.1154
9	0.85	2.7	20	0.8456	2.9057	−0.5143	7.6168	0.0402	0.5885
			50	0.8459	2.7742	−0.4882	2.7496	0.0328	0.3268
			100	0.8473	2.7411	−0.3166	1.5238	0.0255	0.2206
			300	0.8488	2.7101	−0.1457	0.3753	0.0165	0.1239
10	0.9	1.2	20	0.8849	1.2906	−1.6780	7.5480	0.0833	0.2609
			50	0.8878	1.2325	−1.3589	2.7100	0.0704	0.1464
			100	0.8930	1.2164	−0.7791	1.3626	0.0536	0.0993
			300	0.8965	1.2053	−0.3938	0.4416	0.0394	0.0550


In the supplementary material, we provide boxplots that illustrate the convergence of $\hat{μ}$ and $\hat{γ}$ for the first 100 replications at selected scenarios from Table 2. The results indicate that the precision of the MLEs improves as the sample size increases. In addition, both $\hat{μ}$ and $\hat{γ}$ exhibit high accuracy and precision when n = 300. For the small sample size n = 20, we note the presence of outliers that overestimate the true value of γ; this behavior is attenuated as n increases.
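The RB% and RMSE columns of Table 2 summarize replicated estimates. The following minimal Python sketch shows how such summaries can be computed, assuming the usual definitions (percentage relative bias of the mean estimate and root mean squared error around the true value). It is not the authors' R simulation script; the function names and the toy estimates are hypothetical.

```python
import math


def relative_bias_pct(estimates, true_value):
    """Percentage relative bias: 100 * (mean(estimates) - true) / true."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - true_value) / true_value


def rmse(estimates, true_value):
    """Root mean squared error of the replicated estimates."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse)


# Toy replicated estimates of gamma (hypothetical values, not from the paper).
gamma_true = 0.5
gamma_hats = [0.48, 0.55, 0.52, 0.61, 0.49]

print(relative_bias_pct(gamma_hats, gamma_true))
print(rmse(gamma_hats, gamma_true))
```

As a consistency check against Table 2, applying the same relative bias formula to the rounded mean estimate 0.5381 with true value 0.5 gives about 7.62, which agrees with the reported 7.6129 up to rounding of the mean.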

7. Applications

In what follows, we apply some special models of the $U E W$ and $C U E W$ families to two data sets on the literacy rate, which is defined as the proportion of people aged 15 years or older who can read or write a simple note. The first data set contains the literacy rates of 5565 cities in Brazil, measured during the 2010 census and available at http://datasus.saude.gov.br/. The second data set contains the literacy rates of 1107 cities in Colombia, measured during the 2005 census and available at http://microdatos.dane.gov.co/. The analysis is carried out with the AdequacyModel package [10] in the R programming language.

To model these data, we fit the classical beta and Kw distributions and five special models of the two introduced families: the UGo, ULo, CUGo, CULo and CUW distributions, whose densities are given by (18), (19), (20), (21) and (22), respectively. We also consider the unit gamma (UG) distribution, introduced by [7] and considered by [27] for hydrological applications. Mazucheli et al. [13] proposed two bias-corrected maximum likelihood estimators (MLEs) for both shape parameters of the UG distribution.

The UG, beta and Kw densities (for 0<y<1) are given by

\begin{aligned} f_{UG} (y) & = \left[\frac{μ^{1 / ϕ}}{1 - μ^{1 / ϕ}}\right]^{ϕ} \frac{1}{Γ (ϕ)} y^{μ^{1 / ϕ} / (1 - μ^{1 / ϕ}) - 1} [- \log (y)]^{ϕ - 1}, \\ f_{Beta} (y) & = \frac{Γ (ϕ)}{Γ (μ ϕ) Γ ((1 - μ) ϕ)} y^{μ ϕ - 1} (1 - y)^{(1 - μ) ϕ - 1}, \end{aligned}

and

f_{Kw} (y) = \frac{ϕ \log 0.5}{\log (1 - μ^{ϕ})} y^{ϕ - 1} (1 - y^{ϕ})^{\log 0.5 / \log (1 - μ^{ϕ}) - 1},

(28)

respectively. For the UG and beta models, $μ \in (0, 1)$ is the mean of Y, and $ϕ$ is a precision parameter. These parametrizations were presented by [19] and [4], respectively. For the Kw model, $μ \in (0, 1)$ is the median of the distribution, and $ϕ$ is a precision parameter. The pdf in (28) was previously presented by [18].
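The roles of μ in the three competing densities can be checked numerically. The sketch below is a minimal Python illustration, not the authors' code: it evaluates the UG, beta and Kw densities as written above, using the Brazilian estimates from Table 4 as example parameter values. The Kw cdf is not stated in the text; it is obtained here by integrating (28). The helper names and the midpoint-rule integrator are assumptions made for illustration.

```python
import math


def ug_pdf(y, mu, phi):
    """Unit gamma density with mean mu and precision phi."""
    t = mu ** (1.0 / phi)
    tau = t / (1.0 - t)
    log_f = (phi * math.log(tau) - math.lgamma(phi)
             + (tau - 1.0) * math.log(y)
             + (phi - 1.0) * math.log(-math.log(y)))
    return math.exp(log_f)


def beta_pdf(y, mu, phi):
    """Beta density with mean mu and precision phi."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_f = (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
             + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))
    return math.exp(log_f)


def kw_pdf(y, mu, phi):
    """Kumaraswamy density (28), parametrized by the median mu."""
    b = math.log(0.5) / math.log(1.0 - mu ** phi)
    return b * phi * y ** (phi - 1.0) * (1.0 - y ** phi) ** (b - 1.0)


def kw_cdf(y, mu, phi):
    """Kumaraswamy cdf, obtained by integrating (28) from 0 to y."""
    b = math.log(0.5) / math.log(1.0 - mu ** phi)
    return 1.0 - (1.0 - y ** phi) ** b


def integrate(f, n=20000):
    """Midpoint rule on (0, 1); never evaluates f at the endpoints."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


# mu is the mean for the UG and beta models ...
ug_mean = integrate(lambda y: y * ug_pdf(y, 0.842, 2.108))
beta_mean = integrate(lambda y: y * beta_pdf(y, 0.842, 13.349))
# ... and the median for the Kw model.
kw_at_median = kw_cdf(0.861, 0.861, 8.955)
kw_total_mass = integrate(lambda y: kw_pdf(y, 0.861, 8.955))

print(ug_mean, beta_mean, kw_at_median, kw_total_mass)
```

The two means should be close to 0.842, the Kw cdf evaluated at μ should be 0.5, and the Kw density should integrate to approximately 1.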

Table 3 gives a descriptive summary of the literacy rates for Brazilian and Colombian municipalities. In both countries, the mean and median are quite distant from the mode, and the variance is 0.01. Brazil exhibits higher values for all measures of central tendency considered, whereas Colombia exhibits the larger range. Both countries present negative skewness for this variable, which indicates that the mass of observations is concentrated on the right. This configuration is desirable because the variable is positively oriented: the higher the literacy rate, the better the country's educational development. According to UNESCO [29], ‘literate societies enable the free exchange of text based information and provide an array of opportunities for lifelong learning’. In addition, for Sen [26], basic education can be considered a semi-public good, which benefits not only the literate person but also society in general. We developed an interactive map dashboard as a tool for visualizing the literacy rate data; interested readers can refer to https://newdists.shinyapps.io/UEWfamilies/#section-literacy-rates.

Table 3. Descriptive statistics for literacy rates in both countries.

Country	Mean	Median	Mode	Variance	Skewness	Kurtosis	Min.	Max.
Brazil	0.8419	0.8708	0.93	0.01	−0.61	−0.72	0.53	0.99
Colombia	0.8039	0.8341	0.88	0.01	−1.57	3.67	0.18	0.98


The parameter estimates obtained by the maximum likelihood method and the corresponding standard errors (SEs) for all models are listed in Table 4 for the Brazilian and Colombian data sets. The corrected Cramér-von Mises statistic [3] $(W^{*})$ is also reported to evaluate the goodness-of-fit: the lower the value of $W^{*}$, the better the fit to the data. The SEs of the estimates for all fitted models are quite small. Among all fitted models, the figures in Table 4 indicate that the CUW model has the lowest value of $W^{*}$ for both data sets. Further, the other distributions in the proposed families are competitive with the classical models. This illustrates the relevance of the new families for modeling social indicators such as the literacy rate.
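For readers unfamiliar with $W^{*}$, the sketch below computes the corrected Cramér-von Mises statistic as we understand the procedure of Chen and Balakrishnan [3]: transform the sorted data with the fitted cdf, map to normal scores, standardize, map back to the unit interval, and compare with the plotting positions. This is a minimal Python illustration, not the AdequacyModel implementation. The evenly spaced toy data, the uniform candidate model and the Kw parameter values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def corrected_cvm(data, cdf):
    """Corrected Cramer-von Mises statistic W* for a candidate cdf."""
    normal = NormalDist()
    y = sorted(data)
    n = len(y)
    v = [cdf(yi) for yi in y]                 # probability integral transform
    z = [normal.inv_cdf(vi) for vi in v]      # normal scores
    z_bar, s_z = mean(z), stdev(z)            # stdev uses the n - 1 divisor
    u = [normal.cdf((zi - z_bar) / s_z) for zi in z]
    w2 = sum((ui - (2 * i - 1) / (2 * n)) ** 2
             for i, ui in enumerate(u, start=1)) + 1 / (12 * n)
    return w2 * (1 + 0.5 / n)


def kw_cdf(y, mu, phi):
    """Kumaraswamy cdf with median mu, obtained by integrating (28)."""
    b = math.log(0.5) / math.log(1.0 - mu ** phi)
    return 1.0 - (1.0 - y ** phi) ** b


# Toy data: evenly spaced points on (0.5, 0.99), i.e. uniform-like data.
n = 200
data = [0.5 + 0.49 * (i + 0.5) / n for i in range(n)]

# Candidate 1: uniform on (0.5, 0.99), which matches the toy data.
w_uniform = corrected_cvm(data, lambda y: (y - 0.5) / 0.49)
# Candidate 2: a Kw model with hypothetical parameters, which does not.
w_kw = corrected_cvm(data, lambda y: kw_cdf(y, 0.86, 9.0))

print(w_uniform, w_kw)
```

The candidate that matches the data should yield the smaller $W^{*}$, which is the sense in which lower values indicate a better fit. Note that the normal quantile step requires every transformed value to lie strictly between 0 and 1.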

Table 4. MLEs of the parameters from fitted models to literacy rates for Brazilian municipalities in 2010, and Colombian municipalities in 2005.

	Brazil			Colombia
Distribution	Parameter estimates		$W^{*}$	Parameter estimates		$W^{*}$
Beta $(μ, ϕ)$	0.8420	13.3490	7.3663	0.8020	14.0180	1.8003
	(0.0013)	(0.2520)		(0.0031)	(0.5864)
Kw $(μ, ϕ)$	0.8610	8.9550	6.8732	0.8190	8.7040	1.1929
	(0.0012)	(0.1308)		(0.0029)	(0.2712)
UG $(μ, ϕ)$	0.8420	2.1080	7.3735	0.8020	2.7690	1.8138
	(0.0013)	(0.0372)		(0.0031)	(0.1113)
UGo $(μ, β)$	0.8510	3.7220	13.9384	0.8340	1.2030	3.6124
	(0.0017)	(0.1096)		(0.0046)	(0.1184)
ULo $(μ, β)$	0.8920	0.6940	6.4610	0.8600	1.3700	1.2318
	(0.0015)	(0.0291)		(0.0042)	(0.1444)
CUGo $(μ, β)$	0.8780	1.2270	10.6856	0.8360	1.6040	1.4484
	(0.0015)	(0.0149)		(0.0034)	(0.0397)
CULo $(μ, β)$	0.7600	365.6600	6.1925	0.7080	183.819	1.4064
	(0.0046)	(69.7698)		(0.0109)	(59.528)
CUW $(μ, γ)$	0.8720	3.2070	5.0898	0.8300	3.5548	0.2798
	(0.0013)	(0.0329)		(0.0030)	(0.0808)


Note: SEs are given in parentheses below the corresponding estimates; $W^{*}$ is the goodness-of-fit statistic.

Figure 2 shows the data histograms with the fitted density functions for some of the competitive models introduced and for the beta distribution. This visual inspection indicates that the CUW distribution fits the Brazilian and Colombian literacy rates adequately. We also note that the plot agrees with the results in Table 3: for both samples, the CUW median MLE is very close to the observed median, and the CUGo provides the second-closest median estimate. The quantile-quantile plots (Figures 3 and 4) also show the superiority of the CUW distribution for modeling these data sets. Therefore, we conclude that the CUW distribution, a particular case of the $C U E W$ family, provides a good fit to the Brazilian and Colombian literacy rates, and that other distributions in the introduced families are also quite competitive. Finally, these results illustrate that the new models can be effective alternatives to the classical distributions for modeling bounded data.

Figure 3. Quantile-quantile plot of the beta, CUGo and CUW models for the literacy rates for Brazilian municipalities in 2010.

Figure 4. Quantile-quantile plot of the beta, CUGo and CUW models for the literacy rates for Colombian municipalities in 2005.

8. Concluding remarks

We define two new classes of distributions with bounded support, constructed by a simple and intuitive variable transformation of the extended Weibull family of distributions. The main properties of the families are derived, such as the quantile function, moments, and incomplete moments. Five special models in the families are described in some detail. The model parameters are estimated by maximum likelihood, and a Monte Carlo simulation study is performed to assess the performance of the maximum likelihood estimators. Applications to real data illustrate the importance and potential of the new families. In conclusion, we define a general approach for generating new unit interval distributions: at least forty distributions, some known and the great majority new. All computational codes are available as supplementary material. We hope these families of distributions may attract wider applications in statistics. Future work should explore a regression structure for the median and the zero-augmented $U E W$ family, assuming that the variable has a mixed continuous-discrete distribution, to model data that are observed on $[0, 1)$, $(0, 1]$ or $[0, 1]$.

Supplementary Material

Section_7_Application.R (6.5KB, r)

Section_6_Simulation.R (1.4KB, r)

Section_6_Figure1.R (1.5KB, r)

supplementarydata2.xlsx (37.7KB, xlsx)

supplementarydata1.xlsx (212.2KB, xlsx)

supp_UEW_JAS.pdf (416KB, pdf)

Acknowledgements

We gratefully acknowledge partial financial support from CAPES.

Note

https://newdists.shinyapps.io/UEWfamilies/#section-models.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Ahnen R.E., The politics of police violence in democratic Brazil, Lat. Am. Polit. Soc. 49 (2007), pp. 141–164. doi: 10.1111/j.1548-2456.2007.tb00377.x
2. Altun E. and Cordeiro G.M., The unit-improved second-degree Lindley distribution: Inference and regression modeling, Comput. Stat. 35 (2019), pp. 1–21.
3. Chen G. and Balakrishnan N., A general purpose approximate goodness-of-fit test, J. Qual. Technol. 27 (1995), pp. 154–161. doi: 10.1080/00224065.1995.11979578
4. Ferrari S.L.P. and Cribari-Neto F., Beta regression for modelling rates and proportions, J. Appl. Stat. 31 (2004), pp. 799–815. doi: 10.1080/0266476042000214501
5. Ghitany M.E., Mazucheli J., Menezes A.F.B., and Alqallaf F., The unit-inverse gaussian distribution: A new alternative to two-parameter distributions on the unit interval, Comm. Statist. Theory Methods 48 (2019), pp. 3423–3438. doi: 10.1080/03610926.2018.1476717
6. Gómez-Déniz E., Sordo M.A., and Calderín-Ojeda E., The log-Lindley distribution as an alternative to the beta regression model with applications in insurance, Insurance Math. Econom. 54 (2014), pp. 49–57. doi: 10.1016/j.insmatheco.2013.10.017
7. Grassia A., On a family of distributions with argument between 0 and 1 obtained by transformation of the gamma distribution and derived compound distributions, Aust. J. Statist. 19 (1977), pp. 108–114. doi: 10.1111/j.1467-842X.1977.tb01277.x
8. Gurvich M., DiBenedetto A., and Ranade S., A new statistical distribution for characterizing the random strength of brittle materials, J. Mater. Sci. 32 (1997), pp. 2559–2564. doi: 10.1023/A:1018594215963
10. Marinho P.R.D., Bourguignon M., and Dias C.R.B., AdequacyModel: Adequacy of probabilistic models and general purpose optimization, R package version 2.0.0, 2016. Available at https://CRAN.R-project.org/package=AdequacyModel.
11. Massa K.H.C., Pabayo R., and Chiavegatto Filho A.D.P., Income inequality and self-reported health in a representative sample of 27 017 residents of state capitals of Brazil, J. Public Health 40 (2018), pp. e440–e446. doi: 10.1093/pubmed/fdy022
12. Mazucheli J., Menezes A.F.B., and Chakraborty S., On the one parameter unit-Lindley distribution and its associated regression model for proportion data, J. Appl. Stat. 46 (2019), pp. 700–714. doi: 10.1080/02664763.2018.1511774
13. Mazucheli J., Menezes A.F.B., and Dey S., Improved maximum likelihood estimators for the parameters of the unit-gamma distribution, Comm. Statist. Theory Methods 47 (2017), pp. 3767–3778. doi: 10.1080/03610926.2017.1361993
14. Mazucheli J., Menezes A.F.B., and Dey S., The unit-Birnbaum-Saunders distribution with applications, Chil. J. Stat. 9 (2018), pp. 47–57.
15. Mazucheli J., Menezes A.F.B., Fernandes L.B., de Oliveira R.P., and Ghitany M.E., The unit-Weibull distribution and associated inference, J. Appl. Probab. Stat. 13 (2019), pp. 1–22. doi: 10.18576/amis/13S101
16. Mazucheli J., Menezes A.F.B., Fernandes L.B., de Oliveira R.P., and Ghitany M.E., The unit-Weibull distribution as an alternative to the Kumaraswamy distribution for the modeling of quantiles conditional on covariates, J. Appl. Stat. 47 (2019), pp. 954–974. doi: 10.1080/02664763.2019.1657813
17. Messias E., Income inequality, illiteracy rate, and life expectancy in Brazil, Amer. J. Public Health 93 (2003), pp. 1294–1296. doi: 10.2105/AJPH.93.8.1294
18. Mitnik P.A. and Baek S., The Kumaraswamy distribution: Median-dispersion re-parameterizations for regression modeling and simulation-based estimation, Statist. Papers 54 (2013), pp. 177–192. doi: 10.1007/s00362-011-0417-y
19. Mousa A.M., El-Sheikh A.A., and Abdel-Fattah M.A., A gamma regression for bounded continuous variables, Adv. Appl. Stat. 49 (2016), pp. 305–326.
21. Nadarajah S. and Kotz S., On some recent modifications of Weibull distribution, IEEE Trans. Reliab. 54 (2005), pp. 561–562. doi: 10.1109/TR.2005.858811
22. Pham H. and Lai C.D., On recent generalizations of the Weibull distribution, IEEE Trans. Reliab. 56 (2007), pp. 454–458. doi: 10.1109/TR.2007.903352
23. Royuela V. and García G.A., Economic and social convergence in Colombia, Reg. Stud. 49 (2015), pp. 219–239. doi: 10.1080/00343404.2012.762086
24. Santos-Neto M., Bourguignon M., Zea L.M., Nascimento A.D., and Cordeiro G.M., The Marshall-Olkin extended Weibull family of distributions, J. Stat. Distrib. Appl. 1 (2014), pp. 9. doi: 10.1186/2195-5832-1-9
25. Sen A., The standard of living: Lecture I, concepts and critiques, The Standard of Living, Cambridge: Cambridge University Press, 1987, pp. 1–19.
26. Sen A., Development as Freedom, Alfred A. Knopf, New York, 1999.
27. Tadikamalla P.R., On a family of distributions obtained by the transformation of the gamma distribution, J. Stat. Comput. Simul. 13 (1981), pp. 209–214. doi: 10.1080/00949658108810497
28. UN General Assembly, Work of the statistical commission pertaining to the 2030 agenda for sustainable development (A/RES/71/313), UN General Assembly, New York, NY, USA 2017.
29. UNESCO, Education for All Global Monitoring Report 2006: Education for All. Literacy for life, Oxford University Press, 2005.
30. UNESCO, Education for All Global Monitoring Report 2015, UNESCO, Paris, 2015.
31. Weibull W., A statistical distribution of wide applicability, J. Appl. Mech. 18 (1951), pp. 293–297.
