Computational Statistics. 2022 Nov 16:1–24. Online ahead of print. doi: 10.1007/s00180-022-01296-3

An extended approach for the generalized powered uniform distribution

Carlos Rondero-Guerrero, Isidro González-Hernández, Carlos Soto-Campos
PMCID: PMC9667452  PMID: 36405879

Abstract

A new uniform distribution model, generalized powered uniform distribution (GPUD), which is based on incorporating the parameter k into the probability density function (pdf) associated with the power of random variable values and includes a powered mean operator, is introduced in this paper. From this new model, the shape properties of the pdf as well as the higher-order moments, the moment generating function, the model that simulates the GPUD and other important statistics can be derived. This approach allows the generalization of the distribution presented by Jayakumar and Sankaran (2016) through the new GPUD(J-S) distribution. Two sets of real data related to COVID-19 and bladder cancer were tested to demonstrate the proposed model’s potential. The maximum likelihood method was used to calculate the parameter estimators by applying the maxLik package in R. The results showed that this new model is more flexible and useful than other comparable models.

Keywords: Generalized uniform distribution, Maximum likelihood estimation, COVID-19

Introduction

In recent years, several researchers have proposed different generalizations of new distribution functions of continuous random variables to more broadly model various behaviors related to survival analysis, such as the useful life of a computer. This also allows the study of hazard functions to determine the reliability of devices subject to use and deterioration. Additionally, these extended distributions provide greater flexibility in modeling various real-life problems (Nassar et al. 2018; Sankaran and Jayakumar 2016; Torabi et al. 2018).

This research follows the approach presented in the seminal article of Marshall and Olkin (1997) and other authors, for example, Alshangiti et al. (2014); Jayakumar and Sankaran (2016); and Jose and Krishna (2011), who continued this research and reported different approaches and a new model family under the Marshall–Olkin extended uniform distribution.

The uniform distribution U(0, 1) is closely related to other distributions since it is of fundamental importance to generate random numbers in simulation processes through the inverse transform method, which in turn allows the numerical evaluation of the behavior of statistical models (Law et al. 2000).

Previously reported results, such as those by Rondero-Guerrero et al. (2020), motivated us to consider that statistical distributions are frequently applied to models of real-world phenomena in different areas, such as medical sciences, finance, engineering, and economics, where the analysis of risk and survival functions is included due to their relevance in modeling data from various systems. Therefore, we decided to extend this work and apply it to real-world data to develop a new model that better fits the analyzed data and can be used in many areas, for example, public health policy design.

Our approach is based on the generalization of a new family of functions of the uniform distribution, where an average that we refer to as the powered mean is incorporated. This generalization includes the parameter k, which appears in the power of the values taken by the continuous random variable (Rondero-Guerrero et al. 2020). Furthermore, this proposal generalizes the work of Jayakumar and Sankaran (2016).

This article is structured as follows. In Sect. 2, the new generalized powered uniform distribution (GPUD) family is defined and discussed. In Sect. 3, some of its properties, such as the hazard function, survival function, moment generating function, moments, and statistics such as quantiles, median, skewness, and kurtosis, are presented. In Sect. 4, the GPUD approach is used to generalize the work of Jayakumar and Sankaran (2016), which yields another family of distributions. This family contains the parameter k as the power of the values of the random variable, which provides greater flexibility to this and other models. Section 4 also presents expressions for estimating the parameters of the generalization and a simulation study that assesses the performance of the maximum likelihood estimators for specific sample sizes. In Sect. 5, the generalization of the Jayakumar and Sankaran (2016) model is applied to two sets of real data, which allows us to fit the proposed model and show empirically that it is more appropriate than other models, such as the Weibull, exponentiated Weibull (Pal et al. 2006), new Marshall–Olkin Weibull (Cui et al. 2020) and Marshall–Olkin exponentiated generalized (García et al. 2020) models. Finally, the conclusions are presented in Sect. 6.

A new family of uniform distribution functions

Based on our research (see Rondero-Guerrero et al. (2020)), a new family of distribution functions, the generalized powered uniform distribution, with a continuous random variable X, was introduced. The respective probability density function is defined as

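$$f_{X_k}(x)=\begin{cases}\dfrac{x^{k}}{(b-a)\,M_k(a,b)}, & a\le x\le b;\ k=0,1,\ldots,n,\\[4pt] 0, & x<a \ \text{or}\ x>b.\end{cases} \tag{1}$$

The term $M_k(a,b)$ is an operator, herein referred to as the powered mean, which is expressed as

$$M_k(a,b)=\frac{\sum_{j=0}^{k}a^{k-j}b^{j}}{k+1}. \tag{2}$$

For $k=0$, $M_0(a,b)=1$ and Eq. (1) reduces to the uniform distribution on $(a,b)$, whose mean is $\mu=\frac{a+b}{2}$; with $a=0$ and $b=1$ we recover $U(0,1)$. It is easy to show that Eq. (1) is a well-defined pdf for every $k$.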
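The definitions in Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `powered_mean`, `gpud_pdf` and `midpoint_integral` are illustrative, and the check assumes $0\le a<b$. It compares the powered mean with the closed form $(b^{k+1}-a^{k+1})/((k+1)(b-a))$ and verifies that the density integrates to approximately one.

```python
def powered_mean(k, a, b):
    """Powered mean M_k(a, b) of Eq. (2): sum_{j=0..k} a^(k-j) b^j / (k+1)."""
    return sum(a ** (k - j) * b ** j for j in range(k + 1)) / (k + 1)


def gpud_pdf(x, k, a=0.0, b=1.0):
    """GPUD density of Eq. (1); zero outside [a, b]."""
    if x < a or x > b:
        return 0.0
    return x ** k / ((b - a) * powered_mean(k, a, b))


def midpoint_integral(f, lo, hi, steps=20_000):
    """Composite midpoint rule, used here only as a numerical sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    k, a, b = 3, 1.0, 2.0  # illustrative values, not taken from the paper
    closed_form = (b ** (k + 1) - a ** (k + 1)) / ((k + 1) * (b - a))
    # The two numbers printed on this line should agree.
    print(powered_mean(k, a, b), closed_form)
    # The integral of the density over [a, b] should be close to 1.
    print(midpoint_integral(lambda x: gpud_pdf(x, k, a, b), a, b))
```

The closed form follows from $\int_a^b x^k\,dx=(b-a)\,M_k(a,b)$, which is the identity that makes Eq. (1) integrate to one.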

The distribution function (df) corresponding to Eq. (1) is given as

$$F_{X_k}(x)=P(X<x)=\begin{cases}0, & x<a,\\[2pt] \dfrac{x^{k+1}-a^{k+1}}{b^{k+1}-a^{k+1}}, & a\le x\le b;\ k=0,1,\ldots,n,\\[6pt] 1, & x>b.\end{cases} \tag{3}$$

It should be noted that Eqs. (1) and (3) represent any member of a new family of probability density and distribution functions for different values of the parameter $k$. These results show that there are many polynomial functions satisfying the pdf and df conditions, that is, $F'_{X_k}(x)=f_{X_k}(x)$.

When $a=0$ and $b=1$, the df of the GPUD takes the form

$$F_{X_k}(x)=x^{k+1},\quad \text{for } k=0,1,2,\ldots,n. \tag{4}$$

It is important to note that the above expression will be used in Sect. 3, where the survival function is defined as

$$\bar{F}_{X_k}(x)=1-F_{X_k}(x)=1-x^{k+1}. \tag{5}$$

The graphs of $f_{X_k}(x)$ and $F_{X_k}(x)$ are shown in Figs. 1 and 2, respectively, for different values of $k$. Figure 1 shows that the probability mass shifts toward the right end of the interval as the value of the parameter $k$ increases. Graphical properties are of great importance because they allow researchers and professional users of statistical methods to determine whether any of these distributions fit the dataset of a specific application.

Fig. 1. $f_{X_k}(x)$ for $a=0$, $b=1$, and $k=0,1,2,3,4,5$

Fig. 2. $F_{X_k}(x)$ for $a=0$, $b=1$, and $k=0,1,2,3,4,5$

General properties of the GPUD

In this section, the general properties of the GPUD are examined to show the flexibility of this new family of distributions that will allow the development of a generalization of the model proposed by Jayakumar and Sankaran (2016), which is studied in Sect. 4.

Hazard function and survival function

From Eqs. (1) and (3), the hazard function and the survival function, respectively, can be obtained as shown below:

$$h_{X_k}(x)=\frac{f_{X_k}(x)}{\bar{F}_{X_k}(x)}=\frac{(k+1)x^{k}}{1-x^{k+1}},\quad \text{for } a=0,\ b=1,\ \text{and } k=0,1,\ldots,n. \tag{6}$$

In what follows, we will use the more common notation for the survival function, $S_{X_k}(x)=\bar{F}_{X_k}(x)$, where

$$S_{X_k}(x)=1-F_{X_k}(x)=1-x^{k+1} \tag{7}$$

for $a=0$, $b=1$, and $k=0,1,2,\ldots,n$.

Survival analysis is a topic of great importance for researchers in many disciplines. Both the hazard and survival functions are defined for a nonnegative continuous or discrete random variable X, which is related to data from a lifetime and allows the formulation of statistical models in areas such as medicine, engineering, and biology. For instance, in biomedical research, survival analysis is applied to a random variable related to the time that elapses from the onset of a disease until the patient either recovers or dies.

When working with the hazard function, researchers are interested in the properties of the graphs, as they are useful in identifying whether the distribution can model increasing or decreasing failure rates. Figures 3 and 4 show the shapes of $h_{X_k}(x)$ and $S_{X_k}(x)$, respectively, for different values of $k$.

Fig. 3. $h_{X_k}(x)$ for $a=0$, $b=1$, and $k=0,1,2,3,4,5$

Fig. 4. $S_{X_k}(x)$ for $a=0$, $b=1$, and $k=0,1,2,3,4,5$

Moment generating function

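The moment generating function is of great importance in the calculation of higher-order moments. For the GPUD, it admits a compact expression in terms of the powered mean.

Theorem 1

The moment generating function of the pdf is given as

$$\phi_{X_k}(t)=\sum_{l=0}^{\infty}\frac{t^{l}}{l!}\cdot\frac{M_{k+l}}{M_k}. \tag{8}$$

Proof

The moment generating function $\phi_{X_k}(t)$ is defined as

$$\phi_{X_k}(t)=E[e^{tx}]=\int_{-\infty}^{\infty}e^{tx}f_{X_k}(x)\,dx=\frac{1}{(b-a)M_k(a,b)}\int_{a}^{b}e^{tx}x^{k}\,dx. \tag{9}$$

By expanding $e^{tx}$ in a Taylor series, one has

$$\phi_{X_k}(t)=E[e^{tx}]=\frac{1}{(b-a)M_k(a,b)}\int_{a}^{b}\left[1+\frac{tx}{1!}+\frac{(tx)^{2}}{2!}+\frac{(tx)^{3}}{3!}+\cdots\right]x^{k}\,dx, \tag{10}$$

and

$$\phi_{X_k}(t)=1+\frac{t}{1!}\cdot\frac{M_{k+1}(a,b)}{M_k(a,b)}+\frac{t^{2}}{2!}\cdot\frac{M_{k+2}(a,b)}{M_k(a,b)}+\frac{t^{3}}{3!}\cdot\frac{M_{k+3}(a,b)}{M_k(a,b)}+\cdots, \tag{11}$$

or in terms of the series, one has

$$\phi_{X_k}(t)=\sum_{l=0}^{\infty}\frac{t^{l}}{l!}\,\frac{M_{k+l}}{M_k}. \tag{12}$$

From the above expression, it is possible to obtain all the moments $\mu_x^{r}$.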
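As a short worked illustration of the series above, consider $a=0$ and $b=1$. Only the $j=k$ term of the powered mean survives, so $M_k(0,1)=1/(k+1)$ and

$$\phi_{X_k}(t)=\sum_{l=0}^{\infty}\frac{t^{l}}{l!}\cdot\frac{k+1}{k+l+1},\qquad \phi'_{X_k}(0)=\frac{k+1}{k+2}.$$

For $k=0$ this gives $\phi_{X_0}(t)=\sum_{l=0}^{\infty}\frac{t^{l}}{(l+1)!}=\frac{e^{t}-1}{t}$, the familiar moment generating function of $U(0,1)$, with mean $\phi'_{X_0}(0)=1/2$.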

Moments

Next, the calculation of the higher-order moments of the GPUD is shown, which allows the determination of the mean and variance, among others.

Theorem 2

The rth moment of the pdf is given as

$$\mu_x^{r}=E_k[x^{r}]=\frac{M_{k+r}(a,b)}{M_k(a,b)}. \tag{13}$$

Proof

The rth moment can be written as

$$\mu_x^{r}=E_k[x^{r}]=\int_{-\infty}^{\infty}x^{r}f_{X_k}(x)\,dx=\int_{a}^{b}x^{r}\,\frac{x^{k}}{(b-a)M_k(a,b)}\,dx, \tag{14}$$

from which one can obtain

$$\mu_x^{r}=\frac{1}{(b-a)M_k(a,b)}\int_{a}^{b}x^{k+r}\,dx=\frac{b^{k+r+1}-a^{k+r+1}}{(b-a)\,M_k(a,b)\,(k+r+1)}. \tag{15}$$

Therefore,

$$\mu_x^{r}=E_k[x^{r}]=\frac{M_{k+r}(a,b)}{M_k(a,b)}. \tag{16}$$

It is worth noting the advantage of the last equation in calculating higher-order moments. For the case where $a=0$ and $b=1$, it is expressed as

$$\mu_x^{r}(0,1)=\frac{M_{k+r}(0,1)}{M_k(0,1)}=\frac{k+1}{k+r+1}. \tag{17}$$

For the $U(0,1)$ distribution, $k=0$. Thus, for the first moment, $r=1$, the mean is $\mu(0,1)=1/2$. From the previous result, we can calculate the corresponding variance

$$\sigma_k^{2}=\mu_k^{2}-\left(\mu_k^{1}\right)^{2} \tag{18}$$

for $k=0,1,2,\ldots,n$, where $\mu_k^{r}$ denotes the rth moment. This result can be rewritten as

$$\sigma_k^{2}=\frac{M_{k+2}}{M_k}-\left(\frac{M_{k+1}}{M_k}\right)^{2}. \tag{19}$$

We emphasize that the higher-order central moments $E_k[(x-\mu_k^{1})^{r}]$ are necessarily given in terms of the powered mean operator, which shows its relevance, although the calculations become more involved.

For the skewness and kurtosis coefficients, the calculation gives

$$\gamma_{3k}=E\left[\left(\frac{x-\mu}{\sigma}\right)^{3}\right]=\frac{1}{\sigma^{3}}\left[E(x^{3})-3\mu E(x^{2})+2\mu^{3}\right], \tag{20}$$

$$\gamma_{4k}=E\left[\left(\frac{x-\mu}{\sigma}\right)^{4}\right]=\frac{1}{\sigma^{4}}\left[E(x^{4})-4\mu E(x^{3})+6\mu^{2}E(x^{2})-3\mu^{4}\right]. \tag{21}$$

To demonstrate the flexibility of the GPUD, Table 1 shows $\mu_k$, $\sigma_k^{2}$, $\gamma_{3k}$, and $\gamma_{4k}$ for $a=0$, $b=1$, and different values of $k$; the kurtosis column reports the excess kurtosis $\gamma_{4k}-3$. The data in the table indicate that the GPUD has negative skewness for $k\ge 1$. Furthermore, the excess kurtosis is positive for $k\ge 2$, so the GPUD is leptokurtic for those values of $k$.

Table 1.

Mean, variance and the coefficients of skewness and kurtosis for the GPUD

(a, b, k) Mean Variance Skewness Excess kurtosis
(0, 1, 0) 0.5 0.083 0 −1.2
(0, 1, 1) 0.666 0.055 −0.565 −0.6
(0, 1, 2) 0.75 0.037 −0.860 0.095
(0, 1, 3) 0.8 0.026 −1.049 0.696
(0, 1, 4) 0.833 0.019 −1.183 1.2
(0, 1, 5) 0.857 0.015 −1.283 1.62
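The entries of Table 1 follow from Eqs. (17) to (21). The sketch below is an illustration under the assumption $a=0$, $b=1$; the function names are hypothetical. It evaluates the raw moments $(k+1)/(k+r+1)$ and derives the mean, variance, skewness and excess kurtosis from them.

```python
def raw_moment(r, k):
    """r-th raw moment of the GPUD on (0, 1), Eq. (17): (k+1)/(k+r+1)."""
    return (k + 1) / (k + r + 1)


def gpud_summary(k):
    """Return (mean, variance, skewness, excess kurtosis) for a=0, b=1."""
    m1, m2, m3, m4 = (raw_moment(r, k) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2                                   # Eq. (18)
    third_central = m3 - 3 * m1 * m2 + 2 * m1 ** 3       # bracket of Eq. (20)
    fourth_central = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4  # Eq. (21)
    skewness = third_central / var ** 1.5
    excess_kurtosis = fourth_central / var ** 2 - 3.0
    return m1, var, skewness, excess_kurtosis


if __name__ == "__main__":
    for k in range(6):
        mean, var, skew, kurt = gpud_summary(k)
        print(f"k={k}: mean={mean:.3f} var={var:.3f} skew={skew:.3f} kurt={kurt:.3f}")
```

Subtracting 3 in the last step is what produces the value $-1.2$ for the uniform case $k=0$.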

Simulation, quantiles and median

Using Eq. (3), the random variable X of GPUD(a, b, k) can be simulated as

$$x=\left(a^{k+1}+\left(b^{k+1}-a^{k+1}\right)u\right)^{1/(k+1)}, \tag{22}$$

where $u$ is a realization of the standard uniform distribution. In addition, the qth quantile of GPUD(a, b, k) is given as

$$x=\left(a^{k+1}+\left(b^{k+1}-a^{k+1}\right)q\right)^{1/(k+1)}. \tag{23}$$

Table 2 shows the median of the GPUD for different values of the parameter $k$.

Table 2.

Medians of the GPUD distribution

a b k Median
0 1 0 0.5
0 1 1 0.707
0 1 2 0.793
0 1 3 0.840
0 1 4 0.870
0 1 5 0.890
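Equations (22) and (23) translate directly into an inverse transform sampler. The following sketch uses only the Python standard library; the names `gpud_quantile` and `gpud_sample` are illustrative, and the code assumes $0\le a<b$ so that the $(k+1)$-th root is real.

```python
import random


def gpud_quantile(q, k, a=0.0, b=1.0):
    """q-th quantile of GPUD(a, b, k), Eq. (23)."""
    lo, hi = a ** (k + 1), b ** (k + 1)
    return (lo + (hi - lo) * q) ** (1.0 / (k + 1))


def gpud_sample(n, k, a=0.0, b=1.0, seed=None):
    """Draw n values by applying Eq. (22) to standard uniform draws."""
    rng = random.Random(seed)
    return [gpud_quantile(rng.random(), k, a, b) for _ in range(n)]


if __name__ == "__main__":
    # Medians for a=0, b=1 are the values 0.5 ** (1 / (k + 1)).
    for k in range(6):
        print(k, gpud_quantile(0.5, k))
    draws = gpud_sample(20_000, k=2, seed=1)
    # The sample mean should be near the theoretical mean (k+1)/(k+2) = 0.75.
    print(sum(draws) / len(draws))
```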

Generalization of Jayakumar and Sankaran’s (2016) distribution using the GPUD approach

Having presented the characteristics of the GPUD family, we now show how this approach provides greater versatility in modeling specific statistical applications and data analysis, which has allowed us to generalize the results obtained by Jayakumar and Sankaran (2016). These authors introduced what they referred to as the generalized uniform distribution (GUD), with parameters $(\alpha,\theta)$. The survival function reported by the same authors is based on the truncated negative binomial distribution, which allows the Marshall and Olkin (1997) model to be generalized as shown below:

$$\bar{G}(x;\alpha,\theta)=\frac{\alpha^{\theta}}{1-\alpha^{\theta}}\left[\left[F(x)+\alpha\bar{F}(x)\right]^{-\theta}-1\right],\quad \text{for } \theta>0,\ \alpha>0,\ \text{and } x\in\mathbb{R}. \tag{24}$$

The parameter $\theta$ adds range and flexibility to the Marshall and Olkin (1997) model, whose survival function is given as

$$\bar{G}(x;\alpha)=\frac{\alpha\bar{F}(x)}{1-(1-\alpha)\bar{F}(x)},\quad \alpha>0,\ \text{and } x\in\mathbb{R}. \tag{25}$$

Note that if $\theta=1$, Eq. (24) reduces to the model in Eq. (25).
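The reduction can be verified in one line. Setting $\theta=1$ in Eq. (24) and writing $F=1-\bar{F}$ gives

$$\bar{G}(x;\alpha,1)=\frac{\alpha}{1-\alpha}\left[\frac{1}{F+\alpha\bar{F}}-1\right]=\frac{\alpha}{1-\alpha}\cdot\frac{(1-\alpha)\bar{F}}{1-(1-\alpha)\bar{F}}=\frac{\alpha\bar{F}}{1-(1-\alpha)\bar{F}},$$

where the middle step uses $1-F-\alpha\bar{F}=(1-\alpha)\bar{F}$ and $F+\alpha\bar{F}=1-(1-\alpha)\bar{F}$.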

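Jayakumar and Sankaran (2016) considered the df $F(x)=x$ and the survival function $\bar{F}(x)=1-x$ of the uniform distribution. Substituting them into Eq. (24) results in

$$\bar{G}(x;\alpha,\theta)=\frac{\alpha^{\theta}}{1-\alpha^{\theta}}\left[\left[x(1-\alpha)+\alpha\right]^{-\theta}-1\right],\quad \theta>0,\ 0<\alpha<1. \tag{26}$$

The corresponding df is

$$G(x;\alpha,\theta)=\frac{1-\alpha^{\theta}\left[x(1-\alpha)+\alpha\right]^{-\theta}}{1-\alpha^{\theta}}, \tag{27}$$

and the pdf is given as

$$g(x;\alpha,\theta)=\frac{(1-\alpha)\,\theta\,\alpha^{\theta}}{(1-\alpha^{\theta})\left[x(1-\alpha)+\alpha\right]^{\theta+1}}. \tag{28}$$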

GPUD approach

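Using the GPUD approach, we now introduce $F_{X_k}(x)=x^{k+1}$ and $S_{X_k}(x)=\bar{F}_{X_k}(x)=1-F_{X_k}(x)=1-x^{k+1}$, for $k=0,1,2,\ldots,n$ and $0<x<1$, and we obtain a new family of distributions with three parameters $(\alpha,\theta,k)$, which will be referred to as GPUD(J-S). Its survival function is expressed as

$$\bar{G}_{(J\text{-}S)}(x;\alpha,\theta,k)=\frac{\alpha^{\theta}}{1-\alpha^{\theta}}\left[\left[x^{k+1}(1-\alpha)+\alpha\right]^{-\theta}-1\right] \tag{29}$$

for $\theta>0$, $0<\alpha<1$, and $k=0,1,2,3,\ldots,n$.

Therefore, the df of this new GPUD(J-S) family is given as

$$G_{(J\text{-}S)}(x;\alpha,\theta,k)=\frac{1-\alpha^{\theta}\left[x^{k+1}(1-\alpha)+\alpha\right]^{-\theta}}{1-\alpha^{\theta}}. \tag{30}$$

The corresponding pdf is given as

$$g_{(J\text{-}S)}(x;\alpha,\theta,k)=\frac{\alpha^{\theta}}{1-\alpha^{\theta}}\cdot\frac{\theta(k+1)(1-\alpha)x^{k}}{\left[(1-\alpha)x^{k+1}+\alpha\right]^{\theta+1}}. \tag{31}$$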
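A quick way to gain confidence in Eqs. (30) and (31) is to check numerically that the pdf is the derivative of the df. The sketch below is illustrative only; `js_cdf` and `js_pdf` are hypothetical names, and the parameter values are chosen inside the stated range $\theta>0$, $0<\alpha<1$.

```python
def js_cdf(x, alpha, theta, k):
    """Distribution function of GPUD(J-S), Eq. (30), for 0 <= x <= 1."""
    u = (1.0 - alpha) * x ** (k + 1) + alpha
    return (1.0 - alpha ** theta * u ** (-theta)) / (1.0 - alpha ** theta)


def js_pdf(x, alpha, theta, k):
    """Density of GPUD(J-S), Eq. (31), for 0 <= x <= 1."""
    u = (1.0 - alpha) * x ** (k + 1) + alpha
    scale = alpha ** theta / (1.0 - alpha ** theta)
    return scale * theta * (k + 1) * (1.0 - alpha) * x ** k / u ** (theta + 1)


if __name__ == "__main__":
    alpha, theta, k = 0.3, 5.0, 3   # illustrative values
    x, h = 0.4, 1e-6
    # Central difference of the df; it should agree with the pdf at x.
    slope = (js_cdf(x + h, alpha, theta, k) - js_cdf(x - h, alpha, theta, k)) / (2 * h)
    print(slope, js_pdf(x, alpha, theta, k))
    # Boundary values of the df: close to 0 at x = 0 and to 1 at x = 1.
    print(js_cdf(0.0, alpha, theta, k), js_cdf(1.0, alpha, theta, k))
```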

Note: If k=0, θ>1 and θ1, the GUD is obtained as given by Jayakumar and Sankaran (2016). In Figs. 5 and 6, k=3, θ=5 and several values of α are considered.

Fig. 5. $g_{(J\text{-}S)}(x)$ for $k=3$, $\theta=5$, and $\alpha=0.1,0.3,0.7,1.3,5$

Fig. 6. $G_{(J\text{-}S)}(x)$ for $k=3$, $\theta=5$, and $\alpha=0.1,0.3,0.7,1.3,5$

Hazard and survival functions of GPUD(J-S)

The generalization presented herein is relevant because it allows the creation of a wide range of different hazard functions that can be applied to various analyses of survival or reliability studies in diverse areas, such as medicine, engineering, and economics. Generally, we work with a one-dimensional and continuous random variable defined on $[0,\infty)$, which measures the time between events, unless otherwise indicated.

The hazard function given by Jayakumar and Sankaran (2016) is

$$h(x;\alpha,\theta)=\frac{g(x;\alpha,\theta)}{\bar{G}(x;\alpha,\theta)}=\frac{\theta(1-\alpha)}{\left[x(1-\alpha)+\alpha\right]\left[1-\left(x(1-\alpha)+\alpha\right)^{\theta}\right]}. \tag{32}$$

In our case, using GPUD(J-S), where $\bar{F}(x)=1-x^{k+1}$, we obtain a new family of hazard functions given in terms of the parameter $k$:

$$h_{(J\text{-}S)}(x;\alpha,\theta,k)=\frac{\theta(1-\alpha)(k+1)x^{k}}{\left[\alpha+(1-\alpha)x^{k+1}\right]\left[1-\left(\alpha+(1-\alpha)x^{k+1}\right)^{\theta}\right]}. \tag{33}$$

In Fig. 7, the behavior of the hazard function $h_{(J\text{-}S)}(x;\alpha,\theta,k)$ of GPUD(J-S) is shown for different values of $(\alpha,\theta,k)$. It is important to note that if we substitute $k=0$ in Eq. (33), we obtain Eq. (32), that is, the same result reported by the cited authors.
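The closed form in Eq. (33) can be checked against the ratio of the pdf in Eq. (31) to the survival function in Eq. (29), and against Eq. (32) for $k=0$. The sketch below is a hedged illustration with hypothetical function names and arbitrary parameter values in the range $\theta>0$, $0<\alpha<1$.

```python
def js_survival(x, alpha, theta, k):
    """Survival function of GPUD(J-S), Eq. (29)."""
    u = (1.0 - alpha) * x ** (k + 1) + alpha
    return alpha ** theta / (1.0 - alpha ** theta) * (u ** (-theta) - 1.0)


def js_pdf(x, alpha, theta, k):
    """Density of GPUD(J-S), Eq. (31)."""
    u = (1.0 - alpha) * x ** (k + 1) + alpha
    scale = alpha ** theta / (1.0 - alpha ** theta)
    return scale * theta * (k + 1) * (1.0 - alpha) * x ** k / u ** (theta + 1)


def js_hazard(x, alpha, theta, k):
    """Hazard function of GPUD(J-S), closed form of Eq. (33)."""
    u = alpha + (1.0 - alpha) * x ** (k + 1)
    return theta * (1.0 - alpha) * (k + 1) * x ** k / (u * (1.0 - u ** theta))


def gud_hazard(x, alpha, theta):
    """Hazard function of the two-parameter GUD, Eq. (32)."""
    u = x * (1.0 - alpha) + alpha
    return theta * (1.0 - alpha) / (u * (1.0 - u ** theta))


if __name__ == "__main__":
    alpha, theta, k, x = 0.4, 7.0, 2, 0.6   # illustrative values
    # The closed form and the ratio pdf / survival should agree.
    print(js_hazard(x, alpha, theta, k),
          js_pdf(x, alpha, theta, k) / js_survival(x, alpha, theta, k))
    # With k = 0 the three-parameter hazard coincides with Eq. (32).
    print(js_hazard(x, alpha, theta, 0), gud_hazard(x, alpha, theta))
```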

Fig. 7. $h_{(J\text{-}S)}(x)$ for $k=2$, $\theta=7$, and $\alpha=0.1,0.4,0.9,1.6,2.5$

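The behavior of the survival function $S_{(J\text{-}S)}(x;\alpha,\theta,k)$ of GPUD(J-S) for different values of $(\alpha,\theta,k)$ is shown in Fig. 8, where

$$S_{(J\text{-}S)}(x;\alpha,\theta,k)=1-\frac{1-\alpha^{\theta}\left[x^{k+1}(1-\alpha)+\alpha\right]^{-\theta}}{1-\alpha^{\theta}}. \tag{34}$$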

Fig. 8. $S_{(J\text{-}S)}(x)$ for $k=2$, $\theta=7$, and $\alpha=0.1,0.4,0.9,1.6,2.5$

Our model notably provides greater versatility due to the presence of the parameter k, which is an exponent in the random variable’s values, as shown in Figs. 7 and 8.

Parameter estimation of GPUD(J-S)

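There are many methods for estimating unknown parameters from data. In our case, we consider maximum likelihood estimation (MLE) (Okasha and Kayid 2016; Torabi et al. 2018). For a sample $(x_1,x_2,\ldots,x_n)$ of the random variable, we start from Eq. (31), in which the additional parameter $k$ is introduced to obtain the generalization proposed in this article. The likelihood function is given as

$$L(x;\alpha,\theta,k)=\prod_{i=1}^{n}g_{(J\text{-}S)}(x_i;\alpha,\theta,k), \tag{35}$$

which from our proposal is expressed as

$$L(x;\alpha,\theta,k)=\left[\frac{\alpha^{\theta}\,\theta(k+1)(1-\alpha)}{1-\alpha^{\theta}}\right]^{n}\cdot\prod_{i=1}^{n}x_i^{k}\left[(1-\alpha)x_i^{k+1}+\alpha\right]^{-(\theta+1)}. \tag{36}$$

Note the relevance of the parameter $k$ in the previous expression, which comes from the generalization of the work by Jayakumar and Sankaran (2016), where only the parameters $\alpha$ and $\theta$ appear.

The corresponding log-likelihood function is given as

$$\ln L(x;\alpha,\theta,k)=n\cdot\ln\left[\frac{\alpha^{\theta}\,\theta(k+1)(1-\alpha)}{1-\alpha^{\theta}}\right]-(\theta+1)\sum_{i=1}^{n}\ln\left[(1-\alpha)x_i^{k+1}+\alpha\right]+k\sum_{i=1}^{n}\ln x_i. \tag{37}$$

To obtain the covariance matrix $I^{-1}(\beta)$ and the corresponding estimators, the partial derivatives of the log-likelihood function are calculated and are given as

$$\frac{\partial\ln L}{\partial\alpha}=-\frac{n}{1-\alpha}+\frac{n\theta}{\alpha}+\frac{n\,\alpha^{\theta-1}\theta}{1-\alpha^{\theta}}-(\theta+1)\sum_{i=1}^{n}\frac{1-x_i^{k+1}}{x_i^{k+1}(1-\alpha)+\alpha}, \tag{38}$$

$$\frac{\partial\ln L}{\partial\theta}=\frac{n}{\theta}+n\cdot\ln(\alpha)+\frac{\alpha^{\theta}\,n\cdot\ln(\alpha)}{1-\alpha^{\theta}}-\sum_{i=1}^{n}\ln\left[x_i^{k+1}(1-\alpha)+\alpha\right], \tag{39}$$

$$\frac{\partial\ln L}{\partial k}=\frac{n}{k+1}-(\theta+1)\sum_{i=1}^{n}\frac{(1-\alpha)\,x_i^{k+1}\ln x_i}{x_i^{k+1}(1-\alpha)+\alpha}+\sum_{i=1}^{n}\ln x_i. \tag{40}$$

The maximum likelihood estimators can be obtained numerically by solving the equations $\frac{\partial\ln L}{\partial\alpha}=0$, $\frac{\partial\ln L}{\partial\theta}=0$, and $\frac{\partial\ln L}{\partial k}=0$.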
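The log-likelihood of Eq. (37) and the score equations (38) to (40) can be coded and cross-checked with finite differences before they are handed to a numerical optimizer. The sketch below is not the authors' implementation (they report using the maxLik package in R); it is a standard-library Python illustration with hypothetical names, a tiny made-up sample in $(0,1)$, and $k$ treated as a continuous parameter.

```python
from math import log


def js_loglik(params, xs):
    """Log-likelihood of GPUD(J-S), Eq. (37); params = (alpha, theta, k)."""
    alpha, theta, k = params
    n = len(xs)
    const = (theta * log(alpha) + log(theta) + log(k + 1)
             + log(1.0 - alpha) - log(1.0 - alpha ** theta))
    sum_log_u = sum(log((1.0 - alpha) * x ** (k + 1) + alpha) for x in xs)
    sum_log_x = sum(log(x) for x in xs)
    return n * const - (theta + 1.0) * sum_log_u + k * sum_log_x


def js_score(params, xs):
    """Partial derivatives of the log-likelihood, Eqs. (38)-(40)."""
    alpha, theta, k = params
    n = len(xs)
    a_t = alpha ** theta
    d_alpha = -n / (1.0 - alpha) + n * theta / alpha + n * theta * alpha ** (theta - 1) / (1.0 - a_t)
    d_theta = n / theta + n * log(alpha) + n * a_t * log(alpha) / (1.0 - a_t)
    d_k = n / (k + 1)
    for x in xs:
        p = x ** (k + 1)
        u = (1.0 - alpha) * p + alpha
        d_alpha -= (theta + 1.0) * (1.0 - p) / u
        d_theta -= log(u)
        d_k += log(x) - (theta + 1.0) * (1.0 - alpha) * p * log(x) / u
    return d_alpha, d_theta, d_k


def numeric_score(params, xs, h=1e-6):
    """Central finite differences of js_loglik, used only as a check."""
    grads = []
    for j in range(3):
        up = list(params); up[j] += h
        down = list(params); down[j] -= h
        grads.append((js_loglik(up, xs) - js_loglik(down, xs)) / (2 * h))
    return tuple(grads)


if __name__ == "__main__":
    sample = [0.10, 0.35, 0.50, 0.80]   # made-up data for illustration
    point = (0.3, 4.5, 2.0)
    # The analytic and numeric gradients should agree closely.
    print(js_score(point, sample))
    print(numeric_score(point, sample))
```

Agreement between the two gradients is a cheap guard against sign or index slips when the derivatives are supplied to an optimizer.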

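On the other hand, the second derivatives of the log-likelihood function of GPUD(J-S) with respect to $\alpha$, $\theta$, and $k$ are given as

$$\frac{\partial^{2}\ln L}{\partial\alpha^{2}}=-\frac{n}{(1-\alpha)^{2}}-\frac{n\theta}{\alpha^{2}}+\frac{n\left(\alpha^{\theta}\theta^{2}-\alpha^{\theta}\theta+\alpha^{2\theta}\theta\right)}{\alpha^{2}\left(1-\alpha^{\theta}\right)^{2}}-(\theta+1)\sum_{i=1}^{n}\left[-\frac{\left(1-x_i^{k+1}\right)^{2}}{\left(x_i^{k+1}(1-\alpha)+\alpha\right)^{2}}\right], \tag{41}$$

$$\frac{\partial^{2}\ln L}{\partial\theta^{2}}=n\left[-\theta^{-2}+\frac{\alpha^{\theta}(\ln\alpha)^{2}}{1-\alpha^{\theta}}+\frac{\left(\alpha^{\theta}\right)^{2}(\ln\alpha)^{2}}{\left(1-\alpha^{\theta}\right)^{2}}\right], \tag{42}$$

$$\frac{\partial^{2}\ln L}{\partial k^{2}}=-n(k+1)^{-2}-(\theta+1)\sum_{i=1}^{n}\frac{x_i^{k+1}(\ln x_i)^{2}(1-\alpha)}{x_i^{k+1}(1-\alpha)+\alpha}+(\theta+1)\sum_{i=1}^{n}\frac{\left(x_i^{k+1}\right)^{2}(\ln x_i)^{2}(1-\alpha)^{2}}{\left(x_i^{k+1}(1-\alpha)+\alpha\right)^{2}}, \tag{43}$$

$$\frac{\partial^{2}\ln L}{\partial\theta\,\partial k}=-\sum_{i=1}^{n}\frac{x_i^{k+1}\ln x_i\,(1-\alpha)}{x_i^{k+1}(1-\alpha)+\alpha}, \tag{44}$$

$$\frac{\partial^{2}\ln L}{\partial\alpha\,\partial k}=(\theta+1)\sum_{i=1}^{n}\frac{x_i^{k+1}\ln x_i}{x_i^{k+1}(1-\alpha)+\alpha}+(\theta+1)\sum_{i=1}^{n}\frac{\left(1-x_i^{k+1}\right)x_i^{k+1}\ln x_i\,(1-\alpha)}{\left(x_i^{k+1}(1-\alpha)+\alpha\right)^{2}}, \tag{45}$$

$$\frac{\partial^{2}\ln L}{\partial\alpha\,\partial\theta}=n\left[\alpha^{-1}+\frac{\alpha^{\theta}\theta\ln\alpha}{\alpha\left(1-\alpha^{\theta}\right)}+\frac{\alpha^{\theta}}{\alpha\left(1-\alpha^{\theta}\right)}+\frac{\left(\alpha^{\theta}\right)^{2}\ln(\alpha)\,\theta}{\left(1-\alpha^{\theta}\right)^{2}\alpha}\right]-\sum_{i=1}^{n}\frac{-x_i^{k+1}+1}{x_i^{k+1}(1-\alpha)+\alpha}. \tag{46}$$

The information matrix of $\beta=(\alpha,\theta,k)$, evaluated at the maximum likelihood estimate $\hat{\beta}=(\hat{\alpha},\hat{\theta},\hat{k})$, is given as

$$I(\beta)=-E\begin{bmatrix}\dfrac{\partial^{2}\ln L}{\partial\alpha^{2}} & \dfrac{\partial^{2}\ln L}{\partial\alpha\,\partial\theta} & \dfrac{\partial^{2}\ln L}{\partial\alpha\,\partial k}\\[8pt] \dfrac{\partial^{2}\ln L}{\partial\alpha\,\partial\theta} & \dfrac{\partial^{2}\ln L}{\partial\theta^{2}} & \dfrac{\partial^{2}\ln L}{\partial\theta\,\partial k}\\[8pt] \dfrac{\partial^{2}\ln L}{\partial\alpha\,\partial k} & \dfrac{\partial^{2}\ln L}{\partial\theta\,\partial k} & \dfrac{\partial^{2}\ln L}{\partial k^{2}}\end{bmatrix}. \tag{47}$$

Therefore, the covariance matrix is $I^{-1}(\beta)$. The approximate $(1-\delta)100\%$ confidence intervals for the parameters $\alpha$, $\theta$, and $k$ are $\hat{\alpha}\pm Z_{\delta/2}\sqrt{V(\hat{\alpha})}$, $\hat{\theta}\pm Z_{\delta/2}\sqrt{V(\hat{\theta})}$, and $\hat{k}\pm Z_{\delta/2}\sqrt{V(\hat{k})}$, respectively, where $V(\hat{\alpha})$, $V(\hat{\theta})$, and $V(\hat{k})$ are the variances of $\hat{\alpha}$, $\hat{\theta}$, and $\hat{k}$, given by the elements of the principal diagonal of the matrix $I^{-1}(\beta)$, and $Z_{\delta/2}$ is the upper $\delta/2$ percentile of the standard normal distribution.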

Simulation, quantiles and the median of GPUD(J-S)

In this work, we follow the inverse transform approach used by several authors, in which random numbers are generated by evaluating the inverse of the df at uniform random numbers. To simulate random numbers from GPUD(J-S)$(x;\alpha,\theta,k)$, we invert Eq. (30) and obtain

$$X=\left\{\frac{\alpha}{1-\alpha}\left[\left(1-Y\left(1-\alpha^{\theta}\right)\right)^{-1/\theta}-1\right]\right\}^{1/(k+1)}, \tag{48}$$

where $Y\sim U(0,1)$ and $k=1,2,\ldots,n$.

For the particular case of $k=0$, the result reported by Jayakumar and Sankaran (2016) is obtained:

$$X=\frac{\alpha}{1-\alpha}\left[\left(1-Y\left(1-\alpha^{\theta}\right)\right)^{-1/\theta}-1\right].$$

The qth quantile of the generalization is also of interest because it provides information about the location of the distribution. It is given as

$$x_q=\left\{\frac{\alpha}{1-\alpha}\left[\left(\frac{1}{1-q\left(1-\alpha^{\theta}\right)}\right)^{1/\theta}-1\right]\right\}^{1/(k+1)}. \tag{49}$$

In particular, the median is obtained by putting $q=1/2$ in Eq. (49).

In Eq. (49), for $k=0$, the result reported by Jayakumar and Sankaran (2016) is obtained:

$$x_q=\frac{\alpha}{1-\alpha}\left[\left(\frac{1}{1-q\left(1-\alpha^{\theta}\right)}\right)^{1/\theta}-1\right].$$
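The inversion of Eq. (30) can be sketched as a small sampler. The code below is an illustration under the assumptions $\theta>0$ and $0<\alpha<1$; the names `js_cdf`, `js_quantile` and `js_sample` are hypothetical. It checks that the quantile function is the inverse of the df and draws a sample by inverse transform.

```python
import random


def js_cdf(x, alpha, theta, k):
    """Distribution function of GPUD(J-S), Eq. (30)."""
    u = (1.0 - alpha) * x ** (k + 1) + alpha
    return (1.0 - alpha ** theta * u ** (-theta)) / (1.0 - alpha ** theta)


def js_quantile(q, alpha, theta, k):
    """Inverse of Eq. (30): the value x with js_cdf(x) = q, for 0 <= q < 1."""
    inner = (1.0 - q * (1.0 - alpha ** theta)) ** (-1.0 / theta) - 1.0
    return (alpha / (1.0 - alpha) * inner) ** (1.0 / (k + 1))


def js_sample(n, alpha, theta, k, seed=None):
    """Inverse transform sampling: apply js_quantile to uniform draws."""
    rng = random.Random(seed)
    return [js_quantile(rng.random(), alpha, theta, k) for _ in range(n)]


if __name__ == "__main__":
    alpha, theta, k = 0.3, 4.5, 4   # illustrative values
    median = js_quantile(0.5, alpha, theta, k)
    # Round trip: the df evaluated at the median should be close to 0.5.
    print(median, js_cdf(median, alpha, theta, k))
    draws = js_sample(20_000, alpha, theta, k, seed=7)
    # Roughly half of the draws should fall below the median.
    print(sum(d <= median for d in draws) / len(draws))
```

Because the exponent $1/(k+1)$ is applied last, the same routine covers the two-parameter case by setting `k=0`.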

Table 3 shows the calculation of the medians, to exhibit one of the advantages of our model for different values of the parameters (α,θ,k).

Table 3.

Medians of GPUD(J-S) distribution

α θ k Median
0.999 1 0 0.500
0.1 5 1 0.017
0.5 2.9 2 0.465
0.2 3.5 3 0.379
0.3 4.5 4 0.516
0.4 6 5 0.605

A simulation study was carried out to verify the MLE’s performance considering different parameter values and sample sizes for GPUD(J-S). The different sample sizes considered in the simulation are n=50,100,500, and 1000. We used the maxLik package in R to find the parameter estimates. The process was replicated 1000 times for each sample size, and we reported the average parameter estimate and the associated mean square errors. The results are reported in Table 4. As the sample size increases, the mean bias and mean square errors decrease, indicating the consistency property of the MLE.

Table 4.

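Simulation results for different values of the parameters $\alpha$, $\theta$, and $k$

Parameters $(\alpha,\theta,k)$ $\hat{\alpha}$ $(\widehat{SE}(\hat{\alpha}))$ $\hat{\theta}$ $(\widehat{SE}(\hat{\theta}))$ $\hat{k}$ $(\widehat{SE}(\hat{k}))$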
n=50
0.1, 5.0, 1 0.182 (0.038) 6.560 (1.133) 0.897 (0.026)
0.3, 4.5, 4 0.311 (0.074) 4.553 (1.164) 3.960 (0.142)
0.5, 2.9, 2 0.572 (0.136) 3.535 (1.838) 1.971 (0.083)
0.2, 3.5, 3 0.240 (0.278) 3.970 (1.244) 3.368 (0.135)
0.4, 6.0, 5 0.481 (0.377) 6.427 (1.565) 5.398 (0.360)
n=100
0.1, 5.0, 1 0.094 (0.009) 4.955 (0.319) 1.015 (0.013)
0.3, 4.5, 4 0.289 (0.024) 4.318 (0.595) 4.019 (0.060)
0.5, 2.9, 2 0.535 (0.039) 3.287 (0.475) 1.978 (0.035)
0.2, 3.5, 3 0.223 (0.024) 3.821 (0.372) 2.983 (0.046)
0.4, 6.0, 5 0.413 (0.123) 6.318 (0.480) 5.070 (0.064)
n=500
0.1, 5.0, 1 0.105 (0.004) 5.274 (0.139) 1.002 (0.008)
0.3, 4.5, 4 0.317 (0.015) 4.538 (0.228) 3.915 (0.038)
0.5, 2.9, 2 0.507 (0.032) 2.966 (0.409) 2.008 (0.028)
0.2, 3.5, 3 0.211 (0.013) 3.681 (0.184) 2.978 (0.044)
0.4, 6.0, 5 0.405 (0.071) 6.118 (0.296) 5.009 (0.046)
n=1000
0.1, 5.0, 1 0.105 (0.003) 5.132 (0.118) 0.997 (0.006)
0.3, 4.5, 4 0.305 (0.010) 4.521 (0.162) 3.993 (0.031)
0.5, 2.9, 2 0.503 (0.019) 2.995 (0.137) 2.000 (0.015)
0.2, 3.5, 3 0.208 (0.009) 3.554 (0.093) 2.990 (0.025)
0.4, 6.0, 5 0.399 (0.013) 6.034 (0.025) 5.003 (0.019)

Application to real data

In this section, the practical utility of GPUD(J-S) is illustrated by analyzing two real datasets to show the potential of the new family of distributions. The first dataset is related to the global health problem caused by the pandemic of a new strain of coronavirus (COVID-19), which has infected more than 187 million people around the world and has caused the death of more than 4 million people as of June 31, 2021. The data correspond to people who died from COVID-19 and had diabetes. We analyzed the period from the onset of symptoms to the death of the patient. The data were collected from the Ministry of Health of the Government of Mexico (https://www.gob.mx/salud/documentos/datos-abiertosbases-historicas-direccion-general-de-epidemiologia). All the cases correspond to the period starting on February 27 (when the first person infected with COVID-19 who also had diabetes appeared in Mexico) and ending on April 20, 2020. A total of 1113 cases were obtained.

We clarify that the information from the referred source is available in days (dates). However, because our model requires values of the study variable in the interval $0<x<1$, each data point was divided by the longest lifespan of infected people (76.1 days). The second set of data refers to the remission time (in months) of a sample of 128 patients with bladder cancer presented by Shakhatreh (2018): 0.08, 0.20, 0.40, 0.50, 0.51, 0.81, 0.90, 1.05, 1.19, 1.26, 1.35, 1.40, 1.46, 1.76, 2.02, 2.02, 2.07, 2.09, 2.23, 2.26, 2.46, 2.54, 2.62, 2.64, 2.69, 2.69, 2.75, 2.83, 2.87, 3.02, 3.25, 3.31, 3.36, 3.36, 3.48, 3.52, 3.57, 3.64, 3.70, 3.82, 3.88, 4.18, 4.23, 4.26, 4.33, 4.34, 4.40, 4.50, 4.51, 4.87, 4.98, 5.06, 5.09, 5.17, 5.32, 5.32, 5.34, 5.41, 5.41, 5.49, 5.62, 5.71, 5.85, 6.25, 6.54, 6.76, 6.93, 6.94, 6.97, 7.09, 7.26, 7.28, 7.32, 7.39, 7.59, 7.62, 7.63, 7.66, 7.87, 7.93, 8.26, 8.37, 8.53, 8.65, 8.66, 9.02, 9.22, 9.47, 9.74, 10.06, 10.34, 10.66, 10.75, 11.25, 11.64, 11.79, 11.98, 12.02, 12.03, 12.07, 12.63, 13.11, 13.29, 13.80, 14.24, 14.76, 14.77, 14.83, 15.96, 16.62, 17.12, 17.14, 17.36, 18.10, 19.13, 20.28, 21.73, 22.69, 23.63, 25.74, 25.82, 26.31, 32.15, 34.26, 36.66, 43.01, 46.12, 79.05.

In this case, a procedure similar to that used for the COVID-19 dataset was implemented. For the second set of data, each data point was divided by 79.051 to obtain values of the study variable in the interval 0<x<1.
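The rescaling step can be sketched as follows. This is an illustration only: the helper name `scale_to_unit_interval` is hypothetical, and the list holds just a handful of the bladder cancer remission times quoted above rather than the full sample. The divisor 79.051 is the one stated in the text; choosing it slightly above the largest observation keeps every scaled value strictly below one.

```python
def scale_to_unit_interval(values, divisor):
    """Divide each positive observation by a divisor larger than the maximum."""
    if min(values) <= 0:
        raise ValueError("observations must be positive to land in (0, 1)")
    if divisor <= max(values):
        raise ValueError("divisor must exceed the largest observation")
    return [v / divisor for v in values]


if __name__ == "__main__":
    # A few of the remission times (months) listed in the text, including the maximum.
    remission_months = [0.08, 0.20, 0.40, 5.32, 25.82, 79.05]
    scaled = scale_to_unit_interval(remission_months, 79.051)
    print(scaled)
```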

The fit of the GPUD(J-S) distribution is compared with the following lifetime distributions for a continuous variable X:

  1. Weibull distribution with a pdf
    $g(x;\lambda,\beta)=\dfrac{\lambda}{\beta^{\lambda}}\,x^{\lambda-1}e^{-(x/\beta)^{\lambda}};\ \lambda,\beta>0.$
  2. Exponentiated Weibull (EW) distribution (Pal et al. 2006) with a pdf
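    $g(x;\alpha,\lambda,\gamma)=\alpha\gamma\lambda^{\gamma}x^{\gamma-1}\left[1-e^{-(\lambda x)^{\gamma}}\right]^{\alpha-1}e^{-(\lambda x)^{\gamma}};\ \alpha,\lambda,\gamma>0.$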
  3. New Marshall-Olkin Weibull (NMOW) distribution (Cui et al. 2020) with a pdf
    g(x;θ,λ,β)=θλβxβ-1e-(λx)βθ+(θ-1)e-(λx)β2;θ,λ,β>0.
  4. Kumaraswamy Exponential-Weibull (KwEW) distribution (Cordeiro et al. 2016) with a pdf
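    $g(x;\alpha,\gamma,k,\beta,\lambda)=\alpha\gamma\left(\lambda+k\beta x^{k-1}\right)e^{-\lambda x-\beta x^{k}}\left(1-e^{-\lambda x-\beta x^{k}}\right)^{-1+\alpha}\cdot\left[1-\left(1-e^{-\lambda x-\beta x^{k}}\right)^{\alpha}\right]^{-1+\gamma}\mathbb{1}_{\mathbb{R}^{+}}(x);\ \alpha,\gamma,k,\beta,\lambda>0.$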
  5. Alpha power transformed Weibull (APTW) distribution (Dey et al. 2017) with a pdf
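    $g(x;\alpha,\beta,\lambda)=\dfrac{\log\alpha}{\alpha-1}\,\beta\lambda x^{\lambda-1}e^{-\beta x^{\lambda}}\alpha^{1-e^{-\beta x^{\lambda}}};\ \beta,\lambda>0,\ \text{and}\ \alpha>0,\ \alpha\neq 1.$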
  6. Extended Exponentiated Weibull (EEW) distribution (Bidram et al. 2015) with a pdf
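    $g(x;\lambda,\alpha,\beta,\gamma)=\dfrac{\gamma\alpha\beta\lambda^{\beta}x^{\beta-1}e^{-(\lambda x)^{\beta}}\left[1-e^{-(\lambda x)^{\beta}}\right]^{\alpha-1}}{\left\{1-(1-\gamma)\left[1-\left(1-e^{-(\lambda x)^{\beta}}\right)^{\alpha}\right]\right\}^{2}};\ \lambda,\alpha,\beta,\gamma>0.$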

Table 5 presents, for the seven distributions, the parameter estimates, the negative log-likelihood (−log L), the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). According to Jayakumar and Sankaran (2019), $AIC=-2\log L+2k$ and $BIC=-2\log L+k\log(n)$, where $L$ is the likelihood function evaluated at the maximum likelihood estimates, $k$ is the number of parameters, and $n$ is the sample size. Additionally, the Cramér–von Mises (W), Anderson–Darling (A), and Kolmogorov–Smirnov (K-S) statistics and their corresponding p-values are calculated as further goodness-of-fit criteria. Thus, a determination of which of these distributions best fits the dataset can be made. Usually, the smaller the values of the K-S, W, and A statistics are, the better the fit to the data.
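The two information criteria are simple functions of the negative log-likelihood. The sketch below uses hypothetical helper names and, as a worked case, the values reported for the Weibull model on the COVID-19 data (−log L = 1049.43, two parameters, n = 1113); it is meant as an arithmetic illustration of the formulas, not as a re-analysis of the data.

```python
from math import log


def aic(neg_log_lik, n_params):
    """AIC = -2 log L + 2k, written in terms of the negative log-likelihood."""
    return 2.0 * neg_log_lik + 2.0 * n_params


def bic(neg_log_lik, n_params, n_obs):
    """BIC = -2 log L + k log(n), written in terms of the negative log-likelihood."""
    return 2.0 * neg_log_lik + n_params * log(n_obs)


if __name__ == "__main__":
    # Weibull fit to the COVID-19 data as reported in the text.
    print(round(aic(1049.43, 2), 2))
    print(round(bic(1049.43, 2, 1113), 2))
```

Small differences in the last digit relative to published values are expected when the criteria are recomputed from a rounded −log L.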

Table 5.

Parameter estimates and goodness of fit statistics for COVID-19 data

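| Model | MLEs | −log L | AIC | BIC | W | A | K-S | p-value |
|---|---|---|---|---|---|---|---|---|
| GPUD(J-S) | $\hat{\theta}=1.382$, $\hat{\alpha}=0.017$, $\hat{k}=1.355$ | 996.80 | 1987.6 | 1972.55 | 0.040 | 0.286 | 0.015 | 0.93 |
| Weibull | $\hat{\lambda}=1.526$, $\hat{\beta}=0.017$ | 1049.43 | 2102.87 | 2112.90 | 1.215 | 7.836 | 0.063 | 0.000 |
| EW | $\hat{\alpha}=4.659$, $\hat{\lambda}=15.68$, $\hat{\gamma}=0.800$ | 1009.16 | 2024.32 | 2039.37 | 0.109 | 0.842 | 0.024 | 0.539 |
| NMOW | $\hat{\theta}=23.594$, $\hat{\lambda}=4.784$, $\hat{\beta}=2.490$ | 1012.29 | 2030.59 | 2045.63 | 0.062 | 0.433 | 0.017 | 0.892 |
| KwEW | $\hat{\alpha}=5.135$, $\hat{\gamma}=0.867$, $\hat{k}=0.714$, $\hat{\beta}=5.990$, $\hat{\lambda}=4.440$ | 1008.85 | 2027.70 | 2052.78 | 0.115 | 0.883 | 0.024 | 0.507 |
| APTW | $\hat{\alpha}=0.008$, $\hat{\beta}=5.334$, $\hat{\lambda}=1.893$ | 998.913 | 2003.82 | 2018.87 | 0.371 | 2.432 | 0.031 | 0.222 |
| EEW | $\hat{\lambda}=5.577$, $\hat{\alpha}=2.691$, $\hat{\beta}=1.118$, $\hat{\gamma}=0.253$ | 1010.26 | 2028.53 | 2048.58 | 0.121 | 0.804 | 0.020 | 0.728 |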

Note that in Table 5, the K-S statistic of the GPUD(J-S) distribution is the smallest among the distributions compared, and, correspondingly, its p-value is the highest, which shows that this new proposal produces the best fit for the COVID-19 dataset.

Figure 9 shows the fit of the different distributions compared with that of GPUD(J-S). As seen in this figure, our model presents excellent flexibility, and it is competitive with other widely accepted distributions in use, such as the Weibull distribution or the exponentiated Weibull distribution, among others. On the other hand, Fig. 10 shows the Q-Q plot for COVID-19 data.

Fig. 9. The fitted pdfs of GPUD(J-S), W, EW, NMOW, KwEW, APTW, and EEW for COVID-19 data

Fig. 10. Q–Q plot for the GPUD(J-S) distribution for the COVID-19 dataset

Table 6 reports the corresponding results for the remission times of bladder cancer. A similar observation is made for GPUD(J-S) in this case, where the calculated K-S statistic and its p-value indicate the best fit to the data.

Table 6.

Parameter estimates and goodness of fit statistics for cancer data

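| Model | MLEs | −log L | AIC | BIC | W | A | K-S | p-value |
|---|---|---|---|---|---|---|---|---|
| GPUD(J-S) | $\hat{\theta}=1.592$, $\hat{\alpha}=0.038$, $\hat{k}=0.514$ | 150.15 | 294.30 | 285.74 | 0.052 | 0.409 | 0.029 | 0.999 |
| Weibull | $\hat{\lambda}=1.046$, $\hat{\beta}=0.121$ | 145.28 | 294.56 | 300.27 | 0.130 | 0.783 | 0.070 | 0.541 |
| EW | $\hat{\alpha}=2.104$, $\hat{\lambda}=16.732$, $\hat{\gamma}=0.732$ | 148.42 | 302.84 | 311.40 | 0.052 | 0.329 | 0.055 | 0.820 |
| NMOW | $\hat{\theta}=11.957$, $\hat{\lambda}=4.567$, $\hat{\beta}=1.585$ | 149.21 | 304.42 | 312.98 | 0.028 | 0.205 | 0.031 | 0.999 |
| KwEW | $\hat{\alpha}=2.947$, $\hat{\gamma}=0.995$, $\hat{k}=0.531$, $\hat{\beta}=4.538$, $\hat{\lambda}=4.390$ | 148.26 | 306.52 | 320.78 | 0.058 | 0.368 | 0.050 | 0.897 |
| APTW | $\hat{\alpha}=0.050$, $\hat{\beta}=5.540$, $\hat{\lambda}=1.241$ | 148.81 | 303.62 | 312.18 | 0.044 | 0.273 | 0.055 | 0.832 |
| EEW | $\hat{\lambda}=2.636$, $\hat{\alpha}=1.058$, $\hat{\beta}=1.501$, $\hat{\gamma}=0.080$ | 149.22 | 306.44 | 317.85 | 0.029 | 0.211 | 0.040 | 0.984 |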

Figure 11 also shows the fit of the different distributions compared with GPUD(J-S). It can be seen that our model is highly competitive with other distributions. Figure 12 displays the Q-Q plot for bladder cancer data.

Fig. 11. The fitted pdfs of GPUD(J-S), W, EW, NMOW, KwEW, APTW, and EEW for bladder cancer data

Fig. 12. Q–Q plot for the GPUD(J-S) distribution for bladder cancer data

Conclusions

In this article, a new family of uniform distributions on (0,1) with three parameters (a, b, k), which is referred to as the generalized powered uniform distribution (GPUD), was proposed. The method used in this proposal incorporates the parameter k as the power of the values of the continuous random variable, favoring a greater diversity of the probability density, survival, and hazard functions. Additionally, some properties of the new distribution were derived. Furthermore, the GPUD results allowed us to generalize the model presented by Jayakumar and Sankaran (2016). As a result, it was possible to generate a new family of distributions called GPUD(J-S), which offers excellent flexibility in the cumulative distribution function due to the contribution of the parameter k. Our GPUD approach also allows the generalization of other models, such as that presented by Jose and Krishna (2011), which is research in progress. To demonstrate the effectiveness of the model, two sets of real data related to COVID-19 and bladder cancer were analyzed, and the maxLik package in R was used to find the estimators of the parameters.

In relation to the second real dataset given in Table 6, the value obtained for the statistic -log(L) based on NMOW was slightly lower (0.94) than that obtained by GPUD(J-S).

We highlight that the Kolmogorov–Smirnov (K-S) goodness of fit statistic of our model was 0.029, which is the lowest value compared to the other distributions. In addition, a p-value of 0.999 for the K-S statistic was achieved using our model, which, along with the value obtained using the NMOW distribution, was the highest value shown in Table 6. Both values provide evidence that our model is highly competitive.

It should be noted that in the first set of real data (see Table 5), the GPUD(J-S) distribution shows the best values of the calculated statistics.

The results obtained show that GPUD(J-S) is a valid alternative to other known distributions, such as the Weibull, exponentiated Weibull, and new Marshall–Olkin Weibull distributions, among others, with the added advantage that it provides the versatility of working with the parameter k in the values of the random variable.

Acknowledgements

The authors acknowledge support from CONACyT and PRODEP, MEXICO, and practical recommendations from the referee regarding the structure of this paper.

Contributor Information

Carlos Rondero-Guerrero, Email: ronderocar@gmail.com.

Isidro González-Hernández, Email: igonzalez@uaeh.edu.mx.

Carlos Soto-Campos, Email: csoto@uaeh.edu.mx.

References

  1. Alshangiti AM, Kayid M, Alarfaj B. A new family of Marshall-Olkin extended distributions. J Comput Appl Math. 2014;271:369–379. doi: 10.1016/j.cam.2014.04.020.
  2. Bidram H, Alamatsaz MH, Nekoukhou V. On an extension of the exponentiated Weibull distribution. Commun Statistics-Simulation Comput. 2015;44(6):1389–1404. doi: 10.1080/03610918.2013.819918.
  3. Cordeiro GM, Saboor A, Khan MN, Gamze O, Pascoa MA. The Kumaraswamy exponential-Weibull distribution: theory and applications. Hacettepe J Math Stat. 2016;45(4):1203–1229.
  4. Cui W, Yan Z, Peng X. A new Marshall Olkin Weibull distribution. Eng Lett. 2020;28(1):63–68.
  5. Dey S, Sharma VK, Mesfioui M. A new extension of Weibull distribution with application to lifetime data. Ann Data Sci. 2017;4(1):31–61. doi: 10.1007/s40745-016-0094-8.
  6. García V, Martel-Escobar M, Vázquez-Polo FJ. Generalising exponential distributions using an extended Marshall-Olkin procedure. Symmetry. 2020;12(3):464. doi: 10.3390/sym12030464.
  7. Jayakumar K, Sankaran KK. On a generalisation of uniform distribution and its properties. Statistica. 2016;76(1):83–91.
  8. Jayakumar K, Sankaran K. Discrete Linnik Weibull Distribution. Commun Statistics-Simulation and Comput. 2019;48(10):3092–3117. doi: 10.1080/03610918.2018.1475009.
  9. Jose K, Krishna E. Marshall-Olkin extended uniform distribution. ProbStat Forum. 2011;4(October):78–88.
  10. Law AM, Kelton WD. Simulation modeling and analysis. 3rd ed. New York: McGraw-Hill; 2000.
  11. Marshall AW, Olkin I. A new Method for Adding a Parameter to a Family of Distributions with Application to the Exponential and Weibull Families. Biometrika. 1997;84(3):641–652. doi: 10.1093/biomet/84.3.641.
  12. Nassar M, Afify AZ, Dey S, Kumar D. A new extension of Weibull distribution: properties and different methods of estimation. J Comput Appl Math. 2018;336:439–457. doi: 10.1016/j.cam.2017.12.001.
  13. Okasha HM, Kayid M. A new family of Marshall-Olkin extended generalized linear exponential distribution. J Comput Appl Math. 2016;296:576–592. doi: 10.1016/j.cam.2015.10.017.
  14. Pal M, Ali MM, Woo J. Exponentiated Weibull distribution. Statistica. 2006;66(2):139–147.
  15. Rondero-Guerrero C, González-Hernández I, Soto-Campos C. On a generalized uniform distribution. Adv Appl Stat. 2020;60(1):93–103.
  16. Sankaran K, Jayakumar K. A new extended uniform distribution. Int J Stat Distrib Appl. 2016;2(3):35–41.
  17. Shakhatreh MK. A new three-parameter extension of the log-logistic distribution with applications to survival data. Commun Statistics-Theory Methods. 2018;47(21):5205–5226. doi: 10.1080/03610926.2017.1388399.
  18. Torabi H, Bagheri F, Mahmoudi E. Estimation of parameters for the Marshall-Olkin generalized exponential distribution based on complete data. Math Comput Simul. 2018;146(April):177–185. doi: 10.1016/j.matcom.2017.11.005.

Articles from Computational Statistics are provided here courtesy of Nature Publishing Group
