. 2022 Mar 19;9(5):909–943. doi: 10.1007/s40745-022-00373-0

The Exponentiated Gumbel–Weibull {Logistic} Distribution with Application to Nigeria’s COVID-19 Infections Data

Patrick Osatohanmwen 1, Eferhonore Efe-Eyefia 2, Francis O Oyegue 3, Joseph E Osemwenkhae 3, Sunday M Ogbonmwan 3, Benson A Afere 4
PMCID: PMC8934027  PMID: 38624783

Abstract

This paper defines a new flexible univariate probability distribution, the ‘exponentiated Gumbel–Weibull {logistic} distribution’. It arises by using the exponentiated Gumbel distribution to generate a generalized Weibull distribution, with the logit function (the quantile function of the logistic distribution) as the link. The new distribution can be unimodal or bimodal and exhibits various shape and tail properties consistent with data arising from several real-life phenomena. A detailed study of its statistical properties was carried out and the maximum likelihood method was used to estimate its parameters. The new distribution was applied in fitting the reported daily number of infections due to the COVID-19 pandemic in Nigeria. Five other data sets were further used to ascertain the flexibility of the new distribution in fitting data sets with different statistical properties.

Keywords: T–R {Y} family, Gumbel distribution, Weibull distribution, Maximum likelihood estimation, Monte Carlo Simulations

Introduction

Data science draws on methodologies from disparate fields to extract information from data, usually for policy purposes. These include statistical methodologies, scientific methodologies, artificial intelligence and data analysis methodologies [1–3]. They come in handy for aggregating, cleaning and preparing data for analysis, for manipulating data, and for finding specific patterns or trajectories that data follow. In statistical modeling, the usual practice is to find a stochastic model which best describes the behavior of a given data set. Such a model is usually completely specified as a probability distribution function, from which other desirable properties of the data are obtained for either policy making or further investigation. The need for distribution functions that can best describe the stochastic behavior of data sets arising from real-life situations is one of the major drivers for the development of new and more flexible families of probability distributions. In applications, the classical probability distributions have been found in many studies to be unable to adequately fit data sets with varying shape and tail properties; hence the increasing volume of research devoted to generalizing them and thereby increasing their flexibility. Several methods have been put forward in the literature for the generalization of a probability distribution [4–12], each with its attendant benefits and shortcomings.

The COVID-19 pandemic has ravaged the entire world, bringing with it economic, social and behavioral challenges and responses. Several studies using mathematical models, statistical models, behavioral models and artificial intelligence frameworks have already been put forward to explain the evolution, transmission and impact of the pandemic in several countries, using data on the daily, weekly or monthly number of infections from the disease [13–21]. However, data of this nature tend to possess one or more characteristics which the classical probability distributions used in statistical modeling may not be able to capture. For example, such data tend to be highly skewed, either to the right or to the left, with the possibility of some outlying observations. A classical distribution like the normal distribution therefore cannot be used to fit such data, and it becomes imperative to use a very flexible distribution, such as a member of a generalized family of distributions. In this paper a new probability distribution which generalizes the classical Weibull distribution is developed and used to fit the daily number of infections from the COVID-19 pandemic in Nigeria. The new distribution is further used to fit five other data sets in order to demonstrate its flexibility.

The rest of the paper is organized thus. In Sect. 2, the new distribution is presented. A discussion on some of the statistical properties of the distribution is contained in Sect. 3. The process of using the maximum likelihood method for the estimation of the parameters of the distribution is contained in Sect. 4 while application of the distribution to real data sets is carried out in Sect. 5. The paper closes in Sect. 6 with summary and conclusion.

The New Distribution

Suppose T is a random variable following the exponentiated Gumbel distribution defined by [22], with the cumulative distribution function (cdf), probability density function (pdf) and quantile function given respectively by

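$$F_T(x)=1-\left[1-\exp\left\{-\exp\left(-\frac{x-k}{c}\right)\right\}\right]^{\beta},$$
$$f_T(x)=\frac{\beta}{c}\exp\left(-\frac{x-k}{c}\right)\exp\left\{-\exp\left(-\frac{x-k}{c}\right)\right\}\left[1-\exp\left\{-\exp\left(-\frac{x-k}{c}\right)\right\}\right]^{\beta-1},$$
$$Q_T(p)=k-c\log\left\{-\log\left[1-(1-p)^{1/\beta}\right]\right\},$$
$$c,\beta>0,\quad -\infty<x,k<\infty,\quad 0<p<1.$$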

Suppose also that R is a Weibull random variable with cdf, pdf and quantile function given respectively by

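$$F_R(x)=1-e^{-(x/\lambda)^{\alpha}},$$
$$f_R(x)=\frac{\alpha}{\lambda}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{-(x/\lambda)^{\alpha}},$$
$$Q_R(p)=\lambda\left[-\log(1-p)\right]^{1/\alpha},$$
$$x>0,\quad \alpha,\lambda>0,\quad 0<p<1.$$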

Let Y be a standard logistic random variable with cdf, pdf and quantile function given respectively by

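$$F_Y(x)=\frac{1}{1+e^{-x}},$$
$$f_Y(x)=\frac{e^{-x}}{\left(1+e^{-x}\right)^{2}},$$
$$Q_Y(p)=\log\left(\frac{p}{1-p}\right),\quad -\infty<x<\infty,\quad 0<p<1.$$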

The cdf

$$F(x)=\int_{-\infty}^{Q_Y(F_R(x))}f_T(t)\,dt=F_T\left(Q_Y\left(F_R(x)\right)\right) \tag{1}$$

is a valid cdf and from (1) we have the cdf of the 5-parameter exponentiated Gumbel–Weibull {logistic} (EGuWL) distribution given as

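$$F(x)=1-\left[1-\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]^{\beta},\quad x,\alpha,\beta,c,\lambda>0,\ -\infty<k<\infty. \tag{2}$$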

The pdf corresponding to (2) is expressed as

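$$f(x)=\frac{\alpha\beta e^{k/c}}{\lambda c}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{(x/\lambda)^{\alpha}}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\times\left[1-\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]^{\beta-1},\quad x,\alpha,\beta,c,\lambda>0,\ -\infty<k<\infty, \tag{3}$$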

where the parameters α, β, c and k control the shape of the distribution and λ is a scale parameter. The graphs of the pdf in (3) are shown in Figs. 1, 2 and 3 for various combinations of parameter values. The quantile function corresponding to the cdf in (2) is given by

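$$Q(p)=\lambda\left\{\log\left[\left(-e^{-k/c}\log\left(1-(1-p)^{1/\beta}\right)\right)^{-c}+1\right]\right\}^{1/\alpha},\quad \alpha,\beta,c,\lambda>0,\ -\infty<k<\infty,\ 0<p<1. \tag{4}$$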
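The cdf (2), pdf (3) and quantile function (4) can be evaluated directly. The following is a minimal Python sketch, not the authors' implementation; the function names `egwl_cdf`, `egwl_pdf` and `egwl_quantile` and the parameter values are illustrative assumptions.

```python
import math

def egwl_cdf(x, alpha, beta, c, k, lam):
    """cdf (2) of the EGuWL distribution for x > 0."""
    u = math.expm1((x / lam) ** alpha)            # e^{(x/lam)^alpha} - 1
    g = math.exp(-math.exp(k / c) * u ** (-1.0 / c))
    return 1.0 - (1.0 - g) ** beta

def egwl_pdf(x, alpha, beta, c, k, lam):
    """pdf (3); with v = e^{k/c} u^{-1/c}, the factor e^{k/c} u^{-1-1/c} equals v / u."""
    z = (x / lam) ** alpha
    u = math.expm1(z)
    v = math.exp(k / c) * u ** (-1.0 / c)
    g = math.exp(-v)
    return ((alpha * beta / (lam * c)) * (x / lam) ** (alpha - 1.0)
            * math.exp(z) * (v / u) * g * (1.0 - g) ** (beta - 1.0))

def egwl_quantile(p, alpha, beta, c, k, lam):
    """Quantile function (4) for 0 < p < 1."""
    w = -math.exp(-k / c) * math.log(1.0 - (1.0 - p) ** (1.0 / beta))
    return lam * math.log(w ** (-c) + 1.0) ** (1.0 / alpha)

# Illustrative parameter values (assumed for this example only).
params = dict(alpha=2.0, beta=1.5, c=2.5, k=2.0, lam=3.0)
x_med = egwl_quantile(0.5, **params)
print(x_med, egwl_cdf(x_med, **params))  # the second value should be close to 0.5
```

Evaluating the cdf at a quantile gives a quick consistency check between (2) and (4); no special handling of extreme tail arguments is attempted here.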

Fig. 1. EGuWL density showing skewness to the right

Fig. 2. EGuWL density showing symmetry and bimodality

Fig. 3. EGuWL density showing bimodality and left skewness

In Fig. 1, for fixed values of the parameters λ and k, the EGuWL density is highly skewed to the right when the parameters α, β and c are varied. In fact, for decreasing values of parameter α (increasing values of parameter β) the density falls exponentially. This behavior shows that the EGuWL distribution can be very effective in fitting highly right-skewed data sets with the possibility of outliers, or reverse-J shaped data sets. In Fig. 2, for fixed values of λ and β and varied values of α, c and k, the EGuWL density can be bimodal or almost symmetric. For negative values of k and increasing values of parameter α (decreasing values of parameter c), the density is bimodal; for non-negative values of k and increasing values of parameter α (decreasing values of parameter c), the density is almost symmetric. This highlights that the EGuWL distribution can be used for fitting bimodal and near-symmetric data sets. In Fig. 3, the EGuWL density is also observed to possess left skewness. In fact, for fixed values of λ and α the density is skewed to the left when the value of β decreases and the values of k and c increase. This shows that the EGuWL distribution can also be used to fit left-skewed data sets. Observe that in Figs. 1, 2 and 3 the value of the parameter λ is always fixed. This is because λ is a scale parameter and its value does not affect the shape of the density.

Proposition 1:

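Suppose X is an EGuWL random variable, U is a uniform random variable on (0, 1) and T is an exponentiated Gumbel random variable. Then

  • (i)
    $X=\lambda\left\{\log\left(e^{T}+1\right)\right\}^{1/\alpha}$,
  • (ii)
    $X=\lambda\left\{\log\left[\left(-e^{-k/c}\log\left(1-(1-U)^{1/\beta}\right)\right)^{-c}+1\right]\right\}^{1/\alpha}$.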

Proof:

Parts (i) and (ii) follow from (1) and (4) respectively. Proposition 1 is very useful for simulating random samples from the EGuWL distribution: one first simulates from the exponentiated Gumbel distribution or the uniform distribution and then applies the corresponding transformation. The relation in (i) can also be used to determine the moments of the EGuWL distribution.
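The two simulation routes of Proposition 1 can be sketched as follows. This is a hedged illustration rather than the authors' code: the helper names (`exp_gumbel_quantile`, `egwl_from_t`, `egwl_from_u`, `egwl_sample`) and the parameter values are assumptions, and T is drawn by inverting its quantile function $Q_T$.

```python
import math
import random

def exp_gumbel_quantile(p, beta, c, k):
    """Quantile function Q_T of the exponentiated Gumbel distribution."""
    return k - c * math.log(-math.log(1.0 - (1.0 - p) ** (1.0 / beta)))

def egwl_from_t(t, alpha, lam):
    """Proposition 1(i): X = lam * [log(e^T + 1)]^{1/alpha}."""
    return lam * math.log(math.exp(t) + 1.0) ** (1.0 / alpha)

def egwl_from_u(u, alpha, beta, c, k, lam):
    """Proposition 1(ii): inverse-transform sampling with the quantile function (4)."""
    w = -math.exp(-k / c) * math.log(1.0 - (1.0 - u) ** (1.0 / beta))
    return lam * math.log(w ** (-c) + 1.0) ** (1.0 / alpha)

def egwl_sample(n, alpha, beta, c, k, lam, seed=0):
    """Draw n variates by simulating T from a uniform draw, then applying (i)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:                  # avoid log(0) at the boundary
            u = rng.random()
        t = exp_gumbel_quantile(u, beta, c, k)
        out.append(egwl_from_t(t, alpha, lam))
    return out

# Illustrative parameter values (assumed for this example only).
sample = egwl_sample(5, alpha=2.0, beta=1.5, c=2.5, k=2.0, lam=3.0)
print(sample)
```

For the same uniform draw, routes (i) and (ii) return the same value, since (ii) is (i) with T replaced by $Q_T(U)$.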

Statistical Properties of the New Distribution

Here we present some essential statistical properties of the EGuWL distribution. A discussion on the hazard function is used to begin the section.

Hazard Function

The hazard function of the EGuWL distribution is expressed as

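$$h(x)=\frac{\alpha\beta e^{k/c}}{\lambda c}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{(x/\lambda)^{\alpha}}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\times\left[1-\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]^{-1},\quad x,\alpha,\beta,c,\lambda>0,\ -\infty<k<\infty. \tag{5}$$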

Figures 4, 5 and 6 display the shape of the EGuWL hazard function for various combinations of parameter values. They show that the EGuWL hazard can be decreasing, increasing or upside-down bathtub shaped. These results are very useful in lifetime data analysis.

Fig. 4. EGuWL hazard showing decreasing and upside down bathtub shapes

Fig. 5. EGuWL hazard showing increasing shapes

Fig. 6. EGuWL hazard showing increasing shapes

Mode

Proposition 2:

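The mode(s) of the EGuWL distribution is either at x = 0 or satisfies the equation

$$w(x)=(\beta-1)\,x\left(e^{(x/\lambda)^{\alpha}}-1\right)A(x)\,t(x), \tag{6}$$

where

$$A(x)=\left[1-\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]^{-1},$$
$$w(x)=(\alpha-1)\left(e^{(x/\lambda)^{\alpha}}-1\right)-\alpha\left(\frac{x}{\lambda}\right)^{\alpha}+\frac{\alpha}{c}\left(\frac{x}{\lambda}\right)^{\alpha}e^{(x/\lambda)^{\alpha}}\left[e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}-1\right],$$
$$t(x)=\frac{\alpha e^{k/c}}{\lambda c}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{(x/\lambda)^{\alpha}}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}.$$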

Proof:

As observed from the graphs of the EGuWL density, the distribution can be both unimodal and bimodal. On differentiating the EGuWL density w.r.t. x, one obtains

$$f'(x)=\beta A(x)\left[1-F(x)\right]t(x)\left\{w(x)\left[x\left(e^{(x/\lambda)^{\alpha}}-1\right)\right]^{-1}-(\beta-1)A(x)\,t(x)\right\}.$$

The derivative $f'(x)$ does not exist when x = 0. Other critical point(s) satisfy $f'(x)=0$, hence the EGuWL distribution mode(s) will either be at x = 0 or satisfy the equation

$$w(x)=(\beta-1)\,x\left(e^{(x/\lambda)^{\alpha}}-1\right)A(x)\,t(x).$$

Remark 1:

Observe that the expression $w(x)\left[x\left(e^{(x/\lambda)^{\alpha}}-1\right)\right]^{-1}-(\beta-1)A(x)\,t(x)$ is a factor of $f'(x)$ and has the same sign as $f'(x)$. An analytical solution of (6) for x is not possible. However, (6) can be solved numerically in order to obtain the desired mode(s).
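One simple numerical route, sketched below, does not solve (6) directly: it locates interior local maxima of the density (3) on a grid and refines each bracket by ternary search. This is a swapped-in technique for illustration, not the authors' procedure; the function names, grid size, upper limit `x_max` and parameter values are all assumptions.

```python
import math

def egwl_pdf(x, alpha, beta, c, k, lam):
    """EGuWL pdf (3)."""
    z = (x / lam) ** alpha
    u = math.expm1(z)
    v = math.exp(k / c) * u ** (-1.0 / c)
    g = math.exp(-v)
    return ((alpha * beta / (lam * c)) * (x / lam) ** (alpha - 1.0)
            * math.exp(z) * (v / u) * g * (1.0 - g) ** (beta - 1.0))

def egwl_modes(alpha, beta, c, k, lam, x_max, n_grid=4000, n_refine=60):
    """Locate interior local maxima of the density on (0, x_max]."""
    f = lambda x: egwl_pdf(x, alpha, beta, c, k, lam)
    h = x_max / n_grid
    xs = [h * i for i in range(1, n_grid + 1)]
    fs = [f(x) for x in xs]
    modes = []
    for i in range(1, n_grid - 1):
        if fs[i] > fs[i - 1] and fs[i] >= fs[i + 1]:
            lo, hi = xs[i - 1], xs[i + 1]
            for _ in range(n_refine):        # ternary search inside the bracket
                m1 = lo + (hi - lo) / 3.0
                m2 = hi - (hi - lo) / 3.0
                if f(m1) < f(m2):
                    lo = m1
                else:
                    hi = m2
            modes.append(0.5 * (lo + hi))
    return modes

# Illustrative parameter values and search range (assumed for this example only).
print(egwl_modes(alpha=2.0, beta=1.5, c=2.5, k=2.0, lam=3.0, x_max=12.0))
```

The grid must be fine enough that each bracket contains a single local maximum; a bimodal density then yields two entries in the returned list.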

Moments

An expression for computing the rth non-central moment of the EGuWL distribution can easily be obtained by making use of the relationship between the EGuWL random variable X and the exponentiated Gumbel random variable T specified in Proposition 1(i). In particular, the relation $X=\lambda\left\{\log\left(e^{T}+1\right)\right\}^{1/\alpha}$ implies that

$$\mu'_r=E\left(X^{r}\right)=\lambda^{r}E\left\{\left[\log\left(e^{T}+1\right)\right]^{r/\alpha}\right\}.$$

Since, by Proposition 1(i), the EGuWL random variable X is a transformation of the exponentiated Gumbel random variable T, its moments can be obtained by integrating against the density of T rather than the more complex EGuWL density. This is a major result of this paper. It follows that

$$\mu'_r=\frac{\beta\lambda^{r}}{c}\int_{-\infty}^{\infty}\left[\log\left(e^{t}+1\right)\right]^{r/\alpha}e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt. \tag{7}$$

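The rth non-central moments of the EGuWL distribution are computed from the relation in (7). The mean $\mu$, variance $\sigma^{2}$, skewness S and kurtosis K of the EGuWL distribution are given respectively as

$$\mu=\mu'_1,$$
$$\sigma^{2}=\mu'_2-\mu^{2},$$
$$S=\frac{\mu'_3-3\mu\mu'_2+2\mu^{3}}{\left(\mu'_2-\mu^{2}\right)^{3/2}},$$
$$K=\frac{\mu'_4-4\mu\mu'_3+6\mu^{2}\mu'_2-3\mu^{4}}{\left(\mu'_2-\mu^{2}\right)^{2}}.$$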
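As a worked illustration, the integral (7) can be approximated with composite Simpson's rule and the results plugged into the expressions for the mean, variance, skewness and kurtosis. This is a sketch under stated assumptions: the function names, the truncation of the integration range to $[k-5c,\ k+40c/\beta]$ and the parameter values are illustrative choices, not taken from the paper.

```python
import math

def exp_gumbel_pdf(t, beta, c, k):
    """pdf of the exponentiated Gumbel distribution."""
    e = math.exp(-(t - k) / c)
    # -expm1(-e) equals 1 - exp(-e), computed stably for small e
    return (beta / c) * e * math.exp(-e) * (-math.expm1(-e)) ** (beta - 1.0)

def raw_moment(r, alpha, beta, c, k, lam, n=20000):
    """r-th non-central moment via (7); n (even) Simpson intervals in t."""
    lo, hi = k - 5.0 * c, k + (40.0 / beta) * c   # assumed truncation of the range
    h = (hi - lo) / n

    def integrand(t):
        return math.log1p(math.exp(t)) ** (r / alpha) * exp_gumbel_pdf(t, beta, c, k)

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lo + i * h)
    return lam ** r * total * h / 3.0

def moment_summary(alpha, beta, c, k, lam):
    """Mean, variance, skewness and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, beta, c, k, lam) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt

# Illustrative parameter values (assumed for this example only).
print(moment_summary(alpha=2.0, beta=1.5, c=2.5, k=2.0, lam=3.0))
```

Setting r = 0 should return a value close to 1, which is a convenient check that the truncated range captures essentially all of the probability mass.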

The quantile function can also be used in computing the skewness and kurtosis of a distribution, especially when the quantile function exists in a simple analytic form. Galton [23] proposed a quantile-based measure of skewness, while Moors [24] did the same for kurtosis. Galton’s skewness and Moors’ kurtosis are evaluated using the relations

$$S=\frac{Q(6/8)-2Q(4/8)+Q(2/8)}{Q(6/8)-Q(2/8)},$$
$$K=\frac{Q(7/8)-Q(5/8)+Q(3/8)-Q(1/8)}{Q(6/8)-Q(2/8)}.$$

Since the quantile function of the EGuWL distribution exists in a simple analytic form as expressed in (4), the above expressions can be used in computing the skewness and kurtosis of the EGuWL distribution. 3-D plots of Galton’s skewness and Moors’ kurtosis of the EGuWL distribution for some selected parameter values are presented in Fig. 7.
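The two quantile-based measures are straightforward to compute from (4). The sketch below is illustrative only: the function names are assumptions, k, λ and β follow the Fig. 7 caption, and the values of α and c are picked arbitrarily.

```python
import math

def galton_skewness(q):
    """Galton's skewness from a quantile function q(p)."""
    return (q(6 / 8) - 2 * q(4 / 8) + q(2 / 8)) / (q(6 / 8) - q(2 / 8))

def moors_kurtosis(q):
    """Moors' kurtosis from a quantile function q(p)."""
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

def egwl_quantile(p, alpha, beta, c, k, lam):
    """EGuWL quantile function (4)."""
    w = -math.exp(-k / c) * math.log(1.0 - (1.0 - p) ** (1.0 / beta))
    return lam * math.log(w ** (-c) + 1.0) ** (1.0 / alpha)

# k, lam and beta as in the Fig. 7 caption; alpha and c are assumed values.
q = lambda p: egwl_quantile(p, alpha=2.0, beta=0.5, c=1.0, k=0.0, lam=1.0)
print(galton_skewness(q), moors_kurtosis(q))
```

Because both functions take any quantile function as input, they can be checked against a symmetric reference such as the standard logistic quantile $\log\{p/(1-p)\}$, for which Galton's skewness is zero.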

Fig. 7. Galton’s skewness (S) and Moors’ kurtosis (K) for the EGuWL distribution (k = 0, λ = 1, β = 0.5)

Entropy

Shannon [25] offered a probabilistic definition of entropy. The Shannon entropy $\eta_X$ of a random variable X following a known probability distribution is a measure of variation of uncertainty.

Proposition 3:

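The Shannon entropy of a random variable X following the EGuWL distribution can be expressed as

$$\eta_X=\eta_T-\mu_T-\log(\alpha/\lambda)-Z(\alpha,\beta,c,k), \tag{8}$$

where $\eta_T$ and $\mu_T$ are respectively the Shannon entropy and mean of the exponentiated Gumbel distribution, and

$$Z(\alpha,\beta,c,k)=\frac{\beta}{c}\int_{-\infty}^{\infty}\left\{2\log\left(e^{-t}+1\right)+\frac{\alpha-1}{\alpha}\log\log\left(e^{t}+1\right)-\log\left(e^{t}+1\right)\right\}\times e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt.$$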

Proof:

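For a random variable X with density function $f(x)$, the Shannon entropy of X is defined as

$$\eta_X=E\left\{-\log f(X)\right\}.$$

The pdf $f(x)$ corresponding to the cdf $F(x)$ in (1) can be written as

$$f(x)=\frac{f_R(x)\,f_T\left(Q_Y\left(F_R(x)\right)\right)}{f_Y\left(Q_Y\left(F_R(x)\right)\right)},$$

and hence

$$f(X)=\frac{f_R(X)\,f_T\left(Q_Y\left(F_R(X)\right)\right)}{f_Y\left(Q_Y\left(F_R(X)\right)\right)}.$$

Observe that from (1), $t=Q_Y\left(F_R(x)\right)$ and hence $T=Q_Y\left(F_R(X)\right)$. It follows that

$$f(X)=\frac{f_R(X)\,f_T(T)}{f_Y(T)}$$

and

$$\eta_X=E\left\{-\log f(X)\right\}=E\left\{-\log f_T(T)\right\}-E\left\{\log f_R(X)\right\}+E\left\{\log f_Y(T)\right\}.$$

It follows that

$$\eta_X=\eta_T-E(T)-2E\left\{\log\left(e^{-T}+1\right)\right\}-E\left\{\log f_R(X)\right\},$$

and consequently

$$\eta_X=\eta_T-\mu_T-2E\left\{\log\left(e^{-T}+1\right)\right\}-E\left\{\log f_R(X)\right\}.$$

From Proposition 1(i) we have that $X=\lambda\left\{\log\left(e^{T}+1\right)\right\}^{1/\alpha}$ and thus

$$\log f_R(X)=\log(\alpha/\lambda)+\frac{\alpha-1}{\alpha}\log\log\left(e^{T}+1\right)-\log\left(e^{T}+1\right).$$

It follows that

$$E\left\{\log f_R(X)\right\}=\log(\alpha/\lambda)+\frac{\alpha-1}{\alpha}E\left\{\log\log\left(e^{T}+1\right)\right\}-E\left\{\log\left(e^{T}+1\right)\right\}.$$

Thus

$$\eta_X=\eta_T-\mu_T-\log(\alpha/\lambda)-2E\left\{\log\left(e^{-T}+1\right)\right\}-\frac{\alpha-1}{\alpha}E\left\{\log\log\left(e^{T}+1\right)\right\}+E\left\{\log\left(e^{T}+1\right)\right\},$$

where

$$E\left\{\log\left(e^{-T}+1\right)\right\}=\frac{\beta}{c}\int_{-\infty}^{\infty}\log\left(e^{-t}+1\right)e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt, \tag{9}$$
$$E\left\{\log\left(e^{T}+1\right)\right\}=\frac{\beta}{c}\int_{-\infty}^{\infty}\log\left(e^{t}+1\right)e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt, \tag{10}$$
$$E\left\{\log\log\left(e^{T}+1\right)\right\}=\frac{\beta}{c}\int_{-\infty}^{\infty}\log\log\left(e^{t}+1\right)e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt. \tag{11}$$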

The integrals in (9)–(11) exist because

logloget+1loget+1log2+twhent>0,
logloget+1loget+1log2whent<0,
logloget+1loge-t+1log2+twhent<0,

and

logloget+1loge-t+1log2whent>0.

Hence

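$$\eta_X=\eta_T-\mu_T-\log(\alpha/\lambda)-Z(\alpha,\beta,c,k),$$

where

$$Z(\alpha,\beta,c,k)=\frac{\beta}{c}\int_{-\infty}^{\infty}\left\{2\log\left(e^{-t}+1\right)+\frac{\alpha-1}{\alpha}\log\log\left(e^{t}+1\right)-\log\left(e^{t}+1\right)\right\}\times e^{-(t-k)/c}\exp\left\{-e^{-(t-k)/c}\right\}\left[1-\exp\left\{-e^{-(t-k)/c}\right\}\right]^{\beta-1}dt.$$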

Remark 2:

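It can be easily verified that

$$\eta_T=\log c-\log\beta+\gamma+1-(\beta-1)E\left\{\log\left[1-G(X)\right]\right\},$$

where $G(\cdot)$ is the cdf of the Gumbel distribution, $\gamma=0.57722$ is Euler’s constant and

$$E\left\{\log\left[1-G(X)\right]\right\}=\frac{1}{c}\int_{-\infty}^{\infty}\log\left[1-G(x)\right]e^{-(x-k)/c}\exp\left\{-e^{-(x-k)/c}\right\}dx.$$

An expression for $\mu_T$ was given in [22] as

$$\mu_T=\Gamma(\beta+1)\sum_{n=0}^{\infty}\frac{(-1)^{n}\left[k+c\gamma+c\log(n+1)\right]}{(n+1)!\,\Gamma(\beta-n)},$$

where $\Gamma(\cdot)$ is the complete gamma function.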

Estimation

Here the maximum likelihood method is presented for estimating the parameters of the EGuWL distribution.

Maximum Likelihood Method of Estimation of the Parameters of the EGuWL Distribution

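For a complete independent random sample $x_1,x_2,\ldots,x_n$ of size n, the log-likelihood function of the EGuWL distribution is

$$L=n\left[\log\alpha+\log\beta+\frac{k}{c}-\log c-\log\lambda\right]+(\alpha-1)\sum_{i=1}^{n}\log\left(\frac{x_i}{\lambda}\right)+\sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{\alpha}+\left(-1-\frac{1}{c}\right)\sum_{i=1}^{n}\log\left(e^{(x_i/\lambda)^{\alpha}}-1\right)-e^{k/c}\sum_{i=1}^{n}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}+(\beta-1)\sum_{i=1}^{n}\log\left[1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]. \tag{12}$$

Let $\Theta=(\alpha,\beta,c,k,\lambda)^{T}$ be the unknown parameter vector. The associated score function is given by

$$U(\Theta)=\left(L_\alpha,L_\beta,L_c,L_k,L_\lambda\right)^{T},$$

where $L_\alpha$, $L_\beta$, $L_c$, $L_k$ and $L_\lambda$ are the partial derivatives of the log-likelihood function w.r.t. each parameter and are given by

$$L_\alpha=\frac{n}{\alpha}+\sum_{i=1}^{n}\log\left(\frac{x_i}{\lambda}\right)+\sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{\alpha}\log\left(\frac{x_i}{\lambda}\right)+\left(-1-\frac{1}{c}\right)\sum_{i=1}^{n}\frac{\left(x_i/\lambda\right)^{\alpha}\log\left(x_i/\lambda\right)e^{(x_i/\lambda)^{\alpha}}}{e^{(x_i/\lambda)^{\alpha}}-1}+\frac{e^{k/c}}{c}\sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^{\alpha}\log\left(\frac{x_i}{\lambda}\right)e^{(x_i/\lambda)^{\alpha}}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1-1/c}-\frac{(\beta-1)e^{k/c}}{c}\sum_{i=1}^{n}\frac{\left(x_i/\lambda\right)^{\alpha}\log\left(x_i/\lambda\right)e^{(x_i/\lambda)^{\alpha}}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}}{1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}},$$

$$L_\beta=\frac{n}{\beta}+\sum_{i=1}^{n}\log\left[1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right],$$

$$L_c=-\frac{nk}{c^{2}}-\frac{n}{c}+\frac{1}{c^{2}}\sum_{i=1}^{n}\log\left(e^{(x_i/\lambda)^{\alpha}}-1\right)-\frac{e^{k/c}}{c^{2}}\sum_{i=1}^{n}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\left[\log\left(e^{(x_i/\lambda)^{\alpha}}-1\right)-k\right]+\frac{(\beta-1)e^{k/c}}{c^{2}}\times\sum_{i=1}^{n}\frac{\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\left[\log\left(e^{(x_i/\lambda)^{\alpha}}-1\right)-k\right]\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}}{1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}},$$

$$L_k=\frac{n}{c}-\frac{e^{k/c}}{c}\sum_{i=1}^{n}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}+\frac{(\beta-1)e^{k/c}}{c}\sum_{i=1}^{n}\frac{\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}}{1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}},$$

$$L_\lambda=-\frac{n}{\lambda}-\frac{n(\alpha-1)}{\lambda}-\frac{\alpha}{\lambda^{2}}\sum_{i=1}^{n}x_i\left(\frac{x_i}{\lambda}\right)^{\alpha-1}-\frac{\alpha\left(-1-1/c\right)}{\lambda^{2}}\sum_{i=1}^{n}\frac{x_i\left(x_i/\lambda\right)^{\alpha-1}e^{(x_i/\lambda)^{\alpha}}}{e^{(x_i/\lambda)^{\alpha}}-1}-\frac{\alpha e^{k/c}}{\lambda^{2}c}\sum_{i=1}^{n}x_i\left(\frac{x_i}{\lambda}\right)^{\alpha-1}e^{(x_i/\lambda)^{\alpha}}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1-1/c}+\frac{\alpha(\beta-1)e^{k/c}}{\lambda^{2}c}\times\sum_{i=1}^{n}\frac{x_i\left(x_i/\lambda\right)^{\alpha-1}e^{(x_i/\lambda)^{\alpha}}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}}{1-\exp\left\{-e^{k/c}\left(e^{(x_i/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}}.$$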

The maximum likelihood estimate of Θ is obtained by solving the non-linear system of equations $U(\Theta)=0$. Since the solutions of this system are not available in closed form, they can be found numerically using any Newton-type algorithm.
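Before passing the log-likelihood to an optimizer, it is worth checking analytic score components against finite differences. The sketch below evaluates the log-likelihood implied by the density (3), including the normalizing term −n log c, and compares the analytic derivatives w.r.t. β and k with central differences. The function names, the six-point sample and the parameter values are hypothetical and serve only as an illustration.

```python
import math

def loglik(theta, xs):
    """Log-likelihood of the EGuWL density (3); theta = (alpha, beta, c, k, lam)."""
    alpha, beta, c, k, lam = theta
    total = 0.0
    for x in xs:
        z = (x / lam) ** alpha
        u = math.expm1(z)                          # e^{(x/lam)^alpha} - 1
        v = math.exp(k / c) * u ** (-1.0 / c)
        total += (math.log(alpha * beta / (lam * c)) + k / c
                  + (alpha - 1.0) * math.log(x / lam) + z
                  - (1.0 + 1.0 / c) * math.log(u) - v
                  + (beta - 1.0) * math.log(-math.expm1(-v)))
    return total

def score_beta_k(theta, xs):
    """Analytic partial derivatives of the log-likelihood w.r.t. beta and k."""
    alpha, beta, c, k, lam = theta
    s_beta, s_k = len(xs) / beta, len(xs) / c
    for x in xs:
        u = math.expm1((x / lam) ** alpha)
        v = math.exp(k / c) * u ** (-1.0 / c)
        g = math.exp(-v)
        s_beta += math.log(1.0 - g)
        s_k += -v / c + (beta - 1.0) * (v / c) * g / (1.0 - g)
    return s_beta, s_k

def numeric_partial(f, theta, i, h=1e-6):
    """Central finite-difference derivative of f w.r.t. component i of theta."""
    up, dn = list(theta), list(theta)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2.0 * h)

# Hypothetical sample and parameter values, for illustration only.
xs = [0.8, 1.5, 2.2, 3.1, 4.0, 5.3]
theta = (2.0, 1.5, 2.5, 2.0, 3.0)
print(score_beta_k(theta, xs))
print(numeric_partial(lambda th: loglik(th, xs), theta, 1),
      numeric_partial(lambda th: loglik(th, xs), theta, 3))
```

Once the derivatives agree, the negative of `loglik` can be handed to any general-purpose numerical optimizer to obtain the estimates.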

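The Fisher information matrix (FIM) of the EGuWL distribution is the 5×5 symmetric matrix given by

$$I(\Theta)=-E_{\Theta}\begin{bmatrix}I_{\alpha\alpha}&I_{\alpha\beta}&I_{\alpha c}&I_{\alpha k}&I_{\alpha\lambda}\\I_{\beta\alpha}&I_{\beta\beta}&I_{\beta c}&I_{\beta k}&I_{\beta\lambda}\\I_{c\alpha}&I_{c\beta}&I_{cc}&I_{ck}&I_{c\lambda}\\I_{k\alpha}&I_{k\beta}&I_{kc}&I_{kk}&I_{k\lambda}\\I_{\lambda\alpha}&I_{\lambda\beta}&I_{\lambda c}&I_{\lambda k}&I_{\lambda\lambda}\end{bmatrix},$$

where the elements $I_{ij}(\Theta)=\partial^{2}L/\partial\Theta_i\,\partial\Theta_j$. Thus, the elements of the FIM can be obtained from the second-order partial derivatives of the log-likelihood function w.r.t. the parameters. These elements can be obtained numerically using the R software. The total FIM, $I(\Theta)$, can be approximated by

$$J(\hat{\Theta})\approx-\left[\left.\frac{\partial^{2}L}{\partial\Theta_i\,\partial\Theta_j}\right|_{\Theta=\hat{\Theta}}\right]_{5\times5}.$$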

For real data, $J(\hat{\Theta})$ is obtained after the maximum likelihood estimate of Θ has been found, that is, after the iterative numerical procedure used to find the estimate has converged.

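Suppose $\hat{\Theta}$ is the maximum likelihood estimate of Θ. Under the usual regularity conditions, and provided the parameters lie in the interior of the parameter space and not on its boundary, we have $\sqrt{n}\left(\hat{\Theta}-\Theta\right)\xrightarrow{d}N_5\left(0,I^{-1}(\Theta)\right)$, where $I^{-1}(\Theta)$ is the inverse of the expected FIM, which also corresponds to the variance–covariance matrix of the parameters. The asymptotic behavior is still valid if $I^{-1}(\Theta)$ is replaced by the inverse of the observed information matrix evaluated at $\hat{\Theta}$, that is $J^{-1}(\hat{\Theta})$. The multivariate normal distribution with mean vector $0=(0,0,0,0,0)^{T}$ and covariance matrix $I^{-1}(\Theta)$ can be used to construct confidence intervals for the EGuWL parameters. The approximate $100(1-\omega)\%$ two-sided confidence intervals for the parameters α, β, c, k and λ are given by

$$\hat{\alpha}\pm Z_{\omega/2}\sqrt{I^{-1}_{\alpha\alpha}(\hat{\Theta})},\quad\hat{\beta}\pm Z_{\omega/2}\sqrt{I^{-1}_{\beta\beta}(\hat{\Theta})},\quad\hat{c}\pm Z_{\omega/2}\sqrt{I^{-1}_{cc}(\hat{\Theta})},$$
$$\hat{k}\pm Z_{\omega/2}\sqrt{I^{-1}_{kk}(\hat{\Theta})},\quad\hat{\lambda}\pm Z_{\omega/2}\sqrt{I^{-1}_{\lambda\lambda}(\hat{\Theta})},$$

respectively, where $I^{-1}_{\alpha\alpha}(\hat{\Theta})$, $I^{-1}_{\beta\beta}(\hat{\Theta})$, $I^{-1}_{cc}(\hat{\Theta})$, $I^{-1}_{kk}(\hat{\Theta})$ and $I^{-1}_{\lambda\lambda}(\hat{\Theta})$ are the diagonal elements of $I^{-1}(\hat{\Theta})$ and $Z_{\omega/2}$ is the upper $(\omega/2)$th percentile of a standard normal distribution.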

Monte Carlo Simulations

Here we conduct a Monte Carlo simulation study to assess the performance and efficiency of the maximum likelihood estimators of the parameters of the EGuWL distribution. The performance of the estimators is examined for different sample sizes and different combinations of parameter values. The simulation is repeated N = 5000 times using the sample sizes n = 25, 80, 150, 400, 800 and 1500 and the parameter combinations I: α = 5, β = 1.5, c = 4, k = −2, λ = 1.5 and II: α = 2, β = 1.5, c = 2.5, k = 2, λ = 3. Random samples are simulated from the EGuWL distribution using Proposition 1(i), and five quantities are computed in the simulations:

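  1. Mean estimates (ME) of the maximum likelihood estimator of the parameter $\Theta=(\alpha,\beta,c,k,\lambda)$, where
    $\mathrm{ME}=\frac{1}{N}\sum_{i=1}^{N}\hat{\Theta}_i$;
  2. Average bias (AVB) of the maximum likelihood estimator of the parameter $\Theta=(\alpha,\beta,c,k,\lambda)$, where
    $\mathrm{AVB}=\frac{1}{N}\sum_{i=1}^{N}\left(\hat{\Theta}_i-\Theta\right)$;
  3. Root mean squared error (RMSE) of the maximum likelihood estimator of the parameter $\Theta=(\alpha,\beta,c,k,\lambda)$, where
    $\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{\Theta}_i-\Theta\right)^{2}}$;
  4. Coverage probability (CP) of 95% confidence intervals of the parameters $\Theta=(\alpha,\beta,c,k,\lambda)$, i.e., the proportion of intervals that contain the true value of the parameter Θ;
  5. Average width (AW) of 95% confidence intervals of the parameter $\Theta=(\alpha,\beta,c,k,\lambda)$.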
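The five summary quantities can be computed from the replicated estimates and their standard errors as sketched below. This is an illustrative helper, not the authors' simulation code: the function name `mc_summary` is hypothetical, AVB is taken here as the signed mean deviation, and the confidence intervals are assumed to be Wald intervals of the form estimate ± 1.96 × standard error.

```python
import math

def mc_summary(estimates, std_errors, truth, z=1.96):
    """ME, AVB, RMSE, CP and AW for one parameter across Monte Carlo replicates."""
    n = len(estimates)
    me = sum(estimates) / n
    avb = sum(e - truth for e in estimates) / n          # signed bias (assumption)
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / n)
    covered = sum(1 for e, s in zip(estimates, std_errors)
                  if e - z * s <= truth <= e + z * s)
    aw = sum(2.0 * z * s for s in std_errors) / n
    return {"ME": me, "AVB": avb, "RMSE": rmse, "CP": covered / n, "AW": aw}

# Hypothetical replicate estimates and standard errors, for illustration only.
print(mc_summary([1.0, 2.0, 3.0], [0.5, 0.5, 0.5], truth=2.0))
```

In a full study this function would be applied separately to each of the five parameters, with one estimate and one standard error per simulated sample.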

Tables 1 and 2 contain the results for the quantities ME, AVB, RMSE, AW and CP. In both tables, the ME of each parameter generally moves toward its true value as the sample size increases. The AVB values are all positive and generally decrease as the sample size increases. The RMSE and the AW of all the parameters also decrease as the sample size increases.

Table 1.

Results of Monte Carlo simulations α=5,β=1.5,c=4,k=-2,λ=1.5

Parameter Sample size ME AVB RMSE AW CP
β n = 25 2.3662 2.8721 9.5568 92.1923 0.92
n = 80 1.9902 2.8001 9.2332 43.7943 0.94
n = 150 1.8189 1.6123 4.9977 19.0392 0.92
n = 400 1.7231 0.6956 1.8794 6.2785 0.95
n = 800 1.6070 0.2885 0.8791 3.1967 0.96
n = 1500 1.5158 0.1722 0.5319 2.1018 0.95
α n = 25 6.6151 1.6151 2.8147 22.4692 0.99
n = 80 5.9234 0.9234 2.1205 11.6721 0.98
n = 150 5.6044 0.6044 1.7176 7.9471 0.97
n = 400 5.3479 0.3479 1.0629 4.2522 0.97
n = 800 5.1483 0.1483 0.6956 2.7024 0.97
n = 1500 5.0368 0.0368 0.4752 1.8867 0.96
λ n = 25 1.5495 0.0495 0.2718 2.2593 0.99
n = 80 1.5594 0.0594 0.2367 1.3777 0.97
n = 150 1.5397 0.0397 0.2124 1.0284 0.95
n = 400 1.5396 0.0396 0.1434 0.5947 0.94
n = 800 1.5163 0.0163 0.1006 0.4082 0.94
n = 1500 1.5080 0.0081 0.0734 0.2959 0.95
k n = 25 − 0.5616 1.4384 6.9987 61.4781 1
n = 80 − 0.6047 1.8336 5.4006 28.2781 1
n = 150 − 0.6947 1.3053 3.8439 17.0276 0.99
n = 400 − 1.3415 0.6585 2.0352 7.9190 0.98
n = 800 − 1.6746 0.3254 1.1400 4.7119 0.99
n = 1500 − 1.7674 0.2326 0.7742 3.2086 0.95
c n = 25 6.0365 2.0365 5.2700 41.5927 0.98
n = 80 5.7615 1.7615 4.2515 21.6274 0.97
n = 150 5.2363 1.2363 3.2890 14.0813 0.95
n = 400 4.7095 0.7095 1.9105 7.0457 0.98
n = 800 4.3271 0.3271 1.1555 4.2893 0.97
n = 1500 4.1631 0.1631 0.7403 2.9448 0.96

Table 2.

Results of Monte Carlo simulations α=2,β=1.5,c=2.5,k=2,λ=3

Parameter Sample size ME AVB RMSE AW CP
β n = 25 2.9043 28.2944 33.9312 398.123 0.97
n = 80 2.8575 19.1266 26.1717 333.010 0.99
n = 150 2.8018 14.6778 20.2345 271.750 0.99
n = 400 2.2214 5.9548 18.6956 99.3318 0.99
n = 800 2.0757 3.6372 10.4671 55.8275 1
n = 1500 2.0091 2.3331 5.7627 28.5450 0.99
α n = 25 5.3867 3.3870 4.5350 18.6944 1
n = 80 4.0405 2.0405 3.2464 10.4061 0.97
n = 150 3.1945 1.1945 2.3676 7.1001 0.97
n = 400 2.2856 0.2856 0.8372 2.9608 0.96
n = 800 2.0716 0.0716 0.4053 1.8213 0.98
n = 1500 2.0002 0.0002 0.2412 1.2770 0.98
λ n = 25 4.8666 1.8660 2.5276 10.7823 0.77
n = 80 4.7721 1.7721 2.4828 9.6358 0.78
n = 150 4.4782 1.4783 2.2842 9.1945 0.95
n = 400 3.8396 0.8396 1.5297 6.9524 0.95
n = 800 3.5479 0.5479 1.0833 5.0864 0.97
n = 1500 3.4106 0.4106 0.8283 3.5243 0.99
k n = 25 4.0900 2.0900 8.7545 90.8129 0.98
n = 80 4.6124 2.6123 7.9091 49.9287 0.97
n = 150 4.5187 2.5187 7.5799 36.1367 0.86
n = 400 2.8963 0.8963 2.8955 14.2318 0.99
n = 800 2.5846 0.5846 1.6062 8.4314 0.99
n = 1500 2.4372 0.4373 1.0703 5.5644 0.99
c n = 25 6.2989 3.7989 7.0401 47.6068 0.99
n = 80 5.5586 3.0586 6.0645 27.1930 0.99
n = 150 4.7446 2.2446 5.2725 18.9548 0.99
n = 400 3.2178 0.7178 1.9232 6.8753 1
n = 800 2.8741 0.3741 0.9428 3.7113 1
n = 1500 2.7351 0.2351 0.5228 2.3270 0.99

Remark 3

Simulations were also conducted for other combinations of parameter values, namely α = 4, β = 4.5, c = 2, k = 0, λ = 0.5; α = 2.5, β = 3, c = 5, k = −5, λ = 1; and α = 4, β = 2.5, c = 1.5, k = 5, λ = 2.5. The results followed a pattern similar to that in Tables 1 and 2. To conserve space, they are not reported.

Applications

The EGuWL distribution will be applied to fit the daily number of reported infections from the COVID-19 pandemic in Nigeria. Five other data sets will also be used to demonstrate its flexibility. The fit of the EGuWL distribution will be compared with those of other models in its class.

  • (i)

    Application to Nigeria’s COVID-19 data

    For the first application, the EGuWL distribution is used to fit the daily number of reported infections from the COVID-19 pandemic in Nigeria over a seven-month period (20th March–19th October, 2020). The data set was obtained from the website of the Nigeria Centre for Disease Control (NCDC) at http://covid19.ncdc.gov.ng/. The data set is unimodal, right-skewed and platykurtic (skewness = 0.4671, excess kurtosis = −0.8916). The data set is contained in Table 3.

The Weibull (W), the exponentiated Gumbel (EGu) [22], the beta exponential (BE) [26], the beta generalized exponential (BGE) [27] and the Gumbel–Weibull {logistic} (GuWL) [28] distributions are also used to fit the data and their fits are compared with that of the EGuWL distribution. The BE, BGE and GuWL densities are given respectively by

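$$f_{BE}(x)=\frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\lambda e^{-b\lambda x}\left(1-e^{-\lambda x}\right)^{a-1},\quad x,a,b,\lambda>0,$$
$$f_{BGE}(x)=\frac{\beta\,\Gamma(a+b)}{\lambda\,\Gamma(a)\Gamma(b)}e^{-x/\lambda}\left(1-e^{-x/\lambda}\right)^{a\beta-1}\left[1-\left(1-e^{-x/\lambda}\right)^{\beta}\right]^{b-1},\quad x,a,b,\beta,\lambda>0,$$
$$f_{GuWL}(x)=\frac{\alpha e^{k/c}}{\lambda c}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{(x/\lambda)^{\alpha}}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1-1/c}\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\},\quad x,\alpha,c,\lambda>0,\ -\infty<k<\infty.$$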

The results from fitting the COVID-19 data which include the estimate of the parameters, the standard errors of these estimated parameters, the loglikelihood (loglik) values, the Akaike Information Criterion (AIC) values and the Kolmogorov–Smirnov (K–S) statistic values (the corresponding p values are also reported) of all the fitted distributions are reported in Table 4. Figure 8 shows the graph of all the fitted densities alongside the histogram of the data. The results in Table 4 clearly show that the EGuWL distribution provided the best fit for the data by possessing the smallest AIC value as well as the highest p value of the K–S statistic.
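The two comparison criteria can be computed as sketched below. The AIC uses the standard definition 2p − 2 log L with p the number of fitted parameters, which reproduces the AIC values in Table 4 from the reported log-likelihoods. The function names and the three-point sample in the usage example are illustrative assumptions.

```python
def aic(loglik, n_params):
    """Akaike Information Criterion: 2p - 2 log L."""
    return 2.0 * n_params - 2.0 * loglik

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and a fitted cdf."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))

# Log-likelihoods reported in Table 4: Weibull (2 parameters), EGuWL (5 parameters).
print(aic(-1420.29, 2), aic(-1403.23, 5))

# Hypothetical sample compared against the uniform(0, 1) cdf, for illustration only.
print(ks_statistic([0.4, 0.1, 0.7], lambda x: x))
```

A smaller AIC and a smaller K–S distance (equivalently, a larger K–S p value) both indicate a better fit.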

Table 3.

Daily number of infections from COVID-19 (20th March–19th October, 2020)

4, 4, 10, 8, 10, 4, 7, 14, 5, 19, 22, 20, 8, 35, 10, 25, 5, 18, 6, 16, 22, 14, 17, 13, 5, 20, 30, 34, 35, 51, 48, 86, 38, 117, 91, 108, 114, 87, 91, 64, 195, 196, 204, 238, 220, 170, 245,148, 195, 381, 386, 239, 248, 242, 146, 184, 193, 288, 176, 388, 216, 226, 284, 339, 245, 265, 313, 229, 276, 389, 182, 387, 553, 307, 416, 241, 348, 350, 328, 389, 260, 315, 663, 409, 681, 627, 501, 403, 573, 490, 587, 745, 667, 661, 436, 675, 452, 649, 594, 684, 779, 490, 566, 561, 790, 626, 454, 603, 544, 575, 503, 460, 499, 575, 664, 571, 595, 463, 643, 595, 600, 653, 556, 562, 576, 543, 604, 591, 438, 555, 648, 624, 404, 481, 462, 386, 304, 288, 304, 457, 354, 443, 453, 437, 290, 423, 453, 373, 329, 325, 298, 417, 410, 593, 476, 340, 601, 322, 321, 252, 221, 296, 160, 250, 138, 143, 239, 216, 125, 162, 100, 155, 296, 176, 197, 188, 160, 79, 132, 90, 126, 131, 221, 189, 97, 195, 176, 111, 125, 213, 136, 126, 136, 187, 201, 153, 126, 160, 58, 120, 118, 155, 103, 151, 111, 163, 164, 225, 179, 148, 212, 113, 133, 118

Table 4.

Maximum likelihood fit of the COVID-19 data

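| Distribution | W | BE | EGu | BGE | GuWL | EGuWL |
| --- | --- | --- | --- | --- | --- | --- |
| Parameter estimates | $\hat{\alpha}$ = 1.2085 (0.0693) | $\hat{a}$ = 0.9643 (0.1632) | $\hat{\beta}$ = 0.0773 (0.0053) | $\hat{a}$ = 3.5162 (50.047) | $\hat{\alpha}$ = 1.6014 (0.0943) | $\hat{\alpha}$ = 1.6073 (0.4738) |
| | $\hat{\lambda}$ = 303.53 (17.926) | $\hat{b}$ = 0.0809 (0.0174) | $\hat{c}$ = 21.440 (0.0020) | $\hat{b}$ = 0.8909 (1.9056) | $\hat{c}$ = 5.0092 (0.6222) | $\hat{\beta}$ = 24.317 (34.992) |
| | | $\hat{\lambda}$ = 0.0430 (0.0087) | $\hat{k}$ = 11.008 (0.0573) | $\hat{\beta}$ = 0.3231 (4.7859) | $\hat{k}$ = 1.9668 (0.4694) | $\hat{c}$ = 10.240 (3.4696) |
| | | | | $\hat{\lambda}$ = 234.86 (488.64) | $\hat{\lambda}$ = 114.45 (3.7350) | $\hat{k}$ = 14.289 (7.4607) |
| | | | | | | $\hat{\lambda}$ = 205.27 (79.159) |
| Log-likelihood | −1420.29 | −1425.26 | −1432.10 | −1423.82 | −1405.78 | −1403.23 |
| AIC | 2844.58 | 2856.52 | 2870.20 | 2855.64 | 2819.56 | 2816.46 |
| K–S | 0.0738 | 0.1175 | 0.0973 | 0.0848 | 0.0629 | 0.0622 |
| p value | 0.1845 | 0.0050 | 0.0324 | 0.0872 | 0.351 | 0.3636 |

(Standard errors of the estimates in parentheses)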

Fig. 8. Graph of the fitted densities for the COVID-19 data

  • (ii)

    Application to Aluminum Coupons data

For the second application, the EGuWL distribution is used to fit the fatigue time of 101 6061-T6 Aluminum Coupons cut parallel to the direction of rolling and oscillated at 18 cycles per second (cps). The data set was reported in [29] and presented in Table 5. The data set is unimodal, right-skewed and leptokurtic (Skewness = 0.3355 and excess kurtosis = 1.1687). The beta normal (BN) [6], the beta Weibull (BW) [30], the beta Burr XII (BBXII) [31], Gumbel–Burr XII {logistic} (GuBXIIL) [32] and the GuWL distributions are also used to fit the data set and their fits are compared with that of the EGuWL distribution. The BN, BW, BBXII and the GuBXIIL densities are given respectively by

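$$f_{BN}(x)=\frac{\Gamma(a+b)}{c\,\Gamma(a)\Gamma(b)}\phi\left(\frac{x-k}{c}\right)\left[\Phi\left(\frac{x-k}{c}\right)\right]^{a-1}\left[1-\Phi\left(\frac{x-k}{c}\right)\right]^{b-1},\quad a,b,c>0,\ -\infty<x,k<\infty,$$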

Table 5.

Fatigue time of 101 6061-T6 Aluminum Coupons

70, 90, 96, 97, 99, 100, 103, 104,104,105,107,108, 108, 108,109, 109, 112, 112,113, 114, 114, 114, 116, 119, 120, 120,120, 121, 121, 123, 124, 124, 124, 124, 124, 128, 128, 129,129, 130, 130, 130, 131, 131, 131, 131, 131, 132, 132, 132,133, 134, 134, 134, 134, 134, 136, 136, 137, 138, 138, 138,139, 139, 141, 141, 142, 142, 142, 142, 142, 142, 144, 144,145, 146, 148, 148, 149, 151, 151, 152, 155, 156, 157, 157,157, 157, 158, 159, 162, 163, 163, 164, 166, 166, 168, 170,174, 196, 212

where $\phi(\cdot)$ and $\Phi(\cdot)$ are the pdf and cdf of the standard normal distribution respectively,

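$$f_{BW}(x)=\frac{\alpha\,\Gamma(a+b)}{\lambda\,\Gamma(a)\Gamma(b)}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{-b(x/\lambda)^{\alpha}}\left(1-e^{-(x/\lambda)^{\alpha}}\right)^{a-1},\quad x,\alpha,a,b,\lambda>0,$$
$$f_{BBXII}(x)=\frac{\alpha c\,\Gamma(a+b)}{\lambda\,\Gamma(a)\Gamma(b)}\left(\frac{x}{\lambda}\right)^{\alpha-1}\left[1+\left(\frac{x}{\lambda}\right)^{\alpha}\right]^{-bc-1}\left\{1-\left[1+\left(\frac{x}{\lambda}\right)^{\alpha}\right]^{-c}\right\}^{a-1},\quad x,\alpha,a,b,c,\lambda>0,$$
$$f_{GuBXIIL}(x)=\frac{\alpha\beta e^{k/c}}{\lambda c}\left(\frac{x}{\lambda}\right)^{\alpha-1}\left[1+\left(\frac{x}{\lambda}\right)^{\alpha}\right]^{\beta-1}\left\{\left[1+\left(\frac{x}{\lambda}\right)^{\alpha}\right]^{\beta}-1\right\}^{-1-1/c}\exp\left\{-e^{k/c}\left(\left[1+\left(\frac{x}{\lambda}\right)^{\alpha}\right]^{\beta}-1\right)^{-1/c}\right\},\quad x,\alpha,\beta,c,\lambda>0,\ -\infty<k<\infty.$$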

The results from fitting the Aluminum Coupons data, which include the estimates of the parameters, the standard errors of these estimates, the log-likelihood (loglik) values, the Akaike Information Criterion (AIC) values and the Kolmogorov–Smirnov (K–S) statistic values (the corresponding p values are also reported) of all the fitted distributions, are reported in Table 6. Figure 9 shows the graph of all the fitted densities alongside the histogram of the data. The results in Table 6 show that the EGuWL distribution provided the best fit for the data, possessing the smallest AIC value as well as the highest p value of the K–S statistic.

  • (iii)

    Application to the Kevlar 49/epoxy strands failure times data (pressure at 70%)

Table 6.

Maximum likelihood fit of the Aluminum Coupons

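| Distribution | BW | BN | GuWL | BBXII | GuBXIIL | EGuWL |
| --- | --- | --- | --- | --- | --- | --- |
| Parameter estimates | $\hat{a}$ = 6.2469 (6.6825) | $\hat{a}$ = 8.1285 (30.488) | $\hat{\alpha}$ = 2.2590 (0.0028) | $\hat{a}$ = 124.921 (97.607) | $\hat{\alpha}$ = 1.0580 (0.5606) | $\hat{\alpha}$ = 2.0623 (1.4468) |
| | $\hat{b}$ = 1.4300 (2.6866) | $\hat{b}$ = 1.6931 (3.4807) | $\hat{c}$ = 4.0733 (0.3075) | $\hat{b}$ = 52.778 (50.597) | $\hat{\beta}$ = 2.2834 (2.3011) | $\hat{\beta}$ = 1.7546 (2.7663) |
| | $\hat{\alpha}$ = 2.7639 (1.7179) | $\hat{c}$ = 44.2964 (64.704) | $\hat{k}$ = 11.025 (0.4289) | $\hat{c}$ = 0.7183 (0.5679) | $\hat{c}$ = 0.4774 (0.4626) | $\hat{c}$ = 3.7812 (5.4660) |
| | $\hat{\lambda}$ = 106.53 (24.282) | $\hat{k}$ = 86.946 (106.44) | $\hat{\lambda}$ = 43.216 (0.0028) | $\hat{\alpha}$ = 1.1465 (0.6420) | $\hat{k}$ = 17.6749 (17.136) | $\hat{k}$ = 9.9375 (11.154) |
| | | | | $\hat{\lambda}$ = 35.835 (54.408) | $\hat{\lambda}$ = 12.4511 (5.9684) | $\hat{\lambda}$ = 44.614 (2.1010) |
| Log-likelihood | −456.67 | −456.88 | −456.61 | −457.90 | −475.18 | −455.59 |
| AIC | 921.34 | 921.75 | 921.21 | 925.80 | 960.35 | 921.18 |
| K–S | 0.0654 | 0.0647 | 0.0750 | 0.0913 | 0.1329 | 0.0611 |
| p value | 0.7550 | 0.7673 | 0.5936 | 0.3482 | 0.0514 | 0.8222 |

(Standard errors of the estimates in parentheses)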

Fig. 9. Graph of the fitted densities for the Aluminum Coupons data

For the third application, the EGuWL distribution is used to fit the Kevlar 49/epoxy strands failure times data (pressure at 70%). The data set was reported in [28]. The data set is multimodal, platykurtic and approximately symmetric (skewness = 0.0998, excess kurtosis = −0.79). The data set is presented in Table 7. The BN, BW, GuWL, beta exponentiated Weibull (BEW) [33] and the Gumbel–Weibull {logistic} Poisson (GuWLP) [12] distributions are also used to fit the data set and their fits are compared with that of the EGuWL distribution. The BEW and the GuWLP densities are given respectively by

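$$f_{BEW}(x)=\frac{\alpha\beta\,\Gamma(a+b)}{\lambda\,\Gamma(a)\Gamma(b)}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{-(x/\lambda)^{\alpha}}\left(1-e^{-(x/\lambda)^{\alpha}}\right)^{a\beta-1}\times\left[1-\left(1-e^{-(x/\lambda)^{\alpha}}\right)^{\beta}\right]^{b-1},\quad x,a,\alpha,b,\beta,\lambda>0,$$
$$f_{GuWLP}(x)=\frac{\beta\alpha e^{k/c}}{\lambda c\left(e^{\beta}-1\right)}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{(x/\lambda)^{\alpha}}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1-1/c}\times\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\times\exp\left\{\beta\left[1-\exp\left\{-e^{k/c}\left(e^{(x/\lambda)^{\alpha}}-1\right)^{-1/c}\right\}\right]\right\},\quad x,\alpha,\lambda,c>0,\ \beta,k\in\mathbb{R}.$$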

Table 7.

Kevlar 49/epoxy strands failure times data (pressure at 70%)

1051, 1337, 1389, 1921, 1942, 2322, 3629, 4006, 4012, 4063, 4921, 5445, 5620, 5817, 5905, 5956, 6068, 6121, 6473, 7501, 7886, 8108, 8546, 8666, 8831, 9106, 9711, 9806, 10205, 10396, 10861, 11026, 11214, 11362, 11604, 11608, 11745, 11762, 11895, 12044, 13520, 13670, 14110, 14496, 15395, 16179, 17092, 17568, 17568

The results from fitting the Kevlar 49/epoxy strands failure times data (pressure at 70%), which include the estimates of the parameters, the standard errors of these estimates, the log-likelihood (loglik) values, the Akaike Information Criterion (AIC) values and the Kolmogorov–Smirnov (K–S) statistic values (the corresponding p values are also reported) of all the fitted distributions, are reported in Table 8. Figure 10 shows the graph of all the fitted densities alongside the histogram of the data. The results in Table 8 show that the EGuWL distribution provided the best fit for the data, possessing the smallest AIC value as well as the highest p value of the K–S statistic.

  • (iv)

    Application to the Kevlar 49/epoxy strands failure times data (pressure at 90%)

Table 8.

Maximum likelihood fit of the Kevlar 49/epoxy strands failure times data (pressure at 70%)

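| Distribution | BW | BN | GuWL | BEW | GuWLP | EGuWL |
| --- | --- | --- | --- | --- | --- | --- |
| Parameter estimates | $\hat{a}$ = 0.4877 (0.1222) | $\hat{a}$ = 0.1150 (0.1489) | $\hat{\alpha}$ = 2.6741 (0.3582) | $\hat{a}$ = 7.9008 (75.977) | $\hat{\alpha}$ = 2.0438 (0.3801) | $\hat{\alpha}$ = 2.6093 (0.0127) |
| | $\hat{b}$ = 0.1183 (0.0189) | $\hat{b}$ = 0.0806 (0.1068) | $\hat{c}$ = 4.1036 (1.0343) | $\hat{b}$ = 0.1498 (0.0479) | $\hat{\beta}$ = −0.606 (2.5157) | $\hat{\beta}$ = 0.2237 (0.1811) |
| | $\hat{\alpha}$ = 2.6980 (0.0423) | $\hat{c}$ = 1087.1 (794.87) | $\hat{k}$ = 1.1546 (0.7454) | $\hat{\beta}$ = 0.0194 (0.4225) | $\hat{c}$ = 3.6367 (1.9862) | $\hat{c}$ = 2.1905 (1.4673) |
| | $\hat{\lambda}$ = 5002.4 (0.0509) | $\hat{k}$ = 7796.1 (1390.6) | $\hat{\lambda}$ = 6116.9 (246.5) | $\hat{\alpha}$ = 2.4884 (0.2836) | $\hat{k}$ = 1.6822 (3.1542) | $\hat{k}$ = −1.558 (1.7711) |
| | | | | $\hat{\lambda}$ = 5000.0 (0.3352) | $\hat{\lambda}$ = 4555.4 (69.076) | $\hat{\lambda}$ = 4611.9 (0.04750) |
| Log-likelihood | −479.49 | −480.41 | −479.49 | −480.0 | −478.86 | −478.40 |
| AIC | 966.97 | 968.81 | 966.97 | 970.0 | 960.35 | 966.80 |
| K–S | 0.0764 | 0.0832 | 0.0742 | 0.0755 | 0.0701 | 0.0607 |
| p value | 0.9165 | 0.8590 | 0.9316 | 0.9227 | 0.9556 | 0.9888 |

(Standard errors of the estimates in parentheses)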

Fig. 10.

Graph of the fitted densities for the Kevlar 49/epoxy strands failure times data (pressure at 70%)

For the fourth application, the EGuWL distribution is used to fit the Kevlar 49/epoxy strands failure times data (pressure at 90%). The data set was reported in [28]. It is unimodal, leptokurtic, and highly right-skewed with a reverse J-shape (skewness = 3.0472, excess kurtosis = 14.4745), and is presented in Table 9. The BN, BW, GuWL, exponentiated Weibull (EW) [5] and GuWLP distributions are also fitted to the data set, and their fits are compared with that of the EGuWL distribution. The EW density is given by

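$$
f_{\mathrm{EW}}(x)=\frac{\alpha\beta}{\lambda}\left(\frac{x}{\lambda}\right)^{\alpha-1}e^{-(x/\lambda)^{\alpha}}\left[1-e^{-(x/\lambda)^{\alpha}}\right]^{\beta-1},\quad x,\alpha,\beta,\lambda>0.
$$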
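The EW density above translates directly into code. The sketch below is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the cdf used is the standard exponentiated Weibull cdf (the antiderivative of the density), and the parameter values in the check are arbitrary.

```python
import math


def ew_cdf(x, alpha, beta, lam):
    """Exponentiated Weibull cdf, (1 - exp(-(x/lam)**alpha))**beta, for x > 0."""
    return (1.0 - math.exp(-((x / lam) ** alpha))) ** beta


def ew_pdf(x, alpha, beta, lam):
    """Exponentiated Weibull density in the parameterization used in the text."""
    u = (x / lam) ** alpha
    return (alpha * beta / lam) * (x / lam) ** (alpha - 1) * math.exp(-u) * (1.0 - math.exp(-u)) ** (beta - 1)


def ew_loglik(data, alpha, beta, lam):
    """Log-likelihood of an i.i.d. sample under the EW model."""
    return sum(math.log(ew_pdf(x, alpha, beta, lam)) for x in data)


# Sanity check with arbitrary parameters: the density should match the
# numerical derivative of the cdf.
x0, h = 1.3, 1e-6
numeric = (ew_cdf(x0 + h, 2.0, 1.5, 1.0) - ew_cdf(x0 - h, 2.0, 1.5, 1.0)) / (2 * h)
print(abs(numeric - ew_pdf(x0, 2.0, 1.5, 1.0)) < 1e-6)
```

Maximizing `ew_loglik` over (alpha, beta, lam) with a numerical optimizer is one way to obtain maximum likelihood estimates of the kind reported for the EW distribution.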

Table 9.

Kevlar 49/epoxy strands failure times data (pressure at 90%)

0.01, 0.01, 0.02, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.07, 0.07, 0.08, 0.09, 0.09, 0.10, 0.10, 0.11, 0.11, 0.12, 0.13, 0.18, 0.19, 0.20, 0.23, 0.24, 0.24, 0.29, 0.34, 0.35, 0.36, 0.38, 0.40, 0.42, 0.43, 0.52, 0.54, 0.56, 0.60, 0.60, 0.63, 0.65, 0.67, 0.68, 0.72, 0.72, 0.72, 0.73, 0.79, 0.79, 0.80, 0.80, 0.83, 0.85, 0.90, 0.92, 0.95, 0.99, 1.00, 1.01, 1.02, 1.03, 1.05, 1.10, 1.10, 1.11, 1.15, 1.18, 1.20, 1.29, 1.31, 1.33, 1.34, 1.40, 1.43, 1.45, 1.50, 1.51, 1.52, 1.53, 1.54, 1.54, 1.55, 1.58, 1.60, 1.63, 1.64, 1.80, 1.80, 1.81, 2.02, 2.05, 2.14, 2.17, 2.33, 3.03, 3.03, 3.34, 4.20, 4.69, 7.89

Table 10 reports the results of fitting the Kevlar 49/epoxy strands failure times data (pressure at 90%): the parameter estimates and their standard errors, the log-likelihood (loglik) values, the Akaike Information Criterion (AIC) values, and the Kolmogorov–Smirnov (K–S) statistics with their corresponding p values, for all the fitted distributions. Figure 11 shows the fitted densities alongside the histogram of the data. The results in Table 10 show that the EGuWL distribution provided the best fit for the data, possessing the smallest AIC value as well as the highest p value of the K–S statistic.
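The AIC values in Table 10 follow from the reported log-likelihoods through AIC = 2k − 2 loglik, where k is the number of estimated parameters. The sketch below recomputes them; the parameter counts are read off the number of estimates listed for each distribution, and the variable names are illustrative.

```python
# (log-likelihood, number of estimated parameters) as reported in Table 10.
fits = {
    "BW": (-102.17, 4),
    "BN": (-129.81, 4),
    "GuWL": (-100.94, 4),
    "EW": (-102.79, 3),
    "GuWLP": (-100.16, 5),
    "EGuWL": (-99.80, 5),
}


def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * loglik


aic_values = {name: aic(ll, k) for name, (ll, k) in fits.items()}
best = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(f"{name:6s} AIC = {value:.2f}")
print("smallest AIC:", best)
```

Small differences in the second decimal place relative to the table are expected, because the log-likelihoods are themselves rounded to two decimals.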

  • (v)

    Application to the Australian Athletes' Height Data

Table 10.

Maximum likelihood fit of the Kevlar 49/epoxy strands failure times data (pressure at 90%)

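| Distribution | BW | BN | GuWL | EW | GuWLP | EGuWL |
|---|---|---|---|---|---|---|
| Parameter estimates | $\hat{a}$ = 0.7609 (0.1240) | $\hat{a}$ = 10.590 (5.3031) | $\hat{\alpha}$ = 30.9196 (0.0888) | $\hat{\beta}$ = 0.7932 (0.2870) | $\hat{\alpha}$ = 0.8861 (0.2282) | $\hat{\alpha}$ = 0.9413 (0.0027) |
| | $\hat{b}$ = 0.2157 (0.0241) | $\hat{b}$ = 0.0949 (0.0594) | $\hat{c}$ = 3.2739 (0.6359) | $\hat{\alpha}$ = 1.0602 (0.2398) | $\hat{\beta}$ = −0.596 (3.2028) | $\hat{\beta}$ = 0.7281 (0.3845) |
| | $\hat{\alpha}$ = 1.0513 (0.0027) | $\hat{c}$ = 0.4639 (0.1702) | $\hat{k}$ = 1.9375 (0.8810) | $\hat{\lambda}$ = 1.2176 (0.3932) | $\hat{c}$ = 2.9498 (1.8805) | $\hat{c}$ = 3.1163 (1.0502) |
| | $\hat{\lambda}$ = 0.2588 (0.0027) | $\hat{k}$ = −0.846 (0.2472) | $\hat{\lambda}$ = 0.2068 (0.0713) | | $\hat{k}$ = 1.3918 (3.3291) | $\hat{k}$ = 1.6064 (1.5398) |
| | | | | | $\hat{\lambda}$ = 0.1992 (0.1150) | $\hat{\lambda}$ = 0.1736 (0.0027) |
| Log-likelihood | −102.17 | −129.81 | −100.94 | −102.79 | −100.16 | −99.80 |
| AIC | 212.34 | 267.62 | 209.88 | 211.57 | 210.31 | 209.59 |
| K–S | 0.0784 | 0.1219 | 0.0689 | 0.0844 | 0.0683 | 0.0629 |
| p value | 0.5385 | 0.0913 | 0.6983 | 0.4433 | 0.7078 | 0.7953 |

(Standard errors of the estimates in parentheses)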

Fig. 11.

Graph of the fitted densities for the Kevlar 49/epoxy strands failure times data (pressure at 90%)

For the fifth application, the EGuWL distribution is used to fit the heights (in centimeters) of 100 female Australian athletes. The data set was collected by the Australian Institute of Sport and reported in [28]. The data set is unimodal, leptokurtic, and left-skewed (skewness = − 0.5684, excess kurtosis = 1.3212). The data set is presented in Table 11. The BN, GuWL, EW, Weibull–Pareto {exponential} (WPE) [34] and the beta skew normal (BSN) [35] distributions are also used to fit the data set and their fits are compared with that of the EGuWL distribution. The WPE and the BSN densities are given by

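$$
f_{\mathrm{WPE}}(x)=\frac{\beta c}{x}\left[\beta\log\left(\frac{x}{\lambda}\right)\right]^{c-1}\exp\left\{-\left[\beta\log\left(\frac{x}{\lambda}\right)\right]^{c}\right\},\quad x>\lambda,\ c,\beta,\lambda>0,
$$

$$
f_{\mathrm{BSN}}(x)=\frac{2\,\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\,\phi(z)\,\Phi(\alpha z)\left[\Phi(z;\alpha)\right]^{a-1}\left[1-\Phi(z;\alpha)\right]^{b-1},
$$

$$
z=\frac{x-k}{c},\quad a,b,c>0,\quad -\infty<x,\alpha,k<\infty,\quad \Phi(z;\alpha)=\Phi(z)-2T(z,\alpha).
$$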

Table 11.

Australian Athletes' Height Data

148.9, 149.0, 156.0, 156.9, 157.9, 158.9, 162.0, 162.0, 162.5, 163.0, 163.9, 165.0, 166.1, 166.7, 167.3, 167.9, 168.0, 168.6, 169.1, 169.8, 169.9, 170.0, 170.0, 170.3, 170.8, 171.1, 171.4, 171.4, 171.6, 171.7, 172.0, 172.2, 172.3, 172.5, 172.6, 172.7, 173.0, 173.3, 173.3, 173.5, 173.6, 173.7, 173.8, 174.0, 174.0, 174.0, 174.1, 174.1, 174.4, 175.0, 175.0, 175.0, 175.3, 175.6, 176.0, 176.0, 176.0, 176.0, 176.8, 177.0, 177.3, 177.3, 177.5, 177.5, 177.8, 177.9, 178.0, 178.2, 178.7, 178.9, 179.3, 179.5, 179.6, 179.6, 179.7, 179.7, 179.8, 179.9, 180.2, 180.2, 180.5, 180.5, 180.9, 181.0, 181.3, 182.1, 182.7, 183.0, 183.3, 183.3, 184.6, 184.7, 185.0, 185.2, 186.2, 186.3, 188.7, 189.7, 193.4, 195.9

In the BSN density, ϕ(·) and Φ(·) are the pdf and cdf of the standard normal distribution, respectively, and T(·,·) is Owen's T function.
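The WPE density given above has a closed-form cdf and quantile function, which makes it easy to inspect a fitted model. The sketch below is illustrative: the cdf is obtained by integrating the WPE density, the function names are hypothetical, and the parameter values are the WPE estimates reported for the Heights data in Table 12.

```python
import math


def wpe_cdf(x, c, beta, lam):
    """cdf obtained by integrating the WPE density; valid for x > lam."""
    return 1.0 - math.exp(-((beta * math.log(x / lam)) ** c))


def wpe_quantile(p, c, beta, lam):
    """Inverse of wpe_cdf for 0 < p < 1."""
    return lam * math.exp((-math.log(1.0 - p)) ** (1.0 / c) / beta)


# WPE estimates reported for the Heights data (Table 12).
c_hat, beta_hat, lam_hat = 8.1892, 2.8428, 125.11

median = wpe_quantile(0.5, c_hat, beta_hat, lam_hat)
print(f"fitted WPE median height: {median:.1f} cm")

# Round trip: the cdf evaluated at a quantile returns the probability.
print(abs(wpe_cdf(median, c_hat, beta_hat, lam_hat) - 0.5) < 1e-9)
```

The fitted median comes out at about 175 cm, which sits near the centre of the heights listed in Table 11. The same quantile function can be used to simulate from the WPE model by inversion.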

Table 12 reports the results of fitting the Heights data: the parameter estimates and their standard errors, the log-likelihood (loglik) values, the Akaike Information Criterion (AIC) values, and the Kolmogorov–Smirnov (K–S) statistics with their corresponding p values, for all the fitted distributions. Figure 12 shows the fitted densities alongside the histogram of the data. The results in Table 12 show that the EGuWL distribution provided the best fit for the data, possessing the smallest AIC value as well as the highest p value of the K–S statistic.

  • (vi)

    Application to Australian Athletes’ sum of skin folds data

Table 12.

Maximum likelihood fit of the Heights data

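| Distribution | WPE | BN | GuWL | EW | BSN | EGuWL |
|---|---|---|---|---|---|---|
| Parameter estimates | $\hat{c}$ = 8.1892 (3.3757) | $\hat{a}$ = 0.9773 (1.3802) | $\hat{\alpha}$ = 12.433 (0.0400) | $\hat{\beta}$ = 2.7803 (1.3356) | $\hat{a}$ = 0.9419 (1.1986) | $\hat{\alpha}$ = 12.358 (0.0027) |
| | $\hat{\beta}$ = 2.8428 (1.1247) | $\hat{b}$ = 8.3144 (25.382) | $\hat{c}$ = 3.5464 (0.2775) | $\hat{\alpha}$ = 14.748 (3.2334) | $\hat{b}$ = 7.5836 (19.702) | $\hat{\beta}$ = 0.9717 (0.3854) |
| | $\hat{\lambda}$ = 125.11 (17.253) | $\hat{c}$ = 13.280 (14.537) | $\hat{k}$ = 6.3124 (0.3764) | $\hat{\lambda}$ = 170.28 (4.5382) | $\hat{c}$ = 10.129 (12.262) | $\hat{c}$ = 4.1148 (0.9673) |
| | | $\hat{k}$ = 193.97 (29.487) | $\hat{\lambda}$ = 148.84 (0.0547) | | $\hat{k}$ = 100.19 (129.59) | $\hat{k}$ = 7.5580 (1.5104) |
| | | | | | $\hat{\alpha}$ = 4.1711 (12.143) | $\hat{\lambda}$ = 146.48 (0.0027) |
| Log-likelihood | −351.49 | −350.30 | −350.14 | −351.44 | −350.30 | −349.02 |
| AIC | 708.97 | 708.60 | 708.28 | 708.89 | 710.60 | 708.04 |
| K–S | 0.0801 | 0.0721 | 0.0587 | 0.0711 | 0.0722 | 0.0534 |
| p value | 0.5171 | 0.6489 | 0.8607 | 0.6662 | 0.6472 | 0.9230 |

(Standard errors of the estimates in parentheses)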

Fig. 12.

Graph of the fitted densities for the Heights data

For the last application, the EGuWL distribution is used to fit the sum of skin folds of 100 female Australian athletes. The data set was collected by the Australian Institute of Sport and reported in [28]. It is unimodal, leptokurtic, and right-skewed (skewness = 0.7878, excess kurtosis = 0.7320), and is presented in Table 13. The BN, GuWL, WPE, EW and BW distributions are also fitted to the data set, and their fits are compared with that of the EGuWL distribution. Table 14 reports the results of fitting the sum of skin folds data: the parameter estimates and their standard errors, the log-likelihood (loglik) values, the Akaike Information Criterion (AIC) values, and the Kolmogorov–Smirnov (K–S) statistics with their corresponding p values, for all the fitted distributions. Figure 13 shows the fitted densities alongside the histogram of the data. The results in Table 14 show that the EGuWL distribution attains the largest log-likelihood, the smallest K–S statistic and the highest p value of the K–S statistic. Its AIC value (980.47) is smaller than those of the BW, BN, GuWL and EW distributions, although the three-parameter WPE distribution has a smaller AIC value (978.13).

Table 13.

Australian Athletes' sum of skin folds data

33.8, 36.8, 38.2, 41.1, 41.6, 42.3, 43.5, 43.5, 46.1, 46.2, 46.3, 47.5, 47.6, 48.4, 49.0, 49.9, 50.0, 52.5, 52.6, 54.6, 54.6, 55.6, 56.8, 57.9, 58.9, 59.4, 61.9, 62.6, 62.9, 65.1, 67.0, 68.3, 68.9, 69.9, 70.0, 71.3, 71.6, 73.9, 74.7, 74.9, 75.1, 75.2, 76.2, 76.8, 77.0, 80.1, 80.3, 80.3, 80.3, 80.6, 83.0, 87.2, 88.2, 89.0, 90.2, 90.4, 91.0, 91.2, 95.4, 96.8, 97.2, 97.9, 98.0, 98.1, 98.3, 98.5, 99.8, 99.9, 101.1, 102.8, 102.8, 103.6, 103.6, 104.6, 106.9, 109.0, 109.1, 109.5, 109.6, 110.2, 110.7, 111.1, 113.5, 114.0, 115.9, 117.8, 122.1, 123.6, 125.9, 126.4, 126.4, 131.9, 136.3, 143.5, 148.9, 156.6, 156.6, 171.1, 181.7, 200.8

Table 14.

Maximum likelihood fit of the sum of skin folds data

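| Distribution | BW | BN | GuWL | EW | WPE | EGuWL |
|---|---|---|---|---|---|---|
| Parameter estimates | $\hat{a}$ = 4.3509 (1.0637) | $\hat{a}$ = 9.7706 (0.4008) | $\hat{\alpha}$ = 2.2868 (0.9179) | $\hat{\beta}$ = 6.5077 (6.3889) | $\hat{c}$ = 3.6318 (1.0579) | $\hat{\alpha}$ = 2.7590 (0.0052) |
| | $\hat{b}$ = 0.1641 (0.0176) | $\hat{b}$ = 0.1967 (0.0223) | $\hat{c}$ = 1.3310 (0.3218) | $\hat{\alpha}$ = 1.2244 (0.4407) | $\hat{\beta}$ = 0.7183 (0.1801) | $\hat{\beta}$ = 0.1792 (0.0180) |
| | $\hat{\alpha}$ = 1.9999 (0.0026) | $\hat{c}$ = 25.4309 (0.9049) | $\hat{k}$ = 0.0459 (1.3030) | $\hat{\lambda}$ = 41.487 (24.646) | $\hat{\lambda}$ = 23.0468 (7.6267) | $\hat{c}$ = 0.6422 (0.0052) |
| | $\hat{\lambda}$ = 33.310 (0.0026) | $\hat{k}$ = 9.1517 (4.3612) | $\hat{\lambda}$ = 81.639 (27.344) | | | $\hat{k}$ = −0.916 (0.0052) |
| | | | | | | $\hat{\lambda}$ = 65.951 (0.0052) |
| Log-likelihood | −486.25 | −487.06 | −486.28 | −487.27 | −486.07 | −485.23 |
| AIC | 980.50 | 982.10 | 980.55 | 980.54 | 978.13 | 980.47 |
| K–S | 0.0725 | 0.0711 | 0.0704 | 0.0809 | 0.0825 | 0.0598 |
| p value | 0.6424 | 0.6925 | 0.6778 | 0.5042 | 0.4782 | 0.8463 |

(Standard errors of the estimates in parentheses)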

Fig. 13.

Graph of the fitted densities for the sum of skin folds data

Summary and Conclusion

A new flexible probability distribution called the exponentiated Gumbel–Weibull {logistic} distribution has been defined and studied in this paper. The new distribution has been applied in modeling the daily number of infections from the COVID-19 pandemic in Nigeria. Five other data sets, which exhibit various shape and tail behaviors, have been used to further demonstrate the flexibility of the new distribution. The performance of the distribution in fitting the various data sets has been compared with that of other probability distributions in its class, and the results obtained showed that the new distribution gave the best fits. We hope the new distribution will attract further usage in fitting data sets from other fields.

Author Contributions

The first draft of the manuscript was written by Patrick Osatohanmwen and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding

No funding was received for conducting this study.

Declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Ethics approval

Ethical standards as recommended by the journal and in line with global best practices have been followed in the course of writing the article as well as in the reporting of the results contained therein.

Data Availability

All data used in the article and in the generation of results are contained in the body of the article and, where necessary, URL addresses have been provided to access them.

Code Availability

The codes used in the article can be obtained upon request from the corresponding author.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Olson DL, Shi Y (2007) Introduction to business data mining. McGraw-Hill/Irwin, New York
  • 2. Shi Y, Tian YJ, Kou G, Peng Y, Li JP (2011) Optimization based data mining: theory and applications. Springer, Berlin
  • 3. Tien JM (2017) Internet of things, real-time decision making, and artificial intelligence. Ann Data Sci 4(2):149–178
  • 4. Azzalini A (1985) A class of distributions which includes the normal ones. Scand J Stat 12:171–178
  • 5. Mudholkar GS, Srivastava DK (1993) Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans Reliab 42:299–302. doi: 10.1109/24.229504
  • 6. Eugene N, Lee C, Famoye F (2002) Beta-normal distribution and its applications. Commun Stat Theory Methods 31:497–512. doi: 10.1081/STA-120003130
  • 7. Shaw WT, Buckley IR (2009) The alchemy of probability distributions: beyond Gram-Charlier expansions and a skew-kurtotic-normal distribution from a rank transmutation map. arXiv:0901.0434
  • 8. Cordeiro GM, de Castro M (2011) A new family of generalized distributions. J Stat Comput Simul 81:883–898. doi: 10.1080/00949650903530745
  • 9. Cordeiro GM, Ortega EMM, da Cunha DCC (2013) The exponentiated generalized class of distributions. J Data Sci 11:1–27
  • 10. Alzaatreh A, Lee C, Famoye F (2014) T-normal family of distributions: a new approach to generalize the normal distribution. J Stat Distrib Appl 1:16
  • 11. Osatohanmwen P, Oyegue FO, Ajibade B, Ewere F (2020) A new generalized family of distributions on the unit interval: the T-Kumaraswamy family of distributions. J Data Sci 18(2):218–236
  • 12. Osatohanmwen P, Oyegue FO, Ogbonmwan SM (2020) The T–R {Y} power series family of probability distributions. J Egypt Math Soc 28:29. doi: 10.1186/s42787-020-00083-7
  • 13. Liu Z, Magal P, Seydi O, Webb G (2020) Predicting the cumulative number of cases for the COVID-19 epidemic in China from early data. arXiv:2002.12298v1
  • 14. Roosa K, Lee Y, Luo R, Kirpich A, Rothenberg A, Hyman JM, Yan P, Chowell G (2020) Real-time forecasts of the COVID-19 epidemic in China from February 5th to February 24th, 2020. Infect Dis Model 5:256–263. doi: 10.1016/j.idm.2020.02.002
  • 15. Tang B, Bragazzi NL, Li Q, Tang S, Xiao Y, Wu J (2020) An updated estimation of the risk of transmission of the novel corona virus (2019-nCov). Infect Dis Model 5:248–255. doi: 10.1016/j.idm.2020.02.001
  • 16. Tang B, Wang X, Li Q, Bragazzi NL, Tang S, Xiao Y, Wu J (2020) Estimation of the transmission risk of the 2019-nCov and its implication for public health intervention. J Clin Med 9(2):462
  • 17. Wu JT, Leung K, Leung GM (2020) Nowcasting and forecasting the potential domestic and international spread of the 2019-nCov outbreak originating in Wuhan, China: a modelling study. Lancet 395:689–697. doi: 10.1016/s0140-6736(20)30260-9
  • 18. Osatohanmwen P, Oyegue FO, Ogbonmwan SM (2020) Modeling the daily number of reported cases of infection from the COVID-19 pandemic in Nigeria: a stochastic approach. Earthline J Math Sci 5(2):217–235. doi: 10.34198/ejms.5221.217235
  • 19. Guan C, Liu W, Cheng JYC (2021) Using social media to predict the stock market crash and rebound amid the pandemic: the digital 'Haves' and 'Have-mores'. Ann Data Sci. doi: 10.1007/s40745-021-00353-w
  • 20. Li J, Guo K, Herrera Viedma E, Lee H, Liu J, Zhong Z, Gomes L, Filip FG, Fang SC, Özdemir MS, Liu XH, Lu G, Shi Y (2020) Culture vs policy: more global collaboration to effectively combat COVID-19. Innovation. doi: 10.1016/j.xinn.2020.100023
  • 21. Liu Y, Gu Z, Xia S, Shi B, Zhou X, Shi Y, Liu J (2020) What are the underlying transmission patterns of COVID-19 outbreak? An age-specific social contact characterization. EClinicalMedicine 22:100354
  • 22. Nadarajah S (2006) The exponentiated Gumbel distribution with climate application. Environmetrics 17:13–23. doi: 10.1002/env.739
  • 23. Galton F (1883) Enquiries into human faculty and its development. Macmillan and Company, London
  • 24. Moors JJA (1988) A quantile alternative for kurtosis. Statistician 37:25–32
  • 25. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J 27:379–423
  • 26. Nadarajah S, Kotz S (2006) The beta exponential distribution. Reliab Eng Syst Saf 91:689–697. doi: 10.1016/j.ress.2005.05.008
  • 27. Barreto-Souza W, Santos AHS, Cordeiro GM (2010) The beta generalized exponential distribution. J Stat Comput Simul 80:159–172. doi: 10.1080/00949650802552402
  • 28. Al-Aqtash R, Lee C, Famoye F (2014) Gumbel-Weibull distribution: properties and application. J Mod App Stat Method 13:201–225. doi: 10.22237/jmasm/1414815000
  • 29. Birnbaum ZW, Saunders SC (1969) A new family of life distributions. J App Prob 6:637–652. doi: 10.2307/3212003
  • 30. Famoye F, Lee C, Olumolade O (2005) The beta-Weibull distribution. J Stat Theory Appl 4:121–136
  • 31. Paranaiba PF, Ortega EMM, Cordeiro GM, Pescim R (2011) The beta Burr XII distribution with application to lifetime data. Comput Stat Data Anal 55:1118–1136. doi: 10.1016/j.csda.2010.09.009
  • 32. Osatohanmwen P, Oyegue FO, Ogbonmwan SM (2019) A new member from the T-X family of distributions: the Gumbel-Burr XII distribution and its properties. Sankhya A 81:298–322. doi: 10.1007/s13171-017-0110-x
  • 33. Cordeiro GM, Gomes AE, da-Silva CQ, Ortega EMM (2013) The beta exponentiated Weibull distribution. J Stat Comput Simul 83(1):114–138. doi: 10.1080/00949655.2011.615838
  • 34. Alzaatreh A, Lee C, Famoye F (2013) Weibull-Pareto distribution and its applications. Commun Stat Theory Methods 42:1673–1691. doi: 10.1080/03610926.2011.599002
  • 35. Mameli V, Musio M (2013) A generalization of the skew-normal distribution: the beta skew-normal. Commun Stat Theory Methods 42:2229–2244. doi: 10.1080/03610926.2011.607530


Articles from Annals of Data Science are provided here courtesy of Nature Publishing Group
