Journal of Applied Statistics. 2021 Jan 20;49(7):1615–1635. doi: 10.1080/02664763.2021.1874891

A new flexible generalized family for constructing many families of distributions

M H Tahir a, M Adnan Hussain a, Gauss M Cordeiro b
PMCID: PMC9041776  PMID: 35707557

Abstract

We propose a new flexible generalized family (NFGF) for constructing many families of distributions. The importance of the NFGF is that any baseline distribution can be chosen and that it does not involve any additional parameters. Some useful statistical properties of the NFGF are determined, such as a linear representation for the family density, analytical shapes of the density and hazard rate, random variable generation, moments and generating function. Further, the structural properties of a special model, named the new flexible Kumaraswamy (NFKw) distribution, are investigated, and the model parameters are estimated by the maximum-likelihood method. A simulation study is carried out to assess the performance of the estimates. The usefulness of the NFKw model is demonstrated empirically by means of three real-life data sets. On these data sets, the two-parameter NFKw model performs better than the three-parameter transmuted-Kumaraswamy, the three-parameter exponentiated-Kumaraswamy and the well-known two-parameter Kumaraswamy models.

Keywords: Flexible G-family, generalized family, Kumaraswamy distribution, maximum-likelihood method, new flexible family, T–X family

2010 Mathematics Subject Classifications: 60E05, 60E10, 62E10, 62P12

1. Introduction

Data and models are equally important in applied research. Some researchers prefer to understand a phenomenon first, whereas others are interested in testing models by fitting them to real data. We do not enter into this debate but follow the second approach in order to check the suitability of the proposed family, its derived sub-families and the special models obtained from the generator. This is one of the main objectives of modern distribution theory, where new families and models are proposed and then adopted or tested to tackle problems encountered in different fields, such as reliability and survival studies, engineering, actuarial science, sports sciences and agriculture. These developments help the data analyst to cope with data sets arising from different phenomena. In this way: (i) well-established parent models are extended by adding shape parameter(s); (ii) the functional forms of the parent models are modified; (iii) inverted and weighted forms are adopted; (iv) generalized (G) classes are proposed through transformations, mixtures, composition, copulas, convolution and compounding methods; and (v) the special models of generalized classes are investigated, among other methods. These proposals, and the growing interest in them, have led to new and alternative ways of solving problems and have kept research in this area active. For a detailed study and discussion, the reader is referred to [9,13,15].

Alzaatreh et al. [2] introduced a general method for constructing G-families by using the transformed-transformer (T–X) approach. Let $r(t)$ be the probability density function (pdf) and $R(t)$ be the cumulative distribution function (cdf) of a random variable (rv) $T \in [a,b]$ for $-\infty \le a < b \le \infty$, and let $W[G(x)]$ be a function of the cdf $G(x)$ or survival function (sf) $\bar{G}(x) = 1 - G(x)$ of any baseline rv ($W(\cdot)$ is known as the generator) such that $W[G(x)]$ satisfies three conditions:

  1. $W[G(x)] \in [a,b]$,

  2. $W[G(x)]$ is differentiable and monotonically non-decreasing, and

  3. $\lim_{x \to -\infty} W[G(x)] = a$ and $\lim_{x \to \infty} W[G(x)] = b$.

The cdf of the T–X family is

$$F_{T\text{-}X}(x) = \int_a^{W[G(x)]} r(t)\,dt = R(W[G(x)]), \tag{1}$$

where $W[G(x)]$ satisfies conditions 1–3.

The pdf corresponding to Equation (1) is

$$f_{T\text{-}X}(x) = r(W[G(x)])\,\frac{d}{dx} W[G(x)]. \tag{2}$$
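As a worked illustration of Equations (1) and (2), assume that $T$ is standard exponential, so that $r(t) = e^{-t}$ and $R(t) = 1 - e^{-t}$ on $[0,\infty)$, and take the odds generator $W[G(x)] = G(x)/\bar{G}(x)$. This choice of $T$ is an assumption made here only for illustration and is not a model studied in the paper:

```latex
F_{T\text{-}X}(x) = R\!\left(\frac{G(x)}{\bar{G}(x)}\right)
                  = 1 - \exp\left\{-\frac{G(x)}{\bar{G}(x)}\right\},
\qquad
f_{T\text{-}X}(x) = \exp\left\{-\frac{G(x)}{\bar{G}(x)}\right\}
                    \frac{g(x)}{\bar{G}(x)^{2}},
```

where the last factor uses $\frac{d}{dx}\,[G(x)/\bar{G}(x)] = g(x)/\bar{G}(x)^{2}$.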

In Table 1, we list the pioneer generators $W[G(x)]$ that lead to natural models of the T–X family. Here, the range of $T$ can be $I = (0,1)$, $\mathbb{R}^+ = (0,\infty)$ or $\mathbb{R} = (-\infty,\infty)$. To the best of our knowledge, no other generator qualifies for inclusion in Table 1.

Table 1. Pioneer generators $W[G(x)]$, as functions of $G(x)$, from the T–X family.

Range of $T$      $W[G(x)]$                          Models of the T–X family
$I$               $G(x)$                             Beta-G [4]
$\mathbb{R}^+$    $-\log \bar{G}(x)$                 ZBgamma-G [17]
$\mathbb{R}^+$    $G(x)/\bar{G}(x)$                  Odd log-logistic-G [5]
$\mathbb{R}^+$    $-[\log \bar{G}(x)]/\bar{G}(x)$    Weibull-G [1]
$\mathbb{R}$      $\log[G(x)/\bar{G}(x)]$            Log odd logistic-G [16]
$\mathbb{R}$      $\log[-\log \bar{G}(x)]$           Logistic-X [14]

There exist some G-classes such as Marshall-Olkin-G (MO-G) [11], exponentiated-G (exp-G) which includes the Lehmann alternative of type 1 (LA1) and Lehmann alternative of type 2 (LA2) [7], transmuted-G (Tr-G) [12], cubic rank transmuted-G (CRTr-G) [6] and exponentiated-generalized-G (EG-G) [3]. These six G-classes (MO-G, LA1, LA2, Tr-G, CRTr-G, EG-G) have not been developed from any existing parent model.

The main motivations for the new flexible generalized family (for short, NFGF) of distributions are:

  • The NFGF is not developed from any well-known parent model, similar to the MO-G, LA1, LA2, Tr-G, CRTr-G and EG-G classes;

  • The NFGF does not include any extra parameter;

  • Any baseline model can be chosen for the NFGF;

  • The special models generated from the NFGF are free from non-identifiability issues, and either exponentiated or inverted models may be chosen;

  • The new special models based on the NFGF have the ability to compete with existing parent or some other competitive models;

  • The new special models based on the NFGF can produce flexible shapes of the density and hazard rate (in some cases) in comparison to existing parent models;

  • The new special models of the NFGF can provide consistently better fits than other corresponding models.

We unfold the paper as follows. In Section 2, we propose a new generator called the NFGF. In Section 3, we obtain some of its mathematical properties such as a linear representation for the family density, analytical shapes of the density and hazard rate, random variable generation, moments and generating function. In Section 4, we define a new flexible Kumaraswamy (NFKw) distribution and investigate some structural properties. Its model parameters are estimated by the maximum-likelihood method. In Section 5, a simulation study is carried out to check the precision of the estimates of the NFKw distribution. In Section 6, the potentiality of this model is illustrated by means of three real-life data sets. We show that it performs better than some well-known models. The last section offers some concluding remarks.

2. The proposed flexible G-family

Let T be a baseline rv having cdf G(x;ξ), sf G¯(x;ξ) and pdf g(x;ξ), where ξ is the baseline parameter vector. We define the cdf and pdf of the NFGF by

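$$F(x) = F(x;\xi) = 1 - \bar{G}(x;\xi)^{G(x;\xi)} \tag{3}$$

and

$$f(x) = f(x;\xi) = g(x;\xi)\,\bar{G}(x;\xi)^{G(x;\xi)} \left[\frac{G(x;\xi)}{\bar{G}(x;\xi)} - \log \bar{G}(x;\xi)\right], \tag{4}$$

respectively.

Henceforth, let X be a rv having the density (4). The sf $S(x)$ and hazard rate function (hrf) $\tau(x)$ of X are, respectively,

$$S(x) = S(x;\xi) = \bar{G}(x;\xi)^{G(x;\xi)}$$

and

$$\tau(x) = \tau(x;\xi) = g(x;\xi) \left[\frac{G(x;\xi)}{\bar{G}(x;\xi)} - \log \bar{G}(x;\xi)\right]. \tag{5}$$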
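The definitions in Equations (3)–(5) translate directly into code. The following is a minimal sketch, assuming the baseline cdf and pdf are passed as Python callables; the function names and the unit-exponential baseline are hypothetical choices made for illustration and do not come from the paper.

```python
import math

def nfgf_sf(x, G):
    """NFGF survival function S(x) = (1 - G(x)) ** G(x); requires G(x) < 1."""
    Gx = G(x)
    return (1.0 - Gx) ** Gx

def nfgf_cdf(x, G):
    """NFGF cdf, Equation (3)."""
    return 1.0 - nfgf_sf(x, G)

def nfgf_hrf(x, G, g):
    """NFGF hazard rate, Equation (5)."""
    Gx = G(x)
    return g(x) * (Gx / (1.0 - Gx) - math.log(1.0 - Gx))

def nfgf_pdf(x, G, g):
    """NFGF pdf, Equation (4), written as hazard rate times survival function."""
    return nfgf_hrf(x, G, g) * nfgf_sf(x, G)

# Hypothetical baseline, used only for illustration: unit exponential.
def exp_cdf(x):
    return 1.0 - math.exp(-x)

def exp_pdf(x):
    return math.exp(-x)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, nfgf_cdf(x, exp_cdf), nfgf_pdf(x, exp_cdf, exp_pdf))
```

Writing the pdf as the product of the hazard rate and the survival function keeps the three functions consistent with each other by construction.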

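We consider below some special distributions for different supports of rvs $\{(0,1), (0,\infty), (-\infty,\infty), (0,\theta), (\beta,\infty)\}$, namely for the Kumaraswamy (Kw), beta, Weibull (W), Burr XII (Br), Gumbel (Gu), logistic (Lo), power function (or generalized uniform) (PF) and Pareto (Pa) models:

  1. If $T \sim \text{Kw}(a,b)$ has the cdf $G(x) = 1 - (1 - x^a)^b$, $x \in I$, and pdf $g(x) = a b x^{a-1} (1 - x^a)^{b-1}$, then the cdf and pdf of the NFKw model are, respectively, given by
    $$F_{NFKw}(x) = 1 - \left[(1 - x^a)^b\right]^{1 - (1 - x^a)^b}, \quad x \in I,\ a, b > 0, \tag{6}$$
    and
    $$f_{NFKw}(x) = a b x^{a-1} (1 - x^a)^{b-1} \left[(1 - x^a)^b\right]^{1 - (1 - x^a)^b} \times \left\{\left[(1 - x^a)^{-b} - 1\right] - b \log(1 - x^a)\right\}. \tag{7}$$
  2. If $T \sim \text{Beta}(a,b)$ has the cdf $G(x) = I_x(a,b)$, $x \in I$, and pdf $g(x) = [B(a,b)]^{-1} x^{a-1} (1 - x)^{b-1}$, then the cdf and pdf of the new flexible beta (NFB) model are, respectively, given by
    $$F_{NFB}(x) = 1 - \left[1 - I_x(a,b)\right]^{I_x(a,b)}, \quad x \in I,\ a, b > 0,$$
    and
    $$f_{NFB}(x) = [B(a,b)]^{-1} x^{a-1} (1 - x)^{b-1} \left[1 - I_x(a,b)\right]^{I_x(a,b)} \times \left[\frac{I_x(a,b)}{1 - I_x(a,b)} - \log\{1 - I_x(a,b)\}\right],$$
    where $B(a,b) = \int_0^1 t^{a-1} (1 - t)^{b-1}\,dt$, $B_x(a,b) = \int_0^x t^{a-1} (1 - t)^{b-1}\,dt$ and $I_x(a,b) = B_x(a,b)/B(a,b)$ are the beta function, incomplete beta function and incomplete beta function ratio, respectively.
  3. If $T \sim \text{W}(\alpha,\beta)$ has the cdf $G(x) = 1 - \exp(-\alpha x^\beta)$, $x \in \mathbb{R}^+$, and pdf $g(x) = \alpha \beta x^{\beta-1} \exp(-\alpha x^\beta)$, then the cdf and pdf of the new flexible Weibull (NFW) model are, respectively, given by
    $$F_{NFW}(x) = 1 - \left[\exp(-\alpha x^\beta)\right]^{1 - \exp(-\alpha x^\beta)}, \quad x \in \mathbb{R}^+,\ \alpha, \beta > 0,$$
    and
    $$f_{NFW}(x) = \alpha \beta x^{\beta-1} \exp(-\alpha x^\beta) \left[\exp(-\alpha x^\beta)\right]^{1 - \exp(-\alpha x^\beta)} \times \left\{\left[\exp(\alpha x^\beta) - 1\right] + \alpha x^\beta\right\}.$$
  4. If $T \sim \text{Br}(c,k)$ has the cdf $G(x) = 1 - (1 + x^c)^{-k}$, $x \in \mathbb{R}^+$, and pdf $g(x) = c k x^{c-1} (1 + x^c)^{-(k+1)}$, then the cdf and pdf of the new flexible Burr XII (NFBr) model are, respectively, given by
    $$F_{NFBr}(x) = 1 - \left[(1 + x^c)^{-k}\right]^{1 - (1 + x^c)^{-k}}, \quad x \in \mathbb{R}^+,\ c, k > 0,$$
    and
    $$f_{NFBr}(x) = c k x^{c-1} (1 + x^c)^{-(k+1)} \left[(1 + x^c)^{-k}\right]^{1 - (1 + x^c)^{-k}} \times \left\{\left[(1 + x^c)^{k} - 1\right] + k \log(1 + x^c)\right\}.$$
  5. If $T \sim \text{Gu}(\mu,\sigma)$ has the cdf $G(x) = \exp[-\exp\{-(x - \mu)/\sigma\}]$, $x \in \mathbb{R}$, and pdf $g(x) = \sigma^{-1} \exp\{-(x - \mu)/\sigma\} \exp[-\exp\{-(x - \mu)/\sigma\}]$, then the cdf and pdf of the new flexible Gumbel (NFGu) model are, respectively, given by
    $$F_{NFGu}(x) = 1 - \left[1 - \exp\left(-e^{-(x-\mu)/\sigma}\right)\right]^{\exp\left(-e^{-(x-\mu)/\sigma}\right)}, \quad x \in \mathbb{R},\ \mu, \sigma > 0,$$
    and
    $$f_{NFGu}(x) = \frac{1}{\sigma}\, e^{-(x-\mu)/\sigma} \exp\left(-e^{-(x-\mu)/\sigma}\right) \times \left[1 - \exp\left(-e^{-(x-\mu)/\sigma}\right)\right]^{\exp\left(-e^{-(x-\mu)/\sigma}\right)} \times \left\{\frac{\exp\left(-e^{-(x-\mu)/\sigma}\right)}{1 - \exp\left(-e^{-(x-\mu)/\sigma}\right)} - \log\left[1 - \exp\left(-e^{-(x-\mu)/\sigma}\right)\right]\right\}.$$
  6. If $T \sim \text{Lo}(\mu,\sigma)$ has the cdf $G(x) = [1 + \exp\{-(x - \mu)/\sigma\}]^{-1}$, $x \in \mathbb{R}$, and pdf $g(x) = \sigma^{-1} \exp\{-(x - \mu)/\sigma\} [1 + \exp\{-(x - \mu)/\sigma\}]^{-2}$, then the cdf and pdf of the new flexible logistic (NFLo) model are, respectively, given by
    $$F_{NFLo}(x) = 1 - \left\{1 - \left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}\right\}^{\left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}}, \quad x \in \mathbb{R},\ \mu, \sigma > 0,$$
    and
    $$f_{NFLo}(x) = \frac{\exp\{-(x-\mu)/\sigma\}}{\sigma \left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{2}} \times \left\{1 - \left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}\right\}^{\left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}} \times \left\{\frac{\left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}}{1 - \left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}} - \log\left[1 - \left[1 + \exp\{-(x-\mu)/\sigma\}\right]^{-1}\right]\right\}.$$
  7. If $T \sim \text{PF}(\beta,\theta)$ has the cdf $G(x) = (x/\theta)^\beta$, $0 < x < \theta$, and pdf $g(x) = \beta \theta^{-\beta} x^{\beta-1}$, then the cdf and pdf of the new flexible power function (NFPF) model are, respectively, given by
    $$F_{NFPF}(x) = 1 - \left[1 - (x/\theta)^\beta\right]^{(x/\theta)^\beta}, \quad 0 < x < \theta,\ \beta, \theta > 0,$$
    and
    $$f_{NFPF}(x) = \beta \theta^{-\beta} x^{\beta-1} \left[1 - (x/\theta)^\beta\right]^{(x/\theta)^\beta} \left\{\frac{(x/\theta)^\beta}{1 - (x/\theta)^\beta} - \log\left[1 - (x/\theta)^\beta\right]\right\}.$$
  8. If $T \sim \text{Pa}(\beta,\alpha)$ has the cdf $G(x) = 1 - (\beta/x)^\alpha$, $x \ge \beta$, and pdf $g(x) = \alpha \beta^\alpha / x^{\alpha+1}$, then the cdf and pdf of the new flexible Pareto (NFPa) model are, respectively, given by
    $$F_{NFPa}(x) = 1 - \left[(\beta/x)^\alpha\right]^{1 - (\beta/x)^\alpha}, \quad x \ge \beta,\ \beta, \alpha > 0,$$
    and
    $$f_{NFPa}(x) = \frac{\alpha \beta^\alpha}{x^{\alpha+1}} \left[(\beta/x)^\alpha\right]^{1 - (\beta/x)^\alpha} \left\{\left[(\beta/x)^{-\alpha} - 1\right] - \alpha \log(\beta/x)\right\}.$$

The density and hazard rate behaviors of the special models NFKw, NFBeta, NFW, NFBr, NFGu, NFLo, NFPF and NFPa are plotted in Figures 1–8.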

Figure 1. Plots for the NFKw model for some parameter values: (a) density and (b) hazard rate.

Figure 2. Plots for the NFBeta model for some parameter values: (a) density and (b) hazard rate.

Figure 3. Plots for the NFW model for some parameter values: (a) density and (b) hazard rate.

Figure 4. Plots for the NFBr model for some parameter values: (a) density and (b) hazard rate.

Figure 5. Plots for the NFGu model for some parameter values: (a) density and (b) hazard rate.

Figure 6. Plots for the NFLo model for some parameter values: (a) density and (b) hazard rate.

Figure 7. Plots for the NFPF model for some parameter values: (a) density and (b) hazard rate.

Figure 8. Plots for the NFPa model for some parameter values: (a) density and (b) hazard rate.

Our proposed generator defined by (3) can extend several well-known G-classes of distributions which can be accessed through SupplementaryDataANFGF.pdf. Secondly, the information regarding the new flexible G-families derived from these G-classes can be accessed through SupplementaryDataBNFGF.pdf.

3. Properties of the NFGF

3.1. Linear representation

For an arbitrary baseline cdf $G(x)$, the exponentiated-G (exp-G) distribution with power parameter $a > 0$ has cdf and pdf of the forms $H_a(x) = G(x)^a$ and $h_a(x) = a\,g(x)\,G(x)^{a-1}$, respectively. Recalling (3) and using Mathematica, the power series

$$F(x) = \sum_{\ell=2}^{\infty} \eta_\ell\, G(x)^{\ell} \tag{8}$$

holds, where $\eta_2 = 1$, $\eta_3 = 1/2$, $\eta_4 = -1/6$, $\eta_5 = -1/4, \ldots$. Equation (8) can be expressed as

$$F(x) = \sum_{\ell=2}^{\infty} \eta_\ell\, H_\ell(x;\xi), \tag{9}$$

where $H_\ell(x;\xi) = G(x;\xi)^{\ell}$ (for $\ell \ge 2$) is the exp-G cdf with power parameter $\ell$.

By differentiating (9), the density of X takes the form

$$f(x) = \sum_{\ell=1}^{\infty} \eta_{\ell+1}\, h_{\ell+1}(x;\xi), \tag{10}$$

where $h_{\ell+1}$ is the exp-G density with power parameter $\ell+1$. Equation (10) reveals that the NFGF density function is a linear combination of exp-G densities. Thus, some mathematical properties of the NFGF can be determined directly from those of the exp-G distributions, which are known for several baseline distributions.
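The coefficients $\eta_\ell$ in Equation (8) can also be generated exactly without a computer algebra system. The sketch below is one possible way to do it, not the authors' Mathematica computation: it expands $u(G) = G\log(1-G) = -\sum_{m\ge 2} G^m/(m-1)$ and obtains the coefficients $e_n$ of $\exp(u)$ from the standard recurrence $n\,e_n = \sum_{k=1}^{n} k\,u_k\,e_{n-k}$, so that $\eta_\ell = -e_\ell$. The function name is hypothetical.

```python
from fractions import Fraction

def eta_coefficients(n_max):
    """Return {l: eta_l} for l = 2..n_max in F = sum_l eta_l * G**l."""
    # u(G) = G*log(1 - G) = -sum_{m>=2} G**m / (m - 1)
    u = [Fraction(0)] * (n_max + 1)
    for m in range(2, n_max + 1):
        u[m] = Fraction(-1, m - 1)
    # e holds the power-series coefficients of exp(u(G)).
    # From E' = u' * E:  n * e_n = sum_{k=1}^{n} k * u_k * e_{n-k}
    e = [Fraction(0)] * (n_max + 1)
    e[0] = Fraction(1)
    for n in range(1, n_max + 1):
        e[n] = sum(k * u[k] * e[n - k] for k in range(1, n + 1)) / n
    # F = 1 - exp(u), so eta_l = -e_l for l >= 1 (and e_1 = 0).
    return {l: -e[l] for l in range(2, n_max + 1)}

if __name__ == "__main__":
    for l, value in eta_coefficients(8).items():
        print(l, value)
```

The first four values returned are $1$, $1/2$, $-1/6$ and $-1/4$, and a truncated sum of the series reproduces $1 - (1-G)^{G}$ numerically for $G$ well inside $(0,1)$.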

3.2. Analytical shapes

The shapes of the density and hrf of X can be described analytically. The critical points of the density of X are the roots of the equation:

$$\frac{g'(x)}{g(x)} + \frac{g(x)\left[-\log^2(1 - G(x)) + G(x)\left(\log(1 - G(x)) + 1\right)^2 + 2\right]}{G(x) - (1 - G(x)) \log(1 - G(x))} = 0.$$

The critical points of the hrf of X are obtained from the equation:

$$\frac{g'(x)}{g(x)} + \frac{g(x)\,(2 - G(x))}{(1 - G(x))\left(G(x) - (1 - G(x)) \log(1 - G(x))\right)} = 0.$$
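To make the origin of the hrf equation explicit, a short derivation sketch follows. It uses only the hrf in Equation (5) and the abbreviation $\bar{G} = 1 - G$; setting the last line to zero gives the critical-point equation for the hrf.

```latex
\log \tau(x) = \log g(x) + \log\left[\frac{G(x)}{\bar{G}(x)} - \log \bar{G}(x)\right],
\qquad
\frac{d}{dx}\left[\frac{G}{\bar{G}} - \log \bar{G}\right]
  = \frac{g}{\bar{G}^{2}} + \frac{g}{\bar{G}}
  = \frac{g\,(2 - G)}{\bar{G}^{2}},
\\[4pt]
\frac{d}{dx} \log \tau(x)
  = \frac{g'(x)}{g(x)}
  + \frac{g(x)\,\{2 - G(x)\}}{\bar{G}(x)\,\{G(x) - \bar{G}(x) \log \bar{G}(x)\}}.
```

The density equation follows in the same way after adding the derivative of $G \log \bar{G}$, which is $g\,[\log \bar{G} - G/\bar{G}]$, and combining the two fractions.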

3.3. Quantile function

The simplest method for generating rvs is based on the inverse cdf. For an arbitrary cdf, the quantile function (qf) is defined as $Q(u) = F^{-1}(u) = \min\{x;\ F(x) \ge u\}$. The qf of the NFGF can be determined by inverting (3) and solving two nonlinear equations numerically. We can use the following procedure:

  1. Set $z = z(u) = 1 - u$;

  2. Find $w = w(u)$ numerically in $w \log(1 - w) = \log(z)$ using, for example, a Newton–Raphson algorithm;

  3. Solving numerically for $x$ in $G(x;\xi) = w$ gives the qf $x = Q(u;\xi)$ of X.
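The three steps above can be sketched as follows. Two choices here are assumptions of this sketch rather than part of the paper: bisection is used in step 2 in place of Newton–Raphson, because $w \log(1-w)$ decreases monotonically from $0$ to $-\infty$ on $(0,1)$ and bisection therefore cannot leave the interval, and the baseline quantile function is passed in as a callable (a unit-exponential baseline is used in the demonstration).

```python
import math

def solve_w(u, tol=1e-12):
    """Step 2: solve w*log(1 - w) = log(1 - u) for w in (0, 1) by bisection."""
    target = math.log(1.0 - u)          # step 1: z = 1 - u
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # w*log(1 - w) is decreasing, so a value above the target
        # means mid is still too small.
        if mid * math.log(1.0 - mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def nfgf_quantile(u, base_quantile):
    """Step 3: map w back through the baseline quantile function."""
    return base_quantile(solve_w(u))

# Hypothetical baseline for illustration: unit exponential, Q_G(w) = -log(1 - w).
def exp_quantile(w):
    return -math.log(1.0 - w)

if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9):
        print(u, nfgf_quantile(u, exp_quantile))
```

When the baseline has a closed-form quantile function, as here, step 3 needs no numerical solver at all.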

3.4. Moments and generating function

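The $n$th ordinary moment of X, say $E(X^n)$, can be expressed from Equation (10) as

$$E(X^n) = \sum_{\ell=1}^{\infty} \eta_{\ell+1}\, E(Y_{\ell+1}^n) = \sum_{\ell=1}^{\infty} (\ell + 1)\, \eta_{\ell+1}\, \tau_{n,\ell}, \tag{11}$$

where $Y_{\ell+1}$ denotes a rv with the exp-G density $h_{\ell+1}$, $\tau_{n,\ell} = \int_{-\infty}^{\infty} x^n\, G(x;\xi)^{\ell}\, g(x;\xi)\,dx = \int_0^1 Q_G(u;\xi)^n\, u^{\ell}\,du$, and $Q_G(u;\xi)$ is the qf of the baseline G.

The first four moments can be used to describe some characteristics of a distribution. Clearly, the central moments and cumulants of X can be determined from Equation (11) using well-known results.

The $n$th lower incomplete moment of X, say $m_n(y) = \int_{-\infty}^{y} x^n f(x)\,dx$, is

$$m_n(y) = \sum_{\ell=1}^{\infty} \eta_{\ell+1} \int_{-\infty}^{y} x^n\, h_{\ell+1}(x)\,dx = \sum_{\ell=1}^{\infty} (\ell + 1)\, \eta_{\ell+1} \int_0^{G(y;\xi)} Q_G(u;\xi)^n\, u^{\ell}\,du. \tag{12}$$

The last two integrals can be evaluated numerically for most G distributions.

The total deviations from the mean and median are $\delta_1 = 2\mu_1' F(\mu_1') - 2 m_1(\mu_1')$ and $\delta_2 = \mu_1' - 2 m_1(M)$, where $\mu_1' = E(X)$, $M$ is the median of X and $F(\mu_1')$ comes from Equation (3).

The moment generating function (mgf) $M(t) = E(e^{tX})$ of X follows from Equation (10) as

$$M(t) = \sum_{\ell=1}^{\infty} \eta_{\ell+1}\, M_{\ell+1}(t) = \sum_{\ell=0}^{\infty} (\ell + 1)\, \eta_{\ell+1}\, \rho_\ell(t), \tag{13}$$

where $M_{\ell+1}(t)$ is the mgf of $Y_{\ell+1}$, $\rho_\ell(t) = \int_0^1 \exp[t\, Q_G(u;\xi)]\, u^{\ell}\,du$ and the $\ell = 0$ term vanishes because $\eta_1 = 0$. Hence, $M(t)$ can be obtained from the exp-G generating function.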

3.5. Estimation

Here, we consider the estimation of the unknown parameters of the NFGF family by the maximum-likelihood method. The maximum-likelihood estimates (MLEs) enjoy desirable properties that can be used when constructing confidence intervals and deliver simple approximations that work well in finite samples. The normal approximation for the MLEs in distribution theory can easily be handled either analytically or numerically.

The log-likelihood function $\ell(\theta)$ for the vector of parameters $\theta = \xi$ (the NFGF adds no parameters to those of the baseline) from $n$ observations $x_1, \ldots, x_n$ has the form

$$\ell = \ell(\theta) = \sum_{i=1}^{n} \log g(x_i;\xi) + \sum_{i=1}^{n} G(x_i;\xi) \log[1 - G(x_i;\xi)] + \sum_{i=1}^{n} \log\left(\frac{G(x_i;\xi)}{1 - G(x_i;\xi)} - \log[1 - G(x_i;\xi)]\right). \tag{14}$$

The MLE $\hat{\theta}$ of $\theta$ can be evaluated by maximizing $\ell(\theta)$. There are several routines for numerical maximization of $\ell(\theta)$ in R (optim function), SAS (PROC NLMIXED) and Ox (sub-routine MaxBFGS), among others. All distributions in the NFGF can also be fitted to data sets using the AdequacyModel package in R (see https://www.r-project.org/). An important advantage of this package is that it is not necessary to define the log-likelihood function: it computes the MLEs, their standard errors (SEs) and some goodness-of-fit (GoF) statistics. We only need to provide the pdf and cdf of the distribution to be fitted to a data set.

Alternatively, we can differentiate the log-likelihood and solve the resulting nonlinear likelihood equations. The score components with respect to $\xi$ are

$$U_\xi = \frac{\partial \ell}{\partial \xi} = \sum_{i=1}^{n} \frac{g_i^{\xi}}{g(x_i;\xi)} - \sum_{i=1}^{n} \left\{\frac{G(x_i;\xi)}{1 - G(x_i;\xi)} - \log(1 - G(x_i;\xi))\right\} G_i^{\xi} - \sum_{i=1}^{n} \frac{G_i^{\xi}\,[G(x_i;\xi) - 2]}{(1 - G(x_i;\xi))\left[-\log\{1 - G(x_i;\xi)\} + G(x_i;\xi)\{1 + \log(1 - G(x_i;\xi))\}\right]},$$

where $g_i^{\xi} = \partial g(x_i;\xi)/\partial \xi$ and $G_i^{\xi} = \partial G(x_i;\xi)/\partial \xi$ are column vectors of the same dimension as $\xi$.

Setting the score components to zero and solving them simultaneously yields the MLEs of the parameters of the NFGF. The resulting equations cannot be solved analytically, but statistical software can be used to solve them numerically through iterative Newton–Raphson type algorithms.

We can obtain the elements of the $p \times p$ observed information matrix $J(\theta)$ ($p$ is the dimension of $\xi$) numerically. Further, the approximate multivariate normal $N_p(0, J(\hat{\theta})^{-1})$ distribution for $\hat{\theta} - \theta$, where the observed information matrix $J(\theta)$ is evaluated at $\hat{\theta}$, can be used to construct confidence intervals for the parameters of the NFGF family.

4. The NFKw model and its properties

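The sf and hrf of the NFKw rv are, respectively, given by

$$S_{NFKw}(x;a,b) = \left[(1 - x^a)^b\right]^{1 - (1 - x^a)^b}$$

and

$$\tau_{NFKw}(x;a,b) = a b x^{a-1} (1 - x^a)^{b-1} \left\{\left[(1 - x^a)^{-b} - 1\right] - b \log(1 - x^a)\right\}.$$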

4.1. Quantile function

The qf of the NFKw distribution cannot be obtained explicitly. However, we can use the Newton–Raphson algorithm to generate NFKw variates as follows:

  • Step 1: Set $n$, $a$, $b$ and the initial value $x_0$.

  • Step 2: Generate $U \sim \text{Uniform}(0,1)$.

  • Step 3: Update $x_0$ by using Newton's formula
    $$x^{*} = x_0 - R(x_0;a,b),$$
    where $R(x_0;a,b) = \dfrac{F_{NFKw}(x_0;a,b) - U}{f_{NFKw}(x_0;a,b)}$, and $F_{NFKw}$ and $f_{NFKw}$ are obtained from Equations (6) and (7), respectively. The iteration solves $F_{NFKw}(x;a,b) = U$ for $x$.

  • Step 4: If $|x_0 - x^{*}| \le \epsilon$ ($\epsilon > 0$ is a very small tolerance limit), then store $x = x^{*}$ as a variate from the NFKw(a, b) distribution.

  • Step 5: If $|x_0 - x^{*}| > \epsilon$, then set $x_0 = x^{*}$ and go to step 3.

  • Step 6: Repeat steps (2)–(5) $n$ times to generate $x_1, \ldots, x_n$.
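The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the Newton update solves $F_{NFKw}(x) - U = 0$, and a bracketing safeguard (an addition of this sketch) replaces any Newton step that would leave the current bracket inside $(0,1)$ by a bisection step. All function names are hypothetical.

```python
import math
import random

def nfkw_cdf(x, a, b):
    """NFKw cdf, Equation (6), for 0 < x < 1."""
    sf = (1.0 - x ** a) ** b
    return 1.0 - sf ** (1.0 - sf)

def nfkw_pdf(x, a, b):
    """NFKw pdf, Equation (7), for 0 < x < 1."""
    v = 1.0 - x ** a
    sf = v ** b
    g = a * b * x ** (a - 1.0) * v ** (b - 1.0)
    return g * sf ** (1.0 - sf) * ((1.0 / sf - 1.0) - b * math.log(v))

def nfkw_variate(u, a, b, x0=0.5, eps=1e-10, max_iter=200):
    """Solve F(x) = u by Newton's method with a bracketing safeguard."""
    lo, hi = 0.0, 1.0
    x = x0
    for _ in range(max_iter):
        err = nfkw_cdf(x, a, b) - u
        # Keep a bracket [lo, hi] that always contains the root.
        if err > 0.0:
            hi = x
        else:
            lo = x
        x_new = x - err / nfkw_pdf(x, a, b)      # Newton step (step 3)
        if not lo < x_new < hi:                  # safeguard: bisect instead
            x_new = 0.5 * (lo + hi)
        if abs(x_new - x) <= eps:                # step 4
            return x_new
        x = x_new                                # step 5
    return x

def nfkw_sample(n, a, b, seed=None):
    """Steps 2-6: draw n NFKw(a, b) variates."""
    rng = random.Random(seed)
    return [nfkw_variate(rng.random(), a, b) for _ in range(n)]

if __name__ == "__main__":
    print(nfkw_sample(5, a=2.0, b=3.0, seed=1))
```

The safeguard matters because a plain Newton step can jump outside $(0,1)$, where the cdf and pdf are not defined.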

4.2. Properties

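First, the linear representation for the cdf of the NFKw model follows from Equation (8) as

$$F_{NFKw}(x;a,b) = \sum_{\ell=2}^{\infty} \eta_\ell \left[1 - (1 - x^a)^b\right]^{\ell}. \tag{15}$$

By expanding the binomial series and noting that $\sum_{\ell=2}^{\infty} \eta_\ell = 1$, we can write

$$F_{NFKw}(x;a,b) = 1 + \sum_{\ell=2}^{\infty} \eta_\ell \sum_{k=1}^{\ell} (-1)^{k} \binom{\ell}{k} (1 - x^a)^{bk}$$

and then, replacing $k$ by $k + 1$,

$$F_{NFKw}(x;a,b) = 1 + \sum_{\ell=2}^{\infty} \eta_\ell \sum_{k=0}^{\ell-1} (-1)^{k+1} \binom{\ell}{k+1} (1 - x^a)^{b(k+1)}.$$

Let $\delta_k = 2$ for $k = 0, 1, 2$ and $\delta_k = k$ for $k \ge 3$. We can conveniently interchange the double sums to obtain

$$F_{NFKw}(x;a,b) = 1 + \sum_{k=0}^{\infty} \sum_{\ell=\delta_k}^{\infty} \eta_\ell\, (-1)^{k+1} \binom{\ell}{k+1} (1 - x^a)^{b(k+1)}.$$

The last expression can be written as

$$F_{NFKw}(x;a,b) = 1 + \sum_{k=0}^{\infty} (-\omega_k)\, (1 - x^a)^{b(k+1)},$$

where $\omega_k = (-1)^{k} \sum_{\ell=\delta_k}^{\infty} \binom{\ell}{k+1}\, \eta_\ell$.

By differentiating the last expression, the NFKw density can be written as

$$f_{NFKw}(x;a,b) = \sum_{k=0}^{\infty} \omega_k\, \pi(x;a,b(k+1)), \tag{16}$$

where $\pi(x;a,(k+1)b)$ denotes the Kumaraswamy density with shape parameters $a$ and $(k+1)b$.

It is clear from Equation (16) that the NFKw density is a linear combination of Kumaraswamy densities. So, several of the NFKw properties can be obtained from the Kumaraswamy distribution.

Let $Z_k$ be a rv with density $\pi(x;a,(k+1)b)$. Then, several properties of X follow from those of $Z_k$. First, the $n$th ordinary moment of X takes the form

$$\mu_n' = \sum_{k=0}^{\infty} \omega_k\, b(k+1)\, B\!\left(\frac{n}{a} + 1,\ b(k+1)\right). \tag{17}$$

The cumulants ($\kappa_n$) of X can be determined recursively from (17) as $\kappa_s = \mu_s' - \sum_{k=1}^{s-1} \binom{s-1}{k-1}\, \kappa_k\, \mu_{s-k}'$, where $\kappa_1 = \mu_1'$.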
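The cumulant recursion is easy to apply once the ordinary moments are available. The sketch below implements the recursion $\kappa_s = \mu_s' - \sum_{k=1}^{s-1}\binom{s-1}{k-1}\kappa_k\,\mu_{s-k}'$ for a list of moments; the function name is hypothetical, and the moments in the demonstration are those of a unit exponential distribution ($\mu_s' = s!$), chosen only because its cumulants $(s-1)!$ are known in closed form.

```python
from math import comb

def cumulants_from_moments(mu):
    """mu[s-1] is the s-th ordinary moment; returns [kappa_1, ..., kappa_n]."""
    kappa = []
    for s in range(1, len(mu) + 1):
        # Correction term: sum over k = 1..s-1 of C(s-1, k-1) * kappa_k * mu'_{s-k}
        corr = sum(comb(s - 1, k - 1) * kappa[k - 1] * mu[s - k - 1]
                   for k in range(1, s))
        kappa.append(mu[s - 1] - corr)
    return kappa

if __name__ == "__main__":
    # Unit exponential: ordinary moments 1, 2, 6, 24.
    print(cumulants_from_moments([1, 2, 6, 24]))
```

For the NFKw model, the input list would hold truncated evaluations of the series in Equation (17).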

The skewness and kurtosis plots of the NFKw distribution are displayed in Figure 9. These plots reveal that the parameters a and b play a significant role in modeling the skewness and kurtosis behaviors of X.

Figure 9. Plots for the NFKw model: (a) skewness and (b) kurtosis.

The $n$th incomplete moment of X is $m_n(y) = \int_0^{y} x^n f_{NFKw}(x;a,b)\,dx$, which is easily found by a change of variables that leads to the lower incomplete beta function $B_t(u,v) = \int_0^{t} w^{u-1} (1 - w)^{v-1}\,dw$ when calculating the corresponding moment of $Z_k$. Then, we obtain

$$m_n(y) = \sum_{k=0}^{\infty} \omega_k\, b(k+1)\, B_{y^a}\!\left(\frac{n}{a} + 1,\ (k+1)b\right). \tag{18}$$

The total deviations from the mean $\mu_1'$ and median $M$ of X have the forms $\delta_1 = 2\mu_1' F_{NFKw}(\mu_1') - 2 m_1(\mu_1')$ and $\delta_2 = \mu_1' - 2 m_1(M)$, where $M$ can be determined from $F_{NFKw}(M) = 0.5$.

The first incomplete moment $m_1(y)$ is also used to construct the Bonferroni and Lorenz curves (popular measures in economics, reliability, demography, insurance and medicine). The Bonferroni and Lorenz curves of X for a given probability $\pi$ are given by $B(\pi) = m_1(q)/(\pi \mu_1')$ and $L(\pi) = \pi B(\pi)$, respectively, where $q = Q(\pi)$ is the qf of X discussed in Section 4.1.

4.3. Estimation

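Let $x_1, \ldots, x_n$ be a sample of size $n$ from the NFKw distribution given in Equation (7). The log-likelihood function for the vector of parameters $\theta = (a,b)$ reduces to

$$\ell(\theta) = n \log(ab) + (a - 1) \sum_{i=1}^{n} \log(x_i) + (b - 1) \sum_{i=1}^{n} \log(1 - x_i^a) + \sum_{i=1}^{n} \left[1 - (1 - x_i^a)^b\right] \log\left[(1 - x_i^a)^b\right] + \sum_{i=1}^{n} \log\left[\frac{1 - (1 - x_i^a)^b}{(1 - x_i^a)^b} - \log\left((1 - x_i^a)^b\right)\right].$$

The components of the score vector $U(\theta) = (U_a, U_b)$, obtained by differentiating $\ell(\theta)$ term by term, are

$$U_a = \frac{n}{a} + \sum_{i=1}^{n} \log(x_i) - (b - 1) \sum_{i=1}^{n} \frac{x_i^a \log(x_i)}{1 - x_i^a} + \sum_{i=1}^{n} \left[b\, x_i^a \log(x_i)\, (1 - x_i^a)^{b-1} \log\left((1 - x_i^a)^b\right) - \frac{b\, x_i^a \log(x_i)\, \{1 - (1 - x_i^a)^b\}}{1 - x_i^a}\right] + \sum_{i=1}^{n} \frac{b\, x_i^a \log(x_i)\, \{1 + (1 - x_i^a)^b\}}{(1 - x_i^a)\left[1 - (1 - x_i^a)^b \{1 + \log\left((1 - x_i^a)^b\right)\}\right]}$$

and

$$U_b = \frac{n}{b} + \sum_{i=1}^{n} \log(1 - x_i^a) + \sum_{i=1}^{n} \log(1 - x_i^a) \left[\{1 - (1 - x_i^a)^b\} - (1 - x_i^a)^b \log\left((1 - x_i^a)^b\right)\right] - \sum_{i=1}^{n} \frac{\log(1 - x_i^a)\, \{1 + (1 - x_i^a)^b\}}{1 - (1 - x_i^a)^b \{1 + \log\left((1 - x_i^a)^b\right)\}}.$$

Setting these equations to zero and solving them simultaneously yields the MLEs of the model parameters.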
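A practical way to guard against algebra slips in the score equations is to compare an analytic score with finite differences of the log-likelihood. The sketch below does this for the NFKw model; the score is obtained by differentiating the log-likelihood above term by term, and the function names, data values and step size are hypothetical choices made for illustration.

```python
import math

def nfkw_loglik(a, b, xs):
    """NFKw log-likelihood for observations xs in (0, 1)."""
    total = 0.0
    for x in xs:
        v = 1.0 - x ** a                 # 1 - x^a
        sf = v ** b                      # baseline survival (1 - x^a)^b
        total += (math.log(a * b) + (a - 1.0) * math.log(x)
                  + (b - 1.0) * math.log(v)
                  + (1.0 - sf) * math.log(sf)
                  + math.log((1.0 - sf) / sf - math.log(sf)))
    return total

def nfkw_score(a, b, xs):
    """Analytic score (U_a, U_b) of the NFKw log-likelihood."""
    ua = ub = 0.0
    for x in xs:
        v = 1.0 - x ** a
        sf = v ** b
        lsf = math.log(sf)
        c = x ** a * math.log(x)         # x^a log(x)
        d = 1.0 - sf * (1.0 + lsf)       # common denominator factor
        ua += (1.0 / a + math.log(x) - (b - 1.0) * c / v
               + b * c * v ** (b - 1.0) * lsf - b * c * (1.0 - sf) / v
               + b * c * (1.0 + sf) / (v * d))
        ub += (1.0 / b + math.log(v)
               + math.log(v) * ((1.0 - sf) - sf * lsf)
               - math.log(v) * (1.0 + sf) / d)
    return ua, ub

if __name__ == "__main__":
    xs = [0.2, 0.5, 0.8]                 # hypothetical data in (0, 1)
    a, b, h = 2.0, 3.0, 1e-6
    fd_a = (nfkw_loglik(a + h, b, xs) - nfkw_loglik(a - h, b, xs)) / (2 * h)
    fd_b = (nfkw_loglik(a, b + h, xs) - nfkw_loglik(a, b - h, xs)) / (2 * h)
    print(nfkw_score(a, b, xs), (fd_a, fd_b))
```

In practice the log-likelihood function alone is usually enough, since general-purpose optimizers can approximate the gradient numerically.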

5. Simulation study

In this section, we evaluate the accuracy of the MLEs of the NFKw parameters using Monte Carlo simulations. The simulation is repeated $N = 1000$ times for each of the sample sizes $n = 50, 100, 200, 300, 400$ and the parameter scenarios I: $a = 0.5$ and $b = 1.5$; II: $a = 1.1$ and $b = 3.5$; and III: $a = 2.5$ and $b = 1.5$. The precision of the MLEs is investigated in terms of the biases and mean square errors (MSEs), namely:

$$\text{Bias}(\hat{\theta}) = \sum_{i=1}^{N} \frac{\hat{\theta}_i}{N} - \theta \quad \text{and} \quad \text{MSE}(\hat{\theta}) = \sum_{i=1}^{N} \frac{(\hat{\theta}_i - \theta)^2}{N}.$$
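The two summary measures can be computed from the replicated estimates as follows. This is a minimal sketch; the function name and the toy input are hypothetical, and in a real study `estimates` would hold the $N$ MLEs of one parameter obtained from the simulated samples.

```python
def bias_and_mse(estimates, true_value):
    """Monte Carlo bias and mean square error of a list of estimates."""
    n_rep = len(estimates)
    bias = sum(estimates) / n_rep - true_value
    mse = sum((est - true_value) ** 2 for est in estimates) / n_rep
    return bias, mse

if __name__ == "__main__":
    # Toy replicated estimates of a parameter whose true value is 1.5.
    print(bias_and_mse([1.4, 1.6, 1.7, 1.5], 1.5))
```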

We display plots of the biases and MSEs for the estimates of NFKw parameters a and b in Figures 10 and 11. These plots reveal that the values of biases and MSEs decrease as the sample size n increases. Thus, the MLEs perform well in estimating the parameters of the NFKw distribution.

Figure 10. Plots of the biases for the estimated values of NFKw parameters.

Figure 11. Plots of the MSEs for the estimated values of NFKw parameters.

6. Empirical illustration of the NFKw model

We compare the proposed two-parameter NFKw model (a special model of the NFGF) with the three-parameter transmuted-Kumaraswamy (TrKw) [8], three-parameter exponentiated-Kumaraswamy (EKw) [10] and two-parameter Kumaraswamy (Kw) models by fitting them to three real-life data sets (Flood data, Leaves data, Glass Fiber data), which can be accessed from SupplementaryDataCNFGF.pdf. The pdfs of these models are, respectively, given by:

$$f_{Kw}(x;a,b) = a b x^{a-1} (1 - x^a)^{b-1}, \quad x \in I,\ a, b > 0,$$

$$f_{TrKw}(x;\lambda,a,b) = a b x^{a-1} (1 - x^a)^{b-1} \left[1 - \lambda + 2\lambda (1 - x^a)^b\right], \quad x, \lambda \in I,\ a, b > 0,$$

and

$$f_{EKw}(x;\alpha,a,b) = \alpha a b x^{a-1} (1 - x^a)^{b-1} \left(1 - (1 - x^a)^b\right)^{\alpha-1}, \quad x \in I,\ a, b, \alpha > 0.$$

The parameters of the models are estimated by the maximum-likelihood method and the log-likelihood function is evaluated at the MLEs ($\hat{\ell}$). The well-known goodness-of-fit (GoF) statistics Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), Anderson–Darling ($A$), Cramér–von Mises ($W$) and Kolmogorov–Smirnov (K-S) are used for model comparisons. Lower values of the GoF statistics and higher p-values of the K-S test indicate a better fit.
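The three information criteria are simple functions of the maximized log-likelihood. The sketch below uses their standard definitions, $\text{AIC} = -2\hat{\ell} + 2k$, $\text{BIC} = -2\hat{\ell} + k \log n$ and $\text{HQIC} = -2\hat{\ell} + 2k \log(\log n)$, which are not spelled out in the text; $k$ is the number of parameters and $n$ the sample size. The sample size $n = 20$ in the demonstration is an assumption (the data are in the supplementary file), chosen because together with $\hat{\ell} = 14.2378$ and $k = 2$ it reproduces the magnitudes reported for the NFKw model on data set 1.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, HQIC) from a maximized log-likelihood."""
    aic = -2.0 * loglik + 2.0 * k
    bic = -2.0 * loglik + k * math.log(n)
    hqic = -2.0 * loglik + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqic

if __name__ == "__main__":
    # NFKw on data set 1: log-likelihood 14.2378, k = 2, assumed n = 20.
    print(information_criteria(14.2378, 2, 20))
```

Under these assumptions the criteria come out negative, because the maximized log-likelihood is positive for data on $(0,1)$ with a concentrated density.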

Tables 2, 4 and 6 give the MLEs and their standard errors for the NFKw model and the competing TrKw, EKw and Kw models for these data sets. The values of the GoF statistics in Tables 3, 5 and 7 indicate that the NFKw model shows small values of the GoF statistics, and hence the proposed model provides the best fit as compared to the other models. The plots in Figures 12–14 also support this claim.

Table 2. MLEs and their SEs (in parentheses) for data set 1.

Distribution   a                  b                      α                   λ
NFKw           2.3455 (0.4556)    7.7175 (3.2113)        -                   -
TrKw           3.7259 (0.6490)    10.9645 (6.0368)       -                   0.6141 (0.3752)
EKw            3.3633 (0.6021)    45.8805 (9.4457)       0.2570 (0.1269)     -
Kw             3.3631 (0.6033)    11.7886 (5.3594)       -                   -

Table 4. MLEs and their SEs (in parentheses) for data set 2.

Distribution   a                  b                      α                   λ
NFKw           1.8585 (0.1319)    43.1739 (11.0563)      -                   -
TrKw           2.9292 (0.2128)    177.1790 (62.1210)     -                   0.3571 (0.3535)
EKw            2.8099 (0.1940)    85.9558 (509.7923)     2.0498 (12.1569)    -
Kw             2.8104 (0.1941)    176.3490 (59.9656)     -                   -

Table 6. MLEs and their SEs (in parentheses) for data set 3.

Distribution   a                  b                      α                   λ
NFKw           1.7263 (0.2661)    19.4685 (8.7742)       -                   -
TrKw           2.7037 (0.3944)    45.8129 (27.9858)      -                   0.4748 (0.3956)
EKw            2.4966 (0.3691)    105.2523 (4066.8131)   0.4157 (16.0628)    -
Kw             2.4998 (0.3700)    43.9672 (23.6081)      -                   -

Table 3. The statistics $-\hat{\ell}$, AIC, BIC, HQIC, A, W, K-S and K-S p-values for data set 1.

Distribution   $-\hat{\ell}$   AIC         BIC         HQIC        A        W        K-S      K-S p-value
NFKw           −14.2378        −24.4757    −22.4842    −24.0869    0.6966   0.1147   0.1797   0.5380
TrKw           −13.5209        −21.0419    −18.0547    −20.4588    0.8409   0.1409   0.1930   0.4455
EKw            −12.8662        −19.7324    −16.7452    −19.1493    1.4830   0.2626   0.7960   1.972e−11
Kw             −12.8662        −21.7324    −19.7409    −21.3436    0.9722   0.1658   0.2109   0.3360

Table 5. The statistics $-\hat{\ell}$, AIC, BIC, HQIC, A, W, K-S and K-S p-values for data set 2.

Distribution   $-\hat{\ell}$   AIC         BIC         HQIC        A        W        K-S      K-S p-value
NFKw           −195.7365       −387.4730   −381.7689   −385.1554   0.9281   0.1657   0.1005   0.1509
TrKw           −195.3060       −384.6120   −376.0559   −381.1356   1.0578   0.1906   0.1098   0.0913
EKw            −194.8007       −383.6015   −375.0454   −380.1251   0.9207   0.1670   0.5341   2.2e−16
Kw             −194.8007       −385.6015   −379.8974   −383.2839   1.1612   0.2078   0.1181   0.0563

Table 7. The statistics $-\hat{\ell}$, AIC, BIC, HQIC, A, W, K-S and K-S p-values for data set 3.

Distribution   $-\hat{\ell}$   AIC         BIC         HQIC        A        W        K-S      K-S p-value
NFKw           −32.0281        −60.0563    −57.4646    −59.2856    1.1518   0.1835   0.1755   0.3766
TrKw           −31.1656        −56.3312    −52.4437    −55.1753    1.3400   0.2187   0.1837   0.3222
EKw            −30.6731        −55.3462    −51.4587    −54.1902    1.7763   0.3018   0.6172   2.325e−09
Kw             −30.6731        −57.3461    −54.7545    −56.5755    1.4409   0.2379   0.1915   0.2756

The NFKw model provides a better fit than the other tested models, because it has the smallest values of $-\hat{\ell}$, AIC, BIC, HQIC, A, W and K-S in almost every case; the only exception is the A statistic for data set 2, where the EKw value is marginally lower. Figures 12–14 also support our claim about the NFKw model.

Figure 12. Estimated plots (a) density (b) sf (c) hazard rate, and (d) Box-plot for data set 1.

Figure 13. Estimated plots (a) density (b) sf (c) hazard rate, and (d) Box-plot for data set 2.

Figure 14. Estimated plots (a) density (b) sf (c) hazard rate, and (d) Box-plot for data set 3.

7. Concluding remarks

We introduce a new cumulative distribution $F(x;\xi) = 1 - [1 - G(x;\xi)]^{G(x;\xi)}$, without extra parameters, defined from a baseline cumulative distribution $G(x;\xi)$, which serves as a flexible generator of generalized classes of distributions. The function $F(x;\xi)$ defines the new flexible generalized family (NFGF) of distributions. We present many sub-families of the NFGF. We obtain some mathematical properties of the NFGF and also study some properties of the special model called the new flexible Kumaraswamy (NFKw) distribution. We compare this distribution with the transmuted-Kumaraswamy, exponentiated-Kumaraswamy and Kumaraswamy models by considering six popular GoF statistics. We find that the new distribution gives the lowest GoF values in nearly all comparisons. The NFKw model outperforms these three models on the basis of numerical and graphical analysis. We expect that this new generator of G-families will attract readers and applied statisticians.

Supplementary Material

Supplemental Material

Acknowledgments

The authors would like to thank two anonymous reviewers for comments and suggestions which improved the earlier version of the manuscript.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  1. Ahmad Z., Elgarhy M., and Hamedani G.G., A new Weibull-X family of distributions: properties, characterizations and applications, J. Stat. Distrib. Appl. 5 (2018), Art. 5, pp. 18.
  2. Alzaatreh A., Famoye F., and Lee C., A new method for generating families of continuous distributions, Metron 71 (2013), pp. 63–79.
  3. Cordeiro G.M., Ortega E.M.M., and Cunha D.C.C., The exponentiated generalized class of distributions, J. Data Sci. 11 (2013), pp. 1–27.
  4. Eugene N., Lee C., and Famoye F., Beta-normal distribution and its applications, Commun. Stat. Theory Methods 31 (2002), pp. 497–512.
  5. Gleaton J.U. and Lynch J.D., Properties of generalized log-logistic families of lifetime distributions, J. Probab. Stat. Sci. 4 (2006), pp. 51–64.
  6. Granzotto D.C.T., Louzada F., and Balakrishnan N., Cubic rank transmuted distributions: inferential issues and applications, J. Stat. Comput. Simul. 87 (2016), pp. 2760–2778.
  7. Gupta R.C., Gupta P.L., and Gupta R.D., Modeling failure time data by Lehman alternatives, Commun. Stat. Theory Methods 27 (1999), pp. 887–904.
  8. Khan M.S., King R., and Hudson L.I., Transmuted Kumaraswamy distribution, Stat. Trans. 17 (2016), pp. 183–210.
  9. Lee C., Famoye F., and Alzaatreh A., Methods for generating families of univariate continuous distributions in the recent decades, WIREs Comput. Stat. 5 (2013), pp. 219–238.
  10. Lemonte A.J., Barreto-Souza W., and Cordeiro G.M., The exponentiated Kumaraswamy distribution and its log-transform, Braz. J. Probab. Stat. 27 (2013), pp. 31–53.
  11. Marshall A.W. and Olkin I., A new method for adding parameters to a family of distributions with application to the exponential and Weibull families, Biometrika 84 (1997), pp. 641–652.
  12. Shaw W.T. and Buckley I.R., The alchemy of probability distributions: beyond Gram-Charlier expansions, and a skew-kurtotic-normal distribution from a rank transmutation map, UCL discovery repository (2007). Available at http://discovery.ucl.ac.uk/id/eprint/643923.
  13. Tahir M.H. and Cordeiro G.M., Compounding of distributions: a survey and new generalized classes, J. Stat. Distrib. Appl. 3 (2016), pp. 13.
  14. Tahir M.H., Cordeiro G.M., Alizadeh M., Mansoor M., and Zubair M., The logistic-X family of distributions and its applications, Commun. Stat. Theory Methods 45 (2016), pp. 7326–7349.
  15. Tahir M.H. and Nadarajah S., Parameter induction in continuous univariate distributions: well-established G families, An. Acad. Bras. Ciênc. 87 (2015), pp. 539–568.
  16. Torabi H. and Montazari N.H., The logistic-uniform distribution and its application, Commun. Stat. Simul. Comput. 43 (2014), pp. 2551–2569.
  17. Zografos K. and Balakrishnan N., On families of beta- and generalized gamma generated distributions and associated inference, Stat. Methodol. 6 (2009), pp. 344–362.


Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
