Abstract
In this paper, we introduce a new general family of distributions obtained by a subtle combination of two well-established families of distributions: the so-called power Topp–Leone-G and inverse exponential-G families. Its definition is centered around an original cumulative distribution function involving exponential and polynomial functions. Some desirable theoretical properties of the new family are discussed in full generality, with comprehensive results on stochastic ordering, quantile function and related measures, general moments and related measures, and the Shannon entropy. Then, a statistical parametric model is constructed from a special member of the family, defined with the use of the inverse Lomax distribution as the baseline distribution. The maximum likelihood method was applied to estimate the unknown model parameters. From the general theory of this method, the asymptotic confidence intervals of these parameters were deduced. A simulation study was conducted to evaluate the numerical behavior of the estimates we obtained. Finally, in order to highlight the practical perspectives of the new family, two real-life data sets were analyzed. All the measures considered are favorable to the new model in comparison to four serious competitors.
Keywords: power Topp–Leone distribution, inverse exponential-G family, moments, entropy, estimation, data analysis
MSC: 60E05, 62E15, 62F10
1. Introduction
Owing to the growing amount of data from various applied fields and unstoppable computer progress, there is increasing motivation for developing efficient and flexible statistical models. Such models can be derived from general families of distributions having desirable properties, such as those constructed from a generator distribution. The main idea of this construction is to add one or more shape parameters to a baseline distribution with the aim of increasing its flexibility. Among the well-known examples of such families, there are the beta-G [1], Kumaraswamy-G [2], Weibull-G [3], Garhy-G [4], type II half logistic-G [5], transmuted Topp–Leone-G [6], generalized odd log-logistic-G [7], odd Fréchet-G [8], power Lindley-G [9], Fréchet Topp–Leone-G [10], exponentiated generalized Topp–Leone-G [11], and truncated inverted Kumaraswamy-G [12]. We also refer to the exhaustive survey in [13]. Recently, several researchers used the Topp–Leone (TL) distribution as the generator distribution to develop new general families, reaching the aims of simplicity and flexibility. Among them, Ref. [14] proposed the Topp–Leone-G (TL-G) family, Ref. [15] introduced the power TL-G (PTL-G) family, Ref. [16] introduced the generalized TL-G family, Ref. [17] studied the type II TL-G family, and Ref. [18] proposed the type II generalized TL-G family.
For the purposes of this paper, let us describe in detail the PTL-G family from [15]. The PTL-G family is defined by the following cumulative distribution function (cdf):
$$F_{PTL}(x; \alpha, \beta, \xi) = G(x; \xi)^{\alpha\beta} \left[2 - G(x; \xi)^{\beta}\right]^{\alpha}, \quad x \in \mathbb{R},$$
with $\alpha, \beta > 0$, where $G(x; \xi)$ is the cdf of a continuous baseline distribution which may depend on a parameter vector $\xi$. As indicated by the name, the construction of the family uses the so-called power Topp–Leone distribution as the generator distribution. In comparison to the (power one) TL-G family, Ref. [15] demonstrated the significant impact of the parameter $\beta$ on the shapes of the probability density and hazard rate functions, providing desirable modeling properties. This is particularly evident when the gamma distribution is taken as the baseline distribution, as illustrated by the graphics and applications of [15].
In a parallel work, beyond the TL distribution and its extensions, Ref. [19] introduced the inverse exponential-G (IE-G) family, based on the inverse exponential distribution as the generator distribution, and defined by the following cdf:
$$F_{IE}(x; \xi) = e^{1 - 1/G(x; \xi)} = \exp\left\{-\frac{1 - G(x; \xi)}{G(x; \xi)}\right\}, \quad x \in \mathbb{R}.$$
The main features of this family are its simplicity, with no additional parameters, and a nature completely different from that of the baseline cdf, owing to the combination of the exponential and (implicit) odd functions. An immediate remark illustrating this claim is the following: it decays to 0 much faster than the baseline cdf when $G(x; \xi) \to 0$. By the consideration of a practical data set and the exponential distribution as the baseline distribution, Ref. [19] shows that the corresponding model is better than the Lindley and exponential models (all having the same number of parameters). The nice results behind the IE-G family have been the driver for more investigations, with extended or modified versions of this family. We refer the reader to [8,20] for the odd Fréchet-G family, Ref. [21] for the extended odd Fréchet-G family, and Ref. [22] for the modified odd Fréchet-G family.
In this paper, in view of the previously mentioned literature, we introduce a new family of distributions by combining, in some sense, the PTL-G and IE-G families. It is defined by composition of their respective cdfs, i.e., by the cdf given by
$$F(x; \alpha, \beta, \xi) = e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]} \left\{2 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\}^{\alpha}, \quad x \in \mathbb{R}, \qquad (1)$$
with $\alpha, \beta > 0$. Thus, this cdf can be viewed as a polyno-exponential transformation of the baseline cdf $G(x; \xi)$. The new family is called the new power TL-G (NPTL-G) family. By construction, we aim to combine the benefits of the PTL-G and IE-G families and thereby create new statistical perspectives of various kinds. The key motivations behind the NPTL-G family are the following.
To provide very simple models and create new simple distributions.
To improve the flexibility of existing distributions on various aspects (such as mode, median, skewness, and kurtosis…).
To provide better fits than competing modified models having the same or a higher number of parameters.
We support these claims both in full generality and by focusing on the special member of the NPTL-G family defined with the inverse Lomax (ILx) distribution as the baseline distribution (the reason for this choice will be explained later). The resulting distribution, called the new power Topp–Leone inverse Lomax (NPTLILx) distribution, offers a new three-parameter lifetime distribution with a high potential of applicability. We illustrate this by means of two practical data sets with different features: the first one is from [23] and is about active repair times for an airborne communication transceiver, and the second one is from [24] and is about actual tax revenue in Egypt. Favorable results were obtained for the proposed model in comparison to serious competitors, motivating its wider use in statistical applications.
The contents of this paper are organized as follows. In Section 2, the basics of the NPTL-G family are presented, as is the NPTLILx distribution. Various mathematical properties of the family are discussed in Section 3. Section 4 is devoted to the estimation of the unknown parameters from the NPTLILx model, with a comprehensive simulation study. The data analyses are shown in Section 5 with numerical and graphical illustrations. A conclusion and perspectives are formulated in Section 6.
2. Basics of the NPTL-G Family
The basics of the NPTL-G family are presented in this section, with a focus on the main functions of interest.
2.1. Probability Density Function
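Upon differentiation of $F(x; \alpha, \beta, \xi)$ with respect to x, owing to (1), the probability density function (pdf) of the NPTL-G family is given by
$$f(x; \alpha, \beta, \xi) = 2\alpha\beta\, g(x; \xi)\, G(x; \xi)^{-2}\, e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]} \left\{1 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\} \left\{2 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\}^{\alpha - 1}, \quad x \in \mathbb{R}, \qquad (2)$$
where $g(x; \xi)$ is the probability density function corresponding to $G(x; \xi)$. From this expression, some asymptotic results on $f(x; \alpha, \beta, \xi)$ can be derived. When $G(x; \xi) \to 0$, we have
$$f(x; \alpha, \beta, \xi) \sim 2^{\alpha} \alpha\beta\, g(x; \xi)\, G(x; \xi)^{-2}\, e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]}.$$
Furthermore, when $G(x; \xi) \to 1$, we have
$$f(x; \alpha, \beta, \xi) \sim 2\alpha\beta^{2}\, g(x; \xi) \left[1 - G(x; \xi)\right].$$
The variations of $f(x; \alpha, \beta, \xi)$ can be studied in a standard manner, starting with the critical point(s) given by the solution of the non-linear equation in x: $\partial \log[f(x; \alpha, \beta, \xi)]/\partial x = 0$, with
$$\frac{\partial}{\partial x} \log[f(x; \alpha, \beta, \xi)] = \frac{g'(x; \xi)}{g(x; \xi)} - \frac{2 g(x; \xi)}{G(x; \xi)} + \frac{\beta g(x; \xi)}{G(x; \xi)^{2}} \left[\alpha - \frac{y(x)}{1 - y(x)} - \frac{(\alpha - 1)\, y(x)}{2 - y(x)}\right], \quad y(x) = e^{\beta\left[1 - 1/G(x; \xi)\right]}.$$
Then, for a critical point $x_0$, the sign of $\partial^{2} \log[f(x; \alpha, \beta, \xi)]/\partial x^{2}$ at $x = x_0$ is informative on its nature (minimum, maximum, or inflection point).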
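As a minimal numerical sketch of this subsection, the cdf and pdf can be coded for an arbitrary baseline. The sketch assumes that the cdf (1) is the PTL-G cdf composed with the IE-G cdf, as described in the introduction, i.e., $F = [y(2 - y)]^{\alpha}$ with $y = e^{\beta(1 - 1/G)}$; the function names, the standard exponential baseline, and the parameter values are illustrative assumptions, not choices made in the paper. The pdf is checked against a central finite difference of the cdf.

```python
import math

def nptl_cdf(G, alpha, beta):
    """NPTL-G cdf evaluated where the baseline cdf equals G, with 0 < G <= 1."""
    y = math.exp(beta * (1.0 - 1.0 / G))   # inverse exponential-G term, in (0, 1]
    return (y * (2.0 - y)) ** alpha        # power Topp-Leone transform of y

def nptl_pdf(G, g, alpha, beta):
    """NPTL-G pdf, given the baseline cdf value G and the baseline pdf value g."""
    y = math.exp(beta * (1.0 - 1.0 / G))
    return (2.0 * alpha * beta * g / G ** 2
            * y ** alpha * (1.0 - y) * (2.0 - y) ** (alpha - 1.0))

# Hypothetical baseline, for illustration only: the standard exponential distribution.
def base_cdf(x):
    return 1.0 - math.exp(-x)

def base_pdf(x):
    return math.exp(-x)

alpha, beta, x, h = 2.0, 1.5, 1.0, 1e-6    # illustrative values
pdf_value = nptl_pdf(base_cdf(x), base_pdf(x), alpha, beta)
# Central finite difference of the cdf, used as a consistency check on the pdf.
fd_value = (nptl_cdf(base_cdf(x + h), alpha, beta)
            - nptl_cdf(base_cdf(x - h), alpha, beta)) / (2.0 * h)
print(pdf_value, fd_value)
```

Passing the baseline values `G` and `g` rather than `x` keeps the two functions independent of any particular baseline distribution.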
2.2. Hazard Rate Function
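The hazard rate function (hrf) of the NPTL-G family is given by
$$h(x; \alpha, \beta, \xi) = \frac{f(x; \alpha, \beta, \xi)}{1 - F(x; \alpha, \beta, \xi)} = \frac{2\alpha\beta\, g(x; \xi)\, G(x; \xi)^{-2}\, e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]} \left\{1 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\} \left\{2 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\}^{\alpha - 1}}{1 - e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]} \left\{2 - e^{\beta\left[1 - 1/G(x; \xi)\right]}\right\}^{\alpha}}, \quad x \in \mathbb{R}.$$
Some asymptotic results on $h(x; \alpha, \beta, \xi)$ are presented below. When $G(x; \xi) \to 0$, we have
$$h(x; \alpha, \beta, \xi) \sim 2^{\alpha} \alpha\beta\, g(x; \xi)\, G(x; \xi)^{-2}\, e^{\alpha\beta\left[1 - 1/G(x; \xi)\right]}.$$
Additionally, when $G(x; \xi) \to 1$, we have
$$h(x; \alpha, \beta, \xi) \sim \frac{2 g(x; \xi)}{1 - G(x; \xi)}.$$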
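Thus, the parameters $\alpha$ and $\beta$ have a significant effect on the asymptotes when $G(x; \xi) \to 0$, but no effect when $G(x; \xi) \to 1$. The variations of $h(x; \alpha, \beta, \xi)$ can be studied in a similar manner to those of $f(x; \alpha, \beta, \xi)$ by using the relation $\log[h(x; \alpha, \beta, \xi)] = \log[f(x; \alpha, \beta, \xi)] - \log[1 - F(x; \alpha, \beta, \xi)]$.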
2.3. A Special Member: The NPTLILx Distribution
The NPTL-G family contains distributions of various natures, depending on the choice of the baseline distribution. In this study, as mentioned in the introduction, we chose the inverse Lomax distribution with shape parameter $\theta > 0$ as the baseline distribution to define the NPTLILx distribution. Thus, it is defined by the following cdf:
$$G(x; \theta) = \left(1 + x^{-1}\right)^{-\theta}, \quad x > 0$$
(another parameter of the original definition of the inverse Lomax distribution has been set to 1 for the purposes of the paper). Let us now briefly motivate this choice. As suggested by its name, the inverse Lomax distribution is the distribution of the random variable $Y = X^{-1}$, where X denotes a random variable following the standard Lomax distribution (with parameters $\theta$ and 1). The corresponding pdf and hrf are, respectively, given by
$$g(x; \theta) = \theta x^{-2} \left(1 + x^{-1}\right)^{-\theta - 1}, \quad x > 0,$$
and
$$h_G(x; \theta) = \frac{\theta x^{-2} \left(1 + x^{-1}\right)^{-\theta - 1}}{1 - \left(1 + x^{-1}\right)^{-\theta}}, \quad x > 0.$$
In addition to being simple, it has proven to be very flexible for modeling data with an underlying non-monotonic hrf. Further details and applications can be found in [25,26,27].
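Thus, the NPTLILx distribution is defined by the following cdf:
$$F(x; \alpha, \beta, \theta) = e^{\alpha\beta\left[1 - (1 + x^{-1})^{\theta}\right]} \left\{2 - e^{\beta\left[1 - (1 + x^{-1})^{\theta}\right]}\right\}^{\alpha}, \quad x > 0, \qquad (3)$$
with $\alpha, \beta, \theta > 0$. The corresponding pdf and hrf are given by, respectively,
$$f(x; \alpha, \beta, \theta) = 2\alpha\beta\theta\, x^{-2} \left(1 + x^{-1}\right)^{\theta - 1} e^{\alpha\beta\left[1 - (1 + x^{-1})^{\theta}\right]} \left\{1 - e^{\beta\left[1 - (1 + x^{-1})^{\theta}\right]}\right\} \left\{2 - e^{\beta\left[1 - (1 + x^{-1})^{\theta}\right]}\right\}^{\alpha - 1}, \quad x > 0, \qquad (4)$$
and
$$h(x; \alpha, \beta, \theta) = \frac{f(x; \alpha, \beta, \theta)}{1 - F(x; \alpha, \beta, \theta)}, \quad x > 0.$$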
Possible shapes of the pdf and hrf of the NPTLILx distribution are illustrated in Figure 1 and Figure 2, respectively. In particular, from Figure 1, we see that the pdf can be right-skewed or reversed-J shaped. From Figure 2, we see that the hrf can be increasing, decreasing, upside-down bathtub shaped, or bathtub shaped. All these shape properties are known to be desirable for building flexible statistical models.
Figure 1.
Plots of some probability density functions (pdfs) of the new power Topp–Leone inverse Lomax (NPTLILx) distribution.
Figure 2.
Plots of some hazard rate functions (hrfs) of the NPTLILx distribution.
3. Some Mathematical Properties
The section presents some important mathematical properties of the NPTL-G family.
3.1. On a Stochastic Ordering
The following result shows some inequalities involving .
Proposition 1.
For any such that , the following inequalities hold:
Proof.
The bracket term in the definition of given by (1) is central. Since , we have , implying the second inequality. For the first inequality, the following well-known logarithmic inequality: for , gives , implying that by taking . Therefore, we have , and a fortiori, . The first inequality follows. This ends the proof of Proposition 1. □
An immediate consequence of Proposition 1 is the following stochastic ordering result:
where is the cdf of the exponentiated IE-G family (with power parameter ).
Another stochastic ordering result comes from the following remark: the function given by
has the properties of a cdf, with the corresponding pdf given by
To the best of our knowledge, it is new in the literature (and outside the scope of this paper).
3.2. Quantile Function with Some Related Measures and Functions
The quantile function (qf) of the NPTL-G family is expressed in the following result.
Proposition 2.
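The qf of the NPTL-G family is given by
$$Q(u; \alpha, \beta, \xi) = Q_G\left(\left\{1 - \frac{1}{\beta} \log\left[1 - \sqrt{1 - u^{1/\alpha}}\right]\right\}^{-1}; \xi\right), \quad u \in (0, 1),$$
where $Q_G(u; \xi)$ is the qf corresponding to $G(x; \xi)$.
Proof.
For the sake of simplicity, let us set $y_u = e^{\beta\left[1 - 1/G(Q(u; \alpha, \beta, \xi); \xi)\right]}$ for $u \in (0, 1)$. Then, by the definition of a qf, $Q(u; \alpha, \beta, \xi)$ satisfies the non-linear equation $F(Q(u; \alpha, \beta, \xi); \alpha, \beta, \xi) = u$, implying that $y_u^{\alpha} (2 - y_u)^{\alpha} = u$; hence, $y_u (2 - y_u) = u^{1/\alpha}$, which is equivalent to solving the polynomial equation in y: $y^{2} - 2y + c = 0$, with $c = u^{1/\alpha}$. By determining the two roots of this polynomial and keeping only the one in the unit interval (since $y_u \in (0, 1)$), we get $y_u = 1 - \sqrt{1 - u^{1/\alpha}}$. After some algebra, we get $G(Q(u; \alpha, \beta, \xi); \xi) = \left\{1 - \beta^{-1} \log\left[1 - \sqrt{1 - u^{1/\alpha}}\right]\right\}^{-1}$. The desired result follows by composing with $Q_G(u; \xi)$, ending the proof of Proposition 2. □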
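The inversion carried out in this proof can be checked numerically. The following sketch assumes the composed form of the cdf (1), i.e., $F = [y(2 - y)]^{\alpha}$ with $y = e^{\beta(1 - 1/G)}$, and uses a standard exponential baseline purely for illustration; the function names and parameter values are hypothetical. The term $\log[1 - \sqrt{1 - u^{1/\alpha}}]$ is evaluated in an algebraically equivalent but numerically safer way.

```python
import math

def nptl_cdf(x, alpha, beta, base_cdf):
    """NPTL-G cdf for a given baseline cdf (assumed composed form)."""
    G = base_cdf(x)
    y = math.exp(beta * (1.0 - 1.0 / G))
    return (y * (2.0 - y)) ** alpha

def nptl_quantile(u, alpha, beta, base_qf):
    """NPTL-G quantile function for a given baseline quantile function."""
    log_v = math.log(u) / alpha               # log of v = u**(1/alpha)
    s = math.sqrt(-math.expm1(log_v))         # sqrt(1 - v), computed stably
    # log(1 - s) = log(v) - log(1 + s), since (1 - s)(1 + s) = v
    log_y = log_v - math.log1p(s)
    w = 1.0 / (1.0 - log_y / beta)            # baseline probability level in (0, 1)
    return base_qf(w)

# Hypothetical baseline, for illustration only: the standard exponential distribution.
def exp_cdf(x):
    return -math.expm1(-x)

def exp_qf(w):
    return -math.log1p(-w)

alpha, beta = 2.0, 1.5                        # illustrative values
round_trip = [nptl_cdf(nptl_quantile(u, alpha, beta, exp_qf), alpha, beta, exp_cdf)
              for u in (0.05, 0.5, 0.95)]
print(round_trip)
```

Each entry of `round_trip` should reproduce the input level `u` up to floating-point error, which is the defining property of a quantile function for a continuous, strictly increasing cdf.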
From the qf, we can define several quantities of importance, providing distributional properties of the family. Some of them are presented below.
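The three quartiles of the NPTL-G family are defined by $Q_1 = Q(1/4; \alpha, \beta, \xi)$, $M = Q(1/2; \alpha, \beta, \xi)$, and $Q_3 = Q(3/4; \alpha, \beta, \xi)$. In particular, the median of the NPTL-G family is given by
$$M = Q_G\left(\left\{1 - \frac{1}{\beta} \log\left[1 - \sqrt{1 - 2^{-1/\alpha}}\right]\right\}^{-1}; \xi\right).$$
Additionally, the inter-quartile range is given by $IQR = Q_3 - Q_1$, allowing one to define the Galton coefficient of skewness and the Moors coefficient of kurtosis, given by, respectively,
$$S = \frac{Q_3 - 2M + Q_1}{IQR}$$
and
$$K = \frac{Q(7/8; \alpha, \beta, \xi) - Q(5/8; \alpha, \beta, \xi) + Q(3/8; \alpha, \beta, \xi) - Q(1/8; \alpha, \beta, \xi)}{IQR}.$$
See [28] and [29], respectively, for more details on these coefficients.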
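The two quantile-based coefficients are easy to compute once a qf is available. The sketch below uses the usual definitions of the Galton skewness, $(Q_3 - 2M + Q_1)/(Q_3 - Q_1)$, and the Moors kurtosis, based on the octiles divided by the inter-quartile range. The NPTLILx qf coded here assumes the composed form of the cdf with the inverse Lomax baseline $G(x; \theta) = (1 + x^{-1})^{-\theta}$; the parameter values are hypothetical, and the exponential qf is included only as a sanity check with known closed-form quantiles.

```python
import math

def galton_skewness(qf):
    """Galton coefficient of skewness computed from a quantile function."""
    q1, m, q3 = qf(0.25), qf(0.5), qf(0.75)
    return (q3 - 2.0 * m + q1) / (q3 - q1)

def moors_kurtosis(qf):
    """Moors coefficient of kurtosis computed from a quantile function."""
    numerator = qf(7 / 8) - qf(5 / 8) + qf(3 / 8) - qf(1 / 8)
    return numerator / (qf(0.75) - qf(0.25))

def nptlilx_quantile(u, alpha, beta, theta):
    """NPTLILx qf under the assumed cdf form (inverse Lomax baseline)."""
    log_v = math.log(u) / alpha
    s = math.sqrt(-math.expm1(log_v))          # sqrt(1 - u**(1/alpha))
    log_y = log_v - math.log1p(s)              # log(1 - s)
    return 1.0 / ((1.0 - log_y / beta) ** (1.0 / theta) - 1.0)

# Sanity check on a distribution with known quantiles (standard exponential).
exp_qf = lambda u: -math.log1p(-u)
S_exp, K_exp = galton_skewness(exp_qf), moors_kurtosis(exp_qf)

# Hypothetical NPTLILx parameter values, for illustration only.
qf = lambda u: nptlilx_quantile(u, 2.0, 1.5, 1.0)
S, K = galton_skewness(qf), moors_kurtosis(qf)
print(S_exp, K_exp, S, K)
```

Because both coefficients only need a handful of quantile evaluations, they remain well defined even when moments do not exist.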
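On the other hand, upon differentiation of $Q(u; \alpha, \beta, \xi)$ with respect to u, the corresponding quantile density function is given by
$$q(u; \alpha, \beta, \xi) = \frac{u^{1/\alpha - 1}\, w(u)^{2}}{2\alpha\beta \sqrt{1 - u^{1/\alpha}} \left[1 - \sqrt{1 - u^{1/\alpha}}\right]}\, q_G(w(u); \xi), \quad w(u) = \left\{1 - \frac{1}{\beta} \log\left[1 - \sqrt{1 - u^{1/\alpha}}\right]\right\}^{-1}, \quad u \in (0, 1),$$
where $q_G(u; \xi)$ is the quantile density function corresponding to $G(x; \xi)$. Also, the hazard quantile function is defined by
$$H(u; \alpha, \beta, \xi) = \frac{1}{(1 - u)\, q(u; \alpha, \beta, \xi)}, \quad u \in (0, 1).$$
These functions have central roles in reliability. Further details can be found in [30].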
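Last but not least, the qf allows us to generate values from members of the NPTL-G family. This property will be used in Section 4.2 in the context of the NPTLILx distribution; i.e., with the qf given by $Q_G(u; \theta) = \left(u^{-1/\theta} - 1\right)^{-1}$, $u \in (0, 1)$, so
$$Q(u; \alpha, \beta, \theta) = \left[\left\{1 - \frac{1}{\beta} \log\left[1 - \sqrt{1 - u^{1/\alpha}}\right]\right\}^{1/\theta} - 1\right]^{-1}, \quad u \in (0, 1).$$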
As a numerical illustration, Table 1 shows the values of $Q_1$, M, $Q_3$, S, and K of the NPTLILx distribution for some parameter values.
Table 1.
The values of $Q_1$, M, $Q_3$, S, and K of the NPTLILx distribution for some parameter values.
$Q_1$ | M | $Q_3$ | S | K | |
---|---|---|---|---|---|
0.0162 | 0.0413 | 0.1108 | 0.4703 | 1.8976 | |
0.0667 | 0.1377 | 0.3000 | 0.3917 | 1.7017 | |
0.1150 | 0.2200 | 0.4464 | 0.3664 | 1.6542 | |
0.1594 | 0.2922 | 0.5701 | 0.3533 | 1.6323 | |
0.2200 | 0.3877 | 0.7298 | 0.3421 | 1.6151 | |
0.5513 | 0.9168 | 1.6330 | 0.3244 | 1.5894 | |
1.2624 | 2.0174 | 3.4724 | 0.3167 | 1.5807 | |
1.9896 | 3.1308 | 5.3207 | 0.3148 | 1.5788 | |
3.4563 | 5.3666 | 9.0231 | 0.3137 | 1.5778 | |
7.3808 | 11.2118 | 18.5331 | 0.3130 | 1.5772 | |
15.2458 | 22.9128 | 37.5597 | 0.3128 | 1.5771 | |
23.1143 | 34.6163 | 56.5877 | 0.3128 | 1.5771 | |
38.8534 | 58.0246 | 94.6445 | 0.3128 | 1.5771 |
We see in Table 1 that the effects of $\alpha$, $\beta$, and $\theta$ on the quartiles are significant. We always have $S > 0$, so the distribution is right-skewed, and K shows only moderate variations.
3.3. Series Expansion
The exp-G family of distributions, introduced by [31], is defined by the following cdf: $H_{\gamma}(x; \xi) = G(x; \xi)^{\gamma}$, $x \in \mathbb{R}$, with $\gamma > 0$. The corresponding pdf is given by
$$h_{\gamma}(x; \xi) = \gamma\, g(x; \xi)\, G(x; \xi)^{\gamma - 1}, \quad x \in \mathbb{R}.$$
The appeal of the exp-G family is that its properties are well known for many baseline cdfs $G(x; \xi)$. For instance, the member of the exp-G family defined with the inverse Lomax distribution with shape parameter $\theta$ as baseline is again an inverse Lomax distribution, with shape parameter $\gamma\theta$.
The following result concerns a series expansion for the pdf of the NPTL-G family in terms of pdfs of the exp-G family.
Proposition 3.
We have the following series expansion:
where
with the notation: .
Proof.
We first investigate a series expansion of based on the Equation (1). Since , the generalized binomial formula gives
On the other hand, thanks to the power series of the exponential function, we get
Now, it follows from the generalized and standard binomial formulas that
By combining all the above equalities together, we obtain
Upon differentiation of according to x, we get the desired result, by removing the term in , which vanished. Proposition 3 is proven. □
3.4. General Moments with Some Related Measures and Functions
Let X be a random variable having the cdf given by (1) (defined on a probability space , with an expectation denoted by E). Then, for any function (such that all the following introduced quantities exist or converge), we have
Two equivalent expressions involving already introduced qfs are as follows:
and
Numerical solutions exist to evaluate them for given , and , , and . Alternatively, we can consider Proposition 3, which implies that
(5) |
where
In some circumstances, truncated sums can be considered for practical purposes; for a large integer K, the following approximation reveals to be tractable and efficient:
Some specific choices for are of particular interest. Some of them are discussed below.
By taking $\phi(x) = x^{s}$, we get the s-th moment of X, i.e., $\mu'_s = E(X^{s})$, including the mean of X, i.e., $\mu = \mu'_1$, which also allows one to express the variance of X, i.e., $\sigma^{2} = \mu'_2 - \mu^{2}$.
By taking $\phi(x) = (x - \mu)^{s}$, we get the s-th central moment of X, i.e., $\mu_s = E[(X - \mu)^{s}]$, allowing one to calculate the s-th general coefficient of X given by $C_s = \mu_s / \sigma^{s}$, among others. This coefficient is useful to investigate the skewness and kurtosis properties of X.
By taking $\phi(x) = e^{tx}$, we get the moment generating function of X in the variable t, i.e., $M(t) = E(e^{tX})$. It is well known that $\mu'_s = M^{(s)}(0)$, the s-th derivative of $M(t)$ at $t = 0$.
By taking $\phi(x) = e^{itx}$, we get the characteristic function of X in the variable t, i.e., $\varphi(t) = E(e^{itX})$. In the same way as the cdf, the characteristic function entirely determines the NPTL-G family.
By taking $\phi(x) = x^{s} 1_{\{x \le y\}}$, which is equal to $x^{s}$ if $x \le y$ and 0 otherwise, we get the s-th incomplete moment of X in the variable y, i.e., $\mu'_s(y) = E(X^{s} 1_{\{X \le y\}})$. This function is useful to define mean deviations of X, the corresponding residual life function, Bonferroni and Lorenz curves, and others.
In the case of the NPTLILx distribution, since $f(x; \alpha, \beta, \theta) \sim 2\alpha\beta^{2}\theta^{2} x^{-3}$ when $x \to +\infty$, the mean exists but the variance does not exist, nor do moments of order greater than 2 (there is no problem when $x \to 0$). However, all the incomplete moments exist for any fixed $y$. In this regard, Table 2 provides the first four incomplete moments of X for a fixed value of $y$.
Table 2.
The values for the first four incomplete moments of the NPTLILx distribution; i.e., with , with , for some parameter values.
0.1330 | 0.4114 | 62.0150 | 31180.1500 | |
0.2289 | 0.8177 | 124.0269 | 62360.2800 | |
0.5037 | 2.4083 | 372.0441 | 187080.6 | |
0.3216 | 1.5544 | 247.4997 | 124628 | |
1.1648 | 12.2536 | 2209.2820 | 1118342 | |
0.4499 | 1.6965 | 248.6065 | 124813.2 | |
3.6376 | 41.6806 | 6308.0930 | 3138854 | |
1.8826 | 22.1519 | 3941.2640 | 1991066 | |
8.7637 | 288.4201 | 62161.91 | 31710832 | |
65.7921 | 9473.519 | 3149044 | 1707182402 | |
125.1442 | 28393.86 | 11331002 | 6455199311 | |
180.9897 | 55135.34 | 25604726 | 15505315569 | |
268.5182 | 106564.8 | 56092507 | 35760970261 | |
337.2701 | 160356.5 | 93263392 | 62533636278 |
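Incomplete moments such as those in Table 2 can be approximated numerically. One possible sketch, not necessarily the method used for the table, rewrites $E(X^{s} 1_{\{X \le y\}})$ as the integral of $Q(u)^{s}$ over $u \in (0, F(y))$ and applies a midpoint rule. The cdf and qf below assume the composed form of the NPTLILx distribution with the inverse Lomax baseline; the parameter values and the truncation point `y` are hypothetical.

```python
import math

def nptlilx_cdf(x, alpha, beta, theta):
    """NPTLILx cdf under the assumed composed form."""
    y = math.exp(beta * (1.0 - (1.0 + 1.0 / x) ** theta))
    return (y * (2.0 - y)) ** alpha

def nptlilx_quantile(u, alpha, beta, theta):
    """NPTLILx qf, the inverse of nptlilx_cdf."""
    log_v = math.log(u) / alpha
    s = math.sqrt(-math.expm1(log_v))          # sqrt(1 - u**(1/alpha))
    log_y = log_v - math.log1p(s)              # log(1 - s)
    return 1.0 / ((1.0 - log_y / beta) ** (1.0 / theta) - 1.0)

def incomplete_moment(s, y, alpha, beta, theta, n=20000):
    """Midpoint-rule approximation of E[X**s * 1{X <= y}] on the quantile scale."""
    upper = nptlilx_cdf(y, alpha, beta, theta)
    step = upper / n
    total = sum(nptlilx_quantile((i + 0.5) * step, alpha, beta, theta) ** s
                for i in range(n))
    return total * step

params = (2.0, 1.5, 1.0)                       # hypothetical (alpha, beta, theta)
m0 = incomplete_moment(0, 5.0, *params)        # equals F(5) by construction
m1 = incomplete_moment(1, 5.0, *params)
m1_wider = incomplete_moment(1, 10.0, *params)
print(m0, m1, m1_wider)
```

Working on the quantile scale keeps the integration range bounded, so the heavy right tail of the distribution does not have to be truncated by hand.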
3.5. Shannon Entropy
Here, we study the Shannon entropy of the NPTL-G family as defined by [32]. We recall that the Shannon entropy of a random variable measures the amount of uncertainty for the outcome of this variable. A high entropy reveals a high degree of uncertainty.
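Now, let X be a random variable having the cdf given by (1). Then, the Shannon entropy of X is defined by
$$\eta = -E\{\log[f(X; \alpha, \beta, \xi)]\} = -\int_{-\infty}^{+\infty} f(x; \alpha, \beta, \xi) \log[f(x; \alpha, \beta, \xi)]\, dx.$$
Using mathematical software, this integral can be evaluated numerically for a given baseline cdf $G(x; \xi)$ and given values of $\alpha$, $\beta$, and $\xi$. Another approach consists of expanding $\eta$ by the use of the pdf given by (2):
$$\eta = -\log(2\alpha\beta) - \alpha\beta - E\{\log[g(X; \xi)]\} + 2E\{\log[G(X; \xi)]\} + \alpha\beta E\left[G(X; \xi)^{-1}\right] - E\left\{\log\left[1 - e^{\beta\left[1 - 1/G(X; \xi)\right]}\right]\right\} - (\alpha - 1) E\left\{\log\left[2 - e^{\beta\left[1 - 1/G(X; \xi)\right]}\right]\right\}.$$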
Some expectation terms can be expressed by using (5) with an appropriate function as soon as exists and the sums converge.
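As a hedged numerical sketch, the Shannon entropy can also be approximated on the quantile scale, using the identity that $-E[\log f(X)]$ equals minus the integral of $\log f(Q(u))$ over $u \in (0, 1)$. The generic routine below is first checked on the standard exponential distribution, whose entropy is exactly 1, and then applied to the NPTLILx distribution under the assumed composed form of its cdf. The parameter values are hypothetical and the routine is not claimed to reproduce Table 3.

```python
import math

def entropy_from_quantile(logpdf, qf, n=20000):
    """Midpoint-rule approximation of -E[log f(X)] on the quantile scale."""
    return -sum(logpdf(qf((i + 0.5) / n)) for i in range(n)) / n

def nptlilx_logpdf(x, alpha, beta, theta):
    """Log of the NPTLILx pdf under the assumed composed form."""
    z = 1.0 + 1.0 / x
    e = beta * (1.0 - z ** theta)              # exponent, always negative
    y = math.exp(e)
    return (math.log(2.0 * alpha * beta * theta) - 2.0 * math.log(x)
            + (theta - 1.0) * math.log(z) + alpha * e
            + math.log(-math.expm1(e))         # log(1 - y)
            + (alpha - 1.0) * math.log(2.0 - y))

def nptlilx_quantile(u, alpha, beta, theta):
    """NPTLILx qf under the same assumed form."""
    log_v = math.log(u) / alpha
    s = math.sqrt(-math.expm1(log_v))
    log_y = log_v - math.log1p(s)
    return 1.0 / ((1.0 - log_y / beta) ** (1.0 / theta) - 1.0)

# Check on the standard exponential distribution (entropy exactly 1).
H_exp = entropy_from_quantile(lambda x: -x, lambda u: -math.log1p(-u))

a, b, t = 2.0, 1.0, 1.5                        # hypothetical parameter values
H = entropy_from_quantile(lambda x: nptlilx_logpdf(x, a, b, t),
                          lambda u: nptlilx_quantile(u, a, b, t))
print(H_exp, H)
```

The midpoint rule avoids evaluating the integrand at the endpoints, where $\log f(Q(u))$ diverges (integrably) for both distributions.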
In the context of the NPTLILx distribution, some values of are collected in Table 3 for some parameter values.
Table 3.
The values of the Shannon entropy of the NPTLILx distribution for some parameter values.
0.5 | 0.5 | 0.5 | |
1.0 | 0.5 | 0.5 | |
2.0 | 0.5 | 0.5 | |
3.0 | 0.5 | 0.5 | 0.1094 |
5.0 | 0.5 | 0.5 | 0.4405 |
10 | 0.5 | 0.5 | 0.8528 |
10 | 1.0 | 0.5 | 1.5895 |
10 | 2.0 | 0.5 | 2.2907 |
10 | 5.0 | 0.5 | 3.1614 |
10 | 8 | 0.5 | 3.5497 |
0.5 | 0.5 | 0.1 | |
5.0 | 0.5 | 0.1 | |
5.0 | 5.0 | 0.1 | 1.1590 |
In Table 3, the values range over a wide interval, meaning that $\alpha$, $\beta$, and $\theta$ have an important impact on the amount of information quantified by $\eta$.
4. Estimation with Numerical Results
In this section, we investigate the NPTLILx model characterized by the cdf given by (3). Thanks to its attractive theoretical and practical properties, the maximum likelihood method is used to estimate the parameters $\alpha$, $\beta$, and $\theta$. Numerical results attest to the efficiency of the estimates obtained.
Hereafter, we consider a random variable X following the NPTLILx distribution with parameters $\alpha$, $\beta$, and $\theta$.
4.1. Maximum Likelihood Estimation
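Let $x_1, \ldots, x_n$ be a random sample of size n of X. Then, by using the pdf given by (4), the likelihood and log-likelihood functions are, respectively, given by
$$L(\alpha, \beta, \theta) = \prod_{i=1}^{n} f(x_i; \alpha, \beta, \theta)$$
and
$$\ell(\alpha, \beta, \theta) = n \log(2\alpha\beta\theta) - 2\sum_{i=1}^{n} \log x_i + (\theta - 1) \sum_{i=1}^{n} \log\left(1 + x_i^{-1}\right) + \alpha\beta \sum_{i=1}^{n} \left[1 - (1 + x_i^{-1})^{\theta}\right] + \sum_{i=1}^{n} \log\left\{1 - e^{\beta\left[1 - (1 + x_i^{-1})^{\theta}\right]}\right\} + (\alpha - 1) \sum_{i=1}^{n} \log\left\{2 - e^{\beta\left[1 - (1 + x_i^{-1})^{\theta}\right]}\right\}.$$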
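The maximum likelihood estimates (MLEs) of $\alpha$, $\beta$, and $\theta$, say $\hat{\alpha}$, $\hat{\beta}$, and $\hat{\theta}$, respectively, are defined such that $L(\hat{\alpha}, \hat{\beta}, \hat{\theta}) = \max L(\alpha, \beta, \theta)$ or, equivalently, $\ell(\hat{\alpha}, \hat{\beta}, \hat{\theta}) = \max \ell(\alpha, \beta, \theta)$. Let us work with the function $\ell(\alpha, \beta, \theta)$ for the sake of simplicity. Since $\ell(\alpha, \beta, \theta)$ is differentiable with respect to $\alpha$, $\beta$, and $\theta$, the MLEs can be obtained by solving the non-linear equations obtained by setting the first partial derivatives of $\ell(\alpha, \beta, \theta)$ with respect to $\alpha$, $\beta$, and $\theta$ equal to 0, with, writing $t_i = (1 + x_i^{-1})^{\theta}$ and $y_i = e^{\beta(1 - t_i)}$,
$$\frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha} + \beta \sum_{i=1}^{n} (1 - t_i) + \sum_{i=1}^{n} \log(2 - y_i), \quad \frac{\partial \ell}{\partial \beta} = \frac{n}{\beta} + \alpha \sum_{i=1}^{n} (1 - t_i) - \sum_{i=1}^{n} \frac{(1 - t_i)\, y_i}{1 - y_i} - (\alpha - 1) \sum_{i=1}^{n} \frac{(1 - t_i)\, y_i}{2 - y_i}$$
and
$$\frac{\partial \ell}{\partial \theta} = \frac{n}{\theta} + \sum_{i=1}^{n} \log\left(1 + x_i^{-1}\right) - \alpha\beta \sum_{i=1}^{n} t_i \log\left(1 + x_i^{-1}\right) + \beta \sum_{i=1}^{n} \frac{t_i\, y_i \log\left(1 + x_i^{-1}\right)}{1 - y_i} + (\alpha - 1) \beta \sum_{i=1}^{n} \frac{t_i\, y_i \log\left(1 + x_i^{-1}\right)}{2 - y_i}.$$
The complexity of these expressions does not allow us to provide closed forms for the MLEs. However, several numerical methods exist to maximize $\ell(\alpha, \beta, \theta)$, based on Newton–Raphson algorithms, one of which is employed in this study.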
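To make the objective function concrete, here is a minimal sketch of the log-likelihood that a numerical optimizer would maximize. It assumes the composed form of the NPTLILx pdf with the inverse Lomax baseline $G(x; \theta) = (1 + x^{-1})^{-\theta}$; the parameter values are hypothetical, and the data are the first eight observations of Dataset I from Section 5. The expanded log-likelihood is cross-checked against the sum of the logarithms of the pdf values.

```python
import math

def nptlilx_pdf(x, alpha, beta, theta):
    """NPTLILx pdf under the assumed composed form."""
    z = 1.0 + 1.0 / x
    y = math.exp(beta * (1.0 - z ** theta))
    return (2.0 * alpha * beta * theta * x ** -2 * z ** (theta - 1.0)
            * y ** alpha * (1.0 - y) * (2.0 - y) ** (alpha - 1.0))

def nptlilx_loglik(params, data):
    """Log-likelihood of (alpha, beta, theta), written in expanded form."""
    alpha, beta, theta = params
    total = len(data) * math.log(2.0 * alpha * beta * theta)
    for x in data:
        z = 1.0 + 1.0 / x
        e = beta * (1.0 - z ** theta)          # exponent, always negative
        total += (-2.0 * math.log(x) + (theta - 1.0) * math.log(z)
                  + alpha * e
                  + math.log(-math.expm1(e))   # log(1 - exp(e))
                  + (alpha - 1.0) * math.log(2.0 - math.exp(e)))
    return total

data = [0.50, 0.60, 0.60, 0.70, 0.70, 0.70, 0.80, 0.80]   # first values of Dataset I
params = (1.0, 0.5, 1.0)                                   # hypothetical values
loglik = nptlilx_loglik(params, data)
direct = sum(math.log(nptlilx_pdf(x, *params)) for x in data)
print(loglik, direct)
```

Working with the expanded form rather than the product of pdf values avoids underflow for larger samples; any general-purpose optimizer could then be applied to the negative of `nptlilx_loglik`.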
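The corresponding observed Fisher information matrix is given by
$$J(\alpha, \beta, \theta) = -\begin{pmatrix} \dfrac{\partial^{2} \ell}{\partial \alpha^{2}} & \dfrac{\partial^{2} \ell}{\partial \alpha\, \partial \beta} & \dfrac{\partial^{2} \ell}{\partial \alpha\, \partial \theta} \\ \dfrac{\partial^{2} \ell}{\partial \beta\, \partial \alpha} & \dfrac{\partial^{2} \ell}{\partial \beta^{2}} & \dfrac{\partial^{2} \ell}{\partial \beta\, \partial \theta} \\ \dfrac{\partial^{2} \ell}{\partial \theta\, \partial \alpha} & \dfrac{\partial^{2} \ell}{\partial \theta\, \partial \beta} & \dfrac{\partial^{2} \ell}{\partial \theta^{2}} \end{pmatrix}$$
evaluated at $(\hat{\alpha}, \hat{\beta}, \hat{\theta})$ (the elements of $J$ are available upon request from the authors). When n is large, the distribution of the underlying random vector behind $(\hat{\alpha}, \hat{\beta}, \hat{\theta})$ can be approximated by a three-dimensional normal distribution with mean vector $(\alpha, \beta, \theta)$ and covariance matrix $J^{-1}$. By denoting $V_{\hat{\alpha}}$, $V_{\hat{\beta}}$, and $V_{\hat{\theta}}$ the diagonal elements of this matrix, we are able to construct asymptotic confidence intervals for $\alpha$, $\beta$, and $\theta$. Indeed, with the adopted notations, the asymptotic (equitailed) confidence intervals (CIs) of $\alpha$, $\beta$, and $\theta$ at the level $100(1 - \nu)\%$ are given by, respectively,
$$\hat{\alpha} \pm z_{\nu/2} \sqrt{V_{\hat{\alpha}}}, \quad \hat{\beta} \pm z_{\nu/2} \sqrt{V_{\hat{\beta}}},$$
and
$$\hat{\theta} \pm z_{\nu/2} \sqrt{V_{\hat{\theta}}},$$
where $z_{\nu/2}$ is the upper $\nu/2$-th percentile of the standard normal distribution $N(0, 1)$. For practical purposes, if the lower bound of an interval is negative, it can be set to 0, since all the parameters are positive. All the technical details can be found in [33].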
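The interval construction above can be sketched in a few lines. The example assumes that the standard errors reported in Table 8 are the square roots of the diagonal elements of the inverse observed information matrix, and it uses the usual "estimate plus or minus normal percentile times standard error" form with truncation at 0. With the first two NPTLILx estimates and standard errors of Table 8 (Dataset I), it reproduces the corresponding 90% intervals of Table 12 up to rounding. The function name is hypothetical.

```python
from statistics import NormalDist

def asymptotic_ci(estimate, std_error, level=0.95):
    """Equitailed asymptotic CI for a positive parameter, truncated at 0."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)   # upper percentile of N(0, 1)
    lower = max(0.0, estimate - z * std_error)            # parameters are positive
    upper = estimate + z * std_error
    return lower, upper

# First two NPTLILx estimates and standard errors for Dataset I (Table 8).
ci_first = asymptotic_ci(0.0682, 0.0897, level=0.90)
ci_second = asymptotic_ci(11.9902, 2.4167, level=0.90)
print(ci_first, ci_second)
```

The truncation at 0 only affects the first interval here, whose untruncated lower bound would be negative.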
4.2. Numerical Results
Here, we provide a simulation study to show the good behavior of the MLEs for the NPTLILx model presented in the subsection above. First of all, let us mention that a random sample from X can be obtained by the use of the qf: for any random sample of size n from the uniform distribution $U(0, 1)$, say $u_1, \ldots, u_n$, the corresponding random sample of size n of X is given by $x_1, \ldots, x_n$ with $x_i = Q(u_i; \alpha, \beta, \theta)$ for $i = 1, \ldots, n$.
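The inverse transform step described here can be sketched as follows. The qf assumes the composed form of the NPTLILx cdf with the inverse Lomax baseline; the function names, seed, sample size, and parameter values are hypothetical. As a rough check, about half of the simulated values should fall below the theoretical median.

```python
import math
import random

def nptlilx_quantile(u, alpha, beta, theta):
    """NPTLILx qf under the assumed composed form of the cdf."""
    log_v = math.log(u) / alpha
    s = math.sqrt(-math.expm1(log_v))          # sqrt(1 - u**(1/alpha))
    log_y = log_v - math.log1p(s)              # log(1 - s)
    return 1.0 / ((1.0 - log_y / beta) ** (1.0 / theta) - 1.0)

def nptlilx_sample(n, alpha, beta, theta, seed=None):
    """Draw n values by applying the qf to independent uniform variates."""
    rng = random.Random(seed)
    values = []
    while len(values) < n:
        u = rng.random()
        if 0.0 < u < 1.0:                      # the qf is defined on the open interval
            values.append(nptlilx_quantile(u, alpha, beta, theta))
    return values

params = (2.0, 1.5, 1.0)                       # hypothetical (alpha, beta, theta)
sample = nptlilx_sample(20000, *params, seed=1)
median = nptlilx_quantile(0.5, *params)
fraction_below = sum(x <= median for x in sample) / len(sample)
print(fraction_below)
```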
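From N random samples of X, let $\epsilon$ be either $\alpha$, $\beta$, or $\theta$ and $\hat{\epsilon}_i$ be the MLE of $\epsilon$ constructed from the i-th sample. Then, we define the (mean) MLE, bias, and mean square error (MSE) by, respectively,
$$\widehat{\epsilon} = \frac{1}{N} \sum_{i=1}^{N} \hat{\epsilon}_i, \quad \text{Bias} = \frac{1}{N} \sum_{i=1}^{N} (\hat{\epsilon}_i - \epsilon), \quad \text{MSE} = \frac{1}{N} \sum_{i=1}^{N} (\hat{\epsilon}_i - \epsilon)^{2}.$$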
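Additionally, the asymptotic (mean) confidence intervals of $\alpha$, $\beta$, and $\theta$ at the level $100(1 - \nu)\%$ can be determined. We define the (mean) lower bounds (LBs), (mean) upper bounds (UBs), and (mean) average lengths (ALs) by, respectively,
$$\text{LB} = \frac{1}{N} \sum_{i=1}^{N} \text{LB}_i, \quad \text{UB} = \frac{1}{N} \sum_{i=1}^{N} \text{UB}_i, \quad \text{AL} = \frac{1}{N} \sum_{i=1}^{N} (\text{UB}_i - \text{LB}_i),$$
where $\text{LB}_i$ and $\text{UB}_i$ denote the lower and upper bounds of the asymptotic confidence interval computed from the i-th sample. For the purposes of this study, we consider the levels 90% and 95%, so $z_{0.05} = 1.645$ and $z_{0.025} = 1.96$, respectively. The software Mathematica 9 was employed.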
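The Monte Carlo summaries used in this study can be illustrated with a small helper, assuming the usual definitions: the mean estimate is the average over replications, the bias is the mean estimate minus the true value, and the MSE is the average squared deviation from the true value. The function name and the toy numbers are hypothetical and are not taken from the simulation tables.

```python
def summarize_estimates(estimates, true_value):
    """Return the mean estimate, bias, and MSE over Monte Carlo replications."""
    n_rep = len(estimates)
    mean_estimate = sum(estimates) / n_rep
    bias = mean_estimate - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return mean_estimate, bias, mse

# Toy replications of an estimator whose true target is 1.0.
mean_estimate, bias, mse = summarize_estimates([0.9, 1.1, 1.2], true_value=1.0)
print(mean_estimate, bias, mse)
```

In a full study, `estimates` would hold the MLEs of one parameter obtained from each simulated sample.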
Our simulation study was based on the following plan.
N random samples of size $n = 100$, 200, 300, and 1000 are to be generated from X.
Values of the true parameters are taken as, in order, , , and .
The MLEs, MSEs, biases, LBs, UBs, and ALs for the selected values of the parameters are to be calculated.
Numerical outcomes are listed in Table 4, Table 5 and Table 6.
Table 4.
Maximum likelihood estimates (MLEs), biases, MSEs, LBs, UBs, and ALs of the NPTLILx model for .
n | Par. | ML | Bias | MSE | 90% | 95% | ||||
---|---|---|---|---|---|---|---|---|---|---|
LB | UB | AL | LB | UB | AL | |||||
100 | 0.290 | 0.105 | 1.109 | 1.637 | 1.266 | 1.951 | ||||
0.163 | 0.063 | 0.016 | 0.417 | 0.508 | 0.466 | 0.605 | ||||
0.572 | 0.072 | 0.012 | 0.364 | 0.780 | 0.416 | 0.324 | 0.820 | 0.496 | ||
200 | 0.304 | 0.103 | 0.937 | 1.266 | 1.058 | 1.508 | ||||
0.146 | 0.046 | 0.008 | 0.324 | 0.354 | 0.358 | 0.422 | ||||
0.567 | 0.067 | 0.011 | 0.386 | 0.749 | 0.363 | 0.351 | 0.784 | 0.433 | ||
300 | 0.340 | 0.084 | 0.944 | 1.207 | 1.059 | 1.438 | ||||
0.137 | 0.037 | 0.006 | 0.021 | 0.254 | 0.233 | 0.276 | 0.277 | |||
0.548 | 0.048 | 0.007 | 0.414 | 0.683 | 0.269 | 0.388 | 0.709 | 0.321 | ||
1000 | 0.342 | 0.074 | 0.747 | 0.811 | 0.825 | 0.966 | ||||
0.124 | 0.024 | 0.002 | 0.047 | 0.200 | 0.153 | 0.033 | 0.215 | 0.182 | ||
0.545 | 0.045 | 0.005 | 0.435 | 0.655 | 0.220 | 0.414 | 0.676 | 0.263 |
Table 5.
MLEs, biases, MSEs, LBs, UBs, and ALs of the NPTLILx model for .
n | Par. | ML | Bias | MSE | 90% | 95% | ||||
---|---|---|---|---|---|---|---|---|---|---|
LB | UB | AL | LB | UB | AL | |||||
100 | 2.042 | 0.542 | 5.123 | 215.955 | 427.827 | 256.918 | 509.752 | |||
0.373 | 0.055 | 9.711 | 18.674 | 11.499 | 22.250 | |||||
0.618 | 0.118 | 0.060 | 31.814 | 62.393 | 37.788 | 74.341 | ||||
200 | 1.843 | 0.343 | 1.449 | 84.975 | 166.265 | 100.894 | 198.103 | |||
0.442 | 0.015 | 5.124 | 9.363 | 6.021 | 11.156 | |||||
0.547 | 0.047 | 0.019 | 11.206 | 21.317 | 13.247 | 25.400 | ||||
300 | 1.667 | 0.167 | 0.874 | 49.192 | 95.050 | 58.293 | 113.252 | |||
0.458 | 0.012 | 2.935 | 4.954 | 3.410 | 5.903 | |||||
0.542 | 0.042 | 0.013 | 6.818 | 12.550 | 8.019 | 14.953 | ||||
1000 | 1.358 | 0.460 | 5.008 | 7.300 | 5.707 | 8.698 | ||||
0.534 | 0.034 | 0.002 | 0.184 | 0.884 | 0.700 | 0.117 | 0.951 | 0.834 | ||
0.534 | 0.034 | 0.010 | 0.279 | 0.788 | 0.509 | 0.230 | 0.837 | 0.607 |
Table 6.
MLEs, biases, MSEs, LBs, UBs, and ALs of the NPTLILx model for .
n | Par. | ML | Bias | MSE | 90% | 95% | ||||
---|---|---|---|---|---|---|---|---|---|---|
LB | UB | AL | LB | UB | AL | |||||
100 | 2.843 | 1.043 | 1.889 | 1330.640 | 2655.600 | 1584.900 | 3164.120 | |||
0.639 | 0.239 | 0.675 | 95.462 | 189.645 | 113.619 | 225.960 | ||||
1.573 | 0.373 | 0.660 | 255.807 | 508.468 | 304.490 | 605.834 | ||||
200 | 1.095 | 0.523 | 361.618 | 721.046 | 430.654 | 859.119 | ||||
0.331 | −0.069 | 0.014 | 28.493 | 56.324 | 33.886 | 67.109 | ||||
1.571 | 0.371 | 0.183 | 112.947 | 222.752 | 134.274 | 265.406 | ||||
300 | 1.150 | 0.519 | 309.685 | 616.159 | 368.679 | 734.147 | ||||
0.421 | 0.021 | 0.004 | 19.804 | 38.766 | 23.515 | 46.189 | ||||
1.390 | 0.190 | 0.165 | 95.370 | 188.020 | 113.372 | 224.024 | ||||
1000 | 1.260 | 0.496 | 141.281 | 280.040 | 168.093 | 333.665 | ||||
0.387 | 0.001 | 10.007 | 19.341 | 11.859 | 23.044 | |||||
1.361 | 0.161 | 0.150 | 48.295 | 93.407 | 57.238 | 111.293 |
From Table 4, Table 5 and Table 6, we can see that, when n increases, biases, MSEs, and ALs decrease. This observation is consistent with the well-known convergence properties of the MLEs.
5. Data Analysis
In this section, we illustrate the flexibility of the NPTLILx model by analyzing two practical datasets. The fits of the NPTLILx model are compared to those of the competing models listed in Table 7. The common point of all of them is the use of the inverse Lomax distribution as the baseline distribution.
Table 7.
The competitive models considered.
Except for the original inverse Lomax distribution, the considered models possess three or four parameters. The comparison of these models was performed by using the following well-known statistical benchmarks: CVM (Cramér–von Mises); AD (Anderson–Darling); KS (Kolmogorov–Smirnov) statistic with the corresponding p-value; minus log-likelihood ($-\hat{\ell}$); AIC (Akaike information criterion); CAIC (corrected Akaike information criterion); BIC (Bayesian information criterion); and HQIC (Hannan–Quinn information criterion). For the CVM, AD, KS, $-\hat{\ell}$, AIC, CAIC, BIC, and HQIC, the smaller the value is, the better the fit to the data. Additionally, the higher the p-value of the KS test is, the better the fit to the data. All these measures were computed by using the R software.
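The information criteria listed above are not written out in the text, so the following sketch assumes their standard definitions in terms of the minimized negative log-likelihood, the number of parameters k, and the sample size n. With the NPTLILx values for Dataset I ($-\hat{\ell} = 88.8229$, $k = 3$, $n = 40$), these definitions reproduce the corresponding row of Table 10 up to rounding. The function name is hypothetical.

```python
import math

def information_criteria(neg_loglik, k, n):
    """Standard AIC, CAIC, BIC, and HQIC from the minimized negative log-likelihood."""
    aic = 2.0 * neg_loglik + 2.0 * k
    caic = aic + 2.0 * k * (k + 1) / (n - k - 1)          # small-sample correction
    bic = 2.0 * neg_loglik + k * math.log(n)
    hqic = 2.0 * neg_loglik + 2.0 * k * math.log(math.log(n))
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}

# NPTLILx model fitted to Dataset I: 3 parameters, 40 observations.
criteria = information_criteria(88.8229, k=3, n=40)
print(criteria)
```

Since all four criteria share the term $2(-\hat{\ell})$ and differ only in the penalty, models with the same number of parameters are ranked identically by each of them.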
Dataset I: The first dataset is taken from [23]. It consists of 40 observations of the active repair times for an airborne communication transceiver. The unit is the hour. The data are: 0.50, 0.60, 0.60, 0.70, 0.70, 0.70, 0.80, 0.80, 1.00, 1.00, 1.00, 1.00, 1.10, 1.30, 1.50, 1.50, 1.50, 1.50, 2.00, 2.00, 2.20, 2.50, 2.70, 3.00, 3.00, 3.30, 4.00, 4.00, 4.50, 4.70, 5.00, 5.40, 5.40, 7.00, 7.50, 8.80, 9.00, 10.20, 22.00, 24.50.
A basic statistical description of the data gives: , mean , standard deviation , median , skewness , and kurtosis . One can notice that the data are skewed to the right with a high kurtosis.
Dataset II: Next, we use the actual taxes dataset as described in [24]. The data consist of the monthly actual tax revenue in Egypt from January 2006 to November 2010. The unit is 1000 million Egyptian pounds. The data are: 5.9, 20.4, 14.9, 16.2, 17.2, 7.8, 6.1, 9.2, 10.2, 9.6, 13.3, 8.5, 21.6, 18.5, 5.1, 6.7, 17, 8.6, 9.7, 39.2, 35.7, 15.7, 9.7, 10, 4.1, 36, 8.5, 8, 9.2, 26.2, 21.9, 16.7, 21.3, 35.4, 14.3, 8.5, 10.6, 19.1, 20.5, 7.1, 7.7, 18.1, 16.5, 11.9, 7, 8.6, 12.5, 10.3, 11.2, 6.1, 8.4, 11, 11.6, 11.9, 5.2, 6.8, 8.9, 7.1, 10.8.
A basic statistical description of the data gives: , mean , standard deviation , median , skewness , and kurtosis . Thus, these data are skewed to the right with a moderate kurtosis.
The graphical and numerical analyses of these two datasets are as follows. Figure 3 presents the total test time (TTT) plots of the two datasets. The first plot shows a convex curve, indicating that a decreasing hrf for the fitting model is appropriate for Dataset I, whereas the second plot shows a concave curve, indicating that an increasing hrf for the fitting model is appropriate for Dataset II. These cases are covered by the NPTLILx model, as shown in Figure 2.
Figure 3.
Total test time (TTT) plots for Datasets I and II, respectively.
Table 8 and Table 9 present the CVM, AD, and KS statistics with the related p-value, and the MLEs of the models' parameters for Datasets I and II, respectively. The obtained p-values indicate that the NPTLILx model is the best. Table 10 and Table 11 report the $-\hat{\ell}$, AIC, CAIC, BIC, and HQIC of the models for Datasets I and II, respectively. Since the smallest values are obtained for the NPTLILx model, it can be considered the best according to these criteria. The estimated pdfs and cdfs for the considered models are displayed in Figure 4 and Figure 5 for Datasets I and II, respectively. The plots of the estimated pdfs are shown individually, for better readability, in Figure 6 and Figure 7. In order to give another point of view, we illustrate the adequacy of the models via probability–probability (PP) plots in Figure 8 and Figure 9, for Datasets I and II, respectively. In particular, for Dataset II, in view of the close agreement between the scatter plot and the PP line, the NPTLILx model provides a better fit in comparison to the other models. To summarize, the NPTLILx model proves to be the most appropriate model for the two datasets, illustrating its applicability in a concrete setting.
Table 8.
Goodness-of-fit measures, MLEs, and SEs for Dataset I.
Model | CVM | AD | KS | p-Value | MLEs with SEs (in Parentheses) | |||
---|---|---|---|---|---|---|---|---|
NPTLILx | 0.0550 | 0.3462 | 0.0943 | 0.8683 | ||||
0.0682 | 11.9902 | 1.5651 | ||||||
(0.0897) | (2.4167) | (0.6425) | ||||||
WILx | 0.1522 | 1.0852 | 0.1784 | 0.1566 | a | b | ||
0.0026 | 0.8867 | 0.0185 | 0.2581 | |||||
(0.0006) | (0.1938) | (0.0285) | (0.7650) | |||||
TILx | 0.0685 | 0.4578 | 0.1108 | 0.7100 | ||||
37.8324 | 2.9879 | 0.1676 | ||||||
(9.7598) | (1.9941) | (0.2682) | ||||||
PILx | 0.1079 | 0.6582 | 0.1272 | 0.5369 | ||||
0.1130 | 6.8594 | 0.0571 | ||||||
(0.1242) | (6.4470) | (0.2480) | ||||||
ILx | 0.0632 | 0.4065 | 0.0981 | 0.8355 | ||||
0.2003 | 8.2426 | |||||||
(0.1372) | (5.1671) |
Table 9.
Goodness-of-fit measures, MLEs, and SEs for Dataset II.
Model | CVM | AD | KS | p-Value | MLEs with SEs (in Parentheses) | |||
---|---|---|---|---|---|---|---|---|
NPTLILx | 0.0357 | 0.2698 | 0.0615 | 0.9786 | ||||
14.4361 | 0.4378 | 5.0301 | ||||||
(1.0378) | (1.1103) | (0.2431) | ||||||
WILx | 0.2363 | 1.4829 | 0.3248 | a | b | |||
0.0021 | 1.1404 | 0.0172 | 3.9985 | |||||
(0.0002) | (0.1202) | (0.0064) | (3.1371) | |||||
TILx | 0.0398 | 0.2701 | 0.0998 | 0.5988 | ||||
50.4579 | 0.0908 | 15.8617 | ||||||
(6.7245) | (0.2094) | (3.9080) | ||||||
PILx | 0.1133 | 0.6440 | 0.1447 | 0.1689 | ||||
1.1545 | 2.3262 | 300.7315 | ||||||
(0.4472) | (0.2880) | (121.3061) | ||||||
ILx | 0.0529 | 0.3075 | 0.2928 | |||||
0.1464 | 71.1473 | |||||||
(0.2569) | (23.8010) |
Table 10.
The values of $-\hat{\ell}$, AIC, CAIC, BIC, and HQIC for Dataset I.
Model | $-\hat{\ell}$ | AIC | CAIC | BIC | HQIC |
---|---|---|---|---|---|
NPTLILx | 88.8229 | 183.6459 | 184.3125 | 188.7125 | 185.4778 |
WILx | 98.2937 | 204.5874 | 205.7303 | 211.3429 | 207.0300 |
TILx | 90.5459 | 187.0919 | 187.7586 | 192.1586 | 188.9239 |
PILx | 90.5908 | 187.1817 | 187.8484 | 192.2483 | 189.0136 |
ILx | 91.3612 | 186.7226 | 187.0469 | 190.1003 | 187.9439 |
Table 11.
The values of $-\hat{\ell}$, AIC, CAIC, BIC, and HQIC for Dataset II.
Model | $-\hat{\ell}$ | AIC | CAIC | BIC | HQIC |
---|---|---|---|---|---|
NPTLILx | 189.2811 | 384.5622 | 384.9985 | 390.7948 | 386.9951 |
WILx | 219.6212 | 447.2424 | 447.9832 | 455.5526 | 450.4864 |
TILx | 190.3769 | 386.7538 | 387.1902 | 392.9864 | 389.1868 |
PILx | 195.1056 | 396.2113 | 396.6476 | 402.4439 | 398.6442 |
ILx | 211.6436 | 427.2872 | 427.5015 | 431.4422 | 428.9091 |
Figure 4.
Plots for the estimated pdfs and cdfs for Dataset I.
Figure 5.
Plots for the estimated pdfs and cdfs for Dataset II.
Figure 6.
Plots of the pdfs estimated for Dataset I.
Figure 7.
Plots of the pdfs estimated for Dataset II.
Figure 8.
Probability–probability (PP) plots of considered models for Dataset I.
Figure 9.
PP plots of considered models for Dataset II.
We end this section by providing some additional graphical and numerical elements on the NPTLILx model, related to the quantities presented in Section 4.1. To illustrate the uniqueness of the MLEs of the three model parameters, the profiles of the log-likelihood function are shown in Figure 10 and Figure 11 for Datasets I and II, respectively. The Fisher information matrices of the NPTLILx model taken at the MLEs for Datasets I and II are, respectively, given by
Figure 10.
Profiles of the log-likelihood function of the NPTLILx model for Dataset I.
Figure 11.
Profiles of the log-likelihood function of the NPTLILx model for Dataset II.
Then, the asymptotic confidence intervals for the three parameters of the NPTLILx model at the 90% and 95% levels are provided in Table 12.
Table 12.
Confidence intervals for the parameters of the NPTLILx model for Datasets I and II, respectively.
Dataset | CI | | | |
---|---|---|---|---|
I | 90% | [0, 0.2157] | [8.0147, 15.9656] | [0.5081, 2.6220] |
I | 95% | [0, 0.2440] | [7.2534, 16.7269] | [0.3058, 2.8244] |
II | 90% | [12.7289, 16.1432] | [0, 2.2642] | [4.6302, 5.4300] |
II | 95% | [12.4020, 16.4701] | [0, 2.6139] | [4.5536, 5.5065] |
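The intervals in Table 12 appear consistent with the usual asymptotic form, MLE ± z × SE, with the lower limit truncated at zero for positive parameters. The following sketch, which is not the authors' code, recomputes them for Dataset II from the MLEs and SEs in the NPTLILx row of Table 9. The function name `wald_interval`, the truncation at zero, and the levels 0.90 and 0.95 are assumptions made for illustration.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level, lower_bound=0.0):
    """Two-sided asymptotic confidence interval: estimate +/- z * std_error.

    The lower limit is truncated at `lower_bound` because the parameters
    are taken to be positive (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    lower = max(lower_bound, estimate - z * std_error)
    upper = estimate + z * std_error
    return lower, upper


# MLEs and SEs of the NPTLILx model for Dataset II, copied from Table 9.
mles = [14.4361, 0.4378, 5.0301]
ses = [1.0378, 1.1103, 0.2431]

# One list of three intervals per confidence level.
intervals = {
    level: [wald_interval(m, s, level) for m, s in zip(mles, ses)]
    for level in (0.90, 0.95)
}

for level, cis in intervals.items():
    formatted = "  ".join(f"[{lo:.4f}, {hi:.4f}]" for lo, hi in cis)
    print(f"{level:.0%}: {formatted}")
```

With these inputs, the computed limits match the Dataset II entries of Table 12 to about three decimal places, which supports reading the two levels as 90% and 95%.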
6. Conclusion and Perspectives
In this paper, we introduced and studied a new general family of distributions, called the NPTL-G family, based on the power Topp–Leone-G and inverse exponential-G families. Various mathematical properties were derived and discussed, including stochastic ordering, the quantile function and related measures, general moments and related measures, and the Shannon entropy. We then paid special attention to a member of the family defined with the inverse Lomax distribution as the baseline, called the NPTLILx distribution. The unknown model parameters were estimated by the maximum likelihood method, and a simulation study supported the numerical behavior of the estimates. The applicability of the NPTLILx model was then illustrated on two practical datasets. For these datasets, the goodness-of-fit measures and information criteria considered show that the NPTLILx model is a serious alternative to the WILx, TILx, PILx, and ILx models, which also use the inverse Lomax distribution as the baseline. Future work will include the construction of various regression models, Bayesian estimation of the parameters, and analyses of new datasets. Given these qualities, we believe that the NPTL-G family can be helpful to practitioners for statistical analyses beyond the scope of this paper.
Another interesting perspective is to investigate the confidence bounds and supersaturation properties of the cdfs of the members of the NPTL-G family, which are useful for choosing an appropriate model for given data, following the spirit of [38,39,40,41,42]. These aspects need further investigation, which we leave for future work.
Acknowledgments
We would like to thank the two reviewers for their thoughtful efforts towards improving our manuscript. The authors gratefully acknowledge the DSR for technical and financial support.
Author Contributions
R.A.R.B., F.J., C.C., and M.E. contributed equally to this work.
Funding
This work was funded by the Deanship of Scientific Research (DSR), King AbdulAziz University, Jeddah, under grant number DF-277-305-1441.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Eugene N., Lee C., Famoye F. The beta-normal distribution and its applications. Commun. Stat. Theory Methods. 2002;31:497–512. doi: 10.1081/STA-120003130.
2. Cordeiro G.M., de Castro M. A new family of generalized distributions. J. Stat. Comput. Simul. 2011;81:883–898. doi: 10.1080/00949650903530745.
3. Bourguignon M., Silva R.B., Cordeiro G.M. The Weibull-G family of probability distributions. J. Data Sci. 2014;12:53–68.
4. Elgarhy M., Hassan A.S., Rashed M. Garhy-Generated Family of Distributions with Application. Math. Theory Model. 2016;6:1–15.
5. Hassan A.S., Elgarhy M., Shakil M. Type II half logistic family of distributions with applications. Pak. J. Stat. Oper. Res. 2017;13:245–264.
6. Yousof H.M., Alizadeh M., Jahanshahi S.M.A., Ramires T.G., Ghosh I., Hamedani G.G. The Transmuted Topp-Leone G family of distributions: Theory, characterizations and applications. J. Data Sci. 2017;15:723–740.
7. Cordeiro G.M., Alizadeh M., Ozel G., Hosseini B., Ortega E.M.M., Altun E. The generalized odd log-logistic family of distributions: Properties, regression models and applications. J. Stat. Comput. Simul. 2017;87:908–932. doi: 10.1080/00949655.2016.1238088.
8. Haq M.A., Elgarhy M. The odd Fréchet-G family of probability distributions. J. Stat. Appl. Probab. 2018;7:189–203. doi: 10.18576/jsap/070117.
9. Hassan A.S., Nassar S.G. Power Lindley-G family. Ann. Data Sci. 2018;6:189–210.
10. Reyad H., Korkmaz M.C., Afify A.Z., Hamedani G.G., Othman S. The Fréchet Topp-Leone-G family of distributions: Properties, characterizations and applications. Ann. Data Sci. 2019. doi: 10.1007/s40745-019-00212-9.
11. Reyad H.M., Alizadeh M., Jamal F., Othman S., Hamedani G.G. The exponentiated generalized Topp Leone-G family of distributions: Properties and applications. Pak. J. Stat. Oper. Res. 2019;15:1–24. doi: 10.18187/pjsor.v15i1.2166.
12. Bantan R.A., Jamal F., Chesneau C., Elgarhy M. Truncated inverted Kumaraswamy generated family of distributions with applications. Entropy. 2019;21:1089. doi: 10.3390/e21111089.
13. Tahir M.H., Cordeiro G.M. Compounding of distributions: A survey and new generalized classes. J. Stat. Distribut. Appl. 2016;3:1–35. doi: 10.1186/s40488-016-0052-1.
14. Al-Shomrani A., Arif O., Shawky A., Hanif S., Shahbaz M.Q. Topp-Leone family of distributions: Some properties and application. Pak. J. Stat. Oper. Res. 2016;12:443–451. doi: 10.18187/pjsor.v12i3.1458.
15. Rezaei S., Sadr B.B., Alizadeh M., Nadarajah S. Topp-Leone generated family of distributions: Properties and applications. Commun. Stat. Theory Methods. 2016;46:2893–2909. doi: 10.1080/03610926.2015.1053935.
16. Mahdavi A. Generalized Topp-Leone family of distributions. J. Biostat. Epidemiol. 2017;3:65–75.
17. Elgarhy M., Nasir M.A., Jamal F., Ozel G. The type II Topp-Leone generated family of distributions: Properties and applications. J. Stat. Manag. Syst. 2018;21:1529–1551. doi: 10.1080/09720510.2018.1516725.
18. Hassan A.S., Elgarhy M., Ahmad Z. Type II generalized Topp-Leone family of distributions: Properties and applications. J. Data Sci. 2019;17:638–659.
19. Kumar D., Singh S.K., Singh U. Life time distributions: Derived from some minimum guarantee distribution. Sohag J. Math. 2017;4:7–11. doi: 10.18576/sjm/040102.
20. Alrajhi S. The odd Fréchet inverse exponential distribution with application. J. Nonlinear Sci. Appl. 2019;12:535–542. doi: 10.22436/jnsa.012.08.04.
21. Nasiru S. Extended odd Fréchet-G family of distributions. J. Probab. Stat. 2018;1:1–12. doi: 10.1155/2018/2931326.
22. Chesneau C., Djibrila S. The generalized odd inverted exponential-G family of distributions: Properties and applications. Eurasian Bullet. Math. 2019, in press.
23. Jorgensen B. Statistical Properties of the Generalized Inverse Gaussian Distribution. Springer; New York, NY, USA: 1982.
24. Mead M.E. A new generalization of Burr XII distribution. J. Stat. 2014;12:53–73.
25. Kleiber C. Lorenz ordering of order statistics from log-logistic and related distributions. J. Stat. Plan. Inference. 2004;120:13–19. doi: 10.1016/S0378-3758(02)00495-0.
26. Kleiber C., Kotz S. Statistical Size Distributions in Economics and Actuarial Sciences. John Wiley and Sons, Inc.; Hoboken, NJ, USA: 2003.
27. Singh S.K., Singh U., Kumar D. Bayes estimators of the reliability function and parameter of inverted exponential distribution using informative and non-informative priors. J. Stat. Comput. Simul. 2013;83:2258–2269. doi: 10.1080/00949655.2012.690156.
28. Galton F. Inquiries into Human Faculty and Its Development. Macmillan and Company; London, UK: 1883.
29. Moors J.J.A. A quantile alternative for kurtosis. J. R. Stat. Soc. 1988;37:25–32. doi: 10.2307/2348376.
30. Nair N.U., Sankaran P.G. Quantile-based reliability analysis. Commun. Stat. Theory Methods. 2009;38:222–232. doi: 10.1080/03610920802187430.
31. Gupta R.D., Kundu D. Exponentiated exponential family: An alternative to gamma and Weibull distributions. Biometric. J. 2001;43:117–130. doi: 10.1002/1521-4036(200102)43:1<117::AID-BIMJ117>3.0.CO;2-R.
32. Shannon C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948;27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x.
33. Casella G., Berger R.L. Statistical Inference. Brooks/Cole Publishing Company; Bel Air, CA, USA: 1990.
34. Rahman J., Aslam M., Ali S. Estimation and prediction of inverse Lomax model via Bayesian approach. Caspian J. Appl. Sci. Res. 2013;2:43–56.
35. Hassan A.S., Abd-Allah M. On the Inverse Power Lomax Distribution. Ann. Data Sci. 2019;6:259–278. doi: 10.1007/s40745-018-0183-y.
36. Hassan A.S., Ismail D.M. Parameter Estimation of Topp-Leone Inverse Lomax Distribution. J. Modern Appl. Stat. Methods. 2019, in press.
37. Hassan A.S., Mohamed R.E. Weibull Inverse Lomax Distribution. Pak. J. Stat. Oper. Res. 2019;15:587–603.
38. Sendov B. Hausdorff Approximations. Wolters Kluwer; Alphen aan den Rijn, The Netherlands: 1990.
39. Iliev A., Rahnev A., Kyurkchiev N., Markov S. A study on the unit-logistic, unit-Weibull and Topp-Leone cumulative sigmoids. Biomath. Commun. 2019;6:1–15. doi: 10.11145/bmc.2019.03.167.
40. Kyurkchiev N. Uniform approximation of the generalized cut function by Erlang cumulative distribution function. Application in applied insurance mathematics. Int. J. Theor. Appl. Math. 2016;2:40–44.
41. Kyurkchiev N. Mathematical Concepts in Insurance and Reinsurance: Some Moduli in Programming Environment MATHEMATICA. LAP LAMBERT Academic Publishing; Riga, Latvia: 2016.
42. Kyurkchiev N., Iliev A., Markov S. Some Techniques for Recurrence Generating of Activation Functions: Some Modeling and Approximation Aspects. LAP LAMBERT Academic Publishing; Riga, Latvia: 2017.