Abstract
The search for distributions with more desirable and flexible properties for modeling lifetime data has led to a concentrated effort to modify or generalize existing distributions. In this paper, we propose a new lifetime distribution, the power exponentiated Lindley (PEL) distribution, obtained by generalizing the Lindley distribution through the power exponentiated family of distributions. The main statistical properties of the proposed distribution are derived, including the survival function, hazard function, reverse hazard function, moments, quantile function, stochastic ordering, mean residual life (MRL), and order statistics. The parameters are estimated by the maximum likelihood estimation (MLE) method, and a Monte Carlo simulation study is used to check the consistency of the estimators in terms of mean squared error (MSE), root mean squared error (RMSE), and bias. Finally, we apply the PEL distribution as a statistical lifetime model to the COVID-19 case fatality ratio (in %) in China and India and to the new cases of COVID-19 reported in Delhi, and we check whether it fits these data sets better than existing well-known distributions. The log-likelihood value, Kolmogorov–Smirnov (K-S) statistic, Akaike information criterion (AIC), Bayesian information criterion (BIC), Hannan–Quinn information criterion (HQIC), and p-value are used to assess the fit. For the COVID-19 data sets considered, the proposed model fits better than its base model and other well-known related models.
Keywords: Lindley distribution, Maximum likelihood estimation, Monte Carlo simulation study, Power exponentiated family, Quantile function, Stochastic ordering
Introduction
At the time of writing, China is experiencing its most significant increase in COVID-19 cases because of a new variant of the coronavirus known as the 'stealth' sub-variant of Omicron, also known as BA.2. The first case of this variant in China was reported on 14 March 2022. This outbreak occurred while coronavirus infections were declining in several other nations and restrictions were being eased. According to the World Health Organization (WHO), the BA.2 variant is anticipated to have the same severity as the 'original version'. At the same time, cases in India are decreasing, restrictions have been removed, and all public places are open to the public. Vaccination is progressing in every country. Following the outbreak of COVID-19, many researchers have begun to investigate and use the virus data for various purposes. Liu et al. [18] proposed the arc-sine-modified Weibull distribution to model the mortality rates of COVID-19 cases in China. Pathak et al. [23] used the exponentiated exponential distribution as a suitable statistical lifetime model for Kerala COVID-19 patient data. Nagy et al. [22] introduced a discrete statistical model for the COVID-19 mortality numbers in Saudi Arabia and Latvia. Riad et al. [26] introduced a new lifetime distribution named the power Bilal (PB) distribution to model COVID-19 data and investigated its statistical properties. Ahsan-ul-Haq et al. [3] used different probability distributions to model COVID-19 cases in Pakistan. These are some of the recent studies based on COVID-19 data. In this study, we use COVID-19 data from Delhi, India, and China.
Once all distributional assumptions are satisfied, the accuracy of parametric inference and data modeling depends mostly on how well the chosen probability distribution fits the data. Many studies have therefore aimed to create probability distributions with more accurate and adaptable characteristics that can model real-life data sets of different kinds. The need for new distributions arises from theoretical concerns, practical applications, or both, and there has been significant development in generalizing well-known distributions and applying them in practice as competitors to other well-known distributions. The exponential distribution is well suited to describing lifetime data, for example for many types of manufactured items. Its main feature is that it models the behavior of items with a constant failure rate, and its straightforward mathematical structure makes it simple to manipulate. While discussing the sampling of the standard deviation (SD), Kondo [16] in 1930 referred to the exponential distribution as Pearson's Type X distribution. Later, Jowett [14] presented the exponential distribution and its applications in 1958. Because of its analytical simplicity, the exponential distribution was commonly employed to describe failure time data in the early days of reliability theory research. However, in most modern reliability engineering problems, the constant failure rate property of the exponential distribution is not necessarily appropriate. Hence, many generalized forms of this distribution have been introduced. Some of them are the generalized exponential distribution by Gupta and Kundu [13], the k-generalized exponential distribution proposed by Rather and Rather [25], and the extended exponentiated exponential distribution and its properties by Abu et al. [2].
Numerous lifetime distributions have also been proposed in the literature as alternatives to the exponential distribution for modeling lifetime data. Lindley [17] presented the Lindley distribution to demonstrate that Bayesian and fiducial distributions are not always identical. Later, Ghitany et al. [11] investigated the properties of the Lindley distribution and found that it can be a better alternative to the exponential distribution. If a continuous random variable X follows the Lindley distribution with parameter q, then its cumulative distribution function (cdf) is given by
$$F(x) = 1 - \left(1 + \frac{qx}{1+q}\right)e^{-qx}, \quad x > 0,\ q > 0 \tag{1}$$
and probability density function (pdf) as
$$f(x) = \frac{q^{2}}{1+q}\,(1+x)\,e^{-qx}, \quad x > 0,\ q > 0 \tag{2}$$
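As a quick numerical companion to the baseline model, the following Python sketch evaluates the Lindley cdf and pdf in their standard form, F(x) = 1 - (1 + qx/(1+q))e^{-qx} and f(x) = q^2/(1+q) (1+x) e^{-qx}. The function names `lindley_cdf` and `lindley_pdf` are illustrative choices, not names from the paper.

```python
import math

def lindley_pdf(x, q):
    """Lindley density with parameter q > 0, evaluated at x > 0."""
    return q * q / (1.0 + q) * (1.0 + x) * math.exp(-q * x)

def lindley_cdf(x, q):
    """Lindley distribution function with parameter q > 0, evaluated at x >= 0."""
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

# Sanity check: a central finite difference of the cdf should match the pdf.
q, x, h = 1.5, 0.8, 1e-6
numeric_pdf = (lindley_cdf(x + h, q) - lindley_cdf(x - h, q)) / (2 * h)
print(abs(numeric_pdf - lindley_pdf(x, q)) < 1e-6)
```

The finite-difference comparison is only a consistency check between the two formulas; it does not depend on any PEL-specific quantity.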
Ghitany et al. [10] also introduced the power Lindley distribution as a new generalization of the Lindley distribution. Later, Nadarajah et al. [21] introduced another generalization called the generalized Lindley distribution. Sakthivel et al. [27] introduced a two-parameter cubic rank transmutation of the Lindley distribution. Ashly and Rajitha [7] proposed the negative binomial improved second degree Lindley distribution. Ashour and Eltehiwy [8] proposed a three-parameter Lindley distribution known as the exponentiated power Lindley distribution, which has a wide range of applications in fields including survival analysis, reliability, and biology. The power XLindley (PXL) distribution, a two-parameter distribution that extends the XLindley distribution, was introduced by Meriem et al. [19]. Recently, the negative binomial Akash distribution and its applications were proposed by Rajitha and Ashly [24]. In addition, some heavy-tailed distributions have been developed by many researchers. Teamah et al. [29] introduced a relatively new heavy-tailed statistical model, called the alpha power exponentiated log-logistic distribution, using the alpha power transformation and the exponentiated log-logistic distribution.
Furthermore, researchers have shown a growing interest in developing new families of distributions to improve the accuracy of fitting complicated forms of data. The new families were established in a variety of ways, including the addition of extra location, scale, shape, and transmuted parameters. Some of the newly proposed families are the Chen-G family by Anzagra et al. [6], the sine Topp-Leone-G family by Al-Babtain et al. [4], the Poisson-G family by Abouelmagd et al. [1], and the odd Lomax trigonometric generalized family of distributions by Alshanbari et al. [5]. Recently, Klakattawi et al. [15] proposed the Marshall–Olkin Weibull generated family. In this study, we propose a new generalization of the Lindley distribution based on the power exponentiated family of distributions proposed by Modi [20]. The cdf and pdf of the power exponentiated family are given by
| 3 |
and
| 4 |
where G(x) and g(x) are the cdf and pdf, respectively, of any baseline distribution.
The main motivation of this article is to introduce a flexible new model capable of fitting different types of data. We also aim to show that the new model outperforms its competitors on the data sets considered and to recommend the proposed distribution as a strong candidate for modeling real data sets. When modeling a situation with a known distribution is challenging, generalization can be used to account for extra variation in the data. The problems we face evolve along with our world, so additional generalizations of probability distributions are required to capture more complex data. COVID-19 is a new problem faced by almost every country. We tried to model COVID-19 data using generalized forms of different distributions, but the results were not convincing, so we propose a new distribution for this purpose. The following considerations motivated this work:
- Although the PEL distribution is limited to characteristics that admit closed-form expressions, it is simple to implement.
- Its statistical characteristics can be defined explicitly in a straightforward way.
- The PEL distribution is suitable for modeling decreasing, increasing, and upside-down lifetime data. Its density can be symmetric or right-skewed and can take many different shapes.
- Its hazard rate can be increasing, upside-down, or increasing and then constant.
- The PEL distribution is adequate for skewed data that may not be adequately fitted by other distributions.
- It fits a large number of real-life data sets well and can be used in applied areas such as medicine, engineering, industrial reliability, COVID-19 analysis, and survival analysis.
- The PEL distribution has simple closed forms for both the cdf and the hazard rate function, so it is convenient for both censored and complete samples.
This study aims to develop a new distribution called the power exponentiated Lindley (PEL) distribution and to derive some of its characteristics. In Sect. 2, we introduce the pdf and cdf of the PEL distribution along with its mixture representation. In Sect. 3, some of its essential statistical properties are derived, such as the hazard rate, survival function, reverse hazard function, moments, moment generating function, mean residual life, mean past lifetime, stochastic ordering, and order statistics. In Sect. 4, we find the maximum likelihood estimates (MLEs) of the parameters of the PEL distribution using the log-likelihood function. Section 5 presents the Monte Carlo simulation study used to check the consistency of the estimators in terms of bias, mean squared error (MSE), and root mean squared error (RMSE). In Sect. 6, we discuss the application of the PEL distribution to the COVID-19 case fatality ratio (in %) in China and India and to the new cases of COVID-19 reported in Delhi, and we show that the PEL distribution fits these data better than other well-known existing distributions. In Sect. 7, the results of the simulation study and the real data application are discussed. Finally, Sect. 8 presents the conclusion.
The power exponentiated Lindley distribution
With the help of the power exponentiated family of distributions from Eqs. (3) and (4), we introduce a new generalization of the Lindley distribution called the PEL distribution. The cdf for the PEL distribution with and v as shape parameters and q as scale parameter can be obtained as:
| 5 |
By differentiating this cdf, the corresponding pdf is obtained as,
| 6 |
Expansion of the PEL distribution
Expansion of pdf and cdf of the PEL distribution is useful while deriving their properties. For this purpose, we use the following two lemmas:
Lemma 2.1.1
If is a positive real non-integer and 1, then from Gradshteyn et al. ([12], p. 25, Eq. (1.110)) we get the binomial series expansion as:
Lemma 2.1.2
If b and y are any real numbers, then from Gradshteyn et al. ([12], p. 26, Eq. (1.211.2)),
Using Lemma 2.1.1 and Lemma 2.1.2, expansion of cdf of the PEL distribution can be derived as,
| 7 |
By applying Lemma 2.1.2 in Eq. (6), we get the expansion of pdf of the PEL distribution as;
By binomial series expansion (Lemma 2.1.1), we derive the mixture representation of pdf of the PEL distribution as,
| 8 |
Figures 1 and 2 show the pdf and cdf, respectively, of the PEL distribution for different combinations of the parameter values of , v and q. The pdf plots show that as the values of all parameters increase, the peak of the graph decreases. All pdf plots are positively skewed. In the cdf plots, as the value of q increases, the curve reaches its maximum at smaller values of x, whereas as the values of the other two parameters (v and ) increase, the curve reaches its maximum at larger values of x.
Fig. 1.
Plots for pdf of the PEL distribution for distinct parameter values
Fig. 2.
Plots for cdf of the PEL distribution for distinct parameter values
Statistical properties of the PEL distribution
Various statistical properties of the PEL distribution, such as the hazard function, reverse hazard function, survival function, median, mode, moments, MGF, inverted moments, incomplete rth moments, MRL, stochastic ordering, quantile function, and order statistics, are discussed in this section.
Survival function, hazard function, and reverse hazard function
The survival function is a function that calculates the probability of a patient, device, or other object of interest surviving after a certain period. It is given by,
The survival function of the PEL distribution can be written as,
| 9 |
Survival function with expansion of cdf of the PEL distribution is,
| 10 |
The hazard function of a distribution, given by h(x) = f(x)/(1 − F(x)), is a useful measure for describing lifetime phenomena. It gives the instantaneous rate of failure at time x, conditional on survival up to x. For the PEL distribution, the hazard function can be written as,
| 11 |
where .
Using expansion of pdf and cdf of the PEL distribution, hazard function is:
| 12 |
where And reverse hazard rate function for the PEL distribution can be written as,
| 13 |
where .
When expansion of pdf and cdf is taken into consideration, reverse hazard function is,
| 14 |
where .
From the plot of the hazard function (Fig. 4), it is evident that the hazard rate of the PEL distribution increases with x and becomes constant beyond a certain value of x. Also, if is less than 1, the hazard rate decreases initially and then increases. The survival function and the reverse hazard function (Figs. 3 and 5) decrease as x increases.
Fig. 4.
Plots for hazard function of the PEL distribution for distinct parameter values
Fig. 3.
Plots for survival function of the PEL distribution for different parameter values
Fig. 5.
Plots for the reverse hazard rate function of the PEL distribution for distinct parameter values
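The three functions of this subsection are linked by S(x) = 1 − F(x), h(x) = f(x)/S(x), and r(x) = f(x)/F(x). The sketch below implements these relations generically in Python and, as an assumption made only for illustration, plugs in the baseline Lindley cdf and pdf rather than Eqs. (5) and (6). The helper names are hypothetical.

```python
import math

def lindley_pdf(x, q):
    return q * q / (1.0 + q) * (1.0 + x) * math.exp(-q * x)

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def survival(cdf, x, *params):
    """S(x) = 1 - F(x)."""
    return 1.0 - cdf(x, *params)

def hazard(pdf, cdf, x, *params):
    """h(x) = f(x) / S(x)."""
    return pdf(x, *params) / survival(cdf, x, *params)

def reverse_hazard(pdf, cdf, x, *params):
    """r(x) = f(x) / F(x)."""
    return pdf(x, *params) / cdf(x, *params)

q = 2.0
for x in (0.5, 1.0, 5.0):
    print(x,
          survival(lindley_cdf, x, q),
          hazard(lindley_pdf, lindley_cdf, x, q),
          reverse_hazard(lindley_pdf, lindley_cdf, x, q))
```

For the Lindley baseline the hazard has the closed form q^2(1+x)/(1+q+qx), which increases in x and levels off at q. Any other cdf/pdf pair with the same call signature can be passed to the same helpers.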
Median and mode
The median for a distribution is obtained by
Therefore, in the case of the PEL distribution, the median can be obtained as,
| 15 |
Thus, the median of the PEL distribution can be obtained by solving Eq. (15).
As we have the pdf of the PEL distribution with random variable X, then corresponding mode is obtained by equating . That is,
| 16 |
where . Thus, the mode of the PEL distribution can be obtained by solving Eq. (16).
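Because the median is defined only implicitly as the solution of F(M) = 1/2, it has to be found numerically. A minimal Python sketch of one way to do this is bisection on the cdf. The baseline Lindley cdf is used here as a stand-in, and `median_by_bisection` is a hypothetical helper name.

```python
import math

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def median_by_bisection(cdf, lo=0.0, hi=1.0, tol=1e-10):
    """Solve cdf(m) = 0.5 for a continuous, increasing cdf on [0, inf)."""
    # Expand the upper bracket until it lies above the median.
    while cdf(hi) < 0.5:
        hi *= 2.0
    # Halve the bracket until it is narrower than tol.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q = 1.0
m = median_by_bisection(lambda x: lindley_cdf(x, q))
print(m, lindley_cdf(m, q))
```

The same routine applies to any strictly increasing cdf on the positive half-line; only the function passed in changes.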
Moments
Moments are popularly used to describe the characteristic of a distribution. The rth moment of the random variable X can be written as,
The moment of the PEL distribution can be derived as,
The above integral does not have a closed-form antiderivative, so let
| 17 |
Then rth moment of the PEL distribution can be written as,
| 18 |
The individual moments are found by substituting r = 1, 2, 3, .... The first four raw moments are as follows:
where , ,
,
Mean
For the PEL distribution mean is obtained as,
| 19 |
Figure 6 shows the mean of the PEL distribution; the mean decreases as the parameter values increase.
Fig. 6.
Plots of mean for different parameter values
Variance
Variance, =
For the PEL distribution, the variance is obtained as,
| 20 |
Figure 7 shows that as values of parameters increase, variance decreases.
Fig. 7.
Plots for variance for different parameter values
Skewness
The skewness is a metric for the asymmetry of the probability distribution of a real-valued random variable with respect to its mean. It is defined as:
where,
So, the skewness of the PEL distribution is obtained as;
| 21 |
From Fig. 8, it is clear that the skewness decreases as the parameter values increase. All values are positive, so the distribution is positively skewed.
Fig. 8.
Plots of skewness for different parameter values
Kurtosis
The kurtosis is a statistical term used to characterize how much data clusters in the tails or the peak of a frequency distribution. It is defined as,
where,
For the PEL distribution, kurtosis can be obtained as,
| 22 |
Fig. 9.
Plots of kurtosis for different parameter values
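When the moment integrals have no closed form, the mean, variance, skewness, and kurtosis can be obtained by numerical integration of x^r f(x). The Python sketch below does this with a composite Simpson rule, using the baseline Lindley pdf as a stand-in. It assumes the moment-based coefficients μ3/μ2^{3/2} for skewness and μ4/μ2^2 for kurtosis, which may differ from the exact convention used in this section.

```python
import math

def lindley_pdf(x, q):
    return q * q / (1.0 + q) * (1.0 + x) * math.exp(-q * x)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def raw_moment(r, q, upper=None):
    """E[X^r] integrated over [0, upper]; the tail beyond upper is negligible."""
    upper = 60.0 / q if upper is None else upper
    return simpson(lambda x: x ** r * lindley_pdf(x, q), 0.0, upper)

def summary(q):
    m1, m2, m3, m4 = (raw_moment(r, q) for r in (1, 2, 3, 4))
    # Central moments expressed through raw moments.
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return {"mean": m1, "variance": mu2,
            "skewness": mu3 / mu2 ** 1.5, "kurtosis": mu4 / mu2 ** 2}

stats = summary(2.0)
print(stats)
```

For the Lindley baseline the raw moments are known in closed form, E[X^r] = r!(q+r+1)/(q^r(q+1)), which gives a direct check of the numerical values.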
Moment generating function
The moment generating function (MGF) is a single function from which all the moments of a random variable can be recovered. When it exists in a neighborhood of zero, the MGF uniquely determines the probability distribution. The MGF of a random variable X is given by,
For the PEL distribution, MGF is obtained as;
| 23 |
Inverted moments
The rth inverted moment of a distribution is defined by
For the PEL distribution, using Eq. (17) inverted moment is obtained as,
| 24 |
Incomplete moment
The incomplete rth moment is defined by
For the PEL distribution, incomplete rth moment is obtained by;
| 25 |
Mean residual life function
The mean residual life (MRL) function is important in reliability and survival analysis. It gives the expected remaining lifetime of a system that has survived up to time x. The MRL function of a lifetime random variable X is given as,
where, s(x) is the survival rate function and f(x) is pdf of the distribution.
For the PEL distribution, the MRL function is obtained as,
| 26 |
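The MRL can also be evaluated numerically from the survival function through m(x) = (1/S(x)) ∫_x^∞ S(t) dt. The Python sketch below applies this to the baseline Lindley distribution, which is used here only as a stand-in because its MRL has the known closed form (2 + q + qx)/(q(1 + q + qx)). The function names and the truncation width of the integral are assumptions of the sketch.

```python
import math

def lindley_survival(x, q):
    """S(x) = (1 + q x / (1 + q)) exp(-q x) for the Lindley distribution."""
    return (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def mean_residual_life(x, q, width=None):
    """Numerical MRL: integral of S over [x, x + width] divided by S(x)."""
    width = 60.0 / q if width is None else width
    integral = simpson(lambda t: lindley_survival(t, q), x, x + width)
    return integral / lindley_survival(x, q)

def lindley_mrl_closed_form(x, q):
    return (2.0 + q + q * x) / (q * (1.0 + q + q * x))

mrl_numeric = mean_residual_life(1.0, 1.5)
mrl_exact = lindley_mrl_closed_form(1.0, 1.5)
print(mrl_numeric, mrl_exact)
```

At x = 0 the MRL reduces to the mean of the distribution, which gives a second easy check.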
Order statistics
David [9] gave the pdf of the kth order statistic, where k = 1, 2, ..., n, as
where,
Theorem 3.8.1
Let be a simple random sample from a PEL distribution, with cdf and pdf, respectively, provided by (5) and (6). The order statistics acquired from this sample be, . Then order statistic of the PEL distribution is given by
| 27 |
where, and
Proof
When F(x) and f(x) are from Eq. (5) and (6) respectively, and by applying Lemma 2.1.1, we get
Hence kth order statistic of the PEL distribution is obtained as,
| 28 |
where, .
Corresponding 1st and nth order statistics of the PEL distribution are:
| 29 |
| 30 |
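The general order-statistic density, f_{k:n}(x) = n!/((k−1)!(n−k)!) F(x)^{k−1} (1 − F(x))^{n−k} f(x), is easy to evaluate for any cdf/pdf pair. The Python sketch below uses the baseline Lindley distribution as a stand-in for Eqs. (5) and (6). `order_stat_pdf` is a hypothetical helper name.

```python
import math

def lindley_pdf(x, q):
    return q * q / (1.0 + q) * (1.0 + x) * math.exp(-q * x)

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def order_stat_pdf(x, k, n, pdf, cdf):
    """Density of the kth order statistic in a sample of size n."""
    F = cdf(x)
    coeff = k * math.comb(n, k)  # equals n! / ((k-1)! (n-k)!)
    return coeff * F ** (k - 1) * (1.0 - F) ** (n - k) * pdf(x)

q, n, x = 1.0, 5, 1.2
pdf = lambda t: lindley_pdf(t, q)
cdf = lambda t: lindley_cdf(t, q)

# The n order-statistic densities sum to n * f(x) at every x.
total = sum(order_stat_pdf(x, k, n, pdf, cdf) for k in range(1, n + 1))
print(total, n * pdf(x))
```

Setting k = 1 and k = n gives the densities of the sample minimum and maximum, the two special cases singled out above.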
Quantile function
Theorem 3.9.1
Let X be a random variable following the PEL distribution. Then the quantile function of X is given by
| 31 |
Proof
To find the quantile function for the PEL distribution, we equate where .
Taking logarithm on both sides
Multiplying,
We can see from the equation above that the Lambert W function of the real argument is . Then we have;
| 32 |
Moreover, it is clear that for each and , is greater than 0 and it may be verified that , since . In account of the characteristics of negative branch of the Lambert W function, Eq. (32) becomes
| 33 |
Therefore, the quantile function of the PEL distribution is as stated in Eq. (31).
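The role of the negative branch of the Lambert W function can be illustrated on the baseline Lindley distribution, whose quantile function is known to be Q(u) = −1 − 1/q − (1/q) W_{−1}(−(1+q)(1−u)e^{−(1+q)}). The Python sketch below is an illustration of the technique only and is not Eq. (31). It computes W_{−1} by bisection, an implementation choice made here to avoid external libraries.

```python
import math

def lambert_w_minus1(z, tol=1e-12):
    """Branch W_{-1}: the solution w <= -1 of w * exp(w) = z, for -1/e < z < 0."""
    if not (-1.0 / math.e < z < 0.0):
        raise ValueError("z must lie in (-1/e, 0)")
    hi = -1.0   # w * exp(w) = -1/e < z at w = -1
    lo = -2.0
    # w * exp(w) is decreasing on (-inf, -1]; move lo left until it brackets z.
    while lo * math.exp(lo) < z:
        lo *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) > z:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def lindley_quantile(u, q):
    """Quantile of the Lindley distribution for 0 < u < 1."""
    z = -(1.0 + q) * (1.0 - u) * math.exp(-(1.0 + q))
    return -1.0 - 1.0 / q - lambert_w_minus1(z) / q

for u in (0.1, 0.5, 0.9):
    x = lindley_quantile(u, 2.0)
    print(u, x, lindley_cdf(x, 2.0))
```

The negative branch is the relevant one because the argument of W always lies in (−1/e, 0) and the required solution is below −1, which mirrors the reasoning used in the proof above.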
Stochastic ordering
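In many practical problems, it is necessary to compare two lifetime distributions with respect to some of their characteristics, and stochastic ordering plays an important role in such comparisons. A random variable Y is said to be greater than X in the:
- (i) Stochastic order ($X \le_{st} Y$) if $F_X(x) \ge F_Y(x)$ for all x
- (ii) Hazard rate order ($X \le_{hr} Y$) if $h_X(x) \ge h_Y(x)$ for all x
- (iii) Mean residual life order ($X \le_{mrl} Y$) if $m_X(x) \le m_Y(x)$ for all x
- (iv) Likelihood ratio order ($X \le_{lr} Y$) if $f_X(x)/f_Y(x)$ is decreasing in x
The following relationships hold among the above-mentioned orderings:
$X \le_{lr} Y \Rightarrow X \le_{hr} Y \Rightarrow X \le_{mrl} Y$
and
$X \le_{hr} Y \Rightarrow X \le_{st} Y$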
Theorem 3.10.1
Let and if and , we have then and .
Proof
To prove decreasing in x we have to show that the derivative of is less than 0.
Taking ln (log to base e) on both sides:
Differentiating both sides with respect to x, we get
| 34 |
Here, we are using graph to show that all values of derivative of (Eq. (34)) are less than 0, when , and .
From Fig. 10, it is clear that the derivative of the log of the ratio of the pdfs is negative for all x. So we can conclude that is decreasing in x. Hence, we have shown , so we can say that and when Y and X follow the PEL distribution.
Fig. 10.

Plot of the derivative of ratio of pdfs
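The graphical check of the sign of the derivative can be mirrored numerically by evaluating the derivative of the log of the pdf ratio on a grid. The Python sketch below does this for two baseline Lindley densities with q_x > q_y, chosen as a stand-in because the log ratio is then exactly linear in x with slope −(q_x − q_y). The parameter values and helper names are assumptions of the sketch, and a grid check supports monotonicity only on the range examined.

```python
import math

def lindley_pdf(x, q):
    return q * q / (1.0 + q) * (1.0 + x) * math.exp(-q * x)

def log_ratio(x, q_x, q_y):
    """log of f_X(x) / f_Y(x) for two Lindley densities."""
    return math.log(lindley_pdf(x, q_x)) - math.log(lindley_pdf(x, q_y))

def derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

q_x, q_y = 3.0, 2.0
grid = [0.1 * i for i in range(1, 101)]
slopes = [derivative(lambda t: log_ratio(t, q_x, q_y), x) for x in grid]

# All slopes negative means the ratio is decreasing on the grid.
print(max(slopes), all(s < 0.0 for s in slopes))
```

A decreasing ratio on the grid is the numerical counterpart of the likelihood ratio ordering condition in (iv).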
Maximum likelihood estimation method
To find the maximum likelihood estimates (MLEs) of the parameters , v and q of the PEL distribution, we proceed as follows:
The likelihood function of the PEL distribution is obtained as,
The log-likelihood function is,
The derivative of the log-likelihood function with respect to v can be written as:
| 35 |
The derivative of the log-likelihood function with respect to can be written as:
| 36 |
The derivative of the log-likelihood function with respect to q can be written as:
| 37 |
The unknown parameters can be estimated by the MLE method by setting the non-linear Eqs. (35)–(37) equal to zero and solving them simultaneously. Because of the complexity of these equations, numerical methods such as the Newton–Raphson method are used to solve them with the help of R software.
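The Newton–Raphson idea can be sketched on the one-parameter baseline Lindley model, where the score equation is 2n/q − n/(1+q) − Σx_i = 0 and a closed-form root is available for comparison. The Python code below is an illustration under that simplification and does not reproduce the three-parameter system of Eqs. (35)–(37) or the authors' R code. The data are Data set 1 from the real data analysis section.

```python
import math

# Data set 1: COVID-19 case fatality ratio (in %) in China.
data = [1.09, 1.00, 1.08, 1.12, 1.50, 1.60, 1.77, 1.81, 2.07, 1.75,
        2.58, 2.59, 2.65, 3.09, 3.20, 3.47, 3.21, 2.77, 3.17, 2.65,
        3.00, 3.61, 3.08, 2.70, 2.41]

def lindley_score(q, xs):
    """First derivative of the Lindley log-likelihood with respect to q."""
    n = len(xs)
    return 2.0 * n / q - n / (1.0 + q) - sum(xs)

def lindley_score_derivative(q, xs):
    """Second derivative of the Lindley log-likelihood with respect to q."""
    n = len(xs)
    return -2.0 * n / q ** 2 + n / (1.0 + q) ** 2

def newton_raphson_mle(xs, q0=None, tol=1e-12, max_iter=100):
    # Start from the exponential MLE 1 / mean unless a start value is given.
    q = len(xs) / sum(xs) if q0 is None else q0
    for _ in range(max_iter):
        step = lindley_score(q, xs) / lindley_score_derivative(q, xs)
        q -= step
        if abs(step) < tol:
            break
    return q

def closed_form_mle(xs):
    """Positive root of mean*q^2 + (mean - 1)*q - 2 = 0."""
    m = sum(xs) / len(xs)
    return (-(m - 1.0) + math.sqrt((m - 1.0) ** 2 + 8.0 * m)) / (2.0 * m)

q_hat = newton_raphson_mle(data)
print(q_hat, closed_form_mle(data))
```

The resulting estimate is close to the Lindley estimate reported in Table 3. For the PEL model the same iteration would use the gradient and Hessian of the three-parameter log-likelihood instead of these scalar derivatives.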
Table 2.
Summary statistics of case fatality ratio (in %) of COVID-19 in China
| Minimum | First quartile | Median | Mean | Third quartile | Maximum |
|---|---|---|---|---|---|
| 1.000 | 1.750 | 2.580 | 2.358 | 3.080 | 3.610 |
Simulation study
This section presents a Monte Carlo simulation study to assess the performance of the MLEs of the parameters of the PEL distribution. Consistency of the estimators is measured in terms of MSE, RMSE, and bias. The simulation was repeated 1000 times for sample sizes n = 10, 40, 70, and 100 and 7 distinct parameter combinations. The steps of the simulation study are as follows:
- Step 1: Random number generation from the PEL distribution
  - (i) A random variable is generated from the uniform distribution U(0, 1).
  - (ii) The generated value is then fed into the quantile function to produce a random number from the PEL distribution.
  - (iii) Steps (i) and (ii) are repeated to obtain observations of the desired sample size n.
- Step 2: For each sample, the MLEs are computed.
- Step 3: Steps 1 and 2 are repeated 1000 times.
- Step 4: Then, the MLEs and the corresponding MSEs, RMSEs, and biases are calculated using the equations,
where observed values of the parameter, predicted MLEs of parameter, n = number of data points
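The four steps can be sketched in Python as follows. Several simplifications are assumed and should be read as such: the baseline Lindley distribution stands in for the PEL distribution, its cdf is inverted by bisection in place of a closed-form quantile function, only 200 replications are used to keep the run short, and the usual definitions bias = mean(estimate) − true value, MSE = mean((estimate − true value)^2), and RMSE = sqrt(MSE) are taken for the accuracy measures.

```python
import math
import random

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def lindley_quantile(u, q, tol=1e-10):
    """Step 1(ii): invert the cdf by bisection."""
    lo, hi = 0.0, 1.0
    while lindley_cdf(hi, q) < u:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lindley_cdf(mid, q) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lindley_mle(xs):
    """Step 2: closed-form MLE of q for the Lindley distribution."""
    m = sum(xs) / len(xs)
    return (-(m - 1.0) + math.sqrt((m - 1.0) ** 2 + 8.0 * m)) / (2.0 * m)

def simulate(q_true, n, reps, rng):
    estimates = []
    for _ in range(reps):  # Step 3: repeat sampling and estimation
        sample = [lindley_quantile(rng.random(), q_true) for _ in range(n)]
        estimates.append(lindley_mle(sample))
    # Step 4: accuracy measures of the estimator
    bias = sum(estimates) / reps - q_true
    mse = sum((e - q_true) ** 2 for e in estimates) / reps
    return bias, mse, math.sqrt(mse)

rng = random.Random(2022)
results = {n: simulate(2.0, n, 200, rng) for n in (10, 40, 70, 100)}
for n, (bias, mse, rmse) in results.items():
    print(n, round(bias, 4), round(mse, 4), round(rmse, 4))
```

Replacing `lindley_quantile` and `lindley_mle` with the PEL quantile function and a numerical maximizer of the PEL log-likelihood would give the corresponding study for the proposed model.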
Table 3.
Parameter estimation of different distributions for case fatality ratio (in %) of COVID-19 in China
| Distribution | Parameter estimates |
|---|---|
| PEL | q = 1.684938 |
| | v = 19.384176 |
| | = 5.615004 |
| PEE | p = 1.592209 |
| | v = 34.266258 |
| | = 9.256132 |
| EPL | = 1.326197 |
| | = 8.830552 |
| EG | = 2.103211 |
| L | = 0.6768592 |
| E | = 0.4240158 |
Numerical results for the seven parameter combinations considered for the PEL distribution are given in Table 1.
Table 1.
Result of simulation study with different parameter values
| Parameter setting | Sample size | Parameter | MLE values | MSE | RMSE | Bias |
|---|---|---|---|---|---|---|
| (q = 2, v = 6, = 7) | 10 | q | 2.229891 | 0.05284965 | 0.229891 | 0.229891 |
| | | v | 6.44686 | 0.1996834 | 0.4468595 | 0.4468595 |
| | | | 8.257543 | 1.581416 | 1.257543 | 1.257543 |
| | 40 | q | 2.078754 | 0.006202 | 0.078754 | 0.078754 |
| | | v | 6.438407 | 0.192200 | 0.438407 | -0.438407 |
| | | | 8.115525 | 1.244396 | 1.115525 | 1.115525 |
| | 70 | q | 2.096669 | 0.009344812 | 0.09666857 | 0.09666857 |
| | | v | 6.35277 | 0.124446 | 0.35277 | 0.35277 |
| | | | 7.969238 | 0.939422 | 0.969238 | 0.969238 |
| | 100 | q | 2.084044 | 0.007063391 | 0.08404398 | 0.08404398 |
| | | v | 6.327033 | 0.1069504 | 0.3270328 | 0.3270328 |
| | | | 7.5781 | 0.3341996 | 0.5781 | 0.5781 |
| (q = 3, v = 5, = 5) | 10 | q | 3.301988 | 0.09119683 | 0.3019881 | 0.3019881 |
| | | v | 4.380859 | 0.3833354 | 0.6191409 | 0.6191409 |
| | | | 6.326143 | 1.758655 | 1.326143 | 1.326143 |
| | 40 | q | 3.252458 | 0.063735 | 0.252458 | 0.252458 |
| | | v | 4.56723 | 0.187289 | 0.43277 | 0.43277 |
| | | | 6.09412 | 1.197098 | 1.09412 | 1.09412 |
| | 70 | q | 3.039756 | 0.001580571 | 0.03975639 | 0.03975639 |
| | | v | 4.79342 | 0.042304 | 0.20568 | 0.20568 |
| | | | 5.8346 | 0.6965566 | 0.8345997 | 0.8345997 |
| | 100 | q | 3.068626 | 0.004709587 | 0.06862643 | 0.06862643 |
| | | v | 4.922712 | 0.005973372 | 0.07728759 | 0.07728759 |
| | | | 5.167415 | 0.02802767 | 0.1674147 | 0.1674147 |
| (q = 2, v = 4, = 5) | 10 | q | 2.362413 | 0.1313435 | 0.3624134 | 0.3624134 |
| | | v | 5.448296 | 2.097562 | 1.448296 | 1.448296 |
| | | | 6.472191 | 2.167348 | 1.472191 | 1.472191 |
| | 40 | q | 2.366537 | 0.1343496 | 0.3665373 | 0.3665373 |
| | | v | 5.006866 | 1.013779 | 1.006866 | 1.006866 |
| | | | 6.131707 | 1.280761 | 1.131707 | 1.131707 |
| | 70 | q | 2.021464 | 0.0004607113 | 0.02146419 | 0.02146419 |
| | | v | 4.50431 | 0.25432 | 0.50431 | 0.50431 |
| | | | 5.162059 | 0.026263 | 0.162059 | 0.162059 |
| | 100 | q | 2.01071 | 0.0001146987 | 0.01070975 | 0.01070975 |
| | | v | 3.900007 | 0.009998551 | 0.09999275 | 0.09999275 |
| | | | 5.008271 | 0.00006840 | 0.008270939 | 0.008270939 |
| (q = 2, v = 6, = 5) | 10 | q | 2.143071 | 0.02046922 | 0.1430707 | 0.1430707 |
| | | v | 4.655974 | 1.806405 | 1.344026 | 1.344026 |
| | | | 5.881638 | 0.7772855 | 0.881638 | 0.881638 |
| | 40 | q | 2.047812 | 0.002286014 | 0.04781228 | 0.04781228 |
| | | v | 4.690861 | 1.713845 | 1.309139 | 1.309139 |
| | | | 5.499703 | 0.249703 | 0.4997029 | 0.4997029 |
| | 70 | q | 2.082811 | 0.006857659 | 0.08281098 | 0.08281098 |
| | | v | 5.25277 | 0.55835 | 0.74723 | 0.74723 |
| | | | 5.369238 | 0.136336 | 0.369238 | -0.369238 |
| | 100 | q | 2.074615 | 0.005567403 | 0.07461503 | 0.07461503 |
| | | v | 5.47779 | 0.2727033 | 0.52221 | 0.52221 |
| | | | 5.309179 | 0.0955919 | 0.3091794 | 0.3091794 |
| (q = 2, v = 5, = 4) | 10 | q | 2.460609 | 0.2121606 | 0.460609 | 0.460609 |
| | | v | 6.049118 | 1.100648 | 1.049118 | 1.049118 |
| | | | 6.192997 | 4.809234 | 2.192997 | 2.192997 |
| | 40 | q | 2.435673 | 0.1898112 | 0.4356732 | 0.4356732 |
| | | v | 6.056851 | 1.116934 | 1.056851 | 1.056851 |
| | | | 6.18177 | 4.760122 | 2.18177 | 2.18177 |
| | 70 | q | 2.108643 | 0.01180321 | 0.1086426 | 0.1086426 |
| | | v | 5.799162 | 0.63865 | 0.799162 | 0.799162 |
| | | | 5.65016 | 2.723027 | 1.65016 | 1.65016 |
| | 100 | q | 2.000951 | 9.052476e-07 | 0.000951445 | 0.000951445 |
| | | v | 5.3867261 | 0.1495571 | 0.3867261 | 0.3867261 |
| | | | 4.436273 | 0.1903345 | 0.4362734 | -0.4362734 |
| (q = 2, v = 5, = 5) | 10 | q | 2.299243 | 0.08954612 | 0.2992426 | 0.2992426 |
| | | v | 5.517957 | 0.2682797 | 0.5179573 | 0.5179573 |
| | | | 6.257853 | 1.257853 | 1.18177 | 1.257853 |
| | 40 | q | 2.150361 | 0.02260835 | 0.1503607 | 0.1503607 |
| | | v | 4.50711 | 0.2429405 | 0.4928899 | 0.4928899 |
| | | | 5.567977 | 0.3225978 | 0.567977 | 0.567977 |
| | 70 | q | 2.087306 | 0.007622 | 0.087306 | 0.087306 |
| | | v | 4.51032 | 0.239786 | 0.48968 | 0.48968 |
| | | | 5.563583 | 0.317625 | 0.563583 | 0.563583 |
| | 100 | q | 2.055663 | 0.003098397 | 0.05566325 | 0.05566325 |
| | | v | 4.518508 | 0.2318341 | 0.4814916 | 0.4814916 |
| | | | 5.503536 | 0.2535486 | 0.5035361 | 0.5035361 |
| (q = 1, v = 5, = 5) | 10 | q | 1.104757 | 0.01097399 | 0.1047568 | 0.1047568 |
| | | v | 5.454657 | 0.206712 | 0.454657 | 0.454657 |
| | | | 5.674442 | 0.4548726 | 0.6744424 | 0.6744424 |
| | 40 | q | 0.9401529 | 0.003581671 | 0.05984706 | 0.05984706 |
| | | v | 5.40569 | 0.16458 | 0.40569 | 0.40569 |
| | | | 5.529449 | 0.280316 | 0.529449 | 0.529449 |
| | 70 | q | 1.036275 | 0.001315878 | 0.03627504 | 0.03627504 |
| | | v | 4.550159 | 0.2023571 | 0.4498411 | 0.4498411 |
| | | | 5.441113 | 0.1945803 | 0.4411126 | 0.4411126 |
| | 100 | q | 0.9874188 | 0.0001582864 | 0.01258119 | 0.01258119 |
| | | v | 4.94588 | 0.002928959 | 0.05411985 | 0.05411985 |
| | | | 4.740304 | 0.06744196 | 0.2596959 | 0.2596959 |
As the sample size n increases, Table 1 shows that the estimated parameter values tend to approach the true parameter values and that the MSE, RMSE, and bias generally decrease. This indicates that the MLEs become more precise and behave consistently as the sample size increases. Thus, we conclude that the MLE approach is effective for estimating the parameters of the PEL distribution.
Real data analysis
In this section, the performance of the proposed PEL distribution is evaluated on three real data sets by comparing it with certain well-known existing distributions. To assess the performance of the distributions considered, we calculate the log-likelihood value, the K-S statistic (D) and its p-value, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan–Quinn information criterion (HQIC). The best distribution is the one with the highest log-likelihood value and p-value and the lowest AIC, BIC, and HQIC values. The distributions used for comparison are the power exponentiated exponential (PEE) distribution proposed by Modi [20], the exponentiated power Lindley (EPL) distribution proposed by Ashour and Eltehiwy [8], the exponentiated gamma (EG) distribution proposed by Shawky and Bakoban [28], the Lindley (L) distribution proposed by Lindley [17], and the exponential (E) distribution by Kondo [16]. The pdfs of these distributions are as follows:
Table 5.
Summary statistics of case fatality ratio (in %) of COVID-19 in India
| Minimum | First quartile | Median | Mean | Third quartile | Maximum |
|---|---|---|---|---|---|
| 1.014 | 1.717 | 2.328 | 2.723 | 3.162 | 10.480 |
Table 6.
Parameter estimation of different distributions for case fatality ratio (in %) of COVID-19 in India
| Distribution | Parameter estimates |
|---|---|
| PEL | q = 1.225578 |
| | v = 2.532038 |
| | = 3.797981 |
| PEE | p = 1.147063 |
| | v = 5.743843 |
| | = 6.642685 |
| EPL | = 1.670529 |
| | = 16.820413 |
| EG | = 1.531871 |
| L | = 0.639152 |
| E | = 0.3672219 |
- PEE:
- EPL:
- EG:
- L:
- E:
Data set 1: The data represent the COVID-19 case fatality ratio (in %) in China from 8 March to 1 April 2022, during the outbreak of the new variant of COVID-19 (stealth Omicron). The data were collected from the official site of the World Health Organization (WHO) [https://covid19.who.int/]. The data are as follows:
1.09, 1.00, 1.08, 1.12, 1.50, 1.60, 1.77, 1.81, 2.07, 1.75, 2.58, 2.59, 2.65, 3.09, 3.20, 3.47, 3.21, 2.77, 3.17, 2.65, 3.00, 3.61, 3.08, 2.70, 2.41.
From Table 4, the PEL distribution has the highest p-value and log-likelihood value and the lowest AIC, BIC, and HQIC values among the distributions compared. So, the PEL distribution provides a better fit for the above data set of the case fatality ratio (in %) in China due to COVID-19.
Table 4.
Fitting of case fatality ratio (in %) for COVID-19 in China
| Distribution | Negative log-likelihood | D | AIC | BIC | HQIC | p-value |
|---|---|---|---|---|---|---|
| PEL | 31.39583 | 0.13715 | 68.79167 | 72.44829 | 69.80586 | 0.7348 |
| PEE | 31.89439 | 0.15297 | 69.78878 | 73.44541 | 70.80297 | 0.6022 |
| EG | 34.11301 | 0.30605 | 72.22602 | 74.66378 | 72.90215 | 0.01849 |
| EPL | 38.63597 | 0.40619 | 83.27195 | 86.92857 | 84.28614 | 0.05227 |
| L | 42.84234 | 0.28665 | 87.68468 | 88.90356 | 88.02274 | 0.03287 |
| E | 46.44959 | 0.34559 | 94.89917 | 96.11805 | 95.23724 | 0.005101 |
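The information criteria in Table 4 follow from the maximized log-likelihood ℓ, the number of parameters k, and the sample size n through AIC = 2k − 2ℓ, BIC = k ln n − 2ℓ, and HQIC = 2k ln(ln n) − 2ℓ. The Python sketch below recomputes the PEL row under the reading that the likelihood column of Table 4 reports the magnitude of ℓ, so that ℓ = −31.39583 with k = 3 and n = 25. That reading is an inference from the arithmetic, and the function name is illustrative.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, HQIC) for log-likelihood loglik, k parameters, n observations."""
    aic = 2 * k - 2 * loglik
    bic = k * math.log(n) - 2 * loglik
    hqic = 2 * k * math.log(math.log(n)) - 2 * loglik
    return aic, bic, hqic

# PEL row of Table 4: three parameters, 25 observations.
aic, bic, hqic = information_criteria(-31.39583, k=3, n=25)
print(round(aic, 5), round(bic, 5), round(hqic, 5))
```

The same function reproduces the other rows when k is set to the number of parameters of the corresponding model, for example k = 1 for the Lindley and exponential distributions.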
From Fig. 11, it is evident that the PEL distribution fits the histogram of the data better than the other distributions. The PEE distribution also fits well, but not as well as the PEL distribution. The total time on test (TTT) plot is concave and monotonically increasing, which indicates an increasing hazard rate.
Fig. 11.
Graphical representation of each distribution for case fatality ratio (in %) of China
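The K-S statistic D reported in the fitting tables is the largest distance between the empirical cdf of the data and the fitted cdf. The Python sketch below computes it for Data set 1 under the one-parameter Lindley model, which is used only because its MLE has a closed form. The helper names are hypothetical.

```python
import math

# Data set 1: COVID-19 case fatality ratio (in %) in China.
data = [1.09, 1.00, 1.08, 1.12, 1.50, 1.60, 1.77, 1.81, 2.07, 1.75,
        2.58, 2.59, 2.65, 3.09, 3.20, 3.47, 3.21, 2.77, 3.17, 2.65,
        3.00, 3.61, 3.08, 2.70, 2.41]

def lindley_cdf(x, q):
    return 1.0 - (1.0 + q * x / (1.0 + q)) * math.exp(-q * x)

def lindley_mle(xs):
    """Closed-form MLE of q for the Lindley distribution."""
    m = sum(xs) / len(xs)
    return (-(m - 1.0) + math.sqrt((m - 1.0) ** 2 + 8.0 * m)) / (2.0 * m)

def ks_statistic(xs, cdf):
    """D = max over i of max(i/n - F(x_(i)), F(x_(i)) - (i-1)/n)."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

q_hat = lindley_mle(data)
d = ks_statistic(data, lambda x: lindley_cdf(x, q_hat))
print(round(q_hat, 4), round(d, 4))
```

The value obtained this way is close to the D reported for the Lindley model in Table 4. The same `ks_statistic` function applies to any fitted cdf, including the PEL cdf once its parameters are estimated.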
Data set 2: The data represent the COVID-19 case fatality ratio (in %) in India from 1 February to 1 April 2022. COVID-19 cases in India started decreasing from February 2022. The data were collected from the official site of the World Health Organization (WHO) [https://covid19.who.int/]. The data are as follows:
1.067, 1.757, 1.705, 1.849, 1.131, 1.595, 1.524, 1.014, 1.266, 1.678, 1.758, 1.898, 1.459, 3.370, 1.283, 1.753, 1.840, 2.134, 2.293, 2.217, 2.365, 1.485, 2.603, 2.952, 2.164, 3.142, 4.880, 2.885, 1.513, 2.704, 3.169, 2.485, 6.080, 2.462, 1.508, 1.078, 3.777, 3.407, 2.363, 5.893, 3.421, 7.211, 2.001, 2.087, 3.487, 3.457, 4.925, 2.469, 10.48, 2.514, 2.779, 2.514, 2.285, 3.895.
Compared with the other distributions in Table 7, the PEL distribution has the highest p-value and log-likelihood and the lowest K-S statistic (D), AIC, BIC, and HQIC values. The PEL distribution therefore provides the best fit to the data set of the case fatality ratio (in %) in India due to COVID-19.
Table 7.
Fitting of case fatality ratio (in %) for COVID-19 in India
| Distribution | Negative log-likelihood | K-S statistic (D) | AIC | BIC | HQIC | p-value |
|---|---|---|---|---|---|---|
| PEL | 88.90882 | 0.095222 | 183.8176 | 189.7846 | 186.1189 | 0.7117 |
| PEE | 89.20371 | 0.14562 | 184.4074 | 190.3744 | 186.7086 | 0.2023 |
| EG | 91.46156 | 0.10876 | 186.9231 | 190.9011 | 188.4573 | 0.5454 |
| L | 101.9173 | 0.27143 | 205.8345 | 207.8235 | 206.6016 | 0.0007005 |
| EPL | 102.6339 | 0.36951 | 211.2678 | 217.2347 | 213.569 | 7.887e-07 |
| E | 108.0896 | 0.3109 | 218.1792 | 220.1682 | 218.9463 | 5.854e-05 |
From Fig. 12, it is clear that the PEL distribution fits the data better than the other distributions. The EG distribution also fits well, but not as well as the PEL distribution. The TTT plot is concave and monotonically increasing, which indicates an increasing hazard rate.
Fig. 12.
Graphical representation of each distribution for case fatality ratio (in %) of COVID-19 in India
Data set 3: The data represent the number of new COVID-19 cases reported in Delhi, the capital city of India, from 11th March 2022 to 11th April 2022. The data were collected from the JHU CSSE COVID-19 data repository (https://github.com/CSSEGISandData/COVID-19). The data are as follows:
174, 161, 132, 136, 131, 144, 148, 140, 127, 122, 108, 104, 132, 111, 112, 120, 71, 90, 95, 123, 113, 131, 114, 85, 82, 112, 126, 176, 146, 160, 141, 137
Table 8.
Summary statistics of new cases of COVID-19 in Delhi
| Minimum | First quartile | Median | Mean | Third quartile | Maximum |
|---|---|---|---|---|---|
| 71.0 | 111.8 | 126.5 | 125.1 | 140.2 | 176.0 |
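The entries of Table 8 can be recomputed directly from the listed counts. The short sketch below uses only the Python standard library and assumes that the quartiles are obtained by linear interpolation between order statistics (the "inclusive" method), which is an assumption about how the original summary was produced.

```python
import statistics

# Data set 3: daily new COVID-19 cases in Delhi, as listed in the text.
delhi_cases = [174, 161, 132, 136, 131, 144, 148, 140, 127, 122, 108, 104,
               132, 111, 112, 120, 71, 90, 95, 123, 113, 131, 114, 85, 82,
               112, 126, 176, 146, 160, 141, 137]

# Quartiles by linear interpolation between order statistics.
q1, median, q3 = statistics.quantiles(delhi_cases, n=4, method="inclusive")

summary = {
    "minimum": min(delhi_cases),
    "first_quartile": q1,
    "median": median,
    "mean": statistics.fmean(delhi_cases),
    "third_quartile": q3,
    "maximum": max(delhi_cases),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Rounded to one decimal place as reported in the table, these agree with Table 8.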
Table 9.
Parameter estimation of different distributions of new cases of COVID-19 in Delhi
| Distribution | Parameter estimates |
|---|---|
| PEL | q = 0.02726960 |
| | v = 0.01210532 |
| | = 11.66903275 |
| PEE | p = 0.02874461 |
| | v = 19.79049937 |
| | = 8.76642866 |
| EPL | = 1.9637799 |
| | = 9.5688528 |
| EG | = 0.002413117 |
| L | = 0.01277392 |
| E | = 0.009069704 |
Compared with the other distributions in Table 10, the PEL distribution has the highest p-value and log-likelihood and the lowest K-S statistic (D), AIC, BIC, and HQIC values. As a result, the PEL distribution is the best choice for modeling the data set of new cases of COVID-19 in Delhi.
Table 10.
Fitting of new cases of COVID-19 in Delhi
| Distribution | Negative log-likelihood | K-S statistic (D) | AIC | BIC | HQIC | p-value |
|---|---|---|---|---|---|---|
| PEL | 149.8621 | 0.094787 | 305.7242 | 310.1214 | 307.1818 | 0.936 |
| PEE | 155.7403 | 0.18741 | 317.4805 | 321.8777 | 318.9381 | 0.211 |
| L | 176.4869 | 0.33998 | 354.9738 | 356.4396 | 355.4597 | 0.001226 |
| E | 186.8052 | 0.49341 | 375.6104 | 377.0761 | 376.0962 | 3.423e-07 |
| EG | 213.3347 | 0.77917 | 430.6694 | 433.6009 | 431.6411 | < 2.2e-16 |
| EPL | 223.2282 | 0.48475 | 452.4564 | 456.8536 | 453.9139 | 5.884e-07 |
Figure 13 also shows that the PEL distribution provides the best fit to the data set compared with the other existing distributions. The TTT plot is concave and monotonically increasing, which indicates an increasing hazard rate.
Fig. 13.
Graphical representation of each distribution of new cases of COVID-19 in Delhi
Result and discussion
In Sect. 5, we conducted a Monte Carlo simulation study to test the consistency of the MLEs of the PEL distribution. We considered varying sample sizes along with different combinations of parameters, and replicated the generation of random samples 1000 times for each setting. For each combination of parameters and each sample size, we calculated error measures such as the standard error, MSE, RMSE and bias. The MLEs converge to the respective parameter values as the sample size increases, and all error measures and the bias decrease as the sample size increases.
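The structure of such a simulation study can be illustrated with a small sketch. Because the PEL density and its estimating equations are not restated in this section, the example substitutes the one-parameter Lindley baseline, whose MLE has a closed form and which can be sampled as a mixture of an exponential and a gamma distribution. The true parameter value, sample sizes, and seed below are arbitrary illustrative choices, not the settings of the study.

```python
import math
import random


def sample_lindley(theta, size, rng):
    """Draw Lindley(theta) variates using the mixture representation:
    Exp(theta) with probability theta / (1 + theta), otherwise
    Gamma(shape=2, scale=1/theta)."""
    p_exp = theta / (1.0 + theta)
    return [rng.expovariate(theta) if rng.random() < p_exp
            else rng.gammavariate(2.0, 1.0 / theta)
            for _ in range(size)]


def lindley_mle(data):
    """Closed-form MLE of theta for the Lindley distribution."""
    mean = sum(data) / len(data)
    return (-(mean - 1.0) + math.sqrt((mean - 1.0) ** 2 + 8.0 * mean)) / (2.0 * mean)


def simulate(theta, sample_size, replications=1000, seed=2022):
    """Return bias, MSE and RMSE of the MLE over repeated samples."""
    rng = random.Random(seed)
    estimates = [lindley_mle(sample_lindley(theta, sample_size, rng))
                 for _ in range(replications)]
    bias = sum(estimates) / replications - theta
    mse = sum((est - theta) ** 2 for est in estimates) / replications
    return {"bias": bias, "mse": mse, "rmse": math.sqrt(mse)}


results = {n: simulate(theta=1.5, sample_size=n) for n in (25, 100, 400)}
for n, res in results.items():
    print(f"n = {n:3d}: bias = {res['bias']:+.4f}, "
          f"MSE = {res['mse']:.5f}, RMSE = {res['rmse']:.5f}")
```

With a consistent estimator, the MSE and RMSE shrink as the sample size grows, which is the pattern the study reports for the PEL parameters.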
Then, to confirm the efficiency of the proposed distribution, we fitted it to three lifetime data sets based on COVID-19 and compared it with five existing distributions. To assess the fit of the considered distributions, we calculated the log-likelihood value, K-S statistic, p-value, AIC, BIC and HQIC. The best distribution is the one with the highest log-likelihood value and p-value and the lowest AIC, BIC and HQIC values. The values in Tables 4, 7 and 10 show that the PEL distribution has the highest log-likelihood and p-value and the lowest AIC, BIC, and HQIC values. To find these values, we used the following equations:
$$\mathrm{AIC} = -2\log L + 2p, \qquad \mathrm{BIC} = -2\log L + p\log m, \qquad \mathrm{HQIC} = -2\log L + 2p\log(\log m),$$
where L is the largest value of the likelihood function for the model, p is the number of estimated parameters, and m is the number of observations.
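As a worked check of these criteria, the sketch below evaluates them in their usual forms, AIC = 2(−log L) + 2p, BIC = 2(−log L) + p log m and HQIC = 2(−log L) + 2p log(log m), for the PEL row of Table 4. It assumes that the tabulated log-likelihood entry is the negative log-likelihood, that the PEL model has p = 3 parameters, and that m = 25 observations are used (the size of data set 1); the function name is illustrative.

```python
import math


def information_criteria(neg_loglik, n_params, n_obs):
    """Return (AIC, BIC, HQIC) from a negative log-likelihood value."""
    deviance = 2.0 * neg_loglik
    aic = deviance + 2.0 * n_params
    bic = deviance + n_params * math.log(n_obs)
    hqic = deviance + 2.0 * n_params * math.log(math.log(n_obs))
    return aic, bic, hqic


# PEL fit to data set 1 (Table 4): negative log-likelihood 31.39583,
# three parameters, 25 observations.
aic, bic, hqic = information_criteria(31.39583, n_params=3, n_obs=25)
print(f"AIC = {aic:.5f}, BIC = {bic:.5f}, HQIC = {hqic:.5f}")
```

Under these assumptions the three values agree with the AIC, BIC and HQIC entries reported for PEL in Table 4 up to rounding.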
In addition, for a clearer visual comparison, we plotted the fitted pdfs against the histograms of the data sets, together with the TTT plots, in Figs. 11, 12 and 13. These figures show that the PEL distribution fits the histograms of the COVID-19 data sets better than the other considered distributions.
Conclusion
This study proposes a new distribution, the PEL distribution. Statistical properties such as the hazard rate, survival function, reverse hazard function, moments, median, and mode of the new distribution are derived, along with the order statistics, stochastic ordering, and quantile function. The parameters of the new distribution are estimated using the MLE method, and the performance of the estimators is then tested with a Monte Carlo simulation study. We considered three real lifetime data sets to analyze the performance and superiority of the PEL distribution by comparing it with five existing distributions. From both the simulation study and the real data analysis, we conclude that the newly proposed PEL distribution fits all three data sets better than the other well-known distributions considered, and that the parameter estimates are consistent, with low error measures such as MSE, RMSE and bias. We hope that this model will be used for data analysis in many different fields such as economics, engineering, and medicine.
Future work
In the next article, we will extend this work to a regression model based on the PEL distribution, utilizing categorical data and several classical and Bayesian estimators. We will also apply probabilistic machine learning to model the COVID-19 data, and assess the adaptability of the proposed model to the data by comparing it with competing regression models and machine learning models.
Acknowledgements
We are grateful to the reviewers for their valuable feedback on an earlier version of this paper.
Author Contributions
RCS designed the research. All authors contributed equally to this paper. The authors have analyzed the results and revised the paper. All authors read and approved the final manuscript.
Funding
No funds, grants, or other support were received.
Data availability
The authors confirm that the data supporting the findings of this study are available within the article [and/or] its supplementary materials.
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
C. S. Rajitha and A. Akhilnath contributed equally to this work.
Contributor Information
C. S. Rajitha, Email: rajitha.sugun@gmail.com, Email: cs_rajitha@cb.amrita.edu
A Akhilnath, Email: akhilnathakhi888@gmail.com.
References
- 1. Abouelmagd THM, Hamed MS, Ebraheim Abd El Hadi N. The Poisson-G family of distributions with applications. Pak. J. Stat. Oper. Res. 2017;13(2):313–326. doi: 10.18187/pjsor.v13i2.1740.
- 2. Abu-Youssef SE, Mohammed BI, Sief MG. An extended exponentiated exponential distribution and its properties. Int. J. Comput. Appl. 2015;121(5):1–6.
- 3. Ahsan-ul-Haq M, Ahmed M, Zafar J, Ramos PL. Modeling of COVID-19 cases in Pakistan using lifetime probability distributions. Ann. Data Sci. 2022;9(1):141–152. doi: 10.1007/s40745-021-00338-9.
- 4. Al-Babtain A, Elbatal I, Chesneau C, Elgarhy M. Sine Topp-Leone-G family of distributions: theory and applications. Open Phys. 2020;18(1):574–593. doi: 10.1515/phys-2020-0180.
- 5. Alshanbari HM, Gemeay AM, El-Bagoury AAAH, Khosa SK, Hafez EH, Muse AH. A novel extension of Frechet distribution: application on real data and simulation. Alex. Eng. J. 2022;2022(1):7917–7938. doi: 10.1016/j.aej.2022.01.013.
- 6. Anzagra L, Sarpong S, Nasiru S. Chen-G class of distributions. Cogent Math. Stat. 2020;7(1):1–8. doi: 10.1080/25742558.2020.1721401.
- 7. Ashly R, Rajitha CS. Negative binomial improved second degree Lindley distribution and its application. Adv. Math. Sci. J. 2020;9(2):569–581. doi: 10.37418/amsj.9.2.5.
- 8. Ashour SK, Eltehiwy MA. Exponentiated power Lindley distribution. J. Adv. Res. 2015;6(6):895–905. doi: 10.1016/j.jare.2014.08.005.
- 9. David HA. Order Statistics. New York: Wiley; 1981.
- 10. Ghitany ME, Al-Mutairi DK, Balakrishnan N, Al-Enezi LJ. Power Lindley distribution and associated inference. Comput. Stat. Data Anal. 2013;64(C):20–33. doi: 10.1016/j.csda.2013.02.026.
- 11. Ghitany ME, Atieh B, Nadarajah S. Lindley distribution and its application. Math. Comput. Simul. 2008;78(4):493–506. doi: 10.1016/j.matcom.2007.06.007.
- 12. Gradshteyn IS, Ryzhik IM. Table of Integrals, Series and Products. San Diego: Academic Press; 2007.
- 13. Gupta RD, Kundu D. Generalized exponential distributions. Aust. N. Z. J. Stat. 1999;41(2):173–188. doi: 10.1111/1467-842X.00072.
- 14. Jowett GH. The exponential distribution and its applications. Inc. Stat. 1958;8(2):89–95.
- 15. Klakattawi H, Alsulami D, Elaal MA, Dey S, Baharith L. A new generalized family of distributions based on combining Marshal-Olkin transformation with T-X family. PLoS One. 2022;17(2):1–18. doi: 10.1371/journal.pone.0263673.
- 16. Kondo T. A theory of the sampling distribution of standard deviation. J. Biom. 1930;22(2):34–64.
- 17. Lindley DV. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. 1958;20(1):102–107.
- 18. Liu X, Ahmad Z, Gemeay AM, Abdulrahman AT, Hafez EH, Khalil N. Modeling the survival times of the COVID-19 patients with a new statistical model: a case study from China. PLoS One. 2021;16(7):1–31. doi: 10.1371/journal.pone.0254999.
- 19. Meriem B, Gemeay AM, Almetwally EM, Halim Z, Alshawarbeh E, Abdulrahman AT, Abd El-Raouf MM, Hussam E. The power XLindley distribution: statistical inference, fuzzy reliability, and COVID-19 application. J. Funct. Spaces. 2022;2022(2):1–21.
- 20. Modi K. Power exponentiated family of distributions with application on two real-life datasets. Thail. Stat. 2021;19(3):536–546.
- 21. Nadarajah S, Bakouch HS, Tahmasbi RA. Generalized Lindley distribution. Sankhyā Indian J. Stat. Ser. B. 2011;73(2):331–359.
- 22. Nagy M, Almetwally EM, Gemeay AM, Mohammed HS, Jawa TM, Sayed-Ahmed N, Muse AH. The new novel discrete distribution with application on COVID-19 mortality numbers in kingdom of Saudi Arabia and Latvia. Complexity. 2021;2021(1):1–20.
- 23. Pathak A, Kumar M, Singh SK, Singh U. Statistical inferences: based on exponentiated exponential model to assess novel corona virus (COVID-19) Kerala patient data. Ann. Data Sci. 2022;9(1):101–119. doi: 10.1007/s40745-021-00348-7.
- 24. Rajitha CS, Ashly R. The negative binomial Akash distribution and its applications. Reliab. Theory Appl. 2022;17(2):482–491.
- 25. Rather NA, Rather TA. New generalizations of exponential distribution with applications. J. Prob. Stat. 2017;2017:1–9.
- 26. Riad FH, Alruwaili B, Gemeay AM, Hussam E. Statistical modeling for COVID 19 virus spread in Kingdom of Saudi Arabia and Netherlands. Alex. Eng. J. 2022;61(12):9849–9866. doi: 10.1016/j.aej.2022.03.015.
- 27. Sakthivel KM, Rajitha CS, Dhivakar K. Two parameter cubic rank transmutation of Lindley distribution. In: AIP Conference Proceedings. 2020;2261(1):030086.
- 28. Shawky AI, Bakoban RA. Exponentiated gamma distribution: different methods of estimations. J. Appl. Math. 2012;2012(1):1–23. doi: 10.1155/2012/284296.
- 29. Teamah AAM, Elbanna AA, Gemeay AM. Ahmed: heavy-tailed log-logistic distribution properties, risk measures and applications. Stat. Optim. Inf. Comput. 2021;9(4):910–941. doi: 10.19139/soic-2310-5070-1220.