Abstract
The additive Weibull model and its generalizations are among the most commonly used models in survival analysis. They are well suited for modeling bathtub-shaped hazard rates, a form of hazard rate that arises naturally in practice. Despite these advantages, the maximum likelihood and least squares estimators are biased and perform poorly when the model contains a large number of parameters. As an alternative, the expectation-maximization (EM) algorithm is applied to estimate the parameters of the additive Weibull model. A simulation study confirms that the EM algorithm yields more accurate parameter estimates.
1. Introduction
There are many situations in which the hazard rate function shows a bathtub shape (BTS) with three life periods: an early period of decreasing hazard rate, a useful period (where the hazard rate is approximately constant), and a final period of increasing hazard rate. Lai et al. [1] and Nadarajah [2] presented lists of distributions with a BTS hazard rate. Many authors have assumed a BTS hazard rate model in their research, among them Block et al. [3], Glaser [4], Leemis and Beneke [5], Mi [6, 7], Mitra and Basu [8], and Noughabi et al. [9].
Xie and Lai [10] and Jiang and Murthy [11, 12] studied some aspects of the additive Weibull model with the hazard rate function:
| (1) |
as a good candidate for describing the bathtub-shaped failure rate function. They argued that statistical inference for this model is complex because of the number of parameters and, as a remedy, suggested a reduced model with λ1 = λ2 and α2 = 1/α1, which still accommodates a bathtub-shaped hazard rate.
Lai et al. [13] added a constant term to the additive Weibull hazard rate (1) to provide a more realistic model:
| (2) |
In addition, Bebbington et al. [14] considered the additive Weibull models (1) and (2) in their research to express the concept of the useful period of life of a bathtub-shaped hazard rate distribution. They concluded that the additive Weibull model is sufficiently flexible to describe a bathtub-shaped hazard rate. The common estimator of the parameters of these models is the maximum likelihood estimator (MLE).
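To make the shape of hazard rate (2) concrete, the following minimal sketch evaluates it on a few time points. It assumes, since the display equations are not reproduced here, that each Weibull component has cumulative hazard λ·t^α (so its hazard is α·λ·t^(α−1)); the function name `additive_weibull_hazard` and the argument names are illustrative only. The parameter values are those of the first simulation setting in Section 3.

```python
def additive_weibull_hazard(t, a1, l1, a2, l2, l3=0.0):
    """Hazard of the additive Weibull model plus a constant term l3.

    Assumed parameterization: component j has cumulative hazard
    l_j * t**a_j, hence hazard a_j * l_j * t**(a_j - 1).
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return a1 * l1 * t ** (a1 - 1) + a2 * l2 * t ** (a2 - 1) + l3


# First parameter setting of the simulation study.
params = dict(a1=0.8, l1=0.01, a2=1.3, l2=0.02, l3=0.04)

# With a1 < 1 < a2 the hazard first decreases and later increases.
grid = [0.001, 0.04, 1.0, 10.0]
rates = [additive_weibull_hazard(t, **params) for t in grid]
```

Under this assumed parameterization, a shape parameter below 1 in one component and above 1 in the other is what produces the decreasing and increasing arms of the bathtub.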
The EM algorithm is an iterative algorithm for estimating the parameters of models involving latent variables, e.g., when the data are derived from a mixture or competing risk model. It was used by Dempster et al. [15], Balakrishnan et al. [16], Davies et al. [17], Yang et al. [18], and Okamura and Dohi [19] to estimate the parameters in their models. In this paper, we use the EM algorithm to estimate the parameters of the additive model (2) and show by a simulation study that this algorithm gives better estimates than the MLE and the least squares estimator (LSE).
The paper is organized as follows. In Section 2, we briefly present the MLE and LSE of the parameters and then discuss the EM algorithm. Section 3 provides a simulation study comparing the MLE, LSE, and EM estimators. In Section 4, a data set is analyzed to show the applicability of the proposed estimators.
2. The MLE, Least Squares, and EM
Let x1, x2, ⋯, xn represent a realization of an iid random sample from the additive Weibull distribution with hazard rate (2). The log-likelihood function has five parameters and is of the form
| (3) |
To find the MLE, we should find admissible values of (α1, λ1, α2, λ2, λ3) which maximize the log-likelihood function.
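As a hedged illustration of the objective being maximized, the sketch below evaluates a log-likelihood of the generic form Σ[ln h(xi) − H(xi)], where h is hazard rate (2) and H its cumulative hazard. The parameterization (cumulative hazard λ·t^α for each Weibull component) is an assumption, and `log_likelihood` is an illustrative name rather than the author's code.

```python
import math


def log_likelihood(x, a1, l1, a2, l2, l3):
    """Log-likelihood sum(ln h(x_i) - H(x_i)) for hazard rate (2).

    Assumption: Weibull component j has cumulative hazard l_j * t**a_j.
    """
    total = 0.0
    for t in x:
        hazard = a1 * l1 * t ** (a1 - 1) + a2 * l2 * t ** (a2 - 1) + l3
        cum_hazard = l1 * t ** a1 + l2 * t ** a2 + l3 * t
        total += math.log(hazard) - cum_hazard
    return total


# Toy data, for illustration only.
sample = [0.5, 1.2, 2.0, 3.7]
value = log_likelihood(sample, 0.8, 0.01, 1.3, 0.02, 0.04)
```

A numerical optimizer would then be run on the negative of this function, subject to all five parameters being positive.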
To find the LSE of the parameters, we use the reliability function corresponding to the hazard rate model (2), which is
| (4) |
The empirical reliability function is defined to be
| (5) |
in which the indicator function I(t < xi) equals 1 when t < xi and 0 otherwise. The LSE of the parameters can then be computed by minimizing the following sum of squared errors with respect to (α1, λ1, α2, λ2, λ3):
| (6) |
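The least squares criterion can be sketched as follows. This is an assumption-laden illustration: the squared differences between the model reliability and the empirical reliability are summed over the observed points, and each Weibull component is taken to have cumulative hazard λ·t^α. The helper names are hypothetical.

```python
import math


def reliability(t, a1, l1, a2, l2, l3):
    """Model reliability exp(-H(t)) under the assumed parameterization."""
    return math.exp(-(l1 * t ** a1 + l2 * t ** a2 + l3 * t))


def empirical_reliability(t, x):
    """Fraction of observations strictly greater than t."""
    return sum(1 for xi in x if t < xi) / len(x)


def sum_of_squares(params, x):
    """Squared distance between model and empirical reliability,
    evaluated at the observed points (an assumed choice of grid)."""
    return sum(
        (reliability(t, *params) - empirical_reliability(t, x)) ** 2
        for t in x
    )


# Toy data, for illustration only.
sample = [0.5, 1.2, 2.0, 3.7]
sse = sum_of_squares((0.8, 0.01, 1.3, 0.02, 0.04), sample)
```

Minimizing `sum_of_squares` over positive parameter values with a general-purpose optimizer would give the LSE under these assumptions.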
2.1. The EM Algorithm
Let X1i, X2i, and X3i, i = 1, 2, ⋯, n, follow the Weibull distributions with parameters (α1, λ1) and (α2, λ2) and the exponential distribution with mean 1/λ3, respectively. Assume that in a lifetime experiment, the observations are realizations of the competing risk random variable Xi = min{X1i, X2i, X3i}. This means that the lifetime event may be due to one of three competing causes. Let Zi be the latent random variable with support {1, 2, 3} such that
| (7) |
With these notations, the likelihood function is
| (8) |
where Rj(x | θ), fj(x | θ), and λj(x | θ) denote the reliability function, the density function, and the hazard rate function of Xji, j = 1, 2, 3, respectively. Then, the log-likelihood function is
| (9) |
The EM algorithm is iterative, and every iteration consists of two consecutive steps, namely, the E step and the M step. In the E step, the expectation of the log-likelihood is constructed with respect to the conditional distribution of the latent variables given the current parameter estimate. In the M step, this expectation is maximized to obtain the parameter estimates for the current iteration.
2.2. The E Step
Let θt denote the estimate of θ at iteration t. Then, the conditional distribution of Zi is
| (10) |
| (11) |
and pi3,t = 1 − pi1,t − pi2,t where
| (12) |
The probabilities pi1,t, pi2,t, and pi3,t are called the membership probabilities.
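For independent competing risks, a standard result is that the probability that cause j produced a failure observed at time x equals the share of cause j in the total hazard at x. The sketch below computes membership probabilities in that way; it assumes cumulative hazard λ·t^α for each Weibull component, and the function name is illustrative.

```python
def membership_probabilities(x, a1, l1, a2, l2, l3):
    """Return (p_i1, p_i2, p_i3) for each observation x_i.

    Each probability is the cause-specific hazard divided by the total
    hazard at x_i, evaluated at the current parameter estimate.
    """
    probs = []
    for t in x:
        h1 = a1 * l1 * t ** (a1 - 1)   # first Weibull cause
        h2 = a2 * l2 * t ** (a2 - 1)   # second Weibull cause
        h3 = l3                        # exponential cause
        total = h1 + h2 + h3
        probs.append((h1 / total, h2 / total, h3 / total))
    return probs


# Toy data, for illustration only.
p = membership_probabilities([0.5, 1.2, 2.0], 0.8, 0.01, 1.3, 0.02, 0.04)
```

Each triple sums to 1, which matches the relation pi3,t = 1 − pi1,t − pi2,t stated above.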
Now we define the expectation of the log-likelihood function (9) with respect to the conditional distribution of Zi.
| (13) |
So Q(θ | θt) can be written as the sum of three distinct expressions Q1(θ | θt), Q2(θ | θt), and Q3(θ | θt), where
| (14) |
| (15) |
| (16) |
These expressions are used in the M step to compute the estimates of the parameters.
2.3. The M Step
The estimate of the parameters at iteration t + 1 can be obtained by
| (17) |
| (18) |
| (19) |
Q1(θ | θt) and Q2(θ | θt) must be maximized numerically, since their critical points have no closed form. The maximizer of Q3(θ | θt), however, has a closed form, and solving the equation (∂/∂λ3)Q3 = 0 gives
| (20) |
The iterative process can be stopped when Q(θt+1 | θt+1) < Q(θt | θt) + ε for some small predefined ε.
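One possible way to carry out the M step is sketched below. It rests on assumptions not confirmed by the text: each Weibull part of Q is taken to be Σ pij·[ln α + ln λ + (α − 1)·ln xi] − λ·Σ xi^α (cumulative hazard λ·t^α), so that for a fixed α the maximizing λ is Σ pij / Σ xi^α and only a one-dimensional search over α remains. The golden-section search, the shape bounds, and all function names are illustrative choices, not the author's implementation. Under the same assumptions, the exponential part is maximized at Σ pi3 / Σ xi.

```python
import math


def golden_max(f, lo, hi, tol=1e-6):
    """Maximize a unimodal function f on [lo, hi] by golden-section search."""
    g = (math.sqrt(5) - 1) / 2
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:            # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + g * (hi - lo)
            fd = f(d)
        else:                  # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - g * (hi - lo)
            fc = f(c)
    return (lo + hi) / 2


def m_step_weibull(x, p, shape_bounds=(0.05, 10.0)):
    """Update (alpha_j, lambda_j) for one Weibull cause.

    p holds the membership probabilities of this cause; shape_bounds is
    an assumed search interval for the shape parameter.
    """
    weight = sum(p)
    s_log = sum(pi * math.log(xi) for pi, xi in zip(p, x))

    def profile(a):
        lam = weight / sum(xi ** a for xi in x)
        return weight * (math.log(a) + math.log(lam)) + (a - 1) * s_log - weight

    a = golden_max(profile, *shape_bounds)
    return a, weight / sum(xi ** a for xi in x)


def m_step_exponential(x, p):
    """Closed-form update of the exponential rate lambda_3."""
    return sum(p) / sum(x)
```

The profile function is concave in the shape parameter under these assumptions, which is why a simple bracketing search is adequate for a sketch; a general-purpose optimizer could be used instead.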
3. Simulation Study
To generate a random instance of the competing risk model with hazard rate (2), we simulate one random instance of the Weibull distribution with parameters (α1, λ1), namely, X11, one random instance of the Weibull distribution with parameters (α2, λ2), namely, X21, and one random instance of the exponential distribution with parameter λ3, namely, X31. Then, the random variable X1 = min{X11, X21, X31} follows the desired competing risk model.
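The sampling scheme just described can be sketched in a few lines. The sketch assumes the Weibull components have cumulative hazard λ·t^α and draws them by inverting the reliability function; `simulate` and `draw_weibull` are hypothetical names, and this is not the R code from the supplementary file.

```python
import math
import random


def draw_weibull(rng, a, l):
    """Draw from a Weibull with cumulative hazard l * t**a by inversion."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return (-math.log(u) / l) ** (1.0 / a)


def simulate(n, a1, l1, a2, l2, l3, seed=None):
    """Return n lifetimes, each the minimum of three competing causes."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x1 = draw_weibull(rng, a1, l1)
        x2 = draw_weibull(rng, a2, l2)
        x3 = rng.expovariate(l3)  # exponential with rate l3
        sample.append(min(x1, x2, x3))
    return sample


# One sample of size 50 from the first parameter setting.
data = simulate(50, 0.8, 0.01, 1.3, 0.02, 0.04, seed=1)
```

Fixing the seed makes a run reproducible, which is convenient when comparing estimators on the same replicates.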
In every run of the simulation study, we draw r = 500 replicates of samples of sizes n = 50 and 100. Then, for each sample, the parameters are estimated using the MLE, the LSE, and the EM algorithm (see Supplementary Materials for all R codes used for the simulation study). Every cell of Table 1 shows the results of one run. The results contain the bias (B), the absolute bias (AB), and the mean squared error (MSE), which, for example for α1, are computed by the following relations.
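| $B = \dfrac{1}{r}\sum_{i=1}^{r}\left(\hat{\alpha}_{1,i} - \alpha_1\right)$ | (21) |
| $AB = \dfrac{1}{r}\sum_{i=1}^{r}\left|\hat{\alpha}_{1,i} - \alpha_1\right|$ | (22) |
| $MSE = \dfrac{1}{r}\sum_{i=1}^{r}\left(\hat{\alpha}_{1,i} - \alpha_1\right)^2$ | (23) |
where $\hat{\alpha}_{1,i}$ is the estimate of α1 based on the ith replication. Some important observations from the simulation results are as follows.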
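- As the sample size increases, the AB decreases for all three approaches, and so does the MSE, except for the MLE of α2, whose MSE increases in both parameter settings.
- The EM estimator outperforms the LSE and MLE in terms of AB and MSE in all cases.
- The LSE outperforms the MLE in terms of AB and MSE in most cases; exceptions include λ2 in the first parameter setting and λ1 in the second.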
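The three summary measures reported in Table 1 are the mean error, the mean absolute error, and the mean squared error over the replicates. A minimal sketch follows, with `summarize` as an illustrative name and toy estimates in place of real simulation output.

```python
def summarize(estimates, true_value):
    """Return (bias, absolute bias, mean squared error) over replicates."""
    r = len(estimates)
    errors = [est - true_value for est in estimates]
    bias = sum(errors) / r
    abs_bias = sum(abs(e) for e in errors) / r
    mse = sum(e * e for e in errors) / r
    return bias, abs_bias, mse


# Toy estimates of a parameter whose true value is 0.8.
b, ab, mse = summarize([0.9, 0.7, 1.0], 0.8)
```

Errors of opposite sign cancel in the bias but not in the absolute bias, which is why both are reported.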
Table 1.
Bias (B), absolute bias (AB), and mean squared error (MSE) of the estimates of the five parameters α1, λ1, α2, λ2, and λ3. Setting 1: α1 = 0.8, λ1 = 0.01, α2 = 1.3, λ2 = 0.02, λ3 = 0.04. Setting 2: α1 = 1.1, λ1 = 0.1, α2 = 0.9, λ2 = 0.2, λ3 = 0.3.
| Setting | n | Parameter | EM B | EM AB | EM MSE | LSE B | LSE AB | LSE MSE | MLE B | MLE AB | MLE MSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 50 | α1 | 0.114788 | 0.209461 | 0.073327 | 0.064984 | 0.266447 | 0.295491 | 0.150792 | 0.278228 | 0.475591 |
| 1 | 50 | λ1 | -0.001410 | 0.003386 | 0.000017 | 0.015999 | 0.020659 | 0.000707 | 0.026929 | 0.030571 | 0.001399 |
| 1 | 50 | α2 | 0.354883 | 0.414357 | 0.325645 | 0.450476 | 0.596059 | 1.216075 | 1.710262 | 1.829587 | 30.45458 |
| 1 | 50 | λ2 | 0.001644 | 0.004211 | 0.000030 | 0.0008324 | 0.015219 | 0.000326 | 0.003705 | 0.013934 | 0.000315 |
| 1 | 50 | λ3 | 0.006223 | 0.007680 | 0.000101 | -0.018570 | 0.025329 | 0.000851 | -0.025577 | 0.031822 | 0.001172 |
| 1 | 100 | α1 | 0.090420 | 0.161060 | 0.041728 | 0.075634 | 0.208668 | 0.090426 | 0.074398 | 0.221743 | 0.092066 |
| 1 | 100 | λ1 | -0.001857 | 0.002883 | 0.000011 | 0.014107 | 0.018573 | 0.000558 | 0.022671 | 0.027087 | 0.001132 |
| 1 | 100 | α2 | 0.269857 | 0.300074 | 0.151175 | 0.351361 | 0.551910 | 0.787756 | 1.266889 | 1.407221 | 38.77269 |
| 1 | 100 | λ2 | 0.000711 | 0.002802 | 0.000012 | 0.007800 | 0.013041 | 0.000264 | 0.000171 | 0.012493 | 0.000243 |
| 1 | 100 | λ3 | 0.006117 | 0.006801 | 0.000072 | -0.017343 | 0.024008 | 0.000788 | -0.018373 | 0.027122 | 0.000936 |
| 2 | 50 | α1 | 0.087247 | 0.189355 | 0.067184 | 0.388006 | 0.476704 | 0.656696 | 1.686055 | 1.736100 | 193.3874 |
| 2 | 50 | λ1 | 0.010944 | 0.026164 | 0.001247 | 0.131134 | 0.184852 | 0.053799 | 0.057770 | 0.124714 | 0.025303 |
| 2 | 50 | α2 | 0.049117 | 0.165828 | 0.047906 | -0.011756 | 0.306156 | 0.273852 | 0.019276 | 0.239581 | 0.241092 |
| 2 | 50 | λ2 | 0.011833 | 0.033445 | 0.001913 | 0.082536 | 0.215583 | 0.070223 | 0.161446 | 0.250880 | 0.084662 |
| 2 | 50 | λ3 | 0.010969 | 0.033325 | 0.001867 | -0.149815 | 0.212298 | 0.056467 | -0.165021 | 0.245053 | 0.068212 |
| 2 | 100 | α1 | 0.057342 | 0.128094 | 0.027325 | 0.332300 | 0.424207 | 0.583780 | 0.919810 | 0.977701 | 8.022046 |
| 2 | 100 | λ1 | 0.005099 | 0.018084 | 0.000525 | 0.104450 | 0.160100 | 0.040268 | 0.030251 | 0.106644 | 0.017581 |
| 2 | 100 | α2 | 0.035447 | 0.107349 | 0.018491 | -0.011164 | 0.227723 | 0.132404 | 0.008467 | 0.179967 | 0.311876 |
| 2 | 100 | λ2 | 0.004282 | 0.023096 | 0.000883 | 0.087848 | 0.195858 | 0.057835 | 0.170213 | 0.248756 | 0.082494 |
| 2 | 100 | λ3 | 0.003564 | 0.025512 | 0.001046 | -0.151875 | 0.206708 | 0.053532 | -0.158807 | 0.236292 | 0.064980 |
4. Applications
Lawless [20] analyzed failure times of some electrical appliances. The scaled TTT transform plot in Figure 1 indicates a bathtub-shaped hazard rate function. This gives nonparametric evidence that the data come from a BT hazard rate model, so we fitted several distributions that accommodate a bathtub-shaped hazard rate to this data set. We use the MLE and the EM algorithm to fit the five-parameter competing risk model (2). In addition, the reduced model with the reliability function
| (24) |
has been fitted by computing the MLE of the parameters. Figure 2 shows the cumulative distribution functions (CDFs) of the fitted models. There is a considerable distance between the fitted reduced model (24) and the empirical CDF, which shows that the reduced model may be inappropriate in some examples.
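The scaled TTT transform behind the diagnostic plot can be computed directly from the ordered data. The sketch below uses the standard definition, in which the total time on test up to the r-th ordered failure is divided by the total of all failure times; the function name is illustrative, and the data are the values listed in Table 2.

```python
def scaled_ttt(x):
    """Return the points (r/n, T_r / T_n) of the scaled TTT transform.

    T_r is the sum of the r smallest observations plus (n - r) times
    the r-th smallest one.
    """
    xs = sorted(x)
    n = len(xs)
    total = sum(xs)
    points, running = [], 0.0
    for r, value in enumerate(xs, start=1):
        running += value
        points.append((r / n, (running + (n - r) * value) / total))
    return points


# Failure times from Table 2.
failure_times = [
    34, 59, 61, 69, 80, 123, 142, 165, 210, 381,
    479, 556, 574, 839, 917, 969, 991, 1064, 1088, 1091,
    1270, 1275, 1355, 1397, 1477, 1578, 1649, 1702, 1893, 1932,
    2161, 2292, 2326, 2337, 2628, 2785, 2811, 2886, 2993, 3122,
    3715, 3790, 3857, 3912, 4100, 4106, 4116, 4315, 4510, 4584,
    5299, 5583, 6065, 9701,
]
curve = scaled_ttt(failure_times)
```

Plotting the second coordinate against the first gives the TTT plot; a curve that is first convex and then concave is the usual signature of a bathtub-shaped hazard rate.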
Figure 1.

The scaled TTT transform plot for the data set of Table 2.
Figure 2.

The empirical CDF and some fitted models for the data set of Table 2.
Moreover, some results of the fits are summarized in Table 3. Based on the Kolmogorov-Smirnov (K-S) statistic and p value, the competing risk model (2) fitted by the EM algorithm gives the best description of the data. The MLE also provides good results, but it is worth noting that we used the EM estimates as the initial values in the likelihood maximization. All of the fitted models confirm the BT hazard rate that was first recognized from the TTT transform plot. It may therefore be of interest to find the points that maximize the mean residual life and/or the median residual life functions. These points are referred to as burn-in points and give the time at which the component is in its most reliable condition. Figure 3(a) shows the mean residual life function along with the median residual life function of the best fitted model, with the burn-in points of both functions marked. Figure 3(b) shows the hazard rate function of this model, which has a BT shape.
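The K-S statistic used for this comparison is the largest vertical distance between the empirical CDF and a fitted CDF. A minimal sketch is given below; `ks_statistic` is an illustrative name, and the CDF passed in the example is a toy uniform CDF rather than one of the fitted models, whose estimates are not reproduced in the text.

```python
def ks_statistic(x, cdf):
    """Largest distance between the empirical CDF of x and cdf.

    Both one-sided gaps are checked at every ordered observation,
    because the empirical CDF jumps there.
    """
    xs = sorted(x)
    n = len(xs)
    d = 0.0
    for i, value in enumerate(xs, start=1):
        f = cdf(value)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Toy example: two points compared with the uniform CDF on [0, 1].
d = ks_statistic([0.25, 0.75], lambda t: t)  # 0.25
```

A smaller statistic means the fitted CDF tracks the empirical CDF more closely, which is the sense in which Table 3 ranks the fits.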
Table 3.
The results of fitting some suitable models to the data set.
| The model | The method | Estimations | K-S statistics | K-S p value | AIC |
|---|---|---|---|---|---|
| Model (2) | EM | | 0.05246 | 0.9936 | - |
| Model (2) | MLE | | 0.05359 | 0.9917 | 1049.014 |
| Model (24) | MLE | | 0.09987 | 0.5539 | 1044.176 |
Figure 3.

(a) The mean residual life and the median residual life of the fitted model. (b) The hazard rate function of this fitted model.
5. Conclusion
The competing risk model with Weibull baseline distributions plays a vital role in describing nonmonotone hazard rates. One drawback of this model is its large number of parameters, which makes estimation harder. Some authors have suggested reduced versions to overcome this problem, but there are many examples showing that the reduced model may not be appropriate. We therefore implemented the EM algorithm for estimating the parameters. The simulation results confirm that this algorithm performs better than the MLE and LSE. In future work, similar EM algorithms may be constructed for other competing risk or mixture models; for example, the gamma competing risk model with the following reliability function may be a good candidate:
| (25) |
in which $\Gamma(\alpha, t) = \int_{t}^{\infty} y^{\alpha-1} e^{-y}\,dy$ is the upper incomplete gamma function and α1 > 0, λ1 > 0, α2 > 0, and λ2 > 0.
Table 2.
Failure time of electrical appliances in terms of 1000 s cycles.
| 34 | 59 | 61 | 69 | 80 | 123 | 142 | 165 | 210 | 381 |
| 479 | 556 | 574 | 839 | 917 | 969 | 991 | 1064 | 1088 | 1091 |
| 1270 | 1275 | 1355 | 1397 | 1477 | 1578 | 1649 | 1702 | 1893 | 1932 |
| 2161 | 2292 | 2326 | 2337 | 2628 | 2785 | 2811 | 2886 | 2993 | 3122 |
| 3715 | 3790 | 3857 | 3912 | 4100 | 4106 | 4116 | 4315 | 4510 | 4584 |
| 5299 | 5583 | 6065 | 9701 |
Acknowledgments
The author thanks the two anonymous reviewers for their comments and suggestions. This work is supported by Researchers Supporting Project number RSP-2021/392, King Saud University, Riyadh, Saudi Arabia.
Data Availability
The lifetime data used to support the findings of this study are included within the article.
Conflicts of Interest
The author declares no conflict of interest.
Supplementary Materials
The R codes used for simulation study and applications are presented in a supplementary file.
References
- 1. Lai C. D., Xie M., Murthy D. N. P. Bathtub-shaped failure rate life distributions. In: Balakrishnan N., Rao C. R., editors. Handbook of Statistics, Vol. 20: Advances in Reliability. Netherlands: Elsevier Science B.V.; 2001. pp. 69–104.
- 2. Nadarajah S. Bathtub-shaped failure rate functions. Quality & Quantity. 2009;43(5):855–863. doi: 10.1007/s11135-007-9152-9.
- 3. Block H. W., Savits T. H., Singh H. A criterion for burn-in that balances mean residual life and residual variance. Operations Research. 2002;50(2):290–296. doi: 10.1287/opre.50.2.290.435.
- 4. Glaser R. E. Bathtub and related failure rate characterizations. Journal of the American Statistical Association. 1980;75:667–672.
- 5. Leemis L. M., Beneke M. Burn-in models and methods: a review. IIE Transactions. 1990;22(2):172–180.
- 6. Mi J. Maximization of a survival probability and its application. Journal of Applied Probability. 1994;31(4):1026–1033.
- 7. Mi J. Bathtub failure rate and upside-down bathtub mean residual life. IEEE Transactions on Reliability. 1995;44(3):388–396.
- 8. Mitra M., Basu S. K. On some properties of the bathtub failure rate family of life distributions. Microelectronics Reliability. 1996;36(5):679–684.
- 9. Noughabi M., Borzadaran G., Roknabadi A. On the reliability properties of some weighted models of bathtub shaped hazard rate distributions. Probability in the Engineering and Informational Sciences. 2013;27(1):125–140. doi: 10.1017/S0269964812000344.
- 10. Xie M., Lai C. D. Reliability analysis using additive Weibull model with bathtub-shaped failure rate function. Reliability Engineering and System Safety. 1996;52:87–93.
- 11. Jiang R., Murthy D. N. P. Two sectional models involving three Weibull distributions. Quality and Reliability Engineering International. 1997;13:83–96.
- 12. Jiang R., Murthy D. N. P. Parametric study of competing risk model involving two Weibull distributions. International Journal of Reliability, Quality and Safety Engineering. 1997;4(1):17–34.
- 13. Lai C. D., Zhang L. Y., Xie M. Mean residual life and other properties of Weibull related bathtub shaped failure rate distributions. International Journal of Reliability, Quality and Safety Engineering. 2004;11:113–132.
- 14. Bebbington M., Lai C. D., Zitikis R. Useful periods for lifetime distributions with bathtub shaped hazard rate functions. IEEE Transactions on Reliability. 2006;55(2):245–251. doi: 10.1109/TR.2001.874943.
- 15. Dempster A. P., Laird N. M., Rubin D. B. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological). 1977;39:1–22. doi: 10.1111/j.2517-6161.1977.tb01600.x.
- 16. Balakrishnan N., So H. Y., Ling M. H. EM algorithm for one-shot device testing with competing risks under Weibull distribution. IEEE Transactions on Reliability. 2016;65(2):973–991. doi: 10.1109/TR.2015.2500361.
- 17. Davies K., Pal S., Siddiqua J. A. Stochastic EM algorithm for generalized exponential cure rate model and an empirical study. Journal of Applied Statistics. 2021;48(12):2112–2135. doi: 10.1080/02664763.2020.1786676.
- 18. Yang J., Chen J., Wang X. EM algorithm for estimating reliability of multi-release open source software based on general masked data. IEEE Access. 2021;9:18890–18903. doi: 10.1109/ACCESS.2021.3054760.
- 19. Okamura H., Dohi T. Application of EM algorithm to NHPP-based software reliability assessment with generalized failure count data. Mathematics. 2021;9(9):p. 985. doi: 10.3390/math9090985.
- 20. Lawless J. F. Statistical Models and Methods for Lifetime Data. 2nd ed. New York: Wiley; 2003.