A modified truncated distribution for modeling the heavy tail, engineering and environmental sciences data

Ahtasham Gul; Muhammad Mohsin; Muhammad Adil; Mansoor Ali

doi:10.1371/journal.pone.0249001

2021 Apr 6;16(4):e0249001. doi: 10.1371/journal.pone.0249001

Editor: Feng Chen⁴

PMCID: PMC8023488 PMID: 33822800

Abstract

Truncated models are essential for efficiently analyzing the finite data observed in almost all real-life situations. In this paper, a new four-parameter truncated distribution, named the Weibull-Truncated Exponential Distribution (W-TEXPD), is developed. The proposed model can be used as an alternative to the Exponential, standard Weibull, shifted Gamma-Weibull and three-parameter Weibull distributions. The statistical characteristics of the proposed model, including the cumulative distribution function, hazard function, cumulative hazard function, central moments, skewness, kurtosis, percentiles and entropy, are derived. The maximum likelihood method is employed to estimate the unknown parameters of the W-TEXPD. A simulation study is also carried out to assess the performance of the parameter estimates. The proposed distribution is fitted to five data sets from different fields to demonstrate its wide applicability. A comparison of the proposed model with some existing models is given to justify the performance of the W-TEXPD.

1 Introduction

Truncated probability models are used when a stochastic variable is confined to some domain. They are required in almost every field, such as astronomy, epidemiology, biometry, engineering and economics. For instance, a government may be interested in the number of families living in New York City with a monthly income of more than 50,000 US dollars. Another example is the recruitment of police officials, who must meet a minimum prerequisite academic qualification. In engineering, measurements are often taken with a detector that registers only signals above a specific limit, so weak signals are not taken into account. All of these situations call for truncated probability distributions.

Weibull and Exponential distributions are widely used in reliability and lifetime analysis because of their simplicity and ease of mathematical manipulation. The Weibull distribution was introduced by the Swedish physicist Waloddi Weibull [1]. It is commonly used for modeling reliability, electrical engineering [2], mechanical engineering [3], lifetime and environmental sciences data because of its wide variety of shapes. A considerable literature on methods for estimating the Weibull parameters is given by [4, 5]. [6] states that the Weibull distribution becomes reversed J-shaped, exponential and bell-shaped when the shape parameter is <, = and > 1, respectively. A comprehensive account of the truncated Weibull distribution is given by [7, 8]. [9] fits the truncated Weibull distribution in different areas, for example to analyse the diameter of trees by truncating the data at a specific threshold level and to infer the height of small trees. [10] uses the method of moments to compute moment expressions for the two-parameter, three-parameter and truncated (left, right and doubly) Weibull distributions. The Exponential distribution is also popular for modeling data because of the availability of good estimators and its convenient mathematical properties (e.g. being memoryless). [11] derives the maximum likelihood estimator of the scale parameter of the Exponential distribution.

[12] computes the parameter estimates of the truncated Gamma probability density function (pdf). [13] demonstrates the value of truncated probability density functions in hydrology by computing the truncated moment expressions (TMEs) as well as the complete moments of different densities, and notes that complete moments are a special case of truncated moment expressions. [14] uses both skew-Cauchy (SK-CD) and truncated skew-Cauchy (TSK-CD) probability functions to model the exchange rate between the UK pound sterling and the US dollar from 1800 to 2003, and concludes that TSK-CD models the data better than SK-CD. [15] studies the truncated version of the Birnbaum-Saunders (BS) distribution to improve the forecasting of an actuarial model, specifically for modeling insurance payment data that establish a deductible.

In the field of hydrology, [16] uses the generalized exponential (GE) distribution to study the flood frequency for Polish Rivers. [17] employs Weibull density function to study accrual failure detector and calls it “Weibull Distribution Failure Detector for Cloud Computing”. [18] introduces a new generalized form of Weibull probability model i.e. “Alpha logarithmic transformed Weibull distribution” (ALTW) to model the failure time of turbocharger of engine.

In engineering, [19] fits a micro-level spatial joint model and a macro-level model with conditional autoregressive (CAR) priors to analyze zonal crashes using three years of urban highway data from the USA, and concludes that the micro-level model fits the data better. [20] proposes a new Bayesian spatio-temporal model to study the association between the frequency of freeway incidents and other risk factors. [21] considers the mixed logit model to identify the main factors of single and multiple vehicle accidents. Similarly, in another study, [22] applies the mixed logit model to analyze the significant factors of single vehicle (SV) and multiple vehicle (MV) accidents using 10 years of truck driver data from rural highways in the USA. [23] develops the Weibull-Lindley distribution by compounding two distributions and highlights its worth by fitting it to three medical data sets. [24] introduces U-statistics for the Weibull distribution parameters and compares them with nine parameter estimation techniques.

Several distinct characteristics motivated us to introduce the W-TEXPD: (i) it is distinguished by a new scale parameter, obtained from the new truncated transformed distribution, in addition to the usual location parameter; (ii) the W-TEXPD shows monotonic, non-monotonic and bathtub-shaped hazard rates, which makes it a better model than lifetime models that only exhibit constant or monotonically increasing/decreasing hazard rates; (iii) various well-known classical lifetime models are special cases of the W-TEXPD; (iv) the W-TEXPD is appropriate for fitting scattered, skewed (spread) and/or heavy-tailed (flat-curved) data that may not be fitted appropriately by other typical probability density functions; and (v) the results of a Monte Carlo simulation study for different sample sizes reveal the stability of the model parameters. Finally, we examine how well the W-TEXPD performs compared with several renowned classical lifetime models by using five skewed and heavy-tailed data sets.

The manuscript is organized as follows. In Section 2, the W-TEXPD is described and its characteristics, such as the hazard function, cumulative hazard function, raw and central moments, skewness, kurtosis, Shannon's entropy and order statistics, are derived. In Section 3, maximum likelihood estimates of the model parameters are obtained. In Section 4, a Monte Carlo simulation study is performed to examine the performance of the W-TEXPD for different choices of the model parameters. In Section 5, the feasibility of the proposed model is studied by fitting it to real data sets and comparing it with some baseline models. Some concluding remarks are recorded in Section 6.

2 Weibull-Truncated Exponential distribution (W-TEXPD)

[25] suggests a new method for generating a family of truncated distributions called T-X_T family of distributions by using a new function given as

\begin{matrix} W (F (x_{T})) = - \log \{1 - F (x | x > τ)\} . \end{matrix}

(1)

Let X be a non-negative random variable truncated on the left at τ, with probability density function (pdf) f(x_T) and cumulative distribution function (cdf) F(x_T) on the domain [τ, ∞). Also let T be a random variable with pdf r(t) and cdf R(t) on the interval [a, b], where −∞ ≤ a < b ≤ ∞.

Then the cdf of T-X_T family of distributions is

\begin{matrix} G (x_{T}) = \int_{a}^{- \log \{1 - F (x | x > τ)\}} r (t) d t, \end{matrix}

(2)

\begin{matrix} G (x_{T}) = R [- \log \{1 - F (x_{T})\}], \end{matrix}

(3)

where R(t) is the cdf of random variable T, while the corresponding pdf of T-X_T family of distributions is

\begin{matrix} g (x_{T}) = r [- \log \{1 - F (x | x > τ)\}] \frac{f (x | x > τ)}{1 - F (x | x > τ)}, x > τ . \end{matrix}

(4)

\begin{matrix} g (x_{T}) = r [- \log \{1 - F (x_{T})\}] \frac{f (x_{T})}{1 - F (x_{T})}, \end{matrix}

(5)

\begin{matrix} g (x_{T}) = r [H (x_{T})] h (x_{T}), \end{matrix}

(6)

where h(x_T) = f(x_T)/{1 − F(x_T)} and H(x_T) = −log{1 − F(x_T)} are the hazard and cumulative hazard functions of the truncated variable.

The construction in Eq (2) builds on the method for generating the T-X family of distributions proposed by [26], which is itself an extension of the Beta-generated distributions originally introduced by [27].

Suppose X is an exponential random variable having density function

\begin{matrix} f (x) = θ e^{- θ x}, \end{matrix}

(7)

with corresponding cdf

\begin{matrix} F (x) = 1 - e^{- θ x} . \end{matrix}

(8)

The T-Truncated Exponential distribution defined by [25] is expressed as:

\begin{matrix} g (x) = θ r \{θ (x - τ)\}, x > τ . \end{matrix}

(9)

Let T be the Weibull random variable having pdf

\begin{matrix} r (t) = \frac{β}{α^{β}} t^{β - 1} e^{- {(t / α)}^{β}}, 0 < t < \infty . \end{matrix}

(10)

The Weibull-Truncated exponential distribution (W-TEXPD) is defined by using Eq (9) as

\begin{matrix} g (x) = β {(\frac{θ}{α})}^{β} {(x - τ)}^{β - 1} e^{- \{\frac{θ (x - τ)}{α}\}^{β}}, τ < x < \infty, \end{matrix}

(11)

where τ is the location parameter, θ and α are scale parameters, and β is the shape parameter.

Some special cases of W-TEXPD

W-TEXPD reduces to the Exponential distribution for τ = 0 and θ = β = 1.
W-TEXPD reduces to Weibull for τ = 0, θ = 1.
W-TEXPD reduces to Shifted Gamma-Weibull [28] and three parameter Weibull [29] distribution for θ = 1.
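
As a minimal sketch (not part of the original paper), the density in Eq (11) and its cdf can be coded directly and the first two special cases checked numerically. The function names `wtexpd_pdf` and `wtexpd_cdf` and all numerical values are illustrative assumptions.

```python
import math


def wtexpd_pdf(x, tau, theta, alpha, beta):
    """Density of Eq (11); returns 0 at or below the truncation point tau."""
    if x <= tau:
        return 0.0
    z = theta * (x - tau) / alpha  # rescaled distance from tau
    return beta * (theta / alpha) * z ** (beta - 1.0) * math.exp(-(z ** beta))


def wtexpd_cdf(x, tau, theta, alpha, beta):
    """Distribution function G(x) = 1 - exp(-{theta (x - tau) / alpha}^beta)."""
    if x <= tau:
        return 0.0
    z = theta * (x - tau) / alpha
    return 1.0 - math.exp(-(z ** beta))


# Illustrative values only (assumed, not taken from the paper).
x, alpha, beta = 1.3, 2.0, 1.7

# tau = 0, theta = beta = 1: exponential density with mean alpha.
exponential = math.exp(-x / alpha) / alpha
print(wtexpd_pdf(x, 0.0, 1.0, alpha, 1.0), exponential)

# tau = 0, theta = 1: two-parameter Weibull density with scale alpha, shape beta.
weibull = (beta / alpha) * (x / alpha) ** (beta - 1.0) * math.exp(-((x / alpha) ** beta))
print(wtexpd_pdf(x, 0.0, 1.0, alpha, beta), weibull)
```

In each printed pair the two numbers should agree up to floating-point rounding.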

Fig 1 displays different shapes of W-TEXPD for different values of the parameters.

Some characteristics of W-TEXPD are:

Lemma 2.0.1. The hazard function of W-TEXPD is

\begin{matrix} h (t) = \frac{f (t)}{1 - F (t)} = \frac{f (t)}{R (t)} . \\ h (t) = β {(\frac{θ}{α})}^{β} {(t - τ)}^{β - 1} . \end{matrix}

(12)

Fig 2 highlights that the W-TEXPD can model both monotonic and non-monotonic hazard rate shapes for different values of the parameters.
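
The hazard in Eq (12) is simple enough to evaluate on a grid. The following sketch is an illustration under assumed parameter values (they are not the values used for Fig 2), and `wtexpd_hazard` is a hypothetical helper name. For the values tried here, the hazard is decreasing for β < 1, constant for β = 1 and increasing for β > 1.

```python
def wtexpd_hazard(t, tau, theta, alpha, beta):
    """Hazard rate of Eq (12), defined for t > tau."""
    if t <= tau:
        raise ValueError("the hazard is defined only for t > tau")
    return beta * (theta / alpha) ** beta * (t - tau) ** (beta - 1.0)


# Illustrative parameter values (assumed, not from the paper).
tau, theta, alpha = 1.0, 2.0, 0.5
grid = [tau + 0.25 * k for k in range(1, 9)]

for beta in (0.5, 1.0, 2.0):
    rates = [wtexpd_hazard(t, tau, theta, alpha, beta) for t in grid]
    print(beta, [round(r, 3) for r in rates])
```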

Lemma 2.0.2. The cumulative hazard function of W-TEXPD is computed as

\begin{matrix} H (x) = \int_{τ}^{x} h (t) d t, \\ H (x) = {(\frac{θ}{α})}^{β} {(x - τ)}^{β}, β > 0 . \end{matrix}

(13)

Lemma 2.0.3. The p^th percentile of W-TEXPD is given by

\begin{matrix} G (x) = P, \\ 1 - e^{- \{\frac{θ (x - τ)}{α}\}^{β}} = P, \\ x = τ + \frac{α}{θ} \{- \log (1 - P)\}^{1 / β} . \end{matrix}

(14)
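
Solving the cdf equation 1 − exp(−{θ(x − τ)/α}^β) = P for x gives a quantile function that also serves as an inverse-transform sampler. The sketch below illustrates this; the names `wtexpd_quantile` and `wtexpd_sample`, the seed and the parameter values are assumptions made for the example.

```python
import math
import random


def wtexpd_quantile(p, tau, theta, alpha, beta):
    """Value x with G(x) = p, for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return tau + (alpha / theta) * (-math.log1p(-p)) ** (1.0 / beta)


def wtexpd_sample(n, tau, theta, alpha, beta, rng):
    """Inverse-transform sampling: apply the quantile function to uniforms."""
    return [wtexpd_quantile(rng.random(), tau, theta, alpha, beta) for _ in range(n)]


# Illustrative values (assumed): the median and a small seeded sample.
tau, theta, alpha, beta = 1.0, 2.0, 0.5, 1.5
print(wtexpd_quantile(0.5, tau, theta, alpha, beta))
print(wtexpd_sample(5, tau, theta, alpha, beta, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the draws reproducible without touching the global generator.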

Lemma 2.0.4. The first four raw moments of W-TEXPD are given by

\begin{matrix} E (X) = \int_{τ}^{\infty} x g (x) d x, \\ E (X) = τ + \frac{α}{θ} Γ (\frac{1}{β} + 1) . \end{matrix}

(15)

\begin{matrix} E (X^{2}) = β {(\frac{θ}{α})}^{β} \int_{τ}^{\infty} x^{2} {(x - τ)}^{β - 1} e^{- {\frac{θ (x - τ)}{α}}^{β}} d x, \\ E (X^{2}) = τ^{2} + {(\frac{α}{θ})}^{2} Γ (\frac{2}{β} + 1) + 2 \frac{τ α}{θ} Γ (\frac{1}{β} + 1) . \end{matrix}

(16)

\begin{matrix} E (X^{3}) = \int_{τ}^{\infty} x^{3} g (x) d x, \\ E (X^{3}) = τ^{3} + {(\frac{α}{θ})}^{3} Γ (\frac{3}{β} + 1) + 3 τ {(\frac{α}{θ})}^{2} Γ (\frac{2}{β} + 1) + 3 τ^{2} (\frac{α}{θ}) Γ (\frac{1}{β} + 1) . \end{matrix}

(17)

\begin{matrix} E (X^{4}) = τ^{4} + 4 τ^{3} (\frac{α}{θ}) Γ (\frac{1}{β} + 1) + 6 τ^{2} {(\frac{α}{θ})}^{2} Γ (\frac{2}{β} + 1) + 4 τ {(\frac{α}{θ})}^{3} Γ (\frac{3}{β} + 1) + {(\frac{α}{θ})}^{4} Γ (\frac{4}{β} + 1) . \end{matrix}

(18)
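
Eqs (15) to (18) all follow the same pattern: expanding (τ + Y)^k with Y = X − τ and using E(Y^j) = (α/θ)^j Γ(j/β + 1). The sketch below codes that pattern once and compares it with the explicit expressions in Eqs (15) and (16). The helper name `wtexpd_raw_moment` and the parameter values are illustrative assumptions.

```python
import math


def wtexpd_raw_moment(k, tau, theta, alpha, beta):
    """E(X^k) from the binomial expansion of (tau + Y)^k, where Y = X - tau."""
    scale = alpha / theta
    return sum(
        math.comb(k, j) * tau ** (k - j) * scale ** j * math.gamma(j / beta + 1.0)
        for j in range(k + 1)
    )


# Illustrative parameter values (assumed, not from the paper).
tau, theta, alpha, beta = 1.0, 2.0, 0.5, 1.5
g1 = math.gamma(1.0 / beta + 1.0)
g2 = math.gamma(2.0 / beta + 1.0)

mean_eq15 = tau + (alpha / theta) * g1
second_eq16 = tau ** 2 + (alpha / theta) ** 2 * g2 + 2.0 * tau * (alpha / theta) * g1

print(wtexpd_raw_moment(1, tau, theta, alpha, beta), mean_eq15)
print(wtexpd_raw_moment(2, tau, theta, alpha, beta), second_eq16)
```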

Lemma 2.0.5. The first four central moments of W-TEXPD are given by

\begin{matrix} μ_{1} = 0 . \end{matrix}

(19)

\begin{matrix} μ_{2} = \mathrm{Var} (X) = E (X^{2}) - \{E (X)\}^{2}, \\ \mathrm{Var} (X) = {(\frac{α}{θ})}^{2} [Γ (\frac{2}{β} + 1) - \{Γ (\frac{1}{β} + 1)\}^{2}] . \end{matrix}

(20)

\begin{matrix} μ_{3} = {μ_{3}}^{/} - 3 {μ_{1}}^{/} {μ_{2}}^{/} + 2 {({μ_{1}}^{/})}^{3}, \\ μ_{3} = {(\frac{α}{θ})}^{3} [Γ (\frac{3}{β} + 1) - 3 Γ (\frac{1}{β} + 1) Γ (\frac{2}{β} + 1) + 2 \{Γ (\frac{1}{β} + 1)\}^{3}] . \end{matrix}

(21)

\begin{matrix} μ_{4} = {μ_{4}}^{/} - 4 {μ_{1}}^{/} {μ_{3}}^{/} + 6 {({μ_{1}}^{/})}^{2} {μ_{2}}^{/} - 3 {({μ_{1}}^{/})}^{4}, \\ μ_{4} = {(\frac{α}{θ})}^{4} [\begin{matrix} Γ (\frac{4}{β} + 1) - 4 Γ (\frac{1}{β} + 1) Γ (\frac{3}{β} + 1) \\ + 6 \{Γ (\frac{1}{β} + 1)\}^{2} Γ (\frac{2}{β} + 1) - 3 \{Γ (\frac{1}{β} + 1)\}^{4} \end{matrix}] . \end{matrix}

(22)

Here {μ_{k}}^{/} denotes the k^th raw moment from Eqs (15) to (18). All terms in τ cancel when these are substituted, so the central moments depend only on α/θ and β.

Lemma 2.0.6. The skewness and kurtosis of W-TEXPD are defined as

\begin{matrix} \mathrm{Skewness} = \frac{μ_{3}}{{(μ_{2})}^{3 / 2}}, \\ \mathrm{Skewness} = \frac{Γ (\frac{3}{β} + 1) - 3 Γ (\frac{1}{β} + 1) Γ (\frac{2}{β} + 1) + 2 \{Γ (\frac{1}{β} + 1)\}^{3}}{{[Γ (\frac{2}{β} + 1) - \{Γ (\frac{1}{β} + 1)\}^{2}]}^{3 / 2}} . \end{matrix}

(23)

\begin{matrix} \mathrm{Kurtosis} = \frac{μ_{4}}{{(μ_{2})}^{2}}, \\ \mathrm{Kurtosis} = \frac{Γ (\frac{4}{β} + 1) - 4 Γ (\frac{1}{β} + 1) Γ (\frac{3}{β} + 1) + 6 \{Γ (\frac{1}{β} + 1)\}^{2} Γ (\frac{2}{β} + 1) - 3 \{Γ (\frac{1}{β} + 1)\}^{4}}{{[Γ (\frac{2}{β} + 1) - \{Γ (\frac{1}{β} + 1)\}^{2}]}^{2}} . \end{matrix}

(24)
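
One hedged way to check skewness and kurtosis numerically is to build them from the raw moments of Eqs (15) to (18) through the raw-to-central identities μ₃ = μ₃′ − 3μ₁′μ₂′ + 2(μ₁′)³ and μ₄ = μ₄′ − 4μ₁′μ₃′ + 6(μ₁′)²μ₂′ − 3(μ₁′)⁴. The sketch below does this; the function names and parameter values are assumptions. Because τ only shifts the distribution, the two printed pairs should agree up to rounding.

```python
import math


def raw_moment(k, tau, theta, alpha, beta):
    """Raw moment E(X^k), written as the binomial expansion behind Eqs (15)-(18)."""
    scale = alpha / theta
    return sum(
        math.comb(k, j) * tau ** (k - j) * scale ** j * math.gamma(j / beta + 1.0)
        for j in range(k + 1)
    )


def skewness_kurtosis(tau, theta, alpha, beta):
    """Skewness and kurtosis from the raw-to-central moment identities."""
    m1, m2, m3, m4 = (raw_moment(k, tau, theta, alpha, beta) for k in (1, 2, 3, 4))
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2


# Illustrative values (assumed): same theta, alpha, beta with two locations.
print(skewness_kurtosis(0.0, 2.0, 0.5, 1.0))
print(skewness_kurtosis(3.0, 2.0, 0.5, 1.0))
```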

Theorem 2.1. Let X_T be a stochastic variable following W-TEXPD, then Shannon entropy is given by

\begin{matrix} η_{x_{T}} = 1 + \frac{(β - 1)}{β} \{C + β \ln (\frac{θ}{α})\} - \ln \{β {(\frac{θ}{α})}^{β}\}, \end{matrix}

where C is Euler's constant.

Proof. The Shannon entropy of a random variable X_T is a measure of uncertainty given as

\begin{matrix} η_{x_{T}} = E \{- \ln g (x_{T})\} = \int_{τ}^{\infty} \{- \ln g (x_{T})\} g (x_{T}) d x_{T}, \end{matrix}

\begin{matrix} η_{x_{T}} = - [\begin{matrix} \ln \{β {(\frac{θ}{α})}^{β}\} \int_{τ}^{\infty} g (x_{T}) d x_{T} + (β - 1) \int_{τ}^{\infty} \ln (x_{T} - τ) g (x_{T}) d x_{T} \\ - {(\frac{θ}{α})}^{β} \int_{τ}^{\infty} {(x_{T} - τ)}^{β} g (x_{T}) d x_{T} \end{matrix}], \end{matrix}

\begin{matrix} η_{x_{T}} = [\begin{matrix} {(\frac{θ}{α})}^{β} \int_{τ}^{\infty} {(x_{T} - τ)}^{β} g (x_{T}) d x_{T} - (β - 1) \int_{τ}^{\infty} \ln (x_{T} - τ) g (x_{T}) d x_{T} \\ - \ln \{β {(\frac{θ}{α})}^{β}\} \int_{τ}^{\infty} g (x_{T}) d x_{T} \end{matrix}], \end{matrix}

\begin{matrix} η_{x_{T}} = {(\frac{θ}{α})}^{β} I_{1} - (β - 1) I_{2} - \ln \{β {(\frac{θ}{α})}^{β}\} . \end{matrix}

(25)

\begin{matrix} I_{1} = β {(\frac{θ}{α})}^{β} \int_{τ}^{\infty} {(x_{T} - τ)}^{2 β - 1} e^{- \{\frac{θ (x_{T} - τ)}{α}\}^{β}} d x_{T}, \end{matrix}

let

\begin{matrix} \{\frac{θ (x_{T} - τ)}{α}\}^{β} = u, \end{matrix}

\begin{matrix} I_{1} = {(\frac{θ}{α})}^{- β} \int_{0}^{\infty} u e^{- u} d u, \end{matrix}

\begin{matrix} I_{1} = {(\frac{α}{θ})}^{β}, \end{matrix}

\begin{matrix} I_{2} = \int_{τ}^{\infty} \ln (x_{T} - τ) g (x_{T}) d x_{T}, \end{matrix}

\begin{matrix} I_{2} = β {(\frac{θ}{α})}^{β} \int_{τ}^{\infty} \{\ln (x_{T} - τ)\} {(x_{T} - τ)}^{β - 1} e^{- {(θ / α)}^{β} {(x_{T} - τ)}^{β}} d x_{T}, \end{matrix}

let

\begin{matrix} {(x_{T} - τ)}^{β} = u, \end{matrix}

\begin{matrix} I_{2} = \frac{1}{β} {(\frac{θ}{α})}^{β} \int_{0}^{\infty} \ln u e^{- {(θ / α)}^{β} u} d u . \end{matrix}

To evaluate this integral, we use the following formula for integrals combining logarithms and exponentials, given by [30] (Jeffrey and Zwillinger, 2007, 7th edition, Eq (4.331.1), p. 571):

\begin{matrix} \int_{0}^{\infty} \ln u e^{- μ u} d u = - \frac{1}{μ} (C + \ln μ), [Re μ > 0] . \end{matrix}

Setting μ = (θ/α)^β gives

\begin{matrix} I_{2} = - \frac{1}{β} \{C + β \ln (\frac{θ}{α})\} . \end{matrix}

By using I₁ and I₂, Eq (25) becomes

\begin{matrix} η_{x_{T}} = 1 + \frac{(β - 1)}{β} \{C + β \ln (\frac{θ}{α})\} - \ln \{β {(\frac{θ}{α})}^{β}\} . \end{matrix}
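
As a rough numerical check of Theorem 2.1 (a sketch, not part of the original derivation), the closed form can be compared with a Monte Carlo average of −ln g(X). The sketch takes C to be Euler's constant, as in the cited integral formula. The function names, seed, sample size and parameter values are assumptions.

```python
import math
import random

EULER_C = 0.5772156649015329  # Euler's constant, the C of Theorem 2.1


def wtexpd_entropy(theta, alpha, beta):
    """Closed-form Shannon entropy of Theorem 2.1 (it does not involve tau)."""
    rate = theta / alpha
    return (
        1.0
        + (beta - 1.0) / beta * (EULER_C + beta * math.log(rate))
        - math.log(beta * rate ** beta)
    )


def wtexpd_log_pdf(x, tau, theta, alpha, beta):
    """Logarithm of the density in Eq (11), for x > tau."""
    rate = theta / alpha
    y = x - tau
    return (
        math.log(beta) + beta * math.log(rate)
        + (beta - 1.0) * math.log(y) - (rate * y) ** beta
    )


def monte_carlo_entropy(n, tau, theta, alpha, beta, seed=0):
    """Average of -ln g(X) over n inverse-transform draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = (alpha / theta) * (-math.log(1.0 - rng.random())) ** (1.0 / beta)
        total -= wtexpd_log_pdf(tau + y, tau, theta, alpha, beta)
    return total / n


# Illustrative parameter values (assumed, not from the paper).
print(wtexpd_entropy(2.0, 0.5, 1.5))
print(monte_carlo_entropy(20000, 1.0, 2.0, 0.5, 1.5))
```

The two printed values should be close, with the gap shrinking as the number of draws grows.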

Theorem 2.2. The pdf f_r;n (x) of the r^th order statistic of a random sample of size n from the W-TEXPD is given by

\begin{matrix} f_{r; n} (x) = C_{r; n} β {(\frac{θ}{α})}^{β} {(x - τ)}^{β - 1} \{F (x)\}^{r - 1} \{1 - F (x)\}^{n - r + 1}, \end{matrix}

where C_{r;n} = n!/{(r − 1)!(n − r)!}.

Proof. By definition

\begin{matrix} f_{r; n} (x) = C_{r; n} g (x) \{G (x)\}^{r - 1} \{1 - G (x)\}^{n - r}, \end{matrix}

\begin{matrix} f_{r; n} (x) = C_{r; n} f (x) \{F (x)\}^{r - 1} \{1 - F (x)\}^{n - r}, \end{matrix}

(26)

where

\begin{matrix} f (x) = β {(\frac{θ}{α})}^{β} {(x - τ)}^{β - 1} e^{- \{\frac{θ (x - τ)}{α}\}^{β}}, \\ F (x) = 1 - e^{- \{\frac{θ (x - τ)}{α}\}^{β}} . \end{matrix}

Since f(x) = β(θ/α)^β (x − τ)^{β − 1} {1 − F(x)}, Eq (26) becomes

\begin{matrix} f_{r; n} (x) = C_{r; n} β {(\frac{θ}{α})}^{β} {(x - τ)}^{β - 1} \{F (x)\}^{r - 1} \{1 - F (x)\}^{n - r + 1} . \end{matrix}

(27)

3 Estimation of model parameters by using Maximum Likelihood (ML) method

In this Section, we estimate the unknown parameters of W-TEXPD by applying maximum likelihood estimation method as defined by [31]. The log-likelihood function of W-TEXPD is given by:

\begin{matrix} \ln L (τ, θ, α, β; x) = \ln \prod_{i = 1}^{n} [β {(\frac{θ}{α})}^{β} {(x_{i} - τ)}^{β - 1} e^{- \{\frac{θ (x_{i} - τ)}{α}\}^{β}}], \\ \ln L (τ, θ, α, β; x) = [\begin{matrix} n \ln β + n β \ln θ - n β \ln α + β \sum_{i = 1}^{n} \ln (x_{i} - τ) - \sum_{i = 1}^{n} \ln (x_{i} - τ) \\ - {(\frac{θ}{α})}^{β} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} \end{matrix}] . \end{matrix}

(28)

The location parameter τ is estimated by the sample minimum, and the first partial derivatives of (28) with respect to θ, α and β are equated to zero, which gives

\begin{matrix} \hat{τ} = \min (x_{i}), i = 1, 2, \ldots, n, \end{matrix}

(29)

\begin{matrix} \frac{\partial ln L (τ, θ, α, β; x)}{\partial θ} = \frac{n β}{θ} - \frac{β θ^{β - 1}}{α^{β}} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} = 0, \end{matrix}

(30)

\begin{matrix} \frac{\partial ln L (τ, θ, α, β; x)}{\partial α} = β θ^{β} α^{- (β + 1)} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} - \frac{n β}{α} = 0, \end{matrix}

(31)

\begin{matrix} \frac{\partial ln L (τ, θ, α, β; x)}{\partial β} = [\begin{matrix} \frac{n}{β} + n ln θ - n ln α + \sum_{i = 1}^{n} ln (x_{i} - τ) - {(\frac{θ}{α})}^{β} ln (\frac{θ}{α}) \sum_{i = 1}^{n} (x_{i} - τ)^{β} \\ - {(\frac{θ}{α})}^{β} \sum_{i = 1}^{n} (x_{i} - τ)^{β} ln (x_{i} - τ) \end{matrix}] = 0, \end{matrix}

(32)

respectively. Since Eqs (30) to (32) have no closed-form solution, we use the well-known Newton-Raphson iterative method to obtain approximate ML estimates of the parameters θ, α and β.
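
The following is a minimal sketch of such an iteration, written in Python rather than the R used in the paper and resting on several stated assumptions. First, τ is treated as known and strictly below every observation, so that ln(x_i − τ) is finite. Second, the log-likelihood (28) involves θ and α only through the ratio θ/α, so the sketch returns that ratio instead of separate values. Third, instead of a three-dimensional Newton-Raphson it solves Eq (30) for (θ/α)^β = n / Σ(x_i − τ)^β, substitutes this into Eq (32), and applies a one-dimensional Newton-Raphson iteration to the resulting equation in β. The function name `fit_wtexpd`, the seed and the parameter values are hypothetical.

```python
import math
import random


def fit_wtexpd(x, tau, beta0=1.0, tol=1e-10, max_iter=200):
    """Profile-likelihood ML fit for known tau; returns (beta_hat, ratio_hat).

    ratio_hat estimates theta / alpha, the only combination of the two scale
    parameters that appears in the log-likelihood (28).
    """
    y = [xi - tau for xi in x]
    if min(y) <= 0.0:
        raise ValueError("tau must lie strictly below every observation")
    n = len(y)
    logs = [math.log(v) for v in y]
    mean_log = sum(logs) / n
    beta = beta0
    for _ in range(max_iter):
        w = [v ** beta for v in y]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        # Eq (32) divided by n after substituting Eq (30).
        score = 1.0 / beta + mean_log - s1 / s0
        slope = -1.0 / beta ** 2 - (s2 * s0 - s1 ** 2) / s0 ** 2
        new_beta = beta - score / slope  # Newton-Raphson step
        if new_beta <= 0.0:  # keep the shape parameter positive
            new_beta = beta / 2.0
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    ratio = (n / sum(v ** beta for v in y)) ** (1.0 / beta)  # from Eq (30)
    return beta, ratio


# Illustrative check on simulated data (all values assumed).
rng = random.Random(2021)
tau, theta, alpha, beta = 0.5, 1.0, 2.0, 1.5
sample = [
    tau + (alpha / theta) * (-math.log(1.0 - rng.random())) ** (1.0 / beta)
    for _ in range(2000)
]
beta_hat, ratio_hat = fit_wtexpd(sample, tau)
print(beta_hat, ratio_hat)
```

Given a value for one of the scale parameters, the other follows from the estimated ratio, for example α̂ = θ / ratio_hat when θ is held fixed.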

3.1 Asymptotic confidence bounds

It is observed that ML estimates of the unknown parameters θ, α, β of W-TEXPD are not in closed forms. In this situation, we compute the asymptotic confidence bounds of W-TEXPD based on the asymptotic distribution of the MLE.

The Fisher Information matrix can be used for interval estimation and hypothesis testing. For W-TEXPD, the Information matrix is obtained from the second partial derivatives of the log-likelihood (28), i.e. the first partial derivatives of Eqs (30) to (32), as:

\begin{matrix} I_{n} = (\begin{matrix} I_{α α} & I_{α β} & I_{α θ} \\ I_{β α} & I_{β β} & I_{β θ} \\ I_{θ α} & I_{θ β} & I_{θ θ} \end{matrix}), \end{matrix}

the entries of Fisher Information matrix of W-TEXPD are:

\begin{matrix} I_{α α} = \frac{\partial^{2} ln L (τ, θ, α, β; x)}{\partial α^{2}} = \frac{n β}{α^{2}} - \frac{β (β + 1) θ^{β}}{α^{β + 2}} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} . \end{matrix}

(33)

\begin{matrix} I_{β β} = \frac{\partial^{2} \ln L (τ, θ, α, β; x)}{\partial β^{2}} = [\begin{matrix} - \frac{n}{β^{2}} - {(\frac{θ}{α})}^{β} \{\ln (\frac{θ}{α})\}^{2} \sum_{i = 1}^{n} (x_{i} - τ)^{β} \\ - 2 {(\frac{θ}{α})}^{β} \ln (\frac{θ}{α}) \sum_{i = 1}^{n} (x_{i} - τ)^{β} \ln (x_{i} - τ) \\ - {(\frac{θ}{α})}^{β} \sum_{i = 1}^{n} (x_{i} - τ)^{β} \{\ln (x_{i} - τ)\}^{2} \end{matrix}] . \end{matrix}

(34)

\begin{matrix} I_{θ θ} = \frac{\partial^{2} ln L (τ, θ, α, β; x)}{\partial θ^{2}} = - \frac{n β}{θ^{2}} - \frac{β (β - 1) θ^{β - 2}}{α^{β}} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} . \end{matrix}

(35)

\begin{matrix} I_{θ α} = \frac{\partial^{2} \ln L (τ, θ, α, β; x)}{\partial θ \partial α} = \frac{β^{2} θ^{β - 1}}{α^{β + 1}} \sum_{i = 1}^{n} {(x_{i} - τ)}^{β} . \end{matrix}

(36)

\begin{matrix} I_{θ β} = \frac{\partial^{2} \ln L (τ, θ, α, β; x)}{\partial θ \partial β} = \frac{n}{θ} - [\begin{matrix} θ^{β - 1} α^{- β} \sum_{i = 1}^{n} (x_{i} - τ)^{β} + β θ^{β - 1} α^{- β} \ln (\frac{θ}{α}) \sum_{i = 1}^{n} (x_{i} - τ)^{β} \\ + β θ^{β - 1} α^{- β} \sum_{i = 1}^{n} (x_{i} - τ)^{β} \ln (x_{i} - τ) \end{matrix}] . \end{matrix}

(37)

\begin{matrix} I_{α β} = \frac{\partial^{2} \ln L (τ, θ, α, β; x)}{\partial α \partial β} = [\begin{matrix} θ^{β} α^{- (β + 1)} \sum_{i = 1}^{n} (x_{i} - τ)^{β} + β θ^{β} α^{- (β + 1)} \ln (\frac{θ}{α}) \sum_{i = 1}^{n} (x_{i} - τ)^{β} \\ + β θ^{β} α^{- (β + 1)} \sum_{i = 1}^{n} (x_{i} - τ)^{β} \ln (x_{i} - τ) \end{matrix}] - \frac{n}{α} . \end{matrix}

(38)

The asymptotic confidence intervals are obtained by using either the approximate normal distribution or the approximate log-normal distribution of the ML estimates $\hat{g} = (\hat{α}, \hat{β}, \hat{θ})$ . The estimated standard errors of $\hat{α}, \hat{β}$ and $\hat{θ}$ are expressed as:

\begin{matrix} σ (\hat{α}, \hat{β}, \hat{θ}) = \sqrt{Σ_{j j}}, \text{ where } Σ = {[I_{n}]}^{- 1} . \end{matrix}

For instance, the expressions for (1 − ξ)100% confidence interval of α calculated by using the approximate normal distribution and log-normal distribution are

\begin{matrix} \hat{α} \pm δ_{ξ / 2} σ (\hat{α}) . \end{matrix}

(39)

and

\begin{matrix} \hat{α} exp {\pm δ_{ξ / 2} σ (\hat{α}) / \hat{α}}, \end{matrix}

\begin{matrix} \hat{α} e^{- δ_{ξ / 2} σ (\hat{α}) / \hat{α}} \leq α \leq \hat{α} e^{δ_{ξ / 2} σ (\hat{α}) / \hat{α}}, \end{matrix}

(40)

respectively, where δ_ξ/2 is the upper ξ/2 percentile, i.e. the 100(1 − ξ/2)th percentile, of the standard normal distribution. The log-normal approximation works well if the standard error of parameters is greater than half of their point estimate.
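
The two interval formulas in Eqs (39) and (40) can be sketched as follows. The function names are hypothetical, and the inputs 0.145 and 0.070 are the α̂ and SE reported for W-TEXPD in the model-comparison table later in the text, used here only as example numbers.

```python
import math
from statistics import NormalDist


def normal_interval(estimate, std_error, level=0.95):
    """Eq (39): estimate +/- z * SE, with z the upper (1 - level)/2 normal point."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def lognormal_interval(estimate, std_error, level=0.95):
    """Eq (40): estimate * exp(+/- z * SE / estimate); the bounds stay positive."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    factor = math.exp(z * std_error / estimate)
    return estimate / factor, estimate * factor


print(normal_interval(0.145, 0.070))
print(lognormal_interval(0.145, 0.070))
```

The log-normal form is multiplicative, so its lower bound cannot fall below zero, which is convenient for positive scale and shape parameters.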

4 Simulation study

Randomness and uncertainty are the core features of probability, and randomness arises in every field of life. Simulation imitates the realization of a random experiment: pseudo-random values, which are deterministic given the seed, are generated from an appropriate model designed on the basis of the random experiment. The simplest such model is a probability distribution that describes the real mechanism producing the values of some quantity of interest.

Here, we carry out Monte Carlo simulation studies in R to assess the performance of the maximum likelihood estimates (MLEs). Each simulation consists of 1000 replications, and in each replication a random sample of size n is drawn from the W-TEXPD (α, β, θ). The model parameters are estimated by the maximum likelihood method.

Table 1 presents the average point estimates of three parameters with standard errors (SEs), bias and mean square errors (MSEs) for the sample sizes 20, 50, 100 and 200. A fixed seed is used to generate such random numbers implying that all results of these studies can always be exactly replicated.
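
The design described above can be sketched as follows. This is an illustration in Python rather than the authors' R code, and it will not reproduce Table 1. It rests on these assumptions: τ is known (set to 0), θ is held at its true value so that only α and β are estimated (the likelihood involves θ and α only through θ/α), β is found by a one-dimensional Newton-Raphson iteration on the profile score, and 100 replications are used instead of 1000 to keep the run short. The function names and the seed are hypothetical; the true values α = 2.5, β = 1.0, θ = 2.0 are those of the last block of Table 1.

```python
import math
import random


def draw_sample(n, tau, theta, alpha, beta, rng):
    """Inverse-transform draws from the W-TEXPD."""
    scale = alpha / theta
    return [tau + scale * (-math.log(1.0 - rng.random())) ** (1.0 / beta) for _ in range(n)]


def fit_shape_scale(y, tol=1e-8, max_iter=200):
    """Newton-Raphson on the profile score for beta; returns (beta, alpha/theta)."""
    n = len(y)
    logs = [math.log(v) for v in y]
    mean_log = sum(logs) / n
    beta = 1.0
    for _ in range(max_iter):
        w = [v ** beta for v in y]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        score = 1.0 / beta + mean_log - s1 / s0
        slope = -1.0 / beta ** 2 - (s2 * s0 - s1 ** 2) / s0 ** 2
        new_beta = beta - score / slope
        if new_beta <= 0.0:  # keep the shape parameter positive
            new_beta = beta / 2.0
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    scale = (sum(v ** beta for v in y) / n) ** (1.0 / beta)
    return beta, scale


def simulate(n, reps, tau, theta, alpha, beta, seed=2021):
    """Average estimate, bias and MSE of the alpha and beta estimates."""
    rng = random.Random(seed)  # fixed seed, so the run is reproducible
    alpha_hats, beta_hats = [], []
    for _ in range(reps):
        x = draw_sample(n, tau, theta, alpha, beta, rng)
        b, s = fit_shape_scale([xi - tau for xi in x])
        beta_hats.append(b)
        alpha_hats.append(theta * s)  # theta held at its true value

    def summary(values, truth):
        mean = sum(values) / len(values)
        mse = sum((v - truth) ** 2 for v in values) / len(values)
        return {"mean": mean, "bias": mean - truth, "mse": mse}

    return {"alpha": summary(alpha_hats, alpha), "beta": summary(beta_hats, beta)}


for n in (20, 50, 100, 200):
    print(n, simulate(n, 100, 0.0, 2.0, 2.5, 1.0))
```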

Table 1. Average estimated values, corresponding SEs (given in parentheses) bias and MSE of model parameters.

Actual	n	Average Estimate (S.E)			Bias			MSE
Actual	n	$\hat{α}$	$\hat{β}$	$\hat{θ}$	$\hat{α}$	$\hat{β}$	$\hat{θ}$	$\hat{α}$	$\hat{β}$	$\hat{θ}$
α = 0.1 β = 0.5 θ = 2.0	20	0.08583 (0.00021)	0.66002 (0.00208)	1.99978 (0.00003)	-0.01417	0.16001	-0.00001	0.00007	0.02991	0.00001
	50	0.09253 (0.00005)	0.72130 (0.00116)	1.99985 (0.00001)	-0.00747	0.22130	-0.00014	0.00005	0.05033	0.00000
	100	0.09603 (0.00002)	0.73558 (0.00103)	1.99999 (0.00004)	-0.00397	0.23558	-0.00007	0.00001	0.05655	0.00001
	200	0.09793 (0.00001)	0.74491 (0.00110)	1.99991 (0.00001)	-0.00207	0.24492	-0.00004	0.00004	0.06119	0.00001
α = 0.1 β = 1.0 θ = 2.0	20	0.09378 (0.00034)	1.03364 (0.00171)	2.00001 (0.00003)	-0.00622	0.03364	0.00001	0.00016	0.00404	0.00001
	50	0.09842 (0.00011)	1.02537 (0.00159)	2.00006 (0.00009)	-0.00158	0.02537	0.00006	0.00001	0.00317	0.00000
	100	0.09942 (0.00003)	1.02011 (0.00125)	2.00000 (0.00002)	-0.00057	0.02011	0.00000	0.00001	0.00196	0.00000
	200	0.09979 (0.00001)	1.01700 (0.00102)	2.00000 (0.00000)	-0.00002	0.01700	0.00000	0.00000	0.00133	0.00000
α = 0.1 β = 1.0 θ = 5.0	20	0.09399 (0.00032)	1.03973 (0.00196)	5.00016 (0.00001)	-0.00060	0.03973	0.00016	0.00014	0.00542	0.00001
	50	0.09857 (0.00001)	1.02870 (0.00171)	5.00001 (0.00000)	-0.00142	0.02871	0.00001	0.00001	0.00377	0.00000
	100	0.09950 (0.00003)	1.02194 (0.00130)	5.00000 (0.00000)	-0.00005	0.02194	0.00000	0.00001	0.00218	0.00000
	200	0.09980 (0.00001)	1.01827 (0.00105)	5.00000 (0.00000)	-0.00002	0.01827	0.00000	0.00000	0.00144	0.00000
α = 2.5 β = 1.0 θ = 2.0	20	2.50850 (0.00301)	1.00439 (0.00096)	2.00047 (0.00031)	0.00850	0.00439	0.00047	0.00915	0.00095	0.00010
	50	2.50775 (0.00126)	1.01231 (0.00132)	2.00001 (0.00000)	0.00775	0.01231	0.00001	0.00166	0.00189	0.00000
	100	2.50491 (0.00079)	1.01451 (0.00130)	2.00000 (0.00000)	0.00491	0.01451	0.00000	0.00065	0.00191	0.00000
	200	2.50192 (0.00015)	1.01335 (0.00097)	2.00000 (0.00000)	0.00192	0.01335	0.00000	0.00002	0.00113	0.00000

Model	Estimates					Statistic
Model	$\hat{l}$	$\hat{a}$	$\hat{α}$	$\hat{β}$	$\hat{θ}$	AIC	BIC
W-TEXPD	-153.67	1.00	0.145 (0.070)	0.738 (0.105)	0.003 (0.001)	313.35	317.65
Weibull	-156.10	—	52.883 (11.853)	0.848 (0.116)	—	316.21	319.079
Gamma	-156.39	—	0.014 (0.004)	0.807 (0.176)	—	316.77	319.64
Exponential	-156.89	—	0.017 (0.003)	—	—	315.78	317.22
TEXPD	-156.353	1.00	0.018 (0.003)	—	—	314.71	316.32

Model	W-TEXPD	Weibull	Gamma	Exponential	TEXPD
K-Smirnov	0.140	0.159	0.176	0.221	0.228
C-Von	0.091	0.118	0.143	0.253	0.284
A-Darling	0.633	0.659	0.783	1.359	1.734

Model	W-TEXPD	Weibull	Gamma	Exponential	TEXPD
K-Smirnov	0.109	0.151	0.123	0.307	0.233
C-Von	0.034	0.582	0.039	0.537	0.240
A-Darling	0.240	0.329	0.216	2.814	1.272

Model	W-TEXPD	Weibull	Gamma	Exponential	TEXPD
K-Smirnov	0.088	0.092	0.097	0.089	0.105
C-Von	0.025	0.043	0.051	0.041	0.061
A-Darling	0.220	0.282	0.313	0.272	0.486

Model	W-TEXPD	Weibull	Gamma	Exponential	TEXPD
K-Smirnov	0.224	0.190	0.205	0.189	0.315
C-Von	0.422	0.343	0.398	0.341	1.041
A-Darling	3.029	2.055	2.313	2.049	2.165

Fig 1. PDF of W-TEXPD for different values of α, β and θ.

Fig 2. h(t) of W-TEXPD for different values of α, β and θ.

5 Real life application

5.1 Application 1: Aeronautical engineering

Table 2. Descriptive statistics of failure times of 31 air conditioners of airplane.

Table 3. Negative log-likelihood values ($\hat{ℓ}$), MLEs of model parameters, the corresponding SEs (given in parentheses) and the statistics AIC and BIC of failure times of 31 air conditioning system of airplane.

Table 4. Goodness of fit statistic of failure times of 31 air conditioners of airplane.

Fig 3. The fitted pdf of W-TEXPD on the histogram of failure times of 31 air conditioning system of airplane along with their cdf, Q-Q and probability plots.

5.2 Application 2: Electrical engineering

Table 5. Descriptive statistics of failure times of 50 components (per 1000 hours).

Table 6. Negative log-likelihood values ($\hat{ℓ}$), MLEs of model parameters, the corresponding SEs (given in parentheses) and the statistics AIC and BIC of failure time of 50 components (per 1000 hours).

Table 7. Goodness of fit statistic of failure time of 50 components (per 1000 hours).

Fig 4. The fitted pdf of W-TEXPD on the histogram of failure times of 50 components (per 1000 hours) along with their cdf, Q-Q and probability plots.

5.3 Application 3: Mechanical engineering

Table 8. Descriptive statistics of ball bearing data.

Table 9. Negative log-likelihood values, MLEs of the model parameters, the corresponding SEs (in parentheses) along with AIC and BIC values.

Table 10. Goodness of fit statistic.

Fig 5. Estimated pdf and P-P plot of W-TEXPD, Weibull, Gamma, Exponential and TEXPD distributions.

5.4 Application 4: Bio-chemical engineering

Table 11. Descriptive statistics of Vinyl chloride data (μg/L).

Table 12. Negative log-likelihood values, MLEs of model parameters, the corresponding SEs (in parentheses) along with the AIC and BIC values.

Table 13. Goodness of fit statistic.

Fig 6. Estimated pdf and P-P plot of W-TEXPD, Weibull, Gamma, Exponential and TEXPD models.

5.5 Application 5: Environmental sciences

Table 14. Descriptive statistics of 40 losses due to wind-related catastrophes.

Table 15. Negative log-likelihood values ($\hat{ℓ}$), MLEs of model parameters, the corresponding SEs (given in parentheses) along with the AIC and BIC values for 40 losses due to wind-related catastrophes.

Table 16. Goodness of fit statistic of 40 losses due to wind-related catastrophes.

Fig 7. The fitted pdf of W-TEXPD on the histogram of 40 losses due to wind-related catastrophes along with their cdf, Q-Q and probability plots.
