Heliyon. 2024 Mar 13;10(6):e27376. doi: 10.1016/j.heliyon.2024.e27376

Application of Odd Lomax log-logistic distribution to cancer data

Benson Benedicto Kailembo 1, Srinivasa Rao Gadde 1, Peter Josephat Kirigiti 1
PMCID: PMC10955245  PMID: 38515696

Abstract

This article increases the flexibility of a parent distribution so that it can capture more of the characteristics of real-world data sets. This is accomplished by generalizing the parent distribution through the T-X class of distributions. The resulting distribution is called the odd Lomax log-logistic distribution (OLLLD). Its fundamental statistical properties are derived explicitly. The unknown OLLLD parameters are estimated by maximum likelihood, and the performance of this estimator is investigated on simulated data. Finally, the OLLLD is fitted to a real data set of breast cancer survival times.

Keywords: Maximum likelihood estimation, Odd Lomax log-logistic distribution, Statistical properties, Quantile function, Moments, Breast cancer

MSC: 62F10, 62F40, 62F86, 62P05

1. Introduction

The choice of a probability distribution to model a real-world phenomenon may depend on how tractable or adaptable the distribution is [1]. In practice, rather than altering the data set, which could compromise its integrity, it is better to use a distribution that can capture all features of the given data [2].

Among the classical distributions, this article concentrates on the log-logistic distribution because of its wide applicability. It is frequently used to model lifetime data and other durations in a variety of domains, including survival analysis. It can also be used in reliability engineering to examine failure times and in finance to evaluate extreme values and risk [3].

Despite its applicability, the log-logistic distribution has limitations: limited flexibility, limited tail behavior, and sensitivity to outliers, which can have a significant impact on its fit to the data [4].

To overcome these limitations, various authors have proposed generalized versions of the log-logistic distribution. These generalizations were built from different classes or families of distributions, such as the beta family [5], the McDonald-generalized (Mc-G) class [6], the Marshall-Olkin transformation [7], the exponentiated generalized class [8], the Zografos-Balakrishnan-G (ZB-G) class [9], the odd generalized exponential class [10] and the inverse Lomax class [11].

Using these and other families, the generalized log-logistic (Fisk) distributions introduced so far include the Marshall-Olkin extended Fisk distribution [12], the exponentiated Fisk distribution [13], the McDonald Fisk distribution [14], the Kumaraswamy odd Fisk distribution [15], the Kumaraswamy Marshall-Olkin Fisk distribution [16], the Marshall-Olkin Lindley Fisk distribution [17], the beta Fisk distribution [3] and the inverse Lomax Fisk distribution [18].

In terms of flexibility, these generalized distributions proved superior to the parent (log-logistic) distribution, especially for data that are more skewed than the normal distribution [19].

The aim of this article is to introduce another generalized log-logistic distribution using the T-X family of distributions, with the Lomax distribution as the generator and the log-logistic distribution as the baseline, in order to enhance the parent (log-logistic) distribution.

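The log-logistic (Fisk) distribution has two parameters, a shape parameter a and a scale parameter b. Its pdf and cdf are

$$f(x;b,a)=\frac{\dfrac{a}{b}\left(\dfrac{x}{b}\right)^{a-1}}{\left(1+\left(\dfrac{x}{b}\right)^{a}\right)^{2}},\quad x>0,\ a,b>0 \tag{1}$$

$$F(x;b,a)=\frac{\left(\dfrac{x}{b}\right)^{a}}{1+\left(\dfrac{x}{b}\right)^{a}},\quad x>0,\ a,b>0 \tag{2}$$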

The rest of this article is organized as follows. Section 2 introduces the modified Fisk distribution together with its graphical depiction. Section 3 derives the properties of the new generalized model. Section 4 presents the maximum likelihood estimator of the unknown parameters and a simulation study. Section 5 applies the new model to a real data set, and Section 6 gives a summary and conclusion.

2. The odd Lomax log-logistic distribution

Because the log-logistic distribution has the limitations mentioned above, this study generalizes it using the Lomax distribution as a generator within the T-X family of distributions. The Lomax distribution is chosen because it is naturally heavier tailed than the gamma, Weibull and exponential distributions [20]. Through its additional shape parameter it contributes tail behavior to the parent distribution, giving a distribution that is heavier tailed and flexible enough to capture the features of real data sets. The T-X family is used because it is more flexible than the families previously used to generalize the log-logistic distribution and has been reported to give better results [21].

2.1. Definition

Numerous generalized classes of distributions have been created and used to describe a wide range of phenomena. The properties of the T-X family of distributions were studied in Ref. [22], which examined subfamilies that share the same T distribution but have different X distributions. Here, we generate a family of Lomax-X distributions by taking T to be a Lomax random variable, and we take the log-logistic distribution as the baseline through its odds function. The new model is therefore an odd Lomax log-logistic distribution. With the Lomax generator, which contributes an extra shape parameter c and an extra scale parameter d, the cdf and pdf of the T-X class are

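$$J(x;c,d,\varphi)=\int_{0}^{A(x;\varphi)}\frac{c}{d}\left[1+\frac{s}{d}\right]^{-(c+1)}ds=1-\left[1+\frac{1}{d}A(x;\varphi)\right]^{-c} \tag{3}$$

$$f(x;c,d,\varphi)=\frac{d}{dx}J(x;c,d,\varphi)=c\left[1+\frac{1}{d}A(x;\varphi)\right]^{-(c+1)}\times\frac{d}{dx}\left(1+\frac{1}{d}A(x;\varphi)\right) \tag{4}$$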

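Here φ = (a, b) is the parameter vector of the parent distribution, and A(x;φ) is the odds function of the baseline cdf in Equation (2), written J(x;a,b) below, with complement \bar{J}(x;a,b) = 1 − J(x;a,b):

$$A(x;\varphi)=\frac{J(x;a,b)}{\bar{J}(x;a,b)}=\frac{J(x;a,b)}{1-J(x;a,b)}=\frac{\dfrac{x^{a}}{b^{a}+x^{a}}}{1-\dfrac{x^{a}}{b^{a}+x^{a}}}=\frac{x^{a}}{b^{a}}=\left(\frac{x}{b}\right)^{a},\quad x,a,b>0 \tag{5}$$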

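Substituting Equation (5) into Equations (3) and (4) and simplifying gives the cdf and the corresponding pdf of the OLLLD:

$$J(x;c,d,a,b)=1-\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-c},\quad x,c,d,a,b>0 \tag{6}$$

$$f(x;c,d,a,b)=\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-(c+1)},\quad x,c,d,a,b>0 \tag{7}$$

where x is a value of the OLLLD random variable X.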
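As an illustration of Equations (6) and (7), the following minimal Python sketch evaluates the OLLLD cdf and pdf and checks that the pdf agrees with a finite-difference derivative of the cdf. The function names `olll_cdf` and `olll_pdf` and the parameter values are assumptions made for this example only; they do not come from the article.

```python
def olll_cdf(x, a, b, c, d):
    """OLLLD cdf, Equation (6): 1 - (1 + (x/b)^a / d)^(-c) for x > 0."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / b) ** a / d) ** (-c)


def olll_pdf(x, a, b, c, d):
    """OLLLD pdf, Equation (7); returns 0 outside the support x > 0."""
    if x <= 0:
        return 0.0
    z = (x / b) ** a / d
    return (a * c / (b * d)) * (x / b) ** (a - 1) * (1.0 + z) ** (-(c + 1))


# Illustrative (hypothetical) parameter values.
a, b, c, d = 2.0, 2.5, 3.0, 2.0
x, h = 1.7, 1e-6

# Central finite difference of the cdf; it should agree with the pdf.
numeric_pdf = (olll_cdf(x + h, a, b, c, d) - olll_cdf(x - h, a, b, c, d)) / (2 * h)
print(abs(numeric_pdf - olll_pdf(x, a, b, c, d)) < 1e-6)
```

Setting c = d = 1 in `olll_cdf` recovers the log-logistic cdf of Equation (2), which gives a quick sanity check of the construction.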

2.2. Validity of the model

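For a continuous probability distribution to be valid, its pdf must integrate to one over the whole real line, that is,

$$\int_{-\infty}^{\infty}f(x)\,dx=1 \tag{8}$$

Proof.

Substituting Equation (7) into Equation (8), and noting that the support is x > 0, yields

$$\int_{0}^{\infty}f(x;c,d,a,b)\,dx=\int_{0}^{\infty}\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-(c+1)}dx \tag{9}$$

$$\text{Let } k=\frac{1}{d}\left(\frac{x}{b}\right)^{a}\ \Rightarrow\ dk=\frac{a}{bd}\left(\frac{x}{b}\right)^{a-1}dx \tag{10}$$

Putting Equation (10) into Equation (9) gives

$$\int_{0}^{\infty}f(x;c,d,a,b)\,dx=\int_{0}^{\infty}c\,(1+k)^{-(c+1)}\,dk=\left[-(1+k)^{-c}\right]_{0}^{\infty}=0+1=1$$

$$\text{Therefore } \int_{0}^{\infty}f(x;c,d,a,b)\,dx=1 \tag{11}$$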

2.3. Sketches of cdf along with pdf for OLLLD

The following sketches show the cdf and pdf curves of the OLLLD for different parameter values.

Fig. 1 shows that the support of the original distributions (log-logistic and Lomax), which extends from zero to positive infinity, is retained. The cdf of the modified distribution also takes different shapes as the parameter values change, which indicates that the modified model is more flexible than the original distribution.

Fig. 1. Plots of cdf curves of the OLLL distribution for different parameter values.

Fig. 2 shows that all pdf curves, for the different parameter values considered, are positively (right) skewed. The pdf is leptokurtic when a > 3 and c > 3 at b = 2.5 and d = 2, and platykurtic when a < 3 and c < 3 at b = 2.5 and d = 2. Fig. 2 also shows that the OLLLD is more flexible than the parent distribution: adjusting the parameter values produces different shapes, tail behaviors and central tendencies.

Fig. 2. The pdf curves of the OLLL distribution for different parameter values.

3. Properties of OLLLD

The statistical features of OLLLD are deduced in this section.

3.1. Quantile function of OLLLD

The quantile function returns the value associated with a given probability. It can also be used to generate random numbers from the distribution. The quantile function of the OLLLD is obtained by inverting the cdf in Equation (6):

$$Q(q)=J^{-1}(q) \tag{12}$$

$$J(x)=q,\quad q\in(0,1)\ \Rightarrow\ Q(q)=x$$

$$x=b\,d^{\frac{1}{a}}\left[(1-q)^{-\frac{1}{c}}-1\right]^{\frac{1}{a}}$$

$$Q(q)=x=b\,d^{\frac{1}{a}}\left[(1-q)^{-\frac{1}{c}}-1\right]^{\frac{1}{a}},\quad 0<q<1 \text{ and } c,d,a,b>0 \tag{13}$$

where q is the probability level and a, b, c and d are the parameters of the distribution.

The median is obtained by setting q = 0.5:

$$Q\!\left(\tfrac{1}{2}\right)=x=b\,d^{\frac{1}{a}}\left[\left(\tfrac{1}{2}\right)^{-\frac{1}{c}}-1\right]^{\frac{1}{a}},\quad c,d,a,b>0 \tag{14}$$

Quantile-based measures of skewness and kurtosis are also obtained from this quantile function, as follows.

Bowley's quartile-based measure of skewness is

$$\text{skw}=\frac{Q_{3}-2\,(\text{median})+Q_{1}}{Q_{3}-Q_{1}} \tag{15}$$

where Q_1 = Q(1/4) and Q_3 = Q(3/4). Moors' octile-based kurtosis is

$$\text{kurt}=\frac{Q\!\left(\tfrac{7}{8}\right)-Q\!\left(\tfrac{5}{8}\right)+Q\!\left(\tfrac{3}{8}\right)-Q\!\left(\tfrac{1}{8}\right)}{Q\!\left(\tfrac{6}{8}\right)-Q\!\left(\tfrac{2}{8}\right)} \tag{16}$$
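The quantile function of Equation (13) and the quantile-based measures can be sketched in Python as below. This is a minimal illustration, not the authors' R code. The function names are hypothetical, the kurtosis uses the standard Moors form (E7 − E5 + E3 − E1)/(E6 − E2), and sampling relies on the inverse-transform method, the use the text describes for the quantile function.

```python
import random


def olll_quantile(q, a, b, c, d):
    """Quantile function of Equation (13), accepting 0 <= q < 1."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must lie in [0, 1)")
    return b * (d * ((1.0 - q) ** (-1.0 / c) - 1.0)) ** (1.0 / a)


def bowley_skewness(a, b, c, d):
    """Quartile-based skewness, Equation (15)."""
    q1, q2, q3 = (olll_quantile(p, a, b, c, d) for p in (0.25, 0.5, 0.75))
    return (q3 - 2.0 * q2 + q1) / (q3 - q1)


def moors_kurtosis(a, b, c, d):
    """Octile-based kurtosis in the standard Moors form."""
    e = [olll_quantile(i / 8.0, a, b, c, d) for i in range(1, 8)]  # E1..E7
    return (e[6] - e[4] + e[2] - e[0]) / (e[5] - e[1])


def olll_sample(n, a, b, c, d, rng):
    """Inverse-transform sampling: apply Q to uniform draws on [0, 1)."""
    return [olll_quantile(rng.random(), a, b, c, d) for _ in range(n)]


# Hypothetical example: the parameter setting of the first row of Table 1.
params = (1.0, 1.0, 8.0, 1.0)
median = olll_quantile(0.5, *params)
draws = sorted(olll_sample(20001, *params, random.Random(0)))
print(round(median, 2), bowley_skewness(*params) > 0)
```

Because the quantile function is available in closed form, no numerical root finding is needed to simulate from the model.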

In Fig. 3, the three-dimensional plots show that both skewness and kurtosis increase progressively as the parameter values increase away from the origin, which indicates the flexibility of the model.

Fig. 3. Three-dimensional plots of skewness and kurtosis of the OLLLD for different parameter values.

3.2. Moments of OLLLD

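Let X be a random variable following the OLLLD. The r-th raw moment of the OLLLD is

$$\mu'_{r}=E(X^{r})=\int_{0}^{\infty}x^{r}f(x;\Theta)\,dx \tag{17}$$

Substituting Equation (7) into Equation (17) yields

$$\mu'_{r}=E(X^{r})=\int_{0}^{\infty}x^{r}\left[\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\left[1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right]^{-(c+1)}\right]dx \tag{18}$$

With the substitution k = (1/d)(x/b)^a, so that x = b(dk)^{1/a}, the integral reduces to a beta function, which gives, for r < ac,

$$\mu'_{r}=c\,b^{r}d^{\frac{r}{a}}\,\frac{\Gamma\!\left(\frac{r}{a}+1\right)\Gamma\!\left(c-\frac{r}{a}\right)}{\Gamma(1+c)},\qquad B(p,q)=\frac{\Gamma(p)\Gamma(q)}{\Gamma(p+q)} \tag{19}$$

The first four raw moments are therefore

$$\mu'_{1}=c\,b\,d^{\frac{1}{a}}\,\frac{\Gamma\!\left(\frac{1}{a}+1\right)\Gamma\!\left(c-\frac{1}{a}\right)}{\Gamma(1+c)} \tag{20}$$

$$\mu'_{2}=c\,b^{2}d^{\frac{2}{a}}\,\frac{\Gamma\!\left(\frac{2}{a}+1\right)\Gamma\!\left(c-\frac{2}{a}\right)}{\Gamma(1+c)} \tag{21}$$

$$\mu'_{3}=c\,b^{3}d^{\frac{3}{a}}\,\frac{\Gamma\!\left(\frac{3}{a}+1\right)\Gamma\!\left(c-\frac{3}{a}\right)}{\Gamma(1+c)} \tag{22}$$

$$\mu'_{4}=c\,b^{4}d^{\frac{4}{a}}\,\frac{\Gamma\!\left(\frac{4}{a}+1\right)\Gamma\!\left(c-\frac{4}{a}\right)}{\Gamma(1+c)} \tag{23}$$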

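Using the raw moments (about the origin) and the central moments (about the mean) through the relationships below, the mean, variance, coefficient of variation (CV), coefficient of skewness (CS) and coefficient of kurtosis (CK) of the OLLLD were calculated and are tabulated below.

$$\text{Mean}=\mu'_{1},\qquad \text{Variance}=\mu'_{2}-(\mu'_{1})^{2}=\mu_{2}=\sigma^{2},\qquad CV=\left(\frac{\sigma^{2}}{(\mu'_{1})^{2}}\right)^{0.5}$$

$$CS=E\left(\frac{X-\mu'_{1}}{\sigma}\right)^{3}=\frac{\mu_{3}}{\sigma^{3}}\quad\text{and}\quad CK=E\left(\frac{X-\mu'_{1}}{\sigma}\right)^{4}=\frac{\mu_{4}}{\sigma^{4}}$$

where the central moments μ_3 and μ_4 are obtained from the raw moments through

$$\mu_{3}=\mu'_{3}-3\mu'_{2}\mu'_{1}+2(\mu'_{1})^{3}\quad\text{and}\quad \mu_{4}=\mu'_{4}-4\mu'_{3}\mu'_{1}+6\mu'_{2}(\mu'_{1})^{2}-3(\mu'_{1})^{4}$$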
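The moment relationships above can be sketched in Python as follows. The closed form used for the raw moment is the beta-function result obtained from the substitution k = (1/d)(x/b)^a, namely c b^r d^(r/a) Γ(r/a + 1) Γ(c − r/a) / Γ(c + 1), which exists only for r < ac. The function names are assumptions made for this example. The sample call uses the first parameter setting of Table 1 (a = 1, b = 1, c = 8, d = 1).

```python
import math


def olll_raw_moment(r, a, b, c, d):
    """r-th raw moment of the OLLLD; finite only when r < a * c."""
    if r >= a * c:
        raise ValueError("the r-th moment exists only for r < a * c")
    # Work with log-gamma values to avoid overflow for large arguments.
    log_ratio = (math.lgamma(r / a + 1.0) + math.lgamma(c - r / a)
                 - math.lgamma(c + 1.0))
    return c * b ** r * d ** (r / a) * math.exp(log_ratio)


def olll_summary(a, b, c, d):
    """Mean, variance, CV, skewness (CS) and kurtosis (CK) from raw moments."""
    m1, m2, m3, m4 = (olll_raw_moment(r, a, b, c, d) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    sd = math.sqrt(var)
    mu3 = m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m3 * m1 + 6.0 * m2 * m1 ** 2 - 3.0 * m1 ** 4
    return {
        "mean": m1,
        "variance": var,
        "cv": sd / m1,
        "skewness": mu3 / sd ** 3,
        "kurtosis": mu4 / sd ** 4,
    }


summary = olll_summary(1.0, 1.0, 8.0, 1.0)
print(round(summary["mean"], 2), round(summary["cv"], 2))
```

The kurtosis requires the fourth moment, so `olll_summary` raises an error whenever ac ≤ 4.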

Table 1 shows that the mean is at least as large as the median for almost all of the parameter settings considered. This suggests that the distribution has a longer right tail and is positively skewed.

Table 1.

Mean, variance, median, CV, CS and CK of the OLLLD for selected parameter values.

a b c d Mean Variance Median CV CS CK
1.0 1.0 8.0 1.0 0.14 0.03 0.09 1.15 3.12 38.59
1.5 1.0 8.0 1.0 0.24 0.03 0.20 0.75 1.55 29.41
2.5 1.0 8.0 1.0 0.40 0.03 0.38 0.46 0.62 60.18
3.4 1.0 8.0 1.0 0.50 0.03 0.49 0.36 0.31 79.43
4.0 1.0 8.0 1.0 0.53 0.02 0.52 0.31 0.21 87.12
1.0 1.5 8.0 1.0 0.21 0.06 0.14 1.15 3.12 38.59
1.0 2.5 8.0 1.0 0.36 0.17 0.23 1.15 3.12 38.59
1.0 3.5 8.0 1.0 0.50 0.33 0.32 1.15 3.12 38.59
1.0 4.5 8.0 1.0 0.64 0.55 0.41 1.15 3.12 38.59
1.0 5.5 8.0 1.0 0.79 0.82 0.50 1.15 3.12 38.59
1.0 1.0 3.0 1.0 0.34 0.24 0.21 1.39 4.13 65.35
1.0 1.0 6.0 1.0 0.20 0.06 0.12 1.22 3.81 55.56
1.0 1.0 8.0 1.0 0.14 0.03 0.09 1.15 3.12 38.59
1.0 1.0 10 1.0 0.11 0.02 0.07 1.12 2.81 33.33
1.0 1.0 12 1.0 0.09 0.01 0.06 1.10 2.64 30.81
1.0 1.0 8.0 0.5 0.04 0.00 0.05 1.91 3.87 40.82
1.0 1.0 8.0 0.8 0.09 0.02 0.07 1.38 3.23 36.47
1.0 1.0 8.0 1.2 0.21 0.04 0.11 0.97 3.24 44.51
1.0 1.0 8.0 1.5 0.32 0.06 0.14 0.75 4.08 65.86
1.0 1.0 8.0 1.9 0.44 0.06 0.17 0.57 4.86 69.13

3.3. Conditional moments of OLLLD

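Conditional moments describe the features of a distribution under particular restrictions or conditions, which allows more focused study in a variety of domains. The r-th conditional moment of the OLLLD is

$$E(X^{r}\mid X>x)=\frac{1}{R(x)}\int_{x}^{\infty}t^{r}f(t;\Theta)\,dt \tag{24}$$

Alternatively, the conditional moments can be obtained from the quantile function, with q = J(x), as follows:

$$E(X^{r}\mid X>x)=\frac{1}{R(x)}\int_{q}^{1}[Q(u)]^{r}\,du \tag{25}$$

$$E(X^{r}\mid X>x)=\frac{b^{r}d^{\frac{r}{a}}(-1)^{\frac{r}{a}}}{R(x)}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{r}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-q^{s+1}\right] \tag{26}$$

$$E(X^{r}\mid X>x)=\frac{b^{r}d^{\frac{r}{a}}(-1)^{\frac{r}{a}}}{R(x)}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{r}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(x)]^{s+1}\right] \tag{27}$$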

3.4. Reliability study of OLLLD

The survival function gives the probability that a system or component is still working after time x. The survival function of the OLLLD is

$$R(x;c,d,a,b)=\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-c},\quad x,c,d,a,b>0 \tag{28}$$

Fig. 4 shows that the OLLLD survival function behaves as a survival function should: it decreases monotonically as x increases, for all parameter values considered.

Fig. 4. The survival curves of the OLLLD for some chosen parameter values.

The hazard function may change over time, reflecting shifting risk or failure rates. The hazard function of the OLLLD, h(x) = f(x)/R(x), is

$$h(x;c,d,a,b)=\frac{\dfrac{ac}{bd}\left(\dfrac{x}{b}\right)^{a-1}}{1+\dfrac{1}{d}\left(\dfrac{x}{b}\right)^{a}},\quad x>0 \text{ and } c,d,a,b>0 \tag{29}$$

Fig. 5 shows that, as the parameter values are increased or decreased, the hazard rate curves of the proposed distribution are almost always inverted bathtub-shaped, and a few are decreasing.

Fig. 5. The hazard curves of the OLLLD for some chosen parameter values.

The reversed hazard rate function is the ratio of the pdf to the cdf, r(x) = f(x)/J(x). It describes the instantaneous rate of failure at x given that failure has occurred by x. The reversed hazard rate function of the OLLLD is

$$r(x;c,d,a,b)=\frac{\dfrac{ac}{bd}\left(\dfrac{x}{b}\right)^{a-1}\left(1+\dfrac{1}{d}\left(\dfrac{x}{b}\right)^{a}\right)^{-(c+1)}}{1-\left(1+\dfrac{1}{d}\left(\dfrac{x}{b}\right)^{a}\right)^{-c}},\quad x>0,\ c,d,a,b>0 \tag{30}$$

The cumulative hazard function measures the total risk or failure rate accumulated up to a given point in time. It can be used to estimate the total failure rate over time, to compare populations or treatment groups, and to examine the reliability and aging trends of systems or components.

The cumulative hazard function of the OLLLD, H(x) = −ln R(x), is

$$H(x;c,d,a,b)=c\ln\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right),\quad x,c,d,a,b>0 \tag{31}$$
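The reliability functions of Equations (28) to (31) can be sketched in Python as below. The function names and the example parameter values are assumptions for illustration. The sketch evaluates the hazard at three points for a case with a > 1, where the hazard first rises and then falls (the inverted bathtub shape described for Fig. 5).

```python
import math


def _z(x, a, b, d):
    """Shared term (1/d) * (x/b)^a."""
    return (x / b) ** a / d


def survival(x, a, b, c, d):
    """Equation (28)."""
    return (1.0 + _z(x, a, b, d)) ** (-c)


def hazard(x, a, b, c, d):
    """Equation (29): pdf divided by the survival function."""
    return (a * c / (b * d)) * (x / b) ** (a - 1) / (1.0 + _z(x, a, b, d))


def cumulative_hazard(x, a, b, c, d):
    """Equation (31): H(x) = -ln R(x) = c * ln(1 + z)."""
    return c * math.log1p(_z(x, a, b, d))


def reversed_hazard(x, a, b, c, d):
    """Equation (30): pdf divided by the cdf."""
    pdf = hazard(x, a, b, c, d) * survival(x, a, b, c, d)
    # expm1 keeps 1 - R(x) accurate when x is small.
    cdf = -math.expm1(-cumulative_hazard(x, a, b, c, d))
    return pdf / cdf


# Hypothetical parameters with a > 1: the hazard rises, peaks, then declines.
pars = (2.0, 1.0, 1.0, 1.0)
print([round(hazard(x, *pars), 3) for x in (0.1, 1.0, 10.0)])
```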

3.5. Asymptotic behavior and mode of OLLLD

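The proposed model has the required asymptotic behavior, and an interior mode, if the pdf tends to zero at both ends of its support, that is,

$$\lim_{x\to 0}f(x;\Theta)=\lim_{x\to\infty}f(x;\Theta)=0 \tag{32}$$

For a > 1, the factor (x/b)^{a−1} vanishes at the origin, so

$$\lim_{x\to 0}\left(\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-(c+1)}\right)=\frac{ac}{bd}\left(\frac{0}{b}\right)^{a-1}\Big/\left(1+\frac{1}{d}\left(\frac{0}{b}\right)^{a}\right)^{(c+1)}=0 \tag{33}$$

Again, since the pdf decays like x^{−(ac+1)} for large x,

$$\lim_{x\to\infty}\left(\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\Big/\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{(c+1)}\right)=0 \tag{34}$$

The new distribution therefore satisfies the asymptotic requirement, since both limits converge to zero. The mode is obtained from the condition

$$\frac{d}{dx}\ln\big(f(x;\Theta)\big)=0 \tag{35}$$

$$\ln\big(f(x;\Theta)\big)=\ln\left(\frac{ac}{bd}\left(\frac{x}{b}\right)^{a-1}\left(1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}\right)^{-(c+1)}\right)$$

$$\frac{d}{dx}\ln\big(f(x;\Theta)\big)=\frac{a}{x}-\frac{1}{x}-(c+1)\left(\frac{1}{1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}}\right)\times\frac{d}{dx}(A)=0,\qquad A=1+\frac{1}{d}\left(\frac{x}{b}\right)^{a}$$

$$\frac{a-1}{x}-(c+1)\left(\frac{\dfrac{a}{bd}\left(\dfrac{x}{b}\right)^{a-1}}{1+\dfrac{1}{d}\left(\dfrac{x}{b}\right)^{a}}\right)=0,\quad x,c,d,a,b>0 \tag{36}$$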

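Equation (36) is nonlinear in x, but it can be solved in closed form. Multiplying it by x and writing u = (1/d)(x/b)^a gives (a − 1)(1 + u) = a(c + 1)u, so that u = (a − 1)/(ac + 1) and the mode is x = b[d(a − 1)/(ac + 1)]^{1/a} for a > 1. For a ≤ 1 the left-hand side of Equation (36) is negative for all x > 0, so the pdf is decreasing and has no interior mode. Equation (36) is also amenable to numerical solution, for example by the Newton-Raphson technique.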
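A numerical solution of Equation (36) can be sketched as follows. This example uses bisection rather than Newton-Raphson, because bisection needs no second derivative and cannot leave the bracket. The function names, the default bracket and the parameter values are assumptions made for illustration.

```python
def mode_equation(x, a, b, c, d):
    """Left-hand side of Equation (36), the derivative of ln f(x)."""
    u = (x / b) ** a / d
    # (a / x) * u equals (a / (b d)) * (x / b)^(a - 1).
    return (a - 1.0) / x - (c + 1.0) * (a / x) * u / (1.0 + u)


def olll_mode(a, b, c, d, lo=None, hi=None, tol=1e-12):
    """Root of Equation (36) by bisection; requires a > 1."""
    if a <= 1.0:
        raise ValueError("for a <= 1 the pdf has no interior stationary point")
    lo = 1e-9 * b if lo is None else lo   # assumed lower end of the bracket
    hi = 1e3 * b if hi is None else hi    # assumed upper end of the bracket
    if mode_equation(lo, a, b, c, d) <= 0 or mode_equation(hi, a, b, c, d) >= 0:
        raise ValueError("the bracket [lo, hi] does not contain the mode")
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mode_equation(mid, a, b, c, d) > 0:
            lo = mid   # pdf still increasing: the mode lies to the right
        else:
            hi = mid   # pdf decreasing: the mode lies to the left
    return 0.5 * (lo + hi)


# Hypothetical parameter values.
mode = olll_mode(2.0, 2.5, 3.0, 2.0)
print(round(mode, 4))
```

The bracket check guards against parameter settings, such as a very large d, for which the mode falls outside the assumed default interval.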

3.6. Expected life remain of OLLLD

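The expected remaining life, or mean residual life (MRL), is the expected additional lifetime X − x given that a unit has survived up to time x. For the OLLLD it is obtained as follows:

$$MRL(x)=E(X-x\mid X>x)=\frac{1}{R(x)}\int_{x}^{\infty}t\,f(t;\Theta)\,dt-x \tag{37}$$

$$MRL(x)=\frac{b\,d^{\frac{1}{a}}(-1)^{\frac{1}{a}}}{R(x)}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{1}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(x)]^{s+1}\right]-x \tag{38}$$

where J(x) and R(x) are the cdf and the survival function of the OLLLD, respectively.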

3.7. Deviation of first moment

The deviation of the first moment (the mean deviation) describes the dispersion of the data around a central value. It can be defined in two ways: as the mean deviation from the mean and as the mean deviation from the median.

3.7.1. Deviation of first moment from mean

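The deviation of the first moment from the mean is the average absolute deviation of the data points from the mean μ. It is calculated as follows:

$$MD(\mu)=\int_{0}^{\infty}|x-\mu|\,f(x;\Theta)\,dx \tag{39}$$

$$MD(\mu)=\int_{0}^{\mu}(\mu-x)f(x;\Theta)\,dx+\int_{\mu}^{\infty}(x-\mu)f(x;\Theta)\,dx$$

$$=\mu J(\mu)-\int_{0}^{\mu}x f(x;\Theta)\,dx+\int_{\mu}^{\infty}x f(x;\Theta)\,dx-\mu\,[1-J(\mu)]$$

$$=2\mu J(\mu)-\mu-\int_{0}^{\mu}x f(x;\Theta)\,dx+\int_{\mu}^{\infty}x f(x;\Theta)\,dx$$

$$MD(\mu)=2\mu J(\mu)-2\mu+2\int_{\mu}^{\infty}x f(x;\Theta)\,dx \tag{40}$$

where the last step uses ∫_0^μ x f dx = μ − ∫_μ^∞ x f dx. The remaining integral in Equation (40) equals R(μ) E(X | X > μ), so by Equations (25) to (27) with r = 1,

$$\int_{\mu}^{\infty}x f(x;\Theta)\,dx=\int_{J(\mu)}^{1}Q(u)\,du$$

$$\int_{\mu}^{\infty}x f(x;\Theta)\,dx=b\,d^{\frac{1}{a}}(-1)^{\frac{1}{a}}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{1}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(\mu)]^{s+1}\right] \tag{41}$$

Combining Equations (40) and (41) gives

$$MD(\mu)=2\mu J(\mu)-2\mu+2\,b\,d^{\frac{1}{a}}(-1)^{\frac{1}{a}}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{1}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(\mu)]^{s+1}\right] \tag{42}$$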
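As a numerical cross-check of the mean deviation about the mean, the following Python sketch evaluates the equivalent form MD(μ) = 2μJ(μ) − 2∫_0^μ x f(x) dx, which follows from Equation (40) because ∫_μ^∞ x f dx = μ − ∫_0^μ x f dx. The finite integral is computed with Simpson's rule instead of a series expansion. The function names, the number of Simpson panels and the parameter values are assumptions for illustration, and the mean uses the gamma-function closed form c b d^(1/a) Γ(1/a + 1) Γ(c − 1/a) / Γ(c + 1), valid for ac > 1.

```python
import math


def olll_cdf(x, a, b, c, d):
    return 0.0 if x <= 0 else 1.0 - (1.0 + (x / b) ** a / d) ** (-c)


def olll_pdf(x, a, b, c, d):
    if x <= 0:
        return 0.0
    z = (x / b) ** a / d
    return (a * c / (b * d)) * (x / b) ** (a - 1) * (1.0 + z) ** (-(c + 1))


def olll_mean(a, b, c, d):
    """First raw moment; requires a * c > 1."""
    if a * c <= 1.0:
        raise ValueError("the mean exists only for a * c > 1")
    log_ratio = (math.lgamma(1.0 / a + 1.0) + math.lgamma(c - 1.0 / a)
                 - math.lgamma(c + 1.0))
    return c * b * d ** (1.0 / a) * math.exp(log_ratio)


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of panels."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


def mean_deviation_about_mean(a, b, c, d):
    mu = olll_mean(a, b, c, d)
    partial = simpson(lambda x: x * olll_pdf(x, a, b, c, d), 0.0, mu)
    return 2.0 * mu * olll_cdf(mu, a, b, c, d) - 2.0 * partial


md = mean_deviation_about_mean(1.0, 1.0, 8.0, 1.0)
print(md > 0.0)
```

Simpson's rule is accurate here because the integrand x f(x) is smooth on [0, μ] when a ≥ 1; for a < 1 a finer grid may be needed near the origin.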

3.7.2. Deviation of first moment from median

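The deviation of the first moment from the median is the average absolute deviation of the data points from the median x̃. It is obtained by the same steps as above.

$$MD(\tilde{x})=\int_{0}^{\infty}|x-\tilde{x}|\,f(x;\Theta)\,dx \tag{43}$$

$$=\int_{0}^{\tilde{x}}\tilde{x}f(x;\Theta)\,dx-\int_{0}^{\tilde{x}}x f(x;\Theta)\,dx+\int_{\tilde{x}}^{\infty}x f(x;\Theta)\,dx-\int_{\tilde{x}}^{\infty}\tilde{x}f(x;\Theta)\,dx$$

$$MD(\tilde{x})=2\tilde{x}J(\tilde{x})-\mu-\tilde{x}+2\int_{\tilde{x}}^{\infty}x f(x;\Theta)\,dx \tag{44}$$

The remaining integral in Equation (44) equals R(x̃) E(X | X > x̃), so by Equations (25) to (27) with r = 1,

$$\int_{\tilde{x}}^{\infty}x f(x;\Theta)\,dx=b\,d^{\frac{1}{a}}(-1)^{\frac{1}{a}}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{1}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(\tilde{x})]^{s+1}\right] \tag{45}$$

Combining Equations (44) and (45) yields

$$MD(\tilde{x})=2\tilde{x}J(\tilde{x})-\mu-\tilde{x}+2\,b\,d^{\frac{1}{a}}(-1)^{\frac{1}{a}}\sum_{k=0}^{\infty}\sum_{s=0}^{\infty}\frac{(-1)^{k+s}}{s+1}\binom{\frac{1}{a}}{k}\binom{\frac{k}{c}+s-1}{s}\left[1-[J(\tilde{x})]^{s+1}\right] \tag{46}$$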

4. Estimation of parameters of OLLLD

Let x_1, x_2, …, x_n be a random sample from the OLLLD with parameters a, b, c and d. The likelihood function of the OLLLD is the joint pdf of the n observations,

$$L(x;\Theta)=\prod_{i=1}^{n}f(x_{i};\Theta) \tag{47}$$

Where Θ=(a,b,c,d) is the parametric vector of OLLLD.

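$$L(x;\Theta)=\prod_{i=1}^{n}f(x_{i};\Theta)=\prod_{i=1}^{n}\frac{ac}{bd}\left(\frac{x_{i}}{b}\right)^{a-1}\left[1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}\right]^{-(c+1)}$$

Writing l = log L, the log-likelihood is

$$l=n\log\left(\frac{ac}{bd}\right)+(a-1)\sum_{i=1}^{n}\log\left(\frac{x_{i}}{b}\right)-(c+1)\sum_{i=1}^{n}\log\left[1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}\right]$$

and its partial derivatives with respect to the four parameters are

$$\frac{\partial l}{\partial a}=\frac{n}{a}+\sum_{i=1}^{n}\log\left(\frac{x_{i}}{b}\right)-(c+1)\sum_{i=1}^{n}\left[\frac{\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}\ln\left(\frac{x_{i}}{b}\right)}{1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}}\right] \tag{48}$$

$$\frac{\partial l}{\partial b}=-\frac{n}{b}-\frac{n(a-1)}{b}+(c+1)\sum_{i=1}^{n}\left[\frac{\frac{1}{d}\,\frac{a}{b}\left(\frac{x_{i}}{b}\right)^{a}}{1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}}\right] \tag{49}$$

$$\frac{\partial l}{\partial c}=\frac{n}{c}-\sum_{i=1}^{n}\log\left[1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}\right] \tag{50}$$

$$\frac{\partial l}{\partial d}=-\frac{n}{d}+(c+1)\sum_{i=1}^{n}\left[\frac{\frac{1}{d^{2}}\left(\frac{x_{i}}{b}\right)^{a}}{1+\frac{1}{d}\left(\frac{x_{i}}{b}\right)^{a}}\right] \tag{51}$$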

Setting Equations (48), (49), (50) and (51) equal to zero and solving them simultaneously yields the MLEs of the four parameters of the OLLL distribution.

Because these equations are nonlinear, an iterative procedure must be applied to solve them numerically and obtain the parameter estimates of the OLLLD.
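The log-likelihood and its score vector can be sketched in Python as below, with each score component written as a sum over the observations. This is an illustrative sketch, not the authors' R implementation. The function names, the toy data and the parameter values are hypothetical. A finite-difference comparison is included as a check that the analytic derivatives match the log-likelihood.

```python
import math


def loglik(theta, xs):
    """OLLLD log-likelihood for theta = (a, b, c, d) and positive data xs."""
    a, b, c, d = theta
    n = len(xs)
    s_log = sum(math.log(x / b) for x in xs)
    s_l1p = sum(math.log1p((x / b) ** a / d) for x in xs)
    return n * math.log(a * c / (b * d)) + (a - 1.0) * s_log - (c + 1.0) * s_l1p


def score(theta, xs):
    """Partial derivatives of the log-likelihood with respect to a, b, c, d."""
    a, b, c, d = theta
    n = len(xs)
    da, db, dc, dd = n / a, -n * a / b, n / c, -n / d
    for x in xs:
        z = (x / b) ** a / d
        w = z / (1.0 + z)
        da += math.log(x / b) * (1.0 - (c + 1.0) * w)
        db += (c + 1.0) * (a / b) * w
        dc -= math.log1p(z)
        dd += (c + 1.0) * w / d
    return [da, db, dc, dd]


# Toy data and parameter values, chosen only to exercise the functions.
data = [0.5, 1.2, 3.4, 7.7, 15.0]
theta0 = [0.9, 5.0, 2.0, 3.0]

# Central finite differences of the log-likelihood, one parameter at a time.
h = 1e-6
numeric = []
for j in range(4):
    up = list(theta0)
    dn = list(theta0)
    up[j] += h
    dn[j] -= h
    numeric.append((loglik(up, data) - loglik(dn, data)) / (2.0 * h))

print(all(abs(g - s) < 1e-5 for g, s in zip(numeric, score(theta0, data))))
```

In practice the score (or just the log-likelihood) would be handed to a general-purpose optimizer with positivity constraints on all four parameters.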

4.1. Simulation study

Here, the quantile function of the proposed model was used to simulate data sets of eight sizes, n = 25, 50, 75, 100, 150, 200, 250 and 300, by Monte Carlo simulation in R, starting from the parameter values a = 0.7, b = 0.2, c = 0.3 and d = 0.1. For each sample, the maximum likelihood estimates of the model parameters were computed numerically. To evaluate the estimator, the process was repeated 1000 times for each sample size and the bias and mean squared error (MSE) were computed. The whole procedure was then repeated for the parameter settings listed in the cases below.

Case I: a = 0.7, b = 0.2, c = 0.3, d = 0.1
Case II: a = 0.8, b = 0.3, c = 0.2, d = 0.1
Case III: a = 1.4, b = 0.4, c = 0.6, d = 0.1
Case IV: a = 1.1, b = 0.1, c = 0.1, d = 0.1
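The structure of such a Monte Carlo study can be sketched in Python as below. To keep the example short and free of external optimizers, it makes a strong simplifying assumption that is not part of the article: a, b and d are treated as known, so that only c is estimated. In that case, setting the derivative of the log-likelihood with respect to c (n/c minus the sum of log[1 + (x_i/b)^a/d]) to zero gives the closed-form estimate n divided by that sum. The function names, the seed and the number of replications are hypothetical.

```python
import math
import random


def olll_quantile(q, a, b, c, d):
    """Quantile function of Equation (13), used for inverse-transform sampling."""
    return b * (d * ((1.0 - q) ** (-1.0 / c) - 1.0)) ** (1.0 / a)


def c_mle_given_others(xs, a, b, d):
    """MLE of c when a, b and d are assumed known (simplifying assumption)."""
    return len(xs) / sum(math.log1p((x / b) ** a / d) for x in xs)


def simulate(sizes, reps, a, b, c, d, seed=1):
    """Average bias and MSE of the estimate of c for each sample size."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        errors = []
        for _ in range(reps):
            xs = [olll_quantile(rng.random(), a, b, c, d) for _ in range(n)]
            errors.append(c_mle_given_others(xs, a, b, d) - c)
        bias = sum(errors) / reps
        mse = sum(e * e for e in errors) / reps
        results[n] = (bias, mse)
    return results


# Case I parameter values; 200 replications keep the run time short.
table = simulate([25, 100, 300], 200, 0.7, 0.2, 0.3, 0.1)
for n, (bias, mse) in table.items():
    print(n, round(bias, 4), round(mse, 5))
```

A full replication of the study would instead maximize the four-parameter log-likelihood numerically in each replication.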

According to Table 2, the average bias and MSE values are generally small and tend to decrease as the sample size n increases. The biases shrink toward zero as the sample size grows, suggesting that the MLEs are asymptotically unbiased. Similarly, the MSEs decline gradually as n rises, which indicates consistency of the estimates. Together, these results suggest that maximum likelihood is a suitable method for estimating the parameters of the OLLLD.

Table 2.

The estimates (MLE), average bias (AB) and mean square errors (MSE) for the modified log-logistic distribution.

Sample Parameter MLE (Case I) AB (Case I) MSE (Case I) MLE (Case II) AB (Case II) MSE (Case II)
n = 25 a 0.2408 0.0330 0.0626 4.7988 0.3370 1.0838
b 0.0003 0.8542 2.1905 0.1423 0.2020 0.6396
c 0.0343 0.0726 0.1860 0.0461 0.0693 0.2705
d 0.4492 2.0722 6.0159 0.1143 0.2174 0.6603
n = 50 a 0.0839 0.0259 0.0621 0.9493 0.3192 1.0695
b 0.1992 0.2243 0.6258 0.2566 0.0608 0.2295
c 0.1134 0.0124 0.0655 0.2386 0.0169 0.1271
d 0.9982 0.6744 1.8210 0.9522 0.1069 0.4657
n = 75 a 0.0695 0.0226 0.0609 1.3292 0.2571 0.8536
b 0.3687 0.0904 0.3895 0.3344 0.0258 0.1606
c 0.1364 0.0046 0.0522 0.1534 −0.0016 0.0853
d 1.4457 0.3453 1.0505 1.0178 0.0493 0.2431
n = 100 a 0.0960 0.0197 0.0581 1.2632 0.1454 0.5259
b 0.2391 0.0521 0.2412 0.3032 0.0177 0.1274
c 0.1209 0.0020 0.0442 0.1778 0.0001 0.0695
d 1.0835 0.2403 0.7532 0.9992 0.0436 0.2011
n = 150 a 0.1307 0.0146 0.0548 1.1231 0.0786 0.3112
b 0.2707 0.0156 0.1631 0.1885 0.0063 0.1012
c 0.0686 −0.0004 0.0332 0.1979 −0.0005 0.0536
d 0.7055 0.1664 0.5729 0.9760 0.0227 0.1130
n = 200 a 0.1078 0.0120 0.0396 1.3369 0.0714 0.2629
b 0.1485 −0.0055 0.1301 0.3370 0.0027 0.0872
c 0.0892 −0.0020 0.0276 0.1972 −0.0028 0.0462
d 0.8976 0.0864 0.4227 1.0097 0.0212 0.1007
n = 250 a 0.1495 0.0070 0.0269 1.4988 0.0486 0.2123
b 0.0175 −0.0040 0.1326 0.2537 0.0072 0.0777
c 0.0632 −0.0008 0.0231 0.1469 0.0000 0.0418
d 0.7331 0.0693 0.3578 0.8771 0.0186 0.0833
n = 300 a 0.0843 0.0050 0.0215 1.7849 0.0549 0.1963
b 0.2649 −0.0098 0.1115 0.1939 −0.0001 0.0711
c 0.1136 −0.0001 0.0211 0.1262 −0.0030 0.0365
d 1.2412 0.0677 0.3219 1.1070 0.0150 0.0690
Sample Parameter MLE (Case III) AB (Case III) MSE (Case III) MLE (Case IV) AB (Case IV) MSE (Case IV)
n = 25 a 1.1350 0.5500 1.5078 0.9185 0.1309 1.7273
b 0.8201 0.3649 1.0990 0.1749 24.0709 545.7292
c 0.1574 0.0590 0.1663 0.0956 8.6148 200.7526
d 0.2391 0.2051 0.5740 0.2090 5.1146 125.1855
n = 50 a 1.9058 0.4980 1.4177 0.4913 0.3202 1.1375
b 0.5434 0.0608 0.4541 0.7609 0.0305 0.1220
c 0.0676 0.0089 0.0714 0.2742 0.0093 0.0658
d 0.1401 0.0607 0.2366 0.3942 0.0455 0.1847
n = 75 a 1.3957 0.4623 1.3870 1.3742 0.3647 1.0474
b 0.5165 0.0008 0.2224 0.1001 0.0106 0.0732
c 0.0897 −0.0011 0.0470 0.0674 −0.0001 0.0489
d 0.0653 0.0252 0.1266 0.0303 0.0170 0.1111
n = 100 a 3.2557 0.4070 1.1805 0.8040 0.3331 1.0220
b 0.1721 −0.0098 0.1886 0.1855 0.0033 0.0605
c 0.0456 −0.0026 0.0429 0.1472 −0.0047 0.0392
d 0.0137 0.0198 0.1140 0.1514 0.0047 0.0836
n = 150 a 1.2033 0.2416 0.8849 1.2935 0.1789 0.6095
b 0.6355 −0.0121 0.1404 0.1231 0.0051 0.0504
c 0.1480 −0.0023 0.0317 0.0871 −0.0022 0.0330
d 0.1322 0.0110 0.0795 0.0873 0.0050 0.0674
n = 200 a 1.1435 0.1468 0.4895 1.9877 0.1130 0.4357
b 0.6090 −0.0087 0.1146 0.0461 0.0042 0.0410
c 0.1349 −0.0011 0.0284 0.0531 −0.0007 0.0273
d 0.2198 0.0119 0.0715 0.0442 0.0033 0.0538
n = 250 a 1.7911 0.1318 0.4769 1.7827 0.0998 0.3231
b 0.4124 −0.0119 0.1023 0.0552 0.0008 0.0371
c 0.0822 −0.0014 0.0251 0.0606 −0.0025 0.0249
d 0.0883 0.0093 0.0638 0.0344 0.0006 0.0501
n = 300 a 1.6118 0.1056 0.4875 1.2514 0.0733 0.2623
b 0.4933 −0.0087 0.0878 0.0742 0.0010 0.0303
c 0.0865 −0.0008 0.0230 0.0865 −0.0016 0.0213
d 0.0787 0.0082 0.0584 0.0788 0.0009 0.0417

5. Application of OLLLD

In this section, a real data set of survival times of 90 breast cancer patients, collected at UNIOSUN Teaching Hospital in Osogbo, Nigeria, between 2000 and 2014, is modeled with the OLLLD to illustrate the theoretical results. The data can be accessed freely at https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7426529. The new distribution is compared with competing models, namely the Fisk distribution (LLD), the Kumaraswamy Fisk distribution (KwLLD), the Marshall-Olkin Fisk distribution (MoLLD), the Kumaraswamy Marshall-Olkin Fisk distribution (KMoLLD), the exponential model (ExpD), the Lomax distribution (LOMD), the odd Lomax inverse exponential distribution (OLOMINExD) and the Weibull Fisk distribution (WeLLD), using several statistical criteria. The criteria used are the AIC, HQIC, BIC and CAIC, together with the K–S, W and A2 goodness-of-fit tests. The goodness of fit of the new model is thus assessed by comparing it with its competitors when all are fitted to the real breast cancer data set. Many other test statistics, both common and recently proposed, are available; for details see Refs. [23], [24], [25].

5.1. The TTT and box plots

The total time on test (TTT) plot is a key graphical tool for investigating the shape of the failure rate of a data set and hence whether a given model is suitable for it. In Fig. 6 (left side), the TTT plot is convex, which corresponds to a decreasing failure rate. This suggests that the data set is appropriate for further study with the proposed model. The box plot summarizes the main features of the data set and gives a clear picture of the distribution of the data collected.

Fig. 6. The TTT and box plots for survival times of breast cancer patients.

5.2. Histogram and violin plots

Fig. 7 shows the distribution of the survival data. The isolated bars in the histogram indicate the presence of extreme values. The shape of the violin shows the dispersion of the data: the violin is widest for survival times between 0 and 25, which suggests that most breast cancer patients have survival times in this range. A box plot inside the violin displays additional summary statistics. The box represents the interquartile range (IQR), which covers the middle 50% of survival times, and the white dot inside the box marks the median survival time.

Fig. 7. The histogram with its fitted density and violin plots for survival cancer data.

5.3. Summary of the data set

Table 3 shows that the variance is 381.824 and that the mean (13.23) is higher than the median (6.77), which indicates that the cancer data are positively skewed. The large variance relative to the mean reflects the high dispersion of the data set. The skewness coefficient (2.4024) is greater than zero, which again confirms the positive asymmetry, and the kurtosis (8.5850) is greater than three, so the data are leptokurtic with a heavy tail. Table 4 provides the parameter estimates of the models OLLLD, MoLLD, KwLLD, KMoLLD, LLD, LOMD, WeLLD, ExpD and OLOMINExD fitted to the real data set.

Table 3.

Summary for real breast cancer dataset provided.

n Min. Q1 Median Mean Q3 Max. Var. Skewness Kurtosis
90 0.03 1.17 6.77 13.23 15.37 99.10 381.824 2.4024 8.5850

Table 4.

Estimates of distributions with their standard errors (in parenthesis).

Model Estimates
a b c d λ
OLLLD 0.6474 (0.0567) 19.6376 (2.6541) 33.3877 (47.7601) 20.6901 (10.5276)
MoLLD 0.3117 (0.1021) 0.8889 (0.0780) 10.862 (0.0033)
KwLLD 5.2670 (0.1830) 146.272 (0.0018) 0.1897 (0.0175) 96.8698 (0.0014)
KMoLLD 0.4704 (0.4176) 5.5112 (8.8061) 19.5197 (56.1815) 1.2733 (0.9871) 18.8786 (49.5493)
LLD 0.8889 (0.0780) 4.5611 (0.9565)
LOMD 1.3944 (0.4698) 7.9366 (4.3377)
WeLLD 0.0555 (0.0047) 0.7235 (0.0063) 3.2009 (0.00006) 16.0279 (0.00003)
ExpD 13.2284 (1.4022)
OLOMINExD 1.0379 (0.2670) 73.6844 (41.8998) 0.0643 (0.0295)

The AIC, CAIC, BIC and HQIC values in Table 5 indicate that the OLLLD is the best model. The OLLLD also appears appropriate for the data under the other statistical criteria, A2, W and K–S, given in Table 6. Based on these results, we conclude that the OLLLD is superior to the eight rival distributions.

Table 5.

The statistics AIC, CAIC, BIC and HQIC for the real data set.

Model AIC CAIC BIC HQIC Rank
OLLLD 609.56 610.04 619.51 613.57 1st
MoLLD 620.43 620.71 627.89 623.44 7th
KwLLD 610.96 611.44 620.92 614.97 3rd
KMoLLD 611.41 612.14 623.86 616.43 4th
LLD 618.43 618.57 623.40 620.43 5th
LOMD 619.20 619.34 624.18 621.21 6th
WeLLD 609.65 610.93 619.60 613.66 2nd
ExpD 1247.69 1247.73 1250.18 1248.69 9th
OLOMINExD 623.14 623.43 630.61 626.15 8th
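The four information criteria can be computed from a maximized log-likelihood as sketched below. The article does not state its formulas, so the standard definitions are assumed here: AIC = −2l + 2k, BIC = −2l + k ln n, HQIC = −2l + 2k ln(ln n), and CAIC taken as the small-sample corrected AIC, AIC + 2k(k + 1)/(n − k − 1), where k is the number of parameters and n the sample size. The function name is hypothetical. The log-likelihood in the example (−300.78) is the value implied by the OLLLD AIC in Table 5 with k = 4, not a value reported in the article. Under these assumptions the other three criteria come out within about 0.1 of the tabulated OLLLD values.

```python
import math


def information_criteria(loglik, k, n):
    """AIC, CAIC (small-sample corrected AIC), BIC and HQIC."""
    aic = -2.0 * loglik + 2.0 * k
    return {
        "AIC": aic,
        "CAIC": aic + 2.0 * k * (k + 1.0) / (n - k - 1.0),
        "BIC": -2.0 * loglik + k * math.log(n),
        "HQIC": -2.0 * loglik + 2.0 * k * math.log(math.log(n)),
    }


# OLLLD: k = 4 parameters, n = 90 observations; log-likelihood inferred from AIC.
ic = information_criteria(-300.78, 4, 90)
print({name: round(value, 2) for name, value in ic.items()})
```

Smaller values indicate a better trade-off between fit and model complexity, which is how the ranks in Table 5 are read.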

Table 6.

The A2, W, K–S Statistic and p-values for the real data set.

Model A2 W K–S P-Value Rank
OLLLD 0.6350 0.1089 0.0721 0.7434 1st
MoLLD 1.7135 0.3212 0.1060 0.2699 7th
KwLLD 0.8236 0.1559 0.0832 0.5696 4th
KMoLLD 0.6459 0.1182 0.0736 0.7340 2nd
LLD 1.7135 0.3212 0.1060 0.2700 7th
LOMD 1.3409 0.2525 0.1219 0.1422 6th
WeLLD 0.6520 0.1152 0.0756 0.6890 3rd
ExpD 0.8917 0.1714 0.9191 0.0000 9th
OLOMINExD 1.7261 0.3233 0.1230 0.1353 8th

Fig. 8 clearly shows that the OLLLD, WeLLD, KwLLD and KMoLLD give almost identical fits. The plots also show that the OLLLD fits better than the other competing distributions.

Fig. 8. Histogram with estimated pdfs (left side) and empirical distribution function with estimated cdfs (right side).

6. Summary and conclusion

In this article, a new four-parameter distribution named the odd Lomax log-logistic distribution (OLLLD) is proposed. The OLLLD generalizes the log-logistic distribution and provides more flexibility for analyzing real data. We have studied several statistical properties of the new distribution. Various figures illustrate its nature: they show that the distribution is unimodal, flexible and positively skewed for the parameter values considered. The quantile-based skewness and kurtosis values are greater than zero, and their three-dimensional plots increase progressively as the parameter values increase. The proposed distribution can model real-world phenomena with decreasing failure rates and with inverted bathtub failure rates. The parameters of the OLLLD are estimated by maximum likelihood, and simulation results are provided to assess the performance of the MLE. These results suggest that maximum likelihood is a suitable method for estimating the parameters of the OLLLD. The OLLLD was also used to model a cancer data set and was compared with competing distributions to assess its goodness of fit; it performed better than its competitors. The study can be extended by using other estimation approaches, such as the method of moments (MoM), least squares estimation (LSE), quantile estimation, probability weighted moments (PWM) and Bayesian estimation, to estimate the unknown parameters of the OLLLD. It can also be extended by verifying the outcomes with goodness-of-fit tests other than those used in this study, and the superiority of the OLLLD can be tested against other competing distributions.

Ethics approval and consent to participate

This study was based on published data, so ethical approval was not required.

Consent of publication

Not applicable.

Data availability

The data can be accessed freely using the link https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7426529.

Funding

The author(s) received no specific funding for this work.

CRediT authorship contribution statement

Benson Benedicto Kailembo: Formal analysis, Writing – original draft. Srinivasa Rao Gadde: Methodology, Visualization, Writing – review & editing. Peter Josephat Kirigiti: Data curation, Methodology, Writing – original draft.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  • 1. Job O., Solomon Ogunsanya A. Weibull log logistic {exponential} distribution: some properties and application to survival data. Int. J. Stat. Distrib. Appl. 2022;8(1):1. doi: 10.11648/j.ijsd.20220801.11.
  • 2. Oguntunde E.P. Covenant Univ. Repos.; 2017. Generalisation of the Inverse Exponential Distribution: Statistical Properties and Applications; p. 202.
  • 3. Lemonte A.J. The beta log-logistic distribution. Brazilian J. Probab. Stat. 2014;28(3):313–332. doi: 10.1214/12-BJPS209.
  • 4. Lima S.R., Cordeiro G.M. The extended log-logistic distribution: properties and application. An. Acad. Bras. Cienc. 2017;89(1):3–17. doi: 10.1590/0001-3765201720150579.
  • 5. Sepanski J.H., Kong L. A family of generalized beta distributions for income. Adv. Appl. Stat. 2007;(October).
  • 6. Pokhrel K., Kafle R.C. McDonald-G family of distributions. J. Stat. Theory Appl. 2012;11.
  • 7. Rubio F.J., Steel M.F.J. On the Marshall-Olkin transformation as a skewing mechanism. Comput. Stat. Data Anal. 2012;56(7):2251–2257. doi: 10.1016/j.csda.2012.01.003.
  • 8. Cordeiro G.M., Ortega E.M.M., da Cunha D.C.C. The exponentiated generalized class of distributions. J. Data Sci. 2013;11(1):1–27. doi: 10.6339/jds.2013.11(1).1086.
  • 9. Dila G.K., Tripathy M.R. A Study on Zografos-Balakrishnan G-Family of Distributions. Thesis. May 2015.
  • 10. Tahir M.H., Cordeiro G.M., Alizadeh M., Mansoor M., Zubair M., Hamedani G.G. The odd generalized exponential family of distributions with applications. J. Stat. Distrib. Appl. 2015;2(1):1–28. doi: 10.1186/s40488-014-0024-2.
  • 11. Falgore J.Y., Doguwa S.I. The inverse Lomax-G family with application to breaking strength data. Asian J. Probab. Stat. 2020;(August):49–60. doi: 10.9734/ajpas/2020/v8i230204.
  • 12. Gui W. Marshall-Olkin extended log-logistic distribution and its application in minification processes. Appl. Math. Sci. 2013;7(77–80):3947–3961. doi: 10.12988/ams.2013.35268.
  • 13. Chaudhary A.K., Kumar V. Bayesian estimation of three-parameter exponentiated log-logistic distribution. Int. J. Statistika Math. 2014;9(2):66–81.
  • 14. Tahir M.H., Mansoor M., Zubair M., Hamedani G.G. McDonald log-logistic distribution. J. Stat. Theory Appl. 2014;13(1):65–82.
  • 15. Alizadeh M., Emadi M., Doostparast M., Cordeiro G.M., Ortega E.M.M., Pescim R.R. A new family of distributions: the Kumaraswamy odd log-logistic, properties and applications. Hacettepe J. Math. Stat. 2015;44(6):1491–1512. doi: 10.15672/HJMS.2014418153.
  • 16. Cakmakyapan S., Ozel G., Mousa Hussein El Gebaly Y., Hamedani G.G. The Kumaraswamy Marshall-Olkin log-logistic distribution with application. J. Stat. Theory Appl. 2018;17(1):59. doi: 10.2991/jsta.2018.17.1.5.
  • 17. Moakofi T., Oluyede B., Makubate B. Marshall-Olkin Lindley-Log-logistic distribution: model, properties and applications. Math. Slovaca. 2021;71(5):1269–1290. doi: 10.1515/ms-2021-0052.
  • 18. Falgore J.Y., Doguwa S.I. Inverse Lomax log-logistic distribution with applications. Thail. Stat. 2023;21(1):37–47.
  • 19. Ghosh I., Bourguignon M. A new extended Burr XII distribution. Aust. J. Stat. 2017;46(1):33–39. doi: 10.17713/ajs.v46i1.139.
  • 20. Ul Haq M.A., Hamedani G.G., Elgarhy M., Ramos P.L. Marshall-Olkin power Lomax distribution: properties and estimation based on complete and censored samples. Int. J. Stat. Probab. 2019;9(1):48. doi: 10.5539/ijsp.v9n1p48.
  • 21. Jamal F., et al. Some new members of the T-X family of distributions. HAL open Sci. 2019;33.
  • 22. Alzaatreh A., Lee C., Famoye F. A new method for generating families of continuous distributions. Metron. 2013;71(1):63–79. doi: 10.1007/s40300-013-0007-y.
  • 23. Zamanzade E., Arghami N. Goodness-of-fit test based on correcting moments of modified entropy estimator. J. Stat. Comput. Simulat. Dec. 2011;81:2077–2093. doi: 10.1080/00949655.2010.517533.
  • 24. Mahdizadeh M., Zamanzade E. New goodness of fit tests for the Cauchy distribution. J. Appl. Stat. 2017;44(6):1106–1121. doi: 10.1080/02664763.2016.1193726.
  • 25. Mahdizadeh M., Zamanzade E. Goodness-of-fit testing for the Cauchy distribution with application to financial modeling. J. King Saud Univ. Sci. 2019;31(4):1167–1174. doi: 10.1016/j.jksus.2019.01.015.

Articles from Heliyon are provided here courtesy of Elsevier
