Journal of Applied Statistics. 2022 Nov 18;51(4):664–681. doi: 10.1080/02664763.2022.2146661

A model for bimodal rates and proportions

Roberto Vila (corresponding author), Lucas Alfaia, André FB Menezes, Mehmet N Çankaya, Marcelo Bourguignon
PMCID: PMC10929684  PMID: 38476621

Abstract

The beta model is the most important distribution for fitting data on the unit interval. However, the beta distribution is not suitable for modelling bimodal unit-interval data. In this paper, we propose a bimodal beta distribution constructed by using an approach based on the alpha-skew-normal model. We discuss several properties of this distribution, such as bimodality, real moments, entropies and identifiability. Furthermore, we propose a new regression model based on the proposed distribution and discuss residuals. Estimation is performed by maximum likelihood. A Monte Carlo experiment is conducted to evaluate the performance of the estimators in finite samples, with a discussion of the results. An application is provided to show the modelling capability of the proposed distribution when the data show bimodality.

Keywords: Bimodal model, bimodality, bounded data, beta distribution, maximum likelihood, regression model

1. Introduction

The need for modelling and analysing bimodal bounded data, especially data on the unit interval, arises in many fields, such as bioinformatics [12], image classification [16], transactions at a car dealership [26] and so on. In such situations, in order to apply probabilistic modelling to these phenomena under a parametric paradigm, probability distributions limited to (0,1) are indispensable. The unimodal beta model is the most widely used model in the literature to describe data in the unit interval, especially because of its flexibility and fruitful properties [13]. However, despite its broad applicability in many fields, the beta distribution is not suitable for modelling bimodal data on the unit interval.

In general, one uses mixtures of distributions to describe bimodal data. For example, the studies [26] and [25] consider finite mixtures of beta regression models to analyse priming effects in judgements of imprecise probabilities. However, mixtures of distributions may suffer from identifiability problems in parameter estimation; see Refs. [14,15]. Thus, new mixture-free models that can accommodate both unimodal and bimodal data are very important. Phenomena can show bimodality for many reasons, such as economic policies and the uncertainty of social movements and their effects on the economy [28,30].

Since the structure of a phenomenon depends on many factors, it is reasonable to expect that non-identically distributed observations can occur in the data collected by the experimenter. For example, the bimodal models introduced by Elal-Olivero [8], Domma et al. [6] and Vila and Çankaya [28] are needed to fit non-identically distributed or mixed data sets efficiently. If we have a mixed data set, the bimodal forms of the Beta and Weibull distributions, in the sense of Vila and Çankaya [28], are needed to model the data efficiently, because the mixing proportions $\pi_1$ and $\pi_2 = 1-\pi_1$ of a finite mixture cannot be estimated accurately in the bimodal case. The analytical expression of a mixture distribution can cause problems when the likelihood is maximized jointly over the parameters of the component models (Beta, Weibull, etc.) and the mixing parameters $\pi_1$ and $\pi_2$ of the separate populations Beta$_1$ and Beta$_2$; at the very least, numerical errors can arise during computation. The underlying mechanism of a phenomenon may instead follow a single probabilistic model such as the bimodal beta (Bbeta) or the bimodal Weibull (BWeibull) of Vila and Çankaya [28]. Moreover, a mixture of two Beta distributions, Beta$_1$ and Beta$_2$, has the parameters $\alpha_1$, $\alpha_2$, $\beta_1$, $\beta_2$, $\pi_1$ and $\pi_2$, whereas the Bbeta distribution has only four parameters: $\alpha$, $\beta$, $\rho$ and $\delta$. The Bbeta distribution therefore has fewer parameters than the Beta mixture. Further, since the cumulative distribution function of the Bbeta distribution has an exact expression, it is straightforward to generate artificial bimodal data sets, which can be used to check whether a data set can be modelled by the Bbeta distribution with the estimated parameters. In other words, unit-interval data can be modelled and tested using the proposed distribution.

Variations of the beta model can be found in Ferrari and Cribari-Neto [9], Ospina and Ferrari [21], Bayes et al. [4], Hahn [11], among others. However, none of the models cited above is suitable for capturing bimodality. Recently, probabilistic models for bimodality on the positive real line have been discussed by various authors. Olmos et al. [20] introduced a bimodal extension of the unit-Birnbaum-Saunders distribution. Vila et al. [29] proposed the bimodal gamma distribution. Vila and Çankaya [28] considered a bimodal Weibull distribution. Recently, [17] proposed a family of bimodal distributions generated by distributions with positive support. Despite this, to the best of our knowledge, a specific parametric model to describe bimodal data observed on the unit interval has not been considered in the recent literature. Likewise, to the best of our knowledge, a bimodal regression model for unit-interval data with a regression structure for the parameters has never been considered in the literature. Martínez-Flórez et al. [18] considered a transformation of a random variable that follows a unit-bimodal Birnbaum-Saunders (UBBS for short) distribution, but only in the case of independent and identically distributed variables.

Based on the above discussion and motivated by the presence of bimodality in proportion responses, we develop a model for double-bounded response variables. In particular, we extend the usual beta distribution using a quadratic transformation technique used to generate bimodal functions [8,28]. The approach therefore appears to be a new development in the literature. We discuss several properties of the proposed model, such as bimodality, real moments, hazard rate, entropies and identifiability. Furthermore, we study the effects of the explanatory variables on the response variable using a regression model.

In what follows, we list some of the main contributions and advantages of the proposed model.

  • We introduce a new family of distributions that is a flexible version of the usual beta distribution, capable of fitting bimodal as well as unimodal data. We provide general properties of the proposed model;

  • We propose an extended version of the quadratic transformation technique used to generate bimodal functions.

The rest of the article proceeds as follows. In Sections 2 and 3, we present the new distribution and derive some of its properties. In Section 4, we present further properties of the bimodal beta distribution, which include entropies, stochastic representation and identifiability. Section 5 presents the bimodal beta regression model, together with the estimation method for the model parameters and diagnostic measures. In Section 6, some numerical results on the estimators and the empirical distribution of the residuals are presented and discussed. A real-life application related to the proportion of votes that Jair Bolsonaro received in the second round of the Brazilian elections in 2018 is analysed in Section 7. Section 8 summarizes the main findings of the paper.

2. The bimodal beta distribution

In this section, the bimodal beta (Bbeta) distribution is introduced and its density is derived. Moreover, some results on the bimodality properties are obtained. We say that a random variable (r.v.) $X$ has a Bbeta distribution with parameter vector $\theta_\delta=(\alpha,\beta,\rho,\delta)$, $\alpha>0$, $\beta>0$, $\rho\geq 0$ and $\delta\in\mathbb{R}$, denoted by $X\sim\mathrm{Bbeta}(\theta_\delta)$, if its probability density function (PDF) is given by

$$f(x;\theta_\delta)=\begin{cases}\dfrac{\rho+(1-\delta x)^2}{Z(\theta_\delta)\,B(\alpha,\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}, & 0<x<1,\\[4pt] 0, & \text{otherwise},\end{cases}\tag{1}$$

where

$$Z(\theta_\delta)=1+\rho-2\delta\,\frac{\alpha}{\alpha+\beta}+\delta^2\,\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}\tag{2}$$

denotes the normalization constant and $B(\alpha,\beta)=\int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,\mathrm{d}t$ is the beta function. When $\delta=2$, $\alpha=\beta=1$ and $\rho=0$, we have the U-quadratic distribution on $(0,1)$. When $\delta=0$, we obtain the classical beta distribution with parameter vector $\theta_0=(\alpha,\beta,\rho,0):=(\alpha,\beta)$. The parameters $\alpha$, $\beta$ (which appear as exponents of the r.v.) and $\rho$ control the shape of the distribution. The uni- or bimodality is controlled by the parameter $\delta$. Note that, for fixed $\alpha$, $\beta$ and $\delta\neq 0$, the parameter $\rho$ also controls the unimodality or bimodality of the distribution; see Subsection 2.1. Figure 1 shows some shapes of the Bbeta PDF for different combinations of parameters. Figure 1(a) shows the L-shaped case of the beta distribution together with its bimodal form, and Figure 1(b) shows the bell-shaped case.
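As a rough numerical check of the density (1) and the normalization constant (2), both can be evaluated directly. The sketch below is illustrative only: the function names `bbeta_norm_const`, `bbeta_pdf` and `midpoint_integral` are hypothetical (they do not come from the paper), the beta function is computed through the log-gamma function, and the integral is approximated with a simple midpoint rule.

```python
import math

def log_beta(a, b):
    """log B(a, b) via the log-gamma function."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def bbeta_norm_const(alpha, beta, rho, delta):
    """Normalization constant Z(theta_delta) from equation (2)."""
    s = alpha + beta
    return (1 + rho - 2 * delta * alpha / s
            + delta ** 2 * alpha * (alpha + 1) / (s * (s + 1)))

def bbeta_pdf(x, alpha, beta, rho, delta):
    """Bbeta density from equation (1); zero outside (0, 1)."""
    if not 0 < x < 1:
        return 0.0
    log_kernel = ((alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x)
                  - log_beta(alpha, beta))
    z = bbeta_norm_const(alpha, beta, rho, delta)
    return (rho + (1 - delta * x) ** 2) * math.exp(log_kernel) / z

def midpoint_integral(f, n=20000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    return sum(f((k + 0.5) / n) for k in range(n)) / n

# The density should integrate to approximately one.
total = midpoint_integral(lambda x: bbeta_pdf(x, 2, 2, 0.25, 2))
# Special case delta = 2, alpha = beta = 1, rho = 0: U-quadratic, 12 * (x - 1/2)^2.
u_quad = bbeta_pdf(0.25, 1, 1, 0, 2)
print(total, u_quad)
```

The second check uses the U-quadratic special case mentioned above, for which the density at $x=0.25$ is $12(0.25-0.5)^2=0.75$.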

Figure 1. The PDF of the bimodal beta distribution for different parameter values. In the left panel, the PDF is strictly decreasing or has a decreasing-increasing-decreasing shape. In the right panel, the PDF is symmetric and uni- or bimodal.

Unlike Figure 1(b), Figure 2(a,b) shows that one peak can be a major peak and the other a minor peak.

Figure 2. The PDF of the bimodal beta distribution for different parameter values. Both panels show asymmetry and bimodality.

The asymptotic behaviour of the PDF (1) is as follows:

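$$f(0^{+};\theta_\delta)=\lim_{x\to 0,\,x>0}f(x;\theta_\delta)=\begin{cases}\dfrac{\beta(\rho+1)}{1+\rho-\dfrac{2\delta}{1+\beta}+\dfrac{2\delta^2}{(1+\beta)(2+\beta)}}, & \alpha=1,\\[6pt] 0, & \alpha>1,\\ +\infty, & \alpha<1,\end{cases}\tag{3}$$

and

$$f(1^{-};\theta_\delta)=\lim_{x\to 1,\,x<1}f(x;\theta_\delta)=\begin{cases}\dfrac{\alpha[\rho+(1-\delta)^2]}{1+\rho-\dfrac{2\delta\alpha}{\alpha+1}+\dfrac{\delta^2\alpha(\alpha+1)}{(\alpha+1)(\alpha+2)}}, & \beta=1,\\[6pt] 0, & \beta>1,\\ +\infty, & \beta<1.\end{cases}\tag{4}$$

This asymptotic behaviour of the Bbeta PDF was expected, since the bimodal beta distribution is defined in terms of the classical beta. It is clear that, when $\delta=0$, $f(0^{+};\theta_\delta)=\beta$ for $\alpha=1$ and $f(1^{-};\theta_\delta)=\alpha$ for $\beta=1$.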

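If $X\sim\mathrm{Bbeta}(\theta_\delta)$, the cumulative distribution function (CDF) (see Figure 3), the survival function (SF) and the hazard rate function (HR) of $X$ are, respectively, given by

$$F(x;\theta_\delta)=\frac{1}{Z(\theta_\delta)}\left[(1+\rho)\,I_x(\alpha,\beta)-2\delta\,\frac{B_x(\alpha+1,\beta)}{B(\alpha,\beta)}+\delta^2\,\frac{B_x(\alpha+2,\beta)}{B(\alpha,\beta)}\right],\tag{5}$$

$$S(x;\theta_\delta)=\frac{1}{Z(\theta_\delta)}\sum_{i=0}^{2}c_i\left[\frac{B(\alpha+i,\beta)}{B(\alpha,\beta)}-\frac{B_x(\alpha+i,\beta)}{B(\alpha,\beta)}\right]\quad\text{and}\tag{6}$$

$$H(x;\theta_\delta)=\frac{[\rho+(1-\delta x)^2]\,x^{\alpha-1}(1-x)^{\beta-1}}{\sum_{i=0}^{2}c_i\,[B(\alpha+i,\beta)-B_x(\alpha+i,\beta)]},\tag{7}$$

where $I_x(\alpha,\beta)=B_x(\alpha,\beta)/B(\alpha,\beta)$ is the incomplete beta function ratio, $B_x(\alpha,\beta)=\int_0^x t^{\alpha-1}(1-t)^{\beta-1}\,\mathrm{d}t$ is the incomplete beta function, and $c_0=1+\rho$, $c_1=-2\delta$, $c_2=\delta^2$. For more details on the derivation of these formulas, see Section 3.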
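Formulas (5) and (7) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the incomplete beta function is approximated by composite Simpson's rule, which assumes $\alpha\geq 1$ and $\beta\geq 1$ so that the integrand is bounded. The code uses the identity $\sum_{i=0}^{2}c_i B(\alpha+i,\beta)=Z(\theta_\delta)B(\alpha,\beta)$, which follows from (2).

```python
import math

def beta_fn(a, b):
    """Beta function B(a, b) via log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def incomplete_beta(x, a, b, n=2000):
    """B_x(a, b) by composite Simpson's rule (assumes a >= 1, b >= 1, n even)."""
    h = x / n
    g = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    total = g(0.0) + g(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def bbeta_cdf(x, alpha, beta, rho, delta):
    """CDF (5): sum_i c_i B_x(alpha+i, beta) / (Z * B(alpha, beta))."""
    c = (1 + rho, -2 * delta, delta ** 2)
    z_times_b = sum(ci * beta_fn(alpha + i, beta) for i, ci in enumerate(c))
    num = sum(ci * incomplete_beta(x, alpha + i, beta) for i, ci in enumerate(c))
    return num / z_times_b

def bbeta_hazard(x, alpha, beta, rho, delta):
    """Hazard rate (7)."""
    c = (1 + rho, -2 * delta, delta ** 2)
    num = (rho + (1 - delta * x) ** 2) * x ** (alpha - 1) * (1 - x) ** (beta - 1)
    den = sum(ci * (beta_fn(alpha + i, beta) - incomplete_beta(x, alpha + i, beta))
              for i, ci in enumerate(c))
    return num / den

# Symmetric case alpha = beta = 2, delta = 2: the median should be 1/2.
print(bbeta_cdf(0.5, 2, 2, 0.25, 2), bbeta_hazard(0.5, 2, 2, 0.25, 2))
```

The survival function (6) is then simply one minus the value returned by `bbeta_cdf`.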

Figure 3. The CDF of the bimodal beta distribution for different parameter values. Due to bimodality, as seen in the figure, it is natural to expect the CDF graph to present up to three inflection points.

2.1. Bimodality properties

To state the following result that guarantees the uni- or bimodality of the Bbeta distribution, we define the following cubic polynomial:

$$p_3(x)=a_3x^3+a_2x^2+a_1x+a_0,\tag{8}$$

where $a_3=-\delta^2(\alpha+\beta)$, $a_2=\delta[\alpha(\delta+2)+2\beta+\delta-2]$, $a_1=-[\alpha(2\delta+\rho+1)+(\beta-2)(\rho+1)]$ and $a_0=(\alpha-1)(\rho+1)$.

Theorem 2.1 (Uni- or bimodality)

Let $X\sim\mathrm{Bbeta}(\theta_\delta)$ with $\alpha>1$, $\beta>1$, $(\delta-1)^2+\rho>0$ and $\delta>0$.

  • (i) If $p_3(x)$ has a single positive zero, then the Bbeta distribution is unimodal.

  • (ii) If $p_3(x)$ has exactly three zeros in $(0,1)$, then the Bbeta distribution is bimodal.

Proof.

A simple computation shows that

$$f'(x;\theta_\delta)=\frac{x^{\alpha-2}(1-x)^{\beta-2}}{Z(\theta_\delta)\,B(\alpha,\beta)}\,p_3(x),\tag{9}$$

where $p_3(x)$ is given in (8). Under the conditions stated in the theorem, we have $a_3<0$, $a_2>0$, $a_1<0$ and $a_0>0$. By definition, the boundary points are never critical points, so we exclude them from the analysis.

Since $p_3(0)=a_0>0$ and $p_3(1)=a_3+a_2+a_1+a_0=(1-\beta)[(\delta-1)^2+\rho]<0$, because $\beta>1$ and $(\delta-1)^2+\rho>0$, the Intermediate Value Theorem guarantees that there is at least one root in the interval $(0,1)$. Further, by Descartes' rule of signs (see, e.g. Refs. [31] and [10]), $p_3(x)$ has one or three roots in the interval $(0,1)$.

Assume that $p_3(x)$ has a single zero. In this case, $f(x;\theta_\delta)$ has a single critical point, denoted by $x_0$. Since, for $\alpha>1$ and $\beta>1$, $f(0^{+};\theta_\delta)=0$ and $f(1^{-};\theta_\delta)=0$ (see the limits in (3) and (4)), it follows that $f(x;\theta_\delta)$ increases on $(0,x_0)$ and decreases on $(x_0,1)$. That is, $x_0$ is a global maximum point of $f(x;\theta_\delta)$. This proves Item (i).

On the other hand, if $p_3(x)$ has exactly three zeros in $(0,1)$, then $f(x;\theta_\delta)$ has three critical points $x_1$, $x_2$ and $x_3$. Without loss of generality, let us assume that $x_1<x_2<x_3$. Again, since, for $\alpha>1$ and $\beta>1$, $f(0^{+};\theta_\delta)=0$ and $f(1^{-};\theta_\delta)=0$, it follows that $f(x;\theta_\delta)$ increases on the intervals $(0,x_1)$ and $(x_2,x_3)$, and decreases on $(x_1,x_2)$ and $(x_3,1)$. In other words, $x_1$ and $x_3$ are two maximum points and $x_2$ is the unique minimum point. Then the statement in Item (ii) follows.

Thus we have completed the proof of the theorem.

Remark 2.1

By considering $\alpha$, $\beta$, $\delta$ and $\rho$ as in Table 1, it is clear that the conditions of Theorem 2.1 are satisfied. Then, depending on the number of roots of $p_3(x)$, Theorem 2.1 guarantees the uni- or bimodality (U or B) of the PDF $f(x;\theta_\delta)$. These results are compatible with Figure 1(b).

Again, using the parameter values of Figure 2(a,b), it can be verified that the conditions of Theorem 2.1 are satisfied, which allows us to conclude that the PDF $f(x;\theta_\delta)$ is bimodal. This agrees with the shape of the PDF shown in Figure 2.

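Table 1. Roots of the polynomial $p_3(x)$ and shapes of the bimodal beta PDF using the parameter values of Figure 1(b).

| $\alpha$ | $\beta$ | $\delta$ | $\rho$ | $a_3$ | $a_2$ | $a_1$ | $a_0$ | Real roots of $p_3(x)$ in $(0,1)$ | Shape |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 2 | 2 | 0.25 | −16 | 24 | −10.5 | 1.25 | $x=0.19$, $x=0.5$, $x=0.81$ | B |
| 2 | 2 | 2 | 1.5 | −16 | 24 | −13 | 2.5 | $x=0.5$ | U |
| 2 | 2 | 2 | 0 | −16 | 24 | −10 | 1 | $x=0.15$, $x=0.5$, $x=0.85$ | B |
| 2 | 2 | 2 | 0.5 | −16 | 24 | −11 | 1.5 | $x=0.25$, $x=0.5$, $x=0.75$ | B |
| 2 | 2 | 2 | 2 | −16 | 24 | −14 | 3 | $x=0.5$ | U |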
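The entries of Table 1 can be reproduced with a short script. The sketch below is illustrative: `p3_coefficients` implements the coefficients of (8), while `roots_in_unit_interval` is a hypothetical helper that scans a grid on $(0,1)$ for sign changes and refines each one by bisection. It only detects simple roots (where the polynomial changes sign), which is sufficient for the parameter values used here.

```python
def p3_coefficients(alpha, beta, rho, delta):
    """Coefficients (a3, a2, a1, a0) of the cubic polynomial in equation (8)."""
    a3 = -delta ** 2 * (alpha + beta)
    a2 = delta * (alpha * (delta + 2) + 2 * beta + delta - 2)
    a1 = -(alpha * (2 * delta + rho + 1) + (beta - 2) * (rho + 1))
    a0 = (alpha - 1) * (rho + 1)
    return a3, a2, a1, a0

def roots_in_unit_interval(coeffs, n_grid=999, tol=1e-12):
    """Simple roots of the cubic in (0, 1): grid scan plus bisection."""
    a3, a2, a1, a0 = coeffs
    p = lambda x: ((a3 * x + a2) * x + a1) * x + a0  # Horner evaluation
    roots = []
    for k in range(n_grid):
        lo, hi = k / n_grid, (k + 1) / n_grid
        if p(lo) * p(hi) < 0:  # sign change: one simple root in (lo, hi)
            while hi - lo > tol:
                mid = 0.5 * (lo + hi)
                if p(lo) * p(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            roots.append(0.5 * (lo + hi))
    return roots

# Parameter values of Table 1: alpha = beta = delta = 2 and several rho.
for rho in (0.25, 1.5, 0, 0.5, 2):
    coeffs = p3_coefficients(2, 2, rho, 2)
    roots = roots_in_unit_interval(coeffs)
    shape = "B" if len(roots) == 3 else "U"
    print(rho, coeffs, [round(r, 2) for r in roots], shape)
```

An odd number of grid cells is used so that the root at $x=1/2$ never coincides with a grid point, where the sign-change test would miss it.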

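Theorem 2.2 (Bimodality; case $\rho=0$)

If $X\sim\mathrm{Bbeta}(\theta_\delta)$, $\alpha>1$, $\beta>1$, $\rho=0$, $\delta>1$, and

$$[\delta(\alpha+1)+\alpha+\beta-2]^2>4\delta(\alpha+\beta)(\alpha-1),\tag{10}$$

then the Bbeta distribution is bimodal with maximum points

$$x_{\max,\pm}=\frac{1}{\delta}+\frac{\delta(\alpha+1)-(\alpha+\beta+2)}{2\delta(\alpha+\beta)}\pm\frac{\sqrt{[\delta(\alpha+1)+\alpha+\beta-2]^2-4\delta(\alpha+\beta)(\alpha-1)}}{2\delta(\alpha+\beta)}$$

and minimum point $x=1/\delta$, where $0<x_{\max,-}<x=1/\delta<x_{\max,+}<1$.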

Proof.

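Taking $\rho=0$ in (9), we have

$$f'(x;\theta_\delta)=\frac{x^{\alpha-2}(1-x)^{\beta-2}}{Z(\theta_\delta)\,B(\alpha,\beta)}\,(1-\delta x)\left\{(\alpha+\beta)\delta x^2-[\delta(\alpha+1)+\alpha+\beta-2]\,x+(\alpha-1)\right\}.$$

A direct calculation shows that $f'(x;\theta_\delta)=0$ if and only if (excluding the boundary points) $x=1/\delta$ or

$$x_{\pm}=\frac{\delta(\alpha+1)+\alpha+\beta-2\pm\sqrt{[\delta(\alpha+1)+\alpha+\beta-2]^2-4\delta(\alpha+\beta)(\alpha-1)}}{2\delta(\alpha+\beta)}=x_{\max,\pm}.$$

Hence, under condition (10), it follows that the equation $f'(x;\theta_\delta)=0$ has three roots $x=1/\delta$, $x_{-}$ and $x_{+}$ within the interval $(0,1)$, where $x_{-}<x=1/\delta<x_{+}$.

Since, for $\alpha>1$ and $\beta>1$, $f(0^{+};\theta_\delta)=0$ and $f(1^{-};\theta_\delta)=0$ (see the limits in (3) and (4)), the bimodality of the Bbeta distribution is guaranteed, where $x_{-}=x_{\max,-}$ and $x_{+}=x_{\max,+}$ are two maximum points and $x=1/\delta$ is the unique minimum point.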

Remark 2.2

Let $\alpha=\beta=\delta=2$ and $\rho=0$. It is clear that condition (10) is satisfied. Then, by Theorem 2.2, the Bbeta distribution is bimodal with maximum points $x_{\max,-}=(2-\sqrt{2})/4\approx 0.15$ and $x_{\max,+}=(2+\sqrt{2})/4\approx 0.85$, and minimum point $x=1/2$, which is compatible with Figure 1(b).

Proposition 2.3

The Bbeta PDF $f(x;\theta_\delta)$ is symmetric about the point $x=1/2$ whenever $\alpha=\beta$ and $\delta=2$.

Proof.

A simple algebraic manipulation shows that, if $\alpha=\beta$ and $\delta=2$, then $f(0.5-x;\theta_\delta)=f(0.5+x;\theta_\delta)$ for $0<x<1/2$. Then the proof follows.

Theorem 2.3

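If $X\sim\mathrm{Bbeta}(\theta_\delta)$, $\alpha=\beta>1$, $\rho<1/(\alpha-1)$ and $\delta=2$, then the Bbeta distribution is bimodal with maximum points

$$x_{\max,\pm}=\frac{1}{2}\pm\frac{\sqrt{\rho\alpha(1-\alpha)+\alpha}}{2\alpha}$$

and minimum point $x=1/2$, where $0<x_{\max,-}<x=1/2<x_{\max,+}<1$. Moreover, the maximum values coincide, that is, $f(x_{\max,-};\theta_\delta)=f(x_{\max,+};\theta_\delta)$.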

Proof.

As a by-product of the proof of Theorem 2.1, we have $f'(x;\theta_\delta)=0$ if and only if $x$ is a zero of the polynomial $p_3(x)$ defined in (8). Setting $\alpha=\beta>1$ and $\delta=2$ in $p_3(x)$, we get $f'(x;\theta_\delta)=0$ if and only if

$$p_3(x)=-8\alpha x^3+12\alpha x^2-[\alpha(5+\rho)+(\alpha-2)(\rho+1)]\,x+(\alpha-1)(\rho+1)=0.$$

Note that the above polynomial can be written as $p_3(x)=-2\,(x-1/2)\,[4\alpha x^2-4\alpha x+(\alpha-1)(\rho+1)]$. Then, it is clear that $x=1/2$ and $x_{\max,\pm}$ are critical points of $f(x;\theta_\delta)$, where $0<x_{\max,-}<x=1/2<x_{\max,+}<1$. Note that the restriction $\rho<1/(\alpha-1)$ guarantees that the discriminant of the quadratic factor of $p_3(x)$ is positive.

By using that $\alpha>1$ and $\beta>1$, and by following the same steps as in the final paragraph of the proof of Theorem 2.2, we guarantee the bimodality of the Bbeta distribution.

Finally, the identity $f(x_{\max,-};\theta_\delta)=f(x_{\max,+};\theta_\delta)$ follows from Proposition 2.3.

Remark 2.4

By considering $\alpha$, $\beta$, $\delta$ and $\rho$ as in Table 2, it is clear that the restriction $\rho<1/(\alpha-1)$ is satisfied. Then, Theorem 2.3 guarantees the bimodality (B) of the PDF $f(x;\theta_\delta)$ with minimum point $x=1/2$ and the maximum points (and maximum values) specified in this table. These results are compatible with Figures 1(b) and 2(a)–(b).

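Table 2. Modes, maximum values and shapes of the bimodal beta PDF using the parameter values in Figure 1(b) and Figure 2(a)–(b).

| $\alpha$ | $\beta$ | $\delta$ | $\rho$ | $x_{\max,-}$ | $x_{\max,+}$ | $f(x_{\max,-};\theta_\delta)$ | $f(x_{\max,+};\theta_\delta)$ | Shape |
|---|---|---|---|---|---|---|---|---|
| 2 | 2 | 2 | 0.25 | 0.19 | 0.81 | 1.30 | 1.30 | B |
| 2 | 2 | 2 | 0 | 0.15 | 0.85 | 1.87 | 1.87 | B |
| 2 | 2 | 2 | 0.5 | 0.25 | 0.75 | 1.21 | 1.21 | B |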
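The closed-form modes of Theorem 2.3 and the corresponding density values can be checked against Table 2 with a few lines of code. This is a sketch under the assumptions of that theorem ($\alpha=\beta>1$, $\delta=2$, $\rho<1/(\alpha-1)$); the function names `symmetric_modes` and `bbeta_pdf` are hypothetical.

```python
import math

def symmetric_modes(alpha, rho):
    """Maximum points from Theorem 2.3 (alpha = beta > 1, delta = 2)."""
    if not (alpha > 1 and rho < 1 / (alpha - 1)):
        raise ValueError("conditions of Theorem 2.3 are not satisfied")
    half_gap = math.sqrt(rho * alpha * (1 - alpha) + alpha) / (2 * alpha)
    return 0.5 - half_gap, 0.5 + half_gap

def bbeta_pdf(x, alpha, beta, rho, delta):
    """Bbeta density (1) for 0 < x < 1, with Z taken from equation (2)."""
    s = alpha + beta
    z = (1 + rho - 2 * delta * alpha / s
         + delta ** 2 * alpha * (alpha + 1) / (s * (s + 1)))
    log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(s)
    kernel = math.exp((alpha - 1) * math.log(x)
                      + (beta - 1) * math.log(1 - x) - log_b)
    return (rho + (1 - delta * x) ** 2) * kernel / z

# Rows of Table 2: alpha = beta = 2, delta = 2.
for rho in (0.25, 0, 0.5):
    x_minus, x_plus = symmetric_modes(2, rho)
    print(rho, round(x_minus, 2), round(x_plus, 2),
          bbeta_pdf(x_minus, 2, 2, rho, 2), bbeta_pdf(x_plus, 2, 2, rho, 2))
```

For each row the two density values should agree, as stated in Theorem 2.3, and should match the tabulated values up to rounding.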

3. Moments

In this section, some closed-form expressions for truncated moments and real moments of the Bbeta distribution are obtained. Other properties, such as raw moments, the mean residual life function and the moment generating function, are analysed in Section I of the Supplementary Material.

Theorem 3.1

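If $X\sim\mathrm{Bbeta}(\theta_\delta)$ then, for $0\leq a<b\leq 1$ and $r>-\alpha$,

$$E\big(X^r\,\mathbb{1}_{\{a\leq X\leq b\}}\big)=\frac{1}{Z(\theta_\delta)}\sum_{i=0}^{2}c_i\left[\frac{B_b(\alpha+r+i,\beta)}{B(\alpha,\beta)}-\frac{B_a(\alpha+r+i,\beta)}{B(\alpha,\beta)}\right],$$

where $c_0=1+\rho$, $c_1=-2\delta$, $c_2=\delta^2$, and $B_x(\alpha,\beta)$ is the incomplete beta function.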

Proof.

By using the definition of expectation and the definition of the Bbeta density, we have

$$E\big(X^r\,\mathbb{1}_{\{a\leq X\leq b\}}\big)=\frac{1}{Z(\theta_\delta)}\sum_{i=0}^{2}c_i\,E\big(Y^{r+i}\,\mathbb{1}_{\{a\leq Y\leq b\}}\big),\quad Y\sim\mathrm{Bbeta}(\theta_0).$$

Since $E\big(Y^{r+i}\,\mathbb{1}_{\{a\leq Y\leq b\}}\big)=[B_b(\alpha+r+i,\beta)-B_a(\alpha+r+i,\beta)]/B(\alpha,\beta)$, the proof of the theorem follows.

Taking $r=0$, $b=x$ and $a=0$ in Theorem 3.1, we get formula (5) for the CDF. Letting $r=0$, $b=1$ and $a=x$ in Theorem 3.1, we get formula (6) for the SF. By combining formula (6) for the SF with the definition (1) of the Bbeta density, we obtain formula (7) for the HR.

Taking $r=1$, $a=x$ and $b=1$ in Theorem 3.1, we get a closed formula for the mean residual life function; see Corollary 1.1 of the Supplementary Material.

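Corollary 3.1 (Real moments)

If $X\sim\mathrm{Bbeta}(\theta_\delta)$ and $r>-\alpha$, then

$$E(X^r)=\frac{1}{Z(\theta_\delta)}\left[(1+\rho)\,\frac{B(\alpha+r,\beta)}{B(\alpha,\beta)}-2\delta\,\frac{B(\alpha+r+1,\beta)}{B(\alpha,\beta)}+\delta^2\,\frac{B(\alpha+r+2,\beta)}{B(\alpha,\beta)}\right].$$

Proof.

By taking $b=1$ and $a=0$ in Theorem 3.1, we have the following:

$$E(X^r)=\frac{1}{Z(\theta_\delta)}\sum_{i=0}^{2}c_i\,\frac{B(\alpha+r+i,\beta)}{B(\alpha,\beta)},$$

where $c_0=1+\rho$, $c_1=-2\delta$ and $c_2=\delta^2$.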

As a consequence of the above corollary, the closed expressions for the standardized moments, variance, skewness and kurtosis of the bimodal beta r.v. X are easily obtained.
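The moment formula of Corollary 3.1 can be evaluated with log-gamma functions only. The following is a minimal sketch (the name `bbeta_moment` is hypothetical); it uses the fact that $\sum_{i=0}^{2}c_i B(\alpha+i,\beta)/B(\alpha,\beta)=Z(\theta_\delta)$, which is the case $r=0$ of the corollary.

```python
import math

def bbeta_moment(r, alpha, beta, rho, delta):
    """E(X^r) from Corollary 3.1; requires r > -alpha."""
    if r <= -alpha:
        raise ValueError("the moment formula requires r > -alpha")

    def beta_ratio(shift):
        # B(alpha + shift, beta) / B(alpha, beta), computed on the log scale
        return math.exp(math.lgamma(alpha + shift) - math.lgamma(alpha)
                        + math.lgamma(alpha + beta)
                        - math.lgamma(alpha + shift + beta))

    c = (1 + rho, -2 * delta, delta ** 2)
    z = sum(ci * beta_ratio(i) for i, ci in enumerate(c))  # equals Z(theta_delta)
    return sum(ci * beta_ratio(r + i) for i, ci in enumerate(c)) / z

# Mean and variance in the symmetric case alpha = beta = 2, delta = 2, rho = 0.25.
mean = bbeta_moment(1, 2, 2, 0.25, 2)
variance = bbeta_moment(2, 2, 2, 0.25, 2) - mean ** 2
print(mean, variance)
```

In this symmetric example the mean is $1/2$, as expected from Proposition 2.3, and the variance exceeds that of the Beta(2, 2) distribution (0.05) because probability mass is pushed towards the two modes.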

4. Further properties

In this section, we consider some properties of the Bbeta distribution, such as stochastic representation and identifiability. For reasons of space, entropy measures such as Tsallis [27], quadratic [23] and Shannon [24] ones were studied in Section II of the Supplementary Material.

4.1. Stochastic representation

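Let $W$ be a discrete r.v. with probability function $P(W=w_k)=\pi_k$, $k=0,1,2$, where

$$\pi_0=\frac{1+\rho}{Z(\theta_\delta)},\quad \pi_1=\frac{-2\alpha\delta}{Z(\theta_\delta)(\alpha+\beta)},\quad \pi_2=\frac{\alpha(\alpha+1)\delta^2}{Z(\theta_\delta)(\alpha+\beta)(\alpha+\beta+1)},\quad\text{for } \delta<0,$$

and $Z(\theta_\delta)$ is as in (2). Notice that $\pi_0+\pi_1+\pi_2=1$.

Consider the following three r.v.'s: $Y_{0;\alpha,\beta}\sim\mathrm{Beta}(\alpha,\beta)$, $Y_{1;\alpha+1,\beta}\sim\mathrm{Beta}(\alpha+1,\beta)$ and $Y_{2;\alpha+2,\beta}\sim\mathrm{Beta}(\alpha+2,\beta)$. Then we define a new r.v. $X$ as follows:

$$X=Y_{0;\alpha,\beta}\,\mathbb{1}_{\{W=w_0\}}+Y_{1;\alpha+1,\beta}\,\mathbb{1}_{\{W=w_1\}}+Y_{2;\alpha+2,\beta}\,\mathbb{1}_{\{W=w_2\}},\tag{11}$$

where $W$ is independent of $Y_{0;\alpha,\beta}$, $Y_{1;\alpha+1,\beta}$ and $Y_{2;\alpha+2,\beta}$.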

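Proposition 4.1 (Stochastic representation for $\delta<0$)

If $X$ admits the form (11), then $X\sim\mathrm{Bbeta}(\theta_\delta)$. Conversely, if $X\sim\mathrm{Bbeta}(\theta_\delta)$ then $X$ is as in (11).

Proof.

Using the law of total probability and the definition of $X$, we get

$$\begin{aligned}P(X\leq x)&=P(Y_{0;\alpha,\beta}\leq x\mid W=w_0)\,\pi_0+P(Y_{1;\alpha+1,\beta}\leq x\mid W=w_1)\,\pi_1+P(Y_{2;\alpha+2,\beta}\leq x\mid W=w_2)\,\pi_2\\&=P(Y_{0;\alpha,\beta}\leq x)\,\pi_0+P(Y_{1;\alpha+1,\beta}\leq x)\,\pi_1+P(Y_{2;\alpha+2,\beta}\leq x)\,\pi_2,\end{aligned}$$

where in the last line we used the independence of $W$ with respect to the variables $Y_{0;\alpha,\beta}$, $Y_{1;\alpha+1,\beta}$ and $Y_{2;\alpha+2,\beta}$. Since $P(Y_{k;\alpha+k,\beta}\leq x)=I_x(\alpha+k,\beta)$, $k=0,1,2$, the above expression becomes

$$\frac{1}{Z(\theta_\delta)}\left[(1+\rho)\,I_x(\alpha,\beta)-2\delta\,\frac{B_x(\alpha+1,\beta)}{B(\alpha,\beta)}+\delta^2\,\frac{B_x(\alpha+2,\beta)}{B(\alpha,\beta)}\right].$$

But, by (5), this expression is equal to the CDF $F(x;\theta_\delta)$.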

Then we have completed the proof.
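For $\delta<0$, the representation (11) suggests a direct way of drawing from the Bbeta distribution: pick a component index with probabilities $\pi_0,\pi_1,\pi_2$ and then draw from the corresponding beta distribution. The sketch below is only an illustration of this idea with Python's standard `random` module; the function names and the chosen parameter values are hypothetical.

```python
import random

def mixture_weights(alpha, beta, rho, delta):
    """Weights (pi_0, pi_1, pi_2) of Proposition 4.1."""
    s = alpha + beta
    z = (1 + rho - 2 * delta * alpha / s
         + delta ** 2 * alpha * (alpha + 1) / (s * (s + 1)))
    pi0 = (1 + rho) / z
    pi1 = -2 * alpha * delta / (z * s)
    pi2 = alpha * (alpha + 1) * delta ** 2 / (z * s * (s + 1))
    return pi0, pi1, pi2

def sample_bbeta_mixture(n, alpha, beta, rho, delta, seed=0):
    """Draw n values through representation (11); stated for delta < 0 only."""
    if delta >= 0:
        raise ValueError("representation (11) requires delta < 0")
    rng = random.Random(seed)
    weights = mixture_weights(alpha, beta, rho, delta)
    draws = []
    for _ in range(n):
        k = rng.choices((0, 1, 2), weights=weights)[0]  # component index W
        draws.append(rng.betavariate(alpha + k, beta))  # Y_k ~ Beta(alpha + k, beta)
    return draws

# Compare the sample mean with the mixture mean sum_k pi_k (alpha+k)/(alpha+beta+k).
alpha, beta, rho, delta = 2.0, 3.0, 0.5, -1.5
sample = sample_bbeta_mixture(20000, alpha, beta, rho, delta)
weights = mixture_weights(alpha, beta, rho, delta)
exact_mean = sum(w * (alpha + k) / (alpha + beta + k) for k, w in enumerate(weights))
print(sum(sample) / len(sample), exact_mean)
```

For $\delta>0$ the weight $\pi_1$ is negative, so this sampling scheme does not apply and another method, such as inversion of the CDF, is needed.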

4.2. Identifiability

A simple observation shows that the bimodal beta PDF $f(x;\theta_\delta)$ in (1), with parameter vector $\theta_\delta=(\alpha,\beta,\rho,\delta)$, can be written as a finite (generalized) mixture of three beta distributions with different shape parameters, i.e.

$$f(x;\theta_\delta)=\pi_0 f(x;\alpha,\beta)+\pi_1 f(x;\alpha+1,\beta)+\pi_2 f(x;\alpha+2,\beta),\quad 0<x<1,\tag{12}$$

where $\pi_0$, $\pi_1$ and $\pi_2$ are constants (that depend only on $\theta_\delta$) given in Proposition 4.1, and $f(x;\alpha,\beta)=x^{\alpha-1}(1-x)^{\beta-1}/B(\alpha,\beta)$, $0<x<1$ ($\alpha>0$, $\beta>0$), denotes the standard beta PDF.

Unlike Proposition 4.1, here $\delta$ can be non-negative. In principle, non-negative mixing weights are not necessary, since a mixture can be a PDF even if some of the weights are negative.

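Let $\mathcal{B}$ be the family of beta distributions, as follows:

$$\mathcal{B}=\left\{F:\ F(x;\alpha,\beta)=\int_0^x f(y;\alpha,\beta)\,\mathrm{d}y,\ \alpha>0,\ \beta>0,\ 0<x<1\right\}.$$

Write $\mathcal{H}_{\mathcal{B}}$ for the class of all finite mixtures of $\mathcal{B}$. It is well known that $\mathcal{H}_{\mathcal{B}}$ is not identifiable; see the main Theorem of Ahmad and Al-Hussaini [1]. Let $\widetilde{\mathcal{H}}_{\mathcal{B}}$ be the class of all finite mixtures of $\mathcal{B}$ with the restriction that the shape parameters $\beta$ are pairwise different (that is, $\beta_i\neq\beta_j$ for $i\neq j$). As a consequence of the main result of Atienza et al. [3], it is a simple task to prove that the class $\widetilde{\mathcal{H}}_{\mathcal{B}}$ is identifiable; see, e.g. Proposition 3.2.2 of de Alencar [5] or Proposition 1.2 in the Appendix of Alfaia [2].

The following result proves the identifiability of the bimodal beta distribution.

Proposition 4.2

The mapping $\theta_\delta=(\alpha,\beta,\rho,\delta)\mapsto f(\cdot;\theta_\delta)$, where the $\beta$'s are pairwise different, is one-to-one.

Proof.

Let us suppose that $f(x;\theta_{\delta_i})=f(x;\theta_{\delta_j})$ for all $0<x<1$, where $\theta_{\delta_i}=(\alpha_i,\beta_i,\rho_i,\delta_i)$ and $\theta_{\delta_j}=(\alpha_j,\beta_j,\rho_j,\delta_j)$. In other words, by (12),

$$\pi_{0;i}f(x;\alpha_i,\beta_i)+\pi_{1;i}f(x;\alpha_i+1,\beta_i)+\pi_{2;i}f(x;\alpha_i+2,\beta_i)=\pi_{0;j}f(x;\alpha_j,\beta_j)+\pi_{1;j}f(x;\alpha_j+1,\beta_j)+\pi_{2;j}f(x;\alpha_j+2,\beta_j),$$

where $\pi_{k;i}$ and $\pi_{k;j}$, $k=0,1,2$, are defined as in Proposition 4.1. Since $\widetilde{\mathcal{H}}_{\mathcal{B}}$ is identifiable, we have $\pi_{k;i}=\pi_{k;j}$, for $k=0,1,2$, and $\alpha_i=\alpha_j$, $\beta_i=\beta_j$. Hence, from the equalities $\pi_{k;i}=\pi_{k;j}$, $k=0,1,2$, it immediately follows that $\rho_i=\rho_j$ and $\delta_i=\delta_j$. Therefore, $\theta_{\delta_i}=\theta_{\delta_j}$, and the proof follows.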

5. Regression model, estimation and diagnostic analysis

Let $X_1,\ldots,X_n$ be $n$ independent random variables, where each $X_i$, $i=1,\ldots,n$, follows the PDF given in (1). We assume that the parameters $\alpha_i$ and $\beta_i$ satisfy the following functional relations:

$$g_1(\alpha_i)=\eta_{1i}=w_i^{\top}\gamma\quad\text{and}\quad g_2(\beta_i)=\eta_{2i}=z_i^{\top}\zeta,\tag{13}$$

where $\gamma=(\gamma_1,\ldots,\gamma_p)^{\top}$ and $\zeta=(\zeta_1,\ldots,\zeta_q)^{\top}$ are vectors of unknown regression coefficients which are assumed to be functionally independent, $\gamma\in\mathbb{R}^p$ and $\zeta\in\mathbb{R}^q$, with $p+q<n$, $\eta_{1i}$ and $\eta_{2i}$ are the linear predictors, and $w_i=(w_{i1},\ldots,w_{ip})^{\top}$ and $z_i=(z_{i1},\ldots,z_{iq})^{\top}$ are observations on $p$ and $q$ known regressors, for $i=1,\ldots,n$. Furthermore, we assume that the covariate matrices $W=(w_1,\ldots,w_n)^{\top}$ and $Z=(z_1,\ldots,z_n)^{\top}$ have rank $p$ and $q$, respectively. The link functions $g_1:\mathbb{R}^{+}\to\mathbb{R}$ and $g_2:\mathbb{R}^{+}\to\mathbb{R}$ in (13) must be strictly monotone and at least twice differentiable, such that $\alpha_i=g_1^{-1}(w_i^{\top}\gamma)$ and $\beta_i=g_2^{-1}(z_i^{\top}\zeta)$, with $g_1^{-1}(\cdot)$ and $g_2^{-1}(\cdot)$ being the inverse functions of $g_1(\cdot)$ and $g_2(\cdot)$, respectively. There are several possible choices for the link functions $g_1(\cdot)$ and $g_2(\cdot)$. For example, one can use the logarithmic specification $g_j(\cdot)=\log(\cdot)$, the square root $g_j(\cdot)=\sqrt{\cdot}$, or the identity $g_j(\cdot)=\cdot$ (with special attention to the sign of the estimates), $j=1,2$. In this paper, we consider the log link, $g_j(\cdot)=\log(\cdot)$, since it is the most used for positive parameters.

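The log-likelihood function for $\theta_\delta=(\gamma^{\top},\zeta^{\top},\rho,\delta)^{\top}$ based on a sample of $n$ independent observations is given by

$$\ell(\theta_\delta)=\sum_{i=1}^{n}\ell(\alpha_i,\beta_i,\rho,\delta),\tag{14}$$

where $\ell(\alpha_i,\beta_i,\rho,\delta)=-\log Z(\theta_\delta)-\log B(\alpha_i,\beta_i)+\log[\rho+(1-\delta x_i)^2]+(\alpha_i-1)\log x_i+(\beta_i-1)\log(1-x_i)$, $i=1,\ldots,n$, and $Z(\theta_\delta)$ is as in (2), evaluated at $(\alpha_i,\beta_i,\rho,\delta)$.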
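The log-likelihood (14) with log links and one covariate per linear predictor (the structure used later in the simulation study) can be written compactly. The function below is a sketch, not the authors' code: the name `bbeta_loglik` and the argument layout are assumptions, and no optimizer is included. In practice its negative would be passed to a numerical optimizer such as a quasi-Newton routine.

```python
import math

def bbeta_loglik(params, x, w, z):
    """Log-likelihood (14) with log(alpha_i) = g0 + g1*w_i, log(beta_i) = z0 + z1*z_i.

    params = (gamma0, gamma1, zeta0, zeta1, rho, delta); x, w, z are sequences
    of responses in (0, 1) and of the two covariates.
    """
    gamma0, gamma1, zeta0, zeta1, rho, delta = params
    total = 0.0
    for xi, wi, zi in zip(x, w, z):
        alpha = math.exp(gamma0 + gamma1 * wi)  # inverse of the log link
        beta = math.exp(zeta0 + zeta1 * zi)
        s = alpha + beta
        # Normalization constant (2), evaluated at (alpha_i, beta_i, rho, delta)
        norm = (1 + rho - 2 * delta * alpha / s
                + delta ** 2 * alpha * (alpha + 1) / (s * (s + 1)))
        log_b = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(s)
        total += (-math.log(norm) - log_b
                  + math.log(rho + (1 - delta * xi) ** 2)
                  + (alpha - 1) * math.log(xi)
                  + (beta - 1) * math.log(1 - xi))
    return total

# One observation with alpha = beta = 2 (zero slopes), rho = 0.25, delta = 2.
params = (math.log(2), 0.0, math.log(2), 0.0, 0.25, 2.0)
print(bbeta_loglik(params, [0.3], [0.0], [0.0]))
```

With a single observation the function returns the log of the density (1) at that point, which gives a simple way to test it.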

The maximum likelihood estimator (MLE) $\hat\theta_\delta=(\hat\gamma^{\top},\hat\zeta^{\top},\hat\rho,\hat\delta)^{\top}$ of $\theta_\delta=(\gamma^{\top},\zeta^{\top},\rho,\delta)^{\top}$ is obtained by maximizing the log-likelihood function (14). However, it is not possible to derive an analytical solution for the MLE $\hat\theta_\delta$; hence we resort to a numerical solution using an optimization algorithm, such as Newton-Raphson or quasi-Newton.

Under mild regularity conditions and when $n$ is large, the asymptotic distribution of the MLE $\hat\theta_\delta$ is approximately multivariate normal (of dimension $p+q+2$) with mean vector $\theta_\delta$ and variance-covariance matrix $K^{-1}(\theta_\delta)$, where $K(\theta_\delta)=E[-\partial^2\ell(\theta_\delta)/\partial\theta_\delta\,\partial\theta_\delta^{\top}]$ is the expected Fisher information matrix. Unfortunately, there is no closed-form expression for the matrix $K(\theta_\delta)$. Nevertheless, a consistent estimator of the expected Fisher information matrix is given by $J(\hat\theta_\delta)=-\partial^2\ell(\theta_\delta)/\partial\theta_\delta\,\partial\theta_\delta^{\top}\,\big|_{\theta_\delta=\hat\theta_\delta}$, which is the estimated observed Fisher information matrix. Therefore, for large $n$, we can replace $K(\theta_\delta)$ by $J(\hat\theta_\delta)$.

Let $\theta_{\delta r}$ be the $r$-th component of $\theta_\delta$. The asymptotic $100(1-\varphi)\%$ confidence interval for $\theta_{\delta r}$ is given by $\hat\theta_{\delta r}\pm z_{\varphi/2}\,\mathrm{SE}(\hat\theta_{\delta r})$, $r=1,\ldots,p+q+2$, where $z_{\varphi/2}$ is the $\varphi/2$ upper quantile of the standard normal distribution and $\mathrm{SE}(\hat\theta_{\delta r})$ is the asymptotic standard error of $\hat\theta_{\delta r}$. Note that $\mathrm{SE}(\hat\theta_{\delta r})$ is the square root of the $r$-th diagonal element of the matrix $J^{-1}(\hat\theta_\delta)$.
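The asymptotic interval can be computed from an estimate and its standard error alone. As a small illustration, the sketch below applies it to the estimates of $\delta$ and $\rho$ reported in Table 5; the helper name `wald_interval` is hypothetical, and the 95% level is chosen only as an example.

```python
from statistics import NormalDist

def wald_interval(estimate, se, level=0.95):
    """Asymptotic interval: estimate +/- z_{phi/2} * SE, with phi = 1 - level."""
    phi = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - phi / 2.0)  # upper phi/2 standard normal quantile
    return estimate - z * se, estimate + z * se

# Estimates and standard errors of delta and rho taken from Table 5 (Bbeta model).
delta_ci = wald_interval(2.4092, 0.0351)
rho_ci = wald_interval(0.1096, 0.0090)
print(delta_ci, rho_ci)
```

Neither interval contains zero, which is consistent with the two extra parameters of the Bbeta model being needed for these data.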

Residuals are widely used to check the adequacy of the fitted model. To check the goodness of fit of the Bbeta model, we propose to use the randomized quantile residuals introduced by Dunn and Smyth [7]. Let $F(x_i;\theta_\delta)$ be the cumulative distribution function of the Bbeta distribution, as defined in (5), in which the regression structures are assumed as in (13). The randomized quantile residual is given by

$$r_i=\Phi^{-1}\big(F(x_i;\hat\theta_\delta)\big),\quad i=1,\ldots,n,$$

where $\Phi^{-1}(\cdot)$ is the standard normal quantile function. If the assumed model fits the data well, these residuals have a standard normal distribution [7].
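A quantile residual only requires the fitted CDF (5) and the standard normal quantile function. The sketch below is illustrative and makes two simplifying assumptions: the incomplete beta function is approximated by composite Simpson's rule (so $\alpha_i\geq 1$ and $\beta_i\geq 1$ are assumed), and the fitted values $\hat\alpha_i$, $\hat\beta_i$, $\hat\rho$, $\hat\delta$ are passed in directly. All function names are hypothetical.

```python
import math
from statistics import NormalDist

def _beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def _incomplete_beta(x, a, b, n=2000):
    """B_x(a, b) by composite Simpson's rule (assumes a >= 1, b >= 1, n even)."""
    h = x / n
    g = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    total = g(0.0) + g(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def bbeta_cdf(x, alpha, beta, rho, delta):
    """CDF (5), written as sum_i c_i B_x(alpha+i, beta) / sum_i c_i B(alpha+i, beta)."""
    c = (1 + rho, -2 * delta, delta ** 2)
    num = sum(ci * _incomplete_beta(x, alpha + i, beta) for i, ci in enumerate(c))
    den = sum(ci * _beta_fn(alpha + i, beta) for i, ci in enumerate(c))
    return num / den

def quantile_residual(x, alpha, beta, rho, delta):
    """r = Phi^{-1}(F(x; theta)) for one observation x in (0, 1)."""
    return NormalDist().inv_cdf(bbeta_cdf(x, alpha, beta, rho, delta))

# Toy example: common fitted values alpha = beta = 2, rho = 0.25, delta = 2.
residuals = [quantile_residual(x, 2, 2, 0.25, 2) for x in (0.2, 0.5, 0.8)]
print(residuals)
```

In a regression fit, `alpha` and `beta` would differ across observations through the link functions in (13).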

6. Simulation study

In this section, Monte Carlo simulations are performed (i) to evaluate the finite-sample behaviour of the maximum likelihood estimates of the regression coefficients and (ii) to investigate the empirical distribution of the randomized quantile residuals.

The Monte Carlo experiments were carried out by considering the following regression structure

$$\log(\alpha_i)=\gamma_0+\gamma_1 w_i,\quad \log(\beta_i)=\zeta_0+\zeta_1 z_i,\quad i=1,\ldots,n,$$

i.e. $g_j(\cdot)=\log(\cdot)$, $j=1,2$, where the true values of the parameters were chosen to match the estimates obtained in the regression application (Section 7), i.e. $\gamma_0=-1.8$, $\gamma_1=5.9$, $\zeta_0=3.8$, $\zeta_1=-2.4$, $\rho=0.1$ and $\delta=2.4$. The covariate values $w_i$ and $z_i$ were generated from the standard uniform distribution. The sample sizes considered were $n=50$, 100, 200 and 300. All simulations were conducted in R [22] using the BFGS algorithm available in the optim() function. For each scenario, the Monte Carlo experiment was repeated 5000 times.

The Bbeta distribution is easily simulated from (5) as follows: if $U$ has a uniform $U(0,1)$ distribution, then the solution $X$ of the non-linear equation $F(X;\theta_\delta)=U$, that is, $X=F^{-1}(U;\theta_\delta)$, has the $\mathrm{Bbeta}(\theta_\delta)$ distribution, where $F^{-1}(\cdot;\theta_\delta)$ is the inverse function of $F(\cdot;\theta_\delta)$. To simulate data from this non-linear equation, we can use the programming language R through the f.inv() function [22].
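The inversion step can be sketched with a plain bisection on the CDF (5). This is not the routine used in the paper (which relies on R); it is a stand-in written under the assumptions that $\alpha\geq 1$ and $\beta\geq 1$ (so that Simpson's rule can be used for the incomplete beta function) and that a bisection tolerance of about $2^{-60}$ is acceptable. All function names are hypothetical.

```python
import math
import random

def _beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def _incomplete_beta(x, a, b, n=400):
    """B_x(a, b) by composite Simpson's rule (assumes a >= 1, b >= 1, n even)."""
    h = x / n
    g = lambda t: t ** (a - 1) * (1 - t) ** (b - 1)
    total = g(0.0) + g(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3

def bbeta_cdf(x, alpha, beta, rho, delta):
    """CDF (5) as a ratio of weighted (incomplete) beta functions."""
    c = (1 + rho, -2 * delta, delta ** 2)
    num = sum(ci * _incomplete_beta(x, alpha + i, beta) for i, ci in enumerate(c))
    den = sum(ci * _beta_fn(alpha + i, beta) for i, ci in enumerate(c))
    return num / den

def bbeta_quantile(u, alpha, beta, rho, delta, iters=60):
    """Solve F(x) = u for x in (0, 1) by bisection (F is non-decreasing)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if bbeta_cdf(mid, alpha, beta, rho, delta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_by_inversion(n, alpha, beta, rho, delta, seed=0):
    """Draw n values as X = F^{-1}(U) with U ~ U(0, 1)."""
    rng = random.Random(seed)
    return [bbeta_quantile(rng.random(), alpha, beta, rho, delta) for _ in range(n)]

draws = sample_by_inversion(20, 2, 2, 0.25, 2)
print(draws[:5])
```

Bisection is slow but robust; a faster root finder could be substituted, provided it stays inside $(0,1)$.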

In the rest of this section, a small simulation study is presented to observe the finite sample performance of the proposed estimators from a regression approach. For such evaluation, the estimated bias and the estimated mean squared error (MSE) were calculated. The results are presented in Table 3 and Figure 4.

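Table 3. Estimated bias and mean squared error (MSE) of the maximum likelihood estimators of $\gamma_0$, $\gamma_1$, $\zeta_0$, $\zeta_1$, $\rho$ and $\delta$ for several sample sizes $n$.

| $n$ | Bias $\gamma_0$ | Bias $\gamma_1$ | Bias $\zeta_0$ | Bias $\zeta_1$ | Bias $\delta$ | Bias $\rho$ | MSE $\gamma_0$ | MSE $\gamma_1$ | MSE $\zeta_0$ | MSE $\zeta_1$ | MSE $\delta$ | MSE $\rho$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 0.212 | 0.106 | 0.132 | 0.299 | 0.177 | 1.306 | 0.234 | 0.634 | 0.417 | 0.839 | 0.488 | 0.235 |
| 100 | 0.213 | 0.099 | 0.114 | 0.254 | 0.120 | 0.938 | 0.192 | 0.475 | 0.276 | 0.558 | 0.183 | 0.091 |
| 200 | 0.202 | 0.093 | 0.095 | 0.215 | 0.081 | 0.543 | 0.157 | 0.390 | 0.181 | 0.381 | 0.068 | 0.006 |
| 300 | 0.195 | 0.091 | 0.088 | 0.200 | 0.061 | 0.414 | 0.139 | 0.353 | 0.152 | 0.313 | 0.037 | 0.003 |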

Figure 4. Box plots from 5000 simulated estimates of $\gamma_0$, $\gamma_1$, $\zeta_0$, $\zeta_1$, $\rho$ and $\delta$ for different sample sizes.

Table 3 presents the bias and MSE for the MLEs of $\gamma_0$, $\gamma_1$, $\zeta_0$, $\zeta_1$, $\rho$ and $\delta$. Based on the results in this table, we find that the estimates converge to the corresponding true parameter values. As expected, increasing the sample size $n$ reduces both bias and MSE. These findings are confirmed by the box plots shown in Figure 4.

6.1. Residuals

The second simulation study was performed to examine how well the distribution of the randomized quantile residuals is approximated by the standard normal distribution. The evaluation of the randomized quantile residuals was based on normal probability plots of the mean order statistics and on descriptive statistics. The results are presented in Table 4 and in Figure 4 of the Supplementary Material.

Table 4. Descriptive measures of the randomized quantile residuals for the bimodal beta model for different sample sizes.

| $n$ | Mean | StdDev | Skewness | Kurtosis |
|---|---|---|---|---|
| 50 | −0.001 | 0.999 | 0.028 | 2.854 |
| 100 | −0.002 | 0.999 | 0.054 | 2.976 |
| 200 | −0.003 | 0.997 | 0.077 | 3.002 |
| 300 | −0.003 | 0.997 | 0.084 | 3.025 |

In Table 4, we present the mean, standard deviation (StdDev), skewness and kurtosis of the randomized quantile residuals. For all scenarios, the residuals have approximately zero mean and unit standard deviation, skewness close to zero and kurtosis near three.

7. Real data application

In this section, to evaluate the applicability of the proposed model, a real data set with bimodality is considered. In particular, a real-life application related to the proportion of votes that Jair Bolsonaro received in the second round of the Brazilian elections in 2018 is analysed. We compare the Bbeta regression with the traditional beta regression model. To estimate the model parameters, we adopt the MLE method (as discussed in Section 5). The asymptotic standard errors were computed using the observed Fisher information matrix. The required numerical evaluations for the data analysis were implemented using the R software [22].

The goal of this data analysis is to describe the proportion of votes that Jair Bolsonaro received in the second round of the Brazilian elections in 2018 for all 5565 cities; the data are available at https://dadosabertos.tse.jus.br. The response variable $Y_i$ is the proportion of votes, modelled conditionally on the municipal human development index (MHDI, denoted mhdi). The MHDI is used as an explanatory variable since it is an important measure to guide authorities in assessing progress and social reality, as well as in defining public policy priorities and comparing different cities [19]. Figure 5 shows the histogram, with the estimated density, of the response variable used in the application and the scatter plot of municipal human development against the proportion of votes. From Figure 5, we can see that the response variable is bimodal. Furthermore, there is evidence that the proportion of votes tends to increase with municipal human development. The correlation coefficient between the proportion of votes and the MHDI is 0.8290.

Figure 5. Estimated PDF and scatter plot of municipal human development against proportion of votes.

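To explain this proportion of votes, we consider the bimodal beta regression model, defined as

$$Y_i\sim\mathrm{Bbeta}(\theta_\delta),\quad \log(\alpha_i)=\gamma_0+\gamma_1\,\mathrm{mhdi}_i,\quad \log(\beta_i)=\zeta_0+\zeta_1\,\mathrm{mhdi}_i,$$

where $i=1,2,\ldots,5565$ indexes the cities and $\mathrm{mhdi}_i$ is the municipal human development index of city $i$. For comparison purposes, the beta regression model was fitted, assuming that

$$Y_i\sim\mathrm{beta}(\mu_i,\phi_i),\quad \mathrm{logit}(\mu_i)=\beta_0+\beta_1\,\mathrm{mhdi}_i,\quad \log(\phi_i)=\gamma_0+\gamma_1\,\mathrm{mhdi}_i,$$

and the unit-bimodal Birnbaum-Saunders (UBBS) regression model was fitted, assuming that

$$Y_i\sim\mathrm{UBBS}(\alpha_i,\beta_i,\delta),\quad \log(\alpha_i)=\nu_0+\nu_1\,\mathrm{mhdi}_i,\quad \log(\beta_i)=\eta_0+\eta_1\,\mathrm{mhdi}_i.$$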

Table 5 shows the estimated parameters and standard errors. Table 6 shows the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for the fitted models. In general, the model that fits the data better is expected to have the smallest AIC and BIC values. Based on the AIC and BIC criteria, the model that provides the best fit to this data set is the Bbeta regression model. This claim is also supported by the residual plots with simulated envelopes shown in Figure 6.

Table 5.

Maximum likelihood estimates and standard errors (SE) for the fit of the bimodal beta, beta and unit-bimodal Birnbaum-Saunders models in the proportion of votes.

Model Parameter Estimate SE
Bbeta γ0 −1.8999 0.1963
Bbeta γ1 5.9471 0.3044
Bbeta ζ0 3.8341 0.1915
Bbeta ζ1 −2.4232 0.2862
Bbeta ρ 0.1096 0.0090
Bbeta δ 2.4092 0.0351
beta β0 −7.5343 0.0749
beta β1 11.1820 0.1105
beta γ0 1.0029 0.1675
beta γ1 2.5214 0.2528
UBBS ν0 −0.5721 0.1001
UBBS ν1 −0.1035 0.1436
UBBS η0 5.0120 0.0257
UBBS η1 −8.0601 0.0381
UBBS δ 0.6405 0.0990
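
To make the link functions concrete, the following minimal sketch inverts the log and logit links using the estimates in Table 5. The covariate value `mhdi = 0.7` is a hypothetical city chosen only for illustration, and the function names are not part of the original analysis.

```python
import math

# Point estimates copied from Table 5.
BBETA = {"gamma0": -1.8999, "gamma1": 5.9471, "zeta0": 3.8341, "zeta1": -2.4232}
BETA = {"beta0": -7.5343, "beta1": 11.1820, "gamma0": 1.0029, "gamma1": 2.5214}


def bbeta_shapes(mhdi):
    """Invert the log links: alpha_i = exp(gamma0 + gamma1*mhdi), beta_i likewise."""
    alpha = math.exp(BBETA["gamma0"] + BBETA["gamma1"] * mhdi)
    beta = math.exp(BBETA["zeta0"] + BBETA["zeta1"] * mhdi)
    return alpha, beta


def beta_mean_precision(mhdi):
    """Invert the logit link for mu_i and the log link for phi_i."""
    eta_mu = BETA["beta0"] + BETA["beta1"] * mhdi
    mu = 1.0 / (1.0 + math.exp(-eta_mu))
    phi = math.exp(BETA["gamma0"] + BETA["gamma1"] * mhdi)
    return mu, phi


mhdi = 0.7  # hypothetical covariate value, not taken from the data set
alpha_i, beta_i = bbeta_shapes(mhdi)
mu_i, phi_i = beta_mean_precision(mhdi)
print(f"Bbeta: alpha_i = {alpha_i:.3f}, beta_i = {beta_i:.3f}")
print(f"beta:  mu_i = {mu_i:.3f}, phi_i = {phi_i:.3f}")
```

Because γ1 is positive and ζ1 is negative in the Bbeta fit, α_i grows and β_i shrinks as MHDI increases, which is consistent with the upward trend seen in Figure 5.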

Table 6.

Information criteria for the fitted models.

Models AIC BIC
Bbeta −8786 −8746
beta −8238 −8212
UBBS −8115 −8082
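
As a consistency check on Table 6, the sketch below assumes the standard definitions AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, where k is the number of parameters and ℓ the maximized log-likelihood, so that BIC − AIC = k(log(n) − 2). The parameter counts are read off Table 5 and n = 5565 is the number of cities; the variable names are illustrative only.

```python
import math

n_obs = 5565  # number of cities in the application

# Number of estimated parameters per model, counted from Table 5.
n_params = {"Bbeta": 6, "beta": 4, "UBBS": 5}

# (AIC, BIC) pairs copied from Table 6.
table6 = {"Bbeta": (-8786, -8746), "beta": (-8238, -8212), "UBBS": (-8115, -8082)}


def bic_minus_aic(k, n):
    """Gap implied by AIC = 2k - 2*loglik and BIC = k*log(n) - 2*loglik."""
    return k * (math.log(n) - 2.0)


for model, (aic, bic) in table6.items():
    implied = bic_minus_aic(n_params[model], n_obs)
    print(f"{model}: reported gap = {bic - aic}, implied gap = {implied:.2f}")

# The preferred model is the one with the smallest criterion value.
best_by_aic = min(table6, key=lambda m: table6[m][0])
best_by_bic = min(table6, key=lambda m: table6[m][1])
print(best_by_aic, best_by_bic)
```

Under these assumptions, the reported gaps between BIC and AIC agree with the implied gaps up to rounding, and both criteria select the Bbeta model.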

Figure 6.

Half-normal plot of randomized quantile residuals with simulated envelope for the fit of beta and bimodal beta.
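
The idea behind a half-normal plot with a simulated envelope can be sketched as follows. This is a simplified illustration, not the procedure used to produce Figure 6: it uses a hypothetical Beta(2, 2) model, whose CDF has the closed form 3y² − 2y³, in place of a fitted regression, and it builds the envelope from standard normal draws, relying on the fact that quantile residuals are approximately N(0, 1) when the model is correct. All function names are assumptions made for this example.

```python
import random
from statistics import NormalDist


def quantile_residual(y, cdf):
    """Quantile residual r = Phi^{-1}(F(y)) for a continuous response."""
    u = min(max(cdf(y), 1e-12), 1.0 - 1e-12)  # keep inv_cdf finite
    return NormalDist().inv_cdf(u)


def half_normal_envelope(n, n_sim=99, seed=0):
    """Pointwise min/max of sorted |N(0,1)| samples over n_sim simulations."""
    rng = random.Random(seed)
    sims = [sorted(abs(rng.gauss(0.0, 1.0)) for _ in range(n)) for _ in range(n_sim)]
    lower = [min(s[j] for s in sims) for j in range(n)]
    upper = [max(s[j] for s in sims) for j in range(n)]
    return lower, upper


def beta22_cdf(y):
    """Closed-form CDF of the Beta(2, 2) distribution (illustrative model)."""
    return 3.0 * y**2 - 2.0 * y**3


# Simulated responses from the illustrative model (stand-in for real data).
rng = random.Random(1)
ys = [rng.betavariate(2, 2) for _ in range(200)]

# Ordered absolute residuals are compared with the envelope bands.
res = sorted(abs(quantile_residual(y, beta22_cdf)) for y in ys)
lower, upper = half_normal_envelope(len(res))
inside = sum(lo <= r <= hi for lo, r, hi in zip(lower, res, upper))
print(f"fraction of residuals inside the envelope: {inside / len(res):.2f}")
```

When the assumed model is adequate, most ordered absolute residuals fall inside the envelope; systematic departures suggest lack of fit.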

8. Concluding remarks

The beta distribution, despite its broad applicability in many fields, is not suitable for modelling bimodal responses bounded to the unit interval. In this paper, the well-known two-parameter beta distribution is extended by introducing two extra parameters, thus defining the bimodal beta (Bbeta) distribution, which generalizes the beta distribution and is based on a quadratic transformation technique used to generate bimodal functions [8,28]. We provide a mathematical treatment of the new distribution, including bimodality, moments, entropies, stochastic representation and identifiability. We allow a regression structure for the parameters α and β. The model parameters are estimated by maximum likelihood, and the good performance of the estimators has been evaluated by means of Monte Carlo simulations. Furthermore, we have proposed residuals for the model and conducted a simulation study to establish their empirical properties. The proposed model was fitted to the proportion of votes that Jair Bolsonaro received in the second round of the 2018 Brazilian elections. As expected, the Bbeta model outperforms the beta regression model in the presence of bimodality. Further, the Bbeta model also fits the data well when compared with the UBBS model.

Supplementary Material

Supplemental Material
CJAS_A_2146661_SM1389.pdf (244.2KB, pdf)

Acknowledgments

The authors would like to thank the reviewers for all useful and helpful comments on an earlier version of our manuscript, which resulted in this improved version.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Ahmad K.E. and Al-Hussaini E.K., Remarks on the non-identifiability of mixtures of distributions, Ann. Inst. Stat. Math. 34 (1982), pp. 543–544.
  • 2. Alfaia L.M., A Distribuição Beta Bimodal: Propriedades e Aplicações, UnB, Brasília, 2021.
  • 3. Atienza N., Garcia-Heras J., and Munoz-Pichardo J.M., A new condition for identifiability of finite mixture distributions, Metrika 63 (2006), pp. 215–221.
  • 4. Bayes C.L., Bazán J.L., and Catalina G., A new robust regression model for proportions, Bayesian Anal. 7 (2012), pp. 841–866.
  • 5. de Alencar E.R., Discriminante não-linear Para Mistura De Distribuições Beta, UnB, Brasília, 2018.
  • 6. Domma F., Popović B.V., and Nadarajah S., An extension of Azzalini's method, J. Comput. Appl. Math. 278 (2015), pp. 37–47.
  • 7. Dunn P.K. and Smyth G.K., Randomized quantile residuals, J. Comput. Graph. Stat. 5 (1996), pp. 236–244.
  • 8. Elal-Olivero D., Alpha-skew-normal distribution, Proyecciones J. Math. 29 (2010), pp. 224–240.
  • 9. Ferrari S. and Cribari-Neto F., Beta regression for modelling rates and proportions, J. Appl. Stat. 31 (2004), pp. 799–815.
  • 10. Griffiths L., Introduction to the Theory of Equations, J. Wiley, 1947.
  • 11. Hahn E.D., Regression modelling with the tilted beta distribution: A Bayesian approach, Can. J. Stat. 49 (2021), pp. 262–282.
  • 12. Ji Y., Wu C., Liu P., Wang J., and Coombes K.R., Applications of beta-mixture models in bioinformatics, Bioinformatics 21 (2005), pp. 2118–2122.
  • 13. Johnson N.L., Kotz S., and Balakrishnan N., Continuous Univariate Distributions, Vol. 2, 2nd ed., John Wiley & Sons Inc., New York, 1995.
  • 14. Lin T.I., Lee J.C., and Hsieh W.J., Robust mixture models using the skew-t distribution, Stat. Comput. 17 (2007a), pp. 81–92.
  • 15. Lin T.I., Lee J.C., and Yen S.Y., Finite mixture modeling using the skew-normal distribution, Stat. Sin. 17 (2007b), pp. 909–927.
  • 16. Ma Z. and Leijon A., Beta mixture models and the application to image classification, Proceedings of IEEE International Conference on Image Processing (ICIP), 2009, pp. 2045–2048.
  • 17. Martínez-Flórez M., Martínez E., Tovar-Falón R., and Gómez H.W., A family of bimodal distributions generated by distributions with positive support, J. Appl. Stat. 49 (2022), pp. 3614–3637.
  • 18. Martínez-Flórez M., Olmos N.M., and Venegas O., Unit-bimodal Birnbaum-Saunders distribution with applications, Commun. Stat. – Simul. Comput. (2022), pp. 1–20. doi: 10.1080/03610918.2022.2069260.
  • 19. Menezes A.F.B. and Furriel W.O., Beta and simplex regression models in the analysis of the municipal human development index 2010, Rev. Bras. Biom. 37 (2019), pp. 394–408.
  • 20. Olmos N.M., Martínez-Flórez G., and Bolfarine H., Bimodal Birnbaum-Saunders distribution with applications to non-negative measurements, Commun. Stat. – Theory Methods 46 (2017), pp. 6240–6257.
  • 21. Ospina R. and Ferrari S.L.P., Inflated beta distributions, Stat. Pap. 51 (2008), pp. 111–126.
  • 22. R Core Team, R: A language and environment for statistical computing, R Foundation for Statistical Computing, Vienna, Austria, 2021. URL https://www.R-project.org/.
  • 23. Rao C.R., Quadratic entropy and analysis of diversity, Sankhya Ser. A. 72 (2010), pp. 70–80.
  • 24. Shannon C.E., A mathematical theory of communication, Bell Syst. Tech. J. 27 (1948), pp. 379–423, 623–656.
  • 25. Smithson M., Merkle E.C., and Verkuilen J., Beta regression finite mixture models of polarization and priming, J. Educ. Behav. Stat. 36 (2011), pp. 804–831.
  • 26. Smithson M. and Segale C., Partition priming in judgments of imprecise probabilities, J. Stat. Theory Pract. 3 (2009), pp. 169–181.
  • 27. Tsallis C., Possible generalization of Boltzmann-Gibbs statistics, J. Stat. Phys. 52 (1988), pp. 479–487.
  • 28. Vila R. and Çankaya M.N., A bimodal Weibull distribution: properties and inference, J. Appl. Stat. 49 (2022), pp. 3044–3062.
  • 29. Vila R., Ferreira L., Saulo H., Prataviera F., and Ortega E.M.M., A bimodal gamma distribution: properties, regression model and applications, Statistics 54 (2020), pp. 469–493.
  • 30. Wong M.C., Bubble Value At Risk: A Countercyclical Risk Management Approach, John Wiley & Sons, Singapore, 2013.
  • 31. Xue J., Loop Tiling for Parallelism, The Springer International Series in Engineering and Computer Science, Springer, New York, 2012.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
