Journal of Applied Statistics. 2021 Feb 20;49(8):2035–2051. doi: 10.1080/02664763.2021.1890001

A new heteroscedastic regression to analyze mass loss of wood in civil construction in Brazil

J C S Vasconcelos a, E M M Ortega a, J S Vasconcelos b, G M Cordeiro c, A L Vivan d, M A M Biaggioni b
PMCID: PMC9225232  PMID: 35757588

Abstract

A heteroscedastic regression based on the odd log-logistic Marshall–Olkin normal (OLLMON) distribution is defined by extending previous models. Some structural properties of this distribution are presented. The estimation of the parameters is addressed by maximum likelihood. For different parameter settings, sample sizes and some scenarios, various simulations investigate the performance of the heteroscedastic OLLMON regression. We use residual analysis to detect influential observations and to check the model assumptions. The new regression explains the mass loss of different wood species in civil construction in Brazil.

Keywords: Carbonization in building, heteroscedastic regression, lignocellulosic mass loss, Marshall–Olkin family, regression model

1. Introduction

Wood is the main commercial forest product. The widespread use of wood is due to the high ratio between its resistance and weight, making it an excellent structural and finish material for construction as well as for making a wide range of products. Brazil has a huge variety of woody species, and here we focus on the following:

  • Goupia glabra Aubl. is commonly known in Brazil as ‘Peroba do Norte’ or ‘cupiúba’. According to [11], its wood is dense, with only slight or no distinction between heartwood and sapwood, with brownish-red color and strong odor. It is used to make doors and shutters, boards (for floors and baseboards), furniture, carts, railroad ties, poles, fence posts, small bridges, etc. It grows mainly in the north region of Brazil.

  • Couratari is a genus of the Lecythidaceae family that grows in many regions of Brazil. In particular, the species Courataria trovinosa, known popularly as ‘Tauari’, is native to the Amazon Forest. Its wood is yellowish-white to light yellowish-beige, with moderate sheen, no distinction between heartwood and sapwood, and an odor varying from slightly perceptible to perceptible (in which case it is disagreeable). It has medium density, straight grain and medium texture. Its basic density is 0.5 g/cm³ and it is used in civil construction for doors, shutters and blinds, slats and secondary structural members, decorative items, baseboards, and floorboards [32].

  • Pinus spp. is an important exotic genus, several of whose species are widely planted in Brazil. The wood of Pinus has basic density varying from 0.311 to 0.366 g/cm³ [14] and it is used on a large scale in the civil construction industry [26].

  • Cedrinho, also called cedrilho, is the popular name for Erisma uncinatum Warm., a species native to the tropical forest [25]. It occurs in the Amazon. Its heartwood and sapwood are distinguished by color, with reddish-brown heartwood; the wood is lackluster, with imperceptible smell and taste, low density, straight to reverse grain, and medium to coarse texture. In civil construction it is used to manufacture doors, shutters, frames, slats, rafters, wainscoting, panels, fittings, ceilings, scaffolding, forms for concrete and braces [32].

  • Schefflera morototoni is a tree native to South America. It is called matchwood in English and ‘Morototoni’ or ‘Morototó’ in Portuguese. The wood is used in construction (floors, wainscoting, door/window frames) and the tree grows widely in the Amazon Forest, where it reaches up to 20 m in height with a straight cylindrical trunk. The bark is cream to gray colored, and the wood has light to medium density (0.55–0.60 g/cm³) [18].

All these species are widely used in civil construction in Brazil, thus explaining their choice for analysis in this paper. The study was conducted in 2020 at the Faculty of Agronomic Sciences of Paulista State University (UNESP), located in Botucatu city (state of São Paulo) to investigate the effect of temperature on the mass loss of these species. Few studies have investigated the mass loss of wood when submitted to carbonization and the effects of fire on wood structures. The logical hypothesis is that the higher the temperature, the greater the mass loss of wood will be, and the faster it will turn into charcoal, losing its visual aspect and structural qualities due to decomposition. These hypotheses will be analyzed in this paper.

Figure 1(a–e) presents the distribution of mass loss for each species. These distributions are skewed, some to the left and some to the right. The variability also differs among the species (non-constant variance). In particular, Figure 2 shows the mass loss of all the species taken together, where bimodality can be seen in the data set.

Figure 1.

Histogram for wood mass loss data by species.

Figure 2.

Histogram for wood mass loss data for all species together.

The odd log-logistic Marshall–Olkin normal (OLLMON) distribution is flexible enough to accommodate various density shapes, such as asymmetric and bimodal ones. One objective of this paper is to show that the OLLMON distribution explains the wood data set. The quantile residuals (qrs) are used to investigate possible deviations from the distribution assumption and to detect possible atypical observations.

The paper has the following structure. The OLLMON model is introduced in Section 2. Some mathematical properties are obtained in the Appendix. The heteroscedastic OLLMON regression is defined in Section 3. Also, the accuracy of the maximum likelihood estimators (MLEs) is investigated and some case-deletion diagnostic measures and qrs are defined for the fitted regression. In Section 4, the new regression is fitted to wood mass loss data with five species. These species are widely used in civil construction in Brazil. Some concluding remarks close the paper in Section 5.

2. The formulation of the OLLMON model

Recently, several generalized forms of normal distribution have been explored. Some related works were developed by Marshall and Olkin [19], Gleaton and Lynch [8,9], da Silva Braga et al. [5] and Alizadeh et al. [1].

Several families of distributions have also been developed recently. For example, Hamedani et al. [12] proposed a new extended G family of distributions, Hamedani et al. [13] presented the odd power Lindley generator of probability distributions, Korkmaz et al. [16] introduced the Weibull Marshall–Olkin family, Korkmaz et al. [15] studied the odd log-logistic Marshall–Olkin generalized half normal model, Cordeiro et al. [4] proposed the xgamma-G family and Vila et al. [31] introduced the bimodal gamma distribution.

In a similar direction, the cumulative distribution function (cdf) and probability density function (pdf) of the OLLMON model are defined by

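$$F(y;\mu,\sigma,\nu,\tau)=\frac{\Phi(z)^{\tau}}{\Phi(z)^{\tau}+\nu\,[1-\Phi(z)]^{\tau}} \tag{1}$$

and, say $f(y)=f(y;\mu,\sigma,\nu,\tau)$,

$$f(y)=\frac{\tau\,\nu}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{z^{2}}{2}\right)\frac{\Phi(z)^{\tau-1}\,[1-\Phi(z)]^{\tau-1}}{\{\Phi(z)^{\tau}+\nu\,[1-\Phi(z)]^{\tau}\}^{2}}, \tag{2}$$

respectively, where $z=(y-\mu)/\sigma$, $\mu$ and $\sigma$ are the parameters of the $N(\mu,\sigma^{2})$ distribution, $\nu>0$ and $\tau>0$ are shape parameters and $\Phi(\cdot)$ is the standard normal cdf.

Let $Y\sim\mathrm{OLLMON}(\mu,\sigma,\nu,\tau)$ have density function (2). For $\nu>0$ and $\tau=1$, Equation (2) is the Marshall–Olkin normal (MON) distribution [19]. For $\nu=1$, it is the odd log-logistic normal (OLLN) [5]. The $N(\mu,\sigma^{2})$ distribution is clearly a special model when $\nu=\tau=1$.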

Figure 3(a–d) displays some plots of the pdf of Y. The OLLMON density can have bimodal and asymmetric shapes, which makes it a very flexible model compared to the OLLN, MON and normal.

Figure 3.

Plots of the OLLMON density. (a) For σ=0.5, ν=0.5, τ=0.15 and varying μ. (b) For μ=0.1, ν=2, τ=0.2 and varying σ. (c) For μ=0.1, σ=0.1, τ=0.2 and varying ν. (d) For μ=0.1, σ=0.1, ν=1 and varying τ.

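The quantile function (qf) of Y follows by inverting Equation (1):

$$y=Q(u)=F^{-1}(u;\mu,\sigma,\nu,\tau)=Q_{N}\left\{\frac{(\nu u)^{1/\tau}}{(\nu u)^{1/\tau}+(1-u)^{1/\tau}}\right\},\quad u\in(0,1), \tag{3}$$

where $Q_{N}(u)=G^{-1}_{\mu,\sigma}(u)=\mu+\sigma\,\Phi^{-1}(u)$ and $\Phi^{-1}(u)$ is the standard normal qf.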

The OLLMON distribution can be simulated from Equation (3). Plots comparing its exact density and the histogram of the simulated data for some parameter values are displayed in Figure 4. They reveal that the generated data are consistent with the proposed distribution.
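Simulation from Equation (3) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' code (the paper works in R); the function names `ollmon_cdf`, `ollmon_qf` and `ollmon_rvs` are hypothetical. The upper-tail branch in `ollmon_qf` is an implementation choice made here so that the argument of the normal qf does not round to 1 when τ is small.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()            # used only for the standard normal qf
_SQRT2 = math.sqrt(2.0)


def _phi_cdf(z):
    """Standard normal cdf via erfc, accurate in both tails."""
    return 0.5 * math.erfc(-z / _SQRT2)


def ollmon_cdf(y, mu, sigma, nu, tau):
    """OLLMON cdf, Equation (1)."""
    z = (y - mu) / sigma
    a = _phi_cdf(z) ** tau             # Phi(z)^tau
    b = nu * _phi_cdf(-z) ** tau       # nu * [1 - Phi(z)]^tau
    return a / (a + b)


def ollmon_qf(u, mu, sigma, nu, tau):
    """OLLMON quantile function, Equation (3), for 0 < u < 1."""
    a = (nu * u) ** (1.0 / tau)
    b = (1.0 - u) ** (1.0 / tau)
    if a <= b:
        z = _STD.inv_cdf(a / (a + b))
    else:
        # Work with the upper tail and use the symmetry of the normal qf.
        z = -_STD.inv_cdf(b / (a + b))
    return mu + sigma * z


def ollmon_rvs(n, mu, sigma, nu, tau, rng):
    """Inverse-transform sampling: y = Q(u) with u ~ U(0, 1)."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:                    # random() can return exactly 0
            draws.append(ollmon_qf(u, mu, sigma, nu, tau))
    return draws


# Five draws with mu=3, sigma=0.5, nu=2, tau=0.2.
sample = ollmon_rvs(5, 3.0, 0.5, 2.0, 0.2, random.Random(2021))
```

Setting ν = τ = 1 reduces `ollmon_qf` to the normal qf μ + σΦ⁻¹(u), which gives a quick sanity check. For very small τ the powers 1/τ can underflow in double precision, so this sketch is only intended for moderate parameter values.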

Figure 4.

Histograms and plots of the OLLMON density.

Some mathematical properties of the new distribution are addressed in the Appendix.

3. The heteroscedastic OLLMON regression

The proper choice of the distribution for a response variable Y is fundamental in regression modeling. Several works can be found in the literature on heteroscedastic regressions. For example, Labra et al. [17] developed a heteroscedastic nonlinear regression under scale mixtures of skew-normal, Ortega et al. [21] addressed the heteroscedastic log-exponentiated Weibull regression with censored data, Prataviera et al. [23] presented a generalized flexible Weibull regression for repairable systems, Souza Vasconcelos et al. [27] and Vasconcelos et al. [30] defined the odd log-logistic generalized inverse Gaussian and odd log-logistic extended gamma regressions with two systematic components and Prataviera et al. [22] proposed non-proportional hazard models for survival data.

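Based on the OLLMON distribution and the works previously mentioned, a heteroscedastic regression is defined by the response variable $Y_i\sim\mathrm{OLLMON}(\mu_i,\sigma_i,\nu,\tau)$ having density (2) under two systematic components for the location $\mu_i$ and scale $\sigma_i$ parameters

$$\mu_i=\mathbf{x}_i^{T}\boldsymbol{\beta}_1\quad\text{and}\quad g(\sigma_i)=\eta_i=\mathbf{v}_i^{T}\boldsymbol{\beta}_2,\quad i=1,\ldots,n, \tag{4}$$

where $\mathbf{x}_i=(x_{i1},\ldots,x_{ip_1})^{T}$ and $\mathbf{v}_i=(v_{i1},\ldots,v_{ip_2})^{T}$ are vectors of known explanatory variables, $\boldsymbol{\beta}_1=(\beta_{11},\ldots,\beta_{1p_1})^{T}$ and $\boldsymbol{\beta}_2=(\beta_{21},\ldots,\beta_{2p_2})^{T}$ are vectors of unknown parameters, and $g(\sigma_i)=\log(\sigma_i)$ is a twice differentiable link function. The second component in (4) accounts for the varying dispersion.

The heteroscedastic OLLMON regression contains as special models: the heteroscedastic MON regression when $\tau=1$, the heteroscedastic OLLN regression when $\nu=1$ and the heteroscedastic normal regression when $\tau=\nu=1$.

Let $(y_1,\mathbf{x}_1,\mathbf{v}_1),\ldots,(y_n,\mathbf{x}_n,\mathbf{v}_n)$ be a set of n independent observations. The log-likelihood function for $\boldsymbol{\theta}=(\boldsymbol{\beta}_1^{T},\boldsymbol{\beta}_2^{T},\nu,\tau)^{T}$ is

$$l(\boldsymbol{\theta})=n\log(\tau\nu)+\sum_{i=1}^{n}\log\left(\frac{1}{\sqrt{2\pi}\,\sigma_i}\right)-\frac{1}{2}\sum_{i=1}^{n}z_i^{2}+(\tau-1)\sum_{i=1}^{n}\log\Phi(z_i)+(\tau-1)\sum_{i=1}^{n}\log[1-\Phi(z_i)]-2\sum_{i=1}^{n}\log\{\nu\,[1-\Phi(z_i)]^{\tau}+\Phi(z_i)^{\tau}\}, \tag{5}$$

where $z_i=(y_i-\mu_i)/\sigma_i$.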

The maximum likelihood estimator (MLE) $\hat{\boldsymbol{\theta}}$ can be found by maximizing (5) using the gamlss package [28] in R software.
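To make the structure of the log-likelihood (5) concrete, the sketch below evaluates it in plain Python for given coefficient vectors. It is an illustration only: the paper maximizes (5) with gamlss in R, and the function name `ollmon_loglik` and the list-based design matrices `X` and `V` are assumptions introduced here. No optimizer is included.

```python
import math

_SQRT2 = math.sqrt(2.0)


def _log_phi_cdf(z):
    """log Phi(z), with Phi computed through erfc for tail accuracy."""
    return math.log(0.5 * math.erfc(-z / _SQRT2))


def ollmon_loglik(beta1, beta2, nu, tau, y, X, V):
    """Log-likelihood (5) with mu_i = x_i' beta1 and log(sigma_i) = v_i' beta2."""
    total = 0.0
    for yi, xi, vi in zip(y, X, V):
        mu = sum(b * x for b, x in zip(beta1, xi))
        log_sigma = sum(b * v for b, v in zip(beta2, vi))
        z = (yi - mu) / math.exp(log_sigma)
        lp = _log_phi_cdf(z)        # log Phi(z_i)
        lq = _log_phi_cdf(-z)       # log [1 - Phi(z_i)]
        # log{ nu [1 - Phi(z_i)]^tau + Phi(z_i)^tau }
        log_den = math.log(nu * math.exp(tau * lq) + math.exp(tau * lp))
        total += (math.log(tau * nu)
                  - 0.5 * math.log(2.0 * math.pi) - log_sigma
                  - 0.5 * z * z
                  + (tau - 1.0) * (lp + lq)
                  - 2.0 * log_den)
    return total
```

With ν = τ = 1 the last two terms vanish and the function returns the heteroscedastic normal log-likelihood, in line with the special models listed above.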

3.1. Simulation study

Monte Carlo simulations are carried out to verify the consistency of the MLEs. There are two approaches: the first examines the accuracy of the estimators in the OLLMON distribution, and the second adopts the proposed regression under two systematic components. For both approaches, n = 40, 70 and 200. Each procedure is repeated 3000 times to find the average estimates (AEs), biases and mean squared errors (MSEs).

  • OLLMON distribution

    In the first approach, the steps are:
    1. The generated values $y_1,\ldots,y_n$ are obtained from Equation (3), where $u\sim U(0,1)$.
    2. The parameter values are based on Figure 4(a–d), which shows different forms of the proposed distribution. Thus, we consider the following scenarios:
      • Scenario 1: μ=3, σ=0.5, ν=2 and τ=0.2 (left asymmetry).
      • Scenario 2: μ=1, σ=0.1, ν=1 and τ=0.2 (bimodal).
      • Scenario 3: μ=5, σ=0.1, ν=0.7 and τ=0.2 (right asymmetry).
      • Scenario 4: μ=2, σ=0.1, ν=4 and τ=0.45 (unimodal form).
    3. For each combination of n, μ, σ, ν and τ, 3000 random samples are generated, and then the estimates $\hat{\mu}$, $\hat{\sigma}$, $\hat{\nu}$ and $\hat{\tau}$ are calculated.

    The simulation results are reported in Table 1. In most cases, the biases and the MSEs decrease when the sample size increases, so the estimators behave satisfactorily.

  • Heteroscedastic OLLMON regression

    In the second approach, the regression is simulated under the conditions:
    1. The parameter values are: $\nu=0.7$, $\tau=0.6$, $\beta_{10}=0.9$, $\beta_{11}=0.5$, $\beta_{12}=0.4$, $\beta_{20}=1.7$, $\beta_{21}=0.3$ and $\beta_{22}=0.7$.
    2. The systematic components are: $\mu_i=\beta_{10}+\beta_{11}x_{i1}+\beta_{12}x_{i2}$ and $\sigma_i=\exp(\beta_{20}+\beta_{21}x_{i1}+\beta_{22}x_{i2})$.
    3. The explanatory variables are: $x_{i1}\sim\mathrm{Uniform}(0,2)$ and $x_{i2}\sim\mathrm{Bernoulli}(0.5)$.
    4. For each combination of the values of n, $\beta_{10}$, $\beta_{11}$, $\beta_{12}$, $\beta_{20}$, $\beta_{21}$, $\beta_{22}$, ν and τ, 3000 samples are generated for the response variable $y_i\sim\mathrm{OLLMON}(\mu_i,\sigma_i,\nu,\tau)$ from Equation (3).
    5. The estimates $\hat{\beta}_{10}$, $\hat{\beta}_{11}$, $\hat{\beta}_{12}$, $\hat{\beta}_{20}$, $\hat{\beta}_{21}$, $\hat{\beta}_{22}$, $\hat{\nu}$ and $\hat{\tau}$ are then calculated for each replication.

    The estimated quantities from the fitted OLLMON regression reported in Table 2 indicate that the AEs converge to the true parameters and that the MSEs decay toward zero when n increases, thus showing good accuracy of the estimators.
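The data-generating part of the second approach (steps 1 to 4) and the summary measures can be sketched as below. This is a hedged Python illustration, not the authors' R code: the names `simulate_regression` and `mc_summary` are hypothetical, the fitting step (step 5) is left out because it requires a numerical optimizer, and the bias is computed as AE minus the true value, which is an assumption about the sign convention behind Tables 1 and 2.

```python
import math
import random
from statistics import NormalDist, fmean

_STD = NormalDist()

# Parameter values from step 1 of the second approach.
BETA1 = (0.9, 0.5, 0.4)
BETA2 = (1.7, 0.3, 0.7)
NU, TAU = 0.7, 0.6


def ollmon_qf(u, mu, sigma, nu, tau):
    """OLLMON quantile function, Equation (3), for 0 < u < 1."""
    a = (nu * u) ** (1.0 / tau)
    b = (1.0 - u) ** (1.0 / tau)
    if a <= b:
        z = _STD.inv_cdf(a / (a + b))
    else:
        z = -_STD.inv_cdf(b / (a + b))   # upper tail, by symmetry
    return mu + sigma * z


def simulate_regression(n, beta1, beta2, nu, tau, rng):
    """Return n tuples (y_i, x_i1, x_i2) generated as in steps 2 to 4."""
    rows = []
    for _ in range(n):
        x1 = rng.uniform(0.0, 2.0)                 # x_i1 ~ Uniform(0, 2)
        x2 = 1.0 if rng.random() < 0.5 else 0.0    # x_i2 ~ Bernoulli(0.5)
        mu = beta1[0] + beta1[1] * x1 + beta1[2] * x2
        sigma = math.exp(beta2[0] + beta2[1] * x1 + beta2[2] * x2)
        u = rng.random()
        while u == 0.0:                            # keep u inside (0, 1)
            u = rng.random()
        rows.append((ollmon_qf(u, mu, sigma, nu, tau), x1, x2))
    return rows


def mc_summary(estimates, true_value):
    """Average estimate, bias (AE - true value) and MSE over replications."""
    ae = fmean(estimates)
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return ae, ae - true_value, mse


one_sample = simulate_regression(40, BETA1, BETA2, NU, TAU, random.Random(3))
```

In a full study, `simulate_regression` would be called 3000 times for each n, the model fitted to each sample, and `mc_summary` applied to the 3000 estimates of each parameter.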

Table 1.

Simulation results from the OLLMON distribution.

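| Parameter | AE (n = 40) | Bias (n = 40) | MSE (n = 40) | AE (n = 70) | Bias (n = 70) | MSE (n = 70) | AE (n = 200) | Bias (n = 200) | MSE (n = 200) |
|---|---|---|---|---|---|---|---|---|---|
| Scenario 1 | | | | | | | | | |
| μ | 3.149 | 0.149 | 0.080 | 3.135 | 0.135 | 0.059 | 3.137 | 0.137 | 0.041 |
| σ | 0.562 | 0.062 | 0.035 | 0.534 | 0.034 | 0.019 | 0.502 | 0.002 | 0.005 |
| ν | 1.813 | 0.187 | 0.223 | 1.836 | 0.164 | 0.201 | 1.840 | 0.160 | 0.153 |
| τ | 0.271 | 0.071 | 0.025 | 0.243 | 0.043 | 0.012 | 0.215 | 0.015 | 0.003 |
| Scenario 2 | | | | | | | | | |
| μ | 0.997 | 0.003 | 0.005 | 1.003 | 0.003 | 0.003 | 1.010 | 0.010 | 0.001 |
| σ | 0.120 | 0.020 | 0.003 | 0.113 | 0.013 | 0.001 | 0.102 | 0.002 | 0.000 |
| ν | 1.150 | 0.150 | 0.942 | 1.044 | 0.044 | 0.172 | 0.965 | 0.035 | 0.043 |
| τ | 0.281 | 0.081 | 0.035 | 0.252 | 0.052 | 0.017 | 0.212 | 0.012 | 0.004 |
| Scenario 3 | | | | | | | | | |
| μ | 5.002 | 0.002 | 0.005 | 5.006 | 0.006 | 0.004 | 5.009 | 0.009 | 0.001 |
| σ | 0.121 | 0.021 | 0.003 | 0.111 | 0.011 | 0.001 | 0.102 | 0.002 | 0.000 |
| ν | 0.761 | 0.061 | 0.186 | 0.715 | 0.015 | 0.072 | 0.675 | 0.025 | 0.020 |
| τ | 0.286 | 0.086 | 0.042 | 0.247 | 0.047 | 0.018 | 0.209 | 0.009 | 0.005 |
| Scenario 4 | | | | | | | | | |
| μ | 2.026 | 0.026 | 0.002 | 2.022 | 0.022 | 0.001 | 2.018 | 0.018 | 0.001 |
| σ | 0.118 | 0.018 | 0.002 | 0.113 | 0.013 | 0.001 | 0.104 | 0.004 | 0.000 |
| ν | 3.339 | 0.661 | 0.582 | 3.353 | 0.647 | 0.564 | 3.415 | 0.585 | 0.479 |
| τ | 0.612 | 0.162 | 0.115 | 0.567 | 0.117 | 0.067 | 0.495 | 0.045 | 0.013 |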

Table 2.

Simulated quantities from the OLLMON regression.

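| Parameter | AE (n = 40) | Bias (n = 40) | MSE (n = 40) | AE (n = 70) | Bias (n = 70) | MSE (n = 70) | AE (n = 200) | Bias (n = 200) | MSE (n = 200) |
|---|---|---|---|---|---|---|---|---|---|
| $\beta_{10}$ | 0.930 | 0.030 | 0.034 | 0.920 | 0.020 | 0.016 | 0.911 | 0.011 | 0.005 |
| $\beta_{11}$ | 0.519 | 0.019 | 0.029 | 0.518 | 0.018 | 0.014 | 0.510 | 0.010 | 0.004 |
| $\beta_{12}$ | 0.465 | 0.065 | 0.059 | 0.449 | 0.049 | 0.031 | 0.429 | 0.029 | 0.010 |
| $\beta_{20}$ | 2.196 | 0.496 | 0.482 | 2.020 | 0.320 | 0.254 | 1.819 | 0.119 | 0.074 |
| $\beta_{21}$ | 0.310 | 0.010 | 0.056 | 0.304 | 0.004 | 0.023 | 0.302 | 0.002 | 0.007 |
| $\beta_{22}$ | 0.712 | 0.012 | 0.060 | 0.694 | 0.006 | 0.028 | 0.683 | 0.017 | 0.009 |
| ν | 0.588 | 0.112 | 0.063 | 0.617 | 0.083 | 0.052 | 0.655 | 0.045 | 0.030 |
| τ | 0.379 | 0.221 | 0.100 | 0.444 | 0.156 | 0.069 | 0.537 | 0.063 | 0.030 |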

3.2. Diagnostic and residual analysis

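Case deletion is a common approach to verify the effect of deleting an observation from a data set [33]. Let the subscript ‘$(i)$’ denote the original quantity with the ith observation deleted. So, $\hat{\boldsymbol{\theta}}_{(i)}=(\hat{\boldsymbol{\beta}}_{1(i)}^{T},\hat{\boldsymbol{\beta}}_{2(i)}^{T},\hat{\nu}_{(i)},\hat{\tau}_{(i)})^{T}$ is the MLE of $\boldsymbol{\theta}$ obtained by maximizing $l_{(i)}(\boldsymbol{\theta})$. The generalized Cook distance and the likelihood distance are useful influence measures for the ith observation, defined by $GD_i(\boldsymbol{\theta})=(\hat{\boldsymbol{\theta}}_{(i)}-\hat{\boldsymbol{\theta}})^{T}\{-\ddot{L}(\boldsymbol{\theta})\}(\hat{\boldsymbol{\theta}}_{(i)}-\hat{\boldsymbol{\theta}})$ and $LD_i(\boldsymbol{\theta})=2\{l(\hat{\boldsymbol{\theta}})-l(\hat{\boldsymbol{\theta}}_{(i)})\}$, respectively.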

If a regression is fitted to a data set, the quality of the fit can be assessed by analyzing the residuals. This analysis identifies discrepant observations and shows whether there are serious departures from the regression assumptions. If the regression is suitable, the plots of the residuals versus the order of the observations or the predicted values should behave randomly around zero.

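The qrs [6] have been adopted frequently in regression applications. For the proposed fitted regression, they are

$$qr_i=\Phi^{-1}\left\{\frac{\Phi(\hat{z}_i)^{\hat{\tau}}}{\Phi(\hat{z}_i)^{\hat{\tau}}+\hat{\nu}\,[1-\Phi(\hat{z}_i)]^{\hat{\tau}}}\right\}, \tag{6}$$

where $\hat{z}_i=(y_i-\hat{\mu}_i)/\hat{\sigma}_i$.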
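Equation (6) amounts to applying the fitted cdf (1) to each observation and mapping the result back through the standard normal qf. A minimal Python sketch is given below; the function name `quantile_residual` is hypothetical and the fitted values are assumed to be supplied by whatever software was used for estimation.

```python
import math
from statistics import NormalDist

_STD = NormalDist()
_SQRT2 = math.sqrt(2.0)


def quantile_residual(y, mu_hat, sigma_hat, nu_hat, tau_hat):
    """Quantile residual (6) for one observation."""
    z = (y - mu_hat) / sigma_hat
    p = 0.5 * math.erfc(-z / _SQRT2)     # Phi(z_hat)
    q = 0.5 * math.erfc(z / _SQRT2)      # 1 - Phi(z_hat)
    a = p ** tau_hat
    b = nu_hat * q ** tau_hat
    # The fitted cdf is a / (a + b); invert Phi from the smaller tail.
    if a <= b:
        return _STD.inv_cdf(a / (a + b))
    return -_STD.inv_cdf(b / (a + b))


# With nu = tau = 1 the residual is the standardized value z itself.
r = quantile_residual(1.3, 1.0, 0.5, 1.0, 1.0)   # approximately 0.6
```

If the model is adequate, the residuals computed this way should look like a standard normal sample, which is what the normal probability plots check.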

Monte Carlo simulations with 3000 generated samples examine the empirical distribution of the $qr_i$'s. Samples of sizes 40, 70 and 200 are generated following the same structure of the algorithm described in Section 3.1. Plots of the qrs versus the expected normal order statistics are displayed in Figure 5(a–c). The empirical distribution of these residuals is quite close to the standard normal distribution.

Figure 5.

Normal probability plots for the $qr_i$'s in the fitted OLLMON regression. (a) n = 40. (b) n = 70. (c) n = 200.

4. Application to wood species data

The experiments were conducted in the months of February and March 2020 at the Faculty of Agronomic Sciences of the Paulista State University (UNESP), located in Botucatu (São Paulo state). The species studied are: Peroba do Norte, Tauari, Cedrinho, Pinus, and Morototoni. We chose these species because they are widely used in civil construction and have a wide range of basic densities. The wood test bodies had thickness of 2.0 cm, width of 2.0 cm and length of 5.0 cm, and they were heated in a muffle furnace (a type of oven used in laboratories to heat samples to high temperatures) to final temperatures of 400 °C to 500 °C. The final temperature was maintained for 10 min and the carbonization rate was 14.3 °C/min for each wood species. We weighed each sample after natural cooling and homogenization of the weight of the sample. The variables considered are:

  • $y_i$: wood mass loss (g);

  • $x_{i1}$: temperature (°C) with two levels (400 and 500);

  • $x_{i2}$: five wood species, $i=1,2,\ldots,70$. In this case, we define four dummy variables ($d_{ij}$), $j=2,\ldots,5$.

Table 3 lists the means and standard deviations (SDs) of the mass loss for each temperature and each wood species.

Table 3.

Statistics for the wood mass loss data.

Temperatures and species Mean SD
400 °C 9.823 3.125
500 °C 10.615 3.616
Peroba do Norte 16.178 1.467
Tauari 10.690 0.673
Cedrinho 8.935 0.599
Pinus 8.337 0.696
Morototoni 6.955 1.266

4.1. Marginal modeling of the response variable under the OLLMON distribution

Table 4 gives the MLEs and SEs (in parentheses) from some fitted distributions, together with the Akaike Information Criterion (AIC) and the Global Deviance (GD), the last statistic defined by GD $=-2\hat{l}$, where $\hat{l}$ is the maximized log-likelihood function; see [29]. These statistics indicate that the OLLMON distribution is the best model for the current data.

Table 4.

Findings for some distributions fitted to wood species.

| Model | μ (SE) | log(σ) (SE) | ν (SE) | τ (SE) | AIC | GD |
|---|---|---|---|---|---|---|
| OLLMON | 13.337 (0.231) | 0.376 (0.044) | 0.204 (0.042) | 0.079 (0.007) | 340.961 | 332.961 |
| OLLN | 9.852 (0.457) | 1.902 (1.069) | 1 (–) | 2.212 (2.530) | 373.096 | 367.096 |
| MON | 14.729 (0.355) | 1.226 (0.061) | 0.069 (0.014) | 1 (–) | 359.895 | 353.895 |
| Normal | 10.219 (0.401) | 1.210 (0.085) | – | – | 372.100 | 368.100 |

Likelihood ratio (LR) statistics compare nested models in Table 5, where the p-values reveal that the new distribution is the best among the four.

Table 5.

LR statistics.

| Models | Hypotheses | LR statistic | p-value |
|---|---|---|---|
| OLLMON vs OLLN | $H_0:\nu=1$ vs $H_1:H_0$ is false | 34.135 | <0.001 |
| OLLMON vs MON | $H_0:\tau=1$ vs $H_1:H_0$ is false | 20.934 | <0.001 |
| OLLMON vs Normal | $H_0:\nu=1$ and $\tau=1$ vs $H_1:H_0$ is false | 35.139 | <0.001 |
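The entries of Tables 4 and 5 are linked by simple arithmetic: the AIC is the GD plus twice the number of estimated parameters, and each LR statistic is the difference between the GD of the restricted model and that of the OLLMON model. The short Python sketch below reproduces this check. The function names `aic` and `lr_test` are hypothetical, and the chi-square reference distribution (with one or two degrees of freedom, matching the number of restrictions) is the usual large-sample assumption rather than something stated explicitly in the text.

```python
import math


def aic(gd, n_params):
    """AIC = GD + 2 * (number of estimated parameters)."""
    return gd + 2 * n_params


def lr_test(gd_null, gd_full, df):
    """LR statistic and chi-square p-value for df = 1 or df = 2."""
    lr = max(gd_null - gd_full, 0.0)
    if df == 1:
        p_value = math.erfc(math.sqrt(lr / 2.0))   # P(chi2_1 > lr)
    elif df == 2:
        p_value = math.exp(-lr / 2.0)              # P(chi2_2 > lr)
    else:
        raise ValueError("only df = 1 or df = 2 are handled in this sketch")
    return lr, p_value


# GD values taken from Table 4.
GD = {"OLLMON": 332.961, "OLLN": 367.096, "MON": 353.895, "Normal": 368.100}

lr_olln, p_olln = lr_test(GD["OLLN"], GD["OLLMON"], df=1)
lr_mon, p_mon = lr_test(GD["MON"], GD["OLLMON"], df=1)
lr_norm, p_norm = lr_test(GD["Normal"], GD["OLLMON"], df=2)
aic_ollmon = aic(GD["OLLMON"], 4)   # four parameters: mu, sigma, nu, tau
```

The closed forms used for the p-values are exact for one and two degrees of freedom, which avoids any dependency beyond the standard library.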

Figure 6(a) provides the data histogram and some fitted densities, whereas Figure 6(b) gives the empirical cdf and the estimated cdfs. These plots indicate that the OLLMON distribution is an excellent alternative for modeling the data on wood species.

Figure 6.

(a) Estimated densities. (b) Estimated cumulative functions and empirical cdf.

4.2. The heteroscedastic OLLMON regression

Consider the heteroscedastic OLLMON regression with the systematic components

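$$\mu_i=\beta_{10}+\beta_{11}x_{i1}+\sum_{j=2}^{5}\beta_{1j}d_{ij}\quad\text{and}\quad\sigma_i=\exp\left(\beta_{20}+\beta_{21}x_{i1}+\sum_{j=2}^{5}\beta_{2j}d_{ij}\right).$$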

The MLEs, SEs and p-values are reported in Table 6. Interpretations for significant regression coefficients are addressed at the end of this section.

Table 6.

Findings from the new regression fitted to wood species data.

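| Component | Effect | Parameter | Estimate | SE | p-value |
|---|---|---|---|---|---|
| μ | Intercept | $\beta_{10}$ | 16.367 | 0.439 | <0.001 |
| μ | Temp. 500 | $\beta_{11}$ | 0.672 | 0.157 | <0.001 |
| μ | Tauari | $\beta_{12}$ | −5.871 | 0.423 | <0.001 |
| μ | Cedrinho | $\beta_{13}$ | −7.443 | 0.402 | <0.001 |
| μ | Pinus | $\beta_{14}$ | −8.099 | 0.395 | <0.001 |
| μ | Morototoni | $\beta_{15}$ | −9.048 | 0.506 | <0.001 |
| σ | Intercept | $\beta_{20}$ | −0.804 | 0.467 | 0.090 |
| σ | Temp. 500 | $\beta_{21}$ | −0.058 | 0.159 | 0.717 |
| σ | Tauari | $\beta_{22}$ | −0.582 | 0.221 | 0.011 |
| σ | Cedrinho | $\beta_{23}$ | −0.937 | 0.224 | <0.001 |
| σ | Pinus | $\beta_{24}$ | −0.665 | 0.219 | 0.004 |
| σ | Morototoni | $\beta_{25}$ | 0.035 | 0.224 | 0.877 |
| | | ν | 0.464 | 0.161 | |
| | | τ | 0.237 | 0.166 | |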
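As a worked illustration of the two systematic components, the Python sketch below plugs the estimates of Table 6 into the expressions for $\mu_i$ and $\sigma_i$. Two assumptions are made and should be checked against the original analysis: $x_{i1}$ is treated as a 0/1 indicator of the 500 °C level (suggested by the row label "Temp. 500"), and Peroba do Norte is the reference species, so its dummy effects are zero. The function name `fitted_mu_sigma` is hypothetical.

```python
import math

# Estimates copied from Table 6 (location and scale components).
MU_INTERCEPT, MU_TEMP500 = 16.367, 0.672
MU_SPECIES = {"Tauari": -5.871, "Cedrinho": -7.443,
              "Pinus": -8.099, "Morototoni": -9.048}

SIGMA_INTERCEPT, SIGMA_TEMP500 = -0.804, -0.058
SIGMA_SPECIES = {"Tauari": -0.582, "Cedrinho": -0.937,
                 "Pinus": -0.665, "Morototoni": 0.035}


def fitted_mu_sigma(species, temp500):
    """Fitted (mu_i, sigma_i); temp500 is 1 for 500 degrees C, 0 for 400."""
    # Species missing from the dictionaries fall back to the reference level.
    mu = MU_INTERCEPT + MU_TEMP500 * temp500 + MU_SPECIES.get(species, 0.0)
    eta = (SIGMA_INTERCEPT + SIGMA_TEMP500 * temp500
           + SIGMA_SPECIES.get(species, 0.0))
    return mu, math.exp(eta)       # log link on the scale parameter


mu_tauari, sigma_tauari = fitted_mu_sigma("Tauari", 1)
mu_peroba, sigma_peroba = fitted_mu_sigma("Peroba do Norte", 0)
```

For Tauari at 500 °C this gives $\hat{\mu}=16.367+0.672-5.871=11.168$ and $\hat{\sigma}=\exp(-1.444)$. Note that $\mu_i$ and $\sigma_i$ are the location and scale parameters of the OLLMON distribution, not the mean and standard deviation of the response, since ν and τ also shape the density.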

Table 7 gives the GD and AIC statistics for comparing the proposed regression with three other nested regressions, thus indicating that it outperforms the other regressions irrespective of the criteria. The LR statistics reported in Table 8 indicate that the OLLMON regression is the best model for these data.

Table 7.

Model selection measures for the wood species data.

Model AIC GD
OLLMON 183.612 155.612
OLLN 186.212 160.213
MON 186.231 160.418
N 186.418 162.231

Table 8.

LR tests for the wood mass loss data.

| Regressions | Hypotheses | LR statistic | p-value |
|---|---|---|---|
| OLLMON vs OLLN | $H_0:\nu=1$ vs $H_1:H_0$ is false | 4.601 | 0.032 |
| OLLMON vs MON | $H_0:\tau=1$ vs $H_1:H_0$ is false | 4.806 | 0.028 |
| OLLMON vs N | $H_0:\nu=1$ and $\tau=1$ vs $H_1:H_0$ is false | 6.619 | 0.037 |

Next, the measures $GD_i(\boldsymbol{\theta})$ and $LD_i(\boldsymbol{\theta})$ are plotted in Figure 7, where observations 17 and 29 appear as influential cases. In addition, Figure 8(a) displays the qrs from the fitted OLLMON regression, where all points behave randomly and lie in the interval $[-3,3]$. The normal probability plot for the qrs with envelope [2] can reveal deviations from the response distribution in the fitted regression. No observations fall outside the envelope in Figure 8(b), so the new regression is suitable for these data.

Figure 7.

Index plots for $\boldsymbol{\theta}$: (a) $LD_i(\boldsymbol{\theta})$ and (b) $GD_i(\boldsymbol{\theta})$.

Figure 8.

(a) Index plot of qrs. (b) Envelope for the qrs.

Multiple comparisons of the wood species in relation to mass loss reported in Table 9 (under the fitted OLLMON regression) reveal that the species are different in relation to the loss of wood mass. Table 9 is constructed using different regressions based on the OLLMON distribution, in which we only change the reference levels. For example, we start the regression analysis by setting the Peroba do Norte species as reference level, then we use the Tauari species as reference level, and so forth.

Table 9.

Results from the fitted OLLMON regression to wood species data.

Tests for the location μ
Hypotheses H0 Estimate SE p-value
Tauari - Peroba do Norte = 0 5.871 0.423 <0.001
Cedrinho - Peroba do Norte = 0 7.443 0.402 <0.001
Pinus - Peroba do Norte = 0 8.099 0.395 <0.001
Morototoni - Peroba do Norte = 0 9.048 0.506 <0.001
Cedrinho - Tauari = 0 1.573 0.201 <0.001
Pinus - Tauari = 0 2.229 0.221 <0.001
Morototoni - Tauari = 0 3.178 0.392 <0.001
Pinus - Cedrinho = 0 0.656 0.204 0.002
Morototoni - Cedrinho = 0 1.605 0.363 <0.001
Morototoni - Pinus = 0 0.949 0.386 0.017
Tests for the scale σ
Hypotheses H0 Estimate SE p-value
Tauari - Peroba do Norte = 0 0.582 0.221 0.011
Cedrinho - Peroba do Norte = 0 0.937 0.224 <0.001
Pinus - Peroba do Norte = 0 0.665 0.219 0.004
Morototoni - Peroba do Norte = 0 0.035 0.224 0.877
Cedrinho - Tauari = 0 0.354 0.219 0.112
Pinus - Tauari = 0 0.083 0.212 0.696
Morototoni - Tauari = 0 0.617 0.213 0.005
Pinus - Cedrinho = 0 0.271 0.226 0.235
Morototoni - Cedrinho = 0 0.971 0.217 <0.001
Morototoni - Pinus = 0 0.700 0.221 0.003

A graphical comparison among the wood species is displayed in Figure 9. These plots provide the empirical and the estimated cdfs (under the fitted OLLMON regression) for all wood species. They confirm that there are relevant differences among wood species in terms of weight loss of wood.

Figure 9.

Estimated cdfs for all wood species.

  • Findings for the location parameter μ of the loss of wood mass.
    1. There is a significant difference between the temperatures (400 °C and 500 °C) at the 5% level to explain the location μ of the loss of wood mass.
    2. The numbers in Table 9 reveal that the species Tauari, Cedrinho, Pinus and Morototoni differ significantly from the species Peroba do Norte (reference level) for explaining the location μ.
    3. The species Cedrinho, Pinus and Morototoni show a significant difference at the 5% level compared with Tauari to explain the location μ.
    4. The species Pinus and Morototoni present a significant difference in relation to Cedrinho to explain μ.
    5. Morototoni and Pinus differ significantly at the 5% level for explaining μ.
    6. Peroba do Norte has the largest estimated location of the loss of wood mass and therefore the worst performance, whereas Morototoni has the smallest and stands out favorably. All of these significant differences can also be noted graphically in Figure 9.
  • Findings for the scale parameter σ of the loss of wood mass.
    1. There is no significant difference between the temperatures 400 °C and 500 °C to explain the scale σ of the loss of wood mass.
    2. According to Table 9, the species Tauari, Cedrinho and Pinus differ significantly from Peroba do Norte (taken as reference level) in the scale of the loss of wood mass, whereas Morototoni does not (p-value 0.877).
    3. The scale of the loss of wood mass for Morototoni is significantly greater than that for Tauari (reference level).
    4. The scale of the loss of wood mass for Morototoni is different from that for Cedrinho.
    5. The scales of the loss of wood mass for the species Morototoni and Pinus are statistically different.

    In summary, Peroba do Norte leads to the greatest scale of the loss of wood mass. On the other hand, the species Cedrinho, Tauari and Pinus yield lower scales of this loss. The variability of the scales can also be seen in Figure 10.

Figure 10.

Estimated densities of loss of wood mass of each species. (a) Peroba do Norte, (b) Cedrinho, (c) Morototoni, (d) Pinus, (e) Tauari.

5. Concluding remarks

The article introduces a new regression based on the odd log-logistic Marshall–Olkin normal (OLLMON) distribution, which includes three distributions as special cases. The proposed regression is a useful extension of the heteroscedastic normal regression that can be an interesting alternative for analyzing bimodal data. The parameters are estimated by maximum likelihood. Simulation results under different scenarios show the accuracy of the estimators. The usefulness of the quantile residuals is verified. The importance of the OLLMON regression is illustrated empirically by means of a wood data set representing five forest species that are widely used in civil construction in Brazil, which allows us to examine the loss of mass of these species in contact with fire under two different temperatures. Peroba do Norte (Goupia glabra Aubl.) gives a more heterogeneous loss of wood mass, whereas Cedrinho (Erisma uncinatum Warm.) gives a more homogeneous loss of wood mass compared to the other species.

Acknowledgments

The agencies CNPq and CAPES (Brazil) support this work. We thank the anonymous referees for the helpful remarks on this manuscript.

Appendix. Mathematical properties.

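The linear representation for the cdf (1) follows from [3]:

$$F(y;\mu,\sigma,\nu,\tau)=\sum_{i=0}^{\infty}w_i\left\{\frac{\Phi(z)^{\tau}}{\Phi(z)^{\tau}+[1-\Phi(z)]^{\tau}}\right\}^{i+1}, \tag{A1}$$

where the coefficients are (for $i=0,1,\ldots$)

$$w_i=w_i(\nu)=\begin{cases}\dfrac{(-1)^{i}\,\nu}{i+1}\displaystyle\sum_{j=i}^{\infty}\binom{j}{i}(j+1)(1-\nu)^{j}, & \nu\in(0,1),\\[2ex]\nu^{-1}(1-1/\nu)^{i}, & \nu>1.\end{cases}$$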

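For $|w|\le 1$, a power series for $w^{a}=[1-(1-w)]^{a}$, using $(1-w)^{a}=\sum_{j=0}^{\infty}(-1)^{j}\binom{a}{j}w^{j}$ twice, is

$$w^{a}=\sum_{k=0}^{\infty}p_k(a)\,w^{k}, \tag{A2}$$

where

$$p_k(a)=\sum_{j=k}^{\infty}(-1)^{k+j}\binom{a}{j}\binom{j}{k}.$$

A convergent power series for $w^{a}+(1-w)^{a}$ (for any real $a>0$ and $|w|\le 1$) can be expressed as

$$w^{a}+(1-w)^{a}=\sum_{k=0}^{\infty}q_k(a)\,w^{k}, \tag{A3}$$

where $q_k(a)=p_k(a)+(-1)^{k}\binom{a}{k}$.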

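Combining (A2) and (A3) and a result in Section 0.313 of [10] gives

$$\frac{w^{a}}{w^{a}+(1-w)^{a}}=\frac{\sum_{k=0}^{\infty}p_k(a)\,w^{k}}{\sum_{k=0}^{\infty}q_k(a)\,w^{k}}=\sum_{k=0}^{\infty}s_k(a)\,w^{k}, \tag{A4}$$

where $s_0(a)=p_0(a)/q_0(a)$ and the quantities $s_k(a)$ (for $k\ge 1$) are

$$s_k(a)=\frac{1}{p_0(a)}\left[p_k(a)-\frac{1}{q_0(a)}\sum_{r=1}^{k}p_r(a)\,q_{k-r}(a)\right].$$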

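Equations (A1) and (A4) and a result in Section 0.314 of [10] (for $z\in\mathbb{R}$) lead to

$$\left\{\frac{\Phi(z)^{\tau}}{\Phi(z)^{\tau}+[1-\Phi(z)]^{\tau}}\right\}^{i+1}=\left\{\sum_{k=0}^{\infty}s_k(\tau)\,\Phi(z)^{k}\right\}^{i+1}=\sum_{k=0}^{\infty}t_{i+1,k}(\tau)\,\Phi(z)^{k}, \tag{A5}$$

where, for $i=0$ and $k\ge 0$, $t_{i+1,k}(\tau)=s_k(\tau)$ and, for $i\ge 1$, $t_{i+1,0}(\tau)=s_0(\tau)^{i+1}$ and, for $k\ge 1$,

$$t_{i+1,k}(\tau)=\frac{1}{k\,s_0(\tau)}\sum_{r=1}^{k}[r(i+2)-k]\,s_r(\tau)\,t_{i+1,k-r}(\tau).$$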

Inserting (A5) in Equation (A1) yields

$$F(y;\mu,\sigma,\nu,\tau)=\sum_{k=0}^{\infty}v_k\,\Phi(z)^{k}, \tag{A6}$$

where $v_k=v_k(\tau,\nu)=\sum_{i=0}^{\infty}w_i(\nu)\,t_{i+1,k}(\tau)$ for $k\ge 0$.

By differentiating (A6), the density of Y admits the linear representation

$$f(y;\mu,\sigma,\nu,\tau)=\sigma^{-1}\sum_{k=0}^{\infty}v_{k+1}\,\pi_{k+1}(z), \tag{A7}$$

where $\pi_{k+1}(z)=(k+1)\,\Phi(z)^{k}\phi(z)$ denotes the exponentiated standard normal (ESN) density with power parameter $k+1$.

Equation (A7) shows that the OLLMON density is a linear combination of ESN densities with different powers. It can be used to obtain some structural properties of Y.

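Let $T_{k+1}$ have the ESN density $\pi_{k+1}(z)$. The nth moment of Y follows from (A7) as

$$\mu_n=E(Y^{n})=\sum_{k=0}^{\infty}v_{k+1}\,E(T_{k+1}^{n})=\sum_{k=0}^{\infty}(k+1)\,v_{k+1}\,\tau_{n,k}, \tag{A8}$$

where $\tau_{n,k}=\int_{-\infty}^{\infty}z^{n}\,\Phi(z)^{k}\phi(z)\,dz=\int_{0}^{1}Q_N(u)^{n}\,u^{k}\,du$.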

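Nadarajah [20] expressed $\tau_{n,k}$ in terms of the Lauricella function of type A [7]

$$F_A^{(n)}(a;b_1,\ldots,b_n;c_1,\ldots,c_n;x_1,\ldots,x_n)=\sum_{m_1=0}^{\infty}\cdots\sum_{m_n=0}^{\infty}\frac{(a)_{m_1+\cdots+m_n}(b_1)_{m_1}\cdots(b_n)_{m_n}}{(c_1)_{m_1}\cdots(c_n)_{m_n}}\,\frac{x_1^{m_1}\cdots x_n^{m_n}}{m_1!\cdots m_n!},$$

which has numerical routines for direct computation.

In fact, Nadarajah obtained the useful result

$$\tau_{n,k}=2^{n/2}\,\pi^{-(k+1/2)}\sum_{\substack{l=0\\(n+k-l)\ \text{even}}}^{k}\binom{k}{l}\,2^{-l}\,\pi^{l}\,\Gamma\left(\frac{n+k-l+1}{2}\right)\times F_A^{(k-l)}\left(\frac{n+k-l+1}{2};\frac{1}{2},\ldots,\frac{1}{2};\frac{3}{2},\ldots,\frac{3}{2};-1,\ldots,-1\right), \tag{A9}$$

which can be used in Equation (A8) to obtain $\mu_n$.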

The nth incomplete moment of Y can be expressed as

$$m_n(w)=\int_{-\infty}^{w}y^{n}f(y;\mu,\sigma,\nu,\tau)\,dy=\sigma^{-1}\sum_{k=0}^{\infty}(k+1)\,v_{k+1}\int_{0}^{\Phi(w)}Q_N(u)^{n}\,u^{k}\,du,$$

where the last integral can be computed numerically.
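As a numerical cross-check on such moment expressions, the moments of Y can also be approximated directly from the OLLMON qf (3), since $E(Y^{n})=\int_0^1 Q(u)^{n}\,du$ and the incomplete moment is the same integral over $(0,F(w))$. The Python sketch below uses a plain midpoint rule. This is a substitute technique chosen here for illustration, not the series representation above; the function names and the number of steps are arbitrary assumptions.

```python
import math
from statistics import NormalDist

_STD = NormalDist()
_SQRT2 = math.sqrt(2.0)


def ollmon_cdf(y, mu, sigma, nu, tau):
    """OLLMON cdf, Equation (1)."""
    z = (y - mu) / sigma
    a = (0.5 * math.erfc(-z / _SQRT2)) ** tau
    b = nu * (0.5 * math.erfc(z / _SQRT2)) ** tau
    return a / (a + b)


def ollmon_qf(u, mu, sigma, nu, tau):
    """OLLMON quantile function, Equation (3), for 0 < u < 1."""
    a = (nu * u) ** (1.0 / tau)
    b = (1.0 - u) ** (1.0 / tau)
    if a <= b:
        z = _STD.inv_cdf(a / (a + b))
    else:
        z = -_STD.inv_cdf(b / (a + b))   # upper tail, by symmetry
    return mu + sigma * z


def moment(n, mu, sigma, nu, tau, upper=1.0, steps=20000):
    """Midpoint rule for the integral of Q(u)^n over (0, upper).

    upper = 1 approximates E(Y^n); upper = F(w) approximates the nth
    incomplete moment at w.
    """
    h = upper / steps
    return h * sum(ollmon_qf((i + 0.5) * h, mu, sigma, nu, tau) ** n
                   for i in range(steps))


# Approximate mean of Y for mu=3, sigma=0.5, nu=2, tau=0.2.
mean_y = moment(1, 3.0, 0.5, 2.0, 0.2)

# First incomplete moment at w = 3 for the same parameters.
w = 3.0
m1_w = moment(1, 3.0, 0.5, 2.0, 0.2, upper=ollmon_cdf(w, 3.0, 0.5, 2.0, 0.2))
```

For ν = τ = 1 the results can be compared with the known normal moments, for instance $E(Y)=\mu$ and $E(Y^{2})=\mu^{2}+\sigma^{2}$. The midpoint rule converges slowly near the endpoints, so the accuracy is limited to a few decimal places.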

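The moment generating function (mgf) of Y follows from (A7)

$$M(t)=\sigma^{-1}\sum_{k=0}^{\infty}v_{k+1}\,M_{k+1}(t)=\sigma^{-1}\sum_{k=0}^{\infty}(k+1)\,v_{k+1}\,\rho_k(t), \tag{A10}$$

where $M_{k+1}(t)$ is the mgf of $T_{k+1}$ and $\rho_k(t)=\int_{-\infty}^{\infty}e^{tz}\,\Phi(z)^{k}\phi(z)\,dz=\int_{0}^{1}\exp[t\,Q_N(u)]\,u^{k}\,du$.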

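We now provide a simple representation for $\rho_k(-s)$ for $s\ge 0$. The standard normal cdf $\Phi(z)$ can be expressed as a power series $\Phi(z)=\sum_{j=0}^{\infty}a_j z^{j}$, where $a_0=1/2$, $a_{2j}=0$ (for $j=1,2,\ldots$) and $a_{2j+1}=\dfrac{(-1)^{j}}{2^{j}\sqrt{2\pi}\,(2j+1)\,j!}$ (for $j=0,1,2,\ldots$). Thus, we can write

$$\Phi(z)^{k}=\sum_{j=0}^{\infty}c_{k,j}\,z^{j}, \tag{A11}$$

where the coefficients $c_{k,j}$ can be found recursively (for $j=1,2,\ldots$)

$$c_{k,j}=(j\,a_0)^{-1}\sum_{m=1}^{j}[m(k+1)-j]\,a_m\,c_{k,j-m},$$

and $c_{k,0}=a_0^{k}$. Clearly, the quantities $c_{k,j}$ are obtained from $c_{k,0},\ldots,c_{k,j-1}$ and hence from $a_0,\ldots,a_j$ given before. We have

$$\rho_k(-s)=\frac{1}{\sqrt{2\pi}}\sum_{j=0}^{\infty}c_{k,j}\int_{-\infty}^{\infty}z^{j}\exp\left(-sz-\frac{z^{2}}{2}\right)dz,$$

where the integral follows from Equation 2.3.15.8 of [24]

$$J(s,j)=\int_{-\infty}^{\infty}z^{j}\exp\left(-sz-\frac{z^{2}}{2}\right)dz=(-1)^{j}\sqrt{2\pi}\,\frac{\partial^{j}}{\partial s^{j}}\left\{\exp\left(\frac{s^{2}}{2}\right)\right\}.$$

Hence, it follows from (A10) and the last two equations

$$M(-s)=\frac{1}{\sigma}\sum_{j,k=0}^{\infty}(-1)^{j}(k+1)\,v_{k+1}\,c_{k,j}\,\frac{\partial^{j}}{\partial s^{j}}\left\{\exp\left(\frac{s^{2}}{2}\right)\right\}. \tag{A12}$$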

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1.Alizadeh M., Ozel G., Altun E., Abdi M., and Hamedani G.G., The odd log-logistic Marshall–Olkin Lindley model for lifetime data, J. Stat. Theory Appl. 16 (2017), pp. 382–400. [Google Scholar]
  • 2.Atkinson A.C., Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, Clarendon Press Oxford, Oxford, 1985. [Google Scholar]
  • 3.Barreto-Souza W., Lemonte A.J., and Cordeiro G.M., General results for the Marshall and Olkin's family of distributions, An. Acad. Bras. Ciênc. 85 (2013), pp. 3–21. [Google Scholar]
  • 4.Cordeiro G.M., Altun E., Korkmaz M.C., Pescim R.R., Afify A.Z., and Yousof H.M., The xgamma family: Censored regression modelling and applications, Revstat Stat. J. 18 (2020), pp. 593–612. [Google Scholar]
  • 5.da Silva Braga A., Cordeiro G.M., Ortega E.M., and Nilton da Cruz J., The odd log-logistic normal distribution: Theory and applications in analysis of experiments, J. Stat. Theory Pract. 10 (2016), pp. 311–335. [Google Scholar]
  • 6.Dunn P.K. and Smyth G.K., Randomized quantile residuals, J. Comput. Graph. Stat. 5 (1996), pp. 236–244. [Google Scholar]
  • 7.Exton H., Handbook of Hypergeometric Integrals: Theory, Applications, Tables, Computer Programs, Halsted Press, New York, 1978. [Google Scholar]
  • 8.Gleaton J.U. and Lynch J.D., Properties of generalized log-logistic families of lifetime distributions, J. Probab. Stat. Sci. 4 (2006), pp. 51–64. [Google Scholar]
  • 9.Gleaton J.U. and Lynch J.D., Extended generalized log-logistic families of lifetime distributions with an application, J. Probab. Stat. Sci. 8 (2010), pp. 1–17. [Google Scholar]
  • 10.Gradshteyn I.S. and Ryzhik I.M., Table of Integrals, Series, and Products, Academic Press, New York, 2000. [Google Scholar]
  • 11.Gurgel E.S., Gomes J.I., Groppo M., Martins-da-Silva R.C., Souza A.S.D., Margalho L., and Carvalho L.T.D., Conhecendo Espécie De Plantas Da Amazônia: Cupiúba (Goupia Glabra Aubl.-Goupiaceae), Comunicado Técnico. Embrapa Amazônia Oriental, 2015.
  • 12. Hamedani G.G., Altun E., Korkmaz M.C., Yousof H.M., and Butt N.S., A new extended G family of continuous distributions with mathematical properties, characterizations and regression modeling, Pakistan J. Stat. Oper. Res. XIV (2018), pp. 737–758.
  • 13. Hamedani G.G., Altun E., Korkmaz M.C., Yousof H.M., and Butt N.S., The odd power Lindley generator of probability distributions: Properties, characterizations and regression modeling, Int. J. Stat. Probab. 8 (2019), pp. 70–89.
  • 14. Higa A.R., Kageyama P.Y., and Ferreira M., Variação da densidade básica da madeira de P. elliottii var. elliottii e P. taeda, IPEF 7 (1973), pp. 79–89.
  • 15. Korkmaz M.C., Altun E., Alizadeh M., and Yousof H.M., A new flexible lifetime model with log-location regression modeling, properties and applications, J. Stat. Manag. Syst. 22 (2019), pp. 871–891.
  • 16. Korkmaz M.C., Cordeiro G.M., Yousof H.M., Pescim R.R., Afify A.Z., and Nadarajah S., The Weibull Marshall–Olkin family: Regression model and application to censored data, Commun. Stat. - Theory Methods 48 (2019), pp. 4171–4194.
  • 17. Labra F.V., Garay A.M., Lachos V.H., and Ortega E.M., Estimation and diagnostics for heteroscedastic nonlinear regression models based on scale mixtures of skew-normal distributions, J. Stat. Plan. Inference 142 (2012), pp. 2149–2165.
  • 18. Macieira A., da Costa C.C., de Carvalho L.T., Fiaschi P., Gomes J., Martins-Da-Silva R.C.V., and Margalho L., Conhecendo espécies de plantas da Amazônia: Morototó (Schefflera morototoni (Aubl.) Maguire, Steyerm. & Frodin-Araliaceae), Embrapa Amazônia Oriental-Comunicado Técnico (INFOTECA-E), 2014.
  • 19. Marshall A.W. and Olkin I., A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families, Biometrika 84 (1997), pp. 641–652.
  • 20. Nadarajah S., Explicit expressions for moments of order statistics, Stat. Probab. Lett. 78 (2008), pp. 196–205.
  • 21. Ortega E.M., Lemonte A.J., Cordeiro G.M., Cancho V.G., and Mialhe F.L., Heteroscedastic log-exponentiated Weibull regression model, J. Appl. Stat. 45 (2018), pp. 384–408.
  • 22. Prataviera F., Loibel S.M., Grego K.F., Ortega E.M., and Cordeiro G.M., Modelling non-proportional hazard for survival data with different systematic components, Environ. Ecol. Stat. 27 (2020), pp. 1–23.
  • 23. Prataviera F., Ortega E.M., Cordeiro G.M., Pescim R.R., and Verssani B.A., A new generalized odd log-logistic flexible Weibull regression model with applications in repairable systems, Reliab. Eng. Syst. Saf. 176 (2018), pp. 13–26.
  • 24. Prudnikov A.P., Brychkov I.A., and Marichev O.I., Integrals and Series: Special Functions, Vol. 2, CRC Press, 1986.
  • 25. Segundinho P.G.D.A., Zangiácomo A.L., Carreira M.R., Dias A.A., and Lahr F.A.R., Avaliação de vigas de madeira laminada colada de cedrinho (Erisma uncinatum warm), Cerne 19 (2013), pp. 441–449.
  • 26. Shimizu J.Y., Pinus Na Silvicultura Brasileira, Embrapa Florestas, Colombo, 2008.
  • 27. Souza Vasconcelos J.C., Cordeiro G.M., Ortega E.M., and Araújo E.G., The new odd log-logistic generalized inverse Gaussian regression model, J. Probab. Stat. 2019 (2019), pp. 1–13.
  • 28. Stasinopoulos D.M. and Rigby R.A., Generalized additive models for location scale and shape (GAMLSS) in R, J. Stat. Softw. 23 (2007), pp. 1–46.
  • 29. Stasinopoulos M.D., Rigby R.A., Heller G.Z., Voudouris V., and De Bastiani F., Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC The R Series, Boca Raton, FL, 2017.
  • 30. Vasconcelos J.C.S., Cordeiro G.M., Ortega E.M.M., and Rezende É.M.D., A new regression model for bimodal data and applications in agriculture, J. Appl. Stat. 48 (2021), pp. 349–372.
  • 31. Vila R., Ferreira L., Saulo H., Prataviera F., and Ortega E.M.M., A bimodal gamma distribution: Properties, regression model and applications, Statistics 54 (2020), pp. 469–493.
  • 32. Zenid G.J., Madeira: Uso Sustentável Na Construção Civil, Instituto de Pesquisas Tecnológicas-SVMA, São Paulo, 2009.
  • 33. Cook R.D., Detection of influential observation in linear regression, Technometrics 19 (1977), pp. 15–18.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
