Journal of Applied Statistics. 2021 Sep 11;49(16):4225–4253. doi: 10.1080/02664763.2021.1976120

On a log-symmetric quantile tobit model applied to female labor supply data

Danúbia R Cunha a, Jose Angelo Divino a, Helton Saulo b
PMCID: PMC9639474  PMID: 36353305

Abstract

The study of female labor supply has been a topic of relevance in the economic literature. Generally, the data are left-censored and the classic tobit model has been extensively used in the modeling strategy. This model, however, assumes normality for the error distribution and is not recommended for data with positive skewness, heavy tails and heteroscedasticity, as is the case of female labor supply data. Moreover, it is well known that the quantile regression approach accounts for the influences of different quantiles in the estimated coefficients. We take all these features into account and propose a parametric quantile tobit regression model based on quantile log-symmetric distributions. The proposed method allows one to model data with positive skewness (for which the classic tobit model is not suitable), to study the influence of the quantiles of interest, and to account for heteroscedasticity. The model parameters are estimated by maximum likelihood and a Monte Carlo experiment is performed to evaluate alternative estimators. The new method is applied to two distinct female labor supply data sets. The results indicate that the log-symmetric quantile tobit model fits the data better than the classic tobit model.

Keywords: Log-symmetric distributions, quantile regression, Monte Carlo simulation, PNAD, PSID

1. Introduction

The modeling of female labor supply data is a relevant issue because the participation rate of women in the labor market has been a transforming factor in the labor market. In the United States, for instance, Jacobsen [17] claims that the increase in female participation in the labor market is the most striking economic statistic in the twentieth century. As for Brazil, Scorzafave and Menezes-Filho [37] emphasize that there was a strong increase in female participation rates, especially for women with 1–11 years of study. On the other hand, Barros et al. [3] highlight that the study of the mechanisms and motivations to explain the increase in the female participation rate, coupled with the fact that this rate is a basic socioeconomic indicator, has boosted the researchers' attention. However, even today, women are less likely to participate in the labor market than men. In addition, they are also more likely to be unemployed in most countries, according to a recent study by the International Labour Organization [15].

In general, data on female labor supply is left-censored. The fraction of women that does not work declares zero income, which is classified as censored. The censoring occurs when the information on the dependent variable is not fully available for some units of the sample. However, for these units, data on explanatory variables are known across their domain. Thus, in these cases, it is necessary to work with models that take censoring into account, such as the classic tobit model; see Long [24]. This model was introduced by Tobin [42] and has been widely used in empirical literature; see, for instance, Amemiya [1], Helsel [14] and Barros et al. [2]. It is applied to left-censored dependent variables, being motivated by a study on the relationship between family spending on durable goods and family income. In the data, some values of the dependent variable (family expenditure) were zero, making it censored at a fixed limit value. Other studies in which the dependent variable is censored at zero for some observations include Fair [7,8] for modeling the number of extramarital affairs, Jarque [18] for family spending in several groups of commodities, and Melenberg and van Soest [27] for holiday expenses, among others.

The classic tobit [42] model might be used to estimate labor supply regressions. By fixing working hours as the dependent variable, there is the possibility that it takes the value zero. This happens when one or more individuals do not work, that is, do not supply hours of work; see details in Moffitt [28]. According to Heckman and MaCurdy [13], which was discussed by Islam [16], the censored model is relevant in cases where the sample consists of individuals randomly chosen and with labor hours reported as zero if the individual does not work. In this case, the techniques used for estimating linear models are inappropriate due to the presence of censoring.

One of the limitations of the classic tobit model [42], however, is the assumption that the error term is normally distributed. Although the normal distribution is widely used in applications, it might not be appropriate when positive skewness, heavy tails and heteroscedasticity are present in the dependent variable, which is the case of female labor supply data. In this context, the class of log-symmetric distributions is an important tool to overcome this limitation. The log-symmetric distributions are a generalization of the log-normal distribution and have as their special cases distributions that have lighter or heavier tails than those of the log-normal, as well as bimodal distributions; see, for instance, Jones [19], Vanegas and Paula [43–46] and Medeiros and Ferrari [26]. In addition to the log-normal distribution, other examples of log-symmetric distributions are the log-Student-t, log-power-exponential and extended Birnbaum-Saunders, among others.

Although important, the use of an appropriate distribution to describe the error distribution of a tobit model does not provide a more comprehensive picture of the effect of the explanatory variables on the dependent variable. In this sense, quantile regression plays an important role, being able to model conditional quantiles as a function of explanatory variables; details on quantile regression can be seen in Koenker and Bassett Jr [23], Koenker [21], Hao and Naiman [12] and Davino et al. [5]. In addition, quantile regression modeling is more efficient in cases where errors are not normally distributed or when the dependent variable has extreme values.

In this context, this paper proposes a quantile tobit model based on log-symmetric distributions. The new tobit model applies a reparameterization of the log-symmetric distributions suggested by Saulo et al. [35], which has the quantile as one of its parameters. The secondary objectives are: (i) to obtain the maximum likelihood estimates of the model parameters; (ii) to carry out a Monte Carlo simulation to evaluate the performance of the maximum likelihood estimates; and, (iii) to apply the proposed methodology to two female labor supply data sets. The first application uses data from the Brazilian National Household Sample Survey (PNAD) for the year of 2015, whereas the second one employs data extracted from the Panel Study of Income Dynamics (PSID), which is an American longitudinal household survey. The PNAD data were obtained from the Brazilian Institute of Geography and Statistics (IBGE),1 while the PSID data were studied by Barros et al. [2] by using the classical tobit model. Note that there are at least three approaches to estimate quantile regressions: (i) the distribution-free (semi-parametric) approach, which includes the works of Powell [32], Portnoy [31] and Peng and Huang [30]; (ii) the pseudo-likelihood method through an asymmetric Laplace distribution; and (iii) the parametric approach with traditional maximum likelihood (ML) framework [10]. Our proposed methodology falls within the third category and can be seen as a generalization of the works by Desousa et al. [6] and Saulo et al. [36]. In general, the Monte Carlo simulations and the applications to the PNAD and PSID data sets show that the proposed log-symmetric quantile tobit models provide good fit to the data.

The advantages of the proposed log-symmetric quantile tobit model over the classic normal tobit model are: (i) greater flexibility in terms of distributional assumptions, since the log-symmetric class incorporates several special cases, such as the log-normal, log-Student-t, log-power-exponential, and extended Birnbaum-Saunders, among others; (ii) greater flexibility for data modeling, allowing for considering the effects of explanatory variables along the spectrum of the dependent variable due to the quantile approach; and (iii) flexibility to accommodate heteroscedasticity, since the proposed tobit model allows for the inclusion of explanatory variables in the dispersion parameter.

The paper is organized as follows. Section 2 briefly describes the class of log-symmetric distributions, both in its classical representation and in the quantile reparameterization as proposed by Saulo et al. [35]. Section 3 introduces the log-symmetric quantile tobit model and provides details on the estimation, interpretation of the coefficients and residual analysis. Section 4 carries out a Monte Carlo simulation to assess the performance of the maximum likelihood estimates. Section 5 performs two empirical applications to the PNAD and PSID data sets. Finally, Section 6 is dedicated to the concluding remarks and discussions of further research.

2. Log-symmetric distributions

This section briefly describes the classical log-symmetric distributions [44] and those based on the quantile proposed by Saulo et al. [35]. The log-symmetric distributions reparameterized by the quantile, that is, those that have the quantile as one of their parameters, will be used to derive the log-symmetric quantile tobit model.

2.1. Classical log-symmetric distributions

A random variable T follows a log-symmetric distribution with scale parameter λ>0 and power parameter ϕ>0, if its probability density function and cumulative distribution function are given by

$$f_T(t;\lambda,\phi)=\frac{1}{\sqrt{\phi}\,t}\,g\!\left(\frac{1}{\phi}\left[\log(t)-\log(\lambda)\right]^2\right),\quad t>0, \tag{1}$$

and

$$F_T(t;\lambda,\phi)=G\!\left(\frac{1}{\sqrt{\phi}}\left[\log(t)-\log(\lambda)\right]\right),\quad t>0, \tag{2}$$

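respectively, where $G(\omega)=\eta\int_{-\infty}^{\omega}g(z^2)\,dz$, $\omega\in\mathbb{R}$, with $\eta$ being a normalizing constant and $g(\cdot)$ a density generator. In this case, the notation $T\sim\mathrm{LS}(\lambda,\phi,g)$ is used. Note that as $\phi\to 0^+$ or $\phi\to\infty$, $f_T(t;\lambda,\phi)$ tends to 0. The $100q$-th quantile of $T\sim\mathrm{LS}(\lambda,\phi,g)$ is given by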

$$Q=Q_T(q;\lambda,\phi)=\lambda\exp\!\left(\sqrt{\phi}\,G^{-1}(q)\right),\quad q\in(0,1), \tag{3}$$

where $G^{-1}$ is the inverse of $G$ given in (2). Table 1 presents the density generators $g$ for some log-symmetric distributions; see details in Vanegas and Paula [44]. Note that the generator $g$ may involve an extra parameter $\xi$.

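Table 1. Density generator $g(u)$ for some log-symmetric distributions.

| Distribution | $g(u)$ |
| --- | --- |
| Log-normal($\lambda,\phi$) | $\exp\left(-\frac{1}{2}u\right)$ |
| Log-Student-t($\lambda,\phi,\xi$) | $\left(1+\frac{u}{\xi}\right)^{-\frac{\xi+1}{2}}$, $\xi>0$ |
| Log-power-exponential($\lambda,\phi,\xi$) | $\exp\left(-\frac{1}{2}u^{\frac{1}{1+\xi}}\right)$, $-1<\xi\leq 1$ |
| Extended Birnbaum-Saunders($\lambda,\phi,\xi$) | $\cosh(u^{1/2})\exp\left(-\frac{2}{\xi^2}\sinh^2(u^{1/2})\right)$, $\xi>0$ |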
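As a small illustration of Table 1, the generators can be written as plain functions of $u$. This is only a sketch in Python (the paper's own computations were done in R); the function names are hypothetical, and `ls_kernel` evaluates the right-hand side of (1) with the generators exactly as listed in the table, that is, without any additional normalizing constant.

```python
import math

def g_log_normal(u):
    """Generator of the log-normal case: exp(-u/2)."""
    return math.exp(-0.5 * u)

def g_log_student_t(u, xi):
    """Generator of the log-Student-t case, xi > 0."""
    return (1.0 + u / xi) ** (-(xi + 1.0) / 2.0)

def g_log_power_exponential(u, xi):
    """Generator of the log-power-exponential case, -1 < xi <= 1."""
    return math.exp(-0.5 * u ** (1.0 / (1.0 + xi)))

def g_extended_bs(u, xi):
    """Generator of the extended Birnbaum-Saunders case, xi > 0."""
    root = math.sqrt(u)
    return math.cosh(root) * math.exp(-(2.0 / xi ** 2) * math.sinh(root) ** 2)

def ls_kernel(t, lam, phi, g):
    """Right-hand side of (1): g((log t - log lam)^2 / phi) / (sqrt(phi) * t)."""
    u = (math.log(t) - math.log(lam)) ** 2 / phi
    return g(u) / (math.sqrt(phi) * t)
```

Two sanity checks follow directly from the table: the log-power-exponential generator with $\xi=0$ coincides with the log-normal one, and the log-Student-t generator approaches the log-normal one as $\xi$ grows.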

2.2. Quantile-based log-symmetric distributions

Consider a fixed number $q\in(0,1)$ and let $Q$ be the $100q$-th quantile of $T\sim\mathrm{LS}(\lambda,\phi,g)$ given in (3). Then, considering the one-to-one transformation $(\lambda,\phi)\mapsto(Q,\phi)$, Saulo et al. [35] proposed a reparameterization of the classical log-symmetric distribution, where the probability density function and the cumulative distribution function are given respectively by

$$f_T(t;Q,\phi)=\frac{1}{\sqrt{\phi}\,t}\,g\!\left(\frac{1}{\phi}\left[\log(t)-\log(Q)+\sqrt{\phi}\,z_q\right]^2\right),\quad t>0, \tag{4}$$

and

$$F_T(t;Q,\phi)=G\!\left(\frac{1}{\sqrt{\phi}}\left[\log(t)-\log(Q)+\sqrt{\phi}\,z_q\right]\right),\quad t>0, \tag{5}$$

where $z_q=G^{-1}(q)$, with $G$ being given as in (2). In this case, the notation $T\sim\mathrm{QLS}(Q,\phi,g)$ is used. If $T\sim\mathrm{QLS}(Q,\phi,g)$, Saulo et al. [35] have shown that the following properties hold: (a) $cT\sim\mathrm{QLS}(cQ,\phi,g)$, with $c>0$; (b) $T^c\sim\mathrm{QLS}(Q^c,c^2\phi,g)$, with $c>0$. We then readily have the following relation:

$$T=Q\,\epsilon^{\sqrt{\phi}}\sim\mathrm{QLS}(Q,\phi,g),\quad\text{where}\quad\epsilon\sim\mathrm{QLS}(1,1,g). \tag{6}$$
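The role of $Q$ as the $100q$-th quantile can be checked numerically. The sketch below does so for the log-normal case only, where $G$ is the standard normal distribution function; it evaluates $G$ at the standardized log-distance $(\log t-\log Q)/\sqrt{\phi}+z_q$ and builds a draw through relation (6). The function names are hypothetical and Python's `statistics.NormalDist` stands in for $G$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # plays the role of G for the log-normal generator

def qls_lognormal_cdf(t, Q, phi, q):
    """CDF of QLS(Q, phi, g), log-normal case: G((log t - log Q)/sqrt(phi) + z_q)."""
    z_q = STD_NORMAL.inv_cdf(q)
    return STD_NORMAL.cdf((math.log(t) - math.log(Q)) / math.sqrt(phi) + z_q)

def qls_lognormal_from_z(Q, phi, q, z):
    """Relation (6): T = Q * eps**sqrt(phi), where log(eps) = z - z_q and z ~ N(0, 1)."""
    z_q = STD_NORMAL.inv_cdf(q)
    eps = math.exp(z - z_q)          # eps ~ QLS(1, 1, g)
    return Q * eps ** math.sqrt(phi)

# The CDF evaluated at Q returns q, whatever the value of phi.
print(round(qls_lognormal_cdf(3.0, 3.0, 0.8, 0.25), 6))  # 0.25
```

Passing a standard normal draw as `z` (for example from `random.gauss(0, 1)`) yields a draw from the quantile-based log-normal distribution.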

Figure 1 displays different shapes of the quantile-based log-symmetric densities for different combinations of parameters. From this figure, we observe that the quantile parameter $Q$ modifies the kurtosis. Specifically, as $Q$ increases the kurtosis decreases (Figure 1(a,d,g,j)). We also observe that the power parameter $\phi$ controls the relative dispersion (or skewness), such that when $\phi$ increases the relative dispersion (or skewness) also increases (Figure 1(b,e,h,l)). The extra parameter $\xi$ in the log-Student-t and log-power-exponential cases modifies the kurtosis (Figure 1(f,i)), and in the extended Birnbaum-Saunders case this parameter modifies the skewness and kurtosis of the model (Figure 1(l)).

Figure 1.


Quantile-based log-symmetric densities for some parameter values. (a) log-normal. (b) log-normal. (c) log-normal. (d) log-Student-t. (e) log-Student-t. (f) log-Student-t. (g) log-power-exponential. (h) log-power-exponential. (i) log-power-exponential. (j) Extended Birnbaum-Saunders. (k) Extended Birnbaum-Saunders. (l) Extended Birnbaum-Saunders.

3. Log-symmetric quantile tobit model

Let $T_i$ be a positive variable that is left-censored at the point $\Psi$, namely, it is observable for values greater than $\Psi$ and censored for values less than or equal to $\Psi$. Based on (6), the log-symmetric quantile tobit model can be formulated as

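$$T_i=\begin{cases}\Psi, & T_i\leq\Psi,\quad i=1,\ldots,m,\\ T_i=Q_i\,\epsilon_i^{\sqrt{\phi_i}}, & T_i>\Psi,\quad i=m+1,\ldots,n,\end{cases} \tag{7}$$

where $\epsilon_i\sim\mathrm{QLS}(1,1,g)$, $Q_i=\exp(x_i^\top\beta)$, $\phi_i=\exp(w_i^\top\kappa)$, $\beta=(\beta_0,\ldots,\beta_k)^\top$ and $\kappa=(\kappa_0,\ldots,\kappa_l)^\top$ are vectors of regression coefficients, and $x_i=(1,x_{i1},\ldots,x_{ik})^\top$ and $w_i=(1,w_{i1},\ldots,w_{il})^\top$ are vectors of fixed and known explanatory variables associated with $Q_i$ and $\phi_i$, respectively.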

The parameters of the quantile log-symmetric tobit model presented in (7) can be estimated by the maximum likelihood method. Let $T=(T_1,\ldots,T_m,T_{m+1},\ldots,T_n)^\top$ be a sample of size $n$ from the quantile log-symmetric tobit model that contains $m$ observations left-censored at $\Psi$ and $n-m$ uncensored observations. Then, the corresponding likelihood function for the parameter vector $\theta=(\beta^\top,\kappa^\top)^\top$ is given by

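$$L(\theta)=\prod_{i=1}^{m}G\!\left(\frac{\log(\Psi)-\log(Q_i)+\sqrt{\phi_i}\,z_q}{\sqrt{\phi_i}}\right)\prod_{i=m+1}^{n}\frac{1}{\sqrt{\phi_i}}\,g\!\left(\frac{\left[\log(t_i)-\log(Q_i)+\sqrt{\phi_i}\,z_q\right]^2}{\phi_i}\right), \tag{8}$$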

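where $Q_i$ and $\phi_i$ are as in (7), $G$ is as in (2), and $g$ is given in Table 1. By taking the logarithm of (8), we obtain the log-likelihood function $\ell(\theta)$, that is,

$$\ell(\theta)=\sum_{i=1}^{n}\ell_i(\theta), \tag{9}$$

where

$$\ell_i(\theta)=\begin{cases}\log\!\left(G\!\left(\dfrac{\log(\Psi)-\log(Q_i)+\sqrt{\phi_i}\,z_q}{\sqrt{\phi_i}}\right)\right), & i=1,\ldots,m,\\[8pt] -\dfrac{1}{2}\log(\phi_i)+\log\!\left(g\!\left(\dfrac{\left[\log(t_i)-\log(Q_i)+\sqrt{\phi_i}\,z_q\right]^2}{\phi_i}\right)\right), & i=m+1,\ldots,n.\end{cases}$$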
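To make the structure of the log-likelihood (9) concrete, the following sketch evaluates it for the log-normal generator $g(u)=\exp(-u/2)$, for which $G$ is the standard normal distribution function. It is an illustration under that assumption rather than the authors' implementation; the function name and argument layout are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def loglik_lognormal_tobit(beta, kappa, X, W, t, censored, psi, q):
    """Log-likelihood (9) for the log-normal generator g(u) = exp(-u/2).

    X, W: lists of covariate rows (each row starts with 1 for the intercept).
    t: responses; censored: booleans, True when the i-th response is censored at psi.
    """
    z_q = STD_NORMAL.inv_cdf(q)
    total = 0.0
    for x_i, w_i, t_i, c_i in zip(X, W, t, censored):
        log_Q = sum(b * x for b, x in zip(beta, x_i))            # log Q_i = x_i' beta
        phi = math.exp(sum(k * w for k, w in zip(kappa, w_i)))   # phi_i = exp(w_i' kappa)
        if c_i:
            # Censored case: log G(xi_i^c).
            xi_c = (math.log(psi) - log_Q) / math.sqrt(phi) + z_q
            total += math.log(STD_NORMAL.cdf(xi_c))
        else:
            # Uncensored case: -0.5 * log(phi_i) + log g(xi_i^2), with log g(u) = -u/2.
            xi = (math.log(t_i) - log_Q) / math.sqrt(phi) + z_q
            total += -0.5 * math.log(phi) - 0.5 * xi ** 2
    return total
```

In practice this function would be handed (with its sign flipped) to a numerical optimizer; other generators from Table 1 would need their own $G$, which is not available in the standard library.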

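By taking the first derivative of $\ell(\theta)$ with respect to $\beta$ and $\kappa$, we obtain the score vector, that is,

$$\dot{\ell}(\theta)=\frac{\partial\ell(\theta)}{\partial\theta}=\sum_{i=1}^{n}\dot{\ell}_i(\theta), \tag{10}$$

where $\dot{\ell}_i(\theta)=(\dot{\ell}_{i\beta}(\theta)^\top,\dot{\ell}_{i\kappa}(\theta)^\top)^\top$, with

$$\dot{\ell}_{i\beta}(\theta)=\begin{cases}-\dfrac{1}{\sqrt{\phi_i}}\,\Pi(\xi_i^c)\,x_i, & i=1,\ldots,m,\\[8pt] -\dfrac{2}{\sqrt{\phi_i}}\,\Delta(\xi_i^2)\,\xi_i\,x_i, & i=m+1,\ldots,n,\end{cases}\qquad \dot{\ell}_{i\kappa}(\theta)=\begin{cases}-\dfrac{1}{2\sqrt{\phi_i}}\,\Pi(\xi_i^c)\,w_i\,\gamma_i^c, & i=1,\ldots,m,\\[8pt] -\dfrac{1}{2}w_i-\dfrac{1}{\sqrt{\phi_i}}\,\Delta(\xi_i^2)\,\xi_i\,w_i\,\gamma_i, & i=m+1,\ldots,n,\end{cases}$$

where $\Pi(\xi_i^c)=\left(dG(u)/du|_{u=\xi_i^c}\right)/G(\xi_i^c)$ and $\Delta(\xi_i^2)=\left(dg(u)/du|_{u=\xi_i^2}\right)/g(\xi_i^2)$, with

$\xi_i^c=(\log(\Psi)-\log(Q_i)+\sqrt{\phi_i}\,z_q)/\sqrt{\phi_i}$, $\xi_i=(\log(t_i)-\log(Q_i)+\sqrt{\phi_i}\,z_q)/\sqrt{\phi_i}$, $\gamma_i^c=\log(\Psi)-\log(Q_i)$ and $\gamma_i=\log(t_i)-\log(Q_i)$.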

The maximum likelihood estimate of $\theta$ is obtained by maximizing the log-likelihood function (9), that is, by equating the score vector $\dot{\ell}(\theta)$, which contains the first derivatives of $\ell(\theta)$, to zero, which yields the likelihood equations. As these equations have no analytical solution, they are solved by using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) iterative method for non-linear optimization. Initial values, $\hat{\theta}_0=(\hat{\beta}_0^\top,\hat{\kappa}_0^\top)^\top$, are required to start the iterative procedure, and we obtain them from the R packages quantreg and ssym; see Koenker [22] and Vanegas and Paula [45]. These packages fit a quantile regression model (for $\hat{\beta}_0$) and a log-symmetric regression model (for $\hat{\kappa}_0$), respectively. In general, the estimates are quite stable in relation to the initial values. We assessed this through a numerical exercise, i.e. we replaced $\hat{\theta}_0$ by $\hat{\theta}_0+\Omega$ in the estimation procedure, with $\Omega=0,1,10$, and then checked the final estimates. We observed that the estimates are virtually the same for any value of $\Omega$.

Note that in (8), the extra parameter $\xi$ is assumed to be fixed. The reason for this lies in the works of Lucas [25] and Kano et al. [20]. The first work shows that the robustness of the Student-t distribution to outlying observations holds only when the degrees of freedom parameter is fixed, instead of being estimated directly in the log-likelihood function. In the second work, the authors report difficulties in estimating the extra parameter for the power-exponential distribution. Therefore, by extending these arguments to the log-Student-t and log-power-exponential distributions, the extra parameter $\xi$ is estimated using the profiled log-likelihood. Two basic steps are required:

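  • (S1) Consider a grid of values $\xi_1,\xi_2,\ldots,\xi_K$. For each fixed value of $\xi_j$, $j=1,2,\ldots,K$, compute the estimate of $\theta=(\beta^\top,\kappa^\top)^\top$ based on $\xi_j$, that is, $\hat{\theta}_j=(\hat{\beta}_j^\top,\hat{\kappa}_j^\top)^\top$. Compute also the value of the associated log-likelihood function, $\ell_j(\hat{\theta})$.

  • (S2) Obtain the final estimates of $\xi$ and $\theta=(\beta^\top,\kappa^\top)^\top$, $\hat{\xi}$ and $\hat{\theta}=(\hat{\beta}^\top,\hat{\kappa}^\top)^\top$ say, as the associated estimates that maximize the log-likelihood function ($\max_j\ell_j(\hat{\theta})$).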
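Steps (S1) and (S2) amount to a grid search wrapped around the fitting routine. A minimal sketch in Python is given below; `fit` is a hypothetical callable that, for a fixed $\xi$, returns the estimate of $\theta$ and the maximized log-likelihood, and `toy_fit` is only a stand-in so that the example runs on its own.

```python
def profile_over_grid(grid, fit):
    """Profile the extra parameter xi over a grid.

    fit(xi) must return a pair (theta_hat, loglik) obtained with xi held fixed.
    Returns the triple (xi_hat, theta_hat, loglik) with the largest log-likelihood.
    """
    best = None
    for xi in grid:
        theta_hat, loglik = fit(xi)            # (S1): estimate theta for this xi
        if best is None or loglik > best[2]:
            best = (xi, theta_hat, loglik)     # (S2): keep the maximizer
    return best

def toy_fit(xi):
    """Stand-in for a real fitting routine (illustration only)."""
    return [0.0], -(xi - 4.0) ** 2

xi_hat, theta_hat, max_loglik = profile_over_grid([2, 3, 4, 5, 6], toy_fit)
print(xi_hat)  # 4
```

With a real fitting routine, each call to `fit` would run the BFGS optimization described above with $\xi$ fixed at the grid value.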

Under regularity conditions, the asymptotic distribution of θ^ is a multivariate normal, that is,

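$$\sqrt{n}\,(\hat{\theta}-\theta)\mathrel{\dot{\sim}}\mathrm{N}_{k+l+2}(0_{k+l+2},\Sigma_{\theta}), \tag{11}$$

where $\dot{\sim}$ denotes convergence in distribution and $\Sigma_{\theta}$ is the asymptotic variance-covariance matrix of $\hat{\theta}$ [4], which is the inverse of the expected Fisher information matrix. We can approximate the expected Fisher information matrix by its observed version obtained from the Hessian matrix $\ddot{\ell}(\theta)$, which contains the second derivatives of $\ell(\theta)$. Thus, $\Sigma_{\theta}\approx[-\ddot{\ell}(\theta)]^{-1}$, where

$$\ddot{\ell}(\theta)=\frac{\partial^2\ell(\theta)}{\partial\theta\,\partial\theta^\top}=\sum_{i=1}^{n}\ddot{\ell}_i(\theta),\quad\text{with}\quad\ddot{\ell}_i(\theta)=\begin{bmatrix}\ddot{\ell}_{i\beta\beta}(\theta) & \ddot{\ell}_{i\beta\kappa}(\theta)\\ \ddot{\ell}_{i\kappa\beta}(\theta) & \ddot{\ell}_{i\kappa\kappa}(\theta)\end{bmatrix}.$$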

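The elements of $\ddot{\ell}(\theta)$ are given by

$$\ddot{\ell}_{i\beta\beta}(\theta)=\begin{cases}\dfrac{1}{\phi_i}\,\Pi'(\xi_i^c)\,x_i x_i^\top, & i=1,\ldots,m,\\[8pt] \dfrac{4}{\phi_i}\,\Delta'(\xi_i^2)\,\xi_i^2\,x_i x_i^\top+\dfrac{2}{\phi_i}\,\Delta(\xi_i^2)\,x_i x_i^\top, & i=m+1,\ldots,n,\end{cases}$$

$$\ddot{\ell}_{i\beta\kappa}(\theta)=\ddot{\ell}_{i\kappa\beta}(\theta)^\top=\begin{cases}\left[\dfrac{1}{2\sqrt{\phi_i}}\,\Pi(\xi_i^c)+\dfrac{1}{2\phi_i}\,\Pi'(\xi_i^c)\,\gamma_i^c\right]x_i w_i^\top, & i=1,\ldots,m,\\[8pt] 2\left\{\dfrac{1}{2\sqrt{\phi_i}}\,\Delta(\xi_i^2)\,\xi_i+\dfrac{\gamma_i}{\phi_i}\left[\Delta'(\xi_i^2)\,\xi_i^2+\dfrac{1}{2}\Delta(\xi_i^2)\right]\right\}x_i w_i^\top, & i=m+1,\ldots,n,\end{cases}$$

$$\ddot{\ell}_{i\kappa\kappa}(\theta)=\begin{cases}\dfrac{1}{4\sqrt{\phi_i}}\left[\Pi(\xi_i^c)+\dfrac{1}{\sqrt{\phi_i}}\,\Pi'(\xi_i^c)\,\gamma_i^c\right]\gamma_i^c\,w_i w_i^\top, & i=1,\ldots,m,\\[8pt] \left\{\dfrac{1}{2\sqrt{\phi_i}}\,\Delta(\xi_i^2)\,\xi_i+\dfrac{\gamma_i}{\phi_i}\left[\Delta'(\xi_i^2)\,\xi_i^2+\dfrac{1}{2}\Delta(\xi_i^2)\right]\right\}\gamma_i\,w_i w_i^\top, & i=m+1,\ldots,n,\end{cases}$$

where $\Pi'$ and $\Delta'$ denote the first derivatives of $\Pi$ and $\Delta$ with respect to their arguments.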

The corresponding standard errors can then be approximated by the square roots of the diagonal elements in the variance-covariance matrix evaluated at θ^.

3.1. Interpretation of the regression coefficients

The regression coefficients of the proposed tobit model are interpreted in terms of their effect on the latent variable $T_i$ in the uncensored part. Let $\beta_j$ be the $j$-th regression coefficient and use the subscript $(j)$ to indicate exclusion of the $j$-th element, such that $x_{i(j)}$ and $\beta_{(j)}$ are, respectively, the vector of explanatory variables excluding $x_{ij}$ and the vector of regression coefficients excluding $\beta_j$. Note that the quantile of $T_i$ is given by

$$Q(T_i|x_{ij},x_{i(j)})=\exp\!\left(\beta_0+\beta_j x_{ij}+x_{i(j)}^\top\beta_{(j)}\right). \tag{12}$$

If $x_{ij}$ increases by 1 while keeping $x_{i(j)}$ fixed, we obtain

$$Q(T_i|x_{ij}+1,x_{i(j)})=\exp\!\left(\beta_j(x_{ij}+1)\right)\exp\!\left(\beta_0+x_{i(j)}^\top\beta_{(j)}\right)=\exp(\beta_j)\exp\!\left(\beta_0+\beta_j x_{ij}+x_{i(j)}^\top\beta_{(j)}\right)=\exp(\beta_j)\,Q(T_i|x_{ij},x_{i(j)}).$$

Thus, for any $j$, increasing $x_{ij}$ by 1 multiplies the quantile of $T_i$ by $\exp(\beta_j)$. This is usually expressed as a percentage change, and

$$\frac{Q(T_i|x_{ij}+1,x_{i(j)})-Q(T_i|x_{ij},x_{i(j)})}{Q(T_i|x_{ij},x_{i(j)})}\times 100\%=(\exp(\beta_j)-1)\times 100\%$$

is the percentage increase (or decrease if the value of $\beta_j$ is negative) in the quantile of $T_i$ when $x_{ij}$ is increased by 1. For dichotomous $x_{ij}$, $(\exp(\beta_j)-1)\times 100\%$ is the percentage increase (or decrease if $\beta_j$ is negative) in the quantile of $T_i$ when $x_{ij}$ changes from 0 to 1. Note that when $-0.4\leq\beta_j\leq 0.4$, we can use the approximation $(\exp(\beta_j)-1)\approx\beta_j$; see Weisberg [47]. Finally, note that from (12), we can compute the marginal effect of the $j$-th covariate $x_j$ by differentiating the quantile function $Q(T_i|x_{ij},x_{i(j)})$ with respect to $x_j$, that is, we have [9]

$$\frac{\partial Q(T_i|x_{ij},x_{i(j)})}{\partial x_j}=\exp\!\left(\beta_0+\beta_j x_{ij}+x_{i(j)}^\top\beta_{(j)}\right)\beta_j.$$
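The percentage-change interpretation and the marginal effect are simple enough to compute by hand, but a short sketch may help when many coefficients are reported. The function names below are hypothetical; `linear_predictor` stands for the value of $\beta_0+\beta_j x_{ij}+x_{i(j)}^\top\beta_{(j)}$ at the covariate values of interest.

```python
import math

def pct_change_in_quantile(beta_j):
    """Percentage change in the quantile of T_i when x_ij increases by 1."""
    return (math.exp(beta_j) - 1.0) * 100.0

def marginal_effect(beta_j, linear_predictor):
    """Derivative of the quantile with respect to x_j: exp(linear predictor) * beta_j."""
    return math.exp(linear_predictor) * beta_j

# A coefficient of 0.10 corresponds to a 10.52% increase in the quantile,
# close to the value 100 * 0.10 = 10% given by the small-coefficient approximation.
print(round(pct_change_in_quantile(0.10), 2))  # 10.52
```

The gap between the exact and approximate values widens as $|\beta_j|$ grows, which is why the approximation is only suggested for coefficients between $-0.4$ and $0.4$.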

3.2. Residual analysis

Goodness of fit and departures from the assumptions of the model can be assessed through residual analysis. In this work, we use the martingale-type (MT) residual, which is given by

$$r_{MT_i}=\mathrm{sign}(r_{M_i})\sqrt{-2\left(r_{M_i}+\rho_i\log(\rho_i-r_{M_i})\right)},\quad i=1,\ldots,n,$$

where $r_{M_i}=\rho_i+\log(\hat{S}(t_i))$, with $\hat{S}(t_i)$ being the survival function fitted to the data, and $\rho_i=0$ or $1$ indicating that case $i$ is censored or not, respectively; details on the MT residual can be seen in Therneau et al. [41]. Simulation results indicate that the empirical distribution of the MT residual is in agreement with the standard normal distribution; see Silva et al. [39]. Then, a normal quantile-quantile (QQ) plot with simulated envelope can be constructed for the MT residual to verify whether the model is correctly specified.
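The MT residual only needs the fitted survival probability and the censoring indicator of each case. The sketch below is one way to code the formula, under the convention that the term $\rho_i\log(\rho_i-r_{M_i})$ is taken as zero for censored cases ($\rho_i=0$); the function name is hypothetical and `surv` is assumed to lie strictly between 0 and 1.

```python
import math

def mt_residual(surv, uncensored):
    """Martingale-type residual from the fitted survival probability S_hat(t_i).

    uncensored: True when case i is uncensored (rho_i = 1), False when censored (rho_i = 0).
    """
    rho = 1 if uncensored else 0
    r_m = rho + math.log(surv)                              # martingale residual r_Mi
    log_term = math.log(rho - r_m) if rho else 0.0          # rho_i * log(rho_i - r_Mi)
    sign = (r_m > 0) - (r_m < 0)
    # max(...) guards against a tiny negative value caused by rounding near r_Mi = 0.
    return sign * math.sqrt(max(0.0, -2.0 * (r_m + log_term)))

print(round(mt_residual(0.5, False), 4))  # -1.1774
```

For a censored case the residual reduces to $-\sqrt{-2\log\hat{S}(t_i)}$, so censored observations always contribute non-positive residuals.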

4. Monte Carlo simulation

In this section, the performance of the maximum likelihood estimates of the log-symmetric quantile tobit models is evaluated by a Monte Carlo experiment. We consider the following distributions: log-normal (log-NO), log-Student-t (log-t), log-power-exponential (log-PE) and extended Birnbaum-Saunders (EBS). For each Monte Carlo replica, a simulated sample of the log-symmetric quantile tobit model is generated for fixed parameter values. Then, the maximum likelihood estimates are obtained for each simulated sample. Estimates of relative bias (RB), relative root mean squared error (RRMSE) and relative mean absolute error (RMAE) are thus computed from the Monte Carlo replicas as

$$\widehat{\mathrm{RB}}(\hat{\theta})=\frac{1}{\mathrm{NREP}}\sum_{i=1}^{\mathrm{NREP}}\frac{\hat{\theta}^{(i)}-\theta}{\theta},\quad \widehat{\mathrm{RRMSE}}(\hat{\theta})=\frac{\sqrt{\frac{1}{\mathrm{NREP}}\sum_{i=1}^{\mathrm{NREP}}(\hat{\theta}^{(i)}-\theta)^2}}{\theta},\quad\text{and}\quad \widehat{\mathrm{RMAE}}(\hat{\theta})=\frac{1}{\mathrm{NREP}}\sum_{i=1}^{\mathrm{NREP}}\frac{|\hat{\theta}^{(i)}-\theta|}{\theta},$$

where $\theta$ and $\hat{\theta}^{(i)}$ are the true parameter value and its respective $i$-th maximum likelihood estimate, and NREP is the number of Monte Carlo replicas. The R software was used in all numerical calculations; see R Core Team [33]. Simulated data are generated according to two scenarios, that is, when the model is correctly specified and when it is not.
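The three performance measures can be coded directly from their definitions. The sketch below is a plain Python transcription (the paper's computations were done in R); the function names are hypothetical and `estimates` is the list of NREP estimates of a single parameter with true value `theta`.

```python
import math

def relative_bias(estimates, theta):
    """Mean of (estimate - theta) / theta over the Monte Carlo replicas."""
    return sum((e - theta) / theta for e in estimates) / len(estimates)

def relative_rmse(estimates, theta):
    """Root mean squared error divided by theta."""
    mse = sum((e - theta) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse) / theta

def relative_mae(estimates, theta):
    """Mean of |estimate - theta| / theta over the Monte Carlo replicas."""
    return sum(abs(e - theta) / theta for e in estimates) / len(estimates)

# Two replicas that miss the true value 2.0 by 0.2 in opposite directions:
# the bias cancels, while RRMSE and RMAE both equal 0.1 (up to rounding).
sample = [1.8, 2.2]
print(round(relative_rmse(sample, 2.0), 6))  # 0.1
```

Note that, because the measures divide by $\theta$ itself, their signs flip when the true value is negative.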

4.1. Scenario 1

In the first scenario, the model is correctly specified with data generating process given by

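$$T_i=\begin{cases}\Psi, & T_i\leq\Psi,\quad i=1,\ldots,m,\\ T_i=Q_i\,\epsilon_i^{\sqrt{\phi_i}}, & T_i>\Psi,\quad i=m+1,\ldots,n,\end{cases} \tag{13}$$

where $\epsilon_i\sim\mathrm{QLS}(1,1,g)$, $Q_i=\exp(\beta_0+\beta_1 x_i)$ and $\phi_i=\exp(\kappa_0+\kappa_1 w_i)$.

The simulation scenario considers: (i) the log-t model with $\xi=4$, $\beta_0=1.5$, $\beta_1=5.5$, $\kappa_0=2.0$, $\kappa_1=2.0$, $q=0.10,0.50,0.90$; and (ii) the log-PE model with $\xi=0.3$, $\beta_0=1.0$, $\beta_1=1.2$, $\kappa_0=2.0$, $\kappa_1=0.5$; with $\mathrm{NREP}=200$ Monte Carlo replications. The explanatory variables $x_i$ and $w_i$ are generated from the $\mathrm{N}(0,1)$ distribution. The value of $\Psi$ in (13) is determined so that the censoring proportion is 10% or 40%.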
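A data generating process of the form (13) can be sketched as follows. Two simplifications are assumed here and are not taken from the paper: the errors are log-normal rather than log-t or log-PE (so that only the standard library is needed), and $\Psi$ is set to the empirical quantile of the simulated latent responses that yields the target censoring proportion. The function name and argument layout are hypothetical.

```python
import math
import random
from statistics import NormalDist

def simulate_tobit(n, beta, kappa, q, censoring, seed=1):
    """Draw one sample from a model of the form (13) with log-normal errors.

    Returns (x, w, t, censored, psi); censored[i] is True when t[i] equals psi.
    """
    rng = random.Random(seed)
    z_q = NormalDist().inv_cdf(q)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    latent = []
    for x_i, w_i in zip(x, w):
        log_Q = beta[0] + beta[1] * x_i                  # log Q_i
        phi_i = math.exp(kappa[0] + kappa[1] * w_i)      # phi_i
        log_eps = rng.gauss(0.0, 1.0) - z_q              # log eps_i, eps_i ~ QLS(1, 1, g)
        latent.append(math.exp(log_Q + math.sqrt(phi_i) * log_eps))
    # Assumption: psi is the empirical quantile giving the target censoring share.
    m = int(round(censoring * n))
    psi = sorted(latent)[m - 1] if m > 0 else 0.0
    censored = [t_i <= psi for t_i in latent]
    t = [psi if c else t_i for t_i, c in zip(latent, censored)]
    return x, w, t, censored, psi

x, w, t, censored, psi = simulate_tobit(200, [1.0, 1.2], [0.0, 0.5], 0.50, 0.10)
```

Repeating this draw NREP times and fitting the model to each sample gives the estimates from which RB, RRMSE and RMAE are computed.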

Tables A2 and A3 present the results of Monte Carlo simulations based on the log-t and log-PE distributions (similar results are obtained for the log-NO and EBS cases). For comparison, the results of the Powell [32], Portnoy [31] and Peng-Huang [30] estimators are given as well; see the R package quantreg [22]. Tables A2 and A3 report the RB, RRMSE and RMAE obtained for different combinations of censoring proportion, $q$, and sample size ($n$). The results in Tables A2 and A3 allow us to conclude that: ML1) as the sample size increases, in general, the RB, RRMSE and RMAE of all the estimates of the log-t and log-PE quantile tobit model parameters tend to decrease, as expected. This is the case because the maximum likelihood estimator is consistent [11]; ML2) when the censoring proportion increases, in general, the RB, RRMSE and RMAE of all the estimates related to the log-t and log-PE quantile tobit models tend to increase, meaning that the performance of the estimates deteriorates. This result is also expected since the likelihood function loses information contained in the sample when the percentage of censoring increases [40]; ML3) compared to the results of Powell, Portnoy and Peng-Huang, we observe that the results based on the log-t and log-PE distributions, in general, recover the parameters better; and ML4) the performance of all the estimates seems better at the quantiles closer to $q=0.50$. The results in ML1) and ML3) are illustrated in Figure 2, ML2) is represented in Figure 3, and ML4) is plotted in Figure 4. These figures illustrate the results of $\hat{\beta}_1$ for the log-t.

Figure 2.


RB, RRMSE and RMAE of $\hat{\beta}_1$ for the log-t, Powell, Portnoy and Peng-Huang cases for some $n$ values ($q=0.50$ and censoring = 10%).

Figure 3.


RB, RRMSE and RMAE of $\hat{\beta}_1$ for the log-t, Powell, Portnoy and Peng-Huang cases for some censoring ratio values ($n=50$ and $q=0.50$).

Figure 4.


RB, RRMSE and RMAE of $\hat{\beta}_1$ for the log-t, Powell, Portnoy and Peng-Huang cases for some $q$ values ($n=50$ and censoring = 10%).

In this first scenario, we also compute the coverage probability (CP) of 95% confidence intervals (CIs) based on the asymptotic normality in (11). The CP estimate is given by

$$\widehat{\mathrm{CP}}(\hat{\theta})=\frac{1}{\mathrm{NREP}}\sum_{i=1}^{\mathrm{NREP}}I\!\left(\theta\in[\hat{\theta}_L^{(i)},\hat{\theta}_U^{(i)}]\right),$$

where $I$ is an indicator function taking the value 1 if $\theta$ belongs to the $i$-th interval $[\hat{\theta}_L^{(i)},\hat{\theta}_U^{(i)}]$, and 0 otherwise, where $\hat{\theta}_L^{(i)}$ and $\hat{\theta}_U^{(i)}$ are the $i$-th lower and upper limit estimates of the 95% confidence interval. Table A1 reports the CP results for the log-t and log-PE models. From this table, we observe that: CP1) the CP approaches the 95% nominal level as the sample size increases; CP2) there seems to be no clear effect of increasing either $q$ or the censoring ratio. The result CP1) is illustrated in Figure 5(a) while CP2) is represented in Figure 5(b,c).
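The coverage estimate is just the share of replicas whose interval contains the true value. A short sketch follows; the function names are hypothetical, and `wald_interval` assumes the usual interval built from an estimate and its standard error under asymptotic normality.

```python
def wald_interval(estimate, std_error, z=1.959964):
    """95% interval under asymptotic normality; z is the 0.975 standard normal quantile."""
    return estimate - z * std_error, estimate + z * std_error

def coverage_probability(intervals, theta):
    """Share of (lower, upper) intervals that contain the true value theta."""
    hits = sum(1 for lower, upper in intervals if lower <= theta <= upper)
    return hits / len(intervals)

# Four illustrative intervals, two of which contain the true value 1.0.
intervals = [(0.0, 1.0), (2.0, 3.0), (0.5, 2.5), (1.2, 1.4)]
print(coverage_probability(intervals, 1.0))  # 0.5
```

In a simulation study, `intervals` would hold one Wald interval per Monte Carlo replica, built from the estimate and the standard error obtained from the inverse of the observed information.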

Figure 5.


CPs for the log-t model for some values of n, q and censoring proportion. (a) q = 0.50 and censoring=10%. (b) n = 50 and censoring=10%. (c) n = 50 and q = 0.50.

4.2. Scenario 2

In the second scenario, the model is misspecified with data generating process given by

$$T_i=\begin{cases}\Psi, & T_i\leq\Psi,\quad i=1,\ldots,m,\\ T_i, & T_i>\Psi,\quad i=m+1,\ldots,n,\end{cases} \tag{14}$$

where $T_i\sim\mathrm{Weibull}(Q_i,\zeta)$ and $Q_i=\exp(\beta_0+\beta_1 x_i)$. That is, $T_i$ follows a Weibull distribution with shape and quantile parameters $\zeta>0$ and $Q_i>0$, respectively, with probability density function and cumulative distribution function given by

$$f(t_i;Q_i,\zeta)=-\frac{\zeta\,t_i^{\zeta-1}}{Q_i^{\zeta}}\ln(1-q)\exp\!\left(\frac{t_i^{\zeta}}{Q_i^{\zeta}}\ln(1-q)\right),\qquad F(t_i;Q_i,\zeta)=1-(1-q)^{(t_i/Q_i)^{\zeta}},\quad t_i>0, \tag{15}$$

respectively, where $q\in(0,1)$. Note that the $100q$-th quantile of $T_i$ is given by $Q_i$. The simulation scenario considers $\beta_0=1.0$, $\beta_1=1.5$, $\zeta=2.5$, $q=0.10,0.30,0.70,0.90$, with $\mathrm{NREP}=200$ Monte Carlo replications. The explanatory variable $x_i$ is generated from the $\mathrm{N}(0,1)$ distribution and the value of $\Psi$ in (14) is determined so that the censoring proportion is 25%.
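The quantile-parameterized Weibull distribution can be checked and sampled with a few lines of code. The sketch below uses the distribution function $F(t)=1-(1-q)^{(t/Q)^{\zeta}}$ and inverts it to generate draws; the function names are hypothetical and the inverse-CDF sampler is an assumption about how such data could be generated, not a description of the authors' code.

```python
import math
import random

def weibull_q_cdf(t, Q, zeta, q):
    """Distribution function of the quantile-parameterized Weibull: 1 - (1 - q) ** ((t / Q) ** zeta)."""
    return 1.0 - (1.0 - q) ** ((t / Q) ** zeta)

def weibull_q_draw(Q, zeta, q, rng):
    """Inverse-CDF draw: solve u = F(t) for t, with u ~ Uniform(0, 1)."""
    u = rng.random()
    return Q * (math.log(1.0 - u) / math.log(1.0 - q)) ** (1.0 / zeta)

# The distribution function evaluated at Q returns q, so Q is the 100q-th quantile.
print(round(weibull_q_cdf(2.0, 2.0, 2.5, 0.30), 6))  # 0.3
```

Responses generated this way can then be censored at $\Psi$ as in (14) before fitting the (misspecified) log-symmetric models.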

Table A4 presents the RB, RRMSE and RMAE results based on the log-NO, log-t, log-PE and EBS distributions. Moreover, the results of Powell, Portnoy and Peng-Huang are given as well. From Table A4, we observe that the results for $\beta_1$ associated with the log-t model are better. In general, the true values of the parameters are well recovered in all models, that is, the estimates are close to the true values of the parameters. These results are illustrated in Figure 6.

Figure 6.


RB, RRMSE and RMAE of $\hat{\beta}_1$ for the log-t, Powell, Portnoy and Peng-Huang cases for some $n$ values ($q=0.30$ and censoring = 25%) with misspecified data generating process.

5. Applications

In this section, the proposed log-symmetric quantile tobit models are used to analyze the PNAD and PSID data. The PNAD data set was obtained from the official IBGE website2, whereas the PSID data set has already been analyzed in the literature on tobit models. The distributions are the same ones used in the Monte Carlo simulations. The steps to model the female labor supply data are described by Algorithm 1.


5.1. PNAD data

In this subsection, the log-symmetric quantile tobit models are illustrated using data from the PNAD for the year 2015, from the IBGE, which reports demographic and socioeconomic characteristics of the Brazilian population annually. We used a sub-sample from the PNAD composed only of women3. The PNAD data consist of a sample of women aged between 18 and 65 years old, with information on hourly wages and socioeconomic characteristics. In total, the sample contains 26,460 observations, of which 387 are censored with a salary equal to zero. The data cover 10 metropolitan regions in Brazil, represented by Belem-PA, Fortaleza-CE, Recife-PE, Salvador-BA, Belo Horizonte-MG, Rio de Janeiro-RJ, Curitiba-PR, Porto Alegre-RS, Brasília-DF and São Paulo-SP. Nominal income values were deflated by the National Consumer Price Index (INPC) provided by the IBGE.

The objective is to study women's labor supply. The dependent variable is women's hourly wages ($T$) and the explanatory variables are woman's age in years (age), woman's age squared as a proxy for experience (age$^2$), a dummy variable for race with value 1 for white and 0 for non-white (color), a dummy variable for marital status with value 1 for married and 0 for non-married (civil), a dummy variable for children under 10 in the household with value 1 if yes and 0 if no (minor), formal years of education ranging from 0 to 16 years (educ), number of years in the main job, varying between 0 and 56 years (exper), and a dummy variable for the condition of the woman in the household, with value 1 if the woman is the head of the household and 0 otherwise (head). As in the PSID application, 1 is added to the dependent variable woman's hourly wage ($T+1$), such that $\Psi=1$.

The choice of these explanatory variables is due to their importance in the female labor supply literature, in addition to some similarity with the PSID data. The variable educ, for instance, directly affects female participation rate in the labor market. On the other hand, the women labor supply does not depend only on the demand side. There are also other elements that might limit this participation, such as the presence of children in the household.

Descriptive statistics for women's hourly wages ($T$) indicate that the mean and median are 22.133 and 7.59, respectively. The coefficient of variation is 493.11%, indicating a high dispersion of the data around the mean, whereas the coefficients of skewness and kurtosis are equal to 20.225 and 573.536, respectively. The skewness suggests the presence of a positive asymmetry and the kurtosis indicates the occurrence of heavy tails, which motivates the use of log-symmetric distributions. The asymmetric nature of the data is confirmed by the histogram shown in Figure 7(a).

Figure 7.


Histogram (a) for the observed women's hourly wages and QQ plot (b) and its envelope for the MT residual for the log-t quantile tobit model (q = 0.50).

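Since the proposed models can accommodate heteroscedasticity, two specifications are considered:

$$T_i=\begin{cases}\Psi, & T_i\leq\Psi,\quad i=1,\ldots,387,\\ T_i=Q_i\,\epsilon_i^{\sqrt{\phi_i}}, & T_i>\Psi,\quad i=388,\ldots,26{,}460,\end{cases}$$

where

$$Q_i=\exp(\beta_0+\beta_1\mathrm{age}_i+\beta_2\mathrm{age}_i^2+\beta_3\mathrm{color}_i+\beta_4\mathrm{civil}_i+\beta_5\mathrm{minor}_i+\beta_6\mathrm{educ}_i+\beta_7\mathrm{exper}_i+\beta_8\mathrm{head}_i),$$

$\epsilon_i\sim\mathrm{QLS}(1,1,g)$, and

  • Specification 1: $\phi_i=\exp(\kappa_0)$.

  • Specification 2: $\phi_i=\exp(\kappa_0+\kappa_1\mathrm{age}_i+\kappa_2\mathrm{age}_i^2+\kappa_3\mathrm{color}_i+\kappa_4\mathrm{civil}_i+\kappa_5\mathrm{minor}_i+\kappa_6\mathrm{educ}_i+\kappa_7\mathrm{exper}_i+\kappa_8\mathrm{head}_i)$.

In Specification 2, explanatory variables are present in the dispersion parameter $\phi$, that is, the model allows for heteroscedasticity.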

The AIC and BIC values for the fitted log-symmetric quantile tobit models are reported in Table 2. In Table 2, the values of $q$ are 0.05, 0.25, 0.50, 0.75 and 0.95. However, similar results are obtained when considering $q\in\{0.01,0.02,\ldots,0.99\}$. The models with explanatory variables in the dispersion parameter ($\phi$) present better fits for all distributions considered here. In general, the log-t quantile tobit model shows the best fit to the data.

Table 2. AIC and BIC values for different models and $q$ with the PNAD data.

| Model | Specification | Criterion | q = 0.05 | q = 0.25 | q = 0.50 | q = 0.75 | q = 0.95 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Log-NO | 1 | AIC | 63738.61 | 63738.61 | 63738.61 | 63738.61 | 63738.61 |
| | 1 | BIC | 63820.45 | 63820.45 | 63820.45 | 63820.45 | 63820.45 |
| | 2 | AIC | 62632.53 | 62707.62 | 62748.85 | 62782.9 | 62822.39 |
| | 2 | BIC | 62779.83 | 62854.92 | 62896.15 | 62930.2 | 62969.69 |
| Log-t | 1 | AIC | 53825.67 | 53825.67 | 53836.37 | 53825.67 | 53825.67 |
| | 1 | BIC | 53907.51 | 53907.51 | 53918.2 | 53907.51 | 53907.51 |
| | 2 | AIC | 50348.87 | 51912.48 | 52303.72 | 52496.98 | 52696.98 |
| | 2 | BIC | 50496.17 | 52059.78 | 52451.02 | 52644.28 | 52844.28 |
| Log-PE | 1 | AIC | 55905.30 | 55905.30 | 55905.30 | 55905.29 | 55905.30 |
| | 1 | BIC | 55987.13 | 55987.13 | 55987.13 | 55987.13 | 55987.14 |
| | 2 | AIC | 53986.88 | 54591.75 | 54747.16 | 54847.73 | 54970.37 |
| | 2 | BIC | 54134.18 | 54739.05 | 54894.46 | 54995.03 | 55117.67 |
| EBS | 1 | AIC | 63895.42 | 63895.42 | 63895.42 | 63895.42 | 63895.42 |
| | 1 | BIC | 63977.25 | 63977.25 | 63977.25 | 63977.25 | 63977.25 |
| | 2 | AIC | 62806.21 | 62877.53 | 62917 | 62949.78 | 62987.94 |
| | 2 | BIC | 62953.51 | 63024.83 | 63064.3 | 63097.08 | 63135.24 |
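The AIC and BIC values in Table 2 can be related to the number of estimated parameters. Assuming the usual definitions, $\mathrm{AIC}=2p-2\ell(\hat{\theta})$ and $\mathrm{BIC}=p\log(n)-2\ell(\hat{\theta})$, the difference $\mathrm{BIC}-\mathrm{AIC}=p(\log(n)-2)$ depends only on $p$ and $n$. The sketch below (function names hypothetical) uses this to recover $p$ from two rows of the table with $n=26{,}460$.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion, usual definition (assumed)."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion, usual definition (assumed)."""
    return n_params * math.log(n_obs) - 2.0 * loglik

n_obs = 26460
# BIC - AIC = p * (log(n) - 2), so the gap identifies the parameter count p.
gap_spec1 = 63820.45 - 63738.61   # log-NO, Specification 1
gap_spec2 = 62779.83 - 62632.53   # log-NO, Specification 2, q = 0.05
p_spec1 = gap_spec1 / (math.log(n_obs) - 2.0)
p_spec2 = gap_spec2 / (math.log(n_obs) - 2.0)
print(round(p_spec1), round(p_spec2))  # 10 18
```

The recovered counts agree with the model structure: nine $\beta$ coefficients plus $\kappa_0$ in Specification 1, and nine $\beta$ plus nine $\kappa$ coefficients in Specification 2.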

The model parameter estimates for the log-t quantile tobit model based on Specification 2, considering $q\in\{0.05,0.25,0.50,0.75,0.95\}$, are reported in Table 3. This table also presents the estimation results based on the optimal quantile, denoted by $q_{\mathrm{otm}}$, which was chosen through a search approach. For a grid of values $q\in\{0.01,0.02,\ldots,0.99\}$, we estimated the model parameters and computed the corresponding AIC and BIC values. Then, the value of $q_{\mathrm{otm}}$ was the one with the lowest AIC and BIC values. From Table 3, note that the maximum likelihood estimates of the model parameters change according to the value of $q$, meaning that the magnitude of the effect of the explanatory variables varies with $q$. We can interpret the estimated coefficients in terms of the effect on the latent variable $T_i$ (observed part of the hourly wage); see Subsection 3.1. For white women, for instance, there is an increase in the 5th percentile ($q=0.05$) of the hourly wage of $(\exp(0.0219)-1)\times 100\%=2.21\%$ when compared to non-white women. However, the increase in the 95th percentile ($q=0.95$) of the hourly wage is $(\exp(0.2315)-1)\times 100\%=26.05\%$. That is, the effect of color on the observed part of the hourly wage is greater for women with higher income (larger quantiles).

Table 3.

Maximum likelihood estimates (standard errors in parentheses) for the log-t quantile tobit model across different values of q (Specification 2).

  q = 0.05 q = 0.25 q = 0.50 q = 0.75 q = 0.95 qotm=0.01
β0(intercept) 1.7685* 0.9726* 0.6617* 0.5392* 0.5919* 1.9216*
  (0.0522) (0.0351) (0.0304) (0.0367) (0.0731) (0.0865)
β1(age) −0.0113* 0.0195* 0.0311* 0.0417* 0.0656* −0.0413*
  (0.0027) (0.0018) (0.0016) (0.0020) (0.0041) (0.0044)
β2(age2) 0.0001* −0.0002* −0.0003* −0.0004* −0.0006* 0.0004*
  (0.0000) (0.0000) (0.0000) (0.0000) (0.0001) (0.0001)
β3(color) 0.0219* 0.1235* 0.1683* 0.2003* 0.2315* −0.0735*
  (0.0108) (0.0068) (0.0062) (0.0081) (0.0163) (0.0169)
β4(civil) −0.0008 −0.0177 −0.0104 0.0038 0.0351 0.0399
  (0.0236) (0.0167) (0.0161) (0.0207) (0.0412) (0.0348)
β5(minor) −0.0344* −0.0238* −0.0162* −0.0110 −0.0020 −0.0335**
  (0.0111) (0.0070) (0.0064) (0.0082) (0.0162) (0.0174)
β6(educ) −0.0376* 0.0382* 0.0646* 0.0767* 0.0842* −0.1227*
  (0.0014) (0.0014) (0.0011) (0.0011) (0.0018) (0.0019)
β7(exper) −0.0012 0.0109* 0.0174* 0.0229* 0.0369* −0.0101*
  (0.0009) (0.0007) (0.0006) (0.0008) (0.0017) (0.0012)
β8(head) −0.0459* −0.0044 0.0197* 0.0397* 0.0798* −0.0811*
  (0.0110) (0.0075) (0.0069) (0.0087) (0.0175) (0.0163)
κ0(intercept) −6.3627* −5.9789* −5.2843* −4.8397* −4.2542* −5.2983*
  (0.1141) (0.1608) (0.1559) (0.1466) (0.1370) (0.0792)
κ1(age) 0.0829* 0.0930* 0.0916* 0.0884* 0.0806* 0.0579*
  (0.0058) (0.0079) (0.0079) (0.0075) (0.0070) (0.0040)
κ2(age2) −0.0008* −0.0008* −0.0008* −0.0008* −0.0007* −0.0006*
  (0.0001) (0.0001) (0.0001) (0.0001) (0.0001) (0.0000)
κ3(color) 0.2495* 0.2688* 0.2368* 0.2005* 0.1130 0.1799*
  (0.0207) (0.0272) (0.0280) (0.0279) (0.0260) (0.0142)
κ4(civil) −0.0311 0.0649 0.0927 0.0990 −0.0023 −0.0572**
  (0.0504) (0.0664) (0.0666) (0.0649) (0.0600) (0.0328)
κ5(minor) 0.0370 0.0314 0.0263 0.0251 0.0849 0.0194
  (0.0226) (0.0298) (0.0302) (0.0298) (0.0271) (0.0152)
κ6(educ) 0.2022* 0.1381* 0.0830* 0.0529* 0.0234* 0.1673*
  (0.0030) (0.0048) (0.0042) (0.0036) (0.0028) (0.0019)
κ7(exper) 0.0242* 0.0271* 0.0253* 0.0245* 0.0259* 0.0174*
  (0.0014) (0.0019) (0.0020) (0.0021) (0.0020) (0.0009)
κ8(head) 0.1104* 0.1227* 0.1142* 0.1070* 0.0938* 0.0795*
  (0.0211) (0.0289) (0.0292) (0.0288) (0.0265) (0.0139)
ξ 2 2 2 2 2 2

Source: Elaborated by the authors based on PNAD data.

* significant at 5% level. ** significant at 10% level.

Table 3 reveals that the parameter estimates associated with the explanatory variables age, age², color, minor, educ, exper and head, which model the quantile, are statistically significant for at least one q, and that the parameter estimates associated with the explanatory variables age, age², color, civil, educ, exper and head, which model the dispersion, are also statistically significant for at least one q. This indicates the presence of heteroscedasticity in the data. Thus, the reduced predictive log-t quantile tobit model (q = 0.50) is given by (standard errors in parentheses)

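$$
T_i = \begin{cases}
\Psi, & T_i^* \le \Psi, \quad i = 1, \ldots, 387, \\
T_i^* = \hat{Q}_i \, \epsilon_i^{\sqrt{\hat{\phi}_i}}, & T_i^* > \Psi, \quad i = 388, \ldots, 26{,}460,
\end{cases} \tag{16}
$$

where

$$
\begin{aligned}
\hat{Q}_i = \exp\big( & \underset{(0.0303)}{0.6622} + \underset{(0.0016)}{0.0311}\,\text{age}_i - \underset{(<0.0001)}{0.0003}\,\text{age}_i^2 + \underset{(0.0062)}{0.1683}\,\text{color}_i - \underset{(0.0064)}{0.0169}\,\text{minor}_i \\
& + \underset{(0.0011)}{0.0646}\,\text{educ}_i + \underset{(0.0006)}{0.0174}\,\text{exper}_i + \underset{(0.0069)}{0.0195}\,\text{head}_i \big), \\
\hat{\phi}_i = \exp\big( & -\underset{(0.1567)}{5.2777} + \underset{(0.0078)}{0.0922}\,\text{age}_i - \underset{(0.0001)}{0.0008}\,\text{age}_i^2 + \underset{(0.0279)}{0.2362}\,\text{color}_i + \underset{(0.0654)}{0.0997}\,\text{civil}_i \\
& + \underset{(0.0042)}{0.0828}\,\text{educ}_i + \underset{(0.0020)}{0.0253}\,\text{exper}_i + \underset{(0.0291)}{0.1147}\,\text{head}_i \big),
\end{aligned}
$$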

and $\hat{\xi} = 2$. The QQ plot with simulated envelope of the MT residual for this model is depicted in Figure 7(b). The log-t model provides a good fit to the data. It is worth mentioning that similar plots are obtained for other values of q.
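To illustrate how the fitted equation (16) is used, the sketch below evaluates the predicted median and dispersion for one covariate profile through the log link. The coefficients are the point estimates of the reduced model at q = 0.50, with signs matching the corresponding entries of Table 3. The covariate values, the function names and the 0/1 coding of the indicator variables are hypothetical and serve only as an example.

```python
import math

# Point estimates of the reduced log-t quantile tobit model (q = 0.50).
BETA = {"intercept": 0.6622, "age": 0.0311, "age2": -0.0003, "color": 0.1683,
        "minor": -0.0169, "educ": 0.0646, "exper": 0.0174, "head": 0.0195}
KAPPA = {"intercept": -5.2777, "age": 0.0922, "age2": -0.0008, "color": 0.2362,
         "civil": 0.0997, "educ": 0.0828, "exper": 0.0253, "head": 0.1147}


def log_link(coefs, covariates):
    """exp(linear predictor); covariates missing from the profile count as zero."""
    eta = coefs["intercept"]
    for name, value in coefs.items():
        if name != "intercept":
            eta += value * covariates.get(name, 0.0)
    return math.exp(eta)


def make_profile(age, **others):
    """Build a covariate dictionary, adding the squared age term automatically."""
    profile = {"age": age, "age2": age ** 2}
    profile.update(others)
    return profile


# Hypothetical profile, for illustration only.
profile = make_profile(age=40, color=1, minor=0, civil=1, educ=11, exper=10, head=0)
median_hat = log_link(BETA, profile)       # predicted median, response scale
dispersion_hat = log_link(KAPPA, profile)  # predicted dispersion
print(round(median_hat, 4), round(dispersion_hat, 4))
```

Because of the log link, switching an indicator such as color from 0 to 1 multiplies the predicted median by exp(0.1683), whatever the other covariates are.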

Comparing the proposed log-t model with the classic normal tobit model, we have AIC = 313909.3 and BIC = 313991.1 for the latter, indicating that the proposed model is a better choice. Moreover, we can compare the log-t results with those of the Powell, Portnoy and Peng-Huang methods. To do so, we fit these models for each $q = 0.01, 0.02, \ldots, 0.98, 0.99$, and then compute the averages of the corresponding root mean squared errors (RMSEs) and mean absolute errors (MAEs) of the predicted values, where $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(t_i - \hat{t}_i)^2}$ and $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|t_i - \hat{t}_i|$, with $t_i$ and $\hat{t}_i$ being the actual and predicted values, respectively. Table 4 reveals that the log-t model performs well, especially in terms of MAE. The Powell, Portnoy and Peng-Huang procedures had estimation problems for some values of q.
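The two error measures, and their averaging over the grid of q values, can be written in a few lines. This is a minimal sketch with invented toy numbers; the function names and the dictionary layout (one list of predictions per q) are assumptions made for the example.

```python
import math
from statistics import mean


def _check_lengths(actual, predicted):
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")


def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    _check_lengths(actual, predicted)
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(actual, predicted)))


def mae(actual, predicted):
    """Mean absolute error of the predictions."""
    _check_lengths(actual, predicted)
    return mean(abs(t - p) for t, p in zip(actual, predicted))


def average_over_quantiles(actual, predictions_by_q, metric):
    """Average a metric over the fits obtained for each q in the grid."""
    return mean(metric(actual, pred) for pred in predictions_by_q.values())


# Toy data, for illustration only.
actual = [1.0, 2.0, 4.0]
predictions_by_q = {0.25: [1.0, 1.0, 3.0], 0.50: [1.0, 2.0, 4.0]}
print(average_over_quantiles(actual, predictions_by_q, rmse))
print(average_over_quantiles(actual, predictions_by_q, mae))
```

The RMSE penalizes large individual errors more heavily than the MAE, so the two measures need not rank competing models in the same order.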

Table 4.

Means of the RMSEs and MAEs based on $q = 0.01, 0.02, \ldots, 0.98, 0.99$ for the indicated models.

Measure Log-t Powell Portnoy Peng-Huang
RMSE 112.6243 112.6054 112.7335 112.2671
MAE 21.2145 23.6919 23.8055 23.3265

5.2. PSID data

This application considers a data set from the PSID of 1976, based on data from the previous year, 1975; see Mroz [29]. The data set contains 753 observations of white married women between 30 and 60 years of age in 1975 (the interview took place in 1976). Of these 753 women, 325 have a salary equal to zero, that is, they are censored at zero. Since the proposed models are for positive data, the dependent variable is taken to be T + 1, such that Ψ = 1.
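The shift of the dependent variable can be sketched as follows: adding 1 to every wage moves the censoring point from 0 to Ψ = 1, so that all responses are strictly positive. The function name and the toy wage values are hypothetical.

```python
def shift_for_positive_support(wages, shift=1.0):
    """Shift responses so they are positive; return data, threshold and censoring flags."""
    shifted = [w + shift for w in wages]
    psi = 0.0 + shift                      # censoring point after the shift
    censored = [t <= psi for t in shifted]
    return shifted, psi, censored


# Toy wages, for illustration only; zeros are the censored observations.
wages = [0.0, 3.35, 0.0, 4.59]
shifted, psi, censored = shift_for_positive_support(wages)
print(shifted, psi, censored)

# Censoring proportion implied by the counts reported for the PSID sample.
print(round(100 * 325 / 753, 1))
```

With 325 censored observations out of 753, roughly 43.2% of the sample sits at the censoring point.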

The objective here is to estimate the labor supply of white married women by using log-symmetric quantile tobit models. The dependent variable is the hourly wage (T) (in 1975 US dollars) and the explanatory variables are age in years (age), years of education (educ), number of children under 6 years old in the household (chil6), number of children between 6 and 18 years old in the household (chil618), and years of previous experience in the labor market (exper). These data were previously studied by Barros et al. [2] by using the classic normal tobit model and the Student-t tobit model.

Similarly to the first application, two versions of the log-symmetric quantile tobit model are considered:

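$$
T_i = \begin{cases}
\Psi, & T_i^* \le \Psi, \quad i = 1, \ldots, 325, \\
T_i^* = Q_i \, \epsilon_i^{\sqrt{\phi_i}}, & T_i^* > \Psi, \quad i = 326, \ldots, 753,
\end{cases}
$$

where

$$
Q_i = \exp(\beta_0 + \beta_1\,\text{age}_i + \beta_2\,\text{educ}_i + \beta_3\,\text{chil6}_i + \beta_4\,\text{chil618}_i + \beta_5\,\text{exper}_i),
$$

$\epsilon_i \sim \text{QLS}(1, 1, g)$, and

  • Specification 1: $\phi_i = \exp(\kappa_0)$.

  • Specification 2: $\phi_i = \exp(\kappa_0 + \kappa_1\,\text{age}_i + \kappa_2\,\text{educ}_i + \kappa_3\,\text{chil6}_i + \kappa_4\,\text{chil618}_i + \kappa_5\,\text{exper}_i)$.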

Note that the difference between the two models lies in the inclusion of explanatory variables in the dispersion parameter ϕ. Thus, Specification 2 accounts for the presence of heteroscedasticity.

An analysis similar to that of the previous application allows us to conclude that the models with explanatory variables in the dispersion parameter (ϕ) fit better than those without, according to the AIC and BIC values. Moreover, the log-PE quantile tobit model provided the best fit to the data.

Table 5 reports the maximum likelihood estimates and standard errors for the log-PE quantile tobit model parameters based on Specification 2, considering $q \in \{0.05, 0.25, 0.50, 0.75, 0.95\}$ and $q_{otm}$. From Table 5, we note that the maximum likelihood estimates of the parameters change according to the value of q. Thus, the magnitude of the effect of the explanatory variables varies with q. Again, we can interpret the estimated coefficients in terms of the effect on the latent variable $T_i$, that is, the effect on the observed part of the hourly wage. We observe, for instance, that one additional year of experience (exper) increases the 5th percentile (q = 0.05) of the hourly wage by $(\exp(0.0974) - 1) \times 100\% = 10.23\%$, while it increases the 75th percentile (q = 0.75) by $(\exp(0.0383) - 1) \times 100\% = 3.90\%$. In other words, the effect of increased labor experience on the observed part of the hourly wage is greater for women with lower income (lower quantiles).
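The percentage interpretation used above follows directly from the log link: a one-unit increase in a covariate multiplies the modeled quantile by exp(β). A minimal sketch, using the exper estimates from Table 5 for the five reported quantiles (the function name and the dictionary are only illustrative):

```python
import math


def percent_effect(coef, delta=1.0):
    """Percentage change in the modeled quantile for a change of `delta` in a covariate."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Estimates of the exper coefficient in the quantile equation, taken from Table 5.
EXPER_BY_Q = {0.05: 0.0974, 0.25: 0.0814, 0.50: 0.0625, 0.75: 0.0383, 0.95: 0.0061}

for q, coef in EXPER_BY_Q.items():
    print(f"q = {q:.2f}: {percent_effect(coef):.2f}%")
```

For small coefficients the effect is close to 100 × β, which is why 0.0383 maps to about 3.90% while 0.0974 maps to the slightly larger 10.23%.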

Table 5.

Maximum likelihood estimates (standard errors in parentheses) for the log-PE quantile tobit model across different values of q (Specification 2).

  q = 0.05 q = 0.25 q = 0.50 q = 0.75 q = 0.95 qotm=0.99
β0(intercept) 0.2486 0.4507 0.5790 0.6487** 0.7147** 0.7921
  (0.6830) (0.5235) (0.3964) (0.3554) (0.4319) (0.5196)
β1(age) −0.1070* −0.0864* −0.0631* −0.0350* −0.0013 0.0159**
  (0.0111) (0.0089) (0.0071) (0.0064) (0.0078) (0.0093)
β2(educ) 0.1708* 0.1638* 0.1537* 0.1393* 0.1213* 0.1124*
  (0.0386) (0.0293) (0.0212) (0.0167) (0.0191) (0.0227)
β3(chil6) −1.7487* −1.4588* −1.1038* −0.6507* −0.0929 0.1946
  (0.2786) (0.2301) (0.1785) (0.1354) (0.1588) (0.1985)
β4(chil618) −0.0314 −0.0113 0.0017 0.0030 −0.0152 −0.0320
  (0.0613) (0.0475) (0.0362) (0.0313) (0.0377) (0.0445)
β5(exper) 0.0974* 0.0814* 0.0625* 0.0383* 0.0061 −0.0123
  (0.0091) (0.0074) (0.0062) (0.0061) (0.0079) (0.0093)
κ0(intercept) −0.9522** −1.0583* −1.1658* −1.2561* −1.2605* −1.1975*
  (0.5074) (0.5335) (0.5619) (0.5809) (0.5814) (0.5912)
κ1(age) 0.0577* 0.0632* 0.0695* 0.0760* 0.0800* 0.0794*
  (0.0074) (0.0081) (0.0089) (0.0097) (0.0101) (0.0102)
κ2(educ) −0.0249 −0.0298 −0.0352 −0.0401 −0.0415 −0.0400
  (0.0262) (0.0273) (0.0283) (0.0290) (0.0284) (0.0279)
κ3(chil6) 0.8337* 0.9299* 1.0337* 1.1305* 1.1777* 1.1655*
  (0.1434) (0.1584) (0.1752) (0.1893) (0.1944) (0.1919)
κ4(chil618) 0.0401 0.0319 0.0194 −0.0003 −0.0282 −0.0417
  (0.0423) (0.0448) (0.0475) (0.0504) (0.0521) (0.0521)
κ5(exper) −0.0445* −0.0497* −0.0560* −0.0638* −0.0720* −0.0750*
  (0.0061) (0.0067) (0.0074) (0.0082) (0.0090) (0.0091)
ξ −0.48 −0.48 −0.48 −0.48 −0.48 −0.48

Source: Elaborated by the authors based on PSID data.

* significant at 5% level. ** significant at 10% level.

From Table 5, the parameter estimates associated with the explanatory variables age, educ, chil6 and exper, which model the quantile, are statistically significant for at least one q, and those associated with age, chil6 and exper, which model the dispersion, are also statistically significant, indicating the presence of heteroscedasticity in the data and justifying the dispersion modeling. Thus, the reduced predictive log-PE quantile tobit model (q = 0.50) is given by (standard errors in parentheses)

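$$
T_i = \begin{cases}
\Psi, & T_i^* \le \Psi, \quad i = 1, \ldots, 325, \\
T_i^* = \hat{Q}_i \, \epsilon_i^{\sqrt{\hat{\phi}_i}}, & T_i^* > \Psi, \quad i = 326, \ldots, 753,
\end{cases} \tag{17}
$$

where

$$
\begin{aligned}
\hat{Q}_i &= \exp\big( \underset{(0.3189)}{0.8755} - \underset{(0.0072)}{0.0637}\,\text{age}_i + \underset{(0.0158)}{0.1324}\,\text{educ}_i - \underset{(0.1891)}{1.1339}\,\text{chil6}_i + \underset{(0.0059)}{0.0600}\,\text{exper}_i \big), \\
\hat{\phi}_i &= \exp\big( -\underset{(0.3175)}{1.4206} + \underset{(0.0083)}{0.0698}\,\text{age}_i + \underset{(0.1676)}{1.0394}\,\text{chil6}_i - \underset{(0.0065)}{0.0594}\,\text{exper}_i \big),
\end{aligned}
$$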

and $\hat{\xi} = -0.57$. The QQ plot with simulated envelope of the MT residual, shown in Figure 8(b), indicates that the log-PE model provides a good fit to the data (similar results are found for other values of q).

Figure 8.

Histogram (a) for the observed women's hourly wages and QQ plot (b) and its envelope for the MT residual for the log-PE quantile tobit model (q = 0.50).

We can compare the proposed log-PE model with the classic normal tobit model and the Student-t tobit model, which were fitted by Barros et al. [2] using the same data set. We have AIC = 2892.12 and BIC = 2919.87 for the normal case, and AIC = 2758.99 and BIC = 2786.73 for the Student-t case. Thus, we highlight the superiority of the proposed models over the classic normal and Student-t tobit models fitted by Barros et al. [2] in three basic respects. (i) The use of asymmetric distributions (log-symmetric distributions), which are more adequate for the PSID data; the classic normal and Student-t tobit models use symmetric distributions. (ii) The possibility of modeling the dispersion and thereby accommodating heteroscedasticity, which improves the fit. (iii) The modeling in terms of quantiles, which provides a richer characterization of the effects of the explanatory variables on the dependent variable; the classic normal and Student-t tobit models do not consider the quantile approach. We can also compare the results of the proposed models with those of Powell, Portnoy and Peng-Huang. Once again, we fit these models for each $q = 0.01, 0.02, \ldots, 0.98, 0.99$, and then compute the averages of the corresponding RMSEs and MAEs of the predicted values. From Table 6, we observe good results for the log-PE model. The Powell, Portnoy and Peng-Huang procedures present estimation problems for different values of q, and the results for the Powell model diverge.

Table 6.

Means of the RMSEs and MAEs based on $q = 0.01, 0.02, \ldots, 0.98, 0.99$ for the indicated models.

Measure Log-PE Powell Portnoy Peng-Huang
RMSE 3.9750 1.9640e+14 4.0309 3.8051
MAE 2.7777 8.1192e+13 3.1193 2.8907

6. Concluding remarks

In this paper, a class of quantile tobit models was proposed based on a reparameterization of the log-symmetric distributions. In such reparameterization, the quantile is one of the distribution parameters. The advantages of the proposed models over the classic tobit model include:

  1. flexibility to assume several asymmetric distributions, since the quantile-based log-symmetric class incorporates several distributions as special cases, such as log-normal, log-Student-t, log-power-exponential and extended Birnbaum-Saunders, among others;

  2. greater flexibility in the analysis of the effects of explanatory variables on the dependent variable due to the quantile approach; and

  3. ability to accommodate heteroscedasticity, since the proposed model allows for the inclusion of explanatory variables in the dispersion parameter.

A Monte Carlo simulation study was carried out to evaluate the performance of the maximum likelihood estimates. In general, the results showed good performance of the maximum likelihood estimates in terms of relative bias, relative root mean squared error and relative mean absolute error. Two applications to actual data from PNAD (Brazil) and PSID (United States) were carried out to illustrate the proposed methodology. The applications favored the use of the log-symmetric quantile tobit models over the classic tobit model, corroborating advantages 1, 2 and 3 listed above. As another product of this paper, the authors are preparing an R package [33], which might be an important tool for professionals, researchers in the fields of economics and statistics, and data scientists, among others. Preliminary code for implementing our method in R is available on GitHub at https://github.com/danubiacunha/quantiletobit.git.

As future research, the following lines might be explored:

  1. to study some hypothesis and misspecification tests via Monte Carlo simulation; see Santos and Cribari-Neto [34];

  2. to generalize the proposed models for the cases with right censoring or two-sided censoring; see Long [24, pp. 211–212];

  3. to propose bivariate models; see Seung-Hoon [38];

  4. to investigate multiple maxima as well as local and global maxima;

  5. to investigate the behavior of the Wald, score, likelihood ratio and gradient tests; see Saulo et al. [35].

We are currently working on some of these issues and hope to report new findings in the near future.

Tables.

Table A1.

Coverage probability from simulated data and selected quantile tobit models.

      n = 25 n = 50 n = 100
Model Censoring Parameter q = 0.10 q = 0.50 q = 0.90 q = 0.10 q = 0.50 q = 0.90 q = 0.10 q = 0.50 q = 0.90
Log-t 10% β0 81.20 78.00 71.00 92.20 91.20 89.00 93.60 92.20 89.60
    β1 79.20 75.20 71.60 87.40 88.20 88.60 93.40 93.20 93.40
    κ0 90.80 90.60 88.20 94.80 94.00 92.60 91.80 92.20 93.80
    κ1 89.60 85.00 80.20 93.60 93.40 94.80 93.20 94.40 92.20
  40% β0 74.00 71.20 72.20 91.20 92.20 86.40 93.20 92.20 89.80
    β1 73.20 64.20 74.60 85.80 86.60 83.80 93.80 94.20 93.20
    κ0 76.40 84.00 90.20 93.60 91.80 94.40 91.80 93.60 95.00
    κ1 78.80 78.00 84.00 94.00 93.20 90.80 92.40 93.60 92.60
Log-PE 10% β0 90.12 89.09 90.20 94.80 93.00 92.00 93.20 93.39 94.60
    β1 82.26 83.43 86.33 93.80 93.80 92.80 92.40 94.19 93.80
    κ0 92.54 92.74 94.29 93.60 93.60 93.20 94.40 95.60 95.20
    κ1 91.53 89.11 94.49 94.80 92.20 92.80 95.80 92.60 92.60
  40% β0 89.00 86.59 87.25 95.40 93.80 91.20 92.80 93.00 94.40
    β1 81.00 78.66 85.02 94.19 93.60 93.80 93.20 94.00 94.40
    κ0 89.80 91.67 92.51 95.20 94.80 94.80 95.60 95.40 95.00
    κ1 91.40 85.77 92.11 95.00 93.60 94.60 95.80 93.00 93.60
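As a reading aid for Table A1, the sketch below shows one common way an empirical coverage probability is obtained from Monte Carlo replications. It assumes Wald-type intervals (estimate ± z × standard error) at a 95% nominal level; both are assumptions made here for illustration, and the function name and toy numbers are hypothetical.

```python
from statistics import NormalDist


def coverage_probability(estimates, std_errors, true_value, level=0.95):
    """Percentage of replications whose Wald interval contains the true value."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level = 0.95
    hits = 0
    for est, se in zip(estimates, std_errors):
        lower, upper = est - z * se, est + z * se
        if lower <= true_value <= upper:
            hits += 1
    return 100.0 * hits / len(estimates)


# Toy replications, for illustration only.
estimates = [1.0, 1.5, 3.0]
std_errors = [0.5, 0.1, 0.2]
print(coverage_probability(estimates, std_errors, true_value=1.2))
```

Under these assumptions, values close to 95 indicate that the asymptotic intervals are well calibrated, which in Table A1 happens mainly for n = 50 and n = 100.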

Table A2.

RB, RRMSE and RMAE from simulated data for the indicated cases.

        n = 25 n = 50 n = 100
  Censoring q Parameter RB RRMSE RMAE RB RRMSE RMAE RB RRMSE RMAE
Log-t 10% 0.10 β0 0.0580 0.4680 0.3750 −0.0042 0.2216 0.1742 −0.0003 0.1208 0.0958
      β1 0.0327 0.1513 0.1171 0.0063 0.0379 0.0309 0.0038 0.0312 0.0249
      κ0 −0.0854 0.2223 0.1773 −0.0260 0.1164 0.0934 −0.0119 0.0757 0.0608
      κ1 0.0591 0.1725 0.1359 0.0273 0.1167 0.0913 0.0135 0.0806 0.0646
Powell     β0 −0.2224 2.5671 1.1454 −0.0382 0.9985 0.5300 −0.1522 0.8199 0.3387
      β1 −0.0515 0.4696 0.3191 −0.0466 0.2112 0.1352 0.0017 0.1440 0.0959
Portnoy     β0 −0.6835 2.6454 1.1437 −0.5662 2.7080 0.7910 −0.1366 0.4889 0.2933
      β1 0.1423 0.4835 0.2975 0.0483 0.3710 0.1574 −0.0286 0.1217 0.0892
Peng-Huang     β0 −1.2078 4.9358 1.5976 −1.2277 6.2892 1.3935 −0.2538 0.6938 0.3609
      β1 0.2013 0.8014 0.3958 0.1137 0.7966 0.2412 −0.0384 0.1571 0.1098
Log-t   0.50 β0 −0.0059 0.3076 0.2419 −0.0082 0.1664 0.1310 0.0002 0.0604 0.0453
      β1 0.0034 0.1128 0.0899 −0.0016 0.0514 0.0400 0.0012 0.0239 0.0191
      κ0 −0.0793 0.2218 0.1776 −0.0081 0.1442 0.1154 −0.0119 0.1015 0.0807
      κ1 0.1227 0.2782 0.2200 0.0463 0.1637 0.1293 0.0436 0.1112 0.0899
Powell     β0 0.0133 0.5119 0.3943 0.0901 1.1402 0.3327 −0.0062 0.1844 0.1360
      β1 −0.0088 0.1838 0.1394 −0.0115 0.1827 0.0785 0.0031 0.0700 0.0456
Portnoy     β0 0.0232 0.5068 0.3883 −0.0024 0.2608 0.2090 0.0039 0.1600 0.1256
      β1 −0.0089 0.1818 0.1387 −0.0033 0.0760 0.0581 0.0019 0.0513 0.0401
Peng-Huang     β0 0.0531 0.5169 0.3978 −0.0148 0.2633 0.2091 −0.0235 0.1634 0.1257
      β1 −0.0108 0.1803 0.1376 0.0029 0.0768 0.0595 0.0012 0.0516 0.0403
Log-t   0.90 β0 −0.1216 0.5445 0.4352 −0.0355 0.1891 0.1522 −0.0153 0.1327 0.1051
      β1 0.0062 0.1049 0.0846 0.0018 0.0392 0.0314 0.0001 0.0280 0.0224
      κ0 −0.0681 0.2054 0.1621 −0.0243 0.1112 0.0884 −0.0021 0.0778 0.0614
      κ1 0.1096 0.2413 0.1958 0.0508 0.1471 0.1172 0.0288 0.0821 0.0638
Powell     β0 0.6479 2.3091 1.1293 0.1809 0.6641 0.4120 0.1122 0.4176 0.2936
      β1 0.0714 0.5123 0.2955 0.0099 0.1379 0.0980 0.0028 0.1016 0.0751
Portnoy     β0 0.7666 1.8913 1.1297 0.2676 0.7288 0.4717 0.1212 0.4234 0.2960
      β1 0.0720 0.4845 0.3161 0.0104 0.1426 0.1036 0.0026 0.0988 0.0741
Peng-Huang     β0 1.8024 3.5908 2.0110 0.2890 0.7779 0.4838 0.0901 0.4004 0.2859
      β1 0.1390 0.7290 0.4684 0.0153 0.1480 0.1045 0.0012 0.0982 0.0729
Log-t 40% 0.10 β0 −0.0762 0.7533 0.5757 −0.0348 0.3213 0.2479 0.0145 0.3928 0.3083
      β1 0.0207 0.1314 0.1025 0.0143 0.0804 0.0623 0.0001 0.0879 0.0700
      κ0 −0.0815 0.2778 0.2142 −0.0183 0.1439 0.1165 −0.0295 0.1130 0.0889
      κ1 0.0578 0.2083 0.1586 0.0328 0.1559 0.1257 0.0222 0.1063 0.0850
Powell     β0 −0.5188 4.5059 1.9175 −0.9452 5.1417 1.4365 0.4036 2.8785 1.7625
      β1 0.0440 0.6426 0.3880 −0.1098 0.9502 0.5168 −0.4373 0.8079 0.6864
Portnoy     β0 −0.5741 2.8342 1.5242 −0.5912 3.3235 0.9425 −1.0443 3.0364 1.5773
      β1 0.1358 0.4130 0.2775 0.0602 0.5395 0.2696 0.1662 0.4545 0.3274
Peng-Huang     β0 −2.7295 6.7772 3.2428 −2.8272 10.2338 3.1052 −2.0500 4.6851 2.4503
      β1 0.4193 0.8157 0.4814 0.4064 1.4192 0.5347 0.3037 0.6310 0.4302
Log-t   0.50 β0 −0.0610 0.3356 0.2442 −0.0151 0.1681 0.1318 −0.0034 0.1339 0.1041
      β1 0.0067 0.0618 0.0455 0.0120 0.1017 0.0799 0.0027 0.0653 0.0511
      κ0 −0.0722 0.2823 0.2225 −0.0397 0.1743 0.1381 −0.0293 0.1267 0.0986
      κ1 0.1879 0.3893 0.3120 0.0602 0.1756 0.1406 0.0298 0.1417 0.1130
Powell     β0 −0.0998 0.8234 0.5718 −0.1519 1.2939 0.4629 −0.0219 0.6335 0.2936
      β1 0.0098 0.1995 0.1327 0.0754 0.7943 0.3016 0.0185 0.3173 0.1295
Portnoy     β0 −0.1035 0.4767 0.3470 −0.0309 0.2578 0.2042 −0.0553 0.2548 0.1931
      β1 0.0100 0.1073 0.0768 0.0109 0.1698 0.1311 0.0202 0.1020 0.0782
Peng-Huang     β0 −0.1271 0.4896 0.3597 −0.0551 0.2656 0.2104 −0.1048 0.2760 0.2086
      β1 0.0203 0.1126 0.0807 0.0243 0.1779 0.1340 0.0293 0.1063 0.0818
Log-t   0.90 β0 −0.0977 0.4698 0.3752 −0.0534 0.2166 0.1739 −0.0232 0.1481 0.1181
      β1 0.0201 0.1425 0.1127 0.0035 0.0549 0.0435 0.0049 0.0520 0.0406
      κ0 −0.0490 0.2402 0.1935 −0.0134 0.1394 0.1095 −0.0136 0.0941 0.0735
      κ1 0.1253 0.3044 0.2312 0.0770 0.1889 0.1474 0.0309 0.1140 0.0916
Powell     β0 0.3990 1.3719 0.8616 0.3454 0.9561 0.6535 0.1000 0.4597 0.3284
      β1 −0.1542 0.4091 0.2843 −0.0737 0.2134 0.1599 −0.0278 0.1843 0.1430
Portnoy     β0 0.4222 1.5037 0.9012 0.2809 0.8129 0.5072 0.1186 0.4077 0.2916
      β1 −0.0291 0.5690 0.3643 0.0054 0.1936 0.1338 0.0103 0.1475 0.1137
Peng-Huang     β0 1.1683 2.7855 1.4775 0.2702 0.7869 0.5049 0.0681 0.3780 0.2734
      β1 0.0192 0.7575 0.4694 0.0331 0.2331 0.1491 0.0154 0.1428 0.1077
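The three summary measures reported in these tables can be computed from the Monte Carlo replications as sketched below. The normalization shown (dividing by the true parameter value) is a common convention and is an assumption here; the definitions adopted in the simulation study itself take precedence. The function name and toy estimates are hypothetical.

```python
import math
from statistics import mean


def relative_measures(estimates, true_value):
    """Return (RB, RRMSE, RMAE) of Monte Carlo estimates relative to the true value."""
    if true_value == 0:
        raise ValueError("relative measures are undefined for a zero true value")
    errors = [est - true_value for est in estimates]
    rb = mean(errors) / true_value                                  # signed relative bias
    rrmse = math.sqrt(mean(e * e for e in errors)) / abs(true_value)
    rmae = mean(abs(e) for e in errors) / abs(true_value)
    return rb, rrmse, rmae


# Toy replications of one estimator, for illustration only.
rb, rrmse, rmae = relative_measures([1.8, 2.2, 2.3], true_value=2.0)
print(rb, rrmse, rmae)
```

Under this convention the RRMSE is never smaller than the RMAE, a pattern that also holds in every row of the tables.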

Table A3.

RB, RRMSE and RMAE from simulated data for the indicated cases.

        n = 25 n = 50 n = 100
  Censoring q Parameter RB RRMSE RMAE RB RRMSE RMAE RB RRMSE RMAE
Log-PE 10% 0.10 β0 0.1384 1.0672 0.8466 −0.0013 0.7284 0.5754 0.0347 0.4928 0.3885
      β1 0.0055 0.5653 0.4353 0.0414 0.4093 0.3277 0.0151 0.2528 0.1956
      κ0 −0.0689 0.1922 0.1491 −0.0301 0.1327 0.1056 −0.0134 0.0840 0.0675
      κ1 0.1852 0.6635 0.5181 0.0360 0.3166 0.2552 0.0323 0.2465 0.1988
Powell     β0 −0.6721 2.7554 1.6954 −2.1507 7.1013 2.9568 −1.8793 6.9061 2.5035
      β1 0.3991 1.5360 1.0998 −0.6379 3.6862 1.8388 0.0528 3.1405 1.3626
Portnoy     β0 −0.5907 2.3943 1.5534 −0.9888 4.3779 1.6299 −1.2863 3.1860 1.6451
      β1 0.5390 1.4984 1.1252 0.7231 2.4029 1.1043 0.7421 1.4551 0.9087
Peng-Huang     β0 −1.3971 3.2524 2.0838 −2.1049 4.0427 2.4967 −2.0706 4.9947 2.3498
      β1 0.7602 1.9042 1.5156 1.0617 2.3874 1.6791 0.8418 2.0581 1.1636
Log-PE   0.50 β0 −0.0444 0.7616 0.6071 −0.0818 0.5009 0.4013 −0.0139 0.3214 0.2503
      β1 −0.0181 0.5756 0.4433 0.0367 0.4118 0.3309 0.0166 0.2501 0.1951
      κ0 −0.0557 0.1828 0.1429 −0.0283 0.1295 0.1024 −0.0091 0.0856 0.0681
      κ1 0.2246 0.9145 0.7502 0.0815 0.5129 0.4007 0.0628 0.3909 0.3107
Powell     β0 −0.0334 0.8392 0.6640 −0.0729 0.5721 0.4580 0.0074 0.3720 0.2901
      β1 −0.0269 0.6266 0.4776 0.0243 0.4838 0.3887 0.0023 0.2936 0.2296
Portnoy     β0 −0.0221 0.8345 0.6613 −0.0729 0.5632 0.4475 0.0053 0.3701 0.2867
      β1 −0.0284 0.6280 0.4748 0.0224 0.4848 0.3881 0.0022 0.2886 0.2263
Peng-Huang     β0 0.0219 0.8361 0.6641 −0.0958 0.5800 0.4630 −0.0494 0.3739 0.2935
      β1 −0.0254 0.6389 0.4854 0.0257 0.4882 0.3910 0.0023 0.2928 0.2304
Log-PE   0.90 β0 −0.2088 0.9744 0.8001 −0.1864 0.7440 0.5903 −0.0650 0.4539 0.3655
      β1 −0.0297 0.5144 0.4067 0.0382 0.4134 0.3312 0.0166 0.2500 0.1949
      κ0 −0.0556 0.1747 0.1386 −0.0320 0.1294 0.1032 −0.0124 0.0828 0.0665
      κ1 0.1602 0.5975 0.4830 0.0937 0.3743 0.3012 0.0438 0.2648 0.2117
Powell     β0 −0.0805 1.4013 1.1274 −0.0805 0.9817 0.7972 −0.0028 0.6137 0.4898
      β1 −0.0823 0.9390 0.7299 0.1319 0.8866 0.7013 −0.0070 0.5119 0.4103
Portnoy     β0 −0.0738 1.2981 1.0494 −0.0280 0.9489 0.7748 0.0042 0.6218 0.4932
      β1 −0.0719 0.9187 0.7180 0.1187 0.8450 0.6777 −0.0125 0.5050 0.4036
Peng-Huang     β0 0.6397 1.6876 1.2664 0.0731 1.0289 0.8317 −0.0308 0.6166 0.4935
      β1 −0.0852 1.1035 0.8781 0.1231 0.8790 0.7030 −0.0074 0.5086 0.4073
Log-PE 40% 0.10 β0 0.3619 1.4767 1.1750 −0.0389 0.9883 0.7781 0.0467 0.6877 0.5514
      β1 −0.0080 0.6619 0.5297 0.0737 0.4716 0.3684 0.0088 0.2946 0.2321
      κ0 −0.1399 0.2890 0.2210 −0.0377 0.1631 0.1286 −0.0204 0.1105 0.0887
      κ1 0.3149 0.8399 0.6308 0.0464 0.3414 0.2701 0.0418 0.2655 0.2110
Powell     β0 2.4789 3.4527 3.1471 2.6180 3.5536 3.3332 2.2081 5.6156 3.9753
      β1 −0.2191 1.2845 1.0825 −0.3330 1.3566 1.1505 −0.4379 1.8923 1.2370
Portnoy     β0 1.6025 3.2917 2.8021 −0.7189 5.4671 3.3247 −2.1865 7.5831 4.1709
      β1 0.6165 1.7084 1.2353 1.2614 3.3669 2.1392 1.4990 2.8490 1.7677
Peng-Huang     β0 −0.6804 3.2222 2.3646 −1.9252 4.3854 2.7560 −10.7373 15.5892 11.1410
      β1 1.8081 2.4192 1.9235 2.1521 3.2258 2.4718 4.3267 5.6656 4.3716
Log-PE   0.50 β0 −0.1155 0.8514 0.6769 −0.1329 0.5649 0.4492 −0.0212 0.3616 0.2895
      β1 0.0257 0.6601 0.5114 0.0705 0.4685 0.3689 0.0145 0.2820 0.2208
      κ0 −0.0929 0.2470 0.1924 −0.0313 0.1558 0.1251 −0.0165 0.1086 0.0861
      κ1 0.3852 1.3960 1.1080 0.0434 0.6482 0.4967 0.0561 0.4511 0.3569
Powell     β0 −0.5423 2.0553 1.1863 −0.4349 1.6693 0.8268 −0.3318 1.8591 0.6498
      β1 0.2307 1.2243 0.8322 0.2188 1.0260 0.6315 0.1722 0.8218 0.4464
Portnoy     β0 −0.4457 1.9939 1.0416 −0.1954 0.7329 0.5408 −0.0670 0.4447 0.3435
      β1 0.2215 1.3833 0.7818 0.1029 0.6299 0.4654 0.0404 0.3361 0.2606
Peng-Huang     β0 −0.5049 3.8184 1.1518 −0.2208 0.7274 0.5590 −0.1355 0.4849 0.3693
      β1 0.3089 2.3524 0.8460 0.0986 0.6380 0.4791 0.0528 0.3607 0.2824
Log-PE   0.90 β0 −0.3588 1.1051 0.9109 −0.1942 0.7816 0.6194 −0.0810 0.4818 0.3868
      β1 0.0033 0.5826 0.4480 0.0694 0.4464 0.3497 0.0268 0.2703 0.2152
      κ0 −0.0650 0.2296 0.1751 −0.0220 0.1489 0.1189 −0.0111 0.1059 0.0854
      κ1 0.2182 0.8758 0.6678 0.0819 0.4809 0.3823 0.0515 0.3141 0.2491
Powell     β0 −0.0360 1.3816 1.1077 −0.0740 0.9848 0.7957 0.0012 0.6121 0.4880
      β1 −0.1382 0.8689 0.6732 0.0968 0.8490 0.6661 −0.0106 0.5058 0.4068
Portnoy     β0 −0.1084 1.3262 1.0780 −0.0291 0.9586 0.7788 0.0043 0.6225 0.4932
      β1 −0.0166 1.0442 0.7869 0.1083 0.8425 0.6677 −0.0131 0.5044 0.4031
Peng-Huang     β0 0.5940 1.7319 1.3086 0.0572 1.0351 0.8324 −0.0308 0.6137 0.4918
      β1 0.0175 1.2963 0.9787 0.1286 0.8952 0.7081 −0.0040 0.5090 0.4063

Table A4.

RB, RRMSE and RMAE from simulated data for the indicated cases (censoring = 25%).

      n = 25 n = 50 n = 100
  q Parameter RB RRMSE RMAE RB RRMSE RMAE RB RRMSE RMAE
Log-NO 0.10 β0 −0.0225 0.0449 0.0336 −0.0189 0.0323 0.0256 −0.0643 0.0672 0.0645
    β1 0.0003 0.0227 0.0180 −0.0008 0.0165 0.0133 −0.0136 0.0191 0.0155
Log-t   β0 0.0026 0.0579 0.0455 0.0115 0.0345 0.0272 0.0244 0.0343 0.0289
    β1 −0.0019 0.0240 0.0195 0.0005 0.0175 0.0141 0.0005 0.0093 0.0075
Log-PE   β0 −0.0749 0.0838 0.0750 −0.0732 0.0837 0.0752 −0.1294 0.1358 0.1295
    β1 −0.0052 0.0264 0.0215 −0.0095 0.0283 0.0204 −0.0117 0.0242 0.0190
EBS   β0 −0.0932 0.1017 0.0932 −0.0937 0.1008 0.0937 −0.1685 0.1767 0.1685
    β1 −0.0106 0.0266 0.0220 0.0025 0.0192 0.0156 −0.0080 0.0211 0.0170
Powell   β0 0.0028 0.0852 0.0642 0.0000 0.0587 0.0469 0.0054 0.0409 0.0333
    β1 −0.0020 0.0565 0.0435 −0.0000 0.0415 0.0327 −0.0042 0.0255 0.0198
Portnoy   β0 −0.0313 0.1144 0.0764 −0.0158 0.0670 0.0496 −0.0008 0.0404 0.0326
    β1 0.0131 0.0551 0.0421 0.0072 0.0396 0.0318 −0.0021 0.0248 0.0199
Peng-Huang   β0 −0.0361 0.1182 0.0807 −0.0286 0.0834 0.0579 −0.0120 0.0448 0.0345
    β1 0.0100 0.0649 0.0482 0.0084 0.0445 0.0350 −0.0020 0.0269 0.0215
Log-NO 0.30 β0 −0.0399 0.0539 0.0434 −0.0330 0.0404 0.0347 −0.0475 0.0512 0.0478
    β1 −0.0002 0.0200 0.0157 −0.0065 0.0205 0.0168 −0.0003 0.0124 0.0098
Log-t   β0 0.0205 0.0445 0.0350 0.0158 0.0325 0.0261 −0.0021 0.0192 0.0150
    β1 −0.0021 0.0237 0.0193 0.0007 0.0168 0.0136 0.0004 0.0092 0.0075
Log-PE   β0 −0.0339 0.0504 0.0398 −0.0254 0.0377 0.0316 −0.0424 0.0488 0.0429
    β1 0.0013 0.0239 0.0196 −0.0031 0.0190 0.0152 −0.0048 0.0147 0.0119
EBS   β0 −0.0703 0.0815 0.0715 −0.0669 0.0766 0.0681 −0.0980 0.1002 0.0980
    β1 −0.0064 0.0256 0.0206 −0.0028 0.0189 0.0153 0.0090 0.0141 0.0115
Powell   β0 −0.0118 0.0490 0.0381 −0.0067 0.0358 0.0278 0.0019 0.0237 0.0189
    β1 0.0029 0.0284 0.0230 0.0022 0.0251 0.0203 −0.0026 0.0148 0.0122
Portnoy   β0 −0.0181 0.0548 0.0404 −0.0069 0.0346 0.0273 −0.0009 0.0228 0.0180
    β1 0.0058 0.0303 0.0241 0.0019 0.0240 0.0195 −0.0011 0.0140 0.0115
Peng-Huang   β0 −0.0190 0.0582 0.0420 −0.0104 0.0370 0.0292 −0.0050 0.0247 0.0196
    β1 0.0065 0.0317 0.0253 0.0028 0.0249 0.0200 −0.0008 0.0145 0.0119
Log-NO 0.70 β0 −0.0084 0.0291 0.0230 0.0019 0.0237 0.0191 0.0273 0.0371 0.0317
    β1 −0.0001 0.0222 0.0178 −0.0052 0.0194 0.0154 −0.0118 0.0183 0.0154
Log-t   β0 −0.0056 0.0322 0.0254 −0.0051 0.0237 0.0190 −0.0078 0.0159 0.0126
    β1 −0.0025 0.0241 0.0195 0.0008 0.0165 0.0133 0.0004 0.0092 0.0075
Log-PE   β0 0.0122 0.0348 0.0276 0.0213 0.0357 0.0291 0.0376 0.0446 0.0384
    β1 0.0016 0.0226 0.0184 −0.0086 0.0258 0.0206 −0.0037 0.0164 0.0128
EBS   β0 0.0055 0.0355 0.0281 0.0234 0.0426 0.0316 0.0317 0.0382 0.0326
    β1 −0.0057 0.0257 0.0208 −0.0086 0.0306 0.0227 0.0033 0.0172 0.0137
Powell   β0 −0.0067 0.0310 0.0238 −0.0030 0.0236 0.0184 −0.0015 0.0154 0.0121
    β1 −0.0013 0.0227 0.0182 0.0015 0.0170 0.0136 −0.0003 0.0096 0.0076
Portnoy   β0 −0.0051 0.0282 0.0211 −0.0022 0.0224 0.0177 −0.0020 0.0156 0.0123
    β1 −0.0004 0.0203 0.0165 0.0017 0.0163 0.0130 0.0004 0.0094 0.0075
Peng-Huang   β0 −0.0011 0.0283 0.0220 −0.0028 0.0233 0.0183 −0.0042 0.0164 0.0129
    β1 −0.0001 0.0212 0.0171 0.0026 0.0166 0.0133 0.0010 0.0098 0.0079
Log-NO 0.90 β0 0.0282 0.0394 0.0327 0.0361 0.0420 0.0372 0.0666 0.0717 0.0666
    β1 −0.0040 0.0218 0.0176 −0.0030 0.0171 0.0137 −0.0018 0.0148 0.0119
Log-t   β0 0.0477 0.0643 0.0540 0.0268 0.0408 0.0323 0.0133 0.0207 0.0169
    β1 −0.0015 0.0236 0.0191 0.0008 0.0162 0.0130 0.0003 0.0092 0.0075
Log-PE   β0 0.1032 0.1102 0.1037 0.1166 0.1198 0.1166 0.1631 0.1648 0.1631
    β1 −0.0022 0.0248 0.0200 −0.0042 0.0194 0.0151 0.0018 0.0146 0.0111
EBS   β0 0.0566 0.0663 0.0579 0.0820 0.0864 0.0820 0.1154 0.1195 0.1154
    β1 −0.0037 0.0241 0.0188 −0.0049 0.0230 0.0178 −0.0114 0.0296 0.0269
Powell   β0 −0.0026 0.0310 0.0235 −0.0027 0.0240 0.0193 −0.0023 0.0158 0.0128
    β1 −0.0056 0.0245 0.0182 −0.0027 0.0181 0.0140 0.0001 0.0105 0.0082
Portnoy   β0 −0.0038 0.0269 0.0210 −0.0017 0.0218 0.0175 −0.0013 0.0147 0.0119
    β1 −0.0033 0.0217 0.0167 −0.0007 0.0152 0.0121 0.0003 0.0102 0.0081
Peng-Huang   β0 0.0056 0.0294 0.0238 −0.0032 0.0238 0.0188 −0.0046 0.0156 0.0126
    β1 −0.0023 0.0234 0.0176 0.0020 0.0167 0.0133 0.0022 0.0103 0.0082

Funding Statement

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) Finance Code 001. Danúbia Rodrigues thanks CAPES-PROEX for the doctorate scholarship. Jose Angelo Divino and Helton Saulo acknowledge Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support.

Notes

1

Available at https://www.ibge.gov.br/

3

We do not use the most recent Continuous PNAD data set because of a lack of clarity in its dictionary regarding some variables of interest. This is the case for the labor experience/skill variable, which is the number of years of work in the main activity, and for marital status, which is also not clearly defined in the Continuous PNAD. It is worth mentioning that the variables of the two samples, PNAD and PSID, need not be similar; we only need a censored variable for female hours worked. The 2015 PNAD data set is also used only to illustrate the proposed methodology.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1.Amemiya T., Tobit models: A survey, J. Econom. 24 (1984), pp. 3–61. [Google Scholar]
  • 2.Barros M., Galea M., Leiva V., and Santos-Neto M., Generalized tobit models: diagnostics and application in econometrics, J. Appl. Stat. 45 (2018), pp. 145–167. [Google Scholar]
  • 3.Barros R., Jatobá J., and Mendonça R., A evolução da participação de mulheres no mercado de trabalho: uma análise de decomposição, Proceedings of the 4th National Meeting of Labor Studies, Brazilian Association of Labor Studies, 1995.
  • 4.Cox D. and Hinkley D., Theoretical Statistics, Chapman and Hall, London, 1974.
  • 5.Davino C., Furno M., and Vistocco D., Quantile Regression, Wiley, Chichester, 2014.
  • 6.Desousa M.F., Saulo H., Leiva V., and Scalco P., On a tobit-Birnbaum-Saunders model with an application to medical data, J. Appl. Stat. 45 (2018), pp. 932–955.
  • 7.Fair R., A note on computation of the tobit estimator, Econometrica 45 (1977), pp. 1723–1727.
  • 8.Fair R., A theory of extramarital affairs, J. Political Econom. 86 (1978), pp. 45–61.
  • 9.Geraci M., Qtools: A collection of models and tools for quantile inference, R. J. 8 (2016), pp. 117–138.
  • 10.Gilchrist W., Statistical Modelling with Quantile Functions, 1st ed., Chapman & Hall/CRC, Boca Raton, FL, 2000.
  • 11.Greene W.H., Econometric Analysis, 7th ed., Pearson Education, New York, NY, 2012.
  • 12.Hao L. and Naiman D., Quantile Regression, Sage Publications, California, 2007.
  • 13.Heckman J.J. and MaCurdy T.E., A life cycle model of female labor supply, Rev. Econom. Stud. 47 (1980), pp. 47–74.
  • 14.Helsel D.R., Statistics for Censored Environmental Data Using Minitab and R, John Wiley & Sons, Hoboken, NJ, 2011.
  • 15.ILO, World Employment and Social Outlook: Trends for Women 2018 - Global snapshot. International Labour Organization, Geneva, 2018.
  • 16.Islam N., A dynamic tobit model of female labor supply, Working Papers In Economics No 259, 2007, pp. 1–29.
  • 17.Jacobsen J., Labor force participation, Q. Rev. Econ. Financ. 39 (1999), pp. 597–610.
  • 18.Jarque C., An application of LDV models to household expenditure analysis in Mexico, J. Econom. 36 (1987), pp. 31–53.
  • 19.Jones M.C., On reciprocal symmetry, J. Stat. Plan. Inference. 138 (2008), pp. 3039–3043.
  • 20.Kano Y., Berkane M., and Bentler P.M., Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations, J. Am. Stat. Assoc. 88 (1993), pp. 135–143.
  • 21.Koenker R., Quantile Regression, Cambridge University Press, Cambridge, 2005.
  • 22.Koenker R., quantreg: Quantile Regression. R package version 5.85, 2021.
  • 23.Koenker R. and Bassett Jr G., Regression quantiles, Econometrica 46 (1978), pp. 33–50.
  • 24.Long J.S., Regression Models for Categorical and Limited Dependent Variables, Sage Publications Inc., Thousand Oaks, 1997.
  • 25.Lucas A., Robustness of the student t based M-estimator, Commun. Stat.: Theor. Meth. 41 (1997), pp. 1165–1182.
  • 26.Medeiros M.C. and Ferrari S.L.P., Small-sample testing inference in symmetric and log-symmetric linear regression models, Stat. Neerl. 71 (2017), pp. 200–224.
  • 27.Melenberg B. and van Soest A., Parametric and semi-parametric modelling of vacation expenditures, J. Appl. Econom. 11 (1996), pp. 59–76.
  • 28.Moffitt R., The tobit model, hours of work and institutional constraints, Rev. Econ. Stat. 64 (1982), pp. 510–515.
  • 29.Mroz T., The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions, Econometrica 55 (1987), pp. 765–799.
  • 30.Peng L. and Huang Y., Survival analysis with quantile regression models, J. Amer. Stat. Assoc. 103 (2008), pp. 637–649.
  • 31.Portnoy S., Censored regression quantiles, J. Amer. Stat. Assoc. 98 (2003), pp. 1001–1012.
  • 32.Powell J.L., Censored regression quantiles, J. Econom. 32 (1986), pp. 143–155.
  • 33.R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2020.
  • 34.Santos J. and Cribari-Neto F., Hypothesis testing in log-Birnbaum-Saunders regressions, Commun. Stat. – Simul. Comput. 46 (2017), pp. 3990–4003.
  • 35.Saulo H., Dasilva A., Leiva V., Sánchez L., and Fuente-Mella H.L., Log-symmetric quantile regression models, Stat. Neerl. (2021a), pp. 1–40. doi: 10.1111/stan.12243
  • 36.Saulo H., Leão J., Nobre L., and Balakrishnan N., A class of asymmetric regression models for left-censored data, Brazilian J. Probab. Stat. 35 (2021b), pp. 62–84.
  • 37.Scorzafave L.G. and Menezes-Filho N.A., Participação feminina no mercado de trabalho brasileiro: evolução e determinantes, Pesquisa E Planejamento Econômico. 31 (2001), pp. 441–478.
  • 38.Seung-Hoon Y., Analysing household bottled water and water purifier expenditures: simultaneous equation bivariate tobit model, Appl. Econ. Lett. 12 (2005), pp. 297–301.
  • 39.Silva G.O., Ortega E.M., and Cordeiro G.M., A log-extended weibull regression model, Comput. Stat. Data. Anal. 53 (2009), pp. 4482–4489.
  • 40.Stute W., Strong consistency of the mle under random censoring, Metrika 39 (1992), pp. 257–267.
  • 41.Therneau T., Grambsch P., and Fleming T., Martingale-based residuals for survival models, Biometrika 77 (1990), pp. 147–160.
  • 42.Tobin J., Estimation of relationships for limited dependent variables, Econometrica 26 (1958), pp. 24–36.
  • 43.Vanegas L.H. and Paula G.A., A semiparametric approach for joint modeling of median and skewness, Test 24 (2015), pp. 110–135.
  • 44.Vanegas L.H. and Paula G.A., Log-symmetric distributions: statistical properties and parameter estimation, Brazilian J. Probab. Stat. 30 (2016a), pp. 196–220.
  • 45.Vanegas L.H. and Paula G.A., ssym: Fitting Semi-Parametric Log-Symmetric Regression Models, 2016b. R package version 1.5.7.
  • 46.Vanegas L.H. and Paula G.A., Log-symmetric regression models under the presence of non-informative left- or right-censored observations, Test 26 (2017), pp. 405–428.
  • 47.Weisberg S., Applied Linear Regression, 4th ed., John Wiley & Sons, Hoboken, NJ, 2014.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
