On a log-symmetric quantile tobit model applied to female labor supply data

Danúbia R Cunha; Jose Angelo Divino; Helton Saulo

doi:10.1080/02664763.2021.1976120

2021 Sep 11;49(16):4225–4253. doi: 10.1080/02664763.2021.1976120


PMCID: PMC9639474 PMID: 36353305

Abstract

The study of female labor supply has been a topic of relevance in the economic literature. Generally, the data are left-censored and the classic tobit model has been extensively used in the modeling strategy. This model, however, assumes normality for the error distribution and is not recommended for data with positive skewness, heavy tails and heteroscedasticity, as is the case of female labor supply data. Moreover, it is well known that the quantile regression approach accounts for the influences of different quantiles in the estimated coefficients. We take all these features into account and propose a parametric quantile tobit regression model based on quantile log-symmetric distributions. The proposed method allows one to model data with positive skewness (for which the classic tobit model is not suitable), to study the influence of the quantiles of interest, and to account for heteroscedasticity. The model parameters are estimated by maximum likelihood and a Monte Carlo experiment is performed to evaluate alternative estimators. The new method is applied to two distinct female labor supply data sets. The results indicate that the log-symmetric quantile tobit model fits the data better than the classic tobit model.

Keywords: Log-symmetric distributions, quantile regression, Monte Carlo simulation, PNAD, PSID

1. Introduction

The modeling of female labor supply data is a relevant issue because the participation rate of women in the labor market has been a transforming factor in the labor market. In the United States, for instance, Jacobsen [17] claims that the increase in female participation in the labor market is the most striking economic statistic in the twentieth century. As for Brazil, Scorzafave and Menezes-Filho [37] emphasize that there was a strong increase in female participation rates, especially for women with 1–11 years of study. On the other hand, Barros et al. [3] highlight that the study of the mechanisms and motivations to explain the increase in the female participation rate, coupled with the fact that this rate is a basic socioeconomic indicator, has boosted the researchers' attention. However, even today, women are less likely to participate in the labor market than men. In addition, they are also more likely to be unemployed in most countries, according to a recent study by the International Labour Organization [15].

In general, data on female labor supply is left-censored. The fraction of women that does not work declares zero income, which is classified as censored. The censoring occurs when the information on the dependent variable is not fully available for some units of the sample. However, for these units, data on explanatory variables are known across their domain. Thus, in these cases, it is necessary to work with models that take censoring into account, such as the classic tobit model; see Long [24]. This model was introduced by Tobin [42] and has been widely used in empirical literature; see, for instance, Amemiya [1], Helsel [14] and Barros et al. [2]. It is applied to left-censored dependent variables, being motivated by a study on the relationship between family spending on durable goods and family income. In the data, some values of the dependent variable (family expenditure) were zero, making it censored at a fixed limit value. Other studies in which the dependent variable is censored at zero for some observations include Fair [7,8] for modeling the number of extramarital affairs, Jarque [18] for family spending in several groups of commodities, and Melenberg and van Soest [27] for holiday expenses, among others.

The classic tobit [42] model might be used to estimate labor supply regressions. By fixing working hours as the dependent variable, there is the possibility that it takes the value zero. This happens when one or more individuals do not work, that is, do not supply hours of work; see details in Moffitt [28]. According to Heckman and MaCurdy [13], which was discussed by Islam [16], the censored model is relevant in cases where the sample consists of individuals randomly chosen and with labor hours reported as zero if the individual does not work. In this case, the techniques used for estimating linear models are inappropriate due to the presence of censoring.

One of the limitations of the classic tobit model [42], however, is the assumption that the error term is normally distributed. Although the normal distribution is widely used in applications, it might not be appropriate when positive skewness, heavy tails and heteroscedasticity are present in the dependent variable, which is the case of female labor supply data. In this context, the class of log-symmetric distributions is an important tool to overcome this limitation. The log-symmetric distributions are a generalization of the log-normal distribution and have as their special cases distributions that have lighter or heavier tails than those of the log-normal, as well as bimodal distributions; see, for instance, Jones [19], Vanegas and Paula [43–46] and Medeiros and Ferrari [26]. In addition to the log-normal distribution, other examples of log-symmetric distributions are the log-Student-t, log-power-exponential and extended Birnbaum-Saunders, among others.

Although important, the use of an appropriate distribution to describe the error distribution of a tobit model does not provide a more comprehensive picture of the effect of the explanatory variables on the dependent variable. In this sense, quantile regression plays an important role, being able to model conditional quantiles as a function of explanatory variables; details on quantile regression can be seen in Koenker and Bassett Jr [23], Koenker [21], Hao and Naiman [12] and Davino et al. [5]. In addition, quantile regression modeling is more efficient in cases where errors are not normally distributed or when the dependent variable has extreme values.

In this context, this paper proposes a quantile tobit model based on log-symmetric distributions. The new tobit model applies a reparameterization of the log-symmetric distributions suggested by Saulo et al. [35], which has the quantile as one of its parameters. The secondary objectives are: (i) to obtain the maximum likelihood estimates of the model parameters; (ii) to carry out a Monte Carlo simulation to evaluate the performance of the maximum likelihood estimates; and, (iii) to apply the proposed methodology to two female labor supply data sets. The first application uses data from the Brazilian National Household Sample Survey (PNAD) for the year of 2015, whereas the second one employs data extracted from the Panel Study of Income Dynamics (PSID), which is an American longitudinal household survey. The PNAD data were obtained from the Brazilian Institute of Geography and Statistics (IBGE),¹ while the PSID data were studied by Barros et al. [2] by using the classical tobit model. Note that there are at least three approaches to estimate quantile regressions: (i) the distribution-free (semi-parametric) approach, which includes the works of Powell [32], Portnoy [31] and Peng and Huang [30]; (ii) the pseudo-likelihood method through an asymmetric Laplace distribution; and (iii) the parametric approach with traditional maximum likelihood (ML) framework [10]. Our proposed methodology falls within the third category and can be seen as a generalization of the works by Desousa et al. [6] and Saulo et al. [36]. In general, the Monte Carlo simulations and the applications to the PNAD and PSID data sets show that the proposed log-symmetric quantile tobit models provide good fit to the data.

The advantages of the proposed log-symmetric quantile tobit model over the classic normal tobit model are: (i) greater flexibility in terms of distributional assumptions, since the log-symmetric class incorporates several special cases, such as the log-normal, log-Student-t, log-power-exponential, and extended Birnbaum-Saunders, among others; (ii) greater flexibility for data modeling, allowing for considering the effects of explanatory variables along the spectrum of the dependent variable due to the quantile approach; and (iii) flexibility to accommodate heteroscedasticity, since the proposed tobit model allows for the inclusion of explanatory variables in the dispersion parameter.

The paper is organized as follows. Section 2 briefly describes the class of log-symmetric distributions, both in its classical representation and in the quantile reparameterization as proposed by Saulo et al. [35]. Section 3 introduces the log-symmetric quantile tobit model and provides details on the estimation, interpretation of the coefficients and residual analysis. Section 4 carries out a Monte Carlo simulation to assess the performance of the maximum likelihood estimates. Section 5 performs two empirical applications to the PNAD and PSID data sets. Finally, Section 6 is dedicated to the concluding remarks and discussions of further research.

2. Log-symmetric distributions

This section briefly describes the classical log-symmetric distributions [44] and those based on the quantile proposed by Saulo et al. [35]. The log-symmetric distributions reparameterized by the quantile, that is, those that have the quantile as one of their parameters, will be used to derive the log-symmetric quantile tobit model.

2.1. Classical log-symmetric distributions

A random variable T follows a log-symmetric distribution with scale parameter $λ > 0$ and power parameter $ϕ > 0$ , if its probability density function and cumulative distribution function are given by

f_{T} (t; λ, ϕ) = \frac{1}{\sqrt{ϕ} t} g (\frac{1}{ϕ} {[\log (t) - \log (λ)]}^{2}), t > 0,

(1)

and

F_{T} (t; λ, ϕ) = G (\frac{\log (t) - \log (λ)}{\sqrt{ϕ}}), t > 0,

(2)

respectively, where $G (ω) = η \int_{- \infty}^{ω} g (z^{2}) d z$ , $ω \in R$ , with η being a normalizing constant and $g (\cdot)$ a density generator. In this case, the notation $T \sim LS (λ, ϕ, g)$ is used. Note that as $ϕ \to 0^{+}$ or $ϕ \to \infty$ , $f_{T} (t; λ, ϕ)$ tends to 0. The $100 q$ -th quantile of $T \sim LS (λ, ϕ, g)$ is given by

Q = Q_{T} (q; λ, ϕ) = λ \exp (\sqrt{ϕ} G^{- 1} (q)), q \in (0, 1),

(3)

where $G^{- 1}$ is the inverse of G given in (2). Table 1 presents some density generators g for some log-symmetric distributions; see details in Vanegas and Paula [44]. Note that the generator g may involve an extra parameter ξ.

Table 1.

Density generator $g (u)$ for some log-symmetric distributions.

Distribution	$g (u)$
Log-normal( $λ, ϕ$ )	∝ $\exp (- \frac{1}{2} u)$
Log-Student-t( $λ, ϕ, ξ$ )	∝ $(1 + \frac{u}{ξ})^{- \frac{ξ + 1}{2}}$ , $ξ > 0$
Log-power-exponential( $λ, ϕ, ξ$ )	∝ $\exp (- \frac{1}{2} u^{\frac{1}{1 + ξ}})$ , $- 1 < ξ \leq 1$
Extended Birnbaum-Saunders( $λ, ϕ, ξ$ )	∝ $\cosh (u^{1 / 2}) \exp (- \frac{2}{ξ^{2}} \sinh^{2} (u^{1 / 2}))$ , $ξ > 0$


2.2. Quantile-based log-symmetric distributions

Consider a fixed number $q \in (0, 1)$ and Q the $100 q$ -th quantile of $T \sim LS (λ, ϕ, g)$ given in (3). Then, considering the one-to-one transformation $(λ, ϕ) \mapsto (Q, ϕ)$ , Saulo et al. [35] proposed a reparameterization of the classical log-symmetric distribution, where the probability density function and the cumulative distribution function are given respectively by

f_{T} (t; Q, ϕ) = \frac{1}{\sqrt{ϕ} t} g (\frac{1}{ϕ} {[\log (t) - \log (Q) + \sqrt{ϕ} z_{q}]}^{2}), t > 0,

(4)

and

F_{T} (t; Q, ϕ) = G (\frac{\log (t) - \log (Q) + \sqrt{ϕ} z_{q}}{\sqrt{ϕ}}), t > 0,

(5)

where $z_{q} = G^{- 1} (q)$ , with G being given as in (2). In this case, the notation $T \sim QLS (Q, ϕ, g)$ is used. If $T \sim QLS (Q, ϕ, g)$ , Saulo et al. [35] have shown that the following properties hold: (a) $c T \sim QLS (c Q, ϕ, g)$ , with c>0; (b) $T^{c} \sim QLS (Q^{c}, c^{2} ϕ, g)$ , with c>0. We then readily have the following relation:

T = Q ϵ^{\sqrt{ϕ}} \sim QLS (Q, ϕ, g), where ϵ \sim QLS (1, 1, g) .

(6)
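Relation (6) also gives a direct way to simulate from the quantile-based family. The following is a minimal sketch for the log-normal generator only, where G is the standard normal distribution function, so that $\log (ϵ) + z_{q}$ is standard normal when $ϵ \sim QLS (1, 1, g)$. The function name and the numerical values are illustrative assumptions, not part of the original work.

```python
import math
import random
from statistics import NormalDist


def sample_qls_lognormal(Q, phi, q, size, rng):
    """Draw from QLS(Q, phi, g), log-normal generator, via T = Q * eps**sqrt(phi)."""
    z_q = NormalDist().inv_cdf(q)  # z_q = G^{-1}(q) with G the standard normal CDF
    draws = []
    for _ in range(size):
        # eps ~ QLS(1, 1, g): log(eps) + z_q follows the standard normal law
        eps = math.exp(rng.gauss(0.0, 1.0) - z_q)
        draws.append(Q * eps ** math.sqrt(phi))
    return draws


rng = random.Random(2021)          # fixed seed, chosen arbitrarily
Q, phi, q = 3.0, 0.5, 0.25         # illustrative values
sample = sample_qls_lognormal(Q, phi, q, 20000, rng)

# By construction Q is the 100q-th quantile, so this share should be close to q.
share_below_Q = sum(t <= Q for t in sample) / len(sample)
print(round(share_below_Q, 3))
```

The empirical share of draws below Q should be close to q, which is the defining property of the reparameterization.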

Figure 1 displays different shapes of the quantile-based log-symmetric densities for different combinations of parameters. From this figure, we observe that the quantile parameter Q modifies the kurtosis. Specifically, as Q increases the kurtosis decreases; see Figure 1(a,d,g,j). We also observe that the power parameter ϕ controls the relative dispersion (or skewness), such that when ϕ increases the relative dispersion (or skewness) also increases; see Figure 1(b,e,h,l). The extra parameter ξ in the log-Student-t and log-power-exponential cases modifies the kurtosis (Figure 1(f,i)), and in the extended Birnbaum-Saunders case this parameter modifies the skewness and kurtosis of the model (Figure 1(l)).

3. Log-symmetric quantile tobit model

Let $T_{i}$ be a positive variable that is left-censored at the point Ψ, namely, it is observable for values greater than Ψ and censored for values less than or equal to Ψ. Based on (6), the log-symmetric quantile tobit model can be formulated as

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, m, \\ T_{i}^{*} = Q_{i} ϵ_{i}^{\sqrt{ϕ_{i}}}, & T_{i}^{*} > Ψ, \quad i = m + 1, \dots, n, \end{cases}

(7)

where $ϵ_{i} \sim QLS (1, 1, g)$ , $Q_{i} = \exp (x_{i}^{⊤} β)$ , $ϕ_{i} = \exp (w_{i}^{⊤} κ)$ , $β = (β_{0}, \dots, β_{k})^{⊤}$ and $κ = (κ_{0}, \dots, κ_{l})^{⊤}$ are vectors of regression coefficients, and $x_{i} = (1, x_{i 1}, \dots, x_{i k})^{⊤}$ and $w_{i} = (1, w_{i 1}, \dots, w_{i l})^{⊤}$ are vectors of fixed and known explanatory variables associated with $Q_{i}$ and $ϕ_{i}$ , respectively.

The estimation of the parameters of the quantile log-symmetric tobit model presented in (7) can be done by the maximum likelihood method. Let $T = (T_{1}, \dots, T_{m}, T_{m + 1}, \dots, T_{n})^{⊤}$ be a sample of size n from the quantile log-symmetric tobit model that contains m left-censored data at Ψ and n−m uncensored data. Then, the corresponding likelihood function for the parameter vector $θ = (β^{⊤}, κ^{⊤})^{⊤}$ is given by

\begin{aligned} L (θ) = {} & \prod_{i = 1}^{m} G (\frac{\log (Ψ) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}}{\sqrt{ϕ_{i}}}) \\ & \times \prod_{i = m + 1}^{n} \frac{1}{\sqrt{ϕ_{i}}} g (\frac{[\log (t_{i}) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}]^{2}}{ϕ_{i}}), \end{aligned}

(8)

where $Q_{i}$ , $ϕ_{i}$ are as in (7), G is as in (2), and g is given in Table 1. By taking the logarithm of (8), we obtain the log-likelihood function $ℓ (θ)$ , that is,

ℓ (θ) = \sum_{i = 1}^{n} ℓ_{i} (θ),

(9)

where

ℓ_{i} (θ) = \begin{cases} \log (G (\frac{\log (Ψ) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}}{\sqrt{ϕ_{i}}})), & i = 1, \dots, m, \\ - \frac{1}{2} \log (ϕ_{i}) + \log (g (\frac{[\log (t_{i}) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}]^{2}}{ϕ_{i}})), & i = m + 1, \dots, n . \end{cases}
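As an illustration of (9), the sketch below evaluates the log-likelihood for the log-normal generator only, where $G$ is the standard normal distribution function and $\log g (u) = - u / 2$ up to an additive constant. The function name, the convention that observations with $t_{i} \leq Ψ$ are treated as censored, and the toy inputs are assumptions made for this example.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def loglik_lognormal_qtobit(beta, kappa, t, X, W, psi, q):
    """Log-likelihood (9) for the log-normal generator, up to an additive constant.

    Observations with t[i] <= psi are treated as left-censored at psi.
    """
    z_q = _STD.inv_cdf(q)
    total = 0.0
    for t_i, x_i, w_i in zip(t, X, W):
        log_Q = sum(b * x for b, x in zip(beta, x_i))              # log Q_i = x_i' beta
        sqrt_phi = math.exp(0.5 * sum(c * w for c, w in zip(kappa, w_i)))  # sqrt(phi_i)
        if t_i <= psi:
            xi_c = (math.log(psi) - log_Q) / sqrt_phi + z_q
            total += math.log(_STD.cdf(xi_c))                      # censored contribution
        else:
            xi = (math.log(t_i) - log_Q) / sqrt_phi + z_q
            total += -math.log(sqrt_phi) - 0.5 * xi * xi           # -0.5*log(phi_i) + log g(xi^2)
    return total


# Made-up data: two censored observations (t = psi = 1) and three uncensored ones.
t = [1.0, 1.0, 2.5, 4.0, 7.3]
X = [[1.0, -0.5], [1.0, 0.1], [1.0, 0.3], [1.0, 0.8], [1.0, 1.4]]
W = [[1.0]] * 5
value = loglik_lognormal_qtobit([0.8, 0.9], [-0.5], t, X, W, psi=1.0, q=0.5)
print(value)
```

A function of this form could be passed to a general-purpose optimizer; other generators from Table 1 would require replacing the normal distribution function and the `-0.5 * xi * xi` term.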

By taking the first derivative of $ℓ (θ)$ with respect to $β$ and $κ$ , we obtain the score vector, that is,

\dot{ℓ} (θ) = \frac{\partial ℓ (θ)}{\partial θ} = \sum_{i = 1}^{n} {\dot{ℓ}}_{i} (θ),

(10)

where ${\dot{ℓ}}_{i} (θ) = ({\dot{ℓ}}_{i β}^{⊤} (θ), {\dot{ℓ}}_{i κ}^{⊤} (θ))^{⊤}$ , with

\begin{aligned} {\dot{ℓ}}_{i β} (θ) & = \begin{cases} - \frac{1}{\sqrt{ϕ_{i}}} Π (ξ_{i}^{c}) x_{i}, & i = 1, \dots, m, \\ - \frac{2}{\sqrt{ϕ_{i}}} Δ (ξ_{i}^{2}) ξ_{i} x_{i}, & i = m + 1, \dots, n, \end{cases} \\ {\dot{ℓ}}_{i κ} (θ) & = \begin{cases} - \frac{1}{2 \sqrt{ϕ_{i}}} Π (ξ_{i}^{c}) w_{i} γ_{i}^{c}, & i = 1, \dots, m, \\ - \frac{1}{2} w_{i} - \frac{1}{\sqrt{ϕ_{i}}} Δ (ξ_{i}^{2}) ξ_{i} w_{i} γ_{i}, & i = m + 1, \dots, n, \end{cases} \end{aligned}

where $Π (ξ_{i}^{c}) = (d G (u) / d u |_{u = ξ_{i}^{c}}) / G (ξ_{i}^{c})$ and $Δ (ξ_{i}^{2}) = (d g (u) / d u |_{u = ξ_{i}^{2}}) / g (ξ_{i}^{2})$ , with

$ξ_{i}^{c} = (\log (Ψ) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}) / \sqrt{ϕ_{i}}$ , $ξ_{i} = (\log (t_{i}) - \log (Q_{i}) + \sqrt{ϕ_{i}} z_{q}) / \sqrt{ϕ_{i}}$ , $γ_{i}^{c} = \log (Ψ) - \log (Q_{i})$ and $γ_{i} = \log (t_{i}) - \log (Q_{i})$ .
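The score expressions can be checked numerically. The sketch below does so for the log-normal generator, for which $Π (u) = φ (u) / Φ (u)$ (standard normal density over distribution function) and $Δ (u) = - 1 / 2$, by comparing the analytic score of one observation with a central finite-difference gradient. All function names and numerical values are illustrative assumptions.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # supplies G (cdf) and dG/du (pdf) for the log-normal generator


def pieces(theta, k, t_i, x_i, w_i, psi, q):
    """Return (censored, xi, gamma, sqrt_phi); theta stacks beta (first k entries) and kappa."""
    log_Q = sum(b * x for b, x in zip(theta[:k], x_i))
    sqrt_phi = math.exp(0.5 * sum(c * w for c, w in zip(theta[k:], w_i)))
    censored = t_i <= psi
    gamma = math.log(psi if censored else t_i) - log_Q
    return censored, gamma / sqrt_phi + STD.inv_cdf(q), gamma, sqrt_phi


def loglik_i(theta, k, t_i, x_i, w_i, psi, q):
    censored, xi, _, sqrt_phi = pieces(theta, k, t_i, x_i, w_i, psi, q)
    if censored:
        return math.log(STD.cdf(xi))
    return -math.log(sqrt_phi) - 0.5 * xi * xi


def score_i(theta, k, t_i, x_i, w_i, psi, q):
    """Analytic score of one observation, following the expressions above."""
    censored, xi, gamma, sqrt_phi = pieces(theta, k, t_i, x_i, w_i, psi, q)
    if censored:
        Pi = STD.pdf(xi) / STD.cdf(xi)
        d_beta = [-Pi * x / sqrt_phi for x in x_i]
        d_kappa = [-Pi * gamma * w / (2.0 * sqrt_phi) for w in w_i]
    else:
        Delta = -0.5  # g'(u) / g(u) for g(u) proportional to exp(-u / 2)
        d_beta = [-2.0 * Delta * xi * x / sqrt_phi for x in x_i]
        d_kappa = [-0.5 * w - Delta * xi * gamma * w / sqrt_phi for w in w_i]
    return d_beta + d_kappa


def numeric_score_i(theta, *args, h=1e-6):
    """Central finite-difference gradient of loglik_i."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += h
        dn[j] -= h
        grad.append((loglik_i(up, *args) - loglik_i(dn, *args)) / (2.0 * h))
    return grad


theta = [0.5, 1.0, 0.2, 0.1]  # (beta_0, beta_1, kappa_0, kappa_1), made up


def max_gap(t_i):
    args = (2, t_i, [1.0, 0.4], [1.0, -0.3], 1.0, 0.25)  # k, t_i, x_i, w_i, psi, q
    pairs = zip(score_i(theta, *args), numeric_score_i(theta, *args))
    return max(abs(a - b) for a, b in pairs)


gap_censored = max_gap(1.0)    # t_i = psi, so the observation is censored
gap_uncensored = max_gap(3.2)
print(gap_censored, gap_uncensored)
```

Both gaps should be at the level of finite-difference error, which supports the analytic expressions in the log-normal case.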

The maximum likelihood estimate for $θ$ is obtained by maximizing the log-likelihood function (9), that is, by equating the score vector $\dot{ℓ} (θ)$ , which contains the first derivatives of $ℓ (θ)$ , to zero, which yields the likelihood equations. As these equations have no analytical solution, they are solved by using the Broyden–Fletcher–Goldfarb–Shanno (BFGS) iterative method for non-linear optimization. Note that initial values, ${\hat{θ}}_{0} = ({\hat{β}}_{0}^{⊤}, {\hat{κ}}_{0}^{⊤})^{⊤}$ , are required to start the iterative procedure and we obtain them from the R packages quantreg and ssym; see Koenker [22] and Vanegas and Paula [45]. These packages fit a quantile regression model (for ${\hat{β}}_{0}$ ) and a log-symmetric regression model (for ${\hat{κ}}_{0}$ ), respectively. In general, the estimates are quite stable in relation to the initial values. We assessed this through a numerical exercise: we replaced ${\hat{θ}}_{0}$ by ${\hat{θ}}_{0} + Ω$ in the estimation procedure, with $Ω = 0, 1, \dots, 10$ , and then checked the final estimates. We observed that the estimates are virtually the same for any value of Ω.

Note that in (8), the extra parameter ξ is assumed to be fixed. The reason for this lies in the works of Lucas [25] and Kano et al. [20]. The first work shows that the robustness of the Student-t distribution to outlying observations holds only when the degree of freedom parameter is fixed, instead of being estimated directly in the log-likelihood function. In the second work, the authors report difficulties in estimating the extra parameter for the power-exponential distribution. Therefore, by extending the arguments to the log-Student-t and log-power-exponential distributions, the extra parameter ξ is estimated using the profiled log-likelihood. Two basic steps are required:

(S1)
Consider a grid of values $ξ_{1}, ξ_{2}, \dots, ξ_{K}$ . For each fixed value of $ξ_{j}$ , $j = 1, 2, \dots, K$ , compute the estimate of $θ = (β^{⊤}, κ^{⊤})^{⊤}$ based on $ξ_{j}$ , that is, ${\hat{θ}}_{j} = ({\hat{β}}_{j}^{⊤}, {\hat{κ}}_{j}^{⊤})^{⊤}$ . Compute also the value of the associated log-likelihood function, $ℓ_{j} (\hat{θ})$ .
(S2)
Obtain the final estimates of ξ and $θ = (β^{⊤}, κ^{⊤})^{⊤}$ , $\hat{ξ}$ and $\hat{θ} = ({\hat{β}}^{⊤}, {\hat{κ}}^{⊤})^{⊤}$ say, as the associated estimates that maximize the log-likelihood function ( $\max_{j} ℓ_{j} (\hat{θ})$ ).
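Steps (S1) and (S2) can be sketched on a deliberately simplified problem. The example below is an assumption-laden toy, not the full model: it uses a Student-t location model for $y = \log (t)$ with no covariates, no censoring, unit dispersion and q = 0.5, and the inner fit is a golden-section search rather than BFGS. The function names and data are hypothetical.

```python
import math


def t_loglik(mu, xi, y):
    """Student-t log-likelihood for y with location mu, unit scale, xi degrees of freedom."""
    const = (math.lgamma((xi + 1) / 2) - math.lgamma(xi / 2)
             - 0.5 * math.log(xi * math.pi))
    return sum(const - 0.5 * (xi + 1) * math.log1p((v - mu) ** 2 / xi) for v in y)


def fit_mu(xi, y, tol=1e-8):
    """Inner fit of step (S1): golden-section search for mu on [min(y), max(y)]."""
    lo, hi = min(y), max(y)
    r = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        a = hi - r * (hi - lo)
        b = lo + r * (hi - lo)
        if t_loglik(a, xi, y) < t_loglik(b, xi, y):
            lo = a   # the maximizer lies to the right of a
        else:
            hi = b   # the maximizer lies to the left of b
    mu = 0.5 * (lo + hi)
    return mu, t_loglik(mu, xi, y)


def profile_fit(grid, y):
    """(S1): fit for each xi_j in the grid; (S2): keep the xi_j with the largest log-likelihood."""
    fits = [(xi,) + fit_mu(xi, y) for xi in grid]
    return max(fits, key=lambda f: f[2]), fits


y = [0.1, -0.4, 0.3, 0.0, 5.0, -0.2, 0.5, -0.1]  # made-up log-values with one outlier
best, fits = profile_fit(range(1, 11), y)
print(best)  # (xi_hat, mu_hat, maximized log-likelihood)
```

Note that the normalizing constant of the generator depends on ξ and therefore must be kept when log-likelihood values are compared across the grid.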

Under regularity conditions, the asymptotic distribution of $\hat{θ}$ is a multivariate normal, that is,

\sqrt{n} (\hat{θ} - θ) \dot{\sim} N_{k + l + 2} (0_{k + l + 2}, Σ_{θ}),

(11)

where $\dot{\sim}$ denotes convergence in distribution and $Σ_{θ}$ is the asymptotic variance-covariance matrix of $\hat{θ}$ [4], which is the inverse of the expected Fisher information matrix. We can approximate the expected Fisher information matrix by its observed version obtained from the Hessian matrix $\ddot{ℓ} (θ)$ , which contains the second derivatives of $ℓ (θ)$ . Thus, $Σ_{θ} \approx [- \ddot{ℓ} (θ)]^{- 1}$ , where

\ddot{ℓ} (θ) = \frac{\partial^{2} ℓ (θ)}{\partial θ \partial θ^{⊤}} = \sum_{i = 1}^{n} {\ddot{ℓ}}_{i} (θ), \quad \text{with} \quad {\ddot{ℓ}}_{i} (θ) = \begin{bmatrix} {\ddot{ℓ}}_{i β β} (θ) & {\ddot{ℓ}}_{i β κ} (θ) \\ {\ddot{ℓ}}_{i κ β} (θ) & {\ddot{ℓ}}_{i κ κ} (θ) \end{bmatrix} .

The elements of $\ddot{ℓ} (θ)$ are given by

\begin{aligned} {\ddot{ℓ}}_{i β β} (θ) & = \begin{cases} \frac{1}{ϕ_{i}} Π^{'} (ξ_{i}^{c}) x_{i} x_{i}^{⊤}, & i = 1, \dots, m, \\ \frac{4}{ϕ_{i}} Δ^{'} (ξ_{i}^{2}) ξ_{i}^{2} x_{i} x_{i}^{⊤} + \frac{2}{ϕ_{i}} Δ (ξ_{i}^{2}) x_{i} x_{i}^{⊤}, & i = m + 1, \dots, n, \end{cases} \\ {\ddot{ℓ}}_{i β κ} (θ) & = {\ddot{ℓ}}_{i κ β} (θ) \\ & = \begin{cases} \left[ \frac{w_{i}}{2 \sqrt{ϕ_{i}}} Π (ξ_{i}^{c}) + \frac{1}{2 ϕ_{i}} Π^{'} (ξ_{i}^{c}) w_{i} γ_{i}^{c} \right] x_{i}, & i = 1, \dots, m, \\ - 2 \left\{ - \frac{w_{i}}{2 \sqrt{ϕ_{i}}} Δ (ξ_{i}^{2}) ξ_{i} - \frac{w_{i} γ_{i}}{ϕ_{i}} \left[ Δ^{'} (ξ_{i}^{2}) ξ_{i}^{2} + \frac{1}{2} Δ (ξ_{i}^{2}) \right] \right\} x_{i}, & i = m + 1, \dots, n, \end{cases} \\ {\ddot{ℓ}}_{i κ κ} (θ) & = \begin{cases} - \frac{1}{4 ϕ_{i}} \left[ - Π (ξ_{i}^{c}) + \frac{1}{\sqrt{ϕ_{i}}} Π^{'} (ξ_{i}^{c}) γ_{i}^{c} \right] γ_{i}^{c} w_{i} w_{i}^{⊤}, & i = 1, \dots, m, \\ \left\{ \frac{w_{i}}{2 \sqrt{ϕ_{i}}} Δ (ξ_{i}^{2}) ξ_{i} + \frac{w_{i} γ_{i}}{ϕ_{i}} \left[ Δ^{'} (ξ_{i}^{2}) ξ_{i}^{2} + \frac{1}{2} Δ (ξ_{i}^{2}) \right] \right\} w_{i} γ_{i}, & i = m + 1, \dots, n . \end{cases} \end{aligned}

The corresponding standard errors can then be approximated by the square roots of the diagonal elements in the variance-covariance matrix evaluated at $\hat{θ}$ .

3.1. Interpretation of the regression coefficients

The regression coefficient of the proposed tobit model is interpreted in terms of the effect on the latent variable $T_{i}^{*}$ in the uncensored part. Let $β_{j}$ be the j-th regression coefficient and use the subscript (j) to imply excluding the j-th element, such that $x_{i (j)}$ and $β_{(j)}$ are, respectively, the vector of explanatory variables excluding $x_{i j}$ and the regression coefficients excluding $β_{j}$ . Note that the quantile of $T_{i}^{*}$ is given by

Q (T_{i}^{*} | x_{i j}, x_{i (j)}) = \exp (β_{0} + β_{j} x_{i j} + x_{i (j)}^{⊤} β_{(j)}) .

(12)

If $x_{i j}$ increases by 1 while keeping $x_{i (j)}$ fixed, we obtain

\begin{aligned} Q (T_{i}^{*} | x_{i j} + 1, x_{i (j)}) & = \exp (β_{j} (x_{i j} + 1)) \exp (β_{0} + x_{i (j)}^{⊤} β_{(j)}) \\ & = \exp (β_{j}) \exp (β_{0} + β_{j} x_{i j} + x_{i (j)}^{⊤} β_{(j)}) \\ & = \exp (β_{j}) Q (T_{i}^{*} | x_{i j}, x_{i (j)}) . \end{aligned}

Thus, for any j increasing $x_{i j}$ by 1, the quantile of $T_{i}^{*}$ will be multiplied by $\exp (β_{j})$ . This is usually expressed as a percentage change, and

\frac{Q (T_{i}^{*} | x_{i j} + 1, x_{i (j)}) - Q (T_{i}^{*} | x_{i j}, x_{i (j)})}{Q (T_{i}^{*} | x_{i j}, x_{i (j)})} \times 100 \% = (\exp (β_{j}) - 1) \times 100 \%

is the approximate percentage increase (or decrease if the value of $β_{j}$ is negative) in the quantile of $T_{i}^{*}$ when $x_{i j}$ is increased by 1. For $x_{i j}$ dichotomous, $(\exp (β_{j}) - 1) \times 100 \%$ is the percentage increase (or decrease if $β_{j}$ is negative) in the quantile of $T_{i}^{*}$ when $x_{i j}$ changes from 0 to 1. Note that when $- 0.4 \leq β_{j} \leq 0.4$ , we can use the approximation $(\exp (β_{j}) - 1) \approx β_{j}$ ; see Weisberg [47]. Finally, note that from (12), we can compute the marginal effect of the j-th covariate $x_{j}$ by differentiating the quantile function $Q (T_{i}^{*} | x_{i j}, x_{i (j)})$ with respect to $x_{j}$ , that is, we have [9]

\frac{\partial Q (T_{i}^{*} | x_{i j}, x_{i (j)})}{\partial x_{j}} = \exp (β_{0} + β_{j} x_{i j} + x_{i (j)}^{⊤} β_{(j)}) β_{j} .
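A small worked example of these two quantities follows. The coefficient and covariate values are hypothetical and chosen only for illustration; the helper names are not from the original work.

```python
import math


def quantile(beta, x):
    """Conditional quantile Q = exp(x' beta) of the latent variable, as in (12)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def percent_change(beta_j):
    """Percentage change in the quantile when x_j increases by one unit."""
    return (math.exp(beta_j) - 1.0) * 100.0


def marginal_effect(beta, x, j):
    """Derivative of exp(x' beta) with respect to x_j."""
    return quantile(beta, x) * beta[j]


beta = [0.5, 0.08]   # hypothetical (intercept, slope)
x = [1.0, 12.0]      # hypothetical covariate vector with leading 1
change = percent_change(beta[1])
effect = marginal_effect(beta, x, 1)
print(change, effect)
```

Here the percentage change is slightly above 8, which illustrates how close $(\exp (β_{j}) - 1)$ is to $β_{j}$ for small coefficients.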

3.2. Residual analysis

Goodness of fit and departures from the assumptions of the model can be assessed through residual analysis. In this work, we use the martingale-type (MT) residual, which is given by

r_{MT_{i}} = \operatorname{sign} (r_{M_{i}}) \sqrt{- 2 (r_{M_{i}} + ρ_{i} \log (ρ_{i} - r_{M_{i}}))}, i = 1, \dots, n,

where $r_{M_{i}} = ρ_{i} + \log (\hat{S} (t_{i}))$ , with $\hat{S} (t_{i})$ being the survival function fitted to the data, and $ρ_{i} = 0$ or 1 indicating that case i is censored or not, respectively; details on the MT residual can be seen in Therneau et al. [41]. Simulation results indicate that the empirical distribution of the MT residual is in agreement with the standard normal distribution; see Silva et al. [39]. Then, a normal quantile-quantile (QQ) plot with simulated envelope can be constructed for the MT residual to verify whether the model is correctly specified.
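The residual formula can be sketched as a small function. It assumes the fitted survival probability $\hat{S} (t_{i})$ is already available and lies strictly between 0 and 1; the function name is hypothetical.

```python
import math


def mt_residual(surv, rho):
    """Martingale-type residual from a fitted survival probability and a censoring flag.

    surv: fitted S(t_i), assumed to lie in (0, 1)
    rho:  1 if the observation is uncensored, 0 if it is censored
    """
    r_m = rho + math.log(surv)                 # martingale residual
    if rho == 0:
        inner = r_m                            # the term rho * log(rho - r_m) vanishes
    else:
        inner = r_m + rho * math.log(rho - r_m)
    sign = (r_m > 0) - (r_m < 0)
    # max(...) guards against tiny negative values caused by rounding
    return sign * math.sqrt(max(0.0, -2.0 * inner))


res_uncensored = mt_residual(0.8, 1)
res_censored = mt_residual(0.5, 0)
print(res_uncensored, res_censored)
```

For a censored case the residual reduces to $- \sqrt{- 2 \log (\hat{S} (t_{i}))}$, so censored observations always yield non-positive values.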

4. Monte Carlo simulation

In this section, the performance of the maximum likelihood estimates of the log-symmetric quantile tobit models is evaluated by a Monte Carlo experiment. We consider the following distributions: log-normal (log-NO), log-Student-t (log-t), log-power-exponential (log-PE) and extended Birnbaum-Saunders (EBS). For each Monte Carlo replica, a simulated sample of the log-symmetric quantile tobit model is generated for fixed parameter values. Then, the maximum likelihood estimates are obtained for each simulated sample. Estimates of relative bias (RB), relative root mean squared error (RRMSE) and relative mean absolute error (RMAE) are thus computed from the Monte Carlo replicas as

\begin{aligned} \widehat{RB} (\hat{θ}) & = \frac{\frac{1}{NREP} \sum_{i = 1}^{NREP} {\hat{θ}}^{(i)} - θ}{θ}, \\ \widehat{RRMSE} (\hat{θ}) & = \frac{\sqrt{\frac{1}{NREP} \sum_{i = 1}^{NREP} ({\hat{θ}}^{(i)} - θ)^{2}}}{θ}, \quad \text{and} \\ \widehat{RMAE} (\hat{θ}) & = \frac{\frac{1}{NREP} \sum_{i = 1}^{NREP} | {\hat{θ}}^{(i)} - θ |}{θ}, \end{aligned}

where θ and ${\hat{θ}}^{(i)}$ are the true parameter value and its respective i-th maximum likelihood estimate, and $NREP$ is the number of Monte Carlo replicas. The R software was used in all numerical calculations; see R Core Team [33]. Simulated data are generated according to two scenarios, that is, when the model is correctly specified and when it is not.
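The three summaries above are straightforward to compute from a list of replicated estimates. The following is a minimal sketch for a scalar parameter with a nonzero true value; the function name and the toy estimates are illustrative only.

```python
import math


def mc_summaries(estimates, theta):
    """Return (RB, RRMSE, RMAE) for replicated estimates of a scalar parameter theta != 0."""
    nrep = len(estimates)
    rb = (sum(estimates) / nrep - theta) / theta
    rrmse = math.sqrt(sum((e - theta) ** 2 for e in estimates) / nrep) / theta
    rmae = sum(abs(e - theta) for e in estimates) / nrep / theta
    return rb, rrmse, rmae


# Toy check: two made-up estimates symmetric around the true value 2.0.
rb, rrmse, rmae = mc_summaries([1.0, 3.0], 2.0)
print(rb, rrmse, rmae)
```

In this toy case the relative bias is zero while the two error measures are not, which shows why the three summaries are reported together.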

4.1. Scenario 1

In the first scenario, the model is correctly specified with data generating process given by

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, m, \\ T_{i}^{*} = Q_{i} ϵ_{i}^{\sqrt{ϕ_{i}}}, & T_{i}^{*} > Ψ, \quad i = m + 1, \dots, n, \end{cases}

(13)

where $ϵ_{i} \sim QLS (1, 1, g)$ , $Q_{i} = \exp (β_{0} + β_{1} x_{i})$ and $ϕ_{i} = \exp (κ_{0} + κ_{1} w_{i})$ .

The simulation scenario considers: (i) the log-t model with $ξ = 4$ , $β_{0} = 1.5$ , $β_{1} = 5.5$ , $κ_{0} = 2.0$ , $κ_{1} = 2.0$ , $q = 0.10, 0.50, 0.90$ ; and (ii) the log-PE model with $ξ = 0.3$ , $β_{0} = 1.0$ , $β_{1} = 1.2$ , $κ_{0} = 2.0$ , $κ_{1} = 0.5$ ; with $NREP = 200$ Monte Carlo replications. The explanatory variables $x_{i}$ and $w_{i}$ are generated from the N(0,1) distribution. The value of Ψ in (13) is determined so that the censoring proportion is 10% or 40%.

Tables A2 and A3 present the results of Monte Carlo simulations based on the log-t and log-PE distributions (similar results are obtained for the log-NO and EBS cases). For comparison, the results of the Powell [32], Portnoy [31] and Peng-Huang [30] estimators are given as well; see the R package quantreg [22]. Tables A2 and A3 report the RB, RRMSE and RMAE obtained for different combinations of censoring proportion, q, and sample size (n). The results in Tables A2 and A3 allow us to conclude that: ML1) as the sample size increases, in general, the RB, RRMSE and RMAE of all the estimates of the log-t and log-PE quantile tobit model parameters tend to decrease, as expected. This is the case because the maximum likelihood estimator is consistent [11]; ML2) when the censoring proportion increases, in general, the RB, RRMSE and RMAE of all the estimates related to the log-t and log-PE quantile tobit models tend to increase, meaning that the performance of the estimates deteriorates. This result is also expected since the likelihood function loses information contained in the sample when the percentage of censoring increases [40]; ML3) compared to the results of Powell, Portnoy and Peng-Huang, we observe that the results based on the log-t and log-PE distributions, in general, recover the parameters better; and ML4) the performance of all the estimates seems better at the quantiles closer to q = 0.50. The results in ML1) and ML3) are illustrated in Figure 2, ML2) is represented in Figure 3, and ML4) is plotted in Figure 4. These figures illustrate the results of ${\hat{β}}_{1}$ for the log-t.

Figure 2. — RB, RRMSE and RMAE of ${\hat{β}}_{1}$ for the log-t, Powell, Portnoy and Peng-Huang cases for some n values (q = 0.50 and censoring=10%).

Figure 3. — RB, RRMSE and RMAE of ${\hat{β}}_{1}$ for the log-t, Powell, Portnoy and Peng-Huang cases for some censoring ratio values (n = 50 and q = 0.50).

Figure 4. — RB, RRMSE and RMAE of ${\hat{β}}_{1}$ for the log-t, Powell, Portnoy and Peng-Huang cases for some q values (n = 50 and censoring=10%).

In this first scenario, we also compute the coverage probability (CP) of 95% confidence intervals (CIs) based on the asymptotic normality in (11). The CP estimate is given by

\hat{CP} (\hat{θ}) = \frac{1}{NREP} \sum_{i = 1}^{NREP} I (θ \in [{\hat{θ}}_{L}^{(i)}, {\hat{θ}}_{U}^{(i)}]),

where $I$ is an indicator function taking the value 1 if θ belongs to the i-th interval $[{\hat{θ}}_{L}^{(i)}, {\hat{θ}}_{U}^{(i)}]$ , and 0 otherwise, where ${\hat{θ}}_{L}^{(i)}$ and ${\hat{θ}}_{U}^{(i)}$ are the i-th lower and upper limit estimates of the 95% confidence interval. Table A1 reports the CP results for the log-t and log-PE models. From this table, we observe that: CP1) the CP approaches the 95% nominal level as the sample size increases; CP2) there seems to be no clear effect with the increase of either q or the censoring ratio. The result CP1) is illustrated in Figure 5(a) while CP2) is represented in Figure 5(b,c).
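The coverage estimate can be sketched as follows. The interval construction shown here, a Wald interval of the form estimate ± z × standard error, is an assumption consistent with the asymptotic normality in (11); the function names and numbers are made up for illustration.

```python
from statistics import NormalDist


def wald_interval(estimate, std_error, level=0.95):
    """Two-sided Wald interval based on asymptotic normality (an assumed construction)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * std_error, estimate + z * std_error


def coverage_probability(intervals, theta):
    """Share of intervals [lower, upper] that contain the true value theta."""
    return sum(lo <= theta <= hi for lo, hi in intervals) / len(intervals)


# Made-up replicated estimates and standard errors for a true value of 1.5.
replicas = [(1.45, 0.10), (1.80, 0.12), (1.52, 0.09), (1.30, 0.11)]
intervals = [wald_interval(est, se) for est, se in replicas]
cp = coverage_probability(intervals, 1.5)
print(cp)
```

With only four made-up replicas the value is not informative; in a simulation study the same function would be applied to all NREP intervals.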

Figure 5. — CPs for the log-t model for some values of n, q and censoring proportion. (a) q = 0.50 and censoring=10%. (b) n = 50 and censoring=10%. (c) n = 50 and q = 0.50.

4.2. Scenario 2

In the second scenario, the model is misspecified with data generating process given by

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, m, \\ T_{i}^{*}, & T_{i}^{*} > Ψ, \quad i = m + 1, \dots, n, \end{cases}

(14)

where $T_{i}^{*} \sim Weibull (Q_{i}, ζ)$ , $Q_{i} = \exp (β_{0} + β_{1} x_{i})$ . That is, $T_{i}^{*}$ follows a Weibull distribution with shape and quantile parameters $ζ > 0$ and $Q_{i} > 0$ , respectively, with probability density function and cumulative distribution function given by

\begin{aligned} f (t_{i}; Q_{i}, ζ) & = - ζ t_{i}^{ζ - 1} Q_{i}^{- ζ} \ln (1 - q) \exp (t_{i}^{ζ} Q_{i}^{- ζ} \ln (1 - q)), \\ F (t_{i}; Q_{i}, ζ) & = 1 - (1 - q)^{t_{i}^{ζ} Q_{i}^{- ζ}}, t_{i} > 0, \end{aligned}

(15)

respectively, where $q \in (0, 1)$ . Note that the $100 q$ -th quantile of $T_{i}^{*}$ is given by $Q_{i}$ . The simulation scenario considers $β_{0} = 1.0$ , $β_{1} = 1.5$ , $ζ = 2.5$ , q = 0.10, 0.30, 0.70, 0.90, with $NREP = 200$ Monte Carlo replications. The explanatory variable $x_{i}$ is generated from the N(0,1) distribution and the value of Ψ in (14) is determined so that the censoring proportion is 25%.
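Data from this quantile-parameterized Weibull distribution can be generated by inverting the distribution function. The sketch below uses $F (t) = 1 - (1 - q)^{(t / Q)^{ζ}}$, the form implied by the density in (15), so that $T = Q [\ln (1 - U) / \ln (1 - q)]^{1 / ζ}$ with U uniform on (0, 1). Function names, the seed and the parameter values are illustrative assumptions.

```python
import math
import random


def weibull_q_cdf(t, Q, zeta, q):
    """CDF of the Weibull distribution parameterized by its 100q-th quantile Q."""
    return 1.0 - (1.0 - q) ** ((t / Q) ** zeta)


def weibull_q_sample(Q, zeta, q, rng):
    """Inverse-CDF draw: solve F(T) = U for T, with U uniform on [0, 1)."""
    u = rng.random()
    return Q * (math.log(1.0 - u) / math.log(1.0 - q)) ** (1.0 / zeta)


rng = random.Random(123)            # arbitrary seed
Q, zeta, q = 2.0, 2.5, 0.30         # illustrative values
draws = [weibull_q_sample(Q, zeta, q, rng) for _ in range(20000)]

# Q is the 100q-th quantile, so this share should be close to q.
share_below_Q = sum(t <= Q for t in draws) / len(draws)
print(weibull_q_cdf(Q, Q, zeta, q), round(share_below_Q, 3))
```

Evaluating the distribution function at Q returns q, which confirms that Q plays the role of the quantile parameter.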

Table A4 presents the RB, RRMSE and RMAE results based on the log-NO, log-t, log-PE and EBS distributions. Moreover, the results of Powell, Portnoy and Peng-Huang, are given as well. From Table A4, we observe that the results for $β_{1}$ associated with the log-t model are better. In general, the true values of the parameters are well recovered in all models, that is, the estimates are close to the true values of the parameters in all models. These results are illustrated in Figure 6.

Figure 6. — RB, RRMSE and RMAE of ${\hat{β}}_{1}$ for the log-t, Powell, Portnoy and Peng-Huang cases for some n values (q = 0.30 and censoring=25%) with misspecified data generating process.

5. Applications

In this section, the proposed log-symmetric quantile tobit models are used to analyze the PNAD and PSID data. The PNAD data set was obtained from the official IBGE website², whereas the PSID data set has already been analyzed in the literature on tobit models. The distributions are the same ones used in the Monte Carlo simulations. The steps to model the female labor supply data are described by Algorithm 1.

5.1. PNAD data

In this subsection, the log-symmetric quantile tobit models are illustrated using data from the PNAD for the year 2015, from the IBGE, which reports demographic and socioeconomic characteristics of the Brazilian population annually. We used a sub-sample from the PNAD composed only of women³. The PNAD data consist of a sample of women aged between 18 and 65 years old, with information on hourly wages and socioeconomic characteristics. In total, the sample contains 26,460 observations, of which 387 are censored with a salary equal to zero. The data cover 10 metropolitan regions in Brazil, represented by Belem-PA, Fortaleza-CE, Recife-PE, Salvador-BA, Belo Horizonte-MG, Rio de Janeiro-RJ, Curitiba-PR, Porto Alegre-RS, Brasília-DF and São Paulo-SP. Nominal income values were deflated by the National Consumer Price Index (INPC) provided by the IBGE.

The objective is to study women's labor supply. The dependent variable is women's hourly wages (T) and the explanatory variables are the woman's age in years (age), age squared as a proxy for experience ($\mathrm{age}^{2}$), a dummy variable for race with value 1 for white and 0 for non-white (color), a dummy variable for marital status with value 1 for married and 0 for non-married (civil), a dummy variable for children under 10 in the household with value 1 if yes and 0 if no (minor), formal years of education ranging from 0 to 16 years (educ), number of years in the main job, ranging from 0 to 56 years (exper), and a dummy variable for the condition of the woman in the household, with value 1 if the woman is the head of the household and 0 otherwise (head). As in the PSID application, 1 is added to the woman's hourly wage, so that the dependent variable is $T + 1$ and $Ψ = 1$.

The choice of these explanatory variables is due to their importance in the female labor supply literature, in addition to some similarity with the PSID data. The variable educ, for instance, directly affects the female participation rate in the labor market. On the other hand, women's labor supply does not depend only on the demand side. There are also other elements that might limit this participation, such as the presence of children in the household.

Descriptive statistics for women's hourly wages (T) indicate that the mean and median are 22.133 and 7.59, respectively. The coefficient of variation is 493.11%, indicating a high dispersion of the data around the mean, whereas the coefficients of skewness and kurtosis are equal to 20.225 and 573.536, respectively. The skewness suggests the presence of positive asymmetry and the kurtosis indicates the occurrence of heavy tails, which motivates the use of log-symmetric distributions. The asymmetric nature of the data is confirmed by the histogram shown in Figure 7(a).
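The summary measures quoted above can be computed with a few lines of code. The sketch below is only illustrative: it assumes plain moment-based definitions of skewness and kurtosis (the text does not state which estimators were used), and the function name `describe` and the toy wage vector are hypothetical, not PNAD data.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return mean, median, CV (%), skewness and kurtosis of a sample.

    Assumption: skewness and kurtosis are the moment ratios
    m3 / m2**1.5 and m4 / m2**2, with no small-sample correction.
    """
    n = len(values)
    m = mean(values)
    # Central moments of order 2, 3 and 4 (divisor n).
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return {
        "mean": m,
        "median": median(values),
        "cv_percent": 100.0 * stdev(values) / m,  # sample sd over the mean
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical right-skewed wages: the mean exceeds the median and the
# skewness is positive, the same qualitative pattern reported for T.
toy_wages = [0.0, 3.2, 5.1, 7.6, 8.0, 9.4, 12.5, 20.0, 45.0, 310.0]
stats = describe(toy_wages)
```

As a consistency check on the reported figures, a mean of 22.133 together with a coefficient of variation of 493.11% implies a standard deviation of roughly 109.1.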

Figure 7. — Histogram (a) for the observed women's hourly wages and QQ plot (b) and its envelope for the MT residual for the log-t quantile tobit model (q = 0.50).

Since the proposed models can accommodate heteroscedasticity, two versions are considered:

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, 387, \\ T_{i}^{*} = Q_{i} ϵ_{i}^{\sqrt{ϕ_{i}}}, & T_{i}^{*} > Ψ, \quad i = 388, \dots, 26{,}460, \end{cases}

where

\begin{aligned} Q_{i} & = \exp (β_{0} + β_{1} \mathrm{age}_{i} + β_{2} \mathrm{age}_{i}^{2} + β_{3} \mathrm{color}_{i} + β_{4} \mathrm{civil}_{i} + β_{5} \mathrm{minor}_{i} \\ & \quad + β_{6} \mathrm{educ}_{i} + β_{7} \mathrm{exper}_{i} + β_{8} \mathrm{head}_{i}), \end{aligned}

$ϵ_{i} \sim QLS (1, 1, g)$, and

Specification 1: $ϕ_{i} = \exp (κ_{0})$ .
Specification 2: $ϕ_{i} = \exp (κ_{0} + κ_{1} \mathrm{age}_{i} + κ_{2} \mathrm{age}_{i}^{2} + κ_{3} \mathrm{color}_{i} + κ_{4} \mathrm{civil}_{i} + κ_{5} \mathrm{minor}_{i} + κ_{6} \mathrm{educ}_{i} + κ_{7} \mathrm{exper}_{i} + κ_{8} \mathrm{head}_{i})$ .

In Specification 2, explanatory variables enter the dispersion parameter ϕ, which allows the model to accommodate heteroscedasticity.
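To make the structure of the two specifications concrete, the following sketch simulates left-censored data and evaluates the corresponding censored log-likelihood. It is a minimal illustration under stated assumptions, not the authors' implementation: it uses only the log-NO (log-normal) member of the class, a single hypothetical covariate `x`, and the representation $\log ϵ_{i} = Z_{i} - z_{q}$ with $Z_{i}$ standard normal and $z_{q}$ its q-th quantile, which makes $Q_{i}$ the q-th quantile of $T_{i}^{*}$. The function names and parameter values are invented for the example.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal kernel: the log-NO member (assumption)


def simulate_censored(q, beta, kappa, xs, psi=1.0, seed=0):
    """Draw (T, T*, Q) triples from the assumed log-NO quantile tobit model."""
    rng = random.Random(seed)
    z_q = STD.inv_cdf(q)
    draws = []
    for x in xs:
        Q = math.exp(beta[0] + beta[1] * x)        # quantile submodel
        phi = math.exp(kappa[0] + kappa[1] * x)    # dispersion submodel
        log_eps = rng.gauss(0.0, 1.0) - z_q        # log of eps ~ QLS(1, 1, g)
        t_star = Q * math.exp(math.sqrt(phi) * log_eps)
        draws.append((max(t_star, psi), t_star, Q))  # left-censor at psi
    return draws


def loglik(q, beta, kappa, xs, ts, psi=1.0):
    """Censored log-likelihood of the assumed log-NO quantile tobit model."""
    z_q = STD.inv_cdf(q)
    total = 0.0
    for x, t in zip(xs, ts):
        log_Q = beta[0] + beta[1] * x
        s = math.sqrt(math.exp(kappa[0] + kappa[1] * x))
        if t <= psi:
            # Censored observation: contributes P(T* <= psi).
            total += math.log(STD.cdf((math.log(psi) - log_Q) / s + z_q))
        else:
            # Uncensored observation: contributes the density of T* at t.
            u = (math.log(t) - log_Q) / s + z_q
            total += math.log(STD.pdf(u)) - math.log(s) - math.log(t)
    return total
```

Setting the slope `kappa[1]` to zero gives the constant dispersion of Specification 1, while a nonzero slope mimics Specification 2. In a simulated sample, the share of latent values below $Q_{i}$ should be close to q, which is the defining property of the quantile parameterization.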

The AIC and BIC values for the fitted log-symmetric quantile tobit models are reported in Table 2 for $q \in \{0.05, 0.25, 0.50, 0.75, 0.95\}$. Similar results are obtained when considering $q \in \{0.01, 0.02, \dots, 0.99\}$. The models with explanatory variables in the dispersion parameter (ϕ) provide better fits for all the distributions considered here. In general, the log-t quantile tobit model shows the best fit to the data.
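For reference, the two criteria can be sketched as follows, assuming the standard definitions $AIC = 2k - 2\ell$ and $BIC = k \log n - 2\ell$, where $\ell$ is the maximized log-likelihood, k the number of estimated parameters and n the sample size. The parameter counts used below (10 for Specification 1 and 18 for Specification 2, that is, nine β coefficients plus one or nine κ coefficients, with ξ treated as fixed) are an assumption made for this illustration.

```python
import math


def aic(loglik, k):
    """Akaike information criterion (standard definition)."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion (standard definition)."""
    return k * math.log(n) - 2 * loglik


n = 26460                   # PNAD sample size
k_spec1, k_spec2 = 10, 18   # assumed parameter counts (see lead-in)

# BIC - AIC = k * (log(n) - 2) does not depend on the log-likelihood,
# so it can be compared directly with the gaps in Table 2.
gap_spec1 = k_spec1 * (math.log(n) - 2)
gap_spec2 = k_spec2 * (math.log(n) - 2)
```

Under these assumptions the gaps are about 81.83 and 147.30, in line with the differences between the BIC and AIC rows of Table 2 (for example, 63820.45 − 63738.61 = 81.84 and 62779.83 − 62632.53 = 147.30).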

Table 2.

AIC and BIC values for different models and q with the PNAD data.

	Specification	Criterion	q = 0.05	q = 0.25	q = 0.50	q = 0.75	q = 0.95
Log-NO	1	AIC	63738.61	63738.61	63738.61	63738.61	63738.61
	1	BIC	63820.45	63820.45	63820.45	63820.45	63820.45
	2	AIC	62632.53	62707.62	62748.85	62782.9	62822.39
	2	BIC	62779.83	62854.92	62896.15	62930.2	62969.69
Log-t	1	AIC	53825.67	53825.67	53836.37	53825.67	53825.67
	1	BIC	53907.51	53907.51	53918.2	53907.51	53907.51
	2	AIC	50348.87	51912.48	52303.72	52496.98	52696.98
	2	BIC	50496.17	52059.78	52451.02	52644.28	52844.28
Log-PE	1	AIC	55905.30	55905.30	55905.30	55905.29	55905.30
	1	BIC	55987.13	55987.13	55987.13	55987.13	55987.14
	2	AIC	53986.88	54591.75	54747.16	54847.73	54970.37
	2	BIC	54134.18	54739.05	54894.46	54995.03	55117.67
EBS	1	AIC	63895.42	63895.42	63895.42	63895.42	63895.42
	1	BIC	63977.25	63977.25	63977.25	63977.25	63977.25
	2	AIC	62806.21	62877.53	62917	62949.78	62987.94
	2	BIC	62953.51	63024.83	63064.3	63097.08	63135.24

The parameter estimates for the log-t quantile tobit model based on Specification 2, considering $q \in \{0.05, 0.25, 0.50, 0.75, 0.95\}$, are reported in Table 3. This table also presents the estimation results based on the optimal quantile, denoted by $q_{otm}$, which was chosen through a search approach. For a grid of values $q \in \{0.01, 0.02, \dots, 0.99\}$, we estimated the model parameters and computed the corresponding AIC and BIC values. The value of $q_{otm}$ is the one with the lowest AIC and BIC values. From Table 3, note that the maximum likelihood estimates of the model parameters change according to the value of q, meaning that the magnitude of the effect of the explanatory variables varies with q. We can interpret the estimated coefficients in terms of the effect on the latent variable $T_{i}^{*}$ (observed part of the hourly wage); see Subsection 3.1. For white women, for instance, the 5th percentile (q = 0.05) of the hourly wage is $(\exp (0.0219) - 1) \times 100\% = 2.21\%$ higher than for non-white women. However, the corresponding increase in the 95th percentile (q = 0.95) of the hourly wage is $(\exp (0.2315) - 1) \times 100\% = 26.05\%$. That is, the effect of color on the observed part of the hourly wage is greater for women with higher income (larger quantiles).
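Because the quantile is modeled as $Q_{i} = \exp (x_{i}^{\top} β)$, a one-unit change in a covariate multiplies the quantile by $\exp (β_{j})$, so the percentage effect is $(\exp (β_{j}) - 1) \times 100\%$. The short sketch below evaluates this expression for the color coefficients reported in Table 3 at q = 0.05 and q = 0.95; the function name `percent_effect` is introduced here only for illustration.

```python
import math


def percent_effect(beta):
    """Percentage change in the modeled quantile for a one-unit change
    in a covariate with coefficient beta, under a log link."""
    return (math.exp(beta) - 1.0) * 100.0


# Color coefficients from Table 3 (log-t model, Specification 2).
effect_q05 = percent_effect(0.0219)  # q = 0.05
effect_q95 = percent_effect(0.2315)  # q = 0.95

print(f"q = 0.05: {effect_q05:.2f}%")
print(f"q = 0.95: {effect_q95:.2f}%")
```

Rounded to two decimals, the two effects are 2.21% and 26.05%.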

Table 3.

Maximum likelihood estimates (standard errors in parentheses) for the log-t quantile tobit model across different values of q (Specification 2).

	q = 0.05	q = 0.25	q = 0.50	q = 0.75	q = 0.95	$q_{otm} = 0.01$
$β_{0}$ (intercept)	1.7685*	0.9726*	0.6617*	0.5392*	0.5919*	1.9216*
	(0.0522)	(0.0351)	(0.0304)	(0.0367)	(0.0731)	(0.0865)
$β_{1}$ (age)	−0.0113*	0.0195*	0.0311*	0.0417*	0.0656*	−0.0413*
	(0.0027)	(0.0018)	(0.0016)	(0.0020)	(0.0041)	(0.0044)
$β_{2}$ ($\mathrm{age}^{2}$)	0.0001*	−0.0002*	−0.0003*	−0.0004*	−0.0006*	0.0004*
	(0.0000)	(0.0000)	(0.0000)	(0.0000)	(0.0001)	(0.0001)
$β_{3}$ (color)	0.0219*	0.1235*	0.1683*	0.2003*	0.2315*	−0.0735*
	(0.0108)	(0.0068)	(0.0062)	(0.0081)	(0.0163)	(0.0169)
$β_{4}$ (civil)	−0.0008	−0.0177	−0.0104	0.0038	0.0351	0.0399
	(0.0236)	(0.0167)	(0.0161)	(0.0207)	(0.0412)	(0.0348)
$β_{5}$ (minor)	−0.0344*	−0.0238*	−0.0162*	−0.0110	−0.0020	−0.0335**
	(0.0111)	(0.0070)	(0.0064)	(0.0082)	(0.0162)	(0.0174)
$β_{6}$ (educ)	−0.0376*	0.0382*	0.0646*	0.0767*	0.0842*	−0.1227*
	(0.0014)	(0.0014)	(0.0011)	(0.0011)	(0.0018)	(0.0019)
$β_{7}$ (exper)	−0.0012	0.0109*	0.0174*	0.0229*	0.0369*	−0.0101*
	(0.0009)	(0.0007)	(0.0006)	(0.0008)	(0.0017)	(0.0012)
$β_{8}$ (head)	−0.0459*	−0.0044	0.0197*	0.0397*	0.0798*	−0.0811*
	(0.0110)	(0.0075)	(0.0069)	(0.0087)	(0.0175)	(0.0163)
$κ_{0}$ (intercept)	−6.3627*	−5.9789*	−5.2843*	−4.8397*	−4.2542*	−5.2983*
	(0.1141)	(0.1608)	(0.1559)	(0.1466)	(0.1370)	(0.0792)
$κ_{1}$ (age)	0.0829*	0.0930*	0.0916*	0.0884*	0.0806*	0.0579*
	(0.0058)	(0.0079)	(0.0079)	(0.0075)	(0.0070)	(0.0040)
$κ_{2}$ ($\mathrm{age}^{2}$)	−0.0008*	−0.0008*	−0.0008*	−0.0008*	−0.0007*	−0.0006*
	(0.0001)	(0.0001)	(0.0001)	(0.0001)	(0.0001)	(0.0000)
$κ_{3}$ (color)	0.2495*	0.2688*	0.2368*	0.2005*	0.1130	0.1799*
	(0.0207)	(0.0272)	(0.0280)	(0.0279)	(0.0260)	(0.0142)
$κ_{4}$ (civil)	−0.0311	0.0649	0.0927	0.0990	−0.0023	−0.0572**
	(0.0504)	(0.0664)	(0.0666)	(0.0649)	(0.0600)	(0.0328)
$κ_{5}$ (minor)	0.0370	0.0314	0.0263	0.0251	0.0849	0.0194
	(0.0226)	(0.0298)	(0.0302)	(0.0298)	(0.0271)	(0.0152)
$κ_{6}$ (educ)	0.2022*	0.1381*	0.0830*	0.0529*	0.0234*	0.1673*
	(0.0030)	(0.0048)	(0.0042)	(0.0036)	(0.0028)	(0.0019)
$κ_{7}$ (exper)	0.0242*	0.0271*	0.0253*	0.0245*	0.0259*	0.0174*
	(0.0014)	(0.0019)	(0.0020)	(0.0021)	(0.0020)	(0.0009)
$κ_{8}$ (head)	0.1104*	0.1227*	0.1142*	0.1070*	0.0938*	0.0795*
	(0.0211)	(0.0289)	(0.0292)	(0.0288)	(0.0265)	(0.0139)
ξ	2	2	2	2	2	2

Source: Elaborated by the authors based on PNAD data.

* significant at 5% level. ** significant at 10% level.

Table 3 reveals that the parameter estimates associated with the explanatory variables age, $\mathrm{age}^{2}$, color, minor, educ, exper and head, which model the quantile, are statistically significant for at least one q. The parameter estimates associated with age, $\mathrm{age}^{2}$, color, civil, educ, exper and head, which model the dispersion, are also statistically significant for at least one q, which indicates the presence of heteroscedasticity in the data. Thus, the reduced predictive log-t quantile tobit model (q = 0.50) is given by (standard errors in parentheses)

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, 387, \\ T_{i}^{*} = {\hat{Q}}_{i} ϵ_{i}^{\sqrt{{\hat{ϕ}}_{i}}}, & T_{i}^{*} > Ψ, \quad i = 388, \dots, 26{,}460, \end{cases}

(16)

where

\begin{aligned} {\hat{Q}}_{i} & = \exp (\underset{(0.0303)}{0.6622} + \underset{(0.0016)}{0.0311} \mathrm{age}_{i} - \underset{(< 0.0001)}{0.0003} \mathrm{age}_{i}^{2} + \underset{(0.0062)}{0.1683} \mathrm{color}_{i} - \underset{(0.0064)}{0.0169} \mathrm{minor}_{i} \\ & \quad + \underset{(0.0011)}{0.0646} \mathrm{educ}_{i} + \underset{(0.0006)}{0.0174} \mathrm{exper}_{i} + \underset{(0.0069)}{0.0195} \mathrm{head}_{i}), \\ {\hat{ϕ}}_{i} & = \exp (- \underset{(0.1567)}{5.2777} + \underset{(0.0078)}{0.0922} \mathrm{age}_{i} - \underset{(0.0001)}{0.0008} \mathrm{age}_{i}^{2} + \underset{(0.0279)}{0.2362} \mathrm{color}_{i} + \underset{(0.0654)}{0.0997} \mathrm{civil}_{i} \\ & \quad + \underset{(0.0042)}{0.0828} \mathrm{educ}_{i} + \underset{(0.0020)}{0.0253} \mathrm{exper}_{i} + \underset{(0.0291)}{0.1147} \mathrm{head}_{i}), \end{aligned}

and $\hat{ξ} = 2$. The QQ plot with simulated envelope of the MT residual for this model is depicted in Figure 7(b). The log-t model provides a good fit to the data. It is worth mentioning that similar plots are obtained for other values of q.

Comparing the proposed log-t model with the classic normal tobit model, we have $AIC = 313909.3$ and $BIC = 313991.1$ for the latter, indicating that the proposed model is a better choice. Moreover, we can compare the log-t results to the Powell, Portnoy and Peng-Huang ones. To do so, we fit these models for each $q = 0.01, 0.02, \dots, 0.98, 0.99$, and then compute the averages of the corresponding root mean squared errors (RMSEs) and mean absolute errors (MAEs) of the predicted values, where $RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (t_{i} - {\hat{t}}_{i})^{2}}$ and $MAE = \frac{1}{n} \sum_{i = 1}^{n} | t_{i} - {\hat{t}}_{i} |$, with $t_{i}$ and ${\hat{t}}_{i}$ being the actual and predicted values, respectively. Table 4 reveals that the log-t model performs well, especially in terms of MAE. The Powell, Portnoy and Peng-Huang procedures had estimation problems for some values of q.
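The two error measures and their averaging over the grid of quantiles can be sketched as follows. This is an illustration of the formulas only: the helper `average_over_quantiles` and the toy predictions are hypothetical, and how the predicted values are obtained from each fitted model is not shown.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between actual and predicted values."""
    n = len(actual)
    return sum(abs(t - p) for t, p in zip(actual, predicted)) / n


def average_over_quantiles(actual, predictions_by_q):
    """Average the RMSE and MAE over the models fitted at each q."""
    rmses = [rmse(actual, pred) for pred in predictions_by_q.values()]
    maes = [mae(actual, pred) for pred in predictions_by_q.values()]
    return sum(rmses) / len(rmses), sum(maes) / len(maes)


# Toy example with two quantile levels (not the PNAD data).
actual = [1.0, 2.0, 3.0]
predictions_by_q = {0.25: [1.0, 2.0, 6.0], 0.50: [1.0, 2.0, 3.0]}
mean_rmse, mean_mae = average_over_quantiles(actual, predictions_by_q)
```

Because the RMSE squares the errors, a few very large wages can dominate it, which may help explain why the methods in Table 4 are close in RMSE but differ more clearly in MAE.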

Table 4.

Means of the RMSEs and MAEs based on $q = 0.01, 0.02, \dots, 0.98, 0.99$ for the indicated models.

Measure	Log-t	Powell	Portnoy	Peng-Huang
RMSE	112.6243	112.6054	112.7335	112.2671
MAE	21.2145	23.6919	23.8055	23.3265

5.2. PSID data

This application considers a data set corresponding to the PSID of 1976, based on data from the previous year, 1975; see Mroz [29]. This set contains 753 observations of white married women between 30 and 60 years of age in 1975 (year of the interview was in 1976). Of these 753 women, 325 have a salary equal to zero, that is, censored at zero. Since the proposed models are for positive data, the dependent variable is considered to be $T + 1$ , such that $Ψ = 1$ .

The objective here is to estimate the labor supply of white married women by using log-symmetric quantile tobit models. The dependent variable is the hourly wage (T) (in 1975 US dollars) and the explanatory variables are age in years (age), years of education (educ), number of children under 6 years old in the household (chil6), number of children between 6 and 18 years old in the household (chil618), and years of previous experience in the labor market (exper). These data were previously studied by Barros et al. [2] by using the classic normal tobit model and the Student-t tobit model.

Similarly to the first application, two versions of the log-symmetric quantile tobit model are considered:

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, 325, \\ T_{i}^{*} = Q_{i} ϵ_{i}^{\sqrt{ϕ_{i}}}, & T_{i}^{*} > Ψ, \quad i = 326, \dots, 753, \end{cases}

where

Q_{i} = \exp (β_{0} + β_{1} \mathrm{age}_{i} + β_{2} \mathrm{educ}_{i} + β_{3} \mathrm{chil6}_{i} + β_{4} \mathrm{chil618}_{i} + β_{5} \mathrm{exper}_{i}),

$ϵ_{i} \sim QLS (1, 1, g)$ , and

Specification 1: $ϕ_{i} = \exp (κ_{0})$ .
Specification 2: $ϕ_{i} = \exp (κ_{0} + κ_{1} \mathrm{age}_{i} + κ_{2} \mathrm{educ}_{i} + κ_{3} \mathrm{chil6}_{i} + κ_{4} \mathrm{chil618}_{i} + κ_{5} \mathrm{exper}_{i})$ .

Note that the difference between the two models lies in the inclusion of explanatory variables in the dispersion parameter ϕ. Thus, Specification 2 accounts for the presence of heteroscedasticity.

An analysis similar to that of the previous application allows us to conclude that, according to the AIC and BIC values, the models with explanatory variables in the dispersion parameter (ϕ) provide better fits than those without them. Moreover, the log-PE quantile tobit model provided the best fit to the data.

Table 5 reports the maximum likelihood estimates and standard errors for the log-PE quantile tobit model parameters based on Specification 2, considering $q \in \{0.05, 0.25, 0.50, 0.75, 0.95\}$ and $q_{otm}$. From Table 5, we note that the maximum likelihood estimates of the parameters change according to the value of q. Thus, the magnitude of the effect of the explanatory variables varies with q. Again, we can interpret the estimated coefficients in terms of the effect on the latent variable $T_{i}^{*}$, that is, the effect on the observed part of the hourly wage. We observe, for instance, that one additional year of experience (exper) increases the 5th percentile (q = 0.05) of the hourly wage by $(\exp (0.0974) - 1) \times 100\% = 10.23\%$, while it increases the 75th percentile (q = 0.75) by $(\exp (0.0383) - 1) \times 100\% = 3.90\%$. In other words, the effect of increased labor experience on the observed part of the hourly wage is greater for women with lower income (lower quantiles).

Table 5.

Maximum likelihood estimates (standard errors in parentheses) for the log-PE quantile tobit model across different values of q (Specification 2).

	q = 0.05	q = 0.25	q = 0.50	q = 0.75	q = 0.95	$q_{otm} = 0.99$
$β_{0}$ (intercept)	0.2486	0.4507	0.5790	0.6487**	0.7147**	0.7921
	(0.6830)	(0.5235)	(0.3964)	(0.3554)	(0.4319)	(0.5196)
$β_{1}$ (age)	−0.1070*	−0.0864*	−0.0631*	−0.0350*	−0.0013	0.0159**
	(0.0111)	(0.0089)	(0.0071)	(0.0064)	(0.0078)	(0.0093)
$β_{2}$ (educ)	0.1708*	0.1638*	0.1537*	0.1393*	0.1213*	0.1124*
	(0.0386)	(0.0293)	(0.0212)	(0.0167)	(0.0191)	(0.0227)
$β_{3}$ (chil6)	−1.7487*	−1.4588*	−1.1038*	−0.6507*	−0.0929	0.1946
	(0.2786)	(0.2301)	(0.1785)	(0.1354)	(0.1588)	(0.1985)
$β_{4}$ (chil618)	−0.0314	−0.0113	0.0017	0.0030	−0.0152	−0.0320
	(0.0613)	(0.0475)	(0.0362)	(0.0313)	(0.0377)	(0.0445)
$β_{5}$ (exper)	0.0974*	0.0814*	0.0625*	0.0383*	0.0061	−0.0123
	(0.0091)	(0.0074)	(0.0062)	(0.0061)	(0.0079)	(0.0093)
$κ_{0}$ (intercept)	−0.9522**	−1.0583*	−1.1658*	−1.2561*	−1.2605*	−1.1975*
	(0.5074)	(0.5335)	(0.5619)	(0.5809)	(0.5814)	(0.5912)
$κ_{1}$ (age)	0.0577*	0.0632*	0.0695*	0.0760*	0.0800*	0.0794*
	(0.0074)	(0.0081)	(0.0089)	(0.0097)	(0.0101)	(0.0102)
$κ_{2}$ (educ)	−0.0249	−0.0298	−0.0352	−0.0401	−0.0415	−0.0400
	(0.0262)	(0.0273)	(0.0283)	(0.0290)	(0.0284)	(0.0279)
$κ_{3}$ (chil6)	0.8337*	0.9299*	1.0337*	1.1305*	1.1777*	1.1655*
	(0.1434)	(0.1584)	(0.1752)	(0.1893)	(0.1944)	(0.1919)
$κ_{4}$ (chil618)	0.0401	0.0319	0.0194	−0.0003	−0.0282	−0.0417
	(0.0423)	(0.0448)	(0.0475)	(0.0504)	(0.0521)	(0.0521)
$κ_{5}$ (exper)	−0.0445*	−0.0497*	−0.0560*	−0.0638*	−0.0720*	−0.0750*
	(0.0061)	(0.0067)	(0.0074)	(0.0082)	(0.0090)	(0.0091)
ξ	−0.48	−0.48	−0.48	−0.48	−0.48	−0.48

Source: Elaborated by the authors based on PSID data.

* significant at 5% level. ** significant at 10% level.

From Table 5, the parameter estimates associated with the explanatory variables age, educ, chil6 and exper, which model the quantile, are statistically significant for at least one q, and those associated with age, chil6 and exper, which model the dispersion, are also statistically significant, indicating the presence of heteroscedasticity in the data and justifying the dispersion modeling. Thus, the reduced predictive log-PE quantile tobit model (q = 0.50) is given by (standard errors in parentheses)

T_{i} = \begin{cases} Ψ, & T_{i}^{*} \leq Ψ, \quad i = 1, \dots, 325, \\ T_{i}^{*} = {\hat{Q}}_{i} ϵ_{i}^{\sqrt{{\hat{ϕ}}_{i}}}, & T_{i}^{*} > Ψ, \quad i = 326, \dots, 753, \end{cases}

(17)

where

\begin{aligned} {\hat{Q}}_{i} & = \exp (\underset{(0.3189)}{0.8755} - \underset{(0.0072)}{0.0637} \mathrm{age}_{i} + \underset{(0.0158)}{0.1324} \mathrm{educ}_{i} - \underset{(0.1891)}{1.1339} \mathrm{chil6}_{i} + \underset{(0.0059)}{0.0600} \mathrm{exper}_{i}), \\ {\hat{ϕ}}_{i} & = \exp (- \underset{(0.3175)}{1.4206} + \underset{(0.0083)}{0.0698} \mathrm{age}_{i} + \underset{(0.1676)}{1.0394} \mathrm{chil6}_{i} - \underset{(0.0065)}{0.0594} \mathrm{exper}_{i}), \end{aligned}

and $\hat{ξ} = - 0.57$. The QQ plot with simulated envelope of the MT residual, shown in Figure 8(b), indicates that the log-PE model provides a good fit to the data; similar fits are found for other values of q.

Figure 8. — Histogram (a) for the observed women's hourly wages and QQ plot (b) and its envelope for the MT residual for the log-PE quantile tobit model (q = 0.50).

We can compare the proposed log-PE model to the classic normal tobit model and the Student-t tobit model, which were fitted by Barros et al. [2] using the same data set. We have $AIC = 2892.12$ and $BIC = 2919.87$ for the normal case, and $AIC = 2758.99$ and $BIC = 2786.73$ for the Student-t case. We highlight the superiority of the proposed models over the classic normal and Student-t tobit models fitted by Barros et al. [2] in three basic aspects. (i) The use of asymmetric distributions (log-symmetric distributions), which are more adequate for the PSID data; the classic normal and Student-t tobit models use symmetric distributions. (ii) The possibility of modeling the dispersion and the consequent accommodation of heteroscedasticity, which improves the fit. (iii) The modeling in terms of quantiles, which provides a richer characterization of the effects of the explanatory variables on the dependent variable; the classic normal and Student-t tobit models do not consider the quantile approach. We can also compare the results of the proposed models with those of Powell, Portnoy and Peng-Huang. Once again, we fit these models for each $q = 0.01, 0.02, \dots, 0.98, 0.99$, and then compute the averages of the corresponding RMSEs and MAEs of the predicted values. From Table 6, we observe good results for the log-PE model. The Powell, Portnoy and Peng-Huang procedures present estimation problems for different values of q, and the results for the Powell model diverge.

Table 6.

Means of the RMSEs and MAEs based on $q = 0.01, 0.02, \dots, 0.98, 0.99$ for the indicated models.

Measure	Log-PE	Powell	Portnoy	Peng-Huang
RMSE	3.9750	1.9640e+14	4.0309	3.8051
MAE	2.7777	8.1192e+13	3.1193	2.8907

6. Concluding remarks

In this paper, a class of quantile tobit models was proposed based on a reparameterization of the log-symmetric distributions. In such reparameterization, the quantile is one of the distribution parameters. The advantages of the proposed models over the classic tobit model include:

(i) flexibility to assume several asymmetric distributions, since the quantile-based log-symmetric class incorporates several distributions as special cases, such as log-normal, log-Student-t, log-power-exponential and extended Birnbaum-Saunders, among others;
(ii) greater flexibility in the analysis of the effects of explanatory variables on the dependent variable due to the quantile approach; and
(iii) ability to accommodate heteroscedasticity, since the proposed model allows for the inclusion of explanatory variables in the dispersion parameter.

A Monte Carlo simulation study was carried out to evaluate the performance of the maximum likelihood estimates. In general, the results showed good performance of the maximum likelihood estimates in terms of relative bias, relative root mean squared error and relative mean absolute error. Two applications to actual data from PNAD (Brazil) and PSID (United States) were carried out to illustrate the proposed methodology. The applications favored the use of the log-symmetric quantile tobit models over the classic tobit model, corroborating the advantages (i), (ii) and (iii) mentioned above. As another product of this paper, the authors are preparing an R package [33], which might be an important tool for professionals, researchers in the fields of economics and statistics, and data scientists, among others. A preliminary code for implementation of our method in R is available at GitHub through the link https://github.com/danubiacunha/quantiletobit.git.

As future research, the following lines might be explored:

to study some hypothesis and misspecification tests via Monte Carlo simulation; see Santos and Cribari-Neto [34];
to generalize the proposed models for the cases with right censoring or two-sided censoring; see Long [24, pp. 211–212];
to propose bivariate models; see Seung-Hoon [38];
to investigate multiple maxima as well as local and global maxima;
to investigate the behavior of the Wald, score, likelihood ratio and gradient tests; see Saulo et al. [35].

We are currently working on some of these issues and hope to report new findings in the near future.

Tables.

Table A1.

Coverage probability from simulated data and selected quantile tobit models.

			n = 25			n = 50			n = 100
Model	Censoring	Parameter	q = 0.10	q = 0.50	q = 0.90	q = 0.10	q = 0.50	q = 0.90	q = 0.10	q = 0.50	q = 0.90
Log-t	10%	$β_{0}$	81.20	78.00	71.00	92.20	91.20	89.00	93.60	92.20	89.60
		$β_{1}$	79.20	75.20	71.60	87.40	88.20	88.60	93.40	93.20	93.40
		$κ_{0}$	90.80	90.60	88.20	94.80	94.00	92.60	91.80	92.20	93.80
		$κ_{1}$	89.60	85.00	80.20	93.60	93.40	94.80	93.20	94.40	92.20
	40%	$β_{0}$	74.00	71.20	72.20	91.20	92.20	86.40	93.20	92.20	89.80
		$β_{1}$	73.20	64.20	74.60	85.80	86.60	83.80	93.80	94.20	93.20
		$κ_{0}$	76.40	84.00	90.20	93.60	91.80	94.40	91.80	93.60	95.00
		$κ_{1}$	78.80	78.00	84.00	94.00	93.20	90.80	92.40	93.60	92.60
Log-PE	10%	$β_{0}$	90.12	89.09	90.20	94.80	93.00	92.00	93.20	93.39	94.60
		$β_{1}$	82.26	83.43	86.33	93.80	93.80	92.80	92.40	94.19	93.80
		$κ_{0}$	92.54	92.74	94.29	93.60	93.60	93.20	94.40	95.60	95.20
		$κ_{1}$	91.53	89.11	94.49	94.80	92.20	92.80	95.80	92.60	92.60
	40%	$β_{0}$	89.00	86.59	87.25	95.40	93.80	91.20	92.80	93.00	94.40
		$β_{1}$	81.00	78.66	85.02	94.19	93.60	93.80	93.20	94.00	94.40
		$κ_{0}$	89.80	91.67	92.51	95.20	94.80	94.80	95.60	95.40	95.00
		$κ_{1}$	91.40	85.77	92.11	95.00	93.60	94.60	95.80	93.00	93.60

Table A2.

RB, RRMSE and RMAE from simulated data for the indicated cases.

				n = 25			n = 50			n = 100
	Censoring	q	Parameter	RB	RRMSE	RMAE	RB	RRMSE	RMAE	RB	RRMSE	RMAE
Log-t	10%	0.10	$β_{0}$	0.0580	0.4680	0.3750	−0.0042	0.2216	0.1742	−0.0003	0.1208	0.0958
			$β_{1}$	0.0327	0.1513	0.1171	0.0063	0.0379	0.0309	0.0038	0.0312	0.0249
			$κ_{0}$	−0.0854	0.2223	0.1773	−0.0260	0.1164	0.0934	−0.0119	0.0757	0.0608
			$κ_{1}$	0.0591	0.1725	0.1359	0.0273	0.1167	0.0913	0.0135	0.0806	0.0646
Powell			$β_{0}$	−0.2224	2.5671	1.1454	−0.0382	0.9985	0.5300	−0.1522	0.8199	0.3387
			$β_{1}$	−0.0515	0.4696	0.3191	−0.0466	0.2112	0.1352	0.0017	0.1440	0.0959
Portnoy			$β_{0}$	−0.6835	2.6454	1.1437	−0.5662	2.7080	0.7910	−0.1366	0.4889	0.2933
			$β_{1}$	0.1423	0.4835	0.2975	0.0483	0.3710	0.1574	−0.0286	0.1217	0.0892
Peng-Huang			$β_{0}$	−1.2078	4.9358	1.5976	−1.2277	6.2892	1.3935	−0.2538	0.6938	0.3609
			$β_{1}$	0.2013	0.8014	0.3958	0.1137	0.7966	0.2412	−0.0384	0.1571	0.1098
Log-t		0.50	$β_{0}$	−0.0059	0.3076	0.2419	−0.0082	0.1664	0.1310	0.0002	0.0604	0.0453
			$β_{1}$	0.0034	0.1128	0.0899	−0.0016	0.0514	0.0400	0.0012	0.0239	0.0191
			$κ_{0}$	−0.0793	0.2218	0.1776	−0.0081	0.1442	0.1154	−0.0119	0.1015	0.0807
			$κ_{1}$	0.1227	0.2782	0.2200	0.0463	0.1637	0.1293	0.0436	0.1112	0.0899
Powell			$β_{0}$	0.0133	0.5119	0.3943	0.0901	1.1402	0.3327	−0.0062	0.1844	0.1360
			$β_{1}$	−0.0088	0.1838	0.1394	−0.0115	0.1827	0.0785	0.0031	0.0700	0.0456
Portnoy			$β_{0}$	0.0232	0.5068	0.3883	−0.0024	0.2608	0.2090	0.0039	0.1600	0.1256
			$β_{1}$	−0.0089	0.1818	0.1387	−0.0033	0.0760	0.0581	0.0019	0.0513	0.0401
Peng-Huang			$β_{0}$	0.0531	0.5169	0.3978	−0.0148	0.2633	0.2091	−0.0235	0.1634	0.1257
			$β_{1}$	−0.0108	0.1803	0.1376	0.0029	0.0768	0.0595	0.0012	0.0516	0.0403
Log-t		0.90	$β_{0}$	−0.1216	0.5445	0.4352	−0.0355	0.1891	0.1522	−0.0153	0.1327	0.1051
			$β_{1}$	0.0062	0.1049	0.0846	0.0018	0.0392	0.0314	0.0001	0.0280	0.0224
			$κ_{0}$	−0.0681	0.2054	0.1621	−0.0243	0.1112	0.0884	−0.0021	0.0778	0.0614
			$κ_{1}$	0.1096	0.2413	0.1958	0.0508	0.1471	0.1172	0.0288	0.0821	0.0638
Powell			$β_{0}$	0.6479	2.3091	1.1293	0.1809	0.6641	0.4120	0.1122	0.4176	0.2936
			$β_{1}$	0.0714	0.5123	0.2955	0.0099	0.1379	0.0980	0.0028	0.1016	0.0751
Portnoy			$β_{0}$	0.7666	1.8913	1.1297	0.2676	0.7288	0.4717	0.1212	0.4234	0.2960
			$β_{1}$	0.0720	0.4845	0.3161	0.0104	0.1426	0.1036	0.0026	0.0988	0.0741
Peng-Huang			$β_{0}$	1.8024	3.5908	2.0110	0.2890	0.7779	0.4838	0.0901	0.4004	0.2859
			$β_{1}$	0.1390	0.7290	0.4684	0.0153	0.1480	0.1045	0.0012	0.0982	0.0729
Log-t	40%	0.10	$β_{0}$	−0.0762	0.7533	0.5757	−0.0348	0.3213	0.2479	0.0145	0.3928	0.3083
			$β_{1}$	0.0207	0.1314	0.1025	0.0143	0.0804	0.0623	0.0001	0.0879	0.0700
			$κ_{0}$	−0.0815	0.2778	0.2142	−0.0183	0.1439	0.1165	−0.0295	0.1130	0.0889
			$κ_{1}$	0.0578	0.2083	0.1586	0.0328	0.1559	0.1257	0.0222	0.1063	0.0850
Powell			$β_{0}$	−0.5188	4.5059	1.9175	−0.9452	5.1417	1.4365	0.4036	2.8785	1.7625
			$β_{1}$	0.0440	0.6426	0.3880	−0.1098	0.9502	0.5168	−0.4373	0.8079	0.6864
Portnoy			$β_{0}$	−0.5741	2.8342	1.5242	−0.5912	3.3235	0.9425	−1.0443	3.0364	1.5773
			$β_{1}$	0.1358	0.4130	0.2775	0.0602	0.5395	0.2696	0.1662	0.4545	0.3274
Peng-Huang			$β_{0}$	−2.7295	6.7772	3.2428	−2.8272	10.2338	3.1052	−2.0500	4.6851	2.4503
			$β_{1}$	0.4193	0.8157	0.4814	0.4064	1.4192	0.5347	0.3037	0.6310	0.4302
Log-t		0.50	$β_{0}$	−0.0610	0.3356	0.2442	−0.0151	0.1681	0.1318	−0.0034	0.1339	0.1041
			$β_{1}$	0.0067	0.0618	0.0455	0.0120	0.1017	0.0799	0.0027	0.0653	0.0511
			$κ_{0}$	−0.0722	0.2823	0.2225	−0.0397	0.1743	0.1381	−0.0293	0.1267	0.0986
			$κ_{1}$	0.1879	0.3893	0.3120	0.0602	0.1756	0.1406	0.0298	0.1417	0.1130
Powell			$β_{0}$	−0.0998	0.8234	0.5718	−0.1519	1.2939	0.4629	−0.0219	0.6335	0.2936
			$β_{1}$	0.0098	0.1995	0.1327	0.0754	0.7943	0.3016	0.0185	0.3173	0.1295
Portnoy			$β_{0}$	−0.1035	0.4767	0.3470	−0.0309	0.2578	0.2042	−0.0553	0.2548	0.1931
			$β_{1}$	0.0100	0.1073	0.0768	0.0109	0.1698	0.1311	0.0202	0.1020	0.0782
Peng-Huang			$β_{0}$	−0.1271	0.4896	0.3597	−0.0551	0.2656	0.2104	−0.1048	0.2760	0.2086
			$β_{1}$	0.0203	0.1126	0.0807	0.0243	0.1779	0.1340	0.0293	0.1063	0.0818
Log-t		0.90	$β_{0}$	−0.0977	0.4698	0.3752	−0.0534	0.2166	0.1739	−0.0232	0.1481	0.1181
			$β_{1}$	0.0201	0.1425	0.1127	0.0035	0.0549	0.0435	0.0049	0.0520	0.0406
			$κ_{0}$	−0.0490	0.2402	0.1935	−0.0134	0.1394	0.1095	−0.0136	0.0941	0.0735
			$κ_{1}$	0.1253	0.3044	0.2312	0.0770	0.1889	0.1474	0.0309	0.1140	0.0916
Powell			$β_{0}$	0.3990	1.3719	0.8616	0.3454	0.9561	0.6535	0.1000	0.4597	0.3284
			$β_{1}$	−0.1542	0.4091	0.2843	−0.0737	0.2134	0.1599	−0.0278	0.1843	0.1430
Portnoy			$β_{0}$	0.4222	1.5037	0.9012	0.2809	0.8129	0.5072	0.1186	0.4077	0.2916
			$β_{1}$	−0.0291	0.5690	0.3643	0.0054	0.1936	0.1338	0.0103	0.1475	0.1137
Peng-Huang			$β_{0}$	1.1683	2.7855	1.4775	0.2702	0.7869	0.5049	0.0681	0.3780	0.2734
			$β_{1}$	0.0192	0.7575	0.4694	0.0331	0.2331	0.1491	0.0154	0.1428	0.1077

Table A3.

RB, RRMSE and RMAE from simulated data for the indicated cases.

				n = 25			n = 50			n = 100
	Censoring	q	Parameter	RB	RRMSE	RMAE	RB	RRMSE	RMAE	RB	RRMSE	RMAE
Log-PE	10%	0.10	$β_{0}$	0.1384	1.0672	0.8466	−0.0013	0.7284	0.5754	0.0347	0.4928	0.3885
			$β_{1}$	0.0055	0.5653	0.4353	0.0414	0.4093	0.3277	0.0151	0.2528	0.1956
			$κ_{0}$	−0.0689	0.1922	0.1491	−0.0301	0.1327	0.1056	−0.0134	0.0840	0.0675
			$κ_{1}$	0.1852	0.6635	0.5181	0.0360	0.3166	0.2552	0.0323	0.2465	0.1988
Powell			$β_{0}$	−0.6721	2.7554	1.6954	−2.1507	7.1013	2.9568	−1.8793	6.9061	2.5035
			$β_{1}$	0.3991	1.5360	1.0998	−0.6379	3.6862	1.8388	0.0528	3.1405	1.3626
Portnoy			$β_{0}$	−0.5907	2.3943	1.5534	−0.9888	4.3779	1.6299	−1.2863	3.1860	1.6451
			$β_{1}$	0.5390	1.4984	1.1252	0.7231	2.4029	1.1043	0.7421	1.4551	0.9087
Peng-Huang			$β_{0}$	−1.3971	3.2524	2.0838	−2.1049	4.0427	2.4967	−2.0706	4.9947	2.3498
			$β_{1}$	0.7602	1.9042	1.5156	1.0617	2.3874	1.6791	0.8418	2.0581	1.1636
Log-PE		0.50	$β_{0}$	−0.0444	0.7616	0.6071	−0.0818	0.5009	0.4013	−0.0139	0.3214	0.2503
			$β_{1}$	−0.0181	0.5756	0.4433	0.0367	0.4118	0.3309	0.0166	0.2501	0.1951
			$κ_{0}$	−0.0557	0.1828	0.1429	−0.0283	0.1295	0.1024	−0.0091	0.0856	0.0681
			$κ_{1}$	0.2246	0.9145	0.7502	0.0815	0.5129	0.4007	0.0628	0.3909	0.3107
Powell			$β_{0}$	−0.0334	0.8392	0.6640	−0.0729	0.5721	0.4580	0.0074	0.3720	0.2901
			$β_{1}$	−0.0269	0.6266	0.4776	0.0243	0.4838	0.3887	0.0023	0.2936	0.2296
Portnoy			$β_{0}$	−0.0221	0.8345	0.6613	−0.0729	0.5632	0.4475	0.0053	0.3701	0.2867
			$β_{1}$	−0.0284	0.6280	0.4748	0.0224	0.4848	0.3881	0.0022	0.2886	0.2263
Peng-Huang			$β_{0}$	0.0219	0.8361	0.6641	−0.0958	0.5800	0.4630	−0.0494	0.3739	0.2935
			$β_{1}$	−0.0254	0.6389	0.4854	0.0257	0.4882	0.3910	0.0023	0.2928	0.2304
Log-PE		0.90	$β_{0}$	−0.2088	0.9744	0.8001	−0.1864	0.7440	0.5903	−0.0650	0.4539	0.3655
			$β_{1}$	−0.0297	0.5144	0.4067	0.0382	0.4134	0.3312	0.0166	0.2500	0.1949
			$κ_{0}$	−0.0556	0.1747	0.1386	−0.0320	0.1294	0.1032	−0.0124	0.0828	0.0665
			$κ_{1}$	0.1602	0.5975	0.4830	0.0937	0.3743	0.3012	0.0438	0.2648	0.2117
Powell			$β_{0}$	−0.0805	1.4013	1.1274	−0.0805	0.9817	0.7972	−0.0028	0.6137	0.4898
			$β_{1}$	−0.0823	0.9390	0.7299	0.1319	0.8866	0.7013	−0.0070	0.5119	0.4103
Portnoy			$β_{0}$	−0.0738	1.2981	1.0494	−0.0280	0.9489	0.7748	0.0042	0.6218	0.4932
			$β_{1}$	−0.0719	0.9187	0.7180	0.1187	0.8450	0.6777	−0.0125	0.5050	0.4036
Peng-Huang			$β_{0}$	0.6397	1.6876	1.2664	0.0731	1.0289	0.8317	−0.0308	0.6166	0.4935
			$β_{1}$	−0.0852	1.1035	0.8781	0.1231	0.8790	0.7030	−0.0074	0.5086	0.4073
Log-PE	40%	0.10	$β_{0}$	0.3619	1.4767	1.1750	−0.0389	0.9883	0.7781	0.0467	0.6877	0.5514
			$β_{1}$	−0.0080	0.6619	0.5297	0.0737	0.4716	0.3684	0.0088	0.2946	0.2321
			$κ_{0}$	−0.1399	0.2890	0.2210	−0.0377	0.1631	0.1286	−0.0204	0.1105	0.0887
			$κ_{1}$	0.3149	0.8399	0.6308	0.0464	0.3414	0.2701	0.0418	0.2655	0.2110
Powell			$β_{0}$	2.4789	3.4527	3.1471	2.6180	3.5536	3.3332	2.2081	5.6156	3.9753
			$β_{1}$	−0.2191	1.2845	1.0825	−0.3330	1.3566	1.1505	−0.4379	1.8923	1.2370
Portnoy			$β_{0}$	1.6025	3.2917	2.8021	−0.7189	5.4671	3.3247	−2.1865	7.5831	4.1709
			$β_{1}$	0.6165	1.7084	1.2353	1.2614	3.3669	2.1392	1.4990	2.8490	1.7677
Peng-Huang			$β_{0}$	−0.6804	3.2222	2.3646	−1.9252	4.3854	2.7560	−10.7373	15.5892	11.1410
			$β_{1}$	1.8081	2.4192	1.9235	2.1521	3.2258	2.4718	4.3267	5.6656	4.3716
Log-PE		0.50	$β_{0}$	−0.1155	0.8514	0.6769	−0.1329	0.5649	0.4492	−0.0212	0.3616	0.2895
			$β_{1}$	0.0257	0.6601	0.5114	0.0705	0.4685	0.3689	0.0145	0.2820	0.2208
			$κ_{0}$	−0.0929	0.2470	0.1924	−0.0313	0.1558	0.1251	−0.0165	0.1086	0.0861
			$κ_{1}$	0.3852	1.3960	1.1080	0.0434	0.6482	0.4967	0.0561	0.4511	0.3569
Powell			$β_{0}$	−0.5423	2.0553	1.1863	−0.4349	1.6693	0.8268	−0.3318	1.8591	0.6498
			$β_{1}$	0.2307	1.2243	0.8322	0.2188	1.0260	0.6315	0.1722	0.8218	0.4464
Portnoy			$β_{0}$	−0.4457	1.9939	1.0416	−0.1954	0.7329	0.5408	−0.0670	0.4447	0.3435
			$β_{1}$	0.2215	1.3833	0.7818	0.1029	0.6299	0.4654	0.0404	0.3361	0.2606
Peng-Huang			$β_{0}$	−0.5049	3.8184	1.1518	−0.2208	0.7274	0.5590	−0.1355	0.4849	0.3693
			$β_{1}$	0.3089	2.3524	0.8460	0.0986	0.6380	0.4791	0.0528	0.3607	0.2824
Log-PE		0.90	$β_{0}$	−0.3588	1.1051	0.9109	−0.1942	0.7816	0.6194	−0.0810	0.4818	0.3868
			$β_{1}$	0.0033	0.5826	0.4480	0.0694	0.4464	0.3497	0.0268	0.2703	0.2152
			$κ_{0}$	−0.0650	0.2296	0.1751	−0.0220	0.1489	0.1189	−0.0111	0.1059	0.0854
			$κ_{1}$	0.2182	0.8758	0.6678	0.0819	0.4809	0.3823	0.0515	0.3141	0.2491
Powell			$β_{0}$	−0.0360	1.3816	1.1077	−0.0740	0.9848	0.7957	0.0012	0.6121	0.4880
			$β_{1}$	−0.1382	0.8689	0.6732	0.0968	0.8490	0.6661	−0.0106	0.5058	0.4068
Portnoy			$β_{0}$	−0.1084	1.3262	1.0780	−0.0291	0.9586	0.7788	0.0043	0.6225	0.4932
			$β_{1}$	−0.0166	1.0442	0.7869	0.1083	0.8425	0.6677	−0.0131	0.5044	0.4031
Peng-Huang			$β_{0}$	0.5940	1.7319	1.3086	0.0572	1.0351	0.8324	−0.0308	0.6137	0.4918
			$β_{1}$	0.0175	1.2963	0.9787	0.1286	0.8952	0.7081	−0.0040	0.5090	0.4063


Table A4.

RB, RMSE and RMAE from simulated data for the indicated cases (censoring = 25%).

			n = 25			n = 50			n = 100
	$q$	Parameter	RB	RRMSE	RMAE	RB	RRMSE	RMAE	RB	RRMSE	RMAE
Log-NO	0.10	$β_{0}$	−0.0225	0.0449	0.0336	−0.0189	0.0323	0.0256	−0.0643	0.0672	0.0645
		$β_{1}$	0.0003	0.0227	0.0180	−0.0008	0.0165	0.0133	−0.0136	0.0191	0.0155
Log-t		$β_{0}$	0.0026	0.0579	0.0455	0.0115	0.0345	0.0272	0.0244	0.0343	0.0289
		$β_{1}$	−0.0019	0.0240	0.0195	0.0005	0.0175	0.0141	0.0005	0.0093	0.0075
Log-PE		$β_{0}$	−0.0749	0.0838	0.0750	−0.0732	0.0837	0.0752	−0.1294	0.1358	0.1295
		$β_{1}$	−0.0052	0.0264	0.0215	−0.0095	0.0283	0.0204	−0.0117	0.0242	0.0190
EBS		$β_{0}$	−0.0932	0.1017	0.0932	−0.0937	0.1008	0.0937	−0.1685	0.1767	0.1685
		$β_{1}$	−0.0106	0.0266	0.0220	0.0025	0.0192	0.0156	−0.0080	0.0211	0.0170
Powell		$β_{0}$	0.0028	0.0852	0.0642	0.0000	0.0587	0.0469	0.0054	0.0409	0.0333
		$β_{1}$	−0.0020	0.0565	0.0435	−0.0000	0.0415	0.0327	−0.0042	0.0255	0.0198
Portnoy		$β_{0}$	−0.0313	0.1144	0.0764	−0.0158	0.0670	0.0496	−0.0008	0.0404	0.0326
		$β_{1}$	0.0131	0.0551	0.0421	0.0072	0.0396	0.0318	−0.0021	0.0248	0.0199
Peng-Huang		$β_{0}$	−0.0361	0.1182	0.0807	−0.0286	0.0834	0.0579	−0.0120	0.0448	0.0345
		$β_{1}$	0.0100	0.0649	0.0482	0.0084	0.0445	0.0350	−0.0020	0.0269	0.0215
Log-NO	0.30	$β_{0}$	−0.0399	0.0539	0.0434	−0.0330	0.0404	0.0347	−0.0475	0.0512	0.0478
		$β_{1}$	−0.0002	0.0200	0.0157	−0.0065	0.0205	0.0168	−0.0003	0.0124	0.0098
Log-t		$β_{0}$	0.0205	0.0445	0.0350	0.0158	0.0325	0.0261	−0.0021	0.0192	0.0150
		$β_{1}$	−0.0021	0.0237	0.0193	0.0007	0.0168	0.0136	0.0004	0.0092	0.0075
Log-PE		$β_{0}$	−0.0339	0.0504	0.0398	−0.0254	0.0377	0.0316	−0.0424	0.0488	0.0429
		$β_{1}$	0.0013	0.0239	0.0196	−0.0031	0.0190	0.0152	−0.0048	0.0147	0.0119
EBS		$β_{0}$	−0.0703	0.0815	0.0715	−0.0669	0.0766	0.0681	−0.0980	0.1002	0.0980
		$β_{1}$	−0.0064	0.0256	0.0206	−0.0028	0.0189	0.0153	0.0090	0.0141	0.0115
Powell		$β_{0}$	−0.0118	0.0490	0.0381	−0.0067	0.0358	0.0278	0.0019	0.0237	0.0189
		$β_{1}$	0.0029	0.0284	0.0230	0.0022	0.0251	0.0203	−0.0026	0.0148	0.0122
Portnoy		$β_{0}$	−0.0181	0.0548	0.0404	−0.0069	0.0346	0.0273	−0.0009	0.0228	0.0180
		$β_{1}$	0.0058	0.0303	0.0241	0.0019	0.0240	0.0195	−0.0011	0.0140	0.0115
Peng-Huang		$β_{0}$	−0.0190	0.0582	0.0420	−0.0104	0.0370	0.0292	−0.0050	0.0247	0.0196
		$β_{1}$	0.0065	0.0317	0.0253	0.0028	0.0249	0.0200	−0.0008	0.0145	0.0119
Log-NO	0.70	$β_{0}$	−0.0084	0.0291	0.0230	0.0019	0.0237	0.0191	0.0273	0.0371	0.0317
		$β_{1}$	−0.0001	0.0222	0.0178	−0.0052	0.0194	0.0154	−0.0118	0.0183	0.0154
Log-t		$β_{0}$	−0.0056	0.0322	0.0254	−0.0051	0.0237	0.0190	−0.0078	0.0159	0.0126
		$β_{1}$	−0.0025	0.0241	0.0195	0.0008	0.0165	0.0133	0.0004	0.0092	0.0075
Log-PE		$β_{0}$	0.0122	0.0348	0.0276	0.0213	0.0357	0.0291	0.0376	0.0446	0.0384
		$β_{1}$	0.0016	0.0226	0.0184	−0.0086	0.0258	0.0206	−0.0037	0.0164	0.0128
EBS		$β_{0}$	0.0055	0.0355	0.0281	0.0234	0.0426	0.0316	0.0317	0.0382	0.0326
		$β_{1}$	−0.0057	0.0257	0.0208	−0.0086	0.0306	0.0227	0.0033	0.0172	0.0137
Powell		$β_{0}$	−0.0067	0.0310	0.0238	−0.0030	0.0236	0.0184	−0.0015	0.0154	0.0121
		$β_{1}$	−0.0013	0.0227	0.0182	0.0015	0.0170	0.0136	−0.0003	0.0096	0.0076
Portnoy		$β_{0}$	−0.0051	0.0282	0.0211	−0.0022	0.0224	0.0177	−0.0020	0.0156	0.0123
		$β_{1}$	−0.0004	0.0203	0.0165	0.0017	0.0163	0.0130	0.0004	0.0094	0.0075
Peng-Huang		$β_{0}$	−0.0011	0.0283	0.0220	−0.0028	0.0233	0.0183	−0.0042	0.0164	0.0129
		$β_{1}$	−0.0001	0.0212	0.0171	0.0026	0.0166	0.0133	0.0010	0.0098	0.0079
Log-NO	0.90	$β_{0}$	0.0282	0.0394	0.0327	0.0361	0.0420	0.0372	0.0666	0.0717	0.0666
		$β_{1}$	−0.0040	0.0218	0.0176	−0.0030	0.0171	0.0137	−0.0018	0.0148	0.0119
Log-t		$β_{0}$	0.0477	0.0643	0.0540	0.0268	0.0408	0.0323	0.0133	0.0207	0.0169
		$β_{1}$	−0.0015	0.0236	0.0191	0.0008	0.0162	0.0130	0.0003	0.0092	0.0075
Log-PE		$β_{0}$	0.1032	0.1102	0.1037	0.1166	0.1198	0.1166	0.1631	0.1648	0.1631
		$β_{1}$	−0.0022	0.0248	0.0200	−0.0042	0.0194	0.0151	0.0018	0.0146	0.0111
EBS		$β_{0}$	0.0566	0.0663	0.0579	0.0820	0.0864	0.0820	0.1154	0.1195	0.1154
		$β_{1}$	−0.0037	0.0241	0.0188	−0.0049	0.0230	0.0178	−0.0114	0.0296	0.0269
Powell		$β_{0}$	−0.0026	0.0310	0.0235	−0.0027	0.0240	0.0193	−0.0023	0.0158	0.0128
		$β_{1}$	−0.0056	0.0245	0.0182	−0.0027	0.0181	0.0140	0.0001	0.0105	0.0082
Portnoy		$β_{0}$	−0.0038	0.0269	0.0210	−0.0017	0.0218	0.0175	−0.0013	0.0147	0.0119
		$β_{1}$	−0.0033	0.0217	0.0167	−0.0007	0.0152	0.0121	0.0003	0.0102	0.0081
Peng-Huang		$β_{0}$	0.0056	0.0294	0.0238	−0.0032	0.0238	0.0188	−0.0046	0.0156	0.0126
		$β_{1}$	−0.0023	0.0234	0.0176	0.0020	0.0167	0.0133	0.0022	0.0103	0.0082
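
The appendix tables report summary measures without restating how they are computed. As a rough illustration only, the sketch below computes a relative bias (RB), a relative root mean squared error (RRMSE) and a relative mean absolute error (RMAE) from Monte Carlo replicates. The definitions used here (each error divided by the true parameter value) are an assumption, as are the function name `relative_metrics` and the simulated replicates; the values do not reproduce any entry of the tables.

```python
import math
import random


def relative_metrics(estimates, true_value):
    """Return (RB, RRMSE, RMAE) for a list of Monte Carlo estimates.

    Assumed definitions: every error is scaled by the true parameter
    value before averaging. The paper's exact formulas may differ.
    """
    if true_value == 0:
        raise ValueError("relative metrics are undefined for a zero true value")
    if not estimates:
        raise ValueError("at least one estimate is required")
    rel_err = [(est - true_value) / true_value for est in estimates]
    n = len(rel_err)
    rb = sum(rel_err) / n                               # relative bias
    rrmse = math.sqrt(sum(e * e for e in rel_err) / n)  # relative RMSE
    rmae = sum(abs(e) for e in rel_err) / n             # relative MAE
    return rb, rrmse, rmae


# Hypothetical replicates: 1000 noisy estimates of a coefficient equal to 2.0.
random.seed(0)
replicates = [2.0 + random.gauss(0.0, 0.1) for _ in range(1000)]
rb, rrmse, rmae = relative_metrics(replicates, 2.0)
print(f"RB={rb:.4f}  RRMSE={rrmse:.4f}  RMAE={rmae:.4f}")
```

Under these assumed definitions, the absolute value of RB can never exceed RMAE, and RMAE can never exceed RRMSE, which gives a quick sanity check on any row computed this way.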


Funding Statement

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brazil (CAPES) Finance Code 001. Danúbia Rodrigues thanks CAPES-PROEX for the doctorate scholarship. Jose Angelo Divino and Helton Saulo acknowledge Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for financial support.

Notes

Available at https://www.ibge.gov.br/

https://www.ibge.gov.br/.

We do not use the most recent Continuous PNAD data set because of a lack of clarity in some variables of interest in its dictionary. This is the case for the labor experience/skill variable, which is the number of years of work in the main activity, and for marital status, which is also not clearly defined in the Continuous PNAD. It is worth mentioning that the variables of the two samples, PNAD and PSID, need not be similar. We only need a censored variable for female hours worked. The 2015 PNAD data set is used only to illustrate the proposed methodology.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Amemiya T., Tobit models: A survey, J. Econom. 24 (1984), pp. 3–61.
2. Barros M., Galea M., Leiva V., and Santos-Neto M., Generalized tobit models: diagnostics and application in econometrics, J. Appl. Stat. 45 (2018), pp. 145–167.
3. Barros R., Jatobá J., and Mendonça R., A evolução da participação de mulheres no mercado de trabalho: uma análise de decomposição, Proceedings of the 4th National Meeting of Labor Studies, Brazilian Association of Labor Studies, 1995.
4. Cox D. and Hinkley D., Theoretical Statistics, Chapman and Hall, London, 1974.
5. Davino C., Furno M., and Vistocco D., Quantile Regression, Wiley, Chichester, 2014.
6. Desousa M.F., Saulo H., Leiva V., and Scalco P., On a tobit-Birnbaum-Saunders model with an application to medical data, J. Appl. Stat. 45 (2018), pp. 932–955.
7. Fair R., A note on computation of the tobit estimator, Econometrica 45 (1977), pp. 1723–1727.
8. Fair R., A theory of extramarital affairs, J. Political Econom. 86 (1978), pp. 45–61.
9. Geraci M., Qtools: A collection of models and tools for quantile inference, R J. 8 (2016), pp. 117–138.
10. Gilchrist W., Statistical Modelling with Quantile Functions, 1st ed., Chapman & Hall/CRC, Boca Raton, FL, 2000.
11. Greene W.H., Econometric Analysis, 7th ed., Pearson Education, New York, NY, 2012.
12. Hao L. and Naiman D., Quantile Regression, Sage Publications, California, 2007.
13. Heckman J.J. and MaCurdy T.E., A life cycle model of female labor supply, Rev. Econom. Stud. 47 (1980), pp. 47–74.
14. Helsel D.R., Statistics for Censored Environmental Data Using Minitab and R, John Wiley & Sons, Hoboken, NJ, 2011.
15. ILO, World Employment and Social Outlook: Trends for Women 2018 - Global snapshot, International Labour Organization, Geneva, 2018.
16. Islam N., A dynamic tobit model of female labor supply, Working Papers in Economics No. 259, 2007, pp. 1–29.
17. Jacobsen J., Labor force participation, Q. Rev. Econ. Financ. 39 (1999), pp. 597–610.
18. Jarque C., An application of LDV models to household expenditure analysis in Mexico, J. Econom. 36 (1987), pp. 31–53.
19. Jones M.C., On reciprocal symmetry, J. Stat. Plan. Inference 138 (2008), pp. 3039–3043.
20. Kano Y., Berkane M., and Bentler P.M., Statistical inference based on pseudo-maximum likelihood estimators in elliptical populations, J. Am. Stat. Assoc. 88 (1993), pp. 135–143.
21. Koenker R., Quantile Regression, Cambridge University Press, Cambridge, 2005.
22. Koenker R., quantreg: Quantile Regression, R package version 5.85, 2021.
23. Koenker R. and Bassett Jr G., Regression quantiles, Econometrica 46 (1978), pp. 33–50.
24. Long J.S., Regression Models for Categorical and Limited Dependent Variables, Sage Publications Inc., Thousand Oaks, 1997.
25. Lucas A., Robustness of the Student t based M-estimator, Commun. Stat.: Theor. Meth. 41 (1997), pp. 1165–1182.
26. Medeiros M.C. and Ferrari S.L.P., Small-sample testing inference in symmetric and log-symmetric linear regression models, Stat. Neerl. 71 (2017), pp. 200–224.
27. Melenberg B. and van Soest A., Parametric and semi-parametric modelling of vacation expenditures, J. Appl. Econom. 11 (1996), pp. 59–76.
28. Moffitt R., The tobit model, hours of work and institutional constraints, Rev. Econ. Stat. 64 (1982), pp. 510–515.
29. Mroz T., The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions, Econometrica 55 (1987), pp. 765–799.
30. Peng L. and Huang Y., Survival analysis with quantile regression models, J. Amer. Stat. Assoc. 103 (2008), pp. 637–649.
31. Portnoy S., Censored regression quantiles, J. Amer. Stat. Assoc. 98 (2003), pp. 1001–1012.
32. Powell J.L., Censored regression quantiles, J. Econom. 32 (1986), pp. 143–155.
33. R Core Team, R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, 2020.
34. Santos J. and Cribari-Neto F., Hypothesis testing in log-Birnbaum-Saunders regressions, Commun. Stat. – Simul. Comput. 46 (2017), pp. 3990–4003.
35. Saulo H., Dasilva A., Leiva V., Sánchez L., and Fuente-Mella H.L., Log-symmetric quantile regression models, Stat. Neerl. (2021a), pp. 1–40. doi:10.1111/stan.12243
36. Saulo H., Leão J., Nobre L., and Balakrishnan N., A class of asymmetric regression models for left-censored data, Brazilian J. Probab. Stat. 35 (2021b), pp. 62–84.
37. Scorzafave L.G. and Menezes-Filho N.A., Participação feminina no mercado de trabalho brasileiro: evolução e determinantes, Pesquisa e Planejamento Econômico 31 (2001), pp. 441–478.
38. Seung-Hoon Y., Analysing household bottled water and water purifier expenditures: simultaneous equation bivariate tobit model, Appl. Econ. Lett. 12 (2005), pp. 297–301.
39. Silva G.O., Ortega E.M., and Cordeiro G.M., A log-extended Weibull regression model, Comput. Stat. Data Anal. 53 (2009), pp. 4482–4489.
40. Stute W., Strong consistency of the MLE under random censoring, Metrika 39 (1992), pp. 257–267.
41. Therneau T., Grambsch P., and Fleming T., Martingale-based residuals for survival models, Biometrika 77 (1990), pp. 147–160.
42. Tobin J., Estimation of relationships for limited dependent variables, Econometrica 26 (1958), pp. 24–36.
43. Vanegas L.H. and Paula G.A., A semiparametric approach for joint modeling of median and skewness, Test 24 (2015), pp. 110–135.
44. Vanegas L.H. and Paula G.A., Log-symmetric distributions: statistical properties and parameter estimation, Brazilian J. Probab. Stat. 30 (2016a), pp. 196–220.
45. Vanegas L.H. and Paula G.A., ssym: Fitting Semi-Parametric Log-Symmetric Regression Models, 2016b. R package version 1.5.7.
46. Vanegas L.H. and Paula G.A., Log-symmetric regression models under the presence of non-informative left- or right-censored observations, Test 26 (2017), pp. 405–428.
47. Weisberg S., Applied Linear Regression, 4th ed., John Wiley & Sons, Hoboken, NJ, 2014.
