Abstract
This paper introduces an approach to select the bandwidth or smoothing parameter in multiresolution (MR) density estimation and nonparametric density estimation. It is based on the evolution of the second, third and fourth central moments and the shape of the estimated densities for different bandwidths and resolution levels. The proposed method has been applied to density estimation by means of multiresolution densities (MRDE) as well as to kernel density estimation (KDE). The results of the simulations and the empirical application indicate that, for multimodal densities, the level of resolution selected by the moments method performs better than the one selected by the Bayesian Information Criterion (BIC) for multiresolution density estimation and by the plug-in method for kernel density estimation.
KEYWORDS: Multiresolution density estimation, kernel density estimation, bandwidth, moments and level of resolution
1. Introduction
This paper develops a novel and straightforward approach to select the bandwidth or smoothing parameter in the estimation of multiresolution models¹ and in nonparametric density estimation. It is based on the moments and the shape of the estimated densities for different resolution levels and bandwidths. The method is applied to density estimation by means of multiresolution densities [MRDE; see refs 12,13,14] as well as kernel functions [KDE; see for instance, refs 18,5].
Our choice of the moments for the selection of the smoothing parameter was guided by the observed changes in the shape of the estimated MRDE when the level of resolution, $j$, varies. If the level of resolution is too low, the density is smooth but the bias is large. Conversely, a large $j$ leads to a rougher density but a small bias. Similar results are observed² when a density is fitted by the kernel method. Let us suppose a bandwidth equal to $h = 2^{-j}$ with $j$ an integer. If $j$ is too small then $h$ is too large and the result is a smooth and biased density. As $j$ increases, $h$ decreases and the bias tends to diminish, but beyond a certain value of $j$, or its corresponding $h$, an undesirable roughness appears. To solve the problem, we need to select a value of $j$, or its corresponding smoothing parameter $h$, so that the bias is reasonably small without incurring excessive roughness.
In both methodologies, MRDE and KDE, the bias of the estimated density shows up clearly in its dispersion and shape, especially in the flattening of the fitted density or, equivalently, in an underestimation of the kurtosis that evolves toward more reasonable values as the resolution level increases. That is, the evolution of the bias is related to the central moments of order 2, 3 and 4, since they are used to compute dispersion, asymmetry and kurtosis. When the resolution level $j$ in the MRDE increases, or the smoothing parameter $h$ in the KDE decreases, the flattening and the dispersion tend to stabilize, indicating that the bias is small. Beyond a certain level of resolution the roughness begins to increase, which indicates the value of $j$ or $h$ that should be selected. This value leads to a sufficiently smooth density with small bias. This is clearly shown in the graphs of Sections 3, 4 and 5, which represent the evolution, as a function of $j$, of the expected value, the variance, and the skewness and kurtosis coefficients for MRDE and KDE.
The rest of the paper is organized as follows. Section 2 introduces the mathematical expressions used to calculate the moments of a multiresolution density. Section 3 shows, by means of simulations, how to select the level of resolution using the moments of an MRDE. Section 4 extends the approach to the kernel method. In both estimation methods, we use the cubic box spline function defined in Section 2.1. For the MRDE method, this is the scaling function generating the multiresolution analysis structure that contains the MR densities and their estimates. In the KDE method, this function is used as the kernel. Section 5 contains an application to real data and Section 6 concludes.
Finally, we want to point out that the MRDE is a technique devised for massive data. Therefore, everything that follows must be understood in a context of large sample sizes.
2. Moments calculation for a MRDE
This section gives the expressions of the central and non-central moments of a multiresolution density. The second, third and fourth central moments will be used to select the level of resolution of an MRDE. The mathematical development is contained in Appendices A1–A6.
2.1. Multiresolution densities
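Let $\theta(x)$ be a symmetric density with mean zero and compact support $[-2, 2]$, known as the cubic box spline. It is given by
$$\theta(x) = \frac{1}{6}\sum_{i=0}^{4}(-1)^{i}\binom{4}{i}\,(x+2-i)_{+}^{3}$$
where:
$$(u)_{+} = \max(u, 0) \qquad (1)$$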
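Applying dilations and translations to the density $\theta$, the densities $\theta_{j,k}$ are built as follows [12]:
$$\theta_{j,k}(x) = 2^{j}\,\theta(2^{j}x - k) \qquad (2)$$
where $j, k \in \mathbb{Z}$. Note that $\theta_{0,0}(x) = \theta(x)$.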
For each level of resolution $j$, the following MR densities are defined:
$$f_j(x) = \sum_{k\in\mathbb{Z}} a_{j,k}\,\theta_{j,k}(x) \qquad (3)$$
In expression (3), $a_{j,k} \ge 0$ and $\sum_{k} a_{j,k} = 1$. By definition, all these functions belong to the space $V_j$ of the multiresolution analysis (MRA) structure defined by the scaling function $\theta$ [7,23,11]. Any square-integrable density, that is, any density belonging to the Hilbert space $L^2(\mathbb{R})$, has an approximation in each $V_j$.
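The definitions of this subsection can be sketched in Python as follows. This is a minimal illustration, assuming that the cubic box spline is the standard cubic B-spline supported on [-2, 2] and that the dilated and translated densities have the form 2^j θ(2^j x − k); the function names are hypothetical and do not come from the supplemental macro.

```python
import numpy as np

def cubic_box_spline(x):
    """Cubic B-spline centred at 0 with support [-2, 2]; it integrates to 1."""
    ax = np.abs(np.asarray(x, dtype=float))
    inner = 2.0 / 3.0 - ax**2 + 0.5 * ax**3      # branch for |x| <= 1
    outer = (2.0 - ax) ** 3 / 6.0                # branch for 1 < |x| <= 2
    return np.where(ax <= 1.0, inner, np.where(ax <= 2.0, outer, 0.0))

def theta_jk(x, j, k):
    """Dilated and translated density 2^j * theta(2^j x - k) (assumed form)."""
    scale = 2.0 ** j
    return scale * cubic_box_spline(scale * np.asarray(x, dtype=float) - k)

def mr_density(x, j, coeffs):
    """Evaluate sum_k a_k * theta_{j,k}(x).

    coeffs maps each integer k to its weight a_k; the weights are expected
    to be non-negative and to sum to one.
    """
    x = np.asarray(x, dtype=float)
    return sum(a * theta_jk(x, j, k) for k, a in coeffs.items())
```

For instance, `mr_density(grid, 1, {0: 0.25, 3: 0.75})` evaluates a two-component MR density at level 1 on an array `grid`.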
In the definition of a multiresolution structure, dilations and translations of given by are used. It is obvious that and using this notation the MR density defined by (3) can be rewritten as follows:
2.2. Moments of a random variable with MR density function
The central moment of order for a multiresolution (MR) density is calculated as follows:
where , . By definition, the expression is the moment of order of the density (see Appendices A1 and A2). That is:
The calculus of is in proposition 4 in Appendix A1.
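As a worked check, independent of the polynomial integration used in the appendix, the low-order moments of the cubic box spline can be obtained from the known fact that the standard cubic B-spline on $[-2, 2]$ is the density of a sum of four independent uniform variables $U_1,\dots,U_4$ on $[-1/2, 1/2]$. The symbol $\alpha_r$ for the moment of order $r$ is notation assumed here, and the identification with the standard B-spline is likewise an assumption.

```latex
\begin{aligned}
E[U_1^2] &= \frac{1}{12}, \qquad E[U_1^4] = \frac{1}{80},\\
\alpha_2 &= \operatorname{Var}\Big(\sum_{i=1}^{4}U_i\Big) = 4\cdot\frac{1}{12} = \frac{1}{3},\\
\alpha_4 &= 4\,E[U_1^4] + 3\cdot 4\cdot 3\,\big(E[U_1^2]\big)^2
          = \frac{4}{80} + \frac{36}{144} = \frac{3}{10},\\
\alpha_1 &= \alpha_3 = 0 \quad\text{(by symmetry)}.
\end{aligned}
```

Under these assumptions the kurtosis of the spline is $\alpha_4/\alpha_2^2 = 2.7$, slightly below the Gaussian value of 3.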
Given a MR density the non-central moment of order is defined by:
It can be obtained as follows (see Appendix A4):
where
2.3. Asymptotic properties of the moments of an estimated MR density
Proposition 2.1:
Given a sample and an estimated MR density for that sample:
for a value of large enough is verified:
where
Proof :
See Appendix A6.
Proposition 2.2:
Let be
the non-central moment of the estimated MR density from the sample .
If converges to infinity, then converges to its sample counterpart. That is:
Proof :
See Appendix A6
Since each central moment of order $r$ is a continuous (polynomial) function of the non-central moments of order less than or equal to $r$, we can state that, as $j$ approaches infinity, the central moments of an MR density estimated from a sample of size $n$ also converge to the central moments of the sample.
3. Moments method for selecting the resolution level to estimate a MR density
In this section we introduce an alternative to the Bayesian Information Criterion [17] for selecting the level of resolution when estimating an MR density. It is based on the central moments of orders two, three and four and on the skewness and kurtosis coefficients. When an MR density is estimated, if the resolution level is too low, the estimate is a smooth curve but it has excessive bias. Conversely, if the resolution level is too high, the bias of the estimate is small but the roughness is large. In practice, the bias shows up mainly as an excessive dispersion and a flat density. Since the flattening can be measured by the Fisher coefficient of kurtosis, the evolution of the bias as the resolution level increases should be reflected in the gradual decrease of the central moments of order 2 and order 4. Based on these moments, we choose an appropriate resolution level so that the bias is acceptable and the roughness of the estimator is not excessive.
Since we are going to draw comparisons with the BIC, let us introduce it briefly in the context of MR densities. Any estimate based on an MR density, for a finite sample $x_1, \dots, x_n$ and any resolution level $j$, can be considered a finite mixture of the dilated and translated cubic box splines $\theta_{j,k}$ (see Section 2.1) of the form:
$$\hat f_j(x) = \sum_{k} \hat a_{j,k}\,\theta_{j,k}(x) \qquad (4)$$
where $\hat a_{j,k}$ is the proportion of data within the interval $\left[\frac{k-1/2}{2^{j}}, \frac{k+1/2}{2^{j}}\right)$.
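The coefficients of the estimator can be sketched as a simple frequency count. The code below is an illustration only: it assumes that each observation is assigned to the integer k closest to 2^j x, with half-open intervals so that ties are assigned upwards, and the function name is hypothetical.

```python
import math
from collections import Counter

def mrde_coefficients(sample, j):
    """Return {k: proportion of sample values whose closest grid index is k}.

    floor(u + 0.5) places x in the interval [(k - 1/2) / 2^j, (k + 1/2) / 2^j),
    which is an assumed convention for the interval endpoints.
    """
    scale = 2.0 ** j
    counts = Counter(math.floor(scale * x + 0.5) for x in sample)
    n = len(sample)
    return {k: c / n for k, c in sorted(counts.items())}
```

With `sample = [0.1, 0.2, 0.9, 1.4]` and `j = 0`, the first two values are assigned to k = 0 and the last two to k = 1, so each coefficient equals 0.5.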
Note that expression (4), the estimator of (3), has a finite number of addends, while (3) has infinitely many. This is explained as follows. Depending on the level of resolution, two extreme situations can arise. First, $j$ can be so small that the entire sample lies within a single interval $\left[(k-\tfrac12)2^{-j}, (k+\tfrac12)2^{-j}\right)$. In this case, there is only one coefficient $\hat a_{j,k}$ different from zero and mixture (4) degenerates into a single addend. Second, $j$ can be so large that each observed value lies in a different interval, so that there are as many non-zero coefficients $\hat a_{j,k}$ as distinct values observed in the sample. That is, if $n$ is the sample size and $m$ is the number of distinct values observed in the sample, then the number of addends in mixture (4) is at most $m$, and $m \le n$. Obviously, the number of addends equals $n$ if all the sample values are different and each of them lies in its own interval of the form³ $\left[(k-\tfrac12)2^{-j}, (k+\tfrac12)2^{-j}\right)$.
Since mixture (4) contains position parameters and mixture parameters there are parameters. We can optimize the number of parameters, which depends on , by using the BIC criterion [17]. That is, we will consider that the best value is that which minimizes the expression:
where
is the sample likelihood of .
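A hedged sketch of the criterion for a given level follows. It uses the usual Schwarz form, minus twice the log-likelihood plus the number of parameters times log n. The exact parameter count is not recoverable from the text above, so the default of 2m − 1 (m positions plus m − 1 free weights) is an assumption that can be overridden; all names are hypothetical.

```python
import math
from collections import Counter

def spline(u):
    """Cubic B-spline on [-2, 2] (assumed to be the cubic box spline)."""
    a = abs(u)
    if a <= 1.0:
        return 2.0 / 3.0 - a * a + 0.5 * a ** 3
    return (2.0 - a) ** 3 / 6.0 if a <= 2.0 else 0.0

def mrde_bic(sample, j, n_params=None):
    """BIC = -2 log L + p log n for the MRDE fitted at level j."""
    scale = 2.0 ** j
    n = len(sample)
    counts = Counter(math.floor(scale * x + 0.5) for x in sample)

    def density(x):
        # mixture of 2^j * spline(2^j x - k) weighted by the data proportions
        return sum(c / n * scale * spline(scale * x - k) for k, c in counts.items())

    loglik = sum(math.log(density(x)) for x in sample)
    m = len(counts)
    # ASSUMPTION: m position parameters and m - 1 free mixture weights
    p = 2 * m - 1 if n_params is None else n_params
    return -2.0 * loglik + p * math.log(n)
```

The level retained by the criterion would then be the value of `j` with the smallest `mrde_bic(sample, j)` over a range of candidate levels. The double loop over observations and components is kept for clarity and is not meant for massive samples.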
The proposed method based on the moments is simpler and requires less processing time than the BIC. However, both criteria complement and reinforce each other, as the following simulations show.
To proceed, we have simulated a sample of size 10,000 from each of two generator models: a normal distribution and a mixture of double exponential distributions. We have fitted these generator models using MR densities for different levels of resolution in each case. Finally, we have calculated the expected value, the variance, the Fisher coefficient of skewness (or asymmetry) and the Fisher coefficient of kurtosis of each fit to select an appropriate $j$. In the supplemental material, we provide more details about the calculations and a macro to apply the developed methodology to the data-generating models used in this paper.
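The ratios reported in the tables of this section can be approximated with the following sketch, which is not the supplemental macro. It assumes the standard cubic B-spline moments (1, 0, 1/3, 0, 3/10), the form 2^j θ(2^j x − k) for the components, and the excess-kurtosis convention for the Fisher coefficient; function names are hypothetical.

```python
import math
import numpy as np

# Moments of orders 0..4 of the cubic box spline (assumed standard B-spline)
SPLINE_MOMENTS = (1.0, 0.0, 1.0 / 3.0, 0.0, 3.0 / 10.0)

def shape_stats(raw):
    """Mean, variance, skewness and excess kurtosis from non-central moments raw[0..4]."""
    m1 = raw[1]
    mu2 = raw[2] - m1**2
    mu3 = raw[3] - 3 * m1 * raw[2] + 2 * m1**3
    mu4 = raw[4] - 4 * m1 * raw[3] + 6 * m1**2 * raw[2] - 3 * m1**4
    return np.array([m1, mu2, mu3 / mu2**1.5, mu4 / mu2**2 - 3.0])

def mrde_raw_moments(sample, j):
    """Non-central moments (orders 0..4) of the MRDE fitted at level j."""
    h = 2.0 ** (-j)
    k, counts = np.unique(np.floor(sample / h + 0.5), return_counts=True)
    a = counts / counts.sum()
    raw = []
    for r in range(5):
        # E[(h (U + k))^r] via the binomial theorem, with U distributed as the spline
        per_k = sum(math.comb(r, s) * SPLINE_MOMENTS[s] * k ** (r - s) for s in range(r + 1))
        raw.append(h**r * np.sum(a * per_k))
    return raw

def moment_ratios(sample, levels):
    """Map each level j to the MRDE / sample ratios of the four indicators."""
    sample = np.asarray(sample, dtype=float)
    empirical = shape_stats([np.mean(sample**r) for r in range(5)])
    return {j: shape_stats(mrde_raw_moments(sample, j)) / empirical for j in levels}
```

A possible usage is `moment_ratios(np.random.default_rng(0).normal(10.0, 5.0, 10_000), range(-3, 5))`. Note that the skewness and excess-kurtosis ratios are unstable when the sample values of these coefficients are close to zero, as happens for Gaussian data.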
3.1. Normal distribution
Table 1 and Figure 1 shows the values and for a MR estimation of a distribution. Each value is divided by its empirical or sample counterparts, which are computed from the sample data without fitting any density. These indicators stabilize for (see Figure 1). The BIC criterion provides the value . Figure 2 displays the fitted densities for and , and the data generator model .
Table 1.
Ratios between MRDE moments and sample moments.
| $j$ | E[X] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|
| −3 | 1.0014 | 2.0494 | 0.5553 | 1.987432 |
| −2 | 1.0058 | 1.2919 | −0.6275 | −3.905073 |
| −1 | 1.0011 | 1.0717 | 0.0905 | −2.285609 |
| 0 | 1.0000 | 1.0160 | 1.0618 | 1.212460 |
| 1 | 0.9998 | 1.0033 | 1.0149 | 0.887280 |
| 2 | 1.0000 | 1.0010 | 1.0179 | 1.072510 |
| 3 | 1.0000 | 1.0006 | 0.9854 | 1.023095 |
| 4 | 1.0000 | 1.0001 | 0.9973 | 0.985362 |
Figure 1.
Ratios between MRDE moments and sample moments.
Figure 2.
Estimated densities for , and data generator model .
The estimate for is smooth but the bias is noticeable. For the bias almost disappears but there is a small roughness that may be acceptable. According to the BIC criterion, the optimum level of resolution is .
3.2. Mixture of double exponential distribution
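The density function of a double exponential distribution with location parameter $\mu$ and scale parameter $\beta$ is given by:
$$f(x) = \frac{1}{2\beta}\exp\left(-\frac{|x-\mu|}{\beta}\right)$$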
In this illustration, the generator model will be a mixture of three densities of this type whose parameters are in Table 2.
Table 2.
Parameters of the double exponential distribution.
| Location | Scale | Mixture weight |
|---|---|---|
| 20 | 5 | 0.3 |
| 30 | 6 | 0.5 |
| 40 | 7 | 0.2 |
Table 3 and Figure 3 display the values of , divided by their empirical counterparts, using levels of resolution from to . As can be seen, stability is reached either when or . Figure 4 shows the estimations, for , , and the data generator model (mixture of double exponential, ).
Table 3.
Ratios between MRDE moments and sample moments.
| $j$ | E[x] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|
| −3 | 1.00020842 | 1.23072002 | 0.64137168 | 0.56500524 |
| −2 | 1.0000844 | 1.05642867 | 0.92675066 | 0.90709002 |
| −1 | 1.00058047 | 1.0155786 | 0.97057506 | 0.97393645 |
| 0 | 0.99986737 | 1.00355414 | 0.99758049 | 0.9933315 |
| 1 | 0.99996728 | 1.000518 | 0.99559745 | 0.99654088 |
| 2 | 0.99999484 | 1.00019494 | 1.00076368 | 1.00119058 |
| 3 | 1.00000043 | 1.00001854 | 0.99983838 | 1.00028379 |
| 4 | 1.00000388 | 1.00003893 | 0.99975579 | 0.99954511 |
Figure 3.
Ratios between MRDE moments and sample moments.
Figure 4.
Estimated densities for , and data generator model .
The mixture has three modes that are difficult to capture by the estimations.4 This leads us to use higher resolution levels and rougher estimates to avoid the bias that such difficulty produces. Based on Figure 4, we would opt for discarding because of bias excess, and for roughness excess. In this example, certain difficulties are encountered in the BIC criterion. Due to the fact it is based on the principle of parsimony, it tends to give up the peaks and selects a smoother estimation. The level of resolution, according to the BIC is .
4. Moments and selection of the bandwidth of a kernel density
Kernel density estimation (KDE) has become a common and useful tool for empirical studies. The selection of the bandwidth has given rise to numerous publications. Nonetheless, part of the scientific community working in nonparametric statistics accepts that there may be no perfect procedure for selecting the optimal bandwidth. We do not give an overview of kernel estimation techniques, since our main aim is to extend the use of the moments to the choice of the bandwidth parameter. We refer readers to [15,2,9,6,24,4,10,21] for reviews of bandwidth selection techniques.
4.1. Moments of a kernel density
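The kernel estimator of a density for a sample $x_1, \dots, x_n$ is given by:
$$\hat f_h(x) = \frac{1}{nh}\sum_{i=1}^{n}\theta\left(\frac{x-x_i}{h}\right)$$
where $h$ is the bandwidth or smoothing parameter and $\theta$ is the cubic box spline that we are going to use as the kernel function.
For this density, it is verified (see Appendix A5):
$$E_{\hat f_h}[X] = \bar{x}$$
and
$$\mu_r(\hat f_h) = \sum_{s=0}^{r}\binom{r}{s}\,h^{s}\,\alpha_s\,m_{r-s}$$
where $\alpha_s$ is the non-central moment of order $s$ of the density $\theta$ (see Section 2 and Appendix A5) and $m_{r-s}$ is the central moment of order $r-s$ of the sample. That is:
$$m_{r-s} = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^{r-s}$$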
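The moment expressions of this subsection can be illustrated numerically. The sketch below assumes the binomial form of the central moments of the kernel estimate (each observation contributes h times a spline variable around it), the standard cubic B-spline moments 1, 0, 1/3, 0, 3/10, and the excess-kurtosis convention; the names are hypothetical.

```python
import math
import numpy as np

# Moments of orders 0..4 of the cubic box spline kernel (assumed standard B-spline)
SPLINE_MOMENTS = (1.0, 0.0, 1.0 / 3.0, 0.0, 3.0 / 10.0)

def kde_central_moment(sample, h, r):
    """Central moment of order r (r <= 4) of the KDE with bandwidth h.

    Computed as sum_s C(r, s) * h^s * alpha_s * m_{r-s}, where m_q is the
    sample central moment of order q. With h = 0 it returns m_r itself.
    """
    x = np.asarray(sample, dtype=float)
    d = x - x.mean()
    return sum(math.comb(r, s) * h**s * SPLINE_MOMENTS[s] * np.mean(d ** (r - s))
               for s in range(r + 1))

def kde_ratios(sample, j):
    """Variance, skewness and excess-kurtosis ratios (KDE / sample) for h = 2^-j."""
    h = 2.0 ** (-j)

    def stats(m2, m3, m4):
        return np.array([m2, m3 / m2**1.5, m4 / m2**2 - 3.0])

    kde = stats(*(kde_central_moment(sample, h, r) for r in (2, 3, 4)))
    emp = stats(*(kde_central_moment(sample, 0.0, r) for r in (2, 3, 4)))
    return kde / emp
```

Under these assumptions the variance of the kernel estimate is the sample variance plus h²/3, so the variance ratio equals 1 + h²/(3 s²) and approaches one as j grows.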
4.2. Selection of the bandwidth of a kernel density based on the moments
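The kernel estimator of a density for a sample $x_1, \dots, x_n$ [5] is:
$$\hat f_h(x) = \frac{1}{nh}\sum_{i=1}^{n}K\left(\frac{x-x_i}{h}\right) \qquad (5)$$
where $K$ is the kernel and $h$ is the so-called smoothing parameter.
Let us assume that $h = 2^{-j}$, where $j$ is an integer. The kernel function that we are going to use is $K = \theta$, that is, the cubic box spline introduced in Section 2.1. It is evident that (5) can be written as follows:
$$\hat f_h(x) = \frac{1}{n}\sum_{i=1}^{n}2^{j}\,\theta\left(2^{j}(x-x_i)\right) \qquad (6)$$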
Using the multiresolution analysis notation, expression (6) can be rewritten as:
| (7) |
The estimator of a MRDE at the resolution level defined in (4) is:
| (8) |
where , is the number of sample values such that is the closest integer to . This allows us to write (8) as:
| (9) |
Assuming that is the closest integer to it is easy to understand that:
| (10) |
Note that expressions (9) and (10) are equal. We can obtain (10) from (9) by a frequency count on the values . In (9) represents the number of values found and is the number of repetitions observed for each of them. Therefore, we can write:
Taking into account that by definition for a sufficiently high both expressions must give very close results.
To illustrate the selection of based on the moments, we have simulated a sample of size 10,000 from a N (10, 5). Figure 5 shows the MRDE and the KDE for .
Figure 5.
MRDE and KDE for .
As can be seen in Figure 5, both estimates are very similar and show a similar degree of roughness. This suggests that the moments method is suitable for selecting an appropriate $h$ when $h = 2^{-j}$ with $j$ an integer. Table 4 and Figure 6 show the moment-based indicators (expected value, variance, asymmetry and kurtosis), divided by their empirical counterparts, for $j = -3, \dots, 5$. The moments have been calculated by the expressions developed in this section, using the fitted densities for the values of $h$ that correspond to the above $j$. The values of $j$ are on the abscissa axis.
Table 4.
Ratios between KDE moments and sample moments.
| $j$ | E[x] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|
| −3 | 1 | 1.83770525 | 0.40140818 | 4.41595072 |
| −2 | 1 | 1.20942631 | 0.75184944 | 1.27816406 |
| −1 | 1 | 1.05235658 | 0.92630844 | 0.95204746 |
| 0 | 1 | 1.01308914 | 0.98068268 | 0.97763648 |
| 1 | 1 | 1.00327229 | 0.99511157 | 0.99369833 |
| 2 | 1 | 1.00081807 | 0.99877415 | 0.99837911 |
| 3 | 1 | 1.00020452 | 0.9996933 | 0.99959192 |
| 4 | 1 | 1.00005113 | 0.99992331 | 0.9998978 |
| 5 | 1 | 1.00001278 | 0.99998083 | 0.99997444 |
Figure 6.
Ratios between KDE moments and sample moments.
Note that the stability of both moments is reached when which is equivalent to . Figure 7 displays the MR and the kernel estimations for and .
Figure 7.
Estimated densities for , and data generator model . Note that is conveniently the same for both estimates and it can be obtained from the MR or kernel estimated moments. KDE and MRDE estimates are similar since both are good approximations of the same unknown density.
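In the text the stabilization of the ratios is read from the graphs. As a hedged illustration, one hypothetical way to automate that reading is a tolerance rule: keep the smallest level from which every ratio stays within a chosen distance of one. The rule and the tolerance are assumptions, not the authors' procedure; the numbers are copied from Table 4.

```python
# j: (variance, asymmetry, kurtosis) ratios, copied from Table 4
TABLE_4 = {
    -3: (1.83770525, 0.40140818, 4.41595072),
    -2: (1.20942631, 0.75184944, 1.27816406),
    -1: (1.05235658, 0.92630844, 0.95204746),
    0: (1.01308914, 0.98068268, 0.97763648),
    1: (1.00327229, 0.99511157, 0.99369833),
    2: (1.00081807, 0.99877415, 0.99837911),
    3: (1.00020452, 0.9996933, 0.99959192),
    4: (1.00005113, 0.99992331, 0.9998978),
    5: (1.00001278, 0.99998083, 0.99997444),
}

def first_stable_level(ratios, tol):
    """Smallest j such that all ratios at j and at every higher level are within tol of 1.

    Returns None when even the highest level fails the tolerance.
    """
    chosen = None
    for j in sorted(ratios, reverse=True):
        if all(abs(r - 1.0) <= tol for r in ratios[j]):
            chosen = j
        else:
            break
    return chosen
```

The outcome depends strongly on the tolerance: with these data a tolerance of 0.03 retains level 0, 0.01 retains level 1 and 0.001 retains level 3, so the rule is best used alongside a visual check of roughness.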
In the following simulation, the generator model is a mixture of three double exponential distributions whose parameters are in Table 2. Table 5 and Figure 8 display the values of , divided by their empirical counterparts, using levels of resolution from to . As can be seen, stability is reached either when or . Figure 9 shows the estimations, for , , and the data generator model (mixture of double exponential distributions, ).
Table 5.
Ratios between KDE moments and sample moments.
| $j$ | E[x] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|
| −3 | 1.00020842 | 1.23072002 | 0.64137168 | 0.56500524 |
| −2 | 1.0000844 | 1.05642867 | 0.92675066 | 0.90709002 |
| −1 | 1.00058047 | 1.0155786 | 0.97057506 | 0.97393645 |
| 0 | 0.99986737 | 1.00355414 | 0.99758049 | 0.9933315 |
| 1 | 0.99996728 | 1.000518 | 0.99559745 | 0.99654088 |
| 2 | 0.99999484 | 1.00019494 | 1.00076368 | 1.00119058 |
| 3 | 1.00000043 | 1.00001854 | 0.99983838 | 1.00028379 |
| 4 | 1.00000388 | 1.00003893 | 0.99975579 | 0.99954511 |
Figure 8.
Ratios between KDE moments and sample moments.
Figure 9.
Estimated densities for , and data generator model .
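An alternative way to compare KDE and MRDE is to set $h = 2^{-j}$ in (10). That is:
$$\hat f_j(x) = \frac{1}{nh}\sum_{i=1}^{n}\theta\left(\frac{x-k_i h}{h}\right) \qquad (11)$$
Observe that (5) and (11) are quite similar. Taking into account that, by definition of $k_i$ as the closest integer to $x_i/h$:
$$k_i - \frac{1}{2} \le \frac{x_i}{h} < k_i + \frac{1}{2}$$
and multiplying the three terms of the above inequality by $h$, we have:
$$k_i h - \frac{h}{2} \le x_i < k_i h + \frac{h}{2}$$
The latter expression shows that $k_i h$ approximates $x_i$ increasingly well as $h$ decreases. Note that the amplitude of the previous interval is $h$.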
The MRDE and the KDE had been compared in terms of time needed to run their density function in ref. [12]. Nonetheless, the previous simulations reveal some facts that are worth highlighting. The MR density is not a particular case of kernel density. On the one hand, when the multiresolution densities are estimated according to (4), the results are similar to a modified kernel in which each sample data is substituted in (5) by to obtain (6), with . On the other hand, we cannot state that a kernel estimator is a multiresolution kernel estimator. We could make the kernel and the scaling function identical. Also, we can equal both dilation factors by making . But the kernel for will be a density of the space of the multiresolution analysis structure only if the sample is of the form , with and where is a fixed integer determined by the sample. Any other estimation with a different will no longer be a function of the multiresolution structure.
Despite the above comment, we have to point out that there is a well-developed theory about the generalized kernel estimators, developed from the wavelets and the multiresolution analysis structures (see for instance [22]). Broadly speaking, this methodology requires the mother wavelet or scaling function of the multiresolution analysis structure to generate orthogonal bases for the spaces of the MRA. This is not the case of the cubic box spline since it generates non orthogonal Riesz Bases. Going deeper into this aspect is an interesting question, but it is out of the scope of this work.
5. Real data application
In this section, we apply the proposed method to the gross income of Spanish households. The sample data comes from the Spanish Survey of Household Finances (EFF) for the year 2014, which was conducted by the Bank of Spain [1]. The EFF provides information on assets, debt, income and spending. The sample size is 6120 households. The household income is calculated as the sum of labor and non-labor incomes for all household members in 2013. It is expressed in hundred thousand euros.
Table 6 and Figure 10 show the evolution, as a function of $j$, of the expected value, the variance, and the asymmetry and kurtosis coefficients, divided by their empirical counterparts, for levels of resolution from $j = -13$ to $j = -6$.
Table 6.
Ratios between MRDE moments and sample moments.
| $j$ | E[X] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|
| −13 | 0.99717924 | 1.01153181 | 0.98738403 | 0.98380212 |
| −12 | 0.9995666 | 1.00258459 | 0.99586171 | 0.99430889 |
| −11 | 0.99999127 | 1.00037759 | 0.9990044 | 0.99853096 |
| −10 | 1.00016918 | 0.99997439 | 0.99958258 | 0.99926656 |
| −9 | 0.99990519 | 1.00016536 | 0.99988219 | 0.99980877 |
| −8 | 0.99989802 | 0.99998004 | 1.0000367 | 1.00010499 |
| −7 | 0.99995971 | 0.99999305 | 1.00003152 | 1.0000789 |
| −6 | 0.99992814 | 1.00000293 | 0.9999922 | 0.99998317 |
Figure 10.
Ratios between MRDE moments and sample moments.
According to the BIC criterion the optimum is . However, the moments stabilize for or (Figure 10). Let us focus on this difference by comparing the density estimates for the two resolution levels plotted in Figure 11.
Figure 11.
Estimated multiresolution density for and .
The density has a peak between 8000 and 9000 euros that cannot be captured accurately at the level selected by the BIC, so a higher level of resolution, such as those indicated by the moments, is needed. The bias at the BIC level is evident when the two densities are compared, while the roughness at the higher level is clearly appreciable. The BIC favours the smoothness of the curve, which leads to a very skewed estimated density around the mode. A similar fact was shown in Section 3.2. We have observed empirically that roughness has only a slight effect on the cumulative distribution function. The bias, however, has a remarkable impact on the measurement of concentration, producing an underestimation of the Gini index and of the Lorenz curve. This is an important issue if distributional aspects such as concentration or inequality are studied through the fitted density. At this point, it should be noted that the kernel method is frequently applied to study income distribution (see for instance [8,16,3,20]). Figures 12 and 13 plot the cumulative distribution functions and the Lorenz curves respectively. The cumulative distribution functions are similar except in the income interval [0, 10,000]. This difference leads to an underestimation of the Gini index⁵: the index equals 0.4338 at the level selected by the BIC and 0.5131 at the higher level. It also affects the Lorenz curve (see Figure 13), which is underestimated at the BIC level. Therefore, the level of resolution obtained by the method of moments is preferable to the value selected by the BIC in the situations set out above.
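The concentration measures discussed here can be computed from any fitted density tabulated on a grid. The following sketch uses trapezoidal integration and the standard definitions of the Lorenz curve and the Gini index; it is an illustration under those assumptions, with a hypothetical function name, and not the computation used for the reported values.

```python
import numpy as np

def lorenz_and_gini(x, f):
    """Lorenz curve and Gini index of a density f tabulated on an increasing grid x >= 0.

    Returns (p, lorenz, gini), where p is the cumulative population share and
    lorenz the cumulative income share at each grid point.
    """
    x = np.asarray(x, dtype=float)
    f = np.asarray(f, dtype=float)
    dx = np.diff(x)
    # cumulative trapezoidal integrals of f (the CDF) and of x * f (partial means)
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (f[1:] + f[:-1]) * dx)])
    xf = x * f
    part = np.concatenate([[0.0], np.cumsum(0.5 * (xf[1:] + xf[:-1]) * dx)])
    p = cdf / cdf[-1]            # renormalise against discretisation error
    lorenz = part / part[-1]
    # Gini = 1 - 2 * (area under the Lorenz curve)
    area = np.sum(0.5 * (lorenz[1:] + lorenz[:-1]) * np.diff(p))
    return p, lorenz, 1.0 - 2.0 * area
```

As a sanity check, a uniform density on [0, 1] gives a Gini index of 1/3 and an exponential density gives 1/2. Applying the function to two fits of the same sample at different resolution levels makes the effect of the bias on the index directly comparable.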
Figure 12.
Cumulative distribution functions.
Figure 13.
Lorenz curves.
Next, we repeat the estimation for kernel densities and compare the results. Table 7 and Figure 14 show the evolution, as a function of $j$, of the moment-based indicators, divided by their sample counterparts, for levels of resolution from $j = -13$ to $j = -5$.
Table 7.
Ratios between KDE moments and sample moments.
| $j$ | $h$ | E[X] | Variance | Asymmetry | Kurtosis |
|---|---|---|---|---|---|
| −13 | 8192 | 1 | 1.00759984 | 0.98870759 | 0.98497182 |
| −12 | 4096 | 1 | 1.00189996 | 0.99715681 | 0.99621088 |
| −11 | 2048 | 1 | 1.00047499 | 0.99928794 | 0.9990507 |
| −10 | 1024 | 1 | 1.00011875 | 0.99982191 | 0.99976255 |
| −9 | 512 | 1 | 1.00002969 | 0.99995547 | 0.99994063 |
| −8 | 256 | 1 | 1.00000742 | 0.99998887 | 0.99998516 |
| −7 | 128 | 1 | 1.00000186 | 0.99999722 | 0.99999629 |
| −6 | 64 | 1 | 1.00000046 | 0.9999993 | 0.99999907 |
| −5 | 32 | 1 | 1.00000012 | 0.99999983 | 0.99999977 |
Figure 14.
Ratios between KDE moments and sample moments.
The appropriate level of resolution according to the moments is or (Figure 14).
The results are similar reinforcing the idea of applying the analysis performed on the MRDE. The level of resolution selected for the MRDE was . For the KDE we have opted for . The plug-in method to select the optimum [19] provides the result . This value corresponds to which rounded to the nearest integer would give . If we use the MRDE for the plug-in method provides the result (Figure 15). That is whose nearest integer is . This is less conservative than the BIC but still conservative. In any case, the resulting is the same or it is very close to that used for the plugged MRDE.
Figure 15.
Estimated kernel density for and .
Generalizing, if we use an MR density estimated at a given $j$ as the plugged density, the plug-in method provides a value of $h$ for which the nearest integer to $-\log_2 h$ is the value of $j$ used to estimate the MRDE. It is faster and easier to apply the method of moments to the KDE and so determine the $h$ that we will use in the estimation. If we want more conservative results regarding the smoothness of the fit, we can reduce the value of $j$ by one or two units, paying special attention to the increasing bias.
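The correspondence between a bandwidth and a resolution level used in this comparison can be written in two lines. The sketch assumes the relation h = 2^(-j), which matches the pairs of values listed in Table 7 (for example 8192 at level −13); the function names are hypothetical.

```python
import math

def level_from_bandwidth(h):
    """Nearest integer resolution level j for a bandwidth h, assuming h = 2^-j."""
    return round(-math.log2(h))

def bandwidth_from_level(j):
    """Bandwidth associated with the resolution level j under the same assumption."""
    return 2.0 ** (-j)
```

A bandwidth returned by any selector, such as a plug-in rule, can be passed to `level_from_bandwidth` to obtain the level to compare with the one chosen from the moments.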
6. Conclusions
This paper introduces an approach to select the bandwidth or smoothing parameter in semiparametric and nonparametric density estimation. It is based on the evolution of the expected value, the variance, and the skewness and kurtosis coefficients of the estimated densities for different bandwidths. Using these values, divided by their empirical counterparts, we select a resolution level such that the bias is acceptable and the roughness of the estimator is not excessive.
This method has been applied to density estimation by means of multiresolution densities as well as to kernel density estimation. In this way, we have expanded the criteria available for smoothing parameter selection.
The results of the simulations and the empirical application indicate that the level of resolution resulting from the moments method is more flexible for fitting a multimodal distribution than those resulting from the BIC for MRDE and the plug-in method for KDE. The BIC favours the smoothness of the curve, which leads to a skewed estimated density around the modes. The method of moments gives more importance to the use of higher resolution levels, and hence rougher estimates, in order to avoid the bias that the fitting produces. This procedure is recommended for analyzing distributional aspects such as the concentration of income. As shown in the empirical application, the bias can produce an underestimation of the concentration of the distribution.
Supplementary Material
Appendices.
A1. Central and non-central moments of the cubic box spline density
The density function has an expected value equal to zero. Hence the central and non-central moments are equal. It is also symmetric and consequently, the odd order moments are null.
Proposition A1:
Given and defined by (1), it is proved by polynomial integration that:
Analogously:
and
Proposition A2:
The non-central and central moments of order for are6 zero if is odd. If is even then:
Proof:
Let us assume that is even. In this case:
Given that is a symmetric density with expected value zero, all the central and non-central moments are equal and also the moments with odd are zero.
Taking into account proposition 1, we can assert that, if is even, then:
A2. Central moments of the densities
Proposition A3:
It is verified that:
(A1)
Proof:
Let us consider:
(A2) Making the change of variable in (A2) we have:
but
Hence the proposition is true.
Proposition A4:
Let
be the central moment of order of . It is verified that:
where is, by definition, the moment of order of the density (see proposition 4).
Proof:
(A3) Making the change in the integral (A3) we have:
A3. Central moments of a MR density
Proposition A5:
Let us consider a MR density as that given by (3). It is verified that:
where:
Proof:
It is trivial taking into account (3).
Proposition A6:
It is verified that the central moment of order of the MR density given by (3) is:
where and where is calculated according to proposition 4.
Proof:
Let:
(A4) However,
(A5) Considering that:
and substituting in (A3) we have:
But making we obtain:
It is verified:
Substituting in (A4) we have:
as we want to prove.
A4. Non-central moments of a MR density
Proposition A7:
Taking into account (3) is trivial to prove that:
where
Proposition A8:
It is verified:
where . They are defined and calculated in Proposition A2.
Proof:
(A6) Making the change of variable in (A6) it is obtained:
(A7) Taking into account that:
(A8) and substituting (A8) in (A7), the proof of the proposition is evident (see proposition 4).
A5. Central moments of a kernel density
Proposition A9:
It is verified:
Proof:
Let
(A9) Making the change of variable in the integral (A9) we have:
(A10) But and so (A10) equals and substituting in (A9) we have:
Proposition A10:
It is verified:
where is the non-central moment of order of density . It is obtained following Section 2 in the Appendix. The expression is the sample central moment of order . That is:
Proof:
(A11) That is:
(A12) Making the change of variable in (A12) we have:
(A13) But
Substituting in (A13) we have:
Substituting the latter expression in (A12) we obtain:
Naming
the proposition is proven.
A6. Asymptotic properties of the MR moments
Proof of Proposition 2.1:
It is evident that if and only if .
Moreover, the intervals:
have center and radius , so for large enough the previous intervals will have a radius so small that each sample element, belongs to a different interval. In this case, the coefficients greater than zero are those associated with intervals that contain a sample element, that is:
Hence the proposition is true.
Proof of Proposition 2.2:
For a large enough, according to proposition 1, we have:
which allows us to write:
(A14) However,
(A15) with .
If we make the change of variable in (A14), we have:
(A16) If tends to infinite, we have:
(A17) The first equality in (A17) is trivially true and the second is also true since by definition:
and when tends to infinity, the radius of the interval converges to zero and its center is the point .
Assuming that (A17) is true we can write:
(A18) Wherewith, under (A14) and (A18), we have:
That is, the non-central moments of the estimated MRDE converge to the non-central moments of the sample.
Notes
This is a well-known fact underlying all the bandwidth selection methods.
Recall that these intervals form a partition of the real line and that their amplitude converges to zero as $j$ increases.
Unless this is done parametrically using the EM algorithm on a mixture model of three double exponential distributions. But for a sample of size 10,000 the processing time is too long.
Note that the values for the Gini coefficient can differ from other publications since our illustration is based on gross income instead of net income.
The expected value of the density is zero and the central and non-central moments are equal.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
- 1. Banco de España, Survey of Household Finances (EFF) 2014: Methods, Results and Changes since 2011. Analytical Article, 24, January 2007.
- 2. Cao R., Cuevas A., and Gonzalez Manteiga W., A comparative study of several smoothing methods in density estimation. Comput. Stat. Data Anal. 17 (1994), pp. 153–176. doi: 10.1016/0167-9473(92)00066-Z.
- 3. Charpentier A. and Flachaire E., Log-transform kernel density estimation of income distribution. L'Actualité économique 91 (2015), pp. 141–159. doi: 10.7202/1036917ar.
- 4. Hall P. and Marron J.S., Estimation of integrated squared density derivatives. Stat. Probab. Lett. 6 (1987), pp. 109–115. doi: 10.1016/0167-7152(87)90083-6.
- 5. Härdle W., Smoothing Techniques, Springer, New York, 1991.
- 6. Heidenreich N.B., Schindler A., and Sperlich S., Bandwidth selection for kernel density estimation: a review of fully automatic selectors. AStA Adv. Stat. Anal. 97 (2013), pp. 403–433. doi: 10.1007/s10182-013-0216-y.
- 7. Hernández E. and Weiss G., A First Course on Wavelets, CRC Press, New York, 1996. doi: 10.1201/9780367802349.
- 8. Jenkins S.P., Did the middle class shrink during the 1980s? UK evidence from kernel density estimates. Econ. Lett. 49 (1995), pp. 407–413. doi: 10.1016/0165-1765(95)00698-F.
- 9. Jones M.C., On correcting for variance inflation in kernel density estimation. Comput. Statist. Data Anal. 11 (1991), pp. 3–15.
- 10. Marron J.S. and Sheather S.J., Progress in data-based bandwidth selection for kernel density estimation. Comput. Stat. 11 (1996), pp. 337–381.
- 11. Mallat S., A Wavelet Tour of Signal Processing, Academic Press, New York, 1998. doi: 10.1016/B978-012466606-1/50008-8.
- 12. Palacios-González F. and García-Fernández R.M., A flexible family of density functions. Statistics 49 (2014a), pp. 680–704. doi: 10.1080/02331888.2014.883398.
- 13. Palacios-González F. and García-Fernández R.M., Mixtures of mixtures based on multiresolution analysis theory. Commun. Stat. Simul. Comput. 43 (2014b), pp. 723–742. doi: 10.1080/03610918.2012.714031.
- 14. Palacios-González F. and García-Fernández R.M., A faster algorithm to estimate multiresolution densities. Comput. Stat. 35 (2020), pp. 1207–1230. doi: 10.1007/s00180-020-00952-w.
- 15. Park B.U. and Marron J.S., Comparison of data-driven bandwidth selectors. J. Am. Stat. Assoc. 85 (1990), pp. 66–72. doi: 10.1080/01621459.1990.10475307.
- 16. Pittau G.M. and Zelli R., Testing for changing shapes of income distribution: Italian evidence in the 1990s from kernel estimates. Empir. Econ. 29 (2004), pp. 415–430. doi: 10.1007/s00181-003-0175-3.
- 17. Schwarz G., Estimating the dimension of a model. Ann. Stat. 6 (1978), pp. 461–464.
- 18. Silverman B.W., Density Estimation for Statistics and Data Analysis, Chapman and Hall, London, New York, 1986, pp. 34–72.
- 19. Scott D.W., Tapia R., and Thompson J.R., Kernel density estimation revisited. Nonlin. Anal. 1 (1997), pp. 339–372. doi: 10.1016/S0362-546X(97)90003-1.
- 20. Shaoping W., Ang L., Kuangyu W., and Ximing W., Robust kernels for kernel density estimation. Econ. Lett. 191 (2020), pp. 109138. doi: 10.1016/j.econlet.2020.109138.
- 21. Sheather S.J. and Jones M.C., A reliable data-based bandwidth selection method for kernel density estimation. J. R. Stat. Soc. Ser. B 53 (1991), pp. 683–690. doi: 10.1111/j.2517-6161.1991.tb01857.x.
- 22. Huang S.H., Density estimation by wavelet-based reproducing kernels. Stat. Sin. 9 (1999), pp. 137–151.
- 23. Wojtaszczyk P., A Mathematical Introduction to Wavelets, Cambridge University Press, London, 1979.
- 24. Ziane Y., Adjabi S., and Zougab N., Adaptive Bayesian bandwidth selection in asymmetric kernel density estimation for nonnegative heavy-tailed data. J. Appl. Stat. 42 (2015), pp. 1645–1658. American Statistical Association, vol. 91, no. 433, pp. 401–407.