Abstract
This paper considers a linear regression model with stochastic restrictions. We propose a new mixed Kibria–Lukman estimator by combining the mixed estimator and the Kibria–Lukman (KL) estimator. The new estimator is a general estimator that includes the OLS estimator, the mixed estimator and the Kibria–Lukman estimator as special cases. In addition, we establish conditions under which the new estimator is superior under the mean squared error matrix (MSEM) criterion, and we illustrate the theoretical results through a numerical example and a simulation study.
Subject terms: Applied mathematics, Statistics
Introduction
Consider the following linear regression model:
$$y = X\beta + \varepsilon, \qquad (1)$$

where $y$ is the $n \times 1$ response vector, $X$ is the $n \times p$ matrix of independent variables with full column rank, $\beta$ is the $p \times 1$ unknown coefficient vector, and $\varepsilon$ is the $n$-dimensional random error vector with $E(\varepsilon) = 0$ and $\mathrm{Cov}(\varepsilon) = \sigma^2 I_n$, where $\sigma^2$ is the error variance.
In the estimation of the unknown coefficient vector $\beta$, the ordinary least squares (OLS) estimator is the most commonly used:

$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y. \qquad (2)$$
It is easy to see from formula (2) that $E(\hat{\beta}_{OLS}) = \beta$, and the OLS estimator has been widely used because of its unbiasedness and concise form. However, when the design matrix X is ill-conditioned because of strong correlations among the predictors, the OLS estimates often become unstable.
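For illustration, the following Python sketch (not part of the original analysis; all data here are synthetic) computes the OLS estimator of formula (2) and uses the condition number of $X'X$ to diagnose the instability caused by near-collinear columns.

```python
import numpy as np

# OLS via the normal equations; a large condition number of X'X signals
# multicollinearity and hence unstable OLS estimates.
def ols(X, y):
    """Ordinary least squares estimator (X'X)^{-1} X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.standard_normal((n, p))
X[:, 3] = X[:, 2] + 0.01 * rng.standard_normal(n)  # two nearly collinear columns
y = X @ np.ones(p) + rng.standard_normal(n)

print(np.linalg.cond(X.T @ X))  # very large for this design
print(ols(X, y))                # coefficients of the collinear columns are unstable
```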
Massy1 proposed the principal component estimator. Hoerl and Kennard2 obtained the ridge estimator by introducing a ridge parameter k into the normal equations. Swindel3 proposed a modified ridge estimator based on prior information, while Lukman et al.4 proposed a two-parameter form of the ridge estimator called the modified ridge-type estimator (MRT). Liu5 obtained a linearized form of the ridge estimator called the Liu estimator. Akdeniz and Kaciranlar6 proposed the generalized Liu estimator, and Liu7 obtained a two-parameter form of the Liu estimator.
Many scholars have found that combining two estimators can produce a new estimator with good statistical properties. Baye and Parker8 proposed the r–k estimator by combining the ridge estimator and the principal component estimator. Kaciranlar and Sakallioglu9 proposed the r–d estimator by combining the Liu estimator and the principal component estimator. Ozkale and Kaciranlar10 proposed a two-parameter estimator by combining the James–Stein shrinkage estimator and the modified ridge estimator of Swindel. Batah et al.11 proposed a modified r–k estimator combining the unbiased ridge estimator and the principal component estimator. Yang and Chang12 proposed another two-parameter estimator based on the ridge estimator and the Liu estimator. Lukman et al.13 proposed a new estimator by combining the modified ridge-type estimator (MRT) and the principal component estimator. Kibria and Lukman14 proposed the Kibria–Lukman (KL) estimator by combining the ridge estimator and the Liu estimator.
In practice, in addition to the sample information given by model (1), additional information about the parameters, such as exact or stochastic restrictions on the unknown coefficients, can also be considered; incorporating such information is another way to overcome multicollinearity. Theil and Goldberger15 and Theil16 proposed the mixed estimator by combining the sample information with the restrictions. Schaffrin and Toutenburg17 proposed the weighted mixed estimator to allow the sample information and the prior information to carry different weights.
In recent years, biased estimators have often been combined with estimators based on prior information to form broader classes of biased estimators. Hubert and Wijekoon18 proposed a stochastic restricted Liu estimator by combining the Liu estimator and the mixed estimator. Yang and Xu19 obtained another stochastic mixed Liu estimator, and in the same year Yang and Chang further studied it and obtained the weighted mixed Liu estimator. Yang and Li12 proposed a stochastic mixed ridge estimator. Ozbay and Kaciranlar20 integrated the two-parameter estimator with the mixed estimator and proposed a two-parameter mixed estimator.
In this paper, a new mixed KL estimator under stochastic restrictions is proposed, and its superiority over several existing estimators under certain conditions is proved theoretically. These theoretical results are then verified through a numerical example and a simulation study.
The proposed estimator
Hoerl and Kennard2 proposed the ridge estimator (RE):
$$\hat{\beta}_{RE}(k) = (X'X + kI)^{-1}X'y, \qquad (3)$$

where $k > 0$ is the ridge parameter. In fact, the ridge estimator is obtained by solving the following extremum problem:

$$\min_{\beta}\;(y - X\beta)'(y - X\beta) + k(\beta'\beta - c),$$

where $c$ is a constant and $k$ is the Lagrange multiplier.
Kibria and Lukman14 proposed the Kibria–Lukman (KL) estimator:

$$\hat{\beta}_{KL}(k) = (X'X + kI)^{-1}(X'X - kI)\hat{\beta}_{OLS}, \qquad (4)$$

where $k > 0$ is the biasing parameter. The KL estimator is obtained by solving the following extremum problem:

$$\min_{\beta}\;(y - X\beta)'(y - X\beta) + k\big[(\beta + \hat{\beta}_{OLS})'(\beta + \hat{\beta}_{OLS}) - c\big], \qquad (5)$$

where $c$ is a constant and $k$ is the Lagrange multiplier.
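A minimal sketch of the ridge estimator (3) and the KL estimator (4), assuming the reconstructed formulas above; `X`, `y` and `k` are user-supplied.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimator (X'X + kI)^{-1} X'y, Eq. (3)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def kl(X, y, k):
    """KL estimator (X'X + kI)^{-1} (X'X - kI) b_OLS, Eq. (4)."""
    p = X.shape[1]
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    # equivalently (X'X + kI)^{-1} (X'y - k * b_ols)
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y - k * b_ols)
```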
Consider the following stochastic restrictions:
$$r = R\beta + e, \qquad (6)$$

where $r$ is a known $m \times 1$ random vector, $R$ is a known $m \times p$ matrix with full row rank, and $e$ is an $m \times 1$ random error vector, independent of $\varepsilon$, with $E(e) = 0$ and $\mathrm{Cov}(e) = \sigma^2 W$, where $W$ is a known positive definite matrix.
Theil and Goldberger15 and Theil16 proposed the mixed estimator by integrating the sample information with the restrictions. The derivation idea is to rewrite models (1) and (6) as a single linear model:

$$\begin{pmatrix} y \\ r \end{pmatrix} = \begin{pmatrix} X \\ R \end{pmatrix}\beta + \begin{pmatrix} \varepsilon \\ e \end{pmatrix}.$$

If we set $\tilde{y} = \begin{pmatrix} y \\ W^{-1/2}r \end{pmatrix}$, $\tilde{X} = \begin{pmatrix} X \\ W^{-1/2}R \end{pmatrix}$ and $\tilde{\varepsilon} = \begin{pmatrix} \varepsilon \\ W^{-1/2}e \end{pmatrix}$, the above model is transformed into

$$\tilde{y} = \tilde{X}\beta + \tilde{\varepsilon}, \qquad E(\tilde{\varepsilon}) = 0, \qquad \mathrm{Cov}(\tilde{\varepsilon}) = \sigma^2 I_{n+m}. \qquad (7)$$
By applying the least squares method to the new linear model (7), the mixed estimator (ME) of the parameter $\beta$ is obtained:

$$\hat{\beta}_{ME} = (X'X + R'W^{-1}R)^{-1}(X'y + R'W^{-1}r). \qquad (8)$$

For brevity, write $S = X'X + R'W^{-1}R$, so that $\hat{\beta}_{ME} = S^{-1}(X'y + R'W^{-1}r)$.
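A sketch of the mixed estimator (8); the restriction inputs `R`, `r` and the positive definite matrix `W` are assumed given.

```python
import numpy as np

def mixed(X, y, R, r, W):
    """Mixed estimator S^{-1}(X'y + R'W^{-1}r) with S = X'X + R'W^{-1}R, Eq. (8)."""
    Winv = np.linalg.inv(W)
    S = X.T @ X + R.T @ Winv @ R
    return np.linalg.solve(S, X.T @ y + R.T @ Winv @ r)
```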
Combining the mixed estimator and the ridge estimator gives the stochastic mixed ridge estimator (RME):

$$\hat{\beta}_{RME}(k) = (S + kI)^{-1}(X'y + R'W^{-1}r). \qquad (9)$$
The estimator proposed in this paper is obtained by solving the following extremum problem:

$$\min_{\beta}\;(\tilde{y} - \tilde{X}\beta)'(\tilde{y} - \tilde{X}\beta) + k\big[(\beta + \hat{\beta}_{ME})'(\beta + \hat{\beta}_{ME}) - c\big], \qquad (10)$$

where $c$ is a constant and $k$ is the Lagrange multiplier. Differentiating with respect to $\beta$ and setting the derivative to zero yields the normal equations:

$$\tilde{X}'\tilde{X}\beta + k(\beta + \hat{\beta}_{ME}) = \tilde{X}'\tilde{y}, \qquad (11)$$

$$(S + kI)\beta = X'y + R'W^{-1}r - k\hat{\beta}_{ME}. \qquad (12)$$

From Eqs. (11) and (12), we obtain the mixed KL estimator:

$$\hat{\beta}_{MKL}(k) = (S + kI)^{-1}(X'y + R'W^{-1}r - k\hat{\beta}_{ME}) = (S + kI)^{-1}(S - kI)\hat{\beta}_{ME}. \qquad (13)$$
It can be seen from Eq. (13) that the mixed estimator, the KL estimator and the OLS estimator can all be regarded as special cases of the mixed KL estimator. Namely:

When $k = 0$, $\hat{\beta}_{MKL}(0) = \hat{\beta}_{ME}$ is the mixed estimator;

When $R = 0$, $\hat{\beta}_{MKL}(k) = \hat{\beta}_{KL}(k)$ is the KL estimator;

When $k = 0$ and $R = 0$, $\hat{\beta}_{MKL}(0) = \hat{\beta}_{OLS}$ is the OLS estimator.
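A sketch of the mixed KL estimator (13); the special cases can be checked numerically, e.g. `mixed_kl(X, y, R, r, W, 0.0)` coincides with the `mixed` function of the previous sketch.

```python
import numpy as np

def mixed_kl(X, y, R, r, W, k):
    """Mixed KL estimator (S + kI)^{-1}(S - kI) b_ME, Eq. (13)."""
    p = X.shape[1]
    Winv = np.linalg.inv(W)
    S = X.T @ X + R.T @ Winv @ R
    b_me = np.linalg.solve(S, X.T @ y + R.T @ Winv @ r)
    # k = 0 returns b_me; R = 0 (no restriction) returns the KL estimator
    return np.linalg.solve(S + k * np.eye(p), (S - k * np.eye(p)) @ b_me)
```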
The performance of the new estimator
If $\hat{\beta}$ is an estimator of $\beta$, its mean squared error matrix is given as:

$$\mathrm{MSEM}(\hat{\beta}) = \mathrm{Cov}(\hat{\beta}) + \mathrm{Bias}(\hat{\beta})\,\mathrm{Bias}(\hat{\beta})',$$

where $\mathrm{Cov}(\hat{\beta})$ is the covariance matrix of $\hat{\beta}$ and $\mathrm{Bias}(\hat{\beta}) = E(\hat{\beta}) - \beta$ is the deviation vector. For two estimators $\hat{\beta}_1$ and $\hat{\beta}_2$, $\hat{\beta}_2$ is better than $\hat{\beta}_1$ under the MSEM criterion if and only if

$$\mathrm{MSEM}(\hat{\beta}_1) - \mathrm{MSEM}(\hat{\beta}_2) \ge 0,$$

that is, the difference is positive semidefinite.
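The MSEM comparison can be checked numerically: the sketch below declares estimator 2 better than estimator 1 when all eigenvalues of the MSEM difference are nonnegative.

```python
import numpy as np

def msem(cov, bias):
    """Mean squared error matrix: Cov + bias * bias'."""
    return cov + np.outer(bias, bias)

def better_under_msem(msem1, msem2, tol=1e-10):
    """True iff msem1 - msem2 is positive semidefinite."""
    diff = msem1 - msem2
    return bool(np.all(np.linalg.eigvalsh((diff + diff.T) / 2) >= -tol))
```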
Lemma 3.1
Suppose $A$ and $B$ are two $p \times p$ positive definite matrices. Then $A \ge B$ if and only if $\lambda_{\max}(BA^{-1}) \le 1$, where $\lambda_{\max}(\cdot)$ is the maximum eigenvalue of a matrix.
The mean squared error matrix of the mixed KL estimator is calculated as follows:

$$\mathrm{Cov}(\hat{\beta}_{MKL}(k)) = \sigma^2 T_k S^{-1} T_k', \qquad (14)$$

where $T_k = (S + kI)^{-1}(S - kI)$.

Deviation vector:

$$\mathrm{Bias}(\hat{\beta}_{MKL}(k)) = E(\hat{\beta}_{MKL}(k)) - \beta = (T_k - I)\beta = -2k(S + kI)^{-1}\beta. \qquad (15)$$

Therefore,

$$\mathrm{MSEM}(\hat{\beta}_{MKL}(k)) = \sigma^2 T_k S^{-1} T_k' + b_1 b_1', \qquad (16)$$

where $b_1 = -2k(S + kI)^{-1}\beta$.
By substituting $k = 0$ into Eq. (16), the mean squared error matrix of the mixed estimator can be obtained:

$$\mathrm{MSEM}(\hat{\beta}_{ME}) = \sigma^2 S^{-1}, \qquad (17)$$

where $S = X'X + R'W^{-1}R$; the mixed estimator is unbiased.
By substituting $R = 0$ into Eq. (16), the mean squared error matrix of the KL estimator can be obtained:

$$\mathrm{MSEM}(\hat{\beta}_{KL}(k)) = \sigma^2 F_k (X'X)^{-1} F_k' + b_2 b_2', \qquad (18)$$

where $F_k = (X'X + kI)^{-1}(X'X - kI)$ and $b_2 = -2k(X'X + kI)^{-1}\beta$.
By substituting $k = 0$ and $R = 0$ into Eq. (16), the mean squared error matrix of the OLS estimator can be obtained:

$$\mathrm{MSEM}(\hat{\beta}_{OLS}) = \sigma^2 (X'X)^{-1}. \qquad (19)$$
Mean squared error matrix of the mixed ridge estimator:

$$\mathrm{Cov}(\hat{\beta}_{RME}(k)) = \sigma^2 (S + kI)^{-1} S (S + kI)^{-1}. \qquad (20)$$

Deviation vector: $\mathrm{Bias}(\hat{\beta}_{RME}(k)) = -k(S + kI)^{-1}\beta$.

Therefore,

$$\mathrm{MSEM}(\hat{\beta}_{RME}(k)) = \sigma^2 (S + kI)^{-1} S (S + kI)^{-1} + b_3 b_3', \qquad (21)$$

where $b_3 = -k(S + kI)^{-1}\beta$.
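The following sketch collects the theoretical MSEM formulas (16)–(21) in one function, assuming the true `beta` and `sigma2` are known (as in a simulation study); the formulas follow the reconstructions above.

```python
import numpy as np

def msem_all(X, R, W, beta, sigma2, k):
    """Theoretical MSEM of the ME, KL, OLS, RME and MKL estimators."""
    p = X.shape[1]
    I = np.eye(p)
    XtX = X.T @ X
    S = XtX + R.T @ np.linalg.inv(W) @ R
    Ak = np.linalg.inv(S + k * I)          # (S + kI)^{-1}
    Tk = Ak @ (S - k * I)                  # T_k
    Bk = np.linalg.inv(XtX + k * I)
    Fk = Bk @ (XtX - k * I)                # F_k
    b1 = -2 * k * Ak @ beta                # bias of MKL, Eq. (15)
    b2 = -2 * k * Bk @ beta                # bias of KL
    b3 = -k * Ak @ beta                    # bias of RME
    return {
        "ME":  sigma2 * np.linalg.inv(S),                                # (17)
        "KL":  sigma2 * Fk @ np.linalg.inv(XtX) @ Fk.T + np.outer(b2, b2),  # (18)
        "OLS": sigma2 * np.linalg.inv(XtX),                              # (19)
        "RME": sigma2 * Ak @ S @ Ak.T + np.outer(b3, b3),                # (21)
        "MKL": sigma2 * Tk @ np.linalg.inv(S) @ Tk.T + np.outer(b1, b1), # (16)
    }
```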
Comparison between mixed KL estimator and mixed estimator
From Eqs. (16) and (17), we form the difference

$$\Delta_1 = \mathrm{MSEM}(\hat{\beta}_{ME}) - \mathrm{MSEM}(\hat{\beta}_{MKL}(k)) = D_1 - b_1 b_1', \qquad (22)$$

where $D_1 = \sigma^2\big[S^{-1} - T_k S^{-1} T_k'\big]$. Writing the spectral decomposition $S = Q\Lambda Q'$ with $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$, the eigenvalues of $S^{-1} - T_k S^{-1} T_k'$ are $\frac{1}{\lambda_i}\big[1 - \big(\frac{\lambda_i - k}{\lambda_i + k}\big)^2\big] > 0$ for $k > 0$, so $D_1 > 0$, and Theorem 3.2 is obtained.
Theorem 3.2
The necessary and sufficient condition for the mixed KL estimator to be superior to the mixed estimator under the MSEM criterion is:

$$b_1' D_1^{-1} b_1 \le 1, \quad \text{that is,} \quad 4k^2\,\beta'(S + kI)^{-1} D_1^{-1} (S + kI)^{-1}\beta \le 1. \qquad (23)$$
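As a numerical check of Theorem 3.2, the sketch below evaluates condition (23) directly, using the reconstructed $D_1$ and $b_1$ from above.

```python
import numpy as np

def theorem_32_holds(X, R, W, beta, sigma2, k):
    """Check b1' D1^{-1} b1 <= 1, the condition of Theorem 3.2."""
    p = X.shape[1]
    I = np.eye(p)
    S = X.T @ X + R.T @ np.linalg.inv(W) @ R
    Ak = np.linalg.inv(S + k * I)
    Tk = Ak @ (S - k * I)
    Sinv = np.linalg.inv(S)
    D1 = sigma2 * (Sinv - Tk @ Sinv @ Tk.T)
    b1 = -2 * k * Ak @ beta
    return float(b1 @ np.linalg.solve(D1, b1)) <= 1.0
```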
Comparison between mixed KL estimator and KL estimator
From Eqs. (16) and (18), we form the difference

$$\Delta_2 = \mathrm{MSEM}(\hat{\beta}_{KL}(k)) - \mathrm{MSEM}(\hat{\beta}_{MKL}(k)) = \sigma^2 V + b_2 b_2' - b_1 b_1', \qquad (24)$$

where

$$V = F_k (X'X)^{-1} F_k' - T_k S^{-1} T_k'.$$

According to Lemma 3.1, if $\lambda_{\max}\big(T_k S^{-1} T_k' \,[F_k (X'X)^{-1} F_k']^{-1}\big) \le 1$, then $V \ge 0$; indeed $V \ge 0$ holds if and only if this eigenvalue condition holds. As long as $V \ge 0$, the following conclusion can be obtained:
Theorem 3.3
When $V \ge 0$, the necessary and sufficient condition for the mixed KL estimator to be superior to the KL estimator under the MSEM criterion is:

$$b_1'\big(\sigma^2 V + b_2 b_2'\big)^{-1} b_1 \le 1. \qquad (25)$$
Comparison between mixed KL estimator and OLS estimator
From Eqs. (16) and (19), we form the difference

$$\Delta_3 = \mathrm{MSEM}(\hat{\beta}_{OLS}) - \mathrm{MSEM}(\hat{\beta}_{MKL}(k)) = D_3 - b_1 b_1', \qquad (26)$$

where $D_3 = \sigma^2\big[(X'X)^{-1} - T_k S^{-1} T_k'\big]$. Because $S = X'X + R'W^{-1}R \ge X'X$, we have $(X'X)^{-1} \ge S^{-1}$, and since $S^{-1} - T_k S^{-1} T_k' > 0$ for $k > 0$, we get $D_3 \ge D_1 > 0$, so Theorem 3.4 is obtained.
Theorem 3.4
The necessary and sufficient condition for the mixed KL estimator to be superior to the OLS estimator under the MSEM criterion is:

$$b_1' D_3^{-1} b_1 \le 1, \quad \text{that is,} \quad 4k^2\,\beta'(S + kI)^{-1} D_3^{-1} (S + kI)^{-1}\beta \le 1. \qquad (27)$$
Comparison between mixed KL estimator and mixed ridge estimator
From Eqs. (16) and (21), we form the difference

$$\Delta_4 = \mathrm{MSEM}(\hat{\beta}_{RME}(k)) - \mathrm{MSEM}(\hat{\beta}_{MKL}(k)) = D_4 + b_3 b_3' - b_1 b_1', \qquad (28)$$

where

$$D_4 = \sigma^2(S + kI)^{-1}\big[S - (S - kI)S^{-1}(S - kI)\big](S + kI)^{-1} = \sigma^2 k\,(S + kI)^{-1}\big(2I - kS^{-1}\big)(S + kI)^{-1},$$

which is positive definite whenever $0 < k < 2\lambda_{\min}(S)$.
Theorem 3.5
When $0 < k < 2\lambda_{\min}(S)$, the necessary and sufficient condition for the mixed KL estimator to be superior to the mixed ridge estimator under the MSEM criterion is:

$$b_1'\big(D_4 + b_3 b_3'\big)^{-1} b_1 \le 1. \qquad (29)$$
Numerical example and simulation study
In order to further illustrate the theoretical results, this section verifies and analyses them through a numerical example.
The example uses the data on research and development expenditure as a percentage of GNP in several countries from 1972 to 1986, previously analysed by Gruber21 and by Akdeniz and Erol22, in which $x_1$ represents France, $x_2$ West Germany, $x_3$ Japan, $x_4$ the former Soviet Union, and $y$ the United States. See Table 1 for the data.
Table 1.
1972–1986 research and development expenditure as a percentage of GNP.
| Year | $x_1$ | $x_2$ | $x_3$ | $x_4$ | y |
|---|---|---|---|---|---|
| 1972 | 1.9 | 2.2 | 1.9 | 3.7 | 2.3 |
| 1975 | 1.8 | 2.2 | 2 | 3.8 | 2.2 |
| 1979 | 1.8 | 2.4 | 2.1 | 3.6 | 2.2 |
| 1980 | 1.8 | 2.4 | 2.2 | 3.8 | 2.3 |
| 1981 | 2 | 2.5 | 2.3 | 3.8 | 2.4 |
| 1982 | 2.1 | 2.6 | 2.4 | 3.7 | 2.5 |
| 1983 | 2.1 | 2.6 | 2.6 | 3.8 | 2.6 |
| 1984 | 2.2 | 2.6 | 2.6 | 4 | 2.6 |
| 1985 | 2.3 | 2.8 | 2.8 | 3.7 | 2.7 |
| 1986 | 2.3 | 2.7 | 2.8 | 3.8 | 2.7 |
The data in Table 1 are processed as follows.
Firstly, the eigenvalues of $X'X$, the OLS estimate of $\beta$, and the OLS estimate of $\sigma^2$ are calculated.
We can use the method proposed by Kibria and Lukman14 to choose the biasing parameter k; alternatively, the generalized cross-validation (GCV) criterion or cross-validation (CV) can be used, for which see Arashi et al.23, Roozbeh24, and Roozbeh et al.25. In this paper we use the method proposed by Kibria and Lukman14, which is given as follows:

$$\hat{k} = \frac{\hat{\sigma}^2}{\hat{\alpha}_{\max}^2}, \qquad (30)$$

where $\hat{\alpha} = Q'\hat{\beta}_{OLS}$ and $Q$ is the matrix of eigenvectors of $X'X$; we take $k = \hat{k}$.
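A minimal sketch of this choice of $k$, assuming the Hoerl–Kennard-type rule reconstructed in Eq. (30); $\hat{\alpha}$ is the OLS coefficient vector in the canonical (eigenvector) coordinates.

```python
import numpy as np

def k_kl(X, y):
    """k_hat = sigma2_hat / max(alpha_hat_i^2), the rule assumed in Eq. (30)."""
    n, p = X.shape
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b_ols
    sigma2_hat = resid @ resid / (n - p)
    _, Q = np.linalg.eigh(X.T @ X)   # eigenvectors of X'X
    alpha = Q.T @ b_ols              # canonical coordinates
    return sigma2_hat / np.max(alpha ** 2)
```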
We then impose a stochastic restriction of the form (6) on the parameters; for similar constructions, see Roozbeh et al.26 and Roozbeh and Hamzah27.
The estimated MSE values of the mixed estimator, the KL estimator, the OLS estimator, the mixed ridge estimator, and the proposed mixed KL estimator are presented in Table 2.
Table 2.
Estimated MSE.
| | ME | KL | OLS | RME | MKL |
|---|---|---|---|---|---|
| $\hat{\beta}_1$ | 0.6455 | 0.6102 | 0.6455 | 0.6452 | 0.6449 |
| $\hat{\beta}_2$ | 0.0896 | 0.0988 | 0.0896 | 0.0894 | 0.0892 |
| $\hat{\beta}_3$ | 0.1436 | 0.1577 | 0.1436 | 0.1434 | 0.1433 |
| $\hat{\beta}_4$ | 0.1526 | 0.1566 | 0.1526 | 0.1530 | 0.1534 |
| MSE | 0.0431 | 0.0235 | 0.0561 | 0.0390 | 0.0180 |
As can be seen from Table 2, with $k$ chosen by Eq. (30), the MSE of the mixed KL estimator is smaller than that of the mixed estimator, the KL estimator, the OLS estimator and the mixed ridge estimator. This is consistent with the theoretical results of this paper, and it suggests that adding stochastic restrictions can yield a better estimator under certain conditions; in practice, therefore, stochastic restrictions can be used to address multicollinearity.
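The spirit of Table 2 can be reproduced with the earlier sketches (`k_kl`, `msem_all`), taking the scalar MSE as the trace of the MSEM. Since the exact restriction $(R, r, W)$ used in the paper is not reproduced here, an illustrative one is assumed, and the unknown $\beta$ and $\sigma^2$ are replaced by their OLS estimates.

```python
import numpy as np

# Table 1 data: columns x1 (France), x2 (West Germany), x3 (Japan), x4 (USSR)
X = np.array([
    [1.9, 2.2, 1.9, 3.7], [1.8, 2.2, 2.0, 3.8], [1.8, 2.4, 2.1, 3.6],
    [1.8, 2.4, 2.2, 3.8], [2.0, 2.5, 2.3, 3.8], [2.1, 2.6, 2.4, 3.7],
    [2.1, 2.6, 2.6, 3.8], [2.2, 2.6, 2.6, 4.0], [2.3, 2.8, 2.8, 3.7],
    [2.3, 2.7, 2.8, 3.8],
])
y = np.array([2.3, 2.2, 2.2, 2.3, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7])  # USA

b_ols = np.linalg.solve(X.T @ X, X.T @ y)
sigma2 = (y - X @ b_ols) @ (y - X @ b_ols) / (len(y) - X.shape[1])
R, r, W = np.eye(4), b_ols, np.eye(4)   # illustrative restriction, not the paper's
k = k_kl(X, y)                          # biasing parameter from the earlier sketch

for name, M in msem_all(X, R, W, b_ols, sigma2, k).items():
    print(name, round(float(np.trace(M)), 4))   # scalar MSE = tr(MSEM)
```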
Next, we consider Monte Carlo simulation analysis.
Firstly, the generation of the parameters and data used in the simulation is briefly described.
The explanatory variables are generated by the same method as McDonald and Galarneau28 and Gibbons29, that is, by the equation

$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \qquad i = 1, \ldots, n, \; j = 1, \ldots, p,$$

where the $z_{ij}$ are independent standard normal pseudo-random numbers and $\rho$ is a given constant such that $\rho^2$ is, in theory, the correlation between any two explanatory variables; thus $\rho$ reflects the degree of multicollinearity in the model. In this simulation analysis we consider the three cases $\rho = 0.85, 0.9, 0.99$.
For a given design matrix X, we take the normalized eigenvector corresponding to the largest eigenvalue of $X'X$ as the true value of the parameter vector $\beta$.
The response variable is generated by the equation

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \qquad i = 1, \ldots, n,$$

where the $\varepsilon_i$ are independent normal random errors with mean zero and variance $\sigma^2$.
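A minimal Monte Carlo sketch of this design, using the `mixed_kl` function from the earlier sketch; the restriction $(R, r, W)$ of the paper is not specified, so an illustrative one is generated per replication.

```python
import numpy as np

def simulate_mse(n=50, p=4, rho=0.9, sigma=1.0, k=0.1, reps=2000, seed=1):
    """Empirical MSE of the mixed KL estimator under the simulation design."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(reps):
        Z = rng.standard_normal((n, p + 1))
        X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]  # McDonald-Galarneau
        _, Q = np.linalg.eigh(X.T @ X)
        beta = Q[:, -1]                 # eigenvector of the largest eigenvalue
        y = X @ beta + sigma * rng.standard_normal(n)
        # illustrative stochastic restriction r = beta + e, with R = W = I
        R, W = np.eye(p), np.eye(p)
        r = beta + 0.1 * rng.standard_normal(p)
        b = mixed_kl(X, y, R, r, W, k)
        total += np.sum((b - beta) ** 2)
    return total / reps
```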
The simulation results are reported in Tables 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 and 18; in each table, the first column gives $\rho$ and the remaining columns give the estimated MSE of the mixed estimator, the KL estimator, the OLS estimator, the mixed ridge estimator and the mixed KL estimator, in the same column order as Table 2.
Table 3.
Estimated MSE when .
| 0.85 | 0.0024 | 0.0028 | 0.0028 | 0.0023 | 0.0023 |
| 0.9 | 0.0032 | 0.0042 | 0.0043 | 0.0032 | 0.0032 |
| 0.99 | 0.0197 | 0.0214 | 0.0282 | 0.0185 | 0.0160 |
Table 4.
Estimated MSE when .
| 0.85 | 0.0012 | 0.0013 | 0.0013 | 0.0012 | 0.0012 |
| 0.9 | 0.0027 | 0.0031 | 0.0031 | 0.0027 | 0.0027 |
| 0.99 | 0.0146 | 0.0221 | 0.0245 | 0.0158 | 0.0152 |
Table 5.
Estimated MSE when .
| 0.85 | 0.0011 | 0.0011 | 0.0011 | 0.0011 | 0.0011 |
| 0.9 | 0.0016 | 0.0019 | 0.0020 | 0.0016 | 0.0016 |
| 0.99 | 0.0126 | 0.0179 | 0.0205 | 0.0119 | 0.0015 |
Table 6.
Estimated MSE when .
| 0.85 | 0.0009 | 0.0009 | 0.0009 | 0.0009 | 0.0009 |
| 0.9 | 0.0015 | 0.0018 | 0.0018 | 0.0015 | 0.0015 |
| 0.99 | 0.0104 | 0.0137 | 0.0142 | 0.0102 | 0.0101 |
Table 7.
Estimated MSE when .
| 0.85 | 0.2375 | 0.1963 | 0.2710 | 0.1752 | 0.1697 |
| 0.9 | 0.3463 | 0.2594 | 0.4260 | 0.2354 | 0.2231 |
| 0.99 | 3.5782 | 2.7915 | 4.2485 | 1.8076 | 1.3056 |
Table 8.
Estimated MSE when .
| 0.85 | 0.1908 | 0.1775 | 0.2012 | 0.1673 | 0.1675 |
| 0.9 | 0.1932 | 0.1968 | 0.2340 | 0.1662 | 0.1660 |
| 0.99 | 1.9234 | 1.3309 | 2.9377 | 1.3809 | 0.7301 |
Table 9.
Estimated MSE when .
| 0.85 | 0.0984 | 0.0868 | 0.1033 | 0.0835 | 0.0828 |
| 0.9 | 0.1524 | 0.1533 | 0.1906 | 0.1277 | 0.1270 |
| 0.99 | 1.7543 | 1.1738 | 2.0825 | 1.0379 | 0.9163 |
Table 10.
Estimated MSE when .
| 0.85 | 0.0751 | 0.0737 | 0.0823 | 0.0680 | 0.0678 |
| 0.9 | 0.1424 | 0.1314 | 0.1548 | 0.1218 | 0.1215 |
| 0.99 | 1.0463 | 0.4883 | 1.3303 | 0.5769 | 0.3634 |
Table 11.
Estimated MSE when .
| 0.85 | 8.7289 | 3.3534 | 13.8404 | 3.7486 | 3.3308 |
| 0.9 | 10.7412 | 4.5196 | 13.4709 | 4.4088 | 3.7785 |
| 0.99 | 84.6872 | 115.8703 | 143.0686 | 45.0361 | 31.8312 |
Table 12.
Estimated MSE when .
| 0.85 | 6.3491 | 2.8390 | 7.8722 | 2.7472 | 2.5652 |
| 0.9 | 4.3795 | 2.1116 | 4.8651 | 1.8364 | 1.7838 |
| 0.99 | 51.4274 | 44.6585 | 75.3850 | 30.2169 | 24.9511 |
Table 13.
Estimated MSE when .
| 0.85 | 3.2597 | 1.7710 | 3.4194 | 1.3309 | 1.5452 |
| 0.9 | 3.9034 | 1.7501 | 4.3951 | 1.6719 | 1.5983 |
| 0.99 | 27.7983 | 32.8762 | 38.5172 | 21.0908 | 19.3660 |
Table 14.
Estimated MSE when .
| 0.85 | 0.8875 | 1.7599 | 0.9286 | 1.9500 | 0.8946 |
| 0.9 | 1.4590 | 2.9916 | 1.5563 | 3.5008 | 1.4880 |
| 0.99 | 10.8178 | 22.8642 | 19.2461 | 31.9632 | 11.7280 |
Table 15.
Estimated MSE when .
| 0.85 | 23.4609 | 8.7136 | 27.9864 | 8.4935 | 8.0303 |
| 0.9 | 28.7442 | 13.2491 | 31.8223 | 12.4392 | 11.8325 |
| 0.99 | 343.6973 | 450.0098 | 539.6959 | 250.2456 | 106.987 |
Table 16.
Estimated MSE when .
| 0.85 | 19.1189 | 7.0762 | 23.1833 | 6.7891 | 6.8042 |
| 0.9 | 30.4727 | 10.6969 | 32.8809 | 10.1258 | 9.0393 |
| 0.99 | 226.4676 | 335.4425 | 390.911 | 130.0564 | 97.6481 |
Table 17.
Estimated MSE when .
| 0.85 | 11.7984 | 3.8394 | 13.2484 | 3.9206 | 3.7376 |
| 0.9 | 18.0514 | 5.6690 | 19.9073 | 5.7731 | 5.4011 |
| 0.99 | 197.0861 | 114.1176 | 237.743 | 91.9548 | 54.3546 |
Table 18.
Estimated MSE when .
| 0.85 | 8.6620 | 3.0434 | 8.9405 | 3.1080 | 2.9602 |
| 0.9 | 16.3215 | 5.6973 | 17.6756 | 5.6112 | 5.4424 |
| 0.99 | 120.1565 | 67.5786 | 168.6498 | 62.5914 | 47.1027 |
Based on Tables 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 and 18, the following conclusions are drawn:

The mean squared error of every estimator increases as $\rho$ and $\sigma$ increase, and decreases as $n$ increases;

With $k$ chosen by Eq. (30), the mixed KL estimator attains the minimum MSE in almost all of the configurations of $n$ and $\rho$ considered. This is consistent with Theorems 3.2–3.5 of this paper: under certain conditions, the mixed KL estimator is better than the mixed estimator, the KL estimator, the least squares estimator and the mixed ridge estimator under the MSE criterion;

Under the same conditions, the mixed estimator, the mixed ridge estimator and the mixed KL estimator are better than the unrestricted least squares estimator under the MSE criterion, and the mixed KL estimator is better than the unrestricted KL estimator under the MSE criterion.
Conclusions
In this paper, a new mixed KL estimator is proposed for the linear model by incorporating prior information about the parameters alongside the sample information, and the properties of the new estimator are discussed. The necessary and sufficient conditions for the mixed KL estimator to be better than the mixed estimator, the KL estimator, the OLS estimator and the mixed ridge estimator under the mean squared error matrix criterion are given, together with their proofs. The theoretical results are then verified by a numerical example and a simulation study.
Author contributions
H.C. and J.W. wrote the main manuscript text. All authors reviewed the manuscript.
Funding
The authors are highly obliged to the editor and the reviewers for the comments and suggestions which improved the paper in its present form. This work was sponsored by the Natural Science Foundation of Chongqing [grant number cstc2020jcyj-msxmX0028] and the Scientific Technological Research Program of Chongqing Municipal Education Commission [grant number KJQN202001321].
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Massy WF. Principal components regression in exploratory statistical research. J. Am. Stat. Assoc. 1965;60:234–256. doi: 10.1080/01621459.1965.10480787.
- 2. Hoerl AE, Kennard RW. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics. 1970;12:55–67. doi: 10.1080/00401706.1970.10488634.
- 3. Swindel BF. Good estimators based on prior information. Commun. Stat. Theory Methods. 1976;5:1065–1075. doi: 10.1080/03610927608827423.
- 4. Lukman AF, Ayinde K, Binuomote S, Onate CA. Modified ridge-type estimator to combat multicollinearity. J. Chemom. 2019;e3125:1–12.
- 5. Liu KJ. A new class of biased estimate in linear regression. Commun. Stat. Theory Methods. 1993;22:393–402. doi: 10.1080/03610929308831027.
- 6. Akdeniz F, Kaciranlar S. On the almost unbiased generalized Liu estimator and unbiased estimation of the bias and MSE. Commun. Stat. Theory Methods. 1995;24:1789–1797. doi: 10.1080/03610929508831585.
- 7. Liu KJ. Using Liu-type estimator to combat collinearity. Commun. Stat. Theory Methods. 2003;32:1009–1020. doi: 10.1081/STA-120019959.
- 8. Baye MR, Parker DF. Combining ridge and principal component regression: A money demand illustration. Commun. Stat. Theory Methods. 1984;13:197–225. doi: 10.1080/03610928408828675.
- 9. Kaciranlar S, Sakallioglu S. Combining the Liu estimator and the principal component regression estimator. Commun. Stat. Theory Methods. 2001;30:2699–2705. doi: 10.1081/STA-100108454.
- 10. Ozkale MR, Kaciranlar S. The restricted and unrestricted two-parameter estimators. Commun. Stat. Theory Methods. 2007;36:2707–2725. doi: 10.1080/03610920701386877.
- 11. Batah FM, Ozkale MR, Gore SD. Combining unbiased ridge and principal component regression estimators. Commun. Stat. Theory Methods. 2009;38:2201–2209. doi: 10.1080/03610920802503396.
- 12. Yang H, Chang XF. A new two-parameter estimator in linear regression. Commun. Stat. Theory Methods. 2010;39(6):923–934. doi: 10.1080/03610920902807911.
- 13. Lukman AF, Ayinde K, Oludoun O, Onate CA. Combining modified ridge-type and principal component regression estimators. Sci. Afr. 2020;e536:1–8.
- 14. Kibria BMG, Lukman AF. A new ridge-type estimator for the linear regression model: Simulations and applications. Scientifica. 2020. doi: 10.1155/2020/9758378.
- 15. Theil H, Goldberger AS. On pure and mixed estimation in econometrics. Int. Econ. Rev. 1961;2:65–78. doi: 10.2307/2525589.
- 16. Theil H. On the use of incomplete prior information in regression analysis. J. Am. Stat. Assoc. 1963;58:401–414. doi: 10.1080/01621459.1963.10500854.
- 17. Schaffrin B, Toutenburg H. Weighted mixed regression. Z. Angew. Math. Mech. 1990;70:735–738.
- 18. Hubert MH, Wijekoon P. Improvement of the Liu estimator in linear regression model. Stat. Pap. 2006;47:471–479. doi: 10.1007/s00362-006-0300-4.
- 19. Yang H, Xu JW. An alternative stochastic restricted Liu estimator in linear regression model. Stat. Pap. 2009;50:639–647. doi: 10.1007/s00362-007-0102-3.
- 20. Ozbay N, Kaciranlar S. Estimation in a linear regression model with stochastic linear restrictions: A new two-parameter-weighted mixed estimator. J. Stat. Comput. Simul. 2018;88:1669–1683. doi: 10.1080/00949655.2018.1442836.
- 21. Gruber MHJ. Improving Efficiency by Shrinkage: The James–Stein and Ridge Regression Estimators. Marcel Dekker Inc; 1998.
- 22. Akdeniz F, Erol H. Mean squared error matrix comparisons of some biased estimators in linear regression. Commun. Stat. Theory Methods. 2003;32(12):2389–2413. doi: 10.1081/STA-120025385.
- 23. Arashi M, Roozbeh M, Hamzah NA, et al. Ridge regression and its applications in genetic studies. PLoS One. 2021;16(4):e0245376. doi: 10.1371/journal.pone.0245376.
- 24. Roozbeh M. Optimal QR-based estimation in partially linear regression models with correlated errors using GCV criterion. Comput. Stat. Data Anal. 2018;117:45–61. doi: 10.1016/j.csda.2017.08.002.
- 25. Roozbeh M, Arashi M, Hamzah NA. Generalized cross-validation for simultaneous optimization of tuning parameters in ridge regression. Iran. J. Sci. Technol. Trans. A Sci. 2020;44(2):473–485. doi: 10.1007/s40995-020-00851-1.
- 26. Roozbeh M, Hesamian G, Akbari MG. Ridge estimation in semi-parametric regression models under the stochastic restriction and correlated elliptically contoured errors. J. Comput. Appl. Math. 2020;378:112940. doi: 10.1016/j.cam.2020.112940.
- 27. Roozbeh M, Hamzah NA. Uncertain stochastic ridge estimation in partially linear regression models with elliptically distributed errors. Statistics. 2022;3:494–523.
- 28. McDonald GC, Galarneau DI. A Monte Carlo evaluation of some ridge-type estimators. J. Am. Stat. Assoc. 1975;70:407–416. doi: 10.1080/01621459.1975.10479882.
- 29. Gibbons DG. A simulation study of some ridge estimators. J. Am. Stat. Assoc. 1981;76:131–139. doi: 10.1080/01621459.1981.10477619.
