Journal of Applied Statistics
. 2020 Oct 31;48(13-15):2499–2514. doi: 10.1080/02664763.2020.1837085

A new alternative estimation method for Liu-type logistic estimator via particle swarm optimization: an application to data of collapse of Turkish commercial banks during the Asian financial crisis

Nuriye Sancar, Deniz Inan
PMCID: PMC9042136  PMID: 35707083

Abstract

When multicollinearity is present in the logistic model, serious problems can arise in the analysis, such as an unstable maximum likelihood estimator with very high standard errors and false inferences. The Liu-type logistic estimator was proposed as a two-parameter estimator to overcome the multicollinearity problem in the logistic model. In previous studies, the (k, d) pair in this shrinkage estimator is estimated by two-phase methods. However, since different estimators can be utilized in the estimation of d, the choice of the (k, d) pair provided by the two-phase approaches is not guaranteed to overcome multicollinearity. In this article, a new alternative method based on particle swarm optimization is suggested to estimate the (k, d) pair in the Liu-type logistic estimator simultaneously. For this purpose, an objective function is developed that eliminates the multicollinearity problem, minimizes the bias of the model, and improves the model's predictive performance. A Monte Carlo simulation study is conducted to show the performance of the proposed method in comparison with existing methods. The performance of the proposed method is also demonstrated on a real dataset related to the collapse of commercial banks in Turkey during the Asian financial crisis.

KEYWORDS: Logistic Liu-type estimator, ridge logistic estimator, maximum likelihood estimator, multicollinearity, particle swarm optimization

1. Introduction

The logistic model is the most commonly used generalized linear model in many fields of science, from medicine to economics, education, and engineering; it models the relationship between a dichotomous response variable and a set of covariates. The parameters of the logistic regression model are estimated by the maximum likelihood (ML) method. The reliability of the estimates obtained by the ML method depends on various assumptions. One of these is that there is no linear dependency between the independent variables in the model. If there is no linear relationship between the covariates, they are said to be orthogonal. However, the covariates are not orthogonal in almost all applications. When there are near-linear dependencies between the covariates, non-orthogonality arises, meaning that the multicollinearity (ill-conditioning) problem is present. This problem may cause instability of the ML estimates of the regression coefficients and inaccurate inferences based on the model. It also inflates the variances and covariances of the ML estimator, which produces wide confidence intervals for the coefficients. In the presence of multicollinearity, the signs of the ML estimates may differ from those theoretically expected; as a consequence, the interpretation of the relationship between the dependent and independent variables in terms of odds ratios becomes erroneous. These harmful consequences of multicollinearity make ML estimators unreliable.

Applying biased shrinkage estimators is a good way to manage the variance and instability of ML estimators. One of these estimators is the ridge logistic estimator, introduced by Schaefer et al. [25], which is the logistic form of the ridge estimator proposed by Hoerl and Kennard [10,11]. Mansson and Shukur [20] studied various types of ridge logistic estimators. Moreover, Månsson et al. [21] proposed a new shrinkage estimator that is a logistic generalization of the estimator defined by Liu [18]. Huang [13] also proposed a shrinkage estimator that combines the ridge logistic estimator and the Liu logistic estimator as a solution to multicollinearity.

In ridge regression, it is widely accepted that the shrinkage parameter k is chosen as a relatively small constant between 0 and 1, but there is an inverse relationship between the condition number of X′WX + kI and k [19]. To ensure that the condition number is small, the selected value of k must be sufficiently large. Based on the fact that ridge regression does not completely overcome the ill-conditioning problem, the Liu-type logistic estimator was defined by Inan and Erdogan [14] as the logistic form of the Liu-type estimator proposed by Liu [19]. In the Liu-type logistic estimator, by adapting the biasing parameter d to ensure a good fit of the model, a large k value can be selected to solve the multicollinearity problem; in this way, the ill-conditioning problem is eliminated. In other words, k can be chosen large enough to reduce the condition number of X′WX + kI to an acceptable extent, while d keeps the model fitting well.

In the existing studies on the Liu-type logistic estimator, the (k, d) pair is obtained in two phases [1,5,14]. The biasing parameter d is identified after the shrinkage parameter k, such that the mean squared error (MSE) of the parameter estimates is minimized. Since the MSE cannot be evaluated numerically without the true parameters, the optimum value of the biasing parameter is expressed in terms of the parameters, and estimates of those parameters are therefore plugged in to obtain the optimum value of d. However, the pair (k, d) chosen in this way is not optimal, since it depends on the choice of parameter estimates. If the estimated (k, d) pair were the actual optimum, minimizing the MSE would ensure a good fit for the model; however, this two-phase process neither guarantees good model performance nor ensures that the selected (k, d) pair solves the multicollinearity problem. For these reasons, simultaneous estimation of the (k, d) pair, instead of the two-phase process, can be considered a more appropriate approach for solving the multicollinearity problem and ensuring a good fit of the model. In this study, a simultaneous estimation method for the optimal selection of the k and d parameters in the Liu-type logistic estimator is proposed using the Particle Swarm Optimization (PSO) algorithm. In line with this aim, an appropriate objective function is developed that simultaneously addresses the multicollinearity problem and the predictive performance of the model.

This article is organized as follows. In Section 2, the logistic Liu-type estimator is described. Section 3 provides information about the PSO algorithm. An alternative PSO-based method for the optimal selection of the (k, d) pair is proposed in Section 4. In Section 5, the performance of the proposed method is illustrated by a simulation study and a real data application comparing various estimators, and finally, the conclusion is given in Section 6.

2. Methodology

2.1. Logistic model and maximum likelihood estimator

The logistic model predicts the probability of occurrence of a binary event using the logit link function. Consider the logistic regression model in Equation (1), where the dependent variable y_i is Bernoulli distributed, y_i ∼ Bern(π_i):

\[ \pi_i = \frac{1}{1 + e^{-x_i'\beta}}, \quad i = 1, 2, \ldots, n \] (1)

where x_i is the i-th row of the n × (p+1) design matrix X, with n data points and p explanatory variables, and β is the (p+1) × 1 coefficient vector. The most commonly used method for estimating β is the maximum likelihood (ML) method, which maximizes the log-likelihood l(β):

\[ l(\beta) = \sum_{i=1}^{n} \left\{ y_i x_i'\beta - \ln\left[ 1 + e^{x_i'\beta} \right] \right\} \] (2)

The ML estimator of β is computed by setting the first derivative of Equation (2) with respect to β to zero. Therefore, the ML estimator is obtained by solving the following equation:

\[ \frac{\partial l(\beta)}{\partial \beta} = \sum_{i=1}^{n} \left[ y_i - \frac{e^{x_i'\beta}}{1 + e^{x_i'\beta}} \right] x_i = \sum_{i=1}^{n} (y_i - \pi_i)\, x_i = 0 \] (3)

where \(y = [y_1 \; y_2 \; \cdots \; y_n]\) and \(\pi = [\pi_1 \; \pi_2 \; \cdots \; \pi_n]\) are \(1 \times n\) vectors.
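As a concrete illustration, the log-likelihood in Equation (2) and the score in Equation (3) can be sketched in Python with NumPy (the paper's analyses were carried out in RStudio; these function names are illustrative):

```python
import numpy as np

def log_likelihood(beta, X, y):
    # l(beta) = sum_i { y_i x_i' beta - ln[1 + exp(x_i' beta)] }   (Eq. (2))
    eta = X @ beta
    return np.sum(y * eta - np.log1p(np.exp(eta)))

def score(beta, X, y):
    # dl/dbeta = sum_i (y_i - pi_i) x_i   (Eq. (3)), with pi_i = 1/(1 + exp(-x_i' beta))
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (y - pi)
```

At the ML estimate the score vector vanishes, which is the root-finding condition solved iteratively below.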

Iteratively weighted least squares (IWLS) is applied to obtain the solution to Equation (3). The ML estimator of β is computed by the IWLS algorithm as in Equation (4):

\[ \hat{\beta}_{ML} = (X'\hat{W}X)^{-1} X'\hat{W}\hat{Z} \] (4)

where \(\hat{W} = \operatorname{diag}[\hat{\pi}_i(1-\hat{\pi}_i)]\) and \(\hat{Z} = [\hat{Z}_1 \; \hat{Z}_2 \; \cdots \; \hat{Z}_n]\) is the adjusted dependent variable with \(\hat{Z}_i = \log\left(\frac{\hat{\pi}_i}{1-\hat{\pi}_i}\right) + \frac{y_i - \hat{\pi}_i}{\hat{\pi}_i(1-\hat{\pi}_i)}\). The covariance matrix of the asymptotically normally distributed \(\hat{\beta}_{ML}\) is defined by the inverse of the Hessian matrix \(X'\hat{W}X\), as given in Equation (5):

\[ \operatorname{Cov}(\hat{\beta}_{ML}) = (X'\hat{W}X)^{-1} \] (5)

The mean square error (MSE) of β^ML is also given by Equation (6):

\[ \operatorname{MSE}(\hat{\beta}_{ML}) = \operatorname{tr}\left[ (X'\hat{W}X)^{-1} \right] = \sum_{j=1}^{p+1} \frac{1}{\lambda_j} \] (6)

where λ_j, j = 1, 2, …, p+1, are the eigenvalues of X′ŴX.
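A minimal NumPy sketch of the IWLS iteration in Equation (4), returning the eigenvalue-based MSE of Equation (6) (illustrative code, assuming no perfect separation in the data):

```python
import numpy as np

def logistic_ml_iwls(X, y, tol=1e-8, max_iter=100):
    """ML estimation via iteratively weighted least squares (Eq. (4)).
    Assumes no perfect separation, so the fitted pi_i stay inside (0, 1)."""
    n, q = X.shape
    beta = np.zeros(q)
    for _ in range(max_iter):
        pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = pi * (1.0 - pi)                    # diagonal of W-hat
        z = X @ beta + (y - pi) / W            # adjusted dependent variable Z-hat
        XtW = X.T * W                          # X' W-hat via row-wise scaling
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        done = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if done:
            break
    # MSE(beta_ML) = tr[(X' W X)^{-1}] = sum_j 1 / lambda_j   (Eq. (6))
    lam = np.linalg.eigvalsh((X.T * W) @ X)
    mse = float(np.sum(1.0 / lam))
    return beta, mse
```

When an eigenvalue λ_j is near zero, the 1/λ_j term dominates the MSE, which is exactly the ill-conditioning effect discussed next.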

2.2. Ridge logistic estimator and Liu-type logistic estimator

When the Hessian matrix is not invertible, this clearly leads to problems. As a result of multicollinearity in the design matrix, inverting the Hessian matrix becomes numerically difficult or impossible, and some of the λ_j become zero or very close to zero, which causes the ill-conditioning problem. A very small eigenvalue blows up the variance of β̂_ML, producing an unstable estimator and inaccurate inferences. Applying biased shrinkage estimators is a powerful remedy for the deleterious effects of multicollinearity, and many authors have studied such estimators in logistic regression. Schaefer et al. [25] proposed the ridge logistic estimator, Aguilera et al. [3] proposed the principal component logistic regression (PCLR) estimator, and Månsson et al. [21] introduced the Liu logistic estimator, integrating the PCLR and ridge logistic estimators to deal with multicollinearity. Moreover, Inan and Erdogan [14] proposed the Liu-type logistic estimator, and Asar [5] studied various properties of this estimator. As in linear regression, the ridge estimator in Equation (7) is a frequently used biased estimator in logistic regression for counteracting the negative consequences of multicollinearity:

\[ \hat{\beta}_k = (X'\hat{W}X + kI)^{-1} X'\hat{W}\hat{Z} \] (7)
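A brief sketch of the ridge logistic estimator in Equation (7), together with the condition number that motivates it (NumPy; Ŵ and Ẑ are assumed to come from a converged IWLS fit, and the function names are illustrative):

```python
import numpy as np

def ridge_logistic(X, W, Z, k):
    # beta_k = (X' W X + k I)^{-1} X' W Z   (Eq. (7)); W is the diagonal of W-hat
    A = (X.T * W) @ X
    return np.linalg.solve(A + k * np.eye(A.shape[0]), (X.T * W) @ Z)

def condition_number(A):
    # CN = lambda_max / lambda_min for a symmetric positive definite matrix
    lam = np.linalg.eigvalsh(A)
    return lam.max() / lam.min()
```

Because the eigenvalues of X′ŴX + kI are λ_j + k, the condition number (λ_max + k)/(λ_min + k) decreases monotonically as k grows.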

The ridge logistic estimator β̂_k mitigates multicollinearity by adding a small constant to the diagonal of X′ŴX to decrease the condition number, which is defined by

\[ CN = \frac{\lambda_{\max}}{\lambda_{\min}} \] (8)

where λ_max and λ_min are the maximum and minimum eigenvalues of X′ŴX, respectively. A high CN indicates an ill-conditioning problem: CN < 100 implies that there is no multicollinearity, 100 < CN < 1000 indicates moderate to strong multicollinearity, and CN > 1000 indicates severe multicollinearity [8]. There is an inverse relationship between CN and k, so a large k must be selected to ensure that the CN is small. On the other hand, a small value of k is mostly chosen in applications, since the bias of the estimator inflates as k increases. Thus, in practice, a very small value of k is not sufficient to solve the ill-conditioning problem, in which case the problem persists and the parameter estimates remain unstable. To overcome this weakness of the ridge estimator, Inan and Erdogan [14] proposed the logistic Liu-type estimator β̂_{k,d}, as the logistic form of the Liu-type estimator proposed by Liu [19]:

\[ \hat{\beta}_{k,d} = (X'\hat{W}X + kI)^{-1} (X'\hat{W}X - dI)\, \hat{\beta}_{ML} \] (9)

where k > 0 and −∞ < d < ∞. The Liu-type logistic estimator is a two-parameter estimator whose two parameters have distinct tasks: k is a shrinkage parameter whose purpose is to decrease the condition number of X′ŴX + kI to a desired level, and d is a biasing parameter that improves the model fit and the statistical properties of the estimator. Consider the spectral decomposition of X′ŴX. Let

\[ \alpha = Q'\beta, \qquad \Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_{p+1}) = Q'(X'\hat{W}X)Q \]

where \(\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{p+1}\) are the ordered eigenvalues of X′ŴX, and Q is the orthogonal matrix whose columns are the corresponding eigenvectors. The bias and variance of β̂_{k,d} are given in Equations (10) and (11), respectively:

\[ \operatorname{bias}(\hat{\beta}_{k,d}) = E(\hat{\beta}_{k,d}) - \beta = -(d+k)\, Q(\Lambda + kI)^{-1}\alpha \] (10)
\[ \operatorname{Var}(\hat{\beta}_{k,d}) = Q(\Lambda + kI)^{-1}(\Lambda - dI)\,\Lambda^{-1}(\Lambda - dI)(\Lambda + kI)^{-1}Q' \] (11)

Since \(\operatorname{MSE}(\hat{\beta}_{k,d}) = \operatorname{tr}[\operatorname{Var}(\hat{\beta}_{k,d})] + \operatorname{bias}(\hat{\beta}_{k,d})'\operatorname{bias}(\hat{\beta}_{k,d})\), the MSE is obtained as:

\[ \operatorname{MSE}(\hat{\beta}_{k,d}) = \sum_{i=1}^{p+1} \frac{(\lambda_i - d)^2}{\lambda_i (\lambda_i + k)^2} + \sum_{i=1}^{p+1} \frac{\alpha_i^2 (d + k)^2}{(\lambda_i + k)^2} \] (12)

The purpose of this estimator is to select values of the (k, d) pair such that the decrease in variance exceeds the increase in the squared bias. MSE(β̂_{k,d}) is a quadratic function of d when k is fixed; thus, differentiating MSE(β̂_{k,d}) with respect to d and equating the derivative to zero yields:

\[ d_{opt} = \frac{\displaystyle\sum_{i=1}^{p+1} \frac{1 - k\alpha_i^2}{(\lambda_i + k)^2}}{\displaystyle\sum_{i=1}^{p+1} \frac{1 + \lambda_i \alpha_i^2}{\lambda_i (\lambda_i + k)^2}} \] (13)

It is evident that the minimum of MSE(β̂_{k,d}) and the optimal value d_opt in Equation (13) can only be obtained after estimating k. There is no definite rule for selecting k, and numerous estimators have been proposed for k in ridge logistic regression. The following are the most commonly used estimators of k for the Liu-type estimator:

\[ \hat{k}_1 = \frac{p+1}{\hat{\beta}_{ML}'\hat{\beta}_{ML}} \] (14)
\[ \hat{k}_2 = \frac{p}{\hat{\beta}_{ML}'\hat{\beta}_{ML}} \] (15)
\[ \hat{k}_3 = \frac{\hat{\sigma}^2}{\max(\hat{\alpha}_i^2)} \] (16)
\[ \hat{k}_4 = \frac{1}{\max(\hat{\alpha}_i^2)} \] (17)
\[ \hat{k}_5 = \frac{\lambda_1 - 100\lambda_p}{99} \] (18)

where the estimators in Equations (14)–(18) are given in [25], [10], and [19]. After k is estimated by one of these estimators, d_opt from Equation (13) is used to choose d. Since d_opt depends on the unknown parameters α_i², each α_i² is replaced by its ML estimate α̂_i². Thus, the following estimator d̂ of d is obtained for the selected k:

\[ \hat{d} = \frac{\displaystyle\sum_{i=1}^{p+1} \frac{1 - \hat{k}\hat{\alpha}_i^2}{(\lambda_i + \hat{k})^2}}{\displaystyle\sum_{i=1}^{p+1} \frac{1 + \lambda_i \hat{\alpha}_i^2}{\lambda_i (\lambda_i + \hat{k})^2}} \] (19)
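The two-phase recipe, Equation (9) with d chosen by Equation (19) for a fixed k, can be sketched as follows (NumPy; illustrative function names, with Ŵ and β̂_ML assumed to come from an IWLS fit):

```python
import numpy as np

def liu_type(X, W, beta_ml, k, d):
    # beta_{k,d} = (X'WX + kI)^{-1} (X'WX - dI) beta_ML   (Eq. (9))
    A = (X.T * W) @ X
    I = np.eye(A.shape[0])
    return np.linalg.solve(A + k * I, (A - d * I) @ beta_ml)

def mse_kd(lam, alpha, k, d):
    # MSE(beta_{k,d}) of Eq. (12) in terms of the eigenvalues lam and alpha = Q' beta
    return (np.sum((lam - d) ** 2 / (lam * (lam + k) ** 2))
            + np.sum(alpha ** 2 * (d + k) ** 2 / (lam + k) ** 2))

def d_hat(lam, alpha, k):
    # MSE-minimizing d for a fixed k   (Eqs. (13)/(19))
    num = np.sum((1.0 - k * alpha ** 2) / (lam + k) ** 2)
    den = np.sum((1.0 + lam * alpha ** 2) / (lam * (lam + k) ** 2))
    return num / den
```

For any fixed k, mse_kd is a convex quadratic in d, so d_hat is its global minimizer; setting k = d = 0 recovers the ML estimator.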

This method allows the model to operate with larger k values by balancing, through d, the bias introduced by k. In previous studies, the (k, d) pair is estimated in a two-phase process: after assigning the shrinkage parameter k, the biasing parameter d is estimated such that the mean squared error of β̂_{k,d} is minimized. However, since different estimators can be used in the estimation of d and d̂ depends on the selected parameter estimates, an optimal choice of the (k, d) pair is not guaranteed in this way. Therefore, in this study, instead of such two-phase estimation methods, it is considered more appropriate to estimate k and d simultaneously by means of an appropriate objective function that accounts for both the ill-conditioning problem and model fitting performance. To achieve this, the metaheuristic Particle Swarm Optimization (PSO) algorithm has been employed.

3. Particle swarm optimization

Particle Swarm Optimization (PSO) is a swarm intelligence-based evolutionary computation algorithm developed by Kennedy and Eberhart [16]. The algorithm was developed by simulating the social behavior of bird flocks and fish schools. Each possible solution in PSO is called a particle; a particle is analogous to a bird or fish flying across the search area, and the positions of the particles represent potential solutions to the problem. Each particle possesses a velocity vector that allows it to explore the area in search of the optimum. At each generation, each particle adjusts its trajectory based on its own best position (Pbest) and the best position of the swarm, the global best (Gbest). This process enriches the stochastic exploration of each particle and leads to rapid convergence toward an optimum solution.

Each particle of the swarm changes its position at each iteration. Let \(x_j(t) = (x_{j,1}(t), x_{j,2}(t), \ldots, x_{j,D}(t))\) be the position vector of particle j at iteration t, where D is the dimension of the search space. Each particle j keeps track of its personal best position \(p_j(t) = (p_{j,1}(t), p_{j,2}(t), \ldots, p_{j,D}(t))\), the best position it has visited so far, while \(p_g(t) = (p_{g,1}(t), p_{g,2}(t), \ldots, p_{g,D}(t))\) denotes the global best position found by all of the particles. The velocity \(v_j(t+1) = (v_{j,1}(t+1), v_{j,2}(t+1), \ldots, v_{j,D}(t+1))\) and position \(x_j(t+1)\) of particle j at iteration t + 1 are updated by the following dynamic equations:

\[ v_{j,s}(t+1) = w\, v_{j,s}(t) + c_1 r_{1,s} \left( p_{j,s}(t) - x_{j,s}(t) \right) + c_2 r_{2,s} \left( p_{g,s}(t) - x_{j,s}(t) \right) \] (20)
\[ x_{j,s}(t+1) = x_{j,s}(t) + v_{j,s}(t+1) \] (21)

where w is the inertia weight; c_1 and c_2 are the cognitive and social coefficients, respectively; r_{1,s} and r_{2,s} are independent uniformly distributed random numbers between 0 and 1; and s indexes the components of the search space. The steps of the optimization process are shown in the following algorithm:

Step 1. Determine the tuning parameters: number of particles (np), maximum number of iterations (maxit), w, c1, and c2.

Step 2. Initialize the particles in the swarm by randomly defining their positions and velocities.

Step 3. Find the value of objective function for each particle.

Step 4. Define the best experienced position, local best solution by each particle (Pbest) and the swarm’s best position so far, global best solution (Gbest).

Step 5. Calculate and update the velocity and positions for each particle according to Equation (20) and Equation (21).

Step 6. Repeat steps 3–5 until maxit is reached.
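The six steps above can be sketched as a minimal PSO implementation in Python with NumPy (an illustrative translation, since the paper's analyses used RStudio; the velocity clamp is an added stability assumption and is not part of the algorithm as stated):

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=15, maxit=200, w=0.9, c1=2.0, c2=2.0, seed=1):
    """Minimal PSO following Equations (20)-(21); bounds is a list of (low, high)."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    dim = len(bounds)
    x = rng.uniform(lo, hi, size=(n_particles, dim))      # Step 2: random positions
    v = rng.uniform(0.0, 4.0, size=(n_particles, dim))    # random initial velocities
    pbest = x.copy()                                      # Step 4: personal bests
    pbest_val = np.array([f(xi) for xi in x])             # Step 3: objective values
    gbest = pbest[np.argmin(pbest_val)].copy()            # Step 4: global best
    for _ in range(maxit):                                # Step 6: iterate steps 3-5
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # Eq. (20)
        v = np.clip(v, -4.0, 4.0)      # velocity clamp (stability assumption)
        x = np.clip(x + v, lo, hi)     # Eq. (21), kept inside the search box
        val = np.array([f(xi) for xi in x])
        better = val < pbest_val
        pbest[better] = x[better]
        pbest_val[better] = val[better]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())
```

On a smooth test function, the swarm concentrates near the minimizer as the personal and global bests improve monotonically.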

4. Implementation of PSO for estimating the optimal values of the shrinkage and biasing parameters in Liu-type logistic regression

PSO has many favorable characteristics and advantages over other heuristic algorithms, such as easy implementation, few parameter requirements, shorter computation time, stable convergence, and reduced dependence on the initial points, and it is therefore more robust [2,6]. Furthermore, PSO has been successfully applied in regression analysis; see [27,9,15,26,24]. Taking these characteristics and advantages into consideration, PSO has been applied to develop a new algorithm for estimating the (k, d) pair in the Liu-type logistic estimator.

In the presence of a severe multicollinearity problem, the Liu-type logistic estimator, a two-parameter biased estimator, was introduced as an alternative to the ridge logistic estimator. The existing conventional methods for estimating the (k, d) pair in Liu-type logistic regression are two-phase methods. However, as mentioned in the previous sections, estimating (k, d) by the existing two-phase methods does not provide a precisely optimal solution to the ill-conditioning problem, since it depends on the choice of the parameter estimators. In short, when the (k, d) pair is selected by a two-phase method, it becomes very difficult to obtain values that both adjust the resulting bias and improve prediction performance. For these reasons, the aim is to develop a simultaneous, PSO-based approach that selects the (k, d) pair in a single step, both to solve multicollinearity and to ensure good performance and fit of the model.

In line with the purpose of the study, an objective function is formulated for the PSO-based algorithm that solves the multicollinearity problem, minimizes the bias of the model, and improves its predictive performance. The objective function in Equation (22) has the following three parts:

\[ \min_{k,d}\; \varphi(k) + \operatorname{MRSS}(k,d) + \operatorname{FPR}; \qquad 0 \le k \le 500, \; -500 \le d \le 500 \] (22)

where

\[ \varphi(k) = \begin{cases} CN, & \text{if } CN > 100 \\ 0, & \text{otherwise} \end{cases} \] (23)

and

\[ \operatorname{RSS}(k,d) = \sum_{i=1}^{n} \frac{(y_i - \hat{\pi}_i)^2}{\hat{\pi}_i (1 - \hat{\pi}_i)} \qquad \text{and} \qquad \operatorname{MRSS} = \frac{\operatorname{RSS}}{n - p} \] (24)

The FPR, or (1 − specificity), is the false positive rate at a threshold value; it is used in the objective function to ensure good prediction performance of the model. φ(k) is the condition number (CN) penalty, which is utilized to solve the ill-conditioning problem. Since using CN directly in the objective function results in a much larger k value than required, φ(k) is constructed as in Equation (23). Additionally, the residual sum of squares (RSS) assesses model fit in terms of the deviation of the model from the observations [4]. A large value of the MRSS (mean residual sum of squares) indicates failure of the logistic model to fit the observations, so the MRSS is used in the objective function to improve model fit. The aim is to determine the optimum (k, d) pair that eliminates the multicollinearity problem and minimizes the bias of the model, while also improving the predictive performance and fit of the model. The optimization problem defined in Equation (22) is solved with PSO in the suggested method. RStudio has been used for the analyses in the study. The algorithm of the proposed method is as follows:

Step 1. The tuning parameters of the proposed algorithm are chosen as follows:

np: 15, maxit: 1000, w: 0.9, c1: 2, c2: 2

Step 2. The positions of each particle j (j = 1, 2, …, 15) are randomly generated from uniform distributions. The first and second positions of a particle represent the shrinkage parameter k and the biasing parameter d, respectively. The first positions are generated from a uniform distribution on (0, 500), and the second positions from a uniform distribution on (−500, 500).

Step 3. The velocities are generated from a uniform distribution with parameters (0, 4).

Step 4. The objective function is chosen as in Equation (22). The objective function values of all particles are obtained.

Step 5. According to objective function values, Pbest and Gbest particles are determined.

Step 6. The velocities and positions of the particles are updated by using the equations given in Equation (20) and Equation (21), respectively.

Step 7. Steps 4–6 are repeated until maxit is reached.

Step 8. The optimum (k, d) pair is obtained as Gbest.
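Steps 1–8 revolve around evaluating Equation (22) at each particle position. A sketch of such an evaluation in Python with NumPy follows; the 0.5 classification threshold for the FPR and the evaluation on the training sample are assumptions, since the section does not fix them:

```python
import numpy as np

def objective(kd, X, W, beta_ml, y, threshold=0.5):
    """phi(k) + MRSS(k, d) + FPR for one particle position (k, d), per Eq. (22)."""
    k, d = kd
    q = X.shape[1]                     # q = p + 1 columns, including the intercept
    A = (X.T * W) @ X
    I = np.eye(q)
    beta = np.linalg.solve(A + k * I, (A - d * I) @ beta_ml)   # Liu-type fit, Eq. (9)
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    # phi(k): penalize the condition number of X'WX + kI only when CN > 100 (Eq. (23))
    lam = np.linalg.eigvalsh(A + k * I)
    cn = lam.max() / lam.min()
    phi = cn if cn > 100.0 else 0.0
    # MRSS: standardized residual sum of squares divided by n - p (Eq. (24))
    rss = np.sum((y - pi) ** 2 / (pi * (1.0 - pi)))
    mrss = rss / (X.shape[0] - (q - 1))
    # FPR at the assumed threshold: fraction of actual negatives predicted positive
    neg = y == 0
    fpr = float(np.mean(pi[neg] >= threshold)) if neg.any() else 0.0
    return phi + mrss + fpr
```

Feeding this function to a PSO minimizer over the box 0 ≤ k ≤ 500, −500 ≤ d ≤ 500 yields the simultaneous (k, d) estimate of the proposed method.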

5. Monte Carlo simulation study

A Monte Carlo simulation study under various degrees of multicollinearity, numbers of explanatory variables, and sample sizes has been designed to show the performance of the suggested PSO-based method for estimating the optimal (k, d) pair in Liu-type logistic regression. The proposed method has been compared with the ML estimator, various ridge logistic estimators with different estimators of k, and various Liu-type logistic estimators with different estimators of (k, d), according to several judgment criteria.

In the study, four and eight explanatory variables were considered. The explanatory variables were generated for p = 4 by the following formulas:

\[ x_{i,j} = (1 - \rho^2)^{1/2} z_{i,j} + \rho z_{i,4}, \quad i = 1, 2, \ldots, n; \; j = 1, 2 \] (25)
\[ x_{i,j} = (1 - \gamma^2)^{1/2} z_{i,j} + \gamma z_{i,4}, \quad i = 1, 2, \ldots, n; \; j = 3, 4 \] (26)

The explanatory variables were generated for p = 8 by the following formulas:

\[ x_{i,j} = (1 - \rho^2)^{1/2} z_{i,j} + \rho z_{i,8}, \quad i = 1, 2, \ldots, n; \; j = 1, 2 \] (27)
\[ x_{i,j} = (1 - \gamma^2)^{1/2} z_{i,j} + \gamma z_{i,8}, \quad i = 1, 2, \ldots, n; \; j = 3, 4 \] (28)
\[ x_{i,j} = (1 - \rho^2)^{1/2} z_{i,j} + \rho z_{i,8}, \quad i = 1, 2, \ldots, n; \; j = 5, 6 \] (29)
\[ x_{i,j} = (1 - \gamma^2)^{1/2} z_{i,j} + \gamma z_{i,8}, \quad i = 1, 2, \ldots, n; \; j = 7, 8 \] (30)

where z_{i,j} are pseudo-random numbers from the standard normal distribution, and ρ² and γ² are the correlations between any two explanatory variables [22]. Three different sets of ρ and γ values were considered:

ρ=0.999,γ=0.99
ρ=0.999,γ=0.95
ρ=0.99,γ=0.95
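The generation scheme in Equations (25)–(30) can be sketched as follows (NumPy; a 0-based column index j is used, alternating pairs of columns between the ρ- and γ-scaled common component):

```python
import numpy as np

def make_collinear_design(n, p, rho, gamma, seed=0):
    """Generate p collinear explanatory variables as in Eqs. (25)-(30)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, p))              # z_{i,j}: standard normal draws
    x = np.empty((n, p))
    for j in range(p):                           # columns come in pairs
        r = rho if (j // 2) % 2 == 0 else gamma  # cols 1-2 use rho, 3-4 gamma, ...
        x[:, j] = np.sqrt(1.0 - r ** 2) * z[:, j] + r * z[:, p - 1]
    return x
```

The shared component z_{i,p} makes paired columns nearly linearly dependent; with ρ = 0.999, the theoretical correlation between the first two columns is ρ² ≈ 0.998.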

As a common restriction in many simulation studies, the parameter vector β is chosen so that β′β = 1 [23]. The experiment was repeated 1000 times for sample sizes of n = 60, n = 90, and n = 150, and at each iteration the data were divided into training and testing datasets. The condition number (CN) and mean residual sum of squares (MRSS) were calculated on the training dataset; the area under the ROC curve (AUC), accuracy (Acc), and sensitivity (Sens) were calculated on the testing dataset. The estimators used in the study are as follows:

MLE: Maximum likelihood estimator.

R1: Ridge logistic estimator, k is estimated by k^3 in Equation (16).

R2: Ridge logistic estimator, k is estimated by k^1 in Equation (14).

R3: Ridge logistic estimator, k is estimated by 10-fold cross validation [17].

LT1: Liu-type logistic estimator, k and d are estimated by k^1 and d^ in Equation (14) and Equation (19), respectively.

LT2: Liu-type logistic estimator, k and d are estimated by k̂_5 and d̂ in Equation (18) and Equation (19), respectively.

LT3: Liu-type logistic estimator, k and d are estimated by k̂_3 and d̂ in Equation (16) and Equation (19), respectively.

PSO-based LT: Liu-type logistic estimator, (k, d) pair is simultaneously estimated by the proposed PSO-based method.

The simulation was repeated 1000 times, and the MSEs of the parameter estimates and the mean CN, MRSS, AUC, Sens, and Acc values were calculated for each estimator.

The simulation study results are illustrated in Tables 1–6. It is observed that in the cases of moderate and severe multicollinearity, the MSE values of MLE are inflated in every simulation scenario. In all cases, the existing biased estimators and the proposed estimator perform better than MLE according to each judgment criterion; in particular, every estimator has a lower MSE than MLE. When the proposed estimator is compared with the others in terms of MSE and CN, the proposed method shows superior performance by completely eliminating the multicollinearity problem in every simulation scenario and significantly reducing the MSE of the parameter estimates. The MSE values of the proposed estimator are the lowest in every simulation scenario. The other conventional biased estimators also generally reduce the MSE relative to MLE, although not to the same extent as the suggested method. Furthermore, the compared conventional estimators do not completely solve the multicollinearity problem in every simulation scenario. It is also observed that an increase in the number of explanatory variables has a negative effect on the performance of the estimators: it increases the MSE values of both MLE and the other estimators.

Table 2. Simulation results for ρ = 0.999 and γ = 0.95 when p = 4.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 122.235 2.487 10.732 4.089 2.966 11.248 20.634 13.901
MRSS 26.234 5.021 18.789 16.853 13.058 12.978 17.963 19.098
CN 1274.788 1.985 198.633 96.991 90.963 92.856 189.997 255.476
AUC 0.730 0.788 0.739 0.748 0.750 0.749 0.749 0.750
Sens 0.709 0.807 0.749 0.746 0.751 0.749 0.753 0.775
Acc 0.650 0.681 0.677 0.673 0.672 0.682 0.675 0.670
ntrain = 60ntest = 30 MSE 79.547 2.003 18.654 8.735 9.847 27.965 34.985 17.178
MRSS 8.985 4.457 7.963 9.145 7.632 8.003 8.236 9.653
CN 1221.856 2.568 241.961 100.895 201.236 225.857 235.789 305.598
AUC 0.721 0.789 0.744 0.739 0.743 0.744 0.742 0.739
Sens 0.739 0.855 0.750 0.753 0.759 0.756 0.751 0.754
Acc 0.663 0.687 0.669 0.675 0.678 0.690 0.680 0.679
ntrain = 100ntest = 50 MSE 40.856 1.564 9.987 2.799 7.632 5.653 10.367 8.963
MRSS 14.106 4.562 10.635 12.905 10.560 10.157 11.744 12.954
CN 1156.345 2.896 400.693 102.455 165.323 148.810 201.563 634.058
AUC 0.703 0.711 0.705 0.707 0.708 0.706 0.703 0.706
  Sens 0.695 0.787 0.728 0.730 0.727 0.728 0.711 0.709
  Acc 0.659 0.663 0.660 0.659 0.662 0.665 0.655 0.657

Table 3. Simulation results for ρ = 0.99 and γ = 0.95 when p = 4.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 101.236 6.109 19.523 12.633 19.635 31.145 35.699 19.235
MRSS 14.873 4.782 8.222 8.796 8.969 9.104 8.967 9.254
CN 1106.564 1.850 142.321 95.487 123.665 129.123 163.364 193.457
AUC 0.734 0.803 0.775 0.759 0.766 0.763 0.741 0.752
Sens 0.763 0.799 0.749 0.762 0.752 0.739 0.767 0.758
Acc 0.680 0.695 0.669 0.660 0.680 0.671 0.677 0.675
ntrain = 60ntest = 30 MSE 69.854 2.443 7.879 5.852 8.156 24.877 21.523 14.118
MRSS 8.966 2.963 7.145 7.966 7.564 7.632 7.520 7.883
CN 1053.215 1.324 135.633 100.665 145.247 152.633 169.224 182.562
AUC 0.749 0.789 0.757 0.763 0.759 0.767 0.741 0.764
Sens 0.715 0.769 0.749 0.719 0.727 0.733 0.725 0.728
Acc 0.675 0.689 0.681 0.680 0.675 0.683 0.675 0.683
ntrain = 100ntest = 50 MSE 45.720 0.998 20.563 8.569 21.335 18.455 17.999 23.007
MRSS 9.378 3.966 8.889 6.214 7.851 8.006 8.257 8.229
CN 1009.434 2.009 149.766 101.858 121.960 119.651 179.641 600.568
AUC 0.730 0.740 0.731 0.736 0.711 0.735 0.734 0.733
  Sens 0.761 0.832 0.773 0.759 0.776 0.768 0.780 0.770
  Acc 0.650 0.657 0.667 0.668 0.643 0.670 0.660 0.655

Table 4. Simulation results for ρ = 0.999 and γ = 0.99 when p = 8.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 269.567 30.947 120.523 66.134 108.8246 201.993 207.617 182.307
MRSS 14.934 4.233 7.431 7.523 10.372 9.212 15.278 9.194
CN 2165.341 1.634 210.309 99.588 207.169 313.901 321.198 297.52
AUC 0.689 0.701 0.698 0.695 0.690 0.698 0.691 0.694
Sens 0.580 0.647 0.595 0.577 0.586 0.600 0.594 0.582
Acc 0.621 0.627 0.626 0.623 0.63 0.633 0.625 0.624
ntrain = 60ntest = 30 MSE 97.344 6.684 52.788 34.634 54.962 72.896 69.295 64.064
MRSS 12.372 4.142 8.597 17.965 14.692 9.705 9.190 12.186
CN 1674.565 1.874 190.081 98.093 197.250 200.194 194.484 189.027
AUC 0.703 0.709 0.710 0.711 0.707 0.710 0.703 0.707
Sens 0.592 0.665 0.584 0.590 0.571 0.586 0.574 0.591
Acc 0.647 0.646 0.649 0.645 0.643 0.647 0.648 0.647
ntrain = 100ntest = 50 MSE 55.895 3.790 15.651 35.274 18.399 40.675 43.178 40.262
MRSS 8.186 1.992 7.457 8.085 7.700 8.400 7.264 7.284
CN 1356.413 2.202 133.546 97.145 130.769 130.535 133.785 123.774
AUC 0.717 0.726 0.725 0.720 0.725 0.721 0.715 0.723
  Sens 0.633 0.700 0.623 0.624 0.619 0.621 0.623 0.626
  Acc 0.655 0.648 0.662 0.654 0.659 0.655 0.6520 0.659

Table 5. Simulation results for ρ = 0.999 and γ = 0.95 when p = 8.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 217.535 8.387 80.297 24.856 72.379 181.959 183.737 152.769
MRSS 18.814 4.328 10.323 19.523 9.091 15.633 20.529 18.175
CN 2135.366 1.629 274.044 93.399 270.921 286.353 273.410 201.051
AUC 0.694 0.700 0.698 0.699 0.690 0.701 0.694 0.687
Sens 0.632 0.691 0.625 0.616 0.614 0.629 0.615 0.621
Acc 0.642 0.648 0.632 0.642 0.633 0.641 0.635 0.637
ntrain = 60ntest = 30 MSE 75.914 4.693 40.003 24.037 39.801 57.475 57.912 53.335
MRSS 9.610 4.123 8.284 11.979 8.001 8.700 12.973 9.184
CN 1476.719 1.753 160.790 99.283 174.711 162.100 175.7 168.103
AUC 0.712 0.730 0.723 0.722 0.721 0.719 0.712 0.719
Sens 0.606 0.703 0.610 0.596 0.599 0.607 0.597 0.612
Acc 0.657 0.665 0.668 0.657 0.658 0.660 0.650 0.667
ntrain = 100ntest = 50 MSE 50.908 1.648 22.641 23.957 22.998 30.505 29.865 27.135
MRSS 8.953 3.982 8.721 9.086 7.915 8.054 10.056 8.777
CN 1314.917 2.142 150.332 99.743 162.474 150.833 163.457 145.823
AUC 0.717 0.726 0.725 0.720 0.725 0.721 0.715 0.723
  Sens 0.633 0.701 0.619 0.624 0.623 0.622 0.623 0.626
  Acc 0.656 0.658 0.659 0.654 0.662 0.652 0.652 0.659

Table 1. Simulation results for ρ = 0.999 and γ = 0.99 when p = 4.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 130.693 3.001 17.578 7.023 9.645 41.102 49.125 32.634
MRSS 15.149 4.102 8.698 12.855 6.108 8.401 9.923 13.563
CN 1389.569 2.123 296.529 97.065 182.155 191.364 235.658 201.896
AUC 0.723 0.799 0.740 0.748 0.745 0.736 0.741 0.744
Sens 0.613 0.703 0.631 0.640 0.629 0.623 0.630 0.636
Acc 0.608 0.686 0.641 0.621 0.659 0.661 0.639 0.644
ntrain = 60ntest = 30 MSE 81.753 2.998 14.873 10.451 13.063 27.214 29.956 13.568
MRSS 7.562 3.562 6.153 6.998 5.823 6.274 5.952 7.045
CN 1223.673 2.319 250.124 102.457 219.376 265.278 296.36 358.169
AUC 0.711 0.783 0.731 0.729 0.731 0.737 0.729 0.717
Sens 0.698 0.759 0.708 0.722 0.725 0.723 0.731 0.725
Acc 0.649 0.714 0.689 0.683 0.678 0.692 0.685 0.693
ntrain = 100ntest = 50 MSE 49.231 1.314 20.963 10.897 20.689 10.205 12.002 26.719
MRSS 6.731 3.574 5.638 6.025 5.741 6.087 5.209 6.677
CN 1213.850 3.024 270.903 103.954 240.687 278.234 305.562 702.006
AUC 0.721 0.745 0.733 0.733 0.728 0.730 0.727 0.725
  Sens 0.689 0.758 0.703 0.710 0.709 0.708 0.713 0.711
  Acc 0.633 0.653 0.627 0.630 0.631 0.625 0.621 0.623

Table 6. Simulation results for ρ = 0.99 and γ = 0.95 when p = 8.

n Judgment criteria MLE PSO-based LT LT1 LT2 LT3 R1 R2 R3
ntrain = 40ntest = 20 MSE 102.270 6.268 45.458 19.507 38.465 89.265 80.256 80.399
MRSS 34.916 4.175 8.588 18.583 7.059 39.147 17.540 20.837
CN 1235.327 1.666 158.841 99.230 164.888 157.844 168.451 178.542
AUC 0.706 0.708 0.717 0.711 0.718 0.696 0.701 0.702
Sens 0.621 0.648 0.627 0.624 0.6132 0.608 0.634 0.619
Acc 0.643 0.643 0.646 0.642 0.647 0.632 0.647 0.642
ntrain = 60ntest = 30 MSE 72.463 5.274 23.199 39.485 20.11 45.200 50.774 49.941
MRSS 10.480 4.107 7.296 9.132 6.637 13.121 11.524 9.445
CN 1160.661 1.809 139.710 96.863 135.324 145.619 141.198 139.887
AUC 0.693 0.702 0.703 0.702 0.703 0.668 0.681 0.695
Sens 0.616 0.998 0.618 0.620 0.599 0.537 0.580 0.616
Acc 0.628 0.628 0.640 0.640 0.635 0.591 0.617 0.630
ntrain = 100ntest = 50 MSE 45.125 1.070 19.140 25.777 16.333 38.221 40.856 41.124
MRSS 7.571 4.030 6.093 7.333 5.714 9.456 7.451 6.896
CN 1148.112 3.498 134.152 97.288 130.574 147.519 148.648 139.288
AUC 0.746 0.757 0.760 0.747 0.755 0.722 0.717 0.750
  Sens 0.598 0.635 0.607 0.598 0.601 0.511 0.550 0.602
  Acc 0.679 0.687 0.681 0.679 0.681 0.620 0.638 0.673

When the MRSS is used as a criterion of goodness of fit, the MRSS values of MLE are quite high in every simulation scenario, whereas those of the proposed method are markedly lower than those of MLE and of all the other biased estimators. The MRSS values of the conventional ridge and Liu-type estimators decrease slightly relative to MLE, but not to the same extent as with the proposed method. Similarly, when the AUC, Sens and Acc values computed on the testing sets are compared as measures of prediction or classification performance, the proposed method increases these values relative to MLE in every simulation scenario. The other estimators also tend to increase these values; however, in some scenarios the classification measures of the proposed method are significantly higher.

When the ridge and Liu-type logistic estimators are compared among themselves, the Liu-type logistic estimators generally yield better MSE values than the ridge estimators, especially for the sample sizes n = 60 and n = 90. However, the performance of the Liu-type logistic estimators differs considerably depending on which shrinkage parameter is applied. The performances of the LT1 and LT3 estimators are almost equivalent, whereas the LT2 estimator is the most robust among the Liu-type estimators. Moreover, among the ridge logistic estimators, the R3 estimator with cross-validation gives better results than the others.

Consequently, the simulation results show that the existing conventional methods do not achieve the optimal choice of the (k, d) pair in the Liu-type logistic estimator, and that the proposed method outperforms all of the compared conventional methods. The performance of the proposed method is quite good with respect to every judgment criterion compared to the ridge and Liu-type logistic estimators, both in small samples, where multicollinearity is more severe, and in large samples, where it is less so. The proposed PSO-based estimator performs best in terms of solving the ill-conditioning problem, significantly reducing the MSE values and simultaneously increasing the values of the classification measures, in other words improving the model fit. If the estimated (k, d) pair equals the actual optimum value, the mean square error of the parameters is minimized and the fitting performance of the model is improved. As can be seen, the proposed estimator yields better estimates by minimizing the MSE, solving the multicollinearity problem and improving the fitting performance of the model at the same time. Consequently, the optimum value of the (k, d) pair can be obtained by the proposed method.
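The condition numbers reported in the tables, and the effect of ridge-type shrinkage on them, can be illustrated with a small sketch. This is only a simplified illustration in Python with NumPy; the toy design matrix, the shrinkage constant and the variable names are our own assumptions, not the paper's simulation design:

```python
import numpy as np

def condition_number(A):
    """Ratio of the largest to the smallest singular value of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return s.max() / s.min()

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
# The second column is almost a copy of the first: a near-linear dependency.
X = np.column_stack([x1,
                     x1 + 1e-4 * rng.normal(size=100),
                     rng.normal(size=100)])

XtX = X.T @ X
cn_raw = condition_number(XtX)            # huge: severe ill-conditioning
# Adding k to the diagonal, as ridge-type estimators do, shrinks the CN.
cn_shrunk = condition_number(XtX + 5.0 * np.eye(3))
print(cn_raw, cn_shrunk)
```

The shrunken condition number falls below the usual severity thresholds, which mirrors how the biased estimators in the tables reduce the CN relative to MLE.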

6. Application

In the application part of this study, the real dataset used in the article that proposed the Liu-type logistic estimator, by Inan and Erdogan [14], was utilized. The proposed method was evaluated on a dataset taken from the website of the Banks Association of Turkey [7]. This dataset concerns the collapse of banks in Turkey during the Asian financial crisis (www.bddk.org.tr, 2002). The goal of this application was to develop a logistic regression model to estimate the probability that a Turkish commercial bank would be transferred to the Savings Deposit Insurance Fund (SDIF), where the dependent variable is defined as the financial status of the bank:

yi = 1, if the bank was transferred to the SDIF; yi = 0, otherwise.  (31)

The dataset (n = 41) was randomly divided into training and testing sets with sample sizes ntrain = 30 and ntest = 11. Following the proposal of Hosmer and Lemeshow [12], a variable with a univariate test p value below 0.025 was selected as necessary for the logistic model. Accordingly, the predictors used in the study were determined as follows:

X1: (Equity + Profit) / (Deposit + Non-deposit resources)  (Capital adequacy)
X2: Total credit / Total assets  (Asset quality)
X3: Current assets / Total assets  (Asset quality)
X4: Current assets / (Deposit + Non-deposit resources)  (Liquidity)
X5: Net profit / Average total assets  (Liquidity)
X6: Non-interest income / Non-interest expense  (Income-expenditure structure)
X7: Total revenue / Total expense  (Income-expenditure structure)

Tables 7 and 8 show the results of the real dataset application. The condition number of the data matrix was calculated as 1472 × 10^7, which indicates quite severe multicollinearity. Furthermore, the variance inflation factor (VIF) values of the explanatory variables were calculated as 34.63, 2.67, 235.63, 357.77, 9.28, 3.79 and 7.40, which also point to a multicollinearity problem in the dataset. The PSO-based LT, LT1, LT2, LT3, R1, R2, R3 and MLE estimates, together with the corresponding goodness-of-fit and classification measures (MRSS and CN for the training set; Sens, Acc and AUC for the testing set), were computed. Since relatively severe collinearity exists, the estimation results of MLE are not reliable: its MRSS is quite high and the values of its classification measures are quite low. Moreover, the sign of the MLE of β1 is positive, although β1 should be negative, since X1 is an indicator of capital adequacy and adequate capital prevents transfer to the SDIF; the MLE of β1 thus contradicts this interpretation. The proposed estimator corrects these problems: the sign of β1 becomes negative, the ill-conditioning problem is solved by reducing the CN to below 100, and the MRSS is lower than that of MLE and of the compared biased estimators. The other conventional biased estimators neither solve the ill-conditioning problem nor give good results with respect to the performance judgment criteria; although they decrease the MRSS values and increase the values of the classification measures, they do not change these values as markedly as the PSO-based LT does. It should also be mentioned that a large k value was estimated by the proposed method, since relatively severe multicollinearity exists in the dataset and only a sufficiently large k can solve this severe ill-conditioning problem.
In line with this, the improvement of the model fit and of the statistical properties shows that the optimal d value is obtained so as to reduce and offset the bias caused by the large selected k value. This is because, if the estimated (k, d) pair equals the actual optimum value, minimizing the mean square error of the parameters improves the model fit.
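The VIF diagnostics quoted above can be reproduced for any design matrix by regressing each column on the remaining ones. The following is a minimal sketch, assuming a NumPy array X with one column per covariate; the toy data and the function name `vif` are our own illustration, not the paper's code:

```python
import numpy as np

def vif(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j of X on all the other columns plus an intercept."""
    n, p = X.shape
    out = []
    for j in range(p):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        tss = (y - y.mean()) @ (y - y.mean())
        r2 = 1.0 - (resid @ resid) / tss
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
a = rng.normal(size=200)
# First two columns are almost identical; the third is independent.
X = np.column_stack([a, a + 0.01 * rng.normal(size=200),
                     rng.normal(size=200)])
print(vif(X))
```

VIF values far above the conventional cutoff of 10, as for the first two columns here, flag the same kind of multicollinearity reported for the bank dataset.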

Table 7. Real dataset application results.

  β0 β1 β2 β3 β4 β5 β6 β7 CN
MLE 46.984 0.453 −2.122 1.089 −1.349 −0.476 0.323 −0.212 1472 × 10^7
R1 0.437 1.027 −1.179 2.211 −1.929 −0.226 0.164 −0.042 1657.54 × 10^4
R2 0.384 0.821 −0.983 1.862 −1.630 −0.212 0.140 −0.031 4147.961 × 10^3
R3 15.772 −0.047 −0.475 −0.08 −0.065 −0.026 0.086 −0.092 1378.849 × 10^1
LT1 0.382 0.797 −0.960 1.804 −1.585 −0.216 0.136 −0.028 4099.506 × 10^2
LT2 7.038 −0.011 −0.229 0.041 −0.054 −0.020 0.048 −0.052 1019.234
LT3 0.434 1.027 −1.178 2.216 −1.933 −0.221 0.164 −0.042 1627.54 × 10^4
PSO-based LT 4.811 −0.021 −0.143 0.013 −0.003 −0.010 0.038 −0.045 93.23

Table 8. Judgment criteria values for the real dataset application.

  MRSS AUC Spec Acc Sens k^ d^
MLE 11.231 0.427 0.285 0.363 0.49
R1 8.578 0.601 0.571 0.545 0.5 0.0008
R2 8.500 0.602 0.571 0.545 0.5 0.0031
R3 6.751 0.601 0.798 0.703 0.5 0.98
LT1 8.534 0.602 0.571 0.545 0.5 0.0031 −4 × 10^−10
LT2 5.626 0.686 0.812 0.712 0.51 33.131 −0.000042
LT3 8.509 0.602 0.574 0.545 0.5 0.0008 −0.009
PSO-based LT 3.440 0.689 0.857 0.795 0.51 74.513 −2.123

7. Conclusion

In this study, a simultaneous estimation method based on PSO for the optimal selection of the k and d parameters of the Liu-type logistic estimator has been proposed. The performance of the proposed estimator has been compared with the existing conventional ridge and Liu-type logistic estimators according to various judgment criteria. A Monte Carlo simulation study with different scenarios has revealed that the proposed method, through the proper objective function used in the optimization process, outperforms the compared conventional biased estimators. The proposed estimator shows the best performance in terms of solving the multicollinearity problem, significantly reducing the MSE values and improving the model fit, simultaneously, whereas the existing biased estimators fail to achieve these three goals at the same time. The real dataset application has also demonstrated the suitability of the proposed method. Furthermore, when the conventional ridge and Liu-type logistic estimators are compared, the Liu-type estimators are generally more successful than the ridge estimators. However, the compared conventional Liu-type estimators do not give good enough results because they are two-phase methods, while the single-phase PSO-based method gives very good results in every respect compared to the two-phase Liu-type estimators. Consequently, we strongly suggest the proposed PSO-based algorithm for the optimal selection of the (k, d) pair in the Liu-type logistic estimator.
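The global-best PSO routine at the core of such a method can be sketched in a generic form. The inertia weight, acceleration constants, bounds and the quadratic test objective below are illustrative assumptions standing in for the paper's actual objective function over (k, d), not its exact settings:

```python
import numpy as np

def pso_minimize(objective, bounds, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(x) over box bounds with global-best PSO."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    pos = rng.uniform(lo, hi, size=(n_particles, len(bounds)))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([objective(x) for x in pos])
    g = pbest[pbest_val.argmin()].copy()
    g_val = pbest_val.min()
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + cognitive pull + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = np.array([objective(x) for x in pos])
        better = val < pbest_val
        pbest[better], pbest_val[better] = pos[better], val[better]
        if pbest_val.min() < g_val:
            g = pbest[pbest_val.argmin()].copy()
            g_val = pbest_val.min()
    return g, g_val

# Toy stand-in for a criterion over (k, d), minimized at k = 2, d = -1.
obj = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
best, best_val = pso_minimize(obj, [(0.0, 10.0), (-5.0, 5.0)])
```

Estimating both coordinates in one swarm, as here, is what makes the approach single-phase, in contrast to the two-phase conventional selectors.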

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1.Abonazel M.R. and Farghali R.A., Liu-type multinomial logistic estimator. Sankhya B 81 (2019), pp. 203–225. Available at 10.1007/s13571-018-0171-4. [DOI] [Google Scholar]
  • 2.Abuella M.A., Study of particle swarm for optimal power flow in IEEE benchmark systems including wind power generators, Ph.D. diss., Southern Illinois University Carbondale, 2012.
  • 3.Aguilera A.M., Manuel E., and Mariano J.V., Using principal components for estimating logistic regression with high-dimensional multicollinear data. Comput. Stat. Data Anal. 50 (2006), pp. 1905–1924. doi: 10.1016/j.csda.2005.03.011 [DOI] [Google Scholar]
  • 4.Aitkin M.A., Aitkin M., Francis B., and Hinde J., Statistical Modelling in GLIM 4, Vol. 32, OUP, Oxford, 2005. [Google Scholar]
  • 5.Asar Y., Some new methods to solve multicollinearity in logistic regression. Commun. Stat. Simul. Comput. 46 (2017), pp. 2576–2586. doi: 10.1080/03610918.2015.1053925 [DOI] [Google Scholar]
  • 6.Bai Q., Analysis of particle swarm optimization algorithm. Comput. Inf. Sci. 3 (2010), pp. 180–184. [Google Scholar]
  • 7.The Banks Association of Turkey, Banks in Turkey (1998–2002). Available at www.tbb.org.tr.
  • 8.Belsley D.A., Kuh E., and Welsch R.E., Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, John Wiley & Sons, New York, 1980. [Google Scholar]
  • 9.Cagcag O., Yolcu U., and Egrioglu E., A new robust regression method based on particle swarm optimization. Commun. Stat. Theory Methods 44 (2015), pp. 1270–1280. doi: 10.1080/03610926.2012.718843 [DOI] [Google Scholar]
  • 10.Hoerl A.E. and Kennard R.W., Ridge regression: Biased estimation for nonorthogonal problems. Technometrics. 12 (1970), pp. 55–67. doi: 10.1080/00401706.1970.10488634 [DOI] [Google Scholar]
  • 11.Hoerl A.E. and Kennard R.W., Ridge regression: Applications to nonorthogonal problems. Technometrics. 12 (1970), pp. 69–82. doi: 10.1080/00401706.1970.10488635 [DOI] [Google Scholar]
  • 12.Hosmer D.W. and Lemeshow S., Applied Logistic Regression, Wiley, New York, 1989. [Google Scholar]
  • 13.Huang J., A simulation research on a biased estimator in logistic regression model, in International Symposium on Intelligence Computation and Applications, Springer, Berlin, 2012, pp. 389–395. [Google Scholar]
  • 14.Inan D. and Erdogan B.E., Liu-type logistic estimator. Commun. Stat. Simul. Comput. 42 (2013), pp. 1578–1586. doi: 10.1080/03610918.2012.667480 [DOI] [Google Scholar]
  • 15.Inan D., Egrioglu E., Sarica B., Askin O.E., and Tez M., Particle swarm optimization based Liu-type estimator. Commun. Stat. Theory Methods 46 (2017), pp. 11358–11369. doi: 10.1080/03610926.2016.1267759 [DOI] [Google Scholar]
  • 16.Kennedy J. and Eberhart R., Particle swarm optimization. Proc. IEEE Int. Conf. Neural Netw. 4 (1995), pp. 1942–1948. doi: 10.1109/ICNN.1995.488968 [DOI] [Google Scholar]
  • 17.Le Cessie S. and Van Houwelingen J.C., Ridge estimators in logistic regression. J. R. Stat. Soc. C Appl. Stat. 41 (1992), pp. 191–201. [Google Scholar]
  • 18.Liu K., A new class of biased estimate in linear regression. Commun. Stat. Theory Methods 22 (1993), pp. 393–402. doi: 10.1080/03610929308831027 [DOI] [Google Scholar]
  • 19.Liu K., Using Liu-type estimator to combat collinearity. Commun. Stat. Theory Methods 32 (2003), pp. 1009–1020. doi: 10.1081/STA-120019959 [DOI] [Google Scholar]
  • 20.Månsson K. and Shukur G., On ridge parameters in logistic regression. Commun. Stat. Theory Methods 40 (2011), pp. 3366–3381. doi: 10.1080/03610926.2010.500111 [DOI] [Google Scholar]
  • 21.Månsson K., Kibria B.G., and Shukur G., On Liu estimators for the logit regression model. Econ. Model. 29 (2012), pp. 1483–1488. doi: 10.1016/j.econmod.2011.11.015 [DOI] [Google Scholar]
  • 22.McDonald G.C. and Galarneau D.I., A Monte Carlo evaluation of some ridge-type estimators. J. Am. Stat. Assoc. 70 (1975), pp. 407–416. doi: 10.1080/01621459.1975.10479882 [DOI] [Google Scholar]
  • 23.Newhouse J.P. and Oman S.D., An Evaluation of Ridge Estimators, Rand Corporation, P-716-PR, 1971.
  • 24.Sancar N. and Inan D., Identification of influential observations based on binary particle swarm optimization in the Cox PH model. Commun. Stat. Simul. Comput. 49 (2019), pp. 1–24. [Google Scholar]
  • 25.Schaefer R., Roi L., and Wolfe R., A ridge logistic estimator. Commun. Stat. Theory Methods 13 (1984), pp. 99–113. doi: 10.1080/03610928408828664 [DOI] [Google Scholar]
  • 26.Tak N., Evren A.A., Tez M., and Egrioglu E., Recurrent type-1 fuzzy functions approach for time series forecasting. Appl. Intell. 48 (2018), pp. 68–77. doi: 10.1007/s10489-017-0962-8 [DOI] [Google Scholar]
  • 27.Uslu V.R., Egrioglu E., and Bas E., Finding optimal value for the shrinkage parameter in ridge regression via particle swarm optimization. Am. J. Intell. Syst. 4 (2014), pp. 142–147. [Google Scholar]

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
