Journal of Applied Statistics. 2024 May 1; 51(15): 3233–3247. doi: 10.1080/02664763.2024.2346346

MSE superiority of the unrestricted Stein-rule estimator in a regression model with a possible structural break

Haifeng Xu, Akio Namba
PMCID: PMC11536628  PMID: 39507216

Abstract

This paper investigates the estimation of a linear regression model with a possible structural break at a known point. We analytically derive the exact formulae of the MSE for the restricted SR and PSR estimators, which shrink the OLS estimator toward the restriction of no structural break. We compare the MSE performance of the restricted/unrestricted SR, PSR, and least squares estimators. We analytically show that the unrestricted SR estimator can have a smaller MSE than the restricted SR estimator even when the restriction is correct. Further, our numerical results show that the unrestricted PSR estimator has the best MSE performance over a wide region of the parameter space. These results indicate that the unrestricted PSR estimator is recommended even when a structural break may not exist, although the unrestricted PSR estimator does not take the possibility of no structural break into consideration.

Keywords: Stein-rule estimator, structural break, MSE performance

1. Introduction

A structural break is an important issue for applied scientists. In recent years, the detection and estimation of linear regressions with a possible structural break have been widely used in many fields, such as finance, bioinformatics, climatology, and econometrics. Early empirical research, mainly based on the approaches of [3,6], focused on determining whether the parameters change. This is important since such results can be useful for estimation and forecasting. When the evidence shows that a structural break exists, a model with the structural break should be used. The literature on the detection and estimation of structural breaks goes back to [2–5,7,11,12,22,23]. Recently, some studies suggest using a pretest estimator, which uses the restricted estimator when the structural change test is insignificant and the unrestricted estimator when the test is significant (e.g. [13]). However, the performance of the pretest-based method is not very good.

On the other hand, although the ordinary least squares (OLS) estimator is the best linear unbiased estimator (BLUE), there exist estimators that have smaller variance than the OLS estimator if we do not insist on unbiasedness. [15,24] showed that the Stein-rule (SR) estimator dominates the OLS estimator in terms of predictive mean squared error (PMSE) if the number of regression coefficients is larger than or equal to three. Further, [8] showed that the SR estimator is in turn dominated by the positive-part Stein-rule (PSR) estimator. The superior sampling properties of the SR and PSR estimators have been studied in a variety of situations [9,19,20]. Further, [16] proposed the restricted SR (RSR) estimator, which is obtained by shrinking the OLS estimator towards the restricted least squares (RLS) estimator. Many studies have shown that the restricted PSR (RPSR) estimator dominates the RSR estimator [14,17].

The model with a possible structural break can be regarded as a model with a linear constraint such that there is no structural break when the constraint is true. Thus, we may apply the RSR and RPSR estimators to the model with a possible structural break. On the other hand, if the linear constraint is not true, the unrestricted SR and PSR estimators have smaller MSE than the OLS estimator. However, when we are not sure whether a structural break exists, we have to decide which estimator to use, restricted or unrestricted. One way to make such a choice is to use a pretest. However, as shown in [13], the pretest-based method does not work very well. In view of these issues, this paper proposes another approach: simply apply the unrestricted SR and PSR estimators to the model with a possible structural break.

Thus, in this paper, we consider a linear regression with a possible structural break at a known point and derive the exact formulae of the MSE for the RSR and RPSR estimators. Further, we compare the MSE performance of the restricted/unrestricted SR, PSR, and least squares estimators. Our results suggest that the unrestricted PSR estimator has the best MSE performance over a wide region of the parameter space. This indicates that we may use the unrestricted PSR estimator even when a structural break may exist in the sample period. The remainder of this paper is organized as follows. In Section 2, the model and estimators are presented. In Section 3, we derive the exact formulae for the MSE of the RSR and RPSR estimators. In Section 4, we compare the MSE performance of the estimators by numerical evaluations. In Section 5, we illustrate the estimators with empirical examples. Finally, some concluding remarks are given in Section 6.

2. Model and estimators

Consider a linear regression model with a possible structural break at a known point

y_i = X_i\beta_i + \varepsilon_i, \quad i = 1, 2, \tag{1}

where y_i is an n_i \times 1 vector of observations on the dependent variable, X_i is an n_i \times k matrix of observations on nonstochastic explanatory variables, \beta_i is a k \times 1 vector of coefficients, and \varepsilon_i is an n_i \times 1 vector of error terms with \varepsilon_i \sim N(0, \sigma^2 I_{n_i}), where the error variance \sigma^2 is common to both regimes. We assume that X_i is of full column rank; the subscript i denotes the regime.

Let

y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad X = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad \varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix},

then we can rewrite (1) as

y = X\beta + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I_n), \tag{2}

where y is n×1, X is n×2k, β is 2k×1, ε is n×1, and n=n1+n2.

The OLS estimator for \beta is b = S^{-1}X'y, where S = X'X, and the OLS residual vector is e = y - Xb.

In model (2), the Stein-rule (SR) estimator proposed by Stein [24] and James and Stein [15] is defined as

b_{SR} = \left(1 - a_1\,\frac{e'e}{b'Sb}\right) b, \tag{3}

where a_1 is a constant such that 0 \le a_1 \le 2(2k-2)/(n-2k+2). It is well known that the SR estimator dominates the OLS estimator in terms of MSE. Also, as shown by Baranchik [8], the SR estimator is further dominated by the positive-part Stein-rule (PSR) estimator defined as

b_{PSR} = \max\left[0,\ 1 - a_1\,\frac{e'e}{b'Sb}\right] b. \tag{4}
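As a concrete illustration, the SR and PSR estimators in (3) and (4) can be computed directly from the OLS quantities. The following Python sketch uses simulated data; the simulated design and all variable names are our own, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 6                       # p = 2k regression coefficients in the stacked model
X = rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(size=n)

S = X.T @ X
b = np.linalg.solve(S, X.T @ y)     # OLS estimator b = S^{-1} X'y
e = y - X @ b                       # OLS residual vector

# MSE-minimizing shrinkage constant a1 = (2k - 2)/(n - 2k + 2), with 2k = p
a1 = (p - 2) / (n - p + 2)
shrink = 1 - a1 * (e @ e) / (b @ S @ b)
b_sr = shrink * b                   # Stein-rule estimator, Equation (3)
b_psr = max(0.0, shrink) * b        # positive-part Stein-rule, Equation (4)
```

Since the shrinkage factor cannot exceed one, the positive-part correction only matters when the factor becomes negative, in which case the PSR estimator avoids reversing the sign of b.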

In this model, \beta_1 = \beta_2 if there is no structural break. Since the existence of a structural break is uncertain in many applications, we can incorporate the possibility of no structural break into the model by considering the following restriction on \beta:

R\beta = c, \tag{5}

where R = [\,I_k \;\; -I_k\,] and c = 0. If the above restriction is true, then there is no structural break, and vice versa. Under the above restriction, the RLS estimator for \beta is

b_R = b - S^{-1}R'(RS^{-1}R')^{-1}Rb = \begin{bmatrix} b_{1R} \\ b_{2R} \end{bmatrix}. \tag{6}

Noting that (S_1^{-1} + S_2^{-1})^{-1} = S_1 - S_1(S_2 + S_1)^{-1}S_1, it is easy to show that

b_{1R} = b_{2R} = (S_1 + S_2)^{-1}(S_1 b_1 + S_2 b_2), \tag{7}

where

S = \begin{bmatrix} S_1 & 0 \\ 0 & S_2 \end{bmatrix},

S_i = X_i'X_i, and b_i = S_i^{-1}X_i'y_i is the OLS estimator for \beta_i for i = 1, 2.

Under the restriction (5), [1,16] proposed the following restricted SR (RSR) and restricted PSR (RPSR) estimators:

b_{RSR} = \left(1 - a_2\,\frac{e'e}{b'R'(RS^{-1}R')^{-1}Rb}\right)(b - b_R) + b_R, \tag{8}
b_{RPSR} = I\!\left(1 - a_2\,\frac{e'e}{b'R'(RS^{-1}R')^{-1}Rb} \ge 0\right) b_{RSR} + I\!\left(1 - a_2\,\frac{e'e}{b'R'(RS^{-1}R')^{-1}Rb} < 0\right) b_R, \tag{9}

where a_2 is a constant such that 0 \le a_2 \le 2(k-2)/(n-2k+2), and I(A) is an indicator function such that I(A) = 1 if the event A occurs and I(A) = 0 otherwise.
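To make the restricted estimators concrete, the following sketch (again on a simulated two-regime design of our own) computes the RLS estimator (6), checks the equivalent form (7), and forms the RSR and RPSR estimators (8)–(9):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n1, n2 = 3, 40, 60
n = n1 + n2
X1 = rng.normal(size=(n1, k))
X2 = rng.normal(size=(n2, k))
beta1 = np.array([1.0, 2.0, -1.0])
y1 = X1 @ beta1 + rng.normal(size=n1)
y2 = X2 @ beta1 + rng.normal(size=n2)      # same coefficients: no break

# Stacked model (2): block-diagonal X, beta = (beta1', beta2')'
X = np.block([[X1, np.zeros((n1, k))], [np.zeros((n2, k)), X2]])
y = np.concatenate([y1, y2])
S = X.T @ X
b = np.linalg.solve(S, X.T @ y)
e = y - X @ b

# Restriction (5): R beta = 0 with R = [I_k  -I_k]
R = np.hstack([np.eye(k), -np.eye(k)])
Sinv = np.linalg.inv(S)
M = R @ Sinv @ R.T                          # R S^{-1} R'
b_R = b - Sinv @ R.T @ np.linalg.solve(M, R @ b)   # RLS, Equation (6)

# Equation (7): both halves of b_R are the same matrix-weighted average
S1, S2 = X1.T @ X1, X2.T @ X2
b1 = np.linalg.solve(S1, X1.T @ y1)
b2 = np.linalg.solve(S2, X2.T @ y2)
b_avg = np.linalg.solve(S1 + S2, S1 @ b1 + S2 @ b2)

# RSR and RPSR estimators, Equations (8) and (9)
a2 = (k - 2) / (n - 2 * k + 2)
quad = b @ R.T @ np.linalg.solve(M, R @ b)  # b'R'(RS^{-1}R')^{-1}Rb
shrink = 1 - a2 * (e @ e) / quad
b_rsr = shrink * (b - b_R) + b_R
b_rpsr = b_rsr if shrink >= 0 else b_R
```

The check of (7) against (6) holds exactly in this setting, since both are algebraic rewritings of the same restricted least squares problem.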

Following [1,16], we reparameterize the model as follows. Let S^{-1/2} be a symmetric matrix such that S^{-1/2}SS^{-1/2} = I_{2k}. Since

S^{-1/2}R'(RS^{-1}R')^{-1}RS^{-1/2}

is an idempotent matrix with rank k, there exists an orthogonal matrix Q such that

Q'S^{-1/2}R'(RS^{-1}R')^{-1}RS^{-1/2}Q = \begin{bmatrix} I_k & 0 \\ 0 & 0 \end{bmatrix}.

If we define Z, \gamma and H as Z = XS^{-1/2}Q, \gamma = Q'S^{1/2}\beta, and H = RS^{-1/2}Q, then the model (2) and the linear restriction (5) are reparameterized as

y = Z\gamma + \varepsilon, \quad H\gamma = 0.

The OLS and RLS estimators in the reparameterized model are

\hat{\gamma} = Z'y, \quad \hat{\gamma}_R = \hat{\gamma} - H'(HH')^{-1}H\hat{\gamma}.

Note that the residual vector is not affected by the reparameterization, since e = y - Z\hat{\gamma} = y - Xb. Thus, the SR and PSR estimators for the regression coefficients of the reparameterized model are given by

\hat{\gamma}_{SR} = \left(1 - a_1\,\frac{e'e}{\hat{\gamma}'\hat{\gamma}}\right)\hat{\gamma}, \tag{10}
\hat{\gamma}_{PSR} = \max\left[0,\ 1 - a_1\,\frac{e'e}{\hat{\gamma}'\hat{\gamma}}\right]\hat{\gamma}. \tag{11}

Since

H = H\,H'(HH')^{-1}H = [\,H_1 \;\; 0\,],

the linear restriction H\gamma = 0 is transformed into H_1\gamma_1 = 0, where H_1 is the k \times k leading submatrix of H and \gamma_1 consists of the first k elements of \gamma: \gamma = (\gamma_1', \gamma_2')'. Because the rank of H is k, H_1 is of full rank and hence nonsingular, so the linear restriction can be written as \gamma_1 = 0. Accordingly, the OLS and RLS estimators are written as \hat{\gamma} = (\hat{\gamma}_1', \hat{\gamma}_2')' and \hat{\gamma}_R = (0', \hat{\gamma}_2')', respectively. Noting that b'R'(RS^{-1}R')^{-1}Rb = \hat{\gamma}_1'\hat{\gamma}_1, the RSR and RPSR estimators for \gamma are written as

\hat{\gamma}_{RSR} = \left(1 - a_2\,\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)(\hat{\gamma} - \hat{\gamma}_R) + \hat{\gamma}_R, \tag{12}
\hat{\gamma}_{RPSR} = I\!\left(1 - a_2\,\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1} \ge 0\right)\hat{\gamma}_{RSR} + I\!\left(1 - a_2\,\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1} < 0\right)\hat{\gamma}_R. \tag{13}

For convenience, we define the following general class of estimators:

\tilde{\gamma} = I(F \ge \tau)\hat{\gamma}_{RSR} + I(F < \tau)\hat{\gamma}_R, \tag{14}

where F = (\hat{\gamma}_1'\hat{\gamma}_1/k)/(e'e/\nu) is the test statistic for the null hypothesis H_0: \gamma_1 = 0 against the alternative H_1: \gamma_1 \ne 0, \nu = n - 2k, and \tau is a critical value of the test. Note that \tilde{\gamma} reduces to the RSR estimator when \tau = 0, to the RLS estimator when \tau = \infty, and to the RPSR estimator when \tau = a_2\nu/k.
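The reparameterization and the general class (14) can be sketched numerically as follows (our own simulated design; the symmetric square root and the orthogonal Q are built by eigendecomposition, which is one valid choice among many):

```python
import numpy as np

rng = np.random.default_rng(2)
k, n1, n2 = 3, 40, 60
n = n1 + n2
X1 = rng.normal(size=(n1, k))
X2 = rng.normal(size=(n2, k))
X = np.block([[X1, np.zeros((n1, k))], [np.zeros((n2, k)), X2]])
y = X @ np.ones(2 * k) + rng.normal(size=n)

S = X.T @ X
R = np.hstack([np.eye(k), -np.eye(k)])

# Symmetric S^{-1/2} from the eigendecomposition of S
w, V = np.linalg.eigh(S)
S_mhalf = V @ np.diag(w ** -0.5) @ V.T

# Orthogonal Q diagonalizing the idempotent matrix, so the restriction
# H gamma = 0 becomes gamma_1 = 0 (the first k elements of gamma)
P = S_mhalf @ R.T @ np.linalg.inv(R @ np.linalg.inv(S) @ R.T) @ R @ S_mhalf
vals, Q = np.linalg.eigh(P)
Q = Q[:, np.argsort(-vals)]          # unit eigenvalues first

Z = X @ S_mhalf @ Q                  # Z'Z = I_{2k}
g = Z.T @ y                          # OLS estimator in the new parameterization
e = y - Z @ g                        # same residuals as y - Xb
g_R = g.copy()
g_R[:k] = 0.0                        # RLS estimator: gamma_1 set to zero

nu = n - 2 * k
a2 = (k - 2) / (nu + 2)
F = (g[:k] @ g[:k] / k) / (e @ e / nu)   # F statistic for H0: gamma_1 = 0

def gamma_tilde(tau):
    """General class (14): RSR when F >= tau, RLS otherwise."""
    if F < tau:
        return g_R
    shrink = 1 - a2 * (e @ e) / (g[:k] @ g[:k])
    return shrink * (g - g_R) + g_R

g_rsr = gamma_tilde(0.0)             # tau = 0        -> RSR estimator
g_rls = gamma_tilde(np.inf)          # tau = infinity -> RLS estimator
g_rpsr = gamma_tilde(a2 * nu / k)    # tau = a2*nu/k  -> RPSR estimator
```

The choice tau = a2*nu/k reproduces the RPSR estimator because F < a2\nu/k is equivalent to the shrinkage factor in (12) being negative.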

3. MSE of the estimators

In this section, we derive the explicit formula for the MSE of γ~ and show some theoretical results about the MSE of the estimators.

For any estimator \bar{\beta} of \beta, let \bar{\gamma} = Q'S^{1/2}\bar{\beta}. Then we have

MSE(\bar{\gamma}) = E[(\bar{\gamma} - \gamma)'(\bar{\gamma} - \gamma)] = E[(X\bar{\beta} - X\beta)'(X\bar{\beta} - X\beta)]. \tag{15}

Thus, a smaller MSE of \bar{\gamma} implies that a more accurate estimate of X\beta can be made.

Since \hat{\gamma} = (\hat{\gamma}_1', \hat{\gamma}_2')' and \hat{\gamma}_R = (0', \hat{\gamma}_2')', we have \hat{\gamma} - \hat{\gamma}_R = (\hat{\gamma}_1', 0')'. Substituting these formulas in Equation (12), we have

\hat{\gamma}_{RSR} = \begin{bmatrix} \left(1 - a_2\,\dfrac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)\hat{\gamma}_1 \\[1ex] \hat{\gamma}_2 \end{bmatrix}.

Thus, the MSE of γ~ is given by

MSE(\tilde{\gamma}) = E[(\tilde{\gamma} - \gamma)'(\tilde{\gamma} - \gamma)]
= E[I(F \ge \tau)(\hat{\gamma}_{RSR} - \gamma)'(\hat{\gamma}_{RSR} - \gamma)] + E[I(F < \tau)(\hat{\gamma}_R - \gamma)'(\hat{\gamma}_R - \gamma)]
= E\!\left[I(F \ge \tau)\left\{\left(1 - a_2\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)^{2}\hat{\gamma}_1'\hat{\gamma}_1 - 2\left(1 - a_2\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)\gamma_1'\hat{\gamma}_1\right\}\right] + E\!\left[\{I(F \ge \tau) + I(F < \tau)\}\{\gamma_1'\gamma_1 + (\hat{\gamma}_2 - \gamma_2)'(\hat{\gamma}_2 - \gamma_2)\}\right]
= E\!\left[I(F \ge \tau)\left(1 - a_2\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)^{2}\hat{\gamma}_1'\hat{\gamma}_1\right] - 2E\!\left[I(F \ge \tau)\left(1 - a_2\frac{e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)\gamma_1'\hat{\gamma}_1\right] + \gamma_1'\gamma_1 + k\sigma^2.

Here, we define the functions H(p,q,\delta,\tau) and J(p,q,\delta,\tau) as

H(p,q,\delta,\tau) = E\left[I(F < \tau)\left(\frac{\hat{\gamma}_1'\hat{\gamma}_1 - \delta e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)^{p} (\hat{\gamma}_1'\hat{\gamma}_1)^{q}\right], \tag{16}
J(p,q,\delta,\tau) = E\left[I(F < \tau)\left(\frac{\hat{\gamma}_1'\hat{\gamma}_1 - \delta e'e}{\hat{\gamma}_1'\hat{\gamma}_1}\right)^{p} (\hat{\gamma}_1'\hat{\gamma}_1)^{q}\, \gamma_1'\hat{\gamma}_1\right]. \tag{17}

Then, the MSE of γ~ is written as

MSE(\tilde{\gamma}) = H(2,1;a_2,\infty) - H(2,1;a_2,\tau) - 2\left[J(1,0;a_2,\infty) - J(1,0;a_2,\tau)\right] + \sigma^2[\lambda_1 + k],

where \lambda_1 = \gamma_1'\gamma_1/\sigma^2 is the noncentrality parameter of the F-test statistic for the null hypothesis H_0: \gamma_1 = 0.

As shown in the Appendix in the supplementary material, the explicit formulae for H(p,q,\delta,\tau) and J(p,q,\delta,\tau) are given by

H(p,q,\delta,\tau) = (2\sigma^2)^{q} \sum_{i=0}^{\infty} w_i(\lambda_1)\, G_i(p,q,\delta,\tau), \tag{18}
J(p,q,\delta,\tau) = (\gamma_1'\gamma_1)\,(2\sigma^2)^{q} \sum_{i=0}^{\infty} w_i(\lambda_1)\, G_{i+1}(p,q,\delta,\tau), \tag{19}

where

G_i(p,q,\delta,\tau) = \frac{\Gamma((\nu+k)/2 + q + i)}{\Gamma(k/2 + i)\,\Gamma(\nu/2)} \int_0^{\tau^*} t^{k/2 - p + q + i - 1}(1-t)^{\nu/2 - 1}\,[t - \delta(1-t)]^{p}\,dt, \tag{20}

w_i(\lambda_1) = \exp(-\lambda_1/2)(\lambda_1/2)^i/i!, and \tau^* = k\tau/(\nu + k\tau). Also, the MSE of the SR and PSR estimators can be derived in a similar way (see, e.g. [18]).

Note that the exponent of t in the integrand in Equation (20) can be smaller than or equal to -1 when i = 0 and k < 3, in which case the integral does not converge. This indicates that the first moments of the RSR and RPSR estimators do not exist when the number of linear restrictions (k) is smaller than 3.

In a similar way to [21], we can show that the RPSR estimator dominates the RSR estimator in terms of MSE. However, the comparisons between the SR and RSR estimators and between the PSR and RPSR estimators have not been made so far. Here, we compare the MSEs of the SR and RSR estimators under the condition \lambda = \gamma'\gamma/\sigma^2 = \beta'S\beta/\sigma^2 = 0. As discussed in [1], the MSE of the RSR estimator is minimized when a_2 = (k-2)/(\nu+2). Also, the MSE of the SR estimator is minimized when a_1 = (2k-2)/(\nu+2). Thus, we use these values of a_1 and a_2 for the comparison. Let u_1 = \hat{\gamma}'\hat{\gamma}/\sigma^2 and u_2 = e'e/\sigma^2. Then, under the condition \lambda = 0, u_1 \sim \chi^2_{2k} and u_2 \sim \chi^2_{n-2k}, and u_1 and u_2 are mutually independent. Therefore, the MSE of the SR estimator can be written as

MSE[\hat{\gamma}_{SR}]/\sigma^2 = E[(\hat{\gamma}_{SR} - \gamma)'(\hat{\gamma}_{SR} - \gamma)]/\sigma^2 = E[u_1] - 2a_1 E[u_2] + a_1^2 E[u_2^2/u_1]
= 2k - 2a_1(n-2k) + \frac{a_1^2(n-2k)(n-2k+2)}{2k-2} = \frac{2n}{n-2k+2}. \tag{21}

Since \lambda = 0 implies \lambda_1 = \gamma_1'\gamma_1/\sigma^2 = 0, the MSE of \hat{\gamma}_{RSR} for \lambda = 0 can be derived either in a similar way or by letting \lambda_1 = 0 and \tau = 0 in the formula for MSE(\tilde{\gamma}), which yields

MSE[\hat{\gamma}_{RSR}]/\sigma^2 = \frac{2n}{n-k+2} + k. \tag{22}

Comparing Equations (21) and (22), we can see that MSE[\hat{\gamma}_{SR}] < MSE[\hat{\gamma}_{RSR}] if

\frac{2n}{n-2k+2} < \frac{2n}{n-k+2} + k. \tag{23}

Assuming n−2k + 2>0, Equation (23) can be transformed into

2n < (n-2k+2)(n-k+2) \tag{24}

after some manipulations. Since the lhs and the rhs of Equation (24) are O(n) and O(n^2) respectively, Equation (24) holds for reasonable values of the sample size n. In fact, since we assume that the number of observations is larger than the number of regression coefficients (i.e. n > 2k), the above condition holds in most realistic cases. (The exception is the case where k is large and n is small, e.g. k = 8 and n = 17.) Thus, MSE[\hat{\gamma}_{SR}] < MSE[\hat{\gamma}_{RSR}] for reasonable values of k and n when \lambda = 0. Furthermore, since the infinite series in Equations (18) and (19) are absolutely convergent, MSE[\hat{\gamma}_{RSR}] is differentiable with respect to \lambda_1; similarly, MSE[\hat{\gamma}_{SR}] is differentiable with respect to \lambda. Therefore, MSE[\hat{\gamma}_{SR}] and MSE[\hat{\gamma}_{RSR}] are continuous functions of \lambda and \lambda_1, which implies that MSE[\hat{\gamma}_{SR}] < MSE[\hat{\gamma}_{RSR}] in some region around \lambda = 0 and \lambda_1 = 0. This means that the unrestricted SR estimator \hat{\gamma}_{SR} can have a smaller MSE than the restricted SR estimator \hat{\gamma}_{RSR} even when \lambda_1 = 0. Since \lambda_1 = 0 means that there is no structural break, this result shows that the unrestricted SR estimator can have a smaller MSE than the restricted SR estimator even when there is no structural break, even though the RSR estimator takes the possibility of no structural break into consideration. Also, since the PSR estimator dominates the SR estimator, the PSR estimator may have superior MSE performance even when no structural break exists.
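The comparison above is easy to check numerically. A small sketch of Equations (21), (22) and condition (24), including the exceptional case k = 8, n = 17 mentioned above (function names are ours):

```python
def mse_sr(n, k):
    # Equation (21): MSE/sigma^2 of the unrestricted SR estimator at lambda = 0
    return 2 * n / (n - 2 * k + 2)

def mse_rsr(n, k):
    # Equation (22): MSE/sigma^2 of the restricted SR estimator at lambda = 0
    return 2 * n / (n - k + 2) + k

def sr_beats_rsr(n, k):
    # Condition (24): SR has smaller MSE than RSR iff 2n < (n-2k+2)(n-k+2)
    assert n - 2 * k + 2 > 0
    return 2 * n < (n - 2 * k + 2) * (n - k + 2)
```

For k = 8 and n = 17, the right-hand side of (24) is 3 × 11 = 33 while the left-hand side is 34, so the inequality fails and the RSR estimator has the smaller MSE at lambda = 0.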

However, further analytical comparison of the MSEs of these shrinkage estimators is very difficult. Thus, we compare the performances of the estimators considered above by numerical evaluations in the next section.

4. Numerical results

In this section, we execute numerical evaluations to investigate the MSE performance of the estimators introduced in the previous section. To obtain clear insight, we calculate the relative MSE defined as

\frac{MSE[\text{estimator under consideration}]}{MSE[\text{unrestricted OLS estimator}]} = \frac{MSE[\tilde{\gamma}]}{MSE[\hat{\gamma}]}, \tag{25}

where \tilde{\gamma} is any estimator of \gamma other than the unrestricted OLS estimator. Thus, if the value of the relative MSE is smaller than unity, \tilde{\gamma} has a smaller MSE than the OLS estimator \hat{\gamma}. Since the moments of the RPSR estimator do not exist when k \le 2, we do not consider the cases with k \le 2. The parameter values used in the numerical evaluations are as follows: k = 3, 4, 5; n = 12, 20, 30, 50, 100; \lambda = \beta'S\beta/\sigma^2 = \gamma'\gamma/\sigma^2 = various values; and \lambda_1 = \omega\lambda, where \omega = 0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0. Note that \omega = 0 means that the null hypothesis H_0: \beta_1 = \beta_2 is true, which in turn means that there is no structural break. Thus, when \omega is close to zero, the magnitude of the structural break is small. The infinite series in the formulae for H(p,q,\delta,\tau) and J(p,q,\delta,\tau) given in (18) and (19) are judged to have converged when the increment becomes smaller than 10^{-12}. Further, the integral in G_i(p,q,\delta,\tau) given in (20) is calculated using Simpson's 3/8 rule with 1000 equal subdivisions. As discussed in the previous section, the MSEs of the SR and RSR estimators are minimized when a_1 = (2k-2)/(n-2k+2) and a_2 = (k-2)/(\nu+2), respectively; we use these values of a_1 and a_2 in the numerical analysis. The numerical evaluations are executed using FORTRAN code. Since the exact formula for the MSE of \tilde{\gamma} given in (14) depends on many parameters, we show the results for k = 3, 4 and n = 20, 50, 100; the results for the other cases are broadly similar.
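As a sketch of this evaluation scheme, G_i in (20) and the series (18) can be coded as below. This is our own Python re-implementation, not the authors' FORTRAN code; we use 999 subintervals, since Simpson's 3/8 rule requires a multiple of three, and the sketch assumes parameter values for which the integrand is finite at t = 0 (for some (p, q, k) at i = 0 it is singular there and more careful quadrature is needed):

```python
import math

def simpson38(f, lo, hi, m=999):
    """Composite Simpson's 3/8 rule on [lo, hi]; m must be a multiple of 3."""
    h = (hi - lo) / m
    s = f(lo) + f(hi)
    for j in range(1, m):
        s += (2.0 if j % 3 == 0 else 3.0) * f(lo + j * h)
    return 3.0 * h * s / 8.0

def G(i, p, q, delta, tau_star, k, nu):
    """G_i(p, q, delta, tau) of Equation (20); tau_star = k*tau/(nu + k*tau)."""
    c = math.exp(math.lgamma((nu + k) / 2 + q + i)
                 - math.lgamma(k / 2 + i) - math.lgamma(nu / 2))
    f = lambda t: (t ** (k / 2 - p + q + i - 1) * (1 - t) ** (nu / 2 - 1)
                   * (t - delta * (1 - t)) ** p)
    return c * simpson38(f, 0.0, tau_star)

def H(p, q, delta, tau_star, k, nu, lam1, tol=1e-12, sigma2=1.0):
    """Series (18) for H(p, q, delta, tau), truncated at increments below tol."""
    total, i = 0.0, 0
    while True:
        w = math.exp(-lam1 / 2) * (lam1 / 2) ** i / math.factorial(i)
        term = w * G(i, p, q, delta, tau_star, k, nu)
        total += term
        if i > 0 and abs(term) < tol:
            break
        i += 1
    return (2 * sigma2) ** q * total
```

A useful sanity check: with p = q = 0 and tau_star = 1 (i.e. \tau = \infty), each G_i reduces to a normalized Beta integral equal to one, so H must equal one for any \lambda_1.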

Table 1 shows the relative MSE of the estimators for k = 3 and n = 20, 50, 100. As shown in the table, the restricted/unrestricted PSR estimators always dominate the restricted/unrestricted SR estimators, which is consistent with the theoretical results discussed in the previous section. We can see that the SR estimator has a smaller MSE than the RSR estimator when \lambda is close to zero, as shown analytically in the previous section. Further, the RLS estimator has a smaller MSE than the RPSR estimator when \lambda is small enough (e.g. \lambda = 0.1, 1.0). When \lambda is large enough (e.g. \lambda \ge 20), the RPSR, RSR, and RLS estimators have almost the same MSE performance.

Table 1.

Relative MSE of the estimators for k = 3 and n = 20, 50, 100.

    n = 20 n = 50 n = 100 n = 20, 50, 100
λ ω SR PSR RSR RPSR SR PSR RSR RPSR SR PSR RSR RPSR RLS
0.1 0.0 0.42627 0.31900 0.51171 0.50867 0.37282 0.26572 0.50277 0.50216 0.35797 0.25136 0.50096 0.50076 0.50000
  0.1     0.51332 0.51030     0.50442 0.50382     0.50262 0.50242 0.50167
  0.3     0.51655 0.51355     0.50774 0.50713     0.50595 0.50575 0.50500
  0.5     0.51978 0.51680     0.51105 0.51045     0.50928 0.50908 0.50833
  0.7     0.52301 0.52005     0.51436 0.51377     0.51260 0.51241 0.51167
  0.9     0.52625 0.52330     0.51767 0.51708     0.51593 0.51573 0.51500
  1.0     0.52786 0.52493     0.51933 0.51874     0.51759 0.51740 0.51667
1.0 0.0 0.50286 0.41185 0.51171 0.50867 0.45654 0.36589 0.50277 0.50216 0.44367 0.3535 0.50096 0.50076 0.50000
  0.1     0.52786 0.52493     0.51933 0.51874     0.51759 0.51740 0.51667
  0.3     0.56017 0.55743     0.55245 0.55190     0.55085 0.55067 0.55000
  0.5     0.59249 0.58993     0.58556 0.58505     0.58412 0.58395 0.58333
  0.7     0.62481 0.62243     0.61868 0.61820     0.61738 0.61722 0.61667
  0.9     0.65714 0.65492     0.65178 0.65134     0.65063 0.65049 0.65000
  1.0     0.67331 0.67117     0.66834 0.66791     0.66726 0.66712 0.66667
3.0 0.0 0.62504 0.56569 0.51171 0.50867 0.59011 0.53184 0.50277 0.50216 0.58041 0.52272 0.50096 0.50076 0.50000
  0.1     0.56017 0.55743     0.55245 0.55190     0.55085 0.55067 0.55000
  0.3     0.65714 0.65492     0.65178 0.65134     0.65063 0.65049 0.65000
  0.5     0.75419 0.75240     0.75110 0.75074     0.75040 0.75028 0.75000
  0.7     0.85135 0.84991     0.85041 0.85012     0.85016 0.85006 0.85000
  0.9     0.94864 0.94748     0.94972 0.94948     0.94991 0.94984 0.95000
  1.0     0.99733 0.99630     0.99937 0.99916     0.99979 0.99972 1.00000
10.0 0.0 0.81302 0.80360 0.51171 0.50867 0.79560 0.78722 0.50277 0.50216 0.79076 0.78271 0.50096 0.50076 0.50000
  0.1     0.67331 0.67117     0.66834 0.66791     0.66726 0.66712 0.66667
  0.3     0.99733 0.99630     0.99937 0.99916     0.99979 0.99972 1.00000
  0.5     1.32283 1.32234     1.33046 1.33036     1.33229 1.33226 1.33333
  0.7     1.64977 1.64955     1.66169 1.66164     1.66481 1.66479 1.66667
  0.9     1.97796 1.97786     1.99310 1.99308     1.99735 1.99735 2.00000
  1.0     2.14245 2.14238     2.15887 2.15886     2.16364 2.16363 2.16667
20.0 0.0 0.89500 0.89454 0.51171 0.50867 0.88522 0.88488 0.50277 0.50216 0.88250 0.88220 0.50096 0.50076 0.50000
  0.1     0.83515 0.83365     0.83386 0.83356     0.83353 0.83343 0.83333
  0.3     1.48613 1.48580     1.49605 1.49599     1.49855 1.49852 1.50000
  0.5     2.14245 2.14238     2.15887 2.15886     2.16364 2.16363 2.16667
  0.7     2.80247 2.80245     2.82236 2.82235     2.82887 2.82887 2.83333
  0.9     3.46485 3.46485     3.48641 3.48641     3.49425 3.49425 3.50000
  1.0     3.79665 3.79665     3.81861 3.81861     3.82699 3.82699 3.83333
50.0 0.0 0.95520 0.95520 0.51171 0.50867 0.95103 0.95103 0.50277 0.50216 0.94987 0.94987 0.50096 0.50076 0.50000
  0.1     1.32283 1.32234     1.33046 1.33036     1.33229 1.33226 1.33333
  0.3     2.96788 2.96787     2.98832 2.98832     2.99520 2.99520 3.00000
  0.5     4.62736 4.62736     4.64955 4.64955     4.65895 4.65895 4.66667
  0.7     6.29182 6.29182     6.31282 6.31282     6.32336 6.32336 6.33333
  0.9     7.95821 7.95821     7.97731 7.97731     7.98826 7.98826 8.00000
  1.0     8.79174 8.79174     8.80987 8.80987     8.82086 8.82086 8.83333

As expected, for fixed \lambda, the relative MSEs of the RSR, RPSR, and RLS estimators increase as \omega increases. More importantly, the PSR estimator has a smaller MSE than the RPSR estimator except for a few cases (e.g. \lambda = 3, 10, 20, \omega = 0.1, and n = 20). Thus, the above result indicates that the PSR estimator can be preferable to the RPSR estimator even when there is no structural break (i.e. \omega = 0). Also, when there is an obvious structural break (i.e. \omega is large), the PSR estimator has a smaller MSE than the RPSR estimator, which is natural because the RPSR estimator incorporates the null hypothesis of no structural break.

We also see that the relative MSE of the PSR estimator decreases as the sample size n increases, especially when \lambda is small. The MSE of the PSR estimator is much smaller than that of the RPSR estimator when \lambda is small. Since \lambda = \gamma'\gamma/\sigma^2, this indicates that the PSR estimator has much better MSE performance than the RPSR estimator when \gamma is small. Further, the results for the RSR and RPSR estimators for n = 50 and n = 100 are almost identical.

Table 2 shows the relative MSE of the estimators for k = 4 and n = 20, 50, 100. As shown in the table, the relative MSEs of all the estimators decrease as the number of regression coefficients k increases, and Tables 1 and 2 show qualitatively similar patterns. As a whole, the PSR estimator has the smallest MSE over a wide region of \lambda and \omega. Thus, in conclusion, we recommend using the PSR estimator even when a structural break may not exist in the sample period.

Table 2.

Relative MSE of the estimators for k = 4 and n = 20, 50, 100.

    n = 20 n = 50 n = 100 n = 20, 50, 100
λ ω SR PSR RSR RPSR SR PSR RSR RPSR SR PSR RSR RPSR RLS
0.1 0.0 0.36510 0.26976 0.50329 0.50192 0.29295 0.20212 0.50044 0.50030 0.27504 0.18606 0.50010 0.50007 0.50000
  0.1     0.50453 0.50316     0.50169 0.50155     0.50135 0.50132 0.50125
  0.3     0.50700 0.50564     0.50418 0.50405     0.50385 0.50382 0.50375
  0.5     0.50948 0.50813     0.50668 0.50654     0.50635 0.50632 0.50625
  0.7     0.51195 0.51061     0.50918 0.50904     0.50885 0.50882 0.50875
  0.9     0.51443 0.51310     0.51168 0.51154     0.51135 0.51132 0.51125
  1.0     0.51567 0.51434     0.51292 0.51279     0.51260 0.51257 0.51250
1.0 0.0 0.43009 0.34601 0.50329 0.50192 0.36533 0.28550 0.50044 0.50030 0.34925 0.27114 0.50010 0.50007 0.50000
  0.1     0.51567 0.51434     0.51292 0.51279     0.51260 0.51257 0.51250
  0.3     0.54043 0.53917     0.53790 0.53777     0.53760 0.53757 0.53750
  0.5     0.56520 0.56401     0.56287 0.56275     0.56259 0.56256 0.56250
  0.7     0.58996 0.58884     0.58784 0.58773     0.58758 0.58756 0.58750
  0.9     0.61473 0.61367     0.61281 0.61270     0.61258 0.61255 0.61250
  1.0     0.62711 0.62609     0.62529 0.62519     0.62507 0.62505 0.62500
3.0 0.0 0.54072 0.48071 0.50329 0.50192 0.48853 0.43280 0.50044 0.50030 0.47557 0.42142 0.50010 0.50007 0.50000
  0.1     0.54043 0.53917     0.53790 0.53777     0.53760 0.53757 0.53750
  0.3     0.61473 0.61367     0.61281 0.61270     0.61258 0.61255 0.61250
  0.5     0.68904 0.68816     0.68772 0.68763     0.68755 0.68753 0.68750
  0.7     0.76339 0.76265     0.76263 0.76255     0.76253 0.76252 0.76250
  0.9     0.83777 0.83716     0.83753 0.83747     0.83751 0.83749 0.83750
  1.0     0.87497 0.87442     0.87498 0.87492     0.87500 0.87498 0.87500
10.0 0.0 0.73792 0.72466 0.50329 0.50192 0.70814 0.69754 0.50044 0.50030 0.70075 0.69091 0.50010 0.50007 0.50000
  0.1     0.62711 0.62609     0.62529 0.62519     0.62507 0.62505 0.62500
  0.3     0.87497 0.87442     0.87498 0.87492     0.87500 0.87498 0.87500
  0.5     1.12322 1.12293     1.12467 1.12464     1.12492 1.12491 1.12500
  0.7     1.37187 1.37172     1.37438 1.37437     1.37483 1.37483 1.37500
  0.9     1.62085 1.62078     1.62412 1.62411     1.62476 1.62475 1.62500
  1.0     1.74544 1.74540     1.74900 1.74900     1.74972 1.74972 1.75000
20.0 0.0 0.84186 0.84081 0.50329 0.50192 0.82389 0.82327 0.50044 0.50030 0.81943 0.81891 0.50010 0.50007 0.50000
  0.1     0.75100 0.75024     0.75014 0.75006     0.75004 0.75002 0.75000
  0.3     1.24750 1.24729     1.24952 1.24950     1.24987 1.24987 1.25000
  0.5     1.74544 1.74540     1.74900 1.74900     1.74972 1.74972 1.75000
  0.7     2.24435 2.24433     2.24860 2.24859     2.24958 2.24958 2.25000
  0.9     2.74381 2.74381     2.74828 2.74828     2.74946 2.74946 2.75000
  1.0     2.99367 2.99367     2.99816 2.99816     2.99940 2.99940 3.00000
50.0 0.0 0.92878 0.92878 0.50329 0.50192 0.92069 0.92069 0.50044 0.50030 0.91868 0.91868 0.50010 0.50007 0.50000
  0.1     1.12322 1.12293     1.12467 1.12464     1.12492 1.12491 1.12500
  0.3     2.36917 2.36916     2.37351 2.37351     2.37455 2.37455 2.37500
  0.5     3.61856 3.61856     3.62291 3.62291     3.62428 3.62428 3.62500
  0.7     4.86883 4.86883     4.87262 4.87262     4.87410 4.87410 4.87500
  0.9     6.11929 6.11929     6.12250 6.12250     6.12397 6.12397 6.12500
  1.0     6.74453 6.74453     6.74748 6.74748     6.74892 6.74892 6.75000

5. Empirical examples

In this section, we evaluate the OLS, RLS, SR, RSR, PSR and RPSR estimators using real-world data. To compare the performance of the different estimators, we employ the residual standard error (RSE):

RSE = \sqrt{\sum_{i=1}^n (\hat{y}_i - y_i)^2/(n-k)} = \sqrt{\sum_{i=1}^n e_i^2/(n-k)}, \tag{26}

where e_i is the ith element of the residual vector of the estimator under consideration.

Data with structural break

The data used in the first example, taken from the National Bureau of Statistics of China, are annual data on Income, Household consumption, and the Consumer Price Index (CPI, 1978 = 100) from 1978 to 2005 in China. The regression model considered here is as follows:

Household consumption = β0 + β1 Income + β2 CPI + ε. (27)

Before discussing the empirical results, we use the test suggested by Chow [10] to examine whether a structural break exists. The value of the Chow test statistic is 13.73, and the null hypothesis (that the two regressions are equal) is rejected at the 1% level, indicating a structural break in the year 1992. The results of the regression analysis are shown in Table 3.
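The Chow test used here compares the pooled and split residual sums of squares. A generic sketch on simulated data with an artificial break (the Chinese consumption data themselves are not reproduced here, and the simulated design is our own):

```python
import numpy as np

def chow_test(X1, y1, X2, y2):
    """Chow (1960) F statistic for equality of coefficients across two regimes."""
    def rss(X, y):
        b, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ b
        return r @ r
    k = X1.shape[1]
    n = len(y1) + len(y2)
    rss_pooled = rss(np.vstack([X1, X2]), np.concatenate([y1, y2]))
    rss_split = rss(X1, y1) + rss(X2, y2)
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))

rng = np.random.default_rng(3)
k, n1, n2 = 3, 30, 30
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, k - 1))])
X2 = np.column_stack([np.ones(n2), rng.normal(size=(n2, k - 1))])
y1 = X1 @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=n1)
y2 = X2 @ np.array([5.0, 2.0, 3.0]) + 0.1 * rng.normal(size=n2)  # intercept shifts
F = chow_test(X1, y1, X2, y2)   # a large F rejects the no-break null
```

Under the null of no break, F follows an F(k, n - 2k) distribution, which is how the 13.73 above is converted into a p-value.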

Table 3.

Estimators for the regression model, Equation (27).

Variables OLS RLS SR RSR PSR RPSR
Period 1 (1978–1991)
Intercept −131.800 342.479 −131.620 −71.173 −131.620 −71.173
Income 0.493 0.409 0.493 0.483 0.493 0.483
CPI 1.433 −1.785 1.431 1.022 1.431 1.022
Residual standard error: 11.2 88.5 11.2 15.9 11.2 15.9
Period 2 (1992–2005)
Intercept 2202.803 342.479 2199.791 1964.996 2199.791 1964.996
Income 0.351 0.409 0.350 0.358 0.350 0.358
CPI −15.294 −1.785 −15.273 −13.567 −15.273 −13.567
Residual standard error: 149.2 237.6 149.3 151.0 149.2 151.0
Chow test: F = 13.73, p-value = 2.9×10^{-5}

Data without structural break

The data used in the second example are the well-known built-in data set in R called 'longley'. This macroeconomic data set contains yearly observations from 1947 to 1962 (n = 16) on the GNP implicit price deflator (1954 = 100), the population aged 14 and over, and the number of people employed. The regression model considered here is as follows:

Employed = β0 + β1 GNP deflator + β2 Population + ε. (28)

Similar to the first example, we execute the Chow test to examine the existence of a structural break. The value of the Chow test statistic is 1.70, and the null hypothesis is not rejected at the 10% level; there is thus no evidence of a structural break in the year 1955. The results for this example are shown in Table 4.

Table 4.

Estimators for the regression model, Equation (28).

Variables OLS RLS SR RSR PSR RPSR
Period 1 (1947–1954)
Intercept 36.959 26.851 36.955 31.976 36.955 31.976
GNP 0.272 0.241 0.272 0.257 0.272 0.257
Population 0.001 0.119 0.001 0.059 0.001 0.059
Residual standard error: 0.715 0.780 0.715 0.733 0.715 0.733
Period 2 (1955–1962)
Intercept 31.552 26.851 31.549 29.235 31.549 29.235
GNP 0.003 0.241 0.003 0.121 0.003 0.121
Population 0.296 0.119 0.296 0.209 0.296 0.209
Residual standard error: 0.915 1.190 0.915 0.989 0.915 0.989
Chow test: F = 1.70, p-value = 0.23

As these examples clearly show, the estimated values of the OLS and SR estimators are very close. Since the indicator-function part in Equation (9) is larger than zero in these cases, the estimated values of the RSR and RPSR estimators coincide. Also, the RSEs of the OLS and SR estimators are essentially the same, and both are smaller than those of the other estimators. Noting that the OLS estimator has the smallest residual sum of squares by construction, the performance of the SR estimator is the best among the shrinkage estimators considered. Further, since the RLS estimator has the largest RSE, its performance is the worst among these estimators.

6. Concluding remarks

In this paper, assuming a linear regression model with a possible structural break at a known point, we analytically derived the exact formulae of the MSE for the restricted SR and PSR estimators and compared the MSE performance of the restricted/unrestricted SR, PSR, and least squares estimators. We analytically showed that the SR estimator can have a smaller MSE than the RSR estimator even when the restriction of no structural change is true. Also, our numerical results show that the unrestricted PSR estimator is superior to the other estimators over a wide region of the parameter space. In realistic situations, we are not sure whether a structural break actually occurred. Our results show that the unrestricted PSR estimator can have better MSE performance whether or not a structural break exists. This indicates that we can use the unrestricted PSR estimator even when we are not sure whether a structural break occurred in the sample period.

In this paper, we considered a model with at most one structural break. However, in many situations, multiple structural breaks may exist. The SR and PSR estimators can easily be applied even when there are multiple breaks, and the PSR estimator is expected to perform well in such models. However, investigating the performance of the estimators under possible multiple breaks is beyond the scope of this paper and remains a topic for future research.

Supplementary Material

Supplemental Material

Acknowledgments

The authors thank the editor and two anonymous referees for their valuable comments.

Funding Statement

This work was supported by the Humanities and Social Science Fund of the Ministry of Education of China [Grant Number 23YJC790162] and JSPS KAKENHI [Grant Numbers 18K01546, 23K01336].

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Adkins L.C. and Hill R.C., The RLS positive-part Stein estimator, Am. J. Agric. Econ. 72 (1990), pp. 727–730.
  • 2. Alsarraf I. and Algamal Z.Y., Restricted ridge estimator in the inverse Gaussian regression model, Electron. J. Appl. Stat. Anal. 15 (2022), pp. 574–587.
  • 3. Andrews D.W., Tests for parameter instability and structural change with unknown change point, Econometrica 61 (1993), pp. 821–856.
  • 4. Andrews D.W., Tests for parameter instability and structural change with unknown change point: A corrigendum, Econometrica 71 (2003), pp. 395–397.
  • 5. Andrews D.W., Lee I., and Ploberger W., Optimal changepoint tests for normal linear regression, J. Econom. 70 (1996), pp. 9–38.
  • 6. Bai J., Estimation of a change point in multiple regression models, Rev. Econ. Stat. 79 (1997), pp. 551–563. DOI: 10.1162/003465397557132.
  • 7. Bai J. and Perron P., Computation and analysis of multiple structural change models, J. Appl. Econom. 18 (2003), pp. 1–22.
  • 8. Baranchik A.J., A family of minimax estimators of the mean of a multivariate normal distribution, Ann. Math. Stat. 41 (1970), pp. 642–645.
  • 9. Cellier D. and Fourdrinier D., Shrinkage estimators under spherical symmetry for the general linear model, J. Multivar. Anal. 52 (1995), pp. 338–351.
  • 10. Chow G.C., Tests of equality between sets of coefficients in two linear regressions, Econometrica 28 (1960), pp. 591–605.
  • 11. Elliott G. and Müller U.K., Confidence sets for the date of a single break in linear time series regressions, J. Econom. 141 (2007), pp. 1196–1218.
  • 12. Hansen B.E., Testing for structural change in conditional models, J. Econom. 97 (2000), pp. 93–115.
  • 13. Hansen B.E., Averaging estimators for regressions with a possible structural break, Econ. Theory 25 (2009), pp. 1498–1514.
  • 14. Hill R.C., Ziemer R.F., and White F.C., Mitigating the effects of multicollinearity using exact and stochastic restrictions: the case of an aggregate agricultural production function in Thailand: comment, Am. J. Agric. Econ. 63 (1981), pp. 298–300.
  • 15. James W. and Stein C., Estimation with quadratic loss, in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, 1961, pp. 361–379.
  • 16. Judge G.G. and Bock M.E., The Statistical Implications of Pre-test and Stein-rule Estimators in Econometrics, Vol. 25, North-Holland, Amsterdam, 1978.
  • 17. Mittelhammer R.C. and Young D.L., Mitigating the effects of multicollinearity using exact and stochastic restrictions: the case of an aggregate agricultural production function in Thailand: reply, Am. J. Agric. Econ. 63 (1981), pp. 301–304.
  • 18. Namba A. and Ohtani K., PMSE performance of the Stein-rule and positive-part Stein-rule estimators in a regression model with or without proxy variables, Stat. Probab. Lett. 76 (2006), pp. 898–906.
  • 19. Namba A. and Xu H., A sufficient condition for the MSE dominance of the positive-part shrinkage estimator when each individual regression coefficient is estimated in a misspecified linear regression model, J. Stat. Comput. Simul. 88 (2018), pp. 2034–2047.
  • 20. Namba A. and Xu H., PMSE dominance of the positive-part shrinkage estimator in a regression model with proxy variables, J. Stat. Comput. Simul. 88 (2018), pp. 2893–2908.
  • 21. Ohtani K., An MSE comparison of the restricted Stein-rule and minimum mean squared error estimators in regression, Test 7 (1998), pp. 361–376.
  • 22. Perron P. and Yamamoto Y., A note on estimating and testing for multiple structural changes in models with endogenous regressors via 2SLS, Econ. Theory 30 (2014), pp. 491–507.
  • 23. Perron P., Dealing with structural breaks, Palgrave Handbook of Econometrics 1 (2006), pp. 278–352.
  • 24. Stein C., Inadmissibility of the usual estimator for the mean of a multivariate normal distribution, in Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, Berkeley, 1956, pp. 197–206.
