Journal of Advanced Research. 2015 Mar 19;7(1):59–68. doi: 10.1016/j.jare.2015.02.001

On non-negative estimation of variance components in mixed linear models

Heba A El Leithy ¹, Zakaria A Abdel Wahed ¹, Mohamed S Abdallah ¹
PMCID: PMC4703422  PMID: 26843971

Abstract

Alternative estimators of the variance components are derived from Iterative Almost Unbiased Estimation (IAUE), yielding two modified IAUEs. The relative performance of the proposed estimators and of other estimators is studied by simulating their bias, Mean Square Error (MSE) and probability of producing negative estimates under an unbalanced nested-factorial model with two fixed crossed factors and one nested random factor. Finally, the Empirical Quantile Dispersion Graph (EQDG), which provides a comprehensive picture of the quality of estimation, is depicted for all the methods studied.

Keywords: AUE, MINQUE, Negative estimates, Quantile dispersion graphs, Restricted maximum likelihood, Variance components

Introduction

Quite often, experimental research requires the empirical identification of the relationship between an observable response variable and a set of associated variables, or factors, believed to affect the response. In general, such a relationship, if it exists, is unknown but is usually assumed to be linear, so that the unknown parameters appear linearly in the model; the model is then called a classical linear model. It is reasonable to add random effects to the classical linear model, which includes fixed effects only. Searle et al. [1] provided a decision tree to help decide whether a factor should be treated as fixed or random. The rule is that if the levels of the factor can reasonably be assumed to come from a probability distribution, the factor is treated as random; otherwise it is treated as fixed. If the model contains both fixed and random effects, the classical linear model extends to the mixed linear model, which is commonly used.

Variance components estimation is widely applied. It has two major uses as well as many minor ones. The more familiar major use is determining which factors have a significant effect on the response being studied. The second is measuring the relative effect of the factors on the dependent variable. Over the years, a plethora of variance components estimation methods has been developed. The ANOVA method, Minimum Norm Quadratic Unbiased Estimation (MINQUE), IAUE, Maximum Likelihood (ML) and Restricted Maximum Likelihood (REML) are among the most important methods available in the literature. A comprehensive review of other methods can be found in Sahai [2], Khuri and Sahai [3] and Khuri [4]. Consequently, a number of attempts have been made to study the relative performance and properties of the various estimators in order to determine the best estimator under criteria such as bias, MSE and computational complexity.

Since most variance components estimators cannot be written explicitly in many situations, analytical comparison is intractable. Accordingly, many scholars have adopted numerical comparison. For instance, Sahai [5] compared ANOVA, ML and REML for the three-stage nested model when all the factors are random; Swallow and Monahan [6] compared ANOVA, ML, REML and MINQUE under the one-way model; and Rao and Heckler [7] provided some modifications of the ANOVA method and presented numerical comparisons among various variance component methods for the unbalanced threefold nested random model. Lee and Khuri [8] used the EQDG to compare the ANOVA and ML estimation methods under the two-way random model without interaction. Jung et al. [9] compared ANOVA and ML under the threefold nested random model based on the EQDG. Subramani [10] introduced a new procedure for estimating the variance components in light of the MINQUE approach and demonstrated numerically that his proposed estimator has a smaller MSE than both ANOVA and MINQUE under the unbalanced one-way random model. Chen and Wei [11] derived parametric empirical Bayes estimators and compared them with the ANOVA method under the one-way random model.

A typical challenge with variance component methods is that not all of them produce non-negative estimates, which is unacceptable in practice. Negative estimates of the variance components may arise for a variety of reasons, such as an unsuitable choice of initial variance components, violation of the linearity condition, outliers in the data, and true values of the variance components that are close to zero. However, Thompson [12] recognized that the nature of the estimator, or of the algorithm used, can be the major source of negativity. Further, Lamotte [13] proved analytically that the only linear combination of variance components that admits an unbiased and non-negative estimator is the single error component in the variance components model. Although a number of authors replace negative variance component estimates with zero, many efforts have been made to design non-negative estimators of variance components.

Horn et al. [14] proposed IAUE, which avoids the negativity of MINQUE. Jennrich and Sampson [15] suggested either replacing negative estimates of the variance components with zero, as done in some packages, or forcing the algorithm to account for non-negativity by adding non-negativity constraints. Kelly and Mathew [16] proposed a non-negative quadratic estimator that offers substantial MSE improvement. Khattree [17] suggested a simple modification ensuring the non-negativity of Henderson’s ANOVA method. Searle et al. [1] explained the EM algorithm for treating the negativity problem associated with both the ML and REML methods. Teunissen and Amiri [18] discussed how to modify the least-squares method to ensure that the estimated variances are non-negative. Moghteased-Azar et al. [19] suggested a new idea for dealing with the negativity associated with the REML method. The major motivation behind this article is to provide new estimators of the variance components by applying simple modifications to IAUE; the resulting estimators are called Modified IAUE (MIAUE).

The rest of the paper is organized as follows. The second section concerns the REML method introduced by Thompson [12] and the Modified REML (MREML) based on the EM algorithm as explained by Searle et al. [1]. The third section reviews the MINQUE method proposed by Rao [20] and the Modified MINQUE (MMINQUE) derived by Subramani [10]. The fourth section presents the IAUE method suggested by Horn et al. [14]. The fifth section introduces the proposed Modified IAUE (MIAUE) estimators. The sixth section summarizes the steps of the EQDG approach employed in this study. The seventh section reports the Monte Carlo results for the unbalanced nested-factorial model. Finally, some conclusions are given in the last section.

REML and MREML method

Consider the variance components model stated by Subramani [10]

$$Y = X\beta + Z_1\delta_1 + Z_2\delta_2 + \cdots + Z_{r-1}\delta_{r-1} + Z_r\delta_r \tag{2.1}$$

where $Y$ is an $n \times 1$ vector of observations, $X$ is an $n \times m$ matrix of known constants, $\beta$ is an $m \times 1$ vector of fixed (unknown) parameters, $Z_i$ is an $n \times c_i$ matrix of known constants and $\delta_i$ is a $c_i \times 1$ random vector having a multivariate normal distribution with zero mean and covariance matrix $\sigma_i^2 I_{c_i}$. Further, it is assumed that $\delta_i$ and $\delta_j$, $i \neq j$, are uncorrelated. Model (2.1) can be expressed in compact form as:

$$Y = X\beta + Z\delta \tag{2.2}$$

where $Z = [Z_1, Z_2, \ldots, Z_r]$ and $\delta = (\delta_1', \delta_2', \ldots, \delta_r')'$. Model (2.2) is called a mixed linear model. If $r = 1$ it becomes a fixed model, and if $m = 1$ it becomes a random model. Thus, in general, $E(Y) = X\beta$ and $D(Y) = \sum_{i=1}^{r} \sigma_i^2 V_i$, where $V_i = Z_i Z_i'$, $D$ is called the dispersion matrix, and the parameters $\sigma_1^2, \sigma_2^2, \ldots, \sigma_r^2$ are the unknown variance components whose values are to be estimated.

Since normality is assumed, distribution-based methods may be applied. The preferred parametric method for estimating variance components is REML, whose original reference is the article by Thompson [12]. One interesting feature of REML is that it accounts for the degrees of freedom implicitly used in estimating the fixed effects, because it maximizes the likelihood of linear combinations of the observations rather than of the observations themselves. Moreover, REML estimators are invariant to the fixed effects.

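In theory, REML can be described as follows. Let $K$ be an $(n - x) \times n$ matrix of full rank, where $x$ is the rank of $X$, such that $KX = 0$. The likelihood of $KY$ can then be written as:

$$L(\sigma \mid Y) \propto |KDK'|^{-.5} \exp\bigl(-.5\, Y'K'(KDK')^{-1}KY\bigr)$$

and the log-likelihood of $KY$ becomes, up to an additive constant:

$$\ln L(\sigma \mid Y) = -.5 \ln|KDK'| - .5\, Y'K'(KDK')^{-1}KY$$

To obtain the REML estimates, the partial derivatives of $\ln L(\sigma \mid Y)$ with respect to each $\sigma_i^2$ are set to zero:

$$\frac{\partial \ln L(\sigma \mid Y)}{\partial \sigma_i^2} = -.5\bigl(\operatorname{tr}(K'(KDK')^{-1}KV_i) - Y'K'(KDK')^{-1}KV_iK'(KDK')^{-1}KY\bigr) = 0$$

Using the lemma given in Searle et al. [1], which states that

$$K'(KDK')^{-1}K = P, \quad \text{where } P = D^{-1} - D^{-1}X(X'D^{-1}X)^{-1}X'D^{-1},$$

we get

$$\frac{\partial \ln L(\sigma \mid Y)}{\partial \sigma_i^2} = 0 \iff \operatorname{tr}(PV_i) - Y'PV_iPY = 0, \quad i = 1, \ldots, r. \tag{2.3}$$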

These are $r$ equations in the $r$ unknown components of $\sigma$. In some cases the equations can be simplified to yield a closed form, yet in almost all cases numerical algorithms must be used to solve them. In this study, the algorithm proposed in [19] is employed. In addition, it should be noted that the system of equations in (2.3) does not involve the elements of $K$, so the same result is reached whatever their values (see Searle et al. [1]). The main drawback of the REML technique is that the solution of (2.3) can be negative, which is not admissible in real-life problems. This dilemma can be resolved by applying the Expectation Maximization (EM) algorithm, which is explained thoroughly in Searle et al. [1].
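As a numerical illustration of the lemma and of the score equations (2.3), the following minimal Python sketch builds $P$ from $D$ and $X$, constructs one admissible $K$ from the singular value decomposition of $X$, and evaluates the left-hand side of (2.3). The function names, the toy design and the use of NumPy are assumptions made for illustration only; they are not part of the original study, which solves (2.3) with the algorithm of [19].

```python
import numpy as np

def projector_P(D, X):
    """P = D^-1 - D^-1 X (X' D^-1 X)^-1 X' D^-1, assuming X has full column rank."""
    Dinv = np.linalg.inv(D)
    DinvX = Dinv @ X
    return Dinv - DinvX @ np.linalg.solve(X.T @ DinvX, DinvX.T)

def error_contrast_matrix(X, tol=1e-10):
    """One choice of K with full row rank n - rank(X) and K X = 0 (from the SVD of X)."""
    U, s, _ = np.linalg.svd(X, full_matrices=True)
    rank = int(np.sum(s > tol))
    return U[:, rank:].T

def reml_score(y, X, V_list, sigma2):
    """Left-hand side of (2.3): tr(P V_i) - y' P V_i P y for each component i."""
    D = sum(s * V for s, V in zip(sigma2, V_list))
    P = projector_P(D, X)
    Py = P @ y
    return np.array([np.trace(P @ V) - Py @ V @ Py for V in V_list])

# Toy example (hypothetical data): 4 groups of 2 observations plus an error term.
rng = np.random.default_rng(0)
n = 8
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Z1 = np.kron(np.eye(4), np.ones((2, 1)))
V_list = [Z1 @ Z1.T, np.eye(n)]
sigma2 = [1.5, 0.7]
D = sum(s * V for s, V in zip(sigma2, V_list))

K = error_contrast_matrix(X)
P = projector_P(D, X)
lemma_lhs = K.T @ np.linalg.solve(K @ D @ K.T, K)   # K'(K D K')^-1 K
y = rng.normal(size=n)
score = reml_score(y, X, V_list, sigma2)             # zero only at a REML solution
```

Comparing `lemma_lhs` with `P` (for example with `np.allclose`) checks the lemma for this particular $K$; any other full-rank $K$ with $KX = 0$ should give the same matrix.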

The EM algorithm, first introduced by Dempster et al. [21] for obtaining ML estimators from incomplete data, is one of the best-known techniques in applied statistics. It consists of an expectation stage followed by a maximization stage. Fortunately, the EM algorithm can be applied to estimate the variance components in mixed linear models. Its stages can be expressed as follows:

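  1. Obtain a starting value $\sigma_i^{2(f)}$.

  2. E-step: calculate $E(\delta_i'\delta_i \mid Y)$ at $\sigma_i^2 = \sigma_i^{2(f)}$.

     Since $[KY, \delta_i]$ are jointly normally distributed, $f(\delta_i \mid KY)$ is a normal distribution with mean $\sigma_i^2 Z_i'PY$ and variance $\sigma_i^2 I_{c_i} - \sigma_i^4 Z_i'PZ_i$; hence

     $$E(\delta_i'\delta_i \mid Y) = \sigma_i^4\, Y'PV_iPY + \operatorname{tr}\bigl(\sigma_i^2 I_{c_i} - \sigma_i^4 Z_i'PZ_i\bigr)$$

  3. M-step: determine $\hat{\sigma}_i^2$ by maximizing the complete-data likelihood, where the complete data comprise the observed data and the random effects $\delta$:

     $$\hat{\sigma}_i^{2(f+1)} = \frac{E(\delta_i'\delta_i \mid Y)}{c_i}$$

  4. While $\dfrac{|\hat{\sigma}_i^{2(f+1)} - \hat{\sigma}_i^{2(f)}|}{\hat{\sigma}_i^{2(f)}} > .01$, increase $f$ by one and return to step 2; otherwise terminate the calculations and set $\hat{\sigma}_i^2 = \hat{\sigma}_i^{2(f+1)}$.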
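The stages above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `mreml_em`, the iteration cap `max_iter`, the use of a pseudo-inverse for $X'D^{-1}X$ and the toy one-way data are all hypothetical, and the last element of `Z_list` is assumed to be the identity matrix carrying the error term so that $D$ stays invertible.

```python
import numpy as np

def mreml_em(y, X, Z_list, sigma2_start, tol=0.01, max_iter=10_000):
    """EM iterations for the variance components; starting values must be positive."""
    sigma2 = np.asarray(sigma2_start, dtype=float).copy()
    V_list = [Z @ Z.T for Z in Z_list]
    for _ in range(max_iter):
        D = sum(s * V for s, V in zip(sigma2, V_list))
        Dinv = np.linalg.inv(D)
        DinvX = Dinv @ X
        P = Dinv - DinvX @ np.linalg.pinv(X.T @ DinvX) @ DinvX.T
        Py = P @ y
        new = np.empty_like(sigma2)
        for i, (Z, s) in enumerate(zip(Z_list, sigma2)):
            c = Z.shape[1]
            # E-step: sigma_i^4 y'P V_i P y + tr(sigma_i^2 I - sigma_i^4 Z_i' P Z_i)
            quad = s**2 * (Py @ Z @ Z.T @ Py)
            trace = s * c - s**2 * np.trace(Z.T @ P @ Z)
            # M-step: divide the conditional expectation by c_i
            new[i] = (quad + trace) / c
        # Stop once every relative change is at most tol (the .01 rule of step 4)
        if np.all(np.abs(new - sigma2) <= tol * sigma2):
            return new
        sigma2 = new
    return sigma2

# Toy one-way random model (hypothetical data): 4 groups of 3 observations.
rng = np.random.default_rng(1)
Z1 = np.kron(np.eye(4), np.ones((3, 1)))
n = Z1.shape[0]
X = np.ones((n, 1))
y = X[:, 0] + Z1 @ rng.normal(size=4) + rng.normal(size=n)
est = mreml_em(y, X, [Z1, np.eye(n)], sigma2_start=[1.0, 1.0])
```

Each update is a sum of a non-negative quadratic form and the trace of a conditional covariance matrix, which is why the iterates stay positive when the starting values are positive.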

The variance components estimates computed using the EM algorithm are denoted hereafter as MREML. Harville [22] stated that the EM algorithm always yields positive estimates as long as the prior values, or initial points, are positive; thus any non-negative variance components estimates may serve as starting values for the EM algorithm. Although the EM algorithm can be rather slow to converge and may require many iterations, it is not sensitive to the initial values (see [23], [24]).

MINQUE and MMINQUE method

Rao [20] proposed estimating the unknown variance components by taking $Y'AY$ as an estimator of the linear combination $\rho'\sigma$ of the variance components, where $\rho$ is a known vector and $\sigma = (\sigma_1^2, \sigma_2^2, \ldots, \sigma_r^2)'$, and then selecting a symmetric matrix $A$ that satisfies the following criteria:

  • Invariance under translation of the β parameter

    The first criterion is somewhat intuitive: $A$ should not be sensitive to a location shift in the fixed parameters. In other words, $A$ should satisfy
    $$Y'AY = (Y - X\beta_0)'A(Y - X\beta_0)$$
    where $\beta_0$ is a constant vector, which means
    $$AX = 0$$
  • Unbiasedness

    The second criterion that $A$ should satisfy is
    $$E(Y'AY) = \rho'\sigma$$
    but
    $$E(Y'AY) = E\bigl[(X\beta + Z\delta)'A(X\beta + Z\delta)\bigr] = E\bigl[\beta'X'AX\beta + 2\beta'X'AZ\delta + \delta'Z'AZ\delta\bigr]$$
    Under the invariance condition, we get
    $$E(Y'AY) = E(\delta'Z'AZ\delta) = \sum_{i=1}^{r} E(\delta_i'Z_i'AZ_i\delta_i) = \sum_{i=1}^{r} \sigma_i^2 \operatorname{tr}(AV_i)$$
    hence
    $$\sum_{i=1}^{r} \sigma_i^2 \operatorname{tr}(AV_i) = \sum_{i=1}^{r} \rho_i\sigma_i^2$$
    which means
    $$\operatorname{tr}(AV_i) = \rho_i$$
  • Minimum Norm

    The third criterion is that $A$ minimizes the Euclidean norm of the difference between $Y'AY$ and the natural unbiased estimator of $\rho'\sigma$, which can be formulated as:
    $$\min \Bigl\| \delta'Z'AZ\delta - \sum_{i=1}^{r} \frac{\rho_i}{c_i}\delta_i'\delta_i \Bigr\| = \min \bigl\| \delta'(Z'AZ - \Delta)\delta \bigr\| \;\Rightarrow\; \min \| Z'AZ - \Delta \|$$
    where $\|\cdot\|$ denotes the Euclidean norm of the matrix and $\Delta = \operatorname{diag}\bigl(\tfrac{\rho_1}{c_1}I_{c_1}, \tfrac{\rho_2}{c_2}I_{c_2}, \ldots, \tfrac{\rho_r}{c_r}I_{c_r}\bigr)$. Thus $Y'AY$ is the MINQUE of $\rho'\sigma$ if the symmetric matrix $A$ is selected such that $\|Z'AZ - \Delta\|$ is as small as possible subject to:
    $$AX = 0 \quad \text{and} \quad \operatorname{tr}(AV_i) = \rho_i$$
    To make the optimization easier, the squared Euclidean norm, that is, the sum of squares of all elements of the matrix, is used. Then
    $$\|Z'AZ - \Delta\|^2 = \operatorname{tr}\bigl[(Z'AZ - \Delta)(Z'AZ - \Delta)\bigr] = \operatorname{tr}(AVAV) + \Delta^{*}$$
    where $V = \sum_{i=1}^{r} V_i = ZZ'$ and $\Delta^{*}$ denotes a constant that does not involve $A$.
    Let $A$ be a symmetric matrix and $V$ a symmetric and invertible matrix. Then, according to Rao [20], the minimum of $\operatorname{tr}(AVAV)$ subject to the invariance and unbiasedness criteria is attained at:
    $$A = \sum_{i=1}^{r} a_i RV_iR$$
    where $a = S^{-1}\rho$, $S_{ij} = \operatorname{tr}(Q'V^{-1}V_iV^{-1}QV_j)$ for $i, j = 1, \ldots, r$, $Q = I_n - X(X'V^{-1}X)^{-1}X'V^{-1}$ and $R = Q'V^{-1}$.
    Consequently, the MINQUE of $\rho'\sigma$ is
    $$Y'AY = \sum_{i=1}^{r} a_i Y'RV_iRY = \sum_{i=1}^{r} a_i b_i = a'b = \rho'S^{-1}b$$
    where $b_i = Y'RV_iRY$. Equating $Y'AY$ with $\rho'\hat{\sigma}$ gives:
    $$\hat{\sigma}_{MINQUE} = S^{-1}b$$

    If the matrix $S$ is singular, a generalized inverse of $S$ can be used instead.

    On the other hand, Subramani [10] proposed a new idea for estimating the variance components in light of the approach of Rao [20]. Instead of dealing with one linear combination, he estimated a set of linear combinations $\rho_i'\sigma$ of the variance components through a set of quadratic functions $Y'A_iY$. In other words, the variance component estimates are obtained by solving the following normal equations:
    $$\begin{pmatrix} \hat{\sigma}_1^2 \\ \vdots \\ \hat{\sigma}_r^2 \end{pmatrix} = \begin{pmatrix} \rho_{11} & \cdots & \rho_{1r} \\ \vdots & & \vdots \\ \rho_{r1} & \cdots & \rho_{rr} \end{pmatrix}^{-1} \begin{pmatrix} Y'A_1Y \\ \vdots \\ Y'A_rY \end{pmatrix} \tag{3.1}$$
    Likewise, each symmetric matrix $A_i$ should be derived from certain criteria:
  • Invariance under translation of the β parameter

    It can easily be shown that the invariance condition is satisfied if:
    $$A_iX = 0$$
  • Unbiasedness

    To ensure unbiasedness, $A_i$ should satisfy:
    $$E(Y'A_iY) = \rho_i'\sigma = \sum_{j=1}^{r} \rho_{ij}\sigma_j^2$$
    Under the invariance condition, we get:
    $$\rho_{ij} = \operatorname{tr}(A_iV_j) \tag{3.2}$$
  • Minimum Norm

    As pointed out above, the aim is to minimize the squared Euclidean norm of the difference between $Y'A_iY$ and the natural estimator of $\rho_i'\sigma$. Following Subramani [10], the theorem below, with our proof, gives the strategy for selecting the $A_i$ that minimizes $\operatorname{tr}(A_iVA_iV)$.

    Theorem Let $V_{n \times n}$ be a given symmetric matrix and $A_{n \times n}$ an unknown symmetric matrix such that $\operatorname{tr}(AV) = \operatorname{rank}(AV) = p < n$. Then $\|AV\|^2$ attains its minimum at $A^{*}V$, where $A^{*}V$ is any symmetric idempotent matrix.

Proof

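Since $\operatorname{rank}(AV) = p < n$, the matrix $AV$ has $p$ non-zero characteristic roots $\lambda_1, \ldots, \lambda_p$ such that:

$$\operatorname{tr}(AV) = \sum_{t=1}^{p} \lambda_t = p$$

In addition,

$$\|AV\|^2 = \operatorname{tr}(AVAV) = \sum_{t=1}^{p} \lambda_t^2$$

Minimizing $\|AV\|^2$ is therefore equivalent to minimizing $\sum_{t=1}^{p} \lambda_t^2$, and the optimization problem may be reformulated as:

$$\min \sum_{t=1}^{p} \lambda_t^2 \quad \text{subject to} \quad \sum_{t=1}^{p} \lambda_t = p$$

Using the Lagrange multiplier technique, the Lagrangian can be defined as:

$$\Lambda(\lambda_1, \ldots, \lambda_p, \lambda) = \sum_{t=1}^{p} \lambda_t^2 - \lambda\Bigl(\sum_{t=1}^{p} \lambda_t - p\Bigr)$$

where $\lambda$ denotes the Lagrange multiplier. Setting the partial derivatives to zero gives:

$$\frac{\partial \Lambda(\lambda_1, \ldots, \lambda_p, \lambda)}{\partial \lambda_t} = 2\lambda_t - \lambda = 0, \quad t = 1, \ldots, p$$

and

$$\frac{\partial \Lambda(\lambda_1, \ldots, \lambda_p, \lambda)}{\partial \lambda} = -\Bigl(\sum_{t=1}^{p} \lambda_t - p\Bigr) = 0.$$

Since $\lambda_t = \lambda/2$ for $t = 1, \ldots, p$, we have $\sum_{t=1}^{p} \lambda/2 - p = 0$, which yields $\lambda = 2$; hence $\lambda_t = 1$ for $t = 1, \ldots, p$.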

Consequently, Subramani [10] deduced that the minimum of $\|AV\|^2$ is reached if $A$ is replaced with $A^{*}$ such that the characteristic roots of $A^{*}V$ are only zeros and ones, which corresponds to idempotency of the matrix. Thus the steps of MMINQUE can be summarized as: (1) select $A_i$ such that $A_iV$ is an idempotent matrix and $A_iX = 0$; (2) substitute (3.2) into (3.1) and solve the normal equations. The remaining point is the structure of $A_i$. Since the solution in the theorem is not unique, Subramani [10] introduced two different formulas for $A_i$, either of which can be used to obtain MMINQUE.

The first version of $A_i$ is:

$$A_{i1} = V^{-1}\bigl(I_n - U_i(U_i'V^{-1}U_i)^{-}U_i'V^{-1}\bigr), \quad i = 1, \ldots, r$$

where $U_1 = X$, $U_2 = [X\ Z_1]$, $U_3 = [X\ Z_2]$, …, $U_r = [X\ Z_{r-1}]$. The second version of $A_i$ is:

$$A_{i2} = G_i(G_i'VG_i)^{-}G_i' - G_i(G_i'VG_i)^{-}G_i'X\bigl(X'G_i(G_i'VG_i)^{-}G_i'X\bigr)^{-}X'G_i(G_i'VG_i)^{-}G_i'$$

where $G_i = Z_i$. In fact, Subramani [10] proposed other forms of the $U$'s and $G$'s, but we confine ourselves to the preceding forms because the others lead to the same result.¹ The main drawback of MMINQUE is the condition $\operatorname{tr}(AV) = \operatorname{rank}(AV)$ in the theorem, which makes MMINQUE valid only for this class of matrices. Moreover, negative estimates remain possible, an issue addressed in the following sections. It should be pointed out that if $V$ in $\hat{\sigma}_{MINQUE}$, $\hat{\sigma}_{MMINQUE1}$ or $\hat{\sigma}_{MMINQUE2}$² is replaced by $D$, the estimators are called weighted MINQUE, weighted MMINQUE1 and weighted MMINQUE2, respectively. □
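To make the MINQUE formula $\hat{\sigma}_{MINQUE} = S^{-1}b$ from this section concrete, here is a minimal Python sketch. It is an illustration under assumptions rather than the authors' code: the function name `minque`, the optional `weights` argument (unit weights give $V = \sum_i V_i$, prior values give the weighted version) and the toy data are hypothetical, and a pseudo-inverse is used as the generalized inverse when $S$ is singular.

```python
import numpy as np

def minque(y, X, Z_list, weights=None):
    """MINQUE of the variance components: solve S sigma = b with b_i = y' R V_i R y."""
    V_list = [Z @ Z.T for Z in Z_list]
    r = len(V_list)
    w = np.ones(r) if weights is None else np.asarray(weights, dtype=float)
    V = sum(wi * Vi for wi, Vi in zip(w, V_list))
    Vinv = np.linalg.inv(V)
    VinvX = Vinv @ X
    # R = V^-1 - V^-1 X (X' V^-1 X)^- X' V^-1, which satisfies R X = 0
    R = Vinv - VinvX @ np.linalg.pinv(X.T @ VinvX) @ VinvX.T
    Ry = R @ y
    b = np.array([Ry @ Vi @ Ry for Vi in V_list])
    S = np.array([[np.trace(R @ Vi @ R @ Vj) for Vj in V_list] for Vi in V_list])
    return np.linalg.pinv(S) @ b

# Toy example (hypothetical data): 5 groups of 3 observations plus an error term.
rng = np.random.default_rng(2)
Z1 = np.kron(np.eye(5), np.ones((3, 1)))
n = Z1.shape[0]
X = np.column_stack([np.ones(n), np.arange(n) % 3])
y = rng.normal(size=n)
est = minque(y, X, [Z1, np.eye(n)])
# Invariance: shifting y by X @ beta0 should leave the estimates unchanged.
shifted = minque(y + X @ np.array([3.0, -2.0]), X, [Z1, np.eye(n)])
```

Nothing in this computation constrains the solution of the linear system to be non-negative, which is the weakness the iterative estimators in the following sections are meant to address.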

IAUE method

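The concept of IAUE was developed by Horn et al. [14]. IAUE can be considered an advantageous alternative to the MINQUE approach, especially when MINQUE produces negative estimates. Lucas [25] stated that although IAUE gives biased estimators, it requires far less computation than many variance component methods, even though it usually needs more iterations to converge to the same degree of approximation. Analogously to Rao [20], Horn et al. [14] estimated the variance component $\sigma_i^2$ with a quadratic form $Y'A_iY$, where $A_i$ has the form:

$$A_i = R\tau_iV_iR$$

where $R = D^{-1} - D^{-1}X(X'D^{-1}X)^{-1}X'D^{-1}$, $D = \sum_{i=1}^{r} \tau_iV_i$ and $\tau_i$ is the prior estimate of $\sigma_i^2$. The expectation of $Y'A_iY$ is then:

$$E(Y'A_iY) = E(Y'R\tau_iV_iRY) = \operatorname{tr}\bigl(R\tau_iV_iR\,D(Y)\bigr) + \beta'X'R\tau_iV_iRX\beta$$

where $D(Y) = \sum_{j=1}^{r} \sigma_j^2V_j$ is the true dispersion matrix. Since $RX = 0$:

$$E(Y'A_iY) = \sum_{j=1}^{r} \operatorname{tr}(R\tau_iV_iR\sigma_j^2V_j) = \sum_{j=1}^{r} f_j \operatorname{tr}(R\tau_iV_iR\tau_jV_j)$$

where $f_j = \sigma_j^2/\tau_j$. Horn et al. [14] showed that $RDR = R$, so that:

$$E(Y'R\tau_iV_iRY) = \sum_{j=1}^{r} f_j \operatorname{tr}(R\tau_iV_iR\tau_jV_j) + f_i \operatorname{tr}(\tau_iV_iR) - f_i \operatorname{tr}(\tau_iV_iRDR)$$
$$= \sum_{j=1}^{r} f_j \operatorname{tr}(R\tau_iV_iR\tau_jV_j) + f_i \operatorname{tr}(\tau_iV_iR) - f_i \sum_{j=1}^{r} \operatorname{tr}(\tau_iV_iR\tau_jV_jR)$$
$$= \sum_{j=1}^{r} (f_j - f_i) \operatorname{tr}(R\tau_iV_iR\tau_jV_j) + f_i \operatorname{tr}(\tau_iV_iR)$$

If all the prior estimates $\tau_i$ approach the true values, or at least the ratios between the $\tau_i$ and the true values are close to one another, the first term of the previous equation vanishes and the working equation simplifies to:

$$E(\tau_iY'RV_iRY) = f_i \operatorname{tr}(\tau_iV_iR)$$

which yields:

$$\hat{f}_i = \frac{Y'RV_iRY}{\operatorname{tr}(V_iR)}$$

Consequently, IAUE can be summarized as: (1) choose initial values for the $\tau_i$; (2) compute $\hat{f}_i$ based on the $\tau_i$; (3) update the $\tau_i$ by any iterative procedure until all $\hat{f}_i$ approach one; (4) finally calculate $\hat{\sigma}_{i,IAUE}^2 = \tau_i\hat{f}_i$. In other words, $\hat{\sigma}_{i,IAUE}^2$ can be expressed as:

$$\hat{\sigma}_{i,IAUE}^2 = \hat{\tau}_i \,\big|_{\hat{f}_i\text{'s} \to 1}$$

The most significant advantages of IAUE are its computational ease and its non-negativity: the numerator of $\hat{f}_i$ is a quadratic form in the non-negative definite matrix $RV_iR$, and the denominator can be written as a sum of squares:

$$\operatorname{tr}(RV_i) = \operatorname{tr}(RDRV_i) = \sum_{j=1}^{r} \tau_j \operatorname{tr}(RV_iRV_j) = \sum_{j=1}^{r} \tau_j \operatorname{tr}(RZ_iZ_i'RZ_jZ_j') = \sum_{j=1}^{r} \tau_j \operatorname{tr}\bigl[(Z_j'RZ_i)(Z_i'RZ_j)\bigr] = \sum_{j=1}^{r} \tau_j \operatorname{tr}\bigl[(Z_j'RZ_i)(Z_j'RZ_i)'\bigr]$$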
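The four IAUE steps can be sketched in Python as a simple fixed-point iteration. This is a minimal illustration under assumptions: the function names, the multiplicative update $\tau_i \leftarrow \tau_i \hat{f}_i$ (one possible choice of "any iterative procedure"), the tolerance, the iteration cap and the toy data are hypothetical and not taken from the original study.

```python
import numpy as np

def residual_projector(D, X):
    """R = D^-1 - D^-1 X (X' D^-1 X)^- X' D^-1, so that R X = 0 and R D R = R."""
    Dinv = np.linalg.inv(D)
    DinvX = Dinv @ X
    return Dinv - DinvX @ np.linalg.pinv(X.T @ DinvX) @ DinvX.T

def iaue(y, X, Z_list, tau_start, tol=1e-6, max_iter=500):
    """Iterate tau_i <- tau_i * f_i until every f_i is close to one."""
    V_list = [Z @ Z.T for Z in Z_list]
    tau = np.asarray(tau_start, dtype=float).copy()
    for _ in range(max_iter):
        D = sum(t * V for t, V in zip(tau, V_list))
        R = residual_projector(D, X)
        Ry = R @ y
        # f_i = y' R V_i R y / tr(V_i R): a ratio of two non-negative quantities
        f = np.array([(Ry @ V @ Ry) / np.trace(V @ R) for V in V_list])
        tau = tau * f
        if np.all(np.abs(f - 1.0) < tol):
            break
    return tau

# Toy one-way random model (hypothetical data): 4 groups of 3 observations.
rng = np.random.default_rng(3)
Z1 = np.kron(np.eye(4), np.ones((3, 1)))
n = Z1.shape[0]
X = np.ones((n, 1))
y = 1.0 + Z1 @ rng.normal(size=4) + rng.normal(size=n)
V1, V2 = Z1 @ Z1.T, np.eye(n)
est = iaue(y, X, [Z1, np.eye(n)], tau_start=[1.0, 1.0])
```

Because every factor $\hat{f}_i$ is non-negative, the iterates cannot become negative when the starting values are positive; a component may shrink toward zero, in which case the loop simply stops at `max_iter`.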

MIAUE method

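On the other hand, the IAUE principle can readily be applied to MMINQUE, which generates new non-negative estimators in light of Subramani [10]. Mathematically, the suggested estimators are obtained by considering the expectation of the quadratic form:

$$E(Y'A_{i1}\tau_iV_iA_{i1}Y) = \operatorname{tr}\bigl(A_{i1}\tau_iV_iA_{i1}\,D(Y)\bigr) + \tau_i\beta'X'A_{i1}V_iA_{i1}X\beta$$

where $A_{i1} = D^{-1}\bigl(I_n - U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}\bigr)$, $D = \sum_{j=1}^{r} \tau_jV_j$ and $D(Y) = \sum_{j=1}^{r} \sigma_j^2V_j$ is the true dispersion matrix. In light of the invariance condition:

$$E(Y'A_{i1}\tau_iV_iA_{i1}Y) = \operatorname{tr}\bigl(A_{i1}\tau_iV_iA_{i1}\,D(Y)\bigr)$$

Since

$$A_{i1}DA_{i1} = D^{-1}\bigl(I_n - U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}\bigr)\bigl(I_n - U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}\bigr)$$
$$= D^{-1}\bigl(I_n - 2U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1} + U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}\bigr)$$
$$= D^{-1}\bigl(I_n - U_i(U_i'D^{-1}U_i)^{-}U_i'D^{-1}\bigr) = A_{i1}$$

we have:

$$E(Y'A_{i1}\tau_iV_iA_{i1}Y) = \sum_{j=1}^{r} f_j \operatorname{tr}(A_{i1}\tau_iV_iA_{i1}\tau_jV_j) + f_i \operatorname{tr}(\tau_iA_{i1}V_i) - f_i \sum_{j=1}^{r} \operatorname{tr}(\tau_iV_iA_{i1}\tau_jV_jA_{i1})$$
$$= \sum_{j=1}^{r} (f_j - f_i) \operatorname{tr}(A_{i1}\tau_iV_iA_{i1}\tau_jV_j) + f_i \operatorname{tr}(\tau_iA_{i1}V_i)$$

Thus, neglecting the differences between $f_i$ and the $f_j$, we get:

$$\hat{f}_{i1} = \frac{Y'A_{i1}V_iA_{i1}Y}{\operatorname{tr}(V_iA_{i1})}$$

As in the derivation of $\hat{\sigma}_{i,IAUE}^2$, $\hat{\sigma}_{i,MIAUE1}^2$ can be computed as:

$$\hat{\sigma}_{i,MIAUE1}^2 = \hat{\tau}_i \,\big|_{\hat{f}_{i1}\text{'s} \to 1}$$

Likewise, $\hat{\sigma}_{i,MIAUE2}^2$ can be calculated as:

$$\hat{\sigma}_{i,MIAUE2}^2 = \hat{\tau}_i \,\big|_{\hat{f}_{i2}\text{'s} \to 1},$$

where

$$\hat{f}_{i2} = \frac{Y'A_{i2}V_iA_{i2}Y}{\operatorname{tr}(V_iA_{i2})}$$

and

$$A_{i2} = G_i(G_i'DG_i)^{-}G_i' - G_i(G_i'DG_i)^{-}G_i'X\bigl(X'G_i(G_i'DG_i)^{-}G_i'X\bigr)^{-}X'G_i(G_i'DG_i)^{-}G_i'$$

It is notable that neither $\hat{\sigma}_{i,MIAUE1}^2$ nor $\hat{\sigma}_{i,MIAUE2}^2$ requires heavy calculation or produces negative estimates, so MIAUE1 and MIAUE2 can be considered competitors to IAUE.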
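The two ingredients of MIAUE1, the matrix $A_{i1}$ and the ratio $\hat{f}_{i1}$, can be checked numerically with the short Python sketch below. It is only an illustration under assumptions: the function names, the prior values used to build $D$, the toy two-component model and the choice of the Moore-Penrose pseudo-inverse as the generalized inverse are hypothetical, and the full iteration over the $\tau_i$ is deliberately left out.

```python
import numpy as np

def a_i1(D, U):
    """A_i1 = D^-1 (I - U (U' D^-1 U)^- U' D^-1), written as D^-1 - D^-1 U (.)^- U' D^-1."""
    Dinv = np.linalg.inv(D)
    DinvU = Dinv @ U
    # Explicit cutoff so that numerically zero singular values are discarded
    G = np.linalg.pinv(U.T @ DinvU, rcond=1e-10)
    return Dinv - DinvU @ G @ DinvU.T

def f_hat_i1(y, A, V_i):
    """f_i1 = y' A_i1 V_i A_i1 y / tr(V_i A_i1)."""
    Ay = A @ y
    return (Ay @ V_i @ Ay) / np.trace(V_i @ A)

# Toy two-component model (hypothetical data): 4 groups of 3 observations.
rng = np.random.default_rng(4)
Z1 = np.kron(np.eye(4), np.ones((3, 1)))
n = Z1.shape[0]
X = np.ones((n, 1))
V1, V2 = Z1 @ Z1.T, np.eye(n)
tau = [2.0, 0.5]                        # prior values, chosen arbitrarily
D = tau[0] * V1 + tau[1] * V2
y = 1.0 + Z1 @ rng.normal(size=4) + rng.normal(size=n)

U1 = X                                  # pairs with V_1
U2 = np.hstack([X, Z1])                 # pairs with V_2 (rank deficient on purpose)
A1, A2 = a_i1(D, U1), a_i1(D, U2)
f1, f2 = f_hat_i1(y, A1, V1), f_hat_i1(y, A2, V2)
```

In this example `A1 @ D @ A1` reproduces `A1` and `A1 @ U1` is numerically zero, which are the two properties the derivation relies on; both ratios are non-negative by construction.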

Empirical quantile dispersion graphs

The Quantile Dispersion Graph (QDG) is a graphical technique typically used for comparing and assessing the quality of variance components estimation. The QDG was suggested by Lee and Khuri [8] and consists of plots of the maxima and minima (in our view one of them suffices), taken over some region of the parameter space, of the quantiles of a variance component estimator. These plots provide a comprehensive picture of the quality of estimation for a particular variance component method. Since most variance component methods have no closed-form expression, the quantiles are obtained numerically; in this case the QDG is called the empirical QDG (EQDG). Following Lee and Khuri [8], the steps of the EQDG can be outlined as follows:

  (a) Select a specific variance component method.

  (b) Generate a random sample $Y$ from model (2.2) corresponding to $\sigma_1^2, \ldots, \sigma_r^2$.

  (c) Use the random sample obtained in (b) and the variance component method in (a) to compute the estimates $\hat{\sigma}_1^2, \ldots, \hat{\sigma}_r^2$.

  (d) Repeat steps (b) and (c) a sufficient number of times.

  (e) Compute the quantity $q_s = \hat{\sigma}_{1s}^2/\sigma_1^2$, where $s$ indexes the replications in (d).

  (f) For specified percentile values $p_h$,³ obtain the empirical quantiles $w_{h1}$ of $q_s$, where $h$ indexes the percentile values.

  (g) Repeat steps (e) and (f) for the remaining $\sigma_i^2$'s.

  (h) Select another point $(\beta, \sigma_1^2, \ldots, \sigma_r^2)$ from the chosen region of the parameter space in order to obtain another $w_{h2}$ for each $\sigma_i^2$.

  (i) Repeat step (h) until all points in the chosen region are covered.

  (j) Compute the maximum of $w_{h1}, w_{h2}, \ldots, w_{hH}$ for each $p_h$, where $H$ is the number of points in the chosen region. This maximum is called here the Empirical Quantile Maximum (EQM).

  (k) Switch to another variance component method and obtain the EQM associated with each $\sigma_i^2$.

  (l) Repeat step (k) until all $k$ variance component methods under study are covered.

  (m) For each $\sigma_i^2$, draw a line chart with the percentile values $p_h$ on the X-axis and the EQM of each variance component method on the Y-axis.

As expected, if a variance component method were perfect, the elements of its EQM would be identical and close to one; otherwise the method is of lower quality for estimating the variance components. In other words, the more variability in the EQM, the less efficient the corresponding method. It should be noted that the EQM reflects the variability of the estimators and not other characteristics such as bias or the occurrence of negative values.
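Steps (e) to (j) reduce to taking quantiles of the ratios within each parameter point and then a maximum across points. A minimal Python sketch follows; the function name, the array layout (one row per parameter point, one column per replication) and the small hand-made input are assumptions made for illustration, not data from the study.

```python
import numpy as np

def empirical_quantile_maxima(estimates, true_values, probs):
    """EQM for one variance component.

    estimates   : array of shape (n_points, n_replications) of estimates
    true_values : array of shape (n_points,) with the true component at each point
    probs       : percentile values p_h, given as probabilities in [0, 1]
    """
    estimates = np.asarray(estimates, dtype=float)
    true_values = np.asarray(true_values, dtype=float)
    ratios = estimates / true_values[:, None]      # q_s at every parameter point
    w = np.quantile(ratios, probs, axis=1)         # shape (len(probs), n_points)
    return w.max(axis=1)                           # maximum over the points

# Hand-made illustration: two parameter points, five replications each.
estimates = [[1.0, 2.0, 3.0, 4.0, 5.0],
             [2.0, 4.0, 6.0, 8.0, 10.0]]
eqm = empirical_quantile_maxima(estimates, true_values=[1.0, 4.0], probs=[0.0, 0.5, 1.0])
```

Plotting `eqm` against `probs` for each method gives one line of the chart described in step (m).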

Simulation study

It is of interest to compare all the preceding variance components estimators. Since theoretical comparison of their performance may be impossible, one has to resort to Monte Carlo simulation. Following Melo et al. [26], a nested-factorial design with two crossed factors and one nested factor is adopted to study the behavior of the variance components estimators. The model can be written as:

$$y_{abcd} = \alpha_a + \beta_b + \gamma_{c(a)} + (\alpha\beta)_{ab} + (\beta\gamma)_{bc(a)} + \varepsilon_{abcd}$$

$a = 1, \ldots, I$; $b = 1, \ldots, J$; $c = 1, \ldots, K_a$; $d = 1, \ldots, n_{abc}$, where $\alpha_a$ is the effect of the $a$-th level of factor A, $\beta_b$ is the effect of the $b$-th level of factor B, $\gamma_{c(a)}$ is the effect of the $c$-th level of factor C nested within the $a$-th level of factor A, $(\alpha\beta)_{ab}$ is the interaction effect between factors A and B, $(\beta\gamma)_{bc(a)}$ is the interaction effect between factors B and C nested within the $a$-th level of factor A, and $\varepsilon_{abcd}$ is a random error term. All the effects in the model are assumed to be fixed parameters except $\gamma_{c(a)}$, $(\beta\gamma)_{bc(a)}$ and $\varepsilon_{abcd}$, which are independently and normally distributed such that:

$$\gamma_{c(a)} \sim N(0, \sigma_1^2), \quad (\beta\gamma)_{bc(a)} \sim N(0, \sigma_2^2) \quad \text{and} \quad \varepsilon_{abcd} \sim N(0, \sigma_3^2).$$

Since the fixed effects are not of interest, all the fixed parameters can be set to one. In contrast, the comparison needs to be conducted under a variety of variance components configurations, different degrees of imbalance and multiple sample sizes. Following Rao and Heckler [7], Table 1 displays the variance components values used in the simulation. Many measures of imbalance have been introduced in the literature (see Khuri et al. [27]); one can be selected with the aim of covering different levels of imbalance of the $n_{abc}$ and various sample sizes. According to Qie and Xu [28], the measure introduced by Ahrens and Pincus [29] reliably reflects the effect of imbalance in the $n_{abc}$ and can be formulated as:

$$\phi = \frac{1}{m \sum_{abc} \bigl(n_{abc}/n\bigr)^2}$$

where $n$ is the total sample size and $m = J \times \sum_a K_a$. Ahrens and Pincus [29] showed that $\phi$ ranges from $1/m$ to one; smaller values indicate a greater degree of imbalance, while the value one is attained only in the balanced case. Table 2 presents the patterns of imbalance for the different sample sizes used throughout the simulation.
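The imbalance measure is straightforward to evaluate from the list of cell sizes. The short Python sketch below is illustrative only; the function name is hypothetical, and the cell sizes are those listed for patterns P1 and P3 in Table 2, read as a flat list of the $m = 8$ cells.

```python
def imbalance_phi(cell_sizes):
    """Ahrens-Pincus measure: phi = 1 / (m * sum((n_abc / n)^2)) over the m cells."""
    n = sum(cell_sizes)          # total sample size
    m = len(cell_sizes)          # number of cells, J * sum_a K_a
    return 1.0 / (m * sum((c / n) ** 2 for c in cell_sizes))

phi_balanced = imbalance_phi([3] * 8)                    # pattern P1: balanced, n = 24
phi_p3 = imbalance_phi([1, 2, 1, 2, 2, 8, 7, 1])         # pattern P3: n = 24
```

For the balanced pattern the measure equals one, and for P3 it is about 0.56, in line with the value reported in Table 2.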

Table 1.

Variance components configurations used in the simulation.

Configuration   σ₁²   σ₂²   σ₃²
V1              .1    1     1
V2              .1    10    1
V3              1     .1    1
V4              1     10    1
V5              10    1     1
V6              10    .1    1

Table 2.

The patterns of imbalance rate for each sample size used in the simulation.

Pattern | n  | I | J | K_a     | n_abc                                              | ϕ
P1      | 24 | 2 | 2 | 2, 2    | 3, … , 3                                           | 1
P2      | 24 | 2 | 2 | 2, 2    | 2,5; 3,2; 5,3; 3,1                                 | .83
P3      | 24 | 2 | 2 | 2, 2    | 1,2; 1,2; 2,8; 7,1                                 | .56
P4      | 36 | 3 | 2 | 1, 2, 3 | 3, … , 3                                           | 1
P5      | 36 | 3 | 2 | 1, 2, 3 | 2,2; 2,5; 2,5; 4,4; 3,3; 2,2                       | .87
P6      | 36 | 3 | 2 | 1, 2, 3 | 1,1; 2,5; 2,1; 1,10; 3,1; 7,2                      | .53
P7      | 63 | 3 | 3 | 2, 3, 2 | 3, … , 3                                           | 1
P8      | 63 | 3 | 3 | 2, 3, 2 | 2,2,4; 4,2,4; 2,2,4; 4,3,6; 3,3,4; 3,3,2; 3,2,1    | .87
P9      | 63 | 3 | 3 | 2, 3, 2 | 2,1,1; 3,2,10; 8,1,2; 1,1,3; 3,3,2; 9,3,3; 3,1,1   | .57

For each combination of variance components configuration and pattern of imbalance, 2000 independent random samples were generated, and all negative estimates were then set to zero. The estimated bias, MSE and probability of obtaining negative estimates⁴ are shown in Table 3.

Table 3.

Comparison of MINQUE, MMINQUE, IAUE, MIAUE, REML and MREML estimators based on compound absolute bias, compound MSE and prop. negative values.

Rows are grouped by imbalance pattern (P1 to P9) and, within each pattern, by configuration (V1 to V6). Each row lists 20 values in the following column order:
MMINQUE1: compound absolute bias, compound MSE, prop. negative values
MMINQUE2: compound absolute bias, compound MSE, prop. negative values
MINQUE: compound absolute bias, compound MSE, prop. negative values
MIAUE1: compound absolute bias, compound MSE
MIAUE2: compound absolute bias, compound MSE
IAUE: compound absolute bias, compound MSE
REML: compound absolute bias, compound MSE, prop. negative values
MREML: compound absolute bias, compound MSE
P1
V1 0.26 1.31 0.56 0.26 1.31 0.56 0.26 1.31 0.56 0.26 1.14 0.46 0.86 0.48 0.62 0.24 1.30 0.56 0.48 0.60
V2 2.35 71.58 0.52 2.35 71.58 0.52 2.35 71.58 0.52 2.28 70.70 3.87 50.51 4.15 50.48 2.16 69.84 0.52 4.27 37.20
V3 0.07 1.10 0.51 0.07 1.10 0.51 0.07 1.10 0.51 0.113 1.180 0.115 1.154 0.142 1.145 0.08 1.15 0.52 0.51 0.76
V4 2.38 78.14 0.46 2.38 78.14 0.46 2.38 78.14 0.46 1.95 69.88 3.49 53.12 3.75 52.82 2.03 77.22 0.46 3.63 40.47
V5 0.17 74.94 0.12 0.17 74.94 0.12 0.17 74.94 0.12 0.04 73.54 0.04 73.54 0.04 73.48 0.09 73.98 0.11 2.57 51.05
V6 0.16 72.47 0.45 0.16 72.47 0.45 0.16 72.47 0.45 1.31 74.07 2.01 74.64 1.99 74.62 0.16 59.58 0.45 3.05 43.37



P2
V1 0.31 1.60 0.59 0.34 1.71 0.59 0.30 1.58 0.58 0.97 2.14 1.06 1.76 1.07 1.65 0.30 1.49 0.58 0.47 0.71
V2 2.46 73.83 0.51 2.64 76.98 0.50 2.48 75.08 0.50 2.35 72.49 3.83 52.10 4.03 47.80 2.30 71.62 0.51 4.62 39.94
V3 0.10 1.27 0.57 0.11 1.29 0.57 0.09 1.26 0.55 0.11 1.23 0.02 1.20 0.05 1.18 0.10 1.21 0.53 0.41 0.79
V4 1.88 77.52 0.45 2.16 85.20 0.46 1.86 78.37 0.45 2.10 72.07 3.51 59.58 3.81 55.81 2.06 71.80 0.46 3.92 39.09
V5 0.13 73.67 0.13 0.16 74.56 0.20 0.14 73.65 0.13 0.16 81.97 0.11 82.74 0.13 81.89 0.26 70.26 0.13 2.73 52.06
V6 0.13 71.18 0.46 0.18 71.46 0.49 0.13 71.68 0.45 1.17 70.60 1.17 70.64 1.18 70.60 0.20 62.32 0.45 2.82 44.62



P3
V1 0.34 1.75 0.61 0.43 2.07 0.62 0.32 1.67 0.60 0.35 1.70 0.52 1.31 0.51 1.17 0.30 1.62 0.60 0.51 0.79
V2 2.13 71.98 0.53 2.73 84.52 0.53 2.11 74.92 0.52 2.37 75.17 4.07 57.85 4.22 51.77 2.16 72.28 0.52 4.58 39.35
V3 0.16 1.33 0.58 0.20 1.40 0.62 0.14 1.31 0.58 0.19 1.42 0.10 1.36 0.12 1.35 0.18 1.37 0.56 0.55 0.88
V4 1.90 83.40 0.49 2.44 98.91 0.51 1.87 85.07 0.49 2.22 79.91 3.58 63.87 3.72 57.87 2.10 76.30 0.47 4.04 42.02
V5 0.34 78.32 0.21 0.38 81.91 0.30 0.35 78.71 0.22 0.10 81.02 0.07 80.37 0.07 80.91 0.28 70.95 0.18 2.81 51.72
V6 0.31 72.42 0.49 0.43 72.52 0.53 0.32 73.06 0.50 0.53 69.95 0.45 70.24 0.49 69.96 0.37 65.51 0.47 2.97 48.21



P4
V1 0.19 0.87 0.49 0.19 0.87 0.49 0.19 0.87 0.49 0.22 0.93 0.37 0.71 0.40 0.69 0.21 0.82 0.50 0.42 0.50
V2 1.74 47.46 0.52 1.74 47.46 0.52 1.74 47.46 0.52 1.97 50.14 3.23 33.67 3.57 33.94 1.60 44.45 0.51 4.01 27.32
V3 0.07 0.72 0.42 0.07 0.72 0.42 0.07 0.72 0.42 0.11 0.72 0.05 0.71 0.07 0.70 0.05 0.72 0.41 0.29 0.56
V4 1.55 49.22 0.41 1.55 49.22 0.41 1.55 49.22 0.41 1.82 50.21 2.44 35.56 2.80 35.21 1.52 50.01 0.42 3.42 29.36
V5 0.07 44.26 0.05 0.07 44.26 0.05 0.07 44.26 0.05 0.07 46.18 0.06 46.19 0.06 46.17 0.02 45.96 0.04 1.78 37.21
V6 0.08 41.14 0.40 0.08 41.14 0.40 0.08 41.14 0.40 0.12 40.03 0.06 40.02 0.08 40.03 0.12 38.72 0.40 2.15 31.08



P5
V1 0.19 0.89 0.51 0.20 0.92 0.50 0.19 0.88 0.50 0.23 0.93 0.38 0.70 0.41 0.67 0.18 0.82 0.51 0.44 0.52
V2 1.81 48.00 0.50 1.91 49.65 0.49 1.81 48.16 0.50 1.93 46.84 3.40 33.10 3.74 32.83 1.60 46.59 0.50 4.02 28.13
V3 0.07 0.74 0.43 0.08 0.74 0.44 0.07 0.73 0.42 0.14 0.65 0.08 0.64 0.10 0.63 0.07 0.74 0.45 0.31 0.56
V4 1.52 49.82 0.43 1.60 51.85 0.43 1.54 50.18 0.43 1.79 48.26 3.00 36.57 3.37 35.86 1.55 52.40 0.44 3.43 30.87
V5 0.05 47.28 0.07 0.08 47.46 0.09 0.05 47.34 0.07 0.21 45.51 0.21 45.79 0.20 45.50 0.05 45.44 0.06 1.84 34.95
V6 0.15 39.08 0.41 0.15 39.28 0.44 0.16 39.13 0.40 0.26 42.02 0.20 42.07 0.22 42.02 0.16 34.61 0.40 2.03 34.78



P6
V1 0.27 1.22 0.56 0.34 1.42 0.58 0.26 1.17 0.54 0.24 0.91 0.41 0.71 0.43 0.68 0.27 1.14 0.51 0.47 0.58
V2 2.01 52.2 0.51 2.4 60.76 0.5 1.97 53.47 0.51 2.07 45.43 3.53 33.59 3.83 32.92 1.81 48.7 0.5 4.05 29.28
V3 0.11 0.86 0.5 0.15 0.93 0.55 0.1 0.84 0.5 0.12 0.78 0.05 0.77 0.08 0.76 0.12 0.86 0.48 0.43 0.64
V4 1.85 50.84 0.42 2.17 58.89 0.44 1.92 52.13 0.41 1.74 28.59 1.74 50.89 4.99 47.00 1.48 68.84 0.43 5.48 43.34
V5 0.04 48.46 0.11 0.09 50.79 0.21 0.04 48.76 0.12 0.08 48.76 0.07 48.92 0.08 48.75 0.08 46.03 0.1 1.8 35.17
V6 0.38 43.31 0.46 0.48 43.41 0.48 0.39 43.58 0.48 0.19 43.45 0.16 43.46 0.16 43.46 0.32 41.36 0.44 2.17 33.22



P7
V1 0.11 0.38 0.44 0.11 0.38 0.44 0.11 0.38 0.44 0.10 0.36 0.17 0.31 0.19 0.31 0.12 0.40 0.42 0.21 0.28
V2 0.96 18.95 0.53 0.96 18.95 0.53 0.96 18.95 0.53 1.20 18.84 1.94 15.42 2.17 15.58 0.96 19.54 0.52 2.43 14.22
V3 0.02 0.49 0.33 0.02 0.49 0.33 0.02 0.49 0.33 0.08 0.51 0.03 0.50 0.05 0.50 0.05 0.51 0.31 0.26 0.42
V4 0.79 22.14 0.41 0.79 22.14 0.41 0.79 22.14 0.41 0.81 20.43 1.85 17.60 2.09 17.44 0.77 21.33 0.39 1.6 14.65
V5 0.15 40.31 0.02 0.15 40.31 0.02 0.15 40.31 0.02 0.07 38.37 0.13 38.37 0.13 38.37 0.01 37.67 0.01 1.5 29.98
V6 0.06 33.64 0.31 0.06 33.64 0.31 0.06 33.64 0.31 0.26 33.32 0.25 33.31 0.28 33.31 0.05 34.64 0.30 1.43 27.34



P8
V1 0.10 0.42 0.44 0.11 0.43 0.44 0.10 0.41 0.45 0.12 0.39 0.21 0.34 0.23 0.33 0.11 0.42 0.42 0.25 0.29
V2 0.87 18.82 0.53 0.92 19.26 0.52 0.87 19.15 0.52 1.08 18.47 2.39 15.52 2.59 15.33 0.93 19.12 0.51 2.37 13.74
V3 0.05 0.52 0.33 0.05 0.51 0.34 0.05 0.52 0.32 0.09 0.50 0.01 0.49 0.04 0.49 0.04 0.54 0.30 0.26 0.45
V4 0.68 20.07 0.4 0.74 20.7 0.41 0.68 20.44 0.4 0.91 21.43 1.76 18.74 2.01 18.06 0.5 21.23 0.40 1.71 14.52
V5 0.02 37.66 0.01 0.02 37.74 0.01 0.02 37.7 0.01 0.14 35.13 0.19 35.10 0.20 35.12 0.12 35.91 0.01 1.61 29.0
V6 0.1 35.23 0.32 0.11 35.19 0.35 0.10 35.3 0.31 0.05 35.41 0.06 35.40 0.07 35.41 0.04 33.19 0.29 1.55 26.66



P9
V1 0.11 0.48 0.45 0.14 0.53 0.46 0.11 0.46 0.45 0.14 0.46 0.25 0.41 0.26 0.37 0.11 0.45 0.42 0.25 0.31
V2 1.02 20.63 0.51 1.25 22.64 0.5 1.05 22.22 0.51 1.23 18.97 2.62 17.27 2.76 16.18 0.9 20.14 0.51 2.41 14.77
V3 0.08 0.61 0.38 0.09 0.6 0.4 0.07 0.60 0.36 0.11 0.55 0.03 0.53 0.07 0.54 0.05 0.57 0.32 0.28 0.46
V4 0.75 22.45 0.41 0.98 25.13 0.42 0.74 23.91 0.41 0.91 23.35 1.99 21.65 2.14 19.51 0.74 21.8 0.37 1.73 15.0
V5 0.11 35.74 0.01 0.10 36.54 0.03 0.10 35.95 0.02 0.03 36.21 0.10 36.98 0.11 36.20 0.02 38.19 0.01 1.54 30.42
V6 0.22 34.73 0.37 0.25 34.86 0.41 0.21 34.86 0.36 0.17 33.34 0.17 33.26 0.19 33.33 0.14 35.54 0.32 1.49 28.09

Several conclusions can be drawn from the results in Table 3 across all the patterns and designs; they are summarized in the following points:

  • (a) For completely balanced designs, it makes no difference whether MINQUE, MMINQUE1 or MMINQUE2 is computed, because the three estimators coincide.

  • (b) Generally speaking, REML has the lowest compound absolute bias among all the estimators in most cases, whereas MREML can be considered the best estimator in terms of the MSE criterion.

  • (c) The compound absolute bias of MINQUE, MMINQUE1 and MMINQUE2 is lower than that of IAUE, MIAUE1 and MIAUE2 regardless of the sample size or the imbalance rate. Conversely, the compound MSE associated with MINQUE, MMINQUE1 and MMINQUE2 is greater than that of IAUE, MIAUE1 and MIAUE2 in most cases.

  • (d) Among the methods that can yield negative estimates, the REML estimator behaves best in terms of both bias and MSE, while MREML behaves best among the non-negative methods.

  • (e) MIAUE1 and MIAUE2 are clearly superior to IAUE in terms of the bias criterion: across all cases, the bias of IAUE is greater than that of MIAUE1, MIAUE2 or both. Moreover, the proposed estimators have lower MSE than MINQUE, MMINQUE1 and MMINQUE2.

  • (f) The sample size and the imbalance rate have a substantial effect on the behavior of all the estimators: either increasing the sample size or reducing the imbalance rate yields a significant improvement in both performance measures. Furthermore, there is an interaction between the sample size and the imbalance rate, as the effect of the imbalance rate diminishes at large sample sizes.

  • (g) The performance of the estimators depends heavily on the ratio $\sigma_1^2/\sigma_2^2$. The compound absolute bias of the estimators is acceptable whenever the ratio is greater than one.

  • (h) There are negligible differences among MINQUE, MMINQUE1, MMINQUE2 and REML with respect to the frequency of negative estimates, yet in almost all cases the frequency is slightly higher for MMINQUE2 than for the others and relatively lower for REML. The sample size has a strong effect in reducing the probability of negative estimates, while the imbalance rate has only a weak effect.
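
The criteria used in these comparisons can be illustrated with a minimal sketch. It assumes the simulated estimates for one method are stored as an array with one row per replication and one column per variance component. The function name `summarize_estimates` and the toy numbers are hypothetical and are not taken from the study, and the sketch reports per-component absolute bias and MSE rather than the compound measures of Table 3. The probability of negative estimates follows the rule stated in the paper's footnote: one minus the proportion of samples whose estimates are all non-negative.

```python
import numpy as np


def summarize_estimates(estimates, true_values):
    """Summarize simulated variance-component estimates for one method.

    estimates   : array-like, shape (n_replications, n_components)
    true_values : array-like, shape (n_components,)
    """
    estimates = np.asarray(estimates, dtype=float)
    true_values = np.asarray(true_values, dtype=float)

    errors = estimates - true_values           # broadcast over replications
    abs_bias = np.abs(errors.mean(axis=0))     # |mean(estimate) - truth| per component
    mse = (errors ** 2).mean(axis=0)           # mean squared error per component

    # A replication counts as "negative" if any component estimate is below zero,
    # i.e. probability = 1 - (share of replications with all estimates >= 0).
    all_nonnegative = (estimates >= 0).all(axis=1)
    prob_negative = 1.0 - all_nonnegative.mean()

    return {"abs_bias": abs_bias, "mse": mse, "prob_negative": prob_negative}


# Toy illustration with four replications and two components (hypothetical values).
toy = [[1.0, 2.0], [3.0, -1.0], [2.0, 2.0], [2.0, 1.0]]
result = summarize_estimates(toy, true_values=[2.0, 1.0])
print(result)
```

In the toy data only the second replication contains a negative estimate, so the estimated probability is 1 − 3/4 = 0.25. In the study the same calculation would run over 2000 replications per design.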

To enhance the numerical comparison, EQDGs, which provide a powerful graphical tool for comparing estimators, are shown for all the above estimators in Fig. 1. In addition, the norm of EQM is computed and reported in Table 4.

Fig. 1.

EQDGs corresponding to the MINQUE, MMINQUE, AUE, MAUE, REML and MREML estimators for each variance component.

Table 4.

The norm of EQM corresponding to MINQUE, MMINQUE, IAUE, MIAUE, REML and MREML estimators at each variance component.

Component      MMINQUE1  MMINQUE2  MINQUE  MIAUE1  MIAUE2  IAUE    REML    MREML
$\sigma_1^2$   159.91    167.24    160.69  154.60  155.09  155.97  153.56  122.53
$\sigma_2^2$   14.31     16.20     16.19   13.18   13.28   13.32   13.83   12.76
$\sigma_3^2$   3.98      3.98      3.99    4.18    4.13    4.11    3.97    3.93

The results from both the EQDGs and the EQM norms agree with the above conclusions: MREML can be regarded as the best estimator since it has the smallest MSE, whereas MMINQUE2 has the highest variability among the above estimators. On the other hand, all the estimators based on Ai1 are better than those based on MINQUE and Ai2. Furthermore, the degrees of freedom have a substantial negative effect on the norm for all the above estimators: the norm associated with $\sigma_3^2$ is lower than that associated with $\sigma_2^2$, which in turn is lower than that associated with $\sigma_1^2$.
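
The rankings read off Table 4 can be checked with a short sketch. The values below are transcribed from Table 4. The names `METHODS`, `EQM_NORM` and `extremes` are hypothetical helpers introduced only for this illustration.

```python
# Column order as in Table 4.
METHODS = ["MMINQUE1", "MMINQUE2", "MINQUE", "MIAUE1",
           "MIAUE2", "IAUE", "REML", "MREML"]

# Norm of EQM per variance component, transcribed from Table 4.
EQM_NORM = {
    "sigma1^2": [159.91, 167.24, 160.69, 154.60, 155.09, 155.97, 153.56, 122.53],
    "sigma2^2": [14.31, 16.20, 16.19, 13.18, 13.28, 13.32, 13.83, 12.76],
    "sigma3^2": [3.98, 3.98, 3.99, 4.18, 4.13, 4.11, 3.97, 3.93],
}


def extremes(values):
    """Return (method with the smallest norm, method with the largest norm)."""
    smallest = METHODS[values.index(min(values))]
    largest = METHODS[values.index(max(values))]
    return smallest, largest


summary = {component: extremes(values) for component, values in EQM_NORM.items()}

for component, (smallest, largest) in summary.items():
    print(f"{component}: smallest = {smallest}, largest = {largest}")
```

With the values as transcribed, MREML has the smallest norm for every component, and MMINQUE2 has the largest norm for the first two components. For the third component the largest transcribed value belongs to MIAUE1, although all estimators lie within a narrow range there.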

Conclusions

In this article, two new estimators based on the IAUE principle are introduced for estimating the variance components in the mixed linear model. The aim was to evaluate the performance of the proposed estimators relative to various other estimators via simulation studies. The model used is a nested-factorial model with two fixed crossed factors and one nested random factor under regularity assumptions. Several criteria, namely bias, MSE, the probability of getting negative values and the norm of EQM, are used to assess the performance of the estimators under study. From the numerical analysis, we found that the estimators based on the restricted likelihood function have desirable properties as long as the data are normally distributed. Further, the proposed estimators may be appropriate since they have less bias and less MSE than the estimator based on the almost unbiased approach. It may also be important to study the details of the algorithms proposed in the literature for computing variance component estimates, and their effect on the statistical characteristics of the estimates, e.g. [19], [23].

Conflict of interest

The authors have declared no conflict of interest.

Compliance with Ethics Requirements

This article does not contain any studies with human or animal subjects.

Acknowledgments

The authors wish to express their heartfelt thanks and gratitude to Prof. J. Subramani for his fruitful assistance and his comments on the manuscript.

Footnotes

Peer review under responsibility of Cairo University.

1

We reached this conclusion while recording the simulation results; it is therefore restricted to the nested-factorial model with two fixed crossed factors and one nested random factor.

2

σMMINQUE1 and σMMINQUE2 are based upon Ai1 and Ai2 respectively.

3

Lee and Khuri [8] selected the values of ph as .01, .05, .1, .2, .3, .4, .5, .6, .7, .8, .9, .95 and .99.

4

The probability of getting negative values is calculated as one minus the proportion of the 2000 samples in which all the estimates are non-negative.

References

  • 1. Searle S., Casella G., McCulloch C. Variance components. New York: John Wiley & Sons; 1992.
  • 2. Sahai H. A bibliography on variance components. Int Stat Rev. 1979;47:177–222.
  • 3. Khuri A., Sahai H. Variance components analysis: a selective literature survey. Int Stat Rev. 1985;53:279–300.
  • 4. Khuri A. Designs for variance components estimation: past and present. Int Stat Rev. 2000;68:311–322.
  • 5. Sahai H. A comparison of estimators of variance components in the balanced three-stage nested random effects model using mean squared error criterion. JASA. 1976;71:435–444.
  • 6. Swallow H., Monahan F. Monte-Carlo comparison of ANOVA, MINQUE, REML and ML estimators of variance components. Technometrics. 1984;26:47–57.
  • 7. Rao S., Heckler E. The three-fold nested random effects model. JSPI. 1997;64:341–352.
  • 8. Lee J., Khuri A. Quantile dispersion graphs for the comparison of designs for a random two-way model. JSPI. 2000;91:123–137.
  • 9. Jung C., Khuri I., Lee J. Comparison of designs for the three folds nested random model. J Appl Stat. 2008;35:701–715.
  • 10. Subramani J. On modified minimum variance quadratic unbiased estimation (MIVQUE) of variance components in mixed linear models. Model Assis Stat Appl. 2012;7:179–200.
  • 11. Chen L., Wei L. The superiorities of empirical Bayes estimation of variance components in random effects model. Commun Stat Theory Meth. 2013;42:4017–4033.
  • 12. Thompson A. The problem of negative estimates of variance components. Ann Math Stat. 1962;33:273–289.
  • 13. LaMotto L. On non-negative quadric unbiased estimation of variance components. JASA. 1973;68:728–730.
  • 14. Horn S., Horn R., Dunca D. Estimation heteroscedastic variances in linear model. JASA. 1975;70:380–385.
  • 15. Jennrich R., Sampson P. Newton–Raphson and related algorithms for maximum likelihood variance component estimation. Technometrics. 1976;18:11–17.
  • 16. Kelly R., Mathew T. Improved nonnegative estimation of variance components in some mixed models with unbalanced data. Technometrics. 1994;36:171–181.
  • 17. Khattree R. Some practical estimation procedures for variance components. Comput Stat Data Anal. 1998;28:1–32.
  • 18. Teunissen P., Amiri-Simkooei A. Least-squares variance component estimation. J Geodesy. 2008;82:65–82.
  • 19. Moghtased-Azar K., Tehranchi R., Amiri-Simkooei A. An alternative method for non-negative estimation of variance components. J Geodesy. 2014;88:427–439.
  • 20. Rao C. Estimation of variance and covariance components in linear models. JASA. 1972;67:112–115.
  • 21. Dempster P., Laird M., Rubin B. Maximum likelihood from incomplete data via the EM algorithm. JRSS. 1977;39:1–38.
  • 22. Harville D. Maximum-likelihood approaches to variance component estimation and to related problems. JASA. 1977;72:320–340.
  • 23. Diffey S., Welsh A., Smith A., Cullis B. A faster and computationally more efficient REML (PX) EM algorithm for linear mixed models. Centre for Statistical & Survey Methodology. Working Paper Series. University of Wollongong; 2013.
  • 24. Liu C., Rubin D., Wu J. Parameter expansion to accelerate EM: the PX-EM algorithm. Biometrika. 1998;85:755–770.
  • 25. Lucas J. A variance component estimation method for sparse matrix application. NOAA Technical Report; NOS 111 NGS 33; 1985.
  • 26. Melo S., Garzón B., Melo O. Cell means model for balanced factorial designs with nested mixed factors. Commun Stat Theory Meth. 2013;42:2009–2024.
  • 27. Khuri A., Mathew T., Sinha B. Statistical tests for mixed linear models. New York: John Wiley & Sons; 1998.
  • 28. Qie W., Xu C. Evaluation of a new variance components estimation method modified Henderson’s method 3 with the application of two way mixed model. Department of Economics and Society, Dalarna University College; 2009.
  • 29. Ahrens H., Pincus R. On two measures of unbalanceness in a one-way model and their relation to efficiency. Biometrics. 1981;23:227–235.

Articles from Journal of Advanced Research are provided here courtesy of Elsevier
