Proc Natl Acad Sci USA. 2001 Sep 11;98(19):10557–10559. doi: 10.1073/pnas.191354498

A note on Kalman Filter

C. R. Rao
PMCID: PMC58504  PMID: 11553805

Abstract

The Kalman Filter commonly employed by control engineers and other physical scientists has been successfully used in such diverse areas as the processing of signals in aerospace tracking and underwater sonar, and statistical quality control. More recently, it has been used in some nonengineering applications such as short-term forecasting, time series, survival analysis, and so on. In all of these situations, we have a set of equations governing the true state of a system and another set connecting the observations made at any given time on the system with its true state. The problem is that of predicting the true state of the system at any given time point based on available observations. The solution proposed in the vast literature of the subject depends on the assumptions made on the initial state of the system. In this paper, a method that is independent of the initial state is proposed. This is useful when the a priori information on the initial state is not available. The method is also applicable when some observations are missing.

1. Dynamical and Observational Equations

There is extensive literature on Kalman Filtering (KF) and its applications arising out of the seminal contributions by Kalman (1) and Kalman and Bucy (2).

The general problem can be stated as follows. We have a time sequence of p- and q-vector-valued random variables {x(t), y(t)}, t = 1, 2, …, related through the structural equations

$$x(t) = Fx(t-1) + \xi(t), \tag{1.1}$$
$$y(t) = Hx(t) + \eta(t), \tag{1.2}$$

where F and H are p × p and q × p matrices, respectively, and the following hold, where E stands for expectation and C for covariance.

$$E(\xi(t)) = 0, \qquad E(\eta(t)) = 0,$$
$$C(\xi(s), \xi(t)) = \delta_{st}V, \qquad C(\eta(s), \eta(t)) = \delta_{st}W,$$
$$C(\xi(s), \eta(t)) = 0,$$
$$C(x(0), \xi(t)) = 0, \qquad C(x(0), \eta(t)) = 0, \tag{1.3}$$

for all s and t, where $\delta_{st}$ is the Kronecker delta.

We have observations only on y(1), … , y(t) up to time t, and the problem is to estimate x(s) given y(1), … , y(t), or a subset of them.

In the engineering literature, estimation of x(s) is called smoothing if s < t, filtering if s = t, and prediction if s > t. An excellent review of KF and its applications to a wide variety of statistical problems can be found in refs. 1–5.

In the literature on KF, the conditions are usually stated in terms of independence of the variables involved instead of the zero correlations stated in 1.3. If we are interested in estimates that are linear in the observations, we need only knowledge of the variances and covariances of the variables involved.

The theory described in this paper can be extended to the case where the matrices F and H in 1.1 and 1.2 and V and W in 1.3 depend on time.
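As a concrete illustration of the model, the following sketch simulates Eqs. 1.1 and 1.2 for a small system. The particular matrices F, H, V, W, the zero initial state, and the horizon of 50 steps are illustrative assumptions, not values from the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    F = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed p x p transition matrix
    H = np.array([[1.0, 0.0]])               # assumed q x p observation matrix
    V = 0.1 * np.eye(2)                      # C(xi(t)) = V
    W = 0.2 * np.eye(1)                      # C(eta(t)) = W

    x = np.zeros(2)                          # x(0), taken as zero for illustration
    states, observations = [], []
    for _ in range(50):
        x = F @ x + rng.multivariate_normal(np.zeros(2), V)   # Eq. 1.1
        y = H @ x + rng.multivariate_normal(np.zeros(1), W)   # Eq. 1.2
        states.append(x)
        observations.append(y)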

2. Reduction to Linear Equations

From 1.1,

$$x(t) = F^t x(0) + e_t, \tag{2.1}$$

with $e_t = F_t\xi_t$, where

$$F_t = (F^{t-1}, F^{t-2}, \ldots, F, I_p), \qquad \xi_t' = (\xi'(1), \ldots, \xi'(t)),$$

and $I_p$ is the identity matrix of order p. A simple calculation yields

$$V_{rs} = C(e_r, e_s) = F_r \begin{pmatrix} I_s \otimes V \\ 0 \end{pmatrix} F_s', \qquad r \ge s, \tag{2.2}$$

where $I_s \otimes V$ is the Kronecker product, and 0 is a null matrix of appropriate order.

From 1.2,

$$y(t) = HF^t x(0) + \delta(t), \tag{2.3}$$

where $\delta(t) = He_t + \eta(t)$. A simple calculation gives

$$W_{rr} = C(\delta(r), \delta(r)) = HV_{rr}H' + W, \tag{2.4}$$
$$W_{rs} = C(\delta(r), \delta(s)) = HV_{rs}H', \qquad r \ne s,$$
$$w_{rs} = C(e_r, \delta(s)) = V_{rs}H'.$$

Writing Y′(t) = (y′(1), … , y′(t)) and

$$G_t' = ((HF)', (HF^2)', \ldots, (HF^t)'), \qquad \delta_t' = (\delta'(1), \ldots, \delta'(t)),$$

we have the composite linear model for observations up to time t

$$Y(t) = G_t x(0) + \delta_t. \tag{2.5}$$

To estimate x(s) given Y(t) for any s and t, we need to consider the equations

$$x(s) = F^s x(0) + e_s, \tag{2.6}$$
$$Y(t) = G_t x(0) + \delta_t,$$

with the variance-covariance matrix of $(e_s, \delta_t)$,

$$C\begin{pmatrix} e_s \\ \delta_t \end{pmatrix} = \begin{pmatrix} V_{ss} & u' \\ u & U \end{pmatrix}, \tag{2.7}$$

where $V_{ss}$, $w_{rs}$, and $W_{ij}$ are as defined in 2.2 and 2.4, $u' = (w_{s1}, \ldots, w_{st})$, and $U = (W_{ij})$.
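To fix the notation numerically, the sketch below assembles $G_t$, $U = (W_{ij})$, and u from Eqs. 2.2, 2.4, and 2.7. The system matrices and the choices s = 2, t = 4 are again illustrative assumptions, not values from the paper; the later sketches in Sections 3 and 4 reuse the objects defined here.

    import numpy as np

    p, q, s, t = 2, 1, 2, 4
    F = np.array([[0.9, 0.1], [0.0, 0.8]])   # assumed p x p transition matrix
    H = np.array([[1.0, 0.0]])               # assumed q x p observation matrix
    V = 0.1 * np.eye(p)                      # C(xi(t))
    W = 0.2 * np.eye(q)                      # C(eta(t))

    Fp = lambda k: np.linalg.matrix_power(F, k)

    def Vrs(r, c):
        # V_rc = C(e_r, e_c) = sum over i <= min(r, c) of F^(r-i) V F'^(c-i)  (Eq. 2.2)
        return sum(Fp(r - i) @ V @ Fp(c - i).T for i in range(1, min(r, c) + 1))

    def Wrs(r, c):
        # W_rc = C(delta(r), delta(c)) = H V_rc H' + delta_rc W  (Eq. 2.4)
        return H @ Vrs(r, c) @ H.T + (W if r == c else 0.0)

    def wrs(r, c):
        # w_rc = C(e_r, delta(c)) = V_rc H'  (Eq. 2.4)
        return Vrs(r, c) @ H.T

    G = np.vstack([H @ Fp(k) for k in range(1, t + 1)])          # G_t of Eq. 2.5
    U = np.block([[Wrs(i, j) for j in range(1, t + 1)]
                  for i in range(1, t + 1)])                     # U = (W_ij), Eq. 2.7
    u = np.vstack([wrs(s, j).T for j in range(1, t + 1)])        # u' = (w_s1, ..., w_st)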

3. Different Assumptions on x(0)

3.1. Given Prior Distribution of x(0).

Let x(0) have a prior mean μ and variance-covariance matrix P. Then we can write the model 2.6 as

$$x(s) = F^s\mu + e_s^*, \qquad e_s^* = F^s(x(0) - \mu) + e_s, \tag{3.1}$$
$$Y(t) = G_t\mu + \delta_t^*, \qquad \delta_t^* = G_t(x(0) - \mu) + \delta_t,$$

with

$$V_{ss}^* = C(e_s^*) = F^sP(F^s)' + V_{ss}, \tag{3.2}$$
$$u^{*\prime} = C(e_s^*, \delta_t^*) = F^sPG_t' + u',$$
$$U^* = C(\delta_t^*) = G_tPG_t' + U.$$

Then, using least squares theory (ref. 6, pp. 544–547), the estimate of x(s) is

$$\hat{x}(s) = F^s\mu + u^{*\prime}U^{*-1}(Y(t) - G_t\mu), \tag{3.3}$$

with the minimum dispersion error matrix

$$V_{ss}^* - u^{*\prime}U^{*-1}u^*. \tag{3.4}$$
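Continuing the Section 2 sketch (reusing p, q, s, t, F, H, G, U, u, Fp, and Vrs), Eqs. 3.1–3.4 translate directly into code. The prior mean mu, the prior covariance P, and the placeholder observation vector Y are illustrative assumptions.

    mu = np.zeros((p, 1))                    # assumed prior mean of x(0)
    P = 0.5 * np.eye(p)                      # assumed prior covariance of x(0)
    Y = np.ones((t * q, 1))                  # placeholder for the observed Y(t)

    Vss = Vrs(s, s)
    Vstar = Fp(s) @ P @ Fp(s).T + Vss        # V*_ss of Eq. 3.2
    ustarT = Fp(s) @ P @ G.T + u.T           # u*' of Eq. 3.2
    Ustar = G @ P @ G.T + U                  # U* of Eq. 3.2

    xhat = Fp(s) @ mu + ustarT @ np.linalg.solve(Ustar, Y - G @ mu)   # Eq. 3.3
    disp = Vstar - ustarT @ np.linalg.solve(Ustar, ustarT.T)          # Eq. 3.4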

3.2. x(0) Is a Known Constant μ.

The solution in this case is the same as 3.3 and 3.4 with P in 3.2 replaced by the null matrix. Then the estimate of x(s) is

$$\hat{x}(s) = F^s\mu + u'U^{-1}(Y(t) - G_t\mu), \tag{3.5}$$

with the minimum dispersion error matrix

$$V_{ss} - u'U^{-1}u.$$

If x(0) is a constant, but its value is not known, then we may estimate x(0) from the second equation in 2.6 using least squares theory, provided the rank of $G_t$ is p. In such a case, the least squares estimate of x(0) is

$$\hat{x}(0) = (G_t'U^{-1}G_t)^{-1}G_t'U^{-1}Y(t). \tag{3.6}$$

Substituting x̂(0) for μ in 3.5, we have the empirical estimate

$$\hat{x}(s) = F^s\hat{x}(0) + u'U^{-1}(Y(t) - G_t\hat{x}(0)). \tag{3.7}$$

The estimate 3.7 and its dispersion matrix are derived in a more natural way in the next section.
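In code, continuing the same sketch (and reusing the placeholder Y from the Section 3.1 fragment), the least squares estimate 3.6 and the empirical estimate 3.7 read:

    Ui = np.linalg.inv(U)
    x0hat = np.linalg.solve(G.T @ Ui @ G, G.T @ Ui @ Y)    # Eq. 3.6 (rank of G_t must be p)
    xhat = Fp(s) @ x0hat + u.T @ Ui @ (Y - G @ x0hat)      # Eq. 3.7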

3.3. No Information Is Available on x(0).

Consider a linear function LY(t) as an estimate of x(s) such that

$$E(x(s) - LY(t)) = 0 \ \text{for all} \ x(0), \tag{3.8}$$
$$\text{i.e.,} \quad LG_t = F^s.$$

The variance-covariance matrix of x(s) − LY(t) is

$$C(x(s) - LY(t)) = V_{ss} - Lu - u'L' + LUL', \tag{3.9}$$

where $V_{ss}$, u, and U are as defined in 2.7.

The minimum value of 3.9 subject to 3.8 is attained at

$$L^* = u'U^{-1} + (F^s - u'U^{-1}G_t)(G_t'U^{-1}G_t)^{-1}G_t'U^{-1}, \tag{3.10}$$

giving the same expression for the estimate of x(s) as the empirical estimate derived in 3.7. The minimum dispersion error matrix is

$$V_{ss} - u'U^{-1}u + (F^s - u'U^{-1}G_t)(G_t'U^{-1}G_t)^{-1}(F^s - u'U^{-1}G_t)'. \tag{3.11}$$

The last term in 3.11 is the extra error due to the imposed condition 3.8. We call the estimate $L^*Y(t)$ the constrained linear predictor (CLP) of x(s).

We prove the following result, which will be used in Section 4. Let M be a matrix such that E(MY(t)) = 0 independently of x(0), which implies that $MG_t = 0$. Then, with $L^*$ as defined in 3.10,

$$C(x(s) - L^*Y(t), MY(t)) = (u' - L^*U)M' \tag{3.12}$$
$$= -(F^s - u'U^{-1}G_t)(G_t'U^{-1}G_t)^{-1}(MG_t)'$$
$$= 0.$$
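Continuing the running sketch, the fragment below computes $L^*$ of 3.10 and checks both the unbiasedness constraint 3.8 and the orthogonality property 3.12 numerically. The matrix M, built here from the left null space of $G_t$, is one illustrative choice of a matrix with $MG_t = 0$.

    Ui = np.linalg.inv(U)
    K = np.linalg.inv(G.T @ Ui @ G)
    Lstar = u.T @ Ui + (Fp(s) - u.T @ Ui @ G) @ K @ G.T @ Ui   # Eq. 3.10

    assert np.allclose(Lstar @ G, Fp(s))                       # constraint 3.8

    # Any M with M G_t = 0: rows spanning the left null space of G_t.
    Q, _ = np.linalg.qr(G, mode='complete')
    M = Q[:, p:].T
    cov = u.T @ M.T - Lstar @ U @ M.T        # C(x(s) - L*Y(t), MY(t)), cf. Eq. 3.12
    assert np.allclose(cov, 0.0)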

4. Recursive Estimation

In the application of KF, it is customary to estimate x(s) using the available observations (y′(1), … , y′(r)) = Y1 (say), up to a time point r, and update the estimate when additional observations (y′(r + 1), … , y′(t)) = Y2 (say) are acquired. The methodology of such recursive estimation is well known in KF theory when unconstrained estimates (such as those discussed in Sections 3.1 and 3.2) are used.

We develop a similar method when CLPs as derived in Section 3.3 are required.

Let $L_1^*Y_1$ be the CLP of x(s) based on $Y_1$, and $L_2^*Y_1$ be the CLP of $Y_2$ based on $Y_1$. We consider a general linear function of $Y_1$, $Y_2$,

$$M_1Y_1 + M_2(Y_2 - L_2^*Y_1), \tag{4.1}$$

and its deviation from x(s) as

$$x(s) - M_1Y_1 - M_2(Y_2 - L_2^*Y_1) \tag{4.2}$$
$$= (x(s) - L_1^*Y_1) + (L_1^* - M_1)Y_1 - M_2(Y_2 - L_2^*Y_1) = A + B + C, \ \text{say}.$$

The condition of unbiasedness of 4.1 independently of x(0) implies E(B) = 0, since E(A) = 0 and E(C) = 0. Then, by the condition 3.12,

$$C(A, B) = 0, \qquad C(C, B) = 0. \tag{4.3}$$

Hence

$$C(x(s) - M_1Y_1 - M_2(Y_2 - L_2^*Y_1)) \tag{4.4}$$
$$= C(A + C) + C(B) = C(A - M_2D) + C(B),$$

where $D = Y_2 - L_2^*Y_1$, so that $C = -M_2D$.

The second term in 4.4 vanishes if we choose $M_1 = L_1^*$. The best choice of $M_2$, which minimizes the first term, is

$$M_2 = C(A, D)[C(D)]^{-1}. \tag{4.5}$$

To derive explicit expressions for A, D and their covariance, we write the basic linear model 2.6 by splitting Y(t) into the two components Y1 and Y2.

$$x(s) = F^s x(0) + e_s, \tag{4.6}$$
$$Y_1 = G_{(1)}x(0) + \delta_{(1)},$$
$$Y_2 = G_{(2)}x(0) + \delta_{(2)},$$

where $G_{(1)}$, $G_{(2)}$ are partitions of $G_t$ and $\delta_{(1)}$, $\delta_{(2)}$ are the corresponding partitions of $\delta_t$. The corresponding covariance matrix of x(s), $Y_1$, $Y_2$ is

$$\begin{pmatrix} V_{ss} & u_1' & u_2' \\ u_1 & U_{11} & U_{12} \\ u_2 & U_{21} & U_{22} \end{pmatrix}, \tag{4.7}$$

obtained by partitioning the matrices u and U in 2.7.

Using the formula 3.10, the CLP of x(s) based on Y1 is

$$L_1^*Y_1, \qquad L_1^* = u_1'U_{11}^{-1} + (F^s - u_1'U_{11}^{-1}G_{(1)})(G_{(1)}'U_{11}^{-1}G_{(1)})^{-1}G_{(1)}'U_{11}^{-1}, \tag{4.8}$$
$$A = x(s) - L_1^*Y_1 = e_s - L_1^*\delta_{(1)}.$$

The CLP of Y2 based on Y1 is

$$L_2^*Y_1, \qquad L_2^* = U_{21}U_{11}^{-1} + (G_{(2)} - U_{21}U_{11}^{-1}G_{(1)})(G_{(1)}'U_{11}^{-1}G_{(1)})^{-1}G_{(1)}'U_{11}^{-1}, \tag{4.9}$$
$$D = Y_2 - L_2^*Y_1 = \delta_{(2)} - L_2^*\delta_{(1)}.$$

Then

$$C(A, D) = u_2' - u_1'L_2^{*\prime} - L_1^*U_{12} + L_1^*U_{11}L_2^{*\prime}, \qquad C(D) = U_{22} - U_{21}L_2^{*\prime} - L_2^*U_{12} + L_2^*U_{11}L_2^{*\prime}. \tag{4.10}$$

The updated CLP of x(s) is

$$L_1^*Y_1 + C(A, D)[C(D)]^{-1}(Y_2 - L_2^*Y_1), \tag{4.11}$$

with the minimum dispersion error

$$C(A) - C(A, D)[C(D)]^{-1}C(D, A). \tag{4.12}$$
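The recursion can be verified numerically. Continuing the running sketch (with the placeholder Y from the Section 3.1 fragment), the code below splits Y(t) into $Y_1$ and $Y_2$, forms $L_1^*$, $L_2^*$, C(A, D), C(D), and $M_2$ from Eqs. 4.8–4.10 and 4.5, and checks that the updated CLP 4.11 coincides with the CLP computed from all of Y(t) at once. The split point r = 2 is an arbitrary illustrative choice.

    def clp(T, cross, Gm, Um):
        # CLP coefficient in the pattern of Eq. 3.10: T is the coefficient of
        # x(0) in the target, cross = C(target error, data error), for data
        # with design matrix Gm and error covariance Um.
        Umi = np.linalg.inv(Um)
        Km = np.linalg.inv(Gm.T @ Umi @ Gm)
        return cross @ Umi + (T - cross @ Umi @ Gm) @ Km @ Gm.T @ Umi

    r = 2                                    # Y1 = y(1..r), Y2 = y(r+1..t)
    G1, G2 = G[:r * q], G[r * q:]
    U11, U12 = U[:r * q, :r * q], U[:r * q, r * q:]
    U21, U22 = U[r * q:, :r * q], U[r * q:, r * q:]
    u1T, u2T = u[:r * q].T, u[r * q:].T

    L1 = clp(Fp(s), u1T, G1, U11)            # Eq. 4.8: CLP of x(s) on Y1
    L2 = clp(G2, U21, G1, U11)               # Eq. 4.9: CLP of Y2 on Y1
    CAD = u2T - u1T @ L2.T - L1 @ U12 + L1 @ U11 @ L2.T    # C(A, D), Eq. 4.10
    CD = U22 - U21 @ L2.T - L2 @ U12 + L2 @ U11 @ L2.T     # C(D), Eq. 4.10
    M2 = CAD @ np.linalg.inv(CD)             # Eq. 4.5

    updated = L1 @ Y[:r * q] + M2 @ (Y[r * q:] - L2 @ Y[:r * q])   # Eq. 4.11
    batch = clp(Fp(s), u.T, G, U) @ Y        # CLP based on all of Y(t)
    assert np.allclose(updated, batch)       # the recursion reproduces the batch CLP

The agreement of the two estimates reflects the uniqueness of the constrained minimum: the recursive construction attains the same minimum dispersion as the CLP computed from the full data in one step.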

Note 1: The second equation in 2.6 is in the form of a mixed linear model, which enables us to estimate x(0), and also the matrices V and W if they are unknown, by using the MINQE method described in ref. 7.

Note 2: The linear model 2.6 is used when we want to estimate x(s) based on the data y(1), … , y(t). A more general problem is the estimation of x(s) based on y(r_1), … , y(r_n), observations made at n time points r_1, … , r_n. A particular situation arises when observations on y are missing at certain time points, or when some observations on y are discarded as unreliable or as outliers. The linear equations appropriate for this case are

$$x(s) = F^s x(0) + e_s,$$
$$Z = Gx(0) + \delta, \tag{4.13}$$

where

$$Z' = (y'(r_1), \ldots, y'(r_n)),$$
$$G' = ((HF^{r_1})', \ldots, (HF^{r_n})'),$$
$$\delta' = (\delta'(r_1), \ldots, \delta'(r_n)),$$
$$C(\delta(r_i), \delta(r_j)) = HV_{r_ir_j}H' + \delta_{ij}W,$$
$$C(e_s, \delta(r_i)) = V_{sr_i}H',$$

where $V_{ij}$ are as defined in 2.2. Then the general methods of Sections 3 and 4 apply.
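For instance, continuing the Section 2 sketch, if only y(1), y(3), and y(4) are observed (an illustrative missing-data pattern), the matrices of 4.13 are assembled from the same ingredients:

    pts = [1, 3, 4]                          # observed time points r_1, ..., r_n
    Gm = np.vstack([H @ Fp(k) for k in pts])                   # G of Eq. 4.13
    Um = np.block([[Wrs(i, j) for j in pts] for i in pts])     # C(delta)
    um = np.vstack([wrs(s, j).T for j in pts])                 # blocks of C(e_s, delta)'
    # The methods of Sections 3 and 4 now apply with G_t, U, u
    # replaced by Gm, Um, um.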

Acknowledgments

This work was supported by a grant from the Army Research Office, DAAD 19-01-1-0563.

Abbreviations

KF, Kalman Filter; CLP, constrained linear predictor.

References

1. Kalman, R. E. (1960) J. Basic Eng. 82, 34–45.
2. Kalman, R. E. & Bucy, R. S. (1961) J. Basic Eng. 83, 95–108.
3. Meinhold, R. J. & Singpurwalla, N. D. (1983) Am. Stat. 37, 123–127.
4. Meinhold, R. J. & Singpurwalla, N. D. (1987) Am. Stat. 41, 101–106.
5. Rao, C. R. & Toutenburg, H. (1999) Linear Models: Least Squares and Alternatives (Springer, New York), 2nd Ed.
6. Rao, C. R. (1973) Linear Statistical Inference and Its Applications (Wiley, New York).
7. Rao, C. R. & Kleffe, J. (1988) Estimation of Variance Components and Its Applications (North-Holland, Amsterdam).
