Abstract
The increased power of small computers makes the use of parameter estimation methods attractive. Such methods have a number of uses in analytical chemistry. When valid models are available, many methods work well, but when models used in the estimation are in error, most methods fail. Methods based on the Kalman filter, a linear recursive estimator, may be modified to perform parameter estimation with erroneous models. Modifications to the filter involve allowing the filter to adapt the measurement model to the experimental data through matching the theoretical and observed covariance of the filter innovations sequence. The adaptive filtering methods that result have a number of applications in analytical chemistry.
Keywords: automated covariance estimation, Kalman filter, multicomponent analysis
1. Introduction
The increased computational power available from small computers has prompted a re-evaluation of the methods used in reducing data obtained from a chemical analysis. Many of the responses obtained from chemical analyses are suited to mathematical analysis by methods which estimate the parameters that generate the response; these parameters are generally concentrations. For parameter estimation to be successful, an accurate model of the behavior of the chemical system is necessary. The model used need not be theoretical; empirical models based on experimental results or on a numerical simulation of the chemical system are often satisfactory as well. When valid models are available, the parameters associated with the model may be obtained with a variety of methods. Some that have seen extensive use in analytical chemistry include analysis of the chemical data using linear least squares [1], nonlinear least squares analysis [2,3], and Kalman filtering [4–6].1
The methods mentioned above all work well with accurate models, but are much less satisfactory when used with models containing errors that can arise from many sources. Theoretical models, or models based on simulation, may not describe the physics or chemistry of a system well enough to predict system responses to the accuracy desired. Small changes in the experimental conditions used for data acquisition may perturb experimentally-obtained models, leading to errors when these models are used to analyze subsequent experiments. And, it may be impossible, because of the effects of chemical equilibria, to obtain independent responses for some of the chemical species included in a model for a complex system, leading to “chemical” model errors.
Relatively few methods have been developed to compensate for model errors affecting multicomponent quantitation. Approaches using factor analysis [7] have been developed for situations where the model is unknown, but these approaches are generally limited to very few components [8], and it is difficult to incorporate additional a priori information into such methods. An alternative approach has used the Kalman filter. The Kalman filter is a linear, recursive estimator which yields optimal estimates for parameters associated with a valid model [9,10]. Several methods, classified under the term “adaptive filtering,” have been developed to permit the filter to produce accurate parameter estimates in the presence of model errors [11–15]. This paper summarizes the development of an adaptive Kalman filter for use in the mathematical analysis of overlapped multicomponent chemical responses.
2. Theory
Kalman Filtering
The Kalman filter has received some attention for the analysis of multicomponent chemical responses [4,6,16,17]. Because most models relating chemical responses to concentrations are linear, application of the Kalman filter is straightforward. The filter model comprises two equations. The system model, which describes the time evolution of the desired parameters, is, in state-space notation,
$X(k+1) = F(k+1, k)\,X(k) + w(k)$  (2.1)
where X is an n × 1 column vector of state variables describing the chemical system, F is an n × n matrix describing how the states change with time, w is a vector describing noise contributions to the system model, and k indicates time or some other independent variable which meets the noise requirements given below. For time-invariant systems, F reduces to the identity matrix I. Because multicomponent analysis is most often performed under conditions where concentrations are constant over the time frame involved, the case where X is time-invariant is considered here.
The second equation describes the measurement process by relating the measured response z(k), to the filter states. For a single sensor, the measurement model is given by
$z(k) = H^T(k)\,X(k) + v(k)$  (2.2)
where HT(k) is a 1 × n vector relating the response at point k to the n states, and the scalar v(k) is the noise contribution of the measurement process. For example, in absorption spectrophotometry, z(k) is an absorbance measurement at some wavelength k, and HT(k) is the vector of absorption coefficients at that wavelength for all chemical species included in the model. The measurement model is easily extended to systems with multiple sensors.
The two noise processes in the Kalman filter, w(k) and v(k), are usually assumed to be independent, zero-mean, white noise processes. The matrix Q(k), defined as the covariance of the noise in the system model, is taken as approximately zero for the time-invariant system discussed in this paper. The scalar quantity R(k) is the variance of the noise in the measurement process.
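As a concrete illustration of eq (2.2), the following Python sketch simulates the scalar response of a hypothetical three-component absorption spectrum. The band shapes, concentrations, and noise level are illustrative assumptions, not data from this work.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points = 100                                  # number of measurement points k
wavelengths = np.linspace(400, 700, n_points)   # hypothetical wavelength axis

# H(k): response factors of the n = 3 species at each point, modeled here
# as Gaussian bands purely for illustration (not data from the paper).
def band(center, width):
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

H = np.column_stack([band(450, 30), band(520, 40), band(600, 35)])   # shape (k, n)

X_true = np.array([0.8, 0.5, 1.2])   # time-invariant states (concentrations)
R = 1e-4                             # measurement noise variance R(k)

# Measurement model of eq (2.2): z(k) = H^T(k) X + v(k)
z = H @ X_true + rng.normal(0.0, np.sqrt(R), n_points)
```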
The Potter-Schmidt square-root algorithm, one implementation of the Kalman filter [18], is given in table 1. The details of this algorithm have been discussed elsewhere [18,19]. Initial guesses for the filter states and for the covariance matrix P are required to start the filter. Estimates of X and P depend on k, and because both are projected ahead of the data (in eqs 2.3 and 2.4) by the filter, the notation (j|k) is used to indicate that the estimate is made at point j, based on data obtained up through point k. The filter output consists of the estimates $\hat{X}(k|k)$, as well as $P(k|k)$. In analytical chemistry, these are often estimates of concentrations and of the error in the concentrations.
Table 1.
Algorithm equations for the square root Kalman filter.
State estimate extrapolation:
$\hat{X}(k|k-1) = F(k, k-1)\,\hat{X}(k-1|k-1)$  (2.3)

Covariance square root extrapolation:
$S(k|k-1) = F(k, k-1)\,S(k-1|k-1)$  (2.4)
where $P(k|k-1) = S(k|k-1)\,S^T(k|k-1)$

Kalman gain:
$K(k) = a(k)\,S(k|k-1)\,g(k)$  (2.5)
where
$g(k) = S^T(k|k-1)\,H(k)$  (2.6)
$a(k) = [g^T(k)\,g(k) + R(k)]^{-1}$  (2.7)
$\gamma(k) = [1 + (a(k)\,R(k))^{1/2}]^{-1}$  (2.8)

State estimate update:
$\hat{X}(k|k) = \hat{X}(k|k-1) + K(k)\,[z(k) - H^T(k)\,\hat{X}(k|k-1)]$  (2.9)

Covariance square root update:
$S(k|k) = S(k|k-1) - \gamma(k)\,K(k)\,g^T(k)$  (2.10)
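Table 1 is compact enough to express directly in code. The following Python sketch is one possible implementation of the square-root update for the time-invariant case considered here (F = I, Q ≈ 0) with a single sensor per point; the variable names follow the table, but the function itself is an illustration rather than the authors' code.

```python
import numpy as np

def sqrt_kalman_filter(z, H, x0, S0, R):
    """z: (N,) scalar responses; H: (N, n) rows H^T(k); x0: (n,) initial state
    guess; S0: (n, n) initial square root of P (P0 = S0 S0^T); R: scalar
    measurement noise variance."""
    x = np.asarray(x0, dtype=float).copy()
    S = np.asarray(S0, dtype=float).copy()
    for k in range(len(z)):
        h = H[k]                                  # H(k)
        # For F = I and Q ~ 0, the extrapolations (eqs 2.3, 2.4) leave x and S unchanged.
        g = S.T @ h                               # eq (2.6)
        a = 1.0 / (g @ g + R)                     # eq (2.7)
        gamma = 1.0 / (1.0 + np.sqrt(a * R))      # eq (2.8)
        K = a * (S @ g)                           # Kalman gain, eq (2.5)
        x = x + K * (z[k] - h @ x)                # state update, eq (2.9)
        S = S - gamma * np.outer(K, g)            # covariance square-root update, eq (2.10)
    return x, S @ S.T                             # final estimates of X and P

# Example: x_hat, P_hat = sqrt_kalman_filter(z, H, np.zeros(3), np.eye(3), R)
```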
Adaptive Kalman Filtering
Errors can occur in both of the models used in the Kalman filter. Errors in the system model arise if the system was taken as time-invariant, but was actually composed of time-dependent states. Errors in the measurement model arise from underestimating the number of components involved in the state vector (which can be thought of as incorrectly setting values in HT(k) to zero for one of the possible elements in the state vector), or from the use of inaccurate values in HT(k). Either type of error produces a suboptimal filter, in that the accuracy of the filter's estimates is severely degraded. Many methods for compensating for these model errors make use of the filter innovations sequence, ν(k), defined as
$\nu(k) = z(k) - H^T(k)\,\hat{X}(k|k-1)$  (2.11)
The innovations sequence can be used to construct a measure of the optimality of the filter; a necessary and sufficient condition for an optimal filter is that this sequence be a white noise process [10]. An optimal filter is one that minimizes the mean square estimation error $E[(X(k) - \hat{X}(k|k))^T (X(k) - \hat{X}(k|k))]$. A suboptimal filter may generate results which show large estimation errors, or even a divergence of the errors [11]. The aim of an adaptive filter is to reduce or bound these errors by modifying, or adapting, the models used in the Kalman filter to the real data.
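In practice, the whiteness condition can be checked by examining the sample autocorrelation of the innovations. The short sketch below uses the common ±2/√N significance band; this particular test is a standard rule of thumb rather than a procedure taken from the paper.

```python
import numpy as np

def innovations_are_white(nu, max_lag=20):
    """Return (is_white, rho): rho holds the sample autocorrelation of the
    innovations at lags 1..max_lag; is_white is True when all lags fall
    within the rough +/- 2/sqrt(N) band expected for a white sequence."""
    nu = np.asarray(nu, dtype=float)
    nu = nu - nu.mean()
    n = len(nu)
    denom = nu @ nu
    rho = np.array([(nu[:n - l] @ nu[l:]) / denom for l in range(1, max_lag + 1)])
    return bool(np.all(np.abs(rho) < 2.0 / np.sqrt(n))), rho
```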
Several methods for controlling error divergence in the filter have been reported [11–15]. Most involve cases where Q is poorly known, the situation which arises when the time-dependence of states is incorrectly modeled. These include methods based on Bayesian estimation and maximum likelihood estimation [14], correlation methods [14], and covariance matching techniques [14,15]. The last method has also been suggested for use when Q is known, but R is unknown, the situation that arises when the number of components in the state is underestimated, or when the measurement model is otherwise incorrect. Because errors in the number of components and in the response factors used in the measurement model are common in multicomponent chemical analysis, covariance matching is used to develop the filter discussed here.
The aim of covariance matching is to ensure that the residuals remain consistent with their theoretical covariances. The covariance of the innovations sequence ν(k) is [14]
$E[\nu^2(k)] = H^T(k)\,P(k|k-1)\,H(k) + R(k)$  (2.12)
If the actual covariance of ν(k) is much larger than the covariance obtained from the Kalman filter, either Q or R should be increased to prevent divergence. In either case, this has the effect of increasing P(k|k − 1), thus bringing the covariance given in eq 2.12 closer to the actual covariance of ν(k). This also has the effect of decreasing the filter gain matrix, K, thereby “closing” the filter to new data which would otherwise be incorrectly interpreted because of errors in the measurement model. In essence, this amounts to “covering” the errors in the model with noise, then estimating the noise variance. The adaptive estimate of R at the kth point, when Q is known, is
$\hat{R}(k) = \frac{1}{m}\sum_{j=k-m+1}^{k} \nu^2(j) \;-\; H^T(k)\,P(k|k-1)\,H(k)$  (2.13)
where m is the width of an empirically chosen rectangular smoothing window for the innovations sequence. The smoothing operation improves the statistical significance of the estimator for R(k), as it now depends on many residuals.
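The estimate of eq (2.13) is easy to compute as the filter runs. The sketch below (ours, not the authors') forms the windowed mean of the squared innovations and subtracts the theoretical term of eq (2.12); the clipping of negative values to zero is an added safeguard not discussed in the text.

```python
import numpy as np

def adaptive_R(innovations, H_k, P_prior, m):
    """Covariance-matching estimate of R(k) from the last m innovations.
    innovations: sequence of nu(j); H_k: measurement vector H(k), shape (n,);
    P_prior: predicted covariance P(k|k-1), shape (n, n); m: window width."""
    nu = np.asarray(innovations[-m:], dtype=float)
    observed = np.mean(nu ** 2)                 # smoothed squared innovations
    theoretical = H_k @ P_prior @ H_k           # H^T(k) P(k|k-1) H(k), from eq (2.12)
    # Clipping at zero keeps the variance estimate non-negative; this safeguard
    # is an assumption of ours, not a step given in the paper.
    return max(observed - theoretical, 0.0)
```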
Adaptive estimation of R allows accurate estimates for the states to be obtained, even in the presence of model errors, because only data for which an accurate model is available are used in the filter. A new measurement model can be constructed from the estimated R(k), either by augmenting the HT vector, or by correcting any one of its existing elements; choice of correction or augmentation is arbitrary. For augmentation, the equations
(2.14)
(2.15)
apply to the element that is newly incorporated into the HT vector. The term (k + m/2) arises from the lag induced by averaging m of the squared innovations. The factor b(k) is defined as
$b(k) = +1 \quad \text{if}\ \ \frac{1}{m}\sum_{j=k-m+1}^{k} \nu(j) \ge 0$  (2.16)
$b(k) = -1 \quad \text{otherwise}$  (2.17)
Equations 2.16 and 2.17 allow determination of the sign of the model error by evaluating the average of the innovations over the range for which R was calculated. Equation 2.15 reflects the fact that the relation between the chemical response and concentration, given by HT, is generally positive.
For correction of the ith component of the vector HT, the expressions
(2.18)
(2.19)
apply instead of those given in eqs 2.14 and 2.15. In either case, a valid measurement model can be generated from the adaptive estimation of R.
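Because eqs (2.14)–(2.19) are described here only qualitatively, the sketch below shows one plausible reading: the sign b(k) is taken from the mean innovation over the smoothing window (eqs 2.16 and 2.17), and the response attributed to the unmodeled component at point (k + m/2) is taken as b(k)[R̂(k)]^(1/2). The second step is an interpretation of the augmentation procedure, not a transcription of eqs (2.14) and (2.15).

```python
import numpy as np

def model_error_sign(innovations, m):
    """b(k): +1 if the mean innovation over the last m points is non-negative, else -1."""
    return 1.0 if np.mean(innovations[-m:]) >= 0.0 else -1.0

def unmodeled_response_element(innovations, R_hat, m):
    """Signed response attributed to the unmodeled component at the window
    midpoint (k + m/2); our reading of the augmentation step, not eq (2.14) verbatim."""
    return model_error_sign(innovations, m) * np.sqrt(R_hat)
```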
Two criteria must be met for this adaptive filter to be useful in the mathematical analysis of multicomponent responses. First, the model to be adaptively corrected must already be correct for some of the values of k where each of the known components of the model has a measurable response. The second requirement is that the adaptive correction must be performed on a single component. For a single sensor, R is a scalar, and it is not possible to distinguish the portions of the model error belonging to different components. It is often feasible, however, to treat model errors as a single, unmodeled component without affecting the accuracy of some or all of the estimated quantities. Although it has been observed that the adaptive estimation of R by covariance matching is not a sufficient condition for obtaining an improved measurement model [10], application of this approach in the mathematical analysis of multicomponent responses has shown that significant model improvement generally occurs in practice [15,20].
Automation of the Adaptive Filter
The adaptive filter requires an initial guess of the states and of their covariances, just as in the ordinary filter. The adaptive estimation of R affects the calculation of P, however, and it is found that the diagonal elements of P decrease as R decreases. Since the size of R is directly related to the quality of the measurement model, this relation provides a means by which the quality of the final filter estimates can be judged. Once results are obtained with minimum values for the diagonal elements of the estimated P, the resulting corrected measurement model better describes the experimental data available, judging from the deterministic variances of fitting before and after the model correction. Because the innovations are not white in the presence of model error, the filter results are no longer guaranteed to be optimal, but now depend on the initial guess. Thus, the adaptive filter must be run several times, with different initial guesses, X₀ and P₀, to locate the best estimates. This process is easily automated, however. Simplex optimization [21–23] can be used to minimize a metric based on the diagonal elements of the covariance matrix,
$Y = \sum_{i=1}^{n} P_{ii}$  (2.20)
as a function of the initial guesses input to the adaptive filter. We have previously demonstrated that the minima in the variance surface Y = f(X₀, P₀) correspond well to the minima in an error surface defined by the errors in the estimated concentrations [24].
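The automation step can be carried out with a standard simplex routine. The sketch below assumes a hypothetical function run_adaptive_filter that runs the adaptive filter for a given initial guess and returns the final covariance matrix; scipy's Nelder–Mead implementation then searches the initial guesses for the minimum of Y.

```python
import numpy as np
from scipy.optimize import minimize

def covariance_metric(x0, z, H, R0, m):
    """Return Y (eq 2.20), the sum of the diagonal elements of the final
    covariance matrix produced by an adaptive-filter run from guess x0."""
    # `run_adaptive_filter` is a hypothetical placeholder combining the pieces
    # sketched above; it is not a routine defined in the paper.
    P_final = run_adaptive_filter(z, H, x0, P0=np.eye(len(x0)), R0=R0, m=m)
    return np.trace(P_final)

# Example usage (Nelder-Mead simplex search over the initial state guess):
# result = minimize(covariance_metric, x0=np.zeros(3),
#                   args=(z, H, 1e-4, 15), method="Nelder-Mead")
# best_initial_guess = result.x
```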
3. Application in Analytical Chemistry
Empirical Model Improvement
Empirical models have been used with the Kalman filter to study the chemical speciation of metal ions. One study [20] reported the adaptive correction of the visible photoacoustic spectrum of Pr(EDTA)−. This spectrum was obtained from data collected on solutions containing both Pr3+ and Pr(EDTA)− species; direct spectroscopic measurement of Pr(EDTA)− is not simple. A similar approach was also used to obtain the spectrum of another ion, one whose spectrum is difficult to observe in the absence of related chemical species [25]. These studies demonstrate the ability of adaptive filtering to correct for “chemical” errors in the measurement model.
Two other studies used adaptive filtering to model the electrochemical response of an equilibrium mixture of Cd2+ and Cd(NTA)− [20,26]. The adaptively modeled component, attributed to the reduction of Cd2+ after dissociation of the Cd(NTA)− complex, was corrected [26] from an approximate model based on digital simulation [19]. The stability constant for the Cd(NTA)− species was estimated from the concentrations obtained from the filter. These studies illustrate the correction of “theoretical” errors in the measurement model by adaptive filtering.
The adaptive filter has also been used to correct empirical models for errors which occurred in data acquisition. An example is the correction of models used for the resolution of overlapped electrochemical responses. Resolved peaks are generally needed to obtain estimates of the component concentrations. Small changes in experimental conditions, occurring between the time when data are obtained for use in empirical models and the time when the mixtures are measured, change peak positions slightly. The resulting inaccuracy in the model degrades the accuracy of the resolution obtained with the Kalman filter. Adaptive filtering can correct for these types of model errors, resulting in substantially improved concentration estimates from multicomponent electrochemical responses [20].
Removal of Interferences
In many multicomponent analyses, substances which interfere with the chemical analysis are present. Frequently, these species must be chemically separated, because they are not easily removed in the mathematical analysis of the data. Adaptive estimation of these unknown components of the model is an alternative approach. The feasibility of this approach has been demonstrated [24] in a visible spectrophotometric analysis, where adaptive filtering was used to quantify Ni2+, Co2+, and picric acid in the presence of the “unknown” contaminant Cu2+. The errors in estimating species concentrations were typically less than 5%. An adaptive estimation of Co2+ in the presence of “unknown” Cu2+, Ni2+, and picric acid, where the interferent responses strongly overlap that of the species of interest, gave an estimation error of 14% with a five-fold excess of interferent species. The lower accuracy of this estimation results from the adaptive filter’s response when its model restrictions are not met, a situation which occurs here as a consequence of the severe overlap of the analyte and interferent responses. Even though this result is of lower accuracy than many of the others reported, it is still remarkable. Unlike the other fits, this result does not rely on the use of a complete model. Using peak resolution based on an ordinary filtering approach, with the same incomplete measurement model, an error of 200–300% is likely.
4. Conclusion
The automated, adaptive estimation of the measurement model covariance permits the application of Kalman filtering in chemical systems where models are poorly known. Although results obtained from the adaptive filter are not guaranteed to be optimal by theory, significant improvement in the accuracy of models and estimated parameters is generally possible in practice.
Restrictions are fairly minor: parts of the model must be known well enough to “open” the filter to the data, and only one component of the model may be adaptively corrected at a time. Adaptive filtering should yield results similar to those obtained from factor analysis using target transformation [27], but the adaptive filter requires only one mixture response, while factor analysis requires several.
Biography
About the authors: Steven D. Brown is now with the Department of Chemistry at the University of Delaware. Sarah C. Rutan, who was at Washington State University under a Summer Fellowship of the Analytical Division, American Chemical Society, is now with the Department of Chemistry at Virginia Commonwealth University. The work described was supported by the Division of Chemical Sciences, U.S. Department of Energy.
Footnotes
Bracketed figures indicate literature references.
Contributor Information
Steven D. Brown, Washington State University, Pullman, WA 99164
Sarah C. Rutan, Virginia Commonwealth University, Richmond, VA 23284
5. References
- [1] Brubaker, T. A.; Tracy, R.; Pomernacki, C. L., Linear parameter estimation, Anal. Chem. 50, 1017A–1024A (1978).
- [2] Meites, L., Some new techniques for the analysis and interpretation of chemical data, CRC Crit. Rev. Anal. Chem. 8, 1–53 (1979).
- [3] Brubaker, T. A.; O’Keefe, K. R., Nonlinear parameter estimation, Anal. Chem. 51, 1385A–1388A (1979).
- [4] Brubaker, T. A.; Cornett, F. N.; Pomernacki, C. L., Linear digital filtering for laboratory automation, Proc. IEEE 63, 1475–1486 (1975).
- [5] Seelig, P. F.; Blount, H. N., Kalman filter applied to anodic stripping voltammetry: theory, Anal. Chem. 48, 252–258 (1976).
- [6] Poulisse, H. N. J., Multicomponent-analysis computations based on Kalman filtering, Anal. Chim. Acta 112, 361–374 (1979).
- [7] Lawton, W. H.; Sylvestre, E. A., Self-modeling curve resolution, Technometrics 13, 617–633 (1971).
- [8] Ohta, N., Estimating absorption bands of component dyes by means of principal component analysis, Anal. Chem. 45, 553–557 (1973).
- [9] Kalman, R. E., A new approach to linear filtering and prediction problems, Trans. ASME Ser. D, J. Basic Eng. 82, 34–45 (1960).
- [10] Gelb, A. (ed.), Applied Optimal Estimation, MIT Press: Cambridge, MA (1974).
- [11] Fitzgerald, R. J., Divergence of the Kalman filter, IEEE Trans. Autom. Control AC-16, 736–747 (1971).
- [12] Jazwinski, A. H., Stochastic Processes and Filtering Theory, Academic Press: New York, NY (1970), chapters 7–8.
- [13] Mehra, R. K., On the identification of variances and adaptive Kalman filtering, IEEE Trans. Autom. Control AC-15, 175–184 (1970).
- [14] Mehra, R. K., Approaches to adaptive filtering, IEEE Trans. Autom. Control AC-17, 693–698 (1972).
- [15] Nahi, N. E., Decision-directed adaptive recursive estimators: divergence prevention, IEEE Trans. Autom. Control AC-17, 61–68 (1972).
- [16] Brown, T. F.; Brown, S. D., Resolution of overlapped electrochemical peaks with the use of the Kalman filter, Anal. Chem. 53, 1410–1417 (1981).
- [17] Rutan, S. C.; Brown, S. D., Pulsed photoacoustic spectroscopy and spectral deconvolution with the Kalman filter for determination of metal complexation parameters, Anal. Chem. 55, 1707–1710 (1983).
- [18] Kaminski, P. G.; Bryson, A. E.; Schmidt, S. F., Discrete square root filtering: a survey of current techniques, IEEE Trans. Autom. Control AC-16, 727–735 (1971).
- [19] Brown, T. F.; Caster, D. M.; Brown, S. D., Estimation of electrochemical charge transfer parameters with the Kalman filter, Anal. Chem. 56, 1214–1221 (1984).
- [20] Rutan, S. C.; Brown, S. D., Adaptive Kalman filtering used to compensate for model errors in multicomponent analysis, Anal. Chim. Acta 160, 99–119 (1984).
- [21] Deming, S. N.; Morgan, S. L., Simplex optimization of variables in analytical chemistry, Anal. Chem. 45, 278A–283A (1973).
- [22] Nelder, J. A.; Mead, R., A simplex method for function minimization, Comp. J. 7, 308–313 (1965).
- [23] O’Neill, R., Function minimization using a simplex procedure, Appl. Statistics 13, 338–345 (1971).
- [24] Rutan, S. C.; Brown, S. D., Simplex optimization of the adaptive Kalman filter, Anal. Chim. Acta 167, 39–50 (1985).
- [25] Irving, D.; Brown, T. F., Uranium speciation studies using the Kalman filter, Abstract No. 549, Proceedings, Pittsburgh Conference on Analytical Chemistry and Applied Spectroscopy, New Orleans, LA (1985).
- [26] Brown, T. F.; Caster, D. M.; Brown, S. D., Speciation of labile and quasi-labile metal complex systems using the Kalman filter, NBS Spec. Pub. 618, 163–170 (1981).
- [27] Malinowski, E. R.; Howery, D. G., Factor Analysis in Chemistry, Wiley: New York, NY (1980).