- An observation is defined as $z = h(x, v)$, where $x$ and $z$ denote the unknown vector and the measurement vector. $h$ is a known function of $x$ and $v$, and $v$ is the observation noise with the probability density $p_v(v)$.
- It is assumed that $x$ is a random variable with an a priori probability density $p_x(x)$ before the observation.
- The goal is to compute the “best” estimate $\hat{x}$ of $x$ using the observation $z$.
Optimal Estimation
- The optimal estimate is defined based on a cost function $J(\tilde{x})$ of the estimation error $\tilde{x} = x - \hat{x}$: $\hat{x}^{*} = \arg\min_{\hat{x}} E\left[J(x - \hat{x}) \mid z\right]$
- Some typical cost functions:
- Minimum Mean Square Error ($J(\tilde{x}) = \tilde{x}^T \tilde{x}$): $\hat{x}_{MMSE} = E[x \mid z]$, the mean of the a posteriori density.
- Absolute Value ($J(\tilde{x}) = |\tilde{x}|$): $\hat{x}_{ABS}$ is the median of the a posteriori density $p(x \mid z)$.
- Maximum a Posteriori ($J(\tilde{x})$ uniform outside a vanishing region around $\tilde{x} = 0$): $\hat{x}_{MAP} = \arg\max_{x} p(x \mid z)$, the mode of the a posteriori density.
- It can be shown that:
- If the a posteriori density function $p(x \mid z)$ has only one maximum and it is symmetric with respect to that maximum, then all the above estimates are equal to the conditional mean $E[x \mid z]$.
- In fact, assuming these conditions for $p(x \mid z)$, $E[x \mid z]$ is the optimal estimate for any cost function $J$ that satisfies $J(\tilde{x}) = J(-\tilde{x})$ and is nondecreasing with the distance $\|\tilde{x}\|$ (Sherman’s Theorem).
- Maximum Likelihood Estimation: $\hat{x}_{ML}$ is the value of $x$ that maximizes the probability of observing $z$: $\hat{x}_{ML} = \arg\max_{x} p(z \mid x)$
- It can be shown that $\hat{x}_{MAP} = \hat{x}_{ML}$ if there is no a priori information about $x$ (i.e., $p_x(x)$ is flat over the region of interest).
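These definitions can be illustrated with a minimal numerical sketch. The prior, the observation model, and all numbers below are illustrative assumptions; for a Gaussian prior and a linear Gaussian observation the posterior is unimodal and symmetric, so the mean (MMSE), the median (absolute value), and the mode (MAP) coincide, as stated above.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 200_001)        # grid over the unknown x
dx = x[1] - x[0]
x_bar, sx2 = 1.0, 4.0                        # assumed prior mean and variance
h, sv2, z = 2.0, 1.0, 3.5                    # assumed observation z = h*x + v

prior = np.exp(-(x - x_bar) ** 2 / (2 * sx2))           # p(x), up to a constant
like = np.exp(-(z - h * x) ** 2 / (2 * sv2))            # p(z|x), up to a constant
post = prior * like
post /= post.sum() * dx                                 # normalized p(x|z)

mmse = (x * post).sum() * dx                            # posterior mean  (MMSE)
median = x[np.searchsorted(np.cumsum(post) * dx, 0.5)]  # posterior median (ABS)
map_est = x[np.argmax(post)]                            # posterior mode  (MAP)
print(mmse, median, map_est)                            # all three agree
```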
Linear Gaussian Observation
Consider the following observation: $z = Hx + v$, where $v \sim \mathcal{N}(0, R)$ is a Gaussian random vector and the matrices $H$ and $R$ are known.
In this observation, $x$ is estimable if $H$ has full column rank; otherwise there will be infinitely many solutions to the problem.
If $R$ is invertible, then: $p(z \mid x) = \frac{1}{(2\pi)^{m/2} |R|^{1/2}} \exp\left(-\frac{1}{2}(z - Hx)^T R^{-1} (z - Hx)\right)$, where $m$ is the dimension of $z$.
The maximum likelihood estimate can be computed as: $\hat{x}_{ML} = \arg\max_{x} p(z \mid x) = (H^T R^{-1} H)^{-1} H^T R^{-1} z$
It is very interesting that $\hat{x}_{ML}$ is the Weighted Least Square (WLS) solution to the equation $z = Hx$ with the weight matrix $W = R^{-1}$, i.e. $\hat{x}_{ML} = \arg\min_{x} (z - Hx)^T R^{-1} (z - Hx)$
$\hat{x}_{ML}$ is an unbiased estimate: $E[\hat{x}_{ML}] = (H^T R^{-1} H)^{-1} H^T R^{-1} E[z] = (H^T R^{-1} H)^{-1} H^T R^{-1} H x = x$
The covariance of the estimation error is: $P_{ML} = E[(x - \hat{x}_{ML})(x - \hat{x}_{ML})^T] = (H^T R^{-1} H)^{-1}$
$\hat{x}_{ML}$ is efficient in the sense of the Cramér–Rao bound: its error covariance attains the bound given by the inverse Fisher information $H^T R^{-1} H$.
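As a quick check of these properties, the following sketch applies the WLS formula to a made-up observation model ($H$, $R$, and the true $x$ below are illustrative assumptions) and verifies unbiasedness and the error covariance by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # full column rank
R = np.diag([0.5, 1.0, 2.0])
x_true = np.array([1.0, -2.0])

Rinv = np.linalg.inv(R)
P_ml = np.linalg.inv(H.T @ Rinv @ H)                 # predicted error covariance

v = rng.multivariate_normal(np.zeros(3), R, size=20_000)
z = H @ x_true + v                                   # 20000 noisy observations
x_hat = (P_ml @ H.T @ Rinv @ z.T).T                  # ML/WLS estimate per sample

print(x_hat.mean(axis=0))                            # ~ x_true  ->  unbiased
print(np.cov(x_hat.T))                               # ~ P_ml = (H^T R^-1 H)^-1
```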
Example: Consider the following linear Gaussian observation: $z = hx + v$, where $h$ is a nonzero real number and $v \sim \mathcal{N}(0, \sigma_v^2)$ is the observation noise.
Maximum a Posteriori Estimation: To compute $\hat{x}_{MAP}$, it is assumed that the a priori density of $x$ is Gaussian with mean $\bar{x}$ and variance $\sigma_x^2$: $p_x(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left(-\frac{(x - \bar{x})^2}{2\sigma_x^2}\right)$
The conditions of Sherman’s Theorem are satisfied and therefore: $\hat{x}_{MAP} = E[x \mid z] = \bar{x} + \frac{h\sigma_x^2}{h^2\sigma_x^2 + \sigma_v^2}\,(z - h\bar{x})$
Estimation bias: $E[x - \hat{x}_{MAP}] = 0$, i.e. the estimate is unbiased.
Estimation error covariance: $p_{MAP} = E[(x - \hat{x}_{MAP})^2] = \frac{\sigma_x^2 \sigma_v^2}{h^2\sigma_x^2 + \sigma_v^2}$
Maximum Likelihood Estimation: For this example, we have: $p(z \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left(-\frac{(z - hx)^2}{2\sigma_v^2}\right)$
With this information: $\hat{x}_{ML} = \arg\max_{x} p(z \mid x) = \frac{z}{h}$
Estimation bias: $E[x - \hat{x}_{ML}] = x - \frac{E[z]}{h} = x - \frac{hx}{h} = 0$, i.e. the estimate is unbiased.
Estimation error covariance: $p_{ML} = E[(x - \hat{x}_{ML})^2] = \frac{\sigma_v^2}{h^2}$
Comparing $\hat{x}_{MAP}$ and $\hat{x}_{ML}$, we have: $\lim_{\sigma_x^2 \to \infty} \hat{x}_{MAP} = \hat{x}_{ML}$. It means that if there is no a priori information about $x$, the two estimates are equal.
For the error covariance, we have: $\frac{1}{p_{MAP}} = \frac{h^2}{\sigma_v^2} + \frac{1}{\sigma_x^2} = \frac{1}{p_{ML}} + \frac{1}{\sigma_x^2}$
In other words, information after the observation is the sum of the information of the observation and the information before the observation, where information is the inverse of the error variance.
Estimation error covariance: accordingly, with no a priori information, $\lim_{\sigma_x^2 \to \infty} p_{MAP} = p_{ML}$.
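The example can be reproduced numerically. The numbers below are arbitrary assumptions; the script evaluates the MAP and ML formulas above and checks the information-addition identity:

```python
# Scalar example: z = h*x + v with prior x ~ N(x_bar, sx2), noise v ~ N(0, sv2)
h, sv2 = 2.0, 1.0          # assumed observation gain and noise variance
x_bar, sx2 = 0.5, 4.0      # assumed prior mean and variance
z = 3.0                    # assumed measured value

x_ml = z / h                                   # ML ignores the prior
p_ml = sv2 / h**2

k = h * sx2 / (h**2 * sx2 + sv2)               # MAP blends prior and data
x_map = x_bar + k * (z - h * x_bar)
p_map = sx2 * sv2 / (h**2 * sx2 + sv2)

print(1 / p_map, 1 / p_ml + 1 / sx2)           # equal: information adds up
print(x_map, x_ml)                             # x_map -> x_ml as sx2 -> infinity
```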
It is possible to include a priori information in maximum likelihood estimation.
The a priori distribution of $x$, $x \sim \mathcal{N}(\bar{x}, P_0)$, can be rewritten as the following observation: $\bar{x} = x + w$, where $w \sim \mathcal{N}(0, P_0)$ is the observation noise.
Combined observation: $z_c = H_c x + v_c$, where: $z_c = \begin{bmatrix} z \\ \bar{x} \end{bmatrix}, \quad H_c = \begin{bmatrix} H \\ I \end{bmatrix}, \quad v_c = \begin{bmatrix} v \\ w \end{bmatrix}$
The assumption is that $v$ and $w$ are independent. Therefore: $R_c = E[v_c v_c^T] = \begin{bmatrix} R & 0 \\ 0 & P_0 \end{bmatrix}$
Maximum likelihood estimation: $\hat{x}_{ML,c} = (H_c^T R_c^{-1} H_c)^{-1} H_c^T R_c^{-1} z_c = (H^T R^{-1} H + P_0^{-1})^{-1} (H^T R^{-1} z + P_0^{-1} \bar{x})$
$\hat{x}_{ML,c}$ is unbiased and has the same error covariance as $\hat{x}_{MAP}$: $P_c = (H^T R^{-1} H + P_0^{-1})^{-1}$
Therefore $\hat{x}_{ML,c}$ and $\hat{x}_{MAP}$ are equivalent.
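A small sketch of this equivalence, with an assumed two-dimensional $x$ and scalar $z$: applying the plain ML/WLS formula to the combined observation reproduces the MAP (information-form) estimate.

```python
import numpy as np

H = np.array([[1.0, 1.0]])                    # assumed observation matrix
R = np.array([[0.5]])
x_bar = np.array([0.0, 0.0])                  # assumed prior mean
P0 = np.diag([2.0, 3.0])                      # assumed prior covariance
z = np.array([1.2])                           # assumed measurement

Hc = np.vstack([H, np.eye(2)])                # combined H
zc = np.concatenate([z, x_bar])               # combined measurement vector
Rc = np.block([[R, np.zeros((1, 2))],
               [np.zeros((2, 1)), P0]])       # block-diagonal by independence

Rci = np.linalg.inv(Rc)
P = np.linalg.inv(Hc.T @ Rci @ Hc)            # combined error covariance
x_hat = P @ Hc.T @ Rci @ zc                   # combined ML estimate

# Same result from the MAP / information form:
Pi = np.linalg.inv(P0) + H.T @ np.linalg.inv(R) @ H
x_map = np.linalg.solve(Pi, np.linalg.inv(P0) @ x_bar
                            + H.T @ np.linalg.inv(R) @ z)
print(x_hat, x_map)                           # identical (up to round-off)
```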
Standard Kalman Filter
Consider the following linear system: $x_{k+1} = A x_k + B u_k + w_k$, $z_k = H x_k + v_k$, where $x_k$, $z_k$ denote the state vector and measurement vector at time $k$.
$w_k \sim \mathcal{N}(0, Q)$ and $v_k \sim \mathcal{N}(0, R)$ are independent Gaussian white noise processes, where $R$ is invertible.
It is assumed that there is an a priori estimate of $x_0$, denoted by $\hat{x}_{0|-1}$, which is assumed to be unbiased with a Gaussian estimation error, independent of $w_k$ and $v_k$: $x_0 - \hat{x}_{0|-1} \sim \mathcal{N}(0, P_{0|-1})$, where $P_{0|-1}$ is invertible.
The Kalman filter is a recursive algorithm to compute the state estimate.
Output Measurement: Information in $\hat{x}_{k|k-1}$ and $z_k$ can be written as the following observation: $\begin{bmatrix} \hat{x}_{k|k-1} \\ z_k \end{bmatrix} = \begin{bmatrix} I \\ H \end{bmatrix} x_k + \begin{bmatrix} -e_{k|k-1} \\ v_k \end{bmatrix}$, with $e_{k|k-1} = x_k - \hat{x}_{k|k-1}$. Considering the independence of $e_{k|k-1}$ and $v_k$, we have: $\operatorname{cov}\begin{bmatrix} -e_{k|k-1} \\ v_k \end{bmatrix} = \begin{bmatrix} P_{k|k-1} & 0 \\ 0 & R \end{bmatrix}$
Using the Weighted Least Square (WLS) solution and the matrix inversion formula: $\hat{x}_{k|k} = \left(P_{k|k-1}^{-1} + H^T R^{-1} H\right)^{-1} \left(P_{k|k-1}^{-1} \hat{x}_{k|k-1} + H^T R^{-1} z_k\right)$
Assuming: $K_k = \left(P_{k|k-1}^{-1} + H^T R^{-1} H\right)^{-1} H^T R^{-1}$
We have: $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \left(z_k - H \hat{x}_{k|k-1}\right)$
The state estimate is the sum of the a priori estimate and a multiple of the output prediction error. Since: $K_k = P_{k|k-1} H^T \left(H P_{k|k-1} H^T + R\right)^{-1}$
$K_k$ is the Kalman filter gain.
Estimation error covariance: $P_{k|k} = (I - K_k H) P_{k|k-1}$
Information: $P_{k|k}^{-1} = P_{k|k-1}^{-1} + H^T R^{-1} H$, where the inverse of the error covariance plays the role of an information matrix.
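The following sketch checks, on assumed example matrices, that the WLS (information) form and the gain form of the measurement update agree:

```python
import numpy as np

H = np.array([[1.0, 0.0]])                    # assumed output matrix
R = np.array([[0.1]])                         # assumed measurement noise
x_prior = np.array([0.0, 1.0])                # a priori estimate x_{k|k-1}
P_prior = np.diag([1.0, 2.0])                 # a priori covariance P_{k|k-1}
z = np.array([0.3])                           # assumed measurement

# Information form: P_{k|k}^{-1} = P_{k|k-1}^{-1} + H^T R^{-1} H
Pi = np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H
x_info = np.linalg.solve(Pi, np.linalg.inv(P_prior) @ x_prior
                             + H.T @ np.linalg.inv(R) @ z)

# Gain form: K = P H^T (H P H^T + R)^{-1}
K = P_prior @ H.T @ np.linalg.inv(H @ P_prior @ H.T + R)
x_gain = x_prior + K @ (z - H @ x_prior)
P_post = (np.eye(2) - K @ H) @ P_prior

print(x_info, x_gain)                         # identical (up to round-off)
print(np.linalg.inv(Pi) - P_post)             # ~ zero matrix
```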
State Update: To complete a recursive algorithm, we need to compute $\hat{x}_{k+1|k}$ and $P_{k+1|k}$.
Information: the a posteriori estimate and the state equation can be written as the following observation: $\hat{x}_{k|k} = x_k - e_{k|k}$, $x_{k+1} = A x_k + B u_k + w_k$
By removing $x_k$ from the above observation, we have: $x_{k+1} = A \hat{x}_{k|k} + B u_k + A e_{k|k} + w_k$
It is easy to see: $\hat{x}_{k+1|k} = A \hat{x}_{k|k} + B u_k$
Estimation error: $e_{k+1|k} = x_{k+1} - \hat{x}_{k+1|k} = A e_{k|k} + w_k$
Estimation covariance: $P_{k+1|k} = E\left[e_{k+1|k} e_{k+1|k}^T\right] = A P_{k|k} A^T + Q$
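A one-step sketch of the state update with assumed matrices:

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])        # assumed system matrices
B = np.array([[0.5], [1.0]])
Q = 0.01 * np.eye(2)
u = np.array([0.1])                           # assumed input u_k

x_post = np.array([0.3, 0.9])                 # x_{k|k} from the measurement update
P_post = np.diag([0.05, 0.1])                 # P_{k|k}

x_pred = A @ x_post + B @ u                   # x_{k+1|k} = A x_{k|k} + B u_k
P_pred = A @ P_post @ A.T + Q                 # P_{k+1|k} = A P_{k|k} A^T + Q
print(x_pred, P_pred)
```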
Summary:
Initial Conditions: $\hat{x}_{0|-1}$ and its error covariance $P_{0|-1}$.
Gain Calculation: $K_k = P_{k|k-1} H^T \left(H P_{k|k-1} H^T + R\right)^{-1}$
Output Measurement: $\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k (z_k - H \hat{x}_{k|k-1})$, $P_{k|k} = (I - K_k H) P_{k|k-1}$
State Update: $\hat{x}_{k+1|k} = A \hat{x}_{k|k} + B u_k$, $P_{k+1|k} = A P_{k|k} A^T + Q$
Go to gain calculation and continue the loop for $k + 1$.
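The summary above can be implemented directly. The following sketch runs the loop on a made-up second-order system with no input; all matrices and noise levels are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 1.0], [0.0, 1.0]])         # assumed system matrices
H = np.array([[1.0, 0.0]])
Q, R = 0.01 * np.eye(2), np.array([[0.25]])

x = np.array([0.0, 1.0])                       # true initial state
x_hat = np.zeros(2)                            # a priori estimate of x_0
P = np.eye(2)                                  # its error covariance

for k in range(50):
    z = H @ x + rng.multivariate_normal([0.0], R)
    # Gain calculation
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    # Output measurement (measurement update)
    x_hat = x_hat + K @ (z - H @ x_hat)
    P = (np.eye(2) - K @ H) @ P
    # State update (time update, no input here)
    x_hat = A @ x_hat
    P = A @ P @ A.T + Q
    # Simulate the true system forward
    x = A @ x + rng.multivariate_normal(np.zeros(2), Q)

print(x_hat, x)                                # estimate tracks the true state
```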
Remarks:
- Estimation residue: $r_k = z_k - H \hat{x}_{k|k-1}$
- Residue covariance: $S_k = E[r_k r_k^T] = H P_{k|k-1} H^T + R$
- The residue signal is used for monitoring the performance of the Kalman filter: when the model is correct, the residue is a zero-mean white sequence (a numerical sketch is given at the end of this section).
- Modeling error, round-off error, disturbance, correlation between input and measurement noise, and other factors might cause a biased and colored residue.
- The residue signal can be used in Fault Detection and Isolation (FDI).
- The standard Kalman filter is not numerically robust because it contains matrix inversion. For example, the calculated error covariance matrix might not be positive definite because of computational errors.
- There are different implementations of the Kalman filter to improve the standard Kalman filter in the following aspects:
- Computational efficiency
- Dealing with disturbance or unknown inputs
- Handling singular systems (difference algebraic equations)
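The residue remark above can be illustrated as follows. On an assumed scalar system with a correct model, the normalized residue comes out zero-mean, unit-variance, and approximately white:

```python
import numpy as np

rng = np.random.default_rng(2)
A, H = np.array([[0.9]]), np.array([[1.0]])        # assumed scalar system
Q, R = np.array([[0.01]]), np.array([[0.04]])

x, x_hat, P = np.array([0.5]), np.array([0.0]), np.array([[1.0]])
resid = []
for k in range(2000):
    z = H @ x + rng.multivariate_normal([0.0], R)
    r = z - H @ x_hat                              # residue z_k - H x_{k|k-1}
    S = H @ P @ H.T + R                            # residue covariance
    resid.append(r.item() / np.sqrt(S.item()))     # normalized innovation
    K = P @ H.T @ np.linalg.inv(S)
    x_hat = A @ (x_hat + K @ r)                    # measurement + time update
    P = A @ (np.eye(1) - K @ H) @ P @ A.T + Q
    x = A @ x + rng.multivariate_normal([0.0], Q)  # simulate the true system

resid = np.array(resid)
print(resid.mean(), resid.var())                   # ~0 mean, ~1 variance
print(np.corrcoef(resid[:-1], resid[1:])[0, 1])    # ~0 lag-1 correlation: white
```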