- An **observation** is defined as
  $$z = h(x) + v,$$
  where $x$ and $z$ denote the unknown vector and the measurement vector, $h$ is a function of $x$, and $v$ is the observation noise with probability density $p_v(v)$.
- It is assumed that $x$ is a random variable with an a priori probability density $p_x(x)$ before the observation.
- The goal is to compute the “best” estimation of $x$ using the observation $z$.

## Optimal Estimation

- The optimal estimation is defined based on a cost function $J(\tilde{x})$ of the estimation error $\tilde{x} = x - \hat{x}$:
  $$\hat{x} = \arg\min_{\hat{x}} E\left[J(x - \hat{x}) \mid z\right]$$
- Some typical cost functions:
  - Minimum Mean Square Error ($J(\tilde{x}) = \tilde{x}^T \tilde{x}$): $\hat{x}_{MMSE} = E[x \mid z]$
  - Absolute Value ($J(\tilde{x}) = |\tilde{x}|$): $\hat{x}_{ABS}$ is the median of $p(x \mid z)$
  - Maximum a Posteriori: $\hat{x}_{MAP} = \arg\max_x p(x \mid z)$

- It can be shown that:
  - If the a posteriori density function $p(x \mid z)$ has only one maximum and it is symmetric with respect to $E[x \mid z]$, then all the above estimates are equal to $E[x \mid z]$.
  - In fact, assuming these conditions for $p(x \mid z)$, $\hat{x} = E[x \mid z]$ is the optimal estimation for any cost function $J$ if $J(\tilde{x}) = J(-\tilde{x})$ and $J$ is nondecreasing with distance (**Sherman’s Theorem**).
- **Maximum Likelihood Estimation:** $\hat{x}_{ML}$ is the value of $x$ that maximizes the probability of observing $z$:
  $$\hat{x}_{ML} = \arg\max_x p(z \mid x)$$
- It can be shown that $\hat{x}_{MAP} = \hat{x}_{ML}$ if there is no a priori information about $x$.
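The coincidence of the three estimates for a symmetric, unimodal posterior can be checked numerically. The sketch below (the posterior parameters are made up for illustration) evaluates a Gaussian $p(x \mid z)$ on a grid and computes its mean (MMSE), median (absolute-value cost), and mode (MAP):

```python
import numpy as np

# Numerical sketch of Sherman's Theorem for a symmetric, unimodal posterior:
# for an assumed Gaussian p(x | z) = N(2, 0.5^2), the posterior mean (MMSE),
# median (absolute-value cost), and mode (MAP) all coincide.
mean, std = 2.0, 0.5
xs = np.linspace(mean - 6 * std, mean + 6 * std, 100001)
p = np.exp(-0.5 * ((xs - mean) / std) ** 2)

dx = xs[1] - xs[0]
p /= p.sum() * dx                          # normalize the density on the grid

x_mmse = np.sum(xs * p) * dx               # posterior mean
cdf = np.cumsum(p) * dx
x_med = xs[np.searchsorted(cdf, 0.5)]      # posterior median
x_map = xs[np.argmax(p)]                   # posterior mode

print(x_mmse, x_med, x_map)                # all three are numerically equal
```

For an asymmetric posterior the three estimates would separate, which is why the symmetry condition in the theorem matters.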

## Linear Gaussian Observation

Consider the following observation:
$$z = Hx + v,$$
where $v \sim N(0, R)$ is a Gaussian random vector and the matrices $H$ and $R$ are known.

In this observation, $x$ is estimable if $H$ has full column rank; otherwise there will be infinitely many solutions to the problem.

If $R$ is invertible, then:
$$p(z \mid x) = \frac{1}{\sqrt{(2\pi)^m \det R}} \exp\left(-\frac{1}{2}(z - Hx)^T R^{-1} (z - Hx)\right)$$

The maximum likelihood estimation can be computed as:
$$\hat{x}_{ML} = (H^T R^{-1} H)^{-1} H^T R^{-1} z$$

It is very interesting that $\hat{x}_{ML}$ is the Weighted Least Square (WLS) solution of the equation $z = Hx$ with the weight matrix $W = R^{-1}$, i.e.
$$\hat{x}_{ML} = \arg\min_x \, (z - Hx)^T R^{-1} (z - Hx)$$

$\hat{x}_{ML}$ is an unbiased estimation:
$$E[\hat{x}_{ML}] = (H^T R^{-1} H)^{-1} H^T R^{-1} (Hx + E[v]) = x$$

The covariance of the estimation error is:
$$P_{ML} = E\left[(\hat{x}_{ML} - x)(\hat{x}_{ML} - x)^T\right] = (H^T R^{-1} H)^{-1}$$
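The equivalence between the closed-form ML estimate and the WLS solution can be verified numerically. In the sketch below (the values of $H$, $R$, and $x$ are made up for illustration), the WLS solution is obtained by whitening the observation with a Cholesky factor of $R$ and solving an ordinary least-squares problem:

```python
import numpy as np

# Sketch: for z = H x + v with v ~ N(0, R), the closed-form ML estimate
# (H' R^-1 H)^-1 H' R^-1 z matches the WLS solution with weight R^-1.
rng = np.random.default_rng(0)
H = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # full column rank
R = np.diag([0.5, 1.0, 2.0])                          # noise covariance
x_true = np.array([1.0, -2.0])
z = H @ x_true + rng.multivariate_normal(np.zeros(3), R)

Rinv = np.linalg.inv(R)
x_ml = np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ z)

# Equivalent WLS: whiten with L^-1 where R = L L', then ordinary least squares.
L = np.linalg.cholesky(R)
x_wls, *_ = np.linalg.lstsq(np.linalg.solve(L, H),
                            np.linalg.solve(L, z), rcond=None)

P = np.linalg.inv(H.T @ Rinv @ H)    # estimation error covariance
print(x_ml, x_wls, P)
```

Whitening works because multiplying the observation by $L^{-1}$ turns the noise covariance into the identity, so unweighted least squares on the transformed system is exactly WLS with weight $R^{-1}$.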

$\hat{x}_{ML}$ is *efficient* in the sense of the Cramér–Rao bound.

**Example:** Consider the following linear Gaussian observation:
$$z = hx + v,$$
where $h$ is a nonzero real number and $v \sim N(0, \sigma_v^2)$ is the observation noise.

**Maximum a Posteriori Estimation:** To compute $\hat{x}_{MAP}$, it is assumed that the a priori density of $x$ is Gaussian with mean $\bar{x}$ and variance $\sigma_x^2$:
$$p_x(x) = \frac{1}{\sqrt{2\pi}\,\sigma_x} \exp\left(-\frac{(x - \bar{x})^2}{2\sigma_x^2}\right)$$
The conditions of Sherman’s Theorem are satisfied and therefore:
$$\hat{x}_{MAP} = E[x \mid z] = \bar{x} + \frac{h\sigma_x^2}{h^2\sigma_x^2 + \sigma_v^2}(z - h\bar{x})$$

Estimation bias:
$$E[\hat{x}_{MAP} - x] = 0$$

Estimation error covariance:
$$P_{MAP} = \left(\frac{1}{\sigma_x^2} + \frac{h^2}{\sigma_v^2}\right)^{-1} = \frac{\sigma_x^2 \sigma_v^2}{h^2\sigma_x^2 + \sigma_v^2}$$

**Maximum Likelihood Estimation:** For this example, we have:
$$p(z \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma_v} \exp\left(-\frac{(z - hx)^2}{2\sigma_v^2}\right)$$
With this information:
$$\hat{x}_{ML} = \frac{z}{h}$$

Estimation bias:
$$E[\hat{x}_{ML} - x] = E\left[\frac{hx + v}{h} - x\right] = 0$$

Estimation error covariance:
$$P_{ML} = \frac{\sigma_v^2}{h^2}$$

Comparing $\hat{x}_{MAP}$ and $\hat{x}_{ML}$, we have:
$$\lim_{\sigma_x^2 \to \infty} \hat{x}_{MAP} = \hat{x}_{ML}$$
It means that if there is no a priori information about $x$ (an infinitely wide prior), the two estimations are equal.
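This limit is easy to see numerically. The sketch below (the numbers for $h$, $\sigma_v^2$, $\bar{x}$, and $z$ are made up) evaluates the MAP formula for a very small and a very large prior variance:

```python
# Sketch of the scalar example z = h*x + v with prior x ~ N(xbar, sx2) and
# noise v ~ N(0, sv2): as sx2 grows, the MAP estimate approaches ML = z/h.
h, sv2 = 2.0, 0.25
xbar, z = 1.0, 3.0

def x_map(sx2):
    # MAP estimate: prior mean corrected by the weighted innovation z - h*xbar
    return xbar + (h * sx2) / (h ** 2 * sx2 + sv2) * (z - h * xbar)

x_ml = z / h                     # ML estimate ignores the prior
print(x_map(1e-6), x_map(1e6), x_ml)
# A tiny prior variance pins the estimate to xbar; a huge one recovers z/h.
```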

For the error covariance, we have:
$$\frac{1}{P_{MAP}} = \frac{1}{\sigma_x^2} + \frac{h^2}{\sigma_v^2} = \frac{1}{\sigma_x^2} + \frac{1}{P_{ML}}$$

In other words, the information after the observation is the sum of the information of the observation and the information before the observation.

Estimation error covariance:
$$\lim_{\sigma_x^2 \to \infty} P_{MAP} = P_{ML}$$

It is possible to include a priori information in maximum likelihood estimation.

The a priori distribution of $x$, $N(\bar{x}, \sigma_x^2)$, can be rewritten as the following observation:
$$\bar{x} = x + w,$$
where $w \sim N(0, \sigma_x^2)$ is the observation noise.

**Combined observation**:
$$\tilde{z} = \tilde{H} x + \tilde{v},$$
where:
$$\tilde{z} = \begin{bmatrix} z \\ \bar{x} \end{bmatrix}, \quad \tilde{H} = \begin{bmatrix} h \\ 1 \end{bmatrix}, \quad \tilde{v} = \begin{bmatrix} v \\ w \end{bmatrix}$$

The assumption is that $v$ and $w$ are *independent*. Therefore:
$$\tilde{R} = E[\tilde{v}\tilde{v}^T] = \begin{bmatrix} \sigma_v^2 & 0 \\ 0 & \sigma_x^2 \end{bmatrix}$$

Maximum likelihood estimation:
$$\hat{x}_{ML,c} = (\tilde{H}^T \tilde{R}^{-1} \tilde{H})^{-1} \tilde{H}^T \tilde{R}^{-1} \tilde{z} = \left(\frac{h^2}{\sigma_v^2} + \frac{1}{\sigma_x^2}\right)^{-1} \left(\frac{h}{\sigma_v^2} z + \frac{1}{\sigma_x^2}\bar{x}\right) = \hat{x}_{MAP}$$

$\hat{x}_{ML,c}$ is unbiased and has the same error covariance as $\hat{x}_{MAP}$.

Therefore $\hat{x}_{ML,c}$ and $\hat{x}_{MAP}$ are equivalent.
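This equivalence can be confirmed numerically. The sketch below (the numbers for $h$, $\sigma_v^2$, $\sigma_x^2$, $\bar{x}$, and $z$ are made up) stacks the prior as an extra measurement and runs the ML formula on the combined observation:

```python
import numpy as np

# Sketch of the combined observation: the prior x ~ N(xbar, sx2) is rewritten
# as the extra measurement xbar = x + w, w ~ N(0, sx2), stacked under
# z = h*x + v. ML on the stacked observation reproduces the MAP estimate.
h, sv2 = 2.0, 0.25
xbar, sx2 = 1.0, 0.5
z = 3.0

# Stacked observation: [z, xbar]' = [h, 1]' x + [v, w]', cov = diag(sv2, sx2)
H = np.array([[h], [1.0]])
Rinv = np.diag([1 / sv2, 1 / sx2])
zt = np.array([z, xbar])
x_ml_comb = np.linalg.solve(H.T @ Rinv @ H, H.T @ Rinv @ zt).item()

# Direct MAP estimate from the scalar formula
x_map = xbar + (h * sx2) / (h ** 2 * sx2 + sv2) * (z - h * xbar)

print(x_ml_comb, x_map)    # the two estimates agree
```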

## Standard Kalman Filter

Consider the following linear system:
$$x_{k+1} = Ax_k + w_k, \qquad z_k = Cx_k + v_k,$$
where $x_k$ and $z_k$ denote the state vector and the measurement vector at time $k$.

$w_k \sim N(0, Q)$ and $v_k \sim N(0, R)$ are independent Gaussian white noise processes, where $R$ is invertible.

It is assumed that there is an a priori estimation of $x_k$, denoted by $\hat{x}_k^-$, which is assumed to be unbiased with a Gaussian estimation error, independent of $w_k$ and $v_k$:
$$e_k = x_k - \hat{x}_k^- \sim N(0, P_k^-),$$
where $P_k^-$ is invertible.

The Kalman filter is a recursive algorithm to compute the state estimation.

**Output Measurement:** The information in $\hat{x}_k^-$ and $z_k$ can be written as the following observation:
$$\begin{bmatrix} \hat{x}_k^- \\ z_k \end{bmatrix} = \begin{bmatrix} I \\ C \end{bmatrix} x_k + \begin{bmatrix} -e_k \\ v_k \end{bmatrix}$$
Considering the independence of $e_k$ and $v_k$, we have:
$$\tilde{R} = \begin{bmatrix} P_k^- & 0 \\ 0 & R \end{bmatrix}$$
Using the Weighted Least Square (WLS) solution and the matrix inversion formula:
$$\hat{x}_k = \hat{x}_k^- + P_k^- C^T (C P_k^- C^T + R)^{-1} (z_k - C\hat{x}_k^-)$$

Assuming:
$$K_k = P_k^- C^T (C P_k^- C^T + R)^{-1}$$

We have:
$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - C\hat{x}_k^-)$$

The state estimation is the sum of the a priori estimation and a multiple of the output prediction error $z_k - C\hat{x}_k^-$; the matrix $K_k$ is the Kalman filter gain.

Estimation error covariance:
$$P_k = (I - K_k C) P_k^-$$

Information:
$$P_k^{-1} = (P_k^-)^{-1} + C^T R^{-1} C$$
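One measurement-update step can be sketched directly from these formulas. In the example below the matrices $C$, $R$, the a priori estimate, and the measurement are made-up numbers for illustration:

```python
import numpy as np

# Sketch of one Kalman measurement update:
#   gain        K  = P-.C' (C P- C' + R)^-1
#   estimate    x  = x- + K (z - C x-)
#   covariance  P  = (I - K C) P-
#   information P^-1 = (P-)^-1 + C' R^-1 C
n = 2
C = np.array([[1.0, 0.0]])               # measure the first state only
R = np.array([[0.1]])
x_prior = np.array([0.0, 1.0])           # a priori estimate xhat_k^-
P_prior = np.array([[1.0, 0.0],
                    [0.0, 2.0]])         # a priori error covariance P_k^-
z = np.array([0.3])                      # current measurement z_k

S = C @ P_prior @ C.T + R                # innovation covariance
K = P_prior @ C.T @ np.linalg.inv(S)     # Kalman gain
x_post = x_prior + K @ (z - C @ x_prior) # updated estimate xhat_k
P_post = (np.eye(n) - K @ C) @ P_prior   # updated covariance P_k

# Information form: inverse covariances add
info = np.linalg.inv(P_prior) + C.T @ np.linalg.inv(R) @ C
print(x_post, P_post)
```

Note that the unmeasured second state is untouched by the update (its variance stays at 2), while the measured first state moves toward $z$ and its variance shrinks; that is exactly the weighting the gain $K_k$ encodes.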