Saturday, May 01, 2010

Kalman Filters

(EDIT: See also Kalman Filter's Part 2)
I've been looking into a number of sensor fusion techniques, including Particle Filters and Kalman Filters (EKF,UKF, etc). I was looking for something easy to implement, easy to use, and not too computationally intensive.

I've always been a bit sceptical as to the advantages of a Kalman filter over an Alpha-Beta filter, in terms of performance gained vs cost (i.e.: difficulty to implement & use). For most simple tasks, the Alpha-Beta filter seems to work fine. (Similarly ABG vs AB).

Particle filters appeared advantageous to me, as they are relatively intuitive and aren't that hard to use, and easy to extend. The difficulty in implementing them can be overcome by using an existing library,

Kalman filters in my opinion have as a major disadvantage of being difficult to use (you need to construct multiple models), but are relatively easy to implement and computationally efficient. Thus, the use of an existing library doesn't help much.

I spoke to two experts in the field, Julius Ziegler (who worked on the DARPA Grand Challenge Annieway vehicle with me) and Rob Reid (a local, working on visual SLAM with Kalman filters). Both suggested that Kalman filters would be the most appropriate choice. (Reading various papers seems to indicate a merged (E)Kalman & Particle filter approach is the winner)

Wikipedia provides an overview of Kalman filters, but the real problem is in understanding what all the symbols actually mean, and how it works.

Like alpha-beta, Kalman filters are prediction-correction filters. That is, they make a prediction, then correct it to provide the final estimate of the systems state. Alpha-Beta's don't have a natural extension to include control inputs, or system identification techniques. They are also (effectively) constrained to linear position/velocity tracking. Kalman filters can be extended to handle non-linear conditions.

In essence a Kalman filter will (vaguely):
  1. Estimate the state (from previous)
  2. Determine the error
  3. Predict the next state
  4. Predict the next error
  5. Measure the system
  6. Update the state calculations
  7. Update the error calculations
Bellow are the equations that represent this process, followed by the definitions of each symbol:
x_temp_est = A * x_est(last) + B * u_control_input
P_temp = A * P(last) * A^T + Q
K = P_temp * H^T * (H* P_temp * H^T + R)^-1
x_est = x_temp_est + K * (z_measured - H * x_temp_est) 
P = (I - K*H) * P_temp
  • x - is the state we want to estimate (e.g. position)
  • u - is the input to the control system (if it exists) (e.g. a force)
  • A - state (model) matrix, how to get from last state, to next state (e.g.: position += velocity *dt). (Some people (eg: engineers) use an alternative notation 'F' for this matrix)
  • B - is the input matrix (Some people use an alternative notation 'G' for this matrix)
  • H - is the measurement matrix (what are we measuring?) e.g.: position or velocity, etc.
  • Q - is the process noise covariance matrix (you just 'know' this)
  • R - is the measurement noise covariance matrix (likewise, you 'know' this, usually from measurements, data sheets, etc, ie: how noisy are your sensors?)
  • P - the estimate error covariance matrix (calculated)
  • K - the kalman gain (calculated)
The things you must 'know' are:
  • How your system moves from one state to the next (e.g. given a current position, velocity, change in time, what will the next state be?) => This is the 'A' or 'F' matrix
  • What are you able to measure? (Maybe just acceleration, or just position?) => This is the H matrix
  • The amount of noise in your measurements and the state-transitions (e.g. the standard deviation of the signal noise, and how 'wrong' your simplified model of the state-transitions are) => These are Q and R matrices
Completely confused? Good. Some examples will help clear this whole mess up. First, a simple example where we are observing a noisy-constant. (e.g. measuring the height of a water-tank with a poor quality sensor) We have no control input so we can ignore the B and u terms. We have no state-transition model (it is a constant!) so we can set A to identity (i.e. one). Our first equation becomes:
x_temp_est = x_est(last)
Subsequently the second equation becomes:
P_temp = P(last) + Q
Where Q is our estimation of the variance in our state transition. Since we are measuring a constant, this value should be small. (Remember the real world always has noise, so assuming zero noise is always a bad idea) We measure the state directly (i.e. the noisy value of the constant) so the H matrix is also identity. This means:
K = P_temp * (P_temp + R)^-1
x_est = x_temp_est + K * (z_measured - x_temp_est) 
P = (1- K) * P_temp
Again, we keep R since that is our all-important measurement noise estimate. Now let's see this in action in C:
/** A simple kalman filter example by Adrian Boeing  

#include <stdio.h> 
#include <stdlib.h> 
#include <math.h> 

double frand() { 
    return 2*((rand()/(double)RAND_MAX) - 0.5); 

int main() { 

    //initial values for the kalman filter 
    float x_est_last = 0;
    float P_last = 0; 
    //the noise in the system 
    float Q = 0.022;
    float R = 0.617;
    float K; 
    float P; 
    float P_temp; 
    float x_temp_est; 
    float x_est; 
    float z_measured; //the 'noisy' value we measured
    float z_real = 0.5; //the ideal value we wish to measure
    //initialize with a measurement 
    x_est_last = z_real + frand()*0.09;
    float sum_error_kalman = 0;
    float sum_error_measure = 0;
    for (int i=0;i<30;i++) {
        //do a prediction 
        x_temp_est = x_est_last; 
        P_temp = P_last + Q; 
        //calculate the Kalman gain 
        K = P_temp * (1.0/(P_temp + R));
        z_measured = z_real + frand()*0.09; //the real measurement plus noise
        x_est = x_temp_est + K * (z_measured - x_temp_est);  
        P = (1- K) * P_temp; 
        //we have our new system 
        printf("Ideal    position: %6.3f \n",z_real); 
        printf("Mesaured position: %6.3f [diff:%.3f]\n",z_measured,fabs(z_real-z_measured)); 
        printf("Kalman   position: %6.3f [diff:%.3f]\n",x_est,fabs(z_real - x_est)); 
        sum_error_kalman += fabs(z_real - x_est); 
        sum_error_measure += fabs(z_real-z_measured); 
        //update our last's 
        P_last = P; 
        x_est_last = x_est; 
    printf("Total error if using raw measured:  %f\n",sum_error_measure);
    printf("Total error if using kalman filter: %f\n",sum_error_kalman);
    printf("Reduction in error: %d%% \n",100-(int)((sum_error_kalman/sum_error_measure)*100));
    return 0; 
This should generate output similar to:
Mesaured position:  0.562 [diff:0.062]
Kalman   position:  0.498 [diff:0.002]
Ideal    position:  0.500 
Mesaured position:  0.563 [diff:0.063]
Kalman   position:  0.509 [diff:0.009]
Total error if using raw measured:  1.566003
Total error if using kalman filter: 0.441561
Reduction in error: 72% 
More (non-trivial) examples will follow soon. In the meantime I can suggest the following resources:
Source code/libraries:


Vilson said...

Tyvm for sharing your learning. I've learned some with ur sharing.
Mainly about the definition of each symbol used on the equations, that was really helpful.

sukhi said...

amzing work man!!!it was a lot of help to me in my thesis work, i'm doing. Thanks!!

Adrian said...

Your welcome. What is your thesis topic?

sukhi said...

Actually working on target tracking in underwater sensor networks.

Muhammad Faiz Mohamed Saaid said...

is there any good references for particle filter, I'm working on nonlinear bayesian filtering problems for my phd

Topher Lafata said...

Thanks a lot for the explanation. Good job!

I have a few questions that maybe you could answer.

What does the T represent that some terms are being raised to?

I am attempting to apply this filter to a value that is expected to increment by a constant at each step but never increments exactly.

Let's say that in your example the initial constant were expected to increment by 0.5 at every iteration.

Would I have to recalculate A at every step such that
A * last_step = last_step +0.5?

If I expected the error in the increment to be +/- 0.2 at each step would Q be 0.4?

Is it acceptable to calculate the value of Q as we go along or does the initial value relate to how fast the filter converges to minimize the error?

Topher Lafata said...

Ok. I guess the T is a transpose which in my case would just be A itself I assume.

Any insight on the last two questions would be appreciated.


Azizi van Abdullah said...

A very nice tutorial. Do you have a tutorial that describe the same problem using EKF?

RK Jaiswal said...

Hi,you have explained nicely, I want to use the same code to predict the future location of a car using gps coordinates how i can proceed?

MD RAHMAN said...

Hey, thanks for nice blog. I want to know what is frand() and srand(). Please give clear ideas.

Mohammadali Vahidpour said...

i hope that you help me.
my master student thesis has about estimation of state of charge of li-ion battery .
in SOC(state of charge)method usage the extended Kalman filtering for estimating.
I have a big problem in noise Covariance matrix calculation.
help me please.
my ... tanks my dears.

Heejune Ahn said...

How we can JUST know the Q and R matrix? Can you give an example? I thought we can get this in Kalman filter Algorithm. Is this impossible?