(EDIT: See also Kalman Filters Part 2)
I've been looking into a number of sensor fusion techniques, including
Particle Filters and
Kalman Filters (EKF, UKF, etc.). I was looking for something easy to implement, easy to use, and not too computationally intensive.
I've always been a bit sceptical as to the advantages of a Kalman filter over an
Alpha-Beta filter, in terms of performance gained vs. cost (i.e. difficulty to implement and use). For most simple tasks, the Alpha-Beta filter seems to work fine. (Similarly, alpha-beta-gamma vs. alpha-beta.)
Particle filters appeared advantageous to me, as they are relatively intuitive, aren't that hard to use, and are easy to extend. The difficulty in implementing them can be overcome by using an existing library.
Kalman filters, in my opinion, have the major disadvantage of being difficult to use (you need to construct multiple models), but they are relatively easy to implement and computationally efficient. Thus, using an existing library doesn't help much.
I spoke to two experts in the field,
Julius Ziegler (who worked on the DARPA Grand Challenge Annieway vehicle with me) and
Rob Reid (a local, working on visual SLAM with Kalman filters). Both suggested that Kalman filters would be the most appropriate choice. (Reading various papers seems to indicate that a merged (Extended) Kalman & Particle filter approach is the winner.)
Wikipedia provides an overview of Kalman filters, but the real problem is in understanding what all the symbols actually mean, and how it works.
Like alpha-beta, Kalman filters are prediction-correction filters. That is, they make a prediction, then correct it to provide the final estimate of the system's state. Alpha-Beta filters don't have a natural extension to include control inputs or system identification techniques, and they are also (effectively) constrained to linear position/velocity tracking. Kalman filters can be extended to handle non-linear conditions.
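To make the comparison concrete, here is a minimal alpha-beta tracker sketch (mine, not from any particular library; the gains and measurements are made up purely for illustration). It shows the same predict-then-correct structure, just with fixed, hand-picked gains instead of a computed one:
/** A minimal alpha-beta tracker sketch - illustrative only. */
#include <stdio.h>

int main() {
    float x = 0, v = 0;                    //estimated position and velocity
    float alpha = 0.85, beta = 0.005;      //fixed gains, chosen by hand
    float dt = 1.0;                        //time step
    float z[] = {1.1, 2.0, 2.9, 4.2, 5.0}; //noisy position measurements
    for (int i = 0; i < 5; i++) {
        float x_pred = x + v*dt;           //predict the next position
        float r = z[i] - x_pred;           //residual: measurement - prediction
        x = x_pred + alpha*r;              //correct the position estimate
        v = v + (beta/dt)*r;               //correct the velocity estimate
        printf("measured %.2f -> estimated %.2f (v=%.3f)\n", z[i], x, v);
    }
    return 0;
}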
In essence a Kalman filter will (vaguely):
- Estimate the state (from previous)
- Determine the error
- Predict the next state
- Predict the next error
- Measure the system
- Update the state calculations
- Update the error calculations
Below are the equations that represent this process, followed by the definitions of each symbol:
x_temp_est = A * x_est(last) + B * u_control_input
P_temp = A * P(last) * A^T + Q
K = P_temp * H^T * (H * P_temp * H^T + R)^-1
x_est = x_temp_est + K * (z_measured - H * x_temp_est)
P = (I - K*H) * P_temp
- x - is the state we want to estimate (e.g. position)
- u - is the input to the control system (if it exists) (e.g. a force)
- A - the state (model) matrix: how to get from the last state to the next state (e.g. position += velocity * dt). (Some people, e.g. engineers, use the alternative notation 'F' for this matrix.)
- B - is the input matrix (Some people use an alternative notation 'G' for this matrix)
- H - is the measurement matrix (what are we measuring?) e.g.: position or velocity, etc.
- Q - is the process noise covariance matrix (you just 'know' this)
- R - is the measurement noise covariance matrix (likewise, you 'know' this, usually from measurements, data sheets, etc, ie: how noisy are your sensors?)
- P - the estimate error covariance matrix (calculated)
- K - the Kalman gain (calculated)
The things you must 'know' are:
- How your system moves from one state to the next (e.g. given a current position, velocity, and change in time, what will the next state be?) => This is the 'A' or 'F' matrix (see the sketch after this list)
- What are you able to measure? (Maybe just acceleration, or just position?) => This is the H matrix
- The amount of noise in your measurements and the state transitions (e.g. the standard deviation of the signal noise, and how 'wrong' your simplified model of the state transitions is) => These are the R and Q matrices, respectively
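For example, here is a tiny sketch (mine, with made-up numbers) of what the 'A' and 'H' matrices look like for a position/velocity model, and what they do to a state:
/** Sketch: the A and H matrices for a constant-velocity model - illustrative only. */
#include <stdio.h>

int main() {
    float dt = 0.1;                     //time step
    float A[2][2] = {{1, dt},           //position += velocity * dt
                     {0, 1 }};          //velocity stays the same
    float H[2] = {1, 0};                //we can only measure the position
    float x[2] = {2.0, 3.0};            //state: position = 2, velocity = 3
    //predict the next state: x_next = A * x
    float x_next[2] = { A[0][0]*x[0] + A[0][1]*x[1],
                        A[1][0]*x[0] + A[1][1]*x[1] };
    //the predicted measurement: z = H * x_next (just the position)
    float z = H[0]*x_next[0] + H[1]*x_next[1];
    printf("next position %.2f, next velocity %.2f, predicted measurement %.2f\n",
           x_next[0], x_next[1], z);
    return 0;
}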
Completely confused? Good. Some examples will help clear this whole mess up. First, a simple example where we are observing a noisy constant (e.g. measuring the height of a water tank with a poor-quality sensor). We have no control input, so we can ignore the B and u terms. We have no state-transition model (it is a constant!), so we can set A to identity (i.e. one). Our first equation becomes:
x_temp_est = x_est(last)
Subsequently the second equation becomes:
P_temp = P(last) + Q
Where Q is our estimate of the variance in our state transition. Since we are measuring a constant, this value should be small. (Remember, the real world always has noise, so assuming zero noise is always a bad idea.) We measure the state directly (i.e. the noisy value of the constant), so the H matrix is also identity. This means:
K = P_temp * (P_temp + R)^-1
x_est = x_temp_est + K * (z_measured - x_temp_est)
P = (1 - K) * P_temp
Again, we keep R since that is our all-important measurement noise estimate. Now let's see this in action in C:
/** A simple Kalman filter example by Adrian Boeing
    www.adrianboeing.com
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double frand() {
    return 2*((rand()/(double)RAND_MAX) - 0.5);
}

int main() {
    //initial values for the kalman filter
    float x_est_last = 0;
    float P_last = 0;
    //the noise in the system
    float Q = 0.022;
    float R = 0.617;

    float K;
    float P;
    float P_temp;
    float x_temp_est;
    float x_est;
    float z_measured;   //the 'noisy' value we measured
    float z_real = 0.5; //the ideal value we wish to measure

    srand(0);

    //initialize with a measurement
    x_est_last = z_real + frand()*0.09;

    float sum_error_kalman = 0;
    float sum_error_measure = 0;

    for (int i = 0; i < 30; i++) {
        //do a prediction
        x_temp_est = x_est_last;
        P_temp = P_last + Q;
        //calculate the Kalman gain
        K = P_temp * (1.0/(P_temp + R));
        //measure
        z_measured = z_real + frand()*0.09; //the real measurement plus noise
        //correct
        x_est = x_temp_est + K * (z_measured - x_temp_est);
        P = (1 - K) * P_temp;
        //we have our new system

        printf("Ideal position: %6.3f \n", z_real);
        printf("Measured position: %6.3f [diff:%.3f]\n", z_measured, fabs(z_real - z_measured));
        printf("Kalman position: %6.3f [diff:%.3f]\n", x_est, fabs(z_real - x_est));

        sum_error_kalman += fabs(z_real - x_est);
        sum_error_measure += fabs(z_real - z_measured);

        //update our 'last' values
        P_last = P;
        x_est_last = x_est;
    }

    printf("Total error if using raw measured: %f\n", sum_error_measure);
    printf("Total error if using kalman filter: %f\n", sum_error_kalman);
    printf("Reduction in error: %d%% \n", 100 - (int)((sum_error_kalman/sum_error_measure)*100));

    return 0;
}
This should generate output similar to:
Measured position: 0.562 [diff:0.062]
Kalman position: 0.498 [diff:0.002]
Ideal position: 0.500
Measured position: 0.563 [diff:0.063]
Kalman position: 0.509 [diff:0.009]
Total error if using raw measured: 1.566003
Total error if using kalman filter: 0.441561
Reduction in error: 72%
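As a taste of a slightly less trivial case, below is a rough sketch (mine, not from the original post; all the constants are assumed for illustration) of the same predict/correct loop with a two-element state [position, velocity], tracking a constant-velocity target from noisy position measurements. The 2x2 matrix maths for A*P*A^T and (I - K*H)*P is written out element by element:
/** Sketch: a 2-state (position & velocity) Kalman filter - illustrative only.
    A = [[1, dt], [0, 1]], H = [1, 0]; all noise values are made up.
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double frand() {
    return 2*((rand()/(double)RAND_MAX) - 0.5);
}

int main() {
    float dt = 1.0;                       //time step
    float x[2] = {0, 0};                  //state estimate [position, velocity]
    float P[2][2] = {{1, 0}, {0, 1}};     //estimate error covariance
    float q = 0.0001;                     //process noise (diagonal of Q)
    float R = 0.1;                        //measurement noise
    float real_pos = 0, real_vel = 0.5;   //the 'true' system we are tracking
    srand(0);
    for (int i = 0; i < 20; i++) {
        //predict the state: x = A*x
        float xp[2] = { x[0] + dt*x[1], x[1] };
        //predict the covariance: P = A*P*A^T + Q, written out element by element
        float Pp[2][2];
        Pp[0][0] = P[0][0] + dt*(P[0][1] + P[1][0]) + dt*dt*P[1][1] + q;
        Pp[0][1] = P[0][1] + dt*P[1][1];
        Pp[1][0] = P[1][0] + dt*P[1][1];
        Pp[1][1] = P[1][1] + q;
        //measure: the true position plus noise (we never see the velocity)
        real_pos += real_vel*dt;
        float z = real_pos + frand()*0.3;
        //correct: with H = [1, 0] the gain K has two elements
        float S = Pp[0][0] + R;           //innovation covariance: H*Pp*H^T + R
        float K[2] = { Pp[0][0]/S, Pp[1][0]/S };
        float y = z - xp[0];              //innovation: measurement - prediction
        x[0] = xp[0] + K[0]*y;
        x[1] = xp[1] + K[1]*y;
        //correct the covariance: P = (I - K*H)*Pp
        P[0][0] = (1 - K[0])*Pp[0][0];
        P[0][1] = (1 - K[0])*Pp[0][1];
        P[1][0] = Pp[1][0] - K[1]*Pp[0][0];
        P[1][1] = Pp[1][1] - K[1]*Pp[0][1];
        printf("true pos: %6.3f est pos: %6.3f est vel: %6.3f\n",
               real_pos, x[0], x[1]);
    }
    return 0;
}
Note that even though we never measure the velocity directly, the filter infers it from the sequence of position measurements (via the off-diagonal terms of P).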
More (non-trivial) examples will follow soon. In the meantime I can suggest the following resources:
- http://www.cs.unc.edu/~tracker/ref/s2001/kalman/index.html
- http://www.cs.cmu.edu/~biorobotics/papers/sbp_papers/integrated3/kleeman_kalman_basics.pdf
- http://www.acfr.usyd.edu.au/education/multiSensorDataFusion.shtml
- http://www.ee.pdx.edu/~greenwd/kalman_overview.pdf
Source code/libraries:
- http://kalman.sourceforge.net/
- http://bayesclasses.sourceforge.net/
- http://www.memsense.com/index.php/Product-Pages/kalman-filter-library.html
- http://sites.google.com/site/jordiuavs/Home/kalman_filter_by_Jordi.txt
- http://www.orocos.org/bfl
- http://www.scipy.org/Cookbook/KalmanFiltering