This is the fourth story in a series documenting my plan to make an autonomous RC race car. The first three stories can be found here:
The last story introduced the idea of sensor fusion in state estimation: you have some state that you want to estimate, such as the position of a car in the world, and a number of sensors each providing partial information about that state, which you fuse together into a single, more accurate estimate. I made the analogy to how humans move by using a combination of vision, our vestibular system, and muscle memory, which all contribute to a single accurate estimate of where we are. Below is the wonderful illustration I made for each of these components.
Why is sensor fusion necessary? It is partly due to the fact that typically no single sensor provides everything that we want to know (or at least not very well). But, it is also due to different sensors having different characteristics, some of which are more desirable than others. A GPS can give an absolute position, but it will have a low update rate, and is subject to discrete jumps. On the other hand, an inertial measurement unit (IMU) can update extremely quickly, but when you try to integrate acceleration over time to obtain position, the errors in doing so grow without bounds over time. What we want is the best of both worlds, hence sensor fusion.
For this project, I’ll be implementing sensor fusion to improve the odometry estimation with encoders from the last story, by combining it with data from an IMU. But first, I’ll take a moment to provide some details on how this actually works. There are a few different approaches to sensor fusion, such as a probabilistic method, or fuzzy logic. I’ll explain how the probabilistic approach works since that is what I am using, and I feel it is the most intuitive.
First off, you need to think in terms of probability distributions. When you consider your own location right now, your true position may be at coordinates X and Y. But that true position is generally unknown. So instead, we consider a distribution of probability values over every position in the world, where the probabilities of positions near your true position should be high. Below is an example, where the X-axis could be your position in some direction, and the Y-axis the probability of being in that position.
The same applies to how you move and the observations you make. Each observation (i.e. sensor reading) takes the form of a probability distribution, and so does each motion you make. We combine these probabilities to form a more accurate estimate, much like basing an opinion on multiple sources.
Probability distributions are very useful, because instead of just a single measurement value, they can also contain the uncertainty associated with that measurement. For things like sensor measurements, or predicted motion, we don’t always know exactly what that probability distribution will look like, as it is typically quite complex. But, we can make some assumptions and form a model of what we think it should be like. Take a look at the picture below. This is a depiction of a potential model for the probability distribution of a distance measurement with a laser.
Starting from the left you have some probability attributed to really short readings, such as when something obstructs the laser’s path. Then you have the peak representing what you expect the measurement to actually be. At the far right end you have some probability allotted to the laser returning the maximum range. In addition, all of this is supplemented with a little bit of probability spread over every possible reading to account for pure random measurements. Seems like a fairly reasonable model for what you would expect from a measurement with a laser. The same would be done for any other sensors or information you want to incorporate into your estimate.
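As a sketch, that mixture model could look something like the following in Python. The weights and noise parameters here are arbitrary illustrative choices, not calibrated values for any real laser:

```python
import math

def beam_model_prob(z, z_expected, z_max,
                    w_hit=0.7, w_short=0.1, w_max=0.1, w_rand=0.1,
                    sigma_hit=0.05, lambda_short=1.0):
    """Probability of a laser reading z given the expected distance
    z_expected, as a mixture of the four effects described above."""
    # Peak around the expected reading (ordinary measurement noise)
    p_hit = (math.exp(-0.5 * ((z - z_expected) / sigma_hit) ** 2)
             / (sigma_hit * math.sqrt(2 * math.pi)))
    # Short readings caused by unexpected obstructions (decaying exponential)
    p_short = lambda_short * math.exp(-lambda_short * z) if z <= z_expected else 0.0
    # Spike at the sensor's maximum range (no return at all)
    p_max = 1.0 if abs(z - z_max) < 1e-6 else 0.0
    # A little probability spread uniformly over all readings (pure noise)
    p_rand = 1.0 / z_max if 0.0 <= z <= z_max else 0.0
    return w_hit * p_hit + w_short * p_short + w_max * p_max + w_rand * p_rand
```

With these weights, a reading near the expected distance is far more probable than one in the middle of nowhere, and the max-range reading gets its own bump of probability.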
Now that we have some understanding of probability distributions and their importance, let’s get back to the problem. In English, the question we are trying to answer is: what is the probability that I am here, given that I performed all these motions and made all of these observations? In mathematical terms this idea can be expressed by:
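In the standard notation (following Probabilistic Robotics, with $x$ the state, $z$ the observations, and $u$ the controls), that posterior is written

$$p(x_t \mid z_{1:t}, u_{1:t})$$
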
This translates to the probability of the current state, conditioned on all the previous observations and all the previous controls. The above probability distribution can be estimated with a Bayes filter, which is represented by the equations below.
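For reference, in the notation of Probabilistic Robotics the Bayes filter is the two-step recursion

$$\overline{bel}(x_t) = \int p(x_t \mid u_t, x_{t-1})\, bel(x_{t-1})\, dx_{t-1}$$

$$bel(x_t) = \eta\, p(z_t \mid x_t)\, \overline{bel}(x_t)$$

where $bel(x_{t-1})$ is the estimate from the previous time step, the integral is the control update, $p(z_t \mid x_t)$ is the measurement update, and $\eta$ is a normalizing constant.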
I’m not going to go deep into the Bayes filter; there are enough textbooks on that (an excellent one being Probabilistic Robotics by Sebastian Thrun, Wolfram Burgard, and Dieter Fox, specifically Chapter 2 for the Bayes filter). I’m merely showing this formulation to illustrate that it is conveniently composed of three parts: the estimated distribution from the previous time step, the control update, and the measurement update. The previous distribution is where we thought we were one time step ago, the control update is where we think we are now based only on how we moved, and the measurement update is the likelihood of receiving the measurements we did, given where we think we are. This means we have a recursive formulation for updating the current state, which depends on the state from the previous time step, the control that was executed, and the measurements that were observed.
Below is an example of how probability distributions for an initial state estimate and a measurement can be combined. The measurement differs slightly from the initial state, but it is much more concentrated about one point. The result is that the post measurement update more closely resembles the actual measurement, because it had more certainty.
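When both distributions are Gaussian, this combination is just a product of Gaussians, which is the core of the Kalman measurement update. A minimal sketch in Python, with made-up numbers for illustration:

```python
def fuse_gaussians(mean_a, var_a, mean_b, var_b):
    """Multiply two 1-D Gaussians and renormalize. The result is pulled
    toward the more certain (lower-variance) input, and is always more
    certain than either input alone."""
    var = 1.0 / (1.0 / var_a + 1.0 / var_b)
    mean = var * (mean_a / var_a + mean_b / var_b)
    return mean, var

# Initial state estimate: position 0.0 m, quite uncertain (variance 4.0)
# Measurement: position 1.0 m, much more concentrated (variance 0.25)
mean, var = fuse_gaussians(0.0, 4.0, 1.0, 0.25)
```

The fused mean lands much closer to the measurement than to the initial estimate, exactly because the measurement had more certainty, and the fused variance is smaller than either input's.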
The next obvious question is: how do we actually obtain each of the probability distributions needed for the Bayes filter? Well, since the true distributions are likely to be very complex, as an engineer you do what engineers do best: make some assumptions to develop an approximation. In the laser example I showed one possible set of approximations. A different set of assumptions you could make is that everything (state, control, and measurement) is subject to white (purely random) noise, and that each distribution can be modelled as a Gaussian distribution. The Gaussian distribution is the bell curve that most people are familiar with. When you make these assumptions the math becomes very nice, and you are left with something called a Kalman filter.
The Kalman filter is one of the most popular state estimation tools, and you’ll see it applied in GPS receivers, aircraft, and even the navigation computer for the Apollo missions, which spawned its development. There are of course variations of the Kalman Filter, such as the Extended Kalman Filter, the Unscented Kalman Filter, and the information filter, as well as whole other sets of approximations that could have been made instead, which lead to different methods like the particle filter. Each variation makes different assumptions and is suitable for different applications. This could go on forever, but the point is that there are a lot of different approaches to this problem. Anyways, back to the actual purpose of presenting all of this information: to combine the odometry data with IMU data.
So, the end goal is to predict the car’s position and orientation, as well as the linear and angular velocities. First, the car needs an IMU. I’ve got a Phidgets IMU, which I’ve mounted on the very front of the car. The IMU also has a magnetometer, but the car’s electric motor produces a magnetic field that distorts its readings, so I want it as far away as possible. I also made some new mounts for the Arduino and the onboard computer, so I’ll go ahead and show those off as well. The car is basically a brick of batteries and electronics at this point.
From the wheel encoder measurements I can measure the linear and angular velocities. The IMU that I am using provides linear acceleration, angular velocity, and magnetic heading. To fuse these measurements together I’ll be using an Extended Kalman filter, which differs from the standard Kalman filter in that it allows nonlinear motion and measurement models by linearizing them around the current estimate.
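To make the structure concrete, here is a heavily simplified EKF sketch in Python for a planar vehicle. The five-dimensional state, the unicycle motion model, and the noise values are all illustrative choices of mine, not the actual filter running on the car; it just shows the shape of the predict/update cycle:

```python
import numpy as np

class PoseEKF:
    """Toy planar EKF. State: [x, y, heading, v, omega]."""

    def __init__(self):
        self.x = np.zeros(5)   # state estimate
        self.P = np.eye(5)     # state covariance
        # Process noise: these are exactly the covariances you have to tune
        self.Q = np.diag([1e-4, 1e-4, 1e-4, 1e-2, 1e-2])

    def predict(self, dt):
        x, y, th, v, w = self.x
        # Nonlinear unicycle motion model
        self.x = np.array([x + v * np.cos(th) * dt,
                           y + v * np.sin(th) * dt,
                           th + w * dt, v, w])
        # Jacobian of the motion model: the "Extended" part of the EKF
        F = np.eye(5)
        F[0, 2] = -v * np.sin(th) * dt
        F[0, 3] = np.cos(th) * dt
        F[1, 2] = v * np.cos(th) * dt
        F[1, 3] = np.sin(th) * dt
        F[2, 4] = dt
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z, H, R):
        # Standard Kalman measurement update: z = H x + noise, covariance R
        y = z - H @ self.x
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(5) - K @ H) @ self.P

# Each sensor gets its own H: the encoders observe [v, omega],
# while the IMU gyro observes only [omega]
H_enc = np.array([[0, 0, 0, 1, 0], [0, 0, 0, 0, 1.0]])
H_imu = np.array([[0, 0, 0, 0, 1.0]])
```

In use, you would call `predict(dt)` at a fixed rate and `update(...)` whenever an encoder or IMU reading arrives, each with its own measurement covariance `R`.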
The most difficult part about implementing a Kalman filter is tuning it. Remember how I said the Kalman filter comes from assuming all the distributions are Gaussian (the bell curve)? Well, two parameters define that curve: the mean and the variance. The mean indicates the centre of the curve; for a measurement this would simply be the reported value. The variance indicates how spread out the values typically are, and for a sensor it can often be obtained from the data sheet. The Kalman filter has many of these parameters, called covariance values, which govern the measurement and control updates. Properly tuning the covariances is crucial to the filter’s performance, and it can be a bit tricky to do. If I were to set all the covariance values for the IMU to be very large, and all the covariance values for the encoder measurements to be very small, this would indicate that the IMU data is not trustworthy (very spread out from the true value), while the encoder measurements are trustworthy (all very close to the true value). The result would be that the filter would ignore the IMU and listen almost exclusively to the encoder measurements. If the encoder measurements actually are more accurate than the IMU, then this is alright. But if it is the opposite, the filter’s estimate will be inaccurate, because it is biasing its predictions with the wrong data. The point I am trying to make is that these parameters are important and need to be tuned properly, because you are essentially deciding how much to trust each measurement and each step of the update.
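The effect of those choices is easy to demonstrate with a scalar Kalman update. The readings and variances below are made-up numbers, but they show how the covariances decide which sensor wins:

```python
def scalar_kalman_update(mean, var, z, r):
    """One scalar Kalman measurement update; r is the measurement variance."""
    k = var / (var + r)          # Kalman gain: how much to trust z
    return mean + k * (z - mean), (1.0 - k) * var

# Same prior, two conflicting readings; only the variances say who to believe.
# Sensor A (variance 0.01 -> trusted) reads 1.0;
# sensor B (variance 100 -> distrusted) reads 5.0.
m, v = scalar_kalman_update(0.0, 1.0, 1.0, 0.01)
m, v = scalar_kalman_update(m, v, 5.0, 100.0)
```

The fused estimate ends up almost exactly at sensor A's reading; swap the two variances and it would land near 5.0 instead, which is precisely the "biasing with the wrong data" problem if the trusted sensor is actually the bad one.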
I ended the last story by showing the path of the car estimated only from the encoder measurements, and the estimate became very inaccurate when I drove quickly. So I repeated that test with the IMU data fused in as well. Check out the video below to watch the car’s estimated position as it moves. This is the same path as the last test, about 100 m long, with the car moving at 3–5 m/s (11–18 km/h) on smooth tile floor. The video is in real time; I didn’t speed it up at all. The green line is the fused position estimate, while the red line is the position estimate from the encoders alone.
Looks much better! This time the position error at the end of the 100 m long lap was just 0.23 m, compared to 10.91 m when only encoder measurements were used. What this means is that when the car is moving quickly, and the wheels are likely sliding a small amount on the smooth floor, the filter is able to use the IMU data to keep the state estimate from diverging.
An important point to make is that the current state estimate is entirely relative. There are no absolute measurements being incorporated, such as a reference to some feature in the world, or a GPS signal. As a result the errors will grow without bound over time, just like trying to walk with your eyes closed. So I need some form of global reference. A camera will be mounted to the car to achieve exactly that, and next story I’ll explain how I do it.
If you are interested in the code, check out the GitHub repo here. Or for the previous stories, you can find them below.