Maybe it's just because I'm not the target audience for this paper, but I'm finding it very tough going. I'm a PhD student in a mathematical field (operations research) but have only the faintest idea about Kalman filters: something about updating beliefs based on noisy measurements in a way that feels intuitively similar to Bayes' rule. Fair enough, I'm not a tutor intending to teach Kalman filters to anyone in the near future, but I should have the mathematical background to get through this. Not sure why it feels like a slog. Maybe it's because the measurement, prediction, and update equations appear before any intuition about what a Kalman filter is, what the stages of the algorithm are, and so on?
Edit: all the way through now. Certainly this would be much more useful as an intro to Kalman filters (rather than an introduction to introducing Kalman filters) had some intuition been given.
Sequential Bayesian Filtering is how you apply repeated evidence to a moving target. There are three steps:
1. Predict: Using some Markov process, move your prior distribution forward in time so it's compatible with your new evidence. (Intuitively, everything becomes less certain as it's free to move around. Mathematically, doing this with continuous probabilities tends to mean an incredibly gross integral.)
2. Update: Using Bayes' Rule, update your probabilities with the new evidence. (Intuitively, this bunches the distribution back up. If the predict/update don't vary in time/quality, this tends to asymptotically reach some sort of balance. Mathematically, this tends to also be gross.)
3. Notreallyastep: Recycle your results as the priors in step 1 next time. (Note this means your result needs to be in the same format as your old priors if you don't want to re-solve all the math every update.)
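The three steps above can be sketched concretely by doing everything on a small finite grid, so the "gross integrals" become sums. This is a toy sketch with made-up numbers (the grid size, transition probabilities, and noise model are all invented for illustration):

```python
import numpy as np

# Hypothetical 1-D tracking example on a coarse grid of positions.
positions = np.arange(5)                      # states 0..4
prior = np.array([0.0, 0.0, 1.0, 0.0, 0.0])   # start certain the target is at 2

# Step 1 (predict): a Markov transition smears the belief out.
# Each step the target stays put with p=0.5 or moves +/-1 with p=0.25 each.
T = np.zeros((5, 5))
for i in range(5):
    T[i, i] += 0.5
    T[i, max(i - 1, 0)] += 0.25
    T[i, min(i + 1, 4)] += 0.25
predicted = T.T @ prior        # belief spreads out: everything less certain

# Step 2 (update): Bayes' rule against a noisy measurement z = 3.
likelihood = np.exp(-0.5 * (positions - 3) ** 2)   # made-up noise model
posterior = likelihood * predicted
posterior /= posterior.sum()   # renormalize; the belief bunches back up

# Step 3 (notreallyastep): the posterior is a vector over the same grid
# as the prior, so it can be recycled directly as next cycle's prior.
prior = posterior
```

The key point of step 3 shows up in the code: the posterior has exactly the same shape as the prior, so the recursion closes without re-deriving anything.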
If you get around the gross math by doing everything in finite space and brute forcing it (integrals become summations), you get a hidden Markov model.
If you get around the gross math by doing a Monte Carlo approximation, you get a particle filter.
If you assume your priors are normal, your evidence is normal, and your update function fits in a matrix multiplication, then you're in luck: all of the math works out so your result is also normal. That's a Kalman Filter.
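In one dimension the "everything stays normal" case fits in a few lines, since a Gaussian is just a mean and a variance. This is a minimal sketch, not a reference implementation; the dynamics are a random walk and all the noise values are made up:

```python
import numpy as np

def kalman_1d(z_seq, x0=0.0, p0=1.0, q=0.01, r=0.5):
    """Minimal scalar Kalman filter: random-walk dynamics, direct
    measurement of the state. q = process-noise variance, r =
    measurement-noise variance (values here are illustrative)."""
    x, p = x0, p0
    for z in z_seq:
        # Predict: the mean stays put, the variance grows by q.
        p = p + q
        # Update: the Kalman gain blends prediction and measurement,
        # and the posterior is again normal with mean x, variance p.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
    return x, p

# Noisy measurements of a constant true value of 5.0.
rng = np.random.default_rng(0)
zs = 5.0 + rng.normal(0.0, 0.5, size=200)
est, var = kalman_1d(zs)
```

Note that the posterior `(x, p)` is in the same format as the prior `(x0, p0)`, which is exactly why the recursion works without redoing any math.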
I suggest you start with a recursive least squares filter first, which is fully deterministic and quite easy to derive if you are comfortable with Euclidean geometry. The Kalman filter basically just adds process covariance on top of that, so in a sense it is a "fuzzy" least-squares estimator.
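For the simplest case, estimating a single constant, recursive least squares reduces to the running mean written recursively. A toy sketch of that special case (the general RLS filter carries a full covariance recursion, which this deliberately omits):

```python
def rls_constant(z_seq):
    """Recursive least squares for one constant parameter: the running
    mean, written recursively. Fully deterministic; there is no noise
    model anywhere."""
    x, n = 0.0, 0
    for z in z_seq:
        n += 1
        gain = 1.0 / n          # shrinking gain: old data never expires
        x = x + gain * (z - x)  # same innovation form as the Kalman update
    return x
```

The update `x + gain * (z - x)` has exactly the shape of the Kalman update; adding process covariance keeps the gain from shrinking to zero, so old data is gradually forgotten. That's the "fuzzy" part.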
I've worked with all the different types of Kalman Filters over the past decade, and one lesson I've learned is that there is no way to make it simple and intuitive. It requires extensive background knowledge in math and stats, and even then, it's very difficult for intelligent people to keep track of all the moving parts. Papers like this one are an exercise in futility.
> "We have a new theorem--that mathematicians can only prove trivial theorems, because every theorem that's proved is trivial." - Richard Feynman
Sigh.
KF is essentially solving a QP with equality constraints (Boyd's course is a good place for details), which can be solved exactly with a single decomposition of the KKT system - picking an ordering is all that matters for complexity.
This is essentially the principle on which all of sparse linear algebra and graphical models work. There is nothing special about the structure of the KF, nor of LQR, nor of their non-linear generalizations.
One can symbolically unroll Schur complements multiple times to make block-LU appear opaque and sophisticated, but it really is not (this, of course, is not to say it is done deliberately). The KF can also be derived from a Bayes' network model, but extending this to non-linear forms like the EKF, and to things which are not first-order, becomes rather troublesome (or impossible).
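To make the QP point concrete: an equality-constrained QP is solved exactly by one linear solve of the KKT system. A toy instance with made-up numbers (not the KF itself, just the underlying mechanism the comment describes):

```python
import numpy as np

# Equality-constrained QP:  minimize 0.5 x'Qx + c'x  subject to  Ax = b.
Q = np.array([[2.0, 0.0], [0.0, 2.0]])   # positive definite objective
c = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])               # one constraint: x0 + x1 = 1
b = np.array([1.0])

# KKT conditions as one block system: [[Q, A'], [A, 0]] [x; nu] = [-c; b].
n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
rhs = np.concatenate([-c, b])
sol = np.linalg.solve(K, rhs)            # one decomposition, exact answer
x_opt, nu = sol[:n], sol[n:]
```

For the KF, the KKT matrix is block-tridiagonal in time, and eliminating it in time order is what produces the familiar forward recursion; the elimination ordering is indeed what matters for complexity.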
I'd have appreciated a post asking for details rather than infantile derision.
I'm sorry that you can't take a joke. Personally, I found it funny that you're blessed with the requisite mathematical knowledge and perspective to be able to view the details as 'trivial minutiae'.
I did spend 5 minutes perusing the paper you linked, and couldn't make heads or tails of it. For myself and all other mere mortals unfamiliar with this mathematical machinery, I can assure you that details are far from trivial.
Well, it's a mathematical paper. I skimmed it, and although it's not my field (well, I wonder what my field is now that I work as a machine learning engineer, but it used to be dynamical systems in Banach spaces), it seems pretty readable given a mathematical research background.
I can assure you, this is way more readable than many other papers. I can totally understand the "minutiae" comment there.
Understanding what the Kalman filter does and how it does it can be intuitive. Understanding that it can all be done efficiently with matrix operations: also intuitive. Getting a good intuition for the specific update equations? Still beats me. I'm a quant/statistician, I focus on filtering. I've implemented this a dozen times and I still have to look these equations up every time.
100% agree. One useful exercise (once you understand all the algebra) is to show that when your measurement noise is extremely small, your estimate is just your current observation. Then show that when your dynamics noise is small, your estimate is just your prior prediction.
Easy enough in one dimension, surprisingly hard in several dimensions.
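In one dimension the exercise is just the formula for the gain, K = P/(P + R), with P the predicted variance and R the measurement-noise variance; the update is x + K(z - x). A sketch of both limits:

```python
def kalman_gain(P, R):
    # P: prior (predicted) variance, R: measurement-noise variance
    return P / (P + R)

# Measurement noise tiny -> gain near 1 -> x + K(z - x) is nearly z,
# i.e. the estimate is just the current observation.
k_trust_sensor = kalman_gain(1.0, 1e-9)

# Dynamics noise tiny -> the predicted variance P stays tiny -> gain
# near 0 -> the estimate is just the prior prediction x.
k_trust_model = kalman_gain(1e-9, 1.0)
```

The multidimensional version replaces the division by an inverse of the innovation covariance, which is where the "surprisingly hard" part comes in.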
Even after all that, could I explain what the Kalman gain is to a ten year old? Not a chance.
I echo the sentiments of Kolbe. There really is no way to make a Kalman filter simple or intuitive. What I have found helps though, is to write one yourself based on the math before using the libraries you find.
Write one, and print out every intermediate value to see how the matrices change. It will be not-quite-correct, but it will give you insight into how exactly a Kalman filter works.
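Something like the following, for a hypothetical constant-velocity model (the dynamics, noise values, and measurements are all made up; printing the gain and covariance each step is the point):

```python
import numpy as np

# State = [position, velocity]; we only measure position.
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # dynamics, dt = 1
H = np.array([[1.0, 0.0]])               # measurement model
Q = 0.01 * np.eye(2)                     # process-noise covariance (made up)
R = np.array([[0.25]])                   # measurement-noise covariance (made up)

x = np.zeros((2, 1))                     # initial state estimate
P = np.eye(2)                            # initial covariance

for z in [1.1, 2.0, 2.9, 4.2]:           # noisy positions, target moving +1/step
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (np.array([[z]]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    print("gain:\n", K, "\ncovariance:\n", P)
```

Watching K and P shrink as measurements accumulate, then settle at a steady state, is where most of the intuition comes from.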
I've encountered a few different attempts at "simple explanations" of the Kalman filter, but what I really want is a "simple explanation of implementing a Kalman filter".
Anyone know of attempts at that? (Or applying one to a real situation?)