The Kalman Filter: Continuous State Tracking
Unlike Hidden Markov Models (HMMs) and Dynamic Bayesian Networks (DBNs), which often deal with discrete probability states (e.g., "Raining" vs. "Not Raining"), a Linear Dynamical System using a Kalman Filter tracks continuous variables (e.g., position, velocity, temperature).
The Kalman Filter represents uncertainty using Gaussian (normal) distributions, defined by a mean (the most likely estimate) and a variance (how uncertain that estimate is).
The filter operates in a continuous two-step loop:
1. The Predict Step (Time Update)
The filter uses the laws of physics or a known mathematical model to guess where the system will be next. Because no model is perfect, this step injects Process Noise ($Q$), which widens the uncertainty.

- Predicted State: $\hat{x}_k^- = \hat{x}_{k-1} + \dot{x}\,\Delta t$ (the motion model projects the last estimate forward)
- Predicted Uncertainty: $P_k^- = P_{k-1} + Q$
2. The Update Step (Measurement Update)
The system takes a reading from a sensor. Sensors are inherently flawed, so this measurement has its own Measurement Noise ($R$).

- Kalman Gain: $K_k = \dfrac{P_k^-}{P_k^- + R}$
- New State Estimate: $\hat{x}_k = \hat{x}_k^- + K_k\,(z_k - \hat{x}_k^-)$
- New Uncertainty: $P_k = (1 - K_k)\,P_k^-$
Note on Tuning: As mentioned in the lecture notes, if $R$ is large relative to $P$, the gain shrinks toward 0 and the filter leans on its model; if $R$ is small, the gain grows toward 1 and the filter follows the sensor.
A 1D Numerical Example
Imagine a robot moving in a straight line. We want to track its position along an x-axis.
- It starts at position 0, moving at 1 meter per second.
- Initial Estimate ($\hat{x}_0$): 0
- Initial Uncertainty ($P_0$): 1.0 (We are somewhat unsure exactly where the center of the robot is)
- Process Noise ($Q$): 0.1 (We trust our motors and physics model quite a bit)
- Measurement Noise ($R$): 0.5 (Our GPS/sensor is somewhat noisy)
Time Step 1 ($\Delta t = 1$ second)
Step 1: Predict
The robot was at 0 and moves at 1 m/s, so we predict it is now at $\hat{x}_1^- = 0 + 1 \times 1 = 1.0$. We add the process noise to our uncertainty: $P_1^- = 1.0 + 0.1 = 1.1$.
Step 2: Measure
The robot's GPS sensor takes a reading. Let's say it reads a slightly inaccurate value of 1.2.
Step 3: Update (Calculate Kalman Gain)
We calculate the Kalman Gain:

- $K_1 = \dfrac{P_1^-}{P_1^- + R} = \dfrac{1.1}{1.1 + 0.5} \approx 0.6875$

(Because $K_1 \approx 0.69$ is closer to 1, it means we are leaning slightly more toward trusting the measurement than our prediction, since our prediction uncertainty 1.1 is higher than the sensor noise 0.5.)
Step 4: Update (Final State and Uncertainty)
Now we calculate the final, optimized position estimate and update our confidence.
- New Position ($\hat{x}_1$): $1.0 + 0.6875 \times (1.2 - 1.0) = 1.1375$
- New Uncertainty ($P_1$): $(1 - 0.6875) \times 1.1 = 0.34375$
The Result: The filter cleverly blended the prediction (1.0) and the sensor reading (1.2) to arrive at 1.1375. Crucially, as highlighted in the lecture snippet, observe how the uncertainty shrank from the predicted 1.1 to 0.34375: fusing the prediction with the measurement leaves the filter more certain than either source alone.
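The whole predict/measure/update cycle above can be sketched in a few lines of Python. This is a minimal illustration of the 1D worked example; the function name `kalman_1d_step` and its parameter names are mine, not from the lecture.

```python
def kalman_1d_step(x_est, p_est, velocity, dt, z, q, r):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: project the state forward with the motion model,
    # and widen the uncertainty by the process noise q.
    x_pred = x_est + velocity * dt
    p_pred = p_est + q

    # Update: blend prediction and measurement z via the Kalman gain.
    k = p_pred / (p_pred + r)           # gain in [0, 1]
    x_new = x_pred + k * (z - x_pred)   # corrected state estimate
    p_new = (1 - k) * p_pred            # shrunken uncertainty
    return x_new, p_new, k

# Reproduce the numbers from the worked example:
x, p, k = kalman_1d_step(x_est=0.0, p_est=1.0, velocity=1.0, dt=1.0,
                         z=1.2, q=0.1, r=0.5)
print(x, p, k)  # ≈ 1.1375, 0.34375, 0.6875
```

Running the filter over many time steps simply feeds each step's `x_new, p_new` back in as the next step's `x_est, p_est`.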
1. The Intuition: Overlapping Certainty
The lecture begins by visually explaining why the Kalman Filter works using Probability Density Functions (Gaussian curves).
- A system starts with an initial state estimate that has a specific variance (uncertainty).
- When the system predicts its next state through movement, the variance widens, meaning uncertainty increases.
- A sensor provides a measurement, which has its own independent curve and variance.
- The Magic: By multiplying these two probability curves together, the filter produces an "Optimal state estimate". Crucially, the resulting curve is narrower and taller than both the prediction and the measurement curves, proving that combining two uncertain sources results in a higher overall certainty.
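This product-of-Gaussians intuition can be checked numerically using the standard closed form for multiplying two Gaussian densities. A small sketch (the function name `fuse` is illustrative):

```python
# The product of N(mu1, var1) and N(mu2, var2) is proportional to N(mu, var):
def fuse(mu1, var1, mu2, var2):
    var = 1.0 / (1.0 / var1 + 1.0 / var2)  # always smaller than either input variance
    mu = var * (mu1 / var1 + mu2 / var2)   # precision-weighted average of the means
    return mu, var

# Fuse a prediction N(1.0, 1.1) with a measurement N(1.2, 0.5):
mu, var = fuse(1.0, 1.1, 1.2, 0.5)
print(mu, var)  # ≈ 1.1375, 0.34375
```

These are the same numbers the 1D worked example produces, which is no accident: the scalar Kalman update is exactly this Gaussian multiplication, and the fused variance is always below both inputs.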
2. The Scalar Kalman Filter & The Gain ($K$)
The presentation mathematically dissects the 1D (scalar) version of the filter to explain the behavior of the Kalman Gain ($K$).
The formula for the gain is defined as the ratio of the Estimate Error ($E_{\text{EST}}$) to the sum of the Estimate Error and the Measurement Error ($E_{\text{MEA}}$): $K = \dfrac{E_{\text{EST}}}{E_{\text{EST}} + E_{\text{MEA}}}$.
This creates two important extreme bounds for how the filter behaves:
- Trusting the Sensor ($K \to 1$): If the measurement error is effectively zero ($E_{\text{MEA}} \to 0$), the sensors are perfectly accurate. The gain becomes 1, and the filter updates its current state to match the measurement exactly ($\hat{x}_k = z_k$).
- Trusting the Prediction ($K \to 0$): If the measurement error is incredibly high (sensors are inaccurate), the denominator grows without bound, driving the gain toward 0. The filter ignores the faulty sensor data and relies entirely on its previous prediction ($\hat{x}_k = \hat{x}_k^-$).
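Both limits can be verified numerically. A tiny sketch of the scalar gain (the function name `gain` is mine):

```python
# Scalar Kalman gain: K = E_EST / (E_EST + E_MEA)
def gain(e_est, e_mea):
    return e_est / (e_est + e_mea)

print(gain(1.1, 1e-9))  # measurement error ~0   -> gain near 1: trust the sensor
print(gain(1.1, 1e9))   # measurement error huge -> gain near 0: trust the prediction
print(gain(1.1, 0.5))   # the worked example's values -> ~0.6875
```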
3. The Matrix Kalman Filter
Real-world robotics rarely operate in 1D. A robot needs to track position and velocity across X and Y axes simultaneously. To do this, the scalar equations are upgraded to matrices.
The Prediction Step:
- State ($\hat{x}$): $\hat{x}_k^- = A\hat{x}_{k-1} + Bu_k + w_k$. The state vector ($\hat{x}$) is updated by multiplying the previous state by the transition matrix ($A$), adding any control inputs ($u_k$) modified by the control matrix ($B$), and accounting for process noise ($w_k$).
- Uncertainty ($P$): $P_k^- = AP_{k-1}A^T + Q$. The covariance matrix ($P$) is projected forward using $A$ and its transpose $A^T$, while adding the process noise covariance ($Q$).
The Update Step:
- Kalman Gain ($K$): $K_k = P_k^- H^T (HP_k^- H^T + R)^{-1}$. This uses the observation matrix ($H$) to map the state space to the measurement space, factoring in sensor noise ($R$).
- Final State ($\hat{x}$): $\hat{x}_k = \hat{x}_k^- + K_k(z_k - H\hat{x}_k^-)$. The filter calculates the residual (the difference between the actual measurement $z_k$ and the predicted measurement $H\hat{x}_k^-$) and scales it by $K_k$ to correct the state.
- Final Uncertainty ($P$): $P_k = (I - K_kH)P_k^-$. The uncertainty is reduced based on how much new information was gained.
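The matrix equations can be sketched in plain Python, with list-based helpers so each algebraic step stays visible. This is a minimal illustration for a [position, velocity] state with a scalar position measurement; the specific `A`, `H`, `Q`, `R` values below are assumptions for demonstration, not from the lecture.

```python
def matmul(X, Y):
    # Plain-list matrix multiply.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def madd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(len(X[0]))] for i in range(len(X))]

def kf_step(x, P, A, Q, H, R, z):
    # Predict: x- = A x,  P- = A P A^T + Q  (no control input in this sketch)
    x_pred = matmul(A, x)
    P_pred = madd(matmul(matmul(A, P), transpose(A)), Q)
    # Update, for a single scalar measurement z:
    # innovation variance S = H P- H^T + R (a 1x1 quantity here)
    S = matmul(matmul(H, P_pred), transpose(H))[0][0] + R
    PHt = matmul(P_pred, transpose(H))
    K = [[PHt[i][0] / S] for i in range(len(PHt))]   # gain K = P- H^T / S
    y = z - matmul(H, x_pred)[0][0]                  # residual z - H x-
    x_new = [[x_pred[i][0] + K[i][0] * y] for i in range(len(x_pred))]
    # P = (I - K H) P-
    KH = matmul(K, H)
    n = len(P_pred)
    I_KH = [[(1.0 if i == j else 0.0) - KH[i][j] for j in range(n)] for i in range(n)]
    return x_new, matmul(I_KH, P_pred)

# Track [position; velocity], measuring position only (H picks out position).
dt = 1.0
A = [[1.0, dt], [0.0, 1.0]]
H = [[1.0, 0.0]]
Q = [[0.1, 0.0], [0.0, 0.1]]
R = 0.5
x, P = kf_step([[0.0], [1.0]], [[1.0, 0.0], [0.0, 1.0]], A, Q, H, R, 1.2)
print(x)
```

Note how the measurement of position alone also corrects the velocity estimate: the off-diagonal covariance terms in `P_pred` couple the two states, so the gain vector has a nonzero velocity component.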
4. Designing a Filter using Kinematics
The lecture concludes by showing how to build the $A$ and $B$ matrices from basic kinematics.
If you are tracking 1D distance and velocity, the state vector is $\hat{x} = \begin{bmatrix} x \\ \dot{x} \end{bmatrix}$.
- The Transition Matrix ($A$): $A = \begin{bmatrix} 1 & \Delta t \\ 0 & 1 \end{bmatrix}$. Models how position and velocity naturally evolve over a time step ($\Delta t$).
- The Control Matrix ($B$): $B = \begin{bmatrix} \tfrac{1}{2}\Delta t^2 \\ \Delta t \end{bmatrix}$. Models how external acceleration ($a$) impacts both position and velocity.
By plugging these physically derived matrices into the Kalman equations, the algorithm can accurately track a moving object over time.
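Written element-wise, one prediction step with these matrices reduces to the familiar constant-acceleration equations. A minimal sketch (the `dt` and acceleration values are made up for illustration):

```python
# Predict [position, velocity] one step forward using
# A = [[1, dt], [0, 1]] and B = [[0.5*dt**2], [dt]], expanded element-wise.
def predict(x, v, a, dt):
    x_new = x + v * dt + 0.5 * a * dt**2  # row 1 of A @ state + B * a
    v_new = v + a * dt                    # row 2
    return x_new, v_new

print(predict(0.0, 1.0, 2.0, 1.0))  # start at 0, moving 1 m/s, accelerating 2 m/s^2
```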