1 - Sensor Noise & Aliasing

In any physical system, perception is imperfect. Sensor Noise refers to the random, unpredictable variations in sensor readings, often modeled as Gaussian (normal) distributions. A distance sensor might read 1.0 meters, but the true distance could be 0.95 or 1.05 meters.
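A minimal sketch of this noise model in Python (the 0.05 m standard deviation is an assumption chosen for illustration):

```python
import random

def noisy_distance(true_distance, sigma=0.05):
    """Return a sensor reading: the true distance plus zero-mean Gaussian noise."""
    return random.gauss(true_distance, sigma)

# Repeated readings of a wall 1.0 m away scatter around the true value,
# but their average converges toward it.
readings = [noisy_distance(1.0) for _ in range(1000)]
mean_reading = sum(readings) / len(readings)
```

Averaging many readings is the simplest way to tame zero-mean noise; more principled filters (e.g. Kalman filters) build on the same Gaussian assumption.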

Perceptual Aliasing occurs when a system’s sensors map multiple, distinct states in the environment to the exact same sensor reading.

2 - Actuator Effects (Uncertainty in Action)

Just as input is noisy, output is inconsistent. When a control command is sent to an actuator (like a motor turning a wheel), the physical execution will rarely be perfect.
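A sketch of that idea, assuming the execution error is Gaussian (the 3.6° command and 2° spread are invented for illustration):

```python
import random

def execute_turn(commanded_deg, sigma=2.0):
    """The motor turns approximately the commanded angle, never exactly."""
    return random.gauss(commanded_deg, sigma)

# Small per-command errors accumulate into pose drift over many actions.
heading_error = 0.0
for _ in range(100):
    heading_error += execute_turn(3.6) - 3.6  # executed minus commanded
```

This accumulation of small execution errors is why dead reckoning alone degrades over time and must be corrected with (noisy) sensing.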

3 - AI Foundations

The explosive growth of modern AI rests on four critical pillars:

  1. Data: The massive volume of digitized information required to train complex models.

  2. Computational Power: The hardware infrastructure. Early AI research was severely bottlenecked by weak processors and memory limitations. Modern hardware architecture, heavily utilizing parallel processing, allows for the matrix multiplications required in deep learning.

  3. Algorithms: The mathematical frameworks—from foundational search algorithms to backpropagation in neural networks—that allow systems to learn from data.

  4. Scenarios: The real-world environments and use cases (like autonomous driving or medical diagnosis) that define the problems AI systems are built to solve.

4 - Bayesian Theory vs. Naïve Bayes

Calculating the exact joint probability of many interacting variables requires exponential computational complexity (O(2^n) for n binary variables). The Naïve Bayes classifier drastically simplifies this by making a strong assumption: all features are conditionally independent of each other, given the class label.
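Under that assumption the posterior factorizes as P(C | f1,…,fn) ∝ P(C) · Π P(fi | C), so only one small table per feature is needed instead of one exponential joint table. A toy sketch (the spam/ham probabilities are invented for illustration):

```python
def nb_score(prior, feature_likelihoods):
    """Unnormalized Naive Bayes score: P(C) * product of P(f_i | C)."""
    score = prior
    for p in feature_likelihoods:
        score *= p
    return score

# Two binary features observed as "present"; likelihoods are per class.
spam = nb_score(0.3, [0.8, 0.6])   # P(spam) * P(f1|spam) * P(f2|spam)
ham  = nb_score(0.7, [0.1, 0.2])
p_spam = spam / (spam + ham)       # normalize over the two classes
```

Normalizing at the end avoids ever computing the full joint distribution over the features.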

5 - Bayesian Networks

When you cannot assume absolute independence (as in Naïve Bayes), you use a Bayesian Network. These are Directed Acyclic Graphs (DAGs) that explicitly map out the conditional dependencies between variables.

The network visually encodes two types of independence:

  1. Absolute Independence: If a node like "Weather" has no edges connecting it to the rest of the graph, it is entirely isolated. Knowing the weather provides absolutely zero information to update your belief about whether someone has a cavity.

  2. Conditional Independence: "Toothache" and "Catch" are independent only if the state of "Cavity" is known. Mathematically, this is written as P(T|Catch,C)=P(T|C).
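The second property can be checked numerically: if the joint factors as P(C) · P(T|C) · P(Catch|C), then once Cavity is observed, Catch tells you nothing further about Toothache. A sketch with assumed CPT numbers (chosen only for illustration):

```python
# Assumed CPT values, not from the lecture.
P_C = {True: 0.2, False: 0.8}   # P(Cavity)
P_T = {True: 0.6, False: 0.1}   # P(Toothache=true | Cavity)
P_K = {True: 0.9, False: 0.2}   # P(Catch=true | Cavity)

def joint(t, k, c):
    """Joint probability under the factorization P(C) P(T|C) P(Catch|C)."""
    pt = P_T[c] if t else 1 - P_T[c]
    pk = P_K[c] if k else 1 - P_K[c]
    return P_C[c] * pt * pk

# P(T=true | Catch=true, Cavity=true) computed from the joint...
p_t_given_kc = joint(True, True, True) / sum(
    joint(t, True, True) for t in (True, False))
# ...matches P(T=true | Cavity=true) = 0.6 read straight from the CPT.
```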

Dependency Shifts (Conditional Independence)

The flow of information in a network changes dynamically based on what you know (your observed evidence). This dictates whether variables are dependent or independent.

Joint Probability Distribution & Efficiency

This is the mathematical power of the network: instead of calculating a massive, unmanageable table for every possible combination of events, the network's structure allows you to factor the joint probability into smaller, manageable pieces.
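Concretely, the factorization P(x1,…,xn) = Π P(xi | parents(Xi)) shrinks the number of stored parameters. A sketch counting independent entries for a five-node binary network shaped like the burglary-alarm example (parent counts taken from its standard structure):

```python
n = 5
full_joint_entries = 2 ** n - 1   # one flat joint table: 31 independent numbers

# Factored form: each node stores 2^(number of parents) rows (binary variables).
num_parents = {"Burglary": 0, "Earthquake": 0,
               "Alarm": 2, "JohnCalls": 1, "MaryCalls": 1}
factored_entries = sum(2 ** k for k in num_parents.values())  # 1+1+4+2+2 = 10
```

The savings grow dramatically with n: the full joint doubles with every added variable, while the factored form grows only with each node's parent count.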

Why Use Bayesian Networks?

  1. Uncertainty: Real-world data is noisy and incomplete.
  2. Causality: Helps visualize "cause and effect" relationships.
  3. Efficiency: Allows a compact representation of the joint probability (as mentioned above).
  4. Inference: Allows us to update our beliefs as new evidence arrives.

6 - The Alarm Scenario & Global Semantics

The lecture introduces Judea Pearl's famous "Burglary Alarm" network, which consists of five variables: Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls.

Using the Global Semantics Formula, you can calculate the exact probability of a specific "complete path" of events. For example, the probability that John and Mary both call, the alarm sounds, but there is no burglary and no earthquake is calculated by multiplying the respective probabilities from the CPTs, resulting in approximately 0.00063.
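That product can be written out directly. The CPT values below are the standard textbook numbers for this network (an assumption here, since the lecture's exact tables aren't reproduced):

```python
# Standard burglary-alarm CPTs (textbook values, assumed).
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(j | A)
P_m = {True: 0.70, False: 0.01}                      # P(m | A)

# Global semantics: P(j, m, a, ~b, ~e) = P(j|a) P(m|a) P(a|~b,~e) P(~b) P(~e)
p = P_j[True] * P_m[True] * P_a[(False, False)] * (1 - P_b) * (1 - P_e)
# p comes out to about 0.00063, matching the figure above
```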


Problem 1 - Joint Probability

This slide walks through a specific joint probability calculation: finding P(j,¬m,¬a,b,e).

Problem 2 - Conditional Probability

Here, the lecture tackles a harder problem: calculating P(j,¬m|b). Because the states of the Alarm (A) and Earthquake (E) are not specified, they are considered "hidden variables" that must be marginalized (summed over).
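A sketch of that enumeration, summing the hidden variables E and A out of the factored joint (same assumed textbook CPTs as in the global-semantics example):

```python
# Standard burglary-alarm CPTs (textbook values, assumed).
P_e = 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(j | A)
P_m = {True: 0.70, False: 0.01}                      # P(m | A)

# P(j, ~m | b) = sum over e, a of P(e) P(a | b, e) P(j | a) P(~m | a)
total = 0.0
for e in (True, False):
    pe = P_e if e else 1 - P_e
    for a in (True, False):
        pa = P_a[(True, e)] if a else 1 - P_a[(True, e)]   # B fixed to true
        total += pe * pa * P_j[a] * (1 - P_m[a])
```

Each hidden variable doubles the number of terms to sum, which is why naive enumeration scales poorly on large networks.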


Problem 3 - Diagnostic Inference

This problem demonstrates "Bottom-Up" inference using Bayes' Rule: calculating the probability of a cause given an effect. The goal is to find the probability of an earthquake given that John called: P(e|j).
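One way to sketch it is enumeration over the factored joint (same assumed textbook CPTs): compute P(e, j) and P(j), then divide, which is Bayes' Rule in disguise.

```python
# Standard burglary-alarm CPTs (textbook values, assumed).
P_b, P_e = 0.001, 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(a | B, E)
P_j = {True: 0.90, False: 0.05}                      # P(j | A)

def p_j_and(e_fixed=None):
    """Sum P(b) P(e) P(a|b,e) P(j|a) over b and a (and e, unless fixed)."""
    total = 0.0
    for b in (True, False):
        pb = P_b if b else 1 - P_b
        for e in (True, False):
            if e_fixed is not None and e != e_fixed:
                continue
            pe = P_e if e else 1 - P_e
            for a in (True, False):
                pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
                total += pb * pe * pa * P_j[a]
    return total

p_e_given_j = p_j_and(e_fixed=True) / p_j_and()  # roughly 0.011
```

The posterior stays small: John's call raises the probability of an earthquake well above its 0.002 prior, but an earthquake remains an unlikely explanation.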


7 - Local Semantics & Markov Blanket

These slides define the structural rules that allow nodes to be isolated for calculations.

The Rain-Sprinkler-Grass Example

The lecture shifts to a new network to demonstrate the Markov Blanket.

Solving an Example Inference

This slide runs a standard joint probability calculation on the new network, finding P(c,¬s,r,w).
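With the standard textbook CPTs for this network (assumed here), the calculation is a chain of four factors, one per node given its parents:

```python
# Standard Cloudy/Sprinkler/Rain/WetGrass CPTs (textbook values, assumed).
P_c = 0.5
P_s = {True: 0.1, False: 0.5}    # P(Sprinkler=true | Cloudy)
P_r = {True: 0.8, False: 0.2}    # P(Rain=true | Cloudy)
P_w = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}  # P(WetGrass | S, R)

# P(c, ~s, r, w) = P(c) P(~s|c) P(r|c) P(w|~s,r)
p = P_c * (1 - P_s[True]) * P_r[True] * P_w[(False, True)]
```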

Numerical Inference & "Explaining Away"

This section uses the Markov Blanket to perform inference on the "Sprinkler" node.

Local Semantics vs. Markov Blanket

The final instructional slide contrasts the two main properties.

  1. Prediction (Local Semantics): Works top-down. If you know a parent state (e.g., it is raining), you ignore ancestors because the parents "shield" the node.

  2. Inference (Markov Blanket): Works bottom-up. When working backward from an effect (e.g., an employee is late), discovering one cause (an accident) will significantly decrease the probability of other competing causes, effectively "explaining away" the effect.
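The explaining-away effect can be checked numerically on the sprinkler network: once rain is also observed, the sprinkler becomes a much less likely explanation for the wet grass. A sketch by brute-force enumeration, with the standard textbook CPTs (assumed):

```python
from itertools import product

# Standard Cloudy/Sprinkler/Rain/WetGrass CPTs (textbook values, assumed).
P_c = 0.5
P_s = {True: 0.1, False: 0.5}    # P(Sprinkler=true | Cloudy)
P_r = {True: 0.8, False: 0.2}    # P(Rain=true | Cloudy)
P_w = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}  # P(WetGrass | S, R)

def joint(c, s, r, w):
    pc = P_c if c else 1 - P_c
    ps = P_s[c] if s else 1 - P_s[c]
    pr = P_r[c] if r else 1 - P_r[c]
    pw = P_w[(s, r)] if w else 1 - P_w[(s, r)]
    return pc * ps * pr * pw

def prob(query, evidence):
    """P(query | evidence) by enumerating every world over c, s, r, w."""
    num = den = 0.0
    for c, s, r, w in product((True, False), repeat=4):
        world = {"c": c, "s": s, "r": r, "w": w}
        p = joint(c, s, r, w)
        if all(world[k] == v for k, v in evidence.items()):
            den += p
            if all(world[k] == v for k, v in query.items()):
                num += p
    return num / den

p_s_w  = prob({"s": True}, {"w": True})             # wet grass alone
p_s_wr = prob({"s": True}, {"w": True, "r": True})  # rain explains it away
```

Observing rain drops the sprinkler's posterior from roughly 0.43 to roughly 0.19: the competing cause has "explained away" the evidence.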