Overview of Variable Elimination
Variable Elimination (VE) is an exact inference algorithm for computing marginal probability distributions, such as the distribution of a single query variable.
The process relies on two core operations:
- Product: Multiplying all factors that involve the variable you are about to eliminate.
- Sum-out (Marginalization): Summing over the variable to be eliminated to generate a new, smaller factor.
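The two operations can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's code: it assumes binary variables, represents a factor as a `(variables, table)` pair, and uses made-up numbers for two hypothetical variables `A` and `B`.

```python
from itertools import product as cartesian

# A factor is a pair (variables, table): `variables` is a tuple of names and
# `table` maps each assignment tuple (one 0/1 value per variable) to a number.

def multiply(f, g):
    """Factor product: the result's scope is the union of both scopes."""
    f_vars, f_table = f
    g_vars, g_table = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_table = {}
    for values in cartesian((0, 1), repeat=len(out_vars)):
        row = dict(zip(out_vars, values))
        f_key = tuple(row[v] for v in f_vars)
        g_key = tuple(row[v] for v in g_vars)
        out_table[values] = f_table[f_key] * g_table[g_key]
    return out_vars, out_table

def sum_out(f, var):
    """Marginalization: add up the entries that differ only in `var`."""
    f_vars, f_table = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_table = {}
    for values, p in f_table.items():
        key = values[:idx] + values[idx + 1:]
        out_table[key] = out_table.get(key, 0.0) + p
    return out_vars, out_table

# Hypothetical numbers: P(A) and P(B | A) for two binary variables.
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_given_a = (("A", "B"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8})

joint = multiply(p_a, p_b_given_a)   # factor over (A, B)
p_b = sum_out(joint, "A")            # factor over (B,)
print(p_b)
```

Eliminating `A` here is exactly one product followed by one sum-out; the resulting factor over `B` has two entries instead of the four in the intermediate product.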
Why Variable Elimination Beats Brute Force
The traditional "Brute Force" method computes the full joint distribution before marginalizing. This approach quickly hits an "Exponential Wall". For example, the full joint distribution for a chain of 50 binary variables requires a table of $2^{50}$ (about $10^{15}$) entries.
Variable Elimination solves this by only working with smaller, intermediate tables at each step. The computational complexity is determined by the maximum factor size created during the process, which is directly tied to the elimination order and the graph's "induced width".
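The gap between the two approaches can be made concrete on the 50-variable chain. The sketch below assumes a chain X1 -> X2 -> ... -> X50 with a shared transition table; the probabilities are illustrative placeholders, not values from the lecture. Eliminating the variables from the start of the chain never builds a factor with more than four entries.

```python
from itertools import product

N = 50  # chain length from the text: X1 -> X2 -> ... -> X50, all binary

# Assumed parameters (not from the lecture): P(X1) and one shared transition
# table P(X_{k+1} | X_k), indexed as transition[previous][next].
prior = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.3, 0.7]]

def chain_marginal(n):
    """P(X_n) by eliminating X1, X2, ... in order, one variable per step."""
    message = prior[:]                       # factor over the current variable only
    for _ in range(n - 1):
        # Product with the next transition table, then sum out the previous variable.
        message = [sum(message[a] * transition[a][b] for a in (0, 1))
                   for b in (0, 1)]
    return message

def brute_force_marginal(n):
    """P(X_n) by enumerating all 2**n joint assignments (only feasible for small n)."""
    result = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        p = prior[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= transition[a][b]
        result[xs[-1]] += p
    return result

print("joint table entries for N=50:", 2 ** N)
print("largest VE factor for the chain:", 2 * 2, "entries")
print("P(X_50) via VE:", chain_marginal(N))
print("check on a 10-node chain:", chain_marginal(10), brute_force_marginal(10))
```

The brute-force function is only called on a 10-node chain, where it can confirm that both methods agree; at 50 nodes it would have to visit every one of the joint assignments.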
Graph Transformations: Moralization & Elimination
To reason about VE graphically, the directed Bayesian Network is first converted into an undirected graph.
- Moralization: This involves "marrying the parents" (connecting all parents of the same child) and converting all directed arrows into undirected edges. This step ensures that the variables of a Conditional Probability Table like $P(G \mid I, D)$ form a complete clique.
- The Elimination Graph: When a variable $X$ is eliminated, a new factor is created whose scope is the union of the scopes of all factors involving $X$, minus $X$ itself. In the graph, this corresponds to making all current neighbors of $X$ a clique (creating "Fill-in Edges"), and then removing $X$ and its incident edges.
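Both graph operations are short enough to sketch directly. The code below is an illustration under assumptions: the edge structure (D -> G <- I, I -> S, G -> L) is the usual layout of a 5-node student network and is assumed here rather than quoted from the lecture, and the function names are hypothetical.

```python
from itertools import combinations

def moralize(parents):
    """Undirected adjacency sets built from a {child: [parents]} mapping."""
    adj = {node: set() for node in parents}
    for child, pars in parents.items():
        for p in pars:                       # drop edge directions
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(pars, 2):   # "marry" co-parents
            adj[a].add(b)
            adj[b].add(a)
    return adj

def eliminate(adj, var):
    """Connect var's neighbors into a clique, remove var, return the fill-in edges."""
    neighbors = adj.pop(var)
    fill_in = []
    for a, b in combinations(sorted(neighbors), 2):
        if b not in adj[a]:
            fill_in.append((a, b))
            adj[a].add(b)
            adj[b].add(a)
    for n in neighbors:
        adj[n].discard(var)
    return fill_in

# Assumed structure for the 5-node student network: D -> G <- I, I -> S, G -> L.
parents = {"D": [], "I": [], "G": ["D", "I"], "S": ["I"], "L": ["G"]}
graph = moralize(parents)
print(sorted(graph["D"]))          # D gains I as a neighbor through moralization

for var in ["S", "I", "D", "G"]:
    print(var, "fill-in edges:", eliminate(graph, var))

# A worse order for comparison: eliminating G first links all of its neighbors.
fresh = moralize(parents)
print("G first, fill-in edges:", eliminate(fresh, "G"))
```

Under the assumed structure, the order S, I, D, G adds no fill-in edges, while eliminating G first has to connect L to both D and I, which corresponds to a larger intermediate factor.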
Step-by-Step Example: Student Performance Network
The lecture walks through a 5-node network modeling a student's performance. The variables are $D$, $I$, $G$, $S$, and $L$.
The goal is to query the marginal of $L$, specifically the probability of getting a good letter.
- Eliminating S (The Barren Node): $S$ is an unobserved leaf child. According to the "Barren Node Rule", summing out an unobserved leaf node always produces a factor of 1s. This shows that $S$ has no impact on $L$ unless a specific value for $S$ is observed.
- Eliminating I: The algorithm takes the product of all factors involving $I$ ($P(I)$, $P(G \mid I, D)$, and the all-ones factor left by eliminating $S$) and sums out $I$. This results in a new 2-variable factor over $G$ and $D$, written here as $\tau_1(G, D)$.
- Eliminating D: It multiplies the factors involving $D$ ($P(D)$ and $\tau_1(G, D)$) and sums out $D$. This leaves a new factor over $G$, written here as $\tau_2(G)$.
- Eliminating G: Finally, the remaining factors $\tau_2(G)$ and $P(L \mid G)$ are multiplied together, and $G$ is summed out.
- Result: The remaining factor is the marginal $P(L)$. Its entry for a good letter is the probability that the student receives a good recommendation letter.
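The elimination sequence above can be traced in code. This is a sketch under stated assumptions, not a reproduction of the lecture's calculation: every variable is treated as binary, the edge structure (D -> G <- I, I -> S, G -> L) is assumed, and all CPT numbers are made-up placeholders, so the printed probability will not match the lecture's figure. The helper names (`cpt`, `eliminate`) are hypothetical.

```python
from itertools import product

def multiply(f, g):
    """Factor product over the union of both scopes (binary variables)."""
    (fv, ft), (gv, gt) = f, g
    ov = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for vals in product((0, 1), repeat=len(ov)):
        row = dict(zip(ov, vals))
        table[vals] = ft[tuple(row[v] for v in fv)] * gt[tuple(row[v] for v in gv)]
    return ov, table

def sum_out(f, var):
    """Sum a factor over one of its variables."""
    fv, ft = f
    i = fv.index(var)
    table = {}
    for vals, p in ft.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fv[:i] + fv[i + 1:], table

def eliminate(factors, var):
    """One VE step: multiply the factors mentioning var, sum var out, keep the rest."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    prod = touching[0]
    for f in touching[1:]:
        prod = multiply(prod, f)
    return rest + [sum_out(prod, var)]

def cpt(variables, rows):
    """Build a factor from rows of (parent values..., P(child = 1 | parents)).

    `variables` lists the parents first and the child last.
    """
    table = {}
    for *parent_vals, p1 in rows:
        table[(*parent_vals, 1)] = p1
        table[(*parent_vals, 0)] = 1.0 - p1
    return variables, table

# Placeholder CPTs (assumed values; 1 means hard / smart / good, 0 the opposite).
p_d = cpt(("D",), [(0.4,)])
p_i = cpt(("I",), [(0.3,)])
p_s = cpt(("I", "S"), [(0, 0.05), (1, 0.8)])
p_g = cpt(("I", "D", "G"), [(0, 0, 0.3), (0, 1, 0.05), (1, 0, 0.9), (1, 1, 0.5)])
p_l = cpt(("G", "L"), [(0, 0.1), (1, 0.9)])

# Barren node check: summing S out of P(S | I) leaves a factor of 1s over I.
print("after summing out S:", sum_out(p_s, "S"))

factors = [p_d, p_i, p_s, p_g, p_l]
for var in ["S", "I", "D", "G"]:
    factors = eliminate(factors, var)
    print("eliminated", var, "-> scopes:", [f[0] for f in factors])

marginal_l = factors[0]
print("P(L = 1) with the placeholder CPTs:", marginal_l[1][(1,)])
```

The scopes printed after each step follow the walkthrough: a factor over `I` after `S`, a two-variable factor over `D` and `G` after `I`, a factor over `G` after `D`, and finally a single factor over `L`.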
Key Takeaway
The order in which you eliminate variables is incredibly important. By strategically eliminating variables (such as eliminating the leaf node