Lecture 3 - Intensity Transformation and Spatial Filtering

1. Image Enhancement Fundamentals

Image enhancement is the process of manipulating an image so the result is more suitable than the original for a specific application.

Key Goals: Highlighting details of interest, removing noise, and making images more visually appealing.
Problem-Oriented: Enhancement techniques are application-specific; a method that works well for X-ray images may not be suitable for satellite images.

Spatial Domain Methods

Spatial domain operations operate directly on the pixels of the image and can be reduced to the form:

$g(x, y) = T[f(x, y)]$

where $f(x, y)$ is the input image, $g(x, y)$ is the processed image, and $T$ is an operator defined over a neighborhood of $(x, y)$.

Point Processing

The simplest spatial domain operation occurs when the neighborhood is just the pixel itself ($1 \times 1$). In that case $T$ depends only on the intensity at $(x, y)$ and takes the form:

$s = T(r)$

where $r$ is the original pixel value, $s$ is the processed pixel value, and $T$ is the intensity transformation (mapping) function.
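Because a point operation depends only on the pixel value, it can be evaluated once per intensity level and stored in a lookup table. The following is a minimal sketch, assuming an 8-bit grayscale image held in a NumPy array; the helper name `apply_point_transform` and the parameter `levels` are hypothetical and not part of the lecture.

```python
import numpy as np

def apply_point_transform(image, T, levels=256):
    """Apply a point transformation s = T(r) to an 8-bit image via a lookup table.

    `T` is any function that accepts an array of input levels r (assumption).
    """
    # Evaluate T once for every possible input level r = 0 .. levels - 1.
    r = np.arange(levels)
    lut = np.clip(np.rint(T(r)), 0, levels - 1).astype(np.uint8)
    # Indexing the table with the image applies T to each pixel independently.
    return lut[image]

image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
identity = apply_point_transform(image, lambda r: r)  # T(r) = r leaves the image unchanged
```

Any of the intensity transformations in the next section could be passed in as `T`, since each one maps a single input value to a single output value.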

2. Basic Intensity Transformation Functions

There are three basic types of mathematical functions used frequently for image enhancement: Linear, Logarithmic, and Power-Law.

A. Linear (Identity & Negative)

Image Negatives: The negative of an image with intensity levels in the range $[0, L - 1]$ is given by:
$s = L - 1 - r$
Application: Well suited to enhancing white or gray details embedded in large dark or black regions of an image.
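The negative formula can be sketched in a few lines, assuming an 8-bit image so that $L = 256$. The function name `negative` is a hypothetical choice for illustration.

```python
import numpy as np

def negative(image, L=256):
    """Image negative: s = L - 1 - r for intensities in [0, L - 1]."""
    # Compute in a wider integer type so the subtraction cannot wrap around.
    return (L - 1 - image.astype(np.int32)).astype(image.dtype)

image = np.array([[0, 10], [200, 255]], dtype=np.uint8)
neg = negative(image)
print(neg)  # [[255 245]
            #  [ 55   0]]
```

Applying the function twice returns the original image, since $L - 1 - (L - 1 - r) = r$.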

B. Logarithmic

Log Transformation: Maps a narrow range of low (dark) input gray-level values into a wider range of output values, and compresses the high (bright) values into a narrower range.
$s = c \times \log(1 + r)$
where $c$ is a scaling constant and $r \geq 0$.
Application: Useful for expanding the dark pixels in an image while compressing the higher-level values. A classic use is revealing detail in the Fourier spectrum of an image, whose values span a very large dynamic range.
Inverse Log: Performs the opposite mapping: it expands the bright values and compresses the dark ones.
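A rough sketch of the log transformation and its inverse follows. The lecture does not fix a value for $c$; here it is assumed to be $c = (L - 1) / \log(L)$ so that the brightest input $r = L - 1$ maps to $s = L - 1$. The function names are hypothetical.

```python
import numpy as np

def log_transform(image, L=256):
    """s = c * log(1 + r), with c chosen (assumption) so that L - 1 maps to L - 1."""
    c = (L - 1) / np.log(L)
    s = c * np.log1p(image.astype(np.float64))  # log1p(r) = log(1 + r)
    return np.rint(s).astype(np.uint8)

def inverse_log_transform(image, L=256):
    """Inverse mapping r = exp(s / c) - 1, using the same c as log_transform."""
    c = (L - 1) / np.log(L)
    r = np.expm1(image.astype(np.float64) / c)  # expm1(x) = exp(x) - 1
    return np.rint(r).astype(np.uint8)

image = np.array([[0, 10], [128, 255]], dtype=np.uint8)
brightened = log_transform(image)        # dark values are pushed upward
darkened = inverse_log_transform(image)  # mid values are pushed downward
```

With this choice of $c$ both functions keep 0 and 255 fixed, so only the distribution of the intermediate values changes.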

C. Power-Law (Gamma)

Gamma Transformation: Maps values based on an exponent $γ$.
$s = c \times r^{γ}$
where $c$ and $γ$ are positive constants.
Behavior: Fractional values of $γ$ ($γ < 1$) map a narrow range of dark input values into a wider range of output values, which brightens the image. Values of $γ$ greater than 1 do the opposite: they expand the bright range and darken the image. With $c = γ = 1$ the transformation reduces to the identity.
Gamma Correction: Many display devices do not respond linearly to input intensity; their response follows a power law, so images appear darker than intended. A power-law transformation with the reciprocal exponent is applied to the image before display to compensate for this hardware behavior.
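The power-law mapping can be illustrated as below. This sketch assumes an 8-bit image and normalizes intensities to $[0, 1]$ before applying the exponent, which is a common convention but not one stated in the lecture; the function name `gamma_transform` and the display exponent 2.2 are illustrative assumptions.

```python
import numpy as np

def gamma_transform(image, gamma, c=1.0, L=256):
    """s = c * r**gamma, with r normalized to [0, 1] (assumption) and rescaled to [0, L - 1]."""
    r = image.astype(np.float64) / (L - 1)
    s = c * np.power(r, gamma)
    # Clip in case c pushes values outside [0, 1], then return to 8-bit levels.
    return np.rint(np.clip(s, 0.0, 1.0) * (L - 1)).astype(np.uint8)

image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
lighter = gamma_transform(image, gamma=0.5)  # gamma < 1 expands the dark range
darker = gamma_transform(image, gamma=2.0)   # gamma > 1 expands the bright range

# Gamma correction sketch: precondition with 1/2.2 for a display assumed to have gamma 2.2.
corrected = gamma_transform(image, gamma=1 / 2.2)
```

Normalizing first keeps the end points 0 and 255 fixed for $c = 1$, so only the mid-tones move.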

3. Piecewise-Linear Transformation Functions

Unlike standard mathematical functions, piecewise-linear functions can be arbitrarily complex, allowing for highly customized intensity mappings.

A. Contrast Stretching

Contrast is the difference between the minimum and maximum pixel intensity in an image. Low contrast can result from poor illumination, a lack of dynamic range in the imaging sensor, or a wrong lens aperture setting.


Definition: A process that expands the range of intensity levels in an image so that it spans the full available intensity range of the display device (also known as normalization).
Min-Max Stretching Equation:
$I_{\text{new}} = (I - \text{Min}) \frac{\text{NewMax} - \text{NewMin}}{\text{Max} - \text{Min}} + \text{NewMin}$
where $I$ is the input intensity, $I_{\text{new}}$ is the output intensity, $(\text{NewMin}, \text{NewMax})$ is the target range (usually 0 to 255), and $(\text{Min}, \text{Max})$ is the original intensity range of the image.
Global vs. Local:
- Global: Increases contrast across the entire image uniformly.
- Local: Divides the image into small regions and performs contrast enhancement on each region independently.
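The global min-max stretch can be sketched as follows, assuming an 8-bit grayscale image and a target range inside $[0, 255]$. The function name `contrast_stretch` and the handling of a completely flat image are assumptions added for illustration. As a worked case, for an image whose intensities span 50 to 150, the value 70 maps to $(70 - 50) \times \frac{255 - 0}{150 - 50} + 0 = 51$.

```python
import numpy as np

def contrast_stretch(image, new_min=0, new_max=255):
    """Global min-max stretch: I_new = (I - Min) * (NewMax - NewMin) / (Max - Min) + NewMin."""
    img = image.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Flat image: there is no range to stretch (assumed behavior: map to new_min).
        return np.full(image.shape, new_min, dtype=np.uint8)
    out = (img - lo) * (new_max - new_min) / (hi - lo) + new_min
    return np.rint(out).astype(np.uint8)

image = np.array([[50, 70], [110, 150]], dtype=np.uint8)
stretched = contrast_stretch(image)
print(stretched)  # [[  0  51]
                  #  [153 255]]
```

A local variant would apply the same function to each region of the image separately instead of to the whole array.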

B. Thresholding

Thresholding converts a grayscale image into a binary image by comparing every pixel to a threshold value $m$ (or $k$).

$s = \begin{cases} 1 & \text{if } r \geq m \\ 0 & \text{if } r < m \end{cases}$

Process: If a pixel's intensity is greater than or equal to the threshold, it is mapped to 1 (or 255, white). If it is below the threshold, it is mapped to 0 (black).
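The thresholding rule translates directly into an element-wise comparison. This is a minimal sketch assuming an 8-bit image and white encoded as 255; the function name `threshold` and the parameter `high` are hypothetical.

```python
import numpy as np

def threshold(image, m, high=255):
    """Binary thresholding: pixels with r >= m become `high` (white), all others 0 (black)."""
    return np.where(image >= m, high, 0).astype(np.uint8)

image = np.array([[10, 127], [128, 200]], dtype=np.uint8)
binary = threshold(image, m=128)
print(binary)  # [[  0   0]
               #  [255 255]]
```

Note that a pixel exactly equal to the threshold (128 here) is mapped to white, matching the $r \geq m$ case.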


Summary

Transformation Type	Mathematical Formula	Behavior / Characteristics	Primary Application
Image Negative (Linear)	$s = L - 1 - r$	Reverses the intensity levels of an image.	Enhancing white or gray details embedded in large dark/black regions (e.g., medical X-rays).
Logarithmic	$s = c \times \log (1 + r)$	Expands a narrow range of dark input values while compressing higher-level values.	Revealing hidden details in images with massive dynamic ranges, like the Fourier spectrum.
Power-Law (Gamma)	$s = c \times r^{γ}$	Maps values based on an exponent. • $γ < 1$: Lightens image (expands darks). • $γ > 1$: Darkens image (expands brights).	Gamma Correction: Preconditioning images to display correctly on monitors that do not respond linearly to intensity.
Contrast Stretching (Piecewise-Linear)	$I_{\text{new}} = (I - \text{Min}) \frac{\text{NewMax} - \text{NewMin}}{\text{Max} - \text{Min}} + \text{NewMin}$	Stretches the original intensity range to span a new, wider target range (usually $0$ to $255$).	Normalizing images with poor illumination or fixing low dynamic range sensor captures.
Thresholding (Piecewise-Linear)	$s = 1$ if $r \geq m$; $s = 0$ if $r < m$	Maps pixels at or above a threshold $m$ to White (1/255) and those below it to Black (0).	Binarizing an image to segment or isolate specific objects from the background.
(Note: $r$ is the input intensity, $s$ is the output intensity, $L$ is the number of intensity levels, $c$ is a scaling constant, and $m$ is the threshold value).