Feature Extraction and Image Descriptors

Understanding key techniques for image feature representation, from fundamental concepts to advanced methods

What is Feature Extraction?

Feature extraction is the process of detecting, describing, and representing salient characteristics of an image in a numerical or symbolic form. The goal is to reduce the dimensionality of the original image data (often consisting of millions of raw pixels) into a more compact and discriminative representation. These features may capture color, shape, texture, or other domain-specific attributes that are relevant for tasks like classification, object recognition, registration, and image retrieval.

In essence, feature extraction translates raw pixel intensities into descriptors or signatures that can be more easily interpreted or compared by machine learning algorithms, thereby facilitating tasks such as image matching, pattern analysis, and decision-making.
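As a minimal illustration of this idea, even trivial per-channel statistics form a feature vector. The sketch below (using a small synthetic NumPy array in place of real pixel data) reduces a 4×4 RGB image to just six numbers:

```python
import numpy as np

# Synthetic 4x4 RGB "image" standing in for real pixel data
image = np.arange(48, dtype=np.float64).reshape(4, 4, 3)

# A trivial 6-dimensional descriptor: per-channel mean and standard deviation
means = image.mean(axis=(0, 1))
stds = image.std(axis=(0, 1))
feature_vector = np.concatenate([means, stds])

print(feature_vector)  # 6 numbers summarizing 48 pixel values
```

Real descriptors are far richer, but the principle is the same: many raw intensities become a compact vector that downstream algorithms can compare.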

Importance of Feature Extraction

Feature extraction plays a critical role in computer vision and image processing because it:

- Reduces dimensionality, turning millions of raw pixel values into a compact representation
- Produces discriminative descriptors that machine learning algorithms can compare efficiently
- Focuses on salient characteristics, improving robustness to noise and irrelevant variation
- Enables downstream tasks such as classification, object recognition, registration, and image retrieval

Types of Features

Although features can vary in their mathematical formulation, they are often conceptually grouped as:

- Color features, which describe the distribution of color values (e.g., histograms)
- Texture features, which capture repetitive local patterns (e.g., GLCM statistics, LBP)
- Shape features, which describe the geometric properties of objects (e.g., contours, moments)
- Local features (keypoints), distinctive points paired with descriptors that are robust to transformations (e.g., ORB)

Common Feature Extraction Techniques

A wide array of feature extraction methods exist, each suited for different applications and image conditions. Below are some of the most commonly used techniques, along with brief code snippets in Python and MATLAB.

1. Edge Detection

Edge detection highlights the boundaries of objects by capturing sharp changes or discontinuities in intensity. It is often an important first step in further shape or contour analysis.

Python
MATLAB
import cv2

# Load image in grayscale
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Apply Canny edge detection
edges = cv2.Canny(image, 50, 150)
cv2.imwrite('edges.jpg', edges)
            
% Edge detection
I = imread('image.jpg');
Igray = rgb2gray(I);
edges = edge(Igray, 'Canny', [0.1 0.3]);
imwrite(edges, 'edges.jpg');
            

2. Texture Analysis

Texture features aim to characterize the repetitive or quasi-repetitive patterns in an image. They are particularly useful in applications like material classification, medical diagnostics (e.g., detecting abnormalities in tissue textures), and face recognition.

Python
MATLAB
from skimage.feature import graycomatrix, graycoprops
import cv2

# Load or define a grayscale image
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Compute GLCM
glcm = graycomatrix(image, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)
contrast = graycoprops(glcm, 'contrast')[0, 0]
print("Contrast:", contrast)
            
% Compute GLCM in MATLAB
I = imread('image.jpg');
Igray = rgb2gray(I);
offsets = [0 1];  % Horizontal neighbor
glcm = graycomatrix(Igray, 'Offset', offsets, 'Symmetric', true);
stats = graycoprops(glcm, 'Contrast');
contrast = stats.Contrast;
disp(['Contrast: ', num2str(contrast)]);
            

3. Shape Descriptors

Shape-based features describe the geometric properties of objects in an image. They often rely on boundary or contour extraction, enabling tasks such as object classification by shape, pose estimation, or defect detection.

Python
MATLAB
import cv2
import numpy as np

# Load grayscale image and threshold it to a binary mask
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)

# Find contours in the binary mask
contours, hierarchy = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw contours on a color copy
color_image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
cv2.drawContours(color_image, contours, -1, (0, 255, 0), 2)
cv2.imwrite('contours.jpg', color_image)

# Compute Hu Moments for the first contour (example)
if contours:
    moments = cv2.moments(contours[0])
    huMoments = cv2.HuMoments(moments)
    print("Hu Moments for first contour:", huMoments.flatten())
            
% Find contours and compute shape descriptors in MATLAB
I = imread('image.jpg');
Igray = rgb2gray(I);
BW = imbinarize(Igray);
BW = imfill(BW, 'holes');

% Extract boundaries
boundaries = bwboundaries(BW, 'noholes');
Ioverlay = I;

for k = 1:length(boundaries)
    boundary = boundaries{k};
    for n = 1:size(boundary, 1)
        row = boundary(n,1);
        col = boundary(n,2);
        Ioverlay(row, col, :) = [0, 255, 0]; % Draw boundary in green
    end
end

% Compute shape descriptors for each region (note: regionprops has no
% 'Moments' or Hu-moment property; use geometric properties like these)
stats = regionprops(BW, 'Area', 'Perimeter', 'Eccentricity');

imwrite(Ioverlay, 'contours.jpg');
            

4. Keypoint Detection and Matching

Keypoints (a.k.a. interest points or salient points) are well-defined, highly distinctive points in an image. They form the basis for many image alignment, stitching, and recognition algorithms, as each keypoint can be associated with a local descriptor that is invariant (or robust) to transformations such as rotation or scaling.

Python
MATLAB
import cv2

# ORB keypoint detection
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
orb = cv2.ORB_create()
keypoints, descriptors = orb.detectAndCompute(image, None)
output = cv2.drawKeypoints(image, keypoints, None, color=(0,255,0))
cv2.imwrite('keypoints.jpg', output)
            
% ORB keypoint detection in MATLAB
I = imread('image.jpg');
Igray = rgb2gray(I);

points = detectORBFeatures(Igray);
[features, validPoints] = extractFeatures(Igray, points);

% Create an output image showing keypoints
output = insertMarker(I, validPoints.Location, 'circle', 'Color', 'red');
imwrite(output, 'keypoints.jpg');
            

5. Color Descriptors

Color-based features quantify the distribution and relationships of colors within an image. They play an essential role in applications like content-based image retrieval (CBIR), scene understanding, and object detection in color-based contexts.

Python
MATLAB
import cv2
import numpy as np

image = cv2.imread('image.jpg')  # in BGR by default

# Compute grayscale histogram (for demonstration)
hist = cv2.calcHist([cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)], [0], None, [256], [0,256])
cv2.normalize(hist, hist, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
print("Normalized Grayscale Histogram:", hist.flatten())
            
% Compute grayscale histogram in MATLAB
I = imread('image.jpg');
if size(I,3) == 3
    I = rgb2gray(I);
end

counts = imhist(I, 256);
counts = counts / sum(counts);  % Normalize the histogram
disp('Normalized Grayscale Histogram:');
disp(counts');
            

Applications of Feature Extraction

Feature extraction underpins a wide variety of real-world image processing and computer vision applications:

- Content-based image retrieval (CBIR), where color and texture signatures index large image collections
- Object recognition and classification based on shape, texture, or keypoint descriptors
- Image registration and stitching via keypoint detection and matching
- Medical image analysis, such as detecting abnormalities in tissue textures
- Industrial inspection, including defect detection on manufactured parts

Challenges and Considerations

Although feature extraction can greatly simplify complex image data, several hurdles remain:

- Invariance: descriptors should remain stable under changes in illumination, scale, rotation, and viewpoint
- Parameter sensitivity: thresholds (e.g., Canny's hysteresis values) and neighborhood sizes often require per-application tuning
- Computational cost: dense or high-dimensional descriptors can be expensive to extract and match
- Noise and occlusion, which can corrupt or hide the salient structures a descriptor relies on

Further Learning Resources

For a deeper exploration of feature extraction methods and best practices, consult standard computer vision textbooks and the official documentation of the libraries used above (OpenCV, scikit-image, and MATLAB's Image Processing Toolbox).

Interactive Demos

Local Binary Patterns (LBP)

Local Binary Patterns (LBP) is a simple yet efficient texture descriptor. For each pixel, it looks at the 3×3 neighborhood, compares each neighbor's intensity to the center pixel, and forms an 8-bit binary code. We then convert that code to a decimal value, which can be visualized (often used for texture analysis).

(We keep the neighborhood small for simplicity; typical LBP uses a 3×3 block, i.e., a ring of radius 1, though larger circular rings are also common.)
How it works:
For each pixel (x, y), we collect the intensities of its neighbors in a circular or square neighborhood (here, a 3×3 block for radius=1). Each neighbor that is >= the center pixel is assigned a bit value of 1; otherwise, 0. These bits form an 8-bit number, which we map to a grayscale value to display the LBP pattern.
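The procedure just described can be sketched in NumPy (a plain 3×3, radius-1 LBP; libraries such as scikit-image's `local_binary_pattern` offer more general variants):

```python
import numpy as np

def lbp_3x3(image):
    """Basic radius-1 LBP: one 8-bit code per interior pixel."""
    img = image.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Eight neighbors, ordered clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels
        neighbor = img[1 + dy : h - 1 + dy, 1 + dx : w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)

# Tiny example: a 3x3 patch whose right column is brighter than the center
patch = np.array([[10, 10, 200],
                  [10, 50, 200],
                  [10, 10, 200]], dtype=np.uint8)
print(lbp_3x3(patch))  # the three bright right-hand neighbors set bits 2, 3, 4
```

Each code is already a grayscale value in 0–255, which is why the LBP result can be displayed directly as an image.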
Color Histogram

A color histogram is one of the simplest global descriptors. It counts how often each color value appears. Below, we compute a histogram for each channel (Red, Green, Blue) and display it as a simple bar chart.

(The demo uses 16 bins per channel; fewer bins give a coarser histogram, more bins a finer one.)
How it works:
We read the pixel values from the original image. For each channel (R, G, B), we increment the appropriate bin index based on the pixel’s intensity. Then we normalize the counts and draw a bar chart for each channel in a different color.
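These steps can be sketched in Python (a synthetic array stands in for the demo image; `np.bincount` does the per-bin counting):

```python
import numpy as np

def channel_histograms(image, bins=16):
    """Per-channel color histograms, each normalized to sum to 1."""
    hists = []
    for c in range(image.shape[2]):
        # Map 0-255 intensities to bin indices 0..bins-1
        idx = (image[..., c].astype(np.int64) * bins) // 256
        counts = np.bincount(idx.ravel(), minlength=bins)
        hists.append(counts / counts.sum())
    return np.stack(hists)

# Synthetic 8x8 RGB image in place of the demo image
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

hists = channel_histograms(image, bins=16)
print(hists.shape)  # one 16-bin histogram per channel: (3, 16)
```

Normalizing the counts makes histograms comparable across images of different sizes, which is what CBIR systems rely on.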