Image Representation and Fundamentals

Understanding how digital images are represented, stored, and processed

What is an Image?

In the broadest sense, an image is a two-dimensional function f(x, y) of spatial coordinates x and y, where the value of f(x, y) (often referred to as intensity or gray level) represents the brightness or color at each point. When we talk about digital images, we specifically mean that this continuous function has been discretized into a finite grid of pixels (picture elements). Each pixel corresponds to a specific spatial location, and its numerical value(s) encode intensity or color information.
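The idea of sampling a continuous function f(x, y) onto a finite pixel grid can be sketched in a few lines of NumPy. The radial-gradient function below is purely illustrative, not from the original text:

```python
import numpy as np

# A toy "continuous" intensity function f(x, y): a radial gradient
# centered at (2, 2), clipped to the valid 8-bit range [0, 255].
def f(x, y):
    return np.clip(255 - 40 * np.hypot(x - 2, y - 2), 0, 255)

# Discretize f onto a 5x5 grid: each pixel stores one sampled intensity.
ys, xs = np.mgrid[0:5, 0:5]
image = f(xs, ys).astype(np.uint8)

print(image.shape)  # (5, 5) -- height x width
print(image[2, 2])  # brightest sample at the center: 255
```

Each entry of the resulting array is a pixel: a spatial location paired with a quantized intensity value.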

From a practical standpoint, digital images underlie numerous applications, ranging from everyday photography to highly specialized fields such as medical imaging and satellite remote sensing. Understanding how these pixel values are stored, manipulated, and interpreted is essential for designing and implementing effective image processing techniques.

Types of Digital Images

Digital images vary not only in resolution or size but also in how they store color and intensity data. Below are the most common types you will encounter:
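As one concrete illustration, the commonly distinguished types (binary, grayscale, and RGB color) differ in the shape and value range of the underlying array. The dimensions below are arbitrary examples:

```python
import numpy as np

h, w = 4, 6  # arbitrary image dimensions for illustration

# Binary image: one bit of information per pixel (stored here as booleans).
binary = np.zeros((h, w), dtype=bool)

# Grayscale image: one 8-bit intensity per pixel, 0 (black) to 255 (white).
gray = np.full((h, w), 128, dtype=np.uint8)

# RGB color image: three 8-bit channels (red, green, blue) per pixel.
rgb = np.zeros((h, w, 3), dtype=np.uint8)
rgb[..., 0] = 255  # set the red channel: a solid red image

print(binary.shape, gray.shape, rgb.shape)  # (4, 6) (4, 6) (4, 6, 3)
```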

Image Resolution

The term resolution can refer to multiple concepts in digital imaging, all of which influence how well details or changes in intensity can be represented:
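Two of these concepts, spatial resolution (the pixel grid density) and intensity resolution (the number of distinguishable gray levels, set by bit depth), can be demonstrated by degrading a test image along each axis independently. This is a sketch, not from the original text:

```python
import numpy as np

# A 16x16 test image containing every 8-bit gray level exactly once.
gradient = np.arange(256, dtype=np.uint8).reshape(16, 16)

# Lower spatial resolution: keep every 4th pixel in each direction.
downsampled = gradient[::4, ::4]   # 16x16 -> 4x4

# Lower intensity resolution: requantize 256 levels down to 4 levels
# (equivalent to 2 bits per pixel instead of 8).
quantized = (gradient // 64) * 64  # values restricted to {0, 64, 128, 192}

print(downsampled.shape)      # (4, 4)
print(np.unique(quantized))   # [  0  64 128 192]
```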

Color Spaces

A color space defines how color information is structured and interpreted. Various color spaces offer different advantages for processing, visualization, and hardware implementation:
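For instance, converting between RGB and HSV separates chromaticity (hue, saturation) from brightness (value), which is often more convenient for processing than raw RGB. Python's standard-library colorsys module handles single pixels:

```python
import colorsys

# Pure red in RGB, with channel values normalized to [0, 1].
r, g, b = 1.0, 0.0, 0.0

# Convert to HSV: hue 0.0 corresponds to red, saturation 1.0 means
# fully saturated, value 1.0 means full brightness.
h, s, v = colorsys.rgb_to_hsv(r, g, b)
print(h, s, v)  # 0.0 1.0 1.0

# The conversion is invertible: round-trip back to RGB.
print(colorsys.hsv_to_rgb(h, s, v))  # (1.0, 0.0, 0.0)
```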

Image File Formats

Images are stored in a variety of file formats, each optimized for particular use cases regarding compression, quality, and metadata:
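Many of these formats can be recognized by their leading "magic bytes." The signature values below are the standard ones for each format; the helper function itself is just an illustrative sketch:

```python
# Standard file signatures ("magic bytes") at the start of common
# image file formats.
SIGNATURES = {
    b'\x89PNG\r\n\x1a\n': 'PNG',
    b'\xff\xd8\xff': 'JPEG',
    b'GIF87a': 'GIF',
    b'GIF89a': 'GIF',
    b'BM': 'BMP',
}

def detect_format(header: bytes) -> str:
    """Identify an image format from the first bytes of a file."""
    for signature, name in SIGNATURES.items():
        if header.startswith(signature):
            return name
    return 'unknown'

print(detect_format(b'\x89PNG\r\n\x1a\n'))      # PNG
print(detect_format(b'\xff\xd8\xff\xe0'))       # JPEG
print(detect_format(b'hello'))                  # unknown
```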

Image Representation in Programming

In most programming environments, images are represented as multi-dimensional arrays. For grayscale images, a 2D array (height × width) of intensity values is sufficient. For color images, the array often has an additional dimension for channels (height × width × channels). Below are simple code examples illustrating how images might be loaded, displayed, and inspected in Python and MATLAB.

Python:
import cv2
import numpy as np

# Load an image in grayscale
image = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Check if image is loaded correctly
if image is None:
    print("Error: Could not load image.")
else:
    # Get image dimensions
    height, width = image.shape
    print(f"Height: {height}, Width: {width}")

    # Display the image
    cv2.imshow('Grayscale Image', image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
MATLAB:

% Load an image and convert it to grayscale
image = imread('image.jpg');
gray_image = rgb2gray(image);

% Get image dimensions
[height, width] = size(gray_image);
disp(['Height: ', num2str(height), ', Width: ', num2str(width)]);

% Display the image
imshow(gray_image);
title('Grayscale Image');

Image Processing Fundamentals

Once images are digitized, a variety of fundamental operations can be applied to analyze, enhance, or transform them. These building blocks serve as the basis for more complex algorithms in computer vision, deep learning, and beyond:

Beyond these fundamentals, image processing also encompasses morphological operations, frequency domain analysis (Fourier transforms), feature extraction, pattern recognition, and machine learning-based methods. Mastery of these core topics provides a strong foundation for tackling advanced research or industrial projects in imaging.

Further Learning Resources

Below are some useful references and tools to deepen your understanding of image representation and processing: