Graphics Transforms

Graphics transforms are fundamental operations used to manipulate the position, orientation, and size of objects in a 2D or 3D scene. They are essential for creating complex scenes, animating objects, and projecting 3D models onto a 2D screen.

Types of Transforms

The primary types of geometric transforms are:

Translation: Moving an object from one position to another.
Rotation: Turning an object around a fixed point or axis.
Scaling: Resizing an object, making it larger or smaller.
Shearing: Skewing an object, distorting its shape.

Homogeneous Coordinates and Matrices

Translation is not a linear map, so it cannot be written as a 2x2 matrix (or a 3x3 matrix in 3D) acting on ordinary coordinates. Homogeneous coordinates add one extra component so that translation, rotation, and scaling can all be expressed as matrices and combined into a single operation. In 2D graphics, a point (x, y) is represented as (x, y, 1), and in 3D, a point (x, y, z) is represented as (x, y, z, 1).

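Each transform can be represented by a matrix. Applying a transform to a point means multiplying the transform matrix by the point's vector. This document uses the column-vector convention: the point is written as a column and the matrix appears on its left.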
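The multiplication described above can be sketched as follows. This is a minimal illustration, assuming matrices are stored as nested arrays of rows and points are treated as column vectors; the helper name transformPoint2D is hypothetical and does not come from any particular API.

```javascript
// Hypothetical helper: multiplies a 3x3 matrix (array of rows) by the
// homogeneous column vector (x, y, 1) and returns the transformed point.
function transformPoint2D(m, point) {
    const v = [point.x, point.y, 1]; // promote (x, y) to homogeneous form
    const out = [0, 0, 0];
    for (let row = 0; row < 3; row++) {
        for (let col = 0; col < 3; col++) {
            out[row] += m[row][col] * v[col];
        }
    }
    // For affine matrices out[2] stays 1, so no divide is needed here.
    return { x: out[0], y: out[1] };
}

// Usage: a matrix that moves points by (5, -2).
const moved = transformPoint2D(
    [[1, 0, 5],
     [0, 1, -2],
     [0, 0, 1]],
    { x: 3, y: 4 }
);
console.log(moved); // { x: 8, y: 2 }
```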

2D Transforms (3x3 Matrix)

A 2D transformation matrix typically looks like this:


[ a  b  tx ]
[ c  d  ty ]
[ 0  0  1  ]

a, b, c, d control scaling, rotation, and shearing.
tx, ty represent translation.
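To make the roles of a, b, c, and d more concrete, here is a small sketch of scaling and shearing matrices in the same layout. The function names createScalingMatrix and createShearMatrix are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helpers using the [a b tx; c d ty; 0 0 1] layout.

// Scaling: a = sx and d = sy stretch each axis; there is no translation.
// Results in x' = sx * x, y' = sy * y
function createScalingMatrix(sx, sy) {
    return [
        [sx, 0,  0],
        [0,  sy, 0],
        [0,  0,  1]
    ];
}

// Shearing: b = shx offsets x in proportion to y,
// and c = shy offsets y in proportion to x.
// Results in x' = x + shx * y, y' = y + shy * x
function createShearMatrix(shx, shy) {
    return [
        [1,   shx, 0],
        [shy, 1,   0],
        [0,   0,   1]
    ];
}
```

Setting sx = sy gives a uniform scale, and setting both shear factors to 0 gives the identity matrix.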

3D Transforms (4x4 Matrix)

A 3D transformation matrix is a 4x4 matrix:


[ m11 m12 m13 m14 ]
[ m21 m22 m23 m24 ]
[ m31 m32 m33 m34 ]
[ m41 m42 m43 m44 ]

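For affine transformations written with the column-vector convention, the last row is [0 0 0 1]. The upper-left 3x3 block (m11 through m33) handles rotation, scaling, and shearing, while the first three elements of the last column (m14, m24, m34) handle translation. Perspective projection matrices are the common exception, since their last row differs from [0 0 0 1].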
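As a rough sketch of the 4x4 layout, the following builds a 3D translation matrix and applies it to homogeneous vectors. The names createTranslationMatrix3D and transformVector4 are hypothetical, and the column-vector convention is assumed. The example also shows that a vector with w = 0 (a direction rather than a position) is unaffected by translation.

```javascript
// Hypothetical helper: 4x4 translation matrix, column-vector convention.
// The translation sits in the first three elements of the last column.
function createTranslationMatrix3D(tx, ty, tz) {
    return [
        [1, 0, 0, tx],
        [0, 1, 0, ty],
        [0, 0, 1, tz],
        [0, 0, 0, 1]
    ];
}

// Hypothetical helper: multiply a 4x4 matrix by the column vector [x, y, z, w].
function transformVector4(m, v) {
    return m.map(row => row[0] * v[0] + row[1] * v[1] + row[2] * v[2] + row[3] * v[3]);
}

const translate3D = createTranslationMatrix3D(1, 2, 3);
console.log(transformVector4(translate3D, [10, 20, 30, 1])); // [ 11, 22, 33, 1 ]
console.log(transformVector4(translate3D, [10, 20, 30, 0])); // [ 10, 20, 30, 0 ]
```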

Concatenating Transforms

One of the key advantages of using matrices is that multiple transformations can be combined (concatenated) into a single matrix by multiplying their respective matrices. The combined matrix is computed once, and each vertex then needs only one matrix-vector multiplication instead of one per transform.

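The order of multiplication is crucial because matrix multiplication is not commutative. With column vectors, the rightmost matrix is applied first: in \( M_{combined} = M_{transform3} \times M_{transform2} \times M_{transform1} \), transform1 is applied to the point first and transform3 last.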
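The effect of multiplication order can be seen in a short sketch. The helper multiplyMatrices3x3 is hypothetical, and the column-vector convention is assumed, so the rightmost matrix in a product acts on the point first.

```javascript
// Hypothetical helper: product a * b of two 3x3 matrices stored as arrays of rows.
function multiplyMatrices3x3(a, b) {
    const result = [[0, 0, 0], [0, 0, 0], [0, 0, 0]];
    for (let i = 0; i < 3; i++) {
        for (let j = 0; j < 3; j++) {
            for (let k = 0; k < 3; k++) {
                result[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return result;
}

const translateX10 = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]; // move 10 units along x
const scaleBy2 = [[2, 0, 0], [0, 2, 0], [0, 0, 1]];      // uniform scale by 2

// Scale first, then translate: the offset stays at 10.
const scaleThenTranslate = multiplyMatrices3x3(translateX10, scaleBy2);
// Translate first, then scale: the offset is scaled too and becomes 20.
const translateThenScale = multiplyMatrices3x3(scaleBy2, translateX10);

console.log(scaleThenTranslate[0][2]); // 10
console.log(translateThenScale[0][2]); // 20
```

Both products contain the same scale factors in the upper-left block; only the translation column differs, which is why swapping the order visibly moves the object.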

World, View, and Projection Transforms

In 3D graphics, transformations are often categorized into:

World Transform: Positions and orients objects in the 3D world space, mapping each object's local coordinates to world coordinates.
View Transform: Maps world-space coordinates into the camera's (view) space. The view matrix is the inverse of the camera's own world transform.
Projection Transform: Maps the 3D view frustum to clip space, which becomes normalized device coordinates after the perspective divide (e.g., perspective or orthographic projection).
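As an illustration of the projection step, here is a sketch of a perspective matrix and the divide by w that follows it. It assumes OpenGL-style conventions: the camera looks down the negative z axis in view space and normalized device depth spans [-1, 1]. Other APIs, such as Direct3D and Vulkan, use a [0, 1] depth range and therefore a different matrix. The names createPerspectiveMatrix and projectToNdc are hypothetical.

```javascript
// Hypothetical helper: OpenGL-style perspective matrix (column-vector convention).
// Assumes the camera looks down -z in view space and NDC depth spans [-1, 1].
function createPerspectiveMatrix(fovYRadians, aspect, near, far) {
    const f = 1 / Math.tan(fovYRadians / 2);
    return [
        [f / aspect, 0, 0, 0],
        [0, f, 0, 0],
        [0, 0, (far + near) / (near - far), (2 * far * near) / (near - far)],
        [0, 0, -1, 0] // not [0 0 0 1]: this row copies -z into w
    ];
}

// Hypothetical helper: multiply by [x, y, z, 1] to get clip coordinates,
// then divide by w (the perspective divide) to reach NDC.
function projectToNdc(projection, viewSpacePoint) {
    const v = [...viewSpacePoint, 1];
    const clip = projection.map(row =>
        row.reduce((sum, value, i) => sum + value * v[i], 0));
    return clip.slice(0, 3).map(component => component / clip[3]);
}

const projection = createPerspectiveMatrix(Math.PI / 2, 1, 1, 100);
const onNearPlane = projectToNdc(projection, [0, 0, -1]);  // NDC z is approximately -1
const onFarPlane = projectToNdc(projection, [0, 0, -100]); // NDC z is approximately +1
```

In a full pipeline, a vertex would typically be multiplied by the world matrix, then the view matrix, then the projection matrix, which under the column-vector convention is the product projection × view × world.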

Matrix Operations in Graphics APIs

Graphics APIs and their companion math libraries provide functions to create and manipulate transformation matrices (for example, DirectXMath for DirectX, or GLM for OpenGL and Vulkan).

Example: 2D Translation Matrix

To translate a point (x, y) by (tx, ty):


// Conceptual representation
function createTranslationMatrix(tx, ty) {
    return [
        [1, 0, tx],
        [0, 1, ty],
        [0, 0, 1]
    ];
}

// Applying the transform
// [x'] = [1 0 tx] [x]
// [y'] = [0 1 ty] [y]
// [1 ] = [0 0 1 ] [1]
// Results in x' = x + tx, y' = y + ty

Example: 2D Rotation Matrix (around origin)

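To rotate a point (x, y) about the origin by an angle θ counter-clockwise (in a coordinate system where y points up; where y points down, as in many screen coordinate systems, the same matrix appears to rotate clockwise):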


// Conceptual representation
function createRotationMatrix(angleInRadians) {
    const cos = Math.cos(angleInRadians);
    const sin = Math.sin(angleInRadians);
    return [
        [cos, -sin, 0],
        [sin,  cos, 0],
        [0,    0,   1]
    ];
}

// Applying the transform
// [x'] = [cos -sin 0] [x]
// [y'] = [sin  cos 0] [y]
// [1 ] = [0    0   1] [1]
// Results in x' = x*cos - y*sin, y' = x*sin + y*cos
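Because the rotation matrix above always rotates about the origin, rotating about another pivot requires concatenating it with translations. The following is a minimal sketch under the column-vector convention; the helper names (translation, rotation, multiply, createRotationAboutPoint) are hypothetical and are redefined here so the example is self-contained.

```javascript
// Hypothetical helpers, redefined here so the sketch is self-contained.
const translation = (tx, ty) => [[1, 0, tx], [0, 1, ty], [0, 0, 1]];
const rotation = (angleInRadians) => {
    const c = Math.cos(angleInRadians);
    const s = Math.sin(angleInRadians);
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]];
};
// Product a * b of two 3x3 matrices stored as arrays of rows.
const multiply = (a, b) =>
    a.map(row => b[0].map((_, j) =>
        row.reduce((sum, value, k) => sum + value * b[k][j], 0)));

// Rotate about (px, py): move the pivot to the origin, rotate, move it back.
// The rightmost factor is applied first under the column-vector convention.
function createRotationAboutPoint(angleInRadians, px, py) {
    return multiply(
        translation(px, py),
        multiply(rotation(angleInRadians), translation(-px, -py))
    );
}

// Rotate the point (2, 1) by 90 degrees counter-clockwise about the pivot (1, 1).
const pivotRotation = createRotationAboutPoint(Math.PI / 2, 1, 1);
const rotatedX = pivotRotation[0][0] * 2 + pivotRotation[0][1] * 1 + pivotRotation[0][2];
const rotatedY = pivotRotation[1][0] * 2 + pivotRotation[1][1] * 1 + pivotRotation[1][2];
console.log(rotatedX.toFixed(3), rotatedY.toFixed(3)); // 1.000 2.000
```

The results are rounded for display because Math.cos(Math.PI / 2) is a very small nonzero number in floating point rather than exactly 0.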