DirectCompute Graphics API Reference

Welcome to the DirectCompute API reference. DirectCompute is a DirectX technology that allows you to harness the parallel processing power of the graphics processing unit (GPU) for general-purpose computing tasks.

Introduction to DirectCompute

DirectCompute enables developers to move computationally intensive tasks, such as physics simulations, image processing, video encoding/decoding, and complex algorithms, from the CPU to the GPU. This can improve performance substantially for data-parallel workloads, where the same operation is applied independently to many elements.

It builds on the familiar DirectX API structure, making it accessible to developers already working with DirectX for graphics rendering.

Core Concepts

Compute Shaders

Compute shaders are the cornerstone of DirectCompute. They are executed on the GPU and can operate on arbitrary data structures. Unlike pixel or vertex shaders, compute shaders are not tied to the graphics pipeline and can be invoked independently.

A typical compute shader workflow involves:

  1. Dispatching Compute Work: Using functions like Dispatch to launch a grid of thread groups.
  2. Processing Data: Threads within each group process data from input buffers or textures.
  3. Writing Results: Threads write their computed results to output buffers or textures.

Thread Organization

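DirectCompute organizes work into thread groups. Each Dispatch call launches a three-dimensional grid of groups, and each group contains a fixed number of threads set by the shader's numthreads attribute. Threads within a group can cooperate through group-shared memory (groupshared variables in HLSL) and synchronization primitives such as GroupMemoryBarrierWithGroupSync. The GPU schedules thread groups for execution, and no ordering between groups is guaranteed.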
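
To make the thread hierarchy concrete, the following CPU-side C++ sketch emulates the index arithmetic behind a dispatch. It is not DirectCompute API code: Uint3, DispatchThreadId, and GroupsNeeded are hypothetical helpers introduced only for illustration. It relies on the HLSL relationship SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for HLSL's uint3; not part of any DirectX header.
struct Uint3 {
    std::uint32_t x, y, z;
};

// Global thread ID as HLSL defines SV_DispatchThreadID:
// SV_GroupID * numthreads + SV_GroupThreadID, per component.
inline Uint3 DispatchThreadId(Uint3 groupId, Uint3 numThreads, Uint3 groupThreadId) {
    // A thread's index inside its group is always below the group size.
    assert(groupThreadId.x < numThreads.x &&
           groupThreadId.y < numThreads.y &&
           groupThreadId.z < numThreads.z);
    return { groupId.x * numThreads.x + groupThreadId.x,
             groupId.y * numThreads.y + groupThreadId.y,
             groupId.z * numThreads.z + groupThreadId.z };
}

// Number of thread groups needed to cover `elements` items along one axis
// (ceiling division), so a partially filled final group is still launched.
inline std::uint32_t GroupsNeeded(std::uint32_t elements, std::uint32_t threadsPerGroup) {
    assert(threadsPerGroup > 0);
    return elements / threadsPerGroup + (elements % threadsPerGroup != 0 ? 1 : 0);
}
```

For example, covering 1000 elements with 64 threads per group takes 16 groups, which launches 1024 threads. The 24 surplus threads in the last group have IDs past the end of the data, so a shader in that situation would typically compare its ID against the element count before reading or writing.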

Resource Binding

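Resources such as buffers (StructuredBuffer, ByteAddressBuffer) and textures (Texture2D, Texture3D) are bound to the compute shader for input and output. How a resource is bound determines how threads can access it: a shader resource view (SRV) gives read-only access, while an unordered access view (UAV) allows reads and writes through the RW variants of these types, such as RWStructuredBuffer and RWTexture2D.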
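
The difference between the two buffer types is mainly one of addressing: a StructuredBuffer is indexed by element, while a ByteAddressBuffer is read at a byte offset that must be a multiple of 4. The following CPU-side C++ sketch mimics both access patterns over the same bytes. It does not call the Direct3D API, and Particle, LoadStructured, LoadRaw, and Pack are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical element type, standing in for the T of StructuredBuffer<T>.
struct Particle {
    float position[3];
    float mass;
};

// StructuredBuffer-style access: address the data by element index.
inline Particle LoadStructured(const std::vector<std::uint8_t>& bytes, std::size_t index) {
    assert((index + 1) * sizeof(Particle) <= bytes.size());
    Particle p;
    std::memcpy(&p, bytes.data() + index * sizeof(Particle), sizeof(Particle));
    return p;
}

// ByteAddressBuffer-style access: address the data by byte offset, which
// must be a multiple of 4; each load returns one raw 32-bit value.
inline std::uint32_t LoadRaw(const std::vector<std::uint8_t>& bytes, std::size_t byteOffset) {
    assert(byteOffset % 4 == 0 && byteOffset + 4 <= bytes.size());
    std::uint32_t value;
    std::memcpy(&value, bytes.data() + byteOffset, sizeof(value));
    return value;
}

// Packs particles into a byte blob, as an application might before upload.
inline std::vector<std::uint8_t> Pack(const std::vector<Particle>& particles) {
    std::vector<std::uint8_t> bytes(particles.size() * sizeof(Particle));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), particles.data(), bytes.size());
    }
    return bytes;
}
```

With this layout, the mass of particle i is available either as LoadStructured(blob, i).mass or as the raw 32 bits at byte offset i * sizeof(Particle) + 12, reinterpreted as a float.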

Key API Functions and Structures

Here are some of the essential components you'll work with in DirectCompute:

Function/Structure                              Description
ID3D11Device::CreateComputeShader               Creates a compute shader object from compiled shader bytecode.
ID3D11DeviceContext::CSSetShader                Sets the compute shader used by subsequent Dispatch calls.
ID3D11DeviceContext::CSSetShaderResources       Binds shader resource views (SRVs) to the compute shader stage.
ID3D11DeviceContext::CSSetUnorderedAccessViews  Binds unordered access views (UAVs) to the compute shader stage.
ID3D11DeviceContext::CSSetConstantBuffers       Binds constant buffers to the compute shader stage.
ID3D11DeviceContext::Dispatch                   Executes the bound compute shader over a grid of thread groups.
D3D11_SHADER_DESC                               Describes a shader; retrieved through shader reflection.
D3D11_SO_DECLARATION_OFFSET                     Represents an offset in a stream output buffer.
D3DX11CompileFromMemory                         Helper function (legacy D3DX 11 utility library) that compiles HLSL shader code from memory.

HLSL for DirectCompute

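High-Level Shading Language (HLSL) is used to write compute shaders. The example below uses two constructs specific to compute shaders: the numthreads attribute, which sets the thread group size, and the SV_DispatchThreadID system value, which gives each thread its global index within the dispatch.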

Example Compute Shader (HLSL)


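// Output texture, bound as an unordered access view (slot u0 is assumed here)
RWTexture2D<float4> outputTexture : register(u0);

// Define the thread group size: 8 x 8 x 1 = 64 threads per group
[numthreads(8, 8, 1)]
void MainCS(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    // Global thread ID, used here as the pixel coordinate
    uint x = dispatchThreadID.x;
    uint y = dispatchThreadID.y;

    // Perform computation based on thread ID
    // Example: write a color gradient, assuming a 1024 x 768 output texture
    outputTexture[uint2(x, y)] = float4(x / 1024.0, y / 768.0, 0.5, 1.0);
}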
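
One way to check a shader like this is to compare its output against a CPU reference. The C++ sketch below runs the same per-thread expression for every thread that a dispatch of 8 x 8 groups would launch. Float4 and ReferenceFill are hypothetical names, and the row-major texel layout is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for HLSL's float4.
struct Float4 {
    float r, g, b, a;
};

// CPU reference for the example shader. Emulates groups of 8 x 8 threads
// covering a width x height texture and evaluates the shader's expression
// for each thread, skipping threads that fall outside the texture.
inline std::vector<Float4> ReferenceFill(std::uint32_t width, std::uint32_t height) {
    constexpr std::uint32_t kGroupSize = 8;
    const std::uint32_t groupsX = (width + kGroupSize - 1) / kGroupSize;
    const std::uint32_t groupsY = (height + kGroupSize - 1) / kGroupSize;

    std::vector<Float4> texels(static_cast<std::size_t>(width) * height,
                               Float4{0.0f, 0.0f, 0.0f, 0.0f});
    for (std::uint32_t gy = 0; gy < groupsY; ++gy) {
        for (std::uint32_t gx = 0; gx < groupsX; ++gx) {
            for (std::uint32_t ty = 0; ty < kGroupSize; ++ty) {
                for (std::uint32_t tx = 0; tx < kGroupSize; ++tx) {
                    // Equivalent of SV_DispatchThreadID.xy
                    const std::uint32_t x = gx * kGroupSize + tx;
                    const std::uint32_t y = gy * kGroupSize + ty;
                    if (x >= width || y >= height) continue;
                    texels[static_cast<std::size_t>(y) * width + x] =
                        Float4{x / 1024.0f, y / 768.0f, 0.5f, 1.0f};
                }
            }
        }
    }
    return texels;
}
```

For a 1024 x 768 texture this corresponds to Dispatch(128, 96, 1), since 1024 / 8 = 128 and 768 / 8 = 96.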
            

Getting Started

To start using DirectCompute:

  1. Create a Direct3D 11 Device: Initialize a Direct3D device and device context.
  2. Compile Compute Shader: Write your compute shader logic in HLSL, compile it to bytecode, and create the shader object with CreateComputeShader.
  3. Create Resources: Create input and output buffers or textures.
  4. Bind Resources: Bind the appropriate shader resource views (SRVs) and unordered access views (UAVs) to the device context.
  5. Set Compute Shader: Set the compiled compute shader using CSSetShader.
  6. Dispatch: Call Dispatch with the desired number of thread groups.
  7. Synchronize and Retrieve Data: Ensure computations are complete before accessing output data, for example by copying the output to a staging resource and mapping it for CPU reads.
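
Steps 4 through 7 can be sketched in C++ as follows. This is a minimal illustration rather than a complete program. It assumes steps 1 through 3 are already done, that the shader declares [numthreads(64, 1, 1)], and that the output is a buffer of floats. RunAndReadBack and GroupsNeeded are hypothetical helper names, and the _WIN32 guard is there only so the portable part compiles on other platforms.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

#ifdef _WIN32
#include <d3d11.h>
#endif

// Thread groups needed to cover `elements` items (ceiling division).
inline std::uint32_t GroupsNeeded(std::uint32_t elements, std::uint32_t threadsPerGroup) {
    return elements / threadsPerGroup + (elements % threadsPerGroup != 0 ? 1 : 0);
}

#ifdef _WIN32
// Assumptions: `shader` was created with CreateComputeShader, `uav` views
// the GPU output buffer `gpuOut`, and `staging` is a same-sized buffer
// created with D3D11_USAGE_STAGING and D3D11_CPU_ACCESS_READ.
inline HRESULT RunAndReadBack(ID3D11DeviceContext* ctx, ID3D11ComputeShader* shader,
                              ID3D11UnorderedAccessView* uav, ID3D11Buffer* gpuOut,
                              ID3D11Buffer* staging, std::uint32_t count,
                              std::vector<float>& result) {
    // Step 4: bind the output UAV to slot u0.
    ctx->CSSetUnorderedAccessViews(0, 1, &uav, nullptr);
    // Step 5: set the compute shader (no class instances).
    ctx->CSSetShader(shader, nullptr, 0);
    // Step 6: launch enough 64-thread groups to cover every element.
    ctx->Dispatch(GroupsNeeded(count, 64), 1, 1);

    // Unbind the UAV so the buffer is free for other uses.
    ID3D11UnorderedAccessView* nullUav = nullptr;
    ctx->CSSetUnorderedAccessViews(0, 1, &nullUav, nullptr);

    // Step 7: copy to the staging buffer and map it for reading. Without
    // D3D11_MAP_FLAG_DO_NOT_WAIT, Map waits for the GPU work to finish.
    ctx->CopyResource(staging, gpuOut);
    D3D11_MAPPED_SUBRESOURCE mapped = {};
    HRESULT hr = ctx->Map(staging, 0, D3D11_MAP_READ, 0, &mapped);
    if (FAILED(hr)) return hr;
    result.resize(count);
    std::memcpy(result.data(), mapped.pData, count * sizeof(float));
    ctx->Unmap(staging, 0);
    return S_OK;
}
#endif
```

Reading results through a separate staging buffer is a common pattern because a default-usage buffer that the GPU writes through a UAV cannot be mapped for CPU reads directly.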

Further Reading