MSDN Documentation

Microsoft Developer Network

Compute Shader Creation

This guide walks you through the process of creating and utilizing compute shaders in DirectX. Compute shaders offer a powerful way to leverage the parallel processing capabilities of the GPU for general-purpose computation, not just graphics rendering.

What are Compute Shaders?

Unlike the shaders of the traditional graphics pipeline (vertex, pixel, geometry, hull, domain), compute shaders are designed for arbitrary computation. They can read from resources such as buffers and textures and write to them through unordered access views (UAVs), which enables a wide range of non-graphics tasks.

Creating a Compute Shader

Compute shaders are written in High-Level Shading Language (HLSL). The entry point function of a compute shader must carry the numthreads attribute, which specifies the number of threads in each thread group along the X, Y, and Z dimensions.

Basic HLSL Compute Shader Structure


// Define the thread group size
[numthreads(8, 8, 1)]
void CSMain(uint3 dispatchThreadID : SV_DispatchThreadID)
{
    // 'dispatchThreadID' is the unique ID of the current thread within the dispatch call.
    // You can use this ID to access specific elements in your data structures.

    // Example: Accessing a buffer and performing a computation
    // Assume 'myBuffer' is a StructuredBuffer or RWStructuredBuffer
    // and 'myTexture' is a Texture2D or RWTexture2D

    // Read from a buffer
    // MyDataType data = myBuffer[dispatchThreadID.x];

    // Perform computation...
    // float result = data.value * some_constant;

    // Write to a buffer or texture
    // myBuffer[dispatchThreadID.x] = newValue;
    // myTexture[dispatchThreadID.xy] = computedColor;
}
            

Key Elements:

Dispatching a Compute Shader

In your C++ application code, you bind the compute shader and its associated resources to the compute stage of the device context and then issue a dispatch call.

Example C++ Dispatch Code Snippet:


// Assume 'pComputeShader' is a Microsoft::WRL::ComPtr<ID3D11ComputeShader> holding your
// compiled compute shader and 'm_deviceContext' is the ID3D11DeviceContext (Direct3D 11).

// Bind the compute shader. Direct3D 11 has no pipeline state object for this step;
// in Direct3D 12 you would instead call ID3D12GraphicsCommandList::SetPipelineState.
m_deviceContext->CSSetShader(pComputeShader.Get(), nullptr, 0);

// Bind resources (e.g., buffers, textures) to shader resource views (SRVs)
// and unordered access views (UAVs)
// m_deviceContext->CSSetShaderResources(...)
// m_deviceContext->CSSetUnorderedAccessViews(...)

// Define the number of thread groups to dispatch
// This is typically calculated based on your data size and shader's thread group size
UINT numGroupsX = (dataSizeX + THREAD_GROUP_SIZE_X - 1) / THREAD_GROUP_SIZE_X;
UINT numGroupsY = (dataSizeY + THREAD_GROUP_SIZE_Y - 1) / THREAD_GROUP_SIZE_Y;
UINT numGroupsZ = (dataSizeZ + THREAD_GROUP_SIZE_Z - 1) / THREAD_GROUP_SIZE_Z;

// Dispatch the compute shader
m_deviceContext->Dispatch(numGroupsX, numGroupsY, numGroupsZ);

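// Unbind resources and shader to prevent unintended side effects.
// A view count of 0 unbinds nothing, so pass arrays of null pointers that cover
// the slots you used (one slot of each kind is assumed here).
// ID3D11ShaderResourceView* nullSRVs[1] = { nullptr };
// m_deviceContext->CSSetShaderResources(0, 1, nullSRVs);
// ID3D11UnorderedAccessView* nullUAVs[1] = { nullptr };
// m_deviceContext->CSSetUnorderedAccessViews(0, 1, nullUAVs, nullptr);
// m_deviceContext->CSSetShader(nullptr, nullptr, 0);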
            

Resource Binding and Semantics

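The connection between your HLSL shader and your C++ code is established through resource binding. You expose your application's data to the compute shader through shader resource views (SRVs) for read-only access, unordered access views (UAVs) for read and write access, and constant buffers for small sets of parameters.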

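The system-value semantics SV_GroupID, SV_GroupThreadID, and SV_DispatchThreadID tell each thread where it sits in the dispatch. SV_GroupID is the index of the thread's group within the dispatch, SV_GroupThreadID is the thread's index within its group, and SV_DispatchThreadID combines the two (SV_GroupID multiplied by the group size, plus SV_GroupThreadID) into an index that is unique across the entire dispatch.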

Important Considerations:

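Synchronization: Threads that access the same resource must be synchronized. For UAVs and group-shared memory, HLSL provides atomic intrinsics such as InterlockedAdd. Threads within one group can also synchronize explicitly with barrier intrinsics such as GroupMemoryBarrierWithGroupSync; there is no barrier that spans different groups within a single dispatch.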

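Performance: Choose thread group sizes that suit the hardware (a multiple of the GPU's wave size, commonly 32 or 64 threads, is a usual starting point) and avoid branches that send neighboring threads down different code paths. Measure on your target GPUs, because the best configuration depends on the architecture.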

Data Layout: The way you structure your data in buffers and textures directly impacts how efficiently your compute shader can access it.

API Reference

DirectX 11/12 Compute Shader Functions

  • ID3D11DeviceContext::Dispatch (DirectX 11)
  • ID3D12GraphicsCommandList::Dispatch (DirectX 12)
  • ID3D11DeviceContext::CSSetShader
  • ID3D11DeviceContext::CSSetShaderResources
  • ID3D11DeviceContext::CSSetUnorderedAccessViews
  • ID3D11DeviceContext::CSSetConstantBuffers
  • HLSL built-in functions related to threading and resource access.
