Microsoft Learn

DirectCompute API Reference

This section details the functions, structures, and interfaces available for DirectCompute programming on Windows.

Core Functions

ID3D11Device::CreateComputeShader

HRESULT CreateComputeShader( [in] const void* pShaderBytecode, [in] SIZE_T BytecodeLength, [in, optional] ID3D11ClassLinkage* pClassLinkage, [out, optional] ID3D11ComputeShader** ppComputeShader );

Creates a compute shader from compiled shader code.

Parameters

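  • pShaderBytecode: A pointer to the compiled shader bytecode.
  • BytecodeLength: The size of the compiled shader bytecode, in bytes.
  • pClassLinkage: A pointer to an ID3D11ClassLinkage interface used for dynamic shader linkage. Can be NULL.
  • ppComputeShader: The address of a pointer that receives the ID3D11ComputeShader interface. If NULL, the other parameters are validated and the method returns S_FALSE instead of S_OK when validation passes.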

Return Value

Returns one of the Direct3D 11 Return Codes.

Remarks

Use this method to create a compute shader object from bytecode compiled for a compute shader target such as cs_5_0. Bind the resulting object with ID3D11DeviceContext::CSSetShader before calling Dispatch.

ID3D11DeviceContext::Dispatch

void Dispatch( [in] UINT ThreadGroupCountX, [in] UINT ThreadGroupCountY, [in] UINT ThreadGroupCountZ );

Dispatches one or more compute shader thread groups.

Parameters

  • ThreadGroupCountX: The number of thread groups to execute along the X axis.
  • ThreadGroupCountY: The number of thread groups to execute along the Y axis.
  • ThreadGroupCountZ: The number of thread groups to execute along the Z axis.

Remarks

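This method launches threads in groups. The total number of threads launched is (ThreadGroupCountX * ThreadGroupCountY * ThreadGroupCountZ) multiplied by the group size declared with the numthreads attribute in the compute shader. Each count must not exceed D3D11_CS_DISPATCH_MAX_THREAD_GROUPS_PER_DIMENSION (65535).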
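
The arithmetic above can be sketched as a small host-side helper. This is a minimal illustration, not part of the API: GroupCountFor and TotalThreads are hypothetical names, and the constant mirrors the Direct3D 11 limit of 65535 thread groups per dimension.

```cpp
#include <cstdint>
#include <stdexcept>

// Mirrors D3D11_CS_DISPATCH_MAX_THREAD_GROUPS_PER_DIMENSION (65535).
constexpr std::uint32_t kMaxGroupsPerDimension = 65535;

// Hypothetical helper: number of thread groups needed along one axis so that
// every one of elementCount items gets a thread (ceiling division).
inline std::uint32_t GroupCountFor(std::uint32_t elementCount,
                                   std::uint32_t threadsPerGroup)
{
    if (threadsPerGroup == 0) {
        throw std::invalid_argument("threadsPerGroup must be non-zero");
    }
    // Widen to 64 bits so the rounding term cannot overflow.
    const std::uint64_t groups =
        (static_cast<std::uint64_t>(elementCount) + threadsPerGroup - 1) /
        threadsPerGroup;
    if (groups > kMaxGroupsPerDimension) {
        throw std::out_of_range("too many thread groups for one dimension");
    }
    return static_cast<std::uint32_t>(groups);
}

// Hypothetical helper: total threads launched by one dispatch, assuming the
// group counts and the group size are within the Direct3D 11 limits.
inline std::uint64_t TotalThreads(std::uint32_t groupsX,
                                  std::uint32_t groupsY,
                                  std::uint32_t groupsZ,
                                  std::uint32_t threadsPerGroup)
{
    return static_cast<std::uint64_t>(groupsX) * groupsY * groupsZ *
           threadsPerGroup;
}
```

For 1000 elements and a shader declared with [numthreads(64, 1, 1)], GroupCountFor(1000, 64) gives 16 groups, so the dispatch launches 1024 threads. The last group is only partly used, so the shader would typically skip indices at or beyond the element count.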

ID3D11Device::CreateBuffer

HRESULT CreateBuffer( [in] const D3D11_BUFFER_DESC* pDesc, [in, optional] const D3D11_SUBRESOURCE_DATA* pInitialData, [out, optional] ID3D11Buffer** ppBuffer );

Creates a buffer resource.

Parameters

  • pDesc: A pointer to a D3D11_BUFFER_DESC structure that describes the buffer.
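  • pInitialData: A pointer to a D3D11_SUBRESOURCE_DATA structure that describes the initialization data. Pass NULL to allocate space only; it must not be NULL if the buffer uses D3D11_USAGE_IMMUTABLE.
  • ppBuffer: The address of a pointer that receives the ID3D11Buffer interface. If NULL, the other parameters are validated and the method returns S_FALSE when validation passes.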

Return Value

Returns one of the Direct3D 11 Return Codes.

Remarks

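Buffers can hold vertex, index, constant, and unordered-access data. For DirectCompute, the relevant cases are constant buffers and structured, raw (byte-address), or typed buffers that the compute shader reads through shader resource views or reads and writes through unordered-access views.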

ID3D11DeviceContext::CSSetShaderResources

void CSSetShaderResources( [in] UINT StartSlot, [in] UINT NumViews, [in, optional] ID3D11ShaderResourceView* const* ppShaderResourceViews );

Sets an array of shader resource views to the compute shader pipeline stage.

Parameters

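  • StartSlot: The zero-based index of the first slot to bind (0 to D3D11_COMMONSHADER_INPUT_RESOURCE_SLOT_COUNT - 1).
  • NumViews: The number of views in the array. At most 128 slots are available, so the maximum is D3D11_COMMONSHADER_INPUT_RESOURCE_SLOT_COUNT - StartSlot.
  • ppShaderResourceViews: A pointer to an array of ID3D11ShaderResourceView interfaces to bind.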

Remarks

Use this method to bind read-only resources (buffers, textures) to the compute shader. Slot N corresponds to register tN in HLSL.

ID3D11DeviceContext::CSSetUnorderedAccessViews

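void CSSetUnorderedAccessViews( [in] UINT StartSlot, [in] UINT NumUAVs, [in, optional] ID3D11UnorderedAccessView* const* ppUnorderedAccessViews, [in, optional] const UINT* pUAVInitialCounts );

Sets an array of unordered-access views (UAVs) to the compute shader pipeline stage.

Parameters

  • StartSlot: The zero-based index of the first slot to bind (0 to D3D11_1_UAV_SLOT_COUNT - 1).
  • NumUAVs: The number of UAVs in the array.
  • ppUnorderedAccessViews: A pointer to an array of ID3D11UnorderedAccessView interfaces to bind.
  • pUAVInitialCounts: An optional array of initial append/consume buffer offsets, one per UAV. A value of -1 keeps the current offset. The values apply only to UAVs created with D3D11_BUFFER_UAV_FLAG_APPEND or D3D11_BUFFER_UAV_FLAG_COUNTER and are ignored otherwise.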

Remarks

UAVs allow compute shaders to read and write data, enabling parallel algorithms like reductions, sorting, and simulation.

Key Structures

D3D11_BUFFER_DESC

typedef struct D3D11_BUFFER_DESC { UINT ByteWidth; D3D11_USAGE Usage; UINT BindFlags; UINT CPUAccessFlags; UINT MiscFlags; UINT StructureByteStride; } D3D11_BUFFER_DESC;

Describes a buffer resource.

Members

  • ByteWidth: The size of the buffer in bytes.
  • Usage: A D3D11_USAGE enumerated type value that specifies how the buffer is to be read from and written to.
  • BindFlags: A combination of D3D11_BIND_FLAG enumerated type values that specify how the buffer will be used.
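  • CPUAccessFlags: A combination of D3D11_CPU_ACCESS_FLAG enumerated type values that specify which CPU access is allowed, or 0 if no CPU access is needed.
  • MiscFlags: A combination of D3D11_RESOURCE_MISC_FLAG values, or 0. For compute shaders, common choices are D3D11_RESOURCE_MISC_BUFFER_STRUCTURED, D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS, and D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS (for buffers that hold indirect dispatch or draw arguments).
  • StructureByteStride: The size of each element in bytes if the buffer is structured. For a structured buffer the stride must be greater than 0, a multiple of 4, and no larger than 2048. Set to 0 if the buffer is not structured.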
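
As a rough illustration of how ByteWidth and StructureByteStride relate for a structured buffer, the sketch below computes both from an element count and an element size. StructuredBufferLayout and MakeStructuredLayout are hypothetical names. The checks assume the Direct3D 11 stride rules for structured buffers (non-zero, a multiple of 4, at most 2048 bytes).

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical holder for the two size fields of D3D11_BUFFER_DESC.
struct StructuredBufferLayout {
    std::uint32_t byteWidth;            // value for ByteWidth
    std::uint32_t structureByteStride;  // value for StructureByteStride
};

// Hypothetical helper: derive the size fields for a structured buffer that
// holds elementCount elements of elementSize bytes each.
inline StructuredBufferLayout MakeStructuredLayout(std::uint32_t elementCount,
                                                   std::uint32_t elementSize)
{
    // Assumed structured-buffer stride rules: > 0, multiple of 4, <= 2048.
    if (elementSize == 0 || elementSize % 4 != 0 || elementSize > 2048) {
        throw std::invalid_argument("invalid structured buffer stride");
    }
    if (elementCount == 0) {
        throw std::invalid_argument("buffer must hold at least one element");
    }
    // Compute in 64 bits so an oversized buffer is detected, not wrapped.
    const std::uint64_t total =
        static_cast<std::uint64_t>(elementCount) * elementSize;
    if (total > std::numeric_limits<std::uint32_t>::max()) {
        throw std::out_of_range("ByteWidth does not fit in a UINT");
    }
    return StructuredBufferLayout{static_cast<std::uint32_t>(total), elementSize};
}
```

For 1024 float elements this yields a ByteWidth of 4096 and a StructureByteStride of 4. Those values would go into a D3D11_BUFFER_DESC alongside D3D11_RESOURCE_MISC_BUFFER_STRUCTURED in MiscFlags.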

D3D11_UNORDERED_ACCESS_VIEW_DESC

typedef struct D3D11_UNORDERED_ACCESS_VIEW_DESC { DXGI_FORMAT Format; D3D11_UAV_DIMENSION ViewDimension; union { D3D11_BUFFER_UAV Buffer; D3D11_TEX1D_UAV Texture1D; D3D11_TEX1D_ARRAY_UAV Texture1DArray; D3D11_TEX2D_UAV Texture2D; D3D11_TEX2D_ARRAY_UAV Texture2DArray; D3D11_TEX3D_UAV Texture3D; }; } D3D11_UNORDERED_ACCESS_VIEW_DESC;

Describes an unordered-access view (UAV).

Members

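  • Format: The DXGI_FORMAT used to interpret the resource data. Use DXGI_FORMAT_UNKNOWN for a structured buffer and DXGI_FORMAT_R32_TYPELESS for a raw buffer.
  • ViewDimension: A D3D11_UAV_DIMENSION value that specifies the resource type and selects which union member applies.
  • Buffer, Texture1D, Texture2D, Texture3D (and the array variants): The view parameters for the resource type selected by ViewDimension, for example Buffer when ViewDimension is D3D11_UAV_DIMENSION_BUFFER.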

Shader Model 5.0

DirectCompute leverages Shader Model 5.0, which introduces new features specifically for GPGPU programming:

  • Thread Synchronization: Atomic operations and barriers for coordinating threads within a thread group.
  • Resource Binding: More flexible binding of resources like textures and buffers using Shader Resource Views (SRVs) and Unordered Access Views (UAVs).
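  • Thread Group Shared Memory: Threads within a group can share data through variables declared groupshared (up to 32 KB per group in cs_5_0).
  • New Intrinsics: Functions such as InterlockedAdd and InterlockedCompareExchange for atomic updates, and GroupMemoryBarrierWithGroupSync for synchronization within a group.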
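
The limits that shape a cs_5_0 thread group can be checked on the host before a shader is written. The sketch below is an illustration only: the helper names are hypothetical, and the constants assume the cs_5_0 limits of at most 1024 threads per group, a Z dimension of at most 64, and 32 KB of group shared memory.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cs_5_0 limits.
constexpr std::uint32_t kMaxThreadsPerGroup = 1024;
constexpr std::uint32_t kMaxGroupDimZ = 64;
constexpr std::size_t kMaxGroupSharedBytes = 32 * 1024;

// Hypothetical helper: true if numthreads(x, y, z) fits the assumed limits.
inline bool IsValidGroupSize(std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
    if (x == 0 || y == 0 || z == 0) return false;
    // Bound each dimension first so the product below cannot overflow.
    if (x > kMaxThreadsPerGroup || y > kMaxThreadsPerGroup || z > kMaxGroupDimZ) {
        return false;
    }
    return static_cast<std::uint64_t>(x) * y * z <= kMaxThreadsPerGroup;
}

// Hypothetical helper: true if a groupshared array of elementCount elements,
// each elementSize bytes, fits in the assumed shared memory budget.
inline bool FitsInGroupShared(std::size_t elementCount, std::size_t elementSize)
{
    if (elementSize == 0) return true;
    // Divide instead of multiplying to avoid overflow.
    return elementCount <= kMaxGroupSharedBytes / elementSize;
}
```

A group of (64, 1, 1) with a 64-element groupshared float array passes both checks. A group of (32, 32, 2) fails because it would need 2048 threads.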

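Compute shaders are written in HLSL (High-Level Shading Language) and compiled into bytecode for a compute shader target such as cs_5_0, either offline with fxc.exe or at run time with D3DCompile.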

Example HLSL Compute Shader


// Define thread group shared memory
groupshared float shared_data[64];

// Define buffer resources
RWStructuredBuffer<float> outputBuffer : register(u0);
StructuredBuffer<float> inputBuffer  : register(t0);

// Thread group size
[numthreads(64, 1, 1)]
void main(uint3 dispatchThreadID : SV_DispatchThreadID, uint3 groupThreadID : SV_GroupThreadID, uint3 groupID : SV_GroupID)
{
    // Load data into shared memory
    shared_data[groupThreadID.x] = inputBuffer[dispatchThreadID.x];

    // Synchronize threads to ensure all data is loaded
    GroupMemoryBarrierWithGroupSync();

    // Perform computation using shared data
    float result = shared_data[groupThreadID.x] * 2.0f;

    // Write result to the output buffer
    outputBuffer[dispatchThreadID.x] = result;
}
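
To check the shader's output, the same computation can be reproduced on the CPU. The following is a minimal reference sketch, not a DirectCompute API. ReferenceDoubleKernel is a hypothetical name, and the loop structure only imitates the load, barrier, and compute phases of each 64-thread group.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Matches [numthreads(64, 1, 1)] in the shader above.
constexpr std::size_t kGroupSize = 64;

// Hypothetical CPU reference: processes the input one "thread group" at a
// time and returns what the shader is expected to write to outputBuffer.
inline std::vector<float> ReferenceDoubleKernel(const std::vector<float>& input)
{
    std::vector<float> output(input.size());
    for (std::size_t groupStart = 0; groupStart < input.size();
         groupStart += kGroupSize) {
        float shared[kGroupSize] = {};  // stands in for groupshared memory
        const std::size_t count =
            std::min(kGroupSize, input.size() - groupStart);

        // Phase 1: every thread loads its element into shared memory.
        for (std::size_t t = 0; t < count; ++t) {
            shared[t] = input[groupStart + t];
        }
        // Finishing the loop above plays the role of
        // GroupMemoryBarrierWithGroupSync: all loads are done before any use.

        // Phase 2: every thread doubles its element and writes the result.
        for (std::size_t t = 0; t < count; ++t) {
            output[groupStart + t] = shared[t] * 2.0f;
        }
    }
    return output;
}
```

Comparing this result element by element with the data read back from the output buffer gives a simple correctness test for the GPU path.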