DirectML Documentation
DirectML is a high-performance, hardware-accelerated machine learning API for Windows. It enables developers to integrate machine learning inference into their applications on a wide range of DirectX 12-capable hardware.
What is DirectML?
DirectML is a DirectX 12 API that provides a consistent interface for running machine learning inference across various hardware accelerators, including GPUs from NVIDIA, AMD, and Intel, as well as NPUs. It builds on DirectX 12's low-overhead command submission and resource model to deliver efficient ML execution.
Key features of DirectML include:
- Hardware Acceleration: Runs ML models on GPUs and other accelerators for significant performance gains.
- Broad Hardware Support: Works with any DirectX 12-compatible hardware.
- Framework Integration: Integrates seamlessly with popular ML frameworks like TensorFlow and ONNX Runtime.
- Low-Level Control: Provides fine-grained control over ML operations and resource management.
- Windows Native: Built into Windows, offering a native and robust ML development experience.
Getting Started with DirectML
To start using DirectML, you'll need a Windows 10 (version 1709 or later) or Windows 11 system with a DirectX 12-compatible GPU and the latest graphics drivers. You can integrate DirectML directly into your C++ applications or use it through higher-level frameworks.
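To sanity-check the hardware requirement programmatically, a minimal sketch like the following (the helper name is illustrative, not part of DirectML) verifies that a DirectX 12 device can be created on the default adapter:
#include <d3d12.h>
#include <wrl/client.h>
// Returns true if a DirectX 12 device can be created on the default adapter.
// D3D_FEATURE_LEVEL_11_0 covers typical DirectX 12-capable GPUs.
bool IsDirectX12Available()
{
    Microsoft::WRL::ComPtr<ID3D12Device> device;
    return SUCCEEDED(D3D12CreateDevice(
        nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device)));
}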
Using DirectML Directly
For direct control, you can use the DirectML API in your C++ projects. This involves creating a DirectML device and command recorder, compiling operators, and binding resources for your ML operations.
#include <DirectML.h>
// ... create a D3D12 device, then create the DirectML device and command
// recorder from it (see the sketch that follows this snippet) ...
// Example of describing a float32 tensor stored in a buffer
DML_BUFFER_TENSOR_DESC tensorDesc = {};
tensorDesc.DataType = DML_TENSOR_DATA_TYPE_FLOAT32;
tensorDesc.Flags = DML_TENSOR_FLAG_NONE;
tensorDesc.DimensionCount = /* number of dimensions */;
tensorDesc.Sizes = /* pointer to an array of dimension sizes */;
tensorDesc.Strides = nullptr; // nullptr means the tensor is tightly packed
tensorDesc.TotalTensorSizeInBytes = /* total size in bytes */;
// ... wrap the desc in a DML_TENSOR_DESC, create the DML buffer, and bind it ...
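As a rough sketch of the initialization step elided above, and assuming a D3D12 device has already been created (for example with D3D12CreateDevice), the DirectML device and command recorder can be created like this (the helper name is illustrative):
#include <d3d12.h>
#include <DirectML.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;
// Creates the DirectML device and command recorder from an existing D3D12
// device. Error handling is reduced to returning the failing HRESULT.
HRESULT CreateDmlObjects(
    ID3D12Device* d3d12Device,
    ComPtr<IDMLDevice>& dmlDevice,
    ComPtr<IDMLCommandRecorder>& commandRecorder)
{
    // Create the DirectML device on top of the D3D12 device.
    HRESULT hr = DMLCreateDevice(
        d3d12Device, DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));
    if (FAILED(hr)) return hr;
    // The command recorder records compiled DirectML work into a D3D12
    // command list for execution on the GPU.
    return dmlDevice->CreateCommandRecorder(IID_PPV_ARGS(&commandRecorder));
}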
For detailed instructions, refer to the Get Started section.
Using with ONNX Runtime
DirectML is supported as an execution provider for ONNX Runtime, allowing you to run ONNX models with DirectML acceleration. This is often the easiest way to get started.
import onnxruntime as ort
import numpy as np
# Load the ONNX model with the DirectML execution provider,
# falling back to the CPU provider for unsupported operators
session = ort.InferenceSession(
    "your_model.onnx",
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)
# Prepare input data
input_data = np.random.rand(...).astype(np.float32)  # shape must match the model input
# Run inference
outputs = session.run(None, {"input_name": input_data})
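If you are calling ONNX Runtime from C++ instead of Python, roughly the same setup looks like this. This is a sketch that assumes a DirectML-enabled ONNX Runtime package (which provides dml_provider_factory.h); the helper name is illustrative:
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>
// Creates a session that prefers the DirectML execution provider on adapter 0;
// operators the DML EP cannot handle fall back to the CPU provider.
Ort::Session CreateDmlSession(Ort::Env& env, const wchar_t* modelPath)
{
    Ort::SessionOptions options;
    // Settings commonly recommended when the DirectML EP is enabled.
    options.DisableMemPattern();
    options.SetExecutionMode(ORT_SEQUENTIAL);
    Ort::ThrowOnError(
        OrtSessionOptionsAppendExecutionProvider_DML(options, /*device_id*/ 0));
    return Ort::Session(env, modelPath, options);
}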
Explore the Code Samples for practical examples.