DirectML Model Conversion
Convert your existing machine learning models to a format optimized for DirectML, enabling hardware-accelerated inference on Windows devices. DirectML provides a high-performance layer for machine learning on Windows, leveraging DirectX 12 compute capabilities.
Supported Model Formats
DirectML supports conversion from several popular machine learning frameworks. The primary tool for this is the DirectMLX converter.
- ONNX (Open Neural Network Exchange) - Recommended
- TensorFlow (via ONNX conversion)
- PyTorch (via ONNX conversion)
Using the DirectMLX Converter
The DirectMLX tool is a command-line utility that facilitates the conversion process. It's typically included with the DirectML SDK or available as a separate NuGet package.
Example: Converting an ONNX Model
To convert an ONNX model named my_model.onnx to the DirectML operator graph format, use the following command:
directmlx.exe convert --input-model my_model.onnx --output-model my_model.dml --platform windows --target arch_x64
Key parameters:
- --input-model: Path to your source model file (e.g., ONNX).
- --output-model: Desired path for the converted DirectML model.
- --platform: Target operating system (e.g., windows).
- --target: Target architecture (e.g., arch_x64 for 64-bit Intel/AMD).
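When conversions are scripted, for example in a build pipeline, the command above can be assembled programmatically. The Python sketch below is only an illustration and is not part of the DirectML tooling: the helper names build_convert_command and run_conversion are hypothetical, and the sketch assumes that directmlx.exe is on the PATH and accepts exactly the flags listed above.

```python
import subprocess
from pathlib import Path


def build_convert_command(input_model, output_model,
                          platform="windows", target="arch_x64",
                          executable="directmlx.exe"):
    """Return the argument list for the conversion command shown above.

    Hypothetical helper: it only mirrors the flags documented in this
    section and does not validate them against the real tool.
    """
    return [
        executable, "convert",
        "--input-model", str(input_model),
        "--output-model", str(output_model),
        "--platform", platform,
        "--target", target,
    ]


def run_conversion(input_model, output_model, **options):
    """Run the converter; raises CalledProcessError on a non-zero exit code."""
    if not Path(input_model).is_file():
        raise FileNotFoundError(f"Source model not found: {input_model}")
    command = build_convert_command(input_model, output_model, **options)
    # Passing a list (without shell=True) avoids quoting problems
    # when a path contains spaces.
    subprocess.run(command, check=True)
    return Path(output_model)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch can be
    # inspected on a machine without the converter installed.
    print(" ".join(build_convert_command("my_model.onnx", "my_model.dml")))
```

Keeping command construction separate from execution lets you log or review the exact command line before anything runs.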
Conversion Considerations
When converting models, keep the following in mind:
- Operator Compatibility: Not every framework operator has a direct equivalent in DirectML. Check the DirectML operator support documentation for specifics.
- Data Types: Ensure your model uses supported data types, primarily FP32 and FP16.
- Quantization: For further optimization on certain hardware, consider post-conversion quantization.
- Custom Operators: If your model uses custom operators, you may need to implement them manually using the DirectML API.
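The operator and data type checks can be run before conversion. As a rough sketch, assume you have already extracted the model's operator names and tensor data types into plain Python collections, for instance with your framework's own inspection tools. The hypothetical helper preflight_check below then reports anything outside an allow-list. SUPPORTED_DTYPES reflects the FP32/FP16 guidance above, and the operator allow-list is supplied by the caller and should be filled in from the DirectML operator support documentation.

```python
# Data types named in this section as the primary supported ones.
SUPPORTED_DTYPES = frozenset({"float32", "float16"})


def preflight_check(op_types, tensor_dtypes, supported_ops):
    """Return (unsupported_ops, unsupported_dtypes) as sorted lists.

    op_types:      iterable of operator names used by the model.
    tensor_dtypes: mapping of tensor name -> data type string.
    supported_ops: operator names to accept (caller-supplied allow-list).
    """
    unsupported_ops = sorted(set(op_types) - set(supported_ops))
    unsupported_dtypes = sorted(
        (name, dtype)
        for name, dtype in tensor_dtypes.items()
        if dtype not in SUPPORTED_DTYPES
    )
    return unsupported_ops, unsupported_dtypes


if __name__ == "__main__":
    # Illustrative values only; this is not the real DirectML operator list.
    ops, dtypes = preflight_check(
        op_types=["Conv", "Relu", "MyCustomOp"],
        tensor_dtypes={"input": "float32", "mask": "int64"},
        supported_ops={"Conv", "Relu"},
    )
    print(ops)     # ['MyCustomOp']
    print(dtypes)  # [('mask', 'int64')]
```

Anything reported as an unsupported operator is a candidate for a manual implementation with the DirectML API, as noted under Custom Operators.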
Integration with DirectML API
Once converted, your .dml file can be loaded and executed using the DirectML C++ API. This involves:
- Initializing DirectML Device and Command Queue.
- Loading the operator graph from the .dml file.
- Creating Input and Output Resources (e.g., ID3D12Resource).
- Binding resources and dispatching the compute.
- Synchronizing and retrieving results.
Refer to the DirectML API Reference for detailed usage.