Custom Object Detection with DirectML
The DirectML object detection samples provide a foundation for integrating custom object detection models into your Windows applications. This section covers adapting and running your own trained models with DirectML, taking advantage of its GPU compute capabilities.
This sample demonstrates how to load a custom model (e.g., YOLO, SSD, or Faster R-CNN, trained with a framework such as PyTorch or TensorFlow) and perform inference directly on the GPU via DirectML.
Key Features and Capabilities:
Model Flexibility
Supports various model architectures and formats compatible with DirectML's inference capabilities. Easily integrate models trained with popular deep learning frameworks.
GPU Acceleration
Use your DirectX 12-capable GPU for significantly faster inference than CPU-based execution.
Data Preprocessing
Includes essential utilities for image preprocessing, such as resizing, normalization, and channel ordering, ensuring your input data is correctly formatted for the model.
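As an illustrative sketch (not the sample's own utilities), the function below assumes the image has already been resized to the model's input dimensions and converts interleaved RGB8 pixels into a normalized float32 CHW tensor. The mean/std constants are the common ImageNet values; substitute whatever your model was trained with.

```cpp
#include <cstdint>
#include <vector>

// Convert an interleaved RGB8 image (already resized to the model's input
// dimensions) into a planar, normalized float32 CHW tensor.
std::vector<float> PreprocessImage(const uint8_t* rgb, uint32_t width, uint32_t height)
{
    // Example ImageNet statistics; use your model's training values instead.
    const float mean[3]   = { 0.485f, 0.456f, 0.406f };
    const float stddev[3] = { 0.229f, 0.224f, 0.225f };

    std::vector<float> chw(3ull * width * height);
    for (uint32_t y = 0; y < height; ++y)
    {
        for (uint32_t x = 0; x < width; ++x)
        {
            for (uint32_t c = 0; c < 3; ++c)
            {
                uint8_t value = rgb[(y * width + x) * 3 + c];
                // Scale to [0, 1], normalize, and write into plane c (CHW layout).
                chw[c * width * height + y * width + x] =
                    (value / 255.0f - mean[c]) / stddev[c];
            }
        }
    }
    return chw;
}
```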
Post-processing Logic
Provides examples for post-processing model outputs, including non-maximum suppression (NMS) to refine bounding box detections and confidence scoring.
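To illustrate the NMS step, here is a minimal greedy NMS sketch over a hypothetical `Detection` struct; real post-processing must first decode the model-specific raw output into these boxes and scores.

```cpp
#include <algorithm>
#include <vector>

struct Detection
{
    float x1, y1, x2, y2; // box corners
    float score;          // confidence
    int classId;
};

// Intersection-over-union of two axis-aligned boxes.
static float IoU(const Detection& a, const Detection& b)
{
    float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    float inter = std::max(0.0f, ix2 - ix1) * std::max(0.0f, iy2 - iy1);
    if (inter <= 0.0f) return 0.0f;
    float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Greedy NMS: keep the highest-scoring box, suppress same-class boxes that
// overlap it above the IoU threshold, then repeat with the next survivor.
std::vector<Detection> Nms(std::vector<Detection> dets, float iouThreshold)
{
    std::sort(dets.begin(), dets.end(),
        [](const Detection& a, const Detection& b) { return a.score > b.score; });

    std::vector<Detection> kept;
    for (const Detection& d : dets)
    {
        bool suppressed = false;
        for (const Detection& k : kept)
        {
            if (k.classId == d.classId && IoU(k, d) > iouThreshold)
            {
                suppressed = true;
                break;
            }
        }
        if (!suppressed)
        {
            kept.push_back(d);
        }
    }
    return kept;
}
```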
DirectML API Integration
Clear guidance on using the DirectML API for tensor operations, graph execution, and efficient resource management.
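As a minimal sketch of what this looks like (error handling omitted; assumes linking against d3d12.lib and DirectML.lib), the following creates a DirectML device on top of a D3D12 device and describes a model input tensor. The 1x3x640x640 float32 shape is an assumed example; match it to your model.

```cpp
#include <d3d12.h>
#include <DirectML.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

void CreateDmlDeviceAndTensorDesc()
{
    // Create a D3D12 device on the default adapter.
    ComPtr<ID3D12Device> d3d12Device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&d3d12Device));

    // Create the DirectML device over the D3D12 device.
    ComPtr<IDMLDevice> dmlDevice;
    DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE,
        IID_PPV_ARGS(&dmlDevice));

    // Describe a model input as a packed DirectML buffer tensor (NCHW).
    UINT sizes[4] = { 1, 3, 640, 640 };                 // assumed input shape
    DML_BUFFER_TENSOR_DESC bufferDesc = {};
    bufferDesc.DataType = DML_TENSOR_DATA_TYPE_FLOAT32;
    bufferDesc.Flags = DML_TENSOR_FLAG_NONE;
    bufferDesc.DimensionCount = 4;
    bufferDesc.Sizes = sizes;
    bufferDesc.Strides = nullptr;                       // packed (default) layout
    bufferDesc.TotalTensorSizeInBytes =
        sizeof(float) * sizes[0] * sizes[1] * sizes[2] * sizes[3];

    DML_TENSOR_DESC tensorDesc = { DML_TENSOR_TYPE_BUFFER, &bufferDesc };
    // tensorDesc can now be passed when creating DirectML operators.
}
```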
Performance Tuning
Tips and techniques for optimizing model performance, including batching, precision considerations, and leveraging DirectML's operator set.
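For instance, batching and reduced precision are both expressed through the tensor description: a larger batch amortizes dispatch overhead, and FLOAT16 halves memory bandwidth on GPUs with fast FP16 paths. The sketch below (batch size and shape are assumptions) describes a batched FLOAT16 input:

```cpp
#include <DirectML.h>

// A batched, half-precision input tensor description (8 x 3 x 640 x 640).
UINT batchedSizes[4] = { 8, 3, 640, 640 };           // assumed batch and shape
DML_BUFFER_TENSOR_DESC halfDesc = {};
halfDesc.DataType = DML_TENSOR_DATA_TYPE_FLOAT16;    // 2 bytes per element
halfDesc.DimensionCount = 4;
halfDesc.Sizes = batchedSizes;
halfDesc.TotalTensorSizeInBytes =
    2ull * batchedSizes[0] * batchedSizes[1] * batchedSizes[2] * batchedSizes[3];
```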
Getting Started with Your Custom Model
To run your custom object detection model, you'll typically need to:
- Convert your model: Ensure your model is in a format that DirectML can load and execute. This might involve using tools to convert from ONNX or another intermediate representation.
- Prepare input data: Capture or load images/video frames and apply the necessary preprocessing steps.
- Load the model: Use DirectML APIs to load your model's computational graph (an end-to-end sketch follows at the end of this section).
- Perform inference: Feed your preprocessed input tensors into the model and execute it.
- Process output: Interpret the model's raw output to extract bounding boxes, class labels, and confidence scores.
The provided sample code serves as a practical guide, showcasing the essential steps and DirectML API calls involved in custom model inference.
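As a compact illustration of the overall flow, the sketch below runs a converted ONNX model through ONNX Runtime's DirectML execution provider, one common way to execute such a model on DirectML (the sample itself shows the raw DirectML calls). The model path ("model.onnx"), input shape, and node names ("images", "output") are placeholders for your model's actual values.

```cpp
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>
#include <vector>

int main()
{
    // 1. Create a session on the DirectML execution provider. The DML EP
    //    requires memory patterns disabled and sequential execution.
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "detector");
    Ort::SessionOptions options;
    options.DisableMemPattern();
    options.SetExecutionMode(ORT_SEQUENTIAL);
    Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_DML(options, 0));
    Ort::Session session(env, L"model.onnx", options);  // placeholder path

    // 2. Wrap preprocessed input data (e.g., from a routine like
    //    PreprocessImage above) as an input tensor; shape is an assumption.
    std::vector<float> input(1 * 3 * 640 * 640);
    std::vector<int64_t> shape = { 1, 3, 640, 640 };
    Ort::MemoryInfo memInfo =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value inputTensor = Ort::Value::CreateTensor<float>(
        memInfo, input.data(), input.size(), shape.data(), shape.size());

    // 3. Run inference. Node names are placeholders; inspect your model
    //    (e.g., with Netron) for the real input and output names.
    const char* inputNames[]  = { "images" };
    const char* outputNames[] = { "output" };
    auto outputs = session.Run(Ort::RunOptions{ nullptr },
        inputNames, &inputTensor, 1, outputNames, 1);

    // 4. Read back the raw detections, then decode boxes, classes, and
    //    scores per your model's output layout and apply NMS.
    const float* raw = outputs[0].GetTensorData<float>();
    auto outShape = outputs[0].GetTensorTypeAndShapeInfo().GetShape();
    // ... decode `raw` according to your model's output layout ...
    return 0;
}
```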