DirectML Performance Benchmarks

Optimize your machine learning workloads on Windows with DirectML.

Image Classification (ResNet-50)

Measures inference performance for a common image classification model.

Model: ResNet-50
Dataset: ImageNet (subset)
Batch Size: 32
Precision: FP16
Average Latency: 8.5 ms
// Example Snippet (Conceptual)
auto result = directmlDevice.ExecuteInference(model, inputTensor, outputTensor);
if (result.success) {
    // Process results...
}
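
An average latency figure like the one above is usually obtained by timing many repeated runs. The following is a minimal sketch of such a timing loop, not the harness used for these benchmarks. The function name `AverageLatencyMs` and the `run_once` callable are hypothetical; `run_once` stands in for one inference call on one batch and is not a DirectML API.

```cpp
#include <chrono>
#include <cstddef>

// Returns the mean wall-clock time per call in milliseconds.
// `run_once` is a hypothetical stand-in for one inference call on one batch.
template <typename Fn>
double AverageLatencyMs(Fn&& run_once, std::size_t warmup_runs, std::size_t timed_runs) {
    if (timed_runs == 0) return 0.0;

    // Warm-up runs are excluded so that one-time costs (for example, first-use
    // compilation or allocation) do not skew the average.
    for (std::size_t i = 0; i < warmup_runs; ++i) run_once();

    // steady_clock is monotonic, so it is unaffected by system clock changes.
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < timed_runs; ++i) run_once();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(timed_runs);
}
```

For GPU work, `run_once` would need to block until the device has finished; otherwise the loop measures only how long it takes to submit the work.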

Object Detection (YOLOv4)

Evaluates throughput for real-time object detection tasks.

Model: YOLOv4
Dataset: COCO (subset)
Batch Size: 16
Precision: FP32
Throughput: 120 FPS
// Example Snippet (Conceptual)
auto frames = captureCameraFeed();
for (const auto& frame : frames) {
    auto detections = directmlDevice.DetectObjects(yoloModel, frame);
    // Render bounding boxes...
}
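
Throughput figures such as FPS relate a count of processed items to the time taken. The sketch below shows that arithmetic, along with the per-batch time a given throughput implies. Both helper names are hypothetical, and the sketch assumes the throughput figure counts individual frames rather than batches.

```cpp
#include <cstddef>

// Items processed per second over a measured interval.
// Returns 0 for a non-positive interval.
double ItemsPerSecond(std::size_t items, double elapsed_seconds) {
    if (elapsed_seconds <= 0.0) return 0.0;
    return static_cast<double>(items) / elapsed_seconds;
}

// Time spent per batch, in milliseconds, implied by a throughput figure.
double BatchTimeMs(double items_per_second, std::size_t batch_size) {
    if (items_per_second <= 0.0) return 0.0;
    return 1000.0 * static_cast<double>(batch_size) / items_per_second;
}
```

Under that assumption, `BatchTimeMs(120.0, 16)` works out to roughly 133.3 ms per batch of 16 frames.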

Natural Language Processing (BERT)

Assesses performance for complex NLP tasks like text classification.

Model: BERT-base
Task: Sentiment Analysis
Batch Size: 64
Precision: INT8
Queries per Second: 500 QPS
// Example Snippet (Conceptual)
auto sentences = loadSentences();
for (const auto& sentence : sentences) {
    auto sentiment = directmlDevice.AnalyzeSentiment(bertModel, sentence);
    // Log sentiment...
}
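
The configuration above lists a batch size of 64, while the conceptual snippet handles one sentence per call. One way to bridge the two is to group the inputs into fixed-size batches before inference. The helper `MakeBatches` below is a hypothetical illustration of that grouping step only; it performs no tokenization or inference.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits `items` into consecutive batches of at most `batch_size` elements.
// The final batch may be smaller when the input does not divide evenly.
std::vector<std::vector<std::string>> MakeBatches(
    const std::vector<std::string>& items, std::size_t batch_size) {
    std::vector<std::vector<std::string>> batches;
    if (batch_size == 0) return batches;

    for (std::size_t begin = 0; begin < items.size(); begin += batch_size) {
        const std::size_t end = std::min(begin + batch_size, items.size());
        batches.emplace_back(
            items.begin() + static_cast<std::ptrdiff_t>(begin),
            items.begin() + static_cast<std::ptrdiff_t>(end));
    }
    return batches;
}
```

With 130 sentences and a batch size of 64, this yields three batches of 64, 64, and 2 sentences.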

Generative Models (StyleGAN2)

Reports generation speed for high-fidelity synthetic data.

Model: StyleGAN2
Task: Image Generation
Latent Vector Size: 512
Precision: FP16
Images per Second: 15 IPS
// Example Snippet (Conceptual)
auto noise = generateLatentVector();
auto generatedImage = directmlDevice.GenerateImage(styleganModel, noise);
saveImage(generatedImage, "output.png");
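
The `generateLatentVector()` call in the snippet is left undefined. A minimal sketch of what such a helper might do follows, assuming the latent vector is drawn from a standard normal distribution, as is conventional for StyleGAN-style generators. The name `SampleLatentVector` and its parameters are hypothetical.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draws `size` independent samples from a standard normal distribution.
// A fixed seed makes the result reproducible on a given standard library.
std::vector<float> SampleLatentVector(std::size_t size, unsigned int seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<float> gaussian(0.0f, 1.0f);

    std::vector<float> z(size);
    for (float& value : z) value = gaussian(rng);
    return z;
}
```

For the configuration above, `SampleLatentVector(512, seed)` would produce one 512-element input; varying the seed gives a different generated image per call.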