Azure Batch – Core Concepts
Overview
Azure Batch enables large-scale parallel and high-performance computing (HPC) applications to run efficiently in the cloud. It abstracts the underlying compute infrastructure, allowing you to focus on the logic of your applications instead of managing virtual machines or clusters.
Job & Task Model
A Job is a container for a collection of Tasks. Each task represents a single unit of work to be executed on a compute node.
Job
├─ Task 1
├─ Task 2
└─ Task N
Jobs can be created manually via the portal, CLI, PowerShell, or programmatically through the Azure Batch SDKs.
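As a rough illustration of the job and task relationship, the sketch below builds the kind of JSON request bodies the Batch REST API accepts for creating a job and its tasks. The field names (id, poolInfo, commandLine) follow the REST API. The helper functions, the IDs, and the render.py command are hypothetical, and nothing is sent to the service.

```python
import json


def make_job(job_id, pool_id):
    """Build a job-creation body that binds the job to an existing pool."""
    return {"id": job_id, "poolInfo": {"poolId": pool_id}}


def make_tasks(prefix, commands):
    """Build one task body per command line; each task is one unit of work."""
    return [
        {"id": f"{prefix}-task-{i}", "commandLine": cmd}
        for i, cmd in enumerate(commands, start=1)
    ]


# Hypothetical IDs and commands, chosen only for illustration.
job = make_job("render-job", "render-pool")
tasks = make_tasks("render", [f"python render.py --frame {n}" for n in range(1, 4)])

print(json.dumps({"job": job, "tasks": tasks}, indent=2))
```

In a real client these dictionaries would be submitted through an SDK or an authenticated HTTP call; the sketch only shows that a job is little more than an ID plus a pool reference, while the tasks carry the actual work.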
Pools of Compute Nodes
A Pool is a collection of compute nodes (VMs) that run the tasks of your jobs. Pools can be configured with:
- Virtual machine size (CPU, memory, GPU)
- Operating system image
- Node count (static or autoscaling)
- Start task for node configuration
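The pool settings listed above map fairly directly onto a pool-creation request body. The following is a minimal sketch that assembles such a body as a plain dictionary. The field names follow the Batch REST API, but the make_pool helper, the VM size, the image reference, and the node agent SKU are example values assumed for illustration and may not match what is available in a given account or region.

```python
def make_pool(pool_id, vm_size, image, node_agent_sku, dedicated_nodes,
              start_command=None):
    """Assemble a pool-creation body from the settings a pool is configured with."""
    pool = {
        "id": pool_id,
        "vmSize": vm_size,                       # CPU, memory, GPU profile
        "virtualMachineConfiguration": {
            "imageReference": image,             # operating system image
            "nodeAgentSKUId": node_agent_sku,
        },
        "targetDedicatedNodes": dedicated_nodes,  # static node count
    }
    if start_command:
        # Runs on each node when it joins the pool, before it accepts tasks.
        pool["startTask"] = {"commandLine": start_command, "waitForSuccess": True}
    return pool


# Example values only (assumptions, not recommendations).
pool = make_pool(
    pool_id="render-pool",
    vm_size="standard_d2s_v3",
    image={
        "publisher": "canonical",
        "offer": "0001-com-ubuntu-server-jammy",
        "sku": "22_04-lts",
        "version": "latest",
    },
    node_agent_sku="batch.node.ubuntu 22.04",
    dedicated_nodes=2,
    start_command="/bin/bash -c 'echo node ready'",
)
print(pool["vmSize"], pool["targetDedicatedNodes"])
```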
Scheduling & Execution
Azure Batch automatically schedules tasks onto available nodes based on resource requirements, task dependencies, and priority.
- Task constraints: maximum retry count, maximum wall-clock time, and retention time.
- Job constraints: maximum wall-clock time and maximum task retry count.
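To make the effect of task dependencies concrete, here is a toy model (not the Batch scheduler itself) that groups tasks into waves, where every task in a wave has all of its dependencies completed and could therefore run in parallel on available nodes. It uses Python's standard graphlib module; the task names and the execution_waves helper are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on):
    """Group task IDs into waves; each wave only needs earlier waves to finish.

    depends_on maps a task ID to the IDs it depends on.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)                 # mark the wave as completed
    return waves


# Hypothetical pipeline: two transforms depend on one extract step.
deps = {
    "extract": [],
    "transform-a": ["extract"],
    "transform-b": ["extract"],
    "merge": ["transform-a", "transform-b"],
}
print(execution_waves(deps))
```

A real scheduler also weighs priority, retries, and node capacity; the sketch isolates only the ordering that dependencies impose.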
Autoscaling
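Define an autoscale formula to adjust the pool size dynamically based on service-defined metrics such as the number of pending tasks or CPU usage. A formula sets the desired node count by assigning to $TargetDedicatedNodes. The illustrative formula below averages pending tasks over the last five minutes (falling back to the latest sample when fewer than 70% of the expected samples are available), scales toward that count up to an example cap of 20 nodes, and removes one node per evaluation when nothing is pending:
samplePercent = $PendingTasks.GetSamplePercent(TimeInterval_Minute * 5);
pending = samplePercent < 70 ? max(0, $PendingTasks.GetSample(1)) : avg($PendingTasks.GetSample(TimeInterval_Minute * 5));
$TargetDedicatedNodes = pending > 0 ? min(pending, 20) : max(0, $CurrentDedicatedNodes - 1);
$NodeDeallocationOption = taskcompletion;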
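The scaling decision such a formula is meant to express (grow while tasks are pending, otherwise shrink one node at a time) can be modeled in plain Python to reason about its behavior before deploying it. This is a toy model, not the Batch formula language: the function name, the 70% sample threshold, and the cap of 20 nodes are assumptions made for the sketch.

```python
import math


def target_dedicated_nodes(pending_samples, expected_samples, current_nodes,
                           max_nodes=20):
    """Return the node count a simple pending-task autoscale rule would target."""
    if not pending_samples or expected_samples <= 0:
        return current_nodes  # no data at all: leave the pool unchanged

    sample_percent = 100 * len(pending_samples) / expected_samples
    if sample_percent < 70:
        # Too few samples for a trustworthy average: use the latest one.
        pending = pending_samples[-1]
    else:
        pending = sum(pending_samples) / len(pending_samples)

    if pending > 0:
        return min(math.ceil(pending), max_nodes)  # scale toward demand, capped
    return max(0, current_nodes - 1)               # idle: shrink by one node


print(target_dedicated_nodes([4, 6, 8, 6], expected_samples=4, current_nodes=2))
print(target_dedicated_nodes([0, 0, 0, 0], expected_samples=4, current_nodes=3))
```

Shrinking by a single node per evaluation is a conservative choice: it avoids releasing capacity all at once when the queue is only briefly empty.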
Pricing
Billing is based on the compute resources you provision (VM size, number of VMs, and runtime). Low-priority VMs cost less than dedicated VMs but can be preempted when Azure needs the capacity back, so use them for workloads that tolerate interruption.
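Because the charge scales with node count and runtime, a back-of-the-envelope estimate is simple arithmetic. The sketch below uses made-up hourly rates purely for illustration; they are not actual Azure prices, and the estimate_cost helper is hypothetical.

```python
from decimal import Decimal


def estimate_cost(node_count, hours, hourly_rate):
    """Estimate compute cost as nodes x hours x per-node hourly rate."""
    return node_count * hours * hourly_rate


# Hypothetical rates (NOT real prices): one dedicated, one discounted.
dedicated_rate = Decimal("0.20")
low_priority_rate = Decimal("0.04")

dedicated_cost = estimate_cost(10, 6, dedicated_rate)
low_priority_cost = estimate_cost(10, 6, low_priority_rate)
savings_percent = (dedicated_cost - low_priority_cost) / dedicated_cost * 100

print(f"dedicated: {dedicated_cost}, low-priority: {low_priority_cost}, "
      f"savings: {savings_percent}%")
```

The estimate ignores preemption: if discounted nodes are reclaimed and tasks rerun, the effective runtime (and cost) goes up.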
Security & Identity
Batch integrates with Microsoft Entra ID (formerly Azure Active Directory) and supports managed identities for secure access to storage and other Azure services.
- Transport encryption (TLS)
- Network isolation with virtual networks
- Access control via Azure role-based access control (RBAC)
Best Practices
- Use low-priority VMs to reduce costs when the workload tolerates interruption.
- Use autoscale formulas to match pool size to demand.
- Run a start task to install dependencies and set environment variables on each node.
- Monitor jobs using Azure Monitor and Batch metrics.