Manage Compute for Azure Machine Learning

This guide provides comprehensive instructions on managing compute resources for your Azure Machine Learning workloads. Learn how to create, configure, and scale various compute targets to optimize your machine learning workflows.

Supported Compute Targets

Azure Machine Learning offers a variety of compute targets suitable for different stages of your machine learning lifecycle:

- Compute instances: fully managed, single-node workstations for development and experimentation.
- Compute clusters: auto-scaling, multi-node clusters for training and batch inference.
- Inference clusters (managed Kubernetes): scalable, highly available compute for production deployments.

Creating a Compute Instance

Follow these steps to create a new compute instance:

1. Navigate to Compute in Azure ML Studio. In your Azure Machine Learning workspace, go to the 'Compute' section in the left-hand navigation pane.

2. Select the 'Compute instances' tab. Click on the 'Compute instances' tab and then click '+ New'.

3. Configure the compute instance details. Choose a virtual machine size, region, and name for your compute instance. You can also configure advanced settings such as SSH access.

4. Create the instance. Click 'Create' to provision your compute instance. Provisioning may take a few minutes.
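The studio steps above can also be performed from the Azure CLI. A minimal sketch, assuming the `ml` CLI extension is installed and `my-instance`, `my-resource-group`, and `my-workspace` are placeholder names:

```shell
# Create a single-node compute instance for development work.
az ml compute create \
  --name my-instance \
  --type ComputeInstance \
  --size Standard_DS3_v2 \
  --resource-group my-resource-group \
  --workspace-name my-workspace
```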

Tip: For cost efficiency, consider using Spot instances for compute clusters, especially for training jobs that can tolerate interruptions.
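As a sketch of that tip, a cluster can be provisioned on low-priority (Spot) VMs with the `--tier` flag; the cluster name, node counts, and resource names below are placeholders:

```shell
# Low-priority nodes cost less but can be preempted,
# so reserve them for interruption-tolerant training jobs.
az ml compute create \
  --name spot-cluster \
  --type AmlCompute \
  --size Standard_DS3_v2 \
  --min-instances 0 \
  --max-instances 4 \
  --tier low_priority \
  --resource-group my-resource-group \
  --workspace-name my-workspace
```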

Creating a Compute Cluster

Compute clusters provide scalable resources for training and batch inference.

az ml compute create --resource-group my-resource-group --workspace-name my-workspace --name cpu-cluster --type AmlCompute --size Standard_DS3_v2 --min-instances 0 --max-instances 10

Key Parameters for Compute Clusters:

- name: the name of the compute cluster within the workspace.
- type: the compute type; use AmlCompute for a managed compute cluster.
- size: the virtual machine size of each node (for example, Standard_DS3_v2 for CPU workloads).
- minimum node count: set this to 0 so the cluster scales to zero and stops incurring charges when idle.
- maximum node count: the upper limit the cluster can scale out to under load.

Managing Existing Compute

You can manage your compute resources through the Azure portal, Azure CLI, or the Python SDK.
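For example, with the Azure CLI you can enumerate and inspect compute targets; the resource group and workspace names below are placeholders:

```shell
# List all compute targets in the workspace.
az ml compute list --resource-group my-resource-group --workspace-name my-workspace --output table

# Show the full configuration of a single compute target.
az ml compute show --name cpu-cluster --resource-group my-resource-group --workspace-name my-workspace
```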

Scaling Compute Clusters

To scale a compute cluster, update its minimum and maximum node counts:

az ml compute update --resource-group my-resource-group --workspace-name my-workspace --name cpu-cluster --max-instances 20

Deleting Compute Resources

To delete a compute instance or cluster:

az ml compute delete --resource-group my-resource-group --workspace-name my-workspace --name my-compute-name --yes

Best Practices

- Set the minimum node count of compute clusters to 0 so they scale to zero when idle and you only pay for active nodes.
- Use Spot (low-priority) instances for training jobs that can tolerate interruptions.
- Delete compute instances and clusters you no longer need to avoid unnecessary charges.

Tip: For production deployments, consider using Inference Clusters (managed Kubernetes) for high availability and scalability.
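If you route production traffic through Kubernetes, an existing AKS or Azure Arc-enabled cluster can be attached to the workspace as a compute target; the compute name and resource ID below are placeholders:

```shell
# Attach an existing Kubernetes cluster to the workspace for inference workloads.
az ml compute attach \
  --type Kubernetes \
  --name k8s-compute \
  --resource-id <KUBERNETES-CLUSTER-RESOURCE-ID> \
  --resource-group my-resource-group \
  --workspace-name my-workspace
```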