Getting Started with Azure AI Machine Learning

Last Updated: October 26, 2023

Welcome to Azure AI Machine Learning! This guide walks you through the essential steps to set up and start using Azure AI Machine Learning for your machine learning projects: creating a workspace, installing the SDK and CLI, learning the key concepts, and submitting your first training job. Model registration and deployment are covered briefly under Next Steps.

Step 1: Prerequisites

Before you begin, ensure you have the following:

An Azure subscription. If you don't have one, you can sign up for a free trial.
Appropriate permissions to create Azure resources.
Basic understanding of machine learning concepts.

Step 2: Create an Azure AI Machine Learning Workspace

A workspace is the top-level resource in Azure AI Machine Learning and provides a centralized place to work with all the artifacts you create. Follow these steps:

Navigate to the Azure portal.
Search for "Azure AI Machine Learning" and select it.
Click "Create".
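Fill in the required details: subscription, resource group, workspace name, and region. You can optionally select an existing Azure Storage account, Azure Key Vault, and Application Insights instance; if you leave these blank, new ones are created with the workspace.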
Review and create the workspace.

For detailed instructions, refer to the Workspace Setup Guide.
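
If you prefer to script this step, the portal flow above can be approximated with the Python SDK (installed in Step 3). The following is a minimal sketch, not the only way to do it: the helper names workspace_settings and create_workspace are hypothetical, and the resource group is assumed to exist already. When no storage account, key vault, or Application Insights instance is passed, the service creates them for you.

```python
def workspace_settings(name: str, location: str, description: str = "") -> dict:
    """Collect the fields the portal form asks for into keyword arguments."""
    return {
        "name": name,
        "location": location,
        "display_name": name,
        "description": description,
    }


def create_workspace(subscription_id: str, resource_group: str, settings: dict):
    """Create the workspace and block until provisioning finishes."""
    # Imported here so workspace_settings() can be used without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import Workspace
    from azure.identity import DefaultAzureCredential

    # A client scoped to a resource group (no workspace yet) can create workspaces.
    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
    )
    poller = ml_client.workspaces.begin_create(Workspace(**settings))
    return poller.result()
```

For example, create_workspace("YOUR_SUBSCRIPTION_ID", "YOUR_RESOURCE_GROUP", workspace_settings("my-workspace", "eastus")) would provision a workspace named my-workspace; the name and region here are placeholders.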

Step 3: Install the Azure ML SDK and CLI

The Azure ML SDK for Python and the Azure CLI extension for AI Machine Learning are essential tools for interacting with your workspace programmatically and from the command line.

Azure ML SDK for Python

Install the SDK using pip:

pip install azure-ai-ml azure-identity

Azure CLI Extension

Install the extension using Azure CLI:

az extension add --name ml

For more information on installing and configuring these tools, see the SDK and CLI Setup page.

Tip: Connect your local environment to your Azure AI ML workspace by configuring your credentials. You can use environment variables, managed identity, or service principals for authentication.
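
As a rough illustration of the tip above, DefaultAzureCredential from azure-identity tries several sources in turn, including service principal environment variables, managed identity, and an existing Azure CLI login. The sketch below checks whether the three service principal variables are set before building a client. The helper names missing_service_principal_vars and get_ml_client are hypothetical.

```python
import os

# Variables read by azure-identity for service principal (client secret) auth.
SERVICE_PRINCIPAL_VARS = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")


def missing_service_principal_vars(environ=None) -> list:
    """Return the service principal variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in SERVICE_PRINCIPAL_VARS if not env.get(name)]


def get_ml_client(subscription_id: str, resource_group: str, workspace_name: str):
    """Build an MLClient using whichever credential source is available."""
    # Imported here so the check above can be used without the SDK installed.
    from azure.ai.ml import MLClient
    from azure.identity import DefaultAzureCredential

    missing = missing_service_principal_vars()
    if missing:
        # Not an error: DefaultAzureCredential falls back to other sources,
        # such as managed identity or an `az login` session.
        print(f"Service principal variables not set: {', '.join(missing)}")

    return MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
```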

Step 4: Explore Key Concepts

Understanding these core concepts will help you navigate Azure AI Machine Learning effectively:

Experiments: Group related jobs so you can track and compare your training runs.
Jobs: Represent a single execution of a script or pipeline.
Models: Register and manage your trained machine learning models.
Endpoints: Deploy models for real-time or batch inference.
Compute: Manage compute resources for training and inference (e.g., Compute Instances, Compute Clusters, Inference Clusters).
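
One way to see how these concepts map onto the SDK is to list what a workspace currently contains. The sketch below assumes an already authenticated MLClient is passed in; the function name summarize_workspace and the limit of five jobs are illustrative choices. Each job carries the experiment it belongs to, so experiments appear through their jobs rather than as a separate list.

```python
from itertools import islice


def summarize_workspace(ml_client, max_jobs: int = 5) -> dict:
    """Return the names of a few resource types found in the workspace."""
    return {
        # Compute instances and clusters attached to the workspace.
        "compute": [c.name for c in ml_client.compute.list()],
        # Only the first few jobs; a workspace may hold a long job history.
        "jobs": [j.name for j in islice(ml_client.jobs.list(), max_jobs)],
        # Registered models.
        "models": [m.name for m in ml_client.models.list()],
        # Real-time and batch inference endpoints.
        "online_endpoints": [e.name for e in ml_client.online_endpoints.list()],
        "batch_endpoints": [e.name for e in ml_client.batch_endpoints.list()],
    }
```

A fresh workspace will typically return empty lists for everything except, possibly, compute.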

Step 5: Your First Training Run

Let's run a simple training script. You'll need:

A Python script containing your training code (e.g., train.py).
A compute target to run your script on.
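
If you don't have a training script yet, a placeholder train.py along the following lines is enough to exercise the workflow. This is only a sketch: it accepts the --data-path and --learning-rate arguments used by the job command in this guide, reports what it finds, and leaves the actual training as a comment. The default values are assumptions made so that the script also runs locally without arguments.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the command-line arguments passed by the job command."""
    parser = argparse.ArgumentParser(description="Placeholder training script")
    parser.add_argument("--data-path", type=str, default=".",
                        help="Folder containing the training data")
    parser.add_argument("--learning-rate", type=float, default=0.01,
                        help="Learning rate for the optimizer")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)

    # When run as a job, a uri_folder input resolves to a path on the compute node.
    if os.path.isdir(args.data_path):
        files = sorted(os.listdir(args.data_path))
    else:
        files = []
    print(f"Found {len(files)} file(s) in {args.data_path}")
    print(f"Learning rate: {args.learning_rate}")

    # Real training code (load data, fit a model, save it) would go here.
    return args


if __name__ == "__main__":
    main()
```

Save the file as src/train.py so that it is picked up by a job whose code folder is ./src.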

Here's a conceptual example using the SDK:

from azure.ai.ml import MLClient, command, Input
from azure.ai.ml.entities import AmlCompute
from azure.identity import DefaultAzureCredential

# Authenticate and get ML client
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="YOUR_SUBSCRIPTION_ID",
    resource_group_name="YOUR_RESOURCE_GROUP",
    workspace_name="YOUR_WORKSPACE_NAME",
)

# Define the compute target; begin_create_or_update creates it if it does not exist
compute_name = "cpu-cluster"
cpu_cluster = AmlCompute(
    name=compute_name,
    size="STANDARD_DS3_V2",
    min_instances=0,  # Scale down to zero nodes when idle
    max_instances=4,
    idle_time_before_scale_down=120,  # Seconds of idle time before scaling down
)
ml_client.compute.begin_create_or_update(cpu_cluster).result()

# Define the training job
job = command(
    code="./src",  # Local path to your training code
    command="python train.py --data-path ${{inputs.training_data}} --learning-rate 0.01",
    inputs={
        "training_data": Input(type="uri_folder", path="azureml://datastores/workspaceblobstore/paths/datasets/my_training_data/")
    },
    environment="azureml://registries/azureml/environments/sklearn-1.0/versions/1",  # Example only; use an environment that exists in your workspace or registry
    compute=compute_name,
    display_name="my-first-training-job",
    experiment_name="my-first-experiment",
)

# Submit the job
returned_job = ml_client.jobs.create_or_update(job)
print(f"Job submitted: {returned_job.name}")

Replace placeholders like YOUR_SUBSCRIPTION_ID, YOUR_RESOURCE_GROUP, and YOUR_WORKSPACE_NAME with your actual values.
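
Submitting a job returns immediately, so you will usually want to follow its progress. The sketch below streams the job's logs until it finishes and then returns the final status. The function name wait_for_job is hypothetical, and ml_client is assumed to be an authenticated MLClient such as the one created above.

```python
def wait_for_job(ml_client, job_name: str) -> str:
    """Stream a job's logs until it finishes, then return its final status."""
    # stream() blocks and prints log output until the job reaches a final state.
    ml_client.jobs.stream(job_name)

    # Fetch the job again to read its final status, for example "Completed" or "Failed".
    job = ml_client.jobs.get(job_name)
    return job.status
```

For instance, wait_for_job(ml_client, returned_job.name) follows the job submitted above. The returned job object also exposes a studio_url attribute that links to the run in the studio UI.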

Next Steps: After a successful training run, you can register your model and deploy it to an endpoint. Explore the Tutorials section for more in-depth examples.
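
As a starting point for the registration step, the sketch below registers a model from a completed job's outputs. It assumes the training script saved its model files under an artifact folder named model, which the placeholder scripts in this guide do not do on their own. The helper names job_output_model_uri and register_model are hypothetical.

```python
def job_output_model_uri(job_name: str, artifact_path: str = "model") -> str:
    """Build the URI of an artifact folder produced by a job."""
    return f"azureml://jobs/{job_name}/outputs/artifacts/paths/{artifact_path}/"


def register_model(ml_client, job_name: str, model_name: str):
    """Register the job's model folder as a new model version in the workspace."""
    # Imported here so the URI helper can be used without the SDK installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Model

    model = Model(
        path=job_output_model_uri(job_name),
        name=model_name,
        type=AssetTypes.CUSTOM_MODEL,  # Arbitrary files; not an MLflow model
        description=f"Model produced by job {job_name}",
    )
    return ml_client.models.create_or_update(model)
```

Once registered, the model can be referenced by name and version when you create a deployment for an endpoint.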