Azure Machine Learning SDK Reference

Introduction to the Azure Machine Learning SDK

The Azure Machine Learning SDK for Python lets you manage and orchestrate machine learning workflows on Azure. It provides classes and functions for working with Azure Machine Learning resources across the whole workflow: data preparation, model training, deployment, and monitoring.

This SDK enables developers and data scientists to build, train, and deploy machine learning models at scale. Whether you're working with simple scripts or complex deep learning models, the SDK offers flexibility and control over your entire ML lifecycle.

Key Components

Workspaces

A workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all the artifacts you create when you use Azure Machine Learning. This includes notebooks, compute instances, experiments, models, and datastores.

from azureml.core import Workspace
ws = Workspace.from_config()
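
Workspace.from_config() reads the workspace details from a config.json file, which it looks for in the current directory, a .azureml subdirectory, or a parent directory. As a minimal sketch of what that file contains, the snippet below writes one using only the standard library. The helper name write_workspace_config is hypothetical, and the three values are placeholders to replace with your own.

```python
import json
import os
import tempfile


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config.json under <directory>/.azureml and return its path."""
    config_dir = os.path.join(directory, ".azureml")
    os.makedirs(config_dir, exist_ok=True)
    config_path = os.path.join(config_dir, "config.json")
    # These three keys are the ones Workspace.from_config() expects.
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config_path


# Demo in a throwaway directory; all three values are placeholders.
demo_dir = tempfile.mkdtemp()
demo_path = write_workspace_config(
    demo_dir,
    subscription_id="<your-subscription-id>",
    resource_group="<your-resource-group>",
    workspace_name="<your-workspace-name>",
)
print(f"Wrote {demo_path}")
```

With a real file in your project directory, Workspace.from_config() should find it without arguments. You can also pass an explicit location with the path parameter.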

Compute Resources

The SDK allows you to create and manage various compute targets for training and inference, including:

Compute Instances: Cloud-based workstations for development.
Compute Clusters: Scalable clusters for distributed training.
Inference Clusters: Managed Kubernetes clusters for deploying models.

from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Define the compute target configuration
compute_name = "my-aml-compute"
vm_size = "STANDARD_DS11_V2"
min_nodes = 0
max_nodes = 4

try:
    compute_target = ComputeTarget(workspace=ws, name=compute_name)
    print(f"Found existing compute target: {compute_name}")
except ComputeTargetException:
    print(f"Creating new compute target: {compute_name}")
    compute_config = AmlCompute.provisioning_configuration(
        vm_size=vm_size,
        min_nodes=min_nodes,
        max_nodes=max_nodes,
    )
    compute_target = ComputeTarget.create(ws, compute_name, compute_config)
    compute_target.wait_for_completion(show_output=True)
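
Setting min_nodes to 0, as above, lets the cluster scale down to zero nodes when idle. The sketch below shows one way to validate the scaling settings before passing them to AmlCompute.provisioning_configuration. The helper build_cluster_settings is hypothetical and not part of the SDK; idle_seconds_before_scaledown is an existing parameter of provisioning_configuration, and the value used here is only an illustrative choice.

```python
def build_cluster_settings(vm_size, min_nodes=0, max_nodes=4,
                           idle_seconds_before_scaledown=1800):
    """Return keyword arguments for AmlCompute.provisioning_configuration.

    Hypothetical helper: it only checks that the node counts are sensible.
    """
    if min_nodes < 0:
        raise ValueError("min_nodes must be zero or greater")
    if max_nodes < min_nodes:
        raise ValueError("max_nodes must be at least min_nodes")
    return {
        "vm_size": vm_size,
        "min_nodes": min_nodes,
        "max_nodes": max_nodes,
        # How long a node may sit idle before the cluster scales it down.
        "idle_seconds_before_scaledown": idle_seconds_before_scaledown,
    }


settings = build_cluster_settings("STANDARD_DS11_V2", min_nodes=0, max_nodes=4)
print(settings)
```

The resulting dictionary can then be unpacked into the call, for example AmlCompute.provisioning_configuration(**settings).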

Experiments and Jobs

Experiments are containers for your runs. A run represents a single execution of your training script. The SDK simplifies submitting and tracking these runs.

from azureml.core import Experiment, ScriptRunConfig

# Configure the run
src = ScriptRunConfig(source_directory='.', script='train.py', compute_target=compute_target)

# Create an experiment and submit the run
experiment = Experiment(workspace=ws, name='my-training-experiment')
run = experiment.submit(src)
run.wait_for_completion(show_output=True)
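
The train.py script referenced above is not shown in this reference. As a rough sketch of what such a script might look like, the example below fits a toy line with the standard library, logs one metric when the SDK is available, and saves the result to outputs/model.pkl. The data, the fit_line and train_and_save helpers, and the dictionary used as a "model" are all illustrative assumptions. Files written to the outputs folder are uploaded to the run record automatically.

```python
import os
import pickle

try:
    from azureml.core import Run
    run = Run.get_context()  # returns an offline stand-in outside Azure ML
except ImportError:
    run = None  # SDK not installed; metric logging is skipped


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (toy stand-in for a real model)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def train_and_save(output_dir="outputs"):
    """Train on hard-coded toy data and pickle the model into output_dir."""
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.0, 5.0, 7.0]
    model = fit_line(xs, ys)
    if run is not None:
        run.log("slope", model["slope"])  # appears as a metric on the run
    os.makedirs(output_dir, exist_ok=True)
    model_path = os.path.join(output_dir, "model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump(model, f)
    return model_path


if __name__ == "__main__":
    print(f"Saved model to {train_and_save()}")
```

Guarding the import keeps the script runnable on a machine without the SDK, which can be convenient for quick local checks before submitting a run.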

Models and Deployments

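Registering a trained model stores it in the workspace under a name and version so that it can be retrieved later for deployment.

from azureml.core.model import Model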

# Register the model
model = Model.register(model_path='outputs/model.pkl', # Relative path to the model file
                       model_name='my-sklearn-model',
                       tags={'area': 'regression', 'type': 'sklearn'},
                       properties={'accuracy': 0.95},
                       workspace=ws)

print(f"Registered model: {model.name}, version: {model.version}")
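
Deploying a registered model as a web service requires an entry script that defines init() and run(). The sketch below shows the general shape of such a script, assuming a single registered model file named model.pkl. To stay dependency-free, the "model" here is assumed to be a pickled dictionary with slope and intercept keys; with a scikit-learn model you would call model.predict(...) instead. The request format (a JSON object with a "data" list) is also an assumption.

```python
import json
import os
import pickle

model = None


def init():
    """Called once when the service starts: load the model into memory."""
    global model
    # AZUREML_MODEL_DIR is set by the service; the fallback is for local runs.
    model_dir = os.getenv("AZUREML_MODEL_DIR", "outputs")
    with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
        model = pickle.load(f)


def run(raw_data):
    """Called per request: parse the JSON payload and return predictions."""
    try:
        xs = json.loads(raw_data)["data"]
        # Stand-in for model.predict(xs) on a real estimator.
        predictions = [model["slope"] * x + model["intercept"] for x in xs]
        return {"predictions": predictions}
    except Exception as exc:
        # Return the error to the caller instead of crashing the service.
        return {"error": str(exc)}
```

Keeping model loading in init() means the file is read once per service start rather than on every request.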

Data Management

Connect to your data sources through datastores, and create datasets on top of them to version and track the data you train on.

from azureml.core import Dataset

# Get a reference to the workspace's default datastore
datastore = ws.get_default_datastore()

# Create a tabular dataset from delimited files on the datastore, then register it.
# These lines are commented out because the path is a placeholder.
# dataset = Dataset.Tabular.from_delimited_files(path=(datastore, 'path/to/your/data/*.csv'))
# dataset = dataset.register(workspace=ws, name='my-training-data', create_new_version=True)
