
Deploying Machine Learning Models on Azure

This document guides you through the process of deploying your trained machine learning models as real-time inference services or batch inference jobs on Azure. Learn about the different deployment options and best practices to make your models accessible and scalable.

Introduction to Model Deployment

Once your machine learning model is trained and validated, the next critical step is to deploy it so that it can be used to make predictions on new, unseen data. Azure Machine Learning provides a robust platform with various deployment targets and strategies to suit different needs, from low-latency real-time predictions to large-scale batch processing.

Deployment Targets on Azure

Azure Machine Learning supports deployment to several targets: managed online endpoints for low-latency real-time inference, Azure Kubernetes Service (AKS) clusters attached to the workspace for customized real-time serving, and batch endpoints for asynchronous, large-scale scoring.

Real-time Inference Deployment

Real-time inference is crucial when you need immediate predictions for individual data points. Azure Machine Learning offers managed endpoints and AKS for this purpose.

Deploying to Managed Endpoints

Managed endpoints abstract away the complexity of infrastructure management. You can deploy your model as a REST API endpoint that accepts input data and returns predictions in real-time.

[Diagram: Azure ML managed online endpoint deployment]

Example Python SDK code for creating a managed online endpoint and deployment:


from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

# Authenticate and get an ML client for the target workspace
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# Define the online endpoint
endpoint = ManagedOnlineEndpoint(
    name="my-online-endpoint",
    description="A sample online endpoint",
    auth_mode="key",
)

# Create the endpoint (a long-running operation; .result() blocks until done)
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Define the online deployment. For non-MLflow models you must also supply
# an environment and a code_configuration with a scoring script.
deployment = ManagedOnlineDeployment(
    name="my-deployment",
    endpoint_name="my-online-endpoint",
    model="azureml:my-model:1",  # Replace with your registered model
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

# Create the deployment
ml_client.online_deployments.begin_create_or_update(deployment).result()

# Route 100% of endpoint traffic to this deployment
endpoint.traffic = {"my-deployment": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
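
Once the deployment is live, you can test it directly from the SDK. A minimal sketch, assuming a request file named sample-request.json (a hypothetical file) whose JSON matches your model's expected input schema:

# Send a test request through the deployment; sample-request.json is a
# hypothetical file containing input in the shape your model expects
response = ml_client.online_endpoints.invoke(
    endpoint_name="my-online-endpoint",
    deployment_name="my-deployment",
    request_file="sample-request.json",
)
print(response)

# Fetch the endpoint's auth keys for calling the REST API from other clients
keys = ml_client.online_endpoints.get_keys(name="my-online-endpoint")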

Deploying to Azure Kubernetes Service (AKS)

For production environments that demand fine-grained control over scaling, networking, and node configuration, you can attach an AKS cluster to your workspace as a Kubernetes compute target and deploy your model to it as an online endpoint, as sketched below.
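
A minimal sketch of the Kubernetes variant of the same flow, assuming an AKS cluster has already been attached to the workspace under the compute name k8s-compute (a placeholder) and reusing the ml_client from above:

from azure.ai.ml.entities import KubernetesOnlineEndpoint, KubernetesOnlineDeployment

# Endpoint bound to the attached Kubernetes compute target
endpoint = KubernetesOnlineEndpoint(
    name="my-k8s-endpoint",
    compute="k8s-compute",  # placeholder: name of the attached AKS compute
    auth_mode="key",
)
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# The deployment mirrors the managed case; capacity comes from the cluster
# nodes rather than a managed VM SKU
deployment = KubernetesOnlineDeployment(
    name="my-k8s-deployment",
    endpoint_name="my-k8s-endpoint",
    model="azureml:my-model:1",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()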

Batch Inference Deployment

Batch inference is ideal for processing large datasets asynchronously. Azure Machine Learning provides Batch Endpoints for this scenario.

With Batch Endpoints, you can submit batch scoring jobs that process data stored in Azure Blob Storage or Data Lake Storage and output predictions to a specified location.

[Diagram: Azure ML batch endpoint deployment]

Example Python SDK code for creating a batch endpoint and job:


from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import BatchEndpoint, BatchDeployment, CodeConfiguration
from azure.identity import DefaultAzureCredential

# Authenticate and get an ML client for the target workspace
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)

# Define the batch endpoint
endpoint = BatchEndpoint(
    name="my-batch-endpoint",
    description="A sample batch endpoint",
)

# Create the endpoint
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

# Define the batch deployment. Batch deployments run on a compute cluster,
# so you reference the cluster by name instead of choosing a VM size here.
# Non-MLflow models also require an environment for the scoring script.
deployment = BatchDeployment(
    name="my-batch-deployment",
    endpoint_name="my-batch-endpoint",
    model="azureml:my-model:1",  # Replace with your registered model
    code_configuration=CodeConfiguration(
        code="./src",  # Path to your scoring script directory
        scoring_script="score.py",
    ),
    compute="batch-cluster-compute",  # Replace with your compute cluster name
    instance_count=3,
)

# Create the deployment
ml_client.batch_deployments.begin_create_or_update(deployment).result()

# Submit a batch scoring job by invoking the endpoint
input_data = Input(
    type="uri_folder",
    path="azureml://datastores/workspaceblobstore/paths/input_data/",
)
job = ml_client.batch_endpoints.invoke(
    endpoint_name="my-batch-endpoint",
    deployment_name="my-batch-deployment",
    input=input_data,
)
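
The scoring script referenced above must implement the batch scoring contract: an init() function that loads the model once per worker, and a run(mini_batch) function that scores each mini-batch of input files and returns one result per input row. A minimal sketch of score.py, assuming a pickled scikit-learn style model saved as model.pkl inside the registered model folder (both names are illustrative):

import os
import pickle

import pandas as pd

model = None

def init():
    # Called once per worker; AZUREML_MODEL_DIR points at the registered model
    global model
    model_path = os.path.join(os.environ["AZUREML_MODEL_DIR"], "model.pkl")
    with open(model_path, "rb") as f:
        model = pickle.load(f)

def run(mini_batch):
    # mini_batch is a list of file paths from the input uri_folder
    results = []
    for file_path in mini_batch:
        data = pd.read_csv(file_path)
        predictions = model.predict(data)
        results.extend(str(p) for p in predictions)
    # Each returned item is appended as one row of the output predictions file
    return results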

Best Practices for Deployment

A few practices pay off regardless of deployment target: register and version your models in the workspace so every deployment is reproducible; validate a new deployment locally or with a small instance count before scaling out; roll out new model versions with blue/green deployments, shifting endpoint traffic gradually rather than cutting over all at once; enable Application Insights on online endpoints to monitor latency and failures; and secure endpoints with key- or token-based authentication.
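
For example, a blue/green rollout is just a traffic assignment on the endpoint. A minimal sketch, assuming two deployments named blue and green (hypothetical names) already exist on the endpoint:

# Shift 10% of live traffic to the new "green" deployment as a canary;
# "blue" and "green" are hypothetical deployment names
endpoint = ml_client.online_endpoints.get(name="my-online-endpoint")
endpoint.traffic = {"blue": 90, "green": 10}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()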

Next Steps

Ready to deploy your first model? Explore the detailed tutorials and quickstarts available in the Azure Machine Learning documentation.
