How to deploy Azure Machine Learning models
Introduction
Deploying your trained machine learning models to production is a critical step in bringing your AI solutions to life. Azure Machine Learning provides a comprehensive set of tools and services to help you deploy models as web services, integrate them with applications, and manage their lifecycle.
This guide will walk you through the common deployment scenarios and best practices for deploying models using Azure Machine Learning.
Deployment Targets
Azure Machine Learning supports deployment to various targets, allowing you to choose the best fit for your application's needs:
- Azure Container Instances (ACI): Ideal for development, testing, and low-scale production. Provides a quick and easy way to deploy models without managing underlying infrastructure.
- Azure Kubernetes Service (AKS): Recommended for production workloads requiring high scalability, availability, and management of complex deployments.
- Azure Machine Learning Managed Endpoints: A fully managed inference service for real-time and batch scoring. Simplifies deployment and management.
Common Deployment Scenarios
Deploying to Azure Container Instances (ACI)
ACI is a great starting point for deploying your models. It's serverless and easy to set up.
Step 1: Create an Inference Configuration
This involves defining how your model will be served. You'll need a scoring script (e.g., score.py) and an environment.
# score.py
import json
import os
import numpy as np
import joblib  # sklearn.externals.joblib was removed from scikit-learn; use the standalone joblib package

def init():
    global model
    # Azure ML mounts registered model files under the AZUREML_MODEL_DIR environment variable
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')  # Replace with your model file name
    model = joblib.load(model_path)

def run(raw_data):
    try:
        data = np.array(json.loads(raw_data)['data'])
        # Make a prediction
        result = model.predict(data)
        # Return any JSON-serializable object
        return {"result": result.tolist()}
    except Exception as e:
        return {"error": str(e)}
Step 2: Create a Deployment Configuration
This specifies the compute resources needed for your deployment.
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig, Model

# Assumes 'workspace' is your Workspace, 'model' is your registered Model object,
# and 'myenv' is an azureml.core.Environment holding your dependencies
inference_config = InferenceConfig(entry_script="score.py", environment=myenv)
aci_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                description='Deploy my model to ACI')

service = Model.deploy(workspace=workspace,
                       name='my-aci-service',
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)
service.wait_for_deployment(show_output=True)
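Once the deployment succeeds, you can send a quick test request through the service object; the sample payload below is illustrative and assumes the {"data": [...]} format expected by score.py:

import json

test_payload = json.dumps({"data": [[1.0, 2.0], [3.0, 4.0]]})
print(service.run(input_data=test_payload))
print(service.scoring_uri)  # REST endpoint for external clients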
For more complex environments, consider using Dockerfiles with your Azure ML environment.
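For reference, here is one way to define the myenv environment object used in the snippets above, either from a conda specification or from your own Dockerfile; the file names are illustrative:

from azureml.core import Environment

# Build the environment from a conda dependencies file
myenv = Environment.from_conda_specification(name="myenv", file_path="conda.yml")

# Or drive the image build from a custom Dockerfile instead
# myenv.docker.base_dockerfile = "./Dockerfile"
# myenv.docker.base_image = None  # Must be cleared when base_dockerfile is set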
Deploying to Azure Kubernetes Service (AKS)
AKS provides a robust platform for scalable and highly available deployments.
Step 1: Create an AKS Cluster
You can create an AKS cluster via the Azure portal, Azure CLI, or SDK.
from azureml.core.compute import ComputeTarget, AksCompute

# Define the cluster provisioning configuration
prov_config = AksCompute.provisioning_configuration(location='eastus',
                                                    agent_count=3,
                                                    vm_size='Standard_DS3_v2')

# Create the cluster
aks_target = ComputeTarget.create(workspace=workspace,
                                  name='my-aks-cluster',
                                  provisioning_configuration=prov_config)
aks_target.wait_for_completion(show_output=True)
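If you already operate an AKS cluster, you can attach it to the workspace instead of provisioning a new one; the resource group and cluster names below are placeholders:

from azureml.core.compute import ComputeTarget, AksCompute

# Attach an existing AKS cluster to the workspace
attach_config = AksCompute.attach_configuration(resource_group='my-resource-group',
                                                cluster_name='existing-aks-cluster')
aks_target = ComputeTarget.attach(workspace=workspace,
                                  name='my-aks-cluster',
                                  attach_configuration=attach_config)
aks_target.wait_for_completion(show_output=True)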
Step 2: Deploy to AKS
Similar to ACI, you define an inference configuration and a deployment configuration, but this time targeting your AKS cluster.
from azureml.core.webservice import AksWebservice
from azureml.core.model import InferenceConfig, Model

inference_config = InferenceConfig(entry_script="score.py", environment=myenv)
aks_config = AksWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                autoscale_enabled=True,
                                                autoscale_max_replicas=3)

service = Model.deploy(workspace=workspace,
                       name='my-aks-service',
                       models=[model],  # Your registered model object
                       inference_config=inference_config,
                       deployment_config=aks_config,
                       deployment_target=aks_target,  # Your AKS compute target
                       overwrite=True)
service.wait_for_deployment(show_output=True)
Ensure your AKS cluster has enough resources and is properly configured for your model's demands.
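AKS web services use key authentication by default, so external clients must send a key in the Authorization header. A minimal sketch with requests; the payload again assumes the score.py input format:

import json
import requests

key, _ = service.get_keys()  # Primary and secondary auth keys
headers = {'Content-Type': 'application/json',
           'Authorization': f'Bearer {key}'}
payload = json.dumps({"data": [[1.0, 2.0]]})
response = requests.post(service.scoring_uri, data=payload, headers=headers)
print(response.json())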
Azure Machine Learning Managed Endpoints
Managed endpoints offer a simplified and scalable way to deploy models. They abstract away much of the infrastructure management.
Real-time Endpoints
For low-latency, high-throughput inference.
Step 1: Create an Online Endpoint
Managed endpoints are created with the Azure Machine Learning Python SDK v2 (the azure-ai-ml package) rather than azureml-core. Connect an MLClient to your workspace, then define the endpoint configuration.

from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

# Connect to your workspace (fill in your own identifiers)
ml_client = MLClient(DefaultAzureCredential(),
                     subscription_id='<subscription-id>',
                     resource_group_name='<resource-group>',
                     workspace_name='<workspace-name>')

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(name='my-realtime-endpoint',
                                 description='Real-time inference endpoint',
                                 auth_mode='key')
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
Step 2: Create a Deployment
Deploy your model to the endpoint.
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

deployment = ManagedOnlineDeployment(name='my-deployment',
                                     endpoint_name='my-realtime-endpoint',
                                     model=model,        # SDK v2 Model entity, e.g. ml_client.models.get('my-model', version=1)
                                     environment=myenv,  # SDK v2 Environment entity
                                     code_configuration=CodeConfiguration(code='.',  # Folder containing the scoring script
                                                                          scoring_script='score.py'),
                                     instance_type='Standard_DS2_v2',
                                     instance_count=1)
ml_client.online_deployments.begin_create_or_update(deployment).result()
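After the deployment is created, route traffic to it and test it; the request file name below is illustrative:

# Send 100% of traffic to the new deployment
endpoint.traffic = {'my-deployment': 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Invoke the endpoint with a sample request file
response = ml_client.online_endpoints.invoke(endpoint_name='my-realtime-endpoint',
                                             deployment_name='my-deployment',
                                             request_file='sample-request.json')
print(response)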
Batch Endpoints
For scoring large datasets offline.
Deployment for batch endpoints follows a similar pattern: you define a batch deployment on a batch endpoint and point it at a compute cluster that runs the scoring jobs.
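A minimal sketch with SDK v2, assuming 'cpu-cluster' is an existing compute cluster in your workspace (non-MLflow models additionally need a batch scoring script and environment):

from azure.ai.ml.entities import BatchEndpoint, BatchDeployment, BatchRetrySettings

# Create the batch endpoint
batch_endpoint = BatchEndpoint(name='my-batch-endpoint',
                               description='Batch scoring endpoint')
ml_client.batch_endpoints.begin_create_or_update(batch_endpoint).result()

# Deploy the model to it
batch_deployment = BatchDeployment(name='my-batch-deployment',
                                   endpoint_name='my-batch-endpoint',
                                   model=model,
                                   compute='cpu-cluster',  # Name is illustrative
                                   instance_count=2,
                                   mini_batch_size=10,
                                   retry_settings=BatchRetrySettings(max_retries=3, timeout=300))
ml_client.batch_deployments.begin_create_or_update(batch_deployment).result()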
Managed endpoints simplify scaling, monitoring, and updating your deployed models.
Monitoring and Management
Once deployed, it's crucial to monitor your models for performance, drift, and errors.
- Azure Monitor: Collects metrics and logs for your deployed services.
- Application Insights: Provides detailed insights into application performance and usage; it can be enabled on an existing service, as sketched after this list.
- Model performance tracking: Implement logging within your scoring script to track predictions and potential issues.
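For example, Application Insights can be switched on for an already-deployed SDK v1 web service, and printed output from the scoring script surfaces as captured traces. A minimal sketch:

# Enable Application Insights on an existing web service (SDK v1)
service.update(enable_app_insights=True)

# Inside score.py's run(), printed output is collected as service logs:
# def run(raw_data):
#     print(f"Received request: {raw_data}")  # Shows up in Application Insights traces
#     ...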
Conclusion
Azure Machine Learning offers flexible and powerful options for deploying your models. Whether you choose ACI for simplicity, AKS for robust production, or Managed Endpoints for ease of use, you can effectively bring your AI solutions to production environments.