Deploy an Azure Machine Learning model as a web service

This guide walks you through the process of deploying your trained Azure Machine Learning model as a real-time web service. This allows your applications to consume the model's predictions over HTTP.

Prerequisites

  • An Azure subscription.
  • An Azure Machine Learning workspace.
  • A trained machine learning model registered in your workspace.
  • The Azure CLI and the Azure Machine Learning extension installed (or use Azure ML studio).

Steps to Deploy

1. Define the entry script and environment

The entry script (e.g., score.py) contains the logic for loading your model and processing incoming requests. You'll also define a Conda environment file (e.g., conda.yml) specifying the dependencies required by your script and model.

# score.py
import json
import os

import joblib
import numpy as np

def init():
    # This function is called once, when the container starts.
    # Load the model object here.
    global model
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It points to the directory where the model is downloaded.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_model.pkl')
    model = joblib.load(model_path)

def run(raw_data):
    # This function is called for every inference request.
    # raw_data is a JSON string of the form {"data": [[...], ...]},
    # matching the payload sent in the test step below.
    data = np.array(json.loads(raw_data)['data'])
    # Make prediction
    prediction = model.predict(data)
    # You can return any JSON-serializable format here.
    return prediction.tolist()

# conda.yml
name: azureml
channels:
  - defaults
dependencies:
  - python=3.7
  - pip
  - pip:
    - azureml-defaults
    - scikit-learn==0.22.2
    - numpy==1.18.5

2. Create an inference configuration

This configuration tells Azure ML how to find your entry script and dependencies. You can use the Azure ML SDK or CLI for this.

from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment
from azureml.core import Workspace

# Load workspace
ws = Workspace.from_config()

# Define environment
env = Environment.from_conda_specification(name='myenv', file_path='conda.yml')

# Define inference configuration
inference_config = InferenceConfig(entry_script='score.py', environment=env)

3. Create a deployment configuration

Specify the compute target for your web service. For dev/test workloads, Azure Container Instances (ACI) is the simplest option; for production, use an Azure Kubernetes Service (AKS) cluster.

from azureml.core.webservice import AciWebservice, AksWebservice
from azureml.core.compute import ComputeTarget

# Option 1: Deploy to Azure Container Instances (ACI) - for testing.
# auth_enabled=True turns on key-based auth (disabled by default on ACI);
# the test step below relies on it when calling service.get_keys().
deployment_config_aci = AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1, auth_enabled=True)

# Option 2: Deploy to Azure Kubernetes Service (AKS) - for production
# Assuming you have an AKS cluster named 'myakscluster' attached to the workspace
# aks_target = ComputeTarget(workspace=ws, name='myakscluster')
# deployment_config_aks = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
# When deploying to AKS, also pass deployment_target=aks_target to Model.deploy().

4. Deploy the model

Register your model in the workspace (if not already done) and then deploy it using the configurations created earlier. You'll need the name of your registered model.

from azureml.core.model import Model

# Assume 'my-sklearn-model' is the name of your registered model
model_name = 'my-sklearn-model'
model_version = 1 # Or the specific version you want to deploy

# Get the registered model
model = Model(ws, name=model_name, version=model_version)

# Deploy the web service
service_name = 'my-sklearn-service'
service = Model.deploy(
    workspace=ws,
    name=service_name,
    models=[model],
    inference_config=inference_config,
    deployment_config=deployment_config_aci  # For AKS: deployment_config_aks plus deployment_target=aks_target
)

service.wait_for_deployment(show_output=True)

print(f"Service Name: {service.name}")
print(f"Scoring URI: {service.scoring_uri}")
print(f"Swagger URI: {service.swagger_uri}")

5. Test the web service

Once deployed, you can send test requests to the scoring URI to get predictions from your model.

import requests
import json

# Replace with your service's scoring URI
scoring_uri = service.scoring_uri
api_key = service.get_keys()[0]  # Primary key (requires key-based auth to be enabled)

# Example data for prediction
test_data = [[1.0, 2.0, 3.0, 4.0]]
input_data = json.dumps({"data": test_data})

headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {api_key}'
}

response = requests.post(scoring_uri, data=input_data, headers=headers)

if response.status_code == 200:
    print("Prediction:", response.json())
else:
    print("Error:", response.text)

Managing your deployed service

You can manage your deployed services through the Azure ML studio UI or using the Azure CLI/SDK. This includes operations like updating the model, scaling the service, and monitoring its performance.
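
For example, the Webservice object returned by Model.deploy() exposes common management operations directly. A minimal sketch (the workspace and service name are the ones used above):

from azureml.core import Workspace
from azureml.core.webservice import Webservice

ws = Workspace.from_config()

# Retrieve an existing service by name
service = Webservice(workspace=ws, name='my-sklearn-service')

# Fetch container logs - the first place to look when requests fail
print(service.get_logs())

# Roll out a new model or an updated entry script without recreating the service
# service.update(models=[new_model], inference_config=inference_config)

# Delete the service when it is no longer needed
# service.delete()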

Best Practices

  • Use a separate Conda environment for each deployed model to avoid dependency conflicts.
  • Implement robust logging in your entry script for easier debugging (a minimal sketch follows at the end of this section).
  • Consider security aspects, such as API key management and network access controls.
  • For production environments, deploy to Azure Kubernetes Service (AKS) for better scalability and reliability.

Tip: You can register your model with Model.register() before deployment to keep a history of your trained models in your workspace.
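
As an illustration of the logging best practice above, here is one way to instrument the run() function from the score.py example; the logger setup is an assumption rather than a prescribed configuration. Azure ML captures the container's stdout/stderr, so these messages appear in service.get_logs().

# score.py (excerpt) - logging added around the scoring path
import json
import logging

import numpy as np

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def run(raw_data):
    # 'model' is the global loaded in init(), as in the full score.py above.
    try:
        data = np.array(json.loads(raw_data)['data'])
        logger.info("Scoring request received: %d row(s)", len(data))
        prediction = model.predict(data)
        return prediction.tolist()
    except Exception:
        # Log the full traceback; container logs are often the only clue
        # to why a request failed.
        logger.exception("Scoring request failed")
        raise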