# Deploy a model to an Azure Machine Learning endpoint

This quickstart walks you through deploying a machine learning model to a real-time endpoint with Azure Machine Learning, so you can serve predictions on new data as requests arrive.
## Prerequisites

- An Azure subscription. If you don't have one, create a free account.
- An Azure Machine Learning workspace.
- A trained machine learning model registered in your workspace.
- The Azure CLI with the Azure Machine Learning extension (`az extension add --name ml`) installed.
## Step 1: Create an inference script
An inference script (`score.py`) defines how to load your model and make predictions. It must contain two functions:

- `init()`: called once when the service is loaded; used for loading the model.
- `run(raw_data)`: called for each inference request.
```python
import json

import joblib
import numpy as np
from azureml.core.model import Model


def init():
    global model
    # 'my-sklearn-model' is the name registered in the Azure ML workspace
    model_path = Model.get_model_path('my-sklearn-model')
    model = joblib.load(model_path)


def run(raw_data):
    try:
        data = np.array(json.loads(raw_data)['data'])
        result = model.predict(data)
        return json.dumps({'result': result.tolist()})
    except Exception as e:
        return json.dumps({'error': str(e)})
```
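Before deploying, it can help to exercise the JSON request/response contract of `run()` locally. The sketch below uses a stub in place of the real model (`DummyModel` is a hypothetical stand-in, not part of Azure ML), so it needs no workspace connection:

```python
import json


class DummyModel:
    """Hypothetical stand-in for the registered model; predicts row sums."""

    def predict(self, rows):
        return [sum(row) for row in rows]


model = DummyModel()


def run(raw_data):
    # Mirrors the scoring contract used by score.py: JSON in, JSON out.
    try:
        rows = json.loads(raw_data)["data"]
        result = model.predict(rows)
        return json.dumps({"result": result})
    except Exception as e:
        return json.dumps({"error": str(e)})


print(run('{"data": [[1, 2, 3, 4]]}'))  # → {"result": [10]}
print(run("not json"))                  # → {"error": ...}
```

Swapping `DummyModel` for a model loaded with `joblib` gives you a quick local smoke test of the same code path the endpoint will run.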
## Step 2: Define the environment
Specify the dependencies your inference script needs using a Conda environment file (`conda.yaml`).
```yaml
name: azureml_deploy
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - azureml-defaults
      - scikit-learn
      - joblib
      - numpy
```
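As an optional sanity check, you can build this environment locally before deploying, which surfaces dependency conflicts early. This assumes `conda` is installed on your machine; the environment name is arbitrary:

```shell
# Build the environment from the same file the deployment will use.
conda env create --file conda.yaml --name deploy-env-check

# Verify the inference script's imports resolve inside it.
conda run --name deploy-env-check python -c "import sklearn, joblib, numpy"
```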
## Step 3: Create the deployment configuration
Define the compute resources for your endpoint. For real-time inference, you typically use managed online endpoints.

With the Azure CLI (v2 `ml` extension), you first create the endpoint, then describe the deployment — model, entry script, environment, and compute — in a YAML file (`deployment.yaml`):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: my-model-endpoint
model: azureml:my-sklearn-model:1
code_configuration:
  code: .
  scoring_script: score.py
environment:
  conda_file: conda.yaml
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
instance_type: Standard_DS2_v2
instance_count: 1
```

Create the endpoint, then the deployment:

```shell
az ml online-endpoint create --name my-model-endpoint \
  --resource-group my-resource-group \
  --workspace-name my-workspace

az ml online-deployment create --file deployment.yaml \
  --endpoint-name my-model-endpoint \
  --resource-group my-resource-group \
  --workspace-name my-workspace \
  --all-traffic
```

These commands create a managed online endpoint and deploy your model to it. Replace `my-sklearn-model:1` with the name and version of your registered model, and adjust the resource group and workspace names accordingly.
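Provisioning can take several minutes. While it completes, you can watch the endpoint state and pull container logs to diagnose startup failures. The deployment name `blue` below is an assumption; substitute whatever name your deployment uses:

```shell
# Poll until this reports Succeeded (or Failed).
az ml online-endpoint show --name my-model-endpoint \
  --query provisioning_state -o tsv

# Container logs, useful when score.py fails during init().
az ml online-deployment get-logs --name blue \
  --endpoint-name my-model-endpoint
```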
## Step 4: Test the endpoint
Once the deployment is complete, you can test the endpoint by sending a POST request with sample data.
First, retrieve the scoring URI and key:
```shell
az ml online-endpoint show --name my-model-endpoint --query scoring_uri -o tsv
az ml online-endpoint get-credentials --name my-model-endpoint --query primaryKey -o tsv
```
Then, use a tool like `curl` to send a request:
```shell
SCORING_URI=$(az ml online-endpoint show --name my-model-endpoint --query scoring_uri -o tsv)
PRIMARY_KEY=$(az ml online-endpoint get-credentials --name my-model-endpoint --query primaryKey -o tsv)

# The scoring URI for a managed online endpoint already includes the /score path.
curl -X POST \
  -H "Authorization: Bearer $PRIMARY_KEY" \
  -H "Content-Type: application/json" \
  -d '{"data": [[1, 2, 3, 4]]}' \
  "$SCORING_URI"
```
The response will contain the predictions from your model.
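If you prefer Python over `curl`, the same request can be assembled with the standard library alone. The scoring URI and key below are placeholders; substitute the values retrieved above:

```python
import json
import urllib.request

# Placeholder values -- substitute your endpoint's scoring URI and primary key.
scoring_uri = "https://my-model-endpoint.example.inference.ml.azure.com/score"
primary_key = "<primary-key>"

body = json.dumps({"data": [[1, 2, 3, 4]]}).encode("utf-8")
request = urllib.request.Request(
    scoring_uri,
    data=body,
    headers={
        "Authorization": f"Bearer {primary_key}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Uncomment to send the request against a live endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```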
## Next Steps
- Learn more about managed online endpoints.
- Explore options for batch scoring.
- Discover how to monitor your deployed models.