This guide walks you through the steps required to deploy a trained machine learning model as a real‑time endpoint in Azure Machine Learning.
Define a conda environment file (for example, conda.yml) that contains the packages required for inference:

name: inference-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - azureml-defaults
      - scikit-learn==1.2.0
Create a score.py file that loads the model in init and handles requests in run. For online deployments created with the CLI (v2), the registered model's files are mounted under the path given by the AZUREML_MODEL_DIR environment variable:

import json
import os

import joblib
import numpy as np


def init():
    """Load the model once when the scoring container starts."""
    global model
    # The registered model is mounted under AZUREML_MODEL_DIR.
    model_path = os.path.join(
        os.environ["AZUREML_MODEL_DIR"], "sklearn_regression.pkl"
    )
    model = joblib.load(model_path)


def run(data):
    """Handle a scoring request; data is the raw JSON request body."""
    try:
        input_data = np.array(json.loads(data)["data"])
        result = model.predict(input_data)
        return json.dumps({"result": result.tolist()})
    except Exception as e:
        return json.dumps({"error": str(e)})
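Before deploying, you can exercise the same load-and-predict path locally. The sketch below is an illustration rather than part of the deployment: it trains a throwaway scikit-learn model in place of your registered one, then replays the parse, predict, and serialize steps the scoring script performs (scikit-learn, joblib, and NumPy are assumed to be installed):

```python
import json
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Train and persist a toy model, standing in for sklearn_regression.pkl.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # y = 2x + 1
model_dir = tempfile.mkdtemp()
model_path = os.path.join(model_dir, "sklearn_regression.pkl")
joblib.dump(LinearRegression().fit(X, y), model_path)

# Mimic the scoring script: load the model, parse a JSON request, predict.
model = joblib.load(model_path)
request_body = json.dumps({"data": [[4.0]]})
input_data = np.array(json.loads(request_body)["data"])
result = json.dumps({"result": model.predict(input_data).tolist()})
print(result)
```

If this round trip fails locally, the deployed endpoint will fail the same way, so it is a cheap pre-flight check.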
Register the model with the Azure CLI (v2) so the deployment can reference it by name and version:

az ml model create \
  --name my-model \
  --path ./models/sklearn_regression.pkl \
  --workspace-name myworkspace \
  --resource-group myresourcegroup
Use the Azure CLI to create an online endpoint and deployment.
# Create the endpoint
az ml online-endpoint create \
--name my-endpoint \
--workspace-name myworkspace \
--resource-group myresourcegroup
# Deploy the model using a deployment definition file

Save the following as deployment.yml. The environment is defined inline from the conda file above; the base image shown is one of the standard Azure ML inference images, so substitute another if your workspace requires it:

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: my-deployment
endpoint_name: my-endpoint
model: azureml:my-model:1
code_configuration:
  code: ./src
  scoring_script: score.py
environment:
  conda_file: conda.yml
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
instance_type: Standard_DS3_v2
instance_count: 1

Then create the deployment and route all traffic to it:

az ml online-deployment create \
  --file deployment.yml \
  --all-traffic \
  --workspace-name myworkspace \
  --resource-group myresourcegroup
Send a test request using curl or any HTTP client. Managed online endpoints require authentication, so retrieve the scoring URI and key first:

SCORING_URI=$(az ml online-endpoint show --name my-endpoint --query scoring_uri -o tsv)
KEY=$(az ml online-endpoint get-credentials --name my-endpoint --query primaryKey -o tsv)

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{"data": [[5.1, 3.5, 1.4, 0.2]]}' \
  "$SCORING_URI"
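The same request can be sent from Python using only the standard library. The scoring URI and key below are placeholders; substitute the values the CLI returns for your endpoint:

```python
import json
import urllib.request

# Placeholder values: substitute your endpoint's scoring URI and key.
scoring_uri = "https://my-endpoint.eastus2.inference.ml.azure.com/score"
api_key = "<your-endpoint-key>"

# Build the request exactly as the curl example does.
body = json.dumps({"data": [[5.1, 3.5, 1.4, 0.2]]}).encode("utf-8")
request = urllib.request.Request(
    scoring_uri,
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
)

# Uncomment once the endpoint is live and real credentials are in place:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping the actual network call commented out lets you verify the payload and headers before spending requests against a live endpoint.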
Use az ml online-endpoint show --name my-endpoint to view the endpoint's traffic routing, and az ml online-endpoint delete --name my-endpoint when you are finished to clean up resources and avoid ongoing charges.