This guide walks you through the steps required to deploy a trained machine learning model as a real‑time endpoint in Azure Machine Learning.
Define a conda environment file (for example, conda.yml) that contains the packages required for inference:

name: inference-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - pip
  - pip:
      - azureml-defaults
      - scikit-learn==1.2.0
Create a score.py file that loads the model in init and handles requests in run. For online deployments created with the CLI (v2), the registered model's files are mounted under the path given by the AZUREML_MODEL_DIR environment variable:

import json
import os

import joblib
import numpy as np


def init():
    """Load the model once when the scoring container starts."""
    global model
    # The registered model is mounted under AZUREML_MODEL_DIR.
    model_path = os.path.join(
        os.environ["AZUREML_MODEL_DIR"], "sklearn_regression.pkl"
    )
    model = joblib.load(model_path)


def run(data):
    """Handle a scoring request; data is the raw JSON request body."""
    try:
        input_data = np.array(json.loads(data)["data"])
        result = model.predict(input_data)
        return json.dumps({"result": result.tolist()})
    except Exception as e:
        return json.dumps({"error": str(e)})
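Before deploying, you can exercise the same load-and-predict path locally. The sketch below is an illustration rather than part of the deployment: it trains a throwaway scikit-learn model in place of your registered one, then replays the parse, predict, and serialize steps the scoring script performs (scikit-learn, joblib, and NumPy are assumed to be installed):

```python
import json
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Train and persist a toy model, standing in for sklearn_regression.pkl.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])  # y = 2x + 1
model_dir = tempfile.mkdtemp()
model_path = os.path.join(model_dir, "sklearn_regression.pkl")
joblib.dump(LinearRegression().fit(X, y), model_path)

# Mimic the scoring script: load the model, parse a JSON request, predict.
model = joblib.load(model_path)
request_body = json.dumps({"data": [[4.0]]})
input_data = np.array(json.loads(request_body)["data"])
result = json.dumps({"result": model.predict(input_data).tolist()})
print(result)
```

If this round trip fails locally, the deployed endpoint will fail the same way, so it is a cheap pre-flight check.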
Register the model with the Azure CLI (v2) so the deployment can reference it by name and version:

az ml model create \
  --name my-model \
  --path ./models/sklearn_regression.pkl \
  --workspace-name myworkspace \
  --resource-group myresourcegroup
Use the Azure CLI to create an online endpoint and deployment.
# Create the endpoint
az ml online-endpoint create \
--name my-endpoint \
--workspace-name myworkspace \
--resource-group myresourcegroup
# Deploy the model using a deployment definition file

Save the following as deployment.yml. The environment is defined inline from the conda file above; the base image shown is one of the standard Azure ML inference images, so substitute another if your workspace requires it:

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: my-deployment
endpoint_name: my-endpoint
model: azureml:my-model:1
code_configuration:
  code: ./src
  scoring_script: score.py
environment:
  conda_file: conda.yml
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
instance_type: Standard_DS3_v2
instance_count: 1

Then create the deployment and route all traffic to it:

az ml online-deployment create \
  --file deployment.yml \
  --all-traffic \
  --workspace-name myworkspace \
  --resource-group myresourcegroup
Send a test request using curl or any HTTP client. Managed online endpoints require authentication, so retrieve the scoring URI and key first:

SCORING_URI=$(az ml online-endpoint show --name my-endpoint --query scoring_uri -o tsv)
KEY=$(az ml online-endpoint get-credentials --name my-endpoint --query primaryKey -o tsv)

curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $KEY" \
  -d '{"data": [[5.1, 3.5, 1.4, 0.2]]}' \
  "$SCORING_URI"
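The same request can be sent from Python using only the standard library. The scoring URI and key below are placeholders; substitute the values the CLI returns for your endpoint:

```python
import json
import urllib.request

# Placeholder values: substitute your endpoint's scoring URI and key.
scoring_uri = "https://my-endpoint.eastus2.inference.ml.azure.com/score"
api_key = "<your-endpoint-key>"

# Build the request exactly as the curl example does.
body = json.dumps({"data": [[5.1, 3.5, 1.4, 0.2]]}).encode("utf-8")
request = urllib.request.Request(
    scoring_uri,
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
)

# Uncomment once the endpoint is live and real credentials are in place:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

Keeping the actual network call commented out lets you verify the payload and headers before spending requests against a live endpoint.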
Use az ml online-endpoint show --name my-endpoint to view the endpoint's traffic routing, and az ml online-endpoint delete --name my-endpoint when you are finished to clean up resources and avoid ongoing charges.