Deploy Machine Learning Models to Azure Container Instances (ACI)

Introduction

Azure Container Instances (ACI) offers the fastest and simplest way to run a container in Azure. It's ideal for deploying machine learning models for development, testing, or low-scale production workloads where you don't need the full orchestration capabilities of Azure Kubernetes Service (AKS).

This guide walks you through deploying a trained machine learning model as a web service to ACI using the Azure Machine Learning Python SDK (the azureml-core package).

Prerequisites

An Azure subscription.
An Azure Machine Learning workspace.
Azure CLI and the Azure Machine Learning extension installed and configured.
Python 3.7 or later installed.
A trained machine learning model saved in a format recognized by Azure Machine Learning (e.g., scikit-learn pickle file, ONNX).
The azureml-core package (Azure Machine Learning Python SDK v1) and the Python packages required by your model and inference script installed.

Deployment Steps

Step 1: Prepare the Model and Environment

Ensure your trained model is saved locally or in your Azure ML workspace datastore. You'll also need an inference script (typically a Python file) that loads the model and defines a scoring function (e.g., `run(input_data)`).

For example, if you have a scikit-learn model, you might save it as model.pkl.

model.pkl

# This is a placeholder for your saved model file.

Your inference script (e.g., score.py) should look something like this:

score.py

import joblib
import os
import json
import numpy as np

def init():
    # Called once when the container starts.
    # AZUREML_MODEL_DIR points to the directory that contains the registered model files;
    # for a model registered from a single file, model.pkl sits directly inside it.
    global model
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
    model = joblib.load(model_path)

def run(raw_data):
    # This function is called for every inference request
    try:
        data = json.loads(raw_data)
        input_data = np.array(data['data'])
        prediction = model.predict(input_data)

        # predict() typically returns a NumPy array, which is not JSON serializable; convert it to a list
        if isinstance(prediction, np.ndarray):
            prediction = prediction.tolist()

        return json.dumps({"prediction": prediction})
    except Exception as e:
        error = str(e)
        return json.dumps({"error": error})
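
Because run() catches every exception and returns only its message, a malformed request can be hard to diagnose. The sketch below shows one way to validate the request body up front, using only the standard library. The helper name parse_payload is hypothetical (it is not part of Azure Machine Learning), and it assumes the {"data": [[...], ...]} request shape used in this guide.

```python
import json


def parse_payload(raw_data):
    """Parse a JSON request body and return its rows as lists of floats.

    Hypothetical helper: assumes the {"data": [[...], ...]} shape used in this guide.
    Raises ValueError with a descriptive message when the payload is malformed.
    """
    try:
        payload = json.loads(raw_data)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Request body is not valid JSON: {exc}") from exc

    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('Request body must be a JSON object with a "data" key.')

    rows = payload["data"]
    if not isinstance(rows, list) or not rows:
        raise ValueError('"data" must be a non-empty list of rows.')

    width = None
    parsed = []
    for index, row in enumerate(rows):
        if not isinstance(row, list) or not row:
            raise ValueError(f"Row {index} must be a non-empty list of numbers.")
        for value in row:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"Row {index} contains a non-numeric value: {value!r}")
        if width is None:
            width = len(row)
        elif len(row) != width:
            raise ValueError(f"Row {index} has {len(row)} values; expected {width}.")
        parsed.append([float(value) for value in row])
    return parsed
```

Inside run(), np.array(parse_payload(raw_data)) could stand in for the json.loads and np.array lines, so that the error returned to the caller names the offending row.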

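List the dependencies of the inference script in a requirements.txt file. The azureml-defaults package is required because it provides the components that host the model as a web service.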

requirements.txt

azureml-defaults
scikit-learn
numpy
joblib

Step 2: Create an Inference Configuration

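An inference configuration tells Azure ML how to run your model inside the container: which entry script to call and which environment (Python packages) to install. The registered model itself is passed separately when you deploy in Step 4.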

Python SDK Example

from azureml.core import Environment, Model, Workspace
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig

# Load the workspace from a local config.json
ws = Workspace.from_config()

# Register the model
model = Model.register(workspace=ws,
                       model_path='model.pkl', # Path to your local model file
                       model_name='my-sklearn-model',
                       tags={'area': 'machine-learning'},
                       properties={'accuracy': 0.95})

# Define dependencies; azureml-defaults is needed to host the web service
conda_deps = CondaDependencies()
conda_deps.add_pip_package('azureml-defaults')
conda_deps.add_pip_package('joblib')
conda_deps.add_pip_package('scikit-learn')
conda_deps.add_pip_package('numpy')

# Wrap the dependencies in an environment
myenv = Environment(name='my-sklearn-env')
myenv.python.conda_dependencies = conda_deps

# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py",
                                   environment=myenv)

Step 3: Create a Deployment Configuration for ACI

This configuration specifies the compute target (ACI) and deployment settings.

Python SDK Example

from azureml.core.webservice import AciWebservice

# Define ACI configuration
aci_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                description='Deploy my sklearn model to ACI',
                                                tags={'type': 'classification'})

# Define the web service name
service_name = 'my-sklearn-aci-service'
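
Azure Machine Learning rejects service names that break its naming rules, and the error only surfaces once deployment starts. As far as the SDK v1 documentation describes it, a name must be 3 to 32 characters long, start with a letter, and contain only lowercase letters, digits, and dashes. Treat those limits as an assumption and confirm them against the current documentation. A small pre-flight check might look like this (the function name is_valid_service_name is hypothetical):

```python
import re

# Assumed rule: 3-32 characters, starts with a lowercase letter,
# followed only by lowercase letters, digits, or dashes.
_SERVICE_NAME_PATTERN = re.compile(r"[a-z][a-z0-9-]{2,31}")


def is_valid_service_name(name):
    """Return True if name matches the assumed Azure ML web service naming rule."""
    return isinstance(name, str) and _SERVICE_NAME_PATTERN.fullmatch(name) is not None
```

Calling is_valid_service_name(service_name) before Step 4 catches a bad name locally instead of after a failed deployment request.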

Step 4: Deploy to ACI

Use the Model.deploy() method to create the web service on ACI, then wait for the deployment to finish.

Python SDK Example

# Deploy the model to ACI
service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model], # List of models to deploy
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True) # Set to True to overwrite if service already exists

# Wait for the deployment to complete
service.wait_for_deployment(show_output=True)
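
If the deployment fails or the service ends up unhealthy, the container logs are usually the quickest way to find the cause. The sketch below assumes the service object exposes a state attribute and a get_logs() method, as SDK v1 web services do. The helper summarize_deployment and the 20-line cutoff are illustrative choices, not part of the SDK.

```python
def summarize_deployment(service, max_log_lines=20):
    """Return a short status report for a deployed web service.

    Assumes `service` has a `state` attribute and a `get_logs()` method,
    as the SDK v1 Webservice classes do.
    """
    if service.state == "Healthy":
        return f"Service is healthy (state={service.state})."
    # The container logs usually show why init() or the container start-up failed.
    logs = service.get_logs() or ""
    tail = "\n".join(logs.splitlines()[-max_log_lines:])
    return f"Service is not healthy (state={service.state}). Last log lines:\n{tail}"
```

A typical use is print(summarize_deployment(service)) right after wait_for_deployment(), or inside an except block if that call raises.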

Step 5: Test the Deployment

Once deployed, you can test the service to ensure it's working correctly.

Python SDK Example

import json

# Get the scoring URI
scoring_uri = service.scoring_uri
print(f"Scoring URI: {scoring_uri}")

# Example data for testing (replace with your actual data format)
test_data = {"data": [[1.0, 2.0, 3.0, 4.0]]}
input_data = json.dumps(test_data)

# Send a request to the service
response = service.run(input_data)
print(f"Prediction: {response}")

Alternatively, you can use tools like curl or Postman to send requests to the scoring_uri.

curl Example

curl -X POST -H "Content-Type: application/json" -d '{"data": [[1.0, 2.0, 3.0, 4.0]]}' YOUR_SCORING_URI

Replace YOUR_SCORING_URI with the full value of service.scoring_uri obtained after deployment. That value already includes the scheme and the /score path, so nothing needs to be appended to it.
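
The same request can be sent from Python without the SDK. The following is a minimal sketch using only the standard library; the function names build_scoring_request and score are hypothetical. The Authorization header is only relevant if key-based authentication has been enabled on the service, which this guide does not do.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, rows, key=None):
    """Build a POST request for the scoring endpoint without sending it."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if key:
        # Only needed when key-based authentication is enabled on the service.
        headers["Authorization"] = f"Bearer {key}"
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


def score(scoring_uri, rows, key=None, timeout=30):
    """Send the request and return the raw response body as text."""
    request = build_scoring_request(scoring_uri, rows, key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, score(service.scoring_uri, [[1.0, 2.0, 3.0, 4.0]]) sends the same payload as the curl command. Keeping request construction separate from sending makes the payload and headers easy to inspect without a live endpoint.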

Monitoring and Management

You can monitor the health and performance of your ACI deployment through the Azure portal. Navigate to your Azure Machine Learning workspace, then select "Endpoints" and find your deployed service. From there, you can view logs, metrics, and test the endpoint.

To delete the service:

Python SDK Example

service.delete()

Conclusion

Deploying models to ACI provides a quick and efficient way to serve your machine learning models as web services. This guide covered the essential steps from preparing your model to testing the deployment. For more advanced scenarios, consider deploying to Azure Kubernetes Service (AKS).