Deploy Machine Learning Models to Azure Container Instances (ACI)
Introduction
Azure Container Instances (ACI) is one of the quickest and simplest ways to run a container in Azure. It suits machine learning models in development, testing, or low-scale production, where you don't need the full orchestration capabilities of Azure Kubernetes Service (AKS).
This guide walks through deploying a trained machine learning model as a web service to ACI using Azure Machine Learning.
Prerequisites
- An Azure subscription.
- An Azure Machine Learning workspace.
- Azure CLI and the Azure Machine Learning extension installed and configured.
- Python 3.7 or later installed.
- A trained machine learning model saved in a format recognized by Azure Machine Learning (e.g., scikit-learn pickle file, ONNX).
- The Azure Machine Learning Python SDK (the `azureml-core` package used in the code below), plus the Python packages your model and inference script need.
Deployment Steps
Step 1: Prepare the Model and Environment
Ensure your trained model is saved locally or in your Azure ML workspace datastore. You'll also need an inference script (typically a Python file) that defines two functions: `init()`, which loads the model once when the container starts, and `run(raw_data)`, which scores each incoming request.
For example, if you have a scikit-learn model, you might save it as model.pkl.
# This is a placeholder for your saved model file.
Your inference script (e.g., score.py) should look something like this:
import joblib
import os
import json
import numpy as np


def init():
    # This function is called when the container starts
    # Load the model from the path specified by AZUREML_MODEL_DIR
    # In this case, the model is saved in the root directory of the mounted model
    global model
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pkl')
    model = joblib.load(model_path)


def run(raw_data):
    # This function is called for every inference request
    try:
        data = json.loads(raw_data)
        input_data = np.array(data['data'])
        prediction = model.predict(input_data)
        # Assuming prediction is a list of numbers, convert to JSON serializable format
        if isinstance(prediction, np.ndarray):
            prediction = prediction.tolist()
        return json.dumps({"prediction": prediction})
    except Exception as e:
        error = str(e)
        return json.dumps({"error": error})
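The `run()` function above reports any failure as a generic error string. As a rough sketch of stricter input checking, the helper below validates the request body before it reaches the model. The name `parse_payload` and its specific checks are illustrative assumptions, not part of the Azure ML SDK; it only assumes the `{"data": [[...]]}` shape used in this guide.

```python
import json


def parse_payload(raw_data):
    """Parse a request body and return its rows, or raise ValueError.

    Hypothetical helper: assumes the {"data": [[...], ...]} shape from this guide.
    """
    try:
        payload = json.loads(raw_data)
    except json.JSONDecodeError as exc:
        raise ValueError(f"body is not valid JSON: {exc.msg}") from exc

    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('expected a JSON object with a "data" key')

    rows = payload["data"]
    if not isinstance(rows, list) or not rows:
        raise ValueError('"data" must be a non-empty list of rows')

    width = None
    for row in rows:
        if not isinstance(row, list) or not row:
            raise ValueError("each row must be a non-empty list")
        for value in row:
            # bool is a subclass of int, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError("rows may contain only numbers")
        if width is None:
            width = len(row)
        elif len(row) != width:
            raise ValueError("all rows must have the same length")
    return rows
```

Inside `run()`, the returned rows could be passed to `np.array(...)` in place of `data['data']`, so that malformed requests produce a specific message instead of a NumPy or scikit-learn traceback.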
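List the packages the inference script depends on in a requirements.txt file. The `azureml-defaults` package supplies the components needed to host the model as a web service.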
azureml-defaults
scikit-learn
numpy
joblib
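The same package list appears again when the environment is defined in Step 2, so it can help to read it from one place. The sketch below parses simple requirements.txt content into a list of package specifiers. The function name `read_requirements` is hypothetical, and the parser assumes a plain file of one package per line; it does not handle URLs, environment markers, or line continuations.

```python
def read_requirements(text):
    """Return package specifiers from simple requirements.txt content.

    Skips blank lines, comment lines, and pip options such as "-r other.txt".
    """
    packages = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        # Drop an inline comment, which pip recognizes as " #"
        line = line.split(" #", 1)[0].strip()
        packages.append(line)
    return packages


requirements = """\
azureml-defaults
scikit-learn
numpy
joblib
"""
packages = read_requirements(requirements)
print(packages)
```

The resulting list could then be handed to `CondaDependencies.create(pip_packages=packages)` instead of calling `add_pip_package` once per package.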
Step 2: Create an Inference Configuration
An inference configuration tells Azure ML how to run your model. It specifies the entry script and the environment (the dependencies) the script runs in. The model itself is registered separately and passed in at deployment time.
from azureml.core import Environment, Model, Workspace
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.model import InferenceConfig

# Load the workspace from a local config.json
ws = Workspace.from_config()

# Register the model
model = Model.register(workspace=ws,
                       model_path='model.pkl',  # Path to your local model file
                       model_name='my-sklearn-model',
                       tags={'area': 'machine-learning'},
                       properties={'accuracy': 0.95})

# Define dependencies (azureml-defaults is needed to host the web service)
conda_deps = CondaDependencies()
conda_deps.add_pip_package('azureml-defaults')
conda_deps.add_pip_package('joblib')
conda_deps.add_pip_package('scikit-learn')
conda_deps.add_pip_package('numpy')

# Wrap the dependencies in an environment
myenv = Environment(name='my-sklearn-env')
myenv.python.conda_dependencies = conda_deps

# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py",
                                   environment=myenv)
Step 3: Create a Deployment Configuration for ACI
This configuration sets the resources and metadata for the ACI container, such as the number of CPU cores, the memory, a description, and tags.
from azureml.core.webservice import AciWebservice

# Define ACI configuration
aci_config = AciWebservice.deploy_configuration(cpu_cores=1,
                                                memory_gb=1,
                                                description='Deploy my sklearn model to ACI',
                                                tags={'type': 'classification'})
# Define the web service name
service_name = 'my-sklearn-aci-service'
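Service names are restricted, and an invalid name only surfaces as an error once deployment starts. The check below is a minimal sketch based on the rule as the SDK documentation describes it: lowercase letters, digits, and dashes only, starting with a letter, 3 to 32 characters long. Treat the exact rule as an assumption to confirm against the current documentation; `is_valid_service_name` is a hypothetical helper, not an SDK function.

```python
import re

# Assumed rule: starts with a lowercase letter, then lowercase letters,
# digits, or dashes, for a total length of 3 to 32 characters.
_SERVICE_NAME_RE = re.compile(r"[a-z][a-z0-9-]{2,31}")


def is_valid_service_name(name):
    """Return True if name matches the assumed service-name rule."""
    return bool(_SERVICE_NAME_RE.fullmatch(name))


print(is_valid_service_name("my-sklearn-aci-service"))
print(is_valid_service_name("My_Service"))
```

Running the check before calling the deployment step gives faster feedback than waiting for the service request to be rejected.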
Step 4: Deploy to ACI
Use the Model.deploy() method to create the web service on ACI.
# Deploy the model to ACI
service = Model.deploy(workspace=ws,
                       name=service_name,
                       models=[model],  # List of models to deploy
                       inference_config=inference_config,
                       deployment_config=aci_config,
                       overwrite=True)  # Set to True to overwrite if service already exists
# Wait for the deployment to complete
service.wait_for_deployment(show_output=True)
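If the deployment does not end in a healthy state, the container logs are usually the first place to look. The sketch below summarizes a service's state and, when it is not "Healthy", appends the tail of its logs. It relies only on the `name` and `state` attributes and the `get_logs()` method of the service object; the helper name `summarize_deployment` and the `log_lines` parameter are illustrative assumptions.

```python
def summarize_deployment(service, log_lines=50):
    """Return a one-line status, plus the last log lines if not healthy.

    `service` is expected to expose `name`, `state`, and `get_logs()`,
    as the web service object returned by the deployment step does.
    """
    state = service.state
    summary = f"{service.name}: {state}"
    if state == "Healthy":
        return summary

    logs = service.get_logs() or ""
    tail = "\n".join(logs.splitlines()[-log_lines:])
    return f"{summary}\n--- last log lines ---\n{tail}"
```

Calling `print(summarize_deployment(service))` right after the wait step gives a quick view of what went wrong, for example an import error raised while `init()` was loading the model.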
Step 5: Test the Deployment
Once deployed, you can test the service to ensure it's working correctly.
import json

# Get the scoring URI
scoring_uri = service.scoring_uri
print(f"Scoring URI: {scoring_uri}")
# Example data for testing (replace with your actual data format)
test_data = {"data": [[1.0, 2.0, 3.0, 4.0]]}
input_data = json.dumps(test_data)
# Send a request to the service
response = service.run(input_data)
print(f"Prediction: {response}")
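Because the `run()` function in score.py returns a string produced by `json.dumps`, the response may arrive as a JSON string that itself contains JSON, depending on how it is retrieved. The sketch below decodes up to two layers and raises if the service reported an error. Both function names are hypothetical, and the `"prediction"` and `"error"` keys are the ones used by the scoring script in this guide.

```python
import json


def decode_response(response):
    """Decode a response that may be JSON-encoded once or twice."""
    result = response
    for _ in range(2):
        if not isinstance(result, (str, bytes)):
            break
        try:
            result = json.loads(result)
        except ValueError:
            # Not JSON at this layer; return what we have
            break
    return result


def extract_prediction(response):
    """Return the prediction list, or raise if the service reported an error."""
    result = decode_response(response)
    if not isinstance(result, dict):
        raise RuntimeError(f"unexpected response: {result!r}")
    if "error" in result:
        raise RuntimeError(f"scoring failed: {result['error']}")
    return result["prediction"]
```

With this in place, `extract_prediction(response)` yields a plain Python list regardless of whether the response was encoded once or twice.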
Alternatively, you can use tools like curl or Postman to send requests to the scoring_uri.
curl -X POST -H "Content-Type: application/json" -d '{"data": [[1.0, 2.0, 3.0, 4.0]]}' YOUR_SCORING_URI
Replace YOUR_SCORING_URI with the full URI printed after deployment. It already includes the scoring path, so nothing needs to be appended to it.
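The same request can be sent from Python without the SDK, which is closer to how a client application would call the service. The sketch below uses only the standard library. The function names are hypothetical, and the `Authorization: Bearer` header is only needed if key authentication was enabled for the service, which the configuration in this guide does not do.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, payload, key=None):
    """Build a POST request carrying the payload as JSON."""
    headers = {"Content-Type": "application/json"}
    if key:
        # Only relevant when key authentication is enabled on the service
        headers["Authorization"] = f"Bearer {key}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


def score(scoring_uri, payload, key=None, timeout=30):
    """Send the request and return the raw response text."""
    request = build_scoring_request(scoring_uri, payload, key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `score(scoring_uri, {"data": [[1.0, 2.0, 3.0, 4.0]]})` would send the same body as the curl command, using the URI exactly as the service reports it.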
Monitoring and Management
You can monitor the health and performance of your ACI deployment through the Azure portal. Navigate to your Azure Machine Learning workspace, then select "Endpoints" and find your deployed service. From there, you can view logs, metrics, and test the endpoint.
To delete the service:
service.delete()
Conclusion
Deploying models to ACI provides a quick and efficient way to serve your machine learning models as web services. This guide covered the essential steps from preparing your model to testing the deployment. For more advanced scenarios, consider deploying to Azure Kubernetes Service (AKS).