Microsoft Learn

Azure Machine Learning Endpoints

Endpoints in Azure Machine Learning expose your trained machine learning models so that applications and services can consume them. They provide a REST API for real-time inference and also support batch scoring scenarios. This section covers the types of endpoints, how to create and manage them, and best practices for deployment.

What are Azure ML Endpoints?

An endpoint acts as a managed web service that hosts your machine learning model. When you deploy a model to an endpoint, Azure ML provisions the necessary compute resources and configures a scalable, secure API for accessing your model.
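To make the "REST API" idea concrete, here is a minimal sketch of a client calling an online endpoint that uses key authentication. The scoring URI, the key, and the `{"data": ...}` payload shape are placeholders and assumptions: the actual request body depends entirely on what your inference script expects.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for an online endpoint that uses key authentication."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The endpoint key (or token) is sent as a bearer credential.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


def score(scoring_uri: str, api_key: str, payload: dict, timeout: float = 30.0):
    """Send the request and decode the JSON response returned by the endpoint."""
    request = build_scoring_request(scoring_uri, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call would look like `score(scoring_uri, api_key, {"data": [[1.0, 2.0]]})`, where `scoring_uri` and `api_key` come from your own endpoint.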

Types of Endpoints

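Azure Machine Learning supports two primary types of endpoints: online endpoints, which serve real-time inference requests over REST, and batch endpoints, which score large volumes of data asynchronously.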

Creating and Deploying Endpoints

You can create and deploy endpoints with several tools, including the Azure CLI, the Python SDK, and Azure Machine Learning studio.

Example: Deploying a model to an Online Endpoint (Conceptual)

The process typically involves packaging your model, creating an inference script, defining the environment, and then deploying it to an endpoint.

Key Concepts:

  • Model: Your trained machine learning artifact (e.g., a scikit-learn model, a TensorFlow graph, a PyTorch model).
  • Inference Script: A script (e.g., score.py) that defines how to load your model and make predictions.
  • Environment: Specifies the dependencies (libraries, runtime) required for your model to run.
  • Compute Target: The compute on which your deployment runs. For a managed online endpoint, Azure ML provisions and manages this compute; you choose the VM size (instance_type) and the number of instances (instance_count).
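The inference script described above conventionally exposes two functions: `init()`, called once when the container starts, and `run()`, called for each request. The following is a minimal sketch of such a `score.py`. To stay self-contained it uses a stand-in "model" (linear weights stored in a hypothetical `model.json` file) instead of a real scikit-learn, TensorFlow, or PyTorch artifact, and it assumes requests arrive as `{"data": [[...], ...]}`.

```python
import json
import os

model = None  # populated by init()


def init():
    """Called once at startup: load the model into memory."""
    global model
    # Azure ML sets AZUREML_MODEL_DIR to the folder containing the registered model.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    # "model.json" is a hypothetical file name used for this sketch.
    with open(os.path.join(model_dir, "model.json"), encoding="utf-8") as f:
        model = json.load(f)


def run(raw_data):
    """Called per request: parse the JSON body, predict, return a JSON-serializable result."""
    rows = json.loads(raw_data)["data"]
    weights, bias = model["weights"], model["bias"]
    # Stand-in prediction: a dot product plus bias for each input row.
    predictions = [sum(w * x for w, x in zip(weights, row)) + bias for row in rows]
    return {"predictions": predictions}
```

With a real model, `init()` would deserialize the artifact with the matching library (for example `joblib` or `torch`), and `run()` would call its prediction method.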

Example YAML for Online Endpoint Deployment:


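# endpoint.yml: the endpoint itself (name and authentication mode)
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-online-endpoint
description: A sample online endpoint for model inference
auth_mode: key

# deployment.yml: a deployment that runs behind the endpoint above
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: production
endpoint_name: my-online-endpoint
description: Production deployment
model: azureml:my-model-name:1
code_configuration:
  code: ./src                     # hypothetical folder that contains score.py
  scoring_script: score.py
environment: azureml:my-env:1     # hypothetical registered environment
instance_type: Standard_DS3_v2
instance_count: 1
request_settings:
  max_concurrent_requests_per_instance: 1
liveness_probe:
  initial_delay: 30               # seconds before the first probe
  period: 10                      # seconds between probes
readiness_probe:
  initial_delay: 30
  period: 10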
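The same endpoint and deployment can also be created from Python. The sketch below assumes the `azure-ai-ml` (SDK v2) and `azure-identity` packages are installed and that you are signed in to Azure. The `./src` folder and the `azureml:my-env:1` environment are hypothetical names introduced for illustration; the endpoint, deployment, and model names mirror the YAML above. The imports sit inside the function only so that the snippet can be loaded without the SDK present.

```python
def deploy_online_endpoint(subscription_id: str, resource_group: str, workspace_name: str):
    """Create a managed online endpoint and one deployment (sketch, not production code)."""
    # Imported lazily so this module loads even when the Azure SDK is not installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.entities import (
        CodeConfiguration,
        ManagedOnlineDeployment,
        ManagedOnlineEndpoint,
    )
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )

    # 1. Create the endpoint (name and authentication mode).
    endpoint = ManagedOnlineEndpoint(
        name="my-online-endpoint",
        description="A sample online endpoint for model inference",
        auth_mode="key",
    )
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()

    # 2. Create a deployment behind it (model, script, environment, compute).
    deployment = ManagedOnlineDeployment(
        name="production",
        endpoint_name=endpoint.name,
        model="azureml:my-model-name:1",
        environment="azureml:my-env:1",  # hypothetical registered environment
        code_configuration=CodeConfiguration(code="./src", scoring_script="score.py"),
        instance_type="Standard_DS3_v2",
        instance_count=1,
    )
    ml_client.online_deployments.begin_create_or_update(deployment).result()

    # 3. Route all traffic to the new deployment.
    endpoint.traffic = {"production": 100}
    ml_client.online_endpoints.begin_create_or_update(endpoint).result()
    return endpoint.name
```

Each `begin_*` call starts a long-running operation; calling `.result()` blocks until it finishes, which keeps the three steps in order.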

Managing Endpoints

Once deployed, you can monitor the performance, scale your endpoints up or down, update the deployed model, and manage access keys through the Azure portal or the Azure CLI.
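As an illustration of these management tasks from code rather than the portal or CLI, the sketch below uses the Python SDK v2. It assumes `ml_client` is an already authenticated `azure.ai.ml.MLClient`; the helper function names are hypothetical and exist only for this example.

```python
def scale_deployment(ml_client, endpoint_name: str, deployment_name: str, instance_count: int):
    """Change the number of instances backing a deployment."""
    deployment = ml_client.online_deployments.get(
        name=deployment_name, endpoint_name=endpoint_name
    )
    deployment.instance_count = instance_count
    # Submitting the modified deployment applies the new instance count.
    return ml_client.online_deployments.begin_create_or_update(deployment).result()


def get_primary_key(ml_client, endpoint_name: str) -> str:
    """Fetch the primary access key of a key-authenticated endpoint."""
    return ml_client.online_endpoints.get_keys(name=endpoint_name).primary_key


def rotate_primary_key(ml_client, endpoint_name: str) -> None:
    """Regenerate the primary key, invalidating the old one."""
    ml_client.online_endpoints.begin_regenerate_keys(
        name=endpoint_name, key_type="primary"
    ).result()
```

Updating the deployed model follows the same pattern as scaling: fetch the deployment, change its `model` reference, and submit it again.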

Best Practices

Learn More

Explore the official Azure Machine Learning documentation for in-depth guides, tutorials, and API references related to endpoints.

Deploy models with online endpoints in Azure Machine Learning

Deploy models with batch endpoints in Azure Machine Learning