Azure Documentation

Automated Machine Learning for Classification

This tutorial demonstrates how to use Azure Machine Learning's Automated Machine Learning (AutoML) feature to build and deploy a classification model without extensive coding.

Introduction to Automated ML

Automated ML simplifies the process of training machine learning models. It automatically explores various algorithms, feature engineering techniques, and hyperparameters to find the best model for your task. For classification, this means automatically selecting algorithms like Logistic Regression, LightGBM, or even neural networks, and tuning them to predict a categorical outcome.

Prerequisites

To follow the steps below you need an Azure subscription, an Azure Machine Learning workspace, a compute cluster in that workspace, and a training dataset registered in the workspace. The SDK example also requires the azure-ai-ml and azure-identity Python packages.

Steps

  1. Prepare Your Data

    Ensure your dataset is clean and ready for training. For classification, you need a target column that contains the categorical labels you want to predict.

    Example data structure:

    CustomerID,Gender,Age,AnnualIncome,SpendingScore,Purchased
    1,Male,19,15,39,0
    2,Female,21,15,81,0
    3,Female,20,20,6,0
    ...
    

    In this example, Purchased is the target column.
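
    Before uploading, a quick local check of the target column can catch missing labels or a single-class dataset early. The following is a minimal sketch using only the Python standard library; the helper name summarize_target and the inline SAMPLE_CSV string are illustrative assumptions, not part of Azure Machine Learning.

```python
import csv
import io
from collections import Counter

# Inline copy of the example rows above (illustrative; a real dataset lives in a file).
SAMPLE_CSV = """CustomerID,Gender,Age,AnnualIncome,SpendingScore,Purchased
1,Male,19,15,39,0
2,Female,21,15,81,0
3,Female,20,20,6,0
"""


def summarize_target(csv_text, target_column):
    """Return (row_count, class_counts) for the target column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if target_column not in (reader.fieldnames or []):
        raise ValueError(f"target column {target_column!r} not found")
    counts = Counter()
    rows = 0
    for row in reader:
        label = (row[target_column] or "").strip()
        if not label:
            raise ValueError(f"row {rows + 1} has an empty target value")
        counts[label] += 1
        rows += 1
    return rows, dict(counts)


rows, class_counts = summarize_target(SAMPLE_CSV, "Purchased")
print(rows, class_counts)
```

    The three sample rows all carry the label 0, so the summary reports a single class. A dataset used for training needs examples of every class you want the model to predict.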

  2. Create an AutoML Classification Job

    You can create an AutoML job using the Azure Machine Learning SDK or the Azure ML studio UI. The SDK provides more programmatic control.

    Here's a Python sketch using the SDK v2 (the azure-ai-ml package):

    from azure.ai.ml import Input, MLClient, automl
    from azure.ai.ml.constants import AssetTypes
    from azure.identity import DefaultAzureCredential

    # Connect to your workspace
    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id="your_subscription_id",
        resource_group_name="your_resource_group",
        workspace_name="your_workspace_name",
    )

    # Look up the training data (assuming it is registered in the workspace
    # as an MLTable data asset, which AutoML jobs expect as input)
    data_asset = ml_client.data.get(name="your-classification-dataset", version="1")
    training_data = Input(type=AssetTypes.MLTABLE, path=data_asset.id)

    # Configure the AutoML classification job
    classification_job = automl.classification(
        compute="your-compute-cluster",  # Name of your compute cluster
        experiment_name="automl-classification-tutorial",
        display_name="automl-classification-run",
        training_data=training_data,
        target_column_name="Purchased",
        primary_metric="accuracy",
        n_cross_validations=5,
    )

    # Stop the sweep early when the primary metric stops improving
    classification_job.set_limits(enable_early_termination=True)

    # Submit the job and stream its logs until it finishes
    returned_job = ml_client.jobs.create_or_update(classification_job)
    ml_client.jobs.stream(returned_job.name)

  3. Monitor the Training Process

    AutoML will automatically try different models and configurations. You can monitor the progress and see the best-performing models in the Azure Machine Learning studio under the "Jobs" section.

    Key metrics to observe include accuracy, precision, recall, F1-score, and AUC.
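
    To make these metrics concrete, the sketch below computes accuracy, precision, recall, and F1-score for a binary problem from true and predicted labels. It is a plain-Python illustration, not the code AutoML runs; the function name binary_metrics and the sample labels are made up for the example. AUC is left out because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = purchased, 0 = did not purchase.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

    With one false positive and one false negative among eight predictions, all four values come out to 0.75. On imbalanced data the metrics diverge, which is why accuracy alone can be misleading.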

  4. Evaluate and Deploy the Best Model

    Once the job completes, AutoML identifies the best model according to the primary metric. You can then register this model and deploy it to an online endpoint (a REST API) for real-time predictions, or to a batch endpoint for batch inferencing.

    Deployment can be done directly from the Azure ML studio or programmatically using the SDK.
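
    Once an endpoint exists, a client calls it over HTTPS with a JSON body. The sketch below only builds such a request with the Python standard library and does not send it. The URL, the key placeholder, and the helper name build_scoring_request are hypothetical, and the "input_data" payload layout is an assumption based on the format commonly used for MLflow model deployments; check the schema your endpoint actually expects.

```python
import json
import urllib.request


def build_scoring_request(scoring_url, api_key, columns, rows):
    """Build (but do not send) a POST request for a scoring endpoint."""
    # Assumed payload layout; confirm it against your deployment's schema.
    payload = {
        "input_data": {
            "columns": columns,
            "index": list(range(len(rows))),
            "data": rows,
        }
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values: replace with your endpoint's scoring URI and key.
request = build_scoring_request(
    "https://example.invalid/score",
    "<your-endpoint-key>",
    ["CustomerID", "Gender", "Age", "AnnualIncome", "SpendingScore"],
    [[1, "Male", 19, 15, 39]],
)
print(request.get_method(), request.full_url)
```

    Passing the request to urllib.request.urlopen would send it. The columns should match the feature columns used in training, without the Purchased target.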

Advanced Options

Learn More

For detailed information and more advanced scenarios, refer to the official Azure Machine Learning documentation.