Working with Datasets in Azure AI ML

Datasets are fundamental to machine learning workflows. Azure AI ML provides tools to register, version, and access your data from within a workspace.

Core Concepts

Dataset Registration

Learn how to register your local or cloud data sources as datasets in your Azure AI ML workspace for easy access and versioning.

Data Access and Stacks

Understand how to access registered datasets in your training scripts and explore the concept of data stacks for complex data management.

Dataset Versioning

Discover how to create and manage different versions of your datasets, ensuring reproducibility and traceability in your ML experiments.
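The relationship between registration and versioning can be illustrated with a toy, in-memory model. This is only a conceptual sketch and not the Azure AI ML API: the `DatasetRegistry` class, its methods, and the file paths are hypothetical names invented for illustration. It assumes that registering the same name again creates a new version and that omitting a version returns the latest one.

```python
from typing import Dict, List, Optional


class DatasetRegistry:
    """Toy registry (hypothetical): each name maps to an append-only list of versions."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def register(self, name: str, path: str) -> int:
        """Record a new version of `name` and return its 1-based version number."""
        versions = self._versions.setdefault(name, [])
        versions.append(path)
        return len(versions)

    def get(self, name: str, version: Optional[int] = None) -> str:
        """Return the path for a specific version, or the latest if none is given."""
        versions = self._versions[name]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


registry = DatasetRegistry()
# Illustrative paths only; they do not refer to real files.
first = registry.register("sales", "data/sales_2023.csv")
second = registry.register("sales", "data/sales_2024.csv")

pinned = registry.get("sales", version=1)  # reproducible: always the same data
latest = registry.get("sales")             # moves as new versions are registered
```

Pinning an experiment to an explicit version, as `pinned` does here, is what makes a run reproducible; relying on the latest version trades that guarantee for convenience.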

Common Scenarios

Tabular Data

Examples and best practices for working with structured data, such as CSV, Parquet, and TSV files.
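As a small illustration of how CSV and TSV differ only in their delimiter, the sketch below parses both with Python's standard `csv` module. The helper name `read_delimited` and the sample rows are made up for this example. Parquet is a binary columnar format and needs a third-party reader, so it is left out here.

```python
import csv
import io


def read_delimited(text, delimiter=","):
    """Parse delimited text with a header row into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


# Made-up sample data: the same two rows in CSV and TSV form.
csv_text = "id,label\n1,cat\n2,dog\n"
tsv_text = "id\tlabel\n1\tcat\n2\tdog\n"

rows_csv = read_delimited(csv_text)
rows_tsv = read_delimited(tsv_text, delimiter="\t")
```

Both calls yield the same rows. Note that every value comes back as a string, so numeric columns need explicit conversion before training.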

Image Data

Guidelines for managing and utilizing image datasets for computer vision tasks.

Text Data

Strategies for handling text-based datasets for natural language processing (NLP) applications.

Video Data

Approaches to managing and processing video data for advanced AI models.

Tutorials and Guides

End-to-End Dataset Management Tutorial

A comprehensive guide to registering, versioning, and accessing datasets in Azure AI ML.

Using the Azure AI ML SDK for Datasets

Manage datasets programmatically using the Python SDK.
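This page does not say which SDK version it refers to, so the following is a hedged sketch that assumes the `azure-ai-ml` (v2) and `azure-identity` packages, where registered data is modeled as a `Data` asset. The function names `register_csv` and `asset_reference`, and all of their arguments, are placeholders chosen for this example. Running `register_csv` requires a real subscription, resource group, workspace, and valid credentials.

```python
def asset_reference(name: str, version: str) -> str:
    """Build the short-form reference to a registered data asset (assumed v2 convention)."""
    return f"azureml:{name}:{version}"


def register_csv(subscription_id, resource_group, workspace, name, version, path):
    """Register a single file as a versioned data asset; all arguments are placeholders."""
    # Imported lazily so this module still loads where the SDK is not installed.
    from azure.ai.ml import MLClient
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Data
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        credential=DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace,
    )
    asset = Data(
        name=name,
        version=version,
        path=path,  # a local file path or a cloud storage URI
        type=AssetTypes.URI_FILE,
        description="Example asset registered from a sketch",
    )
    # Creates the asset, or a new version of it if the name already exists.
    return ml_client.data.create_or_update(asset)
```

A registered asset can then be fetched by name and version with `ml_client.data.get(name=..., version=...)`, or passed to a job using the string returned by `asset_reference`.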