Introduction to Azure Data Lake Storage

Azure Data Lake Storage is a massively scalable, secure data lake built on Azure Blob Storage. It is designed for big data analytics workloads and provides high-performance, cost-effective storage for large volumes of structured, semi-structured, and unstructured data.

What is Azure Data Lake Storage?

A data lake is a centralized repository that lets you store structured, semi-structured, and unstructured data at any scale. Unlike a data warehouse, which requires data to be structured before ingestion, a data lake stores raw data in its native format and leaves the structure to be applied when the data is read. This flexibility supports advanced analytics, machine learning, and big data processing on the same stored data.
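The difference between structuring data before ingestion and storing it raw can be illustrated with a small, self-contained sketch. It is not tied to any Azure API: the sample records, field names, and the `read_with_schema` helper are all hypothetical, and the in-memory list merely stands in for files kept in a lake in their native format.

```python
import json
from datetime import datetime

# Hypothetical raw events, kept exactly as they arrived (one JSON document per line).
RAW_LINES = [
    '{"device": "sensor-1", "temp_c": "21.5", "ts": "2024-01-01T00:00:00"}',
    '{"device": "sensor-2", "temp_c": "19.0", "ts": "2024-01-01T00:05:00", "note": "recalibrated"}',
    'not valid json',
]


def read_with_schema(lines):
    """Apply a schema at read time and skip records that do not fit it."""
    rows = []
    for line in lines:
        try:
            record = json.loads(line)
            rows.append({
                "device": str(record["device"]),
                "temp_c": float(record["temp_c"]),
                "ts": datetime.fromisoformat(record["ts"]),
            })
        except (json.JSONDecodeError, KeyError, ValueError, TypeError):
            # The raw line is left untouched in storage; only this read ignores it.
            continue
    return rows


rows = read_with_schema(RAW_LINES)
print(len(rows), "of", len(RAW_LINES), "raw lines matched the schema")
```

Because the raw lines are never modified, a different job could later read the same data with a different schema, for example one that also uses the optional `note` field.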

Azure Data Lake Storage (ADLS) provides the foundational storage layer for these big data analytics solutions.

Key Features

Azure Data Lake Storage offers a rich set of features to support modern data analytics:

Note: Azure Data Lake Storage Gen1 is the previous generation. Azure Data Lake Storage Gen2 is the recommended solution for new projects.

Common Use Cases

Azure Data Lake Storage is ideal for a wide range of big data and analytics scenarios:

Azure Data Lake Storage Gen2

Azure Data Lake Storage Gen2 is the current generation and the recommended choice for new big data analytics solutions. It combines the scalability and cost-effectiveness of Azure Blob Storage with the filesystem capabilities of Azure Data Lake Storage Gen1.
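One practical consequence of the filesystem capabilities is that data is addressed by directory-style paths. The sketch below assumes the `abfss://<container>@<account>.dfs.core.windows.net/<path>` URI form used by Hadoop-compatible tools, which this page does not itself describe. The account name `examplelake`, the container `raw`, and both helper functions are hypothetical and only manipulate strings; nothing here contacts Azure.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumption: the public-cloud endpoint suffix; other clouds use different suffixes.
DFS_SUFFIX = "dfs.core.windows.net"


def abfss_uri(account, container, path):
    """Build an abfss:// URI for a path inside a container (hypothetical helper)."""
    clean = PurePosixPath("/") / path.lstrip("/")
    return f"abfss://{container}@{account}.{DFS_SUFFIX}{clean}"


def split_abfss_uri(uri):
    """Split an abfss:// URI into (account, container, path)."""
    parts = urlsplit(uri)
    if parts.scheme != "abfss":
        raise ValueError(f"not an abfss URI: {uri!r}")
    container, _, host = parts.netloc.partition("@")
    account = host.split(".", 1)[0]
    return account, container, parts.path


uri = abfss_uri("examplelake", "raw", "sales/2024/01/orders.csv")
account, container, path = split_abfss_uri(uri)
parent = str(PurePosixPath(path).parent)  # the directory that holds the file

print(uri)
print(account, container, parent)
```

Treating the path as a real hierarchy is what makes directory-level operations such as renaming or securing `/sales/2024` meaningful, in contrast to flat object names that only look like paths.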

For detailed information on implementing and managing Azure Data Lake Storage Gen2, see the Gen2 service documentation and tutorials.