Azure Data Lake Storage

Azure Data Lake Storage is a highly scalable and secure data lake, built on Azure Blob Storage. It is specifically designed for big data analytics workloads.

Data Lake Storage Gen2 is Azure's primary storage offering for big data analytics. It is not a separate service but a set of capabilities built on Azure Blob Storage. It combines the file system semantics and file-level security of Azure Data Lake Storage Gen1 with the low-cost, tiered storage of Azure Blob Storage.

Key Features

  • Massively Scalable: Designed to handle petabytes of data with high throughput.
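  • Hierarchical Namespace: Organizes objects into real directories, so renaming or deleting a directory is a single atomic metadata operation rather than a per-object copy, which suits the access patterns of analytics engines.
  • Security: Integration with Microsoft Entra ID (formerly Azure Active Directory), access control lists (ACLs) on directories and files, and encryption at rest and in transit.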
  • Cost-Effective: Optimized for low-cost storage of large datasets.
  • Integration: Seamlessly integrates with Azure analytics services like Azure Databricks, Azure Synapse Analytics, and HDInsight.
  • Open Formats: Supports various open data formats, making data accessible to diverse tools.

Core Concepts

Understanding these core concepts is crucial for working effectively with Azure Data Lake Storage:

  • Account: A Data Lake Storage Gen2 account is an Azure Storage account with the hierarchical namespace enabled.
  • Hierarchical Namespace: Organizes data into a hierarchy of directories and files, similar to a file system.
  • Directory: A named folder within the hierarchical namespace that groups files and other directories.
  • File: The basic unit of data storage.
  • Access Control Lists (ACLs): POSIX-style read, write, and execute permissions set on individual directories and files for specific users, groups, or service principals.
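
The concepts above map fairly directly onto Azure CLI commands. The following is a minimal sketch, not an official procedure: the function name `setup_lake_directory` and all argument values are hypothetical, and it assumes the caller is already signed in with `az login` and has sufficient data-plane permissions on the account.

```shell
#!/bin/sh
# Sketch: create a file system, a directory inside it, and set an ACL on
# that directory. The function name and every argument are illustrative.
setup_lake_directory() {
    account="$1"      # storage account with hierarchical namespace enabled
    file_system="$2"  # file system (container) to create
    directory="$3"    # directory path inside the file system
    acl="$4"          # POSIX-style ACL, e.g. "user::rwx,group::r-x,other::---"

    # Each step runs only if the previous one succeeded.
    az storage fs create --name "$file_system" \
        --account-name "$account" --auth-mode login &&
    az storage fs directory create --name "$directory" \
        --file-system "$file_system" \
        --account-name "$account" --auth-mode login &&
    az storage fs access set --acl "$acl" --path "$directory" \
        --file-system "$file_system" \
        --account-name "$account" --auth-mode login
}

# Example call (placeholder values):
#   setup_lake_directory mylake raw sales/2024 "user::rwx,group::r-x,other::---"
```

The ACL string in the example call gives the owning user full access, the owning group read and execute access, and everyone else no access.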

Quickstarts

Create a Data Lake Storage Gen2 Account

Learn how to provision a Data Lake Storage Gen2 account using the Azure portal, PowerShell, or CLI.
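
As a rough illustration of the CLI route, the sketch below wraps `az storage account create` in a small helper. The helper name `create_adls_account`, the choice of `Standard_LRS`, and the example values are assumptions made for illustration; the essential part is enabling the hierarchical namespace when the account is created.

```shell
#!/bin/sh
# Sketch: provision a storage account with the hierarchical namespace
# enabled. The function name and SKU choice are illustrative assumptions.
create_adls_account() {
    account="$1"         # globally unique storage account name
    resource_group="$2"  # existing resource group
    location="$3"        # Azure region, e.g. westeurope

    az storage account create \
        --name "$account" \
        --resource-group "$resource_group" \
        --location "$location" \
        --sku Standard_LRS \
        --kind StorageV2 \
        --enable-hierarchical-namespace true
}

# Example call (placeholder values):
#   create_adls_account mylake my-resource-group westeurope
```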

Upload Data to Data Lake Storage Gen2

Explore methods for uploading your big data files to Data Lake Storage Gen2 from various sources.
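
One of the simpler methods is uploading a single local file with the Azure CLI. The sketch below is illustrative only: `upload_to_lake` is a hypothetical helper name, the paths are placeholders, and it assumes an authenticated `az login` session with write access to the target file system.

```shell
#!/bin/sh
# Sketch: upload one local file to a path in a Data Lake Storage Gen2
# file system. The function name and arguments are illustrative.
upload_to_lake() {
    account="$1"      # storage account name
    file_system="$2"  # target file system (container)
    source_file="$3"  # local file to upload
    dest_path="$4"    # destination path inside the file system

    az storage fs file upload \
        --source "$source_file" \
        --path "$dest_path" \
        --file-system "$file_system" \
        --account-name "$account" \
        --auth-mode login
}

# Example call (placeholder values):
#   upload_to_lake mylake raw ./sales.csv sales/2024/sales.csv
```

For large or recurring transfers, other tools are likely a better fit than a per-file CLI call.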

Access Data with Azure Databricks

Discover how to connect Azure Databricks to your Data Lake Storage Gen2 and perform analytics.
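
Spark-based tools such as Azure Databricks typically address Data Lake Storage Gen2 paths through the ABFS driver, using URIs of the form `abfss://<file-system>@<account>.dfs.core.windows.net/<path>`. The helper below is a small sketch that assembles such a URI; the name `abfss_uri` and the example values are hypothetical, and authentication setup on the Databricks side is out of scope here.

```shell
#!/bin/sh
# Sketch: build an abfss:// URI for a path in a Data Lake Storage Gen2
# account. The function name is an illustrative assumption.
abfss_uri() {
    file_system="$1"  # file system (container) name
    account="$2"      # storage account name
    rel_path="${3#/}" # path inside the file system; one leading "/" is dropped

    printf 'abfss://%s@%s.dfs.core.windows.net/%s\n' \
        "$file_system" "$account" "$rel_path"
}

# Example call (placeholder values):
#   abfss_uri raw mylake /sales/2024
#   prints: abfss://raw@mylake.dfs.core.windows.net/sales/2024
```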

Tutorials

API Reference

Access comprehensive documentation for the Data Lake Storage Gen2 REST API, SDKs, and client libraries.

REST API: Azure Data Lake Storage Gen2 REST API Documentation

SDKs: Azure SDKs for Data Lake Storage Gen2


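# Example: List the contents of a directory using the Azure CLI
# Replace the <...> placeholders with your own values.
az storage fs file list --account-name <storage-account-name> --file-system <file-system-name> --path <directory-path> --output table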