Azure Data Lake Storage
Azure Data Lake Storage is a highly scalable and secure data lake, built on Azure Blob Storage. It is specifically designed for big data analytics workloads.
Data Lake Storage Gen2 is the flagship big data analytics storage solution for Azure. Rather than a separate service, it is a set of capabilities built on Azure Blob Storage: it combines the file system semantics of Azure Data Lake Storage Gen1 with the scale and low cost of Blob Storage, providing an optimized analytics experience.
Key Features
- Massively Scalable: Designed to handle petabytes of data with high throughput.
- Hierarchical Namespace: Organizes objects into directories, so operations such as renaming or deleting a directory act on the directory itself rather than on every object beneath it.
- Security: Robust security features including Microsoft Entra ID (formerly Azure Active Directory) integration, access control lists (ACLs), and encryption at rest and in transit.
- Cost-Effective: Optimized for low-cost storage of large datasets.
- Integration: Seamlessly integrates with Azure analytics services like Azure Databricks, Azure Synapse Analytics, and HDInsight.
- Open Formats: Supports various open data formats, making data accessible to diverse tools.
Core Concepts
Understanding these core concepts is crucial for working effectively with Azure Data Lake Storage:
- Account: A Data Lake Storage Gen2 account is an Azure Storage account with the hierarchical namespace enabled; it is not a separate account type.
- Hierarchical Namespace: Organizes data into a hierarchy of directories and files, similar to a file system.
- Directory: A container for organizing data within the hierarchical namespace.
- File: The basic unit of data storage.
- Access Control Lists (ACLs): POSIX-like read, write, and execute permissions that grant or deny access to directories and files for specific users, groups, or other security principals.
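To make the ACL concept concrete, here is a minimal sketch that sets and then reads a POSIX-style ACL on a directory with the Azure CLI. The account, file system, and directory names are hypothetical placeholders, and the two helper function names are illustrative only. The sketch assumes the Azure CLI is installed, you have signed in with `az login`, and your identity is allowed to change ACLs on the path.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; replace with your own.
ACCOUNT="mydatalakeacct"
FILE_SYSTEM="analytics"
DIRECTORY="raw/sales"

# Give the owning user rwx, the owning group r-x, and everyone else no access.
set_directory_acl() {
  az storage fs access set \
    --acl "user::rwx,group::r-x,other::---" \
    --path "$DIRECTORY" \
    --file-system "$FILE_SYSTEM" \
    --account-name "$ACCOUNT" \
    --auth-mode login
}

# Show the owner, group, permissions, and ACL currently set on the directory.
show_directory_acl() {
  az storage fs access show \
    --path "$DIRECTORY" \
    --file-system "$FILE_SYSTEM" \
    --account-name "$ACCOUNT" \
    --auth-mode login
}
```

Calling `set_directory_acl` and then `show_directory_acl` applies the ACL and prints it back for inspection.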
Quickstarts
Create a Data Lake Storage Gen2 Account
Learn how to provision a Data Lake Storage Gen2 account using the Azure portal, PowerShell, or CLI.
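As a rough sketch of the CLI route, the commands below create a general-purpose v2 storage account with the hierarchical namespace enabled, then create a file system inside it. The account name, resource group, region, and file system name are hypothetical placeholders, and the helper function names are illustrative only. The sketch assumes the Azure CLI is installed, you have signed in with `az login`, and the resource group already exists.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; replace with your own.
ACCOUNT="mydatalakeacct"
RESOURCE_GROUP="my-analytics-rg"
LOCATION="eastus"
FILE_SYSTEM="analytics"

# Create a StorageV2 account with the hierarchical namespace enabled.
# Enabling the hierarchical namespace is what turns on Data Lake Storage Gen2.
create_datalake_account() {
  az storage account create \
    --name "$ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" \
    --location "$LOCATION" \
    --sku Standard_LRS \
    --kind StorageV2 \
    --enable-hierarchical-namespace true
}

# Create a file system (container) in the new account.
create_file_system() {
  az storage fs create \
    --name "$FILE_SYSTEM" \
    --account-name "$ACCOUNT" \
    --auth-mode login
}
```

Calling `create_datalake_account` followed by `create_file_system` runs the two steps in order. `Standard_LRS` is used here only to keep the example small; pick the redundancy option that fits your workload.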
Upload Data to Data Lake Storage Gen2
Explore methods for uploading your big data files to Data Lake Storage Gen2 from various sources.
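One of the simplest upload methods is the Azure CLI. The sketch below wraps `az storage fs file upload` in a small helper that copies one local file to a path in a file system. The account and file system names are hypothetical placeholders, and `upload_file` is an illustrative helper name, not part of the CLI. It assumes you have signed in with `az login` and that your identity has write access to the target path.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; replace with your own.
ACCOUNT="mydatalakeacct"
FILE_SYSTEM="analytics"

# Upload one local file.
#   $1 = path of the local file
#   $2 = destination path inside the file system
upload_file() {
  az storage fs file upload \
    --source "$1" \
    --path "$2" \
    --file-system "$FILE_SYSTEM" \
    --account-name "$ACCOUNT" \
    --auth-mode login
}
```

For example, `upload_file ./sales-2024.csv raw/sales/sales-2024.csv` would place the file under the `raw/sales` directory. For very large or numerous files, a bulk transfer tool is likely a better fit than one call per file.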
Access Data with Azure Databricks
Discover how to connect Azure Databricks to your Data Lake Storage Gen2 and perform analytics.
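Spark-based services such as Azure Databricks typically address Data Lake Storage Gen2 data through an `abfss://` URI of the form `abfss://<file-system>@<account>.dfs.core.windows.net/<path>`. The sketch below only builds that URI; the account and file system names are hypothetical placeholders, `abfss_uri` is an illustrative helper name, and the authentication setup needed on the Databricks side is not shown.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; replace with your own.
ACCOUNT="mydatalakeacct"
FILE_SYSTEM="analytics"

# Build the abfss:// URI for a path inside the file system.
#   $1 = path inside the file system, without a leading slash
abfss_uri() {
  printf 'abfss://%s@%s.dfs.core.windows.net/%s\n' "$FILE_SYSTEM" "$ACCOUNT" "$1"
}

abfss_uri "raw/sales/sales-2024.csv"
# prints: abfss://analytics@mydatalakeacct.dfs.core.windows.net/raw/sales/sales-2024.csv
```

The resulting string is what you would pass as the path when reading or writing data from a notebook.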
Tutorials
- Process data with Azure Databricks
- Build an ETL solution with Azure Data Factory and Data Lake Storage Gen2
- Analyze data with Azure Synapse Analytics
API Reference
Access comprehensive documentation for the Data Lake Storage Gen2 REST API, SDKs, and client libraries.
REST API: Azure Data Lake Storage Gen2 REST API Documentation
SDKs: Azure SDKs for Data Lake Storage Gen2
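# Example: list the contents of a directory with the Azure CLI.
# Replace each <placeholder> with your own value before running.
az storage fs file list --account-name <storage-account-name> --file-system <file-system-name> --path <directory-path> --output table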