Tutorial: Get Started with Azure Data Lake Storage Gen2

This tutorial walks you through the essential steps for getting started with Azure Data Lake Storage Gen2, a scalable cloud data lake built on Azure Blob Storage and designed for big data analytics.

1. Prerequisites

Before you start, ensure you have an Azure subscription and, for the command-line steps, the Azure CLI installed and signed in with az login.

2. Create a Storage Account

You can create a new Azure Storage account with hierarchical namespace enabled using the Azure portal, Azure CLI, or Azure PowerShell.

Using Azure Portal:

  1. Navigate to the Azure portal.
  2. Click Create a resource.
  3. Search for "Storage account" and select it.
  4. Click Create.
  5. Fill in the required details: Subscription, Resource group, Storage account name, Region, Performance tier, and Replication.
  6. Under the Data Lake Storage Gen2 section, select Enable for Hierarchical namespace.
  7. Review and click Create.

Using Azure CLI:

Replace placeholders with your actual values.

az storage account create \
  --name <storage-account-name> \
  --resource-group <resource-group-name> \
  --location <location> \
  --sku Standard_RAGRS \
  --kind StorageV2 \
  --hns true
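The account name is the most common reason this command is rejected: storage account names must be 3 to 24 characters long and contain only lowercase letters and digits. The following is a minimal sketch that checks the name locally before calling the CLI. The function names is_valid_account_name and create_adls_account are hypothetical helpers introduced for illustration and are not part of the Azure CLI.

```shell
# Sketch only: validate the account name locally, then create the account.
# Storage account names must be 3-24 characters, lowercase letters and digits.
is_valid_account_name() {
    printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

# create_adls_account is a hypothetical helper name, not an Azure CLI command.
# Arguments: <account-name> <resource-group> <location>
create_adls_account() {
    if ! is_valid_account_name "$1"; then
        echo "invalid storage account name: $1" >&2
        return 1
    fi
    # --hns true enables the hierarchical namespace (Data Lake Storage Gen2).
    az storage account create \
        --name "$1" \
        --resource-group "$2" \
        --location "$3" \
        --sku Standard_RAGRS \
        --kind StorageV2 \
        --hns true
}
```

After sourcing these functions, a call such as create_adls_account mydatalake01 my-resource-group eastus (all three values are placeholders) would run the same command as above, while a name like My-DataLake would be rejected before any request is sent.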

3. Create a Container (Filesystem)

Within your Data Lake Storage Gen2 account, you'll organize data into containers, also known as filesystems.

Using Azure Portal:

  1. Go to your storage account in the Azure portal.
  2. Under Data Lake Storage Gen2, click Containers.
  3. Click + Container.
  4. Enter a name for your container (e.g., my-datalake) and set the public access level.
  5. Click Create.

Using Azure CLI:

az storage fs create \
  --name my-datalake \
  --account-name <storage-account-name> \
  --auth-mode login
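Container names follow their own rules: 3 to 63 characters, using only lowercase letters, digits, and hyphens, starting and ending with a letter or digit, and with no consecutive hyphens. A rough sketch of a local check, assuming the hypothetical helper names is_valid_container_name and create_container, might look like this:

```shell
# Container (file system) names: 3-63 characters; lowercase letters, digits,
# and hyphens; must start and end with a letter or digit; no "--".
is_valid_container_name() {
    [ "${#1}" -ge 3 ] && [ "${#1}" -le 63 ] &&
        printf '%s' "$1" | LC_ALL=C grep -Eq '^[a-z0-9](-?[a-z0-9])+$'
}

# create_container is a hypothetical helper name, not an Azure CLI command.
# Arguments: <container-name> <storage-account-name>
create_container() {
    if ! is_valid_container_name "$1"; then
        echo "invalid container name: $1" >&2
        return 1
    fi
    az storage fs create \
        --name "$1" \
        --account-name "$2" \
        --auth-mode login
}
```

With --auth-mode login, the command runs under the identity signed in through az login, so that identity needs a data-access role on the account (for example Storage Blob Data Contributor) in addition to permission to manage the resource.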

4. Upload Data

You can upload files and folders to your container using various tools.

Using Azure Storage Explorer:

Download and install Azure Storage Explorer. Connect to your storage account and drag-and-drop files into your container.

Using Azure CLI:

To upload a single file:

az storage fs file upload \
  --source <local-file-path> \
  --path <destination-path> \
  --file-system my-datalake \
  --account-name <storage-account-name> \
  --auth-mode login
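To upload several files, one option is to loop over a local folder and call the single-file command once per file, using the --source, --path, and --file-system parameters of az storage fs file upload. The sketch below assumes two hypothetical helpers, remote_path_for and upload_files_in_dir. It handles only the files directly inside the folder and does not recurse into subfolders.

```shell
# Build the destination path inside the file system from a remote directory
# and a local file path. remote_path_for is a hypothetical helper name.
remote_path_for() {
    dir=${1%/}       # drop one trailing slash from the remote directory
    file=${2##*/}    # keep only the file name of the local path
    if [ -n "$dir" ]; then
        printf '%s/%s\n' "$dir" "$file"
    else
        printf '%s\n' "$file"
    fi
}

# Upload every regular file directly inside a local directory (not recursive).
# Arguments: <local-dir> <remote-dir> <file-system> <storage-account-name>
upload_files_in_dir() {
    for src in "$1"/*; do
        [ -f "$src" ] || continue    # skip subdirectories and unmatched globs
        az storage fs file upload \
            --source "$src" \
            --path "$(remote_path_for "$2" "$src")" \
            --file-system "$3" \
            --account-name "$4" \
            --auth-mode login
    done
}
```

For instance, upload_files_in_dir ./data raw/2024 my-datalake mystorageaccount (the folder, remote directory, and account name are placeholders) would upload ./data/sales.csv to raw/2024/sales.csv in the my-datalake container.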

5. Next Steps

Congratulations! You have set up an Azure Data Lake Storage Gen2 account, created a container, and uploaded data to it. Here are some recommended next steps:

Explore Features