This document provides the API reference for the Azure Data Lake Storage Gen2 .NET SDK. This SDK allows you to interact with your Azure Data Lake Storage Gen2 accounts programmatically using C#.
Azure Data Lake Storage Gen2 is a scalable, secure data lake built on Azure Blob Storage. Its hierarchical namespace organizes data into directories and files, which suits big data analytics workloads.
To use the .NET SDK, you first need to install the relevant NuGet packages:
dotnet add package Azure.Storage.Files.DataLake
dotnet add package Azure.Identity
You can then authenticate using various methods, such as connection strings or Azure Identity credentials.
// Example using DefaultAzureCredential
using System;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
string accountName = "YOUR_STORAGE_ACCOUNT_NAME";
var credential = new DefaultAzureCredential();
var dataLakeUri = new Uri($"https://{accountName}.dfs.core.windows.net");
var dataLakeServiceClient = new DataLakeServiceClient(dataLakeUri, credential);
// Example using Connection String
// string connectionString = "YOUR_CONNECTION_STRING";
// var dataLakeServiceClient = new DataLakeServiceClient(connectionString);
DataLakeServiceClient is the primary client for interacting with the Azure Data Lake Storage Gen2 service. It provides methods for managing file systems (containers).
Namespace: Azure.Storage.Files.DataLake
Key Methods:
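As an illustration of file system management, the following is a minimal sketch, not a definitive implementation. It assumes the Azure.Storage.Files.DataLake and Azure.Identity packages are installed and that the signed-in identity may create and delete file systems. The account name and the file system name "sample-filesystem" are placeholders introduced here for illustration.

```csharp
using System;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

// Placeholder values (assumptions for illustration only)
string accountName = "YOUR_STORAGE_ACCOUNT_NAME";
string fileSystemName = "sample-filesystem";

var serviceClient = new DataLakeServiceClient(
    new Uri($"https://{accountName}.dfs.core.windows.net"),
    new DefaultAzureCredential());

// Create a file system; the response converts implicitly to a file system client
DataLakeFileSystemClient fileSystemClient =
    await serviceClient.CreateFileSystemAsync(fileSystemName);
Console.WriteLine($"Created file system: {fileSystemClient.Name}");

// List the file systems in the account
await foreach (FileSystemItem item in serviceClient.GetFileSystemsAsync())
{
    Console.WriteLine($"File system: {item.Name}");
}

// Delete the file system created above
await serviceClient.DeleteFileSystemAsync(fileSystemName);
Console.WriteLine("File system deleted.");
```

GetFileSystemClient can be used instead of CreateFileSystemAsync when the file system already exists; it only builds a client and makes no service call.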
DataLakeFileClient represents a file within a Data Lake Storage Gen2 file system. It provides methods for file operations such as uploading, downloading, and managing metadata.
Namespace: Azure.Storage.Files.DataLake
Key Properties:
Key Methods:
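The metadata operations mentioned above can be sketched as follows. This is a hedged example that assumes a file named "mydata.csv" already exists in a file system named "myfilesystem"; the metadata keys and values ("source", "stage") are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

// Placeholder account name (assumption for illustration only)
string accountName = "YOUR_STORAGE_ACCOUNT_NAME";

var serviceClient = new DataLakeServiceClient(
    new Uri($"https://{accountName}.dfs.core.windows.net"),
    new DefaultAzureCredential());

DataLakeFileClient fileClient = serviceClient
    .GetFileSystemClient("myfilesystem")
    .GetFileClient("mydata.csv");

// Set user-defined metadata on the file (hypothetical keys and values)
var metadata = new Dictionary<string, string>
{
    ["source"] = "sensor-a",
    ["stage"] = "raw"
};
await fileClient.SetMetadataAsync(metadata);

// Read back system properties and the metadata
PathProperties properties = await fileClient.GetPropertiesAsync();
Console.WriteLine($"Size: {properties.ContentLength} bytes");
Console.WriteLine($"Last modified: {properties.LastModified}");
foreach (KeyValuePair<string, string> pair in properties.Metadata)
{
    Console.WriteLine($"{pair.Key} = {pair.Value}");
}
```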
DataLakeDirectoryClient represents a directory within a Data Lake Storage Gen2 file system. It provides methods for directory operations such as creating, deleting, and listing contents.
Namespace: Azure.Storage.Files.DataLake
Key Properties:
Key Methods:
Performing operations on files involves obtaining a DataLakeFileClient.
// Get a file system client
var fileSystemClient = dataLakeServiceClient.GetFileSystemClient("myfilesystem");
// Get a file client for a file named "mydata.csv"
var fileClient = fileSystemClient.GetFileClient("mydata.csv");
// Upload a file
using (var stream = File.OpenRead("local_data.csv"))
{
    await fileClient.UploadAsync(stream, overwrite: true);
    Console.WriteLine("File uploaded successfully.");
}
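// Download a file
// ReadAsync returns a Response<FileDownloadInfo>; the file data is exposed through its Content stream
var download = await fileClient.ReadAsync();
using (var stream = download.Value.Content)
using (var outputStream = File.Create("downloaded_data.csv"))
{
    await stream.CopyToAsync(outputStream);
    Console.WriteLine("File downloaded successfully.");
}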
// Delete a file
await fileClient.DeleteAsync();
Console.WriteLine("File deleted.");
Managing directories is done through the DataLakeDirectoryClient.
// Get a file system client
var fileSystemClient = dataLakeServiceClient.GetFileSystemClient("myfilesystem");
// Create a directory (DataLakeFileSystemClient exposes CreateDirectoryAsync for this)
await fileSystemClient.CreateDirectoryAsync("data/processed");
Console.WriteLine("Directory 'data/processed' created.");
// Get a directory client for the created subdirectory
var directoryClient = fileSystemClient.GetDirectoryClient("data/processed");
// List contents of the directory
await foreach (var pathItem in directoryClient.GetPathsAsync())
{
    Console.WriteLine($"Path: {pathItem.Name}, Is Directory: {pathItem.IsDirectory}");
}
// Delete the directory. DataLakeDirectoryClient.DeleteAsync issues a recursive delete, so any contents are removed as well
await directoryClient.DeleteAsync();
Console.WriteLine("Directory 'data/processed' deleted.");
The SDK supports managing Access Control Lists (ACLs) for files and directories to control permissions.
You can set and get ACLs using methods on DataLakeFileClient and DataLakeDirectoryClient.
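// Assuming 'fileClient' is an instance of DataLakeFileClient
// Requires: using Azure.Storage.Files.DataLake.Models;
// Example ACL in short form; "alice" is a placeholder for the user's Microsoft Entra ID (Azure AD) object ID
var acl = PathAccessControlExtensions.ParseAccessControlList("user::rwx,group::r-x,other::---,user:alice:rwx");
// Set the ACL (this replaces the file's existing ACL, so the owner, group, and other entries are included)
await fileClient.SetAccessControlListAsync(acl);
Console.WriteLine("ACL updated.");
// Get ACLs
var aclResult = await fileClient.GetAccessControlAsync();
Console.WriteLine("Current ACLs:");
foreach (var entry in aclResult.Value.AccessControlList)
{
    // EntityId is empty for the owning user, owning group, and other entries
    Console.WriteLine($"- {entry.AccessControlType} {entry.EntityId}: {entry.Permissions}");
}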
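Setting an ACL replaces the complete list of entries on a path, so adding one entry without dropping the others takes a read, modify, write sequence. The following is a minimal sketch of that pattern under stated assumptions: the account name is a placeholder, the file "mydata.csv" already exists in "myfilesystem", and the object ID is a made-up placeholder to be replaced with a real Microsoft Entra ID (Azure AD) object ID.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Azure.Identity;
using Azure.Storage.Files.DataLake;
using Azure.Storage.Files.DataLake.Models;

// Placeholder values (assumptions for illustration only)
string accountName = "YOUR_STORAGE_ACCOUNT_NAME";
string objectId = "00000000-0000-0000-0000-000000000000";

var serviceClient = new DataLakeServiceClient(
    new Uri($"https://{accountName}.dfs.core.windows.net"),
    new DefaultAzureCredential());

DataLakeFileClient fileClient = serviceClient
    .GetFileSystemClient("myfilesystem")
    .GetFileClient("mydata.csv");

// Read the current ACL so existing entries are preserved
PathAccessControl current = await fileClient.GetAccessControlAsync();
List<PathAccessControlItem> entries = current.AccessControlList.ToList();

// Drop any existing access entry for the same user to avoid a duplicate
entries.RemoveAll(e =>
    e.AccessControlType == AccessControlType.User &&
    e.EntityId == objectId &&
    !e.DefaultScope);

// Grant the user read and execute permissions
entries.Add(new PathAccessControlItem(
    AccessControlType.User,
    RolePermissions.Read | RolePermissions.Execute,
    entityId: objectId));

// Write the merged ACL back to the file
await fileClient.SetAccessControlListAsync(entries);
Console.WriteLine("ACL entry added.");
```

The same sequence applies to a DataLakeDirectoryClient, since both client types expose GetAccessControlAsync and SetAccessControlListAsync.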
For more comprehensive examples, please refer to the official Azure SDK for .NET samples repository on GitHub.
Azure.Storage.Files.DataLake GitHub Repository
Additional documentation can be found on Microsoft Docs.