Event Hubs Capture

Azure Event Hubs Capture is a fully managed Event Hubs feature that automatically and incrementally captures the output of an Event Hubs stream and writes it into an Azure Blob Storage or Azure Data Lake Storage account of your choice.

How Capture Works

When you enable Event Hubs Capture, it reads events from your event hub and writes them into storage. Capture can be configured to write to either Azure Blob Storage or Azure Data Lake Storage Gen2.

Key Benefits of Using Capture

Configuring Event Hubs Capture

You can enable and configure Event Hubs Capture through the Azure portal, Azure CLI, PowerShell, or ARM templates. The configuration typically involves:

  1. Selecting the Destination: Choose either Azure Blob Storage or Azure Data Lake Storage Gen2.
  2. Providing Storage Account Details: Specify the storage account, container, and optionally a directory path.
  3. Defining the Capture Interval: Configure a time window and a size window (for example, every 5 minutes or after 300 MB of data). Capture writes a file as soon as either threshold is reached.
  4. Choosing the File Format: Select either Avro (default) or Parquet.

Important: When configuring Capture, ensure your Event Hubs namespace has the necessary permissions to write to the chosen storage account.
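
The configuration steps above can be sketched as a small Python helper that assembles a capture description and rejects out-of-range window values. This is a sketch under stated assumptions, not a definitive implementation: the property names follow the captureDescription shape used in ARM templates as commonly documented, the window bounds (1 to 15 minutes, 10 to 500 MB) are the commonly documented limits and should be checked against current documentation, and build_capture_description and its default values are hypothetical names chosen for illustration.

```python
import json

# Assumed Capture window bounds; verify against current Event Hubs limits.
MIN_INTERVAL_SECONDS = 60            # 1 minute
MAX_INTERVAL_SECONDS = 900           # 15 minutes
MIN_SIZE_BYTES = 10 * 1024 * 1024    # 10 MB
MAX_SIZE_BYTES = 500 * 1024 * 1024   # 500 MB


def build_capture_description(storage_account_resource_id, container,
                              interval_seconds=300,
                              size_limit_bytes=300 * 1024 * 1024,
                              name_format=None):
    """Build a captureDescription-style dict (hypothetical helper)."""
    if not MIN_INTERVAL_SECONDS <= interval_seconds <= MAX_INTERVAL_SECONDS:
        raise ValueError("interval_seconds must be between 60 and 900")
    if not MIN_SIZE_BYTES <= size_limit_bytes <= MAX_SIZE_BYTES:
        raise ValueError("size_limit_bytes must be between 10 MB and 500 MB")

    # Steps 1 and 2: destination and storage account details.
    destination_properties = {
        "storageAccountResourceId": storage_account_resource_id,
        "blobContainer": container,
    }
    if name_format is not None:
        # Optional custom naming format for captured files.
        destination_properties["archiveNameFormat"] = name_format

    return {
        "enabled": True,
        "encoding": "Avro",                      # step 4: file format
        "intervalInSeconds": interval_seconds,   # step 3: time window
        "sizeLimitInBytes": size_limit_bytes,    # step 3: size window
        "destination": {
            "name": "EventHubArchive.AzureBlockBlob",
            "properties": destination_properties,
        },
    }


if __name__ == "__main__":
    # Placeholder resource ID; substitute your own values.
    description = build_capture_description(
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.Storage/storageAccounts/<account-name>",
        "capture-container",
    )
    print(json.dumps(description, indent=2))
```

The resulting dictionary could be embedded under the event hub's properties in an ARM template. Validating the window values locally surfaces mistakes before a deployment is attempted.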

Capture File Naming Convention

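Captured files follow a naming convention that helps organize and identify them. By default, the path inside the container takes the form {Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}, built from the following components: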

{
    "storage_account_name": "{StorageAccountName}",
    "container_name": "{ContainerName}",
    "namespace_name": "{NamespaceName}",
    "event_hub_name": "{EventHubName}",
    "partition_id": "{PartitionId}",
    "creation_time_utc": "{YYYY}/{MM}/{DD}/{HH}/{mm}/{ss}"
}

For example, with the default format, a captured file in the container might be named:

yournamespace/youreventhub/0/2023/10/27/15/30/05.avro
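
To make the naming convention concrete, the following Python sketch expands a name format into a blob path and splits a path back into its components. It assumes the commonly documented default layout of namespace, event hub, and partition ID followed by the UTC timestamp parts, with an .avro extension. The helper names build_capture_path and parse_default_capture_path are hypothetical, and a custom name format would need its own parser.

```python
from datetime import datetime, timezone

# Assumed default name format; storage account and container are not part of it.
DEFAULT_FORMAT = (
    "{Namespace}/{EventHub}/{PartitionId}/"
    "{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
)


def build_capture_path(namespace, event_hub, partition_id, window_time,
                       name_format=DEFAULT_FORMAT, extension=".avro"):
    """Expand a name format into a blob path (illustrative helper)."""
    fields = {
        "Namespace": namespace,
        "EventHub": event_hub,
        "PartitionId": str(partition_id),
        "Year": f"{window_time.year:04d}",
        "Month": f"{window_time.month:02d}",
        "Day": f"{window_time.day:02d}",
        "Hour": f"{window_time.hour:02d}",
        "Minute": f"{window_time.minute:02d}",
        "Second": f"{window_time.second:02d}",
    }
    return name_format.format(**fields) + extension


def parse_default_capture_path(path, extension=".avro"):
    """Split a default-format blob path back into its components."""
    if path.endswith(extension):
        path = path[: -len(extension)]
    parts = path.strip("/").split("/")
    if len(parts) != 9:
        raise ValueError("path does not match the assumed default format")
    namespace, event_hub, partition_id = parts[:3]
    year, month, day, hour, minute, second = (int(p) for p in parts[3:])
    return {
        "namespace": namespace,
        "event_hub": event_hub,
        "partition_id": partition_id,
        "creation_time_utc": datetime(year, month, day, hour, minute, second,
                                      tzinfo=timezone.utc),
    }


if __name__ == "__main__":
    when = datetime(2023, 10, 27, 15, 30, 5, tzinfo=timezone.utc)
    path = build_capture_path("yournamespace", "youreventhub", 0, when)
    print(path)
    print(parse_default_capture_path(path))
```

Parsing paths this way can help a downstream job select only the files for a given partition or time range without opening each blob.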

Use Cases for Capture

Limitations and Considerations

By leveraging Event Hubs Capture, you can create a robust and scalable solution for managing and analyzing your streaming event data.