Scalable real-time data streaming
Azure Event Hubs Capture is a built-in feature that automatically and incrementally batches the output of Event Hubs into the storage solution of your choice. It's designed for scenarios where you need to reliably capture streaming data for long-term archival, batch analytics, or immediate reprocessing.
When Capture is enabled, Event Hubs writes windowed batches of the streamed data to a storage account, triggered by whichever configured time or size threshold is reached first. This process is managed by Azure and requires minimal configuration on your part. The data is written in Apache Avro format, a compact binary format that is widely supported by big data analytics platforms.
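Each record in a Capture Avro file carries the original event payload as raw bytes in a `Body` field, alongside metadata such as `SequenceNumber`, `Offset`, and `EnqueuedTimeUtc`. As a minimal sketch, assuming your producers publish UTF-8 JSON payloads (an assumption about your events, not something Capture enforces), a record could be deserialized like this:

```python
import json

# Sketch: decode one record shaped like those read from a Capture Avro file
# (e.g. via an Avro reader library). Assumes the event Body is UTF-8 JSON.
def decode_capture_record(record: dict) -> dict:
    """Return the deserialized event payload plus useful metadata."""
    return {
        "payload": json.loads(record["Body"].decode("utf-8")),
        "sequence_number": record["SequenceNumber"],
        "enqueued_time": record["EnqueuedTimeUtc"],
    }

# Example record with illustrative values:
record = {
    "Body": b'{"sensor": "t1", "reading": 21.5}',
    "SequenceNumber": 42,
    "EnqueuedTimeUtc": "2024-01-01T00:00:00.000Z",
    "Offset": "1024",
    "Properties": {},
    "SystemProperties": {},
}
event = decode_capture_record(record)
```

Because the Avro container embeds its schema, any standard Avro reader can iterate these records without extra configuration.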
You can enable Capture through the Azure portal, Azure CLI, or Azure Resource Manager (ARM) templates.
az eventhubs eventhub update --resource-group <your-resource-group> \
--namespace-name <your-namespace> \
--name <your-eventhub-name> \
--enable-capture true \
--capture-interval 60 \
--capture-size-limit 104857600 \
--destination-name EventHubArchive.AzureBlockBlob \
--blob-container <your-container-name> \
--storage-account <your-storage-account-name>
Capture interval: the maximum time, in seconds, between captures. Data is written at least this often.
Capture size: the maximum amount of data, in bytes, accumulated before it is written to storage (104857600 bytes is 100 MB). Whichever of the two thresholds is reached first triggers a write.
Capture destination: either Azure Blob Storage or Azure Data Lake Storage Gen2.
Storage account and container: the specific Azure Storage account and blob container where captured data is stored.
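The interval and size settings interact as an either/or rule: a capture window closes as soon as the time interval elapses or the accumulated bytes reach the size limit, whichever comes first. A hypothetical sketch of that trigger logic (the class and its names are illustrative, not part of any Azure SDK):

```python
import time

# Illustrative model of the capture trigger rule: flush when EITHER the
# time interval elapses OR the accumulated bytes reach the size limit.
class CaptureWindow:
    def __init__(self, interval_seconds: int, size_limit_bytes: int):
        self.interval = interval_seconds
        self.size_limit = size_limit_bytes
        self.start = time.monotonic()
        self.accumulated = 0

    def add(self, event_bytes: int) -> bool:
        """Record an event's size; return True when the window should flush."""
        self.accumulated += event_bytes
        elapsed = time.monotonic() - self.start
        return self.accumulated >= self.size_limit or elapsed >= self.interval

window = CaptureWindow(interval_seconds=60, size_limit_bytes=104857600)
# A single small event does not trigger a flush...
small = window.add(1024)
# ...but crossing the size limit does, even before 60 seconds pass.
large = window.add(104857600)
```

Note that Capture writes a file at every interval boundary even when no events arrived, so downstream batch jobs can rely on a steady cadence of files.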
Event Hubs Capture generates files with a structured naming convention, including namespace, event hub name, partition ID, and timestamps, making it easy to organize and query.
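The default naming convention places each file under a path of the form `{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}`. A small sketch of building such a path (the function name and example values are hypothetical; check the archive name format configured on your event hub):

```python
from datetime import datetime, timezone

# Sketch: build a blob path following the default Capture naming convention,
# with date/time components zero-padded to two digits.
def capture_blob_path(namespace: str, event_hub: str, partition: int,
                      ts: datetime) -> str:
    return (f"{namespace}/{event_hub}/{partition}/"
            f"{ts.year}/{ts.month:02}/{ts.day:02}/"
            f"{ts.hour:02}/{ts.minute:02}/{ts.second:02}")

path = capture_blob_path(
    "my-namespace", "my-hub", 0,
    datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
```

This date-partitioned layout lets batch engines prune by prefix, so a query over one day touches only that day's folders.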
For more in-depth information and advanced configurations, please refer to the official Azure Event Hubs documentation: