Message Retention in Azure Event Hubs
Message retention is a configuration setting in Azure Event Hubs that determines how long events are stored in an Event Hub before they are automatically deleted. It affects storage costs, compliance with data policies, and how much time consumers have to process incoming events.
How Message Retention Works
When you create an Event Hub, you can specify a retention period for the messages it will store. This period is typically expressed in days (some tools and APIs express it in hours). Event Hubs automatically deletes old messages once they exceed the configured retention period. Consumers can read events from any point within the retention window, provided the offset they start from still refers to a retained event.
Configurable Retention Periods
The maximum message retention period you can configure depends on the Event Hubs tier you are using:
- Basic Tier: Up to 1 day.
- Standard Tier: Up to 7 days.
- Premium and Dedicated Tiers: Up to 90 days.
You can adjust the retention period for an existing Event Hub through the Azure portal, Azure CLI, or Azure SDKs.
Factors Influencing Retention Period Choice
When deciding on an appropriate message retention period, consider the following:
- Consumer Processing Speed: Ensure the retention period is long enough for all consumers, including those that might experience temporary downtime or slower processing, to ingest and process their messages.
- Data Compliance and Auditing: Some regulations may require data to be stored for a specific duration.
- Storage Costs: Longer retention periods consume more storage, which incurs higher costs. Balance the need for data availability with budget constraints.
- Event Volume: High-volume event streams may fill up storage faster, making shorter retention periods more economical.
Viewing and Modifying Message Retention
Azure Portal
- Navigate to your Event Hubs namespace in the Azure portal.
- Select the specific Event Hub you want to configure.
- Under "Settings", click on "Configuration".
- You will see the "Message retention (days)" setting where you can adjust the value.
- Click "Apply" to save your changes.
Azure CLI Example
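To set message retention to 3 days (72 hours) for an Event Hub named myEventHub in a namespace named myNamespace. Recent Azure CLI versions take the retention in hours; parameter names have changed between versions, so confirm them with az eventhubs eventhub update --help:
az eventhubs eventhub update --resource-group myResourceGroup --namespace-name myNamespace --name myEventHub --retention-time-in-hours 72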
Calculating Storage Usage
Understanding your storage consumption helps in estimating costs. The total storage used by messages is driven by the average message size, the ingress rate, and the retention period. The number of partitions determines how that data is spread, not how much of it there is.
A rough estimate of storage consumed can be calculated as:
Storage = (Average Message Size in bytes) * (Ingress Rate in messages per second) * (Retention Period in seconds)
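Note that retained data also counts against a storage quota that is tied to the capacity you purchase rather than to individual partitions: 84 GB per throughput unit in the Basic and Standard tiers, 1 TB per processing unit in the Premium tier, and 10 TB per capacity unit in the Dedicated tier. Check the current Event Hubs quota documentation for the exact values and for what happens when the quota is exceeded before the retention period has elapsed.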
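The rough estimate above can be sketched in a few lines of Python. This is an illustration only: it assumes throughput is expressed in events per second, ignores per-event overhead and compression, and the function names and workload figures are hypothetical.

```python
def estimate_retained_bytes(avg_event_bytes, events_per_second, retention_days):
    """Rough steady-state size of retained data, in bytes.

    Assumes a constant ingress rate and ignores per-event overhead.
    """
    retention_seconds = retention_days * 24 * 60 * 60
    return avg_event_bytes * events_per_second * retention_seconds


def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 1024**3


if __name__ == "__main__":
    # Hypothetical workload: 1 KiB events arriving at 1,000 events per second.
    for days in (1, 3, 7):
        size = estimate_retained_bytes(1024, 1000, days)
        print(f"{days} day(s) of retention: about {to_gib(size):.1f} GiB")
```

Running the estimate for several candidate retention periods makes the cost trade-off concrete: retained data grows linearly with the retention period, so doubling retention roughly doubles the storage to budget for.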
Message Expiration
Messages are considered "expired" and eligible for deletion once they have been stored for longer than the configured retention period. Event Hubs continuously cleans up expired messages.
Deletion is not instantaneous. There might be a small delay between a message reaching its retention limit and its actual removal from storage. Consumers should not rely on that delay: a message read very close to the retention deadline may already be gone if the consumer later retries or replays from an earlier offset, so design consumers to process events well within the retention window.
Consumer Logic Consideration:
When consumers read messages, they keep track of their position using an offset. If a consumer stops processing for longer than the retention period, the events after its last recorded offset may already have been deleted. The consumer can then only resume from the oldest event still retained, and the events in between are lost to it.
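A consumer can guard against this by checking, before it resumes, whether its last checkpoint still falls inside the retention window. The sketch below is a minimal illustration using only the Python standard library. It assumes the consumer stores the enqueued time of the last event it processed alongside its offset; the function names and the safety margin parameter are hypothetical and not part of any Event Hubs SDK.

```python
from datetime import datetime, timedelta, timezone


def expires_at(enqueued_time, retention):
    """Time at which an event becomes eligible for deletion."""
    return enqueued_time + retention


def checkpoint_is_readable(last_enqueued_time, retention, now=None,
                           safety_margin=timedelta(0)):
    """Return True if events right after the checkpoint should still be retained.

    safety_margin is a hypothetical buffer so that a consumer does not
    plan to resume from events that are about to expire.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now + safety_margin < expires_at(last_enqueued_time, retention)


if __name__ == "__main__":
    retention = timedelta(days=3)
    now = datetime(2024, 1, 10, tzinfo=timezone.utc)
    recent = datetime(2024, 1, 8, tzinfo=timezone.utc)  # 2 days old
    stale = datetime(2024, 1, 6, tzinfo=timezone.utc)   # 4 days old
    print(checkpoint_is_readable(recent, retention, now=now))
    print(checkpoint_is_readable(stale, retention, now=now))
```

When the check fails, the consumer should treat the gap as data loss: resume from the oldest retained event, and log or alert so that the missing range can be recovered from another source if one exists.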