Azure Event Hubs Concepts

Partitioning in Azure Event Hubs

Partitioning is a fundamental concept in Azure Event Hubs that enables high throughput and scalability for event ingestion and processing. An Event Hub is divided into one or more partitions. These partitions are the primary unit of parallelism in Event Hubs.

How Partitioning Works

Each partition is an ordered, immutable sequence of events. New events are appended to the end of a partition and are read back in the same order. Events that share a partition key are always written to the same partition, so they are stored and delivered in the order they arrived. Ordering is guaranteed only within a partition, not across partitions, which is why the key matters in event-driven scenarios where the sequence of related events is significant.
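The key-to-partition behavior described above can be sketched in a few lines. This is an illustration only: the names `partition_for` and `route` are hypothetical, and SHA-256 is an assumption chosen for the sketch, not the hash function Event Hubs uses internally.

```python
import hashlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a partition key to a partition index with a stable hash.

    Assumption: SHA-256 stands in for the service's internal hash.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def route(events, partition_count):
    """Append (key, payload) events to per-partition lists in arrival order."""
    partitions = [[] for _ in range(partition_count)]
    for key, payload in events:
        partitions[partition_for(key, partition_count)].append((key, payload))
    return partitions


events = [("device-a", 1), ("device-b", 1), ("device-a", 2), ("device-a", 3)]
partitions = route(events, partition_count=4)
# All "device-a" events land in one partition, in the order 1, 2, 3.
```

Because the mapping depends only on the key and the partition count, the same key always reaches the same partition, which is what preserves per-key ordering.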

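When you create an Event Hub, you specify the number of partitions. In the Basic and Standard tiers, this number cannot be changed after creation; in the Premium and Dedicated tiers, it can be increased later but not decreased. The maximum number of partitions per Event Hub is determined by the Event Hubs tier: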

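  • Basic and Standard tiers: Up to 32 partitions.
  • Premium tier: Up to 100 partitions per Event Hub (subject to a namespace-level limit that scales with the number of processing units).
  • Dedicated tier: Up to 1024 partitions per Event Hub (subject to a cluster-level limit that scales with the number of capacity units).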

Benefits of Partitioning

  • Scalability: By increasing the number of partitions, you can increase the overall throughput of your Event Hub. Multiple partitions can handle incoming events concurrently.
  • Parallel Processing: Consumers can read from multiple partitions in parallel, allowing for faster processing of event streams.
  • Ordered Delivery: Events within a single partition are delivered in the order they were received. Using a partition key keeps related events in the same partition, and therefore in order.
  • Load Balancing: When no partition key is specified, Event Hubs distributes events across partitions automatically, which balances incoming traffic.

Choosing a Partition Key

The partition key is a string value that determines which partition an event is sent to: Event Hubs hashes the key and maps the result to a partition. If no partition key is provided when publishing an event, Event Hubs assigns it to a partition in a round-robin fashion. To keep logically related events in order, publish them with the same partition key.

Good candidates for partition keys include:

  • A user ID
  • A device ID
  • A session ID
  • Any other identifier that represents a logical grouping of events.
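
As a rough sketch of publishing with a device ID as the partition key, the snippet below groups readings by device and sends one batch per key. The helper names `group_by_key` and `send_grouped` are hypothetical. The `producer` argument is assumed to behave like `EventHubProducerClient` from the `azure-eventhub` Python package (`create_batch(partition_key=...)`, `send_batch`), and `make_event` like its `EventData` class; both are passed in so the sketch does not require a live namespace.

```python
from collections import defaultdict


def group_by_key(readings):
    """Group (device_id, body) pairs by device_id, preserving arrival order."""
    groups = defaultdict(list)
    for device_id, body in readings:
        groups[device_id].append(body)
    return dict(groups)


def send_grouped(producer, groups, make_event):
    """Send one batch per partition key.

    Assumptions: `producer` exposes create_batch(partition_key=...) and
    send_batch(batch); `make_event` wraps a body in an event object.
    """
    for key, bodies in groups.items():
        batch = producer.create_batch(partition_key=key)
        for body in bodies:
            # In the SDK, add() raises ValueError when the batch is full;
            # this sketch does not handle that case.
            batch.add(make_event(body))
        producer.send_batch(batch)
```

With the SDK installed, `producer` could be created with `EventHubProducerClient.from_connection_string(conn_str, eventhub_name=hub_name)` and `make_event` set to `EventData`, where `conn_str` and `hub_name` are placeholders for your own values.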

Illustrative diagram of Event Hubs partitioning.

Partitioning and Consumer Groups

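Each consumer group is an independent view of the Event Hub and reads from all partitions. Within a consumer group, it is recommended that only one consumer instance (or event processor) reads from a specific partition at any given time; event processor clients arrange this by giving each partition a single owner. This prevents competing readers in the group from duplicating work, but it does not make processing exactly-once. Delivery is at-least-once: after a restart or an ownership change, events received since the last checkpoint can be delivered again, so event handlers should be idempotent.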
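
The one-reader-per-partition arrangement within a consumer group can be pictured with a simplified static assignment. The function `assign_partitions` and the worker names are hypothetical; SDK event processors typically coordinate ownership dynamically through a shared checkpoint store rather than with a fixed mapping like this one.

```python
def assign_partitions(partition_ids, consumers):
    """Give every partition exactly one owner, spreading partitions evenly."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    ownership = {consumer: [] for consumer in consumers}
    for i, pid in enumerate(sorted(partition_ids)):
        ownership[consumers[i % len(consumers)]].append(pid)
    return ownership


ownership = assign_partitions(["0", "1", "2", "3"], ["worker-a", "worker-b"])
# worker-a owns partitions "0" and "2"; worker-b owns partitions "1" and "3"
```

Running more consumers than partitions in one group leaves the extra consumers idle, which is one reason the partition count bounds the useful degree of parallelism.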

Considerations

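  • Partition Count Changes: In the Basic and Standard tiers, the number of partitions cannot be changed once an Event Hub is created. In the Premium and Dedicated tiers it can be increased but not decreased, and adding partitions can change which partition a given key maps to. In either case, plan the partition count for your expected peak load during the initial creation.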
  • Partition Key Hotspots: If your partition key distributes events unevenly, one partition might become a bottleneck. Carefully choose your partition key to ensure balanced distribution.
  • Maximum Throughput per Partition: While the total throughput of an Event Hub scales with partitions, each partition has its own ingest and egress limits.
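
One way to check a candidate partition key for hotspots before adopting it is to replay a sample of keys through a stable hash and compare the busiest partition with an even share. This is a sketch under stated assumptions: `partition_for` and `skew` are hypothetical names, and SHA-256 stands in for the service's internal hash, so the result indicates how evenly the keys spread rather than the exact placement Event Hubs would choose.

```python
import hashlib
from collections import Counter


def partition_for(key, partition_count):
    """Stable key-to-partition mapping (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def skew(keys, partition_count):
    """Return the busiest partition's load divided by the ideal even share.

    1.0 means perfectly even; partition_count means everything hit one partition.
    """
    keys = list(keys)
    if not keys:
        raise ValueError("at least one key is required")
    counts = Counter(partition_for(k, partition_count) for k in keys)
    ideal = len(keys) / partition_count
    return max(counts.values()) / ideal


# A single dominant key sends every event to one partition.
hot = skew(["tenant-1"] * 100, partition_count=4)
# Many distinct keys spread the load far more evenly.
even = skew([f"user-{i}" for i in range(1000)], partition_count=4)
```

A skew value close to the partition count suggests the key has too few distinct values, or one dominant value, to spread load effectively.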

Understanding partitioning is key to designing scalable and efficient event-driven applications with Azure Event Hubs. For more details on publishing and consuming events with partition keys, refer to the Azure Event Hubs SDK documentation.