Core Concepts

Understanding the fundamental building blocks of Azure Event Hubs is crucial for designing and implementing efficient event streaming solutions. This section breaks down the key concepts you need to know.

Event Hub

An Event Hub is the central entity within Azure Event Hubs: a named collection of event data streams that publishers write to and consumers read from. The service as a whole is a highly scalable data streaming platform that can ingest millions of events per second.

Event

An event is a small unit of information, typically a record or a payload, that is sent to and processed by Event Hubs. Events are immutable and ordered within a partition.

Publisher

A publisher is an application or service that sends events to an Event Hub. Publishers can send events to specific partitions or allow Event Hubs to choose the partition using a partitioning key.

Consumer Group

A consumer group is an independent view of an event stream. Each consumer group allows a separate application or service to read events from an Event Hub without affecting other consumer groups. This enables multiple applications to process the same data stream in parallel.
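
To make the "independent view" idea concrete, here is a minimal in-memory sketch. It is a toy model, not the Event Hubs SDK: the `ToyStream` class, its `read` method, and the group names `analytics` and `archive` are all hypothetical. Each consumer group keeps its own cursor, so reading in one group never moves another group's position.

```javascript
// Toy model of consumer groups: one shared, append-only stream and one
// read cursor per consumer group. Hypothetical names; not the Azure SDK.
class ToyStream {
  constructor() {
    this.events = [];         // the shared event log (one partition, for brevity)
    this.cursors = new Map(); // consumer group name -> index of next event to read
  }

  append(event) {
    this.events.push(event);
  }

  // Return up to `max` unread events for `group`, advancing only that group's cursor.
  read(group, max) {
    const start = this.cursors.get(group) || 0;
    const batch = this.events.slice(start, start + max);
    this.cursors.set(group, start + batch.length);
    return batch;
  }
}

const stream = new ToyStream();
["login", "click", "logout"].forEach((e) => stream.append(e));

const analyticsFirst = stream.read("analytics", 2); // analytics reads two events
const archiveAll = stream.read("archive", 10);      // archive still sees all three
const analyticsRest = stream.read("analytics", 10); // analytics resumes where it stopped

console.log(analyticsFirst, archiveAll, analyticsRest);
```

The events themselves are stored once; only the cursors are per group. That is why adding another consumer group in this model does not disturb the existing readers.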

Partition

An Event Hub is divided into one or more partitions. Each partition is an ordered sequence of events, and every event sent to the Event Hub is appended to exactly one partition. Events that share a partitioning key are always routed to the same partition, which preserves the order of related events. Ordering is not guaranteed across partitions.

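The number of partitions is set when the Event Hub is created. In the Basic and Standard tiers it cannot be changed afterward; the Premium and Dedicated tiers allow the count to be increased but not decreased. Choosing an appropriate number of partitions up front is therefore critical for scalability and throughput.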

Partitioning Key

A partitioning key is a value that the publisher supplies with an event. Event Hubs hashes the key to determine which partition the event is appended to, so the same key always maps to the same partition. If a partitioning key is not provided, Event Hubs uses a round-robin mechanism to distribute events across partitions. Using a consistent partitioning key ensures that related events are processed in order by consumers within a consumer group.

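// Example of using a partitioning key with the @azure/event-hubs SDK (v5).
// The connection string and Event Hub name below are placeholders.
const { EventHubProducerClient } = require("@azure/event-hubs");

async function main() {
  const producer = new EventHubProducerClient("<connection-string>", "<event-hub-name>");
  // In this SDK the key is a send option, not a field on the event, so all
  // events in this call land in the same partition as other "user-123" events.
  await producer.sendBatch([{ body: "User activity data" }], { partitionKey: "user-123" });
  await producer.close();
}
main().catch(console.error);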
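
The routing rule described above (same key, same partition; no key, round-robin) can be sketched without any Azure dependency. This is a toy model under stated assumptions: `createRouter` and `fnv1a` are hypothetical helpers, and the FNV-1a hash is only a stand-in, since the hash that Event Hubs actually uses is internal to the service.

```javascript
// Toy partition router. FNV-1a is a stand-in hash chosen for illustration;
// it is NOT the function the Event Hubs service uses.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value unsigned 32-bit
  }
  return hash;
}

function createRouter(partitionCount) {
  let next = 0; // round-robin position for events sent without a key
  return function route(partitionKey) {
    if (partitionKey !== undefined) {
      // Deterministic: the same key always yields the same partition index.
      return fnv1a(partitionKey) % partitionCount;
    }
    const chosen = next;
    next = (next + 1) % partitionCount;
    return chosen;
  };
}

const route = createRouter(4);
const keyed = [route("user-123"), route("user-456"), route("user-123")];
const unkeyed = [route(), route(), route(), route(), route()];

console.log({ keyed, unkeyed });
```

Note that the mapping in this sketch depends on the partition count: calling `createRouter` with a different count can send the same key to a different partition, which is one reason the number of partitions deserves care.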

Offset

An offset marks the position of an event within a partition. Consumers use offsets as cursors to track their progress and resume reading from a specific point in the event stream. Offsets are only meaningful within a specific partition, and they are best treated as opaque position markers rather than as a gap-free count of events.

Sequence Number

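A sequence number is a unique, increasing number that Event Hubs assigns to each event as it is appended to a partition. Like an offset, it is assigned by the service and is only meaningful within its partition. The difference is that a sequence number reflects the event's order within the partition, while an offset marks its position in the stream.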
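
A small in-memory sketch may help separate the two identifiers. It is a toy model, not the Event Hubs SDK: `ToyPartition` and `readAfter` are hypothetical, and the sketch assumes, for illustration only, that an offset is the byte position at which an event starts while a sequence number simply counts events.

```javascript
// Toy single-partition log. Hypothetical model, not the Azure SDK:
// sequence numbers count events; offsets here are byte positions in the log.
class ToyPartition {
  constructor() {
    this.entries = [];
    this.nextOffset = 0;
  }

  append(body) {
    const entry = {
      sequenceNumber: this.entries.length, // 0, 1, 2, ... in append order
      offset: this.nextOffset,             // where this event starts, in bytes
      body,
    };
    this.nextOffset += Buffer.byteLength(body, "utf8");
    this.entries.push(entry);
    return entry;
  }

  // Return every event stored after the given sequence number.
  readAfter(sequenceNumber) {
    return this.entries.filter((e) => e.sequenceNumber > sequenceNumber);
  }
}

const partition = new ToyPartition();
["a", "bcd", "ef", "g"].forEach((b) => partition.append(b));

// A consumer that has processed the first two events checkpoints the
// sequence number of the second one...
const checkpoint = partition.entries[1].sequenceNumber;
// ...and after a restart resumes with the events that follow it.
const resumed = partition.readAfter(checkpoint);

console.log(partition.entries.map((e) => [e.sequenceNumber, e.offset]));
console.log(resumed.map((e) => e.body));
```

In this model the sequence numbers advance by one per event, while the offsets jump by the size of each event, so only the sequence number can be used to count how many events lie between two positions.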

Capture

Event Hubs Capture is a built-in feature that automatically and incrementally captures the output of an Event Hub and writes it to an Azure Blob Storage account or Azure Data Lake Storage account. This is ideal for archival purposes and for subsequent processing with tools like Azure Databricks or Azure Synapse Analytics.

Key Relationships

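Here's how these concepts relate to each other: publishers send events to an Event Hub, where each event is appended to exactly one partition, chosen explicitly, by partitioning key, or by round-robin. Within a partition, every event has an offset and a sequence number. Each consumer group reads the partitions independently and tracks its own progress per partition, and Capture can write the same stream to storage for archival and later processing.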

By mastering these core concepts, you'll be well-equipped to leverage the power of Azure Event Hubs for your real-time data streaming needs.