Event Hubs
Azure Event Hubs is a highly scalable data streaming platform and event ingestion service. It can capture millions of events per second so you can develop a variety of real-time analytics solutions.
Event Hubs acts as the "front door" for an event pipeline, receiving events from producers. It is designed for scenarios where data arrives continuously and needs to be processed immediately or in batches. Applications that use Event Hubs can retrieve and process the resulting data streams.
Conceptual diagram of data flow with Event Hubs.
Key Components
1. Event Hub
An Event Hub is the central entity. It's a container for events. When you create an Event Hubs namespace, you can then create one or more Event Hubs within that namespace. Each Event Hub is a named stream of events.
2. Event
An Event is the smallest unit of data that Event Hubs handles. It's a record or a message. An event contains a payload (the actual data), and optional metadata like headers and properties.
A typical event record might contain:
- Body: The event data, typically JSON or Avro.
- Properties: Application-defined properties.
- SystemProperties: System-generated properties such as the offset, sequence number, enqueued timestamp, and partition key.
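As a sketch, the fields above can be modeled as a TypeScript type. The field names follow the shape of a received event in the @azure/event-hubs SDK; the concrete values below are invented for illustration.

```typescript
// Approximate shape of a received event (illustrative, not the SDK's exact type).
interface ReceivedEvent {
  body: unknown;                                // the payload, e.g. parsed JSON
  properties?: Record<string, unknown>;         // application-defined properties
  systemProperties?: Record<string, unknown>;   // broker-populated metadata
  sequenceNumber?: number;                      // position within the partition
  enqueuedTimeUtc?: Date;                       // when the broker accepted the event
  partitionKey?: string | null;                 // key used to route to a partition
}

// A hypothetical telemetry event as it might arrive from a sensor.
const example: ReceivedEvent = {
  body: { temperature: 21.5, unit: "C" },
  properties: { source: "sensor-42" },
  sequenceNumber: 1037,
  enqueuedTimeUtc: new Date("2024-01-01T00:00:00Z"),
  partitionKey: "sensor-42",
};
```

The body is opaque to Event Hubs itself; producers and consumers must agree on its serialization.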
3. Partition
Event Hubs is a partitioned stream. A Partition is an ordered sequence of events. Events sent to an Event Hub are distributed across its partitions. Partitions are managed by Event Hubs and are created automatically as part of an Event Hub; you specify the number of partitions when you create it.
Partitions allow Event Hubs to scale. Events are distributed across partitions. When you send an event, you can specify a partition key. If a partition key is specified, all events with the same partition key will be directed to the same partition. This ensures that events for a specific entity (like a user or a device) are processed in order.
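The routing property described above can be illustrated with a toy hash function. Note this is illustrative only: Event Hubs uses its own internal hashing, and this sketch merely demonstrates that hashing a key makes partition assignment deterministic.

```typescript
// Toy key-to-partition mapping (NOT the service's actual algorithm).
function partitionFor(partitionKey: string, partitionCount: number): number {
  let hash = 0;
  for (const ch of partitionKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % partitionCount;
}

// The same key always lands on the same partition, so events for one
// device are appended, and therefore read, in order.
const first = partitionFor("device-7", 4);
const second = partitionFor("device-7", 4);
// first === second, always.
```

Different keys may still share a partition; the guarantee is per-key ordering, not one partition per key.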
4. Consumer Group
A Consumer Group allows an application to read from an Event Hub independently. Each consumer group represents a distinct view of the event stream, with its own position in each partition. Even if multiple applications are reading from the same Event Hub, they can do so independently by using their own consumer group.
This is crucial for scenarios where different applications need to process the same data streams for different purposes (e.g., one application for real-time dashboarding, another for batch analytics, and a third for archiving). Within a single consumer group, each partition should have only one active reader at a time; separate consumer groups let multiple applications read the full stream in parallel, each maintaining its own position (offset) per partition.
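A toy in-memory model can make the independence concrete: each consumer group keeps its own cursor into the same partition's event sequence, so one group's progress never affects another's. (In the real service, positions are checkpointed externally, e.g. via a Blob Storage checkpoint store; the class and event values here are invented for illustration.)

```typescript
// One partition's ordered event sequence (shared by all groups).
const partitionEvents = ["e1", "e2", "e3", "e4"];

// Each consumer group tracks its own read position independently.
class ConsumerGroupCursor {
  private position = 0;
  constructor(readonly name: string) {}
  readNext(): string | undefined {
    return partitionEvents[this.position++];
  }
}

const dashboard = new ConsumerGroupCursor("dashboard");
const archiver = new ConsumerGroupCursor("archiver");

dashboard.readNext(); // "e1"
dashboard.readNext(); // "e2"
archiver.readNext();  // "e1" — unaffected by the dashboard group's progress
```

Both groups see every event; neither consumes events "away" from the other.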
Sending and Receiving Events
Event Hubs supports various SDKs and protocols (like AMQP and Kafka) for sending and receiving events. You can send events in batches to improve efficiency.
// Example: Sending an event (conceptual)
import { EventHubProducerClient } from "@azure/event-hubs";

const connectionString = "YOUR_EVENTHUBS_CONNECTION_STRING";
const eventHubName = "YOUR_EVENTHUB_NAME";

async function sendEvent() {
  const producerClient = new EventHubProducerClient(connectionString, eventHubName);
  // sendBatch accepts an array of events; sending in batches amortizes network overhead.
  await producerClient.sendBatch([{ body: "Hello, Event Hubs!" }]);
  console.log("Event sent successfully.");
  await producerClient.close();
}
// Example: Receiving events (conceptual)
import { EventHubConsumerClient } from "@azure/event-hubs";

const connectionString = "YOUR_EVENTHUBS_CONNECTION_STRING";
const eventHubName = "YOUR_EVENTHUB_NAME";
const consumerGroupName = "$Default"; // Or your custom consumer group name

async function receiveEvents() {
  const consumerClient = new EventHubConsumerClient(consumerGroupName, connectionString, eventHubName);
  const subscription = consumerClient.subscribe({
    async processEvents(events, context) {
      for (const event of events) {
        console.log(`Received event: ${JSON.stringify(event.body)}`);
      }
    },
    async processError(err, context) {
      console.error(`Error occurred: ${err.message}`);
    },
  });

  // Keep the process running to receive events
  await new Promise(resolve => setTimeout(resolve, 60000)); // Listen for 60 seconds
  await subscription.close();
  await consumerClient.close();
}
Use Cases
- Real-time telemetry processing
- Application logging and monitoring
- Data pipeline ingestion
- Fraud detection
- IoT data handling