Consuming Azure Event Hubs with Kafka

This tutorial guides you through the process of consuming messages from Azure Event Hubs using the Apache Kafka API. Azure Event Hubs provides a highly scalable data streaming platform that can be accessed with familiar Kafka clients.

Prerequisites

  • An Azure subscription.
  • An Azure Event Hubs namespace and an Event Hub created within it.
  • Kafka client libraries for your preferred programming language (e.g., Java, Python, .NET).
  • Understanding of Kafka concepts (producers, consumers, topics, consumer groups).

Step 1: Obtain Event Hubs Connection Information

You'll need the Event Hubs connection string or other authentication details to connect your Kafka client. You can find this information in the Azure portal under your Event Hubs namespace's "Shared access policies" blade.

For Kafka compatibility, you'll connect to the Kafka endpoint that Event Hubs exposes on port 9093. It follows the pattern:

<your-namespace>.servicebus.windows.net:9093

For authentication, you'll need a SAS policy name and key, or equivalently the full connection string.
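
If you already have the full connection string, the values needed for Kafka authentication can be pulled out of it programmatically. A minimal illustrative sketch (the parse_connection_string helper here is ours, not part of any Azure SDK):

# Hypothetical helper: split an Event Hubs connection string into the
# pieces needed for Kafka SASL/PLAIN authentication.
def parse_connection_string(conn_str):
    # Format: "Endpoint=sb://...;SharedAccessKeyName=...;SharedAccessKey=..."
    parts = dict(
        segment.split('=', 1)  # maxsplit=1: key values may contain '='
        for segment in conn_str.rstrip(';').split(';')
    )
    namespace_host = parts['Endpoint'].split('://', 1)[1].rstrip('/')
    return {
        'bootstrap_servers': namespace_host + ':9093',
        'policy_name': parts['SharedAccessKeyName'],
        'key': parts['SharedAccessKey'],
    }

The bootstrap_servers value goes into your Kafka client configuration, while the connection string itself doubles as the SASL password (see Step 2).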

Step 2: Configure Your Kafka Consumer

Configure your Kafka consumer application with the necessary bootstrap servers and security credentials. The key parameters you'll need to set are:

  • bootstrap.servers: The Event Hubs Kafka endpoint (e.g., your-namespace.servicebus.windows.net:9093).
  • security.protocol: Typically set to SASL_SSL.
  • sasl.mechanism: Set to PLAIN.
  • sasl.jaas.config: This is where you provide your Event Hubs credentials. The format is crucial:
org.apache.kafka.common.security.plain.PlainLoginModule required \
  username="$ConnectionString" \
  password="Endpoint=sb://<your-namespace>.servicebus.windows.net/;SharedAccessKeyName=<your-policy-name>;SharedAccessKey=<your-key>";

Ensure you replace placeholders like <your-namespace>, <your-policy-name>, and <your-key> with your actual Event Hubs details.
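
Before writing any code, you can sanity-check these settings with the stock Kafka command-line tools. A minimal sketch of a consumer.properties file, assuming the placeholder values above:

bootstrap.servers=<your-namespace>.servicebus.windows.net:9093
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://<your-namespace>.servicebus.windows.net/;SharedAccessKeyName=<your-policy-name>;SharedAccessKey=<your-key>";

You can then run kafka-console-consumer.sh --bootstrap-server <your-namespace>.servicebus.windows.net:9093 --topic <your-eventhub-name> --consumer.config consumer.properties and watch for incoming messages.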

Step 3: Implement the Kafka Consumer Logic

Write your consumer code to connect to Event Hubs as if it were a Kafka broker. You'll subscribe to the event hub (which is exposed as a Kafka topic) and process incoming messages.

Example (Python, using the kafka-python client):


from kafka import KafkaConsumer
import json

# Kafka consumer configuration. Note that kafka-python takes the SASL
# credentials directly (sasl_plain_username / sasl_plain_password)
# rather than a Java-style sasl.jaas.config string.
consumer_config = {
    'bootstrap_servers': 'your-namespace.servicebus.windows.net:9093',
    'security_protocol': 'SASL_SSL',
    'sasl_mechanism': 'PLAIN',
    'sasl_plain_username': '$ConnectionString',
    'sasl_plain_password': 'Endpoint=sb://your-namespace.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=YOUR_SAS_KEY',
    'group_id': 'my-eventhubs-consumer-group',
    'auto_offset_reset': 'earliest'
}

# Event Hubs topic name (the event hub itself is the Kafka topic)
topic_name = 'your-eventhub-name'

consumer = None
try:
    consumer = KafkaConsumer(topic_name, **consumer_config)
    print(f"Connected to Event Hubs. Subscribed to topic: {topic_name}")

    for message in consumer:
        print(f"Received message: Offset={message.offset}, Key={message.key}, Value={message.value.decode('utf-8')}")
        # Process the message here
        # For example, parse JSON, store in a database, etc.
        # data = json.loads(message.value)

except Exception as e:
    print(f"An error occurred: {e}")

finally:
    if consumer is not None:
        consumer.close()
        print("Consumer closed.")
Important: The exact configuration keys vary by client library and language. Java clients take the sasl.jaas.config string shown in Step 2, while kafka-python takes sasl_plain_username and sasl_plain_password directly, as in the example above. Always refer to the specific library's documentation.
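
To illustrate how the keys differ, the confluent-kafka package (a Python wrapper around librdkafka) expresses the same settings with dotted property names. A minimal sketch:

from confluent_kafka import Consumer

# Same Event Hubs connection, expressed as librdkafka-style properties.
consumer = Consumer({
    'bootstrap.servers': 'your-namespace.servicebus.windows.net:9093',
    'security.protocol': 'SASL_SSL',
    'sasl.mechanism': 'PLAIN',
    'sasl.username': '$ConnectionString',
    'sasl.password': 'Endpoint=sb://your-namespace.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=YOUR_SAS_KEY',
    'group.id': 'my-eventhubs-consumer-group',
    'auto.offset.reset': 'earliest',
})
consumer.subscribe(['your-eventhub-name'])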

Step 4: Running Your Consumer

Compile and run your consumer application. It should now connect to your Azure Event Hub and start receiving messages that are being published to it.
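
If nothing is publishing to the event hub yet, you can send yourself a test message with the same library and credentials. A minimal sketch, using the same placeholder values as the consumer example:

from kafka import KafkaProducer

# Producer configured with the same Event Hubs SASL settings as the consumer.
producer = KafkaProducer(
    bootstrap_servers='your-namespace.servicebus.windows.net:9093',
    security_protocol='SASL_SSL',
    sasl_mechanism='PLAIN',
    sasl_plain_username='$ConnectionString',
    sasl_plain_password='Endpoint=sb://your-namespace.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=YOUR_SAS_KEY',
)

# Send one message and block until the broker acknowledges it.
future = producer.send('your-eventhub-name', b'hello from kafka-python')
metadata = future.get(timeout=30)
print(f"Sent to partition {metadata.partition} at offset {metadata.offset}")
producer.close()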

Considerations

  • Authentication: Ensure your SAS key has the necessary permissions (e.g., Listen).
  • Topic Naming: Each event hub within your namespace is exposed as a Kafka topic; use the event hub's name as the Kafka topic name.
  • Consumer Groups: The group_id in Kafka is used for load balancing and fault tolerance. Event Hubs supports Kafka consumer groups, so multiple consumers sharing a group_id divide the event hub's partitions between them.
  • Offset Management: By default, Kafka clients commit offsets automatically, and Event Hubs stores these commits so a restarted consumer resumes where it left off. For more control, you can commit offsets manually (see the sketch after this list).
  • Client Compatibility: Azure Event Hubs supports Kafka protocol versions 1.0.0 and higher.
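
For stronger processing guarantees, disable auto-commit and commit an offset only after the message has been fully handled. A minimal sketch with kafka-python, reusing consumer_config and topic_name from Step 3 (the handle function is a stand-in for your own processing logic):

from kafka import KafkaConsumer

# Copy the Step 3 configuration but turn off automatic offset commits.
manual_config = dict(consumer_config, enable_auto_commit=False)
consumer = KafkaConsumer(topic_name, **manual_config)

for message in consumer:
    handle(message)    # hypothetical processing step; commit only on success
    consumer.commit()  # everything up to and including this message is done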

By leveraging the Kafka API, you can seamlessly integrate existing Kafka-based applications and tools with Azure Event Hubs, taking advantage of its robust cloud-native features while minimizing code changes.