Azure Data Ingestion

Overview

Azure provides a suite of services to ingest massive streams of data from devices, applications, and logs. Choose the right service based on latency, volume, and processing requirements.

Azure Event Hubs

Highly scalable data streaming platform and event ingestion service, capable of receiving and processing millions of events per second.

Key Features

  • Real-time ingestion
  • Partitioned consumer model with consumer groups
  • Auto-inflate (automatic scaling of throughput units)
  • Integration with Azure Functions & Stream Analytics
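
The partitioned model can be illustrated with a small sketch. Events that share a partition key are routed to the same partition, which keeps their relative order. The hash below is an assumption chosen for illustration, not the algorithm the service uses, and `assign_partition` is a hypothetical helper.

```python
import hashlib


def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index.

    Illustrative only: SHA-256 stands in for whatever hash the service applies.
    """
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


if __name__ == "__main__":
    # Hypothetical device readings; the device ID serves as the partition key.
    events = [("device-1", 21.5), ("device-2", 19.0), ("device-1", 22.1)]
    for key, temperature in events:
        print(key, temperature, "-> partition", assign_partition(key, 4))
```

Both `device-1` readings land on the same partition index, so a consumer reading that partition sees them in the order they were sent.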

Quick Start

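# Create the Event Hubs namespace that will contain the event hub.
az eventhubs namespace create \
  --resource-group MyResourceGroup \
  --name MyNamespace \
  --location eastus

# Create an event hub with four partitions inside that namespace.
az eventhubs eventhub create \
  --resource-group MyResourceGroup \
  --namespace-name MyNamespace \
  --name MyEventHub \
  --partition-count 4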

Sample Code (C#)

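using System;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

// Connection string for the Event Hubs namespace (left empty here; supply your own).
var connectionString = "";
var eventHubName = "MyEventHub";

// The producer and the batch are both disposed when they go out of scope.
await using var producer = new EventHubProducerClient(connectionString, eventHubName);
using EventDataBatch eventBatch = await producer.CreateBatchAsync();

// TryAdd returns false when an event does not fit in the batch.
foreach (var body in new[] { "First event", "Second event" })
{
    if (!eventBatch.TryAdd(new EventData(body)))
    {
        throw new InvalidOperationException($"Event '{body}' does not fit in the batch.");
    }
}

await producer.SendAsync(eventBatch);
Console.WriteLine("Events sent");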
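
`TryAdd` reports failure by returning false when an event would push the batch past its size limit. A common response is to send the current batch and start a new one. The sketch below simulates that pattern without the SDK: `SizeLimitedBatch` and `split_into_batches` are hypothetical names, and the size check counts only body bytes, which is a simplification of what a real batch measures.

```python
class SizeLimitedBatch:
    """Stand-in for a size-limited event batch (not the SDK type)."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.size = 0
        self.events = []

    def try_add(self, body: bytes) -> bool:
        # Reject the event instead of raising when it would exceed the limit.
        if self.size + len(body) > self.max_bytes:
            return False
        self.events.append(body)
        self.size += len(body)
        return True


def split_into_batches(bodies, max_bytes: int):
    """Group event bodies into batches that each stay within max_bytes."""
    batches = []
    batch = SizeLimitedBatch(max_bytes)
    for body in bodies:
        if batch.try_add(body):
            continue
        # The current batch is full: close it and retry in a fresh batch.
        if batch.events:
            batches.append(batch.events)
            batch = SizeLimitedBatch(max_bytes)
        if not batch.try_add(body):
            raise ValueError("event is larger than the maximum batch size")
    if batch.events:
        batches.append(batch.events)
    return batches


if __name__ == "__main__":
    print(split_into_batches([b"aaaa", b"bbbb", b"cc"], max_bytes=8))
```

In a real producer, each closed batch would be sent before the next one is filled.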

Azure Data Factory

Orchestrates data movement and transformation at scale. Ideal for batch ingestion from on-premises, SaaS, and cloud sources.

Core Concepts

Component        Description
Pipeline         Logical grouping of activities.
Linked Service   Connection info to external data stores.
Dataset          Schema definition of data to be consumed.
Activity         Step performed within a pipeline.

Sample JSON Pipeline

{
  "name": "CopyFromBlobToSql",
  "properties": {
    "activities": [
      {
        "name": "CopyData",
        "type": "Copy",
        "inputs": [{ "referenceName": "BlobDataset", "type": "DatasetReference" }],
        "outputs": [{ "referenceName": "SqlDataset", "type": "DatasetReference" }],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "SqlSink" }
        }
      }
    ]
  }
}
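
To make the structure of the pipeline definition easier to see, the following sketch parses the same JSON and lists each activity with its input and output dataset references. It uses only the Python standard library and does not call Data Factory; `summarize_activities` is a hypothetical helper introduced for illustration.

```python
import json

# The pipeline definition from the sample above, embedded as a string.
PIPELINE_JSON = """
{
  "name": "CopyFromBlobToSql",
  "properties": {
    "activities": [
      {
        "name": "CopyData",
        "type": "Copy",
        "inputs": [{ "referenceName": "BlobDataset", "type": "DatasetReference" }],
        "outputs": [{ "referenceName": "SqlDataset", "type": "DatasetReference" }],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "SqlSink" }
        }
      }
    ]
  }
}
"""


def summarize_activities(pipeline: dict) -> list:
    """Return (name, type, input datasets, output datasets) for each activity."""
    summary = []
    for activity in pipeline["properties"]["activities"]:
        inputs = [ref["referenceName"] for ref in activity.get("inputs", [])]
        outputs = [ref["referenceName"] for ref in activity.get("outputs", [])]
        summary.append((activity["name"], activity["type"], inputs, outputs))
    return summary


if __name__ == "__main__":
    for name, kind, inputs, outputs in summarize_activities(json.loads(PIPELINE_JSON)):
        print(f"{name} ({kind}): {inputs} -> {outputs}")
```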

Azure Stream Analytics

Real-time analytics service for high-throughput streaming data. Define queries using a SQL-like language.

Sample Query

SELECT
    System.Timestamp() AS WindowEnd,
    DeviceId,
    AVG(Temperature) AS AvgTemp
INTO
    OutputBlob
FROM
    InputEventHub TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY
    TumblingWindow(minute, 5), DeviceId;
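
As a rough illustration of what the query computes, the sketch below groups readings into fixed, non-overlapping 5-minute windows and averages the temperature per device. It is plain Python, not Stream Analytics, and it assumes that each window excludes its start instant and includes its end instant. `window_end` and `tumbling_average` are hypothetical names.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_end(ts: datetime) -> datetime:
    """Return the end of the 5-minute tumbling window that contains ts.

    Assumption: a window covers (start, end], so a timestamp exactly on a
    boundary belongs to the window that ends at that instant.
    """
    elapsed = ts - EPOCH
    index = -((-elapsed) // WINDOW)  # ceiling division on timedeltas
    return EPOCH + index * WINDOW


def tumbling_average(events):
    """Average temperature per (window end, device ID).

    events is an iterable of (timestamp, device_id, temperature) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for ts, device_id, temperature in events:
        key = (window_end(ts), device_id)
        totals[key][0] += temperature
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    t = lambda h, m: datetime(2025, 9, 13, h, m, tzinfo=timezone.utc)
    readings = [(t(12, 1), "a", 20.0), (t(12, 4), "a", 22.0), (t(12, 6), "a", 30.0)]
    for (end, device), avg in sorted(tumbling_average(readings).items()):
        print(end.isoformat(), device, avg)
```

The first two readings fall in the window ending at 12:05 and are averaged together; the third falls in the window ending at 12:10.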

Azure Blob Storage Ingestion

Simple and cost-effective way to store large data files, logs, and batches before processing.

Upload via Azure CLI

az storage blob upload \
  --account-name mystorageaccount \
  --container-name rawdata \
  --name sensor-data-2025-09-13.json \
  --file ./sensor-data.json
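
The blob name in the command above embeds the upload date. A small helper can build such names consistently; this is a minimal sketch, and `dated_blob_name` is a hypothetical function rather than part of any Azure tool.

```python
from datetime import date


def dated_blob_name(prefix: str, day: date, extension: str = "json") -> str:
    """Build a blob name of the form <prefix>-YYYY-MM-DD.<extension>."""
    return f"{prefix}-{day.isoformat()}.{extension}"


if __name__ == "__main__":
    print(dated_blob_name("sensor-data", date(2025, 9, 13)))  # sensor-data-2025-09-13.json
```

Because ISO dates sort lexicographically, blobs named this way list in chronological order within the container.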