Azure Stream Analytics Query Language

The Azure Stream Analytics Query Language (SAQL) is a SQL-like language used to process data streams in real-time. It allows you to filter, aggregate, and transform streaming data with low latency.

Core Concepts

SELECT Statement

The `SELECT` statement is used to retrieve data from input streams or query results. It supports standard SQL clauses like `FROM`, `WHERE`, `GROUP BY`, and `ORDER BY`.

SELECT
    System.Timestamp AS WindowEnd,
    DeviceId,
    AVG(Temperature) AS AverageTemperature
FROM
    IoTHub
WHERE
    IsActive = 1
GROUP BY
    DeviceId,
    TumblingWindow(minute, 5)

Input Streams

Input streams represent the data sources your ASA job reads from. These can be Azure IoT Hub, Azure Event Hubs, or Azure Blob Storage.

Output Data

Output data is what your ASA job writes to after processing. Common destinations include Azure SQL Database, Azure Cosmos DB, Power BI, and Event Hubs.

Windowing Functions

Windowing functions are crucial for performing time-based aggregations on unbounded streaming data.

Tumbling Window

Tumbling windows divide the stream into discrete, non-overlapping time segments.

TumblingWindow(size, [field_or_time])

Example: `TumblingWindow(minute, 5)` creates 5-minute windows.

Hopping Window

Hopping windows allow for overlapping windows, which is useful for analyzing data over sliding periods.

HoppingWindow( HopLength, WindowSize [,field_or_time] )

Example: `HoppingWindow(minute, 5, 10)` creates 10-minute windows that advance every 5 minutes.

Sliding Window

Sliding windows are a special case of hopping windows where the hop length is equal to the window size.

SlidingWindow(size [,field_or_time])

Session Window

Session windows group events that occur within a defined period of activity, separated by inactivity.

SessionWindow(session_gap [,field_or_time])

Example: `SessionWindow(minutes, 10)` groups events into sessions that are separated by at least 10 minutes of inactivity.

Functions

SAQL provides a rich set of built-in functions for data manipulation and analysis.

Aggregate Functions

Built-in Functions

SAQL includes functions for:

Note: Refer to the official Azure documentation for a comprehensive list of supported functions and their syntax.

Joins

You can join multiple streams or a stream with reference data.

Stream-to-Stream Joins

Join events from two different streams, typically within a time window.

SELECT
    input1.DeviceId,
    input1.Value AS SensorValue,
    input2.Status
FROM
    InputStream1 AS input1
JOIN
    InputStream2 AS input2
ON
    input1.DeviceId = input2.DeviceId
    AND DATEDIFF(minute, input1, input2) BETWEEN 0 AND 5

Stream-to-Reference Data Joins

Join streaming data with static reference data to enrich events.

SELECT
    s.DeviceId,
    s.Temperature,
    r.Location
FROM
    StreamingData AS s
JOIN
    ReferenceData AS r ON s.DeviceId = r.DeviceId
Tip: For performance optimization, ensure that joins involving reference data are efficient by keeping the reference data small and indexed.

Common Patterns