Creating an Azure Stream Analytics Job
This guide walks you through the process of creating a new Azure Stream Analytics job using the Azure portal. Stream Analytics jobs process real-time data streams from various input sources and direct the results to different output sinks.
Step-by-Step Creation Process
- Navigate to the Azure Portal: Open your web browser and go to https://portal.azure.com/. Sign in with your Azure account credentials.
- Create a Resource: In the Azure portal, click the + Create a resource button, typically found in the top-left corner or on the dashboard.
- Search for Stream Analytics: In the Marketplace search bar, type "Stream Analytics" and select Stream Analytics job from the results.
- Configure the Job: Click the Create button. You will be presented with a form to configure your new Stream Analytics job, with the following fields:
  - Subscription: Choose the Azure subscription you want to use.
  - Resource group: Select an existing resource group or create a new one to organize your Azure resources.
  - Job name: Provide a unique and descriptive name for your job (e.g., my-realtime-analytics-job).
  - Region: Select the Azure region where you want to deploy your job. Choose a region geographically close to your data sources and sinks to reduce latency.
  - Hosting environment: For most use cases, select Cloud. Edge is used for deployments on IoT Edge devices.
- Review and Create: After filling in the details, click the Review + create button. Azure will validate your configuration.
- Deploy the Job: Once validation passes, click the Create button to deploy your Stream Analytics job. Deployment usually takes a few minutes.
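Before filling in the form, it can help to write the settings down and sanity-check them. The following is a minimal Python sketch that mirrors the form fields above. The JobSettings class, its problems() method, and the naming rule (3 to 63 characters, letters, digits, hyphens, and underscores) are assumptions made for illustration; confirm the actual naming rule in the portal, which performs its own validation.

```python
import re
from dataclasses import dataclass

# Assumed naming rule (verify against the portal): 3-63 characters,
# letters, digits, hyphens, and underscores only.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{3,63}")

# The two hosting environments named in the form.
HOSTING_ENVIRONMENTS = {"Cloud", "Edge"}


@dataclass(frozen=True)
class JobSettings:
    """Hypothetical container mirroring the fields of the creation form."""

    subscription: str
    resource_group: str
    job_name: str
    region: str
    hosting_environment: str = "Cloud"

    def problems(self) -> list:
        """Return a list of human-readable problems; empty means no issue found."""
        found = []
        for field_name in ("subscription", "resource_group", "region"):
            if not getattr(self, field_name).strip():
                found.append(f"{field_name} is empty")
        if not JOB_NAME_PATTERN.fullmatch(self.job_name):
            found.append("job_name must be 3-63 characters: letters, digits, '-' or '_'")
        if self.hosting_environment not in HOSTING_ENVIRONMENTS:
            found.append("hosting_environment must be 'Cloud' or 'Edge'")
        return found


settings = JobSettings(
    subscription="my-subscription",          # placeholder value
    resource_group="my-resource-group",      # placeholder value
    job_name="my-realtime-analytics-job",
    region="westeurope",                     # placeholder value
)
print(settings.problems())
```

This check is purely local and does not contact Azure; it only catches obvious mistakes before you reach the Review + create step.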
After Job Creation
Once your Stream Analytics job is created, you will need to configure its inputs, query, and outputs:
Configuring Inputs
Inputs are the data streams that your job will process. Common input sources include:
- Azure Event Hubs: A highly scalable data streaming platform.
- Azure IoT Hub: A managed cloud service that enables secure and bi-directional communication between IoT devices and the cloud.
- Azure Blob Storage: Suited to ingesting data that is stored as files, such as logs or historical data. It can also serve as an output sink.
To configure an input:
- Navigate to your Stream Analytics job in the Azure portal.
- In the job's menu, under Job topology, select Inputs.
- Click Add stream input and choose your input source type.
- Follow the prompts to connect to your data source, providing details like connection strings, event serialization format, and encoding.
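To make the serialization format and encoding settings concrete, here is a small Python sketch that produces sample events as line-separated JSON encoded in UTF-8. The field names DeviceId and Temperature come from the example query later in this guide; the EventTime field, the function names, and the generated values are assumptions for illustration only.

```python
import json
from datetime import datetime, timedelta, timezone


def make_sample_events(device_ids, start, count, interval_seconds=60):
    """Build `count` readings per device, spaced `interval_seconds` apart."""
    events = []
    for i in range(count):
        event_time = start + timedelta(seconds=i * interval_seconds)
        for n, device_id in enumerate(device_ids):
            events.append({
                "DeviceId": device_id,
                "Temperature": 20.0 + n + 0.5 * i,     # synthetic values
                "EventTime": event_time.isoformat(),   # hypothetical field
            })
    return events


def to_line_separated_json(events):
    """Serialize events as one JSON object per line, encoded as UTF-8 bytes."""
    return "\n".join(json.dumps(event) for event in events).encode("utf-8")


start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
events = make_sample_events(["device-1", "device-2"], start, count=3)
payload = to_line_separated_json(events)
print(payload.decode("utf-8").splitlines()[0])
```

A file written from this payload could serve as sample input when you test a query, provided the input is configured with JSON serialization and UTF-8 encoding.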
Writing the Query
The query is the core of your Stream Analytics job. It defines how your data streams are transformed and analyzed using a SQL-like query language. You can perform transformations, aggregations, joins, and more.
Example Query:
    SELECT
        System.Timestamp() AS WindowEnd,
        DeviceId,
        AVG(Temperature) AS AverageTemperature
    INTO
        OutputAlias
    FROM
        InputAlias
    GROUP BY
        DeviceId,
        TumblingWindow(minute, 5)
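To see what this query computes, the following Python sketch reproduces the grouping logic on an in-memory list of events. It is an illustration, not how the service is implemented. It assumes that windows are aligned to clock boundaries and include their end time but not their start time, and that each event carries a hypothetical "time" key holding its timestamp.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_end(event_time):
    """Return the end of the 5-minute window containing event_time.

    Windows are treated as (start, end]: an event exactly on a boundary
    belongs to the window that ends at that boundary (an assumption).
    """
    whole_windows, remainder = divmod(event_time - EPOCH, WINDOW)
    if remainder:  # not exactly on a boundary, so round up to the next one
        whole_windows += 1
    return EPOCH + whole_windows * WINDOW


def average_temperature(events):
    """Group by (window end, DeviceId) and average Temperature per group."""
    groups = defaultdict(list)
    for event in events:
        key = (window_end(event["time"]), event["DeviceId"])
        groups[key].append(event["Temperature"])
    return [
        {"WindowEnd": end, "DeviceId": device, "AverageTemperature": sum(v) / len(v)}
        for (end, device), v in sorted(groups.items())
    ]


def at(minute):
    """Helper: a timestamp on 2024-01-01 at 12:<minute> UTC."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


sample = [
    {"time": at(1), "DeviceId": "device-1", "Temperature": 20.0},
    {"time": at(4), "DeviceId": "device-1", "Temperature": 22.0},
    {"time": at(5), "DeviceId": "device-1", "Temperature": 24.0},
    {"time": at(6), "DeviceId": "device-1", "Temperature": 30.0},
]
results = average_temperature(sample)
for row in results:
    print(row["WindowEnd"].isoformat(), row["DeviceId"], row["AverageTemperature"])
```

Each output row corresponds to one device in one five-minute window, which is the shape of the result the query sends to the output.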
To configure the query:
- In the job's menu, under Job topology, select Query.
- Write your query in the Stream Analytics Query Language (SAQL) in the query editor.
- Use the Test query button to validate your query against sample input data.
Configuring Outputs
Outputs are where your processed data will be sent. Common output sinks include:
- Azure Blob Storage: For storing results in files.
- Azure SQL Database: For relational data storage.
- Azure Data Lake Storage: For large-scale data analytics.
- Power BI: For real-time visualization.
- Azure Cosmos DB: For NoSQL data.
- Azure Service Bus: For messaging through queues and topics.
To configure an output:
- In the job's menu, under Job topology, select Outputs.
- Click Add and choose your output sink type.
- Follow the prompts to configure the connection details for your output sink.
Important: Ensure that your input, query, and output are correctly defined and connected. The aliases in the query's FROM and INTO clauses must match the names you configured under Inputs and Outputs.
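A quick way to catch a mismatched alias before starting the job is to compare the names used in the query with the names you configured. The Python sketch below does this with simplified regular expressions; the function names are hypothetical, and the pattern only handles plain FROM and INTO clauses like those in the example query, not joins, subqueries, or WITH steps. It also compares names exactly, which may be stricter than the service.

```python
import re

EXAMPLE_QUERY = """
SELECT
    System.Timestamp() AS WindowEnd,
    DeviceId,
    AVG(Temperature) AS AverageTemperature
INTO
    OutputAlias
FROM
    InputAlias
GROUP BY
    DeviceId,
    TumblingWindow(minute, 5)
"""


def query_aliases(query):
    """Return (input aliases, output aliases) found after FROM and INTO."""
    pattern = r"\b{keyword}\s+\[?([\w-]+)\]?"
    inputs = set(re.findall(pattern.format(keyword="FROM"), query, flags=re.IGNORECASE))
    outputs = set(re.findall(pattern.format(keyword="INTO"), query, flags=re.IGNORECASE))
    return inputs, outputs


def missing_aliases(query, input_names, output_names):
    """List aliases the query uses that are not among the configured names."""
    inputs_used, outputs_used = query_aliases(query)
    return {
        "inputs": sorted(inputs_used - set(input_names)),
        "outputs": sorted(outputs_used - set(output_names)),
    }


# The output was configured under a different name than the query expects.
report = missing_aliases(EXAMPLE_QUERY, ["InputAlias"], ["blob-output"])
print(report)
```

An empty list under both keys means every alias in the query has a matching configured name.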
Starting the Job
Once your job, inputs, query, and outputs are configured, you can start the job to begin processing data:
- In your Stream Analytics job's overview page, click the Start button.
- You will be prompted to choose the job output start time. Select Now to begin producing output immediately, or select Custom and specify a past date and time if you need to process historical data.
- Click Start.
Your Stream Analytics job will now be running and processing data in real time.
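If you later automate the start step, the start-time choice is typically expressed as a small set of parameters. The Python sketch below builds such a parameter dictionary. The key names outputStartMode and outputStartTime and the mode values are assumptions based on the job start operation of the Stream Analytics REST API and should be checked against the current API reference; build_start_parameters is a hypothetical helper, and the code makes no network calls.

```python
from datetime import datetime, timezone

# Assumed mode values; "JobStartTime" corresponds to choosing Now in the portal.
START_MODES = {"JobStartTime", "CustomTime", "LastOutputEventTime"}


def build_start_parameters(mode="JobStartTime", start_time=None):
    """Return a dictionary describing when the job should start producing output."""
    if mode not in START_MODES:
        raise ValueError(f"unknown start mode: {mode!r}")
    if mode != "CustomTime":
        return {"outputStartMode": mode}
    if start_time is None or start_time.tzinfo is None:
        raise ValueError("CustomTime requires a timezone-aware start_time")
    # Normalize to UTC and format as an ISO 8601 timestamp.
    as_utc = start_time.astimezone(timezone.utc)
    return {
        "outputStartMode": mode,
        "outputStartTime": as_utc.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


now_parameters = build_start_parameters()
custom_parameters = build_start_parameters(
    "CustomTime", datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
)
print(now_parameters)
print(custom_parameters)
```

Requiring a timezone-aware timestamp avoids ambiguity about which local time a custom start refers to.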
For more advanced configurations and features, please refer to the official Azure Stream Analytics documentation.