Monitoring and Alerting - Azure Stream Analytics

Effective monitoring and alerting are crucial for ensuring the health, performance, and reliability of your Azure Stream Analytics (ASA) jobs. This section covers key metrics, common alert scenarios, and best practices for setting up comprehensive monitoring.

Key Metrics to Monitor

Azure Stream Analytics provides a rich set of metrics that offer insights into the operational status of your jobs. Here are some of the most important ones:

Successful Events Input (sample reading: 1.5M)
Late Input Events (sample reading: 150)
Errors Input
Successful Events Output (sample reading: 1.49M)
Degraded Input Events
Errors Output
CPU Percentage (sample reading: 75%)
Watermark Delay

Common Alerting Scenarios

Setting up alerts for specific conditions can help you proactively address issues before they impact your application. Here are some common scenarios:

Input Errors

Alert when the number of input errors exceeds a defined threshold. This could indicate issues with data sources or connection problems.

Trigger Condition: InputErrors > 0 for 5 minutes

Output Errors

Notify when errors occur during data output. This might point to problems with sinks or serialization issues.

Trigger Condition: OutputErrors > 0 for 5 minutes

Late Input Events

Get alerted if a significant number of events are arriving late. This can impact the accuracy of time-sensitive analysis.

Trigger Condition: LateInputEventsPercentage > 1% for 10 minutes
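
As a rough illustration of the percentage in this condition, the sketch below divides late events by total input events for the same window. The function name is hypothetical, and treating the percentage as late events over all input events is an assumption; the service may define it differently.

```python
def late_input_percentage(late_events: int, total_events: int) -> float:
    """Return late events as a percentage of all input events in a window.

    Assumption: the percentage is late / total * 100 over the same window.
    """
    if total_events <= 0:
        return 0.0  # no input in the window, so nothing can be late
    return 100.0 * late_events / total_events


# Sample readings from the Key Metrics section: 150 late events, 1.5M inputs.
pct = late_input_percentage(150, 1_500_000)
print(f"{pct:.2f}%")  # 0.01%
print(pct > 1.0)      # False
```

With those sample readings the job sits far below the 1% threshold, so this alert would stay quiet.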

Watermark Delay

Monitor the watermark delay to understand how far behind your job is in processing real-time data.

Trigger Condition: WatermarkDelay > 30 seconds for 5 minutes

Resource Utilization

Track CPU usage and memory to ensure your job is performing optimally and to identify potential bottlenecks.

Trigger Condition: CPUPercentage > 80% for 15 minutes
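
The trigger conditions above all share one shape: a metric stays over a threshold for a number of minutes. The following is a minimal sketch of that logic, assuming one sample per minute and a strict "every sample in the window exceeds the threshold" rule. Azure Monitor itself evaluates an aggregate (average, total, and so on) over the window, so this is a simplification for reasoning about thresholds, not a model of the service. The `TriggerCondition` class and `is_triggered` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TriggerCondition:
    metric: str          # label only; uses the names from this section
    threshold: float
    window_minutes: int  # how long the breach must be sustained


def is_triggered(cond: TriggerCondition, per_minute_values: Sequence[float]) -> bool:
    """True if the most recent `window_minutes` samples all exceed the threshold."""
    if cond.window_minutes <= 0 or len(per_minute_values) < cond.window_minutes:
        return False  # not enough history to cover the window
    recent = per_minute_values[-cond.window_minutes:]
    return all(value > cond.threshold for value in recent)


cpu_rule = TriggerCondition("CPUPercentage", threshold=80.0, window_minutes=15)
cpu_samples = [70.0] * 5 + [85.0] * 15

print(is_triggered(cpu_rule, cpu_samples))                 # True
print(is_triggered(cpu_rule, cpu_samples[:-1] + [79.0]))   # False
```

Requiring a sustained breach, rather than a single sample, is what keeps a short CPU spike from paging anyone.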

Configuring Alerts in Azure Monitor

Azure Monitor provides a centralized platform for setting up and managing alerts for your Stream Analytics jobs.

1. Navigate to your Azure Stream Analytics job in the Azure portal.
2. In the left-hand menu, select Metrics under the Monitoring section.
3. Click on New alert rule.
4. Scope: Ensure your Stream Analytics job is selected.
5. Condition:
   - Select the Signal name (e.g., Input Errors, Output Errors, Late Input Events Percentage, Watermark Delay, CPU Percentage).
   - Configure the Alert logic (e.g., Threshold, Operator, Aggregated value).
   - Set the Evaluation based on settings (e.g., Across time series, Per time series).
   - Specify the Period and Frequency of evaluation.
6. Actions:
   - Create or select an Action group. Action groups define what happens when an alert is triggered (e.g., send an email, SMS, trigger a webhook, run an Azure Function).
7. Details:
   - Provide a descriptive Alert rule name.
   - Select the Severity of the alert.
   - Add an optional Description.
8. Review and create the alert rule.
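
The same scope, condition, actions, and details can also be captured as data, which is handy for keeping alert rules in source control. The sketch below builds a dictionary shaped like an Azure Resource Manager `Microsoft.Insights/metricAlerts` resource. The field names reflect my understanding of that schema and should be checked against the current reference before use. The resource IDs are placeholders, `build_metric_alert` is a hypothetical helper, and `InputErrors` is this document's label for the signal, not necessarily the metric ID the platform expects.

```python
import json


def build_metric_alert(name, job_resource_id, action_group_id,
                       metric_name, threshold, window="PT5M",
                       frequency="PT1M", severity=2, description=""):
    """Assemble a metric alert definition as a plain dictionary (assumed schema)."""
    return {
        "type": "Microsoft.Insights/metricAlerts",
        "name": name,                              # Details: alert rule name
        "location": "global",
        "properties": {
            "description": description,            # Details: optional description
            "severity": severity,                  # Details: 0 (critical) to 4 (verbose)
            "enabled": True,
            "scopes": [job_resource_id],           # Scope: the Stream Analytics job
            "evaluationFrequency": frequency,      # Condition: how often to evaluate
            "windowSize": window,                  # Condition: period to aggregate over
            "criteria": {
                "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
                "allOf": [{
                    "name": "condition1",
                    "criterionType": "StaticThresholdCriterion",
                    "metricName": metric_name,     # Condition: signal name
                    "operator": "GreaterThan",
                    "threshold": threshold,
                    "timeAggregation": "Total",
                }],
            },
            "actions": [{"actionGroupId": action_group_id}],  # Actions
        },
    }


# Placeholder IDs; substitute the real resource IDs from your subscription.
rule = build_metric_alert(
    name="asa-input-errors",
    job_resource_id="/subscriptions/<subscription-id>/resourceGroups/<rg>"
                    "/providers/Microsoft.StreamAnalytics/streamingjobs/<job>",
    action_group_id="/subscriptions/<subscription-id>/resourceGroups/<rg>"
                    "/providers/Microsoft.Insights/actionGroups/<group>",
    metric_name="InputErrors",
    threshold=0,
    description="Input errors sustained over 5 minutes",
)
print(json.dumps(rule, indent=2))
```

A definition like this could be fed to a deployment template or compared against what the portal created, so that thresholds changed by hand do not drift unnoticed.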

Best Practices for Monitoring and Alerting

Start with Key Metrics: Focus on input/output errors, late events, and watermark delay first.
Tune Thresholds: Avoid overly sensitive alerts that create noise. Regularly review and adjust thresholds based on your job's baseline performance.
Use Action Groups Effectively: Integrate alerts with your existing IT operations workflows (e.g., ticketing systems, PagerDuty).
Monitor Resource Utilization: Keep an eye on CPU and memory to proactively scale your ASA job if needed.
Test Your Alerts: Periodically simulate conditions that should trigger alerts to ensure they are working as expected.
Leverage Log Analytics: For more in-depth debugging, consider sending ASA diagnostic logs to Azure Log Analytics for advanced querying and analysis.
Set Up Custom Metrics: If standard metrics aren't sufficient, explore creating custom metrics within your ASA query to track specific business logic KPIs.
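
For the "Tune Thresholds" practice above, one possible starting point is to derive a threshold from a period of normal operation instead of picking a number by feel. The sketch below uses the mean plus a multiple of the standard deviation. Both the approach and the multiplier `k=3.0` are illustrative assumptions, the sample values are invented, and `suggest_threshold` is a hypothetical helper.

```python
import statistics
from typing import Sequence


def suggest_threshold(baseline: Sequence[float], k: float = 3.0) -> float:
    """Suggest an alert threshold as mean + k * population standard deviation."""
    if not baseline:
        raise ValueError("baseline must contain at least one sample")
    mean = statistics.fmean(baseline)
    spread = statistics.pstdev(baseline)
    return mean + k * spread


# Invented watermark delay readings (seconds) from a quiet period.
watermark_delay_baseline = [4, 5, 6, 5, 4, 6, 5, 5]
print(round(suggest_threshold(watermark_delay_baseline), 2))  # 7.12
```

A larger `k` trades sensitivity for fewer false alarms. Recomputing the suggestion after load or query changes keeps the threshold tied to the job's current baseline.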