Introduction
Effective monitoring is crucial for maintaining the performance, availability, and cost-efficiency of your Azure Analysis Services (AAS) instances. Azure provides robust tools and services to help you gain deep insights into your Analysis Services workloads.
This document covers how to leverage Azure Monitor, Azure Resource Health, and other services to monitor your AAS resources.
Azure Monitor Integration
Azure Monitor is the foundational service for collecting, analyzing, and acting on telemetry from your Azure and on-premises environments. Azure Analysis Services integrates seamlessly with Azure Monitor.
Metrics
Metrics are numerical values that describe some aspect of a system at a particular point in time. Azure Analysis Services exposes several key metrics that can help you understand your service's performance and usage.
Commonly Used Metrics:
- CPU Usage: Percentage of CPU utilized by the Analysis Services engine.
- Memory Usage: Amount of memory consumed by the Analysis Services engine.
- Queries Per Second (QPS): The rate at which queries are being processed.
- Cache Hit Ratio: Percentage of queries served from the cache.
- Active Connections: Number of active client connections to the server.
- Query Duration: Average or maximum time taken to execute queries.
- Data Size: Total size of the data stored in the Analysis Services model.
You can view these metrics in the Azure portal by navigating to your Analysis Services resource and selecting "Metrics" from the left-hand menu. You can also use the Metrics API to retrieve and analyze this data programmatically.
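As a rough illustration of programmatic retrieval, the sketch below only builds the request URL for the Azure Monitor metrics REST endpoint; it does not send the request. The subscription, resource group, and server names are placeholders, and the metric IDs `qpu_metric` and `memory_metric` are assumptions that should be checked against the metric list shown for your server in the portal.

```python
from urllib.parse import urlencode

ARM_ENDPOINT = "https://management.azure.com"


def build_metrics_url(subscription_id, resource_group, server_name,
                      metric_names, timespan, interval="PT5M",
                      aggregation="Average", api_version="2018-01-01"):
    """Return the Azure Monitor metrics REST URL for one Analysis Services server."""
    resource_id = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.AnalysisServices/servers/{server_name}"
    )
    query = urlencode({
        "api-version": api_version,
        "metricnames": ",".join(metric_names),  # comma-separated metric IDs
        "timespan": timespan,                   # ISO 8601 "start/end"
        "interval": interval,                   # ISO 8601 duration per data point
        "aggregation": aggregation,
    })
    return f"{ARM_ENDPOINT}{resource_id}/providers/Microsoft.Insights/metrics?{query}"


url = build_metrics_url(
    subscription_id="00000000-0000-0000-0000-000000000000",  # placeholder
    resource_group="my-rg",                                   # placeholder
    server_name="myaasserver",                                # placeholder
    metric_names=["qpu_metric", "memory_metric"],             # assumed metric IDs
    timespan="2024-01-01T00:00:00Z/2024-01-01T01:00:00Z",
)
print(url)
```

A real call would send a GET request to this URL with an Azure AD bearer token in the Authorization header.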
Logs
Azure Analysis Services can send diagnostic logs to Azure Log Analytics, Azure Storage, or Azure Event Hubs. These logs provide detailed information about operations, errors, and performance events.
Log Categories:
- Engine: Detailed information about Analysis Services engine operations.
- Service: Information about the Azure Analysis Services service itself, including control plane operations.
- Query: Logs related to query execution, including duration, user, and query text (if enabled).
- Command: Logs for administrative commands (e.g., refresh, create, delete).
To configure diagnostic settings, navigate to your Analysis Services resource, select "Diagnostic settings," and choose the destinations for your logs.
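The same configuration can also be expressed as a diagnostic settings resource body. The following is a minimal sketch of such a payload, assuming a Log Analytics destination; the workspace resource ID is a placeholder, and the category names should be confirmed against the options your server's "Diagnostic settings" blade actually offers.

```python
import json


def build_diagnostic_setting(workspace_resource_id, log_categories,
                             include_metrics=True):
    """Build a diagnostic settings request body that targets Log Analytics."""
    properties = {
        "workspaceId": workspace_resource_id,
        # One entry per log category to enable.
        "logs": [{"category": name, "enabled": True} for name in log_categories],
    }
    if include_metrics:
        # "AllMetrics" routes platform metrics alongside the logs.
        properties["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return {"properties": properties}


payload = build_diagnostic_setting(
    # Placeholder workspace ID, for illustration only.
    workspace_resource_id=(
        "/subscriptions/00000000-0000-0000-0000-000000000000"
        "/resourceGroups/my-rg/providers/Microsoft.OperationalInsights"
        "/workspaces/my-workspace"
    ),
    log_categories=["Engine", "Service"],
)
print(json.dumps(payload, indent=2))
```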
Querying Logs with Log Analytics:
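Once logs are sent to Log Analytics, you can use Kusto Query Language (KQL) to analyze them. Table and column names depend on how your workspace ingests the logs (AAS diagnostics commonly land in the shared AzureDiagnostics table, with ResourceProvider set to "MICROSOFT.ANALYSISSERVICES"), so adjust the names below to match your schema. For example, to find slow queries: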
AzureAnalysisServicesLogs
| where Category == "Query"
| where DurationMs > 5000 // Filter queries longer than 5 seconds
| project TimeGenerated, OperationName, DurationMs, CallerName, DatabaseName, QueryText
| order by DurationMs desc
You can also use logs to track user activity, identify failed operations, and troubleshoot performance bottlenecks.
Azure Resource Health
Azure Resource Health provides personalized guidance and support to help you troubleshoot and resolve issues with your Azure resources. It reports on current and past service issues that may have affected your resources.
To access Resource Health for your Azure Analysis Services instance:
- Navigate to your Azure Analysis Services resource in the Azure portal.
- Under the "Support + troubleshooting" section, select "Resource health."
- Review the status of your resource and any active advisories or incidents.
Resource Health can inform you about planned maintenance, unplanned service degradations, or other platform-level issues that might impact your AAS performance or availability.
Query Performance Monitoring
Monitoring query performance is essential for ensuring a responsive user experience. Several aspects contribute to query performance:
- Query Optimization: Efficient DAX or MDX queries are fundamental.
- Model Design: Proper partitioning, data types, and relationships in your tabular model.
- Server Resources: Sufficient CPU, memory, and network bandwidth.
- Caching: Effective use of the VertiPaq engine's caching mechanisms.
You can monitor query performance using:
- Azure Monitor Metrics: Query Duration, Queries Per Second.
- Azure Monitor Logs: Query logs for detailed analysis of individual query performance.
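- SQL Server Management Studio (SSMS): Connect to your AAS instance and query dynamic management views (DMVs), such as $SYSTEM.DISCOVER_SESSIONS and $SYSTEM.DISCOVER_COMMANDS, to inspect running queries and their performance characteristics. SQL Server Profiler can also trace query events against the server.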
- Performance Analyzer in Power BI: While client-side, it helps identify slow queries generated by Power BI reports.
Key Performance Indicators (KPIs):
- Average query duration should stay within an agreed limit (e.g., 2 to 5 seconds for interactive queries).
- Cache hit ratio should stay high, indicating that most queries are served from the cache rather than recomputed.
- CPU and memory usage should not be consistently at maximum capacity.
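The KPI checks above can be sketched as a small function over metric samples that you have already collected. The default limits here are illustrative assumptions (a 5-second duration limit, a 60% cache hit floor, and a 95% CPU ceiling), not recommended values, and the function name is hypothetical.

```python
from statistics import mean


def evaluate_kpis(query_durations_ms, cache_hit_ratios, cpu_percentages,
                  max_avg_duration_ms=5000, min_cache_hit_ratio=60.0,
                  cpu_ceiling=95.0):
    """Return a list of KPI findings; an empty list means all checks passed."""
    findings = []
    if query_durations_ms and mean(query_durations_ms) > max_avg_duration_ms:
        findings.append("average query duration above limit")
    if cache_hit_ratios and mean(cache_hit_ratios) < min_cache_hit_ratio:
        findings.append("cache hit ratio below target")
    # "Consistently at maximum": every sample sits at or above the ceiling.
    if cpu_percentages and min(cpu_percentages) >= cpu_ceiling:
        findings.append("CPU consistently near maximum")
    return findings


slow = evaluate_kpis([1000, 9000, 8000], [90, 85], [50, 60])
hot = evaluate_kpis([100], [40], [99, 97])
print(slow)
print(hot)
```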
Management Operations
Monitoring management operations ensures that administrative tasks, such as data refreshes, scaling, and configuration changes, are executed successfully.
- Activity Log: The Azure Activity Log records subscription-level (control plane) events on your Azure resources. For Analysis Services, this includes operations such as starting, stopping, or deleting servers, and modifying their configuration.
- Diagnostic Logs (Service and Command categories): These logs provide detailed insights into the execution of management commands and service-level events.
Regularly review the Activity Log and relevant diagnostic logs to confirm the success of planned administrative actions and to detect any unauthorized or erroneous operations.
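As one way to automate that review, the sketch below filters exported Activity Log events for failed Analysis Services operations. The sample events are fabricated for illustration, and the field layout (`operationName.value`, `status.value`, `caller`, `eventTimestamp`) is assumed to match your export format.

```python
def failed_aas_operations(events):
    """Return (timestamp, operation, caller) for failed Analysis Services operations."""
    prefix = "microsoft.analysisservices/servers/"
    failed = []
    for event in events:
        operation = event.get("operationName", {}).get("value", "")
        status = event.get("status", {}).get("value", "")
        if operation.lower().startswith(prefix) and status == "Failed":
            failed.append((event.get("eventTimestamp"), operation, event.get("caller")))
    return failed


# Illustrative sample data, not real log output.
sample_events = [
    {"eventTimestamp": "2024-01-01T10:00:00Z",
     "operationName": {"value": "Microsoft.AnalysisServices/servers/suspend/action"},
     "status": {"value": "Failed"},
     "caller": "admin@contoso.com"},
    {"eventTimestamp": "2024-01-01T11:00:00Z",
     "operationName": {"value": "Microsoft.AnalysisServices/servers/write"},
     "status": {"value": "Succeeded"},
     "caller": "admin@contoso.com"},
    {"eventTimestamp": "2024-01-01T12:00:00Z",
     "operationName": {"value": "Microsoft.Storage/storageAccounts/write"},
     "status": {"value": "Failed"},
     "caller": "other@contoso.com"},
]

failed = failed_aas_operations(sample_events)
print(failed)
```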
Setting Up Alerts
Proactive alerting is key to quickly addressing potential issues before they impact users. Azure Monitor allows you to create alert rules based on metrics or log queries.
Example Alert Rules:
- High CPU Usage: Alert when CPU usage exceeds 80% for 15 minutes.
- Low Cache Hit Ratio: Alert when cache hit ratio drops below 60%.
- High Query Latency: Alert when average query duration exceeds 5 seconds.
- Failed Data Refresh: Alert on specific error patterns in command logs.
- Resource Health Events: Configure alerts for critical service health events.
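To make the first rule concrete, here is a simplified sketch of a static-threshold check: it averages the most recent samples over a window and compares the result with the threshold. It assumes one sample per minute and only approximates how a metric alert with Average aggregation behaves; it is not Azure Monitor's actual evaluation logic.

```python
def window_average_exceeds(values, threshold, window):
    """True if the average of the last `window` samples is above `threshold`."""
    if len(values) < window:
        return False  # not enough data to evaluate a full window
    recent = values[-window:]
    return sum(recent) / window > threshold


# One CPU sample per minute (assumed granularity).
cpu_rising = [50] * 10 + [85] * 15     # last 15 minutes average 85%
cpu_recovered = [85] * 5 + [50] * 10   # last 15 minutes average about 62%

print(window_average_exceeds(cpu_rising, threshold=80, window=15))
print(window_average_exceeds(cpu_recovered, threshold=80, window=15))
```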
To set up alerts:
- Navigate to your Analysis Services resource.
- Under "Monitoring," select "Alerts."
- Click "Create" -> "Alert rule."
- Configure the Signal logic (metric or log query), conditions, actions (e.g., send an email, trigger a webhook), and details.
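For teams that deploy alerts from templates rather than the portal, the steps above map roughly onto a metric alert resource body. The sketch below builds such a body as a Python dictionary. The resource IDs are placeholders, the metric ID is an assumption to verify for your server, and the property names should be checked against the current metric alert schema before use.

```python
def build_metric_alert(server_resource_id, action_group_id, metric_name,
                       threshold, window_size="PT15M",
                       evaluation_frequency="PT5M"):
    """Build a static-threshold metric alert body for a single resource."""
    return {
        "location": "global",
        "properties": {
            "description": f"{metric_name} above {threshold}",
            "severity": 2,
            "enabled": True,
            "scopes": [server_resource_id],
            "evaluationFrequency": evaluation_frequency,  # how often to check
            "windowSize": window_size,                    # period being aggregated
            "criteria": {
                "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
                "allOf": [{
                    "name": "threshold-criterion",
                    "metricName": metric_name,
                    "operator": "GreaterThan",
                    "threshold": threshold,
                    "timeAggregation": "Average",
                    "criterionType": "StaticThresholdCriterion",
                }],
            },
            "actions": [{"actionGroupId": action_group_id}],
        },
    }


rule = build_metric_alert(
    server_resource_id="/subscriptions/.../servers/myaasserver",   # placeholder
    action_group_id="/subscriptions/.../actionGroups/ops-email",   # placeholder
    metric_name="qpu_metric",  # assumed metric ID; confirm in the portal
    threshold=80,
)
print(rule["properties"]["criteria"]["allOf"][0])
```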
Best Practices for Monitoring Azure Analysis Services
- Establish Baselines: Understand your typical performance metrics during normal operations to easily identify deviations.
- Monitor Key Performance Indicators (KPIs): Focus on metrics that directly impact user experience and service health (CPU, Memory, Query Duration, Cache Hit Ratio).
- Implement Proactive Alerting: Set up alerts for critical thresholds and errors.
- Regularly Review Logs: Don't just rely on alerts; periodically review diagnostic logs, especially query and command logs, for deeper insights and potential issues.
- Utilize Azure Resource Health: Stay informed about platform-level issues that might affect your service.
- Monitor Data Refresh Operations: Ensure that data refresh jobs are completing successfully and within expected timeframes.
- Correlate Metrics and Logs: When investigating an issue, use both metrics (for trends) and logs (for specifics) to get a complete picture.
- Secure Your Monitoring Data: If sending logs to Azure Storage or Event Hubs, ensure appropriate access controls are in place.
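The "Establish Baselines" practice can be illustrated with a simple statistical check: compute the mean and standard deviation of a metric during normal operation, then flag new samples that fall far outside that range. This is a minimal sketch with made-up numbers; the limit of 3 standard deviations is an assumption, and real workloads with daily or weekly cycles usually need a baseline per time window.

```python
from statistics import mean, stdev


def deviations_from_baseline(baseline, current, z_limit=3.0):
    """Return samples in `current` more than `z_limit` std devs from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs >= 2 points
    if sigma == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return [value for value in current if value != mu]
    return [value for value in current if abs(value - mu) / sigma > z_limit]


# Illustrative CPU percentages, not real measurements.
normal_cpu = [40, 42, 38, 41, 39, 40]
todays_cpu = [41, 43, 70]

outliers = deviations_from_baseline(normal_cpu, todays_cpu)
print(outliers)
```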