Understanding Performance Tiers
Azure Cosmos DB offers various performance levels to suit different application requirements and budgets. Choosing the correct performance tier is crucial for optimizing your database's throughput, latency, and cost-effectiveness.
Two quantities define performance and cost in Cosmos DB: provisioned throughput, measured in Request Units per second (RU/s), and consumed storage. You can provision throughput either manually or with autoscale.
Manual Throughput Provisioning
With manual throughput, you specify a fixed number of RU/s that is reserved for your container or database. You are billed for the provisioned amount whether or not the workload uses it, so this mode is best suited to predictable workloads.
Autoscale Throughput Provisioning
Autoscale enables your database or container to scale its throughput automatically with demand, between 10% of a maximum you specify and that maximum. Each hour is billed at the highest RU/s the system scaled to during that hour, which makes autoscale cost-effective for variable workloads.
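To make the billing difference concrete, here is a minimal Python sketch that compares billed RU-hours under manual and autoscale provisioning. It assumes the autoscale behavior described above (scaling between 10% of the configured maximum and the maximum, billed hourly at the highest RU/s reached). The 1.5x autoscale price multiplier, the function names, and the sample traffic are assumptions for illustration; check current pricing before relying on the ratio.

```python
def autoscale_billed_rus(peak_rus_per_hour, max_rus):
    """Return the RU/s billed for each hour under autoscale.

    Each hour is billed at the highest RU/s reached in that hour,
    never below 10% of the configured maximum and never above it.
    """
    floor = max_rus / 10
    return [min(max(peak, floor), max_rus) for peak in peak_rus_per_hour]


def compare_ru_hours(peak_rus_per_hour, manual_rus, autoscale_max_rus,
                     autoscale_multiplier=1.5):
    """Return (manual, autoscale) cost in manual-equivalent RU-hours.

    autoscale_multiplier is an assumed per-RU/s price ratio
    (autoscale vs. manual), not a value taken from this document.
    """
    manual = manual_rus * len(peak_rus_per_hour)
    autoscale = sum(autoscale_billed_rus(peak_rus_per_hour, autoscale_max_rus))
    return manual, autoscale * autoscale_multiplier


# Hypothetical day: quiet for 20 hours, busy for 4.
peaks = [300] * 20 + [4000] * 4
manual_units, autoscale_units = compare_ru_hours(peaks, 4000, 4000)
print(manual_units, autoscale_units)
```

With this spiky profile autoscale comes out well ahead; with a flat profile near the maximum, the multiplier would make manual provisioning the cheaper choice.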
Performance Tiers Explained
Cosmos DB historically used predefined performance levels labeled "S1", "S2", "S3" (Standard tier) and "P1", "P2", "P3" (Premium tier). These labels still appear in older material, but the modern approach is to provision RU/s directly. The remaining distinction comes down to the guaranteed performance characteristics and SLAs.
Key Performance Tiers and Characteristics:
| Tier Name (Historical/Conceptual) | RU/s Range (Example) | Typical Use Case | Latency | SLA |
|---|---|---|---|---|
| Free Tier | First 1,000 RU/s and 25 GB of storage free | Development, testing, learning | Same as the underlying account | Same as the underlying account |
| Standard (Manual/Autoscale) | 100 - 10,000+ RU/s | Web applications, mobile apps, gaming, IoT | Low (typically <10ms for reads, <20ms for writes at p99) | 99.99% availability |
| Premium (Manual/Autoscale) | 10,000 - 1,000,000+ RU/s | Enterprise applications, high-traffic e-commerce, mission-critical systems | Consistently lower (typically <5ms for reads, <10ms for writes at p99) | 99.999% availability |
Note: The RU/s ranges are illustrative. Actual provisionable limits can be much higher and depend on your subscription and region.
Request Units (RU) - The Currency of Performance
A Request Unit (RU) is a normalized measure of throughput provided by Cosmos DB. Different database operations (reads, writes, queries, updates) consume a certain number of RUs based on their complexity and resource usage.
For example, a point read of a 1 KB item by its ID and partition key consumes 1 RU, while a complex query can consume significantly more.
Example RU Consumption:
// Example: point read of an item with a 1 KB payload
GET /dbs/mydatabase/colls/mycollection/docs/itemid HTTP/1.1
x-ms-documentdb-partitionkey: ["mypartitionkey"]
Content-Length: 0

HTTP/1.1 200 OK
x-ms-request-charge: 1.00
...
In the example above, the x-ms-request-charge header indicates that this read operation consumed 1.00 RU.
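Because every response reports its own charge, a client can total the cost of a batch of operations by summing this header. The sketch below uses plain dictionaries as stand-ins for response headers, so it makes no assumption about any particular SDK or HTTP library; the function names and sample values are hypothetical.

```python
def request_charge(headers):
    """Return the RU charge reported in a response's headers, or 0.0 if absent."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return float(normalized.get("x-ms-request-charge", 0.0))


def total_charge(responses):
    """Sum the RU charge across a sequence of response-header mappings."""
    return sum(request_charge(headers) for headers in responses)


# Hypothetical header sets: a point read and a more expensive query.
sample = [
    {"x-ms-request-charge": "1.00"},
    {"X-Ms-Request-Charge": "12.5"},
]
print(total_charge(sample))  # 13.5
```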
Choosing the Right Performance Level
Consider the following factors:
- Workload Predictability: Use manual throughput for stable workloads and autoscale for unpredictable ones.
- Throughput Requirements: Estimate your peak RU/s needs. Tools like the Cosmos DB Capacity Calculator can help.
- Latency Requirements: Mission-critical applications with stringent latency needs may benefit from the Premium tier.
- Budget: Free tier is excellent for initial development. Compare costs between manual and autoscale provisioning for your specific usage patterns.
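- Availability: If your application demands the highest level of uptime, target the 99.999% availability SLA listed for the Premium tier above; in practice, Cosmos DB ties that figure to accounts replicated across multiple regions.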
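The throughput estimate mentioned in the list above can be roughed out by hand before turning to a calculator. The following Python sketch multiplies peak operation rates by a per-operation RU cost, adds headroom, and rounds up to a multiple of 100 RU/s. The default costs (1 RU per 1 KB point read, roughly 5 RU per 1 KB write), the 20% headroom, and the function name are assumptions; measure real charges from the `x-ms-request-charge` header for your own items and queries.

```python
import math


def estimate_peak_rus(reads_per_sec, writes_per_sec,
                      ru_per_read=1.0, ru_per_write=5.0, headroom=0.2):
    """Estimate the RU/s to provision for a peak operation mix.

    ru_per_read and ru_per_write are assumed averages for small items;
    headroom is extra capacity expressed as a fraction of the raw estimate.
    """
    raw = reads_per_sec * ru_per_read + writes_per_sec * ru_per_write
    padded = raw * (1 + headroom)
    # Round up to the next multiple of 100 RU/s.
    return math.ceil(padded / 100) * 100


# Hypothetical peak: 500 point reads/s and 120 writes/s.
print(estimate_peak_rus(500, 120))  # 1400
```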
Monitoring Performance
Regularly monitor your Cosmos DB performance using Azure Monitor. Pay attention to:
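- Consumed RU/s: Compare consumption against your provisioned throughput; sustained usage near the limit leaves little headroom for spikes.
- Throttled Requests: Requests rejected with HTTP status 429 indicate that your provisioned RU/s are insufficient for the workload.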
- Latency: Monitor read and write latencies to ensure they meet your application's needs.
- Storage Usage: Keep track of your data storage to plan for capacity.
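As a rough illustration of how the first two signals might be combined into an alert, the Python sketch below computes throughput utilization and the share of throttled (HTTP 429) requests from sampled values. It does not call Azure Monitor; the inputs, function names, and thresholds are hypothetical and would come from whatever metrics export you use.

```python
def utilization_percent(consumed_rus, provisioned_rus):
    """Return consumed throughput as a percentage of provisioned throughput."""
    return 100.0 * consumed_rus / provisioned_rus


def throttled_fraction(status_codes):
    """Return the fraction of requests that were throttled (HTTP 429)."""
    if not status_codes:
        return 0.0
    throttled = sum(1 for code in status_codes if code == 429)
    return throttled / len(status_codes)


def needs_attention(consumed_rus, provisioned_rus, status_codes,
                    max_utilization=90.0, max_throttled=0.01):
    """Flag a container whose utilization or throttling exceeds the thresholds."""
    return (utilization_percent(consumed_rus, provisioned_rus) > max_utilization
            or throttled_fraction(status_codes) > max_throttled)


# Hypothetical sample: 380 of 400 RU/s consumed, 2 of 100 requests throttled.
codes = [200] * 98 + [429] * 2
print(needs_attention(380, 400, codes))  # True
```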