Azure Cosmos DB Performance Levels

Understand and select the right performance tier for your needs.

Understanding Performance Tiers

Azure Cosmos DB offers various performance levels to suit different application requirements and budgets. Choosing the correct performance tier is crucial for optimizing your database's throughput, latency, and cost-effectiveness.

Performance in Cosmos DB is primarily defined by two metrics: throughput, measured in Request Units per second (RU/s), and storage. You can provision throughput either manually or with autoscale.

Manual Throughput Provisioning

With manual throughput, you provision a fixed number of RU/s for your container or database, and you are billed for that amount whether or not you use it. This is ideal for predictable workloads.

Autoscale Throughput Provisioning

Autoscale enables your database or container to scale its throughput automatically based on demand, between 10% of a maximum you specify and that maximum. This is cost-effective for variable workloads because each hour is billed at the highest throughput the system scaled to during that hour, rather than at a fixed peak.
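To make the autoscale billing model concrete, here is a minimal Python sketch. It assumes that each hour is billed at the highest RU/s reached in that hour, never below 10% of the configured maximum, and that autoscale RU/s carry a 1.5x rate relative to manual RU/s (an assumption based on single-write-region list pricing; verify against current pricing). The function names are hypothetical, and cost is expressed in manual-priced RU/s-hours rather than currency.

```python
from typing import List, Sequence, Tuple

# Assumption: autoscale RU/s cost 1.5x the manual rate (single-write-region accounts).
AUTOSCALE_RATE_MULTIPLIER = 1.5


def autoscale_billed_rus(hourly_peak_rus: Sequence[float], max_rus: int) -> List[float]:
    """Return the RU/s billed for each hour under autoscale.

    Each hour is billed at its peak, clamped between 10% of max_rus and max_rus.
    """
    floor = max_rus / 10
    return [min(max(peak, floor), max_rus) for peak in hourly_peak_rus]


def relative_cost(hourly_peak_rus: Sequence[float], max_rus: int) -> Tuple[float, float]:
    """Return (manual, autoscale) cost in manual-priced RU/s-hours."""
    # Manual provisioning pays for the full amount in every hour.
    manual = float(max_rus * len(hourly_peak_rus))
    autoscale = sum(autoscale_billed_rus(hourly_peak_rus, max_rus)) * AUTOSCALE_RATE_MULTIPLIER
    return manual, autoscale


if __name__ == "__main__":
    peaks = [200, 300, 4000, 150]  # hypothetical hourly peaks in RU/s
    print(relative_cost(peaks, max_rus=4000))
```

Under these assumptions, a workload that peaks only briefly comes out cheaper with autoscale, while a workload that stays near its maximum all day would pay the 1.5x premium for nothing.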

Performance Tiers Explained

Cosmos DB historically used predefined performance levels such as "S1", "S2", and "S3" (Standard tier), and some material also refers to "P1", "P2", and "P3" (Premium tier). While these terms are still sometimes referenced, the modern approach is to provision RU/s directly. In practice, the distinction comes down to the guaranteed performance characteristics and SLAs.

Key Performance Tiers and Characteristics:

| Tier Name (Historical/Conceptual) | RU/s Range (Example) | Typical Use Case | Latency | SLA |
|---|---|---|---|---|
| Free Tier | First 1,000 RU/s and 25 GB of storage free | Development, testing, learning | Same as a standard account | Same as a standard account |
| Standard (Manual/Autoscale) | 100 - 10,000+ RU/s | Web applications, mobile apps, gaming, IoT | Low (typically <10 ms for reads, <20 ms for writes at p99) | 99.99% availability |
| Premium (Manual/Autoscale) | 10,000 - 1,000,000+ RU/s | Enterprise applications, high-traffic e-commerce, mission-critical systems | Consistently lower (typically <5 ms for reads, <10 ms for writes at p99) | 99.999% availability |

Note: The RU/s ranges are illustrative. Actual provisionable limits can be much higher and depend on your subscription and region.

Request Units (RU) - The Currency of Performance

A Request Unit (RU) is a normalized measure of the resources (CPU, memory, and IOPS) that Cosmos DB spends to serve an operation. Every database operation (read, write, query, update) consumes a number of RUs that depends on its complexity and resource usage.

For example, a point read of a 1 KB item by its ID and partition key consumes 1 RU, while a complex query could consume significantly more.

Example RU Consumption:

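    // Example: reading an item with a 1 KB payload.
    // The partition key is sent as a header, not as part of the path.
    GET /dbs/mydatabase/colls/mycollection/docs/itemid HTTP/1.1
    x-ms-documentdb-partitionkey: ["mypartitionkey"]
    Content-Length: 0

    HTTP/1.1 200 OK
    x-ms-request-charge: 1.00
    ...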

In the example above, the x-ms-request-charge header indicates that this read operation consumed 1.00 RU.
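The same header can be read programmatically to track what each operation costs. The following Python sketch assumes the response headers are available as a plain mapping of strings; `request_charge` and `total_charge` are hypothetical helper names, not part of any Azure SDK.

```python
from typing import Iterable, Mapping


def request_charge(headers: Mapping[str, str]) -> float:
    """Return the RU charge reported in one response's headers."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    raw = normalized.get("x-ms-request-charge")
    if raw is None:
        raise KeyError("response has no x-ms-request-charge header")
    return float(raw)


def total_charge(responses: Iterable[Mapping[str, str]]) -> float:
    """Sum the RU charges over several responses, e.g. the pages of one query."""
    return sum(request_charge(headers) for headers in responses)


if __name__ == "__main__":
    print(request_charge({"x-ms-request-charge": "1.00"}))
```

Logging these charges for representative operations gives measured per-operation costs, which are a better basis for sizing than rules of thumb.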

Choosing the Right Performance Level

Consider the following factors:

  • Workload Predictability: Use manual throughput for stable workloads and autoscale for unpredictable ones.
  • Throughput Requirements: Estimate your peak RU/s needs. Tools like the Cosmos DB Capacity Calculator can help.
  • Latency Requirements: Mission-critical applications with stringent latency needs may benefit from the Premium tier.
  • Budget: Free tier is excellent for initial development. Compare costs between manual and autoscale provisioning for your specific usage patterns.
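  • Availability: If your application demands the highest level of uptime, aim for the 99.999% availability SLA. Cosmos DB offers that SLA for accounts that span multiple regions; single-region accounts are covered at 99.99%.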

Important: The concept of "tiers" is evolving. Focus on provisioning the required RU/s and storage, and leverage autoscale where appropriate. Always refer to the latest Azure documentation for the most up-to-date details on pricing and capabilities.
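As a rough illustration of the throughput estimate mentioned above, the Python sketch below multiplies each operation's peak rate by its RU cost, adds headroom, and rounds up to a value that can be provisioned. The 400 RU/s minimum and 100 RU/s step are assumptions about manual throughput limits, and the per-operation RU costs in the usage example are placeholders; measure real costs from the x-ms-request-charge header of your own operations.

```python
import math
from typing import Mapping

MIN_MANUAL_RUS = 400  # assumption: minimum manual throughput for a container
RUS_INCREMENT = 100   # assumption: throughput is set in steps of 100 RU/s


def estimate_peak_rus(ops_per_second: Mapping[str, float],
                      ru_per_op: Mapping[str, float],
                      headroom: float = 0.2) -> int:
    """Estimate the RU/s to provision for a peak-load operation mix."""
    # Raw demand: sum over operation types of (rate x RU cost per operation).
    raw = sum(rate * ru_per_op[op] for op, rate in ops_per_second.items())
    # Pad for spikes, then round up to the next provisionable step.
    padded = raw * (1 + headroom)
    rounded = math.ceil(padded / RUS_INCREMENT) * RUS_INCREMENT
    return max(MIN_MANUAL_RUS, rounded)


if __name__ == "__main__":
    peak_rates = {"point_read": 500, "write": 100, "query": 20}  # hypothetical ops/s
    ru_costs = {"point_read": 1.0, "write": 5.5, "query": 12.0}  # placeholder RU costs
    print(estimate_peak_rus(peak_rates, ru_costs))
```

With these placeholder numbers the raw demand is 1,290 RU/s, which becomes 1,600 RU/s after 20% headroom and rounding. The Cosmos DB Capacity Calculator remains the better tool for a real estimate; this only shows the arithmetic behind it.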

Monitoring Performance

Regularly monitor your Cosmos DB performance using Azure Monitor. Pay attention to:

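  • Consumed RU/s: Compare consumption with your provisioned throughput; sustained use near the limit leaves no headroom for spikes.
  • Throttled Requests: Requests rejected with HTTP 429 (rate limited) indicate that your provisioned RU/s are insufficient for the load.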
  • Latency: Monitor read and write latencies to ensure they meet your application's needs.
  • Storage Usage: Keep track of your data storage to plan for capacity.
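
The checks above can be expressed as a small review routine. This Python sketch works on metric samples you have already exported; the `Sample` shape, the `review` function, and the 80% utilization and 1% throttling thresholds are all illustrative assumptions rather than Azure Monitor types or recommended values.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sample:
    """One monitoring interval (hypothetical shape, not an Azure Monitor type)."""
    consumed_rus: float      # average RU/s consumed during the interval
    total_requests: int      # all requests in the interval
    throttled_requests: int  # requests answered with HTTP 429


def review(samples: Iterable[Sample], provisioned_rus: float,
           utilization_limit: float = 0.8, throttle_limit: float = 0.01) -> List[str]:
    """Return warnings for high RU utilization or excessive throttling."""
    samples = list(samples)
    warnings = []

    # Flag consumption that gets close to the provisioned throughput.
    peak = max((s.consumed_rus for s in samples), default=0.0)
    if peak > provisioned_rus * utilization_limit:
        warnings.append(f"peak consumption {peak:.0f} RU/s is near the "
                        f"provisioned {provisioned_rus:.0f} RU/s")

    # Flag a throttled share of requests above the chosen limit.
    total = sum(s.total_requests for s in samples)
    throttled = sum(s.throttled_requests for s in samples)
    if total and throttled / total > throttle_limit:
        warnings.append(f"{throttled} of {total} requests were throttled (HTTP 429)")

    return warnings


if __name__ == "__main__":
    history = [Sample(300, 1000, 0), Sample(390, 1000, 50)]  # made-up data
    for line in review(history, provisioned_rus=400):
        print(line)
```

A warning from either check is a prompt to raise provisioned RU/s, switch to autoscale, or look for expensive queries, depending on what the latency and storage figures show.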