Scaling Azure Cosmos DB
Azure Cosmos DB is a globally distributed, multi-model database service built for elastic scalability and low latency. Scaling is a core feature: you adjust provisioned throughput as load changes, and storage grows with your data, so your application can handle varying loads.
Understanding Throughput in Cosmos DB
Throughput in Azure Cosmos DB is measured in Request Units per second (RU/s). A Request Unit (RU) is a normalized measure of the CPU, memory, and IOPS required to perform a database operation. You can provision throughput at the container (or collection) level or at the database level. There are two primary ways to manage throughput:
- Manual Throughput: You explicitly set the number of RU/s for your containers or databases.
- Autoscale Throughput: Cosmos DB automatically scales your throughput up and down based on your application's workload, up to a maximum RU/s that you define. This is often the more cost-effective choice for unpredictable or spiky workloads.
Provisioning Throughput
Throughput can be provisioned in two scopes:
- Container Level: The most common approach. Throughput is dedicated to a specific container.
- Database Level: Shared throughput for all containers within a database. This is useful when you have many containers with infrequent requests.
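The two scopes can be sketched with the Azure CLI as follows. This is a minimal illustration, not a recommended layout: the helper function names and the `/tenantId` partition key path are hypothetical, and the RU/s values are passed in by the caller.

```shell
# Sketch: a database with shared throughput, plus a container with dedicated throughput.
# Function names and the /tenantId partition key path are placeholders for illustration.

create_shared_database() {
  # $1 = resource group, $2 = account, $3 = database, $4 = shared RU/s
  az cosmosdb sql database create \
    --resource-group "$1" \
    --account-name "$2" \
    --name "$3" \
    --throughput "$4"
}

create_dedicated_container() {
  # $1 = resource group, $2 = account, $3 = database, $4 = container, $5 = dedicated RU/s
  az cosmosdb sql container create \
    --resource-group "$1" \
    --account-name "$2" \
    --database-name "$3" \
    --name "$4" \
    --partition-key-path "/tenantId" \
    --throughput "$5"
}
```

For example, `create_shared_database MyResourceGroup MyCosmosDBAccount MyDatabase 400` followed by `create_dedicated_container MyResourceGroup MyCosmosDBAccount MyDatabase MyContainer 1000` would give one container its own 1000 RU/s while other containers created without `--throughput` share the database's RU/s.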
Elastic Scale: Throughput and Storage
Cosmos DB offers independent and elastic scaling for both throughput and storage. As your data volume grows, your storage scales automatically. As your request load increases, you can scale your throughput.
Storage Scaling
Storage scales automatically to accommodate your data. There are limits on the maximum storage per partition, which is why effective partitioning is crucial for large datasets.
Throughput Scaling Options
You can scale throughput at any time through the Azure portal, Azure CLI, PowerShell, or SDKs.
Manual Throughput:
# Example: Scaling a container to 1000 RU/s using Azure CLI
# Throughput is changed with the "throughput update" subcommand
az cosmosdb sql container throughput update \
--resource-group MyResourceGroup \
--account-name MyCosmosDBAccount \
--database-name MyDatabase \
--name MyContainer \
--throughput 1000
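Before sending a manual throughput change, it can help to check the value locally. The sketch below assumes the commonly documented rules for manual throughput (at least 400 RU/s, in increments of 100 RU/s); the real minimum for a given container can be higher, and both function names are hypothetical.

```shell
# Sketch: validate a manual RU/s value before calling the Azure CLI.
# Assumption: manual throughput must be >= 400 RU/s and a multiple of 100.

is_valid_manual_throughput() {
  local rus="$1"
  [ "$rus" -ge 400 ] && [ $(( rus % 100 )) -eq 0 ]
}

scale_container() {
  # $1 = resource group, $2 = account, $3 = database, $4 = container, $5 = RU/s
  if ! is_valid_manual_throughput "$5"; then
    echo "invalid manual throughput: $5" >&2
    return 1
  fi
  az cosmosdb sql container throughput update \
    --resource-group "$1" \
    --account-name "$2" \
    --database-name "$3" \
    --name "$4" \
    --throughput "$5"
}
```

Calling `scale_container MyResourceGroup MyCosmosDBAccount MyDatabase MyContainer 450` would stop with an error before any request is sent, since 450 is not a multiple of 100.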
Autoscale Throughput:
When configuring autoscale, you specify a maximum RU/s. Cosmos DB will scale throughput between 10% of the maximum and the maximum RU/s. For example, if you set the maximum to 4000 RU/s, it will scale between 400 and 4000 RU/s.
# Example: Setting the autoscale maximum to 4000 RU/s using Azure CLI
# (assumes the container already uses autoscale throughput)
az cosmosdb sql container throughput update \
--resource-group MyResourceGroup \
--account-name MyCosmosDBAccount \
--database-name MyDatabase \
--name MyContainer \
--max-throughput 4000
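The 10% rule described above is simple enough to compute directly, and a container that currently uses manual throughput generally has to be migrated to autoscale before a maximum can be set. The following is a small sketch of both steps; the function names are hypothetical.

```shell
# Sketch: compute the autoscale range for a given maximum, and switch throughput modes.

autoscale_range() {
  # $1 = autoscale maximum in RU/s; the floor is 10% of the maximum
  local max_rus="$1"
  local min_rus=$(( max_rus / 10 ))
  echo "${min_rus}-${max_rus} RU/s"
}

migrate_container_throughput() {
  # $1 = resource group, $2 = account, $3 = database, $4 = container
  # $5 = target type: "autoscale" or "manual"
  az cosmosdb sql container throughput migrate \
    --resource-group "$1" \
    --account-name "$2" \
    --database-name "$3" \
    --name "$4" \
    --throughput-type "$5"
}
```

For instance, `autoscale_range 4000` prints `400-4000 RU/s`, matching the example in the text.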
Partitioning for Scale
Partitioning is fundamental to achieving high scalability and performance in Cosmos DB. A logical partition is a group of documents that share the same partition key value. A physical partition can store multiple logical partitions. Choosing an effective partition key is critical:
- Cardinality: A partition key with a high number of distinct values is generally better.
- Distribution: A partition key that distributes requests evenly across logical partitions prevents "hot partitions."
- Query Patterns: Design your partition key to align with your most frequent query patterns.
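One rough way to evaluate a candidate partition key against the cardinality and distribution criteria above is to sample its values from existing data and measure how concentrated they are. The sketch below assumes you already have one key value per line (for example, exported from a sample of items); the function name is hypothetical, and the result only approximates storage skew, not request skew.

```shell
# Sketch: summarize a sample of partition key values read from stdin (one per line).
# Output: "<distinct keys> <total items> <largest key's share in percent>"

partition_key_skew() {
  awk '
    NF { count[$0]++; total++ }          # skip blank lines, tally each key value
    END {
      if (total == 0) { print "0 0 0"; exit }
      max = 0; distinct = 0
      for (k in count) {
        distinct++
        if (count[k] > max) max = count[k]
      }
      # %d truncates the percentage to a whole number
      printf "%d %d %d\n", distinct, total, (max * 100) / total
    }'
}
```

A result with few distinct keys, or one key holding a large share of the sample, suggests the key may produce hot partitions.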
Partition Key Limits
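Each logical partition has a storage limit (currently 20 GB) and a throughput limit. If requests to a logical partition exceed its throughput limit, they are throttled (HTTP status code 429). A logical partition cannot grow beyond its storage limit, so writes to it fail once it is full; avoiding this usually means choosing a more granular partition key and migrating your data.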
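A simple headroom check against the 20 GB limit can be sketched as below. It assumes you have already measured the size of a logical partition by some other means, and it treats 20 GB as 20 × 1024 MB, which is an approximation; the function names and the default 80% warning threshold are hypothetical.

```shell
# Sketch: report how close a logical partition is to the 20 GB storage limit.

logical_partition_usage_pct() {
  # $1 = partition size in MB; prints the whole-number percentage of the limit used
  local size_mb="$1"
  local limit_mb=$(( 20 * 1024 ))
  echo $(( size_mb * 100 / limit_mb ))
}

check_logical_partition() {
  # $1 = partition key value, $2 = size in MB, $3 = warning threshold in percent (default 80)
  local key="$1" size_mb="$2" warn_pct="${3:-80}"
  local pct
  pct=$(logical_partition_usage_pct "$size_mb")
  if [ "$pct" -ge "$warn_pct" ]; then
    echo "WARN ${key}: ${pct}% of limit"
  else
    echo "OK ${key}: ${pct}% of limit"
  fi
}
```

For example, `check_logical_partition tenant-42 18432` prints `WARN tenant-42: 90% of limit`.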
Global Distribution and Scaling
Azure Cosmos DB accounts can be configured for single-region writes or multi-region writes (formerly called multi-master). You can add or remove regions from your Cosmos DB account at any time. Global distribution allows you to:
- Improve Latency: Place data closer to your users in different geographical regions.
- High Availability: Ensure your application remains available even if an entire region experiences an outage.
- Seamless Scaling: Add regions to scale your application's global reach and throughput capacity.
Adding and Removing Regions
You can manage regions via the Azure portal or programmatically.
# Example: Adding a region to an existing Cosmos DB account using Azure CLI
# List every region the account should have, including the new one
az cosmosdb update \
--name MyCosmosDBAccount \
--resource-group MyResourceGroup \
--locations regionName=eastus failoverPriority=0 isZoneRedundant=False \
--locations regionName=westus failoverPriority=1 isZoneRedundant=False
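Because the region list passed to `az cosmosdb update` is the full desired set, adding or removing a region amounts to re-sending the list with or without it. The sketch below builds the `--locations` arguments from an ordered list of regions, giving the first one failover priority 0; the function names are hypothetical, and zone redundancy is assumed to be off for every region.

```shell
# Sketch: set an account's regions from an ordered list, and enable multi-region writes.

set_account_regions() {
  # $1 = resource group, $2 = account, remaining args = regions in priority order
  local rg="$1" account="$2"
  shift 2
  local priority=0 region
  # Replace each region name in "$@" with its --locations argument group:
  # drop the name from the front, append the expanded arguments at the back.
  for region in "$@"; do
    shift
    set -- "$@" --locations "regionName=${region}" "failoverPriority=${priority}" "isZoneRedundant=False"
    priority=$(( priority + 1 ))
  done
  az cosmosdb update --name "$account" --resource-group "$rg" "$@"
}

enable_multi_region_writes() {
  # $1 = resource group, $2 = account
  az cosmosdb update --name "$2" --resource-group "$1" --enable-multiple-write-locations true
}
```

For example, `set_account_regions MyResourceGroup MyCosmosDBAccount eastus westus northeurope` would add `northeurope` to an account that already has the first two regions, and calling it again without `northeurope` would remove it. Region changes and other account property changes are kept in separate calls here, since they are typically applied as separate operations.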
Monitoring Scalability
Regularly monitor your Cosmos DB account's performance metrics, including Request Units consumed, throttled requests, latency, and storage usage. This helps you identify potential bottlenecks and adjust your scaling strategy proactively.
- Request Units (RUs): Monitor `Total Request Units` and `Normalized RU Consumption` to understand how much of your provisioned throughput you actually use.
- Throttled Requests: A high number of throttled requests (HTTP status code 429) indicates that your provisioned throughput is insufficient or that load is concentrated on a hot partition.
- Latency: Monitor `Server Side Latency`, split by read and write operations, to ensure your application is meeting its performance requirements.
Azure Monitor provides comprehensive tools for observing these metrics.
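As a rough illustration, request counts can be pulled from Azure Monitor with the CLI and turned into a throttle rate. The sketch below assumes the `TotalRequests` metric and its `StatusCode` dimension are available for your account, so verify both against the metrics list for your resource; the function names are hypothetical.

```shell
# Sketch: fetch request counts by status code and compute the share of throttled requests.

list_requests_by_status() {
  # $1 = full Azure resource ID of the Cosmos DB account
  # Assumption: the TotalRequests metric exposes a StatusCode dimension.
  az monitor metrics list \
    --resource "$1" \
    --metric TotalRequests \
    --aggregation Count \
    --interval PT5M \
    --filter "StatusCode eq '*'"
}

throttle_rate_pct() {
  # $1 = number of 429 responses, $2 = total requests; prints a whole-number percentage
  local throttled="$1" total="$2"
  if [ "$total" -eq 0 ]; then
    echo 0
    return
  fi
  echo $(( throttled * 100 / total ))
}
```

For example, `throttle_rate_pct 50 1000` prints `5`. A small, steady rate of 429s can be acceptable if the SDK retries succeed, while a sustained high rate points to the throughput or partitioning issues described above.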