Azure Cosmos DB Performance Tuning

Optimize your NoSQL database for speed and efficiency

Unlock Peak Performance

Discover actionable strategies to make your Azure Cosmos DB applications fly.

Introduction to Performance

Azure Cosmos DB is a globally distributed, multi-model database service that enables you to rapidly develop and scale high-performance applications. Achieving optimal performance is crucial for user experience, cost-efficiency, and application reliability. This guide provides key tips and best practices.

Indexing Strategies

Cosmos DB automatically indexes data, but understanding and optimizing this process can significantly boost query performance.

Automatic Indexing Policy

By default, Cosmos DB indexes every property of every document. If your queries only touch a subset of properties, customize the indexing policy so that only the paths you filter or sort on are indexed. This lowers the RU cost of writes and the amount of index storage.

Example: Exclude large or infrequently used fields.

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        { "path": "/*" }
    ],
    "excludedPaths": [
        { "path": "/largeField/*" },
        { "path": "/metadata/internal/*" }
    ]
}
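
The opposite, opt-in approach (index only the properties you query and exclude everything else) can be sketched with a small helper. The function name `build_opt_in_policy` and the property names are hypothetical, used only for illustration. The resulting dictionary follows the `includedPaths`/`excludedPaths` shape of an indexing policy and should be checked against your real query patterns before use.

```python
import json

def build_opt_in_policy(queried_properties):
    """Build an indexing policy that indexes only the listed properties.

    Each entry is a property path without a leading slash or wildcard,
    for example "status" or "address/city" (hypothetical names).
    """
    included = [{"path": f"/{prop.strip('/')}/?"} for prop in queried_properties]
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": included,
        # Excluding the root leaves every path not listed above unindexed.
        "excludedPaths": [{"path": "/*"}],
    }

policy = build_opt_in_policy(["status", "timestamp", "address/city"])
print(json.dumps(policy, indent=2))
```

The trade-off is that any query filtering on a property outside the list can no longer use the index, so this variant fits containers whose query patterns are stable and well understood.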

Composite Indexes

When a query filters on several properties, or sorts by more than one property, a composite index lets Cosmos DB serve it from a single index lookup. Queries with a multi-property ORDER BY require a matching composite index.

Example: a composite index for queries that filter on status and sort by timestamp, newest first.

{
    "compositeIndexes": [
        [
            { "path": "/status", "order": "ascending" },
            { "path": "/timestamp", "order": "descending" }
        ]
    ]
}
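
For ORDER BY queries, a composite index only helps when the query's sort sequence matches the index definition exactly or inverts every direction. The sketch below encodes that rule so you can sanity-check a planned index. The helper `serves_order_by` is hypothetical and covers ORDER BY matching only, not filter-only queries.

```python
def serves_order_by(composite_index, order_by):
    """Return True if an ORDER BY clause can be served by the composite index.

    Both arguments are lists of (path, order) tuples, where order is
    "ascending" or "descending". The paths must appear in the same sequence,
    and the directions must either all match or all be inverted.
    """
    if len(composite_index) != len(order_by):
        return False
    if [path for path, _ in composite_index] != [path for path, _ in order_by]:
        return False
    flip = {"ascending": "descending", "descending": "ascending"}
    pairs = list(zip(composite_index, order_by))
    same = all(idx[1] == qry[1] for idx, qry in pairs)
    inverse = all(flip[idx[1]] == qry[1] for idx, qry in pairs)
    return same or inverse

index = [("/status", "ascending"), ("/timestamp", "descending")]
print(serves_order_by(index, [("/status", "descending"), ("/timestamp", "ascending")]))
print(serves_order_by(index, [("/status", "ascending"), ("/timestamp", "ascending")]))
```

The first call describes the fully inverted sort and is accepted; the second mixes directions in a way the index does not cover and is rejected.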

Range Indexes for Numerical/Date Data

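Range queries (e.g., >, <, >=, <=) and single-property ORDER BY are served by the range index that Cosmos DB maintains for every indexed path, so the main requirement is that numeric and date/time properties used this way are not excluded by the indexing policy. Store date/time values in a format that sorts correctly, such as ISO 8601 UTC strings or epoch numbers.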

Effective Partitioning

A well-chosen partition key is fundamental to distributing your data and request load evenly across partitions, preventing hot partitions.

Choose a High-Cardinality Partition Key

Select a partition key with a large number of distinct values. This ensures that data is spread across many logical partitions, leading to better scalability and request distribution.

Good choices: User IDs, Session IDs, Device IDs.

Poor choices: Boolean flags, Status fields with few unique values.
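
One way to compare candidate keys before committing to one is to measure them against a sample of real documents. The sketch below is a minimal, hypothetical helper (`key_distribution`, with made-up `userId` and `isActive` fields) that reports the number of distinct values and the share of items held by the largest value.

```python
from collections import Counter

def key_distribution(documents, key):
    """Return (distinct_values, largest_share) for a candidate partition key.

    largest_share is the fraction of documents that share the most common
    value; a value close to 1.0 signals a likely hot partition.
    """
    counts = Counter(doc[key] for doc in documents)
    total = sum(counts.values())
    return len(counts), max(counts.values()) / total

# Hypothetical sample: 1,000 users, 90% of them active.
sample = [{"userId": f"user-{i}", "isActive": i % 10 != 0} for i in range(1000)]

for candidate in ("userId", "isActive"):
    distinct, share = key_distribution(sample, candidate)
    print(f"{candidate}: {distinct} distinct values, largest share {share:.1%}")
```

On this sample, `userId` spreads items evenly over 1,000 values, while `isActive` puts 90% of the items behind a single value.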

Avoid Hot Partitions

Monitor your partition usage. If a single partition is consistently consuming a disproportionate amount of Request Units (RUs) or storing significantly more data, your partition key strategy may need adjustment.

Logical Partition Size Limits

A logical partition (all items that share one partition key value) can store at most 20 GB. Choose a partition key whose values keep every logical partition well below that limit as the data grows.
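
A rough projection shows how quickly a single partition key value could reach the limit. This is a back-of-the-envelope sketch with assumed inputs: the function name, item size, and write rate are hypothetical, 20 GB is treated as 20 × 1024³ bytes, and index overhead is ignored.

```python
# Assumption: the 20 GB limit is interpreted as 20 * 1024**3 bytes.
LOGICAL_PARTITION_LIMIT_BYTES = 20 * 1024**3

def days_until_limit(avg_item_bytes, items_per_day, current_bytes=0):
    """Estimate the days until one logical partition reaches the size limit.

    Assumes constant item size and write rate, and ignores index overhead.
    """
    daily_growth = avg_item_bytes * items_per_day
    if daily_growth <= 0:
        return float("inf")
    remaining = max(LOGICAL_PARTITION_LIMIT_BYTES - current_bytes, 0)
    return remaining / daily_growth

# Hypothetical device writing 50,000 items of 2 KB each per day.
print(f"{days_until_limit(2048, 50_000):.0f} days until the limit")
```

If the estimate is uncomfortably short, a finer-grained key (for example, combining the device ID with a date bucket) is one common way to cap partition growth.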

Throughput Management (RUs)

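Request Units (RUs) are the currency Cosmos DB uses to express the cost of every operation, and throughput is provisioned in RUs per second (RU/s). As a reference point, a point read of a 1 KB item costs 1 RU. Managing RUs efficiently affects both performance and cost.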

Autoscale vs. Manual Throughput

Autoscale suits variable or unpredictable workloads: Cosmos DB scales throughput between 10% of the configured maximum and the maximum, based on usage. Manual throughput suits predictable, steady workloads for which you can provision a fixed RU/s value with confidence.

Consider autoscale for development and testing environments, and potentially for production if your traffic patterns are highly variable.
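
Whether autoscale is cheaper depends on how spiky the traffic is. The sketch below compares the two models in billed RU/s-hours under stated assumptions that should be verified against current pricing: autoscale is billed each hour for the highest RU/s it scaled to, never below 10% of the configured maximum, and at 1.5 times the manual rate. The function names and the traffic profiles are hypothetical.

```python
# Assumptions to verify against current pricing documentation:
AUTOSCALE_RATE_MULTIPLIER = 1.5   # autoscale RU/s billed at 1.5x the manual rate
AUTOSCALE_FLOOR_DIVISOR = 10      # billing floor is 10% of the configured maximum

def manual_units(provisioned_rus, hours):
    """Billed RU/s-hours for fixed provisioned throughput."""
    return provisioned_rus * hours

def autoscale_units(max_rus, hourly_peak_rus):
    """Billed RU/s-hours for autoscale, expressed in manual-rate equivalents.

    hourly_peak_rus holds the highest RU/s the workload needed in each hour.
    """
    floor = max_rus / AUTOSCALE_FLOOR_DIVISOR
    total = 0.0
    for peak in hourly_peak_rus:
        billed = min(max(peak, floor), max_rus)  # clamp to [floor, max]
        total += billed * AUTOSCALE_RATE_MULTIPLIER
    return total

spiky_day = [500] * 20 + [10_000] * 4   # quiet most of the day, 4 busy hours
steady_day = [10_000] * 24              # constantly at the maximum

print("manual:", manual_units(10_000, 24))
print("autoscale, spiky:", autoscale_units(10_000, spiky_day))
print("autoscale, steady:", autoscale_units(10_000, steady_day))
```

Under these assumptions the spiky profile is far cheaper on autoscale, while the steady profile costs more than manual throughput provisioned at the same level.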

Provision Throughput at the Container Level

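Provision dedicated throughput at the container level for workloads with distinct or demanding performance requirements, because that throughput is reserved for the container. When many containers have light or uneven usage and can share capacity, provision throughput at the database level instead; the containers then share it, without a per-container guarantee. Both levels support autoscale.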

Optimize Operations for RU Efficiency

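Understand the RU cost of different operations. A point read (a lookup by ID and partition key) is the cheapest operation; writes cost more because they also update the index, and query cost varies with the amount of data scanned and returned. Design your application to use the most cost-effective operation that does the job.

Tip: A stored procedure can apply several writes within one logical partition in a single round trip, which reduces network latency; RU savings are not guaranteed. For large imports that span many partitions, the bulk support in the SDKs is usually a better fit.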

Batching Operations

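When performing multiple inserts or updates that target the same partition key value, group them into a single transactional batch or stored procedure call; both are scoped to one logical partition. This saves round trips compared with issuing individual requests.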
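
Because a batch must stay within one logical partition, the first step is usually to group pending writes by partition key value and split large groups into chunks. The sketch below shows only that planning step; it does not call any SDK. The helper name `plan_batches` and the field names are hypothetical, and the limit of 100 operations per batch is an assumption to confirm for your SDK version.

```python
from collections import defaultdict

# Assumption: a transactional batch accepts at most 100 operations.
MAX_OPERATIONS_PER_BATCH = 100

def plan_batches(items, partition_key_field, batch_size=MAX_OPERATIONS_PER_BATCH):
    """Group items by partition key value and split each group into chunks.

    Returns a list of (partition_key_value, items) tuples, each small enough
    to be submitted as one batch against a single logical partition.
    """
    by_key = defaultdict(list)
    for item in items:
        by_key[item[partition_key_field]].append(item)

    batches = []
    for key, group in by_key.items():
        for start in range(0, len(group), batch_size):
            batches.append((key, group[start:start + batch_size]))
    return batches

# Hypothetical workload: 250 readings for device "a", 30 for device "b".
pending = [{"id": f"a-{i}", "deviceId": "a"} for i in range(250)]
pending += [{"id": f"b-{i}", "deviceId": "b"} for i in range(30)]

for key, chunk in plan_batches(pending, "deviceId"):
    print(key, len(chunk))
```

Each resulting tuple can then be handed to whichever batch mechanism your SDK provides.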

Query Optimization

Write efficient queries that leverage indexes and minimize unnecessary data retrieval.

Use System Functions Wisely

Functions such as LOWER(), UPPER(), and mathematical functions, when applied to a property in a filter, can prevent the query from using the index on that property. Where possible, store the data in the case or format you query on.
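
A common way to apply this is to write a normalized copy of the value next to the original, then filter on the copy with a plain equality. The sketch below assumes a hypothetical `emailLower` property and hypothetical helper names; the parameter list uses the name/value shape accepted by parameterized queries.

```python
def with_normalized_email(document):
    """Return a copy of the document with a lowercase, trimmed email copy.

    The original "email" value is kept for display; "emailLower" is a
    hypothetical extra property intended for indexed equality filters.
    """
    normalized = dict(document)
    normalized["emailLower"] = document["email"].strip().lower()
    return normalized

# The filter compares the stored value directly, with no LOWER() call.
QUERY = "SELECT c.id FROM c WHERE c.emailLower = @email"

def lookup_parameters(email):
    """Normalize the search term the same way as the stored value."""
    return [{"name": "@email", "value": email.strip().lower()}]

doc = with_normalized_email({"id": "1", "email": " Ada@Example.COM "})
print(doc["emailLower"])
print(QUERY, lookup_parameters("ADA@example.com"))
```

The cost is a slightly larger document and the need to keep both properties in sync on every write.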

Avoid `SELECT *`

Only project the fields you need. Selecting all fields in a large document increases network traffic and processing overhead.

Example:

SELECT c.id, c.name, c.email FROM c WHERE c.isActive = true

Leverage the Azure Cosmos DB Emulator

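Test your queries against the Cosmos DB Emulator. It provides a local development environment for debugging and refining queries without incurring cloud costs. Treat latency figures from the emulator as indicative only, since it does not reproduce the performance of the cloud service.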

Consider Stored Procedures and User-Defined Functions (UDFs)

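For complex logic that must run transactionally on the server, stored procedures can improve performance by reducing network round trips. Use UDFs sparingly in filters: a UDF in a WHERE clause cannot use the index, so the query falls back to scanning and consumes more RUs.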

Connection Management

Efficiently managing connections to Cosmos DB can prevent performance bottlenecks.

Use SDKs and Keep Connections Warm

Use the official Azure Cosmos DB SDKs. These SDKs implement efficient connection pooling and retry logic. Avoid creating new client instances for every operation; reuse a single client instance throughout your application's lifecycle.

Example (conceptual):

using Microsoft.Azure.Cosmos;

// Initialize once per application lifetime (for example, as a singleton).
CosmosClient client = new CosmosClient("YOUR_COSMOS_DB_CONNECTION_STRING");

// Reuse the same client for all operations.
Container container = client.GetContainer("your_database", "your_container");
// ... perform operations using container ...

Tune Request Options

In the .NET SDK, tune client options such as MaxRetryAttemptsOnRateLimitedRequests and MaxRetryWaitTimeOnRateLimitedRequests to handle throttling (HTTP 429 responses) gracefully. Other SDKs expose equivalent settings under different names.
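
The SDKs already perform these retries internally, so the following is only a conceptual sketch of what the two settings bound: the number of retries and the cumulative time spent waiting. `RateLimitedError`, `call_with_retries`, and `make_flaky` are hypothetical stand-ins, and the defaults of 9 attempts and 30 seconds are assumptions rather than documented values for any particular SDK.

```python
import time

class RateLimitedError(Exception):
    """Hypothetical stand-in for a throttled (HTTP 429) response."""
    def __init__(self, retry_after_seconds):
        super().__init__("request rate too large")
        self.retry_after_seconds = retry_after_seconds

def call_with_retries(operation, max_attempts=9, max_wait_seconds=30.0, sleep=time.sleep):
    """Retry a throttled operation, honoring the server's retry-after hint.

    Gives up when either the retry count or the cumulative wait would
    exceed its limit, and re-raises the last throttling error.
    """
    waited = 0.0
    retries = 0
    while True:
        try:
            return operation()
        except RateLimitedError as error:
            retries += 1
            over_attempts = retries > max_attempts
            over_wait = waited + error.retry_after_seconds > max_wait_seconds
            if over_attempts or over_wait:
                raise
            sleep(error.retry_after_seconds)
            waited += error.retry_after_seconds

def make_flaky(failures):
    """Build an operation that is throttled `failures` times, then succeeds."""
    state = {"remaining": failures}
    def operation():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RateLimitedError(retry_after_seconds=0.5)
        return "ok"
    return operation

# Use a no-op sleep so the demonstration finishes immediately.
print(call_with_retries(make_flaky(3), sleep=lambda seconds: None))
```

Raising either limit makes the application more tolerant of short throttling bursts at the price of higher tail latency; persistent throttling is better addressed by adding throughput or fixing a hot partition.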

Consider Gateway vs. Direct Mode

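The .NET and Java SDKs default to Direct mode (TCP), which offers lower latency because requests go straight to the backend replicas. Gateway mode (HTTPS) routes requests through a gateway endpoint and may be preferable behind restrictive firewalls, where the number of outbound connections is limited, or when simplicity matters more than latency.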