Containers in Azure Cosmos DB

This document provides a comprehensive guide to understanding and managing containers within Azure Cosmos DB.

Introduction to Containers

A container is the fundamental unit of scalability for both provisioned throughput and storage in Azure Cosmos DB. It is schema-agnostic: the items stored in it do not need to share a common structure. In addition to items, a container can hold stored procedures, triggers, and user-defined functions.

Each container is uniquely identified by a name within a specific database. Containers are partitioned by a partition key, which is a property within the document that Cosmos DB uses to distribute data across logical partitions. The choice of partition key is critical for performance and scalability.
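
As a rough illustration of the idea above, the following Python sketch resolves a partition key path such as "/categoryId" against a few sample items and groups them by the resulting value, since all items that share a partition key value belong to the same logical partition. The helper names and sample items are hypothetical, and the sketch models the concept only, not how the service places data internally.

```python
from collections import defaultdict


def partition_key_value(item, path):
    """Resolve a partition key path such as "/categoryId" against an item (a dict)."""
    value = item
    for segment in path.strip("/").split("/"):
        value = value[segment]  # raises KeyError if the property is missing
    return value


def group_by_logical_partition(items, path):
    """Group item ids by partition key value; each group models one logical partition."""
    partitions = defaultdict(list)
    for item in items:
        partitions[partition_key_value(item, path)].append(item["id"])
    return dict(partitions)


# Hypothetical sample items
items = [
    {"id": "1", "categoryId": "books", "name": "Atlas"},
    {"id": "2", "categoryId": "games", "name": "Chess set"},
    {"id": "3", "categoryId": "books", "name": "Novel"},
]

logical_partitions = group_by_logical_partition(items, "/categoryId")
print(logical_partitions)
```

Here the two "books" items land in one group and the "games" item in another, which mirrors how a skewed key value would concentrate data in a single logical partition.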

Creating Containers

You can create containers using various methods, including the Azure portal, Azure CLI, Azure PowerShell, or the Azure Cosmos DB SDKs for different programming languages.

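When creating a container, you must specify a name that is unique within its database and a partition key path. You can optionally set provisioned throughput and a custom indexing policy.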

Here's an example of creating a container using the Azure CLI:


az cosmosdb sql container create \
    --resource-group MyResourceGroup \
    --account-name MyCosmosDBAccount \
    --database-name MyDatabase \
    --name MyContainer \
    --partition-key-path "/categoryId"
            

Partitioning Strategies

Effective partitioning distributes your data and requests evenly across logical partitions. This avoids hot partitions and keeps performance and cost predictable as the container grows.

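A partition key is a property of your item that Cosmos DB uses to determine which logical partition the item is stored in. For effective partitioning, choose a property that has many distinct values, spreads both storage and request volume evenly, and appears as a filter in your most frequent queries.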

Important: Once a container is created, the partition key path cannot be changed. Plan your partition key strategy carefully.
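
One way to sanity-check a candidate partition key before creating the container is to measure, on a sample of your data, how many distinct values it has and what share of items would fall into the largest logical partition. The Python sketch below does this for two candidates; the function name, the property names, and the generated sample orders are all hypothetical.

```python
from collections import Counter


def partition_key_stats(items, key):
    """Return (distinct value count, share of items in the largest logical partition)."""
    counts = Counter(item[key] for item in items)
    largest_share = max(counts.values()) / len(items)
    return len(counts), largest_share


# Hypothetical order items: few distinct statuses, many distinct customers
orders = [
    {"id": str(i), "status": "shipped" if i % 10 else "pending", "customerId": f"c{i % 50}"}
    for i in range(1000)
]

for candidate in ("status", "customerId"):
    distinct, share = partition_key_stats(orders, candidate)
    print(f"{candidate}: {distinct} distinct values, largest partition holds {share:.0%} of items")
```

A low distinct count or a large share for one value suggests a hot partition. Note that this only estimates storage skew; request skew depends on which values your workload reads and writes most often.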

Indexing Policies

Azure Cosmos DB automatically indexes all data within a container, and the container's indexing policy defines how items are indexed. By default, every property of every item is indexed, which gives good query performance without any index management, at the cost of additional index storage and a higher RU charge on writes.

You can customize the indexing policy to optimize for specific query patterns, for example by including or excluding property paths, or by adding composite or spatial indexes.

Here's a JSON snippet for a custom indexing policy:


{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "String",
                    "precision": 3
                },
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                }
            ]
        }
    ],
    "excludedPaths": [
        {
            "path": "/nonIndexedContent/*"
        }
    ]
}
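
To make the interaction between included and excluded paths concrete, here is a simplified Python model of how a policy like the one above decides whether a property path is indexed. It assumes the documented rule that the more precise path wins when included and excluded paths conflict, and it approximates "more precise" by pattern length. The function names are hypothetical, and only the "/*" and "/?" path endings are handled.

```python
def _matches(pattern, property_path):
    """Check one policy path against a property path (simplified: only /* and /? endings)."""
    if pattern.endswith("/*"):
        # Wildcard: matches everything beneath the prefix.
        return property_path.startswith(pattern[:-1])
    if pattern.endswith("/?"):
        # Scalar: matches exactly this property.
        return property_path == pattern[:-2]
    return property_path == pattern


def is_indexed(policy, property_path):
    """Return True if the most specific matching path is an included path."""
    best_len, included = -1, False
    for kind in ("includedPaths", "excludedPaths"):
        for entry in policy.get(kind, []):
            pattern = entry["path"]
            # Assumption: a longer pattern is treated as the more precise one.
            if _matches(pattern, property_path) and len(pattern) > best_len:
                best_len, included = len(pattern), kind == "includedPaths"
    return included


policy = {
    "includedPaths": [{"path": "/*"}],
    "excludedPaths": [{"path": "/nonIndexedContent/*"}],
}

print(is_indexed(policy, "/categoryId"))
print(is_indexed(policy, "/nonIndexedContent/body"))
```

With this policy, "/categoryId" is matched only by "/*" and is indexed, while anything under "/nonIndexedContent/" is matched by the longer excluded path and is skipped.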
            

Throughput Provisioning

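Throughput in Azure Cosmos DB is measured in Request Units per second (RU/s). You can provision throughput at the container level or the database level. Throughput provisioned on a container is reserved exclusively for that container, while throughput provisioned on a database is shared among the containers in it.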

The cost of your Cosmos DB account is directly related to the provisioned throughput and storage consumed. Monitoring your RU/s consumption is essential for cost management.
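
A back-of-the-envelope way to size provisioned throughput is to multiply the expected rate of each operation type by its measured RU charge and sum the results. The Python sketch below does that arithmetic. The workload numbers and function names are hypothetical, and the defaults of a 400 RU/s minimum and 100 RU/s increments are assumptions based on the commonly documented limits for manually provisioned throughput, so check the current limits for your account.

```python
import math


def required_rus(workload):
    """Sum (operations per second x RU charge per operation) over all operation types."""
    return sum(ops_per_sec * ru_per_op for ops_per_sec, ru_per_op in workload.values())


def provisioned_rus(required, minimum=400, step=100):
    """Round up to the next multiple of `step`, never below `minimum` (assumed limits)."""
    return max(minimum, math.ceil(required / step) * step)


# Hypothetical workload: operation -> (operations per second, RU charge per operation)
workload = {
    "point_read": (500, 1.0),
    "write": (100, 6.0),
    "query": (20, 12.5),
}

needed = required_rus(workload)
print(needed, provisioned_rus(needed))
```

In practice, take the per-operation RU charges from the request charge that the service reports for your own items and queries rather than from estimates like these.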

Common Operations

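Key operations you can perform on containers include creating a container, reading its properties, replacing its properties (for example, to update the indexing policy), and deleting it.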

Example: Reading Container Properties (Azure SDK for .NET)


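using System;
using Microsoft.Azure.Cosmos;

// ...
// Assumes `container` is an existing Microsoft.Azure.Cosmos.Container instance.

// ReadContainerAsync returns a ContainerResponse; its Resource property holds the container's properties.
ContainerResponse response = await container.ReadContainerAsync();
ContainerProperties containerProperties = response.Resource;
Console.WriteLine($"Container ID: {containerProperties.Id}");
Console.WriteLine($"Partition Key Path: {containerProperties.PartitionKeyPath}");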
            

Understanding and effectively managing containers is key to building scalable and performant applications on Azure Cosmos DB.