Containers in Azure Cosmos DB
A container is the fundamental unit of scalability for both storage and throughput in Azure Cosmos DB. It is a schema-agnostic collection of items; depending on the API you use, an item can represent a JSON document, a key-value row, or a node or edge in a graph. Each container is scoped to a single database, stores zero or more items, and has a partition key that defines how those items are partitioned.
Key Characteristics of Containers
- Schema Agnostic: Containers do not enforce a schema, allowing you to store data of varying structures.
- Scalability: Containers are the unit of horizontal scaling for both storage and throughput.
- Throughput Provisioning: You can provision throughput, measured in Request Units per second (RU/s), on an individual container, or let several containers share throughput provisioned on their database.
- Partitioning: Data within a container is partitioned for scalability. You choose a partition key when creating a container.
- Indexing: Azure Cosmos DB automatically indexes all data in a container by default, without requiring you to define secondary indexes.
- Transactions: Containers support ACID transactions across multiple items within the same logical partition.
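Because transactions are scoped to a single logical partition, it can help to check up front whether a set of items shares one partition key value. The following is a minimal Python sketch under stated assumptions: the helper names (partition_key_value, same_logical_partition) and the sample items are hypothetical and are not part of any Azure SDK. It also shows that items in one container may carry different properties.

```python
from typing import Any, Iterable


def partition_key_value(item: dict[str, Any], path: str) -> Any:
    """Resolve a partition key path such as '/categoryId' or '/address/city'."""
    value: Any = item
    for segment in path.strip("/").split("/"):
        value = value[segment]
    return value


def same_logical_partition(items: Iterable[dict[str, Any]], path: str) -> bool:
    """Return True when every item shares one partition key value."""
    values = {partition_key_value(item, path) for item in items}
    return len(values) <= 1


# Hypothetical items: same container, different shapes (schema agnostic).
cart = [
    {"id": "1", "categoryId": "Electronics", "name": "Laptop"},
    {"id": "2", "categoryId": "Electronics", "name": "Monitor", "sizeInches": 27},
]
mixed = cart + [{"id": "3", "categoryId": "Books", "title": "Dune"}]

print(same_logical_partition(cart, "/categoryId"))   # True
print(same_logical_partition(mixed, "/categoryId"))  # False
```

Only the first group could take part in a single transaction; the second spans two logical partitions.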
Creating a Container
You can create containers using the Azure portal, Azure SDKs (e.g., .NET, Java, Node.js, Python), or Azure CLI/PowerShell.
Using the Azure Portal:
- Navigate to your Azure Cosmos DB account.
- Select your database.
- Click on the "Containers" tab.
- Click "+ Add Container".
- Provide a container ID, database name, and choose a partition key.
- Configure the throughput (manual or autoscale).
Using SDKs (Conceptual Example - .NET):
// Requires the Microsoft.Azure.Cosmos NuGet package (.NET SDK v3)
using Microsoft.Azure.Cosmos;
// Assuming 'client' is an instance of CosmosClient
// and 'database' is an instance of Database
string containerId = "myContainer";
string autoscaleContainerId = "myAutoscaleContainer";
string partitionKeyPath = "/categoryId";
// Create a container with manual throughput.
// The throughput (RU/s) is passed as an argument to CreateContainerAsync.
ContainerProperties containerProperties = new ContainerProperties(containerId, partitionKeyPath);
Container createdContainer = await database.CreateContainerAsync(containerProperties, throughput: 400);
// Create a second container with autoscale throughput (maximum of 1000 RU/s).
// It uses a different ID because container IDs must be unique within a database.
ContainerProperties autoscaleContainerProperties = new ContainerProperties(autoscaleContainerId, partitionKeyPath);
ThroughputProperties autoscaleThroughput = ThroughputProperties.CreateAutoscaleThroughput(1000);
Container autoscaleCreatedContainer = await database.CreateContainerAsync(autoscaleContainerProperties, autoscaleThroughput);
Partitioning and Partition Keys
The partition key is a property within your items whose value determines which logical partition each item belongs to. All items that share the same partition key value form one logical partition, and Azure Cosmos DB uses these values to distribute data and requests. Choosing a good partition key is crucial for performance and scalability: it should have high cardinality (many distinct values) and spread both storage and requests evenly across those values.
Example Item with a Partition Key:
{
  "id": "item1",
  "categoryId": "Electronics",
  "name": "Laptop",
  "price": 1200
}
If the partition key is /categoryId, the value "Electronics" would determine the logical partition for this item.
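To make the effect of a partition key choice concrete, the sketch below groups a few sample items by their partition key value and reports how large a share of the data the biggest logical partition holds. It is an illustration only: the function names and the extra sample items are hypothetical, only top-level paths are handled, and item counts stand in for the storage and request volume you would measure in practice.

```python
from collections import Counter


def logical_partition_sizes(items, partition_key_path):
    """Count items per partition key value (one count per logical partition)."""
    # Only top-level paths such as "/categoryId" are handled in this sketch.
    prop = partition_key_path.lstrip("/")
    return Counter(item[prop] for item in items)


def largest_share(sizes):
    """Fraction of all items held by the largest logical partition."""
    total = sum(sizes.values())
    return max(sizes.values()) / total if total else 0.0


# Hypothetical sample data modeled on the example item.
items = [
    {"id": "item1", "categoryId": "Electronics", "name": "Laptop", "price": 1200},
    {"id": "item2", "categoryId": "Electronics", "name": "Phone", "price": 800},
    {"id": "item3", "categoryId": "Electronics", "name": "Tablet", "price": 500},
    {"id": "item4", "categoryId": "Books", "name": "Novel", "price": 15},
]

by_category = logical_partition_sizes(items, "/categoryId")
by_id = logical_partition_sizes(items, "/id")

print(dict(by_category))           # {'Electronics': 3, 'Books': 1}
print(largest_share(by_category))  # 0.75
print(largest_share(by_id))        # 0.25
```

In this sample, /categoryId concentrates three quarters of the items in one logical partition, while a higher-cardinality key such as /id spreads them evenly. Whether that trade-off is right depends on how the data is queried.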
Throughput and Request Units (RU/s)
Throughput in Azure Cosmos DB is measured in Request Units per second (RU/s). A Request Unit (RU) is a normalized measure of the system resources (CPU, memory, and IOPS) required to perform a database operation, such as reading an item, writing an item, or running a query.
- Manual (Standard) Throughput: You set a fixed RU/s value for your container and change it yourself when the workload changes.
- Autoscale Throughput: Azure Cosmos DB automatically scales the throughput up or down based on workload, between 10% of a maximum RU/s value that you define and that maximum.
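A rough way to size throughput is to multiply each operation's RU charge by how often it runs per second and add up the results. The sketch below does this for a hypothetical workload. The per-operation charges and operation rates are assumed inputs, not measured values; in practice you would read the actual charge from the response of each operation. The rounding steps assume manual throughput is set in increments of 100 RU/s and autoscale maximums in increments of 1000 RU/s, and the autoscale range assumes scaling between 10% of the maximum and the maximum.

```python
import math


def required_rus_per_second(ops_per_second, ru_per_op):
    """Sum of (operations per second x RU charge per operation)."""
    return sum(ops_per_second[op] * ru_per_op[op] for op in ops_per_second)


def round_up(value, step):
    """Round value up to the next multiple of step."""
    return math.ceil(value / step) * step


def autoscale_range(max_rus):
    """Assumed autoscale range: 10% of the maximum up to the maximum."""
    return max_rus // 10, max_rus


# Hypothetical workload and assumed per-operation RU charges.
ops = {"point_read": 500, "write": 100}      # operations per second
charges = {"point_read": 1.0, "write": 6.0}  # RUs per operation (assumed)

needed = required_rus_per_second(ops, charges)  # 500 * 1.0 + 100 * 6.0
manual = round_up(needed, 100)                  # manual throughput setting
autoscale_max = round_up(needed, 1000)          # autoscale maximum

print(needed, manual, autoscale_max, autoscale_range(autoscale_max))
# 1100.0 1100 2000 (200, 2000)
```

With these assumed numbers, a manual setting of 1100 RU/s covers the steady load, while an autoscale maximum of 2000 RU/s would let throughput float between 200 and 2000 RU/s.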
API Operations
Containers support various CRUD (Create, Read, Update, Delete) operations for items, as well as querying and managing throughput.
Common Container API Operations
POST /dbs/{db_name}/colls
Creates a new container.
GET /dbs/{db_name}/colls/{coll_name}
Retrieves metadata about a specific container.
PUT /dbs/{db_name}/colls/{coll_name}
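Replaces a container's properties (e.g., its indexing policy). Throughput is not changed through this call; it is updated through the offer resource associated with the container.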
DELETE /dbs/{db_name}/colls/{coll_name}
Deletes a container.
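The operations listed above can be summarized as a small lookup from operation to HTTP method and resource path. This is a sketch only: the container_request helper and the "store" database name are hypothetical, and a real request also needs the account endpoint, an authorization header, and date and version headers, all of which are omitted here.

```python
from urllib.parse import quote


def container_request(operation, db_name, coll_name=None):
    """Return (HTTP method, resource path) for a container operation."""
    base = f"/dbs/{quote(db_name, safe='')}/colls"
    if operation == "create":
        return "POST", base

    methods = {"read": "GET", "replace": "PUT", "delete": "DELETE"}
    if operation not in methods:
        raise ValueError(f"unknown operation: {operation}")
    if coll_name is None:
        raise ValueError(f"{operation} requires a container name")
    return methods[operation], f"{base}/{quote(coll_name, safe='')}"


print(container_request("create", "store"))
# ('POST', '/dbs/store/colls')
print(container_request("read", "store", "myContainer"))
# ('GET', '/dbs/store/colls/myContainer')
print(container_request("delete", "store", "myContainer"))
# ('DELETE', '/dbs/store/colls/myContainer')
```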
Next Steps
Learn more about managing Items within your containers or explore Partitioning Strategies.