Indexing in Azure Cosmos DB

This section covers how indexing works in Azure Cosmos DB, including automatic indexing, indexing policies, and different index types.

Automatic Indexing

By default, Azure Cosmos DB indexes every item written to your containers and updates the index whenever you create, update, or delete a document. For most scenarios, this means you don't need to define or manage indexes yourself.

The default indexing policy ensures that all properties within documents are indexed, providing a broad range of query capabilities out-of-the-box.
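To make "all properties are indexed" concrete, the following minimal sketch lists the property paths that a wildcard policy would cover for a sample document. The helper name `property_paths` and the sample document are hypothetical, and the code only mimics the path notation used in indexing policies (with `[]` standing for array elements); it is not how the service builds its index.

```python
def property_paths(value, prefix=""):
    """Collect the distinct property paths found in a JSON-like value.

    Nested objects extend the path with their key; array elements are
    written as "[]", following indexing-policy path notation.
    """
    if isinstance(value, dict):
        paths = set()
        for key, child in value.items():
            paths |= property_paths(child, f"{prefix}/{key}")
        return paths
    if isinstance(value, list):
        paths = set()
        for child in value:
            paths |= property_paths(child, f"{prefix}/[]")
        return paths
    # Scalar value: the path accumulated so far is a leaf.
    return {prefix}


# Hypothetical sample document.
sample = {"id": "1", "address": {"city": "Oslo"}, "tags": ["a", "b"]}
paths = sorted(property_paths(sample))
print(paths)
```

Each of the listed paths can be filtered on without any extra configuration under the default policy.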

Indexing Policies

You can customize the default behavior with an indexing policy, which defines how Cosmos DB indexes the data in a container. Because every indexed path adds to the cost of writes and to index storage, tuning the policy is an important way to balance query performance against write and storage costs.

Key aspects of an indexing policy include the indexing mode, the included and excluded paths, and any composite or spatial indexes.

Defining an Indexing Policy

Indexing policies are defined in JSON format. Here's a simplified example of an indexing policy that indexes every path except those under /sensitiveData:


{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/sensitiveData/*"
        }
    ]
}
        

For more advanced configurations and specific index kinds, refer to the Azure Cosmos DB documentation on indexing policies.
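As a minimal sketch, the same policy can be assembled programmatically before it is applied to a container. The helper `build_indexing_policy` is a hypothetical name introduced for illustration; the code only builds and serializes the JSON and does not call any Azure SDK.

```python
import json


def build_indexing_policy(excluded_paths):
    """Return a policy that indexes everything except `excluded_paths`."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": path} for path in excluded_paths],
    }


policy = build_indexing_policy(["/sensitiveData/*"])
policy_json = json.dumps(policy, indent=4)
print(policy_json)

# Assumption, not exercised here: with the azure-cosmos Python SDK, a dict
# like `policy` can be passed as the `indexing_policy` argument when
# creating a container.
```

Generating the policy from a list of excluded paths keeps the wildcard include and the exclusions in one place, which can help when several containers share the same rules.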

Composite Indexes

Composite indexes support queries that sort or filter on multiple properties at once. A query with an ORDER BY clause on two or more properties requires a matching composite index, and queries that filter on multiple properties can use one to reduce their request unit (RU) cost.

A composite index is defined by specifying multiple paths, each with a sort order, in sequence. For example, to optimize queries that sort by category ascending and then by price descending, you would define a composite index on these paths:


{
    "compositeIndexes": [
        [
            { "path": "/category", "order": "ascending" },
            { "path": "/price", "order": "descending" }
        ]
    ]
}
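The sequence and sort order of the paths both matter. The sketch below models the matching rule in simplified form: a composite index can serve an ORDER BY clause that lists the same paths in the same sequence, with either exactly the same sort orders or all of them inverted. The function `supports_order_by` is a hypothetical helper for illustration, not part of any SDK, and it ignores filter-only queries.

```python
def supports_order_by(composite, order_by):
    """Check whether a composite index can serve an ORDER BY clause.

    composite: list of {"path": ..., "order": "ascending" | "descending"}
    order_by:  list of (path, order) tuples in ORDER BY sequence
    """
    if len(composite) != len(order_by):
        return False
    # Paths must appear in the same sequence as in the index definition.
    if [entry["path"] for entry in composite] != [path for path, _ in order_by]:
        return False
    pairs = list(zip(composite, order_by))
    same = all(entry["order"] == order for entry, (_, order) in pairs)
    inverted = all(entry["order"] != order for entry, (_, order) in pairs)
    return same or inverted


composite = [
    {"path": "/category", "order": "ascending"},
    {"path": "/price", "order": "descending"},
]

# ORDER BY c.category ASC, c.price DESC
exact = supports_order_by(
    composite, [("/category", "ascending"), ("/price", "descending")]
)
# ORDER BY c.category DESC, c.price ASC (every order inverted)
flipped = supports_order_by(
    composite, [("/category", "descending"), ("/price", "ascending")]
)
# ORDER BY c.category ASC, c.price ASC (only one order differs)
mixed = supports_order_by(
    composite, [("/category", "ascending"), ("/price", "ascending")]
)
print(exact, flipped, mixed)
```

Under this model, a query that sorts both properties ascending would need its own composite index.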
        

Spatial Indexes

Azure Cosmos DB supports spatial data types and indexing for performing geospatial queries. This is invaluable for applications dealing with location-based data, such as mapping services or delivery tracking.

To enable spatial indexing, add the path and the geometry types to index to the spatialIndexes section of the indexing policy.


{
    "spatialIndexes": [
        {
            "path": "/location/*",
            "types": ["Point"]
        }
    ]
}
        

You can store spatial data as GeoJSON documents and query them using SQL's spatial functions like ST_DISTANCE and ST_WITHIN.
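As a rough illustration, the sketch below builds a document with a GeoJSON Point and a parameterized query that uses ST_DISTANCE. The helper `make_point`, the document shape, and the parameter names are hypothetical; the code does not connect to a database. Note that GeoJSON lists coordinates as longitude first, then latitude, and that ST_DISTANCE reports distances in meters.

```python
import json


def make_point(longitude, latitude):
    """Build a GeoJSON Point; coordinates are [longitude, latitude]."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    return {"type": "Point", "coordinates": [longitude, latitude]}


# Hypothetical document with a location stored under /location.
store = {"id": "store-1", "location": make_point(-122.12, 47.66)}

# Find items within 30 km of a reference point.
query = "SELECT c.id FROM c WHERE ST_DISTANCE(c.location, @origin) < @meters"
parameters = [
    {"name": "@origin", "value": make_point(-122.33, 47.61)},
    {"name": "@meters", "value": 30000},
]

print(json.dumps(store))
```

Validating the coordinate ranges up front catches the common mistake of swapping latitude and longitude before the document is written.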

Manual Indexing (Preview/Advanced)

In certain advanced scenarios, you might want finer control over indexing, for example by excluding parts of your documents from the index or by setting "automatic" to false in the indexing policy so that items are indexed only when explicitly requested. Availability and behavior of these options can vary, so check the current Azure Cosmos DB documentation before relying on them.

Note: Manual indexing is an advanced topic and should be carefully considered. Incorrect configuration can lead to performance degradation or unexpected query results.