Cosmos DB Indexing
Azure Cosmos DB automatically and transparently indexes all your data. The indexing policy defines how data is indexed, allowing you to optimize query performance. Cosmos DB supports a variety of indexing strategies to suit different workload patterns.
Automatic Indexing
By default, Cosmos DB indexes every property of every item, using range indexes for strings and numbers. This means that most queries will perform well without any explicit configuration. In the default consistent mode, the index is updated synchronously as items are created, updated, or deleted.
Indexing Policy
You can customize the indexing behavior by defining an indexing policy. This policy allows you to specify:
- Inclusion/Exclusion Paths: Control which paths within your documents are included or excluded from indexing. This is crucial for optimizing storage and performance, especially for large documents or infrequently queried properties.
- Indexing Mode: Choose between consistent (default, indexes updated with every transaction) and lazy (indexes updated periodically, good for bulk inserts).
- Composite Indexes: Create indexes on multiple properties to optimize queries that filter or sort by multiple fields simultaneously.
- Spatial Indexes: Enable indexing for geospatial data types like points, lines, and polygons for efficient location-based queries.
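The options above map onto fields of the policy JSON document. As a minimal sketch, the helper below assembles such a document as a plain Python dictionary. `build_indexing_policy` is a hypothetical name and not part of any Azure SDK, and the `/largePayload/*` path is invented for illustration.

```python
import json


def build_indexing_policy(excluded_paths=(), composite_indexes=(), mode="consistent"):
    """Assemble an indexing policy as a plain dict (hypothetical helper)."""
    policy = {
        "indexingMode": mode,
        "automatic": True,
        # Index everything by default, then carve out exclusions.
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": p} for p in excluded_paths],
    }
    if composite_indexes:
        # Each composite index is an ordered list of (path, order) pairs.
        policy["compositeIndexes"] = [
            [{"path": path, "order": order} for path, order in index]
            for index in composite_indexes
        ]
    return policy


# Illustrative values only: the excluded path is an assumption.
policy = build_indexing_policy(
    excluded_paths=["/largePayload/*"],
    composite_indexes=[[("/category", "ascending"), ("/price", "descending")]],
)
print(json.dumps(policy, indent=2))
```

The resulting JSON text can be pasted into the portal or handed to whichever tool you use to update the container.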
Default Indexing Policy
The default indexing policy for Cosmos DB is:
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    {
      "path": "/*"
    }
  ],
  "excludedPaths": [
    {
      "path": "/\"_etag\"/?"
    }
  ]
}
Configuring Indexing Policy
You can modify the indexing policy for a container through the Azure portal, Azure CLI, PowerShell, or SDKs. Here's an example of an indexing policy that excludes specific paths and adds a composite index:
Custom Indexing Policy Example
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    {"path": "/*"},
    {"path": "/myArray/*", "kind": "range", "precision": -1}
  ],
  "excludedPaths": [
    {"path": "/sensitiveData/*"},
    {"path": "/nonQueryableField"}
  ],
  "compositeIndexes": [
    [
      {"path": "/category", "order": "ascending"},
      {"path": "/price", "order": "descending"}
    ]
  ]
}
In this example:
- /sensitiveData/* and /nonQueryableField are excluded from indexing.
- A composite index is created on category (ascending) and price (descending) for efficient queries using both fields.
- /myArray/* with kind: "range" and precision: -1 indexes all elements within myArray.
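The order declared in a composite index matters for multi-property ORDER BY clauses. The sketch below assumes the commonly documented rule that a composite index serves an ORDER BY only when the property sequence matches and the sort directions are either all identical to the index or all reversed. `order_by_is_served` is a hypothetical helper written for illustration, not an SDK function.

```python
def order_by_is_served(composite_index, order_by):
    """Return True if the ORDER BY sequence matches the index or its full inverse.

    Both arguments are sequences of (path, order) pairs, where order is
    "ascending" or "descending". This encodes an assumed matching rule.
    """
    flip = {"ascending": "descending", "descending": "ascending"}
    index = [(path, order) for path, order in composite_index]
    # The same index read backwards: every direction reversed at once.
    inverse = [(path, flip[order]) for path, order in index]
    requested = [(path, order) for path, order in order_by]
    return requested == index or requested == inverse


composite = [("/category", "ascending"), ("/price", "descending")]

# ORDER BY c.category ASC, c.price DESC
same = order_by_is_served(composite, [("/category", "ascending"), ("/price", "descending")])
# ORDER BY c.category DESC, c.price ASC
reversed_all = order_by_is_served(composite, [("/category", "descending"), ("/price", "ascending")])
# ORDER BY c.category ASC, c.price ASC (only one direction differs)
mixed = order_by_is_served(composite, [("/category", "ascending"), ("/price", "ascending")])
# ORDER BY c.price DESC, c.category ASC (sequence differs)
swapped = order_by_is_served(composite, [("/price", "descending"), ("/category", "ascending")])

print(same, reversed_all, mixed, swapped)
```

Under this rule, a query that sorts both properties ascending would need its own composite index.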
Indexing Mode: Consistent vs. Lazy
Consistent: This is the default and recommended mode. Indexes are updated synchronously with data operations. This ensures that query results are always up-to-date, but can have a slight overhead on write operations.
Lazy: In lazy indexing mode, indexes are updated periodically, at a lower priority than data operations. This can improve write performance, especially for bulk operations. However, queries may return incomplete or stale results until the index catches up, so newly inserted or updated data is not immediately available for querying.
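The difference between the two modes can be illustrated with a toy in-memory model. This is only a sketch of the visibility trade-off, not how Cosmos DB implements indexing. The `ToyContainer` class, its methods, and the `category` property are all invented for this example.

```python
from collections import defaultdict


class ToyContainer:
    """Toy model: an item store plus an index on the 'category' property."""

    def __init__(self, mode="consistent"):
        self.mode = mode
        self.items = {}
        self.index = defaultdict(set)  # category value -> set of item ids
        self.pending = []              # writes not yet indexed (lazy mode only)

    def insert(self, item):
        self.items[item["id"]] = item
        if self.mode == "consistent":
            self._index(item)          # index updated as part of the write
        else:
            self.pending.append(item)  # index update deferred

    def _index(self, item):
        self.index[item["category"]].add(item["id"])

    def catch_up(self):
        """Apply deferred index updates (stands in for the periodic refresh)."""
        while self.pending:
            self._index(self.pending.pop(0))

    def query_by_category(self, category):
        # Queries consult only the index, so unindexed writes are invisible.
        return sorted(self.index.get(category, set()))


consistent = ToyContainer("consistent")
consistent.insert({"id": "1", "category": "books"})
print(consistent.query_by_category("books"))  # ['1']

lazy = ToyContainer("lazy")
lazy.insert({"id": "1", "category": "books"})
before = lazy.query_by_category("books")
print(before)                                  # []
lazy.catch_up()
after = lazy.query_by_category("books")
print(after)                                   # ['1']
```

In the toy model the lazy write returns sooner because it skips the index update, and the query misses the item until `catch_up` runs.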
Querying Geospatial Data
Cosmos DB supports geospatial queries using the GeoJSON format. To enable efficient spatial queries, ensure your indexing policy includes spatial indexes. For example, to index a location property:
{
  "indexingMode": "consistent",
  "automatic": true,
  "includedPaths": [
    {"path": "/*"}
  ],
  "spatialIndexes": [
    {"path": "/location/*", "types": ["Point", "LineString", "Polygon"]}
  ]
}
You can then use functions like ST_DISTANCE and ST_WITHIN in your SQL queries.
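As a rough sketch of what such a query can look like, the helper below builds a parameterized ST_DISTANCE query for items whose location lies within a given distance of a point. `nearby_query` and the parameter names are hypothetical. The name/value parameter shape is assumed to follow the convention used by the Cosmos DB SDKs, and executing the query against a container is left out.

```python
def nearby_query(lon, lat, max_meters):
    """Build query text and parameters for items within max_meters of a point."""
    query = (
        "SELECT * FROM c "
        "WHERE ST_DISTANCE(c.location, @point) < @maxMeters"
    )
    parameters = [
        # GeoJSON points list longitude first, then latitude.
        {"name": "@point", "value": {"type": "Point", "coordinates": [lon, lat]}},
        # ST_DISTANCE is compared against a distance in meters here.
        {"name": "@maxMeters", "value": max_meters},
    ]
    return query, parameters


# Illustrative coordinates only.
query, parameters = nearby_query(lon=-122.12, lat=47.66, max_meters=5000)
print(query)
print(parameters)
```

Keeping the point and the distance as parameters avoids string concatenation and lets the same query text be reused for different locations.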
Use excludedPaths to prevent indexing of unnecessary data.
Managing Indexing
You can view and manage the indexing policy in the Azure portal's Data Explorer, under the "Scale & Settings" section of each container. For programmatic management, use the Azure SDKs for your preferred language.