Azure Cosmos DB: Querying Items

This tutorial walks through querying data in Azure Cosmos DB with the SQL API (now named the API for NoSQL), covering common scenarios and best practices.

Introduction to Querying in Cosmos DB

Azure Cosmos DB is a globally distributed, multi-model database service. Its powerful query capabilities allow you to retrieve specific data efficiently. The primary way to query data is by using SQL-like queries.

Understanding the SQL API

The SQL API for Azure Cosmos DB supports a rich set of SQL constructs, including SELECT statements, WHERE clauses, JOINs, aggregations, and more. You can execute these queries through various SDKs (e.g., .NET, Java, Python, Node.js) or the Azure portal.
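
As a rough illustration of the SDK route, the sketch below wraps a query call in Python. It assumes container is a container client from the azure-cosmos package, whose query_items method accepts query, parameters, and enable_cross_partition_query. The helper name run_query is made up for this example.

```python
def run_query(container, query, parameters=None):
    """Run a query and return all matching items as a list.

    `container` is assumed to be an azure-cosmos container client
    (any object with a compatible `query_items` method works).
    """
    results = container.query_items(
        query=query,
        parameters=parameters or [],
        # Assumption: the query is not scoped to a single partition key.
        enable_cross_partition_query=True,
    )
    # query_items returns an iterable that fetches pages lazily;
    # list() drains it into memory.
    return list(results)
```

With a real container client in hand, a call such as run_query(container, "SELECT * FROM c") would return the items as Python dictionaries. Draining everything into a list is convenient for small result sets; for large ones, iterating over the results directly is likely the better choice.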

Basic SELECT Queries

The most fundamental query is to select all items from a container:

SELECT * FROM c

This query returns every item in the container. The alias c is only a convention: any identifier can follow FROM, and it stands for each item in the container as the query is evaluated.

Filtering Data with WHERE Clauses

You can filter results with the WHERE clause. Suppose a container holds products and you want every product with a price greater than 50:

SELECT *
FROM c
WHERE c.price > 50

You can combine multiple conditions using AND and OR:

SELECT *
FROM c
WHERE c.category = "Electronics" AND c.inStock = true
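
When filter values come from user input, parameterized queries are usually safer than string concatenation. Below is a minimal sketch that builds the query text and a parameter list in the name/value shape used by the azure-cosmos Python SDK. The helper build_product_filter and the parameter names @category and @minPrice are hypothetical.

```python
def build_product_filter(category, min_price):
    """Return (query_text, parameters) for a category and price filter."""
    query = (
        "SELECT * FROM c "
        "WHERE c.category = @category AND c.price > @minPrice"
    )
    # Each parameter is a dict with the placeholder name and its value.
    parameters = [
        {"name": "@category", "value": category},
        {"name": "@minPrice", "value": min_price},
    ]
    return query, parameters


query, parameters = build_product_filter("Electronics", 50)
print(query)
print(parameters)
```

The query text stays constant while only the parameter values change, which keeps user-supplied values out of the query string itself.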

Selecting Specific Properties

Instead of retrieving the entire document, you can select only the properties you need:

SELECT c.id, c.name, c.price
FROM c
WHERE c.category = "Books"

Using Different Data Types in Queries

Cosmos DB handles various data types. Here's an example of querying based on a boolean property:

SELECT *
FROM c
WHERE c.isFeatured = true

And querying based on a string property:

SELECT *
FROM c
WHERE c.status = "Active"

Limiting Results and Ordering

You can order results with ORDER BY and cap the number returned with TOP. Cosmos DB does not accept a standalone LIMIT clause; LIMIT is only valid as part of OFFSET ... LIMIT:

SELECT TOP 10 *
FROM c
ORDER BY c.createdDate DESC

This query retrieves the 10 most recently created items.
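
Beyond a fixed cap, Cosmos DB supports an OFFSET ... LIMIT clause for paging through ordered results. The sketch below builds such a query for a given page. The helper page_query is hypothetical, and it inlines the two numbers only after validating them as integers.

```python
def page_query(page, page_size=10):
    """Build a query returning one page of items, newest first.

    `page` is zero-based. Both values are validated as integers before
    being placed in the query text, so no untrusted text reaches the query.
    """
    if not isinstance(page, int) or not isinstance(page_size, int):
        raise TypeError("page and page_size must be integers")
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    offset = page * page_size
    return (
        "SELECT * FROM c "
        "ORDER BY c.createdDate DESC "
        f"OFFSET {offset} LIMIT {page_size}"
    )


print(page_query(0))
print(page_query(2, page_size=25))
```

Skipped items still have to be read by the service, so large offsets tend to get more expensive; for deep paging, the SDK's continuation-based paging is generally the better fit.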

Working with Arrays and Nested Objects

Cosmos DB's query language can query data inside arrays and nested objects. To find items whose tags array contains "premium":

SELECT *
FROM c
WHERE ARRAY_CONTAINS(c.tags, "premium")

To query based on a nested property, e.g., the street name in an address object:

SELECT *
FROM c
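WHERE c.address.street = "Main St"

Tip: If an item has no address object, c.address.street evaluates to undefined rather than raising an error, so the item is simply left out of the results. Use IS_DEFINED(c.address) when you need to test for the property explicitly.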
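
To make the behavior of nested-property filters concrete, here is a small local emulation in plain Python (it does not call Cosmos DB). It mimics how a filter on c.address.street skips items that lack the property instead of failing. The sample items and the helper name filter_by_street are invented for illustration.

```python
def filter_by_street(items, street):
    """Emulate: SELECT * FROM c WHERE c.address.street = <street>."""
    matches = []
    for item in items:
        address = item.get("address")
        # A missing or non-object address behaves like "undefined":
        # the comparison is not true, so the item is skipped.
        if isinstance(address, dict) and address.get("street") == street:
            matches.append(item)
    return matches


items = [
    {"id": "1", "address": {"street": "Main St"}},
    {"id": "2", "address": {"street": "Oak Ave"}},
    {"id": "3"},  # no address at all
]
print([i["id"] for i in filter_by_street(items, "Main St")])  # ['1']
```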

Advanced Querying: Joins and Aggregations

Performing Joins

JOIN in Cosmos DB does not combine items from different containers, or even different items in the same container. It is a self-join within a single item: each item is paired with the elements of an array it contains, and there is no ON clause. To relate data across items, denormalize (embed the related data, which is often preferred in NoSQL) or issue separate queries. Here's an example that assumes each product item embeds an orders array:

SELECT p.name, o.orderNumber
FROM c p
JOIN o IN p.orders
WHERE p.category = "Electronics"

Note: The query returns one result per product and order pair, so a product with an empty or missing orders array produces no results.
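
The JOIN clause in Cosmos DB operates within a single item, pairing the item with each element of one of its arrays. The following plain-Python emulation (no Cosmos DB call) sketches that behavior for a query of the form SELECT p.name, o.orderNumber FROM c p JOIN o IN p.orders. The orders array and the sample data are assumptions made for this example.

```python
def join_orders(items):
    """Emulate an intra-item JOIN over each item's `orders` array."""
    rows = []
    for product in items:
        # Items with a missing or empty orders array contribute no rows.
        for order in product.get("orders", []):
            rows.append(
                {"name": product["name"], "orderNumber": order["orderNumber"]}
            )
    return rows


items = [
    {"name": "Laptop", "orders": [{"orderNumber": "A1"}, {"orderNumber": "A2"}]},
    {"name": "Phone", "orders": []},
    {"name": "Cable"},  # no orders array
]
for row in join_orders(items):
    print(row)
```

Only the Laptop item yields rows here, one per embedded order, which is why JOIN is best thought of as flattening an array rather than relating separate items.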

Aggregations

Cosmos DB supports aggregation functions like COUNT, SUM, AVG, MIN, and MAX, often used with GROUP BY:

SELECT c.category, COUNT(1) as itemCount
FROM c
GROUP BY c.category

This query counts the number of items in each category.
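
To show the shape of the grouped result, here is a plain-Python emulation of the query above (no Cosmos DB call). It assumes every item has a category; as a simplification of this sketch, items without one are ignored. The helper name count_by_category and the sample data are hypothetical.

```python
from collections import Counter


def count_by_category(items):
    """Emulate: SELECT c.category, COUNT(1) AS itemCount FROM c GROUP BY c.category."""
    # Simplification: items lacking a category are skipped in this sketch.
    counts = Counter(item["category"] for item in items if "category" in item)
    return [
        {"category": category, "itemCount": count}
        for category, count in counts.items()
    ]


items = [
    {"id": "1", "category": "Books"},
    {"id": "2", "category": "Electronics"},
    {"id": "3", "category": "Books"},
]
for row in count_by_category(items):
    print(row)
```

Each output row carries the grouping key and its count, mirroring the two columns named in the SELECT clause.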

Important: Aggregations can consume more request units (RUs) than simple lookups, especially when they span many partitions. If performance is critical, design your data model to minimize the need for complex on-the-fly aggregations.

Using the Azure Portal for Querying

The Azure portal provides a user-friendly interface for querying your data. Navigate to your Cosmos DB account, open Data Explorer, select your container, and open a new query tab. You can then write and run SQL queries directly in the portal.

Conclusion

Querying in Azure Cosmos DB is flexible and powerful. By mastering the SQL API, you can efficiently retrieve and shape your data. Remember to optimize your queries and data models for the best performance and scalability.