Querying Tables and Entities
Azure Table Storage allows you to efficiently query tables and retrieve specific entities based on various criteria. This section covers the methods and best practices for constructing powerful queries.
Understanding Query Capabilities
Table Storage queries are designed for high throughput and low latency, particularly for retrieving entities with specific PartitionKey and RowKey values. You can perform the following types of queries:
- Point Queries: Retrieve a single entity by its PartitionKey and RowKey. This is the most efficient type of query.
- Range Queries: Retrieve multiple entities within a specific PartitionKey, using filters on the RowKey.
- Partition Scans: Retrieve entities from a single partition using filters on other properties.
- Cross-Partition Queries: Retrieve entities from multiple partitions. These are less efficient and should be used judiciously.
Constructing Filter Expressions
Filter expressions allow you to specify criteria for retrieving entities. These filters are applied server-side, reducing the amount of data transferred over the network. You can filter on any properties of your entities, except for the PartitionKey and RowKey when performing cross-partition queries.
Supported operators include:
- Comparison Operators:
eq(equal),ne(not equal),gt(greater than),ge(greater than or equal to),lt(less than),le(less than or equal to). - Logical Operators:
and,or. - Arithmetic Operators:
add,sub,div,mul. - String Functions:
substringof,startswith,endswith,length,concat. - Date/Time Functions:
year,month,day,hour,minute,second,round,floor,ceiling. - String Manipulation:
tolower,toupper.
Example Filter Expressions:
Let's assume we have entities with properties like Category, Price, and Timestamp.
Retrieve all entities in the 'Electronics' category:
Category eq 'Electronics'
Retrieve entities with a Price greater than 100.0 and in the 'Books' category:
Price gt 100.0 and Category eq 'Books'
Retrieve entities where the Timestamp is within a specific date range:
Timestamp ge '2023-01-01T00:00:00Z' and Timestamp lt '2023-01-31T23:59:59Z'
Querying with SDKs
Azure SDKs for various languages (e.g., .NET, Java, Python, Node.js) provide convenient methods for constructing and executing queries against Table Storage.
.NET Example:
using Microsoft.Azure.Cosmos.Table;
// Assume 'cloudTable' is an instance of CloudTable
// Construct a query
TableQuery<DynamicTableEntity> query = new TableQuery<DynamicTableEntity>()
.Where(TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, "Products"));
// Execute the query
TableContinuationToken token = null;
do
{
TableQuerySegment<DynamicTableEntity> resultSegment = await cloudTable.ExecuteQuerySegmentedAsync(query, token);
foreach (DynamicTableEntity entity in resultSegment.Results)
{
// Process entity
Console.WriteLine($"RowKey: {entity.RowKey}");
}
token = resultSegment.ContinuationToken;
} while (token != null);
Python Example:
from azure.cosmosdb.table.tableservice import TableService
from azure.cosmosdb.table.query import Query
table_service = TableService(account_name='your_account_name', account_key='your_account_key')
# Construct a query
query = Query("Products").filter("Category eq 'Electronics'")
# Execute the query
entities = table_service.query_entities("YourTableName", query)
for entity in entities:
print(f"RowKey: {entity.RowKey}")
Performance Considerations
- PartitionKey and RowKey: Always include
PartitionKeyandRowKeyin your filters when possible. Queries filtered byPartitionKeyare significantly faster than cross-partition queries. - Indexing: Table Storage automatically indexes
PartitionKeyandRowKey. For other properties, consider denormalization or using Azure Cosmos DB for more advanced indexing capabilities. - OData Filters: Use OData filter expressions for server-side filtering to minimize data transfer.
- Top Clause: Limit the number of returned entities using the
Topclause if you don't need all matching results. - Projections: Select only the properties you need to reduce payload size.
Key OData Query Options
$filter: Specifies the filter expression for retrieving entities.$select: Specifies the properties to return for each entity.$top: Specifies the maximum number of entities to return.$orderby: Specifies the order in which entities are returned. (Note: Orderby is only supported on a single property and requires a single partition key filter).
Best Practices for Querying
- Design your table schema with querying in mind. Choose appropriate PartitionKeys and RowKeys.
- Favor point and range queries on PartitionKey and RowKey.
- When scanning a partition, filter on other properties to narrow down results.
- Avoid cross-partition queries unless absolutely necessary, and implement pagination.
- Use SDKs to leverage their built-in query construction and execution capabilities.
- Test your queries thoroughly to ensure optimal performance.