Introduction to Azure Storage Tables
Azure Table Storage is a NoSQL key-attribute store that allows you to store large amounts of structured, non-relational data. It's a cost-effective and scalable solution for many application needs. In this tutorial, you will learn how to perform basic queries against your Azure Table Storage data.
Table Storage is schemaless, meaning entities within a table do not need to share the same set of properties. This flexibility makes it ideal for scenarios where data structures can evolve or vary.
Prerequisites
- An Azure account. If you don't have one, sign up for a free account.
- An Azure Storage account. You can create one via the Azure portal or Azure CLI.
- The Azure SDK for your preferred language (e.g., Python, .NET, Java, Node.js).
Understanding Table Storage Concepts
Before diving into queries, it's essential to understand the core concepts:
- Table: A collection of entities. A storage account can contain an arbitrary number of tables.
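- Entity: A set of properties, similar to a row in a database. An entity can contain up to 255 properties, including the three system properties PartitionKey, RowKey, and Timestamp, and can be up to 1 MiB in size.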
- Properties: Name-value pairs. Each property name must be a string, and each property value can be of a primitive data type (String, Int32, Int64, Boolean, Double, DateTime, Guid, Binary, etc.).
- PartitionKey: A string that identifies a set of entities within a table. Entities with the same PartitionKey are co-located on the same storage node, which optimizes query performance for entities within the same partition.
- RowKey: A string that uniquely identifies an entity within a partition. The combination of PartitionKey and RowKey must be unique for each entity in a table.
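To make the key model concrete, here is a minimal in-memory sketch. It is not how the service is implemented; the `InMemoryTable` class and its `insert` and `get` methods are hypothetical names invented for illustration. It only shows that (PartitionKey, RowKey) acts as a composite unique key while the remaining properties may differ from entity to entity.

```python
class InMemoryTable:
    """Toy stand-in for a table (hypothetical, for illustration only)."""

    def __init__(self):
        # Entities are indexed by the composite key (PartitionKey, RowKey).
        self._entities = {}

    def insert(self, entity):
        key = (entity["PartitionKey"], entity["RowKey"])
        if key in self._entities:
            raise ValueError(f"Entity {key} already exists")
        self._entities[key] = dict(entity)

    def get(self, partition_key, row_key):
        return self._entities[(partition_key, row_key)]


table = InMemoryTable()
# Entities in the same table do not need the same set of properties.
table.insert({"PartitionKey": "Partition1", "RowKey": "Row1", "Status": "Active"})
table.insert({"PartitionKey": "Partition1", "RowKey": "Row2", "Email": "a@example.com"})
# The same RowKey is allowed in a different partition.
table.insert({"PartitionKey": "Partition2", "RowKey": "Row1", "Count": 3})

# Reusing an existing (PartitionKey, RowKey) pair is rejected.
try:
    table.insert({"PartitionKey": "Partition1", "RowKey": "Row1"})
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)
```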
Performing Basic Queries
The primary ways to query entities in Azure Table Storage are:
- Retrieving a single entity: Using its PartitionKey and RowKey.
- Querying entities within a partition: Filtering by PartitionKey and other properties.
- Querying across partitions: This is less efficient and generally discouraged unless necessary.
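The three query shapes above differ only in which keys the query pins down. The sketch below labels a query accordingly; `classify_query` is a hypothetical helper written for this tutorial, not part of any Azure SDK, and the labels are informal.

```python
def classify_query(partition_key=None, row_key=None):
    """Return an informal label for a query based on the keys it specifies."""
    if partition_key is not None and row_key is not None:
        # Both keys known: a single entity can be addressed directly.
        return "point query"
    if partition_key is not None:
        # Only the partition is known: the work stays inside one partition.
        return "partition query"
    # No PartitionKey (even if a RowKey is given): every partition is searched.
    return "table scan"


print(classify_query("Partition1", "Row1"))
print(classify_query("Partition1"))
print(classify_query(row_key="Row1"))
```

A query that specifies only a RowKey still falls into the last category, because the service cannot tell which partition holds the entity.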
1. Retrieving a Single Entity
This is the most efficient way to retrieve data, as it uses both the PartitionKey and RowKey for direct access.
from azure.core.exceptions import ResourceNotFoundError
from azure.data.tables import TableServiceClient
# Replace with your connection string and table name
connection_string = "YOUR_AZURE_STORAGE_CONNECTION_STRING"
table_name = "MySampleTable"
partition_key_to_find = "Partition1"
row_key_to_find = "Row1"
try:
    table_service_client = TableServiceClient.from_connection_string(connection_string)
    table_client = table_service_client.get_table_client(table_name=table_name)
    # A point lookup: both keys are supplied, so the entity is addressed directly
    entity = table_client.get_entity(partition_key=partition_key_to_find, row_key=row_key_to_find)
    print(f"Retrieved Entity: {entity}")
except ResourceNotFoundError:
    # Raised when no entity has this PartitionKey/RowKey combination
    print("Entity not found.")
except Exception as ex:
    print(f"An error occurred: {ex}")
2. Querying Entities within a Partition
You can retrieve all entities within a specific partition, or filter them further using property filters. Using PartitionKey in your query is highly recommended for performance.
Retrieving all entities in a partition:
from azure.data.tables import TableServiceClient
# ... (connection string, table name setup as above) ...
partition_key_to_query = "Partition1"
try:
    table_service_client = TableServiceClient.from_connection_string(connection_string)
    table_client = table_service_client.get_table_client(table_name=table_name)
    # Query for all entities with a specific PartitionKey
    query_filter = f"PartitionKey eq '{partition_key_to_query}'"
    # The keyword is query_filter in azure-data-tables 12.x
    entities_in_partition = table_client.query_entities(query_filter=query_filter)
    print(f"Entities in Partition '{partition_key_to_query}':")
    for entity in entities_in_partition:
        print(entity)
except Exception as ex:
    print(f"An error occurred: {ex}")
Filtering entities within a partition by property:
You can apply filters to properties such as strings, numbers, and dates. The OData filter syntax is used.
from azure.data.tables import TableServiceClient
# ... (connection string, table name setup as above) ...
partition_key_to_query = "Partition1"
property_name = "Status"
property_value = "Active"
try:
    table_service_client = TableServiceClient.from_connection_string(connection_string)
    table_client = table_service_client.get_table_client(table_name=table_name)
    # Example: Get entities where Status is 'Active' within Partition1
    # Note the use of single quotes around string values
    query_filter = f"PartitionKey eq '{partition_key_to_query}' and {property_name} eq '{property_value}'"
    # The keyword is query_filter in azure-data-tables 12.x
    filtered_entities = table_client.query_entities(query_filter=query_filter)
    print(f"Filtered Entities in Partition '{partition_key_to_query}' where {property_name} is '{property_value}':")
    for entity in filtered_entities:
        print(entity)
except Exception as ex:
    print(f"An error occurred: {ex}")
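Building filters with f-strings breaks as soon as a value contains a single quote. One alternative, sketched below, is to pass values separately through the `parameters` keyword of `query_entities` and refer to them in the filter as `@name` placeholders. This assumes a recent azure-data-tables 12.x release, so check it against the version you have installed. `query_by_status` is a hypothetical helper, and `StubTableClient` is a stand-in that only exists so the sketch runs without a storage account.

```python
def query_by_status(table_client, partition_key, status):
    """Return entities in one partition whose Status matches.

    Assumes table_client behaves like azure.data.tables.TableClient.
    """
    # @pk and @status are placeholders; the values travel separately.
    query_filter = "PartitionKey eq @pk and Status eq @status"
    parameters = {"pk": partition_key, "status": status}
    return list(
        table_client.query_entities(query_filter=query_filter, parameters=parameters)
    )


class StubTableClient:
    """Stand-in for a real TableClient (hypothetical, records what it receives)."""

    def query_entities(self, query_filter, **kwargs):
        self.last_filter = query_filter
        self.last_parameters = kwargs.get("parameters")
        return iter(
            [{"PartitionKey": "Partition1", "RowKey": "Row1", "Status": "Active"}]
        )


stub = StubTableClient()
results = query_by_status(stub, "Partition1", "Active")
print(results)
```

With a real client, replace `stub` with the `table_client` obtained from `TableServiceClient.get_table_client` as in the examples above.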
3. Query Operators and Filter Syntax
Table Storage supports various OData operators for filtering:
Operator | Description | Example |
---|---|---|
eq | Equal to | PartitionKey eq 'PK' |
ne | Not equal to | Status ne 'Deleted' |
gt | Greater than | Timestamp gt datetime'2023-01-01T00:00:00Z' |
ge | Greater than or equal to | Count ge 10 |
lt | Less than | Price lt 100.50 |
le | Less than or equal to | Date le datetime'2023-12-31T23:59:59Z' |
and | Logical AND | PartitionKey eq 'PK' and Status eq 'Active' |
or | Logical OR | Status eq 'Active' or Status eq 'Pending' |
not | Logical NOT | not (Status eq 'Archived') |
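When several operators are combined, explicit parentheses keep the intent unambiguous, since `and` binds more tightly than `or`. The following sketch composes the example clauses from the table; `all_of` and `any_of` are hypothetical helper names, not SDK functions.

```python
def all_of(*clauses):
    """Join clauses with 'and', wrapping each one in parentheses."""
    return " and ".join(f"({clause})" for clause in clauses)


def any_of(*clauses):
    """Join clauses with 'or', wrapping each one in parentheses."""
    return " or ".join(f"({clause})" for clause in clauses)


status_clause = any_of("Status eq 'Active'", "Status eq 'Pending'")
query_filter = all_of("PartitionKey eq 'PK'", status_clause, "Count ge 10")
print(query_filter)
# (PartitionKey eq 'PK') and ((Status eq 'Active') or (Status eq 'Pending')) and (Count ge 10)
```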
Note on Data Types in Filters:
- Strings require single quotes: 'MyString'.
- Numbers (Int32, Double) do not require quotes: 123, 3.14. Int64 literals take an L suffix: 1234567890123L.
- Booleans are true or false.
- Dates require the datetime literal and ISO 8601 format: datetime'2023-10-27T10:00:00Z'.
- Guids require the guid literal: guid'123e4567-e89b-12d3-a456-426614174000'.
- Binary data requires the X (or binary) literal with hex-encoded bytes: X'010203'.
4. Selecting Specific Properties (Projection)
To improve performance and reduce network traffic, you can specify which properties to retrieve for each entity. This is called projection.
from azure.data.tables import TableServiceClient
# ... (connection string, table name setup as above) ...
partition_key_to_query = "Partition1"
try:
    table_service_client = TableServiceClient.from_connection_string(connection_string)
    table_client = table_service_client.get_table_client(table_name=table_name)
    # Query for entities in Partition1, but only retrieve 'Name' and 'Email' properties
    query_filter = f"PartitionKey eq '{partition_key_to_query}'"
    # The 'select' parameter takes a list of property names
    selected_entities = table_client.query_entities(query_filter=query_filter, select=["Name", "Email"])
    print(f"Projected Entities (Name, Email) in Partition '{partition_key_to_query}':")
    for entity in selected_entities:
        print(f"Name: {entity.get('Name')}, Email: {entity.get('Email')}")  # Use .get() for safety
except Exception as ex:
    print(f"An error occurred: {ex}")
Whether PartitionKey and RowKey are returned when they are not in the select list can depend on the SDK and its version, so add them to the list explicitly if your code relies on them.
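As a rough client-side picture of what projection does to each entity, the sketch below trims plain dictionaries to a chosen set of properties. It is an illustration only, not the service's implementation. `project` is a hypothetical helper, and filling a missing property with `None` is a choice made here to show why the loop above reads values with `.get()`.

```python
def project(entity, select):
    """Keep only the selected properties; missing ones become None."""
    return {name: entity.get(name) for name in select}


entities = [
    {"PartitionKey": "Partition1", "RowKey": "Row1", "Name": "Ada", "Email": "ada@example.com", "Age": 36},
    # This entity has no Email property, which a schemaless table allows.
    {"PartitionKey": "Partition1", "RowKey": "Row2", "Name": "Grace"},
]

projected = [project(entity, ["Name", "Email"]) for entity in entities]
for item in projected:
    print(item)
```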
5. Querying Across Partitions (Less Efficient)
While possible, querying entities across multiple partitions is less efficient than partition-specific queries. Table Storage performs a table scan in this case, which can be slow for large tables.
from azure.data.tables import TableServiceClient
# ... (connection string, table name setup as above) ...
property_name = "CreatedDate"
# Example: Find entities with CreatedDate on or after a certain point, across the entire table
# This query is a table scan and can be slow.
query_filter = f"{property_name} ge datetime'2023-01-01T00:00:00Z'"
try:
    table_service_client = TableServiceClient.from_connection_string(connection_string)
    table_client = table_service_client.get_table_client(table_name=table_name)
    all_entities_filtered = table_client.query_entities(query_filter=query_filter)
    print(f"Entities across all partitions where {property_name} is on or after 2023-01-01:")
    for entity in all_entities_filtered:
        print(entity)
except Exception as ex:
    print(f"An error occurred: {ex}")
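The cost difference can be pictured with a small in-memory model that counts how many entities each kind of query has to examine. This is a deliberately simplified sketch, not a description of the service internals; `partition_query`, `table_scan`, and the sample data are hypothetical.

```python
from collections import defaultdict

# Four partitions with three entities each.
entities = [
    {"PartitionKey": f"Partition{p}", "RowKey": f"Row{r}", "Value": p * 10 + r}
    for p in range(1, 5)
    for r in range(1, 4)
]

# Group entities by PartitionKey, mimicking co-located storage.
by_partition = defaultdict(list)
for entity in entities:
    by_partition[entity["PartitionKey"]].append(entity)


def partition_query(partition_key, predicate):
    """Examine only the entities of one partition."""
    candidates = by_partition[partition_key]
    matches = [e for e in candidates if predicate(e)]
    return matches, len(candidates)


def table_scan(predicate):
    """Examine every entity in every partition."""
    matches, examined = [], 0
    for group in by_partition.values():
        for e in group:
            examined += 1
            if predicate(e):
                matches.append(e)
    return matches, examined


def at_least_12(entity):
    return entity["Value"] >= 12


in_partition, examined_in_partition = partition_query("Partition1", at_least_12)
across_table, examined_in_scan = table_scan(at_least_12)
print(len(in_partition), examined_in_partition)
print(len(across_table), examined_in_scan)
```

In this toy model the partition query examines 3 entities and the scan examines all 12, and the gap grows with the number of partitions.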
Next Steps
You've learned the basics of querying Azure Storage Tables. To further enhance your skills, consider exploring:
- More advanced OData filtering techniques.
- Batch operations for more efficient data manipulation.
- Paging with continuation tokens for handling large result sets.
- Integrating Table Storage with other Azure services like Azure Functions and Azure Cosmos DB.
Continue your learning journey with the official Azure Table Storage documentation.