Using Azure Storage Tables with Python

Introduction to Azure Storage Tables and Python

Azure Table storage is a NoSQL key-attribute store that can store a large amount of structured, non-relational data. It is accessible from anywhere in the world via HTTP or HTTPS. A table in Azure Table storage is a collection of entities, and each entity is a collection of properties. Each entity must have two properties that serve as its primary key: PartitionKey and RowKey. Together, these two properties uniquely identify an entity within a table.

This document will guide you through using the Azure SDK for Python to interact with Azure Table storage. We will cover installation, authentication, creating and managing entities, and querying your data.

Installation

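To get started, install the Azure Table storage client library for Python with pip. The Azure Identity examples later in this guide import azure.identity, so install the azure-identity package as well:

pip install azure-data-tables azure-identity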

Authentication

You can authenticate to Azure Table storage using several methods. The most common methods are:

  • Connection String: A connection string provides all the necessary information to connect to your storage account.
  • Azure Identity Library: For more secure and flexible authentication, especially in production environments, use the Azure Identity library, which supports various credential types such as DefaultAzureCredential.

Recommendation: For production applications, use the Azure Identity library. A connection string embeds the storage account key, which grants full access to the account, so it is best kept to local development.
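
One way to follow this recommendation is to pick the authentication method from configuration instead of hard-coding it. The sketch below is only an illustration under stated assumptions: the helper name resolve_auth and the variable AZURE_STORAGE_ACCOUNT_NAME are hypothetical, and AZURE_STORAGE_CONNECTION_STRING is simply a conventional name. It decides which method to use and leaves building the client to the calls shown in the next two sections.

```python
from typing import Mapping, Tuple


def resolve_auth(env: Mapping[str, str]) -> Tuple[str, str]:
    """Decide how to authenticate, based on a mapping such as os.environ.

    Returns ("identity", account_url) or ("connection_string", value).
    The variable names used here are assumptions made for this sketch.
    """
    account_name = env.get("AZURE_STORAGE_ACCOUNT_NAME")
    if account_name:
        # Preferred: configuration holds only the endpoint, no secret.
        return "identity", f"https://{account_name}.table.core.windows.net/"

    connection_str = env.get("AZURE_STORAGE_CONNECTION_STRING")
    if connection_str:
        # Fallback, for example during local development.
        return "connection_string", connection_str

    raise RuntimeError("No storage account name or connection string configured")


# Example with an explicit mapping instead of the real environment.
method, value = resolve_auth({"AZURE_STORAGE_ACCOUNT_NAME": "mystorageaccount"})
print(method, value)
```

In an application you would pass os.environ and then construct the client with whichever method was returned.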

Using a Connection String

You can retrieve your connection string from the Azure portal under your storage account's "Access keys" section.


from azure.data.tables import TableServiceClient

# Replace with your actual connection string
connection_str = "YOUR_CONNECTION_STRING"
table_service_client = TableServiceClient.from_connection_string(connection_str)
                

Using Azure Identity


from azure.data.tables import TableServiceClient
from azure.identity import DefaultAzureCredential

# Replace with your storage account name
account_url = "https://YOUR_STORAGE_ACCOUNT_NAME.table.core.windows.net/"
credential = DefaultAzureCredential()

table_service_client = TableServiceClient(endpoint=account_url, credential=credential)
                

Working with Entities

Entities in Azure Table storage are represented as Python dictionaries. Each entity must have at least PartitionKey and RowKey. Other properties can be of various types, including strings, integers, booleans, dates, and binary data.
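
Because PartitionKey and RowKey are mandatory, it can help to check an entity dictionary before sending it. The following is a minimal sketch and not part of the SDK: check_entity_keys is a hypothetical helper, and the rule it encodes (key values are strings that must not contain /, \, #, ?, or control characters) follows the key restrictions described in the Table storage data model documentation.

```python
from datetime import datetime, timezone

FORBIDDEN_KEY_CHARS = set("/\\#?")


def check_entity_keys(entity: dict) -> None:
    """Raise ValueError if PartitionKey or RowKey is missing or malformed."""
    for key_name in ("PartitionKey", "RowKey"):
        value = entity.get(key_name)
        if not isinstance(value, str):
            raise ValueError(f"{key_name} must be present and must be a string")
        for ch in value:
            is_control = ord(ch) <= 0x1F or 0x7F <= ord(ch) <= 0x9F
            if ch in FORBIDDEN_KEY_CHARS or is_control:
                raise ValueError(f"{key_name} contains a forbidden character: {ch!r}")


# An entity mixing several of the property types mentioned above.
entity = {
    "PartitionKey": "users",
    "RowKey": "user1",
    "name": "Alice Smith",
    "age": 30,
    "active": True,
    "joined": datetime(2024, 1, 15, tzinfo=timezone.utc),
    "avatar": b"\x89PNG",
}
check_entity_keys(entity)  # passes silently for a well-formed entity
```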

Creating an Entity

You can create an entity using the create_entity method, which fails if an entity with the same PartitionKey and RowKey already exists. You'll need a table client, which you can obtain from the TableServiceClient.


# Create the table if it doesn't already exist and get a client for it.
# (get_table_client alone does not create the table, and create_table
# raises an error when the table already exists.)
table_client = table_service_client.create_table_if_not_exists(table_name="mytable")

entity = {
    "PartitionKey": "users",
    "RowKey": "user1",
    "name": "Alice Smith",
    "email": "alice.smith@example.com",
    "age": 30
}

# create_entity returns operation metadata (such as the etag), not the entity itself.
metadata = table_client.create_entity(entity)
print(f"Created entity, metadata: {metadata}")
                

Upserting an Entity

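The upsert_entity operation creates an entity if it doesn't exist, or updates it if it does. By default the update is a merge: the properties you supply are added to or overwrite those already stored, and properties you omit are left unchanged.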


updated_entity_data = {
    "PartitionKey": "users",
    "RowKey": "user1",
    "email": "alice.s@example.com",
    "city": "New York"
}

# upsert_entity returns operation metadata (such as the etag), not the stored entity.
metadata = table_client.upsert_entity(updated_entity_data)
print(f"Upserted entity, metadata: {metadata}")
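
Merge and replace are the two update modes the service offers, and they treat omitted properties differently. The sketch below never calls the service; it only imitates the two behaviours with plain dictionaries so the difference is visible. The function names are hypothetical.

```python
def merge_entity(stored: dict, incoming: dict) -> dict:
    """Merge: incoming properties overwrite or extend the stored ones."""
    return {**stored, **incoming}


def replace_entity(stored: dict, incoming: dict) -> dict:
    """Replace: the stored entity is discarded; only incoming properties remain."""
    return dict(incoming)


stored = {"PartitionKey": "users", "RowKey": "user1",
          "name": "Alice Smith", "email": "alice.smith@example.com", "age": 30}
incoming = {"PartitionKey": "users", "RowKey": "user1",
            "email": "alice.s@example.com", "city": "New York"}

merged = merge_entity(stored, incoming)
replaced = replace_entity(stored, incoming)

print(sorted(merged))    # name and age are kept, city is added
print(sorted(replaced))  # name and age are gone
```

With the real client, the mode can be chosen by passing mode=UpdateMode.MERGE or mode=UpdateMode.REPLACE (UpdateMode is imported from azure.data.tables) to upsert_entity.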
                

Querying Entities

You can query entities using the query_entities method, passing an OData filter expression as the query_filter argument for targeted results. To read every entity in a table, use list_entities instead.

Retrieving All Entities in a Table


all_entities = table_client.list_entities()
for entity in all_entities:
    print(entity)
                

Querying by Partition Key


partition_key_filter = "PartitionKey eq 'users'"
user_entities = table_client.query_entities(query_filter=partition_key_filter)
for entity in user_entities:
    print(entity)
                

Querying by Partition and Row Key


partition_key_filter = "PartitionKey eq 'users'"
row_key_filter = "RowKey eq 'user1'"
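# The filter argument is named query_filter; the result is an iterable with at most one match here.
matching_entities = table_client.query_entities(query_filter=f"{partition_key_filter} and {row_key_filter}")
for entity in matching_entities: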
    print(entity)
                

Advanced Filtering

You can use OData filter expressions for more complex queries. For example, to find users older than 25:


filter_expression = "PartitionKey eq 'users' and age gt 25"
older_users = table_client.query_entities(query_filter=filter_expression)
for user in older_users:
    print(user)
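
When filter values come from user input, string literals need care: in OData a single quote inside a string is escaped by doubling it. The helpers below are a small sketch of building such filters safely; their names are hypothetical and they are not part of the SDK.

```python
def quote_odata_string(value: str) -> str:
    """Wrap a string in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def users_older_than(partition: str, min_age: int) -> str:
    """Build a filter on PartitionKey and a numeric age property."""
    return f"PartitionKey eq {quote_odata_string(partition)} and age gt {int(min_age)}"


print(users_older_than("users", 25))
# PartitionKey eq 'users' and age gt 25
print(quote_odata_string("O'Brien"))
# 'O''Brien'
```

query_entities also accepts a parameters dictionary together with @name placeholders in the filter (for example "PartitionKey eq @pk" with parameters={"pk": "users"}), which leaves the quoting to the library and is usually preferable to building strings by hand.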
                

Performing Operations

You can also perform other operations like retrieving, updating, and deleting entities.

Retrieving a Single Entity


retrieved_entity = table_client.get_entity(partition_key="users", row_key="user1")
print(f"Retrieved entity: {retrieved_entity}")
                

Updating an Entity

Use update_entity to modify an existing entity. This operation fails if the entity does not exist.


entity_to_update = {
    "PartitionKey": "users",
    "RowKey": "user1",
    "age": 31,
    "occupation": "Engineer"
}

# update_entity returns operation metadata (such as the etag), not the updated entity.
metadata = table_client.update_entity(entity_to_update)
print(f"Updated entity, metadata: {metadata}")
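
Because update_entity fails for a missing entity, callers often want to tell that case apart from other errors. The following is a minimal sketch, assuming the failure surfaces as ResourceNotFoundError from azure.core.exceptions. The helper name update_if_exists is hypothetical, and the fallback class exists only so the snippet can run without the SDK installed.

```python
try:
    from azure.core.exceptions import ResourceNotFoundError
except ImportError:
    # Stand-in so the sketch can be run without the SDK installed.
    class ResourceNotFoundError(Exception):
        pass


def update_if_exists(table_client, entity: dict) -> bool:
    """Return True if the entity was updated, False if it does not exist.

    table_client is expected to be a TableClient (or anything with a
    compatible update_entity method). Other errors are not swallowed.
    """
    try:
        table_client.update_entity(entity)
        return True
    except ResourceNotFoundError:
        return False
```

A call such as update_if_exists(table_client, entity_to_update) then lets the caller decide whether to create the entity, log the miss, or ignore it.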
                

Deleting an Entity


table_client.delete_entity(partition_key="users", row_key="user1")
print("Entity deleted successfully.")
                

Deleting a Table


# Be careful with this operation!
# table_client.delete_table()
# print("Table deleted successfully.")
                

Full Code Example

Here's a comprehensive example demonstrating the common operations:



import os
from azure.data.tables import TableServiceClient
from azure.identity import DefaultAzureCredential

def main():
    # --- Configuration ---
    # For production, use DefaultAzureCredential (for example with a managed identity)
    # For local development, you might use a connection string (store it securely!)
    # connection_str = os.environ.get("AZURE_STORAGE_CONNECTION_STRING")
    # table_service_client = TableServiceClient.from_connection_string(connection_str)

    account_url = "https://YOUR_STORAGE_ACCOUNT_NAME.table.core.windows.net/" # Replace with your storage account name
    credential = DefaultAzureCredential()
    table_service_client = TableServiceClient(endpoint=account_url, credential=credential)

    table_name = "pythondocstable"

    # --- Create Table (if it doesn't exist) ---
    # create_table_if_not_exists returns a TableClient whether or not the
    # table already existed, so no error-message matching is needed.
    try:
        table_client = table_service_client.create_table_if_not_exists(table_name=table_name)
        print(f"Table '{table_name}' is ready.")
    except Exception as e:
        # Without a usable table there is nothing more to demonstrate.
        print(f"Error creating table: {e}")
        return

    # --- Create Entities ---
    print("\n--- Creating Entities ---")
    entity1 = {
        "PartitionKey": "products",
        "RowKey": "sku1001",
        "name": "Laptop",
        "price": 1200.50,
        "inStock": True
    }
    entity2 = {
        "PartitionKey": "products",
        "RowKey": "sku1002",
        "name": "Keyboard",
        "price": 75.00,
        "inStock": True
    }
    entity3 = {
        "PartitionKey": "customers",
        "RowKey": "cust001",
        "name": "John Doe",
        "email": "john.doe@example.com",
        "city": "London"
    }

    try:
        table_client.upsert_entity(entity1)
        print("Upserted entity1.")
        table_client.upsert_entity(entity2)
        print("Upserted entity2.")
        table_client.upsert_entity(entity3)
        print("Upserted entity3.")
    except Exception as e:
        print(f"Error upserting entities: {e}")

    # --- Query Entities ---
    print("\n--- Querying Entities ---")
    print("All entities:")
    for entity in table_client.list_entities():
        print(entity)

    print("\nProducts only:")
    product_filter = "PartitionKey eq 'products'"
    for entity in table_client.query_entities(query_filter=product_filter):
        print(entity)

    print("\nProducts with price > 100:")
    expensive_products_filter = "PartitionKey eq 'products' and price gt 100"
    for entity in table_client.query_entities(query_filter=expensive_products_filter):
        print(entity)

    # --- Get Specific Entity ---
    print("\n--- Getting Specific Entity ---")
    try:
        retrieved = table_client.get_entity(partition_key="products", row_key="sku1001")
        print(f"Retrieved sku1001: {retrieved}")
    except Exception as e:
        print(f"Error retrieving entity: {e}")

    # --- Update Entity ---
    print("\n--- Updating Entity ---")
    try:
        entity_to_update = {
            "PartitionKey": "customers",
            "RowKey": "cust001",
            "email": "john.d@example.com",
            "country": "UK"
        }
        # update_entity returns operation metadata, not the updated entity.
        metadata = table_client.update_entity(entity_to_update)
        print(f"Updated customer cust001, metadata: {metadata}")
    except Exception as e:
        print(f"Error updating entity: {e}")

    # --- Delete Entity ---
    print("\n--- Deleting Entity ---")
    try:
        table_client.delete_entity(partition_key="products", row_key="sku1002")
        print("Deleted product sku1002.")
    except Exception as e:
        print(f"Error deleting entity: {e}")

    # --- Verify Deletion ---
    print("\nEntities after deletion:")
    for entity in table_client.list_entities():
        print(entity)

    # --- Delete Table (Optional - use with caution) ---
    # print("\n--- Deleting Table ---")
    # try:
    #     table_service_client.delete_table(table_name)
    #     print(f"Table '{table_name}' deleted successfully.")
    # except Exception as e:
    #     print(f"Error deleting table: {e}")

if __name__ == "__main__":
    main()
                    

Conclusion

Azure Table storage, combined with the Azure SDK for Python, provides a scalable way to store and manage structured, non-relational data. This guide has covered the essential steps to get you started, from setting up your environment to performing basic CRUD operations and querying your data. Consult the official Azure documentation for more advanced features and best practices.

Tip: For performance optimization, consider how you design your PartitionKey and RowKey to distribute your data effectively and support your common query patterns.
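
To make the tip concrete: keys are strings and entities come back ordered by PartitionKey and then RowKey, compared as text, so numeric or time-based keys usually need a fixed-width encoding to sort as intended. The sketch below shows two common conventions. The function names, the width of 10 digits, and the upper bound used for inversion are all assumptions chosen for illustration.

```python
def padded_row_key(sequence_number: int, width: int = 10) -> str:
    """Zero-pad a number so that string order matches numeric order."""
    return f"{sequence_number:0{width}d}"


def newest_first_row_key(epoch_seconds: int, max_value: int = 9_999_999_999) -> str:
    """Invert a timestamp so that more recent entities sort first."""
    return padded_row_key(max_value - epoch_seconds)


plain = sorted(str(n) for n in (2, 10, 1))
padded = sorted(padded_row_key(n) for n in (2, 10, 1))
print(plain)   # ['1', '10', '2']
print(padded)  # ['0000000001', '0000000002', '0000000010']
```

The inverted form is useful when the most common query is "latest entries in this partition", since those entities are then returned first.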