Azure Table Storage

Azure Table Storage is a NoSQL key-attribute store that accepts authenticated calls from inside and outside the Azure cloud. The Table service stores data as collections of entities, and each entity is a set of name-value pairs. Because Table Storage is schemaless, entities in the same table can have different sets of properties, so the data model can evolve without schema migrations.

Table Storage is well suited to storing large amounts of structured, non-relational data that can be queried efficiently by a partition key and a row key. Common use cases include:

  • Storing user data for web applications
  • Storing device data for IoT solutions
  • Storing logs and diagnostic information
  • Storing metadata for other Azure services

Getting Started

To get started with Azure Table Storage, you'll need an Azure subscription and an Azure Storage account. You can create a storage account through the Azure portal, Azure CLI, or Azure PowerShell.

Once you have your storage account, you can interact with Table Storage using:

  • Azure SDKs (e.g., .NET, Python, Java, Node.js)
  • Azure CLI
  • Azure PowerShell
  • REST API

We recommend using the Azure SDKs for a streamlined development experience. Install the SDK package for your preferred language; for Python, the package is azure-data-tables.

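Note: Azure Table Storage is part of Azure Storage. Azure Cosmos DB for Table is a separate service that exposes a compatible API and adds features such as global distribution and automatic indexing of all properties. New applications that need those capabilities should consider Azure Cosmos DB for Table.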

Core Concepts

Tables

A table is a collection of entities. Tables in Azure Table Storage are schemaless, meaning that different entities within the same table don't need to have the same set of properties.

Entities

An entity is analogous to a row in a database table. It's a collection of properties. Each entity can have up to 255 properties, including the three system properties PartitionKey, RowKey, and Timestamp, and can be up to 1 MiB in size.

Properties

A property is a name-value pair within an entity. Property names are strings, and values are primitive data types: Binary, Boolean, DateTime, Double, GUID, Int32, Int64, and String.
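Because property values must be primitives, nested structures such as lists or dictionaries have to be flattened or serialized before they are stored. The following is a minimal local sketch, not part of the SDK: the helper name unsupported_properties and the mapping of Python types to Table Storage types (for example, bytes for Binary) are assumptions made for illustration.

```python
import json
import uuid
from datetime import datetime

# Python types assumed to correspond to Table Storage primitives
# (bytes stands in for Binary, uuid.UUID for GUID).
SUPPORTED_TYPES = (str, bool, int, float, datetime, uuid.UUID, bytes)


def unsupported_properties(entity: dict) -> list:
    """Return the names of properties whose values are not primitive types."""
    return [name for name, value in entity.items()
            if not isinstance(value, SUPPORTED_TYPES)]


order = {"PartitionKey": "Customer_123", "RowKey": "Order_456", "totalAmount": 55.75}
nested = {
    "PartitionKey": "Customer_123",
    "RowKey": "Profile",
    "tags": ["vip", "beta"],
    "address": {"city": "Oslo"},
}

print(unsupported_properties(order))   # []
print(unsupported_properties(nested))  # ['tags', 'address']

# One possible workaround: serialize nested values to JSON strings.
flattened = {k: (json.dumps(v) if k in unsupported_properties(nested) else v)
             for k, v in nested.items()}
print(unsupported_properties(flattened))  # []
```

The two sample entities also show the schemaless model: they share a partition but carry different property sets.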

PartitionKey and RowKey

Every entity in Table Storage must have a PartitionKey and a RowKey. These two properties together form the entity's primary key and uniquely identify the entity within a table. They also play a crucial role in scalability and performance:

  • PartitionKey: Used to partition the table data. Entities with the same PartitionKey are stored together on the same storage partition, which enables efficient querying of entities within a partition.
  • RowKey: Uniquely identifies an entity within its partition. Entities in a partition are returned sorted by RowKey in lexical (string) order.

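Choosing appropriate PartitionKey and RowKey values is critical for performance. For example, consider partitioning data by customer ID and using a GUID or a timestamp for the RowKey. Partitioning by date is also possible, but if all new writes land in the current date's partition, that partition can become a hot spot.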
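Since RowKey values are compared as strings, a timestamp-based RowKey only sorts correctly if it is zero-padded, and a common trick for returning the newest entities first is to store an inverted timestamp. The sketch below illustrates that idea; the function name newest_first_row_key, the millisecond resolution, and the 13-digit width are assumptions chosen for this example.

```python
from datetime import datetime, timezone

MAX_MILLIS = 10**13 - 1  # largest 13-digit value; an assumed upper bound


def newest_first_row_key(when: datetime, suffix: str) -> str:
    """Build a RowKey that sorts newer timestamps first in lexical order."""
    millis = int(when.timestamp() * 1000)
    inverted = MAX_MILLIS - millis
    # Zero-padding keeps string order consistent with numeric order.
    return f"{inverted:013d}_{suffix}"


older = datetime(2024, 1, 1, tzinfo=timezone.utc)
newer = datetime(2024, 6, 1, tzinfo=timezone.utc)

older_key = newest_first_row_key(older, "Order_1")
newer_key = newest_first_row_key(newer, "Order_2")

entity = {"PartitionKey": "Customer_123", "RowKey": newer_key}
print(sorted([older_key, newer_key])[0] == newer_key)  # True
```

The suffix keeps keys unique when two entities share the same millisecond.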

Common Operations

Here are some fundamental operations you can perform with Azure Table Storage:

Creating Tables

You can create a new table in your storage account. Table names must be unique within the account, contain only alphanumeric characters, be 3 to 63 characters long, and not begin with a digit. Names are case-insensitive.

Example (conceptual):

# Using the Azure SDK for Python (pip install azure-data-tables)
from azure.data.tables import TableServiceClient

connection_string = "YOUR_CONNECTION_STRING"
table_name = "myCustomTable"

table_service_client = TableServiceClient.from_connection_string(connection_string)
# create_table returns a TableClient bound to the new table and raises
# ResourceExistsError if the table already exists.
table_client = table_service_client.create_table(table_name)
print(f"Table '{table_name}' created.")
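Invalid table names are rejected by the service, so it can be convenient to check a name locally before calling it. The sketch below assumes the rule set as I understand it from the service documentation (3 to 63 alphanumeric characters, not starting with a digit, with "tables" reserved); is_valid_table_name is a hypothetical helper, not an SDK function.

```python
import re

# Assumed rule: a letter followed by 2 to 62 further alphanumeric characters.
_TABLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{2,62}")


def is_valid_table_name(name: str) -> bool:
    """Return True if the name satisfies the assumed naming rules."""
    if name.lower() == "tables":  # treated here as a reserved name
        return False
    return _TABLE_NAME.fullmatch(name) is not None


for candidate in ["myCustomTable", "my-table", "1table", "ab"]:
    print(candidate, is_valid_table_name(candidate))
```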

Inserting Entities

To insert an entity, you define its properties, including PartitionKey and RowKey.

Example (conceptual):

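from datetime import datetime, timezone

entity = {
    "PartitionKey": "Customer_123",
    "RowKey": "Order_456",
    "orderDate": datetime.now(timezone.utc),  # timezone-aware UTC timestamp
    "totalAmount": 55.75,
    "product": "Gadget XYZ",
}
# upsert_entity inserts the entity, or updates it if the keys already exist.
# Use create_entity instead to fail when the entity is already present.
table_client.upsert_entity(entity)
print("Entity inserted.")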

Querying Entities

You can query entities by specifying filters on PartitionKey, RowKey, and other properties. Queries are most efficient when they can leverage the primary key (PartitionKey and RowKey).

Example (conceptual):

# Query all entities within a specific partition
query_filter = "PartitionKey eq 'Customer_123'"
for entity in table_client.query_entities(query_filter):
    print(entity)

# Retrieve a single entity by PartitionKey and RowKey (a point lookup)
entity_key = {"PartitionKey": "Customer_123", "RowKey": "Order_456"}
retrieved_entity = table_client.get_entity(entity_key["PartitionKey"], entity_key["RowKey"])
print(f"Retrieved: {retrieved_entity}")
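Filter strings like the one above are OData expressions, so a value containing a single quote must have that quote doubled, or the filter will be malformed. The following sketch builds filters from untrusted values; quote_literal and partition_filter are hypothetical helper names introduced only for this illustration.

```python
def quote_literal(value: str) -> str:
    """Quote a string for an OData filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def partition_filter(partition_key: str) -> str:
    """Build a filter that matches every entity in one partition."""
    return f"PartitionKey eq {quote_literal(partition_key)}"


print(partition_filter("Customer_123"))  # PartitionKey eq 'Customer_123'
print(partition_filter("O'Brien"))       # PartitionKey eq 'O''Brien'
```

The resulting string can be passed to query_entities. As an alternative, the Python SDK also accepts a parameters dictionary alongside a filter that uses @name placeholders, which avoids manual quoting.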

Updating Entities

Entities can be updated by replacing existing ones or by merging properties.

Example (conceptual):

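from azure.data.tables import UpdateMode

entity_key = {"PartitionKey": "Customer_123", "RowKey": "Order_456"}
entity = table_client.get_entity(entity_key["PartitionKey"], entity_key["RowKey"])
entity["totalAmount"] = 60.00  # Update amount
entity["status"] = "Shipped"   # Add new property
# MERGE keeps stored properties that are absent from the update;
# UpdateMode.REPLACE would overwrite the whole entity instead.
table_client.update_entity(entity, mode=UpdateMode.MERGE)
print("Entity updated.")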
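The difference between merging and replacing matters most when the update carries only some of the entity's properties. The snippet below is a purely local model of the two behaviors using dictionaries; apply_update is a hypothetical function written for this illustration and does not call the service.

```python
def apply_update(stored: dict, update: dict, mode: str) -> dict:
    """Model how a stored entity looks after a merge or replace update."""
    if mode == "merge":
        # Properties missing from the update are preserved.
        return {**stored, **update}
    if mode == "replace":
        # The stored entity is overwritten; missing properties are dropped.
        return dict(update)
    raise ValueError(f"unknown mode: {mode}")


stored = {"PartitionKey": "Customer_123", "RowKey": "Order_456",
          "totalAmount": 55.75, "product": "Gadget XYZ"}
update = {"PartitionKey": "Customer_123", "RowKey": "Order_456",
          "status": "Shipped"}

merged = apply_update(stored, update, "merge")
replaced = apply_update(stored, update, "replace")

print("product" in merged)    # True
print("product" in replaced)  # False
```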

Deleting Entities

You can delete individual entities or entire tables.

Example (conceptual):

entity_key = {"PartitionKey": "Customer_123", "RowKey": "Order_456"}
table_client.delete_entity(entity_key["PartitionKey"], entity_key["RowKey"])
print("Entity deleted.")

# To delete the entire table:
# table_service_client.delete_table(table_name)
# print(f"Table '{table_name}' deleted.")

Advanced Topics

Indexing

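Azure Table Storage automatically indexes only the PartitionKey and RowKey, and it does not support secondary indexes. Queries that filter on other properties scan the partition or the whole table. To look up entities by another property, a common approach is to store an additional copy of each entity (or a pointer to it) under different keys, in the same table or a separate one. Azure Cosmos DB for Table, by contrast, indexes all properties automatically.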
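The separate-table approach can be sketched as follows: for each primary entity, write a small lookup entity whose PartitionKey is the value you want to search by and which points back to the original keys. The function name index_entity, the property names SourcePartitionKey and SourceRowKey, and the "|" separator are all assumptions made for this example.

```python
def index_entity(entity: dict, index_property: str) -> dict:
    """Build a lookup entity keyed by index_property that points to the original."""
    return {
        "PartitionKey": str(entity[index_property]),
        # Combining the original keys keeps the RowKey unique in the index.
        "RowKey": f"{entity['PartitionKey']}|{entity['RowKey']}",
        "SourcePartitionKey": entity["PartitionKey"],
        "SourceRowKey": entity["RowKey"],
    }


order = {"PartitionKey": "Customer_123", "RowKey": "Order_456", "status": "Shipped"}
by_status = index_entity(order, "status")

print(by_status["PartitionKey"])  # Shipped
print(by_status["RowKey"])        # Customer_123|Order_456
```

Querying the index table with PartitionKey eq 'Shipped' then yields the keys needed for point lookups in the primary table. Note that the two writes are not atomic across tables, so the application has to tolerate or repair a briefly stale index.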

Transactions

Table Storage supports entity group transactions (EGTs), which apply a batch of operations atomically to entities within the same partition of a single table: either all operations succeed or none do. A single transaction can contain up to 100 operations, and its total payload must not exceed 4 MiB.
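Because a transaction is limited to one partition, a mixed list of entities has to be grouped by PartitionKey before it is submitted, and large groups have to be split. The sketch below prepares such batches locally, assuming a limit of 100 operations per transaction; build_batches is a hypothetical helper, and the (operation, entity) tuple shape is chosen to match what the Python SDK's TableClient.submit_transaction accepts.

```python
from collections import defaultdict

MAX_BATCH = 100  # assumed per-transaction operation limit


def build_batches(entities, operation="upsert"):
    """Group entities by PartitionKey and split each group into batches."""
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append((operation, entity))

    batches = []
    for operations in by_partition.values():
        for start in range(0, len(operations), MAX_BATCH):
            batches.append(operations[start:start + MAX_BATCH])
    return batches


entities = [{"PartitionKey": "A", "RowKey": f"{i:04d}"} for i in range(250)]
entities += [{"PartitionKey": "B", "RowKey": f"{i:04d}"} for i in range(3)]

batches = build_batches(entities)
print([len(batch) for batch in batches])  # [100, 100, 50, 3]

# Each batch could then be sent with: table_client.submit_transaction(batch)
```

Atomicity applies per batch only: if the third batch fails, the first two have already been committed.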

Performance Considerations

  • Partition Design: Distribute your data across partitions to avoid hot partitions and maximize read/write throughput.
  • RowKey Selection: Use RowKey to enable efficient range queries within a partition.
  • Batch Operations: Group multiple writes to the same partition into a single batch (entity group transaction) to reduce round trips.
  • Query Patterns: Design your queries to efficiently utilize the indexed PartitionKey and RowKey.
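  • Data Types: Choose the narrowest data type that fits each property (for example, Int32 rather than String for a counter) so that filters compare values correctly and entities stay small.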
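As an illustration of the RowKey and query-pattern points above, a prefix search within one partition can be expressed as a range over RowKey, which lets the service use the key index instead of scanning every entity. This is a minimal sketch: prefix_range_filter is a hypothetical helper, and the upper bound is formed by incrementing the last character of the prefix.

```python
def quote_literal(value: str) -> str:
    """Quote a string for an OData filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def prefix_range_filter(partition_key: str, row_key_prefix: str) -> str:
    """Build a filter matching RowKeys that start with row_key_prefix."""
    if not row_key_prefix:
        raise ValueError("row_key_prefix must not be empty")
    # The smallest string greater than every string with this prefix.
    upper = row_key_prefix[:-1] + chr(ord(row_key_prefix[-1]) + 1)
    return (
        f"PartitionKey eq {quote_literal(partition_key)}"
        f" and RowKey ge {quote_literal(row_key_prefix)}"
        f" and RowKey lt {quote_literal(upper)}"
    )


print(prefix_range_filter("Customer_123", "Order_"))
# PartitionKey eq 'Customer_123' and RowKey ge 'Order_' and RowKey lt 'Order`'
```

The character after "_" in code-point order is "`", which is why the upper bound looks unusual.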

SDK Examples

Explore the official Azure SDK documentation for detailed examples in your preferred programming language.