Azure Storage Tables Concepts

Azure Table storage is a NoSQL key-attribute store that you can use to store large amounts of structured, non-relational data. Table storage is part of Azure Storage. Azure Cosmos DB, a globally distributed, multi-model database service, separately offers an API for Table with a compatible data model.

Core Concepts

Azure Table storage offers a schemaless design and a query-based programming model, making it easy to adapt your application as your data needs evolve. The key concepts are:

Tables

A table is a collection of entities. Each table is identified by a name that is unique within its storage account. Table names are case-insensitive and must follow these naming rules:

Start with a letter.
Contain only letters and numbers (alphanumeric characters). Hyphens and other symbols are not allowed.
Be between 3 and 63 characters long.
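
The naming rules can be checked before a table is created. The following is a minimal sketch, assuming names are restricted to letters and digits with a leading letter; the helper names is_valid_table_name and same_table are hypothetical and not part of any SDK.

```python
import re

# Assumed rule set: one leading letter followed by 2 to 62 further letters
# or digits, which gives a total length of 3 to 63 characters.
_TABLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]{2,62}")


def is_valid_table_name(name: str) -> bool:
    """Return True if `name` satisfies the table naming rules."""
    return _TABLE_NAME.fullmatch(name) is not None


def same_table(a: str, b: str) -> bool:
    """Table names are case-insensitive, so compare them case-folded."""
    return a.casefold() == b.casefold()
```

Validating locally avoids a round trip to the service for names that would be rejected anyway.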

Tables do not enforce a schema. Entities in the same table do not need to share the same set of properties, so two entities in one table can have entirely different custom properties.

Entities

An entity is a record in a table, analogous to a row in a database. An entity is represented as a set of name-value pairs. Each entity is limited to 1 MB in size.

Every entity must contain the following three system properties:

PartitionKey: Distributes entities across partitions. Entities with the same PartitionKey are guaranteed to be stored on the same storage node. This is crucial for efficient querying and transaction support.
RowKey: Uniquely identifies an entity within a partition. Combined with PartitionKey, it forms the entity's unique identifier, its primary key.
Timestamp: Automatically managed by the storage service, this property indicates when the entity was last modified.
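
To make the three system properties concrete, here is a small sketch that models an entity as a plain Python dict. The key values and the helpers primary_key and stamp are hypothetical illustrations; in the real service the Timestamp is assigned on the server side, which stamp only imitates.

```python
from datetime import datetime, timezone

# Hypothetical entity, modelled as a plain dict of name-value pairs.
entity = {
    "PartitionKey": "customer-042",
    "RowKey": "order-0001",
    "Total": 19.99,
    "Shipped": False,
}


def primary_key(e: dict) -> tuple:
    """Return the (PartitionKey, RowKey) pair that identifies an entity."""
    return (e["PartitionKey"], e["RowKey"])


def stamp(e: dict) -> dict:
    """Imitate the service setting Timestamp on write; returns a new dict."""
    return {**e, "Timestamp": datetime.now(timezone.utc)}
```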

In addition to these system properties, an entity can contain up to 252 custom properties. These properties are stored as name-value pairs. Property names are strings, and property values can be of the following data types:

String (up to 64 KiB)
Binary (byte array, up to 64 KiB)
Boolean
DateTime (UTC)
Double
Guid
Int32
Int64

Table storage is schemaless, meaning you don't define the schema of your tables beforehand. You can add new properties to entities at any time, and different entities within the same table can have different sets of properties.
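
As an illustration of what schemaless means in practice, the sketch below keeps two differently shaped entities in one list that stands in for a table. The entities and the helpers property_names and read are hypothetical; the point is only that reading code has to tolerate absent properties.

```python
# Two hypothetical entities in the same table with different custom properties.
entities = [
    {"PartitionKey": "devices", "RowKey": "sensor-1", "Temperature": 21.5},
    {"PartitionKey": "devices", "RowKey": "camera-7", "Resolution": "1080p", "Online": True},
]


def property_names(rows: list) -> set:
    """Collect every property name that appears on at least one entity."""
    names = set()
    for row in rows:
        names.update(row)
    return names


def read(row: dict, name: str, default=None):
    """Read a property that may be absent on this particular entity."""
    return row.get(name, default)
```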

Properties

A property is a name-value pair within an entity. Property names are strings and have a maximum length of 255 characters. Property values can be of various primitive data types.

Each entity can have a maximum of 252 properties, excluding the three system properties (PartitionKey, RowKey, and Timestamp).
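
The property limits above can be enforced on the client before an entity is written. This is a minimal sketch with a hypothetical helper, check_property_limits, that only covers the two limits stated here (property count and property name length), not entity size or value types.

```python
MAX_CUSTOM_PROPERTIES = 252
MAX_PROPERTY_NAME_LENGTH = 255
SYSTEM_PROPERTIES = {"PartitionKey", "RowKey", "Timestamp"}


def check_property_limits(entity: dict) -> list:
    """Return a list of human-readable limit violations (empty if none)."""
    problems = []
    custom = [name for name in entity if name not in SYSTEM_PROPERTIES]
    if len(custom) > MAX_CUSTOM_PROPERTIES:
        problems.append(
            f"{len(custom)} custom properties exceeds {MAX_CUSTOM_PROPERTIES}"
        )
    for name in custom:
        if len(name) > MAX_PROPERTY_NAME_LENGTH:
            problems.append(
                f"property name of length {len(name)} exceeds {MAX_PROPERTY_NAME_LENGTH}"
            )
    return problems
```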

Partitions

A partition is a set of entities that share the same PartitionKey. Entities within the same partition are stored together on the same storage node. This co-location of entities with the same PartitionKey enables efficient retrieval of related data and supports atomic operations (transactions) within a partition.

The choice of PartitionKey is critical for performance. A well-designed partition strategy distributes your data evenly and lets frequent queries target a single partition. Because a partition is served by a single storage node, a very large or very frequently accessed ("hot") partition can become a throughput bottleneck.
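
One way to evaluate a candidate PartitionKey is to measure how a sample of entities would spread across partitions. The sketch below is a rough heuristic under that assumption; partition_sizes and largest_share are hypothetical helpers, and they count entities only, not request traffic.

```python
from collections import Counter


def partition_sizes(entities: list) -> Counter:
    """Count entities per PartitionKey."""
    return Counter(e["PartitionKey"] for e in entities)


def largest_share(entities: list) -> float:
    """Fraction of all entities held by the largest partition (0.0 if empty)."""
    sizes = partition_sizes(entities)
    total = sum(sizes.values())
    return max(sizes.values()) / total if total else 0.0
```

A share close to 1.0 suggests that most data lands in a single partition, which is worth revisiting before the table grows.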

Querying

Azure Table storage supports various query operations to retrieve entities. Queries can be performed at the partition level or across partitions (though less efficiently).

Key querying features include:

Point Queries: Retrieve a specific entity using its PartitionKey and RowKey. This is the fastest query type.
Range Queries: Retrieve entities within a specified range of RowKey values for a given PartitionKey.
Partition Queries: Retrieve all entities within a specific PartitionKey.
Table Queries: Retrieve entities across multiple partitions. These queries are less efficient than partition-specific queries.
Filter Expressions: Table storage supports OData filter expressions to specify criteria for selecting entities.
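
The query types above differ mainly in how much of the primary key the filter pins down. The following sketch builds OData filter strings for point, range, and partition queries. The function names are hypothetical, and the strings are meant to be passed to whatever filter parameter your client library exposes, which is not shown here.

```python
def quote(value: str) -> str:
    """Wrap a string in single quotes, doubling any embedded single quote."""
    return "'" + value.replace("'", "''") + "'"


def point_filter(partition_key: str, row_key: str) -> str:
    """Match exactly one entity by its full primary key."""
    return f"PartitionKey eq {quote(partition_key)} and RowKey eq {quote(row_key)}"


def range_filter(partition_key: str, row_start: str, row_end: str) -> str:
    """Match RowKey values in the half-open range [row_start, row_end)."""
    return (
        f"PartitionKey eq {quote(partition_key)} "
        f"and RowKey ge {quote(row_start)} and RowKey lt {quote(row_end)}"
    )


def partition_filter(partition_key: str) -> str:
    """Match every entity in one partition."""
    return f"PartitionKey eq {quote(partition_key)}"
```

A filter that omits PartitionKey altogether corresponds to a table query and has to consider every partition.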

The most efficient queries target a single partition. To maximize performance, design your tables so that the most frequent queries can be satisfied by retrieving entities from a single partition.

Transactions

Azure Table storage supports entity group transactions, also called batch operations, within a single partition. An entity group transaction is an atomic operation that applies to multiple entities sharing the same PartitionKey. A single transaction can include at most 100 entities.

All operations within a transaction either succeed or fail together, ensuring data consistency. This is a powerful feature for maintaining data integrity when making related changes.
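
Because a transaction cannot span partitions, a set of pending writes has to be grouped by PartitionKey before it is submitted. The sketch below plans such batches. plan_batches is a hypothetical helper, and the limit of 100 operations per transaction is an assumption stated in the code; submitting the batches through a client library is left out.

```python
from collections import defaultdict

MAX_BATCH_SIZE = 100  # assumed per-transaction limit on operations


def plan_batches(entities: list, max_size: int = MAX_BATCH_SIZE) -> list:
    """Group entities by PartitionKey, then split each group into chunks."""
    by_partition = defaultdict(list)
    for e in entities:
        by_partition[e["PartitionKey"]].append(e)

    batches = []
    for group in by_partition.values():
        for start in range(0, len(group), max_size):
            batches.append(group[start:start + max_size])
    return batches
```

Each returned batch contains entities from exactly one partition, so each can be submitted as its own atomic transaction. Atomicity does not extend across batches.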

Note on Schemaless Nature

While Table storage is schemaless, it's good practice to maintain consistency in the properties used for entities within a table, especially if you perform frequent queries or analytics. This makes your data easier to understand and manage.

Performance Tip

Carefully choose your PartitionKey and RowKey. A good strategy balances data distribution across partitions with the ability to perform efficient, targeted queries and transactions within partitions.

Use Cases

Azure Table storage is ideal for scenarios such as:

Storing large amounts of structured, non-relational data.
Storing data that can be quickly queried by primary key.
Storing data that requires atomic operations across multiple entities in the same partition.
Building scalable web applications.
Storing logs, events, and telemetry data.