Table Storage Concepts
Azure Table Storage is a NoSQL key-attribute store that lets you store large amounts of structured, non-relational data. It is a cost-effective, scalable service suited to applications that need fast key-based access without relational joins.
Key Concepts
Tables
A table is a collection of entities. Tables are schema-less, meaning that a collection of entities within a single table doesn't need to share the same set of properties. You can think of a table as a spreadsheet where each row is an entity and each column is a property. However, unlike a spreadsheet, the columns do not need to be defined for all rows.
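The spreadsheet analogy can be sketched in a few lines of Python. This is only an illustration using plain dictionaries, not an SDK call; the table contents and the helper names `columns` and `as_rows` are hypothetical.

```python
# Two entities in the same hypothetical table with different property sets.
table = [
    {"PartitionKey": "customers", "RowKey": "001", "Name": "Ada", "Email": "ada@example.com"},
    {"PartitionKey": "customers", "RowKey": "002", "Name": "Grace", "Phone": "555-0100"},
]


def columns(entities):
    """Return the sorted union of property names used by any entity."""
    return sorted({name for entity in entities for name in entity})


def as_rows(entities):
    """Render entities spreadsheet-style; a property an entity lacks shows as None."""
    cols = columns(entities)
    return [[entity.get(col) for col in cols] for entity in entities]
```

Calling `as_rows(table)` yields one row per entity, with `None` in the cells for properties that a given entity never defined.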
Entities
An entity is a record that can be uniquely identified. In Table Storage, an entity is analogous to a row. Each entity is composed of a set of properties. An entity can hold up to 1 MB of data in total. Each entity must contain two specific properties that together serve as its primary key:
- PartitionKey: This property is used to group entities. All entities within a table that share the same PartitionKey are stored on the same storage node, which can significantly improve query performance if you query by PartitionKey.
- RowKey: This property is a unique identifier for an entity within a specific partition. The combination of PartitionKey and RowKey uniquely identifies an entity.
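To make the role of the two keys concrete, here is a minimal in-memory sketch. The class `InMemoryTable` is a hypothetical stand-in, not part of any Azure SDK; it only models the rule that the (PartitionKey, RowKey) pair must be unique.

```python
class InMemoryTable:
    """Toy stand-in for a table: entities indexed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._entities = {}

    def insert(self, entity):
        """Store an entity; reject it if the composite key is already taken."""
        key = (entity["PartitionKey"], entity["RowKey"])
        if key in self._entities:
            raise KeyError(f"entity {key} already exists")
        self._entities[key] = dict(entity)

    def get(self, partition_key, row_key):
        """Point lookup by the full primary key."""
        return self._entities[(partition_key, row_key)]

    def partition(self, partition_key):
        """Return all entities in one partition, ordered by RowKey."""
        rows = [e for (pk, _), e in self._entities.items() if pk == partition_key]
        return sorted(rows, key=lambda e: e["RowKey"])
```

The same RowKey may appear in different partitions; only the pair has to be unique.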
Properties
A property is a name-value pair within an entity. Each entity can contain up to 252 custom properties, in addition to the three system properties: PartitionKey, RowKey, and Timestamp. Property names are strings, and values can be of various primitive data types. The table does not enforce a type per property name: each stored value carries its own type, so the same property can hold different types in different entities, and the client is responsible for keeping types consistent.
Data Types
Table Storage supports the following primitive data types for property values:
- String
- Int32
- Int64
- Double
- Boolean
- DateTime (a UTC date and time)
- Guid
- Binary (Byte array)
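Because the client manages types, it can help to see how a client might tag values before sending them. The sketch below maps Python values to EDM type names and adds the `@odata.type` annotations that the JSON payload format uses for types that cannot be inferred from the JSON value itself (Int64, DateTime, Guid, Binary). The Python-to-EDM mapping and the function names are assumptions made for illustration, and datetimes are assumed to be timezone-aware.

```python
import base64
import uuid
from datetime import datetime, timezone

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def edm_type(value):
    """Pick an EDM type name for a Python value (illustrative mapping)."""
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "Edm.Boolean"
    if isinstance(value, int):
        return "Edm.Int32" if INT32_MIN <= value <= INT32_MAX else "Edm.Int64"
    if isinstance(value, float):
        return "Edm.Double"
    if isinstance(value, str):
        return "Edm.String"
    if isinstance(value, datetime):
        return "Edm.DateTime"
    if isinstance(value, uuid.UUID):
        return "Edm.Guid"
    if isinstance(value, (bytes, bytearray)):
        return "Edm.Binary"
    raise TypeError(f"unsupported property type: {type(value).__name__}")


def to_json_properties(entity):
    """Convert values to JSON-safe forms and annotate the non-inferable types."""
    payload = {}
    for name, value in entity.items():
        kind = edm_type(value)
        if kind in ("Edm.Int64", "Edm.Guid"):
            payload[name] = str(value)
        elif kind == "Edm.DateTime":
            utc = value.astimezone(timezone.utc)
            payload[name] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
        elif kind == "Edm.Binary":
            payload[name] = base64.b64encode(bytes(value)).decode("ascii")
        else:
            payload[name] = value  # String, Int32, Double, Boolean need no annotation
            continue
        payload[f"{name}@odata.type"] = kind
    return payload
```

Values that JSON can represent unambiguously pass through unchanged; the others are serialized as strings next to an annotation naming their type.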
Querying Data
You can query data in Table Storage using various methods, including REST APIs, SDKs, and tools like Azure Storage Explorer.
Querying by PartitionKey and RowKey
The most efficient queries are those that specify both the PartitionKey and RowKey. This allows Azure Storage to quickly locate the specific entity.
GET /MyTable(PartitionKey='MyPartition',RowKey='MyRow') HTTP/1.1
Host: your_storage_account.table.core.windows.net
Date: Wed, 20 Mar 2024 18:15:00 GMT
Authorization: SharedKey your_storage_account:examplekey
Content-Length: 0
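The request line above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical, the account and table names are the placeholders from the request above, and authentication headers are deliberately left out.

```python
from urllib.parse import quote


def odata_string(value):
    """Quote a string as an OData literal; embedded single quotes are doubled."""
    return "'" + value.replace("'", "''") + "'"


def point_query_url(account, table, partition_key, row_key):
    """Build the URL that addresses exactly one entity by its full key."""
    keys = f"PartitionKey={odata_string(partition_key)},RowKey={odata_string(row_key)}"
    # Percent-encode the key predicate but keep the characters the syntax relies on.
    encoded = quote(keys, safe="=,'")
    return f"https://{account}.table.core.windows.net/{table}({encoded})"
```

For example, `point_query_url("your_storage_account", "MyTable", "MyPartition", "MyRow")` produces the same path as the request shown above.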
Querying by PartitionKey
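Querying by only the PartitionKey is also efficient: the service scans a single partition rather than the whole table and returns every entity in it. The cost of the query grows with the size of the partition.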
GET /MyTable?$filter=PartitionKey%20eq%20'MyPartition' HTTP/1.1
Host: your_storage_account.table.core.windows.net
Date: Wed, 20 Mar 2024 18:15:00 GMT
Authorization: SharedKey your_storage_account:examplekey
Content-Length: 0
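A filter expression has to be percent-encoded before it goes on the request line, since a raw space is not valid there. A minimal sketch of building such a URL is shown here; the function names are hypothetical, and the `MyTable()` form of the path follows the REST reference for querying entities.

```python
from urllib.parse import quote


def partition_filter(partition_key):
    """Build an OData filter that selects one partition."""
    escaped = partition_key.replace("'", "''")  # double embedded single quotes
    return f"PartitionKey eq '{escaped}'"


def filter_url(account, table, filter_expr):
    """Build a query URL with the filter percent-encoded (quotes left readable)."""
    encoded = quote(filter_expr, safe="'")
    return f"https://{account}.table.core.windows.net/{table}()?$filter={encoded}"
```

Passing `partition_filter("MyPartition")` to `filter_url` yields a `$filter` value of `PartitionKey%20eq%20'MyPartition'`.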
Querying by Property
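You can also filter entities based on other properties, but these queries are less efficient than key-based queries. Table Storage indexes only PartitionKey and RowKey, so a filter on any other property scans every entity in the table, or every entity in one partition if the filter also specifies a PartitionKey. The cost becomes significant on large datasets.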
GET /MyTable?$filter=Status%20eq%20'Active'%20and%20Priority%20gt%205 HTTP/1.1
Host: your_storage_account.table.core.windows.net
Date: Wed, 20 Mar 2024 18:15:00 GMT
Authorization: SharedKey your_storage_account:examplekey
Content-Length: 0
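The difference between the three query styles can be illustrated with a toy model that counts how many entities each one has to examine. This is a simplified simulation in plain Python, not a description of the service's internals; the data set and function names are invented for the example.

```python
from bisect import bisect_left

# Toy table: 3 partitions x 10 rows, generated in (PartitionKey, RowKey) order.
entities = [
    {
        "PartitionKey": f"p{p}",
        "RowKey": f"{r:03d}",
        "Status": "Active" if r % 2 else "Closed",
        "Priority": r,
    }
    for p in range(3)
    for r in range(10)
]
keys = [(e["PartitionKey"], e["RowKey"]) for e in entities]  # the only "index"


def point_lookup(pk, rk):
    """Binary-search the key index; returns (entity or None, entities examined)."""
    i = bisect_left(keys, (pk, rk))
    if i < len(keys) and keys[i] == (pk, rk):
        return entities[i], 1
    return None, 0


def partition_scan(pk, predicate):
    """Jump to one partition via the index, then test every entity inside it."""
    lo = bisect_left(keys, (pk,))
    hi = bisect_left(keys, (pk + "\x00",))
    candidates = entities[lo:hi]
    return [e for e in candidates if predicate(e)], len(candidates)


def table_scan(predicate):
    """No key in the filter: every entity in the table must be tested."""
    return [e for e in entities if predicate(e)], len(entities)


def is_urgent(entity):
    """Same condition as the filter above: Status eq 'Active' and Priority gt 5."""
    return entity["Status"] == "Active" and entity["Priority"] > 5
```

In this model, `table_scan(is_urgent)` examines all 30 entities, while `partition_scan("p1", is_urgent)` examines only the 10 in that partition. Adding a PartitionKey to a property filter narrows the scan accordingly.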
Note on Schemaless Design
The schemaless nature of Table Storage means you don't need to pre-define a schema. However, for performance and maintainability, it's good practice to have a consistent set of properties for entities within a table, especially when performing complex queries.
Performance Considerations
To optimize performance:
- Design your partition keys to enable efficient querying. Spread your data across partitions to avoid hot partitions.
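- Use the PartitionKey and RowKey for point lookups.
- Consider using a composite RowKey if you need to query ranges within a partition.
- Batch multiple writes to the same partition to reduce network overhead. A batch (entity group transaction) can contain up to 100 entities, all of which must share a PartitionKey.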
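Two of the points above, composite RowKeys and batching, can be sketched as follows. The RowKey layout (a sortable UTC timestamp followed by an identifier) is one hypothetical design among many, and the helper names are invented for this example. The limit of 100 entities per batch, all in the same partition, reflects the entity group transaction rules; datetimes are assumed to be timezone-aware.

```python
from collections import defaultdict
from datetime import datetime, timezone

MAX_BATCH_SIZE = 100  # an entity group transaction accepts at most 100 entities


def _stamp(moment):
    """Format a datetime so that lexical order matches chronological order."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def composite_row_key(moment, item_id):
    """Hypothetical RowKey layout: sortable timestamp, then an identifier."""
    return f"{_stamp(moment)}_{item_id}"


def row_key_range_filter(partition_key, start, end):
    """Filter for RowKeys in [start, end) within a single partition."""
    return (
        f"PartitionKey eq '{partition_key}' "
        f"and RowKey ge '{_stamp(start)}' and RowKey lt '{_stamp(end)}'"
    )


def group_into_batches(entities):
    """Group entities by PartitionKey, then split each group into chunks of <= 100."""
    by_partition = defaultdict(list)
    for entity in entities:
        by_partition[entity["PartitionKey"]].append(entity)
    batches = []
    for group in by_partition.values():
        for i in range(0, len(group), MAX_BATCH_SIZE):
            batches.append(group[i:i + MAX_BATCH_SIZE])
    return batches
```

Because RowKeys are compared as strings, putting the timestamp first lets a `ge`/`lt` pair select a time window inside one partition without scanning the rest of it.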
Tip
For relational data or complex transactions, consider Azure SQL Database or Azure Cosmos DB. Table Storage is optimized for high-volume, structured data access where relational integrity and complex transactions are not primary requirements.
Scalability
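Azure Table Storage is designed for massive scalability. A table can grow to terabytes of data, bounded by the capacity of its storage account, and Azure automatically distributes partitions across storage nodes as data and traffic grow. Throughput is subject to the published scalability targets, which at the time of writing are on the order of 20,000 entities per second per storage account and 2,000 entities per second per partition (assuming 1 KiB entities). Sustaining high request rates therefore depends on spreading load across many partitions.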