Blob Data Model
This document describes the fundamental data model for Azure Blob Storage, detailing the hierarchical structure and key components that govern how data is organized and accessed.
Understanding the Hierarchy
Azure Blob Storage organizes data in three levels. The core components are:
- Storage Account: The top-level container for all your Azure Storage data objects. A storage account provides a unique namespace in Azure for your data.
- Container: A logical grouping of blobs within a storage account. Think of it as a top-level folder in a file system. Container names must be 3 to 63 characters long and may contain only lowercase letters, numbers, and hyphens.
- Blob: The basic unit of storage in Azure Blob Storage. A blob represents an object that can contain any amount of text or binary data. There are three types of blobs:
  - Block Blobs: Optimized for storing large amounts of unstructured data, such as documents, media files, and backups. Block blobs are made up of blocks of data, which can be uploaded independently and then committed as a single blob.
  - Append Blobs: Optimized for append operations, such as logging data from a virtual machine or writing to a log file. Like block blobs, append blobs are made up of blocks, but blocks can only be appended to the end of the blob.
  - Page Blobs: Optimized for random read and write operations. Page blobs are used for scenarios like hosting virtual machine disks (VHDs). They are composed of 512-byte pages, and reads and writes address ranges of pages aligned to 512-byte boundaries.
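The difference between staging and committing blocks, and the page alignment that page blobs rely on, can be sketched with a small in-memory model. This is an illustration only: `ToyBlockBlob` and `is_page_aligned` are hypothetical names, not part of any Azure SDK, and the model leaves out details of the real service such as block size limits.

```python
PAGE_SIZE = 512  # page blob ranges are aligned to 512-byte boundaries


class ToyBlockBlob:
    """In-memory stand-in for a block blob (hypothetical, not the Azure SDK)."""

    def __init__(self) -> None:
        self._staged: dict[str, bytes] = {}  # uploaded but not yet committed
        self._committed = b""                # content visible to readers

    def stage_block(self, block_id: str, data: bytes) -> None:
        # Blocks can arrive in any order and stay invisible until committed.
        self._staged[block_id] = data

    def commit_block_list(self, block_ids: list[str]) -> None:
        # The committed list, not the upload order, defines the blob content.
        self._committed = b"".join(self._staged[b] for b in block_ids)
        self._staged.clear()

    def read(self) -> bytes:
        return self._committed


def is_page_aligned(offset: int, length: int) -> bool:
    """Check that a page blob range starts and ends on a page boundary."""
    return length > 0 and offset % PAGE_SIZE == 0 and length % PAGE_SIZE == 0


blob = ToyBlockBlob()
blob.stage_block("0002", b"world")
blob.stage_block("0001", b"hello ")
print(blob.read())                        # b'' (nothing committed yet)
blob.commit_block_list(["0001", "0002"])
print(blob.read())                        # b'hello world'
print(is_page_aligned(1024, 512))         # True
print(is_page_aligned(100, 512))          # False
```

Staging first and committing afterwards is what lets a client upload blocks in parallel or retry a single failed block without re-sending the whole blob.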
Key Properties and Metadata
Each blob has associated properties and metadata that provide information about the blob itself. These include:
- System Properties: Exist on every blob. Some, such as the ETag and last modified time, are read-only and maintained by Azure Storage; others, such as content type, can be set by the client.
- User-Defined Metadata: Key-value pairs that you can associate with a blob to store custom information. This metadata is accessible via REST APIs and client libraries.
- Tags: Key-value pairs (blob index tags) that can be applied to blobs to categorize and find data. Tags are indexed automatically and are queryable across the containers in a storage account.
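As a rough sketch of how these pieces behave: in the REST API, user-defined metadata travels as request and response headers prefixed with `x-ms-meta-`, while tags let you select blobs by value. The helper names below (`metadata_to_headers`, `find_by_tag`) are hypothetical, and the tag lookup is a plain in-memory filter standing in for the service-side index.

```python
def metadata_to_headers(metadata: dict[str, str]) -> dict[str, str]:
    """Map user-defined metadata to x-ms-meta-* headers."""
    return {f"x-ms-meta-{name}": value for name, value in metadata.items()}


def find_by_tag(blob_tags: dict[str, dict[str, str]], key: str, value: str) -> list[str]:
    """Return the names of blobs whose tag `key` equals `value` (toy lookup)."""
    return sorted(name for name, tags in blob_tags.items() if tags.get(key) == value)


print(metadata_to_headers({"department": "finance", "reviewed": "true"}))
# {'x-ms-meta-department': 'finance', 'x-ms-meta-reviewed': 'true'}

blob_tags = {
    "documents/report.docx": {"project": "alpha", "status": "final"},
    "documents/presentation.pptx": {"project": "alpha", "status": "draft"},
    "logs/app.log": {"project": "beta"},
}
print(find_by_tag(blob_tags, "project", "alpha"))
# ['documents/presentation.pptx', 'documents/report.docx']
```

Metadata is read back together with the blob it belongs to, whereas tags are meant for locating blobs you have not yet identified by name.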
Blob Naming Conventions
Blob names must be unique within a container. They can include any combination of characters, but there are some restrictions:
- Blob names can be from 1 to 1,024 characters long.
- Blob names are case-sensitive: report.docx and Report.docx are two different blobs.
- Blob names should not end with a forward slash (/).
- Certain characters are reserved or have special meaning in URLs and should be avoided or properly percent-encoded.
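A minimal sketch of checking and encoding a blob name on the client side, covering only the length and trailing-slash rules listed above. `check_blob_name` and `encode_blob_name` are hypothetical helpers, and the encoding uses Python's standard `urllib.parse.quote`; client libraries typically perform this encoding for you.

```python
from urllib.parse import quote

MAX_BLOB_NAME_LENGTH = 1024


def check_blob_name(name: str) -> list[str]:
    """Return a list of problems found in a blob name (empty if none)."""
    problems = []
    if not 1 <= len(name) <= MAX_BLOB_NAME_LENGTH:
        problems.append("length must be 1 to 1,024 characters")
    if name.endswith("/"):
        problems.append("name ends with a forward slash")
    return problems


def encode_blob_name(name: str) -> str:
    """Percent-encode reserved URL characters, keeping "/" as the delimiter."""
    return quote(name, safe="/")


print(check_blob_name("documents/report.docx"))  # []
print(check_blob_name("documents/"))             # ['name ends with a forward slash']
print(encode_blob_name("documents/annual report #1.docx"))
# documents/annual%20report%20%231.docx
```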
Data Consistency
Azure Blob Storage provides strong consistency for all operations. This means that after a successful write operation, any subsequent read operation is guaranteed to return the latest version of the data.
Example: Blob Structure
Consider the following structure:
your-storage-account
├── your-container
│   ├── documents
│   │   ├── report.docx
│   │   └── presentation.pptx
│   ├── images
│   │   ├── logo.png
│   │   └── banner.jpg
│   └── logs
│       └── app.log
└── another-container
    └── data.csv
In this example:
- your-storage-account is the storage account.
- your-container and another-container are containers.
- documents, images, and logs are virtual directories formed by using forward slashes in blob names.
- report.docx, presentation.pptx, logo.png, banner.jpg, app.log, and data.csv are individual blobs.
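Because the directories are only a naming convention, a directory-style listing can be derived from a flat list of blob names and a delimiter. The sketch below emulates that idea in memory for the blobs in your-container; `list_level` is a hypothetical helper, not a service or SDK call.

```python
def list_level(blob_names, prefix="", delimiter="/"):
    """Return (virtual_directories, blobs) found directly under `prefix`."""
    directories, blobs = set(), []
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Another delimiter follows, so this name sits in a virtual directory.
            directories.add(prefix + head + delimiter)
        else:
            blobs.append(name)
    return sorted(directories), sorted(blobs)


names = [
    "documents/report.docx",
    "documents/presentation.pptx",
    "images/logo.png",
    "images/banner.jpg",
    "logs/app.log",
]
print(list_level(names))
# (['documents/', 'images/', 'logs/'], [])
print(list_level(names, prefix="documents/"))
# ([], ['documents/presentation.pptx', 'documents/report.docx'])
```

Note that the container stores five blobs and nothing else: deleting the last blob under documents/ makes that virtual directory disappear as well.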