Blob Storage Concepts
This article explains the core concepts of Azure Blob Storage, a cloud object storage solution for storing large amounts of unstructured data.
What is Blob Storage?
Azure Blob Storage is a service that stores unstructured data. Unstructured data is data that doesn't adhere to a particular data model or definition, such as text or binary data. Blob storage is ideal for:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
- Storing data for backup, restore, disaster recovery, and archival.
- Writing to log files.
- Storing data for analysis by an on-premises or Azure-hosted service.
Core Concepts
Blob storage organizes objects into containers. A storage account can contain any number of containers, and a container can hold any number of blobs.
Storage Account
A storage account provides a unique namespace in Azure for your data. Every object you store in Azure Storage can be referenced via a URL that uses this unique storage account name. The combination of the storage account name and the service endpoint forms the base URI for your storage account.
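As a minimal sketch of how the base URI and a full blob URL fit together, the following assumes the public Azure cloud endpoint `blob.core.windows.net`. The account, container, and blob names are hypothetical, and `blob_url` is an illustrative helper, not part of any Azure SDK.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the URL of a blob, assuming the public Azure cloud endpoint."""
    # Base URI: storage account name combined with the Blob service endpoint.
    base = f"https://{account}.blob.core.windows.net"
    # Percent-encode the blob name, but keep "/" so any hierarchy stays readable.
    return f"{base}/{container}/{quote(blob_name, safe='/')}"


# Hypothetical names, for illustration only.
print(blob_url("mystorageaccount", "images", "2024/cat photo.png"))
```

Sovereign clouds and custom domains use different endpoints, so the host suffix here should be treated as a default rather than a constant.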
Container
A container groups a set of blobs, much as a directory groups files in a file system. You must create a container before you can upload blobs to it.
Container naming rules:
- Container names must be from 3 to 63 characters long.
- Container names must start and end with a letter or number.
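- Container names can contain only letters, numbers, and hyphens (-). Each hyphen must be immediately preceded and followed by a letter or number, so consecutive hyphens are not allowed.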
- Container names must be written in lowercase.
- Container names must be unique within a storage account.
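The syntactic rules above can be checked locally before calling the service. The sketch below is one way to do that. `is_valid_container_name` is a hypothetical helper, and it also rejects consecutive hyphens, which Azure's naming rules disallow. Uniqueness within the storage account cannot be checked offline and is left out.

```python
import re

# First and last character: lowercase letter or digit.
# Middle characters: lowercase letters, digits, or hyphens.
# Total length: 1 + (1..61) + 1 = 3..63 characters.
_CONTAINER_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_container_name(name: str) -> bool:
    """Return True if the name satisfies the syntactic container naming rules."""
    if "--" in name:
        # Consecutive hyphens are not permitted.
        return False
    return _CONTAINER_NAME.fullmatch(name) is not None


for candidate in ("my-container", "My-Container", "ab", "logs--2024"):
    print(candidate, is_valid_container_name(candidate))
```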
Blob
A blob holds data of any type, such as a text or binary file. Blobs come in three types, each described in more detail under Blob Types:
- Block blobs: Optimized for storing large amounts of unstructured data, such as images, documents, and application executables.
- Append blobs: Optimized for append operations, such as logging data.
- Page blobs: Optimized for storing random access files, such as virtual machine disks.
Blob names follow these rules:
- Blob names can contain any combination of characters, but reserved URL characters must be percent-encoded when the name appears in a URL.
- The maximum blob name length is 1,024 characters.
- Blob names are case-sensitive.
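- UTF-8 encoding is supported.
- The forward slash (`/`) acts as the delimiter for a virtual hierarchy of directories, as in logs/2024/app.log. The backslash (`\`) is best avoided in blob names, because some tools also treat it as a path separator.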
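To illustrate how a delimiter turns a flat blob name into a directory-like hierarchy, here is a small sketch. Both functions are hypothetical helpers that work only on strings, and the example blob names are made up. The length check uses the 1,024-character limit from the list above.

```python
MAX_BLOB_NAME_LENGTH = 1024  # limit quoted in the naming rules


def has_valid_length(blob_name: str) -> bool:
    """Check the blob name against the maximum length rule."""
    return 1 <= len(blob_name) <= MAX_BLOB_NAME_LENGTH


def split_virtual_path(blob_name: str) -> tuple[list[str], str]:
    """Split a blob name into its virtual directory segments and leaf name."""
    *directories, leaf = blob_name.split("/")
    return directories, leaf


print(split_virtual_path("logs/2024/01/app.log"))
print(split_virtual_path("readme.txt"))
# Names are case-sensitive, so these would be two distinct blobs.
print("Report.pdf" == "report.pdf")
```

The hierarchy exists only in the names: the container itself stays flat, and "logs/2024/" is just a shared prefix of the blobs beneath it.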
Blob Types
Azure Blob Storage supports three types of blobs:
Block Blobs
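Block blobs are composed of blocks of data, each identified by a block ID. A block blob can contain up to 50,000 blocks. With blocks of up to 100 MiB each, the total size of a block blob can reach approximately 4.75 TiB (50,000 × 100 MiB). Newer versions of the service allow blocks of up to 4,000 MiB, which raises the maximum blob size to approximately 190.7 TiB.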
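The size limit follows directly from the block count and block size. The sketch below works through that arithmetic, reading the 100 MB figure as 100 MiB (an assumption about the units). `min_block_size_for` is a hypothetical helper that shows why large uploads need larger blocks.

```python
MAX_BLOCKS = 50_000
MAX_BLOCK_BYTES = 100 * 1024**2  # 100 MiB per block (assumed binary units)


def max_block_blob_bytes() -> int:
    """Largest block blob obtainable with the limits above."""
    return MAX_BLOCKS * MAX_BLOCK_BYTES


def min_block_size_for(total_bytes: int) -> int:
    """Smallest block size (bytes) that fits total_bytes into MAX_BLOCKS blocks."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    size = -(-total_bytes // MAX_BLOCKS)  # integer ceiling division
    if size > MAX_BLOCK_BYTES:
        raise ValueError("data does not fit in a single block blob")
    return size


print(max_block_blob_bytes() / 1024**4)  # size in TiB
print(min_block_size_for(1024**4))       # block size needed for a 1 TiB upload
```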
Block blobs are optimized for storing large amounts of unstructured data. They are suitable for scenarios like:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
Append Blobs
Append blobs are also composed of blocks, but they are optimized for append operations. You can only add new blocks to an append blob. You cannot modify or delete existing blocks. This makes them ideal for scenarios like logging data.
An append blob can contain up to 50,000 blocks, and each block can be up to 4 MiB in size. The total size of an append blob can therefore be up to approximately 195 GiB (50,000 × 4 MiB).
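The append-only behaviour and its limits can be modelled with a toy in-memory class. This is purely illustrative and is not how the service or any SDK is implemented. `AppendOnlyLog` and its methods are hypothetical names, and the 4 MB block limit is read as 4 MiB.

```python
class AppendOnlyLog:
    """Toy model of an append blob: blocks can be added, never changed."""

    MAX_BLOCKS = 50_000
    MAX_BLOCK_BYTES = 4 * 1024**2  # 4 MiB per block (assumed binary units)

    def __init__(self) -> None:
        self._blocks: list[bytes] = []
        self._size = 0

    def append_block(self, data: bytes) -> int:
        """Append one block and return the offset at which it was written."""
        if len(data) > self.MAX_BLOCK_BYTES:
            raise ValueError("block exceeds the maximum block size")
        if len(self._blocks) >= self.MAX_BLOCKS:
            raise OverflowError("block count limit reached")
        offset = self._size
        self._blocks.append(bytes(data))  # store an immutable copy
        self._size += len(data)
        return offset

    def read_all(self) -> bytes:
        """Return the full contents in the order they were appended."""
        return b"".join(self._blocks)


log = AppendOnlyLog()
log.append_block(b"service started\n")
log.append_block(b"request handled\n")
print(log.read_all().decode())
```

There is deliberately no method to update or remove a block, which mirrors the restriction described above.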
Page Blobs
Page blobs are composed of pages of data. Each page is 512 bytes in size. Page blobs are optimized for storing random access files. They are typically used for storing virtual machine disks for Azure IaaS VMs.
Page blobs can be up to 8 TiB in size. They support read and write operations on ranges of pages, so writes are aligned to 512-byte page boundaries.
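Because storage is organized in 512-byte pages, a write that touches an arbitrary byte span has to be widened to whole pages. The following sketch computes that widened range. `aligned_page_range` is a hypothetical helper, and the inclusive end offset is a convention chosen for this example.

```python
PAGE_SIZE = 512  # bytes per page


def aligned_page_range(offset: int, length: int) -> tuple[int, int]:
    """Return (start, end_inclusive) of the smallest page-aligned range
    covering the bytes [offset, offset + length)."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    start = (offset // PAGE_SIZE) * PAGE_SIZE        # round down to a page start
    last_page = -(-(offset + length) // PAGE_SIZE)   # round up to a page count
    end_exclusive = last_page * PAGE_SIZE
    return start, end_exclusive - 1


# A 1,000-byte write at offset 700 touches pages 1 through 3.
print(aligned_page_range(700, 1000))
```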
Data Redundancy
Azure Storage offers multiple redundancy options to protect your data from hardware failures or regional disasters. These options include:
- Locally Redundant Storage (LRS): Provides the lowest cost and offers basic durability by keeping three copies of the data within a single data center.
- Geo-Redundant Storage (GRS): Replicates data to a secondary region hundreds of miles away, protecting against regional outages.
- Read-Access Geo-Redundant Storage (RA-GRS): Provides the same benefits as GRS, with the added ability to read data from the secondary region.
- Zone-Redundant Storage (ZRS): Replicates data across multiple availability zones within a region, providing higher availability within a single region.
- Geo-Zone-Redundant Storage (GZRS): Combines the benefits of GRS and ZRS, replicating data across multiple availability zones in a primary region and also to a secondary region.
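One way to compare these options is to encode the properties described above as data and filter on requirements. The table below restates only what the list says. `Redundancy` and `matching_options` are hypothetical names, and cost is not modelled.

```python
from typing import NamedTuple


class Redundancy(NamedTuple):
    name: str
    zones: bool             # copies spread across availability zones
    secondary_region: bool  # copies replicated to a secondary region
    secondary_read: bool    # secondary region can be read from


OPTIONS = [
    Redundancy("LRS", False, False, False),
    Redundancy("ZRS", True, False, False),
    Redundancy("GRS", False, True, False),
    Redundancy("RA-GRS", False, True, True),
    Redundancy("GZRS", True, True, False),
]


def matching_options(need_zones: bool = False,
                     need_region: bool = False,
                     need_secondary_read: bool = False) -> list[str]:
    """Names of the listed options that satisfy every stated requirement."""
    return [
        o.name for o in OPTIONS
        if (o.zones or not need_zones)
        and (o.secondary_region or not need_region)
        and (o.secondary_read or not need_secondary_read)
    ]


print(matching_options(need_region=True))
print(matching_options(need_zones=True, need_region=True))
```

An empty result means none of the options listed in this article meets the combination of requirements.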
Access Tiers
Blob storage offers different access tiers to optimize costs based on how frequently you need to access your data:
- Hot tier: Optimized for frequently accessed data.
- Cool tier: Optimized for infrequently accessed data, stored for at least 30 days.
- Archive tier: Optimized for rarely accessed data, stored for at least 180 days. Archived data is kept offline and must be rehydrated to the hot or cool tier before it can be read, which can take hours.
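The minimum storage durations above suggest a simple first-pass rule for choosing a tier from the planned retention period. The sketch below is a heuristic only: it ignores access frequency and pricing, it assumes the hot tier has no minimum duration, and `coldest_eligible_tier` and `days_until_minimum_met` are hypothetical helpers.

```python
# Minimum storage duration per tier, in days (hot tier assumed to have none).
MIN_DAYS = {"hot": 0, "cool": 30, "archive": 180}


def coldest_eligible_tier(planned_days: int) -> str:
    """Coldest tier whose minimum storage duration the planned retention meets."""
    if planned_days < 0:
        raise ValueError("planned_days must be non-negative")
    eligible = [tier for tier, days in MIN_DAYS.items() if planned_days >= days]
    return max(eligible, key=MIN_DAYS.get)


def days_until_minimum_met(tier: str, days_stored: int) -> int:
    """Days remaining before a blob has met the tier's minimum duration."""
    return max(MIN_DAYS[tier] - days_stored, 0)


for days in (10, 45, 365):
    print(days, coldest_eligible_tier(days))
print(days_until_minimum_met("cool", 12))
```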
Security
Azure Storage provides robust security features including:
- Authentication: Shared Key and Azure AD (now Microsoft Entra ID) authentication.
- Authorization: Role-Based Access Control (RBAC), Access Control Lists (ACLs) for containers and blobs.
- Encryption: Data is encrypted at rest by default.
- Network Security: Firewalls and virtual networks, private endpoints.
For more detailed information on security, refer to the Azure Storage security documentation.