Azure Storage Blob Concepts
This article introduces the core concepts of Azure Blob Storage. Blob Storage is Azure's object store for the cloud. It's optimized for storing massive amounts of unstructured data, such as text or binary data. Anything that can be stored as text or binary can be stored in Blob Storage.
What is Azure Blob Storage?
Blob storage is designed for:
- Serving images or documents directly to a browser.
- Storing files for distributed access.
- Streaming video and audio.
- Writing to log files.
- Storing data for backup, restore, disaster recovery, and archiving.
- Storing data for analysis by an on-premises or Azure-hosted service.
Key Concepts
Storage Account
A storage account provides a unique namespace in Azure for your data. Every object that you store in Azure Storage has an address that includes your unique account name. A storage account contains all your Azure Storage data objects:
- Blobs (object storage)
- Files (shared file storage)
- Queues (message storage)
- Tables (NoSQL key-attribute store)
The name of your storage account must be unique across all of Azure. A storage account offers the following redundancy options, which trade off cost against availability and disaster recovery:
- Locally Redundant Storage (LRS): A low-cost replication option that replicates your data three times within a single physical location in the primary region.
- Zone-Redundant Storage (ZRS): Replicates your data across three Azure availability zones in the primary region.
- Geo-Redundant Storage (GRS): Replicates your data to a secondary region hundreds of miles away from the primary region.
- Read-Access Geo-Redundant Storage (RA-GRS): Provides GRS replication and also allows read access to the data in the secondary region.
Container
A container is a logical grouping of blobs. You must create a container before you can upload blobs to it. A storage account can contain an unlimited number of containers. A container can hold an unlimited number of blobs.
Container names follow specific naming rules:
- Container names must be a valid DNS name, consisting of lower-case letters, numbers, and hyphens.
- Container names must start and end with a letter or number.
- Container names must have from 3 to 63 characters.
- Every hyphen must be immediately preceded and followed by a letter or number. Consecutive hyphens are not permitted.
- Container names must be all lowercase. Uppercase letters are not permitted.
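The naming rules above can be checked before a request is ever sent. The following is a minimal sketch, assuming the rules are: 3 to 63 characters, lowercase letters, numbers, and hyphens only, a letter or number at each end, and no consecutive hyphens. The function name is_valid_container_name is hypothetical and is not part of any Azure SDK.

```python
import re

# One or more groups of lowercase letters/digits separated by single hyphens,
# with a lookahead that limits the total length to 3-63 characters.
_CONTAINER_NAME = re.compile(r"(?=.{3,63}\Z)[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_container_name(name: str) -> bool:
    """Return True if name satisfies the container naming rules sketched above."""
    return _CONTAINER_NAME.fullmatch(name) is not None
```

For example, is_valid_container_name("profile-pictures") passes, while "Profile-Pictures" (uppercase), "my--container" (consecutive hyphens), and "ab" (too short) are rejected.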
Blob
A blob is a single object stored in a container. A blob can hold:
- Text or binary data.
- Any type of file, such as a document, image, or video.
There are three types of blobs:
- Block blobs: Optimized for large amounts of unstructured data. Block blobs are made up of blocks of data that can be managed independently. This is the most common type of blob used for storing files such as documents, media files, and backups.
- Append blobs: Optimized for append operations, such as logging data from a virtual machine. Append blobs are made up of blocks, but they are specifically designed so that blocks can only be appended to the end of the blob.
- Page blobs: Optimized for random read and write operations. Page blobs are made up of 512-byte pages, and writes must align to 512-byte page boundaries. They are primarily used for IaaS virtual machine disks.
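Because page blobs are addressed in 512-byte pages, a write that touches an arbitrary byte range has to be widened to whole pages. The arithmetic can be sketched as follows. The helper name aligned_page_range is hypothetical, and the sketch only computes offsets; it does not call any storage API.

```python
from typing import Tuple

PAGE_SIZE = 512  # size of one page in a page blob, in bytes


def aligned_page_range(offset: int, length: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) byte offsets of the whole pages
    that cover `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Round the start down to a page boundary.
    start = (offset // PAGE_SIZE) * PAGE_SIZE
    # Round the exclusive end up to a page boundary, then make it inclusive.
    end = ((offset + length + PAGE_SIZE - 1) // PAGE_SIZE) * PAGE_SIZE - 1
    return start, end
```

A request for 500 bytes at offset 100 spans two pages, so aligned_page_range(100, 500) returns (0, 1023).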
Blob Name
A blob name is unique within a container. Every blob has a name.
Blob names can contain any combination of characters and are case-sensitive. However, to comply with REST access and HTTP URL conventions, it's recommended to use characters that are URL-safe, and reserved URL characters must be percent-encoded. Blob names must be from 1 to 1024 characters long.
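To illustrate why URL-safe names matter, here is a minimal sketch that builds a blob URL and percent-encodes the blob name. It assumes the public Azure endpoint pattern https://<account>.blob.core.windows.net; the function name blob_url and the account and container values are hypothetical.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the URL of a blob, percent-encoding characters that are not URL-safe.

    Forward slashes are left as-is so that virtual directory separators
    stay readable in the path.
    """
    encoded_name = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded_name}"
```

A name such as "2023/my photo#1.jpg" becomes "2023/my%20photo%231.jpg" in the URL, which shows how quickly names with spaces or reserved characters become awkward to read and share.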
A blob name can be represented as a virtual directory structure using forward slashes (/). For example, photos/2023/vacation/photo1.jpg.
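The directory structure is virtual: the container stores a flat list of names, and a listing groups them by prefix and delimiter. The following sketch imitates that grouping in plain Python on an in-memory list of names. The function name list_level is hypothetical, and the code is an illustration of the idea rather than the service's implementation.

```python
from typing import Iterable, List, Tuple


def list_level(blob_names: Iterable[str], prefix: str = "",
               delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Return (blobs, virtual_directories) found directly under `prefix`."""
    blobs = []
    directories = set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # More path segments follow, so report only the virtual directory.
            directories.add(prefix + head + delimiter)
        else:
            # No delimiter left: the blob sits directly under the prefix.
            blobs.append(name)
    return sorted(blobs), sorted(directories)
```

With names such as photos/2023/vacation/photo1.jpg and photos/readme.txt, calling list_level(names, "photos/") reports photos/readme.txt as a blob and photos/2023/ as a virtual directory.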
Access Tiers
Azure Blob Storage offers different access tiers that can be used to store data at the most cost-effective levels. The access tiers are:
- Hot tier: Optimized for frequently accessed data. This tier has the highest storage costs but the lowest access costs.
- Cool tier: Optimized for infrequently accessed data that is stored for at least 30 days. Data in the cool tier is online and is served with latency and throughput comparable to the hot tier. Storage costs are lower than in the hot tier, but access costs are higher.
- Archive tier: An offline tier optimized for rarely accessed data that can tolerate hours of retrieval time. A blob in the archive tier must be rehydrated to an online tier before it can be read. This tier offers the lowest storage costs but the highest retrieval costs and longest retrieval times.
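You can set a default access tier, such as hot or cool, at the account level, and you can set the tier of an individual blob explicitly. The archive tier can be set only at the blob level. Access tiers cannot be set on a container.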
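As a rough illustration of how the tiers map to access patterns, the sketch below suggests a tier from the number of days since a blob was last read. The function name suggest_tier and the 30-day and 180-day thresholds are assumptions chosen for illustration, not Azure guidance; real decisions should also weigh pricing and retrieval requirements.

```python
def suggest_tier(days_since_last_access: int, can_wait_hours: bool) -> str:
    """Suggest an access tier name from a simple access-pattern heuristic.

    The thresholds below are illustrative assumptions, not official values.
    """
    if days_since_last_access < 30:
        return "Hot"
    # Archive is offline, so it only fits data that can wait hours to be read.
    if days_since_last_access < 180 or not can_wait_hours:
        return "Cool"
    return "Archive"
```

Data that has not been read for over a year but must still be available immediately stays in the cool tier under this heuristic, because the archive tier cannot serve reads without rehydration.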
Shared Access Signatures (SAS)
A Shared Access Signature (SAS) is a URI that grants restricted access rights to Azure Storage resources. A SAS lets you delegate access to resources in your storage account to a client without sharing your account access keys.
SAS provides:
- Delegated access: Clients can access resources on your behalf.
- Restricted permissions: You can grant specific permissions, such as read, write, delete, and list.
- Time limits: You can set an expiration time for the SAS token.
- IP restrictions: You can specify allowed IP addresses or ranges.
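There are three types of SAS:
- Service SAS: Signed with the storage account key. Delegates access to a resource in a single storage service, such as Blob Storage.
- User delegation SAS: Signed with Azure AD (now Microsoft Entra ID) credentials instead of the account key.
- Account SAS: Signed with the storage account key. Delegates access to resources in one or more storage services.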
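The idea behind a key-signed SAS can be sketched with an HMAC-SHA256 signature over the granted permissions, the expiry time, and the resource path. The code below is a simplified illustration only: the string-to-sign layout and the function names make_token and is_token_valid are assumptions, and a real SAS includes additional fields (such as the service version) in a format defined by the Azure Storage REST API. In practice you would generate tokens with an Azure SDK rather than by hand.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import parse_qsl, urlencode


def _signature(account_key_b64: str, resource: str, permissions: str, expiry: str) -> str:
    # Simplified, hypothetical string-to-sign; the real format has more fields.
    string_to_sign = "\n".join([permissions, expiry, resource])
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def make_token(account_key_b64: str, resource: str, permissions: str, expiry: str) -> str:
    """Return a query string carrying permissions (sp), expiry (se), and signature (sig)."""
    sig = _signature(account_key_b64, resource, permissions, expiry)
    return urlencode({"sp": permissions, "se": expiry, "sig": sig})


def is_token_valid(account_key_b64: str, resource: str, token: str, now: datetime) -> bool:
    """Recompute the signature and check that the token has not expired."""
    params = dict(parse_qsl(token))
    expected = _signature(account_key_b64, resource, params["sp"], params["se"])
    if not hmac.compare_digest(expected, params["sig"]):
        return False  # permissions, expiry, or resource were tampered with
    expiry = datetime.strptime(params["se"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now < expiry
```

Because the permissions and expiry are part of the signed string, a client cannot widen its own access or extend the time limit without invalidating the signature, and the account key itself never leaves the server.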
Access Control
Azure Blob Storage supports multiple methods for controlling access to your data:
- Azure Role-Based Access Control (RBAC): Assigns roles to users, groups, or service principals to grant permissions to Azure Storage resources.
- Access Control Lists (ACLs): For storage accounts with a hierarchical namespace enabled (Azure Data Lake Storage), you can define granular, POSIX-style permissions on directories and files using ACLs.
- Shared Access Signatures (SAS): As described above, for temporary, delegated access.
Example Scenario
Imagine you are building a web application that allows users to upload profile pictures. You would:
- Create an Azure Storage account.
- Inside the storage account, create a container named profile-pictures.
- When a user uploads a picture, upload it as a block blob to the profile-pictures container. The blob name could be {userId}.jpg.
- Grant read access to the profile-pictures container so that your web application can serve the images to users. You might use RBAC for this.
- If you need to provide temporary access for a user to download their own picture, you could generate a SAS token with read permissions for that specific blob.
This example illustrates how the core concepts of storage accounts, containers, blobs, and access control work together to manage your data effectively in Azure Blob Storage.
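The upload step of this scenario can be sketched in a few lines. The sketch assumes container_client is an object with an upload_blob(name, data, overwrite=...) method, as offered by the ContainerClient class in the azure-storage-blob Python package. The helper names and the InMemoryContainer stand-in are hypothetical and exist only so the example runs without an Azure account.

```python
def profile_blob_name(user_id: str) -> str:
    """Name the blob after the user, following the {userId}.jpg convention."""
    return f"{user_id}.jpg"


def upload_profile_picture(container_client, user_id: str, image_bytes: bytes) -> str:
    """Upload the picture to the given container and return the blob name.

    overwrite=True replaces the previous picture when a user uploads a new one.
    """
    name = profile_blob_name(user_id)
    container_client.upload_blob(name=name, data=image_bytes, overwrite=True)
    return name


class InMemoryContainer:
    """Hypothetical stand-in for a container client, for local experiments."""

    def __init__(self):
        self.blobs = {}

    def upload_blob(self, name, data, overwrite=False):
        if name in self.blobs and not overwrite:
            raise ValueError(f"blob {name!r} already exists")
        self.blobs[name] = bytes(data)


# With the real SDK, the client would be created along these lines
# (connection string assumed to be available as conn_str):
#   from azure.storage.blob import BlobServiceClient
#   service = BlobServiceClient.from_connection_string(conn_str)
#   container_client = service.get_container_client("profile-pictures")
```

Passing the client in as a parameter keeps the upload logic independent of how the client authenticates, so the same function works whether access is granted through RBAC, a SAS, or a test double.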