Core Architectural Components
Azure Cosmos DB is a globally distributed, multi-model database service. Its architecture is designed for high availability, low latency, and elastic scalability across any number of regions.
Key Components:
- Partitioning: Cosmos DB partitions data horizontally. Items that share a partition key value form a logical partition, and logical partitions are distributed across physical partitions. This is fundamental to achieving massive scalability and high throughput.
- Replication: Data is automatically replicated across multiple regions for high availability and disaster recovery. Users can configure their desired consistency levels and replication topology.
- Request Router: This intelligent service intercepts all incoming requests, determines the appropriate partitions to route the request to, and handles failover scenarios.
- Storage Service: Manages the physical storage of data, ensuring durability and efficient retrieval. It operates on a per-partition basis.
- Compute Service: Handles query processing, indexing, and transactions. It's optimized for low-latency operations.
- Global Distribution: Cosmos DB's turnkey global distribution allows for active-active replication (multi-region writes) across any number of Azure regions.
Figure: A high-level overview of Cosmos DB's distributed architecture.
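The partitioning idea described above can be illustrated with a minimal sketch. It is not how the service is implemented: the use of MD5, the 32-bit hash space, and the names `hash_partition_key` and `physical_partition_for` are assumptions made for illustration. It only shows the general technique of hashing a partition key value and mapping contiguous hash ranges to physical partitions.

```python
import hashlib

# Size of the illustrative hash space (an assumption for this sketch).
HASH_SPACE = 2**32


def hash_partition_key(value: str) -> int:
    """Map a partition key value to a point in the hash space.

    MD5 is a stand-in here; the real service uses its own hash function.
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def physical_partition_for(value: str, partition_count: int) -> int:
    """Pick the physical partition that owns the hash range containing the key."""
    range_size = HASH_SPACE // partition_count
    # Clamp so that hashes in the leftover tail fall into the last partition.
    return min(hash_partition_key(value) // range_size, partition_count - 1)


# Every item with the same partition key value lands on the same partition.
for key in ("customer-1", "customer-2", "customer-3"):
    print(key, "->", physical_partition_for(key, partition_count=4))
```

Because the mapping depends only on the partition key value, all items in one logical partition are stored together, which is why the choice of partition key affects how evenly load is spread.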
Data Model Abstraction
While Cosmos DB supports multiple data models (Document, Key-Value, Graph, Column-Family), its underlying architecture is consistent. It uses an abstract, ordered data store that can be mapped to these different models:
- Items: The smallest unit of data, typically a JSON document.
- Containers (formerly collections): A group of items. A container is roughly analogous to a table in a relational database, although its items need not share a schema.
- Database: A logical namespace that groups containers.
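The hierarchy above (database, container, item) can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the classes `Database` and `Container`, the `upsert` and `create_container` methods, and the `/customerId` partition key path are hypothetical names chosen for illustration and are not the Cosmos DB SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    """A group of items, keyed by (partition key value, item id)."""
    id: str
    partition_key_path: str  # e.g. "/customerId" (illustrative)
    items: dict = field(default_factory=dict)

    def upsert(self, item: dict) -> None:
        # Look up the partition key value using the path without its leading slash.
        pk_value = item[self.partition_key_path.lstrip("/")]
        self.items[(pk_value, item["id"])] = item


@dataclass
class Database:
    """A logical namespace that groups containers."""
    id: str
    containers: dict = field(default_factory=dict)

    def create_container(self, container_id: str, partition_key_path: str) -> Container:
        # Return the existing container if one with this id already exists.
        return self.containers.setdefault(
            container_id, Container(container_id, partition_key_path)
        )


db = Database("shop")
orders = db.create_container("orders", "/customerId")
orders.upsert({"id": "1", "customerId": "c1", "total": 10})
orders.upsert({"id": "1", "customerId": "c2", "total": 25})  # same id, different partition
```

In this sketch an item is identified by its partition key value together with its `id`, so the two upserts above produce two distinct items.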
Consistency Models
Cosmos DB offers five well-defined consistency levels, allowing you to trade off consistency against availability, throughput, and latency based on your application's needs. From strongest to weakest:
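- Strong: Reads always return the most recent committed write.
- Bounded Staleness: Reads lag behind writes by at most K versions of an item or a time interval T, whichever is reached first.
- Session: Within a single client session, reads honor read-your-writes and monotonic-read guarantees. Other sessions may see older data.
- Consistent Prefix: Reads never see writes out of order. If writes occur in the order A, B, C, a reader may see A, or A and B, or A, B, and C, but never A and C without B.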
- Eventual: Reads may return stale data, but will eventually converge.
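The differences between these levels can be sketched with a toy model in which a replica has applied some prefix of the primary's write log. This is a simplified illustration, not the service's protocol: the names `Replica`, `can_serve`, `max_lag`, and `session_lsn` are assumptions, and staleness is measured only in versions, not in time.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    applied: int  # number of writes from the primary's log applied so far, in order


def can_serve(level: str, replica: Replica, committed: int,
              *, max_lag: int = 0, session_lsn: int = 0) -> bool:
    """Decide whether a replica may answer a read at the given consistency level.

    committed   -- number of writes committed on the primary
    max_lag     -- K, the version bound for bounded staleness
    session_lsn -- position of the last write this client session has seen
    """
    lag = committed - replica.applied
    if level == "strong":
        return lag == 0                       # must reflect every committed write
    if level == "bounded_staleness":
        return lag <= max_lag                 # at most K versions behind
    if level == "session":
        return replica.applied >= session_lsn  # read-your-writes for this session
    if level in ("consistent_prefix", "eventual"):
        # This model applies writes strictly in order, so any replica state is a
        # valid prefix; the two levels are indistinguishable in this sketch.
        return True
    raise ValueError(f"unknown consistency level: {level}")


lagging = Replica(applied=7)
for level in ("strong", "bounded_staleness", "session", "eventual"):
    print(level, can_serve(level, lagging, committed=10, max_lag=5, session_lsn=6))
```

The sketch shows why weaker levels tend to offer lower latency and higher availability: more replicas qualify to answer a read, so the request is less likely to wait for a replica to catch up.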