Frequently Asked Questions: Azure Cosmos DB

General

What is Azure Cosmos DB?

Azure Cosmos DB is a globally distributed, multi-model database service that enables you to build highly responsive, always-on applications. It offers a variety of APIs, including SQL (DocumentDB), MongoDB, Cassandra, Gremlin (Graph), and Table.

What are the key features of Cosmos DB?

  • Global Distribution: Turnkey global distribution with active-active capabilities and multi-region writes.
  • Elastic Scalability: Independent and elastic scaling of throughput and storage.
  • Guaranteed Performance: SLA-backed low latency and high availability (up to 99.999% for multi-region accounts).
  • Multiple APIs: Supports popular APIs like SQL, MongoDB, Cassandra, Gremlin, and Table.
  • Multiple Data Models: Supports document, key-value, graph, and column-family data.
  • Automatic Sharding: Data is automatically sharded and distributed across partitions.
  • Enterprise-Grade Security: Robust security features, including Azure Active Directory integration and private endpoints.

What are the pricing models for Cosmos DB?

Cosmos DB offers two main pricing models:

  • Provisioned throughput: You provision a fixed number of Request Units per second (RU/s) for your containers. This suits predictable workloads.
  • Serverless: You are billed for the RUs your operations actually consume and for the storage used. This suits unpredictable or intermittent workloads.

Pricing also depends on the region, storage, and features used.

Performance & Scalability

What are Request Units (RUs)?

Request Units (RUs) are a normalized measure of throughput for Azure Cosmos DB. A Request Unit abstracts the system resources (CPU, I/O, and memory) required to perform a database operation, such as reading an item, writing an item, or running a query. All operations, regardless of API, are measured and billed in RUs.
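
As a rough illustration of how RU costs add up, the sketch below estimates the throughput a workload might need. The function names are hypothetical and the per-operation costs are assumptions: a point read of a 1 KB item costs about 1 RU, while the 5 RU write cost is only a ballpark. Real charges depend on item size, indexing, and query shape, and the service reports them with each response. The 100 RU/s rounding step is likewise an assumption about how throughput is provisioned.

```python
import math

# Assumed per-operation costs (illustrative only; measure real charges
# from the request charge returned with each response).
RU_PER_POINT_READ = 1.0   # roughly 1 RU for a 1 KB point read
RU_PER_WRITE = 5.0        # ballpark assumption for a 1 KB write


def estimate_ru_per_second(reads_per_sec, writes_per_sec,
                           ru_per_read=RU_PER_POINT_READ,
                           ru_per_write=RU_PER_WRITE):
    """Sum the RU cost of all operations issued in one second."""
    return reads_per_sec * ru_per_read + writes_per_sec * ru_per_write


def round_up(ru_per_sec, increment=100):
    """Round up to the next multiple of `increment` (assumed provisioning step)."""
    return math.ceil(ru_per_sec / increment) * increment


# Hypothetical workload: 500 point reads and 147 writes per second.
needed = estimate_ru_per_second(reads_per_sec=500, writes_per_sec=147)
print(needed, round_up(needed))
```

With these assumed costs, the workload needs 1235 RU/s, which rounds up to 1300 RU/s.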

How does partitioning work in Cosmos DB?

Cosmos DB groups items into logical partitions. Each logical partition contains the items that share the same partition key value. The partition key is a property within your documents that determines how data and requests are spread across the database, so choosing an effective one is crucial for performance and scalability. Cosmos DB automatically maps logical partitions onto physical partitions and rebalances them as data grows.
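
The idea of mapping a partition key value to a physical partition can be sketched in a few lines. This is a toy model only: the function name is hypothetical, and MD5 with a modulo step is a stand-in, not the hash function or range scheme the service actually uses.

```python
import hashlib


def physical_partition_for(partition_key_value, physical_partition_count):
    """Map a partition key value to a physical partition index.

    MD5 and the modulo step are stand-ins chosen for illustration; they are
    not the hash function or range scheme the service actually uses.
    """
    digest = hashlib.md5(str(partition_key_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % physical_partition_count


# Hypothetical items partitioned by userId.
items = [
    {"id": "1", "userId": "alice"},
    {"id": "2", "userId": "bob"},
    {"id": "3", "userId": "alice"},
]

# Items with the same partition key value share a logical partition,
# so they always land on the same physical partition.
placement = {item["id"]: physical_partition_for(item["userId"], 4) for item in items}
print(placement)
```

Items 1 and 3 share the key value "alice", so they always map to the same partition, whichever index the hash produces.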

What is a partition key and how do I choose one?

A partition key is a property in your items that Cosmos DB uses to distribute data across logical and physical partitions. The choice of partition key is critical for scalability and performance. A good partition key should have a high cardinality (many distinct values) and distribute requests evenly across partitions. Common choices include user IDs, tenant IDs, or order IDs.

Consider these factors:

  • Cardinality: The number of unique values for the partition key. Higher is generally better.
  • Distribution: Ensure the partition key distributes requests evenly. Avoid "hot partitions" where one partition receives a disproportionate amount of traffic.
  • Query Patterns: Design your partition key to align with your common query patterns to minimize cross-partition queries.
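
One way to compare candidate keys against the first two factors is to measure them on a sample of your data. The sketch below is a minimal example with made-up order documents and a hypothetical helper, key_stats. It reports how many distinct values each candidate has and what share of items the most common value holds.

```python
from collections import Counter


def key_stats(items, key):
    """Return (distinct value count, share of items held by the most common value)."""
    counts = Counter(item[key] for item in items)
    largest = counts.most_common(1)[0][1]
    return len(counts), largest / len(items)


# Hypothetical order documents: 8 belong to tenant t1, 2 to tenant t2.
orders = [
    {"orderId": f"o{i}", "tenantId": "t1" if i < 8 else "t2", "country": "US"}
    for i in range(10)
]

for candidate in ("orderId", "tenantId", "country"):
    cardinality, top_share = key_stats(orders, candidate)
    print(f"{candidate}: {cardinality} distinct values, top value holds {top_share:.0%}")
```

In this sample, orderId spreads items evenly, tenantId is skewed toward one tenant, and country puts everything in a single partition. The check covers stored data only; request distribution and query patterns still need to be assessed separately.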

What are "hot partitions"?

A "hot partition" occurs when a single partition receives a disproportionate share of traffic (reads or writes) compared to other partitions. Because the load is concentrated on one partition, requests to it can be throttled (rejected with HTTP 429) and performance degrades even while other partitions sit idle. The usual cause is a poorly chosen partition key or uneven data distribution.
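
Provisioned throughput is divided evenly across physical partitions, so a skewed workload can be throttled even when total demand is below the provisioned amount. The sketch below illustrates this with made-up numbers; find_hot_partitions is a hypothetical helper, not part of any SDK.

```python
def find_hot_partitions(provisioned_ru_per_sec, consumed_by_partition):
    """Return partitions whose demand exceeds their even share of throughput."""
    share = provisioned_ru_per_sec / len(consumed_by_partition)
    return {
        partition: demand
        for partition, demand in consumed_by_partition.items()
        if demand > share
    }


# Made-up per-partition demand in RU/s for a container with 10,000 RU/s.
demand = {"p0": 4500, "p1": 800, "p2": 700, "p3": 900}

# Total demand is well below the provisioned 10,000 RU/s...
print(sum(demand.values()))
# ...yet p0 exceeds its even share of 2,500 RU/s.
print(find_hot_partitions(10_000, demand))
```

Here the total demand is 6,900 RU/s, but partition p0 asks for 4,500 RU/s against a 2,500 RU/s share, so its requests would be the ones throttled.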

APIs and Data Models

What APIs does Azure Cosmos DB support?

Azure Cosmos DB supports multiple APIs, allowing you to use the data model and programming model that best suits your application:

  • Core (SQL) API: The original API for Cosmos DB, offering document database capabilities with a SQL query language.
  • MongoDB API: Compatible with MongoDB applications. You can use existing MongoDB drivers and tools with Cosmos DB.
  • Cassandra API: Compatible with Apache Cassandra applications.
  • Gremlin API: For graph database workloads.
  • Table API: Compatible with Azure Table storage applications.

Can I migrate my existing database to Cosmos DB?

Yes, you can migrate existing databases to Cosmos DB. The process depends on the source database and the Cosmos DB API you choose. Azure provides tools and services like Azure Database Migration Service to facilitate these migrations. For example, you can migrate data from MongoDB, SQL Server, or other sources to the corresponding Cosmos DB API.

How do I query data in Cosmos DB (Core SQL API)?

You can query data using a SQL-like query language. The SDKs provide methods to execute queries. For example, in the .NET SDK:

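    // Requires: using Microsoft.Azure.Cosmos;
    string sqlQueryText = "SELECT VALUE r FROM root r WHERE r.city = 'Seattle'";

    // MyDocument is the application's own item type.
    FeedIterator<MyDocument> feedIterator =
        this.container.GetItemQueryIterator<MyDocument>(sqlQueryText);

    // Results arrive in pages; keep reading until none remain.
    while (feedIterator.HasMoreResults)
    {
        FeedResponse<MyDocument> response = await feedIterator.ReadNextAsync();
        foreach (MyDocument document in response)
        {
            Console.WriteLine($"Found {document.id}");
        }
    }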

Consistency, Availability, and SLAs

What are the consistency levels in Cosmos DB?

Cosmos DB offers five well-defined consistency levels, balancing consistency, availability, and latency:

  • Strong: Reads are guaranteed to return the most recent committed write.
  • Bounded Staleness: Reads may lag behind writes, but by no more than a configured number of versions (K) or a configured time interval (T).
  • Session: The default consistency level. Within a single client session, reads always reflect that session's own writes and never go backwards.
  • Consistent Prefix: Reads never see writes out of order. Whatever a read returns is a prefix of the full sequence of writes, with no gaps.
  • Eventual: Reads can return stale data, but all replicas eventually converge.
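
To make the differences concrete, here is a toy model of when a replica may answer a read at each level. It is a sketch under strong simplifying assumptions, not a description of how the service is implemented: versions are plain integers that grow by one per committed write, and can_serve_read and its parameters are hypothetical names.

```python
def can_serve_read(level, replica_version, latest_version,
                   session_version=0, max_lag_versions=0):
    """Decide whether a replica at `replica_version` may answer a read.

    Toy model: versions are integers that grow by one per committed write.
    """
    if level == "strong":
        # Must reflect the most recent committed write.
        return replica_version == latest_version
    if level == "bounded_staleness":
        # May lag, but by no more than the configured number of versions.
        return latest_version - replica_version <= max_lag_versions
    if level == "session":
        # Must include everything this client session has already written.
        return replica_version >= session_version
    if level in ("consistent_prefix", "eventual"):
        # Any replica may answer; the two differ in ordering guarantees,
        # which this version-only model does not capture.
        return True
    raise ValueError(f"unknown consistency level: {level}")


# A replica that has applied 97 of 100 writes; the client last wrote version 95.
for level in ("strong", "bounded_staleness", "session", "eventual"):
    print(level, can_serve_read(level, 97, 100, session_version=95, max_lag_versions=5))
```

In this scenario only the strong read is refused: the replica is 3 versions behind, which is within the staleness bound of 5 and already includes the session's last write.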

What are the SLAs for Azure Cosmos DB?

Azure Cosmos DB provides industry-leading Service Level Agreements (SLAs) for:

  • Availability: Up to 99.999% for multi-region availability.
  • Throughput: Guaranteed throughput for provisioned RUs.
  • Latency: Guaranteed low latency for reads and writes, measured at the 99th percentile (e.g., reads served in under 10 ms and writes in under 15 ms).
  • Consistency: Guaranteed consistency levels.
  • Storage: Guaranteed storage.

If Cosmos DB fails to meet its SLA, you may be eligible for a service credit.
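
To put the availability percentages in perspective, the arithmetic below converts a target into a downtime budget. It is plain arithmetic with a hypothetical helper name; the actual SLA is calculated according to the terms of the published agreement, not by this formula.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_percent, days=365):
    """Minutes of unavailability permitted over `days` at the given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * days * MINUTES_PER_DAY


for target in (99.99, 99.999):
    print(f"{target}%: {downtime_budget_minutes(target):.2f} min/year")
```

At 99.999%, the budget is about 5.26 minutes per year, compared with roughly 52.56 minutes at 99.99%.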