Azure Cosmos DB Cassandra API Documentation
This documentation provides comprehensive information on using Azure Cosmos DB with its Cassandra API. Azure Cosmos DB is a globally distributed, multi-model database service. The Cassandra API offers a familiar Cassandra Query Language (CQL) interface for developers accustomed to Apache Cassandra.
Introduction to Cassandra API
The Azure Cosmos DB Cassandra API is compatible with the Cassandra Query Language (CQL) v3.11 API and communicates over the CQL v4 wire protocol. This means you can use existing Apache Cassandra drivers and tools that support that protocol to connect to Azure Cosmos DB. Key benefits include:
- Globally distributed: Scale your application horizontally across the globe.
- Managed service: No need to manage infrastructure or perform upgrades.
- Elastic scalability: Scale throughput and storage independently and on demand.
- High availability: Built-in replication and failover mechanisms.
- Tunable consistency: Choose the consistency level that best suits your application's needs.
Getting Started
To start using the Cassandra API, you'll need to:
- Create an Azure Cosmos DB Cassandra API account: You can do this through the Azure portal, Azure CLI, or PowerShell.
- Create a keyspace: A keyspace is a logical container for your tables, similar to a database in a relational system.
- Create a table: Define your table schema using CQL.
- Connect using a Cassandra driver: Use your preferred programming language's Cassandra driver to connect to your Azure Cosmos DB endpoint.
Example: Creating a Keyspace and Table
Here's an example of CQL commands to create a keyspace and a simple table:
-- Create a keyspace with replication factor 1 (for single region)
CREATE KEYSPACE mykeyspace WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
-- Use the keyspace
USE mykeyspace;
-- Create a table for user profiles
CREATE TABLE users (
user_id uuid PRIMARY KEY,
first_name text,
last_name text,
email text,
signup_date timestamp
);
Connecting with Drivers
Azure Cosmos DB Cassandra API supports popular Cassandra drivers. You'll typically configure the driver with your Azure Cosmos DB account's contact point and authentication credentials.
Connection String Example (Java Driver)
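When configuring the DataStax Java driver (3.x Cluster API), use the following format. Azure Cosmos DB accepts only TLS-encrypted connections, so SSL must be enabled on the builder:
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

Cluster cluster = Cluster.builder()
    .addContactPoint("YOUR_COSMOS_DB_ACCOUNT_NAME.cassandra.cosmos.azure.com")
    .withPort(10350) // Cosmos DB Cassandra endpoint port, not 9042
    .withCredentials("YOUR_COSMOS_DB_ACCOUNT_NAME", "YOUR_PRIMARY_KEY")
    .withSSL() // required; uses the JVM's default SSL context
    .build();
Session session = cluster.connect();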
Remember to replace placeholders with your actual Azure Cosmos DB account name and primary key. You can find these in the Azure portal under your Cosmos DB account's "Keys" section.
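Rather than pasting the account name and primary key into source code, the values can be read from the environment and turned into the contact point and credentials the driver expects. The following is a minimal sketch under stated assumptions: the class CosmosCassandraSettings and the variable names COSMOS_ACCOUNT_NAME and COSMOS_PRIMARY_KEY are hypothetical and not part of any Azure SDK. The host suffix and port are the ones shown above.

```java
import java.util.Map;

/** Hypothetical helper: derives Cassandra API connection settings from a variable map. */
final class CosmosCassandraSettings {
    // Port and host suffix as documented above.
    static final int PORT = 10350;
    static final String HOST_SUFFIX = ".cassandra.cosmos.azure.com";

    final String contactPoint;
    final String username;
    final String password;

    private CosmosCassandraSettings(String contactPoint, String username, String password) {
        this.contactPoint = contactPoint;
        this.username = username;
        this.password = password;
    }

    /**
     * Builds settings from a map such as System.getenv().
     * The variable names are assumptions made for this sketch.
     */
    static CosmosCassandraSettings fromEnv(Map<String, String> env) {
        String account = require(env, "COSMOS_ACCOUNT_NAME");
        String key = require(env, "COSMOS_PRIMARY_KEY");
        // The account name doubles as the username; the primary key is the password.
        return new CosmosCassandraSettings(account + HOST_SUFFIX, account, key);
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required variable: " + name);
        }
        return value.trim();
    }
}
```

A caller would typically pass System.getenv() to fromEnv and hand contactPoint, PORT, username, and password to the driver's builder in place of the literal placeholders.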
Important Note on Ports
The Cassandra API endpoint for Azure Cosmos DB uses port 10350, not the default Cassandra port of 9042.
Querying Data
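You can use standard CQL queries to interact with your data. Unlike the Azure Cosmos DB API for NoSQL, the Cassandra API does not index every attribute by default, so queries that filter on a non-key column need a secondary index created with CREATE INDEX. Lookups by primary key, as in the example below, need no additional index.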
Example Queries
Inserting data:
INSERT INTO users (user_id, first_name, last_name, email, signup_date)
VALUES (uuid(), 'Alice', 'Smith', 'alice.smith@example.com', toTimestamp(now()));
Selecting data:
SELECT * FROM users WHERE user_id = a1b2c3d4-e5f6-7890-1234-567890abcdef;
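When the user_id in a query like the one above comes from application input, it helps to validate it before binding it to the statement. The sketch below uses only the Java standard library; the class name UserIdParser is hypothetical. It accepts the canonical 8-4-4-4-12 hexadecimal form and rejects everything else.

```java
import java.util.Locale;
import java.util.UUID;

/** Hypothetical helper: strict parsing of a user_id value before it is bound to a query. */
final class UserIdParser {

    /** Returns the parsed UUID, or throws IllegalArgumentException if the text is not canonical. */
    static UUID parse(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("user_id is required");
        }
        String candidate = raw.trim().toLowerCase(Locale.ROOT);
        // UUID.fromString throws IllegalArgumentException on clearly malformed input.
        UUID parsed = UUID.fromString(candidate);
        // It may still accept shortened groups such as "1-1-1-1-1",
        // so compare the round-tripped text with the input.
        if (!parsed.toString().equals(candidate)) {
            throw new IllegalArgumentException("Not a canonical UUID: " + raw);
        }
        return parsed;
    }
}
```

The resulting UUID can then be bound as a parameter of a prepared statement such as SELECT * FROM users WHERE user_id = ? instead of being concatenated into the CQL text.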
Performance Considerations
To optimize performance and manage costs, consider the following:
- Partitioning: Choose a good partition key to distribute your data evenly across physical partitions.
- Request Units (RUs): Understand how RUs are consumed by your operations and provision adequate throughput.
- Batching: Use batches for related writes that target the same partition, rather than as a general way to increase throughput.
- Data Modeling: Design your schema to support your application's query patterns effectively.
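To make the Request Units point concrete, a rough throughput estimate can be worked out from expected request rates and the per-operation charges you observe for your own workload. The sketch below is plain arithmetic, not an Azure API: the class ThroughputEstimate, the headroom factor, and rounding up to a multiple of 100 RU/s are all assumptions for illustration, and the per-operation costs must be replaced with measured values.

```java
/** Hypothetical helper: back-of-the-envelope RU/s estimate from measured per-operation charges. */
final class ThroughputEstimate {

    /**
     * @param readsPerSec  expected read operations per second
     * @param ruPerRead    measured RU charge of one typical read (assumption: supplied by caller)
     * @param writesPerSec expected write operations per second
     * @param ruPerWrite   measured RU charge of one typical write (assumption: supplied by caller)
     * @param headroom     safety margin as a fraction, for example 0.25 for 25%
     * @return estimated RU/s, rounded up to the next multiple of 100 (an assumed increment)
     */
    static long requiredRuPerSecond(double readsPerSec, double ruPerRead,
                                    double writesPerSec, double ruPerWrite,
                                    double headroom) {
        if (readsPerSec < 0 || ruPerRead < 0 || writesPerSec < 0 || ruPerWrite < 0 || headroom < 0) {
            throw new IllegalArgumentException("Inputs must be non-negative");
        }
        double raw = (readsPerSec * ruPerRead + writesPerSec * ruPerWrite) * (1.0 + headroom);
        return (long) Math.ceil(raw / 100.0) * 100L;
    }

    public static void main(String[] args) {
        // 500 reads/s at 1 RU each plus 100 writes/s at 5 RU each, with 25% headroom.
        System.out.println(requiredRuPerSecond(500, 1.0, 100, 5.0, 0.25)); // prints 1300
    }
}
```

An estimate like this is only a starting point; the monitored request charges of the real workload should drive the provisioned value.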
Tip: Choosing a Partition Key
A well-chosen partition key is crucial for performance and scalability. Aim for keys with high cardinality and uniform distribution to avoid hot partitions.
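The effect of cardinality can be illustrated with a small simulation. The sketch below distributes keys over a fixed number of buckets and reports the share of items in the busiest bucket. It is an illustration only: it uses Java's hashCode and a hypothetical bucket count, not the hashing or partition layout that Azure Cosmos DB actually uses, and the class name PartitionSkewDemo is made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;

/** Hypothetical simulation: compares key distribution for high- and low-cardinality keys. */
final class PartitionSkewDemo {

    /** Fraction of items that land in the busiest bucket; 1.0 / buckets is the ideal. */
    static double maxBucketShare(List<?> keys, int buckets) {
        if (keys.isEmpty() || buckets <= 0) {
            throw new IllegalArgumentException("Need at least one key and one bucket");
        }
        int[] counts = new int[buckets];
        for (Object key : keys) {
            // Stand-in hash for illustration; not the service's real partitioning function.
            counts[Math.floorMod(key.hashCode(), buckets)]++;
        }
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        return (double) max / keys.size();
    }

    /** Many distinct values, like a user_id column. */
    static List<UUID> highCardinalityKeys(int n, long seed) {
        Random rnd = new Random(seed);
        List<UUID> keys = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            keys.add(new UUID(rnd.nextLong(), rnd.nextLong()));
        }
        return keys;
    }

    /** Only three distinct values, like a country column. */
    static List<String> lowCardinalityKeys(int n) {
        String[] countries = {"US", "DE", "JP"};
        List<String> keys = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            keys.add(countries[i % countries.length]);
        }
        return keys;
    }

    public static void main(String[] args) {
        int buckets = 10;
        System.out.printf("high cardinality: %.3f%n",
                maxBucketShare(highCardinalityKeys(10_000, 42L), buckets));
        System.out.printf("low cardinality:  %.3f%n",
                maxBucketShare(lowCardinalityKeys(10_000), buckets));
    }
}
```

With three distinct values, at most three buckets ever receive data, so the busiest bucket holds at least a third of the items no matter how many buckets exist. That is the hot-partition pattern the tip warns about.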
Monitoring and Diagnostics
Azure Cosmos DB provides extensive monitoring capabilities through Azure Monitor. You can track request rates, latency, storage, and other key metrics. Use the diagnostic logs to troubleshoot issues.