In the context of distributed systems, especially databases and data stores, the CAP theorem is a fundamental principle. It highlights a critical trade-off that designers must consider when building systems that need to operate across multiple nodes, potentially in different network locations.
Every read receives the most recent write or an error. All nodes see the same data at the same time. If you write data to one node, any subsequent read from any node will return that new data.
Every request receives a (non-error) response, without guarantee that it contains the most recent write. The system remains operational and responsive, even if some nodes are down or unreachable.
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes. This means the system can withstand network failures that split the cluster into multiple partitions.
The theorem is often stated as: in the presence of a network partition (P), you must choose between Consistency (C) and Availability (A).
Visual representation of the CAP theorem trade-off.
If a partition occurs, these systems sacrifice availability to maintain consistency. When a node cannot communicate with the rest of the cluster, it might refuse to serve requests to ensure that it doesn't return stale data. Examples often include traditional relational databases in distributed setups and some NoSQL databases like ZooKeeper or etcd.
If a partition occurs, these systems sacrifice consistency to maintain availability. They will continue to respond to requests even if they cannot guarantee that the data is the most up-to-date across all nodes. This can lead to reading stale data, but the system remains accessible. Examples include Cassandra, CouchDB, and Amazon DynamoDB.
A system that is Consistent and Available but not Partition Tolerant is essentially a single-node system or a cluster where network partitions are impossible. In modern distributed systems, partition tolerance is almost always a requirement, making CA systems rare in practice for truly distributed applications.
In any real-world distributed system, network failures are inevitable. Nodes might go offline, networks can become congested, or data centers might experience outages. Therefore, partition tolerance (P) is usually considered a non-negotiable requirement. This leaves system designers with a choice between prioritizing Consistency (C) or Availability (A) when a partition occurs.