05 Feb 2013
tl;dr: Highly Available Transactions show it’s possible to achieve many of the transactional guarantees of today’s databases without sacrificing high availability and low latency.
CAP and ACID
Distributed systems designers face hard trade-offs between factors like latency, availability, and consistency. Perhaps most famously, the CAP Theorem dictates that it is impossible to achieve “consistency” while remaining available in the presence of network and system partitions.1 Further, even without partitions, there is a trade-off between response time and consistency. These fundamental limitations mean distributed databases can’t have it all, and the limitations aren’t simply theoretical: across datacenters, the penalties for strong consistency are on the order of hundreds of milliseconds (compared to single-digit latencies for weak consistency) and, in general, unavailability takes the form of a 404 or Fail Whale on a website. Over twelve years after Eric Brewer first stated the CAP Theorem (and after decades of building distributed database systems), data store designers have taken CAP to heart, some choosing consistency and others choosing availability and low latency.
While the CAP Theorem is fairly well understood, the relationship between CAP and ACID transactions is not. If we consider the current lack of highly available systems providing arbitrary multi-object operations with ACID-like semantics, it appears that CAP and transactions are incompatible. This is partly due to the historical design of distributed database systems, which typically chose consistency over high availability. Standard database techniques like two-phase locking and multi-version concurrency control do not typically perform well in the event of partial failure, and the master-based (i.e., master-per-shard) and overlapping quorum-based techniques often adopted by many distributed database designs are similarly unavailable if users are partitioned from the anointed primary copies.
HATs for Everyone
In recent research at UC Berkeley, we show that high availability and transactions are not mutually exclusive: it is possible to match the semantics provided by many of today’s “ACID” and “NewSQL” databases without sacrificing high availability. While these Highly Available Transactions (HATs) do not provide serializability—which is not highly available under arbitrary read/write transactions—as I blogged about last week, many ACID databases provide a weaker form of isolation. The problem is that these databases do not implement their guarantees using highly available algorithms. However, as our recent results demonstrate, we can implement these guarantees and achieve other useful properties without giving up high availability or having to incur cross-replica (or, in a georeplicated scenario, cross-datacenter) latencies.
At a high level, HATs provide several guarantees that can be achieved with high availability2 for arbitrary read/write transactions across a given set of data items, irrespective of data layout:
Transactional atomicity across arbitrary data items (e.g., see all or none of a transaction’s updates, or “A” in “ACID”), regardless of how many shards a transaction accesses and without using a master.
ANSI-compliant Read Committed and Repeatable Read isolation levels3 (”I” in “ACID” matching many existing databases).
Session guarantees including read-your-writes, monotonic reads (i.e., time doesn’t go backwards), and causality within and across transactions.
Eventual consistency, meaning that, if writes to a data item stop, all transaction reads will eventually return the last written value.
We believe that this is the strongest set of guarantees that have been provided with high availability, and many of the algorithms—like the atomicity and isolation guarantees—are brand new, namely because they don’t use masters or other coordination on transactions’ fast paths. The brief report we just released runs slightly over five pages and includes proof-of-concept algorithms for each guarantee.
Of course, there are several guarantees that HATs cannot provide. Not even the best of marketing teams can produce a real database that “beats CAP”; HATs cannot make guarantees on data recency during partitions, although, in the absence of partitions, data may not be very stale. HATs cannot be “100% ACID compliant” as they cannot guarantee serializability, yet they meet the default and sometimes maximum guarantees of many “ACID” databases. HATs cannot guarantee global integrity constraints (e.g., uniqueness constraints across data items) but can perform local checking of predicates (e.g., per-record integrity maintenance like null value checks). In the report, we classify many of these anomalies in terms of previously documented isolation levels.
Are these guarantees worthwhile? If users need high availability or low latency, HATs provide a set of semantics that is stronger than any existing highly available data store. If users need strong consistency guarantees, they will need to accept the possibility of unavailability and expect to pay at least one round trip time for each of their operations. As an example, people often ask me about Spanner, from Google. Spanner provides strong consistency and typically low latency read-only transactions. Users that are partitioned from the majority of Spanner nodes will experience unavailability and read-write transactions will incur WAN latencies due to Spanner’s two-phase locking mechanism. Spanner’s authors don’t hide these facts—for example, look at Table 6 on page 12 of the paper: read/write transactions are between 8.3 and 11.9 times slower than read-only transactions. For Google, who has optimized their WAN networks, atomic clocks, and infrastructure engineering and whose workload (also in Table 6) is composed of over 98% of read-only transactions, Spanner makes sense. When high availability and guaranteed low latency matter, even Google might choose a different architecture.
Our work on HATs at Berkeley is just beginning. We’re benchmarking a HAT prototype and are tuning our algorithms for performance and scalability. Once the algorithms are better explored, I would personally like to help integrate HATs into existing data stores, much as we recently did with our PBS work in Cassandra. It’d be interesting to port an application running on Oracle Database to a NoSQL store and provide the same semantic guarantees with substantially improved performance, availability, and cost effectiveness. We’re also working on additional theoretical results to further explain HATs in the context of CAP. I plan to share these results as we develop them further.
In the meantime, we’d welcome feedback on our work so far and are curious where HATs make sense in your stack. If you’re an application developer who wishes she had transactional atomicity or weak isolation, a distributed database developer interested in HATs, or you just think HATs are cool, let us know. We’re always looking for anecdotes, workloads, and good conversation.
This is Part Two of a two part series on Transactions and Availability.
Part One: When is ACID ACID?
 As formally proven by Gilbert and Lynch, the CAP Theorem states that linearizability and high availability are incompatible. Linearizability is often called “atomicity,” yet “atomicity” means something different in database parlance, namely, we see all or none of a transaction’s updates. For clarity, I’ll call “ACID atomicity” “transactional atomicity.”
 As we discuss in Section 2 of the paper, we have to be careful how to define “high availability”: a system that always aborts all transactions is, in a sense, “available,” if not very useful. In short, we say that a system provides high availability if every transaction that can contact at least one server for each data item in the transaction eventually commits (or alternatively, aborts itself due to an internal integrity constraint violation). The system is not allowed to indefinitely abort transactions for the purposes of maintaining availability.
 If you’re a database nut, you may object that the ANSI SQL definitions are notoriously underspecified. However, rest assured that HAT “Read Committed” matches all of the definitions we’ve found in the literature, including those by Berenson et al. (SIGMOD 1995) and Adya (MIT Ph.D. thesis, ICDE 2000). HAT “Repeatable Read”—and “Repeatable Read” interpretations in general—is more complicated; HAT “Repeatable Read” does match the ANSI spec, and we provide a detailed discussion in Section 4 of the paper.
- Causality is expensive (and what to do about it) (05 Feb 2014)
- Stickiness and client-server session guarantees (13 Jan 2014)
- On consistency and durability (10 Dec 2013)
- Non-blocking transactional atomicity (28 May 2013)
- Communication costs in real-world networks (17 May 2013)
- When is "ACID" ACID? Rarely. (22 Jan 2013)
- Using PBS in Cassandra 1.2.0 (14 Jan 2013)
- Doing redundant work to speed up distributed queries (20 Sep 2012)
- Safety and liveness: Eventual consistency is not safe (27 Mar 2012)
- A running list: Writing, speaking, and research advice (17 Mar 2012)