Bridging the Gap: Opportunities in Coordination-Avoiding Databases

22 Apr 2014

Background: I recently co-organized the Principles and Practice of Eventual Consistency workshop at EuroSys, where I also gave a talk on lessons from some of our recent research. This post contains a summary (in the form of my talk proposal). This is joint work with Alan Fekete, Ali Ghodsi, Mike Franklin, Joe Hellerstein, and Ion Stoica.

My slides are available on Speaker Deck.

Abstract: Weakly consistent systems surface a controversial tension between, on the one hand, availability, latency, and performance, and, on the other, programmability. We propose the concept of coordination avoidance as a unifying, underlying principle behind the former and discuss lessons from our recent experiences mitigating the latter.

Trouble in Paradise

Faced with the task of operating “always on” services and lacking sufficient guidance from the literature regarding alternative distributed designs and algorithms, many Internet service architects and engineers throughout the 2000s discarded traditional database semantics and transactional models in favor of weaker but less principled models: eventual consistency, few if any multi-object, multi-operation (i.e., transactional) guarantees, and ad-hoc application-specific compensation—collectively, much of “NoSQL” [1]. From a research perspective, this space of weaker models has proven to be a fertile area: new (or re-discovered), often esoteric, and almost always nuanced semantics present many opportunities for new systems design, optimizations, and formal study.

Unfortunately, these weaker models come with serious usability disadvantages: programmability suffers. Understanding the implications of non-serializable isolation models for end-user applications is difficult. Programmers have little practical guidance as to how to choose an appropriate model for their applications, and understanding the differences between models effectively requires graduate-level training in distributed systems and/or database theory [2]. Some members within the internet services industry that birthed the resurgence of interest in these semantics have begun a backlash against them: one recent and prominent industrial account unequivocally claims that “designing applications to cope with concurrency anomalies in their data is…ultimately not worth the performance gains” [15]. Statements like these (which, in our experience, enjoy some popularity among practitioners and considerable acceptance in the database community [17]) suggest that, as a research community centered on weak consistency, we are possibly failing to communicate and demonstrate the benefits achievable with these semantics, have underestimated the burden placed on programmers, or a combination of both.

Coordination-Free Execution: A Unifying Principle Behind “AP” Benefits

In tribute to the CAP Theorem [11] that widely popularized these trade-offs, much of the dialogue around weakly consistent models concerns the availability of operations under failures. Availability is an important property, but, in our opinion, a sole focus on availability undervalues the benefits of weak semantics. Daniel Abadi has successfully argued that, while “availability” is primarily relevant in the presence of failures, weakly consistent (“AP”) systems can also offer low latency [1]. Any replica can respond to any request, alleviating the need for many communication delays—in our experience, average LAN latencies can be up to 720x faster than those over WAN.

We would take Abadi’s position even further: weak consistency also allows aggressive scale-out, even at the level of a single data item—more servers can be added without communication between them. Modern, strongly consistent “NewSQL” systems can indeed provide horizontal scale-out using shared-nothing database replication techniques popularized in the 1980s [16]. However, especially for worst-case accesses, these systems are far from “as scalable as NoSQL” [15] systems offering weak isolation. In recent research, we have examined the throughput penalties associated with these “strong” models: in modern LAN and WAN networks, distributed serializable transactions face a worst-case throughput limits of 1200 and 12 read-write transactions per item per second, independent of implementation strategy [6]. Recent systems [10, 15] are no exception: operations over disjoint data items can proceed concurrently, increasing throughput, but conflicting operations over non-disjoint data items are limited by network latency. In contrast, appropriate implementations of weak consistency face no such overheads.

The three properties above—availability, low latency, and scale-out—are consequences of a more fundamental principle underlying weakly consistent systems: a lack of synchronous communication, or coordination, between concurrent operations. If operations can execute coordination-free, they can run concurrently, on any available resources, without communicating with or otherwise stalling concurrent operations. The cost of coordination is easily and simultaneously cast in the form of (minority) unavailability, latency (minimum 1 RTT), and throughput (maximum 1/RTT). Moreover, and more importantly, the concept of coordination-free execution is portable to a range of system architectures: whereas traditional formulations of availability are inherently tied to physical replication, coordination-freedom is a property of the execution strategy and is independent of physical deployment or topology. For example, a system providing clients with snapshot reads can effectively act as a coordination-free “replicated” system even if implemented by a set of linearizable multi-versioned masters [7]. Judicious use of strong semantics in correct application execution equates to coordination-avoidance [6]: the use of as little coordination as possible.

Bridging the Gap: Experiences Applying Coordination Avoidance

As F1’s authors highlight above, the decision to consider coordination-avoiding algorithms or not requires a cost-benefit judgment [8]: will performance, availability, or latency benefits outweigh the cost of ascertaining whether weak models are sufficient? In a sober assessment, many applications will likely be able to (over-)pay for “strong consistency”: single-site operations are inexpensive [16], while improvements in datacenter networks [19] lower the cost for non-geo-replicated systems. Yet, a large class of applications—for example, non-partitionable applications [9], applications with high mutation rates (i.e., write contention) [18], and geo-replicated applications [20]—will continue to be sensitive to extraneous coordination costs and will likely necessitate further study.

Identifying and serving this latter class of applications is paramount to ensuring the future adoption of coordination-avoiding algorithms. While represents a difficult task, we offer three examples from our recent research:

Overall, we have found success in i.) focusing on existing applications and ii.) incorporating existing specifications to enable coordination-avoiding execution. Towards these goals, our ongoing research directly incorporates application-level invariants (derived from real-world application and the SQL language) for analysis under a necessary and sufficient property for coordination-free execution [6]. The use of application-level invariants is key to safely maximizing concurrency without requiring programmer expertise in weak isolation models. By focusing on existing invariants and specifications as above, we can provide tangible improvements without necessarily affecting end-user experience.

A Coordinated Future

The continued success of weakly consistent systems requires a demonstration of and focus on utility. Delivering an understanding of exactly why these weakly consistent semantics can provide (in a fundamental sense) greater availability, lower latency, and higher throughput is paramount; our proposed focus on coordination avoidance is our attempt at providing a unified answer. By applying the lens of coordination avoidance to a range of existing, well-defined and ideally high-value domains, we have the opportunity to demonstrate exactly when weak consistency is adequate and, equally importantly, when it is not. Without a full specification, language techniques [3, 4] and library-based optimizations [14] are helpful to programmers. However, with a full specification and increased knowledge of application semantics [6, 7], we can fully realize the benefits of coordination avoidance while further mitigating programmer burden. While coordination cannot always be avoided, we are bullish on a continued ability to effectively manage it.


[1] D. J. Abadi. Consistency tradeoffs in modern distributed database system design: CAP is only part of the story. IEEE Computer, 45(2):37–42, 2012.

[2] P. Alvaro, P. Bailis, N. Conway, and J. M. Hellerstein. Consistency without borders. In SoCC 2013.

[3] P. Alvaro, N. Conway, J. M. Hellerstein, and D. Maier. Blazes: Coordination analysis for distributed programs. In ICDE 2014.

[4] P. Alvaro, N. Conway, J. M. Hellerstein, and W. Marczak. Consistency analysis in Bloom: a CALM and collected approach. In CIDR 2011.

[5] P. Bailis, A. Davidson, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Highly Available Transactions: Virtues and Limitations. In VLDB 2014.

[6] P. Bailis, A. Fekete, M. J. Franklin, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Coordination-Avoiding Database Systems. arXiv:1402.2237, 2014.

[7] P. Bailis, A. Fekete, A. Ghodsi, J. M. Hellerstein, and I. Stoica. Scalable Atomic Visibility with RAMP Transactions. In SIGMOD 2014.

[8] P. Bailis and A. Ghodsi. Eventual Consistency Today: Limitations, extensions, and beyond. ACM Queue, 11(3), 2013.

[9] N. Bronson et al. Tao: Facebook’s distributed data store for the social graph. In USENIX ATC 2013.

[10] J. C. Corbett et al. Spanner: Google’s globally-distributed database. In OSDI 2012.

[11] S. Gilbert and N. Lynch. Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services. ACM SIGACT News, 33(2):51–59, 2002.

[12] J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, 1993.

[13] P. L. Lehman and S. B. Yao. Efficient locking for concurrent operations on B-trees. ACM TODS, 6(4):650–670, 1981.

[14] M. Shapiro, N. Preguiça, C. Baquero, and M. Zawirski. A comprehensive study of convergent and commutative replicated data types. Technical Report 7506, INRIA, 2011.

[15] J. Shute et al. F1: A distributed SQL database that scales. In VLDB 2013.

[16] M. Stonebraker. The case for shared nothing. IEEE Database Engineering Bulletin, 9(1):4–9, 1986.

[17] M. Stonebraker. Why enterprises are uninterested in NoSQL. ACM Queue Blog, September 2010.

[18] TPC Council. TPC-C Benchmark Revision 5.11. 2010.

[19] Y. Xu, Z. Musgrave, B. Noble, and M. Bailey. Bobtail: avoiding long tails in the cloud. In NSDI 2013.

[20] M. Zawirski, A. Bieniusa, V. Balegas, S. Duarte, C. Baquero, M. Shapiro, and N. Preguiça. SwiftCloud: Fault-tolerant geo-replication integrated all the way to the client machine. arXiv:1310.3107, 2013.

You can follow me on Twitter here.