A Distributed Systems Reading List

Introduction.

I often argue that the toughest thing about distributed systems is changing the way you think. The below is a collection of material I've found useful for motivating these changes.

Thought Provokers

Ramblings that make you think about the way you design. Not everything can be solved with big servers, databases and transactions.

  • Harvest, Yield and Scalable Tolerant Systems - Real world applications of CAP from Brewer et al
  • On Designing and Deploying Internet Scale Services - James Hamilton
  • The Perils of Good Abstractions - Building the perfect API/interface is difficult
  • Chaotic Perspectives - Large scale systems are everything developers dislike - unpredictable, unordered and parallel
  • Data on the Outside versus Data on the Inside - Pat Helland
  • Memories, Guesses and Apologies - Pat Helland
  • SOA and Newton's Universe - Pat Helland
  • Building on Quicksand - Pat Helland
  • Why Distributed Computing? - Jim Waldo
  • A Note on Distributed Computing - Waldo, Wollrath et al
  • Stevey's Google Platforms Rant - Yegge's SOA platform experience
  • Latency Exists, Cope! - Commentary on coping with latency and it's architectural impacts
  • Latency - the new web performance bottleneck - not at all new (see Patterson ), but noteworthy
  • The Tail At Scale - the latencychallenges inherent of dealing with latency in large scale systems

Somewhat about the technology but more interesting is the culture and organization they've created to work with it.

  • A Conversation with Werner Vogels - Coverage of Amazon's transition to a service-based architecture
  • Discipline and Focus - Additional coverage of Amazon's transition to a service-based architecture
  • Vogels on Scalability
  • SOA creates order out of chaos @ Amazon

Current "rocket science" in distributed systems.

  • Chubby Lock Manager
  • Google File System
  • Data Management for Internet-Scale Single-Sign-On
  • Dremel: Interactive Analysis of Web-Scale Datasets
  • Large-scale Incremental Processing Using Distributed Transactions and Notifications
  • Megastore: Providing Scalable, Highly Available Storage for Interactive Services - Smart design for low latency Paxos implementation across datacentres.
  • Spanner - Google's scalable, multi-version, globally-distributed, and synchronously-replicated database.
  • Photon - Fault-tolerant and Scalable Joining of Continuous Data Streams. Joins are tough especially with time-skew, high availability and distribution.
  • Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing - Data warehousing system that stores critical measurement data related to Google's Internet advertising business.

Consistency Models

Key to building systems that suit their environments is finding the right tradeoff between consistency and availability.

  • CAP Conjecture - Consistency, Availability, Parition Tolerance cannot all be satisfied at once
  • Consistency, Availability, and Convergence - Proves the upper bound for consistency possible in a typical system
  • CAP Twelve Years Later: How the "Rules" Have Changed - Eric Brewer expands on the original tradeoff description
  • Consistency and Availability - Vogels
  • Eventual Consistency - Vogels
  • Avoiding Two-Phase Commit - Two phase commit avoidance approaches
  • 2PC or not 2PC, Wherefore Art Thou XA? - Two phase commit isn't a silver bullet
  • Life Beyond Distributed Transactions - Helland
  • If you have too much data, then 'good enough' is good enough - NoSQL, Future of data theory - Pat Helland
  • Starbucks doesn't do two phase commit - Asynchronous mechanisms at work
  • You Can't Sacrifice Partition Tolerance - Additional CAP commentary
  • Optimistic Replication - Relaxed consistency approaches for data replication

Papers that describe various important elements of distributed systems design.

  • Distributed Computing Economics - Jim Gray
  • Rules of Thumb in Data Engineering - Jim Gray and Prashant Shenoy
  • Fallacies of Distributed Computing - Peter Deutsch
  • Impossibility of distributed consensus with one faulty process - also known as FLP [access requires account and/or payment, a free version can be found here ]
  • Unreliable Failure Detectors for Reliable Distributed Systems. A method for handling the challenges of FLP
  • Lamport Clocks - How do you establish a global view of time when each computer's clock is independent
  • The Byzantine Generals Problem
  • Lazy Replication: Exploiting the Semantics of Distributed Services
  • Scalable Agreement - Towards Ordering as a Service
  • Scalable Eventually Consistent Counters over Unreliable Networks - Scalable counting is tough in an unreliable world

Languages and Tools

Issues of distributed systems construction with specific technologies.

  • Programming Distributed Erlang Applications: Pitfalls and Recipes - Building reliable distributed applications isn't as simple as merely choosing Erlang and OTP.

Infrastructure

  • Principles of Robust Timing over the Internet - Managing clocks is essential for even basics such as debugging
  • Consistent Hashing and Random Trees
  • Amazon's Dynamo Storage Service

Paxos Consensus

Understanding this algorithm is the challenge. I would suggest reading "Paxos Made Simple" before the other papers and again afterward.

  • The Part-Time Parliament - Leslie Lamport
  • Paxos Made Simple - Leslie Lamport
  • Paxos Made Live - An Engineering Perspective - Chandra et al
  • Revisiting the Paxos Algorithm - Lynch et al
  • How to build a highly available system with consensus - Butler Lampson
  • Reconfiguring a State Machine - Lamport et al - changing cluster membership
  • Implementing Fault-Tolerant Services Using the State Machine Approach: a Tutorial - Fred Schneider

Other Consensus Papers

  • Mencius: Building Efficient Replicated State Machines for WANs - consensus algorithm for wide-area network
  • In Search of an Understandable Consensus Algorithm - The extended version of the RAFT paper, an alternative to PAXOS.

Gossip Protocols (Epidemic Behaviours)

  • How robust are gossip-based communication protocols?
  • Astrolabe: A Robust and Scalable Technology For Distributed Systems Monitoring, Management, and Data Mining
  • Epidemic Computing at Cornell
  • Fighting Fire With Fire: Using Randomized Gossip To Combat Stochastic Scalability Limits
  • Bi-Modal Multicast
  • ACM SIGOPS Operating Systems Review - Gossip-based computer networking
  • SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol
  • Chord : A Scalable Peer-to-peer Lookup Protocol for Internet Applications
  • Kademlia : A Peer-to-peer Information System Based on the XOR Metric
  • Pastry : Scalable, decentralized object location and routing for large-scale peer-to-peer systems
  • PAST : A large-scale, persistent peer-to-peer storage utility - storage system atop Pastry
  • SCRIBE : A large-scale and decentralised application-level multicast infrastructure - wide area messaging atop Pastry

Distributed Systems and Parallel Computing

No matter how powerful individual computers become, there are still reasons to harness the power of multiple computational units, often spread across large geographic areas. Sometimes this is motivated by the need to collect data from widely dispersed locations (e.g., web pages from servers, or sensors for weather or traffic). Other times it is motivated by the need to perform enormous computations that simply cannot be done by a single CPU.

From our company’s beginning, Google has had to deal with both issues in our pursuit of organizing the world’s information and making it universally accessible and useful. We continue to face many exciting distributed systems and parallel computing challenges in areas such as concurrency control, fault tolerance, algorithmic efficiency, and communication. Some of our research involves answering fundamental theoretical questions, while other researchers and engineers are engaged in the construction of systems to operate at the largest possible scale, thanks to our hybrid research model .

Recent Publications

Some of our teams.

Algorithms & optimization

Graph mining

Network infrastructure

System performance

We're always looking for more talented, passionate people.

Careers

Navigation Menu

Search code, repositories, users, issues, pull requests..., provide feedback.

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly.

To see all available qualifiers, see our documentation .

  • Notifications You must be signed in to change notification settings

A curated list to learn about distributed systems

theanalyst/awesome-distributed-systems

Folders and files.

NameName
128 Commits

Repository files navigation

Awesome-distributed-systems.

A (hopefully) curated list on awesome material on distributed systems, inspired by other awesome frameworks like awesome-python . Most links will tend to be readings on architecture itself rather than code itself.

Read things here before you start.

  • CAP Theorem , Also plain english explanation
  • Fallacies of Distributed Computing , expect things to break, everything
  • Distributed systems theory for the distributed engineer , most of the papers/books in the blog might reappear in this list again. Still a good BFS approach to distributed systems.
  • FLP Impossibility Result (paper) , an easier blog post to follow along
  • An Introduction to Distributed Systems @aphyr's excellent introduction to distributed systems
  • Distributed Systems for fun and profit [Free]
  • Distributed Systems Principles and Paradigms, Andrew Tanenbaum [Free with registration]
  • Scalable Web Architecture and Distributed Systems [Free]
  • Principles of Distributed Systems [Free] [ETH Zurich University]
  • Making reliable distributed systems in the presence of software errors , [Free] Joe Amstrong's (Author of Erlang) PhD thesis
  • Designing Data Intensive Applications [Amazon Link]
  • Distributed Machine Learning Patterns, Yuan Tang , Practical patterns for scaling machine learning from your laptop to a distributed cluster
  • Distributed Computing, Hagit Attiya and Jennifer Welch
  • Distributed Algorithms, Nancy Lynch [Amazon Link]
  • Impossibility Results for Distributed Computing [paywall]
  • Designing Distributed Systems, Brendan Burns [Free with registration]
  • Distributed Systems: Concepts and Design, George Coulouris [Amazon Link]
  • Akka in Action, Second Edition
  • Systemantics: how systems work and especially how they fail
  • Think Distributed Systems [Free with subscription]

Must read papers on distributed systems. While nearly all of Lamport's work should feature here, just adding a few that must be read.

  • Times, Clocks and Ordering of Events in Distributed Systems Lamport's paper, the Quintessential distributed systems primer
  • Session Guarantees for Weakly Consistent Replicated Data a '94 paper that talks about various recommendations for session guarantees for eventually consistent systems, many of this would be standard vocabulary in reading other dist. sys papers, like monotonic reads, read your writes etc.

Storage & Databases

  • Dynamo: Amazon's Highly Available Key Value Store Paraphrasing @fogus from their blog , it is very rare for a paper describing an active production system to influence the state of active research in any industry; this is one of those seminal distributed systems paper that solves the problem of a highly available and fault tolerant database in an elegant way, later paving the way for systems like Cassandra, and many other AP systems using a consistent hashing.
  • Bigtable: A Distributed Storage System for Structured Data
  • The Google File System
  • Cassandra: A Decentralized Structured Storage System Inspired heavily by Dynamo, an now an open source
  • CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data , the algorithm for the basis of Ceph distributed storage system, for the architecture itself read RADOS

Messaging systems

  • The Log: What every software engineer should know about real-time data's unifying abstraction , a somewhat long read, but covers brilliantly on logs, which are at the heart of most distributed systems
  • Kafka: a Distributed Messaging System for Log Processing

Distributed Consensus and Fault-Tolerance

  • Practical Byzantine Fault Tolerance
  • The Byzantine Generals Problem
  • Impossibility of Distributed Consensus with One Faulty Process
  • The Part Time Parliament Paxos, Lamport's original Paxos paper, a bit difficult to understand, may require multiple passes
  • Paxos Made Simple , a more terse readable Paxos paper by Lamport himself. Shorter and more easier compared to the original.
  • The Chubby Lock Service for loosely coupled distributed systems Google's lock service used for loosely coupled distributed systems. Sort of Paxos as a Service for building other distributed systems. Primary inspiration behind other Service Discovery & Coordination tools like Zookeeper, etcd, Consul etc.
  • Paxos made live - An engineering perspective Google's learning while implementing systems atop of Paxos. Demonstrates various practical issues encountered while implementing a theoretical concept.
  • Raft Consensus Algorithm An alternative to Paxos for distributed consensus, that is much simpler to understand. Do checkout an interesting visualization of raft
  • Conflict-free Replicated Data Types presents an approach for Strong Eventual Consistency which as been applied in projects such as Riak , Redis and Akka . A great talk on the subject by Martin Kleppmann can be found here
  • Speculative algorithms for global state synchronizations Azos.Sky.Server.Locking uses probability based QOS (Quality of Service)/Trust measure to ensure probability-based consensus. The approach avoids distributed state machine/phase synchronization and is very simple to understand and implement

Testing, monitoring and tracing

While designing distributed systems are hard enough, testing them is even harder.

  • Dapper , Google's large scale distributed-systems tracing infrastructure, this was also the basis for the design of open source projects such as Zipkin , Apache SkyWalking , Pinpoint and HTrace .

Programming Models

  • Distributed Programming Model
  • PSync: a partially synchronous language for fault-tolerant distributed algorithms Video: Conference Video
  • Programming Models for Distributed Computing
  • Logic and Lattices for Distributed Programming

Verification of Distributed Systems

  • Curated list of resources on testing distributed systems includes links to materials on testing by various companies (Google, Amazon, Netflix, Microsoft, Dropbox, etc) and research papers.
  • Jepsen A framework for distributed systems verification, with fault injection @aphyr has featured enough times in this list already, but Jepsen and the blog posts that go with are a quintessntial addition to any distributed systems reading list.
  • Verdi A Framework for Implementing and Formally Verifying Distributed Systems Paper
  • Distributed Deep Dive interview series by Ably Relatime .
  • Distributed Systems in One Lesson Distributed Systems in One Lesson by Tim Berglund
  • Reliable Distributed Algorithms, Part 1 , KTH Sweden
  • Reliable Distributed Algorithms, Part 2 , KTH Sweden
  • Cloud Computing Concepts , University of Illinois
  • CMU: Distributed Systems in Go Programming Language
  • Software Defined Networking , Georgia Tech.
  • ETH Zurich: Distributed Systems
  • ETH Zurich: Distributed Systems Part 2 , covers Distributed control algorithms, communication models, fault-tolerance among other things. In particular fault tolerance issues (models, consensus, agreement) and replication issues (2PC,3PC, Paxos), which are critical in understanding distributed systems are explained in great detail.
  • Distributed Systems Course , A beginner course on distributed system by Chris Colohan, A google employee who contributed to SUIF, MapReduce, TCMalloc, Percolator, Caffeine, Borg, Omega, and Piper.
  • MIT 6.824 , Youtube-playlist MIT distributed system lectures, in each video they discuss papers like GFS, Zookeeper, RAFT, Spanner...
  • Distributed Systems , Lectures 9 to 16 of the Cambridge University lecture "Concurrent and Distributed Systems", given by Dr. Martin Kleppmann. Youtube-playlist . A computer science entrance course, covered basic models and algorithms in distributed systems, also discussed CRDT, collaboration software and google's spanner.

Blogs and other reading links

  • Amazon Builder's Library , a collection of Amazon's learnings on distributed systems
  • How we implemented consistent hashing efficiently
  • Notes on Distributed Systems for Young Bloods
  • High Scalability Several architectures of huge internet services, for eg twitter , whatsapp
  • There is No Now , Problems with simultaneity in distributed systems
  • Turing Lecture: The Computer Science of Concurrency: The Early Years , An article by Leslie Lamport on concurrency
  • The Paper Trail blog, a very readable blog covering various aspects of distributed systems
  • aphyr , Posts on jepsen series are pretty awesome
  • All Things Distributed - Wernel Vogel's (Amazon CTO) blog on distributed systems
  • Distributed Systems: Take Responsibility for Failover
  • The C10K problem
  • On Designing and Deploying Internet-Scale Services
  • Files are hard A blog post on filesystem consistency, pretty important to read if you are into distributed storage or databases.
  • Distributed Systems Testing: The Lost World Testing distributed systems are hard enough, a well researched blog post which again covers a lot of links to various approaches and other papers
  • SWIM Protocol explained A blog post on popular SWIM failure detector
  • ACM Symposium on Principles of Distributed Computing (PODC) and International Symposium on Distributed Computing (DISC) , a list of resources from PODC–DISC community including conference series, mailing lists, youtube, twitter, etc.
  • IEEE International Parallel & Distributed Processing Symposium (IPDPS) , an international forum for engineers and scientists to present their latest research findings.
  • Springer Distributed Computing Journal , a journal about theory, design, specification, and implementation of distributed systems.

Other lists like this one

  • Readings in distributed systems
  • Distributed Systems meta list
  • List of required readings for Distributed Systems Part of CMU's Engineering Distributed Systems course
  • The Distributed Reader
  • A Distributed Systems Reading List , A collection of material, mostly papers on Distributed Systems Theory as well as seminal industry papers
  • Distributed Systems Readings , A comprehensive list of online courses related to distributed systems
  • Awesome Distributed Consensus , Another list of materials on distributed consensus protocols
  • Beginner's Guide to Distributed Systems A blog post with some useful getting started links for distributed systems

Contributors 31

@theanalyst

In Search of an Understandable Consensus Algorithm (Raft)

Paxos Made Simple

ZooKeeper: Wait-free coordination for Internet-scale systems

Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore

Impossibility of Distributed Consensus With One Faulty Process

Consensus in the presence of partial synchrony

Viewstamped Replication Revisited

Replication

Don’t be lazy, be consistent: Postgres-R, a new way to implement Database Replication

PacificA: Replication in Log-Based Distributed Storage Systems

Chain Replication for Supporting High Throughput and Availability

Byzantine Chain Replication

A Comprehensive Study of Convergent and Commutative Replicated Data Types

Optimistic Replication

Causality/Transactions

Stronger Semantics for Low-Latency Geo-Replicated Storage (Eiger)

Calvin: Fast Distributed Transactions for Partitioned Database Systems

Sinfonia: a new paradigm for building scalable distributed systems

Understanding the Limitations of Causally and Totally Ordered Communication

A Response to Cheriton and Skeen’s Criticism of Causal and Totally Ordered Communication

MDCC: Multi-Datacenter Consistency

Spanner: Google’s globally distributed database

Concurrency

Transactional Memory: Architectural Support for Lock-Free Data Structures

Software Transactional Memory

Sharing Memory Robustly in Message-Passing Systems

Wait-free Synchronization

ZooKeeper’s atomic broadcast protocol: Theory and practice

Kafka (LinkedIn)

Omega: flexible, scalable schedulers for large compute clusters

Thialfi: A Client Notification Service for Internet-Scale Applications

Large-scale Incremental Processing Using Distributed Transactions and Notifications

Note: We haven’t included anything already covered in 6.824 , but you should read those papers too.

Paxos Made Live: An Engineering Perspective

Viewstamped Replication: A new primary copy method to support highly-available distributed systems

Time, Clocks, and the Ordering of Events in a Distributed System

The Part-Time Parliament

Paxos Made Practical

The papers from SOSP 2013

Advanced Distributed Systems

Research Seminar at Columbia University

  • --> Blog -->