Raft Distributed Consensus
State machine replication
Goal: Create a fault-tolerant distributed algorithm that enables a set of processes to agree on a sequence of events.
Distributed Consensus
- Distributed = Many nodes
- Consensus = Agreement on something
Why do we need consensus?
- Consensus, or distributed agreement, is a recurring problem in distributed systems design.
- It is useful for things such as:
  - mutual exclusion, where all processes agree on who has exclusive access to a resource, and
  - leader election, where a group of processes has to decide which of them is in charge.
- Perhaps most importantly, consensus plays a pivotal role in building replicated state machines.
Ref: https://pk.org/classes/417/notes/raft.html
Raft
Consensus Goal
Implementation
States
RPCs
Terms
Leader Election
Server State transitions
Heartbeat and Timeout
Log Replication
Possible logs of followers
- The leader decides when it is safe to apply a log entry to the state machines; such an entry is called committed.
- Raft guarantees that committed entries are durable and will eventually be executed by all of the available state machines.
- A log entry is committed once the leader that created the entry has replicated it on a majority of the servers (e.g., entry 7 in Figure 6).
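The majority rule above can be sketched as a small function. This is a minimal illustration under stated assumptions, not Raft's full commit logic. The names `advance_commit_index`, `match_index`, and `log_terms` are hypothetical. `match_index` is assumed to hold the highest log index stored on each server (the leader included), and log indices are assumed to start at 1. The current-term check reflects the condition that the entry was created by the leader doing the counting.

```python
from typing import List


def advance_commit_index(match_index: List[int], log_terms: List[int],
                         current_term: int, commit_index: int) -> int:
    """Return the leader's new commit index (hypothetical helper).

    match_index[i] is the highest log index stored on server i, leader included.
    log_terms[k - 1] is the term of the entry at 1-based log index k.
    """
    majority = len(match_index) // 2 + 1
    # After sorting in descending order, the value at position majority - 1
    # is the highest index that at least `majority` servers have stored.
    candidate = sorted(match_index, reverse=True)[majority - 1]
    if candidate <= commit_index:
        return commit_index  # nothing new is replicated on a majority
    if log_terms[candidate - 1] != current_term:
        return commit_index  # entry was not created by the current leader
    return candidate
```

For example, with five servers whose stored indices are `[7, 7, 7, 5, 4]` and a leader in term 3 whose entry 7 also has term 3, the function moves the commit index to 7, because three of the five servers hold that entry.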
How does a server learn about a newly elected leader?
- The new leader sends out AppendEntries heartbeats carrying its new, higher term number.
- Only the leader sends AppendEntries.
- There is at most one leader per term.
- So if a server sees AppendEntries with term T, it knows who the leader for term T is.
- The heartbeats suppress any new election by resetting each follower's election timer.
- The leader must therefore send heartbeats more often than the election timeout.
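The points above can be sketched from the receiving server's side. This is a minimal, assumption-laden illustration: the `Server` class, its field names, and the 0.3-second timeout are hypothetical, time is passed in explicitly as a number of seconds, and log entries are omitted so that AppendEntries acts purely as a heartbeat.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    current_term: int = 0
    role: str = "follower"            # "follower", "candidate", or "leader"
    leader_id: Optional[str] = None
    last_heartbeat: float = 0.0       # time the last valid heartbeat arrived
    election_timeout: float = 0.3     # assumed value, in seconds

    def on_append_entries(self, term: int, leader_id: str, now: float) -> bool:
        """Handle a heartbeat; return False if the sender's term is stale."""
        if term < self.current_term:
            return False              # sender is an out-of-date leader
        # Only a leader sends AppendEntries and each term has at most one
        # leader, so the sender is the leader for `term`.
        self.current_term = term
        self.role = "follower"
        self.leader_id = leader_id
        self.last_heartbeat = now     # resetting the timer suppresses elections
        return True

    def should_start_election(self, now: float) -> bool:
        """True once no valid heartbeat has arrived within the timeout."""
        return (self.role != "leader"
                and now - self.last_heartbeat >= self.election_timeout)
```

In this sketch a candidate in term 2 that receives a heartbeat with term 3 steps down and records the sender as leader. It only considers a new election once heartbeats stop arriving for a full timeout, which is why the leader's heartbeat interval has to be shorter than the election timeout.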
Timeline
Invariants
Log compaction and Snapshot
- Raft’s log grows during normal operation to incorporate more client requests. As the log grows longer, it occupies more space and takes more time to replay.
- In snapshotting, the entire current system state is written to a snapshot on stable storage, then the entire log up to that point is discarded.
- Snapshotting is used in Chubby and ZooKeeper.
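The snapshotting step described above can be sketched as follows. This is an in-memory illustration only: nothing is written to stable storage, and `Snapshot`, `take_snapshot`, and `first_index` are hypothetical names. Recording the index and term of the last discarded entry is an assumption added here so that the remaining log can still be lined up against the snapshot.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Entry = Tuple[int, str]  # (term, command)


@dataclass
class Snapshot:
    last_included_index: int   # index of the last entry the snapshot covers
    last_included_term: int    # term of that entry
    state: Dict[str, int]      # copy of the state machine at that point


def take_snapshot(log: List[Entry], first_index: int, last_applied: int,
                  state: Dict[str, int]) -> Tuple[Snapshot, List[Entry], int]:
    """Snapshot `state` and drop every log entry up to `last_applied`.

    `first_index` is the log index of log[0]; `state` is assumed to reflect
    all entries up to and including `last_applied`.
    Returns (snapshot, remaining_log, new_first_index).
    """
    cut = last_applied - first_index + 1   # number of entries to discard
    if cut <= 0 or cut > len(log):
        raise ValueError("last_applied is outside the stored log")
    snapshot = Snapshot(last_applied, log[cut - 1][0], dict(state))
    return snapshot, log[cut:], last_applied + 1
```

Given a four-entry log starting at index 1 and a state machine that has applied the first three entries, the call keeps only the fourth entry and reports 4 as the new first index, so replay after a restart starts from the snapshot rather than from the beginning of the log.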
Summary

source: https://www.hashicorp.com/resources/raft-consul-consensus-protocol-explained
Ref:
- https://pdos.csail.mit.edu/6.824/notes/l-raft.txt
- https://pdos.csail.mit.edu/6.824/notes/l-raft2.txt
- https://timilearning.com/posts/mit-6.824/lecture-6-7-fault-tolerance-raft/