Skip to main content
  1. Notes/

RAFT: Overview

RAFT guarantees consensus by ensuring all non-faulty servers apply the same operations in the same order.

RAFT Overview

State-based replication:

  • Instead of copying entire state snapshots frequently, RAFT applies incremental updates:
  • Each update is stored in the log as a (key, value) operation.
  • All servers apply updates in the same sequence → consistency.

Problem: Split Brain

If a network partition occurs, replicas can diverge (two groups think they have leaders).
Solution: Majority voting → only the group with a majority of servers can elect a leader. That’s why RAFT typically uses an odd number of servers (3, 5, 7).

Client interactions:

  • Clients send operations (e.g., PUT/GET) to the leader.
  • The leader appends the operation to its log, replicates it, and commits when a majority acknowledges.
  • Operations must be idempotent (safe to reapply without changing meaning), since retries may occur.

Logs:

  • Store both tentative (uncommitted) and committed operations.
  • Persistent on disk so they survive crashes.
  • May temporarily differ between servers, but RAFT ensures they eventually converge.
    <p><strong>Example:</strong></p>
TermOperationLast Applied Value
1P10
2Q10
10L10

(latest message in log is at the RHS).

Terms:

  • Time is divided into terms, numbered consecutively.
  • Each term begins with an election.
  • At most one leader per term.
  • If no leader is elected, a new election (new term) begins.
  • Different from Zookeeper, where leaders send heartbeats constantly to prevent elections.

RAFT Guarantees:

  • Election Safety → At most one leader elected per term.
  • Leader Append Only → Leaders never delete/overwrite logs; they only append entries.
  • Log Matching → If two logs agree at an index, all prior entries must be identical.
  • Leader Completeness → If a log entry is committed, it will appear in future leaders’ logs.
  • State Machine Safety → A given index will never be assigned different commands on different servers.

Snapshots:

  • Logs can grow indefinitely → snapshots periodically save the full state at a point in time.
  • Old logs before snapshot can be discarded.
  • Followers can create snapshots independently.
  • Helps lagging followers catch up more efficiently (instead of replaying long logs).
  • Leaders can send the latest snapshot to very out-of-date followers.

RAFT: Implementation

RAFT Lifecycle

RAFT is a consensus algorithm designed to ensure fault tolerance and strong consistency in distributed systems. Because RAFT is state-based, each server maintains variables that define its state. Some of these must survive crashes (persistent state), while others are only needed while the server is running (volatile state).

Persistent state (stored on stable storage, updated before responding):

  • currentTerm → the latest term the server knows about (starts at 0).
  • votedFor → candidate ID that this server voted for in the current term (or null).
  • log[] → sequence of log entries, each storing a command for the state machine and the term when it was received.

Volatile state (on all servers):

  • commitIndex → index of highest log entry known to be committed (safely replicated on a majority).
  • lastApplied → index of highest log entry actually applied to the state machine (important if snapshots exist).

Volatile state (on leaders only):

  • nextIndex[] → for each follower, index of the next log entry the leader will try to send.
  • matchIndex[] → for each follower, index of the highest log entry known to be replicated there.

Elections & Timeouts

Each server starts as a follower. If it doesn’t hear from a leader within a randomized timeout, it becomes a candidate, increments its term, and starts an election. Randomization avoids “split votes,” where multiple candidates tie.

Remote Procedure Calls (RPCs):

RAFT uses only two RPC types for communication:

  • RequestVote RPC → used during elections.
  • AppendEntries RPC → used by leaders to replicate log entries and send heartbeats.

RequestVote RPC:

Candidate asks other servers for votes by sending (term, candidateID, lastLogIndex, lastLogTerm).

Each server checks:

  • Reject if candidate’s term < currentTerm.
  • Grant vote if they haven’t voted yet this term, and candidate’s log is at least as up-to-date as their own.

Candidate becomes leader if it receives a majority of votes. Then it begins sending heartbeats to establish authority.

AppendEntries RPC:

Sent by the leader to replicate logs or as heartbeat messages.

Parameters: (term, leaderID, prevLogIndex, prevLogTerm, entries[], leaderCommit).

Each follower checks:

  • Reject if term < currentTerm (outdated leader).
  • Reject if follower’s log doesn’t contain an entry at prevLogIndex with term = prevLogTerm.
  • Otherwise, append new entries (overwriting conflicting ones).
  • If leader’s commit index > follower’s, update follower’s commit index.

This ensures log consistency across servers.

Computer Vision

Overview of Computer Vision

Overview of Computer Vision

Core concepts in computer vision and machine learning

cv ml
History of Computer Vision

History of Computer Vision

How computer vision evolved through feature spaces

cv
ImageNet Large Scale Visual Recognition Challenge

ImageNet Large Scale Visual Recognition Challenge

ImageNet's impact on modern computer vision

cv ml
Region-CNNs

Region-CNNs

Traditional ML vs modern computer vision approaches

ml cv

Distributed Systems

Overview of Distributed Systems

Overview of Distributed Systems

Fundamentals of distributed systems and the OSI model

distributed-systems
Distributed Systems Architectures

Distributed Systems Architectures

Common design patterns for distributed systems

distributed-systems
Dependability & Relevant Concepts

Dependability & Relevant Concepts

Reliability and fault tolerance in distributed systems

distributed-systems
Marshalling

Marshalling

How data gets serialized for network communication

distributed-systems
RAFT

RAFT

Understanding the RAFT consensus algorithm

distributed-systems
Remote Procedural Calls

Remote Procedural Calls

How RPC enables communication between processes

distributed-systems
Servers

Servers

Server design and RAFT implementation

distributed-systems
Sockets

Sockets

Network programming with UDP sockets

distributed-systems

Machine Learning (Generally Neural Networks)

Anatomy of Neural Networks

Anatomy of Neural Networks

Traditional ML vs modern computer vision approaches

ml cv
LeNet Architecture

LeNet Architecture

The LeNet neural network

ml cv
Principal Component Analysis

Principal Component Analysis

Explaining PCA from classical and ANN perspectives

data ml

Cryptography & Secure Digital Systems

Symmetric Cryptography

Symmetric Cryptography

covers MAC, secret key systems, and symmetric ciphers

cryptography
Hash Functions

Hash Functions

Hash function uses in cryptographic schemes (no keys)

cryptography
Public-Key Encryption

Public-Key Encryption

RSA, ECC, and ElGamal encryption schemes

cryptography
Digital Signatures & Authentication

Digital Signatures & Authentication

Public-key authentication protocols, RSA signatures, and mutual authentication

cryptography
Number Theory

Number Theory

Number theory in cypto - Euclidean algorithm, number factorization, modulo operations

cryptography
IPSec Types & Properties

IPSec Types & Properties

Authentication Header (AH), ESP, Transport vs Tunnel modes

cryptography