Understanding the Paxos Consensus Algorithm - Part I: How Distributed Systems Reach Consensus

Consensus is one of the fundamental problems in distributed systems: how do a group of independent, unreliable nodes agree on a single value? This challenge arises in many systems—databases, distributed locks, replicated state machines—where consistency is critical even in the presence of failures.

Paxos is a classic algorithm that solves this problem. In this first installment of a three-part series, I’ll walk you through the Paxos protocol, starting with its components, then illustrating how proposals are made and resolved.

What is Paxos?

According to Wikipedia, Paxos is a family of protocols for solving consensus in a network of unreliable or fallible processors.

Consensus means agreeing on one result among multiple participants. In practice, this becomes tricky because participants or their communication channels may fail, messages may arrive late or out of order, and some nodes may disappear and later rejoin.

Components of Paxos

At its core, Paxos has three essential actors:

Processors (Acceptors) – The nodes that store and respond to proposals.
Network – The medium for exchanging messages, with no guarantees of speed or reliability.
Proposers (Agents) – The initiators of proposals, such as clients asking for a lock or value.

Let’s break them down.

Processors

Operate at arbitrary speeds.
May fail, but can rejoin after recovery.
Do not collude or attempt to cheat.

Network

Messages can be sent asynchronously between processors.
Messages may be delayed, lost, duplicated, or reordered.
Messages, however, are not corrupted.

Proposers

Create and send proposals to the processors.
Use the same network to broadcast proposals and collect responses.

Assumptions

For this walkthrough, we’ll assume a simple system with five processors (nodes 1–5) and two proposers (Alice and Bob). Paxos requires a majority for consensus: with 5 nodes, at least 3 must agree on a proposal. This comes from the formula 2F + 1, meaning the number of non-faulty nodes must exceed twice the number of faulty ones.

The algorithm

I'll first illustrate the algorithm through some visuals:

Let's say our system has 5 processors, nodes 1 through 5. We also have two proposers (or agents) the classic Alice and Bob. Since we have 5 nodes for consensus, we need to have 3 out of 5 nodes to accept the agent proposals, as the Paxos algorithm allows tolerance levels of 2F + 1 processors in the system, which, in other words, means the number of non-faulty processes must be strictly greater than twice the number of faulty processes.

Step 1 – Alice Proposes

Alice wants a lock (AliceLock).

She chooses a unique proposal number, say 1001, and sends it to node 4.
Node 4 broadcasts the proposal to the other active nodes.
At this moment, nodes 2 and 3 are offline, leaving only nodes 1, 4, and 5 active.
Since no previous proposal exists, nodes 1, 4, and 5 accept Alice’s proposal.
Alice now has the majority approval (3 out of 5).

This is shown in the figure below:

Step 2 – Commit Phase

Once Alice receives enough accept responses, she can commit the lock.

This means the system agrees that AliceLock is the chosen value.

Even if node 2 comes back online later, it must respect this decision.

Step 3 – Bob’s First Attempt

Bob now wants the same lock. By this time, node 3 has come back online and can participate. He generates a different proposal number 2001 with value BobLock and sends it to node 5.

Node 5 (and nodes 1 and 4 as well) already accepted n=1001, AliceLock. They respond to Bob’s prepared request with a promise not to accept anything below 2001, but they also reveal their prior accepted value (AliceLock).

Node 3 accepts BobLock because they see no higher-numbered prior promise, it is accepted; however, by majority, Bob is notified that Alice currently has the lock.

In Paxos, once a value is committed, any future proposer must adopt it even if their proposal number is higher.

Step 4 – Bob Re-Proposes Alice’s Value

Since Bob now knows about an already-chosen value, he must adopt it.

He issues a new proposal with an even higher number, say 2002, but this time with value AliceLock.

All nodes accept this, since it is consistent with the prior consensus.

Thus, even though node 3 briefly appeared to support BobLock, the system converges safely:

AliceLock is the final agreed-upon lock, reaffirmed by Bob’s proposal.

Thus, the system converges: AliceLock is the agreed-upon lock, and Bob has reinforced the decision rather than conflicting with it.

Conclusion

Paxos may look intricate at first glance, but its power lies in a simple principle: once a value is chosen, it can never be undone. By separating the protocol into a prepare and accept phase, Paxos ensures that even if multiple proposers compete, or if nodes fail and later recover, the system always converges on a single consistent decision.

In our walkthrough, Alice’s proposal (AliceLock) became the agreed value. Even when Bob joined later with a higher-numbered proposal and a different value (BobLock), Paxos forced him to learn about the prior consensus and carry it forward. This guarantees safety (no conflicting decisions), while still allowing liveness (progress can continue as new proposals are made).

That’s why Paxos remains a cornerstone in distributed systems: it provides the foundation for reliable coordination across unreliable environments. From database replication to distributed locks and modern consensus protocols like Raft, the lessons of Paxos continue to shape how we build fault-tolerant systems today.

Key Takeaways

Consensus requires a majority. With 5 nodes, at least 3 must agree for a value to be chosen.
Once chosen, always chosen. Paxos guarantees that a committed value (like AliceLock) cannot be replaced by a new one.
Prepare vs Accept matters. A proposer must adopt the highest previously accepted value it learns during the prepare phase.
Fault tolerance is built in. Nodes can fail, recover, or rejoin (like node 3) without breaking consensus, as long as a majority is available.

In part 2, I'll cover more edge cases and how Paxos deals with them.