Consensus is one of the fundamental problems in distributed systems: how does a group of independent, unreliable nodes agree on a single value? This challenge arises in many systems (databases, distributed locks, replicated state machines) where consistency is critical even in the presence of failures. Paxos is a classic algorithm that solves this problem. In this first installment of a three-part series, I’ll walk you through the Paxos protocol, starting with its components, then illustrating how proposals are made and resolved.

What is Paxos?

According to Wikipedia, Paxos is a family of protocols for solving consensus in a network of unreliable or fallible processors.

Consensus means agreeing on one result among multiple participants. In practice, this becomes tricky because participants or their communication channels may fail, messages may arrive late or out of order, and some nodes may disappear and later rejoin.

Components of Paxos

At its core, Paxos has three essential actors:

- Processors (Acceptors): the nodes that store and respond to proposals.
- Network: the medium for exchanging messages, with no guarantees of speed or reliability.
- Proposers (Agents): the initiators of proposals, such as clients asking for a lock or value.

Let’s break them down.

Processors

- Operate at arbitrary speeds.
- May fail, but can rejoin after recovery.
- Do not collude or attempt to cheat.

Network

- Messages can be sent asynchronously between processors.
- Messages may be delayed, lost, duplicated, or reordered.
- Messages, however, are not corrupted.

Proposers

- Create and send proposals to the processors.
- Use the same network to broadcast proposals and collect responses.

Assumptions

For this walkthrough, we’ll assume a simple system with five processors (nodes 1–5) and two proposers (Alice and Bob). Paxos requires a majority for consensus: with 5 nodes, at least 3 must agree on a proposal. This comes from the formula 2F + 1: a system of 2F + 1 processors tolerates up to F failures, because the number of non-faulty nodes (at least F + 1) is then strictly greater than the number of faulty ones. With 5 nodes, F = 2.

The algorithm

I'll first illustrate the algorithm through some visuals, using the setup above: 5 processors (nodes 1 through 5) and two proposers (or agents), the classic Alice and Bob. Since a majority is required, 3 out of 5 nodes must accept a proposal for it to be chosen, so the system keeps working with up to 2 nodes offline.

Step 1 – Alice Proposes

Alice wants a lock (AliceLock).

- She chooses a unique proposal number, say 1001, and sends it to node 4.
- Node 4 broadcasts the proposal to the other active nodes.
- At this moment, nodes 2 and 3 are offline, leaving only nodes 1, 4, and 5 active.
- Since no previous proposal exists, nodes 1, 4, and 5 accept Alice’s proposal.
- Alice now has the majority approval (3 out of 5).
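
Alice's 3 out of 5 is the smallest possible majority for this cluster. The arithmetic behind that number can be sketched in a few lines of Python. The function names `majority` and `max_failures` are my own choices for illustration and do not come from any Paxos library.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n nodes."""
    return n // 2 + 1


def max_failures(n: int) -> int:
    """Largest F with 2F + 1 <= n: the crash failures an n-node cluster tolerates."""
    return (n - 1) // 2


# Print cluster size, required majority, and tolerated failures.
for n in (3, 5, 7):
    print(n, majority(n), max_failures(n))
```

For the 5-node cluster in this walkthrough, `majority(5)` is 3 and `max_failures(5)` is 2. Because 3 + 3 is greater than 5, any two majorities share at least one node, which is why a later proposer always hears about an earlier accepted value.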

Step 1 is shown in the figure below:

Step 2 – Commit Phase

Once Alice receives enough accept responses, she can commit the lock. This means the system agrees that AliceLock is the chosen value. Even if node 2 comes back online later, it must respect this decision.

Step 3 – Bob’s First Attempt

Bob now wants the same lock. By this time, node 3 has come back online and can participate. Bob generates a different proposal number, 2001, with value BobLock and sends it to node 5.

Node 5 (and nodes 1 and 4 as well) has already accepted n=1001, AliceLock. These nodes respond to Bob’s prepare request with a promise not to accept anything numbered below 2001, but they also reveal their prior accepted value (AliceLock).

Node 3 has seen no earlier proposal and made no higher-numbered promise, so on its own it is willing to support BobLock. The responses from the majority, however, notify Bob that Alice currently has the lock.

In Paxos, once a value is committed, any future proposer must adopt it even if their proposal number is higher.

Step 4 – Bob Re-Proposes Alice’s Value

Since Bob now knows about an already-chosen value, he must adopt it. He issues a new proposal with an even higher number, say 2002, but this time with value AliceLock.

All active nodes accept this, since it is consistent with the prior consensus.
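
Steps 1 through 4 can be condensed into a small single-decree Paxos simulation. This is a minimal sketch under my own assumptions, not a production implementation: the `Acceptor` class and the `propose` function are hypothetical names, messages are plain method calls that are never lost, and Bob adopts AliceLock inside his round 2001 instead of issuing a separate proposal 2002 as in the walkthrough.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Proposal = Tuple[int, str]  # (proposal number, value)


@dataclass
class Acceptor:
    """One processor: tracks its highest promise and its last accepted proposal."""
    promised: int = 0
    accepted: Optional[Proposal] = None
    online: bool = True

    def prepare(self, n: int) -> Tuple[bool, Optional[Proposal]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted  # reveal any prior accepted proposal
        return False, None

    def accept(self, n: int, value: str) -> bool:
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one round; return the chosen value, or None if no majority was reached."""
    live = [a for a in acceptors if a.online]
    quorum = len(acceptors) // 2 + 1

    # Phase 1: gather promises from the reachable acceptors.
    replies = [a.prepare(n) for a in live]
    if sum(1 for ok, _ in replies if ok) < quorum:
        return None

    # Adopt the value of the highest-numbered proposal already accepted, if any.
    prior = [acc for ok, acc in replies if ok and acc is not None]
    if prior:
        value = max(prior, key=lambda p: p[0])[1]

    # Phase 2: ask the same acceptors to accept (n, value).
    accepts = sum(1 for a in live if a.accept(n, value))
    return value if accepts >= quorum else None


# Replay the walkthrough: nodes 2 and 3 start offline, node 3 rejoins before Bob.
nodes = {i: Acceptor() for i in range(1, 6)}
nodes[2].online = nodes[3].online = False
alice_result = propose(list(nodes.values()), 1001, "AliceLock")
nodes[3].online = True
bob_result = propose(list(nodes.values()), 2001, "BobLock")
print(alice_result, bob_result)
```

In this sketch both rounds end with AliceLock: nodes 1, 4, and 5 report their accepted proposal (1001, AliceLock) during Bob's prepare phase, so Bob's own value is replaced before the accept phase begins, and node 3 ends up holding (2001, AliceLock).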

Even though node 3 briefly appeared to support BobLock, the system converges safely: AliceLock is the final agreed-upon lock, and Bob’s proposal has reinforced the decision rather than conflicting with it.

Conclusion

Paxos may look intricate at first glance, but its power lies in a simple principle: once a value is chosen, it can never be undone. By separating the protocol into a prepare phase and an accept phase, Paxos ensures that even if multiple proposers compete, or if nodes fail and later recover, the system always converges on a single consistent decision.

In our walkthrough, Alice’s proposal (AliceLock) became the agreed value. Even when Bob joined later with a higher-numbered proposal and a different value (BobLock), Paxos forced him to learn about the prior consensus and carry it forward. This guarantees safety (no conflicting decisions), while still allowing liveness (progress can continue as new proposals are made).

That’s why Paxos remains a cornerstone in distributed systems: it provides the foundation for reliable coordination across unreliable environments. From database replication to distributed locks and modern consensus protocols like Raft, the lessons of Paxos continue to shape how we build fault-tolerant systems today.

Key Takeaways

- Consensus requires a majority. With 5 nodes, at least 3 must agree for a value to be chosen.
- Once chosen, always chosen. Paxos guarantees that a committed value (like AliceLock) cannot be replaced by a new one.
- Prepare vs Accept matters. A proposer must adopt the value of the highest-numbered previously accepted proposal it learns about during the prepare phase.
- Fault tolerance is built in. Nodes can fail, recover, or rejoin (like node 3) without breaking consensus, as long as a majority is available.

In part 2, I'll cover more edge cases and how Paxos deals with them.