- Read
- Top Stories
- Write
Writer Brand Writing Prompts - Full List of Prompts
- Decentralized Internet Contest
- Startups Interview
- Gaming Metaverse Contest
- All about you
- Do you own a good company?
- Get feral when you answer to the greatest interview in history
- Share your philosophy
- Which tech trends excite you?
- Do you travel?
- Describe your life hacks
- Describe your DIY project
- What do you wish existed?

- Learn
- STARTUPS
- Tech Giants
- About
Company People Software by HackerNoon - Help
- Sponsor

How to Achieve 99% Fault Tolerant Consensus by@Vitalik

How to achieve consensus with 50% fault tolerance in a synchronous network. If you want to relax the synchrony assumption, the maximum achievable fault tolerance drops to 33%. But if you add even more assumptions, you can increase fault tolerance all the way to 99%. This has in fact been known for a long time; Leslie Lamport's famous 1982 paper "The Byzantine Generals Problem" contains a description of the algorithm. At this point, there is a guarantee that honest nodes have all "validly seen" the same set of values.

We've heard for a long time that it's possible to achieve consensus with 50% fault tolerance in a synchronous network where messages broadcasted by any honest node are guaranteed to be received by all other honest nodes within some known time period.

If an attacker has *more* than 50%, they can perform a "51% attack", and there's an analogue of this for any algorithm of this type.

We've also heard for a long time that if you want to relax the synchrony assumption, and have an algorithm that's "safe under asynchrony", the maximum achievable fault tolerance drops to 33% (PBFT, Casper FFG, etc all fall into this category).

But did you know that if you add *even more* assumptions (specifically, you require *observers*, ie. users that are not actively participating in the consensus but care about its output, to also be actively watching the consensus, and not just downloading its output after the fact), you can increase fault tolerance all the way to 99%?

This has in fact been known for a long time; Leslie Lamport's famous 1982 paper "The Byzantine Generals Problem" (link here) contains a description of the algorithm. The following will be my attempt to describe and reformulate the algorithm in a simplified form.

Suppose that there are ** N** consensus-participating nodes, and everyone agrees who these nodes are ahead of time (depending on context, they could have been selected by a trusted party or, if stronger decentralization is desired, by some proof of work or proof of stake scheme).

We label these nodes * 0 . . . N -1*. Suppose also that there is a known bound

All nodes wait * (N - 1) ∙ D* seconds, running the following process. Define

If a validator * i* receives some message

At time * T + (N - 1) ∙ D*, nodes stop listening. At this point, there is a guarantee that honest nodes have all "validly seen" the same set of values.

If the problem demands choosing one value, they can use some "choice" function to pick a single value out of the values they have seen (eg. they take the one with the lowest hash). The nodes can then agree on this value.

Now, let's explore why this works. What we need to prove is that if one honest node has seen a particular value (validly), then every other honest node has also seen that value (and if we prove this, then we know that all honest nodes have seen the same set of values, and so if all honest nodes are running the same choice function, they will choose the same value).

Suppose that any honest node receives a message * v : i[1] : . . . i[k]* that they perceive to be valid (ie. it arrives before time

- In the first case (say
for this message), we know that the honest node**x = i[j]**had already broadcasted that message, and they did so in response to a message with**x**signatures that they received before time**j - 1**, so they broadcast their message at that time, and so the message must have been received by all honest nodes before time**T + (j - 1) ∙ D**.**T + j ∙ D** - In the second case, since the honest node sees the message before time
, then they will broadcast the message with their signature and guarantee that everyone, including**T + k ∙ D**, will see it before time**x**.**T + (k + 1) ∙ D**

Notice that the algorithm uses the act of adding one's own signature as a kind of "bump" on the timeout of a message. It's this ability that guarantees that if one honest node saw a message on time, they can ensure that everyone else sees the message on time as well, as the definition of "on time" increments by more than network latency with every added signature.

In the case where one node is honest, can we guarantee that passive *observers* (ie. non-consensus-participating nodes that care about knowing the outcome) can also see the outcome, even if we require them to be watching the process the whole time?

With the scheme as written, there's a problem. Suppose that a commander and some subset of * k* (malicious) validators produce a message

But we can plug this hole. We require * D* to be a bound on

Now, suppose an observer sees a message an accepts it. They will be able to broadcast it to an honest node before time * T + k ∙ D*, and the honest node will issue the message with their signature attached, which will reach all other observers before time

The above could theoretically be used as a standalone consensus algorithm, and could even be used to run a proof-of-stake blockchain.

The validator set of round * N + 1* of the consensus could itself be decided during round

The main additional ingredient that would need to be added is a mechanism for deciding who is allowed to propose blocks (eg. each round could have one designated proposer). It could also be modified to be usable as a proof-of-work blockchain, by allowing consensus-participating nodes to "declare themselves" in real time by publishing a proof of work solution on top of their public key at the same time as signing a message with it.

However, the synchrony assumption is very strong, and so we would like to be able to work without it in the case where we don't need more than 33% or 50% fault tolerance. There is a way to accomplish this.

Suppose that we have some other consensus algorithm (eg. PBFT, Casper FFG, chain-based PoS) whose output *can* be seen by occasionally-online observers (we'll call this the *threshold-dependent* consensus algorithm, as opposed to the algorithm above, which we'll call the *latency-dependent* consensus algorithm).

Suppose that the threshold-dependent consensus algorithm runs continuously, in a mode where it is constantly "finalizing" new blocks onto a chain (ie. each finalized value points to some previous finalized value as a "parent"; if there's a sequence of pointers * A → . . . → B*, we'll call

We can retrofit the latency-dependent algorithm onto this structure, giving always-online observers access to a kind of "strong finality" on checkpoints, with fault tolerance ~95% (you can push this arbitrarily close to 100% by adding more validators and requiring the process to take longer).

Every time the time reaches some multiple of 4096 seconds, we run the latency-dependent algorithm, choosing 512 random nodes to participate

in the algorithm.

A valid proposal is any valid chain of values that were finalized by the threshold-dependent algorithm. If a node sees some finalized value before time * T + k ∙ D* (

The "choice" function used at the end is simple:

- Finalized values that are not descendants of what was already agreed to be a finalized value in the previous round are ignored
- Finalized values that are invalid are ignored
- To choose between two valid finalized values, pick the one with the lower hash

If 5% of validators are honest, there is only a roughly 1 in 1 trillion chance that none of the 512 randomly selected nodes will be honest, and so as long as the network latency plus clock disparity is less than

the above algorithm will work, correctly coordinating nodes on some single finalized value, even if multiple conflicting finalized values are presented because the fault tolerance of the threshold-dependent algorithm is broken.

If the fault tolerance of the threshold-dependent consensus algorithm is met (usually 50% or 67% honest), then the threshold-dependent consensus algorithm will either not finalize any new checkpoints, or it will finalize new checkpoints that are compatible with each other (eg. a series of checkpoints where each points to the previous as a parent), so even if network latency exceeds

(or even * D*), and as a result nodes participating in the latency-dependent algorithm disagree on which value they accept, the values they accept are still guaranteed to be part of the same chain and so there is no actual

disagreement. Once latency recovers back to normal in some future round, the latency-dependent consensus will get back "in sync".

If the assumptions of both the threshold-dependent and latency-dependent consensus algorithms are broken *at the same time* (or in consecutive rounds), then the algorithm can break down.

For example, suppose in one round, the threshold-dependent consensus finalizes * Z → Y → X* and the latency-dependent consensus disagrees between

fault tolerance is a well known result in Byzantine fault tolerance theory, as is the impossibility of more than

fault tolerance even allowing synchrony assumptions but assuming offline observers.

*Originally published as “**A Guide to 99% Fault Tolerant Consensus**” with the WTFPL license*

Special thanks to Emin Gun Sirer for review

Join Hacker Noon

Create your free account to unlock your custom reading experience.