Ethereum is turning from PoW (Proof of Work) to PoS (Proof of Stake). However, PoS launch time is postponed again and again.

Ethereum uses difficulty bomb to switch the network to PoS. After the difficulty bomb takes effect, mining on the PoW network will be exponentially difficult. The block time increases dramatically. After nobody can mint new blocks on the PoW network, all participants can only switch to PoS network. The following graph shows the average block time.

Source: Etherscan

Every wave stands for the impact of the difficulty bomb. The block time increases a lot. The wave disappears because the difficulty bomb is postponed during network upgrades.

The launch of PoS is essential for Ethereum. In this article, we will explain the history of PoS and why it is so hard for Ethereum to switch to PoS.

How to Learn a Consensus

PoW and PoS are two common consensus approaches. We can ask ourselves these two questions when learning a consensus:

Who has the right to propose new blocks
If there is a fork, which will be the canonical chain

PoW Overview

With the framework in the last section, we can analyze PoW shortly. The node which first gets the answer to the computation puzzle right can propose the new block. If there is a fork, the longest fork will become the canonical chain. If anyone would like to manipulate the chain, the attacker needs at least 51% computation power. Compared to PoS, PoW is relatively simple. PoW does not tap into problems like random numbers.

PoW networks are protected by the cost of the work. In the long term, nodes will get rewards proportional to their computation power.

PoS Deep Dive

Let’s answer the question at the beginning, why it is so hard for Ethereum to switch to PoS. Part of the reason is that Ethereum is pushing decentralization and encourages more nodes to join the network. This requires a more complex protocol to support the P2P communication of thousands of nodes.

Most networks employ PoS because of sustainability, security, higher TPS, and fast finality. If any nodes cheat, the network can immediately slash their staking.

History of PoS

PoS comes through two-phase development. PoS 1.0 is blockchain-based PoS consensus. PoS 2.0 is based on BFT(Byzantine Fault Tolerance).

Examples of PoS 1.0 are PeerCoin, NextCoin, BlackCoin, and Ethereum Serenity.

PoS 2.0 is based on BFT and is widely adopted by projects like Ethereum 2.0, Tendermint, and Cosmos.

PoS 1.0

During phase 1.0, most PoS looks like this:

Nodes stake tokens
Randomly select one node to produce new blocks. The probability is proportional to the number of staked tokens

The biggest problem of this workflow is the generation of random numbers. If the random number generation is based on timestamp and block hash, the block proposer would generate the new block in a way that increases the probability of being elected as the block proposer again.

Attack Vector

Common attack vectors are listed below:

Staking grinding: The block proposer tries to influence the random number generator to get more chances to produce blocks
Nothing at stake: In PoS networks, producing new blocks does not cost lots of computation resources. When there is a fork, nodes will produce new blocks on every fork. In this way, nodes can maximize their yields. However, this is harmful to the network stability
Long-range attack: Early validators can roll back the state in the future. This will easily cause forks

You can find more attack vectors on page 58 of this paper.

PoS 2.0

PoS 2.0 employs BFT. The goal of BFT is to reach a consensus between different nodes.

The leader will make a proposal. Among all nodes, there are Byzantine nodes. Byzantine nodes cheat in the following ways:

Failure to return a result
Respond with an incorrect result
Respond with a deliberately misleading result
Respond with a different result to different parts of the system

Except for Byzantine nodes, the rest of the nodes are all honest nodes. The most difficult scenario is that the leader node is Byzantine.

Besides these settings, there are some other limitations:

Nodes can only communicate peer to peer
Nodes keep the log of sent-out messages

P2P communication causes different nodes to have different views. For example, node A finds 3 new blocks, but node B only sees 1 new block.

The main goal of BFT is that after the leader makes proposals, every node except Byzantine nodes can reach an agreement. The side goal is to find the Byzantine nodes.

The current algorithm can achieve the main goal in the following condition:

Among 4 nodes, 1 is Byzantine node
Among 25 nodes, 8 are Byzantine nodes
Among 100 nodes, 33 are Byzantine nodes

Network operation status can be categorized in the following conditions. f is the number of Byzantine nodes. There is a total of 3k+1 nodes in the network.

f ≤ k, the network is stable
k < f < 2k+1, some nodes may fail to reach consensus. The blockchain pauses
f ≥ 2k+1, there are security risks

The popular BFT algorithms are PBFT and Tendermint. Tendermint can keep the network stable in the condition that among 3k+1 nodes, there are k byzantine nodes.

BFT

Modern BFT algorithms employ multi-round votes to ensure every node knows that more than 2/3 of nodes approve the proposal. After the block is finalized, the honest node will never revert it.

Every round of communication has four steps:

The client sends requests to the leader
The leader pass request to all nodes
Nodes respond to leader with execution results
After getting f+1 same results, the leader responds to the client (f is the number of Byzantine nodes)

The below diagrams show 5 stages during execution. C is the client, 0 is the leader and 3 is the Byzantine node.

Prepare and commit stages are used to ensure the synchronization between different nodes.

For Tendermint, it has several limitations:

P2P communications cannot support more than 100 nodes in the network
Complex P2P communications design
Execution delay. Nodes first decide the new blocks. Then nodes execute all transactions in the new block. Although individual transactions work fine, executing all transactions once may cause problems, like the double-spending problem

If you would like to learn more about PBFT, check out this blog.

Ouroboros

Compared to BFT algorithms, Ouroboros can support more and more nodes and it also supports nodes to come in and out at anytime. Ouroboros is more decentralized and flexible than BFT but has a longer finality time. Cardano and Mina uses Ouroboros.

Ouroboros divides time into epochs. Epochs can be divided into slots. Every slot uses VRF to generate random numbers and use this random number to decide the block proposer. Nodes staking more tokens are more possible to be a block proposer.

When there is a fork, Ouroboros will choose the correct chain with predefined selection rules. For example, it may choose the most “dense” chain.

Future

Blockchains are balancing decentralization and performance. Switching from PoW to PoS shows that blockchains are chasing TPS, decentralization, tokenomics, and fast finality. We will see more innovations around these four characteristics.

Some blockchains use sampling techniques to lower the communication frequency between nodes. ZK can be applied to it. Byzantine nodes need to send proofs with their messages. This can prevent Byzantine nodes from cheating. Through verifying the proof, honest nodes can also find byzantine nodes.

The tokenomics innovations in GameFi and DeFi applications can be applied in consensus. For example, we can use Ve tokens to improve the staking ratio. Using dual token design may also be an interesting direction. Separating the rewards and sharing tokens to catch more value.

Application iterates much faster than low-level protocol, like consensus. It is hard to apply innovations on applications to low-level protocols. Modular blockchains may solve this problem. However, it is still risky for blockchains to hot-swap essential modules.