In this article, we briefly review some advanced topics in Ethereum blockchain development. To follow along, you need basic knowledge of blockchain technology and Ethereum development. Here is an excellent article for learning about blockchain history and evolution and how blockchain technology works.
In this article, we cover advanced Ethereum concepts such as oracles, off-the-chain data, PoS, and TPS.
Ethereum smart contracts are executed on nodes worldwide. Given the same set of inputs, every node must obtain the same outcome. This property is called determinism. Ethereum relies on determinism to validate smart contract outputs: validating nodes have to yield the same results when running the same code. In this sense, determinism plays a key role in enabling nodes to reach consensus.
Maintaining determinism can be challenging. On the one hand, Ethereum is a general-purpose platform, and its smart contracts often require data or inputs from external sources such as the internet. Without access to these sources of information, the use cases for smart contracts would be limited. On the other hand, even with a tiny time difference, validating nodes may retrieve different information from an external source. With different inputs, nodes end up with different outputs.
Consequently, the determinism property would not hold. To avoid this issue, smart contracts are not permitted to call an internet URL or pull data from an external source directly. To resolve the paradox, Ethereum relies on oracles.
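The problem can be illustrated with a minimal sketch in plain Python rather than contract code. The `price_feed` function, the node names, and the query times are hypothetical stand-ins for an external source whose answer changes over time.

```python
import hashlib

def price_feed(timestamp: int) -> int:
    """Hypothetical external source: the answer depends on when it is queried."""
    return 100 + timestamp % 7

def run_contract(price: int) -> str:
    """Deterministic contract logic: the same input always yields the same digest."""
    payout = price * 3
    return hashlib.sha256(str(payout).encode()).hexdigest()

# Case 1: each node queries the source itself, a moment apart.
query_times = {"node_a": 1000, "node_b": 1001, "node_c": 1002}
direct_results = {name: run_contract(price_feed(t)) for name, t in query_times.items()}

# Case 2: the price is fixed once and handed to every node as the same input.
agreed_price = price_feed(1000)
shared_results = {name: run_contract(agreed_price) for name in query_times}

print(len(set(direct_results.values())))  # 3: the nodes disagree
print(len(set(shared_results.values())))  # 1: the nodes agree
```

The contract logic is identical in both cases. Only the way the input reaches the nodes differs, which is why the input has to be agreed on before the code runs.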
A definition of Oracle is as follows:
"A shrine in which a deity reveals hidden knowledge or the divine purpose through such a person."
– Merriam-Webster
In blockchain, an oracle is a third-party or decentralized data feed service that provides external data. Oracles provide an interface from the real world to the digital world. Oracle data is not part of the blockchain; it is available off-chain.
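As a rough sketch of this data flow, the toy model below has an off-chain feed push a value into a contract, which every node can then read from its own storage. It is plain Python, not Solidity, and the names `OracleContract`, `fetch_fx_rate`, and `ORACLE_ADDRESS`, the single trusted-sender rule, and the rate itself are all assumptions made for illustration.

```python
class OracleContract:
    """Toy stand-in for an on-chain contract that stores oracle-supplied values."""

    def __init__(self, trusted_oracle: str):
        self.trusted_oracle = trusted_oracle  # assumed: one authorized feed address
        self.values = {}                      # key -> latest reported value

    def report(self, sender: str, key: str, value: int) -> None:
        # Only the designated oracle may write; the value arrives as transaction input.
        if sender != self.trusted_oracle:
            raise PermissionError("sender is not the trusted oracle")
        self.values[key] = value

    def read(self, key: str) -> int:
        # Every node reads the same stored value, so execution stays deterministic.
        return self.values[key]


def fetch_fx_rate() -> int:
    """Hypothetical off-chain lookup; a real oracle would query an external API here."""
    return 10850  # invented EUR/USD rate scaled by 10,000, for illustration only


ORACLE_ADDRESS = "0xoracle"  # hypothetical address
contract = OracleContract(ORACLE_ADDRESS)
contract.report(ORACLE_ADDRESS, "EUR/USD", fetch_fx_rate())
print(contract.read("EUR/USD"))  # 10850
```

The external lookup happens once, outside the chain. What the nodes execute is only the `report` call with a fixed value, so they all arrive at the same state.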
There are different types of oracles. Two of them are software oracles and hardware oracles:
Software oracles: This normally refers to easily accessible online information such as stock index prices, FX rates, economic news, weather forecasts, and so on. Software oracles are useful since they provide smart contracts with a wide range of information and up-to-date data.
Hardware oracles: This normally refers to scanned information such as UPS delivery scanning, registered mail scanning, supplier goods delivery scanning, and so on. This feed can be useful to activate a smart contract acting on an event's occurrence.
Ethereum Off-the-chain Data
There are multiple scenarios where data cannot be stored on a chain:
State variables: Data stored on an Ethereum blockchain is immutable. However, the contents of state variables vary as account balances change. A solution is to save them off-the-chain.
Oracles: As discussed above, oracle data is not part of the blockchain and is available off-chain.
Digitized assets: Digitized assets commonly require a large dataset to describe or define them. Given the limited size of blocks, it is not feasible to host complete asset information on a chain.
Trimmed blocks: For optimization, Ethereum full nodes may keep only a portion of the distributed ledger, that is, trim the ledger. The trimmed blocks are saved off-the-chain at a centralized location to support future inquiries.
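For the digitized-asset case, one common pattern is to keep the full dataset off-chain and record only a fixed-size digest on-chain. The sketch below is a minimal illustration under stated assumptions: the two dictionaries are stand-ins for an external store and for contract storage, the asset is invented, and SHA-256 is used in place of the Keccak-256 hash that Ethereum itself uses.

```python
import hashlib
import json

off_chain_store = {}   # stands in for a database or file store outside the chain
on_chain_records = {}  # stands in for compact contract storage

def register_asset(asset_id: str, asset: dict) -> str:
    """Store the full asset off-chain and only its digest on-chain."""
    payload = json.dumps(asset, sort_keys=True).encode()
    digest = hashlib.sha256(payload).hexdigest()
    off_chain_store[asset_id] = payload
    on_chain_records[asset_id] = digest
    return digest

def verify_asset(asset_id: str) -> bool:
    """Check that the off-chain copy still matches the on-chain digest."""
    payload = off_chain_store[asset_id]
    return hashlib.sha256(payload).hexdigest() == on_chain_records[asset_id]

register_asset("deed-1", {"type": "property deed", "pages": 42})
print(verify_asset("deed-1"))              # True
off_chain_store["deed-1"] = b"tampered"    # simulate a change to the off-chain copy
print(verify_asset("deed-1"))              # False
```

The on-chain footprint stays the same size no matter how large the asset description is, and any change to the off-chain copy becomes detectable.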
Proof of Stake (PoS) is an algorithm for choosing a validator to build the next block. Under PoS, the more coins a validator owns, the higher its chance of being chosen. Compared to Proof of Work (PoW), PoS is much more energy efficient and quicker.
A pure PoS scheme leads to the richest validator being selected most often, causing a supernode problem, in which a single node validates the majority of the blocks added to the chain. This obviously will not work. Additional randomness is required to give other validators better chances. Several randomization methods are available:
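A minimal sketch of pure stake-weighted selection follows. The validator names and stake sizes are hypothetical, and the fixed seed is only there to make the run repeatable.

```python
import random
from collections import Counter

# Hypothetical validators and coin holdings.
stakes = {"alice": 700, "bob": 200, "carol": 100}

def pick_validator(stakes: dict, rng: random.Random) -> str:
    """Choose one validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # fixed seed so the run is repeatable
wins = Counter(pick_validator(stakes, rng) for _ in range(10_000))
print(wins.most_common())
```

Over many rounds, the share of blocks each validator builds tracks its share of the total stake, so the largest holder wins most of the time.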
Randomized block selection: Uses a formula to look for the lowest hash value in combination with the size of the stake for selecting a validator.
Coin age-based selection: Coins owned long enough, say 30 days, are eligible to compete for the next block. A validator with older and larger sets of coins has a better chance of being granted the role.
Delegated PoS: This implementation chooses a limited number of nodes to propose and validate blocks being added to the blockchain.
Randomized PoS: Each node is selected randomly using a verifiable random beacon for building the new block.
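Two of the methods above can be sketched together. This is an illustration under stated assumptions, not any specific protocol: the validators and their figures are hypothetical, the 30-day threshold is the example from the list, and the scoring formula (a hash value divided by stake, lowest wins) is one plausible reading of "lowest hash value in combination with the size of the stake".

```python
import hashlib

MIN_AGE_DAYS = 30  # the example threshold mentioned in the list above

validators = [
    # (name, stake, age of the coins in days): hypothetical values
    ("alice", 700, 10),
    ("bob", 200, 45),
    ("carol", 100, 90),
]

def eligible_by_coin_age(validators):
    """Coin age-based filter: only coins held long enough may compete."""
    return [v for v in validators if v[2] >= MIN_AGE_DAYS]

def score(seed: str, name: str, stake: int) -> float:
    """Assumed formula: a pseudo-random hash value divided by stake; lower is better."""
    digest = hashlib.sha256(f"{seed}:{name}".encode()).digest()
    return int.from_bytes(digest, "big") / stake

def select_validator(seed: str, validators):
    """Randomized block selection: the lowest stake-adjusted hash wins."""
    return min(validators, key=lambda v: score(seed, v[0], v[1]))[0]

candidates = eligible_by_coin_age(validators)
print([name for name, _, _ in candidates])         # ['bob', 'carol']
print(select_validator("block-1000", candidates))  # depends on the seed
```

The richest validator, alice, is excluded here because her coins are too young. Among the rest, a larger stake lowers the score but the hash still lets a smaller validator win some rounds.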
At the time of writing, Ethereum is working on replacing PoW with PoS in a future release.
Ethereum is inherently slow. The average time for a validator to build a block is 17 seconds. A transaction is usually considered confirmed only once it is 12 blocks deep, counting the block that contains it as the first. That is 12 × 17 = 204 seconds, or 3.4 minutes, of waiting time for a transaction to be confirmed. The 12-blocks-in-depth rule is necessary for the following reason.
When a validator adds a new block to its copy of the ledger, other validators may be working on a competing path. The validator may lose the competition to build the longest chain. Per the blockchain protocol, it then has to drop its own block and add the winning block to its ledger copy. The 12-blocks-in-depth rule ensures that a transaction does not end up in a block that is dropped later.
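The rule can be expressed as a small depth check. This is a minimal sketch using the figures from the text; the block numbers in the usage lines are hypothetical.

```python
BLOCK_TIME_SECONDS = 17   # average block time quoted in the text
REQUIRED_DEPTH = 12       # blocks in depth, counting the transaction's own block

def confirmations(tx_block: int, head_block: int) -> int:
    """Depth of a transaction: its own block counts as the first confirmation."""
    return max(0, head_block - tx_block + 1)

def is_confirmed(tx_block: int, head_block: int) -> bool:
    """A transaction is treated as confirmed once it is deep enough in the chain."""
    return confirmations(tx_block, head_block) >= REQUIRED_DEPTH

def expected_wait_seconds() -> int:
    """Approximate wait used in the text: depth times average block time."""
    return REQUIRED_DEPTH * BLOCK_TIME_SECONDS

print(is_confirmed(tx_block=100, head_block=105))  # False: only 6 blocks deep
print(is_confirmed(tx_block=100, head_block=111))  # True: 12 blocks deep
print(expected_wait_seconds())                     # 204
```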
Throughput is a measure of how many units of information a system can process in a given time window. For measuring the performance of a transaction platform, throughput is expressed in transactions per second (TPS).
To calculate Ethereum TPS, we take the approximate number of transactions in a block (we use 2,000) and divide it by the waiting time for a transaction to be confirmed, 204 seconds. So, Ethereum TPS is approximately 9.8, that is, almost 10 transactions per second. By applying the same approach, we can estimate the TPS for Bitcoin, which is about 0.5 transactions per second.
By comparison, Visa has a TPS of 2,000 with a peak TPS of 40,000. A high-performance database such as VoltDB can handle over a million insertions a second. A stock exchange can match thousands of trades a second. This clearly shows a gap that the blockchain community needs to close.
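The estimate above can be reproduced in a few lines. The Ethereum inputs come from the text. The Bitcoin inputs (about 2,000 transactions per block, 10-minute blocks, 6 confirmations) are assumptions chosen for illustration, since the text gives only the result.

```python
def estimated_tps(txs_per_block: int, block_time_s: float, depth: int) -> float:
    """Estimate used in the text: transactions in a block divided by confirmation wait."""
    return txs_per_block / (block_time_s * depth)

# Ethereum figures as given in the text: 2,000 transactions, 17 s blocks, 12 blocks deep.
print(round(estimated_tps(2000, 17, 12), 1))   # 9.8

# Bitcoin inputs are assumptions for illustration, not taken from the text.
print(round(estimated_tps(2000, 600, 6), 2))   # 0.56
```

Note that this approach divides by the full confirmation wait rather than by the time to build a single block, so it is a conservative way of expressing throughput.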
Ethereum is working on multiple solutions to increase TPS. PoS is being developed as a replacement for the computationally inefficient PoW algorithm. At the time of writing, PoS has not been fully implemented and deployed on mainnet, due to concerns about the emergence of a set of supernodes (which would receive an outsized role in building new blocks).
Casper is the Ethereum community's attempt to transition from PoW to PoS. Under the Casper protocol, validators set aside a portion of their ether as a stake. When a validator identifies a candidate block, it bets ether on that block. If the block is indeed added to the chain, the validator is rewarded based on the size of its bet. Validators acting maliciously are penalized by having their stakes removed.
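The stake, bet, reward, and penalty cycle can be sketched as a toy model. This is not the Casper specification: the class name, the 1% reward rate, the validator names, and the amounts are all assumptions made for illustration.

```python
class CasperSketch:
    """Toy model of staking, betting, rewards, and slashing as described above."""

    REWARD_RATE = 0.01  # assumed: reward is 1% of the bet; not a real protocol value

    def __init__(self):
        self.stakes = {}  # validator -> ether set aside as stake
        self.bets = {}    # validator -> (block_id, amount bet)

    def deposit(self, validator: str, amount: float) -> None:
        """Set aside ether as stake."""
        self.stakes[validator] = self.stakes.get(validator, 0.0) + amount

    def bet(self, validator: str, block_id: str, amount: float) -> None:
        """Bet part of the stake on a candidate block."""
        if amount > self.stakes.get(validator, 0.0):
            raise ValueError("bet exceeds stake")
        self.bets[validator] = (block_id, amount)

    def finalize(self, winning_block: str) -> None:
        """Reward validators whose bet matches the block that joined the chain."""
        for validator, (block_id, amount) in self.bets.items():
            if block_id == winning_block:
                self.stakes[validator] += amount * self.REWARD_RATE
        self.bets.clear()

    def slash(self, validator: str) -> None:
        """Penalize malicious behavior by removing the whole stake."""
        self.stakes[validator] = 0.0


casper = CasperSketch()
casper.deposit("alice", 32.0)
casper.deposit("mallory", 32.0)
casper.bet("alice", "block-A", 10.0)
casper.finalize("block-A")   # alice bet on the block that was added
casper.slash("mallory")      # mallory is assumed to have misbehaved
print(casper.stakes)
```

The reward grows with the size of the bet, while the penalty puts the entire stake at risk, which is what gives validators an incentive to behave honestly.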
Led by Vitalik Buterin, the Ethereum Foundation is also working on sharding, an approach that aims to increase TPS 80-fold. Sharding splits the state of the network into multiple shards, where each shard has its own transaction history and its own portion of the network's state.
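As a rough illustration of the idea, the sketch below partitions accounts across shards, each with its own state and history. The shard count, the hash-based assignment rule, and the addresses are assumptions for illustration and do not reflect the actual Ethereum sharding design.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count for illustration

def shard_for(address: str) -> int:
    """Assumed rule: map an account to a shard by hashing its address."""
    digest = hashlib.sha256(address.encode()).digest()
    return int.from_bytes(digest, "big") % NUM_SHARDS

# Each shard keeps its own slice of state and its own transaction history.
shards = [{"state": {}, "history": []} for _ in range(NUM_SHARDS)]

def credit(address: str, amount: int) -> None:
    """Apply a credit on the one shard responsible for this address."""
    shard = shards[shard_for(address)]
    shard["state"][address] = shard["state"].get(address, 0) + amount
    shard["history"].append(("credit", address, amount))

for account, amount in [("0xaaa", 5), ("0xbbb", 7), ("0xaaa", 3)]:
    credit(account, amount)

print(shards[shard_for("0xaaa")]["state"]["0xaaa"])  # 8
```

Because each transaction touches only one shard in this sketch, different shards could process their transactions in parallel, which is where the throughput gain would come from.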
Another idea for increasing TPS is Plasma, a technique for conducting off-the-chain transactions while relying on the underlying Ethereum blockchain for security. Plasma therefore belongs to the group of off-chain technologies. Truebit is another example of this group.
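One way to picture the link between off-chain activity and the main chain is a compact commitment: many transactions are processed off-chain, and only a single hash summarizing them is published on-chain. The sketch below computes a Merkle root over a batch of hypothetical transfers. It illustrates the general idea only, uses SHA-256 as a stand-in hash, and is not the Plasma specification.

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Fold byte strings into one root hash, duplicating the last node on odd levels."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical off-chain transfers processed outside the main chain.
off_chain_txs = [b"alice->bob:5", b"bob->carol:2", b"carol->alice:1"]

# Only this 32-byte commitment would be submitted to the main chain.
commitment = merkle_root(off_chain_txs)
print(commitment.hex())
```

The main chain stores 32 bytes regardless of how many transfers the batch contains, yet any change to a transfer would produce a different root.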
In this article, we reviewed some advanced topics in Ethereum blockchain development, such as oracles, off-the-chain data, PoS, and TPS. The next step is to build your first Ethereum application via the "Write Ethereum Smart Contracts with Solidity in 1 Hour" tutorial.
About Authors
This article was written by Matt Zand (founder of High School Technology Services) in collaboration with Brian Wu, a senior blockchain instructor at Coding Bootcamps.