Among the existing scalability solutions, sharding is probably the most widely adopted way to enable horizontal scalability.
The basic idea of sharding is to partition a global system state into multiple sub-states, i.e., shards, and to process the transactions in each shard relatively independently.
With an appropriately designed sharding technique, the capacity of the system increases as the numbers of shards and processors (nodes) increase; in other words, the system scales linearly.
To apply sharding, there are a couple of key questions we need to answer:
For example, in a distributed key-value (KV) store (e.g., BigTable, Cassandra), the system state is a map from arbitrary bytes (key) to arbitrary bytes (value), and the operations to change the system state are: create, read, update, and delete (CRUD).
Another example is a distributed append-only file system (e.g., Google File System (GFS), Hadoop Distributed File System (HDFS)), where the system state is a set of directories and files, and the operations fall into two sets: create, delete, and list operations on directories, and open, append, read, and close operations on a file.
The way the state is partitioned is critical to system performance, and the system can perform poorly if the partitioning design is inappropriate. To design the partitioning, there are several key aspects we need to consider:
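To make the KV-store example concrete, here is a minimal sketch of a state that is a map from bytes to bytes, split into shards, with each CRUD operation routed to exactly one shard. The names `shard_of` and `ShardedKVStore`, and the choice of hashing the key to pick a shard, are assumptions made for illustration; they are not how BigTable or Cassandra actually partition data.

```python
import hashlib


def shard_of(key: bytes, num_shards: int) -> int:
    """Pick a shard by hashing the key (an assumed, illustrative rule)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedKVStore:
    """Toy in-memory KV store whose state is split into sub-states (shards)."""

    def __init__(self, num_shards: int) -> None:
        self.num_shards = num_shards
        # One independent dict per shard; together they form the global state.
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, key: bytes) -> dict:
        return self.shards[shard_of(key, self.num_shards)]

    def create(self, key: bytes, value: bytes) -> None:
        shard = self._shard(key)
        if key in shard:
            raise KeyError(f"key already exists: {key!r}")
        shard[key] = value

    def read(self, key: bytes):
        # Returns None when the key is absent.
        return self._shard(key).get(key)

    def update(self, key: bytes, value: bytes) -> None:
        shard = self._shard(key)
        if key not in shard:
            raise KeyError(f"key not found: {key!r}")
        shard[key] = value

    def delete(self, key: bytes) -> None:
        self._shard(key).pop(key, None)
```

Because every operation touches a single key, it touches a single shard, which is why plain CRUD workloads partition so easily.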
Before answering the aforementioned questions for QuarkChain, let us first introduce the system model and the difficulties of sharding existing blockchains.
We consider an account-based blockchain model similar to Ethereum, where the system state is basically a key-value map from an address to its account data. There are two types of addresses, user addresses and smart contract addresses, and the account data consist of a nonce, a balance, code, and storage, where the code and storage of a user address are empty.
There are two types of transactions supported with various combinations of CRUD operations:
1, Transfer transaction between two user addresses, which basically updates the balances of the two addresses and the nonce of the sender;
2, Smart contract transaction, which may, among other things, run code from other contracts (e.g., via delegatecall).
Compared to the existing scalability solutions, the good news is that the system state of the blockchain is exactly the same as that of a distributed KV store such as BigTable or Cassandra. The bad news is that the transaction semantics are much more complicated than simple CRUD operations: a smart contract transaction could potentially perform any CRUD operation on any key-value pair of the system state. If the state is partitioned into different sub-states (shards), ensuring atomicity across multiple shards in a scalable way will be extremely difficult (most of the time impossible). How to partition the blockchain ledger is the fundamental problem of blockchain sharding.
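The atomicity problem can be seen even in the simplest transaction type. The sketch below models the account-based state and a transfer, and reports which shards the transfer touches. The `Account` fields follow the Ethereum-like model described above; `NUM_SHARDS`, `shard_of_address`, and the modulo partitioning rule are hypothetical choices for illustration and are not QuarkChain's actual scheme.

```python
from dataclasses import dataclass, field

NUM_SHARDS = 4  # assumed shard count for this example


@dataclass
class Account:
    nonce: int = 0
    balance: int = 0
    code: bytes = b""  # empty for a user address
    storage: dict = field(default_factory=dict)  # empty for a user address


def shard_of_address(address: int) -> int:
    """Hypothetical partitioning rule: shard by address modulo NUM_SHARDS."""
    return address % NUM_SHARDS


def transfer(state: list, sender: int, receiver: int, amount: int) -> set:
    """Apply a transfer and return the set of shard ids it had to touch."""
    src_shard = shard_of_address(sender)
    dst_shard = shard_of_address(receiver)
    src = state[src_shard].setdefault(sender, Account())
    if amount < 0 or src.balance < amount:
        raise ValueError("invalid amount or insufficient balance")
    dst = state[dst_shard].setdefault(receiver, Account())
    # In a real sharded system these writes would have to be made atomic
    # across shards; here they are plain in-memory updates.
    src.balance -= amount
    src.nonce += 1
    dst.balance += amount
    return {src_shard, dst_shard}
```

With this rule, a transfer from address 1 to address 5 stays inside shard 1, while a transfer from address 1 to address 6 touches shards 1 and 2 and would need cross-shard coordination. A smart contract transaction is harder still, since the set of keys it touches may not be known before it runs.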
In addition, a decentralized setting brings further challenges, as we need to build a proper consensus to process transactions in all shards securely. A new sharding consensus opens new possibilities for attack; without a comprehensive analysis of the threat model, a shard may be easily compromised, and with it the whole network.
Besides the challenges of partitioning and consensus, another common problem of sharding is interoperability among shards, i.e., cross-shard transactions. The underlying motivation is usability: a user should be able to access all resources, including smart contracts and other user accounts, across all shards. How to develop efficient and secure cross-shard transactions is a key topic.
In the following articles, we will discuss QuarkChain’s solutions to the challenges in blockchain sharding. In addition, we will compare QuarkChain with existing centralized systems such as Google’s BigTable and illustrate the similarities and differences between them.