by Andrew Barisser
I’ve been struggling with the question of how to scale blockchain technology. This is the big elephant in the room around all the excitement around cryptocurrencies; the scalability problem is unsolved. The most promising solution, lightning networks, is unproven and seems to rely on questionable guarantees, for instance the ability to redeem pooled funds on-chain. It also introduces hub-and-spoke models with potentially centralized dynamics. It could work; I don’t know. But we should not be satisfied.
In short, the scalability problems are the main thing restraining my excitement around blockchain technology. In 8 years we’ve made very little progress. Besides the Lightning Network, the only progress has been bickering over blocksizes and highly bloated networks (Ethereum).
Another group that has made a truly interesting stab at the scalability problem is IOTA. Their tangle can support potentially unlimited capacity; indeed it strengthens with use. However it is currently a centralized system through the use of “coordinator” nodes. The ideas behind the Iota Tangle are intriguing; you should check it out because it will challenge your assumptions about A) verifiability and B) scalability. But the use of the coordinator node is just cheating and undermines the credibility of the system. They claim it will be rolled back eventually. But until then it’s not a working, decentralized system. Check it out anyway.
So I am dissatisfied with the paucity of options for dealing with Blockchain scalability. Scalability is critical because the true application of decentralized, trustless systems is between machines. Machines need to communicate at high frequency in untrusted environments; this is the real future of Blockchains: machine-machine coordination. A blockchain unfettered by scalability constraints could power billions of machines exchanging data at enormous scale.
The Old Double Spend Problem
The Bitcoin Blockchain was invented to solve the Double Spend problem. The cryptography underlying transactions is not new. The semantic internal to transactions is not particularly innovative. But with untethered transactions, one must establish an order of precedence between transactions to determine which ones are historically valid. There could be arbitrarily many syntactically valid transactions, ie A sends 1 Bitcoin to B, or A sends to C, or A sends to D; all are valid according to the grammar of transactions. But only 1 can be historically valid for any given coin. To establish an order of precedence, we must make the order of events mathematically self-evident; the Blockchain solved this problem.
How to Rule Out Double Spends in Existing Blockchains
Let’s deconstruct the problem by investigating exactly what it takes to know that a transaction is valid with conventional blockchains like Bitcoin. We need not describe the full protocol, just the bare minimum to know (trustlessly) that a particular transaction that I received was not double spent by someone else.
- A transaction is published in a Bitcoin Block.
- I verify that the transaction is internally consistent.
- I verify the block solution and header. I also establish that this is the longest block branch, ie the main chain. With a light wallet I can verify Bitcoin Block Headers only. This only gives partial trust because it does not rule out the possibility that the miners have lied to me, but it does verify the Proof-of-Work algorithm.
- I verify that the transaction belongs in the published Block (by inspecting the merkle root of the Block).
- I repeat the above process for every single block.
- I build a global state of all unspent Bitcoin outputs throughout the entire address space and over the entire history. For every block I must process and retain the entire global state.
But the actual bare minimum to verify a transaction is to inspect the validity of the inputs. Each Bitcoin input is the output of a previous transaction, or the mining block reward. Thus to verify a given input I must do one of two things:
- Maintain global state at all times to ensure correctness (everywhere).
- Only verify the transactions that feed into the transaction I want to verify. Only those need to be valid. In theory I can backtrack through transactions until I discover their origin as block rewards.
- Problem #1 is that, since inputs may merge in a transaction, as I backtrack, the number of inputs I must track grows exponentially.
- Problem #2, the more serious one, is that I must inspect the entire block to rule out the existence of a double spend of a transaction input I have already accounted for. In other words, to rule out a double spend, I must have a global view of all transactions.
This is the key problem. To know an input is valid, I must exclude the possibility that it has been spent elsewhere. To exclude the existence of something, I must have a global view of all things. The problem is, a transaction double-spending an input could be anywhere, thus I must look everywhere. As far as I can tell, this is the root of the Blockchain scalability problem.
But what if there were some way to rule out the existence of double-spends, without inspecting an entire block and maintaining global state? What if there were only one place to look? What if a spend-transaction could only be valid in one place?
I have been wrestling with a different idea, one in which there is only one place to look. Instead of having to rule out the existence of a double spend, and thus necessitating a global view, there will only be one syntactically valid place to change the state of a particular coin. Thus there will be, by definition, only one place to look.
It is possible to demonstrate timestamping of an almost unlimited number of data points by referencing a “Merkle Path” to the root of a block-published merkle root. Peter Todd has referred to this as Proof-of-Publication in numerous presentations. Proof-of-Presentation is essentially free and unlimited with existing blockchain tech. This should give you pause; it hints that more might be possible.
An Address Space in a Proof-of-Publication Tree
Proof of Publication works at virtually unlimited scale. We can create an address space among the leaves of a proof-of-publication Merkle Tree in which each unique leaf has a unique identifier. This corresponds directly to a binary representation of the left’s and right’s in a Merkle Path leading from the Merkle Root to that leaf. This is all well-known cryptography.
To represent transactions for a given indivisible token, I can provide two things: a valid set of transactions for that token, involving all the right ownership signatures, and a series of proofs-of-publication of those transactions over time. Verification will work in the following way:
- Verify all block headers
- Each token transfer is represented by a transaction and the associated proof-of-publication. This latter item is a Merkle Path leading from a hash of my transaction, to a published Merkle Root in a published historical block. This proves timestamping.
The elegance of the above approach is that it is not necessary to have a global view of events. There is only one place to look, the leaf node of the Merkle Tree corresponding to the particular token under consideration. No other locations are valid, thus I don’t need to rule them out.
The above works save for a significant problem. I call it a “gibberish attack”. What if a miner simply publishes a merkle root which contains nodes that do not correspond to valid transaction events for my token? The problem is this:
- I possess proof of publication of all my valid historical transactions
- These proofs are succinct and do not require looking at all transactions in a block, or looking at other addresses or other tokens.
- However I do not possess proof of non-publication in blocks in which I did not make a transaction.
Thus we may have problems like the following; I can succinctly prove that a token was sent to me at block #10 with a valid transaction. And I can prove that I published a transaction sending that coin to you at block #25. I can also prove that within these blocks, 10 and 25, the published transactions are the only ones the receiver needs to inspect. But I cannot prove that there were no transactions spending that coin between blocks 11–24.
In the above example, for blocks 11–24, I see a succession of Merkle Roots. But I cannot even discern whether they are valid.
The above problem is not really solved. While I think proof-of-publication based schemes demonstrate interesting properties, they are themselves not sufficient.
The below solutions are speculative.
Proof of non-membership of signature
Dealing with the above problem, I need to demonstrate that the spend of the coin I received in block #10 never happened during intermediary blocks 11–24 without requiring verification of the entire blocks. If there were a succinct way to prove non-membership of a value in a large set, then we could solve this problem. We can break down the transfer transaction signature into several components. One of the components of the input signature can be a fixed invariant which ought to remain the same across all blocks, but is unknowable by all but the signer until a spend has occurred. We could amend the protocol to require two things: a proof-of-publication of a transaction in a merkle tree, as stated above, and a simultaneous “revealing” of the spending signature in a separate data structure.
Thus for instance, if there were some way to reveal N strings (where N is large) in some way as to provide some concise proof P, such that it is trivial to prove that N does or does not exist in P, then we would have solved the problem. In our above example, you would receive my proofs-of-publication in blocks 10 and 25, along with signed transfer transactions. You could also verify the existence of signatures used in blocks 10 and 25. Finally you would inspect the proofs from blocks 11–24, and for each of them, prove that the signature from block 10 never occurred in any of these blocks. If this proof could be accomplished concisely, we’re done.
I don’t know how to implement the above, but it’s something I’m thinking about. Please let me know if you know of a solution.
Tree of zk-SNARKS Proofs
An even more speculative approach is with recursive trees of zero knowledge SNARK proofs. Let’s say I have a series of true assertions N and I can create a zkSNARK for each of these forming a set of N proofs. The zk-SNARK verification algorithm is itself a verifiable assertion. Thus I could construct a tree in which my original assertions, and their proofs, lie at the base as leaves. Each parent node is another zkSNARK proof that its children, zkSNARKS, are valid. zkSNARK verification is succinct. We eventually form a root node. This node demonstrates that, recursively, all statements in the tree are True, even if they are not known. What we want to create is a scheme where we can do two things: provide proof-of-publication via a Merkle Path, but also prove that everything in this tree was correct, even if I do not know everything in this tree.
The purpose of the above would be to concisely prove that a miner has not filled in junk in a Proof of Publication Merkle Tree, aka the “gibberish attack”. If we could accomplish this, no other verification scheme is necessary.
In closing, I have been trying to ponder schemes in which the global state of the network is not required to verify the correctness of a single input transaction. I believe that this is the key to scalability. The almost infinite scalability of Proof-of-Publication suggests that something to this effect might be possible.
One limitation to any of the above schemes is that, even if a tamper-proof algorithm can be found, miners must aggregate all the information in a block to construct the requisite Merkle trees, proofs, etc. This introduces its own scalability concerns, albeit vastly more favorable ones than the current state-of-the-art.
Feedback, Ideas, and Corrections
All of these ideas are quite inchoate. If there are any obvious flaws, or improvements, or any other helpful comments, please let me know.