2,241 reads

Evolution of Airdrop: from Common Spam to the Merkle Tree

by Sergey PrilutskiyAugust 7th, 2018

Too Long; Didn't Read

With a good algorithm, you can simultaneously solve several problems associated with smart contracts that work with large lists of user addresses. You can’t add a list of several thousand addresses to a contract and allow this set of addresses to do something in that contract. The <a href="https://hackernoon.com/tagged/blockchain" target="_blank">blockchain</a> tries to save every single byte, so it would be too expensive to have so much data there.

Companies Mentioned

Coin Mentioned

featured image - Evolution of Airdrop: from Common Spam to the Merkle Tree

With a good algorithm, you can simultaneously solve several problems associated with smart contracts that work with large lists of user addresses. You can’t add a list of several thousand addresses to a contract and allow this set of addresses to do something in that contract. The blockchain tries to save every single byte, so it would be too expensive to have so much data there.

To solve this problem, the contract code needs to be able to determine whether a given address is whitelisted. If the verification is successful, the required action can be performed. The proposed solution is quite simple: the contract only stores one number(merkle-root) and it is the customer’s obligation to prove that the address is whitelisted. The algorithm was proposed by the brilliant cryptographer Merkle and widely used in almost all decentralised software to ensure the integrity of data sets.

Airdrop and ACL

A common practice is to issue your own token and send it in small quantities to tens of thousands of addresses that have, for example, at least 1 ETH in their wallets. This type of spam is currently a popular way of promoting tokens. Not many people like someone’s unknown tokens appearing in their wallet. However, projects need this and airdrop orders are very popular. This is usually done in the following old-fashioned way (based on Ethereum, but it is also suitable for other blockchains that use smart-contracts, including EOS):

The blockchain is parsed and a large list of addresses is created based on set specified criteria
An address is created with enough tokens to send to all addresses in the list, or a function is added to the token contract to implement the ability to create (mint) the required number of tokens for the specified address (when initiating a transaction from a special privileged address)
Enough ETH are placed on this address to pay the fee charged for each time tokens are sent to users
A distribution script is launched that iterates through addresses and creates a transaction for each one to transfer (or mint) tokens to a given address.

This approach consists in simply iterating over a large list of addresses and sending tokens to each of them. In decentralised systems, this push strategy is usually far from the best; it is expensive, generates security vulnerabilities and is, in fact, just spam. The disadvantages can be described in slightly more detail as follows:

Fees depend on the number of addresses and can be prohibitively large. In addition, at the time of distribution, the fees may increase, because the cost of the transaction increases as the load on the network increases.
To perform the distribution, you need to develop a script that will send transactions and that contains a secret key to access a large number of tokens.
You need an algorithm to control the distribution, which will also be able to resume work if it is interrupted.

However, there is a much simpler solution, common for decentralised networks, that delegates most of the computational tasks to the client-side software. In this case, access control is based on the Merkle tree, which is a very convenient data structure that allows the contract to store just one fixed-size number (merkleRoot) with information about all the data in this tree (for example, a huge list of recipient addresses). It doesn’t involve any magic: the client application provides information, proving that the address is in the list of allowed addresses, by itself. To do this it performs relatively complex calculations and eliminates the need for the contract to view a huge list of addresses. If you are interested in cryptocurrencies, then you should already know about the Merkle tree; if you don’t, then you should find out about it, as this structure is very useful for solving many problems.

I also mentioned the ACL, or Access Control List, that refers to file systems: this is a way to specify the rights to an object in the form of a list of accounts and access types. I use this term to show that this algorithm is suitable for creating huge ACLs that provide access to a certain contract feature to millions of accounts. To do this, save a single number in the contract to make sure the account is in the list.

But let’s focus on the scheme including airdrop, as it is now extremely popular in the market and is a simple and clear example of smart contracts with large ACLs.

DApp scheme

I hope you already know what a smart contract is (as well as a token contract) and understand how it works. As you know, a smart contract is first deployed onto the Ethereum network and receives its own address and balance, after which the smart contract accepts and processes incoming transactions.

In the merkle-airdrop scheme, tokens are not distributed using an address list, instead users ‘claim’ their tokens themselves by sending a transaction to the contract and paying a fee. The secret is that users add special data to the transaction allowing the contract to easily verify that they are whitelisted, and the contract does not have to store the list itself. In this case, the contract is the most cost-effective, because it needs only one (!) number for an address list of any (!) size — this is the essence of this great algorithm. Such a contract is also extremely simple to launch, and it does not require any support after launch; moreover, this simplicity also ensures maximum safety when used. We will discuss these aspects later.

Smart contract

The scheme with a merkle-airdrop contract works approximately as follows:

A token contract is placed on the network (or is already present there).
A list is created for those addresses that are allowed to ‘claim’ their tokens. The list is generated based on blockchain analysis, collected on the website, or created in any other way.
All addresses are stored in one large file, which is uploaded to a resource and made publicly available using a certain URL. For example, we upload files to IPFS, but you can use any other way to save the file so that it can be opened later from the client browser.
This file is processed by a script that takes all the addresses from it and builds a Merkle tree, and then gets a single hash called a Merkle root. This root is the main parameter of our contract for issuing tokens.
The merkle-airdrop contract is published on the network. In this case, the contract stores the Merkle root and the token contract address so that it can transfer this token to the recipients.
A large number of tokens are transferred (in the usual way) to the balance of the airdrop contract; the amount should be enough for distribution between all participants from the list.
Now, the airdrop contract provides external access to the “claim-tokens-by-merkle-proof” function, which takes merkle-proof from the user as a proof that the user is in that large whitelist of addresses. If the proof is valid, then the contract executes transfer function from its address and transfers tokens to the user.

User

The following processes occur in the client browser. All operations are performed by the client-side JavaScript right on the DApp page:

The client finds its address in a white list that is publicly available somewhere on the network (e.g. in IPFS) and makes sure it is eligible for tokens.
The client reads all the addresses in the list and builds a Merkle tree; it then gets the merkle-proof, which is a set of numbers (hashes), the quantity of which depends on the size of the list; however, it is very small compared to the size of the list; for example, it is sufficient to use about 17 hashes for about 100,000 addresses (the size of the merkle-proof is calculated as follows: log2(N), where N is the number of elements).
The client sends a transaction to the contract to call the “claim-token-by-merkle-proof” function and send the received merkle-proof. The contract only needs to check the proof (using the saved merkle-root) and make the transfer() of tokens, if the proof is valid. Checking merkle-proof is not simple, but requires only log2(N) operations and it’s much easier, than search in list.
The contract stores information about transactions to prevent tokens being reissued.

In general, merkle-proof can be described as ‘the path from the user’s address to the merkle-root through the branches of the Merkle tree’. This requires quite complex calculations and extra data, which we took out of the contract and saved the most expensive resource in the blockchain — storage. The merkle-proof consists of log2(N) hashes (rounded up to the nearest whole number). The size of each hash corresponds to the size of the merkle-root, which we saved in the airdrop contract. So the user must provide 10 hashes for a list of 1,024 addresses, but only 32 hashes for about 4 billion addresses. The protocol for creating and presenting the proof is the basic feature of the contract, which makes it possible to store a minimum amount of information to confirm that certain data belong to a very large list. Moreover, the larger the list is, the greater your benefits will be.

In fact, the contract also provides for the ability to pick up unused tokens, update the merkle root, and impose time limits, such as prohibiting the issue of tokens after a certain period of time. After a simple modification, the contract will be able to issue an arbitrary number of tokens to each address; in this case, the file will store not only the addresses of the recipients, but also the necessary amounts of tokens. In addition, you will need to slightly modify the function to check merkle-proof, however, the general algorithm remains almost unchanged.

Advantages and disadvantages of merkle-airdrop

Let’s describe the advantages and disadvantages of the above method compared to traditional script-based distribution.

Advantages:

the transaction that requires tokens doesn’t cost much ether, while this amount is a constant that depends on the size of the white list
the launched contract does not require any support, all the necessary operations are performed by the client-side application
this approach allows you to work with lists of almost any size with minimal storage consumption in the blockchain.

Disadvantages:

you must upload the address list to a public resource
the client code needs to process all addresses in the white list and perform quite resource-intensive operations.

There are no secrets in this algorithm, and the contract storage savings are offset by the intensive operations of client-side applications to verify that addresses belong to the list. This approach clearly demonstrates the difference between the models of using smart contracts in comparison with traditional centralised systems.

The advantages of the traditional scheme for distributing tokens using a script are just simple and clear workflows. In addition, the developer needs to make much more effort to run the usual airdrop compared to the publication of a merkle-airdrop contract. Moreover, the running script should be monitored to take appropriate measures if the operation is interrupted in the middle of the list or the funds to pay the transaction fee have been exhausted; you also need to make sure that no one can steal private key used by the script to sign transactions. And if you have a file with addresses, you do not need a developer at all. You don’t have to be a developer to deploy such a contract, it can be done very simply by using public services.

Merkle-ACL is not the single solution, there are other good “claim” algorithms with simpler logic, more effective from smart contracts view. For example when a user presents the “pre-signed right to claim tokens” that he found in public whitelist. But from the user’s view, this way is harder, because the contract owner has to build a huge whitelist and sign every line with the proper secret key. Yeah, there is no “free” features in programming.

Implementation features

In addition to the basic smart contract, a complete DApp for the merkle-airdrop has some implementation specifics. The merkle-airdrop scheme delegates a large number of operations to the code in the user’s browser, which, for example, must create a merkle proof after processing the entire address list. The list of addresses should be stored somewhere publicly, and for the user this procedure should be no more difficult than uploading a file to the server.

Therefore, our merkle-airdrop constructor for the Smartz.io platform allows you to upload the address list into IPFS and provide this list to the client. In addition, we have developed special JS widgets to work with the file of the address list in IPFS. In the final version of DApp, we plan to store IPFS information directly in the contract to have 100% of the data for airdrop in the blockchain. At the moment, we have released a fully functional beta version and we’re working on next usability improvements. We are also developed a merkle-airdrop constructor for EOS tokens, as it has a similar interface and internal logic.

In general, the trend for increasing the complexity of the client code takes place in all state-of-the-art solutions, and the merkle tree is only the first of such interesting algorithms. Other schemes such as ring signatures, zkSNARKs, state channels, etc. also require complex browser-side calculations, so Smartz is preparing to implement large and complex JS components in constructor interfaces on the client side. And we recommend that you also get involved in this process.

Conclusion

The main ‘advantage’ of traditional list-based token distribution is that this scheme allows you to send tokens even to those users who do not want to receive them. In addition, there are such damn features that involve sending so few tokens that the exchanges do not even allow transactions to be made with them and users are forced to put up with the presence of such ‘leftovers’ on their addresses, as they cannot get rid of them.

The concept of an airdrop, when a company distributes part of its system tokens to the community, is of course extremely important for the projects’ lifecycle. But we believe that this solution is not all user-friendly and is wholly inefficient. Moreover, DApps tend to implement the ‘pull’ instead of the ‘push’ concept; this means that it is the network users who are the initiators and supervisors of business processes, so the approaches when someone centrally imposes something on tens of thousands of users are gradually fading into the past.

Let’s sum up brief advantages of the Merkle airdrop:

does not require additional gas costs;
does not require any support;
allows to work with really huge lists;
takes up a bit of storage space in the blockchain.

You can already try to deploy the contract of Merkle Airdrop using a specific template on Smartz platform. Beware to have the contract deployed MetaMask browser extension or Trust Wallet for mobile are required. For Solidity techies: the source code of this smart contract is available on GitHub, see the link below.