Why Developing for the Blockchain is Hard — Part 1: Posting Transactions

We at Alacris.io want application developers to be able to spend their time focusing on the high-level design of their applications, in terms of economic interactions between multiple parties. We want to save them from having to deal with the complexity of issues like posting a transaction.

Wait, some will say: Posting a transaction is complex? Isn’t there already a function to call for that? Well, yes and no. There is already a function that works well enough for the simple case of manual payments; but that function fails badly and is wholly unsuitable for use in the more elaborate cases that matter to a distributed application backed by a smart contract.

Existing Functions to Post Transactions are Lacking

Sure, you will find functions to create and post transactions in the Ethereum JSON-RPC API, or the equivalent libraries for Go, JavaScript, Python, Rust or your favorite language. But calling one of these functions does not and cannot guarantee that the transaction will go through. What if your application crashes in the middle of the interaction? What if your Ethereum node client crashes? What if you are the victim of a network outage? What if you fail to set a high enough gas limit or a high enough gas price? What if your transaction loses the race to another transaction that you posted, or to a transaction posted by someone else using the same contract? What if your transaction makes it into a block, but that block is later dropped in favor of a heavier competing chain? What if any of the above happens, not by an unfortunate accident, but as a deliberate attack by an enemy?

In all these cases, the simple answer is “after ten, thirty or sixty minutes, check whether the transaction went through, and if not try again.” And as long as you’re just a human sending a payment and you’re in no great hurry, that’s probably a good enough answer. All you need is for your “wallet” to warn you that the transaction failed, and to offer to retry with updated parameters at the press of a button. The merchant or trading partner on the other side will usually wait patiently for the payment to go through before they do anything, anyway. And so the existing function is good enough to implement a wallet.

Now, to write an application that is not merely a payment wallet, you need more than that. In a typical “smart contract”, one or more parties each post a large bond that they stand to lose should they fail to post the transactions that they contractually promised to post within a tight deadline. Is your “post transaction” function good enough to use in such a smart contract? If it works 99% of the time, that means that one time in every hundred, someone will lose their large bond. Is that good enough for you? If the function works 99.9999% of the time, it will still fail once in a million calls. If it is to be used a billion times a year, that is still about a thousand failures a year, each leaving a discontented user who has lost a significant amount of money. And yet, 99.9999% (“six nines”) reliability is already considered hard to achieve in practice. Can you actually make this function that reliable, or even better? How many nines of reliability can you afford to have? And how many can you not afford not to have? When your function eventually fails because the world conspires against you, then what? Remember that in writing the function, you are not just fighting accidental odds of failure: you’re fighting deliberate attempts to thwart you by bad actors who will sign a contract with you just so they can prevent you from posting your transaction and collect part of your bond as reparations. Even when they cannot collect reparations from making you fail, enemies and vandals may want to see you lose anyway; and sophisticated gangsters might hold your bond hostage by threatening to make you fail, then blackmail you for ransom. Will your function remain as reliable as you need it to be in that adversarial environment?
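To put numbers on the reliability figures above, here is a small back-of-the-envelope calculation in Python. The function name is made up for illustration, and the calculation assumes that failures are independent of one another, which is exactly what a deliberate attacker will not respect; treat it as a best case rather than a guarantee.

```python
from fractions import Fraction

def expected_failures(success_rate: Fraction, calls: int) -> Fraction:
    """Expected number of failed calls, assuming independent failures."""
    return (1 - success_rate) * calls

# Exact fractions avoid floating-point rounding on values such as 0.999999.
two_nines = Fraction(99, 100)
six_nines = Fraction(999_999, 1_000_000)

print(expected_failures(two_nines, 100))            # 1
print(expected_failures(six_nines, 1_000_000))      # 1
print(expected_failures(six_nines, 1_000_000_000))  # 1000
```

Each of those expected failures is a lost bond, so the acceptable number depends on the size of the bond and the volume of calls, not on the percentage alone.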

What a Proper Post Transaction API would do

The existing “post transaction” function is not at all a functional API usable by distributed applications besides simple wallets. The entire purpose of a functional API is to make it possible to write automated programs that will call the API, let it do its magic, and not have to deal with the details below that level of abstraction. A proper “post transaction” function will ensure that the transaction WILL go through, despite all the potential issues within the control of the program: It will include a good algorithmic model to follow the progress of the transaction in the network. It will handle persistence of data so the application survives a crash. It will maintain multiple application replicas over multiple data centers so that the application cannot “just” be thwarted by an enemy doing a DDoS. If needed, it will route all relevant communications through virtual circuits and/or mix networks so that enemies cannot easily locate the contract-using machine and attack it. It will automatically recompute nonces to adjust to race conditions. It will use some informed model-based strategy to watch the network and dynamically adjust what gas price to use, so as to resist block-buying attacks by partaking in a suitable auction for transaction space (and block-buying attacks are not just theoretical: they have actually happened in the past, as in the case of the Fomo3D contract). All in all, a proper “post transaction” function will patiently nurse a pool of desired transactions until each of them is eventually posted and confirmed on the blockchain, once and only once, in a suitable order.
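Three of the duties listed above (persisting state before acting, recomputing the nonce after losing a race, and raising the gas price up to a budget) can be sketched in a few lines of Python. This is only an illustration under strong assumptions: SimulatedNode is a toy stand-in and not a real Ethereum client API, every name here is hypothetical, and the fixed one-eighth price bump replaces the model-based strategy that a real system would need. Replication, network anonymity and reorganization handling are left out entirely.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict

@dataclass
class PendingTx:
    intent_id: str          # application-level identifier, stable across retries
    payload: str
    nonce: int
    gas_price: int
    confirmed: bool = False

class SimulatedNode:
    """Toy stand-in for a node client (hypothetical, not a real API)."""
    def __init__(self, next_nonce: int, min_gas_price: int):
        self.next_nonce = next_nonce
        self.min_gas_price = min_gas_price
        self.included = {}  # nonce -> intent_id

    def submit(self, tx: PendingTx) -> str:
        if tx.nonce != self.next_nonce:
            return "bad_nonce"
        if tx.gas_price < self.min_gas_price:
            return "underpriced"
        self.included[tx.nonce] = tx.intent_id
        self.next_nonce += 1
        return "included"

def save(tx: PendingTx, path: str) -> None:
    """Write to a temporary file, then rename, so a crash never leaves half a record."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(asdict(tx), f)
    os.replace(tmp, path)

def nurse(tx: PendingTx, node: SimulatedNode, path: str,
          max_gas_price: int, max_attempts: int = 20) -> bool:
    """Retry until included; return False when an alert should be raised instead."""
    for _ in range(max_attempts):
        save(tx, path)                    # persist the intent before acting on it
        status = node.submit(tx)
        if status == "included":
            tx.confirmed = True
            save(tx, path)
            return True
        if status == "bad_nonce":
            # A real system must first check that an earlier copy of this very
            # transaction was not already included, to keep "once and only once".
            tx.nonce = node.next_nonce
        elif status == "underpriced":
            bumped = tx.gas_price + max(1, tx.gas_price // 8)
            if bumped > max_gas_price:
                return False              # over budget: escalate rather than overpay
            tx.gas_price = bumped
    return False

node = SimulatedNode(next_nonce=7, min_gas_price=30)
tx = PendingTx("deposit-1", "0xabc", nonce=5, gas_price=20)
path = os.path.join(tempfile.mkdtemp(), "deposit-1.json")
print(nurse(tx, node, path, max_gas_price=100), tx.nonce, tx.gas_price)
```

The saved file is what lets a restarted process pick the transaction up again after a crash; without it, the application would not even know that it still owes the network a transaction.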

What if some issues outside the control of the program still crop up and prevent the transaction from being posted? Then the “post transaction” function will send suitable alerts, before it is too late. The program calling the function has the option of handling some of these alerts; but ultimately, whatever alerts are left unhandled will reach some human watchman on call. It is the role of the Blockchain Operating System to take over at this point, as the application developer doesn’t even have access to the entire context necessary to handle these complex issues. Is the application deployed on a cell phone? The user to alert is the cell phone owner, using cell phone alerts. Is the application deployed on a redundant network spanning three data centers across different continents? The user to alert may be a professional Site Reliability Engineer on call. Users may be using a proprietary alert system that just did not exist when the application was written, and yet the application will indirectly make use of it, because alerts are part of the abstraction offered by the Operating System, rather than features of the application itself. What alert mechanism is used will depend on the deployment configuration for the application. It is not the responsibility of the application developer, who not only cannot know all the existing deployment situations, but by definition cannot predict the deployment situations that will only be invented in the future. Letting the application developer access the details of the deployment and handle the emergency situations would actually be a big security concern that goes against all good practices for software security.
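The idea that alerts belong to the deployment configuration rather than to the application can be illustrated with a small Python sketch. All names here (Alert, AlertRouter, the reason strings) are hypothetical and chosen for the example; the point is only that the application registers optional handlers, while the destination for unhandled alerts is injected from outside and is never visible to application code.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class Alert:
    tx_id: str
    reason: str                  # e.g. "gas_budget_exceeded" (illustrative value)
    blocks_until_deadline: int

AlertSink = Callable[[Alert], None]      # supplied by the deployment, not the app
AlertHandler = Callable[[Alert], bool]   # returns True if the alert was resolved

class AlertRouter:
    def __init__(self, sink: AlertSink):
        self._sink = sink
        self._handlers: Dict[str, AlertHandler] = {}

    def on(self, reason: str, handler: AlertHandler) -> None:
        """Let the application opt in to handling one kind of alert."""
        self._handlers[reason] = handler

    def raise_alert(self, alert: Alert) -> str:
        handler = self._handlers.get(alert.reason)
        if handler is not None and handler(alert):
            return "handled"
        self._sink(alert)        # whatever is left unhandled reaches a human
        return "escalated"

# Deployment A: a phone, where the sink would show a notification to the owner.
phone_notifications = []
router = AlertRouter(sink=phone_notifications.append)

# The application chooses to handle only one kind of alert itself.
router.on("nonce_race", lambda alert: True)

print(router.raise_alert(Alert("tx-1", "nonce_race", 40)))
print(router.raise_alert(Alert("tx-2", "gas_budget_exceeded", 12)))
print(len(phone_notifications))
```

Swapping the phone sink for a paging system, or for one that does not exist yet, requires no change to the application: only the argument passed to AlertRouter at deployment time changes.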

Now, what happens when an application fails? The human on call will have to take over from there. And unless you’re a large corporation or wealthy individual who can afford a software operations engineer as technical support, that human on call will be you. Then, a great functional API will make it easy for that human on call to assess the situation and take appropriate action:

plugging some cable back in before retrying,
retrying using a backup network connection,
calling a friend who will post the transaction on their behalf over a different network,
using different parameters with more gas or a higher gas price,
dropping the transaction and losing the bond because posting the transaction would cost more than the bond is worth after all,
still posting the transaction and losing more in gas than the transaction is worth, because the failure to post would reward the bad guys, or would cause a loss of trust and future business from other users,
using some alternate execution path or mitigation game foreseen in the smart contract,
accepting a settlement offer from the other party, rather than having a smart judge rule against you, etc.

Many failure and recovery modes are possible, and in general it is not the responsibility of the calling function to painfully handle recovery strategies that involve global state way beyond the scope of said function. Rather, the Blockchain Operating System shall automatically deduce these monitoring interactions from the regular structure of the calling program.

However, before he may take any action at all, the human must be able to understand what is going on, what transaction is being held up, and what value is at stake. It is therefore essential that the Operating System offer the human the ability to inspect the structure of the calling program. The User Interface should make it possible for the user to inspect the current state of the program being interrupted. This is the reason why the program usually cannot be written as “just” simple function calls in Python, Go, JavaScript or any such “blub” language: the program has to be written in a suitable Domain Specific Language (DSL), compiled by a metaprogram that has access to the entire program structure, rather than merely to local calls (though the program could be built from a metaprogram written in a blub language, in the style of Google TensorFlow, or from a metaprogram targeting the blub language). Writing distributed applications in a suitable DSL also means that these applications can be made modular: when written properly, common modules can be reused and composed into larger applications. By contrast, writing these applications by hand in a procedural language would cause all the abstractions to leak, making the resulting program unsuitable for modular composition.
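As a rough illustration of what “access to the entire program structure” buys, here is a sketch in the TensorFlow-like style mentioned above: a Python metaprogram that builds the interaction as data instead of executing it as function calls. The Step and Interaction types, their fields and the example values are assumptions made for this sketch, not the design of any actual DSL. Because the whole interaction is a value, a monitoring tool can tell the human which step is stuck, what is at stake and how long remains.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Step:
    actor: str
    action: str
    amount_wei: int
    deadline_block: int

@dataclass
class Interaction:
    name: str
    steps: List[Step]
    current: int = 0        # index of the step now awaiting a transaction

    def describe(self, current_block: int) -> str:
        """Summarize the interrupted state for the human on call."""
        step = self.steps[self.current]
        remaining = step.deadline_block - current_block
        return (f"{self.name}: step {self.current + 1} of {len(self.steps)}; "
                f"{step.actor} must {step.action}; "
                f"{step.amount_wei} wei at stake; "
                f"{remaining} blocks before the deadline")

escrow = Interaction(
    name="escrow",
    steps=[
        Step("buyer", "deposit the bond", 5_000, deadline_block=100),
        Step("seller", "confirm delivery", 5_000, deadline_block=200),
    ],
    current=1,
)
print(escrow.describe(current_block=150))
```

A program written as opaque function calls offers no such view: once a call is blocked, nothing outside it knows what was being attempted or what comes next.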

Towards a Blockchain Operating System

In conclusion, something as deceptively simple-looking as “posting a transaction” justifies that distributed applications should not be written using existing APIs, or using any API whatsoever expressed as function calls in any existing language. Rather, they should be written in a Domain Specific Language designed for the purpose of writing such applications, running on top of a “Blockchain Operating System” that will not only provide robust abstraction and enforce security practices, but also help humans handle emergency situations, depending on their specific deployment configuration. And posting a transaction is only one of a multitude of aspects of a distributed application, each of which has to be handled properly. At Alacris, we believe that these aspects are better implemented well once by infrastructure specialists and then shared between all applications, rather than implemented many times by each application developer stepping out of his regular role and possibly out of his skills. That is why we at Alacris.io are building a Blockchain Operating System that will manage these aspects for the application developers.

This article is the first in an ongoing series titled “Why Blockchain is Hard.” A new article will be released every three weeks. Keep an eye out for our next article “Computing a Proper Collateral.”