The Pain Of Making A Blockchain App From Scratch

After a 3–4 month development cycle, I finally transition from the “developer” state to the “maintainer” state. SnowApe, an on-chain weekly stock trading league, seems ready to release, so for now, my work is mostly done. During this time, before the inevitable horde of bugs comes crawling out, it’s okay to take a load off and think back on the process. What did I do? Why did I do it? How’d it go? Was it worth it? And so on…

This isn’t my very first go at Web3 development or working with smart contracts, but it is the first cohesive, polished app I have ever published that utilizes Web3. I aim to write through the lens of a fairly experienced consumer-facing Web 2.0 application developer so that others getting into this space can understand the programmatic and UX-related constraints crypto brings. Readers may use this as a development guide, if they wish, or use it to help themselves make an informed decision as to whether Web3 is right for their project. Either way, I’ll be taking a chronological approach through this process, and will give fair warning before going on any topical detours.

Why did I build on Web3?

Too many builders seem to forget to consider whether or not their project is right for Web3; usually, the answer is no. The plethora of worthless tokens floating around is a testament to all the teams that are either aiming to solve a nonexistent problem or that simply wish to leverage the technology for easier marketing and fundraising. I would recommend that any reader really examine why their project needs one or all of its components to be on-chain before going on-chain.

My team did not have crypto in mind while planning the creation of our application. We were interested in stock trading, and more specifically real-money gaming. Neither of these concepts was intrinsically Web3-related, so we aimed to go ahead using fiat. Deposit cash, play some stock trading games, and cash out! Pretty simple concept, right? Yes, actually, except for the cash part. As some readers may already know, real-money gaming is riddled with tight regulations and fees in American regions, as well as most other regions around the world.

~Begin Tangent On Why Real-Money Gaming is Hard~

Here is the happy path a team will take to bring a real-money gaming app to market.

Purchase licenses for the regions the team wants to operate in.
Integrate with Plaid/Stripe to accept user deposits.
Facilitate user withdrawals.
Build the gaming app.

Once they’ve purchased their licenses, they can verify their business with payment processors, which will then provide them with the infrastructure needed to handle USA bank transfers. All set! Users can deposit cash into the gaming platform for free, and the team can deliver the gaming experience they’ve envisioned. This is a great go-to-market strategy for a bankrolled team with enough certainty in their product to risk the millions of dollars they may spend on licensing fees.

For a team not able to put down a small fortune as collateral, not so much. Such a team will likely not have access to the gaming markets of Pennsylvania, New York, and so on, due to lack of funds and legal acumen. They may then try to operate outside of the United States/Europe digitally, and accept players via a VPN or some other clever method. Great! Unfortunately, though, since this team is now operating as a foreign business, they won’t be able to integrate bank deposits and withdrawals with the large payment processors. They may then attempt to simply accept card payments instead of ACH, only to find that their users will be charged ~3% each way for deposits and withdrawals. The prospect of having to beat a 6% charge just to break even leaves the vast majority of ideas dead in the water. Enter crypto.
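To put a number on that break-even hurdle, here is a minimal sketch of the arithmetic. It assumes each fee is deducted from the amount being transferred; the function name `break_even_return` is made up for illustration. Under that assumption, two 3% fees compound to slightly more than 6%.

```python
def break_even_return(fee_in: float, fee_out: float) -> float:
    """Fractional gain a player needs on their balance just to get their money back.

    Assumption (not from any processor's docs): each fee is taken out of the
    amount being moved, so a deposit of D arrives as D * (1 - fee_in).
    """
    kept = (1 - fee_in) * (1 - fee_out)  # share of each dollar that survives both fees
    return 1 / kept - 1


print(f"{break_even_return(0.03, 0.03):.2%}")  # prints 6.28%
```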

~End Tangent~

I found crypto to be a way around the most difficult bureaucratic roadblocks my team faced in bringing our idea to market. Rather than spending time researching a hostile legal landscape and applying to payment processors that would hopefully accept our business model and provide us with our needed payments infrastructure, we simply integrated onto an unregulated asset class and payments network. This took our project from infeasible to innovative and allowed us to focus on the most important part: the product.

The Smart Contract


In my previous independent projects, code had rarely been the first thing I would work on. Generally, after bouncing the idea around, a creative type may sketch a few mocks, or draft a deck. Backend code would usually be the very last thing to ever get touched, and often the idea would be abandoned before we even got there. I find blockchain development to be quite the opposite; as soon as I come up with an idea, the very first thing I think to do is sketch up a contract. “I could set up an on-chain watch collector’s exchange… let’s see how the contract would work… What about an on-chain ETF of assets? Let me write up a contract…”

The code acts as the writeup, but I suppose that’s the nature of smart contracts. Yes, they are code to be executed, but they are also extremely effective at describing the interactions disparate parties will have within an application or network, as are real contracts. In many applications, they are the lynchpin of the system, and consequently, feel like blueprints I can build the rest of that system upon.

This opinion may also be partially based on the severe unfamiliarity other conventional startup roles (UX designers, product owners, etc.) have with Web3. It’s still very much a developer-first space.

~Begin Tangent On Getting Started in Smart Contract Development~

Anyone unfamiliar with smart contract development should check out CryptoZombies, which I have found to be one of the most effective programming language guides I’ve ever interacted with. So much so, in fact, that I feel smart contract development is the most accessible form of programming, period.

Once understood, smart contracts are actually quite simple compared to comparable backend applications, yet have a profound impact thanks to distributed systems executing their instructions on a massive scale. Writing them is an excellent way to empower a new developer, especially when compared to the kinds of starter projects a student can complete in conventional development environments, and it quickly introduces some advanced engineering concepts like transactions, idempotence, and basic cryptography, which under ordinary circumstances a budding engineer would not touch for a long time.

After understanding the gist of writing a smart contract, the next most important thing is understanding the current tooling. The difference between a zero-tooling experience and a fully-tooled experience is like night and day, and I recommend that new developers familiarize themselves with the following resources:

Find some good development environments: I particularly like Remix and Hardhat, but there’s more out there that I haven’t used much like Truffle. These are the services developers leverage to write and compile their smart contracts.
Use a simulated blockchain, or find a testnet that you like. I’ve used Ganache before for local testing, and Hardhat offers a similar service via the Hardhat Network. They’re both local Ethereum networks, so they live only on your machine. Testnets, on the other hand, are actual blockchains, but they’re free! The testnet tokens have no real value, so you can deploy and run transactions as much as you like. There are still computers all over the world running deployed code, but it is a voluntarily-maintained network, not constrained by token scarcity. Here is a more concise overview of Ethereum testnets.
Understand Etherscan, please!!! For any unknowing parties, Etherscan is a block explorer — basically the Google of blockchain. All transactions, smart contracts, and user wallets are visible there, and a user can search up information on basically every single thing that’s occurred on-chain. It’s immensely beneficial to know the ropes of a block explorer, including deciphering blockchain transactions, verifying a smart contract, and interacting with a smart contract through the UI. Ultra bonus points: Tenderly, similar to Etherscan but on steroids.

For additional reading, this is a great start-to-finish guide that expands upon the above bullets.

I will acknowledge that I am only discussing Solidity development, which is a language made for interacting with the Ethereum Virtual Machine (EVM), or the execution environment Ethereum uses to run code. The list of EVM-compatible chains includes Ethereum itself, Binance Smart Chain, Polygon, Avalanche, and Fantom, but does not include Solana, Terra (lol), EOS, and others. Other projects like Polkadot are aiming to implement interoperable execution environments, which for now seems like support for both EVM and Wasm (WebAssembly). I haven’t worked with non-EVM execution environments, so I can’t authoritatively point to any good resources, but I would encourage any curious reader to do their own research on alternative development environments like Wasm and LLVM (Solana), as well as interoperability projects like Polkadot and Cosmos.

~End Tangent~

So I started out on Solidity, before even thinking of what network I would deploy to, and drafted a contract to solidify my thoughts on user fees, design, etc. I chopped through a few iterations, and finally landed on an implementation I thought would be a good fit for the idea. I deployed and verified the contract, tested out the methods, hashed out bugs, and then finally felt happy with the bones. With enough documentation I was able to walk non-technical team members through the contract, and our group had a rough collective idea of the work set out ahead of us.

To be quite honest, blockchains feel like some of the easiest backend providers I’ve ever worked with. Almost no configuration is required; once a smart contract is deployed, you can immediately interact with it through a block explorer UI, or through a couple lines of code. I suppose when everything is public, a lot of red tape goes away.

Authentication

Authentication, from a developer and user perspective, is one of the principal pain points in crypto. This is due to the fact that a user may be interacting with an application that runs on any number of chains, and that same user may themselves be using any number of RPC providers (services that interact with nodes to run smart contract code, like MetaMask) with any given network configuration. When these services were first coming out, they were made for Ethereum Mainnet, and perhaps during that initial period, things were simpler (I wasn’t a Web3 dev at the time!). But now that gas fees are unjustifiable for the average user, sidechains, as well as L2s, are achieving widespread adoption, and a tricky UX game of the user switching the chain for the given app they are using has begun. In the long run, this problem may be phased out by the abstraction blockchain interoperability may provide, e.g. Cosmos, Polkadot, and Chainlink, but for now, it is a present challenge for builders and users.

This was the most difficult aspect of blockchain development I ran into, primarily because there is still no standard solution. Various applications decide to interact with certain RPC providers and do so in their own unique ways. Additionally, the authentication needs of the application can vary. Some apps will be 100% on-chain and require nothing more than a wallet provider. Others will use both traditional Web2 backend services along with on-chain storage. To make things worse, there are no major providers like Google or Amazon that provide sign-in functionality to Externally Owned Accounts (wallets) — no “Sign In With Ethereum”. This part took some elbow grease.

The flow I went with is depicted below: Connect → Connect to Network → Sign In.

The above user flow entails the user connecting to their RPC provider, switching to the blockchain network the app uses (Ethereum, Polygon, Avalanche, etc.), and finally signing in with their wallet. Each button press will prompt a flow by their wallet provider; there is no step that requires active thought by the user, who just presses ‘Okay.’

The first step is the most common, which is the ‘Connect Wallet’ step. After pressing this button, the user’s provider confirms with them that they want to allow the app to connect to their wallet. Approving this step will advance the user to the next step.

The second step is less common but perhaps will grow more popular in time. This is the ‘Connect to Network’ step. Not every application wants to run on every chain; for example, I wouldn’t want my users spreading out over several chains, since it’ll be more fun if they’re all playing together in the same league!

Given this, it’s important for some applications to ensure that their users are on the correct chain so that they are able to interact with the intended smart contract. Hence, when a user clicks on the ‘Connect to Network’ button, their RPC provider prompts them to switch from their current network to the network the application uses, and if their RPC provider doesn’t have that network it prompts the user to add that network to the provider. For devs: here is a thread discussing how to switch chains, add a chain if it doesn’t exist, and deal with the error codes that occur.

Technically, the ‘Connect to Network’ button is not actually necessary, nor is the button I discuss next. The application could simply wait for the ‘Connect Wallet’ step to complete, and then automatically request the user to switch chains, if not already on the correct one. However, I don’t like that user flow very much. I prefer for the user to know each step that is being completed, so as to prevent them from rejecting a step out of confusion or distrust. Therefore, I give a button for this step, as well as the next.

Now the user has a wallet that is connected to the application they are using, and has switched to the proper chain. For many apps, this is enough, and the authentication flow ends there. However, I did not intend to store all user data on-chain, due to speed and cost, so my application additionally requires the user to sign into a Web 2.0 authentication client. This is where the ‘Sign In’ button comes into play.

Using a wallet to authenticate with Web 2.0 may initially seem counterintuitive, but as it turns out EOAs make authentication much easier than conventional methods. Thanks to ECDSA, the cryptographic primitive that underpins Ethereum wallets, you can sign into the part of the app that isn’t crypto with a single click. The wallet is able to prove that its user is actually its owner, which is enough for a Web 2.0 backend to generate a token. After signing some message the application generates, the user is then authenticated into whatever Web 2.0 components the application uses, and can proceed as such.

~Begin Tangent On the Technicals of Signing~

In order to understand this step from a technical perspective, it’s good to know what cryptographic signing is, as well as what it empowers developers to do. The most crucial premise is that the signing algorithms used in Web3 are based on elliptic curves. Elliptic curves allow points on a curve to be used sort of like regular numbers: they can be added to each other, they can be added to themselves, they can be multiplied by a scalar (x * P means adding P to itself x-1 times, so that P appears x times in the sum), for every point P there is a negative point -P where P + -P = 0, and there is a zero point where P + 0 = P. So as not to go too far into the subject, here is a good dive into Bitcoin’s elliptic curve group; cryptography is the ultimate rabbit hole and I don’t intend to go too deep. The most important part of all this is that I can add a point to itself like this: Q = 2P; in this case, 2 can be considered the scalar x, and I am adding the point P to itself x-1 times to get Q. Thanks to the nature of the elliptic curve group we’re working with, it’s easy to find out what Q is if I have P and x, but it’s extremely hard to find out what x is if I just have Q and P.

This is the basis for cryptography used in crypto: I can apply a secret scalar to a point to create a new point. I can share both the old point and the new point with anybody, and they won’t be able to figure out how many times I had to add the original point to itself to get the new point, or in other words they won’t be able to figure out the value of the secret scalar x.
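To make the point arithmetic concrete, here is a minimal sketch on a textbook-sized curve, y^2 = x^3 + 2x + 2 over the integers mod 17. This is not the curve Ethereum or Bitcoin use, and the helper names (`add`, `mul`, `brute_force_scalar`) are made up for illustration. The group has only 19 points, so the secret scalar can be recovered by trying every value; on a real curve that same search would take longer than the age of the universe. It needs Python 3.8+ for the modular inverse via `pow`.

```python
# Toy curve y^2 = x^3 + 2x + 2 over the integers mod 17 (far too small to be secure).
P_MOD, A = 17, 2
G = (5, 1)  # base point; adding it to itself cycles through a group of 19 points


def add(p, q):
    """Add two curve points. None stands for the zero point."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # P + (-P) = 0
    if p == q:  # doubling uses the tangent slope
        slope = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:       # otherwise use the slope of the line through both points
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def mul(x, p):
    """Compute x * P with double-and-add."""
    result = None
    while x:
        if x & 1:
            result = add(result, p)
        p = add(p, p)
        x >>= 1
    return result


def brute_force_scalar(p, q):
    """Recover x from P and Q = x * P by trying every x. Only feasible on a toy group."""
    acc, x = None, 0
    while acc != q:
        acc, x = add(acc, p), x + 1
    return x


secret = 13
Q = mul(secret, G)                  # easy direction: scalar and point in, point out
print(brute_force_scalar(G, Q))     # hard direction, cheap here only because the group is tiny
```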

A relatively simple signature scheme that’s good for explaining how this is used is the BLS signature. It leverages something called a bilinear pairing:

e(a*P, b*Q) = e(b*P, a*Q)

This pairing represents a function e in which you are passing two elliptic points P and Q, and two scalars a and b. You can switch the order of the scalars a and b, and the result of the function stays the same. What this property enables the signer to do is then use scalars x and 1 instead of a and b, and just switch a single x around in the function (the 1 doesn’t matter):

e(x*P, Q) = e(P, x*Q)

Okay, a couple of things might be standing out right now: x is a scalar like I was just talking about, so x * P is adding P to itself x-1 times to get some new point. I’ll call that new point PubKey, because in practice it is the public key of the signature scheme. So what’s Q? Technically, Q could be any point, but that’s not very useful. What makes all this useful is that I can convert any message I am signing (“Hello world! This is my secret message.”) into a point on the curve, and call that point Q. The process of turning an abstract thing like a “hello world” message into a point on an elliptic curve is called a hash-to-curve function. I won’t go into detail about that, but assume that I am able to turn any secret message I want into an elliptic point Q. Once I have Q, I can then sign the point by adding it to itself x-1 times, or in other words calculating a new point x*Q.

Now when I look back at the above bilinear function, I can see that I have all the variables to satisfy the equation. I can then give anybody the points x*P (my public key), x*Q, P, and Q, and they will also be able to verify that the equation is satisfied. But here’s the catch: I didn’t need to give that person the scalar x to prove all this. The unknowing party was able to prove that the bilinear equation was satisfied with just the points I gave them, but only I can create the signature x*Q since only I have x.
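The check a verifier performs can be sketched with a toy stand-in for the pairing. Loud assumption: this is not a real pairing and not real BLS. Each "point" k*P is stored as the plain number k mod R, which exposes the private key to anyone and makes the scheme completely insecure. The only thing it preserves is the algebra: the verifier compares e(PubKey, Q) with e(P, signature). All names and constants here are made up for illustration.

```python
import hashlib

R = 1019          # prime order of the toy "curve" group
P_FIELD = 2039    # prime equal to 2*R + 1, so it contains a subgroup of order R
GT = 4            # generator of that order-R subgroup
P = 1             # base "point"; the point k*P is stored as the number k mod R (insecure!)


def pairing(a_point, b_point):
    """Toy bilinear map: e(a*P, b*P) = GT^(a*b). A real pairing never exposes a or b."""
    return pow(GT, (a_point * b_point) % R, P_FIELD)


def hash_to_point(message: bytes) -> int:
    """Stand-in for hash-to-curve: map a message to a nonzero group element Q."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return 1 + digest % (R - 1)


def sign(x, message):
    """Signature = x * Q, where Q is the hashed message."""
    return (x * hash_to_point(message)) % R


def verify(pub_key, message, signature):
    """Accept when e(x*P, Q) == e(P, x*Q)."""
    q = hash_to_point(message)
    return pairing(pub_key, q) == pairing(P, signature)


x = 77                              # private scalar
pub_key = (x * P) % R               # public key x*P
msg = b"Hello world! This is my secret message."
sig = sign(x, msg)
print(verify(pub_key, msg, sig))    # the equation balances for a genuine signature
```

A tampered signature or a different public key changes one side of the equation but not the other, so the comparison fails.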

To summarize: I know a secret number x that only I can sign messages with, and anybody can prove from looking at the signatures I create that I really do know x, without them actually finding out what x is. This is closely related to what is called a zero-knowledge proof, and it might be a bit tricky to understand at first, but if it’s too difficult to intuitively understand then just remember this: it allows us to prove to anyone that we are the owner of a private key without revealing the private key. There’s another great article on BLS signatures that I’ve partially drawn from, with some handy graphs to help with the osmosis.

BLS signatures are not actually what Ethereum or Bitcoin use; they use ECDSA. However, I have a stronger understanding of BLS than of any other signature scheme, and feel most comfortable using it to explain how signing works. They all basically do the same thing, which is to verify ownership of a private key without revealing it. I encourage anyone who is curious about ECDSA, Schnorr signatures, or other signing methods to do their own research.

~End Tangent~

For readers that did not read the above tangent, just assume that wallet signing allows you to prove to any party that you own the private key stored in your wallet, without revealing the private key. The programmatic flow for Web 2.0 authentication will then go as follows:

The Web 2.0 backend of the app generates some random message to sign. The example message shown in the most recent image is “Your SnowApe code is 9uc2w. Happy playing! ;)”
The user’s RPC provider prompts them to sign that message with their private key. They do this, and send a verification request containing their public key, the original message, and the signed message to the Web 2.0 backend.
The Web 2.0 backend uses a public Ethereum library to verify that the public key matches the signature provided.
The Web 2.0 backend now assumes that this party really is the owner of the private key it claims to own. It can then provide authentication credentials to the user in whatever fashion it so desires.

Take note when using this method for authentication: do not reuse the same message over and over. The message should be random and should be deleted after its first use. Cryptographic signatures are great, but they can be intercepted. If a developer re-uses the same messages, it allows someone else to set up a fake site impersonating that developer, which could gather signatures from users and then use their authenticated accounts on the real site.
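The backend side of this flow, including the one-time-message rule, can be sketched roughly as follows. This is a minimal in-memory sketch, not the code SnowApe runs: `ChallengeStore`, the message wording, and the five-minute expiry are all assumptions. The actual signature check is passed in as `verify_signature`, a hypothetical callable that a real backend would implement with an Ethereum library's signature recovery.

```python
import secrets
import time


class ChallengeStore:
    """Issues one-time sign-in messages and consumes each one on first use."""

    def __init__(self, ttl_seconds=300):
        self._pending = {}       # wallet address -> (message, expiry time)
        self._ttl = ttl_seconds  # assumed lifetime of an unanswered challenge

    def issue(self, address):
        """Create a fresh random message for this wallet to sign."""
        message = f"Your sign-in code is {secrets.token_hex(16)}"
        self._pending[address] = (message, time.monotonic() + self._ttl)
        return message

    def redeem(self, address, signature, verify_signature):
        """Return True at most once per issued message, and only for a valid signature.

        verify_signature(address, message, signature) is a hypothetical hook;
        it stands in for whatever library call checks the wallet signature.
        """
        entry = self._pending.pop(address, None)  # deleted whether or not the check passes
        if entry is None:
            return False
        message, expires_at = entry
        if time.monotonic() > expires_at:
            return False
        return bool(verify_signature(address, message, signature))
```

Because the message is removed from the store the moment it is looked up, a signature captured by an impersonating site cannot be replayed against the same challenge later.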

That being said, as long as EOA authentication is implemented correctly it is an excellent way to handle authentication, especially when compared to legacy systems. Traditional authentication requires the user to create a password and send it to the custodian, who then grants the user authenticated credentials. This has numerous issues: the transport layer that delivers the user password to the custodian may be insecure, so the password may be leaked. Additionally, the custodian may become compromised and leak the user password, which could lead to more leaks in other services.

Finally, the user may use bad passwords, or may have difficulty maintaining their own set of passwords. By using cryptographic signatures, the user doesn’t have to send a password anywhere, and instead sends a signature which doesn’t compromise any private information. Additionally, since the custodian doesn’t store any passwords, there is less at stake in the case of a hack. The very biggest benefit, though, is that the user no longer needs to manage a set of passwords. All authentication is instead handled by a wallet, of which knowledge of just one secret is needed: the private key.

Authentication is by far the biggest section of this writeup, and justifiably so. It is the feature I learned the most from, and is the feature which presented the most interesting challenges with still unproven solutions. That being said, I expect this space to change drastically in the future, and also expect the solution to Web3 authentication presented above to eventually be obsolete.

Web 2.0 Stuff… Atomicity and Rankings

This section is mostly about making a trading game, so don’t expect much crypto talk here. For any readers that are still interested, read away!

Once I had set up a testnet payments system for users to join a league by depositing Matic, it was time for them to actually play some kind of stock game. As I built the trading interface, there were a few major challenges I faced. I’ll go over some of the technical complexities related to stock trading game design I encountered, as well as game design in general.

To start, I’ll discuss one challenge I did not face in building this application: the order book. Since it’s just a simulated stock game that uses data feeds rather than real asset holders, there’s no need to coordinate buy orders with sell orders. For an idea of how to implement a real order book, look here.

There are two primary concerns with designing a “real-money” trading game: atomicity and rankings.

Atomicity is a concept that comes from the term atomic: “of or forming a single irreducible unit or component in a larger system.” In relation to computation, it means that if a unit of logic is being executed, it is guaranteed that if any part of the logical unit successfully executes, the entire logical unit successfully executes. Conversely, if any part of the logical unit fails to execute, the entire unit fails to execute. Why is this good? And why is this important for my application? In short, atomicity is good because it prevents unpredictable behavior in an application. One of the most useful aspects of blockchain is that transactions are atomic. If a transaction fails, none of the transaction logic is executed; it’s like the transaction never happened. If the transaction succeeds, all of its logic is guaranteed to have been executed.
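The same all-or-nothing behavior can be seen off-chain with an ordinary database transaction. Here is a minimal sketch using Python's built-in sqlite3 module; the table layout and the `buy` function are made up for illustration and are not SnowApe's schema. A trade that would overdraw the player's cash fails on the second statement, and the trade row written by the first statement disappears with it.

```python
import sqlite3


def open_db():
    """Create an in-memory database with one player holding 1000 units of cash."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE portfolio ("
        "player TEXT PRIMARY KEY, cash INTEGER NOT NULL CHECK (cash >= 0))"
    )
    conn.execute("CREATE TABLE trades (player TEXT, symbol TEXT, qty INTEGER, cost INTEGER)")
    conn.execute("INSERT INTO portfolio VALUES ('alice', 1000)")
    conn.commit()
    return conn


def buy(conn, player, symbol, qty, price):
    """Record a trade and debit cash as one unit: both happen or neither does."""
    cost = qty * price
    try:
        with conn:  # commits if the block finishes, rolls back if it raises
            conn.execute(
                "INSERT INTO trades VALUES (?, ?, ?, ?)", (player, symbol, qty, cost)
            )
            conn.execute(
                "UPDATE portfolio SET cash = cash - ? WHERE player = ?", (cost, player)
            )
        return True
    except sqlite3.IntegrityError:  # the CHECK (cash >= 0) constraint rejected the debit
        return False
```

Calling `buy(conn, "alice", "TSLA", 10, 500)` on a fresh database returns False and leaves both tables exactly as they were.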

The concept of atomicity exists and is important outside of the blockchain, especially as distributed databases become ever more popular. In a distributed database, storage and execution are split between many different computers, where a single computer in the system is not guaranteed to be operational 24/7. Ethereum is a type of distributed database, insofar as it facilitates distributed data storage and data retrieval; there are other systems that do this as well, like Cassandra and MongoDB. These databases can offer atomicity with certain kinds of operations, but not all.

Within my own application, I had to ensure that both a player’s portfolio and trading history are updated in an atomic manner: that is, if they successfully make a trade, it is guaranteed that their history will be updated with that trade. This seems obvious, but the only way to achieve it in most distributed databases is to ensure that the trade history of a portfolio is contained within the portfolio object. If the portfolio and trade history are contained in separate objects, they may be stored on separate computers — this is the nature of a distributed database. In this situation, a partial system failure (partition) may occur that causes only one of the two objects to successfully update when a player makes a trade, creating a discrepancy between history and actual portfolio value. This would severely confuse a player, and detract from their gaming experience. As of version 4.2, MongoDB supports distributed transactions (multi-document atomicity) which allows multiple objects to be modified in an atomic manner. However, most distributed databases do not yet have this kind of functionality, so it is up to the developer to actively research their database solution. While atomicity is guaranteed on blockchains, it is a property carefully managed in Web 2.0.
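The "keep the history inside the portfolio object" approach can be sketched like this. A plain dict stands in for the document database, and the field names and `apply_trade` are assumptions for illustration. The idea is that the trade produces one updated document containing cash, holdings, and history, so a single-document write (which MongoDB, for example, applies atomically) covers all three.

```python
import copy


def apply_trade(portfolio, symbol, qty, price):
    """Return a new portfolio document with cash, holdings, and history updated together.

    The input document is never modified, so a rejected trade leaves no partial changes.
    """
    cost = qty * price
    if cost > portfolio["cash"]:
        raise ValueError("insufficient cash")
    updated = copy.deepcopy(portfolio)
    updated["cash"] -= cost
    updated["holdings"][symbol] = updated["holdings"].get(symbol, 0) + qty
    updated["history"].append({"symbol": symbol, "qty": qty, "price": price})
    return updated


# A dict keyed by player stands in for the document collection (an assumption).
store = {"alice": {"cash": 1000, "holdings": {}, "history": []}}

# One write replaces the whole document, so history and balance cannot drift apart.
store["alice"] = apply_trade(store["alice"], "AAPL", 2, 100)
```

The tradeoff is document size: a very active player's history grows inside the portfolio document, which is one reason multi-document transactions are attractive where the database offers them.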

The second challenge, rankings, can be aptly covered by this Stack Overflow answer. Effectively, ranking every player can be difficult at scale, because it requires some computer to sort an entire dataset of players. If a game has something like ten million players, having realtime rankings would mean sorting ten million players every few seconds. Doing this without any tricks would be costly. One of the easiest solutions, which is the one I am currently implementing, is to have tiered sets of players. The top 50 players get sorted every minute or so, the next 500 get sorted every ten minutes, the next 5000 players every 100 minutes, and so on… This allows the work of sorting players, while still expensive, to be much more scalable than sorting the entire set of players every time. The top players get updated quickly, making for a positive gaming experience, while lower-ranked players are getting sorted slowly but probably not checking their stats as frequently. Other solutions to this problem include implementing a data structure called a sorted set, which can track rankings with lower time complexity. There is no right answer to ranking, and though it is a common problem in gaming it is a fairly tricky one.
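A rough sketch of the tiered idea, under my own assumptions rather than as the exact production logic: the leaderboard is a list of player ids in rank order, and whenever a tier is due, the whole prefix of the board up to that tier's cutoff is re-sorted so that players can move between tiers. The names `TIERS` and `refresh` are made up, and only the three tiers mentioned above are modeled.

```python
# (cumulative rank cutoff, refresh period in minutes): top 50, next 500, next 5000.
TIERS = [(50, 1), (550, 10), (5550, 100)]


def refresh(board, scores, minute):
    """Re-sort only the part of the leaderboard that is due at this minute.

    board  -- list of player ids in current rank order
    scores -- dict mapping player id to current score
    Assumption: a due tier re-sorts everything from rank 1 through its cutoff,
    so a fast-rising player is promoted when their tier's turn comes.
    """
    due = [cutoff for cutoff, period in TIERS if minute % period == 0]
    if not due:
        return board
    cutoff = max(due)
    head = sorted(board[:cutoff], key=lambda player: scores[player], reverse=True)
    return head + board[cutoff:]  # players past the cutoff keep their stale order
```

With 600 players, a player sitting at rank 101 who suddenly takes the highest score stays put at minute 1 (only the top 50 are sorted) and jumps to first place at minute 10, when the top 550 are sorted.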

FIN.

Thanks for reading! I found the experience of building a fleshed-out blockchain app to be quite rewarding, and hope others can use this writeup to help them on their journey towards making a useful Web3 application.

🌐 Website | 🐦 Twitter | 🎮 Discord

This article was first published here.