So, you have chosen the engine and implemented the first version of your blockchain. Now you need full-scale testing to check how stable the network remains as the code changes. A test network (or testnet), the most complete copy of the main network, lets potential users and validators prepare their services in advance for real conditions.
Without a testnet, you will not be able to check the quality of the code you have developed, and users will not be able to try your services in “lab” conditions.
I have written a lot about blockchain testing (read here and here), so now I suggest focusing on some practical issues. When the blockchain is working in the main network, you may face many problems, and it is better to solve them in advance. Here are some examples of such technical issues:
The list can be endless. This is only the tip of the iceberg: there are still economic and organizational problems. You can avoid a large part of them through a rational architecture and the right choice of algorithms, but it is better to take a comprehensive approach.
Since each line of code is executed on dozens of servers, even small changes can cause an avalanche of consequences and errors. Thus, regular testing, active monitoring and a proper workflow within the project team are essential.
Almost all modern blockchains support external monitoring for the timely detection of errors. This is a notification system that reports critical problems and collects important metrics from nodes. The goal of the metric collection system is to inform you of abrupt changes in one of the components and help you quickly identify the cause of network degradation.
It is desirable to monitor both standard system parameters (processor and hard disk load, amount of consumed RAM, network traffic), and specific metrics (consensus and processing logic).
If you make changes to network consensus, the important metrics will be related to its internal logic. For a finality gadget, the key number is how far the last finalized block lags behind the latest blocks of the chain. If the network p2p layer changes, then the number of active peers can be a useful metric (peers are equal participants in the network). For complex transactions, you can measure the time and resources needed to pack them into a block.
Unfortunately, there is no universal set of metrics. I can only advise adding metrics for the components you are changing (for example, transaction processing time or the lag of the last finalized block): they will let you see problems in those components and save time. This is not complicated, since all modern blockchain engines let you send data to monitoring servers (usually a Prometheus + Grafana stack). Adding a new metric usually takes only a couple of lines of code.
Do you have a product manager who is responsible for project development? When the team is developing consensus or a large system of contracts, it is quite difficult for them to track and evaluate the results. However, they can look at a graph, see how long a client waited for a transaction to be added to a block, and notice the latest improvements. Monitoring, metrics and colorful graphs are not a perfectionist's whim, but necessary tools for evaluating project development. Think about them in advance and make the work of your colleagues easier.
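To make the finality-lag and peer-count metrics mentioned above more concrete, here is a minimal sketch in Python. It only computes the lag and renders it in the Prometheus text exposition format. The `ChainHead` structure, the metric names and the node label are my own assumptions for illustration, not part of any particular engine; a real node would use its engine's built-in metrics endpoint.

```python
from dataclasses import dataclass


@dataclass
class ChainHead:
    """Snapshot of a node's view of the chain (hypothetical structure)."""
    best_block: int        # height of the latest block the node knows
    finalized_block: int   # height of the last finalized block


def finality_lag(head: ChainHead) -> int:
    """Number of blocks by which finalization trails the chain head."""
    return max(0, head.best_block - head.finalized_block)


def render_metrics(node: str, head: ChainHead, peers: int) -> str:
    """Render two gauges in the Prometheus text exposition format.

    The metric names are illustrative assumptions.
    """
    lines = [
        "# TYPE testnet_finality_lag_blocks gauge",
        f'testnet_finality_lag_blocks{{node="{node}"}} {finality_lag(head)}',
        "# TYPE testnet_active_peers gauge",
        f'testnet_active_peers{{node="{node}"}} {peers}',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    head = ChainHead(best_block=1205, finalized_block=1198)
    print(render_metrics("validator-1", head, peers=24), end="")
```

A sudden jump in the lag gauge on a Grafana panel is exactly the kind of abrupt change that points at the consensus component rather than at, say, the disk.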
In terms of testing, blockchains look like databases with master-master replication. For distributed databases, there are standard benchmark suites: for example, TPC or YCSB are used when testing new replication algorithms. There are no similar standards for blockchains yet.
On top of the other testing difficulties, the set of validators may change significantly while the network is processing transactions (new validators are added or old ones are excluded). As I have mentioned before, blockchains are networks that must remain stable during massive attacks and collusion, even if some validators are disconnected or captured by attackers.
To test network stability under load, you must run a test every time you make changes to the consensus or to the logic of the main transactions, in particular when new metadata has been added to a transaction or a new algorithm is used. Test repeatability is crucial: it’s not enough to launch a small test network, update the node code and run a script that sends transactions. If you don’t reproduce all the initial conditions, external factors may affect the test results. A good example is the disk cache. In the first test, data will be read from and written to the disk at one speed, and in the second one much faster, so you may mistake this increase for a blockchain performance improvement.
The selection of transaction sequences for the test is also important. For example, 1,000 transactions transferring tokens between 2 accounts are not the same as transactions transferring tokens between 1,000 different accounts. The latter put a different load on the memory, disk, and processor of the nodes. To get trustworthy results, give the system as little opportunity as possible to speed the test up artificially and hide its weak spots. To test payment transactions, it is reasonable to use N random accounts (new ones every time) and transactions that transfer random amounts of tokens; to test file storage, use files with random contents.
In my opinion, Test Driven Development, or TDD, is the best paradigm for blockchains and smart contracts. It involves continuously adding new test scenarios and accepting new code only if it passes all of them. Also, with TDD you will have to automate blockchain deployment from the very beginning, which in turn will save you a lot of time in the future.
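The workload described above (fresh random accounts, random amounts) can be sketched as follows. This is an illustration under my own assumptions: accounts are plain random hex strings standing in for real addresses, and `make_workload` is a hypothetical helper. A new seed is drawn on every run, so the accounts are new each time, but the seed is returned so that a suspicious run can be repeated exactly.

```python
import random
import secrets
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Transfer:
    sender: str
    receiver: str
    amount: int


def make_workload(n_accounts: int, n_transfers: int, max_amount: int,
                  seed: Optional[int] = None) -> Tuple[int, List[Transfer]]:
    """Generate random transfers between freshly generated accounts.

    Returns the seed together with the transfers; log the seed to be
    able to reproduce the same workload later.
    """
    if n_accounts < 2:
        raise ValueError("need at least two accounts to make a transfer")
    if seed is None:
        seed = secrets.randbits(64)  # new accounts on every run
    rng = random.Random(seed)

    # Random 128-bit hex strings stand in for real account addresses.
    accounts = [f"{rng.getrandbits(128):032x}" for _ in range(n_accounts)]

    transfers = []
    for _ in range(n_transfers):
        sender, receiver = rng.sample(accounts, 2)  # two distinct accounts
        transfers.append(Transfer(sender, receiver, rng.randint(1, max_amount)))
    return seed, transfers


if __name__ == "__main__":
    seed, txs = make_workload(n_accounts=1000, n_transfers=5, max_amount=10_000)
    print("seed:", seed)
    for tx in txs:
        print(tx)
```

Spreading transfers over many accounts keeps the node from serving the whole test out of a few hot entries in memory.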
Automated deployment of a blockchain from scratch allows developers and the project team:
Blockchain deployment is not very different from the scripts that run a database cluster. To some extent, it’s even easier thanks to a peer-to-peer network. Here are several scenarios for deploying a Polkadot test network with a boot node for massive network performance tests based on the Parity Substrate engine. We used the same scenarios for testing Haya, and they saved us a lot of time.
Scenarios for validator deployment may require the peer IDs of all validators to be set in advance. That way the validators never drop each other from their peer lists and, in case of network issues, wait until the connection is restored.
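As an illustration of pre-set peer IDs, the sketch below builds, for every validator, the list of all other validators as libp2p-style multiaddresses (`/ip4/<ip>/tcp/<port>/p2p/<peer id>`). The IPs, the peer IDs and the port 30333 are placeholders I made up. How the list is passed to the node depends on the engine; Substrate-based nodes have a `--reserved-nodes` option for this, but check your own engine's CLI.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Validator:
    name: str
    ip: str
    peer_id: str       # generated in advance, before the node first starts
    port: int = 30333  # assumed p2p port


def multiaddr(v: Validator) -> str:
    """Format a validator's address as a libp2p-style multiaddress."""
    return f"/ip4/{v.ip}/tcp/{v.port}/p2p/{v.peer_id}"


def reserved_peers(validators: List[Validator]) -> Dict[str, List[str]]:
    """For each validator, list the addresses of all the other validators."""
    return {
        v.name: [multiaddr(other) for other in validators if other.name != v.name]
        for v in validators
    }


if __name__ == "__main__":
    # Placeholder values, not real peer IDs.
    validators = [
        Validator("val-1", "10.0.0.1", "PEER_ID_1"),
        Validator("val-2", "10.0.0.2", "PEER_ID_2"),
        Validator("val-3", "10.0.0.3", "PEER_ID_3"),
    ]
    for name, peers in reserved_peers(validators).items():
        print(name, peers)
```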
You can start this procedure at the early stages of testing. The blockchain requires an initial account to distribute balances for tests, allocate resources or carry out configuration procedures. As a rule, validator accounts are used for these tasks.
When you automate such tasks, you will have to commit the addresses and private keys of validators and test accounts to public repositories. This is not a big deal; it’s even convenient: your product manager will be able to use one wallet and present it to investors, while all test scripts will use ready-made accounts. (Never use these accounts in the mainnet!)
The test network launch is coming. The project team proves that the code is consistent, the network can operate without stops for a long time, and anyone can use it. This is the main difference between blockchains and centralized systems - from now on it will be very hard to hide errors and shortcomings.
For teams that decide to build their own blockchain without using existing engines, the testnet launch is a more exciting event than going to mainnet. Before the testnet you could easily reset the blockchain and start again from the first block, but once there are services that testers, auditors, and external teams are studying, this is extremely undesirable.
To ensure a smooth start, it makes sense to have one or more boot nodes with fixed public addresses. New nodes will not be searching for testnet nodes in the network, but will contact these fixed addresses instead, receive an up-to-date list of peers, and quickly synchronize with the network. Your team is responsible for maintaining the relevant boot nodes.
When the testnet is running, the so-called replay scenario is possible - transactions are “replayed” by all nodes starting from the genesis block. It is relevant if there was a change in the logic of transaction processing and the state of all nodes must be rebuilt from the very beginning, without losing user transactions. In case of critical errors, this scenario is also possible in the mainnet. So, it is worth trying it in the testnet in advance.
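The idea of a replay can be shown on a toy ledger. The sketch below is not how any real engine stores state; the balance dictionary, the transaction format and the hashing of the state are all assumptions for illustration. It rebuilds the state from the genesis balances by applying every transaction of every block in order, so swapping in a new `apply_tx` function changes the processing logic without losing any user transactions.

```python
import hashlib
import json
from typing import Callable, Dict, List

State = Dict[str, int]
Tx = Dict[str, object]


def apply_transfer(state: State, tx: Tx) -> None:
    """Move tx['amount'] from tx['from'] to tx['to'].

    Invalid transfers (non-positive amount, insufficient funds) are skipped.
    """
    amount = int(tx["amount"])
    sender, receiver = str(tx["from"]), str(tx["to"])
    if amount <= 0 or state.get(sender, 0) < amount:
        return
    state[sender] -= amount
    state[receiver] = state.get(receiver, 0) + amount


def replay(genesis: State, blocks: List[List[Tx]],
           apply_tx: Callable[[State, Tx], None] = apply_transfer) -> State:
    """Rebuild the state by re-applying all transactions from genesis."""
    state = dict(genesis)  # never mutate the genesis balances
    for block in blocks:
        for tx in block:
            apply_tx(state, tx)
    return state


def state_hash(state: State) -> str:
    """Deterministic fingerprint used to compare states across nodes."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


if __name__ == "__main__":
    genesis = {"alice": 100}
    blocks = [
        [{"from": "alice", "to": "bob", "amount": 30}],
        [{"from": "bob", "to": "carol", "amount": 50},   # skipped: not enough funds
         {"from": "bob", "to": "carol", "amount": 10}],
    ]
    state = replay(genesis, blocks)
    print(state, state_hash(state))
```

After a replay, every node should arrive at the same state hash; a mismatch means the new processing logic is not deterministic.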
Sometimes a test network requires more devops attention than the mainnet. In the mainnet, a project may run only one or two validators, and the team only monitors and backs up the main services. A testnet may require updating a large number of nodes and continuous support from many people. At this point, all the update automation you built earlier will come in handy.
You can also delegate testnet launch to the teams that will build solutions for your blockchain or be validators of your network. This is an ideal option because switching from the test network to the mainnet will be very easy and without unpleasant surprises. Good luck!
Read the previous parts on Hackernoon: