The Return of the API: Analyzing the Latest Developments in Ubiquity

Written by blockdaemon | Published 2020/11/01
Tech Story Tags: blockchain | nodes | cryptocurrency | api | protocols | rabbitmq | websockets-api-gateway | good-company

TLDR: Blockdaemon, the leading blockchain infrastructure platform for node management, is rolling out the next iteration of Ubiquity, its multi-protocol API. The next iteration provides users with real-time notifications on transactions via a WebSocket API. The newest addition to the data model is staking reward data, currently incubating for internal use. A reimagined data extraction pipeline works with less storage space by replaying blockchain sync on special-purpose "zombie" nodes and tracing state data.

It is no secret that the blockchain sector moves at a rapid pace. In 2020, we have seen the launch of several significant protocols, each with its own nuances and strengths, offering value propositions across industries ranging from enterprise and finance to supply chain.
At Blockdaemon, we understand that the ability to communicate across multiple blockchain networks is pivotal for many network users and developers. That is why we launched Ubiquity earlier this year: an API that provides a unified syntax for communicating with many different protocols. Ubiquity eliminates the need for users of multiple networks to constantly switch contexts and re-learn a new set of commands and APIs.
However, with constant innovation and development in the blockchain sector, it is essential that Ubiquity stays ahead of these developments. That is why I'm thrilled to discuss the next exciting phase in the development of the service.

Ubiquity System Architecture 

One of the initial challenges we faced in developing Ubiquity was finding the right system architecture. The first version has proven to suit our needs effectively: CockroachDB serves as a performant index on top of Ceph, which stores large amounts of unstructured data, while RabbitMQ, a reliable message broker, powers the event-driven architecture that relays syncing data across the microservices.
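To make the event-driven piece concrete, here is a minimal sketch of how a syncing service might publish block events to RabbitMQ in Go. The exchange name and event payload are assumptions for illustration, not Ubiquity's actual internals.

```go
package main

import (
	"encoding/json"
	"log"

	"github.com/streadway/amqp"
)

// BlockEvent is a hypothetical payload describing a newly synced block.
type BlockEvent struct {
	Protocol string `json:"protocol"`
	Height   uint64 `json:"height"`
	Hash     string `json:"hash"`
}

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}
	defer ch.Close()

	// Fanout exchange: every consumer (indexer, notifier, ...) gets a copy.
	if err := ch.ExchangeDeclare("blocks", "fanout", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}

	body, err := json.Marshal(BlockEvent{Protocol: "bitcoin", Height: 654321, Hash: "000000..."})
	if err != nil {
		log.Fatal(err)
	}
	if err := ch.Publish("blocks", "", false, false, amqp.Publishing{
		ContentType: "application/json",
		Body:        body,
	}); err != nil {
		log.Fatal(err)
	}
}
```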

Live APIs

We incorporated customer feedback calling for a live API, and to that end, Ubiquity will offer some exciting new features in the next iteration of its development. The engineering team focused on Ubiquity v1.0 as a historical data API; the next iteration provides users with real-time notifications on transactions via a WebSocket API.
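As a concrete illustration, here is a minimal sketch of what a client subscription could look like, using Go and the gorilla/websocket package. The endpoint URL and the subscribe/notification message shapes are assumptions for illustration, not the documented contract.

```go
package main

import (
	"log"

	"github.com/gorilla/websocket"
)

func main() {
	// Hypothetical endpoint; the real URL is documented in the Marketplace.
	c, _, err := websocket.DefaultDialer.Dial("wss://ubiquity.example.com/v1/bitcoin/ws", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Hypothetical subscription request for transactions touching an address.
	sub := map[string]interface{}{"op": "subscribe", "channel": "txs", "address": "bc1q..."}
	if err := c.WriteJSON(sub); err != nil {
		log.Fatal(err)
	}

	// Stream notifications as the server pushes them.
	for {
		var event map[string]interface{}
		if err := c.ReadJSON(&event); err != nil {
			log.Fatal(err)
		}
		log.Printf("transaction notification: %v", event)
	}
}
```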

Network Reorganizations

Blockchain development has many specific and challenging nuances that don't exist in other areas of software development. One of these is accounting for large-scale network reorganizations, which can occur both in attack scenarios and during planned forks, where large volumes of blocks are reverted.
Blockchain networks running proof-of-work (PoW) consensus also experience smaller forks naturally during normal operation. We made sure to account for these possibilities by supporting such reverts in Ubiquity and notifying users if previously sent transactions have been reverted.
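The core bookkeeping behind this can be sketched in a few lines: track the canonical chain by hash, and when a new block does not extend the current tip, walk back to where the new branch attaches and flag everything above it as reverted. This is a simplified illustration with hypothetical types, not Ubiquity's actual tracker.

```go
package main

import "fmt"

// Block is a simplified view of a chain block.
type Block struct {
	Height     uint64
	Hash       string
	ParentHash string
}

// Tracker keeps the hash of the canonical block at each height.
type Tracker struct {
	canonical map[uint64]string // height -> hash
	tip       Block
}

// Apply extends the chain, or detects a reorg and reports reverted heights.
func (t *Tracker) Apply(b Block) (reverted []uint64) {
	if b.ParentHash != t.tip.Hash {
		// Walk back to the height the new branch builds on, collecting
		// every block that is no longer canonical.
		for h := t.tip.Height; h >= b.Height; h-- {
			reverted = append(reverted, h)
			delete(t.canonical, h)
		}
	}
	t.canonical[b.Height] = b.Hash
	t.tip = b
	return reverted
}

func main() {
	t := &Tracker{canonical: map[uint64]string{0: "genesis"}, tip: Block{0, "genesis", ""}}
	t.Apply(Block{1, "a1", "genesis"})
	t.Apply(Block{2, "a2", "a1"})
	// A competing block at height 1 arrives: heights 2 and 1 are reverted.
	fmt.Println("reverted:", t.Apply(Block{1, "b1", "genesis"}))
}
```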

Staking Data

We expect the API's generic data model to evolve and incorporate more data types. The newest addition is staking reward data, currently incubating for internal use. A unified staking dashboard on the Blockdaemon Marketplace, backed by Ubiquity, enables users to monitor node efficiency across multiple protocols--all in one place.
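As a sketch of what such a record might look like once it graduates from internal use, here is a hypothetical, protocol-agnostic reward shape in Go; the field names are illustrative assumptions, not the final schema.

```go
package ubiquity

import "time"

// StakingReward is a hypothetical, protocol-agnostic reward record;
// field names are illustrative, not the final schema.
type StakingReward struct {
	Protocol  string    `json:"protocol"`  // e.g. "stellar"
	Validator string    `json:"validator"` // node or validator identifier
	Epoch     uint64    `json:"epoch"`     // reward period on the network
	Amount    string    `json:"amount"`    // smallest unit, as a string to avoid float loss
	EarnedAt  time.Time `json:"earned_at"`
}
```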

Data Extraction with Zombies

Extracting rich historical data from blockchain nodes is difficult and can be extremely slow due to poor data locality. While extracting state change information is as easy as parsing a block, getting detailed information such as historical balances from archive nodes incurs random reads over terabytes of cold data, which can take months.
To address this challenge, we reimagined the data extraction pipeline to work with less storage space by replaying blockchain sync on special-purpose "zombie" nodes and tracing state data. These zombie nodes exist solely for state/tracing computation and are otherwise idle (they have no free will to sync blocks on their own).
Although applicable to any blockchain, this approach has proven incredibly valuable on Bitcoin, where data extraction was sped up by two orders of magnitude by fully leveraging Bitcoin Core's in-memory UTXO set cache. The zombie syncing mechanism will also enable quick onboarding of new protocols.
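The replay idea reduces to a simple loop: blocks are fed sequentially from cheap storage into a node that only applies them and emits state traces, avoiding the random reads an archive node would incur. The interfaces below are hypothetical stand-ins, not Blockdaemon's actual pipeline.

```go
package zombie

// BlockSource streams raw blocks in order from cheap sequential storage
// (e.g. object storage); a hypothetical interface.
type BlockSource interface {
	Next() (raw []byte, ok bool)
}

// StateTrace is a hypothetical record of one state change, e.g. a balance delta.
type StateTrace struct {
	Account string
	Delta   int64
}

// ZombieNode applies blocks it is fed but never syncs on its own.
type ZombieNode interface {
	ApplyBlock(raw []byte) ([]StateTrace, error)
}

// Replay drives the zombie node through history sequentially, handing each
// block's traces to a sink (e.g. the syncing layer). Sequential replay keeps
// the hot state in memory, avoiding cold random reads.
func Replay(src BlockSource, node ZombieNode, sink func(StateTrace)) error {
	for {
		raw, ok := src.Next()
		if !ok {
			return nil // reached the chain tip
		}
		traces, err := node.ApplyBlock(raw)
		if err != nil {
			return err
		}
		for _, tr := range traces {
			sink(tr)
		}
	}
}
```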

High Availability

Even the best server hardware on the market requires occasional maintenance. So from day one, Ubiquity has been designed to eliminate any single points of failure and to be resilient to partial outages, conquering the availability problem by dividing the architecture into layers with clear separation:
  • Source Layer: Multiple blockchain nodes feeding new data into the API.
  • Service Layer: Multi-zone Kubernetes cluster and RabbitMQ event broker.
  • Syncing Layer: Redundant batch processing and live jobs for syncing blockchain data to the database (see the idempotent-write sketch below).
  • API Layer: Swarm of web servers querying the database.
[Diagrams: the ingestion phase and the presentation phase of the Ubiquity architecture]
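Redundancy in the syncing layer implies the same block may be processed more than once, so writes must be idempotent. Here is a minimal sketch of that pattern against a CockroachDB/PostgreSQL-style database; the table name and schema are assumptions for illustration.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol
)

func main() {
	db, err := sql.Open("postgres", "postgresql://ubiquity@localhost:26257/chain?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Upsert: if a redundant sync job already wrote this block, the second
	// write is a harmless no-op instead of a duplicate-key error.
	_, err = db.Exec(
		`INSERT INTO blocks (height, hash, parent_hash)
		 VALUES ($1, $2, $3)
		 ON CONFLICT (height, hash) DO NOTHING`,
		654321, "000000...a", "000000...b",
	)
	if err != nil {
		log.Fatal(err)
	}
}
```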

Testing Testing Testing

We continually test API performance to iteratively push improvements based on insights gained through thousands of VictoriaMetrics time series and millions of Jaeger traces.
The critical metric to optimize on the live API was latency--the time between a block being created on the network and the customer receiving the event via Ubiquity. A valuable few hundred milliseconds were saved by converting the blockchain change notifiers from polling to push-based delivery. We are happy with the performance of the API for the most part, and we stress-tested it to maximize raw requests-per-second throughput, mainly by optimizing SQL queries.
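The polling-to-push gain can be illustrated with a toy comparison in Go: a push subscriber is woken the moment an event exists, while a poller pays up to a full polling interval of extra latency per event. This is a schematic illustration, not the production notifier.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	events := make(chan string, 1)

	// Push: the producer hands the event to the consumer the moment it exists.
	go func() {
		time.Sleep(50 * time.Millisecond) // a new block appears on the network
		events <- "block 654321"
	}()
	start := time.Now()
	e := <-events
	fmt.Printf("push: got %q after %v\n", e, time.Since(start))

	// Poll (for contrast): the consumer can only observe the block on the
	// next tick, paying up to a full interval of extra latency.
	ticker := time.NewTicker(500 * time.Millisecond)
	defer ticker.Stop()
	start = time.Now()
	<-ticker.C
	fmt.Printf("poll: observed the block after %v\n", time.Since(start))
}
```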
The most exciting results came from a test scaling the SQL indexing part of the syncing layer, which increased throughput for Bitcoin from just ~100 to a sustained ~1700 transactions per second. Changes included using binary encodings for primary keys, dropping foreign key constraints, and switching to an eventually consistent database model--data inconsistencies during updates are now gracefully handled on the query layer instead.
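To illustrate the binary key change: storing a transaction id as its raw 32 bytes rather than a 64-character hex string halves the key size the index must store and compare. A minimal sketch, with an assumed table layout:

```go
package main

import (
	"database/sql"
	"encoding/hex"
	"log"

	_ "github.com/lib/pq"
)

// insertTx stores the transaction id as raw BYTES rather than a hex STRING,
// halving the primary key the index must store and compare. The table layout
// is an assumption for illustration.
func insertTx(db *sql.DB, txidHex string, height uint64) error {
	txid, err := hex.DecodeString(txidHex) // 64 hex chars -> 32 raw bytes
	if err != nil {
		return err
	}
	// No foreign key checks; consistency is handled on the query layer.
	_, err = db.Exec(
		`INSERT INTO txs (txid, height) VALUES ($1, $2)
		 ON CONFLICT (txid) DO NOTHING`,
		txid, height,
	)
	return err
}

func main() {
	db, err := sql.Open("postgres", "postgresql://ubiquity@localhost:26257/chain?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// The coinbase transaction of Bitcoin's genesis block.
	if err := insertTx(db, "4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b", 0); err != nil {
		log.Fatal(err)
	}
}
```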
Another syncing component we tested was the blockchain tracker, which resolves re-branches and decides which blocks to index. The traces below showed a 40% performance improvement after adding parallel processing.
[Jaeger traces: the tracker without parallel processing vs. with parallel processing]
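The shape of that change is a classic fan-out: independent per-block work moves from a sequential loop onto concurrent goroutines. A minimal sketch using golang.org/x/sync/errgroup; processBlock is a hypothetical stand-in for the tracker's per-block work.

```go
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/sync/errgroup"
)

// processBlock is a hypothetical stand-in for per-block tracker work
// (fetching headers, resolving branches) that is independent across blocks.
func processBlock(ctx context.Context, height uint64) error {
	select {
	case <-time.After(10 * time.Millisecond): // pretend to do I/O
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	heights := []uint64{654321, 654322, 654323, 654324}

	g, ctx := errgroup.WithContext(context.Background())
	start := time.Now()
	for _, h := range heights {
		h := h // capture loop variable for the goroutine
		g.Go(func() error { return processBlock(ctx, h) })
	}
	if err := g.Wait(); err != nil {
		fmt.Println("tracker error:", err)
	}
	fmt.Println("processed", len(heights), "blocks in", time.Since(start))
}
```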

A Final Note

As a blockchain infrastructure provider, we must both anticipate and move with developments in the ever-evolving blockchain sector. Our latest updates to Ubiquity are our endeavour to do just that.
With support currently provided for Bitcoin, Ethereum, Stellar and Ripple, we are planning to add several additional protocols to the list soon. 
If you'd like to try Ubiquity out, or learn more about its latest features, please visit the Ubiquity page in our Marketplace for additional details.
This article was written by Richard Patel, Blockchain Developer at Blockdaemon
