Floating Islands by Sviatoslav Gerasimchuk

This is the second post in our series, “Getting APIs on the Blockchain”. See our first post, where I define “API” via some computer science history.

As discussed in our previous blog post: APIs permeate our digital world and allow developers to build applications at a rate and at a complexity never before seen. In recent years, businesses have increasingly used APIs to monetize their data and services through completely API-centric business models. However, existing APIs are not natively compatible with blockchains and the decentralized applications that operate on them.

Just as Web 2.0 was marked by interoperability, user-generated content, and participatory culture, Web 3.0 is defined by decentralization. Practically, this means the distribution of computation and consensus across a network. To enforce these network-wide consensus rules, nodes in the network must verify global network states by locally computing proposed state changes, which take the form of transactions.

Thus, smart contracts running on a blockchain network can only operate on information that is accessible to and agreed upon by all nodes in the network. In other words, the blockchain is walled off from off-chain information. This has been widely referred to as the “oracle problem”, referring to an ideal, abstract entity that can deliver Truth about the outside world to the blockchain.

The “Oracle Problem”

The “Oracle Problem” is a three-body problem: data source, oracle, and on-chain data consumer. Existing solutions succumb to the pitfall of modelling their ecosystem as being solely composed of oracles and data consumers, while ignoring where the data originates. In other words, their models inaccurately treat the oracle node as the mythical oracle that is the source of the truth, rather than what it really is: something that transports data from source to blockchain.

More essentially: the oracle problem is ill-posed, since its name suggests an impossible solution. An analogy would be to approach the problem of getting from Point A to Point B as the “teleportation problem”. Further, ideal architectures for solving the oracle problem change drastically depending on the data type at hand (e.g. objective vs subjective information). Such a problem inevitably leads to impractical and/or sub-optimal solutions.

For a more formal treatise on these issues, see Section 3 in the API3 whitepaper.

The problem with price feeds: a quick example

To illustrate some of my points, here’s an example. Price feeds are presently the most common use case for oracle networks, given the recent and rapid growth of DeFi. Under an architecture like the one described above, where oracles are neither incentivized nor required to report their sources, we can quite easily outline several problems:

A price feed fed by 10 oracles (for example) does not represent 10 unique data points. All oracles could very well be serving data from the same API provider (and we’d be none the wiser). There is a lack of transparency here: the number of oracles serving a data feed does not correspond to higher quality and more robust data, although providers of such feeds might imply such things.

Oracles have an incentive to gather cheap and easily accessible data because nothing is enforcing or incentivizing them to do otherwise (since, again, they are not required in any way to report their sources). This creates something of a Schelling point around cheap and easily available data. Further, this makes staking difficult, if not impossible, in such systems because high-quality, curated data sources now become outliers. (Issues regarding staking in such systems will be covered in more detail in a later post in this series.)

Doing source-blind aggregation shows what can only be called statistical illiteracy. As already mentioned, a data feed being served by x oracles does not necessarily correspond to x unique data sources. This is especially true as the number of oracles increases, because unique data sources are far less abundant and scalable than oracle nodes. This means a data-source-agnostic aggregation method produces a skewed aggregate result (since it is very unlikely that the oracle-to-data-source ratio is the same for all data sources). A (likely small) subset of data sources has a disproportionate effect on the final aggregate result. And, again, for game-theoretic reasons, the aggregate result ends up skewed towards cheaper and more easily accessible data sources.

Let’s narrow down our example to price feeds again. Another problem with source-agnostic aggregation is the inability to do a properly weighted and normalized aggregation. Consider a price feed contract served by data from price aggregator APIs: oracle responses occur at different times, prices represent different trading volumes, and these prices come from different aggregators (certainly with their own proprietary aggregation methods). Blindly computing a mean or median on these data points is doing an “apples-to-oranges” comparison. That is, you are essentially computing a statistic on different data types but implicitly treating them as if they were the same, something that would strike the average data scientist as clearly ill-informed.

And I haven’t even gotten to the legal repercussions of data source agnosticism. Most API terms of service prohibit the resale or unauthorized distribution of the API data, which puts an oracle node operator serving such APIs in breach of those terms and exposes them to broad legal liability, including claims by the API provider.¹

The API Connectivity Problem

We reduce the problem of getting objective² data on the blockchain to a two-body problem by redefining it as the API Connectivity Problem. This is the cutting of the Gordian Knot.
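
To make the earlier point about source-blind aggregation concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the oracle IDs, the source names, the prices, and the two helper functions are invented, and the source-aware variant only works if each response is labeled with where its data came from, which is exactly what the architectures criticized above do not require. It is not API3's aggregation method, just a toy comparison.

```python
from collections import defaultdict
from statistics import median

# Hypothetical oracle reports: (oracle_id, data_source, reported_price).
# Seven of the ten oracles relay the same cheap API; three relay curated sources.
REPORTS = [
    ("oracle-1", "cheap-api", 100.0),
    ("oracle-2", "cheap-api", 100.0),
    ("oracle-3", "cheap-api", 100.0),
    ("oracle-4", "cheap-api", 100.0),
    ("oracle-5", "cheap-api", 100.0),
    ("oracle-6", "cheap-api", 100.0),
    ("oracle-7", "cheap-api", 100.0),
    ("oracle-8", "curated-a", 103.0),
    ("oracle-9", "curated-b", 104.0),
    ("oracle-10", "curated-c", 105.0),
]


def unique_sources(reports):
    """Count distinct data sources behind the oracle responses."""
    return len({source for _, source, _ in reports})


def source_blind_median(reports):
    """Median over every oracle response, ignoring where each one came from."""
    return median(price for _, _, price in reports)


def source_aware_median(reports):
    """Collapse responses to one value per data source, then take the median."""
    by_source = defaultdict(list)
    for _, source, price in reports:
        by_source[source].append(price)
    per_source = [median(prices) for prices in by_source.values()]
    return median(per_source)


if __name__ == "__main__":
    print("oracles:", len(REPORTS), "unique sources:", unique_sources(REPORTS))
    print("source-blind median:", source_blind_median(REPORTS))
    print("source-aware median:", source_aware_median(REPORTS))
```

With these made-up numbers, ten oracles stand for only four data sources, and the source-blind median sits exactly on the cheap API's value, while counting each source once moves the result towards the curated sources. The sketch deliberately ignores the timing and trading-volume differences mentioned above; handling those would need extra metadata per response.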
(Note: this also solves the issues above regarding data source agnosticism, since the data source is now represented on-chain.)

There are real-world businesses that create real value via the internet, but they can’t create real value on the blockchain because they are not connected to it. Indeed, the primary use of oracle solutions today is to deliver asset prices curated by API providers to DeFi applications. Emerging use cases such as prediction markets and parametric insurance have similar requirements.

Conclusions & next blog post

The API Connectivity Problem formalizes and specifies the problem of how to connect off-chain businesses, monetized and represented digitally by their APIs, with the blockchain (in a decentralized, cost-efficient, and secure way, of course). Connecting such APIs to the blockchain directly brings off-chain value on-chain.

How exactly do we bring API providers onto the blockchain? How does the transmission of such monetizable data and services differ from existing “oracle network” solutions? Keep an eye out for next week’s installment of this series, where I discuss the pros and cons of third-party oracles versus first-party oracles!

► API3 wants to talk to you about providing your API to the blockchain ◄

Footnotes

[1] Practical Law, “Data licensing: Taking into account data ownership and use.” https://legal.thomsonreuters.com/en/insights/articles/data-licensing-taking-into-account-data-ownership

[2] I must note that there are suitable approaches to getting subjective data on the blockchain via posing a question and crowdsourcing its answer. A good example would be the resolution of a judicial dispute.

Previously published here.