
Expanding Smart Contracts With SQL

by Kwil, October 8th, 2024

Too Long; Didn't Read

Current blockchain platforms, like Ethereum, face limitations in handling complex data due to rigid key-value storage, hindering advanced applications. SQL smart contracts introduce flexibility, allowing developers to perform dynamic queries and manage intricate data models efficiently on a decentralized network. SQL Smart Contracts unlock the potential for more powerful decentralized apps, revolutionizing blockchain beyond cryptocurrency.

Special thanks to Jun Jiang from DePHY Network and Ryan Soury from Usher Labs for feedback and insights.


In 2008, the alarms on Wall Street rang as sophisticated traders descended into a primal frenzy. Overleveraged financial institutions, collapsing under the weight of subprime mortgage-backed securities, left greedy bankers exposed and begging for bailouts. The central banks, desperate to retain their grip on power, paid for the bankers’ sins from the common man’s checkbook. This betrayal laid bare the centralized monetary system’s flaws, revealing the need for a newer, freer, and fairer financial system. Just as the American Revolution and the Constitution that followed separated church and state, a new revolution called Bitcoin emerged to separate money and state, enabling many of the same liberties and freedoms fundamental to self-determination.


Blockchain technology is freedom technology. It enables us to build financial, identity, information, and social coordination systems that do not require trust in a centralized intermediary. Individual liberties thrive in a world where the central bank does not control the flow of money, a single platform does not control social discourse, and a single company does not control digital identities.


Many of the differences between this new world and where we are today lie in the technical capabilities of blockchain platforms. The first generation of smart contracts was the tip of the iceberg that enabled these freedom systems; however, they are fundamentally limited in their capabilities. In this article, I explain some of the critical limitations in current smart contracts and how a new system, “SQL Smart Contracts,” provides a more technically capable foundation to unlock human freedoms and realize blockchain’s potential as a new computing platform.

Smart Contracts: Programming The Truth Machine

“The root problem… is all the trust that is required to make it work.” - Satoshi Nakamoto


A blockchain's initial core property is immutability; once a certain threshold of stakeholders (or “nodes”) in a network agree that something is true, the blockchain will retain a permanent record of that truth. Blockchains use a variety of “proof” mechanisms in which nodes expend large amounts of value in the form of computing power, financial stake, or reputation to ensure that malicious actors cannot manipulate the truth.


If Bitcoin is the “truth machine” for digital currency, Ethereum is the “truth machine” for more complex financial products. Ethereum expands on Bitcoin’s capabilities by creating a programmable design space where developers can implement any logic to be deployed, verified, and executed across a network of nodes. This means we can now create systems that remove the need for trust in a central authority beyond just currency! Any system requiring central authorities (such as lending, real estate deeds, identity information, social media, and economic metrics) can now operate without central intermediaries. This is an entirely new world!

A smart contract is a program that developers write and deploy on a blockchain; it is the canvas on which developers create decentralized applications. The term “smart contract” does not mean a legal contract where two parties are bound to certain rights and obligations. Instead, a “smart contract” simply means that the application is guaranteed to function exactly as the code is written, indefinitely. Lending contracts guarantee that borrowers and lenders can always transact. Real estate contracts guarantee that people can always verify and transfer property ownership. A smart contract is an application where code becomes law.


Steve Jobs called the computer “a bicycle for the mind.” Smart contracts guarantee that the wheels never fall off.


Ethereum Smart Contracts: Tip of the Iceberg

“Crypto is not just about trading tokens, it's part of a broader ethos of protecting freedom and privacy and keeping power in the hands of the little guy.” - Vitalik Buterin


Although Ethereum smart contracts introduced a whole new world of decentralized products, fundamental limitations in their design and data manipulation capabilities prevent them from being effective in many applications beyond cryptocurrency.


In Solidity (a programming language for Ethereum), contract data is stored in key-value pairs. Although structs (groupings of variables) and mappings (collections of key-value pairs) present helpful ways to organize data, every piece of data is retrievable only by its key. Consider a theoretical contract for storing user identity data:


pragma solidity ^0.8.0;

contract IdentityStorage {
   // Struct to store KYC details
   struct Identity {
       string fullName;
       string dateOfBirth;
       string residentialAddress;
   }

   // Maps a country to its citizens' wallet addresses, and each address to that citizen's info
   // "Canada" => 0x123… => {Vitalik Buterin, 01/31/1994, ...}
   mapping(string => mapping(address => Identity)) public idData;

   //...rest of contract
}


In this contract, a user’s identity record can only be retrieved by knowing the user’s country and wallet address. Unless the contract deployer redesigns the smart contract to add gas-expensive data manipulation, there is no other way for a contract user to retrieve an identity record. Storing data in key-value pairs ultimately limits how the data can be accessed and manipulated.


In particular, data management in Ethereum smart contracts presents two fundamental problems: index dependence and access path dependence.


Index Dependence

Index dependence means that to access a specific piece of data, the data must be reachable through an index. An index is a data structure that makes it efficient to look up records by a particular identifier. In the example KYC contract above, records are only accessible through the exact country and Ethereum address used as keys. This rigid indexing structure prevents contract users from querying the data based on other criteria, such as, “Which users have this residential address?” or “What percentage of users from this country were born after January 1, 1970?” Without the ability to perform such queries, developers lack the flexibility to aggregate, analyze, and build application logic around contract data. When developers need this additional flexibility, such as retrieving an identity record by full name, the entire contract needs to be restructured. In Ethereum, restructuring indexes can also increase a contract’s gas costs, further hampering the contract’s usability.
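To make index dependence concrete, here is a minimal Python sketch that models the contract's nested mapping as nested dictionaries. It is an illustration only, not Solidity: the `Identity` class, the function names, and all records are hypothetical. Note that Python is more generous than Solidity here, since a Solidity mapping cannot be iterated at all; a contract would have to track its keys in a separate array and pay gas to scan them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Identity:
    full_name: str
    date_of_birth: date
    residential_address: str


# Models mapping(string => mapping(address => identity)): country -> wallet -> record.
# All names and records are made up for illustration.
id_data: dict[str, dict[str, Identity]] = {
    "Canada": {
        "0xA1": Identity("Alice Example", date(1965, 5, 1), "1 Main St"),
        "0xB2": Identity("Bob Example", date(1990, 7, 9), "2 Oak Ave"),
    },
    "France": {
        "0xC3": Identity("Chloe Example", date(1988, 2, 3), "1 Main St"),
    },
}


def get_identity(country: str, wallet: str) -> Identity:
    """The only indexed lookup: both keys must be known in advance."""
    return id_data[country][wallet]


def wallets_at_address(residential_address: str) -> list[str]:
    """No index answers this question, so every record has to be visited."""
    return sorted(
        wallet
        for citizens in id_data.values()
        for wallet, identity in citizens.items()
        if identity.residential_address == residential_address
    )
```

The keyed lookup is a direct hit, while the residential-address question degrades into a full scan whose cost grows with the number of stored records.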


Access Path Dependence

Access path dependence refers to data being accessible and understandable only through a specific retrieval path. In the example contract, knowing Vitalik’s country and wallet address would allow a developer to retrieve his identity record. However, knowing only the wallet address would not allow a developer to get Vitalik’s country of origin. Furthermore, even if the developer has Vitalik’s wallet address, they cannot get his identity record unless they also know the country of origin (the “Canada” key). The access path to Vitalik’s identity record is fixed; if a developer needed to retrieve his record by just the wallet address, the entire contract would need to be restructured. Access path dependence means that data is accessible and meaningful only in one direction, limiting the ability to query or interpret the data from different perspectives.
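The restructuring described above can be sketched in Python as follows. This is a hypothetical model rather than contract code: the `country_of` reverse map and every function name are assumptions introduced for illustration. It shows that adding a second access path means adding a second structure that each write must keep in sync.

```python
# Forward path only: country -> wallet -> full name (hypothetical records).
id_data: dict[str, dict[str, str]] = {}

# A second structure that a restructured contract would have to maintain
# so that a wallet alone is enough to find the country.
country_of: dict[str, str] = {}


def register(country: str, wallet: str, full_name: str) -> None:
    """Every write now has to update both structures to keep them in sync."""
    id_data.setdefault(country, {})[wallet] = full_name
    country_of[wallet] = country


def lookup(country: str, wallet: str) -> str:
    """The original access path: both keys are required."""
    return id_data[country][wallet]


def lookup_by_wallet(wallet: str) -> tuple[str, str]:
    """The added access path: resolve the country, then follow the original path."""
    country = country_of[wallet]
    return country, lookup(country, wallet)


register("Canada", "0xA1", "Alice Example")
register("France", "0xC3", "Chloe Example")
```

Each further access path (by name, by date of birth, and so on) would need its own extra structure and its own write logic, which is the maintenance burden the relational model removes.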

Index and access path dependence pose significant challenges for applications requiring a complex or evolving data model. While cryptocurrencies have simple data structures that can be implemented on Ethereum (ERC20 tokens are essentially just a mapping of addresses to balances), these challenges become problematic for more data-intensive applications. When an application needs to store, query, and manipulate a complex data model, Ethereum's basic key-value storage makes such applications difficult to build and maintain.
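As a rough illustration of why key-value storage suits tokens, the sketch below models a token ledger as a single dictionary from address to balance. It is a simplified Python model with hypothetical names and values, not the ERC20 standard itself, which also covers allowances and events.

```python
# The whole ledger is one map from address to balance (hypothetical values).
balances: dict[str, int] = {"0xA1": 100, "0xB2": 0}


def transfer(sender: str, recipient: str, amount: int) -> None:
    """Move tokens between two addresses; both lookups go directly through the key."""
    if amount < 0 or balances.get(sender, 0) < amount:
        raise ValueError("invalid amount or insufficient balance")
    balances[sender] -= amount
    balances[recipient] = balances.get(recipient, 0) + amount


transfer("0xA1", "0xB2", 30)
```

Every operation a token needs is keyed by an address the caller already knows, so the single access path is never a constraint. Identity data, by contrast, is queried by many different attributes.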

A Brief History Lesson: The Relational Model

“History doesn't repeat itself, but it often rhymes” – Mark Twain


In 1970, Edgar F. Codd, a computer scientist at IBM, published a paper called “A Relational Model of Data for Large Shared Data Banks.” At the time, the most popular type of application database was the "hierarchical database," which used a rigid, tree-like structure where each piece of data was stored under a parent directory, similar to how files are organized on a computer. Codd argued against the hierarchical database, proposing a newer, simpler, far more capable relational database with a tabular structure.


The hierarchical database’s tree-like structure means that data can only be accessed through the rigid system of understanding each piece of data’s parent-child relationship. In particular, Codd identified three key problems with the hierarchical system:


  1. Ordering Dependence: The result of a query often depends on how the data is organized in storage. If an application is built assuming that data will be queried in the same order it is stored, then the order cannot be changed in the future.

  2. Index Dependence: To access a specific piece of data, the application must know the parent (i.e., an index). Otherwise, retrieving the requested data is impossible.

  3. Access Path Dependence: Accessing or understanding data requires following a specific retrieval path. If the application is designed to retrieve data using one particular access pattern, it cannot retrieve or interpret the same data using alternative paths.


Does this sound familiar? Although Ethereum smart contracts do not have ordering dependence (maps are unordered), the same index and access path dependence limitations that held databases back in the 1960s and 1970s are holding back smart contract platforms today.


Limitations at the database level are more than a trivial setback; they fundamentally constrain developers and limit the types of applications built on a platform. Rather than focusing on implementing new features, developers fighting index and access path dependence must spend an extraordinary amount of effort maintaining an existing application’s functionality. Through the 1960s and 1970s, database usage was primarily reserved for rigid business tasks such as inventory management, accounting, and general data processing; developers did not have the data flexibility to create more sophisticated applications. However, after the introduction of relational databases, significantly more expressive and data-intensive applications emerged, leading to the rise of ERP systems, CRMs, and business intelligence tools. Furthermore, with the advent of the internet, these advancements paved the way for e-commerce platforms and social media applications. Developers could implement features that would previously require an entire database to be restructured with just a few lines of SQL. The relational database was more than a paradigm shift; it was a category-creating platform that enabled fundamentally new applications to come into existence.


Today, blockchain platforms are similar to computers and databases in the 1970s. The lack of capable data processing at the blockchain level means developers cannot implement more sophisticated, data-intensive decentralized applications. If the primary use case for blockchains is ever to expand beyond cryptocurrency, we need blockchain platforms with more capable data processing functionality.

SQL Smart Contracts: A More Flexible Paradigm

"The measure of intelligence is the ability to change." - Albert Einstein


Just as the commercialization of the relational database in the 1980s led to the proliferation of new applications, integrating relational databases into blockchain platforms has the same potential to reshape the types of decentralized applications that can be created.


At Kwil, we are building a blockchain platform and smart contract language that allows developers to build decentralized applications that leverage SQL's full expressivity. With Kwil, developers can leverage the relational model’s flexibility to create more capable, data-intensive decentralized applications.


Consider the same identity storage example from earlier. Rather than storing identity records in a map where each record is only accessible by its key, Kwil allows developers to store the records in a table and leverage a flexible SQL syntax to query over the table:


database user_registry;


table identities {
  address uuid primary key,
  name text notnull,
  date_of_birth int notnull,
  residential_address text notnull,
  national_id int notnull,
  #country_index index(national_id)
}


action query_by_national_id ($id) public view {
  SELECT * FROM identities WHERE national_id = $id;
}


action query_by_dob ($dob) public view {
  SELECT * FROM identities WHERE date_of_birth > $dob;
}


In the original Ethereum smart contract, there was no way to search through the identities and return all users matching a condition (such as a national ID) or to find wallets based on a specific attribute (such as a date of birth). Enabling such functionality would require restructuring the contract to add costly, gas-intensive functions. With the relational model, however, developers can execute these queries without any restructuring, gaining more data manipulation flexibility without incurring those additional costs.
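The same idea can be tried locally with Python's built-in `sqlite3` module. This is only a stand-in for experimentation and not Kwil itself: the table mirrors the schema above, but the `TEXT` address column, the `YYYYMMDD` integer encoding for dates, the sample rows, and the `query_by_residential_address` function are all assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE identities (
        address TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        date_of_birth INTEGER NOT NULL,      -- assumed encoding: YYYYMMDD
        residential_address TEXT NOT NULL,
        national_id INTEGER NOT NULL
    );
    CREATE INDEX country_index ON identities (national_id);
    """
)
conn.executemany(
    "INSERT INTO identities VALUES (?, ?, ?, ?, ?)",
    [
        ("0xA1", "Alice Example", 19650501, "1 Main St", 1),
        ("0xB2", "Bob Example", 19900709, "2 Oak Ave", 1),
        ("0xC3", "Chloe Example", 19880203, "1 Main St", 2),
    ],
)


def query_by_national_id(national_id: int) -> list[str]:
    """Addresses of all identities with the given national ID."""
    rows = conn.execute(
        "SELECT address FROM identities WHERE national_id = ? ORDER BY address",
        (national_id,),
    )
    return [row[0] for row in rows]


def query_by_dob(dob: int) -> list[str]:
    """Addresses of all identities born after the given date."""
    rows = conn.execute(
        "SELECT address FROM identities WHERE date_of_birth > ? ORDER BY address",
        (dob,),
    )
    return [row[0] for row in rows]


def query_by_residential_address(residential_address: str) -> list[str]:
    """A question nobody planned for: it needs no schema change, only a new query."""
    rows = conn.execute(
        "SELECT address FROM identities WHERE residential_address = ? ORDER BY address",
        (residential_address,),
    )
    return [row[0] for row in rows]
```

The third function is the point of the exercise: a new access pattern is one more `SELECT`, and the stored data and table definition stay untouched.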


For example, the idOS network is a sovereign blockchain built with Kwil, allowing users and dApps to store user credential information. Leveraging SQL over the idOS network enables:


  1. Users to be associated and retrievable by multiple wallets, credentials, and attributes.

  2. DeFi protocols to perform aggregate analyses of where their users are from.

  3. Stablecoin protocols to assess which users are from high-risk areas.


Enabling the relational model and SQL on a decentralized blockchain platform allows us to create fundamentally new applications that cannot exist on existing Ethereum smart contracts.

Conclusion

The relational model that revolutionized the computing industry 40 years ago has the same potential to revolutionize the blockchain industry today. In the 1960s and 1970s, index and access path dependence limited the hierarchical database’s usefulness in data-intensive applications. Today, the same index and access path dependence limit Ethereum smart contracts and their ability to power decentralized platforms with complex data models. However, by integrating the relational model into the blockchain and providing developers with the same expressive SQL they already know, we can unlock new types of applications. Just as the relational database accelerated business demand and helped computers attain mainstream adoption, it may help blockchain platforms do the same, thereby unlocking a freer, more decentralized, more trustworthy digital world.