Driven by the LLM boom, more and more AI + Blockchain projects have appeared recently. Besides the combination of LLMs and blockchain, we also see AI more broadly coming back to the blockchain. ZKML is one of the popular combinations.
AI and blockchain are two technologies with very different characteristics. AI needs large amounts of computing power, which centralized data centers provide. Blockchains provide decentralized computation and privacy but are not good at heavy computation or large-scale storage. We are still exploring the right way to combine AI and blockchain. Here is an overview of AI + Blockchain projects.
In this research piece, we mainly focus on applications of LLMs in the crypto space. LLMs are powerful because they can understand natural language, and developers are using them in these two directions:
Here is an engineering workflow diagram for building an LLM app that answers users’ questions. First, the relevant data sources are converted into embeddings and stored in a vector database. The LLM adapter uses the user query and a similarity search to find related contexts in the vector database. These contexts are put into the prompt and sent to the LLM. The LLM processes the prompt and uses the tools available to it to generate the response. Sometimes the LLM is fine-tuned on specific datasets to improve accuracy and cut costs.
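The retrieval part of this workflow can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `embed` function is a toy bag-of-words counter standing in for a real embedding model, `VectorStore` is a hypothetical in-memory stand-in for a vector database, and the final LLM call is left out entirely.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector, not a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (text, embedding) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query, contexts):
    """Put the retrieved contexts into the prompt that would be sent to the LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {query}"
    )


store = VectorStore()
store.add("Uniswap is a decentralized exchange on Ethereum")
store.add("Aave is a lending protocol")
store.add("Azuki is an NFT collection")

question = "which project is a lending protocol"
top_contexts = store.search(question, k=1)
prompt = build_prompt(question, top_contexts)
print(prompt)
```

In a real application, the embedding function and the store would be replaced by an embedding model and a vector database, but the order of steps (embed, search, build the prompt, call the LLM) stays the same.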
The LLM application workflow can be generally categorized into three main phases:
We have identified 8 potential directions in which LLMs can help the blockchain space:
Blockchains will have built-in AI functions and models. Developers can use these AI functions to perform typical ML tasks like classification, regression, text completion, and AIGC on-chain, and can call them through smart contracts.
With these built-in functions, developers can empower their smart contracts with intelligence and autonomy. Classification, regression, and AIGC are typical AI tasks. Let’s see how these can be used in the blockchain space, along with a few example projects.
Classification can be used to determine whether the address is a bot or a real human. This can change the current NFT sale situations. Classification can also increase the security of the DeFi ecosystem. DeFi smart contracts can filter malicious transactions to prevent fund loss.
Regression can be used for prediction, for example in fund and wealth management.
Numer.ai has already used AI to help them manage their funds. Numer provides high-quality stock market data. Data scientists work on top of these data and apply machine learning to predict the stock market.
Lots of NFT projects are trying to build an IP universe. However, their limited content cannot support a universe. If we can use AIGC on-chain, the AIGC model can output countless pieces of content at a relatively low cost, all sharing a similar iconic brand style. The model could output text, illustrations, music, or even vocals. This greatly expands the IP universe. Community participants can collectively fine-tune the model to fit their expectations. The fine-tuning process also gives the community a sense of engagement.
Botto has already used an AIGC model to generate art. The community votes for its favorite images to collectively fine-tune the AIGC model.
If we treat blockchain as a database, we also find Databend integrating AI built-in functions into their database. They provide functions like:
Developers can apply this function to their SQL queries. For example, the following query is supported:
USE <your-database>;
SELECT * FROM ai_to_sql('<natural-language-instruction>');
We see several projects providing AI functions to the blockchain.
Giza is working on the ZKML side. It generates inference proofs off-chain and verifies them on-chain. It now supports EVM-compatible chains and StarkNet. Giza recently announced a partnership with Yearn.finance, under which Yearn will use Giza's AI functions to improve its risk assessments.
Modulus Labs is working in a similar direction. They put more effort into improving the proving systems to produce high-performance circuits for AI. They have released demos like a chess AI and an ETH price prediction AI. Their new demo, zkMon, is billed as the world’s first zero-knowledge-proven Generative Adversarial Network NFT collection.
Analyzing transaction records is usually done by specific apps, like Debank. It is hard for humans to analyze transaction records manually, because doing so involves data gathering, data cleaning, and data analytics, and users need to be able to code. With LLMs, we now have a new approach. An LLM can analyze and visualize the data, so we can analyze on-chain data with customizations. We can analyze the win ratio, performance ratio, or anything else we would like to know.
RSS3 developed a ChatGPT plugin, Web3 User Activity, to pursue this direction. Users can input a wallet address, ENS, or Lens handle to look up on-chain activity. The plugin outputs transactions in a human-readable manner. However, it cannot execute complex queries such as how many Azuki holders there are or what the hottest smart contracts are. Users should also be cautious about the accuracy of the addresses and labels the plugin provides.
DeFiLlama also released a ChatGPT plugin. Users can use natural language to ask for any data available on DeFiLlama. It can also execute simple filters and sorts.
Dune is also integrating GPT into its products. Dune planned the following features:
Besides analyzing on-chain data, Dune has also integrated LLM into other functions which we will discuss later.
Similar to Dune, Space and Time also works on translating natural language to SQL powered by OpenAI.
Blockchain data is highly structured data. Using natural language to directly query the database may not output accurate results. The better approach would be to convert the natural language to SQL and then execute the corresponding SQL queries.
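A minimal sketch of this two-step approach is shown below, using Python's built-in `sqlite3` module. Everything specific here is an assumption made for illustration: the `transfers` table, its columns, and the `translate_to_sql` function, which is a canned lookup standing in for a real LLM call. The read-only guard is a design choice of this sketch, not something taken from the projects above.

```python
import sqlite3


def translate_to_sql(question):
    """Hypothetical stand-in for an LLM call that turns a question into SQL."""
    canned = {
        "how many transfers did each address send?": (
            "SELECT sender, COUNT(*) AS n FROM transfers "
            "GROUP BY sender ORDER BY n DESC, sender"
        ),
    }
    return canned[question.lower()]


def run_readonly(conn, sql):
    """Execute generated SQL, refusing anything that is not a SELECT statement."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()


# Hypothetical, simplified table of token transfers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transfers (sender TEXT, receiver TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?, ?)",
    [
        ("0xaaa", "0xbbb", 1.5),
        ("0xaaa", "0xccc", 0.2),
        ("0xbbb", "0xaaa", 3.0),
    ],
)

sql = translate_to_sql("How many transfers did each address send?")
rows = run_readonly(conn, sql)
print(rows)
```

Because the database executes the query, the numbers in the answer come from the data itself rather than from the model, which is the main reason to prefer this route for structured data.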
Since LLM has logic and reasoning ability, it can be used to filter some malicious transactions. It acts as a firewall for smart contracts. Here is a specific example of blocking out bot activities.
After the address is input, the LLM can get all its transaction data through a third-party plugin. Then the LLM analyzes these transaction records and outputs the likelihood that the address is a bot. This functionality can be plugged into Dapps that don’t welcome bots, like NFT sales.
Here is a simple example through ChatGPT. ChatGPT retrieves account transaction records through the Web3 User Activity plugin developed by RSS3. Then ChatGPT analyzes these transaction records and outputs the likelihood that the account is a bot.
If we input more transaction records and fine-tune the LLM on bots-related datasets, we can get more accurate results. Here is an example workflow for such applications. We can also add caching and a database layer to improve response speed and lower costs.
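The caching layer mentioned here could look roughly like the sketch below. It is an illustration only: `SAMPLE_TXS`, `fetch_transactions`, and `llm_bot_likelihood` are hypothetical names, and the scoring function is a crude timing heuristic used as a placeholder where a real application would call the (possibly fine-tuned) LLM.

```python
from functools import lru_cache

# Counts how often the expensive "LLM" step actually runs.
CALLS = {"llm": 0}

# Hypothetical sample data: address -> transaction timestamps in seconds.
SAMPLE_TXS = {
    "0xbot": [0, 10, 20, 30, 40, 50],
    "0xhuman": [0, 4000, 4100, 90000, 200000],
}


def fetch_transactions(address):
    """Stand-in for the third-party plugin that returns transaction records."""
    return tuple(SAMPLE_TXS.get(address, ()))


def llm_bot_likelihood(timestamps):
    """Placeholder for the LLM call: very regular intervals look bot-like."""
    CALLS["llm"] += 1
    if len(timestamps) < 3:
        return 0.5  # not enough data to judge
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = sum(gaps) / len(gaps)
    if mean_gap == 0:
        return 1.0
    spread = (max(gaps) - min(gaps)) / mean_gap
    return 1.0 if spread < 0.1 else 0.1


@lru_cache(maxsize=1024)
def bot_likelihood(address):
    """Cached entry point: repeated lookups for an address skip the LLM step."""
    return llm_bot_likelihood(fetch_transactions(address))


first = bot_likelihood("0xbot")
second = bot_likelihood("0xbot")  # served from the cache
human = bot_likelihood("0xhuman")
print(first, second, human, CALLS["llm"])
```

An in-process cache like this forgets everything on restart and never refreshes an address with new activity; a database layer with an expiry time would address both points.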
LLM is largely used in development to help developers write code faster and better. Based on developers’ instructions, LLM can generate code for them. Currently, developers still need to give detailed instructions for LLM. It is hard for LLM to automatically generate the whole project for developers.
Some popular LLMs for code are StarCoder, StarCoder+, CodeT5, LTM, DIDACT, WizardCoder, FalCoder-7B, and MPT-30B.
All of them can be used to write smart contracts, but they may not have been specifically trained on smart contract data, so there is still room for improvement.
Currently, there is only one smart-contract-related dataset available on HuggingFace: slither-audited-smart-contracts. It contains 113k smart contracts and can be used for text classification, text generation, and vulnerability detection.
Compared to copilots, automatic code generation would be more promising. Automatic code generation is suitable for smart contracts since smart contracts are relatively short and simple. There are several ways an LLM can help developers write code automatically in the blockchain space.
First, an LLM can generate tests for well-written smart contracts. There are already projects like Codium that can automatically generate tests for existing projects. Codium currently supports JS and TS. Codium first reads the codebase and analyzes each function, docstring, and comment. It then writes its code analysis back to the files as comments and outputs a test plan. Users can select the tests they like, and Codium will generate the selected tests.
Other copilot software also supports generating tests for selected functions.
We can follow similar steps to reproduce similar functionalities on GPT-4.
We ask for code analysis first because we would like the LLM to spend more time on this task. An LLM does not know which tasks are hard; it spends the same amount of computation on each token, which may lead to inaccurate results on complex tasks. By asking for code analysis first, we make the LLM spend more tokens, and therefore more computation, thinking about the task before it produces the final output, which gives higher-quality results. This method is also called Chain of Thought.
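The analysis-first idea can be expressed as a small prompt chain. The sketch below is an assumption-laden illustration: the prompt wording is made up, and `fake_llm` is a stub that only records what it receives, standing in for a real model call.

```python
def build_prompts(contract_source):
    """Two prompts: ask for an analysis first, then for tests based on it."""
    analysis = (
        "Step 1. Analyze the following smart contract. For each function, "
        "describe its purpose, inputs, state changes, and edge cases.\n\n"
        + contract_source
    )
    tests = (
        "Step 2. Using the analysis above, write unit tests that cover "
        "each function and each edge case you listed."
    )
    return [analysis, tests]


def run_chain(llm, contract_source):
    """Send the prompts in order, feeding each answer into the next input."""
    history = ""
    answers = []
    for prompt in build_prompts(contract_source):
        full_input = history + prompt
        answer = llm(full_input)
        answers.append(answer)
        history = full_input + "\n\n" + answer + "\n\n"
    return answers


# Stub model (assumption): records its inputs and returns a numbered placeholder.
seen_inputs = []


def fake_llm(text):
    seen_inputs.append(text)
    return f"<model answer {len(seen_inputs)}>"


answers = run_chain(fake_llm, "contract Counter { uint256 public count; }")
print(answers)
```

The point of the structure is that the second call sees the model's own analysis in its input, so the tests are conditioned on reasoning that has already been written out.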
To make it work for longer smart contracts, we need an LLM with a larger context or some engineering design to preserve the memory.
Second, we can use an LLM to automatically generate auxiliary scripts, such as deployment scripts.
Deployment scripts can reduce potential errors during the manual deployment process. The idea is quite similar to automatically generating tests.
In the bull market, there are lots of fork projects. These fork project teams change a little bit of code from the original codebase. This would be a great use case for LLM. LLM can help developers automatically modify code based on teams’ needs. Normally only specific parts of code need to be changed. This would be relatively easy for LLM to do.
What if we go one step further? Can an LLM automatically generate smart contracts for developers based on their needs? Smart contracts are relatively short and simple compared to other complicated software written in JS, Rust, or Python, and they do not depend on many external libraries. It is relatively easy for an LLM to figure out how to write them.
We have already seen some progress in automatic code generation. GPT-engineer is one of the pioneers. The user states their requirements and resolves any questions the LLM raises, and then it starts to code. The generated code also includes a script that can run the whole project, so GPT-engineer can automatically start the project for developers.
The user inputs their requirements. GPT-engineer analyzes them and asks for clarifications. After collecting all the necessary information, GPT-engineer first outputs the design of the program, including the core classes, functions, and methods necessary for the task. Then GPT-engineer generates the code for each file.
With prompts like this, we can generate a counter smart contract.
The smart contract compiles and works as expected.
Because GPT-engineer was originally designed for Python, there are some issues when generating code related to Hardhat. GPT-engineer does not know the latest version of Hardhat and sometimes generates outdated tests and deployment scripts.
What if our code has bugs? We can feed the codebase and the console error log to the LLM, and the LLM can keep modifying the code until it runs successfully. We have seen projects like Flo working in this direction. Currently, Flo only supports JS.
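The feedback loop described here can be sketched as follows. This is a simplified illustration, not how Flo works internally: it runs small Python snippets with `exec` as a stand-in for compiling or running a real project, and `fake_llm_fix` is a hypothetical stub that knows one hard-coded repair, where a real tool would send the code and error log to an LLM.

```python
import traceback


def run_snippet(source):
    """Run the code; return None on success, otherwise the error log text."""
    try:
        exec(compile(source, "<generated>", "exec"), {})
        return None
    except Exception:
        return traceback.format_exc()


def repair_loop(source, llm_fix, max_fixes=3):
    """Run, and on failure ask the model for a fix, until the code runs."""
    fixes = 0
    while True:
        error_log = run_snippet(source)
        if error_log is None:
            return source, fixes
        if fixes == max_fixes:
            raise RuntimeError("code still fails after the allowed number of fixes")
        source = llm_fix(source, error_log)
        fixes += 1


def fake_llm_fix(source, error_log):
    """Stub (assumption): a real implementation would prompt an LLM here."""
    if "NameError" in error_log:
        return source.replace("totl", "total")
    return source


buggy = "total = 1 + 2\ndoubled = totl * 2\n"
fixed_source, fixes_used = repair_loop(buggy, fake_llm_fix)
print(fixes_used)
print(fixed_source)
```

The cap on the number of fixes matters in practice: without it, a model that keeps returning broken code would loop forever and keep spending tokens.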
If we would like to improve the accuracy of smart contract generation, we can extend GPT-engineer with some new prompts. We can employ test-driven development and ask the LLM to ensure that the program passes certain tests, which lets us better constrain the generated programs.
Since LLMs understand code well, we can use them to write developer docs. An LLM can also track code changes to keep the documents up to date. We discussed this approach at the end of our last research piece, Exploring Developer Experience on ZKRUs: An In-Depth Analysis.
Reading documents is legacy behavior. Chatting with code is a new approach. Users can ask any questions about the code, and the LLM will answer users’ questions. LLM can interpret code for developers to help them quickly understand the on-chain smart contracts. LLM can also help people with no code experience understand smart contracts.
We already see this trend in the Web2 world. Lots of copilot tools offer code interpretation.
Etherscan has also revealed a new feature that lets users chat with code, leveraging the power of LLMs.
Given the ability to understand code, what about auditing? In the experiments of the paper Do You Still Need a Manual Smart Contract Audit?, LLMs achieve a 40% hit rate in identifying vulnerabilities, outperforming a random baseline. However, they also have a high false positive rate. The authors suggest that proper prompting is the key.
Besides prompting, the following factors also limit performance:
These problems are not that hard to solve. Big auditing firms have thousands of audit reports that can be used to fine-tune LLMs. LLMs with large token limits are coming out: Claude has a 100k-token limit, and the newly released LTM-1 has a striking 5M-token limit. With progress on both problems, LLMs will probably get better at identifying bugs. LLMs can assist auditors and speed up the auditing process. This may happen step by step. A possible development track would be:
Governance is a crucial part of the community. Community members have the right to vote for their favorite proposals. These proposals will shape the future of the products.
Important proposals come with a lot of background information and community debate. It is hard for all community members to absorb this context before voting. An LLM can help community members quickly understand the impact of their choices and help them vote.
Another potential application is Q&A bots. We already see Q&A bots based on project documents. We can go one step further and build a larger knowledge base by plugging in different media and sources, like presentations, podcasts, GitHub, Discord chats, and Twitter Spaces. The Q&A bots will not only live in the document search bar; they can also appear on Discord to immediately support community members, or on Twitter to spread the project's vision and answer questions.
AwesomeQA is currently working in this direction. It implements three functions:
The current obstacle for Q&A bots is how to accurately retrieve the relevant context from the vector database and provide it to the LLM. For example, if users ask about multiple characteristics of multiple elements with filters, the bots may not be able to retrieve the relevant contexts from the vector database.
Updating the vector database is another problem. The current solutions are to rebuild the vector database or to update it by namespace. Appending a namespace to the embeddings is similar to attaching tags to the data. It helps developers easily find and update the corresponding embeddings later.
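A namespace-based update can be illustrated with a small in-memory structure. The class and method names below are hypothetical and do not correspond to any particular vector database's API; the sketch only shows why tagging embeddings by source makes a partial refresh possible.

```python
class NamespacedVectorStore:
    """Hypothetical in-memory store: namespace -> {document id: embedding}."""

    def __init__(self):
        self._data = {}

    def upsert(self, namespace, doc_id, vector):
        """Insert or overwrite a single embedding inside a namespace."""
        self._data.setdefault(namespace, {})[doc_id] = list(vector)

    def replace_namespace(self, namespace, docs):
        """Swap out every embedding in one namespace, leaving the others alone."""
        self._data[namespace] = {doc_id: list(vec) for doc_id, vec in docs.items()}

    def ids(self, namespace):
        """Sorted document ids currently stored under a namespace."""
        return sorted(self._data.get(namespace, {}))


store = NamespacedVectorStore()
store.upsert("docs", "intro", [0.1, 0.9])
store.upsert("docs", "faq", [0.4, 0.6])
store.upsert("discord", "msg-1", [0.7, 0.3])

# The documentation changed: refresh only the "docs" namespace.
store.replace_namespace("docs", {"intro": [0.2, 0.8], "changelog": [0.5, 0.5]})

print(store.ids("docs"))
print(store.ids("discord"))
```

Compared with rebuilding the whole database, only the embeddings for the changed source have to be recomputed, which keeps both cost and downtime proportional to what actually changed.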
Lots of things happen every day. Markets change a lot. KOLs tweet new ideas and thoughts. Newsletters and product emails flow into your mailbox. An LLM can select the most important ideas and news for you. It can also summarize content to shorten your reading time and help you keep up with the market.
minmax.ai is working on the news. They provide a summary of recent news on a specific topic and also sentiments about the topic.
Boring Report removes sensationalism from the news and focuses on the essential details, helping readers make sound decisions.
Robo-advising is one of the hottest areas, and LLMs can boost its adoption. Given stock information as context, an LLM can provide trade suggestions and help users manage their portfolios.
Projects like Numer.ai use AI to predict the market and operate the fund. There are also portfolios managed by LLM. Users can follow these portfolios on Robinhood.
Composer combines trading algorithms with AI. The AI builds specific trading strategies based on users’ insights and then automatically backtests them. If users are satisfied with a strategy, Composer can automatically execute it for them.
Analyzing projects involves reading lots of material and writing long research papers. LLMs can read and write short paragraphs. If we can extend this ability to longer texts, does it mean an LLM can output some project research? Probably yes. We can input the whitepaper, docs, or event presentations for the LLM to analyze the project and its founders. Because of the token limit, we can have it write the outline of the paper first and then update each section based on the sources it gets.
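One way to work within the token limit is sketched below: the source material is split into chunks that fit a budget, and each section of the outline is drafted by feeding the chunks in one at a time. This is an assumption-heavy illustration. Token counts are approximated by whitespace-separated words, and `fake_llm` is a stub that only records what it was given, where a real pipeline would call a model to revise the draft.

```python
def count_tokens(text):
    """Rough approximation (assumption): one token per whitespace-separated word."""
    return len(text.split())


def chunk_paragraphs(paragraphs, max_tokens):
    """Greedily pack paragraphs into chunks that stay within the token budget.

    A single paragraph larger than the budget still becomes its own chunk.
    """
    chunks, current, used = [], [], 0
    for paragraph in paragraphs:
        n = count_tokens(paragraph)
        if current and used + n > max_tokens:
            chunks.append("\n".join(current))
            current, used = [], 0
        current.append(paragraph)
        used += n
    if current:
        chunks.append("\n".join(current))
    return chunks


def write_report(outline, paragraphs, llm, max_tokens):
    """Draft each outline section by updating it with one source chunk at a time."""
    chunks = chunk_paragraphs(paragraphs, max_tokens)
    report = {}
    for section in outline:
        draft = ""
        for chunk in chunks:
            draft = llm(section, draft, chunk)
        report[section] = draft
    return report


def fake_llm(section, draft, chunk):
    """Stub (assumption): a real implementation would ask an LLM to revise the draft."""
    return draft + f"[{section}: read {count_tokens(chunk)} tokens]"


sources = ["a b c", "d e", "f g h i"]
chunks = chunk_paragraphs(sources, max_tokens=5)
report = write_report(["Team", "Token"], sources, fake_llm, max_tokens=5)
print(chunks)
print(report)
```

Writing the outline first keeps each model call focused on one section, so no single call needs the whole set of sources and the whole report in its context at once.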
Projects like BabyAGI are already working in this direction. Here is an example output of BlockAGI, which is a variant of BabyAGI.
An LLM can also analyze founders’ personalities based on their tweets and public speeches. For example, Tweet analyzer takes recent tweets and uses an LLM to analyze the author's personality.
These are 8 specific directions in which LLMs can help the blockchain community in the near future.
LLMs can benefit everyone in the crypto space, including project owners, analysts, and engineers. Founders can use LLMs to automate tasks like documentation and Q&A. Engineers can use LLMs to write code faster and more safely. Analysts can research projects more easily.
In the long term, we also see potential opportunities to apply LLMs in GameFi. An LLM can generate more interesting tasks and play different roles inside games. The game world will feel more real and interesting. NPCs will react dynamically to players’ actions, and tasks will have more endings depending on how players solve them.
LLMs can be integrated into existing projects, but they also open opportunities for newcomers. For example, there are already top players in the on-chain data analytics sector, and Dune can integrate LLMs to improve its user experience. Newcomers, however, can place the LLM at the center of their product design. These AI-first products may bring new competition to the on-chain data analytics sector.
The Web2 and Web3 worlds share some use cases for LLM applications, but they may implement the products differently, because the data used in the Web3 world is different from the data in the Web2 world. The knowledge base for the LLM may also differ between Web2 and Web3. Web3 data revolves around blockchains, token prices, tweets, projects, and research pieces. Web2 and Web3 may therefore need different LLMs to serve end users.
Due to the boom of LLMs, we see increasing popularity of AIxBlockchain. However, much of AIxBlockchain is not practical in the short term. Blockchains and ZK cannot provide the large computation power needed to train complex models or run inference on them, and small models are not powerful enough to solve complex tasks. The more practical approach is applying LLMs in the blockchain space. LLMs have made more progress than other AI topics recently, so combining LLMs and blockchain is the more reasonable direction to pursue.
The LLM community is working to raise token limits and improve response accuracy. What is left to the blockchain community are data sources and data pipelines. Cleaned data can be used to fine-tune LLMs to improve accuracy in blockchain contexts. Data pipelines can integrate more blockchain-related apps into LLMs and support more crypto-specific agents.
Special thanks to IOSG