Social and news media play a significant role in the dissemination of information related to crypto-assets. In a nascent financial market without established disclosure mechanisms, many of the relevant events about crypto-assets are distributed first through news and social media channels and, not surprisingly, the market remains highly susceptible to those channels.
The result is an ecosystem in which social and news media become a first-class source of intelligence about the behavior of crypto-assets. Unfortunately, most of the techniques used to analyze social and news media feeds for crypto-assets remain overly simplistic, producing ineffective and often misleading results.
In the last few months, our team at IntoTheBlock started several research efforts focused on producing a more sophisticated analysis of social and news media for crypto-assets. Today, I would like to discuss some of the initial deliverables of that work, which are already in the Beta edition of the IntoTheBlock platform. But first, let’s try to make some sense of the challenges of social and news media analysis for crypto-assets.
In order to understand the challenges of creating effective news/social media analysis for crypto-assets, we need to examine the current state of the artificial intelligence (AI) market, in particular a hot space known as deep learning.
Over the last few years, there has been a renaissance in the creation of complex neural networks to tackle cognitive problems in areas such as speech, vision and, of course, text. Deep learning is a subdiscipline of the machine learning space, and it powers text analytics branches such as natural language understanding (NLU) and text mining, which are the foundation of news/social media analysis.
The advancements in the deep learning space have lowered the entry point to create basic text intelligence models. Today, a developer without any meaningful knowledge of machine learning can create a basic sentiment analysis model by calling an API on a third-party platform like Microsoft Cognitive Services or Watson APIs.
This has been precisely the approach followed by most analytic solutions in the crypto-asset space.
While unquestionably simple, those models fail to produce any meaningful intelligence given that they have no contextual information about the data they are analyzing. In deep learning, this phenomenon is known as the accuracy-simplicity dilemma.
The Accuracy-Simplicity Dilemma in Deep Learning
Imagine that you travel to a country without any knowledge of the native language, its history or its socio-economic climate. To be prudent, you take a dictionary of the native language and a few lessons on Duolingo to learn some common phrases.
With those tools you should be able to establish basic conversations like asking for directions or even ordering a meal at a restaurant. However, your lack of proficiency in the language and limited knowledge of the country will prevent you from engaging in a dialog about local politics or art.
This simple metaphor encapsulates the core principles of the accuracy-simplicity dilemma in deep learning systems. Simple deep learning models are relatively easy to develop, but they often hit accuracy limits when applied to complex datasets.
Sophisticated models are incredibly hard to build and interpret, but they yield better results when applied in complex environments. This dynamic is reflected perfectly in the evolution of machine learning algorithms.
In the case of social/news media analysis for crypto-assets, using a third-party NLU API will produce results incredibly fast, but it is likely to fail to reveal meaningful intelligence as it hasn’t been trained in the specific context of the crypto markets.
Take the phrase “The upcoming Bitcoin halving could be impactful for the future of the crypto market”. A general-purpose sentiment analysis API is likely to produce a neutral score (around 0.5) when analyzing that sentence.
However, a model trained in the specifics of crypto-assets would understand the context of halving in the history of Bitcoin, and it is likely to yield a positive score.
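To make the halving example concrete, here is a minimal sketch using a toy lexicon-based scorer rather than a trained deep learning model. Both lexicons, their weights and the `sentiment_score` function are hypothetical and exist only for illustration; the point is simply that a vocabulary with no notion of "halving" falls back to neutral, while a domain-aware one does not.

```python
import re

# Hypothetical toy lexicons; the weights are illustrative, not learned from data.
GENERIC_LEXICON = {"great": 1.0, "good": 0.5, "bad": -0.5, "terrible": -1.0}
# Assumption: a crypto-aware vocabulary treats "halving" as mildly positive.
CRYPTO_LEXICON = {**GENERIC_LEXICON, "halving": 0.6, "hodl": 0.4, "rekt": -0.8}


def sentiment_score(text, lexicon):
    """Return a score in [0, 1], where 0.5 means neutral."""
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = [lexicon[t] for t in tokens if t in lexicon]
    if not hits:
        return 0.5  # no known terms: fall back to neutral
    mean = sum(hits) / len(hits)  # mean weight in [-1, 1]
    return 0.5 + mean / 2  # rescale to [0, 1]


phrase = ("The upcoming Bitcoin halving could be impactful "
          "for the future of the crypto market")

print(sentiment_score(phrase, GENERIC_LEXICON))  # neutral: no term is recognized
print(sentiment_score(phrase, CRYPTO_LEXICON))   # above 0.5: "halving" is recognized
```

A real model would learn this context from crypto-specific training data instead of a hand-written dictionary, but the failure mode of the generic scorer is the same.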
Some Non-Obvious Challenges of Analyzing Social/News Media for Crypto-Assets
Beyond the accuracy-simplicity dilemma, there are a few unique challenges that are relevant when analyzing social/news media for crypto-assets. Here are some of my favorite examples:
1) News is Great for Topic, not Sentiment, Analysis: Sentiment in news media should trend towards neutral. On the other hand, news media is a great source to understand key topics that are relevant in the behavior of crypto-assets.
2) Not All News is Created Equal: In the crypto space, a handful of news media outlets such as CoinDesk, CoinTelegraph or The Block have a disproportionate impact on the behavior of the market. News from those media outlets should be analyzed factoring in their level of influence.
3) Twitter-Telegram are Great for Sentiment, not Topic, Analysis: In contrast with news, Twitter and Telegram are great data sources to understand the sentiment of the markets but become very noisy when trying to analyze topic information.
4) Twitter-Telegram are Full of Biased Information: The discussions on Twitter and Telegram tend to be very passionate and biased towards the viewpoints of specific individuals. Additionally, Twitter and Telegram messages are often poorly written and full of misspellings, which makes them difficult for any model to analyze.
5) No Single Model is Enough: Stop trying to predict price with sentiment or topics alone. No single model is enough to predict the behavior of a crypto-asset. To achieve that, you need combinations of models.
6) Visualizations Matter: A sentiment analysis curve on its own tells you very little, no matter how great the underlying model is. For models that are not clear predictors of price movements, meaningful visualizations are incredibly important.
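Point 2 in the list above, factoring in an outlet's level of influence, can be sketched as a weighted average. The weights below are made up for illustration, as are the names `OUTLET_INFLUENCE` and `weighted_signal`; in practice the influence of each outlet would have to be estimated from data.

```python
# Hypothetical influence weights; real values would need to be estimated.
OUTLET_INFLUENCE = {"CoinDesk": 3.0, "CoinTelegraph": 3.0, "The Block": 3.0}
DEFAULT_INFLUENCE = 1.0  # assumption: any unlisted outlet gets a baseline weight


def weighted_signal(items):
    """Influence-weighted mean of (outlet, score) pairs, or None if empty."""
    total = 0.0
    total_weight = 0.0
    for outlet, score in items:
        weight = OUTLET_INFLUENCE.get(outlet, DEFAULT_INFLUENCE)
        total += weight * score
        total_weight += weight
    return total / total_weight if total_weight else None


# Made-up scores in [0, 1] for two articles about the same asset.
articles = [("CoinDesk", 0.8), ("Some Small Blog", 0.2)]
print(weighted_signal(articles))
```

With these illustrative weights the aggregate leans towards the more influential outlet (roughly 0.65) instead of the plain average of 0.5.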
Meaningful Social/News Media Analysis for Crypto-Assets
Based on some of the principles discussed in the previous section, the IntoTheBlock team started creating several deep learning models that are trained on the specifics of the crypto market and focus on extracting relevant information about crypto-assets. While this is still a work in progress, the initial results are certainly encouraging.
Custom Topic Analysis for News Media
The IntoTheBlock News Topic Analysis model leverages a stream of news from the most influential news media outlets in the crypto space. The model itself is based on a convolutional neural network that has been previously trained on relevant topics in crypto-assets.
Every few hours, the topic analysis model reviews the recent news on those influential outlets and extracts the relevant topics. The results are plotted in relation to a price curve, which provides the user with an intuitive visualization to understand the relevance of specific topics.
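The general shape of that pipeline, grouping topics into time windows and lining them up with price, can be sketched as follows. This is not the IntoTheBlock implementation: a hypothetical keyword map stands in for the trained convolutional network, and the headlines, prices and function names are all invented for the example.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical keyword-to-topic map, standing in for a trained topic classifier.
TOPIC_KEYWORDS = {"halving": "halving", "etf": "regulation",
                  "sec": "regulation", "hack": "security"}


def topics_in(headline):
    """Return the set of topics whose keywords appear in the headline."""
    words = re.findall(r"[a-z]+", headline.lower())
    return {TOPIC_KEYWORDS[w] for w in words if w in TOPIC_KEYWORDS}


def topics_by_window(news, window_hours=4):
    """Group (timestamp, headline) pairs into fixed windows of topic counts."""
    buckets = defaultdict(Counter)
    for ts, headline in news:
        # Floor the timestamp to the start of its window (e.g. 09:30 -> 08:00).
        start = ts.replace(hour=ts.hour - ts.hour % window_hours,
                           minute=0, second=0, microsecond=0)
        topics = topics_in(headline)
        if topics:
            buckets[start].update(topics)
    return dict(buckets)


def align_with_price(buckets, prices):
    """Return (window_start, price, [(topic, count), ...]) rows for plotting."""
    return [(start, prices.get(start), counts.most_common())
            for start, counts in sorted(buckets.items())]


# Made-up headlines and prices, for illustration only.
news = [
    (datetime(2020, 5, 1, 9, 30), "Bitcoin halving is days away"),
    (datetime(2020, 5, 1, 10, 15), "Miners prepare for the halving"),
    (datetime(2020, 5, 1, 13, 0), "Exchange hack drains wallets"),
]
prices = {datetime(2020, 5, 1, 8): 8800.0, datetime(2020, 5, 1, 12): 8900.0}

for row in align_with_price(topics_by_window(news), prices):
    print(row)
```

Each row carries a window start, the price for that window and the topics seen in it, which is the minimum a charting layer needs to annotate a price curve with topics.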
Crypto-Optimized Sentiment Analysis for Social Media Channels
Sentiment analysis is a favorite of crypto market researchers and, as a result, we couldn’t ignore it. As explained previously, sentiment analysis methods are more effective when applied to social media channels such as Twitter or Telegram than when used on news. The IntoTheBlock Social Sentiment Analysis model is based on a recurrent neural network trained on specific terms in the crypto space. The result is a fun and interactive visualization that allows users to judge sentiment movements in relation to price.
This is just the initial step in IntoTheBlock’s social/news analysis journey. We are already exploring new ideas that use this first release as a foundation.
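Because social media sentiment is noisy, one common preparation step before plotting it against price is to smooth the series. The sketch below uses a simple trailing moving average and made-up data; the function names are hypothetical, and this is an illustration of the idea, not a description of how the IntoTheBlock visualization is built.

```python
def moving_average(values, window=3):
    """Trailing moving average; early points average over what is available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def pair_with_price(sentiment, prices, window=3):
    """Return (smoothed_sentiment, price) pairs for a two-axis chart."""
    if len(sentiment) != len(prices):
        raise ValueError("series must have the same length")
    return list(zip(moving_average(sentiment, window), prices))


# Made-up hourly values, for illustration only.
hourly_sentiment = [0.25, 0.75, 0.5, 1.0]  # scores in [0, 1]
hourly_price = [100.0, 101.0, 103.0, 102.0]

for point in pair_with_price(hourly_sentiment, hourly_price):
    print(point)
```

The window size is a trade-off: a larger window hides more noise but also delays the moment a genuine shift in sentiment becomes visible next to the price curve.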