Data is the new oil, except in one significant way. The value of oil is tied to the market laws of supply and demand: when oil is scarce, prices rise. Data, by contrast, becomes less scarce every day.
In the last two years alone, 90% of existing data was generated. Social media, streaming media, the Internet of Things — as we become more connected, we create more data. Unlike oil, the biggest challenge in managing data is not maintaining a dwindling supply. Rather, it’s figuring out how to handle and harness the 2.5 quintillion bytes of new data generated each and every day.
Big Data Presents Big Challenges
The sheer volume of new data creation presents the most significant challenge, and it explains the shift to cloud computing. Cloud providers leverage economies of scale by buying up vast quantities of storage and processing power, making it more cost-effective for non-tech companies to rent capacity than to maintain their own servers.
Now, cloud computing companies are racing to keep up with customer demand. "Mega data centers" of more than a million square feet are becoming commonplace. These buildings don't just take up space; they require vast amounts of energy to store and process data at scale.
There are other challenges. Centralized servers are attractive targets for hacks and attacks, given the high value placed on consumer data. Providers must also ensure that data remains accessible to their clients at all times.
Outsourcing these challenges makes things easier for the end user of the data, but cloud computing providers aren’t able to ensure the data they store is accurate. This particular issue sits firmly with the data owner.
What Blockchain Brings to Big Data
Several features of blockchain technology lend themselves well to solving the issues of handling big data. Firstly, blockchain is predicated on decentralization. Using a decentralized network to manage data storage and processing allows potentially infinite scaling, as any machine with computing power can contribute to the network.
Decentralization also offers a layer of protection against external attacks. Compromising a decentralized network requires a takeover of 51% or more of the network's hashing power, making it far less vulnerable to hacks than data stored on central servers.
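The 51% figure comes from proof-of-work consensus: an attacker controlling less than half the network's hashing power falls ever further behind the honest chain, so the chance of rewriting history shrinks rapidly with each confirmation. A minimal sketch of the catch-up probability from the Bitcoin whitepaper (the function name here is our own):

```python
def catch_up_probability(q: float, z: int) -> float:
    """Probability that an attacker controlling fraction q of total
    hashing power ever overtakes a block buried z confirmations deep.

    From Nakamoto (2008): if q < p (the honest fraction), the chance
    of catching up from z blocks behind is (q/p)^z; otherwise it is 1.
    """
    p = 1 - q
    if q >= p:
        return 1.0  # majority attacker eventually wins
    return (q / p) ** z


# With 10% of the hashing power, rewriting 6 confirmations is
# vanishingly unlikely; with 51%, success is guaranteed eventually.
print(catch_up_probability(0.10, 6))
print(catch_up_probability(0.51, 6))
```

The sharp drop as `z` grows is why exchanges wait for several confirmations before treating a transaction as final.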
Blockchain also operates on consensus methods for adding transactions to the ledger. Applied to data, this means the network could play a role in validating the authenticity and provenance of data, reducing inaccuracies. And because a blockchain is append-only and immutable, data stored on it is practically impossible for anyone to alter or manipulate retroactively.
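The immutability claim rests on hash chaining: each block commits to the hash of the block before it, so changing any historical record invalidates every later link. A minimal, illustrative sketch (the block structure and function names are ours, not any specific chain's format):

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    # Hash the block's full contents, including the previous block's hash.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def make_block(data: str, prev_hash: str) -> dict:
    return {"data": data, "prev_hash": prev_hash}


def verify(chain: list) -> bool:
    # Each block must reference the hash of the block before it.
    for prev, curr in zip(chain, chain[1:]):
        if curr["prev_hash"] != block_hash(prev):
            return False
    return True


genesis = make_block("sensor reading: 21.5C", "0" * 64)
b1 = make_block("sensor reading: 22.1C", block_hash(genesis))
b2 = make_block("sensor reading: 21.9C", block_hash(b1))
chain = [genesis, b1, b2]

print(verify(chain))                       # True: links all match
genesis["data"] = "sensor reading: 99.9C"  # tamper with history
print(verify(chain))                       # False: b1's link is now broken
```

Tampering with any block silently breaks the chain of hashes after it, which is what lets the network detect and reject altered data.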
Finally, one of the biggest threats facing the use of big data is regulation by governments concerned about the privacy and data security of their citizens. Blockchain doesn’t overcome this by itself, but deploying key encryption means that individuals could take more control over how their data is used and passed on to third parties.
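The idea behind key-based control is that encrypted data is useless without the key, so whoever holds the key decides who gets access. As a toy illustration only, here is a one-time pad, the simplest correct symmetric cipher; real systems would use an established scheme such as AES, but the access-control logic is the same:

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    # XOR each byte of the message with the corresponding key byte.
    return bytes(a ^ b for a, b in zip(data, key))


record = b"blood type: O-, allergies: none"
key = secrets.token_bytes(len(record))  # random key held by the data owner
ciphertext = xor_bytes(record, key)

# A third party holding only the ciphertext learns nothing about the
# record; the owner grants access by sharing the key, and only then
# can the data be recovered.
recovered = xor_bytes(ciphertext, key)
print(recovered == record)  # True
```

Because decryption requires the owner's key, the owner can choose exactly which third parties can read the data, which is the control the paragraph above describes.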
These features underpin forecasts that blockchain could account for 20% of the big data market by 2030.
Applying Blockchain to Big Data
Although there are still plenty of developments to come in this area, several first-mover projects are already making strides in converging blockchain and big data for various use cases.
Endor has developed a decentralized platform which uses big data to power its engine for predictive analytics — insights that businesses can use to gain an edge over their competitors. For example, a retailer could ask which consumers are likely to buy a product launched a week ago, who will convert to premium, or which consumers are ideal targets for a particular new product, and the algorithm will deliver a list of the requested consumers, predicting their future behavior.
The Endor protocol has been developed based on extensive research by MIT into a discipline called “social physics.” It takes big data from a vast array of sources and uses it to create fast and accurate predictions without involving any data scientist or research analyst. Therefore, Endor has the potential to level the playing field for small businesses that are currently priced out of the market for big data.
Storage and Processing
Data storage and processing are two of the most fundamental challenges facing big data today. Several startups in the blockchain space are aiming to address these challenges by leveraging the vast amount of idle computing resources available in homes and offices around the world.
For example, Storj offers a peer-to-peer file storage network. By using encryption and sharding, it ensures that no single machine on the network can access a user's files. Golem provides a similar solution, but for processing power. Effectively a decentralized supercomputer, Golem lets anyone rent out spare GPU or CPU power for ad-hoc computations at a lower cost than providers such as AWS.
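Sharding means splitting a file into pieces that are distributed across many machines, with a hash of each piece kept so corruption can be detected on retrieval. A simplified sketch of the split-and-verify step (Storj additionally encrypts each shard client-side, which this illustration omits; all names here are ours):

```python
import hashlib


def shard(data: bytes, n: int):
    """Split data into n roughly equal shards plus a manifest of hashes."""
    size = -(-len(data) // n)  # ceiling division
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    manifest = [hashlib.sha256(p).hexdigest() for p in pieces]
    return pieces, manifest


def reassemble(pieces, manifest) -> bytes:
    # Verify every shard against the manifest before joining.
    for piece, digest in zip(pieces, manifest):
        if hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError("corrupt shard detected")
    return b"".join(pieces)


data = b"quarterly sales figures, 2.4 GB in the real world"
pieces, manifest = shard(data, 4)
print(reassemble(pieces, manifest) == data)  # True
```

No single shard reveals the whole file, and the hash manifest lets the owner confirm that untrusted storage nodes returned the data unmodified.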
Decentralized Artificial Intelligence
Artificial intelligence (AI) algorithms are data-hungry, requiring huge quantities of data to build the patterns and recognition algorithms that power a machine’s intelligence. SingularityNET aims to create a global marketplace for AI algorithms, intelligence, and services.
The principle is that by decentralizing AI, all learning is shared across the entire network. This means that every machine or algorithm potentially has access to all the data, information, and intelligence available in the network. For the first time, AI machines can learn from one another, rather than only from the data sources fed in by their creators.
Although data as “the new oil” is a poor economic analogy, it does have some value in illustrating how data is fueling these use cases and emerging technologies. If these new technologies are the vehicle, then blockchain is the foundational infrastructure. It’s building the roads and rails to enable the smooth and scalable handling of the data avalanche, well into the future.