The single innovation of the last century that has probably had the most significant impact on our lives today is the internet.
It started as a decentralized ecosystem. Open protocols like TCP/IP and SMTP made it possible to build different kinds of applications on top of the internet, such as the World Wide Web, email services, and messaging. However, the internet as we know it today is largely centralized, and companies are investing heavily in huge server farms that hold all our data and information.
Too much ‘centralization’ is slowly killing the online ecosystem
Centralization has its own unique benefits, which include:
- Higher speeds
- Low latency
- Higher availability
- Higher throughput
But these benefits come at the cost of severe drawbacks such as data hacks and security breaches, censorship, and lack of control over your data, to name a few. If you observe carefully, the internet is dominated by a few technology companies, the ‘Big Tech’. In fact, according to a blog post published by Mashable, just ten big companies dominate it. Too much centralization also means that governments can ban your access to any application, leaving you with no other options.
Hacks, censorship, and blocking are widespread in the centralized system
One recent example comes from Turkey, where the government banned Wikipedia in 2017, claiming it was a ‘threat to their national security’. China has blocked access to popular social media and search engine platforms and replaced them with applications that come with extensive government surveillance, content blocking, and censorship. So the question arises: how can we make the internet decentralized again, with no censorship and more public control? It is certainly possible, as more and more advances are being made in the ‘decentralized cloud storage’ space, which we will discuss in this article.
What is Decentralization, and how does it apply to cloud storage?
Decentralization, in terms of technology, means that the system doesn’t rely on a central authority and doesn’t have a single point of failure. In more technical terms, decentralization is a subset of distributed architecture where decision making is performed independently by all the participating nodes instead of relying on a single node. Decentralization has been around for many years, and it has more to do with governance, decision making, and control.
One of the earliest examples of a decentralized system is the internet itself, where websites were hosted on individual PCs. It was followed by Napster and BitTorrent, which laid the foundation for peer-to-peer (p2p) file sharing. The BitTorrent protocol became the most famous and widely adopted, and it is still used today in a variety of applications.
When we talk about cloud storage, ‘decentralized cloud storage’ means that you store your data not on one single server or location, but on many different nodes spread across multiple locations. These nodes are independent of each other, and each has full authority over its own decisions. It is quite similar to the BitTorrent protocol, where users host files on their local storage and act as ‘seeders’ (sharing chunks of files with other users who want to retrieve them), but there are some fundamental differences.
Decentralized cloud storage is made possible by a new protocol for the distributed web named IPFS (InterPlanetary File System). In the next part, we will dive a little deeper into the IPFS protocol. We will also discuss how it differs from the BitTorrent protocol, which is also built for distributed peer-to-peer (p2p) file sharing over the internet.
IPFS and how it builds the foundation for decentralized cloud storage
IPFS (InterPlanetary File System) is a protocol developed by Protocol Labs for the distributed web of the future. It aims to challenge the traditional HTTP protocol by building a more distributed and decentralized network. Both HTTP and IPFS are hypermedia protocols built for the web, to transfer data over the internet. However, there are significant differences between the two. In fact, IPFS aims to replace HTTP as the default protocol of the internet.
Instead of a single server, IPFS works on a huge swarm of nodes that store different blocks of data, and users accessing the network can retrieve this data from the nearest node.
Below is a brief explanation of what happens to files on the IPFS network:
- The file is divided into chunks of data called blocks. Each block is given a unique hash.
- IPFS deduplicates content: identical blocks produce the same hash, so they are stored only once instead of as redundant copies.
- Every node participating in the IPFS network stores the content with its hash and some indexing information.
- When a user wants to retrieve the file, they ask the network to find the nodes that hold the content behind a particular hash.
- With IPNS, a decentralized naming system, content can be found through a stable name instead of a raw hash, and human-readable names can be layered on top of it.
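To make the steps above more concrete, here is a minimal Python sketch of chunking, hashing, and deduplication. It is an illustration only, not how IPFS is actually implemented: the `BlockStore` class, the tiny `BLOCK_SIZE`, and the use of a plain SHA-256 hex digest as the block identifier are all assumptions made for this example (real IPFS uses much larger blocks and its own identifier format).

```python
import hashlib

# Hypothetical, deliberately tiny block size so the example is easy to follow.
BLOCK_SIZE = 4


class BlockStore:
    """Toy content-addressed store that maps a block's hash to its bytes."""

    def __init__(self):
        self.blocks = {}

    def put(self, block: bytes) -> str:
        # The identifier is derived from the content itself.
        digest = hashlib.sha256(block).hexdigest()
        # Identical blocks share one hash, so they are stored only once.
        self.blocks.setdefault(digest, block)
        return digest

    def add_file(self, data: bytes) -> list:
        # Split the file into fixed-size blocks and return their hashes in order.
        return [
            self.put(data[start:start + BLOCK_SIZE])
            for start in range(0, len(data), BLOCK_SIZE)
        ]

    def get_file(self, hashes: list) -> bytes:
        # Reassemble the file by looking up each block by its hash.
        return b"".join(self.blocks[h] for h in hashes)
```

For example, adding the bytes `abcdabcdefgh` yields three block hashes but only two stored blocks, because the repeated `abcd` block hashes to the same value both times.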
One other significant difference between IPFS and HTTP is how they address the content over the internet. HTTP primarily uses something called ‘location-based addressing’ where you retrieve the content by addressing its location, which is the IP address of the server hosting that piece of content.
On the other hand, IPFS uses ‘content-based addressing’, where you retrieve the content by its name or its unique hash rather than by where it is stored. Because identical content always produces the same hash, IPFS can deduplicate it across the network, and any node that holds the content can serve it. This is what makes content-based addressing more efficient and reliable than traditional location-based addressing.
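The practical consequence of content-based addressing can be sketched in a few lines of Python. The `Node` class and the `content_id` and `fetch` functions below are hypothetical names invented for this illustration, and a plain SHA-256 hex digest stands in for a real IPFS content identifier. The point is only that the requester asks for a hash, does not care which node answers, and can check the answer against the hash it asked for.

```python
import hashlib


def content_id(data: bytes) -> str:
    # Stand-in identifier: a plain SHA-256 digest of the content.
    return hashlib.sha256(data).hexdigest()


class Node:
    """Toy peer that stores content keyed by its hash."""

    def __init__(self, name: str):
        self.name = name
        self.store = {}

    def add(self, data: bytes) -> str:
        cid = content_id(data)
        self.store[cid] = data
        return cid


def fetch(cid: str, nodes: list) -> tuple:
    """Return (data, node name) from the first node whose answer matches cid."""
    for node in nodes:
        data = node.store.get(cid)
        # Re-hash the response: a node cannot pass off different content
        # under the requested identifier.
        if data is not None and content_id(data) == cid:
            return data, node.name
    raise LookupError("no node returned valid content for " + cid)
```

With location-based addressing you would have to trust whatever the server at a given address returns. Here, a node that answers with altered bytes is simply skipped, and the next node holding the genuine content serves the request.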
How does IPFS differ from the BitTorrent protocol?
IPFS sounds very similar to the BitTorrent protocol, as both of them are distributed. However, the two are fundamentally different in many ways. Let’s discuss a few key differences between IPFS and the BitTorrent protocol.
- IPFS is built for the web and aims to replace HTTP, while BitTorrent is built only for peer-to-peer (p2p) file sharing.
- IPFS has deduplication all across the network, which saves a lot of bandwidth and resources. BitTorrent, by contrast, doesn’t have any deduplication, which means that there is heavy redundancy across the network.
- IPFS uses ‘content-based addressing’ to retrieve files, while BitTorrent relies on trackers to locate peers, and those trackers are reached through ‘location-based addressing’, just like regular DNS and HTTP.
- All the data on IPFS is ‘immutable’, just like a blockchain, and it has a built-in versioning system that keeps track of different versions of the same file. The BitTorrent protocol doesn’t have this immutability and versioning system.
- IPFS has the capability of being an offline-first network, which can significantly help during natural disasters or in the developing world. BitTorrent doesn’t have any offline addressing mechanism built in.
- With hashing, content-based addressing, and immutability, IPFS is ‘blockchain ready’. In fact, many blockchain platforms are already using IPFS for distributed file storage. BitTorrent, on the other hand, is best suited for peer-to-peer (p2p) file sharing over the traditional internet model.
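The combination of immutability and versioning mentioned in the list above can be illustrated with a small hash-linked history. This is a sketch under stated assumptions, not IPFS’s actual object format: the `commit` and `history` functions and the JSON record layout are hypothetical and exist only to show why changing a file produces a new identifier instead of overwriting the old one.

```python
import hashlib
import json


def commit(store: dict, content: bytes, parent=None) -> str:
    """Record a new version that points back to its parent version."""
    record = {
        "content": hashlib.sha256(content).hexdigest(),
        "parent": parent,
    }
    # The version identifier is the hash of the record itself, so it
    # depends on both the content and the entire history behind it.
    encoded = json.dumps(record, sort_keys=True).encode("utf-8")
    version_id = hashlib.sha256(encoded).hexdigest()
    store[version_id] = record
    return version_id


def history(store: dict, version_id: str) -> list:
    """Walk the parent links from the given version back to the first one."""
    chain = []
    while version_id is not None:
        chain.append(version_id)
        version_id = store[version_id]["parent"]
    return chain
```

Committing a changed file never modifies an existing record. It adds a new one whose identifier differs, while the older version remains reachable through the `parent` link.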
What about privacy? Is decentralized cloud storage secure?
Blockchains are immutable, for sure. But decentralized file storage raises other concerns: the privacy, security, and integrity of data. Fortunately, these have been taken seriously, and different blockchain platforms address them in their own ways.
Most of the applications that we will discuss in this article use end-to-end encryption and sharding. Before a file is distributed across the decentralized network, it is divided into blocks, and those blocks are encrypted and then distributed among many different nodes. To retrieve the file, you need your private key to decrypt it.
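The flow described above (split, encrypt, distribute, then fetch and decrypt) can be sketched in Python. Please read this as a teaching toy built on heavy assumptions: the `upload` and `download` functions, the `SHARD_SIZE` value, and the manifest layout are hypothetical, a single symmetric key stands in for the private key mentioned above, and the XOR keystream built from SHA-256 is NOT a secure cipher. A real system would use a vetted, authenticated cipher from an audited cryptography library.

```python
import hashlib
import secrets

SHARD_SIZE = 8  # hypothetical shard size, kept tiny for readability


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: SHA-256 over key, nonce, and a counter. Illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so this both encrypts and decrypts.
    stream = keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def upload(data: bytes, key: bytes, nodes: list) -> list:
    """Shard, encrypt, and spread the data over nodes; return a manifest."""
    manifest = []
    for index, start in enumerate(range(0, len(data), SHARD_SIZE)):
        nonce = secrets.token_bytes(12)  # fresh random nonce per shard
        ciphertext = xor_cipher(key, nonce, data[start:start + SHARD_SIZE])
        shard_id = hashlib.sha256(ciphertext).hexdigest()
        node_index = index % len(nodes)  # simple round-robin placement
        nodes[node_index][shard_id] = ciphertext
        manifest.append((node_index, shard_id, nonce))
    return manifest


def download(manifest: list, key: bytes, nodes: list) -> bytes:
    """Fetch every shard, check its hash, decrypt it, and reassemble."""
    parts = []
    for node_index, shard_id, nonce in manifest:
        ciphertext = nodes[node_index][shard_id]
        if hashlib.sha256(ciphertext).hexdigest() != shard_id:
            raise ValueError("shard failed its integrity check")
        parts.append(xor_cipher(key, nonce, ciphertext))
    return b"".join(parts)
```

Each node, modeled here as a plain dictionary, only ever sees encrypted fragments. Without both the manifest and the key, a single node operator cannot reconstruct or read the original file.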
However, this is just a broad view of how secure decentralized cloud storage is. With no central location for your files and with encryption built into the system, decentralized cloud storage might be more secure than the centralized solutions available today.
When it comes to IPFS, there is a problem: why would users offer their local storage to hold chunks of data for the IPFS network? How are they incentivized?
In the next section, we will discuss different decentralized cloud storage solutions, most of which are built on top of a blockchain.