What's the Difference Between IPFS and Ethereum Swarm?

by Laszlo Fazekas, December 25th, 2022

Too Long; Didn't Read

IPFS is the older system (in a good sense). It has many use cases, and it’s well-documented and widely used. There are many centralized IPFS providers, and you can also use Filecoin to store your content. Ethereum Swarm is relatively new and is still under development, but it has some very exciting properties. Anonymous content storage and retrieval, very efficient DHT management, and strong Ethereum compatibility are unique features of this solution.

Storing data on the blockchain is expensive, so if you need to store a large amount of data in the “blockchain style” (in an immutable, permissionless, and distributed way), you have to use an external storage solution. The most common choice is IPFS, but there are other players on the market. One of them is Ethereum Swarm. In this article, I will introduce you to the similarities and differences between Swarm and IPFS to help you choose the right storage solution for your next project.

IPFS (short for InterPlanetary File System) was started in 2014 by Protocol Labs. It is a distributed file system protocol that uses content addressing to uniquely identify each file in a global namespace. IPFS itself has no real incentive system, which is why Protocol Labs created Filecoin, its own storage-centric blockchain, in 2017.

The idea of Ethereum Swarm came from Gavin Wood, one of the founders of Ethereum. In 2015, Viktor Trón and Daniel Nagy took over the project within the Foundation’s Geth team. After a successful ICO, the team continued on its own as an autonomous project supported by the Ethereum Foundation. As of 2022, Swarm is more or less feature-complete and actively developed.

Both systems provide distributed, immutable, content-addressable storage. Each has an incentive system and a cryptocurrency associated with it (for IPFS, this comes through Filecoin), and both are based on libp2p. In what follows, I will compare the two systems step by step.

Storage logic

IPFS is basically a community of storage providers, where nodes publish content that is referenced by its content hash. The list of nodes that store a given hash is kept in a DHT. Retrieving content takes three steps. First, you look up the peer list for the content hash in the DHT. Second, the peer ID is translated to an IP address. Third, you download the content from the given peer at that IP address.
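The three lookup steps can be illustrated with a toy in-memory model. This is only a sketch: the dictionaries, the `publish` and `retrieve` helpers, and the use of a plain SHA-256 hex digest as the content hash are assumptions made for illustration, not the real IPFS API or its identifier format.

```python
import hashlib

# Toy stand-ins for the network (hypothetical, not real IPFS structures).
provider_records = {}  # content hash -> set of peer IDs (the "DHT")
peer_addresses = {}    # peer ID -> network address
peer_storage = {}      # peer ID -> {content hash: content bytes}


def content_hash(data: bytes) -> str:
    # Simplification: a bare SHA-256 hex digest instead of a real content identifier.
    return hashlib.sha256(data).hexdigest()


def publish(peer_id: str, address: str, data: bytes) -> str:
    """Store data on a peer and announce that peer as a provider for its hash."""
    digest = content_hash(data)
    peer_storage.setdefault(peer_id, {})[digest] = data
    peer_addresses[peer_id] = address
    provider_records.setdefault(digest, set()).add(peer_id)
    return digest


def retrieve(digest: str):
    """Return (data, peer_id, address) for the first provider that serves valid data."""
    for peer_id in sorted(provider_records.get(digest, ())):  # step 1: find providers
        address = peer_addresses[peer_id]                     # step 2: peer ID -> address
        data = peer_storage[peer_id].get(digest)              # step 3: "download"
        # Content addressing lets the retriever verify what it received.
        if data is not None and content_hash(data) == digest:
            return data, peer_id, address
    raise KeyError(f"no provider found for {digest}")


digest = publish("peer-A", "10.0.0.1:4001", b"hello swarm and ipfs")
print(retrieve(digest))
```

Note that the DHT here holds only pointers to providers, never the content itself. That is the key difference from the Swarm design.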

Ethereum Swarm follows a different logic: it stores the content itself in the DHT. Swarm calls this system DISC (Distributed Immutable Store for Chunks), and the whole system is designed to make this DHT efficient. For example, when a node chooses other nodes to connect to, it picks peers for every proximity order (i.e., for every order of magnitude of distance from its own address). Thanks to this Kademlia connectivity, finding a chunk is really fast.

When a node wants to retrieve a chunk, it asks its peers. If a peer has the content, it returns it; if not, it asks its own peers, and so on. The content is always stored by nodes that are at a small Kademlia distance from it, and because of the Kademlia connectivity, it is always possible to find a path of logarithmic length. When the storer node gets the retrieval request, it returns the content to the peer that asked for it, which in turn relays it to the peer that asked it, and so on until the content arrives at the origin of the request.

Swarm calls this method forwarding Kademlia, and it provides anonymity. Each node knows only the peer that asked it for the content; nobody knows who originated the request or who originally uploaded the content. This is similar to what the Tor network does to anonymize requests. But anonymity is only one advantage of forwarding Kademlia. It also helps to distribute the content, which I will come back to in the section on the incentive system.
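To make proximity order and forwarding concrete, here is a minimal sketch that assumes toy 8-bit integer addresses and a hand-wired three-node network. The `Node` class and its `retrieve` method are hypothetical names for illustration and are not taken from the Swarm codebase. Proximity order is computed as the number of leading bits two addresses share, and each hop forwards the request to the peer whose address is closest (by XOR distance) to the chunk address.

```python
BITS = 8  # toy address width; real overlay addresses are much longer


def proximity_order(a: int, b: int, bits: int = BITS) -> int:
    """Number of leading bits that two addresses have in common."""
    return bits - (a ^ b).bit_length()


class Node:
    def __init__(self, address: int):
        self.address = address
        self.peers = []   # directly connected nodes
        self.store = {}   # chunk address -> chunk data

    def retrieve(self, chunk_addr: int, hops: int = 0):
        """Return (data, hops). The request carries no originator information."""
        if chunk_addr in self.store:
            return self.store[chunk_addr], hops
        # Forward only to peers that are strictly closer to the chunk than we are.
        own_distance = self.address ^ chunk_addr
        closer = [p for p in self.peers if p.address ^ chunk_addr < own_distance]
        if not closer:
            raise LookupError("no closer peer; chunk not found")
        next_hop = min(closer, key=lambda p: p.address ^ chunk_addr)
        # The answer travels back along the same chain of calls.
        return next_hop.retrieve(chunk_addr, hops + 1)


# Hand-wired example network.
origin, relay, storer = Node(0b00010000), Node(0b10010000), Node(0b10110000)
origin.peers = [relay]
relay.peers = [origin, storer]
storer.peers = [relay]

chunk_addr = 0b10110100
storer.store[chunk_addr] = b"chunk payload"

print(origin.retrieve(chunk_addr))
print([proximity_order(n.address, chunk_addr) for n in (origin, relay, storer)])
```

In this sketch, the storer only ever talks to the relay, and the relay cannot tell whether the origin wanted the chunk for itself or was forwarding a request from someone else.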

Mutable content

If you want to store mutable content (for example, a webpage that changes frequently) on IPFS, you can use IPNS. On IPNS, the address of the mutable content is a public key, and the address of the underlying immutable content is signed with the corresponding private key. The public key -> signed content assignments are published in the DHT. If the content changes, the content owner signs the new content hash and publishes it under the public key, so retrievers can refresh the content assignment from it.

On Ethereum Swarm, there are two types of chunks. One is the “usual” content-addressed chunk, and the other is the single-owner chunk. The address of a single-owner chunk is a hash of the owner and a unique ID. Single-owner chunks are also immutable, but you can create the ID from a topic name and a serial number. When you change the content, you simply increase the serial number and publish a new chunk. Retrievers can poll the system, and if a new chunk with a higher serial number is available, they can refresh the content from it. Swarm calls these serial-numbered structures feeds.
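The feed mechanism can be sketched as follows. Several things here are assumptions for illustration only: the byte layout of the ID and the address, the helper names, the dictionary standing in for the network, and the hash function. Swarm uses Keccak-256, while Python's `hashlib.sha3_256` is the standardized SHA-3 variant and produces different digests. Signing of the chunk by the owner is omitted entirely.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    # Stand-in hash (assumption): NOT the Keccak-256 that Swarm actually uses.
    return hashlib.sha3_256(b"".join(parts)).digest()


def feed_id(topic: str, index: int) -> bytes:
    """ID derived from a topic name and a serial number (layout is an assumption)."""
    return h(h(topic.encode()), index.to_bytes(8, "big"))


def soc_address(owner: bytes, identifier: bytes) -> bytes:
    """Single-owner chunk address: a hash of the ID and the owner."""
    return h(identifier, owner)


def publish_update(store: dict, owner: bytes, topic: str, index: int, payload: bytes):
    # Each update is a new immutable chunk at a new, predictable address.
    store[soc_address(owner, feed_id(topic, index))] = payload


def latest(store: dict, owner: bytes, topic: str):
    """Poll increasing serial numbers until one is missing; return the last payload."""
    index, found = 0, None
    while True:
        address = soc_address(owner, feed_id(topic, index))
        if address not in store:
            return found
        found = store[address]
        index += 1


network = {}                       # toy stand-in for the chunk store
owner = bytes.fromhex("11" * 20)   # hypothetical 20-byte Ethereum address
publish_update(network, owner, "my-website", 0, b"version 1")
publish_update(network, owner, "my-website", 1, b"version 2")
print(latest(network, owner, "my-website"))
```

Because anyone who knows the owner and the topic can compute the next address, no registry is needed to announce updates. A linear scan like this one is the simplest possible polling strategy.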

Incentive system

If you want to store content on IPFS, you have several options. You can choose a centralized provider like Infura or Pinata, upload your content, and pay a storage fee to keep it available on IPFS, or you can run an IPFS node on your own machine and publish the content yourself.

Another way is to use Filecoin, the “official” blockchain of IPFS (also developed by Protocol Labs). Filecoin is basically a marketplace for storage providers, where you can make contracts for storing your content. The mechanisms of the Filecoin network keep your content safe and punish contracted providers that do not keep your content or do not make it available. If you retrieve content and a data transfer threshold is reached, you have to pay a fee.

Instead of its own blockchain, Ethereum Swarm uses its own payment system, which is similar to payment channels like the Lightning Network but works a little differently. When a node pays another node, it does so by cheque. These cheques are similar to real-world cheques: they are signed documents that can be used to pull money from the paying node’s chequebook contract.

On Swarm, data transfer has a fee. When a node sends data to a peer, a small fee is accounted for it. Every peer connection has a balance, and when this balance reaches a limit, the indebted node gives a cheque to the other. All of this happens off-chain. Only cashing a cheque requires a blockchain transaction.
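The per-connection accounting can be sketched like this. The class names, the threshold value, and the unsigned `Cheque` record are all assumptions for illustration. A real cheque is signed and refers to a chequebook contract, which is left out here.

```python
from dataclasses import dataclass


@dataclass
class Cheque:
    payer: str
    beneficiary: str
    amount: int  # signature and chequebook contract reference are omitted


class PeerBalance:
    """Off-chain balance that one node keeps for a single peer connection."""

    def __init__(self, me: str, peer: str, threshold: int):
        self.me, self.peer, self.threshold = me, peer, threshold
        self.balance = 0  # positive: the peer owes us; negative: we owe the peer

    def served(self, price: int) -> None:
        """We sent data to the peer, so the peer owes us more."""
        self.balance += price

    def consumed(self, price: int):
        """We received data from the peer; issue a cheque once the debt hits the limit."""
        self.balance -= price
        if self.balance <= -self.threshold:
            cheque = Cheque(self.me, self.peer, -self.balance)
            self.balance = 0  # the debt is settled by the cheque
            return cheque
        return None


account = PeerBalance("node-A", "node-B", threshold=100)
print(account.consumed(60))   # below the limit, nothing is issued
account.served(20)            # traffic in the other direction offsets the debt
print(account.consumed(70))   # the debt reaches the limit, a cheque is issued
```

Traffic in both directions cancels out, so two peers that use each other's services roughly equally may rarely need to exchange cheques at all.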

If you want to retrieve content, you have to pay the peer that gives it to you. If that peer has the content, it can keep the whole fee; if not, it has to pay the peer that provides it. This logic incentivizes nodes to store popular content locally, so Swarm acts as an adaptive CDN.

On the Swarm network, there are no individual providers. Content is always stored on the nodes that have the smallest Kademlia distance from its hash (Swarm calls such a group of nodes a neighborhood). This is an essential property of Swarm, because it keeps the DHT efficiently searchable. You can publish content on any node, which pushes it to its nearest peer, which in turn pushes it forward until it reaches the best storing place. The method is very similar to content retrieval but works in the opposite direction. If you want to store your content in the network, you have to attach a postage stamp to it. A postage stamp is something like a cheque that a storer node can cash only if it can prove that it keeps the content.

Ethereum interoperability

Ethereum interoperability is the field where Ethereum Swarm is really strong. Because Swarm is the “official” storage of Ethereum, everything in it is “Ethereum compatible”.

For example, the node address is derived from the owner’s Ethereum address. Because of this, the chunk-forwarding system can be used to send encrypted messages to given nodes. Swarm calls this technique PSS, which is the successor of the Ethereum messaging protocol Whisper.

The owner of a single-owner chunk is also an Ethereum address, and the signing method is the same one that Ethereum uses, so you can easily verify it in a smart contract or use it to assign metadata in feeds. When I built MyETHMeta (a Gravatar-like metadata system for Ethereum accounts), I had to use a smart contract to store Ethereum address -> metadata URL mappings. With Swarm, this can be done with a simple feed, without a blockchain.

Swarm chunks are Merkle trees. This means that the chunk address is the Merkle root of the content. This is useful if you want to verify content in smart contracts, because you can easily create inclusion proofs for it. For example, if you want to store a long (>1000 element) whitelist, you can store it on Swarm instead of the blockchain and check membership in a smart contract using Merkle proofs. Or you can create full rollups on Swarm, where the state root is also the content address of the whole state.
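The whitelist idea can be illustrated with a generic Merkle tree. This sketch is an assumption-laden simplification: it uses SHA-256 over whole list entries and duplicates the last node on odd levels, whereas Swarm computes chunk addresses with its own binary Merkle tree construction based on Keccak-256. The function names are hypothetical, and the `verify` function only mirrors what a smart contract would do on-chain.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()  # stand-in hash, not Swarm's


def build_levels(leaves):
    """Return all tree levels, from hashed leaves (first) up to the root (last)."""
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:                 # odd level: duplicate the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hash on every level along the path from leaf to root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify(root, leaf, index, proof):
    """Recompute the root from one leaf and its proof; needs only log(n) hashes."""
    node = _h(leaf)
    for sibling in proof:
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root


whitelist = [f"address-{i}".encode() for i in range(5)]  # placeholder entries
levels = build_levels(whitelist)
root = levels[-1][0]
proof = prove(levels, 3)
print(verify(root, whitelist[3], 3, proof))
print(verify(root, b"not-on-the-list", 3, proof))
```

The contract would need to store only the 32-byte root, while the full list lives off-chain and each user submits their own entry together with its proof.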

Conclusion

As you can see, both storage solutions have their strengths and weaknesses.

IPFS is the older system (in a good sense). It has many use cases, and it’s well-documented and widely used. There are many centralized IPFS providers, and you can also use Filecoin to store your content.

Ethereum Swarm is relatively new and is still under development, but it has some very exciting properties. Anonymous content storage and retrieval, very efficient DHT management, and strong Ethereum compatibility are unique features of this solution.