Non-fungible tokens (NFTs) have hit the mainstream. Unfortunately, this space is full of misconceptions, and many developers - intentionally or not - are cutting corners as they rush to capitalize on this nascent market.
Though customers buy NFTs believing them to be permanent and immutable records of ownership, this is not always the case - fundamental flaws in the construction of many tokens jeopardize the long-term integrity of the asset.
This is a major problem for the entire ecosystem, with the potential to rapidly erode customer faith - not just in NFTs, but in blockchain technology at large. To avoid reputational loss, the developer community needs to proactively acknowledge and address these issues.
Fortunately, it's relatively straightforward to understand both the flaws that are undermining these tokens and the means of fixing them. In this post, I’ll detail what the problems are, and how developers can avoid them in their own products.
In recent months, particular interest has developed in NFTs for digital assets, like pictures, music, or videos. The core idea being sold here is a claim* to these assets (within a particular context; see the footnote) - one that is awarded and verified through a distributed digital consensus, rather than a central authority (for example, a patent office). These claims are being billed as permanent, immutable, and unhackable proof of ownership.
Sadly, many purchasers of NFTs are not in a position to technically evaluate the truth of this marketing. Vaguely-understood notions about the immutability of blockchains have supplied cover for all manner of NFTs, some of which fail to live up to this reputation.
At its most basic, an NFT is just a record of ownership, stored on a blockchain, that associates an identity with an asset. It’s very important to be clear on this point: the NFT is not the asset itself - it’s the record. For example, let’s say Zoe Schmoe (pronounced Sh-mow-ee, of course :) buys an NFT for a cat picture. The record will look more or less like this:
Zoe Schmoe owns cat.png.
Of course, that’s sweeping a fair amount of detail under the rug. In practice, we face a problem of digital consensus: how do we (in the context of a blockchain) agree on what entities "Zoe Schmoe" and "cat.png" refer to? We could give those names to any number of things!
For Zoe herself, the answer is relatively standard. We can use public-key cryptography to generate unique identities for network participants, and reasonably presume that if a person has the corresponding private key for a given identity, then they are the person who created that identity. Rather than her name, the record references an anonymous identity that Zoe controls:
0xZ03 owns cat.png.
That solves half the problem, but we still need a way to indicate a specific "cat.png". Furthermore, we’ve stressed that the asset and the NFT are two distinct things, but haven’t discussed the asset itself yet. Where should it live, and who should be in charge of its storage? This hints at two broad problems we need to solve to build resilient NFTs.
If an NFT is to retain its value, the asset it refers to has to be stored somewhere - if all copies are deleted, then there’s nothing to own! This requires us to take a number of issues into consideration, ranging from who should be responsible for storage, to the desired levels of redundancy, accessibility, and longevity of the data being stored. All of these aspects tie into the problem of persistence: making sure content remains available, in a way that is robust to the typical failures we see with the Internet. Many NFTs minted today completely punt on these considerations.
One way of ensuring that the asset is stored for the lifetime of the NFT would be to store the asset on the blockchain as well, leveraging the fact that a blockchain is a ledger copied to every participant. This approach also solves the issue of responsibility, by implicitly making every node in the network responsible for the asset’s upkeep.
Unfortunately, precisely because a blockchain is replicated across every network participant, storing data on one is extremely expensive. Therefore, it is typically cost-prohibitive to store any but the most trivial of data on-chain. We need to keep the data somewhere else - and that means we need to link to it.
The second problem we need to solve is that of addressing: we need a way of unambiguously identifying the content of data.
One way we could do so is by providing an index to it in an immutable data store - like, say, a blockchain! Given such a store, agreeing on where a given piece of data is located within the store is equivalent to agreeing on what the content of that data is. Unfortunately, as we’ve just seen above, storing our data on the blockchain has to be ruled out.
A second solution comes to mind by analogy to the first: rather than link internally to a blockchain, we might link externally to a website, using a time-tested protocol like HTTP:
0xZ03 owns "cat.png", which is stored at https://nft-emporium.com/cat.png.
Indeed, many NFTs on the market today do this. Unfortunately, while at first blush this might seem reasonable, it is precisely this practice that compromises the integrity of many NFTs.
HTTP URLs have two troubling properties that impact their suitability as long-term references to data. The first issue is that they are links to locations at which data can change over time. As a result, the notion of ownership created by an NFT predicated on HTTP is extremely fragile. Today, Zoe owns the cat picture at that link; tomorrow, she owns a 404. The day after, somebody buys nft-emporium.com, changes the link, and now Zoe owns a picture of a horse.
Developing an NFT on top of HTTP undermines all promises of permanence and immutability.
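To make the fragility concrete, here is a minimal sketch of location-based addressing. The dictionary `web_host` is a hypothetical in-memory stand-in for a web server, and the byte strings are placeholders rather than real image data; nothing here touches the network.

```python
# Hypothetical stand-in for a web host: maps a URL to whatever bytes the
# operator currently chooses to serve there.
web_host = {"https://nft-emporium.com/cat.png": b"<bytes of a cat picture>"}

# The NFT record stores only the location, never the content itself.
nft_record = {"owner": "0xZ03", "asset_url": "https://nft-emporium.com/cat.png"}


def resolve(record, host):
    """Return whatever currently lives at the record's URL, or None for a 404."""
    return host.get(record["asset_url"])


today = resolve(nft_record, web_host)

# The host deletes the file: the record is unchanged, but points at nothing.
del web_host[nft_record["asset_url"]]
tomorrow = resolve(nft_record, web_host)

# A new domain owner serves different bytes at the same URL.
web_host[nft_record["asset_url"]] = b"<bytes of a horse picture>"
day_after = resolve(nft_record, web_host)

print(today, tomorrow, day_after)
```

The record never changes across the three lookups, yet it resolves to three different results. Nothing in the record lets Zoe detect, let alone prevent, the swap.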
This leads us to a second concern with HTTP addressing: centralized control. A single person or entity has total authority over - and singular responsibility for - the content behind a link. This has very serious implications for a link’s long-term viability.
In summary, storing data on a blockchain allows us to make immutable references to data, but is too expensive; storing data off of the blockchain allows us to avoid this cost, but traditional links are centralized, mutable references. Herein lies our dilemma: we need a way of making immutable references to data stored off-chain.
Fortunately, this is one of those problems where we can achieve both of the properties we’re after without compromise: the solution is to use something called a content address to identify and link to an asset.
A content address for some piece of data is a link that is derived solely from that data - the most basic form being a hash of the data. We can think of a content address as a fingerprint: ideally, within the context of a given content-addressing scheme, an address should uniquely identify a piece of data. Taken together, this means that a content address is a link that never changes, and a link whose meaning all parties can agree on: this is exactly the trustless unambiguity we are after! If we use these instead, we’ll wind up with a record close to the following:
0xZ03 owns "cat.png", which hashes to <some-hash>.
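As a rough illustration of the most basic form of this idea, the sketch below uses SHA-256 from Python's standard library as the hash. The choice of SHA-256, the helper names, and the placeholder byte strings are my own assumptions for the example, not something a particular NFT standard prescribes.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_record(data: bytes, recorded_hash: str) -> bool:
    """Check whether a candidate file is the one named in the record."""
    return content_hash(data) == recorded_hash


cat = b"<bytes of a cat picture>"
horse = b"<bytes of a horse picture>"

# The value that would be written into the NFT record.
recorded = content_hash(cat)

print(matches_record(cat, recorded))    # the original bytes match
print(matches_record(horse, recorded))  # any other bytes do not
```

Because the hash depends only on the bytes, anybody holding a candidate file can check it against the record without trusting whoever handed them the file.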
This is a great start, but note that a raw hash is not the same as a content address, which carries the additional connotation of being able to be used as a link. You can’t enter a raw hash in your browser and get a file back. The record above allows us to verify what Zoe owns, but doesn’t enable anybody to retrieve that data. This does not reflect the reality of most use cases today, where being able to access the asset directly from the record is a core feature of many NFTs.
Solving the Problem of Addressing with IPFS
Additional infrastructure must be created to enable a hash - or any such address - to fill the role of a link. Fortunately, we don’t have to create our own content address infrastructure from scratch. The InterPlanetary File System (IPFS) ecosystem has been developing a particularly robust form of content address - the content identifier, or CID - for several years.
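To give a feel for what a CID adds on top of a raw hash, here is a hedged sketch that assembles a version 1 CID by hand for one narrow case: a single block of raw bytes hashed with SHA-256. The function name is hypothetical. Real IPFS tooling may split larger files into chunks and wrap them in other formats, so the CID it reports for a file can differ from what this sketch produces.

```python
import base64
import hashlib


def raw_block_cid_v1(data: bytes) -> str:
    """Build a CIDv1 string for a single raw block hashed with SHA-256."""
    digest = hashlib.sha256(data).digest()
    # Multihash prefix: 0x12 is the code for sha2-256, 0x20 is the digest
    # length in bytes (32).
    multihash = bytes([0x12, 0x20]) + digest
    # CID prefix: 0x01 is CID version 1, 0x55 is the "raw" content codec.
    cid_bytes = bytes([0x01, 0x55]) + multihash
    # Multibase: a leading "b" marks lowercase base32 without padding.
    encoded = base64.b32encode(cid_bytes).decode("ascii").lower().rstrip("=")
    return "b" + encoded


cid = raw_block_cid_v1(b"<bytes of a cat picture>")
print(cid)
```

The hash is still at the heart of the identifier, but the self-describing prefixes tell any reader which hash function and data format were used, which is part of what lets a CID work as a link rather than just a fingerprint.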
Within the context of IPFS, a CID uniquely identifies a piece of data. Building on top of that primitive, IPFS implements a global distributed data-sharing network. A network node can broadcast a request for data via its CID, and any node that has this file can service the request. This is just what we’re looking for - let’s adjust our NFT record to use a CID:
0xZ03 owns "cat.png", which has the IPFS CID "bafy1".
Now anybody that comes across that record can see what data Zoe owns, as long as somebody on the IPFS network has the file and is willing to serve it to them!
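The retrieval model can be sketched in a few lines. This is a toy simulation under loud assumptions: the `Node` class and `fetch` function are hypothetical, a plain SHA-256 hex digest stands in for a CID, and a simple loop stands in for the real network's discovery and transfer protocols.

```python
import hashlib


def address_of(data: bytes) -> str:
    """Stand-in for a CID: the hex SHA-256 digest of the content."""
    return hashlib.sha256(data).hexdigest()


class Node:
    """Hypothetical network node holding blocks keyed by content address."""

    def __init__(self):
        self.blocks = {}

    def store(self, data: bytes) -> str:
        addr = address_of(data)
        self.blocks[addr] = data
        return addr


def fetch(addr: str, nodes):
    """Ask each node in turn; accept only bytes that hash to the address."""
    for node in nodes:
        data = node.blocks.get(addr)
        if data is not None and address_of(data) == addr:
            return data
    return None  # nobody is storing the content


alice, bob, mallory = Node(), Node(), Node()
cat = b"<bytes of a cat picture>"
addr = bob.store(cat)

# Mallory misbehaves, offering horse bytes under the cat's address.
mallory.blocks[addr] = b"<bytes of a horse picture>"

found = fetch(addr, [alice, mallory, bob])
missing = fetch(address_of(b"never stored"), [alice, mallory, bob])
print(found, missing)
```

Two properties fall out of this. It doesn't matter which node answers, because the requester can check the bytes against the address, so a dishonest node cannot substitute different content. And if no node holds the data, the lookup simply fails.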
In addition to laying the groundwork for the CID itself, IPFS has a tremendous supporting ecosystem. One major advantage it offers is its own internationally recognized URI scheme: in addition to existing compatibility layers for widely-used browsers, this scheme has recently started gaining native browser support.
IPFS provides us with immutable, widely-supported links, in a manner that mirrors the trustless, distributed nature of a blockchain: it is a direct solution to the problem of addressing we identified above.
When we construct NFTs by referencing assets with IPFS CIDs, we preserve both the integrity of the asset and the advantages we gain by storing and linking to the asset off-chain.
Of course, addressing is only part of the equation: we need ways of ensuring that data remains continuously stored and accessible. IPFS can only retrieve a file if somebody is storing it!
To tackle this problem, we can turn to IPFS’s sister project, Filecoin. Filecoin is a distributed storage network designed to act as an IPFS incentivization layer, and offers users looking to secure the longevity of IPFS-hosted assets a robust paid storage solution.
Just like IPFS, Filecoin is completely decentralized, and thus has no single point of failure - one of our primary concerns with HTTP. IPFS enables anybody to help keep a link alive, but there’s little reason (outside of altruism) for most people to do so. By contrast, Filecoin allows us to directly motivate data storage through incentives and penalties tied to contracts, giving us the strongest guarantee possible that somebody is interested in keeping our link up.
An open, decentralized storage ecosystem has several long-term hosting advantages:
Best of all, Filecoin provides us with continuous and fully-transparent proofs that data is being stored correctly. This is a real innovation - something no traditional cloud service provider supports as a first-class feature.
NFTs are both investments and cultural artifacts; the assets they link to shouldn’t fail to resolve just because their hosting company goes out of business. Filecoin allows us to overcome the problem of persistence in the face of such contingencies. To help everyone realize this goal, Protocol Labs is currently offering free Filecoin-backed storage for NFTs at nft.storage.
Many NFTs being marketed to customers today are fundamentally broken - they embed mutable links to refer to the asset they convey ownership over, and so cannot be trusted as sources of truth. NFT developers must stop relying on centralized, mutable links in their attempts to create timeless assets - and to ensure those assets remain accessible, they must also secure their storage far into the future.
The aims of NFTs cannot be achieved if they are predicated on a technology stack with single points of failure. Through IPFS and Filecoin, we can completely eliminate such dependencies, while gaining an entire ecosystem of additional capabilities that add value to an NFT.
By unifying the decentralized consensus of a blockchain, the decentralized addressing of IPFS, and the decentralized storage of Filecoin, we can come as close as possible to realizing a truly permanent and decentralized digital token of ownership.
* Technically, that claim is defined within the context of a given blockchain and the rules by which it operates. It may have little to no relation to ownership as defined or enforced by any legal jurisdiction.