Ever feel like things are sometimes just a little harder than they should be? That’s the way I felt when I wanted to save version-controlled Markdown and SECST documents to IPFS from an editor I am developing.

If you don’t know what IPFS is, then this article may not be for you. Take a look at IPFS and come back if it turns out to be of interest.

By its very nature, IPFS creates a new version of a document every time you save it. Unfortunately, it does not provide a simple way to track the versions and keep them related to each other. There was a substantive attempt at creating a comprehensive mechanism for doing this with the Interplanetary Version Control (IPVC) system; however, work on the project has been suspended by the author. Additionally, its capabilities go well beyond what I was seeking. It is modeled on Git, so its power comes with complexities beyond the ken of a typical document author.

I wanted something lightweight and easy to wrap in a user interface so that casual users can track and retrieve old versions of files, either through automatic numbering or using user-provided names (as many people still do when sharing versions of files with each other or keeping track of things on their computer).

In addition to IPVC, I found some instructions for using IPLD (Interplanetary Linked Data). The approach seems complex and also resource-intensive: it keeps entire copies of files around, even if just a few characters have changed. Although IPFS automatically de-dups data at the block level behind the scenes, character sequence changes in documents happen at a far finer level of granularity.

Unable to find what I needed, I wrote the Interplanetary Versioned File System (IPVFS). In this article, I describe both how to use it and how it works, along with design alternatives and tradeoffs. (IPVFS is currently in a beta state; things could change!)

How To Use IPVFS

IPVFS has only three API end-points:

- `ipvfs`, an initialization function which takes an `ipfs` instance as its argument, augments it to support file versioning, then returns the instance.
- `read(path,options)`, which is similar to `ipfs.files.read` with additional options.
- `write(path,content,options)`, which is similar to `ipfs.files.write` with additional options.

The `read` and `write` methods exist on a `versioned` object added to the `ipfs.files` property. They can be accessed as `ipfs.files.versioned.read` and `ipfs.files.versioned.write`. Ultimately, they will be an API superset of the standard functions; hence, so long as pointers to the original versions are kept around, you could theoretically elevate the versioned methods to replace the standard versions.

Here are a few lines of standard IPFS code followed by similar IPVFS code:

    import ipvfs from "../index.js";
    import {create} from "ipfs";
    import {all} from "@anywhichway/all";

    // augment a standard IPFS instance with versioning support
    let ipfs = await ipvfs(create({repo:"hackernoon-filestore"}));

    // standard IPFS
    await ipfs.files.write("/hello-world.txt","hello there peter!",{create:true});
    // log contents
    console.log((await all(ipfs.files.read("/hello-world.txt"))).toString());
    // truncate so the shorter string fully replaces the old content
    await ipfs.files.write("/hello-world.txt","hello there paul!",{create:true,truncate:true});
    // log new contents, but access to the old version is not available
    console.log((await all(ipfs.files.read("/hello-world.txt"))).toString());

    // IPVFS
    await ipfs.files.versioned.write("/hello-world-versioned.txt","hello there peter!");
    // log contents
    console.log(await ipfs.files.versioned.read("/hello-world-versioned.txt",{all:true}));
    await ipfs.files.versioned.write("/hello-world-versioned.txt","hello there paul!");
    // log new contents
    console.log(await ipfs.files.versioned.read("/hello-world-versioned.txt",{all:true}));
    // log first version contents
    console.log(await ipfs.files.versioned.read("/hello-world-versioned.txt#1",{all:true}));

To retrieve an old version of a file, you just append `#<number>` to the file name, where `<number>` is the sequential version.

You may have noted the use of the `all` function from the package `@anywhichway/all`. The IPFS `read` function returns chunks of data asynchronously; the `all` function just collects them into a single buffer. Without this function, you would have to write your own function to collect the chunks in a for loop.

IPVFS also allows you to pass `{all:true}` as an option, and the chunks are concatenated for you. Furthermore, because IPVFS keeps additional metadata about what it is storing, you do not have to convert returned data to a string. Since a string was saved, a string is returned.

You can also name versions and retrieve them by appending `@<version name>`.

    await ipfs.files.versioned.write("/hello-world-versioned.txt","hello there mary!",{metadata:{version:"Mary Version"}});
    console.log(await ipfs.files.versioned.read("/hello-world-versioned.txt@Mary Version",{all:true}));

IPVFS does not enforce any particular naming convention, but you could use this approach to implement semantic versioning, e.g. `{version:"0.0.3"}` could be retrieved using `@0.0.3`.

Finally, you can add arbitrary metadata to files (anything other than `version`), e.g.

    await ipfs.files.versioned.write("/hello-world-versioned.txt","hello there mary!",{
        metadata:{
            version:"Mary Version",
            author:"John Jones"
        }
    });

This data can be retrieved by passing `withMetadata:true` to `read`, in which case an object is returned instead of just the content, e.g.

    const result = await ipfs.files.versioned.read("/hello-world-versioned.txt@Mary Version",{withMetadata:true,all:true}),
        {content,metadata} = result,
        {version,author} = metadata;

More on the metadata structure and how to get a version history is covered below. For additional read and write options, visit the documentation on GitHub.

How IPVFS Is Implemented

Currently, IPVFS stores the first version of a file’s content as a standalone un-named CID hashed block in IPFS.
A pointer to this block is kept in a named file along with some metadata that includes an array of transformations that are required to convert the original text into the most current version. The beta release of IPVFS does not automatically pin this content, but it should be pinned.

When a write operation is performed, a test is made to see if the content or custom metadata being provided is different from the most recent version. If the content is different, the little-diff library is used to discover the differences. The difference, if any, and any new custom metadata are used to create a change record which is added to the array of transformations.

When a version of the file content is requested, it is generated from the first version and the array of transformations up to the version requested. The little-diff library is used to transform the actual content, and simple object assignment is used for custom metadata.

Design Alternatives And Tradeoffs

Keeping the original content in a separate CID hashed block is a time/space tradeoff. The original content could be stored in the named file along with the metadata. This would save one write and one IPFS CID entry. However, this would mean the metadata and all the content would need to be read prior to returning anything to the requestor. For large files, this could hurt both performance and RAM usage. By using a pointer to a separate CID hashed block, IPVFS can use the metadata to assemble ordered change sets that can be applied as content streams from the separate block to the requestor. In some sense, IPVFS is acting as a pipe. This makes it both time and memory efficient at scale.

A pointer to a separate CID hashed block could be created and saved for every version, but this would ultimately take a lot of space. It could also subject the system to larger than necessary writes and network traffic. The design would potentially fail with respect to time, memory, network and storage efficiency.
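
For contrast, the approach IPVFS takes (one stored original plus an ordered list of small change records) can be sketched in a few lines. This is only an illustration, not the library’s actual code: `applyDelta` and `contentAtVersion` are hypothetical helper names, and the sketch assumes each delta record has the shape `[start, deleteCount, insertedText]` and is applied in the order listed. That reading is consistent with the sample change records in the metadata example in this article, but it is my interpretation rather than a documented little-diff contract. It also handles only sequential version numbers, not named versions.

```javascript
// Hypothetical helper: apply one version's delta records to a string.
// ASSUMPTION: each record is [start, deleteCount, insertedText],
// applied in the order the records are listed.
function applyDelta(text, delta) {
  let result = text;
  for (const [start, deleteCount, inserted] of delta) {
    result = result.slice(0, start) + inserted + result.slice(start + deleteCount);
  }
  return result;
}

// Hypothetical helper: rebuild the content at a sequential version
// (1-based) by replaying change records over the original content.
function contentAtVersion(original, changeRecords, version) {
  return changeRecords
    .slice(0, version)
    .reduce((text, record) => applyDelta(text, record.delta), original);
}

// Delta values copied from the sample change records in this article.
const history = [
  {version: 1, delta: []},
  {version: 2, delta: [[17, 1, ""], [13, 4, "aul!"]]},
  {version: 3, delta: [[12, 1, "m"], [14, 2, "ry"]]}
];

console.log(contentAtVersion("hello there peter!", history, 2)); // hello there paul!
console.log(contentAtVersion("hello there peter!", history, 3)); // hello there mary!
```

Because each change record stores only the characters that changed, the cost of keeping a version is proportional to the size of the edit rather than the size of the file.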

Some version management systems only keep the most recent copy of file contents and use backward transformations to create older versions. This is arguably better since people are more likely to want a recent version. IPVFS could be modified to do this. A new CID hashed block could be created for each change and its CID could replace the pointer. However, this would require an extra write operation and subsequent network traffic as the new block is propagated. It might also result in management overhead as attempts are made to “remove” the old CID hashed object, which is now garbage from a version management perspective. The word “remove” is in quotes because it is not really possible to remove hashed content. In some sense, it expires if the content has not been pinned when the creating IPFS node stops, so long as nobody else has created an identical CID hashed block (which is entirely possible and actually quite likely for small files). The design would potentially fail with respect to time and network efficiency, and the code might be considerably more complex.

Metadata Structure and Version History

In order to optimize content access and delivery or implement more sophisticated version management, IPVFS makes its metadata available via the `read` function using the `withMetadata` or `withHistory` options. However, it is also possible to get just the version history and metadata without the actual file content by using the standard `ipfs.files.read` function. This defers the CID lookup and content reads until the requesting program decides to make them.

Below are the contents of a versioned file read using the standard `ipfs.files.read` function.

The file contains an array of change records. The first includes a CID `path` to the original content and a `btime`. The remaining properties are the same for all records:

- an SHA-256 `hash` of the version content
- a `version` that will either be the change index + 1 or a manually provided version string
- the `kind` of data stored
- an array of `delta` records (see little-diff)
- the `mtime` for the change
- any other metadata properties provided when the file was written (there are none in this example)

    [
        {
            "path": "QmScjZmC4J4ZHq6bGTUyYSESfTKDhxo8X7o3QShSawTsqi",
            "hash": "f7a67e7a0a50e87e59713999562d06cc3d2511709c0a3ded8020d8247e47251c",
            "version": 1,
            "kind": "String",
            "delta": [],
            "btime": 1672768094671,
            "mtime": 1672768094671
        },
        {
            "hash": "4fe36dd2fd280cbdd9414f3efa61d2b49116453e7edad0316b8b6be1d1c64817",
            "version": 2,
            "kind": "String",
            "delta": [
                [ 17, 1, "" ],
                [ 13, 4, "aul!" ]
            ],
            "mtime": 1672768094748
        },
        {
            "hash": "0c8a635762b80e327d384f660387f3acc5f24363de54366404e4a391260fd5c5",
            "version": 3,
            "kind": "String",
            "delta": [
                [ 12, 1, "m" ],
                [ 14, 2, "ry" ]
            ],
            "mtime": 1672768094806
        }
    ]

The above structure can be read and used to optimize file retrieval on a client device by independently accessing the CID `path` and applying the delta records using little-diff.

IPVFS is currently in beta. I would love your feedback here or in the comments.

Image: PCB Tech on Pixabay