Since 1992, the web has been the only hypertext system most of us have known (outside of occasional hypertext systems built on top of the web, such as MediaWiki). However, the web is merely the most popular of hundreds of hypertext systems, and most of its predecessors were technically superior.
For historical and political reasons (detailed in Tim Berners-Lee’s autobiography, Weaving the Web), this implementation dropped nearly all of the features that define hypertext in favor of a single one (jump links). For historical and political reasons, this implementation relied upon existing ill-fitting technologies (the SGML-derivative HTML, the filesystem- and host-oriented HTTP). For historical and political reasons (CERN’s clout, attempts to cash in on Gopher trademarks, early pushes to open source Apache and Mosaic, the proprietary nature and lack of portability and interoperability of many earlier hypertext systems), the web won.
Nevertheless, the web will not be the last hypertext system. The features that the web dropped are useful, and for the most part adding them back in on top of the web’s existing structure is fated to result in slow, complex, and unreliable hacks.
The situations that made a web-like system easier to design and build than a Xanadu-style system are, for the most part, gone. Over the past thirty years, making a web browser has graduated from a weekend project to such a major undertaking that only three or four modern browser engines exist (WebKit, Mozilla's Gecko, Internet Explorer's Trident, and possibly Blink); at the same time, systems like IPFS have made permanent host-agnostic addressing easy. Furthermore, the web is no longer used mostly as a crippled hypertext system; it is primarily an application sandbox. As a result, new hypertext systems are not even in competition with the web.
As someone who has worked on hypertext systems (both independently and under the aegis of Project Xanadu), and as part of a community built around the design and implementation of post-web hypertext systems, I would like to provide some guidelines for the structure of future hypertext systems. Tech has a short memory lately, and I would like future implementors to learn not only the lessons of the web but also the lessons of pre-web hypertext systems (which often solved problems that the web has yet to address).
Guidelines:
- A hypertext system should make no distinction between client and server. All such systems should be fully peer-to-peer: any application that has downloaded a piece of content serves that content to peers.
- A hypertext system should not distinguish between authors and commenters. While every piece of text is associated with its author, all text is first-order and can stand on its own.
- Links live outside of text, potentially created by third parties, and are loaded as overlays or views.
- There is no embedded markup. Links provide hints about how to format bytes, in addition to providing hints about connections between sequences of bytes.
- A link has one or more targets, each represented by a permanent address combined with an optional start offset and length (in bytes); a minimal sketch of such a link record appears after this list.
- A document is not a sequence of bytes that render as the target text. A document is a sequence of bytes interpreted as a list of pointers to sequences of bytes that, when concatenated, render as the target text. These sequences are downloaded (or retrieved from local cache) and assembled at render time (see the assembly sketch after this list).
- While links can apply to a particular document, they can also apply to particular sequences of bytes independent of document.
- A link can connect rendered documents to sequences of bytes (possibly included in many different documents). A link can connect the unrendered representations of documents with their rendered representations.
- A user can individually enable or disable any links, and create their own.
- Permanent addresses refer to unchanging byte sequences. New versions have new addresses. All content is stored at permanent addresses.
- A second addressing scheme may be used to refer to the newest version of some collection of documents and links: a mutable pointer to a permanent address containing a list of permanent addresses of documents and links, as well as the permanent address of the previous version of that list (see the versioning sketch after this list).
- A hypertext UI contains facilities for both viewing and editing (i.e., creating a new version). Anything that is viewable can be edited, and that new version can be republished.
- A new version is a sequence of references to the previous version combined with a sequence of references to any new content typed or uploaded. Multiple sources (even from different authors) may be combined (even by a third party).
- All quotation or content reuse embeds within itself a record of its provenance, in the form of a reference to some portion of its proximate origin. This information may be flattened for local storage and fast lookup, but is always present.
- A hypertext UI contains facilities for viewing both sides of a link simultaneously, side by side. It also contains facilities for viewing multiple versions of a document side by side, and facilities for finding other contexts for the same piece of content.
- All byte spans are available to any user with the proper address. However, they may be encrypted, and access control can be performed via the distribution of keys for decrypting the content at particular permanent addresses (see the encryption sketch after this list).
- No facility exists for removing content. However, sharable blacklists can be used to prevent particular hashes from being stored locally or served to peers. Takedowns and safe harbor provisions apply not to the service (which has no servers) but to individual users, who (if they choose not to apply those blacklists) are personally liable for whatever information they host. (To clarify, since some people have asked, I don't recommend a centralized blocklist facility. Instead, users can subscribe to multiple independently maintained blocklists, published as addresses of versioned lists of hashes; see the blocklist sketch after this list. This is similar to how sharable blocklists operate in ad-blocking software, and it also resembles a proposal for fediverse instance blocklists and another proposal internal to IPFS.)
- The system will not support dynamic content or automatically-running scripts, although users may download and manually run any code they find on it. While formatting links may apply complex styling to content, such styling is not Turing-complete.
- A user may create content privately, on their own machine, and refuse to publish it. This means that while it is stored in the private cache, it is not made available when it is requested by address. A user may then later publish it, in part or in full, encrypted or plaintext.
- A user may keep a killfile of nodes or authors from whom they would like to avoid automatically downloading content. Such content is replaced with a placeholder indicating its source.
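A few illustrative sketches of these guidelines follow. They are minimal sketches in Python, not a specification: the field names, addressing scheme, and helper functions are assumptions made purely for illustration. First, a link as an external record, creatable by a third party, whose targets are permanent addresses with optional byte offsets and lengths:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class SpanTarget:
    address: str                  # permanent address of an immutable byte sequence
    start: Optional[int] = None   # optional byte offset into that sequence
    length: Optional[int] = None  # optional span length, in bytes

@dataclass(frozen=True)
class Link:
    author: str                      # whoever published the link; not necessarily the text's author
    link_type: str                   # e.g. "comments-on", "format-as-heading", "quotation-of"
    targets: Tuple[SpanTarget, ...]  # one or more endpoints

# A third party asserting that a 40-byte span of one document quotes a span of another:
quote_link = Link(
    author="addr-of-linker",
    link_type="quotation-of",
    targets=(
        SpanTarget("addr-of-essay", start=1024, length=40),
        SpanTarget("addr-of-source", start=0, length=40),
    ),
)
```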
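Second, the document model: a document is a list of pointers to byte spans, assembled into the rendered text at view time, and a new version is simply a new list that mixes references to the previous version with references to newly stored content. This sketch assumes a fetch function that returns the immutable bytes stored at a permanent address:

```python
from typing import Callable, List, Tuple

Span = Tuple[str, int, int]  # (permanent address, start offset, length in bytes)

def assemble(document: List[Span], fetch: Callable[[str], bytes]) -> bytes:
    """Concatenate the referenced byte spans to produce the text to be rendered."""
    out = bytearray()
    for address, start, length in document:
        blob = fetch(address)              # from peers or the local cache
        out += blob[start:start + length]  # only the referenced span is used
    return bytes(out)

# A new version copies nothing: it reuses spans of the previous version by
# reference and points at newly typed content stored under its own address.
old_version: List[Span] = [("addr-v1-body", 0, 2000)]
new_version: List[Span] = [
    ("addr-v1-body", 0, 1200),       # unchanged prefix, by reference
    ("addr-new-paragraph", 0, 300),  # freshly typed content
    ("addr-v1-body", 1500, 500),     # unchanged suffix, by reference
]
```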
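Third, the two addressing schemes: content-derived permanent addresses for immutable bytes, plus a mutable head pointer naming the newest version record, where each record lists its members and the permanent address of the previous record. SHA-256 and JSON are stand-ins here, not requirements:

```python
import hashlib
import json
from typing import Dict, List

store: Dict[str, bytes] = {}  # permanent address -> immutable bytes (stand-in for the network)
heads: Dict[str, str] = {}    # mutable name -> permanent address of the newest version record

def put(data: bytes) -> str:
    """Store immutable bytes at their permanent, content-derived address."""
    address = hashlib.sha256(data).hexdigest()
    store[address] = data
    return address

def publish_version(name: str, members: List[str]) -> str:
    """Publish a version record listing member documents/links plus the previous record."""
    record = {"members": members, "previous": heads.get(name)}
    address = put(json.dumps(record, sort_keys=True).encode())
    heads[name] = address  # only the head pointer moves; every old version stays reachable
    return address

publish_version("my-essay", [put(b"first draft")])
publish_version("my-essay", [put(b"second draft")])
```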
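Fourth, access control by key distribution: ciphertext is published at a permanent address like any other byte sequence, and only readers who have been given the key out of band can decrypt it. The use of the third-party cryptography package's Fernet cipher here is purely illustrative:

```python
import hashlib
from typing import Dict

from cryptography.fernet import Fernet  # pip install cryptography

store: Dict[str, bytes] = {}  # permanent address -> immutable bytes

def put(data: bytes) -> str:
    address = hashlib.sha256(data).hexdigest()
    store[address] = data
    return address

# Anyone can fetch the ciphertext by address; only key holders can read it.
key = Fernet.generate_key()                              # distributed out of band
address = put(Fernet(key).encrypt(b"restricted draft"))
assert Fernet(key).decrypt(store[address]) == b"restricted draft"
```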
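Finally, blocklist application: a node unions the independently maintained lists its operator subscribes to and declines to store or serve the hashes they name. This sketch assumes each blocklist renders as one content hash per line:

```python
from typing import Callable, Iterable, Set

def load_blocklists(fetch_newest: Callable[[str], bytes],
                    subscriptions: Iterable[str]) -> Set[str]:
    """Union the hashes named by every blocklist this node subscribes to."""
    blocked: Set[str] = set()
    for list_address in subscriptions:
        text = fetch_newest(list_address).decode()
        blocked.update(line.strip() for line in text.splitlines() if line.strip())
    return blocked

def willing_to_serve(address: str, blocked: Set[str]) -> bool:
    """Checked before storing a byte sequence locally or serving it to a peer."""
    return address not in blocked
```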
This story is also available on gopher at gopher://fuckup.solutions/0enkiv2/guidelines-for-future-hypertext-systems.txt