Look at the GIF below — it shows a real-time Todo-MVC demo, syncing across windows and smoothly transitioning in and out of offline mode. While it’s just a simple demo app, it showcases important, cutting-edge concepts that every web developer should know.
This is a Replicache demo app that I ported from an Express backend and web components frontend to SvelteKit to learn about the technology and concepts behind it. I want to share my learnings with you.
The source code is available on Github.
Web applications face some fundamentally hard problems, problems most web frameworks seem to ignore. These problems are so hard that only very few apps actually solve them well, and those apps stand head and shoulders above other apps in their respective space.
Here are some such problems I had to deal with in actual commercial apps I worked on:
I encountered these problems while working on totally normal web applications, nothing too crazy, and I believe most web apps will hit some or all of them as they gain traction.A pattern I noticed in dev teams that start working on a new product is to ignore these problems completely, even if the team is aware of them. The reasoning is usually along the lines of "we'll deal with it when we start actually having these problems." The team would then go on to pick some well-established frameworks (pick your favorite) thinking these tools surely offer solutions to any common problem that may arise. Months later, when the app hits ten thousand active users, reality sinks in: the team has to introduce partial, patchy solutions that add complexity and make the system even more sluggish and buggy, or rewrite core parts (which no one ever does right after launch). Ouch.I felt this pain. The pain is real.Enter "Sync Engine."
Remember I said that some apps address these issues much better than others? Recent famous examples are Linear and Figma. Both have disrupted incredibly competitive markets by being technologically superior. Other examples are Superhuman and a decade prior, Trello. When you look into what they did, you discover that they all converged on very similar patterns, and they all developed their respective implementations in-house. You can read about how they did it (highly recommended) in these links: Figma, Linear, Superhuman, Trello (series).
At the core of the system, there is always a sync engine that acts as a persistent buffer between the frontend and the backend. At a high level, this is how it works:
The client always reads from and writes to a local store that is provided by the engine. As far as the app code is concerned, it runs locally in memory.
That store is responsible for updating the state optimistically, persisting the data locally in the browser's storage, and syncing it back and forth with the backend, including dealing with potential complications and edge cases.
The backend implements the other half of the engine, to allow pulling and pushing data, notifying the clients when data has changed, persisting the data in a database, etc.
Different implementations of sync engines make different tradeoffs, but the basic idea is always the same.
If you've been following trends in the web-dev world, you'd know that sync engines have been a centrepiece in several of them, namely: progressive web apps, offline-first apps, and the lately trending term: local-first software. You might have even looked into some of the databases that offer a built-in sync engine such as PouchDb or online services that do the same (e.g., Firestore). I have too, but my general feeling over the last few years has been that none of it is quite hitting the nail on the head. Progressive web apps were about users "installing" shortcuts to websites on their home screens as if they were native apps, despite not needing installation being maybe "the" benefit of the web. "Offline-first" made it sound like offline mode is more important than online, which for 99% of web apps is simply not the case. "Local-first" is admittedly the best name so far, but the official local-first manifesto talks about peer-to-peer communication and CRDTs (a super cool idea but one that is rarely used for anything besides collaborative text editing) in a world of full client-server web applications that are trying to solve practical problems like the ones I described above. Ironically, many tools that are part of the current "local-first" wave adopted the name without adopting all the principles.
The one that drew my attention and interest the most is called "Replicache." Specifically, I was intrigued by it exactly because it's NOT a self-replicating database or a black-box SaaS service that you have to build your entire app around. Instead, it offers much more control, flexibility, and separation of concerns than any off-the-shelf solution I have encountered in this space.
Replicache is a library. On the frontend, it requires very little wiring and effectively functions as a normal global store (think Zustand or a Svelte store). It has a chunk of state (in our example, each list has its own store). It can be mutated using a set of user-defined functions called "mutators" (think reducers) like "addItem", "deleteItem," or anything you want, and exposes a subscribe function (I am simplifying, full API here).
Behind this familiar interface lies a robust and performant client-side sync engine that handles:
Initial full download of the relevant data to the client.
Pulling and pushing "mutations" to and from the backend. A mutation is an event that specifies which mutator was applied, with which parameters (plus some metadata).
When pushing, these changes are applied optimistically on the client, and rolled back if they fail on the server. Any other pending changes would be applied on top (rebase).
The sync mechanism also includes queuing changes if the connection is lost, retry mechanisms, applying changes in the right order, and de-duping.
Caching everything in memory (performance) and persisting it to the browser storage (specifically IndexedDB) for backup.
Since the same storage is accessible from all the tabs of the same application, the engine deals with all the implications of that—like what to do when there was a schema change but some tabs have refreshed and some haven't and are still using the old schema.
Keeping all the tabs in sync instantly using a broadcast channel (since relying on the shared storage is not fast enough).
Dealing with cases in which the browser decides to wipe out the local storage.
You might have noticed that this right here addresses a big chunk of the problems I listed at the top of this post. Being mutations-based also lends itself to features like undo/redo.
In order for all of this to work, it's your backend's job to implement the protocol that Replicache defines. Specifically:
If it sounds non-trivial to you, you are right. I don't think I fully wrapped my head around all the details of how it operates with the database. I'll need to do a follow-up project in which I totally refactor the database structure and/or add meaningful features to the example (e.g., version history) in order to get closer to fully grokking it. The thing is, I know how valuable this level of control is when building and maintaining real production apps. In my book, spending a week or two thinking deeply about and setting up the core part of your application is a great investment if it creates a strong foundation to build and expand upon.
The best (and arguably only) way to learn anything new is by getting your hands dirty—dirty enough to experience some of the tradeoffs and implications that would affect a real app. As I was going over the examples on the Replicache website, I noticed there were none for Sveltekit. I have been a huge Svelte fan since Svelte 3 was released, but only recently started playing with Sveltekit. I thought this would be an awesome opportunity to learn by doing and create a useful reference implementation at the same time.
Porting an existing codebase to a different technology is educational because, as you translate the code, you are forced to understand and question it. Throughout the process, I experienced multiple eureka moments as things that seemed odd at first clicked into place.
Sveltekit doesn't natively support WebSockets, and even though it does support server-sent events, it does so in a clumsy way. Express supports both nicely. As a result, I used svelte-sse for server-sent events. One somewhat annoying quirk I ran into is that since svelte-sse returns a Svelte store, which my app wasn't subscribing to (the app doesn't need to read the value, just to trigger a pull as I described above), the whole thing was just optimized away by the compiler. I was initially scratching my head about why messages were not coming through. I ended up having to implement a workaround for that behavior. I don't blame the author of the library; they assumed a meaningful value would be sent to the client, which is not the case with 'poke'.
SvelteKit's filesystem-based routing, load functions, layouts, and other features allowed for a better-organized codebase and less boilerplate code compared to the original Express backend. Needless to say, on the frontend, Svelte is miles ahead of web components, resulting in a frontend codebase that is smaller and more readable even though it has more functionality (the original example TodoMVC was missing features such as "mark all as complete" and "delete completed").
Overall, I love Sveltekit and plan to keep using it in the future. If you haven't tried it, the official tutorial is an awesome introduction.
Overall, I am super impressed by Replicache and would recommend trying it out. At the basic level (which is all I got to try at this point), it works very well and delivers on all its promises. With that said, here are some general concerns (not todo app related) I have and thoughts related to them:
Replicache is not open source (yet! see the point below) and is free only as long as you're small or non-commercial. Given the amount of work (>2 years) that went into developing it and the quality of engineering on display, it seems fair. With that said, it makes adopting Replicache more of a commitment compared to picking up a free, open library. If you are a tier 2 and up paying customer, you get a source license so that if Replicache shuts down for some reason, your app is safe. Another option is to roll out your own sync engine, like the big boys (Linear, Figma) have done, but getting to the quality and performance that Replicache offers would be anything but easy or quick.
Crazy plot twist (last minute edit): As I was about to publish this post I discovered that Replicache is going to be opened sourced in the near future and that its parent company is planning to launch a new sync-engine called "Zero". Here is the official announcement. It reads: "We will be open sourcing Replicache and Reflect. Once Zero is ready, we will encourage users to move." Ironically, Zero seems to be yet another solution that automagically syncs the backend database with the frontend database, which at least for me personally seems less attractive (because I want separation of concerns and control). With that said, these guys are experts in this domain and I am just a dude on the internet so we'll have to wait and see. In the meanwhile, I plan on playing with Replicache some more.
No, a sync engine shouldn't be used for everything. The good news is that you can have parts of your app using it while other parts still submit forms and wait for the server's response in the conventional manner. SvelteKit and other full-stack frameworks make this integration easy.Obvious situations where using a sync engine is a bad idea:
Optimistic updates make sense only when client changes are highly likely to succeed (with rollbacks being rare) and when the client possesses enough information to predict outcomes. For instance, in an online test where a student's answer must be sent to the server for grading, optimistic updates (and hence a sync engine) wouldn't be feasible. The same applies to critical actions such as placing orders or trading stocks. A good rule of thumb is that any action dependent on the server and incapable of functioning offline should not rely on a sync engine.
Any app dealing with huge datasets that cannot be fit on user machines. For example, creating a local-first version of Google or an analytics tool processing gigabytes of data to generate results is impractical. However, in scenarios where partial synchronisation suffices, a sync engine can still be beneficial. For instance, Google Maps can download and cache maps on client devices to operate offline, without needing high-resolution maps for every location worldwide all the time.
My impression is that having a sync engine can make DX (developer experience) much nicer. Frontend engineers just work with a normal store that they can subscribe to updates, and the UI always stays up to date. No need to think about fetching anything, calling APIs or server actions for the parts of the app that are governed by the sync engine. On the backend, I can't say much yet. It seems like it won't be harder than a traditional backend but I can't say for sure.
It's exciting to imagine the future of web apps as planet scale, real-time multi-player collaboration tools that work reliably regardless of network conditions, while at the same time making these nasty problems I started this post with a thing of the past.I highly recommend fellow web developers to get themselves familiar with these new concepts, experiment with them, and maybe even contribute. Thanks for reading. Leave a comment if you have any questions or thoughts. Peace..
P.S with Aaron Boodman, the founder of the company that created Replicache, is great. Watch it and thank me later.