performance by creating HeapViz, a visualization tool for Chrome memory profiles. Today I will be talking about my experience creating a jank-free UI by using web workers. If you missed it, check out part 1, part 2, and part 3.
This guy is gonna make sure your main thread isn’t doing too much
Why a worker?
A web worker runs JavaScript off the main thread, freeing the user to interact with your webpage and potentially allowing you to perform extremely heavy tasks on multiple threads simultaneously. The decision to use a web worker is fairly straightforward in any web application. If you have heavy work to do, and that work does not need access to the DOM or many other browser APIs, you do it in a worker!
While optimizing my renderer, doing all of my work on the main thread led to some serious UI locking (or jank). My heaviest renders were pushing 10+ minutes of constantly locked UI. That might fly for a pre-rendered demo, but it absolutely would not work for a real application. UX sticklers would argue that any jank at all is too much!
These guys hate jank
To figure out how to avoid jank, I broke the application down to its 4 basic steps:
- Loading the profile
- Parsing the profile
- Calculating the layout
- Rendering the layout
Of those steps, loading the profile and rendering the layout have to be done in the main thread. Web workers have no access to the various file upload APIs, and, until OffscreenCanvas finds more stable footing, have no way of interacting with a canvas either.
The profile loading is non-blocking by default because a FileReader is always asynchronous. One interesting fact about FileReaders is that they can be used in a web worker context. This might seem like a promising way to parse File objects more efficiently, but it’s a bit of a red herring: File objects are not Transferable, so they need to be passed to and from workers by value. This means that we end up keeping two copies of the file in memory if we pass it to a worker this way. For 140MB heap profiles, that will not do!
The better solution is to just use FileReader.readAsArrayBuffer, which spits out an ArrayBuffer. An ArrayBuffer is Transferable, so we can pass it by reference. We only keep a single copy of the file in memory, and the transfer is almost instantaneous.
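As a rough sketch of that hand-off (not HeapViz's actual code), the buffer goes in the second argument of postMessage, the transfer list. The helper names, the `type: 'parse'` message shape, and the idea of passing the worker in as a parameter are all assumptions made for illustration.

```javascript
// Hypothetical helper: hands a profile's bytes to a worker without copying them.
// `worker` is anything with a Worker-style postMessage(message, transferList).
function sendProfileToWorker(worker, buffer) {
  // Listing the buffer in the transfer list moves ownership to the receiver
  // instead of cloning it; the sender's reference is detached afterwards
  // (its byteLength becomes 0).
  worker.postMessage({ type: 'parse', buffer }, [buffer]);
}

// Assumed browser wiring: `file` would come from a file input,
// and `worker` from something like new Worker('worker.js').
function loadProfile(file, worker) {
  const reader = new FileReader();
  // reader.result is an ArrayBuffer once readAsArrayBuffer completes.
  reader.onload = () => sendProfileToWorker(worker, reader.result);
  reader.readAsArrayBuffer(file);
}
```

One consequence worth keeping in mind: after the transfer the main thread can no longer read the buffer, so anything it needs from the raw file has to be pulled out before posting.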
To keep the rendering non-blocking, I just needed to make sure that I only do as much work as I can fit in a single frame. The goal for a jank-free UI is to render at 60 frames per second, which gives you a new frame about every 16.7ms. Conventional wisdom is to keep a frame’s worth of work to 10ms to allow for browser housekeeping.
Rendering in frame-sized slices gives the cleanest user experience while costing the least amount of render time. Ideal!
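The frame budget described above can be sketched as a loop that draws until roughly 10ms have passed and then yields until the next frame. This is a minimal illustration, not the renderer's real code: `renderInChunks`, `drawItem`, and the injected `schedule` and `now` parameters are hypothetical names, injected so the sketch does not depend on a browser.

```javascript
const FRAME_BUDGET_MS = 10;

// Hypothetical helper: draws `items` in slices that each fit one frame budget.
// `schedule` queues a callback for the next frame and `now` returns a
// millisecond timestamp.
function renderInChunks(items, drawItem, schedule, now, onDone) {
  let index = 0;

  function renderFrame() {
    const start = now();
    // Draw until the list is exhausted or this frame's budget is spent.
    while (index < items.length && now() - start < FRAME_BUDGET_MS) {
      drawItem(items[index]);
      index += 1;
    }
    if (index < items.length) {
      schedule(renderFrame); // yield to the browser and continue next frame
    } else if (onDone) {
      onDone();
    }
  }

  schedule(renderFrame);
}
```

In a browser the last two arguments would typically be `(cb) => requestAnimationFrame(cb)` and `() => performance.now()`. Because the check runs before each item, a single slow item can still overrun the budget, so the items themselves need to be small.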
In the Worker
With my main-thread activity made non-blocking, I now had two bodies of work I could do in a worker — parsing the profile and calculating the layout. Doing these in the worker keeps the main thread free for any other activity I might want to do.
Parsing the profile
Receiving the profile as an ArrayBuffer was no issue: the heap profile format is just an extremely compact JSON format, so TextDecoder.decode and JSON.parse work fine to transform it from an ArrayBuffer to an object. At that point, it can be happily shuttled off to our heap profile parser and inflated into a proper data structure.
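A minimal sketch of that decoding step, assuming the bytes are UTF-8 JSON. `decodeProfile` is a hypothetical name, and the real worker presumably passes the result straight to the heap profile parser.

```javascript
// Hypothetical helper: turns transferred bytes back into a plain object.
function decodeProfile(buffer) {
  // TextDecoder.decode accepts an ArrayBuffer (or a typed-array view)
  // and returns the decoded string.
  const text = new TextDecoder('utf-8').decode(buffer);
  return JSON.parse(text);
}
```

Note that this materializes the whole file as one string before parsing, so it only works while the profile's text stays under the engine's maximum string length.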
Returning the heap as a Transferable needed a little more nuance. I return my nodes with additional data in a much more verbose format to minimize the deserialization necessary on the main thread. To accommodate this, I needed to create a separate wire format to dodge the string character limit (512MB on Chrome) for very large profiles. Once I have the nodes in this compact format, I just JSON.stringify and TextEncoder.encode to transfer the representation across.
This is an extremely fast way to transfer large objects to and from a worker.
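The return trip might look roughly like the following. `encodeForTransfer` and the `type: 'nodes'` message shape are assumptions for illustration, and the wire format itself is not shown because the article does not describe it.

```javascript
// Hypothetical helper: serializes a value into bytes that can be transferred.
function encodeForTransfer(value) {
  const bytes = new TextEncoder().encode(JSON.stringify(value));
  // encode() normally returns a view over a freshly allocated buffer of the
  // exact size; fall back to a copy only if the view covers part of it.
  return bytes.byteLength === bytes.buffer.byteLength
    ? bytes.buffer
    : bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

// Assumed worker-side usage:
// const buffer = encodeForTransfer(compactNodes);
// self.postMessage({ type: 'nodes', buffer }, [buffer]);
```

JSON.stringify still has to produce a single string here, which is why keeping the wire format compact matters for very large profiles.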
The last piece of heavy lifting to do on the worker is to calculate the layout. I am still using d3-hierarchy’s pack layout, which is distributed as a piece of the self-contained hierarchy package. Applying the layout to the data in the worker is easy: just format the nodes in a hierarchy with a value assigned to each node and d3 will take care of the rest, returning a structure with the x, y, and radius of each circle.
That’s it! This is the engine that does all of the “magic” in the layout, in only 7 lines of code. Perfect… almost. These 7 lines of code are where we spend the vast majority of our time during the whole program. As I mentioned in part 2, one major disadvantage to circle packing is that it is quite computationally intensive.
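To make the "format the nodes in a hierarchy" step concrete, here is a sketch that nests a flat node list into the `children`/`value` shape d3-hierarchy consumes. The flat `{ id, parentId, size }` input shape and the `buildTree` name are assumptions; the real parser output may look quite different. The d3 calls are left as comments so the block runs without the library.

```javascript
// Hypothetical flat node shape: { id, parentId, size }.
function buildTree(flatNodes) {
  const byId = new Map();
  for (const node of flatNodes) {
    byId.set(node.id, { id: node.id, value: node.size, children: [] });
  }

  let root = null;
  for (const node of flatNodes) {
    const entry = byId.get(node.id);
    const parent = byId.get(node.parentId);
    if (parent) {
      parent.children.push(entry);
    } else {
      root = entry; // a node without a known parent is treated as the root
    }
  }
  return root;
}

// With d3-hierarchy loaded in the worker, the layout step would be roughly:
// const root = d3.hierarchy(buildTree(nodes)).sum((d) => d.value);
// d3.pack().size([width, height])(root);
// Each node in root.descendants() then has x, y, and r set.
```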
We’ve accomplished our goal of making our interface jank-free by moving our intense computation to a worker and keeping our render non-blocking. There is just one more hurdle to cross: how can we make those 7 little lines as fast as possible? Find out next time!