This article continues my series chronicling my investigation into JavaScript performance by creating HeapViz, a visualization tool for Chrome memory profiles. Today I am going to talk about my choice of visualization methods. If you missed it, catch part 1 here.
(source: Wikimedia Commons)
The value delivered by my tool will be determined by its ability to quickly diagnose issues with a particular memory profile. Thinking about ways that I could leverage intuition engineering to enhance the visualization, I came up with three criteria for success:
To establish a baseline effectively, I needed something that would represent a lot of related data at a glance. My two tools for representing nodes would be size and color. By drawing nodes according to size, I would be able to quickly highlight areas of an app that have exceptionally large footprints. Similarly, color-coding nodes would allow at-a-glance analysis of the state of a heap.
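As a rough illustration of the size-and-color idea, here is a minimal sketch in plain JavaScript. The node shape (`name`, `type`, `selfSize`), the palette, and the choice to key color on node type are all assumptions made for the example, not HeapViz's actual data model. The one deliberate design choice is scaling the radius with the square root of the size, so that a circle's area (rather than its radius) tracks the node's footprint.

```javascript
// Hypothetical palette keyed on node type; a Map avoids accidental hits on
// inherited object properties such as "constructor".
const TYPE_COLORS = new Map([
  ["object", "#1f77b4"],
  ["string", "#2ca02c"],
  ["array", "#ff7f0e"],
  ["closure", "#d62728"],
]);
const FALLBACK_COLOR = "#7f7f7f";

// Area-proportional radius: area = pi * r^2, so r scales with sqrt(size).
function radiusFor(selfSize, maxSize, maxRadius) {
  if (maxSize <= 0) return 0;
  return maxRadius * Math.sqrt(selfSize / maxSize);
}

function colorFor(type) {
  return TYPE_COLORS.has(type) ? TYPE_COLORS.get(type) : FALLBACK_COLOR;
}

// Turn raw nodes into drawable circles (radius and fill only, no positions).
function encode(nodes, maxRadius = 50) {
  // reduce instead of Math.max(...sizes): spreading a few hundred thousand
  // arguments can exceed the engine's call-stack limit.
  const maxSize = nodes.reduce((m, n) => Math.max(m, n.selfSize), 0);
  return nodes.map((n) => ({
    name: n.name,
    r: radiusFor(n.selfSize, maxSize, maxRadius),
    fill: colorFor(n.type),
  }));
}
```

With this encoding, a node four times the size of another gets twice the radius, which keeps one huge allocation from visually swallowing everything else while still standing out.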
With this general idea, I tackled the more specific problem of communicating problem areas. Taking some cues from the output of the Chrome heap profile tool and my own experience, I knew that node self size and retained size were of critical importance. I also knew that I wanted some way of representing retainers because of their critical role in figuring out a fix for memory issues.
Looking for a format that allowed for separately sized, color-coded entities with an indication of the relationships between them led me to the force-directed graph.
(source: Martin Grandjean)
Force-directed graphs are great! They check all the boxes for communicating importance: they efficiently represent nodes of varying sizes, they can be color-coded, and they show you relationships between nodes. D3 even provides a force layout module that makes it simple to implement one of these suckers.
Unfortunately, they do not satisfy the pesky performance requirement. Force-directed layouts are expensive to compute. Most browser implementations take minutes to lay out even a few thousand nodes. Furthermore, as they get large, they get extremely visually congested.
A force-directed graph with 200,000 nodes (source: graphmap.net)
If my tool takes many minutes to lay out a heap, or if it is difficult to get relevant diagnostic information about a single node at a glance, it will not be more useful than just parsing the data by hand. In the end, I decided to pass on the force-directed graph.
One thing I did like about force-directed graphs was their circular representation of nodes. It was visually attractive and easy to reason about. If only they weren’t so darn expensive to compute!
A lot of the complexity in rendering a force layout comes from drawing the relationships between nodes. If I could find a layout that was similar but did not explicitly draw edges, I might be able to render the volume of nodes that I needed to.
Enter the circle pack layout:
(source: Mike Bostock and Jeff Heer)
I saw some potential here. It has many of the advantages of a force-directed graph (circular nodes, colored nodes, and an at-a-glance sense of relative size) without the computational overhead of laying out a bunch of lines between objects.
I also saw a couple of downsides:
To address the first point, I decided that I needed to flatten my data as much as possible. Remember that memory is generally represented as a graph, and sometimes as a dominator tree. It is not stratified by default, though it can be grouped by type or other qualifiers if desired.
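To make the flattening concrete, here is a small sketch in plain JavaScript. It assumes a hypothetical flat list of heap nodes with `name`, `type`, and `selfSize` fields and ignores edges entirely; the function name and options are made up for illustration. The output is a nested object with `children` arrays and a numeric `value` on each leaf, the kind of shape that hierarchy-based layouts such as D3's circle packing usually consume.

```javascript
// Build a shallow hierarchy from a flat list of heap nodes.
// With groupByType = false every node is a direct child of the root;
// with groupByType = true there is exactly one extra level, keyed on type.
function flattenHeap(nodes, { groupByType = false } = {}) {
  const leaves = nodes.map((n) => ({
    name: n.name,
    type: n.type,
    value: n.selfSize, // leaf size used by the layout
  }));

  if (!groupByType) {
    return { name: "heap", children: leaves };
  }

  // Map preserves insertion order, so groups appear in first-seen order.
  const groups = new Map();
  for (const leaf of leaves) {
    if (!groups.has(leaf.type)) groups.set(leaf.type, []);
    groups.get(leaf.type).push(leaf);
  }

  return {
    name: "heap",
    children: [...groups].map(([type, children]) => ({ name: type, children })),
  };
}

// Example input (made-up values).
const sample = [
  { name: "cache", type: "object", selfSize: 1200 },
  { name: "title", type: "string", selfSize: 64 },
  { name: "config", type: "object", selfSize: 300 },
];
const flat = flattenHeap(sample);
const grouped = flattenHeap(sample, { groupByType: true });
```

Keeping the tree at most two levels deep means the layout only ever nests circles once, which is about as flat as a grouped view can get.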
The second point I decided to mulligan on. I liked how a circle packing layout looked and decided that the only indicator I would display for retainers would be a text list of them and a number on the node. The value in knowing retainers tends to come after a problem has been identified, so I decided to simplify the initial visualization to include only those elements that highlight problem areas.
You might be wondering: if performance on large data sets is such a concern, why not use a treemap?
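The "number on the node plus a text list" approach could be derived from the heap's edges roughly as follows. This is only a sketch: the edge shape (`from` retains `to`) and the function name are assumptions for the example, not the format of a Chrome heap snapshot.

```javascript
// Group edges by the node being retained, so each node knows who holds it.
// Returns a Map from node name to { count, names } where count is the badge
// number shown on the circle and names feeds the text list.
function summarizeRetainers(edges) {
  const byTarget = new Map();
  for (const { from, to } of edges) {
    if (!byTarget.has(to)) byTarget.set(to, { count: 0, names: [] });
    const entry = byTarget.get(to);
    entry.count += 1;
    entry.names.push(from);
  }
  return byTarget;
}

// Example edges (made-up values): "from" retains "to".
const retainers = summarizeRetainers([
  { from: "Window", to: "cache" },
  { from: "listener", to: "cache" },
  { from: "cache", to: "blob" },
]);
const cacheInfo = retainers.get("cache");
const cacheList = cacheInfo.names.join(", ");
```

Because no edge is ever drawn, the cost here is a single pass over the edges, and the list can be shown only when a node is selected.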
(source: MDN)
If I am being honest, the reasons I steered away from a treemap originally went something like:
I might add a couple of extra points now that I have adopted circle packing as my visualization method of choice.
So, that’s settled! I decided to use the circle packing layout and consider it a fine choice for visualizing a memory heap.