Tom Lagier

@tomlagier

JavaScript Renderers in All Shapes and Sizes

A Tale of JavaScript Performance, Part 3

This article continues my series chronicling my investigation into JavaScript performance by creating HeapViz, a visualization tool for Chrome memory profiles. Today I will take a deep dive into how I built the renderer. If you missed it, check out part 1 and part 2.

Starting with D3 and SVG

With my file format parsed and my visualization method chosen, the time had come to get down to it and actually draw some nodes.

To learn how to implement a circle-packing layout, I started by going over some helpful tutorials. I read up on d3-hierarchy’s pack layout and found a couple of excellent examples on bl.ocks.org. It didn’t take long before I had a rudimentary renderer churning out SVGs from a dummy heap profile. It looked like this:
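A minimal sketch of the per-node markup, assuming each heap node got a `<circle>`, a `<clipPath>`, and a clipped `<text>` label — the x/y/r values would come from d3-hierarchy’s pack layout, and the names and ids here are hypothetical, not the actual HeapViz code:

```javascript
// Sketch: the SVG fragment emitted for a single packed node.
// { x, y, r } are the center and radius computed by d3.pack();
// color maps from the node's type, label is its display name.
function nodeMarkup({ id, x, y, r, color, label }) {
  return [
    // The visible node itself
    `<circle cx="${x}" cy="${y}" r="${r}" fill="${color}"/>`,
    // A clip path so the label never spills outside the circle
    `<clipPath id="clip-${id}"><circle cx="${x}" cy="${y}" r="${r}"/></clipPath>`,
    // The clipped label
    `<text x="${x}" y="${y}" clip-path="url(#clip-${id})">${label}</text>`,
  ].join('\n');
}
```

Note that this produces four DOM nodes per heap node (the circle, the clipPath and its inner circle, and the text), which is what makes the approach fall over at scale, as described below.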

This was my first attempt, and it worked! I was able to draw this visualization of a small sample of my test profile:

Output of a small sample on the SVG renderer

It looks like we have accomplished a few of our goals! We can clearly identify the outlier nodes, we get a quick feel for how the major node types are distributed, and we can easily see which nodes are retaining a lot of memory. The color scheme might need a little work, but for a first pass this seems like a simple and robust solution.

With the proof of concept working well on a small sample, I decided to throw it against my upper bound — a profile from an app at work that contains a meager 1,050,000 nodes. Predictably, it blew the tab’s ~4GB memory limit in just a couple of minutes, locking up the browser.

It didn’t take long to reason through why — the snippet above renders 4 DOM nodes per heap node, including a relatively weighty <circle> and <clipPath>. A browser might be able to handle a million empty <div>s, but there is no way it can handle a million paths in an SVG. Playing around with the renderer led to an upper limit of around 10,000 nodes on my MacBook Pro.

Do we really need to render everything?

In order to achieve the goal of making a tool useful enough to debug issues with most memory profiles, an intelligent filtering algorithm that displayed the largest nodes along with some useful metadata would probably be just fine. I’d wager that a view of as few as 100 nodes at a time would be more than sufficient for highlighting any problem areas.

Here’s the issue: I didn’t want to build that tool. I wanted to build a tool for visualizing an entire heap profile, not just the subset that I assumed was most useful for my users. Furthermore, I was really intrigued by the challenge. What tools and techniques could I use for rendering a full two orders of magnitude more nodes?

I decided to see what I could do to squeeze every last drop of juice out of a renderer and take on the million node challenge.

Take 2: PixiJS with Vectors

SVG is clearly not memory efficient enough as an output format, so I would need to use one of the web’s other options — either a standard canvas 2D drawing context or WebGL. WebGL gives me the finest-grained control over the memory footprint of my renderer, so it was a natural first choice. I chose the PixiJS framework because it is optimized for efficient 2D rendering.

Here’s how the renderer in Pixi looks:
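A sketch of the batching at the heart of that renderer — the actual PIXI.Graphics draw calls are left to a `drawBatch` callback, and all of the names here are hypothetical stand-ins for the real code:

```javascript
// Sketch: progressive, batched drawing. Splitting the node list into
// fixed-size batches and yielding between them keeps the page painting
// while the render proceeds (the "progressive draw" effect).
const BATCH_SIZE = 5000; // somewhat arbitrary, per the article

function* batches(nodes, size = BATCH_SIZE) {
  for (let i = 0; i < nodes.length; i += size) {
    yield nodes.slice(i, i + size);
  }
}

// drawBatch would issue the PIXI calls, e.g. for each node:
//   graphics.beginFill(color).drawCircle(x, y, r).endFill();
// In the browser, schedule would be requestAnimationFrame.
function renderProgressively(nodes, drawBatch, schedule = fn => setTimeout(fn, 0)) {
  const iter = batches(nodes);
  (function tick() {
    const { value, done } = iter.next();
    if (done) return;
    drawBatch(value); // draw this chunk of nodes
    schedule(tick);   // yield to the browser before the next chunk
  })();
}
```

Injecting the scheduler also makes the batching easy to test synchronously, independent of Pixi.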

There is a little magic being done behind the scenes by ember-cli-pixijs, but for the most part the rendering is straightforward. I did run into a couple of challenges:

  1. Antialiasing. Even with antialiasing enabled in the renderer, I ended up needing to 2x my resolution and compress the image using a CSS transform in order to get relatively crisp edges.
  2. UI lock during rendering. In the snippet above, you can see that we batch our draws with a somewhat arbitrary batch size of 5,000 nodes. This allows for a pretty nifty progressive draw effect but does slow down our render.

With this renderer, I was able to get to around 50,000 nodes without issue:

We’re moving in the right direction, but 50k is still a long way from 1 million. Switching to vector-based WebGL got us a 5x improvement in nodes rendered, but it was not able to break through the 100k barrier, let alone 1 million.

After hitting the books once more, I found my major issue. I was misusing the PixiJS library! It was created as the renderer for the Phaser game engine, and while it can draw vectors, it has been optimized around drawing sprites. To use it to its full potential, I needed to convert my circles to rasterized textures.

Next Attempt: PixiJS with Textures

My strategy for rasterizing my circles was simple — create a texture for each node type by drawing a large circle in each color of the color scale I was using, and then reference that texture by node type when drawing the node. Here’s what I added:
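In sketch form — the Pixi-specific part is shown only as a comment, since it needs a live renderer, and the testable piece is the power-of-two sizing; all names here are hypothetical:

```javascript
// Sketch: one rasterized circle texture per node type.
// With a real PIXI renderer in hand this would look roughly like:
//   const g = new PIXI.Graphics();
//   g.beginFill(colorForType(type)).drawCircle(size / 2, size / 2, size / 2).endFill();
//   textures[type] = renderer.generateTexture(g);
// and each node then becomes a PIXI.Sprite(textures[type]) scaled to its radius.

// WebGL performs best with power-of-two texture dimensions, so the base
// circle's diameter is rounded up to the next power of two.
function nextPowerOfTwo(n) {
  return 2 ** Math.ceil(Math.log2(Math.max(1, n)));
}

console.log(nextPowerOfTwo(100)); // 128
```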

One “gotcha” in this code is that we need to generate our basic textures as a power of 2. This is a well-documented property of WebGL that will cause degraded performance if violated.

Here’s the output with the sprite-based renderer:

Boom! 1 mill… wait, no, that’s still only 220,000 nodes. And it took nearly 10 minutes to render. Pretty dang good, but still not good enough.

Edit: /u/grinde pointed out that I should have used a ParticleContainer instead of a normal PIXI.Container to render my sprites, as it is optimized for rendering tons of simple sprites. I will do some benchmarks and report how it stacks up (hah) against stackgl.

Extra Credit: stackgl

Recently, I stumbled on this spectacularly helpful benchmark of all of the major WebGL frameworks. PixiJS holds up fairly well at the high end, but there is one hands-down favorite: stackgl.

Stackgl isn’t so much a WebGL framework as a collection of composable utilities that make it easier to assemble shaders. While all of the other frameworks have robust primitive support, stackgl makes you roll your own — but the obvious upshot is screaming performance.

Even though at this point my rendering speed was far outpacing my other bottlenecks, I thought it was worth it to give stackgl a shot for three reasons:

  1. Large profile renders were still taking a significant amount of time.
  2. Object picking with the mouse (i.e. which node am I hovering) was really slow on large profiles with Pixi.
  3. It would get me closer to the GPU, giving me some valuable experience writing shaders.

Thankfully, there is a good example of using stackgl to render primitives. I was able to leverage that to build the following:
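A sketch of the core idea, assuming one screen-aligned quad per circle with a fragment shader that discards pixels outside it — the shader source and geometry helper below are illustrative stand-ins, not the actual HeapViz code (the real version would wire these up with stackgl modules like gl-shader and gl-buffer):

```javascript
// Vertex shader: pass each quad corner through, carrying its
// position in [-1, 1] "circle space" for the fragment shader.
const vert = `
attribute vec2 position; // quad corner in clip space
attribute vec2 uv;       // corner in unit-circle space
varying vec2 vUv;
void main() {
  vUv = uv;
  gl_Position = vec4(position, 0.0, 1.0);
}`;

// Fragment shader: anything outside the unit circle is discarded,
// turning each quad into a filled circle.
const frag = `
precision mediump float;
varying vec2 vUv;
uniform vec4 color;
void main() {
  if (dot(vUv, vUv) > 1.0) discard;
  gl_FragColor = color;
}`;

// Expand { x, y, r } circles into 6 vertices (two triangles) each,
// interleaved as [x, y, u, v] per vertex for a single buffer upload.
function circleQuads(circles) {
  const verts = [];
  const corners = [[-1, -1], [1, -1], [1, 1], [-1, -1], [1, 1], [-1, 1]];
  for (const { x, y, r } of circles) {
    for (const [u, v] of corners) verts.push(x + u * r, y + v * r, u, v);
  }
  return new Float32Array(verts);
}
```

Uploading all circles as one interleaved buffer and letting the shader do the circle test is what keeps the per-node cost so low compared to per-sprite bookkeeping.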

This iteration of the renderer proved to be almost 3x faster than the previous, and opened up a lot of doors for optimization around interaction. I finally had a renderer that would hold up to the largest profiles I could throw at it!

So, if my bottleneck was no longer the renderer, what was it? Find out next time!

Up next — Part 4: Working with Workers for a Jank-Free UI
