A Tale of Javascript Performance, Part 5 This article continues my series chronicling my investigation into performance by creating , a visualization tool for Chrome memory profiles. On the ticket today is a look at how I used WebAssembly to get a large performance increase on my most expensive operation. If you missed it, check out , , , and . JavaScript HeapViz part 1 part 2 part 3 part 4 is a hot topic in the ecosystem right now. The ability to compile a handful of mature languages with granular memory (currently C, C++, and Rust) to an extremely performant assembly-like language has a lot of appeal. While it might seem like it’s a new development, the has been around for over 5 years so there are plenty of resources out there to help developers of any skill level get started giving their apps a huge performance boost. WebAssembly Javascript management toolchain for building it Understanding the Layout If you’ve been following along, you know that I’ve come to a point where I’ve optimized , and . The remaining bottleneck, as a couple of you astutely noted, is my layout algorithm. I mentioned that one of the major tradeoff of doing a circle-packing layout is that computing the layout is quite expensive. my loading and parsing my rendering Why is that? Well, if you’re the algorithmically inclined sort, take a few minutes and think about how you would write a process for doing it. Go ahead, I’ll wait. Not so easy, is it? Circle packing is actually a really challenging problem. There are devoted to finding new and better ways to pack circles with a variety of restrictions. For our particular variant, known in the field as the “unequal circles in a circle” problem, I am on d3-hierarchy’s . This is an implementation of a combination of and the (check out this by Mike Bostock, the creator of d3). entire internet communities leaning pack module Wang, et al’s sibling packing algorithm Matoušek-Sharir-Welzl algorithm beautiful explanation Both of these algorithms are recursive, with a high degree of complexity and a lot of integer math. They also are completely self-contained, and together make up the majority of the actual work my app needs to perform for any given layout. This makes them ideal candidates for compilation into WebAssembly! I decided to take the plunge and rewrite my hot path in a language that I can compile to the fastest execution context the web offers. WebAssembly 102 does a great job of explaining what WebAssembly is and why it matters. If you’re unfamiliar with WebAssembly, give it a read — it’s the best 5 minute introduction I have found. I’m going to skip past the overview and jump right in to implementation details. This cartoon post To compile any language to WebAssembly, you will need the Emscripten toolchain. I started by following the to get that installed, the compiled, and my first running in the browser. instructions and tutorial here emsdk hello.c Output Format One thing that might strike you as curious is the output format of . While it be used to output straight WebAssembly, there is a decent amount of setup that needs to be done on the browser to actually load and run the file. When you run with the flag, Emscripten helpfully generates a loader file and a sandbox file that you can use to play around with your WebAssembly code. emcc can .wasm emcc -o {filename}.html index.js index.html The Emscripten sandbox This sandbox is extremely valuable for debugging, it gives you a quick way to REPL your module and captures all of the output of your module on screen. I found myself, while debugging my build, running in the output directory of for a super light little development environment. python -m SimpleHTTPServer 3005 emcc Once I had finished my development, I incorporated that same loading boilerplate into my app bundle, which exposes the object Emscripten uses to mount your WebAssembly interface to. Module While it is easy to do development like this, it is not particularly efficient. You need to do a lot of manual reloading and copy your compiled files over to their target project after you are done. I think there is an opportunity for a Webpack plugin that will take a source , , or file, perform the compilation, and automatically include the results in your project. c cpp rust emcc Bindings, Three Ways After you have played around a bit with hello world and the sandbox, you will quickly see the need to do Javascript <-> WebAssembly bindings. A binding is a way to reference Javascript from WebAssembly, or vice versa. It describes the interface between WebAssembly and the rest of your code. One problem I had getting started with WebAssembly was the you could do these bindings. By my count there are three unique ways to do WebAssembly -> Javascript: sheer number of different ways ccall, cwrap and direct function calls WebIDL Embind The first two ways of doing bindings are for C code only. They involve passing a command line flag to describe your exported functions: emcc test.cpp -s WASM=1 -s EXPORTED_FUNCTIONS=”['_int_sqrt']" -o index.html With this flag, you tell the compiler to make a binding to that particular function. You can then reference it from Javascript in a number of ways: EXPORTED_FUNCTIONS That last method of invocation seems like it would be the simplest, and it certainly is for integers and strings. For more complicated data structures, however, it gets quite a bit trickier as you need to tell it exactly how it can convert your Javascript data into a structure that C can understand. I would recommend this style if the code you are optimizing is constrained to a single function or small set of functions that take relatively simple input — numbers and strings are perfect. If you are going to be passing around larger structures, I would instead suggest writing in C++ so you can leverage Embind and WebIDL. What is WebIDL? WebIDL is a that describes a standard language for interfaces between external code and the browser. Similarly, it can be used to describe the interface between WebAssembly and Javascript. It uses a very familiar C++-style notation: W3C specification For WebAssembly, Emscripten that consumes an IDL file and turns it into glue code that you then include when compiling your . Unfortunately, I was not able to get this pipeline working well for my classes and so I decided to try Embind instead. provides a python script wasm I do think that as the WebAssembly ecosystem matures further, WebIDL will be the preferred method for describing the interfaces between WebAssembly and Javascript. It is a standard component of the browser contract already and provides a very logical way of expressing these interfaces. Embind to the rescue Embind allows us to write WebAssembly to Javascript bindings from directly within our C++ code. I found its syntax a little opaque at times, but it is by far the most well-supported binding method and worked well out of the box. When using Embind, you provide a block of binding declarations somewhere in your file. It looks like: cpp This declares 3 classes that I can now access through Javascript off of Emscripten’s object, and also exposes a meta-class of VectorHNode that describes an array of HNodes. I can use these (mostly) just like any old Javascript class, e.g.: Module One thing to note is that I have declared the flag on the method. This is to tell Embind that I am going to use pointers of any kind as either an input or a return value from this method. Embind doesn’t have any way of type-checking these pointers so this is basically an opt-in to undefined behavior and Bad Things™ if I screw up either the method call or the handling of the pointer inside the method. allow_raw_pointers Hierarchy.pack How do I debug this stuff? Let’s assume that I goof and pass the wrong thing to my 'd method — what then? Luckily, we have several lines of defense against this sort of issue. The first is through the compiler — it will catch syntax errors and type mismatches, within your code and between your bindings and your code. allow_raw_pointers() The wrapping code that Emscripten provides will also do its best to provide helpful error messages: These combined will get you a good deal of the way, but when you’re instrumenting something more complex you will eventually get an error within WebAssembly that you don’t have a good way to debug. Thankfully, Emscripten provides a killer feature for and files — sourcemaps! c cpp If you run your compilation with the flag, you will get bona-fide browser sourcemaps for your compiled code. While you don’t get the actual context, you do get breakpoints. Combined with robust logging from within your app, you can diagnose and fix pretty much any issue. -g4 I will say though that while in-browser debugging via is pretty great, it doesn’t take the place of a test suite and C++ debugging environment. If I was going to do it over again, the first thing I would do would be to set up . It might have taken a couple of hours to get going properly, but it would have easily paid itself off over the course of the project. NetBeans with a CppTest suite The final compilation command After a lot of trial and error, I ended up with the following command: # for devemcc hierarchy.cpp -g4 --bind -o index.html -s WASM=1 -s ALLOW_MEMORY_GROWTH=1 # for prodemcc hierarchy.cpp -O3 --bind -o index.html -s WASM=1 -s ALLOW_MEMORY_GROWTH=1 sets the debugging verbosity level to maximum. This includes sourcemaps in the output -g4 sets the optimization level. It is incompatible with the flag -O3 -g tells Emscripten to use Embind bindings. --bind tells Emscripten to emit WASM instead of asm.js. It is worth testing both outputs as I have read that for some applications asm.js can actually be faster than WebAssembly. -s WASM=1 was necessary for my application to avoid depleting the pre-allocated WASM memory pool. If you can get away with not setting this, you should as it incurs a performance penalty. -s ALLOW_MEMORY_GROWTH=1 Is it faster? With my classes bound, my bugs squashed, and everything importing into my project correctly, the last step was to actually run some benchmarks. Was the hassle of translating my hot path into WebAssembly worth it? Here is the code I used to run the benchmarks: time in ms to render a given number of nodes The results confirm that, yes, WebAssembly really is fast. It was over 30% faster for my large datasets, and maintained that separation between 220k and 1 million nodes. Interestingly, it was not significantly faster for my smaller dataset, implying that the startup incurred by Emscripten’s wrapper code might outweigh the WebAssembly performance increase for small amounts of data. One million circles After all of that, the goal was to render my million circle data set. Without further ado, here is one million circles laid out and drawn by the browser: Isn’t it beautiful? We did it! A million circles in the browser! A fully interactive accounting of every single byte of memory in a large web app. Join me next time as I unveil the tool I made so you too can explore your heap profile in a cool interactive visualization. Up next — Introducing HeapViz: A new way to debug memory issues

The Sandbox

Assembly

Glue

Screamin’ Speed with WebAssembly

About Author

Comments

TOPICS

THIS ARTICLE WAS FEATURED IN

Related Stories

Choosing a Visualization Method for Memory Profiles

Choosing a Visualization Method for Memory Profiles

Animating Line Charts With D3.js

Cool JavaScript Libraries To Consider Using in 2023

Creating Composite Node of a Graph using D3.js

d3.js and Google Maps API in 11 easy steps

Choosing a Visualization Method for Memory Profiles

Choosing a Visualization Method for Memory Profiles

Animating Line Charts With D3.js

Cool JavaScript Libraries To Consider Using in 2023

Creating Composite Node of a Graph using D3.js

d3.js and Google Maps API in 11 easy steps

Light-Mode

Classic

Newspaper

Minty

Dark-Mode

Neon Noir

Minty

HN StartUps