In my quest to generate a re-usable WebAssembly template, I discovered many ways that appear easy to implement, but don’t really work in applications beyond a simple ‘hello world’. If you’re interested in building slightly more complex WebAssembly modules, this article is for you.
Ultimately, I’m looking to compile a nice C/C++ library to the JavaScript domain. I’m not looking to build specific one-off functions; I want support for the entire library (or at least the majority). Additionally, I want this to run in the browser as well as NodeJS. I don’t want to deal with instantiating the WASM or managing its memory across these environments either. These requirements mean I can rule out several alternatives…
Spoiler: Use Emscripten with Bazel. Shortcut to the github repo.
There are many starting points and many tutorials on how to get started using C/C++ with JavaScript. Here are a few tools you would find in the wild:
Some of these tools are not exactly what I want…
WAT — requires a lot of low-level work and only makes sense for simple one-off functions. 👎👎👎
N-API — is not much better. I’m still writing a lot of bindings, need to worry about node-gyp, and it may not work in the browser. 👎👎
LLVM — allows us to directly compile C/C++ to WASM! Unfortunately, this is still a low-level job that requires me to perform a lot of extra steps just to get it working. 👎
Wasmer — actually looks great! They’re a relatively new player and support lots of integrations. Even their builds are relatively lean! Unfortunately, they still require a lot of glue to the native C/C++ code which is not really pleasant for larger projects. However, they are working on better integration support and are moving rather fast. 👍
Cheerp — another great tool. They’re similar to emscripten, but have a different memory model that allows for automated garbage collection. The performance is quite similar, often beating emscripten in special cases. However, the community support is not quite as large and I found myself getting stuck. I’ll keep these guys on the radar. 👍👍
Emscripten: just right. Integration with C++ is made extremely easy by using embind. I can pass non-primitive types between both domains (C++ only). They have a larger community presence. They can output a format that is relatively straightforward to use in the browser or NodeJS. 👍👍👍
I’ll showcase a simple “hello world” C++ application that we will convert to WebAssembly.
This is the crux of it all. Every toolchain has some initial difficulties setting up and I’m often left scratching my head on where to even start. No one wants to manually invoke gcc so we built scripts such as configure, make, or cmake to automate the build process. Great!…except, not 🙁
Sometimes I’ve needed to hack the existing make/cmake rules to avoid dependencies on shared libraries, ignore some intrinsics checks, etc. This obviously doesn’t play nice with a centralized C++ code base that attempts to build bindings for many languages. So what are our options?
Bazel: a fast, scalable, multi-language, and extensible build system.
While this build system can be quite daunting, it is actually very powerful. Unfortunately, there’s just not that much documentation to learn to use it with emscripten. In fact — their docs are broken, more broken, and maybe not even supported.
I argue that it can be done decently well — even the reputable TensorFlow.js team has managed to get it working! So what was so difficult? What makes it so special?
After converting several libraries to WebAssembly, I can tell you that the isolation Bazel offers is quite nice — no horrible breaking changes when a cmake script has been modified. No more complex logic determining the target to build, etc. Once defined it will almost always just work.
Install Bazel. You will also need yarn to install the dev dependencies.
Fast forward a bit, here is the github repo so you can follow along.
Note: I’ve taken a lot of inspiration from the TensorFlow.js project on how they managed to get it working. My changes revolve around compiler/linker flags, showing how to output both JS and WASM, and most important — using the latest emscripten release 🎉!
git clone --recurse-submodules https://github.com/s0l0ist/bazel-emscripten.git
cd bazel-emscripten
yarn install
I’ve taken the liberty of including the emsdk as a git submodule so you don’t have to manage it yourself. The first step is to get the emsdk cloned. If you’ve cloned my repo recursively, you can skip this step:
yarn submodule:update
Next, we need to update the release tags and then install the latest version of emscripten:
yarn em:update
yarn em:init
Done 🎉!
Some important files and directories:
.bazelrc: describes default commands for building a target
WORKSPACE: defines our external dependencies

Some files inside hello-world/:
BUILD: empty file so bazel doesn’t complain
deps.bzl: bazel toolchain dependencies (emsdk)

A few directories in hello-world/:
cpp/: holds the simple C++ sources
javascript/: holds all JS related material
javascript/bindings/: holds all emscripten bindings
javascript/src/: holds all JS wrappers
javascript/scripts: the handy build scripts to shorten our cli statements
javascript/toolchain: the heart of the Bazel + Emscripten configuration

The rest is self-explanatory.
I’ve outlined a very simple library containing Greet and LocalTime classes that have static methods for this example:
LocalTime class:
//////// cpp/localtime.hpp ////////
#ifndef LIB_LOCAL_TIME_H_
#define LIB_LOCAL_TIME_H_

namespace HelloWorld {
class LocalTime {
public:
  /*
   * Prints the current time to stdout
   */
  static void Now();
};
} // namespace HelloWorld

#endif

//////// cpp/localtime.cpp ////////
#include <ctime>
#include <stdio.h>
#include "localtime.hpp"

namespace HelloWorld {
void LocalTime::Now() {
  std::time_t result = std::time(nullptr);
  printf("%s", std::asctime(std::localtime(&result)));
}
} // namespace HelloWorld
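LocalTime::Now writes straight to stdout. If you would rather hand the timestamp back to the caller, for example so the JavaScript side decides where to log it, a variant could return a std::string instead. The following is only a sketch under that assumption: the name NowString and the format string are hypothetical and are not part of the repo.

```cpp
#include <ctime>
#include <string>

namespace HelloWorld {

// Hypothetical helper (not in the original library): formats the current
// local time as "YYYY-MM-DD HH:MM:SS" and returns it instead of printing.
inline std::string NowString() {
  std::time_t now = std::time(nullptr);
  const std::tm *local = std::localtime(&now);
  if (local == nullptr) {
    return std::string();  // time conversion failed
  }
  char buffer[32];
  // strftime returns the number of characters written (0 on failure).
  std::size_t written =
      std::strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", local);
  return std::string(buffer, written);
}

}  // namespace HelloWorld
```

A function shaped like this could be bound the same way as Greet::SayHello, since it also returns a string.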
Greet class:
//////// cpp/greet.hpp ////////
#ifndef LIB_GREET_H_
#define LIB_GREET_H_

#include <string>

namespace HelloWorld {
class Greet {
public:
  /*
   * Greets the name
   */
  static std::string SayHello(const std::string &name);
};
} // namespace HelloWorld

#endif

//////// cpp/greet.cpp ////////
#include <string>
#include "greet.hpp"

namespace HelloWorld {
std::string Greet::SayHello(const std::string &name) {
  return "Hello, " + name + "!";
}
} // namespace HelloWorld
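Before involving emscripten at all, it can be worth confirming that the library behaves as expected natively. Here is a minimal sketch of such a check. The Greet class is repeated inline so the snippet compiles on its own, and GreetSmokeTest is a hypothetical name that does not exist in the repo.

```cpp
#include <string>

namespace HelloWorld {
// Same shape as the Greet class above, repeated so this sketch stands alone.
class Greet {
 public:
  static std::string SayHello(const std::string &name) {
    return "Hello, " + name + "!";
  }
};
}  // namespace HelloWorld

// Hypothetical check: returns true when the greeting has the expected form.
bool GreetSmokeTest() {
  return HelloWorld::Greet::SayHello("World") == "Hello, World!" &&
         HelloWorld::Greet::SayHello("") == "Hello, !";
}
```

Calling GreetSmokeTest() from a small native test target gives a quick signal that the C++ core is sound before any bindings are layered on top.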
The bindings are quite short for our example. We make use of the powerful embind which lets us talk to C++ classes.
You may notice that LocalTime::Now outputs directly to stdout. Emscripten is intelligent enough to redirect our output to console.log so we don’t need to do anything else 😎. Greet::SayHello returns a primitive string that we will manually need to send to console.log.
//////// javascript/bindings/hello-world.cpp ////////
#include <emscripten/bind.h>
#include "hello-world/cpp/greet.hpp"
#include "hello-world/cpp/localtime.hpp"

using namespace emscripten;
using namespace HelloWorld;

EMSCRIPTEN_BINDINGS(Hello_World) {
  class_<Greet>("Greet")
    .constructor<>()
    .class_function("SayHello", &Greet::SayHello);
  class_<LocalTime>("LocalTime")
    .constructor<>()
    .class_function("Now", &LocalTime::Now);
}
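The bindings above only expose static methods through class_function. For classes that carry instance state, embind also offers constructor<Args...>() and function(). The sketch below assumes a hypothetical Greeter class that is not part of the repo. The binding section is guarded by __EMSCRIPTEN__ so the same file still compiles with a native compiler.

```cpp
#include <string>
#include <utility>

#ifdef __EMSCRIPTEN__
#include <emscripten/bind.h>
#endif

// Hypothetical class (not in the repo): keeps a name as instance state.
class Greeter {
 public:
  explicit Greeter(std::string name) : name_(std::move(name)) {}

  // Instance method, as opposed to the static methods bound above.
  std::string Greet() const { return "Hello, " + name_ + "!"; }

 private:
  std::string name_;
};

#ifdef __EMSCRIPTEN__
// Only compiled when building with emscripten.
EMSCRIPTEN_BINDINGS(Greeter_Sketch) {
  emscripten::class_<Greeter>("Greeter")
      .constructor<std::string>()          // exposes `new Greeter(name)`
      .function("Greet", &Greeter::Greet); // exposes `instance.Greet()`
}
#endif
```

On the JavaScript side, an instance created this way lives in the WASM heap, so it should be released with its delete() method once it is no longer needed.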
Now that we’ve defined our bindings, we’re ready to build!
You may build the native libraries, but they’re quite useless by themselves…
bazel build -c opt //hello-world/cpp/...
I’ve configured the .bazelrc file to build with two different options: JS or WASM.
JS: Specifies flags to emscripten to output a single asm.js file that does not contain any WebAssembly. This is useful for environments that can’t work with WebAssembly such as React-Native, but is significantly larger and slower.
WASM: Specifies flags to emscripten to output a single JavaScript file containing the WebAssembly as a base64 encoded string. This means we don’t need to manage a separate .wasm file in our bundles or figure out how to properly serve this file in the browser. The drawback is a larger file size due to the base64 encoding.
To make it simple, I’ve created some helper scripts so all you need to do is run the following:
yarn build:js
// or
yarn build:wasm
// or both
yarn build
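To put a number on the base64 drawback of the WASM option: padded base64 turns every 3 input bytes into 4 output characters, so the embedded binary grows by roughly one third before any compression. The helper below is an illustrative sketch and not part of the repo. The name Base64EncodedSize is hypothetical.

```cpp
#include <cstddef>

// Size in bytes of the padded base64 encoding of `raw_bytes` bytes:
// every group of 3 input bytes (rounded up) becomes 4 output characters.
constexpr std::size_t Base64EncodedSize(std::size_t raw_bytes) {
  return 4 * ((raw_bytes + 2) / 3);
}

// Checked at compile time: 3 raw bytes encode to exactly 4 characters.
static_assert(Base64EncodedSize(3) == 4, "3 bytes encode to 4 characters");
```

For a made-up 300,000-byte .wasm file, Base64EncodedSize(300000) evaluates to 400,000 characters embedded in the JavaScript bundle.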
There are some good and bad things about using emscripten here:
Good: It generates glue code for you automatically.
Bad: It generates glue code for you automatically.
Obviously, the glue code adds some bloat but keeps me from having to deal with the intricacies of initialization 👌.
Note: In .bazelrc there are a few defined compiler flags that are present for both the JS/WASM builds geared towards production use. You may feel free to modify the flags as necessary, but I wanted to show what’s possible here.
If you do want to have full control over instantiating the WASM to reduce the bundle size, you may generate a pure WASM build by adding the link flag -s STANDALONE_WASM=1 inside the starlark file, hello-world/javascript/BUILD.
You may have seen the javascript/src/implementation files which wrap the emscripten output. Do we really need these files? No, you don’t. However, I like my APIs to be abstracted from the output of emscripten. This allows for more flexibility when there are potentially breaking changes to the C++ core.
An important thing to note is that the outputs are quite a bit larger than you would expect. A common reason is code that includes <iostream>, which pulls in a lot of code for static constructors that initialize the iostream system even if it is not used, but our builds don’t have this problem. Then there is the glue code auto-generated to manage initialization and provide helpers for memory allocation, resizing, and the like.
yarn rollup
This gathers the files in hello-world/javascript/bin/*, hello-world/javascript/src/* and produces a few output bundles in hello-world/javascript/dist/.
You will notice two minified bundles for js and for wasm that each have two different targets for ES6 module support or UMD (for browser and NodeJS) in hello-world/javascript/dist/<js|wasm>/<es|umd>/*.
Details of the rollup configuration are in rollup.config.js.
So we’ve compiled our C++ to JS and WASM — what’s next?
Run the JS bundle in NodeJS:
yarn demo:js
Or run the WASM bundle in NodeJS:
yarn demo:wasm
Or open javascript/html/index_wasm.html to run the WASM bundle in the browser.
By spending a little time with bazel, you can create a nice build system that works for many languages without breaking your other targets.
We can now drive a core C++ application with bindings in several different languages all while simplifying the interoperability between them.
Stay tuned for part 2 where I show a real C++ library converted to JS and WASM!
Hope you enjoyed and thanks for reading!