A big part of our product at Lumen5 is our video rendering engine (we're a video creation tool, after all). A couple of years ago, we started experimenting with WebGL as a potential new way to render videos. It's totally transformed the way we think about JavaScript development, and opened our eyes to a completely different web paradigm.
When we first started, it wasn't clear that WebGL was the way to go for us. It took a lot of research into the alternatives before we broke ground on our current render engine. I want to share why we decided WebGL was the best option for our situation.
Our users come to our website and interact with our video creator application to set up a series of scenes (combinations of text, images, videos, animations, audio, etc.). When they are happy with their set of scenes, they click "render". We needed something that would take that data and produce a final mp4 file.
The basic requirement: Take in some data (in the form of JSON + media assets) and produce an mp4 output.
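To make that input concrete, here's a hypothetical sketch of what scene data like this could look like. The field names are purely illustrative, not our actual schema; the point is that the renderer's job is to walk a JSON description plus media assets and produce a video of the right length.

```javascript
// Hypothetical example of the kind of scene data a renderer consumes.
// Field names here are illustrative only, not Lumen5's real format.
const sceneJson = `{
  "scenes": [
    {
      "durationMs": 3000,
      "layers": [
        { "type": "text",  "value": "Hello, world", "animation": "fade-in" },
        { "type": "image", "src": "https://example.com/photo.jpg" }
      ]
    },
    { "durationMs": 2000, "layers": [ { "type": "video", "src": "clip.mp4" } ] }
  ]
}`;

const project = JSON.parse(sceneJson);

// The renderer's job: walk this data and produce an mp4 of the total length.
const totalMs = project.scenes.reduce((sum, s) => sum + s.durationMs, 0);
console.log(totalMs); // 5000
```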
Other requirements: a real-time preview in the browser as users edit, fine-grained control over the rendered pixels, and the ability to render at scale.
Given these requirements, we set about analyzing several alternatives.
Option 1: Use a professional video tool (After Effects)
The idea: There are plenty of video creation tools that help designers create and render videos. An easy example, which I'll focus my analysis on, is Adobe After Effects. We could piggy-back on After Effects, using it as the backend rendering engine for our users.
The main problem: I experimented with this idea quite a bit, and it turns out there isn't one main problem. There's a multitude of them!
This turned out to be one of the worst alternatives we looked at.
Option 2: Use an open-source video tool (Blender)
The idea: Rather than using a proprietary video tool, use an open-source one. The example I spent the most time looking at was Blender. Blender has a Python API and our backend is already written in Python, so it makes good sense as a solution. And since Blender is open source, we could fork it and build a fully custom version suited to our needs if we ever wanted ultimate control.
The main problem: We didn't end up going with this approach because of our preview requirement: users need to preview their video in real time as they edit it. Since our users work in a web browser, generating a dynamic preview server-side in Blender and streaming it down to each user's browser would be very complicated at scale. Maybe it's possible, but once we hit this issue, we decided to explore browser-based solutions.
Option 3: Render with HTML + CSS + JavaScript
The idea: If our users are already interacting with the tool in their browser, let's use what the browser is best at: HTML + CSS + JavaScript to generate the contents of our video. Here's an example of a simple scene that we could create.
This example shows how we can animate various properties of DOM elements using CSS or JavaScript.
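To illustrate what "animating a property" boils down to, here's a minimal keyframe-interpolation sketch in plain JavaScript. CSS animations do essentially this for us under the hood; the keyframes and timings below are made up for the example.

```javascript
// A minimal sketch of keyframe interpolation, the idea CSS animations
// implement for us. Given keyframes for one property (e.g. opacity),
// compute its value at any normalized time t in [0, 1].
function interpolate(keyframes, t) {
  // keyframes: sorted array of { at, value }, with at in [0, 1]
  if (t <= keyframes[0].at) return keyframes[0].value;
  for (let i = 1; i < keyframes.length; i++) {
    const prev = keyframes[i - 1];
    const next = keyframes[i];
    if (t <= next.at) {
      const local = (t - prev.at) / (next.at - prev.at); // 0..1 in segment
      return prev.value + (next.value - prev.value) * local;
    }
  }
  return keyframes[keyframes.length - 1].value;
}

// Fade in over the first half, hold, then fade out in the last quarter.
const opacity = [
  { at: 0.0, value: 0 },
  { at: 0.5, value: 1 },
  { at: 0.75, value: 1 },
  { at: 1.0, value: 0 },
];

console.log(interpolate(opacity, 0.25)); // 0.5 (halfway through the fade-in)
```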
To create the actual mp4 file, we would need a browser-automation tool like Puppeteer. How this would work: load the composition in a headless browser, step through the timeline one frame at a time, screenshot each frame, and stitch the screenshots together into an mp4 (e.g. with ffmpeg).
Interestingly, there are projects using this very approach. A really cool example is Remotion, whose open source code does exactly this.
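The frame-by-frame capture idea can be sketched like this. The `seekTo` and screenshot calls in the comments are hypothetical stand-ins for whatever hooks a real page would expose; the timing math is the real point.

```javascript
// Sketch of the frame-stepping loop behind browser-automation rendering.
const fps = 30;
const durationMs = 5000;
const frameCount = Math.ceil((durationMs / 1000) * fps);

const frameTimes = [];
for (let frame = 0; frame < frameCount; frame++) {
  const timeMs = (frame / fps) * 1000; // timestamp this frame represents
  frameTimes.push(timeMs);
  // In a real run (hypothetical page API):
  //   await page.evaluate((t) => window.seekTo(t), timeMs);
  //   await page.screenshot({ path: `frames/${frame}.png` });
}
// Afterwards, stitch the frames into an mp4, e.g.:
//   ffmpeg -framerate 30 -i frames/%d.png out.mp4

console.log(frameCount); // 150
```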
The main problem: For us, we realized that we wanted even more control over the specific pixels being rendered than the normal DOM API would give us. For example, the DOM API makes animating an entire tag's worth of text pretty simple (our codepen above does exactly that), but when it comes to controlling how the glyphs within the text are rendered and animated, the DOM API's support starts to drop off.
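As an illustration of why a custom renderer helps here: in the DOM you'd have to wrap every character in its own element, while with your own renderer a per-glyph animation reduces to plain data. A hypothetical staggered fade-in timeline:

```javascript
// Hypothetical per-glyph stagger: each glyph fades in slightly after
// the previous one. The function and its parameters are illustrative,
// not Lumen5's actual API.
function staggerGlyphs(text, { delayPerGlyphMs = 40, durationMs = 300 } = {}) {
  // Array.from splits on code points, so surrogate pairs stay intact.
  return Array.from(text).map((glyph, i) => ({
    glyph,
    startMs: i * delayPerGlyphMs, // each glyph begins a little later
    endMs: i * delayPerGlyphMs + durationMs,
  }));
}

const timeline = staggerGlyphs("Hi!");
console.log(timeline[2].startMs); // 80
```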
Additionally, we were concerned about the scalability and performance of this solution. With potentially thousands of DOM nodes, each with various properties being animated simultaneously in one of our videos, this option began to make less sense.
Option 4: Render with canvas + WebGL
The idea: This is similar to Option 3, but instead of building the video contents out of DOM nodes, we could use a canvas element and the WebGL APIs. This gives us complete control over every pixel, and more flexibility to optimize performance ourselves (without relying on the DOM API).
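WebGL itself needs a browser (or at least a GL context), but the pixel-level control it buys can be sketched with a raw RGBA buffer: we decide every byte of every pixel ourselves, much like a fragment shader does. The gradient below is a toy example of that control.

```javascript
// Filling a raw RGBA buffer by hand: a horizontal red ramp over blue.
// This is a toy stand-in for per-pixel control, not actual WebGL code.
const width = 4;
const height = 2;
const pixels = new Uint8ClampedArray(width * height * 4); // 4 bytes per pixel

for (let y = 0; y < height; y++) {
  for (let x = 0; x < width; x++) {
    const i = (y * width + x) * 4;
    pixels[i + 0] = Math.round((x / (width - 1)) * 255); // R: 0..255 ramp
    pixels[i + 1] = 0;                                   // G
    pixels[i + 2] = 255;                                 // B: constant blue
    pixels[i + 3] = 255;                                 // A: fully opaque
  }
}

// The rough fragment-shader equivalent would be:
//   gl_FragColor = vec4(uv.x, 0.0, 1.0, 1.0);

console.log(pixels[0], pixels[4]); // red channel of first two pixels: 0 85
```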
Of course, it's not all roses; there are still plenty of challenges with WebGL.
Conclusion
For us, WebGL presented a solid solution to our rendering problem. We've now worked with it for multiple years, and learned a lot. It's exciting to be on the cutting edge of web-based graphics, marrying the two disciplines of web engineering and computer graphics development.
If you're interested in this space, add me on LinkedIn, I'd love to chat!
Footnote: There were also legal and licensing considerations that went into this decision, but in this context the technical tradeoffs are more interesting to discuss, so I've left those out ;)