So, here’s the problem: you are a chef, and you have been asked to prepare milkshakes for your best friend’s baby shower. Because of the huge income chefs usually have, you are the proud owner of a Cylindrical Automated Transformer (CAT) that you can use to make milkshakes and whatnot.
Option 1: You could make the milkshakes yourself, and it will take you 2 minutes to do it.
Option 2: You could use the CAT, which takes about 20 minutes, no matter the request.
You choose: Option 1 (good choice)
Impressed by your speed in the kitchen, the Queen of Wales asks you to cook an eight-course dinner, with appetizers, dessert, and milkshakes, at her son’s wedding.
If you choose to go at it all by yourself now, not only will you fail to complete the job, you will be barred from Wales and the Kingdom for the rest of your life. However, if you are a wee bit smart at math, you’d choose the CAT, finish the job in 20 minutes (right!?!), and gain an all-access pass to the fanciest hotels in the United Kingdom.
Machine Learning is that eight-course dinner, at a catered wedding of the Prince of Wales with 200,000 guests. Would you want to cook all that yourself (CPU) or use a service? (Hint: gpu.js is that service.)
In Machine Learning, the GPU can help you cut the time down to 1/100th of the original, and maybe even further. (Keep going! Results will show up.)
Introducing, gpu.js!
If you are baffled and want to dive straight into the pool of brackets, feel free to skip to the next section.
gpu.js is a GPGPU (General-Purpose computing on Graphics Processing Units) library that lets you hand hefty calculations over to the GPU for super-fast operation and output. It currently runs in the browser and on Node.js, using the WebGL APIs in the browser and a single-threaded implementation on Node.js. OpenCL is on the roadmap. (🎉)
GitHub Stars ⭐️
You might ask: “But why? Aren’t Intel’s i7, or even i9, fast enough? They seem to work fine for me. I don’t need this.”
Before you get bogged down by that, check out the results:
Benchmark results on a MacBook Pro Retina (2015), Google Chrome: 22.97 times faster!?!
Right, right! That’s a powerful machine, so here are the results on a system with an integrated graphics card (Intel HD 3000) and no dedicated GPU:
Intel HD 3000, Google Chrome
All in all, the thing that separates gpu.js from the lot is that it doesn’t chain you to using the library in a specific way. It does what the tagline says: it lets you accelerate your hefty JavaScript.
Let’s have some code now: we are going to perform matrix multiplication and benchmark the CPU’s performance against the GPU’s. Size of the matrices: 512 × 512.
You guessed it: the demo is hosted on GitHub: gpu.js-demo.
The source files, namely gpu.min.js and gpu-core.min.js, can be downloaded from our website (gpu.rocks) or GitHub (gpu.js).
Note: I am assuming you have already initialized a prototype HTML/JS/CSS project (index.html, index.js, style.css).
In your index.html, import the files, and you are good to go:
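Something like this should do (a minimal sketch, assuming the downloaded gpu.min.js sits next to index.html; the full gpu.min.js build alone should be enough for this demo):

```html
<!-- index.html -->
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <!-- gpu.min.js is the full build; gpu-core.min.js is a slimmer
         alternative we don't need for this walkthrough. -->
    <script src="gpu.min.js"></script>
  </head>
  <body>
    <script src="index.js"></script>
  </body>
</html>
```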
To multiply two matrices, we need to make sure that the number of columns in the first matrix is equal to the number of rows in the second matrix.
Matrix A: 512 × 512 (m × n)
Matrix B: 512 × 512 (n × r)
Result: 512 × 512 (m × r)
Here’s a generic Matrix Multiplication algorithm that runs on the CPU:
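A plain triple-loop version looks like this (cpuMatMult is my name for it, not necessarily the demo’s):

```js
// Multiply two matrices on the CPU with the classic triple loop.
// a is m x n, b is n x r, so the result is m x r.
function cpuMatMult(a, b) {
  const result = [];
  for (let i = 0; i < a.length; i++) {      // m rows
    result.push([]);
    for (let j = 0; j < b[0].length; j++) { // r columns
      let sum = 0;
      for (let k = 0; k < b.length; k++) {  // n shared terms
        sum += a[i][k] * b[k][j];
      }
      result[i].push(sum);
    }
  }
  return result;
}
```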
The next step is where the magic begins. (WebGL magic 😆)
The lib files export a global function named GPU that you can use to create a new gpu instance. A gpu instance is created like this:
const gpu = new GPU({mode: 'webgl'});
A few options can be passed to the constructor, the complete list of which can be found on GitHub and in the automatically generated JSDocs.
The mode option specifies where the function will run. There are three options: cpu, gpu, and webgl. gpu and webgl are aliases, for the time being. We are aiming to incorporate OpenCL in v2, and then gpu will solely mean using the GPU via the OpenCL API on the server.
Currently, both webgl and gpu use the WebGL APIs to defer work to the GPU.
The gpu variable we just initialized has several methods attached to it, each with its own use-case.
We’ll use the createKernel method which, essentially, creates a “kernel” (a more familiar term for it would be, in fact, a function) that you can call from JS. Behind the scenes, your code is compiled to GLSL shaders using an AST and a jison-based parser. That ensures that the code written inside the kernel is executed on the GPU.
You pass a JS function as an argument to createKernel, and get access to the thread dimensions. (As a mnemonic, you can think of the thread dimensions as the lengths of the for-loops we used in CPU mode.)
.setDimensions sets the dimensions of the loop. (See the API page for the complete reference.)
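Putting those pieces together for our 512 × 512 case, the kernel could look like this (a sketch; matMultKernel and the argument names are mine, and the loop bound is hard-coded to the matrix size):

```js
// One GPU thread computes one element of the 512 x 512 result:
// this.thread.y picks the row, this.thread.x picks the column.
const matMultKernel = gpu.createKernel(function(A, B) {
  var sum = 0;
  for (var k = 0; k < 512; k++) {
    sum += A[this.thread.y][k] * B[k][this.thread.x];
  }
  return sum;
}).setDimensions([512, 512]); // 512 x 512 output => 512 x 512 threads
```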
This is an inherent problem in the way most GPU software works: the transfer penalty. The GPU is like its own computer; a black box that we send commands to from the CPU. We can transfer things to it and read data back from it, but all of that comes with a penalty. The transfer penalty especially becomes a bottleneck if your use case involves performing several mathematical operations on the GPU, since the net fine keeps increasing with every operation.
You can, however, leave values on the GPU. They exist on the GPU as textures. (You can think of textures as data containers, but for the GPU.) By setting the outputToTexture flag to true, you can make sure that you incur no transfer penalty, thereby eliciting an important speed gain.
.setOutputToTexture is where the REAL STUFF happens!
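For example, the kernel from above keeps its result on the GPU with one extra call in the chain (same sketch as before):

```js
// Same kernel body as matMultKernel, but the result now stays on the
// GPU as a texture instead of being read back into a JS array right away.
const matMultTexture = gpu.createKernel(function(A, B) {
  var sum = 0;
  for (var k = 0; k < 512; k++) {
    sum += A[this.thread.y][k] * B[k][this.thread.x];
  }
  return sum;
})
  .setDimensions([512, 512])
  .setOutputToTexture(true);

// const texture = matMultTexture(A, B); // a texture, ready to feed into another kernel
```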
And, as important as they are, A and B are matrices which we’ll create in the next step.
The matrix-generation code comes from the demo you saw on our website. If you don’t get it, no problem. Here’s what it does: it pushes 512 * 512 elements into a 1D JavaScript array and then divides them into 512 parts, which means, in the end, we have a 2D array of size 512 × 512 (every array element has 512 child elements).
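A sketch that matches that description (generateMatrix is my name, not the demo’s):

```js
// Fill a flat (1D) array with 512 * 512 random values, then split it
// into 512 rows of 512 values each, yielding a 2D 512 x 512 matrix.
function generateMatrix() {
  const flat = [];
  for (let i = 0; i < 512 * 512; i++) {
    flat.push(Math.random());
  }
  const matrix = [];
  for (let row = 0; row < 512; row++) {
    matrix.push(flat.slice(row * 512, (row + 1) * 512));
  }
  return matrix;
}
```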
All done! Let’s hit the gas.
For simplicity, we are going to use the Web Performance API to benchmark this, but you can use benchmark.js as well. (Our website uses that.)
First we generate the matrices by calling the above function, and then we run the matMult functions for both the CPU and the GPU, as in the sketch below.
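Wired together with the sketches above, the benchmark boils down to this:

```js
const A = generateMatrix();
const B = generateMatrix();

// Time the CPU version.
let start = performance.now();
cpuMatMult(A, B);
const cpuTime = performance.now() - start;

// Time the GPU kernel.
start = performance.now();
matMultKernel(A, B);
const gpuTime = performance.now() - start;

console.log('CPU: ' + cpuTime.toFixed(2) + ' ms');
console.log('GPU: ' + gpuTime.toFixed(2) + ' ms');
console.log('Speedup: ' + (cpuTime / gpuTime).toFixed(2) + 'x');
```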
BAM! Open index.html in the browser and see for yourself. Here is what I got: (Support for Safari is coming soon!)
And that’s just 512 × 512. If you change that number to 1024, you’ll notice that the GPU is a powerful beast and can run your code much, much faster than the CPU.
We would like to let the community know that JavaScript has been gifted a rocket. 🚀 What will you do with it?
All the code is on GitHub and the team will be thrilled to have more users and contributors. gpu.js — Go have some epiphanies. 🎉