In this post, I'm going to show you how to potentially triple your Node application's performance by managing multiple threads. The methods and examples shown here will give you what you need to set up production-ready thread management.
Node has long had the ability to be multi-threaded, using either Child Processes, Clustering, or the more recent and preferred method: a module called Worker Threads.
Child Processes were the initial means of creating multiple threads for your application and have been available since version 0.10. This was achieved by spawning a Node process for every additional thread you wanted.
Clustering, which has been a stable release since around version 4, allows us to simplify the creation and management of Child Processes. It works brilliantly when combined with PM2.
Now before we get into multithreading our app, there are a few points that you need to fully understand:
1. Multithreading already exists for I/O tasks
There is a layer of Node that's already multithreaded, and that is the libuv thread pool. I/O tasks such as file and folder management, TCP/UDP transactions, compression, and encryption are handed off to libuv, and if not asynchronous by nature, get handled in libuv's thread pool.
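As a quick illustration (a standalone sketch, not part of this tutorial's source code), both of the calls below are handed off to libuv's thread pool automatically, with no extra code on our part. The pool defaults to 4 threads and can be resized via the UV_THREADPOOL_SIZE environment variable:

'use strict'
const Crypto = require('crypto')
const FS = require('fs')

// Both calls below run on libuv worker threads, not the main JavaScript thread
Crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
  if (err) throw err
  console.log('pbkdf2 done:', key.toString('hex').slice(0, 16))
})

FS.readFile(__filename, (err, data) => {
  if (err) throw err
  console.log('readFile done:', data.length, 'bytes')
})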
2. Child Processes/Worker Threads only work for synchronous JavaScript logic
Implementing multithreading using Child Processes or Worker Threads will only be effective for synchronous JavaScript code that's performing heavy-duty operations, such as looping, calculations, etc. If you try to offload I/O tasks to Worker Threads, for example, you will not see a performance improvement.
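To make the distinction concrete, here's a hypothetical sketch of the kind of synchronous, CPU-bound function that blocks the event loop and would therefore benefit from being moved to a worker, next to an I/O call that would not:

'use strict'
const FS = require('fs')

// Hypothetical CPU-bound work: a tight synchronous loop that blocks the event
// loop until it finishes - a good candidate for a Worker Thread
const sumOfSquares = (limit) => {
  let total = 0
  for (let i = 0; i < limit; i++) {
    total += i * i
  }
  return total
}

console.log(sumOfSquares(1e8)) // blocks the main thread while it runs

// I/O like this already runs off the main thread via libuv's thread pool,
// so offloading it to a Worker Thread adds overhead without benefit
FS.readFile(__filename, () => console.log('readFile done'))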
3. Creating one thread is easy. Managing multiple threads dynamically is hard
Creating one additional thread in your app is easy enough, as there are tons of tutorials on how to do so. However, creating threads equivalent to the number of logical cores your machine or VM is running, and managing the distribution of work to those threads, is way more advanced, and coding that logic is above most of our pay grades 😎.
Thank goodness we live in a world of open source and brilliant contributions from the Node community, meaning there is already a module that gives us the full capability of dynamically creating and managing threads based on the CPU availability of our machine or VM.
The module we will work with today is called Worker Pool. Created by Jos de Jong, Worker Pool offers an easy way to create a pool of workers for both dynamically offloading computations as well as managing a pool of dedicated workers. It's basically a thread-pool manager for Node JS, supporting Worker Threads, Child Processes, and Web Workers for browser-based implementations.
To make use of the Worker Pool module in our application, the following tasks will need to be performed:
Install Worker Pool
First, we need to install the Worker Pool module: npm install workerpool
Init Worker Pool
Next, we'll need to initialize the Worker Pool on the launch of our App
Create Middleware Layer
We'll then need to create a middleware layer between our heavy-duty JavaScript logic and the Worker Pool that will manage it
Update Existing Logic
Finally, we need to update our App to hand off heavy-duty tasks to the Worker Pool when required
At this point, you have 2 options: Use your own NodeJS app (and install workerpool and bcryptjs modules), or download the source code from GitHub for this tutorial and my NodeJS Performance Optimization video series.
If you go for the latter, the files for this tutorial live inside the folder 06-multithreading. Once downloaded, enter the root project folder and run npm install, then change into the 06-multithreading folder to follow along.
In the worker-pool folder, we have 2 files: one holds the controller logic for the Worker Pool (controller.js); the other holds the functions that will be triggered by the threads, aka the middleware layer I mentioned earlier (thread-functions.js).
worker-pool/controller.js
'use strict'

const WorkerPool = require('workerpool')
const Path = require('path')

let poolProxy = null

// FUNCTIONS
const init = async (options) => {
  const pool = WorkerPool.pool(Path.join(__dirname, './thread-functions.js'), options)
  poolProxy = await pool.proxy()
  console.log(`Worker Threads Enabled - Min Workers: ${pool.minWorkers} - Max Workers: ${pool.maxWorkers} - Worker Type: ${pool.workerType}`)
}

const get = () => {
  return poolProxy
}

// EXPORTS
exports.init = init
exports.get = get
The controller.js is where we require the workerpool module. We also export 2 functions, called init and get. The init function is executed once when our application loads. It instantiates the Worker Pool with the options we provide and a reference to thread-functions.js. It also creates a proxy that will be held in memory for as long as our application is running. The get function simply returns the in-memory proxy.
worker-pool/thread-functions.js
'use strict'

const WorkerPool = require('workerpool')
const Utilities = require('../2-utilities')

// MIDDLEWARE FUNCTIONS
const bcryptHash = (password) => {
  return Utilities.bcryptHash(password)
}

// CREATE WORKERS
WorkerPool.worker({
  bcryptHash
})
In the thread-functions.js file, we create worker functions that will be managed by the Worker Pool. For our example, we're going to use BcryptJS to hash passwords. This usually takes around 10 milliseconds to run, depending on the speed of one's machine, and makes for a good use case when it comes to heavy-duty tasks. Inside the 2-utilities.js file is the function and logic that hashes the password. All we are doing in thread-functions.js is wrapping that bcryptHash and registering it with the Worker Pool via WorkerPool.worker. This keeps code centralized and avoids duplication or confusion about where certain operations live.
2-utilities.js
'use strict'

const BCrypt = require('bcryptjs')

const bcryptHash = async (password) => {
  return await BCrypt.hash(password, 8)
}

exports.bcryptHash = bcryptHash
.env
NODE_ENV="production"
PORT=6000
WORKER_POOL_ENABLED="1"
The .env file holds the port number and sets the NODE_ENV variable to "production". It's also where we enable or disable the Worker Pool, by setting WORKER_POOL_ENABLED to "1" or "0".
1-app.js
'use strict'

require('dotenv').config()

const Express = require('express')
const App = Express()
const HTTP = require('http')
const Utilities = require('./2-utilities')
const WorkerCon = require('./worker-pool/controller')

// Router Setup
App.get('/bcrypt', async (req, res) => {
  const password = 'This is a long password'
  let result = null
  let workerPool = null

  if (process.env.WORKER_POOL_ENABLED === '1') {
    workerPool = WorkerCon.get()
    result = await workerPool.bcryptHash(password)
  } else {
    result = await Utilities.bcryptHash(password)
  }

  res.send(result)
})

// Server Setup
const port = process.env.PORT
const server = HTTP.createServer(App)

;(async () => {
  // Init Worker Pool
  if (process.env.WORKER_POOL_ENABLED === '1') {
    const options = { minWorkers: 'max' }
    await WorkerCon.init(options)
  }

  // Start Server
  server.listen(port, () => {
    console.log('NodeJS Performance Optimizations listening on: ', port)
  })
})()
Finally, our 1-app.js holds the code that will be executed on the launch of our App. First, we load the variables from the .env file via dotenv. We then set up an Express server and create a route called /bcrypt. When this route is triggered, we check whether the Worker Pool is enabled. If yes, we get a handle on the Worker Pool proxy and execute the bcryptHash function that we declared in the thread-functions.js file. This in turn executes the bcryptHash function in Utilities and returns the result. If the Worker Pool is disabled, we simply execute the bcryptHash function directly in Utilities.
At the bottom of our 1-app.js, you'll see we have an immediately invoked async function. We're doing this to support async/await, which we use when interacting with the Worker Pool. Here is where we initialize the Worker Pool if it's enabled. The only config we override is setting minWorkers to "max". This ensures that the Worker Pool spawns as many threads as there are logical cores on our machine, minus 1 logical core, which is reserved for our main thread. In my case, I have 6 physical cores with hyperthreading, meaning I have 12 logical cores. So with minWorkers set to "max", the Worker Pool will create and manage 11 threads. Finally, the last piece of code is where we start our server and listen on port 6000.
Testing the Worker Pool is as simple as starting the application and, while it's running, performing a GET request to http://localhost:6000/bcrypt. If you have a load-testing tool like AutoCannon, you can have some fun seeing the difference in performance when the Worker Pool is enabled/disabled. AutoCannon is very easy to use; a quick sketch of driving it from Node follows below.
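For reference, here's a minimal sketch of driving the comparison using AutoCannon's programmatic API (assuming AutoCannon is installed locally via npm install autocannon; the connections and duration values are arbitrary). Run it once with WORKER_POOL_ENABLED set to "1" and once with "0" to compare throughput:

'use strict'
const Autocannon = require('autocannon')

// Fire 100 concurrent connections at the bcrypt route for 10 seconds,
// then print a couple of summary numbers from the results object
Autocannon({
  url: 'http://localhost:6000/bcrypt',
  connections: 100,
  duration: 10
}, (err, results) => {
  if (err) throw err
  console.log('Requests/sec (avg):', results.requests.average)
  console.log('Latency ms (avg):', results.latency.average)
})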
I hope this tutorial has provided insight into managing multiple threads in your Node application. The embedded video at the top of this article provides a live demo of testing the Node App.
Till next time, cheers :)
Previously published at http://bleedingcode.com/managing-multiple-threads-nodejs/