How to Compile Node.js Code Using Bytenode?

By Osama Abbas (@pw.osama)

In this post, I will show you how to “truly” compile Node.js (JavaScript) code to V8 bytecode. This lets you hide or protect your source code better than obfuscation or other weak tricks (such as encrypting your code with a secret key that must itself be embedded in your app binaries, which is why I said “truly” above).

Using the bytenode tool, you can distribute a binary version (.jsc) of your JavaScript files. You can also bundle all your .js files with Browserify and then compile that single bundle into a .jsc file.

Check the bytenode repository on GitHub.

Long story short...

  1. Install bytenode globally:
    [sudo] npm install -g bytenode
  2. To compile your .js file, run this command:
    bytenode --compile my-file.js my-file.jsc
  3. Install bytenode in your project too:
    npm install --save bytenode
  4. In your code, require bytenode,
    const bytenode = require('bytenode');

    to register the .jsc extension in the Node.js module system. That’s why we installed it locally too.
  5. You can now require my-file.jsc as a module:
    const myFile = require('./my-file.jsc');

    You can also remove my-file.js from the production build.
  6. And if you want to run my-file.jsc using the bytenode CLI:
    bytenode --run my-file.jsc

Now you know how to compile .js files, how to require the compiled version in your code, and how to run .jsc files from the terminal. Let’s move on to the long story.

The V8 engine (on which Node.js is based) uses just-in-time (JIT) compilation: JavaScript code is compiled just before execution and is then optimised subsequently.

Starting from Node.js v5.7.0, the vm module introduced an option called produceCachedData in the vm.Script constructor, so if you do something like this:

const vm = require('vm');

let helloScript = new vm.Script('console.log("Hello World!");', {
  produceCachedData: true /* This is required for Node.js < 10.0.0 */
});

Then, get the bytecode or cachedData buffer:

// Node.js < 10: the buffer is exposed as a property
let helloBuffer = helloScript.cachedData;

// or in Node.js >= 10: call createCachedData() instead
helloBuffer = helloScript.createCachedData();
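
As a minimal sketch of choosing between the two variants at runtime, the helper below prefers createCachedData() when the running Node.js provides it and falls back to the cachedData property otherwise. The names getCachedData and helloCache are hypothetical; they are not part of Node.js or bytenode.

```javascript
const vm = require('vm');

// Hypothetical helper: returns the V8 code cache for a source string.
function getCachedData(source) {
  const script = new vm.Script(source, {
    produceCachedData: true // only needed by older Node.js versions
  });
  // Newer Node.js versions expose createCachedData();
  // older ones only set the cachedData property.
  return typeof script.createCachedData === 'function'
    ? script.createCachedData()
    : script.cachedData;
}

const helloCache = getCachedData('console.log("Hello World!");');
console.log(Buffer.isBuffer(helloCache), helloCache.length);
```

The returned value is an ordinary Node.js Buffer, so it can be written to disk and read back like any other binary file.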

This helloBuffer can be used to create an identical script that will execute the same instructions when it runs, by passing it to the vm.Script constructor:

let anotherHelloScript = new vm.Script('', {
  produceCachedData: true,
  cachedData: helloBuffer
});

// This will fail!

But this will fail: the V8 engine will complain about the first argument (that empty string '') when it checks whether it is the same code as the one that was used to generate the helloBuffer buffer in the first place. However, this check is quite simple: only the length of the code matters. So, this will work:

let anotherHelloScript = new vm.Script(' '.repeat(28), {
  produceCachedData: true,
  cachedData: helloBuffer
});

We give it a blank string (28 spaces) with the same length as the original code (console.log("Hello World!");). That’s it!

This is interesting: using the cached buffer and the original code length, we were able to create an identical script. Both scripts can be run using the .runInThisContext() function. So if you run them:

helloScript.runInThisContext();

anotherHelloScript.runInThisContext();

you will see ‘Hello World!’ twice.

(Note that if you use the wrong length, or a different version of Node.js/V8, anotherHelloScript won’t run, and its cachedDataRejected property will be set to true.)
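
To see the length check in action, here is a small sketch that builds two blank scripts from the same cache, one with the right length and one that is a character too long, and prints cachedDataRejected for each. It assumes Node.js >= 10 (for createCachedData()); the variable names are illustrative only.

```javascript
const vm = require('vm');

const source = 'console.log("Hello World!");';
const cache = new vm.Script(source).createCachedData(); // Node.js >= 10

// Same length as the original source: the cache should be accepted.
const sameLength = new vm.Script(' '.repeat(source.length), {
  cachedData: cache
});

// One character longer: V8 should reject the cache.
const wrongLength = new vm.Script(' '.repeat(source.length + 1), {
  cachedData: cache
});

console.log('same length rejected? ', sameLength.cachedDataRejected);
console.log('wrong length rejected?', wrongLength.cachedDataRejected);
```

Checking cachedDataRejected right after constructing the script is a cheap way to detect a mismatched buffer before trying to run it.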

Now to our last step. When we defined anotherHelloScript, we used a hard-coded value (28) as our code length. How can we change this so that at runtime we don’t have to know exactly how long the original source code was?

After some digging in the V8 source code, I found that the header information is defined in the file code-serializer.h:

  // The data header consists of uint32_t-sized entries:
  // [0] magic number and (internally provided) external reference count
  // [1] version hash
  // [2] source hash
  // [3] cpu features
  // [4] flag hash

But a Node.js buffer is a Uint8Array typed array. This means that each entry from the uint32 array takes four entries in the uint8 buffer. So, the length we need (stored as the source hash at index [2], which occupies bytes [8, 9, 10, 11] in the Node buffer) will be:

let payloadLengthBytes = whateverBufferYouHave.slice(8, 12);
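
The mapping from uint32 entries to byte offsets can be sketched by dumping the first few header entries of a real cache buffer. This assumes Node.js >= 10 and a little-endian platform; only the first three entries are read here, since the layout of the later ones may differ between V8 versions. The names source, cache and header are illustrative.

```javascript
const vm = require('vm');

const source = 'console.log("Hello World!");';
const cache = new vm.Script(source).createCachedData(); // Node.js >= 10

// Read the first three uint32 header entries; entry i starts at byte i * 4.
const header = [];
for (let i = 0; i < 3; i++) {
  header.push(cache.readUInt32LE(i * 4));
}

console.log(header.map((n) => '0x' + n.toString(16).padStart(8, '0')));
console.log('entry [2]:', header[2], 'source length:', source.length);
```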

It will be something like this: <Buffer 1c 00 00 00>, which is little-endian, so it reads as 0x0000001c. That is our code length (28 in decimal).

To convert these four bytes to a numeric value, you may do something like this:

firstByte + (secondByte * 256) + (thirdByte * 256**2) + (fourthByte * 256**3)

Or in a more elegant way, you can do this:

let length = payloadLengthBytes.reduce((sum, number, power) => sum + number * 256 ** power, 0);

This is what I did in my library; check it to see the full recipe.

Alternatively, we could use the buf.readIntLE() function, which does exactly what we want:

let length = whateverBufferYouHave.readIntLE(8, 4);
// 8 is the offset, 4 is the number of bytes to read
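
As a quick sanity check, the three conversions can be compared on a hand-built stand-in for the four bytes shown earlier. The sketch uses readUInt32LE(), the unsigned counterpart of readIntLE(); both give the same result for values below 2**31. The variable names are illustrative.

```javascript
// Hand-built stand-in for the four length bytes <Buffer 1c 00 00 00>.
const lengthBytes = Buffer.from([0x1c, 0x00, 0x00, 0x00]);

// 1. Spelled-out arithmetic, least significant byte first.
const manual = lengthBytes[0] + lengthBytes[1] * 256 +
  lengthBytes[2] * 256 ** 2 + lengthBytes[3] * 256 ** 3;

// 2. The same sum expressed with reduce(); the index doubles as the power.
const reduced = lengthBytes.reduce((sum, byte, power) => sum + byte * 256 ** power, 0);

// 3. Built-in little-endian read of an unsigned 32-bit integer.
const builtIn = lengthBytes.readUInt32LE(0);

console.log(manual, reduced, builtIn); // 28 28 28
```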

Once you have read the length of the original code (that was used to generate the cachedData buffer), you can create your script:

let anotherHelloScript = new vm.Script(' '.repeat(length), {
  produceCachedData: true,
  cachedData: helloBuffer
});

// later in your code
anotherHelloScript.runInThisContext();
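
Putting the pieces together, here is a minimal end-to-end sketch: one step produces the buffer from source, and a separate helper rebuilds and runs a script from the buffer alone. The helper name scriptFromCache is hypothetical, and this is not bytenode's actual implementation. It assumes Node.js >= 10 and a script with top-level code only, and it uses an expression as the source so that the result is easy to inspect.

```javascript
const vm = require('vm');

// Hypothetical helper: rebuild a script from a code-cache buffer alone.
function scriptFromCache(cacheBuffer) {
  // Header entry [2] (bytes 8..11) holds the original source length.
  const length = cacheBuffer.readUInt32LE(8);
  const script = new vm.Script(' '.repeat(length), { cachedData: cacheBuffer });
  if (script.cachedDataRejected) {
    throw new Error('Code cache rejected (wrong length or different Node.js/V8 build)');
  }
  return script;
}

// "Compile" step: the original source is only needed here.
const cacheBuffer = new vm.Script('21 * 2').createCachedData();

// "Run" step: only the buffer is available.
const result = scriptFromCache(cacheBuffer).runInThisContext();
console.log(result);
```

Throwing on cachedDataRejected makes a version mismatch fail loudly instead of silently running a blank script.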

Finally, does this technique have an impact on performance? In recent versions of V8 (and Node.js), the performance is almost the same. Using the Octane benchmark, I did not find any difference in performance. I know that Google retired Octane (because browsers and JS engines were cheating), but the results are still meaningful in our situation, because we are comparing the same code on the same JS engine. So, the final answer is: Bytenode does NOT have a negative impact on performance.

Check my GitHub repository, where you can find complete working examples. I have added an example for Electron (which has no source code protection at all) and for NW.js (which has a similar tool, nwjc, but it works only with browser-side code). I will add more examples (and tests) soon, hopefully.
