Implementing highlight.js in an Express app for super fast pre-rendered code highlighting
Simple server-side Node / Express code that transforms Markdown content into fully formatted HTML with highlighted code blocks.
Markdown is a lightweight markup language with plain text formatting syntax. Its design allows it to be converted to many output formats.
Highlight.js is a syntax highlighter written in JavaScript. It works in the browser as well as on the server. It works with pretty much any markup, doesn’t depend on any framework, and has automatic language detection.
Unified is a friendly interface backed by an ecosystem of plugins built for creating and manipulating content.
Unified plugins: remark-parse, remark-rehype, rehype-stringify, rehype-highlight
At Regbrain we decided to implement server-side code highlighting to reduce the loading time of our main website. We constantly benchmark our website with Lighthouse and aim for top performance scores.
Loading JavaScript to highlight code in the browser was taking too much time. First the JavaScript files had to be fetched, and then the browser had to repaint the content, resulting in a slower website. To improve speed, we decided to highlight code on the server, and we now send fully formatted HTML to the browser.
At this point you may be wondering how highlighting code server side can be performant. We will explore that in more detail later on, but first, let's walk through our technical solution.
Our articles are written in markdown, so our workflow needs to take raw markdown as input and serve fully formatted HTML. We do it in the following steps:
1. Fetch markdown content
2. Transform markdown into a markdown syntax tree using remark-parse
3. Transform markdown syntax tree to html syntax tree using remark-rehype
4. Traverse html syntax tree to apply code highlighting to content within <code> tags using rehype-highlight
5. Transform html syntax tree to string to send to the client using rehype-stringify
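To make steps 2 to 4 more concrete, here is a hand-written sketch of roughly what a single fenced code block looks like as a node in the markdown syntax tree (mdast) and then in the HTML syntax tree (hast). The `codeToHast` helper is hypothetical and only imitates what remark-rehype does for this one node type; real trees also carry position information and other fields.

```javascript
// Simplified mdast node for a fenced code block tagged "js"
const mdastCode = { type: 'code', lang: 'js', value: 'let x = 1' }

// Hypothetical helper: imitates the markdown -> HTML tree step for one code node
function codeToHast(node) {
  const className = node.lang ? ['language-' + node.lang] : []
  return {
    type: 'element',
    tagName: 'pre',
    properties: {},
    children: [{
      type: 'element',
      tagName: 'code',
      // The language is carried over as a class name on the <code> element
      properties: className.length ? { className } : {},
      children: [{ type: 'text', value: node.value + '\n' }]
    }]
  }
}

const hastPre = codeToHast(mdastCode)
console.log(JSON.stringify(hastPre, null, 2))
```

rehype-highlight then visits `code` elements like this one and replaces the plain text child with `span` elements carrying `hljs-*` classes, which the highlight.js stylesheet colours.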
We achieve all of the above with the unified framework and its plugins as follows:
Import required libraries
We grab the unified framework and the required plugins
let unified = require('unified')
let markdown = require('remark-parse')
let remark2rehype = require('remark-rehype')
let highlight = require('rehype-highlight')
let html = require('rehype-stringify')
Create a unified processor
We create a processor which pipes together all the plugins above to achieve our chain of transformations from markdown to fully highlighted html:
let processor = unified()
  // Transform markdown into a markdown syntax tree
  .use(markdown)
  // Transform markdown syntax tree to html syntax tree
  .use(remark2rehype)
  // Traverse html syntax tree to apply code highlighting to content within code tags
  .use(highlight)
  // Transform html syntax tree to string to send to the client
  .use(html)
Transform!
We now have the processor which can parse any markdown input as follows:
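// Any markdown string will do; this one is only a placeholder
let input = '# Some markdown content'
// process() resolves to a VFile (String(output) gives the HTML); await needs an async function
let output = await processor.process(input)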
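Because `processor.process` resolves to a virtual file (a VFile) rather than a plain string, a small wrapper can make the result easier to pass around. This is a minimal sketch that assumes a processor built as in the previous section; `renderMarkdown` is a hypothetical helper name, not part of unified.

```javascript
// Hypothetical helper: turns markdown into an HTML string using a unified processor
async function renderMarkdown(processor, input) {
  // process() runs parse -> transform -> stringify and resolves to a VFile
  const file = await processor.process(input)
  // Converting the VFile to a string yields the generated HTML
  return String(file)
}
```

Inside an async function it could be called as `let htmlString = await renderMarkdown(processor, input)`.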
Express js router implementation example
We implement the above steps in our Express app as follows:
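let express = require('express')
let router = express.Router()

let unified = require('unified')
let markdown = require('remark-parse')
let remark2rehype = require('remark-rehype')
let html = require('rehype-stringify')
let highlight = require('rehype-highlight')

router.get('/:slug', async function (req, res, next) {
  try {
    // Placeholder: fetch the article's markdown from wherever it is stored
    let input = await article.from.database.in.markdown()

    let processor = unified()
      .use(markdown)
      .use(remark2rehype)
      .use(highlight)
      .use(html)

    // process() resolves to a VFile, which is passed to the 'article' view as its locals
    let output = await processor.process(input)
    res.render('article', output)
  } catch (err) {
    // Express 4 does not catch rejected promises in async handlers, so forward errors explicitly
    next(err)
  }
})

module.exports = router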
The last thing we need to do is include the highlight.js CSS styles on our pages. The easiest way would be to link them as an external stylesheet, but that would impair our loading speed, because fetching external styles blocks page rendering. To avoid the performance penalty, we include all CSS as an internal style on the page.
<!doctype html>
<html>
<head>
<style>
{all page's style including highlightjs css}
</style>
</head>
<body>
</body>
</html>
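One way to produce such a page is to read the stylesheets once at start-up and inline them when rendering. The following is a minimal sketch under that assumption, not a description of any particular template engine; the `loadCss` and `renderPage` helpers are hypothetical names introduced for illustration.

```javascript
const fs = require('fs')

// Hypothetical helper: read stylesheets from disk and join them into one string
function loadCss(paths) {
  return paths.map((p) => fs.readFileSync(p, 'utf8')).join('\n')
}

// Hypothetical helper: wrap pre-rendered article HTML in a page with internal styles
function renderPage(css, body) {
  return [
    '<!doctype html>',
    '<html>',
    '<head>',
    '<style>',
    css,
    '</style>',
    '</head>',
    '<body>',
    body,
    '</body>',
    '</html>'
  ].join('\n')
}
```

For example, `renderPage(loadCss(['node_modules/highlight.js/styles/github.css']), articleHtml)` would inline the GitHub theme that ships with highlight.js, assuming the package is installed locally and `articleHtml` holds the highlighted output.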
How do we make server-side rendering performant? Even though the code highlighting above slows down our server a little compared to sending 'clean' HTML, we implement a number of additional layers which allow us to achieve excellent page loading speed:
AMP - we serve our main content as AMP pages by default. That means that Google and Bing can cache our pages and serve them really fast on mobile devices.
No external styles or JavaScript (other than async AMP) - we do not use any blocking external resources such as styles, images or JavaScript files. This is already enforced by following the AMP specification, but even if we did not implement AMP, this would be a good approach to take to improve page load speed. All our CSS is internal. We prepare CSS server side and make it specific to the type of content that we serve to avoid including unused styles (within reason).
Minification - we use css and html minification to further reduce the size of our pages.
CDN - we use a global content delivery network and configure our HTTP headers to benefit from CDN caching. We also configure asset compression for our CDN.
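As an illustration of the HTTP header configuration mentioned in the CDN point, the following Express middleware sets a `Cache-Control` header that lets browsers and shared caches such as a CDN keep a response. The `cdnCache` name and the lifetimes are assumptions chosen for the example; suitable values depend on how often content changes and on the CDN in use.

```javascript
// Hypothetical middleware factory: lifetimes are in seconds and purely illustrative
function cdnCache(browserSeconds, cdnSeconds) {
  return function (req, res, next) {
    // max-age applies to browsers; s-maxage overrides it for shared caches such as CDNs
    res.set('Cache-Control', 'public, max-age=' + browserSeconds + ', s-maxage=' + cdnSeconds)
    next()
  }
}
```

It could be mounted in front of the article routes with `router.use(cdnCache(300, 86400))`.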
With the setup above we can serve even ten Express apps on the smallest AWS EC2 instance, which works out to be really cost-effective compared to hosting each app separately as a service.