The concept of instrumentation often refers to tracing where events happen in an application. Many application performance monitoring (APM) tools use it to provide metrics on the inner workings of your application. But sometimes, all you really need are details about API calls.
Adding a hook into every HTTP request your application makes will allow you to automatically log requests, monitor APIs, handle problems with failure remediations, and more. This holds true for both internal requests to your own services, but more importantly, it works with any request to external third-party APIs. Even those that use their own client SDK.
Creating a full system to manage this is a bit harder. You need a dashboard to view the metrics, storage to handle the logs, and a way to capture the data. Fortunately, the concept of HTTP instrumentation is easier in Node.js thanks to the ecosystem's reliance on the underlying http module. Nearly every API client and request library used by node developers relies on this module.
In this post, we'll look at the building blocks needed to add observability to every request your application makes. This means direct insight into your API calls, without the need to configure logging for each individually.
To make our instrumentation layer, we will patch the core methods of the http/https module. This means overriding their functionality, performing some action, then calling the original method. It sounds more complex than it is. To start, let's look at a minimum example, without any new features. We can create a module in hijack.js as follows:
// hijack.js
const http = require("http")
function hijack() {
override(http)
}
function override(module) {
let original = module.request
function wrapper(outgoing) {
// Store a call to the original in req
let req = original.apply(this, arguments)
// return the original call
return req
}
module.request = wrapper
}
module.exports = hijack
Let's break down what this code is doing. We import the http module at the top. For a more complete version, we'd also need coverage for the https module. Next, the hijack function sets up the patches for http by calling an override function. The override function does three things:
The .apply method exists on all functions in javascript. It allows you to "call" the function, and pass it the this context and an array of arguments. It is similar to .call, except for the way it structures the arguments.
For our purposes, we are using it to call the original request, essentially unchanged. We aren't modifying the this context or the arguments. This is what allows us to step in-between the request and the destination without interfering.
One common use case for http instrumentation is to automatically log parts of the request. This provides observability and metrics about the request. In the wrapper function above, notice that we accept an argument called req. This is somewhat misleading, as it may not be the request. What it actually accepts are the arguments passed to http.request.
Depending on the client that is using http under the hood, this may be different. For most, like axios, our example will work fine. For others, you'll need to write more logic to handle edge cases. For these examples, we can assume that the first argument, arguments[0], maps to the request.
Let's make a function that handles logging. For this example, it will parse the request and log to the console. You could also send this data to a logging service, write it to a file, or whatever suits your needs.
// hijack.js
function logger(req) {
let log = {
method: req.method || "GET",
host: req.host || req.hostname || "localhost",
port: req.port || "443",
path: req.pathname || req.path || "/",
headers: req.headers || {},
}
console.log(log)
}
While mosts request should be uniform, we add in some optional fallbacks to avoid any undefined values. We capture a limited amount of information from the request in this implementation. You can view all the available options by inspecting or logging out req.
When combined with our code from earlier, we get something like the following:
function override(module) {
let original = module.request
function wrapper(outgoing) {
let req = original.apply(this, arguments)
logger(outgoing) // NEW
return req
}
module.request = wrapper
}
We captured the details about the request, but not the body of the request. More on that later. For now, let's look at capturing the response that comes back. This requires using the same technique we used earlier of overriding the default functionality of a method. Our goal is to listen to the response events that fire when an http request comes back. To recap, we need to:
In this case, we want information about the events that http.request emits when a response comes in. Specifically, the data and end events. To do this, we will patch the functionality of request.emit. An HTTP request emits events, such as the response event, that themselves emit events. If you're familiar with making HTTP calls using the standard http library, this will look similar.
To start, let's override emit. We will only focus on code inside of the wrapper.
function wrapper(outgoing) {
let req = original.apply(this, arguments)
let emit = req.emit // New
// New
req.emit = function (eventName, response) {
switch (eventName) {
case "response": {
response.on("data", (d) => {
// build body from chunks of data
})
response.on("end", () => {
// handle final response
})
}
}
return emit.apply(this, arguments)
}
logger(outgoing)
return req
}
In this code, we capture a reference to req.emit in emit. This allows us to do the next step, where we override req.emit with our own function. It takes the same two arguments that a node event emitter would take. The name of the event, and any specific arguments (in this case, the response).
We then set up some listeners when a response event comes in. You could add additional switch cases for error, abort, timeout, or any events emitted by the http.clientRequest. For now, we'll only focus on responses that successfully came back.
Finally, we return emit.apply(this, arguments) similar to how we did earlier. This ensures that the "real" request still emits events as expected.
Let's fill in the switch block to handle the response body, and log some details about the response to the console:
function wrapper(outgoing) {
let req = original.apply(this, arguments)
let emit = req.emit
let body = ""
req.emit = function (eventName, response) {
switch (eventName) {
case "response": {
response.on("data", (d) => {
// NEW: Collect data chunks
body += d
})
response.on("end", () => {
// NEW: Complete response
let res = {
statusCode: response.statusCode,
headers: response.headers,
message: response.statusMessage,
body,
}
console.log(res)
})
}
}
return emit.apply(this, arguments)
}
logger(outgoing)
return req
}
The two main changes here are:
This is another case where creating a dedicated logging method would be useful. You can also combine the request and response details into a single log, attach timestamps or use process.hrtime() to time the requests, and more.
That covers the bulk of our introductory implementation. The final code looks like this:
// hijack.js
const http = require("http")
const https = require("https")
function hijack() {
override(http)
override(https)
}
function override(module) {
let original = module.request
function wrapper(outgoing) {
let req = original.apply(this, arguments)
let emit = req.emit
let body = ""
req.emit = function (eventName, response) {
switch (eventName) {
case "response": {
response.on("data", (d) => {
// NEW: Collect data chunks
body += d
})
response.on("end", () => {
// NEW: Complete response
let res = {
statusCode: response.statusCode,
headers: response.headers,
message: response.statusMessage,
body,
}
console.log(res)
})
}
}
return emit.apply(this, arguments)
}
logger(outgoing)
return req
}
function logger(req) {
let log = {
method: req.method || "GET",
host: req.host || req.hostname || "localhost",
port: req.port || "443",
path: req.pathname || req.path || "/",
headers: req.headers || {},
}
console.log(log)
}
module.request = wrapper
}
module.exports = hijack
To get it up and running in your application, you would then include the following at the start of your application logic:
const hijack = require("./hijack")
hijack()
// ...
Now the hijack module will pick up every outbound HTTP request in your application. Metrics and instrumentation, all with a small library. This is just the beginning!
So far we've logged the request and the response, without interrupting the original requests too much. Some additional functionality could include:
HTTP instrumentation is much easier in Node.js than many languages due to Node's use of the http module under the hood. In other languages, you'd need to handle individual libraries on a case-by-case basis. You also need to take into account how to process all this data, how to obfuscate sensitive content, and more.
Fortunately, that's what we're solving at Bearer. We have built clients for multiple languages that work similarly to the core idea of this post (albeit far more complex). They automatically log the data you want (and nothing you don't want shared), and more importantly react when failures happen through circuit breakers and other remediation techniques.
Previously published at https://blog.bearer.sh/http-api-instrumentation-nodejs/