How To Define HTTP Middleware and Best Practices, by @moesif


Middleware is a design pattern to add cross-cutting concerns like logging, handling authentication, or gzip compression without having many code contact points. Some middleware is passive, such as logging middleware. Other middleware, such as gzip compression, can perform transforms on the request or response body. Because middleware runs on every request, its latency matters: an increase in latency can also reduce the bandwidth of a system, for example by increasing the time requests spend in queues.

In order to capture API calls from arbitrary environments, we had to create middleware for many of the common web API frameworks. Here's what we learnt.

Introduction to HTTP Middleware

What Is Middleware?

I use the term middleware, but each language/framework refers to the concept differently. Node.js and Rails call it middleware. In the Java enterprise world (i.e. Java Servlet), it’s called filters. In C# (ASP.NET), it’s called delegating handlers. Essentially, middleware performs some specific function on the HTTP request or response at a specific stage in the HTTP pipeline, before or after the user-defined controller. Middleware is a design pattern to elegantly add cross-cutting concerns like logging, handling authentication, or gzip compression without having many code contact points.

Since these cross-cutting concerns are handled in middleware, the controller/user defined handlers can focus on the core business logic.
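
To make the idea concrete, here is a minimal sketch in Python using WSGI, which is not one of the frameworks named above but follows the same pattern. The names hello_app, TimingLogMiddleware, and call_app are hypothetical and chosen only for illustration. The controller knows nothing about logging; the middleware wraps it and records each call.

```python
import time
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical controller: contains only business logic."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


class TimingLogMiddleware:
    """Wraps any WSGI app and records method, path, status, and duration."""

    def __init__(self, app, log):
        self.app = app
        self.log = log  # any list-like sink; a real system would use logging

    def __call__(self, environ, start_response):
        started = time.perf_counter()
        captured = {}

        def recording_start_response(status, headers, exc_info=None):
            captured["status"] = status  # observe the status, do not change it
            return start_response(status, headers, exc_info)

        body = self.app(environ, recording_start_response)
        # Simplification: measures time until the app returns its iterable.
        elapsed_ms = (time.perf_counter() - started) * 1000
        self.log.append((environ["REQUEST_METHOD"], environ["PATH_INFO"],
                         captured.get("status"), elapsed_ms))
        return body


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"] = status

    body = b"".join(app(environ, start_response))
    return result["status"], body


log = []
status, body = call_app(TimingLogMiddleware(hello_app, log), "/users")
```

The same wrapper can be placed around any other WSGI app without touching that app's code, which is the point of keeping cross-cutting concerns out of the controller.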

What Can Be Done With Middleware

Middleware is generally pretty flexible. Some middleware is passive, such as logging middleware. Other middleware, such as gzip compression, can perform transforms on the request or response body. Middleware can add HTTP headers, or add internal flags for use by your business logic, etc. It’s an implementation of the pipeline design pattern.
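
As an illustration of middleware that transforms the response, the following is a simplified sketch of gzip compression as Python WSGI middleware. It assumes the whole body fits in memory and that the wrapped app does not use the legacy write() callable; report_app and GzipMiddleware are hypothetical names, and production code would normally rely on the web server or an existing library instead.

```python
import gzip
from wsgiref.util import setup_testing_defaults


def report_app(environ, start_response):
    """Hypothetical controller returning a repetitive plain-text body."""
    body = b"middleware " * 200
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class GzipMiddleware:
    """Compresses the response body when the client advertises gzip support."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if "gzip" not in environ.get("HTTP_ACCEPT_ENCODING", ""):
            return self.app(environ, start_response)  # pass through untouched

        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        raw = b"".join(self.app(environ, capture))  # buffers the full body
        compressed = gzip.compress(raw)

        # Replace the stale Content-Length and announce the new encoding.
        headers = [(k, v) for k, v in captured["headers"]
                   if k.lower() != "content-length"]
        headers += [("Content-Encoding", "gzip"),
                    ("Content-Length", str(len(compressed))),
                    ("Vary", "Accept-Encoding")]
        start_response(captured["status"], headers)
        return [compressed]


environ = {"HTTP_ACCEPT_ENCODING": "gzip"}
setup_testing_defaults(environ)
response = {}


def start_response(status, headers, exc_info=None):
    response["status"], response["headers"] = status, headers


payload = b"".join(GzipMiddleware(report_app)(environ, start_response))
```

Note that the middleware also rewrites headers, so a single piece of middleware can touch both the body and the metadata of a response.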

In fact, even basic framework features like body parsing can be considered middleware. You can add custom body parsing for various binary protocols if the ones included with a framework don’t fit your use case.

Middleware can even call other services or data sources, such as performing a GeoIP lookup against a MaxMind dataset.

Key Things to Consider

One of the first things to consider if you want to perform a task using middleware is that middleware is best for cross-cutting concerns. If the task is specific business logic that applies to only a few cases, it probably belongs in the controller rather than in middleware.

If your app has many common tasks, such as logging, authentication, parsing JSON, or adding some common shared data to every request or response, then refactoring that logic out into middleware makes sense.


Due to the cross-cutting nature of middleware, ensuring it’s performant is particularly important, as any added latency could impact the entire application. Latency often comes from I/O, such as disk or network access.

For example, if you create authentication middleware for your API that needs to look up user information in a SQL database, that I/O read will stall the HTTP request pipeline until a response is received from the SQL database. That added latency will still be seen by clients even with non-blocking frameworks like Node.js.

Primer On Latency vs Bandwidth
Latency and bandwidth are two distinct, yet related metrics. Bandwidth refers to metrics such as requests per second or the maximum number of active clients or connections at a time. Latency, on the other hand, refers to metrics like time to first byte, or the average time needed to perform a particular query. The two are related because an increase in latency can also reduce the bandwidth of a system. For example, higher latency increases the time requests spend waiting in queues.
Generally, non-blocking frameworks such as Node.js focus mostly on increasing the bandwidth of an application by not tying up resources (e.g. threads). The effect is similar to adding another VM behind a load balancer to handle additional traffic. However, each individual user or client may still experience high latency regardless of the bandwidth of the system, and regardless of whether the architecture is blocking or non-blocking.
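
A back-of-the-envelope calculation shows how the two metrics interact. The sketch below assumes a blocking server where each worker thread handles one request at a time; the pool size and timings are made-up numbers for illustration, not measurements.

```python
def max_requests_per_second(workers: int, latency_s: float) -> float:
    """Upper bound on bandwidth when each worker handles one request at a time."""
    return workers / latency_s


WORKERS = 50  # hypothetical thread pool size

before = max_requests_per_second(WORKERS, 0.020)  # 20 ms per request
after = max_requests_per_second(WORKERS, 0.050)   # 20 ms plus a 30 ms database lookup

print(f"before: {before:.0f} req/s, after: {after:.0f} req/s")
```

Under these assumptions, adding 30 ms of latency to every request cuts the ceiling from about 2,500 to about 1,000 requests per second. A non-blocking framework removes that ceiling, but each client still waits the full 50 ms.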

For our example, a better solution may be to use an in-memory key-value store, like Redis, for looking up data via session tokens. An even lower latency solution would be to use JSON Web Tokens (JWT), where secure authentication and authorization require only CPU cycles.
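
The sketch below shows why a signed token needs no I/O to verify. It is deliberately not a real JWT implementation: it uses a simplified two-part payload.signature format with HMAC-SHA256 from the Python standard library, and SECRET, issue_token, and verify_token are hypothetical names. In production you would use a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret-change-me"  # hypothetical shared secret, for illustration only


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode(), hashlib.sha256).digest())


def issue_token(user_id: str, ttl_s: int = 3600) -> str:
    """Create a signed token carrying the user id and an expiry time."""
    claims = {"sub": user_id, "exp": int(time.time()) + ttl_s}
    payload = _b64(json.dumps(claims).encode())
    return f"{payload}.{_sign(payload)}"


def verify_token(token: str):
    """Return the claims if the signature and expiry check out, else None."""
    try:
        payload, signature = token.split(".")
        # Constant-time comparison; only CPU work, no database or network call.
        if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
            return None
        claims = json.loads(_unb64(payload))
    except ValueError:
        return None  # malformed token
    if claims["exp"] < time.time():
        return None  # expired
    return claims


token = issue_token("user-42")
claims = verify_token(token)
```

Authentication middleware built this way can reject or accept a request without leaving the process, at the cost of not being able to revoke a token before it expires unless you add a lookup back in.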

Some platforms, like Node.js, are designed from the ground up around non-blocking I/O. In others, such as PHP or Ruby on Rails, it's important to move I/O-heavy or processing-heavy tasks to background threads.

Passive Middleware

Even with the limitations of non-blocking I/O (NIO) described above, it can still help reduce latency. Especially for passive middleware with write-only I/O, architecting your middleware to be non-blocking or asynchronous is ideal. For example, the middleware for Moesif does not modify the response, so there is no reason for the HTTP pipeline to wait for any writes to Moesif. Thus, we designed the Moesif middleware to be asynchronous.
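
One common way to get this behavior is a queue drained by a background thread. The following is a generic sketch of that technique, not the actual Moesif implementation. It uses a simplified model where requests and responses are plain dicts, and AsyncEventLogger and logging_middleware are hypothetical names.

```python
import queue
import threading


class AsyncEventLogger:
    """Hands events to a background thread so the request path never blocks."""

    def __init__(self, send, max_pending=1000):
        self._send = send  # hypothetical blocking sender, e.g. an HTTP POST
        self._events = queue.Queue(maxsize=max_pending)
        threading.Thread(target=self._drain, daemon=True).start()

    def record(self, event):
        try:
            self._events.put_nowait(event)
        except queue.Full:
            pass  # drop the event rather than slow down the request

    def _drain(self):
        while True:
            event = self._events.get()
            try:
                self._send(event)
            except Exception:
                pass  # a failed write must never affect request handling
            finally:
                self._events.task_done()

    def flush(self):
        """Block until every queued event has been handed to the sender."""
        self._events.join()


def logging_middleware(handler, logger):
    """Passive middleware: returns the response unchanged, logs on the side."""
    def wrapped(request):
        response = handler(request)
        logger.record({"path": request["path"], "status": response["status"]})
        return response
    return wrapped


sent = []
logger = AsyncEventLogger(sent.append)
app = logging_middleware(lambda request: {"status": 200}, logger)
response = app({"path": "/orders"})
logger.flush()  # only needed here so the example can inspect `sent`
```

The bounded queue is a design choice: if the logging backend is slow or down, events are dropped instead of memory growing without limit or requests stalling.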

For blocking frameworks such as PHP Laravel, you may have to implement your own mechanism to be asynchronous. For the Moesif Laravel middleware, we leveraged a Unix fork to handle the sending of data to Moesif in a separate process.

Forking a process in Linux is a surprisingly lightweight task. For more info on why, please read What every web developer should know about CPU Arch.


Each middleware is instantiated as one step in a pipeline, so ordering may be important for functional correctness. However, even if two pieces of middleware can be reordered, you should think about the performance or security implications. Lightweight middleware that has a short-circuited return path should go before the heavier middleware.

As an example, middleware that redirects non-HTTPS requests to the HTTPS domain via a 301 Permanent Redirect should be placed before middleware that decompresses and parses the request body. Otherwise, you’re wasting CPU cycles decompressing a body that will be thrown away.
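
A short-circuiting redirect of this kind might look like the following Python WSGI sketch. HttpsRedirectMiddleware and body_parsing_app are hypothetical names, the host api.example.com is a placeholder, and the query string is ignored to keep the example small.

```python
from wsgiref.util import setup_testing_defaults


class HttpsRedirectMiddleware:
    """Answers plain-HTTP requests with a 301 before any heavier work runs."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        if environ.get("wsgi.url_scheme") != "https":
            target = "https://" + environ["HTTP_HOST"] + environ.get("PATH_INFO", "/")
            start_response("301 Moved Permanently",
                           [("Location", target), ("Content-Length", "0")])
            return [b""]  # short circuit: the wrapped app is never called
        return self.app(environ, start_response)


parsed_paths = []


def body_parsing_app(environ, start_response):
    """Stand-in for the expensive decompress-and-parse step."""
    parsed_paths.append(environ["PATH_INFO"])
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call(app, scheme):
    """Invoke a WSGI app in-process and return (status, headers as dict)."""
    environ = {"wsgi.url_scheme": scheme, "HTTP_HOST": "api.example.com",
               "PATH_INFO": "/orders"}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    result = {}

    def start_response(status, headers, exc_info=None):
        result["status"], result["headers"] = status, dict(headers)

    b"".join(app(environ, start_response))
    return result["status"], result["headers"]


app = HttpsRedirectMiddleware(body_parsing_app)
http_status, http_headers = call(app, "http")
https_status, _ = call(app, "https")
```

Only the HTTPS request reaches body_parsing_app; the HTTP request is answered by the redirect alone.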

Think about security and DDoS protection as well. If only a specific whitelist of IP addresses is allowed to access your API, and all others are denied, you probably want to check the IP address before performing expensive operations, such as querying account information in a database.
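
To see how ordering plays out in a pipeline, here is a framework-agnostic Python sketch where requests and responses are plain dicts. All names (ip_allowlist, account_lookup, build_pipeline) are hypothetical, and the list db_calls stands in for real database queries so the cost of each ordering is visible.

```python
from functools import reduce


def ip_allowlist(allowed):
    """Cheap check with a short-circuit path for unknown addresses."""
    def middleware(next_handler):
        def handler(request):
            if request["remote_addr"] not in allowed:
                return {"status": 403, "body": "forbidden"}
            return next_handler(request)
        return handler
    return middleware


def account_lookup(db_calls):
    """Expensive step: each call stands in for a database query."""
    def middleware(next_handler):
        def handler(request):
            db_calls.append(request["remote_addr"])
            request["account"] = {"plan": "demo"}
            return next_handler(request)
        return handler
    return middleware


def controller(request):
    return {"status": 200, "body": request["account"]["plan"]}


def build_pipeline(middlewares, final_handler):
    """The first middleware in the list becomes the outermost and runs first."""
    return reduce(lambda nxt, mw: mw(nxt), reversed(middlewares), final_handler)


db_calls = []
app = build_pipeline([ip_allowlist({"10.0.0.1"}), account_lookup(db_calls)],
                     controller)

denied = app({"remote_addr": "203.0.113.9"})
calls_after_denied = len(db_calls)  # no lookup was made for the blocked client
allowed = app({"remote_addr": "10.0.0.1"})
```

Swapping the two entries in the list would still return 403 to the blocked client, but only after paying for a database query on every denied request.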

Underlying Frameworks

Oftentimes, the framework you are using is actually built on top of another framework. For example, Ruby on Rails is built on top of Rack. So when you are building Rails middleware, you are actually building it for Rack.

This can be used to your advantage. At Moesif, we didn’t need to create a separate middleware for every Java framework. Instead, we released a single Java Servlet SDK. Many modern Java frameworks, such as Spring, Struts, and Jersey, are built on Java Servlet. Sometimes, you have the option to build on a base framework like Java Servlet or on a higher-level web framework such as Spring.

Closing Thoughts

Building middleware can force you to dig into the underlying technology of each framework. Many common services of a framework (such as authentication and body parsing) are implemented as middleware.
