Being a committer to many open-source projects, I decided one day to make my life easier and developed an upstream module for NginX that helped me eliminate a whole stack of tiers from a multitier architecture. It was such a fun experience that I've decided to share the results in this article. Everything is fully open-source; see the source code here: https://github.com/tarantool/nginx_upstream_module. You can build it all from scratch or download a Docker image via this link: https://hub.docker.com/r/tarantool/tarantool-nginx.
This is what a typical microservice architecture looks like. User requests come through NginX to an application server, which runs the business logic the users interact with.
The application server does not hold any state, so that state needs to be stored somewhere, typically in a database. There is also a cache to decrease latency and ensure faster content delivery.
Let’s give definitions to those tiers:
Tier 1 — NginX
Tier 2 — Application server
Tier 3 — Cache
Tier 4 — Database proxy. You need this proxy to ensure the fault tolerance of your database and to keep persistent connections to it.
Tier 5 — Database server
One day I started thinking about those five tiers and came up with the idea of eliminating some of them. Why would I do that? There are many reasons. I love keeping things simple, I don't like maintaining a lot of different systems in production, and, last but not least, fewer tiers mean fewer points of failure. As a result, I created the Tarantool NginX upstream module, which helped me reduce the number of tiers down to two.
How can Tarantool help us eliminate some of the tiers? Well, tier one is NginX, and tiers two, three and five are now replaced by Tarantool. Tier four, the database proxy, now lives inside NginX. The trick is that Tarantool is a database, a cache and an application server, all in one. My upstream module is the glue that sticks NginX and Tarantool together and lets them work without the other three tiers.
This is what our new microservice looks like. A user sends REST or JSON RPC requests to NginX with the Tarantool upstream module. This module connects directly to Tarantool, or it can balance the workload across many Tarantool instances. Between NginX and Tarantool we use a highly efficient protocol based on MsgPack. You can find more information in this article.
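To give a feel for why a MsgPack-based protocol is compact, here is a toy Python encoder that packs a few simple values the way the MessagePack format does (positive fixint, fixstr, fixarray, fixmap). This is only my illustration of the wire format, not the codec the module actually uses.

```python
def pack(value):
    """Toy MessagePack encoder: handles only positive fixint (0..127),
    fixstr (< 32 bytes), fixarray (< 16 items) and fixmap (< 16 pairs),
    which is enough to see the idea."""
    if isinstance(value, bool):                    # bool before int: bool is an int subclass
        return b'\xc3' if value else b'\xc2'
    if isinstance(value, int) and 0 <= value <= 0x7f:
        return bytes([value])                      # positive fixint: the byte itself
    if isinstance(value, str) and len(value.encode()) < 32:
        data = value.encode()
        return bytes([0xa0 | len(data)]) + data    # fixstr: 0xa0 | length, then the bytes
    if isinstance(value, list) and len(value) < 16:
        return bytes([0x90 | len(value)]) + b''.join(pack(v) for v in value)
    if isinstance(value, dict) and len(value) < 16:
        out = bytes([0x80 | len(value)])           # fixmap: 0x80 | number of pairs
        for k, v in value.items():
            out += pack(k) + pack(v)
        return out
    raise ValueError('toy encoder: value out of supported range')

# A two-key map packs into 15 bytes, versus 21 bytes of minified JSON:
encoded = pack({'arg_1': 1, 'arg_2': 2})
print(len(encoded))   # -> 15
```

The savings grow with the amount of numeric data, since every JSON number costs one byte per digit plus delimiters, while small MessagePack integers cost a single byte.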
You can also follow the links below to download Tarantool and the NginX module. But I would advise installing everything from packages or using the Docker image (docker pull tarantool/tarantool-nginx).
Tarantool Docker image: https://hub.docker.com/r/tarantool/tarantool
Tarantool downloads: tarantool.org
Tarantool on GitHub: https://github.com/tarantool
Tarantool NginX upstream module (REST, JSON API, websockets, load balancing): https://github.com/tarantool/nginx_upstream_module
Here is an example nginx.conf file. As you can see, it is a regular NginX upstream configuration. The "tnt_pass" directive tells NginX to pass requests for the specified location to the Tarantool upstream.
— nginx-tnt.conf
http {
  upstream tnt {
    server 127.0.0.1:3301;
    keepalive 1000;
  }

  server {
    listen 8081;

    location /api/do {
      tnt_pass_http_request parse_args;
      tnt_pass tnt;
    }
  }
}
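If you later want NginX to balance the load across several Tarantool instances (as in the benchmarks further down), the same upstream block simply takes more server entries. A sketch, where the second port 3302 is my assumption for illustration:

```nginx
upstream tnt {
    # Two Tarantool instances; NginX round-robins requests between them.
    server 127.0.0.1:3301;
    server 127.0.0.1:3302;
    keepalive 1000;
}
```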
Here are links to the docs:
http://nginx.org/en/docs/http/ngx_http_upstream_module.html
https://github.com/tarantool/nginx_upstream_module/blob/master/README.md
Well, we've connected NginX with Tarantool. So, what is the next step? We need to write a function and store it in a file. I stored mine in a file named "app.lua".
Here is a link to Tarantool doc: https://tarantool.org/doc/tutorials/index.html
-- Bootstrap Tarantool
box.cfg { listen = '*:3301' }

-- Grants
box.once('grants', function()
    box.schema.user.grant('guest', 'read,write,execute', 'universe')
end)

-- Global variable
hello_str = 'Hello'

-- Function
function api(http_request)
    local str = hello_str
    if http_request.method == 'GET' then
        str = 'Goodbye'
    end
    return 'first', 2, { str .. ' world!' }, http_request.args
end
Let’s take a closer look at this Lua code.
box.cfg{} tells Tarantool to start listening on port 3301; it can take other configuration parameters as well.
box.once tells Tarantool to run the given function only once per instance, so the grants are not reapplied every time the script starts.
function api() is the function I'm going to call shortly. It is pretty simple: it takes an HTTP request as its first argument and returns multiple values.
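The mapping from the function's multiple return values to a JSON body is easiest to see in a sketch. This toy Python function is my own illustration, not the module's actual serialization code: it wraps each returned value in its own sub-array inside a JSON-RPC-style "result" field, loosely mirroring the response shape shown further down (the real module's nesting of returned tables and request args differs in the details).

```python
import json

def wrap_response(request_id, *values):
    """Illustration only: wrap a handler's return values into a
    JSON-RPC-style response body, one sub-array per returned value."""
    return json.dumps({'id': request_id, 'result': [[v] for v in values]})

# api() returns several values; each one becomes an element of "result".
body = wrap_response(0, 'first', 2, {'1': 'Goodbye world!'})
print(body)
```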
I stored this code in a file named "app.lua". I can execute it simply by starting the Tarantool binary.
$> tarantool app.lua
Let's call our function with an HTTP GET request. I use "wget" for this. By default, "wget" saves the result to a file, so I use "cat" to print the file's contents.
$ wget '0.0.0.0:8081/api/do?arg_1=1&arg_2=2'
$ cat do*
{
  "id": 0,            # unique identifier of the request
  "result": [         # what our Tarantool function returns
    ["first"], [2],
    [{
      "request": {"arg_2": "2", "arg_1": "1"},
      "1": "Goodbye world!"
    }]
  ]
}
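On the client side, the body is plain JSON, so decoding it takes one call. Here is a short Python sketch; the literal below is the response shown above with the annotations removed.

```python
import json

# The body returned by the /api/do call above.
body = '''
{
  "id": 0,
  "result": [
    ["first"], [2],
    [{"request": {"arg_2": "2", "arg_1": "1"}, "1": "Goodbye world!"}]
  ]
}
'''

response = json.loads(body)

# The third element of "result" holds the returned table and the request args.
greeting = response['result'][2][0]['1']
print(greeting)                        # -> Goodbye world!

args = response['result'][2][0]['request']
print(args['arg_1'], args['arg_2'])    # -> 1 2
```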
The benchmarks below were run with production data. The input data is a set of large JSON objects, 2 KB each on average.
Single server, 4 core CPU, 90GB RAM, OS Ubuntu 14.04.1 LTS
For this test, we use only one NginX worker, acting as a round-robin balancer over two Tarantool instances. The instances are tied together via sharding.
These charts show the number of reads per second; the top chart shows latencies (ms).
Here are more charts, showing the number of writes per second; again, the top chart shows latencies (ms).
Impressive, don't you think?
In the next article I'll write about REST and JSON RPC in detail.