TL;DR
… and yes, doing a lot of tasks is slowly killing me, which is why I need your help. Please!
Imagine that I create a new mobile application. My system consists of a frontend and a backend service that communicate through a REST API. This is my system architecture.
When my product first launches, only a few users use it. My backend service can still handle the requests as expected. My customers are happy because the application is blazing fast.
Now the number of users grows as I add more features to my application. My backend service handles more requests than before. The growth in requests starts to affect my backend service and leads to performance degradation. My backend service also goes down because it cannot handle all the requests, which means availability is degraded too.
The users start complaining about the application because it has become slower. They are not happy, and neither am I, as I am afraid of losing users. What can I do to improve my backend performance? What can I do to increase its availability?
There are a few ways to achieve this. I can revisit the application's algorithms. I can change the programming language. I can add more resources to the server. I can add more instances of the backend service. You name it.
For this case, I will choose to add more instances of my backend service, because I do not have enough time to rewrite the backend service.
As I want to run multiple instances of my backend service, I need a mechanism to route requests from my mobile application to the instances. Here I use a load balancer. As its name suggests, a load balancer shares the workload between the backend instances. Here is what my architecture will look like.
The load balancer has its own algorithm for sharing the workload. There are a few algorithms I can choose from, such as round-robin, weighted round-robin, least-connection, and others. I can choose the algorithm that fits my needs.
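To make the three algorithms named above a bit more concrete, here is a minimal sketch of how each one could pick a backend. The class names and backend labels are made up for illustration, and real load balancers such as HAProxy or Nginx implement these ideas with more care (health checks, smoother weighting, concurrency), so treat this as a toy model only.

```python
import itertools


class RoundRobin:
    """Hand out backends one after another, in a fixed rotation."""

    def __init__(self, backends):
        self._rotation = itertools.cycle(backends)

    def pick(self):
        return next(self._rotation)


class WeightedRoundRobin:
    """Naive weighted rotation: a backend with weight 2 appears twice per cycle.

    Assumption: weights are small positive integers.
    """

    def __init__(self, weights):
        expanded = [b for b, w in weights.items() for _ in range(w)]
        self._rotation = itertools.cycle(expanded)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Pick the backend that currently has the fewest open connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        # On a tie, min() returns the first backend in insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call this when a request to `backend` has finished.
        self.active[backend] -= 1


rr = RoundRobin(["a", "b", "c"])
rr_picks = [rr.pick() for _ in range(4)]

wrr = WeightedRoundRobin({"a": 2, "b": 1})
wrr_picks = [wrr.pick() for _ in range(4)]

lc = LeastConnections(["a", "b", "c"])
lc_picks = [lc.pick() for _ in range(3)]
lc.release("b")  # "b" finishes its request first
lc_picks.append(lc.pick())
```

Round-robin is the simplest choice when all instances are identical. The weighted variant helps when some instances are bigger than others, and least-connection helps when some requests take much longer than others.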
In this scenario, each backend instance now handles fewer requests than before. Imagine I average around 6000 requests per minute, all handled by a single backend instance. If I share that workload evenly across three instances, each instance handles about 2000 requests per minute. This helps me improve performance and uptime.
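The arithmetic can be checked with a tiny simulation. This sketch assumes a plain round-robin rotation and three hypothetical instance names. It only counts requests, so it says nothing about how long each request takes.

```python
from collections import Counter
from itertools import cycle


def distribute(total_requests, instances):
    """Count how many requests each instance receives under round-robin."""
    rotation = cycle(instances)
    return Counter(next(rotation) for _ in range(total_requests))


# One minute of traffic from the example: 6000 requests, three instances.
per_instance = distribute(6000, ["backend-1", "backend-2", "backend-3"])

for name, count in sorted(per_instance.items()):
    print(f"{name}: {count} requests per minute")
```

With an even rotation, each of the three instances ends up with 2000 of the 6000 requests.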
There are several load balancer technologies we can use, such as HAProxy, Nginx, and Cloud Load Balancing. You can pick whichever you want.
Evidently not! Load balancing is not only for handling a high workload. So what else is it for?
Let’s imagine again that I have services that run on virtual machines. But now I want to migrate all my deployments to containers. It sounds easy because I only need to redeploy all my services using containers. After that, I redirect all the requests to the container cluster. But is that a good practice?
That approach is too risky. Imagine that some deployment on the container cluster is not configured well. It can cause downtime. So how can a load balancer help me?
Because moving directly to the container cluster is risky, I can move partially. After deploying all services to the container cluster, I want to test it first by sending some requests to it. I can use the load balancer to share the load between the VM cluster and the container cluster. I configure the load balancer to send 20% of requests to the container cluster and 80% of requests to the VM cluster. Using this approach, I can monitor my container cluster to make sure it is configured well. Over time, I can send more requests to the container cluster until I am confident enough to use the container cluster exclusively.
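The 20/80 split can be sketched as a weighted random choice per request. This is only an illustration of the idea: the function name, the cluster labels, and the ramp-up steps after 20% are assumptions, and a real load balancer would express the split in its own configuration instead of application code.

```python
import random
from collections import Counter


def make_splitter(container_percent, rng=None):
    """Return a function that routes one request to "container" or "vm".

    `container_percent` is the share of traffic (0 to 100) for the container cluster.
    """
    if not 0 <= container_percent <= 100:
        raise ValueError("container_percent must be between 0 and 100")
    rng = rng or random.Random()

    def route():
        # rng.random() is in [0, 1), so 0 never picks "container" and 100 always does.
        return "container" if rng.random() * 100 < container_percent else "vm"

    return route


def simulate(container_percent, requests=10_000, seed=42):
    """Route `requests` simulated requests and count where they land."""
    route = make_splitter(container_percent, random.Random(seed))
    return Counter(route() for _ in range(requests))


# Hypothetical ramp-up: start at 20%, then raise the share as confidence grows.
for percent in (20, 50, 100):
    counts = simulate(percent)
    print(f"{percent}% target -> container={counts['container']}, vm={counts['vm']}")
```

Because the choice is random, the observed share is close to the target but not exact. If something goes wrong on the container cluster, setting the percentage back to 0 sends all traffic to the VM cluster again.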
Migrating a monolith to microservices is similar to migrating a VM cluster to a container cluster. It is risky to migrate the monolith directly. A load balancer can help me migrate part by part. I can use a load balancer to share requests between my monolith cluster and my microservices cluster. I keep using the load balancer until I am confident that the microservices work well. Then I migrate completely.
Here at Gojek, we also created our own in-house load balancer, named Weaver. It is an open-source project that you can find here.
https://github.com/gojek/weaver
You can read about why we created it, even though there are existing load balancer technologies we could use.
https://blog.gojekengineering.com/weaver-sharding-with-simplicity-8d602f2b0d81
https://blog.gojekengineering.com/sharding-101-the-ways-of-weaver-6c99360531fa
I hope you enjoy reading it!
~CappyHoding 💻 ☕️
(Disclaimer: The Author is an SRE at Gojek)