With the growing complexity and usage of data-intensive web applications, it has become crucial for developers to build truly reliable, scalable, and maintainable applications. These terms are often used in the developer community without a true understanding of what they mean. In this three-part blog series, we set out to understand these important concepts in detail. We have already talked about reliability, and if you haven’t read that post yet, check it out on Hackernoon here.
In this post, we will be talking about Scalability.
An application that works today might not work the same way tomorrow. There are several reasons why this is true, and the biggest of them all is increased load.
An application running on a small server might serve 100 concurrent users perfectly well today, but when 1,000 concurrent users access it tomorrow, things are destined to break. We say the application is experiencing downtime because of increased load. Such a scenario is not good for any application.
Applications may also suffer downtime because of larger volumes of data. The data in any application grows with usage, and operations on larger datasets naturally require more processing power and take longer to complete.
Therefore, we need some mechanism to cope with this increased load in our applications. This is where scalability comes in!
Scalability is the term we use to describe a system’s ability to cope with increased load.
Scalability is often talked about casually - “System X is scalable“ or “System Y is not scalable“. We need to understand that these statements in themselves do not mean anything. Rather, a scalability discussion about an application should focus on how to grow the application’s capability when a particular kind of load increases.
Thus, before talking about scaling an application, we should properly understand how to quantify load and performance. Once we have understood those, we can talk about ways of maintaining performance when load increases in a certain way.
Measurements that describe the load on a system are called load parameters. The number and kind of load parameters depend on the type of application under discussion. For a web application, the number of users hitting the backend at any given time can be a load parameter. Similarly, for an HPC system, it might be the number of jobs in the queue, and for a chat application, the number of concurrent users.
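As a rough illustration, here is a minimal sketch of extracting one such load parameter, requests per second, from request arrival times. The timestamps are made up for illustration; no real log format is assumed:

```python
from collections import Counter

# Hypothetical request arrival times in seconds, as might be parsed from an
# access log (the data here is made up for illustration)
request_timestamps = [0.1, 0.4, 0.9, 1.2, 1.3, 1.8, 2.5, 2.6, 2.7, 2.8]

# Bucket requests into one-second windows: requests per second is one simple
# load parameter for a web application
requests_per_second = Counter(int(t) for t in request_timestamps)

# The busiest second tells us the peak load the backend had to absorb
peak_load = max(requests_per_second.values())
print(peak_load)  # → 4
```

In a real system you would derive such numbers from monitoring data rather than compute them by hand, but the idea is the same: pick a concrete, measurable quantity that captures how much work your system is being asked to do.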
Different applications define performance differently. A batch-processing application might use throughput as its measure of performance, while a web application might use response time. These performance numbers vary even under similar loads, and the variance is a result of random fluctuations. Therefore, we often work with a distribution of performance numbers rather than a single value.
There are several ways of working with these distributions. We might say that if the average of a performance distribution is within a certain limit, the application is performant. However, optimizing the average case is not always the goal.
Many organizations use percentiles to analyze this performance distribution. If 90% of the values in the distribution satisfy a prescribed limit, the application is said to be performant at the 90th percentile, or p90.
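To make this concrete, here is a small sketch (with made-up response times) that computes percentiles using the nearest-rank method:

```python
import math

# Hypothetical response times in milliseconds; note the two slow outliers
response_times_ms = [12, 15, 14, 120, 16, 13, 450, 17, 15, 14]

def percentile(values, p):
    """Return the p-th percentile of values using the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# The median looks healthy, but p90 exposes the slow tail
print(percentile(response_times_ms, 50))  # → 15
print(percentile(response_times_ms, 90))  # → 120
```

This is exactly why percentiles are preferred over averages: the median here suggests everything is fast, while p90 reveals that one in ten users waits much longer.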
Higher percentiles, such as p99, are difficult to optimize, as they are easily affected by random events outside of the application’s control.
With this understanding of how to quantify load and performance, we can go ahead and understand scalability. The question here is: how do we maintain good performance even when our load parameters increase by some amount?
An architecture designed to handle one level of load is likely to fail when given 10 times that load. Developers may therefore need to rethink their application’s architecture with every order-of-magnitude increase in load.
As the load on an application increases, the resources allocated to that application need to grow.
There are two ways in which resources can be allocated to scale applications: vertical scaling and horizontal scaling.
In vertical scaling, old, less powerful machines are replaced by more efficient and powerful ones. This has little impact on the actual application code, while the application can now serve more load. However, vertical scaling has its limits.
In horizontal scaling, more less-powerful machines are added to the application. This is a better way to scale but comes with its own challenges. Unlike vertical scaling, horizontal scaling demands certain guarantees from the application code, which not all applications can provide.
Good architecture generally involves a mixture of both vertical and horizontal scaling.
Some applications are configured to add more resources automatically when load increases. Such systems are called elastic, and this capability is often offered by cloud providers.
On the other hand, we have manual scaling, where human intervention is required to increase the allocated resources when load increases.
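The core decision an elastic system automates can be sketched as a simple rule. This toy example is not tied to any real cloud API, and the numbers are purely illustrative:

```python
import math

def desired_instances(current_load, target_load_per_instance, min_instances=1):
    """How many instances keep per-instance load under the target?"""
    needed = math.ceil(current_load / target_load_per_instance)
    return max(needed, min_instances)  # never scale below the floor

# At 950 requests/sec with a target of 200 requests/sec per instance,
# an elastic system would grow the fleet to 5 instances...
print(desired_instances(950, 200))  # → 5

# ...and shrink it back to the minimum when load drops off
print(desired_instances(50, 200))   # → 1
```

Real autoscalers add safeguards this sketch omits, such as cooldown periods and smoothing over time windows, so that short load spikes do not cause the fleet to thrash.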
An application can be either stateless or stateful. Stateless applications do not maintain any state of their own and are easier to scale horizontally. Stateful applications, on the other hand, maintain their own data, and scaling them is often difficult: horizontal scaling would require keeping track of that state across multiple machines.
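A minimal sketch of the difference, using a plain dict to stand in for an external store such as Redis or a database (the handler names here are hypothetical):

```python
# A stateful handler keeps session data in its own process memory, so a
# second server instance would never see it - this is what makes horizontal
# scaling hard.
in_process_sessions = {}

def stateful_handler(user_id):
    count = in_process_sessions.get(user_id, 0) + 1
    in_process_sessions[user_id] = count  # lost if this instance dies
    return count

# A stateless handler keeps no local state; it reads and writes a shared
# store, so any instance behind a load balancer can serve any request.
shared_store = {}  # stand-in for Redis, a database, etc.

def stateless_handler(user_id, store):
    count = store.get(user_id, 0) + 1
    store[user_id] = count
    return count

print(stateless_handler("alice", shared_store))  # → 1
print(stateless_handler("alice", shared_store))  # → 2
```

This is why a common scaling strategy is to push all state out of the application servers and into a dedicated data tier: the servers become stateless and trivially replaceable, while the hard state-management problem is concentrated in one place.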
An architecture that scales well for one application might not scale well for another. There is no magic sauce, no single scalable architecture that works for all kinds of applications. Architectures are shaped by load parameters and performance requirements, so it is very important to properly understand your application’s requirements before settling on one.
With this, we come to the end of this second blog on the scalability of applications. In the last post of this series, we will explore the maintainability of web applications. Till then, keep learning and keep growing!
Enjoy what you are reading? Consider following me on Twitter!