
Five Tips To Scale Your Infrastructure 30X For Peak Days

by Dmitry Shesternin, September 18th, 2023

Too Long; Didn't Read

In-store traffic on Black Friday and Cyber Monday shows a 38% increase. Companies should get their infrastructure ready for peak days, since even a few minutes of downtime can cost them thousands of dollars. Flowwow, a global marketplace of local brands and floral businesses, has its own 3-4 peak days (such as Valentine’s Day and Mother’s Day) when traffic increases 30X.


According to SalesCycle, in-store traffic on Black Friday and Cyber Monday shows a 38% increase. Companies should get their infrastructure ready for peak days, since even a few minutes of downtime can cost them thousands of dollars. Over the last few years, many big brands, including Walmart, J. Crew, Lowe’s and GAME, lost a lot of money and damaged their customers’ trust because their teams weren’t ready for Black Friday. A 38% increase demands close attention to the service and its scalability, so that companies can keep up with the flow of orders, maximise customer experience, and minimise negative feedback.


Flowwow, a global marketplace of local brands and floral businesses, has its own 3-4 peak days (such as Valentine’s Day and Mother’s Day) when traffic increases 30X. Our IT team has developed a flexible system that helps us scale the service up, manage it, and scale it back down when the peak day is over.


In this article, I share five tips that will help you prepare your service for massive scaling and avoid mistakes common to e-commerce platforms.

Planning is the key

Before any action, develop and implement a detailed scaling plan divided into several stages: 1 month, 2 weeks, 3-4 days, and 1 day before a peak day, each with a precise list of actions for the teams. At every stage, it’s important that the designated specialists understand how to carry out each task and solve any potential issue. These teams usually consist of DevOps engineers and backend developers.
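To make the staged plan concrete, here is a minimal sketch that turns a peak date into stage start dates. The day offsets are an assumption (1 month is taken as 30 days, and "3-4 days" as 4), and `stage_dates` is a hypothetical helper, not part of any tool we use.

```python
from datetime import date, timedelta

# Offsets in days before the peak day. Assumptions: "1 month" = 30 days,
# "3-4 days" = 4 days (the earlier of the two).
STAGES = [
    (30, "1 month before"),
    (14, "2 weeks before"),
    (4, "3-4 days before"),
    (1, "1 day before"),
]


def stage_dates(peak_day: date) -> list[tuple[date, str]]:
    """Return (start_date, stage_name) for each preparation stage."""
    return [(peak_day - timedelta(days=offset), name) for offset, name in STAGES]


if __name__ == "__main__":
    # Example: preparing for Valentine's Day 2024.
    for start, name in stage_dates(date(2024, 2, 14)):
        print(start.isoformat(), name)
```

Each stage date can then be attached to its own checklist and owner, so nobody has to work out deadlines by hand.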


Our traffic at peak days can increase 30X

#1 Forecast the data

Every year the total number of orders grows. That comes as no surprise, which is why it’s vital to forecast the traffic volume and be ready for it. Historical data and cross-functional collaboration with the marketing team are core elements of an accurate prediction. They help you understand which system components will have to grow, so you can decide on the number of servers and their capacity.


Once you have estimated the expected numbers, double the forecast. If you predict 30X growth, prepare for a 60X increase. This margin leaves room for unexpected challenges. With cloud scaling, you have the tools to increase capacity dramatically in a short period of time and then reduce it to regular levels when needed.
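The arithmetic behind this margin can be sketched as follows. The function name, the requests-per-second figures, and the assumption that capacity scales linearly with the number of servers are all illustrative; real systems rarely scale perfectly linearly, so treat the result as a lower bound.

```python
import math


def servers_needed(
    baseline_rps: float,
    growth: float,
    per_server_rps: float,
    safety_factor: float = 2.0,
) -> int:
    """Estimate the server count for a forecast multiplied by a safety factor.

    Assumes capacity scales linearly with the number of servers.
    """
    target_rps = baseline_rps * growth * safety_factor
    return math.ceil(target_rps / per_server_rps)


if __name__ == "__main__":
    # Hypothetical numbers: 100 requests/s on a normal day, 30X forecast,
    # each server handles 500 requests/s.
    print(servers_needed(baseline_rps=100, growth=30, per_server_rps=500))
```

With these hypothetical inputs, the 30X forecast alone would need 6 servers, while planning for 60X needs 12.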


Do an infrastructure discovery analysis based on the basic parameters: map out servers, their current size, their components, projects, and applications. Understanding these parameters helps with extrapolation: you predict growth and calculate potential scenarios in case it really is 60X.


Apply the 30X rule to all errors. If you find an error, ask yourself: “If this error starts appearing 30 times more often, will it be a problem for us?” If the answer is yes, start fixing it now.
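One way to apply this rule to an error log is sketched below. It assumes that error frequency grows in proportion to traffic and that you can name a tolerable number of occurrences per day. Both the threshold and the error names are hypothetical.

```python
def errors_to_fix_now(
    daily_error_counts: dict[str, int],
    tolerable_per_day: int,
    multiplier: int = 30,
) -> list[str]:
    """Return errors whose projected peak-day count exceeds the tolerable level.

    Assumes error counts scale linearly with traffic.
    """
    return sorted(
        name
        for name, count in daily_error_counts.items()
        if count * multiplier > tolerable_per_day
    )


if __name__ == "__main__":
    # Hypothetical counts from a normal day's logs.
    counts = {"payment_timeout": 5, "image_404": 1, "cart_race": 20}
    print(errors_to_fix_now(counts, tolerable_per_day=100))
```

Here an error seen 5 times a day looks harmless, but at 30X it would occur 150 times, so it makes the list.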

#2 Protect your scaling team from business requests

When your development team is not big enough, the chances are high that everyone's to-do list is full to the brim. Hence, it is necessary to form a separate internal development team, whose focus will exclusively be on the peak period: this team should be free from other business requests and deal only with optimization tasks. The Pareto principle works well here: optimising 20% of the bottlenecks means closing 80% of performance problems in code and the database.


Don’t forget to announce a feature freeze (a period when your team won’t add new features) and remind the team about it, preferably a month in advance. We don’t allow changes to the code or database during this period because any change can bring the entire system down, and at peak time the cost of such an error is 30X higher.

#3 Find bottlenecks and start the optimization

Bottlenecks show which mechanisms you need to optimise in your current infrastructure. We divide the system components into different areas and monitor them separately. This allows us to evaluate the performance of each server and understand which components need more resources, which in turn helps us plan resources flexibly. Use an advanced monitoring system capable of assessing each component’s performance in real time, and log aggregation to catch all errors (so you can fix them later).



#4 Perform personalised stress testing

Perform stress testing in advance, trying to mimic the behaviour of your potential customers. Start by simulating an infrastructure load of, for example, 10X the usual. The data obtained shows which areas require optimization. There are special tools that let you take an hour of recorded load and replay it 10 times faster to confirm whether the system can withstand a certain level of pressure.
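The idea of replaying recorded load faster can be illustrated with a small sketch. It only computes the compressed send schedule from recorded request timestamps; actually sending the requests is left to whichever load-testing tool you use. `compress_schedule` is a hypothetical helper, not the API of any specific tool.

```python
def compress_schedule(timestamps: list[float], speedup: float = 10.0) -> list[float]:
    """Map recorded request times (in seconds) to replay offsets.

    Offsets start at 0 and are `speedup` times denser than the recording,
    so an hour of traffic replayed at 10X takes six minutes.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    if not timestamps:
        return []
    start = min(timestamps)
    return [(t - start) / speedup for t in sorted(timestamps)]


if __name__ == "__main__":
    # Hypothetical recording: requests at 0 s, 10 s and 3600 s.
    print(compress_schedule([0.0, 10.0, 3600.0]))
```

Replaying real traffic this way keeps the mix of requests customers actually make, which a synthetic test with uniform requests would miss.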

#5 Plan for Day X

During peak loads, we meticulously schedule the workload of each team member, assigning key roles and distributing responsibilities. We care about our employees, so we make sure everyone gets a healthy amount of sleep, while still striving to cover the maximum number of time zones. At each stage, we nominate a process manager who monitors what is happening at the peak moment. This is the person who informs the team if one of the system components approaches its performance limit (80%).
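The 80% check the process manager performs can be sketched as a simple filter over utilisation readings. The component names and values are hypothetical, and in practice the readings would come from your monitoring system rather than a hard-coded dictionary.

```python
LIMIT = 0.80  # the 80% performance limit mentioned above


def components_near_limit(
    utilization: dict[str, float], limit: float = LIMIT
) -> list[str]:
    """Return components whose utilisation (0.0-1.0) has reached the limit."""
    return sorted(name for name, used in utilization.items() if used >= limit)


if __name__ == "__main__":
    # Hypothetical readings taken at the peak moment.
    readings = {"db": 0.85, "api": 0.40, "queue": 0.80}
    for name in components_near_limit(readings):
        print(f"ALERT: {name} is at or above {LIMIT:.0%} of its capacity")
```

Automating the check does not replace the process manager; it only ensures the alert reaches them before a component is saturated.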


As we work remotely, when Day X comes it is crucial that the whole team stays in touch not only through work messengers like Slack, but also through emergency communication channels, such as Telegram and personal phone calls.


Peak days like Black Friday, Cyber Monday, and Christmas are active phases that you should be ready for. More traffic means more value for your business, which usually translates into increased revenue. So do not hesitate to invest in scaling, adhere to international principles, and implement modern services. Keep an eye on new solutions that meet the needs of your business today, and scale in a way that benefits your service!