There’s a well-known website in China that almost every Chinese person uses, yet most of us use it only once a year: 12306.com, the railway ticket booking system. When the website opens ticket sales for Chinese New Year, many people wait in front of their computers, ready to click the button at the exact moment the sale starts.

Then the website crashes and users complain. (And then software engineers reflect on how to fix it.)

What is a flash sale system

The example above is a typical flash sale system. It usually has the following characteristics:

- A large number of users come to the system at the same time, causing a spike in traffic. (Ticket sales start at 6am and finish at 6:30am.)
- The number of order requests is much larger than the inventory size. (1 million users competing for 10,000 tickets.)

What are the challenges

- Handle the read load. Users will constantly refresh the page to check the available inventory. It’s also likely that all users will request other similar resources at the same time, such as account details and item descriptions.
- Provide high throughput. There will be many concurrent reads and writes against the limited inventory records, so database locking may cause some requests to time out.
- Avoid overselling or underselling the inventory. How should we track the available inventory number? An item may be oversold if multiple requests reserve the same inventory. An item may be undersold if inventory is held but the process fails and the inventory is never released. This has happened to some e-commerce websites, which usually end up calling the customers to arrange a refund.
- Prevent cheating with scripts. A user could write a script that fires requests in a loop, generating many repeated requests.

Design considerations

- Limit requests from upstream: we have limited inventory, so even if 1 million users come, we can only serve 1,000 of them.
Can we reduce the number of unnecessary requests from users?
- Split the traffic: the root cause is that we ask all users to come at the same time, which creates a spike. Can we divide the sale into different time slots and invite users to different slots?
- Asynchronous processing: do we need to process each request immediately? Or can we accept the request, process it at a pace our system can support, and notify the user within an acceptable time?
- Cache, cache, cache: there are many repeated reads of resources that never or seldom change. Can we keep them in a cache to relieve the load on the database?
- Understand the tradeoffs and make sacrifices: do we really need to show the real-time inventory? If we only run this flash sale once a year, does it make sense to dump so many resources into it?
- Database design: different schema designs for the same problem have different tradeoffs.
- Scalability: are we able to scale horizontally to support more users?

Optimisation by layers

Client

When users do not get an immediate response after submitting a request, they tend to click the button again and again. Those requests are duplicates and create unnecessary load for the system. (Imagine every user resubmits 9 more times: 90% of the requests the system receives are then redundant.) What we can do is:

- disable the submit button once the request is submitted
- only send the request if no duplicate request was sent in the last minute

This removes most of the unnecessary requests on the client side. However, it does not prevent a user from writing a script that bypasses the client-side restrictions, so we need to do more.

Gateway

When a request reaches the gateway, we keep track of the number of requests made by the same user. A simple and effective solution is to keep a counter for each user ID in memory. We can drop the request, or return the same response, if the same request was made recently by this user. However, what if this user creates thousands of user accounts?
We will handle that in the service layer.

Services

Far fewer requests reach the service layer now. Let’s start processing them.

1. Managing available inventory

The first thing we need to do is check the available inventory. How do we get the remaining inventory size? If we query the database with an aggregation like COUNT, we may need a serialisable isolation level for the transaction, which hurts performance. If we do not apply this isolation level, we may get a phantom read, believe that we still have enough inventory for this request, and then sell more than we have.

To handle this, we can use an atomic counter. Redis is a good choice: it can atomically decrement the counter whenever a request needs to be processed.

2. Buffer with message queue

Processing a transaction can be quite complex and heavy; for example, it may automatically apply a promotion based on the user’s loyalty points. So let’s use a message queue to buffer the requests and let the service process them at its own pace.

Database

By now, the number of requests that reach the database has been reduced to an acceptable level. We can focus on designing a better schema, optimising queries and maybe sharding the table.

Discussion

So can I now copy this design directly into my project? NO. Firstly, it depends on your requirements. Secondly, you may not need to spend this much effort; there might be better ways.

Conclusion

This post is a brief case study of flash sale system design. It provides a general guideline, and possibly inspiration, for people who are solving similar problems.
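
Sketch: per-user counter at the gateway

To make the Gateway idea above more concrete, here is a minimal Python sketch of an in-memory counter per user ID. The class name `PerUserLimiter`, its parameters `max_requests` and `window_seconds`, and the fixed-window policy are all assumptions made for illustration; the post itself only says "keep a counter for each user ID in memory". The sketch is single-threaded and never evicts old entries, both of which a real gateway would probably need to address.

```python
import time
from typing import Callable, Dict, Tuple


class PerUserLimiter:
    """Fixed-window request counter keyed by user ID (hypothetical helper)."""

    def __init__(
        self,
        max_requests: int = 1,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        # user_id -> (window start time, requests accepted in this window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, user_id: str) -> bool:
        """Return True if the request may pass, False if it should be dropped."""
        now = self._clock()
        start, count = self._windows.get(user_id, (now, 0))
        if now - start >= self._window:
            # The previous window has expired: start a fresh one.
            start, count = now, 0
        if count >= self._max:
            return False
        self._windows[user_id] = (start, count + 1)
        return True


if __name__ == "__main__":
    limiter = PerUserLimiter(max_requests=1, window_seconds=60.0)
    print(limiter.allow("alice"))  # first request in the window
    print(limiter.allow("alice"))  # repeated request in the same window
```

Instead of dropping the repeated request, the gateway could also return the response it already gave, as the post suggests; that would mean storing the response next to the counter.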
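
Sketch: atomic inventory counter

The atomic counter from the Services section can be illustrated in the same way. The sketch below uses an in-process lock as a stand-in for Redis so that it runs without a server; with Redis, the reserve step would roughly correspond to an atomic `DECR` on an inventory key and the release step to an `INCR`. The names `InventoryCounter`, `try_reserve`, `release` and `simulate` are hypothetical and chosen only for this example.

```python
import threading


class InventoryCounter:
    """In-process stand-in for an atomic counter such as a Redis key.

    Hypothetical helper for illustration only.
    """

    def __init__(self, stock: int) -> None:
        self._remaining = stock
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        """Atomically take one unit; return False once stock is exhausted."""
        with self._lock:
            if self._remaining <= 0:
                return False  # refusing here is what prevents overselling
            self._remaining -= 1
            return True

    def release(self) -> None:
        """Give one unit back, e.g. when a later step fails (avoids underselling)."""
        with self._lock:
            self._remaining += 1

    @property
    def remaining(self) -> int:
        with self._lock:
            return self._remaining


def simulate(stock: int, buyers: int) -> int:
    """Run `buyers` concurrent attempts and return how many succeeded."""
    counter = InventoryCounter(stock)
    results = []
    results_lock = threading.Lock()

    def attempt() -> None:
        ok = counter.try_reserve()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=attempt) for _ in range(buyers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)


if __name__ == "__main__":
    # 200 buyers compete for 10 units; at most 10 reservations can succeed.
    print(simulate(stock=10, buyers=200))
```

Because the check and the decrement happen under one lock, two requests can never reserve the same unit. Calling `release` when a later step fails is one way to address the underselling case listed in the challenges.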