paint-brush
Prepare your distributed system, so you don’t lose money, when it failsby@bcosta10

Prepare your distributed system, so you don’t lose money, when it fails

by Bruno CostaApril 27th, 2017
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

The first thing you need to learn when you release a distributed <a href="https://hackernoon.com/tagged/system" target="_blank">system</a> into production is this: it’ll eventually fail! You don’t believe me, huh? Well, I don’t blame you.

Company Mentioned

Mention Thumbnail
featured image - Prepare your distributed system, so you don’t lose money, when it fails
Bruno Costa HackerNoon profile picture

The first thing you need to learn when you release a distributed system into production is this: it’ll eventually fail! You don’t believe me, huh? Well, I don’t blame you.

We usually forget that we are trusting on the network, the number one reason for serious system failure, because we don’t know or we don’t remember the eight fallacies of distributed computing:

  • The network is reliable
  • Latency is zero
  • Bandwidth is infinite
  • The network is secure
  • Topology doesn’t change
  • There is one administrator
  • Transport cost is zero
  • The network is homogeneous

Lets say that you are running an e-commerce business, it’s the most important Friday of the year. Yep, that’s it, Black Friday. You have 20.000 users on your website. Your system allows the user to add to cart the items and it will validate the stock later on the checkout process. You have 300 users on the checkout process and you know that a slow experience on the checkout process can have a huge impact on the conversion rate. Your Inventory Service is having 30.000 requests per minute and it is slowing down your checkout process. You have an average order value of $400, so you can lose up to $120.000 if you lose the users that are at the checkout process.

How can you prevent from losing the users if the Inventory Service goes down? Timeouts and Circuit break are useful techniques in this situation. Just remember that it’s not just a technical decision and you should ask the business what should happen in this situation. Should we prevent users from ordering an item without validating the item’s availability or should we allow them and check the availability later in the process?

Repeat after me, it-will-fail. And Murphy will make sure it happens in the most inappropriate moment. The best thing is to prepare yourself for when it fails.

Originally published at thrownewexception.com on February 16, 2016.

Hacker Noon is how hackers start their afternoons. We’re a part of the @AMIfamily. We are now accepting submissions and happy to discuss advertising &sponsorship opportunities.

To learn more, read our about page, like/message us on Facebook, or simply, tweet/DM @HackerNoon.

If you enjoyed this story, we recommend reading our latest tech stories and trending tech stories. Until next time, don’t take the realities of the world for granted!