Surviving downtime and creating critical moment experiences

When was the last time your company's product had downtime? How did it affect your customers?

This is a question that dev teams usually turn a blind eye to. We expect the app to always be up, and when it goes down, we react. This reactive reasoning is fair to us developers, but it comes at the cost of the user's experience.

If you are new to service and application development, you might be thinking: what else can I do to keep my uptime high? Let me introduce the concept of Resiliency.

Resiliency is commonly defined as the capacity to recover quickly from failures, leaning towards elasticity. In this article, I'll discuss client-side and network-related resilience and how to improve your current stack.

I will be using the awesome axios library (mzabriskie/axios on github.com, a Promise based HTTP client for the browser and node.js) to illustrate the examples below.

```javascript
const axios = require('axios');
```

Timeout

Networks are unpredictable beasts. We cannot predict when and how connections will drop. All we can do is prepare for it.

What happens when my app can't reach an API? How about a slow response?

Usually, developers leave this alone since we expect the user to always be connected to a fast network. This is a dangerous assumption to make, especially when we do not know who our users are.

To prepare for this, always add a timeout to your requests:

```javascript
async function MakeRequest() {
  try {
    // Give up if the server has not responded within 5 seconds.
    await axios.get('/slow', { timeout: 5000 });
  } catch (err) {
    // ...
  }
}
```

This ensures that users won't have to wait a long time for your app to respond. You can discuss with your UX designer the various ways to accommodate this scenario. My favourite is to implement…

Retry

What happens when my request fails or times out?

Client and network errors are abundant, and no developer should ignore that fact. There are a lot of scenarios where requests fail, and we have to think about how the app should react.
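One way to reason about how the app should react is to classify the failure first. The sketch below is only an illustration: `classifyError` is a hypothetical helper, not part of axios, and the policy it encodes (treat timeouts, network errors, and 5xx responses as worth another attempt, but not 4xx responses) is an assumption you should adapt to your own API. It relies on the `response`, `request`, and `code` properties that axios attaches to its errors, and on axios reporting timeouts with the code `ECONNABORTED` by default.

```javascript
// Hypothetical helper: inspect an axios-style error and decide how to react.
// The retry policy here is an assumption for illustration, not an axios feature.
function classifyError(err) {
  // By default axios reports a request timeout with code 'ECONNABORTED'.
  if (err.code === 'ECONNABORTED') {
    return { kind: 'timeout', retryable: true };
  }
  // The server answered, but with a status outside the 2xx range.
  if (err.response) {
    const status = err.response.status;
    // Assumption: server-side errors may be transient, client errors are not.
    return { kind: 'http', retryable: status >= 500 };
  }
  // The request was sent but no response came back (e.g. the connection dropped).
  if (err.request) {
    return { kind: 'network', retryable: true };
  }
  // Something went wrong before the request was even sent.
  return { kind: 'setup', retryable: false };
}
```

Inside a `catch (err)` block you could call `classifyError(err)` and only try again when `retryable` is true, so that a 404 or a validation error fails fast instead of being repeated.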
A good strategy is to implement retries. A common threshold is to retry 3 times before actually failing.

```javascript
async function MakeRequest(retry = 0) {
  try {
    await axios.get('/failing', { timeout: 5000 });
  } catch (err) {
    // Compare the retry counter (not the error object) against the threshold.
    if (retry < 3) {
      await MakeRequest(retry + 1);
    } else {
      // ...
    }
  }
}
```

This ensures that the app is given enough attempts to reach an endpoint that failed once. Work with your UX designer on this scenario to determine how to handle the interim requests being retried and the final, proper failure.

Fallback

What if my request fails? What should I show to the user?

Inevitably, downtime will occur. Just because you are using AWS doesn't mean your app won't fail. Everything will fail eventually, and you have to have a fallback.

```javascript
async function MakeRequest(retry = 0, fallback = false) {
  try {
    const url = fallback === false ? '/failing' : '/fallback';
    await axios.get(url, { timeout: 5000 });
  } catch (err) {
    if (retry < 3) {
      // Retry the same URL (primary or fallback) up to 3 times.
      await MakeRequest(retry + 1, fallback);
    } else {
      if (fallback === false) {
        // The primary endpoint is exhausted: start over against the fallback.
        await MakeRequest(0, true);
      } else {
        // ...
      }
    }
  }
}
```

This ensures that the app receives something in the event of a failed request. It doesn't disrupt the experience of the user, since from their perspective it does not necessarily look like a failure.

Logging

How do I know which request/screen/API is failing?

This has to be one of the basics that must be covered before writing an app. Ensure that proper monitoring is available in your development and production environments. This covers both client side and server side.

Enforce centralized logging to your monitoring tools from your server side. Capture all handled exceptions and send them to a searchable log. This will greatly help your debugging efforts when worst comes to worst and failures happen.

Implement client-side error handling as well, and if possible, send those errors to a separate bucket in your monitoring tool. It'll help you find previously unpredictable points of failure.

Circuit Breaker

What do I do if a service constantly fails?
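As a rough picture of one possible answer, here is a minimal client-side circuit breaker sketch. Everything in it is an assumption made for illustration: the `CircuitBreaker` class, the `failureThreshold` and `resetTimeoutMs` options, and the injectable `now` clock are hypothetical names, and this is neither how Hystrix is implemented nor a production-ready design. The idea it shows is simple: after a number of consecutive failures, stop calling the failing service for a while and go straight to the fallback, then let a trial request through once the cool-down has passed.

```javascript
// Hypothetical minimal circuit breaker; names and defaults are illustrative only.
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.resetTimeoutMs = resetTimeoutMs;     // how long to stay open before a trial
    this.now = now;                           // injectable clock, handy for testing
    this.failures = 0;
    this.openedAt = null;                     // null means the circuit is closed
  }

  // True when a request may go out: the circuit is closed,
  // or it has been open long enough to allow a trial request.
  canRequest() {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.resetTimeoutMs;
  }

  // A success closes the circuit and clears the failure count.
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failure at or past the threshold (re)opens the circuit from now.
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = this.now();
    }
  }

  // Run `request` through the breaker; use `fallback` when the circuit
  // is open or when the request itself fails.
  async call(request, fallback) {
    if (!this.canRequest()) return fallback();
    try {
      const result = await request();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      return fallback(err);
    }
  }
}
```

With axios this could be used as `breaker.call(() => axios.get('/failing', { timeout: 5000 }), () => axios.get('/fallback', { timeout: 5000 }))`. One simplification to be aware of: once the cool-down has passed, this sketch lets every caller through until one of them fails again, whereas a stricter breaker would allow only a single trial request.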
This point leans more towards the microservices approach, which, in my opinion, you should avoid in most cases (ask me in the comments). If you are already in this situation, why not let our friend Martin Fowler explain it: see "bliki: CircuitBreaker" on martinfowler.com ("It's common for software systems to make remote calls to software…").

In essence, your app needs to be able to select performant service nodes over underperforming ones and protect your service when those service nodes are down.

One notable benefit is the smart handling of usable resources: you avoid spending CPU and memory on calls that are likely to fail, so you can allot them somewhere else.

A combination of fallback, retries, and logging is essential to making this work. Please read Martin Fowler's take on it, as he is the best to explain. You can also check the Netflix/Hystrix library for reference, along with "Making the Netflix API More Resilient" on medium.com ("Maintaining high availability and resiliency for a system that handles a billion requests a day.").

Conclusion

In the end, all of this is a collaboration between developers, architects, and UX to deliver a great experience to the user. This is always the ultimate goal. Always remember that UX does not end with the design; it's also how the users experience your application through different scenarios, be it failures, slowness, or even outages.

I might not have covered every nook and cranny, but I hope this gives you a push in the right direction when architecting your next stack. Good luck!

Read more about resilience on the Netflix blogs; they have great stuff over there. Don't forget to check out Hystrix and Chaos Monkey too.
Introducing Hystrix for Resilience Engineering (medium.com): "In a distributed environment, failure of any given service is inevitable."

Chaos Engineering Upgraded (medium.com): "Chaos Kong is the most destructive Chaos Monkey yet"

The Netflix Simian Army (medium.com): "Keeping our cloud safe, secure, and highly available"

If you like this stuff and you're in Singapore, come join us (foxcareers.com, Fox Careers):

Digital Product Manager - Subscriptions, Payments & Partnerships

Video Development & Operations Manager, Fox+

Let’s talk about Resilience
