One aspect of application development that is often overlooked, especially by beginner developers, is application resilience. Tutorials often focus on the happy path of execution and omit the details of the errors that can occur.

Example

Consider the following, somewhat simplified, ASP.NET MVC example:

    [HttpPost]
    public async Task<ActionResult<ResponseModel>>
     CreateOrderAsync(OrderModel order)
    {
        var cart = await _cartService.GetCartItemsAsync(UserId);
        if (cart.Items.Count == 0)
        {
            return new ResponseModel
            {
                // ... omitted for brevity ...
            };
        }

        var orderEntries = cart.Items.Select(c => c.ToDbModel(UserId));
        // Named newOrder so it does not clash with the "order" parameter
        var newOrder = new Order
        {
            UserId = UserId,
            DatePlaced = DateTime.UtcNow,
            Entries = orderEntries,
            CartIdempotencyToken = cart.IdempotencyToken
        };

        _context.Orders.Add(newOrder);
        await _context.SaveChangesAsync();

        // User should no longer have the items in their cart
        // after they've placed an order
        await _cartService.EmptyAsync(UserId, cart.IdempotencyToken);

        return new ResponseModel
        {
            // ... omitted for brevity ...
        };
    }

Apart from the absence of some obvious error handling (what happens if the user's cart can't be found?), the code looks decent enough at first glance. It retrieves the entities from the CartService, maps them to database entities and stores them as part of an Order entity.

I've tested it, it works!

Sure enough, the code is algorithmically correct - it does exactly what you've asked it to do. You have tested it with various inputs and came to the conclusion that no matter what data you give it, the processing will be done correctly. So what's the problem?

Async - state, trapped in time

The request/response model and async/await make the code look linear. It's pretty obvious where the data is coming from and where it goes, which is certainly convenient. However, if we don't look into the nature of asynchronous processing, it is very easy to miss an important detail: asynchronous processing is stretched in time and usually involves 3rd party resources that can fail at any point. A resource that was available just a moment ago, when we were executing the top of the function, may very well be down by the time we reach the bottom.

This service is not alone in the world - in this case it interacts with the CartService (which may make calls to a microservice over the network) and the database. The author of this code example was clearly focused on the ideal condition where both of them are always available and never return errors. Reality is a lot more complicated than that: there may be network problems that make the service unreachable, or the service may straight-up refuse to process the request correctly due to issues of its own (remember the fallacies of distributed computing?).
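To make the "stretched in time" point concrete, here is a minimal, self-contained sketch of a partial failure. It does not use the real CartService or a database: FlakyCartService, PartialFailureDemo and the in-memory list standing in for the database are all hypothetical names invented for this illustration, and the "outage" is simulated by letting the fake service answer only its first call.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a remote cart service (not from the original example).
// It answers the first `healthyCalls` requests, then behaves as if it went down.
public class FlakyCartService
{
    private readonly List<string> _items = new List<string> { "book", "pen" };
    private int _healthyCallsLeft;

    public FlakyCartService(int healthyCalls) => _healthyCallsLeft = healthyCalls;

    public int ItemCount => _items.Count;

    public Task<string[]> GetCartItemsAsync()
    {
        EnsureAvailable();
        return Task.FromResult(_items.ToArray());
    }

    public Task EmptyAsync()
    {
        EnsureAvailable();
        _items.Clear();
        return Task.CompletedTask;
    }

    private void EnsureAvailable()
    {
        if (_healthyCallsLeft-- <= 0)
            throw new InvalidOperationException("Cart service is unavailable");
    }
}

public static class PartialFailureDemo
{
    // Same three steps as the endpoint: read the cart, save the order, empty the cart.
    public static async Task<(bool OrderSaved, bool CartEmptied)> PlaceOrderAsync(
        FlakyCartService cart, List<string> savedOrders)
    {
        var items = await cart.GetCartItemsAsync();
        savedOrders.AddRange(items);          // the in-memory "database" write succeeds

        try
        {
            await cart.EmptyAsync();          // the service may be down by now
            return (true, true);
        }
        catch (InvalidOperationException)
        {
            return (true, false);             // order stored, cart still full
        }
    }

    public static async Task Main()
    {
        // The service survives exactly one call, so it fails between step 1 and step 3.
        var cart = new FlakyCartService(healthyCalls: 1);
        var orders = new List<string>();

        var outcome = await PlaceOrderAsync(cart, orders);

        Console.WriteLine($"Order saved: {outcome.OrderSaved}, cart emptied: {outcome.CartEmptied}");
        Console.WriteLine($"Items still in cart: {cart.ItemCount}");
    }
}
```

Run as written, the order ends up stored while the cart still holds both items. Every individual step was "correct", yet the overall state is inconsistent, which is exactly the kind of outcome that testing only the happy path never reveals.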
Although the result of a happy path is correct, we haven't even thought about a plethora of potential issues:

- What is the time requirement for this endpoint? Could it be that after a certain period of time it's better to give up on processing the request and return an error asking the client to try again later (timeout)?
- What happens if the cart data request fails? Is it safe for us to retry it? How many times? What kind of retry intervals are safe to use without overwhelming the upstream service?
- What if the database store operation fails? What kind of response should the user get? Are we able to retry it too (for example, if it failed due to a network problem)?
- What if the cart cleaning operation fails? Is it essential to clear the cart, or can we keep it in the worst-case scenario? Can we retry it?

Ok, it's complicated, is there a better way?

Sure is! Polly comes to the rescue! Polly is a resilience and transient-fault-handling library that allows us to express policies that help deal with various issues. With Polly, it becomes very easy to describe retries, timeouts, caching, and many more policies or their combinations.

Building and using policies

One thing that you should decide right away is whether your policy is going to be asynchronous or synchronous, because depending on your choice of policy builder method you will get back either a Policy or an AsyncPolicy instance, and using these two together can be quite challenging.

Usually, though, you'll be making asynchronous calls through your policies, so let's use that as an example.

    // Let's build our simple timeout policy
    // This policy will timeout after 3 seconds
    var timeoutPolicy = Policy.TimeoutAsync(3);

    // Note that this also supports optimistic cancellation
    var res = await timeoutPolicy
        .ExecuteAsync(ct => TestAsync(ct), CancellationToken.None);

Ok, so what is going on in this example? We build a policy that specifies a timeout rule, and on the next line we use that policy to call an asynchronous method called TestAsync(...).

This method also supports optimistic cancellation (we explicitly notify it when it's time to stop through the CancellationToken) and we are making use of that. AsyncPolicy.ExecuteAsync has an overload that gives us access to an internal CancellationToken of the policy, and we can pass that to our method to achieve the desired result.

However, notice how I've passed CancellationToken.None as a second parameter? That's right, if you wish, Polly allows you to also use your own CancellationToken that will be linked to the internal one to terminate the execution even sooner. Pretty awesome!

Basic retries

As discussed earlier, Polly supports a lot of things out of the box, but for now, let's focus on the most basic example - retries with exponential backoff. From the official Polly wiki:

    // Retry a specified number of times, using a function to
    // calculate the duration to wait between retries based on
    // the current retry attempt (allows for exponential backoff)
    // In this case will wait for
    //  2 ^ 1 = 2 seconds then
    //  2 ^ 2 = 4 seconds then
    //  2 ^ 3 = 8 seconds then
    //  2 ^ 4 = 16 seconds then
    //  2 ^ 5 = 32 seconds
    Policy
      .Handle<SomeExceptionType>()
      .WaitAndRetryAsync(5, retryAttempt =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))
      );

In this example, we can see a policy that will retry your code at most 5 times, each time increasing the delay between calls. This is very useful in situations when you don't want to overwhelm the upstream servers with retries. The exponential backoff mechanism allows your system to balance out and find a suitable rate of calls to upstream servers, even if they are experiencing temporary problems or load spikes.

Do note that it is beneficial to introduce some randomness (jitter) into the retry policy so that retries from many clients don't all happen at the same time. This also partially helps to reduce the possibility that your service will cause a denial of service for the upstream server. More on that in the next chapter.

Take extra care when retrying calls to services with side effects (e.g. sending emails to users through an SMTP service), because an exception does not always mean that the action was not executed by the service, and you may unintentionally execute it multiple times while retrying a failed call.

Circuit breaker

Sometimes, if the rate of failures is too high, it's probably a good idea to give the upstream servers some time to recover while partially degrading the functionality of your own application.

Imagine this scenario - we have a factory that makes car engines. These travel through various assembly steps on a conveyor belt and are then checked at the end to ensure quality. If the manufacturing yield is high enough (say only 1 in 10000 engines is defective), just removing the defective part is a good enough solution.
On the other hand, if the failure rate is above 30%, something is definitely wrong and it's worth stopping the whole conveyor for inspection.

With web services we can do exactly the same thing - if we see that the failure rate of our requests is too high, maybe it's not worth making a request at all? Let's give our upstream servers some time to deal with whatever issue they are having while degrading our application a little bit. It may not be suitable for all scenarios, but if the call provides non-critical functionality (say, recommendations for a purchased product in an e-shop), it is useful to temporarily disable that feature while showing the users a pop-up explaining that the service is experiencing some temporary high load.

A circuit breaker policy does exactly that - it allows us to temporarily stop making upstream calls when the failure rate is above a certain threshold, or when a certain number of consecutive exceptions of a specified type occur.

    var policy = Policy
      .Handle<HttpRequestException>()
      .CircuitBreaker(
        exceptionsAllowedBeforeBreaking: 2,
        durationOfBreak: TimeSpan.FromMinutes(1)
      );

In this example, if two consecutive calls through this policy throw an exception of type HttpRequestException, the circuit will break and stay broken for 1 minute, meaning that any call made through this policy in that interval will throw a BrokenCircuitException.

It's up to the application developer to properly handle this exception and possibly return some kind of meaningful message to the client describing what exactly has happened.

Policy wrapping

I won't be explaining all of the policy variants available, but by this time I hope you have already seen how powerful these are. But wait, there is more! You can wrap one policy with another to achieve even more complex behavior. Consider this example with two separate policies:

    // Timeout policy,
    // requests cancellation when the execution time exceeds
    // a specified amount.
    var timeoutPolicy = Policy.TimeoutAsync(3);

    // Fallback policy
    // If an exception of a specified type occurs
    // during method execution,
    // it will return a predefined result
    var fallbackPolicy = Policy<string>
        .Handle<SomeException>()
        .Or<OperationCanceledException>()
        // Polly's timeout policy reports a timeout by throwing
        // TimeoutRejectedException (namespace Polly.Timeout)
        .Or<TimeoutRejectedException>()
        .FallbackAsync("Fallback result");

The first one will just cancel the execution of a method as discussed earlier, while the other one is a bit more interesting - if the method throws an exception of type SomeException, OperationCanceledException or TimeoutRejectedException, it will return the predefined "Fallback result" instead. The last type matters because that is the exception Polly's timeout policy throws once the time limit is exceeded.

But what if we could combine these two? Can we do that? Easy!

    var combined = fallbackPolicy.WrapAsync(timeoutPolicy);
    var result = await combined.ExecuteAsync(...);

And that's it, now we have a policy that will either return the original result of a method, or a fallback result if the operation times out or throws SomeException.

The order of the wraps affects the behavior, so pay close attention to it, because getting it wrong may give you unexpected results. In the example above, fallbackPolicy is the outer one (it operates on the results returned or exceptions thrown by the timeoutPolicy), and the timeoutPolicy operates on the results of the method passed to .ExecuteAsync(...).

Integrations with HttpClient

With the Microsoft.Extensions.Http.Polly package installed you can call the .AddPolicyHandler(...) method on your IHttpClientBuilder to handle some trivial cases, like responses with 5XX or 408 status codes, and retry with a chosen strategy.

This lifts the concern from the layers that use these HttpClient instances. The Polly.Extensions.Http package even provides that behavior out of the box with HttpPolicyExtensions.HandleTransientHttpError().

Consider this example code:

    public void ConfigureServices(IServiceCollection services)
    {
        ...
        // You may want to do some additional configuration
        // on your http clients (like base address)
        services.AddHttpClient<T>(client => {...})
            .AddPolicyHandler(GetHttpRetryPolicy());
        ...
    }

    private static IAsyncPolicy<HttpResponseMessage> GetHttpRetryPolicy()
    {
        return HttpPolicyExtensions.HandleTransientHttpError()
            .RetryAsync(3);
    }

In this example, we are using a function provided for us by Polly.Extensions.Http to get a policy with the default behavior that handles various HTTP status codes and simply retries the operation.

Do note, however, that for the .AddPolicyHandler() extension to work you'll need to configure a typed or named HTTP client (the parameterless implementation of .AddHttpClient() just returns IServiceCollection). More information about named and typed clients can be found in the official Microsoft documentation.

Some tools have built-in resilience mechanisms

Polly is not the only way to get resilience - if you look closely into some of the tools you are using already, you might discover that they also provide resilience mechanisms.

One notable example of such a mechanism can be found in EF Core when using MS SQL Server. Let's take a look at the configuration:

    // Startup.cs from any ASP.NET Core Web API
    public class Startup
    {
        // Other code ...
        public IServiceProvider ConfigureServices(IServiceCollection services)
        {
            // ...
            services.AddDbContext<CatalogContext>(options =>
            {
                options.UseSqlServer(Configuration["ConnectionString"],
                sqlServerOptionsAction: sqlOptions =>
                {
                    sqlOptions.EnableRetryOnFailure(
                    maxRetryCount: 10,
                    maxRetryDelay: TimeSpan.FromSeconds(30),
                    errorNumbersToAdd: null);
                });
            });
            //...
        }
    }

In this example, the connection will be reattempted no more than 10 times, with a maximum delay of 30 seconds between attempts. errorNumbersToAdd specifies additional SQL Server error codes that will be handled by this retry policy.

How we use it

To simplify the reuse of common policies, we at TeleSoftas use a policy registry. It is exactly what it sounds like - a registry of policies that you can address by a unique (e.g. string) key. It is very convenient to register the policy registry inside the DI container at the start of the application, add policies, and then resolve the needed ones later through the IPolicyRegistry<TKey> interface.

Adding a registry

To add the registry to your DI container, simply call services.AddPolicyRegistry(); in your Startup.ConfigureServices(...).

Registering policies

To register a policy, simply call the policyRegistry.Add(TKey key, TPolicy policy) method. The example below shows how to register a policy that we have already seen - a retry policy that handles transient HTTP errors. This configuration introduces a 2-second delay between retries and retries at most 3 times.

    policyRegistry.Add(PolicyNameConstants.Transient,
        HttpPolicyExtensions
            .HandleTransientHttpError()
            .WaitAndRetryAsync(3, (i) => TimeSpan.FromSeconds(2)));

Retrieving and using the policy

Simply inject your policy registry into the service where you need it and request the needed policy by its key, like so:

    public class MyService
    {
        private readonly IAsyncPolicy<HttpResponseMessage> _policy;

        public MyService(IReadOnlyPolicyRegistry<string> policyRegistry, ...)
        {
            // The policy was registered for HttpResponseMessage results,
            // so it has to be retrieved with the same type
            _policy = policyRegistry
                .Get<IAsyncPolicy<HttpResponseMessage>>(PolicyNameConstants.Transient);
            ...
        }

        // Rest is omitted

        public async Task<HttpResponseMessage> DoStuffAsync()
        {
            var response = await _policy.ExecuteAsync(async () =>
            {
                // Do the work
            });

            // Do something else
            return response;
        }
    }

Final thoughts

First and foremost - get to know your tools. Some of them already provide near-effortless ways to improve your app's stability. Adding a retry policy takes only a few lines of code, but it might save you a lot of headaches.

Second - make sure to focus not only on the happy path of execution but also to carefully plan the failure and graceful degradation strategies, especially when dealing with external services.

I hope that with this brief introduction to resilience policies I've persuaded you to go through your code and identify spots for potential improvement, where errors are simply dismissed and not handled properly.

Further reading

- A great article about resilient applications from Microsoft
- Polly official wiki

Don't Let Your .NET Applications Fail: Resiliency with Polly
