Working at a company that tries very hard to get things done the right way, it is sad to sometimes find code like the examples in this post in our code base. We are actively trying to correct the following problems, so I am sharing this for the new people (there are always new people) starting out with concurrency. Even though these examples are Java based, the same ideas apply to the programming language of your choice.
The following code examples only illustrate the problems in question; our production code has similar general problems.
Let’s suppose we want to download the content from a URL. We could simply use the following function for it. This code runs synchronously.
This function could be replaced by any other synchronous operation, such as accessing a database or saving something to the file system. Building on it, we sadly found the following in our code base.
As we can see, this is indeed a very bad way to download content. It does absolutely nothing asynchronously. It is even worse when we look at the signature of the function getContenAsyncBadWay, since everything about it indicates that it runs asynchronously.
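A minimal sketch of what such a function might look like, assuming a simulated getContent that sleeps instead of touching the network. The class name BadAsyncDemo, the sleep duration, and the returned string are illustrative assumptions, and the body of getContenAsyncBadWay is only one plausible form of the mistake:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

class BadAsyncDemo {

    // Hypothetical stand-in for a blocking download: sleeps instead of doing network I/O.
    static String getContent(String url) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "content of " + url;
    }

    // Anti-pattern: getContent runs on the calling thread, and only its
    // already computed result is wrapped in a completed stage.
    static CompletionStage<String> getContenAsyncBadWay(String url) {
        return CompletableFuture.completedFuture(getContent(url));
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        CompletionStage<String> stage = getContenAsyncBadWay("https://example.com");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // The caller paid for the whole simulated download before getting the stage back.
        System.out.println("call returned after ~" + elapsedMs + " ms, done="
                + stage.toCompletableFuture().isDone());
    }
}
```

The return type promises asynchrony, but by the time the caller receives the stage the work has already been done on the caller's own thread.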
We can verify that this is running serially by executing the following code.
The output will be.
As we can see, everything happens sequentially, even though the code suggests otherwise.
We are trying to train our people not to do this and to write the following instead.
We are trying hard to avoid this kind of silly mistake.
Notice that we are wrapping the .getContent call inside supplyAsync. We can test it again by executing code similar to the previous test.
This correctly outputs what we expect.
Please take note of the order of the output, which indicates that everything runs as expected.
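The supplyAsync wrapping described above can be sketched as follows. This is an illustration under assumptions, not the production code: the class name GoodAsyncDemo, the method name getContentAsync, and the simulated getContent (a sleep instead of a real download) are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

class GoodAsyncDemo {

    // Hypothetical stand-in for a blocking download: sleeps instead of doing network I/O.
    static String getContent(String url) {
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "content of " + url;
    }

    // The blocking call is handed to supplyAsync, so it runs on another thread
    // and the caller gets a not yet completed stage back right away.
    static CompletionStage<String> getContentAsync(String url) {
        return CompletableFuture.supplyAsync(() -> getContent(url));
    }

    public static void main(String[] args) {
        CompletionStage<String> stage = getContentAsync("https://example.com");
        System.out.println("call returned, done=" + stage.toCompletableFuture().isDone());
        // Blocking here is only so the demo does not exit before the result arrives.
        System.out.println(stage.toCompletableFuture().join());
    }
}
```

With no executor argument, supplyAsync uses the JDK's default asynchronous executor; a real service would probably pass a dedicated executor for blocking I/O.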
Early Blocking and Chaining
Another problem we found in our code base is that we block on async operations in order to access their results, which in most cases is completely unnecessary.
The following example shows the same problem we have.
This is extremely bad: you might think the operation is happening asynchronously, but it is no better than calling .getContent synchronously, since we block right after the call.
The result of this will be.
Again, this code is polluted with CompletionStage instances and is still very synchronous, even when using the right implementation.
We should, instead, do the following.
Notice how we are chaining the computational stages to .countWords and then to print the result.
Running this version will give us the following.
Once more, notice the output order, which indicates how things are running.
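As a rough sketch of the chaining idea, assuming a simulated getContent and a countWords that splits on whitespace (both are assumptions, as are the class name ChainingDemo and the method countWordsAsync), the stages can be composed with thenApply and thenAccept instead of blocking right after the call:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

class ChainingDemo {

    // Hypothetical stand-in for a blocking download with fixed content.
    static String getContent(String url) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "the quick brown fox";
    }

    static CompletionStage<String> getContentAsync(String url) {
        return CompletableFuture.supplyAsync(() -> getContent(url));
    }

    // Assumed behavior: words are separated by runs of whitespace.
    static int countWords(String content) {
        String trimmed = content.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // No join here: counting is registered as a continuation of the download.
    static CompletionStage<Integer> countWordsAsync(String url) {
        return getContentAsync(url).thenApply(ChainingDemo::countWords);
    }

    public static void main(String[] args) {
        CompletionStage<Void> printed = countWordsAsync("https://example.com")
                .thenAccept(n -> System.out.println("words: " + n));
        System.out.println("main keeps working");
        // Single blocking point, only so the demo does not exit early.
        printed.toCompletableFuture().join();
    }
}
```

The only join is at the very end of main; everything between the download and the print is expressed as continuations.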
For some reason, it is hard for some people to understand these concepts.
Sometimes you absolutely need the results
In order to continue with our examples, let’s add the following supporting functions.
Now, we can run a modified version of the previous example.
Here, even though we are .joining, we delay it until the very end, so other operations can be executed in the meantime, AsyncOps.sumAsync in this case. The output of this program will be.
Summing -1974445143 + -1585876417
However, we can still do better than this using the following version of the very same program.
In this case, the .forEach will happen after the values are computed, and the .join will wait for it. It might seem the same, but think about very large computed values. In the first version, nothing gets executed until they are printed (or processed). In this version, the processing of the computed values actually runs asynchronously.
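One way to express "process the values asynchronously, then join once" is sketched below. The sumAsync here is only a hypothetical stand-in for AsyncOps.sumAsync, and the sequence helper, the sumsAsync method, and the class name DeferredJoinDemo are assumptions introduced for illustration:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class DeferredJoinDemo {

    // Hypothetical stand-in for AsyncOps.sumAsync.
    static CompletableFuture<Integer> sumAsync(int a, int b) {
        return CompletableFuture.supplyAsync(() -> a + b);
    }

    // Turns a list of futures into one future of a list without blocking the caller.
    static <T> CompletableFuture<List<T>> sequence(List<CompletableFuture<T>> futures) {
        return CompletableFuture
                .allOf(futures.toArray(new CompletableFuture[0]))
                .thenApply(ignored -> futures.stream()
                        // Every future is complete at this point, so join returns immediately.
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }

    // Starts all sums first, then combines them into a single future.
    static CompletableFuture<List<Integer>> sumsAsync(int count) {
        List<CompletableFuture<Integer>> futures = IntStream.range(0, count)
                .mapToObj(i -> sumAsync(i, i))
                .collect(Collectors.toList());
        return sequence(futures);
    }

    public static void main(String[] args) {
        // The forEach is a continuation; it runs once all values are available.
        CompletableFuture<Void> printed =
                sumsAsync(5).thenAccept(sums -> sums.forEach(System.out::println));
        // The only blocking call, placed at the very end.
        printed.join();
    }
}
```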
Multiple Async Operations Running
In many cases, we need to run two or more async operations at the same time, and often they are independent of each other. Since our code base is polluted with .join, we are unable to take advantage of parallelism, when we should actually be writing something like the following.
It is important not to .join, because most of the time we want to run other operations after that. If those operations are synchronous, use the monadic operations offered by CompletionStage. If the operation is a different, independent computation, blocking on the first one prevents the second from running, which degrades performance quite a lot.
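A small sketch of two independent operations started together and combined without an intermediate join. The names ParallelOpsDemo, slowSquare, squareAsync, and sumOfSquaresAsync are hypothetical, and the sleep only simulates slow work:

```java
import java.util.concurrent.CompletableFuture;

class ParallelOpsDemo {

    // Hypothetical slow computation, simulated with a sleep.
    static int slowSquare(int x) {
        try {
            Thread.sleep(300);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x * x;
    }

    static CompletableFuture<Integer> squareAsync(int x) {
        return CompletableFuture.supplyAsync(() -> slowSquare(x));
    }

    static CompletableFuture<Integer> sumOfSquaresAsync(int a, int b) {
        // Both computations are started before either result is needed.
        CompletableFuture<Integer> left = squareAsync(a);
        CompletableFuture<Integer> right = squareAsync(b);
        // thenCombine runs the function once both stages have completed.
        return left.thenCombine(right, Integer::sum);
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int result = sumOfSquaresAsync(3, 4).join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + " after ~" + elapsedMs + " ms");
    }
}
```

If squareAsync(a) were joined before squareAsync(b) was even called, the two sleeps would run back to back instead of overlapping.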
Trying to write better concurrent code with Async/Await
This is a paradigm that C# and .NET have used for years. It reduces the time spent thinking about concurrency, since most of the code looks serial. At the same time, the code stays non-blocking, so in most cases it keeps the performance benefits of async code while making the sequence of operations clearer.
Let’s see an example that could help us understand how it works.
Suppose we want to run an async operation that does the following:
- Download content from URL
- Print the content
- Count how many times a character appears in the downloaded content.
- Print the counter
- Return the counter
Let’s first see how this can be done using the classic Java Async API.
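The steps listed above might look roughly like this with CompletableFuture. It is a sketch under assumptions: getContent is simulated and returns fixed text, and the names ClassicPipelineDemo and countCharAsync are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

class ClassicPipelineDemo {

    // Hypothetical stand-in for a blocking download; returns fixed text for any URL.
    static String getContent(String url) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "hello world";
    }

    static CompletionStage<Long> countCharAsync(String url, char target) {
        return CompletableFuture
                // 1. Download the content on another thread.
                .supplyAsync(() -> getContent(url))
                // 2. Print the content and pass it along unchanged.
                .thenApply(content -> {
                    System.out.println(content);
                    return content;
                })
                // 3. Count how many times the character appears.
                .thenApply(content -> content.chars().filter(c -> c == target).count())
                // 4. Print the counter, then 5. return it as the stage's value.
                .thenApply(count -> {
                    System.out.println(count);
                    return count;
                });
    }

    public static void main(String[] args) {
        CompletionStage<Long> counter = countCharAsync("https://example.com", 'o');
        System.out.println("waiting");
        // Blocking only so the demo does not exit before the pipeline finishes.
        counter.toCompletableFuture().join();
    }
}
```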
We might argue this is good enough, but there is another way, using async await, that beats this one.
A few things are worth noticing. First, this function is completely async; there is no blocking happening inside it. Second, the call to await returns control to the caller of getContentTotalSizeFor until the operation being awaited completes. This allows more work to be executed every time this function is waiting for its dependency to complete. Also, the code is quite simple to understand and to write, and it requires less thinking.
We can test it out using the following.
The output will be.
Notice that even though we call getContentTotalSizeFor, it does not block. The first time await is hit, control goes back to the caller (main in this case), so it continues doing work (System.out.println("waiting");). Once the operation being awaited ends, getContentTotalSizeFor resumes its execution from where it left off.
Let’s go back and put some numbers on the code to track the execution.
Notice that every time an awaited operation completes, execution resumes from that point on. If a new await is found, control goes back to the caller again. All of this is async and does not block the work of the caller function.
At compile time, async/await code is transformed into byte code similar to what we would get from writing regular CompletableFuture code, as in the first example.
If you are not convinced by now, then let’s look at more interesting examples.
A more complex example is the following.
Or we could write it as follows.
Notice that the two cubes calculations are independent computations that run asynchronously at the same time.
First of all, we need to fully understand the problems we currently have in our code base so we can work with the people writing it and solve the problems. This is impacting performance at different levels.
Also, for those of you who are new to concurrency, try not to block your async operations. In most cases, blocking is unnecessary and degrades performance, and there is probably a way to do the same thing without blocking at all.
Finally, by using higher abstractions to manage concurrency, as we saw with async/await, we can write better concurrent code without so much trouble. Concurrency is naturally hard, but there are ways to make it simpler. Take advantage of them and use them.