Many developers are familiar with the situation where a project has tests that sometimes pass and sometimes fail without any code changes. Such tests are called flaky, and in this article, we will talk about how to avoid creating them.
I will use the Java Spring Framework as an example, but the reasons discussed here are relevant for any environment.
The most common cause of flaky tests is an unstable environment.
For example, suppose the tests use a shared database deployed for all builds. When multiple build jobs run in parallel in CI/CD pipelines, the tests modify each other's data.
The most reliable solution in this case is to isolate the environments. For example, you can run the database in a Docker container (in Java, there is a popular library for this: Testcontainers). That way, each test run exclusively uses its own database and terminates it when the run finishes.
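With JUnit 5, the Testcontainers PostgreSQL module, and Spring Boot, a throwaway database per test run can be sketched like this (the class and property names are whatever your project uses; this assumes the Testcontainers `postgresql` and `junit-jupiter` dependencies are on the test classpath):

```java
@Testcontainers
@SpringBootTest
class OrderRepositoryTest {

    // One container per test class: started before the tests, stopped after.
    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:15");

    // Point Spring's datasource at the throwaway container.
    @DynamicPropertySource
    static void datasourceProps(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", postgres::getJdbcUrl);
        registry.add("spring.datasource.username", postgres::getUsername);
        registry.add("spring.datasource.password", postgres::getPassword);
    }

    // tests ...
}
```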
It is also important to clean up state after each test so that it does not affect subsequent tests in the suite.
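With JUnit 5 this is typically an @AfterEach hook; a minimal sketch (the table name here is hypothetical):

```java
@AfterEach
void cleanUp() {
    // Remove everything this test may have written, so the next test
    // starts from a known-empty state.
    jdbcTemplate.execute("TRUNCATE TABLE orders");
}
```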
A rarer cause is the use of schedulers or deferred operations.
Imagine that in some part of the application, we declared a scheduler:
```java
// every hour, at second 0 of minute 0
@Scheduled(cron = "0 0 * * * *")
public void removeExpiredData() {
    // do something
}
```
And then, during a test run, it unexpectedly fired because the test happened to run exactly at that time. The developer who wrote the test did not expect such a side effect.
To avoid this behavior, you can make the schedule configurable. This makes the application easier to maintain (no need to rebuild the code to change the schedule), and it also allows you to override the schedule in tests so that schedulers run only when invoked manually.
```java
@Scheduled(cron = "${schedulers.remove-expired-data.cron}")
public void removeExpiredData() {
    // do something
}
```
And then, in the application.yml used in tests, disable the schedule (Spring treats "-" as a disabled cron expression):

```yaml
schedulers:
  remove-expired-data.cron: "-"
```
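With the cron disabled in tests, the scheduled logic can be invoked explicitly at a moment the test controls; a sketch with a hypothetical bean name:

```java
@SpringBootTest
class RemoveExpiredDataTest {

    @Autowired
    CleanupService cleanupService; // hypothetical bean declaring the @Scheduled method

    @Test
    void removesExpiredRows() {
        // The scheduler never fires on its own, so the test decides
        // exactly when the logic runs.
        cleanupService.removeExpiredData();
        // ... assert on the resulting state
    }
}
```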
If your application uses deferred operations, you need to be even more careful: an action can be registered in one test, and its execution can occur while another test is running.
Using Thread.sleep(..) to wait for some action to complete is an indicator that something is wrong with the tests. The timeout may not be enough for some reason (e.g., a GC pause), and if we set it with a large margin, the test run slows down dramatically.
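Instead of sleeping, the test can block on a handle that completes as soon as the work is done; a plain-JDK sketch, where the async operation is a stand-in for whatever your code actually does:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncWaitExample {

    // Stand-in for a real asynchronous operation in the application.
    static CompletableFuture<Integer> deleteExpiredRowsAsync(ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> 3 /* rows deleted */, pool);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The generous timeout only matters in the failure case:
            // get() returns the moment the future completes.
            int deleted = deleteExpiredRowsAsync(pool).get(20, TimeUnit.SECONDS);
            System.out.println("deleted=" + deleted);
        } finally {
            pool.shutdown();
        }
    }
}
```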
Instead of sleep, asynchronous operations can return a Future, and the test can wait with a large timeout using Future.get(20, TimeUnit.SECONDS). Since get returns as soon as the operation completes, overall test execution time will even decrease.

If you use Spring, switching between cached application contexts can also be a cause of flaky tests. This behavior is tough to debug, and it may start happening at any time after adding another test class.
For example, if one test class in the suite uses @MockBean and another does not, two different application contexts will be created, and the JUnit engine will switch between them. I know only one guaranteed way to avoid this: combine test classes into test suites so that all tests in the same suite use the same application context.
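One way to achieve a single shared context (a sketch; the bean names are illustrative) is a common abstract base class, so every test class carries the exact same context configuration and Spring's context cache holds one context:

```java
// All integration tests extend this class, so they share one cached context.
@SpringBootTest
public abstract class AbstractIntegrationTest {

    // Declare every mock here, even if only some tests use it:
    // a @MockBean added in a subclass would change the context key
    // and force Spring to build a second context.
    @MockBean
    protected PaymentGateway paymentGateway; // hypothetical bean
}

class OrderServiceTest extends AbstractIntegrationTest {
    // tests ...
}
```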
Flaky tests significantly degrade the development experience. Some companies even build dedicated services to detect flaky tests and rerun only them.
Of course, my list is incomplete, so I will be glad if you share the causes you have had to deal with in the comments.