
How to Deal With Flapping or Broken Tests With a New pytest Plugin

by Pavel Bityukov, November 28th, 2023

Too Long; Didn't Read

A way to schedule fixes for your flaky tests, so that none of them gets lost, with a small pytest plugin.

Tests Are Failing


"Quality is not an act, it is a habit" - Aristotle


Introduction

Tests are useful, and they make our lives better. However, sometimes tests are flapping or simply broken (in this context, a flapping or broken test is one that fails for reasons unrelated to the changes you have made).


And it always happens when you need to land something urgent. If tests are not working as they should, you have to invest time to understand the root cause and fix it, which makes you slower and less productive, and it is demotivating.


You lose focus on the main problem, and now the flapping test is the enemy you have to deal with first. You need to decide what to do next: spend some time fixing the test if that's possible, disable it, or just drop it. None of those is a silver bullet, and each one involves a trade-off.

What Are the Options?

There are two straightforward solutions: fix the tests or drop the tests. Both have pros and cons, and neither is universal. Fixing a test may turn out to be time-consuming, especially if it flaps because other services or resources are unavailable or unstable. Dropping the test, on the other hand, decreases code coverage and simply ignores the problem.


Another way to mitigate the problem is to automatically re-run flapping tests, which is time-consuming and not 100% reliable.

| Method | Pros | Cons |
|---|---|---|
| Drop or skip tests | Fast | Decreases test coverage; increases the risk of future problems |
| Fix tests | Proper way :-) | Increases time-to-market |
| Auto retry (flaky retry) | The test still runs and catches an error if it happens | Increases test time; increases time-to-market because of test run time; does not 100% guarantee passing tests |
| Skip with deadline | Unblocks you from delivering the current task; maintains test coverage; allows you to plan test maintenance | Temporarily decreases test coverage |
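
To put a number on the auto-retry trade-off listed above, here is a small sketch. It assumes that every run of the flapping test passes independently with the same probability; the function names and the 50% pass rate are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def failure_after_retries(p_pass, attempts):
    """Probability that every one of `attempts` independent runs fails."""
    return (1 - p_pass) ** attempts


def expected_runs(p_pass, attempts):
    """Average number of runs when we stop at the first pass or after `attempts`."""
    p_fail = 1 - p_pass
    # Run k (1-based) happens only if the previous k - 1 runs all failed.
    return sum(p_fail ** (k - 1) for k in range(1, attempts + 1))


# Hypothetical flapping test that passes half of the time, retried up to 3 times.
p = Fraction(1, 2)
print(failure_after_retries(p, 3))  # 1/8
print(expected_runs(p, 3))          # 7/4
```

Under these assumptions, even three attempts still leave the pipeline red one time in eight, and the test costs 1.75 runs on average instead of one.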

Skip With Deadline

The main idea of this method is to skip and ignore test results until a certain date. For example, you have an urgent fix to land, and one of your tests is flapping and blocks you on the CI/CD pipeline. This is annoying, this is time-consuming, and this pisses you off.


The obvious solution is to comment out or delete the test to unblock the fix on the pipeline, which might not be the best approach in this case. Fixing the test is not an option either: it might take too much time or might not even be possible at the moment.


Another way is to add a skip decorator and fix the test later. The problem is that "later" may never come, and the test will be forgotten and skipped forever.


A better way, I think, is to skip the test with a deadline condition and a comment that provides context and a reference for the follow-up, e.g., a task or plan reference. This approach allows you to land your fix and, as a good citizen, to plan the follow-up and flag the blocker so that it gets resolved.
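
The idea itself can be sketched with stock pytest alone. This is a minimal sketch and not the plugin's implementation: it relies on the built-in `pytest.mark.skipif` marker, and the `skip_with_deadline` helper, the deadline, and the ticket text are assumptions made for illustration.

```python
from datetime import datetime

import pytest


def skip_with_deadline(deadline, msg):
    """Hypothetical helper: skip the test only while the deadline is in the future."""
    return pytest.mark.skipif(
        datetime.now() < deadline,
        reason=f"Suppressed until {deadline:%Y-%m-%d}: {msg}",
    )


# Assumed deadline and ticket reference, for illustration only.
@skip_with_deadline(datetime(2023, 12, 1), "flaky dependency, see JIRA-12346")
def test_report():
    assert sum([1, 2, 3]) == 6
```

The condition is evaluated when the test module is imported, so once the deadline has passed it becomes false and the test runs (and possibly fails) again.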

Simple pytest Plugin

I’ve crafted a simple pytest plugin to get this functionality. The code of the plugin can be found on GitHub: https://github.com/bp72/pytest-skipuntil.


To install the plugin, run:

pip install pytest-skipuntil


Also, to imitate flaky behavior and repeat our test N times, we will need the pytest-flakefinder plugin:

pip install pytest-flakefinder


Let’s create a simple app.py to emulate flaking behavior:

import random


class App:
    def __init__(self, a, b, fail_ratio=3):
        self.choice = [True] + [False]*fail_ratio
        self.a = random.randint(1, 100) if random.choice(self.choice) else a
        self.b = b
    
    def get_data(self):
        if random.choice(self.choice):
            raise Exception("oops")
        return [1, 2, 3]


Let’s add a simple test for the app:

import pytest

from datetime import datetime
from app import App


def test_app():
    s = App(1, 2)
    assert s.a == 1
    assert s.b == 2
    assert s.get_data() == [1, 2, 3]


Run it:

(.venv) bp:/mnt/hdd2/projects/python/dummyapp$ pytest --flake-finder --flake-runs=5 
====================== test session starts ======================
platform linux -- Python 3.10.12, pytest-7.4.3, pluggy-1.3.0
rootdir: /mnt/hdd2/projects/python/dummyapp
plugins: flakefinder-1.1.0, skipuntil-0.2.0
collected 5 items                                                                                                                                                                                                                              

test_app.py FF.F.                                                                                                                                                                                                                        [100%]

====================== FAILURES ======================
______________________ test_app[0] ___________________

    def test_app():
        s = App(1, 2)
>       assert s.a == 1
E       assert 8 == 1
E        +  where 8 = <app.App object at 0x7f8f24b10fd0>.a

test_app.py:9: AssertionError
______________________ test_app[1] ______________________

    def test_app():
        s = App(1, 2)
        assert s.a == 1
        assert s.b == 2
>       assert s.get_data() == [1, 2, 3]

test_app.py:11: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <app.App object at 0x7f8f24b11060>

    def get_data(self):
        if random.choice(self.choice):
>           raise Exception("oops")
E           Exception: oops

app.py:12: Exception
______________________ test_app[3] ______________________

    def test_app():
        s = App(1, 2)
        assert s.a == 1
        assert s.b == 2
>       assert s.get_data() == [1, 2, 3]

test_app.py:11: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <app.App object at 0x7f8f24ac7820>

    def get_data(self):
        if random.choice(self.choice):
>           raise Exception("oops")
E           Exception: oops

app.py:12: Exception
=================== short test summary info ===================
FAILED test_app.py::test_app[0] - assert 8 == 1
FAILED test_app.py::test_app[1] - Exception: oops
FAILED test_app.py::test_app[3] - Exception: oops
=================== 3 failed, 2 passed in 0.01s ===================


As you can see, it failed 3 times out of 5. This example is synthetic; however, in real life, I've seen it numerous times: tests fail due to dependency service timeouts, resource shortages, database inconsistency, and many other reasons.
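
The failure rate follows from how app.py is written. As a rough check, and assuming the two random draws are independent, the chance that a single run of test_app passes can be worked out exactly. The variable names below are only for illustration.

```python
from fractions import Fraction

fail_ratio = 3
# self.choice = [True] + [False] * fail_ratio, so True is drawn 1 time in 4.
p_true = Fraction(1, fail_ratio + 1)

# `a` keeps the value 1 unless True is drawn; even then randint(1, 100)
# returns 1 once in a hundred draws.
p_a_ok = (1 - p_true) + p_true * Fraction(1, 100)

# get_data() raises only when True is drawn.
p_data_ok = 1 - p_true

p_pass = p_a_ok * p_data_ok
print(p_pass, float(p_pass))  # 903/1600 0.564375
```

So a single run passes a little more than half of the time, which is consistent with seeing 3 failures in 5 runs.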


Let’s update the test with the mark:

@pytest.mark.skip_until(deadline=datetime(2023, 12, 1), strict=True, msg="Alert is suppresed. JIRA-12346")
def test_app():
    s = App(1, 2)
    assert s.a == 1
    assert s.b == 2
    assert s.get_data() == [1, 2, 3]


Now, at test execution time, we get a warning that some tests are suppressed until a certain date, together with the comment we provided. I prefer comments to reference a planning tool artifact, e.g., a task or an issue.

(.venv) bp:/mnt/hdd2/projects/python/dummyapp$ pytest --flake-finder --flake-runs=5 
====================== test session starts ======================
platform linux -- Python 3.10.12, pytest-7.4.3, pluggy-1.3.0
rootdir: /mnt/hdd2/projects/python/dummyapp
plugins: flakefinder-1.1.0, skipuntil-0.2.0
collected 5 items                                                                                                                                                                                                                              

test_app.py sssss                                                                                                                                                                                                                        [100%]

test_app.py: The test is suppressed until 2023-12-01 00:00:00. The reason is: Alert is suppresed. JIRA-12346
test_app.py: The test is suppressed until 2023-12-01 00:00:00. The reason is: Alert is suppresed. JIRA-12346
test_app.py: The test is suppressed until 2023-12-01 00:00:00. The reason is: Alert is suppresed. JIRA-12346
test_app.py: The test is suppressed until 2023-12-01 00:00:00. The reason is: Alert is suppresed. JIRA-12346
test_app.py: The test is suppressed until 2023-12-01 00:00:00. The reason is: Alert is suppresed. JIRA-12346
====================== 5 skipped in 0.00s ======================


To simulate a missed deadline, set the deadline argument to a date in the past. The test starts to fail again, with a note that the deadline for the suppression has passed.

(.venv) bp:/mnt/hdd2/projects/python/dummyapp$ pytest --flake-finder --flake-runs=5 
====================== test session starts ======================
platform linux -- Python 3.10.12, pytest-7.4.3, pluggy-1.3.0
rootdir: /mnt/hdd2/projects/python/dummyapp
plugins: flakefinder-1.1.0, skipuntil-0.2.0
collected 5 items                                                                                                                                                                                                                              

test_app.py FFF..                                                                                                                                                                                                                        [100%]

====================== FAILURES ======================
______________________ test_app[0] ______________________

    @pytest.mark.skip_until(deadline=datetime(2023, 11, 1), strict=True, msg="Alert is suppresed. JIRA-12346")
    def test_app():
        s = App(1, 2)
>       assert s.a == 1
E       assert 73 == 1
E        +  where 73 = <app.App object at 0x7f68ee4a0970>.a

test_app.py:9: AssertionError
______________________ test_app[1] ______________________

    @pytest.mark.skip_until(deadline=datetime(2023, 11, 1), strict=True, msg="Alert is suppresed. JIRA-12346")
    def test_app():
        s = App(1, 2)
        assert s.a == 1
        assert s.b == 2
>       assert s.get_data() == [1, 2, 3]

test_app.py:11: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <app.App object at 0x7f68ee4a28f0>

    def get_data(self):
        if random.choice(self.choice):
>           raise Exception("oops")
E           Exception: oops

app.py:12: Exception
______________________ test_app[2] ______________________

    @pytest.mark.skip_until(deadline=datetime(2023, 11, 1), strict=True, msg="Alert is suppresed. JIRA-12346")
    def test_app():
        s = App(1, 2)
>       assert s.a == 1
E       assert 17 == 1
E        +  where 17 = <app.App object at 0x7f68ee4b5540>.a

test_app.py:9: AssertionError
test_app.py::test_app[0]: the deadline for the test has passed
test_app.py::test_app[1]: the deadline for the test has passed
test_app.py::test_app[2]: the deadline for the test has passed
test_app.py::test_app[3]: the deadline for the test has passed
test_app.py::test_app[4]: the deadline for the test has passed
====================== short test summary info ======================
FAILED test_app.py::test_app[0] - assert 73 == 1
FAILED test_app.py::test_app[1] - Exception: oops
FAILED test_app.py::test_app[2] - assert 17 == 1
====================== 3 failed, 2 passed in 0.01s ======================

A useful feature of the plugin is that it reports both the skipped tests and the tests whose deadlines have passed. This helps to keep track of skipped tests, so they are not forgotten and get fixed on time.

Conclusion

This approach has helped me not to go mad when there is a hotfix to land and random tests fail and block the CI/CD pipeline. It has also contributed very positively to daily time management, since everything is scheduled and nothing gets lost in the rush!