Philosophy of Testing and Rules How to Reliably Test Complex Applications With Python Examples

by ViAchKoN, October 9th, 2024

Introduction

I have been working as a backend Python engineer for several years now.


During this time, I have learned a lot about writing clean code, applying algorithms in real-world scenarios, working with both relational and non-relational databases, and most importantly, writing effective tests. These skills have allowed me to save significant time on tasks and ensure that the features I implement are reliable.


Throughout my career as a software developer, I've encountered various approaches to testing.


In this article, I would like to share which practices I found to be inefficient and demonstrate how easy it is to create reliable unit tests that ensure both high coverage and robustness. This article may interest not only developers working with Python but also software engineers across the board.

Definition of Tests

Tests are generally considered to be code that tests other code. Typically, tests are divided into two groups: unit tests and integration tests.


  • Unit tests involve testing isolated pieces of source code to validate expected behavior.


  • Integration tests are conducted at the integration level, where multiple parts of a software system are tested as a group, potentially including integration with external systems.


There are differing opinions on how to categorize tests within these groups.


Some argue that only tests for small portions of code should be considered unit tests, while more complex tests should always be classified as integration tests. I endorse the notion that unit tests can involve testing multiple parts of the code, whereas integration testing focuses on whole modules, such as services, that work together through an interface. I refer to these as unit tests with real dependencies.


Moreover, in my experience, I haven't worked on a project where developers wrote tests for each individual method or small block of code. Instead, writing unit tests with real dependencies allows for testing larger sections of code without the need to write separate tests for every single method, as those methods are still covered when tested together.


For the purposes of this article, however, I will simply refer to these as tests, because regardless of terminology, the important thing is to have them.
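To make the distinction concrete, here is a minimal sketch of a unit test in the narrow sense: a hypothetical pure function (apply_discount is an invented example, not part of the API discussed below) exercised in complete isolation, with no database or network involved:

```python
def apply_discount(price: float, percent: float) -> float:
    # pure business logic: no I/O, so it can be tested in full isolation
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount():
    # a classic unit test: one function, no external systems involved
    assert apply_discount(100.0, 25) == 75.0
    assert apply_discount(19.99, 0) == 19.99


test_apply_discount()
print("ok")
```

A unit test with real dependencies would instead call such logic through a route or service backed by a real test database, as the examples later in this article do.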

Unreliable Tests

Throughout my career, I've worked on various projects. In some cases, I joined teams that had already been developing their projects for some time.


I've seen different implementations of tests; some of which proved to be unreliable. In this part of the article, I'll try to summarize these cases with code examples and discuss why such implementations have flaws.


For example, let's consider a simple FastAPI application with several methods that fetch data from the database, add data to the database, and update it.

from fastapi import FastAPI, Body, HTTPException

from core.db import queries
from core import schemas

app = FastAPI()


@app.get(
    "/items",
    summary="Get items",
    status_code=200,
    response_model=list[schemas.ItemSchema],
)
def get_items() -> list[schemas.ItemSchema]:
    items = queries.get_items()
    return items


@app.post(
    "/items",
    summary="Add items",
    status_code=200,
    response_model=list[schemas.ItemSchema],
)
def add_items(
    items: list[schemas.ItemBaseSchema] = Body(
        ...,
        embed=True,
    )
) -> list[schemas.ItemSchema]:
    added_items = queries.add_items(
        items=items
    )
    return added_items


@app.patch(
    "/items/{item_id}",
    summary="Update an item",
    status_code=200,
    response_model=schemas.ItemSchema,
)
def update_item(
    item_id: int,
    update_data: schemas.ItemBaseSchema = Body(
        ...,
        embed=True,
    )
) -> schemas.ItemSchema:
    if queries.get_item(
        item_id=item_id
    ) is None:
        raise HTTPException(status_code=404, detail="Item not found")

    item = queries.update_item(
        item_id=item_id,
        update_data=update_data,
    )
    return item


@app.delete(
    "/items/{item_id}",
    summary="Delete an item",
    status_code=204,
)
def delete_item(
    item_id: int
) -> None:
    if queries.get_item(
        item_id=item_id
    ) is None:
        raise HTTPException(status_code=404, detail="Item not found")

    queries.delete_item(
        item_id=item_id
    )


As you can see, this is a simple API with CRUD operations.


I will present examples of tests for this API that I've encountered during my career and discuss the drawbacks of such approaches. In this example, I will combine common testing practices that can lead to issues during testing.

import json

import pytest
from fastapi import status
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from core.db.models import Item


@pytest.fixture(scope='session')
def db_item(test_db_url, setup_db, setup_db_tables):
    engine = create_engine(
        test_db_url,
        echo=False,
        echo_pool=False,
    )

    Session = sessionmaker(autocommit=False, autoflush=False, bind=engine)

    with Session() as session:
        item = Item(
            name='name',
            number=1,
            is_valid=True,
        )

        session.add(item)
        session.commit()
        session.refresh(item)
    return item.as_dict()


def test_get_items(
    fastapi_test_client,
    db_item,
):
    response = fastapi_test_client.get(
        '/items',
    )
    assert response.status_code == status.HTTP_200_OK


def test_post_items(
    fastapi_test_client,
):
    item_to_add = {
        'name': 'name',
        'number': 1,
        'is_valid': False,
    }

    response = fastapi_test_client.post(
        '/items',
        data=json.dumps(
            {
                'items': [
                    item_to_add
                ],
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK


def test_update_item(
    fastapi_test_client,
    db_item,
):
    update_data = {
        'name': 'new_name',
        'number': 2,
        'is_valid': True,
    }

    response = fastapi_test_client.patch(
        f'/items/{db_item["id"]}',
        data=json.dumps(
            {
                'update_data': update_data,
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK
    

def test_delete_item(
    fastapi_test_client,
    db_item,
):
    response = fastapi_test_client.delete(
        f'/items/{db_item["id"]}',
    )
    assert response.status_code == status.HTTP_204_NO_CONTENT


At first glance, these tests seem fine. They cover all the APIs, test responses, and achieve high overall coverage. But are they really? Let's discuss how these tests are run.


It's important to note that setup_db, setup_db_tables, and db_item have a session scope, meaning these fixtures are torn down only at the end of the test session, after all the tests have completed.


The order of test execution is as follows:


  1. The test database is created if it doesn't already exist for the entire test run.

  2. Test database tables are created if they don't already exist for the entire test run.

  3. A test item object is created in the test database.

  4. The API tests are executed.

  5. The test tables and database are destroyed.


There are several issues with how these tests are designed.


The first flaw is that the database and test object are created only once for the entire test run, which can lead to potential issues with data consistency throughout the tests.


In this setup, all the tests depend on each other.


For instance, if we add a new test to fetch data from the database and it runs after the DELETE API test, it could potentially fail because there would be no test data left in the database.


Even though there's a POST method that runs before the DELETE one, there's a risk that it might be moved or deleted, leading to test failures.


The second flaw is that both the test object in the db_item fixture and the data added in the test_post_items test are hardcoded. This approach works until there's a conflict in the database.


Currently, there are no constraints, except for Item.id (the primary key), set in the database.


However, if constraints were to be added in the future, these tests might fail because they rely on hardcoded values that don’t account for potential conflicts.


This issue once again highlights the dependency of these tests on each other.
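To illustrate, here is a minimal sqlite3 sketch (a stand-in schema, not the app's actual models) of what happens the moment a unique constraint meets hardcoded test data that is never cleaned up between tests:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# hypothetical future schema: name gains a UNIQUE constraint
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

# the session-scoped fixture inserts its hardcoded row once...
conn.execute("INSERT INTO items (name) VALUES ('name')")

# ...and a test that hardcodes the same value now collides with it
try:
    conn.execute("INSERT INTO items (name) VALUES ('name')")
    collided = False
except sqlite3.IntegrityError:
    collided = True

print(collided)  # True: the duplicate violates the constraint
```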


The final flaw is that none of these tests actually verify that the methods being tested work correctly.


The only thing being checked is whether the response code is as expected, without confirming that data was actually changed in the database or correctly retrieved from it.


As it stands, there is no way to be certain that the methods function correctly just by running these tests.


The only way to ensure everything works as expected is to combine manual QA with automatic testing.
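The gap is easy to demonstrate: a handler with a serious bug can still satisfy a status-code-only assertion. The sketch below uses an invented add_item function and an in-memory list in place of the real API and database:

```python
# a list stands in for the database; add_item is a deliberately buggy handler
db = []


def add_item(item):
    # bug: the handler reports success but never persists anything
    return {"status_code": 200}


response = add_item({"name": "name", "number": 1, "is_valid": False})

assert response["status_code"] == 200  # the status-only check happily passes

print(len(db))  # yet the "database" is still empty: 0
```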


These drawbacks can be summarized into three main points:

  • Interdependence: Tests rely too heavily on each other, leading to potential failures if the sequence is altered.


  • Unreliability: Hardcoded values and a lack of thorough checks make these tests prone to failure and inaccuracies.


  • Manual overhead: Without additional manual QA, there's no assurance that the methods are functioning correctly.

How to Improve Tests

Earlier, we saw that simply writing tests is sometimes not enough to ensure the robustness of a system.


The good news is that Python, along with other programming languages, provides tools to enhance test reliability.


Additionally, what can't be covered by tools can often be addressed by following simple best practices.


The first key to making tests reliable is ensuring they are independent whenever possible.


One test should not affect another, meaning they should have separate data, variables, and so on.


Changes in one test should not break or alter the execution of any others.


Therefore, when using any data source in an application, it is crucial to flush all data before each test to ensure that no artifacts are left for the subsequent tests.


In our example, this can be achieved by changing the scope of the setup_db_tables fixture from session to function. As a result, the fixture will look like this:

import pytest
from sqlalchemy import create_engine

from core.db.models import BaseModel  # wherever your declarative base is defined


@pytest.fixture(scope="function")
def setup_db_tables(setup_db, test_db_url):
    engine = create_engine(test_db_url)
    BaseModel.metadata.create_all(bind=engine)
    yield
    BaseModel.metadata.drop_all(bind=engine)


The second step is to create test data within the tests themselves when needed.


The Factory Boy package is useful for generating unique data on the fly and offers the option to create objects directly in the database. With a slight modification to its default Factory class, we can create a custom factory that adds objects to the database:

import factory
from tests import conftest

class CustomSQLAlchemyModelFactory(factory.Factory):
    class Meta:
        abstract = True

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        with conftest.db_test_session() as session:
            session.expire_on_commit = False
            obj = model_class(*args, **kwargs)
            session.add(obj)
            session.commit()
            session.expunge_all()
        return obj


It only requires inheriting from the custom factory class to create model objects in the database.

class ItemModelFactory(CustomSQLAlchemyModelFactory):
    class Meta:
        model = models.Item

    name = factory.Faker("word")
    number = factory.Faker("pyint")
    is_valid = factory.Faker("boolean")
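Even without Factory Boy, the underlying principle, that every call produces fresh and unique values, can be sketched with the standard library alone (make_item_data is an invented helper, not part of any package):

```python
import itertools
import random
import uuid

_counter = itertools.count(1)  # monotonically increasing, so numbers never repeat


def make_item_data() -> dict:
    # fresh, unique values on every call: no hardcoded names or numbers
    return {
        "name": f"item-{uuid.uuid4().hex[:8]}",
        "number": next(_counter),
        "is_valid": random.choice([True, False]),
    }


first, second = make_item_data(), make_item_data()
print(first["name"] != second["name"] and first["number"] != second["number"])
```

Factory Boy packages this idea up with declarative syntax, Faker integration, and batch creation, which is why it is the more convenient choice in practice.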


Finally, checking that a method doesn't return any errors is not enough.


We need to verify the results of each tested method and any internal changes they might make, such as modifications to the database in our case.


Additionally, it is important to clearly define what is expected from the tests.

Tests After Modifications

After making slight modifications to follow the steps outlined earlier, achieving reliability becomes straightforward. The tests will look like this:

import json
from unittest.mock import ANY

from fastapi import status

from core.db.models import Item
from tests import factories


def test_get_items(
    fastapi_test_client
):
    expected_items = factories.models_factory.ItemModelFactory.create_batch(
        size=5
    )

    response = fastapi_test_client.get(
        '/items',
    )
    assert response.status_code == status.HTTP_200_OK

    response_data = response.json()

    assert response_data == [
        {
            'id': item.id,
            'name': item.name,
            'number': item.number,
            'is_valid': item.is_valid,
        } for item in expected_items
    ]


def test_post_items(
    fastapi_test_client,
    test_db_session,
):
    assert test_db_session.query(Item).first() is None

    item_to_add = factories.schemas_factory.ItemBaseSchemaFactory.create()

    response = fastapi_test_client.post(
        '/items',
        data=json.dumps(
            {
                'items': [
                    item_to_add.dict()
                ],
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK

    response_data = response.json()

    assert response_data == [
        {
            'id': ANY,
            'name': item_to_add.name,
            'number': item_to_add.number,
            'is_valid': item_to_add.is_valid,
        },
    ]

    assert test_db_session.query(Item).filter(
        Item.name == item_to_add.name,
        Item.number == item_to_add.number,
        Item.is_valid == item_to_add.is_valid
    ).first()


def test_update_item(
    fastapi_test_client,
    test_db_session,
):
    item = factories.models_factory.ItemModelFactory.create()

    update_data = factories.schemas_factory.ItemBaseSchemaFactory.create()

    response = fastapi_test_client.patch(
        f'/items/{item.id}',
        data=json.dumps(
            {
                'update_data': update_data.dict(),
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK

    response_data = response.json()

    assert response_data == {
        'id': ANY,
        'name': update_data.name,
        'number': update_data.number,
        'is_valid': update_data.is_valid,
    }

    assert test_db_session.query(Item).filter(
        Item.name == update_data.name,
        Item.number == update_data.number,
        Item.is_valid == update_data.is_valid
    ).first()


def test_delete_item(
    fastapi_test_client,
    test_db_session,
):
    item = factories.models_factory.ItemModelFactory.create()

    response = fastapi_test_client.delete(
        f'/items/{item.id}',
    )
    assert response.status_code == status.HTTP_204_NO_CONTENT

    assert test_db_session.query(Item).first() is None

Conclusion

In this article, we explored common issues with testing practices in software development, particularly focusing on a FastAPI application example.


We identified several flaws in the existing tests, including dependencies between tests, reliance on hardcoded values, and a lack of verification of method functionality. To address these issues, we discussed best practices such as using fixtures with function scope to ensure data isolation, utilizing tools like Factory Boy to generate unique test data, and verifying both the results and any internal changes made by the methods being tested.


By implementing these practices, we can enhance the reliability of tests and ensure that the system behaves as expected.


Here are the key points on how to write effective tests:

  • Isolation: Ensure that the data is isolated for each test.


  • Generate Test Data Dynamically: Use tools like Factory Boy to create unique test data on the fly and avoid hardcoded values.


  • Define Clear Expectations: Clearly define what is expected from each test to ensure that they accurately measure the intended functionality.


  • Don't mix up test cases: Each test should check a specific scenario.


  • Verify Method Functionality: Ensure that tests not only check for error-free responses but also validate the correctness of results and any internal changes, such as database modifications.


I hope you found the article useful. You can clone the repository with the example provided (GitHub).