In this blog post, I will demonstrate how to collect coverage data for each incoming request on a Python web server, whichever web framework it is built with.
Code coverage is a metric used in software testing to measure the extent to which the source code of a program has been executed during testing. It indicates the percentage of code that has been covered by the test cases. Code coverage analysis helps developers understand how thoroughly their tests exercise the codebase.
Code coverage tools are used to collect data on code execution during testing and generate reports showing the coverage metrics. These reports help developers identify areas of the code that are not adequately covered by tests, allowing them to write additional tests to improve coverage and increase confidence in the software's correctness and reliability.
Obtaining coverage data for each request coming to a web server offers several benefits:
Granular Insights: By capturing coverage data for each request, developers gain detailed insights into which parts of the codebase are executed in response to different types of requests. This level of granularity allows for a deeper understanding of the application's behavior under various conditions.
Identifying Untested Code Paths: Coverage data helps identify areas of the code that are not adequately covered by tests. By analyzing coverage reports, developers can pinpoint specific code paths that need additional testing, ensuring comprehensive test coverage across the entire codebase.
Building a Deduplication Feature: Coverage data for each end-to-end (e2e) test case can be analyzed to identify duplicate tests and remove them.
To obtain the coverage data, we will use the coverage.py library. coverage.py is mostly used through its command-line interface, but it also provides an API for programmatic use.
We will define a middleware through which every incoming request passes. In our "coverage" middleware, before passing control to the rest of the application, we call the start() function from the coverage library. Coverage is only collected for functions called after start() is invoked, so if this middleware is scheduled to run first, the coverage of the other middleware is captured along with the main application code.
Once the application returns, we stop collecting coverage data. We can then fetch the data and process it further.
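To make the ordering point concrete, here is a minimal sketch that uses only plain WSGI callables. RecordingMiddleware, the "auth" middleware, and the log list are hypothetical stand-ins, not part of coverage.py or of the sample application. The "enter" and "leave" entries mark where start() and stop() would sit in a real coverage middleware.

```python
def make_app(log):
    """Return a tiny WSGI application that notes when it runs."""
    def app(environ, start_response):
        log.append("app")
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok"]
    return app


class RecordingMiddleware:
    """Stand-in middleware that logs when it is entered and left."""

    def __init__(self, app, name, log):
        self.app = app
        self.name = name
        self.log = log

    def __call__(self, environ, start_response):
        self.log.append(f"enter {self.name}")  # where cov.start() would go
        try:
            return self.app(environ, start_response)
        finally:
            self.log.append(f"leave {self.name}")  # where cov.stop() would go


log = []
app = make_app(log)
app = RecordingMiddleware(app, "auth", log)      # hypothetical inner middleware
app = RecordingMiddleware(app, "coverage", log)  # wrapped last, so it runs first

# Call the stack directly with a minimal environ and a no-op start_response.
body = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"}, lambda status, headers: None)
```

The middleware that wraps the application last is the outermost one, so everything inside it, including other middleware, runs between its "enter" and "leave" steps. With Flask, a middleware class is usually attached by wrapping app.wsgi_app, for example app.wsgi_app = SomeMiddleware(app.wsgi_app).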
Below is a code snippet for the coverage middleware, written as a WSGI middleware, which can be used in servers built using the Flask web framework:
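import coverage


class CoverageMiddleware:
    def __init__(self, app):
        # The WSGI application (or next middleware) being wrapped.
        self.app = app

    def __call__(self, environ, start_response):
        # Start a fresh measurement for this request.
        cov = coverage.Coverage(cover_pylib=False)
        cov.start()
        try:
            response = self.app(environ, start_response)
        finally:
            # Stop measuring even if the application raises.
            cov.stop()
        # Fetch the collected data and hand it to the Write helper.
        result = cov.get_data()
        Write(result)
        return response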
Here, the Write function writes the coverage data to a file, say dedupData.yaml, which can then be used to identify duplicate test cases in an e2e scenario.
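The post does not show the body of Write, so the following is only a sketch of what such a helper might look like. The names coverage_to_record, record_to_yaml, append_record, and the test_id parameter are hypothetical. The only coverage.py calls it relies on are measured_files() and lines(filename) on the object returned by get_data(). How the middleware learns which test case a request belongs to is not covered here. A small FakeData stand-in is used so the sketch runs without coverage.py installed.

```python
def coverage_to_record(test_id, data):
    """Build one per-test record from a CoverageData-like object.

    `data` only needs measured_files() and lines(filename).
    """
    executed = {}
    for filename in sorted(data.measured_files()):
        lines = data.lines(filename) or []  # lines() may return None
        if lines:
            executed[filename] = sorted(lines)
    return {"id": test_id, "executedLinesByFile": executed}


def record_to_yaml(record):
    """Format one record as a YAML list item (hand-written, no YAML library)."""
    out = [f"- id: {record['id']}", "  executedLinesByFile:"]
    for filename, lines in record["executedLinesByFile"].items():
        out.append(f"    {filename}:")
        out.extend(f"      - {n}" for n in lines)
    return "\n".join(out) + "\n"


def append_record(path, record):
    """Append the formatted record to the output file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(record_to_yaml(record))


class FakeData:
    """Hypothetical stand-in exposing the two methods used above."""

    def __init__(self, lines_by_file):
        self._lines = lines_by_file

    def measured_files(self):
        return set(self._lines)

    def lines(self, filename):
        return self._lines.get(filename)


fake = FakeData({"/srv/app.py": [35, 33, 34], "/srv/unused.py": []})
record = coverage_to_record("test-1", fake)
yaml_text = record_to_yaml(record)
```

In the middleware, the same helpers could be called as append_record("dedupData.yaml", coverage_to_record(test_id, cov.get_data())). The hand-written formatter assumes file paths that are valid as unquoted YAML keys; a YAML library would be the safer choice for arbitrary paths.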
Here is the modified sample Python application with middleware and writing logic in place: https://github.com/AkashKumar7902/samples-python/tree/v1.0.0/flask-mongo
The repository includes test cases generated by Keploy, which can be replayed using the command keploy test -c "pythonapp.py". Upon successful execution of this command, a dedupData.yaml file will be generated. This file contains details of the executed files, including the lines covered, for each test case.
Here is a sample dedupData.yaml:
- id: test-1
  executedLinesByFile:
    /home/akash/Desktop/githubrepo/samples-python/flask-mongo/app.py:
      - 33
      - 34
      - 35
- id: test-2
  executedLinesByFile:
    /home/akash/Desktop/githubrepo/samples-python/flask-mongo/app.py:
      - 24
      - 23
The dedupData.yaml file written earlier, which contains coverage data for each test case, can be used to identify and flag duplicate test cases by finding tests that exercise similar code paths.
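As a rough illustration of that idea, the sketch below groups test cases whose executed (file, line) sets are exactly equal. Treating "similar code paths" as exact equality is an assumption made here for simplicity, not a description of how Keploy decides. The records are written as Python dictionaries in the same shape as the YAML sample, and test-3 is a hypothetical extra entry added so there is something to flag.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group test ids whose executed (file, line) sets are identical."""
    groups = defaultdict(list)
    for record in records:
        signature = frozenset(
            (filename, line)
            for filename, lines in record["executedLinesByFile"].items()
            for line in lines
        )
        groups[signature].append(record["id"])
    # Only groups with more than one test contain duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


records = [
    {"id": "test-1", "executedLinesByFile": {"app.py": [33, 34, 35]}},
    {"id": "test-2", "executedLinesByFile": {"app.py": [24, 23]}},
    # Hypothetical test that runs the same lines as test-1, in another order.
    {"id": "test-3", "executedLinesByFile": {"app.py": [35, 34, 33]}},
]
duplicate_groups = find_duplicates(records)
```

Using a frozenset as the signature makes the comparison independent of the order in which lines were recorded, which matters because the sample file lists lines unsorted.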
Keploy Cloud also offers several deduplication features for test cases based on coverage data.
This is how, with very little code change, you can collect coverage data for each incoming request and prioritize increasing coverage for the most frequent requests. The same data can also be used to build a deduplication feature.
Obtaining coverage data for each request provides granular insights into the codebase's execution under various conditions, helps identify untested code paths, and can be used to build deduplication features for test cases.
Coverage data can be obtained using the coverage.py library, which provides both a CLI and an API for collecting coverage metrics programmatically. In a web server, coverage data can be captured using a middleware that wraps around the application logic and collects coverage information for each incoming request.
Deduplication features for test cases based on coverage data may include identifying and flagging duplicate tests by analyzing similar code paths, removing redundant tests, and optimizing test suites for better coverage and efficiency.
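One possible way to go from flagging duplicates to trimming a suite is a greedy pass that keeps a test only if it covers at least one line that the already-kept tests do not. This is a sketch under that assumption and is not taken from Keploy; split_redundant and the sample records are hypothetical. Note that equal coverage does not guarantee equal behavior, so the result is best treated as a list of candidates for review.

```python
def line_set(record):
    """Flatten a record into a set of (file, line) pairs."""
    return {
        (filename, line)
        for filename, lines in record["executedLinesByFile"].items()
        for line in lines
    }


def split_redundant(records):
    """Greedily keep tests that add new lines; return (kept, redundant) ids."""
    covered = set()
    kept, redundant = [], []
    # Visit the tests with the largest coverage first (the sort is stable).
    for record in sorted(records, key=lambda r: len(line_set(r)), reverse=True):
        lines = line_set(record)
        if lines - covered:
            kept.append(record["id"])
            covered |= lines
        else:
            redundant.append(record["id"])
    return kept, redundant


records = [
    {"id": "test-1", "executedLinesByFile": {"app.py": [33, 34, 35]}},
    {"id": "test-2", "executedLinesByFile": {"app.py": [23, 24]}},
    # Hypothetical test whose lines are a subset of test-1's.
    {"id": "test-3", "executedLinesByFile": {"app.py": [33, 34]}},
]
kept, redundant = split_redundant(records)
```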