We all love TDD. We know its benefits, and we have read a thousand tutorials on how to build a system using this technique. But it is often considered not feasible for existing legacy systems.
Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle.
We turn requirements into very specific test cases.
We improve software so all tests pass.
This is the opposite of adding functionality that has not been proven to comply with requirements.
It was created in 2003 by Kent Beck (also the author of the xUnit testing framework).
1) Add a test (it must fail)
The test is based solely on behavior, forgetting everything about accidental implementation.
2) Run all tests. The new test must fail. All the rest should pass.
3) Write the simplest possible solution to make the test pass. The programmer must not write code that goes beyond the functionality the test checks (KISS and YAGNI design principles).
If all tests pass, restart the process or ...
4) (Optionally) refactor (when the code smells).
NEVER do both 1 and 4 together.
Tests must have full control of their environment.
No Globals, No Singletons, No Settings, No Database, No Caches, No External API Calls and no side effects at all.
TDD can detect coupling problems.
Solving them leads to cleaner code focused on business logic alone and encapsulating implementation decisions.
We must deal with coupling problems using test doubles: mocks, stubs, fake objects, spies, proxies, dummy objects, etc.
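As a rough illustration of full environmental control, the sketch below swaps a database-backed repository for an in-memory stub. `ArtistFinder`, `StubArtistRepository`, and `find_by_name_fragment` are hypothetical names invented for this example; they do not come from the system described in this article.

```python
class StubArtistRepository:
    """Test double: answers from a fixed in-memory list, never from a database."""

    def __init__(self, names):
        self._names = list(names)

    def find_by_name_fragment(self, fragment):
        # Case-insensitive substring match, standing in for a SQL LIKE query.
        return [name for name in self._names if fragment.lower() in name.lower()]


class ArtistFinder:
    """Business logic under test; the repository is injected, not a global."""

    def __init__(self, repository):
        self._repository = repository

    def suggestions(self, typed_text):
        # Blank input never reaches the repository.
        if not typed_text.strip():
            return []
        return sorted(self._repository.find_by_name_fragment(typed_text.strip()))


def test_suggestions_come_from_the_injected_repository():
    repository = StubArtistRepository(["Sigur Ros", "Radiohead", "Arcade Fire"])
    finder = ArtistFinder(repository)
    assert finder.suggestions("radio") == ["Radiohead"]
    assert finder.suggestions("   ") == []


test_suggestions_come_from_the_injected_repository()
```

Because the collaborator is passed in through the constructor, the test needs no globals, settings, or database connection.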
According to a popular myth, we can't use TDD on existing systems. This is not true. Let's show an example.
We have a ticketing system that needs to showcase several artists performing live streams during the COVID-19 pandemic.
Users can search for artists using a type-ahead selector in a React application.
The system performs database queries on a heavily concurrent back-end system.
We need to remove redundant SQL queries matching part of artists' names.
The SQL LIKE operator is very expensive on relational systems, and we are not allowed to change the back-end architecture.
We need to simplify redundant searches. That's all.
SELECT * FROM ARTISTS
WHERE ((artist.fullname LIKE '%Arcade Fire%')
OR (artist.fullname LIKE '%Radiohead.%')
OR (artist.fullname LIKE '%Radiohead%')
OR (artist.fullname LIKE '%Sigur Ros%')
OR (artist.fullname LIKE '%Sigur%'))
The system should execute just:
SELECT * FROM ARTISTS
WHERE ((artist.fullname LIKE '%Arcade Fire%')
OR (artist.fullname LIKE '%Radiohead%')
OR (artist.fullname LIKE '%Sigur%'))
since these conditions are redundant and expensive for the database:
OR (artist.fullname LIKE '%Radiohead.%')
OR (artist.fullname LIKE '%Sigur Ros%')
Always start with the simplest problem
1) Add a test (empty case)
Notice that the test fails (as expected). Let's create the class and the function.
3) Write the simplest possible solution to make the test pass.
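This first cycle might look like the minimal sketch below. The name `simplify`, and the use of a plain function taking and returning a list of search terms instead of a class, are assumptions made for brevity.

```python
def simplify(terms):
    """Hypothetical simplifier for a list of LIKE search terms."""
    # Deliberately hard-coded: the only existing test needs nothing more.
    return []


def test_empty_case():
    # No terms in, no terms out.
    assert simplify([]) == []


test_empty_case()
```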
Notice:
Continue with another trivial case
1) Add a test (simple expression).
3) And the simplest solution for both cases.
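A possible sketch of this second cycle, again assuming a hypothetical `simplify` function over a list of terms: the hard-coded empty list no longer passes, so the simplest move is to hand the terms back untouched.

```python
def simplify(terms):
    """Hypothetical simplifier for a list of LIKE search terms."""
    # Still no real logic: returning a copy satisfies both tests.
    return list(terms)


def test_empty_case():
    assert simplify([]) == []


def test_simple_expression():
    # A single term can never be redundant.
    assert simplify(["Radiohead"]) == ["Radiohead"]


test_empty_case()
test_simple_expression()
```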
Works like a charm in both cases.
We are taking baby steps, slicing the problem and following the divide-and-conquer principle.
Continue with another (not so simple) case:
1) Add a test (two independent expressions).
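Such a test could be sketched as follows, shown against the same trivial implementation as before. `simplify` and the sample terms are assumptions for illustration.

```python
def simplify(terms):
    """Hypothetical simplifier, unchanged from the previous step."""
    return list(terms)


def test_two_independent_expressions():
    # Neither term overlaps with the other, so both must survive in order.
    terms = ["Radiohead", "Arcade Fire"]
    assert simplify(terms) == ["Radiohead", "Arcade Fire"]


test_two_independent_expressions()
```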
Code works correctly without changes. Is this a good test?
We will discuss it in a more advanced article.
Let's move on.
Continue with a desired and juicy business case.
1) Add a test (one expression containing the other).
Let’s make it work.
3) Add the simplest solution for all the already written cases.
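A rough sketch of what such a first real implementation might look like is shown here. It assumes a hypothetical `simplify` function and assumes that redundancy is detected by prefix only: a term that starts with another term is already matched by the shorter one.

```python
def simplify(terms):
    """Hypothetical simplifier: drop terms that start with another term."""
    result = []
    # Deliberately naive: compare every term against every other term.
    for i in range(len(terms)):
        redundant = False
        for j in range(len(terms)):
            if i != j:
                # LIKE '%Radiohead%' already matches whatever
                # LIKE '%Radiohead.%' matches.
                if terms[i].startswith(terms[j]):
                    redundant = True
        if not redundant:
            result.append(terms[i])
    return result


def test_one_expression_containing_the_other():
    assert simplify(["Radiohead", "Radiohead."]) == ["Radiohead"]


def test_previous_cases_still_pass():
    assert simplify([]) == []
    assert simplify(["Radiohead"]) == ["Radiohead"]
    assert simplify(["Radiohead", "Arcade Fire"]) == ["Radiohead", "Arcade Fire"]


test_one_expression_containing_the_other()
test_previous_cases_still_pass()
```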
This is an ugly algorithmic solution, but we will improve it with a refactoring once we become more confident.
We cannot fake it anymore. We need to make it.
Ugly, not performant, non-declarative and complex.
We don't care. We need to gain confidence and learn about the domain.
Luckily, we will soon have time for better solutions.
Continue with another case.
1) Add a test (left expression containing the right one)
… and we are green, so we are covering the business rule stating that the order of the terms is not relevant (commutative property).
We make it explicit so no smart refactor can ever break it!
Move on with another (not so simple) case.
1) Add a test (capitalization is not relevant to the MySQL engine, but our users might not be aware of that).
2) We run the tests and the new one is broken. Let’s fix it!
3) The simplest solution for all the already written cases (with the new case).
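One way this step could look, assuming the same hypothetical prefix-based `simplify`: the only change is to compare lowercased copies of the terms while returning the terms as the user typed them.

```python
def simplify(terms):
    """Hypothetical simplifier: prefix redundancy, ignoring capitalization."""
    result = []
    for i in range(len(terms)):
        redundant = False
        for j in range(len(terms)):
            # Compare lowercased copies; keep the original spelling in the result.
            if i != j and terms[i].lower().startswith(terms[j].lower()):
                redundant = True
        if not redundant:
            result.append(terms[i])
    return result


def test_capitalization_is_not_relevant():
    assert simplify(["radiohead", "Radiohead."]) == ["radiohead"]


def test_previous_cases_still_pass():
    assert simplify([]) == []
    assert simplify(["Radiohead", "Radiohead."]) == ["Radiohead"]
    assert simplify(["Radiohead.", "Radiohead"]) == ["Radiohead"]


test_capitalization_is_not_relevant()
test_previous_cases_still_pass()
```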
… and tests are all green again with the ugly improved solution.
Code smells and we have several test cases. We need a better solution.
4) Let’s refactor the solution with a more efficient and readable one
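A refactored version might read like the sketch below. It is still hypothetical and still prefix-based; the nested loops and flags become a comprehension, and `any` stops scanning as soon as one covering term is found. The existing tests are what make this rewrite safe.

```python
def simplify(terms):
    """Hypothetical simplifier: same behavior as before, more declarative."""
    lowered = [term.lower() for term in terms]

    def is_redundant(index):
        # A term is redundant when a different term is a prefix of it.
        return any(
            lowered[index].startswith(other)
            for other_index, other in enumerate(lowered)
            if other_index != index
        )

    return [term for index, term in enumerate(terms) if not is_redundant(index)]


def test_all_cases_written_so_far():
    assert simplify([]) == []
    assert simplify(["Radiohead"]) == ["Radiohead"]
    assert simplify(["Radiohead", "Arcade Fire"]) == ["Radiohead", "Arcade Fire"]
    assert simplify(["Radiohead", "Radiohead."]) == ["Radiohead"]
    assert simplify(["Radiohead.", "Radiohead"]) == ["Radiohead"]
    assert simplify(["radiohead", "Radiohead."]) == ["radiohead"]


test_all_cases_written_so_far()
```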
Let's test the first production scenario requested by customers.
1) Add two unrelated redundant prefixes
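Using the five terms from the query at the start of the example, the test could be sketched like this. The `simplify` function is the hypothetical prefix-based version, repeated so the snippet runs on its own.

```python
def simplify(terms):
    """Hypothetical prefix-based simplifier, ignoring capitalization."""
    lowered = [term.lower() for term in terms]
    return [
        term
        for index, term in enumerate(terms)
        if not any(
            lowered[index].startswith(other)
            for other_index, other in enumerate(lowered)
            if other_index != index
        )
    ]


def test_two_unrelated_redundant_prefixes():
    terms = ["Arcade Fire", "Radiohead.", "Radiohead", "Sigur Ros", "Sigur"]
    # 'Radiohead.' and 'Sigur Ros' are covered by their shorter prefixes.
    assert simplify(terms) == ["Arcade Fire", "Radiohead", "Sigur"]


test_two_unrelated_redundant_prefixes()
```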
And it works!
We inject it into our legacy code:
Before
Let's inject it.
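The legacy code itself is not reproduced here, so the following is only a guess at its shape. `build_artist_query_before`, `build_artist_query`, and `like_clause` are hypothetical names, and the string concatenation merely mirrors the SQL shown earlier; real code should escape or bind user input.

```python
def simplify(terms):
    """Hypothetical prefix-based simplifier, ignoring capitalization."""
    lowered = [term.lower() for term in terms]
    return [
        term
        for index, term in enumerate(terms)
        if not any(
            lowered[index].startswith(other)
            for other_index, other in enumerate(lowered)
            if other_index != index
        )
    ]


def like_clause(term):
    # Illustration only: never concatenate raw user input into production SQL.
    return "(artist.fullname LIKE '%" + term + "%')"


def build_artist_query_before(terms):
    # Assumed legacy behavior: one LIKE clause per term, redundant or not.
    return "SELECT * FROM ARTISTS WHERE (" + " OR ".join(map(like_clause, terms)) + ")"


def build_artist_query(terms):
    # The single injected change: simplify the terms before building the SQL.
    return build_artist_query_before(simplify(terms))
```

The change is a single call at the boundary, so the rest of the legacy query builder stays untouched.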
Up to this point, we have worked in isolation.
Software development is a group activity.
The quality assurance engineers found additional possible benefits.
The pattern could be in the middle of the string.
The customer agreed to add this functionality.
Let's consider those cases.
New cases are broken since they were not represented by a previous one. We keep fixing them.
4) Let’s change the solution to cover all previous cases and the new ones.
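One plausible way to cover the new cases, assuming the earlier hypothetical version only checked prefixes, is to replace the prefix check with a substring check. The sample terms are invented for illustration.

```python
def simplify(terms):
    """Hypothetical simplifier: drop terms that contain another term anywhere."""
    lowered = [term.lower() for term in terms]
    return [
        term
        for index, term in enumerate(terms)
        if not any(
            # Substring containment replaces the earlier prefix check.
            other in lowered[index]
            for other_index, other in enumerate(lowered)
            if other_index != index
        )
    ]


def test_pattern_in_the_middle_or_at_the_end():
    assert simplify(["Radiohead", "head"]) == ["head"]
    assert simplify(["Talking Heads", "head", "Radiohead"]) == ["head"]


def test_previous_production_scenario_still_passes():
    terms = ["Arcade Fire", "Radiohead.", "Radiohead", "Sigur Ros", "Sigur"]
    assert simplify(terms) == ["Arcade Fire", "Radiohead", "Sigur"]


test_pattern_in_the_middle_or_at_the_end()
test_previous_production_scenario_still_passes()
```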
Once we shipped the intelligent SQL simplifier, something bad happened.
This was the actual SQL produced after the terms were mishandled:
SELECT * FROM artists WHERE ((()))
The SQL generator mistakenly produced an empty condition.
So we will fix it the TDD way. We isolate the defect and add it as a failing test case.
Of course, it fails, since the previous implementation returned an empty solution (and a customer complaint).
We can fix it by adding a case-insensitive duplicate-removal pre-processor at the beginning of the simplify function:
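A sketch of such a pre-processor is shown below, with hypothetical names and the assumption that the first spelling seen is the one kept. Without it, two terms that differ only in capitalization each count as containing the other, so both are dropped and the condition ends up empty.

```python
def _remove_duplicates_ignoring_case(terms):
    """Hypothetical helper: keep the first spelling of each term."""
    seen = set()
    unique = []
    for term in terms:
        key = term.lower()
        if key not in seen:
            seen.add(key)
            unique.append(term)
    return unique


def simplify(terms):
    """Hypothetical simplifier with the duplicate remover run first."""
    terms = _remove_duplicates_ignoring_case(terms)
    lowered = [term.lower() for term in terms]
    return [
        term
        for index, term in enumerate(terms)
        if not any(
            other in lowered[index]
            for other_index, other in enumerate(lowered)
            if other_index != index
        )
    ]


def test_case_insensitive_duplicates_do_not_empty_the_condition():
    assert simplify(["yes", "Yes"]) == ["yes"]


test_case_insensitive_duplicates_do_not_empty_the_condition()
```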
To see if we must test private methods please visit shoulditestprivatemethods.com
Tests are green again.
With case-insensitive duplicates removed up front, the algorithm works again.
Let's consider a different order.
Against our intuition, we see it fails.
This is because the unit under test returns ‘Yes’ instead of ‘yes’.
The solution depends on the product owner. We can:
We choose 2)
Tests are green again.
We add more tests considering mixed cases.
The tests pass with the new solution and produce the expected SQL.
The case went to peer review.
One of the reviewers asked about NOT LIKE comparisons, spotting an improvement opportunity.
We asked our on-site customer for agreement.
If a user chooses not to see ‘head’, they are also choosing not to see ‘Radiohead’ and ‘Talking Heads’.
In SQL, NOT LIKE '%head%' implies NOT LIKE '%Radiohead%', so the second condition is redundant in an AND condition.
Our simplifier was already aware of that, so we injected it in a second place, confident that the tests already covered that scenario.
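The second injection point might look like the sketch below. `excluded_artists_condition` is a hypothetical name, and the string concatenation is for illustration only; real code should escape or bind user input. The same `simplify` serves both cases because the shorter term is the one to keep in either an OR of LIKE clauses or an AND of NOT LIKE clauses.

```python
def simplify(terms):
    """Hypothetical simplifier: dedupe ignoring case, then drop covered terms."""
    unique, seen = [], set()
    for term in terms:
        if term.lower() not in seen:
            seen.add(term.lower())
            unique.append(term)
    lowered = [term.lower() for term in unique]
    return [
        term
        for index, term in enumerate(unique)
        if not any(
            other in lowered[index]
            for other_index, other in enumerate(lowered)
            if other_index != index
        )
    ]


def excluded_artists_condition(terms):
    # Illustration only: never concatenate raw user input into production SQL.
    clauses = ["(artist.fullname NOT LIKE '%" + term + "%')" for term in simplify(terms)]
    return " AND ".join(clauses)
```

For the terms `["Radiohead", "head"]`, only the `head` exclusion survives, since excluding it already excludes every name containing `Radiohead`.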
Legacy Code is all the code without tests.
Michael Feathers.
Part of the objective of this series of articles is to generate spaces for debate and discussion on software design.
We look forward to comments and suggestions on this article.