When Code Duplication Is Acceptable

Written by fagnerbrack | Published 2016/12/14
Tech Story Tags: testing | software-development | programming | web-development | javascript


Code duplication is acceptable when it makes the intent of a test clearer

A picture from The Matrix movie showing several clones of Agent Smith wearing sunglasses in the rain. They all seem to be looking at the screen with an angry face.

For the purpose of this post, "Production Code" is the part of the system that contains the logic of the project and runs in production. "Test Code" is the part of the project that contains the tests, which verify that the application (the Production Code) works as expected.

A test is just a fake client that consumes the API being tested the same way it would be consumed inside the system (for a unit test) or outside of it (for an integration test). The difference is that it contains assertions to verify the correctness of what's being tested. A test is also called the "first client" because it's the first consumer of the code you are writing.

The API it consumes can be a function, an object or a whole service. The most important aspect is that it should be as simple as possible, consuming the functionality it’s testing without a lot of complexity or logic, just a group of declarative expressions that (preferably) satisfy the Arrange-Act-Assert (AAA) model.

A test should not contain a lot of complexity or logic

When the Production Code contains a series of steps that are repeated more than once, or requires a value to be used in more than one place, that can be considered a Bad Code Smell. When the same Production Code has to change in several places in a system that doesn’t have 100% coverage, whether through “copy/paste”, “search and replace” or manual typing, the chance of missing an important part of the system is high.

However, when declarative Test Code executes a series of steps or requires a value in more than one test, there are circumstances where duplicating the code is not that bad. The tests should express a clear intent of what they are supposed to do. Duplication is still a problem, of course, and by default we should try not to duplicate code. However, it may not be as bad as duplication in Production Code.

Duplication in Test Code might not be as bad as duplication in Production Code

Let's imagine a hypothetical scenario where one test covers a broader functionality of the system, and part of that functionality (a unit) is reused elsewhere in the Production Code. That small unit is also covered by a test somewhere else, for a different use case. If a bug in that unit breaks one test, there is no need to cover the unit again for every other place in the Production Code that uses it. Making that single test pass will also fix the parts of the system that are not explicitly covered.

In Test Code, we don’t have anything that will highlight a problem if we change something by mistake. In that context, applying DRY can make the code more complex than necessary by obfuscating information that can be left more explicit.

We can't test the test, right?

If we are going to abstract something that is helpful to run our tests, we should cover it the same way we do for our Production Code. However, creating too many abstractions and helpers in Test Code is an indication of a Bad Code Smell. Tests should be simple, and creating many abstractions indicates potential problems in the design.

The example below shows a couple of tests that don’t repeat values and assertion steps. Instead, they store the values in a few constants and functions to reuse them and have a single place to change:

https://jsfiddle.net/fagnerbrack/9w0wttdf/

In this other example we removed the constants and the extra functions. By doing that, we also reduced the amount of code. One could argue the code is clearer now because everything is contained inside each test instead of being scattered throughout the file and each test having to access a higher scope:

https://jsfiddle.net/fagnerbrack/sdx35wgu/

For Test Code that is declarative and covers a good portion of the system, duplication is acceptable as long as it makes the intent of the test clear

Some operations can repeat in a test environment. After all, we need to execute the “Act” for every single test. There are some things we can abstract, though, such as the “Arrange” and the “Assert”. However, if we see the need for abstracting something then we should only abstract test concerns, not test code.

For example, if we are using OOP and testing the method cart.addProduct(fakeProduct) for many different states of the cart object, it's totally reasonable to repeat the cart.addProduct code in every test that checks for a different behavior. One test could check that a valid product can be retrieved from the cart after being added, and another could check that an invalid product is not added when we try to add it. It doesn't make much sense to abstract the "Act" part (cart.addProduct) into something like addFakeProductTo(cartInstance). In this case, repetition is ok, even though as a general rule it should be avoided.

https://gist.github.com/FagnerMartinsBrack/76d496d2e71e66dd458a0d9ac4a94c46

If we want to prepare a similar product state for more than one test, though, it is more likely to benefit from an abstraction. In this example, we don't really care about the size or color of the fake piece of clothing; it should just be either pants or a shirt:

https://gist.github.com/FagnerMartinsBrack/4faa92f4e5d01bac7229fb4e61a71637

When creating abstractions in Test Code we should only abstract test concerns, not test code.

A test is a small snippet that should be declarative without a lot of complexity or logic. Sometimes it might be necessary to create small abstractions in the "Arrange" or the "Assert" part of it. In a test environment, duplication is not as bad as duplication in the Production Code.

Principles are just principles. They are not hard rules and should be applied when they make sense. In this case, duplication might be desirable in some circumstances. Applying DRY everywhere won't always have benefits.

Try not to repeat yourself. However, understand there are circumstances where you might need to.

Thanks for reading. If you have any feedback, reach out to me on Twitter, Facebook or GitHub.


Published by HackerNoon on 2016/12/14