Have you ever seen code like this?
I bet you have (or you will see at some point in your career). Code like this exists in some legacy systems and is often very old. Most likely, you don’t feel very good when you see code like this.
The problem with this code is that it is not only way too verbose but, more importantly, it hides the business logic (there are some other problems with this code as we will see later in this post). In enterprise applications, we write code to solve problems. Thus, we should not create new problems with the code. Note that when we write “systems code” or libraries where we aim for high performance, or when the problem we solve is technically very complex, it is acceptable to sacrifice some readability, but even then, we should do it carefully to avoid writing obscure code that hides the logic.
Robert C. Martin (Uncle Bob), in his book “Clean Code: A Handbook of Agile Software Craftsmanship”, says that “the ratio of time spent reading (code) versus writing is well over 10 to 1”. In some legacy systems, I have found myself spending more time trying to work out how to read the code than actually reading it. Testing and debugging such systems can also be really tricky. In most cases, they require a special, uncommon approach that is completely different from everything you have dealt with so far.
Everything we write tells a story
The code is not an exception. The code should not hide the business logic or the algorithm that is used to solve a problem. Instead, it should point it out in a clear way. The names that are used, the length of the methods, even the formatting of the code should show that the problem has been handled with care and professionalism.
What do you feel about this code?
This code looks like a battlefield after a war. It looks like every developer that worked with this code hated to do so and tried to escape from this hell, leaving it in an even worse state. Different formatting and poor naming clearly show that more than one developer has lived in this hell. Sounds like the broken windows theory, doesn’t it? It is not easy to say what the code does (not only because your eyes hurt when you look at the code). This snippet returns the sum of the array minus the number of the elements. Let’s try to do that in a more convenient way:
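The original snippet is not reproduced here, so what follows is only a minimal sketch of what such a rewrite could look like. The class and method names (`ArrayStats`, `sumMinusCount`) are assumptions made for illustration.

```java
import java.util.Arrays;

final class ArrayStats {
    // Sum of all elements minus the number of elements,
    // e.g. {3, 4, 5} -> (3 + 4 + 5) - 3 = 9.
    static int sumMinusCount(int[] values) {
        return Arrays.stream(values).sum() - values.length;
    }

    public static void main(String[] args) {
        System.out.println(sumMinusCount(new int[] {3, 4, 5})); // prints 9
    }
}
```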
Now, we are using the streams of Java 8 which make our code much more concise and readable.
Clean code is not about making our code look pretty. Clean code is about making our code more maintainable. When code is obscure, most of the time is spent on reading it, so the developers’ productivity is reduced. A consequence of obscure code is that the developers who work with it usually make it even worse, as we saw earlier. They do so not because they are incapable of cleaning the code, but usually because they lack the time under the pressure of a deadline. When we work with obscure code, it is really hard to estimate how long it takes to fix a bug or implement a new feature, since the architecture/design of the system is hidden in the code. Thus, we end up doing ugly hacks just to get the job done, thereby increasing the technical debt. Clean code, on the other hand, shows the intention of the author, so even if there is a bug in the code, it is easier to find it and fix it. Clean code helps us go faster in the long term. Two great books I definitely recommend are: “Clean Code: A Handbook of Agile Software Craftsmanship” by Robert C. Martin and “Refactoring: Improving the Design of Existing Code” by Martin Fowler and Kent Beck.
A solution to the maintainability problem of the obscure code would be to spend a couple of months (or more) to refactor the code and clean it, but the chances are really slim for the business to accept that the development is going to be paused while the developers are refactoring the code. So what can we do?
The Boy Scout Rule
The idea behind the Boy Scout Rule, as stated by Uncle Bob, is fairly simple: Leave the code cleaner than you found it! Whenever you touch old code, you should clean it properly. Do not just apply a shortcut solution that will make the code more difficult to understand but instead treat it with care. The rule focuses more on the mentality that the developers should have so they can make their life easier in the long term by making the system more maintainable.
I will be honest and admit that dealing with legacy systems is far from easy most of the time, especially when there are no tests or the test suite is not being maintained anymore, but we should still seek opportunities to make the code cleaner. There are many techniques someone can employ when working with a legacy system (a great book is: “Working Effectively with Legacy Code” by Michael Feathers), but in this post, I would like to focus on some general advice that I have found useful for writing more expressive code.
Think before you write
There is a misconception about software development that developers (only) write code. We don’t. Instead, we solve problems using code. The code is the medium, not the actual solution. Does pressing random keys count as writing code? Of course not, since it is nearly impossible for such gibberish to be interpreted by a computer. The same applies to code that is written without first thinking about the problem that we are trying to solve. Thus, we have to pay careful attention when we write code so that the solution we provide through this code is clear and unambiguous. We shouldn’t write code for the sake of just writing code. The code should solve problems instead of creating new ones.
Have you ever been asked to do a code review, only to realize that the code is completely wrong and the only solution would be to write it again from scratch? I have seen many developers who start typing in the IDE as soon as they get a task. They think that if they do so, they look like they are working. Most of the time this proves to be the wrong approach, since writing code without thinking leads them in the wrong direction. Of course, some very experienced developers can start writing code right away and still head in the right direction, but the majority of us need some careful planning before the actual typing.
Consider the following example:
There is nothing wrong with the code in this example, right? Well, actually there is! The fact that a Strategy pattern is used here signals that this piece of code needs to have some flexibility. In this example, unlike the original one from Wikipedia, we have only one implementation of the strategy and no short-term plans for more implementations. The intention of the Strategy pattern here can be misleading for the reader. The implementation of a pattern requires some effort, so the reader will naturally wonder what the reason for that decision was. The Y.A.G.N.I. principle, which stands for “You aren’t gonna need it”, is about not doing unnecessary things. It is difficult to predict what we are going to need in the future. Sometimes experience helps, but in most cases, it is safer to keep things simple.
Design patterns help us to solve particular problems in an elegant way that is easy to communicate. If such a problem doesn’t exist (in the previous example there is no need for extensibility), the reader of the code will be misled and think that the problem actually exists. Note that I do not have anything against patterns. I love them! The problem is when people try to invent problems that patterns solve, just because they know the patterns.
The same issue also appears when we try to solve the business requirements and apply patterns all at once. I find it much easier first to see how the problem should be solved in a “dirty” way. Only then do I examine what patterns and abstractions might help the code be more flexible and readable. The rule I follow, whether I practice TDD or not, is first to make it work and then to make it clean (in TDD, of course, this is driven by the 3 Laws of TDD).
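To make the point concrete, here is a small sketch of a Strategy with a single implementation next to the plain version. It is not the example discussed above; the discount domain and every name in it (`DiscountStrategy`, `TenPercentDiscount`, `CheckoutWithStrategy`, `Checkout`) are invented for illustration.

```java
// Hypothetical domain: every name below is invented for illustration.

// With the pattern: an interface, one implementation, and a client wired to it.
interface DiscountStrategy {
    long apply(long priceInCents);
}

final class TenPercentDiscount implements DiscountStrategy {
    @Override
    public long apply(long priceInCents) {
        return priceInCents - priceInCents / 10;
    }
}

final class CheckoutWithStrategy {
    private final DiscountStrategy discount;

    CheckoutWithStrategy(DiscountStrategy discount) {
        this.discount = discount;
    }

    long total(long priceInCents) {
        return discount.apply(priceInCents);
    }
}

// Without the pattern: the single rule is stated where it is used.
final class Checkout {
    long total(long priceInCents) {
        return priceInCents - priceInCents / 10; // 10% discount
    }
}
```

Both versions compute the same total, but the first one makes the reader look for other strategies that do not exist. If a second discount rule ever shows up, extracting the interface at that point is a small refactoring.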
Remember! Just because the code works, it doesn’t mean we have finished our job! Actually, when the code works, we are only halfway done. We have to work on how the code will communicate our intention to the reader.
We have plenty of tools in our toolset, and it is our responsibility to use them only when appropriate. There is no point in using frameworks and libraries just because everybody does. We have to learn what problems they solve and use them in a way that does not hide the business logic. A great post on how to deal with frameworks and libraries is: “Make the Magic go away” by Uncle Bob.
Strive for expressiveness!
Map, filter and reduce are available in almost every language that has stream support. So, everyone can understand what you write, in the same way everyone can understand a for loop or an if statement when they see it. A great post on this topic is: “Collection Pipeline” by Martin Fowler.
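As a quick illustration, the three operations could be chained in Java 8 as follows. This is only a sketch; the class and method names (`PipelineExample`, `sumOfEvenSquares`) are made up for this post.

```java
import java.util.Arrays;
import java.util.List;

final class PipelineExample {
    // Sum of the squares of the even numbers in the list.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.stream()
                .filter(n -> n % 2 == 0)   // keep only the even numbers
                .map(n -> n * n)           // square each of them
                .reduce(0, Integer::sum);  // add the squares together
    }

    public static void main(String[] args) {
        // 2 and 4 are even: 2*2 + 4*4 = 20
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4))); // prints 20
    }
}
```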
Having such an expressive way to deal with data is powerful. First of all, you don’t have to test this functionality. Did you notice the off-by-one error in the first example 😃 ? It also moves us towards functional programming approaches to our programs. Functional programming has way too many benefits to fit in this blog post (if you are interested in learning more on functional programming I recommend the post “Practical Functional Programming” and, of course, the great book on functional programming: “Structure and Interpretation of Computer Programs” by Harold Abelson, Gerald Jay Sussman and Julie Sussman), but I will focus on how it helps the readability of the code.
A solution based on streams for the first example of the post is the following:
Simple and clean. Easy to understand what it does. Now, consider the following example:
Did you expect that the second parameter will be changed when you call that method? Does this method do what it says? Is the method name well suited? Do you actually “get” something?
What about now?
In this example, the return value is a new list. No parameter is affected. We just read the parameter and produce a new result. It is much easier to understand what this method does now and how to use it. This method can be easily composed with other methods. Composition is one of the most important benefits of streams and functional programming in general. Composition allows us to think in terms of data transformations, filtering, etc. at a higher level and write code that is much more declarative and expressive compared to an imperative approach. The code we write expresses what we want to do instead of how it is done! This is a significant improvement for the readability of the code.
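The two styles can be put side by side in a short sketch. The methods below are not the ones from the examples above; `Names`, `getUpperCaseNames` and `toUpperCase` are hypothetical names chosen only to show the contrast.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

final class Names {
    // Surprising: despite the "get" prefix nothing is returned,
    // and the caller's second argument is modified.
    static void getUpperCaseNames(List<String> names, List<String> result) {
        for (String name : names) {
            result.add(name.toUpperCase(Locale.ROOT));
        }
    }

    // Clearer: the parameter is only read and a new list is returned.
    static List<String> toUpperCase(List<String> names) {
        return names.stream()
                .map(name -> name.toUpperCase(Locale.ROOT))
                .collect(Collectors.toList());
    }
}
```

Because the second method only returns a value, its result can be fed straight into another transformation, for example `Names.toUpperCase(names).stream().filter(...)`.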
It is much easier to decompose a problem into subproblems, solve each one of the subproblems and then compose these solutions to create the solution to the initial problem. On the other hand, the imperative style might be essential when the main goal is performance. An interesting story regarding this issue is the famous McIlroy vs Knuth story.
Note that the toList() collector in Java 8 returns, in practice, a mutable list (its documentation gives no guarantee about the mutability of the result), whereas in functional programming we usually use immutable data structures. Still, the fact that we produce new data and treat the parameters as read-only improves the readability and the behavior of the method. Although some methods may have side effects, it is important for a method to either have side effects (behave as a command) or have a return value (behave as a query) but not both when possible. More on this topic can be found in this post.
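A tiny sketch of the command/query split, using a hypothetical `Counter` class that is not part of the original examples:

```java
// Hypothetical class, used only to illustrate command/query separation.
final class Counter {
    private int count;

    // Command: changes the state and returns nothing.
    void increment() {
        count++;
    }

    // Query: returns a value and changes nothing.
    int current() {
        return count;
    }
}
```

Calling `current()` any number of times never changes what the next call returns; only `increment()` does.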
Writing expressive code is not an easy thing to do. A famous quote attributed to Albert Einstein says: “If you can’t explain it simply, you don’t understand it well enough.” So, when I see code where the levels of abstraction are mixed, e.g. UI classes that interact with DAOs or talk directly to the database, or low-level details that are exposed when they shouldn’t be, I can tell that there is not only a violation of the Single Responsibility Principle of the S.O.L.I.D. principles, but also some confusion regarding the problem. Resorting to comments in the code is not the solution to this problem, as we will see in a future post. I believe that the simpler and more expressive the code somebody writes, the better he or she understands the problem.
It is really confusing when the state of an object changes without us noticing it. It is also dangerous to use an object that can be half-constructed when it is returned, especially when we deal with programs that have multiple threads. Sharing such objects is really hard to do correctly. On the other hand, immutable objects are thread-safe and also perfect candidates to be cached, as their state doesn’t change.
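For reference, an immutable class in Java could look like the following sketch. `Money` and its members are hypothetical and serve only as an illustration: the class is final, all fields are final, and every “modification” returns a new instance.

```java
// Hypothetical value class, used only to illustrate immutability.
final class Money {
    private final long cents;
    private final String currency;

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = currency;
    }

    long cents() {
        return cents;
    }

    String currency() {
        return currency;
    }

    // Returns a new instance instead of modifying this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Currency mismatch");
        }
        return new Money(cents + other.cents, currency);
    }
}
```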
But why do people choose mutable objects? I believe that the reason, most likely, is that they think they will get better performance since the memory used would be less because the modifications are performed in-place. Moreover, it feels natural to have the state of an object being changed through its lifecycle. This is what we have learned in OOP. All these years, we have been writing programs where most of the objects that we used were mutable.
Nowadays, the amount of memory that a system has is orders of magnitude larger than it was a few decades ago. The real problem that we are facing is scalability. Processor speed is no longer improving at the rate it did in previous years, but now we have boxes with dozens of cores. So, for our programs to scale, we need to take advantage of the current situation. Since our programs need to be able to run on multiple cores, we need to write them in a way that is safe for them to do so. When we use mutable objects, we have to deal with locking to ensure the consistency of their state. Concurrency is not a trivial problem to solve. If you are interested in concurrency, then you should definitely read “Java Concurrency in Practice” by Brian Goetz. On the other hand, immutable objects are inherently safe for sharing among multiple threads and processors due to their nature. Also, the fact that no synchronization is required gives opportunities for creating systems with low latency and high throughput. Thus, immutability is the safer option to achieve scalability.
Apart from the scalability benefits, immutability makes our code much cleaner. In the first example of the previous section, the collection that was passed as a parameter changed after the method invocation. If the collection were immutable, this would have been prohibited. Thus, immutability would have driven us towards a better solution. Also, the reader does not have to keep track of the state changes in their head since the state is unchanged. The reader only has to associate a name with a value and not remember the latest value of a variable.
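In Java 8, one way to get this kind of protection for a collection is to hand out an unmodifiable copy. The sketch below uses a hypothetical helper, `ReadOnlyLists.readOnlyCopy`. Note that the prohibition is enforced at run time (the attempt throws `UnsupportedOperationException`), not by the compiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ReadOnlyLists {
    // Copies the source first, so later changes to it do not leak through,
    // and then wraps the copy so that it rejects modification.
    static <T> List<T> readOnlyCopy(List<T> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<String> names = readOnlyCopy(Arrays.asList("a", "b"));
        try {
            names.add("c"); // a method that tries to modify its parameter fails here
        } catch (UnsupportedOperationException e) {
            System.out.println("modification rejected");
        }
    }
}
```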
More on immutability, and programming advice in general, can be found in the book “Effective Java (2nd Edition)” by Joshua Bloch. Also, a great talk that you should definitely watch is “The Value of Values” by Rich Hickey.
Programs must be written for people to read, and only incidentally for machines to execute.
― Harold Abelson, Structure and Interpretation of Computer Programs
This post offers general advice on writing code that is more readable and expressive. In future posts, we will discuss smells in production code as well as in test code. We will also see how we can find possible design problems in our production code only by looking at our tests. Stay tuned!
- Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin
- Refactoring: Improving the Design of Existing Code by Martin Fowler and Kent Beck
- Working Effectively with Legacy Code by Michael Feathers
- Structure and Interpretation of Computer Programs by Harold Abelson, Gerald Jay Sussman and Julie Sussman
- Java Concurrency in Practice by Brian Goetz
- Effective Java (2nd Edition) by Joshua Bloch
- Make the Magic go away
- Collection Pipeline
- Practical Functional Programming
- More shell, less eggs (McIlroy vs Knuth story)
- The Value of Values with Rich Hickey
The image in this post originally comes from Twemoji.