In a previous post, we discussed the importance of writing expressive code. Expressiveness makes our code easier to follow and understand. New developers can add new features or fix defects without spending too much time trying to decrypt what the original authors had in their mind or why they ended up with a specific solution. The code should reflect the authors’ decisions in a clear way as much as possible.
As we raise the level of abstraction, the decisions that were made become important. At the architectural level, the existence of each component and the way that it communicates with the other components should be clear to every developer of the team. As Martin Fowler states in “Who needs an Architect?”, the common understanding of the system is highly correlated to the architecture. In the end, it is all about communication.
An architecture should point out its intentions and all the decisions that were made loud and clear! In this post, we will discuss the problems that architectures with not well-communicated decisions have. We call such architectures obscure.
As time passes, the requirements of our system usually change. For example, if a business becomes successful, then the capacity of the system might need to be increased. Also, our understanding of how the system works gets better through the time. Thus, the architecture evolves.
Through the evolution, the architectural decisions that we made in the past might change. This is a common thing. It makes no sense for a prototype to start with a microservice architecture. Not only the cost of adopting a microservice architecture early on is very high, but also the key requirement when we create a prototype is usually flexibility and speed. Microservices solve very specific problems, and usually, none of these problems are important in the prototyping phase. A nice post on this topic is “Evolutionary Architecture” by Randy Shoup.
Should the prototype become successful, these requirements will change. The decisions that an architect takes should balance between what is needed now and how restrictive these choices will be for the future. The goal is to have as many options open, as we can. More on this topic (and software architecture in general) in the amazing book: Clean Architecture: A Craftsman’s Guide to Software Structure and Design by Robert C. Martin (Uncle Bob).
Since the decisions would probably be revisited in the future, it is important to have them well documented. Every member of the team should be able to tell why the architecture of the system is in the current state.
This is (yet) another case where Conway’s Law applies!
At some point during the lifespan of a system, an architect might decide that the best (or only) way to handle a problem is by doing some sort of hack. We use the term hack here to refer to solutions that do not follow the “standard” patterns. If all the use cases in a system are implemented using the same pattern, then a use case that is implemented differently is a hack. The same stands for the tools (libraries, frameworks, etc.) that we use. If a tool is used in a certain way in the system, whereas in some cases it is changed to work differently, then it is also considered as a hack. Hacks are exceptions to the common rules and patterns that are used in the system.
Such hacks include intercepting and changing the behavior of a framework or library, in a way that was not intended to, using specialized features such as reflection, modifying the source code of a framework and building a custom version of it, or even writing a framework that will provide a bespoke solution from scratch. The latter category can be easily found in projects with a long lifespan. In some cases, these hacks might be the quickest (or the only) way to solve a specific problem, so when we want to have something that works quickly, it might be ok to get our hands dirty!
But, such solutions are in most cases complex. It is not easy to tell what is the intention of these solutions by just looking at the code. And it is even harder to tell how they can be replaced by more convenient solutions in the future. When we use a popular framework or library, one of the biggest benefits that we get is that we have the documentation for free! When we build a custom solution, we have to provide something similar to the users.
The problem with complex solutions is that the reader has to go through the same path the initial author did. The solution is the final step of the problem-solving process. It is nearly impossible for a reader to go through exactly the same path and build the same understanding. Thus, a system with hidden hacks is hard to maintain. So, it is necessary for the complex hacks to have an explanation of their existence and not be hidden in the code. Also, the hacks should be developed in a way that makes them easily disposable. As soon as the circumstances permit, for example, a better standard solution comes out, or we realize that there is a better way to solve our problem, the hacks should be replaced! We don’t want to have a system full of hacks that no one (or only a few) knows what they do!
For example, let’s assume that we want to use a library that provides some sort of exotic data structure, but the API is not thread-safe. We may consider wrapping the API and enrich it with our thread-safe logic to make the data structure conform our use case. But what if in a future release of the library provides the thread-safe API that we wanted previously? Now the thread-safe logic is duplicated.
Also, new developers might be confused when they see a wrapper that does exactly what the library does. They will not be able to tell what is the intention of the wrapper. Even worse, if there is no good documentation of exactly what problem the wrapper solved, they will not be completely sure whether the logic is duplicated and can be removed safely. The wrapper is going to live on the project forever! Of course, testing can save the day in such cases, since tests validate the behavior of hack so that it can be replaced without fear.
The wrapper of the previous example, not only impacts the performance but also adds some noise to the system’s architecture. This noise affects the communication of the intentions of the component in our system. In “Cumulative Code” smell post, we discussed how the problems in communication affect the quality of our code.
Hacky solutions are also used for the sake of performance. It is not usual for an architect to change the way some components interact with each other to achieve better performance. Changes like that lead to changes in the shape of the system. Someone who sees the system later might be confused about the mysterious way that some of the components communicate, while the other components communicate in the “standard” way. Such solutions should be somehow documented; otherwise, the architecture is obscure.
A system full of hacky solutions is very difficult to be maintained and extended. The intention of every hack or shortcut should be well described. The architect should describe what problem they solve, what other candidate solutions were examined and why the current solutions were selected among these. This is critical since hacks are technical debt that needs to be paid at some point; otherwise, the system will eventually default on the technical debt (more on how the technical debt should be handled, in a future post). Thus, the hacks should be easily disposable to avoid increasing the technical debt.
In the same way, that every framework and library of the system should be used carefully so that they do not “pollute” our architecture, hacks and shortcuts must be treated with care. A good architect should remove all the noise that hacks and shortcuts add to the system and make the communication poor. If a solution to a problem is not obvious, there should be a good reason for that!
The rule of thumb that I follow is: Avoid creating new problems when you try to solve a problem. If this is inevitable, provide a very analytic description for this decision. Describe what was the situation when you made that decision, what were the alternatives, and what kind of new problems this decision created. This description can be either in comments within the code or in tickets in our issue-tracker system that are associated with the commit message or even in a wiki/confluence page and of course a thorough test suite that validates the behavior of the “uncommon” decision, so it can be replaced easily in the future.
In “Let the code speak!”, we saw that noise in code makes our code less maintainable. In this post, we described why noise in the architecture makes the entire system less maintainable. Obviously, the maintenance cost of an obscure method or a module is much lower than the maintenance cost of an obscure architecture.
The maintenance cost of poor communication is increased when the level of abstraction is higher.