Software is expensive. But regardless of how much you spend to get Version One out the door, that cost generally represents only one quarter of the Total Cost of Ownership (TCO) of a system. Expect 75% of a project’s cost to come from maintenance after it’s already in production.
Costs increase exponentially as large monolithic systems get larger without getting any less monolithic. This is because even a single-line change impacts distant parts of the codebase in ways that are difficult to predict. Inevitably, changes take longer to make because defects get harder to prevent.
Sam Newman in his book Building Microservices makes a case for breaking up large systems by finding seams between Bounded Contexts. Seams are the divisions between portions of code that can be treated in isolation and worked on without affecting the rest of the codebase.
“The idea is that any given domain consists of multiple bounded contexts, and within each are things … that do not need to be communicated outside as well as things that are shared externally with other bounded contexts.” — Sam Newman
Stated differently, there is a certain set of logic that embodies the contracts and behaviors that define a web service. Because knowledge of this “seam” is shared between multiple systems, design and any subsequent changes require input and sign-off from external teams. However, implementation details are hidden safely behind this seam. This means that as long as the data entering and exiting the bounded context plays by the rules defined at the seam, developers have full authority to make any changes to implementation logic on the inside.
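The relationship between a seam and the implementation behind it can be sketched in code. This is an illustration with hypothetical names (`OrderService`, `place_order`), not a prescription for any particular system:

```python
from abc import ABC, abstractmethod

# The seam: a contract shared with other bounded contexts.
# Changing anything here requires input and sign-off from consumers.
class OrderService(ABC):
    @abstractmethod
    def place_order(self, customer_id: str, items: list[dict]) -> str:
        """Accepts a customer and line items; returns an order ID."""

# The implementation: hidden safely behind the seam. As long as the
# data entering and exiting plays by the rules defined above, the
# team has full authority to rewrite any of this in isolation.
class InMemoryOrderService(OrderService):
    def __init__(self):
        self._orders = {}

    def place_order(self, customer_id, items):
        order_id = f"ord-{len(self._orders) + 1}"
        self._orders[order_id] = (customer_id, items)
        return order_id
```

Swapping `InMemoryOrderService` for a database-backed version later would be invisible to every consumer, because only the `OrderService` contract is shared.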
Changes become more expensive to make as a project gets further down the pipeline. Changes made to a production system would have been multiple orders of magnitude cheaper had they been made during the discovery or design phases.
If you jump into a project by implementing the internals of a bounded context before defining the seams, you introduce these types of needlessly expensive changes. This is because by the time the design of the seam is finally locked down, it will inevitably require additions or changes to the now-existing implementation. If the cost of those changes is too high, stakeholders will be left with an unwieldy API.
But by defining and then working away from a seam, design effort starts with the part that impacts the most parties. Once that seam is locked down, development teams are free to design and implement their respective systems in isolation of any other team. This isolation reduces the complexity and thus the cost of both building and maintaining a large distributed system.
Now let’s look at what is needed to actually define those seams.
“Each bounded context has an explicit interface where it decides what models to share with other bounded contexts.” — Sam Newman
I use the term “specification” (or just “spec”) to refer to this “explicit interface” which I generally consider to be different than “documentation” or “docs.” At a high level, a specification dictates what a system ought to do, whereas documentation describes what a system actually does. In that light, a spec should be created before a system is built and documentation would be written (or generated) after.
There are a number of popular tools designed to spec out APIs, the big three being OpenAPI, API Blueprint, and RAML. I could recommend one here, but your specific choice and implementation are irrelevant so long as they meet the following criteria:
As a contract between multiple parties, the API specification should be accessible by all parties. In addition, it shouldn’t be hard to find. All code is written either for or by your 3AM self. If another team member is woken up to put out a server fire, they should be able to effortlessly discover what your system is supposed to be doing, regardless of their level of sleep deprivation.
Also, as a contract between multiple parties, any changes to the spec will affect all parties. This means that all changes, regardless of how small, should be committed into source control. Among all of the usual boons of using an SCM, these changes can be tracked over time and compared with breakages in other teams’ code.
This requirement limits the type of media that can be considered a “specification.” A whiteboard drawing is a fantastic conversation aid, but keep in mind that dry erase ink is intentionally temporary. You can take a photo, but changes are difficult to track. Another problem with something like a drawing is that it’s not …
As a contract between multiple parties, the spec should contain enough information that all parties can implement it without ambiguity. As such, all of the routes, headers, authorization, validation rules, and other nuances of the API should be rigorously defined.
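To make “rigorously defined” concrete, here is what one route might look like as machine-readable data. The route and its rules are hypothetical, and the shape only loosely mirrors OpenAPI rather than reproducing it; the point is that method, path, auth, headers, and validation are all stated explicitly instead of living in tribal knowledge:

```python
# A machine-readable fragment describing a single route. Every detail
# a consumer needs to implement against it is pinned down here.
SPEC = {
    "POST /users": {
        "auth": "bearer",  # required Authorization scheme
        "headers": {"Content-Type": "application/json"},
        "request": {
            "required": ["email", "age"],
            "properties": {
                "email": {"type": "string"},
                "age": {"type": "integer", "minimum": 0},
            },
        },
        "responses": {201: "User created", 400: "Validation failed"},
    },
}
```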
Because the spec should be written before coding starts, generating the spec from code is an indication of a reversal of that process. But even disregarding that, completeness is one of the hardest requirements to hit when generating a spec from existing code. There are a few key examples. First, authorization flows, especially when using OAuth, are probably impossible for a machine to glean from source code. Post-singularity, that might not be the case, but then we’re all out of jobs anyway. Second, guard clauses in route handlers are a ubiquitous method for enforcing validation rules. Again, it’s hard for machines to pick up on those nuances when generating a spec.
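A guard clause like the following (a hypothetical handler, kept framework-free for clarity) enforces a real validation rule, but a spec generator reading this code would struggle to recover “age must be a non-negative integer” as a formal constraint:

```python
def create_user(payload: dict) -> dict:
    # Guard clauses: validation rules buried in imperative code.
    # A human reads these instantly; a spec generator rarely can.
    if "email" not in payload:
        return {"status": 400, "error": "email is required"}
    age = payload.get("age")
    if not isinstance(age, int) or age < 0:
        return {"status": 400, "error": "age must be a non-negative integer"}
    return {"status": 201, "id": "user-123"}
```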
Any requirements missing from the spec but understood to be part of required behavior anyway will end up causing rework on either the service or the consumers. The best bet is to hand-write your spec to ensure completion and specificity, and then review and update it until all parties can sign off on it. The end result should be a document that explicitly defines the entire service interface.
The spec is only as useful as it is readable. If the content is only human-readable (such as long-form writing, a photograph, etc.), then human interaction will be required to ensure that each new build of your API complies with the specification. Human dependence is the antithesis of automation.
But if the spec is machine-readable, then a whole host of automation possibilities opens up: code generation, test generation, validation, and more. Describing the complete functionality of an API is a lot of work. Ensuring that the resulting description is published in a machine-readable format does take a bit more effort, but that effort is a good investment.
If you start with machine-readable content (m) then there exists a deterministic function that produces human-readable content (h). However, the opposite conversion is a surprisingly difficult academic problem. There does not exist a deterministic function that produces machine-readable content (m) from primarily human-readable content (h):
h = f(m)
m != f(h)
If a tradeoff must be made between machine-readability and human-readability, always opting for machine-readability ends up netting both, anyway. And only having to do half the work usually ends up yielding a more cost-effective final product.
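The h = f(m) asymmetry can be demonstrated directly. Given a machine-readable spec (a plain dict here; the shape is my own illustration, not any standard), rendering a human-readable summary is a small deterministic function, while recovering structured data from free-form prose has no such function:

```python
def render_docs(spec: dict) -> str:
    # h = f(m): a deterministic walk from machine-readable data to
    # human-readable text. No inverse function exists for prose.
    lines = []
    for route, detail in spec.items():
        lines.append(f"## {route}")
        lines.append(f"Auth: {detail['auth']}")
        required = ", ".join(detail["request"]["required"])
        lines.append(f"Required fields: {required}")
    return "\n".join(lines)

spec = {
    "POST /users": {
        "auth": "bearer",
        "request": {"required": ["email", "age"]},
    },
}
```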
Once you have found and defined the seams, they need to be enforced. There are a few ways to do this. First, the developers can read the spec and then hand-write code to follow all of the rules. While this works fine for super small projects, the burden of hand-writing a spec and then hand-writing the whole implementation is probably cost-prohibitive for anything more than a few routes.
The second option is to generate the code that enforces all of the validation rules. This guarantees that the functionality will match the spec as long as the generator is run and the code isn’t updated afterwards. However, the generated code will live in source control as a duplication of the logic defined in the specification. While not inherently problematic, it could lead to confusion about which is correct, especially in cases where there is a disparity.
The third option is to evaluate the spec at runtime, using it to drive validation, route registration, and the enforcement of any other rules the spec defines. The advantage here is that consumers are always 100% guaranteed that the behavior and data communicated by the spec will be produced by the application. Below, I list four rules that should be followed. There are various tools for various frameworks that provide this functionality, but as I’ve stated before, the specific tools don’t matter as long as they do the job.
The downside to runtime evaluation is the overhead of executing extra code. However, this overhead is normally in the single-digit millisecond range (or lower) and is totally worth the quality guarantee that it offers.
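A minimal sketch of the third option, with the checks hand-rolled for illustration (a real project would reach for a spec-aware middleware or a JSON Schema validator; the spec shape below is my own, not any standard):

```python
def enforce(route_spec: dict, payload: dict) -> None:
    """Validate an incoming payload against the spec at runtime.

    Raises ValueError before the handler ever runs, so behavior
    can never silently drift from the published contract.
    """
    rules = route_spec["request"]
    for field in rules["required"]:
        if field not in payload:
            raise ValueError(f"missing required field: {field}")
    for field, constraint in rules.get("properties", {}).items():
        if field in payload and constraint["type"] == "integer":
            if not isinstance(payload[field], int):
                raise ValueError(f"{field} must be an integer")

route_spec = {
    "request": {
        "required": ["email"],
        "properties": {"age": {"type": "integer"}},
    },
}
```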
The following are the rules that I consider to be the minimum.
The specification shall be evaluated by the application at runtime to guarantee the following:
Identify the logical boundaries between parts of the system that can be managed independently, then rigorously define what those boundaries look like. When you do so, teams are free to design and implement within those boundaries without negatively affecting others.
In case you’re curious, my personal preference is OpenAPI and the associated ecosystem of tooling. But what’s your preference? How have you used a spec-driven approach on your project? Please feel free to drop a comment below. :)