Every so often in software engineering, new methodologies and frameworks emerge with the promise of increasing the reliability and delivery velocity of software. These “Grand New Ways Of Doing Things” have a period during which they are applied universally, touted by consultants and engineering managers alike, to solve all the current problems. Previous “Grand New Ways Of Doing Things” such as Agile, Test Driven Development, and even Microservices, despite starting as cure-alls, have settled into the toolkit of sometimes-useful strategies, contingent upon context.
The latest buzzword that’s probably injecting a few too many technical items into your product roadmap is “Shifting Left,” a concept that champions moving more testing, development, and planning work earlier in the software development lifecycle. The principle is simple: the sooner you catch and fix a bug, the less it costs and the faster you can move. This approach encourages developers to run more tests locally, integrate earlier, and generally speed up the feedback loop in the development process.
However, despite its popularity, I remain skeptical about the effectiveness of Shifting Left, or at least the way that it’s generally applied.
Here’s why:
To run things locally or in early environments, teams often rely on mocks or fakes to stub out responses from downstream services. These require significant time to create and maintain, obviously don’t operate the same way as their real counterparts in production, and risk becoming outdated over time as the real service is updated.
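To make the drift risk concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `real_pricing_service` and `fake_pricing_service` functions, the field names, and the `total_in_dollars` helper are assumptions made for illustration, not taken from any real system. It shows a hand-written fake that still returns an old response shape, so code tested against it passes while the same code fails against the updated service.

```python
# Hypothetical example: a fake that has drifted from the service it imitates.

def real_pricing_service(sku: str) -> dict:
    """Stands in for the live downstream service after a schema change:
    the price is now reported in cents under a new key."""
    return {"sku": sku, "price_cents": 1999}


def fake_pricing_service(sku: str) -> dict:
    """Hand-written fake, last updated before the schema change."""
    return {"sku": sku, "price": 19.99}


def total_in_dollars(fetch_price, sku: str, quantity: int) -> float:
    """Code under test. It was written against the old response shape."""
    response = fetch_price(sku)
    return round(response["price"] * quantity, 2)


def passes_against(fetch_price) -> bool:
    """Return True if the code under test works with the given dependency."""
    try:
        return total_in_dollars(fetch_price, "sku-123", 2) == 39.98
    except KeyError:
        # The expected field is missing from the response.
        return False


if __name__ == "__main__":
    print("against fake:", passes_against(fake_pricing_service))
    print("against real:", passes_against(real_pricing_service))
```

The local suite stays green because it only ever talks to the fake; nothing in it can notice that the real contract has moved on.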
No matter how sophisticated the local or pre-production environments are, they never fully emulate the complexities of production.
Here’s a list of some things that are likely to differ in production vs other environments:
In my experience, this single gap is a leading cause of overconfidence in the stability and performance of software, and hence of production incidents.
It’s 2024. Software engineering is changing. The fastest-delivering engineers on any team will at times have multiple pull requests open, relying on automated tests run somewhere in GitHub or GitLab land to check their work for them. Although they know how to do it when a situation calls for it, I don’t believe great engineers regularly spin up a local environment: it can be incredibly time-consuming, it detracts from dev time, and the work can rightfully be offloaded to the cloud VM farm.
Shifting Right, on the other hand, is about embracing the once-taboo practice of testing in production. “Testing in Production” used to be synonymous with recklessness. Today, I see it as a sign of sophistication, and I see a lot of potential in this domain for enhancing software reliability and delivery speed.
Testing in prod offers a level of assurance unmatched by any pre-prod environment. When you observe your software operating successfully in the real world, you know it’s genuinely ready.
In the current standard development process, black box and end-to-end tests, synthetic monitoring, and observability each occupy their own slice of the development cycle and each have their own codebase. Yet all of these things are closely related! Convergence not only simplifies the testing process under one cohesive plan, but also allows for less dev time spent on each step and high levels of code reuse.
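One way to picture that convergence is a single flow definition that serves as both an end-to-end test and a synthetic monitor. The sketch below is an illustration in Python under stated assumptions, not how Prodzilla or any particular tool works: `Step`, `run_flow`, the example URLs, and the `fake_http` transport are all hypothetical names. The HTTP call is injected, so the same flow could run against a stub in CI or against a real client on a schedule in production.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple
import time

# Hypothetical transport signature: (method, url, body) -> (status_code, json_body)
Transport = Callable[[str, str, dict], Tuple[int, dict]]


@dataclass
class Step:
    name: str
    method: str
    url: str             # may reference earlier captures, e.g. "{order_id}"
    expect_status: int
    capture: str = ""    # response key to copy into the shared context


@dataclass
class StepResult:
    name: str
    ok: bool
    duration_ms: float


def run_flow(steps: List[Step], send: Transport) -> List[StepResult]:
    """Run steps in order, passing captured values forward.
    Stops at the first step whose status code is not the expected one."""
    context: Dict[str, str] = {}
    results: List[StepResult] = []
    for step in steps:
        started = time.perf_counter()
        status, body = send(step.method, step.url.format(**context), {})
        elapsed_ms = (time.perf_counter() - started) * 1000
        ok = status == step.expect_status
        results.append(StepResult(step.name, ok, elapsed_ms))
        if not ok:
            break
        if step.capture:
            context[step.capture] = str(body[step.capture])
    return results


# One definition of the user flow, shared by the test suite and the monitor.
CHECKOUT_FLOW = [
    Step("create order", "POST", "https://example.com/orders", 201, capture="order_id"),
    Step("fetch order", "GET", "https://example.com/orders/{order_id}", 200),
]


def fake_http(method: str, url: str, body: dict) -> Tuple[int, dict]:
    """Stub transport for CI; a monitor would pass a real HTTP client instead."""
    if method == "POST" and url.endswith("/orders"):
        return 201, {"order_id": "42"}
    if method == "GET" and url.endswith("/orders/42"):
        return 200, {"order_id": "42", "state": "created"}
    return 404, {}


if __name__ == "__main__":
    for result in run_flow(CHECKOUT_FLOW, fake_http):
        status = "ok" if result.ok else "FAILED"
        print(f"{result.name}: {status} ({result.duration_ms:.2f} ms)")
```

In this arrangement, a CI job would assert that every `StepResult` is `ok`, while a scheduled job would run the same `CHECKOUT_FLOW` against production and forward the per-step results and timings to alerting or dashboards. The flow itself is written once.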
One thing engineering management everywhere is talking about more than shifting left is reducing cloud costs! Most dev teams are operating with 4-5 environments in various states of disrepair, but get almost all of their value from 1-2 of these (hint: one of these is prod!). Maintaining multiple dev or testing environments is a costly affair. By focusing on production, we can reduce unnecessary expenditure on cloud resources.
It’s from this perspective that I’m developing Prodzilla.
Right now Prodzilla is an open-source, low-code, synthetic monitoring tool, with a focus on easily testing complex user flows not traditionally tested in prod. The intention is to grow it into a framework that provides everything you need to test in production, and surface the discovered behaviour in useful outputs such as internal or external docs and alerts.
If you like the idea, please give us a star on our GitHub!
All this is to say, as with all things, the correct approach is likely balanced. If your engineers aren’t getting any feedback about the quality or behaviour of their code until the end of some pipeline, then yes, please shift left. But don’t fret just because your team isn’t running anything locally - consider this might actually be a good thing! Shifting Left on its own is not a silver bullet, and the losses associated with an over-reliance on simulated environments are significant.
Shifting Right offers a pragmatic, reality-grounded approach that aligns testing with actual user experiences. I hope Prodzilla can be one part of this journey by making testing in production a feasible, efficient, and integral part of the software development lifecycle.