Irreducible Complexity

Swapping out your monolith for micro-services, or just using micro-services in the first place, is, in theory, an obvious thing to do. After all, the larger your monolith (“one big thing”!) gets, the harder it is for you to keep track of everything in it. The architecture eventually gets to a place where you just can’t visualize it all in your head. Oh, loose-coupling helps, as do componentization, layered architectures, queues/service-buses, and whatnot, but, in the end, you’re limited by the irreducible complexity of the system.

So, you do the obvious. You extend loose-coupling to the breaking point, by, literally, breaking apart the individual components into distinct micro-services. Now, each of these micro-services can be visualized in its entirety by its development team, and you can task somebody else with visualizing how all the individual components fit together.

So simple, right? You’ve basically decoupled the individual components of your system, and ushered in a new era of speed, efficiency, and apple-pie for lunch.

Or have you? The thing is, “decouple” above carries a lot of water! If, for example, the API to one of the components is constantly changing, then any of the other teams that is using this API will need to constantly refactor their implementation to keep it up to date. So yeah, you need to spend a lot of time up-front making sure that you have strong (and stable!) APIs/Interfaces/Contracts between your components. (And that’s a good thing, not just for micro-services, but for everything that you build 🙌)

That said, your individual services also have soft dependencies. Maybe you decided to use Cassandra for the time-series data in your component. You know that everybody else is using Postgres, right? And that while it’s not that much worse for you to have used Postgres instead, the operations load of maintaining (and paying for!) Cassandra is going to be waaaaay higher?¹

The point here being that while some things are necessary, such as making sure that the contracts between components are clear and well-defined, others, such as the opacity of the implementation, are much more loose and ill-defined. You will need to trade off individual efficiency for the greater good (Cassandra vs Postgres above), and this co-ordination burden will actually be greater than it was when you were just working with that monolith. After all, in monolith-world, it was axiomatic that everybody used Postgres (“That’s the database. You want to store time-series data? Use it!”), but now architectural decisions need to be co-ordinated across teams!

And that, my friends, is the point I’m trying to make. Moving to micro-services is not a free pass when it comes to implementation. At the end of the day, you are building out a distributed system, and if you want one that is fault-tolerant, you have to internalize that your choices will impact other people.

When you decoupled your system, all those interactions didn’t magically vanish; they just moved into a different domain. In a typical monolith, while you get architectural decisions that are pretty consistent across the whole system, you have human co-ordination issues that are a righteous PITA. For example, an updated database schema might ripple across the whole system, resulting in a bunch of different teams having to co-ordinate their development (and release dates!) to get this implemented, slowing pretty much everything else down. The larger your system gets, the more…entertaining 🙄…the process of actually getting releases out the door.

With micro-services though, you trade off these human co-ordination issues for system co-ordination issues. You are, in effect, building a distributed system, and to make sure that it is robust and resilient, you have to make sure that you’ve got all your ducks in a row. Dependency management, version control, deployment pipelines, concurrency and deadlocks, oh, there are an infinity of issues that crop up in this world, and you need to deal with all of them.

“Complexity never goes away, it just moves up the food chain”. It’s true, and it’s just what you did. You traded off the PITA of human co-ordination for the entirely different, but equally horrific, PITA of system co-ordination. If you think your life got easier, well, it didn’t. It’s just that the illusion of control is much greater with system co-ordination issues, and this illusion allows us to delay the inevitable day of reckoning till we can actually afford to deal with it, or we’ve gone out of business. Mind you, now that I think of it, that’s not such a bad tradeoff…

¹ I recall a scenario where one team used the TICK stack, while all the other teams used Prometheus. The rationale was that the lead dev on that team had never used Prometheus, and went with TICK because “speed to market”. It took quite a while to unwind the whole mess…

(This article also appears on my blog)