Distributed systems are hard. Developing in and with distributed systems is even harder.

What's so hard?

Well, first off we're dealing with two aspects here:

- The cluster, CL in the figure below, that is, the distributed system itself, such as DC/OS or Kubernetes or Spark or Kafka.
- The development environment, DE in the following, which could be anything from vi/Emacs to IntelliJ IDEA.

[Figure: Fundamental options for developing in and with distributed systems.]

The second distinction we have to make is that of local vs. remote, from the point of view of the developer. Local means: runs on my machine. Remote means: runs somewhere in The Cloud© (or yeah, in your datacenter, welcome to 2017 ;).

Let's walk through the four fundamental options now:

CLASS I

Both CL and DE are local. Examples are K8S minikube, DC/OS Vagrant and Docker Compose.

From the developer's POV the pros of this approach are:

- No costs for online stuff, can run as long as I want.
- Fully under my control.

And the cons:

- Can't realistically cover all cases of a distributed system, such as network delays (or, in the worst case, partitioning) or clock skews. People sometimes forget about the fallacies of distributed computing, but these issues still exist, no matter if you've heard of them or not.
- Doesn't really scale. Well, only vertically.
- Typically must be supplemented by also deploying the code into a (real) distributed dev/test environment.

CLASS II

CL and DE are located where one would expect them: the CL is remote and the DE is local. The CL is made available to the DE via proxy or VPN. One example is DC/OS Tunnel.

From the developer's POV the pros of this approach are:

- Can quickly iterate and deploy/test against the real stuff.
- On my machine I only need to run the DE.

And the cons:

- Requires an online connection, so offline development is either very limited or not possible at all.
- Certain edge cases might not be supported because of the limitations of the tunnel/proxy.
CLASS III

Same as CLASS II in terms of separation, but in order to test a service one needs to actually deploy it in the CL. This is the usual setup found in many environments, with or without a CI/CD pipeline in place.

From the developer's POV the pros of this approach are:

- This is the real thing. It's WYSIWYG and as complete as it gets.

And the cons:

- As with CLASS II it requires connectivity, and offline development is most certainly not possible.
- It can be super slow to iterate. You might end up waiting 5 minutes or more to deploy a new version of your service.

CLASS IV

Both CL and DE are remote. Call it Chromebook-based development or whatever, but essentially nothing runs on your machine, really, in this setup. And while I've written about this topic many years ago, I think by and large we're still not there yet. Examples of this category are Google Cloud Shell and Cloud9.

From the developer's POV the pros of this approach are:

- Wherever I am, wherever I go, I have all the things set up and available; no local setup/dependencies.
- It scales like hell: both in terms of system and team.

And the cons:

- Always online is the default. You can't do anything offline.
- You have little to no control over your data (== code and build artifacts) and depend on someone else in terms of availability of your DE.

What to choose?

I don't know your preferences, your use case, your team size, your industry, your regulatory requirements, your budget, … you get it. Personally, I believe we'll be transitioning to CLASS IV within the next 5 to 10 years. Currently, I mostly use a CLASS II setup: it combines the authenticity of the distributed system with the (necessary) iteration speed, and if you like you can have a look at a concrete example here.
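
As a compact recap of the four classes, here is a minimal Python sketch that maps a setup to its class. The names `Location`, `classify` and `test_requires_deploy` are made up for illustration and are not part of any tool mentioned in this post; the flag simply encodes the one thing that separates CLASS II from CLASS III.

```python
from enum import Enum


class Location(Enum):
    LOCAL = "local"    # runs on the developer's machine
    REMOTE = "remote"  # runs in the cloud or in a datacenter


def classify(cluster: Location, dev_env: Location,
             test_requires_deploy: bool = False) -> str:
    """Map a CL/DE combination to one of the four classes (illustrative only)."""
    if cluster is Location.LOCAL and dev_env is Location.LOCAL:
        return "CLASS I"
    if cluster is Location.REMOTE and dev_env is Location.LOCAL:
        # CLASS II reaches the cluster via proxy/VPN;
        # CLASS III has to deploy into the cluster before testing.
        return "CLASS III" if test_requires_deploy else "CLASS II"
    if cluster is Location.REMOTE and dev_env is Location.REMOTE:
        return "CLASS IV"
    # A local cluster with a remote DE is not one of the four options.
    raise ValueError("combination not covered by the four classes")
```

For example, `classify(Location.REMOTE, Location.LOCAL)` describes the tunnel-based setup I mostly use, while passing `test_requires_deploy=True` describes the deploy-then-test setup.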