Jeff Ferber

ferbs.com/bio

[Deep Dive] Docker, Kubernetes, and Microservices for Small Teams

Most of the web apps I build eventually end up needing a background worker. There will be some slow or heavy task that really should run independently, like an integration with a third-party server, a web scraper, PDF creation, something.
At this point the app crosses a barrier: it now has a multi-service architecture, and there's a new layer of complexity to manage.

Containerization and the growing popularity of microservices have produced a fantastically rich ecosystem of tools to support multi-service architectures. Most are intended for large dev teams and heavy user loads, but can a small team benefit too?
When is the complexity and overhead of such an architecture appropriate? How far to take it? And how do you actually implement it?
I've been playing around with some ideas, trying to answer these questions and find a sweet spot for the smallish team of roughly 2-20 developers: something that works nicely with an existing Django/Laravel/Node/Rails framework and picks up the main advantages without adding too much of a burden.
I've shared my rough pass at a sample solution, webstack-micro, described below, but let's first look at the factors a small team might consider when evaluating a partial/hybrid microservices-based architecture.

A small-team perspective

In a small team, we don't feel pressured to break up our code base into separate services the way a large dev team might (nicely described by Shopify). Going pure micro adds layers of overhead and pain that a small team doesn't need to take on.
What would have been a simple database transaction or join turns into something much harder, requiring interactions between multiple services and possibly two-phase commits.
Since we don't want to split our app into separate microservices, let's define a "microservices architecture" to mean something more general: containerized services running behind a gateway/load balancer/Ingress controller with some form of centralized authentication.
OK, defined, so now we can call it "microservices" and not just "Docker", but why would we want it?

A Touch of Good Service

With a microservices architecture in place, it's quite easy to add little services that complement your main app. If you need to do web scraping, perhaps you'll decide to add a service with headless Chrome, coding it in JavaScript rather than in PHP/Python/Ruby on a library originally intended for functional testing.
Once the microservices plumbing is in place, adding any single new service is a breeze.
You might not want to introduce too many different languages to your project, but Docker Hub is brimming with useful services that don't require custom code. Does your app need optical character recognition? You can drop in a best-in-class OCR service. With centralized authentication already in place, you can also expose it as an independent API endpoint, should that make more sense than using it as a background worker. It's up to you.

Rich background workers

Background workers drove us to this, so we may as well get them right. When a worker finishes, let's push the results to the user over WebSockets. Our new plumbing supports that too.
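To make the worker-then-push flow concrete, here's a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: an in-process `queue.Queue` stands in for a real broker, the hypothetical `slow_task` stands in for the heavy job, and the `push` callback stands in for sending a message down the user's WebSocket connection.

```python
import queue
import threading


def slow_task(payload: str) -> str:
    """Hypothetical stand-in for a slow job such as PDF creation or a scrape."""
    return payload.upper()


def run_worker(jobs: queue.Queue, push) -> None:
    """Pull (user_id, payload) jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut the worker down
            break
        user_id, payload = job
        # When the job finishes, hand the result to the push channel.
        push(user_id, slow_task(payload))


def process_jobs(job_list):
    """Run all jobs on one worker thread; return what was 'pushed' per user."""
    pushed = {}
    lock = threading.Lock()

    def push(user_id, result):
        # Assumption: a real app would send this over the user's WebSocket.
        with lock:
            pushed.setdefault(user_id, []).append(result)

    jobs = queue.Queue()
    worker = threading.Thread(target=run_worker, args=(jobs, push))
    worker.start()
    for job in job_list:
        jobs.put(job)
    jobs.put(None)  # tell the worker there is nothing more to do
    worker.join()
    return pushed
```

Calling `process_jobs([("alice", "report"), ("bob", "invoice")])` returns each user's results grouped by user ID. The point of the shape is that the web request only enqueues the job and returns right away; the result reaches the user later over a separate channel.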
A heavy-duty message broker like RabbitMQ won't be suitable for every project, but it fits in nicely if needed. A good infrastructure guides future decisions. I once had to integrate a Rails app with an awful ERP system. None of the third-party Ruby libraries worked.
The library provided by the vendor worked but was in a different language. Even ignoring the deployment issue, our Ruby-based background system wouldn't work in that environment. Adding a new message broker was just too much to consider at the time, so I went with something hacky and restrictive instead: recording the requests from my local system and converting them to string templates that our app would send.

Room to grow

After paying the initial complexity cost of a microservices architecture, other benefits are within close reach. If some users are making abusive API calls, imposing rate limits in the API gateway isn't a big step. Or maybe you'll configure the gateway to support canary releases, where you can try out a scary change on a small portion of your audience before rolling it out to everyone. (And have it automatically revert if it encounters new errors.)
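In practice you'd switch on the gateway's own rate-limiting feature rather than write one, but it helps to see what such a limit does. Below is a rough sketch of a per-client token bucket, one common way to rate limit. The class names, the client key, and the numbers are all made up for illustration, and this is not how any particular gateway implements it.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start with a full bucket
        self.last = now         # timestamp of the previous check

    def allow(self, now: float) -> bool:
        # Top up for the time elapsed since the last check, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token for this request
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client (e.g., per API key or IP address)."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.buckets = {}

    def allow(self, client_id: str, now: float) -> bool:
        if client_id not in self.buckets:
            self.buckets[client_id] = TokenBucket(
                self.capacity, self.refill_per_sec, now
            )
        return self.buckets[client_id].allow(now)
```

With `RateLimiter(capacity=2, refill_per_sec=1.0)`, a client can send two requests back to back and is then refused until a second has passed. Other clients are unaffected because each has its own bucket. Timestamps are passed in explicitly to keep the sketch easy to test; a real limiter would read a monotonic clock.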
Although heavy user load might not be a primary consideration for many smallish teams, the benefit of auto-scaling is nothing to sneeze at. Even with gently increasing load, some feature or service will likely outpace the others in resource consumption. Being able to scale them separately is pretty nice.

At what cost?

Let's look at the downsides...


Resource heavy

Running numerous containerized services eats up memory and CPU. When I limit Docker to 2 CPU cores on my development system, the webstack-micro example project (below) feels pretty slow, especially on start. The example runs only 9 relatively light services--a larger app would be painful under this allocation.
There's a good chance you'll need a modern, hefty dev system. If you're using Java or .NET or have a whole lot of different services, the project might overwhelm a typical dev laptop. If you reach that point, you might need to shift dev mode to the cloud, which requires always-online access (plus a fee if you use a commercial service like Okteto), or write scripts that skip or stub services, which is another payment in complexity.
This downside mainly refers to dev mode--it doesn't necessarily predict higher hosting costs, but that is a possibility.

Living in containers

Overall, the local dev environment feels a lot like any project that uses a virtual machine for dev mode (like Vagrant projects). It shares the same downsides:
  • you need to set up remote debugging to step through code;
  • there's an extra step to run commands inside the VM/container, and they run more slowly.

Learning curves

If you're introducing new technologies to your project, at least one person will need to learn each one. In particular: Docker Compose, the API gateway (e.g., Traefik), and probably Kubernetes.
Depending on the services you decide to include in your project and the composition of your team, you might be introducing other learning curves, such as RabbitMQ, Redis, WebSockets, etc.
Plus, you'll be changing the deployment process. In a previous startup, we eventually put in place fancy devops features like hot deploys and no single points of failure, but it required a lot of effort and a pile of vendor-specific scripts.
Containerization makes features like these much easier to achieve (and has the benefit of being fairly portable across hosting providers) but you still have the upfront cost of putting something in place and learning how it all works.

Alternatives

Rather than adopting a microservices-like architecture, you can avoid or defer it in various ways, depending on what pain point your project is hitting.

Instead of using background workers, you might offload such tasks to a hosted service like Serverless or AWS Lambda. Similarly, many data centers offer managed solutions, not just for database persistence but for services like RabbitMQ. You can also find separate commercial services for WebSockets and authentication.

Of course, you can also go in the opposite direction, running servers that do whatever you need, with or without an API gateway, with or without containerization, with or without monitoring.

Example setup: webstack-micro

If you decide the flexibility and cool features of a microservices architecture are worth its drawbacks, try dropping your app into webstack-micro to kick the tires. (Free / MIT license.)
I used webstack-micro as a playground, trying to find a balance between functionality and complexity that might be appropriate for relatively small teams.
It sets up Traefik as its API Gateway/Ingress controller. Traefik interacts with one of the example services to enforce centralized authentication for any route marked as protected, requiring either user login or a JWT token. The example also includes the plumbing for background-push, with example uses of background workers sending results over WebSockets.
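To give a feel for the JWT half of that check, here's a minimal standard-library sketch of what an auth service might do when the gateway asks whether a request may pass. It is not webstack-micro's actual code: the function names, the shared-secret HS256 scheme, and the 200/401 convention are assumptions for illustration. A real project would normally use a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256 token (here only so the sketch can be exercised)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _b64url_encode(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: float):
    """Return the claims if the signature is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Only accept the algorithm we expect; never trust the header blindly.
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # bad shape, bad base64, bad JSON, or non-ASCII input
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    if isinstance(exp, (int, float)) and now >= exp:
        return None  # token has expired
    return claims


def check_request(authorization, secret: bytes, now: float) -> int:
    """Hypothetical auth endpoint: 200 lets the request through, 401 blocks it."""
    prefix = "Bearer "
    if not authorization or not authorization.startswith(prefix):
        return 401
    claims = verify_token(authorization[len(prefix):], secret, now)
    return 200 if claims is not None else 401
```

The appeal of centralizing this is that the services behind the gateway never see an unauthenticated request on a protected route, so none of them has to repeat the check.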
None of the microservices-related examples I've evaluated would help a small team get started. They completely ignore authentication, the hardest part of all this, and seem to have very large projects in mind (like Google's example).
I hope the webstack-micro example gives some teams a big boost.
Artwork credit:
  • https://unsplash.com/photos/ECjHeJtRznQ
  • https://unsplash.com/photos/afq5-t0ZGtQ
  • https://unsplash.com/photos/ZihPQeQR2wM
  • https://unsplash.com/photos/j_Ckc_r25zc
