The Twelve-Factor App: 12 Best Practices For Microservices

by mobycast, September 28th, 2019

Too Long; Didn't Read

The Twelve-Factor App methodology was created by developers at Heroku after their involvement with the development, operation, and scaling of hundreds of thousands of apps on the Heroku platform. Jon and Chris go through the 12 factors, explaining each one in detail and debating its relevance to today’s cloud-native applications. The factors: codebase; dependencies; config; backing services; build, release, run; processes; port binding; concurrency; disposability; dev/prod parity; logs; and admin processes. Some have argued for adding 3 additional factors: telemetry, security, and API first.

Summary

The Twelve-Factor App methodology was created by developers at Heroku after their involvement with the development, operation, and scaling of hundreds of thousands of apps on the Heroku platform. They noticed that successful apps shared a core set of common principles. First published in 2012, the Twelve-Factor App attempts to distill that knowledge into 12 factors of best practices.

But the Twelve-Factor App is now over 7 years old (which is several lifetimes in the technology world!)… is it still applicable to today’s modern cloud-native applications?

In this episode of Mobycast, Jon and Chris go through the 12 factors, explaining each one in detail and debating its relevance to today’s cloud-native applications.

Show Details

The Twelve-Factor App methodology

Drafted by developers at Heroku based upon their observations of what made good apps. First presented by Adam Wiggins circa 2011 (then published in 2012)

The Factors

  1. Codebase: one codebase tracked in revision control, many deploys
  2. Dependencies: explicitly declare and isolate dependencies
  3. Config: strict separation of config from code
  4. Backing services: foster loose coupling by treating backing services as attached resources
  5. Build, release, run: strictly separate build and run stages
  6. Processes: processes are stateless and share-nothing
  7. Port binding: export services via port binding
  8. Concurrency: scale out via the process model
  9. Disposability: processes are disposable, they can be started or stopped at a moment’s notice
  10. Dev/prod parity: Keep development, staging, and production as similar as possible
  11. Logs: treat logs as event streams, don’t manage log files
  12. Admin processes: admin and utility code ships with app code to avoid synchronization issues

What’s Missing?

7 years since first being published, what changes should be made to make it more relevant for today?

Some have argued for adding 3 additional factors:

  • Telemetry
  • Security
  • “API First”-philosophy

End song:
Flowerchild (Roy England Remix) by Owen Ni – Make Mistakes

Transcript

Jon: Welcome, Chris and Rich. It’s another episode of Mobycast.

Rich: Hey, Jon.

Chris: Hey guys, good to be back.

Jon: Yeah, good to have you back. All right, let’s just see what we’ve been up to. Rich, how about you, what have you been up to?

Rich: Just diving into project management, learning that, struggling with it, having fun with it. We’re growing as a team and I need to start getting better at it. It used to be the case that I would do PM and business stuff after hours. It’s funny, the hardest thing for me to do is actually transition that into the opposite. My day is full of project management and strategy, and then if I have to do anything more meticulous, I do it at night. It’s crazy how hard it is to break through muscle memory.

Jon: Yeah. Just to clarify what I think you’re saying, you’re doing actual company leadership and employee leadership. Not just making sure that the tasks are getting done.

Rich: Yeah. I’m doing that, too, but I’m trying to spend the actual business hour days doing strategy, which is something I always just did nights and weekends.

Jon: How fun, what a great transition. I love that.

Rich: When it’s an actual full transition, I’ll be right there with you right now. It feels like I can’t get anything done.

Jon: Right on. How about you, Chris, what are you up to?

Chris: Just heads down doing some thinking, work around Mobycast, and some other new initiatives that we’re thinking about here at Kelsus, finished the book The Corn Billionaires again. I highly recommend it. Now, I’m starting a new book about a fellow that basically duped all of the folks in the wine industry, especially the high-end wine industry where he was counterfeiting really expensive rare bottles of wine, and really just fooled everyone.

Jon: Oh, I just love that. That’s so cool.

Chris: It’s interesting to see. He just came out of nowhere. He’s still kind of a mysterious background and whether or not did his family really have any money, but literally going through $1 million a month in buying wine and selling wine, throwing wine around. […] and dealing, always getting incredible parties, Hollywood people and investment bankers in Wall Street, all these fancy dinners and everything else. And then at the end of the day, it all just crashes down once folks realize that, “You know what? This is not a 1921 bottle of Bordeaux that he says it is.”

Jon: He got caught. I wonder how many people are not getting caught. That’s what goes for credibility in this day and age. If you say it with confidence, it must be true. Our listeners know that the only way they get ahead in software and then make a million dollars is an international remote developer, so listen to Mobycast. It’s true!

Chris: At the end of the day, it’s all about sales and relationships and making people feel like this is right. It’s not like the best product or the best service necessarily wins. It’s just whoever’s making that decision, just convince them to make that decision. That decision is not made solely on a quantitative basis, it’s actually very qualitative.

Jon: It’s all coming back to Steve Jobs’ Reality Distortion Field. He’s the main person I think of and then everyone’s just emulating that in every different area.

Chris: Yeah. He was the master. For our time.

Jon: Right, sure.

Chris: There’s many people, plenty of people before him.

Jon: Yeah. Like Rasputin or whatever.

Chris: Or P.T. Barnum.

Jon: Right, there you go. Today we’re going to talk about the Twelve-Factor App. This is the beginning of a series on architecture. This is exciting, especially as we try to talk about stuff that’s really valuable for remote and international developers. A lot of people that listen may have less access to architectural decisions and the water cooler talk that happens when high-performing teams make architecture decisions: what they’re talking about, how they’re making these decisions, what they have in mind when people are starting a new project, or when people are looking at an existing project and doing an architectural evaluation to figure out how it needs to change and grow in a better way.

If you Google it, you just get a bunch of random blog articles that may even disagree with each other. If you take a computer science curriculum, it’s really hard to come across this stuff. Even at the upper levels, where there are people that do study computer architectures and software architectures as part of their PhDs. I don’t know if you’ve ever read some of those papers, Chris, but they’re just so removed from the reality sometimes, of what happens inside businesses, that it’s not really that useful.

Chris: It’s the difference between the theoretical and the practical. It’s a clean-room environment versus the real world, two totally different things. You can go try to read a paper, but it’s like, how do I apply this?

Jon: Right, so we’ll try to help our listeners think about this stuff and talk about it from our own perspectives, even though we’re going to use some frameworks to talk about it. We’ll be able to talk about our own experience as it relates to these frameworks. Chris, you want to kick us off on the Twelve-Factor App?

Chris: This is one that listeners may have heard of, or it may be something that you haven’t heard about before. It actually is pretty well known in the industry. It first started surfacing around 2011. I think it was first published in 2012. What it is, is a list of characteristics that identify what makes a good web application. This was based upon the observations being made by the folks at Heroku, which is the PaaS, Platform as a Service, company that was really pretty popular for a while, especially in the Rails community, where lots of apps were using Heroku for their hosting and for deploying. It was really easy to spin things up, and you really didn’t have to worry about infrastructure; that was all taken care of for you by Heroku.

Looking at all these apps being deployed on their system, seeing what works, what doesn’t work, so the result of that was like, “Hey, there are these 12 things that we think are critical to have a web app that works really, really well.” That’s what this framework’s about.

Again, 2012 is when it was first published. In the tech industry, that’s seven years now. We’re talking dog years. This is like 50 years of progress essentially has gone by. It’s interesting that we’re still talking about this. For the most part, it’s still applicable. We’ll get into this, just how relevant is it still and does it need to be updated or are there other frameworks that are better. As far as just the basics of like, hey, these are table stakes, it serves a really useful purpose from that regard.

Jon: For sure. If everybody did these 12 things, then we would have so much better software out there in the world.

Chris: Absolutely. As we go through it, some of these things are just like, “Of course, you do that. Who wouldn’t do that?” The fact of the matter is that there are still a lot of people that don’t do some of these things. We’ll take that into account as we go through it and see what we come up with.

Number one, codebase. The first factor is codebase. What this is saying is that you should have a single codebase for each one of your web apps. And that should be tracked in version control. Who doesn’t do that? I’m always blown away when you see the surveys and the numbers of how many people are not still using version control for their code.

Jon: Yeah. […] project in 2011, 2012, maybe 2013. Somewhere in that range, I can’t remember exactly, but we got the codebase off the production server, that’s where we got the code.

Chris: Yup. For me, any percentage below 100% of code in version control is just a shocker. It is. There’s still folks out there that are not using version control. So please, use it. This factor is also saying, don’t have multiple apps sharing the same codebase. You want to have a single app per codebase.

Jon: It could be that your one app has multiple different repositories, multiple different Git repositories, but just as long as you don’t have multiple apps in one Git repository, then that’s fine.

Chris: Right. That’s the philosophy of this […] framework. Some people may be familiar with the concept of a monorepo. There are other camps out there that say it’s better to have just a single repo and have all your apps in it. What we go through here with the 12 factors, from my point of view, is the school of thought that I lean towards: one repo per app. But there are other perspectives and opinions out there as well.

Jon: Right. If you’re thinking about this stuff then you’re a step ahead. Number two.

Chris: Before we get into that, just one other quick thing here. One of the reasons the Twelve-Factor App is still relevant, still being talked about, and may be having a little bit of a resurgence is microservices. Just about everything in the Twelve-Factor App applies spot on to microservices. Take what we just talked about: a single codebase per app, per microservice. Keep that in mind as we go through this.

Factor two is dependencies. This is explicitly declare and isolate your dependencies. Really, what this is saying is don’t rely on any implicit existence of […] that you require.

Jon: It’s explicitly defining our dependencies and not relying on implicit. Are we doing some circular definition here, Chris?

Chris: No. Because by explicitly declaring your dependencies, […] you’re not relying on the implicit. All this is saying is make yourself a self-contained unit. Whatever your software is, it has everything it needs in order to work. If your app requires something like an image processing tool, ImageMagick or something like that, your app can’t assume that ImageMagick is installed in its environment. You have to handle that. You do have to, at some point, assume something exists. It may be that a hypervisor exists or something like that. For anything above that level, explicitly declare and isolate those dependencies and make your app a self-contained unit.

There are lots of tools that can help with this. Declare all your dependencies via some manifest; just about every programming language and tech stack out there has these things. Python has pip, and JavaScript has npm and package.json. Package.json would be the declaration manifest. It lists all the dependencies that your particular app needs.
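
A language manifest covers libraries, but a system tool like the image processor Chris mentions has to be checked some other way. As a minimal sketch (the names `require_tools` and `MissingDependencyError` are hypothetical, not from the episode), an app could verify such tools at startup and fail loudly instead of assuming they are installed:

```python
import shutil


class MissingDependencyError(RuntimeError):
    """Raised when a required external tool is not on the PATH."""


def require_tools(tool_names):
    """Return {tool: resolved_path}; raise if any tool cannot be found."""
    resolved = {name: shutil.which(name) for name in tool_names}
    missing = sorted(name for name, path in resolved.items() if path is None)
    if missing:
        raise MissingDependencyError(
            "missing required tools: " + ", ".join(missing)
        )
    return resolved
```

Calling something like `require_tools(["some-image-tool"])` once at startup turns an implicit assumption about the environment into an explicit, early failure. The tool name there is a placeholder.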

Jon: Everyone that’s using the web framework just got to check this box off.

Chris: Indeed. And then use a dependency isolation tool during execution as well. We have one of the best dependency isolation tools possible right now, and it’s called Docker. Containers are really great at this. When this was written back in 2011, 2012, containers really didn’t exist at that point, or they were in their infancy, so the isolation was really at the app level instead of at the OS level, the container level. These tech stack frameworks provide some bare-bones isolation. Rails and that community have the concept of basically a different folder structure, like different directories for different versions of an app.

Python has something similar with virtualenv, but that’s at the application level. Now we have much better isolation tools at the container level. That way, you have your image, and everything has to be in that image. It’s not going to be able to implicitly rely on anything because that image is either going to work or it’s not. It just needs a VM in order to host it.

Jon: Next, number three.

Chris: Factor three is config. Here, it’s basically saying, have a strict separation of config from your code. Anything that is configuration-related, anything that’s going to likely vary between deploys, different environments like staging versus prod, things like settings that can be changed or tweaked or whatnot, that’s all config. That should be separate from the code and you should be able to make changes to that configuration without doing a redeploy of your code. You shouldn’t require any code changes to do that.

Jon: In my opinion it would be nice if you could keep your app running mostly through config changes. That would be nice. But that’s not necessarily what this is saying.

Chris: No. That’s pretty sophisticated, for sure. Now, that’s definitely a good goal to have. Back when this was developed in 2011–2012, just having that separation was a really big step. Again, that’s the first thing in the run-walk, if you will. […] the crawl-walk.

Jon: And this may be talking more about environment config. These are where the directories are, where my dependencies are, or these are some secrets that I need to talk to my third-party services. All that stuff, it’s basically saying don’t have that in the code, have it in some separate file that I can change very easily.

Chris: They go so far as to say that environment variables are really what they recommend you use. Nowadays, that’s not necessarily the best practice anymore, but the concept definitely applies. These are settings like, “What’s my database connections […]? Which database server am I talking to for this particular environment?” This is going to be different for your dev versus your staging versus prod. And then, “What email address is the report going to?” and that, again, might be different based upon the environment that you’re in, so […], just separate from your code, keep it separate.
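
The two settings Chris names, the database connection and the report email address, can be sketched as config read from the environment. This is only an illustration: the variable names `DATABASE_URL` and `REPORT_EMAIL` and the `Settings` class are assumptions, not something the episode specifies.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    report_email: str


def load_settings(environ=None):
    """Build Settings from environment variables, failing fast on gaps."""
    env = os.environ if environ is None else environ
    # Variable names below are hypothetical examples.
    missing = [key for key in ("DATABASE_URL", "REPORT_EMAIL") if not env.get(key)]
    if missing:
        raise KeyError("missing config: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        report_email=env["REPORT_EMAIL"],
    )
```

The same code then runs in dev, staging, and prod; only the values supplied by each environment differ.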

Jon: And really, from day one, even if you’re writing a quick play app, just get used to having that stuff always separate. Frameworks help with this. Rails helped with it. I definitely remember the Python and JavaScript frameworks helped too. JavaScript, I think, may be a little less opinionated about this, so it may be easy to put some of this config in the code. It always is. It’s always easy to put config in the code. Just don’t ever do it, then you’ll get in the habit of not doing it.

That was like a lesson I remember learning early in my career. Just taking that extra 10 minutes, 5 minutes to not put a piece of config in code and then watching that 10 minutes go down to 8 minutes, and then 6 minutes. Before long, it’s just muscle memory.

Chris: Not only that. When you don’t do it, you know how bad it is. You know, “I know I shouldn’t be doing this.” Just don’t do it.

Jon: Cool. Number four.

Chris: Four, backing services. This one is really just saying that everything your app is talking to, every external dependency that you’re relying on for services, should be treated as an attached resource. This is really talking about databases for the most part, especially given the time frame when this was written. They talk about how your app should not really make any distinction between local and third-party services. If you’re using Postgres as a database, your code shouldn’t make any distinction between running Postgres locally and consuming Postgres via RDS running in Amazon. You should be able to swap between those different services with just a config change. There shouldn’t be any code changes for that.

Jon: Perfect. They’re almost the same, config and backing services end up being the same idea in a way.

Chris: They go hand in hand. They’re very much related, but they are still separate. Because you could imagine having separate config but then code that says: if the config setting is RDS, here is the function to use for attaching to the database, and if the config setting is local, here is a different one. That would not be the right thing to do here.
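
To make Chris’s distinction concrete, here is a small sketch of the alternative to branching on “RDS versus local”: one code path that derives its connection parameters from a single URL. The function name and the example URLs are hypothetical, and the fallback to port 5432 assumes the Postgres default.

```python
from urllib.parse import urlsplit


def connection_params(database_url):
    """Turn a database URL into connection parameters.

    The same function serves a local database and a hosted one;
    nothing here knows or cares which of the two it is talking to.
    """
    parts = urlsplit(database_url)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # assumed default: Postgres standard port
        "user": parts.username,
        "password": parts.password,
        "dbname": parts.path.lstrip("/"),
    }
```

Swapping a local database for a hosted one then means changing the URL in config, for example from `postgres://app:secret@localhost/appdb` to a URL pointing at the hosted instance, with no code change.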

Jon: Okay, good point. That’s a great example.

Chris: All right. Five. This one’s called build, release, and run. The point here is that you strictly separate your build, your release, and your run stages. Each one of these is a distinct phase of your software and it should be treated as such. One of the examples they give here is that you should not be able to change your code at the run stage. At some point in our careers, we’ve all done this. We’ve SSH’d into a prod machine and changed some code to fix a bug. But we know that’s not good. Don’t do that. If you need to change code, you go back and you do that in the build stage. And you do a new build, and you do a new release, and then it goes, and it’s run.

Another thing they recommend in this is just every release that you do should have a unique release ID. Think about it as an […] ledger. It can only go one way. Even for rollbacks. After rollback, just think of that as that’s a new release. You’ve incremented your release ID. Again, that’s […] ledger concept to it.

This particular factor is a little bit tougher to do in the real world. A lot of us don’t do this as completely as we could. We definitely have a strict separation between build, release, and run, but a lot of times what’s missing is that concept of a release ID, and being able to do a rollback in that same manner and not do something special.

Jon: I think that’s worth talking about. Say we have release ID one, two, three. Now three is in production. They’re like, “Oh, actually three has a problem. We need to rollback to number two.” Is this saying, “No, you don’t go back to number two. You actually create a number four that was the same as number two?”

Chris: Yeah.

Jon: Okay.

Chris: It gets really confusing just where things are at once you start doing that. This gets into what branching model you use and just what your philosophy is for building and deploying your software. Say you’re using something like a GitFlow model, where master contains the code that’s in prod and maybe another staging branch has the staging code, or something like that. If you start doing things like rolling back, now what’s in production is not actually what’s in master anymore.

Jon: The thing that’s hard about this, I imagine 99% of teams would just go back to number two and they wouldn’t make a number four. This feels like a pretty tricky rule especially because when you roll back to number two, what you’re ensuring is that you’re not making something new. You’re going back to the exact bits that used to work and you really do want that. You absolutely need to be confident that when you roll back, you’re going back to an existing set of bits. What this is saying is, “Yeah, go get those same exact build files, but just stamp them with a new release number.”

Chris: There’s definitely a bunch of different ways to approach this. Part of what’s missing here is, again, that traceability between releases and code. We just don’t have that in place. If you actually did have something that’s tracking this and you have a release manager dashboard, if you will, that’s showing you this is what’s in this environment and in this environment versus this one, handling the rollbacks, handling the promotions.

Jon: I’m literally […] setting up a table in AWS’s Quantum Ledger Database, where all it does is keep track of Git commit hashes, and then assign them to release numbers, because that’s the idea, right? Append only. The new number could point to the same Git hash or Docker registry hash as a previous one. For image IDs, the new release number still gets that same image ID because we were rolling back, so we go back to that old image that worked.
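
The rollback rule Jon and Chris describe (releases one, two, three, then a rollback that becomes release four with release two’s bits) can be sketched as a tiny in-memory, append-only ledger. This is an illustration only; the class and method names are hypothetical and it is not how any particular ledger database works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    release_id: int
    image_id: str
    note: str


class ReleaseLedger:
    """Append-only list of releases; entries are never edited or removed."""

    def __init__(self):
        self._releases = []

    def record(self, image_id, note="deploy"):
        """Append a new release with the next release ID."""
        release = Release(len(self._releases) + 1, image_id, note)
        self._releases.append(release)
        return release

    def rollback_to(self, release_id):
        """Record a NEW release that reuses an earlier release's image."""
        if not 1 <= release_id <= len(self._releases):
            raise ValueError(f"unknown release {release_id}")
        target = self._releases[release_id - 1]
        return self.record(target.image_id, note=f"rollback to {release_id}")

    def current(self):
        """Return the most recent release, i.e. what should be running."""
        return self._releases[-1]
```

The release ID only ever goes up, while the image ID for a rollback is exactly the one that used to work, which is the guarantee Jon says teams need.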

Chris: The important point here is making sure that everyone’s on the same page as far as what that process looks like, and how they determine what’s where. So that now the team knows, “Oh, it’s not really what’s in master, it’s what’s in prod. I need to go to that release ledger and that will let me know what’s there.”

Six is processes. The core principle here is that your app should be a stateless process, and that is the fundamental unit here. Because it’s stateless and shares nothing, it’s very easy to scale out. Anything that does require state goes into one of those backing services, some database or whatnot, but that’s separate from your app.

Processes are the first-class citizens here. This is the fundamental unit and should be stateless. Never assume that anything cached in memory or on disk will be available for a future request or job. It’s definitely fine to use caching. You can use /tmp for scratch work or whatnot, but your app should be able to recover from losing that. It doesn’t require that to be there. It’s just an optimization.

Jon: Yeah. That’s the main thing I was thinking about. You might need to have a ton of stuff in memory that you can count on being there, but maybe as part of starting up, you could go recreate that or if it ever gets lost for some reason, you can recreate it.
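
The “cache as an optimization” idea can be sketched as a cache that can always be rebuilt from the backing service. The class name and the `loader` callback are hypothetical; the loader stands in for whatever call fetches the real data.

```python
class RebuildableCache:
    """In-memory cache treated as an optimization, never the source of truth."""

    def __init__(self, loader):
        self._loader = loader  # function that fetches a value from the backing service
        self._data = {}

    def get(self, key):
        """Return the cached value, reloading it if it is not in memory."""
        if key not in self._data:
            self._data[key] = self._loader(key)
        return self._data[key]

    def clear(self):
        """Simulate losing local state, for example after a process restart."""
        self._data.clear()
```

Because every miss falls through to the loader, losing the in-memory copy costs a reload rather than correctness.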

Chris: All right. Moving on. Seven, port binding. This one is a little bit dated and it goes with where the state of technology was back then. This particular principle says you should export your services via port binding. Again, it goes with the idea that it’s self-contained. If your app is being accessed via HTTP, then it needs to expose a port that listens for HTTP traffic.

It’s really like saying you’re not going to rely on the runtime injection of a web server. If you need HTTP, that all has to be supplied by your app itself. So, don’t assume that Nginx is already installed in your execution environment or, if you’re a Java app, that Tomcat is already installed. That’s the app’s job, to make sure that it provides that stuff and is self-contained.

Jon: I don’t have anything to add there other than, I agree, it doesn’t feel modern. It doesn’t feel like something to dwell on. Moving on.

Chris: Again, things like having a built-in HTTP server, you get it for free now. It’s one line of code in Node or you can have an Nginx box here, something like that. It’s just so easy to do versus seven years ago, that wasn’t the case at all.
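
As an illustration of how little is needed today, here is a sketch of a self-contained HTTP service using only Python’s standard library. The `PORT` environment variable name and the fallback port 8000 are assumptions made for the example, as are the handler and function names.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real app would write logs to stdout


def make_server(port=None, host="0.0.0.0"):
    """Bind to the given port, else the PORT env var, else 8000 (assumed)."""
    if port is None:
        port = int(os.environ.get("PORT", "8000"))
    return ThreadingHTTPServer((host, port), HelloHandler)
```

Running `make_server().serve_forever()` starts the service. The app brings its own HTTP server and simply listens on whatever port its environment assigns.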

Jon: Right, but even then it doesn’t feel terribly necessary to me anymore. Why bother having an HTTP server in an app when I can use Docker and make sure that it’s in there that way? Now, Docker is my app and the app itself maybe just doesn’t care. It doesn’t need to deal with that, so the boundary of where the self-contained part is has maybe changed.

Chris: Let’s take HTTP as an example. We have things like ELBs that are doing the front end there for that, but they’re still talking HTTP back to your app. That’s still going inside the container itself, so inside the container we still have to be able to speak HTTP.

Jon: I guess that’s what I’m getting at. Let’s talk about this. I feel like it’s safer to assume that your app is going to be running inside of something that can handle it. Whether it’s additional third parties that you just make sure are there in your Dockerfile, or whether it’s stuff that’s in AWS that also knows how to talk to your Docker image, to your running container, your app, the thing that you’re making with your code and building, doesn’t need to do some of that fundamental stuff, maybe. You don’t need to ever test it.

That’s the point of this. If the app is completely self-contained, then I can test it as a unit and then I can deploy it into a system that doesn’t need those capabilities that it had when I tested it as a unit. What I’m saying is, it doesn’t need to be tested as a unit anymore. You can always assume that I can test it via the image or always assume that I can even test it in a bigger environment like with serverless stuff, like a Lambda function. You can’t test that as a unit anymore. A Lambda function doesn’t have all the stuff that it needs on its own. You have to build some stuff for that to even work. A Lambda function has to be sitting in another thing a lot of times. See where I’m going with that? It’s like, “Yeah, I don’t know that apps need to be self-contained,” because you can assume that the extra stuff is there for it.

Chris: Given the current state of things, and things like containers and serverless, the infrastructure and what you provide versus what you get as a service has changed a little bit. With containers, this whole isolation thing and port binding is really straightforward because you’re not really talking to the outside world at all. Everything’s in there. If your app wants to talk HTTP then you better have configured it. Your app needs to expose itself as an HTTP server.

Jon: And your app can be as simple as a shared object library that Nginx depends on. You don’t have to make it capable of running on its own.

Chris: Right. When you set up your container, you just need to make sure you have Nginx set up because inside the container itself, there has got to be that HTTP server.

Jon: At the risk of playing a little inside baseball, you and I are agreeing, “Okay. This is outdated and you don’t necessarily need to worry about it.”

Chris: You don’t have to worry about it because you’re not going to get very far if you don’t and it’s so easy to take care of. It’s not something you have to think about too much.

Jon: Maybe a more modern way of saying this would be, make sure that you’ve thought about how you’re going to test this, and how you’re going to run your app, and that the different ways that you’re going to run your app are all supported. If you’re going to run it on your local machine, or if you’re always going to run it in the cloud every single time, or if you’re going to run it on your machine and on other people’s machines, think about the runtime environments of your application and make sure that they’re all covered.

Chris: Indeed. All right. Well, for such a boring, simple one that we agreed was dated, it’s got the most attention so far. Moving on. Eight, concurrency. This one is very much related to factor six, processes, where here it’s just saying the scale-out model is via the process model. Again, processes are first-class citizens because they’re stateless. We have a share-nothing, horizontally partitioned model of scaling.

Adding more concurrency is a very simple and reliable operation. We just add more processes and away we go. This is very much what stateless web apps do. You have a load balancer, you have two processes going, traffic goes up, you add a third one. If they can each handle 50 requests a second, then instead of 100 requests a second total, you now have 150 requests a second of capacity. You can keep scaling that way, that horizontal scaling. Scale out via processes is factor eight of this.
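
Chris’s arithmetic can be written down directly. This sketch assumes identical, stateless processes whose capacity adds up linearly, which is an idealization; the function names are hypothetical.

```python
import math


def total_capacity(processes, per_process_rps):
    """Total requests per second across identical stateless processes."""
    return processes * per_process_rps


def processes_needed(target_rps, per_process_rps):
    """Smallest number of identical processes that covers the target load."""
    return max(1, math.ceil(target_rps / per_process_rps))
```

With the numbers from the episode, two processes at 50 requests a second give 100, and adding a third gives 150.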

Jon: Easy-peasy. We agree there.

Chris: Yeah. So, moving on. Nine is disposability. What this is saying is you need to make sure your app’s processes are disposable. Meaning, they can be started or stopped at a moment’s notice and the app should be able to deal with that. You should be thinking about your startup and shutdown. The app should start up as quickly as possible. Minimize that startup time, and then you should also have a graceful shutdown. When you receive a SIGTERM, properly handle that. Do whatever cleanup you need to do and then shut down. Again, do this as quickly as possible.

Your app should also be architected to handle unexpected non-graceful terminations because these will happen. You could have an EC2 instance just be terminated underneath you. You’re not going to get much notice, but the app should at least be able to handle that and you should be thinking about that. What if that does happen? Because it will happen. You don’t want to end up in an unknown state or in a corrupted state. What’s going to happen to your app if that does happen?
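
A graceful shutdown on SIGTERM might look roughly like the sketch below. It covers only the graceful case; surviving an abrupt kill depends on keeping durable state in backing services, which this sketch does not show. The function names and the job-list shape are hypothetical.

```python
import signal
import threading

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Signal handler: only record the request; the work loop does the cleanup."""
    shutdown_requested.set()


def install_handlers():
    """Register the SIGTERM handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(jobs, process_job):
    """Process jobs in order until done or until shutdown is requested.

    Returns the jobs left unprocessed so the caller can hand them back
    to a queue instead of losing them.
    """
    pending = list(jobs)
    while pending and not shutdown_requested.is_set():
        process_job(pending.pop(0))
    return pending
```

An app would call `install_handlers()` once at startup and then run its work loop; keeping the handler tiny and finishing the current job before exiting is one way to shut down quickly and cleanly.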

Jon: If I were writing this, I might have decided to go with the Ten-Factor App instead of Twelve and just had processes include three sub-things. This is the third thing we’ve talked about that’s just really about processes. The first one was make them stateless, the second one was use them for scaling, and the third one is clean up after yourself and make it so that it’s easy to kill them off, and your app is happy and fun. These are really all about processes being the first-class citizens of the app; treat them that way.

Chris: Similarly, dependency and port binding could have been considered as one as well, so now we’re down to nine. Before knowing it, we can get down to the Three-Factor App. Or actually I should say, Five-Factor App which is what we’re going to be talking about next time.

Jon: Okay, so after disposability?

Chris: Ten. Factor 10 is dev/prod parity. The point this one is making is to keep all of your environments as close and as similar as possible. Don’t use different backend resources in dev versus prod.

Jon: MySQL in dev, Postgres in prod. That would be scary.

Chris: Exactly. Or using something like SimpleDB locally and then […] in prod. You want to keep your environments as close as possible and you also want to use Continuous Integration and Continuous Deployment to keep the gap between those small. Again, with modern CI/CD pipelines, you want to be deploying very often and you want to keep those deploys small, the amount of change small, and that reduces your risk.

If you go and make a bunch of changes to your staging environment, and that goes on for weeks, or months, or whatnot, then when you go announce and say, “Oh, now I’m going to deploy to prod,” this ends up being a big deal where it’s all hands on deck, what’s going to go wrong, maybe we have downtime, and that’s because you batched up all that stuff and the gap has gotten really big. Resist that. Handle change much more frequently. Keep the changes smaller. Keep the gap between your environments as small as practical.

Jon: There’s another piece to this that is really important and it’s not always feasible. If you’re a global company with millions and millions of users, you might not be able to have your stage physical infrastructure match your production physical infrastructure, 100,000 machines on each. But do it to the extent you can. It does feel like you should pay the extra $50,000 a month for having two servers in stage as well as prod.

Don’t skimp and try to save money by having a noticeably different physical architecture in your staging environment because you’re going to miss a few bugs. Especially if you go from prod having two of something down to one of them in stage, that’s often a place where you’ll have issues. Maybe it’s fine if stage has two and prod has three, but it really is never okay if stage has one and prod has two.
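The "one versus two" rule of thumb can be sketched as a small check, shown here in Python. This is only an illustration: the function `parity_warnings`, the dictionary layout, and the component names are all hypothetical. It flags a different backing service between environments, and it flags the case where one environment runs a single instance and the other runs several.

```python
def parity_warnings(stage, prod):
    """Compare two environment descriptions and return readable warnings.

    Each environment is a dict: component name -> (engine, instance_count).
    """
    warnings = []
    for name in sorted(set(stage) | set(prod)):
        if name not in stage or name not in prod:
            warnings.append(f"{name}: present in only one environment")
            continue
        stage_engine, stage_count = stage[name]
        prod_engine, prod_count = prod[name]
        if stage_engine != prod_engine:
            warnings.append(f"{name}: {stage_engine} in stage but {prod_engine} in prod")
        # One instance versus several hides a whole class of bugs
        # (load balancing, replication, failover), so flag it explicitly.
        if (stage_count == 1) != (prod_count == 1):
            warnings.append(f"{name}: single instance in one environment, multiple in the other")
    return warnings


stage = {"database": ("mysql", 1), "load_balancer": ("lb", 2)}
prod = {"database": ("postgres", 2), "load_balancer": ("lb", 3)}
for warning in parity_warnings(stage, prod):
    print(warning)
```

Two load balancers in stage against three in prod passes quietly, which matches the "maybe it’s fine" case above.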

Chris: Absolutely. Definitely look at those situations and ask yourself, “What is the required effort here that will reduce that risk, but also doing it the right way that’s not going to break the bank?”

Jon: One issue with all hands on deck, we’ve had it happen in our own company where because we didn’t test things, we didn’t have two load balancers or two databases or whatever in staging, one issue that brings all hands on deck and people spend 5–10 hours trying to figure it out, that’s just two years worth of loss. All the money you just saved by not having two of those things in staging, you just lost it by having all hands on deck working at their high bill rates.

Chris: Absolutely. Just an out of […] site or whatever. How many lost sales or lost customers? We have incredible tools now, doing things like infrastructure as code, so being able to spin up environments. These things are not running all the time, either. You can do it with 10 servers or whatnot, but they only have to be running for a few hours, maybe.

Moving on, 11 is logs. This is just saying, treat your logs as event streams. Your app really shouldn’t concern itself with routing or storage of the logs and don’t attempt to write or manage the log files. Think of it as a stream. Logs are important. This particular point, the reason why it was written, any idea what motivated them to include this?

Jon: Yeah, because of Heroku, and you weren’t allowed to write logs. Heroku was responsible for managing logs for you. That helps them make sure that you’re not painting yourself into a corner when you decide to plug into Heroku.

Chris: I’ve never used Heroku. I was never a customer. When I read this, what comes to mind is running out of disk space. This is the reason why they included this in here because if you write into files, at some point your disk can get full and that’s a deal killer. Don’t write to disk.

Jon: And with Heroku, you would deploy stuff to Heroku and then you would do Heroku logs, then Heroku would send you back the last 100 lines of your log which just wasn’t sufficient. Then, you could sign up for an additional log service where you could go search logs and do whatever you wanted. But the main thing was that if your application just spit logs to standard error and standard out, Heroku would take care of them from there. It all lines up.

Chris: That’s exactly what this point says. Write to standard out and have something else capture that and do whatever it needs to do with that stream, whether it’s something like a PaaS service like Heroku that’s capturing that for you, whether it’s Docker, letting Docker capture that information with its log driver, or whether you’re shipping them over to a third party service like Sumo Logic, or Loggly, or whatnot.
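As a rough Python sketch of "write to standard out and let something else capture it": the app below never opens a log file. It emits one JSON object per line to a stream, stdout by default. The helper `build_logger`, the class `JsonLineFormatter`, and the field names are illustrative choices, not anything prescribed in the episode.

```python
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name, stream=None):
    """Return a logger that writes events to a stream (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    logger.handlers.clear()
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("orders")
    log.info("order %s accepted", 42)
```

Routing, retention, and search are then the job of whatever is reading the stream, so the application code stays the same when that collector changes.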

Jon: Drumroll. Number 12.

Chris: Here we are, 12, last one. Admin processes. This one feels anti-climactic because it’s not all too terribly interesting, but it’s just about any utility code, or admin, or management tasks that you need to do. Examples would be code for doing database migrations, or maybe you have a one-time script that’s run to clean up some data, or to change the format of something, or maybe go fix something in your database. All that code should be packaged up with the application code in that same repo. It should leverage all the same shared libraries. It should be a part of that code base. Run it alongside the rest of the app, so you don’t have any synchronization or drift issues with it.

Jon: So that you don’t get, “Why didn’t my admin task work? Oh, well because I updated the app or the database schema, or whatever.” If you’d just kept them together and tested them together, then you wouldn’t run into that. It also says run them as one-off processes, and there may be times where that’s not necessarily true. That means if you’ve got admin stuff that’s doing database cleanup or file cleanup or something, don’t have your main process do that.

But I could imagine that sometimes you might start building some of that into your own console and it could actually be running inside whatever process is doing stuff. In the case of microservices, then it’ll be totally separate because you’ll probably write your own separate admin microservice that takes care of that stuff.

Although, just arguing with myself here, if it’s something like cleaning up a database, you’ve got one database per microservice. The process that cleans that database is likely to be the same process that accesses it, the same main process. I read that in here, “Run admin management tasks as one-off processes,” and maybe sometimes you won’t do that.

I’m thinking about Heroku again. Heroku came from the Rails world largely and there’s this thing called rake. You can run these rake commands and rake commands can go do stuff. They’re really pretty interesting because they can access the main code base and they could do something.

They could say, “Go use all this code I wrote and call this function inside of it.” Or they can say, “Go talk directly to the database that this code knows about and do something to that database.” They can do various things, but they’re run off as separate processes. That’s part of where this comes from, that world of rake tasks where you’re taking advantage of the code you wrote in the monolith but running outside of the monolith process. I think that idea is where that comes from.
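The same idea can be sketched in Python rather than rake. Everything here is hypothetical: the `users` table, `normalize_email`, and `fix_emails` are made up for illustration, and SQLite stands in for whatever database the app really uses. The point is that the one-off task imports and reuses the app’s own rule instead of re-implementing it, and it runs as its own short-lived process.

```python
import sqlite3


def normalize_email(email):
    """Shared application logic: the app and the admin task both call this."""
    return email.strip().lower()


def fix_emails(conn):
    """One-off admin task: rewrite stored emails using the app's own rule.

    Returns the number of rows that were changed.
    """
    rows = conn.execute("SELECT id, email FROM users").fetchall()
    changed = 0
    for user_id, email in rows:
        fixed = normalize_email(email)
        if fixed != email:
            conn.execute("UPDATE users SET email = ? WHERE id = ?", (fixed, user_id))
            changed += 1
    conn.commit()
    return changed


# Demo against a throwaway in-memory database.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
demo.executemany("INSERT INTO users (email) VALUES (?)",
                 [(" Pat@Example.com ",), ("sam@example.com",)])
print("rows changed:", fix_emails(demo))
demo.close()
```

Because the task lives in the same repo and calls the same function, a change to the normalization rule reaches the app and the cleanup script in the same release.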

Chris: A lot of this is going to come down to perspective, and how you define things, and whatnot. You could have this […] task that every six hours goes and does some pruning of your data, and you could still call that a one-off process. It’s just a scheduled task that someone […]. The important point is that it’s still leveraging the same code bases, whatever data that it’s going to input. Again, the microservices model serves us really well here and all of these principles really apply to that. If it’s accessing the database of a microservice, it probably belongs and is part of that same app.

All right. Maybe just to wrap up quick, thinking about, “What’s missing?” It’s been seven years since this was first published. Is this complete or are there things that we should continue to add to this? It’s definitely ripe for update and there’s a lot of things that could be added. Things like testing is really not here or there’s really not a lot about robustness. We could argue for a bunch of other characteristics to be added to this.

There’s been some conversations about this and one that’s gained traction argues for adding three additional factors, and those are telemetry, security, and an API-first philosophy. These all make sense. You can think of telemetry as being logs++. It’s not just logs, but it is metrics and everything else that’s associated with it. That makes a lot of sense.

Jon: How long requests are taking.
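A small Python sketch of that kind of metric, assuming an in-memory dictionary as the store. The `timed` decorator, the `durations` dictionary, and `get_order` are hypothetical names. A real service would more likely hand these numbers to a metrics system than keep them in process memory.

```python
import functools
import time
from collections import defaultdict

# Metric name -> list of observed durations in seconds.
durations = defaultdict(list)


def timed(name, clock=time.perf_counter):
    """Decorator that records how long each call to the wrapped function takes."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = clock()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the handler raises.
                durations[name].append(clock() - start)
        return wrapper
    return decorator


@timed("get_order")
def get_order(order_id):
    return {"id": order_id}


get_order(7)
print(len(durations["get_order"]), "request(s) timed")
```

The `clock` parameter exists only so the timing source can be swapped out, for example in tests.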

Chris: Security […] super, super important, especially nowadays, it’s only gotten more important. So, directly addressing that feels pretty important. This concept of API-first philosophy, this is definitely the way we build apps now. Expose the functionality via API and that’s how you talk to these backend services. Your app could be a backend service for another app and that’s the whole microservice model and philosophy.

Those three definitely make sense and again, we could probably talk about even more than that, everything that goes into building a really good web app. But as we started off with this, if you just do these 12 things and ask yourself, “Hey, how am I doing on these 12 things?” you’re way ahead of the game. These all still really do apply. It’s like table stakes.

Jon: That was super interesting, and this week we got a very technology-focused view of architecture. Next week, we’ll take a step back and we will look at architecture and running software more from the point of view of the whole business. I’m looking forward to that. Thanks very much.

Chris: Thanks, guys.

Jon: Thanks, Chris. Thanks, Rich.

Rich: Bye.

Jon: Bye.