My Journey to Achieving DevOps Bliss, Without Useless AWS Certifications

The story of how I transformed from a naive full-stack engineer struggling with AWS and PaaS providers to loving my life and achieving DevOps bliss. Make sure to read ’til the end to find a free daily email course!

“BEEP BEEP BEEP BEEP BEEP”

The “shiny phone” was shouting at me and flashing its light like a tiny little ambulance, illuminating the dark room for fractions of a second at a time. I had been dreading this moment. I pulled the sheet over my eyes and closed them tightly. Maybe if I just hid under the blanket for 10 more seconds it would stop.

“BEEP BEEP BEEP”

The shiny phone’s alarm droned on. “Never again,” I thought to myself as I dragged myself out of bed and stumbled across the room to the light switch, nearly tripping over some overflowing pillows I’d thrown on the floor during the night. As the lights blazed on, I closed my eyes tightly again as I tried to adjust.

The year was 2010, and I was a newly graduated computer scientist ready to take on the world, or preferably conquer it, though most definitely not “bright-eyed and bushy-tailed” in this moment.

My childhood was less than cushy, and I had dreams of grandeur. I conjured up images in my head of Forbes magazines with my face: “The next Mark Zuckerberg!” or “30 under 30” (missed that one), “Is it web scale? Yes!” My parents had gone through a divorce that resulted in some poor financial situations, to say the least, and I wanted to be able to provide for not just myself, but them as well. I had dreams of buying them homes, and paying their debts, and taking care of my siblings too if I could.

My dad would always ask, and still does: “Are you rich yet? Can I retire?”

“I’m working on it!” I’d reply.

And now it was finally my moment. I was at my first full-time gig as a “UX Engineer” building the new front-end for a startup’s platform. I’d been studying high scalability and DDD in my spare time in college and I was ready to take on the world.
I was the 8th employee, so I figured my equity would probably be a pretty good start when the company blew up, too. Turns out it wasn’t: we were acquired 11 days before my vesting period. So it goes.

Anyway. This meant that, as part of my junior engineering job, I was on rotation for carrying “the shiny phone.”

“The shiny phone” was the 24/7 technical emergency line established for the start-up I had joined. If things were blowing up, someone had to know, and react, quickly! The format in which this message was delivered? “The shiny phone.” And boy did it shine!

The shiny phone was a flip phone (’cause those were still a thing), which, as you may have guessed, was made out of a shiny material. Clever.

The CTO, Kevin, had decided that after several months on the job, I was sufficiently ready to be put on the rotating list of individuals who received the phone. Each person who got the phone was responsible for responding to any system alerts that were sent to it via SMS (repeatedly) at any time while they had it.

I secretly dreaded my turn. I didn’t know if I would be ready if it went off. After all, Sergey was a SQL wizard, and Kevin could pull together new features overnight like Richard in Silicon Valley. “Hey guys, I rewrote like, everything last night and made it way better, hope you don’t mind!”

I also wanted to be able to handle an issue if it came up. I wanted them to think “this guy’s got it, glad we hired him!” But mostly, I just hoped it didn’t happen.

And now in the dead of night, on the third day of the third week of November, at 3:37 AM, the shiny phone had gone off.

“BEEP BEEP BEEP BEEP BEEP BEEP”

I stumbled back over to my nightstand and flipped open the phone. I remember feeling a chill run through my body and down my arms. It was a cold night even though the heat was on.

23 messages.

“FATAL ERROR: DB Server is down. Unable to connect to 10.10.0.1”

“FATAL ERROR: UI is down. Unable to connect to https://ourwebsite”

“BEEP BEEP”

More of the same.
😳

I grabbed a sweater out of my dresser and put it on as I walked across the room to the corner where my laptop sat on a flimsy rolling desk I’d bought for about $20 for my college dorm room four years prior. I sat down in my cheap, uncomfortable, also probably about $20 office chair. “Ok, I got this,” I thought as I rolled myself into position over some poorly placed wires on the hardwood floor.

I had a checklist of what to do if something went wrong… what could go wrong? Err, what else could go wrong?

First, I had to create a remote desktop connection to my desktop at the office. Check.

Next, I had to log in to the admin portal to check the statuses. Uh oh… the admin UI isn’t loading…

“Can I ping the servers?” Nope. No ping. “This is bad,” I thought as I tried a few more IP addresses. It seemed like everything was down.

Against my pride, I knew I needed to wake Sergey up slightly before four in the morning. I fumbled with the bright blue buttons of the shiny phone as I tried to look up his number, conveniently located as one of the only few numbers in the pre-touchscreen device. “What a poor UX,” I thought to myself as the phone began to ring.

“Hello,” Sergey answered in his heavy Russian accent.

“Hi Sergey, sorry to wake you,” I explained. “The shiny phone is going off. I tried following the checklist to restore the services but nothing is working.”

“Ok, I’ll look at it,” he groggily replied as he hung up the phone.

Guess I was done?

As I crawled back into bed I was a bit relieved, and also a bit disappointed as I realized this was not something I could have possibly fixed at the time. As it turns out, there was a failure at a datacenter where we rented servers. Someone had to actually go and replace a physical machine in a physical location to get things running again.

It was at that moment I wondered how companies anywhere are able to keep everything up at all times.
I wondered how it was possible to deploy systems that could scale to meet the demands of enterprises, were nimble and flexible enough not to be cost prohibitive for startups, and were highly available. Mostly, I just didn’t ever want to have to have another shiny phone.

It wasn’t until four years later that it would be my responsibility to be in charge of setting up a system from end to end, and making sure it stayed up. I had grown a lot from an engineering perspective, but the extent of my Operations and production deployment experience thus far was deploying to Heroku, or making a Pull Request that an Ops team would be responsible for getting into production later on. To me, Ops was still an afterthought.

I had been focused on learning how to build distributed systems with Node, while being a full-stack engineer with a front-end focus (Backbone was the flavor of the day). I was lucky enough to have gained an amazing mentor, Matt Walters, as a colleague at a FinTech company in the couple of years prior, where we built a private equity trading platform together.

Something he often told me: “With simplicity of services, some complexity necessarily moves to the architecture.”

And now, as the CTO of my own startup, I was responsible for ALL of the complexity, no matter where it lay. I was smart going into the project, though. I knew I needed good design patterns, and not a monolith, and I could definitely sorta draw a chart that described a system that would work, and so I went for it!

At this point I’d spent most of my efforts trying to engineer and architect any potential problems away. I was still experimenting with design patterns and architecture. I hadn’t accepted that failure of services and infrastructure was inevitable. I just thought those before me needed to be better. “If they used this pattern, or this language, or this tool…” I’d tell myself.
At the time, architecture, for me, didn’t even consider the infrastructure I was running on or how it ran there, or even much about how it got there for that matter.

So what did I do for Ops? What most other engineers I knew did: used a PaaS. One could say I took a “PaaS” on learning AWS. 🤣

A PaaS is a “Platform as a Service”. These are platforms like Heroku or Modulus (gone).

I knew I needed to learn AWS. I knew if I wanted to be able to automate all of my problems away some day, I’d need to (so I thought then), but for now, I had things to build, and quickly!

Looking back, my deploys were pretty terrible then. They looked something like this:

1. Care way too much about branches and GitFlow
2. Eventually have a master branch you want to deploy
3. Tag a release
4. Deploy it using a manual process (even if that is typing a few lines into your CLI)
5. Oops, env configs are wrong, reconfigure those using awkward UI on PaaS. I know there’s a CLI way to do this. I should learn that some day.
6. Ok I need a database… Go to some DBaaS website and spin up a MongoDB with Replica Sets. Think “that sounds complicated to configure”
7. Awkwardly change more environment variables
8. Oops, something’s broken, repeat 1–5.

There are a lot of problems with this. It’s an error prone manual task. It’s not easily repeatable. You need to know things other than “git commit”, your services communicate over the internet, and of course it’s painful and something you want to avoid. So then you do it less frequently, which amplifies your need for GitFlow and process and time spent deploying and changing delicate sets of carefully configured PaaS instances, which actually makes every deploy more risky, and more complicated.

On top of that, now you are trying to remember two different states of the world you created, master and dev, so you can make hotfix branches.
And you’re cool and using “microservices” so now you need to remember states for like 14 branches, and how 7 different projects are dependent on each other in different ways, like mixes of shared databases and logs of change events from databases that trigger updates on other services. Your junior engineers are confused about GitFlow and creating hotfix branches so you need to teach them that, and someone wants to add a cache, and you’re like “lol does our DBaaS have Redis too??”

Architecture and Design Patterns to the rescue! As Udi Dahan said (sarcastically): “Throw all the D’s at it! DDD! TDD! BDD! ADD!” We will engineer all the problems away!

And so I tried throwing D’s at it. I thought it was great, and the UI was. I used CSS animations and it felt like a native app in a web browser. It was real time thanks to Meteor, and by being accepted as one of the first companies invited to their beta DevOps platform, Galaxy, I was again able to shift away the responsibility of DevOps and deployments.

Once again, I’d avoided learning AWS.

The problem with that is that eventually I stopped using Meteor, and started using more pure Node. Again I was left with only PaaS providers as an option. I got to the point where I was running about 30 services per environment, and the PaaS cost was really starting to grow at $15/month each, plus database services, and complicated deploys literally by the dozens. Given how expensive engineering time is, factoring in the time spent deploying things, the costs were not insignificant.

I lived with this for a few years, still focusing my time on better engineering: better design patterns, scalable microservices, architectures, proxies. But I still put off the dreaded task of learning AWS.

I still kept failing to achieve the simplicity and beauty that I wanted to achieve. Something was missing. I had heard rumblings about things like codified infrastructure and Continuous Deployment.
I was pretty sure I needed these things, but I had no idea how to get there. I did know one thing though: a giant shadow of a mountain called AWS loomed over me, and I knew I needed to reach the top to find the answer.

So, I did what I like to do when I want to learn things fast: I bought a course on AWS!

And I hated AWS. I hated the UI. I hated that I was being taught about UIs instead of programming. I hated that what I was being taught would result in hard to reproduce, manual, and error prone tasks. And I hated that I didn’t get any benefit beyond AWS’s closed infrastructure when there were cool things going on with Digital Ocean, Google Cloud, and Azure. I couldn’t say for sure if AWS would be the platform I wanted to use in a year or two or three. Maybe Google would blow them away. Maybe they wouldn’t.

But still, for a couple of months, when I could bring myself to get over my hatred (or fear) of AWS, I would watch the videos, and go into the AWS UI and do the things. Until eventually, I decided there needed to be a better way. After all, I didn’t want to spend a bunch of time getting myself locked in a box.

Needless to say, I didn’t get very far with my AWS course.

I have now built the DevOps systems for a website with 14+ million yearly users, a blockchain company, two artificial intelligence APIs, and a FinTech platform, each better than the last.

And let me tell you a secret… It definitely WAS NOT from learning AWS.

It was from abandoning the then standard practices and focusing on a new emerging technology that was getting a ton of hype at the time: containerization.

So I started to read everything I could about containerization. I went and bought several books on Leanpub, I read blogs, and docs, and code. AND CODE.

And I loved it.
There was an entire ecosystem of tooling built for deploying containers, I realized, and if I could just learn how to package my code into a container, well, then deploying would be easy!

One of the most exciting things I read, once I had really figured out how to use containers well for all sorts of development tasks that made my life easier, was that Docker was coming out with a tool called a “DAB”, or distributed application bundle. You’d be able to simply write a YAML file that describes how your system should run, and then just give that to something called an orchestrator, and it would just do all of the other AWS stuff for you. Orchestrators created volumes and mapped them, plus security groups, ASGs, load balancers, and a whole bunch of other stuff that I hadn’t even heard of, all way better than I could have ever made it myself.

Beyond that, they literally manage containers in production for you, I learned. As in, they allowed you to scale them with a single command, they would automatically recover from failures by restarting the service, and all you had to do was literally click a button to set it up!

Instead of learning AWS, I realized, I should learn containerization, and then all I need to do is provide MY containers to an orchestrator, which I set up on AWS by clicking a button! Perfect!

All I had to do was click a button in order to get a production ready system on AWS. Orchestrators ✅.

Containerization is kinda like the big shipping container yards you see in movies, where various nefarious activities are happening in the dead of night. Or some poor soul is lost and no one knows which container he’s in.

Imagine if all of those things inside all of those containers were not in containers. Imagine if instead tens of thousands of items were just sprawled across some big empty lot. The shipping and unloading process would be quite a bit more complicated!
My mind shifts to mental images of giant cranes trying to pick up items like one of those annoying crane games at a highway rest stop, taunting you with some new Beats headphones and a defunct claw.

The reason the cranes can do their jobs of picking up and organizing containers is that containers are all standard dimensions. A crane can pack them into neatly defined rows until they are at capacity. So all you need to do is figure out how to get your stuff into the container, and it will get shipped.

Cranes are very powerful tools that a human can simply operate to arrange, move, order, ship, and receive containers.

Orchestrators are kinda like cranes. Orchestrators are also powerful tools, ones that basically operate themselves to deploy, assign, recover from failures, and inspect software containers. They know what servers have capacity, and they put your containers there, and wire them up to networks to run.

Moreover, containers solve a whole bunch of other problems too, like the infamous “my code runs here but not there” bug, and various testing scenarios.

So I went all in on containers. I learned the ins, outs, ups, downs, and in-betweens because I knew if I did, then I could just give them to an orchestrator, and then that thing would handle all the AWS or Azure or Google Cloud stuff I needed!

This is what really freed me. There were countless benefits beyond what I had originally intended, such as being able to take advantage of a whole ecosystem that was previously unavailable to me.

That’s not to say there still wasn’t more to learn. There was. A lot. But knowing that one piece was the key that connected the two worlds of development and operations.

How could I not get on top of this revolutionary technology?

All I needed to do was create a single file, shipped with each project, that described the environment it needed. Also, the hard part of this environment was already created and maintained for me by the Node team.
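
To make that "single file" a little more concrete, here is a minimal sketch of what a Dockerfile for a small Node service might look like. It is an illustration only, not the exact file from any project in this story: the `node:20-alpine` tag, the `server.js` entry point, and port 3000 are all assumptions you would swap for your own.

```dockerfile
# Start from the official Node image, so the hard part of the
# environment (OS, Node, npm) is maintained upstream by someone else.
# The version tag is an assumption; pin whichever release you target.
FROM node:20-alpine

# Run everything from a dedicated directory inside the container.
WORKDIR /app

# Copy only the dependency manifests first so Docker can cache the
# install layer when application code changes but dependencies don't.
COPY package*.json ./
RUN npm ci --omit=dev

# Copy the rest of the application source.
COPY . .

# Hypothetical port and entry point for this example.
EXPOSE 3000
CMD ["node", "server.js"]
```

Building it with `docker build -t my-service .` produces an image that runs the same way on a laptop as on whatever servers an orchestrator places it on (`my-service` is just a placeholder name).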
Then I just needed to give it to the orchestrator.

“With simplicity of the services, some complexity necessarily moves to the architecture… and infrastructure, and process,” I thought. “The key that connects all of those pieces is the Dockerfile.”

Containerizing my apps was the starting point that unlocked all of the other potentials and benefits, because when you enter the ecosystem of containerization, not only do you get orchestrators, but you get libraries and libraries of production ready systems like databases that other people have already containerized!

Do you remember coding before having a package manager like npm? Imagine how difficult today’s programming tasks would be without something like npm. If you aren’t using containers and orchestrators, that’s what you’re missing out on for entire deployed databases, and queues, and caches, and probably most things you can think of that you’d want to deploy!!!

Want to know how I install a database? Add a couple of lines to a YAML file.

Are you understanding how powerful this is? Things like running Redis used to look like a mountain, and now with the containerization tools I had, that mountain was literally about five lines in a YAML file.

Someone asks “Can we run XYZ?” And you’re like “Yup…. anndddd…. Done.”

How many hours a week would that save you and your colleagues?

Now you’re probably thinking: great, a new mountain of things to learn and understand.

I feel you. These are the same struggles that I went through. That’s why I’ve made it my mission to help engineers like you bridge the gap between building applications and running them in production. I also want to teach you the simple cloud native microservice patterns I use to build scalable systems for billion dollar enterprises.

Simplicity is everything to me. Complication is the enemy and you need to fight it. You must be one with the cloud. Got it?
To get you started on your journey, I have created a FREE 21 day email course, “Getting Dangerous With DevOps”, that will lead you through containers and orchestrators as it teaches you how to deploy a Next.js application to AWS without wasting your time on useless AWS certifications!

If you want to receive my free email course “Getting Dangerous With DevOps” with 21 bite sized lessons sent to your inbox daily, head on over to https://www.devopsbliss.com to sign up!

If you’ve found this useful and want to say thank you, the best way to do so is sharing my story with someone else who might find it useful!

Want to learn more about me and my story? Click here to read about me and how my agency, Unbounded, can help you achieve DevOps Bliss.