SREies is a series on topics related to my job as a Site Reliability Engineer (SRE). About a month ago, I wrote an article about what it means to be an SRE, which included a compatibility quiz and a resource list for those who were intrigued by the role. If you are unfamiliar with SRE, I would suggest starting there before moving on.
In this series, I will extend my description to include more specific summaries of concepts that I have learned during my first six months at Dropbox. In this edition, I will be discussing Configuration Management.
As I mentioned in the previous article, I came to SRE with little experience in Software Development or System Administration. Prior to making this career change, I taught High School Mathematics and Spanish for six years, where I enjoyed breaking down complex concepts. When I felt it was time to seek a new challenge, I decided to attend a Software Engineering Bootcamp called Hackbright Academy. Hackbright is a 12-week intensive in which each student spends 6 weeks learning the fundamentals of software development and 4 weeks applying those skills to developing a capstone web application project. You can imagine that while developing a project on your personal machine is a great experience for a beginner, its limited scope leaves little opportunity to learn about scalability and large-scale infrastructure.
Working at a data storage company like Dropbox has allowed me to learn a lot about large-scale distributed systems. This series is my way of combining my love for breaking down complex concepts with my new learnings as an SRE. One key concept of large-scale distributed systems is Configuration Management.
According to this awesome Introduction to Configuration Management tutorial by DigitalOcean, Configuration Management is:
“…the mechanism used to make the server reach a desirable state, previously defined by provisioning scripts using a tool’s specific language and features.”
I know what you’re thinking: “What the heck does that mean?” Let me try to break it down with an example.
Let’s say I decided to open a bakery called Krishelle’s Kakes that was instantly a hit (I actually did use to bake cakes!). Imagine this bakery serves pastries, cakes, cupcakes and chocolates, and that every day the line is out the door. The business has become so popular that I immediately decide to expand and open other locations around the state. With this expansion, though, I really want to make sure that the customer’s experience is the same no matter which location they visit.
How would you solve this problem?
Some things that come to my mind include:
These are very similar to the types of questions a system designer asks when building a large scale distributed system. Here are some of the solutions that Configuration Management (CM) provides:
In other words, Configuration Management Tools offer a framework through which SREs can automate the process of configuring a machine in a large system of machines. Configuring a machine means making sure the machine’s setup matches other machines like it, and that it has the correct software and scripts needed to perform all processes assigned to it. This process is called “provisioning” a machine.
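To make the idea of provisioning a bit more concrete, here is a minimal sketch in Python. It is not how Chef or any other real tool is implemented; `Machine`, `converge`, `WEB_SERVER`, and the package versions are all hypothetical names I made up for illustration. The point is only the shape of the process: describe the desired state once, then bring every machine in line with it, changing only what is out of place.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """A toy stand-in for a server: a name plus its installed packages."""
    name: str
    packages: dict = field(default_factory=dict)  # package name -> version


def converge(machine: Machine, desired: dict) -> list:
    """Bring `machine` in line with `desired` and return the actions taken."""
    actions = []
    for package, version in sorted(desired.items()):
        current = machine.packages.get(package)
        if current == version:
            continue  # already correct, so leave it alone
        verb = "install" if current is None else "update"
        machine.packages[package] = version
        actions.append(f"{verb} {package} {version}")
    return actions


# Hypothetical desired state shared by every web server in the fleet.
WEB_SERVER = {"nginx": "1.24", "python": "3.11"}

fresh = Machine("web-1")                                      # nothing installed
stale = Machine("web-2", {"nginx": "1.18", "python": "3.11"})  # old nginx

first_run = converge(fresh, WEB_SERVER)   # installs both packages
stale_run = converge(stale, WEB_SERVER)   # only touches nginx
second_run = converge(fresh, WEB_SERVER)  # nothing left to change
```

Running `converge` a second time on the same machine does nothing, which is what makes it safe to apply the same definition to every machine over and over.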
So we’ve talked about what CM is: a framework that allows for automated machine provisioning. Now let’s dig deeper into why this is important.
The automation of machine provisioning helps to minimize problems caused by development environment discrepancies.
Building off of the Krishelle’s Kakes example, think about what would happen if one location received a convection oven, while another received a conventional oven. Even if you don’t know the difference between the two, just know that they have very different heating properties.
The difference in ovens means a difference in the environment in which the goods are baked, which would likely change the way they turn out. Even though each bakery received the same recipe, the change in environment (that is, a different oven than the recipe was written for) would cause inconsistency across the locations. The baked goods may even come out bad! The main idea here is this: the environment that the baked goods are cooked in will have a huge impact on the result. The same is true with software.
The environment (meaning anything the code depends on to work properly) needs to be consistent. It should be the same on the machine where an engineer develops the code as on the machine where the code will run. When applications are deployed to production and shared between co-workers who might have different machine setups, we need a process to make everything consistent. Configuration Management provides a framework for creating a centralized definition of dependencies and does the work of replicating environments with the exact same software and configurations.
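Here is a small Python sketch of what a "centralized definition of dependencies" buys you. Again, this is an illustration under my own assumptions rather than the behavior of any particular tool: `REQUIRED`, `find_drift`, and the versions listed are hypothetical. With a single definition, it becomes easy to check whether a laptop and a production machine really are the same environment.

```python
# Hypothetical central definition of what the application depends on.
REQUIRED = {"python": "3.11", "flask": "2.3", "postgres": "15"}


def find_drift(environment: dict, required: dict = REQUIRED) -> dict:
    """Return {dependency: (expected, found)} for every mismatch."""
    drift = {}
    for name, expected in required.items():
        found = environment.get(name)  # None means the dependency is missing
        if found != expected:
            drift[name] = (expected, found)
    return drift


# Two made-up environments: the "convection oven" and the "conventional oven".
laptop = {"python": "3.11", "flask": "2.3", "postgres": "15"}
production = {"python": "3.9", "flask": "2.3"}

laptop_drift = find_drift(laptop)
production_drift = find_drift(production)
```

In this example the laptop matches the definition, while production has an older Python and no Postgres at all, exactly the kind of discrepancy that makes code work on one machine and fail on another.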
The most popular Configuration Management Tools include:
One of the projects I worked on at Dropbox was with Chef. Chef involves several important features:
The cookbooks hold all the important configuration data that developers write, and that data is what provides all of the great features I described above. If you are interested in learning more about Chef or one of the other configuration management tools, I have included some resources below.
As mentioned earlier, DigitalOcean has an excellent set of tutorials to get you started with Configuration Management from a beginner level.
Here are some other tutorials/articles that you might also find helpful.