Microsoft, The Unit of Deployment, and Win/Win Kubernetes with Co-Founder Joe Beda

We recently had the opportunity to interview VMware's principal engineer, Joe Beda, one of the creators of Kubernetes, as well as Google Compute Engine. He also co-founded the cloud-native leader, Heptio, which is now a part of VMware. Joe is an experienced software engineer who has worked at both Microsoft and Google. We have included the entire interview transcript below, and we hope you enjoy it!

The Interview with Kubernetes Co-Founder Joe Beda

Evrone: Hello. It's nice to see you! Let's start our interview. Nowadays, virtualization helps applications abstract away from the hardware. Using containers means abstracting away from the operating system and saving a huge amount of system resources. Platforms such as Kubernetes free us from manual control, which means we don't have to care about how and where the app is running. Does that mean that we are losing control and that troubleshooting for customers becomes more difficult?

Joe: Hey, thanks for inviting me over! I don't think it's necessarily abstracting you from the operating system. I mean, you still know that you're running on Linux when you're running something inside a container.

I think there's a couple of things that it does that do help us quite a bit towards trying to create that magic outcome of: “Just take my code and run it. I don't care where it runs, just make it work.” The first thing is that there's an efficiency argument with containers, often combined with virtualization. You can pack more in on the same hardware; you can go smaller, you can have more fine-grained overcommits and tradeoffs between things.

But even more important is that, with containers, you have a much more packaged, isolated unit of deployment. So we finally have something, at least on the server side, where we can run a program on the laptop, we can run it in one cloud, we can run it in another cloud. And pretty much that same program can run.
That same artifact can run across all those different environments, relatively unchanged. And having that unit of deployment is an incredibly powerful building block for us to be able to work with.

Running CPU processes is great, but things aren't useful, generally, unless you have networking and storage. That's where the real complications come into it. Just taking a container and doing the bin packing problem, so that you can get containers running on a set of hosts, is one thing. But making it useful, so that you can get networking plumbed through, you can have these things talking to each other, you can have the right storage at the right place at the right time - those end up being some of the compounding factors that make this a really hard problem.

Evrone: You mentioned that we can assume that a container is running atop some Linux kernel, but Microsoft is now intensively developing the Windows Subsystem for Linux. Do you think it will be a good start for a Windows container integration era?

Joe: I don't have a ton of experience with the Windows Subsystem for Linux. Some folks that I used to work with at Microsoft are driving some of that stuff. It's really interesting, and it just shows how small an industry this is, that we oftentimes intersect with folks in new ways over time. But my understanding there is that the second version of the Windows Subsystem for Linux is really a Linux kernel running in a virtual machine, with some fine-grained interactions between that and the rest of Windows file systems and networking.

When I have played with it, I've been really impressed; it's really interesting technology. With that, you can run a container engine, such as Docker, directly there and get an experience that starts to feel like you have Linux on your desktop, because you do. You have a kernel. You just interface with it in a little bit of an odd way compared to a more traditional Linux system.
I think it's important to recognize that that's a completely different mechanism than native Windows containers. And native Windows containers are challenging because Windows wasn't built to be driven in the way that containers expect. There are global registries, global resources, and creating a way to isolate all of those things is a pretty big task. My understanding is that, in terms of maturity and usage, native Windows containers are still early on the curve.

Now, as Windows programmers approach this, clearly, if you're doing C++ or lower-level types of languages, you're going to see big differences, depending on whether you're targeting Windows with things like Win32 or targeting Linux. There are pretty profound differences in terms of the model there. But once you start getting to higher-level languages, I think what we find is that the differences don't matter as much. With Microsoft's point of view moving towards .NET Core, which is really translatable across these things, we're starting to see the separation of the programming model for those higher-level environments from the underlying operating system.

That being said, there's still quite a bit of .NET standard out there. There are still quite a few dependencies on native Windows DLLs, especially if you're doing things like media encoding. And there are still many places where folks need to run Windows workloads. That's why I think there's still work going on around containers for Windows. But with a lot of these things, like .NET Core supporting Linux, it hasn't been as critical as you might think it would be.

Evrone: You mentioned your career at Microsoft and working on the Internet Explorer browser. What is your most useful experience from that time?

Joe: With career development, you definitely learn across multiple different aspects.
If we talk from a technical point of view, there was a more senior engineer than me who implemented a memory profiling system, where essentially every allocation in Internet Explorer and the main rendering engine was tracked. We put a tag on each one, created hierarchies of those tags, and were able to do real-time visualization of where the memory was being spent. It was a debug mode that we could actually put on top of it.

And it was one of those experiences where systems get complex enough that you don't truly understand what's going on, and if you shine a light, you're always going to be surprised. This is something that I've seen again and again over my career. You have theories. You start testing those theories against what's actually happening with your code in the program, and you're always wrong. So you really need to look. I mean, it's good to have those theories, but you really need to validate those things. And you don't optimize until you actually verify that the thing that you're optimizing truly is the bottleneck, truly is the problem.

On the professional development side... This was early on, probably in an Internet Explorer 5 or Internet Explorer 6 cycle. I was starting to take on a little bit of a leadership role. And, keep in mind as I talk about this stuff, that Internet Explorer, at that time, targeted Win9x, in addition to NT. Those were pretty low-memory, low-power machines; the original versions of IE4, I think, ran on like 48 or 64-megabyte machines. So it was a different world than we're in today. But I was helping to actually drive the performance of the Internet Explorer engine, the rendering engine, and this was a critical competitive win against something like Netscape, our main competitor at the time.

There was real learning around those horizontal roles. Performance isn't a team; performance is something that actually is impacted by everything. So it was a real lesson in: How do you lead by influence?
How do you get people to be able to do things when you don't directly control what they're doing? And you know, I made some mistakes along the way early on. I was like, “We must do this, we must do that” and trying to use some authority (that I maybe didn't have) to try and make things happen. And yeah, that didn't work that well. So that was a big lesson around: how do you actually understand what people's incentives are? What are their goals? How do you actually create win-win situations for them, so that you can get people aligned with you when you don't necessarily have the authority to tell them what to do? That's just a skill that you continue to build through your entire career.

I think it shows that software development is, if nothing else, a team sport. You really need to get folks working together. And that matters as much as the core technology and just, you know, laying down a lot of great code.

Evrone: Oh, good old times. Did you prefer Internet Explorer or Netscape Navigator during that time?

Joe: There's a lot to be said about what happened back in the day. As I transitioned from Microsoft to Google, one of the biggest changes for me was that, in those early days, Google was very much focused on “let's just make things better for customers” versus “let's go and kill competition.” And I think that attitude of focusing on users, versus focusing on competition, has stuck with me now. I don't know if Google still actually does that.

Later on, I internalized that “let's focus on just doing the right thing for the users.” But early in my career, Microsoft was very much a “let's go out and kill competitors” type of thing. I look back on that, and I think that wasn't a healthy attitude in terms of the ways that IE was positioned in the market.

That being said, on the technology side, IE4, specifically, was so much better than Navigator at the time. And I'll give you an example.
We had, essentially, an in-memory model of the page that you could program against. The modern DOM that you get in browsers really started with IE4. It was that team that really drove the idea that the programming model and the markup were aligned, and whatever you could do in markup, you could go and tweak later at runtime with JavaScript.

Navigator, at that time, didn't have that sort of in-memory model. What would happen is that if you took the Navigator window and you resized it, it would actually reparse the original text of the page to re-render it, because it didn't have an in-memory model to re-render from. And sometimes it would actually redownload some stuff when you resized the window. What Navigator did for dynamic content, at the time, was to have multiple web pages painted on top of each other, and you could replace some of those things. This was Netscape layers.

From my point of view, IE4, at that time, really established what we consider the modern web browser now. There have been a lot of improvements; there's been a lot of great stuff done. You know, the standards were evolving at the time. Oftentimes, IE was ahead of the standards, and then the standards would change. And then you have the compatibility versus standards trade-off, which is always really hard. I don't think we necessarily had the tools or the techniques to be able to navigate those situations. But I really preferred IE at the time, because I thought we were doing something really interesting in terms of establishing the modern web browser. We called it Dynamic HTML, or DHTML, at the time, but, essentially, the modern DOM is the work that we did there.

Evrone: Yeah, we clearly remember that. In one of your speeches, you mentioned that Kubernetes is the platform of platforms. In your opinion, what is the most prosperous Kubernetes-based platform?
Joe: I think we're still learning, to be honest, and one of the reasons that Kubernetes is successful is that we don't necessarily need to have a single monolithic platform built on top of Kubernetes.

Let's explain this with a contrast. If you look at the extensibility model of something like Mesos, Mesos was essentially a toolkit kernel, and then you could build things like Marathon or Aurora or Chronos on top of it, which were these systems that would use the core scheduler to be able to do things. But what you found is that Aurora and Marathon and Chronos each had their own API. They had their own user system. They had their own way of modeling the world. Those things ended up being isolated silos.

One of the things that we did with Kubernetes is we created a common API for a common set of models, all this stuff around CRDs. What we find is that, as people extend Kubernetes, they do it in a way, oftentimes, that's interoperable with other extensions of Kubernetes. What that means is that the things that people are building on top of it are not running in silos, but are really enriched by the rest of the ecosystem. I think that's a big change that we had with Kubernetes.

What you find is that we have common components that are actually evolving. Sometimes, there are singletons, like cert-manager, as an example. That's an extension of Kubernetes that everybody uses. And there aren't a lot of other folks that are going off and creating certificate management systems in that vein. But then we have other places where there is a thriving ecosystem, such as the ingress systems on top of Kubernetes, whether they're Envoy-based or Nginx or HAProxy or native cloud. And those things can work together in a way. I think what we find is that, instead of having a platform that people build on top of Kubernetes, we have a bunch of building blocks that people can put together to create the platform that really meets their needs.
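
Editor's illustration: the idea of a common API shared by independent extensions can be sketched in a few lines of Python. This is a toy model, not the Kubernetes client library and not how Kubernetes itself is implemented. The `Resource` class, the `select` helper, and the object names and labels are hypothetical and exist only for this example; the `apiVersion` and `kind` strings are the ones used by Deployments, Ingresses, and cert-manager Certificates.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Resource:
    """Toy stand-in for the envelope that every object in the API shares."""
    api_version: str  # API group/version that defines this kind
    kind: str         # type name, built-in or defined by an extension
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    spec: dict[str, Any] = field(default_factory=dict)


def select(resources: list[Resource], wanted: dict[str, str]) -> list[Resource]:
    """Return the resources whose labels contain every wanted key/value pair."""
    return [
        r for r in resources
        if all(r.labels.get(key) == value for key, value in wanted.items())
    ]


# Built-in kinds and an extension-defined kind live side by side in one store.
store = [
    Resource("apps/v1", "Deployment", "web", {"app": "shop"}, {"replicas": 3}),
    Resource("cert-manager.io/v1", "Certificate", "web-tls", {"app": "shop"}),
    Resource("networking.k8s.io/v1", "Ingress", "web", {"app": "shop"}),
    Resource("apps/v1", "Deployment", "billing", {"app": "billing"}),
]

# The same generic query works on every kind, because all of them share
# one envelope instead of each extension bringing its own API.
shop = select(store, {"app": "shop"})
print([f"{r.kind}/{r.name}" for r in shop])
```

Because `select` only looks at the shared envelope, it never needs to know which extension defined a given kind. That is the contrast with the siloed model described above, where each system on top of the scheduler had its own API and its own way of modeling the world.
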
That being said, I do think that there's room, and that's some of the work that we're doing at VMware with Tanzu, to be able to take the universe of choices that are out there, which can often be overwhelming for early users, and boil that down to like, “Hey, if you start with these things put together in this way, you're going to have a good experience.” And, as you understand your needs better, as you understand the systems and the ecosystem better, then you can start leveraging other things and start actually customizing this to your unique needs. There is no singleton platform, but there's still room to actually create a better getting-started experience, both for individuals and for organizations, in terms of being able to put all the pieces together.

Evrone: Most of the experts describing Kubernetes note the high entry threshold. Is it good for the platform or not? Shouldn't it be lower or more simple for new users?

Joe: I think this is a really complex topic. The terminology that I like for this is that there is accidental complexity and there's essential complexity. And, oftentimes, if you try to take a system and oversimplify it, what you find is that it becomes brittle, and the set of situations that it can work against is relatively narrow. We saw this with a lot of earlier platform-as-a-service type of offerings. It was great, until it wasn't. You could get a lot of stuff done, and then, eventually, you would hit some sort of wall. And there weren't a lot of options. You pretty much had to take that part of your application, that part of your program, and start over, to some degree, in a different environment. I think that's because they were oversimplified.

The other pitfall with oversimplification is that you can make something feel simple in the first five minutes, but there's a sea of hidden complexity under it. Oftentimes, if things end up being too “magical”, then it becomes a lot harder to understand what's really going on.
I think, fundamentally, deploying horizontally scalable, distributed applications against a pool of machines, taking advantage of all sorts of other external services, like a load balancer and dynamic storage, is a hard problem, and there is a lot of complexity there. And with the level of flexibility that Kubernetes gives, it's difficult, in my mind, to be able to simplify it that much further. There are definitely places in Kubernetes where there is some accidental complexity, where it's more complex than it needs to be, and I think that's inevitable in any real-world system. But I do think that it's a hard problem, and hard problems require relatively sophisticated solutions.

I think that we become blind to these things. I mean, think about introducing, say, programming and working against something like Linux to somebody who's new to computers. That is an enormous amount of complexity that they're going to have to take on just to get to the level where they can actually be proficient with that thing. Similarly, take a look at any major cloud, like AWS. It is in no way simple. There's a lot of complexity there, but you see that it's a toolkit that you can use in so many different ways, so that complexity feels justified.

I think some of that applies to Kubernetes also. The complexity is justified. The problem is that Kubernetes is still new. As this stuff becomes mature, as it enters sort of the shared gestalt, the shared understanding that we have as an industry, the complexity tends to fade into the background, and people become blind to it. But new users still have to be exposed to it. And we see this when we have new developers coming to our industry. There's a lot to get. You know, take Kubernetes out of the equation. There's no way to look at our industry and say that things are easy and simple. But what we do know is that, once we master it, all of a sudden we forget about it. We forget that pain.
And I think there's a similar dynamic with Kubernetes.

Evrone: Two weeks ago, our team attended one of our biggest technical events. And one of the hot topics at the conference was: should developers pay attention to actually learning Kubernetes basics, or should we leave it to the system administrators or DevOps specialists and focus mostly on writing good code?

Joe: I think it depends on what you mean by developers. Oftentimes, it's very easy for us to talk past each other when we refer to developers as a unified group, and what we find is that there really are lots of different types of developers, and they have different concerns and different things that they need to learn about.

Let's take game engines, for example. I have a friend from high school who does AAA games with micro-optimizations. It's a very different world from what I'm in right now. The question is: if you're developing a game, should you just use something like Unreal Engine and not worry about the rendering techniques that are going on under the covers? Or should you actually pay attention to what's happening with the GPU? The answer is, well, both, depending on what type of developer you are.

I think that we are going to have developers that are much more embedded into the runtime deployment systems that we have and are going to have to know Kubernetes, and those are going to be the folks that are developing the platforms. Oftentimes, those are the folks that now gravitate towards these sort of blended DevOps roles, where they are in the thick of the interface between the application and the environment that the application's running on. There are still going to be developers that are only ever writing business logic, and they don't want to have to worry about all this.
So I think that there's room for both of these things, and I think one of the keys around the term “cloud-native”, or any of these terms, “DevOps”, “cloud-native”, which are very difficult to define, is that there's no authority who can tell us what “cloud-native” really means. But in my mind, cloud-native is essentially being able to take the best advantage of cloud-like platforms, where a cloud-like platform is something that's API-driven, self-service, and elastic, and Kubernetes really is a cloud-like platform.

The advantage that we get with cloud-native is that you have a set of people who are creating and exposing platforms, and then you have a set of folks that are building on top of those platforms, and both of them are developers. For both of those things, there's an operational and a development role. Those that are curating, developing, and sort of specializing in the platforms, they're going to have to know Kubernetes pretty well. For those who are building on top of it, it depends on what they're doing. If they're doing simple, 3-tier web apps, they're probably not going to have to know Kubernetes that well, because those patterns are well understood and they can be bottled up. If you're doing something that's like a super highly dynamic distributed system involving machine learning, you're probably going to have to get a little bit deeper into the environment that you're running on, because that's more of a specialized architecture, and you're going to have to actually dig a little bit deeper.

Again, take that analogy to game engines: if you're doing a simple game, you don't have to touch it. But there are people who are doing more advanced things, and they have to actually start doing C++ extensions to their game engine. “Developers” is not a monolithic concept, so it really depends.

Evrone: There is an evil joke that cloud-native is something that you can't run or debug on your laptop.

Joe: Haha, well, there is a certain level of being able to trust a provider.
There is a point where you can't debug past what's going on, and it's empowering, because it's somebody else's problem, which is actually really good. But for the types of developers, and again, there are different types of developers, that love to go deep, that want to understand, “What happens to the packet, from when I do this all the way down to the silicon?”, cloud creates a sort of barrier that you can't debug past, and that can sometimes be disconcerting for folks that are used to being able to go super deep on everything they do.

Evrone: Nowadays, we notice a trend of moving to microservice architecture. In your professional opinion, is container orchestration the only way for microservice evolution?

Joe: Oh, well, no. I think there's plenty of people that do microservices without containers. A lot of the fine-grained, service-oriented work and infrastructure-as-code was done early on by Netflix, and they like to talk about it quite a bit. You know, a lot of that stuff was done raw on EC2 using AMIs and such. So a lot of the techniques here can work without containers. And I think you can use containers without microservices, but what people see is that these are two things that work really well together.

Across all of these things, there is no right answer. There's just a set of tools and techniques and patterns that you can draw from to be able to do things. And these are just a couple of tools and patterns that work well together.

Evrone: We often talk about the advantages of container orchestration systems, especially at conferences, but are there any disadvantages, anything you don't like or would like to change?

Joe: I think, fundamentally, it's just more to take on. So if you're doing something simple, Kubernetes is probably not for you, and containers may not even be for you.
I mean, there are situations where you're doing something really simple: you launch a VM and install Docker and run a single container and you're off to the races, and for a lot of use cases that can actually work really well for you. Kubernetes and containers aren't gonna solve every problem for everybody. That being said, I think the more that you can make it somebody else's problem to manage the Kubernetes cluster, the more useful and easier it becomes.

In terms of things that I would change - I don't know yet. I think I'm probably too close to be able to really say that. There are definitely things that we're working on to improve. The life cycle management, sort of the administrator aspect of Kubernetes: how do you actually administer a cluster? That was a lot harder than I think we recognized early on in the project. There's a lot of work that we're doing upstream around using Kubernetes controller patterns to manage Kubernetes clusters themselves. This is Cluster API and work like that.

I think the way that we manage and wrangle YAML and install software is still very fractured and harder than it needs to be. I'm surprised that, this far into the Kubernetes journey, we still have people dealing with YAML directly. I've given a talk that is called “I'm Sorry About the YAML”, where I've gone into some of the details around that.

I think these are all addressable problems. I don't know if there's anything fundamental that I would change about the way that we do stuff. I think we can fix all these things over time, and it's just a matter of listening and iterating and making things better.

Evrone: And our traditional question about work-life balance. Your life must be full of huge, high-load projects. What is your advice to fellow developers on staying efficient across a lot of tasks and making the world a better place for yourself and everyone else?

Joe: For me, there are different ways to be more effective and more efficient.
And it really depends on what type of developer you want to be and where your skills and your passions are. I don't believe in 10x developers. As I said earlier in the interview, this is, fundamentally, a team sport. There's an old saying that if you want to go fast, go alone; if you want to go far, go together. A lot of times we have the 10x developers, and they'll go fast. But, oftentimes, it ends up being a short-term win with a lot of long-term debt. So I think the real 10x developers are those who make the people around them better and help to make everybody else more effective.

If you have a team of 10 people, and you make everybody 20% more effective, that's a huge multiplier across things, especially if they then actually make the people around them that much more effective.

My work-life balance advice is: find a way to boost and trust other people, so that you're not taking all of this on by yourself. People really get themselves into trouble when they feel personal responsibility that goes far beyond what they can do. This is my formula for how people burn out: you feel responsible for doing something, and you don't have the tools to do it, either organizationally or just in terms of hours in the day. Then you end up getting burned out because you end the day feeling like there's so much that's undone, and it just invades the rest of your life. But if you can make sure that the problems that you're attacking are in scope for the tools that you have, and you develop those tools so that you have more reach by working with other people and actually sort of bringing people up, then I think that becomes a lot more sustainable in the long term, in terms of avoiding burnout.

Evrone: Thank you! And thank you for Kubernetes, from all our community. We hope that, with this interview, we will be able to make our developers a little better and our world a little better place to live in.

The Conclusion

We're grateful for the opportunity we had to interview Joe and learn from his valuable experiences and years of expertise working as a software engineer.

Also published as “Kubernetes Co-founder Joe Beda: Software development is a team sport.” The interviewer is the Chief Editor at Evrone.