Deconstructing a Serverless Cloud OS

Responding to the Serverless Revolution

Introduction

Unlike containers, which were an incremental change introducing smaller VMs with less isolation, serverless technology has produced truly disruptive change, yet to be fully absorbed by the software industry. We need to stop talking about faster horses and start talking about race cars.

While cloud functions (e.g., AWS Lambda) constitute a crucial ingredient in the serverless revolution, they are just one piece of a bigger puzzle. To comprehend the full meaning of this paradigm shift, we have to look at serverless storage, messaging, APIs, orchestration, access control, and metering services glued together by cloud functions. The cumulative effect of using them all is much stronger than the sum of individual parts. To extend our cars versus horses analogy: it’s not just about the engine, chassis, tires, steering wheel and infotainment unit taken separately, but rather all the parts assembled in one coherent unit.

Disruptive technologies are first and foremost disruptive psychologically. To utilize the full potential of disruption, one has to think anew and act anew. Obstacles toward this goal begin with the fact that all existing mainstream development tools and programming languages were conceived 50 years ago, together with the Internet and UNIX, when overcoming computing resource scarcity was still the main challenge. Patching new features on top of that shaky foundation made them heftier but not more powerful. As such, they are all completely inadequate for the new serverless cloud environment. To make real breakthroughs, a RADICAL RETHINKING of many familiar concepts is required:

- What is a Computer?
- What is an Operating System?
- What is Middleware?
- What is an Application?
- What is Programming?
- What is Testing?
- What is Source Code Management?
- What is Productivity?
- What is Quality?

In this article, we will briefly analyze the first two of these topics one by one.

What is a serverless cloud computer?
Traditionally, a “computer” referred to a single physical device. However, over the last decade, the “Data Center as a Computer” concept has acquired widespread recognition. In the context of cloud computing, a “computer” constitutes a warehouse-size building filled with tens of thousands of individual boxes, each performing various specialized functions and connected by a super-fast local network. If we take a traditional computer model of CPU, ALU, RAM and peripherals connected via a bus, we could say that, now, dedicated computers perform the functions of individual chips and the super-fast LAN plays the role of the bus.

From the serverless computing perspective, however, direct application of this metaphor has limited value. Not only are individual data centers no longer visible, but the whole concept of Availability Zones disappears as well. The minimal unit we could reason about is One Region of One Account of One Vendor, as illustrated below.

This is the serverless cloud computer we have at our disposal. Whether it is truly a supercomputer or just a very powerful one is sometimes a subject of heated debate. Either way, the serverless cloud computer provides us, for a fraction of traditional costs, with a level of capacity we could not have even imagined 10 years ago.

Could or should we still apply the chips/peripherals/bus analogy to this serverless cloud computer, and what would be the benefit of it? Yes, we can, and it’s still a useful metaphor. It is useful especially because it helps us realize that all we have now is a kind of machine code-level programming environment. We must raise the level of abstraction in order to put it to productive use at scale.

What is a serverless cloud computer ALU?

ALU stands for Arithmetic Logic Unit; in traditional computers it performs all basic arithmetic and Boolean logic calculations. Do we have something similar in our serverless cloud computer? In a sense, we do.
As with any analogy, it’s good to know where to stop, but we could treat cloud functions (e.g., AWS Lambda) as a sort of ALU device with certain constraints. Currently, any AWS serverless cloud computer in the Ireland Region has 3000 such logical units with 3GB local cache (some people would still call it RAM), 500MB non-volatile memory (aka local disk, split into two halves), 15 minutes of hard context switch and approximately 1.5 hours of warmed cache lifespan. These logical devices run “micro-code” written in a variety of mainstream programming languages, such as Python, JavaScript, etc. Whether one is going to use this capacity fully or partially is another question. The Serverless Cloud Computer pricing model is such that you pay only for what you use.

Unlike traditional ALUs, there are multiple ways to activate the micro-code of such serverless cloud logical devices and to control whether they will perform a pure calculation or also have some side effects. An ALU is just an analogy in the serverless world, but within limits it appears to be a useful one.

What is a serverless cloud computer CPU?

If we have 3K serverless cloud ALUs, do we also have CPUs to control them, and do we really need such devices? The answer is that there are such devices, which are useful in a wide range of scenarios, but they are purely optional. Cloud orchestration services, such as AWS Step Functions, could play such a role, with internal Parallel Flows functioning similarly to individual cores. An AWS Ireland Region “Cloud CPU” could be occupied for up to 1 year with a maximum of 25K events. How many such serverless cloud CPUs could we have? We can get 1,300 immediately, and then add another 300 every second. As with cloud functions, we will pay only for what we are using.

What is a serverless cloud computer memory?

OK, we have ALUs (with some cache and NVM) and we have optional CPUs to orchestrate them. Next, do we have an analogy for RAM and disk storage?
Yes, we do, but we might opt to stop speaking about an artificial separation between volatile and non-volatile memory. Modern CPUs make this separation meaningless anyhow. It’s better just to talk about memory. Serverless cloud computers have different types of memory, each with its own volume/latency ratio and access patterns.

For example, AWS S3 provides Key/Value or Heap memory services with virtually unlimited volume and relatively high latency, while DynamoDB provides semantically similar Key/Value and Heap memory services with medium volume and latency. On the other hand, AWS Athena provides high-volume, high-latency tabular (SQL) memory services, while AWS Serverless Aurora provides the same tabular (SQL) memory services with medium volume and latency.

Interestingly, some serverless cloud “memory” services, such as DynamoDB, are directly accessible from Step Functions (aka serverless cloud CPUs), while others are only accessible through cloud functions (serverless cloud ALUs). For now, Step Functions have a 32K limit of internal cache memory and, as such, are suitable only for direct programming of control flows rather than voluminous data flows. Whether such a limit is a showstopper or a pragmatic trade-off is a subject for a separate discussion. A complete analysis of available services, which would include Serverless Cassandra, Cloud Directory and Timestream, is beyond the scope of this memo.

What are serverless cloud computer peripherals?

Thus, we have serverless cloud computer ALUs, CPUs and Memory (all metaphorical, of course). Do we have something similar to peripherals in traditional computers? Yes, we do have something similar to ports, which connect our serverless computer to the external world. As with traditional ports, each one supports different protocols and has different price/performance characteristics.
For example, AWS API Gateway supports REST and WebSocket protocols, while AWS AppSync supports GraphQL, and AWS ALB supports plain HTTP(S). A full analysis of available services, which would include CloudFront CDN, IoT Gateway, Kinesis and AMQP, is beyond the scope of this memo.

What is a serverless cloud computer Bus?

So, we have metaphoric ALUs, CPUs, Memory and Ports for our serverless computer. Do we have something similar to a bus, and do we need one? The answer is yes: we do have several types, and sometimes they are necessary. For example, AWS SQS provides a high-speed, medium-volume Push service, while AWS SNS provides a high-speed, medium-volume Pub/Sub notification service, and AWS Kinesis provides a high-speed, high-volume Push service.

What else does the serverless cloud computer have?

Unlike traditional computers, quite a few more batteries are included: a data flow unit (AWS Glue), a machine learning unit (SageMaker endpoint), access control (AWS IAM), telemetry (AWS CloudWatch), packaging (AWS CloudFormation), user management (AWS Cognito), encryption (AWS KMS), component repository (AWS Serverless Application Repository), and a slew of fully managed AI services such as AWS Comprehend, Rekognition, Textract, and others. The complete specification of the Serverless Cloud Computer “hardware” is illustrated below.

Serverless Cloud Operating System

Following the useful tradition established by Dutch computing pioneer E.W. Dijkstra, we will treat the Serverless Cloud Computer metaphorical “hardware” specification outlined above as the bottom layer of a “necklace string of pearls”: namely, higher-level, domain-specific virtual machines stacked on top of lower-level infrastructure virtual machines, as illustrated below.

The question is what’s next, and how far shall we proceed with such a metaphor? As mentioned above, the main purpose of this metaphoric description is to highlight the low-level abstractions we have at our disposal to tame such a beast.
Following a more or less standard model of software systems layering, the next layer above hardware is usually a Drivers layer, which provides programmatic access to underlying devices. In our case, we could treat cloud vendor SDKs (e.g., AWS boto3) as a “Drivers” layer.

With regard to the responsibilities of the operating system and of higher layers, there are probably as many opinions on scope as there are people discussing the subject. In this paper, we adopt a relatively restrictive view of the Operating System as responsible for the optimal resource utilization of a single computer, in our case a serverless cloud computer.

Indeed, while serverless cloud computers are extremely powerful by common standards, they are not unlimited. Although they could be scaled up, we will always want to get more value for the same amount of money. Therefore, the goal of optimal resource utilization, minimizing cost while staying within SLA boundaries, does apply. Optimizing resource utilization of distributed systems and increasing productivity are the responsibility of higher layers, namely Serverless Cloud Middleware and Serverless Cloud Framework, which will be discussed in a separate memo.

What kind of resource optimization should such a Serverless Cloud Operating System be responsible for? Ultimately, it comes down to optimal concurrency structure and optimal packaging. Before we get to optimization details, though, we need to take a brief look at traditional Operating System services, namely: File System, Processes, Installation Packages, and Interprocess Communication.

What is a serverless cloud OS file system?

As we argued above, the whole concept of the file system is probably outdated, and for application code development we’d better start talking about cloud versions of traditional data structures such as lists, vectors, sets, hash tables, etc.
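As a rough illustration of what a “cloud version” of a hash table might look like, the sketch below wraps a key/value memory service in Python’s standard mapping interface. It is only a sketch under stated assumptions: `InMemoryKeyValueService` is a hypothetical in-process stand-in for a service such as S3 or DynamoDB, and `CloudDict`, its `namespace` parameter and the JSON encoding are illustrative choices, not part of any AWS SDK.

```python
import json
from collections.abc import MutableMapping
from typing import Dict, Iterator


class InMemoryKeyValueService:
    """Hypothetical stand-in for a cloud key/value memory service."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._blobs[key] = value

    def get(self, key: str) -> bytes:
        return self._blobs[key]  # raises KeyError when the key is absent

    def delete(self, key: str) -> None:
        del self._blobs[key]

    def keys(self, prefix: str) -> Iterator[str]:
        return iter(sorted(k for k in self._blobs if k.startswith(prefix)))


class CloudDict(MutableMapping):
    """Hash-table view over a key/value service; values are stored as JSON."""

    def __init__(self, service: InMemoryKeyValueService, namespace: str) -> None:
        self._service = service
        self._prefix = namespace + "/"

    def __getitem__(self, key: str):
        return json.loads(self._service.get(self._prefix + key))

    def __setitem__(self, key: str, value) -> None:
        self._service.put(self._prefix + key, json.dumps(value).encode("utf-8"))

    def __delitem__(self, key: str) -> None:
        self._service.delete(self._prefix + key)

    def __iter__(self) -> Iterator[str]:
        # Strip the namespace prefix so callers see plain dictionary keys.
        return (k[len(self._prefix):] for k in self._service.keys(self._prefix))

    def __len__(self) -> int:
        return sum(1 for _ in self._service.keys(self._prefix))
```

Application code would then read and write `users["alice"]` as with any dictionary, while the choice of backing service stays a deployment decision made on price/performance grounds.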
All these data structures could be efficiently mapped onto the different serverless cloud memory services mentioned above. However, unless we are going to rewrite all available software, which would be impractical, we will sometimes still need to talk about files: for example, Python modules, Linux Shared Objects and Executables. Using local disk storage of cloud functions has to be treated as a special case, mainly for cold start optimization reasons.

The ideal solution would utilize Linux File System in Userspace (FUSE) to mount, depending on the price/performance ratio, directly to S3, DynamoDB, Serverless Cassandra or even Serverless Aurora. Unfortunately, that’s not possible today since the FUSE mount requires the Lambda container to run in privileged mode, which is not allowed for security reasons.

Another possibility is to develop a cloud version of the module importer for each run-time environment: Python, JavaScript, JVM. While this requires some extra work and is less friendly towards legacy code, the cloud importer allows some optimizations not available to the traditional disk-based one. See the first in our trilogy of articles describing the construction of the Python Cloud Importer at BlackSwan Technologies.

Similar logic applies to Linux Shared Objects and Executables. Ideally, ELF files should be directly loaded from the cloud memory source. That, in turn, would require modifications in the dlopen function, something hard to expect in the near future. One possible workaround would be to download Shared Library and Executable files from the cloud source to the /tmp folder first. That would bring us back to the 250MB disk space limitation for all Shared Libraries, including Python extensions. Another option is to imitate a RAM disk, which would double memory consumption, subtracted from the larger 3GB budget. As with the cloud importer, some non-trivial optimizations to speed up the download of binary files are possible here.

What is a serverless cloud OS process?
Now, we step into uncharted territory. A clear analog to the Linux process is yet to be defined. A running Step Functions State Machine (even though we have to stop calling them State Machines, which they are not) is a good candidate, but what about individual Lambda Functions triggered by some external event? Shall we treat them as interrupt handlers in traditional Operating Systems? That might not be such a bad idea, but only time will tell.

What is a serverless cloud OS installation package?

The answer seems obvious: it’s a CloudFormation Stack on AWS or a similar solution on another cloud platform. In the serverless world, CloudFormation Stacks do not run: serverless applications have no daemon processes, and nothing is running unless explicitly triggered by some external event. In this discussion, we exclude Fargate containers, which do run. Therefore, launching a CloudFormation Stack just means installing a copy of a Serverless Application. Although it will reserve some resources, it will not consume them until some real workload starts running. Well, almost: storage capacity will still be consumed even in passive mode, but this is no different from disk space occupied by an application even if it has never been started.

What is a serverless cloud OS interprocess communication?

This is another blurry area that requires further elaboration. Traditional Operating Systems, like Linux, have two standard and one semi-standard interprocess communication mechanisms. Shared memory and pipes, named or ephemeral, are the two standard mechanisms, going back 50 years to Unix. TCP/IP is a kind of semi-standard IPC and is mostly devoted to larger-scale middleware arrangements.

What is a serverless cloud OS Shared Memory?

All serverless cloud memory services mentioned above are basically shareable. We still need to properly utilize the mutual exclusion and transaction scoping mechanisms available for each one of them.
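One common way to get mutual exclusion on shared memory without holding locks across function invocations is optimistic concurrency: read a value together with its version, compute the update, and write it back only if the version has not changed. The sketch below illustrates that pattern only. `VersionedStore` is a hypothetical in-process stand-in for a cloud memory service that offers conditional writes, and the names `write_if_version`, `update` and `max_attempts` are assumptions made for the example rather than any vendor API.

```python
import threading
from typing import Callable, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a conditional write loses the race to another writer."""


class VersionedStore:
    """Hypothetical stand-in for a shared memory service with conditional writes."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[int, int]] = {}  # key -> (version, value)
        self._lock = threading.Lock()  # models atomicity provided by the service itself

    def read(self, key: str) -> Tuple[int, int]:
        with self._lock:
            return self._items.get(key, (0, 0))

    def write_if_version(self, key: str, expected_version: int, value: int) -> None:
        with self._lock:
            current_version, _ = self._items.get(key, (0, 0))
            if current_version != expected_version:
                raise VersionConflict(key)
            self._items[key] = (current_version + 1, value)


def update(store: VersionedStore, key: str, fn: Callable[[int], int],
           max_attempts: int = 50) -> int:
    """Optimistic read-modify-write: retry whenever another writer got in first."""
    for _ in range(max_attempts):
        version, value = store.read(key)
        new_value = fn(value)
        try:
            store.write_if_version(key, version, new_value)
            return new_value
        except VersionConflict:
            continue  # somebody else updated the item; re-read and try again
    raise RuntimeError(f"gave up updating {key!r} after {max_attempts} attempts")
```

Concurrent callers can then run `update(store, "counter", lambda v: v + 1)` safely; the cost of contention shows up as retries rather than as blocked functions.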
Clojure Persistent Data Structures and Software Transactional Memory supply an interesting source of inspiration.

What are serverless cloud OS pipes?

Unfortunately, we do not have serverless cloud OS pipes. More accurately, we do not have good ones. The serverless cloud bus services listed above do a decent job, but for a very limited set of scenarios. To use a biological metaphor, they are good for central veins and arteries, but not for capillaries. For now, it’s impractical to create a separate SQS queue for each flow: it takes too long to create, and it does not scale well for a large number of flows. If we decide to fan out some processing to a queue, it’s not trivial to figure out when all messages belonging to a particular flow have been processed. Using serverless cloud shared memory facilities, it should be possible, in principle, to develop good, lightweight, economical pipes. This is a direction for additional research.

What is serverless cloud networking?

Some interesting R&D activities are currently taking place in this area (see references).

Optimal Concurrency Structure

Within a typical Serverless Cloud Computer, such as AWS, one could identify the following distinct levels of concurrency:

1. AWS Step Function (Cloud CPU)
2. Parallel State Machine within a single AWS Step Function (Cloud Core)
3. Individual AWS Lambda Function instance (Cloud ALU, normally correlates with #2 above, but not always)
4. Linux Process within a single AWS Lambda Function (Cloud ALU Process)
5. POSIX Thread within a single Linux process within a single AWS Lambda Function (Cloud ALU Thread)
6. Coroutine (green thread) within a single POSIX Thread of a particular Linux process of a particular AWS Lambda Function

Optimal concurrency structure depends on several changing factors, especially data volume, velocity, and the constraints of external systems (e.g., web servers).
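To give a feel for how the innermost levels nest, the sketch below combines the POSIX thread and coroutine levels inside what would be a single function invocation: each thread drives its own event loop, and each event loop runs a batch of coroutines. The function names, the squaring “work” and the `threads` parameter are purely illustrative assumptions; a real handler would perform I/O-bound calls where `asyncio.sleep(0)` appears.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List


async def handle_item(item: int) -> int:
    """Coroutine level: stands in for one I/O-bound call."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return item * item


async def handle_batch(batch: List[int]) -> List[int]:
    """Run every item of one batch as a coroutine on a single thread."""
    return list(await asyncio.gather(*(handle_item(i) for i in batch)))


def thread_worker(batch: List[int]) -> List[int]:
    """Thread level: each thread creates and drives its own event loop."""
    return asyncio.run(handle_batch(batch))


def process_in_function(items: List[int], threads: int = 2) -> List[int]:
    """Split the work across threads, then across coroutines within each thread."""
    batches = [items[i::threads] for i in range(threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(thread_worker, batches))
    return sorted(value for batch in results for value in batch)
```

Even in this toy form there are two knobs to tune (thread count and batch size), on top of the outer levels of function instances and orchestration, which hints at why the choice is hard to make by hand.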
There is also the question of which processes in the system need to be event-driven and which need to be orchestrated by Step Functions. Finding an optimal structure manually would be a daunting task, if possible at all. Finding an optimal structure by applying an appropriate Machine Learning model fed by operational statistics looks like a much more promising direction, as illustrated below.

This discussion of the optimal concurrency structure reveals another important aspect: currently available tools for specifying AWS Step Functions, Lambda Functions and CloudFormation Stacks are at a despairingly low level of abstraction, a kind of machine code. Calling these long and ugly JSONs and YAMLs human readable would be funny if it were not so sad. There is no reason why their internal structure could not be treated as a target platform for some high-level compiler. It could be done, and it should be done.

Optimal Packaging

Sticking with the 250MB code size limit of AWS Lambda does not make much sense. Today, due to this limitation, many ML inference processes have to opt for less convenient container packaging, even though the available 3GB of RAM would be more than enough for performing the task. There is no practical reason why Python modules, for example, could not be imported directly from S3. Python importlib allows this in principle. The same logic applies to Linux Shared Objects. While a proper solution would require a deep intervention into AWS Firecracker (which is not beyond reach in the future, but is less practical in the near term), a close approximation based on the additional 250MB of /tmp space is possible today.

But now we face another problem. Cloud import of, say, Python modules (the same logic applies to JavaScript, Java and .NET), as well as Linux Shared Objects, would increase so-called cold start latency.
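To make the idea of importing Python modules from cloud storage more concrete, here is a minimal sketch built on the standard importlib hooks. It is not the CAIOS importer: `CloudSourceImporter` and its `fetch_source` callback are hypothetical names, the `cloud://` origin string is made up for the example, packages and compiled extensions are ignored, and a plain dictionary plays the role of the storage bucket so the example runs anywhere.

```python
import importlib
import importlib.abc
import importlib.util
import sys
from typing import Callable, Optional


class CloudSourceImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads top-level modules whose source text lives in remote storage."""

    def __init__(self, fetch_source: Callable[[str], Optional[str]]) -> None:
        # fetch_source(module_name) returns source text, or None for unknown modules
        self._fetch_source = fetch_source

    def find_spec(self, fullname, path=None, target=None):
        source = self._fetch_source(fullname)
        if source is None:
            return None  # not ours; let the remaining finders handle it
        spec = importlib.util.spec_from_loader(
            fullname, self, origin=f"cloud://{fullname}")
        spec.loader_state = source  # carry the fetched source over to exec_module
        return spec

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module) -> None:
        source = module.__spec__.loader_state
        code = compile(source, module.__spec__.origin, "exec")
        exec(code, module.__dict__)


if __name__ == "__main__":
    # A dictionary stands in for the bucket; a real fetcher would call the storage SDK.
    bucket = {"greeting_from_cloud": "def hello(name):\n    return 'hello ' + name\n"}
    sys.meta_path.append(CloudSourceImporter(bucket.get))
    module = importlib.import_module("greeting_from_cloud")
    print(module.hello("lambda"))
```

Appending the importer to the end of `sys.meta_path` means the regular disk-based finders are tried first, so only modules missing from the deployment package are fetched remotely. Each such fetch adds a network round trip on first import, which is where the extra cold start latency comes from.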
For many applications, the added cold start latency won’t constitute an issue, and overall productivity gains (given no need to package zip files anymore) would easily outweigh another couple of seconds of delay (free of charge, by the way). For some other applications, it might be a problem. That leads us to yet another optimization challenge: to find an optimal combination of imported modules and shared objects to be placed into an AWS Lambda package (directly or via AWS Lambda Layers) based on a suitable ML model and collected operational statistics. As with optimal concurrency structure, this is a task for a high-level compiler. We shall treat every case in which software engineers are engaged in manual activities that obviously could be improved through automation as a waste of valuable time.

Also, notice an emerging pattern here. While traditional operating systems and compilers provide some forms of static optimization, the new serverless cloud world requires the optimization process to be dynamic, constantly repeated, and based on collected operational statistics, as illustrated below.

Portable “Hardware” Abstraction Layer

As usually happens with operating systems, both optimization problems outlined above require some form of abstraction insulating core algorithms from the technical details of each specific cloud platform. Indeed, 90% of the cloud Python import system depends on the Python module system rather than on how the Cloud Storage of AWS vs GCP works. The same logic applies to Linux Shared Objects and concurrency structure. Of course, the same “hardware” abstraction would be useful as a productivity tool for writing portable application code, but here we still have some way to go until reaching the Framework Layer.

What else?

We started from the very basic level of reconsidering the serverless cloud computer “hardware” model and outlined the major responsibilities of a serverless cloud operating system.
We still need to talk about Middleware’s role in optimizing serverless distributed system resource utilization and about productivity increases through proper adjustments of the Framework, including the whole development toolchain. Of course, the real process is not as linear as described. In order to start even preliminary investigations, we need some minimal development and testing system in place. So, in reality, development activities are conducted at multiple layers in parallel.

Introducing CAIOS

The project, code name CAIOS (which stands for Cloud AI Operating System, to highlight a deep connection with managed AI capabilities), is currently conducted by BST LABS as an internal open source project within the parent company, BlackSwan Technologies.

The objectives of the CAIOS project are to achieve:

- An order of magnitude democratization of the software development process: there is no reason why it should be so complicated and so painful
- An order of magnitude increase in productivity and quality: there is no reason why software development should be so slow, so expensive, and so buggy
- An order of magnitude increase in value velocity: we must start delivering what customers and the market really need rather than what we could push down their throats
- An order of magnitude operational cost reduction: there is no reason why running software systems should be as expensive as it is today
- An order of magnitude improvement in security: with digitization becoming ubiquitous, tolerating security breaches is not an option anymore

Unleashing and properly utilizing the full potential of Serverless Cloud Computing provides a unique opportunity to finally implement, the right way, the Man-Computer Symbiosis as it was initially envisioned 60 years ago.

Coronavirus Era Post Scriptum

The bulk of this paper was written before the recent Corona pandemic global crisis.
When I read Steve Blank’s analysis of the potential impact on the global economy in general and startups in particular, I took it seriously:

“Shutting down the economy for a pandemic has never happened. … If your business model today looks the same as it did at the beginning of the month, you’re in denial.”

Indeed, for the software industry, the party may be over, if not right now, then in the foreseeable future. We will no longer be able to command a premium for developing half-realized services poorly matched to real users’ needs, then delivering them late and over budget. Because of pandemic concerns, the need for automation will go up. In short order, the tolerance for inflated operational and development costs, poor quality and security, and late delivery will diminish, then disappear.

Ironically, our circa March 2020 business models date back almost 50 years. In his seminal ACM Turing Lecture, E.W. Dijkstra made the following comment:

“Nowadays one often encounters an opinion that … programming had been an overpaid profession … perhaps the programmers … have not done so good a job as they should have done. Society is getting dissatisfied with programmers and their products.”

The year was 1972. Today, we software developers still earn an order of magnitude more than school teachers. But are we doing a more important job, or at least are we doing our job well enough to earn our keep? Alas, the answer is probably not, and here is why: most of the engineers employed in the software development industry are still busy with what Simon Wardley called yak shaving: moving software components from one place to another, configuring and re-configuring infrastructure, and in their remaining time writing pieces of code that marginally move the ball forward for a point solution.

As an industry, and as professionals, we are caught largely unprepared for what is going to happen: society needs real automation solutions now.
Nobody is interested anymore in our justifications for the status quo. If we continue shaving yaks, justifying that by technological limitations, how will we continue earning more than many other professionals? Are we able to formulate business problem solutions in a concise and easy-to-prove-correct form and to leave it to tools to perform an automatic conversion into the correct sequence of zeros and ones?

The time for radical revision of our software development habits is NOW.

While initially the CAIOS project started out of an intellectual curiosity about what would happen if we start treating the cloud as a supercomputer, it is now readjusting to the new reality. It will focus on delivering practically applicable solutions, enabling a dramatic reduction of operational and development costs and ironclad code security. These solutions were needed yesterday, and we can no longer afford to wait until tomorrow.

More detailed information describing available solutions and future plans will follow. Stay tuned.

References

A. Sterkin, Serverless Cloud Import System. Part One: Linux FUSE Cloud Storage Mount
A. Sterkin, Serverless Cloud Import System. Part Two: Python Cloud Importer
A. Sterkin, Serverless Cloud Import System. Part Three: Supporting shared EFS
S. Wardley, Why the Fuss About Serverless
S. Wardley, Amazon is eating the software (which is eating the world)
S. Wardley, Thank you Amazon. Boom! Everything in business will change.
G. Adzic, Serverless architectures: game-changer or a recycled fad?
G. Adzic, Designing for the Serverless Age
T. Wagner, Serverless Networking
T. Wagner, The Serverless Supercomputer
L. A. Barroso, J. Clidaras, U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Third Edition
E.W. Dijkstra, Notes on Structured Programming
E.W. Dijkstra, The Structure of the “THE” Multiprogramming System
E.W. Dijkstra, The Humble Programmer
J.C.R. Licklider, Man-Computer Symbiosis
D. Engelbart, Augmenting Human Intellect: A Conceptual Framework
J.C.R. Licklider, R. W. Taylor, The Computer as a Communication Device
Steve Blank, How Your Startup Can Survive a Worldwide Pandemic