
The Economics of Public Cloud Repatriation and Why You Can't Stay in the Cloud at Scale

by MinIO, September 16th, 2024

Too Long; Didn't Read

The public cloud doesn't deliver cost savings at scale. It delivers productivity gains, to a point, but it will not reduce your costs.


What has become clear over the past couple of years is that the public cloud, for all of its benefits, doesn't deliver cost savings at scale. It delivers productivity gains, to a point, but it will not reduce your costs. The public cloud does offer an incredibly powerful value proposition: infrastructure available immediately, at exactly the scale the business needs, which drives efficiencies in both operations and economics. The cloud also helps cultivate innovation, as company resources are freed up to focus on new products and growth. However, the mere act of interacting with your data generates egress costs, which have been shown to be egregiously predatory. This is particularly true when applications and workloads are persistent, consistent, and data intensive (a high volume, velocity, or variety of read and write calls), or involve high-performance analytics. Such workloads are simply not sustainable in the public cloud as they grow.


“..as industry experience with the cloud matures—and we see a more complete picture of cloud lifecycle on a company’s economics—it’s becoming evident that while cloud clearly delivers on its promise early on in a company’s journey, the pressure it puts on margins can start to outweigh the benefits, as a company scales and growth slows.” Sara Wang & Martin Casado, Andreessen-Horowitz, 2021


That take, while incredibly prescient, was from 2021. By 2024, data has grown (by an average of ~20% per year, according to a 2022 IDC study), workloads have gotten bigger, and scale has become the problem: not the technology of scaling but the cost of scaling, specifically in the public cloud. According to David Linthicum, there are three main reasons the public cloud is being “Kicked to the Curb”:


  1. Cost - For certain workloads, it's just too expensive to run them in the cloud. Commodity hardware prices have fallen so far in the last few years that hardware isn’t the huge CapEx it used to be.


  2. Failed Migrations - Workloads that were not refactored or adjusted to be cloud-native have ended up costing ~2.5X what they were originally projected to cost. Apps that were inefficient on premises turned out to be inefficient in the cloud, and making them more efficient costs too much to be worth it.


  3. Diminishing Need - Applications that originally needed to be spun up quickly and scale fast have scaled in the cloud, but are now just running repetitive tasks and storing data. These apps no longer benefit from the fast scalability the cloud provides and are simply consuming a lot of expensive storage. The need for a flexible, quickly scalable model is no longer there, and the commoditization of hardware has presented a new, cost-efficient way to run these workloads. According to a recent Barclays CIO poll, many CIOs agree.


From that same a16z article:


 “In 2017, Dropbox detailed in its S-1 a whopping $75M in cumulative savings over the two years prior to IPO due to their infrastructure optimization overhaul, the majority of which entailed repatriating workloads from public cloud.”


When your cloud costs start to hover around 50% or more of your cost of revenue (like Asana, Datadog, Prerender.io, and others), it's time to start looking at what your workloads are doing in the public cloud. Organizational and business leadership need to be aware of this so they can pivot. Certain workloads, such as running a data analytics cube, in-memory database, or a data analytics cluster are better fits for on-prem infrastructure. But these are just a few examples.


To focus on a particular trend that will be impacted by this scale problem, let’s look at AI/ML, and specifically LLMs (Large Language Models). If your current AI initiative has you building your own LLM or foundation model, consider the cons of doing it in the public cloud:


  1. High Costs of Scale - Training and running LLMs at scale is expensive, and as the LLM gets bigger, so do the costs of public cloud


  2. Loss of Control - You have less control and visibility over implementation, infrastructure, and performance


  3. Vendor Lock-in - If you have trained LLMs on one cloud platform, it will be difficult to port to a different platform. Furthermore, depending solely on a single cloud provider entails inherent risks, particularly concerning policy and price fluctuations.


  4. Data Privacy and Security - Data sovereignty belongs here as well. The bottom line is that you are trusting your data to a provider whose servers are spread across regions worldwide.


If your enterprise is dealing with petabytes of data, or trending toward that kind of scale, the economics favor the private cloud. Yes, that means building out the infrastructure (or leasing it from someone like Equinix), including real estate, hardware, and power/cooling, but the economics are still highly favorable. The public cloud is an amazing place to learn the cloud-native way and to get access to a portfolio of cloud-native applications, but it is not an amazing place to scale.

An Example of The Economics

So, what are the economics? For illustration, let’s take a 10PB modern datalake that uses Kubernetes to manage Apache Spark and Dremio for persistent and consistent analytics workloads. These types of workloads require frequent data reads and writes from object storage for analysis, updating and refreshing, and presentation. From a cost structure perspective, we will use some assumptions for the main cost drivers:


  • These data lakes and workloads have limited utility if we can’t use the data. The data provides insights, serves other applications, and may need to be processed outside of the storage environment. This requires the data to be transferred out of storage. If we assume 500TB is accessed per month, that represents only 5% of the 10PB stored.


  • For Data/Object Requests (PUTs, GETs, HEADs, etc.), we have worked with customers running similar consistent and persistent workloads that see over 10 billion object requests per month. So, we can use 10 billion as a conservative assumption for this type of workload.


  • Similarly, those same customers see around the same number of encryption requests for those objects, so we again use 10 billion as a conservative assumption for our example.
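
The assumptions above can be written down as inputs to a simple cost model. The following is a minimal sketch, not any provider's billing formula: the class name, the function name, and the four rate parameters are hypothetical, decimal units (1 PB = 1,000,000 GB) are assumed, and no actual prices are included, so you would supply rates from your own bill.

```python
from dataclasses import dataclass

# Decimal storage units are an assumption made for this sketch.
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000


@dataclass(frozen=True)
class WorkloadAssumptions:
    """Main cost drivers from the example; defaults mirror the article's assumptions."""
    capacity_pb: float = 10.0
    monthly_transfer_tb: float = 500.0
    monthly_object_requests: int = 10_000_000_000
    monthly_encryption_requests: int = 10_000_000_000

    def transfer_share(self) -> float:
        """Fraction of stored data transferred out each month."""
        return (self.monthly_transfer_tb * GB_PER_TB) / (self.capacity_pb * GB_PER_PB)


def monthly_cost(w, storage_per_gb, transfer_per_gb,
                 per_object_request, per_encryption_request):
    """Sum the four cost drivers. All rates are hypothetical inputs in USD."""
    return (
        w.capacity_pb * GB_PER_PB * storage_per_gb
        + w.monthly_transfer_tb * GB_PER_TB * transfer_per_gb
        + w.monthly_object_requests * per_object_request
        + w.monthly_encryption_requests * per_encryption_request
    )


workload = WorkloadAssumptions()
print(f"{workload.transfer_share():.0%} of stored data transferred per month")  # 5%
```

Keeping each driver as a separate term makes it easy to see which one dominates once real rates are plugged in.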


With those assumptions, the cost of public cloud could look something like this:



Annual Public Cloud Costs for 10PB = $7.3m or $0.061 per GB/mo
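
For readers who want to check the per-GB figure, here is a small sketch of the conversion from an annual total to a monthly per-GB rate. It assumes decimal units (10PB = 10,000,000 GB); the function name is ours, for illustration only.

```python
GB_PER_PB = 1_000_000  # decimal units, an assumption for this sketch


def cost_per_gb_month(annual_cost_usd, capacity_pb):
    """Spread an annual cost over capacity (in GB) and 12 months."""
    return annual_cost_usd / (capacity_pb * GB_PER_PB * 12)


public_cloud = cost_per_gb_month(7_300_000, 10)
print(f"${public_cloud:.3f} per GB/mo")  # $0.061 per GB/mo
```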


The assumptions above are just that, and the fact that there are so many tells you how variable the costs can be depending on the particular usage and workload factors. This creates significant challenges in trying to budget. In addition, having no tiering or any Data Lifecycle activity is also somewhat rare, as organizations usually move data to colder tiers if the data becomes less “active”. But all of that just adds to the cost, as different tiers have different prices per GB/month, as well as a cost for automatically moving objects into those tiers.


MinIO allows you to scale on the private cloud (colo or a datacenter), using the same technologies that are used on the public cloud: S3 API compatible object storage, dense compute, high-speed networking, Kubernetes, containers and microservices. One major difference is there are no costs for object requests (GETs, PUTs, etc.), nor are there any limits on the number of requests, as long as the infrastructure supports it. In addition, encryption is included with the MinIO Enterprise and Community versions and there are no limits on the number of encrypted objects requested.


This optionality offers the ideal mix of operational costs, flexibility and control. It is true that you will take on CAPEX for hardware, but by starting small and taking advantage of key cloud lessons (elasticity, scaling by component, decoupling compute from storage), enterprises can minimize the initial outlay and maximize the operational savings.


When paired with commodity hardware and operating in a colo, or proprietary datacenter, MinIO can reduce those public cloud costs (as well as costs associated with managing those cloud costs) by anywhere between 50% - 70%, and in some cases, higher.



Annual Colo/MinIO Costs for 10PB = $1.7m per year, or $0.014 per GB/mo


That equates to a ~77% reduction in storage costs for 10PB of storage compared to public cloud. Even for smaller storage capacity needs (200TB - 2PB), the savings are worth exploring. Not to mention you get the industry's best storage performance, a built-in firewall for bucket-level security, observability that is specifically designed for object storage, and many other value-added features that would cost you extra in a public cloud.
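
The comparison can be reproduced from the two annual totals quoted above. This is a minimal sketch assuming decimal units (10PB = 10,000,000 GB); the function names are illustrative.

```python
GB_PER_PB = 1_000_000  # decimal units, an assumption for this sketch

PUBLIC_CLOUD_ANNUAL = 7_300_000  # USD per year, from the public cloud example
COLO_MINIO_ANNUAL = 1_700_000    # USD per year, from the colo/MinIO example


def cost_per_gb_month(annual_cost_usd, capacity_pb):
    """Spread an annual cost over capacity (in GB) and 12 months."""
    return annual_cost_usd / (capacity_pb * GB_PER_PB * 12)


def percent_reduction(baseline, alternative):
    """Savings of the alternative relative to the baseline, in percent."""
    return (baseline - alternative) / baseline * 100


print(f"${cost_per_gb_month(COLO_MINIO_ANNUAL, 10):.3f} per GB/mo")                # $0.014 per GB/mo
print(f"{percent_reduction(PUBLIC_CLOUD_ANNUAL, COLO_MINIO_ANNUAL):.1f}% lower")   # 76.7% lower
print(f"${PUBLIC_CLOUD_ANNUAL - COLO_MINIO_ANNUAL:,} saved per year")              # $5,600,000 saved per year
```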



The Resource Factor

One additional element that is worth a quick analysis is resources (the human kind). We have heard from our customers that the number of resources required to manage public cloud infrastructures can range from 5-10 FTEs depending on the size of the cloud infrastructure. That includes Cloud Engineers, Cloud Team Leads, DevOps Engineers, and Cloud PMs. Using salary ranges and medians from Glassdoor, those FTE costs can range from $700k - $1.5m per year, fully loaded.
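
As a rough sanity check on those staffing figures, the sketch below derives the implied fully loaded cost per FTE. It assumes the low end of the cost range pairs with 5 FTEs and the high end with 10, which the figures above do not state explicitly.

```python
def implied_cost_per_fte(total_cost_usd, headcount):
    """Average fully loaded annual cost per FTE."""
    return total_cost_usd / headcount


# Assumption: $700k corresponds to 5 FTEs and $1.5m to 10 FTEs.
low = implied_cost_per_fte(700_000, 5)
high = implied_cost_per_fte(1_500_000, 10)
print(f"${low:,.0f} to ${high:,.0f} per FTE per year")  # $140,000 to $150,000 per FTE per year
```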


We also hear from our customers (76% of them, in a recent survey) that one of MinIO’s key value drivers is its ease of use and manageability. That same survey found that 60% of them cited MinIO’s ability to deliver Improved Operational Efficiency.


"MinIO…has reduced the cost of support and maintenance for us." - Professional Services Company


"MinIO as a product is a very good storage solution, it [has]....reduced cost of resources [by] more than 50%." - A leading technology solution provider specializing in end-to-end DevOps offerings


Internally, we use MinIO for lots of different workloads, storage needs, testing, etc., and we estimate that MinIO can be managed by 1 to 3 FTEs for PB+ infrastructures. That allows for massive infrastructure at scale with minimal resources.

Getting Started

Now that you’ve seen how and why the economics work for private cloud, I am sure you are wondering what the steps are to begin down this path. My colleagues have written about this here and here, and I suggest your Cloud teams and DevOps teams look at these blogs for the details on migrating away from the public cloud.


We have seen dozens of our customers repatriate their data using commodity hardware and either their own data centers or a colo, and realize some real savings and benefits from MinIO’s high-performing, simple object storage solution.


As the above analysis demonstrates, businesses can realize significant cost savings, above 50% of their existing implied annual public cloud S3 bill, by repatriating data to their own hardware in a datacenter or a colocation service. In the above scenario, with only 10PB, the gap between $7.3m and $1.7m per year means your business could save about $5.6 million annually, or roughly $28 million over the next five years if costs held steady.


The truth of the matter is that the public cloud is cost-prohibitive at scale. The inherently elastic nature of the public cloud makes scaling there appear attractive, but it is almost always the wrong choice from an economic perspective. This is particularly true for data-intensive tasks like AI/ML, where the costs and loss of control in the public cloud can be substantial. As data scales, private cloud solutions with MinIO become economically superior, offering equivalent (arguably better) technologies at reduced costs. By leveraging commodity hardware and private cloud infrastructure, companies can achieve significant cost savings and performance benefits compared to the public cloud, sometimes as much as 70%. We suggest exploring migration away from the public cloud for your workloads, and using MinIO to modernize and scale your critical business applications.


If you want to learn more and take advantage of our value engineering function to run your own models, please reach out to us at [email protected] and we can start the conversation.