Cloud cost optimization is a practice every organization should adopt to make sure its cloud spending is justified. Let’s see how to plan for it.
Every cloud stakeholder should be armed with documents, tutorials, training, guidance, and tools to handle the cloud environment effectively. FinOps products should provide graphical representations and reports of cloud usage. These reports should let stakeholders drill down to pod-level, node-level, business-unit-level, and tag-level usage, along with the associated cost details.
These reports should equip cloud practitioners with the necessary cost information for effective decisions.
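As a rough illustration of the drill-down such reports provide, the sketch below groups billing line items at different levels of granularity. The `LINE_ITEMS` records, their field names, and the `cost_by` helper are hypothetical; real billing exports carry many more columns.

```python
from collections import defaultdict

# Hypothetical billing line items (illustrative values only).
LINE_ITEMS = [
    {"business_unit": "payments", "node": "node-a", "pod": "api-1", "cost_usd": 12.50},
    {"business_unit": "payments", "node": "node-a", "pod": "api-2", "cost_usd": 7.25},
    {"business_unit": "payments", "node": "node-b", "pod": "worker-1", "cost_usd": 20.00},
    {"business_unit": "search", "node": "node-c", "pod": "indexer-1", "cost_usd": 31.00},
]


def cost_by(items, *keys):
    """Sum cost_usd per group, where a group is the tuple of the given keys."""
    totals = defaultdict(float)
    for item in items:
        group = tuple(item[key] for key in keys)
        totals[group] += item["cost_usd"]
    return dict(totals)


# Coarse view: one total per business unit.
by_unit = cost_by(LINE_ITEMS, "business_unit")
# Finer view: one total per node within each business unit.
by_node = cost_by(LINE_ITEMS, "business_unit", "node")

for group, total in sorted(by_node.items()):
    print(group, f"${total:.2f}")
```

The same helper can be called with `"pod"` or a tag field added to the keys to reach the lowest level the data supports.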
One of the major challenges enterprises face is cross-functional transparency. Two app development teams may be building two different cloud-native applications without knowing that they each use a different monitoring tool for the same purpose.
Procurement teams choose a vendor based on the options provided by the cloud teams and on how well the negotiation goes. They have little or no interest in how diverse teams actually use the tools. It is crucial to identify these common requirements and consolidate the resources accordingly.
Cloud management is a tricky process. It involves the operations team, finance team, cloud engineers, cloud architects, procurement team, LoB managers, C-suite executives, and others, each with different priorities. Requirements also change over time. Organizations should have a centralized cloud cost optimization/FinOps team to reconcile these differences. Any cloud financial decision, such as buying new licenses, renewing contracts, or going hybrid cloud, should pass through the FinOps team’s review before reaching the CXO’s office.
After a thorough review of real needs and expectations, costs should be mapped to business value. Once that mapping is agreed on, the proposal should go to the decision maker for approval.
Some resources quietly inflate cloud bills. Cloud practitioners set up auto-scaling to ensure enough capacity to meet traffic demands and improve cost management. Let’s consider Azure GPU machines. For high-end remote visualization, ML, and deep learning, the GPU-category N-series virtual machines are ideal.
They offer low-latency, high-throughput network interfaces for graphics- or video-intensive workloads. When engineers miscalculate the number of nodes required and provision too many, the organization ends up paying for these zombie nodes.
For example, an Azure NC12 instance with 1x K80 GPU and 12 vCPUs costs $1.80 per hour. Suppose 10 such instances (120 vCPUs) are configured but 5 are left unused. Over a 730-hour month, you pay Azure $13,140 instead of $6,570, with nothing to show for the extra $6,570.
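The arithmetic behind this example can be reproduced with a few lines of Python. This is a minimal sketch: the hourly rate is the one quoted in the text, and the 730-hour month (8,760 hours per year divided by 12) is an assumption that matches the quoted totals.

```python
HOURLY_RATE_USD = 1.80  # NC12 rate quoted in the example
HOURS_PER_MONTH = 730   # assumption: 8,760 hours per year / 12


def monthly_cost(instances, hourly_rate=HOURLY_RATE_USD, hours=HOURS_PER_MONTH):
    """Monthly cost in USD of running `instances` nodes around the clock."""
    return round(instances * hourly_rate * hours, 2)


configured = 10  # nodes provisioned
unused = 5       # zombie nodes doing no useful work

billed = monthly_cost(configured)           # what the invoice shows
needed = monthly_cost(configured - unused)  # what the workload requires
wasted = round(billed - needed, 2)

print(f"Billed: ${billed:,.2f}  Needed: ${needed:,.2f}  Wasted: ${wasted:,.2f}")
```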
It is hard to identify these nodes until you trace them in the line items of lengthy cloud bills. For larger organizations running several applications, identification and mitigation are beyond manual effort. The options are either to plan manually, watch the configuration process closely, identify unclaimed assets, and retire them (which is not always feasible), or to adopt a cloud cost optimization product.
Enterprises use multiple tools for cloud cost optimization. They prefer to stick with native cloud service provider tools for better reliability.
Example: Businesses using AWS can make use of:
AWS Cost Explorer – for managing and visualizing the cloud usage
AWS Cost Anomaly Detection – for detecting cloud spending abnormalities. This utilizes machine learning and statistical algorithms for accurate cost overage detection.
AWS Trusted Advisor – for recommendations on reducing costs and improving security, performance, etc.
Many other native tools that aid cloud cost optimization are also available, such as Amazon CloudWatch and AWS Budgets. Switching between multiple consoles for information can complicate cloud practitioners’ decision-making.
Adopt a single solution that unifies all the results under one pane and helps you track, monitor, and restructure based on intelligent recommendations.
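To give a feel for the statistical side of cost anomaly detection, here is a minimal sketch that flags a day whose spend is far from the trailing seven-day average, using a z-score. It is not how AWS Cost Anomaly Detection works internally; the `find_anomalies` helper, the window, the threshold, and the daily figures are all assumptions made for illustration.

```python
from statistics import mean, stdev


def find_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indexes of days whose cost deviates strongly from the trailing window."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any change at all is worth a look.
            if daily_costs[i] != mu:
                anomalies.append(i)
            continue
        z_score = (daily_costs[i] - mu) / sigma
        if abs(z_score) > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical daily spend in USD; day 8 contains a spike.
daily = [100, 102, 98, 101, 99, 103, 100, 101, 240, 102]
spikes = find_anomalies(daily)
print("Anomalous day indexes:", spikes)
```

A simple check like this is easily distorted once a spike enters the trailing window, which is one reason production tools rely on more robust models.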
Cost overages can occur from various sources. An overview of cloud expenditure can only tell us how much we are wasting. Identifying the root cause requires enormous effort.
Example: In Microsoft Azure, when we create a VM, a public IP address, a network security group (NSG), and a network interface are also created. When the VM is found to be unused for a long period, the team decommissions it to save cloud costs. But if they forget to decommission the other components (public IP, network interface, NSG), these stay in the subscription, and any that are billable (a static public IP, for example) keep adding to the monthly cloud bill.
Cost optimization reports should help us filter down to the lowest individual unit and identify the source of unintended spending.
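The leftover-component problem from the Azure example can be sketched as a walk over a resource inventory. The `RESOURCES` snapshot and its field names are hypothetical and do not mirror any Azure API schema; the point is only to show how resources that no longer lead back to a live VM can be listed for review.

```python
# Hypothetical inventory snapshot (illustrative field names, not an Azure schema).
RESOURCES = [
    {"id": "vm-web-01", "type": "vm", "attached_to": None},
    {"id": "nic-web-01", "type": "network_interface", "attached_to": "vm-web-01"},
    {"id": "ip-web-01", "type": "public_ip", "attached_to": "nic-web-01"},
    # vm-old-07 was decommissioned, but its companions were left behind.
    {"id": "nic-old-07", "type": "network_interface", "attached_to": "vm-old-07"},
    {"id": "ip-old-07", "type": "public_ip", "attached_to": "nic-old-07"},
    {"id": "nsg-old-07", "type": "network_security_group", "attached_to": "nic-old-07"},
]


def find_orphans(resources):
    """Return sorted ids of non-VM resources that do not lead back to an existing VM."""
    by_id = {r["id"]: r for r in resources}

    def reaches_vm(resource, seen=()):
        parent = by_id.get(resource["attached_to"])
        if parent is None or parent["id"] in seen:
            return False  # dangling reference or a cycle
        if parent["type"] == "vm":
            return True
        return reaches_vm(parent, seen + (resource["id"],))

    return sorted(
        r["id"] for r in resources if r["type"] != "vm" and not reaches_vm(r)
    )


orphans = find_orphans(RESOURCES)
print("Review for cleanup:", orphans)
```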
Every cloud cost optimization activity is directly coupled with business benefits. List your business KPIs and benchmarks. Involve a stakeholder from every team, such as engineering, management, finance, and operations, when defining KPIs. It’s crucial to map cloud spending to the business value it adds.
Example: Cloud spend per customer, cloud spend per application
This exercise will help you tie engineering activities closely to cost and make every stakeholder’s decision financially accountable.
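A unit-cost KPI such as cloud spend per customer is a simple ratio, but tracking it over time can change how a rising bill is read. The sketch below uses made-up monthly figures; `unit_cost` and the `MONTHS` data are hypothetical names introduced for illustration.

```python
def unit_cost(total_spend_usd, units):
    """Cloud spend per business unit of value (customer, application, order...)."""
    if units <= 0:
        raise ValueError("units must be positive")
    return round(total_spend_usd / units, 2)


# Hypothetical figures: total cloud spend and active customers per month.
MONTHS = {
    "month-1": {"spend_usd": 42_000, "customers": 1_200},
    "month-2": {"spend_usd": 45_000, "customers": 1_500},
}

per_customer = {
    month: unit_cost(data["spend_usd"], data["customers"])
    for month, data in MONTHS.items()
}

for month, cost in per_customer.items():
    print(f"{month}: ${cost:.2f} per customer")
```

In this made-up case total spend grows while spend per customer falls, which is the kind of signal a raw bill alone does not show.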
When you know the roadmap and are confident about the computing demand for the coming period, it’s safe to procure cloud resources well in advance. Bulk procurement in advance helps management secure better offers or discounts from service providers.
Example: Reserved Instances. AWS Standard RIs can provide a discount of up to 72% compared to “On-Demand” pricing. Convertible RIs trade some of that discount for the flexibility to change instance families, OS types, and tenancies.
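Whether a reservation pays off depends on how much of the reserved capacity is actually used. The sketch below compares the two models under simplifying assumptions: the reservation is billed for every hour of the month whether used or not, On-Demand is billed only for hours run, and the hourly rate, discount, and 730-hour month are illustrative values rather than AWS prices.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def break_even_utilization(discount):
    """Fraction of hours an instance must run before a reservation is cheaper."""
    return round(1.0 - discount, 4)


def monthly_costs(on_demand_rate, discount, utilization, hours=HOURS_PER_MONTH):
    """Return (on_demand_cost, reserved_cost) in USD for one instance-month."""
    on_demand = on_demand_rate * hours * utilization  # pay only for hours used
    reserved = on_demand_rate * (1.0 - discount) * hours  # pay for every hour
    return round(on_demand, 2), round(reserved, 2)


# Hypothetical inputs: $1.00 per hour On-Demand, 40% reservation discount.
rate, discount = 1.00, 0.40

print("Break-even utilization:", break_even_utilization(discount))
for utilization in (0.5, 0.9):
    on_demand, reserved = monthly_costs(rate, discount, utilization)
    cheaper = "reserved" if reserved < on_demand else "on-demand"
    print(f"{utilization:.0%} used: on-demand ${on_demand}, reserved ${reserved} -> {cheaper}")
```

Under these assumptions, a workload running half the time is cheaper On-Demand, while one running 90% of the time favors the reservation, which is why confidence in future demand matters before committing.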
Before choosing a cloud service provider, audit the internal environment thoroughly. If you have multiple Microsoft applications, Azure is a good fit because it saves integration costs. If you need short-term compute resources, go with a “pay-as-you-go” pricing model, which lets you increase or decrease compute capacity on demand and pay by the minute (VMs) or by the second (container instances). For low-latency microservices and big data processing, GCP can be a better option.
Cloud cost optimization is not a one-time setup to build and leave aside. It is an ongoing process that is closely tied to business productivity and growth. When the organization scales, cloud dependency increases, and the volume of cloud resources piles up to meet the growing demand. The need for faster delivery, better customer experience, rapid innovation, and competitiveness pushes organizations to worry less about the selection, allocation, tracking, and cost of cloud workloads. Generally, they stay locked in with their existing cloud vendors for easier procurement and support.
Organizations tend to lose millions when they fail to optimize their new workloads along with the existing ones. Select a FinOps solution that fits your vision and keeps bringing everything under one umbrella, day after day.
Keep optimizing.