Building Cloud Platforms as a Service — Key Pillars of Success

Written by jeyabalajis | Published 2023/08/07
Tech Story Tags: devops | cloud-computing | cloud | platform-engineering | paas | the-devops-writing-contest | platform-as-a-service

TL;DR: The key pillars of success for building Cloud Platforms as a Service are Multi-Tenant Experience, On-boarding and Identity Foundation, Self-Service, and Security by Default.

For mature enterprises, planning, building, and managing foundational cloud technology platforms accelerates value for their internal clients. Once these enterprises reach a certain size and level of dependency on technology, establishing these foundational platforms helps eliminate complexity, boost the effectiveness of their user base, and adhere to compliance standards across the enterprise.

Building these foundational platforms as a SaaS (Software as a Service) enables organizations to maximize their client experience and Return on Investment (ROI).

Here is a sample SaaS Journey Framework that breaks down the path to SaaS success into four distinct phases (Business Planning, Product Strategy, Minimum Viable Service, and Launch/Go-To-Market), each representing a critical stage in the move to SaaS.

At a broad level, the SaaS Journey Framework allows you to accomplish the following:

  • Articulate the compelling factors for bringing up a Platform as a Service, the Target market within the organization, and the potential customer segments.

  • Have a strategy that articulates a clear set of capabilities/offerings that target specific personas within the consumer base.

  • Identify critical capabilities for your Minimum Viable Service (MVS), and set clear definitions and expectations of the launch, timelines, target customers, and how you will engage potential customers to capture feedback.

  • Before you launch, establish a Customer Advisory Board (CAB) to gather voice-of-customer feedback and prioritize it with an appropriate sense of urgency; create measurable product adoption expectations.

The remaining sections cover each key pillar of success for building out your Cloud Platform as a Service. These pillars can be treated both as a set of design principles and as recipes for success.

Content Overview

  • Multi-Tenant Experience
  • Control Plane
  • Tenant Workloads Isolation
  • On-boarding and Identity Foundation
  • Self-Service
  • CI/CD Pipelines as a Service
  • Centrally Managed Products
  • Security by Default
  • Conclusion

Multi-Tenant Experience

When you design your Cloud Platform for housing multiple tenants, you may want to take advantage of the economies of scale that come with sharing infrastructure resources across tenants. Simultaneously, the tenants require some of the resources to be dedicated to them.

Different SaaS models such as Silo, Pool, and Hybrid provide different levels of multi-tenant experience and workload isolation.

  • Silo: The silo model refers to an architecture where tenants are provided with dedicated resources. For example, each tenant of your cloud platform may have its own AWS account, vended from a master AWS organization account.

  • Pool: The pool model refers to an architecture where tenants share resources. These resources typically belong to the base infrastructure. For example, networking services in AWS such as NAT Gateways or VPCs could be shared across all your tenants.

  • Hybrid: The hybrid model refers to an architecture where some resources of your platform are implemented in the Silo model and some in the Pool model. This pattern acknowledges the reality that you cannot exclusively design for Silo or Pool.

In scenarios where you must comply with strict workload isolation for teams, choose the Silo model. In scenarios where you would benefit from economies of scale (for example, a common set of VPCs, subnets, and NAT Gateways for a set of customers), choose the Pool model.
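As a minimal sketch of that decision rule, the choice could be expressed per resource as shown here. The `strict_isolation` flag and the class and function names are assumptions made for illustration; they are not part of any AWS API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class TenancyModel(Enum):
    SILO = "silo"
    POOL = "pool"


@dataclass(frozen=True)
class ResourceRequirement:
    name: str
    # Hypothetical flag: True when compliance demands dedicated resources.
    strict_isolation: bool


def choose_model(req: ResourceRequirement) -> TenancyModel:
    """Silo for strictly isolated resources, Pool for shareable ones."""
    return TenancyModel.SILO if req.strict_isolation else TenancyModel.POOL


def platform_model(reqs: List[ResourceRequirement]) -> str:
    """A platform that mixes Silo and Pool resources is Hybrid."""
    if not reqs:
        raise ValueError("at least one resource requirement is needed")
    models = {choose_model(r) for r in reqs}
    if len(models) > 1:
        return "hybrid"
    return models.pop().value
```

A platform that vends a dedicated account per tenant but shares a NAT Gateway would come out as `"hybrid"` under this rule, which matches the observation that most real platforms mix both models.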

Regardless of which model you choose, each environment managed by your platform still relies on a shared identity, onboarding, and operational experience where all tenants are managed and deployed via a shared construct. This is where a Control Plane based operation comes in to help.

Control Plane

The Control Plane is a management and operations layer that houses central SDKs and CI/CD Pipelines and operates across multiple accounts.

It is a critical part of your multi-tenant strategy that ensures that all your Infrastructure as Code pipelines (including On-boarding pipelines) are unified, and managed through a central account (Single Pane of Glass).

With Control Plane, you can:

  • Launch Base Infrastructure Resources across multiple hosts (aka accounts) through configuration.

  • On-board teams across all your platform-as-a-service offerings, through a single, automated, repeatable process.

  • Manage platform admin CI/CD Pipelines as products, thus achieving standardization across all admin pipelines.

  • Build standard Admin Deployers that enable you to provide CI/CD Pipelines as a Service.
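To illustrate what "through configuration" might look like, here is a small sketch that expands a central configuration into a list of launch targets. The configuration keys (`accounts`, `default_regions`, `base_products`), the account IDs, and the product names are all hypothetical; a real Control Plane would hand each target to its Infrastructure as Code pipeline rather than just list it.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class LaunchTarget:
    account_id: str
    region: str
    product: str


# Hypothetical central configuration held in the Control Plane account.
EXAMPLE_CONFIG: Dict[str, Any] = {
    "default_regions": ["us-east-1", "us-west-2"],
    "base_products": ["network-baseline", "logging-baseline"],
    "accounts": [
        {"id": "111111111111"},  # uses the default regions
        {"id": "222222222222", "regions": ["eu-west-1"]},
    ],
}


def expand_launch_plan(config: Dict[str, Any]) -> List[LaunchTarget]:
    """Expand the configuration into one target per account, region, and product."""
    targets: List[LaunchTarget] = []
    for account in config["accounts"]:
        regions = account.get("regions", config["default_regions"])
        for region in regions:
            for product in config["base_products"]:
                targets.append(LaunchTarget(account["id"], region, product))
    return targets
```

Adding a new account to the platform then becomes a one-line change to the configuration instead of a new pipeline.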

Tenant Workloads Isolation

When you build and operate a multi-tenant platform as a service, you must ensure that each tenant is prevented from accessing another tenant’s resources.

The following strategies help in enforcing workloads isolation:

  • Attribute-Based Access Control (ABAC): ABAC is an authorization strategy that defines permissions based on attributes. For example, you can enforce a policy that allows permissions to a user only when the user’s team name tag matches the resource’s team name tag.

  • Name-Based Access Control (NBAC): Not all resources in your cloud provider support ABAC. In such scenarios, NBAC lets you enforce permissions for a team based on a resource naming convention. For example, you can enforce a policy that grants permissions to a user only when the resource name starts with the name of the team to which the user belongs. Use this as a complementary strategy over and above ABAC.

In AWS, the IAM policy Condition element and the IAM policy Resource element are excellent ways to enforce both ABAC and NBAC.
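The sketch below builds an IAM policy document that combines both strategies. It relies on the `aws:ResourceTag` and `aws:PrincipalTag` condition keys and on IAM policy variables in the Resource element. The tag key `team`, the example actions, and the S3 bucket naming convention are assumptions chosen for illustration, and you should confirm that each service you target supports tag-based authorization.

```python
import json
from typing import Any, Dict, List


def team_isolation_policy(abac_actions: List[str], tag_key: str = "team") -> Dict[str, Any]:
    """Build a policy with one ABAC statement and one NBAC statement."""
    # IAM policy variable that resolves to the caller's own tag value.
    principal_tag = "${aws:PrincipalTag/" + tag_key + "}"
    abac_statement = {
        "Sid": "AbacTeamTagMustMatch",
        "Effect": "Allow",
        "Action": abac_actions,
        "Resource": "*",
        # Allow only when the resource tag equals the principal's tag.
        "Condition": {"StringEquals": {"aws:ResourceTag/" + tag_key: principal_tag}},
    }
    nbac_statement = {
        "Sid": "NbacBucketNameStartsWithTeam",
        "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
        # Assumed naming convention: bucket names start with "<team>-".
        "Resource": [
            "arn:aws:s3:::" + principal_tag + "-*",
            "arn:aws:s3:::" + principal_tag + "-*/*",
        ],
    }
    return {"Version": "2012-10-17", "Statement": [abac_statement, nbac_statement]}


if __name__ == "__main__":
    policy = team_isolation_policy(["ec2:StartInstances", "ec2:StopInstances"])
    print(json.dumps(policy, indent=2))
```

Because the policy refers to the principal's tag rather than a literal team name, one policy document can be attached to every team without modification.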

Beyond tenant workloads isolation, ABAC (using Tags) also allows you to measure the cost impact of individual tenants.
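As a rough sketch of tag-based cost attribution, the function below groups billing line items by their team tag. The line-item shape (`cost` plus a `tags` dictionary) is a simplification assumed for this example; it is not the format of any specific AWS billing export.

```python
from collections import defaultdict
from typing import Any, Dict, List


def cost_by_team(line_items: List[Dict[str, Any]], tag_key: str = "team") -> Dict[str, float]:
    """Sum cost per team tag; items without the tag land in 'untagged'."""
    totals: Dict[str, float] = defaultdict(float)
    for item in line_items:
        team = item.get("tags", {}).get(tag_key, "untagged")
        totals[team] += item["cost"]
    return dict(totals)
```

The "untagged" bucket is useful in its own right: a non-zero total there points to resources that escaped the tagging standard.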

Designing multi-tenant experience for your platform as a service requires deliberate planning. The above patterns and techniques (Silo/Pool/Hybrid, Control Plane and Authorization strategies) should help you in coming up with a robust design.

On-boarding and Identity Foundation

There are several channels (a sign-up page, a GitHub pull request, etc.) through which you may want to onboard tenants for your platform. Most often, tenant onboarding is initiated directly by tenants, but this could be a provider-managed process too.

Regardless of the channel, this step often requires the orchestration of a number of components to successfully provision and configure all the elements needed to create a new tenant.

For example, let’s take a scenario where we have decided on our authentication strategy as IAM with MFA.

When a Team (composed of a set of users) is on-boarded, the platform may perform the following steps:

  1. Create an IAM User for each user within the team and tag the user with the team name.

  2. Attach IAM User to one or many IAM Groups, depending upon the persona attached to the user in the request.

  3. Create Team Execution Roles with appropriate permissions that underlying cloud services will use for execution. For example, the platform may create separate Team Execution Roles for SageMaker & Glue.

  4. Create AWS SNS Topic for a Team as well as for each user. These topics can then be used for sending notifications to the team or a user.

  5. Attach common policies that govern platform usage across all users. For example, mandate MFA before any service is accessed through Console or CLI.

By pre-packaging all preventive permissions and team-specific resources (like execution roles) during on-boarding, you can deliver a secure, governed yet easy-to-use and repeatable on-boarding process that just works.
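The on-boarding steps above can be sketched as a function that turns a team request into an ordered plan of actions. This is only a planning sketch under stated assumptions: the step names, the resource naming convention, and the `require-mfa` policy name are hypothetical, and a real implementation would execute each step against IAM and SNS.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class User:
    name: str
    personas: List[str]  # assumed: each persona maps to one IAM group


def build_onboarding_plan(team: str, users: List[User], services: List[str]) -> List[Dict[str, str]]:
    """Return the on-boarding actions for a team, in the order they should run."""
    plan: List[Dict[str, str]] = []
    # Step 1: one IAM user per team member, tagged with the team name.
    for user in users:
        plan.append({"step": "create_user", "user": user.name, "tag": f"team={team}"})
    # Step 2: group membership based on the personas in the request.
    for user in users:
        for persona in user.personas:
            plan.append({"step": "add_to_group", "user": user.name, "group": persona})
    # Step 3: one execution role per underlying cloud service.
    for service in services:
        plan.append({"step": "create_execution_role", "role": f"{team}-{service}-execution-role"})
    # Step 4: a notification topic for the team and for each user.
    plan.append({"step": "create_topic", "topic": f"{team}-notifications"})
    for user in users:
        plan.append({"step": "create_topic", "topic": f"{team}-{user.name}-notifications"})
    # Step 5: common governance policies, such as mandatory MFA.
    for user in users:
        plan.append({"step": "attach_policy", "user": user.name, "policy": "require-mfa"})
    return plan
```

Producing the plan as data before executing it makes the process easy to review in a pull request and to replay for every new team.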

Self-Service

Self-Service is a key pillar that unlocks accelerated delivery by empowering consumers to launch cloud resources and pipelines with little or no intervention from administrators.

Enabling Self-Service on Cloud Platforms is hard since you need to simultaneously satisfy two goals that are seemingly contradictory:

  • On the one hand, you need to provide flexibility to end users.
  • On the other hand, you require your cloud resources to be secure and compliant.

AWS offers a couple of interesting models for self-service that you can embrace.

CI/CD Pipelines as a Service

CI/CD Pipelines as a Service offers a toolchain standard to be re-used by application teams. In this model, end-user teams launch a governed deployment pipeline from AWS Service Catalog.

The workflow for this model is as follows:

  • Cloud platform engineers develop a service catalog product.
  • A product comprises one or more AWS resources and is created by designing AWS CloudFormation templates that combine resources, their relationships, and the parameters that the end user can plug in when they launch the product.
  • For example, a CI/CD Pipeline Service product may create an AWS CodePipeline project that connects with a GitHub Repository branch (plugged in by end-users) for the source stage, followed by a deploy stage, which invokes a Lambda Deployer Function that performs the required deployment.
  • The input to the Lambda Deployer function comes from a DSL configuration file, which is supplied by end-users.
  • Launching the product creates a fully functional CI/CD Pipeline.
  • Now, whenever end-users push the DSL configuration file & any other associated required files into the GitHub Repository branch, the CI/CD Pipeline kicks off and the Lambda Deployer function performs the deployment & also notifies the end-user through Team SNS Topic.
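A deployer of this kind could be sketched as follows. Everything specific here is an assumption for illustration: the DSL is taken to be a JSON document with `team` and `resources` keys, the allow-list of resource types is invented, and the event is simplified to carry the DSL text directly, whereas a real CodePipeline invocation passes job details and artifact locations instead.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical allow-list: the admin decides what end users may deploy.
ALLOWED_RESOURCE_TYPES = {"glue_job", "sagemaker_endpoint"}


def parse_dsl(raw: str) -> Dict[str, Any]:
    """Parse and validate the end-user DSL before anything is deployed."""
    config = json.loads(raw)
    missing = {"team", "resources"} - config.keys()
    if missing:
        raise ValueError(f"missing required keys: {sorted(missing)}")
    for resource in config["resources"]:
        if resource.get("type") not in ALLOWED_RESOURCE_TYPES:
            raise ValueError(f"resource type not allowed: {resource.get('type')}")
    return config


def deploy(config: Dict[str, Any]) -> Dict[str, Any]:
    """Placeholder deployment: names resources with the team prefix."""
    team = config["team"]
    deployed = [f"{team}-{resource['name']}" for resource in config["resources"]]
    # A real deployer would publish the result to the team's SNS topic.
    return {"team": team, "deployed": deployed, "notify_topic": f"{team}-notifications"}


def handler(event: Dict[str, Any], context: Optional[Any] = None) -> Dict[str, Any]:
    """Lambda-style entry point; 'dsl' is an assumed event key."""
    return deploy(parse_dsl(event["dsl"]))
```

The split between `parse_dsl` and `deploy` mirrors the benefit described next: end users control the configuration, while the deployer alone decides how that configuration reaches the cloud.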

Key Benefits:

  • Standard deployers, but flexible configuration: End-users have the flexibility to specify the configuration and details of a deployment, while the admin-owned deployer performs the actual deployment to the cloud, which preserves standardization and compliance.

  • GitOps & DSL driven: Every deployment of a resource by the consumer team must go through the GitHub repository PR-Merge process. This follows a full-fledged GitOps approach to Continuous Delivery for cloud-native applications.

  • Eliminates individual personnel dependency

https://twitter.com/kelseyhightower/status/953638870888849408?s=20&embedable=true

Centrally Managed Products

This model allows end-user/consumer teams to provision resources managed by a central operations team as self-service.

In this model, Cloud platform engineers publish infrastructure portfolios to AWS Service Catalog with pre-approved configuration. This locks down how infrastructure is configured and provisioned.

For example, consider offering S3 Bucket as a Service. In this case, the workflow would be as follows:

  • Cloud platform engineers design and create a CloudFormation template, with appropriate end-user parameters such as team name, bucket name, etc.

  • They pack in all required security and compliance requirements (No Public ACLs, enforce in-transit and at-rest encryption, etc.)

  • They add required tests and DevSecOps validations.

  • They roll out this as a Service Catalog Product.

  • When end users launch this product, they receive a fully secure and compliant resource that is locked down to their team.
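As an illustration of the S3 Bucket as a Service workflow above, the sketch below generates a CloudFormation template that blocks public access, enables default encryption at rest, and denies requests that do not use TLS. The parameter names, the `team` tag key, and the bucket naming convention are assumptions; the resource types and properties are standard CloudFormation.

```python
import json
from typing import Any, Dict


def s3_bucket_product_template() -> Dict[str, Any]:
    """Return a CloudFormation template for a locked-down team bucket."""
    deny_insecure_transport = {
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            {"Fn::GetAtt": ["TeamBucket", "Arn"]},
            {"Fn::Sub": "${TeamBucket.Arn}/*"},
        ],
        # Enforces in-transit encryption by rejecting non-TLS requests.
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Team-scoped S3 bucket with security defaults (sketch)",
        # The only values an end user may supply at launch time.
        "Parameters": {
            "TeamName": {"Type": "String"},
            "BucketSuffix": {"Type": "String"},
        },
        "Resources": {
            "TeamBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Assumed naming convention: "<team>-<suffix>".
                    "BucketName": {"Fn::Sub": "${TeamName}-${BucketSuffix}"},
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
                        ]
                    },
                    "Tags": [{"Key": "team", "Value": {"Ref": "TeamName"}}],
                },
            },
            "TeamBucketPolicy": {
                "Type": "AWS::S3::BucketPolicy",
                "Properties": {
                    "Bucket": {"Ref": "TeamBucket"},
                    "PolicyDocument": {
                        "Version": "2012-10-17",
                        "Statement": [deny_insecure_transport],
                    },
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(s3_bucket_product_template(), indent=2))
```

Because the bucket name begins with the team name and the bucket carries the team tag, a product like this also fits the NBAC and ABAC strategies described earlier.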

Key Benefits

  • Principle of Least Privilege: End users are NOT granted elevated permissions. They must use Service Catalog products to launch resources.

  • Standardization & Compliance: All resources created through Service Catalog exhibit the same behavior, thus accomplishing standardization and compliance.

With Centrally Managed Products & CI/CD Pipelines as a Service models, platform admins can provide superior self-service to end-users while adhering to the highest levels of security, standardization, and compliance.

Security by Default

Security is a key pillar that provides multi-fold benefits, especially when you are launching a platform as a service. Building all capabilities and services on the Security by Default principle ensures that hundreds of teams in your organization can use these products without having to manage security in the cloud themselves.

Below are a few strategies to ensure that platform offerings are secure by default:

  • A strong Identity Foundation that is built with either MFA or Single Sign-On, and a Teams construct that ensures that workloads are appropriately tagged and authorized.

  • A preventive permissions model that protects cloud resources across all channels (console, CLI, programmatic access) and also enforces team permissions.

  • Carefully designed Tenant Workloads isolation through Attribute-Based Access Control (ABAC) and Name-Based Access Control (NBAC), which ensures teams can access only what is theirs.

  • A Control Plane architecture that ensures that all products and pipelines are launched into accounts from a central account (single pane of glass).

  • Carefully enforced data encryption standards (in transit and at rest), applied through appropriate configuration when launching products through Centrally Managed Products & CI/CD Pipelines as a Service.

  • Built-in DevSecOps controls through standard libraries like Checkov, which are codified at the source, and automated for efficiency.

  • Integrated container security during Continuous Integration with tools like Sysdig.

  • A hardened security posture for externally exposed services with appropriate controls such as CloudFront, Web Application Firewall (WAF), API Keys, and Usage Plans.

  • Standard tags enforced across all resources launched (including admin resources) for attribution, security, and FinOps.
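As a small illustration of the tag-enforcement strategy in the list above, the check below reports which required tags are missing from each resource definition. It is a plain Python sketch and not a Checkov policy; the required tag keys and the resource shape are assumptions for this example.

```python
from typing import Any, Dict, List

# Hypothetical tagging standard for the platform.
REQUIRED_TAGS = ("team", "cost-center", "environment")


def missing_tags(resource: Dict[str, Any]) -> List[str]:
    """Return the required tag keys that the resource does not carry."""
    present = resource.get("tags", {})
    return [key for key in REQUIRED_TAGS if not present.get(key)]


def validate_resources(resources: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each non-compliant resource name to its missing tags."""
    violations: Dict[str, List[str]] = {}
    for resource in resources:
        missing = missing_tags(resource)
        if missing:
            violations[resource["name"]] = missing
    return violations
```

Running a check like this in the build, before anything is deployed, is what makes the guardrail preventive rather than detective.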

By establishing preventive guardrails, including DevSecOps in all builds, protecting data privacy in the capabilities offered, and isolating team workloads, you can deliver exceptional security by default that works for teams at scale.

Conclusion

Cloud Platforms as a Service provide agility and innovation to mature organizations. By treating them as full-fledged SaaS services with the help of the SaaS journey framework, and also coming up with a solid architecture with the help of the above-mentioned pillars, organizations can unlock exceptional service experience and also increase agility.

“Platforms and platform thinking aren’t cure-all solutions. They’re conduits and catalysts for your strategic goals. To harness their full potential and realize their full value, they must be designed and built with your unique goals in mind.” — Rachel Laycock, Global Managing Director, Enterprise Modernization, Platforms and Cloud

Also published here.


Written by jeyabalajis | Experienced Technology Professional
Published by HackerNoon on 2023/08/07