Hey HN! I built rapid-eks, a CLI that deploys production-ready AWS EKS clusters in 13 minutes (validated).

GitHub: https://github.com/jtaylortech/rapid-eks

The Problem

I've set up EKS at 5+ companies. Every time, the same 2-4 week grind:

- Multi-AZ VPC with proper CIDR planning
- IRSA (IAM Roles for Service Accounts) - always breaks
- Karpenter, ALB Controller, Prometheus - manual Helm hell
- IAM policies that are too permissive or too restrictive
- Debugging "why can't my pod access S3?"

It's undifferentiated heavy lifting. Same bugs, every time.

How It Works

rapid-eks is a Python CLI that generates and manages Terraform:

1. Config validation (Pydantic) - type-safe YAML parsing
2. Preflight checks - AWS creds, Terraform version, kubectl, quotas
3. Terraform generation (Jinja2) - uses official AWS modules
4. Deployment - runs terraform apply with progress tracking
5. Health validation - waits for cluster/nodes/addons to be ready
6. IRSA configuration - automatically sets up pod→AWS auth

All generated Terraform lives in .rapid-eks/ - you can inspect and modify it.
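To give a feel for step 1, here's a minimal sketch of Pydantic-backed YAML validation. The field names and defaults are illustrative, not rapid-eks's actual schema:

```python
# Hypothetical sketch of type-safe config parsing; the schema
# fields here are made up for illustration.
from pydantic import BaseModel, Field
import yaml


class ClusterConfig(BaseModel):
    name: str = Field(min_length=1, max_length=40)
    region: str = "us-east-1"
    node_instance_type: str = "t3.medium"
    min_nodes: int = Field(2, ge=1)
    max_nodes: int = Field(4, ge=1)


def load_config(text: str) -> ClusterConfig:
    """Parse YAML and fail fast with a typed error on bad input."""
    return ClusterConfig(**yaml.safe_load(text))


cfg = load_config("name: demo\nregion: us-east-1\n")
print(cfg.node_instance_type)  # unset fields fall back to defaults
```

The payoff is that a typo'd or out-of-range value fails at parse time with a clear validation error, long before any AWS API call is made.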
What You Get (13 minutes)

Infrastructure:

- Multi-AZ VPC (3 AZs, 6 subnets, 3 NAT gateways)
- EKS 1.31 with OIDC provider
- Managed node group (t3.medium, 2-4 nodes, autoscaling)

Addons (with IRSA):

- Karpenter - node autoscaling with spot instance support
- AWS Load Balancer Controller - native ALB/NLB integration
- Prometheus + Grafana - monitoring stack

Security:

- IRSA for all workloads (no static credentials)
- Least-privilege IAM policies
- Private subnets for nodes
- Security groups with minimal access

Technical Details

Stack:

- Python 3.11+ with type hints (Pydantic for validation)
- Jinja2 templates for Terraform generation
- Click for CLI, Rich for output
- Official terraform-aws-modules (vpc, eks, iam)

Why generate Terraform instead of pure Python?
- Terraform state management is battle-tested
- The official AWS modules are well-maintained
- Users can inspect and modify the generated code
- Easier to debug than raw boto3 API calls
- Idempotent by default

Preflight checks:

```python
import boto3
from botocore.exceptions import ClientError


def validate_aws_credentials() -> bool:
    """Verify AWS creds work and have the necessary permissions."""
    try:
        sts = boto3.client('sts')
        identity = sts.get_caller_identity()
        # Check for required IAM permissions
        return True
    except ClientError:
        return False
```

IRSA setup:

1. Creates an OIDC provider for the cluster
2. Generates IAM roles with trust policies
3. Annotates ServiceAccounts with role ARNs
4. Validates that pod→AWS auth works

Health validation:

```python
import time

import boto3


def wait_for_cluster_ready(cluster_name: str, region: str, timeout: int = 600) -> bool:
    """Poll the EKS API until the cluster is ACTIVE."""
    eks = boto3.client('eks', region_name=region)
    start = time.time()
    while time.time() - start < timeout:
        cluster = eks.describe_cluster(name=cluster_name)
        if cluster['cluster']['status'] == 'ACTIVE':
            return True
        time.sleep(10)
    return False
```
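For context on IRSA step 2: the trust policy is what scopes an IAM role to a single Kubernetes ServiceAccount via the cluster's OIDC issuer. A hedged sketch of building one (the issuer URL, account ID, and names below are made up; this is not rapid-eks's actual code):

```python
import json


def irsa_trust_policy(account_id: str, oidc_issuer: str,
                      namespace: str, service_account: str) -> str:
    """Build an IAM trust policy that lets only one Kubernetes
    ServiceAccount assume the role via the cluster's OIDC provider."""
    issuer = oidc_issuer.removeprefix("https://")
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # Pin the role to one namespace/ServiceAccount pair
                f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                f"{issuer}:aud": "sts.amazonaws.com",
            }},
        }],
    }, indent=2)


print(irsa_trust_policy("123456789012",
                        "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
                        "kube-system", "aws-load-balancer-controller"))
```

The `sub` condition is what makes "no static credentials" work: only pods running under that exact ServiceAccount can exchange their projected token for role credentials.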
Try It

```shell
pip install git+https://github.com/jtaylortech/rapid-eks.git
rapid-eks create demo --region us-east-1
# ~13 minutes later
kubectl get nodes
```

Destroy is just as fast:

```shell
rapid-eks destroy demo --auto-approve
# ~17 minutes, validates clean removal
```

Feedback Wanted

- Edge cases I'm missing?
- Additional addons needed? (cert-manager, external-dns, etc.)
- AWS regions with issues?
- Better IRSA patterns?
- Documentation gaps?

All code is on GitHub, MIT licensed. Issues and PRs welcome.

Docs: https://github.com/jtaylortech/rapid-eks/tree/main/docs