436 reads

How to Automate Google Cloud Security Audits with Terraform and Python

by Advait Patel30mJanuary 7th, 2025

Too Long; Didn't Read

Automate Google Cloud security audits using Terraform and Python. Terraform ensures secure resource provisioning with best practices, while Python enables continuous auditing and custom security checks. Together, they form a scalable, proactive solution for managing cloud security, complete with real-world examples and CI/CD integrations for continuous compliance.

featured image - How to Automate Google Cloud Security Audits with Terraform and Python

Security in cloud infrastructure is of utmost priority in today's cloud-driven world. GCP in itself is a powerful tool with its own set of functionalities but, like all its competitors, suffers from inconsistencies due to the scale and complexities of the cloud environment. Furthermore, security audits are highly necessary to identify any misconfigurations or vulnerabilities in the configuration, when done manually, these become pretty time-consuming and error-prone processes.

Terraform and Python form a perfect combination for this automation of audits. Terraform is anIaaC - Infrastructure As A Code tool that allows declarative management of GCP resources while baking in security best practices. Python has extensive libraries and GCP API support for easy scripting on custom audit checks and automation workflows. We can integrate these tools to build a scalable, efficient, proactive security auditing system for GCP.

The aim of this article is to show, programatically, with real-life examples and code snippets, how one could automate GCP security audits by using Terraform and Python. In this article, I will show you how to provision secured infrastructure and trigger automated security alerts in ways that will help with cloud security management.

Setting Up Your Environment

Before we provision any resources and create the required infrastructure for this article, we need to set up the Google Cloud environment. I will briefly explain and list the prerequisites, tools, and configurations needed in this section to get up and running.

Prerequisites

An account on the Google Cloud Platform with administrative privileges.
Basic knowledge of Google Cloud Platform services, Terraform, and Python.
We would need Terraform and Python installed and running on your local machine or development environment.

Installing Required Tools

Install Terraform by downloading the Terraform binary from HashiCorp's official site.
You can also follow the instructions*here* for the other OS.

sudo apt-get update && sudo apt-get install -y gnupg software-properties-common
wget -O- https://apt.releases.hashicorp.com/gpg | \
gpg --dearmor | \
sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
sudo apt update
sudo apt-get install terraform

Verify terraform installation:

terraform -version
terraform -help
terraform -help plan

We need Python 3.X installed. If not, download it from python.org. Use pip to install the required libraries:
```
pip install google-api-python-client google-auth terraform-validate
```

Configuring Authentication and Permissions

To interact with Google Cloud, we need appropriate permissions and a service user account for automation. Follow these steps:

Create a Service Account
- Go to the Google Cloud Console.
- Navigate to IAM & Admin > Service Accounts > Create Service Account.
- Assign roles like Owner or Security Admin.
- Download the JSON key file and save it securely on your local machine.
Authenticate Terraform
- Set up Terraform to use your service account credentials:
```
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/where/you/stored/service-account-key.json"
```
- Add this to your .bashrc or .zshrc profile.

Authenticate Python Scripts

We need to configure the service account credentials in the Python script as well

from google.oauth2 import service_account
from googleapiclient.discovery import build

credentials = service_account.Credentials.from_service_account_file(
    "/path/to/where/you/stored/service-account-key.json"
)
compute = build('compute', 'v1', credentials=credentials)
print(" You have been Authenticated successfully! ")

Provisioning Secure Infrastructure with Terraform

Terraform is an infrastructure as code tool that lets you build, change, and version infrastructure safely and efficiently by defining it as code. In the below section, I’ll walk you through creating secure infrastructure with Terraform.

Step 1: Creating a Terraform Configuration

Let’s begin by defining a secure Cloud Storage bucket, which will store all the user’s data. You can read more about Cloud Storage bucket here.

Create a Terraform Configuration File:

Save the following code in a file named main.tf

provider "google" {
  credentials = file(var.credentials_file)
  project     = var.project_id
  region      = var.region
}

resource "google_storage_bucket" "storage_bucket" {
  name          = "storage-bucket-audit-${random_id.bucket_suffix.hex}"
  location      = var.region
  force_destroy = false

  versioning {
    enabled = true
  }

  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      age = 365
    }
  }

  uniform_bucket_level_access = true

  logging {
    log_bucket        = var.logging_storage_bucket
    log_object_prefix = "storage-bucket-audit/"
  }
}

resource "random_id" "storage_bucket_suffix" {
  byte_length = 4
}

Define Input Variables:

Create a variables.tf file to hold configurable variables:

variable "credentials_file" {
  description = "service account JSON file"
  type        = string
}

variable "project_id" {
  description = "Google Cloud project ID"
  type        = string
}

variable "region" {
  description = "Google Cloud region"
  type        = string
  default     = "us-west1"
}

variable "logging_storage_bucket" {
  description = "Bucket to store access logs for the storage bucket"
  type        = string
}

Add a Terraform State Backend (Optional)

To ensure state management, configure a remote backend like Google Cloud Storage. This step is optional which avoids any conflict creating or updating the resources if you’re working in a team environment. Read this article on the terraform state conflicts.
```
terraform {
  backend "gcs" {
    bucket  = "terraform-state-bucket"
    prefix  = "terraform/state"
  }
}
```

Step 2: Applying the Configuration

Initialize Terraform by running the following command to download the necessary providers and modules
```
terraform init
```
Validate the Configuration by ensuring the Terraform code has no syntax errors
```
terraform validate
```
Apply the Configuration to provision the resources
```
terraform apply
```
Terraform will display the plan and prompt you for confirmation. Enter yes or y to proceed.

Step 3: Reviewing the Output

After applying the configuration, you’ll see outputs similar to:

google_storage_bucket.storage_bucket: Creation complete after 3s [id=storage-bucket-audit--xyz123]

Verify the bucket in the Google Cloud Console under Cloud Storage. It will have:

Versioning is enabled and supports recovering deleted and overwritten objects.
Logging enabled to track activities of access.
Lifecycle policies that manage object retention.

Best Practices

Use the least privileged IAM roles for your service accounts.
Enable encryption at rest for all storage buckets.
Audit sensitive information exposure in Terraform state files regularly.

Automating Security Audits with Python

While Terraform helps provision secure resources, Python handles it by automating security audits to ensure continuous compliance. In this section, I will explain and show how to use Python scripts to identify security misconfigurations in Google Cloud Platform.

Step 1: Setting Up Python

Install Required Libraries

Ensure the necessary Python libraries are installed:
```
pip install google-api-python-client google-auth
```

Authenticate Using Service Account

Use the service account credentials to interact with GCP APIs:

from google.oauth2 import service_account
from googleapiclient.discovery import build

# Path to your service account key file
SERVICE_ACCOUNT_FILE = '/path/to/where/you/stored/service-account-key.json'

# Authenticate
credentials = service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_FILE)
storage_client = build('storage', 'v1', credentials=credentials)
print(" You have been Authenticated successfully! ")

Step 2: Writing a Security Audit Script in Python

Audit IAM Policies for Overly Permissive Roles:

This script identifies Cloud Storage buckets with overly permissive IAM policies, such as granting "allUsers" or "allAuthenticatedUsers" access:

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

def list_buckets_with_overly_permissions(project_id):
    # Initialize the storage client
    storage_client = build('storage', 'v1')

    try:
        # List all the buckets that are in the project
        storage_buckets = storage_client.buckets().list(project=project_id).execute()
        while storage_buckets:
            for bucket in storage_buckets.get('items', []):
                bucket_name = bucket['name']
                
                try:
                    # fetch the IAM policy for the selected bucket
                    iam_policy = storage_client.buckets().getIamPolicy(bucket=bucket_name).execute()
                    
                    for binding in iam_policy.get('bindings', []):
                        members = binding.get('members', [])
                        if 'allUsers' in members or 'allAuthenticatedUsers' in members:
                            print(f"Bucket '{bucket_name}' has overly permissive IAM policy: {binding}")
                except HttpError as e:
                    print(f" Failed: fetching IAM policy for the bucket '{bucket_name}': {e} ")
            
            # If the project has multiple buckets, this block will handle the pagination
            next_page_token = storage_buckets.get('nextPageToken')
            if next_page_token:
                storage_buckets = storage_client.buckets().list(project=project_id, pageToken=next_page_token).execute()
            else:
                storage_buckets = None
    
    except HttpError as e:
        print(f"Could not list the buckets due to : {e}")

# How To Use
project_id = 'replace-your-project-id'
list_buckets_with_overly_permissions(project_id)

Check for Missing Logging Configurations:

This snippet verifies if access logging is enabled for Cloud Storage buckets or not

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

def check_bucket_logging(project_id):
    # Initialize the storage client
    storage_client = build('storage', 'v1')

    try:
        # List all the buckets that are in the project
        storage_buckets = storage_client.buckets().list(project=project_id).execute()
        while storage_buckets:
            for bucket in storage_buckets.get('items', []):
                bucket_name = bucket['name']
                try:
                    # Fetch the bucket details
                    bucket_info = storage_client.buckets().get(bucket=bucket_name).execute()

                    # Check if logging is enabled or not
                    if 'logging' not in bucket_info:
                        print(f"Bucket '{bucket_name}' does not have logging enabled.")
                    else:
                        logging_config = bucket_info['logging']
                        print(f"Bucket '{bucket_name}' has logging enabled: {logging_config}")
                except HttpError as e:
                    print(f" Could not fetch the details for bucket '{bucket_name}': {e}")
            
            # If the project has multiple buckets, this block will handle the pagination
            next_page_token = storage_buckets.get('nextPageToken')
            if next_page_token:
                storage_buckets = storage_client.buckets().list(project=project_id, pageToken=next_page_token).execute()
            else:
                storage_buckets = None
    
    except HttpError as e:
        print(f"Error listing buckets: {e}")

# How To Use
project_id = 'replace-your-project-id'
check_bucket_logging(project_id)

Step 3: How to Run the Python Script

Execute the Python script using the below command:

python perform_gcp_audit.py

The output of the above script will print the buckets with oepn IAM roles or missing logging configurations, asking you to take the required actions.

Step 4: Automating the Process

To continuously run the above script, we can schedule the script using cron or can integrate it with a CI/CD pipeline.

Example of a cron job to run the script daily:

0 0 * * * /usr/bin/python3 /path/to/perform_gcp_audit.py

Best Practices

Log audit results to a secure location for traceability.
Combine findings with GCP’s Security Command Center for a holistic security view.
Regularly update the script to include checks for new services or configurations.

Combining Python and Terraform to Automate the Audits

We will combine Python and Terraform to create an automated workflow for provisioning infrastructure and automating performing the Securtity Audits. I will walk through on how to integrate Python and Terraform to build a proactive security system.

Step 1: Need to Export Terraform Outputs for Python Scripts

Terraform allows to export information about the resources it creates. These outputs can be ingested by Python scripts for security audits.

Specify Outputs in Terraform by adding an outputs.tf file to your Terraform configuration:
```
output "list_buckets" {
  value = [for bucket in google_storage_bucket.secure_bucket : bucket.name]
}
```
This exports the names of all storage buckets created by Terraform.
Apply the Terraform Configuration by running the following commands to update and retrieve the outputs:
```
terraform apply
terraform output -json > terraform_output.json
```

Step 2: Consuming Terraform Outputs in Python

Use the Terraform outputs to audit only the resources provisioned by Terraform which makes the audit more targeted.

Load Terraform Outputs in Python

import json

# Load Terraform outputs
with open('terraform_output.json') as file:
    terraform_outputs = json.load(file)

bucket_names = terraform_outputs['bucket_names']['value']
print("List of the Buckets to audit: {bucket_names}")

Integrate Outputs with Audit Scripts

Modify the previous IAM audit script to use these bucket names:

def audit_buckets_with_outputs(storage_client, bucket_names):
    for bucket_name in bucket_names:
        iam_policy = storage_client.buckets().getIamPolicy(bucket=bucket_name).execute()
        for binding in iam_policy.get('bindings', []):
            if 'allUsers' in binding['members'] or 'allAuthenticatedUsers' in binding['members']:
                print(f"Bucket {bucket_name} has weak permissions: {binding}")

audit_buckets_with_outputs(storage_client, bucket_names)

Step 3: Automating the Workflow

To create a fully automated workflow, we need to create a CI/CD pipeline and integrate Terraform resources provisioning and Python auditing scripts in it.

Create a Shell Script to Orchestrate the Process

#!/bin/bash

# Step 1 will Provision the infrastructure with Terraform
terraform apply -auto-approve

# Step 2 will Export the Terraform outputs
terraform output -json > terraform_output.json

# Step 3 will Run the Python security audit script
python perform_gcp_audit.py

Integrate with CI/CD using tools like GitHub Actions or Google Cloud Build to automate the script.

Below is the example of GitHub Actions configuration:

name: Terraform + Python Security Audit

on:
  push:
    branches:
      - main

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.x

      - name: Install the required dependencies
        run: pip install google-api-python-client google-auth

      - name: Set up and install Terraform
        uses: hashicorp/setup-terraform@v2
        with:
          terraform_version: 1.5.7

      - name: Run automation script
        run: ./run_audit.sh

Step 4: Enhancing with Alerts and Reporting

Add notifications for audit results:

Python script to send emails for critical findings:

import smtplib
from email.mime.text import MIMEText

def send_alerts_for_findings(alert_message):
    msg = MIMEText(alert_message)
    msg['Subject'] = 'Security Audit Alert from GCP'
    msg['From'] = 'SRETeam@example.com'
    msg['To'] = 'advait-patel@example.com'

    with smtplib.SMTP('smtp.example.com', 587) as server:
        server.starttls()
        server.login('advait-patel@example.com', 'your-password')
        server.send_message(msg)

send_alerts_for_findings("Bucket 'sample-storage-bucket' has weak permissions.")

We can use monitoring platforms like Google Cloud Monitoring or Wavefront for visualizing these Audit Results.

Best Practices

Should use CI/CD pipelines to implement security checks after every infrastructure change.
Shoulds regularly update Terraform configurations and Python scripts to align with the latest security standards and patches.
Should store Terraform state files and Python logs securely in the Cloud Storage Buckets.

Real-World Use Case: Securing a Web Application on GCP

We implemented some pieces of the puzzle above. Now let’s explore a real-world scenario where Terraform and Python automate security for a web application hosted on Google Cloud Platform. I will walk you through the scenario of how you would provision secure resources, audit the resources, and introduce proactive monitoring workflows.

Scenario Overview

A company hosts their e-commerce website on the Google Cloud Platform using:

A Compute VM instance for the application server.
A Cloud SQL VM for the database.
A Cloud Storage bucket for data uploaded by users.

Our goal is to:

Provision these above-mentioned resources securely using Terraform.
Audit the common security issues using Python in the above infrastructure we created.
Monitor the infrastructure in real time and trigger an alert on any misconfiguration found.

Step 1: Use Terraform to Provision the Infrastructure

Terraform Configuration:

Here’s how the infrastructure is defined in main.tf

# Create the Compute VM
resource "google_compute_instance" "app_server" {
  name         = "secure-app-server"
  machine_type = "e2-medium"
  zone         = "us-west1-a"

  # Specify Boot Disk
  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11" # Use an OS image with long-term support
    }
  }

  # Configure Network Interface
  network_interface {
    network = "default"

    # Allow external connectivity through NAT
    access_config {}
  }

  # Secure Metadata
  metadata = {
    enable-oslogin = "TRUE" # Enforce OS Login for secure access
  }

  # Add Secure Firewall Tags
  tags = ["app", "audit", "secure"]
}

# Create the Cloud SQL Database Instance
resource "google_sql_database_instance" "cloud-sql-server" {
  name             = "secure-db-server"
  database_version = "MYSQL_8_0"
  region           = "us-west1"

  # Database Settings
  settings {
    tier                     = "db-f1-micro" # Low-cost instance type for testing/small workloads
    backup_configuration {
      enabled = true # Enable automatic backups
    }

    ip_configuration {
      ipv4_enabled    = false   # Disable public IP access
      private_network = "default" # Use private networking for secure connectivity
    }

    availability_type = "ZONAL" # Choose regional for higher availability if needed
  }
}

# Storage Bucket for Uploads
resource "google_storage_bucket" "storage-bucket" {
  name                       = "storage-bucket-uploads-${random_id.bucket_suffix.hex}"
  location                   = "US"
  uniform_bucket_level_access = true # Enforce uniform ACL for better security

  # Enable Versioning to prevent accidental deletions
  versioning {
    enabled = true
  }

  # Secure Bucket Settings
  lifecycle_rule {
    action {
      type = "Delete"
    }
    condition {
      age = 90 # Delete objects older than 90 days to manage storage costs
    }
  }

  logging {
    log_bucket        = "advait-patel-log-bucket-name" # Add logging to a monitoring bucket
    log_object_prefix = "logs/"
  }
}

# Generate Random Suffix for Unique Bucket Names
resource "random_id" "bucket_suffix" {
  byte_length = 4
}

Apply the Configuration

terraform init
terraform apply -auto-approve

Step 2: Auditing the Infrastructure

Auditing IAM Roles for Compute Engine

Check if any instance has overly permissive IAM roles:

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

def perform_audit_compute_instance_iam(project_id, zones):
    
    """
    This code Audits Google Compute Engine instances for overly permissive IAM roles within the given Project.
    """
    
    # this will initialize the compute client
    compute_client = build('compute', 'v1')

    for each_zone in zones:
        print(f" Checking the instances in the given zone: {each_zone} ")

        try:
            # List instances in the given zone
            list_instances = compute_client.instances().list(project=project_id, zone=each_zone).execute()
            
            for list_instances in list_instances.get('items', []):
                instance_name = instance['name']
                try:
                    # Fetch IAM policy for the given instance
                    iam_policy = compute_client.instances().getIamPolicy(
                        project=project_id, zone=each_zone, resource=instance_name
                    ).execute()

                    # Now Validate the overly permissive IAM permissions
                    for binding in iam_policy.get('bindings', []):
                        members = binding.get('members', [])
                        if 'allUsers' in members or 'allAuthenticatedUsers' in members:
                            print(f"Instance '{instance_name}' in zone '{each_zone}' has overly permissive IAM permissions: {binding}")

                except HttpError as e:
                    print(f"Could not fetch the IAM policy for instance '{instance_name}' in zone '{each_zone}': {e}")

        except HttpError as e:
            print(f"Error listing instances in zone '{each_zone}': {e}")

# How to Use
project_id = 'replace-your-project-id'
zones = ['us-west1-a', 'us-west1-b', 'us-west1-c']
perform_audit_compute_instance_iam(project_id, zones)

Example Output

Checking instances in zone: us-west1-a
Instance 'advait-patel-1' in zone 'us-west1-a' has overly permissive IAM permissions: {'role': 'roles/compute.viewer', 'members': ['allUsers']}
Checking instances in zone: us-west1-b
Could not find the overly permissive IAM permissions.
...

Auditing SQL Instance SSL Settings:

Verify if SSL connections are enforced for the Cloud SQL instance or not

from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

def perform_sql_instance_ssl_audit(project_id, credentials):
    
    """
    Validate if the SSL connections are enforced for all the available Cloud SQL instances in a project or not.
    """
    
    # this will initialize the SQL Admin client
    sql_client = build('sqladmin', 'v1', credentials=credentials)

    try:
        # Print all the Cloud SQL instances in the given project
        list_instances = sql_client.instances().list(project=project_id).execute()
        
        if 'items' not in list_instances or not list_instances['items']:
            print(f" Could not find Cloud SQL instances in the project '{project_id}'.")
            return

        # Iterate over each instance to check SSL settings
        for instance in list_instances['items']:
            instance_name = instance['name']
            require_ssl = instance.get('settings', {}).get('ipConfiguration', {}).get('requireSsl', False)
            
            if require_ssl:
                print(f"SQL Instance '{instance_name}' enforces SSL connections.")
            else:
                print(f"WARNING: SQL Instance '{instance_name}' does NOT enforce SSL connections.")

    except HttpError as e:
        print(f"Could not retrieve Cloud SQL instances: {e}")

# How To Use
project_id = 'replace-your-project-id'
credentials = None  # Replace with your Google Cloud credentials object
perform_sql_instance_ssl_audit(project_id, credentials)

Example Output

SQL Instance 'advait-test' enforces SSL connections.
WARNING: SQL Instance 'advait-patel-test' does NOT enforce SSL connections.
Could not find Cloud SQL instances in the project 'replace-your-project-id'.

Step 3: Continuous Monitoring and Alerts

Setup Monitoring for GCP Resources: Use Google Cloud Monitoring to set up alerts for critical configurations:
- Log-Based Alerts: Monitor IAM changes for Compute Engine and Storage buckets.
- Uptime Checks: Ensure the web server is accessible.
Schedule Automated Security Audits: Combine Terraform and Python into a cron job or CI/CD pipeline to ensure regular security checks:
```
0 3 * * * /usr/bin/python3 /path/to/perform_security_audit.py
```

Email Alerts for Misconfigurations: Send alerts for critical issues detected during audits:

import smtplib
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def send_alert_email_for_misconfig(misconfig_details, recipient_email, sender_email, smtp_server, smtp_port, smtp_password):

    try:
        # Create an email content
        subject = "Critical Alert from SRE Team: Issue Detected During Performing Audit"
        body = f"""
        
        We found a critical issue while performing the recent audit:
        {misconfig_details}
        """

        # Set up the email client
        msg = MIMEMultipart()
        msg['From'] = sender_email
        msg['To'] = recipient_email
        msg['Subject'] = subject

        # Attach the email body
        msg.attach(MIMEText(body, 'plain'))

        # Connect to the SMTP server and send the email
        with smtplib.SMTP(smtp_server, smtp_port) as server:
            server.starttls()  # Secure the connection
            server.login(sender_email, smtp_password)
            server.sendmail(sender_email, recipient_email, msg.as_string())

        print(f"Success: An Alert email is successfully sent to {recipient_email}.")

    except Exception as e:
        print(f"Could not send an alert email due to: {e}")

# How To Use
misconfig_details = "Found A Critical misconfiguration Compute Virtual Machine!"
recipient_email = "advait-patel@example.com"
sender_email = "TeamSRE@example.com"
smtp_server = "smtp.gmail.com"
smtp_port = 587
smtp_password = "replace-your-smtp-password"

send_alert_email_for_misconfig(misconfig_details, recipient_email, sender_email, smtp_server, smtp_port, smtp_password)

Step 4: Validating the Workflow

Deploy the infrastructure using Terraform.
Run the Python audit script to check for security issues.
Fix identified issues (e.g., enable SSL for Cloud SQL, correct IAM permissions).
Test monitoring and alerting to ensure quick response to future issues.

Conclusion

In this article, we looked at how to automate security audits of Google Cloud using Terraform and Python. By combining the powers of both, you will end up with a solid proactive security workflow.

Key takeaways include:

Infrastructure as a Code-IaaC for Secure Deployments: Terraform simplifies the resource provisioning process while maintaining security best practices, including IAM role restrictions and bucket-level access controls.

Python for Continuous Audits: Python scripts can be easily integrated with GCP APIs to run automated security checks, such as the detection of misconfigured IAM policies, enabling logging, and enforcing SSL connections.

Integration for Scalability: Terraform with Python sets up a powerful pipeline not limited to resource provisioning and auditing but includes continuous monitoring of resources as well. Therefore, this is the end-to-end security solution.

Real-World Application: The tools were applied to a real-world use case, where it showed how an e-commerce application can be secured, practical workflows and coding examples were shown for CI/CD integrations to attain continuous compliance. That said, this approach is highly scalable and adaptable for any kind of cloud environment, making sure that as growth happens, the security of your infrastructure is maintained. More importantly, integrating these workflows with monitoring and alerting systems allows teams to quickly respond to security and, hence minimize risks.