Adobe Experience Manager's Oak repository is like a bustling warehouse: over time, unused data (old checkpoints, orphaned nodes) piles up, slowing performance and eating disk space. While Adobe's official documentation explains the what and why of Offline Revision Cleanup (ORC), the how often leaves DevOps executing repetitive tasks. With this in mind, I’ll share how a simple Bash script can transform this critical-but-tedious process into a one-command operation - safely, efficiently, and with zero manual checklists. Why Bash Script Outperforms Manual Cleanup in AEM Oak The script isn’t just a wrapper around oak-run.jar. It’s a safety-first automation tool designed for real-world enterprise environments. Here’s what makes it stand out: Pre-Flight Checks: It verifies dependencies (java, lsof), validates paths, and ensures AEM is offline. Interactive Prompts (with an --yes override): No accidental runs. Atomic Operations: Each checkpoint removal and compaction step is isolated. If one fails, the script exits cleanly. Color-Coded Logging: Spot issues at a glance with red errors and green success states. # Sample Log Output [INFO] [2023-10-15 14:30:00] Starting offline revision cleanup process [ERROR] [2023-10-15 14:30:05] Oak command failed: checkpoints segmentstore Script Architecture Let’s dissect the critical components: 1. Argument Parsing & Validation The script enforces strict input rules. Try passing an invalid port like 99999? It’ll block you. Forget the AEM directory? A friendly error appears. # Validate port range if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT" exit 1 fi 2. The Oak Command Engine The run_oak_command function handles all interactions with oak-run.jar, including error trapping. Notice the -Dtar.memoryMapped=true flag, a performance tweak for large repositories: By allowing memory-mapped file access, it reduces I/O overhead during compaction. run_oak_command() { java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$@" if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $@" exit 1 fi } 3. Safety Nets You’ll Appreciate at 2 AM Port collision prevention: Script exits if AEM is still up, avoiding data corruption. Input validation: Rejects non-existent paths or invalid ports up front. User prompts: Optional confirmations at each critical stage (unless --yes is set) How to Run The Bash Script Prerequisites: AEM instance fully stopped oak-run.jar version matching your AEM/Oak deployment Backup of crx-quickstart (yes, really!) 2. Execute the Script: ./aem-orc-cleanup.sh -d /opt/aem/author -o ./oak-run-1.42.0.jar --yes What Happens Next: Segmentstore compaction Datastore compaction Logs written to stdout Troubleshooting Cheat Sheet Problem: Script fails with "AEM is still running" Fix: # Find AEM pid among lingering Java processes ps aux | grep java | grep -v grep | awk '{print $2}' # Kill AEM process kill -9 AEM_PID Problem: oak command failedduring compactionFix: Check Oak/AEM version compatibility Ensure adequate disk space (2x repository size) Test with --debug for verbose logging Problem: Script fails with Permission denied”Fix: chmod +x aem-orc-cleanup.sh and ensure the user has read/write access to AEM directories. Customizations for Power Users While the base script works out of the box, its real value shines when tailored to your team's workflow. Here's how to transform it into a production-ready powerhouse: 1. Enhanced Monitoring & Analytics Add a pre-compaction disk usage snapshot to quantify your ROI: # Capture before/after metrics DISK_USAGE_PRE=$(du -sh $CQ_REPO_PATH | awk '{print $1}') # ... run compaction ... DISK_USAGE_POST=$(du -sh $CQ_REPO_PATH | awk '{print $1}') log_message "INFO" "Disk reclaimed: $DISK_USAGE_PRE → $DISK_USAGE_POST" 2. Intelligent Alerting Replace passive logging with proactive notifications. For example, trigger a Slack alert on failure: if [ $? -ne 0 ]; then curl -X POST -H 'Content-type: application/json' \ --data "{\"text\":\"🚨 AEM ORC failed on $(hostname)\"}" \ https://hooks.slack.com/services/YOUR_WEBHOOK fi 3. Dynamic Retry Logic Oak compaction can fail due to transient I/O issues. Add resilience with exponential backoff: MAX_RETRIES=3 RETRY_DELAY=60 # seconds for ((i=1; i<=$MAX_RETRIES; i++)); do run_oak_command "compact" "$store" if [ $? -eq 0 ]; then break fi sleep $(($RETRY_DELAY * $i)) done What’s your killer customization? Please share it in the comments. Final Thoughts: Elevating Maintenance from Chore to Strategic Advantage Let’s face it: tasks like Offline Revision Cleanup rarely make the highlight reel of a developer’s day. They’re the unsung, unglamorous chores that keep systems alive. But the truth is how you handle these tasks defines the reliability of your AEM ecosystem. This script isn’t just about avoiding OutOfDiskSpaceErrors or shaving seconds off query times. It’s about operational maturity. By automating ORC, you’re not just solving a problem, you’re institutionalizing consistency. You’re replacing tribal knowledge (“How did we do this last time?”) with version-controlled, auditable code. For DevOps teams, the payoff is twofold: Risk Reduction: No more “fat-finger” mistakes during manual compaction. Scalability: Apply the same script across dev, stage, and prod environments with predictable results. What’s Next? Grab the script, tweak it for your team’s alerting/logging stack, and run it in a staging environment. Measure the before/after: Disk space reclaimed? Startup times improved? Share the metrics with leadership — it’s a win they’ll understand. Bonus points: Integrate it into your CI/CD pipeline as a post-deployment health check. IMHO the future of AEM ops isn’t about working harder — it’s about scripting smarter. Now, go turn those maintenance windows into something more exciting. (Or at least, something that doesn’t keep you up at night.) What’s your automation success story? Drop a comment below — I’d love to hear how you’re taming AEM’s complexity. #!/bin/bash # ============================================================================= # AEM Offline Revision Cleanup Script # ============================================================================= # Purpose: Performs maintenance operations on Adobe Experience Manager's Oak # repository, including checkpoint management and compaction. # # Source: https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/deploying/revision-cleanup#how-to-run-offline-revision-cleanup # # Version: 1.0.0 # Author: Giuseppe Baglio # ============================================================================= # Color definitions for better readability RED='\033[0;31m' GREEN='\e[0;32m' YELLOW='\e[0;33m' BLUE='\e[0;34m' NC='\033[0m' # No Color # Default values for command line arguments PORT=4502 CQ_HOME="" OAK_RUN="" SKIP_CONFIRMATION=false # Flag to skip user confirmation DEBUG_MODE=false # Function to display usage/help message show_help() { cat << EOF Usage: $(basename "$0") [OPTIONS] Performs offline revision cleanup for an AEM instance, including checkpoint management and compaction of the Oak repository. Options: -h, --help Display this help message -p, --port PORT Specify the AEM instance port (default: 4502) -d, --aem-dir DIR Specify the AEM installation directory (required) -o, --oak-run FILE Specify the path to oak-run.jar (required) -y, --yes Skip confirmation prompts and proceed automatically --debug Enable debug mode for verbose output Example: $(basename "$0") --port 4502 --aem-dir /path/to/aem --oak-run /path/to/oak-run.jar --yes Requirements: - Java 1.8 or higher - lsof utility - AEM must be completely shut down before running this script - Adequate disk space for compaction Note: Always maintain a backup before running this script. EOF exit 1 } # Function to log messages with timestamp log_message() { local level=$1 shift local message=$* local timestamp timestamp=$(date '+%Y-%m-%d %H:%M:%S') case "$level" in "INFO") printf "${GREEN}[INFO]${NC} ";; "WARN") printf "${YELLOW}[WARN]${NC} ";; "ERROR") printf "${RED}[ERROR]${NC} ";; *) printf "[${level}] ";; esac printf "[%s] %s\n" "$timestamp" "$message" } # Function to handle debug output debug_log() { if [ "$DEBUG_MODE" = true ]; then log_message "DEBUG" "$*" fi } check_dependencies() { local dependencies=("java" "lsof") for dep in "${dependencies[@]}"; do if ! command -v "$dep" > /dev/null 2>&1; then log_message "ERROR" "'$dep' is not installed. Please install it and try again." exit 1 fi debug_log "Found required dependency: $dep" done } # Function to parse command line arguments parse_arguments() { while [[ $# -gt 0 ]]; do case $1 in -h|--help) show_help ;; -p|--port) PORT="$2" shift 2 ;; -d|--aem-dir) CQ_HOME="$2" shift 2 ;; -o|--oak-run) OAK_RUN="$2" shift 2 ;; -y|--yes) SKIP_CONFIRMATION=true shift ;; --debug) DEBUG_MODE=true shift ;; *) log_message "ERROR" "Unknown option: $1" show_help ;; esac done # Validate required arguments if [[ ! -d "$CQ_HOME" ]]; then log_message "ERROR" "AEM directory (-d|--aem-dir) is required and must exist" show_help fi # Validate oak-run.jar if [[ ! -f "$OAK_RUN" ]]; then log_message "ERROR" "oak-run.jar not found at '$OAK_RUN'" exit 1 fi # Validate port number if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT (must be between 1 and 65535)" exit 1 fi } # Function to run oak commands with proper error handling run_oak_command() { local command=$1 local store=$2 local args=${3:-""} debug_log "Running oak command \"$command\" on $store with args \"$args\"" java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$command" "$CQ_REPO_PATH/$store" $args if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $command $store $args" exit 1 fi } # Function to check if AEM is running check_aem_off() { lsof -i tcp:"${PORT}" | awk 'NR!=1 {print $2}' | sort -u | wc -l } # Function to prompt for user confirmation wait_user_input() { local prompt_message=${1:-"Do you wish to continue"} local additional_info=$2 # If --yes flag was used, skip confirmation if [ "$SKIP_CONFIRMATION" = true ]; then log_message "INFO" "Skipping confirmation (--yes flag used)" return 0 fi # Display additional information if provided if [ -n "$additional_info" ]; then printf "\n%s\n" "$additional_info" fi printf "\n${YELLOW}%s [Y/n]${NC} " "$prompt_message" read -r response case $response in "y"|"yes"|"Y"|"") # Empty response defaults to yes return 0 ;; "n"|"N"|"no") log_message "INFO" "Operation cancelled by user" exit 0 ;; *) log_message "ERROR" "Please answer 'y' for yes or 'n' for no (default: y)" ;; esac } # Main script execution starts here check_dependencies parse_arguments "$@" # Set repository paths CQ_SEGMENTSTORE_HOME="crx-quickstart/repository" CQ_REPO_PATH="$CQ_HOME/$CQ_SEGMENTSTORE_HOME" # Validate repository path if [ ! -d "$CQ_REPO_PATH" ]; then log_message "ERROR" "Repository folder not found: $CQ_REPO_PATH" exit 1 fi # Check if AEM is running if [ "$(check_aem_off)" -eq 0 ]; then log_message "WARN" "[!] Always make sure you have a recent backup of the AEM instance." log_message "INFO" "Starting offline revision cleanup process" # Process both datastore and segmentstore for store in segmentstore datastore; do log_message "INFO" "Processing $store..." log_message "INFO" "Finding old checkpoints in $store" wait_user_input run_oak_command "checkpoints" "$store" log_message "INFO" "Removing unreferenced checkpoints from $store" wait_user_input run_oak_command "checkpoints" "$store" "rm-unreferenced" log_message "INFO" "Running compaction on $store" wait_user_input run_oak_command "compact" "$store" log_message "INFO" "Completed processing $store" done log_message "INFO" "Revision cleanup completed successfully" else log_message "ERROR" "AEM is still running on port $PORT. Please shut it down first." exit 1 fi Adobe Experience Manager's Oak repository is like a bustling warehouse: over time, unused data (old checkpoints, orphaned nodes) piles up, slowing performance and eating disk space. While Adobe's official documentation explains the what and why of Offline Revision Cleanup (ORC), the how often leaves DevOps executing repetitive tasks. official documentation how With this in mind, I’ll share how a simple Bash script can transform this critical-but-tedious process into a one-command operation - safely, efficiently, and with zero manual checklists. Bash Why Bash Script Outperforms Manual Cleanup in AEM Oak The script isn’t just a wrapper around oak-run.jar . It’s a safety-first automation tool designed for real-world enterprise environments. oak-run.jar Here’s what makes it stand out: Pre-Flight Checks: It verifies dependencies (java, lsof), validates paths, and ensures AEM is offline. Interactive Prompts (with an --yes override): No accidental runs. Atomic Operations: Each checkpoint removal and compaction step is isolated. If one fails, the script exits cleanly. Color-Coded Logging: Spot issues at a glance with red errors and green success states. Pre-Flight Checks : It verifies dependencies ( java , lsof ), validates paths, and ensures AEM is offline. Pre-Flight Checks java lsof Interactive Prompts (with an --yes override): No accidental runs. Interactive Prompts --yes Atomic Operations : Each checkpoint removal and compaction step is isolated. If one fails, the script exits cleanly. Atomic Operations Color-Coded Logging : Spot issues at a glance with red errors and green success states. Color-Coded Logging # Sample Log Output [INFO] [2023-10-15 14:30:00] Starting offline revision cleanup process [ERROR] [2023-10-15 14:30:05] Oak command failed: checkpoints segmentstore # Sample Log Output [INFO] [2023-10-15 14:30:00] Starting offline revision cleanup process [ERROR] [2023-10-15 14:30:05] Oak command failed: checkpoints segmentstore Script Architecture Let’s dissect the critical components: Let’s dissect the critical components: 1. Argument Parsing & Validation The script enforces strict input rules. Try passing an invalid port like 99999 ? It’ll block you. Forget the AEM directory? A friendly error appears. 99999 # Validate port range if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT" exit 1 fi # Validate port range if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT" exit 1 fi 2. The Oak Command Engine The run_oak_command function handles all interactions with oak-run.jar , including error trapping. Notice the -Dtar.memoryMapped=true flag, a performance tweak for large repositories: By allowing memory-mapped file access, it reduces I/O overhead during compaction. run_oak_command oak-run.jar -Dtar.memoryMapped=true run_oak_command() { java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$@" if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $@" exit 1 fi } run_oak_command() { java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$@" if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $@" exit 1 fi } 3. Safety Nets You’ll Appreciate at 2 AM Port collision prevention: Script exits if AEM is still up, avoiding data corruption. Input validation: Rejects non-existent paths or invalid ports up front. User prompts: Optional confirmations at each critical stage (unless --yes is set) Port collision prevention : Script exits if AEM is still up, avoiding data corruption. Port collision prevention Input validation : Rejects non-existent paths or invalid ports up front. Input validation User prompts : Optional confirmations at each critical stage (unless --yes is set) User prompts --yes How to Run The Bash Script Prerequisites: Prerequisites : Prerequisites AEM instance fully stopped oak-run.jar version matching your AEM/Oak deployment Backup of crx-quickstart (yes, really!) AEM instance fully stopped oak-run.jar version matching your AEM/Oak deployment oak-run.jar Backup of crx-quickstart (yes, really!) crx-quickstart 2. Execute the Script : Execute the Script ./aem-orc-cleanup.sh -d /opt/aem/author -o ./oak-run-1.42.0.jar --yes ./aem-orc-cleanup.sh -d /opt/aem/author -o ./oak-run-1.42.0.jar --yes What Happens Next: What Happens Next: What Happens Next: Segmentstore compaction Datastore compaction Logs written to stdout Segmentstore compaction Datastore compaction Logs written to stdout Troubleshooting Cheat Sheet Problem : Script fails with "AEM is still running" Problem Fix : Fix # Find AEM pid among lingering Java processes ps aux | grep java | grep -v grep | awk '{print $2}' # Kill AEM process kill -9 AEM_PID # Find AEM pid among lingering Java processes ps aux | grep java | grep -v grep | awk '{print $2}' # Kill AEM process kill -9 AEM_PID Problem : oak command failed during compaction Fix : Problem oak command failed Fix Check Oak/AEM version compatibility Ensure adequate disk space (2x repository size) Test with --debug for verbose logging Check Oak/AEM version compatibility Ensure adequate disk space (2x repository size) Test with --debug for verbose logging --debug Problem : Script fails with Permission denied” Fix : Problem Fix chmod +x aem-orc-cleanup.sh and ensure the user has read/write access to AEM directories. chmod +x aem-orc-cleanup.sh Customizations for Power Users While the base script works out of the box, its real value shines when tailored to your team's workflow. Here's how to transform it into a production-ready powerhouse: 1. Enhanced Monitoring & Analytics Add a pre-compaction disk usage snapshot to quantify your ROI: # Capture before/after metrics DISK_USAGE_PRE=$(du -sh $CQ_REPO_PATH | awk '{print $1}') # ... run compaction ... DISK_USAGE_POST=$(du -sh $CQ_REPO_PATH | awk '{print $1}') log_message "INFO" "Disk reclaimed: $DISK_USAGE_PRE → $DISK_USAGE_POST" # Capture before/after metrics DISK_USAGE_PRE=$(du -sh $CQ_REPO_PATH | awk '{print $1}') # ... run compaction ... DISK_USAGE_POST=$(du -sh $CQ_REPO_PATH | awk '{print $1}') log_message "INFO" "Disk reclaimed: $DISK_USAGE_PRE → $DISK_USAGE_POST" 2. Intelligent Alerting Replace passive logging with proactive notifications. For example, trigger a Slack alert on failure: if [ $? -ne 0 ]; then curl -X POST -H 'Content-type: application/json' \ --data "{\"text\":\"🚨 AEM ORC failed on $(hostname)\"}" \ https://hooks.slack.com/services/YOUR_WEBHOOK fi if [ $? -ne 0 ]; then curl -X POST -H 'Content-type: application/json' \ --data "{\"text\":\"🚨 AEM ORC failed on $(hostname)\"}" \ https://hooks.slack.com/services/YOUR_WEBHOOK fi 3. Dynamic Retry Logic Oak compaction can fail due to transient I/O issues. Add resilience with exponential backoff: MAX_RETRIES=3 RETRY_DELAY=60 # seconds for ((i=1; i<=$MAX_RETRIES; i++)); do run_oak_command "compact" "$store" if [ $? -eq 0 ]; then break fi sleep $(($RETRY_DELAY * $i)) done MAX_RETRIES=3 RETRY_DELAY=60 # seconds for ((i=1; i<=$MAX_RETRIES; i++)); do run_oak_command "compact" "$store" if [ $? -eq 0 ]; then break fi sleep $(($RETRY_DELAY * $i)) done What’s your killer customization? Please share it in the comments. What’s your killer customization? Final Thoughts: Elevating Maintenance from Chore to Strategic Advantage Final Thoughts: Elevating Maintenance from Chore to Strategic Advantage Let’s face it: tasks like Offline Revision Cleanup rarely make the highlight reel of a developer’s day. They’re the unsung, unglamorous chores that keep systems alive. But the truth is how you handle these tasks defines the reliability of your AEM ecosystem. how This script isn’t just about avoiding OutOfDiskSpaceErrors or shaving seconds off query times. It’s about operational maturity . By automating ORC, you’re not just solving a problem, you’re institutionalizing consistency. You’re replacing tribal knowledge (“How did we do this last time?”) with version-controlled, auditable code. OutOfDiskSpaceErrors operational maturity For DevOps teams, the payoff is twofold: Risk Reduction: No more “fat-finger” mistakes during manual compaction. Scalability: Apply the same script across dev, stage, and prod environments with predictable results. Risk Reduction : No more “fat-finger” mistakes during manual compaction. Risk Reduction Scalability : Apply the same script across dev, stage, and prod environments with predictable results. Scalability What’s Next? What’s Next? Grab the script, tweak it for your team’s alerting/logging stack, and run it in a staging environment. Measure the before/after: Disk space reclaimed? Startup times improved? Share the metrics with leadership — it’s a win they’ll understand. Bonus points: Integrate it into your CI/CD pipeline as a post-deployment health check. Grab the script, tweak it for your team’s alerting/logging stack, and run it in a staging environment. Measure the before/after: Disk space reclaimed? Startup times improved? Share the metrics with leadership — it’s a win they’ll understand. Bonus points : Integrate it into your CI/CD pipeline as a post-deployment health check. Bonus points IMHO the future of AEM ops isn’t about working harder — it’s about scripting smarter. Now, go turn those maintenance windows into something more exciting. (Or at least, something that doesn’t keep you up at night.) What’s your automation success story? Drop a comment below — I’d love to hear how you’re taming AEM’s complexity. What’s your automation success story? Drop a comment below — I’d love to hear how you’re taming AEM’s complexity. What’s your automation success story? Drop a comment below — I’d love to hear how you’re taming AEM’s complexity. #!/bin/bash # ============================================================================= # AEM Offline Revision Cleanup Script # ============================================================================= # Purpose: Performs maintenance operations on Adobe Experience Manager's Oak # repository, including checkpoint management and compaction. # # Source: https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/deploying/revision-cleanup#how-to-run-offline-revision-cleanup # # Version: 1.0.0 # Author: Giuseppe Baglio # ============================================================================= # Color definitions for better readability RED='\033[0;31m' GREEN='\e[0;32m' YELLOW='\e[0;33m' BLUE='\e[0;34m' NC='\033[0m' # No Color # Default values for command line arguments PORT=4502 CQ_HOME="" OAK_RUN="" SKIP_CONFIRMATION=false # Flag to skip user confirmation DEBUG_MODE=false # Function to display usage/help message show_help() { cat << EOF Usage: $(basename "$0") [OPTIONS] Performs offline revision cleanup for an AEM instance, including checkpoint management and compaction of the Oak repository. Options: -h, --help Display this help message -p, --port PORT Specify the AEM instance port (default: 4502) -d, --aem-dir DIR Specify the AEM installation directory (required) -o, --oak-run FILE Specify the path to oak-run.jar (required) -y, --yes Skip confirmation prompts and proceed automatically --debug Enable debug mode for verbose output Example: $(basename "$0") --port 4502 --aem-dir /path/to/aem --oak-run /path/to/oak-run.jar --yes Requirements: - Java 1.8 or higher - lsof utility - AEM must be completely shut down before running this script - Adequate disk space for compaction Note: Always maintain a backup before running this script. EOF exit 1 } # Function to log messages with timestamp log_message() { local level=$1 shift local message=$* local timestamp timestamp=$(date '+%Y-%m-%d %H:%M:%S') case "$level" in "INFO") printf "${GREEN}[INFO]${NC} ";; "WARN") printf "${YELLOW}[WARN]${NC} ";; "ERROR") printf "${RED}[ERROR]${NC} ";; *) printf "[${level}] ";; esac printf "[%s] %s\n" "$timestamp" "$message" } # Function to handle debug output debug_log() { if [ "$DEBUG_MODE" = true ]; then log_message "DEBUG" "$*" fi } check_dependencies() { local dependencies=("java" "lsof") for dep in "${dependencies[@]}"; do if ! command -v "$dep" > /dev/null 2>&1; then log_message "ERROR" "'$dep' is not installed. Please install it and try again." exit 1 fi debug_log "Found required dependency: $dep" done } # Function to parse command line arguments parse_arguments() { while [[ $# -gt 0 ]]; do case $1 in -h|--help) show_help ;; -p|--port) PORT="$2" shift 2 ;; -d|--aem-dir) CQ_HOME="$2" shift 2 ;; -o|--oak-run) OAK_RUN="$2" shift 2 ;; -y|--yes) SKIP_CONFIRMATION=true shift ;; --debug) DEBUG_MODE=true shift ;; *) log_message "ERROR" "Unknown option: $1" show_help ;; esac done # Validate required arguments if [[ ! -d "$CQ_HOME" ]]; then log_message "ERROR" "AEM directory (-d|--aem-dir) is required and must exist" show_help fi # Validate oak-run.jar if [[ ! -f "$OAK_RUN" ]]; then log_message "ERROR" "oak-run.jar not found at '$OAK_RUN'" exit 1 fi # Validate port number if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT (must be between 1 and 65535)" exit 1 fi } # Function to run oak commands with proper error handling run_oak_command() { local command=$1 local store=$2 local args=${3:-""} debug_log "Running oak command \"$command\" on $store with args \"$args\"" java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$command" "$CQ_REPO_PATH/$store" $args if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $command $store $args" exit 1 fi } # Function to check if AEM is running check_aem_off() { lsof -i tcp:"${PORT}" | awk 'NR!=1 {print $2}' | sort -u | wc -l } # Function to prompt for user confirmation wait_user_input() { local prompt_message=${1:-"Do you wish to continue"} local additional_info=$2 # If --yes flag was used, skip confirmation if [ "$SKIP_CONFIRMATION" = true ]; then log_message "INFO" "Skipping confirmation (--yes flag used)" return 0 fi # Display additional information if provided if [ -n "$additional_info" ]; then printf "\n%s\n" "$additional_info" fi printf "\n${YELLOW}%s [Y/n]${NC} " "$prompt_message" read -r response case $response in "y"|"yes"|"Y"|"") # Empty response defaults to yes return 0 ;; "n"|"N"|"no") log_message "INFO" "Operation cancelled by user" exit 0 ;; *) log_message "ERROR" "Please answer 'y' for yes or 'n' for no (default: y)" ;; esac } # Main script execution starts here check_dependencies parse_arguments "$@" # Set repository paths CQ_SEGMENTSTORE_HOME="crx-quickstart/repository" CQ_REPO_PATH="$CQ_HOME/$CQ_SEGMENTSTORE_HOME" # Validate repository path if [ ! -d "$CQ_REPO_PATH" ]; then log_message "ERROR" "Repository folder not found: $CQ_REPO_PATH" exit 1 fi # Check if AEM is running if [ "$(check_aem_off)" -eq 0 ]; then log_message "WARN" "[!] Always make sure you have a recent backup of the AEM instance." log_message "INFO" "Starting offline revision cleanup process" # Process both datastore and segmentstore for store in segmentstore datastore; do log_message "INFO" "Processing $store..." log_message "INFO" "Finding old checkpoints in $store" wait_user_input run_oak_command "checkpoints" "$store" log_message "INFO" "Removing unreferenced checkpoints from $store" wait_user_input run_oak_command "checkpoints" "$store" "rm-unreferenced" log_message "INFO" "Running compaction on $store" wait_user_input run_oak_command "compact" "$store" log_message "INFO" "Completed processing $store" done log_message "INFO" "Revision cleanup completed successfully" else log_message "ERROR" "AEM is still running on port $PORT. Please shut it down first." exit 1 fi #!/bin/bash # ============================================================================= # AEM Offline Revision Cleanup Script # ============================================================================= # Purpose: Performs maintenance operations on Adobe Experience Manager's Oak # repository, including checkpoint management and compaction. # # Source: https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/deploying/revision-cleanup#how-to-run-offline-revision-cleanup # # Version: 1.0.0 # Author: Giuseppe Baglio # ============================================================================= # Color definitions for better readability RED='\033[0;31m' GREEN='\e[0;32m' YELLOW='\e[0;33m' BLUE='\e[0;34m' NC='\033[0m' # No Color # Default values for command line arguments PORT=4502 CQ_HOME="" OAK_RUN="" SKIP_CONFIRMATION=false # Flag to skip user confirmation DEBUG_MODE=false # Function to display usage/help message show_help() { cat << EOF Usage: $(basename "$0") [OPTIONS] Performs offline revision cleanup for an AEM instance, including checkpoint management and compaction of the Oak repository. Options: -h, --help Display this help message -p, --port PORT Specify the AEM instance port (default: 4502) -d, --aem-dir DIR Specify the AEM installation directory (required) -o, --oak-run FILE Specify the path to oak-run.jar (required) -y, --yes Skip confirmation prompts and proceed automatically --debug Enable debug mode for verbose output Example: $(basename "$0") --port 4502 --aem-dir /path/to/aem --oak-run /path/to/oak-run.jar --yes Requirements: - Java 1.8 or higher - lsof utility - AEM must be completely shut down before running this script - Adequate disk space for compaction Note: Always maintain a backup before running this script. EOF exit 1 } # Function to log messages with timestamp log_message() { local level=$1 shift local message=$* local timestamp timestamp=$(date '+%Y-%m-%d %H:%M:%S') case "$level" in "INFO") printf "${GREEN}[INFO]${NC} ";; "WARN") printf "${YELLOW}[WARN]${NC} ";; "ERROR") printf "${RED}[ERROR]${NC} ";; *) printf "[${level}] ";; esac printf "[%s] %s\n" "$timestamp" "$message" } # Function to handle debug output debug_log() { if [ "$DEBUG_MODE" = true ]; then log_message "DEBUG" "$*" fi } check_dependencies() { local dependencies=("java" "lsof") for dep in "${dependencies[@]}"; do if ! command -v "$dep" > /dev/null 2>&1; then log_message "ERROR" "'$dep' is not installed. Please install it and try again." exit 1 fi debug_log "Found required dependency: $dep" done } # Function to parse command line arguments parse_arguments() { while [[ $# -gt 0 ]]; do case $1 in -h|--help) show_help ;; -p|--port) PORT="$2" shift 2 ;; -d|--aem-dir) CQ_HOME="$2" shift 2 ;; -o|--oak-run) OAK_RUN="$2" shift 2 ;; -y|--yes) SKIP_CONFIRMATION=true shift ;; --debug) DEBUG_MODE=true shift ;; *) log_message "ERROR" "Unknown option: $1" show_help ;; esac done # Validate required arguments if [[ ! -d "$CQ_HOME" ]]; then log_message "ERROR" "AEM directory (-d|--aem-dir) is required and must exist" show_help fi # Validate oak-run.jar if [[ ! -f "$OAK_RUN" ]]; then log_message "ERROR" "oak-run.jar not found at '$OAK_RUN'" exit 1 fi # Validate port number if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then log_message "ERROR" "Invalid port number: $PORT (must be between 1 and 65535)" exit 1 fi } # Function to run oak commands with proper error handling run_oak_command() { local command=$1 local store=$2 local args=${3:-""} debug_log "Running oak command \"$command\" on $store with args \"$args\"" java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$command" "$CQ_REPO_PATH/$store" $args if [ $? -ne 0 ]; then log_message "ERROR" "Oak command failed: $command $store $args" exit 1 fi } # Function to check if AEM is running check_aem_off() { lsof -i tcp:"${PORT}" | awk 'NR!=1 {print $2}' | sort -u | wc -l } # Function to prompt for user confirmation wait_user_input() { local prompt_message=${1:-"Do you wish to continue"} local additional_info=$2 # If --yes flag was used, skip confirmation if [ "$SKIP_CONFIRMATION" = true ]; then log_message "INFO" "Skipping confirmation (--yes flag used)" return 0 fi # Display additional information if provided if [ -n "$additional_info" ]; then printf "\n%s\n" "$additional_info" fi printf "\n${YELLOW}%s [Y/n]${NC} " "$prompt_message" read -r response case $response in "y"|"yes"|"Y"|"") # Empty response defaults to yes return 0 ;; "n"|"N"|"no") log_message "INFO" "Operation cancelled by user" exit 0 ;; *) log_message "ERROR" "Please answer 'y' for yes or 'n' for no (default: y)" ;; esac } # Main script execution starts here check_dependencies parse_arguments "$@" # Set repository paths CQ_SEGMENTSTORE_HOME="crx-quickstart/repository" CQ_REPO_PATH="$CQ_HOME/$CQ_SEGMENTSTORE_HOME" # Validate repository path if [ ! -d "$CQ_REPO_PATH" ]; then log_message "ERROR" "Repository folder not found: $CQ_REPO_PATH" exit 1 fi # Check if AEM is running if [ "$(check_aem_off)" -eq 0 ]; then log_message "WARN" "[!] Always make sure you have a recent backup of the AEM instance." log_message "INFO" "Starting offline revision cleanup process" # Process both datastore and segmentstore for store in segmentstore datastore; do log_message "INFO" "Processing $store..." log_message "INFO" "Finding old checkpoints in $store" wait_user_input run_oak_command "checkpoints" "$store" log_message "INFO" "Removing unreferenced checkpoints from $store" wait_user_input run_oak_command "checkpoints" "$store" "rm-unreferenced" log_message "INFO" "Running compaction on $store" wait_user_input run_oak_command "compact" "$store" log_message "INFO" "Completed processing $store" done log_message "INFO" "Revision cleanup completed successfully" else log_message "ERROR" "AEM is still running on port $PORT. Please shut it down first." exit 1 fi