How to Automate CDN Cache Invalidation in Adobe Experience Manager

Written by realgpp | Published 2023/12/18

TL;DR: I just released an open-source AEM library on GitHub that automatically clears CDN caches when content is published, so updates reach users faster. It integrates the Akamai API out of the box and offers a configurable, extensible alternative to tedious manual cache purging. Key points:

  • Automates CDN cache clearing on content updates
  • Accelerates delivery of site changes
  • Integrates with the Akamai API
  • Is configurable and extensible
  • Removes the hassle of manual cache purges

Content Delivery Networks (CDNs) are commonly used with AEM to improve website performance by caching and serving content from edge locations closer to users. However, when AEM content changes, the CDN can keep serving outdated copies from its cache. Invalidating that cache manually is complex and error-prone.

Though CDNs improved performance, our authors’ new changes weren’t reflected until caches expired. Clearing the cache manually post-publish did not scale. I could not find an existing out-of-the-box solution for this problem.

So I developed an open-source AEM library that listens to publishing events and automatically triggers selective cache invalidation based on configurable rules.

The library reacts to AEM replication events or resource changes and then runs asynchronous background jobs to invalidate cached CDN content related to the published updates. Flexible APIs allow integration with various CDNs while extension points enable customization.

The Problem

CDNs cache website content like pages, images, and JS/CSS assets. This avoids repeatedly serving static content from the AEM publish tier which improves response times. The CDN only re-fetches assets when they expire from the cache after a TTL (Time to Live).
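To make the TTL behavior concrete, here is a minimal plain-Java sketch of an edge cache. It is only a toy model, not how any particular CDN is implemented, and every name in it (TtlCacheSketch, get, invalidate) is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an edge cache: an entry is served until its TTL runs out
// or until it is explicitly invalidated. All names are illustrative.
class TtlCacheSketch {

    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;

    TtlCacheSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached copy while it is fresh; otherwise re-fetches from the origin.
    String get(String url, Map<String, String> origin, long nowMillis) {
        Entry cached = entries.get(url);
        if (cached != null && nowMillis < cached.expiresAtMillis) {
            return cached.body;
        }
        String fresh = origin.get(url);
        entries.put(url, new Entry(fresh, nowMillis + ttlMillis));
        return fresh;
    }

    // Explicit invalidation: the next request goes back to the origin.
    void invalidate(String url) {
        entries.remove(url);
    }

    public static void main(String[] args) {
        Map<String, String> origin = new HashMap<>();
        origin.put("/en/home.html", "v1");
        TtlCacheSketch cdn = new TtlCacheSketch(60_000);

        System.out.println(cdn.get("/en/home.html", origin, 0));     // v1, fetched from origin
        origin.put("/en/home.html", "v2");                            // an author publishes an update
        System.out.println(cdn.get("/en/home.html", origin, 1_000)); // v1, the cached copy is still fresh
        cdn.invalidate("/en/home.html");
        System.out.println(cdn.get("/en/home.html", origin, 2_000)); // v2, served after invalidation
    }
}
```

The second read is the situation this article is about: the origin already has the new version, but users keep getting the old one until the TTL passes or someone invalidates the entry.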

However, after publishing, outdated copies of updated AEM content can linger in the CDN. Some examples:

  • A newly published version of a content page is not immediately visible to users. The older version remains cached.
  • Updating a product image in DAM does not show until the CDN cache expires.

Depending on the business context, any of these situations can be a really sensitive issue that needs to be addressed urgently. But sadly, clearing the cache manually on every publish is operationally complex and time-consuming:

  • Identifying related URLs is manual and repetitive

  • Cache tagging schemes for coordinating flushes are tedious to maintain

An automated solution is needed… one that selectively and intelligently invalidates cache only for published updates.

Automated Cache Invalidation

The library, “Automatic CDN Cache Invalidator for AEM”, listens to publish events and resource changes, then triggers background cache invalidation according to configurable rules. Flexible APIs allow integration with various CDNs.

https://github.com/realgpp/aem-cdn-cache-invalidator?embedable=true

Key Capabilities

  • Automatic triggering: Reacts to AEM replication events for publishing updates

  • Selective invalidation: Invalidates cache only for specific published paths rather than mass flush

  • Async processing: Uses queue-based jobs framework to offload CDN calls

  • Configurable rules: Flush cache by URL, tag or custom criteria

  • Logs: Detailed logging for tracking

  • Akamai support built-in: Akamai Purge API is integrated

  • Flexible CDN integrations: Easily extendable for other CDNs via APIs

  • Extension APIs: Hooks to personalize behavior

It solves manual cache management challenges allowing authors to focus on content while optimizing end-user experience.

Let’s walk through the flow…

How it Works

The application has configurable OSGi services and listeners for cache invalidation triggered by AEM events.

  1. Event listening: Replication or resource change events signify content updates.
  2. Selective listener: An OSGi listener consumes the events, filters them to specific content trees, and creates asynchronous jobs carrying the updated paths.
  3. Background job: A separate job thread processes paths to extract cache-related metadata like URLs, tags etc.
  4. CDN API call: The job makes the API call to invalidate the cache for the computed metadata.
  5. Cache flush: The CDN receives the API request and flushes related cached content.

The asynchronous approach keeps CDN calls from slowing down publish requests: they happen out-of-band without blocking AEM. The entire flow is configurable via OSGi, including the supported events, cache criteria, URLs, and logic.
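The five steps above can be modeled in plain Java without any AEM dependency. The sketch below is an assumption-laden illustration, not the library's code: the class InvalidationFlowSketch, the CdnClient interface, the path-to-URL mapping, and the www.example.com host are all invented here, and a single-thread executor stands in for the Sling job queue.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Plain-Java model of the flow: event -> filtering listener -> async job -> CDN call.
// None of these types come from the library; they only mirror the steps above.
class InvalidationFlowSketch {

    // Stand-in for the CDN client (step 4); a real one would call the purge API.
    interface CdnClient {
        void purge(List<String> urls);
    }

    private final Pattern filter;
    private final ExecutorService jobQueue = Executors.newSingleThreadExecutor();
    private final CdnClient cdn;

    InvalidationFlowSketch(String filterRegex, CdnClient cdn) {
        this.filter = Pattern.compile(filterRegex);
        this.cdn = cdn;
    }

    // Steps 1-2: receive the changed paths, keep those in scope, enqueue a job.
    void onContentPublished(List<String> paths) {
        List<String> inScope = paths.stream()
                .filter(p -> filter.matcher(p).matches())
                .collect(Collectors.toList());
        if (!inScope.isEmpty()) {
            jobQueue.submit(() -> runJob(inScope)); // returns immediately, work runs later
        }
    }

    // Steps 3-4: map content paths to public URLs (hypothetical mapping), then call the CDN.
    private void runJob(List<String> paths) {
        List<String> urls = paths.stream()
                .map(p -> "https://www.example.com" + p.replaceFirst("^/content/[^/]+", "") + ".html")
                .collect(Collectors.toList());
        cdn.purge(urls);
    }

    // Lets the demo wait for queued jobs before reading the result.
    void shutdownAndWait() {
        jobQueue.shutdown();
        try {
            jobQueue.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> purged = new CopyOnWriteArrayList<>();
        InvalidationFlowSketch flow = new InvalidationFlowSketch("^/content/wknd/.*$", purged::addAll);
        flow.onContentPublished(List.of("/content/wknd/us/en/about", "/content/other/page"));
        flow.shutdownAndWait();
        System.out.println(purged); // [https://www.example.com/us/en/about.html]
    }
}
```

The point of the design is visible in onContentPublished: the caller only pays for filtering and enqueueing, while the slower CDN call runs on a separate thread.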

Out-of-the-box integrations make it easy to get started while extension points allow customization. Let’s explore common configuration options…

Purging content with Akamai API

The project includes reusable services, listeners, and jobs supporting caching scenarios like:

  • Invalidating pages or assets by URL

  • Website sections by tag or code

The Akamai purge operations supported out of the box cover these same scenarios: purging by URL, by tag, or by code.

The following block, which you can find in the ui.config.example maven module, shows how to configure the Akamai service that comes with the library.

// file com.baglio.autocdninvalidator.core.service.impl.AkamaiInvalidationServiceImpl.cfg.json
{
  "isEnabled": true,
  "configurationID": "cdn-akamai",
  "hostname": "localhost:8443",
  "getAkamaiClientToken": "placeholder_akamai_client_token",
  "getAkamaiAccessToken": "placeholder_akamai_access_token",
  "getAkamaiClientSecret": "placeholder_akamai_client_secret",
  "httpClientConfigurationID": "http-client-akamai",
  "purgeType": "invalidate"
}

This service lets you choose how to purge content from the CDN: you can either invalidate or delete it. As you can see, you need to replace the placeholders with your Akamai values. You also need a dedicated HTTP Client Configuration to work with the Akamai APIs. After that, your service can be used by other Sling jobs that handle content-change events.
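For orientation, this is roughly what a purge request to Akamai looks like on the wire, based on my reading of Akamai's Fast Purge v3 API. The sketch only builds the request path and JSON body; it does not send anything, it omits the EdgeGrid request signing that the three credentials are used for, and it is not taken from the library's source. The class and method names are invented for this example.

```java
import java.util.List;
import java.util.stream.Collectors;

// Builds the path and JSON body of an Akamai Fast Purge v3 request without sending it.
// Not the library's code; authentication (EdgeGrid signing) is left out on purpose.
class FastPurgeRequestSketch {

    // action:  "invalidate" or "delete" (compare "purgeType" in the config above)
    // type:    "url", "tag" or "cpcode"
    // network: "staging" or "production"
    static String path(String action, String type, String network) {
        return "/ccu/v3/" + action + "/" + type + "/" + network;
    }

    // Objects are URLs or cache tags; numeric CP codes are not handled in this sketch.
    static String body(List<String> objects) {
        return objects.stream()
                .map(o -> "\"" + o.replace("\\", "\\\\").replace("\"", "\\\"") + "\"")
                .collect(Collectors.joining(",", "{\"objects\":[", "]}"));
    }

    public static void main(String[] args) {
        System.out.println("POST " + path("invalidate", "url", "production"));
        System.out.println(body(List.of("https://www.example.com/us/en/about.html")));
        // POST /ccu/v3/invalidate/url/production
        // {"objects":["https://www.example.com/us/en/about.html"]}
    }
}
```

Switching "purgeType" between invalidate and delete would, under this reading, only change the action segment of the path.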

Now let’s explore other configuration options that let you customize the library without coding…

Configuration Areas

The library offers a variety of configuration options to customize its functionality without code customizations. You can adjust these settings to suit your specific needs and preferences.

I want to keep this introduction brief and engaging but don’t worry if you have questions about the details. You can find more information in the README.md file of the project or in the Javadoc comments if you like to explore the code.

Key aspects to configure are:

  • Observed events: Replication topics or resource change events that trigger the invalidation jobs.

// file com.baglio.autocdninvalidator.core.listeners.ReplicationEventListener-website-generic.cfg.json
{
  "isEnabled": true,
  "resource.paths": [
    "/content/we-retail",
    "/content/wknd"
  ],
  "job.topic": "com/baglio/autocdninvalidator/akamai/generic",
  "filter.regex": "^/content/(we-retail|wknd)(?!.*products)(?!.*adventures).*$"
}
// file com.baglio.autocdninvalidator.core.listeners.DynamicResourceChangeListener-website-generic.cfg.json
{
  "isEnabled": false,
  "resource.change.types": [
    "CHANGED"
  ],
  "resource.paths": [
    "/content/we-retail",
    "/content/wknd"
  ],
  "job.topic": "com/baglio/autocdninvalidator/akamai/generic",
  "filter.regex": "^/content/(we-retail|wknd)(?!.*products)(?!.*adventures).*$"
}

  • CDN credentials: API credentials and settings for your provider.

// file com.baglio.autocdninvalidator.core.service.impl.AkamaiInvalidationServiceImpl.cfg.json
{
  "isEnabled": true,
  "configurationID": "cdn-akamai",
  "getAkamaiClientToken": "placeholder_akamai_client_token",
  "hostname": "localhost:8443",
  "getAkamaiAccessToken": "placeholder_akamai_access_token",
  "getAkamaiClientSecret": "placeholder_akamai_client_secret",
  "httpClientConfigurationID": "http-client-akamai"
}

  • Invalidation rules: Filtering rules and cache flush criteria for each job consumer.

// file com.baglio.autocdninvalidator.core.jobs.EditorialAssetInvalidationJobConsumer.cfg.json
{
  "isEnabled": true,
  "job.topics": [
    "com/baglio/autocdninvalidator/akamai/generic"
  ],
  "cdnConfigurationID": "cdn-akamai",
  "invalidation.type": "tag",
  "tagCodeMappings": [
    "/content/we-retail/(..)(/.*)*=tag-dev-$1",
    "/content/we-retail/(..)/.*/experience/.*=tag-dev-$1-experience",
    "/content/wknd/(..)(/.*)*=tag-dev-$1",
    "/content/wknd/(..)/.*/adventures/.*=tag-dev-$1-adventures"
  ]
}

  • Job topic tuning: Throttling and parallelization settings for each job topic. See the section “Custom Job Queue Configuration” in the project README file.

These configurations determine how the library responds to AEM events, how many cache jobs run concurrently, and how the service communicates with your CDN provider to purge the cache.
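To see how a single content path would travel through the filter.regex and tagCodeMappings values shown above, here is a small plain-Java sketch. The regular expressions are copied from the sample configs; how the library actually evaluates them is an assumption on my side. In particular, I assume each mapping has the form "path regex = tag template", that the regex must match the whole path, and that every matching mapping contributes a tag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConfigRulesSketch {

    // Same value as "filter.regex" in the listener configs above.
    static final Pattern FILTER = Pattern.compile(
            "^/content/(we-retail|wknd)(?!.*products)(?!.*adventures).*$");

    // Two of the "tagCodeMappings" entries above.
    static final List<String> MAPPINGS = List.of(
            "/content/we-retail/(..)(/.*)*=tag-dev-$1",
            "/content/we-retail/(..)/.*/experience/.*=tag-dev-$1-experience");

    // The negative lookaheads reject any path containing "products" or "adventures".
    static boolean passesFilter(String path) {
        return FILTER.matcher(path).matches();
    }

    // Assumption: every mapping whose regex matches the whole path yields one tag,
    // with $1 replaced by the first capture group (the two-letter country segment).
    static List<String> tagsFor(String path) {
        List<String> tags = new ArrayList<>();
        for (String mapping : MAPPINGS) {
            int split = mapping.lastIndexOf('=');
            Matcher m = Pattern.compile(mapping.substring(0, split)).matcher(path);
            if (m.matches()) {
                tags.add(m.replaceFirst(mapping.substring(split + 1)));
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String page = "/content/we-retail/us/en/experience/arctic-surfing";
        System.out.println(passesFilter(page));                                  // true
        System.out.println(passesFilter("/content/wknd/us/en/adventures/bali")); // false
        System.out.println(tagsFor(page)); // [tag-dev-us, tag-dev-us-experience]
    }
}
```

With "invalidation.type" set to "tag", those resulting tags would be what the job consumer sends to the CDN instead of individual URLs.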

Now that we have learned the basics, let’s move on to the fun part where we show you how to make the invalidation work on the AEM instance…

Enabling Akamai CDN Integration

If you want to use this library right away, you don’t need to write any code. Just follow these simple steps:

  • Step 1: Install the ui.config.example module or create the OSGi config manually. The ui.config.example module is a sample configuration that provides a working example for Akamai integration. You can install it from the package manager or create the OSGi config manually through the web console.

  • Step 2: Enter your Akamai credentials and settings in the AkamaiInvalidationServiceImpl config file. You must provide your Akamai hostname, client token, access token, and client secret in the AkamaiInvalidationServiceImpl config file. You can find these values in your Akamai portal.

  • Step 3: Choose to listen for AEM events on Author or Publish. Depending on your setup, you can choose to listen for AEM events on Author or Publish. If you choose Author, the module will detect publish events and invalidate the cache accordingly. If you choose Publish, the module will detect content updates and purge the cache accordingly.

Now let’s walk through adding support for a new custom CDN provider…

Implementing a Custom CDN Provider

Now that we’ve explored utilizing the library without coding, what if you encounter the need for a CDN provider that isn’t currently supported?

The library is designed for extension to support new CDN providers. You can easily extend it by following these steps:

  1. Create a Java class implementing the CdnInvalidationService interface. This interface defines the methods for purging the CDN cache.
  2. Implement the logic for calling your CDN’s purge API in your implementation class.
  3. Create an OSGi configuration for your class with the necessary credentials and property values. Make sure to give it a unique configuration ID so that it can be identified by the invalidation job.
  4. Use the configuration ID mentioned above when you create invalidation job configurations.
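The four steps can be sketched as follows. Please treat this as a shape, not as the real API: the nested InvalidationService interface is a hypothetical stand-in for CdnInvalidationService (whose actual methods are documented in the project's Javadoc), "cdn-mycdn" is an invented configuration ID, and the implementation records requests in a list instead of calling a real provider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class CustomCdnProviderSketch {

    // Hypothetical stand-in for the library's CdnInvalidationService interface.
    // The real method names and signatures are in the project's Javadoc.
    interface InvalidationService {
        String getConfigurationID();
        boolean purgeByUrls(Set<String> urls);
    }

    // Steps 1-2: a provider-specific implementation.
    static class MyCdnInvalidationService implements InvalidationService {
        private final String configurationId;            // step 3: would be read from the OSGi config
        final List<String> requests = new ArrayList<>(); // records calls instead of sending HTTP

        MyCdnInvalidationService(String configurationId) {
            this.configurationId = configurationId;
        }

        @Override
        public String getConfigurationID() {
            return configurationId;
        }

        @Override
        public boolean purgeByUrls(Set<String> urls) {
            if (urls == null || urls.isEmpty()) {
                return false; // nothing to purge
            }
            // A real implementation would call the provider's purge endpoint here.
            requests.add("PURGE " + String.join(",", new TreeSet<>(urls)));
            return true;
        }
    }

    // Step 4: a job picks the service whose ID matches its own configuration.
    static InvalidationService findById(List<InvalidationService> services, String id) {
        return services.stream()
                .filter(s -> s.getConfigurationID().equals(id))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("No CDN service with ID " + id));
    }

    public static void main(String[] args) {
        MyCdnInvalidationService myCdn = new MyCdnInvalidationService("cdn-mycdn");
        InvalidationService service = findById(List.of(myCdn), "cdn-mycdn");
        service.purgeByUrls(Set.of("https://www.example.com/us/en/about.html"));
        System.out.println(myCdn.requests); // [PURGE https://www.example.com/us/en/about.html]
    }
}
```

In the sample job consumer config, the "cdnConfigurationID" property plays the role of the id argument passed to findById here.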

The library provides flexibility to customize the invalidation logic when using the built-in Akamai integration. If you need to tweak the preprocessing or postprocessing, the AbstractInvalidationJob class exposes hook methods you can override without having to reimplement the full Akamai handling.

By overriding these hook methods, you keep the benefits of the existing Akamai capabilities while adapting things like:

  • The preprocessInvalidationValues and postprocessInvalidationValues methods allow transforming the content paths and invalidation values before and after generating them.

  • The preprocessPublicUrls and postprocessPublicUrls methods allow updating the public URLs before and after resolving them.

  • The beforeInvalidation and afterInvalidation methods enable logic before and after making the CDN purge call.

By selectively overriding these hooks, you can adapt the data and flows to your needs while reusing the bulk of the Akamai integration code. This makes the library very extensible without having to start from scratch.
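The hook approach is essentially the template method pattern. Below is a minimal, self-contained sketch of it. The hook names are the ones listed above, but BaseJob is only a hypothetical stand-in for AbstractInvalidationJob: the method signatures, the process() skeleton, and the ".html" value generation are my assumptions, so check the Javadoc for the real contracts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class HookOverrideSketch {

    // Hypothetical stand-in for AbstractInvalidationJob. The hook names come from
    // the article; their signatures and the process() template are assumptions.
    static abstract class BaseJob {
        final List<String> purged = new ArrayList<>();

        protected Set<String> preprocessInvalidationValues(Set<String> paths) { return paths; }
        protected Set<String> postprocessInvalidationValues(Set<String> values) { return values; }
        protected boolean beforeInvalidation(Set<String> values) { return true; }
        protected void afterInvalidation(Set<String> values, boolean success) { }

        // Fixed skeleton: subclasses change behavior only through the hooks above.
        final boolean process(Set<String> paths) {
            Set<String> input = preprocessInvalidationValues(paths);
            Set<String> values = postprocessInvalidationValues(toValues(input));
            if (!beforeInvalidation(values)) {
                return false; // the hook vetoed the purge
            }
            purged.addAll(values); // a real job would call the CDN here
            afterInvalidation(values, true);
            return true;
        }

        // Default value generation: one ".html" URL path per content path.
        private Set<String> toValues(Set<String> paths) {
            return paths.stream().map(p -> p + ".html")
                    .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    // Customization: skip archived pages and refuse to send empty purges.
    static class SkipArchiveJob extends BaseJob {
        @Override
        protected Set<String> preprocessInvalidationValues(Set<String> paths) {
            return paths.stream().filter(p -> !p.contains("/archive/"))
                    .collect(Collectors.toCollection(TreeSet::new));
        }

        @Override
        protected boolean beforeInvalidation(Set<String> values) {
            return !values.isEmpty();
        }
    }

    public static void main(String[] args) {
        SkipArchiveJob job = new SkipArchiveJob();
        job.process(Set.of("/content/wknd/us/en/about", "/content/wknd/us/en/archive/2019"));
        System.out.println(job.purged); // [/content/wknd/us/en/about.html]
    }
}
```

SkipArchiveJob overrides two hooks and inherits everything else, which is the same trade-off the library offers: small, targeted overrides instead of a reimplemented job.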

Wrap Up

And that’s a wrap! We’ve covered everything from out-of-the-box setup to fully custom integrations with this CDN invalidator library. I hope you found it an interesting read!

Did any part leave you scratching your head? Or maybe this library has already helped you invalidate caches more easily?

Either way, I’d love to hear your thoughts! Let me know what worked, what didn’t, or what other guides would help improve your CDN workflows.

Even better — I welcome pull requests if you have code changes, tweaks, or new snippets that can enhance this library and tutorial!

I greatly appreciate you taking the time to provide feedback and contributions. Your real-world experiences will help evolve this project and craft content that empowers fellow developers to master their CDN footprints.



Written by realgpp | Working as an AEM Solution Architect, I consider myself as a software craftsman.
Published by HackerNoon on 2023/12/18