Imagine we have a system where users add articles. For each user, we display statistics about their articles in their personal dashboard: the number of articles, average word count, publication frequency, etc. To speed up the system, we cache this data. A unique cache key is created for each report.
How do we invalidate these caches when the data changes? One approach is to clear the affected keys manually on each event, for instance when a new article is added:
class InvalidateArticleReportCacheOnArticleCreated {
    public function handle(event: ArticleCreatedEvent): void {
        // Every report key for this user has to be listed by hand.
        this->cacheService->deleteMultiple([
            'user_article_report_count_' . event->userId,
            'user_article_report_word_avg_' . event->userId,
            'user_article_report_freq_avg_' . event->userId,
        ])
    }
}
This method works but becomes cumbersome when dealing with a large number of reports and keys. This is where tagged caching comes in handy. Tagged caching allows data to be associated not only with a key but also with an array of tags. Subsequently, all records associated with a specific tag can be invalidated, significantly simplifying the process.
Writing a value to the cache with tags:
this->taggedCacheService->set(
    key: 'user_article_report_count_' . user->id,
    value: value,
    tagNames: [
        'user_article_cache_tag_' . user->id,
        'user_article_report_cache_tag_' . user->id,
        'user_article_report'
    ]
)
Invalidating the cache by tags:
class UpdateCacheTagsOnArticleCreated {
    public function handle(event: ArticleCreatedEvent): void {
        // The handler only receives the event, so the user id comes from it.
        this->taggedCacheService->updateTagsVersions([
            'user_article_cache_tag_' . event->userId,
        ])
    }
}
Here, the tag 'user_article_cache_tag_' . user->id represents changes in the user's articles. It can be used to invalidate any caches dependent on this data. A more specific tag, 'user_article_report_cache_tag_' . user->id, allows only the user's reports to be cleared, while a general tag, 'user_article_report', invalidates report caches for all users.
If your caching library does not support tagging, you can implement it yourself. The idea is to store a current version for each tag and, alongside every tagged value, the tag versions that were current when the value was written. When a value is read, the current versions of its tags are fetched as well and compared with the stored ones; the value is valid only if they all match.
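Before building the full class, the idea can be sketched in a few lines. The following is a minimal illustration in Python, not the article's implementation: a plain dictionary stands in for the cache service, the version source is a simple counter, and the stored tag versions are kept next to the value rather than under a separate key. All function names here are hypothetical.

```python
import itertools

store = {}                     # stands in for the real cache backend
_counter = itertools.count(1)  # hypothetical version source: any value that changes works


def current_version(tag):
    """Return the tag's current version, creating one if the tag is new."""
    key = "tag_version_" + tag
    if key not in store:
        store[key] = next(_counter)
    return store[key]


def bump(tag):
    """Invalidate everything stamped with this tag by changing its version."""
    store["tag_version_" + tag] = next(_counter)


def set_tagged(key, value, tags):
    # Store the value together with the tag versions seen at write time.
    stamp = {tag: current_version(tag) for tag in tags}
    store[key] = (value, stamp)


def get_tagged(key):
    entry = store.get(key)
    if entry is None:
        return None
    value, stamp = entry
    # The entry is valid only if every tag still has the version it was stamped with.
    if any(current_version(tag) != version for tag, version in stamp.items()):
        del store[key]
        return None
    return value


set_tagged("user_article_report_count_7", 12, ["user_article_cache_tag_7"])
first_read = get_tagged("user_article_report_count_7")   # read before invalidation
bump("user_article_cache_tag_7")
second_read = get_tagged("user_article_report_count_7")  # read after invalidation
```

Note that bumping a tag never touches the tagged entries themselves; stale entries are only discovered, and removed, the next time they are read.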
Creating a TaggedCache class:
class TaggedCache {
    private cacheService: cacheService
}
Implementing the set method for writing to the cache with tags. In this method, we write the value to the cache, retrieve the current versions of the provided tags, and save those versions in association with the cache key. The tag versions are stored under an additional key, formed by adding a prefix to the provided key.
class TaggedCache {
    private cacheService: cacheService

    public function set(
        key: string,
        value: mixed,
        tagNames: string[],
        ttl: int
    ): bool {
        // An entry without tags cannot be invalidated by tag, so refuse it.
        if (empty(tagNames)) {
            return false
        }
        tagVersions = this->getTagsVersions(tagNames)
        tagsCacheKey = this->getTagsCacheKey(key)
        // Store the value and the tag versions it was written under, with the same TTL.
        return this->cacheService->setMultiple(
            [
                key => value,
                tagsCacheKey => tagVersions,
            ],
            ttl
        )
    }

    // Returns the current version of each tag, creating one for tags that have none yet.
    private function getTagsVersions(tagNames: string[]): array<string, string> {
        tagVersions = []
        tagVersionKeys = []
        foreach (tagNames as tagName) {
            tagVersionKeys[tagName] = this->getTagVersionKey(tagName)
        }
        if (empty(tagVersionKeys)) {
            return tagVersions
        }
        tagVersionsCache = this->cacheService->getMultiple(tagVersionKeys)
        foreach (tagVersionKeys as tagName => tagVersionKey) {
            if (empty(tagVersionsCache[tagVersionKey])) {
                tagVersionsCache[tagVersionKey] = this->updateTagVersion(tagName)
            }
            tagVersions[tagName] = tagVersionsCache[tagVersionKey]
        }
        return tagVersions
    }

    private function getTagVersionKey(tagName: string): string {
        return 'tag_version_' . tagName
    }

    private function getTagsCacheKey(key: string): string {
        return 'cache_tags_tagskeys_' . key
    }
}
Adding the get method to retrieve tagged values from the cache. Here, we retrieve the value using the key, as well as the tag versions associated with that key. Then we check the validity of the tags. If any tag is invalid, the value is deleted from the cache and null is returned. If all tags are valid, the cached value is returned.
class TaggedCache {
    private cacheService: cacheService

    public function get(key: string): mixed {
        tagsCacheKey = this->getTagsCacheKey(key)
        values = this->cacheService->getMultiple([key, tagsCacheKey])
        // isset() rather than empty() for the value: a cached 0 or '' is a real value, not a miss
        // (assumes getMultiple returns null for keys that are absent).
        if (! isset(values[key]) || empty(values[tagsCacheKey])) {
            return null
        }
        value = values[key]
        tagVersions = values[tagsCacheKey]
        if (! this->isTagVersionsValid(tagVersions)) {
            // At least one tag has changed since the write: drop the stale entry.
            this->cacheService->deleteMultiple([key, tagsCacheKey])
            return null
        }
        return value
    }

    // Valid only if every stored tag version equals the tag's current version.
    private function isTagVersionsValid(tagVersions: array<string, string>): bool {
        tagNames = array_keys(tagVersions)
        actualTagVersions = this->getTagsVersions(tagNames)
        foreach (tagVersions as tagName => tagVersion) {
            if (empty(actualTagVersions[tagName])) {
                return false
            }
            if (actualTagVersions[tagName] !== tagVersion) {
                return false
            }
        }
        return true
    }
}
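One detail worth watching in a read path like this is the difference between "no entry" and "an entry whose value happens to be falsy". A user with no articles has a legitimate cached count of 0, and a truthiness check would report it as a miss on every read. The Python sketch below illustrates the distinction with a sentinel object; the function names are hypothetical, and how a real cache service signals a missing key is an assumption that depends on the library.

```python
_MISSING = object()  # sentinel: distinguishes "no entry" from a falsy cached value

cache = {"user_article_report_count_7": 0}  # a user with zero articles


def read_truthy(key):
    # Truthiness check: a cached 0 is wrongly reported as a miss.
    value = cache.get(key)
    return value if value else None


def read_with_sentinel(key):
    # Presence check: only a genuinely absent key is a miss.
    value = cache.get(key, _MISSING)
    return None if value is _MISSING else value


truthy_result = read_truthy("user_article_report_count_7")
sentinel_result = read_with_sentinel("user_article_report_count_7")
absent_result = read_with_sentinel("no_such_key")
```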
Implementing the updateTagsVersions method to update tag versions. Here, we iterate over all the tags provided and assign each one a new version. Any value that differs from the previous version will do; this example uses a high-resolution timer reading.
class TaggedCache {
    private cacheService: cacheService

    public function updateTagsVersions(tagNames: string[]): void {
        foreach (tagNames as tagName) {
            this->updateTagVersion(tagName)
        }
    }

    // Writes a fresh version for the tag; returns '' if the write fails.
    private function updateTagVersion(tagName: string): string {
        tagKey = this->getTagVersionKey(tagName)
        tagVersion = this->generateTagVersion()
        return this->cacheService->set(tagKey, tagVersion) ? tagVersion : ''
    }

    private function generateTagVersion(): string {
        return (string) hrtime(true)
    }
}
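A tag version is only ever compared for equality, so it does not need to be ordered or tied to a clock. PHP's hrtime counts from an arbitrary point in time, which is fine on a single host; if versions are generated on several machines, a random token is one possible alternative. The sketch below shows that variant in Python. It is an assumption for illustration, not something the approach above requires.

```python
import uuid


def generate_tag_version() -> str:
    # Hypothetical alternative to a timer-based version: a random 128-bit token.
    return uuid.uuid4().hex


old_version = generate_tag_version()
new_version = generate_tag_version()
```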
Tagged caching is both convenient and general: you no longer have to list every key to invalidate, because updating a tag version invalidates everything written under it. The cost is extra storage for the tag versions and an additional lookup of the current tag versions on every read.
If your caching service is fast and not heavily constrained in size, this approach will not significantly affect performance. To minimize the load, you can combine tagged caching with local caching mechanisms.
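One way to read "local caching" here is to remember tag versions in process memory for the duration of a single request, so that repeated reads of values sharing the same tags hit the shared cache only once. The Python sketch below assumes that interpretation; CountingBackend and LocalTagVersions are hypothetical names, and the backend is a dictionary that counts its reads so the saving is visible.

```python
class CountingBackend:
    """Stand-in for the shared cache; counts reads so the saving is visible."""

    def __init__(self, data):
        self.data = dict(data)
        self.reads = 0

    def get_multiple(self, keys):
        self.reads += 1
        return {key: self.data.get(key) for key in keys}


class LocalTagVersions:
    """Remembers tag versions for the lifetime of one request (hypothetical helper)."""

    def __init__(self, backend):
        self.backend = backend
        self.local = {}

    def get(self, tag_names):
        missing = [name for name in tag_names if name not in self.local]
        if missing:
            # Only tags not seen yet in this request go to the shared cache.
            self.local.update(self.backend.get_multiple(missing))
        return {name: self.local[name] for name in tag_names}


backend = CountingBackend({"user_article_cache_tag_7": "v1", "user_article_report": "v5"})
versions = LocalTagVersions(backend)

versions.get(["user_article_cache_tag_7", "user_article_report"])  # one backend read
versions.get(["user_article_cache_tag_7"])                         # served locally
versions.get(["user_article_report"])                              # served locally
```

The trade-off is freshness: an invalidation issued by another process is not seen until the local copy is discarded, so the local layer should live no longer than a request or a very short interval.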
In this way, tagged caching not only simplifies invalidation but also makes working with data more flexible and understandable, especially in complex systems with large amounts of interconnected data.