If you’ve ever implemented a cache with TTLs (time-to-live), you’ve likely faced a tricky balance: how do you efficiently expire entries without slowing down writes or wasting memory?
Redis, the in-memory data store famous for its sub-millisecond latency, has an elegant solution. It doesn’t rely on complex data structures like heaps or sorted lists to manage expirations. Instead, it uses a little randomness and a lot of clever engineering.
Let’s unpack how it works.
The Naïve Approach
Imagine you’re designing your own cache. You want every entry to expire after some duration. The straightforward solution is:
- Keep a sorted list (or min-heap) of keys by expiration time.
- Periodically check the earliest ones and delete any that are past due.
This works… but it’s not cheap. Each insertion into a heap costs O(log n), and with millions of keys, that adds up.
Redis was designed for millions of writes per second; an O(log n) write path simply wouldn’t fly. So how does Redis avoid this?
Redis’ Two-Pronged Approach
Redis uses two complementary mechanisms to handle expirations efficiently:
1. Lazy Expiration
Every time a key is accessed (say, via GET or HGET), Redis checks whether it’s expired.
If it has, Redis deletes it immediately and pretends the key never existed. That’s free cleanup, but only if the key is ever accessed again.
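In essence, lazy expiration is just a check on the read path. Here is a minimal Python sketch of the idea (the `LazyCache` class and `ttl_ms` parameter are illustrative, not Redis internals):

```python
import time

class LazyCache:
    """Minimal sketch of lazy (on-access) expiration.
    Illustrative only, not Redis' actual implementation."""

    def __init__(self):
        self.data = {}     # key -> value
        self.expires = {}  # key -> absolute expiry time in ms (TTL'd keys only)

    def set(self, key, value, ttl_ms=None):
        self.data[key] = value
        if ttl_ms is not None:
            self.expires[key] = time.time() * 1000 + ttl_ms

    def get(self, key):
        # Every read checks the expiry first: this is lazy expiration.
        exp = self.expires.get(key)
        if exp is not None and time.time() * 1000 >= exp:
            # Expired: delete it and pretend it never existed.
            del self.data[key]
            del self.expires[key]
            return None
        return self.data.get(key)
```

No timers, no background work: the cost of the check is paid only by reads that actually touch the key.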
But what about keys that expire and are never touched? That’s where the second mechanism kicks in.
2. Active Expiration (the “Random Cleanup Loop”)
Internally, Redis keeps a separate dictionary called expires, which maps keys to their expiration times:
expires["foo"] = 1730665000000 // Unix time in ms
Only keys with a TTL are in this dictionary.
About 10 times per second, Redis runs a periodic cleanup task (activeExpireCycle) that does this:
- Sample 20 random keys from the expires dictionary.
- If a sampled key’s TTL has passed, delete it.
- If more than 25% of the sampled keys were expired, run another round, up to a CPU time limit.
Here’s a simplified version of the logic:
```
for each db in redis:
    repeat:
        expired_count = 0
        for i in 1..20:
            key = random_key(db.expires)
            if key.is_expired():
                delete(key)
                expired_count++
        if expired_count < 5:  // fewer than 25% of the sample
            break
```
That’s it. No global heap. No O(log n) operations. Just random sampling and adaptive cleanup.
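To make the loop concrete, here is a runnable Python sketch of the same adaptive cycle. The function name and the `delete` callback are my own; the real activeExpireCycle is C code and also enforces a per-cycle CPU-time budget, which is omitted here:

```python
import random

def active_expire_cycle(expires, delete, now_ms, samples=20, threshold=0.25):
    """Adaptive cleanup sketch: keep sampling while many keys are expired.

    expires: dict of key -> expiry time in ms (only TTL'd keys appear here)
    delete:  callback that removes a key from the main store
    """
    while expires:
        batch = random.sample(list(expires), min(samples, len(expires)))
        expired = 0
        for key in batch:
            if expires[key] <= now_ms:
                del expires[key]   # drop the TTL entry...
                delete(key)        # ...and the key itself
                expired += 1
        # Stop once fewer than 25% of the sampled keys were expired.
        if expired < len(batch) * threshold:
            break
```

Because the loop repeats only while at least a quarter of each sample is expired, its running time adapts to the backlog of expired keys rather than to the total number of keys.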
How Does Redis Pick “Random” Keys?
This is where the design really shines. Redis’ hash tables support O(1) random sampling: internally, random_key picks a random bucket in the hash table and returns the first key it finds there. That means Redis can fetch random keys without scanning the entire dictionary or maintaining a separate index. Do this 20 times and you get 20 random TTL’d keys, fast.
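The trick can be illustrated with a toy chained hash table. This is a sketch of the concept behind random sampling from buckets, not Redis’ actual code, and the class and method names are made up:

```python
import random

class ToyDict:
    """Toy chained hash table with cheap random key sampling."""

    def __init__(self, nbuckets=16):
        self.buckets = [[] for _ in range(nbuckets)]

    def put(self, key, value):
        chain = self.buckets[hash(key) % len(self.buckets)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))

    def random_key(self):
        # Probe random buckets until a non-empty one turns up, then pick an
        # entry from its chain. Assumes the table is non-empty. The result is
        # only approximately uniform (long chains are slightly under-sampled),
        # which is good enough for expiration sampling.
        while True:
            chain = random.choice(self.buckets)
            if chain:
                return random.choice(chain)[0]
```

No scan of the whole dictionary, no auxiliary index: each sample costs a couple of random probes.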
Why Randomness Works
At first glance, this sounds too hand-wavy. What if Redis just keeps picking unexpired keys?
Here’s the probabilistic magic:
If a large portion of keys are expired (say 30–40%), the chance of picking an expired key in 20 samples is very high. When that happens, Redis immediately runs another round. It keeps looping until fewer than 25% of sampled keys are expired.
Statistically, the system self-regulates:
Redis automatically cleans more aggressively when there’s a backlog of expired keys, and relaxes when things are tidy. It’s a feedback loop powered by probability.
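Under a simple independence assumption, the chance that a round triggers another one is a binomial tail, which you can compute directly. This model is mine, not from the Redis source:

```python
from math import comb

def p_another_round(p, n=20, k=5):
    """Probability that at least k of n sampled keys are expired (>= 25% of 20),
    assuming each sampled key is independently expired with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))
```

With 40% of keys expired, the loop almost certainly runs again (probability around 0.95); at 10%, it almost certainly stops. That is exactly the self-regulating behavior described above.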
The Payoff
This design yields several powerful benefits:
| Aspect | Traditional Approach | Redis’ Approach |
|---|---|---|
| Write complexity | O(log n) | O(1) |
| Expiration precision | Exact | Approximate |
| Memory cleanup | Deterministic | Probabilistic but effective |
| Throughput | Moderate | Extremely high |
Tuning the Behavior
A few internal settings control this behavior:
| Config | Description | Default |
|---|---|---|
| `hz` | Frequency of background tasks | 10 (10 times/sec) |
| `ACTIVE_EXPIRE_CYCLE_LOOKUPS_PER_LOOP` | Keys sampled per round | 20 |
| `lazyfree-lazy-expire` | Whether deletion happens asynchronously | no |
These can be tuned for large deployments where millions of keys expire frequently, but defaults work beautifully for most workloads.
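Of these, `hz` and `lazyfree-lazy-expire` are regular redis.conf directives (the per-round sampling count is a compile-time constant in the Redis source). A sketch of what tuning might look like; the values shown are the defaults except for `lazyfree-lazy-expire`, which is flipped on as an example:

```
# redis.conf (sketch)
hz 10                      # run background tasks 10 times per second
lazyfree-lazy-expire yes   # free expired keys in a background thread
```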
A Lesson in Systems Design
Redis’ key expiration strategy is a case study in probabilistic system design: using randomness to balance performance and correctness.
Rather than maintaining a perfectly accurate expiration queue, Redis accepts approximate cleanup in exchange for blazing speed.
And because of statistical feedback, it still converges to the right state.
It’s a reminder that sometimes the smartest engineering move isn’t to be perfect; it’s to be good enough, fast.
Final Thought
Redis shows that randomness is not the enemy of determinism; when wielded carefully, it’s a tool for building fast, self-correcting systems.
So the next time you’re designing a cache or scheduler, ask yourself:
Do I really need a perfect global order… or can a bit of randomness do the job?
