Anti-Stampede Protection

Prevent multiple processes from hammering your database when a cache entry expires.

The Problem

When a popular cache entry expires, every concurrent request for that key misses the cache and runs the same query at the same time, which can overload the database.

Real impact:

  • 100 concurrent requests × 50ms query = 5 seconds of cumulative query time from a single expiry
  • Database connection pool exhausted
  • Cascading failures

The Solution

Only ONE request fetches from the database; the others wait for its result.

How It Works

Stampede protection uses a two-layer architecture:

Layer 1: Local Singleflight (in-memory)
  ├─ Coalesces concurrent requests within the SAME process
  ├─ Uses Promise coalescing — waiters share the same Promise
  └─ No Redis calls, zero latency overhead

Layer 2: Distributed Redis Lock (cross-process)
  ├─ Acquires lock via SET NX EX (atomic set-if-not-exists with TTL)
  ├─ Lock key format: _stampede:{cacheKey}
  ├─ Lock released via Lua script (only owner can release)
  └─ Prevents multiple instances from loading simultaneously
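
Layer 1 can be pictured with a minimal sketch of Promise coalescing. The class and method names here (`SingleFlight`, `run`) are illustrative assumptions, not the library's internals; the point is only that the flight is registered synchronously and every concurrent caller receives the same Promise.

```typescript
// Minimal in-process singleflight: concurrent callers for one key share one Promise.
// `SingleFlight` and `run` are hypothetical names used for illustration only.
class SingleFlight<T> {
  private readonly flights = new Map<string, Promise<T>>();

  run(key: string, loader: () => Promise<T>): Promise<T> {
    const existing = this.flights.get(key);
    if (existing) {
      return existing; // waiter: reuse the leader's Promise, no second load
    }

    // The loader is scheduled on the microtask queue, so the flight is stored
    // in the map before any loader code or cleanup handler can run.
    const flight = Promise.resolve()
      .then(loader)
      .then(
        (value) => {
          this.flights.delete(key); // allow a fresh load after success
          return value;
        },
        (err) => {
          this.flights.delete(key); // allow a retry after failure
          throw err; // waiters see the same error as the leader
        },
      );

    this.flights.set(key, flight);
    return flight;
  }
}
```

With this shape, 100 concurrent `run('popular-key', loader)` calls in one process invoke `loader` once, and a call made after the flight settles starts a new load.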

Step by step:

  1. Request arrives, cache miss detected in getOrSet()
  2. Check local flights — if another request is already loading this key, wait for its Promise
  3. Register new flight (synchronous, before any async work)
  4. Try to acquire distributed Redis lock (SET _stampede:{key} {value} EX {ttl} NX)
  5. Execute loader function with timeout
  6. Resolve all local waiters with the loaded value
  7. Release Redis lock (Lua script ensures only owner releases)
  8. Cache the result
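
Steps 4 and 7 follow the common "set if absent with TTL, then compare-and-delete" locking pattern. The sketch below is an assumption-laden illustration rather than the library's code: `LockStore`, `InMemoryLockStore`, `acquire` and `release` are hypothetical names, and the in-memory store stands in for Redis so the example runs on its own. With a real client, `setIfAbsent` would correspond to `SET key token EX ttl NX` and `deleteIfEquals` to an `EVAL` of the Lua script shown.

```typescript
// The two operations the lock needs (hypothetical interface for this sketch).
interface LockStore {
  setIfAbsent(key: string, token: string, ttlSeconds: number): Promise<boolean>;
  deleteIfEquals(key: string, token: string): Promise<boolean>;
}

// Compare-and-delete as Redis would run it atomically: only the holder's token
// can remove the key. Shown for reference; the stand-in below mimics it.
const RELEASE_SCRIPT = `
if redis.call("get", KEYS[1]) == ARGV[1] then
  return redis.call("del", KEYS[1])
else
  return 0
end`;

// In-memory stand-in for Redis so the sketch is runnable without a server.
class InMemoryLockStore implements LockStore {
  private readonly entries = new Map<string, { token: string; expiresAt: number }>();

  async setIfAbsent(key: string, token: string, ttlSeconds: number): Promise<boolean> {
    const current = this.entries.get(key);
    if (current && current.expiresAt > Date.now()) return false; // lock is held
    this.entries.set(key, { token, expiresAt: Date.now() + ttlSeconds * 1000 });
    return true;
  }

  async deleteIfEquals(key: string, token: string): Promise<boolean> {
    const current = this.entries.get(key);
    if (!current || current.token !== token) return false; // caller is not the owner
    this.entries.delete(key);
    return true;
  }
}

let tokenCounter = 0;

// Returns the owner token on success, or null when another process holds the lock.
async function acquire(store: LockStore, cacheKey: string, ttlSeconds: number): Promise<string | null> {
  tokenCounter += 1;
  const token = `${Date.now()}-${tokenCounter}-${Math.random().toString(36).slice(2)}`;
  const ok = await store.setIfAbsent(`_stampede:${cacheKey}`, token, ttlSeconds);
  return ok ? token : null;
}

// Releases the lock only if `token` still matches the stored value.
async function release(store: LockStore, cacheKey: string, token: string): Promise<boolean> {
  return store.deleteIfEquals(`_stampede:${cacheKey}`, token);
}
```

The owner token is what makes release safe: if the lock expired and another process acquired it in the meantime, the late release is a no-op instead of deleting someone else's lock.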

Waiting uses Promise.race() — no polling, no busy-waiting.
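
A waiter bounded by waitTimeout can be sketched as a race between the leader's Promise and a timer. The helper name `waitWithTimeout` and the `WaitTimeoutError` class are assumptions for this example; the library is documented to throw StampedeError in this situation.

```typescript
// Stand-in error type for this sketch; the library itself throws StampedeError.
class WaitTimeoutError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "WaitTimeoutError";
  }
}

// Resolves with the leader's value, or rejects once waitTimeoutMs has elapsed.
// The leader's Promise is not cancelled; only this waiter gives up.
function waitWithTimeout<T>(flight: Promise<T>, waitTimeoutMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new WaitTimeoutError(`waited longer than ${waitTimeoutMs}ms`)),
      waitTimeoutMs,
    );
  });

  const clear = (): void => {
    if (timer !== undefined) clearTimeout(timer);
  };

  // Whichever settles first wins; the timer is cleared in both outcomes.
  return Promise.race([flight, timeout]).then(
    (value) => {
      clear();
      return value;
    },
    (err) => {
      clear();
      throw err;
    },
  );
}
```

Because nothing polls, a waiter costs one pending Promise and one timer until the leader finishes.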

@Cached does NOT use stampede protection

The @Cached decorator uses separate get() and set() calls, so it has no stampede protection. Only getOrSet() (Service API) includes it. If you need stampede protection on a decorated method, call getOrSet() inside the service method instead.
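
As a rough illustration of that advice, a service method can wrap its load in getOrSet() rather than relying on a decorator. Everything except the getOrSet(key, loader, { ttl }) call shape is an assumption here: `CacheLike`, `UserService`, `User` and `FakeCache` are hypothetical names, and `FakeCache` is a stand-in so the example runs without the library.

```typescript
// Call shape used on this page; the real service type comes from the library.
interface CacheLike {
  getOrSet<T>(key: string, loader: () => Promise<T>, options: { ttl: number }): Promise<T>;
}

interface User {
  id: string;
  name: string;
}

class UserService {
  private readonly cache: CacheLike;
  private readonly fetchUser: (id: string) => Promise<User>;

  constructor(cache: CacheLike, fetchUser: (id: string) => Promise<User>) {
    this.cache = cache;
    this.fetchUser = fetchUser;
  }

  // No @Cached decorator: the load goes through getOrSet(), which is the
  // code path that carries stampede protection.
  getUser(id: string): Promise<User> {
    return this.cache.getOrSet(`user:${id}`, () => this.fetchUser(id), { ttl: 300 });
  }
}

// Stand-in cache for the sketch: remembers the first load per key and ignores TTL.
class FakeCache implements CacheLike {
  private readonly store = new Map<string, Promise<unknown>>();

  getOrSet<T>(key: string, loader: () => Promise<T>, _options: { ttl: number }): Promise<T> {
    let entry = this.store.get(key) as Promise<T> | undefined;
    if (!entry) {
      entry = loader();
      this.store.set(key, entry);
    }
    return entry;
  }
}
```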

Configuration

typescript
new CachePlugin({
  stampede: {
    enabled: true,        // Enable protection (default: true)
    lockTimeout: 5000,    // Loader execution timeout in ms (default: 5000)
    waitTimeout: 10000,   // Max time waiters wait for result in ms (default: 10000)
    fallback: 'load',     // Behavior when lock fails: 'load' | 'error' | 'null' (default: 'load')
  },
})

| Option | Default | Description |
| --- | --- | --- |
| enabled | true | Enable stampede protection globally |
| lockTimeout | 5000 | Max time for loader execution (ms). Also used as Redis lock TTL. |
| waitTimeout | 10000 | Max time a waiter will wait for the leader's result (ms). |
| fallback | 'load' | Behavior when Redis lock cannot be acquired: 'load' (execute loader anyway), 'error' (throw), 'null' (return null). |
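
The three fallback modes can be read as a small dispatch on the configured value. This is a sketch of the documented behavior only; the function name `onLockUnavailable` and the plain `Error` it throws are assumptions, not the library's implementation.

```typescript
type Fallback = 'load' | 'error' | 'null';

// What happens when the distributed lock cannot be acquired (hypothetical helper).
async function onLockUnavailable<T>(
  fallback: Fallback,
  loader: () => Promise<T>,
): Promise<T | null> {
  switch (fallback) {
    case 'load':
      // Run the loader anyway; duplicate loads across processes are possible.
      return loader();
    case 'error':
      // Fail fast instead of adding load. The error type here is a placeholder.
      throw new Error('stampede lock could not be acquired');
    case 'null':
      // Report "no value" and let the caller decide what to do.
      return null;
  }
}
```

'load' favors availability, while 'error' and 'null' favor protecting the database at the cost of the caller handling a missing value.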

Service API Usage

Stampede protection is automatic when using getOrSet:

typescript
// With getOrSet — stampede protected by default
const data = await this.cache.getOrSet(
  'popular-key',
  () => this.db.fetchData(),
  { ttl: 300 }
);

// Disable for a specific call
const user = await this.cache.getOrSet(
  'user-key',
  () => this.db.fetchUser(id),
  { ttl: 300, skipStampede: true }
);

Statistics

typescript
const stats = await this.cache.getStats();

/*
{
  stampedePrevented: 142,  // Total stampede events prevented
}
*/

The stampede protection service also tracks internal stats:

| Metric | Description |
| --- | --- |
| activeFlights | Currently in-flight loader executions |
| totalWaiters | Sum of all waiters across active flights |
| oldestFlight | Duration of the oldest in-flight request (ms) |
| prevented | Total stampede events prevented |
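
To make the first three metrics concrete, here is a sketch of how they could be derived from a map of in-flight loads. The `Flight` shape and `summarizeFlights` are hypothetical; the library's internal bookkeeping may differ.

```typescript
// Hypothetical record kept per in-flight load.
interface Flight {
  startedAt: number; // epoch ms when the leader started loading
  waiters: number;   // callers currently waiting on this flight
}

function summarizeFlights(flights: Map<string, Flight>, now: number) {
  let totalWaiters = 0;
  let oldestStart = now;

  flights.forEach((flight) => {
    totalWaiters += flight.waiters;
    if (flight.startedAt < oldestStart) oldestStart = flight.startedAt;
  });

  return {
    activeFlights: flights.size,     // loads currently running
    totalWaiters,                    // callers parked behind those loads
    oldestFlight: now - oldestStart, // ms; 0 when nothing is in flight
  };
}
```

A steadily growing oldestFlight together with a high totalWaiters would suggest a slow or stuck loader rather than ordinary traffic.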

Error Handling

| Scenario | Behavior |
| --- | --- |
| Loader throws | Error propagates to caller. Waiters also receive the error. Cache is not updated. |
| Loader exceeds lockTimeout | Throws StampedeError. Redis lock expires automatically. |
| Waiter exceeds waitTimeout | Throws StampedeError. Does not affect the leader. |
| Redis lock acquisition fails | Follows the fallback option. With the default 'load', the loader executes anyway and protection still works at process level via singleflight; 'error' throws and 'null' returns null. |
| Redis unavailable | Local singleflight still protects within the same process. |
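
A caller may want to treat timeouts differently from loader failures, for example by serving a default value when a wait times out. The sketch below assumes the thrown error can be recognized by `name === 'StampedeError'`; that check and the helper `loadOrDefault` are assumptions, and with the real library an `instanceof` check against its exported error class would be the more natural choice.

```typescript
// Assumption: stampede timeouts are identifiable by their error name.
function isStampedeError(err: unknown): boolean {
  return err instanceof Error && err.name === 'StampedeError';
}

// Runs `load`; on a stampede timeout returns `fallbackValue`, otherwise rethrows.
async function loadOrDefault<T>(load: () => Promise<T>, fallbackValue: T): Promise<T> {
  try {
    return await load();
  } catch (err) {
    if (isStampedeError(err)) {
      return fallbackValue; // timed out waiting: degrade instead of failing
    }
    throw err; // loader errors propagate unchanged
  }
}
```

In use, `load` would be a closure around the getOrSet() call, such as `() => this.cache.getOrSet('popular-key', () => this.db.fetchData(), { ttl: 300 })`.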

Debugging

Lock keys in Redis use the format _stampede:{cacheKey}. To inspect active locks:

bash
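# In redis-cli (development only: KEYS blocks the server while it walks every key)
KEYS _stampede:*
# On a production instance, iterate with SCAN and repeat with the returned cursor until it is 0
SCAN 0 MATCH _stampede:* COUNT 100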

If a lock is stuck (rare — TTL should auto-expire):

bash
# Check TTL
TTL _stampede:popular-key

# Force remove (use with caution)
DEL _stampede:popular-key

Comparison

| Scenario | Without Protection | With Protection |
| --- | --- | --- |
| 100 concurrent requests | 100 DB queries | 1 DB query |
| Database load | Spike | Stable |
| Response time (leader) | 50ms | 50ms |
| Response time (waiters) | 50ms each | ~60ms (shared wait) |

Released under the MIT License.