Operations Guide
This section is for SRE, DevOps, and on-call engineers managing NestJS RedisX in production.
Quick Navigation
| I need to... | Go to |
|---|---|
| Set up monitoring | Monitoring |
| Configure alerts | Alerting |
| Plan capacity | Capacity Planning |
| Handle incidents | Runbooks |
For Developers vs Ops
This guide focuses on operational concerns:
| Developers | Operations |
|---|---|
| How to use @Cached | How to monitor cache hit rate |
| How to configure locks | How to alert on lock contention |
| How to process streams | How to scale consumers |
Key Operational Metrics
Quick Health Check
yaml
# Is Redis healthy?
redis_up == 1
# Is cache working?
rate(redisx_cache_hits_total[5m]) > 0
# Are locks releasing?
rate(redisx_lock_releases_total[5m]) > 0
# Any rate limit issues?
rate(redisx_ratelimit_rejected_total[5m]) < 0.1 * rate(redisx_ratelimit_allowed_total[5m])Common Operational Tasks
Scale Workers
bash
# Kubernetes
kubectl scale deployment job-workers --replicas=10
# Docker Compose
docker-compose up -d --scale worker=10Clear Cache
bash
# All cache keys
redis-cli KEYS "cache:*" | xargs redis-cli DEL
# Specific pattern
redis-cli KEYS "cache:user:*" | xargs redis-cli DELCheck Queue Depth
bash
redis-cli XLEN jobs
redis-cli XLEN jobs:dlqOn-Call Checklist
When paged:
Check Redis connectivity
bashredis-cli pingCheck key metrics
- Cache hit rate
- Lock timeout rate
- Stream consumer lag
Check application logs
bashkubectl logs -l app=my-service --tail=100Check Grafana dashboard
- Look for spikes/dips at incident time
Follow runbook for specific alert
Next Steps
- Monitoring — Detailed metrics setup
- Alerting — Alert configuration
- Runbooks — Incident response procedures