KEDA (Event-Driven Autoscaling)
Scale based on external events with support for scale-to-zero.
KEDA (Kubernetes Event-driven Autoscaling) extends Kubernetes autoscaling beyond CPU and memory. It scales workloads based on external event sources such as queue depth, stream lag, cron schedules, and 70+ other triggers. Unlike the HPA in its default configuration, it can scale a workload all the way down to zero replicas.
How It Works
KEDA installs three components: the Operator (watches ScaledObject resources and creates HPAs), the Metrics Server (exposes external metrics to the HPA), and Admission Webhooks (validate configurations).
When you create a ScaledObject, KEDA polls your event source (e.g., a RabbitMQ queue) at the configured pollingInterval. If the queue stays empty for the cooldownPeriod and minReplicaCount is 0, KEDA scales the deployment to zero. When messages arrive, it activates the deployment and feeds the queue metric to the HPA, which performs the standard replica calculation.
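The two phases described above (activation from zero, then HPA-driven scaling) correspond to two thresholds on a trigger. The fragment below is a minimal sketch, assuming the RabbitMQ scaler's optional `activationValue` setting and a queue named `orders`; the numbers are illustrative, and connection and authentication settings are omitted.

```yaml
# Fragment of a ScaledObject spec (illustrative values, not a complete manifest).
triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      mode: QueueLength
      # Activation phase (0 <-> 1 replicas), decided by the KEDA operator:
      # the workload is activated once the queue length exceeds this value.
      activationValue: "0"
      # Scaling phase (1 <-> N replicas), decided by the HPA:
      # target of about 5 queued messages per replica.
      value: "5"
```

Raising `activationValue` is one way to keep a handful of stray messages from waking the workload, at the cost of those messages waiting longer.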
Use a ScaledObject for long-running deployments (web servers, consumers) and a ScaledJob for short-lived batch workloads.
When to Use
- Queue consumers (RabbitMQ, SQS, Kafka)
- Event-driven workloads
- You need scale-to-zero for cost savings
- Standard HPA metrics don't reflect actual load
- Cron-based pre-scaling before traffic windows
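For the cron-based pre-scaling case, a ScaledObject with a `cron` trigger might look like the sketch below. The resource names, time zone, schedule, and replica counts are all hypothetical and chosen only for illustration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: storefront-prescale          # hypothetical name
spec:
  scaleTargetRef:
    name: storefront                 # hypothetical Deployment
  minReplicaCount: 1
  maxReplicaCount: 20
  triggers:
    - type: cron
      metadata:
        timezone: Europe/London      # IANA time zone name
        start: "45 8 * * 1-5"        # raise the floor at 08:45, Monday to Friday
        end: "0 18 * * 1-5"          # release it at 18:00
        desiredReplicas: "10"        # replicas to hold during the window
```

A cron trigger can sit alongside a queue or metrics trigger in the same ScaledObject, in which case the workload follows whichever trigger asks for more replicas.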
When NOT to Use
- Simple CPU-based scaling is sufficient (HPA is simpler)
- Your app can't tolerate cold-start latency from zero
- No external event source that correlates with load
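For comparison, when CPU-based scaling is enough, a plain HorizontalPodAutoscaler covers it without installing KEDA. This is a minimal sketch; the Deployment name `web-frontend`, the replica bounds, and the 70% target are assumptions for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-frontend                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-frontend               # hypothetical Deployment
  minReplicas: 2                     # never below two replicas, so no cold start
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # percent of the pods' CPU requests
```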
Real-World Example
Order Processing Queue
An e-commerce platform uses RabbitMQ for order processing. During a flash sale, the queue spikes to 5,000 messages. KEDA polls every 10 seconds and scales from 1 to 50 consumer pods, clearing the backlog in 2 minutes. After the sale, the queue drains and KEDA scales to zero, saving compute costs until the next burst.
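The scenario does not state the per-replica target, so the settings below are an assumption: a target of 100 messages per replica is one value that would produce the 50-pod figure. With a queue-length trigger, the desired replica count is roughly the queue length divided by the target, rounded up and clamped to the configured bounds.

```yaml
# Hypothetical ScaledObject settings chosen to match the flash-sale numbers above.
pollingInterval: 10
minReplicaCount: 0
maxReplicaCount: 50
triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      mode: QueueLength
      value: "100"   # target: about 100 queued messages per replica
# Desired replicas ~= ceil(queue length / value), clamped to maxReplicaCount:
#   5,000 messages / 100 = 50 replicas
#   1,000 messages / 100 = 10 replicas
#       0 messages       -> zero replicas once cooldownPeriod has elapsed
```

The HPA's own scale-up policies limit how quickly the replica count can climb, so reaching the computed figure usually takes several evaluation cycles.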
Step-by-Step Implementation
1. Install KEDA
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace
# Verify
kubectl get pods -n keda
2. Deploy your consumer (start at 0 replicas)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-processor
spec:
  replicas: 0 # KEDA will manage this
  selector:
    matchLabels:
      app: order-processor
  template:
    metadata:
      labels:
        app: order-processor
    spec:
      containers:
        - name: processor
          image: myregistry/order-processor:2.1.0
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
3. Create the ScaledObject
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 10 # Check every 10 seconds
  cooldownPeriod: 60 # Wait 60s before scaling to zero
  minReplicaCount: 0 # Scale to zero when idle
  maxReplicaCount: 100
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        mode: QueueLength
        value: "5" # 1 replica per 5 messages
      authenticationRef:
        name: rabbitmq-trigger-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-credentials
      key: url
4. Verify
kubectl apply -f order-processor.yaml
kubectl apply -f scaled-object.yaml
kubectl get scaledobject order-processor-scaler
kubectl get hpa # KEDA creates this automatically
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Cold start from zero | High latency on first request | Set minReplicaCount: 1 for latency-sensitive workloads |
| Authentication failures | "error getting metrics" | Verify TriggerAuthentication secrets exist and are correct |
| Polling interval too long | Scaling reacts too slowly | Reduce pollingInterval to 5-10 seconds |
| Cooldown too short | Scale to zero then immediately back up | Increase cooldownPeriod to 120-300 seconds |
| KEDA and HPA conflict | Two HPAs for same deployment | Delete manually-created HPA; KEDA creates its own |
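Several of the fixes in the table can be combined in one manifest. The variant below is a sketch of the order-processor ScaledObject tuned for a latency-sensitive consumer; the specific values are illustrative starting points rather than recommendations.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 5       # check the queue every 5 seconds
  cooldownPeriod: 300      # takes effect only when minReplicaCount is 0
  minReplicaCount: 1       # keep one warm replica to avoid cold starts
  maxReplicaCount: 100
  triggers:
    - type: rabbitmq
      metadata:
        queueName: orders
        mode: QueueLength
        value: "5"
      authenticationRef:
        name: rabbitmq-trigger-auth
```

Keeping one replica warm trades the scale-to-zero savings for lower first-message latency. If cost matters more, setting `minReplicaCount` back to 0 and relying on the longer `cooldownPeriod` is the alternative.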