ScaleGuide — Kubernetes Autoscaling, Explained Visually.

KEDA (Event-Driven Autoscaling)

Scale based on external events with support for scale-to-zero.

KEDA (Kubernetes Event-driven Autoscaling) extends Kubernetes autoscaling beyond CPU and memory. It scales workloads based on external event sources such as queue depth, stream lag, and cron schedules, with 70+ triggers available. Unlike a standard HPA, which by default keeps at least one replica running, KEDA can scale a workload all the way to zero.

How It Works

KEDA installs three components: the Operator (watches ScaledObject and ScaledJob resources and creates the HPAs), the Metrics Server (exposes external metrics to the HPA), and the Admission Webhooks (validate configurations).

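When you create a ScaledObject, KEDA polls your event source (e.g., a RabbitMQ queue) at the configured pollingInterval. If minReplicaCount is 0 and the queue stays empty for the cooldownPeriod, KEDA scales the deployment to zero. When messages arrive, KEDA activates the deployment by scaling it from 0 to 1, then feeds the queue metric to the HPA, which performs the standard replica calculation from 1 up to maxReplicaCount.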
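
To make the hand-off to the HPA concrete, here is a rough sketch of the kind of HPA that KEDA generates for the order-processor example used later in this guide. You do not create this object yourself. The object name, metric name, and selector label follow KEDA's usual naming but are assumptions here and may differ between KEDA versions.

```yaml
# Illustrative only: approximate shape of the HPA that KEDA manages.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: keda-hpa-order-processor-scaler   # assumed naming: keda-hpa-<ScaledObject name>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-processor
  minReplicas: 1        # the HPA handles 1..N; KEDA itself handles the 0 <-> 1 step
  maxReplicas: 100      # taken from maxReplicaCount
  metrics:
  - type: External
    external:
      metric:
        name: s0-rabbitmq-orders          # assumed metric name, served by the KEDA Metrics Server
        selector:
          matchLabels:
            scaledobject.keda.sh/name: order-processor-scaler
      target:
        type: AverageValue
        averageValue: "5"                 # from the trigger's value field
        # desiredReplicas = ceil(queue length / averageValue)
        # e.g. ceil(5000 / 5) = 1000, which is then capped at maxReplicas (100)
```

The split of responsibilities is the point of the sketch: the HPA never sees zero replicas, so activation and deactivation stay with the KEDA Operator.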

Tip: KEDA provides two scaling CRDs: ScaledObject for long-running workloads (web servers, consumers) and ScaledJob for short-lived batch workloads.
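
For comparison with the ScaledObject used in the rest of this guide, a minimal ScaledJob sketch might look like the following. The job name `order-batch-job` and the history limits are hypothetical; the image, queue, and TriggerAuthentication are borrowed from the order-processor example and assume each Job run consumes messages and then exits.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: order-batch-job             # hypothetical name
spec:
  jobTargetRef:                     # a regular Kubernetes Job spec
    template:
      spec:
        containers:
        - name: processor
          image: myregistry/order-processor:2.1.0
        restartPolicy: Never
  pollingInterval: 10               # check the queue every 10 seconds
  maxReplicaCount: 50               # upper bound on Jobs running at once
  successfulJobsHistoryLimit: 5     # assumed retention values
  failedJobsHistoryLimit: 5
  triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      mode: QueueLength
      value: "5"
    authenticationRef:
      name: rabbitmq-trigger-auth
```

A ScaledJob starts new Jobs rather than resizing a Deployment, so each run is expected to finish on its own instead of being scaled down mid-task.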

When to Use

  • Queue consumers (RabbitMQ, SQS, Kafka)
  • Event-driven workloads
  • You need scale-to-zero for cost savings
  • Standard HPA metrics don't reflect actual load
  • Cron-based pre-scaling before traffic windows
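
As a sketch of the cron-based pre-scaling case, a ScaledObject can combine a cron trigger with a queue trigger. The timezone, schedule, and replica count below are assumptions chosen for illustration, and the scaler name `order-processor-prescale` is hypothetical.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-prescale    # hypothetical name
spec:
  scaleTargetRef:
    name: order-processor
  minReplicaCount: 0
  maxReplicaCount: 100
  triggers:
  - type: cron
    metadata:
      timezone: America/New_York    # assumed timezone
      start: "45 8 * * *"           # warm up at 08:45, before the traffic window
      end: "0 18 * * *"             # release the floor at 18:00
      desiredReplicas: "10"         # assumed replica floor during the window
  - type: rabbitmq                  # queue depth can still push replicas higher
    metadata:
      queueName: orders
      mode: QueueLength
      value: "5"
    authenticationRef:
      name: rabbitmq-trigger-auth
```

With several triggers on one ScaledObject, the HPA uses the largest replica count any of them asks for, so the cron trigger acts as a floor during the window rather than a cap.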

When NOT to Use

  • Simple CPU-based scaling is sufficient (HPA is simpler)
  • Your app can't tolerate cold-start latency from zero
  • No external event source that correlates with load

Real-World Example

Order Processing Queue

An e-commerce platform uses RabbitMQ for order processing. During a flash sale, the queue spikes to 5,000 messages. KEDA polls every 10 seconds and scales from 1 to 50 consumer pods, clearing the backlog in 2 minutes. After the sale, the queue drains and KEDA scales to zero, saving compute costs until the next burst.

Step-by-Step Implementation

1. Install KEDA

bash
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda \
  --namespace keda \
  --create-namespace

# Verify
kubectl get pods -n keda

2. Deploy your consumer (start at 0 replicas)

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-processor
spec:
  replicas: 0          # KEDA will manage this
  selector:
    matchLabels:
      app: order-processor
  template:
    metadata:
      labels:
        app: order-processor
    spec:
      containers:
      - name: processor
        image: myregistry/order-processor:2.1.0
        resources:
          requests:
            cpu: "100m"
            memory: "128Mi"

3. Create the ScaledObject

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 10           # Check every 10 seconds
  cooldownPeriod: 60            # Wait 60s before scaling to zero
  minReplicaCount: 0            # Scale to zero when idle
  maxReplicaCount: 100
  triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      mode: QueueLength
      value: "5"                # 1 replica per 5 messages
    authenticationRef:
      name: rabbitmq-trigger-auth
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: rabbitmq-trigger-auth
spec:
  secretTargetRef:
  - parameter: host
    name: rabbitmq-credentials
    key: url
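
The TriggerAuthentication above reads its connection string from a Secret named rabbitmq-credentials, which must exist in the same namespace. A minimal sketch of that Secret follows; the user, password, and host in the URL are placeholders, not values from this guide.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-credentials        # must match secretTargetRef.name
type: Opaque
stringData:
  # Placeholder AMQP connection string; substitute your own credentials and host.
  url: amqp://user:password@rabbitmq.default.svc.cluster.local:5672/
```

Using stringData lets you write the value in plain text and have Kubernetes base64-encode it on creation; avoid committing real credentials in this form.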

4. Verify

bash
kubectl apply -f order-processor.yaml
kubectl apply -f scaled-object.yaml
kubectl get scaledobject order-processor-scaler
kubectl get hpa  # KEDA creates this automatically

Common Pitfalls

| Pitfall | Symptom | Fix |
|---|---|---|
| Cold start from zero | High latency on first request | Set minReplicaCount: 1 for latency-sensitive workloads |
| Authentication failures | "error getting metrics" | Verify TriggerAuthentication secrets exist and are correct |
| Polling interval too long | Scaling reacts too slowly | Reduce pollingInterval to 5-10 seconds |
| Cooldown too short | Scale to zero then immediately back up | Increase cooldownPeriod to 120-300 seconds |
| KEDA and HPA conflict | Two HPAs for same deployment | Delete manually-created HPA; KEDA creates its own |
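
Several of these fixes are single-field changes to the ScaledObject. The sketch below applies them to the order-processor example; the specific values are picked from the ranges in the table and should be treated as starting points, not recommendations.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: order-processor-scaler
spec:
  scaleTargetRef:
    name: order-processor
  pollingInterval: 5        # poll more often so scaling reacts sooner
  cooldownPeriod: 300       # wait longer before the final step down to zero
  minReplicaCount: 0        # set to 1 for latency-sensitive workloads;
                            # cooldownPeriod then no longer comes into play
  maxReplicaCount: 100
  triggers:
  - type: rabbitmq
    metadata:
      queueName: orders
      mode: QueueLength
      value: "5"
    authenticationRef:
      name: rabbitmq-trigger-auth
```

Note that cooldownPeriod only governs the transition to zero replicas; scale-down between higher replica counts is decided by the HPA that KEDA manages.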