ScaleGuide — Kubernetes Autoscaling, Explained Visually.

Blue-Green Deployment

Two identical environments with instant traffic switch for zero-downtime deployments.

Blue-Green deployment maintains two identical production environments. At any time, only one (say Blue) serves live traffic. You deploy the new version to the idle environment (Green), verify it, then switch the load balancer to route all traffic to Green instantly.

How It Works

The key is having two complete, independent environments. A load balancer or DNS record sits in front and directs traffic to the active environment. Deployment happens in the inactive environment, so users are unaffected. Once the new version is validated, the switch is instant: typically a single load balancer rule change. (DNS-based switching is slower, because clients keep using cached records until they expire.)

Info: Rollback is equally instant: just switch traffic back to the old environment. This is why Blue-Green is favored in finance and healthcare, where rollback speed is critical.
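The active/idle relationship and the single-rule switch described above can be sketched as two small shell helpers. This is a minimal illustration, not a definitive implementation: the function names `idle_env` and `switch_patch` are hypothetical, and the Service names `app-blue` / `app-green` and the Ingress name `myapp-ingress` are assumed to match the implementation steps later on this page.

```shell
# Sketch only: idle_env and switch_patch are hypothetical helper names.

# Print the idle environment for a given active one.
idle_env() {
  case "$1" in
    blue)  echo green ;;
    green) echo blue ;;
    *)     echo "unknown environment: $1" >&2; return 1 ;;
  esac
}

# Print a JSON Patch that points the first Ingress path at app-<env>.
# Assumes the Services are named app-blue and app-green.
switch_patch() {
  printf '[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "app-%s"}]\n' "$1"
}

# Example (needs cluster access, so it is left commented out):
# kubectl patch ingress myapp-ingress --type=json -p "$(switch_patch "$(idle_env blue)")"
```

Because rollback is the same operation in the opposite direction, the same two helpers cover both the cut-over and the rollback.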

When to Use

  • Zero-downtime deployments are mandatory
  • Instant rollback is a regulatory or business requirement
  • You can afford 2x infrastructure cost during deployment
  • Database schema is backward-compatible across versions

When NOT to Use

  • Database schema changes that break backward compatibility
  • Budget constraints prevent running two full environments
  • Very frequent deployments (multiple per hour)
  • Stateful apps with in-memory sessions that cannot be shared

Real-World Examples

Capital One - Banking API

Capital One uses Blue-Green for its customer-facing banking API, where regulatory requirements demand instant rollback capability. Each deployment is validated in the Green environment with synthetic transactions before switching. Rollback has been exercised in under 30 seconds.

Transport for London - Fare Engine

TfL updates its fare calculation engine using Blue-Green to avoid disruption during peak commuting hours. The Green environment is validated with millions of fare calculations before the traffic switch, because inconsistent fares would erode public trust.

Step-by-Step Implementation

1. Define two Services for Blue and Green

yaml
apiVersion: v1
kind: Service
metadata:
  name: app-blue
spec:
  selector:
    app: myapp
    version: blue
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: app-green
spec:
  selector:
    app: myapp
    version: green
  ports:
  - port: 80
    targetPort: 8080

2. Deploy new version to Green

yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
      - name: myapp
        image: myregistry/myapp:2.0.0
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
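
Before switching traffic, it helps to confirm that the Green rollout has actually completed. The following is a minimal sketch, assuming the Deployment name `myapp-green` from the manifest above; the wrapper name `wait_for_green` and the 120s default timeout are illustrative choices, not part of the original setup.

```shell
# Sketch only: wait_for_green is a hypothetical wrapper around
# `kubectl rollout status`, which blocks until the rollout finishes
# or the timeout expires.
wait_for_green() {
  local timeout="${1:-120s}"   # assumed default; tune for your app
  if kubectl rollout status deployment/myapp-green --timeout="$timeout"; then
    echo "green is ready"
  else
    echo "green did not become ready within $timeout; leaving traffic on blue" >&2
    return 1
  fi
}

# Example (needs cluster access, so it is left commented out):
# wait_for_green 60s
```

Because the rollout only counts a pod as available once its readiness probe passes, this check depends on the `/healthz` probe reflecting real application health.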

3. Switch traffic via Ingress

yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app-green  # Switch from app-blue to app-green
            port:
              number: 80
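
After applying the Ingress, it can be useful to confirm which environment is actually receiving traffic. The sketch below reads the backend Service name straight from the Ingress object. The helper names `active_backend` and `active_color` are hypothetical, and the `app-` prefix is assumed from the Service names defined in step 1.

```shell
# Sketch only: helper names are hypothetical.

# Print the Service name the Ingress currently routes to.
active_backend() {
  kubectl get ingress "${1:-myapp-ingress}" \
    -o jsonpath='{.spec.rules[0].http.paths[0].backend.service.name}'
}

# Strip the assumed "app-" prefix, leaving "blue" or "green".
active_color() {
  local svc
  svc="$(active_backend "$@")" || return 1
  echo "${svc#app-}"
}

# Example (needs cluster access, so it is left commented out):
# echo "live environment: $(active_color)"
```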

4. Verify and roll back if needed

bash
# Verify Green pods are Running and Ready (match both labels used by the Deployment)
kubectl get pods -l app=myapp,version=green

# If issues arise, switch back to Blue
kubectl patch ingress myapp-ingress --type='json' \
  -p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "app-blue"}]'

Common Pitfalls

| Pitfall | Symptom | Fix |
| --- | --- | --- |
| Database schema divergence | Rollback fails because Green schema is incompatible with Blue code | Use expand-and-contract migrations; keep schemas backward-compatible |
| Session state loss | Users logged out after switch | Use external session store (Redis) shared between environments |
| DNS propagation delay | Some users still hitting old environment | Use load balancer switching instead of DNS; or set low TTL |
| Forgetting to warm up Green | High latency immediately after switch | Run load tests against Green before switching traffic |
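
For the warm-up pitfall, a very small sequential request loop can at least prime caches and catch obvious failures before the switch. This is a sketch under stated assumptions, not a substitute for a real load test: the function name `warm_up` and the default of 20 requests are illustrative, and the URL is whatever address reaches Green in your setup (for example, a local port opened with `kubectl port-forward service/app-green 8081:80`).

```shell
# Sketch only: warm_up is a hypothetical helper.
# Usage: warm_up URL [COUNT]
# Sends COUNT GET requests one after another and stops at the first
# response whose HTTP status is not 200.
warm_up() {
  local url="$1" count="${2:-20}" i code
  for i in $(seq 1 "$count"); do
    # -s: silent, -o /dev/null: discard body, -w: print only the status code
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url")"
    if [ "$code" != "200" ]; then
      echo "request $i returned $code" >&2
      return 1
    fi
  done
  echo "$count requests OK"
}

# Example (needs a reachable Green endpoint, so it is left commented out):
# warm_up http://localhost:8081/healthz 50
```

Hitting `/healthz` only exercises the health endpoint; pointing the loop at representative application paths gives a more realistic warm-up.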