ScaleGuide — Kubernetes Autoscaling, Explained Visually.

Vertical Pod Autoscaler (VPA)

Automatically adjust CPU and memory requests for containers.

The Vertical Pod Autoscaler (VPA) automatically adjusts CPU and memory resource requests for containers based on historical and real-time usage. It makes pods "bigger" or "smaller" instead of adding more replicas.

How It Works

VPA consists of three components: the Recommender (monitors usage and computes optimal values), the Updater (evicts pods when recommendations diverge significantly), and the Admission Controller (mutates pod specs on creation with recommended values).

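Update modes:

  • Off: computes recommendations but never applies them
  • Initial: applies recommendations only when a pod is created
  • Recreate: evicts running pods so they are recreated with the recommended values
  • Auto: currently behaves the same as Recreate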

Warning: VPA may restart pods to apply new resource values. Ensure your application handles restarts gracefully. Never run VPA and HPA on the same CPU/memory metric.
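
One way to limit how disruptive those restarts are is a PodDisruptionBudget, since the Updater removes pods through eviction. The sketch below only writes a manifest to a local file. The label `app: postgres`, the names `postgres-pdb` and `postgres-pdb.yaml`, and the premise that the workload runs at least two replicas are all assumptions made for illustration, not details from this guide.

```shell
# Write a PodDisruptionBudget manifest to a local file.
# Assumptions (not from the guide): pods are labelled app=postgres and
# the workload runs at least two replicas, so one can always stay up.
cat > postgres-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: postgres
EOF

# Apply it once the file looks right (requires access to a cluster):
# kubectl apply -f postgres-pdb.yaml
```

With a single replica, `minAvailable: 1` would block every eviction, so VPA could not apply new values in Recreate or Auto mode.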

When to Use

  • Databases, caches, stateful singletons
  • Workloads whose correct resource requests are unknown
  • Right-sizing over-provisioned pods to save costs
  • Workloads with varying resource needs over time

When NOT to Use

  • Stateless apps that can scale horizontally (use HPA)
  • Disruption-intolerant workloads
  • Workloads already scaled by HPA on CPU or memory (the two autoscalers conflict)

Real-World Example

PostgreSQL Batch Processing

A PostgreSQL pod needs 500m CPU during the day for OLTP queries but 2 CPU at night for ETL batch jobs. VPA adjusts resource requests based on observed usage, so the pod does not have to be provisioned for the 2 CPU peak around the clock. Requesting 500m instead of 2 CPU during the day frees 75% of the daytime CPU request.
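
The 75% figure comes from comparing the two CPU requests in millicores. A minimal sketch of that arithmetic, where `saving_percent` is a hypothetical helper name and the inputs are the values from the example:

```shell
# saving_percent PEAK_MILLICORES OFFPEAK_MILLICORES
# Prints the percentage of the peak request that is freed when the
# request drops to the off-peak value (integer arithmetic).
saving_percent() {
  peak=$1
  offpeak=$2
  echo $(( (peak - offpeak) * 100 / peak ))
}

# 2 CPU = 2000m at night, 500m during the day
saving_percent 2000 500   # prints 75
```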

Step-by-Step Implementation

1. Install VPA

bash
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

# Verify
kubectl get pods -n kube-system | grep vpa
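
Beyond listing pods, it can help to wait until each component has finished rolling out. The helper below is a sketch: `vpa_wait_ready` is a hypothetical name, and the three deployment names are assumed to match what `vpa-up.sh` creates, so adjust them if your install differs.

```shell
# vpa_wait_ready: wait for the three VPA deployments to finish rolling out.
# The deployment names are an assumption about what vpa-up.sh creates.
vpa_wait_ready() {
  for component in vpa-recommender vpa-updater vpa-admission-controller; do
    kubectl -n kube-system rollout status "deployment/${component}" --timeout=120s || return 1
  done
  # The CRD must exist before any VerticalPodAutoscaler object can be created.
  kubectl get crd verticalpodautoscalers.autoscaling.k8s.io
}

# Usage (requires access to a cluster):
# vpa_wait_ready
```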

2. Start with recommendation-only mode

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: postgres
  updatePolicy:
    updateMode: "Off"   # Recommendation only
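
In "Off" mode the recommendation is only written to the VPA object's status, so it has to be read from there. A minimal sketch, assuming the VPA from this step is named `postgres-vpa` and has a single container. `vpa_target` is a hypothetical helper name, and the output format depends on your kubectl version.

```shell
# vpa_target NAME [NAMESPACE]
# Prints the recommended target for the first container of a VPA object,
# read from its status. With updateMode "Off" nothing is applied to pods.
vpa_target() {
  name=$1
  namespace=${2:-default}
  kubectl -n "$namespace" get vpa "$name" \
    -o jsonpath='{.status.recommendation.containerRecommendations[0].target}'
}

# Usage (requires a cluster with the VPA from this step applied):
# vpa_target postgres-vpa
```

The status is empty until the Recommender has collected some usage data, so an empty result shortly after creation is expected.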

3. Enable auto-updates with bounds

yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: postgres
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: postgres
      minAllowed:
        cpu: "250m"
        memory: "512Mi"
      maxAllowed:
        cpu: "4"
        memory: "8Gi"
      controlledResources: ["cpu", "memory"]
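
The `minAllowed` and `maxAllowed` bounds act as a clamp on whatever the Recommender computes. The sketch below illustrates that idea for CPU in millicores. `clamp_millicores` is a hypothetical helper, not part of VPA, and the input values are made up for illustration.

```shell
# clamp_millicores RECOMMENDED MIN MAX
# Prints RECOMMENDED limited to the range [MIN, MAX], all in millicores.
clamp_millicores() {
  value=$1
  min=$2
  max=$3
  if [ "$value" -lt "$min" ]; then
    value=$min
  elif [ "$value" -gt "$max" ]; then
    value=$max
  fi
  echo "$value"
}

# Bounds from the manifest above: minAllowed 250m, maxAllowed 4 CPU (4000m)
clamp_millicores 100 250 4000    # prints 250
clamp_millicores 1200 250 4000   # prints 1200
clamp_millicores 6000 250 4000   # prints 4000
```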

Common Pitfalls

Pitfall | Symptom | Fix
Pod eviction disruptions | Pods restart unexpectedly | Use "Off" mode initially; set PodDisruptionBudget
Conflicting with HPA | Oscillating behavior | Never run both on CPU/memory for the same deployment
No minAllowed set | VPA recommends tiny resources | Always set minAllowed in resourcePolicy
Slow to converge | Inaccurate recommendations | Let VPA observe for 24-48 hours before enabling Auto