Official Docs Summary
Condensed PostgreSQL official docs: essential data types, critical configuration, backup/restore, and replication.
The official PostgreSQL documentation is comprehensive but dense (3000+ pages). This section condenses the most important topics: data types, configuration, backup/restore, and replication.
Essential Data Types
| Type | Use Case | Example |
|---|---|---|
| SERIAL / BIGSERIAL | Auto-incrementing IDs | id SERIAL PRIMARY KEY |
| UUID | Distributed unique IDs | gen_random_uuid() |
| VARCHAR(n) / TEXT | Variable-length strings | name VARCHAR(100) |
| NUMERIC(p,s) | Exact decimal (money, finance) | price NUMERIC(10,2) |
| TIMESTAMPTZ | Date/time with timezone | created_at TIMESTAMPTZ DEFAULT NOW() |
| JSONB | Semi-structured data with indexing | metadata JSONB DEFAULT '{}' |
| ARRAY | Lists of values in a single column | tags TEXT[] |
| BOOLEAN | True/false flags | active BOOLEAN DEFAULT true |
| INET / CIDR | IP addresses and networks | client_ip INET |
TIMESTAMPTZ (with timezone) instead of TIMESTAMP for all time data. PostgreSQL stores both as UTC internally, but TIMESTAMPTZ correctly converts on input/output.Critical Configuration Parameters
# postgresql.conf - Key settings
# Memory
shared_buffers = '256MB' # 25% of available RAM
work_mem = '64MB' # Per-operation sort/hash memory
effective_cache_size = '768MB' # OS cache estimate (50-75% of RAM)
# WAL & Checkpoints
wal_level = 'replica' # Required for replication
max_wal_size = '1GB'
checkpoint_completion_target = 0.9
# Connections
max_connections = 200 # Coordinate with pooler
listen_addresses = '*' # Bind to all interfaces
# Logging
log_min_duration_statement = 1000 # Log queries > 1 second
log_statement = 'ddl' # Log DDL statementsshared_buffers too high. Going above 40% of RAM can hurt performance because it competes with the OS page cache. Start at 25% and benchmark.Backup & Restore
Logical Backup (pg_dump)
# Backup a single database
pg_dump -U postgres -Fc mydb > mydb.dump
# Restore from custom format
pg_restore -U postgres -d mydb mydb.dump
# Restore specific table
pg_restore -U postgres -d mydb -t users mydb.dumpPhysical Backup (pg_basebackup)
# Full physical backup for point-in-time recovery
pg_basebackup -U replicator -D /backup/base -Ft -z -PReplication
Streaming Replication
Physical byte-for-byte replication. The replica is an exact copy of the primary. Best for read replicas and high availability.
# On primary (postgresql.conf)
wal_level = 'replica'
max_wal_senders = 3
wal_keep_size = '1GB'
# On replica
primary_conninfo = 'host=primary-host user=replicator password=secret'Logical Replication
Table-level replication that decodes WAL into logical changes. Allows replicating specific tables and cross-version replication.
-- On publisher
CREATE PUBLICATION my_pub FOR TABLE users, orders;
-- On subscriber
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=publisher-host dbname=mydb user=replicator'
PUBLICATION my_pub;| Pitfall | Symptom | Fix |
|---|---|---|
| No monitoring on replica lag | Stale reads, eventual consistency surprises | Monitor pg_stat_replication.replay_lag |
| No backup testing | Backup is corrupt when you need it | Regularly restore backups to a test environment |
| shared_buffers too large | Performance degrades under memory pressure | Keep at 25% of RAM, let OS cache handle the rest |
| Missing indexes | Seq scans on large tables, slow queries | Use EXPLAIN ANALYZE and pg_stat_user_tables to find missing indexes |
Real-World Stories
GitLab - The Accidental DELETE That Took Down Production
On January 31, 2017, a GitLab engineer ran rm -rf on a PostgreSQL data directory on the wrong server -- the primary instead of a replica. Of their 5 backup strategies, only one (LVM snapshots) worked, and it was 6 hours old. pg_dump backups were failing silently due to a configuration error. Logical replication was set up but not tested. GitLab lost 6 hours of production data. The incident led to their famous "postmortem live stream" and a complete overhaul of backup verification.
Crunchy Data - Automated PITR Saves a Healthcare Company
A healthcare SaaS company accidentally ran an UPDATE without a WHERE clause, corrupting 2 million patient records. Using Crunchy Data's PostgreSQL setup with continuous WAL archiving, they performed point-in-time recovery (PITR) to exactly 30 seconds before the bad UPDATE. Full recovery took 22 minutes with zero data loss. The key was having archive_mode = on and archive_command shipping WAL segments to S3 every 60 seconds.
Zalando - Patroni Cluster Survives AWS AZ Failure
When an entire AWS availability zone went down in 2019, Zalando's PostgreSQL clusters (managed by Patroni) automatically promoted replicas in healthy AZs within 10 seconds. Their e-commerce platform served 35 million customers with no visible impact. The setup used synchronous replication across AZs with Patroni handling automatic failover, proving that PostgreSQL can match the high-availability guarantees of proprietary databases.