Official Docs Summary

Condensed PostgreSQL official docs: essential data types, critical configuration, backup/restore, and replication.

The official PostgreSQL documentation is comprehensive but dense (3000+ pages). This section condenses the most important topics: data types, configuration, backup/restore, and replication.

Essential Data Types

Type	Use Case	Example
SERIAL / BIGSERIAL	Auto-incrementing IDs	id SERIAL PRIMARY KEY
UUID	Distributed unique IDs	gen_random_uuid()
VARCHAR(n) / TEXT	Variable-length strings	name VARCHAR(100)
NUMERIC(p,s)	Exact decimal (money, finance)	price NUMERIC(10,2)
TIMESTAMPTZ	Date/time with timezone	created_at TIMESTAMPTZ DEFAULT NOW()
JSONB	Semi-structured data with indexing	metadata JSONB DEFAULT '{}'
ARRAY	Lists of values in a single column	tags TEXT[]
BOOLEAN	True/false flags	active BOOLEAN DEFAULT true
INET / CIDR	IP addresses and networks	client_ip INET

TipUse TIMESTAMPTZ (with timezone) instead of TIMESTAMP for all time data. PostgreSQL stores both as UTC internally, but TIMESTAMPTZ correctly converts on input/output.

Critical Configuration Parameters

text

# postgresql.conf - Key settings

# Memory
shared_buffers = '256MB'          # 25% of available RAM
work_mem = '64MB'                 # Per-operation sort/hash memory
effective_cache_size = '768MB'    # OS cache estimate (50-75% of RAM)

# WAL & Checkpoints
wal_level = 'replica'             # Required for replication
max_wal_size = '1GB'
checkpoint_completion_target = 0.9

# Connections
max_connections = 200             # Coordinate with pooler
listen_addresses = '*'            # Bind to all interfaces

# Logging
log_min_duration_statement = 1000 # Log queries > 1 second
log_statement = 'ddl'             # Log DDL statements

WarningDo not set shared_buffers too high. Going above 40% of RAM can hurt performance because it competes with the OS page cache. Start at 25% and benchmark.

Backup & Restore

Logical Backup (pg_dump)

bash

# Backup a single database
pg_dump -U postgres -Fc mydb > mydb.dump

# Restore from custom format
pg_restore -U postgres -d mydb mydb.dump

# Restore specific table
pg_restore -U postgres -d mydb -t users mydb.dump

Physical Backup (pg_basebackup)

bash

# Full physical backup for point-in-time recovery
pg_basebackup -U replicator -D /backup/base -Ft -z -P

Replication

Streaming Replication

Physical byte-for-byte replication. The replica is an exact copy of the primary. Best for read replicas and high availability.

text

# On primary (postgresql.conf)
wal_level = 'replica'
max_wal_senders = 3
wal_keep_size = '1GB'

# On replica
primary_conninfo = 'host=primary-host user=replicator password=secret'

Logical Replication

Table-level replication that decodes WAL into logical changes. Allows replicating specific tables and cross-version replication.

sql

-- On publisher
CREATE PUBLICATION my_pub FOR TABLE users, orders;

-- On subscriber
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=publisher-host dbname=mydb user=replicator'
  PUBLICATION my_pub;

Pitfall	Symptom	Fix
No monitoring on replica lag	Stale reads, eventual consistency surprises	Monitor pg_stat_replication.replay_lag
No backup testing	Backup is corrupt when you need it	Regularly restore backups to a test environment
shared_buffers too large	Performance degrades under memory pressure	Keep at 25% of RAM, let OS cache handle the rest
Missing indexes	Seq scans on large tables, slow queries	Use EXPLAIN ANALYZE and pg_stat_user_tables to find missing indexes

Real-World Stories

Failure

GitLab - The Accidental DELETE That Took Down Production

On January 31, 2017, a GitLab engineer ran rm -rf on a PostgreSQL data directory on the wrong server -- the primary instead of a replica. Of their 5 backup strategies, only one (LVM snapshots) worked, and it was 6 hours old. pg_dump backups were failing silently due to a configuration error. Logical replication was set up but not tested. GitLab lost 6 hours of production data. The incident led to their famous "postmortem live stream" and a complete overhaul of backup verification.

Success

Crunchy Data - Automated PITR Saves a Healthcare Company

A healthcare SaaS company accidentally ran an UPDATE without a WHERE clause, corrupting 2 million patient records. Using Crunchy Data's PostgreSQL setup with continuous WAL archiving, they performed point-in-time recovery (PITR) to exactly 30 seconds before the bad UPDATE. Full recovery took 22 minutes with zero data loss. The key was having archive_mode = on and archive_command shipping WAL segments to S3 every 60 seconds.

Success

Zalando - Patroni Cluster Survives AWS AZ Failure

When an entire AWS availability zone went down in 2019, Zalando's PostgreSQL clusters (managed by Patroni) automatically promoted replicas in healthy AZs within 10 seconds. Their e-commerce platform served 35 million customers with no visible impact. The setup used synchronous replication across AZs with Patroni handling automatic failover, proving that PostgreSQL can match the high-availability guarantees of proprietary databases.

Official Docs Summary

Condensed PostgreSQL official docs: essential data types, critical configuration, backup/restore, and replication.

The official PostgreSQL documentation is comprehensive but dense (3000+ pages). This section condenses the most important topics: data types, configuration, backup/restore, and replication.

Essential Data Types

Type	Use Case	Example
SERIAL / BIGSERIAL	Auto-incrementing IDs	id SERIAL PRIMARY KEY
UUID	Distributed unique IDs	gen_random_uuid()
VARCHAR(n) / TEXT	Variable-length strings	name VARCHAR(100)
NUMERIC(p,s)	Exact decimal (money, finance)	price NUMERIC(10,2)
TIMESTAMPTZ	Date/time with timezone	created_at TIMESTAMPTZ DEFAULT NOW()
JSONB	Semi-structured data with indexing	metadata JSONB DEFAULT '{}'
ARRAY	Lists of values in a single column	tags TEXT[]
BOOLEAN	True/false flags	active BOOLEAN DEFAULT true
INET / CIDR	IP addresses and networks	client_ip INET

TipUse TIMESTAMPTZ (with timezone) instead of TIMESTAMP for all time data. PostgreSQL stores both as UTC internally, but TIMESTAMPTZ correctly converts on input/output.

Critical Configuration Parameters

text

# postgresql.conf - Key settings

# Memory
shared_buffers = '256MB'          # 25% of available RAM
work_mem = '64MB'                 # Per-operation sort/hash memory
effective_cache_size = '768MB'    # OS cache estimate (50-75% of RAM)

# WAL & Checkpoints
wal_level = 'replica'             # Required for replication
max_wal_size = '1GB'
checkpoint_completion_target = 0.9

# Connections
max_connections = 200             # Coordinate with pooler
listen_addresses = '*'            # Bind to all interfaces

# Logging
log_min_duration_statement = 1000 # Log queries > 1 second
log_statement = 'ddl'             # Log DDL statements

WarningDo not set shared_buffers too high. Going above 40% of RAM can hurt performance because it competes with the OS page cache. Start at 25% and benchmark.

Backup & Restore

Logical Backup (pg_dump)

bash

# Backup a single database
pg_dump -U postgres -Fc mydb > mydb.dump

# Restore from custom format
pg_restore -U postgres -d mydb mydb.dump

# Restore specific table
pg_restore -U postgres -d mydb -t users mydb.dump

Physical Backup (pg_basebackup)

bash

# Full physical backup for point-in-time recovery
pg_basebackup -U replicator -D /backup/base -Ft -z -P

Replication

Streaming Replication

Physical byte-for-byte replication. The replica is an exact copy of the primary. Best for read replicas and high availability.

text

# On primary (postgresql.conf)
wal_level = 'replica'
max_wal_senders = 3
wal_keep_size = '1GB'

# On replica
primary_conninfo = 'host=primary-host user=replicator password=secret'

Logical Replication

Table-level replication that decodes WAL into logical changes. Allows replicating specific tables and cross-version replication.

sql

-- On publisher
CREATE PUBLICATION my_pub FOR TABLE users, orders;

-- On subscriber
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=publisher-host dbname=mydb user=replicator'
  PUBLICATION my_pub;

Pitfall	Symptom	Fix
No monitoring on replica lag	Stale reads, eventual consistency surprises	Monitor pg_stat_replication.replay_lag
No backup testing	Backup is corrupt when you need it	Regularly restore backups to a test environment
shared_buffers too large	Performance degrades under memory pressure	Keep at 25% of RAM, let OS cache handle the rest
Missing indexes	Seq scans on large tables, slow queries	Use EXPLAIN ANALYZE and pg_stat_user_tables to find missing indexes

Real-World Stories

Failure

GitLab - The Accidental DELETE That Took Down Production

Success

Crunchy Data - Automated PITR Saves a Healthcare Company

Success