
Automated PostgreSQL Backups and Point-in-Time Recovery

Introduction

PostgreSQL Point-in-Time Recovery (PITR) enables restoring databases to any specific moment using base backups and WAL archives.

Prerequisites

- PostgreSQL 12 or newer (the recovery steps below rely on recovery.signal, introduced in version 12)
- Root or sudo access on the database server, and the postgres OS user
- Enough disk space for base backups and WAL archives, ideally on a separate volume or host

Step 1: Configure WAL Archiving

Edit postgresql.conf:

# Enable WAL archiving
wal_level = replica
archive_mode = on
# Refuse to overwrite an already-archived segment (as the PostgreSQL docs recommend)
archive_command = 'test ! -f /backup/wal_archive/%f && cp %p /backup/wal_archive/%f'
archive_timeout = 300  # Force a segment switch at least every 5 minutes

# Optional: increase checkpoint frequency
checkpoint_timeout = 5min
checkpoint_completion_target = 0.7

Restart PostgreSQL after these changes; enabling archive_mode requires a full restart, not just a reload.
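Once the server is back up, a quick way to confirm that archiving actually works is to force a segment switch and check the archiver statistics (an illustrative psql session; the columns come from the pg_stat_archiver view):

```
$ sudo -u postgres psql
postgres=# SELECT pg_switch_wal();
postgres=# SELECT archived_count, last_archived_wal, last_failed_wal
           FROM pg_stat_archiver;
```

A non-null last_failed_wal means archive_command is failing, and WAL will accumulate in pg_wal until it succeeds.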

Step 2: Create Backup Directory

sudo mkdir -p /backup/{base_backups,wal_archive}
sudo chown postgres:postgres /backup/{base_backups,wal_archive}

Step 3: Base Backup Script

Create /scripts/pg_backup.sh:

#!/bin/bash
set -e

BACKUP_DIR="/backup/base_backups"
DATE=$(date +%Y%m%d_%H%M%S)
BACKUP_NAME="backup_$DATE"
RETENTION_DAYS=7

# Create base backup (tar format, gzip-compressed, with progress reporting).
# Run as the postgres OS user, or supply credentials via ~/.pgpass, so the
# command never blocks on a password prompt under cron (avoid -W here).
pg_basebackup -D "$BACKUP_DIR/$BACKUP_NAME" -Ft -z -P

# Create backup info file
cat > "$BACKUP_DIR/$BACKUP_NAME/backup_info.txt" << EOF
Backup Date: $(date)
Backup Type: Base backup with WAL
Retention: $RETENTION_DAYS days
EOF

# Cleanup old backups
find "$BACKUP_DIR" -maxdepth 1 -type d -name "backup_*" -mtime +$RETENTION_DAYS -exec rm -rf {} \;

echo "Backup completed: $BACKUP_NAME"

Step 4: Automated Backup with Cron

# Add to postgres user crontab
sudo -u postgres crontab -e

# Daily backup at 2 AM
0 2 * * * /scripts/pg_backup.sh >> /var/log/pg_backup.log 2>&1
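Before trusting the schedule, run the script once by hand and then keep an eye on the log. A small sketch of a log check, assuming the log path from the crontab entry above and the "Backup completed" marker that pg_backup.sh prints:

```shell
#!/bin/bash
# Hedged sketch: report whether the backup log's recent lines contain the
# success marker printed by pg_backup.sh ("Backup completed: ...").
# The log path defaults to the one used in the crontab entry above.

check_backup_log() {
    local log_file="$1"
    if [ -f "$log_file" ] && tail -n 20 "$log_file" | grep -q "Backup completed"; then
        echo "OK"
    else
        echo "MISSING"
    fi
}

check_backup_log "${1:-/var/log/pg_backup.log}"
```

Wiring this into the monitoring script from Step 8, or into an existing alerting system, catches the common failure mode where cron runs but the backup itself errors out.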

Step 5: Point-in-Time Recovery

Recovery script /scripts/pg_restore.sh:

#!/bin/bash
set -e

BACKUP_DIR="/backup/base_backups"
WAL_ARCHIVE="/backup/wal_archive"
RECOVERY_TARGET_TIME="$1"  # Format: 'YYYY-MM-DD HH:MM:SS'
PGDATA_NEW="/var/lib/postgresql/data_recovery"

if [ -z "$RECOVERY_TARGET_TIME" ]; then
    echo "Usage: $0 'YYYY-MM-DD HH:MM:SS'"
    exit 1
fi

# Use the most recent base backup (a production script would instead pick
# the newest backup taken *before* the target time)
LATEST_BACKUP=$(find "$BACKUP_DIR" -maxdepth 1 -type d -name "backup_*" | sort | tail -n 1)

if [ -z "$LATEST_BACKUP" ]; then
    echo "No backups found"
    exit 1
fi

echo "Using backup: $LATEST_BACKUP"
echo "Recovery target: $RECOVERY_TARGET_TIME"

# Stop PostgreSQL
sudo systemctl stop postgresql

# Create recovery directory
sudo rm -rf "$PGDATA_NEW"
sudo mkdir -p "$PGDATA_NEW"
sudo chown postgres:postgres "$PGDATA_NEW"

# Extract base backup
sudo -u postgres tar -xzf "$LATEST_BACKUP/base.tar.gz" -C "$PGDATA_NEW"

# Create recovery.signal file (PostgreSQL 12+)
sudo -u postgres touch "$PGDATA_NEW/recovery.signal"

# Append recovery settings. Appending (tee -a) preserves the configuration
# restored from the base backup; a plain "cat >" would overwrite it, and the
# redirection would also run as the invoking user rather than postgres.
sudo -u postgres tee -a "$PGDATA_NEW/postgresql.conf" > /dev/null << EOF
restore_command = 'cp $WAL_ARCHIVE/%f %p'
recovery_target_time = '$RECOVERY_TARGET_TIME'
recovery_target_action = 'promote'
EOF

echo "Recovery setup complete. Start PostgreSQL with new data directory."
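The script deliberately stops short of starting the server so the recovered directory can be inspected first. Bringing it up might look like this (the pg_ctl path below is the Debian/Ubuntu layout for PostgreSQL 16 and is an assumption; adjust for your distribution):

```
$ /scripts/pg_restore.sh '2024-06-01 14:30:00'
$ sudo -u postgres /usr/lib/postgresql/16/bin/pg_ctl \
      -D /var/lib/postgresql/data_recovery -l /tmp/recovery.log start
$ tail -f /tmp/recovery.log
```

The log should show WAL segments being restored from the archive, then a "recovery stopping before commit of transaction ..." line at the target time, and finally promotion to normal operation thanks to recovery_target_action = 'promote'.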

Step 6: WAL Archive Management

Script to clean old WAL files:

#!/bin/bash
set -e

WAL_ARCHIVE="/backup/wal_archive"
RETENTION_DAYS=14

# Time-based cleanup: remove archived segments and .backup history files
# older than the retention window. Keep RETENTION_DAYS comfortably longer
# than the base-backup retention (7 days above), or restoring the oldest
# base backup will fail for lack of WAL. pg_archivecleanup, keyed to the
# oldest base backup still kept, is the safer alternative.
find "$WAL_ARCHIVE" -name "*.backup" -mtime +"$RETENTION_DAYS" -delete
find "$WAL_ARCHIVE" -name "0*" -mtime +"$RETENTION_DAYS" -delete

echo "WAL archive cleanup completed"

Step 7: Cloud Storage Integration

For S3 storage, modify archive command:

# In postgresql.conf
archive_command = 'aws s3 cp %p s3://my-pg-backups/wal/%f'

And restore command:

# In recovery configuration
restore_command = 'aws s3 cp s3://my-pg-backups/wal/%f %p'
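A bare aws s3 cp works, but wrapping it keeps the failure behavior explicit: PostgreSQL treats any non-zero exit from archive_command as "archive failed, retry later" and will not recycle the segment. A minimal sketch, reusing the example bucket above (the path /scripts/archive_wal.sh is hypothetical):

```shell
#!/bin/bash
# Hedged sketch of /scripts/archive_wal.sh, wired in as:
#   archive_command = '/scripts/archive_wal.sh %p %f'
# Any non-zero exit makes PostgreSQL keep the segment and retry later.
# In the real script, the last line would be:  archive_wal "$@"

archive_wal() {
    local wal_path="$1" wal_name="$2"
    # --only-show-errors keeps the server log quiet on success;
    # aws s3 cp exits non-zero on failure, which we propagate as-is
    aws s3 cp --only-show-errors "$wal_path" "s3://my-pg-backups/wal/$wal_name"
}
```

A wrapper is also the natural place to add retries or compression later without touching postgresql.conf again.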

Step 8: Monitoring and Alerts

Create monitoring script:

#!/bin/bash

BACKUP_DIR="/backup/base_backups"
ALERT_EMAIL="admin@example.com"

# Check for a backup completed within the last 25 hours
# (-mmin -1500 = 25 hours, leaving slack for a slow nightly run)
RECENT_COUNT=$(find "$BACKUP_DIR" -maxdepth 1 -type d -name "backup_*" -mmin -1500 | wc -l)

if [ "$RECENT_COUNT" -eq 0 ]; then
    echo "ALERT: No recent PostgreSQL backup found" | mail -s "PostgreSQL Backup Alert" "$ALERT_EMAIL"
    exit 1
fi

echo "PostgreSQL backup monitoring: OK"
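Saved as, say, /scripts/pg_backup_check.sh (a hypothetical path), the check can run from cron a few hours after the nightly backup:

```
# postgres crontab: verify the 2 AM backup actually landed
30 8 * * * /scripts/pg_backup_check.sh >> /var/log/pg_backup_check.log 2>&1
```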

Summary

PostgreSQL PITR with WAL archiving and base backups provides comprehensive data protection. Automate with cron jobs and monitor regularly to ensure recovery capability.

