Proxmox Backup Server

What it is

Proxmox Backup Server (PBS) is the dedicated backup tool from the Proxmox project. It runs alongside Proxmox VE and stores incremental, deduplicated, content-addressable snapshots of every VM and container in the cluster. From PVE's perspective, it's "a storage pool you can back up to"; from PBS's perspective, it owns a datastore on disk where the actual backup chunks live.

Why I run it

The cluster has irreplaceable state: Home Assistant configuration, Vaultwarden's password vault, Nextcloud's data, my photo library, the n8n workflows, every service's config files. Losing any of that to a drive failure would range from annoying to catastrophic.

PBS gives me:

Daily snapshots of every guest, automatic, no thinking required.
Content-addressable chunk storage: unchanged data across snapshots takes near-zero additional space. A week of daily snapshots of Vaultwarden uses roughly 1.5× the size of a single snapshot, not 7×.
Verify jobs that re-read every chunk and confirm the SHA-256 still matches. This catches silent disk corruption that a backup which is written once and never re-read can't reveal.
Restore to a different VMID for non-destructive recovery drills.
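
To make the deduplication and verify points above concrete, here is a toy sketch of a content-addressable chunk store. It is not PBS's implementation: the ChunkStore class, the 4-byte chunk size, and the sample data are all made up for illustration, and real chunking is far more involved. It only shows why a second snapshot that shares most of its data costs almost nothing, and how re-hashing stored chunks exposes corruption.

```python
import hashlib


class ChunkStore:
    """Toy content-addressable store: each chunk is keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # hex digest -> chunk bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # A chunk that already exists is not stored again (deduplication).
        self.chunks.setdefault(digest, data)
        return digest

    def backup(self, image: bytes, chunk_size: int = 4) -> list[str]:
        """Split an image into fixed-size chunks; the digest list is the snapshot."""
        return [
            self.put(image[i:i + chunk_size])
            for i in range(0, len(image), chunk_size)
        ]

    def verify(self) -> list[str]:
        """Re-hash every stored chunk and return digests that no longer match."""
        return [
            digest
            for digest, data in self.chunks.items()
            if hashlib.sha256(data).hexdigest() != digest
        ]


store = ChunkStore()
day1 = store.backup(b"AAAABBBBCCCCDDDD")
day2 = store.backup(b"AAAABBBBCCCCEEEE")  # only the last chunk changed

stored_bytes = sum(len(c) for c in store.chunks.values())
print(len(store.chunks), stored_bytes)  # 5 20: two 16-byte snapshots in 20 bytes
print(store.verify())                   # []: every chunk still matches its digest
```

Two 16-byte snapshots occupy 20 bytes here because three of the four chunks are shared, which is the same effect as the 1.5× versus 7× figure, just at toy scale.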

It also enforces a rule I think is important: backups never live on the same physical disk as the data they protect. PBS's datastore lives on an external NVMe enclosure. It's always plugged in, but it's a separate physical disk, so a failure on the laptop node's main NVMe doesn't take the backups with it.

How I use it

The datastore holds about a year of dedup-collapsed snapshots across every guest. Scheduled backups run as two jobs:

Job 1, 02:00 daily, snapshot mode, for every guest except the media LXC.
Job 2, 08:00 daily, stop mode, for the media LXC only.

The split is the result of a backup incident I'd rather not repeat. The media LXC is the only multi-disk LXC in the cluster (rootfs plus a /data mount). vzdump's snapshot mode for multi-disk LXCs is "suspend → snapshot each disk → resume," and the resume step has a known cgroup-v2 freezer race that hangs the container indefinitely. Single-disk LXCs don't trigger it. Stop mode bypasses the snapshot entirely: quick shutdown, read from the now-stopped volumes, restart. The cost is a few minutes of downtime, which is why job 2 runs at 08:00, the quietest window for media viewing.
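
The two jobs above can be sketched as the vzdump invocations they roughly boil down to. This is an illustration, not my actual job configuration: the VMID 105, the storage name "pbs", and the job labels are hypothetical placeholders. The code only builds argument lists and runs nothing.

```python
MEDIA_VMID = 105     # hypothetical VMID for the media LXC
PBS_STORAGE = "pbs"  # hypothetical name of the PBS storage entry in PVE

# Start time -> job label, mirroring the two scheduled jobs.
SCHEDULE = {"02:00": "all-but-media", "08:00": "media-only"}


def vzdump_args(job: str) -> list[str]:
    """Build (but do not run) the vzdump command line for a job label."""
    if job == "all-but-media":
        # Every guest except the multi-disk media LXC, in snapshot mode.
        return [
            "vzdump", "--all", "1", "--exclude", str(MEDIA_VMID),
            "--mode", "snapshot", "--storage", PBS_STORAGE,
        ]
    if job == "media-only":
        # Stop mode: shut the container down, back it up, start it again.
        return [
            "vzdump", str(MEDIA_VMID),
            "--mode", "stop", "--storage", PBS_STORAGE,
        ]
    raise ValueError(f"unknown job: {job}")


for start, job in SCHEDULE.items():
    print(start, " ".join(vzdump_args(job)))
```

Keeping the media LXC out of job 1 with an exclude, rather than listing every other guest by hand, means a newly created guest lands in the snapshot-mode job without any extra step.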

A weekly verify job runs on Saturday evenings and re-checks chunk SHA-256 digests. Snapshots verified in the last 30 days are skipped; everything else, including snapshots that have never been verified, gets re-read. A typical run takes about five minutes.
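
The skip rule can be sketched as a small selection function. This models the behaviour described above rather than PBS's own code: the snapshot names, timestamps, and function names are invented, and treating a never-verified snapshot as due is an assumption.

```python
from datetime import datetime, timedelta

REVERIFY_AFTER = timedelta(days=30)


def needs_verify(last_verified, now):
    """A snapshot is due if it was never verified or its last check is too old."""
    return last_verified is None or now - last_verified > REVERIFY_AFTER


def select_for_verify(snapshots, now):
    """Return the names of snapshots the weekly job would re-read."""
    return sorted(
        name for name, last in snapshots.items() if needs_verify(last, now)
    )


now = datetime(2024, 6, 15, 18, 0)  # a Saturday evening
snapshots = {
    "ct/101/2024-06-14": None,                         # never verified
    "vm/100/2024-06-01": datetime(2024, 6, 8, 18, 0),  # verified 7 days ago
    "vm/100/2024-05-01": datetime(2024, 5, 4, 18, 0),  # verified 42 days ago
}
print(select_for_verify(snapshots, now))
# ['ct/101/2024-06-14', 'vm/100/2024-05-01']
```

With a weekly job and a 30-day window, most snapshots are skipped on any given Saturday, which is consistent with a typical run taking only a few minutes.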